On 29 Jan 2012, at 23:28, Henry Story wrote:

> 
> On 29 Jan 2012, at 23:04, Andy Seaborne wrote:
> 
>> Hi Henry,
>> 
>> On 29/01/12 21:40, Henry Story wrote:
>>>  [ I just opened a bug report for this, but it was suggested that a wider
>>> discussion on how to do it would be useful on this list. ]
>> 
>> The thread of interest is:
>> 
>> http://www.mail-archive.com/[email protected]/msg02451.html
>> 
>>> Unless I am mistaken the only way to parse some content is using methods
>>> that use an InputStream such as this:
>>> 
>>>    val m = ModelFactory.createDefaultModel()
>>>    m.getReader(lang.jenaLang).read(m, in, base.toString)
>> 
>> As already commented on the thread, passing the reader to an actor allows
>> async reading. Readers are configurable - you can have anything you like.
>> No reason why the RDFReader can't be using async NIO.
> 
> Mhh, can I call at time t1
> 
>   reader.read( model, inputStream, base);
> 
> with an inputStream that only contains a chunk of the data? And then call it
> again later with a newly filled input stream that contains the next segment
> of the data?
> 
>   reader.read( model, inputStream2, base);
> 
> It says nothing about that in the documentation, so I just assumed it does
> not work...

Well, I did look at the code (though perhaps not deeply enough, and only at the
released version of Jena). From that I got the feeling that one has to send one
whole RDF document down an input stream at a time.

If one cannot send chunks to the reader, then the thread that calls the
read(...) method above will block until the whole document has been read in.
Even if an actor calls that method, the actor will block the thread it is
executing in until the read is finished. So actors don't help (unless there is
some magic I don't know about). If the server is serving the document really
slowly, say at 56 baud, then one thread is tied up even though it is doing
very little work.
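
To make the blocking concrete, here is a minimal sketch that uses only JDK
classes. `blockingRead` is a hypothetical stand-in for a parser's read(...)
call, not Jena code: it simply consumes the stream until EOF. The chunks are
pushed through a pipe, roughly as an NIO callback might do, and the thread
running the read stays parked until the last chunk has arrived.

```scala
import java.io.{InputStream, PipedInputStream, PipedOutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.concurrent.{CountDownLatch, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object BlockingReadSketch {
  // Hypothetical stand-in for a blocking parser (NOT a Jena API):
  // reads until EOF and returns the number of bytes consumed.
  def blockingRead(in: InputStream): Int = {
    val buf = new Array[Byte](1024)
    var total = 0
    var n = in.read(buf)
    while (n != -1) {
      total += n
      n = in.read(buf)
    }
    total
  }

  // Returns (did the read finish before the last chunk?, bytes read).
  def demo(): (Boolean, Int) = {
    val out = new PipedOutputStream()
    val in = new PipedInputStream(out)
    val done = new CountDownLatch(1)
    val total = new AtomicInteger(0)

    // This thread is occupied for the whole duration of the transfer.
    val parserThread = new Thread(new Runnable {
      def run(): Unit = {
        total.set(blockingRead(in))
        done.countDown()
      }
    })
    parserThread.start()

    // First chunk arrives; the document is not complete yet.
    out.write("<a> <b> ".getBytes(UTF_8))
    out.flush()
    // The parser thread is still blocked waiting for more input.
    val finishedEarly = done.await(200, TimeUnit.MILLISECONDS)

    // Last chunk arrives and the stream is closed, which signals EOF.
    out.write("<c> .\n".getBytes(UTF_8))
    out.close()
    done.await(5, TimeUnit.SECONDS)
    (finishedEarly, total.get)
  }
}
```

Calling `BlockingReadSketch.demo()` shows that the read has not returned after
the first chunk. With a slow server that one thread would be held for the full
length of the download.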

If, on the other hand, I could send partial pieces of an XML document down
different input streams at different times, then the NIO thread could call the
reader every time it received some data. Take, for example, the code I was
writing here using the http-async-client: https://gist.github.com/1701141

The method I have now on lines 39-42

  def onBodyPartReceived(bodyPart: HttpResponseBodyPart) = {
    bodyPart.writeTo(out)
    STATE.CONTINUE
  }


could be changed to

  def onBodyPartReceived(bodyPart: HttpResponseBodyPart) = {
    reader.read(model, new ByteArrayInputStream(bodyPart.getBodyPartBytes()), base)
    STATE.CONTINUE
  }

and so the body part would be consumed by the read in chunks.
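
For a per-chunk call like this to work, the reader would have to keep parser
state between calls, since a chunk can end in the middle of a statement. The
sketch below illustrates that requirement only. `ChunkedLineReader` is a
hypothetical class, not part of Jena, and it assumes a line-based syntax such
as N-Triples where each newline-terminated line is one complete statement.
RDF/XML would need a non-blocking XML parser underneath instead.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8
import scala.collection.mutable.ArrayBuffer

// Hypothetical push-style reader (NOT a Jena API). Bytes of an incomplete
// line are buffered until a later chunk supplies the terminating newline.
final class ChunkedLineReader(onStatement: String => Unit) {
  private val pending = new ByteArrayOutputStream()

  // Called once per received chunk; returns immediately, never blocks.
  def feed(chunk: Array[Byte]): Unit = {
    for (b <- chunk) {
      if (b == '\n'.toByte) emitPending()
      else pending.write(b.toInt)
    }
  }

  // Called when the response is complete, to flush an unterminated last line.
  def end(): Unit = emitPending()

  private def emitPending(): Unit = {
    // Decoding only whole lines keeps split multi-byte characters intact.
    val line = new String(pending.toByteArray, UTF_8).trim
    pending.reset()
    if (line.nonEmpty) onStatement(line)
  }
}

object ChunkedLineReaderDemo {
  def run(): List[String] = {
    val seen = ArrayBuffer.empty[String]
    val reader = new ChunkedLineReader(s => { seen += s; () })
    // The first chunk ends in the middle of the second statement.
    reader.feed("<a> <b> <c> .\n<d> <e".getBytes(UTF_8))
    reader.feed("> <f> .\n".getBytes(UTF_8))
    reader.end()
    seen.toList
  }
}
```

In the handler above, `onBodyPartReceived` would then call something like
`feed(bodyPart.getBodyPartBytes())` and return straight away, so no thread
waits on the network.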

> 
>> 
>> There is also RIOT - have you looked at passing the read request to a parser
>> in an actor, then catching the Sink<Triple> interface for the return -- that
>> works in an actor style.
>> 
>> The key question is what Jena can enable, so that possibilities can be built
>> on top. I don't think Jena is a good level at which to pick one approach
>> over another, as it is in danger of clashing with other choices in the
>> application. Your Akka is a good example of one possible choice.
>> 
>>> I did open issue-203 so that when we agree on a solution we could send in
>>> some patches.
>> 
>> Look forward to seeing this,
>> 
>>      Andy
> 
> Social Web Architect
> http://bblfish.net/
> 

Social Web Architect
http://bblfish.net/
