I had a similar task a while ago: I wrapped the StreamRDF in a wrapper that 
synchronized the relevant methods, and that worked fine. Then I tried using 
several independent output files, one per thread, and performance improved 
enormously.
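For reference, here is a minimal, self-contained sketch of that synchronized-wrapper idea. `TripleSink` is only a stand-in for Jena's StreamRDF (the real interface has more callbacks: start, triple, quad, base, prefix, finish); all class and method names below are illustrative, not Jena API:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Jena's StreamRDF; only a triple() callback is sketched.
interface TripleSink {
    void triple(String s, String p, String o);
}

// Collects output lines into a plain, non-thread-safe list.
class ListSink implements TripleSink {
    final List<String> lines = new ArrayList<>();
    public void triple(String s, String p, String o) {
        lines.add(s + " " + p + " " + o + " .");
    }
}

// The wrapper: serialize every call into the shared underlying sink.
class SynchronizedSink implements TripleSink {
    private final TripleSink wrapped;
    SynchronizedSink(TripleSink wrapped) { this.wrapped = wrapped; }
    public synchronized void triple(String s, String p, String o) {
        wrapped.triple(s, p, o);
    }
}

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        ListSink target = new ListSink();
        TripleSink shared = new SynchronizedSink(target);
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            final int id = t;
            Thread th = new Thread(() -> {
                for (int i = 0; i < 1000; i++)
                    shared.triple("<urn:s" + id + "-" + i + ">", "<urn:p>", "<urn:o>");
            });
            th.start();
            threads.add(th);
        }
        for (Thread th : threads) th.join();
        System.out.println(target.lines.size()); // 4000: no lost or torn writes
    }
}
```

The lock makes each triple an atomic unit of output, which is exactly what a shared buffered stream does not give you; it is also why per-thread files are so much faster, since they need no lock at all.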

Keep in mind that if you use N-Triples or TriG, merging two files (for later 
processing) is just concatenating them.
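As an illustration of that concatenation point (filenames are made up for the example): N-Triples is line-oriented with no header or closing syntax, so copying complete files back-to-back already yields a valid merged document:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class MergeNT {
    public static void main(String[] args) throws IOException {
        // Simulate two per-thread output files (names are illustrative).
        Path dir = Files.createTempDirectory("nt-merge");
        Path a = dir.resolve("part-0.nt");
        Path b = dir.resolve("part-1.nt");
        Files.writeString(a, "<urn:s1> <urn:p> <urn:o> .\n");
        Files.writeString(b, "<urn:s2> <urn:p> <urn:o> .\n");

        // Byte-level concatenation of complete N-Triples files is
        // itself a valid N-Triples document.
        Path merged = dir.resolve("merged.nt");
        try (OutputStream out = Files.newOutputStream(merged)) {
            Files.copy(a, out);
            Files.copy(b, out);
        }
        System.out.println(Files.readAllLines(merged).size()); // 2
    }
}
```

On a Unix shell the same merge is simply `cat part-*.nt > merged.nt`.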

ajs6f

> On Nov 26, 2017, at 9:15 AM, Zak Mc Kracken <[email protected]> wrote:
> 
> Hi Andy,
> 
> thank you for your reply. Good to know. My use case is an RDF exporter that 
> takes data from a relatively slow data source (like a DBMS). In order to 
> speed things up, it has multiple threads reading data, converting it to RDF 
> and sending the generated RDF to their own Jena Model (one per thread). At 
> the end, each thread streams its model to a common sink/stream, such as a file.
> 
> Actually, I'm designing this with some flexibility: one can choose to pass a 
> java.util.function.Consumer<Model> to the exporter, that is, a handler that 
> does something with a thread's model once it is ready. That's because I want 
> to reuse the upstream processing for either an RDF file exporter, or a Neo4j 
> uploader (which should be able to manage concurrent writes at a finer-grained 
> level), or, in general, some other kind of converter.
> 
> That said, I'm OK with making the file-writing part synchronized and hence 
> not really parallel; my question was rather to better understand how Jena 
> behaves here.
> 
> Best,
> Marco.
> 
> On 26/11/2017 11:14, Andy Seaborne wrote:
>> If the output stream is shared, then no.  It's buffered internally.
>> 
>> So at small scale it will look safe, because the whole output fits in one 
>> buffer or the ordering happened to be OK.  But beyond that, the buffered 
>> flushes will be interleaved, and buffer boundaries are based on characters, 
>> not on logical units of the RDF output.
>> 
>> Parallel writing to a shared OutputStream is a bad idea.
>> 
>> What's the use case you have for a shared output stream?
>> 
>>     Andy
>> 
>> 
> 
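The Consumer-of-a-per-thread-model design Marco describes above can be sketched roughly as follows. To keep it self-contained, a `List<String>` batch stands in for a per-thread Jena Model, and all names (`Exporter`, `export`, the handler) are illustrative, not from Jena:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class Exporter {
    // Each worker builds its own batch (standing in for a per-thread
    // Model), then hands the finished batch to the pluggable handler.
    public static void export(int nThreads, Consumer<List<String>> handler)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            pool.submit(() -> {
                List<String> batch = new ArrayList<>();   // per-thread "Model"
                for (int i = 0; i < 100; i++)
                    batch.add("<urn:s" + id + "-" + i + "> <urn:p> <urn:o> .");
                handler.accept(batch);                    // hand off once ready
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // The handler is the one place that must serialize access if the
        // sink is shared (a file writer, a Neo4j uploader, etc.).
        List<String> sink = Collections.synchronizedList(new ArrayList<>());
        export(4, sink::addAll);
        System.out.println(sink.size()); // 400
    }
}
```

The upstream (slow reads, per-thread conversion) stays fully parallel; only the handler decides whether writing is serialized per batch or delegated to a sink that handles concurrency at a finer grain.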
