Re: Streaming data to Fuseki

Andy Seaborne Fri, 16 Oct 2020 08:53:39 -0700



On 16/10/2020 10:02, Martynas Jusevičius wrote:

Thanks Andy.

And say if I wanted to write StreamRDF as a batch of HTTP requests
instead of a single stream -- would that be possible?
Is that what BatchedStreamRDF is for?


No.

"Triples are batched on subject"

It is used for example when writing Turtle blocks.
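
As an illustration of what "batched on subject" means, here is a minimal sketch in plain Java. It is not Jena's BatchedStreamRDF code: SimpleTriple and SubjectBatcher are hypothetical names, and the sketch assumes a batch is a run of consecutive triples sharing one subject, which is the shape a Turtle writer needs to emit one subject block at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a triple; Jena has its own Triple type.
final class SimpleTriple {
    final String subject;
    final String predicate;
    final String object;

    SimpleTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical helper, not part of Jena.
final class SubjectBatcher {
    // Groups consecutive triples with the same subject into one batch.
    // A subject that reappears later in the stream starts a new batch.
    static List<List<SimpleTriple>> batchBySubject(List<SimpleTriple> triples) {
        List<List<SimpleTriple>> batches = new ArrayList<>();
        List<SimpleTriple> current = new ArrayList<>();
        String currentSubject = null;
        for (SimpleTriple t : triples) {
            if (!current.isEmpty() && !t.subject.equals(currentSubject)) {
                batches.add(current);          // subject changed: close the batch
                current = new ArrayList<>();
            }
            currentSubject = t.subject;
            current.add(t);
        }
        if (!current.isEmpty()) {
            batches.add(current);              // flush the final batch
        }
        return batches;
    }
}
```

Each batch then maps onto one Turtle block: the subject once, followed by its predicate-object pairs. This grouping is about output layout, not about HTTP request boundaries.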


Can you also shed some light on InferenceProcessorStreamRDF and what
kind of RDFS support it provides?


It's the code behind "riot --rdfs=schema data"
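
To give a feel for what applying RDFS to a stream can look like, here is a minimal sketch in plain Java. It is not the InferenceProcessorStreamRDF implementation, and it covers only one rule (rdfs:subClassOf applied to rdf:type triples); which rules the Jena class applies is not stated in this thread. SubClassExpander, the String[] triple shape and the prefixed names are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Jena.
final class SubClassExpander {
    static final String RDF_TYPE = "rdf:type";

    // Returns the input triple, followed by one extra rdf:type triple per
    // transitive superclass when the input is itself an rdf:type triple.
    // Each triple is a 3-element array: subject, predicate, object.
    static List<String[]> expand(String[] triple,
                                 Map<String, Set<String>> directSuperClasses) {
        List<String[]> out = new ArrayList<>();
        out.add(triple);                      // the data triple always passes through
        if (!RDF_TYPE.equals(triple[1])) {
            return out;
        }
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(triple[2]);
        while (!todo.isEmpty()) {             // walk up the class hierarchy
            String cls = todo.pop();
            for (String sup : directSuperClasses.getOrDefault(cls, Set.of())) {
                if (!sup.equals(triple[2]) && seen.add(sup)) {
                    todo.push(sup);
                }
            }
        }
        for (String sup : seen) {
            out.add(new String[] {triple[0], RDF_TYPE, sup});
        }
        return out;
    }
}
```

The point of the sketch is that the schema is held in memory while the data is processed one triple at a time, so the data itself never has to be loaded as a whole.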


I think the Streaming I/O documentation could use an overview of the
various implementations of StreamRDF, because right now it's not
really obvious from the JavaDocs.
https://jena.apache.org/documentation/io/streaming-io.html

On Thu, Oct 15, 2020 at 5:33 PM Andy Seaborne <[email protected]> wrote:




On 15/10/2020 10:35, Martynas Jusevičius wrote:

Thanks Andy.

Where is the streaming RDF parsed?


All parsing is streaming - it's in the parsers (JSON-LD excepted).

GSP_RW.quadsPutPostTxn and down from there.

Could you please point me to that code?

We touched this in an older thread. I understand StreamRDF is the
destination for writing, but I'd like to see how reading of streams is
done.
https://lists.apache.org/thread.html/re390f37b04d43a4ac5f8521040161bf6a0582bb5a6ce422c14bccf1e%40%3Cusers.jena.apache.org%3E

On Tue, Oct 13, 2020 at 10:15 AM Andy Seaborne <[email protected]> wrote:




On 12/10/2020 22:35, Martynas Jusevičius wrote:

Hi,

How would it go if I streamed quads to the Fuseki quad store endpoint?

Can Fuseki (3.16.0) cope with streaming, say, 100,000 quads over HTTP, or
do I need to split them into multiple requests?


With a TDB2 store there is no hard size limit. 100s of millions work, into
an existing database, live. It is not as fast as bulk loading.

TDB1 is more limited, to 10s of millions (a RAM limitation).

       Andy
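
If splitting into several requests is still wanted (for example against a TDB1 store), a minimal sketch using only the JDK might look like the following. It assumes the data is already available as N-Quads, one quad per line, and that the dataset URL accepts POST with Content-Type application/n-quads. NQuadsChunker, its method names and any URL passed in (such as http://localhost:3030/ds) are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jena or Fuseki.
final class NQuadsChunker {
    // Splits N-Quads lines into request bodies of at most quadsPerRequest lines.
    static List<String> toRequestBodies(List<String> nquadLines, int quadsPerRequest) {
        if (quadsPerRequest <= 0) {
            throw new IllegalArgumentException("quadsPerRequest must be positive");
        }
        List<String> bodies = new ArrayList<>();
        StringBuilder body = new StringBuilder();
        int count = 0;
        for (String line : nquadLines) {
            body.append(line).append('\n');
            if (++count == quadsPerRequest) {
                bodies.add(body.toString());   // batch is full: close it
                body.setLength(0);
                count = 0;
            }
        }
        if (count > 0) {
            bodies.add(body.toString());       // trailing partial batch
        }
        return bodies;
    }

    // POSTs each body to the dataset URL, stopping at the first non-2xx response.
    static void postAll(String datasetUrl, List<String> bodies)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        for (String body : bodies) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(datasetUrl))
                    .header("Content-Type", "application/n-quads")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            if (response.statusCode() / 100 != 2) {
                throw new IOException("Upload failed with HTTP " + response.statusCode());
            }
        }
    }
}
```

One trade-off to keep in mind: with several requests the upload is no longer all-or-nothing, so a failure part-way through leaves the earlier batches in the database.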



Martynas
