Re: RDFStream to RDFConnection

Claude Warren Mon, 08 Jul 2019 09:56:51 -0700

The case I was trying to solve was reading a largish XML document and
converting it to an RDF graph.  After a few iterations I ended up writing a
custom SAX parser that calls the RDFStream triple/quad methods.  But I
wanted a way to update a Fuseki server so RDFConnection seemed like the
natural choice.
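
The buffering described in my original message (quoted at the bottom) can
be sketched roughly as below.  This is a simplified, Jena-free illustration
and not the actual class: BatchingStream, limit and sink are hypothetical
names, T stands in for a triple or quad, and the sink stands in for whatever
writes a batch to the RDFConnection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Buffers incoming items and hands them to a sink in batches.
 * Hypothetical stand-in: T plays the role of a triple or quad, and the
 * sink plays the role of "write this batch to the connection".
 */
final class BatchingStream<T> {
    private final int limit;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    BatchingStream(int limit, Consumer<List<T>> sink) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
        this.sink = sink;
    }

    /** Called once per item; flushes when the buffer reaches the limit. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= limit) {
            flush();
        }
    }

    /** Called at end of stream; writes whatever is still buffered. */
    void finish() {
        flush();
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink owns its batch after we clear ours.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With a limit of 2, adding three items produces one batch of two during the
stream and a final batch of one at finish.  A very large limit means one big
write at the end, which is where the large-transaction concern raised below
would come in.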


In some recent work for my employer I found that I like the RDFConnection,
as the same code can work against a local dataset or a remote one.

Claude

On Mon, Jul 8, 2019 at 4:34 PM ajs6f <[email protected]> wrote:

> This "replay" buffer approach was the direction I first went in for TIM,
> until turning to MVCC (speaking of MVCC, that code is probably somewhere,
> since we don't squash when we merge). Looking back, one thing that helped
> me move on was the potential effect of very large transactions. But in a
> controlled situation like Claude's, that problem wouldn't arise.
>
> ajs6f
>
> > On Jul 8, 2019, at 11:07 AM, Andy Seaborne <[email protected]> wrote:
> >
> > Claude,
> >
> > Good timing!
> >
> > This is what RDF Delta does, and for updates rather than just StreamRDF
> additions, though it's not to an RDFConnection - it's to a patch service.
> >
> > With hindsight, I wonder if that would have been better as a
> BufferingDatasetGraph - a DSG that keeps changes and makes the view of the
> buffer and underlying DatasetGraph behave correctly (find* works and has
> the right cardinality of results). It's a bit fiddly to get it all right
> but once it works it is a building block that has a lot of re-usability.
> >
> > I came across this with the SHACL work for a BufferingGraph (with
> prefixes) to give "abort" of transactions to simple graphs which aren't
> transactional.
> >
> > But it occurs in Fuseki with complex dataset set ups like rules.
> >
> >    Andy
> >
> > On 08/07/2019 11:09, Claude Warren wrote:
> >> I have written an RDFStream to RDFConnection with caching.  Basically,
> the
> >> stream caches triples/quads until a limit is reached and then it writes
> >> them to the RDFConnection.  At finish it writes any triples/quads in the
> >> cache to the RDFConnection.
> >> Internally I cache the stream in a dataset.  I write triples to the
> default
> >> dataset and quads as appropriate.
> >> I have a couple of questions:
> >> 1) In this arrangement what does the "base" tell me? I currently ignore
> it
> >> and want to make sure I haven't missed something.
> >
> > The parser saw a BASE statement.
> >
> > Like PREFIX, in Turtle, it can happen mid-file (e.g. when files are
> concatenated).
> >
> > It's not necessary because the data stream should have resolved IRIs in
> it so base is used in a stream.
> >
> >> 2) I capture all the prefix calls in a PrefixMapping that is accessible
> >> from the RDFConnectionStream class.  They are not passed into the
> dataset
> >> in any way.  I didn't see any method to do so and don't really think it
> is
> >> needed.  Does anyone see a problem with this?
> >> 3) Does anyone have a use for this class?  If so I am happy to
> contribute
> >> it, though the next question becomes what module to put it in?  Perhaps
> we
> >> should have an extras package for RDFStream implementations?
> >> Claude
>
>

-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
