Re: dataset assembler for JENA-624

A. Soroka Thu, 05 Nov 2015 09:23:35 -0800

I just pushed an update to simplify this PR as discussed. It looks pretty 
similar to what you show below, but I used ja:data in both places (i.e. for 
loading into the dataset and into a given graph, default or named) so that 
files can be read directly into either. If ja:data is on the root, it reads 
either quads into the dataset or triples into the default graph, depending on 
whether the file at the other end of the ja:data triple holds quads or triples. 
If ja:data is on the object of a ja:namedGraph triple, it reads into the named 
graph specified by that object's ja:graphName triple, so:


root a ja:MemoryDataset ;
   ja:data <file://myQuads.nq> ;
   ja:data <file://myTriplesforDefaultGraph.nt> ;
   ja:namedGraph [ ja:graphName <info:mygraph> ; ja:data <file://someTriplesForMyGraph> ] .
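
As a rough illustration of those loading rules (this is not the PR's code and 
does not use Jena), here is a toy model in plain Java. The names JaDataRouting, 
Quad, ToyDataset, loadIntoDataset and loadIntoGraph are all hypothetical, and 
parsing is assumed to have already happened. It also models the behaviour 
described in the note quoted below: reading a quads file into a single graph 
keeps only the triples from the input's default graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every name here is hypothetical, nothing is Jena API.
class JaDataRouting {
    /** Marker for the default graph in this toy model. */
    static final String DEFAULT_GRAPH = "";

    /** One parsed statement; graph is DEFAULT_GRAPH for plain triples. */
    static final class Quad {
        final String graph, s, p, o;
        Quad(String graph, String s, String p, String o) {
            this.graph = graph; this.s = s; this.p = p; this.o = o;
        }
    }

    /** Toy dataset: graph name -> statements stored as "s p o" strings. */
    static final class ToyDataset {
        private final Map<String, List<String>> graphs = new LinkedHashMap<>();

        void add(String graph, Quad q) {
            graphs.computeIfAbsent(graph, g -> new ArrayList<>())
                  .add(q.s + " " + q.p + " " + q.o);
        }

        int size(String graph) {
            return graphs.getOrDefault(graph, Collections.emptyList()).size();
        }
    }

    /** ja:data on the root: each statement keeps the graph it was parsed with,
     *  so quads land in their own graphs and triples land in the default graph. */
    static void loadIntoDataset(ToyDataset ds, List<Quad> parsed) {
        for (Quad q : parsed) {
            ds.add(q.graph, q);
        }
    }

    /** ja:data under ja:namedGraph: the target only receives a stream of triples,
     *  so statements from non-default graphs of the input are dropped. */
    static void loadIntoGraph(ToyDataset ds, String targetGraph, List<Quad> parsed) {
        for (Quad q : parsed) {
            if (q.graph.equals(DEFAULT_GRAPH)) {
                ds.add(targetGraph, q);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-ins for already-parsed file contents.
        List<Quad> quadsFile = Arrays.asList(
            new Quad(DEFAULT_GRAPH, "ex:a", "ex:p", "ex:b"),
            new Quad("info:other", "ex:c", "ex:p", "ex:d"));
        List<Quad> triplesFile = Arrays.asList(
            new Quad(DEFAULT_GRAPH, "ex:e", "ex:p", "ex:f"));

        ToyDataset ds = new ToyDataset();
        loadIntoDataset(ds, quadsFile);               // quads into the dataset
        loadIntoDataset(ds, triplesFile);             // triples into the default graph
        loadIntoGraph(ds, "info:mygraph", quadsFile); // quads file into one named graph

        System.out.println("default graph: " + ds.size(DEFAULT_GRAPH));
        System.out.println("info:other:    " + ds.size("info:other"));
        System.out.println("info:mygraph:  " + ds.size("info:mygraph"));
    }
}
```

The point of keeping two separate entry points in the sketch is that the 
destination (whole dataset vs. one graph) is decided by where ja:data appears, 
not by the file itself.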

I don’t understand what you mean by the method readGraphDesc. Is that supposed 
to read an assembler description of some kind? I thought we were excising that.

---
A. Soroka
The University of Virginia Library

> On Nov 5, 2015, at 12:05 PM, Andy Seaborne <[email protected]> wrote:
> 
> On 05/11/15 16:49, Andy Seaborne wrote:
>> On 05/11/15 16:11, A. Soroka wrote:
>>> On Nov 5, 2015, at 11:02 AM, Andy Seaborne <[email protected]> wrote:
>>>> On 05/11/15 15:44, A. Soroka wrote:
>>>>> Yeah, I basically copied the “parallel semantics” of ja:graph and
>>>>> ja:defaultGraph from DatasetAssembler. Perhaps I misunderstood them?
>>>> If you think it's a problem, do you have a better name?
>>> 
>>> I’m now a bit confused about what they mean and what ja:data is
>>> supposed to mean, and I don’t want to spread my confusion any further!
>>> {grin} To (try to) get some clarity: are they supposed to have the
>>> same meaning (they do now in my PR, and I really think they do in
>>> DatasetAssembler)? If they are supposed to have a different meaning,
>>> and one of those meanings is “load from this URI”, what is the
>>> difference in that and the newly-introduced ja:data predicate?
>>> 
>> 
>> They are different.
>> 
>> If you call assembler.open for the object of ja:defaultGraph it will
>> create a model. It does not know about datasets at that point.  You get
>> a regular model in-memory.
>> 
>> Adding to the in-mem dataset with addNamedModel or setDefaultModel will
>> be a copy into the datastructures of the in-mem dataset.
>> 
>> The use cases for direct file loading are:
>> 
>> 
>> 1/ Load file into dataset.
>> RDFDataMgr.read(dataset, file)
>>   Case 1a: quads
>>   Case 1b: triples
>> 
>> 
>> 2/ Load file to graph
>>   Case 2a: default graph
>>        RDFDataMgr.read(dataset.getDefaultModel(), file)
>>   Case 2b: named graph
>>        RDFDataMgr.read(dataset.getNamedModel(name), file)
>> 
>> Note: When asked to read triples and it's a quads file, the RDFDataMgr
>> outputs just the triples from the default graph of the input (it does
>> not know it's reading into a named graph - it's just a destination of a
>> stream of triples).
>> 
>>     Andy
>> 
>> 
>> 
> 
> In code, that is something like this (error checking could be enhanced):
> 
> @Override
> public Dataset open(final Assembler assembler, final Resource root, final Mode mode) {
>    checkType(root, MemoryDataset);
>    Dataset dataset = createTxnMem();
>    setContext(root, dataset.getContext());
> 
>    // Case 2: load into a graph
>    multiValue(root, pGraph).stream().map(RDFNode::asResource)
>      .forEach(r->readGraphDesc(dataset, r));
> 
>    // Case 1: load into the dataset
>    multiValue(root, data).stream().map(RDFNode::asResource)
>      .forEach(x -> RDFDataMgr.read(dataset, x.getURI()));
> 
>    // If desired:
>    // ... ja:defaultGraph
>    // ... ja:namedGraph
> 
>    return dataset;
> }
> 
> (and I hope my plague of tabs got sorted out).
> 
>       Andy
>
