Re: parse one quad?

2018-03-12 Thread Andy Seaborne

One at a time and performance ...

You could create a TokenizerText.makeTokenizerString and simply pull 4 
tokens out and then check end of stream or DOT.
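
A rough, Jena-free illustration of that idea: the sketch below scans one N-Quads line, collects the raw terms, and then insists on the terminating DOT followed by end of input. `QuadLineSketch` and `terms` are hypothetical names, not RIOT classes, and the scanner is deliberately simplified (it assumes whitespace before the final DOT and neither validates IRIs nor decodes escapes). Inside Jena the tokens would come from RIOT's own tokenizer rather than from hand-written code like this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Jena class): splits one N-Quads line into raw terms. */
public class QuadLineSketch {

    /** Returns the 3 or 4 raw terms before the final DOT; throws on malformed input. */
    public static List<String> terms(String line) {
        List<String> out = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (true) {
            i = skipSpace(line, i);
            if (i >= n) throw new IllegalArgumentException("missing final '.'");
            if (line.charAt(i) == '.') { i++; break; }   // DOT ends the statement
            int start = i;
            char c = line.charAt(i);
            if (c == '<') {
                i = endOfIri(line, i);
            } else if (c == '"') {
                i = endOfLiteral(line, i);
            } else if (line.startsWith("_:", i)) {
                i = endOfWord(line, i);                  // blank node label
            } else {
                throw new IllegalArgumentException("unexpected character at " + i);
            }
            out.add(line.substring(start, i));
        }
        // After the DOT only whitespace may remain ("end of stream").
        if (skipSpace(line, i) < n) throw new IllegalArgumentException("content after '.'");
        if (out.size() < 3 || out.size() > 4) {
            throw new IllegalArgumentException("expected 3 or 4 terms, got " + out.size());
        }
        return out;
    }

    private static int skipSpace(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }

    private static int endOfWord(String s, int i) {
        while (i < s.length() && !Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }

    private static int endOfIri(String s, int i) {
        int close = s.indexOf('>', i);
        if (close < 0) throw new IllegalArgumentException("unterminated IRI");
        return close + 1;
    }

    private static int endOfLiteral(String s, int i) {
        i++; // opening quote
        while (i < s.length() && s.charAt(i) != '"') {
            if (s.charAt(i) == '\\') i++; // skip the escaped character
            i++;
        }
        if (i >= s.length()) throw new IllegalArgumentException("unterminated literal");
        i++; // closing quote
        if (i < s.length() && s.charAt(i) == '@') return endOfWord(s, i); // language tag
        if (s.startsWith("^^<", i)) return endOfIri(s, i + 2);            // datatype IRI
        return i;
    }
}
```

A line with three terms is accepted as a statement in the default graph; turning the raw strings into Jena nodes and a Quad is left out of the sketch.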


Do you need the quad immediately, or can you create an input that you 
send the string to, catch the quad in the StreamRDF, and reuse that 
framework across quad parsing?


Andy

On 12/03/18 20:18, ajs6f wrote:

org.apache.jena.sparql.core.Quad. That's enough for what I want to do.


ajs6f


On Mar 12, 2018, at 4:17 PM, Martynas Jusevičius  wrote:

So what are you going to parse the quad into, if not Dataset?

On Mon, Mar 12, 2018 at 9:11 PM, ajs6f  wrote:


Thanks, Martynas, but no; I don't have a Dataset (and don't need or want
to build one for a single quad), and no InputStream (although I could get
one from a String without too much fuss).

RDFDataMgr or RDFParser are usually the best tools for parsing, but I'm
looking for something a bit lighter-weight.

ajs6f


On Mar 12, 2018, at 4:07 PM, Martynas Jusevičius wrote:


Maybe this?
https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/RDFDataMgr.html#read-org.apache.jena.query.Dataset-java.io.InputStream-org.apache.jena.riot.Lang-


On Mon, Mar 12, 2018 at 8:46 PM, ajs6f  wrote:


I've got a use case for parsing one quad (in NQuads form) from a String.
I've been paging around through RIOT and other parts of Jena, but I just
can't seem to find any way to do this without building up a bunch of
auxiliary objects (like Readers or StreamRDFs, etc.). Performance is
something of a concern, so I'd rather not build up any more than I have to.

Am I missing something, or do we just not expose that functionality? (I'm
inclined to bet that we _have_ to have impled it somewhere, just for our
own sanity, but maybe not!)

ajs6f









Re: parse one quad?

2018-03-12 Thread ajs6f
org.apache.jena.sparql.core.Quad. That's enough for what I want to do.


ajs6f

> On Mar 12, 2018, at 4:17 PM, Martynas Jusevičius  
> wrote:
> 
> So what are you going to parse the quad into, if not Dataset?
> 
> On Mon, Mar 12, 2018 at 9:11 PM, ajs6f  wrote:
> 
>> Thanks, Martynas, but no; I don't have a Dataset (and don't need or want
>> to build one for a single quad), and no InputStream (although I could get
>> one from a String without too much fuss.
>> 
>> RDFDataMgr or RDFParser are usually the best tools for parsing, but I'm
>> looking for something a bit lighter-weight.
>> 
>> ajs6f
>> 
>>> On Mar 12, 2018, at 4:07 PM, Martynas Jusevičius wrote:
>>> 
>>> Maybe this?
>>> https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/RDFDataMgr.html#read-org.apache.jena.query.Dataset-java.io.InputStream-org.apache.jena.riot.Lang-
>>> 
>>> On Mon, Mar 12, 2018 at 8:46 PM, ajs6f  wrote:
>>> 
>>>> I've got a use case for parsing one quad (in NQuads form) from a String.
>>>> I've been paging around through RIOT and other parts of Jena, but I just
>>>> can't seem to find any way to do this without building up a bunch of
>>>> auxiliary objects (like Readers or StreamRDFs, etc.). Performance is
>>>> something of a concern, so I'd rather not build up any more than I have to.
>>>> 
>>>> Am I missing something, or do we just not expose that functionality? (I'm
>>>> inclined to bet that we _have_ to have impled it somewhere, just for our
>>>> own sanity, but maybe not!)
>>>> 
>>>> ajs6f
>>>> 
>> 
>> 



Re: parse one quad?

2018-03-12 Thread Martynas Jusevičius
So what are you going to parse the quad into, if not Dataset?

On Mon, Mar 12, 2018 at 9:11 PM, ajs6f  wrote:

> Thanks, Martynas, but no; I don't have a Dataset (and don't need or want
> to build one for a single quad), and no InputStream (although I could get
> one from a String without too much fuss.
>
> RDFDataMgr or RDFParser are usually the best tools for parsing, but I'm
> looking for something a bit lighter-weight.
>
> ajs6f
>
> > On Mar 12, 2018, at 4:07 PM, Martynas Jusevičius wrote:
> >
> > Maybe this?
> > https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/RDFDataMgr.html#read-org.apache.jena.query.Dataset-java.io.InputStream-org.apache.jena.riot.Lang-
> >
> > On Mon, Mar 12, 2018 at 8:46 PM, ajs6f  wrote:
> >
> >> I've got a use case for parsing one quad (in NQuads form) from a String.
> >> I've been paging around through RIOT and other parts of Jena, but I just
> >> can't seem to find any way to do this without building up a bunch of
> >> auxiliary objects (like Readers or StreamRDFs, etc.). Performance is
> >> something of a concern, so I'd rather not build up any more than I have to.
> >>
> >> Am I missing something, or do we just not expose that functionality? (I'm
> >> inclined to bet that we _have_ to have impled it somewhere, just for our
> >> own sanity, but maybe not!)
> >>
> >> ajs6f
> >>
> >>
>
>


Re: parse one quad?

2018-03-12 Thread ajs6f
Thanks, Martynas, but no; I don't have a Dataset (and don't need or want to 
build one for a single quad), and no InputStream (although I could get one from 
a String without too much fuss).

RDFDataMgr or RDFParser are usually the best tools for parsing, but I'm looking 
for something a bit lighter-weight.

ajs6f

> On Mar 12, 2018, at 4:07 PM, Martynas Jusevičius  
> wrote:
> 
> Maybe this?
> https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/RDFDataMgr.html#read-org.apache.jena.query.Dataset-java.io.InputStream-org.apache.jena.riot.Lang-
> 
> On Mon, Mar 12, 2018 at 8:46 PM, ajs6f  wrote:
> 
>> I've got a use case for parsing one quad (in NQuads form) from a String.
>> I've been paging around through RIOT and other parts of Jena, but I just
>> can't seem to find any way to do this without building up a bunch of
>> auxiliary objects (like Readers or StreamRDFs, etc.). Performance is
>> something of a concern, so I'd rather not build up any more than I have to.
>> 
>> Am I missing something, or do we just not expose that functionality? (I'm
>> inclined to bet that we _have_ to have impled it somewhere, just for our
>> own sanity, but maybe not!)
>> 
>> ajs6f
>> 
>> 



Re: parse one quad?

2018-03-12 Thread Martynas Jusevičius
Maybe this?
https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/RDFDataMgr.html#read-org.apache.jena.query.Dataset-java.io.InputStream-org.apache.jena.riot.Lang-

On Mon, Mar 12, 2018 at 8:46 PM, ajs6f  wrote:

> I've got a use case for parsing one quad (in NQuads form) from a String.
> I've been paging around through RIOT and other parts of Jena, but I just
> can't seem to find any way to do this without building up a bunch of
> auxiliary objects (like Readers or StreamRDFs, etc.). Performance is
> something of a concern, so I'd rather not build up any more than I have to.
>
> Am I missing something, or do we just not expose that functionality? (I'm
> inclined to bet that we _have_ to have impled it somewhere, just for our
> own sanity, but maybe not!)
>
> ajs6f
>
>


parse one quad?

2018-03-12 Thread ajs6f
I've got a use case for parsing one quad (in NQuads form) from a String. I've 
been paging around through RIOT and other parts of Jena, but I just can't seem 
to find any way to do this without building up a bunch of auxiliary objects 
(like Readers or StreamRDFs, etc.). Performance is something of a concern, so 
I'd rather not build up any more than I have to.

Am I missing something, or do we just not expose that functionality? (I'm 
inclined to bet that we _have_ to have impled it somewhere, just for our own 
sanity, but maybe not!)

ajs6f



Store data with bulk loader via Jena API

2018-03-12 Thread Davide Curcio
I am using the bulk loader through the Jena API to store data in TDB, but I
don't know the best way to use it. I create statements on each iteration and
pass the data to the bulk loader via an InputStream. If I first load all the
statements into a model and store them afterwards, I get an
OutOfMemoryError; if I store the data on every iteration, the system is too
slow. So I would like to know the best way to do this with the Jena
API.

Thanks


Re: Getting Symmetric Concise Bounded Description with Fuseki

2018-03-12 Thread Martynas Jusevičius
I disagree about SCBD as the default. In a Linked Data context, DESCRIBE is
usually used to return description of a resource, meaning the resource is
in the subject position. And then bnode closure is added, because otherwise
there would be no way to reach those bnodes. It's not about exploring the
graph in all directions.

If you want more specific description, then you can always use CONSTRUCT.

Some triplestores, for example Dydra, allow specification of the
description algorithm using a special PREFIX scheme, such as

PREFIX describeForm: 

On Mon, Mar 12, 2018 at 4:40 PM, Reto Gmür  wrote:

> Hi Andy
>
> > -Original Message-
> > From: Andy Seaborne 
> > Sent: Saturday, March 10, 2018 3:47 PM
> > To: users@jena.apache.org
> > Subject: Re: Getting Symmetric Concise Bounded Description with Fuseki
> >
> > Hi Reto,
> >
> > The whole DescribeHandler system is very(, very) old and hasn't changed
> in
> > ages, other than maintenance.
> >
> > On 10/03/18 11:44, Reto Gmür wrote:
> > > Hi Andy,
> > >
> > > It first didn't quite work as I wanted it to: the model of the
> resource passed
> > to the describe model is the default graph so I got only the triples in
> that
> > graph.  Setting "tdb:unionDefaultGraph true" didn't change the graph the
> > DescribeHandler gets.
> >
> > tdb:unionDefaultGraph only affects SPARQL execution.
> >
> > > Looking at the default implementation I saw that the Dataset can be
> > accessed from the context passed to the start method with
> > cxt.get(ARQConstants.sysCurrentDataset). I am now using the Model
> returned
> > by dataset. getUnionModel.
> >
> > That should work.  Generally available getUnionModel post-dates the
> describe
> > handler code.
> >
> > > I'm wondering why the DescribeBNodeClosure doesn't do the same but
> > instead queries for all graphs that contain the resource and then works
> on
> > each of the NamedModel individually. Is the UnionModel returned by the
> > dataset inefficient that you've chosen this approach?
> >
> > I don't think so - much the same work is done, just in different places.
> >
> > getUnionModel will work with blank node named graphs.
> >
> > getUnionModel will do describes spanning graphs, iterating over named
> > graphs will not.
> >
> > > Also the code seems to assume that the name of the graph is a URI, does
> > Jena not support Blank Nodes as names for graphs (having an "anonymous
> > node" as name might be surprising but foreseen in RDF datasets)?
> >
> > Again, old code (pre RDF 1.1, which is where bNode graph names came in).
> >
> > Properly, nowadays, it should all work on DatasetGraph whose API does
> work
> > with bNode graphs.  Again, history.
> >
> > If you want to clean up, please do so.
> >
> > > It seems that even when a DescribeHandler is provided, the default
> handler
> > is executed as well. Is there a way to disable this?
> >
> > IIRC all the handlers are executed - the idea being to apply all policies
> and
> > handlers may only be able to describe certain classes.  Remove any not
> > required, or set your own registry in the query (a bit tricky in Fuseki).
> >
> > > Another question is about the concept of "BNode closure", what's the
> > rationale for expanding only forward properties? Shouldn't a closure be
> > everything that defines the node?
> >
> > It is a simple, basic policy - the idea being that more appropriate ones
> which
> > are data-sensitive would be used. This basic one can go wrong (FOAF
> graphs
> > when people are bnodes) and does not handle IFP; it does cover blank
> nodes
> > used for values with structure and for RDF lists.
> >
> > The point about DESCRIBE is that the "right" answer is not a fixed data-
> > independent algorithm but is best for the data being published.
>
> I realize that. My question was more about the definition of "closure".
> Following forward properties might be a pragmatic approach, the data can
> often be modelled in such a way that this default implementation of
> DESCRIBE returns very useful results.
>
> But, in some cases even forward properties only, might result in a too
> comprehensive response. So if the current system doesn't allow disabling
> the default handler one cannot make this answer smaller (e.g. return a
> description of instances of ex:Organization without all its ex:hasMember
> properties). I think fuseki should both allow returning results that
> contain more as well as less than the default.
>
> As for the best default I think SCBD is the best because independently of
> the data being published and ontologies being used it returns everything
> the server knows about a particular resource, only stopping the contextual
> description where the client can get more information with another DESCRIBE
> query. With SCBD a connected graph can be fully explored with DESCRIBE
> starting at any resource. Yes the response might be too comprehensive and so
> there needs to be a mechanism for DESCRIBE handlers to allow 

Default DESCRIBE Was: Getting Symmetric Concise Bounded Description with Fuseki

2018-03-12 Thread ajs6f

> On Mar 12, 2018, at 11:40 AM, Reto Gmür  wrote:
> ...
>> -Original Message-
>> From: Andy Seaborne 
>> ...
>> The point about DESCRIBE is that the "right" answer is not a fixed data-
>> independent algorithm but is best for the data being published.
> 
> I realize that. My question was more about the definition of "closure". 
> Following forward properties might be a pragmatic approach, the data can 
> often be modelled in such a way that this default implementation of DESCRIBE 
> returns very useful results.
> 
> But, in some cases even forward properties only, might result in a too 
> comprehensive response. So if the current system doesn't allow disabling the 
> default handler one cannot make this answer smaller (e.g. return a 
> description of instances of ex:Organization without all its ex:hasMember 
> properties). I think fuseki should both allow returning results that contain 
> more as well as less than the default.

I think Andy already offered a solution to this problem:

>> IIRC all the handlers are executed - the idea being to apply all policies and 
>> handlers may only be able to describe certain classes.  Remove any not 
>> required, or set your own registry in the query (a bit tricky in Fuseki).

"Remove any not required"

If you build up your own registry, I'm not sure you even need to do that.

> As for the best default I think SCBD is the best because independently of the 
> data being published and ontologies being used it returns everything the 
> server knows about a particular resource, only stopping the contextual 
> description where the client can get more information with another DESCRIBE 
> query. With SCBD a connected graph can be fully explored with DESCRIBE 
> starting at any resource. Yes the response might be too comprehensive and so 
> there needs to be a mechanism for DESCRIBE handlers to allow responses that 
> are smaller than the default. But I argue that wasting a bit of bandwidth in 
> some cases is a better default than arbitrarily limiting information and thus 
> providing too little information in many cases.

I have to disagree. The default as-is seems to have served well for most users 
so far. Perhaps this is behavior ("expose all the triples in partitions using 
one request per partition") that is more suited to something like Linked Data 
Platform?

ajs6f




Re: Getting Symmetric Concise Bounded Description with Fuseki

2018-03-12 Thread Andy Seaborne



On 12/03/18 15:40, Reto Gmür wrote:

Hi Andy


-Original Message-
From: Andy Seaborne 
Sent: Saturday, March 10, 2018 3:47 PM
To: users@jena.apache.org
Subject: Re: Getting Symmetric Concise Bounded Description with Fuseki

Hi Reto,

The whole DescribeHandler system is very(, very) old and hasn't changed in
ages, other than maintenance.

On 10/03/18 11:44, Reto Gmür wrote:

Hi Andy,

It first didn't quite work as I wanted it to: the model of the resource passed

to the describe model is the default graph so I got only the triples in that
graph.  Setting "tdb:unionDefaultGraph true" didn't change the graph the
DescribeHandler gets.

tdb:unionDefaultGraph only affects SPARQL execution.


Looking at the default implementation I saw that the Dataset can be

accessed from the context passed to the start method with
cxt.get(ARQConstants.sysCurrentDataset). I am now using the Model returned
by dataset. getUnionModel.

That should work.  Generally available getUnionModel post-dates the describe
handler code.


I'm wondering why the DescribeBNodeClosure doesn't do the same but

instead queries for all graphs that contain the resource and then works on
each of the NamedModel individually. Is the UnionModel returned by the
dataset inefficient that you've chosen this approach?

I don't think so - much the same work is done, just in different places.

getUnionModel will work with blank node named graphs.

getUnionModel will do describes spanning graphs, iterating over named
graphs will not.


Also the code seems to assume that the name of the graph is a URI, does

Jena not support Blank Nodes as names for graphs (having an "anonymous
node" as name might be surprising but foreseen in RDF datasets)?

Again, old code (pre RDF 1.1, which is where bNode graph names came in).

Properly, nowadays, it should all work on DatasetGraph whose API does work
with bNode graphs.  Again, history.

If you want to clean up, please do so.


It seems that even when a DescribeHandler is provided, the default handler

is executed as well. Is there a way to disable this?

IIRC all the handlers are executed - the idea being to apply all policies and
handlers may only be able to describe certain classes.  Remove any not
required, or set your own registry in the query (a bit tricky in Fuseki).


Another question is about the concept of "BNode closure", what's the

rationale for expanding only forward properties? Shouldn't a closure be
everything that defines the node?

It is a simple, basic policy - the idea being that more appropriate ones which
are data-sensitive would be used. This basic one can go wrong (FOAF graphs
when people are bnodes) and does not handle IFP; it does cover blank nodes
used for values with structure and for RDF lists.

The point about DESCRIBE is that the "right" answer is not a fixed data-
independent algorithm but is best for the data being published.


I realize that. My question was more about the definition of "closure". 
Following forward properties might be a pragmatic approach, the data can often be 
modelled in such a way that this default implementation of DESCRIBE returns very useful 
results.

But, in some cases even forward properties only, might result in a too 
comprehensive response. So if the current system doesn't allow disabling the 
default handler


DescribeHandlerRegistry.get().clear();


one cannot make this answer smaller (e.g. return a description of instances of 
ex:Organization without all its ex:hasMember properties). I think fuseki should 
both allow returning results that contain more as well as less than the default.

As for the best default I think SCBD is the best because independently of the 
data being published and ontologies being used it returns everything the server 
knows about a particular resource, only stopping the contextual description 
where the client can get more information with another DESCRIBE query. With 
SCBD a connected graph can be fully explored with DESCRIBE starting at any 
resource. Yes the response might be too comprehensive and so there needs to be a 
mechanism for DESCRIBE handlers to allow responses that are smaller than the 
default. But I argue that wasting a bit of bandwidth in some cases is a better 
default than arbitrarily limiting information and thus providing too little 
information in many cases.

Reto




  Andy



Cheers,
Reto


-Original Message-
From: Reto Gmür 
Sent: Friday, March 9, 2018 3:42 PM
To: users@jena.apache.org
Subject: RE: Getting Symmetric Concise Bounded Description with
Fuseki

Great, it works!

Here's the code:
https://github.com/linked-solutions/fuseki-scbd-describe

Cheers,
Reto


-Original Message-
From: Andy Seaborne 
Sent: Thursday, March 8, 2018 5:35 PM
To: users@jena.apache.org
Subject: Re: Getting Symmetric Concise Bounded Description with
Fuseki



On 08/03/18 16:12, Reto Gmür wrote:

Thanks for the link ajs6f.

The described method 

RE: Getting Symmetric Concise Bounded Description with Fuseki

2018-03-12 Thread Reto Gmür
Hi Andy

> -Original Message-
> From: Andy Seaborne 
> Sent: Saturday, March 10, 2018 3:47 PM
> To: users@jena.apache.org
> Subject: Re: Getting Symmetric Concise Bounded Description with Fuseki
> 
> Hi Reto,
> 
> The whole DescribeHandler system is very(, very) old and hasn't changed in
> ages, other than maintenance.
> 
> On 10/03/18 11:44, Reto Gmür wrote:
> > Hi Andy,
> >
> > It first didn't quite work as I wanted it to: the model of the resource 
> > passed
> to the describe model is the default graph so I got only the triples in that
> graph.  Setting "tdb:unionDefaultGraph true" didn't change the graph the
> DescribeHandler gets.
> 
> tdb:unionDefaultGraph only affects SPARQL execution.
> 
> > Looking at the default implementation I saw that the Dataset can be
> accessed from the context passed to the start method with
> cxt.get(ARQConstants.sysCurrentDataset). I am now using the Model returned
> by dataset. getUnionModel.
> 
> That should work.  Generally available getUnionModel post-dates the describe
> handler code.
> 
> > I'm wondering why the DescribeBNodeClosure doesn't do the same but
> instead queries for all graphs that contain the resource and then works on
> each of the NamedModel individually. Is the UnionModel returned by the
> dataset inefficient that you've chosen this approach?
> 
> I don't think so - much the same work is done, just in different places.
> 
> getUnionModel will work with blank node named graphs.
> 
> getUnionModel will do describes spanning graphs, iterating over named
> graphs will not.
> 
> > Also the code seems to assume that the name of the graph is a URI, does
> Jena not support Blank Nodes as names for graphs (having an "anonymous
> node" as name might be surprising but foreseen in RDF datasets)?
> 
> Again, old code (pre RDF 1.1, which is where bNode graph names came in).
> 
> Properly, nowadays, it should all work on DatasetGraph whose API does work
> with bNode graphs.  Again, history.
> 
> If you want to clean up, please do so.
> 
> > It seems that even when a DescribeHandler is provided, the default handler
> is executed as well. Is there a way to disable this?
> 
> IIRC all the handlers are executed - the idea being to apply all policies and
> handlers may only be able to describe certain classes.  Remove any not
> required, or set your own registry in the query (a bit tricky in Fuseki).
> 
> > Another question is about the concept of "BNode closure", what's the
> rationale for expanding only forward properties? Shouldn't a closure be
> everything that defines the node?
> 
> It is a simple, basic policy - the idea being that more appropriate ones which
> are data-sensitive would be used. This basic one can go wrong (FOAF graphs
> when people are bnodes) and does not handle IFP; it does cover blank nodes
> used for values with structure and for RDF lists.
> 
> The point about DESCRIBE is that the "right" answer is not a fixed data-
> independent algorithm but is best for the data being published.

I realize that. My question was more about the definition of "closure". 
Following forward properties might be a pragmatic approach, the data can often 
be modelled in such a way that this default implementation of DESCRIBE returns 
very useful results.

But in some cases even following forward properties only might result in too 
comprehensive a response. So if the current system doesn't allow disabling the 
default handler, one cannot make this answer smaller (e.g. return a description 
of instances of ex:Organization without all its ex:hasMember properties). I 
think Fuseki should allow returning results that contain more as well as 
less than the default.

As for the best default I think SCBD is the best because independently of the 
data being published and ontologies being used it returns everything the server 
knows about a particular resource, only stopping the contextual description 
where the client can get more information with another DESCRIBE query. With 
SCBD a connected graph can be fully explored with DESCRIBE starting at any 
resource. Yes, the response might be too comprehensive, and so there needs to be a 
mechanism for DESCRIBE handlers to allow responses that are smaller than the 
default. But I argue that wasting a bit of bandwidth in some cases is a better 
default than arbitrarily limiting information and thus providing too little 
information in many cases.

Reto


> 
>  Andy
> 
> >
> > Cheers,
> > Reto
> >
> >> -Original Message-
> >> From: Reto Gmür 
> >> Sent: Friday, March 9, 2018 3:42 PM
> >> To: users@jena.apache.org
> >> Subject: RE: Getting Symmetric Concise Bounded Description with
> >> Fuseki
> >>
> >> Great, it works!
> >>
> >> Here's the code:
> >> https://github.com/linked-solutions/fuseki-scbd-describe
> >>
> >> Cheers,
> >> Reto
> >>
> >>> -Original Message-
> >>> From: Andy Seaborne 
> >>> Sent: Thursday, March 8, 2018 5:35 PM
> >>> To: 

Re: FILTER (CONTAINS on a graph name : should order matter ?

2018-03-12 Thread Martynas Jusevičius
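?thing is undefined within GRAPH in your second query. A FILTER only sees
variables bound inside the group it appears in, and ?thing is bound by the
GRAPH clause itself, outside the inner { } group, so the filter expression
raises an error and every solution is dropped.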

On Mon, Mar 12, 2018 at 3:59 PM, Jean-Marc Vanel 
wrote:

> Hi !
>
> This works as expected:
>
> SELECT DISTINCT ?thing
>   WHERE {
>graph ?thing {
>  [] ?p ?O .
>}
>FILTER (CONTAINS( str(?thing),"cartopair"))
>  }
>
> but this gives an empty result :
>
> SELECT DISTINCT ?thing
> WHERE {
>  graph ?thing {
>[] ?p ?O .
>FILTER (CONTAINS( str(?thing),"cartopair"))
>  }
> }
>
>
> --
> Jean-Marc Vanel
> http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me#subject
> Déductions SARL - Consulting, services, training,
> Rule-based programming, Semantic Web
> +33 (0)6 89 16 29 52
> Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
>


FILTER (CONTAINS on a graph name : should order matter ?

2018-03-12 Thread Jean-Marc Vanel
Hi !

This works as expected:

SELECT DISTINCT ?thing
WHERE {
  graph ?thing {
    [] ?p ?O .
  }
  FILTER (CONTAINS( str(?thing),"cartopair"))
}

but this gives an empty result :

SELECT DISTINCT ?thing
WHERE {
  graph ?thing {
    [] ?p ?O .
    FILTER (CONTAINS( str(?thing),"cartopair"))
  }
}


-- 
Jean-Marc Vanel
http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me#subject

Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
+33 (0)6 89 16 29 52
Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui


Re: Streaming CONSTRUCT/INSERTs in TDB

2018-03-12 Thread Andy Seaborne



On 11/03/18 21:27, Adrian Gschwend wrote:

On 10.03.18 00:36, Andy Seaborne wrote:

Hi Andy,


Executes in 2m 20s (java8) for me (and 1m 49s with java9 which is .
Default heap which is IIRC 25% of RAM or 8G. Cold JVM, cold file cache.


wow, did you do that with TDB commandline tools? Default heap in terms
of default settings of Fuseki?


Yes - all this was just using the command line tools.

Default heap but that's different for me and for you.  I'm running with 
8G heap (25% of 32G).


4G didn't work.

On Linux/Mac, set and export JVM_ARGS=-Xmx8G and make sure it gets 
passed to the script.  On Windows, I think you need to edit the scripts.





If you have an 8G machine, an 8G heap may cause problems (swapping).


I have 16 gigs on my local system.


Earlier:
"""
  And I once allocated almost all I had on my system (> 8GB)
"""

I'm afraid it sounds like that didn't get set - if you can see the 
commandline for the running java process via system tools, it will have 
the -Xmx setting.  Or your data/query are different - I see %% stuff in 
the update.
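
One way to confirm from inside the JVM that the setting arrived is sketched below. It uses only the standard library; `HeapCheck` is a hypothetical class name, not a Jena tool, and it would have to be run with the same JVM_ARGS as the TDB scripts for the numbers to be comparable.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical diagnostic (not part of Jena): reports the heap limit the JVM is using. */
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in MiB. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** Arguments passed to the JVM itself; contains the -Xmx flag if one was given. */
    public static List<String> jvmArgs() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("JVM args: " + jvmArgs());
    }
}
```

If the printed arguments do not include an -Xmx entry, the variable most likely never reached the java command line.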


and if Linux, set "swappiness" to zero.

https://askubuntu.com/questions/103915/how-do-i-configure-swappiness

otherwise the kernel keeps some free space.  That does not cause the CPUs 
to go wild.


Don't over-allocate - the TDB files are memory-mapped files and that does 
not come out of the heap.



Does the CPU load go up very high, on all cores? That's a sign of a full
GC trying to reclaim space before an OOME.


yes that's exactly what is happening. What is OOME?


It's about to run out of heap.  java8 has a peculiar feature that when 
heap usage grows, it tries to full GC to create space, then tries again 
and again, ... to the point where the machine is only doing GCs which 
are parallel hence all the CPUs go crazy.


Andy




If you get the same with TDB2, then the space isn't going in
transactions in TDB1.


not sure what that means but ok :)

regards

Adrian



Re: [3.0.1] ResultSetFactory.fromJSON() won't parse ASK JSON result

2018-03-12 Thread Andy Seaborne

JSONInput.make(InputStream) -> SPARQLResult

Andy

On 12/03/18 10:13, Martynas Jusevičius wrote:

Hi Andy,

I'm not using QueryExecution here, I'm trying to parse JSON read from HTTP
InputStream using ResultSetFactory.fromJSON().

Then I want to carry the result set, maybe do some logic based on it, and
possibly serialize it back using ResultSetFormatter.

Is that not possible with ASK result?

On Mon, Mar 12, 2018 at 9:46 AM, Andy Seaborne  wrote:




On 11/03/18 23:03, Martynas Jusevičius wrote:


Hi,

I'm getting the following JSON result from an ASK query:

{ "head": {}, "boolean": true }

However, the method that usually works fine, will not parse it from
InputStream (Jena 3.0.1):

  org.apache.jena.sparql.resultset.ResultSetException: Not a ResultSet result
org.apache.jena.sparql.resultset.SPARQLResult.getResultSet(SPARQLResult.java:94)
org.apache.jena.sparql.resultset.JSONInput.fromJSON(JSONInput.java:64)
org.apache.jena.query.ResultSetFactory.fromJSON(ResultSetFactory.java:331)

I stepped inside the code and I see that JSONObject is parsed fine, but
afterwards SPARQLResult.resultSet field is not being set for some reason.

Any ideas?



The outcome of an ASK query is a boolean, not a ResultSet.

See execAsk.

SPARQLResult is the class for a holder of any SPARQL result type.

 Andy




Martynas






Re: [3.0.1] ResultSetFactory.fromJSON() won't parse ASK JSON result

2018-03-12 Thread Martynas Jusevičius
Hi Andy,

I'm not using QueryExecution here, I'm trying to parse JSON read from HTTP
InputStream using ResultSetFactory.fromJSON().

Then I want to carry the result set, maybe do some logic based on it, and
possibly serialize it back using ResultSetFormatter.

Is that not possible with ASK result?

On Mon, Mar 12, 2018 at 9:46 AM, Andy Seaborne  wrote:

>
>
> On 11/03/18 23:03, Martynas Jusevičius wrote:
>
>> Hi,
>>
>> I'm getting the following JSON result from an ASK query:
>>
>>{ "head": {}, "boolean": true }
>>
>> However, the method that usually works fine, will not parse it from
>> InputStream (Jena 3.0.1):
>>
>>  org.apache.jena.sparql.resultset.ResultSetException: Not a ResultSet result
>> org.apache.jena.sparql.resultset.SPARQLResult.getResultSet(SPARQLResult.java:94)
>> org.apache.jena.sparql.resultset.JSONInput.fromJSON(JSONInput.java:64)
>> org.apache.jena.query.ResultSetFactory.fromJSON(ResultSetFactory.java:331)
>>
>> I stepped inside the code and I see that JSONObject is parsed fine, but
>> afterwards SPARQLResult.resultSet field is not being set for some reason.
>>
>> Any ideas?
>>
>
> The outcome of an ASK query is a boolean, not a ResultSet.
>
> See execAsk.
>
> SPARQLResult is the class for a holder of any SPARQL result type.
>
> Andy
>
>
>>
>> Martynas
>>
>>


Re: Best way to save a large amount of triples in TDB

2018-03-12 Thread Dick Murray
On Mon, 12 Mar 2018, 09:27 Davide Curcio,  wrote:

> Hi,
> I want to store a large amount of data inside the TDB server with the
>

Quantity or size on disk?

> Jena API. In my code, I retrieve data for each iteration, and so I need
> to store these data in TDB, but if I create all statements with Jena
> API, for each iteration, before load data in the server, obviously I've
> problems with RAM. But if I try to commit data for each iterator in the
> server, and so open and close write transaction each time, obviously
> it's too slow. What's the best way to do this?
>

Standard bulk load as per any storage system...
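
A common middle ground between "everything in memory" and "one transaction per iteration" is to buffer a fixed number of items and write each batch in a single step. The thread itself does not spell this out, so the following is only a sketch under that assumption. `BatchBuffer` and its `writer` callback are hypothetical names, not Jena classes; with TDB the callback would presumably wrap one write transaction around the whole batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not a Jena class): hands items on in fixed-size batches. */
public class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> writer;
    private final List<T> pending = new ArrayList<>();

    public BatchBuffer(int batchSize, Consumer<List<T>> writer) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        this.batchSize = batchSize;
        this.writer = writer;
    }

    /** Buffers one item and writes the batch out once it is full. */
    public void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) flush();
    }

    /** Writes whatever is buffered; call once more after the last item. */
    public void flush() {
        if (pending.isEmpty()) return;
        writer.accept(new ArrayList<>(pending)); // copy, so the callback may keep the list
        pending.clear();
    }
}
```

Memory use is then bounded by the batch size, and the per-transaction cost is paid once per batch instead of once per iteration; a suitable batch size has to be found by measurement.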


> Thanks
>
>


Best way to save a large amount of triples in TDB

2018-03-12 Thread Davide Curcio
Hi,
I want to store a large amount of data inside the TDB server with the 
Jena API. In my code, I retrieve data for each iteration, and so I need 
to store these data in TDB, but if I create all statements with Jena 
API, for each iteration, before load data in the server, obviously I've 
problems with RAM. But if I try to commit data for each iterator in the 
server, and so open and close write transaction each time, obviously 
it's too slow. What's the best way to do this?

Thanks



Re: [3.0.1] ResultSetFactory.fromJSON() won't parse ASK JSON result

2018-03-12 Thread Andy Seaborne



On 11/03/18 23:03, Martynas Jusevičius wrote:

Hi,

I'm getting the following JSON result from an ASK query:

   { "head": {}, "boolean": true }

However, the method that usually works fine, will not parse it from
InputStream (Jena 3.0.1):

 org.apache.jena.sparql.resultset.ResultSetException: Not a ResultSet result
org.apache.jena.sparql.resultset.SPARQLResult.getResultSet(SPARQLResult.java:94)
org.apache.jena.sparql.resultset.JSONInput.fromJSON(JSONInput.java:64)
org.apache.jena.query.ResultSetFactory.fromJSON(ResultSetFactory.java:331)

I stepped inside the code and I see that JSONObject is parsed fine, but
afterwards SPARQLResult.resultSet field is not being set for some reason.

Any ideas?


The outcome of an ASK query is a boolean, not a ResultSet.

See execAsk.

SPARQLResult is the class for a holder of any SPARQL result type.

Andy




Martynas