date:20171013

Re: Nested select doesn't work as expected

2017-10-13 Thread Dimov, Stefan

Thanks, Andy!

S.

On 10/13/17, 3:40 AM, "Andy Seaborne"  wrote:

Thanks.

The inner SELECT isn't really necessary - it is just hiding ?p so rename 
that and don't have it in the outer projection:

SELECT ?s ?p ?o
FROM named_graph:m_p
FROM NAMED named_graph:m_p_s
WHERE
{
   ?s ?p ?o
   GRAPH named_graph:m_p_s { ?o ?px w:frnd }
}

 Andy

On 13/10/17 11:14, Lorenz Buehmann wrote:
> I answered it on StackOverflow. And the formulation of the question was
> confusing. He probably meant that the subjects of the second graph are
> the object of the first graph. Anything else wouldn't make sense...
> 
> 
> Here is the data that I used for testing, thus, Andy could also use it:
> 
> Graph named_graph:m_p
> 
> @prefix :  .
> :p1    :pred1    :mp1 .
> :p2    :pred1    :mp1 .
> :p3    :pred1    :mp2 .
> :p4    :pred1    :mp2 .
> :p5    :pred1    :mp3 .
> :p6    :pred1    :mp3 .
> 
> 
> Graph named_graph:m_p_s
> 
> @prefix :  .
> @prefix w:  .
> 
> :mp1   :pred2    w:frnd .
> :mp1   :pred2    w:fdlfkdl .
> :mp2   :pred2    w:kdsjflk .
> :mp2   :pred2    w:jflksdlkj .
> :mp3   :pred2    w:frnd .
> :mp3   :pred2    w:fjksldjfls .
> 
> 
> Working query:
> 
> PREFIX named_graph: 
> PREFIX w: 
> 
> SELECT *
> FROM named_graph:m_p
> FROM NAMED named_graph:m_p_s
> WHERE
> {
>   ?s ?p ?o
>   {
>     SELECT ?o WHERE {
>       GRAPH named_graph:m_p_s { ?o ?p w:frnd }
>     }
>   }
> }
> 
> 
> Cheers,
> 
> Lorenz
> 
> 
> 
> On 13.10.2017 10:20, Andy Seaborne wrote:
>>
>>
>> On 13/10/17 02:36, Dimov, Stefan wrote:
>>> Hello,
>>>
>>> I have two graphs:
>>
>> Which storage system are they in?
>> Which version of Jena?
>>
>>>
>>> m_p
>>>
>>> p1    pred1    mp1
>>> p2    pred1    mp1
>>> p3    pred1    mp2
>>> p4    pred1    mp2
>>> p5    pred1    mp3
>>> p6    pred1    mp3
>>>
>>> and m_p_s
>>>
>>> mp1   pred2    w:frnd
>>> mp1   pred2    w:fdlfkdl
>>> mp2   pred2    w:kdsjflk
>>> mp2   pred2    w:jflksdlkj
>>> mp3   pred2    w:frnd
>>> mp3   pred2    w:fjksldjfls
>>
>> Please could you provide complete data such as TriG?
>>
>> It is a barrier to volunteers who answer questions if the first thing
>> you have to do is mangle email, data preparation and disentangle
>> partial queries.
>>
>>> and I want to get all the triples in m_p which objects are predicates
>>> in m_p_s and the object of that predicates in m_p_s is w:frnd
>>>
>>> In other words I want to make query that returns (results with) p1,
>>> p2, p5 and p6 from m_p and doesn’t return p3 and p4.
>>>
>>> I’m trying to do this with nested queries,
>>
>> You don't need a nested SELECT.
>>
>> SELECT * {
>> GRAPH m_p   { ?s ?p ?o }
>> GRAPH m_p_s { ?x ?o w:frnd }
>> }
>>
>> (untested)
>>
>>> but it doesn’t work: E.g.
>>>
>>>   SELECT $subj $pred $pr
>>
>> $subj and $pred are not set in the query.
>>
>> This isn't SQL! In SPARQL, variables get bound in graph patterns.
>>
>>>   FROM NAMED named_graph:m_p
>>
>> The RDF dataset for this query is a single named graph and empty
>> default graph.
>>
>> Did you mean:
>>
>> FROM NAMED named_graph:m_p
>> FROM NAMED named_graph:m_p_s
>>
>> ?
>>
>> or indeed no FROM NAMED and use a dataset directly.
>>
>>
>>>   WHERE
>>>   {
>>>   SELECT $pr
>>>   WHERE
>>>   {
>>>  GRAPH named_graph:m_p_s { $pr $pred0 w:frnd }
>>
>> the m_p_s graph isn't in the dataset hence this pattern is empty.
>>
>> GRAPH is for access; FROM NAMED for setting up.
>>
>>>   }
>>>   }
>>>
>>> returns empty result. I tried different things, but either I get an
>>> error
>>
>> What is the error?
>>
>>> or empty result or everything in m_p.
>>>
>>> I don’t want to use UNION or FILTER for performance reasons.
>>>
>>> Do you have an idea how I can do it?
>>>
>>> Regards,
>>> Stefan
>>>
>

Re: Questions about Jena CLI tools

2017-10-13 Thread Laura Morales

> Have you tried?
> It should do.

Yes I tried and it looks like it does. Just wanted to be sure.

Re: Backup doesn't work with Fuseki 3.4.0

2017-10-13 Thread Andy Seaborne


Indeed - it's missing some logging lines:

Standalone:

11:49:56.575 INFO  Backup   :: [10]  Start backup /ds -> 
/home/afs/tmp/run/backups/ds_2017-10-13_11-49-56

11:49:56.576 INFO  Admin:: [10] 200 OK (1 ms)
11:49:56.581 INFO  Backup   :: [10]  Finish backup /ds 
-> /home/afs/tmp/run/backups/ds_2017-10-13_11-49-56


and with a WAR file, I don't get these. The task is "somewhere".

Could you raise a JIRA please?

Andy

On 13/10/17 11:33, DAVID MOLINA ESTRADA wrote:



Hi,

I am running Fuseki.war 3.4.0 in Tomcat. With the previous version, 2.6.0, I could do a 
backup from the Fuseki web console, but after upgrading the backup doesn't work. I don't get any 
error; the application responds as if everything was fine: "Task backup started at 
2017-10-13T12:15:17.431+02:00, finished at 2017-10-13T12:15:17.433+02:00".


In log file, I can see:


[2017-10-13 12:15:17] Admin  INFO  [3] POST 
http://localhost:8080/fuseki/$/backup/ontofarma
[2017-10-13 12:15:17] Admin  INFO  [3] Backup dataset /ontofarma
[2017-10-13 12:15:17] Server INFO  Task : 1 : backup
[2017-10-13 12:15:17] Server INFO  [Task 1] starts : backup
[2017-10-13 12:15:17] Admin  INFO  [3] 200 OK (4 ms)
[2017-10-13 12:15:17] Server INFO  [Task 1] finishes : backup
[2017-10-13 12:15:17] Server INFO  [4] GET 
http://localhost:8080/fuseki/$/tasks/1
[2017-10-13 12:15:17] Server INFO  [4] Task 1
[2017-10-13 12:15:17] Server INFO  [4] 200 OK (0 ms)


But, the backup folder is empty.

Thank you,
David Molina

Evite imprimir este mensaje si no es estrictamente necesario | Eviti imprimir 
aquest missatge si no és estrictament necessari | Avoid printing this message 
if it is not absolutely necessary

Re: Questions about Jena CLI tools

2017-10-13 Thread Andy Seaborne




On 12/10/17 22:29, Laura Morales wrote:

- what is the difference between "update" and "tdbupdate"


tdbupdate adds TDB specific argument handling.



- if I have a file "file.nt" containing triples, and I use "tdbloader --graph ex:name file.nt", is this 
the equivalent of adding a "" label to all triples in "file.nt". In other words, all the 
triples from the file will be loaded in the graph name specified by --graph?



Have you tried?

It should do.

Andy
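
A quick way to confirm where the triples ended up is to count what is in the named graph after loading. This is only a sketch: the graph IRI <http://example/name> is a placeholder (the label in the question did not survive in the archive), so substitute whatever was passed to --graph.

```sparql
# Assumption: the data was loaded with --graph set to the IRI below.
# The IRI is a placeholder, not the one from the original question.
SELECT (COUNT(*) AS ?triples)
WHERE {
  GRAPH <http://example/name> { ?s ?p ?o }
}
```

If the count matches the number of distinct triples in file.nt, everything from the file went into that graph.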

Re: Nested select doesn't work as expected

2017-10-13 Thread Andy Seaborne


Thanks.

The inner SELECT isn't really necessary - it is just hiding ?p so rename 
that and don't have it in the outer projection:


SELECT ?s ?p ?o
FROM named_graph:m_p
FROM NAMED named_graph:m_p_s
WHERE
{
  ?s ?p ?o
  GRAPH named_graph:m_p_s { ?o ?px w:frnd }
}

Andy
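
In the nested form the inner SELECT projects only ?o, so its ?p never meets the outer ?p. Once flattened, reusing the same name would also join the two patterns on the predicate, which is why it becomes ?px. A self-contained sketch with the declarations spelled out follows; both prefix IRIs are placeholders, since the real ones are missing from the archived messages.

```sparql
# Assumption: placeholder prefix IRIs, chosen only for illustration.
PREFIX named_graph: <http://example.org/graph/>
PREFIX w:           <http://example.org/w#>

SELECT ?s ?p ?o
FROM named_graph:m_p                 # m_p becomes the default graph
FROM NAMED named_graph:m_p_s         # m_p_s is available via GRAPH
WHERE
{
  ?s ?p ?o .                                        # triples in m_p
  GRAPH named_graph:m_p_s { ?o ?px w:frnd }         # ?o must have some link to w:frnd
}
```

With the test data from Lorenz's message this should select the m_p triples whose subjects are p1, p2, p5 and p6.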

On 13/10/17 11:14, Lorenz Buehmann wrote:

I answered it on StackOverflow. And the formulation of the question was
confusing. He probably meant that the subjects of the second graph are
the object of the first graph. Anything else wouldn't make sense...


Here is the data that I used for testing, thus, Andy could also use it:

Graph named_graph:m_p

@prefix :  .
:p1    :pred1    :mp1 .
:p2    :pred1    :mp1 .
:p3    :pred1    :mp2 .
:p4    :pred1    :mp2 .
:p5    :pred1    :mp3 .
:p6    :pred1    :mp3 .


Graph named_graph:m_p_s

@prefix :  .
@prefix w:  .

:mp1   :pred2    w:frnd .
:mp1   :pred2    w:fdlfkdl .
:mp2   :pred2    w:kdsjflk .
:mp2   :pred2    w:jflksdlkj .
:mp3   :pred2    w:frnd .
:mp3   :pred2    w:fjksldjfls .


Working query:

PREFIX named_graph: 
PREFIX w: 

SELECT *
FROM named_graph:m_p
FROM NAMED named_graph:m_p_s
WHERE
{
   ?s ?p ?o
   {
     SELECT ?o WHERE {
       GRAPH named_graph:m_p_s { ?o ?p w:frnd }
     }
   }
}


Cheers,

Lorenz



On 13.10.2017 10:20, Andy Seaborne wrote:



On 13/10/17 02:36, Dimov, Stefan wrote:

Hello,

I have two graphs:


Which storage system are they in?
Which version of Jena?



m_p

p1    pred1    mp1
p2    pred1    mp1
p3    pred1   mp2
p4    pred1   mp2
p5    pred1   mp3
p6    pred1   mp3

and m_p_s

mp1   pred2    w:frnd
mp1   pred2    w:fdlfkdl
mp2   pred2    w:kdsjflk
mp2   pred2   w:jflksdlkj
mp3   pred2   w:frnd
mp3   pred2   w:fjksldjfls


Please could you provide complete data such as TriG?

It is a barrier to volunteers who answer questions if the first thing
you have to do is mangle email, data preparation and disentangle
partial queries.


and I want to get all the triples in m_p which objects are predicates
in m_p_s and the object of that predicates in m_p_s is w:frnd

In other words I want to make query that returns (results with) p1,
p2, p5 and p6 from m_p and doesn’t return p3 and p4.

I’m trying to do this with nested queries,


You don't need a nested SELECT.

SELECT * {
    GRAPH m_p   { ?s ?p ?o }
    GRAPH m_p_s { ?x ?o w:frnd }
}

(untested)


but it doesn’t work: E.g.

  SELECT $subj $pred $pr


$subj and $pred are not set in the query.

This isn't SQL! In SPARQL, variables get bound in graph patterns.


  FROM NAMED named_graph:m_p


The RDF dataset for this query is a single named graph and empty
default graph.

Did you mean:

FROM NAMED named_graph:m_p
FROM NAMED named_graph:m_p_s

?

or indeed no FROM NAMED and use a dataset directly.



  WHERE
  {
  SELECT $pr
  WHERE
  {
     GRAPH named_graph:m_p_s { $pr $pred0 w:frnd }


the m_p_s graph isn't in the dataset hence this pattern is empty.

GRAPH is for access; FROM NAMED for setting up.


  }
  }

returns empty result. I tried different things, but either I get an
error


What is the error?


or empty result or everything in m_p.

I don’t want to use UNION or FILTER for performance reasons.

Do you have an idea how I can do it?

Regards,
Stefan

Backup doesn't work with Fuseki 3.4.0

2017-10-13 Thread DAVID MOLINA ESTRADA



Hi,

I am running Fuseki.war 3.4.0 in Tomcat. With the previous version, 2.6.0, 
I could do a backup from the Fuseki web console, but after upgrading the backup doesn't work. 
I don't get any error; the application responds as if everything was fine: 
"Task backup started at 2017-10-13T12:15:17.431+02:00, finished at 
2017-10-13T12:15:17.433+02:00".


In log file, I can see:


[2017-10-13 12:15:17] Admin  INFO  [3] POST 
http://localhost:8080/fuseki/$/backup/ontofarma
[2017-10-13 12:15:17] Admin  INFO  [3] Backup dataset /ontofarma
[2017-10-13 12:15:17] Server INFO  Task : 1 : backup
[2017-10-13 12:15:17] Server INFO  [Task 1] starts : backup
[2017-10-13 12:15:17] Admin  INFO  [3] 200 OK (4 ms)
[2017-10-13 12:15:17] Server INFO  [Task 1] finishes : backup
[2017-10-13 12:15:17] Server INFO  [4] GET 
http://localhost:8080/fuseki/$/tasks/1
[2017-10-13 12:15:17] Server INFO  [4] Task 1
[2017-10-13 12:15:17] Server INFO  [4] 200 OK (0 ms)


But, the backup folder is empty.

Thank you,
David Molina

Evite imprimir este mensaje si no es estrictamente necesario | Eviti imprimir 
aquest missatge si no és estrictament necessari | Avoid printing this message 
if it is not absolutely necessary

Re: How to increase performance

2017-10-13 Thread Lorenz Buehmann



On 13.10.2017 09:48, George News wrote:
> Hi all,
>
> Thanks a lot for your answers... I have "negotiated" with the admins of
> the project and I will be giving you examples of the queries and data ;)
>
> We really need to enhance performance. BTW, is Virtuoso good at inference,
> or will I have the same issues?
I don't think that people here have that much experience with Virtuoso.
I used Virtuoso 7.x sometimes, but I can't say whether it's "good".
Basically, it also applies rule-based inference, but as far as I know
the focus was never on performance regarding inference. And the
convenience was also not that good from my point of view.

The new Virtuoso 8.x is supposed to have a more powerful reasoning engine than
before - at least that's what's announced in some blog posts. However,
there are no benchmarks etc.
>
> Thanks again.
> Regards,
> Jorge
>
> On 2017-10-11 15:47, Rob Vesse wrote:
>> Comments inline:
>>
>> On 11/10/2017 11:57, "George News"  wrote:
>>
>> Hi all,
>> 
>> The project I'm working on currently has a TDB store with approximately 100M
>> triples and the size is increasing quite quickly. When I make a typical
>> SPARQL query for getting data from the system, it takes ages, sometimes
>> more than 10-20 minutes. I think performance wise this is not really
>> user friendly. Therefore I need to know how I can increase the speed, 
>> etc.
>> 
>> I'm running the whole system on a machine with Intel Xeon E312xx with
>> 32Gb RAM and many times I'm getting OutofMemory Exceptions and the
>> google.cache that Jena handles is the one that seems to be causing the
>> problem.
>>
>>  Specific stack traces would be useful to understand where the cache is 
>> being exploded. Certain kinds of query may use the cache more heavily than 
>> others so some elaboration on the general construction of queries would be 
>> interesting.
>> 
>> Are the figures I'm pointing out normal (machine specs, response time,
>> etc.)? Is it too big/too small?
>>
>>  The size of the data seems small relative to the size of the machine. You 
>> don’t specify whether you change the JVM heap size, most memory usage in TDB 
>> is off-heap via memory mapped files so setting too large a heap can 
>> negatively impact performance.
>>
>>  The response times seem very poor but that may be the nature of your 
>> queries and data structure, however since you are unable to show those we 
>> can only provide generalisations
>> 
>> For the moment, we have decided to split the graph in pieces, that is,
>> generating a new named graph every now and then so the amount of
>> information stored in a "current" graph is smaller. Then restricting the
>> query to a set of graphs things work better.
>> 
>> Although this solution works, when we merge the graphs for historical
>> queries, we are facing the same problem as before. Then, how can we
>> increase the speed?
>> 
>> I cannot disclose the dataset or part of it, but I will try to somehow
>> explain it.
>> 
>> - Ids for entities are approximately 255 random ASCII characters. Does
>> the size of the ids affect the speed of the SPARQL queries? If yes, can
>> I apply a Lucene index to the IDs in order to reduce the query time?
>>
>>  It depends on the nature of the query. All terms are mapped into 64-bit 
>> internal identifiers, these are only mapped back to the original terms as 
>> and when the query engine and/or results serialisation requires it.  A 
>> cache is used to speed up the mapping in both directions so depending on the 
>> nature of the queries and your system loads you may be thrashing this cache.
>> 
>> - The depth level of the graph or the information relationship is around
>> 7-8 level at most, but most of the times it is required to link 3-4 
>> levels.
>>
>>   Difficult to say how this impacts performance because it really depends on 
>> how you are querying that structure
>> 
>> - Most of the queries include several:
>> ?x myont:hasattribute ?b.
>> ?a rdf:type ?b.
>> 
>> Therefore checking the class and subclasses of entities. Is there any way
>> to speed up the inference, given that when I ask for the parent class I
>> also get the child classes defined in my ontology?
>>
>> So are you actively using inference? If you are then that will significantly 
>> degrade performance because the inference closure is done entirely in memory 
>> i.e. not in TDB if inference is turned on and you will get minimal 
>> performance benefit from using TDB.
>>
>>  If you only need simple inference like class and property hierarchy you may 
>> be better served by asserting those statically using SPARQL updates and not 
>> using dynamic inference
>> 
>> - I know the "." in a query acts more or less like an AND logical
>> operation. Does the order of sentences have implications in the
>> performance?

Re: Nested select doesn't work as expected

2017-10-13 Thread Lorenz Buehmann

I answered it on StackOverflow. And the formulation of the question was
confusing. He probably meant that the subjects of the second graph are
the object of the first graph. Anything else wouldn't make sense...


Here is the data that I used for testing, thus, Andy could also use it:

Graph named_graph:m_p

@prefix :  .
:p1    :pred1    :mp1 .
:p2    :pred1    :mp1 .
:p3    :pred1    :mp2 .
:p4    :pred1    :mp2 .
:p5    :pred1    :mp3 .
:p6    :pred1    :mp3 .


Graph named_graph:m_p_s

@prefix :  .
@prefix w:  .

:mp1   :pred2    w:frnd .
:mp1   :pred2    w:fdlfkdl .
:mp2   :pred2    w:kdsjflk .
:mp2   :pred2    w:jflksdlkj .
:mp3   :pred2    w:frnd .
:mp3   :pred2    w:fjksldjfls .


Working query:

PREFIX named_graph: 
PREFIX w: 

SELECT *
FROM named_graph:m_p
FROM NAMED named_graph:m_p_s
WHERE
{
  ?s ?p ?o
  {
    SELECT ?o WHERE {
      GRAPH named_graph:m_p_s { ?o ?p w:frnd }
    }
  }
}


Cheers,

Lorenz



On 13.10.2017 10:20, Andy Seaborne wrote:
>
>
> On 13/10/17 02:36, Dimov, Stefan wrote:
>> Hello,
>>
>> I have two graphs:
>
> Which storage system are they in?
> Which version of Jena?
>
>>
>> m_p
>>
>> p1    pred1    mp1
>> p2    pred1    mp1
>> p3    pred1   mp2
>> p4    pred1   mp2
>> p5    pred1   mp3
>> p6    pred1   mp3
>>
>> and m_p_s
>>
>> mp1   pred2    w:frnd
>> mp1   pred2    w:fdlfkdl
>> mp2   pred2    w:kdsjflk
>> mp2   pred2   w:jflksdlkj
>> mp3   pred2   w:frnd
>> mp3   pred2   w:fjksldjfls
>
> Please could you provide complete data such as TriG?
>
> It is a barrier to volunteers who answer questions if the first thing
> you have to do is mangle email, data preparation and disentangle
> partial queries.
>
>> and I want to get all the triples in m_p which objects are predicates
>> in m_p_s and the object of that predicates in m_p_s is w:frnd
>>
>> In other words I want to make query that returns (results with) p1,
>> p2, p5 and p6 from m_p and doesn’t return p3 and p4.
>>
>> I’m trying to do this with nested queries,
>
> You don't need a nested SELECT.
>
> SELECT * {
>    GRAPH m_p   { ?s ?p ?o }
>    GRAPH m_p_s { ?x ?o w:frnd }
> }
>
> (untested)
>
>> but it doesn’t work: E.g.
>>
>>  SELECT $subj $pred $pr
>
> $subj and $pred are not set in the query.
>
> This isn't SQL! In SPARQL, variables get bound in graph patterns.
>
>>  FROM NAMED named_graph:m_p
>
> The RDF dataset for this query is a single named graph and empty
> default graph.
>
> Did you mean:
>
> FROM NAMED named_graph:m_p
> FROM NAMED named_graph:m_p_s
>
> ?
>
> or indeed no FROM NAMED and use a dataset directly.
>
>
>>  WHERE
>>  {
>>  SELECT $pr
>>  WHERE
>>  {
>>     GRAPH named_graph:m_p_s { $pr $pred0 w:frnd }
>
> the m_p_s graph isn't in the dataset hence this pattern is empty.
>
> GRAPH is for access; FROM NAMED for setting up.
>
>>  }
>>  }
>>
>> returns empty result. I tried different things, but either I get an
>> error
>
> What is the error?
>
>> or empty result or everything in m_p.
>>
>> I don’t want to use UNION or FILTER for performance reasons.
>>
>> Do you have an idea how I can do it?
>>
>> Regards,
>> Stefan
>>

Re: Nested select doesn't work as expected

2017-10-13 Thread Andy Seaborne




On 13/10/17 02:36, Dimov, Stefan wrote:

Hello,

I have two graphs:


Which storage system are they in?
Which version of Jena?



m_p

p1    pred1    mp1
p2    pred1    mp1
p3    pred1    mp2
p4    pred1    mp2
p5    pred1    mp3
p6    pred1    mp3

and m_p_s

mp1   pred2    w:frnd
mp1   pred2    w:fdlfkdl
mp2   pred2    w:kdsjflk
mp2   pred2    w:jflksdlkj
mp3   pred2    w:frnd
mp3   pred2    w:fjksldjfls


Please could you provide complete data such as TriG?

It is a barrier to volunteers who answer questions if the first thing 
you have to do is mangle email, data preparation and disentangle partial 
queries.
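
For reference, complete data for this question could look like the following. It is a sketch written as a SPARQL Update rather than TriG, so that it can be run against an empty dataset as-is; all three prefix IRIs are made up for illustration, and the triples are the ones listed in the question.

```sparql
# Assumption: all prefix IRIs below are placeholders for illustration.
PREFIX :            <http://example.org/data/>
PREFIX w:           <http://example.org/w#>
PREFIX named_graph: <http://example.org/graph/>

INSERT DATA {
  # First graph: p1..p6 each point at one of mp1..mp3
  GRAPH named_graph:m_p {
    :p1 :pred1 :mp1 .
    :p2 :pred1 :mp1 .
    :p3 :pred1 :mp2 .
    :p4 :pred1 :mp2 .
    :p5 :pred1 :mp3 .
    :p6 :pred1 :mp3 .
  }
  # Second graph: only mp1 and mp3 are linked to w:frnd
  GRAPH named_graph:m_p_s {
    :mp1 :pred2 w:frnd .
    :mp1 :pred2 w:fdlfkdl .
    :mp2 :pred2 w:kdsjflk .
    :mp2 :pred2 w:jflksdlkj .
    :mp3 :pred2 w:frnd .
    :mp3 :pred2 w:fjksldjfls .
  }
}
```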



and I want to get all the triples in m_p which objects are predicates in m_p_s 
and the object of that predicates in m_p_s is w:frnd

In other words I want to make query that returns (results with) p1, p2, p5 and 
p6 from m_p and doesn’t return p3 and p4.

I’m trying to do this with nested queries,


You don't need a nested SELECT.

SELECT * {
   GRAPH m_p   { ?s ?p ?o }
   GRAPH m_p_s { ?x ?o w:frnd }
}

(untested)


but it doesn’t work: E.g.

 SELECT $subj $pred $pr


$subj and $pred are not set in the query.

This isn't SQL! In SPARQL, variables get bound in graph patterns.


 FROM NAMED named_graph:m_p


The RDF dataset for this query is a single named graph and empty default 
graph.


Did you mean:

FROM NAMED named_graph:m_p
FROM NAMED named_graph:m_p_s

?

or indeed no FROM NAMED and use a dataset directly.



 WHERE
 {
 SELECT $pr
 WHERE
 {
    GRAPH named_graph:m_p_s { $pr $pred0 w:frnd }


the m_p_s graph isn't in the dataset hence this pattern is empty.

GRAPH is for access; FROM NAMED for setting up.
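
To see which named graphs a query's dataset actually contains, it can help to list them. A sketch, assuming a placeholder IRI for the named_graph: prefix:

```sparql
# Assumption: the prefix IRI is a placeholder for illustration.
PREFIX named_graph: <http://example.org/graph/>

SELECT ?g (COUNT(*) AS ?triples)
FROM NAMED named_graph:m_p        # sets up the dataset ...
FROM NAMED named_graph:m_p_s
WHERE {
  GRAPH ?g { ?s ?p ?o }           # ... GRAPH accesses what was set up
}
GROUP BY ?g
```

Dropping the second FROM NAMED line should leave a single row for m_p, which is the situation in the original query: GRAPH named_graph:m_p_s then has nothing to match.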


 }
 }

returns empty result. I tried different things, but either I get an error


What is the error?


or empty result or everything in m_p.

I don’t want to use UNION or FILTER for performance reasons.

Do you have an idea how I can do it?

Regards,
Stefan

Re: How to increase performance

2017-10-13 Thread George News

Hi all,

Thanks a lot for your answers... I have "negotiated" with the admins of
the project and I will be giving you examples of the queries and data ;)

We really need to enhance performance. BTW, is Virtuoso good at inference,
or will I have the same issues?

Thanks again.
Regards,
Jorge

On 2017-10-11 15:47, Rob Vesse wrote:
> Comments inline:
> 
> On 11/10/2017 11:57, "George News"  wrote:
> 
> Hi all,
> 
> The project I'm working on currently has a TDB store with approximately 100M
> triples and the size is increasing quite quickly. When I make a typical
> SPARQL query for getting data from the system, it takes ages, sometimes
> more than 10-20 minutes. I think performance wise this is not really
> user friendly. Therefore I need to know how I can increase the speed, etc.
> 
> I'm running the whole system on a machine with Intel Xeon E312xx with
> 32Gb RAM and many times I'm getting OutofMemory Exceptions and the
> google.cache that Jena handles is the one that seems to be causing the
> problem.
> 
>  Specific stack traces would be useful to understand where the cache is 
> being exploded. Certain kinds of query may use the cache more heavily than 
> others so some elaboration on the general construction of queries would be 
> interesting.
> 
> Are the figures I'm pointing out normal (machine specs, response time,
> etc.)? Is it too big/too small?
> 
>  The size of the data seems small relative to the size of the machine. You 
> don’t specify whether you change the JVM heap size, most memory usage in TDB 
> is off-heap via memory mapped files so setting too large a heap can 
> negatively impact performance.
> 
>  The response times seem very poor but that may be the nature of your 
> queries and data structure, however since you are unable to show those we can 
> only provide generalisations
> 
> For the moment, we have decided to split the graph in pieces, that is,
> generating a new named graph every now and then so the amount of
> information stored in a "current" graph is smaller. Then restricting the
> query to a set of graphs things work better.
> 
> Although this solution works, when we merge the graphs for historical
> queries, we are facing the same problem as before. Then, how can we
> increase the speed?
> 
> I cannot disclose the dataset or part of it, but I will try to somehow
> explain it.
> 
> - Ids for entities are approximately 255 random ASCII characters. Does
> the size of the ids affect the speed of the SPARQL queries? If yes, can
> I apply a Lucene index to the IDs in order to reduce the query time?
> 
>  It depends on the nature of the query. All terms are mapped into 64-bit 
> internal identifiers, these are only mapped back to the original terms as and 
> when the query engine and/or results serialisation requires it.  A cache is 
> used to speed up the mapping in both directions so depending on the nature of 
> the queries and your system loads you may be thrashing this cache.
> 
> - The depth level of the graph or the information relationship is around
> 7-8 level at most, but most of the times it is required to link 3-4 
> levels.
> 
>   Difficult to say how this impacts performance because it really depends on 
> how you are querying that structure
> 
> - Most of the queries include several:
> ?x myont:hasattribute ?b.
> ?a rdf:type ?b.
> 
> Therefore checking the class and subclasses of entities. Is there any way
> to speed up the inference, given that when I ask for the parent class I
> also get the child classes defined in my ontology?
> 
> So are you actively using inference? If you are then that will significantly 
> degrade performance because the inference closure is done entirely in memory 
> i.e. not in TDB if inference is turned on and you will get minimal 
> performance benefit from using TDB.
> 
>  If you only need simple inference like class and property hierarchy you may 
> be better served by asserting those statically using SPARQL updates and not 
> using dynamic inference
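
As an illustration of asserting the class hierarchy statically, a SPARQL Update along these lines adds an explicit rdf:type triple for every superclass of each instance's class. It is only a sketch: it assumes the ontology uses rdfs:subClassOf and sits in the same graph as the instance data, and it has to be re-run whenever data or ontology change.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Materialise the class hierarchy: every instance of a class also
# becomes an explicit instance of all of that class's superclasses.
INSERT { ?instance rdf:type ?super }
WHERE {
  ?instance rdf:type ?class .
  ?class rdfs:subClassOf+ ?super .   # one or more subClassOf steps
}
```

After this, a plain pattern such as ?a rdf:type ?b matches parent classes directly from the stored triples, without a reasoner in the query path.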
> 
> - I know the "." in a query acts more or less like an AND logical
> operation. Does the order of sentences have implications in the
> performance? Should I start with the most restrictive ones? Should I
> start with the simplest ones, i.e. checking number values, etc.?
> 
>  Yes and no. TDB will attempt to do the necessary scans in an optimal order 
> based on its knowledge of the statistics of the data. However this only 
> applies within a single query pattern i.e. { } so depending on the structure 
> of your query you may need to do some manual reordering. Also if inference is 
> involved then that may interact.
> 
> - Some of the queries use spatial and time filtering. Is it worth
> implementing the support for spatial searches with SPARQL? Is there any
> kind
