Load to memory in Fuseki2 - 72s
Load to TDB in Fuseki2 - 115s

If an upload via the UI goes quiet, one thing to try is increasing the heap size a little.

Fuseki is at the mercy of how file upload works in HTTP, which forces an additional copy of the data.

export JVM_ARGS=-Xmx2000M

(direct POST of the data, a non-HTML upload, does not have this inconvenient effect)
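As a rough sketch of what a direct POST could look like, the helper below sends a file to the dataset's Graph Store endpoint with curl instead of going through the HTML upload form. The function name fuseki_post, the file name, the host and port, the /ds/data endpoint, and the N-Triples content type are all assumptions for illustration; adjust them to your setup, and note that the dataset has to be writable.

```shell
# Hypothetical helper (not part of Fuseki): POST an RDF file directly to a
# dataset's Graph Store endpoint, avoiding the HTML multipart form upload.
fuseki_post() {
    file="$1"                                        # RDF file to send (assumed N-Triples)
    endpoint="${2:-http://localhost:3030/ds/data}"   # assumed host, port and dataset name
    # --data-binary sends the file bytes unmodified;
    # ?default targets the default graph of the dataset.
    curl -X POST \
         -H "Content-Type: application/n-triples" \
         --data-binary "@$file" \
         "$endpoint?default"
}

# Usage (file name is a placeholder):
#   fuseki_post wordnet.nt
```

The content type has to match the file's actual serialization, for example text/turtle for a Turtle file.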

    Andy

On 27/04/17 22:12, Andy Seaborne wrote:
8,574,807 triples.

Load time: tdbloader.
97.31 seconds [Rate: 88,114.84 per second]

Database size: 826M

Run: fuseki2 --loc DB /ds
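
The two steps above (bulk load with tdbloader, then serve the resulting directory) might be scripted roughly as follows. This is only a sketch: load_and_serve and the FUSEKI_CMD variable are made-up names, the data file name is a placeholder, and the server command defaults to fuseki-server on the assumption that "fuseki2" above is a local name for the same script.

```shell
# Sketch only: bulk-load a file into a TDB directory, then serve it at /ds.
# FUSEKI_CMD is a hypothetical variable; set it to whatever starts Fuseki
# on your machine (the message above uses "fuseki2").
FUSEKI_CMD="${FUSEKI_CMD:-fuseki-server}"

load_and_serve() {
    data="$1"        # RDF file to load (placeholder name, e.g. wordnet.nt)
    db="${2:-DB}"    # TDB database directory, "DB" as in the message
    # Load offline first; stop here if the load fails.
    tdbloader --loc "$db" "$data" || return 1
    # Start the server on the freshly loaded database.
    "$FUSEKI_CMD" --loc "$db" /ds
}

# Usage:
#   load_and_serve wordnet.nt
```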

Query 2 execution time: 0.734s cold.
Query 1 execution time: 0.070s

Fuseki server footprint RAM:
Virtual: 4.6G
Real: 387M

    Andy

On 27/04/17 11:36, Laura Morales wrote:
I've downloaded the wordnet-rdf.princeton.edu RDF dataset, which is
quite large (about 1.3GB).
This is an example entity from the dataset:

SELECT * WHERE { wn31:100001740-n ?p ?o }

1 rdf:type wno:Synset
2 <http://www.w3.org/2002/07/owl#sameAs> <http://lemon-model.net/lexica/uby/wn/WN_Synset_0>
3 <http://www.w3.org/2002/07/owl#sameAs> <http://www.w3.org/2006/03/wn/wn20/instances/synset-entity-noun-1>
4 rdfs:label "entity"@eng
5 wno:translation "kewujudan"@zsm
6 wno:translation "entità"@ita
7 wno:translation "entitas"@ind
8 wno:translation "entitet"@sqi
9 wno:translation "יֵשׁוּת"@heb
10 wno:translation "entidade"@por
11 wno:translation "وُجُود"@ara
12 wno:translation "entiteetti"@fin
13 wno:translation "sorkari"@eus
14 wno:translation "كَيْنُونَة"@ara
15 wno:translation "kewujudan"@ind
16 wno:translation "hakikat"@ind
17 wno:translation "ser"@por
18 wno:translation "entidad"@spa
19 wno:translation "kokonaisuus"@fin
20 wno:translation "izaki"@eus
21 wno:translation "entiti"@zsm
22 wno:translation "実体"@jpn
23 wno:translation "tablet"@ind
24 wno:translation "cosa"@ita
25 wno:translation "entidade"@glg
26 wno:translation "sesuatu"@zsm
27 wno:translation "entité"@fra
28 wno:translation "tablet"@zsm
29 wno:translation "sesuatu"@ind
30 wno:translation "เอกลักษณ์"@tha
31 wno:translation "entiti"@ind
32 wno:translation "ente"@por
33 wno:translation "entitat"@cat
34 wno:translation "hakikat"@zsm
35 wno:translation "entitate"@eus
36 wno:hyponym wn31:100001930-n
37 wno:hyponym wn31:104431553-n
38 wno:hyponym wn31:100002137-n
39 wno:synset_member wn31:entity-n
40 wno:gloss "that which is perceived or known or inferred to have its own distinct existence (living or nonliving)"@eng
41 wno:part_of_speech wno:noun
42 wno:lexical_domain wno:noun.tops


So I decided to try a different query, to "search entities by label
instead of by subject". This is what I tried:

-------------------------------------
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX wn31: <http://wordnet-rdf.princeton.edu/wn31/>
PREFIX wno: <http://wordnet-rdf.princeton.edu/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
WHERE
{
  ?synset  a wno:Synset ;
         rdfs:label ?label ;
         wno:gloss ?gloss .

  FILTER regex(?label, "entity", "i")
}
LIMIT 10
-------------------------------------

what happened:

- for a long time after loading the dataset (more than 1h), this last
query timed out every time I submitted it. So I thought it was a
problem with indexes, and that I should read more about Jena indexes
- but now, all of a sudden, it seems to work, although the query still
takes a few seconds to complete (which feels a bit too slow, since
the database is local on the same machine and the dataset is not
*that* huge)


Does anybody know what I've run into? Do indexes have anything to do
with this, or maybe some jena/fuseki cache/bootstrap activity, or it's
just some monkey business going on with my computer? This feels so
strange because I don't think I have done anything relevant with my
computer that could have influenced this query. I just loaded the
dataset into Fuseki, then started querying...

Thank you.
