Hi, since this group is so responsive, I would like to seek advice in another field:
Area: concurrent calls to Fuseki

I am performing concurrent SPARQL queries against Freebase data using Fuseki and have noticed that for some queries, running them in parallel rather than in series makes a big difference in running time, whereas for others the difference is minimal or non-existent.

For example, my first query (notice the new FILTER placement, which improves performance a lot for me!):

    prefix fb: <http://rdf.freebase.com/ns/>
    prefix fn: <http://www.w3.org/2005/xpath-functions#>
    prefix xsd: <http://www.w3.org/2001/XMLSchema#>

    select ?entity ?mID ?height ?wikipedia_url
    where {
      {
        ?mID_raw fb:type.object.type fb:people.person .
        ?mID_raw fb:type.object.name ?entity .
        ?mID_raw fb:people.person.height_meters ?height_in_meters .
        ?mID_raw fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
        FILTER (lang(?entity) = "en" && regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex (str(?wikipedia_url), "curid=", "i")) .
      }
      BIND(REPLACE(str(?mID_raw), "http://rdf.freebase.com/ns/", "") as ?mID)
      BIND(round(xsd:float(?height_in_meters)* xsd:float("100"))/ xsd:float("100") as ?height_rounded)
      BIND(xsd:float(?height_in_meters)* xsd:float("3.2808") AS ?height_in_feet)
      BIND(str(?height_in_feet) AS ?feet_str_value)
      BIND(str(floor(xsd:decimal(?feet_str_value))) AS ?feet_final)
      BIND(round(xsd:float(?height_in_feet - floor(xsd:decimal(?feet_str_value))) * 12) AS ?inches)
      BIND(str(floor(xsd:decimal(str(?inches)))) as ?inches_final)
      BIND(fn:concat(?feet_final, "' ",?inches_final,"\" (",?height_rounded, " m)" ) AS ?height)
    }

has the following runtimes:

    single query:          2 mins, 44 seconds
    5 concurrent queries:  24 mins, 27 seconds

Whereas our second query:

    prefix fb: <http://rdf.freebase.com/ns/>
    prefix fn: <http://www.w3.org/2005/xpath-functions#>

    select ?entity ?mID ?age_at_death ?wikipedia_url
    where {
      {
        ?mID_raw fb:type.object.type fb:people.person .
        ?mID_raw fb:type.object.type fb:people.deceased_person .
        ?mID_raw fb:type.object.name ?entity .
        ?mID_raw fb:people.deceased_person.date_of_death ?date_of_death .
        ?mID_raw fb:people.person.date_of_birth ?date_of_birth .
        ?mID_raw fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
        FILTER (lang(?entity) = "en" && regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex (str(?wikipedia_url), "curid=", "i")).
      }
      BIND(REPLACE(str(?mID_raw), "http://rdf.freebase.com/ns/", "") as ?mID)
      BIND(fn:year-from-dateTime(?date_of_birth) AS ?year_of_birth)
      BIND(fn:year-from-dateTime(?date_of_death) AS ?year_of_death)
      BIND(str(floor(fn:days-from-duration(?date_of_death - ?date_of_birth) / 365)) as ?age)
      BIND(fn:concat(?age, " (", ?year_of_birth, "-", ?year_of_death, ")" ) AS ?age_at_death)
    }

has the following runtimes:

    single query:                      5 mins, 35 seconds
    average for 5 concurrent queries:  5 mins, 35 seconds

Does anybody have any insight into why we are seeing such different behavior between the two queries when we run them concurrently? What should our expectations be when we run concurrent queries against Fuseki? I would guess that the time should be more or less the same no matter the load, but if that is the general expectation, why do we see such a big difference for the first query?

Also, when executing the second query above we see hundreds of lines similar to the following printed to the log:

    05:39:35 WARN NodeValue :: Datatype format exception: "2008-05-16T09"^^xsd:dateTime

We know that this problem originates with the import: we got a number of WARNs while importing the data using tdbloader. When we remove the bindings, we do not see these warnings in the log and the query runs a lot faster. Any ideas how to overcome this?
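
For reference, the literal in the warning, "2008-05-16T09", has no minutes or seconds, which the xsd:dateTime lexical form requires. Below is a minimal Python sketch of how such values could be located in the source data before loading. It assumes the dump is available as N-Triples lines with the full datatype IRI written out. The function names and the sample subject IRI are hypothetical, and the pattern only checks the shape of the value, not field ranges (month 13 would pass).

```python
import re

# Shape of an xsd:dateTime lexical form: date, 'T', a full hh:mm:ss time,
# optional fractional seconds, optional timezone. Field ranges are NOT checked.
XSD_DATETIME = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)

# A typed dateTime literal as written in an N-Triples line (assumption: the
# datatype appears as a full IRI, not as a prefixed name).
DATETIME_LITERAL = re.compile(
    r'"([^"]*)"\^\^<http://www\.w3\.org/2001/XMLSchema#dateTime>'
)


def is_valid_datetime(lexical):
    """Return True if the string has the shape of an xsd:dateTime value."""
    return XSD_DATETIME.fullmatch(lexical) is not None


def malformed_datetimes(lines):
    """Yield (line_number, lexical_form) for each ill-formed dateTime literal."""
    for number, line in enumerate(lines, start=1):
        for lexical in DATETIME_LITERAL.findall(line):
            if not is_valid_datetime(lexical):
                yield number, lexical


# Hypothetical sample data; the subject IRIs are made up for illustration.
SAMPLE = [
    '<http://rdf.freebase.com/ns/m.example1> '
    '<http://rdf.freebase.com/ns/people.person.date_of_birth> '
    '"1950-01-02T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .',
    '<http://rdf.freebase.com/ns/m.example2> '
    '<http://rdf.freebase.com/ns/people.person.date_of_birth> '
    '"2008-05-16T09"^^<http://www.w3.org/2001/XMLSchema#dateTime> .',
]

if __name__ == "__main__":
    for number, lexical in malformed_datetimes(SAMPLE):
        print(f"line {number}: ill-formed xsd:dateTime {lexical!r}")
```

Running a check like this over the dump would show how many date values are affected, which may help in deciding whether to repair them before loading or to filter them out in the query.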
