Hi,
We are running a Fuseki server that will eventually hold about 2.2 * 10^9 triples of
meteorological data.
I currently run it with "-Xmx80GB" on a 128 GB server. The database is TDB2 on a
900 GB SSD.
Now I face several performance issues:
1. Inserting data:
It takes more than one hour
> Just to be sure, you can try to execute some very generic queries
> (e.g. "*a*") and count the results.
Thanks, I'll do that when I have a moment
> The downside of using a high limit (and the reason the default is
> "only" 1) is that jena-text/Lucene allocates an array of that size
> to hold
Hi Vincent!
Vincent Ventresque wrote on 12.09.2018 at 15:53:
> What do you think about this solution:
> ?uriBnF text:query ( foaf:givenName "*J*" 200 ) .
> ?uriBnF text:query ( foaf:familyName "roussea*" ) .
> ?uriBnF foaf:familyName ?nom .
> ?uriBnF foaf:givenName ?prenom
It returns all the
Hi Osma,
Thanks again, it's very helpful.
> Either you get less results than expected or the query will take a
> long time, or both
What do you think about this solution:
?uriBnF text:query ( foaf:givenName "*J*" 200 ) .
?uriBnF text:query ( foaf:familyName "roussea*" ) .
?uriBnF
Hi Vincent!
Jena-text with the Lucene backend indexes each triple as a separate
Lucene document. This means that you cannot combine givenName and
familyName in the same query - from the Lucene perspective, the
givenName appears in one document where familyName appears in another
document,
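The "one Lucene document per triple" point can be illustrated with a small Python simulation. This is only a toy model, not jena-text or Lucene code: the `docs` list, the `search` helper and the `ex:` URIs are made up for illustration, and substring matching stands in for Lucene's wildcard queries.

```python
# Toy data: two people, each described by two triples.
triples = [
    ("ex:p1", "foaf:givenName", "Jean"),
    ("ex:p1", "foaf:familyName", "Rousseau"),
    ("ex:p2", "foaf:givenName", "Julie"),
    ("ex:p2", "foaf:familyName", "Martin"),
]

# Assumption: one index document per triple, holding the subject URI
# and a single field named after the property.
docs = [{"uri": s, p: o.lower()} for s, p, o in triples]


def search(docs, fields):
    """Return URIs of documents in which EVERY requested field contains its substring."""
    return [
        d["uri"]
        for d in docs
        if all(needle in d.get(field, "") for field, needle in fields.items())
    ]


# A single query over both fields finds nothing, because no single
# document carries both givenName and familyName.
both = search(docs, {"foaf:givenName": "j", "foaf:familyName": "roussea"})

# Two separate searches, joined on the subject URI, do find the person.
given = set(search(docs, {"foaf:givenName": "j"}))
family = set(search(docs, {"foaf:familyName": "roussea"}))
joined = given & family
```

In this model the combined search is empty while the join of the two separate searches contains `ex:p1`, which corresponds to using two `text:query` patterns on the same subject variable.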
Hello Rob
Thank you for all these elements.
> there is a limit on the results returned from each text search so
> when these are *separately executed and joined together* you may only
> get a subset of the full results
Could you please explain what would be a 'non-separate' query? Do you mean
Well, the order of triple patterns shouldn't matter too much when you have a
pure BGP (albeit the optimiser might pick a bad order in some cases).
But we aren't talking about pure BGPs here: having the text:query triples
results in the BGP being broken up into joins of several property functions
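The effect of a per-search limit on a join of two text searches can be sketched in Python. This is a simulation under stated assumptions, not jena-text behaviour: `text_query` is a hypothetical helper, the data is synthetic, and hits are cut off in index order, whereas Lucene would return the top hits by score.

```python
# Synthetic index of (uri, field, value) entries: 500 people whose given
# names all contain "j"; only two of them have the family name "rousseau".
index = []
for i in range(500):
    uri = f"p{i}"
    index.append((uri, "givenName", f"j{i}"))
    family = "rousseau" if i in (450, 460) else f"name{i}"
    index.append((uri, "familyName", family))


def text_query(index, field, needle, limit):
    """Hypothetical stand-in for one text search: at most `limit` matching URIs."""
    hits = [uri for uri, f, value in index if f == field and needle in value]
    return hits[:limit]


def joined_search(limit):
    """Run both searches separately with the same limit, then join on the URI."""
    given_hits = set(text_query(index, "givenName", "j", limit))
    family_hits = set(text_query(index, "familyName", "roussea", limit))
    return given_hits & family_hits


small = joined_search(limit=200)     # the broad search is truncated before p450
large = joined_search(limit=10_000)  # nothing is truncated
```

With the small limit the broad given-name search is cut off before it reaches the two matching people, so the join comes back empty. With a limit larger than the number of hits, both people are found, at the cost of materialising every hit of the broad search.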
Hi Lorenz,
Thanks for your reply.
> for me it sounds more like you've found a bug
I'm not able to tell, just beginning to use Fuseki + Lucene.
> I'm just referring to "Order of triple patterns in a BGP" here
Could you please give a raw text URL for "Order of triple patterns in a
BGP" (seems
Hi "VV",
well, for me it sounds more like you've found a bug and are now doing a
workaround. Or at least something is strange and I'm just referring to
"Order of triple patterns in a BGP" here.
The order of triple patterns in a BGP shouldn't matter - as far as I
know it's always a good old join
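The claim that pattern order does not change the result of a pure BGP can be checked with a toy matcher in Python. This is a naive nested-loop sketch and not how ARQ evaluates queries: `match`, `evaluate` and the sample graph are made up for illustration, and the sketch covers plain triple patterns only, not property functions such as text:query.

```python
from itertools import permutations

graph = [
    ("ex:p1", "foaf:givenName", "Jean"),
    ("ex:p1", "foaf:familyName", "Rousseau"),
    ("ex:p2", "foaf:givenName", "Julie"),
    ("ex:p2", "foaf:familyName", "Martin"),
]

bgp = [
    ("?s", "foaf:givenName", "?prenom"),
    ("?s", "foaf:familyName", "?nom"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against its existing value.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(bgp, graph):
    """Evaluate patterns left to right, joining on shared variables."""
    solutions = [{}]
    for pattern in bgp:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                new_binding = match(pattern, triple, binding)
                if new_binding is not None:
                    next_solutions.append(new_binding)
        solutions = next_solutions
    return solutions


def as_set(solutions):
    """Order-independent view of a solution sequence."""
    return frozenset(frozenset(b.items()) for b in solutions)


# Evaluate every ordering of the patterns and collect the distinct results.
results = {as_set(evaluate(order, graph)) for order in permutations(bgp)}
```

Every ordering yields the same set of solutions here, so `results` holds a single element. Only the amount of intermediate work differs between orderings, which is the part an optimiser cares about.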