Good evening to you as well, Mr. Anderson,
We are building an application that will end up holding several hundred
million triples, so the scope of the application can be considered large.
As for the initial question about model.listStatements joins, here is a code
snippet:
    // Query the model for all the children of nodeResource
    StmtIterator iterator1 = model.listStatements(nodeResource,
            MY_VOCAB.child, (RDFNode) null);
    // Iterate through all the statements returned
    iterator1.forEachRemaining(childStatement -> {
        // Find all the labels for the objects in the statements
        // (assume that there is more than one language label for each)
        StmtIterator iterator2 = model.listStatements(
                childStatement.getObject().asResource(), RDFS.label,
                (RDFNode) null);
        // The inner lambda parameter needs its own name; reusing
        // "statement" here would not compile
        iterator2.forEachRemaining(labelStatement -> {
            // Print out the statement to System.out
            System.out.println(labelStatement.toString());
        });
    });
If the first query returns 10,000 children in iterator1, then the inner
listStatements call that produces iterator2 is executed 10,000 times. This
does not seem to be very efficient.
To the best of my knowledge, TDB already maintains OSP, POS and SPO indexes.
I would have thought that there was a way to run the second query by just
passing an ordered list of the objects returned in the first query. That
would allow far more efficient matching than running the same query many times.
The alternative approach that we are looking at is to run a second query that
returns the labels of all objects, store the results of each query in a
HashSet keyed on the object resource, and call retainAll to join the two sets.
The problem is that there are far too many labels in the system for this to
work effectively. I can also create a code snippet for this if necessary.
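The alternative above could look roughly like the following sketch. It uses
plain Java collections rather than Jena types. The String identifiers, the
String[] {subject, label} pairs and the sample data are hypothetical stand-ins
for the resources and statements returned by the two listStatements calls:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HashJoinSketch {
    // children: the objects returned by the first query.
    // labelStatements: every {subject, label} pair returned by the second query.
    public static Map<String, List<String>> join(Set<String> children,
                                                 List<String[]> labelStatements) {
        // Index every label by its subject. This map grows with the number
        // of labels in the whole store, not with the number of children.
        Map<String, List<String>> labelsBySubject = new HashMap<>();
        for (String[] st : labelStatements) {
            labelsBySubject.computeIfAbsent(st[0], k -> new ArrayList<>()).add(st[1]);
        }
        // Keep only the subjects that are also children of the node.
        labelsBySubject.keySet().retainAll(children);
        return labelsBySubject;
    }

    public static void main(String[] args) {
        Set<String> children = new HashSet<>(Arrays.asList("ex:a", "ex:c"));
        List<String[]> labels = new ArrayList<>();
        labels.add(new String[] {"ex:a", "A (en)"});
        labels.add(new String[] {"ex:a", "A (de)"});
        labels.add(new String[] {"ex:b", "B (en)"});
        labels.add(new String[] {"ex:c", "C (en)"});
        System.out.println(join(children, labels));
    }
}
```

The map built in the first step holds every label in the store before
retainAll discards most of them, which is where the memory cost comes from.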
So my question is: What is the correct way to join the results from two
model.listStatements calls?
As for the initial question about model.listStatements filtering, here is a
code snippet:
    StmtIterator iterator1 = model.listStatements(
            new SimpleSelector(nodeResource, MY_VOCAB.value, (RDFNode) null) {
                @Override
                public boolean selects(Statement s) {
                    // Keep only statements whose object literal is > 12345
                    return s.getObject().asLiteral().getInt() > 12345;
                }
            });
In the query above, the selector has to compare every returned value with the
filter value. I would have thought that it would be easier to have TDB do the
filtering than to implement it in a SimpleSelector.
My question is: What is the correct way to implement filtering?
As for the long list in my email that I accidentally sent multiple times, I
hope the concerns and questions are clear enough to be answered. Let me know if
clarification is needed.
Hope that this makes it clearer.
Thanks in advance,
Niels
-----Original Message-----
From: james anderson [mailto:[email protected]]
Sent: Sunday, November 13, 2016 13:58
To: [email protected]
Subject: Re: How do I do a join between multiple model.listStatments calls?
good evening mr andersen,
i am genuinely curious, why you and your group would be experiencing such
difficulties and would like to understand more about what you are doing.
> On 2016-11-13, at 21:32, Niels Andersen <mailto:[email protected]> wrote:
>
> Dear Jena User Group,
>
> […]
> ... To people who choose to reply to this email; keep in mind that I only
> care about scientific and provable facts, I do not care about opinions.
then, let us start there.
your post, as it stood, provides little room for a considered response, as it
includes too little information about what you are doing.
if you are at liberty to provide specifics about your application data models,
your persistent data vocabularies and storage statistics, your attempts to
combine the two through sparql queries, your deployment specifics, and details
about observed performance and reliability, it could aid your cause to do so.
your complaints imply that you have specific experience which could relate such
information, but, in themselves, permit little beyond commiseration.
if you were to follow up in concrete terms, you might be more likely to benefit
from the collected knowledge in the group.
best regards, from berlin,
---
james anderson | mailto:[email protected] | http://dydra.com