impl.sparql looks very interesting! I can imagine getting that
blanknode logic correct took some work
I see you do internal skolemization of the blank nodes in order to
create the context - does this also work when the result contains
multiple blank nodes forming a circular dependency, with no IRI-bound
subject or object?
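Something like this is the case I mean - just a sketch using the
incubator API for illustration (exact factory methods may differ):

    // Two blank nodes referencing each other, with no IRI-bound
    // subject or object anywhere in the graph (made-up predicate):
    RDFTermFactory factory = new SimpleRDFTermFactory();
    Graph graph = factory.createGraph();
    BlankNode a = factory.createBlankNode();
    BlankNode b = factory.createBlankNode();
    IRI knows = factory.createIRI("http://xmlns.com/foaf/0.1/knows");
    graph.add(a, knows, b);
    graph.add(b, knows, a);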
Have you had a go at porting this to the incubator-commons-rdf model
to see where the gaps are?
I think it's a good example for exercising the model - as you say,
SPARQL is the "only" standardized method to query RDF data. (There's
also the Linked Data Platform etc., which are more specific - but
those could perhaps also be interesting as an exercise.)
I had a go:
https://github.com/stain/clerezza-rdf-core/tree/github-sql/impl.sparql/src/main/java/org/apache/commons/rdf/impl/sparql
Changes:
https://github.com/stain/clerezza-rdf-core/compare/github-sql?expand=1
(NOTE: Tests not updated! So it probably doesn't work..)
and it highlighted the problem of incubator-commons-rdf being all
about Streams, with no Collections support - so even something as
simple as iterating over the triples has to be done Java 8 style.
Some of the impl.sparql code got cleaner because of this, e.g.:

    Stream<BlankNodeOrIRI> subjects =
        context.getTriples().map(t -> t.getSubject());
    Stream<RDFTerm> objects =
        context.getTriples().map(t -> t.getObject());
    Stream<RDFTerm> candidates =
        Stream.concat(subjects, objects);
    Stream<BlankNode> bnodes = candidates
        .filter(n -> n instanceof BlankNode)
        .map(BlankNode.class::cast);
... but other, more traditional iterative code got trickier. I think
we should support both styles. Lots of this would be solved if Graph
were Iterable<Triple>.
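Something like this is what I have in mind - just a sketch, with a
default method reusing the existing Stream view:

    import java.util.Iterator;
    import java.util.stream.Stream;

    interface Graph extends Iterable<Triple> {
        Stream<? extends Triple> getTriples();

        @Override
        default Iterator<Triple> iterator() {
            // Lazily iterate over the same stream of triples
            return getTriples().map(Triple.class::cast).iterator();
        }
    }

    // ...which would allow the traditional style again:
    //   for (Triple t : graph) {
    //       System.out.println(t.getSubject());
    //   }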
I was forced to use the RDFTermFactory, as the Simple implementations
like LiteralImpl are no longer public. Clean enough, but it means
every class needs one of these to do anything useful (e.g. to create
an IRI to supply as an argument to Graph.getTriples()):
    private static SimpleRDFTermFactory factory = new SimpleRDFTermFactory();
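With that field in place, usage is clean enough - a minimal sketch
(the FOAF IRI here is just an example):

    IRI name = factory.createIRI("http://xmlns.com/foaf/0.1/name");
    graph.getTriples(null, name, null)
         .map(Triple::getObject)
         .forEach(System.out::println);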
Incubator RDF doesn't have any support for cloning or for making
graphs immutable. Adding all triples from one graph to another
requires stream-fun, e.g. instead of a collection operation like:
    expandedContext.addAll(startContext);
I had to do something more elaborate and, in a way, more iterative:
    startContext.getTriples().forEach(t -> expandedContext.add(t));
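A small helper can of course hide that, but arguably it belongs in
the API itself, e.g. (hypothetical method, not in the current API):

    // Copy every triple from source into target; the stream has to
    // be consumed explicitly as there is no addAll() yet.
    static void addAll(Graph target, Graph source) {
        source.getTriples().forEach(target::add);
    }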
On 16 March 2015 at 12:02, Reto Gmür <[email protected]> wrote:
> Hello,
>
> With the new repository the clerezza rdf commons previously in the commons
> sandbox are now at:
>
> https://git-wip-us.apache.org/repos/asf/clerezza-rdf-core.git
>
> I will compare that code with the current status of the code in the
> incubating rdf-commons project in a later mail.
>
> Now I would like to bring to your attention a big step forward towards
> CLEREZZA-856. The impl.sparql modules provide an implementation of the API
> on top of a SPARQL endpoint. Currently it only supports read access. For
> usage example see the tests in
> /src/test/java/org/apache/commons/rdf/impl/sparql (
> https://git-wip-us.apache.org/repos/asf?p=clerezza-rdf-core.git;a=tree;f=impl.sparql/src/test/java/org/apache/commons/rdf/impl/sparql;h=cb9c98bcf427452392e74cd162c08ab308359c13;hb=HEAD
> )
>
> The hard part was supporting BlankNodes. The current implementation handles
> them correctly even in tricky situations; however, the code is not yet
> optimized for performance. As soon as BlankNodes are involved, many
> queries have to be sent to the backend. I'm sure some SPARQL wizard could
> help make things more efficient.
>
> Since SPARQL is the only standardized method to query RDF data, I think
> being able to façade an RDF Graph accessible via SPARQL is an important
> use case for an RDF API, so it would be good to also have a SPARQL-backed
> implementation of the API proposal in the incubating commons-rdf repository.
>
> Cheers,
> Reto
--
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/0000-0001-9842-9718