All,

I've been working on an experimental project for the past month or so: jena-client [1]. Inspired by Java's JDBC [2] and Sesame's Repository interface [3], the idea is to provide a SPARQL-based API for client applications. The main goals of the project are:

a) Unified API (the same for local in-process RDF stores as for remote HTTP or even custom protocols)
b) Performance (streaming support for SPARQL Update INSERT DATA / DELETE DATA operations)
c) Security (provide tools to protect against injection attacks in client applications)
d) Default implementations for SPARQL 1.1 Protocol [4] endpoints and in-process Jena DatasetGraphs

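
To make goal (c) concrete: the injection risk comes from user input being pasted into a query string unescaped. Below is a minimal sketch of the kind of escaping such a tool might do, assuming the value is meant to end up as a plain double-quoted SPARQL string literal. SparqlLiteralEscaper and quote are made-up names for illustration only; they are not part of jena-client or Jena.

```java
/**
 * Hypothetical helper (not a jena-client or Jena class): turns arbitrary text
 * into a double-quoted SPARQL string literal so it cannot close the literal
 * early and inject extra query or update syntax.
 */
final class SparqlLiteralEscaper {
    private SparqlLiteralEscaper() {}

    /** Wraps value in double quotes and escapes the characters SPARQL requires. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 2);
        sb.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash first, so it is never doubled twice
                case '"':  sb.append("\\\""); break; // would otherwise terminate the literal
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                default:   sb.append(c);
            }
        }
        sb.append('"');
        return sb.toString();
    }
}
```

A caller would then build the non-variable part of a query with something like "FILTER regex(?name, " + SparqlLiteralEscaper.quote(userInput) + ")". Note that this only protects the literal boundary; regex metacharacters inside the pattern are a separate concern.
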
At this point, I have implementations of both remote and local repositories. I've created a sample project that shows the basics of how to use the library [5].

I would love to get some feedback on the design (NOT JUST DEVELOPERS, USERS TOO!), in particular on the following points:

1) What do you think of the design?
2) Names for the various classes. (I don't particularly like UpdateInserter, UpdateDeleter, and UpdateQuerier, for example.)
3) The mechanism for creating QueryStatements and UpdateStatements. I was following the JDBC style, but I don't think it works so well for the Update case, since you can have a number of update statements that get executed atomically in one request, and there are no return values. I was thinking of just allowing the Query/Update Statement objects to be created directly and then putting the execution methods directly on Connection.
4) I would like to use reflection to allow creation of Repository objects, so third parties can create their own drivers and just drop them in the classpath. This would be used for, say, TDB repositories. Does the JDBC-style connection string work here?
5) The Update interfaces all deal with DatasetGraphs / Graphs. Should they also (or instead) deal with Datasets / Models?
6) I'm most likely going to remove the "org.apache.jena.client.service" package. It was an early attempt at Service Description / multiple endpoint capabilities, which I've decided not to do for now.

Items left before an initial release:

- Finish the implementation (TODOs in the codebase)
- A way to escape query parameters without using a Query/Update Statement (probably a wrapper around FmtUtils/NodeFmtLib). This is needed to escape non-variable portions of a query (e.g. a regex string)
- JavaDocs for all public classes / methods
- Implement streaming SPARQL Update processing in Fuseki (not strictly necessary for a jena-client release, but probably expected!)

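
For the drop-in driver question above (resolving a Repository from a JDBC-style connection string), here is a rough sketch of one way it could work, using java.util.ServiceLoader the way JDBC 4 discovers drivers. Everything in it is an assumption for discussion: Repository and RepositoryDriver here are hypothetical stand-ins rather than the jena-client interfaces, and the "jena:mem:" URL scheme is invented.

```java
import java.util.List;
import java.util.ServiceLoader;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a repository; not the jena-client interface. */
interface Repository {
    String describe();
}

/** Hypothetical SPI that a third-party driver jar would implement. */
interface RepositoryDriver {
    boolean accepts(String url);
    Repository open(String url);
}

/** Example driver for the invented "jena:mem:" scheme. */
final class InMemoryDriver implements RepositoryDriver {
    @Override public boolean accepts(String url) { return url.startsWith("jena:mem:"); }
    @Override public Repository open(String url) { return () -> "in-memory repository for " + url; }
}

/** Resolves a connection string to the first registered driver that accepts it. */
final class RepositoryManager {
    private static final List<RepositoryDriver> DRIVERS = new CopyOnWriteArrayList<>();

    private RepositoryManager() {}

    static void register(RepositoryDriver driver) { DRIVERS.add(driver); }

    /** Registers drivers declared in META-INF/services/<RepositoryDriver class name> files on the classpath. */
    static void loadFromClasspath() {
        for (RepositoryDriver d : ServiceLoader.load(RepositoryDriver.class)) {
            register(d);
        }
    }

    static Repository getRepository(String url) {
        for (RepositoryDriver d : DRIVERS) {
            if (d.accepts(url)) {
                return d.open(url);
            }
        }
        throw new IllegalArgumentException("No driver found for " + url);
    }
}
```

With this shape, a client would call RepositoryManager.loadFromClasspath() once and then RepositoryManager.getRepository("jena:mem:test"); a TDB driver jar would only need to ship its own RepositoryDriver implementation plus the services file.
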
Some features I'm thinking are out of scope for the initial version:

- Remote transactions (a Fuseki-specific extension)
- Endpoint discovery (SPARQL 1.1 Service Description [6])
- Graph Store HTTP Protocol [7] (less capable than SPARQL Update; can't mix update operations)
- Multiple endpoints for a single repository / connection (e.g. different entailment regimes in a single transaction)

Anyway, give it a spin, and let me know what you think!

-Stephen

P.S. FYI: the original use cases from before I started implementation are at [8]; however, they are not up to date with the current API.

[1] https://svn.apache.org/repos/asf/jena/Experimental/jena-client/
[2] http://docs.oracle.com/javase/7/docs/api/java/sql/package-summary.html
[3] http://www.openrdf.org/doc/sesame2/api/org/openrdf/repository/package-summary.html
[4] http://www.w3.org/TR/sparql11-protocol/
[5] https://svn.apache.org/repos/asf/jena/Scratch/sallen/client-test/
[6] http://www.w3.org/TR/sparql11-service-description/
[7] http://www.w3.org/TR/sparql11-http-rdf-update/
[8] http://people.apache.org/~sallen/jena/ClientUseCases.html
