Hey Andy,

Sorry for taking forever to get back to you on this, but comments inline:

On 8/17/12 5:54 AM, "Andy Seaborne" <[email protected]> wrote:

>I'm at the point of being ready to integrate RIOT and a new reader system
>into Jena properly. This means we can remove the old parsers in
>jena-core (not ARP).
>
>There is a "but" however.
>
>RIOT supports both triples and quads readers and model/graphs and
>datasets/datasetgraphs ... but classes for all things quad are in ARQ.
>
>I've created a JIRA but I thought I'd surface it here because it has the
>potential to be disruptive.
>
>https://issues.apache.org/jira/browse/JENA-300
>
>== Integration
>
>Possibilities:
>
>1/ Put the code in ARQ
>1a/ require a call to ARQ to initialize
>1b/ make jena-core do a reflection call to ARQ initialization.
>
>2/ Merge jena-arq and jena-core
>
>The obvious issue for (2) is that the result is a big project to work
>with. Whether a larger jena-core really makes a difference in the real
>world, I don't know. Long term, some redivision into separate modules
>would be good but it's quite hard to find any breakdown of core concepts
>if you want testing by module. It's hard to do anything much without a
>memory graph implementation!
>
>If (2), it would be good to time this with making an uber jar
>"jena-VERSION.jar" so that people switch to that and don't see any
>future reorg of the modules unless they take a detailed look.

How about this as a suggestion for the short term:

- Move Quad and the riot sub-system into jena-core
- Replace the jena-core reader machinery with the riot sub-system

This has the advantage of keeping everything query-related in its own
module, and it does not require breaking up core.

Ideally it would be nice to split off the riot sub-system into its own
module, but then you get into the problem of there being no reader/writer
sub-system in core, which requires users to pull in an extra dependency
for one of the most common things they are going to do.

I assume you plan to integrate this after 2.7.4, perhaps with a minor
version bump, i.e. 2.8.0?

Longer term, I tried to think of some ways to nicely separate things out
but was kinda struggling. With the Model interface as it stands (with its
own read()/write() methods) there is no way to cleanly separate the riot
sub-system out from jena-core/jena-arq in the same way that Sesame
separate their IO subsystem into their RIO modules. They have a
sesame-rio-api module and then specific small modules implementing each
reader.

If we could remove the read()/write() methods from Model then we can start
to get a better separation of concerns:

- Interfaces for reading/writing form a jena-riot-api module
- Implementations form another module, jena-riot-std

In place of read()/write() methods directly on a Model we can provide a
static ModelIO class with read() and write() methods. Wiring up of
readers and writers for use by this could perhaps be done automagically
through some combination of package scanning and Java annotations?

Hope these thoughts help,

Rob

>
>== Outline of the reader
>
>There is a single class "WebReader2" that captures the process of
>opening a connection to a resource/file/thing, deciding the syntax and
>then calling the right parser. This adds full http content negotiation
>over what Jena currently does.
>
>You can add new content-types and connect to the appropriate parser code.
>
>It includes going through FileManager and if/when that is connected to
>model.read, all the conneg, redirection and location mapping is made
>fundamental. You can even make all URLs of a pattern
> http://myhost/data/turtle/file{n}
>be Turtle files despite being served as text/plain.
>
>== Code
>
>In an "Experimental" project:
>
>https://svn.apache.org/repos/asf/jena/Experimental/riot-reader/
>
>Code browse;
>https://svn.apache.org/viewvc/jena/Experimental/riot-reader/src/main/java/riot_reader/
>
>The package layout isn't right for integration.
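
P.S. To make the static ModelIO idea from my reply a bit more concrete, here is a rough sketch of what annotation-driven wiring might look like. Everything in it is hypothetical and none of it is existing Jena or RIOT API: the @Syntax annotation, the TripleReader interface, the toy LineReader and the ModelIO class itself are all names I made up for illustration, and statements are just strings rather than real triples. Registration is done by hand here; a package scanner would simply call register() for each annotated class it finds.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical annotation: declares which content type a reader handles.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Syntax {
    String value();
}

// Hypothetical stand-in for an interface that a jena-riot-api module might define.
interface TripleReader {
    // Parses 'input' and appends one entry per statement to 'sink'.
    void read(String input, List<String> sink);
}

// Toy implementation (assumption): treats every non-blank line as one statement.
@Syntax("application/n-triples")
class LineReader implements TripleReader {
    @Override
    public void read(String input, List<String> sink) {
        for (String line : input.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                sink.add(trimmed);
            }
        }
    }
}

// Static facade that would take the place of Model.read(): looks readers up by content type.
final class ModelIO {
    private static final Map<String, TripleReader> READERS = new ConcurrentHashMap<>();

    private ModelIO() {}

    // Registers a reader class under the content type named in its @Syntax annotation.
    static void register(Class<? extends TripleReader> cls) {
        Syntax syntax = cls.getAnnotation(Syntax.class);
        if (syntax == null) {
            throw new IllegalArgumentException(cls.getName() + " has no @Syntax annotation");
        }
        try {
            READERS.put(syntax.value(), cls.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
        }
    }

    // Parses 'input' with the reader registered for 'contentType'.
    static List<String> read(String input, String contentType) {
        TripleReader reader = READERS.get(contentType);
        if (reader == null) {
            throw new IllegalArgumentException("No reader registered for " + contentType);
        }
        List<String> sink = new ArrayList<>();
        reader.read(input, sink);
        return sink;
    }
}
```

Usage would be something like ModelIO.register(LineReader.class) once at start-up, followed by ModelIO.read(text, "application/n-triples"). The point of the design is that Model itself never needs to know about parsers, so the reader interfaces and their implementations could live in separate modules.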
