Hi Patrick

Patrick Thompson wrote:
> Hello,
> I am looking to store the jena model to a cloud based
> datastore(accumulo).

Interesting. Storing is the easy part. Querying is more difficult, in
particular if you want to implement/support SPARQL.

One experiment which 'copies' the design/approach of SDB with HBase is here:

 - https://github.com/castagna/hbase-rdf

Vaibhav Khadilkar, a student from the University of Texas at Dallas, wrote
most of that. HBase is very similar to Accumulo, so you might find it useful.
Contact Vaibhav, who I am sure will be more than happy to exchange ideas and
experiences with you (he, together with colleagues, also wrote a technical
report with a section on the SDB architecture).

HBase gives you scans over data stored in your region servers, but no help
with joins, and SPARQL needs joins to combine even basic triple patterns.
There is no easy way to push those joins down to the storage layer with
HBase, so they have to run on the client and you end up with poor
performance.

Are you planning to share the code you write?

Why do you want to store 'the jena model' in Accumulo?

> I have sifted through the documentation but haven't
> yet found information that would help me do that.

Yes, you are right. We need to improve the documentation for developers who
might want to 'extend' Jena and/or plug in different storage systems,
different indexes, different parsers and/or serialization formats, etc.
Or integrate Jena with existing systems (when valuable).

> So specifically I was
> looking at this architecture picture:
> http://jena.apache.org/about_jena/architecture.html and am looking to
> implement the '?' box.

Well, this is what you get with high-level pictures: boxes.
In this case you do not even have a box? ;-)

Often, when you ask "what's inside that box?", the answer is more boxes.
Iterate a few times: all boxes. Until you end up with your primary source
of information: the source code. Fortunately, for Apache Jena you have the
sources available.

Jena has two storage systems: SDB (over RDBMS) and TDB (with custom indexes).
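
To make the join problem mentioned above a bit more concrete, here is a
minimal sketch. It is not code from Jena, SDB, TDB or hbase-rdf: the class
ClientSideJoinSketch, its methods and the "s|p|o" key layout are all made up
for illustration, and a plain in-memory sorted map stands in for an
HBase/Accumulo table that only offers prefix scans. The join between the two
triple patterns has to be done by the client, with one extra scan per
intermediate binding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch: sorted maps stand in for tables in a scan-only store. */
class ClientSideJoinSketch {
    // Two illustrative indexes over the same triples (assumed layout,
    // terms must not contain '|'): "s|p|o" and "p|o|s" row keys.
    private final NavigableMap<String, Boolean> spo = new TreeMap<>();
    private final NavigableMap<String, Boolean> pos = new TreeMap<>();

    // Counts the scans (round trips) the client issues against the store.
    int scans = 0;

    void add(String s, String p, String o) {
        spo.put(s + "|" + p + "|" + o, Boolean.TRUE);
        pos.put(p + "|" + o + "|" + s, Boolean.TRUE);
    }

    /** Prefix scan over one index: the only read operation the store offers. */
    private List<String[]> scan(NavigableMap<String, Boolean> index, String prefix) {
        scans++;
        List<String[]> rows = new ArrayList<>();
        for (String key : index.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // keys are sorted, so the matching range has ended
            }
            rows.add(key.split("\\|"));
        }
        return rows;
    }

    /**
     * Evaluates { ?x type Person . ?x name ?name } on the client:
     * one scan for the first pattern, then one scan per binding of ?x
     * for the second pattern (an index nested-loop join).
     */
    List<String> personsWithNames() {
        List<String> results = new ArrayList<>();
        for (String[] typeRow : scan(pos, "type|Person|")) {
            String x = typeRow[2]; // subject is the last part of a "p|o|s" key
            for (String[] nameRow : scan(spo, x + "|name|")) {
                results.add(x + " " + nameRow[2]); // object of an "s|p|o" key
            }
        }
        return results;
    }
}
```

With N matches for the first pattern the client issues N + 1 scans, and every
intermediate row travels over the network. That is the cost you pay when the
storage layer cannot do the join for you.
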
You can find some documentation of SDB layouts and TDB indexes here:

 - http://jena.apache.org/documentation/sdb/database_layouts.html
 - http://jena.apache.org/documentation/tdb/architecture.html

Not much, but, once again, you have the sources. :-)

I know that this is not the answer you are searching for, but looking at
the SDB design is quite useful and it would give you a better idea of what
you need to do.

Not too long ago, there was a thread similar to what you are trying to do:

 - How to implement a custom JENA Backend
   http://markmail.org/thread/g27g73pjj2ozsgbx

Search also on Jena JIRA: https://issues.apache.org/jira/browse/JENA
There are a couple of relevant open issues where people are trying to add a
new storage layer to Jena (exactly what you are trying to do).

> At least that is what I am hoping I need to do.

Why do you think this is what you need to do?

> Instead of going to mysql I want to write to a big table of sorts.

Why not TDB? How much data do you need to store?

> Any pointers will be greatly helpful. Thanks.
>
> p.
>

My 2 cents,
Paolo
