Hi Patrick

Patrick Thompson wrote:
> Hello,
>   I am looking to store the jena model to a cloud based
> datastore(accumulo). 

Interesting.

Storing is the easy part. Querying is more difficult, in particular if you want
to implement/support SPARQL.

One experiment which 'copies' the design/approach of SDB with HBase is here:

 - https://github.com/castagna/hbase-rdf

Vaibhav Khadilkar, a student from the University of Texas at Dallas, wrote most
of that. HBase is very similar to Accumulo, so you might find it useful.
Contact Vaibhav, who I am sure will be more than happy to exchange ideas and
experiences with you (he and his colleagues also wrote a technical report with
a section on the SDB architecture).

HBase gives you scans over the data stored in your region servers, but no help
with joins, and SPARQL needs joins to combine the triple patterns of a basic
graph pattern. There is no easy way to push those joins down to the storage
layer with HBase, so they have to run on the client (and you end up with poor
performance).
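
To make the join problem concrete, here is a minimal sketch in Python. It is
not Jena or HBase code: the in-memory list stands in for a scan-only store, and
the names `scan` and `join` are made up for illustration. The store can only
answer one triple pattern at a time, so the client has to combine the results:

```python
# Hypothetical stand-in for a store that only supports scans.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
]


def is_var(term):
    """Variables are written SPARQL-style, with a leading '?'."""
    return term.startswith("?")


def scan(pattern, binding):
    """Match one triple pattern against the store, extending `binding`.

    This is the only operation the (hypothetical) store offers.
    """
    # Replace variables that are already bound with their values.
    concrete = tuple(binding.get(t, t) if is_var(t) else t for t in pattern)
    for triple in TRIPLES:
        extended = dict(binding)
        matched = True
        for want, got in zip(concrete, triple):
            if is_var(want):
                # A variable repeated within the pattern must match itself.
                if extended.get(want, got) != got:
                    matched = False
                    break
                extended[want] = got
            elif want != got:
                matched = False
                break
        if matched:
            yield extended


def join(patterns):
    """Client-side nested-loop join: one scan per intermediate binding."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in scan(pattern, b)]
    return bindings


# Roughly: SELECT * WHERE { ?x knows ?y . ?y name ?n }
solutions = join([("?x", "knows", "?y"), ("?y", "name", "?n")])
```

Every binding produced by the first pattern triggers another scan for the
second one, and all of that traffic goes through the client. That is where the
performance goes.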

Are you planning to share the code you write?

Why do you want to store 'the jena model' in Accumulo?

> I have sifted through the documentation but haven't
> yet found information that would help me do that. 

Yes, you are right.

We need to improve the documentation for developers who might want to 'extend'
Jena and/or plug in different storage systems, different indexes, different
parsers and/or serialization formats, etc., or integrate Jena with existing
systems (when valuable).

> So specifically I was
> looking at this architecture picture:
> http://jena.apache.org/about_jena/architecture.html  and am looking to
> implement the '?' box.

Well, this is what you get with high-level pictures: boxes.
In this case you do not even have a box? ;-)
Often, when you ask: "what's inside that box?" you get as an answer more boxes.
Iterate a few times: all boxes. Until you end up with your primary source of
information: the source code. Fortunately, for Apache Jena you have the sources
available.

Jena has two storage systems: SDB (over RDBMS) and TDB (with custom indexes).
You can find some documentation of SDB layouts and TDB indexes here:

 - http://jena.apache.org/documentation/sdb/database_layouts.html
 - http://jena.apache.org/documentation/tdb/architecture.html

Not much, but, once again, you have the sources. :-)
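
To give a flavour of the index idea, here is a rough sketch in Python. It is
not TDB's code or on-disk format. TDB keeps triples in several orderings (SPO,
POS, OSP), so a pattern with some terms bound becomes a prefix scan over one
sorted index, and the same trick maps onto a sorted key-value store. The sorted
in-memory lists and the names `build_indexes` and `find` are assumptions made
for illustration:

```python
from bisect import bisect_left

# Each ordering lists which triple position (0=s, 1=p, 2=o) comes first
# in the key.
ORDERS = {"SPO": (0, 1, 2), "POS": (1, 2, 0), "OSP": (2, 0, 1)}


def build_indexes(triples):
    """Store every triple once per ordering, each as a sorted list of keys."""
    return {
        name: sorted(tuple(t[i] for i in perm) for t in triples)
        for name, perm in ORDERS.items()
    }


def find(indexes, s=None, p=None, o=None):
    """Answer a triple pattern (None = unbound) with a single prefix scan."""
    pattern = (s, p, o)

    def prefix_len(perm):
        # Number of leading key positions that the pattern binds.
        n = 0
        for i in perm:
            if pattern[i] is None:
                break
            n += 1
        return n

    # Pick the ordering in which the bound terms form the longest key prefix.
    name = max(ORDERS, key=lambda k: prefix_len(ORDERS[k]))
    perm = ORDERS[name]
    prefix = tuple(pattern[i] for i in perm[:prefix_len(perm)])

    index = indexes[name]
    results = []
    # Jump to the first key >= prefix, then read while the prefix matches.
    for key in index[bisect_left(index, prefix):]:
        if key[:len(prefix)] != prefix:
            break
        triple = [None, None, None]
        for pos, i in enumerate(perm):
            triple[i] = key[pos]  # map the key back to (s, p, o) order
        results.append(tuple(triple))
    return name, results


indexes = build_indexes([
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "name", "Alice"),
])
used, matches = find(indexes, p="knows")
```

With these three orderings every combination of bound and unbound terms is a
key prefix in at least one index, so no pattern needs a full scan followed by
filtering (except the all-unbound one, which is a full scan by definition).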

I know that this is not the answer you are searching for, but looking at the SDB
design is quite useful and it would give you a better idea of what you need to
do.

Not too long ago, there was a thread about something similar to what you are
trying to do:

 - How to implement a custom JENA Backend
   http://markmail.org/thread/g27g73pjj2ozsgbx

Search also on Jena JIRA: https://issues.apache.org/jira/browse/JENA
There are a couple of relevant open issues where people are trying to add a new
storage layer to Jena (exactly what you are trying to do).

>  At least that is what I am hoping I need to do.

Why do you think this is what you need to do?

> Instead of going to mysql I want to write to a big table of sorts. 

Why not TDB?

How much data do you need to store?

> Any pointers will be greatly helpful. Thanks.
> 
> p.
> 

My 2 cents,
Paolo
