You should both take a look at the entity-centric index -- it uses the column 
family and column qualifier to store triple info outside of the row key. We 
have looked at alternate storage schemes, but they have been largely abandoned 
due to a desire not to disrupt existing users (who may have tools that take 
advantage of the current storage scheme and bypass the DAO).
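
To make the difference between the two layouts concrete, here is a minimal sketch in plain Python. It is not Rya or Accumulo code: the `Key` tuple, the `\x00` delimiter, and the helper names `row_only_key`, `entity_key`, and `prefix_scan` are all illustrative assumptions, and the split layout is only one hypothetical way of moving triple parts out of the row ID.

```python
from bisect import bisect_left
from typing import NamedTuple

# Assumed separator for this sketch: a byte that sorts before printable text.
DELIM = "\x00"


class Key(NamedTuple):
    """Simplified stand-in for a sorted key (row, column family, qualifier)."""
    row: str
    family: str = ""
    qualifier: str = ""


def row_only_key(s: str, p: str, o: str) -> Key:
    """Whole triple packed into the row ID; family and qualifier stay empty."""
    return Key(row=DELIM.join((s, p, o)))


def entity_key(s: str, p: str, o: str) -> Key:
    """Hypothetical split layout: subject in the row, rest in the columns."""
    return Key(row=s, family=p, qualifier=o)


def prefix_scan(sorted_keys: list, row_prefix: str) -> list:
    """Return keys whose row starts with row_prefix.

    Relies only on sort order, the way a range scan over a sorted table does:
    seek to the first candidate, then read until the prefix stops matching.
    """
    start = bisect_left(sorted_keys, Key(row=row_prefix))
    hits = []
    for key in sorted_keys[start:]:
        if not key.row.startswith(row_prefix):
            break
        hits.append(key)
    return hits


triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "knows", "carol"),
]

# Row-only layout: one sorted list of rows answers "subject + predicate" lookups.
table = sorted(row_only_key(*t) for t in triples)
alice_knows = prefix_scan(table, "alice" + DELIM + "knows" + DELIM)

# Split layout: the row is just the subject, so a row lookup returns every
# statement about that entity and the columns carry the remaining parts.
entity_table = sorted(entity_key(*t) for t in triples)
alice_rows = prefix_scan(entity_table, "alice")
```

In the row-only form, any prefix of the triple becomes a contiguous range of rows. In the split form, one row groups everything about an entity, and the family and qualifier are free to be filtered on separately.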

> On Nov 25, 2016, at 2:37 AM, pranav.puri <[email protected]> wrote:
> 
> Hi,
> 
> Triples are stored entirely in the row ID in Accumulo.
> 
> Can I use the column qualifier or value to store different parameters for the 
> triples, so that I could implement an iterator to filter the triples based 
> on these parameters and then run SPARQL queries on them?
> 
> 
>> On Tuesday 22 November 2016 07:36 PM, Aaron D. Mihalik wrote:
>> Good questions.  Putting all of the information in the Row ID seems like a
>> common pattern for composite indices in Accumulo, but I went back to the
>> original Rya Paper [1] to pull out the reasoning:
>> 
>> """
>> All the data for the triple resides in the Accumulo Row ID. This offers
>> several benefits: 1) by using a direct string representation, we can do
>> direct range scans on the literals; 2) the format is very easy to serialize
>> and deserialize, which provides for faster query and ingest; 3) since no
>> information needs to be stored in the Column Family, Qualifier, or Value
>> fields of the Accumulo tables, the storage requirements for the triples are
>> significantly reduced.
>> """
>> 
>> As for overriding the storage mechanism/pattern, you'd probably have to
>> write your own TripleRowResolver [2].  There is a slide deck here [3] that
>> shows the different layers of Rya, and that might be a good place to start.
>> 
>> What sort of storage scheme are you considering?
>> 
>> --Aaron
>> 
>> [1] https://www.usna.edu/Users/cs/adina/research/Rya_CloudI2012.pdf
>> [2]
>> https://github.com/apache/incubator-rya/blob/master/common/rya.api/src/main/java/org/apache/rya/api/resolver/triple/TripleRowResolver.java
>> [3] https://cwiki.apache.org/confluence/display/RYA/Rya+Office+Hours (see
>> "Running Through Rya Examples")
>> 
>>> On Mon, Nov 21, 2016 at 11:28 PM Greg Clark <[email protected]> wrote:
>>> 
>>> As I understand, Rya stores triples entirely in the row id in Accumulo.
>>> Why?
>>> 
>>> Are entries stored this way to avoid overloading rows?
>>> 
>>> Is the storage configurable; that is, could I set a flag to allow triples
>>> to be stored with one part of the triple in the row id, one part in the
>>> column family, and the remaining one in the column qualifier?  Or perhaps
>>> set a configuration to use a different storage approach?
> 
