You should both take a look at the entity-centric index -- it uses the column family and column qualifier to store triple information outside of the row key. We have looked at alternate storage schemes, but they have been largely abandoned due to a desire not to disrupt existing users (who may have tools that take advantage of the current storage scheme and bypass the DAO).
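To make the difference between the two layouts concrete, here is a rough sketch in plain Python with no Accumulo client involved. Everything in it is an illustrative assumption: the `Key` tuple, the `\x00` delimiter, and the helper names are hypothetical, and the real entity-centric index in Rya may lay out the column family and qualifier differently. It only shows how a lookup changes when part of the triple moves out of the row.

```python
from typing import List, NamedTuple

DELIM = "\x00"  # assumed separator, for illustration only


class Key(NamedTuple):
    """Simplified stand-in for an Accumulo key (row, family, qualifier)."""
    row: str
    family: str = ""
    qualifier: str = ""


def row_only_key(s: str, p: str, o: str) -> Key:
    """Whole triple packed into the row ID, as described in this thread."""
    return Key(row=DELIM.join((s, p, o)))


def split_key(s: str, p: str, o: str) -> Key:
    """Hypothetical split layout: entity in the row, the rest in the columns."""
    return Key(row=s, family=p, qualifier=o)


def scan_row_prefix(keys: List[Key], prefix: str) -> List[Key]:
    """Mimic a range scan over sorted keys whose row starts with `prefix`."""
    return [k for k in sorted(keys) if k.row.startswith(prefix)]


def scan_exact_row(keys: List[Key], row: str) -> List[Key]:
    """Mimic a scan restricted to a single row."""
    return [k for k in sorted(keys) if k.row == row]


def filter_by_family(keys: List[Key], family: str) -> List[Key]:
    """Mimic a server-side filter that keeps one column family."""
    return [k for k in keys if k.family == family]


triples = [
    ("urn:alice", "urn:knows", "urn:bob"),
    ("urn:alice", "urn:age", "30"),
    ("urn:bob", "urn:knows", "urn:carol"),
]
row_only = [row_only_key(*t) for t in triples]
split = [split_key(*t) for t in triples]

# Row-only layout: subject + predicate lookup is a single prefix scan.
hits_row_only = scan_row_prefix(row_only, DELIM.join(("urn:alice", "urn:knows")) + DELIM)

# Split layout: scan the entity's row, then filter on the column family.
hits_split = filter_by_family(scan_exact_row(split, "urn:alice"), "urn:knows")
```

In the row-only layout the family, qualifier, and value stay empty, which is the storage saving the Rya paper mentions. The split layout trades that for the ability to filter on columns.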
> On Nov 25, 2016, at 2:37 AM, pranav.puri wrote:
>
> Hi,
>
> Since triples are stored entirely in the row ID in Accumulo, can I use the
> column qualifier or value to store different parameters for the triples?
> That way I could implement an iterator to filter the triples based on these
> parameters and then run SPARQL queries on the remaining triples.
>
>> On Tuesday 22 November 2016 07:36 PM, Aaron D. Mihalik wrote:
>> Good questions. Putting all of the information in the Row ID seems like a
>> common pattern for composite indices in Accumulo, but I went back to the
>> original Rya paper [1] to pull out the reasoning:
>>
>> """
>> All the data for the triple resides in the Accumulo Row ID. This offers
>> several benefits: 1) by using a direct string representation, we can do
>> direct range scans on the literals; 2) the format is very easy to serialize
>> and deserialize, which provides for faster query and ingest; 3) since no
>> information needs to be stored in the Column Family, Qualifier, or Value
>> fields of the Accumulo tables, the storage requirements for the triples are
>> significantly reduced.
>> """
>>
>> As for overriding the storage mechanism/pattern, you'd probably have to
>> write your own TripleRowResolver [2]. There is a slide deck here [3] that
>> shows the different layers of Rya, and that might be a good place to start.
>>
>> What sort of storage scheme are you considering?
>>
>> --Aaron
>>
>> [1] https://www.usna.edu/Users/cs/adina/research/Rya_CloudI2012.pdf
>> [2]
>> https://github.com/apache/incubator-rya/blob/master/common/rya.api/src/main/java/org/apache/rya/api/resolver/triple/TripleRowResolver.java
>> [3] https://cwiki.apache.org/confluence/display/RYA/Rya+Office+Hours (see
>> "Running Through Rya Examples")
>>
>>> On Mon, Nov 21, 2016 at 11:28 PM Greg Clark wrote:
>>>
>>> As I understand, Rya stores triples entirely in the row ID in Accumulo.
>>> Why?
>>>
>>> Are entries stored this way to avoid overloading rows?
>>>
>>> Is the storage configurable; that is, could I set a flag to allow triples
>>> to be stored with one part of the triple in the row ID, one part in the
>>> column family, and the remaining one in the column qualifier? Or perhaps
>>> set a configuration to use a different storage approach?
