Even with something as simple as a pair, things can start getting
difficult. I suppose it really revolves around the level of support you
want to provide at scan time, e.g. "find all pairs where the second
element is 'x'".
Spending a few minutes thinking about it, an index could be a separate
table but wouldn't necessarily have to be. It depends on the complexity
of the structure you're trying to index. Using the Pair example again,
you could reserve a column family in which to place index records that
simply invert the Pair in the colqual.
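To make the inverted-Pair idea concrete, here is a rough sketch in plain Java. It does not use the Accumulo API at all: a TreeMap stands in for the sorted columns of a single row, and the family names "data" and "index", the "\0" separator and the class PairIndexSketch are all assumptions made up for illustration. It also assumes the pair elements never contain "\0".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: a sorted map stands in for the columns of one row.
class PairIndexSketch {
    static final String DATA_CF = "data";   // assumed family for the real records
    static final String INDEX_CF = "index"; // assumed family reserved for index records

    // Keys look like "family\0qualifier" and sort lexicographically.
    final TreeMap<String, String> columns = new TreeMap<>();

    void put(String first, String second, String value) {
        // The data record keeps the pair in its natural order.
        columns.put(DATA_CF + "\0" + first + "\0" + second, value);
        // The index record inverts the pair in the qualifier.
        columns.put(INDEX_CF + "\0" + second + "\0" + first, "");
    }

    // "Find all pairs where the second element is x" becomes a prefix scan.
    List<String[]> findBySecond(String second) {
        String prefix = INDEX_CF + "\0" + second + "\0";
        List<String[]> out = new ArrayList<>();
        for (String key : columns.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // walked past the last matching index record
            }
            out.add(new String[] { key.substring(prefix.length()), second });
        }
        return out;
    }

    public static void main(String[] args) {
        PairIndexSketch row = new PairIndexSketch();
        row.put("a", "x", "v1");
        row.put("b", "y", "v2");
        row.put("c", "x", "v3");
        for (String[] p : row.findBySecond("x")) {
            System.out.println(p[0] + "," + p[1]);
        }
    }
}
```

The cost is one extra record per write; in exchange the lookup by second element is a contiguous range rather than a full scan.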
On 08/13/2012 11:06 AM, Keith Turner wrote:
On Sun, Aug 12, 2012 at 9:36 PM, Josh Elser<[email protected]> wrote:
Neat idea, Keith.
Have you thought about how to support more complex types? Specifically,
arrays, hashes and the nesting of those? Any thoughts about indexing for
those complex types?
Yeah I was thinking that would be nice. I see a lot of users putting
multiple types into the row and/or columns. Could have something like
TupleEncoder<List<A>>. TupleEncoder would need to encode its elements
such that they sort correctly. However, this may be cumbersome to use
if you want to use different types. For example I want a row composed
of a Long and String. I was thinking of having the following types to
handle this case.
class Pair<A,B> extends LexEncoder{
  Pair(LexEncoder<A> enc1, LexEncoder<B> enc2);
  A getFirst(){}
  B getSecond(){}
}
class Triple<A,B,C>{ /* follows same pattern as Pair */ }
class Quadruple<A,B,C,D>{ /* follows same pattern as Pair */ }
This would allow a user to write code like the following that makes it
easy to work with a row composed of a Long and String.
Pair<Long, String> pair;
long l = pair.getFirst();
String s = pair.getSecond();
I am still thinking the tuple concept through.
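One possible way to make a Long plus String row sort correctly is sketched below. This is not how Typo does it; the Encoder interface, the LONG and STRING encoders and encodePair are hypothetical names invented here. The sketch escapes the bytes 0x00 and 0x01 in the first element so that a single 0x00 can terminate it, which keeps a shorter first element sorting before a longer one that it prefixes. Decoding is left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class PairEncodingSketch {
    // Hypothetical encoder interface, not Typo's actual LexEncoder.
    interface Encoder<T> {
        byte[] encode(T value);
    }

    // Flipping the sign bit makes big-endian bytes sort like signed longs.
    static final Encoder<Long> LONG =
        v -> ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();

    static final Encoder<String> STRING =
        v -> v.getBytes(StandardCharsets.UTF_8);

    // Escape 0x00 as 0x01 0x01 and 0x01 as 0x01 0x02 (order is preserved).
    static byte[] escape(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            if (b == 0x00) {
                out.write(0x01);
                out.write(0x01);
            } else if (b == 0x01) {
                out.write(0x01);
                out.write(0x02);
            } else {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    // Layout: escaped first element, 0x00 terminator, raw second element.
    static <A, B> byte[] encodePair(Encoder<A> encA, A a, Encoder<B> encB, B b) {
        byte[] first = escape(encA.encode(a));
        byte[] second = encB.encode(b);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(first, 0, first.length);
        out.write(0x00);
        out.write(second, 0, second.length);
        return out.toByteArray();
    }

    // Unsigned lexicographic comparison of byte arrays.
    static int compare(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        byte[] p1 = encodePair(LONG, -5L, STRING, "b");
        byte[] p2 = encodePair(LONG, 3L, STRING, "a");
        System.out.println(compare(p1, p2) < 0);
    }
}
```

Only the first element needs escaping in this layout because the second one runs to the end of the field; a Triple would escape and terminate every element but the last.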
I was not considering indexing. I assume you mean creating an index
in another table?
Initial thoughts are that it would make the most sense to place Typo at the
contrib level (or something equivalent). The reason being: Typo doesn't
change the underlying functionality of Accumulo; it only provides a layer on
top of it that makes life easier for developers.
I think putting it in contrib makes sense.
On 08/10/2012 07:07 PM, Keith Turner wrote:
I put together a simple abstraction layer for Accumulo that makes it
easier to read and write Java objects to Accumulo key and value
fields. The data written to Accumulo sorts correctly
lexicographically.
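For anyone wondering why the sort order needs special care, here is a small sketch. It is not taken from Typo; the class and method names are made up. Writing a long as its decimal string sorts "10" before "9", whereas a fixed-width big-endian encoding with the sign bit flipped sorts the same way the numbers do, negatives included.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class LongSortSketch {
    // Naive encoding: the decimal string, which does not sort numerically.
    static byte[] naive(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    // 8 big-endian bytes with the sign bit flipped, so negatives sort first.
    static byte[] encode(long v) {
        return ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();
    }

    // Reverse of encode: read the 8 bytes and flip the sign bit back.
    static long decode(byte[] b) {
        return ByteBuffer.wrap(b).getLong() ^ Long.MIN_VALUE;
    }

    // Unsigned lexicographic comparison of byte arrays.
    static int compare(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        System.out.println("naive 9 vs 10: " + compare(naive(9), naive(10)));
        System.out.println("encoded 9 vs 10: " + compare(encode(9), encode(10)));
    }
}
```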
I put the code on github and would like some feedback on the design
and whether it should be included with Accumulo.
https://github.com/keith-turner/typo
It's still a little rough and I need to add encoders for all of the
primitive types.
Keith