What makes you say that the OrIterator cannot handle more than one row
per tablet? Can you provide details?
AFAIK, the OrIterator should work correctly in all cases (e.g.
regardless of row distribution in a tablet). Any issues in the code that
prevent it from doing so would be a bug that should be fixed.
Also, the wikisearch example supports indexing over multiple attributes
(and I believe indexes document metadata in addition to the tokenized
document). Is there something unclear that could be better documented?
On 8/22/12 4:41 PM, Cardon, Tejay E wrote:
All,
I'm interested in writing a custom iterator, and I've been looking for
documentation on how to do so. Thus far, I've not been able to find
anything beyond the Javadoc for SortedKeyValueIterator and a few
other sub-classes. A few of the examples use Iterators, but provide
no real info on how to properly implement one. Is there anywhere to
find general guidance on the iterator stack?
(If you're interested)
Specifically, for those that are curious, I'm trying to implement
something similar to the wikisearch example, but with some key
differences. In my case, I've got a file with various attributes that
are being indexed. So for each file there are 5 attributes, and each
attribute has a fixed number of possible values. For example (totally
made up):
personID, gender, hair color, country, race, personRecord
Row:binID; ColFam:Attribute_AttributeValue; ColQ:PersonID; Val:blank
AND
Row:binID; ColFam:"D"; ColQ:personID; value:personRecord
A typical query would be:
Give me the personRecord for all people with:
Gender: male &
Hair color: blond or brown &
Country: USA or England or China or Korea &
Race: white or oriental
The existing Iterators used in the wikisearch example are unable to
handle the "or" clauses in each attribute.
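For illustration, the query semantics described above (OR within each attribute, AND across attributes) can be sketched in plain Java. This is not an Accumulo iterator and uses no Accumulo API. The class name, the helper names, and the sample people are all hypothetical, and an in-memory map stands in for the index entries of a single binID row.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java sketch only: no Accumulo classes are used, and every name here is hypothetical.
class AttributeQuerySketch {

    // Adds one index entry per attribute for a single bin, mirroring
    // Row:binID; ColFam:Attribute_AttributeValue; ColQ:personID.
    static void indexPerson(Map<String, SortedSet<String>> index, String personId,
                            String gender, String hair, String country, String race) {
        String[] colFams = {"gender_" + gender, "hair_" + hair,
                            "country_" + country, "race_" + race};
        for (String colFam : colFams) {
            index.computeIfAbsent(colFam, k -> new TreeSet<>()).add(personId);
        }
    }

    // OR within one attribute: union of the personIDs under each requested value.
    static SortedSet<String> anyOf(Map<String, SortedSet<String>> index,
                                   String attribute, List<String> values) {
        SortedSet<String> ids = new TreeSet<>();
        for (String value : values) {
            SortedSet<String> matches = index.get(attribute + "_" + value);
            if (matches != null) {
                ids.addAll(matches);
            }
        }
        return ids;
    }

    // AND across attributes: intersection of the per-attribute unions.
    static SortedSet<String> query(Map<String, SortedSet<String>> index,
                                   Map<String, List<String>> clauses) {
        SortedSet<String> result = null;
        for (Map.Entry<String, List<String>> clause : clauses.entrySet()) {
            SortedSet<String> ids = anyOf(index, clause.getKey(), clause.getValue());
            if (result == null) {
                result = ids;
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }

    // Made-up sample data; none of it comes from a real table.
    static Map<String, SortedSet<String>> sampleIndex() {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        indexPerson(index, "p1", "male", "blond", "USA", "white");
        indexPerson(index, "p2", "female", "brown", "USA", "white");
        indexPerson(index, "p3", "male", "brown", "korea", "oriental");
        indexPerson(index, "p4", "male", "red", "England", "white");
        indexPerson(index, "p5", "male", "blond", "France", "white");
        return index;
    }

    // The example query from the message, one clause per attribute.
    static Map<String, List<String>> sampleQuery() {
        Map<String, List<String>> clauses = new LinkedHashMap<>();
        clauses.put("gender", Arrays.asList("male"));
        clauses.put("hair", Arrays.asList("blond", "brown"));
        clauses.put("country", Arrays.asList("USA", "England", "china", "korea"));
        clauses.put("race", Arrays.asList("white", "oriental"));
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(query(sampleIndex(), sampleQuery())); // prints [p1, p3]
    }
}
```

The resulting personIDs would then be used to look up the personRecord values under the "D" column family. A real iterator would presumably have to apply the same union-then-intersect shape to sorted key streams rather than to materialized sets; that part is not shown here.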
The OrIterator doesn't appear to handle the possibility of more than one
row per tablet.
Thanks,
Tejay Cardon