Re: How to choose BinId for Document partitioned index

William Slacum Sat, 06 Feb 2016 09:26:34 -0800

Often it'll be a hash of the document, mod the number of bins you're using.
The input to the hash should be "good" in the sense that it uniquely
identifies the document. It can be as simple as some unique field in the
document, or you can just hash the whole document (with something like murmur).
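
As a rough sketch of the hash-mod-bins idea (not taken from any Accumulo example): the function name bin_id, the bin count of 16, and the zero-padding are all assumptions for illustration, and MD5 from the Python standard library stands in for murmur only because it needs no extra package.

```python
import hashlib

NUM_BINS = 16  # assumed bin count; pick one that suits your cluster


def bin_id(doc_key: str, num_bins: int = NUM_BINS) -> str:
    """Map a document's unique key to a bin id (hypothetical helper)."""
    # Hash the unique key; MD5 is only a stand-in for murmur here.
    digest = hashlib.md5(doc_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer.
    value = int.from_bytes(digest[:8], "big")
    # Zero-pad so bin ids sort lexicographically as row prefixes
    # (an assumption about how the bin id is used in the row key).
    width = len(str(num_bins - 1))
    return str(value % num_bins).zfill(width)


if __name__ == "__main__":
    for key in ("doc-1", "doc-2", "doc-3"):
        print(key, "->", bin_id(key))
```

The same key always lands in the same bin, so a document's index entries and its content can be written under one bin id.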


On Saturday, February 6, 2016, Jamie Johnson <jej2...@gmail.com> wrote:

> Just found this excellent write up that explains a bit.
>
> https://www.slideshare.net/mobile/acordova00/text-indexing-in-accumulo
> On Feb 6, 2016 8:52 AM, "Jamie Johnson" <jej2...@gmail.com> wrote:
>
>> Reading the examples for table design I've come across a question
>> associated with the document partitioned index, specifically what is
>> typically chosen as the BinId or maybe more appropriately what factors
>> should influence what is chosen as the BinId and what impact do they have?
>>
>
