The only thing that is a bit different is that we encode (base62) the number
of xxxx's in the last digit, mainly so the terms are shorter.


use Data::Dumper;

my @foo = encode_trie(100000);
print Dumper(\@foo);

The output would look like this:
$VAR1 = [
          '1a',            ## 1xxxxxxxxxx
          '129',           ## 12xxxxxxxxx
          '1208',
          '12007',
          '120026',
          '1200205',
          '12002014',
          '120020113',
          '1200201122',
          '12002011201',
          '120020112010'   ## the exact match for  base 3  @ 100000
        ];
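
To make the listing above concrete, here is a minimal sketch that reproduces
it. It is not the code from the patch: encode_trie_sketch() and to_base3() are
hypothetical names, and the base62 alphabet (0-9, then a-z, then A-Z) is an
assumption inferred from 10 showing up as 'a' in the first term. Whether the
real code pads numbers to a fixed width is not shown here, so this sketch does
not pad.

```perl
use strict;
use warnings;

# Assumed base62 alphabet: 0-9, then a-z, then A-Z (so 10 encodes as 'a').
my @B62 = ('0'..'9', 'a'..'z', 'A'..'Z');

# Convert a non-negative integer to its base-3 digit string.
sub to_base3 {
    my ($n) = @_;
    return '0' if $n == 0;
    my $digits = '';
    while ($n > 0) {
        $digits = ($n % 3) . $digits;
        $n = int($n / 3);
    }
    return $digits;
}

# Hypothetical stand-in for encode_trie(): one term per prefix length, each
# prefix followed by a single base62 char counting the wildcard (x) digits.
sub encode_trie_sketch {
    my ($n) = @_;
    my $digits = to_base3($n);
    my $len    = length $digits;
    # The wildcard count runs from $len - 1 down to 0 and must fit in 0..61.
    die "wildcard count does not fit in one base62 char\n" if $len > 62;
    return map { substr($digits, 0, $_) . $B62[$len - $_] } 1 .. $len;
}

print "$_\n" for encode_trie_sketch(100000);
```

Run as is, it prints the eleven terms from '1a' down to '120020112010', one
per line.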



So you really only use encode_trie(int) to build the terms to index, and
query_trie(minint, maxint) to build the search terms at query time.
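
For the query side, one way such a range could be turned into terms is
sketched below. This is an assumption about the general approach (a greedy
cover with aligned base-3 blocks), not the actual query_trie() from the patch;
query_trie_sketch(), base3_str() and the 0-9/a-z/A-Z base62 alphabet are all
hypothetical. A document matches if any one of its indexed terms equals any
one of the query terms.

```perl
use strict;
use warnings;

# Assumed base62 alphabet, matching the 'a' == 10 seen in the listing above.
my @DIGITS62 = ('0'..'9', 'a'..'z', 'A'..'Z');

# Base-3 digit string of a non-negative integer.
sub base3_str {
    my ($n) = @_;
    my $s = '';
    do { $s = ($n % 3) . $s; $n = int($n / 3) } while $n > 0;
    return $s;
}

# Hypothetical range cover, not the real query_trie(): greedily take the
# largest aligned block of 3**$w values that starts at $cur and fits.
sub query_trie_sketch {
    my ($min, $max) = @_;
    my @terms;
    my $cur = $min;
    while ($cur <= $max) {
        my $w = 0;
        # Grow the block while it stays aligned, stays inside the range,
        # and (by requiring $cur > 0) keeps a non-zero leading digit.
        while ($cur > 0
               && $cur % 3**($w + 1) == 0
               && $cur + 3**($w + 1) - 1 <= $max) {
            $w++;
        }
        # Term = shared prefix, then the wildcard count as one base62 char.
        push @terms, base3_str($cur / 3**$w) . $DIGITS62[$w];
        $cur += 3**$w;
    }
    return @terms;
}

print join(' ', query_trie_sketch(5, 20)), "\n";   # prints: 120 21 12 201
```

In that example '12' stands for 1xx (9 through 17) and '201' for 20x (18
through 20), so four terms cover sixteen values.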


A few things I'm pretty sure need some love:
 1.  encode() and query_trie() are hard-coded for base 3.
 2.  If the length of your trie gets longer than 62 chars, the cute disk-saving
trick above will surely not work.


enjoy,
-Dan


On Jun 22, 2011, at 7:19 PM, Peter Karman wrote:

> Marvin Humphrey wrote on 6/22/11 8:51 PM:
>>> On Tue, Jun 21, 2011 at 12:42:43AM -0500, Peter Karman wrote:
>>>> I want to override the behavior of the RangeQuery class to support my 
>>>> pseudo
>>>> multi-value fields, which I achieve by concatenating values with the \x03 
>>>> byte.
>> 
>> OK, there's another option which has suddenly become more attractive. :)  My
>> Eventful colleague Dan Markham has submitted a trie implementation that can 
>> be
>> used for generating numeric ranges:
>> 
>>    https://issues.apache.org/jira/browse/LUCY-159
>> 
>> It is to some degree based on the algorithm used by Lucene's 
>> NumericRangeQuery:
>> 
>>    http://s.apache.org/QOx
>> 
> 
> Thanks to both you and Dan for this contribution!
> 
> I'll have a look at the code and the docs and see if it feels workable for my
> particular need. In any case, I think it's great to see contributions like
> these, expanding the Lucy ecosystem.
> 
> 
> -- 
> Peter Karman  .  http://peknet.com/  .  [email protected]
