Re: [FastBit-users] choosing optimal index spec

K. John Wu Tue, 15 Sep 2009 13:51:08 -0700

Hi, Andrew,

Conceptually, you are right, but the specific numbers may not be 
exactly as you indicated.


In a two-level index, the coarse bins are based on a criterion that is 
unlikely to put the same number of fine bins in each coarse bin.  You 
can find more information about how the two-level binning works in 
Section 7.1 of this paper <http://crd.lbl.gov/~kewu/ps/LBNL-60891.pdf>.
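
To make the bin arithmetic in the quoted example below concrete, here is a 
minimal Python sketch.  It is not FastBit code: it assumes uniform binning 
over [begin, end) with integer bounds, the helper names bin_width and 
on_bin_boundary are made up for illustration, and it ignores whether the 
query end points are open or closed.

```python
def bin_width(begin, end, nbins):
    """Width of one bin when [begin, end) is split into nbins equal bins."""
    return (end - begin) / nbins


def on_bin_boundary(value, begin, end, nbins):
    """True if value coincides with a bin boundary.

    A boundary has the form begin + k * (end - begin) / nbins for an
    integer k, so the test is done in integer arithmetic to avoid
    floating-point rounding.
    """
    span = end - begin
    return ((value - begin) * nbins) % span == 0


# Values taken from the quoted column specification and query.
BEGIN, END = 0, 100_000_000
for nbins in (10_000, 100_000):
    aligned = [on_bin_boundary(v, BEGIN, END, nbins) for v in (2000, 78000)]
    print(nbins, bin_width(BEGIN, END, nbins), aligned)
```

With nbins=10000 the bins are 10,000 wide, so neither 2000 nor 78000 falls 
on a boundary; with nbins=100000 the bins are 1000 wide and both do.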

John



On 9/15/2009 1:33 PM, Andrew Olson wrote:
> On Sep 15, 2009, at 4:11 PM, K. John Wu wrote:
> 
> 
>> BEGIN Column
>> name=start
>> data_type=Unsigned
>> minimum=0
>> maximum=100000000
>> index=<binning begin=0 end=1e8 nbins=10000/><encoding interval-equality/>
>> END Column
>>
>> (note: the index specification must be on one line in -part.txt)
>>
>> This index can efficiently handle i_begin and i_end that are multiples
>> of 10,000 (the bin width).  If you have "2000 < start < 78000" in a
>> query, then the index is insufficient to resolve this range condition,
>> and FastBit will have to go back to the raw data.
>>
>> In the above example, the values 2000 and 78000 are multiples of 1000.
>> Therefore, you can increase nbins to 100000 so that the bin boundaries
>> have the same resolution as the query boundaries.
> 
> One follow up question:
> 
> If nbins is 10000, does FastBit use the low-resolution bins for
> 10000-70000 and the high-resolution bins for 2000-10000 and 70000-78000?
> 
> Andrew
> _______________________________________________
> FastBit-users mailing list
> [email protected]
> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users