Hi John,

Thanks for the advice! Space is not the constraint - I want the queries to execute as fast as they can, so I will go with equality encoding as you suggested.

cheers,
gaurav

On Wed, Jul 18, 2012 at 7:53 PM, K. John Wu <[email protected]> wrote:
> Hi, Gaurav,
>
> If you have two million unique identifiers, a good option is to sort
> your data according to this identifier. The exception to this might
> be that you have some other operation that requires the data records
> to be in a different order.
>
> As for choosing between binary encoding and equality encoding, since
> you mentioned you will be mostly doing equality queries on this
> column, it would be best to use the equality encoding. There is
> also an exception. If you really want to keep the index size small,
> then the binary encoding produces the smaller index. However, for 2
> million rows, the index size should not be a serious issue, I presume.
> If you are seriously worried about disk space, then sort the rows
> according to this ID column. Either use the FastBit sorting procedure or
> tell FastBit the data is sorted according to this column, so that
> FastBit would know that this column is sorted.
>
> John
>
>
> On 7/17/12 10:32 AM, Gaurav Agarwal wrote:
> > Hi John,
> >
> > I have a column which contains about 2 million unique integers (total
> > 2M rows). What would you recommend as the best option to index
> > them for the fastest equality query on this column (binary, equality or
> > something else)? I need to use this column only in equality
> > conditions (this column is being treated as an identifier of the row
> > and therefore I'll not be using it in range operations or in
> > any aggregate operations).
> >
> > In general, could you please help us decide between equality and binary
> > indexing? If instead of integers, I had 2M unique text values, would
> > binary indexing be the best option?
> >
> > Regards,
> > Gaurav
> >
> >
> > _______________________________________________
> > FastBit-users mailing list
> > [email protected]
> > https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
> >
>
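
For anyone reading this thread later, the trade-off John describes can be illustrated with a small, self-contained Python sketch. This is not FastBit code and does not use the FastBit API; the function names are hypothetical, and Python integers stand in for uncompressed bitmaps (FastBit compresses its bitmaps, which this sketch ignores). It only shows the general idea: equality encoding keeps one bitmap per distinct value, so an equality query reads a single bitmap, while binary encoding keeps one bitmap per bit of the value's code, so the index is much smaller but every bitmap must be combined to answer the same query.

```python
def build_equality_index(column):
    """Equality encoding: one bitmap per distinct value.
    Bit i of a bitmap is set when row i holds that value."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def build_binary_index(column):
    """Binary encoding: each distinct value gets an integer code
    (its rank among the sorted distinct values), and there is one
    bitmap per bit position of that code."""
    distinct = sorted(set(column))
    code = {value: i for i, value in enumerate(distinct)}
    nbits = max(1, (len(distinct) - 1).bit_length())
    bitmaps = [0] * nbits
    for row, value in enumerate(column):
        c = code[value]
        for b in range(nbits):
            if (c >> b) & 1:
                bitmaps[b] |= 1 << row
    return code, bitmaps


def query_equality(index, value):
    """Equality query on the equality-encoded index: a single lookup."""
    return index.get(value, 0)


def query_binary(code, bitmaps, value, nrows):
    """Equality query on the binary-encoded index: AND together every
    bitmap, complementing those whose bit is 0 in the value's code."""
    if value not in code:
        return 0
    c = code[value]
    all_rows = (1 << nrows) - 1
    hits = all_rows
    for b, bitmap in enumerate(bitmaps):
        hits &= bitmap if (c >> b) & 1 else (all_rows & ~bitmap)
    return hits


def rows(bitmap):
    """List the row numbers whose bits are set in a bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]
```

With 2 million distinct values, this scheme would need 2,000,000 bitmaps under equality encoding but only 21 under binary encoding (codes 0 to 1,999,999 fit in 21 bits), which is the size-versus-speed trade-off discussed above.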
