Hi John,

Thanks for the advice!
Space is not the constraint. I want the queries to execute as fast as they
can, so I will go with equality encoding as you suggested.
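
A minimal sketch of the trade-off, in plain Python rather than FastBit's own
API. The function names are hypothetical, and uncompressed Python integers
stand in for bitmaps, so this only illustrates the idea: equality encoding
keeps one bitmap per distinct value and answers an equality query by reading
one bitmap, while binary encoding keeps about log2(C) bitmaps for C distinct
values and has to combine all of them for every query.

```python
def build_equality_index(values):
    """Equality encoding: one bitmap per distinct value.

    Bit i of index[v] is set when row i holds value v.
    """
    index = {}
    for row, v in enumerate(values):
        index[v] = index.get(v, 0) | (1 << row)
    return index


def build_binary_index(values):
    """Binary encoding: one bitmap per bit of each value's integer code."""
    # Assumption for this sketch: codes are assigned in sorted value order.
    codes = {v: c for c, v in enumerate(sorted(set(values)))}
    nbits = max(1, (len(codes) - 1).bit_length())  # ceil(log2(C)), at least 1
    slices = [0] * nbits
    for row, v in enumerate(values):
        code = codes[v]
        for b in range(nbits):
            if (code >> b) & 1:
                slices[b] |= 1 << row
    return codes, slices


def query_equality(index, target):
    """Equality query on the equality-encoded index: a single bitmap lookup."""
    return index.get(target, 0)


def query_binary(codes, slices, target, nrows):
    """Equality query on the binary-encoded index: combine every bit slice."""
    if target not in codes:
        return 0
    code = codes[target]
    all_rows = (1 << nrows) - 1
    result = all_rows
    for b, bitmap in enumerate(slices):
        # Keep rows whose b-th code bit matches the target's b-th code bit.
        result &= bitmap if (code >> b) & 1 else (all_rows & ~bitmap)
    return result


def rows_of(bitmap):
    """Row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


ids = [40, 10, 30, 20, 50]
eq_index = build_equality_index(ids)
codes, slices = build_binary_index(ids)
print(len(eq_index), len(slices))                          # 5 bitmaps vs 3
print(rows_of(query_equality(eq_index, 30)))               # [2]
print(rows_of(query_binary(codes, slices, 30, len(ids))))  # [2]
```

With 2M unique values the same pattern would mean roughly 2M small bitmaps for
equality encoding versus about 21 for binary encoding, which matches the
size-versus-speed trade-off described below.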

cheers,
gaurav

On Wed, Jul 18, 2012 at 7:53 PM, K. John Wu <[email protected]> wrote:

> Hi, Gaurav,
>
> If you have two million unique identifiers, a good option is to sort
> your data according to this identifier.  The exception to this might
> be that you have some other operation that requires the data records
> to be in a different order.
>
> As for choosing between binary encoding and equality encoding, since
> you mentioned you will be mostly doing equality queries on this
> column, then it would be best to use the equality encoding.  There is
> also an exception.  If you really want to keep the index size small,
> then the binary encoding produces the smaller index.  However, for 2
> million rows, the index size should not be a serious issue, I presume.
> If you are seriously worried about disk space, then sort the rows
> according to this ID column.  Either use the FastBit sorting procedure or
> tell FastBit the data is sorted according to this column, so that
> FastBit would know that this column is sorted.
>
> John
>
>
> On 7/17/12 10:32 AM, Gaurav Agarwal wrote:
> > Hi John,
> >
> > I have a column which contains about 2 million unique integers (total
> > 2M rows). What would you recommend as the best option to index
> > them for the fastest equality queries on this column (binary, equality,
> > or something else)? I need to use this column only in equality
> > conditions (this column is treated as an identifier of the row,
> > so I will not be using it in range operations or in any
> > aggregate operations).
> >
> > In general, could you pls help us decide between equality and binary
> > indexing? If instead of integers, I had 2M unique text values, would
> > binary indexing be the best option?
> >
> > Regards,
> > Gaurav
> >
> >
> > _______________________________________________
> > FastBit-users mailing list
> > [email protected]
> > https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
> >
>