If there are 9k possible entries in the lookup table, in order to achieve
space savings, the keys will need to be 1 or 2 bytes. For simplicity, let's
say you go with the 2-byte version. For 30 billion cells you will save 2
bytes per cell at best (from 4 bytes down to 2), for a total savings of
60 GB; at worst it will take more space, because the lookup keys will be
longer than the actual values being looked up.
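To put rough numbers on it (a quick sketch using the figures from this thread: 30 billion cells, codes of 1-4 bytes, 2-byte lookup keys):

```python
# Back-of-the-envelope math for the lookup-table idea.
cells = 30_000_000_000

# Best case: a 4-byte code is replaced by a 2-byte key.
best_case_saving = cells * (4 - 2)
print(best_case_saving / 10**9, "GB saved")   # → 60.0 GB saved

# Worst case: a 1-byte code is replaced by a 2-byte key, so the data grows.
worst_case_growth = cells * (2 - 1)
print(worst_case_growth / 10**9, "GB larger")  # → 30.0 GB larger
```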

The added complexity of a lookup table would not make that savings worth it
to me, but you know your data best.
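For what it's worth, here is a rough sketch of the alternative Stack suggests below (fixed 4-byte binary keys, null-prefixed when the code is shorter than 4 characters). The function names are mine for illustration, not from any HBase API:

```python
# Encode a 1-4 character code as a fixed-width 4-byte key,
# left-padded with NUL bytes, per Stack's suggestion.
def encode_code(code: str) -> bytes:
    raw = code.encode("ascii")
    if not 1 <= len(raw) <= 4:
        raise ValueError("codes are 1 to 4 characters")
    return b"\x00" * (4 - len(raw)) + raw  # null-prefix to 4 bytes

def decode_code(key: bytes) -> str:
    # Strip the NUL padding to recover the original code.
    return key.lstrip(b"\x00").decode("ascii")

print(encode_code("A4"))    # → b'\x00\x00A4'
print(decode_code(encode_code("AAA5")))  # → AAA5
```

This keeps keys sortable and fixed-width without any lookup table, at a cost of at most 3 padding bytes per cell.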

Just my $0.02

--Tom

On Sunday, September 16, 2012, Rita <[email protected]> wrote:
> Yes, I am trying to save on disk space because of limited resources and the
> table will be around 30 billion rows.
>
> The lookup table itself will be around 9k rows so its not too bad. A
> character's range will be from 1 to 4.
>
> I suppose I really shouldn't worry about it too much.
>
> On Sun, Sep 16, 2012 at 6:16 PM, Stack <[email protected]> wrote:
>
>> On Sat, Sep 15, 2012 at 8:09 AM, Rita <[email protected]> wrote:
>> > I am debating if a lookup table would help my situation.
>> >
>> > I have a bunch of codes which map with timestamp (unsigned int). The
>> codes
>> > look like this
>> >
>> > AA4
>> > AAA5
>> > A21
>> > A4
>> > ...
>> > Z435
>> >
>> > The sizes range from 1 to 4 characters (1 to 4 bytes,
>> > respectively).
>> >
>> >
>> > Would adding a lookup table for all my codes help in reducing space? If
>> so,
>> > what would be the best way to hash something like this?
>> >
>>
>> You are trying to save on disk space?  You could make your keys binary
>> (four bytes max, null-prefixed if < 4 characters)?  Why are you trying to
>> save disk space?  You want a lookup table so you can have a code that
>> is smaller than the 1-4 character codes?
>>
>> St.Ack
>>
>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>
