So there is one thing to be really careful about with bucketing. Say you bucket a table into 10 buckets: a select with a where clause does not actually prune the input buckets, so many queries still scan all 10 buckets.
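To make the point concrete, here is a rough toy model in Python. It is only a sketch: the hash function, the row layout, and the names `bucket_for`, `select_without_pruning`, and `select_with_pruning` are all made up for illustration and are not Hive's actual implementation. It just contrasts reading every bucket and filtering afterwards with reading only the bucket a key hashes to.

```python
# Toy model of a bucketed table. This is NOT Hive's real hash or file layout;
# every name here is hypothetical and exists only to illustrate the idea.
import zlib

NUM_BUCKETS = 10


def bucket_for(source: str) -> int:
    """Map a key to a bucket number (stand-in hash, not the one Hive uses)."""
    return zlib.crc32(source.encode("utf-8")) % NUM_BUCKETS


def build_buckets(rows):
    """Distribute rows into NUM_BUCKETS lists based on their 'source' key."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for row in rows:
        buckets[bucket_for(row["source"])].append(row)
    return buckets


def select_without_pruning(buckets, source):
    """Read every bucket, then filter: the behaviour warned about above."""
    scanned = 0
    matches = []
    for bucket in buckets:
        scanned += 1
        matches.extend(r for r in bucket if r["source"] == source)
    return matches, scanned


def select_with_pruning(buckets, source):
    """Read only the bucket the key hashes to: what one might hope for."""
    bucket = buckets[bucket_for(source)]
    return [r for r in bucket if r["source"] == source], 1


rows = [{"source": f"src{i % 6}", "value": i} for i in range(60)]
buckets = build_buckets(rows)

unpruned, scanned_all = select_without_pruning(buckets, "src3")
pruned, scanned_one = select_with_pruning(buckets, "src3")

print(f"without pruning: {len(unpruned)} rows from {scanned_all} buckets")
print(f"with pruning:    {len(pruned)} rows from {scanned_one} bucket")
```

Both functions return the same rows; the difference is only in how many buckets get read to find them, which is the cost to watch for.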
On Sat, Aug 10, 2013 at 12:34 PM, Nitin Pawar <[email protected]> wrote:

> will bucketing help? if you know finite # partiotions ?
>
>
> On Sat, Aug 10, 2013 at 9:26 PM, John Omernik <[email protected]> wrote:
>
>> I have a table that currently uses RC files and has two levels of
>> partitions. day and source. The table is first partitioned by day, then
>> within each day there are 6-15 source partitions. This makes for a lot of
>> crazy partitions and was wondering if there'd be a way to optimize this
>> with ORC files and some sorting.
>>
>> Specifically, would there be a way in a new table to make source a field
>> (removing the partition)and somehow, as I am inserting into this new setup
>> sort by source in such a way that will help separate the files/indexes in a
>> way that gives me almost the same performance as ORC with the two level
>> partitions? Just trying to optimize here and curious what people think.
>>
>> John
>>
>
>
> --
> Nitin Pawar
>
