Can you give some examples of how to alter partitions for different input types? I'd appreciate it :)
On Fri, Jul 26, 2013 at 3:29 PM, Alan Gates <ga...@hortonworks.com> wrote:

> A table can definitely have partitions with different input
> formats/serdes. We test this all the time.
>
> Assuming your old data doesn't stay forever and most of your queries are
> on more recent data (which is usually the case) I'd advise you to not
> reprocess any data, just alter the table to store new partitions in ORC.
> Then with time you'll slowly transition the table to ORC. This avoids all
> the issues you noted. And since most queries probably only access recent
> data you'll see speed ups soon after the switch.
>
> Alan.
>
> On Jul 25, 2013, at 4:45 PM, John Omernik wrote:
>
> > Just finishing up testing with Hive 11 and ORC. Thank you to Owen and
> > all those who have put hard work into this. Just ORC files, when compared
> > to RC files in Hive 9, 10, and 11 saw a huge increase in performance, it
> > was amazing. That said, now we gotta reprocess.
> >
> > We have a large table with lots of partitions. I'd love to be able to
> > reprocess into a new table, like table_orc, and then at the end of it all,
> > just drop the original table. That said, I see it being hard to do from a
> > space perspective, and I will have to do a partition at a time. But then
> > there's production issues: if I update a partition, insert overwrite into
> > the ORC table, then I have to delete the original and production users
> > will be missing data.... decisions decisions.
> >
> > So any ideas? Can a table have some partitions in one file type and
> > other partitions in another? That sounds scary. Anywho, a good problem to
> > have... that performance will be worth it.
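
For reference, Alan's suggestion above (leave the old partitions alone and alter the table so that new partitions are written as ORC) might look roughly like the HiveQL below. This is only a sketch, not something taken from the thread: the table `events`, its partition column `dt`, the columns `user_id` and `action`, and the source table `events_staging` are all made-up names, and it assumes the existing partitions are RCFile on Hive 0.11 or later.

```sql
-- Hypothetical table `events`, partitioned by a string column `dt`.
-- Existing partitions are assumed to be stored as RCFile.

-- 1. Change the table-level file format. This only updates metadata;
--    partitions that already exist keep their own storage format.
ALTER TABLE events SET FILEFORMAT ORC;

-- 2. Partitions created from now on pick up the table's current format,
--    so this new partition should be written as ORC.
INSERT OVERWRITE TABLE events PARTITION (dt = '2013-07-27')
SELECT user_id, action        -- hypothetical columns
FROM events_staging           -- hypothetical source table
WHERE dt = '2013-07-27';

-- 3. Inspect the InputFormat/OutputFormat/SerDe recorded per partition
--    to confirm that old and new partitions differ.
DESCRIBE FORMATTED events PARTITION (dt = '2013-07-01');
DESCRIBE FORMATTED events PARTITION (dt = '2013-07-27');
```

The same `SET FILEFORMAT` clause can also be applied to a single partition with `ALTER TABLE events PARTITION (dt = '...') SET FILEFORMAT ORC;`, but as far as I know that too only changes metadata and does not rewrite the files already in the partition, so it is probably only safe on a partition whose data is about to be rewritten in that format.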