Sadananda, See if this helps: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions
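The linked "Recover partitions" section describes the MSCK REPAIR TABLE statement, which scans the table's location for partition directories that exist in HDFS but not yet in the metastore, and adds them in one shot (so no per-day ALTER TABLE statements are needed). A minimal sketch, using the `sales` table from your example:

```sql
-- Scans /user/hive/warehouse/sales for year=/month=/day= directories
-- and registers any partitions missing from the metastore.
MSCK REPAIR TABLE sales;
```

You can run this once after each M/R load, regardless of how many daily subfolders the job created.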
On Mon, Jan 28, 2013 at 8:05 PM, Sadananda Hegde <saduhe...@gmail.com> wrote:
> Hello,
>
> My hive table is partitioned by year, month and day. I have defined it as
> an external table. The M/R job correctly loads the files into the daily
> subfolders. The hdfs files will be loaded into
> <hivetable>/year=yyyy/month=mm/day=dd/ folders by the scheduled M/R jobs.
> The M/R job has some business logic for determining the values of year,
> month and day, so one run might create and load files into multiple
> sub-folders (multiple days). I am able to query the tables after adding
> partitions using the ALTER TABLE ADD PARTITION statement. But how do I
> automate the partition creation step? Basically, this script needs to
> identify the subfolders created by the M/R job and create the
> corresponding ALTER TABLE ADD PARTITION statements.
>
> For example, say the M/R job loads files into the following 3 sub-folders:
>
> /user/hive/warehouse/sales/year=2013/month=1/day=21
> /user/hive/warehouse/sales/year=2013/month=1/day=22
> /user/hive/warehouse/sales/year=2013/month=1/day=23
>
> Then it should create 3 alter table statements:
>
> ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=21);
> ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=22);
> ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=23);
>
> I thought of changing the M/R jobs to load all files into the same folder,
> then first loading the files into a non-partitioned table and loading the
> partitioned table from the non-partitioned table (using dynamic
> partitioning); but I would prefer to avoid that extra step if possible
> (esp. since the data is already in the correct sub-folders).
>
> Any help would be greatly appreciated.
>
> Regards,
> Sadu
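If you do want to generate the ALTER TABLE statements yourself rather than rely on the recover-partitions feature, the mapping from partition directories to DDL is easy to script. A minimal sketch in Python (the `partition_ddl` helper is hypothetical; in practice the directory list would come from something like `hadoop fs -ls -R` over the table's location, whose output you would feed to `hive -e` or a generated .hql file):

```python
import re

def partition_ddl(table, paths):
    """Turn key=value partition directories into ADD PARTITION statements."""
    stmts = []
    for path in paths:
        # Pull key=value pairs such as year=2013, month=1, day=21 from the path.
        pairs = re.findall(r"(\w+)=(\d+)", path)
        if pairs:
            spec = ", ".join(f"{k}={v}" for k, v in pairs)
            stmts.append(f"ALTER TABLE {table} ADD PARTITION ({spec});")
    return stmts

# Directories from the example above; in practice these would be listed
# from HDFS after each M/R load.
dirs = [
    "/user/hive/warehouse/sales/year=2013/month=1/day=21",
    "/user/hive/warehouse/sales/year=2013/month=1/day=22",
    "/user/hive/warehouse/sales/year=2013/month=1/day=23",
]
for stmt in partition_ddl("sales", dirs):
    print(stmt)
# → ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=21);
#   (and likewise for day=22 and day=23)
```

Since the M/R job already writes the correct directory layout, this avoids the intermediate non-partitioned table and the dynamic-partition load entirely.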