Thanks, Edward. I can probably create the partitions for all previous days ahead of time and then use Dean's logic to create new partitions on a daily basis. I will probably end up with a few empty partitions; I need to make sure that does not cause any confusion.
Thanks,
Sadu

On Tue, Jan 29, 2013 at 7:21 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> You can also just create all your partitions ahead of time. They will not
> do any harm if empty. (unless you have an older version and hit this...
> http://issues.apache.org/jira/browse/HIVE-1007 )
>
>
> On Tue, Jan 29, 2013 at 8:17 PM, Mark Grover
> <grover.markgro...@gmail.com> wrote:
>
>> Hi Sadananda,
>> Sorry to hear that.
>>
>> It got committed, don't worry about the "ABORTED". Here is the commit on
>> the trunk:
>>
>> https://github.com/apache/hive/commit/523f47c3b6e7cb7b6b7b7801c66406e116af6dbc
>>
>> However, there is no Apache Hive release with that patch in it.
>>
>> You have two options:
>> 1. Download the patch, rebuild hive and use it
>> 2. Find a hacky way to recover your partitions when they are empty and
>>    populate them later.
>>
>> Sorry for the inconvenience.
>>
>> Mark
>>
>> On Tue, Jan 29, 2013 at 5:09 PM, Sadananda Hegde <saduhe...@gmail.com> wrote:
>>
>>> Thanks Mark,
>>>
>>> Recover partition feature will satisfy my needs; but MSCK Repair
>>> Partition <tablename> option is not working for me. It does not give any
>>> error; but does not add any partitions either. It looks like it adds
>>> partitions only when the sub-folder is empty; but not
>>> when the sub-folder has the data files. I see a fix to this issue here.
>>>
>>> https://issues.apache.org/jira/browse/HIVE-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>>
>>> But probably it's not committed yet, since the final result says
>>> "ABORTED".
>>>
>>> Thanks,
>>> Sadu
>>>
>>> On Mon, Jan 28, 2013 at 10:47 PM, Mark Grover
>>> <grover.markgro...@gmail.com> wrote:
>>>
>>>> Sadananda,
>>>> See if this helps:
>>>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions
>>>>
>>>>
>>>> On Mon, Jan 28, 2013 at 8:05 PM, Sadananda Hegde
>>>> <saduhe...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> My hive table is partitioned by year, month and day. I have defined it
>>>>> as external table. The M/R job correctly loads the files into the daily
>>>>> subfolders. The hdfs files will be loaded to
>>>>> <hivetable>/year=yyyy/month=mm/day=dd/ folders by the scheduled M/R jobs.
>>>>> The M/R job has some business logic in determining the values for year,
>>>>> month and day; so one run might create / load files into multiple
>>>>> sub-folders (multiple days). I am able to query the tables after adding
>>>>> partitions using ALTER TABLE ADD PARTITION statement. But how do I
>>>>> automate the partition creation step? Basically this script needs to
>>>>> identify the subfolders created by the M/R job and create corresponding
>>>>> ALTER TABLE ADD PARTITION statements.
>>>>>
>>>>> For example, say the M/R job loads files into the following 3
>>>>> sub-folders
>>>>>
>>>>> /user/hive/warehouse/sales/year=2013/month=1/day=21
>>>>> /user/hive/warehouse/sales/year=2013/month=1/day=22
>>>>> /user/hive/warehouse/sales/year=2013/month=1/day=23
>>>>>
>>>>> Then it should create 3 alter table statements
>>>>>
>>>>> ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=21);
>>>>> ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=22);
>>>>> ALTER TABLE sales ADD PARTITION (year=2013, month=1, day=23);
>>>>>
>>>>> I thought of changing M/R jobs to load all files into same folder,
>>>>> then first load the files into non-partitioned table and then to load the
>>>>> partitioned table from non-partitioned table (using dynamic partition);
>>>>> but would prefer to avoid that extra step if possible (esp. since data is
>>>>> already in the correct sub-folders).
>>>>>
>>>>> Any help would be greatly appreciated.
>>>>>
>>>>> Regards,
>>>>> Sadu
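
The automation asked about in the original question (turning the sub-folders written by the M/R job into ALTER TABLE ADD PARTITION statements) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the logic anyone in this thread actually used: the function names `partition_spec` and `add_partition_statements` are hypothetical, the list of partition directories is assumed to be collected elsewhere (for example from an HDFS directory listing), all-digit values are assumed to belong to numeric partition columns and are left unquoted, and `IF NOT EXISTS` is added so that re-running the script skips partitions that already exist.

```python
import re

# Matches one Hive-style partition path component such as "year=2013".
_PART = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.+)$")


def partition_spec(path, table_root):
    """Parse a partition directory path into [(column, value), ...].

    Hypothetical helper: `table_root` is the table's base directory,
    e.g. /user/hive/warehouse/sales.
    """
    root = table_root.rstrip("/")
    if not path.startswith(root + "/"):
        raise ValueError(f"{path!r} is not under {root!r}")
    relative = path[len(root) + 1:].strip("/")
    spec = []
    for component in relative.split("/"):
        match = _PART.match(component)
        if match is None:
            raise ValueError(f"not a partition directory: {component!r}")
        spec.append((match.group(1), match.group(2)))
    return spec


def add_partition_statements(table, table_root, paths):
    """Build one ALTER TABLE statement per distinct partition directory."""
    statements = []
    for path in sorted(set(paths)):
        pairs = []
        for column, value in partition_spec(path, table_root):
            # Assumption: all-digit values are numeric columns (unquoted);
            # anything else is emitted as a quoted string literal.
            literal = value if value.isdigit() else f"'{value}'"
            pairs.append(f"{column}={literal}")
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({', '.join(pairs)});"
        )
    return statements


if __name__ == "__main__":
    # The three sub-folders from the example in the original question.
    folders = [
        "/user/hive/warehouse/sales/year=2013/month=1/day=21",
        "/user/hive/warehouse/sales/year=2013/month=1/day=22",
        "/user/hive/warehouse/sales/year=2013/month=1/day=23",
    ]
    for statement in add_partition_statements(
        "sales", "/user/hive/warehouse/sales", folders
    ):
        print(statement)
```

The printed statements can be saved to a file and run with `hive -f <file>` after each M/R run. Because no LOCATION clause is emitted, this relies on the sub-folders following the table's default `column=value` directory layout, as they do in the example above.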