> Hive provides the ability to provide custom patterns for partitions. You
> can use this in combination with MSCK REPAIR TABLE to automatically detect
> and load the partitions into the metastore.

I tried this yesterday, and as far as I can tell it doesn’t work with a custom 
partition layout — at least not with external tables.  MSCK REPAIR TABLE 
reports that there are directories in the table’s location that are not 
partitions of the table, but it won’t actually add those partitions unless the 
directory layout matches Hive’s default (key1=value1/key2=value2, etc.)
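Since MSCK REPAIR TABLE skips non-default layouts, the workaround I ended up with is scripting the ALTER TABLE ... ADD PARTITION statements myself, e.g. from the output of hadoop fs -ls (as Yang alluded to).  A minimal sketch — the table name, partition keys, and relative paths below are just illustrative, and it assumes the YYYY/MM/DD directories sit directly under the table’s location:

```python
def add_partition_statements(table, rel_dirs, keys=("year", "month", "day")):
    """Turn Camus-style relative directories (e.g. "2015/03/09") into
    ALTER TABLE ... ADD PARTITION statements that Hive will accept,
    since MSCK REPAIR TABLE won't pick up this layout on its own."""
    stmts = []
    for d in rel_dirs:
        parts = d.strip("/").split("/")  # "2015/03/09" -> ["2015", "03", "09"]
        spec = ", ".join("%s='%s'" % (k, v) for k, v in zip(keys, parts))
        stmts.append(
            "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (%s) LOCATION '%s'"
            % (table, spec, d)
        )
    return stmts
```

You’d feed the resulting statements to hive -e (or beeline); IF NOT EXISTS makes it safe to re-run after each Camus job.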

> On Mar 9, 2015, at 17:16, Pradeep Gollakota <pradeep...@gmail.com> wrote:
> 
> If I understood your question correctly, you want to be able to read the
> output of Camus in Hive and be able to know partition values. If my
> understanding is right, you can do so by using the following.
> 
> Hive provides the ability to provide custom patterns for partitions. You
> can use this in combination with MSCK REPAIR TABLE to automatically detect
> and load the partitions into the metastore.
> 
> Take a look at this SO
> http://stackoverflow.com/questions/24289571/hive-0-13-external-table-dynamic-partitioning-custom-pattern
> 
> Does that help?
> 
> 
> On Mon, Mar 9, 2015 at 1:42 PM, Yang <teddyyyy...@gmail.com> wrote:
> 
>> I believe many users like us would export the output from camus as a hive
>> external table. but the dir structure of camus is like
>> /YYYY/MM/DD/xxxxxx
>> 
>> while hive generally expects /year=YYYY/month=MM/day=DD/xxxxxx if you
>> define that table to be
>> partitioned by (year, month, day). otherwise you'd have to add those
>> partitions created by camus through a separate command. but in the latter
>> case, would a camus job create >1 partitions ? how would we find out the
>> YYYY/MM/DD values from outside ? ---- well you could always do something by
>> hadoop dfs -ls and then grep the output, but it's kind of not clean....
>> 
>> 
>> thanks
>> yang
>> 
