btw, it seems that aws hive has a cool feature to recover all partitions
from subfolders by name of (col=value) under table folder in S3 without
explicitly specifying add partition for each ...

On Wed, Mar 23, 2011 at 3:03 PM, Michael Jiang <[email protected]> wrote:

> solved.
>
> uh, thought that hive will by default look into the table folder in hdfs
> and match sub-folders with partition column names to recognize partitions
> automatically. But realized partition addition has to be done explicitly by
> giving partition name and location. So, by doing "alter table add partition
> (column=value)" solved this (no need to give location since "column=value"
> is a subfolder under table folder in hdfs ;) ...
>
> On Wed, Mar 23, 2011 at 12:41 PM, Michael Jiang <[email protected]>wrote:
>
>> Met a problem that data in an external table didn't get read by hive.
>>
>> Here's how the table was created and data loaded.
>>
>> - Created an external table w/ a partition, pointing to an existing
>> location in hdfs as follows :
>>
>> create external table order_external (item string, quantity int)
>> partitioned by (dt string) row format delimited fields terminated by '\t'
>> stored as textfile location '/user/usera/data/hivetables/order';
>>
>> - Data from a local file system copied to hdfs
>>
>> Have 2 data files in local file system
>>
>> order.2011-03-01.01, which contains 2 entries
>> order.2011-03-01.02, which contains 1 entry
>>
>> cd to data file folder
>> hadoop fs -copyFromLocal order.*
>> /user/usera/data/hivetables/order/dt=2011-03-01
>>
>> verify data is there
>> hadoop fs -cat /user/usera/data/hivetables/order/dt=2011-03-01/*
>> returns 3 entries =>
>> android    2
>> iphone    3
>> ipad    1
>>
>> - Now, query all items in partition dt='2011-03-01'
>>
>> select * from order_external o where o.dt='2011-03-01';
>>
>> this does not show any entry nor did "select * from order_external".
>>
>> I also played with an external table created similar to above, the same
>> location (w/o 'dt=...' folder ofcourse) and data used, the same schema and
>> table name, etc., except that the only difference is this external table is
>> created without a partition. Querying the table shows correct results.
>> Didn't have this problem w/ "internal" table that has partitions.
>>
>> So, what is wrong or missing? Any idea?
>>
>> Thanks!
>> --mj
>>
>
>

Reply via email to