Btw, it seems that Hive on AWS has a cool feature that recovers all partitions from sub-folders named (col=value) under the table folder in S3, without explicitly specifying "add partition" for each one ...
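
For plain Hive on hdfs, where each partition still has to be added explicitly, here is a rough sketch of how the "alter table add partition" statements could be generated from the sub-folder names instead of typed by hand. gen_add_partitions is a made-up helper name, not something Hive ships. It assumes a single partition column, folder names of the form col=value, and values without quotes in them.

```shell
#!/bin/sh
# Sketch only: print one "alter table ... add partition" statement per
# partition sub-folder. Paths are read from stdin, one per line.
# gen_add_partitions is a hypothetical helper, not part of Hive.
gen_add_partitions() {
    table="$1"
    while IFS= read -r dir; do
        name="${dir##*/}"        # keep only the last path component
        case "$name" in
            *=*) ;;              # looks like col=value, keep it
            *) continue ;;       # skip anything else
        esac
        col="${name%%=*}"        # text before the first '='
        val="${name#*=}"         # text after the first '='
        printf "alter table %s add partition (%s='%s');\n" "$table" "$col" "$val"
    done
}

# Example with the folder from this thread:
printf '%s\n' /user/usera/data/hivetables/order/dt=2011-03-01 |
    gen_add_partitions order_external
# prints: alter table order_external add partition (dt='2011-03-01');
```

The input could come from something like "hadoop fs -ls /user/usera/data/hivetables/order" with everything but the last column stripped off. The generated statements can be saved to a file and run with "hive -f". Not tested against a real cluster, so treat it as a starting point.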

On Wed, Mar 23, 2011 at 3:03 PM, Michael Jiang <[email protected]> wrote:

> solved.
>
> uh, thought that hive will by default look into the table folder in hdfs
> and match sub-folders with partition column names to recognize partitions
> automatically. But realized partition addition has to be done explicitly by
> giving partition name and location. So, by doing "alter table add partition
> (column=value)" solved this (no need to give location since "column=value"
> is a subfolder under table folder in hdfs ;) ...
>
> On Wed, Mar 23, 2011 at 12:41 PM, Michael Jiang <[email protected]> wrote:
>
>> Met a problem that data in an external table didn't get read by hive.
>>
>> Here's how the table was created and data loaded.
>>
>> - Created an external table w/ a partition, pointing to an existing
>>   location in hdfs as follows:
>>
>>   create external table order_external (item string, quantity int)
>>   partitioned by (dt string) row format delimited fields terminated by '\t'
>>   stored as textfile location '/user/usera/data/hivetables/order';
>>
>> - Data from a local file system copied to hdfs
>>
>>   Have 2 data files in local file system
>>
>>   order.2011-03-01.01, which contains 2 entries
>>   order.2011-03-01.02, which contains 1 entry
>>
>>   cd to data file folder
>>   hadoop fs -copyFromLocal order.* /user/usera/data/hivetables/order/dt=2011-03-01
>>
>>   verify data is there
>>   hadoop fs -cat /user/usera/data/hivetables/order/dt=2011-03-01/*
>>   returns 3 entries =>
>>   android 2
>>   iphone 3
>>   ipad 1
>>
>> - Now, query all items in partition dt='2011-03-01'
>>
>>   select * from order_external o where o.dt='2011-03-01';
>>
>>   this does not show any entry nor did "select * from order_external".
>>
>> I also played with an external table created similar to above, the same
>> location (w/o 'dt=...' folder of course) and data used, the same schema and
>> table name, etc., except that the only difference is this external table is
>> created without a partition. Querying the table shows correct results.
>> Didn't have this problem w/ "internal" table that has partitions.
>>
>> So, what is wrong or missing? Any idea?
>>
>> Thanks!
>> --mj
