One more thing: a timestamp column usually has very high cardinality, so it is not a good choice of partition key. Every distinct value becomes its own partition, and you end up with far too many of them.
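To illustrate, here is a rough sketch of the same table partitioned by day instead of by the raw timestamp. It assumes the timestamps can be truncated to a date. The names `dt` and `event_time` and the date value are hypothetical and do not come from the original table. I dropped the ROW FORMAT DELIMITED clause because I would not expect it to matter for ORC storage. The file Spark writes would need to match the non-partition columns.

```sql
-- Sketch only: `dt` and `event_time` are hypothetical names used for illustration.
CREATE TABLE IF NOT EXISTS user_tag (
  rowkey     STRING,
  cate1      STRING,
  cate2      STRING,
  cate3      STRING,
  cate4      STRING,
  event_time STRING            -- full timestamp kept as an ordinary column
)
PARTITIONED BY (dt STRING)     -- one partition per day, e.g. '2016-12-17'
STORED AS ORC
LOCATION '/user/hive/warehouse/kylinlabel.db/user_tag';

-- After placing the ORC file under .../user_tag/dt=2016-12-17/,
-- register that directory as a partition so Hive can see the data.
ALTER TABLE user_tag ADD IF NOT EXISTS PARTITION (dt='2016-12-17')
LOCATION '/user/hive/warehouse/kylinlabel.db/user_tag/dt=2016-12-17';
```

If many partition directories already exist in that layout, running `MSCK REPAIR TABLE user_tag;` should pick them up in one go instead of adding them one by one.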
2016-12-17 23:34 GMT+09:00 Elliot West <tea...@gmail.com>:

> It looks as though your table is partitioned yet perhaps you haven't
> accounted for this when adding the data? Firstly it is good practice (and
> sometimes essential) to put the data into a partition folder of the form
> "timestamp='<partition value>'". You may then need to add the partition
> depending on how you are creating it. IIRC the Spark DataFrame/DataSet APIs
> have good support for adding partitions to existing Hive tables although
> there was a bug that prevented the creation of new partitioned tables when
> I looked some time ago. If you are manually managing the partitions you may
> need to issue an ADD PARTITION command using the Hive CLI:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddPartitions
>
> On Sat, 17 Dec 2016 at 08:07, 446463...@qq.com <446463...@qq.com> wrote:
>
>> Hi All:
>> I create a orc table in hive
>>
>> create table if not exists user_tag (
>> rowkey STRING ,
>> cate1 STRING ,
>> cate2 STRING ,
>> cate3 STRING ,
>> cate4 STRING
>> )
>> PARTITIONED BY (timestamp STRING)
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>> STORED AS orc
>> LOCATION '/user/hive/warehouse/kylinlabel.db/user_tag';
>>
>> and I generate a orc file in spark and I put this file into path
>> /user/hive/warehouse/kylinlabel.db/user_tag
>> /user/hive/warehouse/kylinlabel.db/user_tag/part-r-00000-920282f9-4d68-4af8-81c5-69522df3d374.orc
>> this is the file path.
>> I find there is no data in user_tag table
>> Why?
>>
>> ------------------------------
>>
>> 446463...@qq.com