In Hive, when data lands in a partitioned table's directory directly (e.g. written by Flume rather than by a Hive LOAD/INSERT), you still need to register that partition's metadata with the Hive metastore before queries can see it.
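A minimal sketch of registering the partition, using the `targeting` table and the `epoch=123445` folder mentioned in the thread (adjust the epoch value and path for each partition Flume creates):

```sql
-- Register one partition that Flume wrote directly to HDFS:
ALTER TABLE targeting ADD IF NOT EXISTS PARTITION (epoch=123445)
LOCATION 'maprfs:///hive/cwang49.db/targeting/epoch=123445';

-- Or have Hive scan the table directory and register every
-- partition folder the metastore does not yet know about:
MSCK REPAIR TABLE targeting;
```

After either statement, `SHOW PARTITIONS targeting;` should list the partition and queries against it should return the data.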
So you can just add those partitions and it should work fine.

On Wed, Jan 15, 2014 at 1:15 AM, Chen Wang <[email protected]> wrote:

> Hey guys,
> I am using Flume to sink data directly into my Hive table. However, there
> seems to be some schema inconsistency, and I am not sure how to
> troubleshoot it.
>
> I created a Hive table 'targeting', using sequence files with Snappy
> compression, partitioned by 'epoch'. After the table was created, I could
> see a folder called 'targeting' under my database folder:
> /hive/cwang49.db/targeting
>
> I then used Flume to flow my log data into this folder directly; the
> Flume configuration is:
>
> sinks.HDFS.type = hdfs
> sinks.HDFS.hdfs.path = maprfs:///hive/cwang49.db/targeting/epoch=%{epoch}
> sinks.HDFS.hdfs.fileType = SequenceFile
> sinks.HDFS.hdfs.codeC = snappy
>
> When I run the Flume node, I can see the folder epoch=123445 created, and
> there are files under it as well. However, when I run a Hive query against
> the table, it returns empty.
>
> I think this might be caused by some schema discrepancy? Do I still need
> to load partition metadata into Hive before I can see the partition? (I
> recall doing this for external tables.) How can I troubleshoot this?
>
> Thanks a bunch!
> Chen

--
Nitin Pawar
