In Hive, when data lands in a partitioned table's directory directly (e.g. written by Flume rather than by a Hive LOAD/INSERT), you still need to register that partition's metadata with the Hive metastore before queries can see it.
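A minimal sketch of registering the partition, using the `targeting` table and the `epoch=123445` folder mentioned in the thread (adjust the epoch value and path for each partition Flume creates):

```sql
-- Register one partition that Flume wrote directly to HDFS:
ALTER TABLE targeting ADD IF NOT EXISTS PARTITION (epoch=123445)
LOCATION 'maprfs:///hive/cwang49.db/targeting/epoch=123445';

-- Or have Hive scan the table directory and register every
-- partition folder the metastore does not yet know about:
MSCK REPAIR TABLE targeting;
```

After either statement, `SHOW PARTITIONS targeting;` should list the partition and queries against it should return the data.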
So you can just add those partitions and it should work fine.

On Wed, Jan 15, 2014 at 1:15 AM, Chen Wang <[email protected]> wrote:

> Hey guys,
> I am using Flume to sink data directly into my Hive table. However, there
> seems to be some schema inconsistency, and I am not sure how to
> troubleshoot it.
>
> I created a Hive table 'targeting', using sequence files with Snappy
> compression, partitioned by 'epoch'. After the table was created, I could
> see a folder called 'targeting' under my database folder:
> /hive/cwang49.db/targeting
>
> I then used Flume to flow my log data into this folder directly; the
> Flume configuration is:
>
> sinks.HDFS.type = hdfs
> sinks.HDFS.hdfs.path = maprfs:///hive/cwang49.db/targeting/epoch=%{epoch}
> sinks.HDFS.hdfs.fileType = SequenceFile
> sinks.HDFS.hdfs.codeC = snappy
>
> When I run the Flume node, I can see the folder epoch=123445 created, and
> there are files under it as well. However, when I run a Hive query against
> the table, it returns empty.
>
> I think this might be caused by some schema discrepancy? Do I still need
> to load partition metadata into Hive before I can see the partition? (I
> recall doing this for external tables.) How can I troubleshoot this?
>
> Thanks a bunch!
> Chen

--
Nitin Pawar
