I just posted this query on the pig-user list. Since it concerns both Pig and HBase, I am posting it here too.
Thanks!
Nikhil Gupta
Graduate Student, Stanford University

---------- Forwarded message ----------
From: Nikhil Gupta <[email protected]>
Date: Wed, Aug 19, 2009 at 12:49 PM
Subject: Storing Pig output into HBase tables
To: [email protected]

Hi all,

I am working on building an analytics engine that takes daily server logs, crunches the data using Pig scripts, and (for now) writes the output to HDFS. Later, this data is to be stored in HBase to enable efficient querying from the front end.

Currently, I am searching for efficient ways of moving the Pig output from HDFS into HBase tables. Though this seems to be a very basic task, I could not find any easy way of doing it, except for writing some Java code. The problem is that I'll have many different kinds of output formats, and writing Java code to load each such file seems wrong.

Probably I am missing something. Is there any way of storing Pig output directly in an HBase table? (Loading is possible with HBaseStorage, but that doesn't cover storing.) Or is there any general data load/import tool for HBase?

Thanks!
Nikhil Gupta
Graduate Student, Stanford University
