Hello,

Thank you for the replies.

I have not used Pig yet, but I am looking into it. I would like to implement both approaches. Are Pig scripts maintainable? I ask because the JSON structure I receive changes quite often, almost 3 times a month.

I will be processing 24 million JSON documents per month. I am getting one big file with almost 3 million JSON documents aggregated, one JSON document per line. I need to process this file and store all the values in HBase.

Thanking you,

On Thu, Feb 7, 2013 at 12:59 PM, Mohammad Tariq <[email protected]> wrote:

> Good point sir. If Pig fits into Panshul's requirements then it's a much
> better option.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Thu, Feb 7, 2013 at 5:25 PM, Damien Hardy <[email protected]>
> wrote:
>
> > Hello,
> > Why not using a PIG script for that ?
> > make the json file available on HDFS
> > Load with
> >
> > http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/JsonLoader.html
> > Store with
> >
> > http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html
> >
> > http://pig.apache.org/docs/r0.10.0/
> >
> > Cheers,
> >
> > --
> > Damien
> >
>

--
Regards,
Ouch Whisper
010101010101
