Hi All,
I'm attempting to load sequence files for the first time using Elephant
Bird's sequence file loader and having absolutely no luck.
I ran hadoop fs -text on one of the sequence files and noticed that all the
keys are (null). I'm not sure whether that is throwing things off here.
Here are the approaches I've tried so far, all of which have failed:
REGISTER '/opt/shared_storage/elephant-bird/build/elephant-bird-2.2.3-SNAPSHOT.jar';
%declare SEQFILE_LOADER 'com.twitter.elephantbird.pig.load.SequenceFileLoader';
%declare TEXT_CONVERTER 'com.twitter.elephantbird.pig.util.TextConverter';
%declare NULL_CONVERTER 'com.twitter.elephantbird.pig.util.NullWritableConverter'
raw_logs = LOAD '/logs/jive/internal/raw/2012/05/07/2012050795652.0627-720078349.seq'
    USING $SEQFILE_LOADER ('-c $NULL_CONVERTER', '-c $TEXT_CONVERTER')
    AS (key: bytearray, value: chararray);
--raw_logs = LOAD '/logs/jive/internal/raw/2012/05/07/2012050795652.0627-720078349.seq'
--    USING $SEQFILE_LOADER ('-c $TEXT_CONVERTER', '-c $TEXT_CONVERTER')
--    AS (key: chararray, value: chararray);
--raw_logs = LOAD '/logs/jive/internal/raw/2012/05/07/2012050795652.0627-720078349.seq'
--    USING $SEQFILE_LOADER ();
STORE raw_logs INTO '/data/SearchLogJSON/';
Any thoughts on what might be the problem? Anything else I should try? I'm
totally out of ideas.
Appreciate any pointers!
Chris