Hi all,
  I am working on an M/R program to convert Zebra data to Hive RCFile
format.

The TableInputFormat (Zebra) returns keys and values in the form of
BytesWritable and (Pig) Tuple.

In order to convert it to RCFileOutputFormat, whose key is
"BytesWritable" and value is "BytesRefArrayWritable", I need to take a
Pig Tuple, iterate over each of its fields, and convert each one to
"BytesRefWritable".

The easy part is Strings, which can be converted to BytesRefWritable
as:

BytesRefArrayWritable myvalue = new BytesRefArrayWritable(10);
// value is a Pig Tuple and get(0) returns a String
String s = (String) value.get(0);
myvalue.set(0, new BytesRefWritable(s.getBytes("UTF-8")));



How do I do it for a Java "Long", "HashMap", and arrays?

// value is a Pig Tuple and get(1) returns a Long
Long l = (Long) value.get(1);
// Long has no getBytes() method, so the best I can think of is:
myvalue.set(1, new BytesRefWritable(l.toString().getBytes("UTF-8")));
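For reference, here is a minimal, self-contained sketch of just the byte conversion for the Long case (no Hive classes; it assumes ColumnarSerDe is used with the default LazySimpleSerDe-style text encoding, where a bigint column is simply its decimal string in UTF-8 -- please correct me if that assumption is wrong):

```java
import java.nio.charset.StandardCharsets;

public class LongColumn {
    // Assumption: with the default text encoding, a bigint column's
    // bytes are just the decimal string in UTF-8, so toString() is
    // the natural route.
    static byte[] longColumnBytes(Long l) {
        return l.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] b = longColumnBytes(12345L);
        // These bytes would then be wrapped, e.g.:
        // myvalue.set(1, new BytesRefWritable(b));
        System.out.println(new String(b, StandardCharsets.UTF_8));
    }
}
```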


HashMap<String, Object> hm =
    new HashMap<String, Object>((HashMap) value.get(2));
myvalue.set(2, new BytesRefWritable(hm.toString().getBytes("UTF-8")));
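My worry is that HashMap.toString() produces "{a=1, b=2}", which I doubt the SerDe will parse back as a map. My current (untested) understanding is that, with LazySimpleSerDe's default delimiters, a map column is encoded with '\002' between entries and '\003' between each key and its value -- roughly:

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapColumn {
    // Assumed default LazySimpleSerDe delimiters: '\002' between map
    // entries, '\003' between each key and its value.
    static byte[] mapColumnBytes(Map<String, ?> m) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, ?> e : m.entrySet()) {
            if (!first) sb.append('\002');
            sb.append(e.getKey()).append('\003').append(e.getValue());
            first = false;
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, Object> hm = new LinkedHashMap<String, Object>();
        hm.put("a", 1);
        hm.put("b", 2);
        String s = new String(mapColumnBytes(hm), StandardCharsets.UTF_8);
        // Substitute the control characters so the demo is readable:
        System.out.println(s.replace('\002', ',').replace('\003', ':'));
    }
}
```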


Would the toString() method work? If I later re-read the RC format
through "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe", would
it interpret the values correctly?
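Similarly, for the array case my guess (again untested, assuming the default '\002' collection delimiter) is a delimited join of the elements rather than toString():

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ArrayColumn {
    // Assumed default delimiter: '\002' between list/array elements.
    static byte[] arrayColumnBytes(List<?> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) sb.append('\002');
            sb.append(items.get(i));
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<String> v = Arrays.asList("x", "y", "z");
        String s = new String(arrayColumnBytes(v), StandardCharsets.UTF_8);
        // Substitute the control character so the demo is readable:
        System.out.println(s.replace('\002', ','));
    }
}
```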

Is there any documentation for the same?

Any suggestions would be appreciated.

Viraj
