Hi Viraj, I recommend using Hive's ColumnarSerDe/LazySerDe code to serialize and deserialize the data. That way you can avoid writing your own serialization/deserialization logic.
Basically, for primitives it is easy to serialize and deserialize. But for complex types, you need to use separators.

Thanks,
Yongqiang

On 6/8/10 10:50 AM, "Viraj Bhat" <[email protected]> wrote:
> Hi all,
> I am working on an M/R program to convert Zebra data to Hive RC
> format.
>
> The TableInputFormat (Zebra) returns keys and values in the form of
> BytesWritable and (Pig) Tuple.
>
> In order to convert it to the RCFileOutputFormat, whose key is
> "BytesWritable" and value is "BytesRefArrayWritable", I need to take in a
> Pig Tuple, iterate over each of its contents, and convert each one to a
> "BytesRefWritable".
>
> The easy part is Strings, which can be converted to BytesRefWritable
> as:
>
> myvalue = new BytesRefArrayWritable(10);
> // value is a Pig Tuple and get returns a String
> String s = (String)value.get(0);
> myvalue.set(0, new BytesRefWritable(s.getBytes("UTF-8")));
>
> How do I do it for java "Long", "HashMap" and "Arrays"?
>
> // value is a Pig tuple
> Long l = new Long((Long)value.get(1));
> myvalue.set(iter, new BytesRefWritable(l.toString().getBytes("UTF-8")));
> myvalue.set(1, new BytesRefWritable(l.getBytes("UTF-8")));
>
> HashMap<String, Object> hm = new
> HashMap<String,Object>((HashMap)value.get(2));
> myvalue.set(iter, new
> BytesRefWritable(hm.toString().getBytes("UTF-8")));
>
> Would the toString() method work? If I need to re-read the RC format back
> through "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe", would
> it interpret the data correctly?
>
> Is there any documentation for the same?
>
> Any suggestions would be beneficial.
>
> Viraj
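To make the separator point concrete: below is a minimal, hedged sketch of what "use separators" could look like, assuming Hive's default text-style separators (\002 between collection elements, \003 between a map key and its value). The class name `HiveTextEncode` and the helper methods are hypothetical illustrations, not part of Hive's API; each method just produces the UTF-8 bytes you would then wrap in a BytesRefWritable, as in the String example above. (Note that `toString()` on a HashMap will NOT produce this layout, which is why ColumnarSerDe would not read it back correctly.)

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Sketch only: encode a Long, a List, and a Map into the byte layout that
// Hive's default lazy text serialization expects for a single column.
// Assumed separators: \002 (^B) between collection items, \003 (^C)
// between a map key and its value.
public class HiveTextEncode {
    static final byte COLLECTION_SEP = 2; // ^B
    static final byte MAP_KV_SEP = 3;     // ^C

    // Primitives: the text form of the value is enough.
    static byte[] encodeLong(Long l) {
        return l.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Lists/arrays: elements joined by the collection separator.
    static byte[] encodeList(List<?> list) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean first = true;
        for (Object item : list) {
            if (!first) out.write(COLLECTION_SEP);
            out.write(item.toString().getBytes(StandardCharsets.UTF_8));
            first = false;
        }
        return out.toByteArray();
    }

    // Maps: key \003 value pairs, joined by the collection separator.
    static byte[] encodeMap(Map<?, ?> map) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean first = true;
        for (Map.Entry<?, ?> e : map.entrySet()) {
            if (!first) out.write(COLLECTION_SEP);
            out.write(e.getKey().toString().getBytes(StandardCharsets.UTF_8));
            out.write(MAP_KV_SEP);
            out.write(e.getValue().toString().getBytes(StandardCharsets.UTF_8));
            first = false;
        }
        return out.toByteArray();
    }
}
```

Usage would mirror the String case, e.g. `myvalue.set(1, new BytesRefWritable(HiveTextEncode.encodeLong(l)))`. The safer route, as suggested above, is to let Hive's own serde code (ColumnarSerDe with the lazy serialization classes) produce these bytes rather than hand-rolling them, since it also handles nesting and escaping.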
