How about UTF-8 encoding your blob and storing it in Hive as a String?

On Tue, Oct 12, 2010 at 4:20 PM, Jinsong Hu <jinsong...@hotmail.com> wrote:
> I thought about that too, but then I need to write a bytes inspector
> and stick that into Hive's inspector factory. We would also need to
> create a new datatype, such as blob, in Hive's supported data types.
> Adding a new supported data type to Hive is a non-trivial task, as more
> code will need to be touched.
>
> I am just wondering if it is possible to get what I want to do without
> such a big change.
>
> Jimmy.
>
> --------------------------------------------------
> From: "Ted Yu" <yuzhih...@gmail.com>
> Sent: Tuesday, October 12, 2010 4:12 PM
> To: <dev@hive.apache.org>
> Subject: Re: blob handling in hive
>
> How about creating org.apache.hadoop.hive.serde2.io.BytesWritable which
> wraps byte[]?
>
>> On Tue, Oct 12, 2010 at 3:49 PM, Jinsong Hu <jinsong...@hotmail.com>
>> wrote:
>>
>>> Storing the blob in HBase is too costly. HBase compaction costs lots
>>> of CPU. All I want to do is to be able to read the byte array out of
>>> a sequence file, and map that byte array to a Hive column.
>>> I can write a SerDe for this purpose.
>>>
>>> I tried to define the data to be array<tinyint>. I then tried to
>>> write a custom SerDe. After I get the byte array off the disk, I
>>> need to map it, so I wrote the code:
>>>
>>> columnTypes =
>>>     TypeInfoUtils.getTypeInfosFromTypeString("int,string,array<tinyint>");
>>>
>>> But then how do I convert the data in the row.set() method?
>>>
>>> I tried this:
>>>
>>> byte[] bContent = ev.get_content() == null ? null :
>>>     (ev.get_content().getData() == null ? null :
>>>         ev.get_content().getData());
>>> org.apache.hadoop.hive.serde2.io.ByteWritable tContent =
>>>     bContent == null
>>>         ? new org.apache.hadoop.hive.serde2.io.ByteWritable()
>>>         : new org.apache.hadoop.hive.serde2.io.ByteWritable(bContent[0]);
>>> row.set(2, tContent);
>>>
>>> This works for a single byte, but doesn't work for a byte array.
>>> Any way that I can get the byte array returned in SQL is appreciated.
>>>
>>> Jimmy
>>>
>>> --------------------------------------------------
>>> From: "Ted Yu" <yuzhih...@gmail.com>
>>> Sent: Tuesday, October 12, 2010 2:19 PM
>>> To: <dev@hive.apache.org>
>>> Subject: Re: blob handling in hive
>>>
>>> One way is to store the blob in HBase and use HBaseHandler to access
>>> your blob.
>>>
>>>> On Tue, Oct 12, 2010 at 2:14 PM, Jinsong Hu <jinsong...@hotmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am using sqoop to export data from mysql to hive. I noticed that
>>>>> Hive doesn't have a blob data type yet. Is there any way I can get
>>>>> Hive to store a blob?
>>>>>
>>>>> Jimmy
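
A minimal sketch of the store-it-as-a-String idea from the top of this thread, with one change stated plainly: it uses Base64 rather than UTF-8. Arbitrary blob bytes are generally not valid UTF-8, so decoding them as UTF-8 replaces the invalid sequences and the original bytes cannot be recovered, whereas Base64 maps any byte sequence to ASCII text at the cost of roughly a third more space. The class name BlobText and its methods are hypothetical helpers, not part of Hive or Sqoop, and java.util.Base64 assumes Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper (not a Hive class): converts a blob to text that is
// safe to keep in a STRING column, and back again.
final class BlobText {
    private BlobText() {}

    // Base64 turns every possible byte sequence into plain ASCII text.
    static String encode(byte[] blob) {
        return Base64.getEncoder().encodeToString(blob);
    }

    // Reverses encode(); throws IllegalArgumentException on invalid input.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Sample blob containing bytes that are not valid UTF-8.
        byte[] blob = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};

        // Treating the bytes as UTF-8 text replaces the invalid ones,
        // so the round trip does not return the original data.
        byte[] viaUtf8 = new String(blob, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 round trip intact:  "
                + Arrays.equals(blob, viaUtf8));

        // The Base64 round trip returns exactly the bytes that went in.
        String text = encode(blob);
        System.out.println("Base64 round trip intact: "
                + Arrays.equals(blob, decode(text)));
    }
}
```

The encoded text could then be written to an ordinary STRING column and decoded again on the reading side; whether the extra size is acceptable depends on the data.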