Yes. It is the same with normal hive tables.
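
To make the delimiter question concrete, here is a minimal sketch of what a delimited text layout for a map column looks like. It is not Hive's code: the helper names are hypothetical, the ',' item delimiter is an assumption, and only the '|' key delimiter comes from the question quoted below.

```python
# Hypothetical helpers that mimic a delimited text layout for a map column.
# ITEM_DELIM is an assumption; KEY_DELIM is the '|' from the question.
ITEM_DELIM = ","   # assumed: COLLECTION ITEMS TERMINATED BY ','
KEY_DELIM = "|"    # MAP KEYS TERMINATED BY '|'


def encode_map(m):
    """Serialize a dict into the delimited text one map cell would hold."""
    return ITEM_DELIM.join(f"{k}{KEY_DELIM}{v}" for k, v in m.items())


def decode_map(text):
    """Parse the delimited text back into a dict of string -> int."""
    if not text:
        return {}
    out = {}
    for entry in text.split(ITEM_DELIM):
        key, _, value = entry.partition(KEY_DELIM)
        out[key] = int(value)
    return out


cell = encode_map({"clicks": 3, "views": 42})
print(cell)               # clicks|3,views|42
print(decode_map(cell))
```

The delimiters must not occur inside keys or values, which is one reason to pick them carefully for your data.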

thanks
yongqiang
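
The points quoted below (columns are kept as Text, and numeric Writables come back as null) can be illustrated with a small sketch. It only mimics the idea in plain Python and assumes a text-based reader that yields null when the bytes are not decimal digits; the function names are hypothetical and this is not the Hive implementation.

```python
import struct


def as_text(n):
    """Encode a number as decimal text bytes, the form a text-based column holds."""
    return str(n).encode("utf-8")


def as_binary_int(n):
    """Encode a number as 4 big-endian bytes, like a serialized IntWritable."""
    return struct.pack(">i", n)


def as_binary_long(n):
    """Encode a number as 8 big-endian bytes, like a serialized LongWritable."""
    return struct.pack(">q", n)


def parse_int_or_none(raw):
    """Assumed reader behaviour: parse decimal text, yield None on failure."""
    try:
        return int(raw.decode("utf-8"))
    except ValueError:  # also covers UnicodeDecodeError
        return None


# Text bytes round-trip; raw binary bytes do not parse as decimal text.
print(parse_int_or_none(as_text(42)))
print(parse_int_or_none(as_binary_int(42)))

# Size comparison: small values are shorter as text than as fixed-width binary.
for n in (7, 42, 1234567890123):
    print(n, len(as_text(n)), len(as_binary_long(n)))
```

Small values take fewer bytes as text than as a fixed-width long, while large values take more, so the space saving depends on the data; the parse step is the CPU cost mentioned in the thread.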
On Thu, Mar 17, 2011 at 4:54 PM, Severance, Steve <[email protected]> wrote:
> Thanks Yongqiang.
>
> So for more complex types like map, do I just set up a
>
> ROW FORMAT DELIMITED MAP KEYS TERMINATED BY '|' etc...?
>
> Thanks.
>
> Steve
>
> -----Original Message-----
> From: yongqiang he [mailto:[email protected]]
> Sent: Thursday, March 17, 2011 4:35 PM
> To: [email protected]
> Subject: Re: Building Custom RCFiles
>
> A side note: in Hive, we save all columns as Text internally (even when
> the column's type is int, double, etc.). In some experiments, strings
> were more friendly to compression, but it costs CPU to decode each value
> back to its original type.
>
> Thanks
> Yongqiang
> On Thu, Mar 17, 2011 at 4:04 PM, yongqiang he <[email protected]> 
> wrote:
>> You need to customize the serialize and deserialize functions of Hive's
>> ColumnarSerde (maybe functions in LazySerde), depending on whether you
>> want to read or write. The main thing is that you need to use your own
>> type definitions (not LazyInt/LazyLong).
>>
>> If your type is int or long (not double/float), casting it to string
>> only costs some CPU, but can save you more space.
>>
>> Thanks
>> Yongqiang
>> On Thu, Mar 17, 2011 at 3:48 PM, Severance, Steve <[email protected]> 
>> wrote:
>>> Hi,
>>>
>>>
>>>
>>> I am working on building an MR job that generates RCFiles that will become
>>> partitions of a Hive table. I have most of it working; however, only strings
>>> (Text) are being deserialized inside of Hive. The Hive table is specified to
>>> use a columnarserde, which I thought should allow the writable types stored
>>> in the RCFile to be deserialized properly.
>>>
>>>
>>>
>>> Currently all numeric types (IntWritable and LongWritable) come back as null.
>>>
>>>
>>>
>>> Has anyone else seen anything like this or have any ideas? I would rather
>>> not convert all my data to strings to use RCFile.
>>>
>>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>> Steve
>>
>
