[jira] Commented: (HIVE-895) Add SerDe for Avro serialized data

Alex Rovner (JIRA) Sat, 17 Jul 2010 21:14:23 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889558#action_12889558
 ]


Alex Rovner commented on HIVE-895:
----------------------------------

Can someone please explain to me how this SerDe would work?

Specifically, how would it deserialize the data?

From what I understand, an Avro file has a header that defines the data that
is stored in the file. In order to deserialize the data you need to read the
header, which is a challenge in Hive's Deserializer interface because the
initialize() method does not know anything about the input file. (Note: there
is a hack that can get you the file by reading the map.input Hadoop property.
This hack, however, is not good enough in Hive, because someone might be using
the CLI to run a query that does not trigger a MapReduce job.)

Does anyone know a good solution to this issue?

I am actually trying to implement a different file format, but the idea of our
format is similar to Avro: each file has a header that contains a "schema".
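
To make the problem concrete, here is a minimal sketch of a file that carries
its own schema in a header. It is not Avro's format and not our real format:
the magic bytes, the JSON encoding, and the names write_file, read_schema and
read_rows are all made up for illustration. The point is only that the rows
cannot be decoded until the header has been read from the file itself.

```python
import io
import json
import struct

MAGIC = b"HDR1"  # hypothetical magic bytes, not Avro's


def write_file(stream, schema, rows):
    """Write a header carrying the schema, then length-prefixed rows."""
    header = json.dumps(schema).encode("utf-8")
    stream.write(MAGIC)
    stream.write(struct.pack(">I", len(header)))
    stream.write(header)
    for row in rows:
        # Rows store values only; field names live in the header.
        body = json.dumps([row[name] for name in schema["fields"]]).encode("utf-8")
        stream.write(struct.pack(">I", len(body)))
        stream.write(body)


def read_schema(stream):
    """Read only the header. A deserializer needs this before any row."""
    if stream.read(4) != MAGIC:
        raise ValueError("not a self-describing file")
    (size,) = struct.unpack(">I", stream.read(4))
    return json.loads(stream.read(size).decode("utf-8"))


def read_rows(stream):
    """Decode rows using the field names recovered from the header."""
    names = read_schema(stream)["fields"]
    while True:
        prefix = stream.read(4)
        if len(prefix) < 4:
            return  # end of file
        (size,) = struct.unpack(">I", prefix)
        values = json.loads(stream.read(size).decode("utf-8"))
        yield dict(zip(names, values))


if __name__ == "__main__":
    buf = io.BytesIO()
    write_file(buf, {"fields": ["id", "name"]}, [{"id": 1, "name": "a"}])
    buf.seek(0)
    print(list(read_rows(buf)))
```

In this sketch read_rows can call read_schema because it is handed the stream.
A deserializer that is initialized without access to the file has no
equivalent step, which is exactly the gap I am asking about.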

Thanks

> Add SerDe for Avro serialized data
> ----------------------------------
>
>                 Key: HIVE-895
>                 URL: https://issues.apache.org/jira/browse/HIVE-895
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Serializers/Deserializers
>            Reporter: Jeff Hammerbacher
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
