[
https://issues.apache.org/jira/browse/HIVE-43?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Thusoo updated HIVE-43:
------------------------------
Component/s: Serializers/Deserializers
> [Hive] Port Hive's serialization/deserialization to the new Serialization
> framework
> -----------------------------------------------------------------------------------
>
> Key: HIVE-43
> URL: https://issues.apache.org/jira/browse/HIVE-43
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Pete Wyckoff
>
> Problem 1: legacy data
> This is non-trivial because legacy Hive data is written as BytesWritable in
> the SequenceFile value, while the specific RecordIO/Thrift/X class name is
> stored in the metastore.
> If we write our own SequenceFileRecordReader, this is trivial. The standard
> reader, however, assumes the SequenceFile records the actual class name, so
> we cannot deserialize at this level: we would just get back a BytesWritable.
> We need the SequenceFileRecordReader to consult the Deserializer to find out
> which class is actually being deserialized.
> I don't know whether writing data as plain BytesWritable and storing the
> real class name somewhere else is a common problem, but for us it is an
> issue.
> Otherwise, a ThriftSerialization set of classes is coming soon, and we can
> add similar ones for our other serdes.
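
To illustrate Problem 1, here is a minimal sketch of a reader-side lookup that asks an external registry which class the raw value bytes really hold, instead of trusting the class name recorded in the file. Everything in it is an assumption made for illustration: LegacyValueDecoder, its Deserializer interface, the table name and the class name are hypothetical, and the two maps only stand in for the metastore. It does not use the Hadoop or Hive APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: none of these types are Hadoop or Hive APIs.
final class LegacyValueDecoder {

    /** Hypothetical stand-in for a serde's deserializer. */
    interface Deserializer {
        Object deserialize(byte[] rawValue);
    }

    // Stand-in for the metastore: table name -> name of the class that was serialized.
    private final Map<String, String> serializedClassByTable = new HashMap<>();
    // Deserializers keyed by that class name.
    private final Map<String, Deserializer> deserializerByClass = new HashMap<>();

    void registerTable(String table, String serializedClass) {
        serializedClassByTable.put(table, serializedClass);
    }

    void registerDeserializer(String serializedClass, Deserializer deserializer) {
        deserializerByClass.put(serializedClass, deserializer);
    }

    /** Decodes opaque value bytes by consulting the registry for the real class. */
    Object decode(String table, byte[] rawValue) {
        String serializedClass = serializedClassByTable.get(table);
        if (serializedClass == null) {
            throw new IllegalArgumentException("unknown table: " + table);
        }
        Deserializer deserializer = deserializerByClass.get(serializedClass);
        if (deserializer == null) {
            throw new IllegalStateException("no deserializer for " + serializedClass);
        }
        return deserializer.deserialize(rawValue);
    }

    public static void main(String[] args) {
        LegacyValueDecoder decoder = new LegacyValueDecoder();
        decoder.registerTable("page_views", "example.PageView");
        // Toy deserializer: treats the bytes as comma-separated UTF-8 fields.
        decoder.registerDeserializer("example.PageView",
                raw -> new String(raw, StandardCharsets.UTF_8).split(","));
        byte[] raw = "42,home".getBytes(StandardCharsets.UTF_8);
        String[] fields = (String[]) decoder.decode("page_views", raw);
        System.out.println(String.join(" | ", fields));
    }
}
```

The point of the sketch is only where the lookup happens: the reader hands the bytes to whatever the registry names, which is the hook the standard reader lacks according to the description above.
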
> Problem 2: DynamicSerDe
> This is a serializer/deserializer that takes a Thrift DDL at *runtime* and
> can serialize/deserialize both Thrift and non-Thrift data. Thus, the class
> name DynamicSerDe alone doesn't give you what you need, namely the DDL and
> the protocol used for the serialization: Binary or Control Separated (in
> theory also JSON, XML, ...).
> We can store this DDL in the metastore (and we do), but then DynamicSerDe
> can be used only with Hive. Maybe we should output only to TFiles, where we
> could put the DDL in the metadata for the file.
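
To illustrate Problem 2, here is a minimal sketch of carrying the DDL and the protocol alongside the data as key/value file metadata, so that a reader outside Hive could reconstruct the serde. The SerDeDescriptor class, the metadata keys "serde.ddl" and "serde.protocol", and the sample DDL string are all hypothetical. A plain Map stands in for the file's metadata block, and no TFile API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: not part of Hive, Thrift, or TFile.
final class SerDeDescriptor {

    /** The two protocols named in the issue description. */
    enum Protocol { BINARY, CONTROL_SEPARATED }

    // Hypothetical metadata keys, chosen for this sketch.
    static final String DDL_KEY = "serde.ddl";
    static final String PROTOCOL_KEY = "serde.protocol";

    final String ddl;
    final Protocol protocol;

    SerDeDescriptor(String ddl, Protocol protocol) {
        this.ddl = ddl;
        this.protocol = protocol;
    }

    /** Writes what the class name alone cannot convey into file-level metadata. */
    Map<String, String> toFileMetadata() {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(DDL_KEY, ddl);
        metadata.put(PROTOCOL_KEY, protocol.name());
        return metadata;
    }

    /** Rebuilds the descriptor from file metadata without consulting the metastore. */
    static SerDeDescriptor fromFileMetadata(Map<String, String> metadata) {
        String ddl = metadata.get(DDL_KEY);
        String protocol = metadata.get(PROTOCOL_KEY);
        if (ddl == null || protocol == null) {
            throw new IllegalArgumentException("file metadata lacks the DDL or the protocol");
        }
        return new SerDeDescriptor(ddl, Protocol.valueOf(protocol));
    }

    public static void main(String[] args) {
        SerDeDescriptor written = new SerDeDescriptor(
                "struct page_view { i32 id, string page }", Protocol.CONTROL_SEPARATED);
        SerDeDescriptor read = fromFileMetadata(written.toFileMetadata());
        System.out.println(read.protocol + ": " + read.ddl);
    }
}
```
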