Add TFileTransport deserializer
-------------------------------
Key: HIVE-333
URL: https://issues.apache.org/jira/browse/HIVE-333
Project: Hadoop Hive
Issue Type: New Feature
Components: Serializers/Deserializers
Environment: Linux
Reporter: Steve Corona
I've been googling around all night and haven't really found what I am looking
for. Basically, I want to transfer some data from my web servers to Hive in a
format that's a little more verbose than plain CSV files. It seems like JSON or
Thrift would be perfect for this. I am planning on sending this serialized JSON
or Thrift data through Scribe and loading it into Hive. I just can't figure
out how to tell Hive that the input data is a bunch of serialized Thrift
records (all of the records are the "struct" type) in a TFileTransport.
Hopefully this makes sense...
Reply from Joydeep Sen Sarma
Unfortunately the open source code base does not have the loaders we run to
convert Thrift records in a TFileTransport into a SequenceFile that Hadoop/Hive
can work with. One option is that we add this to the Hive code base (should be
straightforward).
No process required. Please file a JIRA - I will try to upload a patch this
weekend (just cut'n'paste for the most part). Would appreciate some help in
finessing it out (the internal code is hardwired to some assumptions, etc.).
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.