[ https://issues.apache.org/jira/browse/HIVE-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-1023:
-----------------------------
Attachment: hive.1023.1.patch
> typedbytes: datatypes should be derived from data
> -------------------------------------------------
>
> Key: HIVE-1023
> URL: https://issues.apache.org/jira/browse/HIVE-1023
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Namit Jain
> Fix For: 0.5.0
>
> Attachments: hive.1023.1.patch
>
>
> FROM (
> FROM src
> SELECT TRANSFORM(src.key, src.value) ROW FORMAT SERDE
> 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
> RECORDWRITER
> 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordWriter'
> USING '/bin/cat'
> AS (tkey, tvalue) ROW FORMAT SERDE
> 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
> RECORDREADER
> 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordReader'
> ) tmap
> INSERT OVERWRITE TABLE dest1 SELECT tkey, tvalue;
> The output columns are interpreted as strings, on the assumption that the
> script is returning string data.
> It would be useful if the reader and the deserializer could be decoupled:
> the record reader (TypedBytesRecordReader) would read the typed data
> independently of the output schema, and then convert it according to the
> output schema.
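The decoupling proposed in the description can be illustrated with a minimal sketch. This is not the attached patch and not Hive's implementation: the function names (read_typed, convert_row, read_row) and the CONVERTERS table are hypothetical, and the type codes are assumed to follow a small subset of the Hadoop typed bytes layout (a one-byte type code followed by big-endian data). The first step decodes whatever types the script actually emitted; the second step coerces the decoded values to the declared output schema.

```python
import struct

# Assumed subset of typed bytes type codes (illustration only).
INT, LONG, DOUBLE, STRING = 3, 4, 6, 7


def read_typed(buf, pos):
    """Decode one value whose type is taken from the data itself.

    Returns (value, next_position). The output schema is not consulted here.
    """
    code = buf[pos]
    pos += 1
    if code == INT:
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if code == LONG:
        return struct.unpack_from(">q", buf, pos)[0], pos + 8
    if code == DOUBLE:
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if code == STRING:
        (length,) = struct.unpack_from(">i", buf, pos)
        start = pos + 4
        return buf[start:start + length].decode("utf-8"), start + length
    raise ValueError("unsupported type code: %d" % code)


# Hypothetical mapping from output column types to Python conversions.
CONVERTERS = {"int": int, "bigint": int, "double": float, "string": str}


def convert_row(values, schema):
    """Coerce already-decoded values to the declared output column types."""
    return [CONVERTERS[col_type](value) for value, col_type in zip(values, schema)]


def read_row(buf, schema):
    """Read one row: decode by data type first, then convert to the schema."""
    pos = 0
    values = []
    for _ in schema:
        value, pos = read_typed(buf, pos)
        values.append(value)
    return convert_row(values, schema)
```

With this split, a script that emits an int key and a string value can feed an output declared as (string, string) or (int, string) without the decoding step having to know which one was requested.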