[jira] [Comment Edited] (HIVE-20225) SerDe to support Teradata Binary Format

2018-08-31 Thread Lu Li (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599174#comment-16599174 ]

Lu Li edited comment on HIVE-20225 at 8/31/18 7:14 PM:
---

[~cwsteinbach] helped me commit the patch, but for some reason the necessary 
binary files were not included:
 * *data/files/teradata_binary_file/td_data_with_1mb_rowsize.teradata.gz*
 * *data/files/teradata_binary_file/teradata_binary_table.deflate*

 


was (Author: luli):
[~cwsteinbach] helps me commit the patch which doesn't include the necessary 
binary file 
*data/files/teradata_binary_file/td_data_with_1mb_rowsize.teradata.gz* and 
*data/files/teradata_binary_file/teradata_binary_table.deflate*

 

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch, HIVE-20225.10.patch, 
> HIVE-20225.11.patch, HIVE-20225.12.patch, HIVE-20225.13.patch, 
> HIVE-20225.14-branch-2.patch, HIVE-20225.15.patch, 
> HIVE-20225.16-branch-2.patch, HIVE-20225.2.patch, HIVE-20225.3.patch, 
> HIVE-20225.4.patch, HIVE-20225.5-branch-2.patch, HIVE-20225.6.patch, 
> HIVE-20225.7.patch, HIVE-20225.8.patch, HIVE-20225.9.patch
>
>
> When TPT/BTEQ is used to export or import data from Teradata, Teradata 
> generates or requires binary files based on the schema.
> A custom SerDe is needed so that Hive can read these files directly, or write 
> them so they can be loaded back into Teradata (TD).
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export/import data from Teradata is to use TPT. 
> However, Hive cannot directly read or generate this binary format because it 
> has no SerDe for these files.
> Result:
> With the SerDe, Hive can transparently read and generate files in the exported 
> Teradata Binary Format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20225) SerDe to support Teradata Binary Format

2018-08-06 Thread Lu Li (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570628#comment-16570628 ]

Lu Li edited comment on HIVE-20225 at 8/6/18 6:48 PM:
--

Resubmitted the diff of HIVE-20225.4.patch as HIVE-20225.6.patch to trigger ptest.


was (Author: luli):
Resubmit the diff as HIVE-20225.4.patch to trigger ptest

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch, HIVE-20225.2.patch, 
> HIVE-20225.3.patch, HIVE-20225.4.patch, HIVE-20225.5-branch-2.patch, 
> HIVE-20225.6.patch
>
>


