-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68099/
-----------------------------------------------------------
(Updated Aug. 17, 2018, 11:21 p.m.)


Review request for hive, Carl Steinbach and Daniel Dai.


Changes
-------

1. Changes based on the review.
2. Added the teradata.row.length property to support files with 1MB records.
3. Added a query unit test.


Bugs: HIVE-20225
    https://issues.apache.org/jira/browse/HIVE-20225


Repository: hive-git


Description
-------

When TPT/BTEQ is used to export data from Teradata, Teradata writes binary files based on the schema. A custom SerDe is needed to read these files directly from Hive.

CREATE EXTERNAL TABLE `TABLE1`(
...)
PARTITIONED BY (
...)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
LOCATION ...;

SELECT * FROM `TABLE1`;

Problem Statement:
Right now, the fast way to export data from Teradata is to use TPT. However, Hive cannot directly use the exported binary format because it has no SerDe for these files.

Result:
With the SerDe, Hive can operate on the exported Teradata binary format files transparently.

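As a rough sketch of how the teradata.row.length property from change 2 might be set, the statement below declares a table over a file with 1MB rows. The class names are taken from the file paths in the diff list of this revision rather than from the statement in the description; the table name, the columns, the choice of TBLPROPERTIES as the place to set the property, and the '1MB' value format are assumptions for illustration, not confirmed by this request.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE EXTERNAL TABLE `td_table_1mb`(
  `id` INT,
  `payload` STRING)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.teradata.TeradataBinarySerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.TeradataBinaryFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.TeradataBinaryFileOutputFormat'
-- Assumption: the row length is passed as a table property with this value format.
TBLPROPERTIES (
  'teradata.row.length'='1MB');
```

The input format and record reader presumably need this setting to size the row buffer, which is why it would have to agree with the row size used at export time.
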
Diffs (updated)
-----

  data/files/teradata_binary_file/td_data_with_1mb_rowsize.teradata.gz PRE-CREATION
  data/files/teradata_binary_file/teradata_binary_table.deflate PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/TeradataBinaryFileInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/TeradataBinaryFileOutputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/TeradataBinaryRecordReader.java PRE-CREATION
  ql/src/test/queries/clientpositive/test_teradatabinaryfile.q PRE-CREATION
  ql/src/test/results/clientpositive/test_teradatabinaryfile.q.out PRE-CREATION
  serde/src/java/org/apache/hadoop/hive/serde2/teradata/TeradataBinaryDataInputStream.java PRE-CREATION
  serde/src/java/org/apache/hadoop/hive/serde2/teradata/TeradataBinaryDataOutputStream.java PRE-CREATION
  serde/src/java/org/apache/hadoop/hive/serde2/teradata/TeradataBinarySerde.java PRE-CREATION
  serde/src/test/org/apache/hadoop/hive/serde2/teradata/TestTeradataBinarySerdeForDate.java PRE-CREATION
  serde/src/test/org/apache/hadoop/hive/serde2/teradata/TestTeradataBinarySerdeForDecimal.java PRE-CREATION
  serde/src/test/org/apache/hadoop/hive/serde2/teradata/TestTeradataBinarySerdeForTimeStamp.java PRE-CREATION
  serde/src/test/org/apache/hadoop/hive/serde2/teradata/TestTeradataBinarySerdeGeneral.java PRE-CREATION


Diff: https://reviews.apache.org/r/68099/diff/3/

Changes: https://reviews.apache.org/r/68099/diff/2-3/


Testing
-------

JUnit tests have been added for the serialization and deserialization functions.


Thanks,

Lu Li