Hello all,

I am currently struggling to ingest data from Teradata into Hive on HDFS in
Parquet format. This is what I have tried so far:

1. I was expecting Sqoop to create the Hive table automatically, but the import
fails with an error along the lines of "Import Hive table's column schema is
missing".
2. To troubleshoot, I skipped the import and just created the table with
sqoop create-hive-table, which works correctly. However, that tool does not
accept the --as-parquetfile parameter, so the Hive table is not Parquet-ready.
3. So I proceeded to alter the table to change its storage format to Parquet
(see the first sketch right after this list).
4. I tried the import again, and this time it fails because the "Hive table's
InputFormat class is not supported". This looks like the old Parquet approach
where you create the Hive table specifying the InputFormat class yourself, but
I don't want to go back to the old Parquet style and haven't tried it.
5. Finally, I have tried importing with --as-parquetfile alone, with nothing
Hive-related, and then pointing an external Hive table at the HDFS Parquet
directory with a LOCATION clause (see the second sketch after this list). That
fails with "File is not a parquet file. expected magic number ...", and I get
the same error if I try to load the HDFS Parquet folder from Spark.
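
For steps 2 and 3, what I am doing is essentially the following (host,
database, table names and credentials are placeholders, not the exact values I
run):

  sqoop create-hive-table \
    --connect jdbc:teradata://TD_HOST/DATABASE=SOURCE_DB \
    --username MY_USER --password '***' \
    --table SOURCE_TABLE \
    --hive-table target_db.target_table

  # then switch the table's storage format to Parquet
  hive -e "ALTER TABLE target_db.target_table SET FILEFORMAT PARQUET;"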
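
And step 5 is essentially a plain Parquet import into a directory, followed by
an external table over that directory (paths and columns are again
placeholders):

  sqoop import \
    --connect jdbc:teradata://TD_HOST/DATABASE=SOURCE_DB \
    --username MY_USER --password '***' \
    --table SOURCE_TABLE \
    --as-parquetfile \
    --target-dir /user/hive/staging/source_table \
    -m 4

  hive -e "
    CREATE EXTERNAL TABLE target_db.target_table_ext (
      id BIGINT,     -- placeholder columns; the real DDL mirrors the Teradata schema
      name STRING
    )
    STORED AS PARQUET
    LOCATION '/user/hive/staging/source_table';
  "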

I am using all the relevant Sqoop parameters, from --hive-table to
--create-hive-table, and I have read through the documentation a couple of
times. I don't think there is any issue in my command-line parameters.
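
For reference, the import from step 1 has roughly this shape (again,
connection details, database and table names are placeholders):

  sqoop import \
    --connect jdbc:teradata://TD_HOST/DATABASE=SOURCE_DB \
    --username MY_USER --password '***' \
    --table SOURCE_TABLE \
    --hive-import \
    --create-hive-table \
    --hive-table target_db.target_table \
    --as-parquetfile \
    -m 4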

Any assistance is welcome.
Saif
