In my Sqoop import, one of the columns in the source table was deleted,
and this is causing a data issue: the imported data is off by one column.
The removed column was in the middle of the schema. If it had been the
last column, it would not have been a problem.
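To illustrate the shift, here is a minimal sketch with made-up values. The sample row and the comma delimiter are assumptions chosen for readability, not my real data. When fields are mapped by position and the source now sends only three of the four fields, the value read as col3 is really col4:

```shell
#!/bin/sh
# Hypothetical row before the change: col1,col2,col3,col4
old_row='1,alice,uuid-123,2012-11-26'

# Same row after col3 is dropped at the source: only three fields remain
new_row=$(printf '%s\n' "$old_row" | cut -d',' -f1,2,4)

# A reader that maps fields by position still treats field 3 as col3
read_col3() {
    printf '%s\n' "$1" | cut -d',' -f3
}

echo "col3 before: $(read_col3 "$old_row")"
echo "col3 after:  $(read_col3 "$new_row")"
```

After the change, the third position holds the old col4 value, and the fourth position is empty.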
The data is imported from MySQL to Hive using Sqoop. I am using sqoop-1.3.0.
Here is the command:
sqoop import --hive-import
--options-file 'credential.txt'
--table 'TABLENAME '
--where 'created between 1353960000000 and 1353963600000'
--hive-partition-key part
--hive-partition-value 'PARTITION_VALUE'
--hive-overwrite
--hive-delims-replacement
Now the problem is that one of the columns in the source DB was removed.
I tried to work around this with the --columns option:
1) Hardcoding the third column as a quoted literal:
--columns "col1,col2,'col3' as col3,col4"
but this gives the error: Column name 'col3' not in table
2) Then I tried repeating col2 twice:
--columns " col1,col2, col2 , col4"
It threw this error:
Imported Failed: Duplicate Column identifier specified:
3) Then I tried aliasing col2 as col3:
--columns " col1,col2, col2 as col3, col4"
ERROR tool.ImportTool: Imported Failed: Column name 'authid uuid' not in
table
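For reference, what I am trying to achieve would look roughly like the following if written as a free-form query (--query) instead of --columns. This is an untested sketch that only prints the command. The CAST(NULL AS CHAR) placeholder, the /tmp/sqoop_staging directory, and the single mapper (-m 1) are my assumptions, and I have not confirmed that sqoop-1.3.0 accepts this combination together with --hive-import:

```shell
#!/bin/sh
# Untested sketch: prints the sqoop command instead of running it.
# Assumptions (not from my working setup): the placeholder expression,
# the staging directory, and the use of a single mapper.
build_query() {
    # $CONDITIONS must reach Sqoop literally, so it is escaped here
    printf '%s' "SELECT col1, col2, CAST(NULL AS CHAR) AS col3, col4 FROM TABLENAME WHERE created BETWEEN 1353960000000 AND 1353963600000 AND \$CONDITIONS"
}

echo sqoop import --hive-import \
    --options-file credential.txt \
    --query "'$(build_query)'" \
    --target-dir /tmp/sqoop_staging \
    --hive-table TABLENAME \
    -m 1 \
    --hive-partition-key part \
    --hive-partition-value PARTITION_VALUE \
    --hive-overwrite
```

The idea is that the SELECT itself supplies a value for the missing third position, so the Hive columns stay aligned.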
Could anybody suggest a workaround for this?
Thanks.