-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review82254
-----------------------------------------------------------
Just a few notes:


src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/33104/#comment132951>

    Honestly, this is the first time I'm seeing the "doFailIfHiveTableExists" method :) It seems unused in the current Sqoop code base, so I'm wondering whether it would be better to not use it here (and perhaps drop it completely in a different JIRA).


src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/33104/#comment132950>

    Just a note on the log output: I believe the semantics of --create-hive-table are to create the table if it doesn't exist and do nothing if it does exist. I'm wondering whether the comment should just mention that this will "append" to existing Hive tables and that the user might consider --hive-overwrite if a rewrite is needed, e.g. with no mention of --create-hive-table. What do you think?


src/java/org/apache/sqoop/tool/BaseSqoopTool.java
<https://reviews.apache.org/r/33104/#comment132952>

    The --append parameter doesn't really make sense with Hive, because a Hive import behaves differently than an HDFS one, right? It's quite unfortunate, but it seems better to preserve the check so as not to confuse people even more.


- Jarek Cecho


On April 30, 2015, 5:54 p.m., Qian Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
> 
> (Updated April 30, 2015, 5:54 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-2295
>     https://issues.apache.org/jira/browse/SQOOP-2295
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Currently, an existing dataset will throw an exception. This differs from
> `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive
> is indeed different. For HDFS, unless `--append` is specified, the job will
> fail when the destination already exists.
> For Hive, unless `--create-hive-table` is specified, the job will run in
> append mode. The patch makes the handling of `--as-textfile` and
> `--as-parquetfile` consistent.
> 
> 
> Diffs
> -----
> 
>   src/docs/man/hive-args.txt 7d9e427 
>   src/docs/man/sqoop-create-hive-table.txt 7aebcc1 
>   src/docs/user/create-hive-table.txt 3aa34fd 
>   src/docs/user/hive-args.txt 53de92d 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2 
>   src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c97bb58 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791 
>   testdata/hive/scripts/normalImportAsParquet.q e434e9b 
> 
> Diff: https://reviews.apache.org/r/33104/diff/
> 
> 
> Testing
> -------
> 
> Manually tested append, new create and overwrite cases.
> 
> 
> Thanks,
> 
> Qian Xu
> 
>
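For readers following the thread, the flag semantics under discussion can be sketched with command lines like the following. This is an illustrative sketch only: the JDBC connect string, username, and table name are placeholders, not values from the patch.

```shell
# Default Hive import as Parquet: if the Hive table already exists,
# the import appends to it (the behavior the patch makes consistent
# with --as-textfile).
sqoop import \
  --connect jdbc:mysql://db.example.com/corp --username sqoop_user \
  --table employees \
  --hive-import --as-parquetfile

# With --create-hive-table: the job fails if the target Hive table
# already exists.
sqoop import \
  --connect jdbc:mysql://db.example.com/corp --username sqoop_user \
  --table employees \
  --hive-import --as-parquetfile --create-hive-table

# With --hive-overwrite: existing data in the Hive table is replaced
# instead of appended to.
sqoop import \
  --connect jdbc:mysql://db.example.com/corp --username sqoop_user \
  --table employees \
  --hive-import --as-parquetfile --hive-overwrite
```

The commands are not runnable without a live database and a Sqoop installation; they only illustrate which flag selects which existing-table behavior.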
