> On June 28, 2018, 8:42 a.m., daniel voros wrote: > > Hi Fero, > > > > If I understand correclty, with this patch we're only displaying a warning > > when using --resilient to let the users know they should add --split-by > > (even if they do so?). > > > > In the documentation you're saying omitting --split-by can lead to > > lost/duplicated records. Shouldn't we stop the importing if there's no > > --split-by then? I understand we can't enforce the uniqeness and ascending > > order though, so keeping some kind of warning could make sense too. > > > > What do you think? > > > > Regards, > > Daniel
Hi Dani, Thank you for the review! Yes, so, the warning message is always the same, though I wanted to put the emphasis on the implicit requirements of import (unique and ascending values in the split-by column). Happy to change the message if you've a better suggestion! I haven't written anything about omitting --split-by, at least it wasn't my intention to. The first sentence (that I added) says: "In case of import however, one has to use both the +--resilient+ option and specify the +--split-by+ column to trigger the retry mechanism." Import doesn't use resilient operations if there is no --split-by option. Though I believe that it falls back to non-resilient (default) behavior. So, what I intended to say was the same as what you're suggesting here, but it might be confusing, then. Do you think I should change anything? - Fero ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67675/#review205494 ----------------------------------------------------------- On June 25, 2018, 3:17 p.m., Fero Szabo wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67675/ > ----------------------------------------------------------- > > (Updated June 25, 2018, 3:17 p.m.) > > > Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas. > > > Bugs: SQOOP-3332 > https://issues.apache.org/jira/browse/SQOOP-3332 > > > Repository: sqoop-trunk > > > Description > ------- > > This is the documentation part of SQOOP-3333. > > > Diffs > ----- > > src/docs/user/connectors.txt f1c7aebe > src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db > src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java > cf58f631 > src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java > fc1c4895 > > > Diff: https://reviews.apache.org/r/67675/diff/2/ > > > Testing > ------- > > Unit tests, 3rdparty tests, ant docs. > > I've also investigated how export and import works: > > Import has it's retry mechanism in > org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue > In case of error, it re-calculates the db query, thus the implicit > requirements > > Export has it's retry loop in > org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write > It doesn't recalculate the query, thus is a lot safer. > > > Thanks, > > Fero Szabo > >