[jira] [Resolved] (SQOOP-3332) Extend Documentation of --resilient flag and add warning message when detected

2018-06-28 Thread Boglarka Egyed (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boglarka Egyed resolved SQOOP-3332.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

Thanks for improving our documentation [~fero]! Please don't forget to close 
the related Review Request.

> Extend Documentation of --resilient flag and add warning message when detected
> --
>
> Key: SQOOP-3332
> URL: https://issues.apache.org/jira/browse/SQOOP-3332
> Project: Sqoop
> Issue Type: Task
> Reporter: Fero Szabo
> Assignee: Fero Szabo
> Priority: Major
> Fix For: 3.0.0
>
>
> The resilient flag can be used to trigger the retry mechanism in the SQL 
> Server connector. The documentation only says that it can be used for export; 
> however, it can be used for import as well.
> Also, the feature itself relies on the implicit assumption that the split-by 
> column is unique and sorted in ascending order. Users have to be warned 
> about this limitation, at the very least.
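
To make the limitation concrete, here is a minimal, self-contained Java sketch. It is not Sqoop code: it assumes a reader that, after a failure, resumes with a filter equivalent to "split-by value > last value seen", and the names ResumeSketch and readWithOneFailure are hypothetical. Each integer stands for one row's split-by value, listed in the order the database returns it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only; not Sqoop code. All names here are hypothetical. */
final class ResumeSketch {

    /**
     * Reads split-by values in the order the database returns them, fails once
     * after failAfter rows, then resumes with the filter "value > lastSeen".
     */
    static List<Integer> readWithOneFailure(List<Integer> rows, int failAfter) {
        List<Integer> result = new ArrayList<>();
        int limit = Math.max(0, Math.min(failAfter, rows.size()));
        for (int i = 0; i < limit; i++) {
            result.add(rows.get(i));
        }
        if (limit == rows.size()) {
            return result; // everything was read before the simulated failure
        }
        // Assumed resume rule: re-run the query, keeping only values above the last one seen.
        Integer lastSeen = result.isEmpty() ? null : result.get(result.size() - 1);
        for (int value : rows) {
            if (lastSeen == null || value > lastSeen) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Unique and ascending: the resumed read is complete and duplicate-free.
        System.out.println(readWithOneFailure(Arrays.asList(1, 2, 3, 4, 5), 2)); // [1, 2, 3, 4, 5]
        // Not unique: the second row with value 2 is skipped (lost record).
        System.out.println(readWithOneFailure(Arrays.asList(1, 2, 2, 3), 2));    // [1, 2, 3]
        // Not ascending: value 2 is read both before and after the failure (duplicate).
        System.out.println(readWithOneFailure(Arrays.asList(2, 1, 3), 2));       // [2, 1, 2, 3]
    }
}
```

Only the first input satisfies both assumptions, and only there does the resumed read return every row exactly once.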



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables

2018-06-28 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66548/#review205513
---



Hi Daniel,

Thanks for updating your patch, I think we are very close to committing it. 
Could you please rebase it onto the latest trunk version, as some new changes 
have been committed recently? :)

Thanks,
Bogi

- Boglarka Egyed


On May 2, 2018, 12:12 p.m., daniel voros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66548/
> ---
> 
> (Updated May 2, 2018, 12:12 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3311
> https://issues.apache.org/jira/browse/SQOOP-3311
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID 
> by default. This will probably result in increased usage of ACID tables and 
> the need to support importing into ACID tables with Sqoop.
> 
> Currently the only table format supporting full ACID tables is ORC.
> 
> The easiest and most effective way to support importing into these tables 
> would be to write out files as ORC and keep using LOAD DATA as we do for all 
> other Hive tables (supported since HIVE-17361).
> 
> A workaround could be to create the table as a textfile (as before) and then 
> CTAS from that. This would push the responsibility of creating the ORC format 
> to Hive. However, it would result in writing every record twice: in text 
> format and in ORC.
> 
> Note that ORC is only necessary for full ACID tables. Insert-only (a.k.a. 
> micromanaged) ACID tables can use an arbitrary file format.
> 
> Supporting full ACID tables would also be the first step in making 
> "lastmodified" incremental imports work with Hive.
> 
> 
> Diffs
> -
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/docs/man/import-common-args.txt 22e3448e 
>   src/docs/man/sqoop-import-all-tables.txt 6db38ad8 
>   src/docs/user/export-purpose.txt def6ead3 
>   src/docs/user/hcatalog.txt 2ae1d54d 
>   src/docs/user/help.txt 8a0d1477 
>   src/docs/user/import-all-tables.txt fbad47b2 
>   src/docs/user/import-purpose.txt c7eca606 
>   src/docs/user/import.txt 2d074f49 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b542102 
>   src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7e 
>   src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION 
>   src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/TestAllTables.java 16933a82 
>   src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java f6d591b7 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 
>   src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION 
>   src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66548/diff/12/
> 
> 
> Testing
> ---
> 
> - added some unit tests
> - tested basic Hive import scenarios on a cluster
> 
> 
> Thanks,
> 
> daniel voros
> 
>



Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected

2018-06-28 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67675/#review205509
---


Ship it!

- Boglarka Egyed


On June 28, 2018, 12:29 p.m., Fero Szabo wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67675/
> ---
> 
> (Updated June 28, 2018, 12:29 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3332
> https://issues.apache.org/jira/browse/SQOOP-3332
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> This is the documentation part of SQOOP-.
> 
> 
> Diffs
> -
> 
>   src/docs/user/connectors.txt f1c7aebe 
>   src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db 
>   src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java cf58f631 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java fc1c4895 
> 
> 
> Diff: https://reviews.apache.org/r/67675/diff/3/
> 
> 
> Testing
> ---
> 
> Unit tests, 3rdparty tests, ant docs.
> 
> I've also investigated how export and import work: 
> 
> Import has its retry mechanism in 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue
> In case of error, it recalculates the db query, hence the implicit 
> requirements.
> 
> Export has its retry loop in 
> org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write
> It doesn't recalculate the query, and is thus a lot safer.
> 
> 
> Thanks,
> 
> Fero Szabo
> 
>



Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing

2018-06-28 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67628/
---

(Updated June 28, 2018, 2:26 p.m.)


Review request for Sqoop.


Bugs: SQOOP-3328
https://issues.apache.org/jira/browse/SQOOP-3328


Repository: sqoop-trunk


Description
---

The new implementation uses classes from parquet.hadoop packages.
TestParquetIncrementalImportMerge has been introduced to cover some gaps we had 
in the Parquet merge support.
The test infrastructure has also been modified a bit, which was needed because 
of TestParquetIncrementalImportMerge.

Note that this JIRA does not cover the Hive Parquet import support; I will 
create another JIRA for that.


Diffs (updated)
-

  src/java/org/apache/sqoop/SqoopOptions.java d9984af369f901c782b1a74294291819e7d13cdd
  src/java/org/apache/sqoop/avro/AvroUtil.java 57c2062568778c5bb53cd4118ce4f030e4ff33f2
  src/java/org/apache/sqoop/manager/ConnManager.java c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b5421028d3006e790ed4b711a06dbdb4035b8a0
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java ae53a96bddc523a52384715dd97705dc3d9db607
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java 8d7b87f6d6832ce8d81d995af4c4bd5eeae38e1b
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java fa1bc7d1395fbbbceb3cb72802675aebfdb27898
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java ed5103f1d84540ef2fa5de60599e94aa69156abe
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java 2286a52030778925349ebb32c165ac062679ff71
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java 67fdf6602bcbc6c091e1e9bf4176e56658ce5222
  src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java PRE-CREATION
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java 7f21205e1c4be4200f7248d3f1c8513e0c8e490c
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java ca02c7bdcaf2fa981e15a6a96b111dec38ba2b25
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java 2d88a9c8ea4eb32001e1eb03e636d9386719
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java 87828d1413eb71761aed44ad3b138535692f9c97
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java 20adf6e422cc4b661a74c8def114d44a14787fc6
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java 055e1166b07aeef711cd162052791500368c628d
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java 9fecf282885f7aeac011a66f7d5d05512624976f
  src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java e68bba90d8b08ac3978fcc9ccae612bdf02388e8
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c2b22d819c9a994884b254f76eb518b6a
  src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7eeeff02b59204e4baca8554d668b6c61e
  src/java/org/apache/sqoop/tool/MergeTool.java 4c20f7d151514b26a098dafdc1ee265cbde5ad20
  src/test/org/apache/sqoop/TestBigDecimalExport.java ccea17345c0c8a2bdb7c8fd141f37e3c822ee41e
  src/test/org/apache/sqoop/TestMerge.java 11806fea6c59ea897bc1aa23f6657ed172d093d5
  src/test/org/apache/sqoop/TestParquetExport.java 43dabb57b7862b607490369e09b197b6de65a147
  src/test/org/apache/sqoop/TestParquetImport.java 27d407aa3f9f2781f675294fa98431bc46f3dcfa
  src/test/org/apache/sqoop/TestParquetIncrementalImportMerge.java PRE-CREATION
  src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20ddcb8fb5fc9c3b1edfb73fecb739bba269
  src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java f6d591b73373fdf33b27202cb8116025fb694ef1
  src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a06ba21b01e99c1655450d36016c2901cc0
  src/test/org/apache/sqoop/testutil/ImportJobTestCase.java dbefe209770885063d1b4d0c3940d078b8d91cad
  src/test/org/apache/sqoop/tool/TestBaseSqoopTool.java 

Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected

2018-06-28 Thread daniel voros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67675/#review205504
---


Ship it!

- daniel voros





Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected

2018-06-28 Thread Fero Szabo via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67675/
---

(Updated June 28, 2018, 12:29 p.m.)


Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas.


Changes
---

Small correction to make the implicit requirement a little bit clearer.


Bugs: SQOOP-3332
https://issues.apache.org/jira/browse/SQOOP-3332


Repository: sqoop-trunk


Description
---

This is the documentation part of SQOOP-.


Diffs (updated)
-

  src/docs/user/connectors.txt f1c7aebe 
  src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db 
  src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java cf58f631 
  src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java fc1c4895 


Diff: https://reviews.apache.org/r/67675/diff/3/

Changes: https://reviews.apache.org/r/67675/diff/2-3/


Testing
---

Unit tests, 3rdparty tests, ant docs.

I've also investigated how export and import work: 

Import has its retry mechanism in 
org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue
In case of error, it recalculates the db query, hence the implicit requirements.

Export has its retry loop in 
org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write
It doesn't recalculate the query, and is thus a lot safer.
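
The difference between the two retry styles can be sketched in a few lines of Java. This is an assumption-laden illustration, not the code of SQLServerDBRecordReader or SQLServerAsyncDBExecThread: the class RetrySketch, its methods, and the shape of the generated SQL are all hypothetical.

```java
import java.util.function.Supplier;

/** Illustration only; not Sqoop code. All names here are hypothetical. */
final class RetrySketch {

    /** Export-style retry: the same operation is re-run unchanged until it succeeds. */
    static <T> T retrySame(Supplier<T> operation, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the identical operation again
            }
        }
        throw last;
    }

    /**
     * Import-style retry: a new query is built from the last split-by value read.
     * This is only correct if that column is unique and read in ascending order.
     */
    static String resumeQuery(String table, String splitColumn, long lastSeenValue) {
        return "SELECT * FROM " + table
                + " WHERE " + splitColumn + " > " + lastSeenValue
                + " ORDER BY " + splitColumn;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String outcome = retrySame(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "ok";
        }, 5);
        System.out.println(outcome + " after " + calls[0] + " attempts");
        System.out.println(resumeQuery("orders", "id", 42L));
    }
}
```

Re-running an unchanged operation makes no assumption about the data, whereas a rebuilt query is only as reliable as the ordering and uniqueness of the column it resumes from.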


Thanks,

Fero Szabo



Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected

2018-06-28 Thread Fero Szabo via Review Board


> On June 28, 2018, 8:42 a.m., daniel voros wrote:
> > Hi Fero,
> > 
> > If I understand correctly, with this patch we're only displaying a warning 
> > when using --resilient to let the users know they should add --split-by 
> > (even if they do so?).
> > 
> > In the documentation you're saying omitting --split-by can lead to 
> > lost/duplicated records. Shouldn't we stop the import if there's no 
> > --split-by then? I understand we can't enforce the uniqueness and ascending 
> > order though, so keeping some kind of warning could make sense too.
> > 
> > What do you think?
> > 
> > Regards,
> > Daniel

Hi Dani,

Thank you for the review!

Yes, the warning message is always the same, though I wanted to put the 
emphasis on the implicit requirements of import (unique and ascending values in 
the split-by column). Happy to change the message if you have a better 
suggestion!

I haven't written anything about omitting --split-by; at least, it wasn't my 
intention to. The first sentence (that I added) says:
"In case of import however, one has to use both the +--resilient+ option and 
specify the +--split-by+ column to trigger the retry mechanism."

Import doesn't use resilient operations if there is no --split-by option; 
I believe that it falls back to the non-resilient (default) behavior instead.

So, what I intended to say was the same as what you're suggesting here, but it 
might be confusing, then. Do you think I should change anything?
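
For readers following the thread, the behavior described above can be summarized in a short Java sketch. It is hypothetical throughout: ResilientOptionSketch, importMode, warningFor, and the wording of the warning are invented for illustration and are not taken from SQLServerManager.

```java
/** Illustration only; not Sqoop code. All names and messages are hypothetical. */
final class ResilientOptionSketch {

    enum Mode { RESILIENT, DEFAULT }

    /** Import retries only when --resilient and a --split-by column are both given. */
    static Mode importMode(boolean resilientFlag, String splitByColumn) {
        boolean hasSplitBy = splitByColumn != null && !splitByColumn.trim().isEmpty();
        return (resilientFlag && hasSplitBy) ? Mode.RESILIENT : Mode.DEFAULT;
    }

    /** The same warning is returned whenever --resilient is present; empty otherwise. */
    static String warningFor(boolean resilientFlag) {
        if (!resilientFlag) {
            return "";
        }
        return "--resilient detected: for import, also specify --split-by with a column "
                + "whose values are unique and in ascending order.";
    }

    public static void main(String[] args) {
        System.out.println(importMode(true, "id"));  // RESILIENT
        System.out.println(importMode(true, null));  // DEFAULT
        System.out.println(warningFor(true));
    }
}
```

The sketch cannot check uniqueness or ordering of the column, which matches the point raised in the review that these requirements can only be communicated, not enforced.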


- Fero


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67675/#review205494
---


On June 25, 2018, 3:17 p.m., Fero Szabo wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67675/
> ---
> 
> (Updated June 25, 2018, 3:17 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3332
> https://issues.apache.org/jira/browse/SQOOP-3332
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> This is the documentation part of SQOOP-.
> 
> 
> Diffs
> -
> 
>   src/docs/user/connectors.txt f1c7aebe 
>   src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db 
>   src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java cf58f631 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java fc1c4895 
> 
> 
> Diff: https://reviews.apache.org/r/67675/diff/2/
> 
> 
> Testing
> ---
> 
> Unit tests, 3rdparty tests, ant docs.
> 
> I've also investigated how export and import work: 
> 
> Import has its retry mechanism in 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue
> In case of error, it recalculates the db query, hence the implicit 
> requirements.
> 
> Export has its retry loop in 
> org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write
> It doesn't recalculate the query, and is thus a lot safer.
> 
> 
> Thanks,
> 
> Fero Szabo
> 
>



Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing

2018-06-28 Thread daniel voros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67628/#review205495
---


Ship it!




Thanks for the updates! Ship it!

- daniel voros


On June 26, 2018, 9:15 a.m., Szabolcs Vasas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67628/
> ---
> 
> (Updated June 26, 2018, 9:15 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3328
> https://issues.apache.org/jira/browse/SQOOP-3328
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> The new implementation uses classes from parquet.hadoop packages.
> TestParquetIncrementalImportMerge has been introduced to cover some gaps we 
> had in the Parquet merge support.
> The test infrastructure has also been modified a bit, which was needed 
> because of TestParquetIncrementalImportMerge.
> 
> Note that this JIRA does not cover the Hive Parquet import support; I will 
> create another JIRA for that.
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af369f901c782b1a74294291819e7d13cdd
>   src/java/org/apache/sqoop/avro/AvroUtil.java 57c2062568778c5bb53cd4118ce4f030e4ff33f2
>   src/java/org/apache/sqoop/manager/ConnManager.java c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b5421028d3006e790ed4b711a06dbdb4035b8a0
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java ae53a96bddc523a52384715dd97705dc3d9db607
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java 8d7b87f6d6832ce8d81d995af4c4bd5eeae38e1b
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java fa1bc7d1395fbbbceb3cb72802675aebfdb27898
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java ed5103f1d84540ef2fa5de60599e94aa69156abe
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java 2286a52030778925349ebb32c165ac062679ff71
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java 67fdf6602bcbc6c091e1e9bf4176e56658ce5222
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java PRE-CREATION
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java 7f21205e1c4be4200f7248d3f1c8513e0c8e490c
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java ca02c7bdcaf2fa981e15a6a96b111dec38ba2b25
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java 2d88a9c8ea4eb32001e1eb03e636d9386719
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java 87828d1413eb71761aed44ad3b138535692f9c97
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java 20adf6e422cc4b661a74c8def114d44a14787fc6
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java 055e1166b07aeef711cd162052791500368c628d
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java 9fecf282885f7aeac011a66f7d5d05512624976f
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java e68bba90d8b08ac3978fcc9ccae612bdf02388e8
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c2b22d819c9a994884b254f76eb518b6a
>   src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7eeeff02b59204e4baca8554d668b6c61e
>   src/java/org/apache/sqoop/tool/MergeTool.java 4c20f7d151514b26a098dafdc1ee265cbde5ad20
>   src/test/org/apache/sqoop/TestBigDecimalExport.java ccea17345c0c8a2bdb7c8fd141f37e3c822ee41e
>   src/test/org/apache/sqoop/TestMerge.java 11806fea6c59ea897bc1aa23f6657ed172d093d5
>   src/test/org/apache/sqoop/TestParquetExport.java 43dabb57b7862b607490369e09b197b6de65a147
>   src/test/org/apache/sqoop/TestParquetImport.java 
> 

Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected

2018-06-28 Thread daniel voros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67675/#review205494
---



Hi Fero,

If I understand correctly, with this patch we're only displaying a warning when 
using --resilient to let the users know they should add --split-by (even if 
they do so?).

In the documentation you're saying omitting --split-by can lead to 
lost/duplicated records. Shouldn't we stop the import if there's no 
--split-by then? I understand we can't enforce the uniqueness and ascending 
order though, so keeping some kind of warning could make sense too.

What do you think?

Regards,
Daniel

- daniel voros

