[jira] [Updated] (SQOOP-3223) Sqoop2: Add Elastic Search support

2017-08-21 Thread Hu Liu, (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hu Liu, updated SQOOP-3223:
---
Description: 
ES is now widely used; could we add support for it?
I'd be glad to work on it if someone could assign it to me.

  was:ES is now widely used; could we add support for it?


> Sqoop2: Add Elastic Search support 
> ---
>
> Key: SQOOP-3223
> URL: https://issues.apache.org/jira/browse/SQOOP-3223
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Hu Liu,
>
> ES is now widely used; could we add support for it?
> I'd be glad to work on it if someone could assign it to me.





[jira] [Created] (SQOOP-3223) Sqoop2: Add Elastic Search support

2017-08-21 Thread Hu Liu, (JIRA)
Hu Liu, created SQOOP-3223:
--

 Summary: Sqoop2: Add Elastic Search support 
 Key: SQOOP-3223
 URL: https://issues.apache.org/jira/browse/SQOOP-3223
 Project: Sqoop
  Issue Type: Improvement
Reporter: Hu Liu,


ES is now widely used; could we add support for it?





Re: Review Request 61669: Test HBase kerberized connectivity

2017-08-21 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61669/
---

(Updated Aug. 21, 2017, 3:18 p.m.)


Review request for Sqoop, Boglarka Egyed, Ferenc Szabo, and Zoltán Tóth.


Bugs: SQOOP-3222
https://issues.apache.org/jira/browse/SQOOP-3222


Repository: sqoop-trunk


Description
---

In this patch I have changed the following:
- Added a test dependency on hadoop-minikdc.
- Added a JUnit rule which starts/stops a Kerberos MiniKdc before/after a test
case/class (see the sketch after this list).
- Added Kerberos handling logic to HBaseTestCase and refactored it a bit.
- Removed the Kerberos-related properties from build.xml, as they caused
HBaseKerberizedConnectivityTest to fail.
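
For illustration only, a minimal sketch of what such a rule could look like; the
class name, keytab handling and helper methods below are assumptions, not the
actual code in this patch:

    // Minimal sketch (assumed names): a JUnit rule that boots an in-process KDC
    // before a test class/case and tears it down afterwards.
    import java.io.File;
    import java.util.Properties;

    import org.apache.hadoop.minikdc.MiniKdc;
    import org.junit.rules.ExternalResource;
    import org.junit.rules.TemporaryFolder;

    public class MiniKdcRuleSketch extends ExternalResource {

      private final TemporaryFolder workDir = new TemporaryFolder();
      private MiniKdc miniKdc;

      @Override
      protected void before() throws Throwable {
        workDir.create();
        Properties conf = MiniKdc.createConf();   // default MiniKdc configuration
        miniKdc = new MiniKdc(conf, workDir.getRoot());
        miniKdc.start();                          // starts the embedded KDC
      }

      @Override
      protected void after() {
        if (miniKdc != null) {
          miniKdc.stop();                         // stops the KDC after the test
        }
        workDir.delete();
      }

      // Creates a keytab for the given principal so a test can log in via Kerberos.
      public File createKeytab(String principal) throws Exception {
        File keytab = workDir.newFile(principal.replace('/', '_') + ".keytab");
        miniKdc.createPrincipal(keytab, principal);
        return keytab;
      }

      public String getRealm() {
        return miniKdc.getRealm();
      }
    }

    // Hypothetical usage in a test class:
    //   @ClassRule
    //   public static MiniKdcRuleSketch kdc = new MiniKdcRuleSketch();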

The changes are inspired by the following HBase test classes:
https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/SecureTestCluster.java
https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestGenerateDelegationToken.java


HBase security documentation:
http://hbase.apache.org/1.2/book.html#security


Diffs (updated)
-

  build.xml 5f02dcf7759887d84d8cf0505cc1873c53f70a67 
  ivy.xml e4b45bfd9ff6d984a1d1d1808855a07d8b090921 
  src/test/com/cloudera/sqoop/hbase/HBaseKerberizedConnectivityTest.java 
PRE-CREATION 
  src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java 
d9f74952e5f9dd9497e6e9e99789471bcd8f8930 
  
src/test/org/apache/sqoop/infrastructure/kerberos/KerberosConfigurationProvider.java
 PRE-CREATION 
  src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructure.java 
PRE-CREATION 
  
src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/61669/diff/2/

Changes: https://reviews.apache.org/r/61669/diff/1-2/


Testing
---

Ran unit tests and third-party tests.


Thanks,

Szabolcs Vasas



Re: Review Request 61777: sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread Zoltán Tóth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61777/
---

(Updated Aug. 21, 2017, 1:15 p.m.)


Review request for Sqoop.


Changes
---

Based on the code review, the function has been separated into different methods
with meaningful names.


Bugs: SQOOP-3139
https://issues.apache.org/jira/browse/SQOOP-3139


Repository: sqoop-trunk


Description
---

If the column name in the database table and the split-by parameter differed in
case (e.g. Mycol vs. mycol), Sqoop could not continue the query from the last
value when the connection was broken.
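
A minimal sketch of the failure mode, assuming a hypothetical recovery-query
builder (this is not the actual DBRecordReader code; the names and the query
shape are illustrative):

    // Hypothetical illustration: after a connection reset the reader rebuilds its
    // query from the last value of the split-by column. With a strict,
    // case-sensitive lookup, "mycol" never matches "Mycol", so no last value is
    // found and the original unbounded SELECT runs again, re-reading rows that
    // were already imported (hence the duplicates).
    import java.util.HashMap;
    import java.util.Map;

    public class RecoveryQuerySketch {

      static String buildRecoveryQuery(String baseQuery, String splitByColumn,
                                       Map<String, Object> lastRecord) {
        Object lastValue = lastRecord.get(splitByColumn);   // case-sensitive lookup
        if (lastValue == null) {
          return baseQuery;        // whole range is re-read => duplicate records
        }
        return baseQuery + " AND " + splitByColumn + " > " + lastValue;
      }

      public static void main(String[] args) {
        Map<String, Object> lastRecord = new HashMap<>();
        lastRecord.put("Mycol", 8571429);
        String base = "SELECT sequence_number, Mycol FROM t WHERE Mycol <= 10000000";
        // "mycol" vs. "Mycol": the lookup misses, so the base query is re-executed.
        System.out.println(buildRecoveryQuery(base, "mycol", lastRecord));
      }
    }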


Diffs (updated)
-

  src/java/org/apache/sqoop/mapreduce/db/DBRecordReader.java a78eb061 
  src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java 9a3621b0 
  src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/61777/diff/2/

Changes: https://reviews.apache.org/r/61777/diff/1-2/


Testing
---


Thanks,

Zoltán Tóth



Re: Review Request 61777: sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread Anna Szonyi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61777/#review183322
---



Hi Zoltan,

Thanks for the contribution, and thanks for adding quite a few test cases. I have
a few nitpicks in addition to Szabolcs' ones.

Please take a look and let me know your thoughts.

Thanks,
Anna


src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java
Line 76 (original), 89-91 (patched)


This comment is a little hard to understand; could you give a more specific
example?



src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java
Lines 93-99 (patched)


It might make sense to split the code into two separate methods for the
"case-sensitive match" and the "case-insensitive match" cases.


- Anna Szonyi


On Aug. 21, 2017, 9:16 a.m., Zoltán Tóth wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61777/
> ---
> 
> (Updated Aug. 21, 2017, 9:16 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3139
> https://issues.apache.org/jira/browse/SQOOP-3139
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> If the column name in the database table and the split-by parameter differed
> in case (e.g. Mycol vs. mycol), Sqoop could not continue the query from the
> last value when the connection was broken.
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/mapreduce/db/DBRecordReader.java a78eb061 
>   src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java 
> 9a3621b0 
>   src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/61777/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zoltán Tóth
> 
>



[jira] [Updated] (SQOOP-3139) sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SQOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Tóth updated SQOOP-3139:
---
Attachment: (was: SQOOP-3139.patch)

> sqoop tries to re execute select query during import in case of a connection 
> reset error and this is causing lots of duplicate records from source
> --
>
> Key: SQOOP-3139
> URL: https://issues.apache.org/jira/browse/SQOOP-3139
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: IBM Hadoop distribution: 4.2.0
> version: 1.4.6_IBM_27
> Sqoop 1.4.6_IBM_27
>Reporter: hemanth meka
>Assignee: Zoltán Tóth
> Attachments: SQOOP-3139.patch, SQOOP-3139.patch
>
>
> We cannot reproduce this issue as it depends on the network. Here is an edited
> log excerpt to illustrate the issue.
> Log start
> .
> .
> 2017-02-22 07:35:37,638 INFO [main] 
> org.apache.sqoop.mapreduce.sqlserver.SqlServerRecordReader: Using query: 
> select sequence_number, analytical_bundle_masked where ( Sequence_Number >= 
> 8571429 ) AND ( Sequence_Number <= 1000 )
> 2017-02-22 07:35:37,662 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select 
> sequence_number, analytical_bundle_masked where ( Sequence_Number >= 8571429 
> ) AND ( Sequence_Number <= 1000 )
> 2017-02-22 07:39:00,533 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.microsoft.sqlserver.jdbc.SQLServerException: Connection reset
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2399)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2383)
>   at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1884)
>   at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6685)
>   at com.microsoft.sqlserver.jdbc.TDSReader.nextPacket(IOBuffer.java:6595)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.ensurePayload(IOBuffer.java:6571)
>   at com.microsoft.sqlserver.jdbc.TDSReader.readBytes(IOBuffer.java:6864)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.readWrappedBytes(IOBuffer.java:6886)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.readUnsignedShort(IOBuffer.java:6801)
>   at 
> com.microsoft.sqlserver.jdbc.ServerDTVImpl.getValuePrep(dtv.java:3570)
>   at com.microsoft.sqlserver.jdbc.ServerDTVImpl.getValue(dtv.java:3936)
>   at com.microsoft.sqlserver.jdbc.DTV.getValue(dtv.java:226)
>   at com.microsoft.sqlserver.jdbc.Column.getValue(Column.java:144)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getValue(SQLServerResultSet.java:2099)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getValue(SQLServerResultSet.java:2084)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getString(SQLServerResultSet.java:2427)
>   at 
> org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
>   at 
> com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
>   at QueryResult.readFields0(QueryResult.java:10706)
>   at QueryResult.readFields(QueryResult.java:10415)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
>   at 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader.nextKeyValue(SQLServerDBRecordReader.java:148)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> 2017-02-22 07:39:00,552 WARN [main] 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader: Trying to recover from 
> DB read failure: 
> java.io.IOException: SQLException in nextKeyValue
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
>   at 
> 

Re: Review Request 61777: sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61777/#review183317
---




src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java
Lines 33 (patched)


We try to avoid star imports; could you please organize this import
accordingly?



src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java
Lines 39 (patched)


Nit: can we use SPLIT_BY_COLUMN.toUpperCase() here?
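
For illustration (only the SPLIT_BY_COLUMN constant is taken from this comment;
the surrounding test code is hypothetical), the nit amounts to deriving the
mismatched-case variant from the existing constant instead of hard-coding a
second literal:

    // Hypothetical test constants; values are made up for illustration.
    private static final String SPLIT_BY_COLUMN = "mycol";
    // Derived from the existing constant rather than hard-coded as "MYCOL":
    private static final String SPLIT_BY_COLUMN_UPPER_CASE = SPLIT_BY_COLUMN.toUpperCase();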


- Szabolcs Vasas


On Aug. 21, 2017, 9:16 a.m., Zoltán Tóth wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61777/
> ---
> 
> (Updated Aug. 21, 2017, 9:16 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3139
> https://issues.apache.org/jira/browse/SQOOP-3139
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> If the column name in the database table and the split-by parameter differed
> in case (e.g. Mycol vs. mycol), Sqoop could not continue the query from the
> last value when the connection was broken.
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/mapreduce/db/DBRecordReader.java a78eb061 
>   src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java 
> 9a3621b0 
>   src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/61777/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zoltán Tóth
> 
>



[jira] [Commented] (SQOOP-3139) sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SQOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134942#comment-16134942
 ] 

Zoltán Tóth commented on SQOOP-3139:


A new patch was created to address the code review findings.

> sqoop tries to re execute select query during import in case of a connection 
> reset error and this is causing lots of duplicate records from source
> --
>
> Key: SQOOP-3139
> URL: https://issues.apache.org/jira/browse/SQOOP-3139
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: IBM Hadoop distribution: 4.2.0
> version: 1.4.6_IBM_27
> Sqoop 1.4.6_IBM_27
>Reporter: hemanth meka
>Assignee: Zoltán Tóth
> Attachments: SQOOP-3139.patch, SQOOP-3139.patch, SQOOP-3139.patch
>
>
> We cannot reproduce this issue as it depends on the network. Here is an edited
> log excerpt to illustrate the issue.
> Log start
> .
> .
> 2017-02-22 07:35:37,638 INFO [main] 
> org.apache.sqoop.mapreduce.sqlserver.SqlServerRecordReader: Using query: 
> select sequence_number, analytical_bundle_masked where ( Sequence_Number >= 
> 8571429 ) AND ( Sequence_Number <= 1000 )
> 2017-02-22 07:35:37,662 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select 
> sequence_number, analytical_bundle_masked where ( Sequence_Number >= 8571429 
> ) AND ( Sequence_Number <= 1000 )
> 2017-02-22 07:39:00,533 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.microsoft.sqlserver.jdbc.SQLServerException: Connection reset
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2399)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2383)
>   at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1884)
>   at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6685)
>   at com.microsoft.sqlserver.jdbc.TDSReader.nextPacket(IOBuffer.java:6595)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.ensurePayload(IOBuffer.java:6571)
>   at com.microsoft.sqlserver.jdbc.TDSReader.readBytes(IOBuffer.java:6864)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.readWrappedBytes(IOBuffer.java:6886)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.readUnsignedShort(IOBuffer.java:6801)
>   at 
> com.microsoft.sqlserver.jdbc.ServerDTVImpl.getValuePrep(dtv.java:3570)
>   at com.microsoft.sqlserver.jdbc.ServerDTVImpl.getValue(dtv.java:3936)
>   at com.microsoft.sqlserver.jdbc.DTV.getValue(dtv.java:226)
>   at com.microsoft.sqlserver.jdbc.Column.getValue(Column.java:144)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getValue(SQLServerResultSet.java:2099)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getValue(SQLServerResultSet.java:2084)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getString(SQLServerResultSet.java:2427)
>   at 
> org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
>   at 
> com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
>   at QueryResult.readFields0(QueryResult.java:10706)
>   at QueryResult.readFields(QueryResult.java:10415)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
>   at 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader.nextKeyValue(SQLServerDBRecordReader.java:148)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> 2017-02-22 07:39:00,552 WARN [main] 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader: Trying to recover from 
> DB read failure: 
> java.io.IOException: SQLException in nextKeyValue
>   at 
> 

[jira] [Updated] (SQOOP-3139) sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SQOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Tóth updated SQOOP-3139:
---
Attachment: SQOOP-3139.patch

> sqoop tries to re execute select query during import in case of a connection 
> reset error and this is causing lots of duplicate records from source
> --
>
> Key: SQOOP-3139
> URL: https://issues.apache.org/jira/browse/SQOOP-3139
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: IBM Hadoop distribution: 4.2.0
> version: 1.4.6_IBM_27
> Sqoop 1.4.6_IBM_27
>Reporter: hemanth meka
>Assignee: Zoltán Tóth
> Attachments: SQOOP-3139.patch, SQOOP-3139.patch, SQOOP-3139.patch
>
>
> We cannot reproduce this issue as it depends on the network. Here is an edited
> log excerpt to illustrate the issue.
> Log start
> .
> .
> 2017-02-22 07:35:37,638 INFO [main] 
> org.apache.sqoop.mapreduce.sqlserver.SqlServerRecordReader: Using query: 
> select sequence_number, analytical_bundle_masked where ( Sequence_Number >= 
> 8571429 ) AND ( Sequence_Number <= 1000 )
> 2017-02-22 07:35:37,662 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select 
> sequence_number, analytical_bundle_masked where ( Sequence_Number >= 8571429 
> ) AND ( Sequence_Number <= 1000 )
> 2017-02-22 07:39:00,533 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.microsoft.sqlserver.jdbc.SQLServerException: Connection reset
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2399)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2383)
>   at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1884)
>   at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6685)
>   at com.microsoft.sqlserver.jdbc.TDSReader.nextPacket(IOBuffer.java:6595)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.ensurePayload(IOBuffer.java:6571)
>   at com.microsoft.sqlserver.jdbc.TDSReader.readBytes(IOBuffer.java:6864)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.readWrappedBytes(IOBuffer.java:6886)
>   at 
> com.microsoft.sqlserver.jdbc.TDSReader.readUnsignedShort(IOBuffer.java:6801)
>   at 
> com.microsoft.sqlserver.jdbc.ServerDTVImpl.getValuePrep(dtv.java:3570)
>   at com.microsoft.sqlserver.jdbc.ServerDTVImpl.getValue(dtv.java:3936)
>   at com.microsoft.sqlserver.jdbc.DTV.getValue(dtv.java:226)
>   at com.microsoft.sqlserver.jdbc.Column.getValue(Column.java:144)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getValue(SQLServerResultSet.java:2099)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getValue(SQLServerResultSet.java:2084)
>   at 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.getString(SQLServerResultSet.java:2427)
>   at 
> org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
>   at 
> com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
>   at QueryResult.readFields0(QueryResult.java:10706)
>   at QueryResult.readFields(QueryResult.java:10415)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
>   at 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader.nextKeyValue(SQLServerDBRecordReader.java:148)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> 2017-02-22 07:39:00,552 WARN [main] 
> org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader: Trying to recover from 
> DB read failure: 
> java.io.IOException: SQLException in nextKeyValue
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
>   at 
> 

Review Request 61777: sqoop tries to re execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source

2017-08-21 Thread Zoltán Tóth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61777/
---

Review request for Sqoop.


Bugs: SQOOP-3139
https://issues.apache.org/jira/browse/SQOOP-3139


Repository: sqoop-trunk


Description
---

If the column name in the database table and the split-by parameter differed in
case (e.g. Mycol vs. mycol), Sqoop could not continue the query from the last
value when the connection was broken.


Diffs
-

  src/java/org/apache/sqoop/mapreduce/db/DBRecordReader.java a78eb061 
  src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java 9a3621b0 
  src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/61777/diff/1/


Testing
---


Thanks,

Zoltán Tóth