Re: Review Request 66361: Implement HiveServer2 client

2018-04-06 Thread Fero Szabo via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66361/#review200554
---




src/java/org/apache/sqoop/hive/HiveServer2Client.java
Lines 75 (patched)


Is the package private visibility intentional?

Also applies for the remaining functions.



src/java/org/apache/sqoop/hive/TableDefWriter.java
Lines 300 (patched)


Is the package private visibility intentional?



src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java
Lines 43 (patched)


There is a mismatch here between the parameter name in the constructor and 
the annotation (sqoopTool vs SqoopTool).

Obviously doesn't matter from the perspective of correctness, though, just 
bugs my eye. :)


- Fero Szabo


On March 29, 2018, 3:15 p.m., Szabolcs Vasas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66361/
> ---
> 
> (Updated March 29, 2018, 3:15 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3309
> https://issues.apache.org/jira/browse/SQOOP-3309
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> This JIRA covers the implementation of the client for HiveServer2 and its 
> integration into the classes which use HiveImport.
> 
> - HiveClient interface is introduced with 2 implementation:
>   - HiveImport: this is the original implementation which uses HiveCLI
>   - HiveServer2Client: the new clients which connects to HS2 using JDBC 
> connection
>   - The common code is extracted to HiveCommon class
> - HiveClient should be instantiated using HiveClientFactory which creates and 
> configures the right HiveClient based on the configuration in SqoopOptions
> - HiveMiniCluster is introduced with a couple of helper classes to enable 
> end-to-end HS2 tests
> - A couple of new options are added to SqoopOptions to be able to configure 
> the connection to HS2
> - Validation is implemented for these new options
> - I will upload the documentation soon
> 
> 
> Diffs
> -
> 
>   build.xml d85cf7198c5d7ca382407f927e73a85a17d12a62 
>   ivy.xml 6be4fa20fbbf1f303c69d86942b1874e18a14afc 
>   src/java/org/apache/sqoop/SqoopOptions.java 
> 651cebd69ee7e75d06c75945e3607c4fab7eb11c 
>   src/java/org/apache/sqoop/hive/HiveClient.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveClientCommon.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveClientFactory.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveImport.java 
> c2729119d31f7e585f204f2d31b2051eea71b72b 
>   src/java/org/apache/sqoop/hive/HiveServer2Client.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveServer2ConnectionFactory.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 
> b7a25b7809e0d50166966a77161dc8ff603fb2d2 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 
> b02e4fe7fda25c7f8171c7db17d15a7987459687 
>   src/java/org/apache/sqoop/tool/CreateHiveTableTool.java 
> d259566180369a55d490144e6f865e728f4f2e61 
>   src/java/org/apache/sqoop/tool/ImportAllTablesTool.java 
> 18f7a0af48d972d5186e9414475e080f1eb765f3 
>   src/java/org/apache/sqoop/tool/ImportTool.java 
> e9920058858653bec7407bf7992eb6445401e813 
>   src/test/org/apache/sqoop/hive/TestHS2TextImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveClientFactory.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveMiniCluster.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2Client.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 
> 8bdc3beb3677312ec0ee2e612616358bca4ca838 
>   src/test/org/apache/sqoop/hive/minicluster/AuthenticationConfiguration.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 
> PRE-CREATION 
>   
> src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java
>  PRE-CREATION 
>   
> src/test/org/apache/sqoop/hive/minicluster/NoAuthenticationConfiguration.java 
> PRE-CREATION 
>   
> src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java
>  PRE-CREATION 
>   src/test/org/apache/sqoop/testutil/HS2TestUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/tool/TestImportTool.java 
> 1c0cf4d863692f75bb8831e834fae47fc18b5df5 
> 
> 
> Diff: https://reviews.apache.org/r/66361/diff/1/
> 
> 
> Testing
> ---
> 
> Ran unit and third party tests suite.
> 
> 
> Thanks,
> 
> Szabolcs Vasas
> 
>



Re: Review Request 66475: SQOOP-3310 Add import to Kafka feature

2018-04-06 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66475/#review200652
---



Hi Andras,

Thanks for taking care of this new addition!

As this is a very huge change please note that this is just an initial review 
from me, I will continue next week.

Could you please regenerate your patch? I got the following warning during 
applying it:

/Users/boglarka.egyed/Downloads/kafka.diff:868: new blank line at EOF.
+
/Users/boglarka.egyed/Downloads/kafka.diff:914: new blank line at EOF.
+

Thanks,
Bogi


ivy.xml
Lines 141 (patched)


Unnecessary blank line.



ivy.xml
Lines 150 (patched)


Unnecessary blank line.



src/java/org/apache/sqoop/SqoopOptions.java
Lines 769 (patched)


typo: Netsted



src/java/org/apache/sqoop/kafka/GenericRecordKafkaTransformer.java
Lines 38 (patched)


Unnecessary blank line.



src/java/org/apache/sqoop/kafka/GenericRecordKafkaTransformer.java
Lines 52-53 (patched)


Unnecessary blank lines.



src/java/org/apache/sqoop/kafka/GenericRecordKafkaTransformer.java
Lines 72-73 (patched)


Unnecessary blank lines.



src/java/org/apache/sqoop/kafka/KafkaProduceProcessor.java
Lines 60 (patched)


Unnecessary blank line.



src/java/org/apache/sqoop/kafka/KafkaProduceProcessor.java
Lines 72 (patched)


Unnecessary blank line.



src/java/org/apache/sqoop/kafka/KafkaProduceProcessor.java
Lines 120-122 (patched)


Unnecessary blank lines.



src/java/org/apache/sqoop/kafka/KafkaProduceProcessor.java
Lines 145 (patched)


Unnecessary blank line.



src/java/org/apache/sqoop/kafka/KafkaUtil.java
Lines 37 (patched)


typo: testing and simulation the condition



src/java/org/apache/sqoop/kafka/KafkaUtil.java
Lines 38 (patched)


typo: cliet



src/java/org/apache/sqoop/kafka/KafkaUtil.java
Lines 47-52 (patched)


I know that Sqoop implementation already uses this logic at some places but 
I think it would be much more clean to not have test related logic in the 
production code. Would you mind to modify your change to eliminate this part 
and use mocking in tests where needed?



src/java/org/apache/sqoop/kafka/KafkaUtil.java
Lines 94-95 (patched)


Unnecessary blank lines.



src/java/org/apache/sqoop/kafka/ProducerRecordTransformer.java
Lines 50 (patched)


I see two input parameters not just one, could you please include that one 
here too?



src/java/org/apache/sqoop/kafka/ProducerWrapper.java
Lines 78 (patched)


typo: send (sent?)



src/java/org/apache/sqoop/tool/BaseSqoopTool.java
Lines 937 (patched)


typo: two consecutive whitespace


- Boglarka Egyed


On April 5, 2018, 12:35 p.m., Andras Beni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66475/
> ---
> 
> (Updated April 5, 2018, 12:35 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, Fero Szabo, and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3310
> https://issues.apache.org/jira/browse/SQOOP-3310
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Adds the ability to import data from RDBMS to Apache Kafka
> 
> 
> Diffs
> -
> 
>   ivy.xml 6be4fa20fbbf1f303c69d86942b1874e18a14afc 
>   ivy/libraries.properties c44b50bc1361bb3a32f47ed2c1b9f291e8c18d05 
>   src/java/org/apache/sqoop/SqoopOptions.java 
> 651cebd69ee7e75d06c75945e3607c4fab7eb11c 
>   src/java/org/apache/sqoop/kafka/GenericRecordKafkaTransformer.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/kafka/JsonKafkaTransformer.java PRE-CREATION 
>   src/java/org/apache/sqoop/kafka/KafkaProduceProcessor.java PRE-CREATION 
>   src/java/org/apache/sqoop/kafka/KafkaUtil.java PRE-CREATION 
>   src/java/org/apache/sqoop/kafka/ProducerRecordTransformer.java PRE-CREATION 
>   src/java/org/apache/sqoop/kafka/ProducerWrapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/manager/SqlManager.java 
> 

Re: Review Request 66361: Implement HiveServer2 client

2018-04-06 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66361/#review200552
---




src/java/org/apache/sqoop/hive/HiveServer2ConnectionFactory.java
Lines 55 (patched)


Why did you include only the message here not the exception itself?



src/test/org/apache/sqoop/hive/TestHS2TextImport.java
Lines 1 (patched)


Missing Apache license header.



src/test/org/apache/sqoop/hive/TestHiveClientFactory.java
Lines 1 (patched)


Missing Apache license header.



src/test/org/apache/sqoop/hive/TestHiveServer2Client.java
Lines 1 (patched)


Missing Apache license header.



src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java
Lines 41 (patched)


Inconsistent naming convention: other HiveServer2 related tests contain HS2 
in their names except this. Also the business logic classes cointain 
HiveServer2 in their names so maybe not using HS2 would be more straightforward 
- what do you think?


- Boglarka Egyed


On March 29, 2018, 3:15 p.m., Szabolcs Vasas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66361/
> ---
> 
> (Updated March 29, 2018, 3:15 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3309
> https://issues.apache.org/jira/browse/SQOOP-3309
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> This JIRA covers the implementation of the client for HiveServer2 and its 
> integration into the classes which use HiveImport.
> 
> - HiveClient interface is introduced with 2 implementation:
>   - HiveImport: this is the original implementation which uses HiveCLI
>   - HiveServer2Client: the new clients which connects to HS2 using JDBC 
> connection
>   - The common code is extracted to HiveCommon class
> - HiveClient should be instantiated using HiveClientFactory which creates and 
> configures the right HiveClient based on the configuration in SqoopOptions
> - HiveMiniCluster is introduced with a couple of helper classes to enable 
> end-to-end HS2 tests
> - A couple of new options are added to SqoopOptions to be able to configure 
> the connection to HS2
> - Validation is implemented for these new options
> - I will upload the documentation soon
> 
> 
> Diffs
> -
> 
>   build.xml d85cf7198c5d7ca382407f927e73a85a17d12a62 
>   ivy.xml 6be4fa20fbbf1f303c69d86942b1874e18a14afc 
>   src/java/org/apache/sqoop/SqoopOptions.java 
> 651cebd69ee7e75d06c75945e3607c4fab7eb11c 
>   src/java/org/apache/sqoop/hive/HiveClient.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveClientCommon.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveClientFactory.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveImport.java 
> c2729119d31f7e585f204f2d31b2051eea71b72b 
>   src/java/org/apache/sqoop/hive/HiveServer2Client.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveServer2ConnectionFactory.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 
> b7a25b7809e0d50166966a77161dc8ff603fb2d2 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 
> b02e4fe7fda25c7f8171c7db17d15a7987459687 
>   src/java/org/apache/sqoop/tool/CreateHiveTableTool.java 
> d259566180369a55d490144e6f865e728f4f2e61 
>   src/java/org/apache/sqoop/tool/ImportAllTablesTool.java 
> 18f7a0af48d972d5186e9414475e080f1eb765f3 
>   src/java/org/apache/sqoop/tool/ImportTool.java 
> e9920058858653bec7407bf7992eb6445401e813 
>   src/test/org/apache/sqoop/hive/TestHS2TextImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveClientFactory.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveMiniCluster.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2Client.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 
> 8bdc3beb3677312ec0ee2e612616358bca4ca838 
>   src/test/org/apache/sqoop/hive/minicluster/AuthenticationConfiguration.java 
> PRE-CREATION 
>   src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 
> PRE-CREATION 
>   
> src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java
>  PRE-CREATION 
>   
> src/test/org/apache/sqoop/hive/minicluster/NoAuthenticationConfiguration.java 
> PRE-CREATION 
>   
> src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java
>  PRE-CREATION 
>   src/test/org/apache/sqoop/testutil/HS2TestUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java 
> PRE-CREATION 
>   

Re: Review Request 66353: Connection resource related issues in DBOutputFormat and OracleManager

2018-04-06 Thread Laszlo Bodor

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66353/
---

(Updated April 6, 2018, 1:26 p.m.)


Review request for Sqoop.


Changes
---

test case for oracle manager
formatting


Bugs: SQOOP-3306
https://issues.apache.org/jira/browse/SQOOP-3306


Repository: sqoop-trunk


Description
---

A fortify scan showed 2 possible resource leaks.

1: The function getRecordWriter() in DBOutputFormat.java sometimes fails to 
release a database resource allocated by getConnection() on line 117.
In the file DBOutputFormat.java similar issues were on line numbers 117

2: The function makeConnection() in OracleManager.java sometimes fails to 
release a database resource allocated by getConnection() on line 321.
In the file OracleManager.java similar issues were on line numbers 321


Diffs (updated)
-

  src/java/org/apache/sqoop/manager/OracleManager.java 929b5061 
  src/java/org/apache/sqoop/mapreduce/db/DBOutputFormat.java 730ff286 
  src/test/org/apache/sqoop/manager/TestOracleManager.java PRE-CREATION 


Diff: https://reviews.apache.org/r/66353/diff/2/

Changes: https://reviews.apache.org/r/66353/diff/1-2/


Testing
---

ant test


Thanks,

Laszlo Bodor



[jira] [Updated] (SQOOP-3305) Upgrade to Hadoop 3, Hive 3, and HBase 2

2018-04-06 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated SQOOP-3305:

Summary: Upgrade to Hadoop 3, Hive 3, and HBase 2  (was: Upgrade to Hadoop 
3.0.0)

I'm adding Hive and HBase to the summary, since they need to be handled 
together. See review request for details.

> Upgrade to Hadoop 3, Hive 3, and HBase 2
> 
>
> Key: SQOOP-3305
> URL: https://issues.apache.org/jira/browse/SQOOP-3305
> Project: Sqoop
>  Issue Type: Task
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> To be able to eventually support the latest versions of Hive, HBase and 
> Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See 
> https://hadoop.apache.org/docs/r3.0.0/index.html
> In this ticket I'll collect the necessary changes to do the upgrade. I'm not 
> setting a fix version yet, since this might mean a major release and to be 
> done together with the upgrade of related components.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3311) Importing as ORC file to support full ACID Hive tables

2018-04-06 Thread Daniel Voros (JIRA)
Daniel Voros created SQOOP-3311:
---

 Summary: Importing as ORC file to support full ACID Hive tables
 Key: SQOOP-3311
 URL: https://issues.apache.org/jira/browse/SQOOP-3311
 Project: Sqoop
  Issue Type: New Feature
  Components: hive-integration
Reporter: Daniel Voros
Assignee: Daniel Voros


Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID 
by default. This will probably result in increased usage of ACID tables and the 
need to support importing into ACID tables with Sqoop.

Currently the only table format supporting full ACID tables is ORC.

The easiest and most effective way to support importing into these tables would 
be to write out files as ORC and keep using LOAD DATA as we do for all other 
Hive tables (supported since HIVE-17361).

Workaround could be to create table as textfile (as before) and then CTAS from 
that. This would push the responsibility of creating ORC format to Hive. 
However it would result in writing every record twice; in text format and in 
ORC.

Note that ORC is only necessary for full ACID tables. Insert-only (aka. 
micromanaged) ACID tables can use arbitrary file format.

Supporting full ACID tables would also be the first step in making 
"lastmodified" incremental imports work with Hive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 66221: SQOOP-3301 Document SQOOP-3216 - metastore related change

2018-04-06 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66221/#review200629
---




src/docs/user/metastore-purpose.txt
Lines 20 (patched)


do we need this extra new line here?



src/docs/user/metastore-purpose.txt
Line 26 (original), 30 (patched)


I think we should explicitly state that MySQL, MSSQL,  Hsqldb, PostgreSQL, 
Oracle and DB2 are supported since we only test these. Other RDBMSs might work 
but we should not confuse the users.



src/docs/user/saved-jobs.txt
Lines 160 (patched)


typo: paramter



src/docs/user/saved-jobs.txt
Lines 161 (patched)


typo: configuartion



src/docs/user/saved-jobs.txt
Lines 296 (patched)


I think we should explicitly state that MySQL, MSSQL,  Hsqldb, PostgreSQL, 
Oracle and DB2 are supported since we only test these. Other RDBMSs might work 
but we should not confuse the users.


- Szabolcs Vasas


On March 27, 2018, 11:48 a.m., Fero Szabo wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66221/
> ---
> 
> (Updated March 27, 2018, 11:48 a.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3301
> https://issues.apache.org/jira/browse/SQOOP-3301
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> This is the documentation for the metastore related patch implemented by Zach 
> Berkowitz.
> 
> 
> Diffs
> -
> 
>   src/docs/man/sqoop-job.txt 8be57402 
>   src/docs/user/metastore-purpose.txt 95c2d774 
>   src/docs/user/saved-jobs.txt 6885079f 
> 
> 
> Diff: https://reviews.apache.org/r/66221/diff/2/
> 
> 
> Testing
> ---
> 
> ant docs ran successfully
> 
> 
> Thanks,
> 
> Fero Szabo
> 
>