Re: Review Request 61427: Make sure the original ClassLoader is restored when running HCatalog tests
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61427/#review182692 --- Ship it! Applied and tested - Zoltán Tóth On Aug. 4, 2017, 3:49 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61427/ > --- > > (Updated Aug. 4, 2017, 3:49 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3218 > https://issues.apache.org/jira/browse/SQOOP-3218 > > > Repository: sqoop-trunk > > > Description > --- > > Make sure the original ClassLoader is restored when running HCatalog tests > > > Diffs > - > > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java > 2101b0687bb2307edfa5dc6e6eb1a20eb462f981 > src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatUtilities.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61427/diff/1/ > > > Testing > --- > > Unit tests and third party tests were run. > > > Thanks, > > Szabolcs Vasas > >
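The fix named in the title follows a common save/restore pattern for the thread context ClassLoader. A minimal sketch of that pattern is below; the class and method names are hypothetical, not the actual Sqoop code:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Sketch of the try/finally save-and-restore pattern the patch title
// describes. ClassLoaderRestoreSketch and runWithRestoredClassLoader are
// illustrative names, not identifiers from SqoopHCatUtilities.
public class ClassLoaderRestoreSketch {
    static String runWithRestoredClassLoader() {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        try {
            // Test logic may swap in a job-specific ClassLoader here.
            current.setContextClassLoader(new URLClassLoader(new URL[0], original));
            return "ran";
        } finally {
            // Restore unconditionally so later tests see the original loader.
            current.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        ClassLoader before = Thread.currentThread().getContextClassLoader();
        runWithRestoredClassLoader();
        if (before != Thread.currentThread().getContextClassLoader()) {
            throw new AssertionError("loader not restored");
        }
        System.out.println("restored");
    }
}
```

Without the finally block, a test that fails mid-run would leak its ClassLoader into every test executed after it in the same JVM.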
Re: Review Request 61669: Test HBase kerberized connectivity
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61669/#review183203 --- src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java Line 151 (original), 153 (patched) <https://reviews.apache.org/r/61669/#comment259217> I think moving the class variable to the beginning of the class would improve readability. - Zoltán Tóth On Aug. 15, 2017, 6:56 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61669/ > --- > > (Updated Aug. 15, 2017, 6:56 p.m.) > > > Review request for Sqoop, Boglarka Egyed, Ferenc Szabo, and Zoltán Tóth. > > > Bugs: SQOOP-3222 > https://issues.apache.org/jira/browse/SQOOP-3222 > > > Repository: sqoop-trunk > > > Description > --- > > In this patch I have changed the following: > - Added test dependency on hadoop-minikdc. > - Added a JUnit rule which starts/stops kerberos MiniKdc before/after a test > case/class. > - Added kerberos handling logic to HBaseTestCase and refactored it a bit. > - Removed the kerberos-related properties from the build.xml as they caused > HBaseKerberizedConnectivityTest to fail. 
> > The changes are inspired by the following HBase test classes: > https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/SecureTestCluster.java > https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestGenerateDelegationToken.java > > > HBase security documentation: > http://hbase.apache.org/1.2/book.html#security > > > Diffs > - > > build.xml 5f02dcf7759887d84d8cf0505cc1873c53f70a67 > ivy.xml e4b45bfd9ff6d984a1d1d1808855a07d8b090921 > src/test/com/cloudera/sqoop/hbase/HBaseKerberizedConnectivityTest.java > PRE-CREATION > src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java > d9f74952e5f9dd9497e6e9e99789471bcd8f8930 > > src/test/org/apache/sqoop/infrastructure/kerberos/KerberosConfigurationProvider.java > PRE-CREATION > > src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructure.java > PRE-CREATION > > src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61669/diff/1/ > > > Testing > --- > > Ran unit tests and third party tests. > > > Thanks, > > Szabolcs Vasas > >
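The description mentions a JUnit rule that starts/stops the Kerberos MiniKdc around each test case/class. A minimal, self-contained sketch of that before/after lifecycle is below, with a hand-rolled stand-in (FakeKdc, applyRule) instead of the real JUnit TestRule and hadoop-minikdc classes, so the snippet runs without those dependencies:

```java
import java.util.concurrent.Callable;

// Hand-rolled stand-in for the JUnit-rule lifecycle the patch describes:
// start the infrastructure, run the wrapped test body, and always stop it
// afterwards. FakeKdc and applyRule are illustrative, not the actual
// MiniKdcInfrastructureRule code.
public class MiniKdcRuleSketch {
    static class FakeKdc {
        boolean running;
        void start() { running = true; }
        void stop() { running = false; }
    }

    static final FakeKdc KDC = new FakeKdc();

    static <T> T applyRule(Callable<T> testBody) throws Exception {
        KDC.start();                 // rule setup, like before()/apply()
        try {
            return testBody.call();  // the actual test case
        } finally {
            KDC.stop();              // teardown runs even if the test throws
        }
    }

    public static void main(String[] args) throws Exception {
        String result = applyRule(() -> KDC.running ? "kdc-up" : "kdc-down");
        System.out.println(result + ", after=" + KDC.running);
    }
}
```

In the real patch the same shape is presumably expressed through JUnit's TestRule/@ClassRule mechanism, which guarantees the stop() in the finally block runs regardless of test outcome.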
Re: Review Request 61669: Test HBase kerberized connectivity
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61669/#review183204 --- src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java Lines 46 (patched) <https://reviews.apache.org/r/61669/#comment259220> I think you can use Files.createFile(Path, ...) so that you only use the NIO package. It is also a matter of taste, so leaving it as it is would be fine too. src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java Lines 61 (patched) <https://reviews.apache.org/r/61669/#comment259218> The method name is plural but miniKdc.createPrincipal is singular. I think it would be better to use the singular form instead. src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java Lines 71 (patched) <https://reviews.apache.org/r/61669/#comment259219> Maybe you can change it into one line, or just move it into the createPrincipals method, because you use the same try-catch block for any error. It is a matter of taste, so choose whichever you prefer. - Zoltán Tóth On Aug. 15, 2017, 6:56 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61669/ > --- > > (Updated Aug. 15, 2017, 6:56 p.m.) > > > Review request for Sqoop, Boglarka Egyed, Ferenc Szabo, and Zoltán Tóth. > > > Bugs: SQOOP-3222 > https://issues.apache.org/jira/browse/SQOOP-3222 > > > Repository: sqoop-trunk > > > Description > --- > > In this patch I have changed the following: > - Added test dependency on hadoop-minikdc. > - Added a JUnit rule which starts/stops kerberos MiniKdc before/after a test > case/class. > - Added kerberos handling logic to HBaseTestCase and refactored it a bit. > - Removed the kerberos-related properties from the build.xml as they caused > HBaseKerberizedConnectivityTest to fail. 
> > The changes are inspired by the following HBase test classes: > https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/SecureTestCluster.java > https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestGenerateDelegationToken.java > > > HBase security documentation: > http://hbase.apache.org/1.2/book.html#security > > > Diffs > - > > build.xml 5f02dcf7759887d84d8cf0505cc1873c53f70a67 > ivy.xml e4b45bfd9ff6d984a1d1d1808855a07d8b090921 > src/test/com/cloudera/sqoop/hbase/HBaseKerberizedConnectivityTest.java > PRE-CREATION > src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java > d9f74952e5f9dd9497e6e9e99789471bcd8f8930 > > src/test/org/apache/sqoop/infrastructure/kerberos/KerberosConfigurationProvider.java > PRE-CREATION > > src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructure.java > PRE-CREATION > > src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61669/diff/1/ > > > Testing > --- > > Ran unit tests and third party tests. > > > Thanks, > > Szabolcs Vasas > >
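The first review comment above suggests staying inside java.nio.file rather than mixing java.io.File with NIO paths. A small sketch of that suggestion follows; createKeytab and the file name are hypothetical, not the code under review:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrates the reviewer's suggestion: use only the NIO package
// (Files.createFile and friends) to create a file. The method and the
// "sqoop.keytab" name are illustrative, not from MiniKdcInfrastructureRule.
public class NioCreateFileSketch {
    static Path createKeytab(Path dir) throws IOException {
        Files.createDirectories(dir);            // ensure the parent exists
        Path keytab = dir.resolve("sqoop.keytab");
        if (Files.notExists(keytab)) {
            Files.createFile(keytab);            // pure-NIO file creation
        }
        return keytab;
    }

    static boolean createAndCheck() throws IOException {
        Path tmp = Files.createTempDirectory("minikdc-test");
        return Files.exists(createKeytab(tmp));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(createAndCheck());
    }
}
```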
Re: Review Request 62574: Document how to run third party tests manually with databases running in docker
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62574/#review187089 --- Ship it! Hey Szabi, That is a great improvement. I really appreciate your contribution. Now it is really easy to run the tests. Cheers, Zoli - Zoltán Tóth On Oct. 4, 2017, 7:49 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62574/ > --- > > (Updated Oct. 4, 2017, 7:49 a.m.) > > > Review request for Sqoop and Zach Berkowitz. > > > Bugs: SQOOP-3229 > https://issues.apache.org/jira/browse/SQOOP-3229 > > > Repository: sqoop-trunk > > > Description > --- > > Documentation is added to COMPILING.txt, docker-compose file and helper > scripts are added. > > > Diffs > - > > COMPILING.txt eb69b220ecdaedd4d279259aba92df143482145a > scripts/thirdpartytest/docker-compose/db2scripts/db2entrypoint.sh > PRE-CREATION > scripts/thirdpartytest/docker-compose/oraclescripts/healthcheck.sh > PRE-CREATION > > scripts/thirdpartytest/docker-compose/oraclescripts/startup/oracleusersetup.sql > PRE-CREATION > scripts/thirdpartytest/docker-compose/sqoop-thirdpartytest-db-services.yml > PRE-CREATION > scripts/thirdpartytest/start-thirdpartytest-db-containers.sh PRE-CREATION > scripts/thirdpartytest/stop-thirdpartytest-db-containers.sh PRE-CREATION > > > Diff: https://reviews.apache.org/r/62574/diff/3/ > > > Testing > --- > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 62028: Remove Sqoop dependency on deprecated HBase APIs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62028/#review184973 --- Hey Szabi, Thanks for the contribution, especially for the good test coverage. I added some comments, please review them and decide how to proceed. Cheers, Zoli src/java/org/apache/sqoop/hbase/HBasePutProcessor.java Lines 180 (patched) <https://reviews.apache.org/r/62028/#comment261198> You can make it a one-liner if you want to, but it is a question of taste: mutation != null && (mutation instanceof Put || mutation instanceof Delete) src/java/org/apache/sqoop/hbase/HBasePutProcessor.java Line 163 (original), 190 (patched) <https://reviews.apache.org/r/62028/#comment261200> The method declares throws IOException but we catch it inside the method. Either remove the throws clause from the method signature or propagate the exception to the caller. src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java Lines 97 (patched) <https://reviews.apache.org/r/62028/#comment261201> Exception is too general. Please use a specific exception here. src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java Lines 98 (patched) <https://reviews.apache.org/r/62028/#comment261202> Don't you want to log if it cannot get the HBase table? Maybe it is logged at a higher level. src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java Lines 130 (patched) <https://reviews.apache.org/r/62028/#comment261203> Can you change the general exception into specific ones? src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java Lines 149 (patched) <https://reviews.apache.org/r/62028/#comment261204> If the error message is not necessary then you can let the exception propagate to the caller. - Zoltán Tóth On Sept. 1, 2017, 9:30 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62028/ > --- > > (Updated Sept. 1, 2017, 9:30 a.m.) > > > Review request for Sqoop. 
> > > Bugs: SQOOP-3232 > https://issues.apache.org/jira/browse/SQOOP-3232 > > > Repository: sqoop-trunk > > > Description > --- > > Sqoop currently depends on pre HBase 1.0 APIs that have been deprecated and > will be removed in the HBase 2.0 release. > The task is to remove the dependency on these old APIs to make sure that the > upgrade to a newer HBase version will be easier in the future. > > > Diffs > - > > src/java/org/apache/sqoop/hbase/HBasePutProcessor.java > 032fd38ad0ff13372ae70be47e38db8c4ba8ef8f > src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java > 2bbfffe03844517da9d0d7c94380a8fb57c5eb29 > src/java/org/apache/sqoop/mapreduce/HBaseImportJob.java > 523d0a7ede70e16b4e80f8349f08c67eba2e4d01 > src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java > 8b29b5fdb00e223a4f2af14b8a9cbfd9ba9d7d83 > src/test/org/apache/sqoop/hbase/TestHBasePutProcessor.java PRE-CREATION > > > Diff: https://reviews.apache.org/r/62028/diff/1/ > > > Testing > --- > > Added a new test case for the small refactoring I did. > Ran all unit and third party tests successfully. > > > Thanks, > > Szabolcs Vasas > >
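The first review comment above proposes collapsing a null check plus two instanceof checks into a single expression. A self-contained sketch using local stub classes (Put/Delete/Increment here are stand-ins, not the HBase client types):

```java
// Illustrates the reviewer's suggested one-liner. The nested classes are
// local stubs standing in for org.apache.hadoop.hbase.client types so the
// snippet compiles without an HBase dependency.
public class MutationCheckSketch {
    static class Mutation {}
    static class Put extends Mutation {}
    static class Delete extends Mutation {}
    static class Increment extends Mutation {}

    // Null check and both instanceof checks in one expression, as suggested.
    static boolean canAccept(Mutation mutation) {
        return mutation != null
            && (mutation instanceof Put || mutation instanceof Delete);
    }

    public static void main(String[] args) {
        System.out.println(canAccept(new Put()));       // true
        System.out.println(canAccept(new Increment())); // false
        System.out.println(canAccept(null));            // false: && short-circuits
    }
}
```

Because && short-circuits, the instanceof tests are never evaluated on a null reference, so the one-liner is safe.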
Re: Review Request 61669: Test HBase kerberized connectivity
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61669/#review184139 --- Ship it! Hey Szabolcs, Thanks for your contribution. I ran the unit and integration tests to verify that your changes run smoothly, and they did. Cheers, Zoli - Zoltán Tóth On Aug. 30, 2017, 8:17 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61669/ > --- > > (Updated Aug. 30, 2017, 8:17 a.m.) > > > Review request for Sqoop, Boglarka Egyed, Ferenc Szabo, and Zoltán Tóth. > > > Bugs: SQOOP-3222 > https://issues.apache.org/jira/browse/SQOOP-3222 > > > Repository: sqoop-trunk > > > Description > --- > > In this patch I have changed the following: > - Added test dependency on hadoop-minikdc. > - Added a JUnit rule which starts/stops kerberos MiniKdc before/after a test > case/class. > - Added kerberos handling logic to HBaseTestCase and refactored it a bit. > - Removed the kerberos-related properties from the build.xml as they caused > HBaseKerberizedConnectivityTest to fail. 
> > The changes are inspired by the following HBase test classes: > https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/SecureTestCluster.java > https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestGenerateDelegationToken.java > > > HBase security documentation: > http://hbase.apache.org/1.2/book.html#security > > > Diffs > - > > build.xml 5f02dcf7759887d84d8cf0505cc1873c53f70a67 > ivy.xml e4b45bfd9ff6d984a1d1d1808855a07d8b090921 > src/test/com/cloudera/sqoop/hbase/HBaseKerberizedConnectivityTest.java > PRE-CREATION > src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java > d9f74952e5f9dd9497e6e9e99789471bcd8f8930 > > src/test/org/apache/sqoop/infrastructure/kerberos/KerberosConfigurationProvider.java > PRE-CREATION > > src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructure.java > PRE-CREATION > > src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61669/diff/5/ > > > Testing > --- > > Ran unit tests and third party tests. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 62028: Remove Sqoop dependency on deprecated HBase APIs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62028/#review185068 --- Ship it! Ship It! - Zoltán Tóth On Sept. 11, 2017, 9:58 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62028/ > --- > > (Updated Sept. 11, 2017, 9:58 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3232 > https://issues.apache.org/jira/browse/SQOOP-3232 > > > Repository: sqoop-trunk > > > Description > --- > > Sqoop currently depends on pre HBase 1.0 APIs that have been deprecated and > will be removed in the HBase 2.0 release. > The task is to remove the dependency on these old APIs to make sure that the > upgrade to a newer HBase version will be easier in the future. > > > Diffs > - > > src/java/org/apache/sqoop/hbase/HBasePutProcessor.java > 032fd38ad0ff13372ae70be47e38db8c4ba8ef8f > src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java > 2bbfffe03844517da9d0d7c94380a8fb57c5eb29 > src/java/org/apache/sqoop/mapreduce/HBaseImportJob.java > 523d0a7ede70e16b4e80f8349f08c67eba2e4d01 > src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java > 8b29b5fdb00e223a4f2af14b8a9cbfd9ba9d7d83 > src/test/org/apache/sqoop/hbase/TestHBasePutProcessor.java PRE-CREATION > > > Diff: https://reviews.apache.org/r/62028/diff/2/ > > > Testing > --- > > Added a new test case for the small refactoring I did. > Ran all unit and third party tests successfully. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loses precision for large numbers that do not fit into double
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62057/ --- (Updated Sept. 12, 2017, 12:19 p.m.) Review request for Sqoop, Boglarka Egyed and Anna Szonyi. Changes --- Change updated based on review Bugs: SQOOP-3014 https://issues.apache.org/jira/browse/SQOOP-3014 Repository: sqoop-trunk Description --- HCatalog rounded BigDecimals but that should not happen. Now Sqoop HCatalog doesn't change BigDecimals Diffs (updated) - src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java aba2458e src/test/org/apache/sqoop/hcat/HCatalogImportTest.java d784a205 src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java PRE-CREATION Diff: https://reviews.apache.org/r/62057/diff/3/ Changes: https://reviews.apache.org/r/62057/diff/2-3/ Testing --- I ran unit tests and integration tests as well. New test cases were added to test the change Thanks, Zoltán Tóth
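The description says HCatalog was rounding BigDecimals and the fix is to pass them through unchanged. The precision loss is inherent to routing a large decimal through a double, which the following self-contained sketch demonstrates (the helper name is illustrative, not from SqoopHCatImportHelper):

```java
import java.math.BigDecimal;

// Shows why squeezing a large decimal through double loses digits, which
// is the behaviour SQOOP-3014 removes: the fixed code keeps the value as a
// BigDecimal end to end instead of converting it.
public class BigDecimalPrecisionSketch {
    // Returns true iff the value survives a BigDecimal -> double -> BigDecimal
    // round trip with no change.
    static boolean survivesDoubleRoundTrip(String decimalLiteral) {
        BigDecimal original = new BigDecimal(decimalLiteral);
        BigDecimal viaDouble = BigDecimal.valueOf(original.doubleValue());
        return original.compareTo(viaDouble) == 0;
    }

    public static void main(String[] args) {
        // A double holds ~15-17 significant decimal digits, so this is lost:
        System.out.println(survivesDoubleRoundTrip("12345678901234567890.123456789"));
        // A small, exactly representable value survives:
        System.out.println(survivesDoubleRoundTrip("1.5"));
    }
}
```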
Re: Review Request 61522: SQOOP-2907 : Export parquet files to RDBMS: don't require .metadata for parquet files
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61522/#review186844 --- Hey Sandish, Thanks for your contribution. I am pretty sure your work will save a lot of headaches for a lot of people. I had some findings, so I added my comments to the code. Please read them and reply with your comments/ideas. Cheers, Zoli src/java/org/apache/sqoop/avro/AvroUtil.java Lines 194 (patched) <https://reviews.apache.org/r/61522/#comment263683> If this is only Parquet file related, why is this change in AvroUtil needed? I haven't checked why it is necessary, so please explain. src/java/org/apache/sqoop/avro/AvroUtil.java Lines 195 (patched) <https://reviews.apache.org/r/61522/#comment263684> If this part is needed, you can use the Java Charset class to load UTF-8 so it won't throw an IOException. src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java Lines 100 (patched) <https://reviews.apache.org/r/61522/#comment263688> It is called twice here. Is that intentional? src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java Lines 107 (patched) <https://reviews.apache.org/r/61522/#comment263690> Why do you use two log.warn calls here? You can merge them into one. src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java Lines 111 (patched) <https://reviews.apache.org/r/61522/#comment263689> Why not use the catch block for this code snippet? src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java Lines 120 (patched) <https://reviews.apache.org/r/61522/#comment263687> I think this shouldn't be public. If you use it only inside this class, make it private. If you would like to use it outside this class, a getter would make more sense. It should also be near the beginning of the class because it is a static final variable. You can initialize it in a static block. src/test/com/cloudera/sqoop/TestParquetExport.java Lines 165 (patched) <https://reviews.apache.org/r/61522/#comment263678> This method is almost a copy of the previous one. 
Please avoid duplication and extract the common parts into one method. Please also change the method comment, because it is currently the same even though the two methods do different things. The fileNum parameter is not used in the method, please remove it. src/test/com/cloudera/sqoop/TestParquetExport.java Lines 226 (patched) <https://reviews.apache.org/r/61522/#comment263680> Please do not copy and paste code that is already in the class. It increases duplication, which makes the code harder to read and maintain later. The testSupportedParquetTypes() method also contains this part of the code. Extract it into a separate method such as createParquetTestFileContent(). src/test/com/cloudera/sqoop/TestParquetExport.java Lines 232 (patched) <https://reviews.apache.org/r/61522/#comment263679> We don't have an official formatter yet, but I think a 120-column line length is better than 80. src/test/com/cloudera/sqoop/TestParquetExport.java Lines 475 (patched) <https://reviews.apache.org/r/61522/#comment263681> Another duplication src/test/com/cloudera/sqoop/TestParquetExport.java Lines 527 (patched) <https://reviews.apache.org/r/61522/#comment263685> Please specify the expected exception here. Your test should have exactly one passing outcome. Please avoid unnecessary use of general exceptions. Use an expected exception instead of try-catch; it makes the code more readable. src/test/com/cloudera/sqoop/TestParquetExport.java Lines 614 (patched) <https://reviews.apache.org/r/61522/#comment263686> Same here. Please specify the exception and use an expected exception instead of try-catch blocks. - Zoltán Tóth On Aug. 9, 2017, 10:53 a.m., Sandish Kumar HN wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61522/ > --- > > (Updated Aug. 9, 2017, 10:53 a.m.) > > > Review request for Sqoop and Anna Szonyi. 
> > > Bugs: SQOOP-2907 > https://issues.apache.org/jira/browse/SQOOP-2907 > > > Repository: sqoop-trunk > > > Description > --- > > Kite currently requires .metadata. > Parquet files have their own metadata stored along data files. > It would be great for Export operation on parquet files to RDBMS not to > require .metadata. > We have most of the files created by Spark and Hive, and they don't create > .metadata, it only Kite that does. > It makes sqoop export of parquet files usability very limited. > > > Diffs > - > > src/java/org/apache/sqoo
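The review above repeatedly asks for "expected exception instead of try-catch" in tests. JUnit 4 spells this @Test(expected = SomeException.class); the plain-Java sketch below captures the same idea without a JUnit dependency (assertThrows here is a hand-rolled illustration, not the JUnit 5 method):

```java
// Plain-Java illustration of the "expect exactly one exception type"
// pattern the reviewer asks for. In a real JUnit 4 test this would be
// @Test(expected = IOException.class) on the test method instead.
public class ExpectedExceptionSketch {
    static <T extends Throwable> T assertThrows(Class<T> expected, Runnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);   // exactly the type we wanted
            }
            throw new AssertionError("wrong exception type: " + t, t);
        }
        throw new AssertionError("expected " + expected.getName()
            + " but nothing was thrown");
    }

    public static void main(String[] args) {
        IllegalArgumentException e = assertThrows(IllegalArgumentException.class,
            () -> { throw new IllegalArgumentException("bad schema"); });
        System.out.println(e.getMessage());
    }
}
```

Compared with a bare try-catch, this fails loudly both when no exception is thrown and when the wrong type is thrown, which is what gives the test a single passing outcome.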
Review Request 61777: sqoop tries to re-execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61777/ --- Review request for Sqoop. Bugs: SQOOP-3139 https://issues.apache.org/jira/browse/SQOOP-3139 Repository: sqoop-trunk Description --- If the database table name and the split-by parameter differed in case (e.g. Mycol vs. mycol), Sqoop couldn't continue the query from the last value when the connection was broken. Diffs - src/java/org/apache/sqoop/mapreduce/db/DBRecordReader.java a78eb061 src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java 9a3621b0 src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java PRE-CREATION Diff: https://reviews.apache.org/r/61777/diff/1/ Testing --- Thanks, Zoltán Tóth
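The bug described above is a case mismatch between the table metadata column ("Mycol") and the --split-by argument ("mycol"). A hypothetical sketch of the case-insensitive matching such a fix needs; resolveSplitColumn is illustrative, not the actual SQLServerDBRecordReader code:

```java
// Hypothetical sketch of case-insensitive split-by resolution: match the
// user-supplied column name against the database metadata ignoring case,
// then use the database's own casing when building the resume query.
public class SplitColumnMatchSketch {
    static String resolveSplitColumn(String[] tableColumns, String splitBy) {
        for (String col : tableColumns) {
            if (col.equalsIgnoreCase(splitBy)) {
                return col; // keep the database's casing in generated SQL
            }
        }
        throw new IllegalArgumentException("unknown split-by column: " + splitBy);
    }

    public static void main(String[] args) {
        String resolved = resolveSplitColumn(new String[]{"Id", "Mycol"}, "mycol");
        System.out.println(resolved);
    }
}
```

With an exact (case-sensitive) comparison the lookup fails for "mycol", the last-value bookkeeping never engages, and the reader restarts the query from the beginning, producing the duplicate records the title describes.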
Re: Review Request 61777: sqoop tries to re-execute select query during import in case of a connection reset error and this is causing lots of duplicate records from source
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61777/ --- (Updated Aug. 21, 2017, 1:15 p.m.) Review request for Sqoop. Changes --- Based on the code review, the function was separated into different methods with meaningful names Bugs: SQOOP-3139 https://issues.apache.org/jira/browse/SQOOP-3139 Repository: sqoop-trunk Description --- If the database table name and the split-by parameter differed in case (e.g. Mycol vs. mycol), Sqoop couldn't continue the query from the last value when the connection was broken. Diffs (updated) - src/java/org/apache/sqoop/mapreduce/db/DBRecordReader.java a78eb061 src/java/org/apache/sqoop/mapreduce/db/SQLServerDBRecordReader.java 9a3621b0 src/test/org/apache/sqoop/mapreduce/db/TestSQLServerDBRecordReader.java PRE-CREATION Diff: https://reviews.apache.org/r/61777/diff/2/ Changes: https://reviews.apache.org/r/61777/diff/1-2/ Testing --- Thanks, Zoltán Tóth
Re: Review Request 61933: ImportTest, ExportTest and TimestampDataTest fail because of column escaping problems
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61933/#review184035 --- Ship it! Hey Szabolcs, thanks for your contribution. - Zoltán Tóth On Aug. 28, 2017, 1:18 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61933/ > --- > > (Updated Aug. 28, 2017, 1:18 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3226 > https://issues.apache.org/jira/browse/SQOOP-3226 > > > Repository: sqoop-trunk > > > Description > --- > > It seems there were some changes in the Oracle escaping logic which broke > ImportTest, ExportTest and TimestampDataTest. Since these are third party > tests, ant clean test did not spot the problem earlier. > > > Diffs > - > > src/test/org/apache/sqoop/manager/oracle/ExportTest.java > 23b4c73ddeb8ba72477be2cdcebbdbc3373665f8 > src/test/org/apache/sqoop/manager/oracle/ImportTest.java > 0002128ff70b8159bbb560f3484a9cfdb0576a0e > src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java > 631e4f96fc7edc501faedde014d829d6190e58e5 > src/test/org/apache/sqoop/manager/oracle/TimestampDataTest.java > 1babf6cc7ff3e9a2bb616de9926e7c502b27b3a3 > > > Diff: https://reviews.apache.org/r/61933/diff/2/ > > > Testing > --- > > Executed unit and third party tests. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 62057: SQOOP-3014 Sqoop with HCatalog import loses precision for large numbers that do not fit into double
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62057/ --- (Updated Sept. 4, 2017, 3:26 p.m.) Review request for Sqoop and Boglarka Egyed. Bugs: SQOOP-3014 https://issues.apache.org/jira/browse/SQOOP-3014 Repository: sqoop-trunk Description --- HCatalog rounded BigDecimals but that should not happen. Now Sqoop HCatalog doesn't change BigDecimals Diffs - src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java aba2458e src/test/org/apache/sqoop/mapreduce/hcat/TestSqoopHCatImportHelper.java PRE-CREATION Diff: https://reviews.apache.org/r/62057/diff/1/ Testing --- I ran unit tests and integration tests as well. New test cases were added to test the change Thanks, Zoltán Tóth
Re: Review Request 59833: SQLServerDatatypeImportDelimitedFileTest can fail in some environments
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59833/#review184036 --- Ship it! Hey Szabolcs, I ran the unit and the third party tests successfully with your patch. Thanks for your contribution. Cheers, Zoli - Zoltán Tóth On June 6, 2017, 10:29 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59833/ > --- > > (Updated June 6, 2017, 10:29 a.m.) > > > Review request for Sqoop. > > > Bugs: sqoop-3195 > https://issues.apache.org/jira/browse/sqoop-3195 > > > Repository: sqoop-trunk > > > Description > --- > > I tried to execute the SQLServer third party tests with two different SQL > Server versions. With SQL Server 2014 Express edition all the tests were > successful however with SQL Server 2017 Developer edition I got the following > error: > [junit] Test > org.apache.sqoop.manager.sqlserver.SQLServerDatatypeImportDelimitedFileTest > FAILED > Failure for following Test Data : > FLOAT > SCALE : null > PREC : null > TO_INSERT : 1.7976931348623157 > DB_READBACK : 1.7976931348623155 > HDFS_READBACK : 1.7976931348623155 > NEG_POS_FLAG : POS > OFFSET : 8 > --- > Exception details : > expected a different string expected:<1.797693134862315[5]> but > was:<1.797693134862315[7]> > By looking at the test case I have found that it inserts 1.7976931348623157 > into the database but it expects 1.7976931348623155 (the last digit is 5 > instead of 7) probably because float is an approximate numeric data type on > MSSQL, and this is how it worked on earlier versions. > I suggest using a less precise float number in this test case to avoid > flakiness. > > > Diffs > - > > testdata/DatatypeTestData-import-lite.txt a4b5c75 > > > Diff: https://reviews.apache.org/r/59833/diff/1/ > > > Testing > --- > > I ran SQL server third party tests with MSSQL 2014 and MSSQL 2017 too. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
On Sept. 28, 2012, 9:59 a.m., Abhijeet Gaikwad wrote: src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java, line 329 https://reviews.apache.org/r/7135/diff/1/?file=155911#file155911line329 What happens when MYSQL_CHECKPOINT_SLEEP_KEY is greater than mapred.task.timeout? If the job is killed, we need to handle the scenario. That's a good point! Given that the default value of mapred.task.timeout is 600,000 ms (10 minutes) I consider this very unlikely; the ideal value of the new config key has an order of magnitude of a few hundred ms. However, in some extreme cases (or when clearly misusing this feature) it is possible that this case needs to be handled. Do you have any suggestion? For example, limiting sqoop.mysql.export.sleep.ms to a maximum of the value in mapred.task.timeout? - Zoltán --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/#review12019 --- On Sept. 27, 2012, 3:47 p.m., Zoltán Tóth-Czifra wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/ --- (Updated Sept. 27, 2012, 3:47 p.m.) Review request for Sqoop. Description --- Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604 The solution in short: Using the already existing checkpoint feature of the direct (--direct) MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that would simply make the thread sleep for X milliseconds at the checkpoints. With a low enough byte count limit this can be a simple yet powerful throttling mechanism. 
Diffs - src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88 Diff: https://reviews.apache.org/r/7135/diff/ Testing --- Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms: 33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec) 102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec) 51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec) 51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec) I did not add unit tests yet and as it involves calling Thread.sleep, I find testing this difficult. Unfortunately there is no machine or environment object that could be injected to these classes as mocks that could take care of time-related fixtures. Thanks, Zoltán Tóth-Czifra
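The throttling idea described above (sleep for a configured number of milliseconds at each byte-count checkpoint) can be sketched as a toy class. The config key names mirror sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms from the patch, but the class itself is illustrative, not MySQLExportMapper code:

```java
// Toy sketch of checkpoint throttling: every checkpointBytes written,
// pause for sleepMs. A sleep counter makes the behaviour observable
// without relying on wall-clock timing, which is the testing difficulty
// the author mentions.
public class CheckpointThrottleSketch {
    final long checkpointBytes; // like sqoop.mysql.export.checkpoint.bytes
    final long sleepMs;         // like sqoop.mysql.export.sleep.ms
    long bytesSinceCheckpoint;
    long sleeps;

    CheckpointThrottleSketch(long checkpointBytes, long sleepMs) {
        this.checkpointBytes = checkpointBytes;
        this.sleepMs = sleepMs;
    }

    void write(long numBytes) throws InterruptedException {
        bytesSinceCheckpoint += numBytes;
        if (bytesSinceCheckpoint >= checkpointBytes) {
            bytesSinceCheckpoint = 0;  // checkpoint: export chunk restarts
            sleeps++;
            Thread.sleep(sleepMs);     // the new throttling knob
        }
    }

    static long sleepsAfterWrites(long checkpointBytes, long chunk, int writes)
            throws InterruptedException {
        CheckpointThrottleSketch t = new CheckpointThrottleSketch(checkpointBytes, 0);
        for (int i = 0; i < writes; i++) t.write(chunk);
        return t.sleeps;
    }

    public static void main(String[] args) throws InterruptedException {
        // 10 writes of 25600 B with a 51200 B checkpoint -> a sleep every 2 writes.
        System.out.println(sleepsAfterWrites(51200, 25600, 10));
    }
}
```

This matches the measurements in the Testing section: smaller checkpoint sizes and longer sleeps produce proportionally lower effective throughput.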
Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/ --- (Updated Oct. 2, 2012, 10:08 a.m.) Review request for Sqoop. Changes --- You are right, I was in a hurry and here is the result. Anyway, I attach the fixed patch. Compiled with no checkstyle warnings. Output of test: 2012-10-02 12:03:17,575 WARN com.cloudera.sqoop.mapreduce.MySQLExportMapper: Value for sqoop.mysql.export.sleep.ms has to be smaller than mapred.task.timeout Description --- Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604 The solution in short: Using the already existing checkpoint feature of the direct (--direct) MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that would simply make the thread sleep for X milliseconds at the checkpoints. With a low enough byte count limit this can be a simple yet powerful throttling mechanism. Diffs (updated) - src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88 Diff: https://reviews.apache.org/r/7135/diff/ Testing --- Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms: 33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec) 102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec) 51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec) 51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec) I did not add unit tests yet and as it involves calling Thread.sleep, I find testing this difficult. Unfortunately there is no machine or environment object that could be injected to these classes as mocks that could take care of time-related fixtures. Thanks, Zoltán Tóth-Czifra
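The warning quoted above suggests the updated patch guards against a sleep value exceeding the task timeout. A hedged sketch of what such a guard might look like; the method name and the return-zero fallback are assumptions, only the warning text and the two key names come from the review:

```java
// Sketch of a guard like the one the logged warning implies: if the
// per-checkpoint sleep reaches mapred.task.timeout, the MR framework
// would kill the task mid-sleep, so ignore the configured value. The
// effectiveSleepMs helper is hypothetical, not the actual patch code.
public class SleepTimeoutGuardSketch {
    static long effectiveSleepMs(long requestedSleepMs, long taskTimeoutMs) {
        if (requestedSleepMs >= taskTimeoutMs) {
            System.err.println("Value for sqoop.mysql.export.sleep.ms has to be "
                + "smaller than mapred.task.timeout; ignoring it");
            return 0L; // fall back to no throttling rather than a killed task
        }
        return requestedSleepMs;
    }

    public static void main(String[] args) {
        System.out.println(effectiveSleepMs(500, 600000));    // within the timeout
        System.out.println(effectiveSleepMs(700000, 600000)); // rejected, warned
    }
}
```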
Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/ --- (Updated Oct. 2, 2012, 4:08 p.m.) Review request for Sqoop. Changes --- Sorry! Description --- Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604 The solution in short: Using the already existing checkpoint feature of the direct (--direct) MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that would simply make the thread sleep for X milliseconds at the checkpoints. With a low enough byte count limit this can be a simple yet powerful throttling mechanism. Diffs (updated) - src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88 Diff: https://reviews.apache.org/r/7135/diff/ Testing --- Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms: 33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec) 102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec) 51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec) 51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec) I did not add unit tests yet and as it involves calling Thread.sleep, I find testing this difficult. Unfortunately there is no machine or environment object that could be injected to these classes as mocks that could take care of time-related fixtures. Thanks, Zoltán Tóth-Czifra
Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/ ---
(Updated Oct. 4, 2012, 12:25 p.m.)
Review request for Sqoop.

Changes
---
Sorry, my mistake.

Description
---
Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604
The solution in short: using the already existing checkpoint feature of the direct (--direct) MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that simply makes the thread sleep for X milliseconds at the checkpoints. With a low enough byte count limit this can be a simple yet powerful throttling mechanism.

Diffs (updated)
-
src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88
Diff: https://reviews.apache.org/r/7135/diff/

Testing
---
Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms:
33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
I have not added unit tests yet; as the feature involves calls to Thread.sleep, I find it difficult to test. Unfortunately there is no machine or environment object that could be injected into these classes as a mock to take care of time-related fixtures.

Thanks,
Zoltán Tóth-Czifra
Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/ ---
(Updated Nov. 2, 2012, 12:32 p.m.)
Review request for Sqoop.

Changes
---
Sure thing, I'm sorry. Checkstyle passes now with my changes.

Description
---
Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604
The solution in short: using the already existing checkpoint feature of the direct (--direct) MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that simply makes the thread sleep for X milliseconds at the checkpoints. With a low enough byte count limit this can be a simple yet powerful throttling mechanism.

Diffs (updated)
-
src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88
Diff: https://reviews.apache.org/r/7135/diff/

Testing
---
Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms:
33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
I have not added unit tests yet; as the feature involves calls to Thread.sleep, I find it difficult to test. Unfortunately there is no machine or environment object that could be injected into these classes as a mock to take care of time-related fixtures.

Thanks,
Zoltán Tóth-Czifra
Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
On Nov. 3, 2012, 5:18 a.m., Abhijeet Gaikwad wrote:
Looks good :)
ant checkstyle - no errors
ant test - success

Thank you for your help, Abhijeet!
- Zoltán

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/#review13075 ---
On Nov. 2, 2012, 12:32 p.m., Zoltán Tóth-Czifra wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7135/ ---
(Updated Nov. 2, 2012, 12:32 p.m.)
Review request for Sqoop.

Description
---
Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604
The solution in short: using the already existing checkpoint feature of the direct (--direct) MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that simply makes the thread sleep for X milliseconds at the checkpoints. With a low enough byte count limit this can be a simple yet powerful throttling mechanism.

Diffs
-
src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88
Diff: https://reviews.apache.org/r/7135/diff/

Testing
---
Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms:
33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
I have not added unit tests yet; as the feature involves calls to Thread.sleep, I find it difficult to test. Unfortunately there is no machine or environment object that could be injected into these classes as a mock to take care of time-related fixtures.

Thanks,
Zoltán Tóth-Czifra
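The checkpoint-plus-sleep idea described in the review can be sketched as follows. This is a simplified, self-contained illustration of the technique, not the actual MySQLExportMapper code; the class and method names (ThrottledWriter, recordWrite) are invented for the example, and the real mapper restarts the mysqlimport process at each checkpoint rather than merely counting bytes.

```java
import java.util.concurrent.TimeUnit;

// Sketch of byte-count-based throttling: after every checkpointBytes bytes
// written, a checkpoint is taken and the writing thread sleeps for sleepMs
// milliseconds, capping the effective export throughput.
public class ThrottledWriter {
    private final long checkpointBytes; // cf. sqoop.mysql.export.checkpoint.bytes
    private final long sleepMs;         // cf. sqoop.mysql.export.sleep.ms
    private long bytesSinceCheckpoint = 0;
    private int checkpointCount = 0;

    public ThrottledWriter(long checkpointBytes, long sleepMs) {
        this.checkpointBytes = checkpointBytes;
        this.sleepMs = sleepMs;
    }

    // Called for every chunk written to the export stream.
    public void recordWrite(long numBytes) throws InterruptedException {
        bytesSinceCheckpoint += numBytes;
        if (checkpointBytes > 0 && bytesSinceCheckpoint >= checkpointBytes) {
            // Checkpoint reached: reset the counter and pause the thread.
            checkpointCount++;
            bytesSinceCheckpoint = 0;
            if (sleepMs > 0) {
                TimeUnit.MILLISECONDS.sleep(sleepMs);
            }
        }
    }

    public int getCheckpointCount() { return checkpointCount; }

    public static void main(String[] args) throws InterruptedException {
        // Tiny values for demonstration: checkpoint every 1024 bytes, sleep 1 ms.
        ThrottledWriter w = new ThrottledWriter(1024, 1);
        for (int i = 0; i < 10; i++) {
            w.recordWrite(512); // 10 writes of 512 B = 5120 B -> 5 checkpoints
        }
        System.out.println("checkpoints=" + w.getCheckpointCount()); // prints checkpoints=5
    }
}
```

This also shows why the review thread notes the feature is hard to unit test: the sleep goes straight to the wall clock, and with no injectable clock or environment object the test can only assert on checkpoint counts, not on timing.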
Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7880/ ---
Review request for Sqoop.

Description
---
Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.

Diffs
-
src/docs/user/compatibility.txt 3576fd7
Diff: https://reviews.apache.org/r/7880/diff/

Testing
---
Converted to XML with asciidoc, the affected part:

<simpara>Sometimes you need to export large data with Sqoop to a live MySQL cluster that is under a high load serving random queries from the users of our product. While data consistency issues during the export can be easily solved with a staging table, there is still a problem: the performance impact caused by the heavy export.</simpara>
<simpara>First off, the resources of MySQL dedicated to the import process can affect the performance of the live product, both on the master and on the slaves. Second, even if the servers can handle the import with no significant performance impact (mysqlimport should be relatively cheap), importing big tables can cause serious replication lag in the cluster risking data inconsistency.</simpara>
<simpara>With <literal>-D sqoop.mysql.export.sleep.ms=time</literal>, where <emphasis>time</emphasis> is a value in milliseconds, you can let the server relax between checkpoints and the replicas catch up by pausing the export process after transferring the number of bytes specified in <literal>sqoop.mysql.export.checkpoint.bytes</literal>. Experiment with different settings of these two parameters to achieve an export pace that doesn&#8217;t endanger the stability of your MySQL cluster.</simpara>
<important><simpara>Note that any arguments to Sqoop that are of the form <literal>-D parameter=value</literal> are Hadoop <emphasis>generic arguments</emphasis> and must appear before any tool-specific arguments (for example, <literal>--connect</literal>, <literal>--table</literal>, etc). Don&#8217;t forget that these parameters only work with the <literal>--direct</literal> flag set.</simpara></important>

Thanks,
Zoltán Tóth-Czifra
Re: Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7880/ ---
(Updated Nov. 8, 2012, 6:33 p.m.)
Review request for Sqoop.

Changes
---
Thank you for the suggestions! All of them make sense to me, see new patch :)

Description
---
Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.

Diffs (updated)
-
src/docs/user/compatibility.txt 3576fd7
Diff: https://reviews.apache.org/r/7880/diff/

Testing
---
Converted to XML with asciidoc, the affected part:

<simpara>Sometimes you need to export large data with Sqoop to a live MySQL cluster that is under a high load serving random queries from the users of our product. While data consistency issues during the export can be easily solved with a staging table, there is still a problem: the performance impact caused by the heavy export.</simpara>
<simpara>First off, the resources of MySQL dedicated to the import process can affect the performance of the live product, both on the master and on the slaves. Second, even if the servers can handle the import with no significant performance impact (mysqlimport should be relatively cheap), importing big tables can cause serious replication lag in the cluster risking data inconsistency.</simpara>
<simpara>With <literal>-D sqoop.mysql.export.sleep.ms=time</literal>, where <emphasis>time</emphasis> is a value in milliseconds, you can let the server relax between checkpoints and the replicas catch up by pausing the export process after transferring the number of bytes specified in <literal>sqoop.mysql.export.checkpoint.bytes</literal>. Experiment with different settings of these two parameters to achieve an export pace that doesn&#8217;t endanger the stability of your MySQL cluster.</simpara>
<important><simpara>Note that any arguments to Sqoop that are of the form <literal>-D parameter=value</literal> are Hadoop <emphasis>generic arguments</emphasis> and must appear before any tool-specific arguments (for example, <literal>--connect</literal>, <literal>--table</literal>, etc). Don&#8217;t forget that these parameters only work with the <literal>--direct</literal> flag set.</simpara></important>

Thanks,
Zoltán Tóth-Czifra
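For reference, a full invocation using the documented parameters might look like the sketch below. The connection string, table name, and export directory are placeholder values; the checkpoint/sleep settings are taken from one of the test runs reported in the SQOOP-604 review.

```shell
# Hypothetical example: throttle a direct MySQL export by sleeping 500 ms
# after every 51200 bytes written. The -D generic arguments must appear
# before the tool-specific arguments, and --direct must be set.
sqoop export \
  -D sqoop.mysql.export.checkpoint.bytes=51200 \
  -D sqoop.mysql.export.sleep.ms=500 \
  --connect jdbc:mysql://db.example.com/exampledb \
  --table example_table \
  --export-dir /user/example/export-data \
  --direct
```

If the sleep value is set too high relative to the task timeout, the mapper logs the warning shown earlier in the SQOOP-604 thread ("Value for sqoop.mysql.export.sleep.ms has to be smaller than mapred.task.timeout").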