[jira] [Commented] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553713#comment-15553713 ]

Hudson commented on SQOOP-3021:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop23 #1263 (See [https://builds.apache.org/job/Sqoop-hadoop23/1263/])
SQOOP-3021: ClassWriter fails if a column name contains a backslash (maugli: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=0f13c474bf91f6609b9bdd7a1aa25b250f07e398])
* (add) src/test/org/apache/sqoop/manager/mysql/MySqlColumnEscapeImportTest.java
* (edit) src/test/com/cloudera/sqoop/ThirdPartyTests.java
* (add) src/test/org/apache/sqoop/manager/oracle/OracleColumnEscapeImportTest.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
* (edit) src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java
* (edit) src/test/com/cloudera/sqoop/manager/MySQLTestUtils.java

> ClassWriter fails if a column name contains a backslash character
> -
>
> Key: SQOOP-3021
> URL: https://issues.apache.org/jira/browse/SQOOP-3021
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.4.6
> Reporter: Szabolcs Vasas
> Assignee: Szabolcs Vasas
> Fix For: 1.4.7
>
> Attachments: SQOOP-3021.patch
>
>
> The following Sqoop command fails with a javac error:
> {code}
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query
> "select C1_INT,C4_VARCHAR20, REGEXP_REPLACE(TRIM(C4_VARCHAR20),'\:','!') from
> T1_IMPORT WHERE \$CONDITIONS" --target-dir regex_imp --delete-target-dir -m 1
> {code}
> The reason is that the REGEXP_REPLACE expression contains a backslash
> character which does not get escaped in ClassWriter, so an invalid string
> gets generated into the Java code.
> SQOOP-2864 solved this problem for the double quote character; we need to
> generalize that solution.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
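For illustration, the failure described in SQOOP-3021 can be reproduced in miniature: the column label (here the REGEXP_REPLACE expression) is written into a Java string literal in the generated class, and a raw `\:` is an illegal escape sequence for javac. The following is a minimal, hand-rolled sketch of escaping such a label before emitting it. The class and method names are hypothetical, and this is not the code from SQOOP-3021.patch; the review of the patch only says that it reuses an existing escaping solution from an open source library, which is not named in these messages.

```java
// Hypothetical sketch: escape a column label so it can be embedded
// in a Java string literal inside generated source code.
final class JavaLiteralEscapeSketch {

    // Returns the label with backslashes, double quotes and common
    // control characters replaced by their Java escape sequences.
    static String escapeForJavaLiteral(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // the SQOOP-3021 case
                case '"':  sb.append("\\\""); break; // the SQOOP-2864 case
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The expression from the failing import; it contains one backslash.
        String label = "REGEXP_REPLACE(TRIM(C4_VARCHAR20),'\\:','!')";
        // Emit a line of "generated" source that javac would accept.
        System.out.println("String columnLabel = \""
            + escapeForJavaLiteral(label) + "\";");
    }
}
```

Without the escaping step the emitted literal would contain `'\:'`, which javac rejects; with it the literal contains `'\\:'`.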
[jira] [Commented] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553689#comment-15553689 ]

Hudson commented on SQOOP-3021:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop200 #1065 (See [https://builds.apache.org/job/Sqoop-hadoop200/1065/])
SQOOP-3021: ClassWriter fails if a column name contains a backslash (maugli: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=0f13c474bf91f6609b9bdd7a1aa25b250f07e398])
* (add) src/test/org/apache/sqoop/manager/oracle/OracleColumnEscapeImportTest.java
* (edit) src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java
* (add) src/test/org/apache/sqoop/manager/mysql/MySqlColumnEscapeImportTest.java
* (edit) src/test/com/cloudera/sqoop/ThirdPartyTests.java
* (edit) src/test/com/cloudera/sqoop/manager/MySQLTestUtils.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
[jira] [Commented] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553683#comment-15553683 ]

Hudson commented on SQOOP-3021:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop20 #1061 (See [https://builds.apache.org/job/Sqoop-hadoop20/1061/])
SQOOP-3021: ClassWriter fails if a column name contains a backslash (maugli: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=0f13c474bf91f6609b9bdd7a1aa25b250f07e398])
* (edit) src/test/com/cloudera/sqoop/manager/MySQLTestUtils.java
* (add) src/test/org/apache/sqoop/manager/oracle/OracleColumnEscapeImportTest.java
* (edit) src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java
* (edit) src/test/com/cloudera/sqoop/ThirdPartyTests.java
* (add) src/test/org/apache/sqoop/manager/mysql/MySqlColumnEscapeImportTest.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553643#comment-15553643 ]

Attila Szabo commented on SQOOP-3022:
-

Hi [~Tagar],

I've checked the INSERT ALL statement, but to me it looks like syntactic sugar for writing multiple INSERTs, so I don't think it would make sense to use it for the current scenario (introducing new query generation is always a risk, and we don't expect any performance improvement from this change).

Do you have any docs that state the opposite of my conclusion?

Thanks!
[~maugli]

> sqoop export for Oracle generates tremendous amounts of redo logs
> -
>
> Key: SQOOP-3022
> URL: https://issues.apache.org/jira/browse/SQOOP-3022
> Project: Sqoop
> Issue Type: Bug
> Components: codegen, connectors, connectors/oracle
> Affects Versions: 1.4.3, 1.4.4, 1.4.5, 1.4.6
> Reporter: Ruslan Dautkhanov
> Labels: export, oracle
>
> Sqoop export for Oracle generates tremendous amounts of redo logs (comparable
> to export size or more).
> We have put target tables in NOLOGGING mode, but Oracle will still generate
> redo logs unless the +APPEND Oracle insert hint is used.
> See https://oracle-base.com/articles/misc/append-hint for examples.
> Please add an option for sqoop to generate insert statements in Oracle with
> the APPEND hint. Our databases are swamped with redo/archived logs whenever
> we sqoop data to them. This is easily avoidable. And from a business
> perspective, sqooping to staging tables in NOLOGGING mode is totally fine.
> Thank you.
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553631#comment-15553631 ]

Attila Szabo commented on SQOOP-3022:
-

Hi [~Tagar],

Regarding "ORA-12838: cannot read/modify an object after modifying it in parallel", you're absolutely right, and if you check the OraOop related sources you can see that we're doing exactly what you guessed: 1 transaction, many records. It is scalable, but one has to be aware that while the parallel write path is open, no one else should try to read/modify the object. To make it even more scalable, OraOop has a solution that creates a partitioned table, inserts into those partition tables, and at the end just moves the partitions under the table with easy-to-execute ALTER statements. It works quite well.

Regarding INSERT ALL: I've never tried it; I'll have to check it tomorrow.
[jira] [Commented] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553622#comment-15553622 ]

Hudson commented on SQOOP-3021:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop100 #1025 (See [https://builds.apache.org/job/Sqoop-hadoop100/1025/])
SQOOP-3021: ClassWriter fails if a column name contains a backslash (maugli: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=0f13c474bf91f6609b9bdd7a1aa25b250f07e398])
* (edit) src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java
* (edit) src/test/com/cloudera/sqoop/ThirdPartyTests.java
* (add) src/test/org/apache/sqoop/manager/mysql/MySqlColumnEscapeImportTest.java
* (add) src/test/org/apache/sqoop/manager/oracle/OracleColumnEscapeImportTest.java
* (edit) src/test/com/cloudera/sqoop/manager/MySQLTestUtils.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
[jira] [Closed] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Szabo closed SQOOP-3021.
---
[jira] [Resolved] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Szabo resolved SQOOP-3021.
-
Resolution: Fixed
[jira] [Commented] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553618#comment-15553618 ]

Attila Szabo commented on SQOOP-3021:
-

Hey [~vasas],

A very clean solution for an issue that is quite difficult to run into! Many thanks for your contribution.

[~maugli]
[jira] [Commented] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553612#comment-15553612 ]

ASF subversion and git services commented on SQOOP-3021:

Commit 0f13c474bf91f6609b9bdd7a1aa25b250f07e398 in sqoop's branch refs/heads/trunk from [~maugli]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=0f13c47 ]

SQOOP-3021: ClassWriter fails if a column name contains a backslash character

(Szabolcs Vasas via Attila Szabo)
Re: Review Request 52605: ClassWriter fails if a column name contains a backslash character
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52605/#review151739
---

Very nice work Szabi! Good idea to use an existing escaping solution from a trusted open source library, and you also provided nice improvements around testing.

- Attila Szabo


On Oct. 6, 2016, 3:01 p.m., Szabolcs Vasas wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52605/
> ---
>
> (Updated Oct. 6, 2016, 3:01 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-3021
> https://issues.apache.org/jira/browse/SQOOP-3021
>
>
> Repository: sqoop-trunk
>
>
> Description
> ---
>
> SQOOP-2864 solved this problem for the double quote character; I generalized
> that solution.
>
>
> Diffs
> -
>
> src/java/org/apache/sqoop/orm/ClassWriter.java 9d91887
> src/test/com/cloudera/sqoop/ThirdPartyTests.java 3103bd4
> src/test/com/cloudera/sqoop/manager/MySQLTestUtils.java b5b9b6e
> src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 802
> src/test/org/apache/sqoop/manager/mysql/MySqlColumnEscapeImportTest.java
> PRE-CREATION
> src/test/org/apache/sqoop/manager/oracle/OracleColumnEscapeImportTest.java
> PRE-CREATION
>
> Diff: https://reviews.apache.org/r/52605/diff/
>
>
> Testing
> ---
>
> I have added a new MySQL third party test to test the escaping of the double
> quote character in the column name and a new Oracle third party test to test
> the escaping of the backslash character.
>
>
> Thanks,
>
> Szabolcs Vasas
>
>
Re: Review Request 52605: ClassWriter fails if a column name contains a backslash character
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52605/#review151738
---

Ship it!

Ship It!

- Attila Szabo
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553578#comment-15553578 ]

Ruslan Dautkhanov commented on SQOOP-3022:
--

You're right on the APPEND_VALUES hint. Thank you for pointing to that, [~maugli].

I just ran a few small tests with this hint on row-by-row inserts and managed to get "ORA-12838: cannot read/modify an object after modifying it in parallel" on one of the inserts. So there are some scalability problems with this hint. It may still work if you don't have long-living transactions (i.e. set -Dsqoop.export.statements.per.transaction=1 but tune up -Dsqoop.export.records.per.statement well?).

On a side note, have you looked at Oracle's INSERT ALL command, where you can specify multiple rows in one INSERT statement? See for example https://www.techonthenet.com/oracle/questions/insert_rows.php

"INSERT ALL" has been supported since at least Oracle 10g - https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9014.htm - so it is not a new feature and shouldn't be a compatibility concern. INSERT ALL with the APPEND_VALUES hint may play better in terms of scalability.
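As a rough illustration of the INSERT ALL form mentioned above, the sketch below builds a multi-row statement with JDBC-style placeholders. The class and method names are hypothetical and none of this is taken from Sqoop's code; it only shows the statement shape (`INSERT ALL INTO ... VALUES ... SELECT * FROM dual`). Whether this form brings any benefit for Sqoop is questioned elsewhere in this thread.

```java
// Hypothetical sketch: build an Oracle multi-row "INSERT ALL" statement
// with one "?" placeholder per column per row.
final class InsertAllSketch {

    static String buildInsertAll(String table, String[] columns, int rows) {
        if (rows < 1 || columns.length == 0) {
            throw new IllegalArgumentException("need at least one row and one column");
        }
        String columnList = String.join(", ", columns);

        // Build "?, ?, ..." with one placeholder per column.
        StringBuilder placeholders = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                placeholders.append(", ");
            }
            placeholders.append('?');
        }

        StringBuilder sql = new StringBuilder("INSERT ALL");
        for (int r = 0; r < rows; r++) {
            sql.append(" INTO ").append(table)
               .append(" (").append(columnList).append(")")
               .append(" VALUES (").append(placeholders).append(")");
        }
        // INSERT ALL requires a subquery; selecting from dual yields one row.
        sql.append(" SELECT * FROM dual");
        return sql.toString();
    }

    public static void main(String[] args) {
        // Table and column names are made up for the example.
        System.out.println(buildInsertAll("STAGING_T1", new String[] {"C1", "C2"}, 3));
    }
}
```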
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553276#comment-15553276 ]

Attila Szabo commented on SQOOP-3022:
-

Hi [~Tagar],

Yes, I also think the documentation there is not appropriate! I see no reason for that "lock contention" effect either. My measurements around SQOOP-2983 also show that with the direct path insert we can be lightning fast, and in a very scalable way.
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553268#comment-15553268 ]

Attila Szabo commented on SQOOP-3022:
-

Hi [~Tagar],

According to this ( http://docs.oracle.com/cd/E11882_01/server.112/e25494.pdf page 604 ) both of them exist, and if I understand correctly, +APPEND is only usable with an "INSERT into ... SELECT ..." statement, and +APPEND_VALUES when we have an INSERT statement with a VALUES clause (this is the case that makes sense for Sqoop, IMHO).

According to Tom Kyte, +APPEND never worked with VALUES (except for one short period of time/version), which is why they added APPEND_VALUES (https://asktom.oracle.com/pls/apex/f?p=100:11:0P11_QUESTION_ID:6087912900346548365).

APPEND_VALUES in Sqoop:
org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java
org/apache/sqoop/manager/oracle/OraOopOutputFormatUpdate.java
docs/user/connectors.txt
conf/oraoop-site-template.xml

So if I'm not mistaken, the current usage of hints is appropriate, and with SQOOP-2983 we would also have it with better performance than the current trunk version.

Could you please confirm my findings?

+ [~david.robson] Dave, as the author of the OraOop capabilities of Sqoop, could you please also confirm whether my findings are valid?

Many thanks,
[~maugli]
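To make the distinction in this comment concrete, here is a small sketch of how the two hints would be attached to the two statement shapes: `APPEND_VALUES` for an INSERT with a VALUES clause, `APPEND` for an INSERT ... SELECT. The class, method, and table names are hypothetical, and this is not the statement-building code from the OraOop classes listed above; it only illustrates where the hint comment sits in the SQL text.

```java
// Hypothetical sketch: place Oracle direct-path hints on the two INSERT shapes.
final class OracleInsertHintSketch {

    // INSERT ... VALUES form: uses the APPEND_VALUES hint.
    static String insertValues(String table, String[] columns) {
        StringBuilder placeholders = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                placeholders.append(", ");
            }
            placeholders.append('?');
        }
        return "INSERT /*+ APPEND_VALUES */ INTO " + table
            + " (" + String.join(", ", columns) + ")"
            + " VALUES (" + placeholders + ")";
    }

    // INSERT ... SELECT form: uses the APPEND hint.
    static String insertSelect(String targetTable, String sourceTable) {
        return "INSERT /*+ APPEND */ INTO " + targetTable
            + " SELECT * FROM " + sourceTable;
    }

    public static void main(String[] args) {
        // Table and column names are made up for the example.
        System.out.println(insertValues("STAGING_T1", new String[] {"C1", "C2"}));
        System.out.println(insertSelect("STAGING_T1", "SOURCE_T1"));
    }
}
```

Because a hint is written inside a comment, a misspelled hint does not raise an error, so it is worth checking the generated statement text directly.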
[jira] [Issue Comment Deleted] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Szabo updated SQOOP-3022:

Comment: was deleted

(was: Hi [~Tagar],

According to this ( http://docs.oracle.com/cd/E11882_01/server.112/e25494.pdf page 604 ) both of them exist, and if I understand correctly, +APPEND is only usable with an "INSERT into ... SELECT ..." statement, and +APPEND_VALUES when we have an INSERT statement with a VALUES clause (this is the case that makes sense for Sqoop, IMHO).

According to Tom Kyte, +APPEND never worked with VALUES (except for one short period of time/version), which is why they added APPEND_VALUES (https://asktom.oracle.com/pls/apex/f?p=100:11:0P11_QUESTION_ID:6087912900346548365).

APPEND_VALUES in Sqoop:
org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java
org/apache/sqoop/manager/oracle/OraOopOutputFormatUpdate.java
docs/user/connectors.txt
conf/oraoop-site-template.xml

So if I'm not mistaken, the current usage of hints is appropriate, and with SQOOP-2983 we would also have it with better performance than the current trunk version.

Could you please confirm my findings?

+ [~david.robson] Dave, as the author of the OraOop capabilities of Sqoop, could you please also confirm whether my findings are valid?

Many thanks,
[~maugli])
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553213#comment-15553213 ]

Ruslan Dautkhanov commented on SQOOP-3022:
--

I found this comment in the source code:
https://github.com/apache/sqoop/blob/7c1754270ff21f533088b946c873321f890da791/src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java#L64

{quote}
// NB: "Direct inserts" cannot utilize APPEND_VALUES, otherwise Oracle
// will serialize
// the N mappers, causing a lot of lock contention.
{quote}

I don't think it's true if you have neither indexes nor FK constraints on the target table in Oracle. See https://docs.oracle.com/cd/B19306_01/server.102/b14215/ldr_modes.htm :

{quote}
- Neither local or global indexes can be maintained by the load.
- Referential integrity and CHECK constraints must be disabled.
..
{quote}

If both of the above are satisfied, Oracle can do direct path insert from multiple sessions (multiple sqoop export mappers' connections).
[jira] [Commented] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553153#comment-15553153 ] Ruslan Dautkhanov commented on SQOOP-3022: -- Thank you, Attila, for your prompt response. APPEND_VALUES is *not* a valid Oracle hint. I have never seen it used. I just looked through the non-public Oracle articles on support.oracle.com and don't see it there either. 'APPEND' is the only correct hint that I am aware of that controls logging. I searched the sqoop source for 'APPEND_VALUES' at https://github.com/apache/sqoop/search?utf8=%E2%9C%93=APPEND_VALUES+ and see that it is mentioned many times in comments, but is it ever actually injected into an INSERT statement? I don't see it being used. Actually, I have a sqoop export to Oracle running right now and there are no hints in the INSERT statements. Is it enabled through some sort of optional sqoop export argument? Since Oracle hints use a comment-like syntax, an invalid hint is not only ignored itself, but any other hints after the first invalid hint are ignored (not parsed) as well. So if you see that APPEND_VALUES is actually used in the code, it has to be changed to APPEND. Thank you for the reference to SQOOP-2983 and for your work on it; it would be great to see it committed.
[jira] [Comment Edited] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
[ https://issues.apache.org/jira/browse/SQOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553116#comment-15553116 ] Attila Szabo edited comment on SQOOP-3022 at 10/6/16 8:40 PM: -- Hi [~Tagar], I'd like to kindly ask for a bit more input from you here: Are +APPEND and +APPEND_VALUES totally unrelated, or are they somehow connected? Is +APPEND only good for reducing the redo logs? The reason I'm asking is that the OraOop feature of Sqoop currently has an option that uses the +APPEND_VALUES hint for inserting data. So if they are related, maybe that code path would also work for you (sadly, I was not able to find out from the official Oracle docs whether +APPEND_VALUES also reduces the redo log size, though I do understand that it also works through the direct path insert). One more piece of information: on SQOOP-2983 we are currently working on fixing the insert performance of OraOop, so if that solution would work for you, it would make sense to wait for that JIRA ticket to be closed first. Thanks for your answer in advance! [~maugli] was (Author: maugli): Hi [~Tagar], I'd like to kindly ask a bit more input from you here: Is +APPEND and +APPEND_VALUES are totally unrelated? Is +APPEND is only good for reducing the redo logs? The reason why I'm asking is that OraOop feature of Sqoop right now has an option which uses the +APPEND_VALUES hint for inserting data. So if it is not unrelated maybe that code path would be also good for you (sadly I was not able to find out from the official Oracle docs, if +APPEND_VALUES also reduce the redo log size (I do understand it's working on the direct path insert too). One plus information here: On SQOOP-2983 right now we're working on fixing the insert performance of OraOop so if that solution would be good for you it would make sense first waiting for that JIRA ticket to be closed. Thanks for your answer in advance! [~maugli]
[jira] [Created] (SQOOP-3022) sqoop export for Oracle generates tremendous amounts of redo logs
Ruslan Dautkhanov created SQOOP-3022: Summary: sqoop export for Oracle generates tremendous amounts of redo logs Key: SQOOP-3022 URL: https://issues.apache.org/jira/browse/SQOOP-3022 Project: Sqoop Issue Type: Bug Components: codegen, connectors, connectors/oracle Affects Versions: 1.4.6, 1.4.5, 1.4.4, 1.4.3 Reporter: Ruslan Dautkhanov Sqoop export for Oracle generates tremendous amounts of redo logs (comparable to the export size or more). We have put the target tables in nologging mode, but Oracle will still generate redo logs unless the +APPEND Oracle insert hint is used. See https://oracle-base.com/articles/misc/append-hint for examples. Please add an option for sqoop to generate insert statements in Oracle with the APPEND hint. Our databases are swamped with redo/archived logs whenever we sqoop data to them. This is easily avoidable. And from a business perspective, sqooping to staging tables in nologging mode is totally fine. Thank you.
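The request above is for an option that makes the generated INSERT carry an Oracle hint. A minimal sketch of what such statement generation could look like follows. `HintedInsertBuilder` and its `build` method are hypothetical names invented for illustration; they are not Sqoop's API, and the table and column names are placeholders. The hint text is passed through unchanged; whether Oracle honors a given hint for a given statement shape is a database-side question the sketch does not answer.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of Sqoop): builds a parameterized INSERT with an optional hint. */
final class HintedInsertBuilder {
    private HintedInsertBuilder() {}

    /**
     * @param table   target table name (assumed to be already validated by the caller)
     * @param columns column names, in bind order
     * @param hint    hint text such as "APPEND", or null/blank for no hint
     */
    static String build(String table, List<String> columns, String hint) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        StringBuilder sql = new StringBuilder("INSERT ");
        if (hint != null && !hint.trim().isEmpty()) {
            // A hint lives inside a comment, so refuse text that would close the comment early.
            if (hint.contains("*/")) {
                throw new IllegalArgumentException("hint must not contain \"*/\"");
            }
            sql.append("/*+ ").append(hint.trim()).append(" */ ");
        }
        sql.append("INTO ").append(table)
           .append(" (").append(String.join(", ", columns)).append(")")
           .append(" VALUES (")
           // One bind placeholder per column.
           .append(String.join(", ", Collections.nCopies(columns.size(), "?")))
           .append(")");
        return sql.toString();
    }
}
```

With `"APPEND"` as the hint and columns `ID, NAME` on a placeholder table `STAGING_T`, the builder returns `INSERT /*+ APPEND */ INTO STAGING_T (ID, NAME) VALUES (?, ?)`; with a null hint the comment is omitted entirely.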
[jira] [Commented] (SQOOP-3003) Sqoop import fails to query with split-by/boundary-query using Oracle Date/Timestamp
[ https://issues.apache.org/jira/browse/SQOOP-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553025#comment-15553025 ] Sowmya Ramesh commented on SQOOP-3003: -- [~nateclevenger]: I don't see any tests in the patch. Can you please add a few tests to validate the fix? Also, please create a review request at https://reviews.apache.org/. Thanks! > Sqoop import fails to query with split-by/boundary-query using Oracle > Date/Timestamp > > > Key: SQOOP-3003 > URL: https://issues.apache.org/jira/browse/SQOOP-3003 > Project: Sqoop > Issue Type: Bug > Components: connectors/oracle >Reporter: Nate Clevenger > Attachments: SQOOP-3003.patch > > > Given the following example sqoop import command intended to import data from > an Oracle test_table, split-by a timestamp_column using a boundary query > (e.g. one-day range) with sqoop parallelism of eight: > {code} > sqoop import --connect jdbc:oracle:... --username --password > --target-dir /tmp/sqoop/test -m 8 --null-string '' --append --query "SELECT > primary_key, TO_CHAR(timestamp_column) FROM test_table WHERE primary_key != > 0 AND \$CONDITIONS" --split-by "timestamp_column" --boundary-query "SELECT > TO_TIMESTAMP('1970-01-01', > '-mm-dd')+numtodsinterval(1472083200,'second'), > TO_TIMESTAMP('1970-01-01', '-mm-dd')+numtodsinterval(1472169600,'second') > FROM DUAL" > {code} > The following exception is thrown by each map task: > {code} > Caused by: java.sql.SQLDataException: ORA-01843: not a valid month > at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447) > at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) > at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951) > at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513) > at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227) > at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531) > at > oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208) > at >
oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:886) > at > oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175) > at > oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296) > at > oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3613) > at > oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3657) > at > oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1495) > at > org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111) > at > org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235) > ... 12 more > {code} > Inspecting the source code, the issue appears to be attributed to > OracleManager failing to set the correct input format (should be > [OracleDataDrivenDBInputFormat|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/OracleDataDrivenDBInputFormat.java#L47], > but appears to be getting set to > [DataDrivenDBInputFormat|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/DataDrivenDBInputFormat.java#L79], > resulting in > [DateSplitter|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/DateSplitter.java#L180] > being applied instead of > [OracleDateSplitter|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/mapreduce/db/OracleDateSplitter.java#L30]). 
> OracleManager appears to apply the correct input format when using the > [\--table > option|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/manager/OracleManager.java#L437] > in {{sqoop import}}, but doesn't apply a similar override when using the > [\--query > option|https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/manager/SqlManager.java#L676], > resulting in the input format being [defaulted to > DataDrivenDBInputFormat|https://github.com/apache/sqoop/blob/a0b730c77e297a62909063289ef37a2b993ff5e1/src/java/org/apache/sqoop/manager/ImportJobContext.java#L42]. > This defect was tested using 1.4.6-cdh5.5.2-release and > 1.4.6-cdh5.6.0-release, and the issue still appears to apply to the > latest code in trunk.
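The boundary query in the report adds 1472083200 and 1472169600 seconds to 1970-01-01 and describes the result as a one-day range. A small worked check of that arithmetic is sketched below. It treats the 1970-01-01 base as UTC, which is an assumption (the Oracle session time zone is not given in the report), and the class name `BoundaryQueryRange` is made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustration only: decodes the two second offsets used in the reported boundary query. */
final class BoundaryQueryRange {
    // Offsets copied from the --boundary-query in the report.
    static final long LOWER_SECONDS = 1472083200L;
    static final long UPPER_SECONDS = 1472169600L;

    static Instant lower() {
        return Instant.ofEpochSecond(LOWER_SECONDS);
    }

    static Instant upper() {
        return Instant.ofEpochSecond(UPPER_SECONDS);
    }

    /** Width of the range that the mappers have to split. */
    static Duration span() {
        return Duration.between(lower(), upper());
    }

    public static void main(String[] args) {
        System.out.println(lower());          // 2016-08-25T00:00:00Z
        System.out.println(upper());          // 2016-08-26T00:00:00Z
        System.out.println(span().toHours()); // 24
    }
}
```

With -m 8, a 24-hour range gives each mapper a slice of roughly three hours, so every mapper issues a query with date/timestamp bounds, which is consistent with the exception being thrown by each map task.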
[jira] [Updated] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szabolcs Vasas updated SQOOP-3021: -- Attachment: SQOOP-3021.patch > ClassWriter fails if a column name contains a backslash character > - > > Key: SQOOP-3021 > URL: https://issues.apache.org/jira/browse/SQOOP-3021 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.6 >Reporter: Szabolcs Vasas >Assignee: Szabolcs Vasas > Fix For: 1.4.7 > > Attachments: SQOOP-3021.patch > > > The following Sqoop command fails with a javac error: > {code} > sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "select C1_INT,C4_VARCHAR20, REGEXP_REPLACE(TRIM(C4_VARCHAR20),'\:','!') from > T1_IMPORT WHERE \$CONDITIONS" --target-dir regex_imp --delete-target-dir -m 1 > {code} > The reason is that the REGEXP_REPLACE expression contains a backslash > character, which is not escaped in ClassWriter, so an invalid string > literal is generated into the Java code. > SQOOP-2864 solved this problem for the double quote character; we need to > generalize that solution.
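To make the failure mode concrete: a raw column name is written between double quotes in the generated source, so any backslash or double quote in it has to be escaped first. The sketch below shows one way such escaping could be done. It is not the code from SQOOP-3021.patch; `JavaLiteralEscaper` and its methods are hypothetical names, and the set of characters handled is an assumption about what a column name might contain.

```java
/** Hypothetical illustration of escaping text for use inside a Java string literal. */
final class JavaLiteralEscaper {
    private JavaLiteralEscaper() {}

    /** Escapes the characters that would otherwise break or alter a Java string literal. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // backslash -> two backslashes
                case '"':  out.append("\\\""); break; // double quote -> backslash + quote
                case '\n': out.append("\\n");  break; // a raw newline would end the literal's line
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps the escaped text in double quotes, ready to be written into generated source. */
    static String toLiteral(String raw) {
        return "\"" + escape(raw) + "\"";
    }
}
```

For the expression in the report, the `'\:'` fragment would be emitted as `'\\:'` inside the generated literal, and a column named `"tellmewhy"` (quotes included) would be emitted as `\"tellmewhy\"`, so both cases go through the same routine.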
Review Request 52605: ClassWriter fails if a column name contains a backslash character
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/52605/ --- Review request for Sqoop. Bugs: SQOOP-3021 https://issues.apache.org/jira/browse/SQOOP-3021 Repository: sqoop-trunk Description --- SQOOP-2864 solved this problem for the double quote character; I generalized that solution. Diffs - src/java/org/apache/sqoop/orm/ClassWriter.java 9d91887 src/test/com/cloudera/sqoop/ThirdPartyTests.java 3103bd4 src/test/com/cloudera/sqoop/manager/MySQLTestUtils.java b5b9b6e src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 802 src/test/org/apache/sqoop/manager/mysql/MySqlColumnEscapeImportTest.java PRE-CREATION src/test/org/apache/sqoop/manager/oracle/OracleColumnEscapeImportTest.java PRE-CREATION Diff: https://reviews.apache.org/r/52605/diff/ Testing --- I have added a new MySQL third-party test that checks the escaping of the double quote character in the column name and a new Oracle third-party test that checks the escaping of the backslash character. Thanks, Szabolcs Vasas
[jira] [Updated] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szabolcs Vasas updated SQOOP-3021: -- Description: The following Sqoop command fails with a javac error: {code} sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select C1_INT,C4_VARCHAR20, REGEXP_REPLACE(TRIM(C4_VARCHAR20),'\:','!') from T1_IMPORT WHERE \$CONDITIONS" --target-dir regex_imp --delete-target-dir -m 1 {code} The reason is that the REGEXP_REPLACE expression contains a backslash character which does not get escaped in ClassWriter and an invalid string gets generated into the Java code. SQOOP-2864 solved this problem for the double quote character we need to generalize that solution. was: I've seen a user who created table with column names containing double quotes and while code generation, we quite spectacularly failed: {code} 16/03/02 12:14:13 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: ')' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: ')' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: not a statement 
__sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:340: error: ')' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:340: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:340: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:343: error: 'else' without 'if' elseif ("'single'".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:355: error: ')' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:355: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:355: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:359: error: 'else' without 'if' elseif ("'single'".equals(__fieldName)) { ^ 18 errors 16/03/02 12:14:14 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error returned by javac at org.apache.sqoop.orm.CompilationManager.compile(CompilationManager.java:217) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:108) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488) at 
org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at
[jira] [Assigned] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
[ https://issues.apache.org/jira/browse/SQOOP-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szabolcs Vasas reassigned SQOOP-3021: - Assignee: Szabolcs Vasas (was: Jarek Jarcec Cecho)
[jira] [Created] (SQOOP-3021) ClassWriter fails if a column name contains a backslash character
Szabolcs Vasas created SQOOP-3021: - Summary: ClassWriter fails if a column name contains a backslash character Key: SQOOP-3021 URL: https://issues.apache.org/jira/browse/SQOOP-3021 Project: Sqoop Issue Type: Bug Affects Versions: 1.4.6 Reporter: Szabolcs Vasas Assignee: Jarek Jarcec Cecho Fix For: 1.4.7 I've seen a user who created table with column names containing double quotes and while code generation, we quite spectacularly failed: {code} 16/03/02 12:14:13 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: ')' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:325: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: ')' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: ';' expected __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: not a statement __sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:332: error: ';' expected 
__sqoop$field_map.put(""tellmewhy"", this._tellmewhy_); ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:340: error: ')' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:340: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:340: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:343: error: 'else' without 'if' elseif ("'single'".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:355: error: ')' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:355: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:355: error: ';' expected elseif (""tellmewhy"".equals(__fieldName)) { ^ /tmp/sqoop-root/compile/60f084d5441147b848b007d2a18b504d/bofa.java:359: error: 'else' without 'if' elseif ("'single'".equals(__fieldName)) { ^ 18 errors 16/03/02 12:14:14 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error returned by javac at org.apache.sqoop.orm.CompilationManager.compile(CompilationManager.java:217) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:108) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488) at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) {code} I've looked into it and the problem is that we've started preserving raw column names 
inside the generated class, but we did not properly escape the case where the column name contains double quotes.
[jira] [Created] (SQOOP-3020) Error with Kite Connector in Sqoop2
atul dwivedi created SQOOP-3020: --- Summary: Error with Kite Connector in Sqoop2 Key: SQOOP-3020 URL: https://issues.apache.org/jira/browse/SQOOP-3020 Project: Sqoop Issue Type: Bug Reporter: atul dwivedi java.lang.NoClassDefFoundError: org/kitesdk/compat/DynMethods$Builder at org.kitesdk.data.spi.filesystem.FileSystemUtil.(FileSystemUtil.java:298) at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:139) at org.kitesdk.data.Datasets.create(Datasets.java:239) at org.kitesdk.data.Datasets.create(Datasets.java:307) at org.kitesdk.data.Datasets.create(Datasets.java:335) at org.apache.sqoop.connector.kite.KiteDatasetExecutor.createDataset(KiteDatasetExecutor.java:67) at org.apache.sqoop.connector.kite.KiteLoader.getExecutor(KiteLoader.java:52) at org.apache.sqoop.connector.kite.KiteLoader.load(KiteLoader.java:62) at org.apache.sqoop.connector.kite.KiteLoader.load(KiteLoader.java:36) at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:250) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: java.lang.ClassNotFoundException: org.kitesdk.compat.DynMethods$Builder at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:356) ... 
16 more
Re: Review Request 50655: SQOOP-2991 : endless failed Netezza import
> On Sept. 27, 2016, 8:55 a.m., Erzsebet Szilagyi wrote: > > Hi Benjamin, > > I downloaded this patch and tried an 'ant compile-all' but it failed with > > the following: > > > > [javac] > > /.../src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java:199: > > error: constructor NetezzaJDBCStatementRunner in class > > NetezzaJDBCStatementRunner cannot be applied to given types; > > [javac] extTableThread = new > > NetezzaJDBCStatementRunner(Thread.currentThread(), > > [javac]^ > > [javac] required: Thread,Connection,String,NamedFifo > > [javac] found: Thread,Connection,String > > [javac] reason: actual and formal argument lists differ in length > > > > Do you think the error is on my side, or is there a problem with the code? > > Thanks, > > Liz Hi, the error is on my side. Thanks a lot for your feedback. I posted a new patch. Regards. - Benjamin --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50655/#review150534 --- On Oct. 6, 2016, 9:26 a.m., Benjamin BONNET wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/50655/ > --- > > (Updated Oct. 6, 2016, 9:26 a.m.) > > > Review request for Sqoop, David Robson, Jarek Cecho, and Kathleen Ting. > > > Repository: sqoop-trunk > > > Description > --- > > See on JIRA https://issues.apache.org/jira/browse/SQOOP-2991 > > > Diffs > - > > > src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java > aa058d1 > > src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableImportMapper.java > 2efea53 > > src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java > cedfd23 > > Diff: https://reviews.apache.org/r/50655/diff/ > > > Testing > --- > > Tested with requests that throw errors on the Netezza side: importing with a > user that does not have sufficient rights to create an external table.
Without the patch, the > import fails but the MapReduce job never ends; with the patch, the import fails and the > MapReduce job ends with an IOException. > > > Thanks, > > Benjamin BONNET > >
Re: Review Request 50655: SQOOP-2991 : endless failed Netezza import
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50655/ --- (Updated Oct. 6, 2016, 9:26 a.m.) Review request for Sqoop, David Robson, Jarek Cecho, and Kathleen Ting. Changes --- Fix compilation issue, thanks Liz. Repository: sqoop-trunk Description --- See on JIRA https://issues.apache.org/jira/browse/SQOOP-2991 Diffs (updated) - src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java aa058d1 src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableImportMapper.java 2efea53 src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java cedfd23 Diff: https://reviews.apache.org/r/50655/diff/ Testing --- Tested with requests that throw errors on the Netezza side: importing with a user that does not have sufficient rights to create an external table. Without the patch, the import fails but the MapReduce job never ends; with the patch, the import fails and the MapReduce job ends with an IOException. Thanks, Benjamin BONNET
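The testing note describes the desired behavior: a failure in the statement that runs in the background should end the job with an IOException instead of leaving it hanging. The sketch below shows a generic way to surface a worker thread's failure to its caller. It is not taken from the SQOOP-2991 patch; `BackgroundStatement` and its methods are hypothetical, and the real `NetezzaJDBCStatementRunner` may work quite differently.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical illustration: runs work on a thread and reports its failure to the caller. */
final class BackgroundStatement {
    private final Thread worker;
    // Written by the worker thread, read by the caller after join().
    private volatile Exception failure;

    BackgroundStatement(Callable<Void> statement) {
        this.worker = new Thread(() -> {
            try {
                statement.call();
            } catch (Exception e) {
                // Record the failure instead of letting the thread die silently.
                failure = e;
            }
        }, "background-statement");
    }

    void start() {
        worker.start();
    }

    /** Waits for the worker to finish and rethrows any recorded failure as an IOException. */
    void awaitOrThrow() throws IOException {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for the background statement", e);
        }
        if (failure != null) {
            throw new IOException("background statement failed", failure);
        }
    }
}
```

A caller would invoke `start()`, do its own work, and then call `awaitOrThrow()`, so that an error such as insufficient privileges on the database side reaches the task as an IOException rather than being lost on the worker thread.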