[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754122#comment-15754122 ] ASF subversion and git services commented on SQOOP-2983: Commit 7783f85f805d54f8377a438aebffb06593aec858 in sqoop's branch refs/heads/trunk from [~maugli] [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=7783f85 ] SQOOP-3083: Fixing fault injection targets to work together with try with resources statements (introduced in SQOOP-2983) (Anna Szonyi via Attila Szabo) > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Fix For: 1.4.7 > > Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, > SQOOP-2983-7.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609602#comment-15609602 ] Hudson commented on SQOOP-2983: --- SUCCESS: Integrated in Jenkins build Sqoop-hadoop200 #1068 (See [https://builds.apache.org/job/Sqoop-hadoop200/1068/]) SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1]) * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java * (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java * (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java * (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java * (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java * (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609563#comment-15609563 ] Hudson commented on SQOOP-2983: --- FAILURE: Integrated in Jenkins build Sqoop-hadoop100 #1028 (See [https://builds.apache.org/job/Sqoop-hadoop100/1028/]) SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1]) * (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java * (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java * (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java * (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java * (edit) src/java/org/apache/sqoop/orm/ClassWriter.java * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609537#comment-15609537 ] Hudson commented on SQOOP-2983: --- FAILURE: Integrated in Jenkins build Sqoop-hadoop23 #1265 (See [https://builds.apache.org/job/Sqoop-hadoop23/1265/]) SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1]) * (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java * (edit) src/java/org/apache/sqoop/orm/ClassWriter.java * (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java * (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java * (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609522#comment-15609522 ] Hudson commented on SQOOP-2983: --- FAILURE: Integrated in Jenkins build Sqoop-hadoop20 #1063 (See [https://builds.apache.org/job/Sqoop-hadoop20/1063/]) SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1]) * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java * (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java * (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java * (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java * (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java * (edit) src/java/org/apache/sqoop/orm/ClassWriter.java * (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609399#comment-15609399 ] ASF subversion and git services commented on SQOOP-2983: Commit 3fc4ff714427df4cc0da7cd9fdb451703f8686c1 in sqoop's branch refs/heads/trunk from [~jarcec] [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=3fc4ff7 ] SQOOP-2983: OraOop export has degraded performance with wide tables (Attila Szabo via Jarek Jarcec Cecho) > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, > SQOOP-2983-7.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609395#comment-15609395 ] Jarek Jarcec Cecho commented on SQOOP-2983: --- Let's create a follow up JIRA for [~david.robson]'s feedback to merge the code paths and let's get this in to resolve the actual perf issue as that is negatively affecting our users. > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, > SQOOP-2983-7.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530362#comment-15530362 ] Attila Szabo commented on SQOOP-2983: - Dear [~jarcec], Until [~david.robson] will have the chance to look at my latest change on the review board, would you please also do a review and give a -1/+1 according to your evaluation? Thanks, [~maugli] > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, > SQOOP-2983-7.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504009#comment-15504009 ] Kathleen Ting commented on SQOOP-2983: -- Piling on. [~david.robson], per your advice, [~maugli]'s reverted every change made to the update code path, so it's only fixing the insert part. Thanks for all your past reviews of SQOOP-2983 and when you get a chance, would you please review the latest revisions to SQOOP-2983 and then commit it once you find it satisfactory? > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, > SQOOP-2983-7.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414882#comment-15414882 ] Attila Szabo commented on SQOOP-2983: - Hi [~kathleen], Thanks for the feedback. Previously I've done the same process you'd suggested, but thought this is just a noise for the committer (to see there the previous patch files). I will keep them in the future too! Thanks, [~maugli] > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-5.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414503#comment-15414503 ] Kathleen Ting commented on SQOOP-2983: -- Thanks Attila for the revised patch. As a meta point, please don't delete patches from JIRA. Instead please add new revisions (naming each new iteration with an increasing numerical value, as you've done). No need to re-add the older iterations (e.g. SQOOP-2983-1.patch) to this JIRA but just something to keep in mind going forward. Thanks again for your contribution. > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-5.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414395#comment-15414395 ] Attila Szabo commented on SQOOP-2983: - With the help of [~david.robson] I was able to identify one issue around "update-key" option, and also was able to spot another issue (left behind after the changes around Oracle escaped column name support). Both of them are fixed. New test case attached as well. New diff reflects all of the changes. Please do another round of review! > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-1.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398403#comment-15398403 ] Attila Szabo commented on SQOOP-2983: - Updated patchfile (according to the changes requested by [~david.robson] over review board). > OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > Attachments: SQOOP-2983-1.patch > > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). > Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables
[ https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382877#comment-15382877 ] Attila Szabo commented on SQOOP-2983: -
Hi [~jarcec], [~kathleen], [~david.robson],
Let me share my investigation and results with you in "long story short" mode. If you find my findings and my fix appropriate, please help me get this patch committed as soon as possible. Thanks in advance!
So, the story: after quite a long testing and experimenting phase, I reached the following conclusions:
- Using the direct path write (/*+APPEND_VALUES*/) seems to be a good idea: when I applied it on top of the standard ExportBatchOutputFormat and used the same "-Dsqoop.export.records.per.statement=5000 -Dsqoop.export.statements.per.transaction=1" session constraints, the performance went above 5 MB/sec, so the original idea is valid.
- According to Oracle's documentation, the NOLOGGING feature only works properly when the session is writing on the direct path, so it became clear that OraOOP should be fixed and that we should not introduce a HINT parameter for the standard Oracle driver (although it could make sense to introduce that in a separate feature-request JIRA).
- Thus I started to dig into what could be so different between the OraOOP query handling and the standard Oracle driver. I measured that creating the prepared statements is much slower with OraOOP. Further experiments showed that something is wrong around configuringPreparedStatement. I found some problems there (e.g. the lookup of the column names is linear, so it performs worse the wider the table gets), but the problem still felt more fundamental. Finally I figured out that the problem is how we set/bind the values through JDBC with the help of the SqoopRecord. When I applied the same approach we use in ExportBatchOutputFormat, the performance instantly got better (up to 8-10 MB/sec).
- However, there was still no significant difference between the partitioned and the non-partitioned versions (although it seemed obvious there should be one: in the non-partitioned case, because of the direct write, the synchronous writes should after a while contend with and lock each other out, so that the wait times undermine further parallelisation). In some cases, as I raised the level of parallelisation, it even became much slower (down to only 5 MB/sec for 3M lines / 4.5 GB of data with 10 mappers), which still seemed weird to me. In the log files I finally found that the current way of moving the tables into subpartitions was very expensive, and sometimes took almost more time than copying the data to the temp table itself. So I investigated further and, following the Oracle documentation, applied the "WITHOUT VALIDATION" clause to the ALTER statement; as soon as I did, it started to work as intended. In the current state I can even kill my local DB (load average of 20+) with a 10-node cluster and 20 mappers, so the RDBMS has finally become the bottleneck, as it should be.
I kindly ask you to review my proposed changes and share your thoughts with me!
> OraOop export has degraded performance with wide tables > --- > > Key: SQOOP-2983 > URL: https://issues.apache.org/jira/browse/SQOOP-2983 > Project: Sqoop > Issue Type: Bug >Reporter: Attila Szabo >Assignee: Attila Szabo >Priority: Critical > > The current version of OraOOP seems to perform very low from performance POV > when --direct mode turned on (regardless if the partitioned feature is turned > of). > Just as a baseline from the current trunk version: > Inserting 100.000 rows into a 800 column wide Oracle table has 400-600 kb/sec > with direct mode on my cluster, while the standard oracle driver can produce > up to 1.2-1.8 mb/sec. (depending on the number of mappers, batch size). 
> Inserting 1.000.000 rows into the same table goes up to 800k-1mb/sec with > OraOOP, however with the standard Oracle connector it's around 3.5mb/sec. > It seems OraOOP export needs a thorough review and some fixing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
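For readers following the investigation in Attila Szabo's comment above, the three ideas it describes (the /*+APPEND_VALUES*/ direct-path hint, replacing the linear column-name lookup, and the "WITHOUT VALIDATION" clause) can be sketched roughly as below. This is only an illustrative sketch and not the code from the SQOOP-2983 patch: the class WideTableExportSketch, its method names, and the table/column names are all hypothetical, and the comment does not spell out the exact ALTER statement, so the EXCHANGE SUBPARTITION form used here is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Sqoop or OraOop. */
public class WideTableExportSketch {

    /** Builds a batch insert carrying the direct-path hint, with one bind placeholder per column. */
    static String buildInsertSql(String table, List<String> columns) {
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT /*+APPEND_VALUES*/ INTO " + table
            + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }

    /**
     * Maps each column name to its 1-based JDBC parameter index once, so that
     * binding a row does not need a linear scan of the column list per value.
     */
    static Map<String, Integer> indexColumns(List<String> columns) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            index.put(columns.get(i), i + 1);
        }
        return index;
    }

    /** Assumed form of the subpartition move; WITHOUT VALIDATION skips the row-by-row check. */
    static String buildExchangeSql(String target, String subpartition, String staging) {
        return "ALTER TABLE " + target + " EXCHANGE SUBPARTITION " + subpartition
            + " WITH TABLE " + staging + " WITHOUT VALIDATION";
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("ID", "NAME", "AMOUNT");
        System.out.println(buildInsertSql("SALES", columns));
        System.out.println(indexColumns(columns).get("AMOUNT"));
        System.out.println(buildExchangeSql("SALES", "SP_1", "SALES_TMP"));
    }
}
```

With an 800-column table, the precomputed index turns each per-value name lookup into a single hash lookup instead of a scan over all columns, which is the kind of cost the comment calls out as growing with table width. The actual binding through JDBC (for example PreparedStatement.setObject with the looked-up index) is left out because it needs a live Oracle connection.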