[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-12-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754122#comment-15754122
 ] 

ASF subversion and git services commented on SQOOP-2983:


Commit 7783f85f805d54f8377a438aebffb06593aec858 in sqoop's branch 
refs/heads/trunk from [~maugli]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=7783f85 ]

SQOOP-3083: Fixing fault injection targets to work
together with try with resources statements
(introduced in SQOOP-2983)

(Anna Szonyi via Attila Szabo)


> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Fix For: 1.4.7
>
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch
>
>
> The current version of OraOOP seems to perform very poorly when --direct mode
> is turned on (regardless of whether the partitioned feature is turned off).
> Just as a baseline from the current trunk version:
> Inserting 100,000 rows into an 800-column-wide Oracle table reaches 400-600
> KB/sec with direct mode on my cluster, while the standard Oracle driver can
> produce up to 1.2-1.8 MB/sec (depending on the number of mappers and the
> batch size).
> Inserting 1,000,000 rows into the same table goes up to 800 KB/sec-1 MB/sec
> with OraOOP, whereas the standard Oracle connector reaches around 3.5 MB/sec.
> It seems the OraOOP export needs a thorough review and some fixing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609602#comment-15609602
 ] 

Hudson commented on SQOOP-2983:
---

SUCCESS: Integrated in Jenkins build Sqoop-hadoop200 #1068 (See 
[https://builds.apache.org/job/Sqoop-hadoop200/1068/])
SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: 
[https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1])
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java
* (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java
* (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java
* (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java
* (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java


> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Fix For: 1.4.7
>
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609563#comment-15609563
 ] 

Hudson commented on SQOOP-2983:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop100 #1028 (See 
[https://builds.apache.org/job/Sqoop-hadoop100/1028/])
SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: 
[https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1])
* (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java
* (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java
* (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java
* (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java


> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Fix For: 1.4.7
>
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609537#comment-15609537
 ] 

Hudson commented on SQOOP-2983:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop23 #1265 (See 
[https://builds.apache.org/job/Sqoop-hadoop23/1265/])
SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: 
[https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1])
* (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
* (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java
* (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java
* (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java


> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Fix For: 1.4.7
>
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609522#comment-15609522
 ] 

Hudson commented on SQOOP-2983:
---

FAILURE: Integrated in Jenkins build Sqoop-hadoop20 #1063 (See 
[https://builds.apache.org/job/Sqoop-hadoop20/1063/])
SQOOP-2983: OraOop export has degraded performance with wide tables (jarcec: 
[https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=3fc4ff714427df4cc0da7cd9fdb451703f8686c1])
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java
* (edit) src/test/org/apache/sqoop/manager/oracle/util/OracleData.java
* (edit) src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java
* (add) src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java
* (edit) src/test/org/apache/sqoop/manager/oracle/ExportTest.java
* (edit) src/java/org/apache/sqoop/orm/ClassWriter.java
* (edit) src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java


> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Fix For: 1.4.7
>
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-10-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609399#comment-15609399
 ] 

ASF subversion and git services commented on SQOOP-2983:


Commit 3fc4ff714427df4cc0da7cd9fdb451703f8686c1 in sqoop's branch 
refs/heads/trunk from [~jarcec]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=3fc4ff7 ]

SQOOP-2983: OraOop export has degraded performance with wide tables

(Attila Szabo via Jarek Jarcec Cecho)


> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-10-26 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609395#comment-15609395
 ] 

Jarek Jarcec Cecho commented on SQOOP-2983:
---

Let's create a follow up JIRA for [~david.robson]'s feedback to merge the code 
paths and let's get this in to resolve the actual perf issue as that is 
negatively affecting our users.

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-09-28 Thread Attila Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530362#comment-15530362
 ] 

Attila Szabo commented on SQOOP-2983:
-

Dear [~jarcec], 

Until [~david.robson] has a chance to look at my latest change on the review
board, would you please also do a review and give a -1/+1 according to your
evaluation?

Thanks,
[~maugli]

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-09-19 Thread Kathleen Ting (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504009#comment-15504009
 ] 

Kathleen Ting commented on SQOOP-2983:
--

Piling on. [~david.robson], per your advice, [~maugli]'s reverted every change 
made to the update code path, so it's only fixing the insert part. Thanks for 
all your past reviews of SQOOP-2983 and when you get a chance, would you please 
review the latest revisions to SQOOP-2983 and then commit it once you find it 
satisfactory?

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-5.patch, SQOOP-2983-6.patch, 
> SQOOP-2983-7.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-08-10 Thread Attila Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414882#comment-15414882
 ] 

Attila Szabo commented on SQOOP-2983:
-

Hi [~kathleen],

Thanks for the feedback. Previously I followed the same process you suggested,
but I thought that seeing the previous patch files there was just noise for the
committer. I will keep them in the future!

Thanks,
[~maugli]

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-5.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-08-09 Thread Kathleen Ting (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414503#comment-15414503
 ] 

Kathleen Ting commented on SQOOP-2983:
--

Thanks Attila for the revised patch. As a meta point, please don't delete 
patches from JIRA. Instead please add new revisions (naming each new iteration 
with an increasing numerical value, as you've done). No need to re-add the 
older iterations (e.g. SQOOP-2983-1.patch) to this JIRA but just something to 
keep in mind going forward. Thanks again for your contribution. 

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-5.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-08-09 Thread Attila Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414395#comment-15414395
 ] 

Attila Szabo commented on SQOOP-2983:
-

With the help of [~david.robson] I was able to identify one issue around the
"update-key" option, and I also spotted another issue (left behind after the
changes around Oracle escaped column name support). Both of them are fixed.
A new test case is attached as well. The new diff reflects all of the changes.
Please do another round of review!

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-1.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-07-28 Thread Attila Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398403#comment-15398403
 ] 

Attila Szabo commented on SQOOP-2983:
-

Updated patchfile (according to the changes requested by [~david.robson] over 
review board).

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical
> Attachments: SQOOP-2983-1.patch


[jira] [Commented] (SQOOP-2983) OraOop export has degraded performance with wide tables

2016-07-18 Thread Attila Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382877#comment-15382877
 ] 

Attila Szabo commented on SQOOP-2983:
-

Hi [~jarcec], [~kathleen], [~david.robson],

Let me share my investigation and its results with you in short. If you find
my findings and my fix appropriate, please help me get this patch committed as
soon as possible. Thanks in advance!

So the story:

After a fairly long phase of testing and experimenting, I reached the
following conclusions:
- Using the direct-path write (/*+APPEND_VALUES*/) seems to be a good idea:
when I applied it on top of the standard ExportBatchOutputFormat and used the
same "-Dsqoop.export.records.per.statement=5000
-Dsqoop.export.statements.per.transaction=1" session constraints, the
performance went above 5 MB/sec, so the original idea is valid.
- According to Oracle's documentation, the NOLOGGING feature only works
properly when the session writes on the direct path, so it became clear that
OraOOP should be fixed and that we should not introduce a HINT parameter for
the standard Oracle driver (although it could make sense to introduce that in
a separate feature-request JIRA).
- So I started to dig into what could be so different between the query
handling of OraOOP and that of the standard Oracle driver. I was able to
measure that creating the prepared statements is much slower with OraOOP.
Further experiments showed that something was wrong around
configuringPreparedStatement. Some problems were found there (e.g. the lookup
of the column names is linear, so it performs worse as the table gets wider),
but the problem still felt more fundamental. Finally I figured out that the
problem is in how we set/bind the values through JDBC with the help of the
SqoopRecord. When I applied the same approach we use in
ExportBatchOutputFormat, the performance instantly improved (up to 8-10
MB/sec).
- However, there was still no significant difference between the partitioned
version and the non-partitioned one, although it seemed obvious that there
should be: in the non-partitioned case the direct-path writes are synchronous,
so after a while they should contend with and lock each other out, and the
wait times should undermine further parallelisation. In some cases, as I
raised the level of parallelisation, it even became much slower (down to only
5 MB/sec for 3M lines / 4.5 GB of data with 10 mappers), which still seemed
weird to me. In the log files I finally found that the way we currently move
the tables into subpartitions is very expensive, and sometimes took almost
more time than copying the data to the temp table itself. I looked into it
and, following the Oracle documentation, as soon as I applied the "WITHOUT
VALIDATION" clause to the ALTER statement, it started to work as intended.
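
To make the two Oracle-side points above concrete, here is a minimal sketch of
the kind of statements involved. It is not the code from the attached patch:
the class name, the method names and the table names in the usage note are
hypothetical, and the sketch only builds SQL strings; it does not talk to a
database.

```java
import java.util.List;
import java.util.StringJoiner;

/** Illustrative only: builds the two kinds of Oracle statements discussed above. */
public final class OraOopSqlSketch {

    private OraOopSqlSketch() {}

    /** Batched INSERT that asks Oracle for a direct-path write via APPEND_VALUES. */
    public static String insertSql(String table, List<String> columns) {
        StringJoiner names = new StringJoiner(", ");
        StringJoiner binds = new StringJoiner(", ");
        for (String column : columns) {
            names.add(column);
            binds.add("?"); // one JDBC bind placeholder per column
        }
        return "INSERT /*+APPEND_VALUES*/ INTO " + table
            + " (" + names + ") VALUES (" + binds + ")";
    }

    /** Swaps a loaded staging table into a subpartition without re-checking its rows. */
    public static String exchangeSql(String table, String subpartition, String stagingTable) {
        return "ALTER TABLE " + table
            + " EXCHANGE SUBPARTITION " + subpartition
            + " WITH TABLE " + stagingTable
            + " WITHOUT VALIDATION";
    }
}
```

For example, insertSql("T", Arrays.asList("A", "B")) yields
"INSERT /*+APPEND_VALUES*/ INTO T (A, B) VALUES (?, ?)". Skipping validation
assumes the staging table only holds rows that belong to the target
subpartition, since Oracle no longer checks that during the exchange.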

In its current state I can even overload my local DB (load average of 20+)
with a 10-node cluster and 20 mappers, so the RDBMS has finally become the
bottleneck, as it should be.
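
As a side note on the linear column-name lookup mentioned in the list above,
the following sketch shows why it hurts on wide tables and what a precomputed
index looks like. It is an illustration under assumptions, not the actual
OraOOP code: the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: linear column lookup versus an index built once. */
public final class ColumnIndexSketch {

    private ColumnIndexSketch() {}

    /** Linear scan: O(n) per lookup, so binding all n columns of a row costs O(n^2). */
    public static int linearIndexOf(List<String> columns, String name) {
        for (int i = 0; i < columns.size(); i++) {
            if (columns.get(i).equals(name)) {
                return i + 1; // JDBC parameter indexes are 1-based
            }
        }
        return -1; // column not found
    }

    /** Built once per statement: column name -> 1-based JDBC parameter index. */
    public static Map<String, Integer> buildIndex(List<String> columns) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            index.put(columns.get(i), i + 1);
        }
        return index;
    }
}
```

With an 800-column table, the linear variant compares up to 800 names for
every bound value, while the map is built once and each later lookup is a
single hash access.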

I kindly ask you to review my proposed changes and share your thoughts with me!

> OraOop export has degraded performance with wide tables
> ---
>
> Key: SQOOP-2983
> URL: https://issues.apache.org/jira/browse/SQOOP-2983
> Project: Sqoop
> Issue Type: Bug
> Reporter: Attila Szabo
> Assignee: Attila Szabo
> Priority: Critical