[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102403#comment-14102403
 ] 

Hudson commented on MAPREDUCE-6012:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #651 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/651/])
MAPREDUCE-6012. DBInputSplit creates invalid ranges on Oracle. (Wei Yan via 
kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618694)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/db/TestDbClasses.java


 DBInputSplit creates invalid ranges on Oracle
 -

 Key: MAPREDUCE-6012
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6012
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1, 2.4.1
Reporter: Julien Serdaru
Assignee: Wei Yan
 Fix For: 1.3.0, 2.6.0

 Attachments: HADOOP-9530.patch, MAPREDUCE-6012-2-branch2.patch, 
 MAPREDUCE-6012-branch-1.patch


 The DBInputFormat on Oracle does not create valid ranges.
 The method getSplits, line 263, is as follows:
   split = new DBInputSplit(i * chunkSize, (i * chunkSize) + chunkSize);
 So the first split will have a start value of 0 (0*chunkSize).
 However, OracleDBRecordReader, line 84, is as follows:
   if (split.getLength() > 0 && split.getStart() > 0){
 Since the start value of the first range is equal to 0, we will skip the 
 block that partitions the input set. As a result, one of the map tasks will 
 process the entire data set, rather than its partition.
 I'm assuming the fix is trivial and would involve removing the second check 
 in the if block.
 Also, I believe the OracleDBRecordReader paging query is incorrect.
 Line 92 should read:
   query.append(" ) WHERE dbif_rno > ").append(split.getStart());
 instead of (note > instead of >=)
   query.append(" ) WHERE dbif_rno >= ").append(split.getStart());
 Otherwise some rows will be ignored and some counted more than once.
 A map/reduce job that counts the number of rows based on a predicate will 
 highlight the incorrect behavior.
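
 To make the arithmetic in the report concrete, here is a minimal Java sketch
 that models the row ranges without touching a database. It is not Hadoop
 code: the class OraclePagingSketch, its methods, and the assumption that the
 last split absorbs any remainder rows are all hypothetical. It assumes
 Oracle's ROWNUM is 1-based and that each split keeps rows with
 rownum <= end and dbif_rno > start (or >= start in the reported behaviour).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the split arithmetic and the Oracle paging predicate. */
final class OraclePagingSketch {

    /** Mirrors the quoted arithmetic: start = i * chunkSize, end = start + chunkSize. */
    static List<long[]> splits(long totalRows, int chunks) {
        long chunkSize = totalRows / chunks;
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            long start = i * chunkSize;
            // Assumption: the last split absorbs any remainder rows.
            long end = (i + 1 == chunks) ? totalRows : start + chunkSize;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    /**
     * Counts the rows one split would read.
     * startGuard models the "getStart() > 0" check: a split starting at 0
     * skips the paging wrapper and reads the whole table.
     * strict selects "dbif_rno > start"; otherwise "dbif_rno >= start".
     */
    static long rowsRead(long[] range, long totalRows, boolean startGuard, boolean strict) {
        long start = range[0];
        long end = range[1];
        if (startGuard && start == 0) {
            return totalRows; // no paging applied, full scan
        }
        long first = strict ? start + 1 : Math.max(start, 1); // ROWNUM starts at 1
        long last = Math.min(end, totalRows);
        return Math.max(0, last - first + 1);
    }

    /** Sums the rows read by all splits; equals totalRows only if ranges are disjoint. */
    static long totalRowsRead(long totalRows, int chunks, boolean startGuard, boolean strict) {
        long sum = 0;
        for (long[] range : splits(totalRows, chunks)) {
            sum += rowsRead(range, totalRows, startGuard, strict);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("reported:   " + totalRowsRead(100, 4, true, false));
        System.out.println("guard only: " + totalRowsRead(100, 4, false, false));
        System.out.println("both fixes: " + totalRowsRead(100, 4, false, true));
    }
}
```

 For 100 rows in 4 splits, this model reads 178 rows with the reported
 behaviour, 103 rows once only the start check is removed (each boundary row
 is read twice), and exactly 100 rows with both changes.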



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102652#comment-14102652
 ] 

Hudson commented on MAPREDUCE-6012:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/])
MAPREDUCE-6012. DBInputSplit creates invalid ranges on Oracle. (Wei Yan via 
kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618694)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/db/TestDbClasses.java




[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103096#comment-14103096
 ] 

Hudson commented on MAPREDUCE-6012:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1868 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1868/])
MAPREDUCE-6012. DBInputSplit creates invalid ranges on Oracle. (Wei Yan via 
kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618694)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/db/TestDbClasses.java




[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-08-18 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101167#comment-14101167
 ] 

Ray Chiang commented on MAPREDUCE-6012:
---

Thanks Wei.  Glad to see this fixed.



[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-08-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101588#comment-14101588
 ] 

Hudson commented on MAPREDUCE-6012:
---

FAILURE: Integrated in Hadoop-trunk-Commit #6086 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6086/])
MAPREDUCE-6012. DBInputSplit creates invalid ranges on Oracle. (Wei Yan via 
kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618694)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/db/TestDbClasses.java




[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-08-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092202#comment-14092202
 ] 

Karthik Kambatla commented on MAPREDUCE-6012:
-

+1

Spoke to Wei offline to understand the issue better, and his fix makes sense to 
me. 

 DBInputSplit creates invalid ranges on Oracle
 -

 Key: MAPREDUCE-6012
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6012
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1, 2.4.1
Reporter: Julien Serdaru
Assignee: Wei Yan
 Attachments: HADOOP-9530.patch, MAPREDUCE-6012-2-branch2.patch, 
 MAPREDUCE-6012-branch-1.patch




[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-07-29 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078702#comment-14078702
 ] 

zhihai xu commented on MAPREDUCE-6012:
--

[~ywskycn]'s patch looks good to me. It uses getEnd() instead of 
getStart() + getLength() in the SQL query, which simplifies the old code.
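
As a rough illustration of that simplification, the paging wrapper can be
sketched as a standalone helper. This is not the committed patch: the class
PagingQuerySketch, the method wrap, and the exact SELECT wrapper text are
assumptions made for the example. Only the dbif_rno column name and the use of
the split's end value come from this thread.

```java
/** Hypothetical helper showing an Oracle ROWNUM paging wrapper built from start and end. */
final class PagingQuerySketch {

    /**
     * Wraps baseQuery so that only rows with start < ROWNUM <= end are returned.
     * Passing end directly replaces computing start + length inside the query text.
     */
    static String wrap(String baseQuery, long start, long end) {
        StringBuilder query = new StringBuilder();
        // Assumed wrapper shape: the inner query numbers the rows, the outer one trims the head.
        query.append("SELECT * FROM (SELECT a.*, ROWNUM dbif_rno FROM ( ");
        query.append(baseQuery);
        query.append(" ) a WHERE rownum <= ").append(end);
        query.append(" ) WHERE dbif_rno > ").append(start);
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrap("SELECT id FROM t", 25, 50));
    }
}
```

With start = 25 and end = 50 the wrapper selects rows 26 through 50, so
adjacent splits neither overlap nor leave gaps.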

 DBInputSplit creates invalid ranges on Oracle
 -

 Key: MAPREDUCE-6012
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6012
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1, 2.4.1
Reporter: Julien Serdaru
Assignee: Wei Yan
 Attachments: HADOOP-9530.patch, MAPREDUCE-6012-branch-1.patch




[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-07-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078711#comment-14078711
 ] 

Hadoop QA commented on MAPREDUCE-6012:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12658549/MAPREDUCE-6012-branch-1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4776//console

This message is automatically generated.

 DBInputSplit creates invalid ranges on Oracle
 -

 Key: MAPREDUCE-6012
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6012
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1, 2.4.1
Reporter: Julien Serdaru
Assignee: Wei Yan
 Attachments: HADOOP-9530.patch, MAPREDUCE-6012-branch-1.patch




[jira] [Commented] (MAPREDUCE-6012) DBInputSplit creates invalid ranges on Oracle

2014-07-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078756#comment-14078756
 ] 

Hadoop QA commented on MAPREDUCE-6012:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12658550/MAPREDUCE-6012-2-branch2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4777//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4777//console

This message is automatically generated.

 DBInputSplit creates invalid ranges on Oracle
 -

 Key: MAPREDUCE-6012
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6012
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1, 2.4.1
Reporter: Julien Serdaru
Assignee: Wei Yan
 Attachments: HADOOP-9530.patch, MAPREDUCE-6012-2-branch2.patch, 
 MAPREDUCE-6012-branch-1.patch

