date:20150209


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Description: 
The DBOutputFormat class has a constructQuery method that generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of the default setting (OFF), 
so turning this feature ON would make them error prone.

I changed the DBOutputFormat class to check the product name from the 
connection object; if it is DB2, it generates the INSERT INTO command without 
the semicolon (;).

This technique is already used in the DBInputFormat class to generate different 
SELECT statements for Oracle and MySQL databases.

  was:
In DBOutputFormat class there is constructQuery method that generates INSERT 
INTO statement with semicolon(;;)) at the end.

Semicolon is ANSI SQL-92 standard character for a statement terminator but this 
feature is disabled(OFF) as a default settings in IBM DB2.

Although by using -t we can turn it ON for db2. 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 But there are some products that already built on top of this default setting 
(OFF) so by turning ON this feature make them error prone.

I changed the current DBOutputFormat class by checking the product name from 
connection object to see if it is DB2 then generates INSERT INTO command 
without semicolon(;). 

This technique is already used in DBInputFormat class for generating different 
SELECT statements for Oracle and MySQL databases.


 DBOutputFormat.java appending extra semicolon to query which is incompatible 
 with DB2
 -

 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
 Platform: xSeries, pSeries
 Browser: Firefox, IE
 Security Settings: No Security, Flat file, LDAP, PAM
 File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin
   Original Estimate: 24h
  Remaining Estimate: 24h

 The DBOutputFormat class has a constructQuery method that generates an INSERT 
 INTO statement with a semicolon (;) at the end.
 The semicolon is the ANSI SQL-92 standard statement terminator, but this 
 feature is disabled (OFF) by default in IBM DB2.
 It can be turned ON for DB2 with the -t option 
 (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 However, some products are already built on top of the default setting (OFF), 
 so turning this feature ON would make them error prone.
 I changed the DBOutputFormat class to check the product name from the 
 connection object; if it is DB2, it generates the INSERT INTO command without 
 the semicolon (;).
 This technique is already used in the DBInputFormat class to generate 
 different SELECT statements for Oracle and MySQL databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6244) Hadoop examples when run without an argument, gives ERROR instead of just usage info


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312642#comment-14312642
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-6244:
---

Cancelling the patch because of the previous comment.

 Hadoop examples when run without an argument, gives ERROR instead of just 
 usage info
 

 Key: MAPREDUCE-6244
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6244
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.23.0, trunk-win, 2.6.0
Reporter: Robert Justice
Assignee: Abhishek Kapoor
Priority: Minor
 Attachments: HADOOP-8834.patch, HADOOP-8834.patch


 The Hadoop sort example should not report an ERROR when run with no 
 parameters; it should only display usage.
 {code}
 $ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar sort
 ERROR: Wrong number of parameters: 0 instead of 2.
 sort [-m <maps>] [-r <reduces>] [-inFormat <input format class>] [-outFormat 
 <output format class>] [-outKey <output key class>] [-outValue <output value 
 class>] [-totalOrder <pcnt> <num samples> <max splits>] <input> <output>
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be 
 copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files 
 to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated 
 archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]
 {code}
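
The requested behaviour could be sketched roughly as follows. This is only an illustration, not the code of the Hadoop sort example: the class name SortUsageSketch, the shortened usage string, and the exit code 2 are assumptions, and option parsing is left out so that only positional arguments are counted.

```java
import java.io.PrintStream;

/** Hypothetical sketch; not the actual Hadoop examples code. */
final class SortUsageSketch {

    // Shortened usage text, assumed for illustration.
    static final String USAGE = "sort [-m <maps>] [-r <reduces>] <input> <output>";

    /**
     * Returns 0 when exactly two positional arguments are given.
     * With no arguments at all, prints usage only (no ERROR line).
     * With any other count, prints the ERROR line followed by usage.
     */
    static int run(String[] args, PrintStream out) {
        if (args.length == 0) {
            // The user asked for nothing, so show usage without an error.
            out.println(USAGE);
            return 2;
        }
        if (args.length != 2) {
            out.println("ERROR: Wrong number of parameters: "
                + args.length + " instead of 2.");
            out.println(USAGE);
            return 2;
        }
        return 0;
    }
}
```

Calling SortUsageSketch.run(new String[0], System.out) prints just the usage line, while a call with one argument still reports the ERROR line first.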



--

[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Attachment: MAPREDUCE-6246.patch

 DBOutputFormat.java appending extra semicolon to query which is incompatible 
 with DB2
 -

 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
 Platform: xSeries, pSeries
 Browser: Firefox, IE
 Security Settings: No Security, Flat file, LDAP, PAM
 File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin
  Labels: DB2, mapreduce
 Fix For: 2.4.1

 Attachments: MAPREDUCE-6246.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 The DBOutputFormat class has a constructQuery method that generates an INSERT 
 INTO statement with a semicolon (;) at the end.
 The semicolon is the ANSI SQL-92 standard statement terminator, but this 
 feature is disabled (OFF) by default in IBM DB2.
 It can be turned ON for DB2 with the -t option 
 (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 However, some products are already built on top of the default setting (OFF), 
 so turning this feature ON would make them error prone.



--

[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Description: 
DBOutputFormat is used for writing the output of MapReduce jobs to a database. 
When used with the DB2 JDBC driver, it fails with the following error:

com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)

  was:
DBoutputformat is used for writing output of mapreduce jobs to the database and 
when used with db2 jdbc drivers it fails with following error

com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)

In DBOutputFormat class there is constructQuery method that generates INSERT 
INTO statement with semicolon(;) at the end.

Semicolon is ANSI SQL-92 standard character for a statement terminator but this 
feature is disabled(OFF) as a default settings in IBM DB2.

Although by using -t we can turn it ON for db2. 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 But there are some products that already built on top of this default setting 
(OFF) so by turning ON this feature make them error prone.


 DBOutputFormat.java appending extra semicolon to query which is incompatible 
 with DB2
 -

 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
 Platform: xSeries, pSeries
 Browser: Firefox, IE
 Security Settings: No Security, Flat file, LDAP, PAM
 File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin
  Labels: DB2, mapreduce
 Fix For: 2.4.1

 Attachments: MAPREDUCE-6246.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 DBOutputFormat is used for writing the output of MapReduce jobs to a database. 
 When used with the DB2 JDBC driver, it fails with the following error:
 com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
 SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
 DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
 com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)
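
For readers who want to recognise this kind of failure in code, here is a minimal sketch using only the standard java.sql.SQLException accessors. The class and method names are hypothetical. It assumes the driver exposes the SQLCODE through getErrorCode() and the SQLSTATE through getSQLState(), which matches the values in the message above but should be checked against the driver in use.

```java
import java.sql.SQLException;

/** Hypothetical helper for illustration only. */
final class SqlSyntaxErrorCheck {

    /**
     * SQLSTATE values in class "42" denote a syntax error or access rule
     * violation in the SQL standard; 42601 above falls into that class.
     */
    static boolean isSyntaxOrAccessError(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("42");
    }

    /** Formats the vendor code and SQLSTATE in the style of the log line above. */
    static String describe(SQLException e) {
        return "SQLCODE=" + e.getErrorCode() + ", SQLSTATE=" + e.getSQLState();
    }
}
```

A caller could use isSyntaxOrAccessError to tell a malformed generated statement, as in this report, apart from connectivity problems.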



--

[jira] [Commented] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2015-02-09 Thread Akira AJISAKA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312791#comment-14312791
 ] 

Akira AJISAKA commented on MAPREDUCE-6223:
--

Thanks [~varun_saxena] for updating the patch. +1 pending [~kasha]'s review. 
The findbugs warnings look unrelated to the patch.

 TestJobConf#testNegativeValueForTaskVmem failures
 -

 Key: MAPREDUCE-6223
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Gera Shegalov
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch, 
 MAPREDUCE-6223.003.patch


 {code}
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec <<< 
 FAILURE! - in org.apache.hadoop.conf.TestJobConf
 testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
 elapsed: 0.089 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<1024> but was:<-1>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
 {code}



--

[jira] [Created] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2

ramtin created MAPREDUCE-6246:
-

 Summary: DBOutputFormat.java appending extra semicolon to query 
which is incompatible with DB2
 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
Platform: xSeries, pSeries
Browser: Firefox, IE
Security Settings: No Security, Flat file, LDAP, PAM
File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin


The DBOutputFormat class has a constructQuery method that generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of the default setting (OFF), 
so turning this feature ON would make them error prone.

I changed the DBOutputFormat class to check the product name from the 
connection object; if it is DB2, it generates the INSERT INTO command without 
the semicolon (;).

This technique is already used in the DBInputFormat class to generate different 
SELECT statements for Oracle and MySQL databases.
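
The approach described above might look roughly like the sketch below. It is not the attached patch and not the actual DBOutputFormat code: the class InsertQuerySketch, the isDb2 helper, and the rule that the product name starts with "DB2" are assumptions made for illustration. In real code the product name would typically come from the standard JDBC call connection.getMetaData().getDatabaseProductName().

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not the actual Hadoop DBOutputFormat code. */
final class InsertQuerySketch {

    /**
     * Builds "INSERT INTO table (f1,f2) VALUES (?,?)" and appends ';'
     * unless the database product name looks like DB2 (an assumption).
     */
    static String constructQuery(String table, String[] fieldNames, String dbProductName) {
        if (fieldNames == null || fieldNames.length == 0) {
            throw new IllegalArgumentException("Field names may not be empty");
        }
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table).append(" (");
        query.append(String.join(",", fieldNames));
        query.append(") VALUES (");
        // One placeholder per field, comma separated.
        for (int i = 0; i < fieldNames.length; i++) {
            query.append(i == 0 ? "?" : ",?");
        }
        query.append(")");
        // Only databases other than DB2 get the trailing statement terminator.
        if (!isDb2(dbProductName)) {
            query.append(";");
        }
        return query.toString();
    }

    /** Assumed detection rule: the reported product name starts with "DB2". */
    static boolean isDb2(String dbProductName) {
        return dbProductName != null
            && dbProductName.toUpperCase(Locale.ROOT).startsWith("DB2");
    }
}
```

Passing the product name in as a string keeps the sketch testable without a live database connection.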



--

[jira] [Commented] (MAPREDUCE-6237) DBRecordReader is not thread safe

2015-02-09 Thread Kannan Rajah (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312583#comment-14312583
]

Kannan Rajah commented on MAPREDUCE-6237:
-

[~ozawa] Is the patch alright? Anything else I need to do to get this committed?

DBRecordReader is not thread safe
-

Key: MAPREDUCE-6237
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6237
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.5.0
Reporter: Kannan Rajah
Assignee: Kannan Rajah
Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch

DBInputFormat.createDBRecordReader reuses JDBC connections across instances of
DBRecordReader. This is not a good idea; we should create a separate
connection for each instance. If performance is a concern, we should use
connection pooling instead.
I looked at DBOutputFormat.getRecordReader. It actually creates a new
Connection object for each DBRecordReader. So can we just change
DBInputFormat to create a new Connection every time? The connection reuse code
was added as part of the connection leak bug in MAPREDUCE-1443. Is there any
reason for caching the connection?
We observed this issue in a customer setup where they were reading data from
MySQL using Pig. According to the customer, the query returns two records,
which causes Pig to create two instances of DBRecordReader. These two
instances share the database connection instance. The first DBRecordReader
extracts the first record from MySQL just fine, but then closes the shared
connection instance. When the second DBRecordReader runs, it tries to execute
a query to retrieve the second record on the closed shared connection
instance, which fails. If we set mapred.map.tasks to 1, the query succeeds.
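
The failure mode can be reproduced in miniature without a database. The sketch below is purely illustrative: FakeConnection and FakeRecordReader are hypothetical stand-ins for java.sql.Connection and DBRecordReader, and the only behaviour modelled is that a reader closes its connection once it has finished reading.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of the shared-connection problem; not Hadoop code. */
final class ConnectionReuseSketch {

    /** Stand-in for java.sql.Connection: refuses queries once closed. */
    static final class FakeConnection {
        private boolean closed;

        String query(String sql) {
            if (closed) {
                throw new IllegalStateException("connection is closed");
            }
            return "row for: " + sql;
        }

        void close() {
            closed = true;
        }
    }

    /** Stand-in for DBRecordReader: reads one record, then closes its connection. */
    static final class FakeRecordReader {
        private final FakeConnection connection;

        FakeRecordReader(FakeConnection connection) {
            this.connection = connection;
        }

        String readAndClose(String sql) {
            try {
                return connection.query(sql);
            } finally {
                connection.close();
            }
        }
    }

    /** Two readers share one connection: returns true if the second read fails. */
    static boolean sharedConnectionFails() {
        FakeConnection shared = new FakeConnection();
        new FakeRecordReader(shared).readAndClose("SELECT 1");
        try {
            new FakeRecordReader(shared).readAndClose("SELECT 2");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Each reader gets its own connection from the factory, so both reads work. */
    static String separateConnectionsSucceed(Supplier<FakeConnection> factory) {
        new FakeRecordReader(factory.get()).readAndClose("SELECT 1");
        return new FakeRecordReader(factory.get()).readAndClose("SELECT 2");
    }
}
```

The supplier plays the role of "create a new Connection every time"; a connection pool could sit behind the same interface if performance became a concern.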

--

[jira] [Resolved] (MAPREDUCE-5381) Support graceful decommission of tasktracker

2015-02-09 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-5381.

Resolution: Won't Fix

Hardly any development is happening in 1.x now. I am closing this in favor of 
YARN's YARN-914. Please reopen if need be.

 Support graceful decommission of tasktracker
 

 Key: MAPREDUCE-5381
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5381
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.0
Reporter: Luke Lu
Assignee: Binglin Chang
 Attachments: MAPREDUCE-5381-graceful-decomm.v1.patch


 When TTs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact on running jobs.
 Currently, if a TT is decommissioned, all running tasks on the TT need to be 
 rescheduled on other TTs. Furthermore, for finished map tasks, if their map 
 outputs have not been fetched by the reducers of the job, these map tasks 
 need to be rerun as well.
 We propose to introduce a mechanism to optionally decommission a tasktracker 
 gracefully.


--

[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Description: 
The DBOutputFormat class has a constructQuery method that generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of the default setting (OFF), 
so turning this feature ON would make them error prone.

I changed the DBOutputFormat class to check the product name from the 
connection object; if it is DB2, it generates the INSERT INTO command without 
the semicolon (;).

This technique is already used in the DBInputFormat class to generate different 
SELECT statements for Oracle and MySQL databases.

  was:
In DBOutputFormat class there is constructQuery method that generates INSERT 
INTO statement with semicolon(;) at the end.

Semicolon is ANSI SQL-92 standard character for a statement terminator but this 
feature is disabled(OFF) as a default settings in IBM DB2.

Although by using -t we can turn it ON for db2. 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 But there are some products that already built on top of this default setting 
(OFF) so by turning ON this feature make them error prone.

I changed the current DBOutputFormat class by checking the product name from 
connection object to see if it is DB2 then generates INSERT INTO command 
without semicolon(;). 

This technique is already used in DBInputFormat class for generating different 
SELECT statements for Oracle and MySQL databases.


 DBOutputFormat.java appending extra semicolon to query which is incompatible 
 with DB2
 -

 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
 Platform: xSeries, pSeries
 Browser: Firefox, IE
 Security Settings: No Security, Flat file, LDAP, PAM
 File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin
   Original Estimate: 24h
  Remaining Estimate: 24h

 The DBOutputFormat class has a constructQuery method that generates an INSERT 
 INTO statement with a semicolon (;) at the end.
 The semicolon is the ANSI SQL-92 standard statement terminator, but this 
 feature is disabled (OFF) by default in IBM DB2.
 It can be turned ON for DB2 with the -t option 
 (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 However, some products are already built on top of the default setting (OFF), 
 so turning this feature ON would make them error prone.
 I changed the DBOutputFormat class to check the product name from the 
 connection object; if it is DB2, it generates the INSERT INTO command without 
 the semicolon (;).
 This technique is already used in the DBInputFormat class to generate 
 different SELECT statements for Oracle and MySQL databases.



--

[jira] [Updated] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing connections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tsuyoshi OZAWA updated MAPREDUCE-6237:
--
Fix Version/s: 2.6.1

Multiple mappers with DBInputFormat don't work because of reusing connections

Key: MAPREDUCE-6237
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6237
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.5.0, 2.6.0
Reporter: Kannan Rajah
Assignee: Kannan Rajah
Fix For: 2.6.1

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch

--

[jira] [Updated] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing connections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tsuyoshi OZAWA updated MAPREDUCE-6237:
--
Resolution: Fixed
Status: Resolved (was: Patch Available)

Multiple mappers with DBInputFormat don't work because of reusing connections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch

--

[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Description: 
The DBOutputFormat class has a constructQuery method that generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of the default setting (OFF), 
so turning this feature ON would make them error prone.

  was:
In DBOutputFormat class there is constructQuery method that generates INSERT 
INTO statement with semicolon(;) at the end.

Semicolon is ANSI SQL-92 standard character for a statement terminator but this 
feature is disabled(OFF) as a default settings in IBM DB2.

Although by using -t we can turn it ON for db2. 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 But there are some products that already built on top of this default setting 
(OFF) so by turning ON this feature make them error prone.

I changed the current DBOutputFormat class by checking the product name from 
connection object to see if it is DB2 then generates INSERT INTO command 
without semicolon(;). 

This technique is already used in DBInputFormat class for generating different 
SELECT statements for Oracle and MySQL databases.


 DBOutputFormat.java appending extra semicolon to query which is incompatible 
 with DB2
 -

 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
 Platform: xSeries, pSeries
 Browser: Firefox, IE
 Security Settings: No Security, Flat file, LDAP, PAM
 File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin
   Original Estimate: 24h
  Remaining Estimate: 24h

 The DBOutputFormat class has a constructQuery method that generates an INSERT 
 INTO statement with a semicolon (;) at the end.
 The semicolon is the ANSI SQL-92 standard statement terminator, but this 
 feature is disabled (OFF) by default in IBM DB2.
 It can be turned ON for DB2 with the -t option 
 (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 However, some products are already built on top of the default setting (OFF), 
 so turning this feature ON would make them error prone.



--

[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing connections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312636#comment-14312636
]

Tsuyoshi OZAWA commented on MAPREDUCE-6237:
---

Committed this to trunk, branch-2, and branch-2.6. Thanks [~rkannan82] for your
contribution and thanks [~jira.shegalov] for your review.

[~rkannan82], BTW, would you mind creating a follow-up JIRA to use a thread
pool, based on Gera's suggestion?

Multiple mappers with DBInputFormat don't work because of reusing connections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch

--

[jira] [Updated] (MAPREDUCE-6244) Hadoop examples when run without an argument, gives ERROR instead of just usage info


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-6244:
--
Status: Open  (was: Patch Available)

 Hadoop examples when run without an argument, gives ERROR instead of just 
 usage info
 

 Key: MAPREDUCE-6244
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6244
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.6.0, 0.23.0, trunk-win
Reporter: Robert Justice
Assignee: Abhishek Kapoor
Priority: Minor
 Attachments: HADOOP-8834.patch, HADOOP-8834.patch


 The Hadoop sort example should not report an ERROR when run with no 
 parameters; it should only display usage.
 {code}
 $ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar sort
 ERROR: Wrong number of parameters: 0 instead of 2.
 sort [-m <maps>] [-r <reduces>] [-inFormat <input format class>] [-outFormat 
 <output format class>] [-outKey <output key class>] [-outValue <output value 
 class>] [-totalOrder <pcnt> <num samples> <max splits>] <input> <output>
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be 
 copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files 
 to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated 
 archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]
 {code}



--

[jira] [Commented] (MAPREDUCE-6237) DBRecordReader is not thread safe

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312590#comment-14312590
]

Tsuyoshi OZAWA commented on MAPREDUCE-6237:
---

+1, the findbugs warnings look unrelated to your patch. I'll commit this to
branch-2 and trunk shortly.

DBRecordReader is not thread safe
-

--

[jira] [Updated] (MAPREDUCE-6237) DBRecordReader is not thread safe

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tsuyoshi OZAWA updated MAPREDUCE-6237:
--
Affects Version/s: 2.6.0
Hadoop Flags: Reviewed

DBRecordReader is not thread safe
-

--

[jira] [Updated] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing connections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tsuyoshi OZAWA updated MAPREDUCE-6237:
--
Summary: Multiple mappers with DBInputFormat don't work because of reusing
connections (was: DBRecordReader is not thread safe)

Multiple mappers with DBInputFormat don't work because of reusing connections

--

[jira] [Commented] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2

2015-02-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313278#comment-14313278
 ] 

Hadoop QA commented on MAPREDUCE-6246:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697540/MAPREDUCE-6246.patch
  against trunk revision af08425.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5177//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5177//console

This message is automatically generated.

 DBOutputFormat.java appending extra semicolon to query which is incompatible 
 with DB2
 -

 Key: MAPREDUCE-6246
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.4.1
 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x
 Platform: xSeries, pSeries
 Browser: Firefox, IE
 Security Settings: No Security, Flat file, LDAP, PAM
 File System: HDFS, GPFS FPO
Reporter: ramtin
Assignee: ramtin
  Labels: DB2, mapreduce
 Fix For: 2.4.1

 Attachments: MAPREDUCE-6246.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 DBOutputFormat is used for writing the output of MapReduce jobs to a database. 
 When used with the DB2 JDBC driver, it fails with the following error:
 com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
 SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
 DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
 com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)
 The constructQuery method in the DBOutputFormat class generates an INSERT 
 INTO statement with a semicolon (;) at the end.
 The semicolon is the ANSI SQL-92 standard statement terminator, but this 
 feature is disabled (OFF) by default in IBM DB2.
 It can be turned ON for DB2 with the -t option 
 (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
 However, some products are already built on top of this default setting 
 (OFF), so turning this feature ON would make them error prone.




[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing conections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312817#comment-14312817
]

Hudson commented on MAPREDUCE-6237:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #833 (See
[https://builds.apache.org/job/Hadoop-Yarn-trunk/833/])
MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of
reusing conections. Contributed by Kannan Rajah. (ozawa: rev
241336ca2b7cf97d7e0bd84dbe0542b72f304dc9)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt

Multiple mappers with DBInputFormat don't work because of reusing conections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch


[jira] [Updated] (MAPREDUCE-6174) Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput.

2015-02-09 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated MAPREDUCE-6174:
--
Status: Patch Available  (was: Open)

 Combine common stream code into parent class for InMemoryMapOutput and 
 OnDiskMapOutput.
 ---

 Key: MAPREDUCE-6174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6174
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0, 3.0.0
Reporter: Eric Payne
Assignee: Eric Payne
 Attachments: MAPREDUCE-6174.v1.txt


 Per MAPREDUCE-6166, both InMemoryMapOutput and OnDiskMapOutput will be doing 
 similar things with regard to IFile streams.
 In order to make it explicit that InMemoryMapOutput and OnDiskMapOutput are 
 different from 3rd-party implementations, this JIRA will make them subclass a 
 common class (see 
 https://issues.apache.org/jira/browse/MAPREDUCE-6166?focusedCommentId=14223368&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14223368)




[jira] [Updated] (MAPREDUCE-6174) Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput.

2015-02-09 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated MAPREDUCE-6174:
--
Attachment: MAPREDUCE-6174.v1.txt

[~jira.shegalov], I have uploaded a patch for this issue. Would you please have 
a look?


[jira] [Commented] (MAPREDUCE-5903) If Kerberos Authentication is enabled, MapReduce job is failing on reducer phase

2015-02-09 Thread kumar ranganathan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312122#comment-14312122
 ] 

kumar ranganathan commented on MAPREDUCE-5903:
--

I am also facing the same exception when enabling LDAP for Windows Active 
Directory in Hadoop 2.6.0. 

 If Kerberos Authentication is enabled, MapReduce job is failing on reducer 
 phase
 

 Key: MAPREDUCE-5903
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5903
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.4.0
 Environment: hadoop: 2.4.0.2.1.2.0
Reporter: Victor Kim
Priority: Critical
  Labels: shuffle

 I have a 3-node cluster configuration: 1 ResourceManager and 3 NodeManagers. 
 Kerberos is enabled, with hdfs, yarn, and mapred principals\keytabs. 
 ResourceManager and NodeManager are run under the yarn user, using the yarn 
 Kerberos principal. 
 Use case 1: WordCount, submit the job using the yarn UGI (i.e. superuser, the 
 one having a Kerberos principal on all boxes). Result: the job completes 
 successfully.
 Use case 2: WordCount, submit the job using LDAP user impersonation via the 
 yarn UGI. Result: Map tasks complete successfully, but the Reduce task fails 
 with ShuffleError Caused by: java.io.IOException: Exceeded 
 MAX_FAILED_UNIQUE_FETCHES (see the stack trace below).
 The use case with user impersonation used to work on earlier versions, 
 without YARN (with JTTT).
 I found a similar issue with Kerberos AUTH involved here: 
 https://groups.google.com/forum/#!topic/nosql-databases/tGDqs75ACqQ
 And here https://issues.apache.org/jira/browse/MAPREDUCE-4030 it's marked as 
 resolved, which is not the case when Kerberos Authentication is enabled.
 The exception trace from YarnChild JVM:
 2014-05-21 12:49:35,687 FATAL [fetcher#3] 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Shuffle failed 
 with too many fetch failures and insufficient progress!
 2014-05-21 12:49:35,688 WARN [main] org.apache.hadoop.mapred.YarnChild: 
 Exception running child : 
 org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
 shuffle in fetcher#3
 at 
 org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:416)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
 Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; 
 bailing-out.
 at 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
 at 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)




[jira] [Updated] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

2015-02-09 Thread Allen Wittenauer (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-207:
---
Status: Open  (was: Patch Available)

Cancelling patch, as it no longer applies.

 Computing Input Splits on the MR Cluster
 

 Key: MAPREDUCE-207
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: applicationmaster, mrv2
Reporter: Philip Zeyliger
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch, 
 MAPREDUCE-207.v03.patch, MAPREDUCE-207.v05.patch, MAPREDUCE-207.v06.patch, 
 MAPREDUCE-207.v07.patch


 Instead of computing the input splits as part of job submission, Hadoop could 
 have a separate job task type that computes the input splits, therefore 
 allowing that computation to happen on the cluster.




[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing conections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312951#comment-14312951
]

Hudson commented on MAPREDUCE-6237:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #99 (See
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/99/])
MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of
reusing conections. Contributed by Kannan Rajah. (ozawa: rev
241336ca2b7cf97d7e0bd84dbe0542b72f304dc9)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt

Multiple mappers with DBInputFormat don't work because of reusing conections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch


[jira] [Updated] (MAPREDUCE-4413) MR lib dir contains jdiff (which is gpl)

2015-02-09 Thread Allen Wittenauer (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-4413:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

+1 committed to trunk.

Thanks!

 MR lib dir contains jdiff (which is gpl)
 

 Key: MAPREDUCE-4413
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4413
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Nemon Lou
Priority: Critical
 Fix For: 3.0.0

 Attachments: MAPREDUCE-4413.patch, MAPREDUCE-4413.patch


 A tarball built from trunk contains the following:
 ./share/hadoop/mapreduce/lib/jdiff-1.0.9.jar
 jdiff is GPLv2, so we need to exclude it from the build artifact.




[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing conections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313047#comment-14313047
]

Hudson commented on MAPREDUCE-6237:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #7053 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/7053/])
MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of
reusing conections. Contributed by Kannan Rajah. (ozawa: rev
241336ca2b7cf97d7e0bd84dbe0542b72f304dc9)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java

Multiple mappers with DBInputFormat don't work because of reusing conections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch


[jira] [Commented] (MAPREDUCE-4413) MR lib dir contains jdiff (which is gpl)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313044#comment-14313044
 ] 

Hudson commented on MAPREDUCE-4413:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #7053 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7053/])
MAPREDUCE-4413. MR lib dir contains jdiff (which is gpl) (Nemon Lou via aw) 
(aw: rev aab459c904bf2007c5b230af8c058793935faf89)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-assemblies/src/main/resources/assemblies/hadoop-mapreduce-dist.xml



[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Fix Version/s: 2.4.1
   Labels: DB2 mapreduce  (was: )
   Status: Patch Available  (was: Open)

I changed the DBOutputFormat class to check the database product name from the 
connection object; if it is DB2, it generates the INSERT INTO query without 
the trailing semicolon (;). 

This technique is already used in the DBInputFormat class to generate different 
SELECT statements for Oracle and MySQL databases.
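
The idea in this comment can be sketched roughly as follows. This is an 
illustration only, not the attached patch: the class name InsertQuerySketch 
and the three-argument constructQuery signature are hypothetical. In a real 
job the product name would come from 
connection.getMetaData().getDatabaseProductName(), and the check below 
assumes that DB2 drivers report a name beginning with "DB2".

```java
import java.util.Locale;

// Hypothetical sketch: build an INSERT statement and skip the trailing
// semicolon when the database product name indicates DB2.
class InsertQuerySketch {

    static String constructQuery(String table, String[] fieldNames, String dbProductName) {
        if (fieldNames == null || fieldNames.length == 0) {
            throw new IllegalArgumentException("Field names may not be null or empty");
        }
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table).append(" (");
        for (int i = 0; i < fieldNames.length; i++) {
            if (i > 0) {
                query.append(",");
            }
            query.append(fieldNames[i]);
        }
        query.append(") VALUES (");
        for (int i = 0; i < fieldNames.length; i++) {
            if (i > 0) {
                query.append(",");
            }
            query.append("?"); // one placeholder per column
        }
        query.append(")");

        // Assumption: DB2 product names start with "DB2" (case-insensitive).
        boolean isDb2 = dbProductName != null
                && dbProductName.toUpperCase(Locale.ROOT).startsWith("DB2");
        if (!isDb2) {
            query.append(";"); // keep the terminator for every other database
        }
        return query.toString();
    }
}
```

With fields WORD and COUNT, a product name such as "DB2/LINUXX8664" yields a 
statement ending in "VALUES (?,?)", while any other product name keeps the 
trailing semicolon.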


[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Description: 
DBOutputFormat is used for writing the output of MapReduce jobs to a database. 
When used with the DB2 JDBC driver, it fails with the following error:

com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)

The constructQuery method in the DBOutputFormat class generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of this default setting (OFF), 
so turning this feature ON would make them error prone.

  was:
The constructQuery method in the DBOutputFormat class generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of this default setting (OFF), 
so turning this feature ON would make them error prone.



[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated MAPREDUCE-6246:
--
Description: 
DBOutputFormat is used for writing the output of MapReduce jobs to a database. 
When used with the DB2 JDBC driver, it fails with the following error:

com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)


The constructQuery method in the DBOutputFormat class generates an INSERT 
INTO statement with a semicolon (;) at the end.

The semicolon is the ANSI SQL-92 standard statement terminator, but this 
feature is disabled (OFF) by default in IBM DB2.

It can be turned ON for DB2 with the -t option 
(http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2).
However, some products are already built on top of this default setting (OFF), 
so turning this feature ON would make them error prone.

  was:
DBOutputFormat is used for writing the output of MapReduce jobs to a database. 
When used with the DB2 JDBC driver, it fails with the following error:

com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, 
SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, 
DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at 
com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127)



[jira] [Created] (MAPREDUCE-6247) Use DBCP connection pooling in DBInputFormat

2015-02-09 Thread Kannan Rajah (JIRA)

Kannan Rajah created MAPREDUCE-6247:
---

 Summary: Use DBCP connection pooling in DBInputFormat
 Key: MAPREDUCE-6247
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6247
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0, 2.5.0
Reporter: Kannan Rajah
Assignee: Kannan Rajah
Priority: Minor


As part of MAPREDUCE-6237, we removed caching of the DB connection. 
[~jira.shegalov] and [~ozawa] suggested that we use DBCP connection pooling 
instead.
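
To illustrate how pooling differs from caching a single shared connection, 
here is a minimal hand-rolled sketch. It is not Apache Commons DBCP and not 
the proposed change; the class SimplePoolSketch and its methods are 
hypothetical, and a generic resource type stands in for java.sql.Connection 
so the example stays self-contained.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical bounded pool: each caller borrows its own resource, so no two
// callers share one at the same time, and released resources are reused.
class SimplePoolSketch<T> {
    private final BlockingQueue<T> idle;
    private final Supplier<T> factory;
    private final int maxSize;
    private final AtomicInteger created = new AtomicInteger();

    SimplePoolSketch(int maxSize, Supplier<T> factory) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.idle = new ArrayBlockingQueue<>(maxSize);
        this.factory = factory;
        this.maxSize = maxSize;
    }

    // Returns an idle resource if one exists, creates a new one while under
    // the cap, and otherwise blocks until another caller releases one.
    T borrow() {
        T resource = idle.poll();
        if (resource != null) {
            return resource;
        }
        while (true) {
            int n = created.get();
            if (n >= maxSize) {
                try {
                    return idle.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for a resource", e);
                }
            }
            if (created.compareAndSet(n, n + 1)) {
                return factory.get();
            }
        }
    }

    // Hands a borrowed resource back so a later borrow() can reuse it.
    void release(T resource) {
        idle.offer(resource);
    }

    int createdCount() {
        return created.get();
    }
}
```

A real pool such as DBCP also validates and closes connections; this sketch 
only shows the borrow and release pattern.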




[jira] [Updated] (MAPREDUCE-6242) Progress report log is incredibly excessive in application master

2015-02-09 Thread Varun Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6242:

Status: Patch Available  (was: Open)

 Progress report log is incredibly excessive in application master
 -

 Key: MAPREDUCE-6242
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6242
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.4.0
Reporter: Jian Fang
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6242.001.patch


 We saw excessive logging in the application master of a long-running 
 application with many task attempts. The log write rate is around 1MB/sec in 
 some cases. Most of the log entries were progress reports such as the 
 following ones.
 2015-02-03 17:46:14,321 INFO [IPC Server handler 56 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.15605757
 2015-02-03 17:46:17,581 INFO [IPC Server handler 2 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.4108217
 2015-02-03 17:46:20,426 INFO [IPC Server handler 0 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_02_0 is : 0.06634143
 2015-02-03 17:46:20,807 INFO [IPC Server handler 4 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.6506
 2015-02-03 17:46:21,013 INFO [IPC Server handler 6 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_01_0 is : 0.21723115
 It looks like the report interval is controlled by a hard-coded variable, 
 PROGRESS_INTERVAL, set to 3 seconds in class org.apache.hadoop.mapred.Task. 
 We should allow users to set an appropriate progress interval for their 
 applications.
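
 One way to make the interval configurable is sketched below. This is an 
 assumption-laden illustration, not the attached patch: the property key 
 example.task.progress.interval.ms is hypothetical, and java.util.Properties 
 stands in for Hadoop's Configuration so the example is self-contained. The 
 3000 ms default mirrors the hard-coded 3 seconds mentioned above.

```java
import java.util.Properties;

// Hypothetical sketch: read the progress report interval from configuration
// and fall back to the current 3-second default when it is unset or invalid.
class ProgressIntervalSketch {
    // Hypothetical key, chosen for illustration only.
    static final String INTERVAL_KEY = "example.task.progress.interval.ms";
    static final long DEFAULT_INTERVAL_MS = 3000L;

    static long getProgressIntervalMs(Properties conf) {
        String raw = conf.getProperty(INTERVAL_KEY);
        if (raw == null) {
            return DEFAULT_INTERVAL_MS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            // Reject zero and negative values rather than reporting in a tight loop.
            return value > 0 ? value : DEFAULT_INTERVAL_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_INTERVAL_MS;
        }
    }
}
```

 Falling back to the default on bad input keeps existing jobs behaving as 
 they do today; a longer interval reduces both the report traffic and the 
 number of log lines per task attempt.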




[jira] [Commented] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2015-02-09 Thread Karthik Kambatla (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313167#comment-14313167
 ] 

Karthik Kambatla commented on MAPREDUCE-6223:
-

Patch looks mostly good to me. Nit: I would leave the test for negative values, 
but update the asserts to reflect the expected behavior. 

 TestJobConf#testNegativeValueForTaskVmem failures
 -

 Key: MAPREDUCE-6223
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Gera Shegalov
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch, 
 MAPREDUCE-6223.003.patch


 {code}
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec <<< 
 FAILURE! - in org.apache.hadoop.conf.TestJobConf
 testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
 elapsed: 0.089 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<1024> but was:<-1>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
 {code}




[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing conections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313096#comment-14313096
]

Hudson commented on MAPREDUCE-6237:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #100 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/100/])
MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of
reusing conections. Contributed by Kannan Rajah. (ozawa: rev
241336ca2b7cf97d7e0bd84dbe0542b72f304dc9)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java

Multiple mappers with DBInputFormat don't work because of reusing conections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch


[jira] [Updated] (MAPREDUCE-6242) Progress report log is incredibly excessive in application master

2015-02-09 Thread Varun Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6242:

Status: Open  (was: Patch Available)

 Progress report log is incredibly excessive in application master
 -

 Key: MAPREDUCE-6242
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6242
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.4.0
Reporter: Jian Fang
Assignee: Varun Saxena
 Attachments: MAPREDUCE-6242.001.patch


 We saw excessive logging in a long-running application master with many 
 task attempts. The log write rate was around 1 MB/sec in some cases. 
 Most of the log entries came from progress reports, such as the following 
 ones.
 2015-02-03 17:46:14,321 INFO [IPC Server handler 56 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.15605757
 2015-02-03 17:46:17,581 INFO [IPC Server handler 2 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.4108217
 2015-02-03 17:46:20,426 INFO [IPC Server handler 0 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_02_0 is : 0.06634143
 2015-02-03 17:46:20,807 INFO [IPC Server handler 4 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_00_0 is : 0.6506
 2015-02-03 17:46:21,013 INFO [IPC Server handler 6 on 37661] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
 attempt_1422985365246_0001_m_01_0 is : 0.21723115
 It looks like the report interval is controlled by PROGRESS_INTERVAL, a 
 constant hard-coded to 3 seconds in class org.apache.hadoop.mapred.Task. We 
 should let users set an appropriate progress interval for their 
 applications.
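
 The proposal above could be sketched roughly as follows. This is only an
 illustration of reading the interval from configuration with a fallback to
 the current 3-second value; it is not the attached patch. The class name,
 the property key, and the use of java.util.Properties in place of Hadoop's
 Configuration are all assumptions made to keep the example self-contained.

```java
import java.util.Properties;

public class ProgressIntervalSketch {
    // Fallback equal to the 3-second hard-coded value described in the report.
    static final long DEFAULT_PROGRESS_INTERVAL_MS = 3000L;

    // Hypothetical property name, chosen for illustration only.
    static final String PROGRESS_INTERVAL_KEY = "example.task.progress.interval.ms";

    // Returns the configured interval in milliseconds, or the default when the
    // property is missing, not a number, or not strictly positive.
    static long resolveIntervalMs(Properties conf) {
        String raw = conf.getProperty(PROGRESS_INTERVAL_KEY);
        if (raw == null) {
            return DEFAULT_PROGRESS_INTERVAL_MS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value > 0 ? value : DEFAULT_PROGRESS_INTERVAL_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_PROGRESS_INTERVAL_MS;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(resolveIntervalMs(conf)); // prints 3000

        // A long-running job could raise the interval to cut log volume.
        conf.setProperty(PROGRESS_INTERVAL_KEY, "30000");
        System.out.println(resolveIntervalMs(conf)); // prints 30000
    }
}
```

 Falling back to the default on invalid input keeps existing jobs behaving as
 they do today when the property is unset or mistyped.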




[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing conections

[
https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313139#comment-14313139
]

Hudson commented on MAPREDUCE-6237:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #2031 (See
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2031/])
MAPREDUCE-6237. Multiple mappers with DBInputFormat don't work because of
reusing conections. Contributed by Kannan Rajah. (ozawa: rev
241336ca2b7cf97d7e0bd84dbe0542b72f304dc9)
*
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt
*
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
*
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/OracleDataDrivenDBInputFormat.java

Multiple mappers with DBInputFormat don't work because of reusing conections

Attachments: mapreduce-6237.patch, mapreduce-6237.patch,
mapreduce-6237.patch


[jira] [Commented] (MAPREDUCE-6237) Multiple mappers with DBInputFormat don't work because of reusing conections