[jira] [Updated] (MAPREDUCE-5403) MR changes to accommodate yarn.application.classpath being moved to the server-side
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5403: Labels: BB2015-05-TBR (was: ) MR changes to accommodate yarn.application.classpath being moved to the server-side --- Key: MAPREDUCE-5403 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 2.0.5-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: BB2015-05-TBR Attachments: MAPREDUCE-5403-1.patch, MAPREDUCE-5403-2.patch, MAPREDUCE-5403.patch yarn.application.classpath is a confusing property because it is used by MapReduce and not YARN, and MapReduce already has mapreduce.application.classpath, which provides the same functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3807) JobTracker needs fix similar to HDFS-94
[ https://issues.apache.org/jira/browse/MAPREDUCE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3807: Labels: BB2015-05-TBR newbie (was: newbie) JobTracker needs fix similar to HDFS-94 --- Key: MAPREDUCE-3807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3807 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.0.0 Reporter: Harsh J Labels: BB2015-05-TBR, newbie Attachments: MAPREDUCE-3807.patch The 1.0 JobTracker's jobtracker.jsp page currently shows: {code} <h2>Cluster Summary (Heap Size is <%= StringUtils.byteDesc(Runtime.getRuntime().totalMemory()) %>/<%= StringUtils.byteDesc(Runtime.getRuntime().maxMemory()) %>)</h2> {code} It could use the same improvement as HDFS-94 to reflect live heap usage more accurately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
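The distinction HDFS-94 draws can be sketched as follows. This is an illustrative example, not the actual patch: Runtime.totalMemory() is only the committed heap, so a live-usage display subtracts freeMemory() from it.

```java
// Sketch of HDFS-94-style heap reporting: show *used* heap, not just the
// committed total, so the summary tracks live usage.
public class HeapSummary {
    // Live heap in bytes; a JSP would format this with StringUtils.byteDesc(...).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory(); // used = committed - free
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Heap: " + usedHeapBytes() + " used / "
                + rt.totalMemory() + " committed / " + rt.maxMemory() + " max");
    }
}
```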
[jira] [Updated] (MAPREDUCE-5188) error when verify FileType of RS_SOURCE in getCompanionBlocks in BlockPlacementPolicyRaid.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5188: Labels: BB2015-05-TBR contrib/raid (was: contrib/raid) error when verify FileType of RS_SOURCE in getCompanionBlocks in BlockPlacementPolicyRaid.java --- Key: MAPREDUCE-5188 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5188 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 2.0.2-alpha Reporter: junjin Assignee: junjin Priority: Critical Labels: BB2015-05-TBR, contrib/raid Fix For: 2.0.2-alpha Attachments: MAPREDUCE-5188.patch There is an error when verifying the FileType of RS_SOURCE in getCompanionBlocks in BlockPlacementPolicyRaid.java: xorParityLength on line 379 needs to be changed to rsParityLength, since that check verifies the RS_SOURCE type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5365) Set mapreduce.job.classloader to true by default
[ https://issues.apache.org/jira/browse/MAPREDUCE-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5365: Labels: BB2015-05-TBR (was: ) Set mapreduce.job.classloader to true by default Key: MAPREDUCE-5365 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5365 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: BB2015-05-TBR Attachments: MAPREDUCE-5365.patch MAPREDUCE-1700 introduced the mapreduce.job.classloader option, which uses a custom classloader to separate system classes from user classes. It seems like there are only rare cases when a user would not want this on, and it should be enabled by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4346: Labels: BB2015-05-TBR (was: ) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Labels: BB2015-05-TBR Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4330) TaskAttemptCompletedEventTransition invalidates previously successful attempt without checking if the newly completed attempt is successful
[ https://issues.apache.org/jira/browse/MAPREDUCE-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4330: Labels: BB2015-05-TBR (was: ) TaskAttemptCompletedEventTransition invalidates previously successful attempt without checking if the newly completed attempt is successful --- Key: MAPREDUCE-4330 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4330 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Labels: BB2015-05-TBR Attachments: MAPREDUCE-4330-20130415.1.patch, MAPREDUCE-4330-20130415.patch, MAPREDUCE-4330-21032013.1.patch, MAPREDUCE-4330-21032013.patch The previously completed attempt is removed from successAttemptCompletionEventNoMap and marked OBSOLETE. After that, if the newly completed attempt is successful then it is added to the successAttemptCompletionEventNoMap. This seems wrong because the newly completed attempt could be failed and thus there is no need to invalidate the successful attempt. One error case would be when a speculative attempt completes with killed/failed after the successful version has completed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4273) Make CombineFileInputFormat split result JDK independent
[ https://issues.apache.org/jira/browse/MAPREDUCE-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4273: Labels: BB2015-05-TBR (was: ) Make CombineFileInputFormat split result JDK independent Key: MAPREDUCE-4273 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4273 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 1.0.3 Reporter: Luke Lu Assignee: Yu Gao Labels: BB2015-05-TBR Attachments: MAPREDUCE-4273-branch1-v2.patch, mapreduce-4273-branch-1.patch, mapreduce-4273-branch-2.patch, mapreduce-4273.patch The split result of CombineFileInputFormat depends on the iteration order of the nodeToBlocks and rackToBlocks hash maps, which makes the result dependent on the HashMap implementation and hence on the JDK. This manifests as TestCombineFileInputFormat failures on alternative JDKs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
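The underlying hazard can be shown in a few lines. This is an illustrative sketch, not the actual patch: HashMap's iteration order is an implementation detail that has changed between JDKs, while iterating a sorted view is well-defined everywhere.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: any split computed by iterating nodeToBlocks/rackToBlocks directly
// can differ across JVMs; iterating a sorted copy pins the traversal order.
public class DeterministicIteration {
    static List<String> iterationOrder(Map<String, Integer> m) {
        // TreeMap iterates keys in natural (sorted) order on every JDK.
        return new ArrayList<>(new TreeMap<>(m).keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> nodeToBlocks = new HashMap<>();
        nodeToBlocks.put("rack2/nodeB", 3);
        nodeToBlocks.put("rack1/nodeA", 5);
        nodeToBlocks.put("rack1/nodeC", 2);
        System.out.println(iterationOrder(nodeToBlocks)); // sorted, JDK-independent
    }
}
```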
[jira] [Updated] (MAPREDUCE-5377) JobID is not displayed truly by hadoop job -history command
[ https://issues.apache.org/jira/browse/MAPREDUCE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5377: Labels: BB2015-05-TBR newbie (was: newbie) JobID is not displayed truly by hadoop job -history command - Key: MAPREDUCE-5377 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5377 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.2.0 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Priority: Minor Labels: BB2015-05-TBR, newbie Attachments: MAPREDUCE-5377.patch The JobID output by the hadoop job -history command is a wrong string. {quote} [hadoop@hadoop hadoop]$ hadoop job -history terasort Hadoop job: 0001_1374260789919_hadoop = Job tracker host name: job job tracker start time: Tue May 18 15:39:51 PDT 1976 User: hadoop JobName: TeraSort JobConf: hdfs://hadoop:8020/hadoop/mapred/staging/hadoop/.staging/job_201307191206_0001/job.xml Submitted At: 19-7-2013 12:06:29 Launched At: 19-7-2013 12:06:30 (0sec) Finished At: 19-7-2013 12:06:44 (14sec) Status: SUCCESS {quote} In this example, it should show job_201307191206_0001 after Hadoop job:, but it shows 0001_1374260789919_hadoop. In addition, the job tracker host name and job tracker start time are invalid. This problem can be solved by fixing the setting of jobId in HistoryViewer(). In addition, the JobTracker information in HistoryViewer should be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5150) Backport 2009 terasort (MAPREDUCE-639) to branch-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5150: Labels: BB2015-05-TBR (was: ) Backport 2009 terasort (MAPREDUCE-639) to branch-1 -- Key: MAPREDUCE-5150 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5150 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Affects Versions: 1.2.0 Reporter: Gera Shegalov Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-5150-branch-1.patch Users evaluate the performance of Hadoop clusters using benchmarks such as TeraSort. However, the terasort version in branch-1 is outdated. It works on a teragen dataset that cannot exceed 4 billion unique keys, and it does not have the fast non-sampling partitioner SimplePartitioner either. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3936) Clients should not enforce counter limits
[ https://issues.apache.org/jira/browse/MAPREDUCE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3936: Labels: BB2015-05-TBR (was: ) Clients should not enforce counter limits -- Key: MAPREDUCE-3936 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3936 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Tom White Assignee: Tom White Labels: BB2015-05-TBR Attachments: MAPREDUCE-3936.patch, MAPREDUCE-3936.patch The code for enforcing counter limits (from MAPREDUCE-1943) creates a static JobConf instance to load the limits, which may throw an exception if the client limit is set to be lower than the limit on the cluster (perhaps because the cluster limit was raised from the default). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6251) JobClient needs additional retries at a higher level to address not-immediately-consistent dfs corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6251: Labels: BB2015-05-TBR (was: ) JobClient needs additional retries at a higher level to address not-immediately-consistent dfs corner cases --- Key: MAPREDUCE-6251 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6251 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 2.6.0 Reporter: Craig Welch Assignee: Craig Welch Labels: BB2015-05-TBR Attachments: MAPREDUCE-6251.0.patch, MAPREDUCE-6251.1.patch, MAPREDUCE-6251.2.patch, MAPREDUCE-6251.3.patch, MAPREDUCE-6251.4.patch The JobClient is used to get job status information for running and completed jobs. Final state and history for a job is communicated from the application master to the job history server via a distributed file system: the history is uploaded by the application master to the DFS and then scanned/loaded by the job history server. While HDFS has strong consistency guarantees, not all Hadoop DFSs do. When used in conjunction with a distributed file system that does not have this guarantee, there will be cases where the history server does not see an uploaded file, resulting in the dreaded no such job and a null value for the RunningJob in the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
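The "higher-level retry" idea can be sketched generically. This is a hedged sketch with hypothetical names, not the real JobClient API: a lookup that may transiently return null on a not-immediately-consistent file system is retried a bounded number of times with a pause between attempts.

```java
import java.util.function.Supplier;

// Sketch: retry a lookup (e.g. a history-file scan) until it becomes visible
// or the attempt budget is exhausted.
public class EventuallyConsistentLookup {
    static <T> T retryUntilVisible(Supplier<T> lookup, int maxAttempts, long sleepMillis) {
        T result = null;
        for (int attempt = 0; attempt < maxAttempts && result == null; attempt++) {
            result = lookup.get();                  // transiently null on a lagging DFS
            if (result == null && attempt < maxAttempts - 1) {
                try {
                    Thread.sleep(sleepMillis);      // back off before re-scanning
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // give up early if interrupted
                    break;
                }
            }
        }
        return result;                              // may still be null after maxAttempts
    }
}
```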
[jira] [Updated] (MAPREDUCE-5819) Binary token merge should be done once in TokenCache#obtainTokensForNamenodesInternal()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5819: Labels: BB2015-05-TBR (was: ) Binary token merge should be done once in TokenCache#obtainTokensForNamenodesInternal() --- Key: MAPREDUCE-5819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5819 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Labels: BB2015-05-TBR Attachments: mapreduce-5819-v1.txt Currently mergeBinaryTokens() is called by every invocation of obtainTokensForNamenodesInternal(FileSystem, Credentials, Configuration) in the loop of obtainTokensForNamenodesInternal(Credentials, Path[], Configuration). This can be simplified so that mergeBinaryTokens() is called only once in obtainTokensForNamenodesInternal(Credentials, Path[], Configuration). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
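The shape of the change is a classic loop-invariant hoist. A minimal sketch with stand-in names (not the real TokenCache internals): work that does not depend on the loop variable, like mergeBinaryTokens(), runs once before the loop instead of once per path.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: hoist an iteration-invariant call out of the per-path loop.
public class HoistInvariantWork {
    static int mergeCalls = 0;

    static void mergeBinaryTokens() { mergeCalls++; }   // stand-in for the real merge

    static void obtainTokensForPaths(List<String> paths) {
        mergeBinaryTokens();            // hoisted: called exactly once
        for (String p : paths) {
            // per-path token acquisition would go here; no merge needed per path
        }
    }

    public static void main(String[] args) {
        obtainTokensForPaths(Arrays.asList("/a", "/b", "/c"));
        System.out.println("merge calls: " + mergeCalls); // 1, regardless of path count
    }
}
```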
[jira] [Updated] (MAPREDUCE-2340) optimize JobInProgress.initTasks()
[ https://issues.apache.org/jira/browse/MAPREDUCE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-2340: Labels: BB2015-05-TBR critical-0.22.0 (was: critical-0.22.0) optimize JobInProgress.initTasks() -- Key: MAPREDUCE-2340 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2340 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.20.1, 0.21.0 Reporter: Kang Xiao Labels: BB2015-05-TBR, critical-0.22.0 Attachments: MAPREDUCE-2340.patch, MAPREDUCE-2340.patch, MAPREDUCE-2340.r1.diff JobTracker's hostnameToNodeMap cache can speed up JobInProgress.initTasks() and JobInProgress.createCache() significantly. A test with 1 job of 10 maps on a 2400-node cluster shows nearly 10x and 50x speedups for initTasks() and createCache() respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5258) Memory Leak while using LocalJobRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5258: Labels: BB2015-05-TBR patch (was: patch) Memory Leak while using LocalJobRunner -- Key: MAPREDUCE-5258 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5258 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.2 Reporter: Subroto Sanyal Assignee: skrho Labels: BB2015-05-TBR, patch Fix For: 1.1.3 Attachments: mapreduce-5258 _001.txt, mapreduce-5258.txt Every time a LocalJobRunner is launched, it creates a JobTrackerInstrumentation and QueueMetrics. While creating this MetricsSystem, it registers and adds a callback to an ArrayList which keeps growing, since the DefaultMetricsSystem is a singleton. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
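The leak pattern is easy to reproduce in miniature. This is an illustrative sketch with hypothetical names, not the actual DefaultMetricsSystem code: a process-wide singleton that only ever appends callbacks grows without bound when short-lived runners register repeatedly and never deregister.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: add-only registration on singleton state leaks; pairing each
// register with an unregister bounds the list.
public class CallbackLeak {
    static final List<Runnable> CALLBACKS = new ArrayList<>(); // singleton state

    static void register(Runnable cb) { CALLBACKS.add(cb); }       // leaky: add-only

    static void unregister(Runnable cb) { CALLBACKS.remove(cb); }  // the fix

    public static void main(String[] args) {
        for (int run = 0; run < 1000; run++) {      // 1000 LocalJobRunner-style launches
            Runnable cb = () -> { };
            register(cb);
            // ... job runs ...
            unregister(cb); // without this line, CALLBACKS retains 1000 entries
        }
        System.out.println("callbacks retained: " + CALLBACKS.size());
    }
}
```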
[jira] [Updated] (MAPREDUCE-6350) JobHistory doesn't support fully-functional search
[ https://issues.apache.org/jira/browse/MAPREDUCE-6350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6350: Labels: BB2015-05-TBR (was: ) JobHistory doesn't support fully-functional search -- Key: MAPREDUCE-6350 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6350 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Reporter: Siqi Li Assignee: Siqi Li Priority: Critical Labels: BB2015-05-TBR Attachments: YARN-1614.v1.patch, YARN-1614.v2.patch The job history server only outputs the first 50 characters of job names in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6284) Add a 'task attempt state' to MapReduce Application Master REST API
[ https://issues.apache.org/jira/browse/MAPREDUCE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6284: Labels: BB2015-05-TBR (was: ) Add a 'task attempt state' to MapReduce Application Master REST API --- Key: MAPREDUCE-6284 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6284 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Ryu Kobayashi Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6284.1.patch, MAPREDUCE-6284.1.patch, MAPREDUCE-6284.2.patch, MAPREDUCE-6284.3.patch, MAPREDUCE-6284.3.patch It would be useful to have a 'task attempt state' resource, similar to the existing 'App state' REST API: GET http://<proxy http address:port>/proxy/<application_id>/ws/v1/mapreduce/jobs/job_id/tasks/task_id/attempts/attempt_id/state PUT http://<proxy http address:port>/proxy/<application_id>/ws/v1/mapreduce/jobs/job_id/tasks/task_id/attempts/attempt_id/state -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6338) MR AppMaster does not honor ephemeral port range
[ https://issues.apache.org/jira/browse/MAPREDUCE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6338: Labels: BB2015-05-TBR (was: ) MR AppMaster does not honor ephemeral port range Key: MAPREDUCE-6338 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6338 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 2.6.0 Reporter: Frank Nguyen Assignee: Frank Nguyen Labels: BB2015-05-TBR Attachments: MAPREDUCE-6338.002.patch The MR AppMaster should only use port ranges defined in the yarn.app.mapreduce.am.job.client.port-range property. On initial startup of the MRAppMaster, it does use the port range defined in the property. However, it also opens up a listener on a random ephemeral port. This is not the Jetty listener. It is another listener opened by the MRAppMaster via another thread and is recognized by the RM. Other nodes will try to communicate with it via that random port. With firewall settings on, the MR job will fail because the random port is not open. This problem has forced others to open all OS ephemeral ports in order to run MR jobs. This is related to MAPREDUCE-4079. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6332: Labels: BB2015-05-TBR (was: ) Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Labels: BB2015-05-TBR Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch MR gives the user the ability to plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who supplies such a plugin is often also interested in implementing their own MergeManagerImpl, but is currently forced to use the MR-provided MergeManagerImpl when using the shuffle consumer plugin class. There should be well-defined APIs in MergeManager that any implementation can use, so that a custom implementation does not require much effort from the user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5733) Define and use a constant for property textinputformat.record.delimiter
[ https://issues.apache.org/jira/browse/MAPREDUCE-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5733: Labels: BB2015-05-TBR (was: ) Define and use a constant for property textinputformat.record.delimiter - Key: MAPREDUCE-5733 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5733 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Gelesh Assignee: Gelesh Priority: Trivial Labels: BB2015-05-TBR Attachments: MAPREDUCE-5733.patch, MAPREDUCE-5733_2.patch Original Estimate: 10m Remaining Estimate: 10m (Configuration) conf.set("textinputformat.record.delimiter", myDelimiter) is prone to typos. Let's have the property name as a static String constant in some class, to minimise such errors. This would also let IDEs like Eclipse suggest the String. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
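The proposal amounts to a one-line constant. A minimal sketch, with a hypothetical class name (not necessarily what the patch chose): the property key lives in one place, so callers cannot misspell it and IDEs can complete it.

```java
// Sketch: a single definition of the property key for callers to reference.
public class TextInputFormatConstants {
    public static final String RECORD_DELIMITER = "textinputformat.record.delimiter";

    public static void main(String[] args) {
        // Callers would write conf.set(TextInputFormatConstants.RECORD_DELIMITER, myDelimiter)
        // instead of retyping the raw string.
        System.out.println(RECORD_DELIMITER);
    }
}
```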
[jira] [Updated] (MAPREDUCE-5203) Make AM of M/R Use NMClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5203: Labels: BB2015-05-TBR (was: ) Make AM of M/R Use NMClient --- Key: MAPREDUCE-5203 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5203 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Labels: BB2015-05-TBR Attachments: MAPREDUCE-5203.1.patch, MAPREDUCE-5203.2.patch, MAPREDUCE-5203.3.patch, MAPREDUCE-5203.4.patch, MAPREDUCE-5203.5.patch YARN-422 adds NMClient. AM of mapreduce should use it instead of using the raw ContainerManager proxy directly. ContainerLauncherImpl needs to be changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-2632: Labels: BB2015-05-TBR (was: ) Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V Assignee: Ravi Teja Ch N V Labels: BB2015-05-TBR Attachments: MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch We can avoid the call to the partitioner when the number of reducers is 1. This will avoid unnecessary computation by the partitioner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
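The shortcut can be sketched as follows. This is a hypothetical shape, not the actual MapTask code: with a single reducer, every record belongs to partition 0, so the partitioner call can be skipped entirely.

```java
// Sketch: bypass the partitioner when there is only one reduce task.
public class SinglePartitionShortcut {
    interface Partitioner<K> { int getPartition(K key, int numPartitions); }

    static <K> int partitionFor(K key, int numReduceTasks, Partitioner<K> partitioner) {
        if (numReduceTasks == 1) {
            return 0;                                  // only one possible partition
        }
        return partitioner.getPartition(key, numReduceTasks);
    }

    public static void main(String[] args) {
        Partitioner<String> hashPartitioner =
                (k, n) -> (k.hashCode() & Integer.MAX_VALUE) % n;
        System.out.println(partitionFor("word", 1, hashPartitioner)); // partitioner not invoked
        System.out.println(partitionFor("word", 4, hashPartitioner));
    }
}
```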
[jira] [Updated] (MAPREDUCE-5374) CombineFileRecordReader does not set map.input.* configuration parameters for first file read
[ https://issues.apache.org/jira/browse/MAPREDUCE-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5374: Labels: BB2015-05-TBR (was: ) CombineFileRecordReader does not set map.input.* configuration parameters for first file read --- Key: MAPREDUCE-5374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5374 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.2.0 Reporter: Dave Beech Assignee: Dave Beech Labels: BB2015-05-TBR Attachments: MAPREDUCE-5374.patch, MAPREDUCE-5374.patch The CombineFileRecordReader operates on splits consisting of multiple files. Each time a new record reader is initialised for a chunk, certain parameters are supposed to be set on the configuration object (map.input.file, map.input.start and map.input.length) However, the first reader is initialised in a different way to subsequent ones (i.e. initialize is called by the MapTask directly rather than from inside the record reader class). Because of this, these config parameters are not set properly and are returned as null when you access them from inside a mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG
[ https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5981: Labels: BB2015-05-TBR (was: ) Log levels of certain MR logs can be changed to DEBUG - Key: MAPREDUCE-5981 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Labels: BB2015-05-TBR Attachments: MAPREDUCE-5981.patch The following MapReduce logs can be changed to DEBUG log level. 1. In org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost (Fetcher.java:313), the second log is not required at INFO level. It can be moved to DEBUG, as a WARN log is printed anyway if verifyReply fails. SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey); LOG.info("for url="+msgToEncode+" sent hash and received reply"); 2. Thread-related info need not be printed at INFO level. The two logs below can be moved to DEBUG: a) In org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost (ShuffleSchedulerImpl.java:381): LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() + " to " + Thread.currentThread().getName()); b) In org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost (ShuffleSchedulerImpl.java:411): LOG.info("assigned " + includedMaps + " of " + totalSize + " to " + host + " to " + Thread.currentThread().getName()); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5362) clean up POM dependencies
[ https://issues.apache.org/jira/browse/MAPREDUCE-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5362: Labels: BB2015-05-TBR (was: ) clean up POM dependencies - Key: MAPREDUCE-5362 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5362 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Labels: BB2015-05-TBR Attachments: MAPREDUCE-5362.patch, mr-5362-0.patch Intermediate 'pom' modules define dependencies inherited by leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules like in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6020) Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job count
[ https://issues.apache.org/jira/browse/MAPREDUCE-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6020: Labels: BB2015-05-TBR (was: ) Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter in JobInProgress - Key: MAPREDUCE-6020 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6020 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.23.10 Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-TBR Attachments: MAPREDUCE-6020.branch1.patch Too many threads block on the global JobTracker lock in getJobCounters. Many JobClients may call getJobCounters on the JobTracker at the same time, and the current code locks the JobTracker, blocking all threads that want to get counters from a JobInProgress. It is better to release the JobTracker lock before getting the counters from the JobInProgress (job.getCounters(counters)), so that all threads can access their own job's counters in parallel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
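The locking change can be sketched generically. This is a hedged sketch with hypothetical names, not JobTracker's real fields: hold the global lock only long enough to look up the job, then do the expensive per-job work outside it under the job's own lock.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: narrow the global critical section to the lookup; compute under a
// per-job lock so concurrent clients of different jobs do not serialize.
public class NarrowLockScope {
    static final Object GLOBAL_LOCK = new Object();
    static final Map<String, int[]> JOBS = new HashMap<>(); // jobId -> counters

    static int sumCounters(String jobId) {
        int[] counters;
        synchronized (GLOBAL_LOCK) {       // short critical section: lookup only
            counters = JOBS.get(jobId);
        }
        if (counters == null) return 0;    // unknown job
        int sum = 0;
        synchronized (counters) {          // per-job lock, not the global one
            for (int c : counters) sum += c;   // "expensive" work happens here
        }
        return sum;
    }
}
```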
[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5889: Labels: BB2015-05-TBR newbie (was: newbie) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String) --- Key: MAPREDUCE-5889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: BB2015-05-TBR, newbie Attachments: MAPREDUCE-5889.3.patch, MAPREDUCE-5889.patch, MAPREDUCE-5889.patch {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(Job job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
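A quick illustration of why the String overloads fail: a comma-separated list cannot represent a path that itself contains commas, so a String-based API necessarily splits one path into several. The helper name below is hypothetical.

```java
// Sketch: what any comma-separated String API must do to the input,
// and why a path containing commas gets mangled.
public class CommaPathSplit {
    static String[] parseCommaSeparated(String commaSeparatedPaths) {
        return commaSeparatedPaths.split(","); // the only parse available to a String API
    }

    public static void main(String[] args) {
        String path = "/path/file,with,comma";        // one real file
        String[] parsed = parseCommaSeparated(path);
        System.out.println(parsed.length + " paths"); // the single path became several
        // The Path... overloads avoid this entirely: each Path is passed whole.
    }
}
```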
[jira] [Updated] (MAPREDUCE-5929) YARNRunner.java, path for jobJarPath not set correctly
[ https://issues.apache.org/jira/browse/MAPREDUCE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5929: Labels: BB2015-05-TBR newbie patch (was: newbie patch) YARNRunner.java, path for jobJarPath not set correctly -- Key: MAPREDUCE-5929 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5929 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.2.0 Reporter: Chao Tian Assignee: Rahul Palamuttam Labels: BB2015-05-TBR, newbie, patch Attachments: MAPREDUCE-5929.patch In YARNRunner.java, line 357, Path jobJarPath = new Path(jobConf.get(MRJobConfig.JAR)); This causes the job.jar file to miss the scheme, host and port number on distributed file systems other than HDFS. If we compare line 357 with line 344, where job.xml is actually set as Path jobConfPath = new Path(jobSubmitDir,MRJobConfig.JOB_CONF_FILE); it appears jobSubmitDir is missing on line 357, which causes this problem. In HDFS, the additional qualify process will correct this problem, but not on other generic distributed file systems. The proposed change is to replace line 357 with Path jobJarPath = new Path(jobConf.get(jobSubmitDir,MRJobConfig.JAR)); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6038) A boolean may be set error in the Word Count v2.0 in MapReduce Tutorial
[ https://issues.apache.org/jira/browse/MAPREDUCE-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6038: Labels: BB2015-05-TBR (was: ) A boolean may be set error in the Word Count v2.0 in MapReduce Tutorial --- Key: MAPREDUCE-6038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6038 Project: Hadoop Map/Reduce Issue Type: Bug Environment: java version 1.8.0_11 hostspot 64-bit Reporter: Pei Ma Assignee: Tsuyoshi Ozawa Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6038.1.patch As a beginner, when I learned the basics of MR, I found that I couldn't run WordCount2 using the command bin/hadoop jar wc.jar WordCount2 /user/joe/wordcount/input /user/joe/wordcount/output from the Tutorial. The VM threw a NullPointerException at line 47. On line 45, the default value returned by conf.getBoolean is true. That is to say, when wordcount.skip.patterns is not set, WordCount2 still continues to execute getCacheFiles, and patternsURIs gets a null value. When the -skip option doesn't exist, wordcount.skip.patterns is not set, and a NullPointerException comes out. In short, the block after the if-statement on line 45 shouldn't be executed when the -skip option doesn't exist in the command. Maybe line 45 should read if (conf.getBoolean("wordcount.skip.patterns", false)) { — just change the boolean default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
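The effect of the default value can be modeled in miniature. A Properties map stands in for Hadoop's Configuration here (so the names are illustrative, not the tutorial's actual code): with a default of true, the skip-patterns branch runs even when the option was never set, and the follow-up cache lookup then returns null.

```java
import java.util.Properties;

// Sketch: why the default for an opt-in flag must be false.
public class DefaultBooleanGuard {
    static boolean shouldSkip(Properties conf, boolean defaultValue) {
        String v = conf.getProperty("wordcount.skip.patterns");
        return v == null ? defaultValue : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();          // -skip was NOT given
        System.out.println(shouldSkip(conf, true));  // branch runs anyway -> NPE later
        System.out.println(shouldSkip(conf, false)); // branch safely skipped
    }
}
```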
[jira] [Updated] (MAPREDUCE-5817) mappers get rescheduled on node transition even after all reducers are completed
[ https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5817: Labels: BB2015-05-TBR (was: ) mappers get rescheduled on node transition even after all reducers are completed Key: MAPREDUCE-5817 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.3.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Labels: BB2015-05-TBR Attachments: mapreduce-5817.patch We're seeing a behavior where a job keeps running long after all reducers have finished. We found that the job was rescheduling and running a number of mappers beyond the point of reducer completion. In one situation, the job ran for some 9 more hours after all reducers completed! This happens because whenever a node transition (to an unusable state) comes into the app master, it reschedules all mappers that already ran on the node, in all cases. Therefore, any node transition has the potential to extend the job's runtime. Once this window opens, another node transition can prolong it, and in theory this can happen indefinitely. If there is some instability in the node pool (unhealthy nodes, etc.) for a period, any big job is severely vulnerable to this problem. If all reducers have completed, JobImpl.actOnUnusableNode() should not reschedule mapper tasks: the mapper outputs are no longer needed, and rescheduled mappers would produce output that is never consumed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
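The proposed guard can be sketched as a pure predicate (hypothetical names; the real JobImpl.actOnUnusableNode would consult its own reducer-completion state):

```java
public class NodeLossPolicy {
    // Sketch of the guard proposed in the issue: once every reducer has
    // finished, map outputs can no longer be fetched by anyone, so there
    // is no reason to re-run completed mappers from a lost node.
    static boolean shouldRescheduleMappers(int completedReducers, int totalReducers) {
        return completedReducers < totalReducers;
    }
}
```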
[jira] [Updated] (MAPREDUCE-5490) MapReduce doesn't set the environment variable for children processes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5490: Labels: BB2015-05-TBR (was: ) MapReduce doesn't set the environment variable for children processes - Key: MAPREDUCE-5490 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5490 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.2.1 Reporter: Owen O'Malley Assignee: Owen O'Malley Labels: BB2015-05-TBR Attachments: MAPREDUCE-5490.patch, mr-5490.patch, mr-5490.patch Currently, MapReduce uses the command line argument to pass the classpath to the child. This breaks if the process forks a child that needs the same classpath. Such a case happens in Hive when it uses map-side joins. I propose that we make MapReduce in branch-1 use the CLASSPATH environment variable like YARN does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5499) Fix synchronization issues of the setters/getters of *PBImpl which take in/return lists
[ https://issues.apache.org/jira/browse/MAPREDUCE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5499: Labels: BB2015-05-TBR (was: ) Fix synchronization issues of the setters/getters of *PBImpl which take in/return lists --- Key: MAPREDUCE-5499 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5499 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Xuan Gong Labels: BB2015-05-TBR Attachments: MAPREDUCE-5499.1.patch, MAPREDUCE-5499.2.patch Similar to YARN-609. There are the following *PBImpls which need to be fixed: 1. GetDiagnosticsResponsePBImpl 2. GetTaskAttemptCompletionEventsResponsePBImpl 3. GetTaskReportsResponsePBImpl 4. CounterGroupPBImpl 5. JobReportPBImpl 6. TaskReportPBImpl -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5392) mapred job -history all command throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5392: Labels: BB2015-05-TBR (was: ) mapred job -history all command throws IndexOutOfBoundsException -- Key: MAPREDUCE-5392 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5392 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0, 2.0.5-alpha, 2.2.0 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Labels: BB2015-05-TBR Attachments: MAPREDUCE-5392.2.patch, MAPREDUCE-5392.3.patch, MAPREDUCE-5392.4.patch, MAPREDUCE-5392.5.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch When I use the all option of the mapred job -history command, the following exception is thrown and the command does not work. {code} Exception in thread main java.lang.StringIndexOutOfBoundsException: String index out of range: -3 at java.lang.String.substring(String.java:1875) at org.apache.hadoop.mapreduce.util.HostUtil.convertTrackerNameToHostName(HostUtil.java:49) at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.getTaskLogsUrl(HistoryViewer.java:459) at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.printAllTaskAttempts(HistoryViewer.java:235) at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.print(HistoryViewer.java:117) at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:472) at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1233) {code} This is because the node name recorded in the history file is not prefixed with tracker_. This change therefore modifies the code so the history file can be read even when the node name has no tracker_ prefix. 
In addition, it fixes the URL of the displayed task log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
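The substring failure above comes from assuming every node name carries the tracker_ prefix. A defensive sketch of the conversion (a hypothetical helper mirroring the role of HostUtil.convertTrackerNameToHostName, not the actual patch):

```java
public class TrackerNames {
    // The original logic assumed names like "tracker_host:port" and
    // blindly stripped the prefix and the ":port" suffix; a name without
    // the "tracker_" prefix makes the substring bounds negative.
    public static String toHostName(String trackerName) {
        String name = trackerName.startsWith("tracker_")
                ? trackerName.substring("tracker_".length())
                : trackerName;                 // tolerate a missing prefix
        int colon = name.indexOf(':');
        return colon >= 0 ? name.substring(0, colon) : name;
    }
}
```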
[jira] [Updated] (MAPREDUCE-4065) Add .proto files to built tarball
[ https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4065: Labels: BB2015-05-TBR (was: ) Add .proto files to built tarball - Key: MAPREDUCE-4065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.2, 2.4.0 Reporter: Ralph H Castain Assignee: Tsuyoshi Ozawa Labels: BB2015-05-TBR Attachments: MAPREDUCE-4065.1.patch Please add the .proto files to the built tarball so that users can build 3rd party tools that use protocol buffers without having to do an svn checkout of the source code. Sorry I don't know more about Maven, or I would provide a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6030) In mr-jobhistory-daemon.sh, some env variables are not affected by mapred-env.sh
[ https://issues.apache.org/jira/browse/MAPREDUCE-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6030: Labels: BB2015-05-TBR (was: ) In mr-jobhistory-daemon.sh, some env variables are not affected by mapred-env.sh Key: MAPREDUCE-6030 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6030 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 2.4.1 Reporter: Youngjoon Kim Assignee: Youngjoon Kim Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6030.patch In mr-jobhistory-daemon.sh, some env variables are exported before sourcing mapred-env.sh, so these variables don't use values defined in mapred-env.sh. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6040) distcp should automatically use /.reserved/raw when run by the superuser
[ https://issues.apache.org/jira/browse/MAPREDUCE-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6040: Labels: BB2015-05-TBR (was: ) distcp should automatically use /.reserved/raw when run by the superuser Key: MAPREDUCE-6040 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6040 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distcp Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb Labels: BB2015-05-TBR Attachments: HDFS-6134-Distcp-cp-UseCasesTable2.pdf, MAPREDUCE-6040.001.patch, MAPREDUCE-6040.002.patch On HDFS-6134, [~sanjay.radia] asked for distcp to automatically prepend /.reserved/raw if the distcp is being performed by the superuser and /.reserved/raw is supported by both the source and destination filesystems. This behavior only occurs if none of the source and target pathnames is already /.reserved/raw. The -disablereservedraw flag can be used to disable this option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6068) Illegal progress value warnings in map tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6068: Labels: BB2015-05-TBR (was: ) Illegal progress value warnings in map tasks Key: MAPREDUCE-6068 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6068 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, task Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Binglin Chang Labels: BB2015-05-TBR Attachments: MAPREDUCE-6068.002.patch, MAPREDUCE-6068.v1.patch When running a terasort on latest trunk, I see the following in my task logs: {code} 2014-09-02 17:42:28,437 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 2014-09-02 17:42:42,238 WARN [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is larger than 1. Progress will be changed to 1 2014-09-02 17:42:42,238 WARN [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is larger than 1. Progress will be changed to 1 2014-09-02 17:42:42,241 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output {code} We should eliminate these warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6315) Implement retrieval of logs for crashed MR-AM via jhist in the staging directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6315: Labels: BB2015-05-TBR (was: ) Implement retrieval of logs for crashed MR-AM via jhist in the staging directory Key: MAPREDUCE-6315 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6315 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, mr-am Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Critical Labels: BB2015-05-TBR Attachments: MAPREDUCE-6315.001.patch When all AM attempts crash, there is no record of them in JHS. Thus no easy way to get the logs. This JIRA automates the procedure by utilizing the jhist file in the staging directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6246: Labels: BB2015-05-TBR DB2 mapreduce (was: DB2 mapreduce) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2 - Key: MAPREDUCE-6246 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 2.4.1 Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x Platform: xSeries, pSeries Browser: Firefox, IE Security Settings: No Security, Flat file, LDAP, PAM File System: HDFS, GPFS FPO Reporter: ramtin Assignee: ramtin Labels: BB2015-05-TBR, DB2, mapreduce Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.patch Original Estimate: 24h Remaining Estimate: 24h DBOutputFormat is used for writing the output of MapReduce jobs to a database; when used with DB2 JDBC drivers it fails with the following error: com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) The DBOutputFormat class has a constructQuery method that generates the INSERT INTO statement with a semicolon (;) at the end. The semicolon is the ANSI SQL-92 statement terminator, but this feature is disabled (OFF) by default in IBM DB2, although it can be turned ON with -t (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). However, some products are already built on top of the default (OFF) setting, so turning this feature ON would make them error-prone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
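A sketch of a constructQuery variant that simply omits the trailing terminator (the signature loosely mirrors DBOutputFormat.constructQuery; it is illustrative, not the attached patch):

```java
public class InsertQueryBuilder {
    // Builds the parameterized INSERT statement without the trailing
    // semicolon, which IBM DB2 rejects unless statement termination is
    // explicitly enabled with -t.
    public static String constructQuery(String table, String[] fieldNames) {
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table);
        query.append(" (").append(String.join(",", fieldNames)).append(")");
        query.append(" VALUES (");
        for (int i = 0; i < fieldNames.length; i++) {
            query.append(i == 0 ? "?" : ",?"); // one placeholder per column
        }
        query.append(")"); // no ';' terminator appended here
        return query.toString();
    }
}
```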
[jira] [Updated] (MAPREDUCE-6316) Task Attempt List entries should link to the task overview
[ https://issues.apache.org/jira/browse/MAPREDUCE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6316: Labels: BB2015-05-TBR (was: ) Task Attempt List entries should link to the task overview -- Key: MAPREDUCE-6316 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6316 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Siqi Li Assignee: Siqi Li Labels: BB2015-05-TBR Attachments: AM attempt page.png, AM task page.png, All Attempts page.png, MAPREDUCE-6316.v1.patch, MAPREDUCE-6316.v2.patch, MAPREDUCE-6316.v3.patch, Task Overview page.png A typical workflow is to click on the list of failed attempts, then look at the counters, or at the list of attempts of just one task in general. If the task-id portion of each task attempt id linked back to the task, we would not have to go through the list of tasks searching for it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5465) Container killed before hprof dumps profile.out
[ https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5465: Labels: BB2015-05-TBR (was: ) Container killed before hprof dumps profile.out --- Key: MAPREDUCE-5465 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Radim Kolar Assignee: Ming Ma Labels: BB2015-05-TBR Attachments: MAPREDUCE-5465-2.patch, MAPREDUCE-5465-3.patch, MAPREDUCE-5465-4.patch, MAPREDUCE-5465-5.patch, MAPREDUCE-5465-6.patch, MAPREDUCE-5465-7.patch, MAPREDUCE-5465-8.patch, MAPREDUCE-5465-9.patch, MAPREDUCE-5465.patch If profiling is enabled for a mapper or reducer, hprof dumps profile.out at process exit. It is dumped after the task has signaled to the AM that its work is finished, and the AM kills the container without waiting for hprof to finish its dumps. If hprof is producing larger output (such as with depth=4, while depth=3 works), it cannot finish the dump in time before being killed, making the entire dump unusable because the CPU and heap stats are missing. There needs to be a better delay before the container is killed when profiling is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6305) AM/Task log page should be able to link back to the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6305: Labels: BB2015-05-TBR (was: ) AM/Task log page should be able to link back to the job --- Key: MAPREDUCE-6305 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6305 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Siqi Li Assignee: Siqi Li Labels: BB2015-05-TBR Attachments: MAPREDUCE-6305.v1.patch, MAPREDUCE-6305.v2.patch, MAPREDUCE-6305.v3.patch, MAPREDUCE-6305.v4.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6241) Native compilation fails for Checksum.cc due to an incompatibility of assembler register constraint for PowerPC
[ https://issues.apache.org/jira/browse/MAPREDUCE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6241: Labels: BB2015-05-TBR features (was: features) Native compilation fails for Checksum.cc due to an incompatibility of assembler register constraint for PowerPC Key: MAPREDUCE-6241 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6241 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 3.0.0, 2.6.0 Environment: Debian/Jessie, kernel 3.18.5, ppc64 GNU/Linux gcc (Debian 4.9.1-19) protobuf 2.6.1 OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-2) OpenJDK Zero VM (build 24.65-b04, interpreted mode) source was cloned (and updated) from Apache-Hadoop's git repository Reporter: Stephan Drescher Assignee: Binglin Chang Priority: Minor Labels: BB2015-05-TBR, features Attachments: MAPREDUCE-6241.001.patch, MAPREDUCE-6241.002.patch Issue when using assembler code for performance optimization on the powerpc platform (compiled for 32bit) mvn compile -Pnative -DskipTests [exec] /usr/bin/c++ -Dnativetask_EXPORTS -m32 -DSIMPLE_MEMCPY -fno-strict-aliasing -Wall -Wno-sign-compare -g -O2 -DNDEBUG -fPIC -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native/javah -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test 
-I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native -I/home/hadoop/Java/java7/include -I/home/hadoop/Java/java7/include/linux -isystem /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/gtest/include -o CMakeFiles/nativetask.dir/main/native/src/util/Checksum.cc.o -c /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc [exec] CMakeFiles/nativetask.dir/build.make:744: recipe for target 'CMakeFiles/nativetask.dir/main/native/src/util/Checksum.cc.o' failed [exec] make[2]: Leaving directory '/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native' [exec] CMakeFiles/Makefile2:95: recipe for target 'CMakeFiles/nativetask.dir/all' failed [exec] make[1]: Leaving directory '/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native' [exec] Makefile:76: recipe for target 'all' failed [exec] /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc: In function ‘void NativeTask::init_cpu_support_flag()’: /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc:611:14: error: impossible register constraint in ‘asm’ -- popl %%ebx : =a (eax), [ebx] =r(ebx), =c(ecx), =d(edx) : a (eax_in) : cc); -- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6336) Enable v2 FileOutputCommitter by default
[ https://issues.apache.org/jira/browse/MAPREDUCE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6336: Labels: BB2015-05-TBR (was: ) Enable v2 FileOutputCommitter by default Key: MAPREDUCE-6336 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6336 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 2.7.0 Reporter: Gera Shegalov Assignee: Siqi Li Labels: BB2015-05-TBR Attachments: MAPREDUCE-6336.v1.patch This JIRA is to propose making new FileOutputCommitter behavior from MAPREDUCE-4815 enabled by default in trunk, and potentially in branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6269) improve JobConf to add option to not share Credentials between jobs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6269: Labels: BB2015-05-TBR (was: ) improve JobConf to add option to not share Credentials between jobs. Key: MAPREDUCE-6269 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6269 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-TBR Attachments: MAPREDUCE-6269.000.patch Improve JobConf by adding a constructor that avoids sharing Credentials between jobs. By default the Credentials will be shared to keep backward compatibility. We can add a new constructor with a new parameter to decide whether to share Credentials. Some issues reported in Cascading are due to corrupted credentials, at https://github.com/Cascading/cascading/commit/45b33bb864172486ac43782a4d13329312d01c0e If we add this support in JobConf, it will benefit all job clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6298) Job#toString throws an exception when not in state RUNNING
[ https://issues.apache.org/jira/browse/MAPREDUCE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6298: Labels: BB2015-05-TBR (was: ) Job#toString throws an exception when not in state RUNNING -- Key: MAPREDUCE-6298 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6298 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Lars Francke Assignee: Lars Francke Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6298.1.patch Job#toString calls {{ensureState(JobState.RUNNING);}} as the very first thing. That method causes an Exception to be thrown which is not nice. One thing this breaks is usage of Job on the Scala (e.g. Spark) REPL as that calls toString after every invocation and that fails every time. I'll attach a patch that checks state and if it's RUNNING prints the original message and if not prints something else. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
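The patch idea — check the state before printing — can be sketched with a tiny stand-in class (hypothetical; the real Job delegates to ensureState(JobState.RUNNING), which throws):

```java
public class StatefulJob {
    enum State { DEFINE, RUNNING }

    private final State state;
    private final String jobId;

    StatefulJob(State state, String jobId) {
        this.state = state;
        this.jobId = jobId;
    }

    // Instead of ensureState(RUNNING) throwing from toString (which
    // breaks REPLs like the Scala/Spark shell that render every value),
    // fall back to a static description when the job has not started.
    @Override
    public String toString() {
        if (state != State.RUNNING) {
            return "Job: " + jobId + " (state: " + state + ")";
        }
        return "Job: " + jobId + " (running)";
    }
}
```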
[jira] [Updated] (MAPREDUCE-6356) Misspelling of threshold in log4j.properties for tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6356: Labels: BB2015-05-TBR (was: ) Misspelling of threshold in log4j.properties for tests -- Key: MAPREDUCE-6356 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6356 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6356.patch The log4j.properties file for tests contains the misspelling log4j.threshhold. We should use the correct spelling, log4j.threshold. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
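For reference, the corrected key as it would appear in the test log4j.properties (the ALL value is illustrative):

```properties
# misspelled key, silently ignored by log4j:
# log4j.threshhold=ALL
log4j.threshold=ALL
```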
[jira] [Updated] (MAPREDUCE-2094) org.apache.hadoop.mapreduce.lib.input.FileInputFormat: isSplitable implements unsafe default behaviour that is different from the documented behaviour.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-2094: Labels: BB2015-05-TBR (was: ) org.apache.hadoop.mapreduce.lib.input.FileInputFormat: isSplitable implements unsafe default behaviour that is different from the documented behaviour. --- Key: MAPREDUCE-2094 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2094 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Niels Basjes Assignee: Niels Basjes Labels: BB2015-05-TBR Attachments: MAPREDUCE-2094-2011-05-19.patch, MAPREDUCE-2094-20140727-svn-fixed-spaces.patch, MAPREDUCE-2094-20140727-svn.patch, MAPREDUCE-2094-20140727.patch, MAPREDUCE-2094-2015-05-05-2328.patch, MAPREDUCE-2094-FileInputFormat-docs-v2.patch When implementing a custom derivative of FileInputFormat we ran into the effect that a large Gzipped input file would be processed several times. A near 1GiB file would be processed around 36 times in its entirety. Thus producing garbage results and taking up a lot more CPU time than needed. It took a while to figure out and what we found is that the default implementation of the isSplittable method in [org.apache.hadoop.mapreduce.lib.input.FileInputFormat | http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java?view=markup ] is simply return true;. This is a very unsafe default and is in contradiction with the JavaDoc of the method which states: Is the given filename splitable? Usually, true, but if the file is stream compressed, it will not be. . The actual implementation effectively does Is the given filename splitable? Always true, even if the file is stream compressed using an unsplittable compression codec. 
For our situation (where we always have Gzipped input) we took the easy way out and simply implemented an isSplittable in our class that does return false; Now there are essentially 3 ways I can think of for fixing this (in order of what I would find preferable): # Implement something that looks at the used compression of the file (i.e. do migrate the implementation from TextInputFormat to FileInputFormat). This would make the method do what the JavaDoc describes. # Force developers to think about it and make this method abstract. # Use a safe default (i.e. return false) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
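Option 1 above — deciding splittability from the file's compression codec — can be sketched with a suffix check standing in for the codec lookup (illustrative only; the real TextInputFormat consults a CompressionCodecFactory and checks for SplittableCompressionCodec):

```java
public class SplitCheck {
    // Stand-in for looking up a CompressionCodec from the file name.
    // The JavaDoc-described behaviour is: splittable, unless the file is
    // stream-compressed with a non-splittable codec.
    static boolean isSplittable(String fileName) {
        if (fileName.endsWith(".gz")) {
            return false;  // gzip streams cannot be split
        }
        if (fileName.endsWith(".bz2")) {
            return true;   // bzip2 is block-compressed and splittable
        }
        return true;       // uncompressed data is splittable
    }
}
```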
[jira] [Updated] (MAPREDUCE-6279) AM should explicitly exit JVM after all services have stopped
[ https://issues.apache.org/jira/browse/MAPREDUCE-6279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6279: Labels: BB2015-05-TBR (was: ) AM should explicitly exit JVM after all services have stopped Key: MAPREDUCE-6279 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6279 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Eric Payne Labels: BB2015-05-TBR Attachments: MAPREDUCE-6279.v1.txt, MAPREDUCE-6279.v2.txt, MAPREDUCE-6279.v3.patch, MAPREDUCE-6279.v4.patch Occasionally the MapReduce AM can get stuck trying to shut down. MAPREDUCE-6049 and MAPREDUCE-5888 were specific instances that have been fixed, but this can also occur with uber jobs if the task code inadvertently leaves non-daemon threads lingering. We should explicitly shut down the JVM after the MapReduce AM has unregistered and all services have been stopped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6174) Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6174: Labels: BB2015-05-TBR (was: ) Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput. --- Key: MAPREDUCE-6174 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6174 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 3.0.0, 2.6.0 Reporter: Eric Payne Assignee: Eric Payne Labels: BB2015-05-TBR Attachments: MAPREDUCE-6174.002.patch, MAPREDUCE-6174.003.patch, MAPREDUCE-6174.v1.txt Per MAPREDUCE-6166, both InMemoryMapOutput and OnDiskMapOutput will be doing similar things with regards to IFile streams. In order to make it explicit that InMemoryMapOutput and OnDiskMapOutput are different from 3rd-party implementations, this JIRA will make them subclass a common class (see https://issues.apache.org/jira/browse/MAPREDUCE-6166?focusedCommentId=14223368page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14223368) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5747) Potential null pointer dereference in HsTasksBlock#render()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5747: Labels: BB2015-05-TBR newbie patch (was: newbie patch) Potential null pointer dereference in HsTasksBlock#render() - Key: MAPREDUCE-5747 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5747 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Ted Yu Priority: Minor Labels: BB2015-05-TBR, newbie, patch Attachments: MAPREDUCE-5747-1.patch At line 140: {code} } else { ta = new TaskAttemptInfo(successful, type, false); {code} There is no null check for type. The TaskAttemptInfo ctor dereferences type: {code} public TaskAttemptInfo(TaskAttempt ta, TaskType type, Boolean isRunning) { final TaskAttemptReport report = ta.getReport(); this.type = type.toString(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
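A minimal sketch of a null-safe guard at the call site (hypothetical helper; the alternative is simply checking type for null before constructing TaskAttemptInfo):

```java
public class NullGuard {
    // The TaskAttemptInfo ctor dereferences `type` via type.toString();
    // a null-safe conversion at the call site avoids the NPE described
    // above. "UNKNOWN" is an illustrative fallback, not from the patch.
    static String describeType(Object type) {
        return type == null ? "UNKNOWN" : type.toString();
    }
}
```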
[jira] [Updated] (MAPREDUCE-6337) add a mode to replay MR job history files to the timeline service
[ https://issues.apache.org/jira/browse/MAPREDUCE-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6337: Labels: BB2015-05-TBR (was: ) add a mode to replay MR job history files to the timeline service - Key: MAPREDUCE-6337 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6337 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Sangjin Lee Assignee: Sangjin Lee Labels: BB2015-05-TBR Attachments: MAPREDUCE-6337-YARN-2928.001.patch The subtask covers the work on top of YARN-3437 to add a mode to replay MR job history files to the timeline service storage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6079) Renaming JobImpl#username to reporterUserName
[ https://issues.apache.org/jira/browse/MAPREDUCE-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6079: Labels: BB2015-05-TBR (was: ) Renaming JobImpl#username to reporterUserName - Key: MAPREDUCE-6079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6079 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Tsuyoshi Ozawa Assignee: Tsuyoshi Ozawa Labels: BB2015-05-TBR Attachments: MAPREDUCE-6079.1.patch On MAPREDUCE-6033, we found the bug because of confusing field names {{userName}} and {{username}}. We should change the names to distinguish them easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6320) Configuration of retrieved Job via Cluster is not properly set-up
[ https://issues.apache.org/jira/browse/MAPREDUCE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6320: Labels: BB2015-05-TBR (was: ) Configuration of retrieved Job via Cluster is not properly set-up - Key: MAPREDUCE-6320 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6320 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jens Rabe Assignee: Jens Rabe Labels: BB2015-05-TBR Attachments: MAPREDUCE-6320.001.patch, MAPREDUCE-6320.002.patch, MAPREDUCE-6320.003.patch When getting a Job via the Cluster API, it is not correctly configured. To reproduce this: # Submit a MR job, and set some arbitrary parameter to its configuration {code:java} job.getConfiguration().set(foo, bar); job.setJobName(foo-bug-demo); {code} # Get the job in a client: {code:java} final Cluster c = new Cluster(conf); final JobStatus[] statuses = c.getAllJobStatuses(); final JobStatus s = ... // get the status for the job named foo-bug-demo final Job j = c.getJob(s.getJobId()); final Configuration conf = job.getConfiguration(); {code} # Get its foo entry {code:java} final String s = conf.get(foo); {code} # Expected: s is bar; But: s is null. The reason is that the job's configuration is stored on HDFS (the Configuration has a resource with a *hdfs://* URL) and in the *loadResource* it is changed to a path on the local file system (hdfs://host.domain:port/tmp/hadoop-yarn/... is changed to /tmp/hadoop-yarn/...), which does not exist, and thus the configuration is not populated. The bug happens in the *Cluster* class, where *JobConfs* are created from *status.getJobFile()*. A quick fix would be to copy this job file to a temporary file in the local file system and populate the JobConf from this file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache
[ https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6128: Labels: BB2015-05-TBR (was: ) Automatic addition of bundled jars to distributed cache Key: MAPREDUCE-6128 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6128 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 2.5.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Labels: BB2015-05-TBR Attachments: MAPREDUCE-6128.v01.patch, MAPREDUCE-6128.v02.patch, MAPREDUCE-6128.v03.patch, MAPREDUCE-6128.v04.patch, MAPREDUCE-6128.v05.patch, MAPREDUCE-6128.v06.patch, MAPREDUCE-6128.v07.patch, MAPREDUCE-6128.v08.patch On the client side, JDK adds Class-Path elements from the job jar manifest on the classpath. In theory there could be many bundled jars in many directories such that adding them manually via libjars or similar means to task classpaths is cumbersome. If this property is enabled, the same jars are added to the task classpaths automatically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
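The report above relies on the JDK honoring the Class-Path attribute of the job jar's manifest on the client side. A self-contained sketch of how such Class-Path entries can be read (the class and method names here are illustrative, not the actual MAPREDUCE-6128 patch):

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestClassPathDemo {
    // Read the Class-Path manifest attribute, which the JDK interprets as a
    // whitespace-separated list of relative jar locations.
    static String[] bundledClassPath(Manifest mf) {
        String cp = mf.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        return cp == null ? new String[0] : cp.trim().split("\\s+");
    }

    public static void main(String[] args) {
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.CLASS_PATH, "lib/a.jar lib/b.jar");
        String[] entries = bundledClassPath(mf);
        System.out.println(entries.length);   // prints 2
        System.out.println(entries[0]);       // prints lib/a.jar
    }
}
```

The proposed improvement would then ship these same entries to the distributed cache so task classpaths match the client's.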
[jira] [Updated] (MAPREDUCE-4683) We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar
[ https://issues.apache.org/jira/browse/MAPREDUCE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4683: Labels: BB2015-05-TBR (was: ) We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar Key: MAPREDUCE-4683 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4683 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Arun C Murthy Assignee: Akira AJISAKA Priority: Critical Labels: BB2015-05-TBR Attachments: MAPREDUCE-4683.patch We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar, need this before MAPREDUCE-4253 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6310) Add jdiff support to MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6310: Labels: BB2015-05-TBR (was: ) Add jdiff support to MapReduce -- Key: MAPREDUCE-6310 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6310 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Li Lu Assignee: Li Lu Priority: Blocker Labels: BB2015-05-TBR Attachments: MAPRED-6310-040615.patch Previously we used jdiff for Hadoop common and HDFS. Now we're extending the support of jdiff to YARN. Probably we'd like to do similar things with MapReduce? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6271) org.apache.hadoop.mapreduce.Cluster GetJob() display warn log
[ https://issues.apache.org/jira/browse/MAPREDUCE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6271: Labels: BB2015-05-TBR (was: ) org.apache.hadoop.mapreduce.Cluster GetJob() display warn log - Key: MAPREDUCE-6271 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6271 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.7.0 Reporter: Peng Zhang Assignee: Peng Zhang Labels: BB2015-05-TBR Attachments: MAPREDUCE-6271.v2.patch, MR-6271.patch When using getJob() with MapReduce 2.7, a warning caused by the configuration being loaded twice is displayed every time. And when the job has completed, this function will display a warning with java.io.FileNotFoundException. I think this is related to MAPREDUCE-5875; the change in getJob() seems unnecessary, since it was only needed for a test. {noformat} 15/03/04 13:41:23 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:23 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 15/03/04 13:41:24 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:24 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
15/03/04 13:41:25 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:25 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 15/03/04 13:41:26 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:26 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 15/03/04 13:41:27 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:27 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 15/03/04 13:41:28 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:28 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
15/03/04 13:41:29 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:29 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 15/03/04 13:41:29 INFO exec.Task: 2015-03-04 13:41:29,853 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.37 sec 15/03/04 13:41:30 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 15/03/04 13:41:30 WARN conf.Configuration: hdfs://example/yarn/example2/staging/test_user/.staging/job_1425388652704_0116/job.xml:an attempt to override final parameter:
[jira] [Updated] (MAPREDUCE-6296) A better way to deal with InterruptedException on waitForCompletion
[ https://issues.apache.org/jira/browse/MAPREDUCE-6296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6296: Labels: BB2015-05-TBR (was: ) A better way to deal with InterruptedException on waitForCompletion --- Key: MAPREDUCE-6296 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6296 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Yang Hao Assignee: Yang Hao Labels: BB2015-05-TBR Attachments: MAPREDUCE-6296.patch Some code in the waitForCompletion method of the Job class is {code:title=Job.java|borderStyle=solid} public boolean waitForCompletion(boolean verbose ) throws IOException, InterruptedException, ClassNotFoundException { if (state == JobState.DEFINE) { submit(); } if (verbose) { monitorAndPrintJob(); } else { // get the completion poll interval from the client. int completionPollIntervalMillis = Job.getCompletionPollInterval(cluster.getConf()); while (!isComplete()) { try { Thread.sleep(completionPollIntervalMillis); } catch (InterruptedException ie) { } } } return isSuccessful(); } {code} but a better way to deal with InterruptedException is {code:title=Job.java|borderStyle=solid} public boolean waitForCompletion(boolean verbose ) throws IOException, InterruptedException, ClassNotFoundException { if (state == JobState.DEFINE) { submit(); } if (verbose) { monitorAndPrintJob(); } else { // get the completion poll interval from the client. int completionPollIntervalMillis = Job.getCompletionPollInterval(cluster.getConf()); while (!isComplete()) { try { Thread.sleep(completionPollIntervalMillis); } catch (InterruptedException ie) { Thread.currentThread().interrupt(); } } } return isSuccessful(); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3517) map.input.path is null at the first split when use CombieFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3517: Labels: BB2015-05-TBR (was: ) map.input.path is null at the first split when use CombieFileInputFormat --- Key: MAPREDUCE-3517 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3517 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.20.203.0 Reporter: wanbin Labels: BB2015-05-TBR Attachments: CombineFileRecordReader.diff, MAPREDUCE-3517.02.patch map.input.path is null at the first split when using CombineFileInputFormat, because in the runNewMapper function the mapper runs with mapContext instead of taskContext, and map.input.path is only set on taskContext. We therefore need to set map.input.path on mapContext as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5883) Total megabyte-seconds in job counters is slightly misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5883: Labels: BB2015-05-TBR (was: ) Total megabyte-seconds in job counters is slightly misleading --- Key: MAPREDUCE-5883 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5883 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.4.0 Reporter: Nathan Roberts Assignee: Nathan Roberts Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-5883.patch The following counters are in milliseconds, so "megabyte-seconds" would be better stated as "megabyte-milliseconds": MB_MILLIS_MAPS.name= Total megabyte-seconds taken by all map tasks MB_MILLIS_REDUCES.name=Total megabyte-seconds taken by all reduce tasks VCORES_MILLIS_MAPS.name= Total vcore-seconds taken by all map tasks VCORES_MILLIS_REDUCES.name=Total vcore-seconds taken by all reduce tasks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
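The mislabeling matters because the counter value is container memory (MB) multiplied by task runtime in milliseconds, so reading it as "megabyte-seconds" overstates resource usage by a factor of 1000. A minimal arithmetic sketch (method name hypothetical, not the actual counter code):

```java
public class MbMillisDemo {
    // The counter value is container memory in MB times task runtime in
    // milliseconds, so its accurate unit is megabyte-milliseconds.
    static long mbMillis(long containerMb, long runtimeMillis) {
        return containerMb * runtimeMillis;
    }

    public static void main(String[] args) {
        long v = mbMillis(1024, 60_000);  // a 1 GB container running for one minute
        System.out.println(v);            // prints 61440000 (MB-ms)
        System.out.println(v / 1000);     // prints 61440 (the true MB-s value)
    }
}
```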
[jira] [Updated] (MAPREDUCE-6027) mr jobs with relative paths can fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6027: Labels: BB2015-05-TBR (was: ) mr jobs with relative paths can fail Key: MAPREDUCE-6027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6027 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Wing Yew Poon Assignee: Wing Yew Poon Labels: BB2015-05-TBR Attachments: MAPREDUCE-6027.patch I built hadoop from branch-2 and tried to run terasort as follows: {noformat} wypoon$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-SNAPSHOT.jar terasort sort-input sort-output 14/08/07 08:57:55 INFO terasort.TeraSort: starting 2014-08-07 08:57:56.229 java[36572:1903] Unable to load realm info from SCDynamicStore 14/08/07 08:57:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/08/07 08:57:57 INFO input.FileInputFormat: Total input paths to process : 2 Spent 156ms computing base-splits. Spent 2ms computing TeraScheduler splits. Computing input splits took 159ms Sampling 2 splits of 2 Making 1 from 10 sampled records Computing parititions took 626ms Spent 789ms computing partitions. 
14/08/07 08:57:57 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032 14/08/07 08:57:58 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/wypoon/.staging/job_1407426900134_0001 java.lang.IllegalArgumentException: Can not create a Path from an empty URI at org.apache.hadoop.fs.Path.checkPathArg(Path.java:140) at org.apache.hadoop.fs.Path.<init>(Path.java:192) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.checkPermissionOfOther(ClientDistributedCacheManager.java:275) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.ancestorsHaveExecutePermissions(ClientDistributedCacheManager.java:256) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.isPublic(ClientDistributedCacheManager.java:243) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineCacheVisibilities(ClientDistributedCacheManager.java:162) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:58) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303) at 
org.apache.hadoop.examples.terasort.TeraSort.run(TeraSort.java:316) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.examples.terasort.TeraSort.main(TeraSort.java:325) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145) at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} If I used absolute paths for the input and out directories, the job runs fine. This breakage is due to HADOOP-10876. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
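The failure above stems from a relative path reaching the distributed-cache code unqualified. A self-contained sketch of the qualification step (java.nio is used here as a stand-in for Hadoop's Path.makeQualified, so this is an analogy, not the actual fix):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class QualifyPathDemo {
    // Resolve a possibly-relative path against a working directory, mirroring
    // what qualifying a Hadoop Path against the default FileSystem achieves.
    static Path qualify(Path workingDir, String userPath) {
        Path p = Paths.get(userPath);
        return p.isAbsolute() ? p : workingDir.resolve(p).normalize();
    }

    public static void main(String[] args) {
        Path wd = Paths.get("/user/wypoon");
        System.out.println(qualify(wd, "sort-input"));        // resolved under the working dir
        System.out.println(qualify(wd, "/tmp/sort-output"));  // absolute paths pass through
    }
}
```

With paths qualified up front, the "Can not create a Path from an empty URI" case never arises, which is why absolute input/output directories make the job run fine.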
[jira] [Updated] (MAPREDUCE-5876) SequenceFileRecordReader NPE if close() is called before initialize()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5876: Labels: BB2015-05-TBR (was: ) SequenceFileRecordReader NPE if close() is called before initialize() - Key: MAPREDUCE-5876 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5876 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.3.0, 2.4.0 Reporter: Reinis Vicups Assignee: Tsuyoshi Ozawa Labels: BB2015-05-TBR Attachments: MAPREDUCE-5876.1.patch org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader extends org.apache.hadoop.mapreduce.RecordReader, which in turn implements java.io.Closeable. According to the Java spec, java.io.Closeable#close() has to be idempotent (http://docs.oracle.com/javase/7/docs/api/java/io/Closeable.html), which it is not here. An NPE is thrown if the close() method is invoked without previously calling the initialize() method. This happens because the SequenceFile.Reader field {{in}} is null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
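A minimal, self-contained sketch of the null-guard that makes close() safe to call before initialize() and idempotent afterwards (a plain java.io.Closeable stands in for SequenceFile.Reader; the class is hypothetical, not the actual patch):

```java
import java.io.Closeable;
import java.io.IOException;

// Stand-in for SequenceFileRecordReader: close() must tolerate being called
// before initialize() and being called twice (java.io.Closeable's contract).
public class GuardedReader implements Closeable {
    private Closeable in;   // null until initialize() runs
    int closed = 0;         // how many times the underlying reader was closed

    void initialize() {
        in = () -> { };     // real code would open the SequenceFile here
    }

    @Override
    public synchronized void close() throws IOException {
        if (in != null) {   // guard: no-op when never initialized or already closed
            in.close();
            in = null;
            closed++;
        }
    }

    public static void main(String[] args) throws IOException {
        GuardedReader r = new GuardedReader();
        r.close();          // no NPE before initialize()
        r.initialize();
        r.close();
        r.close();          // idempotent: second call is a no-op
        System.out.println(r.closed); // prints 1
    }
}
```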
[jira] [Updated] (MAPREDUCE-6003) Resource Estimator suggests huge map output in some cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-6003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6003: Labels: BB2015-05-TBR (was: ) Resource Estimator suggests huge map output in some cases - Key: MAPREDUCE-6003 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6003 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.2.1 Reporter: Chengbing Liu Assignee: Chengbing Liu Labels: BB2015-05-TBR Attachments: MAPREDUCE-6003-branch-1.2.patch In some cases, ResourceEstimator can return a far too large map output estimation. This happens when the input size is not correctly calculated. A typical case is joining two Hive tables (one in HDFS and the other in HBase). The maps that process the HBase table finish first, and they report an input length of 0 due to its TableInputFormat. Then, for a map that processes the HDFS table, the estimated output size is very large because of the wrong input size, making it impossible to assign the map task. There are two possible solutions to this problem: (1) Make the input size correct for each case, e.g. HBase, etc. (2) Use another algorithm to estimate the map output, or at least make it closer to reality. I prefer the second way, since the first would require all possibilities to be taken care of, which is not easy for some inputs such as URIs. In my opinion, we could make a second estimation which is independent of the input size: estimationB = (completedMapOutputSize / completedMaps) * totalMaps * 10 Here, multiplying by 10 makes the estimation more conservative, so that the task is less likely to be assigned somewhere without enough space. The former estimation goes like this: estimationA = (inputSize * completedMapOutputSize * 2.0) / completedMapInputSize My suggestion is to take the minimum of the two estimations: estimation = min(estimationA, estimationB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
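The two estimates described above are plain arithmetic and can be sketched directly (method names are illustrative, not the actual ResourceEstimator API):

```java
public class MapOutputEstimator {
    // estimationA: the existing input-size-proportional estimate, which blows up
    // when completedMapInputSize is wrongly tiny (e.g. zero-length HBase splits).
    static long estimationA(long inputSize, long completedMapOutputSize,
                            long completedMapInputSize) {
        return (long) ((inputSize * completedMapOutputSize * 2.0) / completedMapInputSize);
    }

    // estimationB: the proposed input-size-independent estimate, padded by 10x
    // to stay conservative.
    static long estimationB(long completedMapOutputSize, int completedMaps, int totalMaps) {
        return (completedMapOutputSize / completedMaps) * (long) totalMaps * 10L;
    }

    // The suggestion: take the minimum of the two.
    static long estimate(long a, long b) {
        return Math.min(a, b);
    }

    public static void main(String[] args) {
        // A bogus 1-byte completedMapInputSize makes estimationA astronomical;
        // estimationB stays bounded, and the min picks the sane value.
        long a = estimationA(1_000_000_000L, 10_000_000L, 1L);
        long b = estimationB(10_000_000L, 5, 20);
        System.out.println(estimate(a, b)); // prints 400000000
    }
}
```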
[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3182: Labels: BB2015-05-TBR (was: ) loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He Labels: BB2015-05-TBR Attachments: MAPREDUCE-3182.patch If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-1380) Adaptive Scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-1380: Labels: BB2015-05-TBR (was: ) Adaptive Scheduler -- Key: MAPREDUCE-1380 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1380 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.4.1 Reporter: Jordà Polo Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-1380-branch-1.2.patch, MAPREDUCE-1380_0.1.patch, MAPREDUCE-1380_1.1.patch, MAPREDUCE-1380_1.1.pdf The Adaptive Scheduler is a pluggable Hadoop scheduler that automatically adjusts the amount of used resources depending on the performance of jobs and on user-defined high-level business goals. Existing Hadoop schedulers are focused on managing large, static clusters in which nodes are added or removed manually. On the other hand, the goal of this scheduler is to improve the integration of Hadoop and the applications that run on top of it with environments that allow a more dynamic provisioning of resources. The current implementation is quite straightforward. Users specify a deadline at job submission time, and the scheduler adjusts the resources to meet that deadline (at the moment, the scheduler can be configured to either minimize or maximize the amount of resources). If multiple jobs are run simultaneously, the scheduler prioritizes them by deadline. Note that the current approach to estimate the completion time of jobs is quite simplistic: it is based on the time it takes to finish each task, so it works well with regular jobs, but there is still room for improvement for unpredictable jobs. The idea is to further integrate it with cloud-like and virtual environments (such as Amazon EC2, Emotive, etc.) so that if, for instance, a job isn't able to meet its deadline, the scheduler automatically requests more resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5845) TestShuffleHandler failing intermittently on windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5845: Labels: BB2015-05-TBR (was: ) TestShuffleHandler failing intermittently on windows Key: MAPREDUCE-5845 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5845 Project: Hadoop Map/Reduce Issue Type: Test Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: apache-mapreduce-5845.0.patch TestShuffleHandler fails intermittently on Windows - specifically, testClientClosesConnection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5225) SplitSampler in mapreduce.lib should use a SPLIT_STEP to jump around splits
[ https://issues.apache.org/jira/browse/MAPREDUCE-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5225: Labels: BB2015-05-TBR (was: ) SplitSampler in mapreduce.lib should use a SPLIT_STEP to jump around splits --- Key: MAPREDUCE-5225 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5225 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Labels: BB2015-05-TBR Attachments: MAPREDUCE-5225.1.patch Now, SplitSampler only samples the first maxSplitsSampled splits, caused by MAPREDUCE-1820. However, sampling across all splits is in general preferable to sampling only the first N splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4216) Make MultipleOutputs generic to support non-file output formats
[ https://issues.apache.org/jira/browse/MAPREDUCE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4216: Labels: BB2015-05-TBR Output (was: Output) Make MultipleOutputs generic to support non-file output formats --- Key: MAPREDUCE-4216 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4216 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 1.0.2 Reporter: Robbie Strickland Labels: BB2015-05-TBR, Output Attachments: MAPREDUCE-4216.patch The current MultipleOutputs implementation is tied to FileOutputFormat in such a way that it is not extensible to other types of output. It should be made more generic, such as with an interface that can be implemented for different outputs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4840) Delete dead code and deprecate public API related to skipping bad records
[ https://issues.apache.org/jira/browse/MAPREDUCE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4840: Labels: BB2015-05-TBR (was: ) Delete dead code and deprecate public API related to skipping bad records - Key: MAPREDUCE-4840 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4840 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Mostafa Elhemali Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-4840.patch It looks like the decision was made in MAPREDUCE-1932 to remove support for skipping bad records rather than fix it (it doesn't work right now in trunk). If that's the case, then we should probably delete all the dead code related to it and deprecate the public APIs for it, right? Dead code I'm talking about: 1. Task class: skipping, skipRanges, writeSkipRecs 2. MapTask class: SkippingRecordReader inner class 3. ReduceTask class: SkippingReduceValuesIterator inner class 4. Tests: TestBadRecords Public API: 1. SkipBadRecords class -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3115) OOM When the value for the property mapred.map.multithreadedrunner.class is set to MultithreadedMapper instance.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3115: Labels: BB2015-05-TBR (was: ) OOM When the value for the property mapred.map.multithreadedrunner.class is set to MultithreadedMapper instance. -- Key: MAPREDUCE-3115 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3115 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.23.0, 1.0.0 Environment: NA Reporter: Bhallamudi Venkata Siva Kamesh Labels: BB2015-05-TBR Attachments: MAPREDUCE-3115.2.patch, MAPREDUCE-3115.patch When we set the value for the property *mapred.map.multithreadedrunner.class* to an instance of MultithreadedMapper using MultithreadedMapper.setMapperClass(), it simply throws IllegalArgumentException. But when we set the same property via the job's conf object, using job.getConfiguration().setClass(*mapred.map.multithreadedrunner.class*, MultithreadedMapper.class, Mapper.class), it throws an OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5969) Private non-Archive Files' size add twice in Distributed Cache directory size calculation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5969: Labels: BB2015-05-TBR (was: ) Private non-Archive Files' size add twice in Distributed Cache directory size calculation. -- Key: MAPREDUCE-5969 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5969 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-TBR Attachments: MAPREDUCE-5969.branch1.1.patch, MAPREDUCE-5969.branch1.patch Private non-Archive Files' sizes are added twice in the Distributed Cache directory size calculation. The Private non-Archive Files list is passed in via the -files command line option. The Distributed Cache directory size is used to check whether the total cache file size exceeds the cache size limitation; the default cache size limitation is 10G. I added logging in addCacheInfoUpdate and setSize in TrackerDistributedCacheManager.java. I used the following command to test: hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount -files hdfs://host:8022/tmp/zxu/WordCount.java,hdfs://host:8022/tmp/zxu/wordcount.jar /tmp/zxu/test_in/ /tmp/zxu/test_out to add two files into the distributed cache: WordCount.java and wordcount.jar. The WordCount.java file size is 2395 bytes and the wordcount.jar file size is 3865 bytes. The total should be 6260. 
The log shows these file sizes added twice: once before download to the local node and a second time after download to the local node, so the total file number becomes 4 instead of 2: addCacheInfoUpdate size: 6260 num: 2 baseDir: /mapred/local addCacheInfoUpdate size: 8683 num: 3 baseDir: /mapred/local addCacheInfoUpdate size: 12588 num: 4 baseDir: /mapred/local In the code, for a Private non-Archive File, the first time we add the file size is in getLocalCache: {code} if (!isArchive) { //for private archives, the lengths come over RPC from the //JobLocalizer since the JobLocalizer is the one who expands //archives and gets the total length lcacheStatus.size = fileStatus.getLen(); LOG.info("getLocalCache: " + localizedPath + " size = " + lcacheStatus.size); // Increase the size and sub directory count of the cache // from baseDirSize and baseDirNumberSubDir. baseDirManager.addCacheInfoUpdate(lcacheStatus); } {code} The second time we add the file size is in setSize: {code} synchronized (status) { status.size = size; baseDirManager.addCacheInfoUpdate(status); } {code} The fix is to not add the file size for Private non-Archive Files again after download (downloadCacheObject). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
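The double-count above can be prevented by recording, per cache entry, whether its size has already been counted. A self-contained sketch of that accounting guard (class and field names are hypothetical, not the actual TrackerDistributedCacheManager code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the accounting fix: mark each cache entry once its size has been
// added, so a second addCacheInfoUpdate for the same entry is a no-op.
public class CacheSizeAccounting {
    static class CacheStatus {
        long size;
        boolean counted;            // set once the size has been added to the total
        CacheStatus(long size) { this.size = size; }
    }

    private long baseDirSize;
    private final Map<String, CacheStatus> entries = new HashMap<>();

    void addCacheInfoUpdate(String key, CacheStatus st) {
        entries.putIfAbsent(key, st);
        if (!st.counted) {          // guard against the before/after-download double add
            baseDirSize += st.size;
            st.counted = true;
        }
    }

    long baseDirSize() { return baseDirSize; }

    public static void main(String[] args) {
        CacheSizeAccounting acc = new CacheSizeAccounting();
        CacheStatus wc = new CacheStatus(2395);
        acc.addCacheInfoUpdate("WordCount.java", wc);   // before download
        acc.addCacheInfoUpdate("WordCount.java", wc);   // after download: not re-added
        acc.addCacheInfoUpdate("wordcount.jar", new CacheStatus(3865));
        System.out.println(acc.baseDirSize());           // prints 6260
    }
}
```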
[jira] [Updated] (MAPREDUCE-5227) JobTrackerMetricsSource and QueueMetrics should standardize naming rules
[ https://issues.apache.org/jira/browse/MAPREDUCE-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5227: Labels: BB2015-05-TBR (was: ) JobTrackerMetricsSource and QueueMetrics should standardize naming rules Key: MAPREDUCE-5227 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5227 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.1.3, 1.2.1 Reporter: Tsuyoshi Ozawa Assignee: Tsuyoshi Ozawa Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-5227-1.1-branch.1.patch, MAPREDUCE-5227-branch-1.1.patch, MAPREDUCE-5227.1.patch JobTrackerMetricsSource and QueueMetrics provide users with some metrics, but their naming rules (jobs_running, running_maps, running_reduces) sometimes confuse users. They should be standardized. One concern is backward compatibility, so one idea is to share a MetricMutableGaugeInt object between the old and new property names, e.g. to share runningMaps between running_maps and maps_running. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
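The aliasing idea above amounts to registering one gauge object under both names. A self-contained sketch, with AtomicInteger standing in for MetricMutableGaugeInt (the class and method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Backward-compatible renaming: register the same gauge object under both the
// legacy and the standardized metric name, so updates stay in sync for free.
public class SharedGaugeDemo {
    private final Map<String, AtomicInteger> gauges = new HashMap<>();

    void registerAlias(String oldName, String newName) {
        AtomicInteger gauge = new AtomicInteger();
        gauges.put(oldName, gauge);
        gauges.put(newName, gauge);   // same object, two names
    }

    void increment(String name, int delta) {
        gauges.get(name).addAndGet(delta);
    }

    int value(String name) {
        return gauges.get(name).get();
    }

    public static void main(String[] args) {
        SharedGaugeDemo metrics = new SharedGaugeDemo();
        metrics.registerAlias("running_maps", "maps_running");
        metrics.increment("running_maps", 7);
        System.out.println(metrics.value("maps_running")); // prints 7: both names agree
    }
}
```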
[jira] [Updated] (MAPREDUCE-5700) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/MAPREDUCE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5700: Labels: BB2015-05-TBR (was: ) historyServer can't show container's log when aggregation is not enabled Key: MAPREDUCE-5700 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5700 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.23.7, 2.0.4-alpha, 2.2.0 Environment: yarn.log-aggregation-enable=false , HistoryServer will show like this: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Reporter: Hong Shen Assignee: Hong Shen Labels: BB2015-05-TBR Attachments: yarn-647-2.patch, yarn-647.patch When yarn.log-aggregation-enable is set to false, after an MR app completes we can't view the container's log from the HistoryServer; it shows a message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 We don't want to aggregate the container's logs, because that puts pressure on the namenode, but sometimes we still want to take a look at a container's log. Should the HistoryServer show the container's log even if yarn.log-aggregation-enable is set to false? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5248: Labels: BB2015-05-TBR (was: ) Let NNBenchWithoutMR specify the replication factor for its test Key: MAPREDUCE-5248 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5248 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, test Affects Versions: 3.0.0 Reporter: Erik Paulson Assignee: Erik Paulson Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-5248.patch, MAPREDUCE-5248.txt Original Estimate: 1h Remaining Estimate: 1h The NNBenchWithoutMR test creates files with a replicationFactorPerFile hard-coded to 1. It'd be nice to be able to specify that on the commandline. Also, it'd be great if MAPREDUCE-4750 was merged along with this fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5264) FileAlreadyExistsException is assumed to be thrown by FileSystem#mkdirs or FileContext#mkdir in the codebase
[ https://issues.apache.org/jira/browse/MAPREDUCE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5264: Labels: BB2015-05-TBR (was: ) FileAlreadyExistsException is assumed to be thrown by FileSystem#mkdirs or FileContext#mkdir in the codebase Key: MAPREDUCE-5264 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5264 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Rémy SAISSY Labels: BB2015-05-TBR Attachments: MAPREDUCE-5264.20130607.1.patch According to https://issues.apache.org/jira/browse/HADOOP-9438, FileSystem#mkdirs and FileContext#mkdir do not throw FileAlreadyExistsException if the directory already exists. Some places in the mapreduce codebase assume FileSystem#mkdirs or FileContext#mkdir throws FileAlreadyExistsException. At least the following files are concerned: - YarnChild.java - JobHistoryEventHandler.java - HistoryFileManager.java It would be good to re-review and patch this if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
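The HADOOP-9438 semantics can be demonstrated with the JDK's java.nio.file API, used here only as a stand-in for Hadoop's FileSystem/FileContext: like mkdirs, createDirectories succeeds silently when the directory already exists, so callers cannot rely on catching an exception to detect a pre-existing directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MkdirsSemantics {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mkdirs-demo").resolve("sub");

        Files.createDirectories(dir); // first call creates the directory
        Files.createDirectories(dir); // second call is a silent no-op — no exception

        // Code that needs to know whether the directory pre-existed must
        // check explicitly rather than catch FileAlreadyExistsException:
        System.out.println("exists: " + Files.exists(dir)); // exists: true
    }
}
```

Call sites that branched on FileAlreadyExistsException need an explicit existence check like this instead.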
[jira] [Updated] (MAPREDUCE-5708) Duplicate String.format in getSpillFileForWrite
[ https://issues.apache.org/jira/browse/MAPREDUCE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5708: Labels: BB2015-05-TBR (was: ) Duplicate String.format in getSpillFileForWrite --- Key: MAPREDUCE-5708 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5708 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Konstantin Weitz Priority: Minor Labels: BB2015-05-TBR Attachments: 0001-Removed-duplicate-String.format.patch Original Estimate: 10m Remaining Estimate: 10m The code responsible for formatting the spill file name (namely _getSpillFileForWrite_) unnecessarily calls _String.format_ twice. This not only affects performance, but also leads to the weird requirement that task attempt ids cannot contain _%_ characters (because these would be interpreted as format specifiers in the outer _String.format_ call). I assume this was done by mistake, as it could only be useful if task attempt ids contained _%n_. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
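The '%' hazard described above is easy to reproduce in isolation; the attempt id and file-name pattern below are made up for illustration:

```java
public class DoubleFormatHazard {
    public static void main(String[] args) {
        String attemptId = "attempt_%n_0001"; // hypothetical id containing '%n'

        // Single format: '%s' is substituted once; the '%n' inside the
        // argument is plain text and survives untouched.
        String once = String.format("output/%s/spill0.out", attemptId);
        System.out.println(once); // output/attempt_%n_0001/spill0.out

        // Formatting the result a second time reinterprets '%n'
        // as the line-separator specifier, corrupting the path.
        String twice = String.format(once);
        System.out.println(twice.contains(System.lineSeparator())); // true
    }
}
```

Dropping the outer call leaves attempt ids free to contain '%' without corrupting file names.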
[jira] [Updated] (MAPREDUCE-5216) While using TextSplitter in DataDrivenDBInputformat, the lower limit (split start) always remains the same, for all splits.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5216: Labels: BB2015-05-TBR (was: ) While using TextSplitter in DataDrivenDBInputformat, the lower limit (split start) always remains the same, for all splits. --- Key: MAPREDUCE-5216 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5216 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Gelesh Labels: BB2015-05-TBR Attachments: MAPREDUCE-5216.patch Original Estimate: 1h Remaining Estimate: 1h While using TextSplitter in DataDrivenDBInputformat, the lower limit (split start) always remains the same for all splits, i.e. Split 1: Start=A, End=M; Split 2: Start=A, End=P; Split 3: Start=A, End=S — instead of Split 1: Start=A, End=M; Split 2: Start=M, End=P; Split 3: Start=P, End=S. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
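The intended behavior — each split starting where the previous one ended — amounts to pairing adjacent boundary points instead of reusing the global lower bound for every split. A minimal sketch of that logic (not the actual TextSplitter code):

```java
import java.util.ArrayList;
import java.util.List;

public class ChainedSplits {
    // Turn an ordered list of boundary points into [start, end) split pairs,
    // so split i starts exactly where split i-1 ended.
    static List<String[]> toSplits(List<String> boundaries) {
        List<String[]> splits = new ArrayList<>();
        for (int i = 0; i + 1 < boundaries.size(); i++) {
            splits.add(new String[] { boundaries.get(i), boundaries.get(i + 1) });
        }
        return splits;
    }

    public static void main(String[] args) {
        for (String[] s : toSplits(List.of("A", "M", "P", "S"))) {
            System.out.println(s[0] + " -> " + s[1]);
        }
        // prints: A -> M, M -> P, P -> S
    }
}
```

The reported bug corresponds to emitting `(boundaries.get(0), boundaries.get(i + 1))` — every split anchored at the first boundary.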
[jira] [Updated] (MAPREDUCE-5871) Estimate Job Endtime
[ https://issues.apache.org/jira/browse/MAPREDUCE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5871: Labels: BB2015-05-TBR (was: ) Estimate Job Endtime Key: MAPREDUCE-5871 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5871 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Labels: BB2015-05-TBR Attachments: MAPREDUCE-5871.patch YARN-1969 adds a new earliest-endtime-first policy to the fair scheduler. As a prerequisite step, the AppMaster should estimate its end time and send it to the RM via the heartbeat. This jira focuses on how the AppMaster performs this estimation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6023) Fix SuppressWarnings from unchecked to rawtypes in O.A.H.mapreduce.lib.input.TaggedInputSplit
[ https://issues.apache.org/jira/browse/MAPREDUCE-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6023: Labels: BB2015-05-TBR newbie (was: newbie) Fix SuppressWarnings from unchecked to rawtypes in O.A.H.mapreduce.lib.input.TaggedInputSplit - Key: MAPREDUCE-6023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6023 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Junping Du Assignee: Abhilash Srimat Tirumala Pallerlamudi Priority: Minor Labels: BB2015-05-TBR, newbie Attachments: MAPREDUCE-6023.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization
[ https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4818: Labels: BB2015-05-TBR usability (was: usability) Easier identification of tasks that timeout during localization --- Key: MAPREDUCE-4818 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am Affects Versions: 0.23.3, 2.0.3-alpha Reporter: Jason Lowe Assignee: Siqi Li Labels: BB2015-05-TBR, usability Attachments: MAPREDUCE-4818.v1.patch, MAPREDUCE-4818.v2.patch, MAPREDUCE-4818.v3.patch, MAPREDUCE-4818.v4.patch, MAPREDUCE-4818.v5.patch When a task is taking too long to localize and is killed by the AM due to task timeout, the job UI/history is not very helpful. The attempt simply lists a diagnostic stating it was killed due to timeout, but there are no logs for the attempt since it never actually got started. There are log messages on the NM that show the container never made it past localization by the time it was killed, but users often do not have access to those logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5577) Allow querying the JobHistoryServer by job arrival time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5577: Labels: BB2015-05-TBR (was: ) Allow querying the JobHistoryServer by job arrival time --- Key: MAPREDUCE-5577 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5577 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: BB2015-05-TBR Attachments: MAPREDUCE-5577.patch The JobHistoryServer REST APIs currently allow querying by job submit time and finish time. However, jobs don't necessarily arrive in order of their finish time, meaning that a client who wants to stay on top of all completed jobs needs to query large time intervals to make sure they're not missing anything. Exposing functionality to allow querying by the time a job lands at the JobHistoryServer would allow clients to set the start of their query interval to the time of their last query. The arrival time of a job would be defined as the time that it lands in the done directory and can be picked up using the last modified date on history files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4487) Reduce job latency by removing hardcoded sleep statements
[ https://issues.apache.org/jira/browse/MAPREDUCE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4487: Labels: BB2015-05-TBR (was: ) Reduce job latency by removing hardcoded sleep statements - Key: MAPREDUCE-4487 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4487 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1, mrv2, performance Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Tom White Assignee: Tom White Labels: BB2015-05-TBR Attachments: MAPREDUCE-4487-mr2.patch, MAPREDUCE-4487.patch There are a few places in MapReduce where there are hardcoded sleep statements. By replacing them with wait/notify or similar it's possible to reduce latency for short running jobs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6273) HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state
[ https://issues.apache.org/jira/browse/MAPREDUCE-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6273: Labels: BB2015-05-TBR (was: ) HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state Key: MAPREDUCE-6273 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6273 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6273.000.patch HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state, I saw the following error message: {code} 2015-02-17 19:13:45,198 ERROR org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Error while trying to move a job to done java.io.FileNotFoundException: File does not exist: /user/history/done_intermediate/agd_laci-sluice/job_1423740288390_1884.summary at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1878) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1819) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1771) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:527) at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:85) at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:356) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1181) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1169) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1159) at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:270) at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:237) at org.apache.hadoop.hdfs.DFSInputStream.init(DFSInputStream.java:230) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1457) at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:318) at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:59) at org.apache.hadoop.fs.AbstractFileSystem.open(AbstractFileSystem.java:621) at 
org.apache.hadoop.fs.FileContext$6.next(FileContext.java:789) at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:785) at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) at org.apache.hadoop.fs.FileContext.open(FileContext.java:785) at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary(HistoryFileManager.java:953) at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.access$400(HistoryFileManager.java:82) at
[jira] [Updated] (MAPREDUCE-4919) All maps hangs when set mapreduce.task.io.sort.factor to 1
[ https://issues.apache.org/jira/browse/MAPREDUCE-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4919: Labels: BB2015-05-TBR (was: ) All maps hangs when set mapreduce.task.io.sort.factor to 1 -- Key: MAPREDUCE-4919 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4919 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Reporter: Jerry Chen Assignee: Jerry Chen Labels: BB2015-05-TBR Attachments: MAPREDUCE-4919.patch Original Estimate: 2h Remaining Estimate: 2h In one of my tests, when I set mapreduce.task.io.sort.factor to 1, all the maps hung and never ended. The CPU usage on each node was very high until the maps were killed by the app master on timeout, and the job failed. I traced the problem and found that all the maps hang in the final merge phase. The while loop in computeBytesInMerges will never end with a factor of 1:
{code}
int f = 1;  // in my case
int n = 16; // in my case
while (n > f || considerFinalMerge) {
  ...
  n -= (f-1);
  f = factor;
}
{code}
Since f-1 equals 0, n stays 16 forever and the while loop never terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
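A simplified model of the merge-pass arithmetic (not the actual computeBytesInMerges code) makes the hang visible: each pass replaces f streams with one, so n shrinks by f-1 — which is zero when the factor is 1.

```java
public class MergeFactorLoop {
    // Count merge passes needed to reduce n streams to one; returns -1 if
    // the loop would not terminate within a generous bound (the factor-1 hang).
    static int mergePasses(int n, int factor) {
        int f = factor;
        int passes = 0;
        while (n > 1) {
            n = Math.max(1, n - (f - 1)); // with f == 1 this subtracts nothing
            if (++passes > 1000) {
                return -1; // would loop forever
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(mergePasses(16, 1));  // -1: hangs, as in the report
        System.out.println(mergePasses(16, 10)); // 2
        System.out.println(mergePasses(16, 2));  // 15
    }
}
```

Validating the configuration (rejecting or clamping a sort factor below 2) would make the loop terminate for all inputs.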
[jira] [Updated] (MAPREDUCE-5951) Add support for the YARN Shared Cache
[ https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5951: Labels: BB2015-05-TBR (was: ) Add support for the YARN Shared Cache - Key: MAPREDUCE-5951 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Chris Trezzo Assignee: Chris Trezzo Labels: BB2015-05-TBR Attachments: MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch Implement the necessary changes so that the MapReduce application can leverage the new YARN shared cache (i.e. YARN-1492). Specifically, allow per-job configuration so that MapReduce jobs can specify which set of resources they would like to cache (i.e. jobjar, libjars, archives, files). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-1362) Pipes should be ported to the new mapreduce API
[ https://issues.apache.org/jira/browse/MAPREDUCE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-1362: Labels: BB2015-05-TBR (was: ) Pipes should be ported to the new mapreduce API --- Key: MAPREDUCE-1362 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1362 Project: Hadoop Map/Reduce Issue Type: Improvement Components: pipes Reporter: Bassam Tabbara Labels: BB2015-05-TBR Attachments: MAPREDUCE-1362-trunk.patch, MAPREDUCE-1362.patch, MAPREDUCE-1362.patch Pipes is still currently using the old mapred API. This prevents us from using pipes with HBase's TableInputFormat, HRegionPartitioner, etc. Here is a rough proposal for how to accomplish this: * Add a new package org.apache.hadoop.mapreduce.pipes that uses the new mapreduce API. * The new pipes package will run side by side with the old one; the old one should get deprecated at some point. * The wire protocol used between PipesMapper and PipesReducer and C++ programs must not change. * bin/hadoop should support both pipes (old api) and pipes2 (new api) Does this sound reasonable? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3914) Mismatched free() / delete / delete [] in HadoopPipes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3914: Labels: BB2015-05-TBR (was: ) Mismatched free() / delete / delete [] in HadoopPipes - Key: MAPREDUCE-3914 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3914 Project: Hadoop Map/Reduce Issue Type: Bug Components: pipes Affects Versions: 0.20.205.0, 0.23.0, 1.0.0 Environment: Based upon map reduce pipes task executed on Ubuntu 11.10 Reporter: Charles Earl Labels: BB2015-05-TBR Attachments: MAPREDUCE-3914-branch-0.23.patch, MAPREDUCE-3914-branch-1.0.patch, MAPREDUCE-3914.patch Original Estimate: 1h Remaining Estimate: 1h When running valgrind on a simple MapReduce pipes job, valgrind identifies a mismatched new / delete: ==20394== Mismatched free() / delete / delete [] ==20394==at 0x4C27FF2: operator delete(void*) (vg_replace_malloc.c:387) ==20394==by 0x4328A5: HadoopPipes::runTask(HadoopPipes::Factory const) (HadoopPipes.cc:1171) ==20394==by 0x424C33: main (ProcessRow.cpp:118) ==20394== Address 0x9c5b540 is 0 bytes inside a block of size 131,072 alloc'd ==20394==at 0x4C2864B: operator new[](unsigned long) (vg_replace_malloc.c:305) ==20394==by 0x431E5D: HadoopPipes::runTask(HadoopPipes::Factory const) (HadoopPipes.cc:1121) ==20394==by 0x424C33: main (ProcessRow.cpp:118) ==20394== ==20394== Mismatched free() / delete / delete [] ==20394==at 0x4C27FF2: operator delete(void*) (vg_replace_malloc.c:387) ==20394==by 0x4328AF: HadoopPipes::runTask(HadoopPipes::Factory const) (HadoopPipes.cc:1172) ==20394==by 0x424C33: main (ProcessRow.cpp:118) ==20394== Address 0x9c7b580 is 0 bytes inside a block of size 131,072 alloc'd ==20394==at 0x4C2864B: operator new[](unsigned long) (vg_replace_malloc.c:305) ==20394==by 0x431E6A: HadoopPipes::runTask(HadoopPipes::Factory const) (HadoopPipes.cc:1122) ==20394==by 0x424C33: main (ProcessRow.cpp:118) The new [] calls in Lines 1121 and 1122 of HadoopPipes.cc: bufin = new 
char[bufsize]; bufout = new char[bufsize]; should have matching delete [] calls but are instead bracketed by delete on lines 1171 and 1172: delete bufin; delete bufout; So these should be replaced by delete[]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5491) DFSIO do not initialize write buffer correctly
[ https://issues.apache.org/jira/browse/MAPREDUCE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5491: Labels: BB2015-05-TBR (was: ) DFSIO do not initialize write buffer correctly -- Key: MAPREDUCE-5491 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5491 Project: Hadoop Map/Reduce Issue Type: Bug Components: benchmarks, test Reporter: Raymond Liu Assignee: Raymond Liu Labels: BB2015-05-TBR Attachments: MAPREDUCE-5491-v2.patch, MAPREDUCE-5491.patch In the DFSIO test, IOMapperBase sets bufferSize in the configure method, while WriteMapper, AppendMapper, etc. use bufferSize to initialize the buffer in their constructors. This leads to the buffer not being initialized at all. That is fine for the non-compression route, but when compression is used the output data size becomes very small because the buffer is all zeros. Thus, the overridden configure method would be the correct place to initialize the buffer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
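The ordering problem generalizes to any class whose constructor runs before its configure method: a buffer sized from a field that configure sets cannot be allocated in the constructor. A minimal sketch with hypothetical names, not the DFSIO classes:

```java
public class BufferInit {
    static class Mapper {
        int bufferSize;
        byte[] buffer;

        Mapper() {
            // Wrong place: bufferSize is still 0 here, so any buffer
            // allocated now would be empty (or later left all-zeros).
        }

        void configure(int configuredSize) {
            bufferSize = configuredSize;
            // Right place: allocate and fill once the size is known.
            buffer = new byte[bufferSize];
            for (int i = 0; i < buffer.length; i++) {
                buffer[i] = (byte) ('0' + i % 50); // non-zero, compresses realistically
            }
        }
    }

    public static void main(String[] args) {
        Mapper m = new Mapper();
        m.configure(4096);
        System.out.println(m.buffer.length); // 4096
    }
}
```

An all-zero buffer compresses to almost nothing, which is why the bug only distorts the compressed benchmark numbers.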
[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5549: Labels: BB2015-05-TBR (was: ) distcp app should fail if m/r job fails --- Key: MAPREDUCE-5549 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, mrv2 Affects Versions: 3.0.0 Reporter: David Rosenstrauch Labels: BB2015-05-TBR Attachments: MAPREDUCE-5549-001.patch, MAPREDUCE-5549-002.patch I run distcpv2 in a scripted manner. The script checks if the distcp step fails and, if so, aborts the rest of the script. However, I ran into an issue today where the distcp job failed, but my calling script went on its merry way. Digging into the code a bit more (at https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java), I think I see the issue: the distcp app is not returning an error exit code to the shell when the distcp job fails. This is a big problem, IMO, as it prevents distcp from being successfully used in a scripted environment. IMO, the code should change like so:
Before:
{code:title=org.apache.hadoop.tools.DistCp.java}
//...
public int run(String[] argv) {
  //...
  try {
    execute();
  } catch (InvalidInputException e) {
    LOG.error("Invalid input: ", e);
    return DistCpConstants.INVALID_ARGUMENT;
  } catch (DuplicateFileException e) {
    LOG.error("Duplicate files in input path: ", e);
    return DistCpConstants.DUPLICATE_INPUT;
  } catch (Exception e) {
    LOG.error("Exception encountered ", e);
    return DistCpConstants.UNKNOWN_ERROR;
  }
  return DistCpConstants.SUCCESS;
}
//...
{code}
After:
{code:title=org.apache.hadoop.tools.DistCp.java}
//...
public int run(String[] argv) {
  //...
  Job job = null;
  try {
    job = execute();
  } catch (InvalidInputException e) {
    LOG.error("Invalid input: ", e);
    return DistCpConstants.INVALID_ARGUMENT;
  } catch (DuplicateFileException e) {
    LOG.error("Duplicate files in input path: ", e);
    return DistCpConstants.DUPLICATE_INPUT;
  } catch (Exception e) {
    LOG.error("Exception encountered ", e);
    return DistCpConstants.UNKNOWN_ERROR;
  }
  if (job.isSuccessful()) {
    return DistCpConstants.SUCCESS;
  } else {
    return DistCpConstants.UNKNOWN_ERROR;
  }
}
//...
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4443) MR AM and job history server should be resilient to jobs that exceed counter limits
[ https://issues.apache.org/jira/browse/MAPREDUCE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4443: Labels: BB2015-05-TBR usability (was: usability) MR AM and job history server should be resilient to jobs that exceed counter limits Key: MAPREDUCE-4443 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4443 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Rahul Jain Assignee: Mayank Bansal Labels: BB2015-05-TBR, usability Attachments: MAPREDUCE-4443-trunk-1.patch, MAPREDUCE-4443-trunk-2.patch, MAPREDUCE-4443-trunk-3.patch, MAPREDUCE-4443-trunk-draft.patch, am_failed_counter_limits.txt We saw this problem migrating applications to MapReduceV2: Our applications use hadoop counters extensively (1000+ counters for certain jobs). While this may not be one of the recommended best practices in hadoop, the real issue here is the reliability of the framework when applications exceed counter limits. The hadoop servers (yarn, history server) were originally brought up with mapreduce.job.counters.max=1000 under core-site.xml. We then ran a map-reduce job under an application using its own job-specific overrides, with mapreduce.job.counters.max=1 All the tasks for the job finished successfully; however the overall job still failed due to the AM encountering exceptions such as: {code} 2012-07-12 17:31:43,485 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks : 71
2012-07-12 17:31:43,502 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 1001 max=1000 at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:58) at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:65) at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:77) at 
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:94) at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:105) at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:202) at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:337) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1212) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1198) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1179) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:711) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.checkJobCompleteSuccess(JobImpl.java:737) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.checkJobForCompletion(JobImpl.java:1360) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1340) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1323) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:666) at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:890) at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:886) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74) at java.lang.Thread.run(Thread.java:662) 2012-07-12 17:31:43,502 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..2012-07-12 17:31:43,503 INFO [Thread-1] org.apache.had {code} The
[jira] [Updated] (MAPREDUCE-6096) SummarizedJob class NPEs with some jhist files
[ https://issues.apache.org/jira/browse/MAPREDUCE-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6096: Labels: BB2015-05-TBR easyfix patch (was: easyfix patch) SummarizedJob class NPEs with some jhist files -- Key: MAPREDUCE-6096 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6096 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Reporter: zhangyubiao Labels: BB2015-05-TBR, easyfix, patch Attachments: MAPREDUCE-6096-v2.patch, MAPREDUCE-6096-v3.patch, MAPREDUCE-6096-v4.patch, MAPREDUCE-6096-v5.patch, MAPREDUCE-6096-v6.patch, MAPREDUCE-6096-v7.patch, MAPREDUCE-6096-v8.patch, MAPREDUCE-6096.patch, job_1410427642147_0124-1411726671220-hadp-word+count-1411726696863-1-1-SUCCEEDED-default.jhist When I parse a job history file, I use the map-reduce-client-core project's org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser class and HistoryViewer$SummarizedJob to parse the history file (e.g. job_1408862281971_489761-1410883171851_XXX.jhist), and it throws an exception like: Exception in thread pool-1-thread-1 java.lang.NullPointerException at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer$SummarizedJob.init(HistoryViewer.java:626) at com.jd.hadoop.log.parse.ParseLogService.getJobDetail(ParseLogService.java:70) Looking at the SummarizedJob class, I found that attempt.getTaskStatus() is null, so I changed the order of attempt.getTaskStatus().equals(TaskStatus.State.FAILED.toString()) to TaskStatus.State.FAILED.toString().equals(attempt.getTaskStatus()) and it works well. So I wonder if we should change all comparisons on attempt.getTaskStatus() to put TaskStatus.State.XXX.toString() first? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
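The constant-first comparison proposed above is the standard null-safe pattern in Java; a minimal illustration with a plain string standing in for attempt.getTaskStatus():

```java
public class NullSafeEquals {
    public static void main(String[] args) {
        String taskStatus = null; // attempt.getTaskStatus() can return null

        // Constant-first comparison tolerates the null and returns false.
        boolean failed = "FAILED".equals(taskStatus);
        System.out.println(failed); // false — no NullPointerException

        // The original order dereferences the null and throws:
        try {
            System.out.println(taskStatus.equals("FAILED"));
        } catch (NullPointerException e) {
            System.out.println("NPE, as in HistoryViewer$SummarizedJob");
        }
    }
}
```

Because the string literal is never null, flipping the receiver and argument removes the NPE without changing the comparison's result for non-null statuses.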
[jira] [Updated] (MAPREDUCE-5917) Be able to retrieve configuration keys by index
[ https://issues.apache.org/jira/browse/MAPREDUCE-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5917: Labels: BB2015-05-TBR (was: ) Be able to retrieve configuration keys by index --- Key: MAPREDUCE-5917 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5917 Project: Hadoop Map/Reduce Issue Type: New Feature Components: pipes Reporter: Joe Mudd Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-5917.patch The pipes C++ side does not have a configuration key/value pair iterator. It is useful to be able to iterate through all of the configuration keys without having to expose a C++ map iterator since that is specific to the JobConf internals. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3097) archive does not archive if the content specified is a file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3097: Labels: BB2015-05-TBR (was: ) archive does not archive if the content specified is a file --- Key: MAPREDUCE-3097 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3097 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.203.0, 0.20.205.0 Reporter: Arpit Gupta Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-3097.patch The archive command only archives directories. When the content specified is a file, it proceeds with the archive job but does not archive the content. This can be misleading, as the user might think that the archive was successful. We should change it to either throw an error or archive files as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5608) Replace and deprecate mapred.tasktracker.indexcache.mb
[ https://issues.apache.org/jira/browse/MAPREDUCE-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5608: Labels: BB2015-05-TBR configuration newbie (was: configuration newbie) Replace and deprecate mapred.tasktracker.indexcache.mb -- Key: MAPREDUCE-5608 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5608 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: BB2015-05-TBR, configuration, newbie Attachments: MAPREDUCE-5608-002.patch, MAPREDUCE-5608.patch In MR2 mapred.tasktracker.indexcache.mb still works for configuring the size of the shuffle service index cache. As the tasktracker no longer exists, we should replace this with something like mapreduce.shuffle.indexcache.mb. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5854) Move the search box in UI from the right side to the left side
[ https://issues.apache.org/jira/browse/MAPREDUCE-5854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5854: Labels: BB2015-05-TBR (was: ) Move the search box in UI from the right side to the left side -- Key: MAPREDUCE-5854 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5854 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.23.9 Reporter: Jinhui Liu Labels: BB2015-05-TBR Attachments: MAPREDUCE-5854.patch, MAPREDUCE-5854.patch In the UI for the resource manager, job history, and job configuration (this might not be a complete list), there is a search box at the top-right corner of the listed content. This search box is frequently used, but it is often not visible due to right-alignment; extra scrolling is needed to make it visible, which is inconvenient. It would be good to move it to the left side, next to the Show ... Entries drop-down box. In the same spirit, the First|Previous|...|Next|Last controls at the bottom-right corner of the listed content could also be moved to the left side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-3202) Integrating Hadoop Vaidya with Job History UI in Hadoop 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-3202: Labels: BB2015-05-TBR (was: ) Integrating Hadoop Vaidya with Job History UI in Hadoop 2.0 Key: MAPREDUCE-3202 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3202 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobhistoryserver Affects Versions: 2.0.0-alpha Reporter: vitthal (Suhas) Gogate Assignee: vitthal (Suhas) Gogate Labels: BB2015-05-TBR Attachments: MAPREDUCE-3202.patch, MAPREDUCE-3202.patch Hadoop Vaidya provides a detailed analysis of the M/R job in terms of various execution inefficiencies and the associated remedies that the user can easily understand and fix. This JIRA patch integrates it with the Job History UI under the Hadoop 2.0 branch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4594) Add init/shutdown methods to mapreduce Partitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4594: Labels: BB2015-05-TBR (was: ) Add init/shutdown methods to mapreduce Partitioner -- Key: MAPREDUCE-4594 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4594 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: Radim Kolar Assignee: Radim Kolar Labels: BB2015-05-TBR Attachments: partitioner1.txt, partitioner2.txt, partitioner2.txt, partitioner3.txt, partitioner4.txt, partitioner5.txt, partitioner6.txt, partitioner6.txt, partitioner7.txt, partitioner8.txt, partitioner9.txt The Partitioner supports only the Configurable API, which can be used for basic initialization in setConf(). The problem is that there is no shutdown function. I propose using standard setup()/cleanup() functions like in mapper/reducer. The use case is that I need to start and stop a Spring context and datagrid client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
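The lifecycle the issue proposes mirrors Mapper/Reducer's setup()/cleanup() hooks. A self-contained sketch of what such a Partitioner could look like (class and method names are illustrative, not the committed MapReduce API):

```java
// Hypothetical Partitioner base class with lifecycle hooks, as the issue
// proposes: setup() for opening external resources (Spring context,
// datagrid client, ...) and cleanup() for releasing them.
abstract class LifecyclePartitioner<K, V> {
    public void setup() {}        // called once before any getPartition()
    public abstract int getPartition(K key, V value, int numPartitions);
    public void cleanup() {}      // called once after the last getPartition()
}

class HashLifecyclePartitioner<K, V> extends LifecyclePartitioner<K, V> {
    private boolean connected;    // stands in for an external client handle

    @Override public void setup() { connected = true; }

    @Override public int getPartition(K key, V value, int numPartitions) {
        if (!connected) throw new IllegalStateException("setup() not called");
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    @Override public void cleanup() { connected = false; }
}

class PartitionerLifecycleDemo {
    public static void main(String[] args) {
        HashLifecyclePartitioner<String, String> p = new HashLifecyclePartitioner<>();
        p.setup();                                         // framework would call this
        System.out.println(p.getPartition("key", "value", 4));
        p.cleanup();                                       // and this, on shutdown
    }
}
```

With the hooks in place, the framework (not user code in setConf()) owns the open/close ordering, which is what makes the shutdown side possible at all.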
[jira] [Updated] (MAPREDUCE-5860) Hadoop pipes Combiner is closed before all of its reduce calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5860: Labels: BB2015-05-TBR (was: ) Hadoop pipes Combiner is closed before all of its reduce calls -- Key: MAPREDUCE-5860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5860 Project: Hadoop Map/Reduce Issue Type: Bug Components: pipes Affects Versions: 0.23.0 Environment: 0.23.0 on 64 bit linux Reporter: Joe Mudd Labels: BB2015-05-TBR Attachments: MAPREDUCE-5860.patch When a Combiner is specified to runTask() its reduce() method may be called after its close() method has been called due to how the Combiner's containing object, CombineRunner, is closed after the TaskContextImpl's reducer member is closed (see TaskContextImpl::closeAll()). I believe the fix is to delegate the Combiner's ownership to CombineRunner, making it responsible for calling the Combiner's close() method and deleting the Combiner instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
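The proposed ownership fix can be sketched as follows: the runner owns the combiner, flushes all pending reduce() calls in its own close(), and only then closes the combiner. This is a simplified self-contained Java model of the C++ pipes arrangement (names mirror the report but the classes are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal combiner: reduce() after close() is the bug being modeled.
class Combiner {
    final List<String> reduced = new ArrayList<>();
    boolean closed;

    void reduce(String key) {
        if (closed) throw new IllegalStateException("reduce() called after close()");
        reduced.add(key);
    }

    void close() { closed = true; }
}

// Sketch of the fix: CombineRunner takes ownership of the Combiner and is the
// only code that calls its close(), after flushing every pending reduce().
class CombineRunner {
    private final Combiner combiner;                       // owned by the runner
    private final List<String> pending = new ArrayList<>();

    CombineRunner(Combiner combiner) { this.combiner = combiner; }

    void add(String key) { pending.add(key); }

    void close() {
        for (String key : pending) combiner.reduce(key);   // flush first
        combiner.close();                                  // then close: safe ordering
    }
}
```

Because no other container holds the combiner, nothing can close it out from under the runner, which is the ordering bug the patch addresses.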
[jira] [Updated] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners
[ https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6117: Labels: BB2015-05-TBR (was: ) Hadoop ignores yarn.nodemanager.hostname for RPC listeners -- Key: MAPREDUCE-6117 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, task Affects Versions: 2.2.1, 2.4.1, 2.5.1 Environment: Any mapreduce example with standard cluster. In our case each node has four networks. It is important that all internode communication be done on a specific network. Reporter: Waldyn Benbenek Assignee: Waldyn Benbenek Labels: BB2015-05-TBR Fix For: 2.5.1 Attachments: MapReduce-534.patch Original Estimate: 48h Time Spent: 384h Remaining Estimate: 0h The RPC listeners for an application use the hostname of the node as the binding address of the listener; they ignore yarn.nodemanager.hostname for this. In our setup we want all communication between nodes to be done via the network addresses we specify in yarn.nodemanager.hostname on each node. TaskAttemptListenerImpl.java and MRClientService.java are two places I have found where the default address is used rather than NM_host. The NodeManager hostname should be used for all communication between nodes, including the RPC listeners. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6204) TestJobCounters should use new properties instead JobConf.MAPRED_TASK_JAVA_OPTS
[ https://issues.apache.org/jira/browse/MAPREDUCE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6204: Labels: BB2015-05-TBR (was: ) TestJobCounters should use new properties instead JobConf.MAPRED_TASK_JAVA_OPTS --- Key: MAPREDUCE-6204 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6204 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 2.6.0 Reporter: sam liu Assignee: sam liu Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-6204-1.patch, MAPREDUCE-6204.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)