[jira] [Updated] (MAPREDUCE-6283) MRHistoryServer log files management optimization

2015-07-02 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6283:

Priority: Major  (was: Minor)

 MRHistoryServer log files management optimization
 -

 Key: MAPREDUCE-6283
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6283
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Reporter: Zhang Wei
Assignee: Varun Saxena
   Original Estimate: 2,016h
  Remaining Estimate: 2,016h

 In some heavy-computation clusters, users may continually submit large numbers 
 of jobs; in our scenario, there are 240k jobs per day. On average, 5 nodes 
 participate in running a job. All of these jobs' log files are aggregated on 
 HDFS, which is a heavy load for the NameNode. The total number of log files 
 generated in the default cleaning period (1 week) can be calculated as follows:
 AM logs per week: 7 days * 240,000 jobs/day * 2 files/job = 3,360,000 files
 App logs per week: 7 days * 240,000 jobs/day * 5 nodes/job * 1 file/node = 
 8,400,000 files
 More than 10 million log files are generated in one week. Even worse, some 
 environments have to keep the logs for longer so that potential issues can be 
 tracked. In general, these small log files occupy about 12G of NameNode heap 
 and slow down NameNode responses.
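The weekly estimate above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and the inputs are the figures quoted in the description (240,000 jobs/day, 2 AM log files per job, 5 nodes per job, 1 file per node). With those inputs the total comes to 11,760,000 files per week.

```java
// Illustrative only: reproduces the weekly log-file estimate from the issue text.
// Class and method names are hypothetical, not part of Hadoop.
final class LogFileEstimate {
    static final int DAYS_PER_WEEK = 7;

    /** AM logs: the issue assumes 2 files per job. */
    static long amLogsPerWeek(long jobsPerDay) {
        return DAYS_PER_WEEK * jobsPerDay * 2;
    }

    /** App logs: the issue assumes 1 aggregated file per participating node. */
    static long appLogsPerWeek(long jobsPerDay, int nodesPerJob) {
        return DAYS_PER_WEEK * jobsPerDay * nodesPerJob;
    }

    public static void main(String[] args) {
        long am = amLogsPerWeek(240_000);
        long app = appLogsPerWeek(240_000, 5);
        System.out.printf("AM=%d App=%d total=%d%n", am, app, am + app);
    }
}
```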
 To optimize the log management of the history server, the main goals are:
 1) Reduce the total count of files in HDFS.
 2) Stay compatible with the existing history server operation.
 From these goals we can derive the detailed requirements as follows:
 1) Merge log files into bigger ones in HDFS periodically.
 2) The optimized design should inherit from the original architecture so that 
 the merged logs can be browsed transparently.
 3) Merged logs should be aged out periodically, just like ordinary logs.
 The whole life cycle of the AM logs:
 1. Created by the Application Master in intermediate-done-dir.
 2. Moved to done-dir after the job is done.
 3. Archived to archived-dir periodically.
 4. Cleaned when all the logs in the harball have expired.
 The whole life cycle of the App logs:
 1. Created by applications in local-dirs.
 2. Aggregated to remote-app-log-dir after the job is done.
 3. Archived to archived-dir periodically.
 4. Cleaned when all the logs in the harball have expired.
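The last step of both life cycles, cleaning a harball only once every log in it has expired, can be sketched as a simple predicate. This is a minimal illustration under assumptions: the class name, the method name, and the idea of tracking one finish time per archived log are all hypothetical and are not taken from the issue or from Hadoop.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical sketch: an archive becomes eligible for cleaning only when
// every log it contains is older than the retention period.
final class ArchiveExpiry {
    static boolean isArchiveExpired(List<Instant> logFinishTimes,
                                    Duration retention,
                                    Instant now) {
        Instant cutoff = now.minus(retention);
        // An empty archive is treated as not expired; that is an arbitrary
        // choice made for this sketch.
        return !logFinishTimes.isEmpty()
                && logFinishTimes.stream().allMatch(t -> t.isBefore(cutoff));
    }
}
```

One consequence of this rule is that a single recent log keeps the whole archive alive, so how logs are grouped into archives affects how promptly space is reclaimed.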



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6240) Hadoop client displays confusing error message

2015-07-02 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612794#comment-14612794
 ] 

Gera Shegalov commented on MAPREDUCE-6240:
--

Thanks [~chris.douglas], ran out of time this week. Will take care of it next 
week.

 Hadoop client displays confusing error message
 --

 Key: MAPREDUCE-6240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.7.0
Reporter: Mohammad Kamrul Islam
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-6240-gera.001.patch, 
 MAPREDUCE-6240-gera.001.patch, MAPREDUCE-6240-gera.002.patch, 
 MAPREDUCE-6240.003.patch, MAPREDUCE-6240.1.patch


 The Hadoop client often throws an exception with java.io.IOException: Cannot 
 initialize Cluster. Please check your configuration for 
 mapreduce.framework.name and the correspond server addresses.
 This is a generic and misleading message for any cluster initialization 
 problem. It takes many hours of debugging to identify the root cause. A 
 correct error message would let the problem be resolved quickly.
 In one such instance, the Oozie log showed the following exception, while the 
 root cause was a CNF (class not found) that the Hadoop client didn't return in 
 the exception.
 {noformat}
 JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
   at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
   at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:392)
   at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979)
   at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134)
   at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
   at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
   at org.apache.oozie.command.XCommand.call(XCommand.java:281)
   at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
   at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
   at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
   at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
   at org.apache.hadoop.mapreduce.Cluster.init(Cluster.java:82)
   at org.apache.hadoop.mapreduce.Cluster.init(Cluster.java:75)
   at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
   at org.apache.hadoop.mapred.JobClient.init(JobClient.java:449)
   at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:372)
   at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:370)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:379)
   at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1185)
   at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:927)
   ... 10 more
 {noformat}
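One general way to keep the root cause visible in a situation like the one above is to attach each underlying failure to the generic exception before throwing it. The sketch below is an assumption-laden illustration, not the approach of the attached patches (which are not shown here): the `Provider` interface and `ClusterInitSketch` class are hypothetical stand-ins and not Hadoop types, and `ClassNotFoundException` is used only as an example of a swallowed cause.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch: try each provider in turn and, if none succeeds,
// throw the generic error with every underlying failure attached.
final class ClusterInitSketch {
    /** Stand-in for whatever creates a client; not a Hadoop interface. */
    interface Provider {
        Object create() throws Exception;
    }

    static Object initialize(List<Provider> providers) throws IOException {
        IOException failure = new IOException(
            "Cannot initialize Cluster. Please check your configuration for "
            + "mapreduce.framework.name and the correspond server addresses.");
        for (Provider provider : providers) {
            try {
                Object client = provider.create();
                if (client != null) {
                    return client;
                }
            } catch (Exception e) {
                // Keep the real reason instead of discarding it.
                failure.addSuppressed(e);
            }
        }
        throw failure;
    }
}
```

With this shape, a stack trace printed by the caller lists the suppressed exceptions under the generic message, so a missing class would show up without extra debugging.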





[jira] [Work started] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files

2015-07-02 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-6415 started by Robert Kanter.

 Create a tool to combine aggregated logs into HAR files
 ---

 Key: MAPREDUCE-6415
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: HAR-ableAggregatedLogs_v1.pdf


 While we wait for YARN-2942 to become viable, it would still be great to 
 mitigate the aggregated logs problem.  We can write a tool that combines 
 aggregated log files into a single HAR file per application, which should 
 solve the "too many files" and "too many blocks" problems.  See the design 
 document for details.
 See YARN-2942 for more context.





[jira] [Updated] (MAPREDUCE-6425) ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated MAPREDUCE-6425:
-
Summary: ShuffleHandler passes wrong base parameter to getMapOutputInfo 
if mapId is not in the cache.  (was: ShuffleHandler passes wrong {{base}} 
parameter to getMapOutputInfo if mapId is not in the cache.)

 ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is 
 not in the cache.
 --

 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-6425.000.patch


 ShuffleHandler passes the wrong {{base}} parameter to {{getMapOutputInfo}} if 
 mapId is not in the cache.
 {{getMapOutputInfo}} expects the {{base}} parameter to be 
 {{getBaseLocation(jobId, user) + mapId}}.
 When it is called inside populateHeaders, the {{base}} parameter is set 
 correctly:
 {code}
 String base = outputBaseStr + mapId;
 MapOutputInfo outputInfo = getMapOutputInfo(base, mapId, reduce, user);
 {code}
 When it is called outside populateHeaders, the {{base}} parameter is set 
 wrongly to outputBasePathStr once the number of cached mapIds exceeds 
 {{mapOutputMetaInfoCacheSize}}:
 {code}
 String outputBasePathStr = getBaseLocation(jobId, user);
 MapOutputInfo info = mapOutputInfoMap.get(mapId);
 if (info == null) {
   info = getMapOutputInfo(outputBasePathStr, mapId, reduceId, user);
 }
 {code}
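The mismatch described above can be illustrated outside of Hadoop with plain strings. The sketch below is one possible correction, consistent with the stated expectation that {{base}} equals {{getBaseLocation(jobId, user) + mapId}}; it is not taken from the attached patch. Everything here is a simplified stand-in: the path layout, the "/file.out" suffix, and the two-argument `getMapOutputInfo` are assumptions made for illustration.

```java
import java.util.Map;

// Simplified stand-ins for the ShuffleHandler helpers named in the issue.
final class BaseParameterSketch {
    // Hypothetical path layout, used only to make the example concrete.
    static String getBaseLocation(String jobId, String user) {
        return "/usercache/" + user + "/appcache/" + jobId + "/output/";
    }

    // Stand-in for getMapOutputInfo: returns the path it would look under.
    static String getMapOutputInfo(String base, String mapId) {
        return base + "/file.out";
    }

    // Cache-miss path with the base built the same way populateHeaders does.
    static String lookup(Map<String, String> cache,
                         String jobId, String user, String mapId) {
        String outputBasePathStr = getBaseLocation(jobId, user);
        String info = cache.get(mapId);
        if (info == null) {
            // Append mapId so that base matches what getMapOutputInfo expects.
            info = getMapOutputInfo(outputBasePathStr + mapId, mapId);
        }
        return info;
    }
}
```

Passing `outputBasePathStr` alone on the cache-miss path would make the lookup resolve under the job's output directory rather than under the specific map attempt.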





[jira] [Updated] (MAPREDUCE-6425) ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated MAPREDUCE-6425:
-
Attachment: MAPREDUCE-6425.000.patch

 ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is 
 not in the cache.
 

 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-6425.000.patch




[jira] [Updated] (MAPREDUCE-6425) ShuffleHandler passes wrong {{base}} parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated MAPREDUCE-6425:
-
Description: 
ShuffleHandler passes the wrong {{base}} parameter to {{getMapOutputInfo}} if mapId 
is not in the cache.
{{getMapOutputInfo}} expects the {{base}} parameter to be 
{{getBaseLocation(jobId, user) + mapId}}.
When it is called inside populateHeaders, the {{base}} parameter is set 
correctly:
{code}
String base = outputBaseStr + mapId;
MapOutputInfo outputInfo = getMapOutputInfo(base, mapId, reduce, user);
{code}
When it is called outside populateHeaders, the {{base}} parameter is set 
wrongly to outputBasePathStr once the number of cached mapIds exceeds 
{{mapOutputMetaInfoCacheSize}}:
{code}
String outputBasePathStr = getBaseLocation(jobId, user);
MapOutputInfo info = mapOutputInfoMap.get(mapId);
if (info == null) {
  info = getMapOutputInfo(outputBasePathStr, mapId, reduceId, user);
}
{code}

  was:
ShuffleHandler passes the wrong {{base}} parameter to {{getMapOutputInfo}} if mapId 
is not in the cache.
getMapOutputInfo expects the {{base}} parameter to be 
{{getBaseLocation(jobId, user) + mapId}}.
When it is called inside populateHeaders, the {{base}} parameter is set 
correctly:
{code}
String base = outputBaseStr + mapId;
MapOutputInfo outputInfo = getMapOutputInfo(base, mapId, reduce, user);
{code}
When it is called outside populateHeaders, the {{base}} parameter is set 
wrongly to outputBasePathStr once the number of cached mapIds exceeds 
{{mapOutputMetaInfoCacheSize}}:
{code}
String outputBasePathStr = getBaseLocation(jobId, user);
MapOutputInfo info = mapOutputInfoMap.get(mapId);
if (info == null) {
  info = getMapOutputInfo(outputBasePathStr, mapId, reduceId, user);
}
{code}

Summary: ShuffleHandler passes wrong {{base}} parameter to 
getMapOutputInfo if mapId is not in the cache.  (was: ShuffleHandler passes 
wrong base parameter to getMapOutputInfo if mapId is not in the cache.)

 ShuffleHandler passes wrong {{base}} parameter to getMapOutputInfo if mapId 
 is not in the cache.
 

 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-6425.000.patch




[jira] [Updated] (MAPREDUCE-6425) ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated MAPREDUCE-6425:
-
Attachment: (was: MAPREDUCE-6425.000.patch)

 ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is 
 not in the cache.
 --

 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu



[jira] [Updated] (MAPREDUCE-6425) ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated MAPREDUCE-6425:
-
Status: Patch Available  (was: Open)

 ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is 
 not in the cache.
 

 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-6425.000.patch




[jira] [Created] (MAPREDUCE-6425) ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)
zhihai xu created MAPREDUCE-6425:


 Summary: ShuffleHandler passes wrong base parameter to 
getMapOutputInfo if mapId is not in the cache.
 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu


ShuffleHandler passes the wrong {{base}} parameter to {{getMapOutputInfo}} if mapId 
is not in the cache.
getMapOutputInfo expects the {{base}} parameter to be 
{{getBaseLocation(jobId, user) + mapId}}.
When it is called inside populateHeaders, the {{base}} parameter is set 
correctly:
{code}
String base = outputBaseStr + mapId;
MapOutputInfo outputInfo = getMapOutputInfo(base, mapId, reduce, user);
{code}
When it is called outside populateHeaders, the {{base}} parameter is set 
wrongly to outputBasePathStr once the number of cached mapIds exceeds 
{{mapOutputMetaInfoCacheSize}}:
{code}
String outputBasePathStr = getBaseLocation(jobId, user);
MapOutputInfo info = mapOutputInfoMap.get(mapId);
if (info == null) {
  info = getMapOutputInfo(outputBasePathStr, mapId, reduceId, user);
}
{code}





[jira] [Updated] (MAPREDUCE-6425) ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is not in the cache.

2015-07-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated MAPREDUCE-6425:
-
Attachment: MAPREDUCE-6425.000.patch

 ShuffleHandler passes wrong base parameter to getMapOutputInfo if mapId is 
 not in the cache.
 --

 Key: MAPREDUCE-6425
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6425
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-6425.000.patch




[jira] [Commented] (MAPREDUCE-5221) Reduce side Combiner is not used when using the new API

2015-07-02 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612862#comment-14612862
 ] 

Tsuyoshi Ozawa commented on MAPREDUCE-5221:
---

[~davelatham] thank you for the ping. I'll refresh the patch soon.

 Reduce side Combiner is not used when using the new API
 ---

 Key: MAPREDUCE-5221
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5221
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Tsuyoshi Ozawa
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-5221.1.patch, MAPREDUCE-5221.10.patch, 
 MAPREDUCE-5221.2.patch, MAPREDUCE-5221.3.patch, MAPREDUCE-5221.4.patch, 
 MAPREDUCE-5221.5.patch, MAPREDUCE-5221.6.patch, MAPREDUCE-5221.7-2.patch, 
 MAPREDUCE-5221.7.patch, MAPREDUCE-5221.8.patch, MAPREDUCE-5221.9.patch


 If a combiner is specified using o.a.h.mapreduce.Job.setCombinerClass, it 
 will be silently ignored on the reduce side, since the reduce-side usage is 
 only aware of the old API combiner.
 This doesn't fail the job, since the new combiner key does not deprecate the 
 old key.
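The effect of having two independent configuration keys can be shown with an ordinary map standing in for the job configuration. This is a sketch under assumptions: the two key strings below are what the new and old APIs are believed to use, but they should be checked against the Hadoop version in question, and the class, method names, and the fallback lookup are hypothetical rather than the fix proposed in the patches.

```java
import java.util.Map;

// Illustration with a plain Map in place of a Hadoop Configuration.
final class CombinerKeySketch {
    // Assumed key names; verify against your Hadoop version.
    static final String NEW_API_KEY = "mapreduce.job.combine.class";
    static final String OLD_API_KEY = "mapred.combiner.class";

    /** Mimics a reduce side that only consults the old key. */
    static String oldApiOnlyLookup(Map<String, String> conf) {
        return conf.get(OLD_API_KEY);
    }

    /** A lookup that falls back to the new key when the old one is unset. */
    static String eitherApiLookup(Map<String, String> conf) {
        String combiner = conf.get(OLD_API_KEY);
        return combiner != null ? combiner : conf.get(NEW_API_KEY);
    }
}
```

Because neither key is mapped onto the other, a job that sets only the new key yields no combiner from the old-key lookup, and nothing fails; the combiner is simply skipped.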





[jira] [Commented] (MAPREDUCE-5221) Reduce side Combiner is not used when using the new API

2015-07-02 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612605#comment-14612605
 ] 

Dave Latham commented on MAPREDUCE-5221:


Looks like this was never committed and left to rot.  Any reason it didn't make 
it?  If we refresh it can we get it committed?

 Reduce side Combiner is not used when using the new API
 ---

 Key: MAPREDUCE-5221
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5221
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Tsuyoshi Ozawa
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-5221.1.patch, MAPREDUCE-5221.10.patch, 
 MAPREDUCE-5221.2.patch, MAPREDUCE-5221.3.patch, MAPREDUCE-5221.4.patch, 
 MAPREDUCE-5221.5.patch, MAPREDUCE-5221.6.patch, MAPREDUCE-5221.7-2.patch, 
 MAPREDUCE-5221.7.patch, MAPREDUCE-5221.8.patch, MAPREDUCE-5221.9.patch




[jira] [Created] (MAPREDUCE-6424) Store MR counters as timeline metrics instead of event

2015-07-02 Thread Junping Du (JIRA)
Junping Du created MAPREDUCE-6424:
-

 Summary: Store MR counters as timeline metrics instead of event
 Key: MAPREDUCE-6424
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6424
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Junping Du
Assignee: Junping Du


In MAPREDUCE-6327, map/reduce counters from JobFinishedEvent are encoded as 
timeline events, with the counter details in JSON format. 
We need to store framework-specific counters as metrics in the timeline service 
to support querying, aggregation, etc.


