[jira] [Updated] (MAPREDUCE-4660) Update task placement policy for NetworkTopology with 'NodeGroup' layer
[ https://issues.apache.org/jira/browse/MAPREDUCE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-4660:
----------------------------------
    Attachment: MAPREDUCE-4660-v2.patch

Updated the patch in v2 to address recent changes on branch-1.

Update task placement policy for NetworkTopology with 'NodeGroup' layer
-----------------------------------------------------------------------
Key: MAPREDUCE-4660
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4660
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: jobtracker, mrv1, scheduler
Reporter: Junping Du
Assignee: Junping Du
Attachments: MAPREDUCE-4660.patch, MAPREDUCE-4660-v2.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527482#comment-13527482 ]

Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

If you need more:
* My issue is PART OF a whole new topic of Shuffle Consumer / Shuffle Provider plugins. Currently, we have only submitted the consumer part. We still need to complete *the provider part* in MRv2 and in MRv1, plus a few related topics. Then we need to back-port all of it to hadoop-2 and hadoop-1.
* Hence, my issue is part of a bigger context and not part of your issue. (Still, be my guest, and feel free to subordinate your issue to mine.)
* Besides, it was already clearly stated that in any case MAPREDUCE-2454 can't be accepted into hadoop-1, since it is too massive a change for a branch that is nearing its end of life. On the other hand, my patch already passed code review for hadoop-1 and was only delayed because of a justified request to go through the regular path and be submitted to trunk first. Hence, there is no reason to block my trivial patch for all branches just because of the complex issues in MAPREDUCE-2454.

plugin for generic shuffle service
----------------------------------
Key: MAPREDUCE-4049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
Labels: merge, plugin, rdma, shuffle
Fix For: 3.0.0
Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch

Support a generic shuffle service as a set of two plugins: ShuffleProvider and ShuffleConsumer. This will satisfy the following needs:
# Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA on fast networks (10GbE, 40GbE, or InfiniBand) instead of the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also use a more suitable merge approach during the intermediate merges, hence getting much better performance.
# Satisfy MAPREDUCE-3060 - a generic shuffle service for avoiding the hidden dependency of the NodeManager on a specific version of the mapreduce shuffle (currently targeted to 0.24.0).

References:
# Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu of Auburn University et al.: [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
# I am attaching 2 documents with a suggested top-level design for both plugins (currently based on the 1.0 branch).
# I am providing a link for downloading UDA - Mellanox's open-source plugin that implements a generic shuffle service using RDMA and levitated merge. Note: at this phase, the code is in C++ through JNI and you should consider it beta only. Still, it can serve anyone who wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suited to very fast networks.) [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
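The two-plugin split described above can be sketched in plain Java. This is an illustrative sketch only: the interface and property names below are assumptions for illustration, not the actual MAPREDUCE-4049 API. It shows the core idea, a provider/consumer pair selected by a configuration key, so an RDMA implementation can replace the built-in HTTP shuffle without framework changes.

```java
import java.util.Properties;

public class ShufflePluginSketch {
    /** Serves map outputs to reducers (e.g. over HTTP or RDMA). */
    public interface ShuffleProvider {
        void initialize(Properties conf);
        String transport();
    }

    /** Fetches and merges map outputs on the reduce side. */
    public interface ShuffleConsumer {
        void initialize(Properties conf);
        String transport();
    }

    /** Default consumer, standing in for the built-in HTTP shuffle. */
    public static class HttpShuffleConsumer implements ShuffleConsumer {
        public void initialize(Properties conf) { }
        public String transport() { return "http"; }
    }

    /** Loads the consumer class named in the job configuration via reflection. */
    public static ShuffleConsumer loadConsumer(Properties conf) throws Exception {
        // Hypothetical property name; falls back to the built-in consumer.
        String cls = conf.getProperty("mapreduce.shuffle.consumer.plugin.class",
                                      HttpShuffleConsumer.class.getName());
        ShuffleConsumer c = (ShuffleConsumer) Class.forName(cls)
                .getDeclaredConstructor().newInstance();
        c.initialize(conf);
        return c;
    }

    public static void main(String[] args) throws Exception {
        ShuffleConsumer c = loadConsumer(new Properties());
        System.out.println(c.transport());  // prints "http" (the default)
    }
}
```

An RDMA consumer would simply implement ShuffleConsumer and be named in the job configuration, which is what lets the plugin ship outside the Hadoop tree.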
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527480#comment-13527480 ]

Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

Alejandro, I am repeating my [previous comment|https://issues.apache.org/jira/browse/MAPREDUCE-4049?focusedCommentId=13504502&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13504502] that your behavior is inappropriate and unfriendly.
* Your interest in this JIRA issue only started to show 2 weeks ago, exactly 2 hours after Arun's last comment in MAPREDUCE-2454, in which Arun kept pressing his concern for the core of MapReduce given the massive changes that the huge patch there introduces, and requested breaking it into subtasks.
* At that phase my trivial patch had already come a long way and was ready for trunk (apart from a minor last request to rename functions in one class). But then you woke up and started to ask for many new things. I was always concerned that linking our issues would delay one for the other, instead of letting each progress at its own pace (and this was just 3 months after Asokan had already tried to delay my issue and wanted to submit his patch first). *You promised many times that all this would delay me by just a few days.* See the course of your promises to me *(all quotes are taken from you in this JIRA)*:
* Your original justification for the linking was: _As all these JIRAs are small, I think we'll be able *to move fast with all of them*_.
* You responded: _If that requires *a couple of extra days*, it is a small price to pay_.
* You clarified: _And don't worry about it being a subtask delaying it, I'll review it as soon as you post a patch and commit it when ready. The same is happening with the other subtasks, *so things should be in quite quickly*. Thx_

*Now, what happened:*
* *I fulfilled ALL your requests*, including those that were {color:red}_to have a consistent set of names and APIs (ie inner Context) for a set of related plugins (all the ones affected by MAPREDUCE-2454)_{color}; then *you personally reviewed my patch and you personally +1'd it*.
* This Friday you promised again to *merge to trunk - _fast, by the end of next week if no surprises arise_*.
* Then *you personally merged it to your branch*. After that, *Arun merged it to trunk too*.
* Then you broke all your past commitments and responded with: _*-1 this patch to go in trunk until the work in the branch is completed.*_

*Sorry. I don't get it!*
* My patch contains all your requests, including those that were {color:red}_to have a consistent set of names and APIs (ie inner Context) for a set of related plugins (all the ones affected by MAPREDUCE-2454)_{color}, and you personally +1'd it and merged it to your branch. *How can it be that it suits your branch, but not the trunk, because of your branch's needs???*
* Personally, watching the design (and performance) questions on MAPREDUCE-2454, I have no idea when that patch will ever be accepted to trunk. *I don't think it is appropriate to block my trivial patch from going to trunk. {color:red}My patch stands on its own.{color}* There are many people who are waiting for it and wanting it.
* Per your request, my patch got the tags @Unstable and @LimitedPrivate. You always have the option to iron out its code, in the same way that you can with any code in SVN.
* Your _-1 this patch to go in trunk until the work in the branch is completed_ *literally says that you took my issue {color:red}hostage{color} for MAPREDUCE-2454*, despite your promises that these steps would not delay me.

plugin for generic shuffle service
----------------------------------
Key: MAPREDUCE-4049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
Labels: merge, plugin, rdma, shuffle
Fix For: 3.0.0
Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch
[jira] [Commented] (MAPREDUCE-4808) Allow reduce-side merge to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527520#comment-13527520 ]

Mariappan Asokan commented on MAPREDUCE-4808:
---------------------------------------------

Hi Arun,
Thanks for your feedback. Perhaps I should mention some use cases for a MergeManager plugin in addition to the technical details of the design mentioned here as well as in MAPREDUCE-4812. A MergeManager plugin would allow us, and any implementer of the plugin, to do a variety of additional transformations - like copy, limit-N query (MAPREDUCE-1928), full join, and hashed aggregation - more efficiently. Since the shuffle code is available in the framework, we want to make use of it. In my opinion, the framework shuffle code seems to be stable in MRv2.

Making only the Merger pluggable would not add much value. If I understand correctly, it allows plugin implementers to implement only a single pass of the merge; the overall merge is still driven by MergeManager. Also, only a merge operation is possible, so any additional transformation has to be done in the Reducer, which a lot of the time is not very efficient.

Hope I clarified the usefulness of making MergeManager pluggable. Please feel free to ask if you have any questions.
Thanks.
-- Asokan

Allow reduce-side merge to be pluggable
---------------------------------------
Key: MAPREDUCE-4808
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: 2.0.2-alpha
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
Fix For: 2.0.3-alpha
Attachments: COMBO-mapreduce-4809-4812-4808.patch, mapreduce-4808.patch

Allow reduce-side merge to be pluggable for MAPREDUCE-2454.
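To make the hashed-aggregation use case concrete, here is a hypothetical sketch (names are illustrative, not the MAPREDUCE-4808 API) of what a MergeManager replacement could do: instead of sort-merging all segments and aggregating in the Reducer, it hash-aggregates values as segments arrive, which avoids the sort entirely.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashedAggregationSketch {
    /**
     * Aggregate (word, count) pairs from many map-output segments by hashing,
     * the kind of transformation a pluggable MergeManager could perform
     * in place of the standard sort-merge pass.
     */
    public static Map<String, Long> hashAggregate(
            List<List<Map.Entry<String, Long>>> segments) {
        Map<String, Long> agg = new HashMap<>();
        for (List<Map.Entry<String, Long>> segment : segments)
            for (Map.Entry<String, Long> kv : segment)
                agg.merge(kv.getKey(), kv.getValue(), Long::sum);  // no sort needed
        return agg;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Long>>> segments = List.of(
                List.of(Map.entry("a", 2L), Map.entry("b", 1L)),
                List.of(Map.entry("a", 3L)));
        System.out.println(hashAggregate(segments).get("a"));  // prints 5
    }
}
```

With only a pluggable single-pass Merger, this transformation is impossible: the framework's MergeManager would still sort every segment before the plugin ever saw the data.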
[jira] [Commented] (MAPREDUCE-4396) Make LocalJobRunner work with private distributed cache
[ https://issues.apache.org/jira/browse/MAPREDUCE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527640#comment-13527640 ]

Eric Yang commented on MAPREDUCE-4396:
--------------------------------------

Does trunk need a patch?

Make LocalJobRunner work with private distributed cache
-------------------------------------------------------
Key: MAPREDUCE-4396
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4396
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
Priority: Minor
Attachments: mapreduce-4396-branch-1.patch, test-afterpatch.result, test-beforepatch.result, test-patch.result

Some LocalJobRunner-related unit tests fail if the user directory permissions and/or umask are too restrictive.
[jira] [Created] (MAPREDUCE-4863) Adding aggregationWaitMap for node-level combiner.
Tsuyoshi OZAWA created MAPREDUCE-4863:
-----------------------------------------

Summary: Adding aggregationWaitMap for node-level combiner.
Key: MAPREDUCE-4863
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4863
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

To manage node/rack-level combining, MRAppMaster needs to keep management information about the outputs of completed MapTasks to be aggregated. The AggregationWaitMap is used by MRAppMaster to decide whether MapTasks should start to combine local MapOutputFiles. AggregationWaitMap is an abstraction over a {{ConcurrentHashMap<String, ArrayList<TaskAttemptCompletionEvent>>}}; these events identify the candidate files to be aggregated. When MapTasks complete, MRAppMaster buffers their TaskAttemptCompletionEvents in the AggregationWaitMap to delay the reducers' fetching of outputs from mappers until node-level aggregation is finished. After node-level aggregation, MRAppMaster writes back the mapAttemptCompletionEvents to restart the reducers' fetching of outputs from mappers.
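The buffering behavior described above can be sketched minimally. This is an assumption-laden sketch, not the proposed patch: a plain String stands in for TaskAttemptCompletionEvent, and the method names buffer/drain are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class AggregationWaitMap {
    // host -> completed map outputs waiting for node-level aggregation
    private final ConcurrentHashMap<String, ArrayList<String>> waiting =
            new ConcurrentHashMap<>();

    /** Buffer a completed map attempt instead of exposing it to reducers. */
    public void buffer(String host, String mapAttemptId) {
        waiting.computeIfAbsent(host, h -> new ArrayList<>()).add(mapAttemptId);
    }

    /** After node-level aggregation finishes, release the host's events so reducers can fetch. */
    public List<String> drain(String host) {
        ArrayList<String> events = waiting.remove(host);
        return events == null ? List.of() : events;
    }

    public static void main(String[] args) {
        AggregationWaitMap m = new AggregationWaitMap();
        m.buffer("node1", "attempt_0");
        m.buffer("node1", "attempt_1");
        System.out.println(m.drain("node1"));  // prints [attempt_0, attempt_1]
    }
}
```

The real MRAppMaster would presumably replace the buffered events with a single event for the aggregated output when it "writes back" after aggregation.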
[jira] [Created] (MAPREDUCE-4864) Extending RPC over umblical protocol getAggregationTargets() for node-level combiner.
Tsuyoshi OZAWA created MAPREDUCE-4864:
-----------------------------------------

Summary: Extending RPC over umblical protocol getAggregationTargets() for node-level combiner.
Key: MAPREDUCE-4864
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4864
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

MapTasks need to know whether or not they should start the node-level combiner against the outputs of the mappers on their node. The new umbilical RPC, getAggregationTargets(), is used to get the outputs to be aggregated on the node. The definition is as follows:

{code}
AggregationTarget getAggregationTargets(TaskAttemptID aggregator) throws IOException;
{code}

AggregationTarget is an abstraction over an array of the TaskAttemptIDs to be aggregated.
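A self-contained sketch of how a task might use the proposed call (the AggregationTarget body and the in-memory stub are assumptions; Strings stand in for TaskAttemptIDs):

```java
import java.io.IOException;
import java.util.Map;

public class UmbilicalSketch {
    /** Wraps the array of map attempts the caller should aggregate. */
    public static class AggregationTarget {
        public final String[] attempts;
        public AggregationTarget(String[] attempts) { this.attempts = attempts; }
    }

    /** The proposed addition to the task umbilical protocol. */
    public interface TaskUmbilical {
        AggregationTarget getAggregationTargets(String aggregator) throws IOException;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stub: the AM answers from its (hypothetical) wait map.
        Map<String, String[]> waitMap =
                Map.of("attempt_2", new String[]{"attempt_0", "attempt_1"});
        TaskUmbilical am = aggregator -> new AggregationTarget(
                waitMap.getOrDefault(aggregator, new String[0]));

        // A finishing MapTask asks whether it should act as the node's aggregator.
        AggregationTarget t = am.getAggregationTargets("attempt_2");
        System.out.println(t.attempts.length);  // prints 2
    }
}
```

An empty attempts array would mean the task has nothing to aggregate and can exit normally.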
[jira] [Updated] (MAPREDUCE-4864) Extending RPC over umblical protocol, getAggregationTargets(), for node-level combiner.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated MAPREDUCE-4864:
--------------------------------------
Summary: Extending RPC over umblical protocol, getAggregationTargets(), for node-level combiner. (was: Extending RPC over umblical protocol getAggregationTargets() for node-level combiner.)

Extending RPC over umblical protocol, getAggregationTargets(), for node-level combiner.
---------------------------------------------------------------------------------------
Key: MAPREDUCE-4864
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4864
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

MapTasks need to know whether or not they should start the node-level combiner against the outputs of the mappers on their node. The new umbilical RPC, getAggregationTargets(), is used to get the outputs to be aggregated on the node. The definition is as follows:

{code}
AggregationTarget getAggregationTargets(TaskAttemptID aggregator) throws IOException;
{code}

AggregationTarget is an abstraction over an array of the TaskAttemptIDs to be aggregated.
[jira] [Updated] (MAPREDUCE-4864) Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated MAPREDUCE-4864:
--------------------------------------
Summary: Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner. (was: Extending RPC over umblical protocol, getAggregationTargets(), for node-level combiner.)

Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner.
------------------------------------------------------------------------------------
Key: MAPREDUCE-4864
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4864
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

MapTasks need to know whether or not they should start the node-level combiner against the outputs of the mappers on their node. The new umbilical RPC, getAggregationTargets(), is used to get the outputs to be aggregated on the node. The definition is as follows:

{code}
AggregationTarget getAggregationTargets(TaskAttemptID aggregator) throws IOException;
{code}

AggregationTarget is an abstraction over an array of the TaskAttemptIDs to be aggregated.
[jira] [Created] (MAPREDUCE-4865) Launching node-level combiner at the end stage of MapTask
Tsuyoshi OZAWA created MAPREDUCE-4865:
-----------------------------------------

Summary: Launching node-level combiner at the end stage of MapTask
Key: MAPREDUCE-4865
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4865
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: tasktracker
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

MapTask needs to start node-level aggregation against its local outputs at its end stage, after calling getAggregationTargets(). This feature is implemented with Merger and CombinerRunner.
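The Merger-plus-CombinerRunner step can be illustrated with a simplified stand-in (this sketch assumes word-count-style (key, count) records and sorted in-memory runs in place of real MapOutputFiles; it is not the patch's code):

```java
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class NodeLevelCombineSketch {
    /**
     * Merge several sorted (key, count) runs and combine values per key,
     * mimicking what Merger + CombinerRunner would do over the node's
     * local map outputs at the end of the elected MapTask.
     */
    public static SortedMap<String, Long> mergeAndCombine(
            List<SortedMap<String, Long>> outputs) {
        SortedMap<String, Long> combined = new TreeMap<>();
        for (SortedMap<String, Long> out : outputs)
            out.forEach((k, v) -> combined.merge(k, v, Long::sum));
        return combined;  // one sorted, combined output for reducers to fetch
    }

    public static void main(String[] args) {
        SortedMap<String, Long> a = new TreeMap<>(Map.of("x", 1L, "y", 2L));
        SortedMap<String, Long> b = new TreeMap<>(Map.of("x", 4L));
        System.out.println(mergeAndCombine(List.of(a, b)));  // prints {x=5, y=2}
    }
}
```

The point of doing this before reducers fetch is that one combined file per node replaces many per-mapper files, shrinking shuffle traffic.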
[jira] [Updated] (MAPREDUCE-4864) Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated MAPREDUCE-4864:
--------------------------------------
Component/s: tasktracker
             mrv2

Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner.
------------------------------------------------------------------------------------
Key: MAPREDUCE-4864
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4864
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster, mrv2, tasktracker
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA

MapTasks need to know whether or not they should start the node-level combiner against the outputs of the mappers on their node. The new umbilical RPC, getAggregationTargets(), is used to get the outputs to be aggregated on the node. The definition is as follows:

{code}
AggregationTarget getAggregationTargets(TaskAttemptID aggregator) throws IOException;
{code}

AggregationTarget is an abstraction over an array of the TaskAttemptIDs to be aggregated.