[jira] [Created] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2012-08-01 Thread Tsuyoshi OZAWA (JIRA)
Tsuyoshi OZAWA created MAPREDUCE-4502:
-

 Summary: Multi-level aggregation with combining the result of maps 
per node/rack
 Key: MAPREDUCE-4502
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Tsuyoshi OZAWA


The shuffle cost is expensive in Hadoop despite the existence of the
combiner, because the scope of combining is limited to a single MapTask.
To solve this problem, a good approach is to aggregate the results of
multiple maps per node/rack by launching combiners.

This JIRA is to implement the multi-level aggregation infrastructure, including
combining per container (MAPREDUCE-3902 is related) and coordinating containers
through the application master, without breaking the fault tolerance of jobs.
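
For context, the existing combiner hook only ever sees the output of a single
map task. A minimal WordCount-style job configuration, as a sketch against the
standard MRv2 API (assuming the stock TokenizerMapper/IntSumReducer classes
from the WordCount example are available), looks roughly like this:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class WordCountJob {
    public static void main(String[] args) throws Exception {
      Job job = Job.getInstance(new Configuration(), "wordcount");
      job.setJarByClass(WordCountJob.class);
      job.setMapperClass(TokenizerMapper.class);   // stock WordCount mapper (assumed)
      // The combiner below runs only over the spill/merge output of ONE map
      // task; outputs of different map tasks on the same node are never
      // combined together, which is the limitation this JIRA addresses.
      job.setCombinerClass(IntSumReducer.class);   // stock WordCount reducer as combiner
      job.setReducerClass(IntSumReducer.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);
      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }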

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Multi-level aggregation with combining the result of maps per node/rack

2012-08-01 Thread Tsuyoshi OZAWA
Robert,

Thank you for your valuable feedback and for sharing the related JIRA tickets.

The combination of container reuse (MAPREDUCE-3902) and a coordination
system in the AM is a good idea to minimize implementation cost and ensure
fault tolerance. The design can also solve the scheduling problem of when to
run the combiner. The current design doesn't address security of the
intermediate data at all, so I'll take that into account.
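
To make the coordination idea concrete, here is a purely illustrative sketch
of the kind of directive the AM could send to a reused container: run another
split, or combine the map outputs of other finished mappers on the same node.
None of these types exist in the codebase; all names and fields are made up
for illustration.

  import java.util.Collections;
  import java.util.List;

  // Illustrative only: a directive from the AM to a reused container, telling
  // it either to run another split or to combine the outputs of other map
  // attempts on the same node. Not an existing Hadoop class.
  public final class ContainerDirective {
    public enum Kind { RUN_SPLIT, COMBINE_NODE_OUTPUTS }

    private final Kind kind;
    private final List<String> mapAttemptIds;  // attempts whose output to combine

    private ContainerDirective(Kind kind, List<String> mapAttemptIds) {
      this.kind = kind;
      this.mapAttemptIds = mapAttemptIds;
    }

    public static ContainerDirective runSplit() {
      return new ContainerDirective(Kind.RUN_SPLIT, Collections.<String>emptyList());
    }

    // The AM would only issue this for outputs owned by the same user, so the
    // combiner runs with the credentials of the data owner (addressing the
    // security concern about who runs the combiner).
    public static ContainerDirective combineNodeOutputs(List<String> mapAttemptIds) {
      return new ContainerDirective(Kind.COMBINE_NODE_OUTPUTS, mapAttemptIds);
    }

    public Kind getKind() { return kind; }
    public List<String> getMapAttemptIds() { return mapAttemptIds; }
  }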

I'll create a new design note with your opinion in mind, and attach it
on a new JIRA
(MAPREDUCE-4502).

Thanks,
Tsuyoshi OZAWA

On Tue, Jul 31, 2012 at 10:46 PM, Robert Evans ev...@yahoo-inc.com wrote:
 Tsuyoshi,


 There has been a lot of work happening in the shuffle phase.  It is being
 made pluggable in both 1.0 and 2.0/trunk (MAPREDUCE-4049).  There is also
 some work being done to reuse containers in trunk/2.0 (MAPREDUCE-3902).
 This should have a similar, although perhaps more limited, result, because
 when different map tasks run in the same container their outputs also go
 through the same combiner.  I have heard that it is showing some good
 results for both small and large jobs.  There was also some work to try
 and pull in Sailfish (no JIRA, just ramblings on the mailing list), which
 moves the shuffle phase to a separate process.  I have not seen much
 happen on that front recently; it saw some large gains on big jobs but is
 worse on small jobs.  I think that this is something very
 interesting and I would encourage you to file a JIRA and pursue it.

 I don't know anything about your design, so please feel free to disregard
 my comments if they do not apply.  I would encourage you to think about
 security on this.  When you run the combiner you need to be sure that it
 runs as the user that owns the data.  This should probably not be too
 difficult if you hijack a mapper task that has just finished to try and
 combine the data from others on the same node.  To do this you will
 probably need some sort of a coordination system in the AM to tell that
 mapper what other mappers to try and combine data from.  It would be nice
 to coordinate this with the container reuse work, which currently just
 tells the container to run another split through.  It could be another
 option to tell it to combine with the map output from container X.

 Another thing to be aware of is small jobs.  It would be great to see how
 this impacts small jobs, and if it has a negative impact we should look
 for an automated way to turn this off or on.

 Thanks for your work,

 Bobby Evans

 On 7/30/12 8:11 PM, Tsuyoshi OZAWA ozawa.tsuyo...@gmail.com wrote:

Hi,

We consider the shuffle cost a main concern in MapReduce,
in particular for aggregation processing.
The shuffle cost is expensive in Hadoop despite the
existence of the combiner, because the scope of combining is limited
to a single MapTask.

To solve this problem, I've implemented a prototype that
combines the results of multiple maps per node [1].
This is the first step toward making Hadoop faster with a multi-level
aggregation technique like Google Dremel [2].

I took a benchmark with the prototype.
We used the WordCount program with the in-mapper combining optimization
as the benchmark, run on 40 nodes [3].
The input data sets are 300GB, 500GB, 1TB, and 2TB of text generated
by the default RandomTextWriter. The number of reducers is configured
as 1, on the assumption that some workloads force a single reducer,
as in Google Dremel. The results are as follows:

Input size               | 300GB | 500GB |   1TB |   2TB |
Normal (sec)             |  4004 |  5551 | 12177 | 27608 |
Combining per node (sec) |  3678 |  3844 |  7440 | 15591 |

Note that a MapTask runs the per-node combiner only every 3 minutes in
the current prototype, so the aggregation rate is very limited.

Normal is the result of current Hadoop, and Combining per node
is the result with my optimization.  Despite the 3-minute
restriction, the prototype is 1.7 times faster than normal Hadoop
in the 2TB case.  Another benchmark also shows that the shuffle cost
is cut by 50%.
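
For reference, the in-mapper combining optimization used in the benchmark is
the usual pattern of buffering counts inside the mapper and emitting them in
cleanup(); a minimal sketch (not the prototype code) against the standard
Mapper API:

  import java.io.IOException;
  import java.util.HashMap;
  import java.util.Map;
  import java.util.StringTokenizer;

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  // Counts are buffered in a HashMap and emitted once per task in cleanup(),
  // instead of emitting (word, 1) for every token and relying on the combiner.
  public class InMapperCombiningWordCount
      extends Mapper<LongWritable, Text, Text, IntWritable> {

    private final Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    protected void map(LongWritable key, Text value, Context context) {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        String word = itr.nextToken();
        Integer c = counts.get(word);
        counts.put(word, c == null ? 1 : c + 1);
      }
    }

    @Override
    protected void cleanup(Context context)
        throws IOException, InterruptedException {
      Text word = new Text();
      IntWritable count = new IntWritable();
      for (Map.Entry<String, Integer> e : counts.entrySet()) {
        word.set(e.getKey());
        count.set(e.getValue());
        context.write(word, count);
      }
    }
  }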

I want to hear from you: do you think this is a useful feature?
If yes, I will work on contributing it.
You are also welcome to tell me which benchmarks you would like me to run
with my prototype.

Regards,
Tsuyoshi


[1] The idea is also described in Hadoop wiki:
http://wiki.apache.org/hadoop/HadoopResearchProjects
[2] Dremel paper is available at:
http://research.google.com/pubs/pub36632.html
[3] The specification of each node is as follows:
CPU Core(TM)2 Duo CPU E7400 2.80GHz x 2
Memory 8 GB
Network 1 GbE




-- 
OZAWA Tsuyoshi


Hadoop-Mapreduce-trunk - Build # 1154 - Still Failing

2012-08-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1154/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 31911 lines...]
[INFO] hadoop-yarn-api ... SUCCESS [10.174s]
[INFO] hadoop-yarn-common  SUCCESS [29.636s]
[INFO] hadoop-yarn-server  SUCCESS [0.108s]
[INFO] hadoop-yarn-server-common . SUCCESS [2.583s]
[INFO] hadoop-yarn-server-nodemanager  SUCCESS [2:23.469s]
[INFO] hadoop-yarn-server-web-proxy .. SUCCESS [1.234s]
[INFO] hadoop-yarn-server-resourcemanager  SUCCESS [3:33.635s]
[INFO] hadoop-yarn-server-tests .. SUCCESS [40.765s]
[INFO] hadoop-mapreduce-client ... SUCCESS [0.059s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [1:22.593s]
[INFO] hadoop-yarn-applications .. SUCCESS [0.092s]
[INFO] hadoop-yarn-applications-distributedshell . SUCCESS [12.265s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher  SUCCESS [9.788s]
[INFO] hadoop-yarn-site .. SUCCESS [0.175s]
[INFO] hadoop-mapreduce-client-common  SUCCESS [18.091s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [0.983s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [4:24.527s]
[INFO] hadoop-mapreduce-client-hs  SUCCESS [1:06.751s]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [34:51.450s]
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 49:51.447s
[INFO] Finished at: Wed Aug 01 14:09:03 UTC 2012
[INFO] Final Memory: 27M/226M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.12:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports
 for the individual test results.
[ERROR] - [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn goals -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating MAPREDUCE-
Updating MAPREDUCE-4375
Updating MAPREDUCE-4457
Updating MAPREDUCE-4483
Updating MAPREDUCE-4492
Updating MAPREDUCE-4493
Updating MAPREDUCE-4456
Updating HDFS-3738
Updating MAPREDUCE-4496
Updating HADOOP-8370
Updating HADOOP-8637
Updating HDFS-3667
Updating MAPREDUCE-4234
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (MAPREDUCE-4504) SortValidator writes to wrong directory

2012-08-01 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created MAPREDUCE-4504:
--

 Summary: SortValidator writes to wrong directory
 Key: MAPREDUCE-4504
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4504
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans


SortValidator tries to write to jobConf.get("hadoop.tmp.dir", "/tmp"), but that
is not intended to be an HDFS directory. It should just be /tmp.
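
A rough illustration of the described behavior (not the actual SortValidator
source; the "sortvalidate" subdirectory name below is hypothetical):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapred.JobConf;

  public class SortValidatorPathExample {
    public static void main(String[] args) {
      JobConf jobConf = new JobConf();
      // Current behavior (roughly): the output path is resolved against
      // hadoop.tmp.dir, which points at a local temp directory, not HDFS.
      Path current = new Path(jobConf.get("hadoop.tmp.dir", "/tmp"), "sortvalidate");
      // Behavior intended by this issue: just use /tmp on the default FileSystem.
      Path intended = new Path("/tmp", "sortvalidate");
      System.out.println(current + " vs " + intended);
    }
  }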

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Can someone review MAPREDUCE-4309 and MAPREDUCE-4310?

2012-08-01 Thread Jun Ping Du
These two patches are for the Hadoop network topology extension (YARN part) for
virtualized environments.

Thanks,

Junping

- Original Message -
From: Jun Ping Du j...@vmware.com
To: common-...@hadoop.apache.org, hdfs-...@hadoop.apache.org, 
mapreduce-dev@hadoop.apache.org
Cc: Mark Pollack mpoll...@vmware.com, Jurgen Leschner 
jlesch...@vmware.com, Richard McDougall r...@vmware.com
Sent: Monday, June 4, 2012 11:48:35 PM
Subject: Make Hadoop NetworkTopology and data locality more pluggable for other 
deploying topology like: virtualization.

Hello Folks,
  I just filed an umbrella JIRA today to address the current NetworkTopology
issue of binding strictly to a three-tier network. The motivation here is to
make Hadoop more flexible for different deployment topologies (especially the
cloud/virtualization case) and more configurable in data-locality-related
policies such as replica placement, task scheduling, choosing blocks for
DFSClient reading, and balancing.
  We submitted a draft proposal in this umbrella JIRA as well as the
implementation code. As the code base is large (~260K), the code is separated
into 7 sub-JIRAs, which seems more convenient for reviewing. However, we split
the code based on functionality, which causes some dependencies between
patches; we are not sure this is the best way. You are welcome to provide
comments and suggestions on the doc and code, and we look forward to working
with all of you to enhance Hadoop for these new situations.
  Hope this is a good start.

Cheers,

Junping

- Original Message -
From: Junping Du (JIRA) j...@apache.org
To: common-iss...@hadoop.apache.org
Sent: Monday, June 4, 2012 12:09:22 PM
Subject: [jira] [Created] (HADOOP-8468) Umbrella of enhancements to support 
different failure and locality topologies

Junping Du created HADOOP-8468:
--

 Summary: Umbrella of enhancements to support different failure and 
locality topologies
 Key: HADOOP-8468
 URL: https://issues.apache.org/jira/browse/HADOOP-8468
 Project: Hadoop Common
  Issue Type: Bug
  Components: ha, io
Affects Versions: 2.0.0-alpha, 1.0.0
Reporter: Junping Du
Assignee: Junping Du
Priority: Critical


The current Hadoop network topology (described in some previous issues such as
HADOOP-692) worked well for the classic three-tier network when it was
introduced. However, it does not take into account other failure models or
changes in the infrastructure that can affect network bandwidth efficiency,
such as virtualization.
A virtualized platform has the following characteristics that shouldn't be
ignored by the Hadoop topology when scheduling tasks, placing replicas,
balancing, or fetching blocks for reading:
1. VMs on the same physical host are affected by the same hardware failure. In
order to match the reliability of a physical deployment, replication of data
across two virtual machines on the same host should be avoided.
2. The network between VMs on the same physical host has higher throughput and
lower latency and does not consume any physical switch bandwidth.
Thus, we propose to make the Hadoop network topology extensible and to
introduce a new level in the hierarchical topology, a node group level, which
maps well onto an infrastructure based on a virtualized environment.
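
To illustrate the node group idea, here is a tiny, purely illustrative sketch
(the path strings and class name are made up; this is not code from the
patches): with the extra level, a VM resolves to a four-level location instead
of the classic /datacenter/rack/host form, so placement policies can tell
apart VMs that share a physical host.

  // Illustrative only: comparing a classic three-level location with a
  // four-level location that adds a node group (one physical host's VMs).
  public class NodeGroupTopologyExample {
    public static void main(String[] args) {
      String classic  = "/dc1/rack1/host-101";            // three tiers: dc/rack/host
      String extended = "/dc1/rack1/nodegroup-7/vm-101";  // node group = VMs on one physical host
      // Replica placement can then avoid putting two copies of a block into
      // the same node group, i.e. onto the same physical machine.
      System.out.println(classic);
      System.out.println(extended);
    }
  }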

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-4505) Create a combiner bypass path for keys with a single value

2012-08-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created MAPREDUCE-4505:


 Summary: Create a combiner bypass path for keys with a single value
 Key: MAPREDUCE-4505
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4505
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Reporter: Owen O'Malley


It would help optimize a lot of cases where there aren't many repeated keys
if the framework bypassed the deserialize/combine/serialize step for keys
that have only a single value.
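
A purely illustrative sketch of the proposed fast path (the types below are
made-up stand-ins, not Hadoop framework classes): a key group with exactly one
value is copied through in its serialized form, skipping the
deserialize/combine/serialize round trip.

  import java.io.IOException;
  import java.util.List;
  import java.util.Map;

  // Hypothetical sketch only; the interfaces below are illustrative stand-ins.
  public class SingleValueBypassSketch {

    interface RawWriter {
      void writeRaw(byte[] key, byte[] value) throws IOException;
    }

    interface Combiner {
      byte[] combine(byte[] key, List<byte[]> values) throws IOException;
    }

    static void combineGroups(Map<byte[], List<byte[]>> groups,
                              Combiner combiner, RawWriter out) throws IOException {
      for (Map.Entry<byte[], List<byte[]>> e : groups.entrySet()) {
        List<byte[]> values = e.getValue();
        if (values.size() == 1) {
          // Bypass: a single value needs no combining, so write the serialized
          // key/value straight through without deserializing.
          out.writeRaw(e.getKey(), values.get(0));
        } else {
          // Normal path: deserialize/combine/serialize (hidden inside combine()).
          out.writeRaw(e.getKey(), combiner.combine(e.getKey(), values));
        }
      }
    }
  }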

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira