Content request for 0.20.205 Sustaining Release

Nathan Roberts Wed, 31 Aug 2011 08:26:07 -0700

Hi Matt,

My colleagues with experience running 0.20.203 (as well as many previous 
releases of Hadoop) in Yahoo!'s production environment are requesting that the 
following items be included as high priority sustaining improvements in 
0.20.205.  Rather than contributors sending in several separate requests to 
this mailing list, this email aggregates contributions from the following 
individuals: Daryn Sharp, Jeffrey Naisbitt, Kihwal Lee, Sherry Chen, Thomas 
Graves, Bharath Mundlapudi, Robert Joseph Evans, Anupam Seth, Eric Payne, John 
George.


Recommendations and suggestions for this list of jiras came from folks with 
significant experience working with large scale Hadoop clusters within Yahoo! 
production environments, including Service Engineering teams, Quality 
Engineering teams, Solutions Engineering teams, and Development teams.

Notes on the items listed below:

 *   All the Jiras listed with the exception of HADOOP-7510, MAPREDUCE-2764, 
MAPREDUCE-2915, and HDFS-2257 have been committed to 0.20-security. These 
remaining four jiras are in-progress and should wrap up over the next few days.
 *   All of the jiras listed have been fixed in trunk with the following 
exceptions:
    *   The four jiras listed above, which are still being worked on
    *   MAPREDUCE-2780 - Similar to previous bullet. In progress now.
    *   MAPREDUCE-2324 - Has a strong interaction with MR279 so filed 
MAPREDUCE-2723 to make sure this is handled correctly in yarn
    *   MAPREDUCE-2729, MAPREDUCE-2621 - Don't make sense after integration of 
MR279

Thank you for considering this list of Jiras for inclusion in 0.20.205.
Nathan Roberts

====

MAPREDUCE-2489 - Jobsplits with random hostnames can make the queue unusable
Justification: A broken job that is issuing random hostnames to the job tracker 
can hang up a queue and severely impact the performance of the job tracker.
Risk: Low. Change involves a simple check for obviously malformed hostnames.
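
For illustration only, a check for obviously malformed hostnames might look
like the Java sketch below. This is not the code from the MAPREDUCE-2489
patch: the class and method names (HostnameCheck, isPlausibleHostname) are
hypothetical, and the rule applied here (dot-separated labels of letters,
digits and hyphens) is an assumption about what "obviously malformed" means.

```java
import java.util.regex.Pattern;

// Hypothetical helper; NOT the code committed for MAPREDUCE-2489.
final class HostnameCheck {
    // One DNS label: 1-63 characters, letters/digits/hyphens,
    // with no leading or trailing hyphen.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    private HostnameCheck() {}

    /** Returns true if the string is syntactically plausible as a hostname. */
    static boolean isPlausibleHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // A limit of -1 keeps empty trailing pieces, so "a..b" and "a." are
        // rejected (this sketch does not accept a trailing-dot FQDN).
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

A purely syntactic check like this needs no DNS lookup, so it stays cheap
even when a broken job submits a large number of bogus names.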

MAPREDUCE-2852 - Remove YDH Bug 2854624 from code comments
Justification: Comment change only
Risk: Low

HADOOP-7472 - RPC client should deal with the IP address changes
Justification: If the IP address of a namenode is changed, all clients must be
restarted. This can be very expensive and difficult to execute when many of the
clients are outside the cluster proper (e.g. distcp).
Risk: Low. If an address change is suspected, the code now performs an 
additional lookup and updates the address. Does not affect normal path.
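
As a rough sketch of the "additional lookup" idea (not the actual
HADOOP-7472 change), a client could keep the original hostname and
re-resolve it only after a connection failure. The ServerAddress class, its
Resolver interface and the refreshIfChanged method are hypothetical names
introduced for this example.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical sketch; NOT the Hadoop RPC client code.
final class ServerAddress {
    /** Lookup hook; production code could pass InetAddress::getByName. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    private final String host;
    private final int port;
    private final Resolver resolver;
    private InetSocketAddress current;

    ServerAddress(String host, int port, Resolver resolver)
            throws UnknownHostException {
        this.host = host;
        this.port = port;
        this.resolver = resolver;
        this.current = new InetSocketAddress(resolver.resolve(host), port);
    }

    InetSocketAddress current() {
        return current;
    }

    /**
     * Intended to be called only after a connection failure, so the normal
     * path pays no extra lookup. Returns true if the address was updated.
     */
    boolean refreshIfChanged() throws UnknownHostException {
        InetAddress fresh = resolver.resolve(host);
        if (fresh.equals(current.getAddress())) {
            return false; // same IP; the failure has some other cause
        }
        current = new InetSocketAddress(fresh, port);
        return true;
    }
}
```

Making the resolver pluggable is a choice made for this example so the
behaviour can be exercised without real DNS.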

MAPREDUCE-2729 - Reducers are always counted having pending tasks even if they 
can't be scheduled yet because not enough of their mappers have completed
Justification: Reducer slots are not properly allocated when reducers are
waiting on map tasks to finish, so a queue can end up significantly
underutilized. In grids where queues are configured with relatively tight
constraints, this can substantially degrade throughput.
Risk: Medium/Low. No change to the scheduler can be taken lightly, so in those
terms it's medium. However, the change itself is straightforward and the experts
agree it was a bug.

MAPREDUCE-2705 - tasks localized and launched serially by TaskLauncher - 
causing other tasks to be delayed
Justification: Large localization processes lock up task launcher for 
potentially very long periods of time. This can result in significant delays 
for other tasks assigned to the same compute node.
Risk: Low. Localization is performed in a separate thread but overall flow for 
a particular task remains unchanged.

MAPREDUCE-2651 - Race condition in Linux Task Controller for job log directory 
creation
Justification: Tasks can fail because of a race to create the job log directory.
Risk: Low. Deals with EEXIST more consistently.
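
The actual fix lives in the native task controller. The Java sketch below
only illustrates the general pattern of tolerating "already exists" when
several tasks race to create the same directory. LogDirs and
ensureDirectory are hypothetical names, and java.nio.file is used purely
for brevity.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; NOT the task-controller code.
final class LogDirs {
    private LogDirs() {}

    /**
     * Creates dir if it is missing. Losing the creation race to another
     * task is treated as success, as long as the path is a directory.
     */
    static void ensureDirectory(Path dir) throws IOException {
        try {
            Files.createDirectory(dir);
        } catch (FileAlreadyExistsException e) {
            // The Java counterpart of EEXIST: acceptable only when the
            // existing entry really is a directory.
            if (!Files.isDirectory(dir)) {
                throw e;
            }
        }
    }
}
```

The point is that "create, then treat already-exists as success" avoids the
check-then-create window in which two tasks can both see the directory as
missing.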

MAPREDUCE-2650 - back-port MAPREDUCE-2238 to 0.20-security
Justification: Permission handling within localization causes races and can 
leave directories with broken permissions. Adversely affects test 
reproducibility.
Risk: Low. Fix has been in trunk and 22 for several months.

MAPREDUCE-2621 - TestCapacityScheduler fails with Queue q1 does not exist
Justification: Hudson unit test failures
Risk: Low. Changes just create an explicit association between the QueueManager 
and JT.

MAPREDUCE-2494 - Make the distributed cache delete entries using LRU priority
Justification: Some regularly recurring jobs require large distributed cache
contents. The current scheme deletes these contents when the distributed cache
fills up, so these jobs pay the localization penalty repeatedly.
Risk: Low/Medium - Currently eviction is all or nothing. This change just 
orders the eviction and doesn't do it all at once. All of the races dealing 
with eviction were already dealt with in the code so no additional risk from 
that standpoint.
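
To make the difference from all-or-nothing eviction concrete, here is a
minimal Java sketch of size-bounded LRU eviction. It is not the
MAPREDUCE-2494 patch: LruCacheIndex and its methods are hypothetical, and
it tracks only entry sizes, not files on disk or reference counts.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; NOT the distributed cache implementation.
final class LruCacheIndex {
    private final long capacityBytes;
    private long usedBytes = 0;
    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Long> sizes =
        new LinkedHashMap<>(16, 0.75f, true);

    LruCacheIndex(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Marks an entry as recently used. */
    void touch(String key) {
        sizes.get(key);
    }

    boolean contains(String key) {
        return sizes.containsKey(key);
    }

    /** Adds an entry, then evicts the oldest entries until under capacity. */
    List<String> add(String key, long size) {
        Long old = sizes.put(key, size);
        usedBytes += size - (old == null ? 0L : old);
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        while (usedBytes > capacityBytes && it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (e.getKey().equals(key)) {
                continue; // never evict the entry that was just added
            }
            usedBytes -= e.getValue();
            evicted.add(e.getKey());
            it.remove();
        }
        return evicted;
    }
}
```

With a 100-byte capacity and three 40-byte entries, only the least recently
used entry is evicted when the third one arrives, and an entry that was
touched in between survives.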

MAPREDUCE-2324 - Job should fail if a reduce task can't be scheduled anywhere
Justification: Jobs can get stuck in limbo emitting tons of messages to the
logs about not being able to schedule the reduce. It's best to either just
attempt the reduce and let it fail, or put in more sophisticated logic to
attempt to fail these jobs before attempting the reduce at all.
Risk: Low. Change now just removes the check which tried to prevent this. So, 
the job will be attempted and will just fail through the normal course.

MAPREDUCE-2187 - map tasks timeout during sorting
Justification: No progress is reported during merge sort, so if this phase
takes too long, the tasks can time out and fail.
Risk: Low. Adds new progress report point during merge sort.
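
The general idea of a progress report point inside a merge can be sketched
in Java as below. This is an illustration rather than the MAPREDUCE-2187
change: ProgressMerge, its Progress callback and the "every N records"
interval are assumptions made for the example.

```java
// Hypothetical sketch; NOT the map-side sort code.
final class ProgressMerge {
    /** Callback standing in for whatever tells the framework "still alive". */
    interface Progress {
        void report();
    }

    private ProgressMerge() {}

    /** Merges two sorted arrays, reporting progress every `every` records. */
    static int[] merge(int[] a, int[] b, int every, Progress progress) {
        if (every <= 0) {
            throw new IllegalArgumentException("every must be positive");
        }
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length || j < b.length) {
            if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                out[k++] = a[i++];
            } else {
                out[k++] = b[j++];
            }
            if (k % every == 0) {
                progress.report(); // keeps a long merge from looking hung
            }
        }
        return out;
    }
}
```

Reporting by record count rather than by elapsed time keeps the loop free of
clock calls; the interval would have to be small enough that a slow merge
still reports well inside the task timeout.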

HDFS-2202 - Changes to balancer bandwidth should not require datanode restart.
Justification: There are times when operations needs to either speed up or slow 
down the balancer bandwidth. The system should support doing so without 
restarting the datanodes.
Risk: Low/Medium - If feature is not used, code paths are the same.

HDFS-1836 - Thousand of CLOSE_WAIT socket
Justification: Clients can chew up socket connections by not closing down 
correctly.
Risk: Low.

HADOOP-7432 - Back-port HADOOP-7110 to 0.20-security
Justification: Fixes build/UT failures due to racy chmod and improves
performance by using JNI chmod rather than forking.
Risk: Low. Backport of fix for 22 from Todd Lipcon.

HADOOP-7314 - Add support for throwing UnknownHostException when a host doesn't 
resolve
Justification: Tied to MAPREDUCE-2489. Same justification.
Risk: Same risk as MAPREDUCE-2489

MAPREDUCE-2764 - Fix renewal of dfs delegation tokens
Justification: Long running jobs like distcp may repeatedly fail to renew 
delegation tokens even after an intermittent error has been corrected. The 
repeated failures can overwhelm the job tracker causing the entire grid to have 
difficulty.
Risk: Medium risk. Requires a low-level change to the tokens to include enough 
information so that the token can be renewed later.

HDFS-2257 - HftpFileSystem should implement GetDelegationTokens
Justification: Required for MAPREDUCE-2764
Risk: Medium - See MAPREDUCE-2764

MAPREDUCE-2780 - MAPREDUCE-2764 Standardize the value of token service
Justification: Required for MAPREDUCE-2764
Risk: Low. Creates a setService method rather than having all token producers 
do this themselves. No change to actual tokens.

HADOOP-7510 - Tokens should use original hostname provided instead of ip
Justification: Need this in order to support namenode changing IP address. 
Otherwise as soon as next task tries to look something up in token cache using 
ip, it will fail to find the proper token and then fail to execute.
Risk: Medium risk. Requires a change to the information maintained in the token.

HADOOP-7539 - merge hadoop archive goodness from trunk to 0.20
Justification: HAR support regressed somewhat when merging to 0.20.203. This 
jira brings HAR support back to what's in trunk.
Risk: Low risk. Doesn't affect mainline HDFS/MAPREDUCE.

HADOOP-6889 - Make RPC to have an option to timeout
Justification: Clients can hang when issuing RPCs to troubled datanodes because 
there is no RPC timeout. Has been pulled into 0.22, 0.20.append.
Risk: Low. Running in 0.20.append, trunk and 22. Fixed 12 months ago.

MAPREDUCE-2915 - LinuxTaskController does not work when
JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is enabled
Justification: If one does not use the JNI versions of these methods, the
namenode and jobtracker frequently have to fork. Especially in the case of the
namenode, this can cause many seconds of unavailability due to the time it
takes the Linux kernel to copy hundreds of MB of page tables and exec a new
process.
Risk: Low. Fix adds a missing argument when launching the linuxtaskcontroller 
(path to native libraries)
