Hadoop-Mapreduce-trunk - Build # 1574 - Still Failing

2013-10-10 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1574/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 33695 lines...]
  TestReduceFetchFromPartialMem.testReduceFromPartialMem:93-runJob:300 » OutOfMemory
  TestJobSysDirWithDFS.testWithDFS:130 » YarnRuntime java.lang.OutOfMemoryError:...
  TestLazyOutput.testLazyOutput:146 » YarnRuntime java.lang.OutOfMemoryError: un...
  TestSpecialCharactersInOutputPath.testJobWithDFS:112 » YarnRuntime java.lang.O...
  TestMapReduceLazyOutput.testLazyOutput:136 » YarnRuntime java.lang.OutOfMemory...
  TestSpeculativeExecution.setup:122 » IO Cannot run program stat: java.io.IOE...
  TestMRJobs.setup:130 » YarnRuntime java.lang.OutOfMemoryError: unable to creat...
  TestRMNMInfo.setup:84 » IO Cannot run program stat: java.io.IOException: err...
  TestUberAM.setup:45-TestMRJobs.setup:130 » YarnRuntime java.lang.OutOfMemoryE...

Tests run: 455, Failures: 8, Errors: 11, Skipped: 11

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] hadoop-mapreduce-client ... SUCCESS [2.543s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [43.385s]
[INFO] hadoop-mapreduce-client-common  SUCCESS [24.790s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [2.472s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [6:47.311s]
[INFO] hadoop-mapreduce-client-hs  SUCCESS [2:02.866s]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [49:40.789s]
[INFO] hadoop-mapreduce-client-hs-plugins  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 59:44.837s
[INFO] Finished at: Thu Oct 10 14:19:09 UTC 2013
[INFO] Final Memory: 22M/93M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-mapreduce-client-jobclient: ExecutionException; nested exception 
is java.util.concurrent.ExecutionException: java.lang.RuntimeException: The 
forked VM terminated without saying properly goodbye. VM crash or System.exit 
called ?
[ERROR] Command was/bin/sh -c cd 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
  /home/jenkins/tools/java/jdk1.6.0_26/jre/bin/java -Xmx1024m 
-XX:+HeapDumpOnOutOfMemoryError -jar 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefirebooter5605906304175332674.jar
 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefire9153747844174506124tmp
 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefire_1115428300805884185348tmp
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn goals -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating MAPREDUCE-5569
Updating HDFS-5323
Updating HADOOP-9470
Updating HDFS-4510
Updating YARN-1284
Updating YARN-1283
Updating YARN-879
Updating HADOOP-10031
Updating MAPREDUCE-5102
Updating HDFS-5337
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

Re: [VOTE] Release Apache Hadoop 2.2.0

2013-10-10 Thread Arpit Gupta
+1 (non binding)

Ran secure and non secure multi node clusters and tested HA and RM recovery 
tests.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Oct 7, 2013, at 12:00 AM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,
 
 I've created a release candidate (rc0) for hadoop-2.2.0 that I would like to 
 get released - this release fixes a small number of bugs and some 
 protocol/api issues which should ensure they are now stable and will not 
 change in hadoop-2.x.
 
 The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
 The RC tag in svn is here: 
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
 
 The maven artifacts are available via repository.apache.org.
 
 Please try the release and vote; the vote will run for the usual 7 days.
 
 thanks,
 Arun
 
 P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail down 
 the symlinks-related issues. I'll release note the fact that we have disabled 
 it in 2.2. Also, thanks to Vinod for some heavy-lifting on the YARN side in 
 the last couple of weeks.
 
 
 
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 


RE: [VOTE] Release Apache Hadoop 2.2.0

2013-10-10 Thread Bikas Saha
+1 (non binding)

-Original Message-
From: Arpit Gupta [mailto:ar...@hortonworks.com]
Sent: Thursday, October 10, 2013 10:06 AM
To: common-...@hadoop.apache.org
Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org;
mapreduce-dev@hadoop.apache.org
Subject: Re: [VOTE] Release Apache Hadoop 2.2.0

+1 (non binding)

Ran secure and non secure multi node clusters and tested HA and RM
recovery tests.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Oct 7, 2013, at 12:00 AM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 I've created a release candidate (rc0) for hadoop-2.2.0 that I would
like to get released - this release fixes a small number of bugs and some
protocol/api issues which should ensure they are now stable and will not
change in hadoop-2.x.

 The RC is available at:
 http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
 The RC tag in svn is here:
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0

 The maven artifacts are available via repository.apache.org.

 Please try the release and vote; the vote will run for the usual 7 days.

 thanks,
 Arun

 P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail
down the symlinks-related issues. I'll release note the fact that we have
disabled it in 2.2. Also, thanks to Vinod for some heavy-lifting on the
YARN side in the last couple of weeks.





 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/





[jira] [Created] (MAPREDUCE-5575) History files deleted from the intermediate directory never get removed from the JobListCache

2013-10-10 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5575:
-

 Summary: History files deleted from the intermediate directory 
never get removed from the JobListCache
 Key: MAPREDUCE-5575
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5575
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.2.0
Reporter: Sandy Ryza


The JobHistoryServer periodically scans through the intermediate directory. It 
adds all files to the JobListCache. It deletes job files that are older than 
the max age and moves all other files to the done directory.  Later, when files 
in the done directory become too old, they're deleted from the JobListCache.  
Jobs that were deleted in the intermediate directory (and thus never moved to 
the done directory) end up in the JobListCache but can never be deleted from it.
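To make that lifecycle concrete, here is a toy model of the flow described above (all names are illustrative and do not come from the actual HistoryFileManager/JobListCache code):

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only; not the JobHistoryServer implementation.
public class JobListCacheLeakSketch {
  public static void main(String[] args) {
    Map<String, String> jobListCache = new LinkedHashMap<String, String>();

    // Intermediate-directory scan: every history file is added to the cache,
    // then either deleted (older than the max age) or moved to the done dir.
    String[] intermediateFiles = {"job_1_too_old", "job_2_recent"};
    for (String file : intermediateFiles) {
      jobListCache.put(file, "CACHED");
      if (file.endsWith("too_old")) {
        System.out.println("deleted from intermediate dir: " + file);
      } else {
        System.out.println("moved to done dir: " + file);
      }
    }

    // Cache eviction only happens when a file ages out of the done directory,
    // so the entry for the file deleted above is never removed.
    jobListCache.remove("job_2_recent");
    System.out.println("entries that can never be evicted: " + jobListCache.keySet());
  }
}
{code}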





Re: [VOTE] Release Apache Hadoop 2.2.0

2013-10-10 Thread Chris Nauroth
+1 non-binding

I verified the checksum and signature.  I deployed the tarball to a small
cluster of Ubuntu VMs: 1 * NameNode, 1 * ResourceManager, 2 * DataNode, 2 *
NodeManager, 1 * SecondaryNameNode.  I ran a few HDFS commands and sample
MapReduce jobs.  I verified that the 2NN can take a checkpoint
successfully.  Everything worked as expected.

The outcome of the recent discussions on HDFS symlinks was that we need to
disable the feature in this release.  Just to be certain that this patch
took, I wrote a small client to call FileSystem.createSymlink and tried to
run it in my 2.2.0 cluster.  It threw UnsupportedOperationException, which
is the expected behavior.
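
For anyone who wants to repeat that check, a minimal client along these lines should behave the same way (the class name and paths are illustrative, not the exact client used above):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SymlinkCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try {
      // With symlinks disabled in 2.2.0, this call is expected to be rejected.
      fs.createSymlink(new Path("/tmp/symlink-target"),
          new Path("/tmp/symlink-link"), false);
      System.out.println("Symlink created (unexpected)");
    } catch (UnsupportedOperationException e) {
      System.out.println("Symlinks disabled, as expected: " + e.getMessage());
    }
  }
}
{code}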

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Thu, Oct 10, 2013 at 10:18 AM, Bikas Saha bi...@hortonworks.com wrote:

 +1 (non binding)

 -Original Message-
 From: Arpit Gupta [mailto:ar...@hortonworks.com]
 Sent: Thursday, October 10, 2013 10:06 AM
 To: common-...@hadoop.apache.org
 Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org;
 mapreduce-dev@hadoop.apache.org
 Subject: Re: [VOTE] Release Apache Hadoop 2.2.0

 +1 (non binding)

 Ran secure and non secure multi node clusters and tested HA and RM
 recovery tests.

 --
 Arpit Gupta
 Hortonworks Inc.
 http://hortonworks.com/

 On Oct 7, 2013, at 12:00 AM, Arun C Murthy a...@hortonworks.com wrote:

  Folks,
 
  I've created a release candidate (rc0) for hadoop-2.2.0 that I would
 like to get released - this release fixes a small number of bugs and some
 protocol/api issues which should ensure they are now stable and will not
 change in hadoop-2.x.
 
  The RC is available at:
  http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
  The RC tag in svn is here:
  http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
 
  The maven artifacts are available via repository.apache.org.
 
  Please try the release and vote; the vote will run for the usual 7 days.
 
  thanks,
  Arun
 
  P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail
 down the symlinks-related issues. I'll release note the fact that we have
 disabled it in 2.2. Also, thanks to Vinod for some heavy-lifting on the
 YARN side in the last couple of weeks.
 
 
 
 
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
 
 


Re: [VOTE] Release Apache Hadoop 2.2.0

2013-10-10 Thread Jian He
+1 non-binding

Built from source code, and ran a few sample jobs on single node cluster,
tested RM and AM recovery.

Thanks,
Jian


On Thu, Oct 10, 2013 at 10:59 AM, Chris Nauroth cnaur...@hortonworks.com wrote:

 +1 non-binding

 I verified the checksum and signature.  I deployed the tarball to a small
 cluster of Ubuntu VMs: 1 * NameNode, 1 * ResourceManager, 2 * DataNode, 2 *
 NodeManager, 1 * SecondaryNameNode.  I ran a few HDFS commands and sample
 MapReduce jobs.  I verified that the 2NN can take a checkpoint
 successfully.  Everything worked as expected.

 The outcome of the recent discussions on HDFS symlinks was that we need to
 disable the feature in this release.  Just to be certain that this patch
 took, I wrote a small client to call FileSystem.createSymlink and tried to
 run it in my 2.2.0 cluster.  It threw UnsupportedOperationException, which
 is the expected behavior.

 Chris Nauroth
 Hortonworks
 http://hortonworks.com/



 On Thu, Oct 10, 2013 at 10:18 AM, Bikas Saha bi...@hortonworks.com
 wrote:

  +1 (non binding)
 
  -Original Message-
  From: Arpit Gupta [mailto:ar...@hortonworks.com]
  Sent: Thursday, October 10, 2013 10:06 AM
  To: common-...@hadoop.apache.org
  Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org;
  mapreduce-dev@hadoop.apache.org
  Subject: Re: [VOTE] Release Apache Hadoop 2.2.0
 
  +1 (non binding)
 
  Ran secure and non secure multi node clusters and tested HA and RM
  recovery tests.
 
  --
  Arpit Gupta
  Hortonworks Inc.
  http://hortonworks.com/
 
  On Oct 7, 2013, at 12:00 AM, Arun C Murthy a...@hortonworks.com wrote:
 
   Folks,
  
   I've created a release candidate (rc0) for hadoop-2.2.0 that I would
  like to get released - this release fixes a small number of bugs and some
  protocol/api issues which should ensure they are now stable and will not
  change in hadoop-2.x.
  
   The RC is available at:
   http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
   The RC tag in svn is here:
   http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
  
   The maven artifacts are available via repository.apache.org.
  
   Please try the release and vote; the vote will run for the usual 7
 days.
  
   thanks,
   Arun
  
   P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail
  down the symlinks-related issues. I'll release note the fact that we have
  disabled it in 2.2. Also, thanks to Vinod for some heavy-lifting on the
  YARN side in the last couple of weeks.
  
  
  
  
  
   --
   Arun C. Murthy
   Hortonworks Inc.
   http://hortonworks.com/
  
  
  

[jira] [Created] (MAPREDUCE-5576) MR AM unregistration should be failed due to UnknownHostException on getting history url

2013-10-10 Thread Zhijie Shen (JIRA)
Zhijie Shen created MAPREDUCE-5576:
--

 Summary: MR AM unregistration should be failed due to 
UnknownHostException on getting history url
 Key: MAPREDUCE-5576
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5576
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen


Before the RMCommunicator sends the request to the RM to finish the application, it 
tries to get the JHS URL, which may throw an UnknownHostException. The current 
code path skips sending the request to the RM when the exception is raised, 
which is not reasonable behavior, because the RM's unregistering of an AM is not 
affected by the tracking URL; the URL can be empty or null.

AFAIK, the impact of a null URL is that the link redirecting users from the RM 
web page to the JHS will be unavailable, and the job report will not show the URL 
either. However, that is much better than failing the application because of an 
UnknownHostException here; users can still go to the JHS directly to find the 
application history info.

Therefore, the reasonable code path here is to catch the UnknownHostException and 
set historyUrl to null.
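
A minimal sketch of that handling, with a hypothetical helper standing in for the real RMCommunicator code (the class, method, host, and port below are illustrative only):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only; this is not the actual RMCommunicator code.
public class HistoryUrlSketch {

  static String resolveHistoryUrl(String jhsHost, int webPort) {
    try {
      // Resolving the JHS host is where the UnknownHostException can surface.
      InetAddress addr = InetAddress.getByName(jhsHost);
      return "http://" + addr.getHostName() + ":" + webPort + "/jobhistory";
    } catch (UnknownHostException e) {
      // Per the proposal above: do not fail the unregistration, just drop the URL.
      return null;
    }
  }

  public static void main(String[] args) {
    // Prints null instead of throwing, so unregistration could still proceed.
    System.out.println(resolveHistoryUrl("no-such-host.invalid", 19888));
  }
}
{code}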







pipes not working in MR2?

2013-10-10 Thread Sandy Ryza
I'm unable to get a simple hadoop pipes job working in MR2, and got the
sense it hasn't been working for a while.  Does anybody have any insight
into what's going on?  Has anybody used them successfully recently?

thanks for any help,
Sandy


[jira] [Created] (MAPREDUCE-5577) Allow querying the JobHistoryServer by job arrival time

2013-10-10 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5577:
-

 Summary: Allow querying the JobHistoryServer by job arrival time
 Key: MAPREDUCE-5577
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5577
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Reporter: Sandy Ryza
Assignee: Sandy Ryza


The JobHistoryServer REST APIs currently allow querying by job submit time and 
finish time.  However, jobs don't necessarily arrive at the JobHistoryServer in 
order of their finish time, meaning that a client that wants to stay on top of all 
completed jobs needs to query large time intervals to make sure it isn't missing 
anything.  Exposing the ability to query by the time a job lands at the 
JobHistoryServer would allow clients to set the start of their query interval 
to the time of their last query. 

The arrival time of a job would be defined as the time that it lands in the 
done directory. 
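
For context, today a client has to poll the finish-time filter of the existing jobs API, roughly as in the sketch below (the host is a placeholder; 19888 is the default JHS web port). A job that finished before lastPoll but only arrived at the JobHistoryServer afterwards falls outside this window, which is exactly the gap an arrival-time filter would close:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative polling client for the existing REST API.
public class JhsPollSketch {
  public static void main(String[] args) throws Exception {
    long lastPoll = System.currentTimeMillis() - 60L * 1000L;
    URL url = new URL("http://jhs.example.com:19888/ws/v1/history/mapreduce/jobs"
        + "?finishedTimeBegin=" + lastPoll);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    } finally {
      in.close();
    }
  }
}
{code}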






[jira] [Resolved] (MAPREDUCE-5512) TaskTracker hung after failed reconnect to the JobTracker

2013-10-10 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic resolved MAPREDUCE-5512.
---

   Resolution: Fixed
Fix Version/s: 1.3.0
   1-win

Fix committed to branch-1 and branch-1-win. 

 TaskTracker hung after failed reconnect to the JobTracker
 -

 Key: MAPREDUCE-5512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 1-win, 1.3.0

 Attachments: hadoop-tasktracker-RD00155DD09100.log, 
 MAPREDUCE-5512.branch-1.patch, tt_Hung.txt


 TaskTracker hung after failed reconnect to the JobTracker. 
 This is the problematic piece of code:
 {code}
 this.distributedCacheManager = new TrackerDistributedCacheManager(
     this.fConf, taskController);
 this.distributedCacheManager.startCleanupThread();

 this.jobClient = (InterTrackerProtocol)
     UserGroupInformation.getLoginUser().doAs(
         new PrivilegedExceptionAction<Object>() {
           public Object run() throws IOException {
             return RPC.waitForProxy(InterTrackerProtocol.class,
                 InterTrackerProtocol.versionID,
                 jobTrackAddr, fConf);
           }
         });
 {code}
 In case RPC.waitForProxy() throws, the TrackerDistributedCacheManager cleanup 
 thread will never be stopped, and given that it is a non-daemon thread, it 
 will keep the TT up forever.
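
One possible shape of the fix (a sketch only; the committed patch is not quoted here and may differ) is to stop the cleanup thread when the connection attempt fails, assuming TrackerDistributedCacheManager exposes a stopCleanupThread() counterpart to startCleanupThread():

{code}
this.distributedCacheManager = new TrackerDistributedCacheManager(
    this.fConf, taskController);
this.distributedCacheManager.startCleanupThread();

boolean connected = false;
try {
  this.jobClient = (InterTrackerProtocol)
      UserGroupInformation.getLoginUser().doAs(
          new PrivilegedExceptionAction<Object>() {
            public Object run() throws IOException {
              return RPC.waitForProxy(InterTrackerProtocol.class,
                  InterTrackerProtocol.versionID,
                  jobTrackAddr, fConf);
            }
          });
  connected = true;
} finally {
  if (!connected) {
    // Without this, the non-daemon cleanup thread keeps the TaskTracker
    // process alive even though startup failed.
    this.distributedCacheManager.stopCleanupThread();
  }
}
{code}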


