Thanks for the investigation, Kihwal. I will keep an eye on future test failures in TestRowCounter.
On Tue, Oct 30, 2012 at 9:29 AM, Kihwal Lee <[email protected]> wrote:

> Ted,
>
> I couldn't reproduce it by just running the test case. When you reproduce
> it, look at the stderr/stdout file somewhere under
> target/org.apache.hadoop.mapred.MiniMRCluster. Look for the one under the
> directory whose name contains the app id.
>
> I did run into a similar problem and the stderr said:
>
>     /bin/bash: /bin/java: No such file or directory
>
> It was because JAVA_HOME was not set. But in that case the exit code was
> 127 (the shell not being able to locate the command to exec). In the
> Hudson job, the exit code was 1, so I think it's something else.
>
> Kihwal
>
> On 10/29/12 11:56 PM, "Ted Yu" <[email protected]> wrote:
>
>> TestRowCounter still fails:
>>
>> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/244/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterNoColumn/
>>
>> but there was no 'divide by zero' exception.
>>
>> Cheers
>>
>> On Thu, Oct 25, 2012 at 8:04 AM, Ted Yu <[email protected]> wrote:
>>
>>> I will try the 2.0.2-alpha release.
>>>
>>> Cheers
>>>
>>> On Thu, Oct 25, 2012 at 7:54 AM, Ted Yu <[email protected]> wrote:
>>>
>>>> Thanks for the quick response, Robert.
>>>> Here is the hadoop version being used:
>>>>
>>>>     <hadoop-two.version>2.0.1-alpha</hadoop-two.version>
>>>>
>>>> If there is a newer release, I am willing to try that before filing a
>>>> JIRA.
>>>>
>>>> On Thu, Oct 25, 2012 at 7:07 AM, Robert Evans <[email protected]> wrote:
>>>>
>>>>> It looks like you are running with an older version of 2.0, even
>>>>> though it does not really make much of a difference in this case. The
>>>>> issue shows up when getLocalPathForWrite thinks there is no space to
>>>>> write to on any of the disks it has configured. This could be because
>>>>> you do not have any directories configured. I really don't know for
>>>>> sure exactly what is happening.
>>>>> It might be disk fail-in-place removing disks for you because of
>>>>> other issues. Either way, we should file a JIRA against Hadoop to
>>>>> make it so we never get the "/ by zero" error, and provide a better
>>>>> way to handle the possible causes.
>>>>>
>>>>> --Bobby Evans
>>>>>
>>>>> On 10/24/12 11:54 PM, "Ted Yu" <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>> HBase has a Jenkins build against hadoop 2.0.
>>>>>> I was checking why TestRowCounter sometimes failed:
>>>>>>
>>>>>> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/231/testReport/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterExclusiveColumn/
>>>>>>
>>>>>> I think the following could be the cause:
>>>>>>
>>>>>> 2012-10-22 23:46:32,571 WARN [AsyncDispatcher event handler]
>>>>>> resourcemanager.RMAuditLogger(255): USER=jenkins OPERATION=Application
>>>>>> Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App
>>>>>> failed with state: FAILED PERMISSIONS=Application
>>>>>> application_1350949562159_0002 failed 1 times due to AM Container for
>>>>>> appattempt_1350949562159_0002_000001 exited with exitCode: -1000 due
>>>>>> to: java.lang.ArithmeticException: / by zero
>>>>>>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:355)
>>>>>>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>>>>>>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>>>>>>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
>>>>>>     at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:257)
>>>>>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:849)
>>>>>>
>>>>>> However, I don't seem to find where in getLocalPathForWrite()
>>>>>> division by zero could have arisen.
>>>>>>
>>>>>> Comment / hint is welcome.
>>>>>>
>>>>>> Thanks
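The thread closes without locating the divide, so here is one plausible mechanism, offered as an assumption rather than a reading of the actual LocalDirAllocator source: if the allocator picks a starting directory with an integer modulo over the count of usable local dirs, then an empty dir list (say, after disk-failure handling has removed every directory, as Bobby suggests) makes the divisor zero, and Java's `%` operator throws exactly `ArithmeticException: / by zero`. A minimal sketch (`pickDir` is a hypothetical stand-in, not a Hadoop method):

```java
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class DirPickSketch {
    // Hypothetical stand-in for the allocator's directory selection.
    static String pickDir(List<String> localDirs, Random rand) {
        // If localDirs is empty, size() is 0 and the modulo below throws
        // java.lang.ArithmeticException with the message "/ by zero".
        int start = Math.abs(rand.nextInt()) % localDirs.size();
        return localDirs.get(start);
    }

    public static void main(String[] args) {
        try {
            pickDir(Collections.<String>emptyList(), new Random());
        } catch (ArithmeticException e) {
            // Same message as in the Jenkins stack trace.
            System.out.println(e.getMessage()); // prints "/ by zero"
        }
    }
}
```

If this is indeed the shape of the bug, Bobby's proposed JIRA amounts to guarding the selection with an explicit "no usable local directories" check, so the failure is reported descriptively instead of surfacing as a bare arithmetic error.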
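Kihwal's exit-code reasoning relies on a shell convention worth making explicit: a POSIX shell exits with status 127 when it cannot locate the command it was asked to exec, so his missing-JAVA_HOME case (exit code 127) is distinguishable from the Jenkins failure (exit code 1). A small demonstration (the command name is deliberately nonexistent; assumes /bin/bash is available, as on the Linux build hosts discussed here):

```java
public class ExitCodeSketch {
    public static void main(String[] args) throws Exception {
        // Ask bash to run a command that does not exist anywhere on PATH.
        Process p = new ProcessBuilder("/bin/bash", "-c", "no-such-command-xyz")
                .redirectErrorStream(true)
                .start();
        // POSIX convention: 127 means the shell could not find the command.
        System.out.println("exit code: " + p.waitFor()); // prints "exit code: 127"
    }
}
```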
