[
https://issues.apache.org/jira/browse/HADOOP-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596743#action_12596743
]
Vinod Kumar Vavilapalli commented on HADOOP-3389:
-------------------------------------------------
Debugged this on the hudson.zones Solaris box and found the actual reason for
the failure.
When I run /bin/sh -c "sleep 10", two processes are spawned (two?!):
bq. 28998 /bin/sh /bin/sh /bin/sh (??!)
and
bq. 28999 sleep 10
On RHEL, by contrast, it spawns only one process, which is the process we wish to run.
bq. 29015 sleep 10
We use /bin/sh -c "command" to run any external command from inside HOD, in the
src.contrib.hod.Common.threads.simpleCommand class. The pid returned by this
class is that of the direct child (here the /bin/sh process with pid 28998,
which doesn't return for 300 secs in this test case), not that of the sleep
process, and this explains the issue. The issue went away when I made
simpleCommand use /bin/bash instead of /bin/sh to run the given command.
So, a possible fix would be to change the implementation of simpleCommand to
use /bin/bash instead of /bin/sh. Hadoop already uses /bin/bash all over the
place, so this shouldn't be a problem.
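A rough sketch of what such a change could look like is below. This is not the actual simpleCommand code: build_shell_argv and run_command are hypothetical helpers made up for illustration, and the only point is that the shell becomes a parameter defaulting to /bin/bash rather than a hard-coded /bin/sh. Popen.pid is always the pid of the direct child; whether that child ends up being the shell or the command itself depends on how the shell handles -c, as seen above.

```python
import subprocess


def build_shell_argv(command, shell="/bin/bash"):
    """Return the argv that runs `command` through `<shell> -c`.

    Hypothetical helper: the default of /bin/bash mirrors the fix
    proposed in this comment.
    """
    return [shell, "-c", command]


def run_command(command, shell="/bin/bash"):
    """Start `command` via `<shell> -c` and return the Popen object.

    The returned object's .pid is the pid of the direct child, i.e. the
    shell process, unless the shell replaces itself with the command.
    """
    return subprocess.Popen(build_shell_argv(command, shell))
```

For example, run_command("sleep 10").pid can be compared against the output of ps on each platform to check which process the pid actually refers to.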
Side Notes:
1) The first process (pid 28998) is odd in any case. Is this how Solaris's
/bin/sh behaves, and if so, why?
2) Nigel observed that /bin/sh is a link to /sbin/sh on this machine (ok/weird?)
> [HOD] HOD unit test RunHodCleanupTests fails on solaris
> -------------------------------------------------------
>
> Key: HADOOP-3389
> URL: https://issues.apache.org/jira/browse/HADOOP-3389
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/hod
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Hemanth Yamijala
>
> HOD unit test RunHodCleanupTests fails on Solaris. This was first observed
> while submitting HADOOP-3023. The first time, Hudson failed to run this test
> altogether. The second time, it took 300 secs to finish, instead of returning
> immediately as observed on RHEL boxes. Because of this, HADOOP-3023 is
> blocked, and so Hudson cannot run any HOD unit test at all.