[
https://issues.apache.org/jira/browse/CHUKWA-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396011#comment-14396011
]
Eric Yang edited comment on CHUKWA-743 at 4/5/15 12:22 AM:
-----------------------------------------------------------
The PidFile class should be removed. The POSIX file lock interface only works
inside the same process, not across multiple instances of the program. An old
trick was to bind the lock to a port number as an indicator of whether more
than one instance of the program is running. However, this approach may not be
safe, because a third party could connect to the bound port and cause a race
condition as well. Hence, the Hadoop shell script approach is still the best solution:
{code}
if pid file exists, kill -0 to test program running.
  if program is running
    warn the user, it's already running
    exit 0
  else
    warn the user, it's not running
    exit 1
else
  start the program
  record pid
  sleep 1
{code}
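As a rough illustration only, the steps above could be sketched as a small POSIX shell function. The name start_once and its arguments are hypothetical and are not taken from the Hadoop or Chukwa scripts; it uses return instead of exit so that it can be sourced into another script. Note that the check and the write of the pid file are separate steps, so two callers that both find no pid file could still both start the program.

```shell
#!/bin/sh
# Hypothetical helper (not from the Hadoop or Chukwa scripts).
# Usage: start_once PIDFILE COMMAND [ARGS...]
start_once() {
    pidfile=$1
    shift
    if [ -f "$pidfile" ]; then
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only reports whether the process
        # exists and can be signalled by the current user.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "already running as pid $pid" >&2
            return 0
        else
            echo "pid file $pidfile exists but the program is not running" >&2
            return 1
        fi
    fi
    # No pid file: start the program in the background and record its pid.
    "$@" &
    echo $! > "$pidfile"
    sleep 1
}
```

A caller might run `start_once /tmp/example.pid sleep 600`, where the pid file path and the command are placeholders.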
was (Author: eyang):
The PidFile class should be removed. The POSIX file lock interface only works
inside the same process, not across multiple instances of the program. An old
trick was to bind the lock to a port number as an indicator of whether more
than one instance of the program is running. However, this approach may not be
safe, because a third party could connect to the bound port and cause a race
condition as well. Hence, the Hadoop shell script approach is still the best solution:
{code}
if pid file exists, kill -0 to test program running.
if program is running
exit 1
else
start the program
record pid
sleep 1
{code}
> race condition in PidFile
> -------------------------
>
> Key: CHUKWA-743
> URL: https://issues.apache.org/jira/browse/CHUKWA-743
> Project: Chukwa
> Issue Type: Bug
> Reporter: Alan Snyder
>
> I believe there is a race condition in org.apache.hadoop.chukwa.util.PidFile.
> The problem is that the creation and deletion of the file are not protected by
> any lock. Client A can delete the file just before Client B tries to acquire
> a lock. If at that moment Client C tries to create the file, it will succeed.
> Client B and Client C will both succeed in acquiring a lock because there are
> two different files (one is hidden because it was deleted after being
> opened). I have tested similar code on OS X and this is what happened.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)