Sorry for sending the wrong email.

2014/01/21 15:17 "You Hoken" <[email protected]>:
> Hi,
>
> I am using ExecSource to run a resident shell program via the "rsh" command.
> The resident shell program is a simple program that runs "tail" on a log file
> located on the server (AIX) reached via "rsh".
>
> Flume: 1.3.1
> JDK: 1.6.0
> Linux executing Flume (ExecSource): SUSE Linux Enterprise Server 11 SP2
> AIX: V5.2
>
> In this case, when I stop Flume, it takes a very long time (about 6 hours) to
> stop ExecSource.
>
> The details are as follows.
> It took about 6 hours between (1) and (2).
> (1) INFO [node-shutdownHook] (org.apache.flume.source.ExecSource.stop:178)
> - Stopping exec source with command:rsh serverXXX sh YYY.sh
> (2) INFO [pool-4-thread-1]
> (org.apache.flume.source.ExecSource$ExecRunnable.run:307)
> - Command rsh serverXXX sh YYY.sh] exited with 0
>
> This happens every time.
> I guess the TCP keepalive setting in the OS (SUSE Linux) affects this
> situation, but I still don't know why it takes 6 hours to stop ExecSource.
>
> So, to find the cause, I debugged these processes, and the result is as
> follows.
> 1. ExecSource#stop:Process#destroy
> 2. ExecSource#stop:Process#waitFor (start waiting for response No.1)
> 3. ExecSource#run :Process#getErrorStream
> 4. ExecSource#run :Process#destroy
> 5. ExecSource#run :Process#waitFor (start waiting for response No.4)
> 6. ExecSource#run :Process#waitFor (end waiting for response No.4)
> 7. ExecSource#stop:Process#waitFor (end waiting for response No.1)
>
> You can see that the wait started at No.5 ends (No.6) before the wait
> started at No.2 ends (No.7).
> It seems to me that the thread safety (synchronized (process)) is not
> effective.
> Is this execution order correct?
> Do you think this execution order caused my problem?
>
> From debugging, I am now sure of the following.
> 1. Two threads (ExecSource#stop and ExecSource#run) are executing at the
> same time.
> 2. ExecSource#stop seems to wait for a response at Process#waitFor after
> java.lang.Process#destroy.
> 3. After Process#getErrorStream, ExecSource#run seems to wait for a response
> at Process#waitFor after java.lang.Process#destroy.
>
> Given the above, I am worried that if the external process writes to
> standard error after it has been destroyed, the buffer on the client side
> might overflow and cause a deadlock at Process#waitFor.
>
> So, I think that standard error had better be read in another thread
> before executing waitFor (after executing destroy in ExecSource#stop).
>
> What do you think?
>
> regards,
>
> YOU
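
To make the proposal at the end of the quoted mail a little more concrete, here is a minimal sketch of reading standard error in another thread before calling waitFor. It is not the actual ExecSource code. The class name StderrDrainSketch, the methods drain and run, and the sh test command are hypothetical names used only for illustration. It uses an anonymous Runnable rather than a lambda so it should also compile on an old JDK such as 1.6.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch, not Flume code: drain the child's output on separate
// threads so that Process#waitFor cannot block on a full pipe buffer.
final class StderrDrainSketch {

    /** Starts a daemon thread that copies bytes from in to sink until EOF or close. */
    static Thread drain(final InputStream in, final StringBuilder sink) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                byte[] buf = new byte[8192];
                try {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        synchronized (sink) {
                            // ISO-8859-1 maps one byte to one char.
                            sink.append(new String(buf, 0, n, "ISO-8859-1"));
                        }
                    }
                } catch (IOException e) {
                    // The stream was closed (for example by Process#destroy); stop reading.
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Starts the command, drains stdout and stderr, optionally destroys it, then waits. */
    static int run(String[] command, boolean destroyFirst, StringBuilder stderrSink) {
        try {
            Process process = new ProcessBuilder(command).start();
            // The readers are started BEFORE destroy/waitFor, as proposed above.
            Thread err = drain(process.getErrorStream(), stderrSink);
            Thread out = drain(process.getInputStream(), new StringBuilder());
            if (destroyFirst) {
                process.destroy();
            }
            int exitCode = process.waitFor();
            // Bounded joins: a grandchild holding the pipe open must not hang us.
            err.join(5000);
            out.join(5000);
            return exitCode;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder stderr = new StringBuilder();
        // Writes 200000 bytes to stderr, more than a typical pipe buffer holds.
        String[] cmd = {"sh", "-c", "head -c 200000 /dev/zero 1>&2; exit 3"};
        int code = run(cmd, false, stderr);
        System.out.println("exit=" + code + " stderrBytes=" + stderr.length());
    }
}
```

Whether this would help with the 6 hour delay is only a guess on my side. The sketch shows just the ordering (start the readers first, then destroy and waitFor), and the joins are bounded so that a child process still holding the pipe open cannot block the shutdown.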
