I've encountered a problem in using Runtime.exec() and the outputStream and inputStreams with Java on Linux. Apologies in advance for the very long message, but I've done a lot of analysis on this to narrow the problem, make it reproducible, try it under varying environments, etc. This message is probably best viewed using a monospaced font. Any help or clues that can be offered will be greatly appreciated.
The summary of the problem is that I am trying to start an external process using Runtime.exec() and then attach to this process's stdin, stdout and stderr using Process.getInputStream, Process.getOutputStream and Process.getErrorStream. I wish to keep the process alive, and to continually be able to shovel input into the stdin stream and to capture the output on stdout and stderr streams. The complication here is that one or more of the pipes seem to be closing prematurely causing the child process to go away and the master to begin throwing IOexceptions. I've included test code which will demonstrate the problem I'm describing. The test code consists of a single java source file (ProcessMaster.java) and a short Perl script (ProcessSlave.pl). This test code is at the very bottom of this message. The slave process (Perl script) simply does blocking line reads on stdin. It then checks for some special-case tokens (write an end marker, flush a stream, or exit), and acts on them if it sees them, otherwise it writes the line it saw back out on stdout and stderr with a leading tag to identify the stream. The master starts the slave process and gets the process's stdin, stdout and stderr streams. Then, it pumps a multi-line command into the slave's stdin. Using a pair of threads, we wait for output on both the stdout and stderr of the slave, and we continue reading output until either we see an end marker or hit the end of the stream, thereby exiting the thread. I will describe the configurations I've used and provide some detail into what I've discovered on the varying configurations in a bit. First, let me describe the error I'm seeing. The noticeable symptom of the problem is seeing an IOException thrown by the master (beware - line numbers may be skewed a couple from the test code): java.io.IOException: Bad file descriptor at java.io.FileOutputStream.writeBytes(Native Method) at java.io.FileOutputStream.write(FileOutputStream.java:212) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:72) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:130) at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:245) at java.io.BufferedWriter.flush(BufferedWriter.java:233) at ProcessMaster.runCommand(ProcessMaster.java:115) at ProcessMaster.main(ProcessMaster.java:217) I believe this to be an aftereffect - there's nothing to debug here: the pipe we had to the slave's stdin has gone away, so trying to flush a command we just wrote to the outputStream's BufferedWriter throws this exception. No surprises here. (Note - using IBM's java 1.3.0 implementation, the exception thrown is an Illegal seek rather than Bad File Descriptor. Still, it's thrown in the same place in the code). The reason the pipe's broken and I get the exception above is that the slave process has gone away! But why? To help with that, I've attached an strace to the slave process while the master is blocking for input asking for the delay in milliseconds that it should insert in some places. The slave strace shows me a couple scenarios, both of which would result in the slave exiting. The mystery that has me stumped is why either of these eventualities are happening in the first place. First case is the slave getting a null on stdin. Presumably this happens when the pipe between the slave's stdin and the master has gone away. Here's an strace that shows this: 16:04:24.858985 read(0, "Four score...\nwrite [ STDERR, \'{"..., 4096) = 104 16:04:26.272951 fstat64(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 16:04:26.273061 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x401e2000 16:04:26.273142 write(1, "SLAVE STDOUT> Four score...\n", 28) = 28 16:04:26.273396 write(2, "SLAVE STDERR> Four score...\n", 28) = 28 16:04:26.273592 write(2, "{END#1}\n", 8) = 8 16:04:26.273687 write(1, "{END#1}\n", 8) = 8 16:04:26.273774 read(0, "It was a dark and stormy night.\n"..., 4096) = 122 16:04:27.364158 write(1, "SLAVE STDOUT> It was a dark and "..., 46) = 46 16:04:27.364232 write(2, "SLAVE STDERR> It was a dark and "..., 46) = 46 16:04:27.364338 write(2, "{END#2}\n", 8) = 8 16:04:27.364396 write(1, "{END#2}\n", 8) = 8 16:04:27.364454 read(0, "", 4096) = 0 16:04:27.365131 rt_sigprocmask(SIG_SETMASK, [QUIT RT_0], NULL, 8) = 0 16:04:27.365397 munmap(0x401e2000, 4096) = 0 16:04:27.365448 _exit(0) Second case is the slave getting broken pipe(s) on stderr or stdout. The strace below shows only one attempt to write to a broken pipe. There may be several before it finally gives up: 16:19:46.750919 read(0, "Four score...\nwrite [ STDERR, \'{"..., 4096) = 104 16:19:48.337176 fstat64(0x1, 0xbffff710) = 0 16:19:48.337279 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40194000 16:19:48.337420 write(1, "SLAVE STDOUT> Four score...\n", 28) = 28 16:19:48.337782 write(2, "SLAVE STDERR> Four score...\n", 28) = 28 16:19:48.338022 write(2, "{END#1}\n", 8) = 8 16:19:48.338160 write(1, "{END#1}\n", 8) = 8 16:19:48.338298 read(0, "It was a dark and stormy night.\n"..., 4096) = 122 16:19:48.369457 write(1, "SLAVE STDOUT> It was a dark and "..., 46) = 46 16:19:48.369642 write(2, "SLAVE STDERR> It was a dark and "..., 46) = 46 16:19:48.370105 write(2, "{END#2}\n", 8) = -1 EPIPE (Broken pipe) 16:19:48.370196 --- SIGPIPE (Broken pipe) --- Here are the specific hardware/RedHat/kernel/Java vm/JIT combinations I've used: # Proc. RedHat Kernel Java JIT 1 uni RH6.2 2.2.14-5 Sun 1.2.2 nojit 2 Sun 1.3.1_01 hotspot 3 dual RH6.2 2.2.19-3hpc Sun 1.2.2 nojit 4 Sun 1.3.1_01 hotspot 5 Sun 1.3.1_01 classic(nojit) 6 Sun 1.4.0 B2 hotspot 7 uni RH7.1 2.4.3-12 Sun 1.2.2 nojit 8 Sun 1.3.1_01 hotspot 9 Sun 1.3.1_01 classic(nojit) 10 IBM 1.3.0 jitc 11 Blackdown 1.3.1 hotspot 12 Blackdown 1.3.1 classic(nojit) I will refer to the configurations in the table above by the number in the first column to (hopefully) be less confusing. The target configuration is 4, a dual Pentium processor box running an SMP kernel, ideally using Sun java 1.3.1 with the hotspot vm. The most successful configurations were 1, 7 and 9. I have yet to see these fail with the test case, however, they are not the target platform (at least hardware-wise). Configurations 3 and 5 were moderately successful, though they would either hang outright or throw the IOException detailed above, usually before 1000 iterations. Seems to hint that using a JIT is problematic, but it's not clear why. A problem with the JIT, or a race condition caused by the increased speed? Configurations 2, 4, 6, 8, 10, 11 and 12 simply did not work. They'd usually fail by the third or fourth iteration. Interesting to note here is that configuration 12 used the classic vm (nojit) of Blackdown's java 1.3.1, yet it would still fail early. Using the classic vm of Sun's 1.3.1 would get it to act at least as well as Sun java 1.2.2. I haven't tried the IBM java 1.3.0 without jitc - haven't yet determined how to turn off the JIT. Sun's java 1.4.0 beta 2 doesn't seem to offer a way to run without the hotspot JIT. An additional data point is that I ran the test code under Windows 2000 under Sun java 1.3.1_01 + hotspot with no errors. This would seem to suggest that the problem is some native code or at the OS level (or my code's interaction with either). The questions I have out of all this are: * Why do the pipes seem to be closing? * Is the additional speed of the JIT compilers what is exacerbating the problem when they're used? Remember, I've seen this problem under Sun java 1.2.2, and Sun java 1.3.1 with nojit, only it will be less frequent. * Is there a better way I should be doing this type of process control given the tools that java provides me? * Is there a problem in how I'm dealing with the process and streams, or is it a bug in Java or in Linux? * Sun java 1.3.0 had a problem noted where it failed on SMP kernels (The message I saw was posted to the Blackdown java/Linux board around August 2000. Do current versions of java still have a problem with SMP kernels? Below are two source files: They're set up to live in /tmp/JavaProcessTest, although this can be changed by editing the proc.startProcess line for non-Windows platforms near the top of main() in ProcessMaster.pl and recompiling. Run the class with a simple "java ProcessMaster", adding a classpath argument if necessary. You'll be prompted for a delay time in milliseconds. The delay will sleep between select steps. Also, while the ProcessMaster is blocking for input of the delay time is a great time to turn an strace onto the pid of the Perl process of the child. BEGIN ProcessSlave.pl --------------------------------------------------- #!/usr/bin/perl # Ensure that both STDOUT and STDERR autoflush my $oldfh = select STDOUT; $| = 1; select $oldfh; my $oldfh = select STDERR; $| = 1; select $oldfh; while ($line = <STDIN>) { if ($line =~ /^exit/) { last; } elsif ($line =~ /^write/) { $line =~ /\#(\d+)\}/; $cmdnum = $1; if ($line =~ /STDERR/) { print STDERR "{END#" . $cmdnum . "}\n"; } elsif ($line =~ /STDOUT/) { print STDOUT "{END#" . $cmdnum . "}\n"; } } elsif ($line =~ /^flush/) { if ($line =~ /STDERR/) { # STDERR already autoflushes } elsif ($line =~ /STDOUT/) { # STDOUT already autoflushes } } else { print STDOUT "SLAVE STDOUT> " . $line; print STDERR "SLAVE STDERR> " . $line; } } --------------------------------------------------- END ProcessSlave.pl BEGIN ProcessMaster.java --------------------------------------------------- import java.io.*; import java.util.*; /** * Wrapper class for running specified command and environment in * a subprocess. * */ public class ProcessMaster { // ----- package constuctors --------------------------------------------- /** * Constructs ProcessMaster. * */ public ProcessMaster() { } /** * Starts the Process * */ public synchronized void startProcess(String processCmdLine, String [] env) throws Exception { if( env != null) m_Process = Runtime.getRuntime().exec(processCmdLine,env); else m_Process = Runtime.getRuntime().exec(processCmdLine); // Writes to slave process's STDIN m_OutWriter = new BufferedWriter( new OutputStreamWriter( m_Process.getOutputStream())); // Reads from slave process's STDOUT and STDERR m_InReader = new BufferedReader( new InputStreamReader(m_Process.getInputStream()), 4096); m_ErrReader = new BufferedReader( new InputStreamReader(m_Process.getErrorStream()), 4096); } /** * Stops the Process */ public synchronized void stopProcess(String exitCmdLine) throws Exception { if(m_Process == null) throw new IllegalStateException("The Process is not running. Try starting the process before running the command."); try{ if(!exitCmdLine.endsWith("\n")) exitCmdLine += "\n"; m_OutWriter.write(exitCmdLine,0,exitCmdLine.length()); System.err.println("DEBUG: Exiting with command "+ exitCmdLine); m_OutWriter.flush(); m_InReader.close(); m_OutWriter.close(); m_ErrReader.close(); System.err.println("DEBUG: Waiting for process to terminate..."); m_Process.waitFor(); System.err.println("DEBUG: Process is terminated."); } finally{ m_Process.destroy(); System.gc(); } } /** * Runs the command as given by the input parameter string */ public synchronized void runCommand(String runCmdLine) throws Exception{ if(m_Process == null) throw new IllegalStateException("The Process is not running. Try starting the process before running the command."); // Make sure the command line terminates with a \n if(!runCmdLine.endsWith("\n")) runCmdLine += "\n"; // Initialize all Buffers m_OutputBuffer.setLength(0); m_ErrBuffer.setLength(0); String inputCommand = ""; m_Counter++; // Increment the counter m_commandInProgress = true; // set command in progress to false // Set Timed out Variable to false; m_isTimedOut = false; // Write the command inputCommand += runCmdLine; // Write the command to generate END Markers and flush inputCommand += "write [ STDERR, '{{" + cEndMarker + m_Counter +"}\\n'];\n"; inputCommand += "write [ STDOUT, '{{" + cEndMarker + m_Counter +"}\\n'];\n"; inputCommand += "flush STDERR;\nflush STDOUT;\n"; System.err.println("DEBUG: ProcessMaster.runCommand executing command: \n" + inputCommand + "\n"); try { if (sleepTime > 0) Thread.sleep(sleepTime); m_OutWriter.write(inputCommand,0,inputCommand.length()); System.err.println("DEBUG: Wrote command to Slave stdin, about to flush"); if (sleepTime > 0) Thread.sleep(sleepTime); m_OutWriter.flush(); System.err.println("DEBUG: flushed writer to Slave stdin"); if (sleepTime > 0) Thread.sleep(sleepTime); // Spawn new thread for reading from the Error Stream ProcessReader errReader = new ProcessReader(m_Counter,m_ErrReader,m_ErrBuffer); System.err.println("DEBUG: created errReader"); Thread errThread = new Thread(errReader); System.err.println("DEBUG: Created errThread"); errThread.start(); System.err.println("DEBUG: started errThread"); if (sleepTime > 0) Thread.sleep(sleepTime); // Spawn a new thread for reading from the Input Stream ProcessReader outReader = new ProcessReader(m_Counter,m_InReader, m_OutputBuffer); System.err.println("DEBUG: created outReader"); Thread outThread = new Thread(outReader); System.err.println("DEBUG: Created outThread"); outThread.start(); System.err.println("DEBUG: started outThread"); // Join the threads threadJoin(errThread,"errorReaderThread"); System.err.println("DEBUG: joined errThread"); threadJoin(outThread,"outReaderThread"); System.err.println("DEBUG: joined outThread"); // This will kill all other lagging threads m_commandInProgress = false; } catch(Exception exc) { System.err.println("Exception:" + exc); exc.printStackTrace(); System.exit(1); } } /** * Joins threads. */ private void threadJoin(Thread thread, String tag) { System.err.println("DEBUG: ProcessWrapper.threadJoin waiting for "+tag+" to terminate..."); long start = System.currentTimeMillis(); try { thread.join(cTimeoutInterval); } catch (InterruptedException exc) { System.err.println("Warning: ProcessWrapper.threadJoin "+tag+" got interrupted, how odd..."); } finally { long elapsed = System.currentTimeMillis() - start; if (elapsed > cTimeoutInterval / 2) { int elapsedSec = (int) (elapsed / 1000); System.err.println("Warning: ProcessWrapper.threadJoin timeout after "+elapsedSec+ " seconds for "+tag+", continuing..."); m_isTimedOut=true; } } } /** * Starts testing the Program */ public static void main(String [] args) throws Exception { try{ ProcessMaster proc = new ProcessMaster(); if (File.separatorChar == '\\') { // assume we're on Windows proc.startProcess("perl c:/ProcessSlave.pl",null); } else { // assume we're on not-Windows proc.startProcess("/tmp/JavaProcessTest/ProcessSlave.pl",null); } BufferedReader reader = new BufferedReader( new InputStreamReader(System.in)); System.out.print("Thread.sleep() delay in ms (0 for none)>"); String wait = reader.readLine().trim(); try { sleepTime = Integer.parseInt(wait); } catch(NumberFormatException nfe) { System.err.println("\nInvalid sleep delay. Defaulting to no delay."); sleepTime = 0; } while (true) { proc.runCommand("Four score and seven years ago..."); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); proc.runCommand("It was a dark and stormy night."); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); proc.runCommand("Far out in the uncharted backwaters..."); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); proc.runCommand("D'oh!"); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); proc.runCommand("Call me Ishmael."); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); proc.runCommand("In the beginning there was the void."); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); proc.runCommand("I've had quite enough of this."); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); System.err.println("-----------------------------------"); } proc.stopProcess("exit"); System.err.println("OUT:"+proc.getOutput()); System.err.println("ERR:"+proc.getError()); } catch(Exception exc) { exc.printStackTrace(); } } /** * Returns stdout of process, so far. */ public String getOutput() { synchronized (this) { return m_OutputBuffer.toString(); } } /** * Returns error. */ public String getError() { synchronized (this) { return m_ErrBuffer.toString(); } } /** * Returns the timed out status. */ public final boolean isTimedOut(){ return m_isTimedOut; } //------------------- Inner Class ------------------- /** * A class that reads the Stderr of the process. */ private class ProcessReader implements Runnable { /** * Holds the Counter's value. */ private int m_CounterCopy=0; // Holds the Counter's value /** * Holds the reader's reference. */ private BufferedReader m_Reader; // Holds the reader's reference /** * Holds the buffer reference. */ private StringBuffer m_Buffer; // Holds the buffer reference /** * Constructs the ProcessReader. */ public ProcessReader(int counter, BufferedReader reader, StringBuffer buffer) { this.m_CounterCopy = counter; this.m_Reader = reader; this.m_Buffer = buffer; } /** * The Run method of the thread */ public synchronized void run(){ String line=""; // Temporary var int occurenceOfEndMarker=0; while(m_CounterCopy==m_Counter && m_commandInProgress) { // Loop 'til command is in progress try{ line = m_Reader.readLine(); if(line == null) { // If null terminate the thread System.err.println("WHOA!: Read null, terminating thread"); break; } } catch( IOException exc ) { System.err.println("Error: IOException at ProcessReader.run()" + exc.getMessage()); break; } System.err.println("PIPE LINE:" + line); occurenceOfEndMarker = line.indexOf("{"+cEndMarker+m_CounterCopy+"}"); if(occurenceOfEndMarker != -1){ // Check for End Marker /* output lines may not be coming in new line, * so check if the line contains the end marker * and remove the end marker from being written to * the buffer */ line = line.substring(0,occurenceOfEndMarker); System.err.println("Saw EndMarker: {" + cEndMarker + m_CounterCopy + "}"); m_Buffer.append(line); break; } occurenceOfEndMarker = line.indexOf(cCompleteMarker); if(occurenceOfEndMarker != -1){ // Check for End Marker System.err.println("Saw cCompleteMarker: " + cCompleteMarker); m_Buffer.append(line); break; } // And the ordinary line gets written on the buffer. m_Buffer.append(line); } } } // ----- package member variables ---------------------------------------- /** * Process handle for executing process. */ private Process m_Process; /** * Buffered contents of stdout. */ private StringBuffer m_OutputBuffer = new StringBuffer(); /** * Buffered contents of stderr. */ private StringBuffer m_ErrBuffer = new StringBuffer(); /** * A final variable that stores the seconds for timeout for now 2 hours */ private final int cTimeoutInterval=2* 60 * 60*1000; /** * A variable that stores the counter value of the currently * executing command. */ private int m_Counter = 0; /** * A boolean variable that specifies if a command is currently in progress. */ private boolean m_commandInProgress; /** * A final variable that stores the BEGIN marker. */ private final String cBeginMarker="BEGIN#"; /** * A final variable that stores the BEGIN marker. */ final static String cCompleteMarker="COMPLETE"; /** * A final variable that stores the END marker. */ private final String cEndMarker ="END#"; /** * A final variable that stores the IO Reader of the Process. */ private BufferedReader m_InReader; /** * A final variable that stores the IO Writer of the Process. */ private BufferedWriter m_OutWriter; /** * A final variable that stores IO error reader of the Process. */ private BufferedReader m_ErrReader; /** * The boolean variable indiacates if a process command had timed-out. */ private boolean m_isTimedOut=false; private static int sleepTime = 0; } // end class --------------------------------------------------- END ProcessMaster.java ---------------------------------------------------------------------- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]