[Sorry if this is a duplicate message; I sent this message yesterday but wasn't subscribed to the list then, so I don't know if you got that one.]


Hello,

I'm a PhD student in Computer Science at Stanford University, evaluating a static race detection tool I've developed for Java, on open source programs. I ran the tool on Apache FTP Server's source code and found a bunch of races, see:

http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/

Since this is a static tool, the race reports aren't concrete test cases. Moreover, some will be false positives. I've inspected most of them myself and I'll summarize possible bugs below:

Bug #1:

There are race reports on field m_maxIdleTimeSec of class BaseUser and on field m_request of class RequestHandler, see for instance:

http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/2_trace.html http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/8_trace.html

These races indicate a possible bug. In particular, the server thread calls method newConnection(IConnection) of class ConnectionManagerImpl, which executes the code:

cons.add(connection);
...
BaseUser user = (BaseUser)connection.getRequest().getUser();
user.setMaxIdleTime(m_defaultIdleSec);

Notice that it sets m_maxIdleTimeSec *after* adding the connection to m_conList, which is incorrect because the timer thread may fire while the server thread is executing the "..." code above. The timer thread will sweep over m_conList, find this connection which was just added, and execute the following code:

FtpRequestImpl request = (FtpRequestImpl)con.getRequest();
if (request.isTimeout(currTime)) { *** }

where the isTimeout method calls getMaxIdleTime() (hence the race on field m_maxIdleTimeSec), which may return some value of m_maxIdleTimeSec other than m_defaultIdleSec which the server thread wanted to set after executing the "..." code above.

Likewise, the race on field m_request suggests that the server thread may get a null pointer exception because it executes getRequest() above and does not check if the value returned is non-null. In particular, the timer thread may close the connection that was just added to m_conList by the main server thread (note that the close() method of class RequestHandler sets field m_request of the connection to null, hence the race on field m_request).

Of course, the above scenarios may be *highly* improbable in practice, but since my tool is static, it just reports possible races -- not whether or how probable they are in actual runs. It's easy to fix the races merely by moving the code that adds the connection to m_consList in method newConnection(IConnection) to after the code that sets its max idle time.

Bug #2:

You might want to declare field m_isConnectionClosed of class RequestHandler 'volatile'. See for instance this race report:

http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/15_trace.html

In the absence of the volatile keyword, under the latest Java Memory Model, it is legal for the unsynchronized accesses to this field in the run() method of class RequestHandler to return a stale value (ex. from the local cache of a processor on a multi-processor machine). In particular, even if the timer thread sets this field to true for a connection it has closed, the connection thread executing the run() method may still see it's value as false. The way to fix this problem is to use synchronization in the run() method (see related Bug #3 below) or to declare the field m_isConnectionClosed as volatile. Declaring it volatile ensures that the latest value of the field will be read.

Bug #3:

There are several races on fields like m_request, m_reader, and m_writer, and m_controlSocket, see for instance:

http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/27_trace.html
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/24_trace.html
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/16_trace.html
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/19_trace.html

They occur because the run() method of class RequestHandler may be executed by a connection thread concurrently with the close() method of class RequestHandler by a timer thread. The run() method reads these fields without synchronization and dereferences them without checking for non-nullness, while the close() method sets them to null. Have you considered the possibility of a null pointer exception happening because of the above scenario? I ask because the run() method has a loop that repeatedly dereferences these fields. Perhaps you don't expect this scenario to ever occur, but if you do expect it to occur, then you need to put the body of the entire run() method or at least the body of the loop in it inside a block synchronized on 'this'.

Bug #4:

Finally, there are races on several fields of class FtpStatisticsImpl (ex. m_currConnections, m_totalConnections, m_currLogins, m_totalLogins, etc., see for instance:

http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/38_trace.html

There is no synchronization in this class although its fields may be incremented and decremented by different threads concurrently. Is this intentional, for performance reasons?

==

I would be happy if you can confirm/refute the above bugs. Any feedback you provide is valuable to me since I'm evaluating this tool on real-world programs.

Thanks!
-- Mayur

Reply via email to