Hi,
I took a thorough look at the race reports that my tool produced last
week, and found a new bug that I missed during my first inspection. It is
revealed by a race on field m_lastAccessTime of class FtpRequestImpl:
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/47_trace.html
Note that the timer thread reads this field while the connection thread
writes to it, and hence there is a race. The field should be declared
volatile so that, for instance, it's stale value is not read by the timer
thread (the timer thread calls timerTask() which calls isTimeOut(...) on
objects of FtpRequestImpl which read this field, while the connection
thread updates the last access time by calling updateLastAccessTime() in
the notifyObserver() method which is called in each iteration of the run()
method of class RequestHandler).
Also, thanks for your feedback on my previous bug reports!
-- Mayur
On Tue, 8 Nov 2005, Mayur Naik wrote:
[Sorry if this is a duplicate message; I sent this message yesterday but
wasn't subscribed to the list then, so I don't know if you got that one.]
Hello,
I'm a PhD student in Computer Science at Stanford University, evaluating a
static race detection tool I've developed for Java, on open source programs.
I ran the tool on Apache FTP Server's source code and found a bunch of races,
see:
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/
Since this is a static tool, the race reports aren't concrete test cases.
Moreover, some will be false positives. I've inspected most of them myself
and I'll summarize possible bugs below:
Bug #1:
There are race reports on field m_maxIdleTimeSec of class BaseUser and on
field m_request of class RequestHandler, see for instance:
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/2_trace.html
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/8_trace.html
These races indicate a possible bug. In particular, the server thread calls
method newConnection(IConnection) of class ConnectionManagerImpl, which
executes the code:
cons.add(connection);
...
BaseUser user = (BaseUser)connection.getRequest().getUser();
user.setMaxIdleTime(m_defaultIdleSec);
Notice that it sets m_maxIdleTimeSec *after* adding the connection to
m_conList, which is incorrect because the timer thread may fire while the
server thread is executing the "..." code above. The timer thread will sweep
over m_conList, find this connection which was just added, and execute the
following code:
FtpRequestImpl request = (FtpRequestImpl)con.getRequest();
if (request.isTimeout(currTime)) { *** }
where the isTimeout method calls getMaxIdleTime() (hence the race on field
m_maxIdleTimeSec), which may return some value of m_maxIdleTimeSec other than
m_defaultIdleSec which the server thread wanted to set after executing the
"..." code above.
Likewise, the race on field m_request suggests that the server thread may get
a null pointer exception because it executes getRequest() above and does not
check if the value returned is non-null. In particular, the timer thread may
close the connection that was just added to m_conList by the main server
thread (note that the close() method of class RequestHandler sets field
m_request of the connection to null, hence the race on field m_request).
Of course, the above scenarios may be *highly* improbable in practice, but
since my tool is static, it just reports possible races -- not whether or how
probable they are in actual runs. It's easy to fix the races merely by
moving the code that adds the connection to m_consList in method
newConnection(IConnection) to after the code that sets its max idle time.
Bug #2:
You might want to declare field m_isConnectionClosed of class RequestHandler
'volatile'. See for instance this race report:
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/15_trace.html
In the absence of the volatile keyword, under the latest Java Memory Model,
it is legal for the unsynchronized accesses to this field in the run() method
of class RequestHandler to return a stale value (ex. from the local cache of
a processor on a multi-processor machine). In particular, even if the timer
thread sets this field to true for a connection it has closed, the connection
thread executing the run() method may still see it's value as false. The way
to fix this problem is to use synchronization in the run() method (see
related Bug #3 below) or to declare the field m_isConnectionClosed as
volatile. Declaring it volatile ensures that the latest value of the field
will be read.
Bug #3:
There are several races on fields like m_request, m_reader, and m_writer, and
m_controlSocket, see for instance:
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/27_trace.html
http://suif.stanford.edu/~mhn/ftpserver-dev/target/classes/results/24_trace.html
http://suif.stanford.edu/~mhn/ftpserver-dev/target/c