[
https://issues.apache.org/jira/browse/DERBY-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721062#action_12721062
]
Kathey Marsden commented on DERBY-4053:
---------------------------------------
I noticed this happened on the IBM 10.4 run last night. The run had the trace
DRDA_InvalidReplyTooShort.S:Invalid reply from network server: Insufficient data
on a ping shortly before the hang. I looked briefly at how a ping could
return insufficient data.
A ping would go through this code in DRDAConnThread:
    private void sessionInitialState()
        throws Exception
    {
        // process NetworkServerControl commands - if it is not either valid
        // protocol let the DRDA error handling handle it
        if (reader.isCmd())
        {
            try {
                server.processCommands(reader, writer, session);
                // reset reader and writer
                reader.initialize(this, null);
                writer.reset(null);
                closeSession();
            } catch (Throwable t) {
                if (t instanceof InterruptedException)
                    throw (InterruptedException)t;
                else
                {
                    server.consoleExceptionPrintTrace(t);
                }
            }
In NetworkServerControlImpl, during shutdown we do the following, which might
interrupt the thread:
    synchronized (threadList)
    {
        //interupt any connection threads still active
        for (int i = 0; i < threadList.size(); i++)
        {
            final DRDAConnThread threadi =
                (DRDAConnThread)threadList.get(i);
            threadi.close();
            AccessController.doPrivileged(
                new PrivilegedAction() {
                    public Object run() {
                        threadi.interrupt();
                        return null;
                    }
                });
        }
        threadList.clear();
    }
It seems a rather abrupt way to shut down the threads. Would calling
threadi.close() be more appropriate so it could finish what it was doing? I am
also unclear about the state of the cleanup when such an interrupt occurs in
the middle of writing a response.
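
To make the concern concrete, here is a minimal sketch of the two shutdown styles, written under stated assumptions. Worker, its one-byte "header" and "OK" values, and shutdown() are hypothetical names invented for illustration; none of this is Derby's DRDAConnThread or NetworkServerControlImpl code. The worker writes a header byte, pauses, then writes an OK byte. A shutdown that only sets a flag lets the reply in progress finish, while an interrupt during the pause leaves the header without its OK byte.

```java
import java.io.ByteArrayOutputStream;

class GracefulShutdownSketch {

    /** Hypothetical stand-in for a connection thread; not Derby's DRDAConnThread. */
    static class Worker extends Thread {
        private volatile boolean closed = false;
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();
        private final long pauseMillis;

        Worker(long pauseMillis) { this.pauseMillis = pauseMillis; }

        /** Ask the worker to stop once the reply in progress is complete. */
        void close() { closed = true; }

        /** Bytes written so far; an odd count means a reply was cut short. */
        int bytesWritten() { return out.size(); }

        @Override
        public void run() {
            try {
                do {
                    out.write(0x01);           // made-up "reply header" byte
                    Thread.sleep(pauseMillis); // window in which a shutdown can arrive
                    out.write(0x00);           // made-up "OK" byte
                } while (!closed);
            } catch (InterruptedException e) {
                // Interrupted mid-reply: the header went out, the OK byte did not.
            }
        }
    }

    /**
     * Sets the close flag, waits up to graceMillis (must be greater than 0) for
     * the worker to finish, and only then falls back to interrupting it.
     * Returns true if the worker stopped without being interrupted.
     */
    static boolean shutdown(Worker w, long graceMillis) {
        w.close();
        try {
            w.join(graceMillis);
            if (!w.isAlive()) {
                return true;
            }
            w.interrupt();
            w.join();
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        Worker polite = new Worker(20);
        polite.start();
        System.out.println("clean=" + shutdown(polite, 5000)
                + " bytes=" + polite.bytesWritten());

        Worker abrupt = new Worker(60000);
        abrupt.start();
        System.out.println("clean=" + shutdown(abrupt, 10)
                + " bytes=" + abrupt.bytesWritten());
    }
}
```

In this sketch the grace period is the design choice under discussion: with a generous one the reply is always complete, and with a short one the interrupt lands between the two writes. Whether a grace period is acceptable in the real server shutdown path is a separate question.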
Tomorrow I will try putting a sleep between the reply header and the OK byte on
the ping response and try to shut down at the same time, to see if I can get
the InvalidReplyTooShort and then see if subsequent connection attempts hang.
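
For reference, this is roughly how a reader can notice a reply that stops after the header. It is only a sketch under assumptions: the four-byte header length, the single status byte, and the readReply name are hypothetical and do not describe the actual NetworkServerControl wire format or the Derby client code. The point is just that a fixed-length read hits end of stream when the writer stops early.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class ShortReplySketch {

    // Hypothetical header length; the real reply layout is not shown in this thread.
    static final int HEADER_LEN = 4;

    /**
     * Reads a header followed by one status byte from the given bytes.
     * Returns true if both were present, false if the data ended early.
     */
    static boolean readReply(byte[] wire) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        try {
            byte[] header = new byte[HEADER_LEN];
            in.readFully(header); // throws EOFException if fewer bytes are available
            in.readByte();        // status byte; throws EOFException if missing
            return true;
        } catch (EOFException e) {
            return false;         // the "insufficient data" case
        } catch (IOException e) {
            return false;         // not expected for an in-memory stream
        }
    }

    public static void main(String[] args) {
        System.out.println(readReply(new byte[] {1, 2, 3, 4, 0})); // header plus status
        System.out.println(readReply(new byte[] {1, 2, 3, 4}));    // header only
    }
}
```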
> suites.All hang with message java.net.BindException: Address already in use:
> NET_Bind in derby.log
> ---------------------------------------------------------------------------------------------------
>
> Key: DERBY-4053
> URL: https://issues.apache.org/jira/browse/DERBY-4053
> Project: Derby
> Issue Type: Bug
> Components: Network Server
> Affects Versions: 10.5.1.1
> Reporter: Kathey Marsden
> Attachments: derby-4053_repro_dont_commit_diff.txt, derby.log,
> javacore-20090420-1735.txt, javacore.20090211.123031.4000.0001.txt,
> suites.All.out
>
>
> Running suites.All with IBM 1.5 on 10.5.0.0 alpha - (743198) I got a hang
> in the test run. The last test to run successfully was
> xtestNestedSavepoints, but I am not sure exactly what test caused the hang.
> I took a thread dump which I will attach, which showed network server up and
> running but no ClientThread and a ping attempt blocked.
> This hang is very similar to the hang that was seen after the fix attempts
> for DERBY-1465 but that change was backed out so it is not related to that
> change. It could be that the change for DERBY-1465 just made this highly
> intermittent problem more likely.