[ 
https://issues.apache.org/jira/browse/DERBY-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721062#action_12721062
 ] 

Kathey Marsden commented on DERBY-4053:
---------------------------------------

I noticed this happened on the IBM 10.4 run last night. Shortly before the 
hang, a ping produced the trace 
DRDA_InvalidReplyTooShort.S:Invalid reply from network server: Insufficient data. 
I looked briefly at how a ping could return insufficient data.

A ping would go through this code in DRDAConnThread:
private void sessionInitialState()
                throws Exception
        {
                // process NetworkServerControl commands - if it is not either
                // valid protocol let the DRDA error handling handle it
                if (reader.isCmd())
                {
                        try {
                                server.processCommands(reader, writer, session);
                                // reset reader and writer
                                reader.initialize(this, null);
                                writer.reset(null);
                                closeSession();
                        } catch (Throwable t) {
                                if (t instanceof InterruptedException)
                                        throw (InterruptedException)t;
                                else
                                {
                                        server.consoleExceptionPrintTrace(t);
                                }
                        }

In NetworkServerControlImpl, during shutdown we do the following, which might 
interrupt the thread:
        synchronized (threadList)
        {
                // interrupt any connection threads still active
                for (int i = 0; i < threadList.size(); i++)
                {
                        final DRDAConnThread threadi =
                                (DRDAConnThread) threadList.get(i);

                        threadi.close();
                        AccessController.doPrivileged(
                                new PrivilegedAction() {
                                        public Object run() {
                                                threadi.interrupt();
                                                return null;
                                        }
                                });
                }
                threadList.clear();
        }

This seems a rather abrupt way to shut down the threads. Would calling 
threadi.close() be more appropriate, so each thread could finish what it was 
doing? I am also unclear about the state of the cleanup when such an interrupt 
occurs in the middle of writing a response.
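
To make the question concrete, here is a rough sketch of a less abrupt 
shutdown: ask each thread to close, give it a grace period to finish its 
current reply, and interrupt only the threads that are still alive afterwards. 
This is not Derby code. WorkerSketch, GracefulShutdownSketch, the grace period 
and the sleep that stands in for "writing a reply" are all made up for 
illustration, and I am assuming close() only sets a flag that the thread checks 
between replies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a connection thread; NOT Derby's DRDAConnThread. */
class WorkerSketch extends Thread {
    private volatile boolean closed = false;
    private final long workMillis;
    /** Set only if the thread left its loop without being interrupted. */
    final AtomicBoolean finishedCleanly = new AtomicBoolean(false);

    WorkerSketch(long workMillis) {
        this.workMillis = workMillis;
    }

    /** Assumption: close() just requests a stop after the current reply. */
    void close() {
        closed = true;
    }

    @Override
    public void run() {
        try {
            do {
                // Simulates writing one complete reply.
                Thread.sleep(workMillis);
            } while (!closed);
            finishedCleanly.set(true);
        } catch (InterruptedException e) {
            // Interrupted in the middle of a reply; the reply is abandoned.
        }
    }
}

class GracefulShutdownSketch {
    /**
     * Closes every worker, waits up to graceMillis (> 0) for each one in turn,
     * and interrupts only those still running. Returns how many workers had
     * to be interrupted.
     */
    static int shutdown(List<WorkerSketch> workers, long graceMillis) {
        for (WorkerSketch w : workers) {
            w.close();
        }
        int interrupted = 0;
        try {
            for (WorkerSketch w : workers) {
                w.join(graceMillis);
                if (w.isAlive()) {
                    w.interrupt();
                    w.join();
                    interrupted++;
                }
            }
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and stop waiting.
            Thread.currentThread().interrupt();
        }
        return interrupted;
    }

    public static void main(String[] args) {
        List<WorkerSketch> workers = new ArrayList<>();
        workers.add(new WorkerSketch(20));    // finishes within the grace period
        workers.add(new WorkerSketch(5000));  // still busy when the grace period ends
        for (WorkerSketch w : workers) {
            w.start();
        }
        System.out.println("interrupted: " + shutdown(workers, 300));
    }
}
```

The grace period is applied per thread, one after another, so the worst-case 
wait grows with the number of threads. A real change would probably want a 
single overall deadline instead.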

Tomorrow I will try putting a sleep between the reply header and the OK byte of 
the ping response and shutting down at the same time, to see if I can get the 
InvalidReplyTooShort and then see whether subsequent connection attempts hang.
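
The idea behind that experiment can be simulated outside Derby. In the sketch 
below, a "server" thread writes a reply header, sleeps, and then writes the OK 
byte; interrupting it during the sleep leaves a reply that is one byte short. 
Everything here is an assumption for illustration: the header bytes are 
invented and are not the real ping reply format, an in-memory buffer stands in 
for the socket, and TruncatedReplySketch is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;

class TruncatedReplySketch {
    // Made-up values; NOT the actual NetworkServerControl reply format.
    static final byte[] HEADER = {0x00, 0x01};
    static final byte OK = 0x00;

    /**
     * Runs one simulated ping reply. The server thread sleeps delayMillis
     * between the header and the OK byte. If interruptAfterMillis is >= 0,
     * the server thread is interrupted after that many milliseconds.
     * Returns the bytes that reached the "wire".
     */
    static byte[] replyWithInterrupt(long delayMillis, long interruptAfterMillis) {
        final ByteArrayOutputStream wire = new ByteArrayOutputStream();
        Thread server = new Thread(() -> {
            wire.write(HEADER, 0, HEADER.length);
            try {
                // The artificial delay from the planned experiment.
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Shutdown interrupted us mid-reply: the OK byte is never sent.
                return;
            }
            wire.write(OK);
        });
        server.start();
        try {
            if (interruptAfterMillis >= 0) {
                Thread.sleep(interruptAfterMillis);
                server.interrupt();
            }
            server.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return wire.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("interrupted reply length: "
                + replyWithInterrupt(2000, 100).length);
        System.out.println("complete reply length:    "
                + replyWithInterrupt(50, -1).length);
    }
}
```

This only shows how an interrupt between the two writes can produce a short 
reply. It says nothing about whether that is what causes the later hang, which 
is what the real experiment is meant to find out.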




> suites.All hang with message java.net.BindException: Address already in use: 
> NET_Bind in derby.log 
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4053
>                 URL: https://issues.apache.org/jira/browse/DERBY-4053
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions: 10.5.1.1
>            Reporter: Kathey Marsden
>         Attachments: derby-4053_repro_dont_commit_diff.txt, derby.log, 
> javacore-20090420-1735.txt, javacore.20090211.123031.4000.0001.txt, 
> suites.All.out
>
>
> Running suites.All with IBM 1.5 on 10.5.0.0 alpha - (743198), I got a hang 
> in the test run. The last test to run successfully was 
> xtestNestedSavepoints, but I am not sure exactly which test caused the hang. 
> I took a thread dump, which I will attach; it showed the network server up 
> and running but no ClientThread and a ping attempt blocked.
> This hang is very similar to the hang that was seen after the fix attempts 
> for DERBY-1465, but that change was backed out, so it is not related to that 
> change. It could be that the change for DERBY-1465 just made this highly 
> intermittent problem more likely.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
