[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643588#comment-13643588
 ] 

Sergey Maslyakov commented on ZOOKEEPER-1043:
---------------------------------------------

I observe very similar symptoms on a *3.4.5* ZooKeeper that runs on 
Solaris10/x86 and JRE 1.6.0_27-b07. Please see the log below. The explanation 
I found is the same as the one "J Ch" describes. First, 
Socket.setTcpNoDelay() throws an exception that is not properly handled, which 
results in the socket not being attached to a selector, and then a flood of 
log messages caused by an NPE follows.

I would say it is a critical problem, as I have not seen ZooKeeper leave this 
state on its own.

{code:none|title=Sample log}
2013-04-24 03:07:04,480 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.64.133.196:54055
2013-04-24 03:07:04,481 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12181:NIOServerCnxnFactory@220] - Ignoring exception
java.net.SocketException: Invalid argument
        at sun.nio.ch.Net.setIntOption0(Native Method)
        at sun.nio.ch.Net.setIntOption(Net.java:157)
        at sun.nio.ch.SocketChannelImpl$1.setInt(SocketChannelImpl.java:406)
        at sun.nio.ch.SocketOptsImpl.setBoolean(SocketOptsImpl.java:38)
        at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(SocketOptsImpl.java:284)
        at sun.nio.ch.OptionAdaptor.setTcpNoDelay(OptionAdaptor.java:48)
        at sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:268)
        at org.apache.zookeeper.server.NIOServerCnxn.<init>(NIOServerCnxn.java:107)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.createConnection(NIOServerCnxnFactory.java:161)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:202)
        at java.lang.Thread.run(Thread.java:619)
2013-04-24 03:07:04,482 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12181:NIOServerCnxnFactory@218] - Ignoring unexpected runtime exception
java.lang.NullPointerException
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:190)
        at java.lang.Thread.run(Thread.java:619)
{code}

The fix seems to be pretty simple; however, it is very difficult to reproduce 
the problem for testing purposes.
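
To illustrate the failure mode described above, here is a minimal sketch of 
what I believe is happening. It is not ZooKeeper code and not the attached 
patch. The class NullAttachmentSketch, the Cnxn stand-in and the failSetup / 
cleanup flags are all hypothetical names, and a java.nio Pipe is used instead 
of a real socket so that the example stays self-contained. The assumption is 
that the channel is registered with the selector before the connection object 
is constructed, so a constructor failure leaves a live key with no attachment. 
The cleanup branch shows one possible way to avoid that state.

{code:java|title=Sketch of the suspected failure mode (hypothetical names)}
import java.io.IOException;
import java.net.SocketException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class NullAttachmentSketch {

    /** Hypothetical stand-in for the per-connection object. */
    static final class Cnxn {
        Cnxn(boolean failSetup) throws IOException {
            if (failSetup) {
                // Simulates setTcpNoDelay() failing inside the constructor.
                throw new SocketException("Invalid argument");
            }
        }
    }

    /**
     * Registers the channel first and attaches the connection second.
     * If the constructor fails and cleanup is false, the key stays
     * registered with a null attachment.
     */
    static SelectionKey accept(Selector selector, Pipe.SourceChannel ch,
                               boolean failSetup, boolean cleanup)
            throws IOException {
        ch.configureBlocking(false);
        SelectionKey key = ch.register(selector, SelectionKey.OP_READ);
        try {
            key.attach(new Cnxn(failSetup));
        } catch (IOException e) {
            if (cleanup) {
                // Possible fix: do not leave a half-initialized key behind.
                key.cancel();
                ch.close();
            }
            // Without cleanup the exception is simply ignored.
        }
        return key;
    }

    /** Returns true if a readable key with a null attachment is selected. */
    static boolean hasOrphanedReadableKey(boolean cleanup) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                // Make data available before the channel is registered.
                pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
                accept(selector, pipe.source(), true, cleanup);
                selector.select(200);
                for (SelectionKey k : selector.selectedKeys()) {
                    if (k.isValid() && k.isReadable()
                            && k.attachment() == null) {
                        // Calling a method on the attachment here would
                        // throw an NPE on every pass of the select loop.
                        return true;
                    }
                }
                return false;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("without cleanup: " + hasOrphanedReadableKey(false));
        System.out.println("with cleanup:    " + hasOrphanedReadableKey(true));
    }
}
{code}

Because the failure is injected through a flag, the sketch does not depend on 
the platform-specific condition that makes setTcpNoDelay() fail, which is the 
part that is hard to reproduce. A null check on the attachment in the select 
loop would be another option, but cancelling the key at the point of failure 
also releases the channel.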
                
> Looped NPE at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> -------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1043
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1043
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.3.3
>         Environment: Sparc Solaris 10
> Java 6u17 64 bits
> 5 nodes ensemble
>            Reporter: César Álvarez Núñez
>         Attachments: ZOOKEEPER-1043.patch
>
>
> I'm sorry but I only have this log (which belongs to a "follower" node) and a 
> previous message [Unexpected NodeCreated event after a 
> reconnection.|http://mail-archives.apache.org/mod_mbox/zookeeper-user/201103.mbox/%[email protected]%3E]
>  where I describe a potential side-effect at client side.
> {noformat}
> 2011-04-04 09:31:09,608 - INFO  [Snapshot Thread:FileTxnSnapLog@208][] - 
> Snapshotting: 1700527e36
> 2011-04-04 09:31:09,653 - INFO  [SyncThread:1:FileTxnLog@197][] - Creating 
> new log file: log.1700527e38
> 2011-04-04 10:13:39,287 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@251][] - 
> Accepted socket connection from /XXX.XXX.XXX.69:1093
> 2011-04-04 10:13:39,371 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@777][] - Client 
> attempting to establish new session at /XXX.XXX.XXX.69:1093
> 2011-04-04 10:13:39,376 - INFO  [CommitProcessor:1:NIOServerCnxn@1580][] - 
> Established session 0x12ee79c4a720022 with negotiated timeout 20000 for 
> client /XXX.XXX.XXX.69:1093
> 2011-04-04 12:04:11,131 - INFO  [SyncThread:1:FileTxnLog@197][] - Creating 
> new log file: log.170053bf15
> 2011-04-04 12:04:11,131 - INFO  [Snapshot Thread:FileTxnSnapLog@208][] - 
> Snapshotting: 170053bf17
> 2011-04-04 12:13:10,779 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@251][] - 
> Accepted socket connection from /XXX.XXX.XXX.63:1817
> 2011-04-04 12:13:10,790 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@777][] - Client 
> attempting to establish new session at /XXX.XXX.XXX.63:1817
> 2011-04-04 12:13:10,794 - INFO  [CommitProcessor:1:NIOServerCnxn@1580][] - 
> Established session 0x12ee79c4a720023 with negotiated timeout 20000 for 
> client /XXX.XXX.XXX.63:1817
> 2011-04-04 12:13:10,814 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@634][] - 
> EndOfStreamException: Unable to read additional data from client sessionid 
> 0x12ee79c4a720023, likely client has closed socket
> 2011-04-04 12:13:10,816 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@1435][] - Closed 
> socket connection for client /XXX.XXX.XXX.63:1817 which had sessionid 
> 0x12ee79c4a720023
> 2011-04-04 12:13:10,839 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@251][] - 
> Accepted socket connection from /XXX.XXX.XXX.63:1814
> 2011-04-04 12:13:10,840 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@274][] - 
> Ignoring exception
> java.net.SocketException: Invalid argument
>         at sun.nio.ch.Net.setIntOption0(Native Method)
>         at sun.nio.ch.Net.setIntOption(Unknown Source)
>         at sun.nio.ch.SocketChannelImpl$1.setInt(Unknown Source)
>         at sun.nio.ch.SocketOptsImpl.setBoolean(Unknown Source)
>         at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(Unknown Source)
>         at sun.nio.ch.OptionAdaptor.setTcpNoDelay(Unknown Source)
>         at sun.nio.ch.SocketAdaptor.setTcpNoDelay(Unknown Source)
>         at 
> org.apache.zookeeper.server.NIOServerCnxn.<init>(NIOServerCnxn.java:1367)
>         at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.createConnection(NIOServerCnxn.java:215)
>         at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:256)
> 2011-04-04 12:13:10,841 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] - 
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> 2011-04-04 12:13:10,841 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] - 
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> 2011-04-04 12:13:10,842 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] - 
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> ...
> ...
> ...
> 2011-04-04 16:49:23,101 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] - 
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
