[
https://issues.apache.org/jira/browse/ZOOKEEPER-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643588#comment-13643588
]
Sergey Maslyakov commented on ZOOKEEPER-1043:
---------------------------------------------
I observe very similar symptoms on a *3.4.5* Zookeeper that runs on
Solaris10/x86 and JRE 1.6.0_27-b07. Please see the log below. The explanation
that I found is same as what "J Ch" is talking about. First,
Socket.setTcpNodelay() throws an exception that is not properly handled, which
results in not attaching a socket to a selector, and then a flood of log
messages caused by an NPE.
I would say it is a critical problem as I have not seen Zookeeper to leave this
state on its own.
{code:none|title=Sample log}
2013-04-24 03:07:04,480 [myid:] - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12181:NIOServerCnxnFactory@197] -
Accepted socket connection from /10.64.133.196:54055
2013-04-24 03:07:04,481 [myid:] - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12181:NIOServerCnxnFactory@220] -
Ignoring exception
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.setIntOption0(Native Method)
at sun.nio.ch.Net.setIntOption(Net.java:157)
at sun.nio.ch.SocketChannelImpl$1.setInt(SocketChannelImpl.java:406)
at sun.nio.ch.SocketOptsImpl.setBoolean(SocketOptsImpl.java:38)
at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(SocketOptsImpl.java:284)
at sun.nio.ch.OptionAdaptor.setTcpNoDelay(OptionAdaptor.java:48)
at sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:268)
at
org.apache.zookeeper.server.NIOServerCnxn.<init>(NIOServerCnxn.java:107)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.createConnection(NIOServerCnxnFactory.java:161)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:202)
at java.lang.Thread.run(Thread.java:619)
2013-04-24 03:07:04,482 [myid:] - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12181:NIOServerCnxnFactory@218] -
Ignoring unexpected runtime exception
java.lang.NullPointerException
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:190)
at java.lang.Thread.run(Thread.java:619)
{code}
The fix seems to be pretty simple, however, it is very difficult to reproduce
the problem for testing purposes.
> Looped NPE at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> -------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-1043
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1043
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.3.3
> Environment: Sparc Solaris 10
> Java 6u17 64 bits
> 5 nodes ensemble
> Reporter: César Álvarez Núñez
> Attachments: ZOOKEEPER-1043.patch
>
>
> I'm sorry but I only have this log (which belongs to a "follower" node) and a
> previous message [Unexpected NodeCreated event after a
> reconnection.|http://mail-archives.apache.org/mod_mbox/zookeeper-user/201103.mbox/%[email protected]%3E]
> where I describe a potential side-effect at client side.
> {noformat}
> 2011-04-04 09:31:09,608 - INFO [Snapshot Thread:FileTxnSnapLog@208][] -
> Snapshotting: 1700527e36
> 2011-04-04 09:31:09,653 - INFO [SyncThread:1:FileTxnLog@197][] - Creating
> new log file: log.1700527e38
> 2011-04-04 10:13:39,287 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@251][] -
> Accepted socket connection from /XXX.XXX.XXX.69:1093
> 2011-04-04 10:13:39,371 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@777][] - Client
> attempting to establish new session at /XXX.XXX.XXX.69:1093
> 2011-04-04 10:13:39,376 - INFO [CommitProcessor:1:NIOServerCnxn@1580][] -
> Established session 0x12ee79c4a720022 with negotiated timeout 20000 for
> client /XXX.XXX.XXX.69:1093
> 2011-04-04 12:04:11,131 - INFO [SyncThread:1:FileTxnLog@197][] - Creating
> new log file: log.170053bf15
> 2011-04-04 12:04:11,131 - INFO [Snapshot Thread:FileTxnSnapLog@208][] -
> Snapshotting: 170053bf17
> 2011-04-04 12:13:10,779 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@251][] -
> Accepted socket connection from /XXX.XXX.XXX.63:1817
> 2011-04-04 12:13:10,790 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@777][] - Client
> attempting to establish new session at /XXX.XXX.XXX.63:1817
> 2011-04-04 12:13:10,794 - INFO [CommitProcessor:1:NIOServerCnxn@1580][] -
> Established session 0x12ee79c4a720023 with negotiated timeout 20000 for
> client /XXX.XXX.XXX.63:1817
> 2011-04-04 12:13:10,814 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@634][] -
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x12ee79c4a720023, likely client has closed socket
> 2011-04-04 12:13:10,816 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn@1435][] - Closed
> socket connection for client /XXX.XXX.XXX.63:1817 which had sessionid
> 0x12ee79c4a720023
> 2011-04-04 12:13:10,839 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@251][] -
> Accepted socket connection from /XXX.XXX.XXX.63:1814
> 2011-04-04 12:13:10,840 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@274][] -
> Ignoring exception
> java.net.SocketException: Invalid argument
> at sun.nio.ch.Net.setIntOption0(Native Method)
> at sun.nio.ch.Net.setIntOption(Unknown Source)
> at sun.nio.ch.SocketChannelImpl$1.setInt(Unknown Source)
> at sun.nio.ch.SocketOptsImpl.setBoolean(Unknown Source)
> at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(Unknown Source)
> at sun.nio.ch.OptionAdaptor.setTcpNoDelay(Unknown Source)
> at sun.nio.ch.SocketAdaptor.setTcpNoDelay(Unknown Source)
> at
> org.apache.zookeeper.server.NIOServerCnxn.<init>(NIOServerCnxn.java:1367)
> at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.createConnection(NIOServerCnxn.java:215)
> at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:256)
> 2011-04-04 12:13:10,841 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] -
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
> at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> 2011-04-04 12:13:10,841 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] -
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
> at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> 2011-04-04 12:13:10,842 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] -
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
> at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> ...
> ...
> ...
> 2011-04-04 16:49:23,101 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2301:NIOServerCnxn$Factory@272][] -
> Ignoring unexpected runtime exception
> java.lang.NullPointerException
> at
> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:244)
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira