On 4/5/2018 3:44 AM, Andor Molnar wrote:
> You can get the current jute.maxbuffer setting from a running
> ZooKeeper instance by querying ZooKeeperServerBean via JMX.
I'm not sure how I would do that in a client program. It might be
trivial, but it's not something I've ever done.
> Currently there are 2 usages of the setting in ZK: 1) server-client
> communication, which is by default 4MB, and 2) server-server
> communication, which is by default 1MB. They can't be set individually,
> but can be overridden with the jute.maxbuffer system property.
I'm looking for a way to ask the ZK client to give me the value it is
currently using as its max packet length. I'm only going to be logging
a warning to inform the user about which file may have caused a problem
due to size, not preventing the attempt at uploading the file, so I'm
not opposed to falling back to a hard-coded value if I can't figure it
out. I can look for the jute.maxbuffer sysprop, but if ZK will tell me
what it's actually using, I'd prefer that.
Does the max packet length cover ONLY the size of the znode data, or
does the znode name get included in that? Asked another way: Should I
subtract a little bit from the max packet length (maybe 128 or 256)
before I compare the file size, or just compare the unchanged value?
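For illustration, here is a rough sketch of the check I have in mind. It assumes the client limit can be approximated by reading the jute.maxbuffer sysprop, with a fallback to the 4MB client default mentioned above. The class name, method names, and the 256-byte headroom are all hypothetical, and whether any headroom is needed at all is exactly the open question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MaxPacketCheck {
    // Assumed fallback: the 4MB client default mentioned in this thread.
    static final int FALLBACK_MAX_PACKET = 4096 * 1024;
    // Hypothetical headroom for the znode name and request framing.
    static final int HEADROOM = 256;

    /** Reads the jute.maxbuffer sysprop; returns the fallback if it is unset or not a number. */
    static int maxPacketLength() {
        return Integer.getInteger("jute.maxbuffer", FALLBACK_MAX_PACKET);
    }

    /** True if dataSize is larger than the limit minus the headroom. */
    static boolean mayExceedLimit(long dataSize, int maxPacket, int headroom) {
        return dataSize > (long) maxPacket - headroom;
    }

    /** Logs a warning only; it does not prevent the upload attempt. */
    static void warnIfTooLarge(Path file) throws IOException {
        long size = Files.size(file);
        int limit = maxPacketLength();
        if (mayExceedLimit(size, limit, HEADROOM)) {
            System.err.printf(
                "WARN: %s is %d bytes, which may exceed the ZK max packet length (%d)%n",
                file, size, limit);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            warnIfTooLarge(Paths.get(arg));
        }
    }
}
```

This only sees the sysprop in the client JVM, so it says nothing about what the server is configured to accept.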
I did discover that the ZkClientConfig.CLIENT_MAX_PACKET_LENGTH_DEFAULT
field I mentioned before is not available in 3.4.x; it seems to have
been added in a 3.5 version. Since Solr uses 3.4.x and won't upgrade
until there is a stable 3.5 release, I can't use that.
I do think that the ZK client should log something useful when the max
packet length is exceeded -- if that's even possible. The user in this
scenario is running the latest version of Solr that was available at the
time, which includes ZK 3.4.10 for its client. The error message
indicated socket problems, but didn't have any information about the cause.
When running under Java 9, they got this as the error:
WARN - 2018-04-04 09:05:28.194;
org.apache.zookeeper.ClientCnxn$SendThread; Session 0x100244e8ffb0004
for server localhost/127.0.0.1:2181, unexpected error, closing socket
connection and attempting reconnect java.io.IOException: Connection
reset by peer
With Java 8, they got this:
WARN - 2018-04-04 09:10:11.879;
org.apache.zookeeper.ClientCnxn$SendThread; Session 0x10024db7e280002
for server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing
socket connection and attempting reconnect java.io.IOException: Protocol
wrong type for socket
In both cases, the stack trace listed a bunch of sun classes and then a
couple of methods in ZooKeeper's ClientCnxnSocketNIO class.
When I asked them what their ZK server log said, that's when I figured
out the problem:
2018-04-04 14:06:01,361 [myid:] - WARN [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception causing close of
session 0x10024db7e280006: Len error 5327937
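The number after "Len error" is what gave it away: 5327937 bytes is larger than 4MB (4194304 bytes). As a minimal sketch, assuming the server log line keeps the "Len error N" shape shown above, the value could be pulled out and compared like this. The class and method names are hypothetical.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LenErrorParser {
    // Matches the "Len error <digits>" fragment seen in the server log above.
    private static final Pattern LEN_ERROR = Pattern.compile("Len error (\\d{1,18})");

    /** Returns the rejected length from a log line, or empty if the line has no Len error. */
    static OptionalLong rejectedLength(String logLine) {
        Matcher m = LEN_ERROR.matcher(logLine);
        return m.find()
            ? OptionalLong.of(Long.parseLong(m.group(1)))
            : OptionalLong.empty();
    }

    public static void main(String[] args) {
        String line = "Exception causing close of session 0x10024db7e280006: Len error 5327937";
        long fourMb = 4096L * 1024L;
        rejectedLength(line).ifPresent(len ->
            System.out.println(len + " bytes, over 4MB: " + (len > fourMb)));
    }
}
```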
> Do I understand correctly that Solr uploads files to ZooKeeper?
Solr *itself* won't typically be uploading data to ZK that can exceed
the max packet size. That is typically done either with a separate
command-line program (the ZkCLI class that program uses is included in
Solr), or by a client program using the SolrJ library (which is part of
Solr like ZkCLI, but usable by itself). The action being performed is
an upload of a configuration for a Solr index.
Solr does sometimes run into the problem described in ZOOKEEPER-1162,
but this is due to the number of children in a znode, where each one has
minimal data.
Thanks,
Shawn