Let's be clear from the start: storing large data objects in ZooKeeper is strongly discouraged. If you want to store large objects with a good consistency model, store the data somewhere else (such as a distributed file system or key-value store), commit the data, and then use ZK to hold a reference to that data. ZooKeeper is intended for coordination, not data storage. It is not a reasonable alternative to a NoSQL database.
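One minimal sketch of that reference-passing pattern, with an in-memory dict standing in for the external blob store (a real deployment would use HDFS/S3/etc. and would write the reference znode through a ZooKeeper client such as kazoo; the helper names here are illustrative, not any library's API):

```python
import hashlib
import json

# Hypothetical stand-in for an external blob store (HDFS, S3, a
# key-value store, ...); real code would use that system's client.
blob_store = {}

def put_blob(data: bytes) -> str:
    """Commit the large payload to the external store and return a key."""
    key = hashlib.sha256(data).hexdigest()
    blob_store[key] = data
    return key

def make_znode_payload(key: str) -> bytes:
    """Build the small reference record that would actually be written
    to ZooKeeper. It stays far below the default 1 MB jute.maxbuffer
    regardless of how large the underlying data is."""
    return json.dumps({"store": "blob", "key": key}).encode()

large_data = b"x" * (5 * 1024 * 1024)   # 5 MB: far too big for a znode
payload = make_znode_payload(put_blob(large_data))

assert len(payload) < 1024              # the znode holds only a pointer
assert blob_store[json.loads(payload)["key"]] == large_data
```

The point is that ZK only ever sees the small, fixed-size reference; the commit to the blob store happens first, so a reader that follows the reference always finds the data.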
That said, good and informative error messages are always useful and are better than anonymous failures. Even if the connection has to be closed as a result of the error, it would be nice to give some decent feedback.

On the other hand, the error you are seeing sounds like your client and your server have inconsistent settings for the maximum jute buffer size. If the client has a larger setting than the server, the server will run out of buffer space before reading the entire request. To the server, this looks like a network error, and there is little the server can do to recover the connection safely because some bytes may already have been lost due to the short read. As such, closing the connection is pretty much all that can be done. If the buffer lengths on client and server match and the client attempts an over-long write, I believe the write will fail on the client side with a much more descriptive message.

One thing that could plausibly be done would be to enhance the initial handshake between client and server so that mismatched buffer sizes are detected more aggressively. Since a length can be exchanged in a fixed-size field, this could be done while keeping the connection healthy, which would in turn allow a useful error message that points to the true source of the error (i.e., the configuration). This would, however, require a protocol change, which is always a sensitive matter.

Can you determine whether the root cause of your problem is inconsistent settings between client and server?

On Thu, Jan 7, 2021 at 10:27 PM Huizhi Lu <h...@apache.org> wrote:

> Hi ZK Experts,
>
> I would like to ask a quick question. As we know, assuming we are using
> the default 1 MB jute.maxbuffer, if a zk client tries to write a znode
> larger than 1 MB, the server will fail it. The server will log "Len error"
> and close the connection. The client will receive a connection loss. In a
> third-party ZkClient lib (e.g. I0IZkClient), it'll keep retrying the
> operation upon connection loss. And this forever retrying might have a
> chance to take down the zk server.
>
> I believe the zk community must have considered such a situation. I
> wonder why the zk server does not handle the error a bit better and send
> a clearer response to the client, e.g. KeeperException.PacketLenError
> (the zk server does not really have to close the connection), so the
> client knows the error is non-retryable. I think there must be some
> reasons I am not aware of that zk does not offer it, so I'd like to
> ask here. Or is there any ticket/email thread that has discussed this?
>
> Maybe zk would expect the app client to handle connection loss
> appropriately, e.g. by having a retry strategy (backoff retry, limiting
> the retry count, etc.). Is this what zk would expect, instead of
> returning a PacketLenError exception?
>
> Really appreciate any input.
>
>
> Best,
> -Huizhi
>
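P.S. The client-side policy suggested at the end of the quoted mail (backed-off retries with a cap, rather than retrying forever on connection loss) can be sketched as below; the function and exception names are illustrative, not part of any ZK client API:

```python
import time

class NonRetryableError(Exception):
    """Errors the application decides never to retry
    (e.g. a write it knows is over the size limit)."""

def with_retries(op, max_attempts=5, base_delay=0.01):
    """Run op(), retrying transient failures with capped exponential
    backoff instead of looping forever on connection loss."""
    for attempt in range(max_attempts):
        try:
            return op()
        except NonRetryableError:
            raise                      # give up immediately
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise                  # retry budget exhausted
            time.sleep(base_delay * (2 ** attempt))

# A flaky operation that succeeds on its third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection loss")
    return "ok"

assert with_retries(flaky) == "ok"
assert calls["n"] == 3                 # two failures, then success
```

With a scheme like this, a persistent "Len error" disconnect exhausts the retry budget and surfaces to the caller instead of hammering the server indefinitely.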