Let's be clear from the start: storing large data objects in ZooKeeper is
strongly discouraged. If you want to store large objects with good
consistency guarantees, store the data in something else (like a
distributed file system or a key-value store), commit the data, and then
use ZK to hold a reference to that data. ZooKeeper is intended for
coordination, not data storage. It is not a reasonable alternative to a
NoSQL database.
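For what it's worth, the pointer pattern looks roughly like this (a
minimal sketch in Java; the BlobStore interface and the paths are made up
for illustration, only the ZooKeeper calls are real API):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    import java.nio.charset.StandardCharsets;

    public class BlobPointerExample {
        // Hypothetical external store; stands in for HDFS, S3, a
        // key-value store, etc.
        interface BlobStore {
            String put(byte[] data) throws Exception;  // returns a durable key or URI
        }

        static String publish(ZooKeeper zk, BlobStore store, byte[] largePayload)
                throws Exception {
            // 1. Commit the large object to the external store first.
            String uri = store.put(largePayload);

            // 2. Keep only the small reference (well under jute.maxbuffer) in ZK.
            return zk.create("/app/blobs/pointer-",
                    uri.getBytes(StandardCharsets.UTF_8),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE,
                    CreateMode.PERSISTENT_SEQUENTIAL);
        }
    }

Readers then watch or list /app/blobs and fetch the real payload from the
external store.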

That said, good and informative error messages are always useful and are
better than anonymous failures. Even if the connection has to be closed
because of the error, it would be nice to give some decent feedback.

On the other hand, the error you are seeing sounds like a case where your
client and your server have inconsistent settings for the maximum jute
buffer size. If the client has a larger setting than the server, the
server will run out of buffer space before reading the entire request. To
the server this looks like a network error, and there is little the server
can do to recover the connection safely because some bytes may already
have been lost in the short read. As such, closing the connection is
pretty much all that can be done.
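As a sanity check, it is worth confirming that both JVMs are started with
the same value. On the client side jute.maxbuffer is just a system
property; on the server side it is typically passed via the server's JVM
flags (a sketch; the connect string and value are placeholders):

    import org.apache.zookeeper.ZooKeeper;

    public class BufferConfigCheck {
        public static void main(String[] args) throws Exception {
            // jute.maxbuffer is read as a plain JVM system property on the
            // client, so set it before the client is created (or pass
            // -Djute.maxbuffer=... on the command line). The server must see
            // the same value in its own JVM, e.g. through SERVER_JVMFLAGS in
            // zookeeper-env.sh.
            System.setProperty("jute.maxbuffer", String.valueOf(1024 * 1024)); // 1 MB default

            ZooKeeper zk = new ZooKeeper("localhost:2181", 30_000, event -> { });
            // ... use the client ...
            zk.close();
        }
    }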

If the buffer lengths on client and server match and the client attempts
an over-long write, I believe the write will fail on the client side with
a much more descriptive message.
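For completeness, this is roughly what that looks like from application
code (a sketch; the path and payload are made up, and the exact exception
depends on the client version and configuration):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class LargeWriteExample {
        static void tryLargeWrite(ZooKeeper zk) throws InterruptedException {
            byte[] tooBig = new byte[2 * 1024 * 1024];  // 2 MB, over the 1 MB default
            try {
                zk.create("/app/too-big", tooBig,
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            } catch (KeeperException e) {
                // With matching buffer settings this should surface as a
                // descriptive local error; with a mismatch all you get is
                // CONNECTIONLOSS.
                System.err.println("create failed: " + e.code());
            }
        }
    }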

One thing that could plausibly be done would be to enhance the initial
handshake between client and server so that a mismatch in buffer sizes is
detected more aggressively. Since a length can be exchanged in a
fixed-size field, this could be done while keeping the connection healthy,
which would, in turn, allow a useful error message that points to the true
source of the error (i.e. the configuration). This would, however, require
a protocol change, which is always a sensitive matter.
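Just to illustrate the idea (purely hypothetical, not anything the current
wire protocol does): because the advertised limit itself fits in a fixed
four-byte field, each side can read it safely no matter what the other
side is configured to, and complain while the connection is still healthy.

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;

    // Illustrative sketch of a handshake extension; not part of ZooKeeper today.
    final class BufferSizeHandshake {
        static void advertise(DataOutputStream out, int localMaxBuffer) throws IOException {
            out.writeInt(localMaxBuffer);        // fixed 4 bytes, always safe to send
            out.flush();
        }

        static void check(DataInputStream in, int localMaxBuffer) throws IOException {
            int remoteMaxBuffer = in.readInt();  // fixed 4 bytes, always safe to read
            if (remoteMaxBuffer != localMaxBuffer) {
                // The connection is still healthy here, so the error can name
                // the real problem (configuration) instead of looking like a
                // network failure.
                throw new IOException("jute.maxbuffer mismatch: local=" + localMaxBuffer
                        + ", remote=" + remoteMaxBuffer);
            }
        }
    }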

Can you determine if the root cause of your problem is inconsistent
settings between client and server?



On Thu, Jan 7, 2021 at 10:27 PM Huizhi Lu <h...@apache.org> wrote:

> Hi ZK Experts,
>
> I would like to ask a quick question. As we know, assume we are using
> the default 1 MB jute.maxbuffer, if a zk client tries to write a large
> znode > 1MB, the server will fail it. Server will log "Len error" and
> close the connection. The client will receive a connection loss. In a
> third party ZkClient lib (e.g. I0Itec's ZkClient), it'll keep retrying the
> operation upon connection loss. And this forever retrying might have a
> chance to take down the zk server.
>
> I believe the zk community must have considered such a situation. I
> wonder why zk server does not handle the error a bit better and send a
> clearer response to the client, eg. KeeperException.PacketLenError
> (and zk server does not really have to close the connection), so the
> client knows the error is non retryable. I think there must be some
> reasons I am not aware of that zk does not offer it, so I'd like to
> ask here. Or is there any ticket/email thread that has discussed this?
>
> Maybe zk would expect the app client to handle connection loss
> appropriately, e.g. by having a retry strategy (backoff retry, limiting
> the retry count, etc.). Is this what zk would expect, instead of
> returning a PacketLenError exception?
>
> Really appreciate any input.
>
>
> Best,
> -Huizhi
>
