Jean-Daniel Cryans created KUDU-1868:
----------------------------------------
Summary: Java client mishandles socket read timeouts for scans
Key: KUDU-1868
URL: https://issues.apache.org/jira/browse/KUDU-1868
Project: Kudu
Issue Type: Bug
Components: client
Affects Versions: 1.2.0
Reporter: Jean-Daniel Cryans
Scan calls from the Java client that take more than the socket read timeout get
retried (unless the operation timeout has expired) instead of being killed.
Users will see this:
{code}
org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID in
scan request
{code}
Note that the right behavior here would still end up killing the scanner, so
this is really a problem the user has to deal with! It's usually caused by slow
IO, combined with very selective scans.
Workaround: set defaultSocketReadTimeoutMs higher, ideally equal to
defaultOperationTimeoutMs (the defaults are 10 and 30 seconds respectively).
But really the user should investigate why the individual scans are so slow.
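A minimal sketch of the workaround, assuming the client is built through
KuduClient.KuduClientBuilder and that the kudu-client artifact is on the
classpath. The master address is a placeholder, and 30 seconds is used only
because it matches the default operation timeout mentioned above:

```java
import org.apache.kudu.client.KuduClient;

public class ScanTimeoutWorkaround {
    public static void main(String[] args) throws Exception {
        // Placeholder address (assumption); substitute your own master(s).
        String masters = "kudu-master.example.com:7051";

        // Use the same value for both timeouts so that a slow scan RPC is
        // not retried before the operation timeout expires.
        long timeoutMs = 30_000L;

        try (KuduClient client = new KuduClient.KuduClientBuilder(masters)
                .defaultOperationTimeoutMs(timeoutMs)
                .defaultSocketReadTimeoutMs(timeoutMs)
                .build()) {
            // Open tables and run scans with this client as usual.
        }
    }
}
```

This only hides the symptom; a scan RPC that takes longer than the operation
timeout will still fail.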
One potentially easy fix to this is to handle retries differently for scanners
so that the user gets a nicer exception. A harder fix is to handle socket read
timeouts completely differently: the timeout should be per-RPC and not per
TabletClient like it is right now.
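To illustrate the easy fix, here is a rough, self-contained sketch of the
decision it implies. None of these names exist in the Kudu client; RpcKind,
Action and onSocketReadTimeout are hypothetical and only model the idea that a
scan RPC that hits the socket read timeout fails with a clear message instead
of being retried:

```java
public class ScanTimeoutPolicy {
    /** Hypothetical RPC kinds; not the Kudu client's actual types. */
    public enum RpcKind { SCAN, OTHER }

    /** Hypothetical outcomes after a socket read timeout fires. */
    public enum Action { RETRY, FAIL_SCAN, FAIL_OPERATION_TIMED_OUT }

    /**
     * Decide what to do when an RPC hits the socket read timeout.
     * Scans are not retried, since the retry is what produces the
     * "Invalid call sequence ID in scan request" error described above.
     */
    public static Action onSocketReadTimeout(RpcKind kind,
                                             long elapsedMs,
                                             long operationTimeoutMs) {
        if (elapsedMs >= operationTimeoutMs) {
            return Action.FAIL_OPERATION_TIMED_OUT;
        }
        if (kind == RpcKind.SCAN) {
            return Action.FAIL_SCAN;
        }
        return Action.RETRY;
    }

    /** Build the kind of message a user could act on (wording is illustrative). */
    public static String describe(Action action, long socketReadTimeoutMs) {
        switch (action) {
            case FAIL_SCAN:
                return "Scan RPC exceeded the socket read timeout of "
                        + socketReadTimeoutMs + " ms; the scanner was closed";
            case FAIL_OPERATION_TIMED_OUT:
                return "Operation timeout expired";
            default:
                return "Retrying RPC";
        }
    }

    public static void main(String[] args) {
        // A scan that hits the 10 s socket read timeout within a 30 s operation timeout.
        Action action = onSocketReadTimeout(RpcKind.SCAN, 10_000L, 30_000L);
        System.out.println(describe(action, 10_000L));
    }
}
```

The harder per-RPC fix would change where the timer lives rather than this
decision, so it is not modeled here.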