[
https://issues.apache.org/jira/browse/CASSANDRA-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205957#comment-15205957
]
Stefania commented on CASSANDRA-11320:
--------------------------------------
The patch is ready; I am just waiting for the results of some more tests.
There are 3 new options:
{code}
MAXINFLIGHTMESSAGES=512 - the maximum number of messages not yet
                          acknowledged by a replica before the
                          back-off policy in worker processes kicks in
MAXBACKOFFATTEMPTS=32   - the maximum number of back-off attempts in
                          worker processes. During each attempt,
                          if no replica with fewer than
                          MAXINFLIGHTMESSAGES pending is found, the
                          worker process pauses for an amount of time
                          drawn at random between
                          1 and 2^num-attempts seconds
MAXPENDINGCHUNKS=24     - the maximum number of chunks not yet read
                          by a worker process; once this number
                          is reached, no new chunks are sent from the
                          feeding process to the worker process
{code}
The default values should be reasonable and users should not need to change
them.
If no replica has fewer than MAXINFLIGHTMESSAGES messages in progress, a
back-off policy is applied in the worker process main thread, which stops
sending messages until at least one replica drops below the limit. The pause
grows exponentially with each attempt. If no replica is available after
MAXBACKOFFATTEMPTS attempts, a {{NoHostAvailable}} exception is raised. The old
back-off policy is removed; on timeouts we now retry as for any other server
error, since the back-off is performed for all messages.
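The worker-side back-off described above could be sketched roughly as follows.
This is not the code from the patch: {{pick_replica}}, {{NoReplicaAvailable}}
and the {{in_flight}} callback are hypothetical names, attempts are assumed to
be counted from 1, and choosing the least-loaded replica is my own assumption.

```python
import random
import time


class NoReplicaAvailable(Exception):
    """Hypothetical stand-in for the driver's NoHostAvailable exception."""


def pick_replica(in_flight, max_in_flight=512, max_attempts=32,
                 sleep=time.sleep, rng=random.uniform):
    """Return a replica with fewer than max_in_flight unacknowledged messages.

    in_flight is a callable returning {replica: in-flight count}. It is
    re-read on every attempt because acknowledgements are assumed to arrive
    in another thread while this one is paused.
    """
    for attempt in range(1, max_attempts + 1):
        counts = in_flight()
        candidates = [r for r, n in counts.items() if n < max_in_flight]
        if candidates:
            # Assumption: prefer the least-loaded eligible replica.
            return min(candidates, key=counts.get)
        # Pause drawn at random between 1 and 2^attempt seconds.
        sleep(rng(1, 2 ** attempt))
    raise NoReplicaAvailable(
        "no replica below %d in-flight messages after %d attempts"
        % (max_in_flight, max_attempts))
```

The {{sleep}} and {{rng}} parameters exist only so that the pauses can be
replaced in a test; with the defaults the function really sleeps.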
The feeding process now has a thread for each worker process that sends chunks
asynchronously. Once a worker process has MAXPENDINGCHUNKS chunks pending, no
more chunks are sent to it. If all worker processes have reached
MAXPENDINGCHUNKS, the feeding process sleeps for an amount of time that grows
exponentially.
The new thread is introduced in {{OneWayChannel}} and replaces the thread
introduced by the second patch of CASSANDRA-11053. Generally speaking, it is
safer to write into a pipe from a separate thread, because the send blocks if
the pipe is full. On Windows there doesn't seem to be an API to determine
whether a send will block, other than using inter-process synchronization, and
I've verified that this is much slower than introducing threads. The
performance cost of these threads is on the order of 3-4k rows per second: from
47k rows per second to 44k rows per second on my laptop, when importing 2M
entries generated with a standard stress write.
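A minimal sketch of the feeder side, to illustrate the idea rather than the
actual {{OneWayChannel}} code: {{ChunkChannel}} and {{feed}} are hypothetical
names, a {{queue.Queue}} stands in for the real pipe, a threading semaphore
stands in for an inter-process one, and the base pause of 1 ms is an assumption.

```python
import queue
import threading
import time


class ChunkChannel:
    """Hypothetical one-way channel from the feeder to a single worker."""

    def __init__(self, max_pending_chunks=24):
        # Counts chunks sent to the worker but not yet read by it.
        self.pending = threading.BoundedSemaphore(max_pending_chunks)
        self.outbox = queue.Queue()  # chunks waiting for the sender thread
        self.pipe = queue.Queue()    # stand-in for the real pipe
        self.sender = threading.Thread(target=self._send_loop, daemon=True)
        self.sender.start()

    def _send_loop(self):
        # Only this thread writes to the pipe, so a full pipe would block
        # the sender thread and never the feeder itself.
        while True:
            chunk = self.outbox.get()
            if chunk is None:
                break
            self.pipe.put(chunk)

    def try_send(self, chunk):
        """Queue a chunk unless the worker already has the maximum pending."""
        if not self.pending.acquire(blocking=False):
            return False
        self.outbox.put(chunk)
        return True

    def read(self, timeout=1.0):
        """Worker side: take one chunk and free its pending slot."""
        chunk = self.pipe.get(timeout=timeout)
        self.pending.release()
        return chunk

    def close(self):
        self.outbox.put(None)
        self.sender.join()


def feed(chunks, channels, sleep=time.sleep, base_pause=0.001):
    """Send each chunk to the first worker with a free slot.

    When every worker is at its limit, sleep for a pause that doubles on
    each consecutive failed attempt.
    """
    for chunk in chunks:
        attempt = 0
        while not any(ch.try_send(chunk) for ch in channels):
            attempt += 1
            sleep(base_pause * 2 ** attempt)
```

In this sketch the first channel is always tried first, so work is not spread
evenly across workers; a real feeder would presumably rotate between them.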
> Improve backoff policy for cqlsh COPY FROM
> ------------------------------------------
>
> Key: CASSANDRA-11320
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11320
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Stefania
> Assignee: Stefania
> Labels: doc-impacting
> Fix For: 3.x
>
>
> Currently we have an exponential back-off policy in COPY FROM that kicks in
> when timeouts are received. However, there are two limitations:
> * it does not cover new requests and therefore we may not back off
> sufficiently to give an overloaded server time to recover
> * the pause is performed in the receiving thread and therefore we may not
> process server messages quickly enough
> There is a static throttling mechanism in rows per second from feeder to
> worker processes (the INGESTRATE) but the feeder has no idea of the load of
> each worker process. However, it's easy to keep track of how many chunks a
> worker process has yet to read by introducing a bounded semaphore.
> The idea is to move the back-off pauses to the worker processes' main thread
> so as to include all messages, new and retries, not just the retries that
> timed out. The worker process will not read new chunks during the back-off
> pauses, and the feeder process can then look at the number of pending chunks
> before sending new chunks to a worker process.
> [~aholmber], [~aweisberg] what do you think?