[ 
https://issues.apache.org/jira/browse/CASSANDRA-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205957#comment-15205957
 ] 

Stefania commented on CASSANDRA-11320:
--------------------------------------

The patch is ready; I am just waiting for the results of some more tests. 

There are 3 new options:

{code}
         MAXINFLIGHTMESSAGES=512  - the maximum number of messages not yet acknowledged by a replica, before the
                                    back-off policy in worker processes kicks in
         MAXBACKOFFATTEMPTS=32    - the maximum number of back-off attempts in worker processes. During each attempt,
                                    if no replica with less than MAXINFLIGHTMESSAGES pending is found, there is a pause
                                    in the worker process for an amount of time that is drawn at random between
                                    1 and 2^num-attempts seconds
         MAXPENDINGCHUNKS=24      - the maximum number of chunks not yet read by a working process, once this number
                                    is reached, no new chunks are sent from the feeding process to the worker process
{code}

The default values should be reasonable and users should not need to change 
them. 

If all replicas have more than MAXINFLIGHTMESSAGES messages in progress, a
back-off policy is applied in the main thread of the worker process, which
stops sending messages until at least one replica has fewer in-progress
messages. The upper bound of the pause grows exponentially with each attempt.
If there is still no replica available after MAXBACKOFFATTEMPTS attempts, a
{{NoHostAvailable}} exception is raised. The old back-off policy is removed;
on timeouts we now retry as for any other server error, since the back-off is
performed for all messages.
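
To make the worker-side policy concrete, here is a minimal sketch of it, not
the code in the patch. The function names, the {{in_flight}} callable, the
choice of the least-loaded replica, counting attempts from 1, and drawing the
pause from a continuous uniform distribution are all assumptions made for
illustration; {{NoHostAvailable}} is a local stand-in for the driver exception.

```python
import random
import time


class NoHostAvailable(Exception):
    """Local stand-in for the driver exception of the same name."""


def backoff_pause(attempt, rng=random):
    # Pause drawn at random between 1 and 2^attempt seconds, following the
    # MAXBACKOFFATTEMPTS option text. Uniform sampling is an assumption.
    return rng.uniform(1, 2 ** attempt)


def pick_replica(in_flight, max_in_flight=512, max_attempts=32,
                 sleep=time.sleep, rng=random):
    """Return a replica with fewer than max_in_flight unacknowledged messages.

    in_flight is a hypothetical callable returning {replica: pending_count}.
    """
    for attempt in range(1, max_attempts + 1):
        counts = in_flight()
        available = [r for r, n in counts.items() if n < max_in_flight]
        if available:
            # Preferring the least-loaded replica is an assumption.
            return min(available, key=counts.get)
        # No replica can take more messages: pause before the next attempt.
        sleep(backoff_pause(attempt, rng))
    raise NoHostAvailable("all replicas busy after %d attempts" % max_attempts)
```

Because the pause happens before any message is sent, it throttles new
requests and retries alike, which is the point of moving it out of the
receiving thread.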

The feeding process now has one thread per worker process that sends chunks
asynchronously. If a worker process has more than MAXPENDINGCHUNKS chunks
pending, no further chunks are sent to it. If all worker processes have more
than MAXPENDINGCHUNKS chunks pending, the feeding process sleeps for an amount
of time that grows exponentially.
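
The feeder-side throttling could look roughly like the sketch below. It is
illustrative only: {{feed_chunks}}, the {{pending}} and {{send}} callables,
the initial pause, doubling the pause, and resetting it after a successful
send are assumptions, not details taken from the patch. The limit is applied
as described in the option text (no new chunks once the limit is reached).

```python
import time


def feed_chunks(chunks, workers, pending, send, max_pending_chunks=24,
                sleep=time.sleep, first_pause=0.01):
    """Send each chunk to a worker that has not reached max_pending_chunks.

    pending(worker) returns the number of chunks that worker has yet to read;
    send(worker, chunk) hands the chunk over. Both are hypothetical callables.
    """
    pause = first_pause
    for chunk in chunks:
        while True:
            ready = [w for w in workers if pending(w) < max_pending_chunks]
            if ready:
                # Choosing the worker with the fewest pending chunks is an
                # assumption made for this sketch.
                send(min(ready, key=pending), chunk)
                pause = first_pause  # reset once a chunk gets through
                break
            # Every worker is saturated: sleep, doubling the pause each time.
            sleep(pause)
            pause *= 2
```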

The new thread is introduced in {{OneWayChannel}} and will replace the thread
introduced by the second patch of CASSANDRA-11053. Generally speaking, it is
safer to write into a pipe from a separate thread, because the send blocks if
the pipe is full. On Windows there doesn't seem to be an API to determine
whether a send will block, other than using inter-process synchronization, and
I've verified that this is much slower than introducing threads. The
performance impact of these threads is of the order of 3-4k rows per second:
from 47k rows per second to 44k rows per second on my laptop, when importing
2M entries generated with a standard stress write.
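
For readers unfamiliar with the pattern, the sketch below shows the general
idea of writing to a pipe from a dedicated thread. It is not the actual
{{OneWayChannel}} code: the class name {{AsyncPipeWriter}} and its methods are
hypothetical, and it uses only the standard {{multiprocessing}}, {{queue}} and
{{threading}} modules.

```python
import queue
import threading
from multiprocessing import Pipe

_STOP = object()  # sentinel telling the writer thread to exit


class AsyncPipeWriter:
    """Hypothetical helper: queues messages and writes them to a pipe from a
    background thread, so a full pipe blocks that thread, not the caller."""

    def __init__(self, conn):
        self._conn = conn
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, obj):
        # Returns immediately; the background thread does the blocking write.
        self._queue.put(obj)

    def pending(self):
        # Approximate number of messages not yet written to the pipe.
        return self._queue.qsize()

    def close(self):
        self._queue.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            obj = self._queue.get()
            if obj is _STOP:
                break
            self._conn.send(obj)


if __name__ == "__main__":
    reader, writer_end = Pipe(duplex=False)  # (receive end, send end)
    writer = AsyncPipeWriter(writer_end)
    for chunk in ("chunk-1", "chunk-2"):
        writer.send(chunk)
    print(reader.recv(), reader.recv())
    writer.close()
```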


> Improve backoff policy for cqlsh COPY FROM
> ------------------------------------------
>
>                 Key: CASSANDRA-11320
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11320
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Stefania
>            Assignee: Stefania
>              Labels: doc-impacting
>             Fix For: 3.x
>
>
> Currently we have an exponential back-off policy in COPY FROM that kicks in 
> when timeouts are received. However, there are two limitations:
> * it does not cover new requests and therefore we may not back off 
> sufficiently to give an overloaded server time to recover
> * the pause is performed in the receiving thread and therefore we may not 
> process server messages quickly enough
> There is a static throttling mechanism in rows per second from feeder to 
> worker processes (the INGESTRATE) but the feeder has no idea of the load of 
> each worker process. However, it's easy to keep track of how many chunks a 
> worker process has yet to read by introducing a bounded semaphore.
> The idea is to move the back-off pauses to the worker processes' main thread 
> so as to include all messages, new and retries, not just the retries that 
> timed out. The worker process will not read new chunks during the back-off 
> pauses, and the feeder process can then look at the number of pending chunks 
> before sending new chunks to a worker process.
> [~aholmber], [~aweisberg] what do you think?  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
