[cassandra-website] branch master updated: Blog Post 2020-09-03 Improving Resiliency

mck Thu, 03 Sep 2020 00:47:58 -0700

This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git



The following commit(s) were added to refs/heads/master by this push:
     new c99bd7e  Blog Post 2020-09-03 Improving Resiliency
c99bd7e is described below

commit c99bd7eed1a33b8ceff6e045475114b5c004b807
Author: Melissa Logan <loganloganlogan@Logan-2018.local>
AuthorDate: Wed Sep 2 15:35:24 2020 -0700

    Blog Post 2020-09-03 Improving Resiliency
---
 .../2020-09-03-improving-resiliency.markdown       | 105 +++++++++++++++++++++
 src/img/blog-post-improving-resiliency/image1.png  | Bin 0 -> 256522 bytes
 src/img/blog-post-improving-resiliency/image10.png | Bin 0 -> 278163 bytes
 src/img/blog-post-improving-resiliency/image11.png | Bin 0 -> 509083 bytes
 src/img/blog-post-improving-resiliency/image12.png | Bin 0 -> 234728 bytes
 src/img/blog-post-improving-resiliency/image13.png | Bin 0 -> 199026 bytes
 src/img/blog-post-improving-resiliency/image14.png | Bin 0 -> 252461 bytes
 src/img/blog-post-improving-resiliency/image15.png | Bin 0 -> 260371 bytes
 src/img/blog-post-improving-resiliency/image16.png | Bin 0 -> 466079 bytes
 src/img/blog-post-improving-resiliency/image2.png  | Bin 0 -> 355440 bytes
 src/img/blog-post-improving-resiliency/image3.png  | Bin 0 -> 354831 bytes
 src/img/blog-post-improving-resiliency/image4.png  | Bin 0 -> 392171 bytes
 src/img/blog-post-improving-resiliency/image5.png  | Bin 0 -> 274880 bytes
 src/img/blog-post-improving-resiliency/image6.png  | Bin 0 -> 274174 bytes
 src/img/blog-post-improving-resiliency/image7.png  | Bin 0 -> 147739 bytes
 src/img/blog-post-improving-resiliency/image8.png  | Bin 0 -> 606925 bytes
 src/img/blog-post-improving-resiliency/image9.png  | Bin 0 -> 461089 bytes
 17 files changed, 105 insertions(+)

diff --git a/src/_posts/2020-09-03-improving-resiliency.markdown 
b/src/_posts/2020-09-03-improving-resiliency.markdown
new file mode 100644
index 0000000..1acea24
--- /dev/null
+++ b/src/_posts/2020-09-03-improving-resiliency.markdown
@@ -0,0 +1,105 @@
+---
+layout: post
+title: "Improving Apache Cassandra’s Front Door and Backpressure"
+date:   2020-09-03 09:00:00 -0700
+author: the Apache Cassandra Community
+categories: blog
+---
+
+As part of 
[CASSANDRA-15013](https://issues.apache.org/jira/browse/CASSANDRA-15013), we 
have improved Cassandra’s ability to handle high-throughput workloads while 
keeping enough safeguards in place to protect itself from running out of 
memory. To explain the change, let us first look at how Cassandra processed an 
incoming request before the fix, then at what we changed and the new 
configuration knobs available.
+
+### How inbound requests were handled before 
+
+Let us take the scenario of a client application sending requests to a C* 
cluster. For the purpose of this blog post, let us focus on one of the C* 
coordinator nodes.
+
+![alt_text](img/blog-post-improving-resiliency/image1.png "image_tooltip")
+
+Below is a closer view of the client-server interaction at the C* 
coordinator node. Each client connection to a Cassandra node happens over a 
Netty channel, and for efficiency, each Netty eventloop thread is 
responsible for more than one Netty channel.
+
+![alt_text](img/blog-post-improving-resiliency/image2.png "image_tooltip")
+
+The eventloop threads read requests coming off of netty channels and enqueue 
them into a bounded inbound queue in the Cassandra node.
+
+![alt_text](img/blog-post-improving-resiliency/image3.png "image_tooltip")
+
+A thread pool dequeues requests from the inbound queue, processes them 
asynchronously and enqueues the response into an outbound queue. There exist 
multiple outbound queues, one for each eventloop thread to avoid races.
+
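The flow described above (a shared bounded inbound queue, a worker pool, and one outbound queue per eventloop thread) can be sketched roughly as follows. This is a minimal illustration in Python, not Cassandra's code: the queue sizes, `NUM_EVENT_LOOPS`, the `worker` function and the request names are all hypothetical.

```python
import queue
import threading

NUM_EVENT_LOOPS = 2  # hypothetical count, for illustration only

# One shared bounded inbound queue, one unbounded outbound queue per event loop.
inbound = queue.Queue(maxsize=8)
outbound = [queue.Queue() for _ in range(NUM_EVENT_LOOPS)]

def worker():
    """Dequeue a request, 'process' it, and route the response to the
    outbound queue of the event loop that owns the originating channel."""
    while True:
        item = inbound.get()
        if item is None:  # shutdown sentinel
            break
        loop_id, request = item
        outbound[loop_id].put("response to " + request)

workers = [threading.Thread(target=worker) for _ in range(2)]
for t in workers:
    t.start()

# Event loop side: enqueue requests read off the channels.
for i in range(4):
    inbound.put((i % NUM_EVENT_LOOPS, "request-%d" % i))

# One sentinel per worker, then wait for the pool to drain the queue.
for _ in workers:
    inbound.put(None)
for t in workers:
    t.join()

# Snapshot of what each event loop would have to ship back to its clients.
per_loop = [sorted(q.queue) for q in outbound]
```

Because each response lands only in the queue of the owning event loop, no two event loops ever compete for the same response.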
+![alt_text](img/blog-post-improving-resiliency/image4.png "image_tooltip")
+
+![alt_text](img/blog-post-improving-resiliency/image5.png "image_tooltip")
+
+![alt_text](img/blog-post-improving-resiliency/image6.png "image_tooltip")
+
+The same eventloop threads that enqueue incoming requests into the inbound 
queue are also responsible for dequeuing responses from their outbound queue 
and shipping them back to the client.
+
+![alt_text](img/blog-post-improving-resiliency/image7.png "image_tooltip")
+
+![alt_text](img/blog-post-improving-resiliency/image8.png "image_tooltip")
+
+#### Issue with this workflow
+
+Let us take a scenario where there is a spike in operations from the client. 
The eventloop threads now enqueue requests at a much higher rate than the 
native transport thread pool can process them. Eventually, the inbound queue 
reaches its limit and cannot accept any more requests.
+
+![alt_text](img/blog-post-improving-resiliency/image9.png "image_tooltip")
+
+Consequently, the eventloop threads block as they try to enqueue more 
requests into an already full inbound queue. Each thread waits until it can 
successfully enqueue the request it is holding.
+
+![alt_text](img/blog-post-improving-resiliency/image10.png "image_tooltip")
+
+As noted earlier, these blocked eventloop threads are also supposed to dequeue 
responses from the outbound queue. While they are blocked, the outbound queue 
(which is unbounded) keeps growing with undelivered responses, eventually 
causing C* to run out of memory. This is a vicious cycle: since the eventloop 
threads are blocked, there is no one to ship responses back to the client; 
eventually the client-side timeout triggers, and 
clients may send more requests due to retr [...]
+
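The failure mode can be reproduced in miniature. The sketch below is an assumption-laden Python model, not Cassandra's implementation: `event_loop_turn`, the queue sizes and the timeout are hypothetical, and the timeout exists only so the example terminates (a real blocked thread would simply keep waiting).

```python
import queue

inbound = queue.Queue(maxsize=2)  # bounded, like the old inbound queue
outbound = queue.Queue()          # unbounded, like an outbound queue

def event_loop_turn(request, timeout=0.05):
    """One turn of a simplified event loop thread.

    Returns the number of responses flushed, or None when the enqueue
    could not complete and the flush step was never reached.
    """
    try:
        # put() blocks while the queue is full; the timeout only exists so
        # that this sketch terminates instead of hanging forever.
        inbound.put(request, timeout=timeout)
    except queue.Full:
        return None
    flushed = 0
    while True:
        try:
            outbound.get_nowait()
        except queue.Empty:
            return flushed
        flushed += 1

# A spike fills the inbound queue while responses keep arriving.
inbound.put("request-1")
inbound.put("request-2")
for i in range(3):
    outbound.put("response-%d" % i)

result = event_loop_turn("request-3")
pending_responses = outbound.qsize()
```

Here `result` is `None` because the enqueue never completed, and all three responses are still sitting in the outbound queue: the thread that should have flushed them was stuck on the inbound side.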
+![alt_text](img/blog-post-improving-resiliency/image11.png "image_tooltip")
+
+So far, we have built a fair understanding of how the front door of C* works 
with regard to handling client requests, and how blocked eventloop threads can 
affect Cassandra.
+
+### What we changed
+
+#### Backpressure
+
+The root cause of the issue is that the eventloop threads get blocked. To 
keep them from blocking, we made the bounded inbound queue unbounded. Without 
further care, though, this could lead to an out-of-memory situation of its 
own, this time because of the unbounded inbound queue. So we defined an 
overloaded state for the node based on the memory usage of the inbound queue.
+
+We introduced two levels of thresholds: one at the node level, and a more 
granular one per client IP. The per-IP threshold helps isolate rogue client 
IPs without affecting well-behaved clients.
+
+These thresholds can be set in the `cassandra.yaml` file.
+
+```
+native_transport_max_concurrent_requests_in_bytes_per_ip
+native_transport_max_concurrent_requests_in_bytes
+```
+
+These thresholds can also be changed at runtime 
([CASSANDRA-15519](https://issues.apache.org/jira/browse/CASSANDRA-15519)).
+
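To make the two-level check concrete, here is a minimal single-threaded sketch in Python of byte accounting against a node-wide limit and a per-IP limit. The class `InflightByteLimiter`, its method names and the numbers are hypothetical and chosen only for illustration; they do not mirror Cassandra's internal classes or default values.

```python
from collections import defaultdict

class InflightByteLimiter:
    """Tracks bytes of in-flight requests per client IP and for the node."""

    def __init__(self, node_limit, per_ip_limit):
        self.node_limit = node_limit
        self.per_ip_limit = per_ip_limit
        self.node_bytes = 0
        self.ip_bytes = defaultdict(int)

    def try_acquire(self, ip, size):
        """Reserve `size` bytes; return False if either threshold would be crossed."""
        if self.ip_bytes[ip] + size > self.per_ip_limit:
            return False  # this client IP alone is over its budget
        if self.node_bytes + size > self.node_limit:
            return False  # the node as a whole is overloaded
        self.ip_bytes[ip] += size
        self.node_bytes += size
        return True

    def release(self, ip, size):
        """Give the bytes back once the request has been processed."""
        self.ip_bytes[ip] -= size
        self.node_bytes -= size

limiter = InflightByteLimiter(node_limit=100, per_ip_limit=40)
a = limiter.try_acquire("10.0.0.1", 30)  # fits both limits
b = limiter.try_acquire("10.0.0.1", 20)  # would put this IP at 50 > 40
c = limiter.try_acquire("10.0.0.2", 40)  # another client is unaffected
d = limiter.try_acquire("10.0.0.3", 40)  # would put the node at 110 > 100
limiter.release("10.0.0.1", 30)
e = limiter.try_acquire("10.0.0.3", 40)  # fits again after the release
```

The second call fails only for the client that exceeded its own budget, which is the isolation property the per-IP threshold is meant to provide.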
+#### Configurable server response to the client as part of backpressure
+
+If C* is in an overloaded state (as defined by the thresholds 
mentioned above), it can react in one of the following ways:
+
+*   Apply backpressure by setting “Autoread” to false on the netty channel in 
question (default behavior).
+*   Respond to the client with an OverloadedException (if the client sets the 
“THROW_ON_OVERLOAD” connection startup option to “true”).
+
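The two reactions listed above can be summarized as a small decision function. This is a hedged Python sketch: `Channel` is a hypothetical stand-in for a netty channel, `handle` is not a Cassandra API, and the choice to still enqueue the request that tripped the threshold in the default mode is an assumption of this sketch rather than something stated in this post.

```python
from dataclasses import dataclass, field

@dataclass
class Channel:
    """Hypothetical stand-in for a netty channel and its queues."""
    throw_on_overload: bool = False
    autoread: bool = True
    inbound: list = field(default_factory=list)
    flusher: list = field(default_factory=list)

def handle(channel, request, overloaded):
    """Decide what to do with a request that was just read off the channel."""
    if not overloaded:
        channel.inbound.append(request)
        return "enqueued"
    if channel.throw_on_overload:
        # Reject: the error response goes straight to the flusher queue.
        channel.flusher.append("OverloadedException for " + request)
        return "rejected"
    # Default: stop reading further bytes from this channel.
    # Assumption: the request already read is still enqueued.
    channel.autoread = False
    channel.inbound.append(request)
    return "paused"

ok = Channel()
r1 = handle(ok, "q1", overloaded=False)
default = Channel()
r2 = handle(default, "q2", overloaded=True)
strict = Channel(throw_on_overload=True)
r3 = handle(strict, "q3", overloaded=True)
```

In the default mode the client sees backpressure only indirectly, through its socket buffers filling up; with the option set, it gets an explicit error it can react to.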
+Let us look at the client request-response workflow again, in both these cases.
+
+#### **THROW_ON_OVERLOAD = false (default)**
+
+Suppose the inbound queue is full (i.e. the thresholds are met).
+
+![alt_text](img/blog-post-improving-resiliency/image12.png "image_tooltip")
+
+C* sets autoread to false on the netty channel, which means it will stop 
reading bytes off of the netty channel.
+
+![alt_text](img/blog-post-improving-resiliency/image13.png "image_tooltip")
+
+Consequently, the kernel socket inbound buffer fills up, since the Netty 
eventloop is no longer reading bytes off of it.
+
+![alt_text](img/blog-post-improving-resiliency/image14.png "image_tooltip")
+
+Once the kernel socket inbound buffer is full on the server side, data starts 
piling up in the kernel socket outbound buffer on the client side. Once that 
buffer is full too, the client starts experiencing backpressure.
+
+![alt_text](img/blog-post-improving-resiliency/image15.png "image_tooltip")
+
+#### **THROW_ON_OVERLOAD = true**
+
+If the inbound queue is full (i.e. the thresholds are met), eventloop threads 
do not enqueue the request into the inbound queue. Instead, the eventloop 
thread creates an OverloadedException response message and enqueues it into the 
flusher queue, from which it is shipped back to the client.
+
+![alt_text](img/blog-post-improving-resiliency/image16.png "image_tooltip")
+
+This way, Cassandra can sustain very high throughput while protecting itself 
from running out of memory. This patch has been vetted through thorough 
performance benchmarking. Detailed performance analysis can be found 
[here](https://issues.apache.org/jira/browse/CASSANDRA-15013?focusedCommentId=16881762&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16881762).
 
diff --git a/src/img/blog-post-improving-resiliency/image1.png 
b/src/img/blog-post-improving-resiliency/image1.png
new file mode 100644
index 0000000..8edf5f8
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image1.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image10.png 
b/src/img/blog-post-improving-resiliency/image10.png
new file mode 100644
index 0000000..ffed820
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image10.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image11.png 
b/src/img/blog-post-improving-resiliency/image11.png
new file mode 100644
index 0000000..c2b4a69
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image11.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image12.png 
b/src/img/blog-post-improving-resiliency/image12.png
new file mode 100644
index 0000000..675e84f
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image12.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image13.png 
b/src/img/blog-post-improving-resiliency/image13.png
new file mode 100644
index 0000000..70f0887
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image13.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image14.png 
b/src/img/blog-post-improving-resiliency/image14.png
new file mode 100644
index 0000000..fd53d62
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image14.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image15.png 
b/src/img/blog-post-improving-resiliency/image15.png
new file mode 100644
index 0000000..df90bd0
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image15.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image16.png 
b/src/img/blog-post-improving-resiliency/image16.png
new file mode 100644
index 0000000..64dcde5
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image16.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image2.png 
b/src/img/blog-post-improving-resiliency/image2.png
new file mode 100644
index 0000000..edea7fe
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image2.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image3.png 
b/src/img/blog-post-improving-resiliency/image3.png
new file mode 100644
index 0000000..7e1f291
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image3.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image4.png 
b/src/img/blog-post-improving-resiliency/image4.png
new file mode 100644
index 0000000..367f7c4
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image4.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image5.png 
b/src/img/blog-post-improving-resiliency/image5.png
new file mode 100644
index 0000000..2c2e65d
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image5.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image6.png 
b/src/img/blog-post-improving-resiliency/image6.png
new file mode 100644
index 0000000..67b9bd2
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image6.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image7.png 
b/src/img/blog-post-improving-resiliency/image7.png
new file mode 100644
index 0000000..ce49186
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image7.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image8.png 
b/src/img/blog-post-improving-resiliency/image8.png
new file mode 100644
index 0000000..e9f0f6f
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image8.png 
differ
diff --git a/src/img/blog-post-improving-resiliency/image9.png 
b/src/img/blog-post-improving-resiliency/image9.png
new file mode 100644
index 0000000..c54ec71
Binary files /dev/null and b/src/img/blog-post-improving-resiliency/image9.png 
differ


