Re: Cassandra p95 latencies

Elliott Sims via user Mon, 14 Aug 2023 11:56:04 -0700

1.  Check for Nagle/delayed-ACK interaction, though the driver probably sets
TCP_NODELAY already, so it shouldn't be a problem.
2.  Check for network latency (just regular old ping among hosts, during
traffic)
3.  Check your GC metrics and see if garbage collections line up with
outliers.  Some tuning can help there, depending on the pattern, but 40ms
p99 at least would be fairly normal for G1GC.
4.  Check actual local write times, and I/O times with iostat.  If you have
spinning drives 40ms is fairly expected.  It's high but not totally
unexpected for consumer-grade SSDs.  For enterprise-grade SSDs commit times
that long would be very unusual.  What are your commitlog_sync settings?
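
For step 3, one rough way to see whether collections line up with the
outliers is to compare request timestamps against GC pause timestamps. Below
is a minimal Python sketch. It assumes you have already pulled (start time in
seconds, latency in ms) pairs from your client-side metrics and (start time in
seconds, pause in ms) pairs from the GC log. The function names, input shapes,
and sample numbers are hypothetical and not part of Cassandra or any driver.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def outliers_during_gc(requests, gc_pauses, pct=95):
    """Count how many slow requests overlap a GC pause.

    requests:  list of (start_seconds, latency_ms)  -- assumed input shape
    gc_pauses: list of (start_seconds, pause_ms)    -- assumed input shape
    Returns (threshold_ms, outlier_count, overlapping_count).
    """
    threshold = percentile([lat for _, lat in requests], pct)
    # Treat every request at or above the percentile threshold as an outlier.
    outliers = [(start, lat) for start, lat in requests if lat >= threshold]

    overlapping = 0
    for start, lat in outliers:
        end = start + lat / 1000.0
        # Two intervals overlap if each one starts before the other ends.
        if any(g_start <= end and start <= g_start + g_ms / 1000.0
               for g_start, g_ms in gc_pauses):
            overlapping += 1
    return threshold, len(outliers), overlapping


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample_requests = [(0.0, 2), (1.0, 3), (2.0, 45), (3.0, 2), (4.0, 50)]
    sample_pauses = [(2.01, 30)]
    print(outliers_during_gc(sample_requests, sample_pauses, pct=80))
```

If most of the outliers overlap a pause, GC tuning is the place to look. If
few do, the network and disk checks in steps 2 and 4 are more likely to pay
off.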


On Mon, Aug 14, 2023 at 8:43 AM Josh McKenzie <jmcken...@apache.org> wrote:

> The queries are rightly designed
>
> Data modeling in Cassandra is 100% gray space; there unfortunately is no
> right or wrong design. You'll need to share basic shapes / contours of your
> data model for other folks to help you; seemingly innocuous things in a
> data model can cause unexpected issues w/C*'s storage engine paradigm
> thanks to the partitioning and data storage happening under the hood.
>
> If you were seeing single digit ms on 3.0.X or 3.11.X and 40ms p95 on 4.0
> I'd immediately look to the DB as being the culprit. For all other cases,
> you should be seeing single digit ms as queries in C* generally boil down
> to key/value lookups (partition key) to a list of rows you either point
> query (key/value #2) or range scan via clustering keys and pull back out.
>
> There's also paging to take into consideration (whether you're using it or
> not, what your page size is) and the data itself (do you have thousands of
> columns? Multi-MB blobs you're pulling back out? etc). All can play into
> this.
>
> On Fri, Aug 11, 2023, at 3:40 PM, Jeff Jirsa wrote:
>
> You’re going to have to help us help you
>
> 4.0 is pretty widely deployed. I’m not aware of a perf regression
>
> Can you give us a schema (anonymized) and queries and show us a trace ?
>
>
> On Aug 10, 2023, at 10:18 PM, Shaurya Gupta <shaurya.n...@gmail.com>
> wrote:
>
> 
> The queries are rightly designed, as I already explained. 40 ms is way too
> high compared to what I have seen with other DBs and, many times, with
> Cassandra 3.x versions.
> CPU consumed as I mentioned is not high, it is around 20%.
>
> On Thu, Aug 10, 2023 at 5:14 PM MyWorld <timeplus.1...@gmail.com> wrote:
>
> Hi,
> P95 should not be a problem if rightly designed. The leveled compaction
> strategy reduces it further, though it consumes some resources. For reads,
> caching is also helpful.
> Can you check your CPU iowait? It could be the reason for the delay.
>
> Regards,
> Ashish
>
> On Fri, 11 Aug, 2023, 04:58 Shaurya Gupta, <shaurya.n...@gmail.com> wrote:
>
> Hi community
>
> What is the expected P95 latency for Cassandra read and write queries
> executed with LOCAL_QUORUM over a table with 3 replicas? The queries
> use the partition + clustering key, and the row size is small, maybe
> 1-2 KB maximum.
> This assumes CPU is not a bottleneck.
>
> We observe 40 ms P95 for reads and the same for writes. This looks
> very high compared to what we expected. We are using Cassandra 4.0.
>
> Any documentation / numbers will be helpful.
>
> Thanks
> --
> Shaurya Gupta
>
>
>
> --
> Shaurya Gupta
>
>
>

