Re: Predicting Read/Write Latency as a Function of Total Requests & Cluster Size

2019-12-10 Thread Peter Corless
The theoretical answer involves Little's Law (*L=λW*). But the practical experience is, as you say, dependent on a fair number of factors. We wrote a recent blog
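To make the formula concrete (illustrative numbers, not from the thread): L is the average number of requests in flight, λ the arrival rate, and W the average latency. At λ = 20,000 requests/sec and W = 5 ms, L = λW = 20,000 × 0.005 = 100 requests concurrently in the system. With cluster capacity held fixed, pushing λ up eventually pushes W up too, since the nodes can only service so many concurrent requests.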

Re: Connection Pooling in v4.x Java Driver

2019-12-10 Thread Jon Haddad
I'm not sure how closely the driver maintainers are following this list. You might want to ask on the Java Driver mailing list: https://groups.google.com/a/lists.datastax.com/forum/#!forum/java-driver-user On Tue, Dec 10, 2019 at 5:10 PM Caravaggio, Kevin < kevin.caravag...@lowes.com> wrote:

Connection Pooling in v4.x Java Driver

2019-12-10 Thread Caravaggio, Kevin
Hello, When integrating with the DataStax OSS Cassandra Java driver v4.x, I noticed “Unlike previous versions of the driver, pools do not resize dynamically” in reference to the connection pool
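A minimal sketch of what that means in practice for driver 4.x: pool sizes are pinned once, when the session is built, via the config loader (option names per the 4.x configuration reference; the sizes shown here are arbitrary):

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
    import com.datastax.oss.driver.api.core.config.DriverConfigLoader;

    public class FixedPoolExample {
        public static void main(String[] args) {
            // Driver 4.x pools are fixed-size: these values are set once at
            // session creation and never grow or shrink under load.
            DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
                .withInt(DefaultDriverOption.CONNECTION_POOL_LOCAL_SIZE, 4)   // arbitrary size
                .withInt(DefaultDriverOption.CONNECTION_POOL_REMOTE_SIZE, 2)  // arbitrary size
                .build();
            try (CqlSession session = CqlSession.builder()
                    .withConfigLoader(loader)
                    .build()) {
                // use the session
            }
        }
    }

The same settings can also live in application.conf under datastax-java-driver.advanced.connection.pool.{local,remote}.size; either way, sizing has to be decided up front rather than left to the driver.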

Re: Dynamo autoscaling: does it beat cassandra?

2019-12-10 Thread Dor Laor
Compression in 3.x is much better than in 2.x (see the attached graph from one of our (Scylla) customers). However, it's not related to Dynamo's hot-partition behavior and caching. In Dynamo, every tablet has its own limits, and caching isn't taken into account. Once the throughput goes beyond the tablet

Re: Dynamo autoscaling: does it beat cassandra?

2019-12-10 Thread Reid Pinchback
Hi Carl, I can’t speak to all of the internal mechanics and what the committers factored in. I have no doubt that intelligent decisions were the goal given the context of the time. My point is more that, at least in our case, we see nodes with a fair chunk of file data sitting in

Re: Dynamo autoscaling: does it beat cassandra?

2019-12-10 Thread Carl Mueller
Dor and Reid: thanks, that was very helpful. Is the large amount of compression an artifact of pre-3.11 Cassandra, where column names were stored per-cell (combined with the clustering key, for extreme verbosity, I think), so compression would at least be effective against those portions of the sstable data?

Re: Seeing tons of DigestMismatchException exceptions after upgrading from 2.2.13 to 3.11.4

2019-12-10 Thread Reid Pinchback
Colleen, to your question, yes there is a difference between 2.x and 3.x that would impact repairs. The Merkle tree computations changed to use a greater default tree depth. That can cause significant memory drag, to the point that nodes sometimes even OOM. This has been fixed in
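The memory impact is easy to see: a Merkle tree of depth d has 2^d leaf hashes, so each extra level doubles the tree. Going from, say, depth 15 to depth 20 (illustrative depths, not necessarily the exact defaults in 2.x/3.x) means 2^5 = 32x the leaves per tree, per token range, per repair session.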

Re: Seeing tons of DigestMismatchException exceptions after upgrading from 2.2.13 to 3.11.4

2019-12-10 Thread Reid Pinchback
Carl, your speculation matches our observations, and we have a use case with that unfortunate usage pattern. Write-then-immediately-read is not friendly to eventually-consistent data stores. It makes the reading pay a tax that really is associated with writing activity.

Re: Dynamo autoscaling: does it beat cassandra?

2019-12-10 Thread Reid Pinchback
Note that DynamoDB I/O throughput scaling doesn’t work well with brief spikes. Unless you write your own machinery to manage the provisioning, by the time AWS scales the I/O bandwidth your incident has long since passed. It’s not a thing to rely on if you have a latency SLA. It really only
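For anyone who does want their own machinery, here is a sketch (AWS SDK for Java v2; the table name and capacity numbers are made up) of raising provisioned throughput ahead of a spike you can predict, rather than waiting for auto scaling to react:

    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.ProvisionedThroughput;
    import software.amazon.awssdk.services.dynamodb.model.UpdateTableRequest;

    public class PreScale {
        public static void main(String[] args) {
            try (DynamoDbClient ddb = DynamoDbClient.create()) {
                // Bump capacity *before* the known spike arrives; reactive
                // auto scaling would kick in minutes after it began.
                ddb.updateTable(UpdateTableRequest.builder()
                    .tableName("my_table")                    // hypothetical table
                    .provisionedThroughput(ProvisionedThroughput.builder()
                        .readCapacityUnits(5000L)             // made-up numbers
                        .writeCapacityUnits(2000L)
                        .build())
                    .build());
            }
        }
    }

This only helps for spikes you can schedule or forecast; for genuinely unpredictable bursts you are back to over-provisioning or on-demand pricing.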

Re: Predicting Read/Write Latency as a Function of Total Requests & Cluster Size

2019-12-10 Thread Reid Pinchback
Latency SLAs are very much *not* Cassandra’s sweet spot; scaling throughput and storage is more where C*’s strengths shine. If you want just median latency you’ll find things a bit more amenable to modeling, but not if you have two-nines, and particularly not three-nines, SLA expectations. Basically,
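One way to see why the tails resist simple modeling (a generic illustration, not a claim about any particular cluster): suppose a single replica read blows your latency target 1% of the time. A client doing 100 such reads, or a query fanning out to 100 partitions, comes in entirely under target only 0.99^100 ≈ 37% of the time, so the "rare" slow path ends up dominating the p99 you actually observe.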

Predicting Read/Write Latency as a Function of Total Requests & Cluster Size

2019-12-10 Thread Fred Habash
I'm looking for an empirical way to answer these two questions: 1. If I increase application workload (read/write requests) by some percentage, how is it going to affect read/write latency? Of course, all other factors remaining constant, e.g. EC2 instance class, SSD specs, number of nodes, etc.
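As a back-of-envelope frame for question 1 (treating a node as a single-server M/M/1 queue, which is a real simplification of Cassandra): with service rate μ and arrival rate λ, mean latency is W = 1/(μ − λ). At λ = 0.5μ, W = 2/μ; raise load 60% to λ = 0.8μ and W = 5/μ, i.e. 2.5x the latency for a 1.6x load increase. That nonlinearity as utilization climbs is why a fixed "X% more load means Y% more latency" rule rarely holds.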