Re: How to monitor datastax driver compression performance?
tlp-stress has support for customizing payloads, but it's not documented very well. For a given data model (say the KeyValue one), you can override what tlp-stress will send over. By default it's pretty small, a handful of bytes. You pass --field.keyvalue.value (the table name + the field name) followed by the custom field generator you'd like to use. For example, --field.keyvalue.value='random(1,11000)' will generate up to roughly 10K random characters. You can also generate text from real words by using the book(100,200) function (100-200 random words out of books) if you want something that will compress better.

You can see a (poorly formatted) list of all the customizations you can do by running `tlp-stress fields`.

This is one of the areas I haven't spent enough time on to share with the world in a carefree manner, but it works. If you're willing to overlook the poor docs in the area I think it might meet your needs.

Regarding compression at the query level vs not, I think you should look at the overhead first. I'm betting you'll find it's insignificant. That said, you can always create two cluster objects with two radically different settings if you find you need it.

On Tue, Apr 9, 2019 at 6:32 AM Gabriel Giussi wrote:
> Does tlp-stress allow us to define the size of rows? I will see the benefit
> of compression in terms of request rates only if the compression ratio is
> significant, i.e. if it requires fewer network round trips.
> Could this be done by generating bigger partitions with the parameters -n
> and -p, i.e. by decreasing -p?
>
> Also, don't you think that the driver should allow configuring compression
> per query? One table with wide rows could benefit from compression while
> another one with a smaller payload could not.
>
> Thanks for your help Jon.
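To get a feel for why the payload generator matters when benchmarking compression, the sketch below compares how well random characters and text built from real words compress. It is only an illustration under stated assumptions: it uses Python's zlib as a stand-in (the driver's protocol compression uses LZ4 or Snappy, so absolute ratios will differ), and the vocabulary and payload sizes are made up rather than taken from tlp-stress.

```python
import random
import string
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return original size divided by compressed size (higher compresses better)."""
    return len(payload) / len(zlib.compress(payload))


# Fixed seed so repeated runs produce the same payloads.
rng = random.Random(42)

# Assumption: stand-in for a random(...) style payload, 10,000 random characters.
random_text = "".join(
    rng.choices(string.ascii_letters + string.digits, k=10_000)
).encode()

# Assumption: stand-in for a book(...) style payload, words from a tiny vocabulary.
vocabulary = ["cassandra", "driver", "compression", "latency",
              "network", "cluster", "request", "payload"]
word_text = " ".join(rng.choices(vocabulary, k=1_500)).encode()

random_ratio = compression_ratio(random_text)
word_ratio = compression_ratio(word_text)

print(f"random characters: {random_ratio:.2f}x")
print(f"real words:        {word_ratio:.2f}x")
```

If the two ratios are close for your real data, compression is unlikely to save much bandwidth, whatever it costs in CPU.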
Re: How to monitor datastax driver compression performance?
Does tlp-stress allow us to define the size of rows? I will see the benefit of compression in terms of request rates only if the compression ratio is significant, i.e. if it requires fewer network round trips. Could this be done by generating bigger partitions with the parameters -n and -p, i.e. by decreasing -p?

Also, don't you think that the driver should allow configuring compression per query? One table with wide rows could benefit from compression while another one with a smaller payload could not.

Thanks for your help Jon.

On Mon, Apr 8, 2019 at 19:13, Jon Haddad wrote:
> If it were me, I'd look at raw request rates (in terms of requests /
> second as well as request latency), network throughput and then some
> flame graphs of both the server and your application:
> https://github.com/jvm-profiling-tools/async-profiler.
>
> I've created an issue in tlp-stress to add compression options for the
> driver: https://github.com/thelastpickle/tlp-stress/issues/67. If
> you're interested in contributing the feature I think tlp-stress will
> more or less solve the remainder of the problem for you (the load
> part, not the OS numbers).
>
> Jon
Re: How to monitor datastax driver compression performance?
If it were me, I'd look at raw request rates (in terms of requests / second as well as request latency), network throughput and then some flame graphs of both the server and your application: https://github.com/jvm-profiling-tools/async-profiler.

I've created an issue in tlp-stress to add compression options for the driver: https://github.com/thelastpickle/tlp-stress/issues/67. If you're interested in contributing the feature I think tlp-stress will more or less solve the remainder of the problem for you (the load part, not the OS numbers).

Jon

On Mon, Apr 8, 2019 at 7:26 AM Gabriel Giussi wrote:
> Hi, I'm trying to test if adding driver compression will bring me any
> benefit.
> I understand that the trade-off is less bandwidth but increased CPU usage
> in both cassandra nodes (compression) and client nodes (decompression), but
> I want to know what the key metrics are and how to monitor them to prove
> compression is giving good results.
> I guess I should look at latency percentiles reported by
> com.datastax.driver.core.Metrics and CPU usage, but what about bandwidth
> usage and compression ratio?
> Should I use tcpdump to capture the length of packets coming from cassandra
> nodes? Would something like tcpdump -n "src port 9042 and tcp[13] & 8 != 0" |
> sed -n "s/^.*length \(.*\).*$/\1/p" be enough?
>
> Thanks
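As a rough illustration of the request-rate and latency comparison suggested above, here is a minimal sketch that turns a list of per-request latencies into requests per second plus p50/p99. The function name `summarize`, the sample latencies and the run duration are all made up for the example; in practice the numbers would come from the driver metrics or the stress tool's output.

```python
import statistics


def summarize(latencies_ms, duration_s):
    """Summarize one benchmark run: throughput plus median and p99 latency."""
    # quantiles(n=100) returns 99 cut points; index 49 is p50, index 98 is p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "requests_per_second": len(latencies_ms) / duration_s,
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
    }


# Hypothetical data: 100 requests with latencies 1..100 ms over a 10 second run.
sample_latencies = [float(ms) for ms in range(1, 101)]
summary = summarize(sample_latencies, duration_s=10.0)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Running the same summary for one run with compression and one without, on identical load, gives two sets of numbers that can be compared side by side.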
How to monitor datastax driver compression performance?
Hi, I'm trying to test if adding driver compression will bring me any benefit.

I understand that the trade-off is less bandwidth but increased CPU usage in both cassandra nodes (compression) and client nodes (decompression), but I want to know what the key metrics are and how to monitor them to prove compression is giving good results.

I guess I should look at latency percentiles reported by com.datastax.driver.core.Metrics and CPU usage, but what about bandwidth usage and compression ratio?

Should I use tcpdump to capture the length of packets coming from cassandra nodes? Would something like tcpdump -n "src port 9042 and tcp[13] & 8 != 0" | sed -n "s/^.*length \(.*\).*$/\1/p" be enough?

Thanks
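One possible way to turn the tcpdump idea into a number is to sum the "length" values from two captures, one taken without compression and one with it, and compare the totals. The sketch below assumes tcpdump's default one-line TCP output ending in "length N"; the sample lines, addresses and function name are invented for illustration, not real capture data.

```python
import re
from typing import Iterable

# Matches the payload length that tcpdump prints for each TCP packet.
LENGTH_RE = re.compile(r"length (\d+)")


def total_payload_bytes(lines: Iterable[str]) -> int:
    """Sum the last 'length N' value found on each tcpdump output line."""
    total = 0
    for line in lines:
        matches = LENGTH_RE.findall(line)
        if matches:
            # Take the last match, like the greedy sed expression in the question.
            total += int(matches[-1])
    return total


# Hypothetical capture lines for the same workload, without and with compression.
plain_capture = [
    "12:00:00.000001 IP 10.0.0.1.9042 > 10.0.0.2.51234: Flags [P.], seq 1:1449, ack 1, win 229, length 1448",
    "12:00:00.000202 IP 10.0.0.1.9042 > 10.0.0.2.51234: Flags [P.], seq 1449:2001, ack 1, win 229, length 552",
]
compressed_capture = [
    "12:05:00.000001 IP 10.0.0.1.9042 > 10.0.0.2.51240: Flags [P.], seq 1:701, ack 1, win 229, length 700",
    "12:05:00.000150 IP 10.0.0.1.9042 > 10.0.0.2.51240: Flags [P.], seq 701:1001, ack 1, win 229, length 300",
]

ratio = total_payload_bytes(plain_capture) / total_payload_bytes(compressed_capture)
print(f"bytes on the wire, plain vs compressed: {ratio:.2f}x")
```

The same function could read live tcpdump output from standard input instead of the sample lists. Note that the `tcp[13] & 8 != 0` part of the filter keeps only packets with the PSH flag set, so segments of a large response that lack the flag would not be counted; for byte totals it is probably safer to filter on the port alone.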