@Rahul, I am using the cassandra-stress tool.

On Tue, Feb 6, 2018 at 7:37 PM, Rahul Singh <[email protected]> wrote:
> Could be the cause. I would run 2 and then 4 concurrent clients to see how
> they behave. What's your client written in? How are you managing your
> connection?
>
> --
> Rahul Singh
> [email protected]
>
> Anant Corporation
>
> On Feb 6, 2018, 8:50 AM -0500, mohsin k <[email protected]>, wrote:
>
> Thanks, Jeff, will definitely check the trace. Also, one strange thing I
> noticed: with the number of threads up to 64, the latency is around 3 ms,
> but as the number of threads increases, the latency also increases.
> Eventually, at a thread count of 609, the latency is around 30 ms. I am
> using a single client to load test the 4-node cluster. Is this because the
> client is the bottleneck?
>
> On Mon, Feb 5, 2018 at 8:05 PM, Jeff Jirsa <[email protected]> wrote:
>
>> Also: the coordinator handles tracing and read repair. Make sure tracing
>> is off for production. Have your data repaired if possible to eliminate
>> that.
>>
>> Use tracing to see what's taking the time.
>>
>> --
>> Jeff Jirsa
>>
>> On Feb 5, 2018, at 6:32 AM, Jeff Jirsa <[email protected]> wrote:
>>
>> There are two parts to latency on the Cassandra side: local and
>> coordinator.
>>
>> When you read, the node to which you connect coordinates the request to
>> the node which has the data (potentially itself). A long tail in
>> coordinator latencies tends to be the coordinator itself GCing, which will
>> happen from time to time. If it's more consistently high, it may be
>> natural latencies in your cluster (i.e. your requests are going cross-WAN
>> and the other DC is 10-20 ms away).
>>
>> If the latency is seen in p99 but not p50, you can almost always
>> speculatively read from another coordinator (driver-level speculation)
>> after a millisecond or so.
>>
>> --
>> Jeff Jirsa
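As a concrete illustration of the tracing Jeff mentions, a minimal cqlsh
sketch against the table from this thread ('some-user-id' below is a
placeholder partition key):

    -- Trace a single read to see where the time is spent.
    TRACING ON;
    SELECT segmentid FROM stresstest.user_to_segment WHERE userid = 'some-user-id';
    TRACING OFF;

cqlsh prints each trace event with its source node and elapsed microseconds,
which makes it possible to separate coordinator time from the time spent on
the replica that actually serves the read.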
>> On Feb 5, 2018, at 5:41 AM, mohsin k <[email protected]> wrote:
>>
>> Thanks for the response, @Nicolas. I was considering the total read
>> latency from the client to the server (as shown in the image above), which
>> is around 30 ms and which I want to get down to around 3 ms (client and
>> server are both on the same network). I did not consider the read latency
>> reported by the server (which I should have). I monitored CPU, memory and
>> the JVM lifecycle, which are all at safe levels. *I think the difference
>> (0.03 ms vs. 30 ms) might be because of low network bandwidth, correct me
>> if I am wrong.*
>>
>> I did reduce chunk_length_in_kb to 4 KB, but I couldn't get a considerable
>> amount of difference, might be because there is less room for improvement
>> on the server side.
>>
>> Thanks again.
>>
>> On Mon, Feb 5, 2018 at 6:52 PM, Nicolas Guyomar <[email protected]> wrote:
>>
>>> Your row hit rate is 0.971, which is already very high; IMHO there is
>>> "nothing" left to do here if you can afford to store your entire dataset
>>> in memory.
>>>
>>> Local read latency: 0.030 ms already seems good to me. What makes you
>>> think that you can achieve more with the relatively "small" box you are
>>> using?
>>>
>>> You have to keep an eye on other metrics which might be a limiting
>>> factor, like CPU usage, the JVM heap lifecycle and so on.
>>>
>>> For a read-heavy workload it is sometimes advised to reduce
>>> chunk_length_in_kb from the default 64 KB to 4 KB, see if it helps!
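Nicolas's 64 KB to 4 KB suggestion is a per-table compression option; a
minimal sketch of the change, keeping the LZ4 compressor the table already
uses:

    -- Smaller compression chunks mean less data decompressed per point read.
    ALTER TABLE stresstest.user_to_segment
      WITH compression = {'class': 'org.apache.cassandra.io.compress.LZ4Compressor',
                          'chunk_length_in_kb': '4'};

Existing SSTables keep their old chunk size until they are rewritten, for
example by compaction or nodetool upgradesstables -a.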
>>> On 5 February 2018 at 13:09, mohsin k <[email protected]> wrote:
>>>
>>>> Hey Rahul,
>>>>
>>>> Each partition has around 10 clustering keys. Based on nodetool, I can
>>>> roughly estimate the partition size to be less than 1 KB.
>>>>
>>>> On Mon, Feb 5, 2018 at 5:37 PM, mohsin k <[email protected]> wrote:
>>>>
>>>>> Hey Nicolas,
>>>>>
>>>>> My goal is to reduce latency as much as possible. I did wait for
>>>>> warmup. The test ran for more than 15 mins, I am not sure why it shows
>>>>> 2 mins though.
>>>>>
>>>>> On Mon, Feb 5, 2018 at 5:25 PM, Rahul Singh <[email protected]> wrote:
>>>>>
>>>>>> What is the average size of your partitions / rows? 1GB may not be
>>>>>> enough.
>>>>>>
>>>>>> Rahul
>>>>>>
>>>>>> On Feb 5, 2018, 6:52 AM -0500, mohsin k <[email protected]>, wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been looking into different configurations for tuning my
>>>>>> Cassandra servers. Initially I load tested the servers using the
>>>>>> cassandra-stress tool with default configs, and then tuned one config
>>>>>> at a time to measure the impact of each change. The first config I
>>>>>> tried was setting *row_cache_size_in_mb* to 1000 (MB) in the yaml and
>>>>>> adding caching = {'keys': 'ALL', *'rows_per_partition': 'ALL'*} to the
>>>>>> table. After changing these configs, I observed that latency increased
>>>>>> rather than decreased. It would be really helpful to understand why
>>>>>> this is the case and what steps must be taken to decrease the latency.
>>>>>>
>>>>>> I am running a cluster with 4 nodes.
>>>>>>
>>>>>> Following is my schema:
>>>>>>
>>>>>> CREATE TABLE stresstest.user_to_segment (
>>>>>>     userid text,
>>>>>>     segmentid text,
>>>>>>     PRIMARY KEY (userid, segmentid)
>>>>>> ) WITH CLUSTERING ORDER BY (segmentid DESC)
>>>>>>     AND bloom_filter_fp_chance = 0.1
>>>>>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
>>>>>>     AND comment = 'A table to hold blog segment user relation'
>>>>>>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>>>>>>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>>>>     AND crc_check_chance = 1.0
>>>>>>     AND dclocal_read_repair_chance = 0.1
>>>>>>     AND default_time_to_live = 0
>>>>>>     AND gc_grace_seconds = 864000
>>>>>>     AND max_index_interval = 2048
>>>>>>     AND memtable_flush_period_in_ms = 0
>>>>>>     AND min_index_interval = 128
>>>>>>     AND read_repair_chance = 0.0
>>>>>>     AND speculative_retry = '99PERCENTILE';
>>>>>>
>>>>>> Following are the node specs:
>>>>>> RAM: 4 GB
>>>>>> CPU: 4 cores
>>>>>> HDD: 250 GB
>>>>>>
>>>>>> Following is the output of 'nodetool info' after setting
>>>>>> row_cache_size_in_mb:
>>>>>>
>>>>>> ID                     : d97dfbbf-1dc3-4d95-a1d9-c9a8d22a3d32
>>>>>> Gossip active          : true
>>>>>> Thrift active          : false
>>>>>> Native Transport active: true
>>>>>> Load                   : 10.94 MiB
>>>>>> Generation No          : 1517571163
>>>>>> Uptime (seconds)       : 9169
>>>>>> Heap Memory (MB)       : 136.01 / 3932.00
>>>>>> Off Heap Memory (MB)   : 0.10
>>>>>> Data Center            : dc1
>>>>>> Rack                   : rack1
>>>>>> Exceptions             : 0
>>>>>> Key Cache              : entries 125881, size 9.6 MiB, capacity 100 MiB, 107 hits, 126004 requests, 0.001 recent hit rate, 14400 save period in seconds
>>>>>> Row Cache              : entries 125861, size 31.54 MiB, capacity 1000 MiB, 4262684 hits, 4388545 requests, 0.971 recent hit rate, 0 save period in seconds
>>>>>> Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
>>>>>> Chunk Cache            : entries 273, size 17.06 MiB, capacity 480 MiB, 325 misses, 126623 requests, 0.997 recent hit rate, NaN microseconds miss latency
>>>>>> Percent Repaired       : 100.0%
>>>>>> Token                  : (invoke with -T/--tokens to see all 256 tokens)
>>>>>>
>>>>>> Following is the output of nodetool cfstats:
>>>>>>
>>>>>> Total number of tables: 37
>>>>>> ----------------
>>>>>> Keyspace : stresstest
>>>>>>     Read Count: 4398162
>>>>>>     Read Latency: 0.02184742626579012 ms.
>>>>>>     Write Count: 0
>>>>>>     Write Latency: NaN ms.
>>>>>>     Pending Flushes: 0
>>>>>>         Table: user_to_segment
>>>>>>         SSTable count: 1
>>>>>>         SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
>>>>>>         Space used (live): 11076103
>>>>>>         Space used (total): 11076103
>>>>>>         Space used by snapshots (total): 0
>>>>>>         Off heap memory used (total): 107981
>>>>>>         SSTable Compression Ratio: 0.5123353861375962
>>>>>>         Number of partitions (estimate): 125782
>>>>>>         Memtable cell count: 0
>>>>>>         Memtable data size: 0
>>>>>>         Memtable off heap memory used: 0
>>>>>>         Memtable switch count: 2
>>>>>>         Local read count: 4398162
>>>>>>         Local read latency: 0.030 ms
>>>>>>         Local write count: 0
>>>>>>         Local write latency: NaN ms
>>>>>>         Pending flushes: 0
>>>>>>         Percent repaired: 0.0
>>>>>>         Bloom filter false positives: 0
>>>>>>         Bloom filter false ratio: 0.00000
>>>>>>         Bloom filter space used: 79280
>>>>>>         Bloom filter off heap memory used: 79272
>>>>>>         Index summary off heap memory used: 26757
>>>>>>         Compression metadata off heap memory used: 1952
>>>>>>         Compacted partition minimum bytes: 43
>>>>>>         Compacted partition maximum bytes: 215
>>>>>>         Compacted partition mean bytes: 136
>>>>>>         Average live cells per slice (last five minutes): 5.719932432432432
>>>>>>         Maximum live cells per slice (last five minutes): 10
>>>>>>         Average tombstones per slice (last five minutes): 1.0
>>>>>>         Maximum tombstones per slice (last five minutes): 1
>>>>>>         Dropped Mutations: 0
>>>>>>
>>>>>> Following are my results (the blue graph is before setting
>>>>>> row_cache_size_in_mb, orange is after):
>>>>>>
>>>>>> Thanks,
>>>>>> Mohsin
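For reference, the row-cache change described in mohsin's original question
combines a per-node setting with a table-level option; presumably something
like the following:

    -- Table-level opt-in to the row cache (as described in the question).
    ALTER TABLE stresstest.user_to_segment
      WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
    -- Plus, in cassandra.yaml on each node (not CQL):
    --   row_cache_size_in_mb: 1000

With roughly 125,000 partitions averaging 136 bytes each (per the cfstats
above), the whole table fits in that cache, which is consistent with the
0.971 row-cache hit rate reported by nodetool info.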
