I can see from cfhistograms that I do have some wide rows (see below). I set
the trace probability as you suggested, but the output doesn’t appear to tell
me which row was actually read, unless I missed something. I just see
“executing prepared statement”. Any ideas how I can find the row in question?
I am considering reducing read_request_timeout_in_ms: 5000 in cassandra.yaml to
limit the impact when this occurs.
Any help in identifying my issue would be GREATLY appreciated.
Cell Count per Partition
1 cells: 50449950
2 cells: 14281828
3 cells: 8093366
4 cells: 5029200
5 cells: 3103023
6 cells: 3059903
7 cells: 1903018
8 cells: 1509297
10 cells: 2420359
12 cells: 1624895
14 cells: 1171678
17 cells: 1289391
20 cells: 909777
24 cells: 852081
29 cells: 722925
35 cells: 587067
42 cells: 459473
50 cells: 358744
60 cells: 304146
72 cells: 244682
86 cells: 191045
103 cells: 155337
124 cells: 127061
149 cells: 98913
179 cells: 77454
215 cells: 59849
258 cells: 46117
310 cells: 35321
372 cells: 26319
446 cells: 19379
535 cells: 13783
642 cells: 9993
770 cells: 6973
924 cells: 4713
1109 cells: 3229
1331 cells: 2062
1597 cells: 1338
1916 cells: 773
2299 cells: 495
2759 cells: 268
3311 cells: 150
3973 cells: 100
4768 cells: 42
5722 cells: 24
6866 cells: 12
8239 cells: 9
9887 cells: 3
11864 cells: 0
14237 cells: 5
17084 cells: 1
20501 cells: 0
24601 cells: 2
29521 cells: 0
35425 cells: 0
42510 cells: 0
51012 cells: 0
61214 cells: 2
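To put numbers on the tail of the histogram above, here is a minimal Python sketch that parses "N cells: count" lines and counts how many partitions fall in buckets at or above a chosen offset. The function names and the 10000-cell threshold are illustrative assumptions, and the sample is only an excerpt of the output above. Note that the histogram holds counts only, so this shows how many wide partitions exist, not which partition keys they are.

```python
import re

# Matches lines such as "61214 cells: 2" from the "Cell Count per Partition" output.
LINE_RE = re.compile(r"^\s*(\d+)\s+cells:\s+(\d+)\s*$")


def parse_histogram(text):
    """Return a list of (cell_offset, partition_count) tuples, skipping other lines."""
    buckets = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            buckets.append((int(match.group(1)), int(match.group(2))))
    return buckets


def wide_partition_summary(buckets, threshold):
    """Count partitions in buckets whose offset is >= threshold.

    Returns (wide_count, total_count, wide_fraction).
    """
    total = sum(count for _, count in buckets)
    wide = sum(count for offset, count in buckets if offset >= threshold)
    fraction = wide / total if total else 0.0
    return wide, total, fraction


# Excerpt of the histogram above (first bucket plus the tail), not the full output.
sample = """\
1 cells: 50449950
9887 cells: 3
11864 cells: 0
14237 cells: 5
17084 cells: 1
20501 cells: 0
24601 cells: 2
61214 cells: 2
"""

buckets = parse_histogram(sample)
# The threshold is an arbitrary choice for illustration.
wide, total, fraction = wide_partition_summary(buckets, threshold=10000)
print(f"{wide} of {total} partitions are in buckets >= 10000 cells ({fraction:.8%})")
```

Pasting the full cfhistograms output into `sample` gives the figures for the whole column family.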
From: DuyHai Doan <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, July 24, 2014 at 3:01 PM
To: "[email protected]" <[email protected]>
Subject: Re: Hot, large row
"How can I detect wide rows?" -->
nodetool cfhistograms <keyspace> <suspected column family>
Look at the "Column count" column (the last column) and identify a line in it
with a very high "Offset" value. In a well-designed application you should
have a Gaussian distribution where 80% of your rows have a similar number of
columns.
"Anyone know what debug level I can set so that I can see what reads the hot
node is handling? " -->
Run "nodetool settraceprobability <value>", where value is a small number
(0.001), on the node where you encounter the issue. Activate tracing for a
while (5 mins), then deactivate it (value = 0). Then look into the
system_traces tables "events" & "sessions". It may or may not help, since with
this value only about one request in 1000 is traced.
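Once some traces have been collected, one way to narrow things down is to rank the traced sessions by duration. The sketch below is only an illustration: it assumes the rows of system_traces.sessions have already been fetched (with whatever client you use) into plain dicts, and that the table exposes session_id, duration and request columns, which is worth verifying against your own schema. The sample rows are made up.

```python
def slowest_sessions(rows, limit=5):
    """Return up to `limit` session rows, slowest first.

    Rows without a recorded duration (e.g. sessions that never finished)
    are skipped.
    """
    finished = [row for row in rows if row.get("duration") is not None]
    return sorted(finished, key=lambda row: row["duration"], reverse=True)[:limit]


# Hypothetical rows standing in for the contents of system_traces.sessions.
rows = [
    {"session_id": "a", "duration": 1200, "request": "Execute CQL3 prepared query"},
    {"session_id": "b", "duration": 985000, "request": "Execute CQL3 prepared query"},
    {"session_id": "c", "duration": None, "request": "Execute CQL3 prepared query"},
    {"session_id": "d", "duration": 300, "request": "Execute CQL3 query"},
]

for row in slowest_sessions(rows, limit=2):
    print(row["session_id"], row["duration"], row["request"])
```

The session_id values of the slowest sessions can then be used to pull the matching rows from the "events" table and see where the time was spent.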
"Any way to get the server to blacklist these wide rows automatically?" --> No
On Thu, Jul 24, 2014 at 8:48 PM, Keith Wright <[email protected]> wrote:
Hi all,
We are seeing an issue where, basically daily, one of our nodes spikes in load
and churns under CMS heap pressure. It appears that reads are backing up, and
my guess is that our application is reading a large row repeatedly. Our write
structure can lend itself to wide rows very infrequently (<0.001%), and we do
our best to detect and delete them, but obviously we’re missing a case. Hoping
for assistance on the following questions:
* How can I detect wide rows?
* Anyone know what debug level I can set so that I can see what reads the
hot node is handling? I’m hoping to see the “bad” row
* Any way to get the server to blacklist these wide rows automatically?
We’re using C* 2.0.6 with Vnodes.
Thanks