Hi Jon,

We are on a multi-datacenter (on-prem) setup. We also noticed a lot of
messages like the ones below:

DEBUG [GossipStage:1] 2020-02-10 09:38:52,953 FailureDetector.java:457 - Ignoring interval time of 3258125997 for /10.x.x.x
DEBUG [GossipStage:1] 2020-02-10 09:38:52,954 FailureDetector.java:457 - Ignoring interval time of 2045630029 for /10.y.y.y
DEBUG [GossipStage:1] 2020-02-10 09:38:52,954 FailureDetector.java:457 - Ignoring interval time of 2045416737 for /10.z.z.z

Currently phi_convict_threshold is not set, so it takes the default of 8.
Can this also cause hints to build up even when we can see that all nodes
are UP? The recommended value of phi_convict_threshold in an AWS
multi-datacenter environment is 12.
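If we do end up raising it, my understanding is that it is a single line
in cassandra.yaml on each node (plus a rolling restart), something like:

# cassandra.yaml (sketch): the setting is commented out by default,
# which leaves it at 8
phi_convict_threshold: 12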
Thanks
Surbhi

On Sun, 9 Feb 2020 at 21:42, Surbhi Gupta <surbhi.gupt...@gmail.com> wrote:

> Thanks a lot, Jon.
> Will try the recommendations and let you know the results.
>
> On Fri, Feb 7, 2020 at 10:52 AM Jon Haddad <j...@jonhaddad.com> wrote:
>
>> There are a few things you can do here that might help.
>>
>> First off, if you're using the default heap settings, that's a serious
>> problem. If you've got the headroom, my recommendation is to use a 16GB
>> heap with 12GB new gen and pin your memtable heap space to 2GB. Set your
>> max tenuring threshold to 6 and your survivor ratio to 6. You don't need a
>> lot of old gen space with Cassandra; almost everything that shows up
>> there is memtable related, and we allocate a *lot* whenever we read data
>> off disk.
>>
>> Most folks use the default disk read-ahead setting of 128KB. You can
>> check this setting using blockdev --report, under the RA column. You'll
>> see 256 there; that's in 512-byte sectors. MVs rely on a read before a
>> write, so for every read off disk you'll pull an additional 128KB into
>> your page cache. This is usually a waste and puts WAY too much pressure
>> on your disk. On SSD, I always change this to 4KB.
>>
>> Next, be sure you're setting your compression rate accordingly. I wrote
>> a long post on the topic here:
>> https://thelastpickle.com/blog/2018/08/08/compression_performance.html.
>> Our default compression is very unfriendly for read-heavy workloads if
>> you're reading small rows. If your records are small, a 4KB compression
>> chunk length is your friend.
>>
>> I have some slides showing pretty good performance improvements from the
>> above two changes. Specifically, I went from 16K reads a second at 180ms
>> p99 latency up to 63K reads a second at 21ms p99. Disk usage dropped by
>> a factor of 10. Throw in those JVM changes I recommended and things
>> should improve even further.
>>
>> Generally speaking, I recommend avoiding MVs, as they can be a giant mine
>> if you aren't careful. They're not doing any magic behind the scenes that
>> makes scaling easier, and in a lot of cases they're a hindrance. You
>> still need to understand the underlying data and how it's laid out to use
>> them properly, which is 99% of the work.
>>
>> Jon
>>
>> On Fri, Feb 7, 2020 at 10:32 AM Michael Shuler <mich...@pbandjelly.org>
>> wrote:
>>
>>> That JIRA still says Open, so no, it has not been fixed (unless there's
>>> a fixed duplicate in JIRA somewhere).
>>>
>>> For clarification, you could update that ticket with a comment including
>>> your environmental details, usage of MVs, etc. I'll bump the priority up
>>> and include some possible branchX fixvers.
>>>
>>> Michael
>>>
>>> On 2/7/20 10:53 AM, Surbhi Gupta wrote:
>>> > Hi,
>>> >
>>> > We are getting hit by the bug below.
>>> > Other than lowering hinted_handoff_throttle_in_kb to 100, is there
>>> > any other workaround?
>>> >
>>> > https://issues.apache.org/jira/browse/CASSANDRA-13810
>>> >
>>> > Any idea if it got fixed in a later version?
>>> > We are on open source Cassandra 3.11.1.
>>> >
>>> > Thanks
>>> > Surbhi
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org