Re: Nodetool command to pre-load the chunk cache

2023-03-24 Thread Chris Lohfink
Something additional to consider (outside C* fix) is using a tool like
happycache to have
consistent pagecache between them. Might be sufficient if the data is in
memory already.
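
As a rough illustration of warming the OS page cache (this is not happycache, and it makes no attempt to match the set of pages that were cached before the refresh), a minimal Python sketch could read every data file once so the kernel keeps the pages around. The function name, the "-Data.db" suffix filter and the directory layout are assumptions for illustration only. Note that this warms the page cache, not Cassandra's chunk cache, and it only helps if the data actually fits in memory.

```python
from pathlib import Path


def warm_page_cache(data_dir, suffix="-Data.db", chunk_size=1 << 20):
    """Read every file under data_dir whose name ends with `suffix`.

    Reading the bytes is enough to pull them into the OS page cache.
    Returns the total number of bytes read.
    The suffix and directory layout are assumptions, not a Cassandra API.
    """
    total = 0
    for path in sorted(Path(data_dir).rglob(f"*{suffix}")):
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk_size)
                if not block:
                    break
                total += len(block)
    return total
```

It would be called with the node's data directory, for example warm_page_cache("/path/to/cassandra/data"), before the automation moves on to the next node.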

Chris

On Tue, Mar 21, 2023 at 2:48 PM Jeff Jirsa wrote:

> We serialize the other caches to disk to avoid cold-start problems, so I
> don't see why we couldn't also serialize the chunk cache. Seems worth a
> JIRA to me.
>
> Until then, you can probably use the dynamic snitch (badness + severity)
> to route around newly started hosts.
>
> I'm actually pretty surprised the chunk cache is that effective, sort of
> nice to know.
>
>
>
> On Tue, Mar 21, 2023 at 10:17 AM Carlos Diaz wrote:
>
>> Hi Team,
>>
>> We are heavy users of Cassandra at a pretty big bank.  Security measures
>> require us to constantly refresh our C* nodes every x number of days.  We
>> normally do this in a rolling fashion, taking one node down at a time and
>> then refreshing it with a new instance.  This process has been working for
>> us great for the past few years.
>>
>> However, we recently started having issues when a newly refreshed
>> instance comes back online. Our automation waits a few minutes for the node
>> to become "ready (UN)" and then moves on to the next node.  The problem
>> that we are facing is that when the node is ready, the chunk cache is still
>> empty, so when the node starts accepting new connections, queries that go to
>> it take much longer to respond and this causes errors for our apps.
>>
>> I was thinking that it would be great if we had a nodetool command that
>> would allow us to prefetch a certain table or a set of tables to preload
>> the chunk cache.  Then we could simply add another check (nodetool info?),
>> to ensure that the chunk cache has been preloaded enough to handle queries
>> to this particular node.
>>
>> Would love to hear others' feedback on the feasibility of this idea.
>>
>> Thanks!
>>
>>
>>
>>


Re: Nodetool command to pre-load the chunk cache

2023-03-21 Thread Jeff Jirsa
We serialize the other caches to disk to avoid cold-start problems, so I don't
see why we couldn't also serialize the chunk cache. Seems worth a JIRA to
me.

Until then, you can probably use the dynamic snitch (badness + severity) to
route around newly started hosts.

I'm actually pretty surprised the chunk cache is that effective, sort of
nice to know.



On Tue, Mar 21, 2023 at 10:17 AM Carlos Diaz wrote:

> Hi Team,
>
> We are heavy users of Cassandra at a pretty big bank.  Security measures
> require us to constantly refresh our C* nodes every x number of days.  We
> normally do this in a rolling fashion, taking one node down at a time and
> then refreshing it with a new instance.  This process has been working for
> us great for the past few years.
>
> However, we recently started having issues when a newly refreshed instance
> comes back online. Our automation waits a few minutes for the node to
> become "ready (UN)" and then moves on to the next node.  The problem that
> we are facing is that when the node is ready, the chunk cache is still
> empty, so when the node starts accepting new connections, queries that go to
> it take much longer to respond and this causes errors for our apps.
>
> I was thinking that it would be great if we had a nodetool command that
> would allow us to prefetch a certain table or a set of tables to preload
> the chunk cache.  Then we could simply add another check (nodetool info?),
> to ensure that the chunk cache has been preloaded enough to handle queries
> to this particular node.
>
> Would love to hear others' feedback on the feasibility of this idea.
>
> Thanks!
>
>
>
>


Re: Nodetool command to pre-load the chunk cache

2023-03-21 Thread Bowen Song via user
It sounds like a bad policy, and you should push for that to be changed. 
Failing that, you have some options:


1. Use faster disks. This improves cold start performance, without 
relying on the caches.


2. Rely on row cache instead. It can be saved periodically and loaded at 
startup time.


3. Ensure read CL < RF, and rely on speculative retries. Note: you will 
need to avoid restarting two servers owning the same token range 
consecutively for this to work.
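
The constraint in option 3 can be checked mechanically before a rolling refresh. Below is a minimal Python sketch, assuming you already have a mapping from each node to the token ranges it replicates. The function name, the six-node ring and the RF=2 layout are all hypothetical and only there to show the check, not taken from any Cassandra tool.

```python
def consecutive_conflicts(order, ranges_by_node):
    """Return (node_a, node_b, shared_ranges) for every adjacent pair in
    `order` whose replicated token ranges overlap."""
    conflicts = []
    for a, b in zip(order, order[1:]):
        shared = ranges_by_node[a] & ranges_by_node[b]
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts


# Hypothetical ring: 6 nodes, RF=2, node i replicates ranges i and i-1.
nodes = "ABCDEF"
ranges_by_node = {n: {i, (i - 1) % 6} for i, n in enumerate(nodes)}

naive_order = list("ABCDEF")        # ring order: every neighbour overlaps
alternating_order = list("ACEBDF")  # skips neighbours in the ring
```

With this made-up layout, consecutive_conflicts(naive_order, ranges_by_node) reports five overlapping pairs, while the alternating order reports none. Real clusters with vnodes or a higher RF will have far more overlap, so a conflict-free order may not exist and a pause between restarts may be needed instead.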


These are off the top of my head, but I'm sure there are more ways to do it. 
You should decide based on your situation.


BTW, manually loading the chunk cache is never going to work unless you 
know what the hot data is. Loading a whole table into the chunk cache makes no 
sense unless the table on each server can fit in 512 MB of memory, but 
then why do you even need Cassandra?
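
To put rough numbers on the point above, here is a small Python sketch of the arithmetic. It takes the 512 MB figure mentioned above as the cache size, and the 200 GiB per-node table size is a made-up example value. Both function names are hypothetical.

```python
CHUNK_CACHE_BYTES = 512 * 1024**2  # the 512 MB figure from the message above


def fits_in_chunk_cache(table_bytes_on_node, cache_bytes=CHUNK_CACHE_BYTES):
    """True if the node's whole share of the table could sit in the cache."""
    return table_bytes_on_node <= cache_bytes


def cacheable_fraction(table_bytes_on_node, cache_bytes=CHUNK_CACHE_BYTES):
    """Upper bound on the fraction of the table the cache can hold."""
    if table_bytes_on_node <= 0:
        return 1.0
    return min(1.0, cache_bytes / table_bytes_on_node)


# Made-up example: 200 GiB of one table on a node.
example_table_bytes = 200 * 1024**3
```

For the made-up 200 GiB table, cacheable_fraction(example_table_bytes) is 0.0025, so at most a quarter of one percent of the table could be cached. A preload is only useful if it targets the hot subset.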



On 21/03/2023 17:15, Carlos Diaz wrote:

Hi Team,

We are heavy users of Cassandra at a pretty big bank. Security 
measures require us to constantly refresh our C* nodes every x number 
of days.  We normally do this in a rolling fashion, taking one node 
down at a time and then refreshing it with a new instance.  This 
process has been working for us great for the past few years.


However, we recently started having issues when a newly refreshed 
instance comes back online. Our automation waits a few minutes for the 
node to become "ready (UN)" and then moves on to the next node.  The 
problem that we are facing is that when the node is ready, the chunk 
cache is still empty, so when the node starts accepting new 
connections, queries that go to it take much longer to respond and this 
causes errors for our apps.


I was thinking that it would be great if we had a nodetool command 
that would allow us to prefetch a certain table or a set of tables to 
preload the chunk cache.  Then we could simply add another check 
(nodetool info?), to ensure that the chunk cache has been preloaded 
enough to handle queries to this particular node.


Would love to hear others' feedback on the feasibility of this idea.

Thanks!