[
https://issues.apache.org/jira/browse/CASSANDRA-18619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Miklosovic updated CASSANDRA-18619:
------------------------------------------
Description:
Let's have a node with 8 cores and let's run "nodetool setconcurrentcompactors 4".
When running "nodetool garbagecollect", it is possible to specify the number
of "jobs" via the -j flag. If I set it to 2, at most two threads will be
compacting; if I set it to 6, it will in practice be capped at 4, as that is my
"concurrentcompactors" setting.
So far so good.
However, when I set jobs to 4 and run garbage collection on two tables,
tb1 and tb2, like this:
{code:java}
nodetool garbagecollect -j 4 -- keyspace1 tb1 tb2
{code}
What it does is gc the first table, at most 4 sstables at a time, AND ONLY THEN
does it start to gc the second table.
In other words, if tb1 has 10 sstables to gc and I have at most 4 jobs, it will
gc them, but if one looks at compactionstats as the gc progresses, there might
be e.g. just 2 sstables left to gc, so in theory there are slots for two
additional sstables to be gc-ed as well, but this will not happen. It waits
until the first table is fully gc-ed and only then starts to gc the second one
with 4 threads.
This might be improved so that as soon as there is a free job thread, the next
sstable is scheduled to be gc-ed, even if it is from a different CQL table.
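To illustrate the difference, here is a toy simulation of the two scheduling strategies. It is only a sketch and not Cassandra code. The class and method names are hypothetical, and the sstable counts (10 for tb1, 6 for tb2) and the equal per-sstable duration of 10 time units are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model: each sstable gc is a task with a fixed duration,
// and "jobs" is the number of worker threads allowed to run at once.
final class GcSchedulingSketch {

    // Greedy scheduling on a simulated clock: every task goes to the worker
    // that becomes free first. Returns the time the last task finishes.
    static long runBatch(List<Long> durations, int workers, long startTime) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < workers; i++) {
            freeAt.add(startTime);
        }
        long end = startTime;
        for (long duration : durations) {
            long finish = freeAt.poll() + duration;
            freeAt.add(finish);
            end = Math.max(end, finish);
        }
        return end;
    }

    // Behaviour described above: tables are processed one after another,
    // and the next table starts only when the previous one is fully done.
    static long perTableBarrier(List<List<Long>> tables, int workers) {
        long now = 0;
        for (List<Long> table : tables) {
            now = runBatch(table, workers, now);
        }
        return now;
    }

    // Suggested behaviour: sstables of all tables share one queue, so a free
    // worker picks up the next sstable regardless of which table owns it.
    static long sharedQueue(List<List<Long>> tables, int workers) {
        List<Long> all = new ArrayList<>();
        for (List<Long> table : tables) {
            all.addAll(table);
        }
        return runBatch(all, workers, 0);
    }

    public static void main(String[] args) {
        int jobs = 4;
        List<List<Long>> tables = List.of(
                Collections.nCopies(10, 10L),  // tb1: 10 sstables (assumed)
                Collections.nCopies(6, 10L));  // tb2: 6 sstables (assumed)

        System.out.println("per-table barrier: " + perTableBarrier(tables, jobs)); // prints 50
        System.out.println("shared queue: " + sharedQueue(tables, jobs));          // prints 40
    }
}
```

With these made-up numbers, the last two sstables of tb1 leave two workers idle for one round, which is the time the shared queue wins back.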
was:
Lets have a node with 8 cores and lets do "nodetool setconcurrentcompactors 4"
When I am doing "nodetool garbagecollect", there is a possibility to specify
number of "jobs" via -j flag. If I set it to "2", max two threads will be
compacting, if I set it to 6, it will be in practice capped to 4 as that is my
"concurrentcompactors" setting.
So far good.
However, when I set jobs to 4 and I execute garbagecollecting on two tables,
tb1 and tb2 like this:
{code}
nodetool garbagecollect -j 4 -- keyspace1 tb1 tb2
{code}
What it does is that it will start to gc first table, 4 tables at max AND THEN
it will start to gc the second table.
In other words, if tb1 has 10 tables to gc and I have 4 jobs at max, it will gc
them, but if one looks into compactionstats, she sees that as gc-ing
progresses, there might be e.g. just 2 tables left to gc so in theory there is
a slot for two additional sstables to gc-ed as well but this will not happen.
It will wait until the first table is gc-ed and then it will start to gc the
second one with 4 threds.
This might be improved so as soon as there is a free job thread to gc, next
sstable would be scheduled to be gc-ed even it is from a different cql table.
> nodetool garbagecollect does not use all available compaction executors
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-18619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18619
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Stefan Miklosovic
> Priority: Normal
>