Re: Full repair caused disk space increase issue

2018-01-09 Thread Jon Haddad
The old files will not be split.  TWCS doesn’t ever do that.  


Re: Full repair caused disk space increase issue

2018-01-09 Thread wxn...@zjqunshuo.com
Hi Alex,
After I changed one node to TWCS using a JMX command, it started to 
compact. I expected the old large sstable files to be split into smaller 
ones according to the time bucket, but I still got large sstable files.

JMX command used:
set CompactionParametersJson 
{"class":"com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy","compaction_window_unit":"DAYS","compaction_window_size":"8"}

Logs:
INFO  [CompactionExecutor:4] 2018-01-09 15:55:04,525 CompactionManager.java:654 
- Will not compact 
/mnt/hadoop/cassandra/data/system/batchlog-0290003c977e397cac3efdfdc01d626b/lb-37-big:
 it is not an active sstable
INFO  [CompactionExecutor:4] 2018-01-09 15:55:04,525 CompactionManager.java:664 
- No files to compact for user defined compaction

Does the last log line mean something?

Cheers,
-Simon
 

Re: Full repair caused disk space increase issue

2018-01-04 Thread wxn...@zjqunshuo.com
Thanks Alex. Some nodes have finished anticompaction and disk space got 
reclaimed as you mentioned.
BTW, after reading your post 
(http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html) on TWCS, I 
decided to use TWCS, and doing the full repair is part of the preparation 
for changing to TWCS.
 


Re: Full repair caused disk space increase issue

2018-01-04 Thread Alexander Dejanovski
Hi Simon,

Since Cassandra 2.2, anticompaction is performed in all types of repair,
except subrange repair.
Given that you have some very big SSTables, the temporary space used by
anticompaction (which does the opposite of compaction: it reads one sstable
and writes two) will inflate your disk usage while it's running, peaking
when the anticompactions are close to completion.
The anticompaction reported by compactionstats is currently using an extra
147GB * [compression ratio]. With a compression ratio of 0.3, for example,
that would be about 44GB, which will get reclaimed shortly after the
anticompaction is over.
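That estimate can be scripted as a quick back-of-the-envelope check; the
147 GB and 0.3 figures below are just the example numbers from this thread,
so substitute your own compactionstats total and the compression ratio
reported for the table by nodetool cfstats:

```shell
# Back-of-the-envelope: on-disk size of a running anticompaction, from the
# uncompressed figure shown by "nodetool compactionstats" and the table's
# compression ratio (147 and 0.3 are the example numbers from this thread).
uncompressed_gb=147
compression_ratio=0.3
awk -v u="$uncompressed_gb" -v r="$compression_ratio" \
  'BEGIN { printf "estimated temporary on-disk usage: %.1f GB\n", u * r }'
# → estimated temporary on-disk usage: 44.1 GB
```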

You can check the current on-disk overhead of compaction by listing
temporary sstables: *tmp*Data.db
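A minimal sketch of totalling those temporary files (the data directory
path is an assumption; point it at your data_file_directories):

```shell
# Sum the sizes of temporary sstables (*tmp*Data.db) to see the space
# currently held by in-flight (anti)compactions. DATA_DIR is an assumed
# path; adjust it to your data_file_directories setting.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
find "$DATA_DIR" -name '*tmp*Data.db' -printf '%s\n' 2>/dev/null |
  awk '{ t += $1 } END { printf "temporary sstables: %.2f GB\n", t / 1024^3 }'
```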

It's also possible that some overstreaming occurred during your repair,
which will increase the size on disk until the duplicate data gets
compacted away over time.
You should also check whether you have snapshots sticking around, by
running "nodetool listsnapshots".
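If snapshots do show up, something like this sketch shows how much space
they actually pin (the data directory path is an assumption; adjust it to
your layout):

```shell
# Snapshots are hard links, so they are free at creation time but pin disk
# space once the original sstables are compacted away.
# DATA_DIR is an assumed path; adjust it to your data_file_directories.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
du -sh "$DATA_DIR"/*/*/snapshots/* 2>/dev/null || true
# Reclaim the space by dropping snapshots you no longer need, e.g.:
#   nodetool clearsnapshot -t <snapshot_name>
```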

Now, you mention that you ran repair to evict tombstones. That is not what
repair does: tombstones are evicted through compaction, once they meet the
requirements (gc_grace_seconds has passed and all the cells of the
partition are involved in the same compaction).
If you want to optimize tombstone eviction, especially with STCS, I advise
turning on unchecked_tombstone_compaction, which allows Cassandra to
trigger single-sstable compactions when an SSTable holds more than 20%
estimated droppable tombstones.
You can check your current droppable tombstone ratio by running
sstablemetadata on all your sstables.
A command like the following should do the trick (it will print out min/max
timestamps too):

for f in *Data.db; do
  meta=$(sudo sstablemetadata "$f")
  echo -e "Max:" \
    $(date --date=@$(echo "$meta" | grep 'Maximum time' | cut -d' ' -f3 | cut -c 1-10) '+%m/%d/%Y') \
    "Min:" \
    $(date --date=@$(echo "$meta" | grep 'Minimum time' | cut -d' ' -f3 | cut -c 1-10) '+%m/%d/%Y') \
    $(echo "$meta" | grep droppable) ' \t ' \
    $(ls -lh "$f" | awk '{print $5" "$6" "$7" "$8" "$9}')
done | sort

Check whether the 20% threshold is high enough by verifying that newly
created SSTables don't already reach that level, and raise it accordingly
if they do (for example, to 50%).

To activate tombstone compactions with a 50% droppable tombstone
threshold, run the following statement on your table:

ALTER TABLE cargts.eventdata WITH compaction =
{'class':'SizeTieredCompactionStrategy',
'unchecked_tombstone_compaction':'true', 'tombstone_threshold':'0.5'}

Picking the right threshold is up to you.
Note that tombstone compactions running more often will use temporary space
as well, but they should help evict tombstones faster when the partitions
are contained within a single SSTable.

If you are dealing with TTLed data and your partitions spread over time,
I'd strongly suggest considering TWCS instead of STCS, as it can remove
fully expired SSTables much more efficiently.
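As a reference, that switch can also be made cluster-wide through a schema
change rather than per-node JMX; this is only a sketch, reusing the class
name and window settings quoted elsewhere in this thread (on 2.2, TWCS is
an external jar, hence the com.jeffjirsa package name — verify it matches
the jar you deploy):

```sql
-- Sketch: switch cargts.eventdata to TWCS with 8-day windows (values
-- mirror the JMX command used in this thread; adjust to your workload).
ALTER TABLE cargts.eventdata WITH compaction = {
  'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': '8'
};
```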

Cheers,


On Fri, Jan 5, 2018 at 7:43 AM wxn...@zjqunshuo.com wrote:

> [...]


-- 
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Full repair caused disk space increase issue

2018-01-04 Thread wxn...@zjqunshuo.com
Hi All,
In order to evict tombstones, I issued a full repair with the command 
"nodetool repair -pr -full". The data load reported by "nodetool status" 
did decrease by about 100G per node, but the actual disk usage increased by 
500G per node. The repair is still ongoing and is leaving less and less 
disk space for me.

From compactionstats, I see "Anticompaction after repair". My understanding 
is that anticompaction serves incremental repair: it rewrites sstable 
metadata to mark which files are repaired, so they are skipped by the next 
repair. But I'm doing a full repair, so why anticompaction?
9e09c490-f1be-11e7-b2ea-b3085f85ccae   Anticompaction after repair   cargts   eventdata   147.3 GB   158.54 GB   bytes   92.91%

There are paired sstable files, i.e. files with the same timestamp, as 
below. I guess one or both of them should have been deleted during repair, 
but for some unknown reason the repair process failed to delete them.
-rw-r--r-- 1 root root 237G Dec 31 12:48 lb-123800-big-Data.db
-rw-r--r-- 1 root root 243G Dec 31 12:48 lb-123801-big-Data.db

C* version is 2.2.8 with STCS. Any ideas?

Cheers,
-Simon