date:20230418



[ 
https://issues.apache.org/jira/browse/CASSANDRA-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713920#comment-17713920
 ] 

Claude Warren commented on CASSANDRA-8928:
--

[~nasnousssi], yes, it is one of the three tickets I am working on. There is 
currently an issue with downgrading from 4.x to 3.x. You said you were able to 
get the downgrade to work on a single node. Can you tell me which versions you 
downgraded from and to?

> Add downgradesstables
> -
>
> Key: CASSANDRA-8928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8928
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Tools
>Reporter: Jeremy Hanna
>Assignee: Claude Warren
>Priority: Low
>  Labels: remove-reopen
> Fix For: 5.x
>
>
> As mentioned in other places such as CASSANDRA-8047 and in the wild, 
> sometimes you need to go back.  A downgrade sstables utility would be nice 
> for a lot of reasons and I don't know that supporting going back to the 
> previous major version format would be too much code since we already support 
> reading the previous version.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713913#comment-17713913
 ] 

Stefan Miklosovic commented on CASSANDRA-18336:
---

Thanks guys, I am building it.

> Sstables were cleared when OOM and best_effort is used
> --
>
> Key: CASSANDRA-18336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: NAIZHEN QUE
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
> Attachments: 4031679897782_.pic.jpg, 4241679905694_.pic.jpg, 
> system.log.2023-02-21.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> 1. This exception occurs in the system:
> {code:java}
> ERROR [CompactionExecutor:351627] 2023-02-21 17:59:20,721 
> CassandraDaemon.java:581 - Exception in thread 
> Thread[CompactionExecutor:351627,1,main]
> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:167)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:170)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
>     at 
> org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:365)
>     at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:337)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124)
>     at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:64)
>     at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:137)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:193)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:77)
>     at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:100)
>     at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:298)
>     at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.IOException: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1016)
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:163)
>     ... 23 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1013)
> {code}
> 2. After restarting the node, the "Verifying logfile transaction" step runs 
> and all sstables are deleted:
> {code:java}
> INFO  [main] 2023-02-21 18:00:23,350 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Index.db
>  
> INFO  [main] 2023-02-21 18:00:23,615 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Data.db
>  
> INFO  [main] 2023-02-21 18:00:46,504 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb_txn_compaction_c923b230-b077-11ed-a081-5d5a5c990823.log
>  
> INFO  [main] 2023-02-21 18:00:46,510 LogTransaction.java:536 - Verifying 
> logfile transaction 
> [nb_txn_compaction_461935b0-b1ce-11ed-a081-5d5a5c990823.log in 
> /historyData/cassandra/data/kairosdb/data_points-870f
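
When triaging a report like the one quoted above, it can help to list exactly which files the startup cleanup removed. The following is a minimal sketch, assuming the raw system.log holds one entry per line (unlike the wrapped quote above) and that the message text matches the "Unfinished transaction log, deleting" entries shown in the report. The class and method names are hypothetical and are not part of Cassandra.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper, not part of Cassandra.
final class UnfinishedTxnLogScan {

    // Matches the message emitted in the quoted log and captures the path that follows it.
    private static final Pattern DELETING =
            Pattern.compile("Unfinished transaction log, deleting\\s+(\\S+)");

    // Returns the paths reported as deleted, in the order they appear in the log.
    static List<String> deletedPaths(List<String> logLines) {
        List<String> paths = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = DELETING.matcher(line);
            if (m.find()) {
                paths.add(m.group(1));
            }
        }
        return paths;
    }
}
```

Feeding it the lines of a system.log (for example via `Files.readAllLines`) yields the Index.db, Data.db and transaction log paths that were removed, which can then be compared against a backup or snapshot listing.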

[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-04-18 Thread Caleb Rackliffe (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713893#comment-17713893
 ] 

Caleb Rackliffe commented on CASSANDRA-18464:
-

What are the chances that we just drop Java 8 support for 6.0?

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 6.x
>
> Attachments: UseDirectIOFeatureForCommitLogFiles.patch
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation of the CommitLog I/O issue on large-core-count 
> systems in a previous email dated July-22; the link to that thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Two solutions looked possible for improving CommitLog I/O:
>  # Multi-threaded syncing
>  # Using Direct I/O through JNA
> I worked on the second option because of the following benefits compared to 
> the first one:
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O, 
> as learnt through FIO benchmarking.
>  # It reduces kernel file cache usage, which in turn reduces kernel I/O 
> activity for CommitLog files only.
>  # Overall CPU usage is reduced for flush activity. JVisualVM shows CPU usage 
> < 30% for the CommitLog syncer thread with the Direct I/O feature.
>  # The Direct I/O implementation is easier than the multi-threaded one.
> As per the community suggestion, less code complexity is good to have. Direct 
> I/O enablement looked promising, but there was one issue: Java 8 does not 
> have native support for enabling Direct I/O, so using the JNA library is a 
> must. The same implementation should also work across other versions of Java 
> (11 and beyond).
> I have completed the Direct I/O implementation; a summary of the changes in 
> the attached patch is given below.
>  # This implementation does not use Java file channels; the file is opened 
> through JNA to use the Direct I/O feature.
>  # New segment types are defined: “DirectIOSegment” for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
> purposes only).
>  # The JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and a 
> platform-specific file. Currently tested on Linux only.
>  # The patch allows the user to configure the optimum block size and 
> alignment if the default values are not suitable for the CommitLog disk.
>  # The following configuration options are provided in the cassandra.yaml 
> file:
> a. use_jna_for_commitlog_io: to use the JNA feature
> b. use_direct_io_for_commitlog: to use the Direct I/O feature
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default; can be changed to the required size)
>  The test matrix is complex, so the CommitLog-related test cases and the 
> TPCx-IoT benchmark were run. It works with both Java 8 and 11. Compressed 
> and encrypted segments are not supported yet; they can be enabled later based 
> on community feedback.
>  The following improvements are seen with Direct I/O enabled:
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  One more observation I would like to share: reading CommitLog files with 
> Direct I/O might help reduce node bring-up time after a node crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables the Direct I/O feature for CommitLog files. 
> Please check and share your feedback.




[jira] [Commented] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used

2023-04-18 Thread maxwellguo (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713839#comment-17713839
 ] 

maxwellguo commented on CASSANDRA-18336:


+1 on this, [~brandon.williams] What's your opinion?


[jira] [Commented] (CASSANDRA-18329) Upgrade jamm

2023-04-18 Thread Jonathan Ellis (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713812#comment-17713812
 ] 

Jonathan Ellis commented on CASSANDRA-18329:


As a more-actively-maintained alternative to jamm, Lucene has this: 
https://lucene.apache.org/core/9_5_0/core/org/apache/lucene/util/RamUsageEstimator.html

> Upgrade jamm
> 
>
> Key: CASSANDRA-18329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18329
> Project: Cassandra
>  Issue Type: Task
>  Components: Jamm
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
>
> Jamm is currently undergoing maintenance that will fix the JDK11 issues and 
> enable it to work with versions after JDK11, up to JDK17.
> This ticket serves as a placeholder for upgrading Jamm in Cassandra once 
> the new Jamm release is out.




[jira] [Commented] (CASSANDRA-17869) Add JDK17 option to cassandra-builds (build-scripts and jenkins dsl) and on jenkins agents

2023-04-18 Thread Ekaterina Dimitrova (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713780#comment-17713780
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17869:
-

Maybe because someone might use it in their environment, and we would break 
them if we just removed it?

> Add JDK17 option to cassandra-builds (build-scripts and jenkins dsl) and on 
> jenkins agents
> --
>
> Key: CASSANDRA-17869
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17869
> Project: Cassandra
>  Issue Type: Task
>  Components: Build
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Add a JDK17 option to the cassandra-builds build-scripts, which currently 
> support only options {{8}} and {{11}}.
> Add JDK17 to the matrix axes in the jenkins dsl.
> Ensure JDK17 is installed on all the jenkins agents.




[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713775#comment-17713775
 ] 

Brandon Williams commented on CASSANDRA-18464:
--

I applied this to a branch and had circle take a first pass.

||Branch||CI||
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/977/workflows/677ded2a-f1bf-416c-ae8b-bdf32f18a2eb], [j11|https://app.circleci.com/pipelines/github/driftx/cassandra/977/workflows/9554a2d4-0322-492d-9f6b-b7e70eb8dab0]|



[jira] [Updated] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18464:
-
Change Category: Performance
 Complexity: Normal
  Fix Version/s: 6.x
 Status: Open  (was: Triage Needed)


[jira] [Commented] (CASSANDRA-8928) Add downgradesstables

2023-04-18 Thread anis ben brahim (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713762#comment-17713762
 ] 

anis ben brahim commented on CASSANDRA-8928:


Hi [~claude]

Do you have any updates on this feature? After some testing on a single 
node, it seems that recovering only the system keyspaces from backups and 
downgrading the other keyspaces with this tool works. Is there any plan to 
continue working on this?

Best regards,

Anis


[jira] [Created] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-04-18 Thread Josh McKenzie (Jira)

Josh McKenzie created CASSANDRA-18464:
-

 Summary: Enable Direct I/O For CommitLog Files
 Key: CASSANDRA-18464
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
 Project: Cassandra
  Issue Type: New Feature
  Components: Local/Commit Log
Reporter: Josh McKenzie
Assignee: Amit Pawar
 Attachments: UseDirectIOFeatureForCommitLogFiles.patch

Relocating from [dev@ email 
thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]

 

I shared my investigation of the CommitLog I/O issue on large-core-count 
systems in a previous email dated July-22; the link to that thread is given 
below.
[https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]

Two solutions looked possible for improving CommitLog I/O:
 # Multi-threaded syncing
 # Using Direct I/O through JNA

I worked on the second option because of the following benefits compared to 
the first one:
 # Direct I/O read/write throughput is very high compared to non-Direct I/O, 
as learnt through FIO benchmarking.
 # It reduces kernel file cache usage, which in turn reduces kernel I/O 
activity for CommitLog files only.
 # Overall CPU usage is reduced for flush activity. JVisualVM shows CPU usage 
< 30% for the CommitLog syncer thread with the Direct I/O feature.
 # The Direct I/O implementation is easier than the multi-threaded one.

As per the community suggestion, less code complexity is good to have. Direct 
I/O enablement looked promising, but there was one issue: Java 8 does not 
have native support for enabling Direct I/O, so using the JNA library is a 
must. The same implementation should also work across other versions of Java 
(11 and beyond).
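
For comparison with the JNA route described above, here is a minimal sketch of Direct I/O using only the JDK's own API on newer Java versions. It is not taken from the attached patch, which goes through JNA. It relies on `ByteBuffer.alignedSlice` (Java 9+) and on `FileStore.getBlockSize` and `com.sun.nio.file.ExtendedOpenOption.DIRECT` (both Java 10+). The class and method names are hypothetical, and zero-padding the payload to a whole block is an assumption made for illustration.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Cassandra or of the attached patch.
final class DirectWriteSketch {

    // Returns a direct buffer whose start address is a multiple of `alignment`.
    // Assumes `alignment` is a power of two and `size` is a positive multiple of it.
    static ByteBuffer allocateAligned(int size, int alignment) {
        // Over-allocate by alignment - 1 so an aligned window of `size` bytes always fits.
        return ByteBuffer.allocateDirect(size + alignment - 1).alignedSlice(alignment);
    }

    // Writes `payload` at offset 0 of `file`, zero-padded up to a whole number of blocks.
    // Opening may fail on file systems that do not support Direct I/O.
    static int writeDirect(Path file, byte[] payload) throws IOException {
        Path dir = file.toAbsolutePath().getParent();
        int blockSize = (int) Files.getFileStore(dir).getBlockSize();
        int padded = Math.max(1, (payload.length + blockSize - 1) / blockSize) * blockSize;

        ByteBuffer buf = allocateAligned(padded, blockSize); // freshly allocated, so zero-filled
        buf.put(payload);
        buf.position(0).limit(padded); // Direct I/O needs block-sized transfers

        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, ExtendedOpenOption.DIRECT)) {
            int written = 0;
            while (buf.hasRemaining()) {
                written += ch.write(buf);
            }
            return written;
        }
    }
}
```

The point of the sketch is the alignment bookkeeping: with Direct I/O the buffer address, the file offset and the transfer length all have to line up with the block size, whichever way the file descriptor is obtained.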

I have completed the Direct I/O implementation; a summary of the changes in 
the attached patch is given below.
 # This implementation does not use Java file channels; the file is opened 
through JNA to use the Direct I/O feature.
 # New segment types are defined: “DirectIOSegment” for Direct I/O and 
“NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
purposes only).
 # The JNA write call is used to flush the changes.
 # New helper functions are defined in NativeLibrary.java and a 
platform-specific file. Currently tested on Linux only.
 # The patch allows the user to configure the optimum block size and alignment 
if the default values are not suitable for the CommitLog disk.
 # The following configuration options are provided in the cassandra.yaml file:
a. use_jna_for_commitlog_io: to use the JNA feature
b. use_direct_io_for_commitlog: to use the Direct I/O feature
c. direct_io_minimum_block_alignment: 512 (default)
d. nvme_disk_block_size: 32MiB (default; can be changed to the required size)
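
To make the role of the four options listed above concrete, here is a minimal sketch of a holder class with the stated defaults (512-byte alignment, 32 MiB block size). The class is hypothetical and does not reflect how the patch actually wires these settings. The validation rules (alignment is a power of two, block size is a multiple of the alignment) and the `alignUp` helper are assumptions based on how Direct I/O generally behaves, not on the patch.

```java
// Hypothetical holder for the options listed above; not the patch's actual classes.
final class CommitLogDirectIoOptions {
    final boolean useJna;        // use_jna_for_commitlog_io
    final boolean useDirectIo;   // use_direct_io_for_commitlog
    final int minBlockAlignment; // direct_io_minimum_block_alignment, in bytes
    final long diskBlockSize;    // nvme_disk_block_size, in bytes

    CommitLogDirectIoOptions(boolean useJna, boolean useDirectIo,
                             int minBlockAlignment, long diskBlockSize) {
        // Assumed rule: the alignment must be a positive power of two.
        if (minBlockAlignment <= 0 || Integer.bitCount(minBlockAlignment) != 1) {
            throw new IllegalArgumentException("alignment must be a power of two: " + minBlockAlignment);
        }
        // Assumed rule: the block size must be a positive multiple of the alignment.
        if (diskBlockSize <= 0 || diskBlockSize % minBlockAlignment != 0) {
            throw new IllegalArgumentException("block size must be a multiple of the alignment: " + diskBlockSize);
        }
        this.useJna = useJna;
        this.useDirectIo = useDirectIo;
        this.minBlockAlignment = minBlockAlignment;
        this.diskBlockSize = diskBlockSize;
    }

    // Uses the default sizes stated in the description: 512 bytes and 32 MiB.
    static CommitLogDirectIoOptions withDefaultSizes(boolean useJna, boolean useDirectIo) {
        return new CommitLogDirectIoOptions(useJna, useDirectIo, 512, 32L * 1024 * 1024);
    }

    // Rounds a write length up to the next multiple of the minimum block alignment.
    long alignUp(long length) {
        long a = minBlockAlignment;
        return ((length + a - 1) / a) * a;
    }
}
```

With the default alignment, a 1000-byte flush would be padded to 1024 bytes, while a 1024-byte flush is already aligned and stays as it is.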

 The test matrix is complex, so the CommitLog-related test cases and the 
TPCx-IoT benchmark were run. It works with both Java 8 and 11. Compressed and 
encrypted segments are not supported yet; they can be enabled later based on 
community feedback.

 The following improvements are seen with Direct I/O enabled:
 # 32 cores >= ~15%
 # 64 cores >= ~80%

 One more observation I would like to share: reading CommitLog files with 
Direct I/O might help reduce node bring-up time after a node crash.

 Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07

 The attached patch enables the Direct I/O feature for CommitLog files. Please 
check and share your feedback.




[jira] [Updated] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-04-18 Thread Josh McKenzie (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-18464:
--
Attachment: UseDirectIOFeatureForCommitLogFiles.patch

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Attachments: UseDirectIOFeatureForCommitLogFiles.patch
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation of the CommitLog I/O issue on large-core-count 
> systems in a previous email dated July-22; the link to that thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Two solutions looked possible for improving CommitLog I/O:
>  # Multi-threaded syncing
>  # Using Direct I/O through JNA
> I worked on the second option because of the following benefits compared to 
> the first one:
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O, 
> as learnt through FIO benchmarking.
>  # It reduces kernel file cache usage, which in turn reduces kernel I/O 
> activity for CommitLog files only.
>  # Overall CPU usage is reduced for flush activity. JVisualVM shows CPU usage 
> < 30% for the CommitLog syncer thread with the Direct I/O feature.
>  # The Direct I/O implementation is easier than the multi-threaded one.
> As per the community suggestion, less code complexity is good to have. Direct 
> I/O enablement looked promising, but there was one issue: Java 8 does not 
> have native support for enabling Direct I/O, so using the JNA library is a 
> must. The same implementation should also work across other versions of Java 
> (11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch 
> changes are given below.
>  # This implementation is not using Java file channels and file is opened 
> through JNA to use Direct I/O feature.
>  # New Segment are defined named “DirectIOSegment”  for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose 
> only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark 
> was tested. It works with both Java 8 and 11 versions. Compressed and 
> Encrypted based segments are not supported yet and it can be enabled later 
> based on the Community feedback.
>  Following improvement are seen with Direct I/O enablement.
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  Also, another observation would like to share here. Reading Commitlog files 
> with Direct I/O might help in reducing node bring-up time after the node 
> crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables Direct I/O feature for Commitlog files. Please 
> check and share your feedback.




[jira] [Commented] (CASSANDRA-17869) Add JDK17 option to cassandra-builds (build-scripts and jenkins dsl) and on jenkins agents



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713650#comment-17713650
 ] 

Brandon Williams commented on CASSANDRA-17869:
--

Do we need to keep proliferating RUN_STATIC_MATRIX?  As I recall it only makes 
things break by running incompatible upgrades, so even though we aren't setting 
the env var it seems like keeping a sharp edge around.

> Add JDK17 option to cassandra-builds (build-scripts and jenkins dsl) and on 
> jenkins agents
> --
>
> Key: CASSANDRA-17869
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17869
> Project: Cassandra
>  Issue Type: Task
>  Components: Build
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Add a JDK17 option to the cassandra-builds build-scripts, which currently 
> support only options {{8}} and {{11}}.
> Add JDK17 to the matrix axes in the jenkins dsl.
> Ensure JDK17 is installed on all the jenkins agents.




[jira] [Commented] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713640#comment-17713640
 ] 

Brandon Williams commented on CASSANDRA-18396:
--

bq. Note that I have also fixed a mistake that we accidentally introduced on 
CASSANDRA-18391

Whoops, glad there was no real impact.  Everything looks good here, +1.

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Updated] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18396:
-
Status: Ready to Commit  (was: Review In Progress)

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Updated] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-18396:
--
Fix Version/s: 3.0.x
   3.11.x
   4.0.x
   4.1.x
   5.x

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Updated] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18396:
-
Reviewers: Brandon Williams
   Status: Review In Progress  (was: Patch Available)

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Commented] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713612#comment-17713612
 ] 

Andres de la Peña commented on CASSANDRA-18396:
---

Note that I have also fixed [a 
mistake|https://github.com/apache/cassandra-dtest/blob/c49bcf307686886fb34eea646eb3e7ff5855eb03/conftest.py#L494]
 that we accidentally introduced on CASSANDRA-18391, where 
{{fixture_ported_to_in_jvm}} was calling {{_skip_msg}} instead of 
{{_skip_ported_msg}}. That doesn't have any consequences in practice because we 
don't have any upgrade dtests annotated with {{ported_to_in_jvm}}.

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Updated] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-18396:
--
Test and Documentation Plan: 
||Branch||Not patched||Patched||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2853/workflows/0ca41e5b-3e85-4ad4-8547-5a9ea53b9cd4]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2860/workflows/e16506bc-a3e7-4c0a-8286-c06cde073d7b]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2854/workflows/89215583-16fb-416b-85bd-2b4933cf0fb8]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2862/workflows/04437fef-aa6a-49b3-8aa0-9b494aa9c298]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2855/workflows/46c3897a-c4a2-42a7-96a2-8a32017e4803]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2855/workflows/d40c4e30-649b-4190-bd43-0d0780bb17bb]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2859/workflows/8d71cd82-c29b-40f9-b6ec-c9c00b1c7d2e]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2859/workflows/e452139d-9f9e-4f8f-8db6-1b49c62a75e2]|
|4.1|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2856/workflows/6ad2b6e9-0764-4500-b5a7-55925edbe048]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2856/workflows/ec949f73-cd1b-450c-a910-17d90188c75f]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2861/workflows/2c071cf4-b1ee-4f19-9aa6-2887f7caac29]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2861/workflows/74e02978-d8c9-4bc7-8fc2-0d2c74726428]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2857/workflows/9317d1bf-e533-42fe-b456-0fcf17efa0a8]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2857/workflows/cb55fee4-8987-4279-9af3-9cee0f5c5da8]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2858/workflows/aeb788d7-8669-40e2-882f-a67642f04647]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2858/workflows/03c1a210-66b9-47fd-a9d3-55abac4269a6]|
 Status: Patch Available  (was: In Progress)

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713611#comment-17713611
 ] 

Stefan Miklosovic commented on CASSANDRA-12937:
---

[~claude]  I'll go through that table soon. Give me some time here, please. But 
if we somehow combine this format with the overhauled config param names, I 
think we should be good. We should preserve backward compatibility and we 
should not fail anything which does not fail currently.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the field required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should verify 
> that the table schema uses the new default when a new table is created (see 
> CreateTest for an example).
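
The "return some copy" requirement in the description can be illustrated with a small Python sketch. It is language-agnostic in intent and does not reproduce Cassandra's Java classes: `CompressionDefaults`, `get_default_compression_params`, and the `16KiB` value are hypothetical placeholders chosen for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CompressionDefaults:
    """Hypothetical stand-in for the configured default compression parameters."""
    class_name: str = "lz4"
    parameters: Dict[str, str] = field(default_factory=lambda: {"chunk_length": "16KiB"})


# Loaded once from configuration; a hypothetical module-level holder.
_configured_default = CompressionDefaults()


def get_default_compression_params() -> CompressionDefaults:
    """Return a deep copy so one table's changes cannot leak into the shared default."""
    return copy.deepcopy(_configured_default)


table_params = get_default_compression_params()
table_params.parameters["chunk_length"] = "64KiB"  # affects this table only
print(get_default_compression_params().parameters["chunk_length"])  # 16KiB
```

Handing out a copy keeps the configured default stable even if a caller mutates the parameters it receives.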




[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:41 PM:


https://github.com/apache/cassandra/pull/2282

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "32"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "64KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length: "1MiB"
        min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.
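
To make the accepted value forms concrete, here is a hedged Python sketch that resolves the chunk length in bytes from either key spelling shown in the examples above. It is an illustration only, not the code in the pull request: the function name, the rejection of both keys at once, and the requirement that {{chunk_length}} carry a unit are assumptions.

```python
import re

_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}
_SIZE = re.compile(r"^(\d+)\s*(B|KiB|MiB|GiB)?$")


def chunk_length_bytes(parameters: dict) -> int:
    """Resolve the chunk length in bytes from the new or the legacy key."""
    if "chunk_length" in parameters and "chunk_length_in_kb" in parameters:
        # Assumption: specifying both spellings is treated as an error.
        raise ValueError("specify only one of chunk_length and chunk_length_in_kb")
    if "chunk_length" in parameters:
        raw, default_unit = parameters["chunk_length"], None
    elif "chunk_length_in_kb" in parameters:
        # A bare number under the legacy key is interpreted as kibibytes.
        raw, default_unit = parameters["chunk_length_in_kb"], "KiB"
    else:
        raise KeyError("no chunk length configured")
    match = _SIZE.match(str(raw).strip())
    if match is None:
        raise ValueError(f"cannot parse size {raw!r}")
    number, unit = match.groups()
    unit = unit or default_unit
    if unit is None:
        # Assumption: the new key requires an explicit unit.
        raise ValueError(f"{raw!r} needs a unit such as KiB")
    return int(number) * _UNITS[unit]


print(chunk_length_bytes({"chunk_length": "32KiB"}))       # 32768
print(chunk_length_bytes({"chunk_length_in_kb": "32"}))    # 32768
print(chunk_length_bytes({"chunk_length_in_kb": "64KiB"})) # 65536
```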


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2282

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "32"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "64KiB"
        min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the field required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should verify 
> that the table schema uses the new default when a new table is created (see 
> CreateTest for an example).




[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:39 PM:


https://github.com/apache/cassandra/pull/2282

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "32"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "64KiB"
        min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2282

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
      - chunk_length_in_kb: "32"
        min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
      - chunk_length: "32KiB"
        min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the field required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should verify 
> that the table schema uses the new default when a new table is created (see 
> CreateTest for an example).




[jira] [Commented] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713607#comment-17713607
 ] 

Andres de la Peña commented on CASSANDRA-18396:
---

Here is the PR updating {{@ported_to_in_jvm}} to also skip tests with vnodes if 
they have been ported and the tested branch is 4.1 or later: 
[https://github.com/apache/cassandra-dtest/pull/218]

Here are the CI results for the patch, compared to the unpatched branches:
||Branch||Not patched||Patched||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2853/workflows/0ca41e5b-3e85-4ad4-8547-5a9ea53b9cd4]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2860/workflows/e16506bc-a3e7-4c0a-8286-c06cde073d7b]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2854/workflows/89215583-16fb-416b-85bd-2b4933cf0fb8]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2862/workflows/04437fef-aa6a-49b3-8aa0-9b494aa9c298]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2855/workflows/46c3897a-c4a2-42a7-96a2-8a32017e4803]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2855/workflows/d40c4e30-649b-4190-bd43-0d0780bb17bb]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2859/workflows/8d71cd82-c29b-40f9-b6ec-c9c00b1c7d2e]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2859/workflows/e452139d-9f9e-4f8f-8db6-1b49c62a75e2]|
|4.1|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2856/workflows/6ad2b6e9-0764-4500-b5a7-55925edbe048]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2856/workflows/ec949f73-cd1b-450c-a910-17d90188c75f]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2861/workflows/2c071cf4-b1ee-4f19-9aa6-2887f7caac29]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2861/workflows/74e02978-d8c9-4bc7-8fc2-0d2c74726428]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2857/workflows/9317d1bf-e533-42fe-b456-0fcf17efa0a8]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2857/workflows/cb55fee4-8987-4279-9af3-9cee0f5c5da8]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2858/workflows/aeb788d7-8669-40e2-882f-a67642f04647]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2858/workflows/03c1a210-66b9-47fd-a9d3-55abac4269a6]|

The CircleCI UI makes it difficult to see how many tests have been run, but the 
artifacts tab shows whether the proper tests have been skipped.

The skipped tests on trunk are:
 * {{bootstrap_test.py::BootstrapTester::test_node_cannot_join_as_hibernating_node_without_replace_address}}
 * {{consistency_test.py::TestConsistency::test_13911}}
 * {{consistency_test.py::TestConsistency::test_13911_rows_srp}}
 * {{consistency_test.py::TestConsistency::test_13911_partitions_srp}}
 * {{consistency_test.py::TestConsistency::test_13880}}
 * {{consistency_test.py::TestConsistency::test_13747}}
 * {{consistency_test.py::TestConsistency::test_13595}}
 * {{consistency_test.py::TestConsistency::test_12872}}
 * {{consistency_test.py::TestConsistency::test_short_read}}
 * {{consistency_test.py::TestConsistency::test_short_read_delete}}
 * {{consistency_test.py::TestConsistency::test_short_read_quorum_delete}}
 * {{consistency_test.py::TestConsistency::test_readrepair}}
 * {{hintedhandoff_test.py::TestHintedHandoffConfig::test_nodetool}}
 * {{putget_test.py::TestPutGet::test_putget}}
 * {{putget_test.py::TestPutGet::test_putget_snappy}}
 * {{putget_test.py::TestPutGet::test_putget_deflate}}
 * {{putget_test.py::TestPutGet::test_non_local_read}}
 * {{putget_test.py::TestPutGet::test_rangeputget}}
 * {{putget_test.py::TestPutGet::test_wide_row}}
 * {{putget_test.py::TestPutGet::test_wide_slice}}
 * {{read_repair_test.py::TestReadRepair::test_alter_rf_and_run_read_repair}}
 * {{read_repair_test.py::TestReadRepair::test_range_slice_query_with_tombstones}}
 * {{read_repair_test.py::TestReadRepair::test_gcable_tombstone_resurrection_on_range_slice_query}}
 * {{read_repair_test.py::TestReadRepairGuarantees::test_monotonic_reads}}
 * {{read_repair_test.py::TestReadRepairGuarantees::test_atomic_writes}}

Those were being skipped only when vnodes weren't involved. Now they are always 
skipped independently of the usage of vnodes.
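
The skip rule described here can be summarised in a short Python sketch. This is not the code from the linked PR: the function name, the tuple representation of versions, and the handling of an empty version list are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Version = Tuple[int, int]

# Per the ticket, in-JVM dtests support vnodes from 4.1 onwards.
FIRST_VERSION_WITH_JVM_VNODES: Version = (4, 1)


def should_skip_ported_test(use_vnodes: bool, node_versions: Iterable[Version]) -> bool:
    """Decide whether a Python dtest marked as ported to in-JVM can be skipped."""
    if not use_vnodes:
        # Behaviour before the patch: without vnodes the in-JVM test covers this case.
        return True
    versions = list(node_versions)
    # With vnodes, skip only when every node is on 4.1 or later.
    # Assumption: an empty version list never triggers a skip.
    return bool(versions) and all(v >= FIRST_VERSION_WITH_JVM_VNODES for v in versions)


print(should_skip_ported_test(False, [(3, 0)]))         # True
print(should_skip_ported_test(True, [(4, 0), (4, 1)]))  # False
print(should_skip_ported_test(True, [(4, 1), (5, 0)]))  # True
```

Requiring all nodes to be on 4.1 or later matters for mixed-version clusters, where an older node would not have an in-JVM equivalent that exercises vnodes.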

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> During the CASSANDRA-15536 epic

[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:36 PM:


https://github.com/apache/cassandra/pull/2282

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
    - chunk_length: "32KiB"
      min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
    - chunk_length_in_kb: "32"
      min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
    - chunk_length: "32KiB"
      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.
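
As a rough illustration of why the three forms above can be treated as equivalent, the sketch below normalizes each parsed entry to the same values. It is not the code from the pull request: the alias table, the function names, and the supported unit suffixes are assumptions made for this example.

```python
import re

# Hypothetical alias table; only the "lz4" alias appears in the examples above.
ALIASES = {"lz4": "org.apache.cassandra.io.compress.LZ4Compressor"}

# Unit suffixes this sketch understands, expressed in KiB.
UNITS_IN_KIB = {"KiB": 1, "MiB": 1024}


def chunk_length_kib(params: dict) -> int:
    """Read the chunk length from either the new or the old parameter name."""
    if "chunk_length" in params:
        match = re.fullmatch(r"(\d+)(KiB|MiB)", params["chunk_length"])
        if match is None:
            raise ValueError("unsupported chunk_length: " + params["chunk_length"])
        return int(match.group(1)) * UNITS_IN_KIB[match.group(2)]
    # Old format: a bare number of kilobytes; raises KeyError if absent.
    return int(params["chunk_length_in_kb"])


def normalize(entry: dict) -> dict:
    """Resolve the class alias and convert parameters to plain values."""
    params = entry["parameters"][0]  # the YAML holds a one-element list of maps
    return {
        "class_name": ALIASES.get(entry["class_name"], entry["class_name"]),
        "chunk_length_kib": chunk_length_kib(params),
        "min_compress_ratio": float(params.get("min_compress_ratio", "0")),
    }
```

Feeding the three entries shown above through {{normalize}} yields identical dictionaries, which is one way to check that old names, new names, and aliases resolve to the same settings.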


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
    - chunk_length: "32KiB"
      min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
    - chunk_length_in_kb: "32"
      min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
    - chunk_length: "32KiB"
      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
> Reporter: Michael Semb Wever
> Assignee: Claude Warren
> Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the field required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should be added to verify 
> that the table schema uses the new default when a new table is created (see 
> CreateTest for an example).
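
The "return some copy" requirement in the description can be illustrated with a small sketch. The classes below are Python stand-ins for the Java classes named in the ticket, and the default values are made up for the example; this is not Cassandra code.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class CompressionParams:
    """Stand-in for the Java class of the same name."""
    class_name: str
    options: dict = field(default_factory=dict)


class DatabaseDescriptor:
    """Stand-in holding the default that would be read from cassandra.yaml."""
    # Hypothetical configured default, used only for illustration.
    _default_compression = CompressionParams(
        "LZ4Compressor", {"chunk_length_in_kb": "16"}
    )

    @classmethod
    def get_default_compression_params(cls) -> CompressionParams:
        # Hand out a deep copy so callers cannot mutate the shared default.
        return copy.deepcopy(cls._default_compression)
```

Because every caller receives its own copy, a table that tweaks its compression options does not change the default inherited by tables created later.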




[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:22 PM:


https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
    - chunk_length: "32KiB"
      min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
    parameters:
    - chunk_length_in_kb: "32"
      min_compress_ratio: "0"
{code}

{code}
sstable_compression:
  - class_name: "lz4"
    parameters:
    - chunk_length: "32KiB"
      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.


[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:20 PM:


https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.
5. aliases supported

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.


[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:19 PM:


https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. I am not sure I covered all parameters but I expect that this 
might be done in a similar fashion.


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. 


[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 2:17 PM:


https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported (as well as old)
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. 


was (Author: smiklosovic):
https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. 


[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713597#comment-17713597
 ] 

Stefan Miklosovic commented on CASSANDRA-12937:
---

https://github.com/apache/cassandra/pull/2281

1. flat map
2. ParameterizedClass - same stuff as everywhere
3. new format of values supported
4. some parameters / their names were deprecated in 3.0 so they can be removed 
in 5.0.

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length: "32KiB"
#      min_compress_ratio: "0"
{code}

{code}
#sstable_compression:
#  - class_name: "org.apache.cassandra.io.compress.LZ4Compressor"
#    parameters:
#    - chunk_length_in_kb: "32"
#      min_compress_ratio: "0"
{code}

All this works. 


[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713592#comment-17713592
 ] 

Claude Warren commented on CASSANDRA-12937:
---

OK. Here is my proposal

 
||Parameter||CompressionParam Map (input)||CompressionParam Map (output)||CompressionParam as Serializer||CQL||Notes||
|chunk_length_in_kb|X|X|X|X| |
|chunk_length_kb|X| | | | |
|chunk_length|X| | | |chunk length with DataStorageSpec suffix|
|crc_check_chance|X| | |X|Accepted but ignored and removed from map. This is the current operation|
|min_compress_ratio|X|X| | | |
|max_compressed_length|X| |X| |DataStorageSpec suffix|
|lz4_compressor_type|X|X|X|X|left in options for ICompressor construction|
|{{lz4_high_compressor_level}}|X|X|X|X|left in options for ICompressor construction|
|{{compression_level}}|X|X|X|X|left in options for ICompressor construction|
|(arbitrary parameter)|X|X|X| |left in options for ICompressor construction|

 

All of the parameters will be accepted.

Conflicts of input will be handled as follows:
 * multiple chunk_length type parameters (chunk_length, chunk_length_kb, 
chunk_length_in_kb) will result in a ConfigurationException
 * Specifying both min_compress_ratio and max_compressed_length will result in 
a ConfigurationException.
 * crc_check_chance processing will not change.
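
The conflict rules listed above could be checked along the following lines. This is a minimal sketch, assuming the parameters arrive as a plain map; the exception class is a local stand-in for Cassandra's ConfigurationException and the function name is hypothetical.

```python
class ConfigurationException(Exception):
    """Local stand-in for Cassandra's ConfigurationException."""


# The three spellings of the chunk length named in the proposal.
CHUNK_LENGTH_KEYS = ("chunk_length", "chunk_length_kb", "chunk_length_in_kb")


def validate(params: dict) -> None:
    """Reject the conflicting combinations described in the proposal."""
    given = [key for key in CHUNK_LENGTH_KEYS if key in params]
    if len(given) > 1:
        raise ConfigurationException(
            "only one chunk length parameter may be set, got: " + ", ".join(given)
        )
    if "min_compress_ratio" in params and "max_compressed_length" in params:
        raise ConfigurationException(
            "min_compress_ratio and max_compressed_length are mutually exclusive"
        )
    # crc_check_chance is deliberately not inspected: its handling is unchanged.
```

A map with a single chunk length parameter and either of the two ratio settings passes; any other combination from the list raises.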

I think this meets the requirements for CQL configuration copy paste as well as 
YAML properties expressed with DataStorageSpec suffix.


 
 


[cassandra-website] branch asf-staging updated (5cb00556 -> f614cbbf)

2023-04-18 Thread git-site-role

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 5cb00556 generate docs for c84757a0
 new f614cbbf generate docs for c84757a0

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (5cb00556)
            \
             N -- N -- N   refs/heads/asf-staging (f614cbbf)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4796900 -> 4796900 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)



[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713571#comment-17713571
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 12:25 PM:
-

I still do not have a clear answer as to why we are "extracting" something and 
then have an extra parameterized class with further options. 

_Currently CompressionParams takes the class name and the parameters.  It 
extracts chunk_length_kb (or chunk_length_in_kb) and min_compress_ratio from 
the parameters and uses them to build the CompressionParams instance._

Are you not just copying the same approach, as in "CompressionParams extracts 
these parameters, so we need to apply the same extraction in cassandra.yaml"? 
Why is the extraction, as suggested, important? Why can we not flatten the 
configuration? It is questionable why we have nested sections like that when, 
from a UX perspective, it is really just a map. The fact that we do something 
internally in some fashion does not mean that it has to manifest in the 
configuration in cassandra.yaml. 

If we want to support the same values as in other places in cassandra.yaml but 
have a nice flat map (preferably), would it not be possible to translate these 
values internally into the old values? The ideal option would be to start 
supporting in CQL the same value format used for parameters in cassandra.yaml.

I do not think this ticket is close to being merged without further discussion 
of how this should be modeled and without reaching broader consensus.
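
One way to picture the internal translation suggested here is a function that takes a flat map and produces the class name plus old-style options. This is only a sketch: the function name is hypothetical, and it handles just the KiB suffix and the chunk length parameter mentioned in this thread.

```python
import re
from typing import Dict, Tuple


def to_legacy(flat: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Split a flat map into a class name and old-style options."""
    options = dict(flat)  # work on a copy so the caller's map is untouched
    class_name = options.pop("class_name")
    if "chunk_length" in options:
        value = options.pop("chunk_length")
        match = re.fullmatch(r"(\d+)KiB", value)
        if match is None:
            raise ValueError("this sketch only understands KiB values: " + value)
        # New-style "32KiB" becomes the old-style chunk_length_in_kb "32".
        options["chunk_length_in_kb"] = match.group(1)
    return class_name, options
```

For example, a flat map with class_name "lz4", chunk_length "32KiB" and min_compress_ratio "0" comes back as the class name together with chunk_length_in_kb "32" and the unchanged ratio, so the nested structure never has to appear in cassandra.yaml.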


was (Author: smiklosovic):
I still do not have a clear answer as to why we are "extracting" something and 
then have an extra parameterized class with further options. 

_Currently CompressionParams takes the class name and the parameters.  It 
extracts chunk_length_kb (or chunk_length_in_kb) and min_compress_ratio from 
the parameters and uses them to build the CompressionParams instance._

Are you not just copying the same approach, as in "CompressionParams extracts 
these parameters, so we need to apply the same extraction in cassandra.yaml"? 
Why is the extraction, as suggested, important? Why can we not flatten the 
configuration? It is questionable why we have nested sections like that when, 
from a UX perspective, it is really just a map. The fact that we do something 
internally in some fashion does not mean that it has to manifest in the 
configuration in cassandra.yaml. 

If we want to support the same values as in other places in cassandra.yaml but 
have a nice flat map (preferably), would it not be possible to translate these 
values internally into the old values? Or even better, would it be possible to 
have new value parameters in cassandra.yaml that are transformed internally 
into the old values? The ideal option would be to start supporting in CQL the 
same value format used for parameters in cassandra.yaml.

I do not think this ticket is close to being merged without further discussion 
of how this should be modeled and without reaching broader consensus.


[jira] [Updated] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-12937:
--
Reviewers:   (was: Stefan Miklosovic)


[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713571#comment-17713571
 ] 

Stefan Miklosovic commented on CASSANDRA-12937:
---

I still do not have a clear answer as to why we are "extracting" something and 
then have an extra parameterized class with further options. 

_Currently CompressionParams takes the class name and the parameters.  It 
extracts chunk_length_kb (or chunk_length_in_kb) and min_compress_ratio from 
the parameters and uses them to build the CompressionParams instance._

Are you not just copying the same approach, as in "CompressionParams extracts 
these parameters, so we need to apply the same extraction in cassandra.yaml"? 
Why is the suggested extraction important? Why can we not flatten the 
configuration? It is questionable why we have nested sections like that when, 
from a UX perspective, it is truly just a map. The fact that we do something 
internally in some fashion does not mean that it has to manifest in the 
configuration in cassandra.yaml. 

If we want to support the same values as in other places in cassandra.yaml but 
have a nice flat map (preferably), would it not be possible to translate these 
values internally into the old values? Or even better, would it be possible to 
have new value parameters in cassandra.yaml that are transformed internally 
into the old values? The ideal option would be to start supporting the same 
format of values for parameters in cassandra.yaml in CQL as well.

I do not think this ticket is close to being merged without further discussion 
of how this should be modeled and without reaching broader consensus.
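
To make the "translate internally" idea concrete, here is a minimal Java 
sketch, assuming a flat map read from cassandra.yaml. The class 
FlatCompressionOptions and its methods are hypothetical and not part of 
Cassandra; the sketch only handles a chunk_length key with a KiB or MiB 
suffix and passes every other entry through unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not an existing Cassandra class.
final class FlatCompressionOptions
{
    // Rewrites the yaml-style key chunk_length (e.g. "16KiB") into the
    // CQL-style key chunk_length_in_kb (e.g. "16"). Other entries are copied as-is.
    static Map<String, String> toCqlStyle(Map<String, String> yamlOptions)
    {
        Map<String, String> cql = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : yamlOptions.entrySet())
        {
            if (e.getKey().equals("chunk_length"))
                cql.put("chunk_length_in_kb", Long.toString(parseKiB(e.getValue())));
            else
                cql.put(e.getKey(), e.getValue());
        }
        return cql;
    }

    // Parses "16KiB" or "1MiB" into whole kibibytes. Only these two
    // suffixes are handled in this sketch; anything else is rejected.
    static long parseKiB(String value)
    {
        String v = value.trim();
        if (v.endsWith("KiB"))
            return Long.parseLong(v.substring(0, v.length() - 3).trim());
        if (v.endsWith("MiB"))
            return Long.parseLong(v.substring(0, v.length() - 3).trim()) * 1024L;
        throw new IllegalArgumentException("Unsupported size: " + value);
    }
}
```

With such a translation step the yaml could stay a flat map while the 
existing CQL-style parsing remains unchanged.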


[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

2023-04-18 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713551#comment-17713551
 ] 

Michael Semb Wever edited comment on CASSANDRA-12937 at 4/18/23 11:23 AM:
--

I would rather see consistency in the cassandra.yaml.
Please use parameter names and units in the new style of the yaml. They do 
not need to match the CQL style.

If an extra class is needed to support this, that is neither a headache nor 
an obsession. I don't have any opinion about whether the extra top-level 
options (chunk_length, maxCompressedLength, minCompressRatio) should be above 
the parameter map or in it.  I do see having to transform names and values 
inside the map as clumsy and potentially error prone.

Neither hints_compression nor commitlog_compression supports customising the 
chunk_length AFAIK.



[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

2023-04-18 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713551#comment-17713551
 ] 

Michael Semb Wever commented on CASSANDRA-12937:


I would rather see consistency in the cassandra.yaml.
Please use parameter names and units in the new style of the yaml. They do 
not need to match the CQL style.

If an extra class is needed to support this, that is neither a headache nor 
an obsession. I also don't have any opinion about whether the extra top-level 
options (chunk_length, maxCompressedLength, minCompressRatio) should be above 
the parameter map or in it.  I do see having to transform names and values 
inside the map as clumsy and potentially error prone.

Neither hints_compression nor commitlog_compression supports customising the 
chunk_length AFAIK.


[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713549#comment-17713549
 ] 

Claude Warren commented on CASSANDRA-12937:
---

I am trying to follow the guidance I was given.  At this point I think we need 
to have a discussion on the dev mailing list to arrive at a consensus on how 
this should be done.

Currently CompressionParams takes the class name and the parameters.  It 
extracts chunk_length_kb (or chunk_length_in_kb) and min_compress_ratio from 
the parameters and uses them to build the CompressionParams instance.

 
 
The table below outlines the parameters and where they are used.  Blue cells 
indicate proposed changes.
 
||Parameter||CompressionParam as map||CompressionParam as Serializer||CQL||Notes||
|chunk_length_in_kb|X|X|X| |
|chunk_length_kb|(deprecated - read not written)| | | |
|chunk_length|X (proposed)| | |chunk length with DataStorageSpec suffix|
|crc_check_chance|(deprecated - read not written)| |X|crc_check_chance is not used in CompressionParam|
|min_compress_ratio|X| | | |
|max_compressed_length| |X| |Proposed to add to map input as a string with DataStorageSpec suffix|
|lz4_compressor_type|X|X|X| |
|{{lz4_high_compressor_level}}|X|X|X| |
|{{compression_level}}|X|X|X|Zstd compressor param|
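
As a rough illustration of the extraction described above, the following Java 
sketch pulls the chunk length and the minimum compress ratio out of an option 
map and leaves the remaining entries for the compressor. The class 
ExtractedOptions and its fields are hypothetical and are not the real 
CompressionParams API. Preferring chunk_length_in_kb when both chunk length 
keys are present is an assumption of this sketch, not confirmed behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the real CompressionParams.
final class ExtractedOptions
{
    final Integer chunkLengthKb;         // null when neither chunk length key was supplied
    final Double minCompressRatio;       // null when min_compress_ratio was not supplied
    final Map<String, String> remaining; // options left over for the compressor itself

    private ExtractedOptions(Integer chunkLengthKb, Double minCompressRatio, Map<String, String> remaining)
    {
        this.chunkLengthKb = chunkLengthKb;
        this.minCompressRatio = minCompressRatio;
        this.remaining = remaining;
    }

    static ExtractedOptions from(Map<String, String> options)
    {
        // Work on a copy so the caller's map is not modified.
        Map<String, String> rest = new HashMap<>(options);
        String chunk = rest.remove("chunk_length_in_kb");
        String deprecatedChunk = rest.remove("chunk_length_kb");
        if (chunk == null)
            chunk = deprecatedChunk; // fall back to the deprecated key (sketch assumption)
        String ratio = rest.remove("min_compress_ratio");
        return new ExtractedOptions(chunk == null ? null : Integer.valueOf(chunk),
                                    ratio == null ? null : Double.valueOf(ratio),
                                    rest);
    }
}
```

Whatever is left in the remaining map (for example compression_level) would 
then be handed to the compressor class unchanged.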

 
 
 


[jira] [Commented] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713548#comment-17713548
 ] 

Brandon Williams commented on CASSANDRA-18336:
--

Looks ready to be built to me.

> Sstables were cleared when OOM and best_effort is used
> --
>
> Key: CASSANDRA-18336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
> Reporter: NAIZHEN QUE
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
> Attachments: 4031679897782_.pic.jpg, 4241679905694_.pic.jpg, 
> system.log.2023-02-21.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> 1. When this exception occurs in the system:
> {code:java}
> ERROR [CompactionExecutor:351627] 2023-02-21 17:59:20,721 CassandraDaemon.java:581 - Exception in thread Thread[CompactionExecutor:351627,1,main]
> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:167)
>     at org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310)
>     at org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246)
>     at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:170)
>     at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
>     at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
>     at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
>     at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:365)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:337)
>     at org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172)
>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124)
>     at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:64)
>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:137)
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:193)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:77)
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:100)
>     at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:298)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.IOException: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1016)
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:163)
>     ... 23 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1013)
> {code}
> 2. After restarting the node, the logfile transactions are verified and all sstables are deleted:
> {code:java}
> INFO  [main] 2023-02-21 18:00:23,350 LogTransaction.java:240 - Unfinished transaction log, deleting /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Index.db
>
> INFO  [main] 2023-02-21 18:00:23,615 LogTransaction.java:240 - Unfinished transaction log, deleting /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Data.db
>
> INFO  [main] 2023-02-21 18:00:46,504 LogTransaction.java:240 - Unfinished transaction log, deleting /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb_txn_compaction_c923b230-b077-11ed-a081-5d5a5c990823.log
>
> INFO  [main] 2023-02-21 18:00:46,510 LogTransaction.java:536 - Verifying logfile transaction [nb_txn_compaction_461935b0-b1ce-11ed-a081-5d5a5c990823.log in /historyData/cassandra/data/kairosdb/data_points-870f

[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

[
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713529#comment-17713529
]

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 10:42 AM:
-

All I would prefer to see is a simple map of parameters in
ParameterizedClass with exactly the same names as their CQL
counterparts. They would literally just be used there. There do not seem to
be any collisions with that. I do not get the "obsession" with having the
parameters for these compressors follow the same names as CompressionParams
(or the same units).

_The CompressionParams has 3 parameters that it extracts or creates from the
parameters in the ParameterizedClass._

Why do they have to be extracted in the first place?

For hints_compression in the yaml we have:

{code}
# Compression to apply to the hint files. If omitted, hints files
# will be written uncompressed. LZ4, Snappy, and Deflate compressors
# are supported.
#hints_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For commitlog_compression we have:

{code}
# Compression to apply to the commit log. If omitted, the commit log
# will be written uncompressed.  LZ4, Snappy, and Deflate compressors
# are supported.
# commitlog_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For sstable_compression, I would prefer to see exactly the same style of
configuration. Why are we trying to introduce a completely custom style of
configuration, which exists nowhere else, by extracting some parameters
outside? Why can we not use the same approach?

I do not think that we should blindly follow "the parameter names and their
units". I think we already discussed this, and I have already explained all
the advantages of following what we have there already. If we make it
explicitly clear that these parameters are exactly the same as if they were
put into the compression params upon table creation, it would save us the
headache of having something completely custom, and people could put their
parameters and names there as they are used to. Why do we want to change all
of this and further confuse the user?

EDIT: to further support my case for having the same parameters and units in
cassandra.yaml as are specified in CQL upon table creation: what happens
in practice is that people who want to take advantage of this configuration
will just copy-paste the CQL snippet for the compression params, turn it into
entries in the map by hitting "enter" on the keyboard, and be done. I
highly doubt that they would like to specify "other units" just for the sake of
consistency with the rest of cassandra.yaml. I do not think they care at all.
They just want to copy it over from CQL and call it a day.


[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713529#comment-17713529
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 10:22 AM:
-

All I would prefer to see is a simple map of parameters in 
ParameterizedClass with exactly the same names as their CQL 
counterparts. They would literally just be used there. There do not seem to 
be any collisions with that. I do not get the "obsession" with having the 
parameters for these compressors follow the same names as CompressionParams 
(or the same units).

_The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass._ 

Why do they have to be extracted in the first place? 

For hints_compression in the yaml we have:

{code}
# Compression to apply to the hint files. If omitted, hints files
# will be written uncompressed. LZ4, Snappy, and Deflate compressors
# are supported.
#hints_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For commitlog_compression we have:

{code}
# Compression to apply to the commit log. If omitted, the commit log
# will be written uncompressed.  LZ4, Snappy, and Deflate compressors
# are supported.
# commitlog_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For sstable_compression, I would prefer to see exactly the same style of 
configuration. Why are we trying to introduce a completely custom style of 
configuration, which exists nowhere else, by extracting some parameters 
outside? Why can we not use the same approach?

I do not think that we should blindly follow "the parameter names and their 
units". I think we already discussed this, and I have already explained all 
the advantages of following what we have there already. If we make it 
explicitly clear that these parameters are exactly the same as if they were 
put into the compression params upon table creation, it would save us the 
headache of having something completely custom, and people could put their 
parameters and names there as they are used to. Why do we want to change all 
of this and further confuse the user?



[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713529#comment-17713529
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 10:21 AM:
-

All I would prefer to see is a simple map of parameters in 
ParameterizedClass with exactly the same names as their CQL 
counterparts. They would literally just be used there. There do not seem to 
be any collisions with that. I do not get the "obsession" with having the 
parameters for these compressors follow the same names as CompressionParams 
(or the same units).

_The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass._ 

Why do they have to be extracted in the first place? 

For hints_compression in the yaml we have:

{code}
# Compression to apply to the hint files. If omitted, hints files
# will be written uncompressed. LZ4, Snappy, and Deflate compressors
# are supported.
#hints_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For commitlog_compression we have:

{code}
# Compression to apply to the commit log. If omitted, the commit log
# will be written uncompressed.  LZ4, Snappy, and Deflate compressors
# are supported.
# commitlog_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For sstable_compression, I would prefer to see exactly the same style of 
configuration. Why are we trying to introduce a completely custom style of 
configuration, which exists nowhere else, by extracting some parameters 
outside? Why can we not use the same approach?

I do not think that we should blindly follow "the parameter names and their 
units". I think we already discussed this, and I have already explained all 
the advantages of following what we have there already. If we make it 
explicitly clear that these parameters are exactly the same as if they were 
put into the compression params upon table creation, it would save us the 
headache of having something completely custom, and people could put their 
parameters and names there as they are used to. Why do we want to change all 
of this and further confuse the user?



[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713529#comment-17713529
 ] 

Stefan Miklosovic edited comment on CASSANDRA-12937 at 4/18/23 10:20 AM:
-

All I would like to see is a simple map of parameters in the 
ParameterizedClass, with exactly the same names as their CQL counterparts. 
They would literally just be used there. There do not seem to be any 
collisions with that. I do not get the "obsession" with having the parameters 
for these compressors follow the same names as in CompressionParams 
(or the same units).

_The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass._ 

Why do they have to be extracted in the first place?

For hints_compression in yaml we have:

{code}
# Compression to apply to the hint files. If omitted, hints files
# will be written uncompressed. LZ4, Snappy, and Deflate compressors
# are supported.
#hints_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For commitlog_compression we have:

{code}
# Compression to apply to the commit log. If omitted, the commit log
# will be written uncompressed.  LZ4, Snappy, and Deflate compressors
# are supported.
# commitlog_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For sstable_compression, I would prefer to see exactly the same style of 
configuration. Why are we trying to introduce a completely custom way of 
configuring this, one used nowhere else, with some parameters extracted 
outside? Why can we not use the same approach?

I do not think that we should blindly follow "the parameters names and their 
units". I think we already discussed this. I already explained all the 
advantages of following what we have there already. If we make it explicitly 
clear that these parameters are exactly the same as the ones put into the 
compression params upon table creation, it would save us the headache of 
maintaining something completely custom, and people would supply the 
parameters under the names they are already used to. Why do we want to change 
all of this and further confuse the user?


was (Author: smiklosovic):
All I would like to see is a simple map of parameters in the 
ParameterizedClass, with exactly the same names as their CQL counterparts. 
They would literally just be used there. There do not seem to be any 
collisions with that. I do not get the "obsession" with having the parameters 
for these compressors follow the same names as in CompressionParams.

_The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass._ 

Why do they have to be extracted in the first place?

For hints_compression in yaml we have:

{code}
# Compression to apply to the hint files. If omitted, hints files
# will be written uncompressed. LZ4, Snappy, and Deflate compressors
# are supported.
#hints_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For commitlog_compression we have:

{code}
# Compression to apply to the commit log. If omitted, the commit log
# will be written uncompressed.  LZ4, Snappy, and Deflate compressors
# are supported.
# commitlog_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For sstable_compression, I would prefer to see exactly the same style of 
configuration. Why are we trying to introduce a completely custom way of 
configuring this, one used nowhere else, with some parameters extracted 
outside? Why can we not use the same approach?

I do not think that we should blindly follow "the parameters names and their 
units". I think we already discussed this. I already explained all the 
advantages of following what we have there already. If we make it explicitly 
clear that these parameters are exactly the same as the ones put into the 
compression params upon table creation, it would save us the headache of 
maintaining something completely custom, and people would supply the 
parameters under the names they are already used to. Why do we want to change 
all of this and further confuse the user?

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml

[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713529#comment-17713529
 ] 

Stefan Miklosovic commented on CASSANDRA-12937:
---

All I would like to see is a simple map of parameters in the 
ParameterizedClass, with exactly the same names as their CQL counterparts. 
They would literally just be used there. There do not seem to be any 
collisions with that. I do not get the "obsession" with having the parameters 
for these compressors follow the same names as in CompressionParams.

_The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass._ 

Why do they have to be extracted in the first place?

For hints_compression in yaml we have:

{code}
# Compression to apply to the hint files. If omitted, hints files
# will be written uncompressed. LZ4, Snappy, and Deflate compressors
# are supported.
#hints_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For commitlog_compression we have:

{code}
# Compression to apply to the commit log. If omitted, the commit log
# will be written uncompressed.  LZ4, Snappy, and Deflate compressors
# are supported.
# commitlog_compression:
#   - class_name: LZ4Compressor
#     parameters:
#         -
{code}

For sstable_compression, I would prefer to see exactly the same style of 
configuration. Why are we trying to introduce a completely custom way of 
configuring this, one used nowhere else, with some parameters extracted 
outside? Why can we not use the same approach?

I do not think that we should blindly follow "the parameters names and their 
units". I think we already discussed this. I already explained all the 
advantages of following what we have there already. If we make it explicitly 
clear that these parameters are exactly the same as the ones put into the 
compression params upon table creation, it would save us the headache of 
maintaining something completely custom, and people would supply the 
parameters under the names they are already used to. Why do we want to change 
all of this and further confuse the user?

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the field required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should be used to verify 
> that the table schema uses the new default when a new table is created (see 
> CreateTest for an example).
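
A minimal sketch of the "return some copy" idea from the description above. 
The Params class and the holder class are hypothetical stand-ins, not the real 
CompressionParams or DatabaseDescriptor; only the getter name follows the 
description, and the LZ4 default is an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for DatabaseDescriptor; illustration only.
final class DefaultCompressionSketch
{
    // Simplified stand-in for CompressionParams: a compressor name plus options.
    static final class Params
    {
        final String className;
        final Map<String, String> options;

        Params(String className, Map<String, String> options)
        {
            this.className = className;
            this.options = new HashMap<>(options); // copy so callers cannot share state
        }

        Params copy()
        {
            return new Params(className, options);
        }
    }

    // Would be set once while applying the yaml config; LZ4 is assumed here.
    private static volatile Params defaultCompression =
        new Params("LZ4Compressor", new HashMap<>());

    static void setDefaultCompression(Params configured)
    {
        defaultCompression = configured.copy();
    }

    // Each caller gets its own copy, so one table cannot mutate the node-wide default.
    static Params getDefaultCompressionParams()
    {
        return defaultCompression.copy();
    }
}
```

Returning a copy means a table that later alters its own compression options 
cannot change what the next new table inherits.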




[jira] [Comment Edited] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713524#comment-17713524
 ] 

Claude Warren edited comment on CASSANDRA-12937 at 4/18/23 10:09 AM:
-

hints_compression and commitlog_compression use the standard ParameterizedClass.

The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass.  The parameters in CompressionParams are 
{code:java}
private final int chunkLength;
private final int maxCompressedLength;  // In content we store max length to avoid rounding errors causing compress/decompress mismatch.
private final double minCompressRatio;  // In configuration we store min ratio, the input parameter.
{code}
The ParameterizedClass constructor that accepts the Map of 
options expects a key of "chunk_length_in_kb" or "chunk_length_kb"  as well as 
a "min_compress_ratio".

This change I made does not change the hints_compression or 
commitlog_compression options.

The yaml file has an additional set of requirements:
 * The chunkLength (yaml: chunk_length) should be specified with the 
DataStorageSpec suffix (e.g. KiB).
 * The maxCompressedLength should be accepted as a parameter.
 * The maxCompressedLength  (yaml: max_compressed_length)  should be specified 
with the DataStorageSpec suffix (e.g. KiB).
 * maxCompressedLength and minCompressRatio are related to each other via 
chunk_length; so only one can be specified.
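
To make the relationship in the last bullet concrete, here is a minimal 
sketch. It assumes the ratio is defined as chunk length divided by maximum 
compressed length, with a ratio of 0 meaning "no limit". The class and method 
names are hypothetical and are not the CompressionParams API.

```java
// Hypothetical helper; shows why only one of the two values needs to be configured.
final class CompressRatioSketch
{
    // Derive the maximum compressed length from the chunk length and a minimum ratio.
    // Assumption: a ratio of 0 (or less) means "no limit".
    static int maxCompressedLength(int chunkLength, double minCompressRatio)
    {
        if (minCompressRatio <= 0)
            return Integer.MAX_VALUE;
        return (int) Math.ceil(Math.min(chunkLength / minCompressRatio, Integer.MAX_VALUE));
    }

    // Derive the minimum ratio back from a maximum compressed length.
    static double minCompressRatio(int chunkLength, int maxCompressedLength)
    {
        if (maxCompressedLength == Integer.MAX_VALUE)
            return 0;
        return chunkLength * 1.0 / maxCompressedLength;
    }
}
```

For a 16 KiB chunk, a minimum ratio of 2.0 and a maximum compressed length of 
8 KiB describe the same limit, which is why accepting both at once would be 
ambiguous.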

I could work chunkLength and maxCompressedLength into the class_name 
parameters; however, I believe this will result in adding 2 more reserved 
words, both of which will need to be removed from the parameter list. This 
change will affect all CompressionParams constructions that use the 
Map format.

I will make the change with the following rules for handling conflicting 
values:
 * If both max_compressed_length and min_compress_ratio are specified, a 
ConfigurationException will be thrown.
 * If both chunk_length and either chunk_length_in_kb or chunk_length_kb are 
specified and they are not equal, a ConfigurationException will be thrown.
 * If chunk_length or max_compressed_length is specified without a 
DataStorageSpec suffix, a ConfigurationException will be thrown.
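
The three rules above could be sketched roughly as follows. This is an 
illustration under stated assumptions, not the patch itself: the class is 
hypothetical, ConfigurationException is a local stand-in for the Cassandra 
class, and the size parser is a simplified substitute for DataStorageSpec that 
only understands B, KiB, MiB and GiB.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompressionOptionsCheckSketch
{
    // Stand-in for Cassandra's ConfigurationException, to keep the sketch self-contained.
    static final class ConfigurationException extends RuntimeException
    {
        ConfigurationException(String message) { super(message); }
    }

    // Assumption: only these binary suffixes are handled in this sketch.
    private static final Pattern SIZE = Pattern.compile("(\\d+)(B|KiB|MiB|GiB)");

    // Parses e.g. "16KiB" into bytes; a bare number without a suffix is rejected.
    static long toBytes(String key, String value)
    {
        Matcher m = SIZE.matcher(value.trim());
        if (!m.matches())
            throw new ConfigurationException(key + " must use a size suffix such as KiB: " + value);
        long n = Long.parseLong(m.group(1));
        switch (m.group(2))
        {
            case "B":   return n;
            case "KiB": return n << 10;
            case "MiB": return n << 20;
            default:    return n << 30; // GiB
        }
    }

    static void validate(Map<String, String> options)
    {
        // Rule 1: the two limits are derived from each other, so accept only one.
        if (options.containsKey("max_compressed_length") && options.containsKey("min_compress_ratio"))
            throw new ConfigurationException("max_compressed_length and min_compress_ratio are mutually exclusive");

        // Rule 3: sizes must carry a suffix; toBytes throws otherwise.
        if (options.containsKey("max_compressed_length"))
            toBytes("max_compressed_length", options.get("max_compressed_length"));

        if (options.containsKey("chunk_length"))
        {
            long chunkBytes = toBytes("chunk_length", options.get("chunk_length"));
            // Rule 2: a legacy key may accompany chunk_length only if both agree.
            for (String legacy : new String[]{ "chunk_length_in_kb", "chunk_length_kb" })
            {
                String kb = options.get(legacy);
                if (kb != null && Long.parseLong(kb.trim()) * 1024 != chunkBytes)
                    throw new ConfigurationException("chunk_length and " + legacy + " disagree");
            }
        }
    }
}
```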

I will also ensure that the short names: lz4, none, noop, snappy, deflate, and 
zstd  will work as class names and use the defaults specified by the 
CompressionParams methods of the same names.


was (Author: claudenw):
hints_compression and commitlog_compression use the standard ParameterizedClass.

The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass.  The parameters in CompressionParams are 
{code:java}
private final int chunkLength;
private final int maxCompressedLength;  // In content we store max length to avoid rounding errors causing compress/decompress mismatch.
private final double minCompressRatio;  // In configuration we store min ratio, the input parameter.
{code}
The ParameterizedClass constructor that accepts the Map of 
options expects a key of "chunk_length_in_kb" or "chunk_length_kb"  as well as 
a "min_compress_ratio".

This change I made does not change the hints_compression or 
commitlog_compression options.

The yaml file has an additional set of requirements:
 * The chunkLength (yaml: chunk_length) should be specified with the 
DataStorageSpec suffix (e.g. KiB).
 * The maxCompressedLength should be accepted as a parameter.
 * The maxCompressedLength  (yaml: max_compressed_length)  should be specified 
with the DataStorageSpec extensions (e.g. KiB).
 * maxCompressedLength and minCompressRatio are related to each other via 
chunk_length; so only one can be specified.

I could work chunkLength and maxCompressedLength into the class_name 
parameters; however, I believe this will result in adding 2 more reserved 
words, both of which will need to be removed from the parameter list. This 
change will affect all CompressionParams constructions that use the 
Map format.

I will make the change with the following rules for handling conflicting 
values:
 * If both max_compressed_length and min_compress_ratio are specified, a 
ConfigurationException will be thrown.
 * If both chunk_length and either chunk_length_in_kb or chunk_length_kb are 
specified and they are not equal, a ConfigurationException will be thrown.
 * If chunk_length or max_compressed_length is specified without a 
DataStorageSpec suffix, a ConfigurationException will be thrown.

I will also ensure that the short names: lz4, none, noop, snappy, deflate, and 
zstd  will work as class names and use the defaults specified by the 
CompressionParams methods of the same names.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
>

[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713524#comment-17713524
 ] 

Claude Warren commented on CASSANDRA-12937:
---

hints_compression and commitlog_compression use the standard ParameterizedClass.

The CompressionParams has 3 parameters that it extracts or creates from the 
parameters in the ParameterizedClass.  The parameters in CompressionParams are 
{code:java}
private final int chunkLength;
private final int maxCompressedLength;  // In content we store max length to avoid rounding errors causing compress/decompress mismatch.
private final double minCompressRatio;  // In configuration we store min ratio, the input parameter.
{code}
The ParameterizedClass constructor that accepts the Map of 
options expects a key of "chunk_length_in_kb" or "chunk_length_kb"  as well as 
a "min_compress_ratio".

This change I made does not change the hints_compression or 
commitlog_compression options.

The yaml file has an additional set of requirements:
 * The chunkLength (yaml: chunk_length) should be specified with the 
DataStorageSpec suffix (e.g. KiB).
 * The maxCompressedLength should be accepted as a parameter.
 * The maxCompressedLength  (yaml: max_compressed_length)  should be specified 
with the DataStorageSpec extensions (e.g. KiB).
 * maxCompressedLength and minCompressRatio are related to each other via 
chunk_length; so only one can be specified.

I could work chunkLength and maxCompressedLength into the class_name 
parameters; however, I believe this will result in adding 2 more reserved 
words, both of which will need to be removed from the parameter list. This 
change will affect all CompressionParams constructions that use the 
Map format.

I will make the change with the following rules for handling conflicting 
values:
 * If both max_compressed_length and min_compress_ratio are specified, a 
ConfigurationException will be thrown.
 * If both chunk_length and either chunk_length_in_kb or chunk_length_kb are 
specified and they are not equal, a ConfigurationException will be thrown.
 * If chunk_length or max_compressed_length is specified without a 
DataStorageSpec suffix, a ConfigurationException will be thrown.

I will also ensure that the short names: lz4, none, noop, snappy, deflate, and 
zstd  will work as class names and use the defaults specified by the 
CompressionParams methods of the same names.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the field required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should be used to verify 
> that the table schema uses the new default when a new table is created (see 
> CreateTest for an example).




[jira] [Updated] (CASSANDRA-18461) CEP-21 Avoid NPE when getting dc/rack for not yet registered endpoints



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18461:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 Avoid NPE when getting dc/rack for not yet registered endpoints
> --
>
> Key: CASSANDRA-18461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18461
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> If a snitch is asked for location info for a node not yet added to the 
> cluster, it should not NPE. In future, it may be desirable to fine tune the 
> actual behaviour, but for now returning a default would be an improvement.
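
As an illustration of "returning a default" instead of throwing, here is a 
minimal sketch. All names in it (the class, Location, the UNKNOWN placeholder 
values) are hypothetical and are not taken from the snitch code or the linked 
patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lookup; illustrates falling back to a default location.
final class LocationLookupSketch
{
    static final class Location
    {
        final String datacenter;
        final String rack;

        Location(String datacenter, String rack)
        {
            this.datacenter = datacenter;
            this.rack = rack;
        }
    }

    // Assumed placeholder values for endpoints that are not registered yet.
    static final Location UNKNOWN = new Location("UNKNOWN_DC", "UNKNOWN_RACK");

    private final Map<String, Location> registered = new ConcurrentHashMap<>();

    void register(String endpoint, Location location)
    {
        registered.put(endpoint, location);
    }

    // Never returns null, so callers can dereference the result safely.
    Location locationOf(String endpoint)
    {
        return registered.getOrDefault(endpoint, UNKNOWN);
    }

    String datacenter(String endpoint)
    {
        return locationOf(endpoint).datacenter;
    }

    String rack(String endpoint)
    {
        return locationOf(endpoint).rack;
    }
}
```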




[jira] [Updated] (CASSANDRA-18463) CEP-21 Reinstate client notifications for joining/leaving/moving nodes



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18463:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 Reinstate client notifications for joining/leaving/moving nodes
> --
>
> Key: CASSANDRA-18463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18463
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> This functionality was disabled by some of the recent changes to 
> {{StorageService}}, causing a number of test failures. 




[jira] [Updated] (CASSANDRA-18462) CEP-21 Fix tools tests



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18462:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 Fix tools tests
> --
>
> Key: CASSANDRA-18462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18462
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> The {{LocalLog}} instance created for use with offline tools should not be 
> initialised with the standard set of listeners.




[jira] [Updated] (CASSANDRA-18460) CEP-21 Ensure that ClusterMetadata::forceEpoch keeps component epochs consistent



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18460:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 Ensure that ClusterMetadata::forceEpoch keeps component epochs 
> consistent
> 
>
> Key: CASSANDRA-18460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18460
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> {{forceEpoch}} is used to make the application of multiple transformations in 
> series appear as a single atomic update. Its primary use is in 
> {{UnsafeJoin}} (i.e. join without bootstrap) to apply the usual sequence of 
> start/mid/finish join in a single transformation. In such cases, we must 
> ensure that no component of {{ClusterMetadata}} which maintains its own 
> last-modified epoch, ends up with an epoch greater than the one of the 
> enclosing {{ClusterMetadata}}. 
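
The invariant described above can be sketched as a simple clamp. This is only 
one possible way to enforce it and is not necessarily what the linked commit 
does; the Component and Metadata classes are hypothetical simplifications that 
model an epoch as a plain long.

```java
import java.util.ArrayList;
import java.util.List;

final class ForceEpochSketch
{
    // A metadata component that tracks the epoch at which it was last modified.
    static final class Component
    {
        final String name;
        final long lastModified;

        Component(String name, long lastModified)
        {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    static final class Metadata
    {
        final long epoch;
        final List<Component> components;

        Metadata(long epoch, List<Component> components)
        {
            this.epoch = epoch;
            this.components = components;
        }
    }

    // Forces the enclosing epoch and caps any component epoch that would exceed it.
    static Metadata forceEpoch(Metadata metadata, long epoch)
    {
        List<Component> adjusted = new ArrayList<>();
        for (Component c : metadata.components)
            adjusted.add(c.lastModified > epoch ? new Component(c.name, epoch) : c);
        return new Metadata(epoch, adjusted);
    }
}
```

For example, if start/mid/finish join ran at epochs 11, 12 and 13 and the 
result is forced to epoch 11, every component touched at 12 or 13 ends up at 
11, while components last modified earlier keep their epoch.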




[jira] [Updated] (CASSANDRA-18456) CEP-21 During startup request replay from CMS asynchronously



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18456:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 During startup request replay from CMS asynchronously
> 
>
> Key: CASSANDRA-18456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18456
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> During the startup sequence, nodes first replay any locally persisted 
> metadata changes then request any newer, unseen updates from the CMS. This 
> second part should be both async and optional, meaning a CMS failure or 
> partition shouldn't prevent nodes from starting up. 
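
A minimal sketch of "async and optional", using only java.util.concurrent. 
The class, method and parameter names are hypothetical; the two Runnables 
stand in for the local log replay and the request to the CMS.

```java
import java.util.concurrent.CompletableFuture;

final class StartupReplaySketch
{
    // Replays the local log synchronously, then asks the CMS for newer updates
    // in the background. The returned future never completes exceptionally:
    // it yields true if the catch-up succeeded and false if it failed.
    static CompletableFuture<Boolean> start(Runnable replayLocalLog, Runnable catchUpFromCms)
    {
        replayLocalLog.run(); // required step, failures here still abort startup
        return CompletableFuture.runAsync(catchUpFromCms)
                                .handle((ignored, failure) -> failure == null);
    }
}
```

Because the failure is folded into the result, the caller can log it and carry 
on with startup rather than block on an unreachable CMS.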




[jira] [Updated] (CASSANDRA-18457) CEP-21 Ensure that SchemaTransformation impls correctly set TableMetadata epoch



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18457:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 Ensure that SchemaTransformation impls correctly set TableMetadata 
> epoch
> ---
>
> Key: CASSANDRA-18457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18457
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Many existing {{SchemaTransformation}} implementations (not to mention as yet 
> unimplemented ones) have the potential to modify {{TableMetadata}} and it is 
> brittle to require each of them to take care of updating the metadata epoch. 
> We should have {{ClusterMetadata.Transformer}} or the {{AlterSchema}} 
> transformation itself handle this automatically.
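
One way to picture "handle this automatically" is a single wrapper that 
applies any schema change and then stamps the epoch itself. The sketch below 
is hypothetical throughout: Table is a toy stand-in for TableMetadata, the 
epoch is a plain long, and the wrapper is not the actual ClusterMetadata.Transformer.

```java
import java.util.function.UnaryOperator;

final class EpochStampingSketch
{
    // Toy stand-in for table metadata: a definition plus the epoch it was last changed at.
    static final class Table
    {
        final String name;
        final String definition;
        final long epoch;

        Table(String name, String definition, long epoch)
        {
            this.name = name;
            this.definition = definition;
            this.epoch = epoch;
        }
    }

    // Individual changes only describe what to alter; the epoch is set here, in one place.
    static Table apply(Table current, UnaryOperator<Table> change, long nextEpoch)
    {
        Table changed = change.apply(current);
        if (changed.definition.equals(current.definition))
            return current; // nothing was modified, so the old epoch stays
        return new Table(changed.name, changed.definition, nextEpoch);
    }
}
```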




[jira] [Updated] (CASSANDRA-18459) CEP-21 Rewrite o.a.c.distributed.test.SchemaTest



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18459:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 Rewrite o.a.c.distributed.test.SchemaTest
> 
>
> Key: CASSANDRA-18459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Several of the premises that this test is based on are no longer valid. 
> Rewrite it so that instead of executing schema changes node-locally to 
> prevent them propagating, we can have the non-cms node pause before enacting 
> a schema change to enable us to verify that the expected exceptions are 
> thrown. Also, local schema reset and pulling during startup are not relevant 
> in TCM, so we can simplify the tests to ensure that a down node learns of any 
> missed updates when it restarts.




[jira] [Updated] (CASSANDRA-18458) CEP-21 During startup, don't open SSTables until local metadata log replay is complete



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18458:

Test and Documentation Plan: cci
 Status: Patch Available  (was: Open)

> CEP-21 During startup, don't open SSTables until local metadata log replay is 
> complete
> --
>
> Key: CASSANDRA-18458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18458
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Eagerly opening SSTables when their {{ColumnFamilyStore}} is first 
> instantiated presents problems when replaying the local metadata log during 
> startup. Schema modifications which _had_ been applied by the time an SSTable 
> was written may not have been replayed yet, causing serialization errors. To 
> address this, we can defer opening SSTables until after the local log replay 
> is complete.
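
The deferral could be sketched as a small gate that queues SSTables until 
replay has finished. The class and method names are hypothetical, and SSTables 
are represented by plain strings; "opening" is modelled as moving a name from 
the pending list to the opened list.

```java
import java.util.ArrayList;
import java.util.List;

final class DeferredOpenSketch
{
    private final List<String> pending = new ArrayList<>();
    private final List<String> opened = new ArrayList<>();
    private boolean replayComplete = false;

    // Called when a store discovers an SSTable on disk.
    synchronized void register(String sstable)
    {
        if (replayComplete)
            opened.add(sstable);  // schema is up to date, safe to open now
        else
            pending.add(sstable); // defer until the local log has been replayed
    }

    // Called once the local metadata log replay has finished.
    synchronized void onLocalLogReplayComplete()
    {
        replayComplete = true;
        opened.addAll(pending);
        pending.clear();
    }

    synchronized List<String> opened()
    {
        return new ArrayList<>(opened);
    }
}
```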




[jira] [Updated] (CASSANDRA-18462) CEP-21 Fix tools tests



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18462:

Change Category: Quality Assurance
 Complexity: Normal
  Reviewers: Alex Petrov, Sam Tunnicliffe
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/5022c68fe756751d843c2fe01a042e7e720f13d8

> CEP-21 Fix tools tests
> --
>
> Key: CASSANDRA-18462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18462
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> The {{LocalLog}} instance created for use with offline tools should not be 
> initialised with the standard set of listeners.




[jira] [Commented] (CASSANDRA-18456) CEP-21 During startup request replay from CMS asynchronously



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713523#comment-17713523
 ] 

Sam Tunnicliffe commented on CASSANDRA-18456:
-

https://github.com/beobal/cassandra/commit/b2b08d8e21f20576148d8c9602401d6d9f7193b4

> CEP-21 During startup request replay from CMS asynchronously
> 
>
> Key: CASSANDRA-18456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18456
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> During the startup sequence, nodes first replay any locally persisted 
> metadata changes then request any newer, unseen updates from the CMS. This 
> second part should be both async and optional, meaning a CMS failure or 
> partition shouldn't prevent nodes from starting up. 




[jira] [Updated] (CASSANDRA-18463) CEP-21 Reinstate client notifications for joining/leaving/moving nodes



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18463:

Change Category: Semantic
 Complexity: Normal
  Reviewers: Alex Petrov, Sam Tunnicliffe
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/2f622560d8bb40d176bd14495a7bb68fcb0011d6

> CEP-21 Reinstate client notifications for joining/leaving/moving nodes
> --
>
> Key: CASSANDRA-18463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18463
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> This functionality was disabled by some of the recent changes to 
> {{StorageService}}, causing a number of test failures. 




[jira] [Updated] (CASSANDRA-18460) CEP-21 Ensure that ClusterMetadata::forceEpoch keeps component epochs consistent



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18460:

Change Category: Semantic
 Complexity: Normal
  Reviewers: Alex Petrov, Marcus Eriksson
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/4be5fb03122ed629f74bb121c7be48c6f0dabd27

> CEP-21 Ensure that ClusterMetadata::forceEpoch keeps component epochs 
> consistent
> 
>
> Key: CASSANDRA-18460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18460
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> {{forceEpoch}} is used to make the application of multiple transformations in 
> series appear as a single atomic update. Its primary use is in 
> {{UnsafeJoin}} (i.e. join without bootstrap) to apply the usual sequence of 
> start/mid/finish join in a single transformation. In such cases, we must 
> ensure that no component of {{ClusterMetadata}} that maintains its own 
> last-modified epoch ends up with an epoch greater than that of the 
> enclosing {{ClusterMetadata}}. 




[jira] [Updated] (CASSANDRA-18461) CEP-21 Avoid NPE when getting dc/rack for not yet registered endpoints



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18461:

Change Category: Quality Assurance
 Complexity: Normal
  Reviewers: Alex Petrov, Sam Tunnicliffe
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/2342ee36f52a0a9738feed4b7e56fc630710bd50

> CEP-21 Avoid NPE when getting dc/rack for not yet registered endpoints
> --
>
> Key: CASSANDRA-18461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18461
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> If a snitch is asked for location info for a node not yet added to the 
> cluster, it should not NPE. In future, it may be desirable to fine tune the 
> actual behaviour, but for now returning a default would be an improvement.




[jira] [Updated] (CASSANDRA-18458) CEP-21 During startup, don't open SSTables until local metadata log replay is complete



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18458:

Change Category: Semantic
 Complexity: Normal
  Reviewers: Alex Petrov, Marcus Eriksson
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/99efe13a1b72871ef045c0c9c9a07830e5d7d595

> CEP-21 During startup, don't open SSTables until local metadata log replay is 
> complete
> --
>
> Key: CASSANDRA-18458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18458
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Eagerly opening SSTables when their {{ColumnFamilyStore}} is first 
> instantiated presents problems when replaying the local metadata log during 
> startup. Schema modifications which _had_ been applied by the time an SSTable 
> was written may not have been replayed yet, causing serialization errors. To 
> address this, we can defer opening SSTables until after the local log replay 
> is complete.




[jira] [Updated] (CASSANDRA-18459) CEP-21 Rewrite o.a.c.distributed.test.SchemaTest



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18459:

Change Category: Quality Assurance
 Complexity: Normal
  Reviewers: Alex Petrov, Marcus Eriksson
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/4ef4c1fb5a6249075f2bc57d24870573dca4e858

> CEP-21 Rewrite o.a.c.distributed.test.SchemaTest
> 
>
> Key: CASSANDRA-18459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Several of the premises that this test is based on are no longer valid. 
> Rewrite it so that instead of executing schema changes node-locally to 
> prevent them propagating, we can have the non-CMS node pause before enacting 
> a schema change to enable us to verify that the expected exceptions are 
> thrown. Also, local schema reset and pulling during startup are not relevant 
> in TCM, so we can simplify the tests to ensure that a down node learns of any 
> missed updates when it restarts.




[jira] [Updated] (CASSANDRA-18457) CEP-21 Ensure that SchemaTransformation impls correctly set TableMetadata epoch



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18457:

Change Category: Semantic
 Complexity: Normal
  Reviewers: Alex Petrov, Marcus Eriksson
 Status: Open  (was: Triage Needed)

https://github.com/beobal/cassandra/commit/89e9776351f7616df4d0ab12eeb4e1095b68a476

> CEP-21 Ensure that SchemaTransformation impls correctly set TableMetadata 
> epoch
> ---
>
> Key: CASSANDRA-18457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18457
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Many existing {{SchemaTransformation}} implementations (not to mention as yet 
> unimplemented ones) have the potential to modify {{TableMetadata}} and it is 
> brittle to require each of them to take care of updating the metadata epoch. 
> We should have {{ClusterMetadata.Transformer}} or the {{AlterSchema}} 
> transformation itself handle this automatically.




[jira] [Updated] (CASSANDRA-18456) CEP-21 During startup request replay from CMS asynchronously



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-18456:

Change Category: Semantic
 Complexity: Normal
  Reviewers: Alex Petrov, Sam Tunnicliffe
 Status: Open  (was: Triage Needed)

> CEP-21 During startup request replay from CMS asynchronously
> 
>
> Key: CASSANDRA-18456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18456
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Sam Tunnicliffe
>Assignee: Marcus Eriksson
>Priority: Normal
>
> During the startup sequence, nodes first replay any locally persisted 
> metadata changes then request any newer, unseen updates from the CMS. This 
> second part should be both async and optional, meaning a CMS failure or 
> partition shouldn't prevent nodes from starting up. 




[jira] [Updated] (CASSANDRA-18396) Dtests marked with @ported_to_in_jvm can be skipped since 4.1



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-18396:
--
Change Category: Quality Assurance
 Complexity: Low Hanging Fruit
 Status: Open  (was: Triage Needed)

> Dtests marked with @ported_to_in_jvm can be skipped since 4.1
> -
>
> Key: CASSANDRA-18396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18396
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> During the CASSANDRA-15536 epic we ported multiple Python dtests to in-JVM 
> dtests.
> The ported Python dtests are still present but marked with a new 
> {{@ported_to_in_jvm}} annotation. JVM dtests didn't support vnodes at that 
> time, so when a Python dtest is marked with that annotation it's only run for 
> vnodes config, whereas it's skipped if vnodes are off.
> However, we have had support for vnodes on JVM dtests since 4.1. Thus, I 
> think we should modify the {{@ported_to_in_jvm}} annotation to also skip 
> configs with vnodes if all the nodes are in 4.1 or later.




[jira] [Created] (CASSANDRA-18463) CEP-21 Reinstate client notifications for joining/leaving/moving nodes

Sam Tunnicliffe created CASSANDRA-18463:
---

 Summary: CEP-21 Reinstate client notifications for 
joining/leaving/moving nodes
 Key: CASSANDRA-18463
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18463
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Marcus Eriksson


This functionality was disabled by some of the recent changes to 
{{StorageService}}, causing a number of test failures. 





[jira] [Created] (CASSANDRA-18462) CEP-21 Fix tools tests

Sam Tunnicliffe created CASSANDRA-18462:
---

 Summary: CEP-21 Fix tools tests
 Key: CASSANDRA-18462
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18462
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Marcus Eriksson


The {{LocalLog}} instance created for use with offline tools should not be 
initialised with the standard set of listeners.





[jira] [Created] (CASSANDRA-18460) CEP-21 Ensure that ClusterMetadata::forceEpoch keeps component epochs consistent

Sam Tunnicliffe created CASSANDRA-18460:
---

 Summary: CEP-21 Ensure that ClusterMetadata::forceEpoch keeps 
component epochs consistent
 Key: CASSANDRA-18460
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18460
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe


{{forceEpoch}} is used to make the application of multiple transformations in 
series appear as a single atomic update. Its primary use is in {{UnsafeJoin}} 
(i.e. join without bootstrap) to apply the usual sequence of start/mid/finish 
join in a single transformation. In such cases, we must ensure that no 
component of {{ClusterMetadata}} that maintains its own last-modified epoch 
ends up with an epoch greater than that of the enclosing 
{{ClusterMetadata}}. 
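
The invariant described above can be sketched as follows. This is only an illustration under stated assumptions: each component's last-modified epoch is modelled as a plain long keyed by name, and the class name, method name and component names (ForceEpochSketch, clampComponentEpochs, "directory", "tokenMap", "placements") are hypothetical rather than taken from the Cassandra codebase. The actual patch may well enforce the invariant differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ForceEpochSketch {
    /**
     * Returns a copy of componentEpochs in which no component's
     * last-modified epoch exceeds the forced (enclosing) epoch.
     * Hypothetical helper, not Cassandra's forceEpoch implementation.
     */
    public static Map<String, Long> clampComponentEpochs(Map<String, Long> componentEpochs,
                                                         long forcedEpoch) {
        Map<String, Long> clamped = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : componentEpochs.entrySet())
            clamped.put(e.getKey(), Math.min(e.getValue(), forcedEpoch));
        return clamped;
    }

    public static void main(String[] args) {
        // Applying start/mid/finish join in series bumps components to 10, 11 and 12.
        Map<String, Long> afterJoin = new LinkedHashMap<>();
        afterJoin.put("directory", 10L);
        afterJoin.put("tokenMap", 11L);
        afterJoin.put("placements", 12L);

        // Presenting the series as one update at epoch 10 must not leave
        // any component ahead of the enclosing metadata.
        System.out.println(clampComponentEpochs(afterJoin, 10L));
        // prints {directory=10, tokenMap=10, placements=10}
    }
}
```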





[jira] [Created] (CASSANDRA-18461) CEP-21 Avoid NPE when getting dc/rack for not yet registered endpoints

Sam Tunnicliffe created CASSANDRA-18461:
---

 Summary: CEP-21 Avoid NPE when getting dc/rack for not yet 
registered endpoints
 Key: CASSANDRA-18461
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18461
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Marcus Eriksson


If a snitch is asked for location info for a node not yet added to the cluster, 
it should not NPE. In future, it may be desirable to fine tune the actual 
behaviour, but for now returning a default would be an improvement.
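
As a rough illustration of "returning a default", the sketch below looks up dc/rack in a map and falls back to a placeholder for endpoints that have not been registered. Everything here is an assumption made for the example: the class DefaultLocationSketch, the Location type and the "UNKNOWN_DC"/"UNKNOWN_RACK" values are hypothetical and do not reflect the real snitch API or the default the patch chooses.

```java
import java.util.HashMap;
import java.util.Map;

public class DefaultLocationSketch {
    /** Hypothetical value type holding a datacenter/rack pair. */
    public static final class Location {
        public final String datacenter;
        public final String rack;

        public Location(String datacenter, String rack) {
            this.datacenter = datacenter;
            this.rack = rack;
        }
    }

    // Placeholder default; the names are illustrative only.
    public static final Location UNKNOWN = new Location("UNKNOWN_DC", "UNKNOWN_RACK");

    private final Map<String, Location> registered = new HashMap<>();

    public void register(String endpoint, String datacenter, String rack) {
        registered.put(endpoint, new Location(datacenter, rack));
    }

    /** Never returns null: unregistered endpoints map to the default location. */
    public Location locationOf(String endpoint) {
        return registered.getOrDefault(endpoint, UNKNOWN);
    }

    public static void main(String[] args) {
        DefaultLocationSketch snitch = new DefaultLocationSketch();
        snitch.register("10.0.0.1", "dc1", "rack1");
        System.out.println(snitch.locationOf("10.0.0.1").datacenter); // prints dc1
        System.out.println(snitch.locationOf("10.0.0.2").datacenter); // prints UNKNOWN_DC
    }
}
```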





[jira] [Created] (CASSANDRA-18458) CEP-21 During startup, don't open SSTables until local metadata log replay is complete

Sam Tunnicliffe created CASSANDRA-18458:
---

 Summary: CEP-21 During startup, don't open SSTables until local 
metadata log replay is complete
 Key: CASSANDRA-18458
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18458
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe


Eagerly opening SSTables when their {{ColumnFamilyStore}} is first instantiated 
presents problems when replaying the local metadata log during startup. Schema 
modifications which _had_ been applied by the time an SSTable was written may 
not have been replayed yet, causing serialization errors. To address this, we 
can defer opening SSTables until after the local log replay is complete.
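
The deferral described above could look roughly like the sketch below: SSTables discovered before the log replay has finished are only recorded, and are opened in one go once replay completes. The class DeferredOpenSketch and its methods are hypothetical names chosen for illustration, and "opening" is reduced to moving a file name between two lists; this is not how {{ColumnFamilyStore}} is actually structured.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredOpenSketch {
    private final List<String> pending = new ArrayList<>();
    private final List<String> opened = new ArrayList<>();
    private boolean replayComplete = false;

    /** Called when an SSTable is found on disk for this store. */
    public void discover(String sstable) {
        if (replayComplete)
            opened.add(sstable);   // schema is up to date, safe to open now
        else
            pending.add(sstable);  // schema may be stale, remember it for later
    }

    /** Called once the local metadata log has been fully replayed. */
    public void onLogReplayComplete() {
        replayComplete = true;
        opened.addAll(pending);
        pending.clear();
    }

    public List<String> opened() {
        return new ArrayList<>(opened);
    }

    public static void main(String[] args) {
        DeferredOpenSketch store = new DeferredOpenSketch();
        store.discover("nb-1-big-Data.db");
        System.out.println(store.opened()); // prints []
        store.onLogReplayComplete();
        System.out.println(store.opened()); // prints [nb-1-big-Data.db]
    }
}
```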





[jira] [Created] (CASSANDRA-18459) CEP-21 Rewrite o.a.c.distributed.test.SchemaTest

Sam Tunnicliffe created CASSANDRA-18459:
---

 Summary: CEP-21 Rewrite o.a.c.distributed.test.SchemaTest
 Key: CASSANDRA-18459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18459
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe


Several of the premises that this test is based on are no longer valid. Rewrite 
it so that instead of executing schema changes node-locally to prevent them 
propagating, we can have the non-CMS node pause before enacting a schema change 
to enable us to verify that the expected exceptions are thrown. Also, local 
schema reset and pulling during startup are not relevant in TCM, so we can 
simplify the tests to ensure that a down node learns of any missed updates when 
it restarts.




[jira] [Created] (CASSANDRA-18456) CEP-21 During startup request replay from CMS asynchronously

Sam Tunnicliffe created CASSANDRA-18456:
---

 Summary: CEP-21 During startup request replay from CMS 
asynchronously
 Key: CASSANDRA-18456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18456
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Marcus Eriksson


During the startup sequence, nodes first replay any locally persisted metadata 
changes then request any newer, unseen updates from the CMS. This second part 
should be both async and optional, meaning a CMS failure or partition shouldn't 
prevent nodes from starting up. 
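
One way to picture "async and optional" is the sketch below, which uses the JDK's CompletableFuture: the request to the CMS runs off the startup thread, and any failure falls back to the epoch already reached by local replay. The class AsyncCatchupSketch, the method catchUp and the idea of representing the CMS call as a Supplier of an epoch are assumptions made for this example, not the mechanism used in the patch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

public class AsyncCatchupSketch {
    /**
     * Starts the CMS request in the background. If it fails, the result is
     * simply the epoch reached by local replay, so startup can continue.
     * fetchLatestFromCms is a hypothetical stand-in for the remote call.
     */
    public static CompletableFuture<Long> catchUp(long localEpoch,
                                                  Supplier<Long> fetchLatestFromCms) {
        return CompletableFuture.supplyAsync(fetchLatestFromCms)
                                .thenApply(remote -> Math.max(remote, localEpoch))
                                .exceptionally(failure -> localEpoch);
    }

    public static void main(String[] args) {
        // CMS reachable and ahead of the local log.
        System.out.println(catchUp(5L, () -> 8L).join()); // prints 8

        // CMS unreachable: the node keeps the locally replayed epoch.
        System.out.println(catchUp(5L, () -> {
            throw new IllegalStateException("CMS unreachable");
        }).join()); // prints 5
    }
}
```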




[jira] [Created] (CASSANDRA-18457) CEP-21 Ensure that SchemaTransformation impls correctly set TableMetadata epoch

Sam Tunnicliffe created CASSANDRA-18457:
---

 Summary: CEP-21 Ensure that SchemaTransformation impls correctly 
set TableMetadata epoch
 Key: CASSANDRA-18457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18457
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe


Many existing {{SchemaTransformation}} implementations (not to mention as yet 
unimplemented ones) have the potential to modify {{TableMetadata}} and it is 
brittle to require each of them to take care of updating the metadata epoch. We 
should have {{ClusterMetadata.Transformer}} or the {{AlterSchema}} 
transformation itself handle this automatically.




[jira] [Updated] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18336:
--
Test and Documentation Plan: CI
 Status: Patch Available  (was: In Progress)

> Sstables were cleared when OOM and best_effort is used
> --
>
> Key: CASSANDRA-18336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: NAIZHEN QUE
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
> Attachments: 4031679897782_.pic.jpg, 4241679905694_.pic.jpg, 
> system.log.2023-02-21.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 1. When this exception occurs in the system:
> {code:java}
> // 
> ERROR [CompactionExecutor:351627] 2023-02-21 17:59:20,721 
> CassandraDaemon.java:581 - Exception in thread 
> Thread[CompactionExecutor:351627,1,main]
> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:167)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:170)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
>     at 
> org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:365)
>     at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:337)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124)
>     at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:64)
>     at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:137)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:193)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:77)
>     at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:100)
>     at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:298)
>     at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.IOException: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1016)
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:163)
>     ... 23 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1013)
> {code}
> 2. Restart the node: while verifying the logfile transaction, all sstables are deleted
> {code:java}
> // code placeholder
> INFO  [main] 2023-02-21 18:00:23,350 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Index.db
>  
> INFO  [main] 2023-02-21 18:00:23,615 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Data.db
>  
> INFO  [main] 2023-02-21 18:00:46,504 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb_txn_compaction_c923b230-b077-11ed-a081-5d5a5c990823.log
>  
> INFO  [main] 2023-02-21 18:00:46,510 LogTransaction.java:536 - Verifying 
> logfile transaction 
> [nb_txn_compaction_461935b0-b1ce-11ed-a081-5d5a5c990823.log in 
> /historyData/cassandra/data/kairosd

[jira] [Updated] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used

2023-04-18 Thread maxwellguo (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maxwellguo updated CASSANDRA-18336:
---
Reviewers: maxwellguo

> Sstables were cleared when OOM and best_effort is used
> --
>
> Key: CASSANDRA-18336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: NAIZHEN QUE
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
> Attachments: 4031679897782_.pic.jpg, 4241679905694_.pic.jpg, 
> system.log.2023-02-21.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 1. When this exception occurs in the system:
> {code:java}
> // 
> ERROR [CompactionExecutor:351627] 2023-02-21 17:59:20,721 
> CassandraDaemon.java:581 - Exception in thread 
> Thread[CompactionExecutor:351627,1,main]
> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:167)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:170)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
>     at 
> org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:365)
>     at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:337)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124)
>     at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:64)
>     at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:137)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:193)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:77)
>     at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:100)
>     at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:298)
>     at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.IOException: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1016)
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:163)
>     ... 23 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1013)
> {code}
> 2. Restart the node: while verifying the logfile transaction, all sstables are deleted
> {code:java}
> // code placeholder
> INFO  [main] 2023-02-21 18:00:23,350 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Index.db
>  
> INFO  [main] 2023-02-21 18:00:23,615 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Data.db
>  
> INFO  [main] 2023-02-21 18:00:46,504 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb_txn_compaction_c923b230-b077-11ed-a081-5d5a5c990823.log
>  
> INFO  [main] 2023-02-21 18:00:46,510 LogTransaction.java:536 - Verifying 
> logfile transaction 
> [nb_txn_compaction_461935b0-b1ce-11ed-a081-5d5a5c990823.log in 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b]
> INFO  [main] 2023-02-21 18:00:46,517 LogTra

[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713478#comment-17713478
 ] 

Stefan Miklosovic commented on CASSANDRA-12937:
---

I disagree. I will try to finish my view on this ticket soon to compare the 
ideas. I have a feeling we are moving in circles here.

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the fields required for defining the default compression parameters. In 
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for 
> the default compression. This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some 
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should verify that the 
> table schema uses the new default when a new table is created (see 
> CreateTest for an example).
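
The approach outlined for newcomers might be sketched as below. This is a simplified, self-contained model and not the real classes: Params stands in for {{CompressionParams}}, the yaml is represented as a Map of strings, and the keys "default_compression_class" and "default_compression_chunk_length_kb" are hypothetical names invented for the example, as are the fallback values. The point it shows is that the getter hands out a copy, so callers cannot mutate the configured default.

```java
import java.util.Collections;
import java.util.Map;

public class DefaultCompressionSketch {
    /** Hypothetical, simplified stand-in for CompressionParams. */
    public static final class Params {
        public final String compressor;
        public final int chunkLengthKb;

        public Params(String compressor, int chunkLengthKb) {
            this.compressor = compressor;
            this.chunkLengthKb = chunkLengthKb;
        }

        public Params copy() {
            return new Params(compressor, chunkLengthKb);
        }
    }

    // Fallback used when the yaml supplies nothing; placeholder values.
    private static Params defaultCompression = new Params("LZ4Compressor", 16);

    /** Reads the (hypothetical) yaml keys once while the config is applied. */
    public static void applySimpleConfig(Map<String, String> yaml) {
        String compressor = yaml.getOrDefault("default_compression_class",
                                              defaultCompression.compressor);
        int chunkLengthKb = Integer.parseInt(
            yaml.getOrDefault("default_compression_chunk_length_kb",
                              String.valueOf(defaultCompression.chunkLengthKb)));
        defaultCompression = new Params(compressor, chunkLengthKb);
    }

    /** Returns a copy so that callers cannot alter the configured default. */
    public static Params getDefaultCompressionParams() {
        return defaultCompression.copy();
    }

    public static void main(String[] args) {
        applySimpleConfig(Collections.singletonMap("default_compression_class", "ZstdCompressor"));
        Params p = getDefaultCompressionParams();
        System.out.println(p.compressor + " / " + p.chunkLengthKb + " KB");
        // prints ZstdCompressor / 16 KB
    }
}
```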




[jira] [Commented] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used



[ 
https://issues.apache.org/jira/browse/CASSANDRA-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713459#comment-17713459
 ] 

Stefan Miklosovic commented on CASSANDRA-18336:
---

Thanks, I rewrote the logic a little bit to be more generic. We might include 
more exceptions which are not meant to trigger the data removal in the future, 
so I prepared the code for it.

[~brandon.williams] do you want to take a last look before I start to build 
it? 

https://github.com/instaclustr/cassandra/tree/CASSANDRA-18336-3.0
https://github.com/instaclustr/cassandra/tree/CASSANDRA-18336-3.11
https://github.com/instaclustr/cassandra/tree/CASSANDRA-18336-4.0
https://github.com/instaclustr/cassandra/tree/CASSANDRA-18336-4.1
https://github.com/instaclustr/cassandra/tree/CASSANDRA-18336

> Sstables were cleared when OOM and best_effort is used
> --
>
> Key: CASSANDRA-18336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: NAIZHEN QUE
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
> Attachments: 4031679897782_.pic.jpg, 4241679905694_.pic.jpg, 
> system.log.2023-02-21.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 1. When this exception occurs in the system:
> {code:java}
> // 
> ERROR [CompactionExecutor:351627] 2023-02-21 17:59:20,721 
> CassandraDaemon.java:581 - Exception in thread 
> Thread[CompactionExecutor:351627,1,main]
> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:167)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310)
>     at 
> org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:170)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
>     at 
> org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
>     at 
> org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:365)
>     at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:337)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172)
>     at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124)
>     at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:64)
>     at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:137)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:193)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:77)
>     at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:100)
>     at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:298)
>     at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.IOException: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1016)
>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:163)
>     ... 23 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1013)
> {code}
> 2. Restart the node: while verifying the logfile transaction, all sstables are deleted
> {code:java}
> // code placeholder
> INFO  [main] 2023-02-21 18:00:23,350 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyData/cassandra/data/kairosdb/data_points-870fab7087ba11eb8b50d3c6960df21b/nb-8819408-big-Index.db
>  
> INFO  [main] 2023-02-21 18:00:23,615 LogTransaction.java:240 - Unfinished 
> transaction log, deleting 
> /historyDat

[jira] [Assigned] (CASSANDRA-18336) Sstables were cleared when OOM and best_effort is used