[cassandra] branch cassandra-5.0 updated: ninja-fix remove all DSE references

mck Thu, 12 Oct 2023 00:24:59 -0700

This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git



The following commit(s) were added to refs/heads/cassandra-5.0 by this push:
     new 45cf5edb37 ninja-fix remove all DSE references
45cf5edb37 is described below

commit 45cf5edb376d736139c206777b4ec723a71966c8
Author: Lorina Poland <[email protected]>
AuthorDate: Wed Oct 11 13:18:23 2023 -0700

    ninja-fix remove all DSE references
    
     patch by Lorina Poland; reviewed by Mick Semb Wever for CASSANDRA-18231
---
 .../pages/developing/cql/create-custom-index.adoc  |  11 +-
 .../developing/cql/indexing/sai/_collections.adoc  |   2 +-
 .../pages/developing/cql/indexing/sai/sai-faq.adoc |   5 +-
 .../cql-commands/compact-subproperties.adoc        | 278 +++++++++++++++++++++
 4 files changed, 285 insertions(+), 11 deletions(-)

diff --git 
a/doc/modules/cassandra/pages/developing/cql/create-custom-index.adoc 
b/doc/modules/cassandra/pages/developing/cql/create-custom-index.adoc
index d1b5fda6b6..329e373966 100644
--- a/doc/modules/cassandra/pages/developing/cql/create-custom-index.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/create-custom-index.adoc
@@ -149,13 +149,10 @@ CREATE TABLE audit ( id int PRIMARY KEY , text_map 
map<text, text>);
 
 Create multiple SAI indexes on the same `map` column, each using `KEYS`, 
`VALUES`, and `ENTRIES`.
 
-// ifeval::["{evalproduct}" == "dse"]
-// [NOTE]
-// ==== 
-// Creating multiple SAI indexes with different map types *on the same column* 
requires Cassandra 5.0 or later.
-// If you're using DSE 6.8.3, submit a `DROP INDEX index-name;` command before 
adding the next map type on the same column.
-// ====
-// endif::[]
+[NOTE]
+==== 
+Creating multiple SAI indexes with different map types *on the same column* 
requires Cassandra 5.0 or later.
+====
 
 [source,language-cql]
 ----
diff --git 
a/doc/modules/cassandra/pages/developing/cql/indexing/sai/_collections.adoc 
b/doc/modules/cassandra/pages/developing/cql/indexing/sai/_collections.adoc
index 08493df2e2..a6d9e7b857 100644
--- a/doc/modules/cassandra/pages/developing/cql/indexing/sai/_collections.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/indexing/sai/_collections.adoc
@@ -7,7 +7,7 @@ SAI supports collections of type `map`, `list`, and `set`.
 Collections allow you to group and store data together in a column.
 
 In a relational database, a grouping such as a user's multiple email addresses 
is achieved via many-to-one joined relationship between (for example) a `user` 
table and an `email` table.
-DSE avoids joins between two tables by storing the user's email addresses in a 
collection column in the `user` table.
+{product} avoids joins between two tables by storing the user's email 
addresses in a collection column in the `user` table.
 Each collection specifies the data type of the data held.
 
 A collection is appropriate if the data for collection storage is limited.
diff --git 
a/doc/modules/cassandra/pages/developing/cql/indexing/sai/sai-faq.adoc 
b/doc/modules/cassandra/pages/developing/cql/indexing/sai/sai-faq.adoc
index 45cec54441..e815caac81 100644
--- a/doc/modules/cassandra/pages/developing/cql/indexing/sai/sai-faq.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/indexing/sai/sai-faq.adoc
@@ -218,7 +218,7 @@ CREATE CUSTOM INDEX lastname_sai_idx ON 
cycling.cyclist_semi_pro (lastname)
 
 [TIP]
 ====
-SAI added the `ascii` option with the DSE 6.8.7 release.
+SAI has an `ascii` option.
 The default is `false`.
 When set to `true`, SAI converts alphabetic, numeric, and symbolic characters 
that are not in the Basic Latin Unicode block (the first 127 ASCII characters) 
to the ASCII equivalent, if one exists.
 For example, this option changes à to a.
@@ -235,7 +235,6 @@ Also, SAI can use multiple defined indexes within a single 
read query.
 
 == How can I view SAI memory usage metrics?
 
-SAI follows the Threads Per Core (TPC) memory model for DSE.
 The SAI memory footprint is divided between the JVM heap and the Chunk Cache.
 The heap stores memtable indexes, and the chunk cache stores recently accessed 
on-disk index components as well as other SSTable components.
 SAI provides metrics for both the heap and the chunk cache.
@@ -243,7 +242,7 @@ For each index, SAI also provides metrics for determining 
the size in bytes of m
 Refer to 
xref:cassandra:developing/cql/indexing/sai/monitoring.adoc#saiIndexGroupMetrics[Index
 group metrics].
 SAI also provides Table state metrics that give you visibility into the disk 
usage, the percentage of disk usage of the base table, the index builds in 
progress, and related metrics.
 See 
xref:cassandra:developing/cql/indexing/sai/monitoring.adoc#saiTableStateMetrics[Table
 state metrics].
-These metrics and many others are accessible via DSE OpsCenter.
+
 
 [[saiAndQueriesFaq]]
 == What is the performance impact of adding SAI columns to a read query? How 
many `AND` clauses can I add?
diff --git 
a/doc/modules/cassandra/pages/reference/cql-commands/compact-subproperties.adoc 
b/doc/modules/cassandra/pages/reference/cql-commands/compact-subproperties.adoc
new file mode 100644
index 0000000000..500da166f0
--- /dev/null
+++ 
b/doc/modules/cassandra/pages/reference/cql-commands/compact-subproperties.adoc
@@ -0,0 +1,278 @@
+=== compaction = \{compaction_map}
+:description: Construct a map of the compaction option and its subproperties.
+
+Defines the strategy for cleaning up data after writes.
+
+Syntax uses a simple JSON format:
+
+[source,language-cql]
+----
+compaction = {
+     'class' : '<compaction_strategy_name>',
+     '<property_name>' : <value> [, ...] }
+----
+
+where the <compaction_strategy_name> is 
xref:STCS[SizeTieredCompactionStrategy], 
xref:TWCS[TimeWindowCompactionStrategy], or xref:LCS[LeveledCompactionStrategy].
+
+[IMPORTANT]
+====
+Use only compaction implementations bundled with {product}.
+See xref:cassandra:managing/operating/compaction/index.adoc[Compaction 
Strategies] for more details.
+====
+
+==== Common properties
+
+The following properties apply to all compaction strategies.
+
+[source,language-cql]
+----
+compaction = {
+     'class' : 'compaction_strategy_name',
+     'enabled' : (true | false),
+     'log_all' : (true | false),
+     'only_purge_repaired_tombstone' : (true | false),
+     'tombstone_threshold' : <ratio>,
+     'tombstone_compaction_interval' : <sec>,
+     'unchecked_tombstone_compaction' : (true | false),
+     'min_threshold' : <num_sstables>,
+     'max_threshold' : <num_sstables> }
+----
+
+*enabled* ::
+Enable background compaction.
+
+* `true` runs minor compactions.
+* `false` disables minor compactions.
+
++
+[TIP]
+==== 
+Use `nodetool enableautocompaction` to start running compactions.
+====
+
+{empty}::
+Default: `true`
+
+*log_all* ::
+Activates advanced logging for the entire cluster.
++
+Default: `false`
+
+*only_purge_repaired_tombstone* ::
+Enabling this property prevents data from resurrecting when repair is not run 
within the `gc_grace_seconds`.
+When it's been a long time between repairs, the database keeps all tombstones.
++
+
+* `true` - Only allow tombstone purges on repaired SSTables.
+* `false` - Purge tombstones on SSTables during compaction even if the table 
has not been repaired.
+
++
+Default: `false`
+
+*tombstone_threshold* ::
+The ratio of garbage-collectable tombstones to all contained columns.
+If the ratio exceeds this limit, compaction starts only on that table to 
purge the tombstones.
++
+Default: `0.2`
+
+*tombstone_compaction_interval* ::
+Number of seconds before compaction can run on an SSTable after it is created.
+An SSTable is eligible for compaction when it exceeds the 
`tombstone_threshold`.
+Because it might not be possible to drop tombstones when doing a single 
SSTable compaction, and since the compaction is triggered based on an estimated 
tombstone ratio, this setting makes the minimum interval between two single 
SSTable compactions tunable to prevent an SSTable from being constantly 
re-compacted.
++
+Default: `86400` (1 day)
+
+*unchecked_tombstone_compaction* ::
+Setting to `true` allows tombstone compaction to run without pre-checking 
which tables are eligible for the operation.
+Even without this pre-check, {product} checks an SSTable to make sure it is 
safe to drop tombstones.
++
+Default: `false`
+
+*min_threshold* :: 
+The minimum number of SSTables to trigger a minor compaction.
++
+*Restriction:* Not used in `LeveledCompactionStrategy`.
++
+Default: `4`
+
+*max_threshold* ::
+The maximum number of SSTables before a minor compaction is triggered.
++
+*Restriction:* Not used in `LeveledCompactionStrategy`.
++
+Default: `32`
+
+[[STCS]]
+==== SizeTieredCompactionStrategy
+
+The compaction class `SizeTieredCompactionStrategy` (STCS) triggers a minor 
compaction when a table meets the `min_threshold`.
+Minor compactions do not involve all the tables in a keyspace.
+See 
xref:cassandra:managing/operating/compaction/stcs.adoc[SizeTieredCompactionStrategy
 (STCS)].
+
+[NOTE]
+====
+Default compaction strategy.
+====
+
+The following properties only apply to SizeTieredCompactionStrategy:
+
+[source,language-cql]
+----
+compaction = {
+     'class' : 'SizeTieredCompactionStrategy',
+     'bucket_high' : <factor>,
+     'bucket_low' : <factor>,
+     'min_sstable_size' : <int> }
+----
+
+*bucket_high* ::
+Size-tiered compaction merges sets of SSTables that are approximately the same 
size.
+The database compares each SSTable size to the average of all SSTable sizes 
for this table on the node.
+It merges SSTables whose size in KB is between [average-size * bucket_low] and 
[average-size * bucket_high].
++
+Default: `1.5`
+
+*bucket_low* ::
+Size-tiered compaction merges sets of SSTables that are approximately the same 
size.
+The database compares each SSTable size to the average of all SSTable sizes 
for this table on the node.
+It merges SSTables whose size in KB is between [average-size * bucket_low] and 
[average-size * bucket_high].
++
+Default: `0.5`
+
+*min_sstable_size* ::
+STCS groups SSTables into buckets.
+The bucketing process groups SSTables that differ in size by less than 50%.
+This bucketing process is too fine-grained for small SSTables.
+If your SSTables are small, use this option to define a size threshold in MB 
below which all SSTables belong to one unique bucket.
++
+Default: `50` (MB)
+
+[NOTE]
+==== 
+The `cold_reads_to_omit` property for 
xref:cassandra:managing/operating/compaction/stcs.adoc[SizeTieredCompactionStrategy
 (STCS)] is no longer supported.
+====
+
+[[TWCS]]
+==== TimeWindowCompactionStrategy
+
+The compaction class `TimeWindowCompactionStrategy` (TWCS) compacts SSTables 
using a series of _time windows_ or _buckets_.
+TWCS creates a new time window within each successive time period.
+During the active time window, TWCS compacts all SSTables flushed from memory 
into larger SSTables using STCS.
+At the end of the time period, all of these SSTables are compacted into a 
single SSTable.
+Then the next time window starts and the process repeats.
+See 
xref:cassandra:managing/operating/compaction/twcs.adoc[TimeWindowCompactionStrategy
 (TWCS)].
+
+[NOTE]
+==== 
+All of the properties for STCS are also valid for TWCS.
+====
+
+The following properties apply only to TimeWindowCompactionStrategy:
+
+[source,language-cql]
+----
+compaction = {
+     'class' : 'TimeWindowCompactionStrategy',
+     'compaction_window_unit' : <days>,
+     'compaction_window_size' : <int>,
+     'split_during_flush' : (true | false) }
+----
+
+*compaction_window_unit* ::
+Time unit used to define the bucket size.
+The value is based on the Java `TimeUnit`.
+For the list of valid values, see the Java API `TimeUnit` page located at 
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/TimeUnit.html.
++
+Default: `days`
+
+*compaction_window_size* ::
+Units per bucket.
++
+Default: `1`
+
+*split_during_flush* ::
+Prevents mixing older data from repairs and hints with newer data from the 
current time window.
+During a flush operation, determines whether data partitions are split based 
on the configured time window.
+
+* `false` - the data partitions are not split based on the configured time 
window.
+* `true` - ensure that data repaired by NodeSync is placed in the correct TWCS 
window.
+Enable `split_during_flush` when using NodeSync with TWCS or when running node 
repairs.
+Default: `false`
+
++
+During the flush operation, the data is split into a maximum of 12 windows.
+Each window holds the data in a separate SSTable.
+If the current time is <t0> and each window has a time duration of <w>, the 
data is split in the SSTables as follows:
+
+* SSTable 0 contains data for the time period < <t0> - 10 * <w>
+* SSTables 1 to 10 contain data for the 10 equal time periods from (<t0> - 10 
* <w>) through to (<t0> - 1 * <w>)
+* SSTable 11, the 12th table, contains data for the time period > <t0>
+
+[[LCS]]
+==== LeveledCompactionStrategy
+
+The compaction class `LeveledCompactionStrategy` (LCS) creates SSTables of a 
fixed, relatively small size (160 MB by default) that are grouped into levels.
+Within each level, SSTables are guaranteed to be non-overlapping.
+Each level (L0, L1, L2 and so on) is 10 times as large as the previous.
+Disk I/O is more uniform and predictable on higher than on lower levels as 
SSTables are continuously being compacted into progressively larger levels.
+At each level, row keys are merged into non-overlapping SSTables in the next 
level.
+See 
xref:cassandra:managing/operating/compaction/lcs.adoc[LeveledCompactionStrategy 
(LCS)].
+
+[NOTE]
+==== 
+For more guidance, see 
https://www.datastax.com/dev/blog/when-to-use-leveled-compaction[When to Use 
Leveled Compaction] and 
https://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra[Leveled
 Compaction] blog.
+====
+
+The following properties only apply to LeveledCompactionStrategy:
+
+[source,language-cql]
+----
+compaction = {
+     'class' : 'LeveledCompactionStrategy',
+     'sstable_size_in_mb' : <int> }
+----
+
+*sstable_size_in_mb* ::
+The target size for SSTables that use the LeveledCompactionStrategy.
+Although SSTable sizes should be less than or equal to `sstable_size_in_mb`, 
compaction can produce a larger SSTable.
+This occurs when data for a given partition key is exceptionally large.
+The {product} database does not split the data into two SSTables.
++
+Default: `160`
++
+[CAUTION]
+==== 
+The default value, 160 MB, may be inefficient and negatively impact database 
indexing and the queries that rely on indexes.
+For example, consider the benefit of using higher values for 
`sstable_size_in_mb` in tables that use SAI indexes.
+For related information, see 
xref:developing:indexing/sai/configuring.adoc#saiConfigure__saiCompactionStrategies[Compaction
 strategies].
+====
+
+==== DateTieredCompactionStrategy (deprecated)
+
+[IMPORTANT]
+====
+Use xref:TWCS[TimeWindowCompactionStrategy] instead.
+====
+
+Stores data written within a certain period of time in the same SSTable.
+
+*base_time_seconds* ::
+The size of the first time window.
++
+Default: `3600`
+
+*max_sstable_age_days (deprecated)* ::
+{product} does not compact an SSTable if its most recent data is older than 
this value.
+Fractional days can be set.
++
+Default: `1000`
+
+*max_window_size_seconds* ::
+The maximum window size in seconds.
++
+Default: `86400`
+
+*timestamp_resolution* ::
+Units, <MICROSECONDS> or <MILLISECONDS>, to match the timestamp of inserted 
data.
++
+Default: `MICROSECONDS`

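Note on the `bucket_low` / `bucket_high` text in the new compact-subproperties.adoc above: the size-range rule it describes can be illustrated with a minimal Python sketch. This is not Cassandra's implementation. The function name `in_bucket` and the sample sizes are hypothetical, and the defaults are the ones stated on that page.

```python
def in_bucket(sstable_size, average_size, bucket_low=0.5, bucket_high=1.5):
    """Return True if sstable_size lies within
    [average_size * bucket_low, average_size * bucket_high]."""
    return average_size * bucket_low <= sstable_size <= average_size * bucket_high


# Hypothetical SSTable sizes, all expressed in the same unit.
sizes = [40, 90, 100, 110, 400]
average = sum(sizes) / len(sizes)

# SSTables whose size is close enough to the average to be merged together.
similar = [s for s in sizes if in_bucket(s, average)]
print(similar)
```

With these sample values the very small and very large SSTables fall outside the range, which is the behavior the property descriptions imply.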
