This is an automated email from the ASF dual-hosted git repository.
maedhroz pushed a commit to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git
The following commit(s) were added to refs/heads/cassandra-5.0 by this push:
new e9a82df1f3 Correct out-of-date metrics and configuration documentation for SAI
e9a82df1f3 is described below
commit e9a82df1f370c451db7310441246980545256cf5
Author: Caleb Rackliffe <[email protected]>
AuthorDate: Wed Sep 18 14:41:50 2024 -0500
Correct out-of-date metrics and configuration documentation for SAI
patch by Caleb Rackliffe; reviewed by Jon Haddad for CASSANDRA-19898
---
CHANGES.txt | 1 +
.../cql/indexing/sai/operations/configuring.adoc | 25 ++------
.../cql/indexing/sai/operations/monitoring.adoc | 67 ----------------------
3 files changed, 6 insertions(+), 87 deletions(-)
diff --git a/CHANGES.txt b/CHANGES.txt
index 4015caa7da..da7c3e5c05 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
5.0.1
+ * Correct out-of-date metrics and configuration documentation for SAI (CASSANDRA-19898)
* Make configuration entries in memtable section order-independent (CASSANDRA-19906)
* Add guardrail for enabling usage of VectorType (CASSANDRA-19903)
* Set executable flag for shell scripts in .build directory for source artifact (CASSANDRA-19896)
diff --git a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc
index a618143e1d..0acd114746 100644
--- a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc
@@ -3,25 +3,7 @@
// LLP: *NOT DONE*
-Configuring your {product} environment for Storage-Attached Indexing (SAI)
requires some important customization of the `cassandra.yaml` file.
-
-== Increase file cache above the default value
-
-By default, the file cache's
xref:cassandra:managing/configuration/cass_yaml_file.adoc#file_cache_size[file_cache_size]
value is calculated as 50% of the `MaxDirectMemorySize` setting.
-This default for `file_cache_size` may result in suboptimal performance
because Cassandra is not able to take full advantage of available memory.
-
-[TIP]
-====
-File cache is also known as chunk cache.
-====
-
-The `file_cache_size` value can be defined explicitly in `cassandra.yaml`.
-The recommendation is to:
-
-. Increase `--XX:MaxDirectMemorySize`, leaving approximately 15-20% of memory
for the OS and other in-memory structures.
-. In `cassandra.yaml`, explicitly set `file_cache_size` to 75% of that value.
-
-In testing, this configuration improves indexing performance across read,
write, and mixed read/write scenarios.
+Configuring your {product} environment for Storage-Attached Indexing (SAI) may
require some customization of the `cassandra.yaml` file.
== Compaction strategies
@@ -45,7 +27,7 @@ In general, do not use `LeveledCompactionStrategy` (LCS)
unless your index queri
However, if you decide to use LCS, use the following guidelines:
* The `160` MB default for the `CREATE TABLE` command's `sstable_size_in_mb`
option, described in this
xref:reference:cql-commands/create-table.adoc#compactSubprop__LCS[topic], may
result in suboptimal performance for index queries that do not restrict on
token range or partition key.
-* While even higher values may be appropriate, depending on your hardware,
DataStax recommends at least doubling the default value of `sstable_size_in_mb`.
+* While even higher values may be appropriate, depending on your hardware, we
recommend at least doubling the default value of `sstable_size_in_mb`.
Example:
@@ -63,6 +45,9 @@ Each SAI index should ultimately consume less space on disk
because of better lo
If query performance degrades on large (`sstable_max_size` ~2GB) SAI indexed
SSTables when the workload is not dominated by reads but is experiencing
increased write amplification, consider using Unified Compaction Strategy (UCS).
+The `cassandra.yaml` options `sai_sstable_indexes_per_query_warn_threshold`
(default: 32) and `sai_sstable_indexes_per_query_fail_threshold` (default:
disabled) determine the number of SSTable indexes a SAI query touches before
warning clients and failing queries respectively.
+When enabled, they can provide feedback for clients and protection for the
database in the face of sub-optimal read queries.
+
== About SAI encryption
With SAI indexes, its on-disk components are simply additional SSTable data.
diff --git a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc
index 87780f652b..4cc645f18b 100644
--- a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc
@@ -47,11 +47,7 @@ The categorized data:
* Global indexing metrics
* Table query metrics
* Per query metrics
-* Key fetch metrics
-* Offset fetch metrics
-* Token fetch metrics
* Column query metrics per index
-* Terms metrics per index
* Range slice metrics
For example, you can use metrics to get the current count of total partition
reads since the node started for `cycling.cyclist_semi_pro`.
@@ -101,32 +97,6 @@ The index group metrics for the given keyspace and table:
* `IndexFileCacheBytes` -- Size in bytes of memory used by the on-disk data
structure of the per-column indices.
* `OpenIndexFiles` -- Number of open index files for the given table's SAI
indices.
-== Key fetch metrics
-
-----
-ObjectName:
org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=KeyFetch,name=<metric>
-----
-
-The key fetch metrics for the given keyspace and table:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for keys during queries
against the given table.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for keys during
queries against the given table.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for keys during
queries against the given table.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for
keys during queries against the given table.
-
-== Offset fetch metrics
-
-----
-ObjectName:
org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=OffsetFetch,name=<metric>
-----
-
-The offset fetch metrics for the given keyspace and table:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for partition key
SSTable offset fetches during queries against the given table.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for partition key
SSTable offset fetches during queries against the given table.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for partition key
SSTable offset fetches during queries against the given table.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for
partition key SSTable offset fetches during queries against the given table.
-
== Per query metrics
----
@@ -170,30 +140,6 @@ The table state metrics for the given keyspace and table:
* `TotalIndexCount` -- Total number of SAI indices per table.
* `TotalQueryableIndexCount` -- Status of SAI indices per table currently in
the `is_querable` state.
-== Token fetch metrics
-
-----
-ObjectName:
org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=TokenFetch,name=<metric>
-----
-
-The token fetch metrics for the given keyspace and table:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for partition key token
fetches during queries against the given table.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for partition key
token fetches during queries against the given table.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for partition key
token fetches during queries against the given table.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for
partition key token fetches during queries against the given table.
-
-== Token skipping metrics
-
-----
-ObjectName:
org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=TokenSkipping,name=<metric>
-----
-
-The token skippping metrics for the given keyspace and table:
-
-* `CacheHits` -- Number of cache hits from token skipping in a multi-index
`AND` query.
-* `Lookups` -- Number of lookups from token skipping a multi-index `AND` query.
-
== Column query metrics for each numeric index
----
@@ -219,19 +165,6 @@ The column query metrics for the given keyspace, table,
and index include:
* `TermsLookupLatency` -- For string indexes, such as `country_sai_idx` in the
xref:cassandra:getting-started/sai-quickstart.adoc[quickstart] examples, this
metric shows terms lookup latency percentiles (in microseconds) per
one/five/fifteen minute query throughput.
-== Terms metrics for each string index
-
-----
-ObjectName:
org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,index=<index>,scope=Terms,name=<metric>
-----
-
-For string indexes, the terms metrics for the given keyspace, table, and index:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for terms during string
index queries that used the given index.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for terms during
string index queries that used the given index.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for terms during
string index queries that used the given index.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for
terms during string index queries that used the given index.
-
== Range slice metrics
----
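
Note on the guardrails documented in the `configuring.adoc` change: the added text says `sai_sstable_indexes_per_query_warn_threshold` (default: 32) and `sai_sstable_indexes_per_query_fail_threshold` (default: disabled) bound how many SSTable indexes a SAI query may touch before clients are warned or the query fails. The intended behaviour can be sketched roughly as below. This is an illustrative Python model only, not Cassandra's implementation: the function name `check_sstable_indexes`, the `Outcome` enum, the use of `None` to mean "disabled", and the strictly-greater-than comparison are all assumptions made for the sketch.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    """Hypothetical result of evaluating the two guardrails for one query."""
    OK = "ok"
    WARN = "warn"
    FAIL = "fail"


def check_sstable_indexes(touched: int,
                          warn_threshold: Optional[int] = 32,
                          fail_threshold: Optional[int] = None) -> Outcome:
    """Classify a query by the number of SSTable indexes it touches.

    Defaults mirror the documented ones: warn at 32, fail disabled.
    None stands in for "disabled" (an assumption of this sketch), and
    the comparison is assumed to be strictly greater-than.
    """
    # The fail guardrail is checked first so that it takes precedence
    # when both thresholds are exceeded.
    if fail_threshold is not None and touched > fail_threshold:
        return Outcome.FAIL
    if warn_threshold is not None and touched > warn_threshold:
        return Outcome.WARN
    return Outcome.OK


if __name__ == "__main__":
    for n in (10, 40):
        print(n, check_sstable_indexes(n).value)
    print(40, check_sstable_indexes(40, fail_threshold=36).value)
```

With the documented defaults, only the warning can trigger in this model; a query is rejected only once an operator sets a fail threshold.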