(cassandra) branch cassandra-5.0 updated: Correct out-of-date metrics and configuration documentation for SAI

maedhroz Thu, 19 Sep 2024 12:56:41 -0700

This is an automated email from the ASF dual-hosted git repository.

maedhroz pushed a commit to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git



The following commit(s) were added to refs/heads/cassandra-5.0 by this push:
     new e9a82df1f3 Correct out-of-date metrics and configuration documentation for SAI
e9a82df1f3 is described below

commit e9a82df1f370c451db7310441246980545256cf5
Author: Caleb Rackliffe <[email protected]>
AuthorDate: Wed Sep 18 14:41:50 2024 -0500

    Correct out-of-date metrics and configuration documentation for SAI
    
    patch by Caleb Rackliffe; reviewed by Jon Haddad for CASSANDRA-19898
---
 CHANGES.txt                                        |  1 +
 .../cql/indexing/sai/operations/configuring.adoc   | 25 ++------
 .../cql/indexing/sai/operations/monitoring.adoc    | 67 ----------------------
 3 files changed, 6 insertions(+), 87 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 4015caa7da..da7c3e5c05 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 5.0.1
+ * Correct out-of-date metrics and configuration documentation for SAI (CASSANDRA-19898)
  * Make configuration entries in memtable section order-independent (CASSANDRA-19906)
  * Add guardrail for enabling usage of VectorType (CASSANDRA-19903)
  * Set executable flag for shell scripts in .build directory for source artifact (CASSANDRA-19896)
diff --git a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc
index a618143e1d..0acd114746 100644
--- a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/configuring.adoc
@@ -3,25 +3,7 @@
 
 // LLP: *NOT DONE*
 
-Configuring your {product} environment for Storage-Attached Indexing (SAI) requires some important customization of the `cassandra.yaml` file.
-
-== Increase file cache above the default value
-
-By default, the file cache's xref:cassandra:managing/configuration/cass_yaml_file.adoc#file_cache_size[file_cache_size] value is calculated as 50% of the `MaxDirectMemorySize` setting.
-This default for `file_cache_size` may result in suboptimal performance because Cassandra is not able to take full advantage of available memory.
-
-[TIP]
-====
-File cache is also known as chunk cache.
-====
-
-The `file_cache_size` value can be defined explicitly in `cassandra.yaml`.
-The recommendation is to:
-
-. Increase `--XX:MaxDirectMemorySize`, leaving approximately 15-20% of memory for the OS and other in-memory structures.
-. In `cassandra.yaml`, explicitly set `file_cache_size` to 75% of that value.
-
-In testing, this configuration improves indexing performance across read, write, and mixed read/write scenarios.
+Configuring your {product} environment for Storage-Attached Indexing (SAI) may require some customization of the `cassandra.yaml` file.
 
 == Compaction strategies
 
@@ -45,7 +27,7 @@ In general, do not use `LeveledCompactionStrategy` (LCS) unless your index queri
 However, if you decide to use LCS, use the following guidelines:
 
 * The `160` MB default for the `CREATE TABLE` command's `sstable_size_in_mb` option, described in this xref:reference:cql-commands/create-table.adoc#compactSubprop__LCS[topic], may result in suboptimal performance for index queries that do not restrict on token range or partition key.
-* While even higher values may be appropriate, depending on your hardware, DataStax recommends at least doubling the default value of `sstable_size_in_mb`.
+* While even higher values may be appropriate, depending on your hardware, we recommend at least doubling the default value of `sstable_size_in_mb`.
 
 Example:
 
@@ -63,6 +45,9 @@ Each SAI index should ultimately consume less space on disk because of better lo
 
 If query performance degrades on large (`sstable_max_size` ~2GB) SAI indexed SSTables when the workload is not dominated by reads but is experiencing increased write amplification, consider using Unified Compaction Strategy (UCS).
 
+The `cassandra.yaml` options `sai_sstable_indexes_per_query_warn_threshold` (default: 32) and `sai_sstable_indexes_per_query_fail_threshold` (default: disabled) determine the number of SSTable indexes a SAI query touches before warning clients and failing queries respectively.
+When enabled, they can provide feedback for clients and protection for the database in the face of sub-optimal read queries.
+
 == About SAI encryption
 
 With SAI indexes, its on-disk components are simply additional SSTable data.
diff --git a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc
index 87780f652b..4cc645f18b 100644
--- a/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/indexing/sai/operations/monitoring.adoc
@@ -47,11 +47,7 @@ The categorized data:
 * Global indexing metrics
 * Table query metrics
 * Per query metrics
-* Key fetch metrics
-* Offset fetch metrics
-* Token fetch metrics
 * Column query metrics per index
-* Terms metrics per index
 * Range slice metrics
 
 For example, you can use metrics to get the current count of total partition reads since the node started for `cycling.cyclist_semi_pro`.
@@ -101,32 +97,6 @@ The index group metrics for the given keyspace and table:
 * `IndexFileCacheBytes` -- Size in bytes of memory used by the on-disk data structure of the per-column indices.
 * `OpenIndexFiles` -- Number of open index files for the given table's SAI indices.
 
-== Key fetch metrics
-
-----
-ObjectName: org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=KeyFetch,name=<metric>
-----
-
-The key fetch metrics for the given keyspace and table:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for keys during queries against the given table.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for keys during queries against the given table.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for keys during queries against the given table.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for keys during queries against the given table.
-
-== Offset fetch metrics
-
-----
-ObjectName: org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=OffsetFetch,name=<metric>
-----
-
-The offset fetch metrics for the given keyspace and table:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for partition key SSTable offset fetches during queries against the given table.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for partition key SSTable offset fetches during queries against the given table.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for partition key SSTable offset fetches during queries against the given table.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for partition key SSTable offset fetches during queries against the given table.
-
 == Per query metrics
 
 ----
@@ -170,30 +140,6 @@ The table state metrics for the given keyspace and table:
 * `TotalIndexCount` -- Total number of SAI indices per table.
 * `TotalQueryableIndexCount` -- Status of SAI indices per table currently in the `is_querable` state.
 
-== Token fetch metrics
-
-----
-ObjectName: org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=TokenFetch,name=<metric>
-----
-
-The token fetch metrics for the given keyspace and table:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for partition key token fetches during queries against the given table.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for partition key token fetches during queries against the given table.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for partition key token fetches during queries against the given table.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for partition key token fetches during queries against the given table.
-
-== Token skipping metrics
-
-----
-ObjectName: org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,scope=TokenSkipping,name=<metric>
-----
-
-The token skippping metrics for the given keyspace and table:
-
-* `CacheHits` -- Number of cache hits from token skipping in a multi-index `AND` query.
-* `Lookups` -- Number of lookups from token skipping a multi-index `AND` query.
-
 == Column query metrics for each numeric index
 
 ----
@@ -219,19 +165,6 @@ The column query metrics for the given keyspace, table, and index include:
 
 * `TermsLookupLatency` -- For string indexes, such as `country_sai_idx` in the xref:cassandra:getting-started/sai-quickstart.adoc[quickstart] examples, this metric shows terms lookup latency percentiles (in microseconds) per one/five/fifteen minute query throughput.
 
-== Terms metrics for each string index
-
-----
-ObjectName: org.apache.cassandra.metrics:type=StorageAttachedIndex,keyspace=<keyspace>,table=<table>,index=<index>,scope=Terms,name=<metric>
-----
-
-For string indexes, the terms metrics for the given keyspace, table, and index:
-
-* `ChunkCacheHitRate` -- All-time chunk cache hit rate for terms during string index queries that used the given index.
-* `TotalChunkCacheLookups` -- All-time chunk cache lookups for terms during string index queries that used the given index.
-* `TotalChunkCacheMisses` -- All-time chunk cache misses for terms during string index queries that used the given index.
-* `ChunkCache(One|Five|Fifteen)HitRate` -- <N>-minute chunk cache hit rate for terms during string index queries that used the given index.
-
 == Range slice metrics
 
 ----

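Note on the configuring.adoc change above: the two `sai_sstable_indexes_per_query_*` options describe a warn/fail guardrail on the number of SSTable indexes a single SAI query touches. The sketch below is only an illustration of that behavior, not Cassandra's implementation. The function name `classify_sai_query`, the use of `None` for "disabled", and the choice of a strict "greater than" comparison are all assumptions made for this example; only the option names and the defaults (32 and disabled) come from the commit.

```python
from typing import Optional

# Defaults as stated in the documentation change; None stands in for "disabled"
# (an assumption of this sketch, not the literal cassandra.yaml value).
WARN_THRESHOLD_DEFAULT: Optional[int] = 32
FAIL_THRESHOLD_DEFAULT: Optional[int] = None


def classify_sai_query(
    sstable_indexes_touched: int,
    warn_threshold: Optional[int] = WARN_THRESHOLD_DEFAULT,
    fail_threshold: Optional[int] = FAIL_THRESHOLD_DEFAULT,
) -> str:
    """Return "fail", "warn", or "ok" for a hypothetical SAI query.

    Assumption: a threshold trips when the count strictly exceeds it.
    """
    # The fail threshold is checked first so that a failing query is never
    # reported as a mere warning.
    if fail_threshold is not None and sstable_indexes_touched > fail_threshold:
        return "fail"
    if warn_threshold is not None and sstable_indexes_touched > warn_threshold:
        return "warn"
    return "ok"
```

With the defaults, a query touching 33 SSTable indexes would be classified as "warn", and no query would be classified as "fail" until a fail threshold is set.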

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
