blambov commented on code in PR #2896: URL: https://github.com/apache/cassandra/pull/2896#discussion_r1393935210
########## conf/cassandra_latest.yaml: ##########
@@ -0,0 +1,2139 @@
+
+# Cassandra storage config YAML
+
+# NOTE:
+#   See https://cassandra.apache.org/doc/latest/configuration/ for
+#   full explanations of configuration directives
+# /NOTE
+
+# NOTE:
+#   This file is provided in two versions:
+#    - cassandra.yaml: Contains configuration defaults for a "compatible"
+#      configuration that operates using settings that are backwards-compatible
+#      and interoperable with machines running older versions of Cassandra.
+#      This version is provided to facilitate pain-free upgrades for existing
+#      users of Cassandra running in production who want to gradually and
+#      carefully introduce new features.
+#    - cassandra_latest.yaml: Contains configuration defaults that enable
+#      the latest features of Cassandra, including improved functionality as
+#      well as higher performance. This version is provided for new users of
+#      Cassandra who want to get the most out of their cluster, and for users
+#      evaluating the technology.
+#      To use this version, simply copy this file over cassandra.yaml, or specify
+#      it using the -Dcassandra.config system property, e.g. by running
+#        cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
+# /NOTE
+
+# The name of the cluster. This is mainly used to prevent machines in
+# one logical cluster from joining another.
+cluster_name: 'Test Cluster'
+
+# This defines the number of tokens randomly assigned to this node on the ring.
+# The more tokens, relative to other nodes, the larger the proportion of data
+# that this node will store. You probably want all nodes to have the same number
+# of tokens, assuming they have equal hardware capability.
+#
+# If you leave this unspecified, Cassandra will use the default of 1 token for legacy compatibility,
+# and will use the initial_token as described below.
+#
+# Specifying initial_token will override this setting on the node's initial start;
+# on subsequent starts, this setting will apply even if initial_token is set.
+#
+# See https://cassandra.apache.org/doc/latest/getting-started/production.html#tokens for
+# best practice information about num_tokens.
+#
+num_tokens: 16
+
+# Triggers automatic allocation of num_tokens tokens for this node. The allocation
+# algorithm attempts to choose tokens in a way that optimizes replicated load over
+# the nodes in the datacenter for the replication factor.
+#
+# The load assigned to each node will be close to proportional to its number of
+# vnodes.
+#
+# Only supported with the Murmur3Partitioner.
+
+# The replication factor is determined via the replication strategy used by the specified
+# keyspace.
+# allocate_tokens_for_keyspace: KEYSPACE
+
+# The replication factor is explicitly set, regardless of keyspace or datacenter.
+# This is the replication factor within the datacenter, like NTS.
+allocate_tokens_for_local_replication_factor: 3
+
+# initial_token allows you to specify tokens manually. While you can use it with
+# vnodes (num_tokens > 1, above) -- in which case you should provide a
+# comma-separated list -- it's primarily used when adding nodes to legacy clusters
+# that do not have vnodes enabled.
+# initial_token:
+
+# May either be "true" or "false" to enable globally
+hinted_handoff_enabled: true
+
+# When hinted_handoff_enabled is true, a blacklist of data centers that will not
+# perform hinted handoff
+# hinted_handoff_disabled_datacenters:
+#    - DC1
+#    - DC2
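For example, a minimal sketch of these token-allocation settings for a new node, assuming keyspaces replicated at RF=3 in each datacenter and equally sized nodes (cluster name and values are illustrative, not defaults):

    # Hypothetical example: uniform token allocation for a new cluster
    cluster_name: 'Example Cluster'   # must match on every node in the cluster
    num_tokens: 16                    # same vnode count on every equally sized node
    # Optimize token placement for RF=3 within this datacenter:
    allocate_tokens_for_local_replication_factor: 3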
+# This defines the maximum amount of time a dead host will have hints
+# generated. After it has been dead this long, new hints for it will not be
+# created until it has been seen alive and gone down again.
+# Min unit: ms
+max_hint_window: 3h
+
+# Maximum throttle in KiBs per second, per delivery thread. This will be
+# reduced proportionally to the number of nodes in the cluster. (If there
+# are two nodes in the cluster, each delivery thread will use the maximum
+# rate; if there are three, each will throttle to half of the maximum,
+# since we expect two nodes to be delivering hints simultaneously.)
+# Min unit: KiB
+hinted_handoff_throttle: 1024KiB
+
+# Number of threads with which to deliver hints;
+# Consider increasing this number when you have multi-dc deployments, since
+# cross-dc handoff tends to be slower
+max_hints_delivery_threads: 2
+
+# Directory where Cassandra should store hints.
+# If not set, the default directory is $CASSANDRA_HOME/data/hints.
+# hints_directory: /var/lib/cassandra/hints
+
+# How often hints should be flushed from the internal buffers to disk.
+# Will *not* trigger fsync.
+# Min unit: ms
+hints_flush_period: 10000ms
+
+# Maximum size for a single hints file, in mebibytes.
+# Min unit: MiB
+max_hints_file_size: 128MiB
+
+# The file size limit to store hints for an unreachable host, in mebibytes.
+# Once the local hints files have reached the limit, no more new hints will be created.
+# Setting a non-positive value will disable the size limit.
+# max_hints_size_per_host: 0MiB
+
+# Enable / disable automatic cleanup of expired and orphaned hints files.
+# Disable the option in order to preserve those hints on the disk.
+auto_hints_cleanup_enabled: false
+
+# Enable/disable transferring hints to a peer during decommission. Even when enabled, this does not guarantee
+# consistency for logged batches, and it may delay decommission when coupled with a strict hinted_handoff_throttle.
+# Default: true
+# transfer_hints_on_decommission: true
+
+# Compression to apply to the hint files. If omitted, hints files
+# will be written uncompressed. LZ4, Snappy, and Deflate compressors
+# are supported.
+#hints_compression:
+#   - class_name: LZ4Compressor
+#     parameters:
+#         -
+
+# Directory where Cassandra should store the results of a one-shot troubleshooting heap dump for uncaught exceptions.
+# Note: this value can be overridden by the -XX:HeapDumpPath JVM option with a relative local path for testing if
+# so desired.
+# If not set, the default directory is $CASSANDRA_HOME/heapdump
+# heap_dump_path: /var/lib/cassandra/heapdump
+
+# Enable / disable automatic dump of heap on first uncaught exception
+# If not set, the default value is false
+# dump_heap_on_uncaught_exception: true
+
+# Enable / disable persistent hint windows.
+#
+# If set to false, a hint will be stored only if the respective node
+# it is for has been down for less than or equal to max_hint_window.
+#
+# If set to true, a hint will be stored as long as no hint for that node
+# was stored earlier than max_hint_window. This covers cases where a node
+# keeps restarting and hints have not been delivered yet; without it, we
+# would be saving hints for that node indefinitely.
+#
+# Defaults to true.
+#
+# hint_window_persistent_enabled: true
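Putting the hint settings together, a hedged sketch of a multi-DC hinted-handoff tuning (values are illustrative, not recommendations):

    max_hint_window: 3h
    hinted_handoff_throttle: 1024KiB
    max_hints_delivery_threads: 8     # more delivery threads for slower cross-dc handoff
    hints_compression:
      - class_name: LZ4Compressor     # write hint files LZ4-compressed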
+# Maximum throttle in KiBs per second, total. This will be
+# reduced proportionally to the number of nodes in the cluster.
+# Min unit: KiB
+batchlog_replay_throttle: 1024KiB
+
+# Authentication backend, implementing IAuthenticator; used to identify users
+# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllAuthenticator,
+# PasswordAuthenticator}.
+#
+# - AllowAllAuthenticator performs no checks - set it to disable authentication.
+# - PasswordAuthenticator relies on username/password pairs to authenticate
+#   users. It keeps usernames and hashed passwords in the system_auth.roles table.
+#   Please increase the system_auth keyspace replication factor if you use this authenticator.
+#   If using PasswordAuthenticator, CassandraRoleManager must also be used (see below)
+authenticator:
+  class_name: org.apache.cassandra.auth.AllowAllAuthenticator
+# MutualTlsAuthenticator can be configured using the following configuration. One can add their own validator,
+# which implements MutualTlsCertificateValidator and provides the logic for extracting an identity out of
+# certificates and validating certificates.
+#  class_name: org.apache.cassandra.auth.MutualTlsAuthenticator
+#  parameters:
+#    validator_class_name: org.apache.cassandra.auth.SpiffeCertificateValidator
+
+# Authorization backend, implementing IAuthorizer; used to limit access/provide permissions
+# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllAuthorizer,
+# CassandraAuthorizer}.
+#
+# - AllowAllAuthorizer allows any action by any user - set it to disable authorization.
+# - CassandraAuthorizer stores permissions in the system_auth.role_permissions table. Please
+#   increase the system_auth keyspace replication factor if you use this authorizer.
+authorizer: AllowAllAuthorizer
+
+# Part of the Authentication & Authorization backend, implementing IRoleManager; used
+# to maintain grants and memberships between roles.
+# Out of the box, Cassandra provides org.apache.cassandra.auth.CassandraRoleManager,
+# which stores role information in the system_auth keyspace. Most functions of the
+# IRoleManager require an authenticated login, so unless the configured IAuthenticator
+# actually implements authentication, most of this functionality will be unavailable.
+#
+# - CassandraRoleManager stores role data in the system_auth keyspace. Please
+#   increase the system_auth keyspace replication factor if you use this role manager.
+role_manager: CassandraRoleManager
+
+# Network authorization backend, implementing INetworkAuthorizer; used to restrict user
+# access to certain DCs
+# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllNetworkAuthorizer,
+# CassandraNetworkAuthorizer}.
+#
+# - AllowAllNetworkAuthorizer allows access to any DC by any user - set it to disable authorization.
+# - CassandraNetworkAuthorizer stores permissions in the system_auth.network_permissions table. Please
+#   increase the system_auth keyspace replication factor if you use this authorizer.
+network_authorizer: AllowAllNetworkAuthorizer
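For comparison, a sketch of the same section with password-based authentication and authorization switched on; as the comments above note, raise the system_auth keyspace replication factor before relying on these tables:

    authenticator:
      class_name: org.apache.cassandra.auth.PasswordAuthenticator
    authorizer: CassandraAuthorizer
    role_manager: CassandraRoleManager    # required with PasswordAuthenticator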
+# CIDR authorization backend, implementing ICIDRAuthorizer; used to restrict user
+# access from certain CIDRs
+# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllCIDRAuthorizer,
+# CassandraCIDRAuthorizer}.
+# - AllowAllCIDRAuthorizer allows access from any CIDR by any user - set it to disable CIDR authorization.
+# - CassandraCIDRAuthorizer stores users' CIDR permissions in the system_auth.cidr_permissions table. Please
+#   increase the system_auth keyspace replication factor if you use this authorizer, otherwise any changes to
+#   the system_auth tables used by this feature may be lost when a host goes down.
+cidr_authorizer:
+  class_name: AllowAllCIDRAuthorizer
+  # The parameters below are used only when the CIDR authorizer is enabled
+  # parameters:
+    # The CIDR authorizer, when enabled (i.e., CassandraCIDRAuthorizer), applies only to non-superusers
+    # by default. Set this to true to enable CIDR authorization for superusers as well.
+    # Note: CIDR checks cannot be performed for JMX calls
+    # cidr_checks_for_superusers: true
+
+    # The CIDR authorizer, when enabled, supports MONITOR and ENFORCE modes. The default mode is MONITOR.
+    # In MONITOR mode, CIDR checks are NOT enforced. Instead, the CIDR groups of user accesses are logged via
+    # NoSpamLogger. A warning is logged if a user accesses from an unauthorized CIDR group (but the access
+    # won't be rejected); an info message is logged otherwise.
+    # In ENFORCE mode, CIDR checks are enforced, i.e., user accesses are rejected if attempted from
+    # unauthorized CIDR groups.
+    # cidr_authorizer_mode: MONITOR
+
+    # Refresh interval for the CIDR groups cache, in minutes
+    # cidr_groups_cache_refresh_interval: 5
+
+    # Maximum number of entries the IP-to-CIDR-groups cache can accommodate
+    # ip_cache_max_size: 100
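A sketch of enabling the CIDR authorizer in ENFORCE mode, using only the parameters documented above (values illustrative):

    cidr_authorizer:
      class_name: CassandraCIDRAuthorizer
      parameters:
        cidr_authorizer_mode: ENFORCE      # reject access from unauthorized CIDR groups
        cidr_checks_for_superusers: true   # also apply CIDR checks to superusers
        cidr_groups_cache_refresh_interval: 5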
+# Depending on the auth strategy of the cluster, it can be beneficial to iterate
+# from root to table (root -> ks -> table) instead of table to root (table -> ks -> root).
+# As the auth entries are whitelisting, once a permission is found you know it to be
+# valid. We default to false, as the legacy behavior is to query at the table level and then
+# move back up to the root. See CASSANDRA-17016 for details.
+# traverse_auth_from_root: false
+
+# Validity period for the roles cache (fetching granted roles can be an expensive
+# operation depending on the role manager; CassandraRoleManager is one example).
+# Granted roles are cached for authenticated sessions in AuthenticatedUser and,
+# after the period specified here, become eligible for (async) reload.
+# Defaults to 2000; set to 0 to disable caching entirely.
+# Will be disabled automatically for AllowAllAuthenticator.
+# For a long-running cache using roles_cache_active_update, consider
+# setting this to something longer, such as a daily validation: 86400000
+# Min unit: ms
+roles_validity: 2000ms
+
+# Refresh interval for the roles cache (if enabled).
+# After this interval, cache entries become eligible for refresh. Upon next
+# access, an async reload is scheduled and the old value returned until it
+# completes. If roles_validity is non-zero, then this must be non-zero
+# as well.
+# This setting also informs the interval of auto-updating if
+# using roles_cache_active_update.
+# Defaults to the same value as roles_validity.
+# For a long-running cache, consider setting this to refresh hourly: 3600000ms
+# Min unit: ms
+# roles_update_interval: 2000ms
+
+# If true, cache contents are actively updated by a background task at the
+# interval set by roles_update_interval. If false, cache entries
+# become eligible for refresh after their update interval. Upon next access,
+# an async reload is scheduled and the old value returned until it completes.
+# roles_cache_active_update: false
+
+# Validity period for the permissions cache (fetching permissions can be an
+# expensive operation depending on the authorizer; CassandraAuthorizer is
+# one example). Defaults to 2000; set to 0 to disable.
+# Will be disabled automatically for AllowAllAuthorizer.
+# For a long-running cache using permissions_cache_active_update, consider
+# setting this to something longer, such as a daily validation: 86400000ms
+# Min unit: ms
+permissions_validity: 2000ms
+
+# Refresh interval for the permissions cache (if enabled).
+# After this interval, cache entries become eligible for refresh. Upon next
+# access, an async reload is scheduled and the old value returned until it
+# completes. If permissions_validity is non-zero, then this must be non-zero
+# as well.
+# This setting also informs the interval of auto-updating if
+# using permissions_cache_active_update.
+# Defaults to the same value as permissions_validity.
+# For a longer-running permissions cache, consider setting this to refresh hourly: 3600000ms
+# Min unit: ms
+# permissions_update_interval: 2000ms
+
+# If true, cache contents are actively updated by a background task at the
+# interval set by permissions_update_interval. If false, cache entries
+# become eligible for refresh after their update interval. Upon next access,
+# an async reload is scheduled and the old value returned until it completes.
+# permissions_cache_active_update: false
+
+# Validity period for the credentials cache. This cache is tightly coupled to
+# the provided PasswordAuthenticator implementation of IAuthenticator. If
+# another IAuthenticator implementation is configured, this cache will not
+# be automatically used, and so the following settings will have no effect.
+# Please note, credentials are cached in their encrypted form, so while
+# activating this cache may reduce the number of queries made to the
+# underlying table, it may not bring a significant reduction in the
+# latency of individual authentication attempts.
+# Defaults to 2000; set to 0 to disable credentials caching.
+# For a long-running cache using credentials_cache_active_update, consider
+# setting this to something longer, such as a daily validation: 86400000
+# Min unit: ms
+credentials_validity: 2000ms
+
+# Refresh interval for the credentials cache (if enabled).
+# After this interval, cache entries become eligible for refresh. Upon next
+# access, an async reload is scheduled and the old value returned until it
+# completes. If credentials_validity is non-zero, then this must be non-zero
+# as well.
+# This setting also informs the interval of auto-updating if
+# using credentials_cache_active_update.
+# Defaults to the same value as credentials_validity.
+# For a longer-running credentials cache, consider setting this to refresh hourly: 3600000ms
+# Min unit: ms
+# credentials_update_interval: 2000ms
+
+# If true, cache contents are actively updated by a background task at the
+# interval set by credentials_update_interval. If false (default), cache entries
+# become eligible for refresh after their update interval. Upon next access,
+# an async reload is scheduled and the old value returned until it completes.
+# credentials_cache_active_update: false
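Following the guidance above, a sketch of a long-running roles cache with active background updates; the same pattern applies to the permissions and credentials caches, and the intervals are illustrative:

    roles_validity: 86400000ms        # daily validation window
    roles_update_interval: 3600000ms  # hourly background refresh
    roles_cache_active_update: true   # refresh in the background instead of on access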
+# The partitioner is responsible for distributing groups of rows (by
+# partition key) across nodes in the cluster. The partitioner can NOT be
+# changed without reloading all data. If you are adding nodes or upgrading,
+# you should set this to the same partitioner that you are currently using.
+#
+# The default partitioner is the Murmur3Partitioner. Older partitioners
+# such as the RandomPartitioner, ByteOrderedPartitioner, and
+# OrderPreservingPartitioner have been included for backward compatibility only.
+# For new clusters, you should NOT change this value.
+#
+partitioner: org.apache.cassandra.dht.Murmur3Partitioner
+
+# Directories where Cassandra should store data on disk. If multiple
+# directories are specified, Cassandra will spread data evenly across
+# them by partitioning the token ranges.
+# If not set, the default directory is $CASSANDRA_HOME/data/data.
+# data_file_directories:
+#     - /var/lib/cassandra/data
+
+# Directory where Cassandra should store the data of the local system keyspaces.
+# By default Cassandra will store the data of the local system keyspaces in the first of the data directories
+# specified by data_file_directories.
+# This approach ensures that if one of the other disks is lost, Cassandra can continue to operate. For extra
+# security, this setting allows storing that data in a different directory that provides redundancy.
+# local_system_data_file_directory:
+
+# Commit log. When running on magnetic HDD, this should be on a
+# separate spindle from the data directories.
+# If not set, the default directory is $CASSANDRA_HOME/data/commitlog.
+# commitlog_directory: /var/lib/cassandra/commitlog
+
+# Enable / disable CDC functionality on a per-node basis. This modifies the logic used
+# for write path allocation rejection (standard: never reject; cdc: reject Mutations
+# containing a CDC-enabled table if at the space limit in cdc_raw_directory).
+cdc_enabled: false
+
+# Specify whether writes to CDC-enabled tables should be blocked when CDC data on disk has reached the limit.
+# When set to false, writes will not be blocked and the oldest CDC data on disk will be deleted to
+# ensure the size constraint. The default is true.
+# cdc_block_writes: true
+
+# Specify whether CDC mutations are replayed through the write path on streaming, e.g. repair.
+# When enabled, CDC data streamed to the destination node will be written into the commit log first. When set
+# to false, the streamed CDC data is written into SSTables just the same as normal streaming. The default is true.
+# If this is set to false, streaming will be considerably faster, but it is possible that, in extreme situations
+# (losing more than a quorum of nodes in a replica set), you may have data in your SSTables that never makes it
+# to the CDC log.
+# cdc_on_repair_enabled: true
+
+# CommitLogSegments are moved to this directory on flush if cdc_enabled: true and the
+# segment contains mutations for a CDC-enabled table. This should be placed on a
+# separate spindle from the data directories. If not set, the default directory is
+# $CASSANDRA_HOME/data/cdc_raw.
+# cdc_raw_directory: /var/lib/cassandra/cdc_raw
+
+# Policy for data disk failures:
+#
+# die
+#   shut down gossip and client transports and kill the JVM for any fs errors or
+#   single-sstable errors, so the node can be replaced.
+#
+# stop_paranoid
+#   shut down gossip and client transports even for single-sstable errors,
+#   kill the JVM for errors during startup.
+#
+# stop
+#   shut down gossip and client transports, leaving the node effectively dead, but
+#   it can still be inspected via JMX; kill the JVM for errors during startup.
+#
+# best_effort
+#   stop using the failed disk and respond to requests based on
+#   the remaining available sstables. This means you WILL see obsolete
+#   data at CL.ONE!
+#
+# ignore
+#   ignore fatal errors and let requests fail, as in pre-1.2 Cassandra
+disk_failure_policy: stop
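For a node that should capture change data, a hedged sketch of the CDC knobs above (the path is illustrative):

    cdc_enabled: true
    cdc_raw_directory: /var/lib/cassandra/cdc_raw   # ideally on its own spindle/volume
    cdc_block_writes: true    # block writes to CDC tables rather than delete old CDC data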
+# Policy for commit disk failures:
+#
+# die
+#   shut down the node and kill the JVM, so the node can be replaced.
+#
+# stop
+#   shut down the node, leaving the node effectively dead, but
+#   it can still be inspected via JMX.
+#
+# stop_commit
+#   shut down the commit log, letting writes collect but
+#   continuing to service reads, as in pre-2.0.5 Cassandra
+#
+# ignore
+#   ignore fatal errors and let the batches fail
+commit_failure_policy: stop
+
+# Maximum size of the native protocol prepared statement cache
+#
+# Valid values are either "auto" (omitting the value) or a value greater than 0.
+#
+# Note that specifying too large a value will result in long-running GCs and possibly
+# out-of-memory errors. Keep the value at a small fraction of the heap.
+#
+# If you constantly see "prepared statements discarded in the last minute because
+# cache limit reached" messages, the first step is to investigate the root cause
+# of these messages and check whether prepared statements are used correctly -
+# i.e. use bind markers for variable parts.
+#
+# Only change the default value if you really have more prepared statements than
+# fit in the cache. In most cases it is not necessary to change this value.
+# Constantly re-preparing statements is a performance penalty.
+#
+# Default value ("auto") is 1/256th of the heap or 10MiB, whichever is greater
+# Min unit: MiB
+prepared_statements_cache_size:
+
+# Maximum size of the key cache in memory.
+#
+# Each key cache hit saves 1 seek and each row cache hit saves 2 seeks at the
+# minimum, sometimes more. The key cache is fairly tiny for the amount of
+# time it saves, so it's worthwhile to use it at large numbers.
+# The row cache saves even more time, but must contain the entire row,
+# so it is extremely space-intensive. It's best to only use the
+# row cache if you have hot rows or static rows.
+#
+# NOTE: if you reduce the size, you may not get your hottest keys loaded on startup.
+#
+# Default value is empty to make it "auto" (min(5% of heap (in MiB), 100MiB)). Set to 0 to disable the key cache.
+#
+# This is only relevant to SSTable formats that use the key cache, e.g. BIG.
+# Min unit: MiB
+key_cache_size: 0MiB
+
+# Duration in seconds after which Cassandra should
+# save the key cache. Caches are saved to saved_caches_directory as
+# specified in this configuration file.
+#
+# Saved caches greatly improve cold-start speeds, and saving is relatively cheap in
+# terms of I/O for the key cache. Row cache saving is much more expensive and
+# has limited use.
+#
+# This is only relevant to SSTable formats that use the key cache, e.g. BIG.
+# Default is 14400 or 4 hours.
+# Min unit: s
+# key_cache_save_period: 4h
+
+# Number of keys from the key cache to save
+# Disabled by default, meaning all keys are going to be saved
+# This is only relevant to SSTable formats that use the key cache, e.g. BIG.
+# key_cache_keys_to_save: 100
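For a BIG-format workload that wants the key cache back, a sketch based on the options above (sizes illustrative):

    key_cache_size: 100MiB      # re-enable the key cache (0MiB disables it)
    key_cache_save_period: 4h   # persist it so restarts come back warm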
+# Row cache implementation class name. Available implementations:
+#
+# org.apache.cassandra.cache.OHCProvider
+#   Fully off-heap row cache implementation (default).
+#
+# org.apache.cassandra.cache.SerializingCacheProvider
+#   This is the row cache implementation available
+#   in previous releases of Cassandra.
+# row_cache_class_name: org.apache.cassandra.cache.OHCProvider
+
+# Maximum size of the row cache in memory.
+# Please note that the OHC cache implementation requires some additional off-heap memory to manage
+# the map structures, and some in-flight memory during operations before/after cache entries can be
+# accounted against the cache capacity. This overhead is usually small compared to the whole capacity.
+# Do not specify more memory than the system can afford in the worst usual situation, and leave some
+# headroom for the OS block-level cache. Never allow your system to swap.
+#
+# Default value is 0, to disable row caching.
+# Min unit: MiB
+row_cache_size: 0MiB
+
+# Duration in seconds after which Cassandra should save the row cache.
+# Caches are saved to saved_caches_directory as specified in this configuration file.
+#
+# Saved caches greatly improve cold-start speeds, and saving is relatively cheap in
+# terms of I/O for the key cache. Row cache saving is much more expensive and
+# has limited use.
+#
+# Default is 0 to disable saving the row cache.
+# Min unit: s
+row_cache_save_period: 0s
+
+# Number of keys from the row cache to save.
+# Specify 0 (which is the default), meaning all keys are going to be saved
+# row_cache_keys_to_save: 100
+
+# Maximum size of the counter cache in memory.
+#
+# The counter cache helps to reduce counter lock contention for hot counter cells.
+# In case of RF = 1, a counter cache hit will cause Cassandra to skip the read-before-
+# write entirely. With RF > 1, a counter cache hit will still help to reduce the duration
+# of the lock hold, helping with hot counter cell updates, but will not allow skipping
+# the read entirely. Only the local (clock, count) tuple of a counter cell is kept
+# in memory, not the whole counter, so it's relatively cheap.
+#
+# NOTE: if you reduce the size, you may not get your hottest keys loaded on startup.
+#
+# Default value is empty to make it "auto" (min(2.5% of heap (in MiB), 50MiB)). Set to 0 to disable the
+# counter cache.
+# NOTE: if you perform counter deletes and rely on low gc_grace_seconds, you should disable the counter cache.
+# Min unit: MiB
+counter_cache_size:
+
+# Duration in seconds after which Cassandra should
+# save the counter cache (keys only). Caches are saved to saved_caches_directory as
+# specified in this configuration file.
+#
+# Default is 7200 or 2 hours.
+# Min unit: s
+counter_cache_save_period: 7200s
+
+# Number of keys from the counter cache to save
+# Disabled by default, meaning all keys are going to be saved
+# counter_cache_keys_to_save: 100
+
+# saved caches
+# If not set, the default directory is $CASSANDRA_HOME/data/saved_caches.
+# saved_caches_directory: /var/lib/cassandra/saved_caches
+
+# Number of seconds the server will wait for each cache (row, key, etc ...) to load while starting
+# the Cassandra process. Setting this to zero is equivalent to disabling all cache loading on startup
+# while still having the cache during runtime.
+# Min unit: s
+# cache_load_timeout: 30s
+
+# commitlog_sync may be either "periodic", "group", or "batch".
+#
+# When in batch mode, Cassandra won't ack writes until the commit log
+# has been flushed to disk. Each incoming write will trigger the flush task.
+#
+# group mode is similar to batch mode, where Cassandra will not ack writes
+# until the commit log has been flushed to disk. The difference is group
+# mode will wait up to commitlog_sync_group_window between flushes.
+#
+# Min unit: ms
+# commitlog_sync_group_window: 1000ms
+#
+# the default option is "periodic", where writes may be acked immediately
+# and the CommitLog is simply synced every commitlog_sync_period
+# milliseconds.
+commitlog_sync: periodic
+# Min unit: ms
+commitlog_sync_period: 10000ms
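A sketch of trading a little write latency for tighter durability with group mode, per the description above (the window value is illustrative):

    commitlog_sync: group
    commitlog_sync_group_window: 15ms   # group fsyncs into 15 ms windows before acking writes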
+# When in periodic commitlog mode, the number of milliseconds to block writes
+# while waiting for a slow disk flush to complete.
+# Min unit: ms
+# periodic_commitlog_sync_lag_block:
+
+# The size of the individual commitlog file segments. A commitlog
+# segment may be archived, deleted, or recycled once all the data
+# in it (potentially from each columnfamily in the system) has been
+# flushed to sstables.
+#
+# The default size is 32MiB, which is almost always fine, but if you are
+# archiving commitlog segments (see commitlog_archiving.properties),
+# then you probably want a finer granularity of archiving; 8 or 16 MiB
+# is reasonable.
+# Max mutation size is also configurable via the max_mutation_size setting in
+# cassandra.yaml. The default is half of commitlog_segment_size in bytes.
+# This should be positive and less than 2048.
+#
+# NOTE: If max_mutation_size is set explicitly, then commitlog_segment_size must
+# be set to at least twice the size of max_mutation_size.
+#
+# Min unit: MiB
+commitlog_segment_size: 32MiB
+
+# Compression to apply to the commit log. If omitted, the commit log
+# will be written uncompressed. LZ4, Snappy, and Deflate compressors
+# are supported.
+# commitlog_compression:
+#   - class_name: LZ4Compressor
+#     parameters:
+#         -
+# TODO: Add commitlog_access_mode: direct
+
+
+# Compression to apply to SSTables as they flush for compressed tables.
+# Note that tables without compression enabled do not respect this flag.
+#
+# As high-ratio compressors like LZ4HC, Zstd, and Deflate can potentially
+# block flushes for too long, the default is to flush with a known fast
+# compressor in those cases. Options are:
+#
+# none : Flush without compressing blocks but while still doing checksums.
+# fast : Flush with a fast compressor. If the table is already using a
+#        fast compressor, that compressor is used.
+# table: Always flush with the same compressor that the table uses. This
+#        was the pre-4.0 behavior.
+#
+# flush_compression: fast
+
+# any class that implements the SeedProvider interface and has a
+# constructor that takes a Map<String, String> of parameters will do.
+seed_provider:
+    # Addresses of hosts that are deemed contact points.
+    # Cassandra nodes use this list of hosts to find each other and learn
+    # the topology of the ring. You must change this if you are running
+    # multiple nodes!
+    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
+      parameters:
+          # seeds is actually a comma-delimited list of addresses.
+          # Ex: "<ip1>,<ip2>,<ip3>"
+          - seeds: "127.0.0.1:7000"
+          # If set to "true", SimpleSeedProvider will return all IP addresses for a DNS name,
+          # based on the configured name service on the system. Defaults to "false".
+          # resolve_multiple_ip_addresses_per_dns_record: "false"
+
+# For workloads with more data than can fit in memory, Cassandra's
+# bottleneck will be reads that need to fetch data from
+# disk. "concurrent_reads" should be set to (16 * number_of_drives) in
+# order to allow the operations to enqueue low enough in the stack
+# that the OS and drives can reorder them. The same applies to
+# "concurrent_counter_writes", since counter writes read the current
+# values before incrementing and writing them back.
+#
+# On the other hand, since writes are almost never IO bound, the ideal
+# number of "concurrent_writes" is dependent on the number of cores in
+# your system; (8 * number_of_cores) is a good rule of thumb.
+concurrent_reads: 32
+concurrent_writes: 32
+concurrent_counter_writes: 32
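Applying the rules of thumb above to a hypothetical node with 2 data drives and 8 cores:

    concurrent_reads: 32             # 16 * 2 drives
    concurrent_writes: 64            # 8 * 8 cores
    concurrent_counter_writes: 32    # counter writes read first, so sized like reads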
+
+# For materialized view writes, as there is a read involved, this should
+# be limited by the lesser of concurrent reads or concurrent writes.
+concurrent_materialized_view_writes: 32
+
+# Maximum memory to use for inter-node and client-server networking buffers.
+#
+# Defaults to the smaller of 1/16 of heap or 128MB. This pool is allocated off-heap,
+# so it is in addition to the memory allocated for heap. The cache also has on-heap
+# overhead which is roughly 128 bytes per chunk (i.e. 0.2% of the reserved size
+# if the default 64k chunk size is used).
+# Memory is only allocated when needed.
+# Min unit: MiB
+# networking_cache_size: 128MiB
+
+# Enable the sstable chunk cache. The chunk cache will store recently accessed
+# sections of the sstable in memory as uncompressed buffers.
+# file_cache_enabled: false
+
+# Maximum memory to use for the sstable chunk cache and buffer pooling.
+# 32MB of this is reserved for pooling buffers; the rest is used for the chunk cache
+# that holds uncompressed sstable chunks.
+# Defaults to the smaller of 1/4 of heap or 512MB. This pool is allocated off-heap,
+# so it is in addition to the memory allocated for heap. The cache also has on-heap
+# overhead which is roughly 128 bytes per chunk (i.e. 0.2% of the reserved size
+# if the default 64k chunk size is used).
+# Memory is only allocated when needed.
+# Min unit: MiB
+# file_cache_size: 512MiB
+
+# Flag indicating whether to allocate on or off heap when the sstable buffer
+# pool is exhausted, that is when it has exceeded the maximum memory
+# file_cache_size, beyond which it will not cache buffers but allocate on request.
+
+# buffer_pool_use_heap_if_exhausted: true
+
+# The strategy for optimizing disk reads.
+# Possible values are:
+# ssd (for solid state disks, the default)
+# spinning (for spinning disks)
+# disk_optimization_strategy: ssd
+
+# Supported memtable implementations and selected default.
+# Currently Cassandra offers two memtable implementations:
+#  - SkipListMemtable is the legacy memtable implementation provided by earlier
+#    versions of Cassandra.
+#  - TrieMemtable is a new memtable that utilizes a trie data structure. This
+#    implementation significantly reduces garbage collection load by moving
+#    more of the sstable metadata off-heap, fits more data in the same allocation,
+#    and can reliably handle higher write throughput.
+#    Because the trie memtable is a sharded single-writer solution, it can perform
+#    worse when the load is very unevenly distributed, e.g. when most of the writes
+#    access a very small number of partitions, or with legacy secondary indexes.
+# The memtable implementation can be selected per table by setting the memtable
+# property in the table definition to one of the configurations specified below.
+# If the memtable property is not set, the "default" configuration will be used.
+# See src/java/org/apache/cassandra/db/memtable/Memtable_API.md for further
+# information.
+memtable:
+  configurations:
+    skiplist:
+      class_name: SkipListMemtable
+    trie:
+      class_name: TrieMemtable
+    default:
+      inherits: trie
+
+# Total permitted memory to use for memtables. Cassandra will stop
+# accepting writes when the limit is exceeded until a flush completes,
+# and will trigger a flush based on memtable_cleanup_threshold.
+# If omitted, Cassandra will set both to 1/4 the size of the heap.
+# Min unit: MiB
+# memtable_heap_space: 2048MiB
+# Min unit: MiB
+# memtable_offheap_space: 2048MiB
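A sketch of keeping the trie default while pointing tables whose writes concentrate on very few partitions at the skiplist implementation; per the Memtable_API.md reference above, a table selects a configuration through its memtable property (the table name is hypothetical):

    memtable:
      configurations:
        skiplist:
          class_name: SkipListMemtable   # for very unevenly distributed writes
        trie:
          class_name: TrieMemtable
        default:
          inherits: trie                 # everything else gets the trie memtable
    # per-table opt-out, e.g.: ALTER TABLE ks.hot_table WITH memtable = 'skiplist';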
+# memtable_cleanup_threshold is deprecated. The default calculation
+# is the only reasonable choice. See the comments on memtable_flush_writers
+# for more information.
+#
+# Ratio of occupied non-flushing memtable size to total permitted size
+# that will trigger a flush of the largest memtable. A larger value will
+# mean larger flushes and hence less compaction, but also less concurrent
+# flush activity, which can make it difficult to keep your disks fed
+# under heavy write load.
+#
+# memtable_cleanup_threshold defaults to 1 / (memtable_flush_writers + 1)
+# memtable_cleanup_threshold: 0.11
+
+# Specify the way Cassandra allocates and manages memtable memory.
+# Options are:
+#
+# heap_buffers
+#   on heap nio buffers
+#
+# offheap_buffers
+#   off heap (direct) nio buffers
+#
+# offheap_objects
+#   off heap objects
+memtable_allocation_type: offheap_objects
+
+# Limit memory usage for Merkle tree calculations during repairs. The default
+# is 1/16th of the available heap. The main tradeoff is that smaller trees
+# have less resolution, which can lead to over-streaming data. If you see heap
+# pressure during repairs, consider lowering this, but you cannot go below
+# one mebibyte. If you see lots of over-streaming, consider raising
+# this or using subrange repair.
+#
+# For more details see https://issues.apache.org/jira/browse/CASSANDRA-14096.
+#
+# Min unit: MiB
+# repair_session_space:
+
+# repair:
+#   # Configure the retries for each of the repair messages that support it. As of this moment, retries use an
+#   # exponential algorithm where each attempt sleeps longer based on the base_sleep_time and the attempt number.
+#   retries:
+#     max_attempts: 10
+#     base_sleep_time: 200ms
+#     max_sleep_time: 1s
+#     # Increase the timeout of validation responses due to them containing the merkle tree
+#     merkle_tree_response:
+#       base_sleep_time: 30s
+#       max_sleep_time: 1m
+
+# Total space to use for commit logs on disk.
+#
+# If space gets above this value, Cassandra will flush every dirty CF
+# in the oldest segment and remove it. So a small total commitlog space
+# will tend to cause more flush activity on less-active columnfamilies.
+#
+# The default value is the smaller of 8192MiB and 1/4 of the total space
+# of the commitlog volume.
+#
+# commitlog_total_space: 8192MiB
+
+# This sets the number of memtable flush writer threads per disk,
+# as well as the total number of memtables that can be flushed concurrently.
+# These are generally a combination of compute- and IO-bound work.
+#
+# Memtable flushing is more CPU efficient than memtable ingest, and a single thread
+# can keep up with the ingest rate of a whole server on a single fast disk
+# until it temporarily becomes IO bound under contention, typically with compaction.
+# At that point you need multiple flush threads. At some point in the future
+# it may become CPU bound all the time.
+#
+# You can tell if flushing is falling behind using the MemtablePool.BlockedOnAllocation
+# metric, which should be 0, but will be non-zero if threads are blocked waiting on flushing
+# to free memory.
+#
+# memtable_flush_writers defaults to two for a single data directory.
+# This means that two memtables can be flushed concurrently to the single data directory.
+# If you have multiple data directories, the default is one memtable flushing at a time,
+# but the flush will use a thread per data directory, so you will get two or more writers.
+#
+# Two is generally enough to flush on a fast disk [array] mounted as a single data directory.
+# Adding more flush writers will result in smaller, more frequent flushes that introduce more
+# compaction overhead.
+#
+# There is a direct tradeoff between the number of memtables that can be flushed concurrently
+# and flush size and frequency. More is not better; you just need enough flush writers
+# to never stall waiting for flushing to free memory.
+#
+# memtable_flush_writers: 2
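As a worked example of the default formula above: with memtable_flush_writers: 2, memtable_cleanup_threshold = 1 / (2 + 1) ≈ 0.33, i.e. a flush of the largest memtable triggers once non-flushing memtables occupy about a third of the permitted memtable space. A node with several data directories that wants more concurrent flushing might sketch:

    memtable_flush_writers: 4
    # implied memtable_cleanup_threshold: 1 / (4 + 1) = 0.2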
+# Total space to use for change-data-capture logs on disk.
+#
+# If space gets above this value, Cassandra will throw WriteTimeoutException
+# on Mutations including tables with CDC enabled. A CDCCompactor is responsible
+# for parsing the raw CDC logs and deleting them when parsing is completed.
+#
+# The default value is the min of 4096 MiB and 1/8th of the total space
+# of the drive where cdc_raw_directory resides.
+# Min unit: MiB
+# cdc_total_space: 4096MiB
+
+# When we hit our cdc_raw limit and the CDCCompactor is either running behind
+# or experiencing backpressure, we check at the following interval to see if any
+# new space for cdc-tracked tables has been made available. Defaults to 250ms.
+# Min unit: ms
+# cdc_free_space_check_interval: 250ms
+
+# Whether to, when doing sequential writing, fsync() at intervals in
+# order to force the operating system to flush the dirty
+# buffers. Enable this to avoid sudden dirty buffer flushing from
+# impacting read latencies. Almost always a good idea on SSDs; not
+# necessarily on platters.
+trickle_fsync: false

Review Comment:
   Done

