I have compiled this table (1). Anybody can edit it.

The idea is that if you think a property should be exposed in
cassandra.yaml, you mark it green and add a comment with some notes. It
would be good to do this together, because the people who introduced these
properties are the most likely to know whether a property was left out of
cassandra.yaml on purpose or was simply forgotten.

If you think that some property should not be in cassandra.yaml, just mark
it red.

I went through the table very briefly and marked only the very obvious
ones green or red; I am not sure about the rest. Yellow is somewhere in
between.
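
Once the table settles, the red entries could be fed to the check as an
exclusion list. What follows is only a minimal sketch under assumptions, not
the check from the draft PR: YamlCoverageCheck, SampleConfig and the
"excluded" parameter are hypothetical names, and SampleConfig merely stands
in for Config.java. It accepts the same three line forms as the primitive
implementation quoted below ("name:", "#name:" and "# name:").

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

final class YamlCoverageCheck
{
    // Hypothetical stand-in for Config.java, only here to keep the sketch self-contained.
    static class SampleConfig
    {
        public static final String IGNORED_CONSTANT = "static fields are skipped";
        public boolean my_cool_property = true;
        public int forgotten_property = 1;
        public int internal_only_property = 2;
    }

    // True if some line starts with "name:", "#name:" or "# name:".
    static boolean isMentioned(String name, List<String> yamlLines)
    {
        Pattern pattern = Pattern.compile("^(# ?)?" + Pattern.quote(name) + ":");
        for (String line : yamlLines)
            if (pattern.matcher(line).find())
                return true;
        return false;
    }

    // Public non-static fields that are neither excluded nor mentioned in the yaml lines.
    // A sorted set is returned because reflection does not guarantee field order.
    static SortedSet<String> missing(Class<?> configClass, List<String> yamlLines, Set<String> excluded)
    {
        SortedSet<String> result = new TreeSet<>();
        for (Field field : configClass.getFields())
        {
            String name = field.getName();
            if (Modifier.isStatic(field.getModifiers()) || excluded.contains(name))
                continue;
            if (!isMentioned(name, yamlLines))
                result.add(name);
        }
        return result;
    }
}
```

With yaml lines containing only "my_cool_property: true" and
internal_only_property in the exclusion set, missing(...) would report just
forgotten_property.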

(1)
https://docs.google.com/spreadsheets/d/11MOxhNqwE1tWP4ex2gzKG2pmeAWFaHDKo-CRp25h9BU/edit?usp=sharing
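
The reverse direction raised further down the thread (a property name that
is still in the yaml, possibly commented out, but no longer in Config.java)
could be sketched like this. Everything here is an assumption for
illustration: YamlDeadPropertyCheck is a hypothetical name, and the regular
expression is a heuristic that treats any unindented lowercase "key:" as a
property, so a prose comment such as "# note: ..." would be a false
positive.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class YamlDeadPropertyCheck
{
    // Heuristic: an unindented key, optionally prefixed by "#" or "# ".
    private static final Pattern TOP_LEVEL_KEY = Pattern.compile("^(# ?)?([a-z][a-z0-9_]*):");

    // Collects names that look like top-level properties, commented out or not.
    static SortedSet<String> yamlPropertyNames(List<String> yamlLines)
    {
        SortedSet<String> names = new TreeSet<>();
        for (String line : yamlLines)
        {
            Matcher matcher = TOP_LEVEL_KEY.matcher(line);
            if (matcher.find())
                names.add(matcher.group(2));
        }
        return names;
    }

    // Names present in the yaml that have no counterpart among the Config field names.
    static SortedSet<String> unknownInYaml(List<String> yamlLines, Set<String> configNames)
    {
        SortedSet<String> unknown = yamlPropertyNames(yamlLines);
        unknown.removeAll(configNames);
        return unknown;
    }
}
```

The set of Config field names would come from the same reflection loop as in
the primitive implementation quoted below.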

On Sun, Jan 26, 2025 at 2:09 PM Dmitry Konstantinov <netud...@gmail.com>
wrote:

> >  I think we need to integrate this into some ant target. If you expanded
> on this, that would be great.
>
> A draft version of the ant target (for now it is configured as
> non-failing and is not attached to the usual build process):
> https://github.com/apache/cassandra/pull/3830/files
>
> On Fri, 24 Jan 2025 at 20:24, Štefan Miklošovič <smikloso...@apache.org>
> wrote:
>
>> How are we going to document what each of the 112 missing properties does
>> and / or exclude them from cassandra.yaml? For a lot of these properties
>> it is not obvious what exactly they are for. I think we should create a
>> basic table that documents what each property is for and what the
>> decision is about adding it to yaml or not.
>>
>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com>
>> wrote:
>>
>>> A very primitive implementation of the 1st idea below:
>>>
>>> String configUrl = 
>>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
>>> Field[] allFields = Config.class.getFields();
>>> List<String> topLevelPropertyNames = new ArrayList<>();
>>> for(Field field : allFields)
>>> {
>>>     if (!Modifier.isStatic(field.getModifiers()))
>>>     {
>>>         topLevelPropertyNames.add(field.getName());
>>>     }
>>> }
>>>
>>> URL url = new URL(configUrl);
>>> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
>>>
>>> int missedCount = 0;
>>> for (String propertyName : topLevelPropertyNames)
>>> {
>>>     boolean found = false;
>>>     for (String line : lines)
>>>     {
>>>         if (line.startsWith(propertyName + ":")
>>>             || line.startsWith("#" + propertyName + ":")
>>>             || line.startsWith("# " + propertyName + ":")) {
>>>             found = true;
>>>             break;
>>>         }
>>>     }
>>>     if (!found)
>>>     {
>>>         missedCount++;
>>>         System.out.println(propertyName);
>>>     }
>>> }
>>> System.out.println("Total missed:" + missedCount);
>>>
>>>
>>> It prints the following config property names, which are defined in
>>> Config.java but not present as "property" or "# property" in the file:
>>>
>>> permissions_cache_max_entries
>>> roles_cache_max_entries
>>> credentials_cache_max_entries
>>> auto_bootstrap
>>> force_new_prepared_statement_behaviour
>>> use_deterministic_table_id
>>> repair_request_timeout
>>> stream_transfer_task_timeout
>>> cms_await_timeout
>>> cms_default_max_retries
>>> cms_default_retry_backoff
>>> epoch_aware_debounce_inflight_tracker_max_size
>>> metadata_snapshot_frequency
>>> available_processors
>>> repair_session_max_tree_depth
>>> use_offheap_merkle_trees
>>> internode_max_message_size
>>> native_transport_max_message_size
>>> native_transport_max_request_data_in_flight_per_ip
>>> native_transport_max_request_data_in_flight
>>> native_transport_receive_queue_capacity
>>> min_free_space_per_drive
>>> max_space_usable_for_compactions_in_percentage
>>> reject_repair_compaction_threshold
>>> concurrent_index_builders
>>> max_streaming_retries
>>> commitlog_max_compression_buffers_in_pool
>>> max_mutation_size
>>> dynamic_snitch
>>> failure_detector
>>> use_creation_time_for_hint_ttl
>>> key_cache_migrate_during_compaction
>>> key_cache_invalidate_after_sstable_deletion
>>> paxos_cache_size
>>> file_cache_round_up
>>> disk_optimization_estimate_percentile
>>> disk_optimization_page_cross_chance
>>> purgeable_tobmstones_metric_granularity
>>> windows_timer_interval
>>> otc_coalescing_strategy
>>> otc_coalescing_window_us
>>> otc_coalescing_enough_coalesced_messages
>>> otc_backlog_expiration_interval_ms
>>> scripted_user_defined_functions_enabled
>>> user_defined_functions_threads_enabled
>>> allow_insecure_udfs
>>> allow_extra_insecure_udfs
>>> user_defined_functions_warn_timeout
>>> user_defined_functions_fail_timeout
>>> user_function_timeout_policy
>>> back_pressure_enabled
>>> back_pressure_strategy
>>> repair_command_pool_full_strategy
>>> repair_command_pool_size
>>> block_for_peers_timeout_in_secs
>>> block_for_peers_in_remote_dcs
>>> skip_stream_disk_space_check
>>> snapshot_on_repaired_data_mismatch
>>> validation_preview_purge_head_start
>>> initial_range_tombstone_list_allocation_size
>>> range_tombstone_list_growth_factor
>>> snapshot_on_duplicate_row_detection
>>> check_for_duplicate_rows_during_reads
>>> check_for_duplicate_rows_during_compaction
>>> autocompaction_on_startup_enabled
>>> auto_optimise_inc_repair_streams
>>> auto_optimise_full_repair_streams
>>> auto_optimise_preview_repair_streams
>>> consecutive_message_errors_threshold
>>> internode_error_reporting_exclusions
>>> compact_tables_enabled
>>> vector_type_enabled
>>> intersect_filtering_query_warned
>>> intersect_filtering_query_enabled
>>> streaming_slow_events_log_timeout
>>> repair_state_expires
>>> repair_state_size
>>> paxos_variant
>>> skip_paxos_repair_on_topology_change
>>> paxos_purge_grace_period
>>> paxos_on_linearizability_violations
>>> paxos_state_purging
>>> paxos_repair_enabled
>>> paxos_topology_repair_no_dc_checks
>>> paxos_topology_repair_strict_each_quorum
>>> skip_paxos_repair_on_topology_change_keyspaces
>>> paxos_contention_wait_randomizer
>>> paxos_contention_min_wait
>>> paxos_contention_max_wait
>>> paxos_contention_min_delta
>>> paxos_repair_parallelism
>>> sstable_read_rate_persistence_enabled
>>> client_request_size_metrics_enabled
>>> max_top_size_partition_count
>>> max_top_tombstone_partition_count
>>> min_tracked_partition_size
>>> min_tracked_partition_tombstone_count
>>> top_partitions_enabled
>>> severity_during_decommission
>>> progress_barrier_min_consistency_level
>>> progress_barrier_default_consistency_level
>>> progress_barrier_timeout
>>> progress_barrier_backoff
>>> discovery_timeout
>>> unsafe_tcm_mode
>>> cql_start_time
>>> native_transport_throw_on_overload
>>> native_transport_queue_max_item_age_threshold
>>> native_transport_min_backoff_on_queue_overload
>>> native_transport_max_backoff_on_queue_overload
>>> native_transport_timeout
>>> enforce_native_deadline_for_hints
>>> Total missed:112
>>>
>>>
>>>
>>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org>
>>> wrote:
>>>
>>>> It should also work the other way around. If there is a property which
>>>> is commented out in yaml and it is not in Config.java, the check should
>>>> fail as well. If it is not commented out and it is not in Config.java,
>>>> startup will fail at runtime because of the unrecognized property.
>>>>
>>>> This will be used in practice very rarely as we seldom remove the
>>>> properties in Config but if we do and a property is commented out, we
>>>> should not ship a dead property name, even commented out.
>>>>
>>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org> wrote:
>>>>
>>>>> >  >  If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>>>> indeed add it, also commented out. I think it would be quite easy to check
>>>>> against yaml if there is a line starting with "# my_cool_property" or
>>>>> just with "my_cool_property". Both cases would satisfy the check.
>>>>>
>>>>> Makes sense, I think this would be good to have as a lint or test to
>>>>> easily catch oversights during review.
>>>>>
>>>>>
>>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič <
>>>>> smikloso...@apache.org> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org> wrote:
>>>>>>
>>>>>>> > from time to time I see configuration properties in Config.java
>>>>>>> and they are clearly not in cassandra.yaml. Not every property in
>>>>>>> Config is in cassandra.yaml. I would like to know if there is some
>>>>>>> specific reason behind that.
>>>>>>>
>>>>>>> I think one of the original reasons was to "hide" advanced configs
>>>>>>> that are not meant to be updated, unless in very niche circumstances.
>>>>>>> However I think this has been extrapolated to non-advanced settings.
>>>>>>>
>>>>>>> > Question related to that is if we could not have a build-time
>>>>>>> check that all properties in Config have to be in cassandra.yaml and
>>>>>>> fail the build if a property in Config does not have its counterpart
>>>>>>> in yaml.
>>>>>>>
>>>>>>> Are you saying every configuration property should be commented out,
>>>>>>> or do you think that every Config property should be specified in
>>>>>>> cassandra.yaml, uncommented, with its default value? One issue with
>>>>>>> that is that you could cause user confusion if you "reveal" a
>>>>>>> niche/advanced config that is not meant to be updated. I think this
>>>>>>> would be addressed by the @HiddenInYaml flag you are proposing in a
>>>>>>> later post.
>>>>>>>
>>>>>>
>>>>>> Yes, they can stay hidden, but we should annotate them with @Hidden or
>>>>>> similar. As of now, if a property is not in yaml, we just don't know
>>>>>> if it was forgotten or if we left it out on purpose.
>>>>>>
>>>>>> They can keep being commented out if they currently are. Imagine a
>>>>>> property in Config.java
>>>>>>
>>>>>> public boolean my_cool_property = true;
>>>>>>
>>>>>> and then this in cassandra.yaml
>>>>>>
>>>>>> # my_cool_property: true
>>>>>>
>>>>>> It is completely ok.
>>>>>>
>>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>>>>> indeed add it, also commented out. I think it would be quite easy to
>>>>>> check against yaml if there is a line starting with
>>>>>> "# my_cool_property" or just with "my_cool_property". Both cases would
>>>>>> satisfy the check.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> > There are dozens of properties in Config and I have a strong
>>>>>>> suspicion that we failed to publish some to yaml so users do not even
>>>>>>> know such a property exists and as of now we do not even know which
>>>>>>> they are.
>>>>>>>
>>>>>>> I believe this is a problem. I think most properties should be in
>>>>>>> cassandra.yaml, unless they are very advanced or not meant to be
>>>>>>> updated.
>>>>>>>
>>>>>>> Another tangential issue is that there are features/settings that
>>>>>>> don't even have a Config entry, but are just controlled by JVM
>>>>>>> properties.
>>>>>>>
>>>>>>> I think that we should attempt to unify Config and JVM properties
>>>>>>> under a predictable structure. For example, if there is a YAML config
>>>>>>> enable_user_defined_functions, then there should be a respective JVM
>>>>>>> flag -Dcassandra.enable_user_defined_functions, and vice versa.
>>>>>>>
>>>>>>
>>>>>> Yeah, good idea.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič <
>>>>>>> smikloso...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> from time to time I see configuration properties in Config.java and
>>>>>>>> they are clearly not in cassandra.yaml. Not every property in Config
>>>>>>>> is in cassandra.yaml. I would like to know if there is some specific
>>>>>>>> reason behind that.
>>>>>>>>
>>>>>>>> Question related to that is if we could not have a build-time check
>>>>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>>>>
>>>>>>>> There are dozens of properties in Config and I have a strong
>>>>>>>> suspicion that we failed to publish some to yaml so users do not even
>>>>>>>> know such a property exists and as of now we do not even know which
>>>>>>>> they are.
>>>>>>>>
>>>>>>>
>>>
>>> --
>>> Dmitry Konstantinov
>>>
>>
>
> --
> Dmitry Konstantinov
>
