>  I think we need to integrate this into some ant target. If you expanded on
> this, that would be great.

A draft version of the ant target (for now it is configured as non-failing
and is not attached to the usual build process):
https://github.com/apache/cassandra/pull/3830/files
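
To illustrate what "non-failing" could mean for such a target, here is a minimal sketch. It is not the code from the pull request above: the class name YamlCoverageCheck, the SampleConfig stand-in for Config.java, and the failOnMissing switch are all hypothetical names chosen for this example. The matching rule is the same line-prefix rule as in the snippet quoted below.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class YamlCoverageCheck
{
    // Hypothetical stand-in for Config.java so the sketch runs on its own.
    static class SampleConfig
    {
        public static final String SOME_CONSTANT = "ignored";
        public boolean my_cool_property = true;
        public int undocumented_property = 1;
    }

    // Names of public non-static fields that have no "name:", "#name:"
    // or "# name:" line in the given yaml lines.
    static List<String> missingProperties(Class<?> configClass, List<String> yamlLines)
    {
        List<String> missing = new ArrayList<>();
        for (Field field : configClass.getFields())
        {
            if (Modifier.isStatic(field.getModifiers()))
                continue;
            String key = field.getName() + ":";
            boolean found = false;
            for (String line : yamlLines)
            {
                if (line.startsWith(key) || line.startsWith("#" + key) || line.startsWith("# " + key))
                {
                    found = true;
                    break;
                }
            }
            if (!found)
                missing.add(field.getName());
        }
        Collections.sort(missing); // getFields() makes no ordering guarantee
        return missing;
    }

    // Non-failing mode only reports; failing mode yields a non-zero exit code.
    static int exitCode(List<String> missing, boolean failOnMissing)
    {
        return failOnMissing && !missing.isEmpty() ? 1 : 0;
    }
}
```

A build target could print the list returned by missingProperties and pass exitCode(...) to System.exit, flipping failOnMissing once the existing gaps are documented or excluded.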

On Fri, 24 Jan 2025 at 20:24, Štefan Miklošovič <smikloso...@apache.org>
wrote:

> How are we going to document what each of the 112 missing properties does
> and / or exclude them from cassandra.yaml? There are a lot of properties
> whose purpose is not obvious from the name. I think we should create a
> basic table documenting what each one is for and what the decision is
> about adding that property to yaml or not.
>
> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com>
> wrote:
>
>> A very primitive implementation of the 1st idea below:
>>
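>> import java.lang.reflect.Field;
>> import java.lang.reflect.Modifier;
>> import java.net.URL;
>> import java.nio.file.Files;
>> import java.nio.file.Paths;
>> import java.util.ArrayList;
>> import java.util.List;
>> import org.apache.cassandra.config.Config;
>>
>> // The statements below are meant to run inside a method that may throw.
>> String configUrl =
>>     "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
>>
>> // Every public non-static field of Config is a top-level property.
>> Field[] allFields = Config.class.getFields();
>> List<String> topLevelPropertyNames = new ArrayList<>();
>> for (Field field : allFields)
>> {
>>     if (!Modifier.isStatic(field.getModifiers()))
>>     {
>>         topLevelPropertyNames.add(field.getName());
>>     }
>> }
>>
>> URL url = new URL(configUrl);
>> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
>>
>> // A property counts as present if some line starts with "name:",
>> // "#name:" or "# name:"; indented occurrences are not matched.
>> int missedCount = 0;
>> for (String propertyName : topLevelPropertyNames)
>> {
>>     boolean found = false;
>>     for (String line : lines)
>>     {
>>         if (line.startsWith(propertyName + ":")
>>             || line.startsWith("#" + propertyName + ":")
>>             || line.startsWith("# " + propertyName + ":"))
>>         {
>>             found = true;
>>             break;
>>         }
>>     }
>>     if (!found)
>>     {
>>         missedCount++;
>>         System.out.println(propertyName);
>>     }
>> }
>> System.out.println("Total missed:" + missedCount);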
>>
>>
>> It prints the following config property names, which are defined in
>> Config.java but do not appear in the file at the start of a line as
>> "property:", "#property:" or "# property:":
>>
>> permissions_cache_max_entries
>> roles_cache_max_entries
>> credentials_cache_max_entries
>> auto_bootstrap
>> force_new_prepared_statement_behaviour
>> use_deterministic_table_id
>> repair_request_timeout
>> stream_transfer_task_timeout
>> cms_await_timeout
>> cms_default_max_retries
>> cms_default_retry_backoff
>> epoch_aware_debounce_inflight_tracker_max_size
>> metadata_snapshot_frequency
>> available_processors
>> repair_session_max_tree_depth
>> use_offheap_merkle_trees
>> internode_max_message_size
>> native_transport_max_message_size
>> native_transport_max_request_data_in_flight_per_ip
>> native_transport_max_request_data_in_flight
>> native_transport_receive_queue_capacity
>> min_free_space_per_drive
>> max_space_usable_for_compactions_in_percentage
>> reject_repair_compaction_threshold
>> concurrent_index_builders
>> max_streaming_retries
>> commitlog_max_compression_buffers_in_pool
>> max_mutation_size
>> dynamic_snitch
>> failure_detector
>> use_creation_time_for_hint_ttl
>> key_cache_migrate_during_compaction
>> key_cache_invalidate_after_sstable_deletion
>> paxos_cache_size
>> file_cache_round_up
>> disk_optimization_estimate_percentile
>> disk_optimization_page_cross_chance
>> purgeable_tobmstones_metric_granularity
>> windows_timer_interval
>> otc_coalescing_strategy
>> otc_coalescing_window_us
>> otc_coalescing_enough_coalesced_messages
>> otc_backlog_expiration_interval_ms
>> scripted_user_defined_functions_enabled
>> user_defined_functions_threads_enabled
>> allow_insecure_udfs
>> allow_extra_insecure_udfs
>> user_defined_functions_warn_timeout
>> user_defined_functions_fail_timeout
>> user_function_timeout_policy
>> back_pressure_enabled
>> back_pressure_strategy
>> repair_command_pool_full_strategy
>> repair_command_pool_size
>> block_for_peers_timeout_in_secs
>> block_for_peers_in_remote_dcs
>> skip_stream_disk_space_check
>> snapshot_on_repaired_data_mismatch
>> validation_preview_purge_head_start
>> initial_range_tombstone_list_allocation_size
>> range_tombstone_list_growth_factor
>> snapshot_on_duplicate_row_detection
>> check_for_duplicate_rows_during_reads
>> check_for_duplicate_rows_during_compaction
>> autocompaction_on_startup_enabled
>> auto_optimise_inc_repair_streams
>> auto_optimise_full_repair_streams
>> auto_optimise_preview_repair_streams
>> consecutive_message_errors_threshold
>> internode_error_reporting_exclusions
>> compact_tables_enabled
>> vector_type_enabled
>> intersect_filtering_query_warned
>> intersect_filtering_query_enabled
>> streaming_slow_events_log_timeout
>> repair_state_expires
>> repair_state_size
>> paxos_variant
>> skip_paxos_repair_on_topology_change
>> paxos_purge_grace_period
>> paxos_on_linearizability_violations
>> paxos_state_purging
>> paxos_repair_enabled
>> paxos_topology_repair_no_dc_checks
>> paxos_topology_repair_strict_each_quorum
>> skip_paxos_repair_on_topology_change_keyspaces
>> paxos_contention_wait_randomizer
>> paxos_contention_min_wait
>> paxos_contention_max_wait
>> paxos_contention_min_delta
>> paxos_repair_parallelism
>> sstable_read_rate_persistence_enabled
>> client_request_size_metrics_enabled
>> max_top_size_partition_count
>> max_top_tombstone_partition_count
>> min_tracked_partition_size
>> min_tracked_partition_tombstone_count
>> top_partitions_enabled
>> severity_during_decommission
>> progress_barrier_min_consistency_level
>> progress_barrier_default_consistency_level
>> progress_barrier_timeout
>> progress_barrier_backoff
>> discovery_timeout
>> unsafe_tcm_mode
>> cql_start_time
>> native_transport_throw_on_overload
>> native_transport_queue_max_item_age_threshold
>> native_transport_min_backoff_on_queue_overload
>> native_transport_max_backoff_on_queue_overload
>> native_transport_timeout
>> enforce_native_deadline_for_hints
>> Total missed:112
>>
>>
>>
>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org>
>> wrote:
>>
>>> It should also work the other way around. If there is a property which
>>> is commented out in yaml and it is not in Config.java, that should fail as
>>> well. If it is not commented out and it is not in Config.java, that will
>>> fail at runtime as it fails on an unrecognized property.
>>>
>>> This will be used in practice very rarely as we seldom remove the
>>> properties in Config but if we do and a property is commented out, we
>>> should not ship a dead property name, even commented out.
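
The reverse check described above could be sketched as follows. This is only an illustration under stated assumptions: the class name DeadYamlProperties and the regular expression are hypothetical, and the pattern assumes that a commented-out property looks like "# name:" or "#name:" at the start of a line with a lowercase snake_case name. Ordinary comment prose that happens to match that shape (a single lowercase word followed by a colon) would be reported as a false positive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DeadYamlProperties
{
    // "#name:" or "# name:" at the start of a line, name in lowercase snake_case.
    private static final Pattern COMMENTED_KEY = Pattern.compile("^# ?([a-z][a-z0-9_]*):");

    // Returns commented-out yaml keys that have no matching Config field name.
    static List<String> find(List<String> yamlLines, Set<String> configFieldNames)
    {
        List<String> dead = new ArrayList<>();
        for (String line : yamlLines)
        {
            Matcher matcher = COMMENTED_KEY.matcher(line);
            if (matcher.find() && !configFieldNames.contains(matcher.group(1)))
                dead.add(matcher.group(1));
        }
        return dead;
    }
}
```

The set of field names would come from the same reflection pass over Config that the forward check uses.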
>>>
>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org> wrote:
>>>
>>>> >  >  If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>>> indeed add it, also commented out. I think it would be quite easy to check
>>>> against yaml if there is a line starting with "# my_cool_property" or just
>>>> with "my_cool_property". Both cases would satisfy the check.
>>>>
>>>> Makes sense, I think this would be good to have as a lint or test to
>>>> easily catch oversights during review.
>>>>
>>>>
>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič <
>>>> smikloso...@apache.org> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org> wrote:
>>>>>
>>>>>> > from time to time I see configuration properties in Config.java and
>>>>>> they are clearly not in cassandra.yaml. Not every property in Config is
>>>>>> in cassandra.yaml. I would like to know if there is some specific reason
>>>>>> behind that.
>>>>>>
>>>>>> I think one of the original reasons was to "hide" advanced configs
>>>>>> that are not meant to be updated, unless in very niche circumstances.
>>>>>> However I think this has been extrapolated to non-advanced settings.
>>>>>>
>>>>>> > A related question is whether we could have a build-time check
>>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>>
>>>>>> Are you saying every configuration property should be commented out,
>>>>>> or do you think that every Config property should be specified in
>>>>>> cassandra.yaml with its default, uncommented? One issue with that is
>>>>>> that you could cause user confusion if you "reveal" a niche/advanced
>>>>>> config that is not meant to be updated. I think this would be addressed
>>>>>> by the @HiddenInYaml flag you are proposing in a later post.
>>>>>>
>>>>>
>>>>> Yes, they can stay hidden, but we should annotate them with @Hidden or
>>>>> similar. As of now, if a property is not in yaml, we just don't know
>>>>> whether it was forgotten or left out on purpose.
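
A minimal sketch of how such a marker could work, assuming a runtime-retained field annotation. Neither the annotation nor the helper below exists in the thread or in any code it references: HiddenInYaml, HiddenInYamlSketch, SampleConfig and propertiesRequiredInYaml are hypothetical names used only for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class HiddenInYamlSketch
{
    // Hypothetical marker: the property is left out of the yaml on purpose.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface HiddenInYaml {}

    // Hypothetical stand-in for Config.java.
    static class SampleConfig
    {
        public boolean my_cool_property = true;

        @HiddenInYaml
        public int niche_internal_tunable = 42;
    }

    // Public non-static fields that a yaml check should still look for.
    static List<String> propertiesRequiredInYaml(Class<?> configClass)
    {
        List<String> required = new ArrayList<>();
        for (Field field : configClass.getFields())
        {
            if (Modifier.isStatic(field.getModifiers()))
                continue;
            if (field.isAnnotationPresent(HiddenInYaml.class))
                continue; // deliberately hidden, so not an omission
            required.add(field.getName());
        }
        Collections.sort(required); // getFields() makes no ordering guarantee
        return required;
    }
}
```

With a marker like this, a property missing from the yaml and lacking the annotation is unambiguous: it was forgotten.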
>>>>>
>>>>> They can keep being commented out if they currently are. Imagine a
>>>>> property in Config.java
>>>>>
>>>>> public boolean my_cool_property = true;
>>>>>
>>>>> and then this in cassandra.yaml
>>>>>
>>>>> # my_cool_property: true
>>>>>
>>>>> It is completely ok.
>>>>>
>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>>>> indeed add it, also commented out. I think it would be quite easy to check
>>>>> against yaml if there is a line starting with "# my_cool_property" or just
>>>>> with "my_cool_property". Both cases would satisfy the check.
>>>>>
>>>>>
>>>>>
>>>>>> > There are dozens of properties in Config and I have a strong
>>>>>> suspicion that we forgot to publish some to yaml, so users do not even
>>>>>> know such a property exists, and as of now we do not even know which
>>>>>> ones they are.
>>>>>>
>>>>>> I believe this is a problem. I think most properties should be in
>>>>>> cassandra.yaml, unless they are very advanced or not meant to be updated.
>>>>>>
>>>>>> Another tangential issue is that there are features/settings that
>>>>>> don't even have a Config entry, but are just controlled by JVM
>>>>>> properties.
>>>>>>
>>>>>> I think that we should attempt to unify Config and jvm properties
>>>>>> under a predictable structure. For example, if there is a YAML config
>>>>>> enable_user_defined_functions, then there should be a respective JVM flag
>>>>>> -Dcassandra.enable_user_defined_functions, and vice versa.
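
The naming rule proposed above can be sketched like this. The class and method names are hypothetical, and the precedence shown in effectiveValue (a JVM system property overriding the yaml value) is an assumption made for the example; the thread does not say which source should win.

```java
final class UnifiedPropertyNames
{
    static final String PREFIX = "cassandra.";

    // enable_user_defined_functions -> cassandra.enable_user_defined_functions
    static String jvmPropertyName(String yamlName)
    {
        return PREFIX + yamlName;
    }

    // cassandra.enable_user_defined_functions -> enable_user_defined_functions
    static String yamlName(String jvmPropertyName)
    {
        if (!jvmPropertyName.startsWith(PREFIX))
            throw new IllegalArgumentException("Not a " + PREFIX + " property: " + jvmPropertyName);
        return jvmPropertyName.substring(PREFIX.length());
    }

    // Assumed precedence: -Dcassandra.<name> wins over the yaml value.
    static String effectiveValue(String yamlName, String yamlValue)
    {
        return System.getProperty(jvmPropertyName(yamlName), yamlValue);
    }
}
```

For example, starting the JVM with -Dcassandra.enable_user_defined_functions=true would make effectiveValue("enable_user_defined_functions", "false") return "true" under this assumed precedence.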
>>>>>>
>>>>>
>>>>> Yeah, good idea.
>>>>>
>>>>>
>>>>>>
>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič <
>>>>>> smikloso...@apache.org> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> from time to time I see configuration properties in Config.java and
>>>>>>> they are clearly not in cassandra.yaml. Not every property in Config is
>>>>>>> in cassandra.yaml. I would like to know if there is some specific reason
>>>>>>> behind that.
>>>>>>>
>>>>>>> A related question is whether we could have a build-time check
>>>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>>>
>>>>>>> There are dozens of properties in Config and I have a strong
>>>>>>> suspicion that we forgot to publish some to yaml, so users do not even
>>>>>>> know such a property exists, and as of now we do not even know which
>>>>>>> ones they are.
>>>>>>>
>>>>>>
>>
>> --
>> Dmitry Konstantinov
>>
>

-- 
Dmitry Konstantinov
