> I think we need to integrate this to some ant target. If you expanded on this, that would be great.

A draft version of the ant target (as of now it is configured not to fail the build and is not attached to the usual build process):
https://github.com/apache/cassandra/pull/3830/files

On Fri, 24 Jan 2025 at 20:24, Štefan Miklošovič <smikloso...@apache.org> wrote:

> How are we going to document what each of 112 missing properties is doing
> and / or exclude them from cassandra.yaml? There are a lot of properties
> which just don't ring a bell exactly what they are for. I think we should
> create a basic table and document what each is for and what is the decision
> about adding that property to yaml or not.
>
> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com>
> wrote:
>
>> A very primitive implementation of the 1st idea below:
>>
>> String configUrl =
>>     "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
>> Field[] allFields = Config.class.getFields();
>> List<String> topLevelPropertyNames = new ArrayList<>();
>> for (Field field : allFields)
>> {
>>     if (!Modifier.isStatic(field.getModifiers()))
>>     {
>>         topLevelPropertyNames.add(field.getName());
>>     }
>> }
>>
>> URL url = new URL(configUrl);
>> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
>>
>> int missedCount = 0;
>> for (String propertyName : topLevelPropertyNames)
>> {
>>     boolean found = false;
>>     for (String line : lines)
>>     {
>>         if (line.startsWith(propertyName + ":")
>>             || line.startsWith("#" + propertyName + ":")
>>             || line.startsWith("# " + propertyName + ":"))
>>         {
>>             found = true;
>>             break;
>>         }
>>     }
>>     if (!found)
>>     {
>>         missedCount++;
>>         System.out.println(propertyName);
>>     }
>> }
>> System.out.println("Total missed:" + missedCount);
>>
>>
>> It prints the following config property names which are defined in
>> Config.java but not present as "property" or "# property " in a file:
>>
>> permissions_cache_max_entries
>> roles_cache_max_entries
>> credentials_cache_max_entries
>> auto_bootstrap
>> force_new_prepared_statement_behaviour
>> use_deterministic_table_id
>> repair_request_timeout
>> stream_transfer_task_timeout
>> cms_await_timeout
>> cms_default_max_retries
>> cms_default_retry_backoff
>> epoch_aware_debounce_inflight_tracker_max_size
>> metadata_snapshot_frequency
>> available_processors
>> repair_session_max_tree_depth
>> use_offheap_merkle_trees
>> internode_max_message_size
>> native_transport_max_message_size
>> native_transport_max_request_data_in_flight_per_ip
>> native_transport_max_request_data_in_flight
>> native_transport_receive_queue_capacity
>> min_free_space_per_drive
>> max_space_usable_for_compactions_in_percentage
>> reject_repair_compaction_threshold
>> concurrent_index_builders
>> max_streaming_retries
>> commitlog_max_compression_buffers_in_pool
>> max_mutation_size
>> dynamic_snitch
>> failure_detector
>> use_creation_time_for_hint_ttl
>> key_cache_migrate_during_compaction
>> key_cache_invalidate_after_sstable_deletion
>> paxos_cache_size
>> file_cache_round_up
>> disk_optimization_estimate_percentile
>> disk_optimization_page_cross_chance
>> purgeable_tobmstones_metric_granularity
>> windows_timer_interval
>> otc_coalescing_strategy
>> otc_coalescing_window_us
>> otc_coalescing_enough_coalesced_messages
>> otc_backlog_expiration_interval_ms
>> scripted_user_defined_functions_enabled
>> user_defined_functions_threads_enabled
>> allow_insecure_udfs
>> allow_extra_insecure_udfs
>> user_defined_functions_warn_timeout
>> user_defined_functions_fail_timeout
>> user_function_timeout_policy
>> back_pressure_enabled
>> back_pressure_strategy
>> repair_command_pool_full_strategy
>> repair_command_pool_size
>> block_for_peers_timeout_in_secs
>> block_for_peers_in_remote_dcs
>> skip_stream_disk_space_check
>> snapshot_on_repaired_data_mismatch
>> validation_preview_purge_head_start
>> initial_range_tombstone_list_allocation_size
>> range_tombstone_list_growth_factor
>> snapshot_on_duplicate_row_detection
>> check_for_duplicate_rows_during_reads
>> check_for_duplicate_rows_during_compaction
>> autocompaction_on_startup_enabled
>> auto_optimise_inc_repair_streams
>> auto_optimise_full_repair_streams
>> auto_optimise_preview_repair_streams
>> consecutive_message_errors_threshold
>> internode_error_reporting_exclusions
>> compact_tables_enabled
>> vector_type_enabled
>> intersect_filtering_query_warned
>> intersect_filtering_query_enabled
>> streaming_slow_events_log_timeout
>> repair_state_expires
>> repair_state_size
>> paxos_variant
>> skip_paxos_repair_on_topology_change
>> paxos_purge_grace_period
>> paxos_on_linearizability_violations
>> paxos_state_purging
>> paxos_repair_enabled
>> paxos_topology_repair_no_dc_checks
>> paxos_topology_repair_strict_each_quorum
>> skip_paxos_repair_on_topology_change_keyspaces
>> paxos_contention_wait_randomizer
>> paxos_contention_min_wait
>> paxos_contention_max_wait
>> paxos_contention_min_delta
>> paxos_repair_parallelism
>> sstable_read_rate_persistence_enabled
>> client_request_size_metrics_enabled
>> max_top_size_partition_count
>> max_top_tombstone_partition_count
>> min_tracked_partition_size
>> min_tracked_partition_tombstone_count
>> top_partitions_enabled
>> severity_during_decommission
>> progress_barrier_min_consistency_level
>> progress_barrier_default_consistency_level
>> progress_barrier_timeout
>> progress_barrier_backoff
>> discovery_timeout
>> unsafe_tcm_mode
>> cql_start_time
>> native_transport_throw_on_overload
>> native_transport_queue_max_item_age_threshold
>> native_transport_min_backoff_on_queue_overload
>> native_transport_max_backoff_on_queue_overload
>> native_transport_timeout
>> enforce_native_deadline_for_hints
>> Total missed:112
>>
>>
>>
>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org>
>> wrote:
>>
>>> It should also work the other way around. If there is a property which
>>> is commented out in yaml and it is not in Config.java, that should fail as
>>> well.
>>> If it is not commented out and it is not in Config.java, that will
>>> fail at runtime as it fails on unrecognized property.
>>>
>>> This will be used in practice very rarely as we seldom remove the
>>> properties in Config but if we do and a property is commented out, we
>>> should not ship a dead property name, even commented out.
>>>
>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org> wrote:
>>>
>>>> > > If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>>> indeed add it, also commented out. I think it would be quite easy to check
>>>> against yaml if there is a line starting on "# my_cool_property" or just on
>>>> "my_cool_property". Both cases would satisfy the check.
>>>>
>>>> Makes sense, I think this would be good to have as a lint or test to
>>>> easily catch overlooks during review.
>>>>
>>>>
>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič <
>>>> smikloso...@apache.org> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org> wrote:
>>>>>
>>>>>> > from time to time I see configuration properties in Config.java and
>>>>>> they are clearly not in cassandra.yaml. Not every property in Config is in
>>>>>> cassandra.yaml. I would like to know if there is some specific reason
>>>>>> behind that.
>>>>>>
>>>>>> I think one of the original reasons was to "hide" advanced configs
>>>>>> that are not meant to be updated, unless in very niche circumstances.
>>>>>> However I think this has been extrapolated to non-advanced settings.
>>>>>>
>>>>>> > Question related to that is if we could not have a build-time check
>>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>>
>>>>>> Are you saying every configuration property should be commented-out,
>>>>>> or do you think that every Config property should be specified in
>>>>>> cassandra.yaml with their default uncommented?
>>>>>> One issue with that is that
>>>>>> you could cause user confusion if you "reveal" a niche/advanced config that
>>>>>> is not meant to be updated. I think this would be addressed by
>>>>>> the @HiddenInYaml flag you are proposing in a later post.
>>>>>>
>>>>>
>>>>> Yes, they can stay hidden, but we should annotate them with @Hidden or
>>>>> similar. As of now, if that property is not in yaml, we just don't know if
>>>>> it was forgotten to be added or if we have not added it on purpose.
>>>>>
>>>>> They can keep being commented out if they currently are. Imagine a
>>>>> property in Config.java
>>>>>
>>>>> public boolean my_cool_property = true;
>>>>>
>>>>> and then this in cassandra.yaml
>>>>>
>>>>> # my_cool_property: true
>>>>>
>>>>> It is completely ok.
>>>>>
>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>>>> indeed add it, also commented out. I think it would be quite easy to check
>>>>> against yaml if there is a line starting on "# my_cool_property" or just on
>>>>> "my_cool_property". Both cases would satisfy the check.
>>>>>
>>>>>
>>>>>
>>>>>> > There are dozens of properties in Config and I have a strong
>>>>>> suspicion that we missed to publish some to yaml so users do not even know
>>>>>> such a property exists and as of now we do not even know which they are.
>>>>>>
>>>>>> I believe this is a problem. I think most properties should be in
>>>>>> cassandra.yaml, unless they are very advanced or not meant to be updated.
>>>>>>
>>>>>> Another tangential issue is that there are features/settings that
>>>>>> don't even have a Config entry, but are just controlled by JVM properties.
>>>>>>
>>>>>> I think that we should attempt to unify Config and jvm properties
>>>>>> under a predictable structure. For example, if there is a YAML config
>>>>>> enable_user_defined_functions, then there should be a respective JVM flag
>>>>>> -Dcassandra.enable_user_defined_functions, and vice versa.
>>>>>>
>>>>>
>>>>> Yeah, good idea.
>>>>>
>>>>>
>>>>>>
>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič <
>>>>>> smikloso...@apache.org> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> from time to time I see configuration properties in Config.java and
>>>>>>> they are clearly not in cassandra.yaml. Not every property in Config is in
>>>>>>> cassandra.yaml. I would like to know if there is some specific reason
>>>>>>> behind that.
>>>>>>>
>>>>>>> Question related to that is if we could not have a build-time check
>>>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>>>
>>>>>>> There are dozens of properties in Config and I have a strong
>>>>>>> suspicion that we missed to publish some to yaml so users do not even know
>>>>>>> such a property exists and as of now we do not even know which they are.
>>>>>>>
>>>>>>
>>
>> --
>> Dmitry Konstantinov
>>
>

--
Dmitry Konstantinov
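
For reference, a minimal sketch of the two-way check discussed in this thread: Config fields missing from the yaml, and commented-out yaml keys that no longer exist in Config. It is only an illustration under stated assumptions. The @HiddenInYaml annotation is hypothetical (it was only proposed above and is not an existing Cassandra API), SampleConfig is a stand-in for the real Config class, and the yaml is passed as a list of lines instead of being read from conf/cassandra.yaml.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YamlConfigCheck
{
    // Hypothetical marker for fields that are left out of the yaml on purpose.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface HiddenInYaml {}

    // Hypothetical stand-in for Config.java.
    public static class SampleConfig
    {
        public static final String NOT_A_PROPERTY = "static fields are skipped";
        public boolean my_cool_property = true;
        public int documented_property = 1;
        public int forgotten_property = 2;
        @HiddenInYaml
        public int advanced_property = 3;
    }

    // Matches "name:", "#name:" and "# name:" at the very start of a line,
    // the same three forms the snippet earlier in the thread accepts.
    private static final Pattern TOP_LEVEL_KEY = Pattern.compile("^(# ?)?([a-z][a-z0-9_]*):");

    // Top-level keys found in the yaml lines; optionally only the commented-out ones.
    static Set<String> yamlKeys(List<String> lines, boolean commentedOnly)
    {
        Set<String> keys = new TreeSet<>();
        for (String line : lines)
        {
            Matcher m = TOP_LEVEL_KEY.matcher(line);
            if (m.find() && (!commentedOnly || m.group(1) != null))
                keys.add(m.group(2));
        }
        return keys;
    }

    // Names of public non-static fields; hidden ones are skipped unless requested.
    static Set<String> fieldNames(Class<?> config, boolean includeHidden)
    {
        Set<String> names = new TreeSet<>();
        for (Field field : config.getFields())
        {
            if (Modifier.isStatic(field.getModifiers()))
                continue;
            if (!includeHidden && field.isAnnotationPresent(HiddenInYaml.class))
                continue;
            names.add(field.getName());
        }
        return names;
    }

    // Fields that are neither present in the yaml (commented or not) nor marked hidden.
    public static List<String> missingInYaml(Class<?> config, List<String> lines)
    {
        Set<String> missing = fieldNames(config, false);
        missing.removeAll(yamlKeys(lines, false));
        return new ArrayList<>(missing);
    }

    // Commented-out yaml keys that have no matching field at all.
    public static List<String> deadInYaml(Class<?> config, List<String> lines)
    {
        Set<String> dead = yamlKeys(lines, true);
        dead.removeAll(fieldNames(config, true));
        return new ArrayList<>(dead);
    }

    public static void main(String[] args)
    {
        List<String> yaml = Arrays.asList("my_cool_property: true",
                                          "# documented_property: 1",
                                          "# removed_property: 5",
                                          "    nested_key: 1",
                                          "# A prose comment without a key");
        System.out.println("Missing in yaml: " + missingInYaml(SampleConfig.class, yaml));
        System.out.println("Dead in yaml: " + deadInYaml(SampleConfig.class, yaml));
    }
}
```

One caveat with the reverse direction: a prose comment that happens to start with a lowercase word followed by a colon would be reported as a dead key, so a real check would probably need an allow-list or a stricter convention for commented-out properties.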