Good point re: the implications of parsing and durability in the face of seeing unknown or missing parameters. I don't think widening the scope on that would be ideal, especially considering the entire impetus for this conversation is "we've misbehaved with our config and have a bunch of undocumented stuff we're not sure is still useful, or what it's for". =/
On Mon, Jan 27, 2025, at 3:41 PM, Štefan Miklošovič wrote: > "we take "unclaimed" items and move them to their own InternalConfig.java or > something" > > This is interesting. If we are meant to be still able to put these properties > into cassandra.yaml (even they are "internal ones") and they would be just in > InternalConfig.java for some basic separation of internal / user-facing > configuration, then we would need to have two yaml loaders: > > 1) the one as we have now which loads cassandra.yaml it into Config.java > 2) the second one which would load cassandra.yaml into InternalConfig.java > > For both cases, we could not fail when there are unrecognized properties in > cassandra.yaml while parsing it (1), because every loader, for Config.java as > well as InternalConfig.java, is parsing just some "subset" of yaml. > > (1) > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java#L443-L444 > > If we just had "public InternalConfig internal = new InternalConfig" as a > field in Config.java, then this would lead to properties being effectively > renamed in cassandra.yaml like > > internal: > some_currently_internal_property: false > > instead of just > > some_currently_internal_property: false > > I do not think we want to have them renamed / under different configuration > sections in yaml. I get that they are "internal" etc but we just don't know > how / where it is used and deployed and just blindly renaming them is not a > good idea imho. > > On Mon, Jan 27, 2025 at 8:46 PM Josh McKenzie <jmcken...@apache.org> wrote: >> __ >> This may be an off-base comparison, but this reminds me of struggles we've >> had getting to 0 failing unit tests before and the debates on fencing off a >> snapshot of the current "failure set" so you can have a set point where no >> further degradation is allowed in a primary data set. >> >> All of which is to say - maybe at the end of the spreadsheet, we take >> "unclaimed" items and move them to their own InternalConfig.java or >> something and add an ant target that a) disallows further addition to >> InternalConfig.java w/out throwing an error / needing whitelist update, and >> b) disallows further regression in the Config.java <-> cassandra.yaml >> relationship for non-annotated fields. >> >> That way we can at least halt the progression of the disease even if we're >> stymied on cleaning up some of the existing symptoms. >> >> On Mon, Jan 27, 2025, at 1:38 PM, Štefan Miklošovič wrote: >>> Indeed, we need to balance that and thoughtfully choose what is going to be >>> added and what not. However, we should not hide something which is meant to >>> be tweaked by a user. The config is intimidating mostly because everything >>> is just in one file. I merely remember discussions a few years ago which >>> were about splitting cassandra.yaml into multiple files which would be >>> focused just on one subsystem / would cover some logically isolated domain. >>> >>> Anyway, I think the main goal of this effort for now would be to at least >>> map where we are at. Some of them are genuinely missing. E.g. guardrails, >>> how is a user meant to know about that if it is not even documented ... >>> >>> On Mon, Jan 27, 2025 at 6:16 PM Chris lohfink <cnl...@gmail.com> wrote: >>>> Might be a bit of a balance between exposing what people actually are >>>> likely to need to modify vs having a super intimidating config file. It's >>>> already nearly 2000 lines. Personally I'd rather see some >>>> auto-documentation or something that's in the docs >>>> <https://cassandra.apache.org/doc/latest/cassandra/managing/configuration/cass_yaml_file.html> >>>> than an effort to manually add another 1000 lines. >>>> >>>> Chris >>>> >>>> On Fri, Jan 24, 2025 at 9:41 AM Dmitry Konstantinov <netud...@gmail.com> >>>> wrote: >>>>> Maybe I missed some patterns but it looks like a pretty good estimation, >>>>> I did like 10 random checks manually to verify :-) >>>>> I will try to make an ant target with a similar logic (hopefully, during >>>>> the weekend) >>>>> I will create a ticket to track this activity (to share attachments there >>>>> to not overload the thread with such outputs in future). >>>>> >>>>> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <smikloso...@apache.org> >>>>> wrote: >>>>>> Oh my god, 112? :DD I was thinking it would be less than 10. >>>>>> >>>>>> Anyway, I think we need to integrate this to some ant target. If you >>>>>> expanded on this, that would be great. >>>>>> >>>>>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com> >>>>>> wrote: >>>>>>> A very primitive implementation of the 1st idea below: >>>>>>> >>>>>>> String configUrl = >>>>>>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml"; >>>>>>> Field[] allFields = Config.class.getFields(); >>>>>>> List<String> topLevelPropertyNames = new ArrayList<>(); >>>>>>> for(Field field : allFields) >>>>>>> { >>>>>>> if (!Modifier.*isStatic*(field.getModifiers())) >>>>>>> { >>>>>>> topLevelPropertyNames.add(field.getName()); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> URL url = new URL(configUrl); >>>>>>> List<String> lines = Files.*readAllLines*(Paths.*get*(url.toURI())); >>>>>>> >>>>>>> int missedCount = 0; >>>>>>> for (String propertyName : topLevelPropertyNames) >>>>>>> { >>>>>>> boolean found = false; >>>>>>> for (String line : lines) >>>>>>> { >>>>>>> if (line.startsWith(propertyName + ":") >>>>>>> || line.startsWith("#" + propertyName + ":") >>>>>>> || line.startsWith("# " + propertyName + ":")) { >>>>>>> found = true; >>>>>>> break; >>>>>>> } >>>>>>> } >>>>>>> if (!found) >>>>>>> { >>>>>>> missedCount++; >>>>>>> System.*out*.println(propertyName); >>>>>>> } >>>>>>> } >>>>>>> System.*out*.println("Total missed:" + missedCount); >>>>>>> >>>>>>> It prints the following config property names which are defined in >>>>>>> Config.java but not present as "property" or "# property " in a file: >>>>>>> permissions_cache_max_entries >>>>>>> roles_cache_max_entries >>>>>>> credentials_cache_max_entries >>>>>>> auto_bootstrap >>>>>>> force_new_prepared_statement_behaviour >>>>>>> use_deterministic_table_id >>>>>>> repair_request_timeout >>>>>>> stream_transfer_task_timeout >>>>>>> cms_await_timeout >>>>>>> cms_default_max_retries >>>>>>> cms_default_retry_backoff >>>>>>> epoch_aware_debounce_inflight_tracker_max_size >>>>>>> metadata_snapshot_frequency >>>>>>> available_processors >>>>>>> repair_session_max_tree_depth >>>>>>> use_offheap_merkle_trees >>>>>>> internode_max_message_size >>>>>>> native_transport_max_message_size >>>>>>> native_transport_max_request_data_in_flight_per_ip >>>>>>> native_transport_max_request_data_in_flight >>>>>>> native_transport_receive_queue_capacity >>>>>>> min_free_space_per_drive >>>>>>> max_space_usable_for_compactions_in_percentage >>>>>>> reject_repair_compaction_threshold >>>>>>> concurrent_index_builders >>>>>>> max_streaming_retries >>>>>>> commitlog_max_compression_buffers_in_pool >>>>>>> max_mutation_size >>>>>>> dynamic_snitch >>>>>>> failure_detector >>>>>>> use_creation_time_for_hint_ttl >>>>>>> key_cache_migrate_during_compaction >>>>>>> key_cache_invalidate_after_sstable_deletion >>>>>>> paxos_cache_size >>>>>>> file_cache_round_up >>>>>>> disk_optimization_estimate_percentile >>>>>>> disk_optimization_page_cross_chance >>>>>>> purgeable_tobmstones_metric_granularity >>>>>>> windows_timer_interval >>>>>>> otc_coalescing_strategy >>>>>>> otc_coalescing_window_us >>>>>>> otc_coalescing_enough_coalesced_messages >>>>>>> otc_backlog_expiration_interval_ms >>>>>>> scripted_user_defined_functions_enabled >>>>>>> user_defined_functions_threads_enabled >>>>>>> allow_insecure_udfs >>>>>>> allow_extra_insecure_udfs >>>>>>> user_defined_functions_warn_timeout >>>>>>> user_defined_functions_fail_timeout >>>>>>> user_function_timeout_policy >>>>>>> back_pressure_enabled >>>>>>> back_pressure_strategy >>>>>>> repair_command_pool_full_strategy >>>>>>> repair_command_pool_size >>>>>>> block_for_peers_timeout_in_secs >>>>>>> block_for_peers_in_remote_dcs >>>>>>> skip_stream_disk_space_check >>>>>>> snapshot_on_repaired_data_mismatch >>>>>>> validation_preview_purge_head_start >>>>>>> initial_range_tombstone_list_allocation_size >>>>>>> range_tombstone_list_growth_factor >>>>>>> snapshot_on_duplicate_row_detection >>>>>>> check_for_duplicate_rows_during_reads >>>>>>> check_for_duplicate_rows_during_compaction >>>>>>> autocompaction_on_startup_enabled >>>>>>> auto_optimise_inc_repair_streams >>>>>>> auto_optimise_full_repair_streams >>>>>>> auto_optimise_preview_repair_streams >>>>>>> consecutive_message_errors_threshold >>>>>>> internode_error_reporting_exclusions >>>>>>> compact_tables_enabled >>>>>>> vector_type_enabled >>>>>>> intersect_filtering_query_warned >>>>>>> intersect_filtering_query_enabled >>>>>>> streaming_slow_events_log_timeout >>>>>>> repair_state_expires >>>>>>> repair_state_size >>>>>>> paxos_variant >>>>>>> skip_paxos_repair_on_topology_change >>>>>>> paxos_purge_grace_period >>>>>>> paxos_on_linearizability_violations >>>>>>> paxos_state_purging >>>>>>> paxos_repair_enabled >>>>>>> paxos_topology_repair_no_dc_checks >>>>>>> paxos_topology_repair_strict_each_quorum >>>>>>> skip_paxos_repair_on_topology_change_keyspaces >>>>>>> paxos_contention_wait_randomizer >>>>>>> paxos_contention_min_wait >>>>>>> paxos_contention_max_wait >>>>>>> paxos_contention_min_delta >>>>>>> paxos_repair_parallelism >>>>>>> sstable_read_rate_persistence_enabled >>>>>>> client_request_size_metrics_enabled >>>>>>> max_top_size_partition_count >>>>>>> max_top_tombstone_partition_count >>>>>>> min_tracked_partition_size >>>>>>> min_tracked_partition_tombstone_count >>>>>>> top_partitions_enabled >>>>>>> severity_during_decommission >>>>>>> progress_barrier_min_consistency_level >>>>>>> progress_barrier_default_consistency_level >>>>>>> progress_barrier_timeout >>>>>>> progress_barrier_backoff >>>>>>> discovery_timeout >>>>>>> unsafe_tcm_mode >>>>>>> cql_start_time >>>>>>> native_transport_throw_on_overload >>>>>>> native_transport_queue_max_item_age_threshold >>>>>>> native_transport_min_backoff_on_queue_overload >>>>>>> native_transport_max_backoff_on_queue_overload >>>>>>> native_transport_timeout >>>>>>> enforce_native_deadline_for_hints >>>>>>> Total missed:112 >>>>>>> >>>>>>> >>>>>>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič >>>>>>> <smikloso...@apache.org> wrote: >>>>>>>> It should also work the other way around. If there is a property which >>>>>>>> is commented out in yaml and it is not in Config.java, that should >>>>>>>> fail as well. If it is not commented out and it is not in Config.java, >>>>>>>> that will fail in runtime as it fails on unrecognized property. >>>>>>>> >>>>>>>> This will be used in practice very rarely as we seldom remove the >>>>>>>> properties in Config but if we do and a property is commented out, we >>>>>>>> should not ship a dead property name, even commented out. >>>>>>>> >>>>>>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org> wrote: >>>>>>>>> > > If "# my_cool_property: true" is NOT in cassandra.yaml, we >>>>>>>>> > might indeed add it, also commented out. I think it would be quite >>>>>>>>> > easy to check against yaml if there is a line starting on "# >>>>>>>>> > my_cool_property" or just on "my_cool_property". Both cases would >>>>>>>>> > satisfy the check. >>>>>>>>> >>>>>>>>> Makes sense, I think this would be good to have as a lint or test to >>>>>>>>> easily catch overlooks during review. >>>>>>>>> >>>>>>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič >>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org> wrote: >>>>>>>>>>> > from time to time I see configuration properties in Config.java >>>>>>>>>>> > and they are clearly not in cassandra.yaml. Not every property in >>>>>>>>>>> > Config is in cassandra.yaml. I would like to know if there is >>>>>>>>>>> > some specific reason behind that. >>>>>>>>>>> >>>>>>>>>>> I think one of the original reasons was to "hide" advanced configs >>>>>>>>>>> that are not meant to be updated, unless in very niche >>>>>>>>>>> circumstances. However I think this has been extrapolated to >>>>>>>>>>> non-advanced settings. >>>>>>>>>>> >>>>>>>>>>> > Question related to that is if we could not have a build-time >>>>>>>>>>> > check that all properties in Config have to be in cassandra.yaml >>>>>>>>>>> > and fail the build if a property in Config does not have its >>>>>>>>>>> > counterpart in yaml. >>>>>>>>>>> >>>>>>>>>>> Are you saying every configuration property should be >>>>>>>>>>> commented-out, or do you think that every Config property should be >>>>>>>>>>> specified in cassandra.yaml with their default uncomented ? One >>>>>>>>>>> issue with that is that you could cause user confusion if you >>>>>>>>>>> "reveal" a niche/advanced config that is not meant to be updated. I >>>>>>>>>>> think this would be addressed by the @HiddenInYaml flag you are >>>>>>>>>>> proposing in a later post. >>>>>>>>>> >>>>>>>>>> Yes, then can stay hidden, but we should annotate it with @Hidden or >>>>>>>>>> similar. As of now, if that property is not in yaml, we just don't >>>>>>>>>> know if it was forgotten to be added or if we have not added it on >>>>>>>>>> purpose. >>>>>>>>>> >>>>>>>>>> They can keep being commented out if they currently are. Imagine a >>>>>>>>>> property in Config.java >>>>>>>>>> >>>>>>>>>> public boolean my_cool_property = true; >>>>>>>>>> >>>>>>>>>> and then this in cassandra.yaml >>>>>>>>>> >>>>>>>>>> # my_cool_property: true >>>>>>>>>> >>>>>>>>>> It is completely ok. >>>>>>>>>> >>>>>>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might >>>>>>>>>> indeed add it, also commented out. I think it would be quite easy to >>>>>>>>>> check against yaml if there is a line starting on "# >>>>>>>>>> my_cool_property" or just on "my_cool_property". Both cases would >>>>>>>>>> satisfy the check. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> > There are dozens of properties in Config and I have a strong >>>>>>>>>>> > suspicion that we missed to publish some to yaml so users do not >>>>>>>>>>> > even know such a property exists and as of now we do not even >>>>>>>>>>> > know which they are. >>>>>>>>>>> >>>>>>>>>>> I believe this is a problem. I think most properties should be in >>>>>>>>>>> cassandra.yaml, unless they are very advanced or not meant to be >>>>>>>>>>> updated. >>>>>>>>>>> >>>>>>>>>>> Another tangential issue is that there are features/settings that >>>>>>>>>>> don't even have a Config entry, but are just controlled by JVM >>>>>>>>>>> properties. >>>>>>>>>>> >>>>>>>>>>> I think that we should attempt to unify Config and jvm properties >>>>>>>>>>> under a predictable structure. For example, if there is a YAML >>>>>>>>>>> config enable_user_defined_functions, then there should be a >>>>>>>>>>> respective JVM flag -Dcassandra.enable_user_defined_functions, and >>>>>>>>>>> vice versa. >>>>>>>>>> >>>>>>>>>> Yeah, good idea. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič >>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> from time to time I see configuration properties in Config.java >>>>>>>>>>>> and they are clearly not in cassandra.yaml. Not every property in >>>>>>>>>>>> Config is in cassandra.yaml. I would like to know if there is some >>>>>>>>>>>> specific reason behind that. >>>>>>>>>>>> >>>>>>>>>>>> Question related to that is if we could not have a build-time >>>>>>>>>>>> check that all properties in Config have to be in cassandra.yaml >>>>>>>>>>>> and fail the build if a property in Config does not have its >>>>>>>>>>>> counterpart in yaml. >>>>>>>>>>>> >>>>>>>>>>>> There are dozens of properties in Config and I have a strong >>>>>>>>>>>> suspicion that we missed to publish some to yaml so users do not >>>>>>>>>>>> even know such a property exists and as of now we do not even know >>>>>>>>>>>> which they are. >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Dmitry Konstantinov >>>>> >>>>> >>>>> -- >>>>> Dmitry Konstantinov >>