[
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496302#comment-17496302
]
Paulo Motta edited comment on CASSANDRA-17292 at 2/22/22, 8:23 PM:
-------------------------------------------------------------------
Thanks for the additional context [~maedhroz], that is very helpful for
understanding the reasoning behind the proposed nesting.
{quote}For a moment, let's ignore the fact that there's any kind of textual
configuration file at all for the project, but we still have all the
knobs/systems/etc. The very first thing I would do is create a "domain model"
for C* configuration on the Java side, a hierarchy rooted in a Configuration
container class, which would contain members w/ types like
ClusterConfiguration, NetworkConfiguration, StorageConfiguration, etc. These
would be easy to navigate, would provide reasonable points for inline
documentation, could encapsulate validation logic for relationships between
parameters within subsystems and features, and could be passed as little
"kernels" of configuration around the codebase, allowing for better mocking,
etc.
{quote}
I think we're not very far from what we want the end result to look like from
the developer's perspective. My proposal is just a simplification of yours:
instead of a multi-level hierarchy rooted in physical resources
(cluster/network/storage), I'm proposing a feature-centric domain model
hierarchy with a single level, where each feature defines its own configuration
subtree.
The basic construct to create new feature configurations is the following class:
{code:java}
public abstract class FeatureConfiguration
{
    // is the feature enabled by default?
    boolean enabled = false;

    // the feature name to be used in the YAML/JSON
    public abstract String getFeatureName();

    // whether this feature can be disabled
    public boolean isOptional()
    {
        return true;
    }
}
{code}
This would make it easy to create a typed configuration for each feature:
* CommitlogConfiguration
* HintsConfiguration
* MaterializedViewsConfiguration
For example, this is what {{HintsConfiguration}} would look like:
{code:java}
public class HintsConfiguration extends FeatureConfiguration
{
    public HintsConfiguration()
    {
        // hints are enabled by default
        this.enabled = true;
    }

    @Override
    public String getFeatureName()
    {
        return "hinted_handoff";
    }

    // Defaults are written as their YAML literals for brevity; Duration,
    // Throttle and Size stand for typed values parsed from these strings.
    boolean auto_hints_cleanup = false;
    Duration max_hint_window = "3h";
    Throttle hinted_handoff_throttle = "1024KiB";
    int max_hints_delivery_threads = 2;
    Duration hints_flush_period = "10000ms";
    Size max_hints_file_size = "128MiB";
}
{code}
It would be represented as follows in {{cassandra.yaml}}:
{code:yaml}
# Commit log (cannot be disabled because isOptional()=false)
commit_log:
  commitlog_sync: periodic
  commitlog_sync_period: 10000ms
  commitlog_segment_size: 32MiB

# Hinted Handoff
hinted_handoff:
  enabled: true
  auto_hints_cleanup: false
  max_hint_window: 3h
  hinted_handoff_throttle: 1024KiB
  max_hints_delivery_threads: 2
  hints_flush_period: 10000ms
  max_hints_file_size: 128MiB

# MVs are experimental and not recommended for production-use
materialized_views:
  enabled: false
{code}
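To make the {{isOptional()}} comment in the YAML above concrete, here is a minimal sketch of how a non-optional feature could be rejected when disabled. The {{validate()}} method, its exception, and the bodies of {{CommitlogConfiguration}} and {{MaterializedViewsConfiguration}} are assumptions for illustration and are not part of the proposal.

```java
// Sketch only: validate() and the subclass bodies are hypothetical.
abstract class FeatureConfiguration
{
    // is the feature enabled by default?
    boolean enabled = false;

    // the feature name to be used in the YAML/JSON
    public abstract String getFeatureName();

    // whether this feature can be disabled
    public boolean isOptional()
    {
        return true;
    }

    // hypothetical check: a non-optional feature must stay enabled
    public void validate()
    {
        if (!isOptional() && !enabled)
            throw new IllegalStateException(getFeatureName() + " cannot be disabled");
    }
}

class CommitlogConfiguration extends FeatureConfiguration
{
    CommitlogConfiguration()
    {
        this.enabled = true;
    }

    @Override
    public String getFeatureName()
    {
        return "commit_log";
    }

    @Override
    public boolean isOptional()
    {
        return false;
    }
}

class MaterializedViewsConfiguration extends FeatureConfiguration
{
    // inherits enabled = false and isOptional() = true
    @Override
    public String getFeatureName()
    {
        return "materialized_views";
    }
}
```

With this shape, setting {{enabled: false}} under {{commit_log}} would fail validation, while {{materialized_views}} can be switched freely.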
The approach above provides a very simple user experience while allowing typed
configuration on the developer's side.
I think that we can easily fit most database configurations into this
feature-centric view, but if there are some that we cannot fit into an existing
feature, we could create a new type {{ResourceConfiguration}}, which would allow
configuring a resource not tied to a particular feature.
{quote}I'm still pretty strongly in support of a versioned but intact single
configuration file.
{quote}
Perhaps I should've made it clearer, but splitting the configuration into
multiple files is merely an optional convenience of my proposal, which also
supports keeping the configuration in a single file for backward compatibility.
For instance, moving the configuration from the {{features.yaml}} to
{{core.yaml}} would still render the same global configuration.
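As a rough sketch of why the file a section lives in does not matter, the loader could merge the top-level sections of every file into one global map. {{ConfigMerger}} and its rule of rejecting a section defined twice are assumptions for illustration; the inputs are already-parsed maps, so no particular YAML library is implied.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: ConfigMerger and its duplicate-section rule are hypothetical.
final class ConfigMerger
{
    private ConfigMerger() {}

    // Each list element is the already-parsed top level of one configuration file.
    static Map<String, Object> merge(List<Map<String, Object>> files)
    {
        Map<String, Object> global = new LinkedHashMap<>();
        for (Map<String, Object> file : files)
        {
            for (Map.Entry<String, Object> section : file.entrySet())
            {
                // a section may live in any file, but only in one of them
                if (global.containsKey(section.getKey()))
                    throw new IllegalArgumentException("section defined twice: " + section.getKey());
                global.put(section.getKey(), section.getValue());
            }
        }
        return global;
    }
}
```

Merging one file that holds both {{commit_log}} and {{hinted_handoff}} yields the same map as merging two files that hold one section each.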
I think that optionally splitting the configuration into different files
provides the organizational benefit of grouping together properties that belong
to a similar category (i.e. core features which cannot be disabled, optional
features and guardrails).
My original proposal of starting with 3 initial categories
(core.yaml/features.yaml/guardrails.yaml) is mostly to facilitate the
transition to the new configuration model:
- cassandra.yaml (previously core.yaml): all legacy configurations would
initially go here separated by section headers
- features.yaml: all configurations compatible with the new
{{FeatureConfiguration}} model would go here (including new features and
"migrated" legacy features)
- guardrails.yaml: all guardrails are collocated in the same file for
operational simplicity
For instance, the hints configuration is currently flat so it would initially
go in {{cassandra.yaml}} in the old format:
{code:yaml}
hinted_handoff_enabled: true
max_hint_window: 3h
hinted_handoff_throttle: 1024KiB
max_hints_delivery_threads: 2
hints_flush_period: 10000ms
max_hints_file_size: 128MiB
auto_hints_cleanup_enabled: false
{code}
Once someone decides to "migrate" the hints configuration from the "legacy"
format to the new format, they would remove the above entries from
{{cassandra.yaml}} and add a new entry to {{features.yaml}}:
{code:yaml}
# even though this configuration is in features.yaml by default
# it will still be valid if we add it to core.yaml, cassandra.yaml
# or even foobar.yaml ;-)
hinted_handoff:
  enabled: true
  auto_hints_cleanup: false
  max_hint_window: 3h
  hinted_handoff_throttle: 1024KiB
  max_hints_delivery_threads: 2
  hints_flush_period: 10000ms
  max_hints_file_size: 128MiB
{code}
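The move from the flat legacy keys to the nested section can be sketched as a simple key mapping. {{HintsLegacyMigration}} is a hypothetical helper, not part of the proposal; the renames are read off the two YAML snippets above, and the input is an already-parsed map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: HintsLegacyMigration is a hypothetical helper.
final class HintsLegacyMigration
{
    // legacy flat key -> key inside the hinted_handoff subtree
    static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static
    {
        RENAMES.put("hinted_handoff_enabled", "enabled");
        RENAMES.put("auto_hints_cleanup_enabled", "auto_hints_cleanup");
        RENAMES.put("max_hint_window", "max_hint_window");
        RENAMES.put("hinted_handoff_throttle", "hinted_handoff_throttle");
        RENAMES.put("max_hints_delivery_threads", "max_hints_delivery_threads");
        RENAMES.put("hints_flush_period", "hints_flush_period");
        RENAMES.put("max_hints_file_size", "max_hints_file_size");
    }

    private HintsLegacyMigration() {}

    // Returns a copy of the legacy config with hints keys moved under "hinted_handoff".
    static Map<String, Object> migrate(Map<String, Object> legacy)
    {
        Map<String, Object> result = new LinkedHashMap<>();
        Map<String, Object> hints = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : legacy.entrySet())
        {
            String nestedKey = RENAMES.get(e.getKey());
            if (nestedKey == null)
                result.put(e.getKey(), e.getValue()); // unrelated keys stay flat
            else
                hints.put(nestedKey, e.getValue());
        }
        if (!hints.isEmpty())
            result.put("hinted_handoff", hints);
        return result;
    }
}
```

Keys that are not part of the hints feature pass through unchanged, which matches the idea of migrating one feature at a time.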
This model allows us to remain backward compatible by supporting legacy
configurations in {{core.yaml}}/{{cassandra.yaml}} and migrating
configurations incrementally to the new {{FeatureConfiguration}} format in the
{{features.yaml}} file, which would contain all the "modern" configuration.
> Move cassandra.yaml toward a nested structure around major database concepts
> ----------------------------------------------------------------------------
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Caleb Rackliffe
> Assignee: Caleb Rackliffe
> Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new
> features") has made it clear we will gravitate toward appropriately nested
> structures for new parameters in {{cassandra.yaml}}, but from the scattered
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of
> this change include those we gain by doing it for new features (single point
> of interest for feature documentation, typed configuration objects, logical
> grouping for additional parameters added over time, discoverability, etc.),
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally,
> even a rough cut of a design here would allow that to move forward in a
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] -
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] -
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] -
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05