As discussed on the thread, the idea was we would also tweak the definition of 
LP(CONFIG). I am not opposed to making them Public. 


> On Mar 24, 2022, at 7:19 PM, 张铎 <[email protected]> wrote:
> 
> IMO if we want to have a constant class for this, we should mark it as
> IA.Public. IA.LP(Config) just means the class itself can be referenced in
> config files.
> 
> Bryan Beaudreault <[email protected]> 于2022年3月25日周五 04:25写道:
> 
>> Yes but I don't see a major issue with promoting to LP(Config)? I thought
>> the agreement here was that LP(Config) was basically "exposed for config
>> constants", but maybe I misunderstood.
>> 
>> Also ConnectionConfiguration is a relatively basic and lightweight wrapper
>> around configs. I think it's a perfect example of what this LP(Config) API
>> could look like.
>> 
>>> On Thu, Mar 24, 2022 at 3:47 PM Nick Dimiduk <[email protected]> wrote:
>>> 
>>> One sticking point: ConnectionConfiguration and
>>> AsyncConnectionConfiguration are both IA.Private.
>>> 
>>> On Thu, Mar 24, 2022 at 17:21 Bryan Beaudreault
>>> <[email protected]> wrote:
>>> 
>>>> I realized there already exists a good candidate for this in
>> hbase-client
>>>> -- ConnectionConfiguration. My latest commit in
>>>> https://github.com/apache/hbase/pull/4180
>>> <https://github.com/apache/hbase/pull/4180>
>>> adds a new config constant there
>>>> and marks it as LP(Config), but I'd also be happy to revert that part
>> of
>>>> the PR and instead handle that in a dedicated jira for this topic if
>>>> desired.
>>>> 
>>>> On Thu, Mar 24, 2022 at 11:08 AM Bryan Beaudreault <
>>>> [email protected]>
>>>> wrote:
>>>> 
>>>>> Thank you both for the input here!
>>>>> 
>>>>> It seems like we've come to the conclusion that it would be ok to:
>>>>> 
>>>>> - Update docs around LP(CONFIG) to say that it also encompasses some
>>>>> module-aggregated classes which hold config constants
>>>>> - Maybe document this convention elsewhere for contributors as well,
>>>>> though need to look into where (guidance welcome, but will look
>> around)
>>>>> - I can create the first constant class in hbase-client for the
>>> use-cases
>>>>> in my jira. I also have another config constant I've added in
>>>>> https://github.com/apache/hbase/pull/4180
>>> <https://github.com/apache/hbase/pull/4180>
>>> prior to this discussion, so
>>>>> may want to use that class there.
>>>>> - Look into updates to tooling to audit the LP(CONFIG) classes
>>>>> 
>>>>> If this sounds good, I'll create a JIRA to track the work. The first
>> 3
>>>>> todos will be relatively easy, and I'll look into options for the
>> audit
>>>> tool
>>>>> 
>>>>> 
>>>>> On Mon, Mar 21, 2022 at 12:14 PM Andrew Purtell <
>>>> [email protected]>
>>>>> wrote:
>>>>> 
>>>>>> Agreed the definition of LP(CONFIG) would need to be tweaked.
>>>>>> 
>>>>>> We do not quite have enough support for analyzing configuration key
>>> set
>>>>>> changes in the current API audit tool. Removal of a public constant
>>>> field
>>>>>> from an LP annotated class would be flagged but a modification of
>> the
>>>>>> constant would not. However it is a OSS project consisting of Perl
>>>> scripts
>>>>>> that might accept a contribution or at least could be patched or
>>>> extended.
>>>>>> Or we can build a new audit tool for the purpose, which is what I
>>> would
>>>>>> recommend. The Perl based tool shells out to javah and is quite
>>>> expensive,
>>>>>> and on some platforms, depending on comparison, can fail due to
>>> command
>>>>>> line length limits. A Java tool would likely be more efficient at
>>>>>> processing Java class annotations and introspecting string
>> constants.
>>>>>> 
>>>>>>> 
>>>>>>> On Mar 21, 2022, at 8:52 AM, Nick Dimiduk <[email protected]>
>>>> wrote:
>>>>>>> 
>>>>>>> On Mon, Mar 21, 2022 at 4:32 PM Andrew Purtell <
>>>>>> [email protected]>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Although collecting all configuration keys into a single file is
>>>>>>>> definitely an anti-pattern I’m not sure the same is true of
>> package
>>>> or
>>>>>>>> Maven module level aggregation classes marked LP(CONFIG).Somewhat
>>>> like
>>>>>>>> DFSConfigKeys but geared toward our API/release auditing.
>>>>>>>> 
>>>>>>>> This would seem virtuous for a couple of reasons. Relevant
>>>>>> configuration
>>>>>>>> key constants for the package or module would be grouped in a
>> well
>>>>>> known
>>>>>>>> place for users and developers alike. The LP(CONFIG) designation
>>>> would
>>>>>>>> require developers to think about deprecation cycle if
>>> contemplating
>>>> a
>>>>>>>> change, thus providing some back pressure against snap decisions.
>>> Or,
>>>>>> if
>>>>>>>> not then, then at release candidate evaluation time, user
>>>> configuration
>>>>>>>> breaking changes could be caught be a release automation tool
>> that
>>>>>> diffs
>>>>>>>> LP(CONFIG) annotated classes. Something like this would improve
>> the
>>>>>> state
>>>>>>>> of configuration key management quite dramatically, because
>>> currently
>>>>>> it’s
>>>>>>>> ad hoc.
>>>>>>>> 
>>>>>>> 
>>>>>>> I am supportive of trying such an effort. We'll need to tweak the
>>>>>> meaning
>>>>>>> of LP(CONFIG) as we define it currently, but that can be done. I
>>> don't
>>>>>> know
>>>>>>> what our current tools do or assume regarding this annotation. I
>>> think
>>>>>>> there is something custom happening in the compatibility reports
>>> that
>>>> we
>>>>>>> generate as part of each RC.
>>>>>>> 
>>>>>>>>> On Mar 16, 2022, at 10:46 AM, Bryan Beaudreault
>>>>>> <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Thanks for your detailed response, Nick!
>>>>>>>>> 
>>>>>>>>>> I think that none of my comments address your intended topic:
>> how
>>>> do
>>>>>> we
>>>>>>>>> publish our configuration points as an API that can be consumed
>> by
>>>>>> user
>>>>>>>>> applications? (Do I have that correct?)
>>>>>>>>> 
>>>>>>>>> This is a good summary, and I appreciate the other
>>>>>>>> thoughts/clarifications
>>>>>>>>> as well. I also realize this is probably hard to get perfect and
>>> any
>>>>>>>> choice
>>>>>>>>> must be weighed against the effort necessary to change/maintain.
>>>>>>>>> 
>>>>>>>>> One example I know is Hadoop/HDFS, and I bet some on this list
>>> have
>>>>>> much
>>>>>>>>> more knowledge of that project's history than I do. For HDFS
>> they
>>>> have
>>>>>>>>> DFSConfigKeys which in my experience does seem to include most
>>>>>> configs. I
>>>>>>>>> believe they even have unit tests which verify that all configs
>> in
>>>> the
>>>>>>>>> various site.xml files are represented in code. In more recent
>>>>>> versions
>>>>>>>>> they have split that class up into smaller groupings, for
>> example
>>>>>>>>> DfsClientConf and the various inner classes there.
>>>>>>>>> 
>>>>>>>>> In a vacuum, from a code design perspective, I'm not commenting
>> on
>>>>>>>> whether
>>>>>>>>> that's a good or bad pattern. I also don't know of the politics
>> of
>>>> the
>>>>>>>>> project or what sorts of pain points they've discovered in that
>>>>>> pattern
>>>>>>>>> over the years. But from *user's perspective*, this is a handy
>> way
>>>> to
>>>>>>>>> handle things in my opinion.
>>>>>>>>> 
>>>>>>>>> At my company, in general we try to avoid "magic strings" [1]
>> and
>>>>>> instead
>>>>>>>>> always try to use constants. We can and do define our own
>>> constants
>>>> to
>>>>>>>> try
>>>>>>>>> to mirror some of the "private" magic strings in the hbase
>> client.
>>>>>> This
>>>>>>>> is
>>>>>>>>> better than nothing but even better would be to use
>> hbase-provided
>>>>>>>>> constants so that we can build more defensive applications,
>> using
>>>> the
>>>>>>>>> compiler to verify that the configs we reference still do
>>> anything.
>>>>>>>>> 
>>>>>>>>> I unfortunately can't speak to the original issues with
>> HConstants
>>>>>> that
>>>>>>>>> turned it into an anti-pattern. What I do notice is there are
>>>>>> definitely
>>>>>>>>> examples in the hbase codebase of duplicated config strings, one
>>> of
>>>>>> which
>>>>>>>>> is called out in one of the jiras I linked in my original email.
>>>> These
>>>>>>>> are
>>>>>>>>> just bugs waiting to happen in my opinion, either for hbase
>> itself
>>>> or
>>>>>> for
>>>>>>>>> users which may reference them.
>>>>>>>>> 
>>>>>>>>> [1] https://deviq.com/antipatterns/magic-strings
>>> <https://deviq.com/antipatterns/magic-strings>
>>>>>> <https://deviq.com/antipatterns/magic-strings
>>> <https://deviq.com/antipatterns/magic-strings>
>>>> 
>>>>>>>>>> On Wed, Mar 16, 2022 at 10:52 AM Nick Dimiduk <
>>> [email protected]
>>>>> 
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Bryan,
>>>>>>>>>> 
>>>>>>>>>> Thanks for bringing this up.
>>>>>>>>>> 
>>>>>>>>>> I agree with Duo (and I think we have this settled as
>>> project-wide
>>>>>>>>>> consensus) that HConstants is/was an anti-pattern, that we are
>>>>>> actively
>>>>>>>>>> against adding new fields there, and opportunistically removing
>>>>>> fields
>>>>>>>> when
>>>>>>>>>> we can. Further, the documented meaning of the
>>>>>>>>>> HBaseInterfaceAudience.CONFIG field is "Denotes class names
>> that
>>>>>> appear
>>>>>>>> in
>>>>>>>>>> user facing configuration files", so this isn't really
>>> appropriate
>>>>>> for
>>>>>>>>>> marking a field that exposes a configuration key to user
>>>>>> applications. I
>>>>>>>>>> will also note that there appears to be two categories of
>> tunable
>>>>>>>>>> parameters -- configuration points that we expect users to
>> tweak
>>>> are
>>>>>>>>>> catalogued and documented in the book [0] and everything else
>> is
>>>>>> left to
>>>>>>>>>> the obscurity of code-grep.
>>>>>>>>>> 
>>>>>>>>>> While we are actively squashing use of fields in HConstants, I
>>>> don't
>>>>>>>> know
>>>>>>>>>> that we have proposed some alternative to the user community.
>> For
>>>> my
>>>>>>>> part,
>>>>>>>>>> when I write and review code that involves configuration keys,
>> I
>>>>>>>> generally
>>>>>>>>>> implement the key constant string as a private field in an
>>>>>> appropriate
>>>>>>>>>> class, and the unit test coverage for that configuration key
>>>>>> replicates
>>>>>>>> the
>>>>>>>>>> string in the test. My reasoning being that the string is a
>> part
>>> of
>>>>>> our
>>>>>>>>>> public API and making a change to the public API should be
>>> detected
>>>>>> from
>>>>>>>>>> the unit test. I also have (on occasion) gone out of my way to
>>>> write
>>>>>>>> about
>>>>>>>>>> the configuration keys in the package or class-level javadoc.
>>>>>>>>>> 
>>>>>>>>>> I think that none of my comments address your intended topic:
>> how
>>>> do
>>>>>> we
>>>>>>>>>> publish our configuration points as an API that can be consumed
>>> by
>>>>>> user
>>>>>>>>>> applications? (Do I have that correct?)
>>>>>>>>>> 
>>>>>>>>>> I am of the mind that we don't need/want an API of
>>> configurations ;
>>>>>> we
>>>>>>>> want
>>>>>>>>>> a catalogue, i.e., what has been started in our book. Perhaps
>>>>>>>> accompanied
>>>>>>>>>> by/generated from an authoritative hbase-defaults.xml file. In
>>>> fact,
>>>>>> we
>>>>>>>>>> already do generate from hbase-default.xml, the result is [1]
>> ; I
>>>>>> don't
>>>>>>>>>> believe it is authoritative.
>>>>>>>>>> 
>>>>>>>>>> If we did have an AP thoughI, what would be better than the
>>>>>> HConstants
>>>>>>>>>> approach of key-strings as public fields ? What if we had a
>>>>>>>>>> ConfigurationBuilder type of class, which had methods tied to
>>>>>>>> configuration
>>>>>>>>>> keys? I would think that such a globally applicable class would
>>>> have
>>>>>> the
>>>>>>>>>> same maintenance issues as HConstants. But what if we had some
>>> kind
>>>>>> of
>>>>>>>>>> ConfigurationSetter class, perhaps per package, that performed
>>> this
>>>>>>>>>> function? That might be maintainable for us and useful for
>> users.
>>>>>>>>>> 
>>>>>>>>>> I'm keen to hear what other ideas are out there, or better,
>>>> examples
>>>>>> and
>>>>>>>>>> counter-examples from other projects.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Nick
>>>>>>>>>> 
>>>>>>>>>> [0]:
>> https://hbase.apache.org/book.html#important_configurations
>>> <https://hbase.apache.org/book.html#important_configurations>
>>>>>> <https://hbase.apache.org/book.html#important_configurations
>>> <https://hbase.apache.org/book.html#important_configurations>
>>>> 
>>>>>>>>>> <https://hbase.apache.org/book.html#important_configurations
>>> <https://hbase.apache.org/book.html#important_configurations>
>>>>>> <https://hbase.apache.org/book.html#important_configurations
>>> <https://hbase.apache.org/book.html#important_configurations>
>>>> 
>>>>>>> 
>>>>>>>>>> [1]:
>>>> https://hbase.apache.org/book.html#hbase_default_configurations
>>> <https://hbase.apache.org/book.html#hbase_default_configurations>
>>>>>> <https://hbase.apache.org/book.html#hbase_default_configurations
>>> <https://hbase.apache.org/book.html#hbase_default_configurations>
>>>> 
>>>>>>>>>> <
>> https://hbase.apache.org/book.html#hbase_default_configurations
>>> <https://hbase.apache.org/book.html#hbase_default_configurations>
>>>>>> <https://hbase.apache.org/book.html#hbase_default_configurations
>>> <https://hbase.apache.org/book.html#hbase_default_configurations>
>>>> 
>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Mar 15, 2022 at 4:28 PM Bryan Beaudreault
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi devs,
>>>>>>>>>>> 
>>>>>>>>>>> As a major user of hbase, my company has thousands of clients
>>>>>> deployed
>>>>>>>>>>> which use the hbase client to connect to a variety of hbase
>>>>>> clusters.
>>>>>>>> We
>>>>>>>>>>> have a common library which handles configuring all clients by
>>>>>> setting
>>>>>>>> up
>>>>>>>>>>> the Configuration object prior to creating a Connection. Our
>>>> library
>>>>>>>> sets
>>>>>>>>>>> configurations using the various configs in HConstants, but
>>> there
>>>>>> are
>>>>>>>>>> also
>>>>>>>>>>> a bunch of configs which don't exist in HConstants. For these
>> we
>>>>>> have
>>>>>>>>>>> hardcoded config strings in our client.
>>>>>>>>>>> 
>>>>>>>>>>> We're now working on an hbase upgrade and need to go through
>> our
>>>>>> client
>>>>>>>>>>> library and check how the configs may have changed in the new
>>>>>> version.
>>>>>>>>>> This
>>>>>>>>>>> is relatively easy to do for those HConstants cases -- configs
>>> may
>>>>>> be
>>>>>>>>>>> marked @Deprecated which will show up in one's editor, they
>> may
>>> be
>>>>>>>>>> removed
>>>>>>>>>>> entirely which would show up is a compile error, and otherwise
>>> one
>>>>>> can
>>>>>>>>>>> easily click through or bring up the javadoc. For the others
>>> that
>>>>>> don't
>>>>>>>>>>> exist in HConstants, we need to go manually search the hbase
>>>>>> codebase
>>>>>>>> for
>>>>>>>>>>> those strings.
>>>>>>>>>>> 
>>>>>>>>>>> Without doing this painstaking manual process, we would
>>>> potentially
>>>>>>>>>> deploy
>>>>>>>>>>> the upgraded client with configs which are no longer used or
>>>>>> deprecated
>>>>>>>>>> by
>>>>>>>>>>> the hbase client. For those using HConstants, this is
>>> immediately
>>>>>>>> obvious
>>>>>>>>>>> because the HConstant field may have been removed. This is a
>>> clear
>>>>>>>>>>> indication of needing to investigate the config. In this case
>>> it's
>>>>>>>>>>> preferred to face the compile failure because it's clearer
>> than
>>>>>> having
>>>>>>>>>>> something silently disappear or change.
>>>>>>>>>>> 
>>>>>>>>>>> I opened 3 jiras to move some fields to HConstants, but got
>> some
>>>>>>>>>> reasonable
>>>>>>>>>>> pushback from Duo:
>>>>>>>>>>> 
>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-26845
>>> <https://issues.apache.org/jira/browse/HBASE-26845>
>>>>>> <https://issues.apache.org/jira/browse/HBASE-26845
>>> <https://issues.apache.org/jira/browse/HBASE-26845>
>>>> 
>>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-26845
>>> <https://issues.apache.org/jira/browse/HBASE-26845>
>>>>>> <https://issues.apache.org/jira/browse/HBASE-26845
>>> <https://issues.apache.org/jira/browse/HBASE-26845>
>>>> 
>>>>>>> 
>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-26846
>>> <https://issues.apache.org/jira/browse/HBASE-26846>
>>>>>> <https://issues.apache.org/jira/browse/HBASE-26846
>>> <https://issues.apache.org/jira/browse/HBASE-26846>
>>>> 
>>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-26846
>>> <https://issues.apache.org/jira/browse/HBASE-26846>
>>>>>> <https://issues.apache.org/jira/browse/HBASE-26846
>>> <https://issues.apache.org/jira/browse/HBASE-26846>
>>>> 
>>>>>>> 
>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-26847
>>> <https://issues.apache.org/jira/browse/HBASE-26847>
>>>>>> <https://issues.apache.org/jira/browse/HBASE-26847
>>> <https://issues.apache.org/jira/browse/HBASE-26847>
>>>> 
>>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-26847
>>> <https://issues.apache.org/jira/browse/HBASE-26847>
>>>>>> <https://issues.apache.org/jira/browse/HBASE-26847
>>> <https://issues.apache.org/jira/browse/HBASE-26847>
>>>> 
>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Duo's pushback is that HConstants is an anti-pattern and these
>>>>>> configs
>>>>>>>>>> are
>>>>>>>>>>> not part of our public API. I can agree that a catch-all
>>> constants
>>>>>>>> class
>>>>>>>>>>> might be an anti-pattern, but would argue that consolidating
>>>> configs
>>>>>>>>>> there
>>>>>>>>>>> is very useful for end-users. I can also potentially agree
>> that
>>>>>>>> exposing
>>>>>>>>>>> these as part of our public API might limit the flexibility of
>>>>>>>>>> development
>>>>>>>>>>> due to compatibility constraints about IA.Public.
>>>>>>>>>>> 
>>>>>>>>>>> To me it seems odd to add a configuration, whose whole point
>> is
>>> to
>>>>>> make
>>>>>>>>>>> something tuneable, but then bury it in a private class
>> despite
>>>>>> having
>>>>>>>>>> real
>>>>>>>>>>> implications for how the application runs. If a configuration
>> is
>>>> not
>>>>>>>>>> meant
>>>>>>>>>>> to be tuned, it shouldn't be a configuration at all. Otherwise
>>> it
>>>>>>>> should
>>>>>>>>>> be
>>>>>>>>>>> exposed for reference.
>>>>>>>>>>> 
>>>>>>>>>>> I'm wondering if there is some compromise we can achieve which
>>>>>> makes it
>>>>>>>>>>> easier for end-users to integrate with tunable configs.
>>>>>>>>>>> 
>>>>>>>>>>> One can imagine a large project to clean up all of our configs
>>>> under
>>>>>>>> some
>>>>>>>>>>> new class with maybe IA.LimitedPrivate(CONFIG), but I fear
>>> making
>>>>>>>> perfect
>>>>>>>>>>> (needing to migrate all configs) the enemy of good.
>>>>>>>>>>> 
>>>>>>>>>>> A better option might be to make those classes which expose
>>>> configs
>>>>>>>>>>> LimitedPrivate(CONFIG) -- for example AsyncProcess and
>>>>>>>>>>> ConnectionImplementation. That might be the most incremental
>>>> change
>>>>>> we
>>>>>>>>>>> could make. We could handle this on a case-by-case basis.
>>>>>>>>>>> 
>>>>>>>>>>> Does anyone have any thoughts?
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 

Reply via email to