As discussed on the thread, the idea was we would also tweak the definition of LP(CONFIG). I am not opposed to making them Public.
> On Mar 24, 2022, at 7:19 PM, 张铎 <[email protected]> wrote: > > IMO if we want to have a constant class for this, we should mark it as > IA.Public. IA.LP(Config) just means the class itself can be referenced in > config files. > > Bryan Beaudreault <[email protected]> 于2022年3月25日周五 04:25写道: > >> Yes but I don't see a major issue with promoting to LP(Config)? I thought >> the agreement here was that LP(Config) was basically "exposed for config >> constants", but maybe I misunderstood. >> >> Also ConnectionConfiguration is a relatively basic and lightweight wrapper >> around configs. I think it's a perfect example of what this LP(Config) API >> could look like. >> >>> On Thu, Mar 24, 2022 at 3:47 PM Nick Dimiduk <[email protected]> wrote: >>> >>> One sticking point: ConnectionConfiguration and >>> AsyncConnectionConfiguration are both IA.Private. >>> >>> On Thu, Mar 24, 2022 at 17:21 Bryan Beaudreault >>> <[email protected]> wrote: >>> >>>> I realized there already exists a good candidate for this in >> hbase-client >>>> -- ConnectionConfiguration. My latest commit in >>>> https://github.com/apache/hbase/pull/4180 >>> <https://github.com/apache/hbase/pull/4180> >>> adds a new config constant there >>>> and marks it as LP(Config), but I'd also be happy to revert that part >> of >>>> the PR and instead handle that in a dedicated jira for this topic if >>>> desired. >>>> >>>> On Thu, Mar 24, 2022 at 11:08 AM Bryan Beaudreault < >>>> [email protected]> >>>> wrote: >>>> >>>>> Thank you both for the input here! >>>>> >>>>> It seems like we've come to the conclusion that it would be ok to: >>>>> >>>>> - Update docs around LP(CONFIG) to say that it also encompasses some >>>>> module-aggregated classes which hold config constants >>>>> - Maybe document this convention elsewhere for contributors as well, >>>>> though need to look into where (guidance welcome, but will look >> around) >>>>> - I can create the first constant class in hbase-client for the >>> use-cases >>>>> in my jira. I also have another config constant I've added in >>>>> https://github.com/apache/hbase/pull/4180 >>> <https://github.com/apache/hbase/pull/4180> >>> prior to this discussion, so >>>>> may want to use that class there. >>>>> - Look into updates to tooling to audit the LP(CONFIG) classes >>>>> >>>>> If this sounds good, I'll create a JIRA to track the work. The first >> 3 >>>>> todos will be relatively easy, and I'll look into options for the >> audit >>>> tool >>>>> >>>>> >>>>> On Mon, Mar 21, 2022 at 12:14 PM Andrew Purtell < >>>> [email protected]> >>>>> wrote: >>>>> >>>>>> Agreed the definition of LP(CONFIG) would need to be tweaked. >>>>>> >>>>>> We do not quite have enough support for analyzing configuration key >>> set >>>>>> changes in the current API audit tool. Removal of a public constant >>>> field >>>>>> from an LP annotated class would be flagged but a modification of >> the >>>>>> constant would not. However it is a OSS project consisting of Perl >>>> scripts >>>>>> that might accept a contribution or at least could be patched or >>>> extended. >>>>>> Or we can build a new audit tool for the purpose, which is what I >>> would >>>>>> recommend. The Perl based tool shells out to javah and is quite >>>> expensive, >>>>>> and on some platforms, depending on comparison, can fail due to >>> command >>>>>> line length limits. A Java tool would likely be more efficient at >>>>>> processing Java class annotations and introspecting string >> constants. >>>>>> >>>>>>> >>>>>>> On Mar 21, 2022, at 8:52 AM, Nick Dimiduk <[email protected]> >>>> wrote: >>>>>>> >>>>>>> On Mon, Mar 21, 2022 at 4:32 PM Andrew Purtell < >>>>>> [email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Although collecting all configuration keys into a single file is >>>>>>>> definitely an anti-pattern I’m not sure the same is true of >> package >>>> or >>>>>>>> Maven module level aggregation classes marked LP(CONFIG).Somewhat >>>> like >>>>>>>> DFSConfigKeys but geared toward our API/release auditing. >>>>>>>> >>>>>>>> This would seem virtuous for a couple of reasons. Relevant >>>>>> configuration >>>>>>>> key constants for the package or module would be grouped in a >> well >>>>>> known >>>>>>>> place for users and developers alike. The LP(CONFIG) designation >>>> would >>>>>>>> require developers to think about deprecation cycle if >>> contemplating >>>> a >>>>>>>> change, thus providing some back pressure against snap decisions. >>> Or, >>>>>> if >>>>>>>> not then, then at release candidate evaluation time, user >>>> configuration >>>>>>>> breaking changes could be caught be a release automation tool >> that >>>>>> diffs >>>>>>>> LP(CONFIG) annotated classes. Something like this would improve >> the >>>>>> state >>>>>>>> of configuration key management quite dramatically, because >>> currently >>>>>> it’s >>>>>>>> ad hoc. >>>>>>>> >>>>>>> >>>>>>> I am supportive of trying such an effort. We'll need to tweak the >>>>>> meaning >>>>>>> of LP(CONFIG) as we define it currently, but that can be done. I >>> don't >>>>>> know >>>>>>> what our current tools do or assume regarding this annotation. I >>> think >>>>>>> there is something custom happening in the compatibility reports >>> that >>>> we >>>>>>> generate as part of each RC. >>>>>>> >>>>>>>>> On Mar 16, 2022, at 10:46 AM, Bryan Beaudreault >>>>>> <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Thanks for your detailed response, Nick! >>>>>>>>> >>>>>>>>>> I think that none of my comments address your intended topic: >> how >>>> do >>>>>> we >>>>>>>>> publish our configuration points as an API that can be consumed >> by >>>>>> user >>>>>>>>> applications? (Do I have that correct?) >>>>>>>>> >>>>>>>>> This is a good summary, and I appreciate the other >>>>>>>> thoughts/clarifications >>>>>>>>> as well. I also realize this is probably hard to get perfect and >>> any >>>>>>>> choice >>>>>>>>> must be weighed against the effort necessary to change/maintain. >>>>>>>>> >>>>>>>>> One example I know is Hadoop/HDFS, and I bet some on this list >>> have >>>>>> much >>>>>>>>> more knowledge of that project's history than I do. For HDFS >> they >>>> have >>>>>>>>> DFSConfigKeys which in my experience does seem to include most >>>>>> configs. I >>>>>>>>> believe they even have unit tests which verify that all configs >> in >>>> the >>>>>>>>> various site.xml files are represented in code. In more recent >>>>>> versions >>>>>>>>> they have split that class up into smaller groupings, for >> example >>>>>>>>> DfsClientConf and the various inner classes there. >>>>>>>>> >>>>>>>>> In a vacuum, from a code design perspective, I'm not commenting >> on >>>>>>>> whether >>>>>>>>> that's a good or bad pattern. I also don't know of the politics >> of >>>> the >>>>>>>>> project or what sorts of pain points they've discovered in that >>>>>> pattern >>>>>>>>> over the years. But from *user's perspective*, this is a handy >> way >>>> to >>>>>>>>> handle things in my opinion. >>>>>>>>> >>>>>>>>> At my company, in general we try to avoid "magic strings" [1] >> and >>>>>> instead >>>>>>>>> always try to use constants. We can and do define our own >>> constants >>>> to >>>>>>>> try >>>>>>>>> to mirror some of the "private" magic strings in the hbase >> client. >>>>>> This >>>>>>>> is >>>>>>>>> better than nothing but even better would be to use >> hbase-provided >>>>>>>>> constants so that we can build more defensive applications, >> using >>>> the >>>>>>>>> compiler to verify that the configs we reference still do >>> anything. >>>>>>>>> >>>>>>>>> I unfortunately can't speak to the original issues with >> HConstants >>>>>> that >>>>>>>>> turned it into an anti-pattern. What I do notice is there are >>>>>> definitely >>>>>>>>> examples in the hbase codebase of duplicated config strings, one >>> of >>>>>> which >>>>>>>>> is called out in one of the jiras I linked in my original email. >>>> These >>>>>>>> are >>>>>>>>> just bugs waiting to happen in my opinion, either for hbase >> itself >>>> or >>>>>> for >>>>>>>>> users which may reference them. >>>>>>>>> >>>>>>>>> [1] https://deviq.com/antipatterns/magic-strings >>> <https://deviq.com/antipatterns/magic-strings> >>>>>> <https://deviq.com/antipatterns/magic-strings >>> <https://deviq.com/antipatterns/magic-strings> >>>> >>>>>>>>>> On Wed, Mar 16, 2022 at 10:52 AM Nick Dimiduk < >>> [email protected] >>>>> >>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Bryan, >>>>>>>>>> >>>>>>>>>> Thanks for bringing this up. >>>>>>>>>> >>>>>>>>>> I agree with Duo (and I think we have this settled as >>> project-wide >>>>>>>>>> consensus) that HConstants is/was an anti-pattern, that we are >>>>>> actively >>>>>>>>>> against adding new fields there, and opportunistically removing >>>>>> fields >>>>>>>> when >>>>>>>>>> we can. Further, the documented meaning of the >>>>>>>>>> HBaseInterfaceAudience.CONFIG field is "Denotes class names >> that >>>>>> appear >>>>>>>> in >>>>>>>>>> user facing configuration files", so this isn't really >>> appropriate >>>>>> for >>>>>>>>>> marking a field that exposes a configuration key to user >>>>>> applications. I >>>>>>>>>> will also note that there appears to be two categories of >> tunable >>>>>>>>>> parameters -- configuration points that we expect users to >> tweak >>>> are >>>>>>>>>> catalogued and documented in the book [0] and everything else >> is >>>>>> left to >>>>>>>>>> the obscurity of code-grep. >>>>>>>>>> >>>>>>>>>> While we are actively squashing use of fields in HConstants, I >>>> don't >>>>>>>> know >>>>>>>>>> that we have proposed some alternative to the user community. >> For >>>> my >>>>>>>> part, >>>>>>>>>> when I write and review code that involves configuration keys, >> I >>>>>>>> generally >>>>>>>>>> implement the key constant string as a private field in an >>>>>> appropriate >>>>>>>>>> class, and the unit test coverage for that configuration key >>>>>> replicates >>>>>>>> the >>>>>>>>>> string in the test. My reasoning being that the string is a >> part >>> of >>>>>> our >>>>>>>>>> public API and making a change to the public API should be >>> detected >>>>>> from >>>>>>>>>> the unit test. I also have (on occasion) gone out of my way to >>>> write >>>>>>>> about >>>>>>>>>> the configuration keys in the package or class-level javadoc. >>>>>>>>>> >>>>>>>>>> I think that none of my comments address your intended topic: >> how >>>> do >>>>>> we >>>>>>>>>> publish our configuration points as an API that can be consumed >>> by >>>>>> user >>>>>>>>>> applications? (Do I have that correct?) >>>>>>>>>> >>>>>>>>>> I am of the mind that we don't need/want an API of >>> configurations ; >>>>>> we >>>>>>>> want >>>>>>>>>> a catalogue, i.e., what has been started in our book. Perhaps >>>>>>>> accompanied >>>>>>>>>> by/generated from an authoritative hbase-defaults.xml file. In >>>> fact, >>>>>> we >>>>>>>>>> already do generate from hbase-default.xml, the result is [1] >> ; I >>>>>> don't >>>>>>>>>> believe it is authoritative. >>>>>>>>>> >>>>>>>>>> If we did have an AP thoughI, what would be better than the >>>>>> HConstants >>>>>>>>>> approach of key-strings as public fields ? What if we had a >>>>>>>>>> ConfigurationBuilder type of class, which had methods tied to >>>>>>>> configuration >>>>>>>>>> keys? I would think that such a globally applicable class would >>>> have >>>>>> the >>>>>>>>>> same maintenance issues as HConstants. But what if we had some >>> kind >>>>>> of >>>>>>>>>> ConfigurationSetter class, perhaps per package, that performed >>> this >>>>>>>>>> function? That might be maintainable for us and useful for >> users. >>>>>>>>>> >>>>>>>>>> I'm keen to hear what other ideas are out there, or better, >>>> examples >>>>>> and >>>>>>>>>> counter-examples from other projects. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Nick >>>>>>>>>> >>>>>>>>>> [0]: >> https://hbase.apache.org/book.html#important_configurations >>> <https://hbase.apache.org/book.html#important_configurations> >>>>>> <https://hbase.apache.org/book.html#important_configurations >>> <https://hbase.apache.org/book.html#important_configurations> >>>> >>>>>>>>>> <https://hbase.apache.org/book.html#important_configurations >>> <https://hbase.apache.org/book.html#important_configurations> >>>>>> <https://hbase.apache.org/book.html#important_configurations >>> <https://hbase.apache.org/book.html#important_configurations> >>>> >>>>>>> >>>>>>>>>> [1]: >>>> https://hbase.apache.org/book.html#hbase_default_configurations >>> <https://hbase.apache.org/book.html#hbase_default_configurations> >>>>>> <https://hbase.apache.org/book.html#hbase_default_configurations >>> <https://hbase.apache.org/book.html#hbase_default_configurations> >>>> >>>>>>>>>> < >> https://hbase.apache.org/book.html#hbase_default_configurations >>> <https://hbase.apache.org/book.html#hbase_default_configurations> >>>>>> <https://hbase.apache.org/book.html#hbase_default_configurations >>> <https://hbase.apache.org/book.html#hbase_default_configurations> >>>> >>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Mar 15, 2022 at 4:28 PM Bryan Beaudreault >>>>>>>>>> <[email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Hi devs, >>>>>>>>>>> >>>>>>>>>>> As a major user of hbase, my company has thousands of clients >>>>>> deployed >>>>>>>>>>> which use the hbase client to connect to a variety of hbase >>>>>> clusters. >>>>>>>> We >>>>>>>>>>> have a common library which handles configuring all clients by >>>>>> setting >>>>>>>> up >>>>>>>>>>> the Configuration object prior to creating a Connection. Our >>>> library >>>>>>>> sets >>>>>>>>>>> configurations using the various configs in HConstants, but >>> there >>>>>> are >>>>>>>>>> also >>>>>>>>>>> a bunch of configs which don't exist in HConstants. For these >> we >>>>>> have >>>>>>>>>>> hardcoded config strings in our client. >>>>>>>>>>> >>>>>>>>>>> We're now working on an hbase upgrade and need to go through >> our >>>>>> client >>>>>>>>>>> library and check how the configs may have changed in the new >>>>>> version. >>>>>>>>>> This >>>>>>>>>>> is relatively easy to do for those HConstants cases -- configs >>> may >>>>>> be >>>>>>>>>>> marked @Deprecated which will show up in one's editor, they >> may >>> be >>>>>>>>>> removed >>>>>>>>>>> entirely which would show up is a compile error, and otherwise >>> one >>>>>> can >>>>>>>>>>> easily click through or bring up the javadoc. For the others >>> that >>>>>> don't >>>>>>>>>>> exist in HConstants, we need to go manually search the hbase >>>>>> codebase >>>>>>>> for >>>>>>>>>>> those strings. >>>>>>>>>>> >>>>>>>>>>> Without doing this painstaking manual process, we would >>>> potentially >>>>>>>>>> deploy >>>>>>>>>>> the upgraded client with configs which are no longer used or >>>>>> deprecated >>>>>>>>>> by >>>>>>>>>>> the hbase client. For those using HConstants, this is >>> immediately >>>>>>>> obvious >>>>>>>>>>> because the HConstant field may have been removed. This is a >>> clear >>>>>>>>>>> indication of needing to investigate the config. In this case >>> it's >>>>>>>>>>> preferred to face the compile failure because it's clearer >> than >>>>>> having >>>>>>>>>>> something silently disappear or change. >>>>>>>>>>> >>>>>>>>>>> I opened 3 jiras to move some fields to HConstants, but got >> some >>>>>>>>>> reasonable >>>>>>>>>>> pushback from Duo: >>>>>>>>>>> >>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-26845 >>> <https://issues.apache.org/jira/browse/HBASE-26845> >>>>>> <https://issues.apache.org/jira/browse/HBASE-26845 >>> <https://issues.apache.org/jira/browse/HBASE-26845> >>>> >>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-26845 >>> <https://issues.apache.org/jira/browse/HBASE-26845> >>>>>> <https://issues.apache.org/jira/browse/HBASE-26845 >>> <https://issues.apache.org/jira/browse/HBASE-26845> >>>> >>>>>>> >>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-26846 >>> <https://issues.apache.org/jira/browse/HBASE-26846> >>>>>> <https://issues.apache.org/jira/browse/HBASE-26846 >>> <https://issues.apache.org/jira/browse/HBASE-26846> >>>> >>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-26846 >>> <https://issues.apache.org/jira/browse/HBASE-26846> >>>>>> <https://issues.apache.org/jira/browse/HBASE-26846 >>> <https://issues.apache.org/jira/browse/HBASE-26846> >>>> >>>>>>> >>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-26847 >>> <https://issues.apache.org/jira/browse/HBASE-26847> >>>>>> <https://issues.apache.org/jira/browse/HBASE-26847 >>> <https://issues.apache.org/jira/browse/HBASE-26847> >>>> >>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-26847 >>> <https://issues.apache.org/jira/browse/HBASE-26847> >>>>>> <https://issues.apache.org/jira/browse/HBASE-26847 >>> <https://issues.apache.org/jira/browse/HBASE-26847> >>>> >>>>>>> >>>>>>>>>>> >>>>>>>>>>> Duo's pushback is that HConstants is an anti-pattern and these >>>>>> configs >>>>>>>>>> are >>>>>>>>>>> not part of our public API. I can agree that a catch-all >>> constants >>>>>>>> class >>>>>>>>>>> might be an anti-pattern, but would argue that consolidating >>>> configs >>>>>>>>>> there >>>>>>>>>>> is very useful for end-users. I can also potentially agree >> that >>>>>>>> exposing >>>>>>>>>>> these as part of our public API might limit the flexibility of >>>>>>>>>> development >>>>>>>>>>> due to compatibility constraints about IA.Public. >>>>>>>>>>> >>>>>>>>>>> To me it seems odd to add a configuration, whose whole point >> is >>> to >>>>>> make >>>>>>>>>>> something tuneable, but then bury it in a private class >> despite >>>>>> having >>>>>>>>>> real >>>>>>>>>>> implications for how the application runs. If a configuration >> is >>>> not >>>>>>>>>> meant >>>>>>>>>>> to be tuned, it shouldn't be a configuration at all. Otherwise >>> it >>>>>>>> should >>>>>>>>>> be >>>>>>>>>>> exposed for reference. >>>>>>>>>>> >>>>>>>>>>> I'm wondering if there is some compromise we can achieve which >>>>>> makes it >>>>>>>>>>> easier for end-users to integrate with tunable configs. >>>>>>>>>>> >>>>>>>>>>> One can imagine a large project to clean up all of our configs >>>> under >>>>>>>> some >>>>>>>>>>> new class with maybe IA.LimitedPrivate(CONFIG), but I fear >>> making >>>>>>>> perfect >>>>>>>>>>> (needing to migrate all configs) the enemy of good. >>>>>>>>>>> >>>>>>>>>>> A better option might be to make those classes which expose >>>> configs >>>>>>>>>>> LimitedPrivate(CONFIG) -- for example AsyncProcess and >>>>>>>>>>> ConnectionImplementation. That might be the most incremental >>>> change >>>>>> we >>>>>>>>>>> could make. We could handle this on a case-by-case basis. >>>>>>>>>>> >>>>>>>>>>> Does anyone have any thoughts? >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >>> >>
