+1 to vinoyang's suggestions. @Vino Yang <vinoy...@apache.org> , do you want to formalize this and update our contributing page?
On Thu, Jan 30, 2020 at 3:09 AM hmatu <hma...@foxmail.com> wrote: > Hi, > > > > I think these "MINOR" issues are important, a good project requires > not only functions, but also good coding style and habits. > > > Best > Hmatu > > > > > ------------------ Original ------------------ > From: "Vinoth Chandar"<vin...@apache.org>; > Date: Mon, Jan 27, 2020 03:25 AM > To: "dev"<dev@hudi.apache.org>; > > Subject: Re: [DISCUSS] Unify Hudi code cleanup and improvement > > > > Hi Vino, > > You raise a valid point on what "MINOR" PR should be. All JIRAs start out > in "NEW" state and committers have to "Accept" the issue already (to force > early conversations like this). > > May be we should draw some bounds on it like, "cannot be more than 50 > lines", "No functionality changes" .. etc? WDYT? This seems to be > the core > of the issue. > > On Thu, Jan 23, 2020 at 4:17 PM vino yang <yanghua1...@gmail.com> > wrote: > > > Hi Vinoth, > > > > Thank you for your thoughts, I agree that focusing on some higher > priority > > work is more valuable. > > > > This discussion is to sort out and manage the work that the community > is > > already doing. There are currently some PRs working on this type of > work, > > such as PR[1][2][3][4]. The community has not given guidance on these > > tasks. I think it's not very appropriate to open a "MINOR" PR > directly. So, > > I want to hear from the community and how to manage them more > effectively. > > The discussion does not encourage to give a higher priority to such > work. > > > > We haven't stopped this kind of work, so we should provide effective > > guidance and organization so that it doesn't look disorganized. WYDT? > > > > Best, > > Vino > > > > [1]: https://github.com/apache/incubator-hudi/pull/1237 > > [2]: https://github.com/apache/incubator-hudi/pull/1139 > > [3]: https://github.com/apache/incubator-hudi/pull/1137 > > [4]: https://github.com/apache/incubator-hudi/pull/1136 > > > > Vinoth Chandar <vin...@apache.org> 于2020年1月23日周四 下午1:20写道: > > > > > Hi, > > > > > > Thanks everyone for sharing your views! > > > > > > Some of this conversation is starting to feel like boiling the > ocean. I > > > believe in refactoring with purpose and discussing > class-by-class or > > > module-by-module does not make sense to me. Can we first list > down what > > we > > > want to achieve? So far, I have only heard fixing IDE/IntelliJ > warnings. > > > Also instead of focussing on new work, how about looking at the > pending > > > JIRAs under "Testing" "Code Cleanup" components first and see if > those > > are > > > worth tackling. > > > > > > We went down this path for code formatting and today we still > have > > > inconsistencies. Looking back, I feel we should have clearly > defined end > > > goals for the cleanups and we can then rank them based on ROI. > > > > > > Thanks > > > Vinoth > > > > > > On Wed, Jan 22, 2020 at 7:05 PM vino yang <yanghua1...@gmail.com> > wrote: > > > > > > > Hi Shiyan and Bhavani: > > > > > > > > Thanks for sharing your thoughts. > > > > > > > > As I originally stated. The advantage of using modules as a > unit to > > split > > > > work is that the decomposition is clear, but the > disadvantage is that > > the > > > > volume of changes may be huge, which brings huge risks > (considering > > that > > > > Hudi's test coverage is still not very high) and the > workload of > > review. > > > > The advantage of splitting by class is that the volume of > changes is > > > small > > > > and the review is more convenient, but the disadvantages > are too many > > > tasks > > > > and high maintenance costs. > > > > > > > > > > > > *In addition, we need to define the boundaries of the "code > cleanup" I > > > > expressed in this topic: it is limited to the smart tips > shown by > > > Intellij > > > > IDEA. If the boundaries are too wide, then this discussion > will lose > > > > control.* > > > > I agree with Bhavani that we don't take it as the actual > goal. But we > > are > > > > not opposed to the community to help improve the quality of > the code > > > > (basically, these tips given by the IDE are more > reasonable). > > > > > > > > > > > > So, I still give my thoughts: We manage this work with > Jira. Before we > > > > start working, we need to find a committer as a mentor. The > mentor must > > > > decide whether the scale of the subtasks is reasonable and > whether > > > > additional unit tests need to be added to verify the > changes. And the > > > > mentor should be responsible for merged changes. > > > > > > > > What do you think? > > > > > > > > Best, > > > > Vino > > > > > > > > Bhavani Sudha <bhavanisud...@gmail.com> 于2020年1月22日周三 > 下午2:22写道: > > > > > > > > > Hi @vinoyang thanks for bringing this to discussion. I > feel it would > > be > > > > > less disruptive to clean up code as part of individual > classes being > > > > > touched for a specific goal rather than code cleanup > being the actual > > > > goal. > > > > > This would narrow the touch point and ensure test > coverage (both unit > > > and > > > > > integration tests) catches any > accidental/unintentional changes. > > Also > > > it > > > > > would give chance to change any documentation > quoting/referencing > > that > > > > > code. Wanted to share my personal opinion. > > > > > > > > > > Thanks, > > > > > Sudha > > > > > > > > > > > > > > > > > > > > On Tue, Jan 21, 2020 at 11:36 AM Shiyan Xu < > > > xu.shiyan.raym...@gmail.com> > > > > > wrote: > > > > > > > > > > > The clean-up work can actually be split by > modules. > > > > > > > > > > > > Though it is generally a good practice to follow, > my concern is the > > > > > > clean-up is likely to cause conflicts with some > on-going changes. > > If > > > I > > > > > may > > > > > > suggest, the dedicated clean-up tasks should avoid > > > > > > - modules that are undergoing multiple feature > changes/PRs > > > > > > - modules that are planned to have major > refactoring due to design > > > > > changes > > > > > > (since clean-up can be done altogether during > refactoring) > > > > > > > > > > > > On Tue, Jan 21, 2020 at 4:17 AM Vinoth Chandar < > vin...@apache.org> > > > > > wrote: > > > > > > > > > > > > > Not sure if I fully agree with sweeping > statements being made. > > But, > > > > +1 > > > > > > for > > > > > > > structuring this work via Jiras and having > some committer > > “accept” > > > > the > > > > > > > issue first. Some of these tend to be > subjective and we do need > > to > > > > > make > > > > > > > different tradeoffs. > > > > > > > > > > > > > > On Tue, Jan 21, 2020 at 1:28 AM vino yang < > yanghua1...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Pratyaksh, > > > > > > > > > > > > > > > > Thanks for your thought. > > > > > > > > > > > > > > > > Let's listen to others' comments. If > there is no objection, we > > > will > > > > > > > follow > > > > > > > > this way. > > > > > > > > > > > > > > > > Best, > > > > > > > > Vino > > > > > > > > > > > > > > > > > > > > > > > > Pratyaksh Sharma <pratyaks...@gmail.com> > 于2020年1月21日周二 > > 下午4:56写道: > > > > > > > > > > > > > > > > > Hi Vino, > > > > > > > > > > > > > > > > > > Big +1 for this initiative. I have > done this code cleanup for > > > > test > > > > > > > > classes > > > > > > > > > in the past and strongly feel > there is a need to do the same > > at > > > > > other > > > > > > > > > places as well. I would definitely > like to volunteer for > > this. > > > > > > > > > > > > > > > > > > On Tue, Jan 21, 2020 at 1:52 PM > vino yang < > > > yanghua1...@gmail.com > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > > > > > > > Currently, the code quality > of some Hudi module is not very > > > > well. > > > > > > As > > > > > > > > many > > > > > > > > > > developers have seen, the > Intellij IDEA has shown many > > > > > intellisense > > > > > > > > about > > > > > > > > > > cleanup and improvement. The > community does not object to > > > doing > > > > > the > > > > > > > > > cleanup > > > > > > > > > > and improvement work and the > work has been started via some > > > > > direct > > > > > > > > > "minor" > > > > > > > > > > PRs by some volunteers. The > current way is unorganized and > > > hard > > > > > to > > > > > > > > > manage. > > > > > > > > > > For tracking this work, I > prefer to manage this work with > > the > > > > > Jira > > > > > > > > issue. > > > > > > > > > > We can create an umbrella > issue. Then, split the work into > > > > > several > > > > > > > > > > subtasks. > > > > > > > > > > > > > > > > > > > > Since those "bad smell" lays > anywhere in the whole project. > > > > It's > > > > > > > > > difficult > > > > > > > > > > to give a standard to split > the subtasks. For example, some > > > > files > > > > > > > have > > > > > > > > a > > > > > > > > > > lot while some modules have > few. So I suggest the standard > > > > would > > > > > > > depend > > > > > > > > > on > > > > > > > > > > the volume of the changes. > Before working, any subtask > > should > > > > > find > > > > > > a > > > > > > > > > > committer as a mentor who > would judge and approve the scope > > > is > > > > > > > > suitable. > > > > > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > > > Any comments and suggestions > would be appreciated. > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Vino > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >