Hi Vinoth,

Yes, I'd like to give some suggestions about the "MINOR" PR.
Will file a Jira issue to track this work.

Best,
Vino

Vinoth Chandar <vin...@apache.org> 于2020年2月4日周二 上午8:16写道:

> +1 to vinoyang's suggestions.
>
> @Vino Yang <vinoy...@apache.org>  , do you want to formalize this and
> update our contributing page?
>
> On Thu, Jan 30, 2020 at 3:09 AM hmatu <hma...@foxmail.com> wrote:
>
> > Hi,
> >
> >
> >
> > I think these "MINOR" issues are important, a good&nbsp;project requires
> > not only functions, but also good coding style and  habits.
> >
> >
> > Best
> > Hmatu
> >
> >
> >
> >
> > ------------------&nbsp;Original&nbsp;------------------
> > From:&nbsp;"Vinoth Chandar"<vin...@apache.org&gt;;
> > Date:&nbsp;Mon, Jan 27, 2020 03:25 AM
> > To:&nbsp;"dev"<dev@hudi.apache.org&gt;;
> >
> > Subject:&nbsp;Re: [DISCUSS] Unify Hudi code cleanup and improvement
> >
> >
> >
> > Hi Vino,
> >
> > You raise a valid point on what "MINOR" PR should be. All JIRAs start out
> > in "NEW" state and committers have to "Accept" the issue already (to
> force
> > early conversations like this).
> >
> > May be we should draw some bounds on it like, "cannot be more than 50
> > lines", "No functionality changes" .. etc? WDYT?&nbsp; This seems to be
> > the core
> > of the issue.
> >
> > On Thu, Jan 23, 2020 at 4:17 PM vino yang <yanghua1...@gmail.com&gt;
> > wrote:
> >
> > &gt; Hi Vinoth,
> > &gt;
> > &gt; Thank you for your thoughts, I agree that focusing on some higher
> > priority
> > &gt; work is more valuable.
> > &gt;
> > &gt; This discussion is to sort out and manage the work that the
> community
> > is
> > &gt; already doing. There are currently some PRs working on this type of
> > work,
> > &gt; such as PR[1][2][3][4]. The community has not given guidance on
> these
> > &gt; tasks. I think it's not very appropriate to open a "MINOR" PR
> > directly. So,
> > &gt; I want to hear from the community and how to manage them more
> > effectively.
> > &gt; The discussion does not encourage to give a higher priority to such
> > work.
> > &gt;
> > &gt; We haven't stopped this kind of work, so we should provide effective
> > &gt; guidance and organization so that it doesn't look disorganized.
> WYDT?
> > &gt;
> > &gt; Best,
> > &gt; Vino
> > &gt;
> > &gt; [1]: https://github.com/apache/incubator-hudi/pull/1237
> > &gt; [2]: https://github.com/apache/incubator-hudi/pull/1139
> > &gt; [3]: https://github.com/apache/incubator-hudi/pull/1137
> > &gt; [4]: https://github.com/apache/incubator-hudi/pull/1136
> > &gt;
> > &gt; Vinoth Chandar <vin...@apache.org&gt; 于2020年1月23日周四 下午1:20写道:
> > &gt;
> > &gt; &gt; Hi,
> > &gt; &gt;
> > &gt; &gt; Thanks everyone for sharing your views!
> > &gt; &gt;
> > &gt; &gt; Some of this conversation is starting to feel like boiling the
> > ocean. I
> > &gt; &gt; believe in refactoring with purpose and discussing
> > class-by-class or
> > &gt; &gt; module-by-module does not make sense to me. Can we first list
> > down what
> > &gt; we
> > &gt; &gt; want to achieve? So far, I have only heard fixing IDE/IntelliJ
> > warnings.
> > &gt; &gt; Also instead of focussing on new work, how about looking at the
> > pending
> > &gt; &gt; JIRAs under "Testing" "Code Cleanup" components first and see
> if
> > those
> > &gt; are
> > &gt; &gt; worth tackling.
> > &gt; &gt;
> > &gt; &gt; We went down this path for code formatting and today we still
> > have
> > &gt; &gt; inconsistencies. Looking back, I feel we should have clearly
> > defined end
> > &gt; &gt; goals for the cleanups and we can then rank them based on ROI.
> > &gt; &gt;
> > &gt; &gt; Thanks
> > &gt; &gt; Vinoth
> > &gt; &gt;
> > &gt; &gt; On Wed, Jan 22, 2020 at 7:05 PM vino yang <
> yanghua1...@gmail.com&gt;
> > wrote:
> > &gt; &gt;
> > &gt; &gt; &gt; Hi Shiyan and Bhavani:
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; Thanks for sharing your thoughts.
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; As I originally stated. The advantage of using modules as
> a
> > unit to
> > &gt; split
> > &gt; &gt; &gt; work is that the decomposition is clear, but the
> > disadvantage is that
> > &gt; the
> > &gt; &gt; &gt; volume of changes may be huge, which brings huge risks
> > (considering
> > &gt; that
> > &gt; &gt; &gt; Hudi's test coverage is still not very high) and the
> > workload of
> > &gt; review.
> > &gt; &gt; &gt; The advantage of splitting by class is that the volume of
> > changes is
> > &gt; &gt; small
> > &gt; &gt; &gt; and the review is more convenient, but the disadvantages
> > are too many
> > &gt; &gt; tasks
> > &gt; &gt; &gt; and high maintenance costs.
> > &gt; &gt; &gt;
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; *In addition, we need to define the boundaries of the
> "code
> > cleanup" I
> > &gt; &gt; &gt; expressed in this topic: it is limited to the smart tips
> > shown by
> > &gt; &gt; Intellij
> > &gt; &gt; &gt; IDEA. If the boundaries are too wide, then this discussion
> > will lose
> > &gt; &gt; &gt; control.*
> > &gt; &gt; &gt; I agree with Bhavani that we don't take it as the actual
> > goal. But we
> > &gt; are
> > &gt; &gt; &gt; not opposed to the community to help improve the quality
> of
> > the code
> > &gt; &gt; &gt; (basically, these tips given by the IDE are more
> > reasonable).
> > &gt; &gt; &gt;
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; So, I still give my thoughts: We manage this work with
> > Jira. Before we
> > &gt; &gt; &gt; start working, we need to find a committer as a mentor.
> The
> > mentor must
> > &gt; &gt; &gt; decide whether the scale of the subtasks is reasonable and
> > whether
> > &gt; &gt; &gt; additional unit tests need to be added to verify the
> > changes. And the
> > &gt; &gt; &gt; mentor should be responsible for merged changes.
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; What do you think?
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; Best,
> > &gt; &gt; &gt; Vino
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; Bhavani Sudha <bhavanisud...@gmail.com&gt; 于2020年1月22日周三
> > 下午2:22写道:
> > &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; Hi @vinoyang thanks for bringing this to discussion.
> I
> > feel it would
> > &gt; be
> > &gt; &gt; &gt; &gt; less disruptive to clean up code as part of
> individual
> > classes being
> > &gt; &gt; &gt; &gt; touched for a specific goal rather than code cleanup
> > being the actual
> > &gt; &gt; &gt; goal.
> > &gt; &gt; &gt; &gt; This would narrow the touch point and ensure test
> > coverage (both unit
> > &gt; &gt; and
> > &gt; &gt; &gt; &gt; integration tests)&nbsp; catches any
> > accidental/unintentional changes.
> > &gt; Also
> > &gt; &gt; it
> > &gt; &gt; &gt; &gt; would give chance to change any documentation
> > quoting/referencing
> > &gt; that
> > &gt; &gt; &gt; &gt; code. Wanted to share my personal opinion.
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; Thanks,
> > &gt; &gt; &gt; &gt; Sudha
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 11:36 AM Shiyan Xu <
> > &gt; &gt; xu.shiyan.raym...@gmail.com&gt;
> > &gt; &gt; &gt; &gt; wrote:
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; The clean-up work can actually be split by
> > modules.
> > &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; Though it is generally a good practice to
> follow,
> > my concern is the
> > &gt; &gt; &gt; &gt; &gt; clean-up is likely to cause conflicts with some
> > on-going changes.
> > &gt; If
> > &gt; &gt; I
> > &gt; &gt; &gt; &gt; may
> > &gt; &gt; &gt; &gt; &gt; suggest, the dedicated clean-up tasks should
> avoid
> > &gt; &gt; &gt; &gt; &gt; - modules that are undergoing multiple feature
> > changes/PRs
> > &gt; &gt; &gt; &gt; &gt; - modules that are planned to have major
> > refactoring due to design
> > &gt; &gt; &gt; &gt; changes
> > &gt; &gt; &gt; &gt; &gt; (since clean-up can be done altogether during
> > refactoring)
> > &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 4:17 AM Vinoth Chandar <
> > vin...@apache.org&gt;
> > &gt; &gt; &gt; &gt; wrote:
> > &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; Not sure if I fully agree with sweeping
> > statements being made.
> > &gt; But,
> > &gt; &gt; &gt; +1
> > &gt; &gt; &gt; &gt; &gt; for
> > &gt; &gt; &gt; &gt; &gt; &gt; structuring this work via Jiras and having
> > some committer
> > &gt; “accept”
> > &gt; &gt; &gt; the
> > &gt; &gt; &gt; &gt; &gt; &gt; issue first.&nbsp; Some of these tend to be
> > subjective and we do need
> > &gt; to
> > &gt; &gt; &gt; &gt; make
> > &gt; &gt; &gt; &gt; &gt; &gt; different tradeoffs.
> > &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 1:28 AM vino yang <
> > yanghua1...@gmail.com
> > &gt; &gt;
> > &gt; &gt; &gt; &gt; wrote:
> > &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi Pratyaksh,
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Thanks for your thought.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Let's listen to others' comments. If
> > there is no objection, we
> > &gt; &gt; will
> > &gt; &gt; &gt; &gt; &gt; &gt; follow
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; this way.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Best,
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Vino
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Pratyaksh Sharma <
> pratyaks...@gmail.com&gt;
> > 于2020年1月21日周二
> > &gt; 下午4:56写道:
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi Vino,
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Big +1 for this initiative. I
> have
> > done this code cleanup for
> > &gt; &gt; &gt; test
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; classes
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; in the past and strongly feel
> > there is a need to do the same
> > &gt; at
> > &gt; &gt; &gt; &gt; other
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; places as well. I would
> definitely
> > like to volunteer for
> > &gt; this.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 1:52 PM
> > vino yang <
> > &gt; &gt; yanghua1...@gmail.com
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; wrote:
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi folks,
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Currently, the code quality
> > of some Hudi module is not very
> > &gt; &gt; &gt; well.
> > &gt; &gt; &gt; &gt; &gt; As
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; many
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; developers have seen, the
> > Intellij IDEA has shown many
> > &gt; &gt; &gt; &gt; intellisense
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; about
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; cleanup and improvement. The
> > community does not object to
> > &gt; &gt; doing
> > &gt; &gt; &gt; &gt; the
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; cleanup
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; and improvement work and the
> > work has been started via some
> > &gt; &gt; &gt; &gt; direct
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; "minor"
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; PRs by some volunteers. The
> > current way is unorganized and
> > &gt; &gt; hard
> > &gt; &gt; &gt; &gt; to
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; manage.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; For tracking this work, I
> > prefer to manage this work with
> > &gt; the
> > &gt; &gt; &gt; &gt; Jira
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; issue.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; We can create an umbrella
> > issue. Then, split the work into
> > &gt; &gt; &gt; &gt; several
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; subtasks.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Since those "bad smell" lays
> > anywhere in the whole project.
> > &gt; &gt; &gt; It's
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; difficult
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; to give a standard to split
> > the subtasks. For example, some
> > &gt; &gt; &gt; files
> > &gt; &gt; &gt; &gt; &gt; &gt; have
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; a
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; lot while some modules have
> > few. So I suggest the standard
> > &gt; &gt; &gt; would
> > &gt; &gt; &gt; &gt; &gt; &gt; depend
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; on
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; the volume of the changes.
> > Before working, any subtask
> > &gt; should
> > &gt; &gt; &gt; &gt; find
> > &gt; &gt; &gt; &gt; &gt; a
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; committer as a mentor who
> > would judge and approve the scope
> > &gt; &gt; is
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; suitable.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; What do you think?
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Any comments and suggestions
> > would be appreciated.
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Best,
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Vino
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt; &gt;
> > &gt; &gt; &gt;
> > &gt; &gt;
> > &gt;
>

Reply via email to