+1 to vinoyang's suggestions.

@Vino Yang <vinoy...@apache.org>  , do you want to formalize this and
update our contributing page?

On Thu, Jan 30, 2020 at 3:09 AM hmatu <hma...@foxmail.com> wrote:

> Hi,
>
>
>
> I think these "MINOR" issues are important, a good&nbsp;project requires
> not only functions, but also good coding style and  habits.
>
>
> Best
> Hmatu
>
>
>
>
> ------------------&nbsp;Original&nbsp;------------------
> From:&nbsp;"Vinoth Chandar"<vin...@apache.org&gt;;
> Date:&nbsp;Mon, Jan 27, 2020 03:25 AM
> To:&nbsp;"dev"<dev@hudi.apache.org&gt;;
>
> Subject:&nbsp;Re: [DISCUSS] Unify Hudi code cleanup and improvement
>
>
>
> Hi Vino,
>
> You raise a valid point on what "MINOR" PR should be. All JIRAs start out
> in "NEW" state and committers have to "Accept" the issue already (to force
> early conversations like this).
>
> May be we should draw some bounds on it like, "cannot be more than 50
> lines", "No functionality changes" .. etc? WDYT?&nbsp; This seems to be
> the core
> of the issue.
>
> On Thu, Jan 23, 2020 at 4:17 PM vino yang <yanghua1...@gmail.com&gt;
> wrote:
>
> &gt; Hi Vinoth,
> &gt;
> &gt; Thank you for your thoughts, I agree that focusing on some higher
> priority
> &gt; work is more valuable.
> &gt;
> &gt; This discussion is to sort out and manage the work that the community
> is
> &gt; already doing. There are currently some PRs working on this type of
> work,
> &gt; such as PR[1][2][3][4]. The community has not given guidance on these
> &gt; tasks. I think it's not very appropriate to open a "MINOR" PR
> directly. So,
> &gt; I want to hear from the community and how to manage them more
> effectively.
> &gt; The discussion does not encourage to give a higher priority to such
> work.
> &gt;
> &gt; We haven't stopped this kind of work, so we should provide effective
> &gt; guidance and organization so that it doesn't look disorganized. WYDT?
> &gt;
> &gt; Best,
> &gt; Vino
> &gt;
> &gt; [1]: https://github.com/apache/incubator-hudi/pull/1237
> &gt; [2]: https://github.com/apache/incubator-hudi/pull/1139
> &gt; [3]: https://github.com/apache/incubator-hudi/pull/1137
> &gt; [4]: https://github.com/apache/incubator-hudi/pull/1136
> &gt;
> &gt; Vinoth Chandar <vin...@apache.org&gt; 于2020年1月23日周四 下午1:20写道:
> &gt;
> &gt; &gt; Hi,
> &gt; &gt;
> &gt; &gt; Thanks everyone for sharing your views!
> &gt; &gt;
> &gt; &gt; Some of this conversation is starting to feel like boiling the
> ocean. I
> &gt; &gt; believe in refactoring with purpose and discussing
> class-by-class or
> &gt; &gt; module-by-module does not make sense to me. Can we first list
> down what
> &gt; we
> &gt; &gt; want to achieve? So far, I have only heard fixing IDE/IntelliJ
> warnings.
> &gt; &gt; Also instead of focussing on new work, how about looking at the
> pending
> &gt; &gt; JIRAs under "Testing" "Code Cleanup" components first and see if
> those
> &gt; are
> &gt; &gt; worth tackling.
> &gt; &gt;
> &gt; &gt; We went down this path for code formatting and today we still
> have
> &gt; &gt; inconsistencies. Looking back, I feel we should have clearly
> defined end
> &gt; &gt; goals for the cleanups and we can then rank them based on ROI.
> &gt; &gt;
> &gt; &gt; Thanks
> &gt; &gt; Vinoth
> &gt; &gt;
> &gt; &gt; On Wed, Jan 22, 2020 at 7:05 PM vino yang <yanghua1...@gmail.com&gt;
> wrote:
> &gt; &gt;
> &gt; &gt; &gt; Hi Shiyan and Bhavani:
> &gt; &gt; &gt;
> &gt; &gt; &gt; Thanks for sharing your thoughts.
> &gt; &gt; &gt;
> &gt; &gt; &gt; As I originally stated. The advantage of using modules as a
> unit to
> &gt; split
> &gt; &gt; &gt; work is that the decomposition is clear, but the
> disadvantage is that
> &gt; the
> &gt; &gt; &gt; volume of changes may be huge, which brings huge risks
> (considering
> &gt; that
> &gt; &gt; &gt; Hudi's test coverage is still not very high) and the
> workload of
> &gt; review.
> &gt; &gt; &gt; The advantage of splitting by class is that the volume of
> changes is
> &gt; &gt; small
> &gt; &gt; &gt; and the review is more convenient, but the disadvantages
> are too many
> &gt; &gt; tasks
> &gt; &gt; &gt; and high maintenance costs.
> &gt; &gt; &gt;
> &gt; &gt; &gt;
> &gt; &gt; &gt; *In addition, we need to define the boundaries of the "code
> cleanup" I
> &gt; &gt; &gt; expressed in this topic: it is limited to the smart tips
> shown by
> &gt; &gt; Intellij
> &gt; &gt; &gt; IDEA. If the boundaries are too wide, then this discussion
> will lose
> &gt; &gt; &gt; control.*
> &gt; &gt; &gt; I agree with Bhavani that we don't take it as the actual
> goal. But we
> &gt; are
> &gt; &gt; &gt; not opposed to the community to help improve the quality of
> the code
> &gt; &gt; &gt; (basically, these tips given by the IDE are more
> reasonable).
> &gt; &gt; &gt;
> &gt; &gt; &gt;
> &gt; &gt; &gt; So, I still give my thoughts: We manage this work with
> Jira. Before we
> &gt; &gt; &gt; start working, we need to find a committer as a mentor. The
> mentor must
> &gt; &gt; &gt; decide whether the scale of the subtasks is reasonable and
> whether
> &gt; &gt; &gt; additional unit tests need to be added to verify the
> changes. And the
> &gt; &gt; &gt; mentor should be responsible for merged changes.
> &gt; &gt; &gt;
> &gt; &gt; &gt; What do you think?
> &gt; &gt; &gt;
> &gt; &gt; &gt; Best,
> &gt; &gt; &gt; Vino
> &gt; &gt; &gt;
> &gt; &gt; &gt; Bhavani Sudha <bhavanisud...@gmail.com&gt; 于2020年1月22日周三
> 下午2:22写道:
> &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; Hi @vinoyang thanks for bringing this to discussion. I
> feel it would
> &gt; be
> &gt; &gt; &gt; &gt; less disruptive to clean up code as part of individual
> classes being
> &gt; &gt; &gt; &gt; touched for a specific goal rather than code cleanup
> being the actual
> &gt; &gt; &gt; goal.
> &gt; &gt; &gt; &gt; This would narrow the touch point and ensure test
> coverage (both unit
> &gt; &gt; and
> &gt; &gt; &gt; &gt; integration tests)&nbsp; catches any
> accidental/unintentional changes.
> &gt; Also
> &gt; &gt; it
> &gt; &gt; &gt; &gt; would give chance to change any documentation
> quoting/referencing
> &gt; that
> &gt; &gt; &gt; &gt; code. Wanted to share my personal opinion.
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; Thanks,
> &gt; &gt; &gt; &gt; Sudha
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 11:36 AM Shiyan Xu <
> &gt; &gt; xu.shiyan.raym...@gmail.com&gt;
> &gt; &gt; &gt; &gt; wrote:
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; The clean-up work can actually be split by
> modules.
> &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; Though it is generally a good practice to follow,
> my concern is the
> &gt; &gt; &gt; &gt; &gt; clean-up is likely to cause conflicts with some
> on-going changes.
> &gt; If
> &gt; &gt; I
> &gt; &gt; &gt; &gt; may
> &gt; &gt; &gt; &gt; &gt; suggest, the dedicated clean-up tasks should avoid
> &gt; &gt; &gt; &gt; &gt; - modules that are undergoing multiple feature
> changes/PRs
> &gt; &gt; &gt; &gt; &gt; - modules that are planned to have major
> refactoring due to design
> &gt; &gt; &gt; &gt; changes
> &gt; &gt; &gt; &gt; &gt; (since clean-up can be done altogether during
> refactoring)
> &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 4:17 AM Vinoth Chandar <
> vin...@apache.org&gt;
> &gt; &gt; &gt; &gt; wrote:
> &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; Not sure if I fully agree with sweeping
> statements being made.
> &gt; But,
> &gt; &gt; &gt; +1
> &gt; &gt; &gt; &gt; &gt; for
> &gt; &gt; &gt; &gt; &gt; &gt; structuring this work via Jiras and having
> some committer
> &gt; “accept”
> &gt; &gt; &gt; the
> &gt; &gt; &gt; &gt; &gt; &gt; issue first.&nbsp; Some of these tend to be
> subjective and we do need
> &gt; to
> &gt; &gt; &gt; &gt; make
> &gt; &gt; &gt; &gt; &gt; &gt; different tradeoffs.
> &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 1:28 AM vino yang <
> yanghua1...@gmail.com
> &gt; &gt;
> &gt; &gt; &gt; &gt; wrote:
> &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi Pratyaksh,
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; Thanks for your thought.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; Let's listen to others' comments. If
> there is no objection, we
> &gt; &gt; will
> &gt; &gt; &gt; &gt; &gt; &gt; follow
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; this way.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; Best,
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; Vino
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; Pratyaksh Sharma <pratyaks...@gmail.com&gt;
> 于2020年1月21日周二
> &gt; 下午4:56写道:
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi Vino,
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Big +1 for this initiative. I have
> done this code cleanup for
> &gt; &gt; &gt; test
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; classes
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; in the past and strongly feel
> there is a need to do the same
> &gt; at
> &gt; &gt; &gt; &gt; other
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; places as well. I would definitely
> like to volunteer for
> &gt; this.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 1:52 PM
> vino yang <
> &gt; &gt; yanghua1...@gmail.com
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; wrote:
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi folks,
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Currently, the code quality
> of some Hudi module is not very
> &gt; &gt; &gt; well.
> &gt; &gt; &gt; &gt; &gt; As
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; many
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; developers have seen, the
> Intellij IDEA has shown many
> &gt; &gt; &gt; &gt; intellisense
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; about
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; cleanup and improvement. The
> community does not object to
> &gt; &gt; doing
> &gt; &gt; &gt; &gt; the
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; cleanup
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; and improvement work and the
> work has been started via some
> &gt; &gt; &gt; &gt; direct
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; "minor"
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; PRs by some volunteers. The
> current way is unorganized and
> &gt; &gt; hard
> &gt; &gt; &gt; &gt; to
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; manage.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; For tracking this work, I
> prefer to manage this work with
> &gt; the
> &gt; &gt; &gt; &gt; Jira
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; issue.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; We can create an umbrella
> issue. Then, split the work into
> &gt; &gt; &gt; &gt; several
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; subtasks.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Since those "bad smell" lays
> anywhere in the whole project.
> &gt; &gt; &gt; It's
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; difficult
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; to give a standard to split
> the subtasks. For example, some
> &gt; &gt; &gt; files
> &gt; &gt; &gt; &gt; &gt; &gt; have
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; a
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; lot while some modules have
> few. So I suggest the standard
> &gt; &gt; &gt; would
> &gt; &gt; &gt; &gt; &gt; &gt; depend
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; on
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; the volume of the changes.
> Before working, any subtask
> &gt; should
> &gt; &gt; &gt; &gt; find
> &gt; &gt; &gt; &gt; &gt; a
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; committer as a mentor who
> would judge and approve the scope
> &gt; &gt; is
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; suitable.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; What do you think?
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Any comments and suggestions
> would be appreciated.
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Best,
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Vino
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt; &gt;
> &gt; &gt; &gt; &gt;
> &gt; &gt; &gt;
> &gt; &gt;
> &gt;

Reply via email to