Hi all,

Have filed a Jira issue: https://issues.apache.org/jira/browse/HUDI-602

Best,
Vino

vino yang <yanghua1...@gmail.com> 于2020年2月4日周二 下午9:39写道:

> Hi Vinoth,
>
> Yes, I'd like to give some suggestions about the "MINOR" PR.
> Will file a Jira issue to track this work.
>
> Best,
> Vino
>
> Vinoth Chandar <vin...@apache.org> 于2020年2月4日周二 上午8:16写道:
>
>> +1 to vinoyang's suggestions.
>>
>> @Vino Yang <vinoy...@apache.org>  , do you want to formalize this and
>> update our contributing page?
>>
>> On Thu, Jan 30, 2020 at 3:09 AM hmatu <hma...@foxmail.com> wrote:
>>
>> > Hi,
>> >
>> >
>> >
>> > I think these "MINOR" issues are important, a good&nbsp;project requires
>> > not only functions, but also good coding style and  habits.
>> >
>> >
>> > Best
>> > Hmatu
>> >
>> >
>> >
>> >
>> > ------------------&nbsp;Original&nbsp;------------------
>> > From:&nbsp;"Vinoth Chandar"<vin...@apache.org&gt;;
>> > Date:&nbsp;Mon, Jan 27, 2020 03:25 AM
>> > To:&nbsp;"dev"<dev@hudi.apache.org&gt;;
>> >
>> > Subject:&nbsp;Re: [DISCUSS] Unify Hudi code cleanup and improvement
>> >
>> >
>> >
>> > Hi Vino,
>> >
>> > You raise a valid point on what "MINOR" PR should be. All JIRAs start
>> out
>> > in "NEW" state and committers have to "Accept" the issue already (to
>> force
>> > early conversations like this).
>> >
>> > May be we should draw some bounds on it like, "cannot be more than 50
>> > lines", "No functionality changes" .. etc? WDYT?&nbsp; This seems to be
>> > the core
>> > of the issue.
>> >
>> > On Thu, Jan 23, 2020 at 4:17 PM vino yang <yanghua1...@gmail.com&gt;
>> > wrote:
>> >
>> > &gt; Hi Vinoth,
>> > &gt;
>> > &gt; Thank you for your thoughts, I agree that focusing on some higher
>> > priority
>> > &gt; work is more valuable.
>> > &gt;
>> > &gt; This discussion is to sort out and manage the work that the
>> community
>> > is
>> > &gt; already doing. There are currently some PRs working on this type of
>> > work,
>> > &gt; such as PR[1][2][3][4]. The community has not given guidance on
>> these
>> > &gt; tasks. I think it's not very appropriate to open a "MINOR" PR
>> > directly. So,
>> > &gt; I want to hear from the community and how to manage them more
>> > effectively.
>> > &gt; The discussion does not encourage to give a higher priority to such
>> > work.
>> > &gt;
>> > &gt; We haven't stopped this kind of work, so we should provide
>> effective
>> > &gt; guidance and organization so that it doesn't look disorganized.
>> WYDT?
>> > &gt;
>> > &gt; Best,
>> > &gt; Vino
>> > &gt;
>> > &gt; [1]: https://github.com/apache/incubator-hudi/pull/1237
>> > &gt; [2]: https://github.com/apache/incubator-hudi/pull/1139
>> > &gt; [3]: https://github.com/apache/incubator-hudi/pull/1137
>> > &gt; [4]: https://github.com/apache/incubator-hudi/pull/1136
>> > &gt;
>> > &gt; Vinoth Chandar <vin...@apache.org&gt; 于2020年1月23日周四 下午1:20写道:
>> > &gt;
>> > &gt; &gt; Hi,
>> > &gt; &gt;
>> > &gt; &gt; Thanks everyone for sharing your views!
>> > &gt; &gt;
>> > &gt; &gt; Some of this conversation is starting to feel like boiling the
>> > ocean. I
>> > &gt; &gt; believe in refactoring with purpose and discussing
>> > class-by-class or
>> > &gt; &gt; module-by-module does not make sense to me. Can we first list
>> > down what
>> > &gt; we
>> > &gt; &gt; want to achieve? So far, I have only heard fixing IDE/IntelliJ
>> > warnings.
>> > &gt; &gt; Also instead of focussing on new work, how about looking at
>> the
>> > pending
>> > &gt; &gt; JIRAs under "Testing" "Code Cleanup" components first and see
>> if
>> > those
>> > &gt; are
>> > &gt; &gt; worth tackling.
>> > &gt; &gt;
>> > &gt; &gt; We went down this path for code formatting and today we still
>> > have
>> > &gt; &gt; inconsistencies. Looking back, I feel we should have clearly
>> > defined end
>> > &gt; &gt; goals for the cleanups and we can then rank them based on ROI.
>> > &gt; &gt;
>> > &gt; &gt; Thanks
>> > &gt; &gt; Vinoth
>> > &gt; &gt;
>> > &gt; &gt; On Wed, Jan 22, 2020 at 7:05 PM vino yang <
>> yanghua1...@gmail.com&gt;
>> > wrote:
>> > &gt; &gt;
>> > &gt; &gt; &gt; Hi Shiyan and Bhavani:
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; Thanks for sharing your thoughts.
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; As I originally stated. The advantage of using modules
>> as a
>> > unit to
>> > &gt; split
>> > &gt; &gt; &gt; work is that the decomposition is clear, but the
>> > disadvantage is that
>> > &gt; the
>> > &gt; &gt; &gt; volume of changes may be huge, which brings huge risks
>> > (considering
>> > &gt; that
>> > &gt; &gt; &gt; Hudi's test coverage is still not very high) and the
>> > workload of
>> > &gt; review.
>> > &gt; &gt; &gt; The advantage of splitting by class is that the volume of
>> > changes is
>> > &gt; &gt; small
>> > &gt; &gt; &gt; and the review is more convenient, but the disadvantages
>> > are too many
>> > &gt; &gt; tasks
>> > &gt; &gt; &gt; and high maintenance costs.
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; *In addition, we need to define the boundaries of the
>> "code
>> > cleanup" I
>> > &gt; &gt; &gt; expressed in this topic: it is limited to the smart tips
>> > shown by
>> > &gt; &gt; Intellij
>> > &gt; &gt; &gt; IDEA. If the boundaries are too wide, then this
>> discussion
>> > will lose
>> > &gt; &gt; &gt; control.*
>> > &gt; &gt; &gt; I agree with Bhavani that we don't take it as the actual
>> > goal. But we
>> > &gt; are
>> > &gt; &gt; &gt; not opposed to the community to help improve the quality
>> of
>> > the code
>> > &gt; &gt; &gt; (basically, these tips given by the IDE are more
>> > reasonable).
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; So, I still give my thoughts: We manage this work with
>> > Jira. Before we
>> > &gt; &gt; &gt; start working, we need to find a committer as a mentor.
>> The
>> > mentor must
>> > &gt; &gt; &gt; decide whether the scale of the subtasks is reasonable
>> and
>> > whether
>> > &gt; &gt; &gt; additional unit tests need to be added to verify the
>> > changes. And the
>> > &gt; &gt; &gt; mentor should be responsible for merged changes.
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; What do you think?
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; Best,
>> > &gt; &gt; &gt; Vino
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; Bhavani Sudha <bhavanisud...@gmail.com&gt; 于2020年1月22日周三
>> > 下午2:22写道:
>> > &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; Hi @vinoyang thanks for bringing this to
>> discussion. I
>> > feel it would
>> > &gt; be
>> > &gt; &gt; &gt; &gt; less disruptive to clean up code as part of
>> individual
>> > classes being
>> > &gt; &gt; &gt; &gt; touched for a specific goal rather than code cleanup
>> > being the actual
>> > &gt; &gt; &gt; goal.
>> > &gt; &gt; &gt; &gt; This would narrow the touch point and ensure test
>> > coverage (both unit
>> > &gt; &gt; and
>> > &gt; &gt; &gt; &gt; integration tests)&nbsp; catches any
>> > accidental/unintentional changes.
>> > &gt; Also
>> > &gt; &gt; it
>> > &gt; &gt; &gt; &gt; would give chance to change any documentation
>> > quoting/referencing
>> > &gt; that
>> > &gt; &gt; &gt; &gt; code. Wanted to share my personal opinion.
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; Thanks,
>> > &gt; &gt; &gt; &gt; Sudha
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 11:36 AM Shiyan Xu <
>> > &gt; &gt; xu.shiyan.raym...@gmail.com&gt;
>> > &gt; &gt; &gt; &gt; wrote:
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; The clean-up work can actually be split by
>> > modules.
>> > &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; Though it is generally a good practice to
>> follow,
>> > my concern is the
>> > &gt; &gt; &gt; &gt; &gt; clean-up is likely to cause conflicts with some
>> > on-going changes.
>> > &gt; If
>> > &gt; &gt; I
>> > &gt; &gt; &gt; &gt; may
>> > &gt; &gt; &gt; &gt; &gt; suggest, the dedicated clean-up tasks should
>> avoid
>> > &gt; &gt; &gt; &gt; &gt; - modules that are undergoing multiple feature
>> > changes/PRs
>> > &gt; &gt; &gt; &gt; &gt; - modules that are planned to have major
>> > refactoring due to design
>> > &gt; &gt; &gt; &gt; changes
>> > &gt; &gt; &gt; &gt; &gt; (since clean-up can be done altogether during
>> > refactoring)
>> > &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 4:17 AM Vinoth Chandar
>> <
>> > vin...@apache.org&gt;
>> > &gt; &gt; &gt; &gt; wrote:
>> > &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; Not sure if I fully agree with sweeping
>> > statements being made.
>> > &gt; But,
>> > &gt; &gt; &gt; +1
>> > &gt; &gt; &gt; &gt; &gt; for
>> > &gt; &gt; &gt; &gt; &gt; &gt; structuring this work via Jiras and having
>> > some committer
>> > &gt; “accept”
>> > &gt; &gt; &gt; the
>> > &gt; &gt; &gt; &gt; &gt; &gt; issue first.&nbsp; Some of these tend to
>> be
>> > subjective and we do need
>> > &gt; to
>> > &gt; &gt; &gt; &gt; make
>> > &gt; &gt; &gt; &gt; &gt; &gt; different tradeoffs.
>> > &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 1:28 AM vino yang
>> <
>> > yanghua1...@gmail.com
>> > &gt; &gt;
>> > &gt; &gt; &gt; &gt; wrote:
>> > &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi Pratyaksh,
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Thanks for your thought.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Let's listen to others' comments. If
>> > there is no objection, we
>> > &gt; &gt; will
>> > &gt; &gt; &gt; &gt; &gt; &gt; follow
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; this way.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Best,
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Vino
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; Pratyaksh Sharma <
>> pratyaks...@gmail.com&gt;
>> > 于2020年1月21日周二
>> > &gt; 下午4:56写道:
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi Vino,
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Big +1 for this initiative. I
>> have
>> > done this code cleanup for
>> > &gt; &gt; &gt; test
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; classes
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; in the past and strongly feel
>> > there is a need to do the same
>> > &gt; at
>> > &gt; &gt; &gt; &gt; other
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; places as well. I would
>> definitely
>> > like to volunteer for
>> > &gt; this.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Tue, Jan 21, 2020 at 1:52 PM
>> > vino yang <
>> > &gt; &gt; yanghua1...@gmail.com
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; wrote:
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi folks,
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Currently, the code quality
>> > of some Hudi module is not very
>> > &gt; &gt; &gt; well.
>> > &gt; &gt; &gt; &gt; &gt; As
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; many
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; developers have seen, the
>> > Intellij IDEA has shown many
>> > &gt; &gt; &gt; &gt; intellisense
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; about
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; cleanup and improvement.
>> The
>> > community does not object to
>> > &gt; &gt; doing
>> > &gt; &gt; &gt; &gt; the
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; cleanup
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; and improvement work and
>> the
>> > work has been started via some
>> > &gt; &gt; &gt; &gt; direct
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; "minor"
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; PRs by some volunteers. The
>> > current way is unorganized and
>> > &gt; &gt; hard
>> > &gt; &gt; &gt; &gt; to
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; manage.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; For tracking this work, I
>> > prefer to manage this work with
>> > &gt; the
>> > &gt; &gt; &gt; &gt; Jira
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; issue.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; We can create an umbrella
>> > issue. Then, split the work into
>> > &gt; &gt; &gt; &gt; several
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; subtasks.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Since those "bad smell"
>> lays
>> > anywhere in the whole project.
>> > &gt; &gt; &gt; It's
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; difficult
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; to give a standard to split
>> > the subtasks. For example, some
>> > &gt; &gt; &gt; files
>> > &gt; &gt; &gt; &gt; &gt; &gt; have
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; a
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; lot while some modules have
>> > few. So I suggest the standard
>> > &gt; &gt; &gt; would
>> > &gt; &gt; &gt; &gt; &gt; &gt; depend
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; on
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; the volume of the changes.
>> > Before working, any subtask
>> > &gt; should
>> > &gt; &gt; &gt; &gt; find
>> > &gt; &gt; &gt; &gt; &gt; a
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; committer as a mentor who
>> > would judge and approve the scope
>> > &gt; &gt; is
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; suitable.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; What do you think?
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Any comments and
>> suggestions
>> > would be appreciated.
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Best,
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Vino
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt; &gt;
>> > &gt; &gt; &gt;
>> > &gt; &gt;
>> > &gt;
>>
>

Reply via email to