+1 for failing fast starting with findbugs and eventually covering the
important points from checkstyle.

Bes,
Stamatis

On Fri, Jun 5, 2020 at 9:35 AM Zoltan Haindrich <k...@rxd.hu> wrote:

>
> Hey Mustafa!
>
> Those checks are not executed anymore in the new system. I always feeled
> it a bit confusing to have a comment which reports about
> checkstyle/finbugs/etc issues; while
> getting a green test run was almost impossible due to the high number of
> randomly falling tests.
> So I don't think it's viable that someone will re-submit the patch with
> style changes.
>
> I think the old approach is a "soft" way of enforcing code quality - in
> which I personally don't believe: code quality should be enforced by
> rules/quality gates/etc.
>
> So I would like to take a different approach...I think we should definetly
> re-introduce these checks - however without "tolerance" being built-in.
> This will most likely
> mean that we will have to soften the ruleset first; but then we may
> gradually increase the bar to a higher level.
>
> The "without tolerance" will mean that this will be checked during (or
> right after) the build phase - so if you make quality mistakes you will
> just not get a test results
> (it will also save resources).
>
> Yesterday Laszlo have opened a ticket about fixing findbugs issues - if we
> do fix those issues; but we never enforce to fail the build - someone might
> just add a few more.
>
> To increase code quality through out the project I think we could take a
> bottom-up approach:
> * first patch:
>    * fix things in a low level module(like common or storage-api)
>    * it should also add the neccessary maven&job changes to enforce things
> during precommit (up-to that module)
> * followups:
>    * raise the bar to higher level modules
>
> Obviously we can't do this for something like checkstyle which detects a
> myriad of small issues:
> * the ruleset should be shrinked to something which needs reasonable
> amount of work to start enforcing
> * later we can enable further rules/fix all of them in the project
>
> What do you think about this?
>
> cheers,
> Zoltan
>
>
>
> On 6/5/20 2:47 AM, Mustafa IMAN wrote:
> > Thank you Zoltan for all this work.
> > I see many PRs are merged based on the new workflow already. The old
> > workflow generates many reports like ASF license/findbugs/checkstyle
> etc. I
> > don't see these in the new Github PR workflow. I am concerned the
> codebase
> > is going to suffer from lack of these reports very quickly. Do these
> checks
> > still happen but are not visible?
> >
> > On Tue, Jun 2, 2020 at 4:41 AM Zoltan Haindrich <k...@rxd.hu> wrote:
> >
> >> Hello,
> >>
> >> I would like to note that you may login to the jenkins instance - to
> >> start/kill builds (or create new jobs).
> >> I've configured github oauth - but since team membership can't be
> queried
> >> from the "apache organization" - it's harder to configure all "hive
> >> committers".
> >> However...I think I've made it available for most of us - if you can't
> >> start builds/etc just let me know your github user and I'll add it.
> >>
> >> cheers,
> >> Zoltan
> >>
> >>
> >>
> >> On 5/29/20 2:32 PM, Zoltan Haindrich wrote:
> >>> Hey all!
> >>>
> >>> The patch is now in master - so every new PR or a push on it will
> >> trigger a new run.
> >>>
> >>> Please decide which one would you like to use - open a PR to see the
> new
> >> one work...or upload a patch file to the jira - but please don't do
> both;
> >> because in that case 2
> >>> execution will happen.
> >>>
> >>> The job execution time(2-4 hours) of a single run is a bit higher than
> >> the usual on the ptest server - this is mostly to increase throughput.
> >>>
> >>> The patch also disabled a set of tests; I will send the full list of
> >> skipped tests shortly.
> >>>
> >>> cheers,
> >>> Zoltan
> >>>
> >>>
> >>> On 5/27/20 1:50 PM, Zoltan Haindrich wrote:
> >>>> Hello all!
> >>>>
> >>>> The new stuff is ready to be switched on-to. It needs to be merged
> into
> >> master - and after that anyone who opens a PR will get a run by the new
> >> HiveQA infra.
> >>>> I propose to run the 2 systems side-by-side for some time - the
> regular
> >> master builds will start; and we will see how frequently that is
> polluted
> >> by flaky tests.
> >>>>
> >>>> Note that the current patch also disables around ~25 more tests to
> >> increase stability - to get a better overview about the disabled tests I
> >> think the "direction of the
> >>>> information flow" should be altered; what I mean by that is: instead
> of
> >> just throwing in a jira for "disable test x" and opening a new one like
> >> "fix test x"; only open
> >>>> the latter and place the jira reference into the ignore message;
> >> meanwhile also add a regular report about the actually disabled tests -
> so
> >> people who do know about the
> >>>> importance of a particular test can get involved.
> >>>>
> >>>> Note: the builds.apache.org instance will be shutdown somewhere in
> the
> >> future as well...but I think the new one is a good-enough alternative to
> >> not have to migrate the
> >>>> Hive-precommit job over to https://ci-hadoop.apache.org/.
> >>>>
> >>>> http://34.66.156.144:8080/job/hive-precommit/job/PR-948/5/
> >>>> https://issues.apache.org/jira/browse/HIVE-22942
> >>>> https://github.com/apache/hive/pull/948/files
> >>>>
> >>>> cheers,
> >>>> Zoltan
> >>>>
> >>>> On 5/18/20 1:42 PM, Zoltan Haindrich wrote:
> >>>>> Hey!
> >>>>>
> >>>>> On 5/18/20 11:51 AM, Zoltan Chovan wrote:
> >>>>>> Thank you for all of your efforts, this looks really promising. With
> >> moving
> >>>>>> to github PRs, would that also mean that we move away from the
> >> reviewboard
> >>>>>> for code review?
> >>>>> I didn't thinked about that. I think using github's review interface
> >> will remain optional, because both review systems has there own strong
> >> points - I wouldn't force
> >>>>> anyone to use one over the other. (For some patches reviewboard is
> >> much better; because it's able to track content moves a bit better than
> >> github. - meanwhile github has
> >>>>> a small feature that enables to mark files as reviewed)
> >>>>> As a matter of fact we had sometimes patches on the jira's which
> never
> >> had neither an RB or a PR to review them - having a PR there at least
> will
> >> make it easier for
> >>>>> reviewers to comment.
> >>>>>
> >>>>>> Also, what happens if a PR is updated? Will the tests run for both
> or
> >> just
> >>>>>> for the latest version?
> >>>>> It will trigger a new build - if there is already a build in progress
> >> that will prevent a new build from starting until it finishes...and
> there
> >> is also a 5 builds/day
> >>>>> limit; which might induce some wait.
> >>>>>
> >>>>> cheers,
> >>>>> Zoltan
> >>>>>
> >>>>>>
> >>>>>> Regards,
> >>>>>> Zoltan
> >>>>>>
> >>>>>> On Sun, May 17, 2020 at 10:51 PM Zoltan Haindrich <k...@rxd.hu>
> >> wrote:
> >>>>>>
> >>>>>>> Hello all!
> >>>>>>>
> >>>>>>> The proposed system have become more stable lately - and I think
> I've
> >>>>>>> solved a few sources of flakiness.
> >>>>>>> To be really usable I also wanted to add a way to dynamically
> >>>>>>> enable/disable a set of tests (for example the replication tests
> >> take ~7
> >>>>>>> hours to execute from the total of 24
> >>>>>>> hours - and they are also a bit unstable, so not running them when
> >> not
> >>>>>>> neccesary would be beneficial in multiple ways) - but to do this
> the
> >> best
> >>>>>>> would be to throw in
> >>>>>>> junit5; unfortunately the current ptest installation uses maven
> 3.0.5
> >>>>>>> which doesn't like these kind of things - so instead of hacking a
> >> fix for
> >>>>>>> that ....I've removed it
> >>>>>>> from the dev branch for now.
> >>>>>>>
> >>>>>>> I would like to propose to start an evaluation phase of the new
> test
> >>>>>>> procedures(INFRA-20269)
> >>>>>>> The process would look something like this:
> >>>>>>> * someone opens a PR - the tests will be run on the changes
> >>>>>>> * on every active branches the tests will run from time to time
> >>>>>>>      * this will produce a bunch of test runs on the master branch
> as
> >> well ;
> >>>>>>> which will show how well the tests behave on the master branch
> >> without any
> >>>>>>> patches
> >>>>>>> * runs on branches (PRs or active development branches(eg:master))
> >> will be
> >>>>>>> rate limited to 5 builds/day
> >>>>>>> * at most ~4 builds at a time - to maximize resource usage
> >>>>>>> * turnaround time for a build is right now 2 hours - which I feel
> >> like a
> >>>>>>> balanced choice between speed/response time
> >>>>>>>
> >>>>>>> Possible future benefits:
> >>>>>>> * toggle features using github tags
> >>>>>>> * optional testgroups (metastore/replication) tests
> >>>>>>> * ability to run the metastore verification tests
> >>>>>>> * possibility to add smoke tests
> >>>>>>>
> >>>>>>> To enable this I will have to finish the HIVE-22942 ticket - beyond
> >> the
> >>>>>>> new Jenkinsfile which defines the full logic;
> >>>>>>> although I've sinked a lot of time into fixing all kind of flaky
> >> tests I
> >>>>>>> would would like to disable around ~25 tests.
> >>>>>>>
> >>>>>>> I also would like to propose a method to verify the stability of a
> >> single
> >>>>>>> test: run it a 100 times in series at the same place where the
> >> precommit
> >>>>>>> tests are running.
> >>>>>>> This will put the bar high enough that only totally stable tests
> >> could
> >>>>>>> satisfy it (a 99% stable test has 36% chance to pass this without
> >> being
> >>>>>>> caught :D)
> >>>>>>> After this will be in service it could be used to: validate that an
> >>>>>>> existing test is unstable (before disabling it) - and then used
> >> again to
> >>>>>>> prove that it got fixed during
> >>>>>>> re-enabling it.
> >>>>>>>
> >>>>>>> Please let me know what you think!
> >>>>>>>
> >>>>>>> cheers,
> >>>>>>> Zoltan
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 4/29/20 4:28 PM, Zoltan Haindrich wrote:
> >>>>>>>> Hey All!
> >>>>>>>>
> >>>>>>>> I was planning to replace the ptest stuff with something less
> >> complex
> >>>>>>> for a while now - I see that we struggle a lot because of ptest is
> >> more
> >>>>>>> complicated than it should be...
> >>>>>>>> It would be much better if it would be constructed from well made
> >>>>>>> existing CI piece. - because of that I've started working on [1] a
> >> few
> >>>>>>> months ago.
> >>>>>>>>
> >>>>>>>> It has it's pros and cons...but it's not the same as the existing
> >> ptest
> >>>>>>> stuff.
> >>>>>>>> I've collected some infos about how it compares against the
> >> existing one
> >>>>>>> - but it became too long so I've moved it into a google docs
> >> document at
> >>>>>>> [3].
> >>>>>>>>
> >>>>>>>> It's not yet ready... I still have some remaining
> >> problems/concerns/etc
> >>>>>>>> * what do you think about changing to a github PR based workflow?
> >>>>>>>> * it will not support at all things like "isolation" - so we will
> >> have
> >>>>>>> to make our tests work with eachother without bending the rules...
> >>>>>>>> * I've tried to overcommit the cpu resources which creates a more
> >> noisy
> >>>>>>> environment for the actual tests - this squeezes out some new
> >> problems
> >>>>>>> which should be fixed before
> >>>>>>>> this could be enabled.
> >>>>>>>> * for every PR the first run is somewhat sub-optimal...there are
> >> some
> >>>>>>> reasons for this - the actually used resources are the same; but
> the
> >>>>>>> overall execution time is not
> >>>>>>>> optimal; I could accept this as a compromise because right now I
> >> wait
> >>>>>>>> 24 hours for a precommit run.
> >>>>>>>>
> >>>>>>>> It's deployed at [2] and anyone can start a testrun on it:
> >>>>>>>> * merge my HIVE-22942-ptest-alt branch from [4] into your branch
> >>>>>>>> * open a PR against my hive repo on github [5]
> >>>>>>>>
> >>>>>>>> cheers,
> >>>>>>>> Zoltan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> [1] https://issues.apache.org/jira/browse/HIVE-22942
> >>>>>>>> [2] http://34.66.156.144:8080/job/hive-precommit
> >>>>>>>> [3]
> >>>>>>>
> >>
> https://docs.google.com/document/d/1dhL5B-eBvYNKEsNV3kE6RrkV5w-LtDgw5CtHV5pdoX4/edit?usp=sharing
> >>>>>>>> [4] https://github.com/kgyrtkirk/hive/tree/HIVE-22942-ptest-alt
> >>>>>>>> [5] https://github.com/kgyrtkirk/hive/
> >>>>>>>
> >>>>>>
> >>
> >
>

Reply via email to