@Jason I believe we want a coverage report as a PR pre-commit check, so that we can use it as part of code review. Removing parallelization increases pre-commit duration greatly (see the big drop on this graph <http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-6M&to=now>), and I believe that is what Michael is worried about.

If we want to run coverage jobs in the background on master, it is possible to do that as a separate job.

--Mikhail
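To make the tradeoff concrete, a minimal, untested sketch of what such a separate job could look like on the Gradle side, assuming JaCoCo is gated behind a project property (the property name coverageBuild is made up for illustration):

    // Hypothetical snippet for a shared build script: only instrument with
    // JaCoCo when -PcoverageBuild is passed, so regular pre-commit runs keep
    // parallel execution and the build cache untouched.
    if (project.hasProperty('coverageBuild')) {
        apply plugin: 'jacoco'
        // Test tasks are instrumented automatically once the plugin is applied.
    }

The master-only coverage job would then invoke something along the lines of ./gradlew test jacocoTestReport -PcoverageBuild --no-parallel --no-build-cache (exact flags depend on the Gradle version), while pre-commit runs stay as they are today.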
On Mon, Jan 7, 2019 at 9:33 AM Jason Kuster <jasonkus...@google.com> wrote:

Michael, could we solve that problem by adding an additional build with those features disabled, specifically for generating coverage data over time? Any tradeoffs to doing something like that?

On Mon, Jan 7, 2019 at 8:23 AM Michael Luckey <adude3...@gmail.com> wrote:

Unfortunately I am a bit worried that enabling code coverage will not come for free at the moment. As far as I understand the current state of the build system, we would either have to disable parallel builds or the build cache, both of which I assume are currently enabled on pre-commit runs. I am trying to get some numbers here, but it might well be that someone has already done that evaluation, so I'd be happy to get further insights if someone with deeper knowledge of the setup could jump in here.

This situation will probably be resolved after an upgrade to Gradle 5, but unfortunately that does not seem easily doable right now.

michel

On Mon, Jan 7, 2019 at 4:21 PM Maximilian Michels <m...@apache.org> wrote:

Code coverage would be very useful, regardless of what tool we use to measure it. Especially when reviewing PRs, because it provides a quantitative measure which can help reviewer and contributor decide whether testing is sufficient.

I'd prefer to use specific tools instead of a one-size-fits-all tool, but I'm open to whatever works best. The process of setting the tooling up is likely going to be iterative.

@Mikhail If you want to set anything up and present it here, that would be awesome. (Of course the discussion can continue here.)

On 03.01.19 19:34, Heejong Lee wrote:

I think we should also consider the false positive ratio of the tool. Oftentimes deeper analysis easily produces tons of false positives, which make people less interested in static analysis results because of the triaging overhead.

On Thu, Jan 3, 2019 at 4:18 PM Scott Wegner <sc...@apache.org> wrote:

Discussion on software engineering practices and tools tends to gather many opinions :) I suggest breaking this out into a doc to keep the discussion organized.

I appreciate that you've started with a list of requirements. I would add a few:

6. Analysis results should be integrated into the code review workflow.
7. It should also be possible to run analysis and evaluate results locally.
8. Analysis rules and thresholds should be easily configurable.

And some thoughts on the previous requirements:

> 2. Tool should keep history of reports.

Seems nice-to-have but not required. I believe the most value is in viewing the delta during code review, and maybe also a snapshot of the overall state of master. If we want trends, we could also import the data into s.apache.org/beam-community-metrics.

> 4. Tool should incorporate code coverage and static analysis reports. (Or more if applicable.)

Is the idea to have a single tool responsible for all code analysis? We currently have a variety of tools running in our build. It would be challenging to find a single tool that aggregates all current (and future) analysis, especially considering the different language ecosystems. Having targeted tools responsible for different pieces allows us to pick and choose what works best for Beam.
On Thu, Jan 3, 2019 at 3:43 PM Mikhail Gryzykhin <mig...@google.com> wrote:

Let me summarize and answer the main questions that I see:
1. It seems that we do want to have some statistics on coverage and to integrate automatic requirements into our build system.
2. The implementation is still to be discussed.

Let's talk about the implementation further.

My requirements for the choice are:
1. Tool should give us an option to deep-dive into findings.
2. Tool should keep history of reports.
3. Tool should give an option to break the build (allow for hardcoded requirements).
4. Tool should incorporate code coverage and static analysis reports. (Or more if applicable.)
5. Tool should support most or all languages we utilize in Beam.

Let me dive into SonarQube a bit first. (All up to my understanding of how it works.) It hits most of the points, potentially with some tweaks. The tool relies on reports generated by common tools. It also tracks the history of builds and allows navigating it. Multi-language. I'm still working on figuring out how to configure it, though.

Common thresholds/checks suggested by SonarQube:

Many checks can be applied to new code only. This allows us not to fix legacy code, while keeping all new additions clean and neat(ish).

Test coverage by line/branch: relies on a Cobertura report. Coverage by branch is usually suggested (every "if" line should be tested with both positive and negative condition results).

Method complexity: the number of different paths/conditions a method can be invoked with. The suggested maximum is 15. Generally describes how easy it is to test and understand a method.

Bugs/vulnerabilities: generally the output of FindBugs. Reflects commonly vulnerable/dangerous code that might cause errors, or plain errors in code. I believe Sonar allows custom code analysis as well, but that is not required.

Technical debt: an estimate of how much time it will take to clean up the code to make it shiny. Includes code duplication, commented-out code, not following naming conventions, long methods, ifs that can be inverted, public methods that can be private, etc. I'm not familiar with the explicit list, but in my experience the suggestions are usually relevant.

More on metrics can be found here: https://docs.sonarqube.org/latest/user-guide/metric-definitions/

Suggested alternatives:

https://scan.coverity.com/
This tool looks great and I'll check more on it, but it is restricted to 14 or 7 builds per week (not sure how they will classify our project). Also, I'm not sure if we can break pre-commit based on a report from Coverity. It looks good for generating historical data.

https://docs.codecov.io/docs/browser-extension
I'll check more on this one. It looks great to have it integrated in PRs, although it requires plugin installation by each developer. I don't think it allows breaking builds, and it only does coverage. Am I correct?

Regards,
--Mikhail
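For a sense of what the Gradle side of a SonarQube setup could look like, here is a minimal, untested sketch using the org.sonarqube Gradle plugin. The plugin version, project key, and report path are placeholders, and the host URL is just the ASF instance referenced elsewhere in this thread:

    // Hypothetical root build.gradle snippet; all values below are
    // placeholders to illustrate the shape of the configuration.
    plugins {
        id 'org.sonarqube' version '2.6.2'   // assumed plugin version
    }

    sonarqube {
        properties {
            property 'sonar.host.url', 'https://builds.apache.org/analysis'
            property 'sonar.projectKey', 'apache_beam'   // made-up key
            property 'sonar.coverage.jacoco.xmlReportPaths',
                     "${buildDir}/reports/jacoco/test/jacocoTestReport.xml"
        }
    }

The analysis would then be triggered with ./gradlew sonarqube after the test and coverage report tasks have run.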
On Thu, Jan 3, 2019 at 2:18 PM Kenneth Knowles <k...@apache.org> wrote:

It would be very useful to have line and/or branch coverage visible. These are both very weak proxies for quality or reliability, so IMO strict thresholds are not helpful. One thing that is super useful is to integrate line coverage into code review, like this: https://docs.codecov.io/docs/browser-extension. It is very easy to notice major missing tests.

We have never really used SonarQube. It was turned on as a possibility in the early days but never worked on past that point. Could be nice. I suspect there's a lot to be gained by just finding very low numbers and improving them. So just running JaCoCo's offline HTML generation would do it (this also integrates with Jenkins). I tried this the other day and discovered that our Gradle config is broken and does not wire tests and coverage reporting together properly. Last thing: how is "technical debt" measured? I'm skeptical of quantitative measures for qualitative notions.

Kenn
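For reference, the kind of wiring Kenn describes is small. A minimal sketch for a Java module, assuming the stock jacoco plugin (report flags and paths are illustrative only):

    // Hypothetical per-module build.gradle snippet: make sure the report
    // task actually runs the tests first and emits both HTML and XML.
    apply plugin: 'java'
    apply plugin: 'jacoco'

    jacocoTestReport {
        dependsOn test           // wire tests and coverage reporting together
        reports {
            html.enabled = true  // offline HTML, as mentioned above
            xml.enabled = true   // machine-readable output for other tools
        }
    }

Running ./gradlew test jacocoTestReport would then leave the HTML report under build/reports/jacoco/ by default.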
On Thu, Jan 3, 2019 at 1:58 PM Heejong Lee <heej...@google.com> wrote:

I don't have any experience using SonarQube, but Coverity worked well for me. It looks like it already has a Beam repo: https://scan.coverity.com/projects/11881

On Thu, Jan 3, 2019 at 1:27 PM Reuven Lax <re...@google.com> wrote:

checkstyle and findbugs are already run as pre-commit checks, are they not?

On Thu, Jan 3, 2019 at 7:19 PM Mikhail Gryzykhin <mig...@google.com> wrote:

Hi everyone,

In our current builds we (can) run multiple code quality check tools like Checkstyle, FindBugs, and code test coverage via Cobertura. However, we do not utilize many of those signals.

I suggest adding requirements to code based on those tools. Specifically, I suggest adding pre-commit checks that require PRs to conform to some quality checks.

We can see a good example of thresholds to add in the default quality gate config provided by the Apache SonarQube instance <https://builds.apache.org/analysis/quality_gates/show/1>:
80% test coverage on new code,
5% technical debt on new code,
no bugs/vulnerabilities added.

As another part of this proposal, I want to suggest the use of SonarQube for tracking code statistics and as an agent for enforcing code quality thresholds. It is an Apache-provided tool that integrates with Jenkins or Gradle via plugins.

I believe some reporting to SonarQube was configured for the Maven builds of some Beam sub-projects, but it was lost during the migration to Gradle.

I was looking for other options, but so far have found only generic Gradle configs that fail the build if code coverage for the whole project is too low. Such an approach would force us to backfill tests for all existing code, which can be tedious and requires learning legacy code that might not be part of the current work.

I suggest we discuss and come to a conclusion on two points in this thread:

1. Do we want to add code quality checks to our pre-commit jobs and require them to pass before a PR is merged?

   Suggested: Add the code quality checks listed above at first, and adjust them as we see fit in the future.

2. What tools do we want to utilize for analyzing code quality?

   Under discussion. Suggested: SonarQube, but this will depend on the level of functionality we want to achieve.

Regards,
--Mikhail
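On Mikhail's last point, the generic Gradle configuration that fails the build when whole-project coverage is too low typically looks roughly like the following. This is a minimal, untested sketch with the stock JaCoCo plugin; the 80% branch threshold mirrors the quality gate mentioned above and is only an example:

    // Hypothetical per-module build.gradle snippet: fail the build when
    // branch coverage drops below the (example) 80% threshold.
    apply plugin: 'java'
    apply plugin: 'jacoco'

    jacocoTestCoverageVerification {
        violationRules {
            rule {
                limit {
                    counter = 'BRANCH'        // or 'LINE', 'INSTRUCTION', ...
                    value   = 'COVEREDRATIO'
                    minimum = 0.80
                }
            }
        }
    }

    // Make the verification part of the regular `check` lifecycle task.
    check.dependsOn jacocoTestCoverageVerification

This is exactly the blanket, project-wide threshold the proposal argues against; limiting checks to new or changed code is what tools like SonarQube's quality gates add on top.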