Re: Nutch codebase formatting
Thanks Seb. I'll go ahead and try to build in the Google Java format via
super-linter and see where we get!
lewismc

On 2023/10/29 17:04:47 Sebastian Nagel wrote:
> Hi Lewis,
>
> >> whether we need a Nutch custom code style at all… why don’t we just use
> >> some other existing style and then enforce it?
>
> Enforcing: yes!
>
> However, I would try hard to keep the changes to a reasonable minimum. For
> example, if we change the indentation, almost every code line is affected,
> which makes
> - "git annotate" mostly useless (or more difficult to use because you need
>   to look back)
> - merges of open PRs, custom patches or modifications in custom repositories
>   quite painful, until the formatting is synchronized.
>
> >> * google-java-format [1], which offers a GitHub action for easy integration
> >> into our CI process, or
>
> +1
>
> + also available for IntelliJ and Eclipse
> + indentation stays the same
> +/- about 25% of the code lines are changed (might be acceptable)
>
> >> * super-linter [3], basically emerging as the industry OSS default, which
> >> offers a GitHub action and could also be configured to lint Dockerfiles and
> >> other artifacts. It can also be configured to use the Google Java style as well…
>
> +1 (with Google Java style)
>
> > I’ll submit a PR for super-linter so everyone can see what it would look like.
>
> Great! Thanks!
>
> Best,
> Sebastian
>
> On 10/29/23 00:38, Lewis John McGibbney wrote:
> > Any thoughts on this, folks?
> > I’ll submit a PR for super-linter so everyone can see what it would look like.
> > lewismc
> >
> > On 2023/10/23 19:28:45 lewis john mcgibbney wrote:
> >> Hi dev@,
> >>
> >> For the longest time the Nutch codebase has shipped with an
> >> eclipse-codeformat.xml [0] file.
> >> Whilst this has been largely successful in keeping the codebase uniform, it
> >> cannot be/has not been integrated into continuous integration (CI) and
> >> consequently has not really been enforced!
> >>
> >> Whilst I’m a big fan of “if it ain’t broken don’t fix it”, I think we
> >> should have some CI code formatting checks. Additionally I really question
> >> whether we need a Nutch custom code style at all… why don’t we just use
> >> some other existing style and then enforce it?
> >>
> >> I therefore propose that we replace the legacy code formatter with a
> >> convention such as
> >>
> >> * google-java-format [1], which offers a GitHub action for easy integration
> >> into our CI process, or
> >> * Checkstyle [2], which offers an Ant task which we could use, though this is
> >> of less utility as we think about the move to Gradle, or
> >> * super-linter [3], basically emerging as the industry OSS default, which
> >> offers a GitHub action and could also be configured to lint Dockerfiles and
> >> other artifacts. It can also be configured to use the Google Java style as well…
> >>
> >> My preference would be [3] because it offers a more comprehensive linting
> >> package for the entire codebase, not just the Java code.
> >>
> >> Thanks for your consideration.
> >> lewismc
> >>
> >> [0] https://github.com/apache/nutch/blob/master/eclipse-codeformat.xml
> >> [1] https://github.com/google/google-java-format
> >> [2] https://checkstyle.sourceforge.io/
> >> [3] https://github.com/marketplace/actions/super-linter
Re: Nutch codebase formatting
Hi Lewis,

>> whether we need a Nutch custom code style at all… why don’t we just use
>> some other existing style and then enforce it?

Enforcing: yes!

However, I would try hard to keep the changes to a reasonable minimum. For
example, if we change the indentation, almost every code line is affected,
which makes
- "git annotate" mostly useless (or more difficult to use because you need
  to look back)
- merges of open PRs, custom patches or modifications in custom repositories
  quite painful, until the formatting is synchronized.

>> * google-java-format [1], which offers a GitHub action for easy integration
>> into our CI process, or

+1

+ also available for IntelliJ and Eclipse
+ indentation stays the same
+/- about 25% of the code lines are changed (might be acceptable)

>> * super-linter [3], basically emerging as the industry OSS default, which
>> offers a GitHub action and could also be configured to lint Dockerfiles and
>> other artifacts. It can also be configured to use the Google Java style as well…

+1 (with Google Java style)

> I’ll submit a PR for super-linter so everyone can see what it would look like.

Great! Thanks!

Best,
Sebastian

On 10/29/23 00:38, Lewis John McGibbney wrote:
> Any thoughts on this, folks?
> I’ll submit a PR for super-linter so everyone can see what it would look like.
> lewismc
>
> On 2023/10/23 19:28:45 lewis john mcgibbney wrote:
>> Hi dev@,
>>
>> For the longest time the Nutch codebase has shipped with an
>> eclipse-codeformat.xml [0] file.
>> Whilst this has been largely successful in keeping the codebase uniform, it
>> cannot be/has not been integrated into continuous integration (CI) and
>> consequently has not really been enforced!
>>
>> Whilst I’m a big fan of “if it ain’t broken don’t fix it”, I think we
>> should have some CI code formatting checks. Additionally I really question
>> whether we need a Nutch custom code style at all… why don’t we just use
>> some other existing style and then enforce it?
>>
>> I therefore propose that we replace the legacy code formatter with a
>> convention such as
>>
>> * google-java-format [1], which offers a GitHub action for easy integration
>> into our CI process, or
>> * Checkstyle [2], which offers an Ant task which we could use, though this is
>> of less utility as we think about the move to Gradle, or
>> * super-linter [3], basically emerging as the industry OSS default, which
>> offers a GitHub action and could also be configured to lint Dockerfiles and
>> other artifacts. It can also be configured to use the Google Java style as well…
>>
>> My preference would be [3] because it offers a more comprehensive linting
>> package for the entire codebase, not just the Java code.
>>
>> Thanks for your consideration.
>> lewismc
>>
>> [0] https://github.com/apache/nutch/blob/master/eclipse-codeformat.xml
>> [1] https://github.com/google/google-java-format
>> [2] https://checkstyle.sourceforge.io/
>> [3] https://github.com/marketplace/actions/super-linter
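
An estimate such as "about 25% of the code lines are changed" can be checked per file by comparing the source before and after formatting. The following is only a minimal sketch under stated assumptions, not the method used in this thread: it assumes the original and reformatted text of a file are already available as strings (for example, produced by running a formatter separately), and the function name `changed_line_fraction` is hypothetical.

```python
import difflib


def changed_line_fraction(before: str, after: str) -> float:
    """Return the fraction of lines in `before` that a reformat touched.

    Hypothetical helper for illustration; it is not part of Nutch or of
    any formatter.
    """
    old_lines = before.splitlines()
    new_lines = after.splitlines()
    if not old_lines:
        return 0.0
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    # Lines inside matching blocks are unchanged; every other original
    # line was rewritten or removed.
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return (len(old_lines) - unchanged) / len(old_lines)


if __name__ == "__main__":
    # Made-up sample input: two of the four lines differ only in spacing.
    before = "int a=1;\nint b = 2;\nreturn a+b;\nint c = 3;\n"
    after = "int a = 1;\nint b = 2;\nreturn a + b;\nint c = 3;\n"
    print(f"{changed_line_fraction(before, after):.0%} of lines changed")
```

Averaging this value over all Java files, weighted by file length, would give a repository-wide figure comparable to the estimate above. Lines that only move are counted as changed here, which is a simplification.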
Re: Nutch codebase formatting
Any thoughts on this, folks?
I’ll submit a PR for super-linter so everyone can see what it would look like.
lewismc

On 2023/10/23 19:28:45 lewis john mcgibbney wrote:
> Hi dev@,
>
> For the longest time the Nutch codebase has shipped with an
> eclipse-codeformat.xml [0] file.
> Whilst this has been largely successful in keeping the codebase uniform, it
> cannot be/has not been integrated into continuous integration (CI) and
> consequently has not really been enforced!
>
> Whilst I’m a big fan of “if it ain’t broken don’t fix it”, I think we
> should have some CI code formatting checks. Additionally I really question
> whether we need a Nutch custom code style at all… why don’t we just use
> some other existing style and then enforce it?
>
> I therefore propose that we replace the legacy code formatter with a
> convention such as
>
> * google-java-format [1], which offers a GitHub action for easy integration
> into our CI process, or
> * Checkstyle [2], which offers an Ant task which we could use, though this is
> of less utility as we think about the move to Gradle, or
> * super-linter [3], basically emerging as the industry OSS default, which
> offers a GitHub action and could also be configured to lint Dockerfiles and
> other artifacts. It can also be configured to use the Google Java style as well…
>
> My preference would be [3] because it offers a more comprehensive linting
> package for the entire codebase, not just the Java code.
>
> Thanks for your consideration.
> lewismc
>
> [0] https://github.com/apache/nutch/blob/master/eclipse-codeformat.xml
> [1] https://github.com/google/google-java-format
> [2] https://checkstyle.sourceforge.io/
> [3] https://github.com/marketplace/actions/super-linter
Nutch codebase formatting
Hi dev@,

For the longest time the Nutch codebase has shipped with an
eclipse-codeformat.xml [0] file.
Whilst this has been largely successful in keeping the codebase uniform, it
cannot be/has not been integrated into continuous integration (CI) and
consequently has not really been enforced!

Whilst I’m a big fan of “if it ain’t broken don’t fix it”, I think we
should have some CI code formatting checks. Additionally I really question
whether we need a Nutch custom code style at all… why don’t we just use
some other existing style and then enforce it?

I therefore propose that we replace the legacy code formatter with a
convention such as

* google-java-format [1], which offers a GitHub action for easy integration
into our CI process, or
* Checkstyle [2], which offers an Ant task which we could use, though this is
of less utility as we think about the move to Gradle, or
* super-linter [3], basically emerging as the industry OSS default, which
offers a GitHub action and could also be configured to lint Dockerfiles and
other artifacts. It can also be configured to use the Google Java style as well…

My preference would be [3] because it offers a more comprehensive linting
package for the entire codebase, not just the Java code.

Thanks for your consideration.
lewismc

[0] https://github.com/apache/nutch/blob/master/eclipse-codeformat.xml
[1] https://github.com/google/google-java-format
[2] https://checkstyle.sourceforge.io/
[3] https://github.com/marketplace/actions/super-linter