Hi Lewis,

>> whether we need a Nutch custom code style at all… why don’t we just use
>> some other existing style and then enforce it?

Enforcing: yes!

However, I would try hard to keep the changes to a reasonable minimum. For example, if we change the indentation, almost every line of code is affected, which means
- "git annotate" becomes mostly useless (or more difficult to use because you need
  to look back)
- merges of open PRs, custom patches or modifications in custom repositories
  might get quite painful until the formatting is synchronized.


>> * google Java format [1] which offers a GitHub action for easy integration
>> into our CI process, or

+1

+ also available for IntelliJ and Eclipse
+ indentation stays the same
+/- about 25% of the code lines are changed (might be acceptable)


>> * superlinter [3] basically emerging as the industry OSS default, offers a
>> GitHub action and could also be configured to lint dockerfile, and other
>> artifacts. It can also be configured to use the google Java style as well…

+1 (with Google Java style)


> I’ll submit a PR for superlinter so everyone can see what it would look like.

Great! Thanks!


Best,
Sebastian

On 10/29/23 00:38, Lewis John McGibbney wrote:
Any thoughts on this, folks?
I’ll submit a PR for superlinter so everyone can see what it would look like.
lewismc

On 2023/10/23 19:28:45 lewis john mcgibbney wrote:
Hi dev@,

For the longest time the Nutch codebase has shipped with an
eclipse-codeformat.xml [0] file.
Whilst this has been largely successful in keeping the codebase uniform, it
cannot/has not been integrated into continuous integration (CI) and
consequently has not really been enforced!

Whilst I’m a big fan of “if it ain’t broken don’t fix it”, I think we
should have some CI code formatting checks. Additionally I really question
whether we need a Nutch custom code style at all… why don’t we just use
some other existing style and then enforce it?

I therefore propose that we replace the legacy code formatter with a
convention such as

* google Java format [1] which offers a GitHub action for easy integration
into our CI process, or
* Checkstyle [2], which offers an Ant task that we could use, although this is
of less utility as we think about the move to Gradle
* superlinter [3] basically emerging as the industry OSS default, offers a
GitHub action and could also be configured to lint dockerfile, and other
artifacts. It can also be configured to use the google Java style as well…

My preference would be [3] because it offers a more comprehensive linting
package for the entire codebase not just the Java code.

Thanks for your consideration.
lewismc

[0] https://github.com/apache/nutch/blob/master/eclipse-codeformat.xml
[1] https://github.com/google/google-java-format
[2] https://checkstyle.sourceforge.io/
[3] https://github.com/marketplace/actions/super-linter
