Hi dev@,

For the longest time the Nutch codebase has shipped with an eclipse-codeformat.xml [0] file. Whilst this has been largely successful in keeping the codebase uniform, it cannot be (and has not been) integrated into continuous integration (CI), and consequently it is not really enforced!

Whilst I’m a big fan of “if it ain’t broke, don’t fix it”, I think we should have some CI code formatting checks. Additionally, I really question whether we need a custom Nutch code style at all… why don’t we just use an existing style and then enforce it?

I therefore propose that we replace the legacy code formatter with a convention such as

* google-java-format [1], which offers a GitHub action for easy integration into our CI process, or
* Checkstyle [2], which offers an Ant task which we could use; this is of less utility as we think about the move to Gradle, or
* Super-Linter [3], which is basically emerging as the industry OSS default, offers a GitHub action and could also be configured to lint Dockerfiles and other artifacts. It can also be configured to use the Google Java style as well…

My preference would be [3] because it offers a more comprehensive linting package for the entire codebase, not just the Java code.

Thanks for your consideration.

lewismc

[0] https://github.com/apache/nutch/blob/master/eclipse-codeformat.xml
[1] https://github.com/google/google-java-format
[2] https://checkstyle.sourceforge.io/
[3] https://github.com/marketplace/actions/super-linter