Hi All,
I know I know, but please keep reading because I recently learned about
some new developments in the area of coding-style automation.
The tool I would propose we use is Spotless
(https://github.com/diffplug/spotless). This doesn't come with a
formatter but allows using other popular formatters such as
google-java-format. The nice thing about Spotless is that it serves as a
verifier for CI but can also apply the configured style automatically.
That is, for the programmer all she has to do is `mvn spotless:apply` to
fix any style violations.
An interesting feature, which was (somewhat) recently added is "ratchet"
(https://github.com/diffplug/spotless/blob/main/plugin-maven/README.md#ratchet).
With this, you can set up Spotless to only apply it's rules to files
that were changed after a configured commit. This would allow a gradual
application of the new coding style instead of one big change.
If we decide to use Spotless, we would of course also have to decide on
a coding style. For this I would propose google-java-format, which the
flink-statefun project uses. The main difference from our current
"style" is that this uses spaces instead of tabs for indentation. By
default it would be 2 spaces but it can be configured to use 4 spaces
which would make code look more or less like our current style. There
are no more configuration knobs, so using tabs is not an option.
Finally, why should we do this? I think most engineers agree that having
a common enforced style is good to have so I only want to highlight a
few thoughts here about things we could improve:
- No more nits about coding style in reviews, this makes it easier for
both the reviewer and developer
- No manual fixing of Checkstyle errors because Spotless can do that
automatically
- Because Flink is such a big project little islands of coding style
have formed between people that commonly work on components. It can be a
nuisance when you work on a different component and then reviewers don't
like your typical coding style. And you first have to get used to the
slight differences in style when reading code.
There are also downsides I see in this:
- We break the history, but both "git blame" and modern IntelliJ can
ignore whitespace when attributing changes. So for files that are
already "well" formatted not much would change.
- In the short-term it will be harder to apply changes both to master
and one of the release-x branches because formatting will be different.
I think this is not too hard though because Spotless can automatically
apply the style.
In summary, we would have some short-term pain with this but I think it
would be good in the long run. What are your thoughts?
Best,
Aljoscha