I had a chance to rebase one of my PRs. It is really hard to justify the new max line length of 100 chars. I think the most unreadable code is usually around places when single statements are split into multiple lines. Since Java has explicit types and we prefer full and elaborate class/var names, the new line limit alone triggered so many changes and requires splitting a lot of statements that would previously fit on a single line.
The max line length is one of the reasons why Palantir folks forked the Google format. https://github.com/palantir/palantir-java-format <https://github.com/palantir/palantir-java-format> - Anton > On Jul 28, 2022, at 7:42 AM, Eduard Tudenhoefner <edu...@tabular.io> wrote: > > Thanks for your honest feedback Anton, which I really appreciate. I agree > with the points you brought up. > > I don't prefer the Google Java Format either, but it is at least something > that produces consistent results across tools. > If you ask me, I would love to customize the code format and have it as close > as possible to what the Iceberg project had. But one of the reasons why the > format isn't customizable is because it is difficult to support such > customizations across tools. > > The problem with code format is that there is no single format that can > satisfy the preferences of everybody. However, from my experience, once > people start to use any code format that produces consistent results > across Eclipse/IntelliJ/cmd line, people stop worrying about code format > details. > > This is also one of the reasons why the creators of Go decided to have a code > formatter built-in (https://go.dev/doc/effective_go#formatting > <https://go.dev/doc/effective_go#formatting>). > > In the long run I think the decision to use Spotless + code formatting helps, > especially for new people that want to contribute and don't want to feel > frustrated due to review feedback around code style. > > Eduard > > On Thu, Jul 28, 2022 at 6:15 AM Anton Okolnychyi <aokolnyc...@apple.com > <mailto:aokolnyc...@apple.com>> wrote: > I am late to the discussion, but I would like to express my opinion. > > As someone who deeply cares about the code consistency and leaves a lot of > style-related nits, I am definitely happy to see efforts to simplify the life > of the reviewers and automate the formatting process. Explaining the code > style to new contributors consumes a lot of time and results in more > iterations on PRs. Automating this would be great! > > However, I am concerned regarding the changes merged in PR #5312. > > - Many changes that went in that PR were against the original code style of > the project that we, committers and contributors, tried to maintain for all > these years. I think we had a lot of well-structured and consistent places, > which attracted devs to Iceberg. > - I feel the code quality dropped as some places were written in the old > style and now look really weird. If we followed the new style originally, > maybe, it would look much better as we would structure the code differently. > - As a someone whose team maintains an internal fork that is very close to > master (I know we are not the only ones), I feel such a massive change will > be a challenge to cherry-pick. I am not convinced introducing automatic > formatting had to be so radical. > > The first two points concern me the most. Since I am late to the discussion, > it would not be fair for me to request changes. However, I strongly suggest > the community to reconsider the approach we took while we haven’t merged many > changes on top of PR #5312. > > To sum up, I definitely support automating the formatting to simplify reviews > and help new contributors but I would try to come with a way to make it less > radical and follow whatever we had all these years. We did discuss some of > the formatting guidelines a few years ago and it is all different now. > > - Anton > >> On Jul 27, 2022, at 11:40 AM, Steven Wu <stevenz...@gmail.com >> <mailto:stevenz...@gmail.com>> wrote: >> >> Eduard, thank you for driving this effort! Great work! >> >> On Wed, Jul 27, 2022 at 11:11 AM Eduard Tudenhoefner <edu...@tabular.io >> <mailto:edu...@tabular.io>> wrote: >> Hey everyone, >> >> Google Java Format + Spotless integration have been merged as part of >> https://github.com/apache/iceberg/pull/5266 >> <https://github.com/apache/iceberg/pull/5266>. >> >> The big bang reformatting (aka spotlessApply) will happen as part of >> https://github.com/apache/iceberg/pull/5312 >> <https://github.com/apache/iceberg/pull/5312> and once this PR is merged, >> please perform the following steps if you have open PRs: >> >> 1. Rebase your PR branch against the commit before the „Big Bang“ >> 2. Run `./gradlew spotlessApply` >> 3. Squash all commits >> 4. Rebase against the latest HEAD of the master branch >> 5. Force-push your branch >> >> This should hopefully produce a minimum of conflicts on PRs. >> >> Additionally, make sure to install the google-java-format plugin for your >> IDE as outlined in https://github.com/apache/iceberg-docs/pull/125 >> <https://github.com/apache/iceberg-docs/pull/125> so that you can format >> code inside the IDE as well. >> >> Eduard >> >> On Fri, Jul 15, 2022 at 12:45 PM Eduard Tudenhoefner <edu...@tabular.io >> <mailto:edu...@tabular.io>> wrote: >> That's correct Ryan. The Google style is what can be applied across tools >> and Spotless is the tool that we use to check and apply that style (and also >> auto-apply the license header). The apply part is the important piece that >> Checkstyle for example is missing (as it's tedious to manually apply what >> checkstyle:check complains about). >> >> Additionally, Spotless >> <https://github.com/diffplug/spotless/tree/main/plugin-gradle> supports a >> bunch of other languages, so it would even be possible to use it for >> auto-formatting Gradle files (or Scala files using scalafmt) if we wanted >> to, but for now I would probably focus on the Java codebase. >> >> Eduard >> >> On Thu, Jul 14, 2022 at 6:58 PM Ryan Blue <b...@tabular.io >> <mailto:b...@tabular.io>> wrote: >> Okay, I understand. So it isn't that spotless is supported across tools, it >> is that google style is supported and we can use Spotless to both check and >> apply that style. Is that correct? >> >> On Thu, Jul 14, 2022 at 9:56 AM Eduard Tudenhoefner <edu...@tabular.io >> <mailto:edu...@tabular.io>> wrote: >> Yeah it's how Dmitri said, the Google style is most likely on purpose not >> configurable, because otherwise it's getting more difficult to achieve the >> same formatting results across IDEs & CMD line. >> >> >> On Wed, Jul 13, 2022 at 7:39 PM Dmitri Bourlatchkov >> <dmitri.bourlatch...@dremio.com <mailto:dmitri.bourlatch...@dremio.com>> >> wrote: >> Yeah, the Google style has very specific wrapping limits and javadoc format. >> It is not configurable, if we go with it. >> >> From my POV, using another style would be fine. However, with a custom style >> it can be hard and tricky to ensure that all IDEs enforce it the same way as >> spotless. >> >> The Google style has a plugin at least for IntelliJ (and likely for Eclipse >> too) so it will behave the same way in IDE and cmd. line build. >> >> On Wed, Jul 13, 2022 at 1:28 PM Ryan Blue <b...@tabular.io >> <mailto:b...@tabular.io>> wrote: >> Is there any value in trying to get Spotless to match the current format as >> closely as possible? It would be great to have fewer changes. >> >> For example, when I tested the apply, I see this: >> >> - /** >> - * Returns an initialized {@link AliyunProperties} >> - */ >> + /** Returns an initialized {@link AliyunProperties} */ >> Changes like that don’t seem very valuable to me. Similarly, it looks like a >> lot of text is wrapped at a different line length. If we can alter that, we >> could easily avoid a lot of changes. >> >> Ryan >> >> >> On Wed, Jul 13, 2022 at 7:41 AM Eduard Tudenhoefner <edu...@tabular.io >> <mailto:edu...@tabular.io>> wrote: >> As a first step, I created https://github.com/apache/iceberg/pull/5266 >> <https://github.com/apache/iceberg/pull/5266> which configures Spotless to >> use the Google Java Format and also apply the correct copyright header for >> Java files. >> >> Once this PR is merged, the next steps would be: >> removing conflicting Checkstyle rules that are not in line with the Google >> format >> formatting the entire code base via `./gradlew spotlessApply` >> setting `enforceCheck` to `true` in >> https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48 >> >> <https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48> >> so that validation fails if code isn't properly formatted >> updating docs around the current Formatter usage and how to configure >> Eclipse/IntelliJ >> The first 3 steps should be done together as part of the big bang. >> >> Eduard >> >> On Mon, Jul 11, 2022 at 10:54 PM Ryan Blue <b...@apache.org >> <mailto:b...@apache.org>> wrote: >> Okay, it sounds like there's mostly agreement for going with spotless. Let's >> try that out. We'll work on some changes to add spotless so that >> `spotlessApply` works. Then we can do the big bang migration (which I also >> agree is the best option) just before the 1.0. >> >> Thanks, everyone! >> >> On Mon, Jul 11, 2022 at 11:50 AM Dmitri Bourlatchkov >> <dmitri.bourlatch...@dremio.com <mailto:dmitri.bourlatch...@dremio.com>> >> wrote: >> My experience with the Google Code Style + Spotless was positive too. >> >> I'd be fine with another code style as long as it is "deterministic" (e.g. >> does not make changes on repeated execution) and works in IntelliJ IDEA / >> Eclipse / etc. >> >> Regarding cherry-picking into older branches, I think Robert's suggestion >> can be tweaked slightly to be helpful there too: >> >> 1. Checkout old branch >> 2. Apply the new style (run gradle ...) >> 3. Cherry-pick without committing >> 4. Manually revert to old style >> 5. Commit >> 6. Reset to original branch HEAD >> 7. Cherry pick commit 5 again >> >> It's a bit lengthy and may be a tedious process, but it should allow >> applying the git-level changes mostly automatically. >> >> Cheers, >> Dmitri. >> >> >> On Fri, Jul 8, 2022 at 2:53 AM Robert Stupp <sn...@snazy.de >> <mailto:sn...@snazy.de>> wrote: >> From my experience, it’s a big win to have automatic code formatting. >> >> In projectnessie we use automatic code formatting for all languages and >> haven’t serious issues with Spotless. It is just nice to not have to bike >> shed about whitespaces, line breaks, brackets, etc. It was a bit of >> discussion, because people had bad memories from past experiences with >> automatic code formatting breaking code and introducing subtle bugs. >> >> I think that using code styles that „do not allow bike shedding“ (Google >> Code Style) are a very good option. >> >> So far none of us has seen issues with any of the Spotless code formatters >> that we use: XML, Kotlin/Gradle, Kotlin, Antlr4, Java, Scala - relying on >> the „standard“ settings w/o any customizations. We use this piece of code, >> externalized into an internal Gradle plugin: >> https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt >> >> <https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt> >> For Iceberg, it would probably be nice to have some Groovy code formatting >> for the build scripts as well. >> >> Sure, the migration will add some pain. IMHO the best option is a „big bang“ >> across the whole code base, because it happens only once. Migrating one >> module after another is a „repeated series of pains“. >> >> Since the result of a `./gradlew spotlessApply` is deterministic, people >> that have open PRs could: >> 1. Rebase their PR branch against the commit before the „Big Bang“ >> 2. Include a commit with the necessary Gradle build change (one the only >> contains the changes to add Spotless) >> 3. Do the `./gradlew spotlessApply` >> 4. Squash all commits in the PR-branch >> 5. Rebase again - against the HEAD of the master branch >> 6. Force-push PR-branch >> Because git is „clever enough“ to eliminate the „duplicated/unrelated >> changes“, the final result of the above steps is just the diff with the >> changes for the open PR. >> >> >>> Am 08.07.2022 um 00:59 schrieb Ryan Blue <b...@tabular.io >>> <mailto:b...@tabular.io>>: >>> >>> We were just talking about this proposal internally. I think it would be >>> great to have automatic code formatting, especially since we have to point >>> out a lot of changes manually. The main question is how to get there >>> without too much disruption. This came up in our discussions around the >>> upcoming 1.0 release, since that may be a good opportunity to make all of >>> the code changes. >>> >>> For background, the main concern about adding something like this is >>> applying all of the changes needed to get the existing code to conform to >>> the new style. That is really disruptive because it will cause all of the >>> PRs to need to be rebased and makes it really difficult to cherry-pick >>> changes from after the code formatting happens to branches that were >>> created before code formatting. The 1.0 release makes a good opportunity >>> because we are making other changes (removing deprecations) and will >>> hopefully have people upgrading their branches to the new major version, >>> rather than cherry picking. >>> >>> This is as good a time as any to add automatic code formatting, but it's up >>> to the community: so should we refromat the project and apply spotless code >>> formatting everywhere? I'm interested to hear opinions! >>> >>> Ryan >>> >>> On Thu, Mar 10, 2022 at 3:00 AM Eduard Tudenhoefner <edu...@dremio.com >>> <mailto:edu...@dremio.com>> wrote: >>> Hello everyone, >>> >>> I would like to get the discussion started around automatic code formatting >>> + enforcing and how we get there. >>> >>> Currently we use Checkstyle check to enforce formatting. However, the >>> problem with that is that you still have to manually do the actual >>> formatting. >>> >>> What I would like to propose is the usage of Spotless >>> (https://github.com/diffplug/spotless >>> <https://github.com/diffplug/spotless>) for checking and enforcing Java >>> code style (it can also enforce code style for Scala, Markdown, ... btw). >>> Spotless is being used by many projects >>> (https://github.com/search?l=gradle&q=spotless&type=Code >>> <https://github.com/search?l=gradle&q=spotless&type=Code>) and comes >>> essentially with two tasks: >>> * spotlessCheck: Checks that sourcecode satisfies formatting steps >>> * spotlessApply: Applies code formatting steps to sourcecode in-place >>> >>> >>> Code format >>> >>> The problem with code format is that there is no single format that can >>> satisfy the preferences of everybody. However, from my experience, once >>> people start to use any code format that produces consistent results across >>> Eclipse/IntelliJ/cmd line, people stop worrying about code format details. >>> This is also one of the reasons why the creators of Go decided to have a >>> code formatter built-in (https://go.dev/doc/effective_go#formatting >>> <https://go.dev/doc/effective_go#formatting>): >>> >>> Formatting issues are the most contentious but the least consequential. >>> People can adapt to different formatting styles but it's better if they >>> don't have to, and less time is devoted to the topic if everyone adheres to >>> the same style. The problem is how to approach this Utopia without a long >>> prescriptive style guide. >>> With Go we take an unusual approach and let the machine take care of most >>> formatting issues. The gofmt program (also available as go fmt, which >>> operates at the package level rather than source file level) reads a Go >>> program and emits the source in a standard style of indentation and >>> vertical alignment, retaining and if necessary reformatting comments. >>> >>> >>> I would like to propose using the Google Java Format with Spotless. The >>> reason for this format is essentially that this is a widely-adopted code >>> format that is designed specifically for code reviews (since we're spending >>> more time reviewing code than writing it). >>> Additionally, it produces consistent formatting results across >>> Eclipse/IntelliJ/cmd line, which I think is another very important factor. >>> >>> Thus, our initial Gradle spotless configuration could look similar to the >>> above below: >>> >>> >>> pluginManager.withPlugin('com.diffplug.spotless') { >>> spotless { >>> // don't run spotlessCheck during gradle check task during the >>> transition phase >>> enforceCheck = false >>> java { >>> target 'src/main/java/**/*.java', 'src/test/java/**/*.java', >>> 'src/jmh/java/**/*.java' >>> googleJavaFormat() >>> } >>> } >>> } >>> >>> We don't have to use Google Java Format. Spotless also supports formatting >>> the code with other formats, but from previous experience the Google Java >>> Format seemed to be really the only one to produce consistent results >>> across Eclipse/IntelliJ/cmd line. >>> >>> >>> How do we get to a point where the entire codebase is properly formatted >>> (and enforceCheck = false can be removed)? >>> >>> Now this is a difficult question. Obviously we don't want to have a single >>> format-everything commit, as that would affect lots of in-flight PRs. >>> >>> There would have to be some form of gradual formatting, for example module >>> by module. Spotless offers something called Ratched >>> (https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet >>> <https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet>) >>> that allows to enforce code format gradually (but I'm not sure this would >>> be a good thing either). >>> >>> How exactly we'd like to approach this transitioning phase this is a >>> completely separate discussion, but I feel like at least we could get the >>> ball rolling so that we make it also easier for newcomers to contribute to >>> the project, since it would be straightforward for them to make their PRs >>> adhere to the code format and also save time during PR reviews. >>> >>> >>> Eduard >>> >>> >>> -- >>> Ryan Blue >>> Tabular >> >> >> >> -- >> Ryan Blue >> >> >> -- >> Ryan Blue >> Tabular >> >> >> -- >> Ryan Blue >> Tabular >