[ 
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040705#comment-18040705
 ] 

Tim Allison commented on TIKA-4251:
-----------------------------------

[~tilman], I tried to modify the Eclipse format so that it made no changes, 
in the hope that we could then modify one setting at a time to get 
incremental, format-specific changes.

Claude thought that'd work... but it was a disaster, which shocked absolutely 
no one involved. :D 

It is just not possible (at least with this tooling, which I'm not wedded to).

It finally came up with:
{noformat}
Option 1: Accept One Big Format Commit (Controversial but clean)
  Just pick your target style (Google) and do it all at once:
  - One commit: "chore: apply Google Java Style formatting"
  - Everyone rebases/merges after
  - Git blame issue: Use git blame --ignore-rev <commit-sha> or
    .git-blame-ignore-revs
  Pro: Clean break, done once
  Con: Big merge conflicts, disrupts in-flight work

Option 2: Format Only New/Modified Code (Gradual, least disruptive)
  <plugin>
    <groupId>com.diffplug.spotless</groupId>
    <artifactId>spotless-maven-plugin</artifactId>
    <configuration>
      <ratchetFrom>origin/main</ratchetFrom>  <!-- Only format changed files -->
      <java>
        <googleJavaFormat/>
      </java>
    </configuration>
  </plugin>
  Pro: Zero disruption, gradual migration
  Con: Codebase stays mixed for a long time

Option 3: Minimal Spotless - Just License + Imports (Pragmatic)
  Don't format at all, just enforce:
  <java>
    <licenseHeader>...</licenseHeader>
    <removeUnusedImports/>
    <!-- NO FORMATTING -->
  </java>
  Keep checkstyle for everything else. Add formatting later when
  you're ready for the big commit.
  Pro: Immediate value, no disruption
  Con: Doesn't solve the formatting consistency problem
 {noformat}
A fourth option is to go module by module.

I wasn't aware of {{ratchetFrom}}, but that sounds like a pretty good 
option because it would solve my personal frustrations of dealing with 
checkstyle toe-stubbing on every PR, and, theoretically, it would eventually 
cover the codebase, or at least the parts we care about and modify often? 
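
For discussion, here is a rough sketch of how {{ratchetFrom}} might be wired 
in so that the check runs as part of the build. It is untested against Tika: 
the {{spotless.version}} property and the {{verify}} phase binding are 
placeholders/assumptions, not settings taken from our poms.
{noformat}
<plugin>
  <groupId>com.diffplug.spotless</groupId>
  <artifactId>spotless-maven-plugin</artifactId>
  <!-- hypothetical property; pin whatever version we settle on -->
  <version>${spotless.version}</version>
  <configuration>
    <!-- only files changed relative to origin/main are checked -->
    <ratchetFrom>origin/main</ratchetFrom>
    <java>
      <googleJavaFormat/>
    </java>
  </configuration>
  <executions>
    <execution>
      <!-- assumed binding: fail the build on unformatted changed files -->
      <phase>verify</phase>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
{noformat}
With something like this, {{mvn spotless:apply}} should fix whatever 
{{mvn spotless:check}} complains about. As far as I understand, the 
{{origin/main}} ref has to be available locally, so shallow CI clones may 
need a fuller fetch.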

What makes sense?

> [DISCUSS] move to cosium's git-code-format-maven-plugin with 
> google-java-format
> -------------------------------------------------------------------------------
>
>                 Key: TIKA-4251
>                 URL: https://issues.apache.org/jira/browse/TIKA-4251
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> I was recently working a bit on incubator-stormcrawler, and I noticed that 
> they are using cosium's git-code-format-maven-plugin: 
> https://github.com/Cosium/git-code-format-maven-plugin
> I was initially annoyed that I couldn't quickly figure out what I had to fix 
> to make the linter happy, but then I realized there was a magic command: 
> {{mvn git-code-format:format-code}} which just fixed the code so that the 
> linter passed. 
> The one drawback I found is that it does not fix nor does it alert on 
> wildcard imports.  We could still use checkstyle for that but only have one 
> rule for checkstyle.
> The other drawback is that there is not a lot of room for variation from 
> google's style. This may actually be a benefit, too, of course.
> I just ran this on {{tika-core}} here: 
> https://github.com/apache/tika/tree/google-java-format
> What would you think about making this change for 3.x?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
