uranusjr commented on PR #34621:
URL: https://github.com/apache/airflow/pull/34621#issuecomment-1736797913

   There are two directions to the problem:
   
   a. whether we want a Windows user’s _checkout_ to use CRLF, or just LF.
   b. whether we want a file created by a Windows user with CRLF to be _checked 
in_ to the repository as-is, or automatically converted to use LF.
   
   It used to be quite important to consider a. because many (older) Windows 
tools do not support the LF line ending well. This is however not a practical 
issue anymore; nowadays b. is the main consideration; since we (should) only 
want LF in the canonical repository on GitHub, a new file created on Windows 
must only be checked in with LF, not CRLF.
   
   Now consider the possible configurations:
   
   1. `core.autocrlf = false` would allow CRLF to be committed. There are other 
ways to prevent CRLF from being present though, so this is not entirely out of 
consideration. I think there’s a pre-commit hook for this, for example.
   2. `core.autocrlf = true` would always convert line endings to CRLF on 
checkout, and convert line endings (back) to LF on checkin.
   3. `core.autocrlf = input` is a combination of the two; it does not convert 
on checkout (so the local copy in the filesystem would use LF), but would 
convert to LF on checkin.
   
   From a quick glance it’s easy to see option 2 and 3 seem more reasonable. 
The problem is however which files the conversions apply to. By default, Git 
would use a heuristic to figure out whether a file is text (and should be 
converted) or binary (and should not), but any heuristic is not perfect. Git 
therefore allows the repository to explicitly specify whether a file is text or 
not in `.gitattributes`, but still maintainers must be careful to review 
commits so no accidental checkins are made with wrongly detected file types. 
This is fortunately easy for Airflow specifically since we have a pretty solid 
review procedure, and GitHub’s interface makes the detection pretty easy.
   
   So what is the best solution anyway? There isn’t one. In an ideal world I’d 
say it’s best to use `core.autocrlf = false`, use pre-commit to ensure no CRLF 
is committed in, and tell all contributors using Windows to be careful with 
their editor settings. But that’s not very practical for a public project like 
Airflow (more so for private projects, where you can just tell everyone on the 
team to set things up correctly), so it’s fine to use either `true` or `input` 
in practice and rely on maintainers to review and detect any errors Git’s 
algorithm may make. A pre-commit hook can still work in that case, although 
probably not be worthwhile since explaining what that lint error means to a 
novice contributor is probably harder than just telling the contributor to add 
and entry in `.gitattributes` anyway.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to