Re: Git on Windows maps creation time onto changed time
Hi Keith, On Fri, 2 Feb 2018, Keith Goldfarb wrote: > > On Feb 2, 2018, at 1:18 PM, Johannes Schindelin > >wrote: > > > > You cannot assume that the inode numbers are identical between file > > systems/operating systems. That's simply not going to work. > > Yes, I agree with you, I cannot assume this, so I checked. In my case, the > inode numbers are indeed identical. > > > I know you really want *so hard* for the same working directory to be > > accessible from both Windows and Linux. I have a lot of sympathy for that > > sentiment. Though I do not see much chance for success on that front. > > I’m certainly willing to accept that there are going to be limitations by > using a filesystem from two different operating systems. But regardless of > the problems caused by that pattern, would you agree that the Windows code > should be using the actual inode number (well, 32-bits of it) instead of zero? In all likelihood, you have enabled the FS Cache feature. And for a good reason: it makes many Git operations much, much faster. Still not as fast as on Linux because Git is quite tuned to assume all the Linux semantics. So instead of using tell-tales for file modifications that are portable, Git uses tell-tales that are readily available on Linux. And instead of using access patterns that are portable, Git assumes POSIX semantics where you simply use opendir()/readdir()/closedir() and in that inner loop, you call lstat() to get all the stat data. This is obviously not portable beyond POSIX. On Windows, for example, there are really fast FindFirstFile()/FindNextFile() APIs that even allow you to specify patterns to match (which Linux has to do manually, by inspecting the file name obtained via readdir()). You even get some stat data back at the same time, really fast, without additional API calls, unlike on Linux. But Git's access patterns really are tuned to the *dir() way, and we have to emulate it in Git for Windows. This is very, very slow. The FS Cache feature tries to gain back some speed by using FindFirstFile()/FindNextFile() pre-emptively and caching the data, looking it up upon the next time stat data is queried. The stat data obtained that way does not, however, include Change Time nor inode number. And that, dear children, was the story for the day: this is why Git for Windows uses the CreationTime instead of the Change Time and why it marks inode as 0; Because to do it accurately would be a lot slower, and we cannot afford even slower because we are already super slow because we have to work around Git's assumptions as to what APIs are available. Ciao, Johannes
Re: Git on Windows maps creation time onto changed time
> On Feb 2, 2018, at 1:18 PM, Johannes Schindelin> wrote: > > You cannot assume that the inode numbers are identical between file > systems/operating systems. That's simply not going to work. Yes, I agree with you, I cannot assume this, so I checked. In my case, the inode numbers are indeed identical. > I know you really want *so hard* for the same working directory to be > accessible from both Windows and Linux. I have a lot of sympathy for that > sentiment. Though I do not see much chance for success on that front. I’m certainly willing to accept that there are going to be limitations by using a filesystem from two different operating systems. But regardless of the problems caused by that pattern, would you agree that the Windows code should be using the actual inode number (well, 32-bits of it) instead of zero? K.
Re: Git on Windows maps creation time onto changed time
Hi Keith, On Fri, 2 Feb 2018, Keith Goldfarb wrote: > > The purpose of these values is to allow to notice a change on the file > > system without going through the actual file data. Duplicating > > st_mtime would be pointless. > > Well, I agree with the first statement. I disagree with the 2nd. It’s > not pointless to fix a problem, and in theory the creation date of a > file can never change. > > > Don't do that then. Use core.trustctime. > > I am. Unfortunately, my problem isn’t solved by that alone. Perhaps this > deserves its own thread, but on Windows the st_ino field is set to zero. > This can also trigger a false positive, causing the whole cache to be > rebuilt unnecessarily. You cannot assume that the inode numbers are identical between file systems/operating systems. That's simply not going to work. I know you really want *so hard* for the same working directory to be accessible from both Windows and Linux. I have a lot of sympathy for that sentiment. Though I do not see much chance for success on that front. Ciao, Johannes
Re: Git on Windows maps creation time onto changed time
> The purpose of these values is to allow to notice a change on the file system > without going through the actual file data. Duplicating st_mtime would be > pointless. Well, I agree with the first statement. I disagree with the 2nd. It’s not pointless to fix a problem, and in theory the creation date of a file can never change. > Don't do that then. Use core.trustctime. I am. Unfortunately, my problem isn’t solved by that alone. Perhaps this deserves its own thread, but on Windows the st_ino field is set to zero. This can also trigger a false positive, causing the whole cache to be rebuilt unnecessarily. K.
Re: Git on Windows maps creation time onto changed time
Am 02.02.2018 um 00:10 schrieb Keith Goldfarb: Dear git, While tracking down a problem with a filesystem shared by Windows and Ubuntu, I came across the following code in compat/mingw.c (ming_fstat(), also in do_lstat()): if (GetFileInformationByHandle(fh, )) { buf->st_ino = 0; buf->st_gid = 0; buf->st_uid = 0; buf->st_nlink = 1; buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes); buf->st_size = fdata.nFileSizeLow | (((off_t)fdata.nFileSizeHigh)<<32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ buf->st_atime = filetime_to_time_t(&(fdata.ftLastAccessTime)); buf->st_mtime = filetime_to_time_t(&(fdata.ftLastWriteTime)); buf->st_ctime = filetime_to_time_t(&(fdata.ftCreationTime)); return 0; } The assignment of buf->st_ctime doesn’t seem right to me. I understand there’s no good choice here, but I think a better choice would be to duplicate the definition used for st_mtime. The purpose of these values is to allow to notice a change on the file system without going through the actual file data. Duplicating st_mtime would be pointless. Background: When I do a git status on Windows and then later on Ubuntu (or the other order), it is extremely slow, as the entire tree is being traversed. I tracked it down to this difference in definition of c_time. Yes, I know about the core.trustctime variable, but my problem aside this seems like an unwise choice. Don't do that then. Use core.trustctime. -- Hannes
Re: Git on Windows maps creation time onto changed time
On Fri, Feb 2, 2018 at 12:10 AM, Keith Goldfarbwrote: > Dear git, > > While tracking down a problem with a filesystem shared by Windows and Ubuntu, > I came across the following code in compat/mingw.c (ming_fstat(), also in > do_lstat()): > > if (GetFileInformationByHandle(fh, )) { > buf->st_ino = 0; > buf->st_gid = 0; > buf->st_uid = 0; > buf->st_nlink = 1; > buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes); > buf->st_size = fdata.nFileSizeLow | > (((off_t)fdata.nFileSizeHigh)<<32); > buf->st_dev = buf->st_rdev = 0; /* not used by Git */ > buf->st_atime = filetime_to_time_t(&(fdata.ftLastAccessTime)); > buf->st_mtime = filetime_to_time_t(&(fdata.ftLastWriteTime)); > buf->st_ctime = filetime_to_time_t(&(fdata.ftCreationTime)); > return 0; > } > > The assignment of buf->st_ctime doesn’t seem right to me. I understand > there’s no good choice here, but I think a better choice would be to > duplicate the definition used for st_mtime. > > Background: When I do a git status on Windows and then later on Ubuntu (or > the other order), it is extremely slow, as the entire tree is being > traversed. I tracked it down to this difference in definition of c_time. Yes, > I know about the core.trustctime variable, but my problem aside this seems > like an unwise choice. > > Thanks for listening, Let's CC Marius Storm-Olsen who added this behavior (CC'd).