Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On Fri, Dec 12, 2014 at 3:41 AM, Torsten Bögershausen tbo...@web.de wrote: Even if I share the the concerns that the cache may work on one system, but not on the other, there should be better ways to protect from that. Using the uname does not really help, if you move one repo from NTFS to VFAT, we will not detect it (assuming we use Windows). (And how much do we need to support the move of a repo ?) There is a concern that this may not work, when different clients are accessing the repo, using the UNTR extension. Some kind of sanity check would be good to have, what can be done ? The most important things are the timestamps. I can think of 2 sanity checks: - If the modified time stamp of a directory is older then the create time of any file, the UNTR cache can not be used. - If the timestamp of a file changes, but the sha1 sum is the same, what does this mean? The file (or the whole repo) has been copied, or the time stamping does not work. A simple verification of the FS could be to stat() .git/, create a temp file, delete it and stat() again. If mtime does not change, the FS is unusable for UNTR. This is a slow test. Some filesytem only supports second resolution timestamps. If you create and delete the file so fast, mtime may remain in the same second even if the fs is supported. We need to wait a second between those operations (this is why update-index --untracked-cache takes several seconds). So it cannot be done often (i.e. at startup of every command) Then we could extend the uname idea: Create a string in UNTR which is a collection of lines like this: Working-For: Linux;/mnt/nfs/projects/project1 Not-OK-For: WIndows:/a:/project1 (Of course the strings can be made nicer, and '\n' is URL-encoded.) Each system that is not listed needs to probe the repo, add another line and re-write the index. We can even add a best-for line, and invalidate the UNTR every 12 hours or so. It starts to look complicated. How about letting the user deal with it? UNTR will support storing multiple location lines. Whenever UNTR is needed at a new location, it's disabled. The user has to use 'update-index' to test the new location and add it to UNTR. The user can choose to keep current locations (network access), or replace them all with the new one (repo move), or just mark the new location unusable so the extension is kept for use in other places, but warning at this place is suppressed. THe localtion consists of worktree path, system name and host name. Should we think about having an ASCII area for additional information, which is part of the stone, but the content is flexible ? These lines are already in free form. If the running system generates the same string as stored in UNTR, it's allowed to use the extension. We need code to understand this content, so flexibility must be within limit.. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On 10.12.14 13:22, Duy Nguyen wrote: On Wed, Dec 10, 2014 at 12:08 PM, Torsten Bögershausen tbo...@web.de wrote: That opens another question: How flexible/extensible/self-describing is the format of the UNTR extension ? If we drop the OS name root dir check because it disallows network use, but later add a better method to verify that the underlying chain local OS - network - remote OS-remote FS is OK, do we need to change the file format of UNTR ? If yes, can old clients read the new format and vice versa? Do we need a version information of some kind, or does the old client skip unknown entries like we do with extensions in the index ? The way index extensions are done so far, there's no actual versioning inside an extension.Once an extension is out, its format is set in stone. If you change your mind, you make a new extension (with a different signature), so signatures are sort of version. Code is shared mostly so it should not be a problem. Old clients don't recognize new extensions, so they drop them. New clients either stick to old extensions or convert them to new ones. This is all local matters, so I don't think we need to worry too much. Thanks for the info. Even if I share the the concerns that the cache may work on one system, but not on the other, there should be better ways to protect from that. Using the uname does not really help, if you move one repo from NTFS to VFAT, we will not detect it (assuming we use Windows). (And how much do we need to support the move of a repo ?) There is a concern that this may not work, when different clients are accessing the repo, using the UNTR extension. Some kind of sanity check would be good to have, what can be done ? The most important things are the timestamps. I can think of 2 sanity checks: - If the modified time stamp of a directory is older then the create time of any file, the UNTR cache can not be used. - If the timestamp of a file changes, but the sha1 sum is the same, what does this mean? The file (or the whole repo) has been copied, or the time stamping does not work. A simple verification of the FS could be to stat() .git/, create a temp file, delete it and stat() again. If mtime does not change, the FS is unusable for UNTR. Then we could extend the uname idea: Create a string in UNTR which is a collection of lines like this: Working-For: Linux;/mnt/nfs/projects/project1 Not-OK-For: WIndows:/a:/project1 (Of course the strings can be made nicer, and '\n' is URL-encoded.) Each system that is not listed needs to probe the repo, add another line and re-write the index. We can even add a best-for line, and invalidate the UNTR every 12 hours or so. Should we think about having an ASCII area for additional information, which is part of the stone, but the content is flexible ? The patch-series really speeds up git status on a network, thanks for working on it. The next days^H^H^H^H weeks I will do some more tests, using different combinations of OS and network protocols. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On Wed, Dec 10, 2014 at 12:08 PM, Torsten Bögershausen tbo...@web.de wrote: That opens another question: How flexible/extensible/self-describing is the format of the UNTR extension ? If we drop the OS name root dir check because it disallows network use, but later add a better method to verify that the underlying chain local OS - network - remote OS-remote FS is OK, do we need to change the file format of UNTR ? If yes, can old clients read the new format and vice versa? Do we need a version information of some kind, or does the old client skip unknown entries like we do with extensions in the index ? The way index extensions are done so far, there's no actual versioning inside an extension.Once an extension is out, its format is set in stone. If you change your mind, you make a new extension (with a different signature), so signatures are sort of version. Code is shared mostly so it should not be a problem. Old clients don't recognize new extensions, so they drop them. New clients either stick to old extensions or convert them to new ones. This is all local matters, so I don't think we need to worry too much. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On Mon, Dec 08, 2014 at 09:05:07PM +0700, Nguyễn Thái Ngọc Duy wrote: If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. My use case for untracked cache is where I have one machine with a repository and which is also mounted via sshfs on another machine. It looks like this will disable the cache every time I change the machine I access it on. Would you be willing to accept a patch for a configuration option to disable this? -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187 signature.asc Description: Digital signature
Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On Tue, Dec 9, 2014 at 5:04 PM, brian m. carlson sand...@crustytoothpaste.net wrote: On Mon, Dec 08, 2014 at 09:05:07PM +0700, Nguyễn Thái Ngọc Duy wrote: If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. My use case for untracked cache is where I have one machine with a repository and which is also mounted via sshfs on another machine. It looks like this will disable the cache every time I change the machine I access it on. Would you be willing to accept a patch for a configuration option to disable this? Torsten also does not like this patch. Maybe I'm just too paranoid. Maybe we can drop this patch. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On 12/09/2014 11:53 PM, Duy Nguyen wrote: On Tue, Dec 9, 2014 at 5:04 PM, brian m. carlson sand...@crustytoothpaste.net wrote: On Mon, Dec 08, 2014 at 09:05:07PM +0700, Nguyễn Thái Ngọc Duy wrote: If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. My use case for untracked cache is where I have one machine with a repository and which is also mounted via sshfs on another machine. It looks like this will disable the cache every time I change the machine I access it on. Would you be willing to accept a patch for a configuration option to disable this? Torsten also does not like this patch. Maybe I'm just too paranoid. Maybe we can drop this patch. That opens another question: How flexible/extensible/self-describing is the format of the UNTR extension ? If we drop the OS name root dir check because it disallows network use, but later add a better method to verify that the underlying chain local OS - network - remote OS-remote FS is OK, do we need to change the file format of UNTR ? If yes, can old clients read the new format and vice versa? Do we need a version information of some kind, or does the old client skip unknown entries like we do with extensions in the index ? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 23/23] untracked cache: guard and disable on system changes
If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. This adds a new requirement that setup_git_directory* must be called before read_cache() because we need worktree location by then, or the cache is dropped. This change does not cover all bases, you can fool it if you try hard. The point is to stop accidents. Helped-by: Eric Sunshine sunsh...@sunshineco.com Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- Documentation/technical/index-format.txt | 3 +++ dir.c| 28 git-compat-util.h| 1 + test-dump-untracked-cache.c | 1 + 4 files changed, 33 insertions(+) diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index b97ac8d..5dc2bee 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -242,6 +242,9 @@ Git index format The extension starts with + - A NUL-terminated string describing the environment when the cache +is created. + - Stat data of $GIT_DIR/info/exclude. See Index entry section from ctime field until file size. diff --git a/dir.c b/dir.c index 916f1ed..ef58547 100644 --- a/dir.c +++ b/dir.c @@ -2246,10 +2246,21 @@ static void write_one_dir(struct untracked_cache_dir *untracked, write_one_dir(untracked-dirs[i], wd); } +static void get_ident_string(struct strbuf *sb) +{ + struct utsname uts; + + if (uname(uts)) + die_errno(_(failed to get kernel name and information)); + strbuf_addf(sb, Location %s, system %s %s %s, get_git_work_tree(), + uts.sysname, uts.release, uts.version); +} + void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked) { struct ondisk_untracked_cache *ouc; struct write_data wd; + struct strbuf sb = STRBUF_INIT; unsigned char varbuf[16]; int len = 0, varint_len; if (untracked-exclude_per_dir) @@ -2261,6 +2272,11 @@ void write_untracked_extension(struct strbuf *out, struct untracked_cache *untra hashcpy(ouc-excludes_file_sha1, untracked-ss_excludes_file.sha1); ouc-dir_flags = htonl(untracked-dir_flags); memcpy(ouc-exclude_per_dir, untracked-exclude_per_dir, len + 1); + + get_ident_string(sb); + strbuf_add(out, sb.buf, sb.len + 1); + strbuf_release(sb); + strbuf_add(out, ouc, sizeof(*ouc) + len); if (!untracked-root) { varint_len = encode_varint(0, varbuf); @@ -2444,12 +2460,24 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long struct untracked_cache *uc; struct read_data rd; const unsigned char *next = data, *end = data + sz; + struct strbuf sb = STRBUF_INIT; int len; if (sz = 1 || end[-1] != '\0') return NULL; end--; + get_ident_string(sb); + if (strcmp(sb.buf, (const char *)next)) { + warning(_(system identification does not match, untracked cache disabled.\n + Stored: %s\nCurrent: %s\n), + next, sb.buf); + strbuf_release(sb); + return NULL; + } + next += sb.len + 1; + strbuf_release(sb); + ouc = (const struct ondisk_untracked_cache *)next; if (next + sizeof(*ouc) end) return NULL; diff --git a/git-compat-util.h b/git-compat-util.h index f587749..e9502a1 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -132,6 +132,7 @@ #elif defined(_MSC_VER) #include compat/msvc.h #else +#include sys/utsname.h #include sys/wait.h #include sys/resource.h #include sys/socket.h diff --git a/test-dump-untracked-cache.c b/test-dump-untracked-cache.c index 710441e..25d855d 100644 --- a/test-dump-untracked-cache.c +++ b/test-dump-untracked-cache.c @@ -44,6 +44,7 @@ int main(int ac, char **av) { struct untracked_cache *uc; struct strbuf base = STRBUF_INIT; + setup_git_directory(); if (read_cache() 0) die(unable to read index file); uc = the_index.untracked; -- 2.2.0.60.gb7b3c64 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html