Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-14 Thread Duy Nguyen
On Fri, Dec 12, 2014 at 3:41 AM, Torsten Bögershausen tbo...@web.de wrote:
 Even if I share the the concerns that the cache may work on one system,
 but not on the other, there should be better ways to protect from that.

 Using the uname does not really help, if you move one repo from NTFS to VFAT,
 we will not detect it (assuming we use Windows).
 (And how much do we need to support the move of a repo ?)

 There is a concern that this may not work, when different clients are 
 accessing
 the repo, using the UNTR extension.

 Some kind of sanity check would be good to have, what can be done ?
 The most important things are the timestamps.
 I can think of 2 sanity checks:
 - If the modified time stamp of a directory is older then the create time of 
 any file,
   the UNTR cache can not be used.
 - If the timestamp of a file changes, but the sha1 sum is the same, what does 
 this mean?
   The file (or the whole repo) has been copied, or the time stamping does not 
 work.

 A simple verification of the FS could be to stat() .git/, create a temp file, 
 delete it and
 stat() again. If mtime does not change, the FS is unusable for UNTR.

This is a slow test. Some filesytem only supports second resolution
timestamps. If you create and delete the file so fast, mtime may
remain in the same second even if the fs is supported. We need to wait
a second between those operations (this is why update-index
--untracked-cache takes several seconds). So it cannot be done often
(i.e. at startup of every command)

 Then we could extend the uname idea:
 Create a string in UNTR which is a collection of lines like this:

 Working-For: Linux;/mnt/nfs/projects/project1
 Not-OK-For: WIndows:/a:/project1
 (Of course the strings can be made nicer, and '\n' is URL-encoded.)

 Each system that is not listed needs to probe the repo, add another line
 and re-write the index.

 We can even add a best-for line, and invalidate the UNTR every 12 hours or 
 so.

It starts to look complicated. How about letting the user deal with
it? UNTR will support storing multiple location lines. Whenever UNTR
is needed at a new location, it's disabled. The user has to use
'update-index' to test the new location and add it to UNTR. The user
can choose to keep current locations (network access), or replace them
all with the new one (repo move), or just mark the new location
unusable so the extension is kept for use in other places, but warning
at this place is suppressed. THe localtion consists of worktree
path, system name and host name.

 Should we think about having an ASCII area for additional information, which 
 is part
 of the stone, but the content is flexible ?

These lines are already in free form. If the running system generates
the same string as stored in UNTR, it's allowed to use the extension.
We need code to understand this content, so flexibility must be within
limit..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-11 Thread Torsten Bögershausen
On 10.12.14 13:22, Duy Nguyen wrote:
 On Wed, Dec 10, 2014 at 12:08 PM, Torsten Bögershausen tbo...@web.de wrote:
 That opens another question:
 How flexible/extensible/self-describing is the format of the UNTR extension
 ?
 If we drop the OS name  root dir check because it disallows network use,
 but later add a better method to verify that the underlying chain
 local OS - network - remote OS-remote FS is OK,
 do we need to change the file format of UNTR ?
 If yes, can old clients read the new format and vice versa?
 Do we need a version information of some kind, or does the
 old client skip unknown entries like we do with extensions in the index ?
 The way index extensions are done so far, there's no actual versioning
 inside an extension.Once an extension is out, its format is set in
 stone. If you change your mind, you make a new extension (with a
 different signature), so signatures are sort of version. Code is
 shared mostly so it should not be a problem. Old clients don't
 recognize new extensions, so they drop them. New clients either stick
 to old extensions or convert them to new ones. This is all local
 matters, so I don't think we need to worry too much.
Thanks for the info.
Even if I share the the concerns that the cache may work on one system,
but not on the other, there should be better ways to protect from that.

Using the uname does not really help, if you move one repo from NTFS to VFAT,
we will not detect it (assuming we use Windows).
(And how much do we need to support the move of a repo ?)

There is a concern that this may not work, when different clients are accessing
the repo, using the UNTR extension.

Some kind of sanity check would be good to have, what can be done ?
The most important things are the timestamps.
I can think of 2 sanity checks:
- If the modified time stamp of a directory is older then the create time of 
any file,
  the UNTR cache can not be used.
- If the timestamp of a file changes, but the sha1 sum is the same, what does 
this mean?
  The file (or the whole repo) has been copied, or the time stamping does not 
work.

A simple verification of the FS could be to stat() .git/, create a temp file, 
delete it and
stat() again. If mtime does not change, the FS is unusable for UNTR.

Then we could extend the uname idea:
Create a string in UNTR which is a collection of lines like this:

Working-For: Linux;/mnt/nfs/projects/project1
Not-OK-For: WIndows:/a:/project1
(Of course the strings can be made nicer, and '\n' is URL-encoded.)

Each system that is not listed needs to probe the repo, add another line
and re-write the index.

We can even add a best-for line, and invalidate the UNTR every 12 hours or so.

Should we think about having an ASCII area for additional information, which is 
part
of the stone, but the content is flexible ?

The patch-series really speeds up git status on a network, thanks for working 
on it.
The next days^H^H^H^H weeks I will do some more tests, using different 
combinations
of OS and network protocols.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-10 Thread Duy Nguyen
On Wed, Dec 10, 2014 at 12:08 PM, Torsten Bögershausen tbo...@web.de wrote:
 That opens another question:
 How flexible/extensible/self-describing is the format of the UNTR extension
 ?
 If we drop the OS name  root dir check because it disallows network use,
 but later add a better method to verify that the underlying chain
 local OS - network - remote OS-remote FS is OK,
 do we need to change the file format of UNTR ?
 If yes, can old clients read the new format and vice versa?
 Do we need a version information of some kind, or does the
 old client skip unknown entries like we do with extensions in the index ?

The way index extensions are done so far, there's no actual versioning
inside an extension.Once an extension is out, its format is set in
stone. If you change your mind, you make a new extension (with a
different signature), so signatures are sort of version. Code is
shared mostly so it should not be a problem. Old clients don't
recognize new extensions, so they drop them. New clients either stick
to old extensions or convert them to new ones. This is all local
matters, so I don't think we need to worry too much.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-09 Thread brian m. carlson
On Mon, Dec 08, 2014 at 09:05:07PM +0700, Nguyễn Thái Ngọc Duy wrote:
 If the user enables untracked cache, then
 
  - move worktree to an unsupported filesystem
  - or simply upgrade OS
  - or move the whole (portable) disk from one machine to another
  - or access a shared fs from another machine
 
 there's no guarantee that untracked cache can still function properly.
 Record the worktree location and OS footprint in the cache. If it
 changes, err on the safe side and disable the cache. The user can
 'update-index --untracked-cache' again to make sure all conditions are
 met.

My use case for untracked cache is where I have one machine with a
repository and which is also mounted via sshfs on another machine.  It
looks like this will disable the cache every time I change the machine I
access it on.  Would you be willing to accept a patch for a configuration
option to disable this?
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187


signature.asc
Description: Digital signature


Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-09 Thread Duy Nguyen
On Tue, Dec 9, 2014 at 5:04 PM, brian m. carlson
sand...@crustytoothpaste.net wrote:
 On Mon, Dec 08, 2014 at 09:05:07PM +0700, Nguyễn Thái Ngọc Duy wrote:
 If the user enables untracked cache, then

  - move worktree to an unsupported filesystem
  - or simply upgrade OS
  - or move the whole (portable) disk from one machine to another
  - or access a shared fs from another machine

 there's no guarantee that untracked cache can still function properly.
 Record the worktree location and OS footprint in the cache. If it
 changes, err on the safe side and disable the cache. The user can
 'update-index --untracked-cache' again to make sure all conditions are
 met.

 My use case for untracked cache is where I have one machine with a
 repository and which is also mounted via sshfs on another machine.  It
 looks like this will disable the cache every time I change the machine I
 access it on.  Would you be willing to accept a patch for a configuration
 option to disable this?

Torsten also does not like this patch. Maybe I'm just too paranoid.
Maybe we can drop this patch.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-09 Thread Torsten Bögershausen

On 12/09/2014 11:53 PM, Duy Nguyen wrote:

On Tue, Dec 9, 2014 at 5:04 PM, brian m. carlson
sand...@crustytoothpaste.net wrote:

On Mon, Dec 08, 2014 at 09:05:07PM +0700, Nguyễn Thái Ngọc Duy wrote:

If the user enables untracked cache, then

  - move worktree to an unsupported filesystem
  - or simply upgrade OS
  - or move the whole (portable) disk from one machine to another
  - or access a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.

My use case for untracked cache is where I have one machine with a
repository and which is also mounted via sshfs on another machine.  It
looks like this will disable the cache every time I change the machine I
access it on.  Would you be willing to accept a patch for a configuration
option to disable this?

Torsten also does not like this patch. Maybe I'm just too paranoid.
Maybe we can drop this patch.

That opens another question:
How flexible/extensible/self-describing is the format of the UNTR 
extension ?

If we drop the OS name  root dir check because it disallows network use,
but later add a better method to verify that the underlying chain
local OS - network - remote OS-remote FS is OK,
do we need to change the file format of UNTR ?
If yes, can old clients read the new format and vice versa?
Do we need a version information of some kind, or does the
old client skip unknown entries like we do with extensions in the index ?



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-08 Thread Nguyễn Thái Ngọc Duy
If the user enables untracked cache, then

 - move worktree to an unsupported filesystem
 - or simply upgrade OS
 - or move the whole (portable) disk from one machine to another
 - or access a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.

This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.

This change does not cover all bases, you can fool it if you try
hard. The point is to stop accidents.

Helped-by: Eric Sunshine sunsh...@sunshineco.com
Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 Documentation/technical/index-format.txt |  3 +++
 dir.c| 28 
 git-compat-util.h|  1 +
 test-dump-untracked-cache.c  |  1 +
 4 files changed, 33 insertions(+)

diff --git a/Documentation/technical/index-format.txt 
b/Documentation/technical/index-format.txt
index b97ac8d..5dc2bee 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -242,6 +242,9 @@ Git index format
 
   The extension starts with
 
+  - A NUL-terminated string describing the environment when the cache
+is created.
+
   - Stat data of $GIT_DIR/info/exclude. See Index entry section from
 ctime field until file size.
 
diff --git a/dir.c b/dir.c
index 916f1ed..ef58547 100644
--- a/dir.c
+++ b/dir.c
@@ -2246,10 +2246,21 @@ static void write_one_dir(struct untracked_cache_dir 
*untracked,
write_one_dir(untracked-dirs[i], wd);
 }
 
+static void get_ident_string(struct strbuf *sb)
+{
+   struct utsname uts;
+
+   if (uname(uts))
+   die_errno(_(failed to get kernel name and information));
+   strbuf_addf(sb, Location %s, system %s %s %s, get_git_work_tree(),
+   uts.sysname, uts.release, uts.version);
+}
+
 void write_untracked_extension(struct strbuf *out, struct untracked_cache 
*untracked)
 {
struct ondisk_untracked_cache *ouc;
struct write_data wd;
+   struct strbuf sb = STRBUF_INIT;
unsigned char varbuf[16];
int len = 0, varint_len;
if (untracked-exclude_per_dir)
@@ -2261,6 +2272,11 @@ void write_untracked_extension(struct strbuf *out, 
struct untracked_cache *untra
hashcpy(ouc-excludes_file_sha1, untracked-ss_excludes_file.sha1);
ouc-dir_flags = htonl(untracked-dir_flags);
memcpy(ouc-exclude_per_dir, untracked-exclude_per_dir, len + 1);
+
+   get_ident_string(sb);
+   strbuf_add(out, sb.buf, sb.len + 1);
+   strbuf_release(sb);
+
strbuf_add(out, ouc, sizeof(*ouc) + len);
if (!untracked-root) {
varint_len = encode_varint(0, varbuf);
@@ -2444,12 +2460,24 @@ struct untracked_cache *read_untracked_extension(const 
void *data, unsigned long
struct untracked_cache *uc;
struct read_data rd;
const unsigned char *next = data, *end = data + sz;
+   struct strbuf sb = STRBUF_INIT;
int len;
 
if (sz = 1 || end[-1] != '\0')
return NULL;
end--;
 
+   get_ident_string(sb);
+   if (strcmp(sb.buf, (const char *)next)) {
+   warning(_(system identification does not match, untracked 
cache disabled.\n
+ Stored: %s\nCurrent: %s\n),
+   next, sb.buf);
+   strbuf_release(sb);
+   return NULL;
+   }
+   next += sb.len + 1;
+   strbuf_release(sb);
+
ouc = (const struct ondisk_untracked_cache *)next;
if (next + sizeof(*ouc)  end)
return NULL;
diff --git a/git-compat-util.h b/git-compat-util.h
index f587749..e9502a1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -132,6 +132,7 @@
 #elif defined(_MSC_VER)
 #include compat/msvc.h
 #else
+#include sys/utsname.h
 #include sys/wait.h
 #include sys/resource.h
 #include sys/socket.h
diff --git a/test-dump-untracked-cache.c b/test-dump-untracked-cache.c
index 710441e..25d855d 100644
--- a/test-dump-untracked-cache.c
+++ b/test-dump-untracked-cache.c
@@ -44,6 +44,7 @@ int main(int ac, char **av)
 {
struct untracked_cache *uc;
struct strbuf base = STRBUF_INIT;
+   setup_git_directory();
if (read_cache()  0)
die(unable to read index file);
uc = the_index.untracked;
-- 
2.2.0.60.gb7b3c64

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html