Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-16 Thread Duy Nguyen
On Tue, Jul 15, 2014 at 11:45 PM, Junio C Hamano gits...@pobox.com wrote:
 What is the real point of writing into *.lock and renaming?  It
 serves two purposes: (1) everybody adheres to that convention---if
 we managed to take the lock index.lock, nobody else will compete
 and interfere with us until we rename it to index. (2) in the
 meantime, others can still read the old contents in the original
 index.

 With take lock, write to a temporary, commit by rename or
 rollback by delete, we can have the usual transactional semantics.
 While we are working on it, nobody is allowed to muck with the same
 file, because everybody pays attention to *.lock.  People will not
 see what we are doing until we release the lock because we are not
 writing into the primary file.  And people can see what we did in
 its entirety once we are done because we close and rename to commit
 our changes atomically.

True.

 Think what CLOSE_LOCK adds to that and you would appreciate its
 usefulness and at the same time realize its limitation.  By allowing
 us to flush what we wrote to the disk without releasing the lock, we
 can give others (e.g. subprograms we spawn) controlled access to the
 new contents we prepared before we commit the result to the outside
 world.  The access is controlled because we are in control when we
 tell these others to peek into or update the temporary file *.lock.

 The original implementaiton of CLOSE_LOCK is limited in that you can
 do so only once; you take a lock, you do your thing, you close, you
 let (one or more) others see it, and you commit (or rollback).  You
 cannot do another of your thing once you close with the original
 implementation because there was no way to reopen.

This is probably where our opinions differ. Yes, if you are sure
nobody else is looking at the lock file any more, then you can do
whatever you want. And because this is a .lock file, nobody is
supposed to look at it unless you tell them too (in contrast
$GIT_DIR/index can be read at any time). The format of the index makes
it impossible to just edit one byte and be done with it. You always
write a full new file. By sticking to transaction-style update, you
need no extra code, and you have a back up file as a side effect.

 What do you gain by your proposal to lock index.lock file?  We
 know we already have index.lock, so nobody should be competing on
 mucking with its contents with us and we gain nothing by writing
 into index.lock.lock and renaming it to index.lock.  We are in total
 control of the lifetime of index.lock, when we spawn subprograms on
 it to let them write to it, when we ourselves write to it, when we
 spawn subprograms on it to let them read from it, all under the lock
 we have on the index, i.e. index.lock.

 The only thing use of another temporary file (that does not have to
 be a lock on index.lock, by the way, because we have total control
 over when and who updates the file while we hold the index.lock)
 would give you is that it allows you to make the success of the do
 another of your thing step optional.  While holding the lock, you
 close and let add -i work on it, and after it returns, instead of
 reopening, you write into yet another index.lock.lock, expecting
 that it might fail and when it fails you can roll it back, leaving
 the contents add -i left in index.lock intact.  If you do not
 use the extra temporary file, you start from index.lock left by
 add -i, write the updated index into index.lock and if you fail
 to write, you have to roll back the entire index---you lose the
 option to use the index left by add -i without repopulated
 cache-tree.  But in the index update context, I do not think such a
 complexity is not necessary.  If something fails, we should fail and
 roll back the entire index.

I probably look at the problem from a wrong angle. To me the result of
commit -p is precious. I'm not a big user of commit -p myself as I
prefer add -p but it's the same: it'd be frustrating if after you
have carefully added your chunks, the program aborts and you have to
start over. And not just a few chunks. Think of reviewing .po files
and approve strings by the means of adding them to the index. Perhaps
because _I_ as a developer see this cache-tree update step optional
and react to it unnecessarily. Ordinary users won't see any
difference. And perhaps a better way to save the result of commit/add
-p is some sort of index log, not be over-protective at this
interactive commit code block.

I don't feel strongly either way. So your call.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-16 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

   If you do not
 use the extra temporary file, you start from index.lock left by
 add -i, write the updated index into index.lock and if you fail
 to write, you have to roll back the entire index---you lose the
 option to use the index left by add -i without repopulated
 cache-tree.  But in the index update context, I do not think such a
 complexity is not necessary.  If something fails, we should fail and
 roll back the entire index.

 I probably look at the problem from a wrong angle. To me the result of
 commit -p is precious. I'm not a big user of commit -p myself as I
 prefer add -p but it's the same...

Oh, we agree that the result of commit -p is precious.

But the point of David's series is to change the definition of the
precious result to not just commit is made as asked, but now
also to include that the index the user will use for continued work
will have populated cache-tree.  The series thinks both are precious.

As you can probably read from my review responses, I am not sold to
the idea that spending extra cycles to pre-populate cache-tree is
100% good idea, but if we _were_ to accept that it is a good idea,
it logically follows that failing to populate cache-tree is just as
a failure as failing to commit.

In any case, it is unlikely for writing out the updated index with
refreshed cache-tree to fail and blow away the partially built index
(we do not even attempt to reopen/update if we cannot prepare
in-core cache-tree), so I do not think it is such a big deal either
way.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-15 Thread Junio C Hamano
On Mon, Jul 14, 2014 at 7:15 PM, Duy Nguyen pclo...@gmail.com wrote:

 It makes me wonder if a cleaner way of rebuilding cache-treei in this
 case is from git-add--interactive.perl, or by simply spawn git
 update-index --rebuild-cache-tree after running
 git-add--interactive.perl.

We could check if the cache-tree has fully been populated by
add -i and limit the rebuilding by git commit -p main process,
but if add -i did not do so, there is no reason why git commit -p
should not do so, without relying on the implementation of add -i
to do so.

At least to me, what you suggested sounds nothing more than
a cop-out; instead of lifting the limitation of the API by enhancing
it with reopen-lock-file, punting to shift the burden elsewhere. I
am not quite sure why that is cleaner, but perhaps I am missing
something.

In the longer run, it would be plausible that somebody would want
to rewrite git-add -i and will make it just a C API call away without
having to spawn a separate process. You cannot punt by saying
make 'add -i' responsible for it at that point; you will be doing
what 'add -i' would be doing yourself, no?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-15 Thread Duy Nguyen
On Mon, Jul 14, 2014 at 11:38:06PM -0700, Junio C Hamano wrote:
 On Mon, Jul 14, 2014 at 7:15 PM, Duy Nguyen pclo...@gmail.com wrote:
 
  It makes me wonder if a cleaner way of rebuilding cache-treei in this
  case is from git-add--interactive.perl, or by simply spawn git
  update-index --rebuild-cache-tree after running
  git-add--interactive.perl.
 
 We could check if the cache-tree has fully been populated by
 add -i and limit the rebuilding by git commit -p main process,
 but if add -i did not do so, there is no reason why git commit -p
 should not do so, without relying on the implementation of add -i
 to do so.
 
 At least to me, what you suggested sounds nothing more than
 a cop-out; instead of lifting the limitation of the API by enhancing
 it with reopen-lock-file, punting to shift the burden elsewhere. I
 am not quite sure why that is cleaner, but perhaps I am missing
 something.

Reopen-lock-file to me does not sound like an enhancement, at least in
the index update context. We have always updated the index by writing
to a new file, then rename. Now if the new write_locked_index() blows
up after reopen_lock_file(), we have no way but die() because
index_lock.filename can no longer be trusted.

Doing it in a separate process keeps the tradition. And if the second
write fails, we still have good index_lock.filename. We could also do
the same here with something like this without new process (lightly
tested):

-- 8 --
diff --git a/builtin/commit.c b/builtin/commit.c
index 606c890..c2b4875 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -340,6 +340,18 @@ static char *prepare_index(int argc, const char **argv, 
const char *prefix,
 
discard_cache();
read_cache_from(index_lock.filename);
+   if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) {
+   static struct lock_file second_index_lock;
+   hold_lock_file_for_update(second_index_lock,
+ index_lock.filename,
+ LOCK_DIE_ON_ERROR);
+   if (write_locked_index(the_index, second_index_lock,
+  COMMIT_LOCK)) {
+   warning(_(Failed to update main cache tree));
+   rollback_lock_file(second_index_lock);
+   }
+   } else
+   warning(_(Failed to update main cache tree));
 
commit_style = COMMIT_NORMAL;
return index_lock.filename;
-- 8 --

I was worried about the dependency between second_index_lock and
index_lock, but at cleanup, all we do is rollback, which is safe
regardless of dependencies. Just need to make sure we commit
second_index_lock before index_lock.

 In the longer run, it would be plausible that somebody would want
 to rewrite git-add -i and will make it just a C API call away without
 having to spawn a separate process. You cannot punt by saying
 make 'add -i' responsible for it at that point; you will be doing
 what 'add -i' would be doing yourself, no?

Yes, but at current rewrite rate I'd bet it won't happen in the next
two years at least.
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-15 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 On Mon, Jul 14, 2014 at 11:38:06PM -0700, Junio C Hamano wrote:
 On Mon, Jul 14, 2014 at 7:15 PM, Duy Nguyen pclo...@gmail.com wrote:
 
  It makes me wonder if a cleaner way of rebuilding cache-treei in this
  case is from git-add--interactive.perl, or by simply spawn git
  update-index --rebuild-cache-tree after running
  git-add--interactive.perl.
 
 We could check if the cache-tree has fully been populated by
 add -i and limit the rebuilding by git commit -p main process,
 but if add -i did not do so, there is no reason why git commit -p
 should not do so, without relying on the implementation of add -i
 to do so.
 
 At least to me, what you suggested sounds nothing more than
 a cop-out; instead of lifting the limitation of the API by enhancing
 it with reopen-lock-file, punting to shift the burden elsewhere. I
 am not quite sure why that is cleaner, but perhaps I am missing
 something.

 Reopen-lock-file to me does not sound like an enhancement, at least in
 the index update context. We have always updated the index by writing
 to a new file, then rename.

Time to step back and think, I think.

What is the real point of writing into *.lock and renaming?  It
serves two purposes: (1) everybody adheres to that convention---if
we managed to take the lock index.lock, nobody else will compete
and interfere with us until we rename it to index. (2) in the
meantime, others can still read the old contents in the original
index.

With take lock, write to a temporary, commit by rename or
rollback by delete, we can have the usual transactional semantics.
While we are working on it, nobody is allowed to muck with the same
file, because everybody pays attention to *.lock.  People will not
see what we are doing until we release the lock because we are not
writing into the primary file.  And people can see what we did in
its entirety once we are done because we close and rename to commit
our changes atomically.

Think what CLOSE_LOCK adds to that and you would appreciate its
usefulness and at the same time realize its limitation.  By allowing
us to flush what we wrote to the disk without releasing the lock, we
can give others (e.g. subprograms we spawn) controlled access to the
new contents we prepared before we commit the result to the outside
world.  The access is controlled because we are in control when we
tell these others to peek into or update the temporary file *.lock.

The original implementaiton of CLOSE_LOCK is limited in that you can
do so only once; you take a lock, you do your thing, you close, you
let (one or more) others see it, and you commit (or rollback).  You
cannot do another of your thing once you close with the original
implementation because there was no way to reopen.

What do you gain by your proposal to lock index.lock file?  We
know we already have index.lock, so nobody should be competing on
mucking with its contents with us and we gain nothing by writing
into index.lock.lock and renaming it to index.lock.  We are in total
control of the lifetime of index.lock, when we spawn subprograms on
it to let them write to it, when we ourselves write to it, when we
spawn subprograms on it to let them read from it, all under the lock
we have on the index, i.e. index.lock.

The only thing use of another temporary file (that does not have to
be a lock on index.lock, by the way, because we have total control
over when and who updates the file while we hold the index.lock)
would give you is that it allows you to make the success of the do
another of your thing step optional.  While holding the lock, you
close and let add -i work on it, and after it returns, instead of
reopening, you write into yet another index.lock.lock, expecting
that it might fail and when it fails you can roll it back, leaving
the contents add -i left in index.lock intact.  If you do not
use the extra temporary file, you start from index.lock left by
add -i, write the updated index into index.lock and if you fail
to write, you have to roll back the entire index---you lose the
option to use the index left by add -i without repopulated
cache-tree.  But in the index update context, I do not think such a
complexity is not necessary.  If something fails, we should fail and
roll back the entire index.

 Now if the new write_locked_index() blows
 up after reopen_lock_file(), we have no way but die() because
 index_lock.filename can no longer be trusted.

If that is the case, lock.filename[] is getting clobbered by
somebody too early, isn't it?  I think Michael's mh/lockfile topic
(dropped from my tree due to getting stale without rerolling) used a
separate flag to indicate the validity without mucking with the
filename member, and we may want to do something like that as the
right solution to that problem.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 On Sat, Jul 12, 2014 at 11:44 AM, David Turner dtur...@twopensource.com 
 wrote:
 @@ -342,6 +342,15 @@ static char *prepare_index(int argc, const char **argv, 
 const char *prefix,

 discard_cache();
 read_cache_from(index_lock.filename);
 +   if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) {
 +   fd = open(index_lock.filename, O_WRONLY);
 +   if (fd = 0)
 +   if (write_cache(fd, active_cache, active_nr) 
 == 0) {
 +   close_lock_file(index_lock);

 If write_cache() returns a negative value, index.lock is probably
 corrupted. Should we die() instead of moving on and returning
 index_lock.filename to the caller? The caller may move index.lock to
 index later on and officially ruin index.

Perhaps true, but worse yet, this will not play nicely together with
your split index series, no?  After taking the lock and writing and
closing, we spawn the interactive while still holding the lock, and
the open we see here is because we want to further update the same
under the same lock.  Perhaps write_locked_index() API in the split
index series can notice that the underlying fd in index_lock has
been closed earlier, realize that the call is to re-update the
index under the same lock and open the file again for writing?

Then we can update the above open() then write_cache() sequence
with just a call to write_locked_index().



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Ramsay Jones
On 14/07/14 16:54, Junio C Hamano wrote:
 Duy Nguyen pclo...@gmail.com writes:
 
 On Sat, Jul 12, 2014 at 11:44 AM, David Turner dtur...@twopensource.com 
 wrote:
 @@ -342,6 +342,15 @@ static char *prepare_index(int argc, const char 
 **argv, const char *prefix,

 discard_cache();
 read_cache_from(index_lock.filename);
 +   if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) {
 +   fd = open(index_lock.filename, O_WRONLY);
 +   if (fd = 0)
 +   if (write_cache(fd, active_cache, 
 active_nr) == 0) {
 +   close_lock_file(index_lock);

 If write_cache() returns a negative value, index.lock is probably
 corrupted. Should we die() instead of moving on and returning
 index_lock.filename to the caller? The caller may move index.lock to
 index later on and officially ruin index.
 
 Perhaps true, but worse yet, this will not play nicely together with
 your split index series, no?  After taking the lock and writing and
 closing, we spawn the interactive while still holding the lock, and
 the open we see here is because we want to further update the same
 under the same lock.  Perhaps write_locked_index() API in the split
 index series can notice that the underlying fd in index_lock has
 been closed earlier, realize that the call is to re-update the
 index under the same lock and open the file again for writing?

Hmm, I was just about to suggest that there was some negative interplay
between the 'dt/cache-tree-repair' and 'nd/split-index' branches as well.

The pu branch fails the testsuite for me. In particular, t0090-cache-tree.sh
fails like so:

$ ./t0090-cache-tree.sh -i -v
...
ok 9 - second commit has cache-tree

expecting success: 
cat -\EOT foo.c 
int foo()
{
return 42;
}
int bar()
{
return 42;
}
EOT
git add foo.c 
test_invalid_cache_tree 
git commit -m add a file 
test_cache_tree 
cat -\EOT foo.c 
int foo()
{
return 43;
}
int bar()
{
return 44;
}
EOT
(echo p; echo 1; echo; echo s; echo n; echo y; echo q) |
git commit --interactive -m foo 
test_cache_tree

[master d1075a6] add a file
 Author: A U Thor aut...@example.com
 1 file changed, 8 insertions(+)
 create mode 100644 foo.c
   staged unstaged path
  1:unchanged+2/-2 foo.c

*** Commands ***
  1: [s]tatus 2: [u]pdate 3: [r]evert 4: [a]dd untracked
  5: [p]atch  6: [d]iff   7: [q]uit   8: [h]elp
What nowstaged unstaged path
  1:unchanged+2/-2 [f]oo.c
Patch updatestaged unstaged path
* 1:unchanged+2/-2 [f]oo.c
Patch update diff --git a/foo.c b/foo.c
index 75522e2..3f7f049 100644
--- a/foo.c
+++ b/foo.c
@@ -1,8 +1,8 @@
 int foo()
 {
-return 42;
+return 43;
 }
 int bar()
 {
-return 42;
+return 44;
 }
Stage this hunk [y,n,q,a,d,/,s,e,?]? Split into 2 hunks.
@@ -1,6 +1,6 @@
 int foo()
 {
-return 42;
+return 43;
 }
 int bar()
 {
Stage this hunk [y,n,q,a,d,/,j,J,g,e,?]? @@ -4,5 +4,5 @@
 }
 int bar()
 {
-return 42;
+return 44;
 }
Stage this hunk [y,n,q,a,d,/,K,g,e,?]? 
*** Commands ***
  1: [s]tatus 2: [u]pdate 3: [r]evert 4: [a]dd untracked
  5: [p]atch  6: [d]iff   7: [q]uit   8: [h]elp
What now Bye.
[master 65d7dde] foo
 Author: A U Thor aut...@example.com
 1 file changed, 1 insertion(+), 1 deletion(-)
--- expect  2014-07-14 17:10:13.755209229 +
+++ filtered2014-07-14 17:10:13.763209258 +
@@ -1 +1 @@
-SHA  (3 entries, 0 subtrees)
+invalid   (0 subtrees)
not ok 10 - commit --interactive gives cache-tree on partial commit
#   
#   cat -\EOT foo.c 
#   int foo()
#   {
#   return 42;
#   }
#   int bar()
#   {
#   return 42;
#   }
#   EOT
#   git add foo.c 
#   test_invalid_cache_tree 
#   git commit -m add a file 
#   test_cache_tree 
#   cat -\EOT foo.c 
#   int foo()
#   {
#   return 43;
#   }
#   int bar()
#   {
#   return 44;
#   }
#   EOT
#   (echo p; echo 1; echo; echo s; echo n; echo y; echo q) |
#   git commit --interactive -m foo 
#   test_cache_tree
#   
$ 

Note that 

Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Duy Nguyen pclo...@gmail.com writes:

 On Sat, Jul 12, 2014 at 11:44 AM, David Turner dtur...@twopensource.com 
 wrote:
 @@ -342,6 +342,15 @@ static char *prepare_index(int argc, const char 
 **argv, const char *prefix,

 discard_cache();
 read_cache_from(index_lock.filename);
 +   if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) {
 +   fd = open(index_lock.filename, O_WRONLY);
 +   if (fd = 0)
 +   if (write_cache(fd, active_cache, 
 active_nr) == 0) {
 +   close_lock_file(index_lock);

 If write_cache() returns a negative value, index.lock is probably
 corrupted. Should we die() instead of moving on and returning
 index_lock.filename to the caller? The caller may move index.lock to
 index later on and officially ruin index.

 Perhaps true, but worse yet, this will not play nicely together with
 your split index series, no?  After taking the lock and writing and
 closing, we spawn the interactive while still holding the lock, and
 the open we see here is because we want to further update the same
 under the same lock.  Perhaps write_locked_index() API in the split
 index series can notice that the underlying fd in index_lock has
 been closed earlier, realize that the call is to re-update the
 index under the same lock and open the file again for writing?

 Then we can update the above open() then write_cache() sequence
 with just a call to write_locked_index().

Or perhaps introduce reopen_locked_file() and have these rare
callers that want to re-update after they closed the file call it,
in order to avoid mistake?

Perhaps something like this on top of your series?

-- 8 --
Subject: [PATCH] lockfile: allow reopening a closed but still locked file

In some code paths (e.g. giving add -i to prepare the contents to
be committed interactively inside commit -p) where a caller takes
a lock, writes the new content, give chance for others to use it
while still holding the lock, and then releases the lock when all is
done.  As an extension, allow the caller to re-update an already
closed file while still holding the lock (i.e. not yet committed) by
re-opening the file, to be followed by updating the contents and
then by the usual close_lock_file() or commit_lock_file().

This is necessary if we want to add code to rebuild the cache-tree
and write the resulting index out after add -i returns the control
to commit -p, for example.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 cache.h|  7 ---
 lockfile.c | 51 +--
 2 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/cache.h b/cache.h
index c6b7770..a9344cf 100644
--- a/cache.h
+++ b/cache.h
@@ -567,12 +567,13 @@ extern NORETURN void unable_to_lock_index_die(const char 
*path, int err);
 extern int hold_lock_file_for_update(struct lock_file *, const char *path, 
int);
 extern int hold_lock_file_for_append(struct lock_file *, const char *path, 
int);
 extern int commit_lock_file(struct lock_file *);
-extern void update_index_if_able(struct index_state *, struct lock_file *);
+extern int reopen_lock_file(struct lock_file *);
+extern int close_lock_file(struct lock_file *);
+extern void rollback_lock_file(struct lock_file *);
 
+extern void update_index_if_able(struct index_state *, struct lock_file *);
 extern int hold_locked_index(struct lock_file *, int);
 extern void set_alternate_index_output(const char *);
-extern int close_lock_file(struct lock_file *);
-extern void rollback_lock_file(struct lock_file *);
 extern int delete_ref(const char *, const unsigned char *sha1, int delopt);
 
 /* Environment bits from configuration mechanism */
diff --git a/lockfile.c b/lockfile.c
index b706614..618a4b2 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -120,21 +120,8 @@ static char *resolve_symlink(char *p, size_t s)
return p;
 }
 
-
-static int lock_file(struct lock_file *lk, const char *path, int flags)
+static int lock_file_open(struct lock_file *lk)
 {
-   /*
-* subtract 5 from size to make sure there's room for adding
-* .lock for the lock file name
-*/
-   static const size_t max_path_len = sizeof(lk-filename) - 5;
-
-   if (strlen(path) = max_path_len)
-   return -1;
-   strcpy(lk-filename, path);
-   if (!(flags  LOCK_NODEREF))
-   resolve_symlink(lk-filename, max_path_len);
-   strcat(lk-filename, .lock);
lk-fd = open(lk-filename, O_RDWR | O_CREAT | O_EXCL, 0666);
if (0 = lk-fd) {
if (!lock_file_list) {
@@ -156,6 +143,23 @@ static int lock_file(struct lock_file *lk, const char 
*path, int flags)
return lk-fd;
 }
 
+static int lock_file(struct lock_file *lk, const char *path, int flags)
+{
+   /*
+* subtract 5 from size to make sure there's room for adding
+* .lock 

Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Junio C Hamano
Ramsay Jones ram...@ramsay1.demon.co.uk writes:

 that the merge commit 7608c87e fails. Looking at the details of the
 merge resolution, made me think of Duy's split index work.

Yes, there is a deliberately dropped hunk from dt/cache-tree-repair
in that merge, because the topic relied on being able to say here
is the file descriptor, write the index to it, which no longer is
available with the split-index topic.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Ramsay Jones
On 14/07/14 18:51, Junio C Hamano wrote:
 Ramsay Jones ram...@ramsay1.demon.co.uk writes:
 
 that the merge commit 7608c87e fails. Looking at the details of the
 merge resolution, made me think of Duy's split index work.
 
 Yes, there is a deliberately dropped hunk from dt/cache-tree-repair
 in that merge, because the topic relied on being able to say here
 is the file descriptor, write the index to it, which no longer is
 available with the split-index topic.

Ah, OK. Sounds like everything is under control then.

Thanks.

ATB,
Ramsay Jones



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Junio C Hamano
Ramsay Jones ram...@ramsay1.demon.co.uk writes:

 On 14/07/14 18:51, Junio C Hamano wrote:
 Ramsay Jones ram...@ramsay1.demon.co.uk writes:
 
 that the merge commit 7608c87e fails. Looking at the details of the
 merge resolution, made me think of Duy's split index work.
 
 Yes, there is a deliberately dropped hunk from dt/cache-tree-repair
 in that merge, because the topic relied on being able to say here
 is the file descriptor, write the index to it, which no longer is
 available with the split-index topic.

 Ah, OK. Sounds like everything is under control then.

Wasn't, but now I think it is ;-)

David, could you please double check the conflict resolution at
882426ea (Merge branch 'dt/cache-tree-repair' into jch, 2014-07-14),
at about the middle between master..pu?  By eyeballing

git diff 882426ea^ 882426ea

we should see what your series would have done if it were based on
top of the nd/split-index topic.  The most iffy is the first hunk of
change to builtin/commit.c, which is more or less my rewrite of what
you did on top of 'master'.

The change to builtin/checkout.c also seems somewhat iffy in that we
treat the_index.cache_tree (aka active_cache_tree) as if cache
trees are something we can manipulate independent of a particular
index_state (which has been the rule for a long time), even though
in the world order after nd/split-index topic, cache_tree_update()
can no longer be used on a cache-tree that is not associated to a
particular index_state.  It is not a problem with your series, but
comes from nd/split-index topic, and it might indicate a slight
unevenness of the API (i.e. we may want to either insist that the
public API to muck with a cache-tree outside cache-tree.c must be
accessed via an index-state and never via a bare cache-tree
structure, by insisting that cache_tree_fully_valid() to take a
pointer to an index-state as well; or we may want to go the other
way and allow API users to pass a bare cache-tree without the
index-state when the latter is not absolutely necessary, by changing
cache_tree_update() to take a cache-tree, not an index-state).

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread David Turner
On Mon, 2014-07-14 at 15:16 -0700, Junio C Hamano wrote:
 Ramsay Jones ram...@ramsay1.demon.co.uk writes:
 
  On 14/07/14 18:51, Junio C Hamano wrote:
  Ramsay Jones ram...@ramsay1.demon.co.uk writes:
  
  that the merge commit 7608c87e fails. Looking at the details of the
  merge resolution, made me think of Duy's split index work.
  
  Yes, there is a deliberately dropped hunk from dt/cache-tree-repair
  in that merge, because the topic relied on being able to say here
  is the file descriptor, write the index to it, which no longer is
  available with the split-index topic.
 
  Ah, OK. Sounds like everything is under control then.
 
 Wasn't, but now I think it is ;-)
 
 David, could you please double check the conflict resolution at
 882426ea (Merge branch 'dt/cache-tree-repair' into jch, 2014-07-14),
 at about the middle between master..pu?  By eyeballing
 
 git diff 882426ea^ 882426ea
 
 we should see what your series would have done if it were based on
 top of the nd/split-index topic.  The most iffy is the first hunk of
 change to builtin/commit.c, which is more or less my rewrite of what
 you did on top of 'master'.

LGTM.  And the tests all pass.

 The change to builtin/checkout.c also seems somewhat iffy in that we
 treat the_index.cache_tree (aka active_cache_tree) as if cache
 trees are something we can manipulate independent of a particular
 index_state (which has been the rule for a long time), even though
 in the world order after nd/split-index topic, cache_tree_update()
 can no longer be used on a cache-tree that is not associated to a
 particular index_state.  It is not a problem with your series, but
 comes from nd/split-index topic, and it might indicate a slight
 unevenness of the API (i.e. we may want to either insist that the
 public API to muck with a cache-tree outside cache-tree.c must be
 accessed via an index-state and never via a bare cache-tree
 structure, by insisting that cache_tree_fully_valid() to take a
 pointer to an index-state as well; or we may want to go the other
 way and allow API users to pass a bare cache-tree without the
 index-state when the latter is not absolutely necessary, by changing
 cache_tree_update() to take a cache-tree, not an index-state).

I agree that some sort of API updates like would be an improvement. But
this seems to work for now.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-14 Thread Duy Nguyen
On Tue, Jul 15, 2014 at 5:16 AM, Junio C Hamano gits...@pobox.com wrote:
 Ramsay Jones ram...@ramsay1.demon.co.uk writes:

 On 14/07/14 18:51, Junio C Hamano wrote:
 Ramsay Jones ram...@ramsay1.demon.co.uk writes:

 that the merge commit 7608c87e fails. Looking at the details of the
 merge resolution, made me think of Duy's split index work.

 Yes, there is a deliberately dropped hunk from dt/cache-tree-repair
 in that merge, because the topic relied on being able to say here
 is the file descriptor, write the index to it, which no longer is
 available with the split-index topic.

 Ah, OK. Sounds like everything is under control then.

 Wasn't, but now I think it is ;-)

 David, could you please double check the conflict resolution at
 882426ea (Merge branch 'dt/cache-tree-repair' into jch, 2014-07-14),
 at about the middle between master..pu?  By eyeballing

 git diff 882426ea^ 882426ea

 we should see what your series would have done if it were based on
 top of the nd/split-index topic.  The most iffy is the first hunk of
 change to builtin/commit.c, which is more or less my rewrite of what
 you did on top of 'master'.

It makes me wonder if a cleaner way of rebuilding cache-treei in this
case is from git-add--interactive.perl, or by simply spawn git
update-index --rebuild-cache-tree after running
git-add--interactive.perl.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-12 Thread Duy Nguyen
On Sat, Jul 12, 2014 at 11:44 AM, David Turner dtur...@twopensource.com wrote:
 @@ -342,6 +342,15 @@ static char *prepare_index(int argc, const char **argv, 
 const char *prefix,

 discard_cache();
 read_cache_from(index_lock.filename);
 +   if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) {
 +   fd = open(index_lock.filename, O_WRONLY);
 +   if (fd = 0)
 +   if (write_cache(fd, active_cache, active_nr) 
 == 0) {
 +   close_lock_file(index_lock);

If write_cache() returns a negative value, index.lock is probably
corrupted. Should we die() instead of moving on and returning
index_lock.filename to the caller? The caller may move index.lock to
index later on and officially ruin index.

 +   }
 +   }
 +   else
 +   fprintf(stderr, FAiled to update main cache tree\n);

make the above line something like this for i18n support:

fprintf_ln(stderr, _(Failed to update main cache tree));


 commit_style = COMMIT_NORMAL;
 return index_lock.filename;
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

2014-07-11 Thread David Turner
During the commit process, update the cache-tree. Write this updated
cache-tree so that it's ready for subsequent commands.

Add test code which demonstrates that git commit now writes the cache
tree.  Make all tests test the entire cache-tree, not just the root
level.

Signed-off-by: David Turner dtur...@twitter.com
---
 builtin/commit.c  |  16 ++-
 t/t0090-cache-tree.sh | 117 +++---
 2 files changed, 117 insertions(+), 16 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 9cfef6c..d6681c4 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -342,6 +342,15 @@ static char *prepare_index(int argc, const char **argv, 
const char *prefix,
 
discard_cache();
read_cache_from(index_lock.filename);
+   if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) {
+   fd = open(index_lock.filename, O_WRONLY);
+   if (fd = 0)
+   if (write_cache(fd, active_cache, active_nr) == 
0) {
+   close_lock_file(index_lock);
+   }
+   }
+   else
+   fprintf(stderr, FAiled to update main cache tree\n);
 
commit_style = COMMIT_NORMAL;
return index_lock.filename;
@@ -383,8 +392,12 @@ static char *prepare_index(int argc, const char **argv, 
const char *prefix,
if (!only  !pathspec.nr) {
fd = hold_locked_index(index_lock, 1);
refresh_cache_or_die(refresh_flags);
-   if (active_cache_changed) {
+   if (active_cache_changed
+   || !cache_tree_fully_valid(active_cache_tree)) {
update_main_cache_tree(WRITE_TREE_SILENT);
+   active_cache_changed = 1;
+   }
+   if (active_cache_changed) {
if (write_cache(fd, active_cache, active_nr) ||
commit_locked_index(index_lock))
die(_(unable to write new_index file));
@@ -435,6 +448,7 @@ static char *prepare_index(int argc, const char **argv, 
const char *prefix,
fd = hold_locked_index(index_lock, 1);
add_remove_files(partial);
refresh_cache(REFRESH_QUIET);
+   update_main_cache_tree(WRITE_TREE_SILENT);
if (write_cache(fd, active_cache, active_nr) ||
close_lock_file(index_lock))
die(_(unable to write new_index file));
diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
index 3a3342e..48c4240 100755
--- a/t/t0090-cache-tree.sh
+++ b/t/t0090-cache-tree.sh
@@ -8,7 +8,7 @@ cache-tree extension.
  . ./test-lib.sh
 
 cmp_cache_tree () {
-   test-dump-cache-tree actual 
+   test-dump-cache-tree | sed -e '/#(ref)/d' actual 
sed s/$_x40/SHA/ actual filtered 
test_cmp $1 filtered
 }
@@ -16,14 +16,38 @@ cmp_cache_tree () {
 # We don't bother with actually checking the SHA1:
 # test-dump-cache-tree already verifies that all existing data is
 # correct.
-test_shallow_cache_tree () {
-   printf SHA  (%d entries, 0 subtrees)\n $(git ls-files|wc -l) expect 

+generate_expected_cache_tree_rec () {
+   dir=$1${1:+/} 
+   parent=$2 
+   # ls-files might have foo/bar, foo/bar/baz, and foo/bar/quux
+   # We want to count only foo because it's the only direct child
+   subtrees=$(git ls-files|grep /|cut -d / -f 1|uniq) 
+   subtree_count=$(echo $subtrees|awk '$1 {++c} END {print c}') 
+   entries=$(git ls-files|wc -l) 
+   printf SHA $dir (%d entries, %d subtrees)\n $entries 
$subtree_count 
+   for subtree in $subtrees
+   do
+   cd $subtree
+   generate_expected_cache_tree_rec $dir$subtree $dir || 
return 1
+   cd ..
+   done 
+   dir=$parent
+}
+
+generate_expected_cache_tree () {
+   (
+   generate_expected_cache_tree_rec
+   )
+}
+
+test_cache_tree () {
+   generate_expected_cache_tree expect 
cmp_cache_tree expect
 }
 
 test_invalid_cache_tree () {
printf invalid  %s ()\n  $@ 
expect 
-   test-dump-cache-tree | \
+   test-dump-cache-tree |
sed -n -e s/[0-9]* subtrees// -e '/#(ref)/d' -e '/^invalid /p' 
actual 
test_cmp expect actual
 }
@@ -33,14 +57,14 @@ test_no_cache_tree () {
cmp_cache_tree expect
 }
 
-test_expect_failure 'initial commit has cache-tree' '
+test_expect_success 'initial commit has cache-tree' '
test_commit foo 
-   test_shallow_cache_tree
+   test_cache_tree
 '
 
 test_expect_success 'read-tree HEAD establishes cache-tree' '
git read-tree HEAD 
-   test_shallow_cache_tree
+   test_cache_tree
 '
 
 test_expect_success 'git-add invalidates cache-tree' '
@@ -58,6 +82,18 @@ test_expect_success 'git-add in subdir invalidates 
cache-tree' '