Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-30 Thread Ronnie Sahlberg
On Wed, Oct 29, 2014 at 11:43 AM, Junio C Hamano  wrote:
> Ronnie Sahlberg  writes:
>
>> On Tue, Oct 28, 2014 at 2:12 PM, Junio C Hamano  wrote:
>>
>>> More importantly, when you know that the end result you want to see
>>> is that the old and new log files are bit-for-bit identical, and if
>>> not there is some bug in either parsing or formatting, why parse the
>>> old and reformat into the new?  What would happen when there were
>>> malformed entries in the old that makes your parsing fail?
>>>
>>
>> Fair enough. I will change it to ONLY use a transaction for the actual
>> ref update and keep using rename() for the reflog handling.
>> Only real change I will do for the reflog handling is to change the
>> temporary file name used to be less collission prone if there are two
>> renames happening at the same time
>> so that they don't destroy each others reflogs.
>
> I think it is a good idea to make renaming the entire reflog a
> logical element of transaction (as you mentioned in our private
> discussion) to allow different backends implement in their best
> efficient & robust way.

Right. I have changed it to use an optimized function to read the
whole existing reflog as a blob into a strbuf
and then a new transaction function   transaction_replace_reflog(...
the-blob ...) to write the whole blob back to the new location.


>
> And for filesystem-backed backends, I actually think "keep the
> original until we know we do not have to roll back", that follows
> the same pattern for the other transactional updates, is a good
> implementation of that "best efficient & robust way", compared to
> the original "just rename it".  It frees us from having to be
> worried about what happens if we cannot rename it back.
>
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-29 Thread Junio C Hamano
Ronnie Sahlberg  writes:

> On Tue, Oct 28, 2014 at 2:12 PM, Junio C Hamano  wrote:
>
>> More importantly, when you know that the end result you want to see
>> is that the old and new log files are bit-for-bit identical, and if
>> not there is some bug in either parsing or formatting, why parse the
>> old and reformat into the new?  What would happen when there were
>> malformed entries in the old that makes your parsing fail?
>>
>
> Fair enough. I will change it to ONLY use a transaction for the actual
> ref update and keep using rename() for the reflog handling.
> Only real change I will do for the reflog handling is to change the
> temporary file name used to be less collission prone if there are two
> renames happening at the same time
> so that they don't destroy each others reflogs.

I think it is a good idea to make renaming the entire reflog a
logical element of transaction (as you mentioned in our private
discussion) to allow different backends implement in their best
efficient & robust way.

And for filesystem-backed backends, I actually think "keep the
original until we know we do not have to roll back", that follows
the same pattern for the other transactional updates, is a good
implementation of that "best efficient & robust way", compared to
the original "just rename it".  It frees us from having to be
worried about what happens if we cannot rename it back.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-29 Thread Ronnie Sahlberg
On Tue, Oct 28, 2014 at 2:12 PM, Junio C Hamano  wrote:
> Ronnie Sahlberg  writes:
>
>> I timed a git branch -m for a branch with ~2400 log entries and it
>> takes neglible time :
>>   real 0m0.008s
>>   user 0m0.000s
>>   sys 0m0.007s
>
> I really hate this line of reasoning.  Small things tend to add up.
>
> More importantly, when you know that the end result you want to see
> is that the old and new log files are bit-for-bit identical, and if
> not there is some bug in either parsing or formatting, why parse the
> old and reformat into the new?  What would happen when there were
> malformed entries in the old that makes your parsing fail?
>

Fair enough. I will change it to ONLY use a transaction for the actual
ref update and keep using rename() for the reflog handling.
Only real change I will do for the reflog handling is to change the
temporary file name used to be less collission prone if there are two
renames happening at the same time
so that they don't destroy each others reflogs.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-28 Thread Junio C Hamano
Ronnie Sahlberg  writes:

> I timed a git branch -m for a branch with ~2400 log entries and it
> takes neglible time :
>   real 0m0.008s
>   user 0m0.000s
>   sys 0m0.007s

I really hate this line of reasoning.  Small things tend to add up.

More importantly, when you know that the end result you want to see
is that the old and new log files are bit-for-bit identical, and if
not there is some bug in either parsing or formatting, why parse the
old and reformat into the new?  What would happen when there were
malformed entries in the old that makes your parsing fail?


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-28 Thread Ronnie Sahlberg
On Tue, Oct 28, 2014 at 12:56 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> Ronnie Sahlberg  writes:
>>
>>> commit 0295e9cebc41020ee84da275549b164a8770ffba upstream.
>>>
>>> Change refs.c to use a single transaction to copy/rename both the refs and
>>> its reflog. Since we are no longer using rename() to move the reflog file
>>> we no longer need to disallow rename_ref for refs with a symlink for its
>>> reflog so we can remove that test from the testsuite.
>>
>> Do you mean that we used to do a single rename(2) to move the entire
>> logfile, but now you copy potentially thousands of reflog entries
>> one by one?
>>
>> H,... is that an improvement?

I think so. It makes to code a lot simpler and more atomic. As a side
effect it removes restrictions for symlink handling and eliminates the
two renames colliding race.
Though, a read and then rewrite thousands of reflog entries will be
slower than a single rename() syscall.

>
> I see some value in "keep the original while creating a new one,
> just in case we fail to fully recreate the new one so that we can
> roll back with less programming effort".  But still, we should be
> able to copy the original to new without parsing and reformatting
> each and every entry, no?

Is renaming a branch with a long history is such a frequent or time
critical event?
I timed a git branch -m for a branch with ~2400 log entries and it
takes neglible time :
  real 0m0.008s
  user 0m0.000s
  sys 0m0.007s


During the special rename case, we are deleting one ref and creating
another. For cases such as m->m/m or the reverse we must delete the
old file/directory before we can create the new one.

The old rename code did this by renaming the file out to a common
directory and then back to the new location. Which is fast (but a bit
...)
The alternative is to read the old file into memory, delete it and
then write the content back to the new location, which is kind of what
the new code does.

If this turns out to be a bottleneck we can change the io when writing
the reflog entries to use fwrite(). Lets see if there is a problem
first.

regards
ronnie sahlberg
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-28 Thread Junio C Hamano
Junio C Hamano  writes:

> Ronnie Sahlberg  writes:
>
>> commit 0295e9cebc41020ee84da275549b164a8770ffba upstream.
>>
>> Change refs.c to use a single transaction to copy/rename both the refs and
>> its reflog. Since we are no longer using rename() to move the reflog file
>> we no longer need to disallow rename_ref for refs with a symlink for its
>> reflog so we can remove that test from the testsuite.
>
> Do you mean that we used to do a single rename(2) to move the entire
> logfile, but now you copy potentially thousands of reflog entries
> one by one?
>
> H,... is that an improvement?

I see some value in "keep the original while creating a new one,
just in case we fail to fully recreate the new one so that we can
roll back with less programming effort".  But still, we should be
able to copy the original to new without parsing and reformatting
each and every entry, no?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-28 Thread Junio C Hamano
Ronnie Sahlberg  writes:

> commit 0295e9cebc41020ee84da275549b164a8770ffba upstream.
>
> Change refs.c to use a single transaction to copy/rename both the refs and
> its reflog. Since we are no longer using rename() to move the reflog file
> we no longer need to disallow rename_ref for refs with a symlink for its
> reflog so we can remove that test from the testsuite.

Do you mean that we used to do a single rename(2) to move the entire
logfile, but now you copy potentially thousands of reflog entries
one by one?

H,... is that an improvement?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/15] refs.c: update rename_ref to use a transaction

2014-10-21 Thread Ronnie Sahlberg
commit 0295e9cebc41020ee84da275549b164a8770ffba upstream.

Change refs.c to use a single transaction to copy/rename both the refs and
its reflog. Since we are no longer using rename() to move the reflog file
we no longer need to disallow rename_ref for refs with a symlink for its
reflog so we can remove that test from the testsuite.

Change the function to return 1 on failure instead of either -1 or 1.

These changes make the rename_ref operation atomic. This also eliminates the
need to use rename() to shift the reflog around via a temporary filename.
As an extension to this, since we no longer use rename() on the reflog file,
we can now safely perform renames even if the reflog is a symbolic link
and thus can remove the check and fail for that condition.

Change-Id: I59477d410a34298a29cf0cbf32328b9053b158fe
Signed-off-by: Ronnie Sahlberg 
Signed-off-by: Jonathan Nieder 
---
 refs.c| 192 --
 t/t3200-branch.sh |   7 --
 2 files changed, 70 insertions(+), 129 deletions(-)

diff --git a/refs.c b/refs.c
index b9c8f91..f43fef4 100644
--- a/refs.c
+++ b/refs.c
@@ -2752,58 +2752,26 @@ int delete_ref(const char *refname, const unsigned char 
*sha1, int delopt)
return 0;
 }
 
-/*
- * People using contrib's git-new-workdir have .git/logs/refs ->
- * /some/other/path/.git/logs/refs, and that may live on another device.
- *
- * IOW, to avoid cross device rename errors, the temporary renamed log must
- * live into logs/refs.
- */
-#define TMP_RENAMED_LOG  "logs/refs/.tmp-renamed-log"
+struct rename_reflog_cb {
+   struct transaction *transaction;
+   const char *refname;
+   struct strbuf *err;
+};
 
-static int rename_tmp_log(const char *newrefname)
+static int rename_reflog_ent(unsigned char *osha1, unsigned char *nsha1,
+const char *id, unsigned long timestamp, int tz,
+const char *message, void *cb_data)
 {
-   int attempts_remaining = 4;
+   struct rename_reflog_cb *cb = cb_data;
+   struct reflog_committer_info ci;
 
- retry:
-   switch (safe_create_leading_directories(git_path("logs/%s", 
newrefname))) {
-   case SCLD_OK:
-   break; /* success */
-   case SCLD_VANISHED:
-   if (--attempts_remaining > 0)
-   goto retry;
-   /* fall through */
-   default:
-   error("unable to create directory for %s", newrefname);
-   return -1;
-   }
-
-   if (rename(git_path(TMP_RENAMED_LOG), git_path("logs/%s", newrefname))) 
{
-   if ((errno==EISDIR || errno==ENOTDIR) && --attempts_remaining > 
0) {
-   /*
-* rename(a, b) when b is an existing
-* directory ought to result in ISDIR, but
-* Solaris 5.8 gives ENOTDIR.  Sheesh.
-*/
-   if (remove_empty_directories(git_path("logs/%s", 
newrefname))) {
-   error("Directory not empty: logs/%s", 
newrefname);
-   return -1;
-   }
-   goto retry;
-   } else if (errno == ENOENT && --attempts_remaining > 0) {
-   /*
-* Maybe another process just deleted one of
-* the directories in the path to newrefname.
-* Try again from the beginning.
-*/
-   goto retry;
-   } else {
-   error("unable to move logfile "TMP_RENAMED_LOG" to 
logs/%s: %s",
-   newrefname, strerror(errno));
-   return -1;
-   }
-   }
-   return 0;
+   memset(&ci, 0, sizeof(ci));
+   ci.id = id;
+   ci.timestamp = timestamp;
+   ci.tz = tz;
+   return transaction_update_reflog(cb->transaction, cb->refname,
+nsha1, osha1, &ci, message, 0,
+cb->err);
 }
 
 static int rename_ref_available(const char *oldname, const char *newname)
@@ -2823,91 +2791,71 @@ static int write_ref_sha1(struct ref_lock *lock, const 
unsigned char *sha1,
 
 int rename_ref(const char *oldrefname, const char *newrefname, const char 
*logmsg)
 {
-   unsigned char sha1[20], orig_sha1[20];
-   int flag = 0, logmoved = 0;
-   struct ref_lock *lock;
-   struct stat loginfo;
-   int log = !lstat(git_path("logs/%s", oldrefname), &loginfo);
+   unsigned char sha1[20];
+   int flag = 0, log;
+   struct transaction *transaction = NULL;
+   struct strbuf err = STRBUF_INIT;
const char *symref = NULL;
+   struct rename_reflog_cb cb;
+   struct reflog_committer_info ci;
 
-   if (log && S_ISLNK(loginfo.st_mode))
-   return error("reflog for %s is a symlink", oldrefname