Re: [PATCH 2/4] pack v4: add v4_size to struct delta_base_cache_entry

2013-09-15 Thread Duy Nguyen
On Sat, Sep 14, 2013 at 11:22 AM, Nicolas Pitre n...@fluxnic.net wrote:
 The cache is currently updated by the caller.  The caller may ask for a
 copy of 2 entries from a base object, but that base object may itself
 copy those objects from somewhere else in a larger chunk.

 Let's consider this example:

 tree A
 --
 0 (0) copy 2 entries from tree B starting at entry 0
 1 (2) copy 1 entry from tree B starting at entry 3

 tree B
 --
 0 (0) copy 6 entries from tree C starting at entry 0
 1 (6) entry foo.txt
 2 (7) entry bar.txt

 Right now, the code calls decode_entries() to decode 2 entries from tree
 B but those entries are part of a copy from tree C.  When that call
 returns, the cache is updated as if tree B entry #2 would start at
 offset 1 but this is wrong because offset 0 in tree B covers 6 entries
 and therefore offset 1 is for entry #6.

 So this needs a rethink.

I've given it some thought and see no simple/efficient way do it when
2+ depth is involved. Ideally tree A should refer to tree C directly
for the first two entries, but in general we can't enforce that a copy
sequence must refer to non-copy sequences only. Caching flattened tree
B up until the 6th entry may help, but then there's no need to cache
offsets anymore because we could just cache tree A..
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: add a config option to control orderfile

2013-09-15 Thread Michael S. Tsirkin
On Wed, Sep 04, 2013 at 12:08:15AM +0300, Michael S. Tsirkin wrote:
 On Tue, Sep 03, 2013 at 10:12:18AM -0700, Junio C Hamano wrote:
  Michael S. Tsirkin m...@redhat.com writes:
  
   I always want my diffs to show header files first,
   then .c files, then the rest. Make it possible to
   set orderfile though a config option to achieve this.
  
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
   ---
  
  I admit that I was the guilty one who introduced the orderfile, but
  I think the feature was misguided (for one thing, it breaks the
  patch-id, I think).
 
 This should be easy to fix - just sort before applying the ID, no?

Ping. Any comment on this idea?

  So I am moderately hesitant to promote its use
  with an addition like this.
 
 The status quo just makes people insist on using a script to generate patches 
 for
 mail.  Some projects put interface documentation in the header files, in
 that case it makes sense to put .h before .c in the patch as it makes
 review easier.
 
 In fact I sometimes reorder chunks in the patch manually for readability -
 it's probably worth finding a way to support this.
 
 
diff.c | 3 +++
1 file changed, 3 insertions(+)
  
   diff --git a/diff.c b/diff.c
   index 208094f..cca0767 100644
   --- a/diff.c
   +++ b/diff.c
   @@ -264,6 +264,9 @@ int git_diff_basic_config(const char *var, const char 
   *value, void *cb)
 return 0;
 }

   +if (!strcmp(var, diff.orderfile))
   +return 
   git_config_string(default_diff_options.orderfile, var, value);
   +
 if (!prefixcmp(var, submodule.))
 return parse_submodule_config_option(var, value);
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: add a config option to control orderfile

2013-09-15 Thread Michael S. Tsirkin
On Sun, Sep 15, 2013 at 10:49:00AM +0300, Michael S. Tsirkin wrote:
 On Wed, Sep 04, 2013 at 12:08:15AM +0300, Michael S. Tsirkin wrote:
  On Tue, Sep 03, 2013 at 10:12:18AM -0700, Junio C Hamano wrote:
   Michael S. Tsirkin m...@redhat.com writes:
   
I always want my diffs to show header files first,
then .c files, then the rest. Make it possible to
set orderfile though a config option to achieve this.
   
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
   
   I admit that I was the guilty one who introduced the orderfile, but
   I think the feature was misguided (for one thing, it breaks the
   patch-id, I think).
  
  This should be easy to fix - just sort before applying the ID, no?
 
 Ping. Any comment on this idea?

Actually, after I've looked at the code,
diffcore_order is already already called for patch-id.
So patch-id shouldn't be broken by specifying the orderfile.
Why then is the feature misguided?

   So I am moderately hesitant to promote its use
   with an addition like this.
  
  The status quo just makes people insist on using a script to generate 
  patches for
  mail.  Some projects put interface documentation in the header files, in
  that case it makes sense to put .h before .c in the patch as it makes
  review easier.
  
  In fact I sometimes reorder chunks in the patch manually for readability -
  it's probably worth finding a way to support this.
  
  
 diff.c | 3 +++
 1 file changed, 3 insertions(+)
   
diff --git a/diff.c b/diff.c
index 208094f..cca0767 100644
--- a/diff.c
+++ b/diff.c
@@ -264,6 +264,9 @@ int git_diff_basic_config(const char *var, const 
char *value, void *cb)
return 0;
}
 
+if (!strcmp(var, diff.orderfile))
+return 
git_config_string(default_diff_options.orderfile, var, value);
+
if (!prefixcmp(var, submodule.))
return parse_submodule_config_option(var, value);
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] branch: use $curr_branch_short more

2013-09-15 Thread René Scharfe

Am 09.09.2013 01:13, schrieb Felipe Contreras:

On Sun, Sep 8, 2013 at 10:21 AM, René Scharfe l@web.de wrote:

Am 31.08.2013 19:20, schrieb Felipe Contreras:


A summary should contain as much information that would allow me to
skip the commit message as possible.

If I can't tell from the summary if I can safely skip the commit
message, the summary is not doing a good job.

trivial simplification explains the what, and the why at the
same time, and allows most people to skip the commit message, thus is
a good summary.



No patch should be skipped on the mailing list.  As you wrote, trivial
patches can still be wrong.


What patches each persons skips is each person's own decision, don't
be patronizing, if I want to skip a trivial patch, I will, I can't
read each and every patch from the dozens of mailing lists I'm
subscribed to, and there's no way each and every reader is going to
read each and every patch. They should be prioritized, and trivial
ones can be safely skipped by most people.


Yes, of course; someone needs to review every patch in the end, but each 
reader decides for themselves which ones to skip.  I can't keep up with 
the traffic either.


By the way, the bikeshedding phenomenon probably causes trivial patches 
to receive the most attention. :)



When going through the history I can see that quickly recognizing
insubstantial changes is useful, but if I see a summary twice then in my
mind forms a big question mark -- why did the same thing had to be done yet
again?

As an example, both 0d12e59f (pull: replace unnecessary sed invocation) and
bc2bbc45 (pull, rebase: simplify to use die()) could arguably have had the
summary trivial simplification, but I'm glad the author went with
something a bit more specific.


Well I wont. Because it takes long to read, and after reading I still
don't don't if they are trivial or not, I might actually have to read
the commit message, but to be honest, I would probably go right into
the diff itself, because judging from Git's history, chances are that
somebody wrote a novel there with the important bit I'm looking for
just at the end, to don't ruin the suspense.


Ha!  It's better to write it down at all than to miss it years later, 
when even the original author has forgotten all about it.



In the first commit, it's saying it's a single invocation, so I take
it it's trivial, but what is it replaced with? Is the code simpler, is
it more complex? I don't know, I'm still not being told *why* that
patch is made. It says 'unnecessary' but why is it unnecessary?


The sed call is unnecessary because of the fact that it can be replaced. 
:)  I'm sure I would have understood it to mean a conversion to Shell 
builtin functionality in order to avoid forking and executing an 
external command, even if I hadn't read the patch.



In the second commit, it's saying it's a simplification, but I don't
know if it's just one instance, or a thousand, so I don't know what
would be the impact of the patch.

Either way I'm forced to read more just to know if it was safe for me
to skip them, at which point the whole purpose of a summary is
defeated.


I don't see how using trivial simplification as the summary for both 
could have improved matters.



Again, triviality and correctness are two separate different things.
The patch is trivial even if you can't judge it's correctness.


Well, in terms of impact I agree.


No, in all terms. A patch can be:

Trivial and correct
Trivial and incorrect
Non-trivial and correct
Non-trivial and incorrect


Well, yes, but I thought your statement that The patch is trivial was 
referring to my actual patch which started this sub-thread.  And I meant 
that the benefit of that patch to users and programmers was small.



To me, what you are describing is an obvious patch, not a trivial one.
An obvious patch is so obvious that you can judge it's correctness
easily by looking at the diff, a trivial one is of little importance.


That's one definition; I think I had the mathematical notion in mind that
calls proofs trivial which are immediately evident.


We are not talking about mathematics, we are talking about the English
human language.


Here we were talking about source code patches.  As formal descriptions 
of changes to (mostly) programming language text they are closer to 
mathematics than English.  Using math terms when talking about them is 
not too far of a stretch.


René

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] repack: rewrite the shell script in C

2013-09-15 Thread René Scharfe

Am 29.08.2013 22:39, schrieb Stefan Beller:

The motivation of this patch is to get closer to a goal of being
able to have a core subset of git functionality built in to git.
That would mean

 * people on Windows could get a copy of at least the core parts
   of Git without having to install a Unix-style shell

 * people using git in on servers with chrooted environments
   do not need to worry about standard tools lacking for shell
   scripts.

This patch is meant to be mostly a literal translation of the
git-repack script; the intent is that later patches would start using
more library facilities, but this patch is meant to be as close to a
no-op as possible so it doesn't do that kind of thing.

Signed-off-by: Stefan Beller stefanbel...@googlemail.com
---
 Makefile|   2 +-
 builtin.h   |   1 +
 builtin/repack.c| 379 
 git-repack.sh = contrib/examples/git-repack.sh |   0
 git.c   |   1 +
 5 files changed, 382 insertions(+), 1 deletion(-)
 create mode 100644 builtin/repack.c
 rename git-repack.sh = contrib/examples/git-repack.sh (100%)

diff --git a/Makefile b/Makefile
index 3588ca1..69e5267 100644
--- a/Makefile
+++ b/Makefile
@@ -464,7 +464,6 @@ SCRIPT_SH += git-pull.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
-SCRIPT_SH += git-repack.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
 SCRIPT_SH += git-submodule.sh
@@ -971,6 +970,7 @@ BUILTIN_OBJS += builtin/reflog.o
 BUILTIN_OBJS += builtin/remote.o
 BUILTIN_OBJS += builtin/remote-ext.o
 BUILTIN_OBJS += builtin/remote-fd.o
+BUILTIN_OBJS += builtin/repack.o
 BUILTIN_OBJS += builtin/replace.o
 BUILTIN_OBJS += builtin/rerere.o
 BUILTIN_OBJS += builtin/reset.o
diff --git a/builtin.h b/builtin.h
index 8afa2de..b56cf07 100644
--- a/builtin.h
+++ b/builtin.h
@@ -102,6 +102,7 @@ extern int cmd_reflog(int argc, const char **argv, const 
char *prefix);
 extern int cmd_remote(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_ext(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_fd(int argc, const char **argv, const char *prefix);
+extern int cmd_repack(int argc, const char **argv, const char *prefix);
 extern int cmd_repo_config(int argc, const char **argv, const char *prefix);
 extern int cmd_rerere(int argc, const char **argv, const char *prefix);
 extern int cmd_reset(int argc, const char **argv, const char *prefix);
diff --git a/builtin/repack.c b/builtin/repack.c
new file mode 100644
index 000..0cc823d
--- /dev/null
+++ b/builtin/repack.c
@@ -0,0 +1,379 @@
+#include builtin.h
+#include cache.h
+#include dir.h
+#include parse-options.h
+#include run-command.h
+#include sigchain.h
+#include strbuf.h
+#include string-list.h
+#include argv-array.h
+
+static int delta_base_offset = 1;
+static char *packdir, *packtmp;
+
+static const char *const git_repack_usage[] = {
+   N_(git repack [options]),
+   NULL
+};
+
+static int repack_config(const char *var, const char *value, void *cb)
+{
+   if (!strcmp(var, repack.usedeltabaseoffset)) {
+   delta_base_offset = git_config_bool(var, value);
+   return 0;
+   }
+   return git_default_config(var, value, cb);
+}
+
+/*
+ * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
+ */
+static void remove_temporary_files(void)
+{
+   struct strbuf buf = STRBUF_INIT;
+   size_t dirlen, prefixlen;
+   DIR *dir;
+   struct dirent *e;
+
+   dir = opendir(packdir);
+   if (!dir)
+   return;
+
+   strbuf_addstr(buf, packdir);


Let's say packdir is .git/objects/pack (no trailing slash).  Then buf 
is .git/objects/pack now as well.



+
+   /* Point at the slash at the end of .../objects/pack/ */
+   dirlen = buf.len + 1;
+   strbuf_addf(buf, %s, packtmp);


packtmp starts with packdir, so by adding it here we have it twice.  buf 
is now .git/objects/pack.git/objects/pack/.tmp-123-pack (if our pid is 
123), no?


Perhaps don't add the packdir to the strbuf in the first place and 
calculate dirlen as strlen(packdir) + 1?


Besided, strbuf_addstr(x, y) is better for adding a string than 
strbuf_addf(x, %s, y).



+   /* Point at the dash at the end of .../.tmp-%d-pack- */
+   prefixlen = buf.len - dirlen;


This is the length of git/objects/pack/.tmp-123-pack now.


+
+   while ((e = readdir(dir))) {
+   if (strncmp(e-d_name, buf.buf + dirlen, prefixlen))
+   continue;
+   strbuf_setlen(buf, dirlen);
+   strbuf_addstr(buf, e-d_name);
+   unlink(buf.buf);
+   }
+   closedir(dir);
+   strbuf_release(buf);
+}
+
+static void remove_pack_on_signal(int signo)
+{
+   remove_temporary_files();
+   sigchain_pop(signo);
+   raise(signo);
+}
+
+/*
+ * 

Re: Pls Reply Soon!

2013-09-15 Thread mslizawong
i have a business proposal for you, write me back for more info.
-Sent from my ipad.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-15 Thread Sebastian Schuberth
On Sat, Sep 14, 2013 at 12:06 AM, Junio C Hamano gits...@pobox.com wrote:

 You can explicitly include the system header from your compatibility
 layer, i.e.

 === compat/mingw/string.h ===

 #define __NO_INLINE__

 #ifdef SYSTEM_STRING_H_HEADER
 #include SYSTEM_STRING_H_HEADER
 #else
 #include_next string.h
 #endif

 and then in config.mak.uname, do something like this:

 ifneq (,$(findstring MINGW,$(uname_S)))
 ifndef SYSTEM_STRING_H_HEADER
 SYSTEM_STRING_H_HEADER = C:\\llvm\include\string.h
 endif

 COMPAT_CFLAGS += -DSYSTEM_STRING_H_HEADER=$(SYSTEM_STRING_H_HEADER)
 endif

 People who have the system header file at different paths can
 further override SYSTEM_STRING_H_HEADER in their config.mak.

 That would help compilers targetting mingw that do not support
 #include_next without spreading the damage to other people's
 systems, I think.

I think this is less favorable compared to my last proposed solution.
While my work-around in git-compat-util.h from [1] already is quite
ugly, it's at least in a single place. You solution spreads the code
it multiple place, making it even more ugly and less comprehensible,
IMHO.

[1] http://www.spinics.net/lists/git/msg217546.html

-- 
Sebastian Schuberth
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] branch: use $curr_branch_short more

2013-09-15 Thread Felipe Contreras
On Sun, Sep 15, 2013 at 6:42 AM, René Scharfe l@web.de wrote:
 Am 09.09.2013 01:13, schrieb Felipe Contreras:

 On Sun, Sep 8, 2013 at 10:21 AM, René Scharfe l@web.de wrote:

 Am 31.08.2013 19:20, schrieb Felipe Contreras:

 A summary should contain as much information that would allow me to
 skip the commit message as possible.

 If I can't tell from the summary if I can safely skip the commit
 message, the summary is not doing a good job.

 trivial simplification explains the what, and the why at the
 same time, and allows most people to skip the commit message, thus is
 a good summary.



 No patch should be skipped on the mailing list.  As you wrote, trivial
 patches can still be wrong.


 What patches each persons skips is each person's own decision, don't
 be patronizing, if I want to skip a trivial patch, I will, I can't
 read each and every patch from the dozens of mailing lists I'm
 subscribed to, and there's no way each and every reader is going to
 read each and every patch. They should be prioritized, and trivial
 ones can be safely skipped by most people.


 Yes, of course; someone needs to review every patch in the end, but each
 reader decides for themselves which ones to skip.  I can't keep up with the
 traffic either.

 By the way, the bikeshedding phenomenon probably causes trivial patches to
 receive the most attention. :)

Exactly, so if the summary of the commit message allows people to skip
a patch, that is fine.

 When going through the history I can see that quickly recognizing
 insubstantial changes is useful, but if I see a summary twice then in my
 mind forms a big question mark -- why did the same thing had to be done
 yet
 again?

 As an example, both 0d12e59f (pull: replace unnecessary sed invocation)
 and
 bc2bbc45 (pull, rebase: simplify to use die()) could arguably have had
 the
 summary trivial simplification, but I'm glad the author went with
 something a bit more specific.


 Well I wont. Because it takes long to read, and after reading I still
 don't don't if they are trivial or not, I might actually have to read
 the commit message, but to be honest, I would probably go right into
 the diff itself, because judging from Git's history, chances are that
 somebody wrote a novel there with the important bit I'm looking for
 just at the end, to don't ruin the suspense.


 Ha!  It's better to write it down at all than to miss it years later, when
 even the original author has forgotten all about it.

Yes, of course, but that still means the commit message summary failed
it's purpose.

 In the first commit, it's saying it's a single invocation, so I take
 it it's trivial, but what is it replaced with? Is the code simpler, is
 it more complex? I don't know, I'm still not being told *why* that
 patch is made. It says 'unnecessary' but why is it unnecessary?


 The sed call is unnecessary because of the fact that it can be replaced. :)
 I'm sure I would have understood it to mean a conversion to Shell builtin
 functionality in order to avoid forking and executing an external command,
 even if I hadn't read the patch.

The problem is that the commit message is not for you, it's for every
reader, so the fact that you would have understood it that way is
irrelevant.

Maybe this is an exercise in the lack of empathy, and an example of
mono-culture.

 In the second commit, it's saying it's a simplification, but I don't
 know if it's just one instance, or a thousand, so I don't know what
 would be the impact of the patch.

 Either way I'm forced to read more just to know if it was safe for me
 to skip them, at which point the whole purpose of a summary is
 defeated.


 I don't see how using trivial simplification as the summary for both could
 have improved matters.

It would say trivial, which allows me and a lot of other people to
safely skip them, it's as simple as that.

 Again, triviality and correctness are two separate different things.
 The patch is trivial even if you can't judge it's correctness.


 Well, in terms of impact I agree.


 No, in all terms. A patch can be:

 Trivial and correct
 Trivial and incorrect
 Non-trivial and correct
 Non-trivial and incorrect


 Well, yes, but I thought your statement that The patch is trivial was
 referring to my actual patch which started this sub-thread.  And I meant
 that the benefit of that patch to users and programmers was small.

I don't understand what you are trying to say, the point remains; a
patch being trivial says nothing about its correctness.

 To me, what you are describing is an obvious patch, not a trivial one.
 An obvious patch is so obvious that you can judge it's correctness
 easily by looking at the diff, a trivial one is of little importance.


 That's one definition; I think I had the mathematical notion in mind that
 calls proofs trivial which are immediately evident.


 We are not talking about mathematics, we are talking about the English
 human language.


 Here we were talking about source code 

Re: [PATCH 1/2] repack: rewrite the shell script in C

2013-09-15 Thread Stefan Beller
Rene, thank you very much for the review!

On 09/15/2013 01:42 PM, René Scharfe wrote:

 +static void remove_temporary_files(void)
 +{
 +struct strbuf buf = STRBUF_INIT;
 +size_t dirlen, prefixlen;
 +DIR *dir;
 +struct dirent *e;
 +
 +dir = opendir(packdir);
 +if (!dir)
 +return;
 +
 +strbuf_addstr(buf, packdir);
 
 Let's say packdir is .git/objects/pack (no trailing slash).  Then buf
 is .git/objects/pack now as well.
 
 +
 +/* Point at the slash at the end of .../objects/pack/ */
 +dirlen = buf.len + 1;
 +strbuf_addf(buf, %s, packtmp);
 
 packtmp starts with packdir, so by adding it here we have it twice.  buf
 is now .git/objects/pack.git/objects/pack/.tmp-123-pack (if our pid is
 123), no?
 
 Perhaps don't add the packdir to the strbuf in the first place and
 calculate dirlen as strlen(packdir) + 1?
 
 Besided, strbuf_addstr(x, y) is better for adding a string than
 strbuf_addf(x, %s, y).

Oops, thanks for catching that. I'll be sending a fixup commit.

 
 +/* Point at the dash at the end of .../.tmp-%d-pack- */
 +prefixlen = buf.len - dirlen;
 
 This is the length of git/objects/pack/.tmp-123-pack now.

Also fixed.


 +static void get_non_kept_pack_filenames(struct string_list *fname_list)
 +{
 +DIR *dir;
 +struct dirent *e;
 +char *fname;
 +size_t len;
 +
 +if (!(dir = opendir(packdir)))
 +return;
 +
 +while ((e = readdir(dir)) != NULL) {
 +if (suffixcmp(e-d_name, .pack))
 +continue;
 +
 +len = strlen(e-d_name) - strlen(.pack);
 +fname = xmemdupz(e-d_name, len);
 +
 +if (!file_exists(mkpath(%s/%s.keep, packdir, fname)))
 +string_list_append_nodup(fname_list, fname);
 
 This leaks fname for packs with .keep files.
 

fixed.

 +static void remove_redundant_pack(const char *path_prefix, const char
 *hex)
 +{
 +const char *exts[] = {.pack, .idx, .keep};
 +int i;
 +struct strbuf buf = STRBUF_INIT;
 +size_t plen;
 +
 +strbuf_addf(buf, %s/%s, path_prefix, hex);
 +plen = buf.len;
 +
 +for (i = 0; i  ARRAY_SIZE(exts); i++) {
 +strbuf_setlen(buf, plen);
 +strbuf_addstr(buf, exts[i]);
 +unlink(buf.buf);
 +}
 +}
 
 hex must also include the pack- prefix and path_prefix must be a
 directory name.  Perhaps call them base_name and dir_name or similar to
 make that clearer?
 
 buf is leaked.


buf will be freed.

the parameter hex contains the pack- already.
The remove_redundant_pack function is called in a loop iterating over
elements of existing_packs, which is filled in get_non_kept_pack_filenames,
which doesn't fill in the hex, but filenames without extension.

I'll rename the variables to base_name and dir_name as you suggested.


 +failed = 0;
 +for_each_string_list_item(item, names) {
 +for (ext = 0; ext  2; ext++) {
 +char *fname, *fname_old;
 +fname = mkpathdup(%s/%s%s, packdir,
 +item-string, exts[ext]);
 +if (!file_exists(fname)) {
 +free(fname);
 +continue;
 +}
 +
 +fname_old = mkpath(%s/old-%s%s, packdir,
 +item-string, exts[ext]);
 +if (file_exists(fname_old))
 +if (unlink(fname_old))
 +failed = 1;
 +
 +if (!failed  rename(fname, fname_old)) {
 +failed = 1;
 +break;
 
 This leaks fname.

will fix.

 +if (rename(fname_old, fname))
 +die_errno(_(renaming '%s' failed), fname_old);
 
 The Shell script exits silently in that case.  How about  improving error
 reporting in a separate patch?

Moved out of the main patch.


 +if (remove_path(fname))
 +warning(_(removing '%s' failed), fname);
 
 Similar case here: The Shell script continues silently.
 

here as well.

 
 If you take care to clear the argv_arrays then perhaps do the same for
 the string_lists?  The OS cleans up after us anyway, but calming
 Valgrind or similar leak checkers is a good practice nevertheless.
 

I'll do it.


So here are the changes, I'll resend the series as well.

--8--
From 63c94df8c74c6643d6291c324661a939b9945619 Mon Sep 17 00:00:00 2001
From: Stefan Beller stefanbel...@googlemail.com
Date: Sun, 15 Sep 2013 17:05:49 +0200
Subject: [PATCH 1/2] Suggestions by Rene

---
 builtin/repack.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index a0314be..d345d16 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -39,12 +39,10 @@ static void remove_temporary_files(void)
if (!dir)
return;
 
-   strbuf_addstr(buf, packdir);
-
/* Point at the slash at the end of .../objects/pack/ */
-   dirlen = buf.len + 1;
-   strbuf_addf(buf, %s, packtmp);
-   /* Point at the dash at the end of .../.tmp-%d-pack- */
+ 

[PATCH 2/3] repack: retain the return value of pack-objects

2013-09-15 Thread Stefan Beller
During the review process of the previous commit (repack: rewrite the
shell script in C), Johannes Sixt proposed to retain any exit codes from
the sub-process, which makes it probably more obvious in case of failure.

As the commit before should behave as close to the original shell
script, the proposed change is put in this extra commit.
The infrastructure however was already setup in the previous commit.
(Having a local 'ret' variable)

Signed-off-by: Stefan Beller stefanbel...@googlemail.com
Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/repack.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index a15bd9b..d345d16 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -231,7 +231,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
 
ret = start_command(cmd);
if (ret)
-   return 1;
+   return ret;
 
nr_packs = 0;
out = xfdopen(cmd.out, r);
@@ -244,7 +244,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
fclose(out);
ret = finish_command(cmd);
if (ret)
-   return 1;
+   return ret;
argv_array_clear(cmd_args);
 
if (!nr_packs  !quiet)
-- 
1.8.4.273.ga194ead

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] repack: rewrite the shell script in C

2013-09-15 Thread Stefan Beller
The motivation of this patch is to get closer to a goal of being
able to have a core subset of git functionality built in to git.
That would mean

 * people on Windows could get a copy of at least the core parts
   of Git without having to install a Unix-style shell

 * people using git in on servers with chrooted environments
   do not need to worry about standard tools lacking for shell
   scripts.

This patch is meant to be mostly a literal translation of the
git-repack script; the intent is that later patches would start using
more library facilities, but this patch is meant to be as close to a
no-op as possible so it doesn't do that kind of thing.

Signed-off-by: Stefan Beller stefanbel...@googlemail.com
Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Makefile   |   2 +-
 builtin.h  |   1 +
 builtin/repack.c   | 387 +
 contrib/examples/git-repack.sh | 194 +
 git-repack.sh  | 194 -
 git.c  |   1 +
 6 files changed, 584 insertions(+), 195 deletions(-)
 create mode 100644 builtin/repack.c
 create mode 100755 contrib/examples/git-repack.sh
 delete mode 100755 git-repack.sh

diff --git a/Makefile b/Makefile
index 3588ca1..69e5267 100644
--- a/Makefile
+++ b/Makefile
@@ -464,7 +464,6 @@ SCRIPT_SH += git-pull.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
-SCRIPT_SH += git-repack.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
 SCRIPT_SH += git-submodule.sh
@@ -971,6 +970,7 @@ BUILTIN_OBJS += builtin/reflog.o
 BUILTIN_OBJS += builtin/remote.o
 BUILTIN_OBJS += builtin/remote-ext.o
 BUILTIN_OBJS += builtin/remote-fd.o
+BUILTIN_OBJS += builtin/repack.o
 BUILTIN_OBJS += builtin/replace.o
 BUILTIN_OBJS += builtin/rerere.o
 BUILTIN_OBJS += builtin/reset.o
diff --git a/builtin.h b/builtin.h
index 8afa2de..b56cf07 100644
--- a/builtin.h
+++ b/builtin.h
@@ -102,6 +102,7 @@ extern int cmd_reflog(int argc, const char **argv, const 
char *prefix);
 extern int cmd_remote(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_ext(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_fd(int argc, const char **argv, const char *prefix);
+extern int cmd_repack(int argc, const char **argv, const char *prefix);
 extern int cmd_repo_config(int argc, const char **argv, const char *prefix);
 extern int cmd_rerere(int argc, const char **argv, const char *prefix);
 extern int cmd_reset(int argc, const char **argv, const char *prefix);
diff --git a/builtin/repack.c b/builtin/repack.c
new file mode 100644
index 000..a15bd9b
--- /dev/null
+++ b/builtin/repack.c
@@ -0,0 +1,387 @@
+#include builtin.h
+#include cache.h
+#include dir.h
+#include parse-options.h
+#include run-command.h
+#include sigchain.h
+#include strbuf.h
+#include string-list.h
+#include argv-array.h
+
+static int delta_base_offset = 1;
+static char *packdir, *packtmp;
+
+static const char *const git_repack_usage[] = {
+   N_(git repack [options]),
+   NULL
+};
+
+static int repack_config(const char *var, const char *value, void *cb)
+{
+   if (!strcmp(var, repack.usedeltabaseoffset)) {
+   delta_base_offset = git_config_bool(var, value);
+   return 0;
+   }
+   return git_default_config(var, value, cb);
+}
+
+/*
+ * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
+ */
+static void remove_temporary_files(void)
+{
+   struct strbuf buf = STRBUF_INIT;
+   size_t dirlen, prefixlen;
+   DIR *dir;
+   struct dirent *e;
+
+   dir = opendir(packdir);
+   if (!dir)
+   return;
+
+   /* Point at the slash at the end of .../objects/pack/ */
+   dirlen = strlen(packdir) + 1;
+   strbuf_addstr(buf, packtmp);
+   /* Hold the length of  .tmp-%d-pack- */
+   prefixlen = buf.len - dirlen;
+
+   while ((e = readdir(dir))) {
+   if (strncmp(e-d_name, buf.buf + dirlen, prefixlen))
+   continue;
+   strbuf_setlen(buf, dirlen);
+   strbuf_addstr(buf, e-d_name);
+   unlink(buf.buf);
+   }
+   closedir(dir);
+   strbuf_release(buf);
+}
+
+static void remove_pack_on_signal(int signo)
+{
+   remove_temporary_files();
+   sigchain_pop(signo);
+   raise(signo);
+}
+
+/*
+ * Adds all packs hex strings to the fname list, which do not
+ * have a corresponding .keep file.
+ */
+static void get_non_kept_pack_filenames(struct string_list *fname_list)
+{
+   DIR *dir;
+   struct dirent *e;
+   char *fname;
+   size_t len;
+
+   if (!(dir = opendir(packdir)))
+   return;
+
+   while ((e = readdir(dir)) != NULL) {
+   if (suffixcmp(e-d_name, .pack))
+   continue;
+
+   len = strlen(e-d_name) - strlen(.pack);
+   fname = 

[PATCH 3/3] repack: improve warnings about failure of renaming and removing files

2013-09-15 Thread Stefan Beller
Signed-off-by: Stefan Beller stefanbel...@googlemail.com
---
 builtin/repack.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index d345d16..f7e91bf 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -327,7 +327,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
chmod(fname_old, statbuffer.st_mode);
}
if (rename(fname_old, fname))
-   exit(errno);
+   die_errno(_(renaming '%s' failed), fname_old);
free(fname);
free(fname_old);
}
@@ -341,7 +341,8 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
packdir,
item-string,
exts[ext]);
-   remove_path(fname);
+   if (remove_path(fname))
+   warning(_(removing '%s' failed), fname);
}
}
 
-- 
1.8.4.273.ga194ead

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] t7406-submodule-update: demonstrate behaviour when run without init beforehand

2013-09-15 Thread Tay Ray Chuan
Signed-off-by: Tay Ray Chuan rcta...@gmail.com
---
 t/t7406-submodule-update.sh | 20 
 1 file changed, 20 insertions(+)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index f0b3305..00475eb 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -66,6 +66,26 @@ test_expect_success 'setup a submodule tree' '
)
 '
 
+test_expect_success 'setup a repo with uninitialized submodules' '
+   git clone super super2
+'
+
+test_expect_success 'submodule update path warns without init beforehand' '
+   (cd super2 
+test -n $(git submodule update submodule)
+   )
+'
+
+test_expect_success 'submodule update is silent without init beforehand' '
+   (cd super2 
+test -z $(git submodule update)
+   )
+'
+
+test_expect_success 'cleanup repo with uninitialized submodules' '
+   rm -rf super2
+'
+
 test_expect_success 'submodule update detaching the HEAD ' '
(cd super/submodule 
 git reset --hard HEAD~1
-- 
1.8.4.rc4.527.g303b16c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] git submodule update should give notice when run without init beforehand

2013-09-15 Thread Tay Ray Chuan
When 'update' is run with no path in a repository with uninitialized
submodules, the program terminates with no output, and zero status code.
Be more helpful to users by mentioning this.

This may be controlled by an advice.* option.

Signed-off-by: Tay Ray Chuan rcta...@gmail.com
---
 Documentation/config.txt|  5 +
 git-submodule.sh| 16 ++--
 t/t7406-submodule-update.sh |  5 -
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ec57a15..79313f9 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -202,6 +202,11 @@ advice.*::
rmHints::
In case of failure in the output of linkgit:git-rm[1],
show directions on how to proceed from the current state.
+   submoduleUpdateUninit::
+   When linkgit:git-submodule[1] `update` is run with no `path`
+   arguments in a repository with uninitialized submodules,
+   mention that uninitalized submodules are indeed present, and
+   that they may be initialized with the `--init` option.
 --
 
 core.fileMode::
diff --git a/git-submodule.sh b/git-submodule.sh
index 2979197..56f3dc2 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -777,6 +777,7 @@ cmd_update()
cloned_modules=
module_list $@ | {
err=
+   has_uninit=
while read mode sha1 stage sm_path
do
die_if_unmatched $mode
@@ -807,9 +808,13 @@ cmd_update()
then
# Only mention uninitialized submodules when its
# path have been specified
-   test $# != 0 
-   say $(eval_gettext Submodule path '\$displaypath' not 
initialized
+   if test $# != 0
+   then
+   say $(eval_gettext Submodule path 
'\$displaypath' not initialized
 Maybe you want to use 'update --init'?)
+   else
+   has_uninit=1
+   fi
continue
fi
 
@@ -940,6 +945,13 @@ Maybe you want to use 'update --init'?)
IFS=$OIFS
exit 1
fi
+
+   if test -n $has_uninit \
+   -a $(git config --bool --get advice.submoduleUpdateUninit) != 
false
+   then
+   say $(eval_gettext Uninitialized submodules were detected;
+Maybe you want to use 'update --init'?)
+   fi
}
 }
 
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 00475eb..8dbe410 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -76,8 +76,11 @@ test_expect_success 'submodule update path warns without 
init beforehand' '
)
 '
 
-test_expect_success 'submodule update is silent without init beforehand' '
+test_expect_success 'submodule update warns without init beforehand' '
(cd super2 
+test_must_fail git config --get advice.submoduleUpdateUninit 
+test -n $(git submodule update) 
+git config advice.submoduleUpdateUninit false 
 test -z $(git submodule update)
)
 '
-- 
1.8.4.rc4.527.g303b16c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] t7406-submodule-update: add missing

2013-09-15 Thread Tay Ray Chuan
322bb6e (2011 Aug 11) introduced a new subshell at the end of a test
case but omitted a '' to join the two; fix this.

Signed-off-by: Tay Ray Chuan rcta...@gmail.com
---
 t/t7406-submodule-update.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index b192f93..f0b3305 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -58,7 +58,7 @@ test_expect_success 'setup a submodule tree' '
 git submodule add ../merging merging 
 test_tick 
 git commit -m rebasing
-   )
+   ) 
(cd super 
 git submodule add ../none none 
 test_tick 
-- 
1.8.4.rc4.527.g303b16c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] repack: rewrite the shell script in C

2013-09-15 Thread Ramkumar Ramachandra
Stefan Beller wrote:
  Makefile   |   2 +-
  builtin.h  |   1 +
  builtin/repack.c   | 387 
 +
  contrib/examples/git-repack.sh | 194 +
  git-repack.sh  | 194 -
  git.c  |   1 +
  6 files changed, 584 insertions(+), 195 deletions(-)
  create mode 100644 builtin/repack.c
  create mode 100755 contrib/examples/git-repack.sh
  delete mode 100755 git-repack.sh

Looks like repack.c is significantly larger than git-repack.sh. I look
forward to reading the code.

 diff --git a/builtin/repack.c b/builtin/repack.c
 new file mode 100644
 index 000..a15bd9b
 --- /dev/null
 +++ b/builtin/repack.c
 @@ -0,0 +1,387 @@
 +#include builtin.h
 +#include cache.h
 +#include dir.h
 +#include parse-options.h
 +#include run-command.h
 +#include sigchain.h
 +#include strbuf.h
 +#include string-list.h
 +#include argv-array.h
 +
 +static int delta_base_offset = 1;
 +static char *packdir, *packtmp;
 +
 +static const char *const git_repack_usage[] = {
 +   N_(git repack [options]),
 +   NULL
 +};
 +
 +static int repack_config(const char *var, const char *value, void *cb)
 +{
 +   if (!strcmp(var, repack.usedeltabaseoffset)) {
 +   delta_base_offset = git_config_bool(var, value);
 +   return 0;
 +   }
 +   return git_default_config(var, value, cb);
 +}

Configuration option: one bool. I wonder what other configuration
options the future will bring in.

 +/*
 + * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
 + */

Is $GIT_OBJECT_DIRECTORY a standard variable, or should it be
$GIT_DIR/objects? A quick grep tells me that there are a few
references to it, but I'm yet to be convinced that it can be something
other than $GIT_DIR/objects.

 +static void remove_temporary_files(void)
 +{
 +   struct strbuf buf = STRBUF_INIT;
 +   size_t dirlen, prefixlen;
 +   DIR *dir;
 +   struct dirent *e;
 +
 +   dir = opendir(packdir);
 +   if (!dir)
 +   return;

Wait a minute: where did we initalize packdir? Ah, it's static, so it
must have been initalized before the function was called (in some sort
of setup function), right? Why don't you return an int from the
function so it's possible to differentiate between success and
failure?

 +   /* Point at the slash at the end of .../objects/pack/ */

Is packdir a relative or absolute path? The ... isn't helping.

 +   dirlen = strlen(packdir) + 1;
 +   strbuf_addstr(buf, packtmp);

Mysterious initalization of packtmp.

 +   /* Hold the length of  .tmp-%d-pack- */
 +   prefixlen = buf.len - dirlen;

Okay, so that's what packtmp contains: a reading of repack.sh told me
that you didn't even change the variable names.

 +   while ((e = readdir(dir))) {
 +   if (strncmp(e-d_name, buf.buf + dirlen, prefixlen))
 +   continue;

Skip the dentry that points to the .tmp-* thing.

 +   strbuf_setlen(buf, dirlen);
 +   strbuf_addstr(buf, e-d_name);
 +   unlink(buf.buf);
 +   }
 +   closedir(dir);
 +   strbuf_release(buf);
 +}

Okay.

 +static void remove_pack_on_signal(int signo)
 +{
 +   remove_temporary_files();
 +   sigchain_pop(signo);
 +   raise(signo);
 +}

Okay, although I'd have named the variable signal, not signo.

 +/*
 + * Adds all packs hex strings to the fname list, which do not
 + * have a corresponding .keep file.
 + */

The packs which don't have a corresponding .keep file must be removed/
repacked. Okay.

 +static void get_non_kept_pack_filenames(struct string_list *fname_list)
 +{
 +   DIR *dir;
 +   struct dirent *e;
 +   char *fname;

Filename, I assume.

 +   size_t len;
 +
 +   if (!(dir = opendir(packdir)))
 +   return;

An int return, please.

 +   while ((e = readdir(dir)) != NULL) {
 +   if (suffixcmp(e-d_name, .pack))
 +   continue;

If we didn't find a .pack file, skip over?

 +   len = strlen(e-d_name) - strlen(.pack);
 +   fname = xmemdupz(e-d_name, len);

You can probably use the pathbufs to save some memory here, but I
wouldn't worry about it at this stage.

 +   if (!file_exists(mkpath(%s/%s.keep, packdir, fname)))
 +   string_list_append_nodup(fname_list, fname);
 +   else
 +   free(fname);
 +   }
 +   closedir(dir);
 +}

Okay.

 +static void remove_redundant_pack(const char *path_prefix, const char *hex)
 +{
 +   const char *exts[] = {.pack, .idx, .keep};
 +   int i;
 +   struct strbuf buf = STRBUF_INIT;
 +   size_t plen;
 +
 +   strbuf_addf(buf, %s/%s, path_prefix, hex);

hex is the sha1-hex written out to the full 40 characters?

 +   plen = buf.len;
 +
 +   for (i = 0; i  ARRAY_SIZE(exts); i++) {
 +   

1.8.4 rebase regression?

2013-09-15 Thread Patrick Welche
I just upgraded (via pkgsrc) from git 1.8.3.4 to 1.8.4. With 1.8.4, I had
local changes in glib, did a git pull --rebase. Some of my changes
conflicted, but

$ git rebase --abort
No rebase in progress?

so somehow the usual process of amending the edit, or skipping the patch
no longer works.

I found a similar report at:
http://mail-index.netbsd.org/pkgsrc-users/2013/09/14/msg018646.html


Another less important regression is that in an xterm, with 1.8.3.4 I see

$ /tmp/bin/git diff 
diff --cc glib/gmain.c
index 738e69c,5aaebd0..000
--- a/glib/gmain.c
+++ b/glib/gmain.c
@@@ -4953,32 -4921,32 +4953,48 @@@ g_unix_signal_watch_dispatch (GSourc
  }
...

but with 1.8.4, I see

$ git diff
ESC[1mdiff --cc glib/gmain.cESC[m
ESC[1mindex 738e69c,5aaebd0..000ESC[m
ESC[1m--- a/glib/gmain.cESC[m
ESC[1m+++ b/glib/gmain.cESC[m
ESC[36m@@@ -4953,32 -4921,32 +4953,48 @@@ESC[m 
ESC[mg_unix_signal_watch_dispatch (GSourcESC[m


(same xterm, no change of TERM in both invocations above)
git status in 1.8.4 does show red, so colour does work...

Thoughts on how to help debug?

Cheers,

Patrick
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] pack v4: add v4_size to struct delta_base_cache_entry

2013-09-15 Thread Nicolas Pitre
On Sun, 15 Sep 2013, Duy Nguyen wrote:

 On Sat, Sep 14, 2013 at 11:22 AM, Nicolas Pitre n...@fluxnic.net wrote:
  The cache is currently updated by the caller.  The caller may ask for a
  copy of 2 entries from a base object, but that base object may itself
  copy those objects from somewhere else in a larger chunk.
 
  Let's consider this example:
 
  tree A
  --
  0 (0) copy 2 entries from tree B starting at entry 0
  1 (2) copy 1 entry from tree B starting at entry 3
 
  tree B
  --
  0 (0) copy 6 entries from tree C starting at entry 0
  1 (6) entry foo.txt
  2 (7) entry bar.txt
 
  Right now, the code calls decode_entries() to decode 2 entries from tree
  B but those entries are part of a copy from tree C.  When that call
  returns, the cache is updated as if tree B entry #2 would start at
  offset 1 but this is wrong because offset 0 in tree B covers 6 entries
  and therefore offset 1 is for entry #6.
 
  So this needs a rethink.
 
 I've given it some thought and see no simple/efficient way do it when
 2+ depth is involved. Ideally tree A should refer to tree C directly
 for the first two entries, but in general we can't enforce that a copy
 sequence must refer to non-copy sequences only. Caching flattened tree
 B up until the 6th entry may help, but then there's no need to cache
 offsets anymore because we could just cache tree A..

Well... for the case where tree A should refer to tree C directly, that 
should be optimized by packv4-create/pack-objects.

But as far as my offset cache is concerned, I came around to rethink it.
I managed to write decent and logical code this time.
The speed-up is significant even without any tuning.  Here it is:

commit aa43ec18956a2c835112f086a0a59d7fbc35a021
Author: Nicolas Pitre n...@fluxnic.net
Date:   Fri Sep 13 20:56:31 2013 -0400

packv4-parse.c: add tree offset caching

It is a common pattern to see a tree object copy a few entries from another
tree object, possibly from a non zero offset, then provide a few entries
of its own just to go back to the previous tree object to copy some more
entries.  Each time this happens, the tree object being copied has to be
parsed from the beginning over again up to the desired offset which is
rather wasteful.

Let's introduce a tree offset cache to record the entry position and offset
when tree parsing stops so a subsequent copy from the same tree object
can be resumed without having to start from the beginning all the time.

Without doing further tuning on this cache, performing git rev-list --all
--objects | wc -l on my copy of git.git went from 14.5s down to 6.6s.

Signed-off-by: Nicolas Pitre n...@fluxnic.net

diff --git a/packv4-parse.c b/packv4-parse.c
index 38c22b0..b8af702 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -378,6 +378,40 @@ static int copy_canonical_tree_entries(struct packed_git 
*p, off_t offset,
return 0;
 }
 
+/* ordering is so that member alignment takes the least amount of space */
+struct pv4_tree_cache {
+   off_t base_offset;
+   off_t offset;
+   off_t last_copy_base;
+   struct packed_git *p;
+   unsigned int pos;
+   unsigned int nb_entries;
+};
+
+#define CACHE_SIZE 256
+static struct pv4_tree_cache pv4_tree_cache[CACHE_SIZE];
+
+static struct pv4_tree_cache *get_tree_offset_cache(struct packed_git *p, 
off_t base_offset)
+{
+   struct pv4_tree_cache *c;
+   unsigned long hash;
+
+   hash = (unsigned long)p + (unsigned long)base_offset;
+   hash += (hash  8) + (hash  16);
+   hash %= CACHE_SIZE;
+
+   c = pv4_tree_cache[hash];
+   if (c-p != p || c-base_offset != base_offset) {
+   c-p = p;
+   c-base_offset = base_offset;
+   c-offset = 0;
+   c-last_copy_base = 0;
+   c-pos = 0;
+   c-nb_entries = 0;
+   }
+   return c;
+}
+
 static int tree_entry_prefix(unsigned char *buf, unsigned long size,
 const unsigned char *path, int path_len,
 unsigned mode)
@@ -405,39 +439,72 @@ static int tree_entry_prefix(unsigned char *buf, unsigned 
long size,
 }
 
 static int decode_entries(struct packed_git *p, struct pack_window **w_curs,
- off_t offset, unsigned int start, unsigned int count,
- unsigned char **dstp, unsigned long *sizep,
- int parse_hdr)
+ off_t obj_offset, unsigned int start, unsigned int 
count,
+ unsigned char **dstp, unsigned long *sizep)
 {
unsigned long avail;
-   unsigned int nb_entries;
const unsigned char *src, *scp;
-   off_t copy_objoffset = 0;
+   unsigned int curpos;
+   off_t offset, copy_objoffset;
+   struct pv4_tree_cache *c;
+
+   c = get_tree_offset_cache(p, obj_offset);
+   if (count  start  c-nb_entries  start = c-pos 
+   

Re: [PATCH 2/4] pack v4: add v4_size to struct delta_base_cache_entry

2013-09-15 Thread Duy Nguyen
On Mon, Sep 16, 2013 at 11:42 AM, Nicolas Pitre n...@fluxnic.net wrote:
 On Sun, 15 Sep 2013, Duy Nguyen wrote:

 On Sat, Sep 14, 2013 at 11:22 AM, Nicolas Pitre n...@fluxnic.net wrote:
  The cache is currently updated by the caller.  The caller may ask for a
  copy of 2 entries from a base object, but that base object may itself
  copy those objects from somewhere else in a larger chunk.
 
  Let's consider this example:
 
  tree A
  --
  0 (0) copy 2 entries from tree B starting at entry 0
  1 (2) copy 1 entry from tree B starting at entry 3
 
  tree B
  --
  0 (0) copy 6 entries from tree C starting at entry 0
  1 (6) entry foo.txt
  2 (7) entry bar.txt
 
  Right now, the code calls decode_entries() to decode 2 entries from tree
  B but those entries are part of a copy from tree C.  When that call
  returns, the cache is updated as if tree B entry #2 would start at
  offset 1 but this is wrong because offset 0 in tree B covers 6 entries
  and therefore offset 1 is for entry #6.
 
  So this needs a rethink.

 I've given it some thought and see no simple/efficient way do it when
 2+ depth is involved. Ideally tree A should refer to tree C directly
 for the first two entries, but in general we can't enforce that a copy
 sequence must refer to non-copy sequences only. Caching flattened tree
 B up until the 6th entry may help, but then there's no need to cache
 offsets anymore because we could just cache tree A..

 Well... for the case where tree A should refer to tree C directly, that
 should be optimized by packv4-create/pack-objects.

 But as far as my offset cache is concerned, I came around to rethink it.
 I managed to write decent and logical code this time.

OK if I read your patch correctly, in the example above, after
processing tree A entry #0, the position of tree B entry #0 is cached
(instead of #6). When tree A #2 is processed, we check out the cache
and parse tree B's copy sequence again, which leads to a cached
position of tree C, correct? We still have to jump through tree B
unnnecessarily, but that can be dealt with later. Good thing that it
works, and the perf improvement is impressive too. I think we now have
the core of struct pv4_tree_desc :)
-- 
Duy

 The speed-up is significant even without any tuning.  Here it is:

 commit aa43ec18956a2c835112f086a0a59d7fbc35a021
 Author: Nicolas Pitre n...@fluxnic.net
 Date:   Fri Sep 13 20:56:31 2013 -0400

 packv4-parse.c: add tree offset caching

 It is a common pattern to see a tree object copy a few entries from 
 another
 tree object, possibly from a non zero offset, then provide a few entries
 of its own just to go back to the previous tree object to copy some more
 entries.  Each time this happens, the tree object being copied has to be
 parsed from the beginning over again up to the desired offset which is
 rather wasteful.

 Let's introduce a tree offset cache to record the entry position and 
 offset
 when tree parsing stops so a subsequent copy from the same tree object
 can be resumed without having to start from the beginning all the time.

 Without doing further tuning on this cache, performing git rev-list --all
 --objects | wc -l on my copy of git.git went from 14.5s down to 6.6s.

 Signed-off-by: Nicolas Pitre n...@fluxnic.net

 diff --git a/packv4-parse.c b/packv4-parse.c
 index 38c22b0..b8af702 100644
 --- a/packv4-parse.c
 +++ b/packv4-parse.c
 @@ -378,6 +378,40 @@ static int copy_canonical_tree_entries(struct packed_git 
 *p, off_t offset,
 return 0;
  }

 +/* ordering is so that member alignment takes the least amount of space */
 +struct pv4_tree_cache {
 +   off_t base_offset;
 +   off_t offset;
 +   off_t last_copy_base;
 +   struct packed_git *p;
 +   unsigned int pos;
 +   unsigned int nb_entries;
 +};
 +
 +#define CACHE_SIZE 256
 +static struct pv4_tree_cache pv4_tree_cache[CACHE_SIZE];
 +
 +static struct pv4_tree_cache *get_tree_offset_cache(struct packed_git *p, 
 off_t base_offset)
 +{
 +   struct pv4_tree_cache *c;
 +   unsigned long hash;
 +
 +   hash = (unsigned long)p + (unsigned long)base_offset;
 +   hash += (hash  8) + (hash  16);
 +   hash %= CACHE_SIZE;
 +
 +   c = pv4_tree_cache[hash];
 +   if (c-p != p || c-base_offset != base_offset) {
 +   c-p = p;
 +   c-base_offset = base_offset;
 +   c-offset = 0;
 +   c-last_copy_base = 0;
 +   c-pos = 0;
 +   c-nb_entries = 0;
 +   }
 +   return c;
 +}
 +
  static int tree_entry_prefix(unsigned char *buf, unsigned long size,
  const unsigned char *path, int path_len,
  unsigned mode)
 @@ -405,39 +439,72 @@ static int tree_entry_prefix(unsigned char *buf, 
 unsigned long size,
  }

  static int decode_entries(struct packed_git *p, struct pack_window **w_curs,
 - off_t offset, unsigned int 

Gerade liebe Schuhe

2013-09-15 Thread Jomenl
Dies kann vor allem durch extreme tatsächlichen Heels dann eingeschränkt
Damenschuhen , ansonsten beschränkte Strategie Herrenschuhe, während
irritierend und auch verletzen alle nehmen insgesamt gelten . Dennoch gibt
es zahlreiche Designs, die einfach hergestellt werden, aber stilvoll, und
sie werden nie verletzen Menschen die Füße jedoch inklusive ein paar
normales Gefühl , viel auf sich selbst zeigen immer der Nachfrage der
Unternehmen weltweit .  Nike Air Max reduziert
http://www.airmaxschuhenonlinekaufen.eu/   So, dass die Damen sind
voraussichtlich verzichten aus teuflisch slim toe Turnschuhe bei niedrigeren
Fersen zeigen . Ich nehme an, sie Arten von Schuhen oder Stiefeln brauchen
im Recht fühlen , erhöht angenehm und kann Kompliment besser in der Branche
Abend.
All Look der Stil Nike Sneaker erscheinen gestochen scharf auf ihre
bedeutenden bunt färben. Es gibt viele Arten von Farbton für verschiedene
Benutzer- Praktiken , sowie die traditionellen ist mehr in Mode für sie. Sie
werden sehen, der wesentliche Unterschied in diesem Sneaker mit anderen
präsentiert es Schlag Assimilation Produkt durch die Art Premium- Abdeckung
in den höheren Lagen unterstützt.
Gelblich sowie lila sind ihr Chef Farbtöne für die Staples Mitte Lakers
Häuser Gymnasium, deshalb der Hersteller erfordert Silber dann Purpur Liebe
Kobe -Turnschuhen vor allem Farben. Aber es sind nicht ein Fall, wenn Sie
sehr besitzen können eine Reihe von , dass die Nike-Turnschuhe , weil einige
der Vorteile bezüglich Nike Schuhen ist die Informationstechnologie hat die
Beine dann schützen keep walking für ein eigentlich sehr lange Zeit. Einkauf
bis zu den Füßen ist einfach Schaden Nike Verwendung einer neueren
Erfahrungen , um sie zu schützen werden . Sie müssen nicht sich Sorgen über
das Gewicht , um Schuhe oder Stiefel , falls Sie sich mit ihren Spielen .
Verschiedene Menschen Beinen passiert immer weh nur als ihre besondere
Schuhe nicht geeignet war für ihren Sport , zweitklassigen kann nicht für
den besonderen Charakter der Erhaltung zahlen.
Wahrlich, es ist sportlich Arbeitsbasis trägt erhältlich als Teil von einer
Vielzahl von Arten in Bezug auf Aussehen , Formen zusätzlich Modelle .
Wirklich oft erheblich Verständnis sind Premium Styling auf diese Elemente
ohne Investitionen weit Aufmerksamkeit auf Formularen. Jetzt Hersteller
unter diesen Waren zu schaffen unterschiedlichen maximalen - Abschluss
sportlich Fuß trägt in unzähligen Formen , so dass sie in verschiedenen
formen Fußtypen .
Es sind wahrscheinlich Ihre aktuelle erste off Stil der Betrieb Turnschuhe
zu bekommen entwickelt Benutzung eines identischen Farbpalette mischen diese
Art von Innovation produziert gegabelt off in Einführung , da eine sehr gute
Pfund bevorzugten Artikel eingebaut , während aus den Bereichen . Jeder der
Finalisierung ist bemerkenswert , während Sie würde auf jeden Fall wünschen,
dass hat eine schillernde Wasser Laufsohle.
Schuhe oder Stiefel Fans ab Ländern auf der ganzen Welt als AIR Jordan
Schuhe bekannt werden die  Enthusiasten Turnschuhe  .  Die besten
Basketball- Schuhe  wurde gegenüber erklären Breeze Jordan XX3 , das wäre
dieses Zentrum hat , dass Menschen würden ihre Vision zu lösen angewendet .
Informationstechnologie bekannt war nur , dass viele Menschen weg durch die
besondere Geschichte über 23 , aus diesem Grund müssen , damit glorreiche
bekommen wieder zurück, Nike 706 Space kann  Casting -Legende , und viele
Menschen aufrichtig zu begehen.
Besorgen Sie sich die Innenseite dünn auf gute Gründe, warum nike völlig
kostenlos Schuhe oder Stiefel zu neigen , dass die Zahl eine Lösung für die
Läufer und auch Kleidung heutzutage in unserer Anleitung in voller Nike Free
five.0 Turnschuhe. Higher -zertifiziert Nike Schuhe können in günstigen
Preisen in einer Dienstleistung, die Arten von Nike Turnschuhe Nike
besonders Umwelt Maximum, Nike Shox , zzgl. Nike Air Jordan und so weiter
bietet  Nike Air Max kaufen http://www.airmaxschuhenonlinekaufen.eu/  
gekauft werden.



-
Thomas Sabo charms armband 
Thomas Sabo charms anhänger 
--
View this message in context: 
http://git.661346.n2.nabble.com/Gerade-liebe-Schuhe-tp7596354.html
Sent from the git mailing list archive at Nabble.com.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html