Re: What's cooking in git.git (Sep 2013, #03; Wed, 11)

2013-09-12 Thread Johannes Sixt
Am 9/12/2013 1:32, schrieb Junio C Hamano:
 * jc/ref-excludes (2013-09-03) 2 commits
  - document --exclude option
  - revision: introduce --exclude=glob to tame wildcards
 
  People often wished a way to tell git log --branches (and git
  log --remotes --not --branches) to exclude some local branches
  from the expansion of --branches (similarly for --tags, --all
  and --glob=pattern).  Now they have one.
 
  Will merge to 'next'.

Please don't. This is by far not ready. It needs a different approach to
support --exclude= in rev-parse.

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: This is sequel to your non-response of my earlier letter to you

2013-09-12 Thread Barr Brenda Hoffman


Dear Friend

After five years of waiting the crown (British government) has given me
the power to contact you so that you become the beneficiary to total
amount £15,000,000.00 GBP in the intent of the deceased who died without a
will.
I am contacting you because I believe you are related. Please if you are
interested respond immediately with your full names, physical address and
cell number.

Yours Truly,

Brenda Hoffman





-- 
DISCLAIMER:
---
 This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are addressed. If 
you have received this email in error please notify the system manager. This 
message contains confidential information and is intended only for the 
individual named. If you are not the named addressee you should not 
disseminate, distribute or copy this e-mail. Please notify the sender 
immediately by e-mail if you have received this e-mail by mistake and delete 
this e-mail from your system. Before opening any mail and attachments please 
check them for viruses and defect If you are not the intended recipient you are 
notified that disclosing, copying, distributing or taking any action in 
reliance on the contents of this information is strictly prohibited.
-
This message has been scanned for viruses and
dangerous content, and is believed to be clean.

Regional Cancer Centre, Thiruvananthapuram
www.rcctvm.org

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re-Transmission of blobs?

2013-09-12 Thread Josef Wolf
On Mi, Sep 11, 2013 at 10:14:54 -0700, Junio C Hamano wrote:
 Josef Wolf j...@raven.inka.de writes:
  On Di, Sep 10, 2013 at 10:51:02 -0700, Junio C Hamano wrote:
  Consider this simple history with only a handful of commits (as
  usual, time flows from left to right):
  
E
   /   
  A---B---C---D
  
  where D is at the tip of the sending side, E is at the tip of the
  receiving side.  The exchange goes roughly like this:
  
  (receiving side): what do you have?
  
  (sending side): my tip is at D.
  
  (receiving side): D?  I've never heard of it --- please give it
to me.  I have E.
 
  At this point, why would the receiving side not tell all the heads it knows
  about?
 
 It did.  The receiving end had only one branch whose tip is E.  It
 may have a tracking branch that knows where the tip of the sending
 end used to be when it forked (which is C), so the above may say I
 have E and C.  It actually would say I have B and A and ... for a
 bounded number of commits, but that does not fundamentally change
 the picture---the important point is it is bounded and there is a
 horizon.

Therefore, the sending sinde has all information it needs to do any
optimizations you can think of...

  There are some work being done to optimize this further using
  various techniques, but they are not ready yet.
 
 And this still stands.

Do you have a pointer or something? I'd like to check out whether I can
contribute to this work.

-- 
Josef Wolf
j...@raven.inka.de
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Windows: Do not redefine _WIN32_WINNT

2013-09-12 Thread Sebastian Schuberth
On Wed, Sep 11, 2013 at 11:51 PM, Junio C Hamano gits...@pobox.com wrote:

 It seems that compat/poll/poll.c also defines _WIN32_WINNT (but only
 with _MSC_VER defined).  The change to git-compat-util.h in this
 patch avoids redefinition for both MinGW and MSVC case.  Do you also
 need to have this, too?

In my patch I did not change poll.c because I did only check this
issue with MinGW, not MSVC, so I never ran into the _MSC_VER code
path. Back in 1.8.3 git-compat-util.h did define _WIN32_WINNT for both
MinGW and MSVC, which is why in my patch I had to add the #ifndef /
#endif. But I believe it's good to have these guards for both MinGW
and MSVC, actually.

 Here is what I tentatively queued on top of the three from Karsten,
 and your Fix stat definitions.

Looks good to me, thanks!

-- 
Sebastian Schuberth
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Jiang Xin
Tvangeste found that the relative_path function could not work
properly on Windows if in and prefix have dos driver prefix.
($gmane/234434)

e.g., When execute: test-path-utils relative_path C:/a/b D:/x/y,
should return C:/a/b, but returns ../../C:/a/b, which is wrong.

So make relative_path honor dos_drive_prefix, and add test cases
for it in t0060.

Reported-by: Tvangeste i.4m.l...@yandex.ru
Helped-by: Johannes Sixt j...@kdbg.org
Signed-off-by: Jiang Xin worldhello@gmail.com
---
 path.c| 20 
 t/t0060-path-utils.sh |  4 
 2 files changed, 24 insertions(+)

diff --git a/path.c b/path.c
index 7f3324a..ffcdea1 100644
--- a/path.c
+++ b/path.c
@@ -441,6 +441,16 @@ int adjust_shared_perm(const char *path)
return 0;
 }
 
+static int have_same_root(const char *path1, const char *path2)
+{
+   int is_abs1, is_abs2;
+
+   is_abs1 = is_absolute_path(path1);
+   is_abs2 = is_absolute_path(path2);
+   return (is_abs1  is_abs2  !strncasecmp(path1, path2, 1)) ||
+  (!is_abs1  !is_abs2);
+}
+
 /*
  * Give path as relative to prefix.
  *
@@ -461,6 +471,16 @@ const char *relative_path(const char *in, const char 
*prefix,
else if (!prefix_len)
return in;
 
+   if (have_same_root(in, prefix)) {
+   /* bypass dos_drive, for c: is identical to C: */
+   if (has_dos_drive_prefix(in)) {
+   i = 2;
+   j = 2;
+   }
+   } else {
+   return in;
+   }
+
while (i  prefix_len  j  in_len  prefix[i] == in[j]) {
if (is_dir_sep(prefix[i])) {
while (is_dir_sep(prefix[i]))
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 76c7792..c3c3b33 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -208,6 +208,10 @@ relative_path a/b/ a/b ./
 relative_path aa/b ../
 relative_path x/y  a/b ../../x/y
 relative_path a/c  a/b ../c
+relative_path a/b  /x/ya/b
+relative_path /a/b x/y /a/bPOSIX
+relative_path d:/a/b   D:/a/c  ../bMINGW
+relative_path C:/a/b   D:/a/c  C:/a/b  MINGW
 relative_path a/b  empty   a/b
 relative_path a/b  nulla/b
 relative_path empty/a/b./
-- 
1.8.3.rc2.14.g5ac1b82

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re-Transmission of blobs?

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 09:42:41AM +0200, Josef Wolf wrote:

   There are some work being done to optimize this further using
   various techniques, but they are not ready yet.
  
  And this still stands.
 
 Do you have a pointer or something? I'd like to check out whether I can
 contribute to this work.

I think Junio is referring to the reachability bitmap work. We may know
that the other side has commit E (and therefore every object reachable
from it), but we do not walk the graph to find the complete set of
reachable objects. Doing so requires a lot of CPU and I/O, and in most
cases does not help much.

However, if we had an index of reachable objects (e.g., a bitmap) for
each commit, then we could very cheaply compute the set difference
between what the other side wants and what they have.

JGit has support for pack bitmaps already. There was a patch series a
few months ago to implement a similar functionality for C git, but the
on-disk format was not compatible with JGit's. That series has been
reworked off-list to be compatible with the JGit implementation.

Those patches need a little cleanup before they are ready for the list,
but hopefully that should happen soon-ish.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git-gui: Modify push dialog to support Gerrit review

2013-09-12 Thread Joergen Edelbo
Problem: It is not possible to push for Gerrit review as you will
always try to push to refs/heads/... on the remote.

Changes done:

Add an option in the Push dialog to select Gerrit review and a
corresponding entry for a branch name. If this option is selected,
push the changes to refs/for/gerrit-branch/local branch. In
this way the local branch names will be used as topic branches on
Gerrit.

If you are on a detached HEAD, then add a HEAD entry in the branch
selection list. If this is selected, push HEAD:HEAD in the normal
case and HEAD:refs/for/gerrit_branch in the Gerrit case.

The Gerrit branch to push to is controlled by gerrit.reviewbranch
configuration option.
---
Hi again,

Seems like this discussion has died out. Is there no perspective in
changing git-gui to support Gerrit better?

Anyway here is what I consider my final shot at a solution. Compared
to the last one, this commit can handle the case when you work on a 
detached HEAD, and the Gerrit branch to push to is handled by a
configuration option.

BR Jørgen Edelbo

 git-gui.sh|1 +
 lib/option.tcl|1 +
 lib/transport.tcl |   48 +---
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/git-gui.sh b/git-gui.sh
index b62ae4a..3228654 100755
--- a/git-gui.sh
+++ b/git-gui.sh
@@ -901,6 +901,7 @@ set default_config(gui.diffcontext) 5
 set default_config(gui.diffopts) {}
 set default_config(gui.commitmsgwidth) 75
 set default_config(gui.newbranchtemplate) {}
+set default_config(gerrit.reviewbranch) {}
 set default_config(gui.spellingdictionary) {}
 set default_config(gui.fontui) [font configure font_ui]
 set default_config(gui.fontdiff) [font configure font_diff]
diff --git a/lib/option.tcl b/lib/option.tcl
index 23c9ae7..ffa4226 100644
--- a/lib/option.tcl
+++ b/lib/option.tcl
@@ -157,6 +157,7 @@ proc do_options {} {
{t gui.diffopts {mc Additional Diff Parameters}}
{i-0..99 gui.commitmsgwidth {mc Commit Message Text Width}}
{t gui.newbranchtemplate {mc New Branch Name Template}}
+   {t gerrit.reviewbranch {mc Gerrit Review Branch}}
{c gui.encoding {mc Default File Contents Encoding}}
{b gui.warndetachedcommit {mc Warn before committing to a 
detached head}}
{s gui.stageuntracked {mc Staging of untracked files} {list 
yes no ask}}
diff --git a/lib/transport.tcl b/lib/transport.tcl
index e5d211e..9b1cfc5 100644
--- a/lib/transport.tcl
+++ b/lib/transport.tcl
@@ -61,6 +61,7 @@ proc push_to {remote} {
 
 proc start_push_anywhere_action {w} {
global push_urltype push_remote push_url push_thin push_tags
+   global gerrit_review gerrit_branch
global push_force
global repo_config
 
@@ -95,7 +96,19 @@ proc start_push_anywhere_action {w} {
set cnt 0
foreach i [$w.source.l curselection] {
set b [$w.source.l get $i]
-   lappend cmd refs/heads/$b:refs/heads/$b
+   if {$gerrit_review  $gerrit_branch ne {}} {
+   switch $b {
+   $gerrit_branch  { lappend cmd 
refs/heads/$b:refs/for/$gerrit_branch }
+   HEAD{ lappend cmd 
HEAD:refs/for/$gerrit_branch }
+   default { lappend cmd 
refs/heads/$b:refs/for/$gerrit_branch/$b }
+   }
+   } else {
+   if {$b eq HEAD} {
+   lappend cmd HEAD:HEAD
+   } else {
+   lappend cmd 
refs/heads/$b:refs/heads/$b
+   }
+   }
incr cnt
}
if {$cnt == 0} {
@@ -118,9 +131,10 @@ trace add variable push_remote write \
[list radio_selector push_urltype remote]
 
 proc do_push_anywhere {} {
-   global all_remotes current_branch
+   global all_remotes current_branch is_detached
global push_urltype push_remote push_url push_thin push_tags
-   global push_force use_ttk NS
+   global gerrit_review gerrit_branch
+   global push_force use_ttk NS repo_config
 
set w .push_setup
toplevel $w
@@ -149,6 +163,11 @@ proc do_push_anywhere {} {
-height 10 \
-width 70 \
-selectmode extended
+   if {$is_detached} {
+   $w.source.l insert end HEAD
+   $w.source.l select set end
+   $w.source.l yview end
+   }
foreach h [load_all_heads] {
$w.source.l insert end $h
if {$h eq $current_branch} {
@@ -215,13 +234,36 @@ proc do_push_anywhere {} {
-text [mc Include tags] \
-variable push_tags
grid $w.options.tags -columnspan 2 -sticky w
+  

Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Sebastian Schuberth
On Wed, Sep 11, 2013 at 11:41 PM, Jeff King p...@peff.net wrote:

 I'm on Windows using MSYS / MinGW. Since MinGW runtime version 4.0,
 string.h contains the following code (see [1]):

 #ifndef __NO_INLINE__
 __CRT_INLINE int __cdecl __MINGW_NOTHROW
 strncasecmp (const char * __sz1, const char * __sz2, size_t __sizeMaxCompare)
 {return _strnicmp (__sz1, __sz2, __sizeMaxCompare);}
 #else
 #define strncasecmp _strnicmp
 #endif

 What is the error the compiler reports? Can it take the address of other

The error message of GCC 4.8.1 is:

LINK git-credential-store.exe
libgit.a(mailmap.o): In function `read_mailmap':
C:\mingwGitDevEnv\git/mailmap.c:238: undefined reference to `strcasecmp'
collect2.exe: error: ld returned 1 exit status
make: *** [git-credential-store.exe] Error 1

So it's a linker error, not a compiler error.

 inline functions? For example, can it compile:

 inline int foo(void) { return 5; }
 extern int bar(int (*cb)(void));
 int call(void) { return bar(foo); }

I had to modify the example slightly to:

inline int foo(void) { return 5; }
extern int bar(int (*cb)(void)) { return cb(); }
int main(void) { return bar(foo); }

And this compiles.

 Just wondering if that is the root of the problem, or if maybe there is
 something else subtle going on. Also, does __CRT_INLINE just turn into
 inline, or is there perhaps some other pre-processor magic going on?

This is the function definition from string.h after preprocessing:

extern __inline__ int __attribute__((__cdecl__)) __attribute__ ((__nothrow__))
strncasecmp (const char * __sz1, const char * __sz2, size_t __sizeMaxCompare)
  {return _strnicmp (__sz1, __sz2, __sizeMaxCompare);}

-- 
Sebastian Schuberth
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Tvangeste
Thank you, this fixes the problem with `git svn rebase` on Windows for me.

--Tvangeste

On Thu, Sep 12, 2013 at 11:12 AM, Jiang Xin worldhello@gmail.com wrote:
 Tvangeste found that the relative_path function could not work
 properly on Windows if in and prefix have dos driver prefix.
 ($gmane/234434)

 e.g., When execute: test-path-utils relative_path C:/a/b D:/x/y,
 should return C:/a/b, but returns ../../C:/a/b, which is wrong.

 So make relative_path honor dos_drive_prefix, and add test cases
 for it in t0060.

 Reported-by: Tvangeste i.4m.l...@yandex.ru
 Helped-by: Johannes Sixt j...@kdbg.org
 Signed-off-by: Jiang Xin worldhello@gmail.com
 ---
  path.c| 20 
  t/t0060-path-utils.sh |  4 
  2 files changed, 24 insertions(+)

 diff --git a/path.c b/path.c
 index 7f3324a..ffcdea1 100644
 --- a/path.c
 +++ b/path.c
 @@ -441,6 +441,16 @@ int adjust_shared_perm(const char *path)
 return 0;
  }

 +static int have_same_root(const char *path1, const char *path2)
 +{
 +   int is_abs1, is_abs2;
 +
 +   is_abs1 = is_absolute_path(path1);
 +   is_abs2 = is_absolute_path(path2);
 +   return (is_abs1  is_abs2  !strncasecmp(path1, path2, 1)) ||
 +  (!is_abs1  !is_abs2);
 +}
 +
  /*
   * Give path as relative to prefix.
   *
 @@ -461,6 +471,16 @@ const char *relative_path(const char *in, const char 
 *prefix,
 else if (!prefix_len)
 return in;

 +   if (have_same_root(in, prefix)) {
 +   /* bypass dos_drive, for c: is identical to C: */
 +   if (has_dos_drive_prefix(in)) {
 +   i = 2;
 +   j = 2;
 +   }
 +   } else {
 +   return in;
 +   }
 +
 while (i  prefix_len  j  in_len  prefix[i] == in[j]) {
 if (is_dir_sep(prefix[i])) {
 while (is_dir_sep(prefix[i]))
 diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
 index 76c7792..c3c3b33 100755
 --- a/t/t0060-path-utils.sh
 +++ b/t/t0060-path-utils.sh
 @@ -208,6 +208,10 @@ relative_path a/b/ a/b ./
  relative_path aa/b ../
  relative_path x/y  a/b ../../x/y
  relative_path a/c  a/b ../c
 +relative_path a/b  /x/ya/b
 +relative_path /a/b x/y /a/bPOSIX
 +relative_path d:/a/b   D:/a/c  ../bMINGW
 +relative_path C:/a/b   D:/a/c  C:/a/b  MINGW
  relative_path a/b  empty   a/b
  relative_path a/b  nulla/b
  relative_path empty/a/b./
 --
 1.8.3.rc2.14.g5ac1b82

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] wt-status: turn advice_status_hints into a field of wt_status

2013-09-12 Thread Matthieu Moy
Jeff King p...@peff.net writes:

 On Wed, Sep 11, 2013 at 11:08:58AM +0200, Matthieu Moy wrote:

 No behavior change in this patch, but this makes the display of status
 hints more flexible as they can be enabled or disabled for individual
 calls to commit.c:run_status().
 [...]
 +static void status_finalize(struct wt_status *s)
 +{
 +determine_whence(s);
 +s-hints = advice_status_hints;
 +}

 Is a finalize the right place to put the status hint tweak? I would
 expect the order for callers to be:

   wt_status_prepare(s);
   /* manual tweaks of fields in s */
   wt_status_finalize(s);

 but the finalize would overwrite any tweak of the hints field. So it
 would make more sense to me for it to go in prepare().

finalize is indeed not the right name. I made a separate function to
avoid too much code duplication between status and commit, and looked
for a name comlementary to prepare for a function that is ran later to
fill-in some fields.

 The problem is that we are doing two things in that git_config call:

   1. Loading basic git-wide core config.

(Yes. This includes the advice section, so I need it for
advice_status_hints)

 So the cleanest thing would be something like:

   git_config(git_diff_ui_config, NULL);
   wt_status_prepare(s);

[...]

That is clean, but a bit long and it is essentially duplicated between
status and commit. I went another way: put all the similar code in a
common function status_init_config:

static void status_init_config(struct wt_status *s, config_fn_t fn)
{
wt_status_prepare(s);
gitmodules_config();
git_config(git_status_config, s);
determine_whence(s);
s-hints = advice_status_hints; /* must come after git_config() */
}

We could split the git_config call, but that would not bring much
benefit IMHO. In any case, it can be done very simply on top of my patch
if needed later, as there is now only one call site for git_config.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] wt-status: turn advice_status_hints into a field of wt_status

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 11:44:30AM +0200, Matthieu Moy wrote:

 That is clean, but a bit long and it is essentially duplicated between
 status and commit. I went another way: put all the similar code in a
 common function status_init_config:
 
 static void status_init_config(struct wt_status *s, config_fn_t fn)
 {
   wt_status_prepare(s);
   gitmodules_config();
   git_config(git_status_config, s);
   determine_whence(s);
   s-hints = advice_status_hints; /* must come after git_config() */
 }

s/git_status_config/fn/, I assume.

 We could split the git_config call, but that would not bring much
 benefit IMHO. In any case, it can be done very simply on top of my patch
 if needed later, as there is now only one call site for git_config.

Yeah, I think that is fine. The other cleanup may or may not be worth
it, but should not be a blocker to your patch. With what you suggest
above, you are certainly not making anything worse with respect to the
code organization.

Thanks.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] urlmatch: append_normalized_escapes can reallocate norm.buf

2013-09-12 Thread Thomas Rast
The calls to strbuf_add* within append_normalized_escapes() can
reallocate the buffer passed to it.  Therefore, the seg_start pointer
into the string cannot be kept across such calls.

The actual bug is from 3402a8d (config: add helper to normalize and
match URLs, 2013-07-31).  It can first be detected by valgrind after
6a56993 (config: parse http.url.variable using urlmatch,
2013-08-05) introduced tests covering url_normalize().

Signed-off-by: Thomas Rast tr...@inf.ethz.ch
---

My apologies if this is redundant; I didn't have time to watch the
list over the last two weeks.  However it seems today's pu is still
broken.

The valgrind error looks like this:

  ==4607== Invalid read of size 1
  ==4607==at 0x4C2D3A1: __GI_strcmp (mc_replace_strmem.c:731)
  ==4607==by 0x404C68: url_normalize (urlmatch.c:300)
  ==4607==by 0x403F33: main (test-urlmatch-normalization.c:34)
  ==4607==  Address 0x5be9046 is 6 bytes inside a block of size 24 free'd
  ==4607==at 0x4C2BFC6: realloc (vg_replace_malloc.c:687)
  ==4607==by 0x405F6B: xrealloc (wrapper.c:100)
  ==4607==by 0x40794E: strbuf_grow (strbuf.c:74)
  ==4607==by 0x40854D: strbuf_vaddf (strbuf.c:268)
  ==4607==by 0x40817E: strbuf_addf (strbuf.c:203)
  ==4607==by 0x404300: append_normalized_escapes (urlmatch.c:58)
  ==4607==by 0x404C0A: url_normalize (urlmatch.c:291)
  ==4607==by 0x403F33: main (test-urlmatch-normalization.c:34)

It went undetected for a while because it does not fail the test: the
calls to test-urlmatch-normalization happen inside a $() substitution.

I checked the other call sites to append_normalized_escapes() for the
same type of problem, and they seem to be okay.

 urlmatch.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/urlmatch.c b/urlmatch.c
index 1db76c8..59abc80 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -281,7 +281,8 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
url_len--;
}
for (;;) {
-   const char *seg_start = norm.buf + norm.len;
+   const char *seg_start;
+   size_t prev_len = norm.len;
const char *next_slash = url + strcspn(url, /?#);
int skip_add_slash = 0;
/*
@@ -297,6 +298,7 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
strbuf_release(norm);
return NULL;
}
+   seg_start = norm.buf + prev_len;
if (!strcmp(seg_start, .)) {
/* ignore a . segment; be careful not to remove initial 
'/' */
if (seg_start == path_start + 1) {
-- 
1.8.4.609.g4395a4f

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread John Keeping
On Thu, Sep 12, 2013 at 11:36:56AM +0200, Sebastian Schuberth wrote:
  Just wondering if that is the root of the problem, or if maybe there is
  something else subtle going on. Also, does __CRT_INLINE just turn into
  inline, or is there perhaps some other pre-processor magic going on?
 
 This is the function definition from string.h after preprocessing:
 
 extern __inline__ int __attribute__((__cdecl__)) __attribute__ ((__nothrow__))
 strncasecmp (const char * __sz1, const char * __sz2, size_t __sizeMaxCompare)
   {return _strnicmp (__sz1, __sz2, __sizeMaxCompare);}

I wonder if GCC has changed it's behaviour to more closely match C99.
Clang as a compatibility article about this sort of issue:

http://clang.llvm.org/compatibility.html#inline
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] pack v4: avoid strlen() in tree_entry_prefix

2013-09-12 Thread Nguyễn Thái Ngọc Duy
We do know the length of the path name of an tree entry from the tree
dictionary. On an unoptimized build, this cuts down git rev-list
--objects v1.8.4's time from 6.2s to 5.8s. This difference is less on
optimized builds.

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 packv4-parse.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/packv4-parse.c b/packv4-parse.c
index c00a014..ae3e6a5 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -50,15 +50,17 @@ struct packv4_dict *pv4_create_dict(const unsigned char 
*data, int dict_size)
return NULL;
}
 
-   dict = xmalloc(sizeof(*dict) + nb_entries * sizeof(dict-offsets[0]));
+   dict = xmalloc(sizeof(*dict) +
+  (nb_entries + 1) * sizeof(dict-offsets[0]));
dict-data = data;
dict-nb_entries = nb_entries;
 
+   dict-offsets[0] = 0;
cp = data;
for (i = 0; i  nb_entries; i++) {
-   dict-offsets[i] = cp - data;
cp += 2;
cp += strlen((const char *)cp) + 1;
+   dict-offsets[i + 1] = cp - data;
}
 
return dict;
@@ -163,7 +165,8 @@ static void load_path_dict(struct packed_git *p)
p-path_dict = paths;
 }
 
-const unsigned char *get_pathref(struct packed_git *p, unsigned int index)
+const unsigned char *get_pathref(struct packed_git *p, unsigned int index,
+int *len)
 {
if (!p-path_dict)
load_path_dict(p);
@@ -172,6 +175,9 @@ const unsigned char *get_pathref(struct packed_git *p, 
unsigned int index)
error(%s: index overflow, __func__);
return NULL;
}
+   if (len)
+   *len = p-path_dict-offsets[index + 1] -
+   p-path_dict-offsets[index];
return p-path_dict-data + p-path_dict-offsets[index];
 }
 
@@ -373,9 +379,9 @@ static int copy_canonical_tree_entries(struct packed_git 
*p, off_t offset,
 }
 
 static int tree_entry_prefix(unsigned char *buf, unsigned long size,
-const unsigned char *path, unsigned mode)
+const unsigned char *path, int path_len,
+unsigned mode)
 {
-   int path_len = strlen((const char *)path) + 1;
int mode_len = 0;
int len;
unsigned char mode_buf[8];
@@ -463,14 +469,15 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
 */
const unsigned char *path, *sha1;
unsigned mode;
-   int len;
+   int len, pathlen;
 
-   path = get_pathref(p, what  1);
+   path = get_pathref(p, what  1, pathlen);
sha1 = get_sha1ref(p, scp);
if (!path || !sha1)
return -1;
mode = (path[0]  8) | path[1];
-   len = tree_entry_prefix(*dstp, *sizep, path + 2, mode);
+   len = tree_entry_prefix(*dstp, *sizep,
+   path + 2, pathlen - 2, mode);
if (!len || len + 20  *sizep)
return -1;
hashcpy(*dstp + len, sha1);
-- 
1.8.2.83.gc99314b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] pack v4: make use of cached v4 trees when unpacking

2013-09-12 Thread Nguyễn Thái Ngọc Duy
git rev-list --objects v1.8.4 time is reduced from 29s to 10s with
this patch. But it is still a long way to catch up with v2: 4s.

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 The problem I see with decode_entries() is that given n copy
 sequences, it re-reads the same base n times. 30+ copy sequences are
 not unusual at all with git.git.

 I'm thinking of adding a cache to deal with one-base trees, which is
 all we have now. If we know in advance what base a tree needs without
 parsing the tree, we could unpack from base up like we do with
 ref-deltas. Because in this case we know the base is always flat, we
 could have a more efficient decode_entries that only goes through the
 base once. I want to get the timing down to as close as possible to
 v2 before adding v4-aware interface.

 Pack cache is an idea being cooked for a while by Jeff. Maybe we
 could merge his work to pack v4 or require it when pack v4 is finally
 merged to 'next'.

 packv4-parse.c | 17 +++--
 packv4-parse.h |  2 ++
 sha1_file.c| 14 ++
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/packv4-parse.c b/packv4-parse.c
index 5002f42..b8855b0 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -415,8 +415,20 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
unsigned int nb_entries;
const unsigned char *src, *scp;
off_t copy_objoffset = 0;
+   const void *cached = NULL;
+   unsigned long cached_size, cached_v4_size;
+
+   if (hdr)  /* we need offset point at obj header */
+   cached = get_cached_v4_tree(p, offset,
+   cached_size, cached_v4_size);
+
+   if (cached) {
+   src = cached;
+   avail = cached_v4_size;
+   hdr = 0;
+   } else
+   src = use_pack(p, w_curs, offset, avail);
 
-   src = use_pack(p, w_curs, offset, avail);
scp = src;
 
if (hdr) {
@@ -452,7 +464,8 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
while (count) {
unsigned int what;
 
-   if (avail  20) {
+   /* fixme: need to put bach the out-of-bound check when cached 
== 1 */
+   if (!cached  avail  20) {
src = use_pack(p, w_curs, offset, avail);
if (avail  20)
return -1;
diff --git a/packv4-parse.h b/packv4-parse.h
index 647b73c..f584c31 100644
--- a/packv4-parse.h
+++ b/packv4-parse.h
@@ -16,6 +16,8 @@ unsigned long pv4_unpack_object_header_buffer(const unsigned 
char *base,
  unsigned long *sizep);
 const unsigned char *get_sha1ref(struct packed_git *p,
 const unsigned char **bufp);
+const void *get_cached_v4_tree(struct packed_git *p, off_t base_offset,
+unsigned long *size, unsigned long *v4_size);
 
 void *pv4_get_commit(struct packed_git *p, struct pack_window **w_curs,
 off_t offset, unsigned long size);
diff --git a/sha1_file.c b/sha1_file.c
index b176316..82570be 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1967,6 +1967,20 @@ static int in_delta_base_cache(struct packed_git *p, 
off_t base_offset)
return eq_delta_base_cache_entry(ent, p, base_offset);
 }
 
+const void *get_cached_v4_tree(struct packed_git *p, off_t base_offset,
+unsigned long *size, unsigned long *v4_size)
+{
+   struct delta_base_cache_entry *ent;
+   ent = get_delta_base_cache_entry(p, base_offset);
+
+   if (!eq_delta_base_cache_entry(ent, p, base_offset) ||
+   ent-type != OBJ_PV4_TREE)
+   return NULL;
+   *size = ent-size;
+   *v4_size = ent-v4_size;
+   return ent-data;
+}
+
 static void clear_delta_base_cache_entry(struct delta_base_cache_entry *ent)
 {
ent-data = NULL;
-- 
1.8.2.83.gc99314b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] pack v4: cache flattened v4 trees in delta base cache

2013-09-12 Thread Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 The memmove in pv4_get_tree() may look inefficient. I added a
 heuristics to avoid moving if nb_entries takes 2 bytes (most common,
 I think), but it does not improve much. So memmove() is probably ok.

 packv4-parse.c | 60 +++---
 packv4-parse.h |  3 ++-
 sha1_file.c|  8 +++-
 3 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/packv4-parse.c b/packv4-parse.c
index ae3e6a5..5002f42 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -406,7 +406,10 @@ static int tree_entry_prefix(unsigned char *buf, unsigned 
long size,
 
 static int decode_entries(struct packed_git *p, struct pack_window **w_curs,
  off_t offset, unsigned int start, unsigned int count,
- unsigned char **dstp, unsigned long *sizep, int hdr)
+ unsigned char **dstp, unsigned long *sizep,
+ unsigned char **v4_dstp, unsigned long *v4_sizep,
+ unsigned int *v4_entries,
+ int hdr)
 {
unsigned long avail;
unsigned int nb_entries;
@@ -422,10 +425,18 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
if (++scp - src = avail - 20)
return -1;
/* is this a canonical tree object? */
-   if ((*scp  0xf) == OBJ_TREE)
+   if ((*scp  0xf) == OBJ_TREE) {
+   /*
+* we could try to convert to v4 tree before
+* giving up, provided that the number of
+* inconvertible trees is small. But that's
+* for later.
+*/
+   *v4_dstp = NULL;
return copy_canonical_tree_entries(p, offset,
   start, count,
   dstp, sizep);
+   }
/* let's still make sure this is actually a pv4 tree */
if ((*scp++  0xf) != OBJ_PV4_TREE)
return -1;
@@ -484,6 +495,16 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
*dstp += len + 20;
*sizep -= len + 20;
count--;
+   if (*v4_dstp) {
+   if (scp - src  *v4_sizep)
+   *v4_dstp = NULL;
+   else {
+   memcpy(*v4_dstp, src, scp - src);
+   *v4_dstp += scp - src;
+   *v4_sizep -= scp - src;
+   (*v4_entries)++;
+   }
+   }
} else if (what  1) {
/*
 * Copy from another tree object.
@@ -537,7 +558,8 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
count -= copy_count;
ret = decode_entries(p, w_curs,
copy_objoffset, copy_start, copy_count,
-   dstp, sizep, 1);
+   dstp, sizep, v4_dstp, v4_sizep,
+   v4_entries, 1);
if (ret)
return ret;
/* force pack window readjustment */
@@ -554,11 +576,13 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
 }
 
 void *pv4_get_tree(struct packed_git *p, struct pack_window **w_curs,
-  off_t offset, unsigned long size)
+  off_t offset, unsigned long size,
+  void **v4_data, unsigned long *v4_size)
 {
-   unsigned long avail;
-   unsigned int nb_entries;
+   unsigned long avail, v4_max_size;
+   unsigned int nb_entries, v4_entries;
unsigned char *dst, *dcp;
+   unsigned char *v4_dst, *v4_dcp;
const unsigned char *src, *scp;
int ret;
 
@@ -570,11 +594,33 @@ void *pv4_get_tree(struct packed_git *p, struct 
pack_window **w_curs,
 
dst = xmallocz(size);
dcp = dst;
-   ret = decode_entries(p, w_curs, offset, 0, nb_entries, dcp, size, 0);
+   if (v4_data) {
+   /*
+* v4 can't be larger than canonical, so size should
+* be enough
+*/
+   v4_max_size = size;
+   v4_dst = v4_dcp = xmallocz(v4_max_size);
+   v4_entries = 0;
+   }
+   ret = decode_entries(p, w_curs, offset, 0, nb_entries,
+

[PATCH 2/4] pack v4: add v4_size to struct delta_base_cache_entry

2013-09-12 Thread Nguyễn Thái Ngọc Duy
The intention is to store flat v4 trees in delta base cache to avoid
repeatedly expanding copy sequences in v4 trees. When the user needs
to unpack a v4 tree and the tree is found in the cache, the tree will
be converted back to canonical format. Future tree_desc interface may
skip canonical format and read v4 trees directly.

For that to work we need to keep track of v4 tree size after all copy
sequences are expanded, which is the purpose of this new field.

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 sha1_file.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 038e22e..03c66bb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1934,6 +1934,7 @@ static struct delta_base_cache_entry {
struct packed_git *p;
off_t base_offset;
unsigned long size;
+   unsigned long v4_size;
enum object_type type;
 } delta_base_cache[MAX_DELTA_CACHE];
 
@@ -2015,7 +2016,8 @@ void clear_delta_base_cache(void)
 }
 
 static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
-   void *base, unsigned long base_size, enum object_type type)
+   void *base, unsigned long base_size, unsigned long v4_size,
+   enum object_type type)
 {
unsigned long hash = pack_entry_hash(p, base_offset);
struct delta_base_cache_entry *ent = delta_base_cache + hash;
@@ -2045,6 +2047,7 @@ static void add_delta_base_cache(struct packed_git *p, 
off_t base_offset,
ent-type = type;
ent-data = base;
ent-size = base_size;
+   ent-v4_size = v4_size;
ent-lru.next = delta_base_cache_lru;
ent-lru.prev = delta_base_cache_lru.prev;
delta_base_cache_lru.prev-next = ent-lru;
@@ -2208,7 +2211,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
data = NULL;
 
if (base)
-   add_delta_base_cache(p, obj_offset, base, base_size, 
type);
+   add_delta_base_cache(p, obj_offset, base, base_size, 0, 
type);
 
if (!base) {
/*
-- 
1.8.2.83.gc99314b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re-Transmission of blobs?

2013-09-12 Thread Josef Wolf
On Do, Sep 12, 2013 at 05:23:40 -0400, Jeff King wrote:
 On Thu, Sep 12, 2013 at 09:42:41AM +0200, Josef Wolf wrote:

 I think Junio is referring to the reachability bitmap work. We may know
 that the other side has commit E (and therefore every object reachable
 from it), but we do not walk the graph to find the complete set of
 reachable objects. Doing so requires a lot of CPU and I/O, and in most
 cases does not help much.

I'm not sure I understand correctly. I see that bitmaps can be used to
implement set operations. But how comes that walking the graph requires a lot
of CPU? Isn't it O(n)?

 However, if we had an index of reachable objects (e.g., a bitmap) for
 each commit, then we could very cheaply compute the set difference
 between what the other side wants and what they have.

Those bitmaps would be stored in the git metadata? Is it worth it? Storing a
bitmap for every commit just to be used once-in-a-while seems to be a pretty
big overhead to me. Not to mention the interoperability problems you mentioned
below.

 JGit has support for pack bitmaps already. There was a patch series a
 few months ago to implement a similar functionality for C git, but the
 on-disk format was not compatible with JGit's. That series has been
 reworked off-list to be compatible with the JGit implementation.
 
 Those patches need a little cleanup before they are ready for the list,
 but hopefully that should happen soon-ish.

Sounds like you're already almost done and don't really need help
anymore. Just out of curiosity, I'd be interested in a pointer anyway ;-)

-- 
Josef Wolf
j...@raven.inka.de
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/3] wt-status: turn advice_status_hints into a field of wt_status

2013-09-12 Thread Matthieu Moy
No behavior change in this patch, but this makes the display of status
hints more flexible as they can be enabled or disabled for individual
calls to commit.c:run_status().

Signed-off-by: Matthieu Moy matthieu@imag.fr
---
No real change since v1, just a slight adaptation after the PATCH 1.

 builtin/commit.c |  1 +
 wt-status.c  | 38 +++---
 wt-status.h  |  1 +
 3 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index dc957a9..7bfce94 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -169,6 +169,7 @@ static void status_init_config(struct wt_status *s, 
config_fn_t fn)
gitmodules_config();
git_config(fn, s);
determine_whence(s);
+   s-hints = advice_status_hints; /* must come after git_config() */
 }
 
 static void rollback_index_files(void)
diff --git a/wt-status.c b/wt-status.c
index cb24f1f..885ee66 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -160,7 +160,7 @@ static void wt_status_print_unmerged_header(struct 
wt_status *s)
}
}
 
-   if (!advice_status_hints)
+   if (!s-hints)
return;
if (s-whence != FROM_COMMIT)
;
@@ -187,7 +187,7 @@ static void wt_status_print_cached_header(struct wt_status 
*s)
const char *c = color(WT_STATUS_HEADER, s);
 
status_printf_ln(s, c, _(Changes to be committed:));
-   if (!advice_status_hints)
+   if (!s-hints)
return;
if (s-whence != FROM_COMMIT)
; /* NEEDSWORK: use git reset --unresolve??? */
@@ -205,7 +205,7 @@ static void wt_status_print_dirty_header(struct wt_status 
*s,
const char *c = color(WT_STATUS_HEADER, s);
 
status_printf_ln(s, c, _(Changes not staged for commit:));
-   if (!advice_status_hints)
+   if (!s-hints)
return;
if (!has_deleted)
status_printf_ln(s, c, _(  (use \git add file...\ to 
update what will be committed)));
@@ -223,7 +223,7 @@ static void wt_status_print_other_header(struct wt_status 
*s,
 {
const char *c = color(WT_STATUS_HEADER, s);
status_printf_ln(s, c, %s:, what);
-   if (!advice_status_hints)
+   if (!s-hints)
return;
status_printf_ln(s, c, _(  (use \git %s file...\ to include in 
what will be committed)), how);
status_printf_ln(s, c, );
@@ -801,13 +801,13 @@ static void show_merge_in_progress(struct wt_status *s,
 {
if (has_unmerged(s)) {
status_printf_ln(s, color, _(You have unmerged paths.));
-   if (advice_status_hints)
+   if (s-hints)
status_printf_ln(s, color,
_(  (fix conflicts and run \git commit\)));
} else {
status_printf_ln(s, color,
_(All conflicts fixed but you are still merging.));
-   if (advice_status_hints)
+   if (s-hints)
status_printf_ln(s, color,
_(  (use \git commit\ to conclude merge)));
}
@@ -823,7 +823,7 @@ static void show_am_in_progress(struct wt_status *s,
if (state-am_empty_patch)
status_printf_ln(s, color,
_(The current patch is empty.));
-   if (advice_status_hints) {
+   if (s-hints) {
if (!state-am_empty_patch)
status_printf_ln(s, color,
_(  (fix conflicts and then run \git am 
--continue\)));
@@ -896,7 +896,7 @@ static void show_rebase_in_progress(struct wt_status *s,
else
status_printf_ln(s, color,
 _(You are currently rebasing.));
-   if (advice_status_hints) {
+   if (s-hints) {
status_printf_ln(s, color,
_(  (fix conflicts and then run \git rebase 
--continue\)));
status_printf_ln(s, color,
@@ -913,7 +913,7 @@ static void show_rebase_in_progress(struct wt_status *s,
else
status_printf_ln(s, color,
 _(You are currently rebasing.));
-   if (advice_status_hints)
+   if (s-hints)
status_printf_ln(s, color,
_(  (all conflicts fixed: run \git rebase 
--continue\)));
} else if (split_commit_in_progress(s)) {
@@ -925,7 +925,7 @@ static void show_rebase_in_progress(struct wt_status *s,
else
status_printf_ln(s, color,
 _(You are currently splitting a 
commit during a rebase.));
-   if (advice_status_hints)
+   if (s-hints)
status_printf_ln(s, color,
_(  (Once your 

[PATCH v2 3/3] commit: disable status hints when writing to COMMIT_EDITMSG

2013-09-12 Thread Matthieu Moy
This turns the template COMMIT_EDITMSG from e.g

  # [...]
  # Changes to be committed:
  #   (use git reset HEAD file... to unstage)
  #
  # modified:   builtin/commit.c
  #
  # Untracked files:
  #   (use git add file... to include in what will be committed)
  #
  # t/foo
  #

to

  # [...]
  # Changes to be committed:
  # modified:   builtin/commit.c
  #
  # Untracked files:
  # t/foo
  #

Most status hints were written to be accurate when running git status
before running a commit. Many of them are not applicable when the commit
has already been started, and should not be shown in COMMIT_EDITMSG. The
most obvious are hints advising to run git commit,
git rebase/am/cherry-pick --continue, which do not make sense when the
command has already been run.

Other messages become slightly inaccurate (e.g. hint to use git add to
add untracked files), as the suggested commands are not immediately
applicable during the editing of COMMIT_EDITMSG, but would be applicable
if the commit is aborted. These messages are both potentially helpful and
slightly misleading. This patch chose to remove them too, to avoid
introducing too much complexity in the status code.

Signed-off-by: Matthieu Moy matthieu@imag.fr
---
Unchanged since v1.

 builtin/commit.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/builtin/commit.c b/builtin/commit.c
index 7bfce94..e32246a 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -705,6 +705,12 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
if (s-fp == NULL)
die_errno(_(could not open '%s'), git_path(commit_editmsg));
 
+   /*
+* Most hints are counter-productive when the commit has
+* already started.
+*/
+   s-hints = 0;
+
if (clean_message_contents)
stripspace(sb, 0);
 
-- 
1.8.4.8.g834017f

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/3] commit: factor status configuration is a helper function

2013-09-12 Thread Matthieu Moy
cmd_commit and cmd_status use very similar code to initialize the
wt_status structure. Factor this code into a function to ensure future
changes will keep both versions consistent.

Signed-off-by: Matthieu Moy matthieu@imag.fr
---
New patch, as discussed with Peff.

 builtin/commit.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 60812b5..dc957a9 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -163,6 +163,14 @@ static void determine_whence(struct wt_status *s)
s-whence = whence;
 }
 
+static void status_init_config(struct wt_status *s, config_fn_t fn)
+{
+   wt_status_prepare(s);
+   gitmodules_config();
+   git_config(fn, s);
+   determine_whence(s);
+}
+
 static void rollback_index_files(void)
 {
switch (commit_style) {
@@ -1246,10 +1254,7 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
if (argc == 2  !strcmp(argv[1], -h))
usage_with_options(builtin_status_usage, 
builtin_status_options);
 
-   wt_status_prepare(s);
-   gitmodules_config();
-   git_config(git_status_config, s);
-   determine_whence(s);
+   status_init_config(s, git_status_config);
argc = parse_options(argc, argv, prefix,
 builtin_status_options,
 builtin_status_usage, 0);
@@ -1490,11 +1495,8 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
if (argc == 2  !strcmp(argv[1], -h))
usage_with_options(builtin_commit_usage, 
builtin_commit_options);
 
-   wt_status_prepare(s);
-   gitmodules_config();
-   git_config(git_commit_config, s);
+   status_init_config(s, git_commit_config);
status_format = STATUS_FORMAT_NONE; /* Ignore status.short */
-   determine_whence(s);
s.colopts = 0;
 
if (get_sha1(HEAD, sha1))
-- 
1.8.4.8.g834017f

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Re-Transmission of blobs?

2013-09-12 Thread Pyeron, Jason J CTR (US)
 -Original Message-
 From: Jeff King
 Sent: Thursday, September 12, 2013 5:24 AM
 
 On Thu, Sep 12, 2013 at 09:42:41AM +0200, Josef Wolf wrote:
 
There are some work being done to optimize this further using
various techniques, but they are not ready yet.
  
   And this still stands.
 
  Do you have a pointer or something? I'd like to check out whether I
 can
  contribute to this work.
 
 I think Junio is referring to the reachability bitmap work. We may know
 that the other side has commit E (and therefore every object
 reachable
 from it), but we do not walk the graph to find the complete set of
 reachable objects. Doing so requires a lot of CPU and I/O, and in most
 cases does not help much.

If the rules of engagement are change a bit, the server side can be release 
from most of its work (CPU/IO).

Client does the following, looping as needed:

Heads=server-heads();
KnownCommits=Local-AllCommits();
Missingblobs=[];
Foreach(commit:heads) if (!knownCommits-contains(commit)) 
MissingBlobs[]=commit;
Foreach(commit:knownCommit) if (!commit-isValid()) 
MissingBlobs[]=commit-blobs();
If (missingBlobs-size()0) server-FetchBlobs(missingBlobs);


This should work efficiently for the server if
a) the client is empty
b) the client is corrupt
c) the client is up to date

Extending the server-fetchBlobs() to be more fancy, like taking patterns, such 
as between aa and dd exclusive is an exercise for someone else.




smime.p7s
Description: S/MIME cryptographic signature


[PATCH 5/4] pack v4: convert v4 tree to canonical format if found in base cache

2013-09-12 Thread Nguyễn Thái Ngọc Duy
git log --stat -1 v1.4.8 /dev/null takes 13s with v4 (8s with
v2). Of course we could do better when v4-aware tree-diff interface is
in place..

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 Oops.. forgot this and broke git.

 Another option is change cache_or_unpack_entry() to force
 OBJ_PV4_TREE to go through unpack_entry(), then update pv4_get_tree() to
 lookup the base cache at the first decode_entries() call. Right now
 it does not (hdr == 0) so we need more processing.

 packv4-parse.c | 22 ++
 packv4-parse.h |  2 ++
 sha1_file.c| 11 +++
 3 files changed, 35 insertions(+)

diff --git a/packv4-parse.c b/packv4-parse.c
index b8855b0..448c91e 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -461,6 +461,10 @@ static int decode_entries(struct packed_git *p, struct 
pack_window **w_curs,
avail -= scp - src;
src = scp;
 
+   /* special case for pv4_cached_tree_to_canonical() */
+   if (!count  cached)
+   count = nb_entries;
+
while (count) {
unsigned int what;
 
@@ -648,3 +652,21 @@ unsigned long pv4_unpack_object_header_buffer(const 
unsigned char *base,
*sizep = val  4;
return cp - base;
 }
+
+/* offset must already be cached! */
+void *pv4_cached_tree_to_canonical(struct packed_git *p, off_t offset,
+  unsigned long size)
+{
+   int ret;
+   unsigned char *dst, *dcp;
+   unsigned char *v4_dstp = NULL;
+   dst = xmallocz(size);
+   dcp = dst;
+   ret = decode_entries(p, NULL, offset, 0, 0,
+dcp, size, v4_dstp, NULL, NULL, 1);
+   if (ret  0 || size != 0) {
+   free(dst);
+   return NULL;
+   }
+   return dst;
+}
diff --git a/packv4-parse.h b/packv4-parse.h
index f584c31..ad21e19 100644
--- a/packv4-parse.h
+++ b/packv4-parse.h
@@ -24,5 +24,7 @@ void *pv4_get_commit(struct packed_git *p, struct pack_window 
**w_curs,
 void *pv4_get_tree(struct packed_git *p, struct pack_window **w_curs,
   off_t offset, unsigned long size,
   void **v4_data, unsigned long *v4_size);
+void *pv4_cached_tree_to_canonical(struct packed_git *p, off_t offset,
+  unsigned long size);
 
 #endif
diff --git a/sha1_file.c b/sha1_file.c
index 82570be..0944ef6 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2000,6 +2000,17 @@ static void *cache_or_unpack_entry(struct packed_git *p, 
off_t base_offset,
if (!eq_delta_base_cache_entry(ent, p, base_offset))
return unpack_entry(p, base_offset, type, base_size);
 
+   if (ent-type == OBJ_PV4_TREE) {
+   ret = pv4_cached_tree_to_canonical(p, base_offset, ent-size);
+   if (!ret)
+   return NULL;
+   if (!keep_cache)
+   clear_delta_base_cache_entry(ent);
+   *type = OBJ_TREE;
+   *base_size = ent-size;
+   return ret;
+   }
+
ret = ent-data;
 
if (!keep_cache)
-- 
1.8.2.83.gc99314b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Torsten Bögershausen
On 2013-09-12 11.12, Jiang Xin wrote:
 Tvangeste found that the relative_path function could not work
 properly on Windows if in and prefix have dos driver prefix.
 ($gmane/234434)
 
 e.g., When execute: test-path-utils relative_path C:/a/b D:/x/y,
 should return C:/a/b, but returns ../../C:/a/b, which is wrong.
 
 So make relative_path honor dos_drive_prefix, and add test cases
 for it in t0060.
 
 Reported-by: Tvangeste i.4m.l...@yandex.ru
 Helped-by: Johannes Sixt j...@kdbg.org
 Signed-off-by: Jiang Xin worldhello@gmail.com
 ---
  path.c| 20 
  t/t0060-path-utils.sh |  4 
  2 files changed, 24 insertions(+)
 
 diff --git a/path.c b/path.c
 index 7f3324a..ffcdea1 100644
 --- a/path.c
 +++ b/path.c
 @@ -441,6 +441,16 @@ int adjust_shared_perm(const char *path)
   return 0;
  }
  
 +static int have_same_root(const char *path1, const char *path2)
 +{
 + int is_abs1, is_abs2;
 +
 + is_abs1 = is_absolute_path(path1);
 + is_abs2 = is_absolute_path(path2);
 + return (is_abs1  is_abs2  !strncasecmp(path1, path2, 1)) ||
   ^^^
I wonder: should strncasecmp() be replaced with strncmp_icase() ?

See dir.c: 
int strncmp_icase(const char *a, const char *b, size_t count)
{
return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
}
/Torsten











 +(!is_abs1  !is_abs2);
 +}
 +
  /*
   * Give path as relative to prefix.
   *
 @@ -461,6 +471,16 @@ const char *relative_path(const char *in, const char 
 *prefix,
   else if (!prefix_len)
   return in;
  
 + if (have_same_root(in, prefix)) {
 + /* bypass dos_drive, for c: is identical to C: */
 + if (has_dos_drive_prefix(in)) {
 + i = 2;
 + j = 2;
 + }
 + } else {
 + return in;
 + }
 +
   while (i  prefix_len  j  in_len  prefix[i] == in[j]) {
   if (is_dir_sep(prefix[i])) {
   while (is_dir_sep(prefix[i]))
 diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
 index 76c7792..c3c3b33 100755
 --- a/t/t0060-path-utils.sh
 +++ b/t/t0060-path-utils.sh
 @@ -208,6 +208,10 @@ relative_path a/b/   a/b ./
  relative_path a  a/b ../
  relative_path x/ya/b ../../x/y
  relative_path a/ca/b ../c
 +relative_path a/b/x/ya/b
 +relative_path /a/b   x/y /a/bPOSIX
 +relative_path d:/a/b D:/a/c  ../bMINGW
 +relative_path C:/a/b D:/a/c  C:/a/b  MINGW
  relative_path a/bempty   a/b
  relative_path a/bnulla/b
  relative_path empty  /a/b./
 

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Kyle J. McKay
On Sep 12, 2013, at 02:57, Thomas Rast wrote:

 The calls to strbuf_add* within append_normalized_escapes() can
 reallocate the buffer passed to it.  Therefore, the seg_start pointer
 into the string cannot be kept across such calls.

Thanks for finding this.

 It went undetected for a while because it does not fail the test: the
 calls to test-urlmatch-normalization happen inside a $() substitution.
 
 I checked the other call sites to append_normalized_escapes() for the
 same type of problem, and they seem to be okay.

 diff --git a/urlmatch.c b/urlmatch.c
 index 1db76c8..59abc80 100644
 --- a/urlmatch.c
 +++ b/urlmatch.c
 @@ -281,7 +281,8 @@ char *url_normalize(const char *url, struct url_info 
 *out_info)
   url_len--;
   }
   for (;;) {
 - const char *seg_start = norm.buf + norm.len;
 + const char *seg_start;
 + size_t prev_len = norm.len;

How about a more descriptive name for what prev_len is?  It's actually the
segment start offset.

   const char *next_slash = url + strcspn(url, /?#);
   int skip_add_slash = 0;
   /*
 @@ -297,6 +298,7 @@ char *url_normalize(const char *url, struct url_info 
 *out_info)
   strbuf_release(norm);
   return NULL;
   }
 + seg_start = norm.buf + prev_len;

A comment would be nice here to remind folks who might be tempted to
revert this to the previous version why it's being done this way.

I'm sure at some point someone will propose a simplification patch
otherwise.

Also some nits.  The patch description should be imperative mood
(cf. Documentation/SubmittingPatches).  And instead of mentioning the seg_start
pointer in the description (which will be meaningless to just about everyone and
it's clear from the diff), mention the bad thing the code was doing in more
general terms that will be clear to anyone familiar with a strbuf.

So how about this patch instead...

-- 8 --
From: Thomas Rast tr...@inf.ethz.ch
Subject: urlmatch.c: recompute pointer after append_normalized_escapes

When append_normalized_escapes is called, its internal strbuf_add* calls can
cause the strbuf's buf to be reallocated changing the value of the buf pointer.

Do not use the strbuf buf pointer from before any append_normalized_escapes
calls afterwards.  Instead recompute the needed pointer.

Signed-off-by: Thomas Rast tr...@inf.ethz.ch
Signed-off-by: Kyle J. McKay mack...@gmail.com
---
 urlmatch.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/urlmatch.c b/urlmatch.c
index 1db76c89..01c67467 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
url_len--;
}
for (;;) {
-   const char *seg_start = norm.buf + norm.len;
+   const char *seg_start;
+   size_t seg_start_off = norm.len;
const char *next_slash = url + strcspn(url, /?#);
int skip_add_slash = 0;
/*
 * RFC 3689 indicates that any . or .. segments should be
@@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
strbuf_release(norm);
return NULL;
}
+   /* append_normalized_escapes can cause norm.buf to change */
+   seg_start = norm.buf + seg_start_off;
if (!strcmp(seg_start, .)) {
/* ignore a . segment; be careful not to remove initial 
'/' */
if (seg_start == path_start + 1) {
-- 
1.8.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Sep 2013, #03; Wed, 11)

2013-09-12 Thread Junio C Hamano
Johannes Sixt j.s...@viscovery.net writes:

 Am 9/12/2013 1:32, schrieb Junio C Hamano:
 * jc/ref-excludes (2013-09-03) 2 commits
  - document --exclude option
  - revision: introduce --exclude=glob to tame wildcards
 
  People often wished a way to tell git log --branches (and git
  log --remotes --not --branches) to exclude some local branches
  from the expansion of --branches (similarly for --tags, --all
  and --glob=pattern).  Now they have one.
 
  Will merge to 'next'.

 Please don't. This is by far not ready. It needs a different approach to
 support --exclude= in rev-parse.

Thanks for a dose of sanity. I didn't look at rev-parse. I vaguely
recall somebody offered follow-ups (was it you?) and at that point
I placed this on the back-burner.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Thomas Rast
Kyle J. McKay mack...@gmail.com writes:

 Also some nits.  The patch description should be imperative mood
 (cf. Documentation/SubmittingPatches).

Heh.  Serves me right to go away for a while and get SubmittingPatches
cited at me on return ;-)

Thanks for the updated patch.  I agree with the changes.  I particularly
like the better variable name.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Junio C Hamano
John Keeping j...@keeping.me.uk writes:

 On Thu, Sep 12, 2013 at 11:36:56AM +0200, Sebastian Schuberth wrote:
  Just wondering if that is the root of the problem, or if maybe there is
  something else subtle going on. Also, does __CRT_INLINE just turn into
  inline, or is there perhaps some other pre-processor magic going on?
 
 This is the function definition from string.h after preprocessing:
 
 extern __inline__ int __attribute__((__cdecl__)) __attribute__ 
 ((__nothrow__))
 strncasecmp (const char * __sz1, const char * __sz2, size_t __sizeMaxCompare)
   {return _strnicmp (__sz1, __sz2, __sizeMaxCompare);}

 I wonder if GCC has changed it's behaviour to more closely match C99.
 Clang as a compatibility article about this sort of issue:

 http://clang.llvm.org/compatibility.html#inline

Interesting.  The ways the page suggests as fixes are

 - change it to a statis inline;
 - remove inline from the definition;
 - provide an external (non-inline) def somewhere else;
 - compile with gnu899 dialect.

But the first two are non-starter, and the third one to force
everybody to define an equivalent implementation is nonsense, for a
definition in the standard header file.

I agree with an earlier conclusion that defining our own wrapper
(with an explanation why such a redundant wrapper exists) is the
best course of action at this point, until the system header is
fixed.

 mailmap.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index a7969c4..d36d424 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -52,6 +52,19 @@ static void free_mailmap_entry(void *p, const char *s)
string_list_clear_func(me-namemap, free_mailmap_info);
 }
 
+/*
+ * On some systems, string.h has _only_ inline definition of strcasecmp
+ * without supplying a non-inline implementation anywhere, which is, eh,
+ * unusual; we cannot take an address of such a function to store it in
+ * namemap.cmp.  This is here as a workaround---do not assign strcasecmp
+ * directly to namemap.cmp until we know no systems that matter have such
+ * an unusual string.h.
+ */
+static int namemap_cmp(const char *a, const char *b)
+{
+   return strcasecmp(a, b);
+}
+
 static void add_mapping(struct string_list *map,
char *new_name, char *new_email,
char *old_name, char *old_email)
@@ -75,7 +88,7 @@ static void add_mapping(struct string_list *map,
item = string_list_insert_at_index(map, index, old_email);
me = xcalloc(1, sizeof(struct mailmap_entry));
me-namemap.strdup_strings = 1;
-   me-namemap.cmp = strcasecmp;
+   me-namemap.cmp = namemap_cmp;
item-util = me;
}
 
@@ -237,7 +250,7 @@ int read_mailmap(struct string_list *map, char 
**repo_abbrev)
int err = 0;
 
map-strdup_strings = 1;
-   map-cmp = strcasecmp;
+   map-cmp = namemap_cmp;
 
if (!git_mailmap_blob  is_bare_repository())
git_mailmap_blob = HEAD:.mailmap;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-12 Thread Nicolas Pitre
On Tue, 10 Sep 2013, Duy Nguyen wrote:

 On Mon, Sep 9, 2013 at 10:01 PM, Nicolas Pitre n...@fluxnic.net wrote:
  However this means that the progress meter will now be wrong and that's
  terrible !  Users *will* complain that the meter doesn't reach 100% and
  they'll protest for being denied the remaining objects during the
  transfer !
 
  Joking aside, we should think about doing something about it.  I was
  wondering if some kind of prefix to the pack stream could be inserted
  onto the wire when sending a pack v4.  Something like:
 
  'T', 'H', 'I', 'N', actual_number_of_sent_objects_in_network_order
 
  This 8-byte prefix would simply be discarded by index-pack after being
  parsed.
 
  What do you think?
 
 I have no problem with this. Although I rather we generalize the case
 to support multiple packs in the same stream (in some case the server
 can just stream away one big existing pack, followed by a smaller pack
 of recent updates), where thin is just a special pack that is not
 saved on disk. So except for the signature difference, it should at
 least follow the pack header (sig, version, nr_objects)

Except in this case this is not a separate pack.  This prefix is there 
to provide information that is valid only for the pack to follow and 
therefore cannot be considered as some independent data.


Nicolas
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Karsten Blees
Am 12.09.2013 11:12, schrieb Jiang Xin:
 Tvangeste found that the relative_path function could not work
 properly on Windows if in and prefix have dos driver prefix.
 ($gmane/234434)
 
 e.g., When execute: test-path-utils relative_path C:/a/b D:/x/y,
 should return C:/a/b, but returns ../../C:/a/b, which is wrong.
 
 So make relative_path honor dos_drive_prefix, and add test cases
 for it in t0060.
 

I still don't think that cd'ing through the root is a Good Thing for absolute 
paths, as it is not possible to do so in general.

POSIX says the meaning of '//' is implementation-defined [1].

Cygwin supports //hostname/share/directory/file. You cannot cd to the hostname 
(i.e. root would be //hostname/share).

On Windows, we have drive letters, UNC paths and namespaces:
C:\directory\file
\\hostname\share\directory\file (same as cygwin)
\\?\C:\directory\file
\\?\UNC\hostname\share\directory\file

I'm not sure about '//' support on other git platforms (the most likely 
candidate would probably be HP-UX, as HP bought Apollo/DomainOS, which 
supported '//hostname/directory/file back in 1981).


You could of course handle all these special cases. However, with the POSIX 
definition above, the only reliable and future-proof way to check if we can cd 
out of a path component of an _absolute_ path is to actually try it until we 
get an error. And we probably don't want to do actual IO in relative_path 
(which is pure string manipulation so far).

[1] 
http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap04.html#tag_21_04_12


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Specifying a private key when connecting to a remote SSH repo

2013-09-12 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 We already have GIT_SSH, so I would expect:

   GIT_SSH='ssh -i $HOME/.ssh/id_for_example_com' git push

 to work. But sadly, GIT_SSH does not use the shell, unlike most other
 configure git commands. :(

You read me correctly ;-)

 We could consider it a consistency bug and fix it, though I suspect we
 may be annoying people on Windows who have spaces in their paths.

Again, you read me correctly ;-)

 You could write a credential helper shell script that knows about
 classes of remotes (e.g., selecting an identity file based on the
 hostname), and write only a few lines to cover a large number of hosts.

Yes, but the same trick can be used in $HOME/.ssh/config to let one
entry cover the same large number of hosts, so...

 For example:

   #!/bin/sh
   test $1 = get || exit 0
   while IFS== read key val; do
 test $key = host || continue
 case $val in
   *.example.com) echo sshident=com_key ;;
   *.example.net) echo sshident=net_key ;;
 esac
   done

 But it feels a bit hacky to be using the credential helpers at all for
 ssh connections.

Yeah, perhaps.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Junio C Hamano
Torsten Bögershausen tbo...@web.de writes:

 +static int have_same_root(const char *path1, const char *path2)
 +{
 +int is_abs1, is_abs2;
 +
 +is_abs1 = is_absolute_path(path1);
 +is_abs2 = is_absolute_path(path2);
 +return (is_abs1  is_abs2  !strncasecmp(path1, path2, 1)) ||
^^^
 I wonder: should strncasecmp() be replaced with strncmp_icase() ?

True.

 See dir.c: 
 int strncmp_icase(const char *a, const char *b, size_t count)
 {
   return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
 }
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-12 Thread Nicolas Pitre
On Thu, 12 Sep 2013, Duy Nguyen wrote:

 On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre n...@fluxnic.net wrote:
  On Wed, 11 Sep 2013, Duy Nguyen wrote:
 
  Nico, if you have time you may want to look into this. The result v4
  pack from pack-objects on git.git for me is 35MB (one branch) while
  packv4-create produces 30MB (v2 is 40MB). I don't know why there is
  such a big difference in size. I compared. Ident dict is identical.
  Tree dict is a bit different (some that have same hits are ordered
  differently). Delta chains do not differ much. Many groups of entries
  in the pack are displaced though. I guess I turned a wrong knob or
  something in pack-objects in v4 code..
 
  Will try to have a closer look.
 
 Problem found. I encoded some trees as ref-delta instead of pv4-tree
 :( Something like this brings the size back to packv4-create output
 
 diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
 index f604fa5..3d9ab0e 100644
 --- a/builtin/pack-objects.c
 +++ b/builtin/pack-objects.c
 @@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
   * deltify other objects against, in order to avoid
   * circular deltas.
   */
 - entry-type = entry-in_pack_type;
 + if (pack_version  4)
 + entry-type = entry-in_pack_type;
   entry-delta = base_entry;
   entry-delta_size = entry-size;
   entry-delta_sibling = base_entry-delta_child;

Hmmm... I've folded this fix into your patch touching this area.

This code is becoming rather subtle and messy though.  We'll have to 
find a way to better abstract things.  Especially since object data 
reuse will work only for blobs and tags with packv4.  Commits and trees 
will need adjustments to their indices.


Nicolas
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Johannes Sixt
Am 12.09.2013 16:13, schrieb Torsten Bögershausen:
 On 2013-09-12 11.12, Jiang Xin wrote:
 +static int have_same_root(const char *path1, const char *path2)
 +{
 +int is_abs1, is_abs2;
 +
 +is_abs1 = is_absolute_path(path1);
 +is_abs2 = is_absolute_path(path2);
 +return (is_abs1  is_abs2  !strncasecmp(path1, path2, 1)) ||
^^^
 I wonder: should strncasecmp() be replaced with strncmp_icase() ?

I don't think so: On POSIX, it is irrelevant, because the call will only
compare a slash to a slash. On Windows, it compares the drive letters
(or a slash); it is *always* case-insensitive, even if the volume
mounted is NTFS with case-sensitivity enabled and core.ignorecase is false.

 See dir.c: 
 int strncmp_icase(const char *a, const char *b, size_t count)
 {
   return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
 }

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Junio C Hamano
Thanks, both.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 08:37:08AM -0700, Junio C Hamano wrote:

  I wonder if GCC has changed it's behaviour to more closely match C99.
  Clang as a compatibility article about this sort of issue:
 
  http://clang.llvm.org/compatibility.html#inline
 
 Interesting.  The ways the page suggests as fixes are
 
  - change it to a statis inline;
  - remove inline from the definition;
  - provide an external (non-inline) def somewhere else;
  - compile with gnu899 dialect.

Right, option 3 seems perfectly reasonable to me, as we must be prepared
to cope with a decision not to inline the function, and there has to be
_some_ linked implementation. But shouldn't libc be providing an
external, linkable strcasecmp in this case?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH-v4] Allow git-filter-branch to process large repositories with lots of branches.

2013-09-12 Thread Lee Carver
As noted in several forums, a recommended way to move trees between
repositories
is to use git-filter-branch to revise the history for a single tree:

http://gbayer.com/development/moving-files-from-one-git-repository-to-anoth
er-preserving-history/
http://stackoverflow.com/questions/1365541/how-to-move-files-from-one-git-r
epo-to-another-not-a-clone-preserving-history

However, this can lead to argument list too long errors when the original
repository has many retained branches (6k)

/usr/local/git/libexec/git-core/git-filter-branch: line 270:
/usr/local/git/libexec/git-core/git: Argument list too long
Could not get the commits

Piping the saved output from git rev-parse into git rev-list avoids this
problem, since the rev-parse output is not processed as a command line
argument.

Signed-off-by: Lee Carver lee.car...@servicenow.com
---
 git-filter-branch.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index ac2a005..2091885 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -255,7 +255,7 @@ else
remap_to_ancestor=t
 fi
 
-rev_args=$(git rev-parse --revs-only $@)
+git rev-parse --revs-only $@  ../parse
 
 case $filter_subdir in
 )
@@ -268,7 +268,7 @@ case $filter_subdir in
 esac
 
 git rev-list --reverse --topo-order --default HEAD \
-   --parents --simplify-merges $rev_args $@  ../revs ||
+   --parents --simplify-merges --stdin $@  ../parse  ../revs ||
die Could not get the commits
 commits=$(wc -l ../revs | tr -d  )
 
-- 
1.8.3.2


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Specifying a private key when connecting to a remote SSH repo

2013-09-12 Thread Breck Yunits
Thanks very much for the feedback and implementation suggestions.

 If the only thing you are interested in supporting is a one-shot
 invocation, i.e. giving which identity file to use from the command
 line when you run either git push or git fetch,

Yes, this is the new option that could benefit the most people.

I think this workflow would be very fast and make it very easy to have
1 key per project right where you need it:

```
mkdir project
cd project
ssh-keygen -t rsa -N  -f deploy.key
git init
echo deploy.key*  .gitignore
echo Hello world  readme.md
git add .
git commit -m Initial commit
git remote add origin g...@github.com:breck7/project.git
git push -u origin master -ssh -i deploy.key
```

This probably wouldn't be the option used most frequently, but could
be a neat option to have for both power users and new users.

For power users, I could see this being useful if you have many
projects that all have different keys.

For new users, I could see this is as a quick way to get out of
trouble if you are running into ssh problems.

-Breck


On Thu, Sep 12, 2013 at 8:43 AM, Junio C Hamano gits...@pobox.com wrote:
 Jeff King p...@peff.net writes:

 We already have GIT_SSH, so I would expect:

   GIT_SSH='ssh -i $HOME/.ssh/id_for_example_com' git push

 to work. But sadly, GIT_SSH does not use the shell, unlike most other
 configure git commands. :(

 You read me correctly ;-)

 We could consider it a consistency bug and fix it, though I suspect we
 may be annoying people on Windows who have spaces in their paths.

 Again, you read me correctly ;-)

 You could write a credential helper shell script that knows about
 classes of remotes (e.g., selecting an identity file based on the
 hostname), and write only a few lines to cover a large number of hosts.

 Yes, but the same trick can be used in $HOME/.ssh/config to let one
 entry cover the same large number of hosts, so...

 For example:

   #!/bin/sh
   test $1 = get || exit 0
   while IFS== read key val; do
 test $key = host || continue
 case $val in
   *.example.com) echo sshident=com_key ;;
   *.example.net) echo sshident=net_key ;;
 esac
   done

 But it feels a bit hacky to be using the credential helpers at all for
 ssh connections.

 Yeah, perhaps.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Sep 2013, #03; Wed, 11)

2013-09-12 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 This description is slightly inaccurate since the re-roll. I think it is
 now:

   git config did not provide a way to set or access numbers larger
   than a native int on the platform; it now provides 64-bit signed
   integers on all platforms.

 Not a big deal, but it may make your life easier if this description
 ends up in the merge commit, and then you later try to write
 ReleaseNotes off of it.

Thanks, this helps very much.

If you are not interested in how a Git maintainer works, you can
stop reading.

Otherwise understanding of Documentation/howto/maintain-git.txt may
be necessary to grok the below.

The way I work these days is:

 - Add a ### cut mark to Meta/redo-jch.sh so that topics I want to
   have in 'master' comes before it;

 - Proofread the description on these topics in Meta/whats-cooking.txt;

 - Make a copy of RelNotes to a temporary file and have it in my
   Emacs;

 - On 'master', run Meta/redo-jch.sh -c1 -e, with emacsclient set
   to my EDITOR;

 - Edit the merge log message if necessary (and if I do so,
   whats-cooking needs to be also updated), but do not say C-x #
   yet;

 - Copy that final merge log message to that temporary file (it is
   all in the same Emacs, so this is very simple and easy) to add a
   new entry;

 - Go back to the merge log message and say C-x #, which will go
   back two step in this list, repeatedly processing all the topics.

And then copy the temporary file over to RelNotes.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Jonathan Nieder
Junio C Hamano wrote:

 I think we would want something like below.

Looks good to me, but

 -- 8 --
 Subject: [PATCH] mailmap: work around implementations with pure inline 
 strcasecmp

 On some systems, string.h has _only_ inline definition of strcasecmp

Please specify which system we are talking about: s/some systems/MinGW 4.0/

[...]
 --- a/mailmap.c
 +++ b/mailmap.c
 @@ -52,6 +52,19 @@ static void free_mailmap_entry(void *p, const char *s)
   string_list_clear_func(me-namemap, free_mailmap_info);
  }
  
 +/*
 + * On some systems, string.h has _only_ inline definition of strcasecmp

Likewise.

With or without that change,
Reviewed-by: Jonathan Nieder jrnie...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Thu, Sep 12, 2013 at 08:37:08AM -0700, Junio C Hamano wrote:

  I wonder if GCC has changed it's behaviour to more closely match C99.
  Clang as a compatibility article about this sort of issue:
 
  http://clang.llvm.org/compatibility.html#inline
 
 Interesting.  The ways the page suggests as fixes are
 
  - change it to a statis inline;
  - remove inline from the definition;
  - provide an external (non-inline) def somewhere else;
  - compile with gnu899 dialect.

 Right, option 3 seems perfectly reasonable to me, as we must be prepared
 to cope with a decision not to inline the function, and there has to be
 _some_ linked implementation. But shouldn't libc be providing an
 external, linkable strcasecmp in this case?

That is exactly my point when I said that the third one is nonsense
for a definition in the standard header file.

I think we would want something like below.

-- 8 --
Subject: [PATCH] mailmap: work around implementations with pure inline 
strcasecmp

On some systems, string.h has _only_ inline definition of strcasecmp
without supplying a non-inline implementation anywhere, which is,
eh, unusual; we cannot take an address of such a function to store
it in namemap.cmp.  Work it around by introducing our own level of
indirection.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 mailmap.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index 44614fc..8863e23 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -52,6 +52,19 @@ static void free_mailmap_entry(void *p, const char *s)
string_list_clear_func(me-namemap, free_mailmap_info);
 }
 
+/*
+ * On some systems, string.h has _only_ inline definition of strcasecmp
+ * without supplying a non-inline implementation anywhere, which is, eh,
+ * unusual; we cannot take an address of such a function to store it in
+ * namemap.cmp.  This is here as a workaround---do not assign strcasecmp
+ * directly to namemap.cmp until we know no systems that matter have such
+ * an unusual string.h.
+ */
+static int namemap_cmp(const char *a, const char *b)
+{
+   return strcasecmp(a, b);
+}
+
 static void add_mapping(struct string_list *map,
char *new_name, char *new_email,
char *old_name, char *old_email)
@@ -75,7 +88,7 @@ static void add_mapping(struct string_list *map,
item = string_list_insert_at_index(map, index, old_email);
me = xcalloc(1, sizeof(struct mailmap_entry));
me-namemap.strdup_strings = 1;
-   me-namemap.cmp = strcasecmp;
+   me-namemap.cmp = namemap_cmp;
item-util = me;
}
 
@@ -241,7 +254,7 @@ int read_mailmap(struct string_list *map, char 
**repo_abbrev)
int err = 0;
 
map-strdup_strings = 1;
-   map-cmp = strcasecmp;
+   map-cmp = namemap_cmp;
 
if (!git_mailmap_blob  is_bare_repository())
git_mailmap_blob = HEAD:.mailmap;
-- 
1.8.4-485-gec42fe2


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git-svn fails to initialize with submodule setup, when '.git' is a file and not a directory

2013-09-12 Thread Marc-Antoine Ruel
Repro:
1. git clone --recursive a repository with submodules.
2. cd checkout/submoduleA
3. git svn init svn://host/repo

Expected:
Works as usual

Actual:
/path/to/checkout/submoduleA/.git/refs: Not a directory
init: command returned error: 1

Why:
submoduleA/.git is a file, not a directory.
$ cat /path/to/checkout/submoduleA/.git
gitdir: ../.git/modules/submoduleA

$ git --version
git version 1.8.4

Thanks,

M-A
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] Use simpler relative_path when set_git_dir

2013-09-12 Thread Johannes Sixt
Am 12.09.2013 11:12, schrieb Jiang Xin:
 Using a relative_path as git_dir first appears in v1.5.6-1-g044bbbc.
 It will make git_dir shorter only if git_dir is inside work_tree,
 and this will increase performance. But my last refactor effort on
 relative_path function (commit v1.8.3-rc2-12-ge02ca72) changed that.
 Always use relative_path as git_dir may bring troubles like
 $gmane/234434.
 
 Because new relative_path is a combination of original relative_path
 from path.c and original path_relative from quote.c, so in order to
 restore the origin implementation, save the original relative_path
 to simple_relative_path, and call it in setup.c.
 
 Suggested-by: Karsten Blees karsten.bl...@gmail.com
 Signed-off-by: Jiang Xin worldhello@gmail.com
 ---
  cache.h |  1 +
  path.c  | 45 +
  setup.c |  5 +
  3 files changed, 47 insertions(+), 4 deletions(-)
 
 diff --git a/cache.h b/cache.h
 index 8e42256..ab17f1d 100644
 --- a/cache.h
 +++ b/cache.h
 @@ -738,6 +738,7 @@ const char *real_path(const char *path);
  const char *real_path_if_valid(const char *path);
  const char *absolute_path(const char *path);
  const char *relative_path(const char *in, const char *prefix, struct strbuf 
 *sb);
 +const char *simple_relative_path(const char *in, const char *prefix);
  int normalize_path_copy(char *dst, const char *src);
  int longest_ancestor_length(const char *path, struct string_list *prefixes);
  char *strip_path_suffix(const char *path, const char *suffix);
 diff --git a/path.c b/path.c
 index ffcdea1..0f0c92f 100644
 --- a/path.c
 +++ b/path.c
 @@ -558,6 +558,51 @@ const char *relative_path(const char *in, const char 
 *prefix,
  }
  
  /*
 + * A simpler implementation of relative_path

So we have a heavy-duty function relative_path(), but it is not capable
of doing the simple operations that this function does?

There must be something wrong.

This function were easier to sell if it were named
remove_optional_prefix() or something.

 + *
 + * Get relative path by removing prefix from in. This function
 + * first appears in v1.5.6-1-g044bbbc, and makes git_dir shorter
 + * to increase performance when traversing the path to work_tree.
 + */
 +const char *simple_relative_path(const char *in, const char *prefix)
 +{
 + static char buf[PATH_MAX + 1];
 + int i = 0, j = 0;
 +
 + if (!prefix || !prefix[0])
 + return in;
 + while (prefix[i]) {
 + if (is_dir_sep(prefix[i])) {
 + if (!is_dir_sep(in[j]))
 + return in;
 + while (is_dir_sep(prefix[i]))
 + i++;
 + while (is_dir_sep(in[j]))
 + j++;
 + continue;
 + } else if (in[j] != prefix[i]) {
 + return in;
 + }
 + i++;
 + j++;
 + }
 + if (
 + /* /foo is a prefix of /foo */
 + in[j] 
 + /* /foo is not a prefix of /foobar */
 + !is_dir_sep(prefix[i-1])  !is_dir_sep(in[j])
 +)
 + return in;
 + while (is_dir_sep(in[j]))
 + j++;
 + if (!in[j])
 + strcpy(buf, .);
 + else
 + strcpy(buf, in + j);
 + return buf;
 +}
 ...

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 11:35:21AM -0700, Junio C Hamano wrote:

   - change it to a statis inline;
   - remove inline from the definition;
   - provide an external (non-inline) def somewhere else;
   - compile with gnu899 dialect.
 
  Right, option 3 seems perfectly reasonable to me, as we must be prepared
  to cope with a decision not to inline the function, and there has to be
  _some_ linked implementation. But shouldn't libc be providing an
  external, linkable strcasecmp in this case?
 
 That is exactly my point when I said that the third one is nonsense
 for a definition in the standard header file.

Yes, but I am saying it is the responsibility of libc. IOW, I am
wondering if this particular mingw environment is simply broken, and if
so, what is the status on the fix?  Could another option be to declare
the environment unworkable and tell people to upgrade?

I am not even sure if we are right to call it broken, but talking to the
mingw people might be a good next step, as they will surely have an
opinion. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: git-daemon access-hook race condition

2013-09-12 Thread Eugene Sajine
Hi,


We are serving repos in closed netwrok via git protocol. We are using
git-daemon access hook (thank you very much for such a great feature)
in order to create push notifications for Jenkins.
I.e. upon the push the access-hook is called and then the curl command
is created and executed. As we have several instances of Jenkins, that
we need to notify (three), the execution of the access-hook can take
some time.

Sometimes we have a situation when the whole chain works fine but
Jenkins git plugin doesn't recognize the changes. I think it happens
because we hit a kind of race condition:

1. Incoming push triggers access-hook
2. notify jenkins 1
3. notify jenkins 2
4. jenkins 1 polls repo but sees no changes
5. notify Jenkins 3
6. the push data transfer finishes - consequent pushes will find
changes w/o any problem

The question is:

Is there a way to avoid that?
Is it possible to have access-hook to be executed after receive?
Is it possible to introduce a parameter that would specify if it needs
to be executed before receive or after?

Thanks,
Eugene
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Junio C Hamano
Johannes Sixt j...@kdbg.org writes:

 Am 12.09.2013 16:13, schrieb Torsten Bögershausen:
 On 2013-09-12 11.12, Jiang Xin wrote:
 +static int have_same_root(const char *path1, const char *path2)
 +{
 +   int is_abs1, is_abs2;
 +
 +   is_abs1 = is_absolute_path(path1);
 +   is_abs2 = is_absolute_path(path2);
 +   return (is_abs1  is_abs2  !strncasecmp(path1, path2, 1)) ||
^^^
 I wonder: should strncasecmp() be replaced with strncmp_icase() ?

 I don't think so: On POSIX, it is irrelevant, because the call will only
 compare a slash to a slash. On Windows, it compares the drive letters
 (or a slash); it is *always* case-insensitive, even if the volume
 mounted is NTFS with case-sensitivity enabled and core.ignorecase is false.

Ah, you are right, of course.  We could even do

tolower(path1[0]) == tolower(path2[0])

which might be more explicit.

For systems that need POSIX escape hatch for Apollo Domain ;-), we
would need a bit more work.  When both path1 and path2 begin with a
double-dash, we would need to check if they match up to the next
slash, so that

 - //host1/usr/src and //host1/usr/lib share the same root and the
   former can be made to ../src relative to the latter;

 - //host1/usr/src and //host2/usr/lib are of separate roots.

or something.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Junio C Hamano
Eugene Sajine eugu...@gmail.com writes:

 Is it possible to have access-hook to be executed after receive?

The whole point of access-hook is to allow it to decide whether the
access is allowed or not, so that is a non-starter.

A notification _after_ successful push update is usually done via
the post-receive hook in the receiving repository, I think.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/4] submodule trailing slash improvements

2013-09-12 Thread John Keeping
Changes since v1:

* Improvements to existing pathspec code to use is_dir_sep instead of
  comparing against '/' and handle multiple trailing slashes
* Remove calls to read_cache() made redundant by a new call in
  builtin/reset.c::parse_args()

John Keeping (4):
  pathspec: use is_dir_sep() to check for trailing slashes
  pathspec: strip multiple trailing slashes from submodules
  rm: re-use parse_pathspec's trailing-slash removal
  reset: handle submodule with trailing slash

 builtin/reset.c|  8 ++--
 builtin/rm.c   | 20 
 pathspec.c | 30 +++---
 t/t7400-submodule-basic.sh |  6 --
 4 files changed, 33 insertions(+), 31 deletions(-)

-- 
1.8.4.277.gfbd6843.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/4] reset: handle submodule with trailing slash

2013-09-12 Thread John Keeping
When using tab-completion, a directory path will often end with a
trailing slash which currently confuses git reset when dealing with
submodules.  Now that we have parse_pathspec we can easily handle this
by simply adding the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag.

To do this, we need to move the read_cache() call before the
parse_pathspec() call.  All of the existing paths through cmd_reset()
that do not die early already call read_cache() at some point, so there
is no performance impact to doing this in the common case.

Signed-off-by: John Keeping j...@keeping.me.uk
---
 builtin/reset.c| 8 ++--
 t/t7400-submodule-basic.sh | 6 --
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index 5e4c551..800117f 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -143,7 +143,6 @@ static int read_from_tree(const struct pathspec *pathspec,
opt.output_format = DIFF_FORMAT_CALLBACK;
opt.format_callback = update_index_from_diff;
 
-   read_cache();
if (do_diff_cache(tree_sha1, opt))
return 1;
diffcore_std(opt);
@@ -169,7 +168,7 @@ static void set_reflog_message(struct strbuf *sb, const 
char *action,
 
 static void die_if_unmerged_cache(int reset_type)
 {
-   if (is_merge() || read_cache()  0 || unmerged_cache())
+   if (is_merge() || unmerged_cache())
die(_(Cannot do a %s reset in the middle of a merge.),
_(reset_type_names[reset_type]));
 
@@ -220,8 +219,13 @@ static void parse_args(struct pathspec *pathspec,
}
}
*rev_ret = rev;
+
+   if (read_cache()  0)
+   die(_(index file corrupt));
+
parse_pathspec(pathspec, 0,
   PATHSPEC_PREFER_FULL |
+  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP |
   (patch_mode ? PATHSPEC_PREFIX_ORIGIN : 0),
   prefix, argv);
 }
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 4192fe0..c268d3c 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -481,7 +481,7 @@ test_expect_success 'do not add files from a submodule' '
 
 '
 
-test_expect_success 'gracefully add submodule with a trailing slash' '
+test_expect_success 'gracefully add/reset submodule with a trailing slash' '
 
git reset --hard 
git commit -m commit subproject init 
@@ -495,7 +495,9 @@ test_expect_success 'gracefully add submodule with a 
trailing slash' '
git add init/ 
test_must_fail git diff --exit-code --cached init 
test $commit = $(git ls-files --stage |
-   sed -n s/^16 \([^ ]*\).*/\1/p)
+   sed -n s/^16 \([^ ]*\).*/\1/p) 
+   git reset init/ 
+   git diff --exit-code --cached init
 
 '
 
-- 
1.8.4.277.gfbd6843.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] pathspec: strip multiple trailing slashes from submodules

2013-09-12 Thread John Keeping
This allows us to replace the submodule path trailing slash removal in
builtin/rm.c with the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag to
parse_pathspec() without changing the behaviour with respect to multiple
trailing slashes.

Signed-off-by: John Keeping j...@keeping.me.uk
---
 pathspec.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 7c6963b..11b031a 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -251,12 +251,16 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
item-len = strlen(item-match);
item-prefix = prefixlen;
 
-   if ((flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) 
-   (item-len = 1  is_dir_sep(item-match[item-len - 1])) 
-   (i = cache_name_pos(item-match, item-len - 1)) = 0 
-   S_ISGITLINK(active_cache[i]-ce_mode)) {
-   item-len--;
-   match[item-len] = '\0';
+   if (flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) {
+   size_t pathlen = item-len;
+   while (pathlen  0  is_dir_sep(item-match[pathlen - 1]))
+   pathlen--;
+
+   if ((i = cache_name_pos(item-match, pathlen)) = 0 
+   S_ISGITLINK(active_cache[i]-ce_mode)) {
+   item-len = pathlen;
+   match[item-len] = '\0';
+   }
}
 
if (flags  PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE)
@@ -271,11 +275,14 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
!is_dir_sep(match[ce_len]) ||
memcmp(ce-name, match, ce_len))
continue;
-   if (item-len == ce_len + 1) {
-   /* strip trailing slash */
+
+   while (item-len  0  is_dir_sep(match[item-len - 
1]))
item-len--;
-   match[item-len] = '\0';
-   } else
+
+   /* strip trailing slash */
+   match[item-len] = '\0';
+
+   if (item-len != ce_len)
die (_(Pathspec '%s' is in submodule '%.*s'),
 elt, ce_len, ce-name);
}
-- 
1.8.4.277.gfbd6843.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/4] pathspec: use is_dir_sep() to check for trailing slashes

2013-09-12 Thread John Keeping
This allows us to correctly removing trailing backslashes on Windows
when checking for submodules.

Signed-off-by: John Keeping j...@keeping.me.uk
---
 pathspec.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index ad1a9f5..7c6963b 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -252,7 +252,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
item-prefix = prefixlen;
 
if ((flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) 
-   (item-len = 1  item-match[item-len - 1] == '/') 
+   (item-len = 1  is_dir_sep(item-match[item-len - 1])) 
(i = cache_name_pos(item-match, item-len - 1)) = 0 
S_ISGITLINK(active_cache[i]-ce_mode)) {
item-len--;
@@ -267,7 +267,8 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
if (!S_ISGITLINK(ce-ce_mode))
continue;
 
-   if (item-len = ce_len || match[ce_len] != '/' ||
+   if (item-len = ce_len ||
+   !is_dir_sep(match[ce_len]) ||
memcmp(ce-name, match, ce_len))
continue;
if (item-len == ce_len + 1) {
-- 
1.8.4.277.gfbd6843.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Junio C Hamano
Kyle J. McKay mack...@gmail.com writes:

 So how about this patch instead...

 -- 8 --
 From: Thomas Rast tr...@inf.ethz.ch
 Subject: urlmatch.c: recompute pointer after append_normalized_escapes

 When append_normalized_escapes is called, its internal strbuf_add* calls can
 cause the strbuf's buf to be reallocated changing the value of the buf 
 pointer.

 Do not use the strbuf buf pointer from before any append_normalized_escapes
 calls afterwards.  Instead recompute the needed pointer.

 Signed-off-by: Thomas Rast tr...@inf.ethz.ch
 Signed-off-by: Kyle J. McKay mack...@gmail.com
 ---
  urlmatch.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

 diff --git a/urlmatch.c b/urlmatch.c
 index 1db76c89..01c67467 100644
 --- a/urlmatch.c
 +++ b/urlmatch.c
 @@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info 
 *out_info)
   url_len--;
   }
   for (;;) {
 - const char *seg_start = norm.buf + norm.len;
 + const char *seg_start;
 + size_t seg_start_off = norm.len;
   const char *next_slash = url + strcspn(url, /?#);
   int skip_add_slash = 0;
   /*
* RFC 3689 indicates that any . or .. segments should be
 @@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info 
 *out_info)
   strbuf_release(norm);
   return NULL;
   }
 + /* append_normalized_escapes can cause norm.buf to change */
 + seg_start = norm.buf + seg_start_off;

The change looks good, but I find that this comment is not placed in
the right place.  It is good if the reader knows about an old bug to
put it here, but if the first thing a reader reads is this updated
version, the comment is better placed close to the place where the
start_ofs variable captures the original value (i.e. because the
next call may relocate the buffer, we cannot grab seg_start upfront;
instead we need to record the start_ofs here, and that is what this
variable is about).

It is too minor a point for a reroll, so I'll try to tweak it
locally.  Something like this (but now I think about it, the comment
may not even be necessary).

 urlmatch.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index 01c6746..d1600e2 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
}
for (;;) {
const char *seg_start;
-   size_t seg_start_off = norm.len;
+   size_t seg_start_off;
const char *next_slash = url + strcspn(url, /?#);
int skip_add_slash = 0;
+
+   /*
+* record the starting offset; appending escapes may
+* relocate the buffer, so we cannot capture seg_start
+* upfront and use it later.
+*/
+   seg_start_off = norm.len;
+
/*
 * RFC 3689 indicates that any . or .. segments should be
 * unescaped before being checked for.
@@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct url_info 
*out_info)
strbuf_release(norm);
return NULL;
}
-   /* append_normalized_escapes can cause norm.buf to change */
+
seg_start = norm.buf + seg_start_off;
if (!strcmp(seg_start, .)) {
/* ignore a . segment; be careful not to remove initial 
'/' */
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re-Transmission of blobs?

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 12:35:32PM +0200, Josef Wolf wrote:

 I'm not sure I understand correctly. I see that bitmaps can be used to
 implement set operations. But how comes that walking the graph requires a lot
 of CPU? Isn't it O(n)?

Yes and no. Your n there is the entirety of history. Whereas a simple
git push generally only has to look at the recent history. So even
though you are looking at each commit and tree only once, it's still a
large number of them (and each one needs to be pulled off of the disk,
decompressed, and reconstructed from deltas).

Secondly, the graph traversal ends up seeing the same sha1s over and
over again in tree entries (because most entries in the tree don't
change from commit to commit). We spend a non-trivial amount of time
looking those up in a hash table.

Just try git rev-list --objects --all in your favorite repository to
get a sense. It takes something like 30 seconds in the kernel repo. You
would probably not want to add 30 seconds of CPU time to a trivial push.

 Those bitmaps would be stored in the git metadata? Is it worth it? Storing a
 bitmap for every commit just to be used once-in-a-while seems to be a pretty
 big overhead to me. Not to mention the interoperability problems you mentioned
 below.

There are tricks to make them smaller (run-length compression,
bitmapping a subset of commits and traversing to the nearest one,
storing bitmaps as deltas against nearby bitmaps). And how often it is
used depends on your git workload. For a repository serving git clones
and fetches, it speeds up every operation.

Try starting a clone of:

  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

versus

  git://github.com/torvalds/linux.git

and see which one starts sending you data more quickly.

 Sounds like you're already almost done and don't really need help
 anymore. Just out of curiosity, I'd be interested in a pointer anyway
 ;-)

Shawn gave a talk on JGit here:

  
http://www.eclipsecon.org/2013/sites/eclipsecon.org.2013/files/Scaling%20Up%20JGit%20-%20EclipseCon%202013.pdf

and the scrapped patches for git are here:

  http://article.gmane.org/gmane.comp.version-control.git/228918

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Sebastian Schuberth
On Thu, Sep 12, 2013 at 8:20 PM, Jeff King p...@peff.net wrote:

  I wonder if GCC has changed it's behaviour to more closely match C99.
  Clang as a compatibility article about this sort of issue:
 
  http://clang.llvm.org/compatibility.html#inline

 Interesting.  The ways the page suggests as fixes are

  - change it to a statis inline;
  - remove inline from the definition;
  - provide an external (non-inline) def somewhere else;
  - compile with gnu899 dialect.

 Right, option 3 seems perfectly reasonable to me, as we must be prepared
 to cope with a decision not to inline the function, and there has to be
 _some_ linked implementation. But shouldn't libc be providing an
 external, linkable strcasecmp in this case?

MinGW / GCC is not linking against libc, but against MSVCRT, Visual
Studio's C runtime. And in fact MSVCRT has a non-inline implementation
of a case-insensitive string comparison for up to the first n
characters; it just happens to be called _strnicmp, not
strncasecmp. Which is why I still think just having a #define
strncasecmp _strnicmp is the most elegant solution to the problem.
And that's exactly what defining __NO_INLINE__ does. Granted, defining
__NO_INLINE__ in the scope of string.h will also add a #define
strcasecmp _stricmp; but despite it's name, defining __NO_INLINE__
does not imply a performance hit due to functions not being inlined
because it's just the strncasecmp wrapper around _strnicmp that's
being inlined, not _strnicmp itself.

-- 
Sebastian Schuberth
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/4] rm: re-use parse_pathspec's trailing-slash removal

2013-09-12 Thread John Keeping
Instead of re-implementing the remove trailing slashes loop in
builtin/rm.c just pass PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP to
parse_pathspec.

Signed-off-by: John Keeping j...@keeping.me.uk
---
 builtin/rm.c | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 9b59ab3..3a0e0ea 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -298,22 +298,10 @@ int cmd_rm(int argc, const char **argv, const char 
*prefix)
if (read_cache()  0)
die(_(index file corrupt));
 
-   /*
-* Drop trailing directory separators from directories so we'll find
-* submodules in the index.
-*/
-   for (i = 0; i  argc; i++) {
-   size_t pathlen = strlen(argv[i]);
-   if (pathlen  is_dir_sep(argv[i][pathlen - 1]) 
-   is_directory(argv[i])) {
-   do {
-   pathlen--;
-   } while (pathlen  is_dir_sep(argv[i][pathlen - 1]));
-   argv[i] = xmemdupz(argv[i], pathlen);
-   }
-   }
-
-   parse_pathspec(pathspec, 0, PATHSPEC_PREFER_CWD, prefix, argv);
+   parse_pathspec(pathspec, 0,
+  PATHSPEC_PREFER_CWD |
+  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
+  prefix, argv);
refresh_index(the_index, REFRESH_QUIET, pathspec, NULL, NULL);
 
seen = xcalloc(pathspec.nr, 1);
-- 
1.8.4.277.gfbd6843.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/4] pathspec: strip multiple trailing slashes from submodules

2013-09-12 Thread John Keeping
On Thu, Sep 12, 2013 at 12:48:10PM -0700, Junio C Hamano wrote:
 John Keeping j...@keeping.me.uk writes:
 
  This allows us to replace the submodule path trailing slash removal in
  builtin/rm.c with the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag to
  parse_pathspec() without changing the behaviour with respect to multiple
  trailing slashes.
 
 Where does prefix_pathspec()'s input, which could have an unwanted
 trailing slash, come from?
 
 If it is read from some of our internal data structure and known to
 have at most one, then this change makes me feel very uneasy to cope
 with potentially sloppy end-user input and data generated by ourselves
 with the same logic.  It will allow our internal to be sloppy without
 forcing us notice and fix that sloppiness.
 
 If it is coming from an end-user input, then I would not object to
 the change, though.

I added this in response to Duy's comment on v1 [1].

[1] http://article.gmane.org/gmane.comp.version-control.git/234548

Looking more closely, this does come from user input (via the argv
passed into parse_pathspec) but does (some of the time) go through
prefix_path_gently which calls normalize_path_copy_len.

It's not immediately clear to me when prefix_pathspec goes through this
particular code path, but I think we may be able to drop this (and the
previous patch) without affecting the user.

  Signed-off-by: John Keeping j...@keeping.me.uk
  ---
   pathspec.c | 27 +--
   1 file changed, 17 insertions(+), 10 deletions(-)
 
  diff --git a/pathspec.c b/pathspec.c
  index 7c6963b..11b031a 100644
  --- a/pathspec.c
  +++ b/pathspec.c
  @@ -251,12 +251,16 @@ static unsigned prefix_pathspec(struct pathspec_item 
  *item,
  item-len = strlen(item-match);
  item-prefix = prefixlen;
   
  -   if ((flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) 
  -   (item-len = 1  is_dir_sep(item-match[item-len - 1])) 
  -   (i = cache_name_pos(item-match, item-len - 1)) = 0 
  -   S_ISGITLINK(active_cache[i]-ce_mode)) {
  -   item-len--;
  -   match[item-len] = '\0';
  +   if (flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) {
  +   size_t pathlen = item-len;
  +   while (pathlen  0  is_dir_sep(item-match[pathlen - 1]))
  +   pathlen--;
  +
  +   if ((i = cache_name_pos(item-match, pathlen)) = 0 
  +   S_ISGITLINK(active_cache[i]-ce_mode)) {
  +   item-len = pathlen;
  +   match[item-len] = '\0';
  +   }
  }
   
  if (flags  PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE)
  @@ -271,11 +275,14 @@ static unsigned prefix_pathspec(struct pathspec_item 
  *item,
  !is_dir_sep(match[ce_len]) ||
  memcmp(ce-name, match, ce_len))
  continue;
  -   if (item-len == ce_len + 1) {
  -   /* strip trailing slash */
  +
  +   while (item-len  0  is_dir_sep(match[item-len - 
  1]))
  item-len--;
  -   match[item-len] = '\0';
  -   } else
  +
  +   /* strip trailing slash */
  +   match[item-len] = '\0';
  +
  +   if (item-len != ce_len)
  die (_(Pathspec '%s' is in submodule '%.*s'),
   elt, ce_len, ce-name);
  }
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 09:46:51PM +0200, Sebastian Schuberth wrote:

  Right, option 3 seems perfectly reasonable to me, as we must be prepared
  to cope with a decision not to inline the function, and there has to be
  _some_ linked implementation. But shouldn't libc be providing an
  external, linkable strcasecmp in this case?
 
 MinGW / GCC is not linking against libc, but against MSVCRT, Visual
 Studio's C runtime. And in fact MSVCRT has a non-inline implementation
 of a case-insensitive string comparison for up to the first n
 characters; it just happens to be called _strnicmp, not
 strncasecmp. Which is why I still think just having a #define
 strncasecmp _strnicmp is the most elegant solution to the problem.
 And that's exactly what defining __NO_INLINE__ does. Granted, defining
 __NO_INLINE__ in the scope of string.h will also add a #define
 strcasecmp _stricmp; but despite it's name, defining __NO_INLINE__
 does not imply a performance hit due to functions not being inlined
 because it's just the strncasecmp wrapper around _strnicmp that's
 being inlined, not _strnicmp itself.

Ah, thanks, that explains what is going on. I do think the environment
is probably in violation of C99, but I dug in the mingw history, and it
looks like it has been this way for over 10 years.

So it is probably worth working around, but it would be nice if the
damage could be contained to just the affected platform.

I think there are basically three classes of solution:

  1. Declare __NO_INLINE__ everywhere. I'd worry this might affect other
 environments, who would then not inline and lose performance (but
 since it's a non-standard macro, we don't really know what it will
 do in other places; possibly nothing).

  2. Declare __NO_INLINE__ on mingw. Similar to above, but we know it
 only affects mingw, and we know the meaning of NO_INLINE there.

  3. Try to impact only the uses as a function pointer (e.g., by using
 a wrapper function as suggested in the thread).

Your patch does (1), I believe. Junio's patch does (3), but is a
maintenance burden in that any new callsites will need to remember to do
the same trick.

But your argument (and reading the mingw header, I agree) is that there
is no performance difference at all between (2) and (3). And (2) does
not have the maintenance burden. So it does seem like the right path to
me.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re-Transmission of blobs?

2013-09-12 Thread Jeff King
On Thu, Sep 12, 2013 at 12:45:44PM +, Pyeron, Jason J CTR (US) wrote:

 If the rules of engagement are change a bit, the server side can be release 
 from most of its work (CPU/IO).
 
 Client does the following, looping as needed:
 
 Heads=server-heads();
 KnownCommits=Local-AllCommits();
 Missingblobs=[];
 Foreach(commit:heads) if (!knownCommits-contains(commit)) 
 MissingBlobs[]=commit;
 Foreach(commit:knownCommit) if (!commit-isValid()) 
 MissingBlobs[]=commit-blobs();
 If (missingBlobs-size()0) server-FetchBlobs(missingBlobs);

That doesn't quite work. The client does not know the set of missing
objects just from the commits. It knows the sha1 of the root trees it is
missing. And then if it fetches those, it knows the sha1 of any
top-level entries it is missing. And when it gets those, it knows the
sha1 of any 2nd-level entries it is missing, and so forth.

You can progressively ask for each level, but:

  1. You are spending a round-trip for each request. Doing it per-object
 is awful (the dumb http walker will do this if the repo is not
 packed, and it's S-L-O-W). Doing it per-level would be better, but
 not great.

  2. You are losing opportunities for deltas (or you are making the
 state the server needs to maintain very complicated, as it must
 remember from request to request which objects you have gotten that
 can be used as delta bases).

  3. There is a lot of overhead in this protocol. The client has to
 mention each object individually by sha1. It may not seem like a
 lot, but it can easily add 10% to a clone (just look at the size of
 the pack .idx files versus the packfiles themselves).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Junio C Hamano
Sebastian Schuberth sschube...@gmail.com writes:

 I'm not too happy with the wording either. As I see it, even on MinGW
 runtime version 4.0 it's not true that string.h has _only_ inline
 definition of strcasecmp; there's also #define strncasecmp
 _strnicmp which effectively provides a non-inline definition of
 strncasecmp aka _strnicmp.

I do not get this part.  Sure, string.h would have definitions of
things other than strcasecmp, such as strncasecmp.  So what?

Does it effectively provide a non-inline definition of strcasecmp?

Perhaps the real issue is that the header file does not give an
equivalent those who want to take the address of strcasecmp will
get the address of _stricmp instead macro, e.g.

#define strcasecmp _stricmp

or something?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] pathspec: use is_dir_sep() to check for trailing slashes

2013-09-12 Thread Johannes Sixt
Am 12.09.2013 21:24, schrieb John Keeping:
 This allows us to correctly removing trailing backslashes on Windows
 when checking for submodules.
 
 Signed-off-by: John Keeping j...@keeping.me.uk
 ---
  pathspec.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)
 
 diff --git a/pathspec.c b/pathspec.c
 index ad1a9f5..7c6963b 100644
 --- a/pathspec.c
 +++ b/pathspec.c
 @@ -252,7 +252,7 @@ static unsigned prefix_pathspec(struct pathspec_item 
 *item,
   item-prefix = prefixlen;
  
   if ((flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) 
 - (item-len = 1  item-match[item-len - 1] == '/') 
 + (item-len = 1  is_dir_sep(item-match[item-len - 1])) 
   (i = cache_name_pos(item-match, item-len - 1)) = 0 
   S_ISGITLINK(active_cache[i]-ce_mode)) {
   item-len--;
 @@ -267,7 +267,8 @@ static unsigned prefix_pathspec(struct pathspec_item 
 *item,
   if (!S_ISGITLINK(ce-ce_mode))
   continue;
  
 - if (item-len = ce_len || match[ce_len] != '/' ||
 + if (item-len = ce_len ||
 + !is_dir_sep(match[ce_len]) ||
   memcmp(ce-name, match, ce_len))
   continue;
   if (item-len == ce_len + 1) {

A design decisions to keep in mind:

Paths in the index *ALWAYS* use the slash, even on Windows. On Windows,
pathspec that are user input must undergo backslash-to-slash
transformation at a very early stage so that later processing that
compares the user input to index contents need not do it on the fly. The
backslash-to-slash transformation used to happen in get_pathspec() via
prefix_path() and normalize_path_copy().

If, at this point, the contents of 'match' is still being parsed for
pathspec magic, then it is likely correct to use is_dir_sep().

On the other hand, if at this point the contents of 'match' are used to
execute pathspec magic, then it is not correct to use is_dir_sep(); the
conversion of backslash to slash should have happened earlier, and no
backslashes should be present anymore.

(Yes, this means that on Windows we cannot escape glob characters
because, e.g., 'a\*.c' was turned into 'a/*.c'.)

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Re-Transmission of blobs?

2013-09-12 Thread Pyeron, Jason J CTR (US)
 -Original Message-
 From: Jeff King
 Sent: Thursday, September 12, 2013 3:57 PM
 
 On Thu, Sep 12, 2013 at 12:45:44PM +, Pyeron, Jason J CTR (US)
 wrote:
 
  If the rules of engagement are change a bit, the server side can be
 release from most of its work (CPU/IO).
 
  Client does the following, looping as needed:
 
  Heads=server-heads();
  KnownCommits=Local-AllCommits();
  Missingblobs=[];
  Foreach(commit:heads) if (!knownCommits-contains(commit))
 MissingBlobs[]=commit;
  Foreach(commit:knownCommit) if (!commit-isValid())
 MissingBlobs[]=commit-blobs();
  If (missingBlobs-size()0) server-FetchBlobs(missingBlobs);
 
 That doesn't quite work. The client does not know the set of missing

looping as needed

 objects just from the commits. It knows the sha1 of the root trees it
 is
 missing. And then if it fetches those, it knows the sha1 of any
 top-level entries it is missing. And when it gets those, it knows the
 sha1 of any 2nd-level entries it is missing, and so forth.
 
 You can progressively ask for each level, but:
 
   1. You are spending a round-trip for each request. Doing it per-
 object
  is awful (the dumb http walker will do this if the repo is not
  packed, and it's S-L-O-W). Doing it per-level would be better, but
  not great.

Yes, but it is those awfully slow connections (slower that the looping issue) 
which happen to always drop while cloning from our office. And the round trip 
should be mitigated by http-keep-alives.

 
   2. You are losing opportunities for deltas (or you are making the
  state the server needs to maintain very complicated, as it must
  remember from request to request which objects you have gotten
 that
  can be used as delta bases).

But, again if the connection drops, we have already lost the delta advantage. I 
would think the scenario would go like this:

git clone url://blah/blah
[fail]
cd blah
git clone --resume #uses normal methods
[fail]
while ! git clone --resume --HitItWithAStick

replace clone with fetch for that use case too

 
   3. There is a lot of overhead in this protocol. The client has to
  mention each object individually by sha1. It may not seem like a
  lot, but it can easily add 10% to a clone (just look at the size
 of
  the pack .idx files versus the packfiles themselves).

But if it finishes in a week, it is a lot better than it never, ever finishes.

I draw attention to the time I had to download DB2 UDB in Thailand via 28k 
modem. It was resumable with wget, if it were not, it would have required the 
use of a plane to sneaker net it back.




smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH v2 2/4] pathspec: strip multiple trailing slashes from submodules

2013-09-12 Thread Junio C Hamano
John Keeping j...@keeping.me.uk writes:

 This allows us to replace the submodule path trailing slash removal in
 builtin/rm.c with the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag to
 parse_pathspec() without changing the behaviour with respect to multiple
 trailing slashes.

Where does prefix_pathspec()'s input, which could have an unwanted
trailing slash, come from?

If it is read from some of our internal data structure and known to
have at most one, then this change makes me feel very uneasy to cope
with potentially sloppy end-user input and data generated by ourselves
with the same logic.  It will allow our internal to be sloppy without
forcing us notice and fix that sloppiness.

If it is coming from an end-user input, then I would not object to
the change, though.

Thanks.

 Signed-off-by: John Keeping j...@keeping.me.uk
 ---
  pathspec.c | 27 +--
  1 file changed, 17 insertions(+), 10 deletions(-)

 diff --git a/pathspec.c b/pathspec.c
 index 7c6963b..11b031a 100644
 --- a/pathspec.c
 +++ b/pathspec.c
 @@ -251,12 +251,16 @@ static unsigned prefix_pathspec(struct pathspec_item 
 *item,
   item-len = strlen(item-match);
   item-prefix = prefixlen;
  
 - if ((flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) 
 - (item-len = 1  is_dir_sep(item-match[item-len - 1])) 
 - (i = cache_name_pos(item-match, item-len - 1)) = 0 
 - S_ISGITLINK(active_cache[i]-ce_mode)) {
 - item-len--;
 - match[item-len] = '\0';
 + if (flags  PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) {
 + size_t pathlen = item-len;
 + while (pathlen  0  is_dir_sep(item-match[pathlen - 1]))
 + pathlen--;
 +
 + if ((i = cache_name_pos(item-match, pathlen)) = 0 
 + S_ISGITLINK(active_cache[i]-ce_mode)) {
 + item-len = pathlen;
 + match[item-len] = '\0';
 + }
   }
  
   if (flags  PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE)
 @@ -271,11 +275,14 @@ static unsigned prefix_pathspec(struct pathspec_item 
 *item,
   !is_dir_sep(match[ce_len]) ||
   memcmp(ce-name, match, ce_len))
   continue;
 - if (item-len == ce_len + 1) {
 - /* strip trailing slash */
 +
 + while (item-len  0  is_dir_sep(match[item-len - 
 1]))
   item-len--;
 - match[item-len] = '\0';
 - } else
 +
 + /* strip trailing slash */
 + match[item-len] = '\0';
 +
 + if (item-len != ce_len)
   die (_(Pathspec '%s' is in submodule '%.*s'),
elt, ce_len, ce-name);
   }
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] lookup_object: remove hashtable_index() and optimize hash_obj()

2013-09-12 Thread Junio C Hamano
Nicolas Pitre n...@fluxnic.net writes:

 Maybe it's worth squashing in one or both of the comments below as a
 warning to anybody who tries to tweak it.

 Agreed.

 @Junio: are you willing to squash those in, or do you prefer a resent?

I think I've queued it ready to be squashed.  No need for resend.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Kyle J. McKay

On Sep 12, 2013, at 11:30, Junio C Hamano wrote:


+   /* append_normalized_escapes can cause norm.buf to change */
+   seg_start = norm.buf + seg_start_off;


The change looks good, but I find that this comment is not placed in
the right place.  It is good if the reader knows about an old bug to
put it here, but if the first thing a reader reads is this updated
version, the comment is better placed close to the place where the
start_ofs variable captures the original value (i.e. because the
next call may relocate the buffer, we cannot grab seg_start upfront;
instead we need to record the start_ofs here, and that is what this
variable is about).

It is too minor a point for a reroll, so I'll try to tweak it
locally.  Something like this (but now I think about it, the comment
may not even be necessary).


The longer comment looks good to me.  If you think the code will be  
safe from
simplification patches without a comment, that works for me too.  I've  
just seen
so many simplification patches go by on the list I'm concerned it  
will be a

target otherwise leading to re-introduction of the problem.


diff --git a/urlmatch.c b/urlmatch.c
index 01c6746..d1600e2 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct  
url_info *out_info)

}
for (;;) {
const char *seg_start;
-   size_t seg_start_off = norm.len;
+   size_t seg_start_off;
const char *next_slash = url + strcspn(url, /?#);
int skip_add_slash = 0;
+
+   /*
+* record the starting offset; appending escapes may
+* relocate the buffer, so we cannot capture seg_start
+* upfront and use it later.
+*/
+   seg_start_off = norm.len;
+
/*
 * RFC 3689 indicates that any . or .. segments should be
 * unescaped before being checked for.
@@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct  
url_info *out_info)

strbuf_release(norm);
return NULL;
}
-   /* append_normalized_escapes can cause norm.buf to change */
+
seg_start = norm.buf + seg_start_off;
if (!strcmp(seg_start, .)) {
/* ignore a . segment; be careful not to remove initial 
'/' */


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I think there are basically three classes of solution:

   1. Declare __NO_INLINE__ everywhere. I'd worry this might affect other
  environments, who would then not inline and lose performance (but
  since it's a non-standard macro, we don't really know what it will
  do in other places; possibly nothing).

   2. Declare __NO_INLINE__ on mingw. Similar to above, but we know it
  only affects mingw, and we know the meaning of NO_INLINE there.

   3. Try to impact only the uses as a function pointer (e.g., by using
  a wrapper function as suggested in the thread).

 Your patch does (1), I believe. Junio's patch does (3), but is a
 maintenance burden in that any new callsites will need to remember to do
 the same trick.

 But your argument (and reading the mingw header, I agree) is that there
 is no performance difference at all between (2) and (3). And (2) does
 not have the maintenance burden. So it does seem like the right path to
 me.

Agreed.  If that #define __NO_INLINE__ does not appear in the common
part of our header files like git-compat-util.h but is limited to
somewhere in compat/, that would be the perfect outcome.

Thanks, both.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Eugene Sajine
On Thu, Sep 12, 2013 at 3:15 PM, Junio C Hamano gits...@pobox.com wrote:
 Eugene Sajine eugu...@gmail.com writes:

 Is it possible to have access-hook to be executed after receive?

 The whole point of access-hook is to allow it to decide whether the
 access is allowed or not, so that is a non-starter.

 A notification _after_ successful push update is usually done via
 the post-receive hook in the receiving repository, I think.


Junio,

Thanks for the reply!

This is interesting: i always thought about the access-hook as
something to be executed when the repo is accessed, not just
verification if access is allowed - your definition is much more
limiting.

we have about 1400 bare repos - so i would like to avoid the
configuration of each one of them. I could probably find a way to
automate it, but already having access-hook in current implementation
makes me reluctant to go this way, because it is so much easier to use
centralized manager.

So are you really sure that it is a non-starter to have
--before-service/--after-service options for access-hook?

Thanks,
Eugene
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] lookup_object: remove hashtable_index() and optimize hash_obj()

2013-09-12 Thread Nicolas Pitre
On Wed, 11 Sep 2013, Jeff King wrote:

 On Tue, Sep 10, 2013 at 06:17:12PM -0400, Nicolas Pitre wrote:
 
  Also remove the modulus as this is an expansive operation.
  The size argument is always a power of 2 anyway, so a simple
  mask operation provides the same result.
  
  On a 'git rev-list --all --objects' run this decreased the time spent
  in lookup_object from 27.5% to 24.1%.
 
 Nice. This is a tiny bit subtle, though, as the power-of-2 growth
 happens elsewhere, and we may want to tweak it later (the decorate.c
 hash, for example, grows by 3/2).
 
 Maybe it's worth squashing in one or both of the comments below as a
 warning to anybody who tries to tweak it.

Agreed.

@Junio: are you willing to squash those in, or do you prefer a resent?

 ---
 diff --git a/object.c b/object.c
 index e2dae22..5f792cb 100644
 --- a/object.c
 +++ b/object.c
 @@ -47,6 +47,7 @@ static unsigned int hash_obj(const unsigned char *sha1, 
 unsigned int n)
  {
   unsigned int hash;
   memcpy(hash, sha1, sizeof(unsigned int));
 + /* Assumes power-of-2 hash sizes in grow_object_hash */
   return hash  (n - 1);
  }
  
 @@ -94,6 +95,10 @@ static void grow_object_hash(void)
  static void grow_object_hash(void)
  {
   int i;
 + /*
 +  * Note that this size must always be power-of-2 to match hash_obj
 +  * above.
 +  */
   int new_hash_size = obj_hash_size  32 ? 32 : 2 * obj_hash_size;
   struct object **new_hash;
  
 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] remote-bzr: reuse bzrlib transports when possible

2013-09-12 Thread Junio C Hamano
Richard Hansen rhan...@bbn.com writes:

 Ping?  I'd like to merge fc/contrib-bzr.hg-fixes topic to 'next'
 (and fast track it to 'master' after that), and it would be helpful
 to get an Ack on the conflict resolution I have.

 Sorry for the delay.

 Looks good to me, and the tests still pass.

Thanks.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Junio C Hamano
Eugene Sajine eugu...@gmail.com writes:

 So are you really sure that it is a non-starter to have
 --before-service/--after-service options for access-hook?

Given the definition of --access-hook in git help daemon:

--access-hook=path::
Every time a client connects, first run an external command
specified by the path ... The external command can decide
to decline the service by exiting with a non-zero status (or
to allow it by exiting with a zero status)

There is *NO* way in anywhere --after-service makes any sense (and
by definition --before-service is redundant).

What you _could_ propose is to define a *new* hook that is run when
the spawned service has returned, with the same information that is
fed to the access hook (possibly with its exit status).

I do not offhand know if we retain the original service information
that long after the main daemon process has spawned the service
process, though.  With the current system, the only thing it needs
to know is the PID of the service processes that are to be culled by
calls to waitpid().  So you may have to extend existing bookkeeping
data structures a bit to keep those pieces of information around if
you wanted to add such a new hook.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Eugene Sajine
Junio,

Thanks for the clarification! Your solution does look better.

For now though i think i will have to delay the notification somehow
and let the service finish first then notify the server.

Thanks again!

Eugene


On Thu, Sep 12, 2013 at 5:08 PM, Junio C Hamano gits...@pobox.com wrote:
 Eugene Sajine eugu...@gmail.com writes:

 So are you really sure that it is a non-starter to have
 --before-service/--after-service options for access-hook?

 Given the definition of --access-hook in git help daemon:

 --access-hook=path::
 Every time a client connects, first run an external command
 specified by the path ... The external command can decide
 to decline the service by exiting with a non-zero status (or
 to allow it by exiting with a zero status)

 There is *NO* way in anywhere --after-service makes any sense (and
 by definition --before-service is redundant).

 What you _could_ propose is to define a *new* hook that is run when
 the spawned service has returned, with the same information that is
 fed to the access hook (possibly with its exit status).

 I do not offhand know if we retain the original service information
 that long after the main daemon process has spawned the service
 process, though.  With the current system, the only thing it needs
 to know is the PID of the service processes that are to be culled by
 calls to waitpid().  So you may have to extend existing bookkeeping
 data structures a bit to keep those pieces of information around if
 you wanted to add such a new hook.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] remote-bzr: reuse bzrlib transports when possible

2013-09-12 Thread Richard Hansen
On 2013-09-10 18:01, Junio C Hamano wrote:
 Junio C Hamano gits...@pobox.com writes:
 
 Richard Hansen rhan...@bbn.com writes:

  def do_export(parser):
 -global parsed_refs, dirname
 +global parsed_refs, dirname, transports

 As this has been acked by Felipe who knows the script the best, I'll
 apply this directly to 'master'.

 These additions of global transports however have trivial
 interactions with fc/contrib-bzr-hg-fixes topic Felipe posted
 earlier, which I was planning to start merging down to 'next' and
 then to 'master'.  Most funcions merely use the variable without
 assigning, so global transports can be removed, in line with the
 spirit of 641a2b5b (remote-helpers: cleanup more global variables,
 2013-08-28), except for the obvious initialisation in main(), I
 think.  Please double check the conflict resolution result in a
 commit on 'pu' with

 git show 'origin/pu^{/Merge fc/contrib-bzr}'

 when I push the result out.

 Thanks.
 
 Ping?  I'd like to merge fc/contrib-bzr.hg-fixes topic to 'next'
 (and fast track it to 'master' after that), and it would be helpful
 to get an Ack on the conflict resolution I have.

Sorry for the delay.

Looks good to me, and the tests still pass.

-Richard

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Jonathan Nieder
Sebastian Schuberth wrote:

 And that's exactly what defining __NO_INLINE__ does. Granted, defining
 __NO_INLINE__ in the scope of string.h will also add a #define
 strcasecmp _stricmp; but despite it's name, defining __NO_INLINE__
 does not imply a performance hit due to functions not being inlined
 because it's just the strncasecmp wrapper around _strnicmp that's
 being inlined, not _strnicmp itself.

What I don't understand is why the header doesn't use static inline
instead of extern inline.  The former would seem to be better in
every way for this particular use case.

See also http://www.greenend.org.uk/rjk/tech/inline.html, section
GNU C inline rules.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-compat-util: Avoid strcasecmp() being inlined

2013-09-12 Thread Jonathan Nieder
Sebastian Schuberth wrote:

 I'm not too happy with the wording either. As I see it, even on MinGW
 runtime version 4.0 it's not true that string.h has _only_ inline
 definition of strcasecmp; there's also #define strncasecmp
 _strnicmp

I assume you mean #define strcasecmp _stricmp, which is guarded by
defined(__NO_INLINE__).  I think what Junio meant is that by default
(i.e., in the !defined(__NO_INLINE__) case) string.h uses
__CRT_INLINE, defined as

extern inline __attribute__((__gnu_inline__))

to suppress the non-inline function definition.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Converting repo from HG, `git filter-branch --prune-empty -- --all` is extremely slow and errors out.

2013-09-12 Thread John Gietzen
Background:
Windows, git version 1.8.3.msysgit.0
bare repo, 54k commits after migration from HG
git filter-branch --prune-empty -- --all

I'm trying to clean up our repository after migrating it from HG.  I'm running 
the filter-branch command listed above in an effort to clean up all of garbage 
commits that HG required (closing branch commits and their ilk).

From my past experience, git filter-branch is extremely quick when using 
simple filters, like env-filter, since it doesn't have to touch the working 
dir.  However, in our case each revision is taking 1-3 seconds; our entire 
repo will take 30 hours to clean up at this rate.  Normally, this wouldn't be 
a problem, except that we are getting sh.exe couldn't start errors after 
anywhere between the 5000th and 6000th rewritten commit.  Filter-branch 
doesn't have support for picking up where it left off, so we are entirely 
unable to clean up our repo. 

All that being said, I have 3 questions:
  1.  Is there anything I can do to speed up the filter-branch command? 
(Alternatively, is there a way I can profile git-filter-branch.sh on msysgit?)
  2.  Any idea why sh.exe would fail?
  3.  Is there a way I can resume the filter-branch command when/if it fails?  
(Alternatively, is there a way I can do the filter-branch in pieces and 
efficiently rebase... or something?)

I have already had to modify git-filter-branch.sh myself (to support the 
immense number of refs we are rewriting), so I'm comfortable with that.

Any help you can offer would be appreciated.
 
Thanks in advance,
John Gietzen

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Jonathan Nieder
Kyle J. McKay wrote:

 The longer comment looks good to me.  If you think the code will be safe from
 simplification patches without a comment, that works for me too.

I think if we can't trust reviewers to catch this kind of thing, we're
in trouble (i.e., moving too fast). :)

So FWIW my instinct is to leave the comment out, since I actually find
it more readable that way (otherwise I would wonder, Why am I being
told that a strbuf's buffer has a nonconstant address?  Do some other
strbufs have a constant address or something?)

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Eugene Sajine
On Thu, Sep 12, 2013 at 5:16 PM, Eugene Sajine eugu...@gmail.com wrote:
 Junio,

 Thanks for the clarification! Your solution does look better.

 For now though i think i will have to delay the notification somehow
 and let the service finish first then notify the server.

 Thanks again!

 Eugene


 On Thu, Sep 12, 2013 at 5:08 PM, Junio C Hamano gits...@pobox.com wrote:
 Eugene Sajine eugu...@gmail.com writes:

 So are you really sure that it is a non-starter to have
 --before-service/--after-service options for access-hook?

 Given the definition of --access-hook in git help daemon:

 --access-hook=path::
 Every time a client connects, first run an external command
 specified by the path ... The external command can decide
 to decline the service by exiting with a non-zero status (or
 to allow it by exiting with a zero status)

 There is *NO* way in anywhere --after-service makes any sense (and
 by definition --before-service is redundant).

 What you _could_ propose is to define a *new* hook that is run when
 the spawned service has returned, with the same information that is
 fed to the access hook (possibly with its exit status).

 I do not offhand know if we retain the original service information
 that long after the main daemon process has spawned the service
 process, though.  With the current system, the only thing it needs
 to know is the PID of the service processes that are to be culled by
 calls to waitpid().  So you may have to extend existing bookkeeping
 data structures a bit to keep those pieces of information around if
 you wanted to add such a new hook.



For now I'm trying to do the following:

access-hook.bash has:

delayed-notify.bash $@ 

delayed-notify.bash has:

sleep 10
...
curl ...

I'm expecting access-hook to spawn new process and return without
waiting for it to finish to let the service to do its job. But when i
do push - it sleeps for 10 seconds anyway. Am i missing something
obvious here?

Any help is much appreciated!

Thanks,
Eugene
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 Kyle J. McKay wrote:

 The longer comment looks good to me.  If you think the code will be safe from
 simplification patches without a comment, that works for me too.

 I think if we can't trust reviewers to catch this kind of thing, we're
 in trouble (i.e., moving too fast). :)

 So FWIW my instinct is to leave the comment out, since I actually find
 it more readable that way (otherwise I would wonder, Why am I being
 told that a strbuf's buffer has a nonconstant address?  Do some other
 strbufs have a constant address or something?)

Yeah, I was staring at that message and coming to the same
conclusion.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread Richard Hansen
The Developer's Certificate of Origin refers to the open source
license indicated in the file, but there is no such indication in
most files in the Git repository.

Update the text to indicate that the license in COPYING should be
assumed if a file doesn't excplicitly indicate which license applies
to the file.

The phrase accompanies the file was chosen to support different
default licenses in different subdirectories (e.g., 2-clause BSD for
vcs-svn/*, LGPL2.1+ for xdiff/*).

Signed-off-by: Richard Hansen rhan...@bbn.com
---
I'm bringing this up because, to this layman's eyes, it seems like a
potentially troublesome oversight.  IIUC, one of the purposes of the
Developer's Certificate of Origin is to make it easy for developers to
declare which license covers a contribution.  Requiring a license
declaration protects the project and its users from copyright
litigation.

What happens if the file(s) being modified do not indicate which
license applies to the file?  Is there no license?  Does it default to
the main project license in COPYING?  This lack of clarity makes me a
bit nervous (law is already too nondeterministic for my liking), so
I'd like to see a change that makes it explicit.

Notes:
  * I am not a lawyer.  (Maybe a lawyer should be consulted?)
  * This change might not be necessary.
  * This change might be wrong.
  * I hope I'm not just wasting everyone's time by bringing this up.

 Documentation/SubmittingPatches | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 7055576..c5ff744 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -227,13 +227,15 @@ the patch, which certifies that you wrote it or otherwise 
have
 the right to pass it on as a open-source patch.  The rules are
 pretty simple: if you can certify the below:
 
-Developer's Certificate of Origin 1.1
+Developer's Certificate of Origin 1.2
 
 By making a contribution to this project, I certify that:
 
 (a) The contribution was created in whole or in part by me and I
 have the right to submit it under the open source license
-indicated in the file; or
+indicated in the file (or, if no license is indicated in
+the file, the license in COPYING that accompanies the
+file); or
 
 (b) The contribution is based upon previous work that, to the best
 of my knowledge, is covered under an appropriate open source
@@ -241,7 +243,8 @@ pretty simple: if you can certify the below:
 work with modifications, whether created in whole or in part
 by me, under the same open source license (unless I am
 permitted to submit under a different license), as indicated
-in the file; or
+in the file (or, if no license is indicated in the file,
+the license in COPYING that accompanies the file); or
 
 (c) The contribution was provided directly to me by some other
 person who certified (a), (b) or (c) and I have not modified
-- 
1.8.4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Eugene Sajine eugu...@gmail.com writes:

 So are you really sure that it is a non-starter to have
 --before-service/--after-service options for access-hook?

 Given the definition of --access-hook in git help daemon:

 --access-hook=path::
 Every time a client connects, first run an external command
 specified by the path ... The external command can decide
 to decline the service by exiting with a non-zero status (or
 to allow it by exiting with a zero status)

 There is *NO* way in anywhere --after-service makes any sense (and
 by definition --before-service is redundant).

 What you _could_ propose is to define a *new* hook that is run when
 the spawned service has returned, with the same information that is
 fed to the access hook (possibly with its exit status).

Scratch that exit status part, as I do not think it is useful.

To a receive-pack and a send-pack that is talking to it, if a push
results in a failure, it is a failure.  Likewise for upload-pack and
fetch-pack for a transfer in the reverse direction.

And the way that failure is communicated from the receive-pack to
the end-user via the send-pack is for the receive-pack to send a
protocol message that tells the send-pack about the failure, and the
send-pack showing the error message and signalling the failure with
its exit status.  Likewise for upload-pack and fetch-pack (hence
fetch, which is conceptually a thin wrapper around it).

Between the deamon and the receive-pack (or the fetch-pack) process,
however, such a failed push (or fetch) is still a success.  I
correctly diagnosed and successfully sent a rejection notice to the
other end is signalled by receive-pack to the daemon by exiting
with success (i.e. 0) exit status.

So even if we feed the exit status of the service process to the
hook script specified by the --post-service-hook, it does not tell
the script if the service succeeded in that sense.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread Junio C Hamano
Linus, this is not limited to us, so I am bothering you; sorry about
that.

My instinct tells me that some competent lawyers at linux-foundation
helped you with the wording of DCO, and we amateurs shouldn't be
mucking with the text like this patch does at all, but just in case
you might find it interesting...


Richard Hansen rhan...@bbn.com writes:

 The Developer's Certificate of Origin refers to the open source
 license indicated in the file, but there is no such indication in
 most files in the Git repository.

 Update the text to indicate that the license in COPYING should be
 assumed if a file doesn't excplicitly indicate which license applies
 to the file.

 The phrase accompanies the file was chosen to support different
 default licenses in different subdirectories (e.g., 2-clause BSD for
 vcs-svn/*, LGPL2.1+ for xdiff/*).

 Signed-off-by: Richard Hansen rhan...@bbn.com
 ---
 I'm bringing this up because, to this layman's eyes, it seems like a
 potentially troublesome oversight.  IIUC, one of the purposes of the
 Developer's Certificate of Origin is to make it easy for developers to
 declare which license covers a contribution.  Requiring a license
 declaration protects the project and its users from copyright
 litigation.

 What happens if the file(s) being modified do not indicate which
 license applies to the file?  Is there no license?  Does it default to
 the main project license in COPYING?  This lack of clarity makes me a
 bit nervous (law is already too nondeterministic for my liking), so
 I'd like to see a change that makes it explicit.

 Notes:
   * I am not a lawyer.  (Maybe a lawyer should be consulted?)
   * This change might not be necessary.
   * This change might be wrong.
   * I hope I'm not just wasting everyone's time by bringing this up.

  Documentation/SubmittingPatches | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

 diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
 index 7055576..c5ff744 100644
 --- a/Documentation/SubmittingPatches
 +++ b/Documentation/SubmittingPatches
 @@ -227,13 +227,15 @@ the patch, which certifies that you wrote it or 
 otherwise have
  the right to pass it on as a open-source patch.  The rules are
  pretty simple: if you can certify the below:
  
 -Developer's Certificate of Origin 1.1
 +Developer's Certificate of Origin 1.2
  
  By making a contribution to this project, I certify that:
  
  (a) The contribution was created in whole or in part by me and I
  have the right to submit it under the open source license
 -indicated in the file; or
 +indicated in the file (or, if no license is indicated in
 +the file, the license in COPYING that accompanies the
 +file); or
  
  (b) The contribution is based upon previous work that, to the best
  of my knowledge, is covered under an appropriate open source
 @@ -241,7 +243,8 @@ pretty simple: if you can certify the below:
  work with modifications, whether created in whole or in part
  by me, under the same open source license (unless I am
  permitted to submit under a different license), as indicated
 -in the file; or
 +in the file (or, if no license is indicated in the file,
 +the license in COPYING that accompanies the file); or
  
  (c) The contribution was provided directly to me by some other
  person who certified (a), (b) or (c) and I have not modified
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread Linus Torvalds
On Thu, Sep 12, 2013 at 3:30 PM, Junio C Hamano gits...@pobox.com wrote:
 Linus, this is not limited to us, so I am bothering you; sorry about
 that.

 My instinct tells me that some competent lawyers at linux-foundation
 helped you with the wording of DCO, and we amateurs shouldn't be
 mucking with the text like this patch does at all, but just in case
 you might find it interesting...

There were lawyers involved, yes.

I'm not sure there is any actual confusion, because the fact is,
lawyers aren't robots or programmers, and they have the human
qualities of understanding implications. So I'm actually inclined to
not change legal text unless a lawyer actually tells me that it's
needed.

Plus even if this change was needed, why would anybody point to
COPYING. It's much better to just say the copyright license of the
file, knowing that different projects have different rules about this
all, and some projects mix files from different sources, where parts
of the tree may be under different licenses that may be explained
elsewhere..

Linus
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread Theodore Ts'o
I certainly wouldn't recommend messing with the text of the DCO
without first consulting some lawyers.  There should also be some
centralized coordination about any changes in the text and the version
number.

- Ted
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread Richard Hansen
On 2013-09-12 18:44, Linus Torvalds wrote:
 On Thu, Sep 12, 2013 at 3:30 PM, Junio C Hamano gits...@pobox.com wrote:
 Linus, this is not limited to us, so I am bothering you; sorry about
 that.

 My instinct tells me that some competent lawyers at linux-foundation
 helped you with the wording of DCO, and we amateurs shouldn't be
 mucking with the text like this patch does at all, but just in case
 you might find it interesting...
 
 There were lawyers involved, yes.
 
 I'm not sure there is any actual confusion, because the fact is,
 lawyers aren't robots or programmers, and they have the human
 qualities of understanding implications.

Well stated.  :)

 So I'm actually inclined to
 not change legal text unless a lawyer actually tells me that it's
 needed.

Is it worthwhile to poke a lawyer about this as a precaution?  (If so,
who?)  Or do we wait for a motivating event?

 
 Plus even if this change was needed, why would anybody point to
 COPYING. It's much better to just say the copyright license of the
 file, knowing that different projects have different rules about this
 all, and some projects mix files from different sources, where parts
 of the tree may be under different licenses that may be explained
 elsewhere..

I agree that your phrasing is better.

-Richard
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread Linus Torvalds
On Thu, Sep 12, 2013 at 4:15 PM, Richard Hansen rhan...@bbn.com wrote:

 Is it worthwhile to poke a lawyer about this as a precaution?  (If so,
 who?)  Or do we wait for a motivating event?

I can poke the lawyer that was originally involved. If people know
other lawyers, feel free to poke them too. Just ask them to be
realistic, not go into some kind of super-anal lawyer mode where they
go off on some what if thing.

Note that one issue is that this is kind of like a license change,
even if it's arguably just a clarification. I'd expect that a lawyer
who is so anal that they think this wording needs change would also
think that the DCO version number needs change and then spend half an
hour (and $500) talking about how this only affects new sign-offs and
how you'd want to make it very obvious how things have changed, Yadda
yadda.

IOW, my personal opinion is that if you get a lawyer that is _that_
interested in irrelevant details, you have much bigger problems than
this particular wording. Lawyers do tend to be particular about
wording, but in the end, they tend to also agree that intent matters.
At least the good ones who have a case. Once they start talking about
the meaning of the word 'is', you know they are just weaselwording
and don't actually have any real argument.

  Linus
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Eugene Sajine
On Thu, Sep 12, 2013 at 6:20 PM, Junio C Hamano gits...@pobox.com wrote:
 Junio C Hamano gits...@pobox.com writes:

 Eugene Sajine eugu...@gmail.com writes:

 So are you really sure that it is a non-starter to have
 --before-service/--after-service options for access-hook?

 Given the definition of --access-hook in git help daemon:

 --access-hook=path::
 Every time a client connects, first run an external command
 specified by the path ... The external command can decide
 to decline the service by exiting with a non-zero status (or
 to allow it by exiting with a zero status)

 There is *NO* way in anywhere --after-service makes any sense (and
 by definition --before-service is redundant).

 What you _could_ propose is to define a *new* hook that is run when
 the spawned service has returned, with the same information that is
 fed to the access hook (possibly with its exit status).

 Scratch that exit status part, as I do not think it is useful.

 To a receive-pack and a send-pack that is talking to it, if a push
 results in a failure, it is a failure.  Likewise for upload-pack and
 fetch-pack for a transfer in the reverse direction.

 And the way that failure is communicated from the receive-pack to
 the end-user via the send-pack is for the receive-pack to send a
 protocol message that tells the send-pack about the failure, and the
 send-pack showing the error message and signalling the failure with
 its exit status.  Likewise for upload-pack and fetch-pack (hence
 fetch, which is conceptually a thin wrapper around it).

 Between the deamon and the receive-pack (or the fetch-pack) process,
 however, such a failed push (or fetch) is still a success.  I
 correctly diagnosed and successfully sent a rejection notice to the
 other end is signalled by receive-pack to the daemon by exiting
 with success (i.e. 0) exit status.

 So even if we feed the exit status of the service process to the
 hook script specified by the --post-service-hook, it does not tell
 the script if the service succeeded in that sense.


I see what you're saying.
In my particular use case I can work around that service status
because even if it failed it will just trigger Jenkins to poll and in
case of failure to transfer data there will be no new changes for
Jenkins to work with. If we would want the --post-service-hook to know
that data transfer succeeded or failed, then may be there should be
some difference between service status and service process status?
In this case the existing logic works with service process status
while the --post-service-hook is fed with the service status (or
name it data transfer status)

Do i make any sense?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Fwd: git-daemon access-hook race condition

2013-09-12 Thread Junio C Hamano
Eugene Sajine eugu...@gmail.com writes:

 So even if we feed the exit status of the service process to the
 hook script specified by the --post-service-hook, it does not tell
 the script if the service succeeded in that sense.

 I see what you're saying.
 In my particular use case I can work around that service status
 because even if it failed it will just trigger Jenkins to poll and in
 case of failure to transfer data there will be no new changes for
 Jenkins to work with. If we would want the --post-service-hook to know
 that data transfer succeeded or failed, then may be there should be
 some difference between service status and service process status?
 In this case the existing logic works with service process status
 while the --post-service-hook is fed with the service status (or
 name it data transfer status)

 Do i make any sense?

Almost; you missed that there is no channel to pass data transfer
status from the service back to the daemon.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Converting repo from HG, `git filter-branch --prune-empty -- --all` is extremely slow and errors out.

2013-09-12 Thread Felipe Contreras
On Thu, Sep 12, 2013 at 5:01 PM, John Gietzen jgiet...@woot.com wrote:
 Background:
 Windows, git version 1.8.3.msysgit.0
 bare repo, 54k commits after migration from HG
 git filter-branch --prune-empty -- --all

 I'm trying to clean up our repository after migrating it from HG.  I'm 
 running the filter-branch command listed above in an effort to clean up all 
 of garbage commits that HG required (closing branch commits and their ilk).

 From my past experience, git filter-branch is extremely quick when using 
 simple filters, like env-filter, since it doesn't have to touch the working 
 dir.  However, in our case each revision is taking 1-3 seconds; our entire 
 repo will take 30 hours to clean up at this rate.  Normally, this wouldn't be 
 a problem, except that we are getting sh.exe couldn't start errors after 
 anywhere between the 5000th and 6000th rewritten commit.  Filter-branch 
 doesn't have support for picking up where it left off, so we are entirely 
 unable to clean up our repo.

Indeed, I remember writing my own simplified version of 'git
filter-branch' that was much faster. If I recall correctly, the trick
was avoiding 'git write-tree' which can be done if you are not using
any tree filter, but 'git filter-branch' is not that smart.

If all you want to do is prune empty commits, it should be easy to
write a script that simply does 'git commit-tree'. I might decide to
do that based on my script if I have time today.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Reject non-ff pulls by default

2013-09-12 Thread Felipe Contreras
On Wed, Sep 11, 2013 at 6:38 AM, Matthieu Moy
matthieu@grenoble-inp.fr wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:

 On Tue, Sep 10, 2013 at 3:26 AM, Matthieu Moy
 matthieu@grenoble-inp.fr wrote:

 So, you insist in asking the user to chose between rebase and merge, but
 you also insist that they will not chose rebase? So, why ask?

 Because as you said, they don't know what that is.

 That does not answer my question: why ask?

If you have to ask, then you haven't read the commit messages, the
cover letter, or the relevant discussion. Even Linus Torvalds agreed
this was a good change.

 Look around you what people say about Git. See how many complain about
 Git not exposing enough complexity to the user. See how many would
 complain about Git not advertising rebase enough. Then, look how many
 complain about Git exposing too much complexity and making it too easy
 to use features like rebase.

And see how many are confused by Git doing something they never told
it to do, and then being totally lost because they are in the middle
of a state they don't understand, and how many do merges by mistake.

There's a reason why Git user-base considers 'git pull' dangerous.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/2] version-gen: fixes and cleanups

2013-09-12 Thread Felipe Contreras
Felipe Contreras (2):
  version-gen: cleanup
  version-gen: avoid messing the version

 GIT-VERSION-GEN | 36 +++-
 1 file changed, 19 insertions(+), 17 deletions(-)

-- 
1.8.4-338-gefd7fa6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/2] version-gen: avoid messing the version

2013-09-12 Thread Felipe Contreras
If the version is 'v1.8.4-rc1' that is the version, and there's no need
to change it to anything else, like 'v1.8.4.rc1'.

If RedHat, or somebody else, needs a specific version, they can use the
'version' file, like everybody else.

Signed-off-by: Felipe Contreras felipe.contre...@gmail.com
---
 GIT-VERSION-GEN | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN
index e96538d..b3de2db 100755
--- a/GIT-VERSION-GEN
+++ b/GIT-VERSION-GEN
@@ -26,10 +26,8 @@ describe () {
 if test -f version
 then
VN=$(cat version) || VN=$DEF_VER
-elif test -d ${GIT_DIR:-.git} -o -f .git  describe
+elif test ! -d ${GIT_DIR:-.git} -o ! -f .git || ! describe
 then
-   VN=$(echo $VN | sed -e 's/-/./g')
-else
VN=$DEF_VER
 fi
 
-- 
1.8.4-338-gefd7fa6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/4] pathspec: strip multiple trailing slashes from submodules

2013-09-12 Thread Duy Nguyen
On Fri, Sep 13, 2013 at 3:21 AM, John Keeping j...@keeping.me.uk wrote:
 On Thu, Sep 12, 2013 at 12:48:10PM -0700, Junio C Hamano wrote:
 John Keeping j...@keeping.me.uk writes:

  This allows us to replace the submodule path trailing slash removal in
  builtin/rm.c with the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag to
  parse_pathspec() without changing the behaviour with respect to multiple
  trailing slashes.

 Where does prefix_pathspec()'s input, which could have an unwanted
 trailing slash, come from?

 If it is read from some of our internal data structure and known to
 have at most one, then this change makes me feel very uneasy to cope
 with potentially sloppy end-user input and data generated by ourselves
 with the same logic.  It will allow our internal to be sloppy without
 forcing us notice and fix that sloppiness.

 If it is coming from an end-user input, then I would not object to
 the change, though.

 I added this in response to Duy's comment on v1 [1].

 [1] http://article.gmane.org/gmane.comp.version-control.git/234548

Looks like I add more noise to this thread than anything valuable.
Yes, once argv goes through parse_pathspec it's normalized so we do
not need to strip consecutive slashes any more. I'm not entirely sure
if it also converts Windows '\\' to '/'..

 Looking more closely, this does come from user input (via the argv
 passed into parse_pathspec) but does (some of the time) go through
 prefix_path_gently which calls normalize_path_copy_len.

 It's not immediately clear to me when prefix_pathspec goes through this
 particular code path, but I think we may be able to drop this (and the
 previous patch) without affecting the user.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/2] version-gen: cleanup

2013-09-12 Thread Felipe Contreras
No functional changes.

Signed-off-by: Felipe Contreras felipe.contre...@gmail.com
---
 GIT-VERSION-GEN | 36 
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN
index 06026ea..e96538d 100755
--- a/GIT-VERSION-GEN
+++ b/GIT-VERSION-GEN
@@ -6,22 +6,29 @@ DEF_VER=v1.8.4
 LF='
 '
 
-# First see if there is a version file (included in release tarballs),
-# then try git-describe, then default.
-if test -f version
-then
-   VN=$(cat version) || VN=$DEF_VER
-elif test -d ${GIT_DIR:-.git} -o -f .git 
-   VN=$(git describe --match v[0-9]* --abbrev=7 HEAD 2/dev/null) 
+describe () {
+   VN=$(git describe --match v[0-9]* --abbrev=7 HEAD 2/dev/null) || 
return 1
case $VN in
-   *$LF*) (exit 1) ;;
+   *$LF*)
+   return 1
+   ;;
v[0-9]*)
git update-index -q --refresh
test -z $(git diff-index --name-only HEAD --) ||
-   VN=$VN-dirty ;;
+   VN=$VN-dirty
+   return 0
+   ;;
esac
+}
+
+# First see if there is a version file (included in release tarballs),
+# then try 'git describe', then default.
+if test -f version
+then
+   VN=$(cat version) || VN=$DEF_VER
+elif test -d ${GIT_DIR:-.git} -o -f .git  describe
 then
-   VN=$(echo $VN | sed -e 's/-/./g');
+   VN=$(echo $VN | sed -e 's/-/./g')
 else
VN=$DEF_VER
 fi
@@ -34,9 +41,6 @@ then
 else
VC=unset
 fi
-test $VN = $VC || {
-   echo 2 GIT_VERSION = $VN
-   echo GIT_VERSION = $VN $GVF
-}
-
-
+test $VN = $VC  exit
+echo 2 GIT_VERSION = $VN
+echo GIT_VERSION = $VN $GVF
-- 
1.8.4-338-gefd7fa6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] Developer's Certificate of Origin: default to COPYING

2013-09-12 Thread W. Trevor King
On Thu, Sep 12, 2013 at 04:25:03PM -0700, Linus Torvalds wrote:
 On Thu, Sep 12, 2013 at 4:15 PM, Richard Hansen rhan...@bbn.com wrote:
 
  Is it worthwhile to poke a lawyer about this as a precaution?  (If so,
  who?)  Or do we wait for a motivating event?
 
 I can poke the lawyer that was originally involved.

For what it's worth, there is an existing push to clarify the
licensing terms for the DCO [1].  Involved parties include Luis
Rodriguez, Richard Fontana, Bradley Kuhn, Mike Dolan, and Karen
Copenhaver.  Hopefully they'll have something to say after the New
Orleans LinuxCon.  The DCO licensing is not quite the same as changing
the DCO text, but they're probably closely related ;).

Cheers,
Trevor

[1]: http://thread.gmane.org/gmane.linux.kernel/1397613/focus=1400065

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy


signature.asc
Description: OpenPGP digital signature


Re: [PATCH 00/21] np/pack-v4 updates

2013-09-12 Thread Duy Nguyen
On Thu, Sep 12, 2013 at 11:20 PM, Nicolas Pitre n...@fluxnic.net wrote:
 On Thu, 12 Sep 2013, Duy Nguyen wrote:

 On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre n...@fluxnic.net wrote:
  On Wed, 11 Sep 2013, Duy Nguyen wrote:
 
  Nico, if you have time you may want to look into this. The result v4
  pack from pack-objects on git.git for me is 35MB (one branch) while
  packv4-create produces 30MB (v2 is 40MB). I don't know why there is
  such a big difference in size. I compared. Ident dict is identical.
  Tree dict is a bit different (some that have same hits are ordered
  differently). Delta chains do not differ much. Many groups of entries
  in the pack are displaced though. I guess I turned a wrong knob or
  something in pack-objects in v4 code..
 
  Will try to have a closer look.

 Problem found. I encoded some trees as ref-delta instead of pv4-tree
 :( Something like this brings the size back to packv4-create output

 diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
 index f604fa5..3d9ab0e 100644
 --- a/builtin/pack-objects.c
 +++ b/builtin/pack-objects.c
 @@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
   * deltify other objects against, in order to avoid
   * circular deltas.
   */
 - entry-type = entry-in_pack_type;
 + if (pack_version  4)
 + entry-type = entry-in_pack_type;
   entry-delta = base_entry;
   entry-delta_size = entry-size;
   entry-delta_sibling = base_entry-delta_child;

 Hmmm... I've folded this fix into your patch touching this area.

You may want to stricten the condition a bit, to pack_version  4 ||
entry-type != OBJ_TREE. I think always not doing it in v4 turns off
the reuse code path for blobs and tags.

 This code is becoming rather subtle and messy though.  We'll have to
 find a way to better abstract things.

Yep. Not sure how that should be done though. Maybe we can revisit it
when pack-objects learns to skip compatibility layer when reading v4
commits and trees..

 Especially since object data
 reuse will work only for blobs and tags with packv4.  Commits and trees
 will need adjustments to their indices.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Local tag killer

2013-09-12 Thread Michael Haggerty
A colleague of mine discovered, the hard way, that

git fetch --tags --prune $REMOTE

deletes all local tags that are not present on that particular remote.
To me this seems a dangerous and poorly-documented interaction of
features and arguably a bug.

Granted, it might not be such a good idea to use local tags, as it is
all to easy to push them inadvertently and then it is difficult to
remove them permanently from a shared upstream repository because other
people might have fetched them and in turn inadvertently re-push them.

But the fact that combining two options, each of which seems safe and
reasonable for daily use, results in the death of local tags unrelated
to the remote is unexpected [1].  Also remember that the --prune
feature can be turned on permanently via git config using
fetch.prune or remote.$REMOTE.prune.

Moreover, the documentation is misleading on this point:

 -p::
 --prune::
   After fetching, remove any remote-tracking branches which
   no longer exist on the remote.

It is a stretch for references under refs/tags/ to be called
remote-tracking branches, even if they exist as the target of the
refspec refs/tags/*:refs/tags/* that is implicitly added by the --tags
option.

I suggest that --prune should not touch references under refs/tags/
regardless of whether they appear on the right side of explicit or
implicit refspecs.  If pruning tags is deemed to be essential, then
there should be a specific option (--prune-tags?) to request it.


When looking into this, I found a test in t5510 that appears to want to
verify this very behavior:

 test_expect_success 'fetch --prune --tags does not delete the remote-tracking 
 branches' '
   cd $D 
   git clone . prune-tags 
   cd prune-tags 
   git fetch origin refs/heads/master:refs/tags/sometag 
 
   git fetch --prune --tags origin 
   git rev-parse origin/master 
   test_must_fail git rev-parse somebranch
 '

However, the last line seems to contain a copy-paste error and should
presumably have s/somebranch/sometag/.

Michael

[1] It would be as if git clean had two options --ammonia and
--bleach :-)

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Local tag killer

2013-09-12 Thread Junio C Hamano
 When looking into this, I found a test in t5510 that appears to want to
 verify this very behavior:

 test_expect_success 'fetch --prune --tags does not delete the 
 remote-tracking branches' '

The title tells me that it wants to make sure when pruning tags it does
not touch remote-tracking branches under refs/remotes/origin/, I think.

   cd $D 
   git clone . prune-tags 
   cd prune-tags 
   git fetch origin refs/heads/master:refs/tags/sometag 

   git fetch --prune --tags origin 
   git rev-parse origin/master 
   test_must_fail git rev-parse somebranch
 '

 However, the last line seems to contain a copy-paste error and should
 presumably have s/somebranch/sometag/.

I agree that somebranch must be sometag, as it wants to make sure
that --prune --tags removes the tag the other side does not have.

I also agree that the documentation is misstated; remote-tracking branch
may have been a convenient and well understood phrase for whoever wrote
that part, but the --prune is designed to cull extra refs in the
hierarchies into
which refs would be fetched if counterparts existed on the other side, so
culling tags that do not exist on the remote side should also be described.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] relative_path should honor dos_drive_prefix

2013-09-12 Thread Jiang Xin
2013/9/13 Junio C Hamano gits...@pobox.com:

 For systems that need POSIX escape hatch for Apollo Domain ;-), we
 would need a bit more work.  When both path1 and path2 begin with a
 double-dash, we would need to check if they match up to the next
 slash, so that

  - //host1/usr/src and //host1/usr/lib share the same root and the
former can be made to ../src relative to the latter;

  - //host1/usr/src and //host2/usr/lib are of separate roots.

 or something.



But how could we know which platform supports network pathnames and
needs such implementation.

-- 
Jiang Xin
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/3] fixes for relative_path

2013-09-12 Thread Jiang Xin
Updates since v1:

* New patch 1/3 on t0060, which use umambigous leading path (/foo).
* Call tolower instead of strncasecmp in patch 2/3.
* Rename simple_relative_path to remove_leading_path in patch 3/3.

Jiang Xin (3):
  test: use unambigous leading path (/foo) for mingw
  relative_path should honor dos_drive_prefix
  Use simpler relative_path when set_git_dir

 cache.h   |  1 +
 path.c| 65 +++
 setup.c   |  5 +---
 t/t0060-path-utils.sh | 60 +--
 4 files changed, 99 insertions(+), 32 deletions(-)

-- 
1.8.4.459.gd80d422

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/3] test: use unambigous leading path (/foo) for mingw

2013-09-12 Thread Jiang Xin
In test cases for relative_path, path with one leading character
(such as /a, /x) may be recogonized as a:/ or x:/ if there is
such doc drive on MINGW platform. Use an umambigous leading path
/foo instead.

Signed-off-by: Jiang Xin worldhello@gmail.com
---
 t/t0060-path-utils.sh | 56 +--
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 3a48de2..82a6f21 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -190,33 +190,33 @@ test_expect_success SYMLINKS 'real path works on 
symlinks' '
test $sym = $(test-path-utils real_path $dir2/syml)
 '
 
-relative_path /a/b/c/  /a/b/   c/
-relative_path /a/b/c/  /a/bc/
-relative_path /a//b//c///a/b// c/  POSIX
-relative_path /a/b /a/b./
-relative_path /a/b//a/b./
-relative_path /a   /a/b../
-relative_path //a/b/   ../../
-relative_path /a/c /a/b/   ../c
-relative_path /a/c /a/b../c
-relative_path /x/y /a/b/   ../../x/y
-relative_path /a/b empty   /a/b
-relative_path /a/b null/a/b
-relative_path a/b/c/   a/b/c/
-relative_path a/b/c/   a/b c/
-relative_path a/b//c   a//bc
-relative_path a/b/ a/b/./
-relative_path a/b/ a/b ./
-relative_path aa/b ../
-relative_path x/y  a/b ../../x/y
-relative_path a/c  a/b ../c
-relative_path a/b  empty   a/b
-relative_path a/b  nulla/b
-relative_path empty/a/b./
-relative_path emptyempty   ./
-relative_path emptynull./
-relative_path null empty   ./
-relative_path null null./
-relative_path null /a/b./
+relative_path /foo/a/b/c/  /foo/a/b/   c/
+relative_path /foo/a/b/c/  /foo/a/bc/
+relative_path /foo/a//b//c///foo/a/b// c/  POSIX
+relative_path /foo/a/b /foo/a/b./
+relative_path /foo/a/b//foo/a/b./
+relative_path /foo/a   /foo/a/b../
+relative_path //foo/a/b/   ../../../
+relative_path /foo/a/c /foo/a/b/   ../c
+relative_path /foo/a/c /foo/a/b../c
+relative_path /foo/x/y /foo/a/b/   ../../x/y
+relative_path /foo/a/b empty   /foo/a/b
+relative_path /foo/a/b null/foo/a/b
+relative_path foo/a/b/c/   foo/a/b/c/
+relative_path foo/a/b/c/   foo/a/b c/
+relative_path foo/a/b//c   foo/a//bc
+relative_path foo/a/b/ foo/a/b/./
+relative_path foo/a/b/ foo/a/b ./
+relative_path foo/afoo/a/b ../
+relative_path foo/x/y  foo/a/b ../../x/y
+relative_path foo/a/c  foo/a/b ../c
+relative_path foo/a/b  empty   foo/a/b
+relative_path foo/a/b  nullfoo/a/b
+relative_path empty/foo/a/b./
+relative_path emptyempty   ./
+relative_path emptynull./
+relative_path null empty   ./
+relative_path null null./
+relative_path null /foo/a/b./
 
 test_done
-- 
1.8.4.459.gd80d422

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/3] Use simpler relative_path when set_git_dir

2013-09-12 Thread Jiang Xin
Using a relative_path as git_dir first appears in v1.5.6-1-g044bbbc.
It will make git_dir shorter only if git_dir is inside work_tree,
and this will increase performance. But my last refactor effort on
relative_path function (commit v1.8.3-rc2-12-ge02ca72) changed that.
Always use relative_path as git_dir may bring troubles like
$gmane/234434.

Because new relative_path is a combination of original relative_path
from path.c and original path_relative from quote.c, so in order to
restore the origin implementation, save the original relative_path
as remove_leading_path, and call it in setup.c.

Suggested-by: Karsten Blees karsten.bl...@gmail.com
Signed-off-by: Jiang Xin worldhello@gmail.com
---
 cache.h |  1 +
 path.c  | 45 +
 setup.c |  5 +
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index a47b9c0..73fa334 100644
--- a/cache.h
+++ b/cache.h
@@ -747,6 +747,7 @@ int is_directory(const char *);
 const char *real_path(const char *path);
 const char *real_path_if_valid(const char *path);
 const char *absolute_path(const char *path);
+const char *remove_leading_path(const char *in, const char *prefix);
 const char *relative_path(const char *in, const char *prefix, struct strbuf 
*sb);
 int normalize_path_copy_len(char *dst, const char *src, int *prefix_len);
 int normalize_path_copy(char *dst, const char *src);
diff --git a/path.c b/path.c
index 65d376d..24594c4 100644
--- a/path.c
+++ b/path.c
@@ -551,6 +551,51 @@ const char *relative_path(const char *in, const char 
*prefix,
 }
 
 /*
+ * A simpler implementation of relative_path
+ *
+ * Get relative path by removing prefix from in. This function
+ * first appears in v1.5.6-1-g044bbbc, and makes git_dir shorter
+ * to increase performance when traversing the path to work_tree.
+ */
+const char *remove_leading_path(const char *in, const char *prefix)
+{
+   static char buf[PATH_MAX + 1];
+   int i = 0, j = 0;
+
+   if (!prefix || !prefix[0])
+   return in;
+   while (prefix[i]) {
+   if (is_dir_sep(prefix[i])) {
+   if (!is_dir_sep(in[j]))
+   return in;
+   while (is_dir_sep(prefix[i]))
+   i++;
+   while (is_dir_sep(in[j]))
+   j++;
+   continue;
+   } else if (in[j] != prefix[i]) {
+   return in;
+   }
+   i++;
+   j++;
+   }
+   if (
+   /* /foo is a prefix of /foo */
+   in[j] 
+   /* /foo is not a prefix of /foobar */
+   !is_dir_sep(prefix[i-1])  !is_dir_sep(in[j])
+  )
+   return in;
+   while (is_dir_sep(in[j]))
+   j++;
+   if (!in[j])
+   strcpy(buf, .);
+   else
+   strcpy(buf, in + j);
+   return buf;
+}
+
+/*
  * It is okay if dst == src, but they should not overlap otherwise.
  *
  * Performs the following normalizations on src, storing the result in dst:
diff --git a/setup.c b/setup.c
index f08dd64..dbf4138 100644
--- a/setup.c
+++ b/setup.c
@@ -227,7 +227,6 @@ int is_inside_work_tree(void)
 
 void setup_work_tree(void)
 {
-   struct strbuf sb = STRBUF_INIT;
const char *work_tree, *git_dir;
static int initialized = 0;
 
@@ -247,10 +246,8 @@ void setup_work_tree(void)
if (getenv(GIT_WORK_TREE_ENVIRONMENT))
setenv(GIT_WORK_TREE_ENVIRONMENT, ., 1);
 
-   set_git_dir(relative_path(git_dir, work_tree, sb));
+   set_git_dir(remove_leading_path(git_dir, work_tree));
initialized = 1;
-
-   strbuf_release(sb);
 }
 
 static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
-- 
1.8.4.459.gd80d422

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >