Re: [PATCH] Undefine strlcpy if needed.

2014-08-25 Thread Ramsay Jones
On 25/08/14 02:54, tsuna wrote:
 On Sun, Aug 24, 2014 at 5:32 PM, Ramsay Jones
 ram...@ramsay1.demon.co.uk wrote:
 Again, I don't have access to an OS X system, so I don't know
 which package provides libintl/gettext, but it seems to be missing
 on your system.
 
 Probably yeah, those libraries don’t seem to be provided in standard
 with OS X or OS X’s development tools, so maybe the Makefile should
 also default to having NO_GETTEXT=YesPlease when on OS X.
 
 You can avoid the build failure, without running configure, by
 setting NO_GETTEXT=YesPlease in your config.mak file.



 I need to run configure first:

 $ make configure
 GEN configure
 $ ./configure
 configure: Setting lib to 'lib' (the default)
 […]

 So, presumably, configure has set NO_GETTEXT=YesPlease in your
 config.mak.autogen file.
 
 Yes it did.
 
 I don’t mind running configure, but so far Git has compiled fine
 without doing it.  Should we fix the default values of NO_STRLCPY /
 NO_GETTEXT on OS X?
 

Is NO_STRLCPY still a problem with a fresh clone (and putting
NO_GETTEXT=YesPlease in your config.mak)? I still do not understand
why you were getting those warnings; AFAICT it should not be happening!
Also, Torsten could not reproduce.

As far as NO_GETTEXT is concerned, I have to defer to someone who has
experience on that platform (I have _zero_ experience on OS X).

ATB,
Ramsay Jones




Re: [PATCH v5 2/4] Change GIT_ALLOC_LIMIT check to use git_parse_ulong()

2014-08-25 Thread Jeff King
On Sun, Aug 24, 2014 at 06:07:44PM +0200, Steffen Prohaska wrote:

 diff --git a/wrapper.c b/wrapper.c
 index bc1bfb8..69d1c9b 100644
 --- a/wrapper.c
 +++ b/wrapper.c
 @@ -11,14 +11,18 @@ static void (*try_to_free_routine)(size_t size) = 
 do_nothing;
  
  static void memory_limit_check(size_t size)
  {
 - static int limit = -1;
 - if (limit == -1) {
 - const char *env = getenv("GIT_ALLOC_LIMIT");
 - limit = env ? atoi(env) * 1024 : 0;
 + static size_t limit = SIZE_MAX;
 + if (limit == SIZE_MAX) {

You use SIZE_MAX as the sentinel for not set, and 0 as the sentinel
for no limit. That seems kind of backwards.

I guess you are inheriting this from the existing code, which lets
GIT_ALLOC_LIMIT=0 mean no limit. I'm not sure if we want to keep that
or not (it would be backwards incompatible to change it, but we are
already breaking compatibility here by assuming bytes rather than
kilobytes; I think that's OK because this is not a documented feature,
or one intended to be used externally).
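
For illustration (file name made up; the variable is really only meant
for the test suite): the old code read the value as kilobytes, while
with git_parse_ulong it becomes a byte count that should also accept
the usual k/m/g suffixes:

  # before: 1500 meant 1500 kilobytes
  GIT_ALLOC_LIMIT=1500 git add big-file.bin

  # after: plain numbers are bytes; suffixes spell larger units
  GIT_ALLOC_LIMIT=1500k git add big-file.bin

  # 0 (or unset) still means "no limit"
  GIT_ALLOC_LIMIT=0 git add big-file.bin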

 + const char *var = "GIT_ALLOC_LIMIT";
 + unsigned long val = 0;
 + const char *env = getenv(var);
 + if (env && !git_parse_ulong(env, &val))
 + die("Failed to parse %s", var);
 + limit = val;
   }

This and the next patch both look OK to me, but I notice this part is
largely duplicated between the two. We already have git_env_bool to do a
similar thing for boolean environment variables. Should we do something
similar like:

diff --git a/config.c b/config.c
index 058505c..11919eb 100644
--- a/config.c
+++ b/config.c
@@ -1122,6 +1122,14 @@ int git_env_bool(const char *k, int def)
return v ? git_config_bool(k, v) : def;
 }
 
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+   const char *v = getenv(k);
+   if (v && !git_parse_ulong(v, &val))
+   die("failed to parse %s", k);
+   return val;
+}
+
 int git_config_system(void)
 {
return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);

It's not a lot of code, but I think the callers end up being much easier
to read:

  if (limit == SIZE_MAX)
limit = git_env_ulong("GIT_ALLOC_LIMIT", 0);

   if (limit && size > limit)
 - die("attempting to allocate %"PRIuMAX" over limit %d",
 - (intmax_t)size, limit);
 + die("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX",
 + (uintmax_t)size, (uintmax_t)limit);

This part is duplicated, too, though I do not know if the infrastructure
to avoid that is worth the trouble. Unless you wanted to do a whole:

  check_limit(limit, "GIT_ALLOC_LIMIT", size);

or something, but I am also not convinced that is not just obfuscating
things.

-Peff


Re: [PATCH v5 4/4] convert: Stream from fd to required clean filter instead of mmap

2014-08-25 Thread Jeff King
On Sun, Aug 24, 2014 at 06:07:46PM +0200, Steffen Prohaska wrote:

 The data is streamed to the filter process anyway.  Better avoid mapping
 the file if possible.  This is especially useful if a clean filter
 reduces the size, for example if it computes a sha1 for binary data,
 like git media.  The file size that the previous implementation could
 handle was limited by the available address space; large files for
 example could not be handled with (32-bit) msysgit.  The new
 implementation can filter files of any size as long as the filter output
 is small enough.
 
 The new code path is only taken if the filter is required.  The filter
 consumes data directly from the fd.  The original data is not available
 to git, so it must fail if the filter fails.

Can you clarify this second paragraph a bit more? If I understand
correctly, we handle a non-required filter failing by just reading the
data again (which we can do because we either read it into memory
ourselves, or mmap it). With the streaming approach, we will read the
whole file through our stream; if that fails we would then want to read
the stream from the start.

Couldn't we do that with an lseek (or even an mmap with offset 0)? That
obviously would not work for non-file inputs, but I think we address
that already in index_fd: we push non-seekable things off to index_pipe,
where we spool them to memory.

So it seems like the ideal strategy would be (see the sketch after this list):

  1. If it's seekable, try streaming. If not, fall back to lseek/mmap.

  2. If it's not seekable and the filter is required, try streaming. We
 die anyway if we fail.

  3. If it's not seekable and the filter is not required, decide based
 on file size:

   a. If it's small, spool to memory and proceed as we do now.

   b. If it's big, spool to a seekable tempfile.
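
To make the dispatch concrete, here is a minimal, self-contained sketch
of that decision (not git code; the enum, the helper name and the
SPOOL_THRESHOLD cutoff are all made up for illustration):

  #include <sys/types.h>
  #include <unistd.h>

  enum convert_strategy {
          STREAM_FROM_FD,     /* cases 1 and 2: read once, rewind or die */
          SPOOL_TO_MEMORY,    /* case 3a: small, keep a copy for retries */
          SPOOL_TO_TEMPFILE   /* case 3b: big, keep a seekable copy */
  };

  #define SPOOL_THRESHOLD (32 * 1024 * 1024) /* arbitrary cutoff */

  static enum convert_strategy choose_strategy(int fd, int filter_required,
                                               off_t size_if_known)
  {
          /* Seekable input: stream, and rewind with lseek() on failure. */
          if (lseek(fd, 0, SEEK_CUR) != -1)
                  return STREAM_FROM_FD;
          /* Non-seekable, but a required filter dies on failure anyway. */
          if (filter_required)
                  return STREAM_FROM_FD;
          /* Non-seekable, optional filter: spool so the data can be reread. */
          if (size_if_known >= 0 && size_if_known <= SPOOL_THRESHOLD)
                  return SPOOL_TO_MEMORY;
          return SPOOL_TO_TEMPFILE;
  }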

Your patch implements part 2. But I would think part 1 is the most common
case. And while part 3b seems unpleasant, it is better than the current
code (with or without your patch), which will do 3a on a large file.

Hmm. Though I guess in (3) we do not have the size up front, so it's
complicated (we could spool N bytes to memory, then start dumping to a
file after that). I do not think we necessarily need to implement that
part, though. It seems like (1) is the thing I would expect to hit the
most (i.e., people do not always mark their filters as required).

 - write_err = (write_in_full(child_process.in, params->src, params->size)
 < 0);
 + if (params->src) {
 + write_err = (write_in_full(child_process.in, params->src,
 params->size) < 0);

Style: 4-space indentation (rather than a tab). There's more of it in
this function (and in would_convert...) that I didn't mark.

 + } else {
 + /* dup(), because copy_fd() closes the input fd. */
 + fd = dup(params->fd);

Not a problem you are introducing, but this seem kind of like a
misfeature in copy_fd. Is it worth fixing? The function only has two
existing callers.
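
For what it's worth, a copy helper that leaves the input fd alone could
be as simple as the sketch below (illustration only, not the actual
copy_fd(); the name is made up and error handling is minimal):

  #include <unistd.h>

  /* Copy everything from "in" to "out"; the caller keeps both fds open. */
  static int copy_fd_noclose(int in, int out)
  {
          char buf[8192];
          ssize_t len;

          while ((len = read(in, buf, sizeof(buf))) > 0) {
                  ssize_t off = 0;
                  while (off < len) {
                          ssize_t ret = write(out, buf + off, len - off);
                          if (ret < 0)
                                  return -1;
                          off += ret;
                  }
          }
          return len < 0 ? -1 : 0;
  }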

 + /* Apply a filter to an fd only if the filter is required to succeed.
 +  * We must die if the filter fails, because the original data before
 +  * filtering is not available.
 +  */

Style nit:

  /*
   * We have a blank line at the top of our
   * multi-line comments.
   */

-Peff


Re: [PATCH 1/5] git-prompt: do not look for refs/stash in $GIT_DIR

2014-08-25 Thread Jeff King
On Sun, Aug 24, 2014 at 08:22:41PM +0700, Gábor Szeder wrote:

 On Aug 23, 2014 12:26 PM, Jeff King p...@peff.net wrote:
  Since dd0b72c ("bash prompt: use bash builtins to check stash
  state", 2011-04-01), git-prompt checks whether we have a
  stash by looking for $GIT_DIR/refs/stash. Generally external 
  programs should never do this, because they would miss 
  packed-refs.
 
 Not sure whether the prompt script is an external program or not, but
 doesn't matter, this is the right thing to do.

Yeah, by external I just meant nothing outside of refs.c should make
this assumption.
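
For the record, the kind of check the prompt ends up with is a plumbing
call along these lines (a sketch, not necessarily the exact hunk of the
patch):

  # See both loose and packed refs by asking git instead of the filesystem.
  if git rev-parse --verify --quiet refs/stash >/dev/null
  then
          s="$"
  fi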

  That commit claims that packed-refs does not pack 
  refs/stash, but that is not quite true. It does pack the 
  ref, but due to a bug, fails to prune the ref. When we fix 
  that bug, we would want to be doing the right thing here. 
 
  Signed-off-by: Jeff King p...@peff.net 
  --- 
  I know we are pretty sensitive to forks in the prompt code (after all, 
  that was the point of dd0b72c). This patch is essentially a reversion of 
  this hunk of dd0b72c, and is definitely safe.
 
 I'm not sure, but if I remember correctly (don't have the means to
 check it at the moment, sorry) in that commit I also added a 'git
 pack-refs' invocation to the relevant test(s?) to guard us against
 breakages due to changes in 'git pack-refs'.  If that is so, then I
 think those invocations should be removed as well, as they'll become
 useless.

It did add that change (that's actually how I noticed the problem!
Thank you for being thorough in dd0b72c). My inclination is to leave the
pack-refs invocations, as they protect against a certain class of errors
(we are not doing the risky behavior now, but the purpose of the test
suite is to detect regressions; the next person to touch that code may
not be so careful as you were).

I don't feel too strongly, though, so if we want them gone, I'm OK with
that.

-Peff


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Jeff King
On Sun, Aug 24, 2014 at 04:39:37PM -0700, Junio C Hamano wrote:

 On Sun, Aug 24, 2014 at 8:10 AM, Stefan Beller stefanbel...@gmail.com wrote:
   for (p = list, i = 0; i < cnt; i++) {
  - struct name_decoration *r = xmalloc(sizeof(*r) + 100);
  + char name[100];
 
  Would it make sense to convert the 'name' into a git strbuf?
  Please have a look at Documentation/technical/api-strbuf.txt
 
 Why not go one step further and format it twice, once only
 to measure the necessary size to allocate, allocate and
 then format into it for real? Then you do not need to print
 into a temporary strbuf, copy the result and free the strbuf,
 no?
 
 BUT.
 
The string will always be "dist=" followed by the decimal representation of
 a count that fits in int anyway, so I actually think use of strbuf is way
 overkill (and formatting it twice also is); the patch as posted should be
 just fine.

I think you are right, and the patch is the right direction (assuming we
want to do this; I question whether there are enough elements in the
list for us to care about the size, and if there are, we are probably
better off storing the int and formatting the strings on the fly).

I wonder if there is a way we could get rid of the magic 100 here,
though. Its meaning is enough to hold 'dist=' and any integer. But you
have to read carefully to see that this call to sprintf is not a buffer
overflow. A strbuf is one way to get rid of it, though it is awkward
because we then have to copy the result into a flex-array structure.

It would be nice if there was some way to abstract the idea of
formatting a buffer directly into a flex-array. That would involve the
double-format you mention, but we could use it in lots of places to make
the code nicer. Maybe like:

  void *fmt_flex_array(size_t base, const char *fmt, ...)
  {
          va_list ap;
          size_t flex;
          unsigned char *ret;

          va_start(ap, fmt);
          flex = vsnprintf(NULL, 0, fmt, ap);
          va_end(ap);

          ret = xmalloc(base + flex + 1);
          va_start(ap, fmt);
          /* Eek, see below */
          vsnprintf(ret + base, flex + 1, fmt, ap);
          va_end(ap);

          return ret;
  }

and you'd call it like:

  struct name_decoration *r = fmt_flex_array(sizeof(*r), "dist=%d", x);

Except that I don't think we are guaranteed that offsetof(mystruct,
flex_member) is equal to sizeof(mystruct). If FLEX_ARRAY is 0, it should
be, but some platforms use FLEX_ARRAY=1. So you'd have to pass in the
offset like:

  struct name_decoration *r = fmt_flex_array(sizeof(*r),
 offsetof(*r, name),
 "dist=%d", x);

which is a little less nice. You could make it nicer with a macro, but
we don't assume variadic macros. sigh

-Peff


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Jeff King
On Sun, Aug 24, 2014 at 07:47:24PM +0530, Arjun Sreedharan wrote:

 diff --git a/bisect.c b/bisect.c
 index d6e851d..c96aab0 100644
 --- a/bisect.c
 +++ b/bisect.c
 @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct 
 commit_list *list, int n
   }
   qsort(array, cnt, sizeof(*array), compare_commit_dist);
   for (p = list, i = 0; i < cnt; i++) {
 - struct name_decoration *r = xmalloc(sizeof(*r) + 100);
 + char name[100];
 + sprintf(name, "dist=%d", array[i].distance);
 + int name_len = strlen(name);
 + struct name_decoration *r = xmalloc(sizeof(*r) + name_len);

This allocation should be name_len + 1 for the NUL-terminator, no?

It looks like add_name_decoration in log-tree already handles half of
what you are adding here. Can we just make that available globally (it
is manipulating the already-global struct decoration name_decoration)?

I also notice that we do not set r->type at all, meaning the decoration
lookup code in log-tree will access uninitialized memory (worse, it will
use it as a pointer offset into the color list; I got a segfault when I
tried to run git rev-list --bisect-all v1.8.0..v1.9.0).

I think we need this:

diff --git a/bisect.c b/bisect.c
index d6e851d..e2a7682 100644
--- a/bisect.c
+++ b/bisect.c
@@ -219,6 +219,7 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
 struct object *obj = &(array[i].commit->object);

 sprintf(r->name, "dist=%d", array[i].distance);
 +   r->type = 0;
 r->next = add_decoration(&name_decoration, obj, r);
 p->item = array[i].commit;
 p = p->next;

at a minimum.

It looks like this was a regression caused by eb3005e (commit.h: add
'type' to struct name_decoration, 2010-06-19). Which makes me wonder if
anybody actually _uses_ --bisect-all (which AFAICT is the only way to
trigger the problem), but since it's public, I guess we should keep it.

I think the sane thing here is to stop advertising name_decoration as a
global, and make all callers use add_name_decoration. That makes it
easier for callers like this one, and would have caught the regression
caused by eb3005e (the compiler would have noticed that we were not
passing a type parameter to the function).
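
To illustrate, if add_name_decoration() were exported from log-tree with
its current signature (hypothetical at this point), the bisect caller
could shrink to roughly:

  /* sketch only: assumes add_name_decoration(type, name, obj) is public */
  char buf[32]; /* "dist=" plus any int fits comfortably */
  sprintf(buf, "dist=%d", array[i].distance);
  add_name_decoration(DECORATION_NONE, buf, &array[i].commit->object);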

-Peff


Re: Relative submodule URLs

2014-08-25 Thread Robert Dailey
On Fri, Aug 22, 2014 at 11:00 AM, Marc Branchaud marcn...@xiplink.com wrote:
 A couple of years ago I started to work on such a thing ([1] [2] [3]), mainly
 because when we tried to change to relative submodules we got bitten when
 someone used clone's -o option so that his super-repo had no origin remote
 *and* his super-repo was checked out on a detached HEAD.  So get_default_remote() failed
 for him.

 I didn't have time to complete the work -- it ended up being quite involved.
  But Junio did come up with an excellent transition plan [4] for adopting a
 default remote setting.

 [1] (v0) http://thread.gmane.org/gmane.comp.version-control.git/200145
 [2] (v1) http://thread.gmane.org/gmane.comp.version-control.git/201065
 [3] (v2) http://thread.gmane.org/gmane.comp.version-control.git/201306
 [4] http://article.gmane.org/gmane.comp.version-control.git/201332

 I think you're on the right path. However I'd suggest something like
 the following:

 [submodule]
 remote = remote_for_relative_submodules (e.g. `upstream`)

 I think remote.default would be more generally useful, especially when
 working with detached checkouts.

Honestly speaking, I don't use remote.default, even now that I know
about it thanks to the discussion ongoing here. The reason is that
sometimes I push my branches to origin, sometimes I push them to my
fork. I like explicit control as to which one I push to. I also sync
my git config file to dropbox and I use it on multiple projects and
platforms. I don't use the same push destination workflow on all
projects. It seems to get in the way of my workflow more than it
helps. I really only ever have two needs:

1. Push explicitly to my remote (e.g. `git push fork` or `git push origin`)
2. Push to the tracked branch (e.g. `git push`)

I'm also not sure how `push.default = simple` conflicts with the usage
of `remote.default`, since in the tracked-repo case, you must
explicitly specify the source ref to push. Is this behavior documented
somewhere?

 (For the record, I would also be happy if clone got rid of its -o option and
 origin became the sacred, reserved remote name (perhaps translated into
 other languages as needed) that clone always uses no matter what.)

 [branch "name"]
 submoduleRemote = remote_for_relative_submodule

 If I understand correctly, you want this so that your branch can be a fork of
 only the super-repo while the submodules are not forked and so they should
 keep tracking their original repo.

That's correct. But this is case-by-case. Sometimes I make a change
where I want the submodule forked (rare), most times I don't.
Sometimes I can get away with pushing changes to the submodule and
worrying about it later since I know the submodule ref won't move
forward unless someone does update --remote (which isn't often or only
done as needed).

 To me this seems to be going in the opposite direction of having branches
 recursively apply to submodules, which I think most of us want.

 A branch should fork the entire repo, including its submodules.  The
 implication is that if you want to push that branch somewhere, that somewhere
 needs to be able to accept the forks of the submodules *even if those
 submodules aren't changed in your branch* because at the very least the
 branch ref has to exist in the submodules' repositories.

There are many levels on which this can apply. When it comes to
checkouts and such, I agree. However, how will this impact *creating*
branches? What about forking? Do you expect submodule forking and
branching to be automatic as well? Based on your description, it seems
so (although a new branch doesn't necessarily have to correspond to a
new fork, unless I'm misunderstanding you). This seems difficult to
do, especially the forking part since you would need an API for this
(Github, Atlassian Stash, etc), unless you are thinking of something
clever like local/relative forks.

However the inconvenience of forking manually isn't the main reason
why I avoid forking submodules. It's the complication of pull
requests. There is no uniformity there, which is unfortunate.
Recursive pull requests are something outside the scope of git, I
realize that, but it would still be nice. However the suggestion you
make here lays the foundation for that I think.

 With absolute-path submodules, the push is a simple as creating the branch
 ref in the submodules' home repositories -- even if the main somewhere
 you're pushing to isn't one of those repositories.

 With relative-path submodules, the push's target repo *must* also have the
 submodules in their proper places, so that they can get updated.
 Furthermore, if you clone a repo that has relative-path submodules you *must*
 also clone the submodules.

 Robert, I think what you'll say to this is that you still want your branch to
 track the latest submodule updates from their home repository. (BTW, I'm
 confused with how you're using the terms "upstream" and "origin".  I'll use
 "home" to refer to the repository where everything 

Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Christian Couder
On Mon, Aug 25, 2014 at 3:35 PM, Jeff King p...@peff.net wrote:
 On Sun, Aug 24, 2014 at 07:47:24PM +0530, Arjun Sreedharan wrote:

 diff --git a/bisect.c b/bisect.c
 index d6e851d..c96aab0 100644
 --- a/bisect.c
 +++ b/bisect.c
 @@ -215,10 +215,13 @@ static struct commit_list 
 *best_bisection_sorted(struct commit_list *list, int n
   }
   qsort(array, cnt, sizeof(*array), compare_commit_dist);
   for (p = list, i = 0; i < cnt; i++) {
 - struct name_decoration *r = xmalloc(sizeof(*r) + 100);
 + char name[100];
 + sprintf(name, "dist=%d", array[i].distance);
 + int name_len = strlen(name);
 + struct name_decoration *r = xmalloc(sizeof(*r) + name_len);

 This allocation should be name_len + 1 for the NUL-terminator, no?

I wondered about that too, but as struct name_decoration is defined like this:

struct name_decoration {
struct name_decoration *next;
int type;
char name[1];
};

the .name field of this struct already has one char, so the allocation
above should be ok.
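
Spelling out the arithmetic (just to make the off-by-one reasoning
visible; a sketch, not a new suggestion):

  /* sizeof(*r) already counts the 1 byte of "char name[1]", so: */
  struct name_decoration *r = xmalloc(sizeof(*r) + name_len);
  /* bytes usable for the string: name_len + 1, i.e. room for the NUL */
  memcpy(r->name, name, name_len + 1);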

 It looks like add_name_decoration in log-tree already handles half of
 what you are adding here. Can we just make that available globally (it
 is manipulating the already-global struct decoration name_decoration)?

Yeah, it looks like it should be better.

Note that add_name_decoration() does:

int nlen = strlen(name);
struct name_decoration *res = xmalloc(sizeof(struct name_decoration) + nlen);

so it also relies on the fact that .name contains one char.

 I also notice that we do not set r->type at all, meaning the decoration
 lookup code in log-tree will access uninitialized memory (worse, it will
 use it as a pointer offset into the color list; I got a segfault when I
 tried to run git rev-list --bisect-all v1.8.0..v1.9.0).

 I think we need this:

 diff --git a/bisect.c b/bisect.c
 index d6e851d..e2a7682 100644
 --- a/bisect.c
 +++ b/bisect.c
 @@ -219,6 +219,7 @@ static struct commit_list *best_bisection_sorted(struct 
 commit_list *list, int n
  struct object *obj = &(array[i].commit->object);

  sprintf(r->name, "dist=%d", array[i].distance);
  +   r->type = 0;
  r->next = add_decoration(&name_decoration, obj, r);
  p->item = array[i].commit;
  p = p->next;

 at a minimum.

Yeah if we don't use add_name_decoration() we would need that.
Thanks for noticing.

 It looks like this was a regression caused by eb3005e (commit.h: add
 'type' to struct name_decoration, 2010-06-19). Which makes me wonder if
 anybody actually _uses_ --bisect-all (which AFAICT is the only way to
 trigger the problem), but since it's public, I guess we should keep it.

Yeah, we should probably keep it.

 I think the sane thing here is to stop advertising name_decoration as a
 global, and make all callers use add_name_decoration. That makes it
 easier for callers like this one, and would have caught the regression
 caused by eb3005e (the compiler would have noticed that we were not
 passing a type parameter to the function).

I agree.

Thanks,
Christian.


Re: Re: Relative submodule URLs

2014-08-25 Thread Robert Dailey
On Sun, Aug 24, 2014 at 8:34 AM, Heiko Voigt hvo...@hvoigt.net wrote:
 New --with-remote parameter for 'git submodule'
 

 While having said all that about submodule settings I think a much
 much simpler start is to go ahead with a commandline setting, like
 Robert proposed here[2].

 For that we do not have to worry about how it can be stored,
 transported, defined per submodule or on a branch, since answers to this
 are given at the commandline (and current repository state).

 There are still open questions about this though:

   * Should the name in the submodule be 'origin' even though you
 specified --with-remote=somewhere? For me it's always confusing to
 have the same/similar remotes named differently in different
 repositories. That's why I try to keep the names the same in all my
 clones of repositories (i.e. for my private, github, upstream
 remotes).

   * When you do a 'git submodule sync --with-remote=somewhere' should
 the remote be added or replaced?

 My opinion on these are:

 The remote should be named as in the superproject so
 --with-remote=somewhere adds/replaces the remote 'somewhere' in the
 submodules named on the commandline (or all in case no submodule is
 specified). In case of a fresh clone of the submodule, there would be no
 origin but only a remote under the new name.

 Would the --with-remote feature I describe be a feasible start for you
 Robert? What do others think? Is the naming of the parameter
 '--with-remote' alright?

 Cheers Heiko

 [1] http://article.gmane.org/gmane.comp.version-control.git/255512
 [2] http://article.gmane.org/gmane.comp.version-control.git/255512
 [3] 
 https://github.com/jlehmann/git-submod-enhancements/wiki#special-ref-overriding-gitmodules-values

Hi Heiko,

My last email response was in violation of your request to keep the
two topics separate, sorry about that. I started typing it this
weekend and completed the draft this morning, without having read this
response from you first. At this point my only intention was to start
discussion on a possible short-term solution. I realize the Git
developers are working hard on improving submodule workflow for the
long term. In addition I do not have the domain expertise to properly
make suggestions in regards to longer-term solutions, so I leave that
to you :-)

The --with-remote feature would allow me to begin using relative
submodules because:

On a per-submodule basis, I can specify the remote it will use. When I
fork a submodule and need to start tracking it, I can run `git
submodule sync --with-remote fork`, which will take my super repo's
'fork' remote, REPLACE 'origin' in the submodule with that URL, and
also redo the relative URL calculation. This is ideal since I use HTTP
at home (so I can use my proxy server to access git behind the firewall at
work), and SSH when I am physically at work, for performance (to avoid the
HTTP protocol overhead). I also like the idea of never having to update my
submodule URLs again if the git server moves, the domain name changes, or
whatever else.
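
Concretely (host and repository names made up), a relative URL in
.gitmodules is resolved against whatever the superproject's remote URL
is, which is what makes switching the base remote per machine work:

  [submodule "libfoo"]
          path = vendor/libfoo
          url = ../libfoo.git

With the superproject cloned from ssh://work.example.com/group/super.git
the submodule resolves to ssh://work.example.com/group/libfoo.git; sync
the same checkout against https://home.example.com/group/super.git and
it resolves to https://home.example.com/group/libfoo.git instead.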

Here is what I think would make the feature most usable. I think you
went over some of these ideas but I just want to clarify, to make sure
we're on the same page. Please correct me as needed.

1. Running `git submodule update --with-remote <name>` shall fail the
command unconditionally.
2. Using the `--with-remote` option on submodule `update` or `sync`
will fail if it detects absolute submodule URLs in .gitmodules.
3. Running `git submodule update --init --with-remote <name>` shall
fail the command ONLY if a submodule is being processed that is NOT
also being initialized.
4. The behavior of git submodule's `update` or `sync` commands
combined with `--with-remote` will REPLACE or CREATE the 'origin'
remote in each submodule it is run in. We will not allow the user to
configure what the submodule remote name will end up being (I think
this is current behavior and forces good practice; I consider `origin`
an adopted standard for git, and actually wish it was more enforced
for super projects as well!)

Let me know if I've missed anything. Once we clarify requirements I'll
attempt to start work on this during my free time. I'll start by
testing this through msysgit, since I do not have Linux installed, but
I have Linux Mint running in a Virtual Machine so I can test on both
platforms as needed (I don't have a lot of experience on Linux
though).

I hope you won't mind me reaching out for questions as needed, however
I will attempt to be as resourceful as possible since I know you're
all busy. Thanks.


Re: Re: Relative submodule URLs

2014-08-25 Thread Robert Dailey
On Mon, Aug 25, 2014 at 9:29 AM, Robert Dailey rcdailey.li...@gmail.com wrote:
 On Sun, Aug 24, 2014 at 8:34 AM, Heiko Voigt hvo...@hvoigt.net wrote:
  New --with-remote parameter for 'git submodule'
 

 While having said all that about submodule settings I think a much
 much simpler start is to go ahead with a commandline setting, like
 Robert proposed here[2].

 For that we do not have to worry about how it can be stored,
 transported, defined per submodule or on a branch, since answers to this
 are given at the commandline (and current repository state).

 There are still open questions about this though:

   * Should the name in the submodule be 'origin' even though you
  specified --with-remote=somewhere? For me it's always confusing to
  have the same/similar remotes named differently in different
  repositories. That's why I try to keep the names the same in all my
 clones of repositories (i.e. for my private, github, upstream
 remotes).

   * When you do a 'git submodule sync --with-remote=somewhere' should
  the remote be added or replaced?

 My opinion on these are:

 The remote should be named as in the superproject so
 --with-remote=somewhere adds/replaces the remote 'somewhere' in the
 submodules named on the commandline (or all in case no submodule is
 specified). In case of a fresh clone of the submodule, there would be no
 origin but only a remote under the new name.

 Would the --with-remote feature I describe be a feasible start for you
 Robert? What do others think? Is the naming of the parameter
 '--with-remote' alright?

 Cheers Heiko

 [1] http://article.gmane.org/gmane.comp.version-control.git/255512
 [2] http://article.gmane.org/gmane.comp.version-control.git/255512
 [3] 
 https://github.com/jlehmann/git-submod-enhancements/wiki#special-ref-overriding-gitmodules-values

 Hi Heiko,

 My last email response was in violation of your request to keep the
 two topics separate, sorry about that. I started typing it this
 weekend and completed the draft this morning, without having read this
 response from you first. At this point my only intention was to start
 discussion on a possible short-term solution. I realize the Git
 developers are working hard on improving submodule workflow for the
 long term. In addition I do not have the domain expertise to properly
 make suggestions in regards to longer-term solutions, so I leave that
 to you :-)

 The --with-remote feature would allow me to begin using relative
 submodules because:

 On a per-submodule basis, I can specify the remote it will use. When I
 fork a submodule and need to start tracking it, I can run `git
 submodule sync --with-remote fork`, which will take my super repo's
 'fork' remote, REPLACE 'origin' in the submodule with that URL, and
 also redo the relative URL calculation. This is ideal since I use HTTP
  at home (so I can use my proxy server to access git behind the firewall at
  work), and SSH when I am physically at work, for performance (to avoid the
  HTTP protocol overhead). I also like the idea of never having to update my
  submodule URLs again if the git server moves, the domain name changes, or
  whatever else.

 Here is what I think would make the feature most usable. I think you
 went over some of these ideas but I just want to clarify, to make sure
 we're on the same page. Please correct me as needed.

 1. Running `git submodule update --with-remote <name>` shall fail the
 command unconditionally.
 2. Using the `--with-remote` option on submodule `update` or `sync`
 will fail if it detects absolute submodule URLs in .gitmodules.
 3. Running `git submodule update --init --with-remote <name>` shall
 fail the command ONLY if a submodule is being processed that is NOT
 also being initialized.
 4. The behavior of git submodule's `update` or `sync` commands
 combined with `--with-remote` will REPLACE or CREATE the 'origin'
 remote in each submodule it is run in. We will not allow the user to
 configure what the submodule remote name will end up being (I think
 this is current behavior and forces good practice; I consider `origin`
 an adopted standard for git, and actually wish it was more enforced
 for super projects as well!)

 Let me know if I've missed anything. Once we clarify requirements I'll
 attempt to start work on this during my free time. I'll start by
 testing this through msysgit, since I do not have Linux installed, but
 I have Linux Mint running in a Virtual Machine so I can test on both
 platforms as needed (I don't have a lot of experience on Linux
 though).

 I hope you won't mind me reaching out for questions as needed, however
 I will attempt to be as resourceful as possible since I know you're
 all busy. Thanks.

Thought of a few more:

5. If `--with-remote` is unspecified, behavior will continue as it
currently does (I'm not clear on the precedence here of various
options, but I assume: `remote.default` first, then
`branch.<name>.remote`)
6. `--with-remote` will take 

Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 04:06:52PM +0200, Christian Couder wrote:

  This allocation should be name_len + 1 for the NUL-terminator, no?
 
 I wondered about that too, but as struct name_decoration is defined like this:
 
 struct name_decoration {
 struct name_decoration *next;
 int type;
 char name[1];
 };
 
 the .name field of this struct already has one char, so the allocation
 above should be ok.

Yeah, you're right. I would argue it should just be FLEX_ARRAY for
consistency with other spots, though (in which case add_name_decoration
needs to be updated with a +1).

Running git grep '^char [^ ]*\[[01]]' -- '*.[ch]' shows that this
is one of only two spots that don't use FLEX_ARRAY (and the other has a
comment explaining why not).
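
For reference, the FLEX_ARRAY spelling would look something like this (a
sketch; the real conversion would of course touch every user of the
struct):

  struct name_decoration {
          struct name_decoration *next;
          int type;
          char name[FLEX_ARRAY]; /* empty, [0] or [1] depending on the compiler */
  };

  /* ...and the allocation then carries the explicit +1 for the NUL: */
  struct name_decoration *res =
          xmalloc(sizeof(struct name_decoration) + strlen(name) + 1);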

-Peff


Re: [PATCH v5 2/4] Change GIT_ALLOC_LIMIT check to use git_parse_ulong()

2014-08-25 Thread Steffen Prohaska

On Aug 25, 2014, at 1:38 PM, Jeff King p...@peff.net wrote:

 On Sun, Aug 24, 2014 at 06:07:44PM +0200, Steffen Prohaska wrote:
 
 diff --git a/wrapper.c b/wrapper.c
 index bc1bfb8..69d1c9b 100644
 --- a/wrapper.c
 +++ b/wrapper.c
 @@ -11,14 +11,18 @@ static void (*try_to_free_routine)(size_t size) = 
 do_nothing;
 
 static void memory_limit_check(size_t size)
 {
 -static int limit = -1;
 -if (limit == -1) {
  -const char *env = getenv("GIT_ALLOC_LIMIT");
 -limit = env ? atoi(env) * 1024 : 0;
 +static size_t limit = SIZE_MAX;
 +if (limit == SIZE_MAX) {
 
 You use SIZE_MAX as the sentinel for not set, and 0 as the sentinel
 for no limit. That seems kind of backwards.
 
 I guess you are inheriting this from the existing code, which lets
 GIT_ALLOC_LIMIT=0 mean no limit. I'm not sure if we want to keep that
 or not (it would be backwards incompatible to change it, but we are
 already breaking compatibility here by assuming bytes rather than
 kilobytes; I think that's OK because this is not a documented feature,
 or one intended to be used externally).

I think it's reasonable that GIT_ALLOC_LIMIT=0 means no limit, so that
the limit can easily be disabled temporarily.

But I could change the sentinel and handle 0 like:

if (git_parse_ulong(env, &val)) {
if (!val) {
val = SIZE_MAX;
}
}

Maybe we should do this.



 +const char *var = "GIT_ALLOC_LIMIT";
 +unsigned long val = 0;
 +const char *env = getenv(var);
 +if (env && !git_parse_ulong(env, &val))
 +die("Failed to parse %s", var);
 +limit = val;
  }
 
 This and the next patch both look OK to me, but I notice this part is
 largely duplicated between the two. We already have git_env_bool to do a
 similar thing for boolean environment variables. Should we do something
 similar like:
 
 diff --git a/config.c b/config.c
 index 058505c..11919eb 100644
 --- a/config.c
 +++ b/config.c
 @@ -1122,6 +1122,14 @@ int git_env_bool(const char *k, int def)
   return v ? git_config_bool(k, v) : def;
 }
 
 +unsigned long git_env_ulong(const char *k, unsigned long val)
 +{
 + const char *v = getenv(k);
 + if (v && !git_parse_ulong(v, &val))
 + die("failed to parse %s", k);
 + return val;
 +}
 +
 int git_config_system(void)
 {
   return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
 
 It's not a lot of code, but I think the callers end up being much easier
 to read:
 
  if (limit == SIZE_MAX)
   limit = git_env_ulong("GIT_ALLOC_LIMIT", 0);

I think you're right.  I'll change it.


Steffen


Re: [PATCH v5 2/4] Change GIT_ALLOC_LIMIT check to use git_parse_ulong()

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 05:06:22PM +0200, Steffen Prohaska wrote:

 I think it's reasonable that GIT_ALLOC_LIMIT=0 means no limit, so that
 the limit can easily be disabled temporarily.

IMHO, GIT_ALLOC_LIMIT= (i.e., the empty string) would be a good way to
say that (and I guess that even works currently, due to the way atoi
works, but I suspect git_parse_ulong might complain). It is probably not
worth worrying about too much. This is not even a user-facing interface,
and the test scripts just set it to 0.

So I'd be OK going that direction, or just leaving it as-is.

-Peff


Help improving the future of debugging

2014-08-25 Thread Benjamin Siegmund
Dear Git Developers,

Since 1974, researchers and software developers have tried to ease software
debugging. Over the last years, they have created many new tools and
formalized methods. We are interested in whether these advancements have
reached professional software developers and how they have influenced their
approach. To find this out, we are conducting an online survey for
software developers. From the results we expect new insights into
debugging practice that will help us suggest new directions for future
research. So if you are a software developer or know any software
developers, you can really help us.  The survey is, of course, fully
anonymous and will take about 15 minutes to fill out. Feel free to
redistribute this message to anyone who you think might be interested.
The survey can be reached at:


http://www.uni-potsdam.de/skopie-up/index.php/689349


Thank you for your interest,
Benjamin Siegmund


Re: [BUG] resolved deltas

2014-08-25 Thread René Scharfe
Am 23.08.2014 um 13:18 schrieb Jeff King:
 On Sat, Aug 23, 2014 at 07:04:59AM -0400, Jeff King wrote:
 
 On Sat, Aug 23, 2014 at 06:56:40AM -0400, Jeff King wrote:

 So I think your patch is doing the right thing.

 By the way, if you want to add a test to your patch, there is
 infrastructure in t5308 to create packs with duplicate objects. If I
 understand the problem correctly, you could trigger this by having a
 delta object whose base is duplicated, even without the missing object.
 
 This actually turned out to be really easy. The test below fails without
 your patch and passes with it. Please feel free to squash it in.
 
 diff --git a/t/t5308-pack-detect-duplicates.sh 
 b/t/t5308-pack-detect-duplicates.sh
 index 9c5a876..50f7a69 100755
 --- a/t/t5308-pack-detect-duplicates.sh
 +++ b/t/t5308-pack-detect-duplicates.sh
 @@ -77,4 +77,19 @@ test_expect_success 'index-pack can reject packs with 
 duplicates' '
   test_expect_code 1 git cat-file -e $LO_SHA1
   '
   
  +test_expect_success 'duplicated delta base does not trigger assert' '
  + clear_packs &&
  + {
  + A=01d7713666f4de822776c7622c10f1b07de280dc &&
  + B=e68fe8129b546b101aee9510c5328e7f21ca1d18 &&
  + pack_header 3 &&
  + pack_obj $A $B &&
  + pack_obj $B &&
  + pack_obj $B
  + } >dups.pack &&
  + pack_trailer dups.pack &&
  + git index-pack --stdin <dups.pack &&
  + test_must_fail git index-pack --stdin --strict <dups.pack
  +'
 +
   test_done

Thanks, that looks good.  But while preparing the patch I noticed that
the added test sometimes fails.  Helgrind pointed out a race
condition.  It is not caused by the patch to turn the asserts into
regular ifs, however -- here's a Helgrind report for the original code
with the new test:

==34949== Helgrind, a thread error detector
==34949== Copyright (C) 2007-2013, and GNU GPL'd, by OpenWorks LLP et al.
==34949== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==34949== Command: /home/lsr/src/git/t/../bin-wrappers/git index-pack --stdin
==34949==
==34949== Helgrind, a thread error detector
==34949== Copyright (C) 2007-2013, and GNU GPL'd, by OpenWorks LLP et al.
==34949== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==34949== Command: /home/lsr/src/git/git index-pack --stdin
==34949==
==34949== ---Thread-Announcement--
==34949==
==34949== Thread #3 was created
==34949==at 0x594DF7E: clone (clone.S:74)
==34949==by 0x544A2B9: do_clone.constprop.3 (createthread.c:75)
==34949==by 0x544B762: pthread_create@@GLIBC_2.2.5 (createthread.c:245)
==34949==by 0x4C2D55D: pthread_create_WRK (hg_intercepts.c:269)
==34949==by 0x43ABB8: cmd_index_pack (index-pack.c:1097)
==34949==by 0x405B6A: handle_builtin (git.c:351)
==34949==by 0x404CE8: main (git.c:575)
==34949==
==34949== ---Thread-Announcement--
==34949==
==34949== Thread #2 was created
==34949==at 0x594DF7E: clone (clone.S:74)
==34949==by 0x544A2B9: do_clone.constprop.3 (createthread.c:75)
==34949==by 0x544B762: pthread_create@@GLIBC_2.2.5 (createthread.c:245)
==34949==by 0x4C2D55D: pthread_create_WRK (hg_intercepts.c:269)
==34949==by 0x43ABB8: cmd_index_pack (index-pack.c:1097)
==34949==by 0x405B6A: handle_builtin (git.c:351)
==34949==by 0x404CE8: main (git.c:575)
==34949==
==34949== 
==34949==
==34949== Possible data race during read of size 4 at 0x5E15910 by thread #3
==34949== Locks held: none
==34949==at 0x439327: find_unresolved_deltas (index-pack.c:918)
==34949==by 0x439666: threaded_second_pass (index-pack.c:1002)
==34949==by 0x4C2D6F6: mythread_wrapper (hg_intercepts.c:233)
==34949==by 0x544B0A3: start_thread (pthread_create.c:309)
==34949==
==34949== This conflicts with a previous write of size 4 by thread #2
==34949== Locks held: none
==34949==at 0x4390E2: resolve_delta (index-pack.c:865)
==34949==by 0x439340: find_unresolved_deltas (index-pack.c:919)
==34949==by 0x439666: threaded_second_pass (index-pack.c:1002)
==34949==by 0x4C2D6F6: mythread_wrapper (hg_intercepts.c:233)
==34949==by 0x544B0A3: start_thread (pthread_create.c:309)
==34949==
==34949== Address 0x5E15910 is 48 bytes inside a block of size 256 alloc'd
==34949==at 0x4C2A7D0: calloc (vg_replace_malloc.c:618)
==34949==by 0x50CA83: xcalloc (wrapper.c:119)
==34949==by 0x439AF6: cmd_index_pack (index-pack.c:1643)
==34949==by 0x405B6A: handle_builtin (git.c:351)
==34949==by 0x404CE8: main (git.c:575)
==34949==
git: builtin/index-pack.c:918: find_unresolved_deltas_1: Assertion 
`child->real_type == OBJ_REF_DELTA' failed.
==34949==
==34949== For counts of detected and suppressed errors, rerun with: -v
==34949== Use --history-level=approx or =none to gain increased speed, at
==34949== the cost of reduced accuracy of conflicting-access 

Re: [PATCH v5 4/4] convert: Stream from fd to required clean filter instead of mmap

2014-08-25 Thread Steffen Prohaska

On Aug 25, 2014, at 2:43 PM, Jeff King p...@peff.net wrote:

 On Sun, Aug 24, 2014 at 06:07:46PM +0200, Steffen Prohaska wrote:
 
 The data is streamed to the filter process anyway.  Better avoid mapping
 the file if possible.  This is especially useful if a clean filter
 reduces the size, for example if it computes a sha1 for binary data,
 like git media.  The file size that the previous implementation could
 handle was limited by the available address space; large files for
 example could not be handled with (32-bit) msysgit.  The new
 implementation can filter files of any size as long as the filter output
 is small enough.
 
 The new code path is only taken if the filter is required.  The filter
 consumes data directly from the fd.  The original data is not available
 to git, so it must fail if the filter fails.
 
 Can you clarify this second paragraph a bit more? If I understand
 correctly, we handle a non-required filter failing by just reading the
 data again (which we can do because we either read it into memory
 ourselves, or mmap it).

We don't read the data again.  convert_to_git() assumes that it is already
in memory and simply keeps the original buffer if the filter fails.


 With the streaming approach, we will read the
 whole file through our stream; if that fails we would then want to read
 the stream from the start.
 
 Couldn't we do that with an lseek (or even an mmap with offset 0)? That
 obviously would not work for non-file inputs, but I think we address
 that already in index_fd: we push non-seekable things off to index_pipe,
 where we spool them to memory.

It could be handled that way, but we would be back to the original problem
that 32-bit git fails for large files.  The convert code path currently
assumes that all data is available in a single buffer at some point to apply
crlf and ident filters.

If the initial filter, which is assumed to reduce the file size, fails, we
could seek to 0 and read the entire file.  But git would then fail for large
files with out-of-memory.  We would not gain anything for the use case that
I describe in the commit message's first paragraph.

To implement something like the ideal strategy below, the entire convert 
machinery for crlf and ident would have to be converted to a streaming
approach.  Another option would be to detect that only the clean filter
would be applied and not crlf and ident.  Maybe we could get away with
something simpler then.

But I think that if the clean filter's purpose is to reduce file size, it
does not make sense to try to handle the case of a failing filter with a 
fallback plan.  The filter should simply be marked required, because
any sane operation requires it.
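
As an illustration of that last point (driver name and commands here are
made up), such a filter would typically be wired up as a required driver:

  # .gitattributes
  *.bin filter=media

  # .git/config (or ~/.gitconfig)
  [filter "media"]
          clean = media-clean %f
          smudge = media-smudge %f
          required = true

With required = true, git refuses to fall back to the unfiltered content
when the filter fails, which is exactly the behaviour the streaming path
relies on.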


 So it seems like the ideal strategy would be:
 
  1. If it's seekable, try streaming. If not, fall back to lseek/mmap.
 
  2. If it's not seekable and the filter is required, try streaming. We
 die anyway if we fail.
 
  3. If it's not seekable and the filter is not required, decide based
 on file size:
 
   a. If it's small, spool to memory and proceed as we do now.
 
   b. If it's big, spool to a seekable tempfile.
 
 Your patch implements part 2. But I would think part 1 is the most common
 case. And while part 3b seems unpleasant, it is better than the current
 code (with or without your patch), which will do 3a on a large file.
 
 Hmm. Though I guess in (3) we do not have the size up front, so it's
 complicated (we could spool N bytes to memory, then start dumping to a
 file after that). I do not think we necessarily need to implement that
 part, though. It seems like (1) is the thing I would expect to hit the
   most (i.e., people do not always mark their filters as required).

Well, I think they have to mark it if the filter's purpose is to reduce size.

I'll add a bit of the discussion to the commit message.  I'm not convinced
that we should do more at this point.


 +} else {
 +/* dup(), because copy_fd() closes the input fd. */
  +fd = dup(params->fd);
 
 Not a problem you are introducing, but this seem kind of like a
 misfeature in copy_fd. Is it worth fixing? The function only has two
 existing callers.

I found it confusing.  I think it's worth fixing.

Steffen


Re: [PATCH 4/5] fast-import: fix buffer overflow in dump_tags

2014-08-25 Thread Ronnie Sahlberg
Jeff,
We have a fix like this in the next set of transaction updates.
https://code-review.googlesource.com/#/c/1012/13/fast-import.c

However, if your concern is the integrity of the servers and you are not
taking any chances, you might not want to wait for my patches to graduate.


ronnie sahlberg

On Fri, Aug 22, 2014 at 10:32 PM, Jeff King p...@peff.net wrote:
 When creating a new annotated tag, we sprintf the refname
 into a static-sized buffer. If we have an absurdly long
 tagname, like:

    git init repo &&
    cd repo &&
    git commit --allow-empty -m foo &&
    git tag -m message mytag &&
    git fast-export mytag |
    perl -lpe '/^tag/ and s/mytag/"a" x 8192/e' |
    git fast-import <input

 we'll overflow the buffer. We can fix it by using a strbuf.

 Signed-off-by: Jeff King p...@peff.net
 ---
 I'm not sure how easily exploitable this is. The buffer is on the stack,
 and we definitely demolish the return address. But we never actually
 return from the function, since lock_ref_sha1 will fail in such a case
 and die (it cannot succeed because the name is longer than PATH_MAX,
 which we check when concatenating it with $GIT_DIR).

 Still, there is no limit to the size of buffer you can feed it, so it's
 entirely possible you can overwrite something else and cause some
 mischief. So I wouldn't call it trivially exploitable, but I would not
 bet my servers that it is not (and of course it is easy to trigger if
  you can convince somebody to run fast-import on a stream, so any remote
 helpers produce a potentially vulnerable situation).

  fast-import.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

 diff --git a/fast-import.c b/fast-import.c
 index f25a4ae..a1479e9 100644
 --- a/fast-import.c
 +++ b/fast-import.c
 @@ -1734,14 +1734,16 @@ static void dump_tags(void)
  static const char *msg = "fast-import";
  struct tag *t;
  struct ref_lock *lock;
  -   char ref_name[PATH_MAX];
  +   struct strbuf ref_name = STRBUF_INIT;

  for (t = first_tag; t; t = t->next_tag) {
  -   sprintf(ref_name, "tags/%s", t->name);
  -   lock = lock_ref_sha1(ref_name, NULL);
  +   strbuf_reset(&ref_name);
  +   strbuf_addf(&ref_name, "tags/%s", t->name);
  +   lock = lock_ref_sha1(ref_name.buf, NULL);
  if (!lock || write_ref_sha1(lock, t->sha1, msg) < 0)
  -   failure |= error("Unable to update %s", ref_name);
  +   failure |= error("Unable to update %s", ref_name.buf);
  }
  +   strbuf_release(&ref_name);
  }

  static void dump_marks_helper(FILE *f,
 --
 2.1.0.346.ga0367b9



Re: [PATCH 2/2] Loop index increases monotonically when reading unmerged entries

2014-08-25 Thread Junio C Hamano
Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 I think this line is dangerous, if add_cache_entry is not able to
 remove higher-stages it will be looping forever, as happens in the
 case of this thread.
 I cannot see why it's even needed, and removing it doesn't break any test.

 On Sun, Aug 24, 2014 at 7:57 PM, Jaime Soriano Pastor
 jsorianopas...@gmail.com wrote:
 Signed-off-by: Jaime Soriano Pastor jsorianopas...@gmail.com
 ---
  read-cache.c | 1 -
  1 file changed, 1 deletion(-)

 diff --git a/read-cache.c b/read-cache.c
 index c1a9619..3d70386 100644
 --- a/read-cache.c
 +++ b/read-cache.c
 @@ -1971,7 +1971,6 @@ int read_index_unmerged(struct index_state *istate)
 if (add_index_entry(istate, new_ce, 0))
 return error("%s: cannot drop to stage #0",
  new_ce->name);
 -   i = index_name_pos(istate, new_ce->name, len);

I think the original idea was that regardless of how many entries
with the same name were removed because of the replacement (or
addition) of new_ce, by making i point at the newly added
new_ce, we would make sure that the loop will continue from the
next entry.  The if/return expected that add_index_entry() will get
rid of all the other entries with the same name as new_ce has or it
will return an error.

Without the bug in add_index_entry(), because new_ce always has
the same name as ce, the entry we found at i by the loop, we know
that index_name_pos() will give the same i we already have, so
removing this line should be a no-op.

Now, add_index_entry() in your case did not notice that it failed to
remove all other entries with the same name as new_ce, resulting
in your looping forever.  Among the "merged and unmerged entries
with the same name exist in the index file" class of index file
corruption, we could treat the "merged and unmerged entries with the
same name not just exist but are next to each other, unmerged ones
coming immediately after the merged one" case specially (i.e. declaring
that it is more likely for broken software to leave both next to
each other than otherwise) and try to accommodate it as your original
patch did.  I am not absolutely sure if such a special case is worth
it, and with your updated [1/2] "read_index_from(): check order of
entries when reading index" we will not be doing so, which is good.

With that safety in place, the bug in add_index_entry() will
disappear; it is safe not to adjust i by calling index_name_pos()
and this patch, [2/2] "read_index_unmerged(): remove unnecessary
loop index adjustment", will be a good thing to do.

Thanks.

 }
 return unmerged;
  }
 --
 2.0.4.1.g0b8a4f9.dirty



Re: [PATCH 3/5] fast-import: clean up pack_data pointer in end_packfile

2014-08-25 Thread Ronnie Sahlberg
Print an error before returning when pack_data is NULL?

Otherwise, LGTM

On Fri, Aug 22, 2014 at 10:27 PM, Jeff King p...@peff.net wrote:
 We have a global pointer pack_data pointing to the current
 pack we have open. Inside end_packfile we have two new
 pointers, old_p and new_p. The former points to pack_data,
 and the latter points to the new installed version of the
 packfile we get when we hand the file off to the regular
 sha1_file machinery. We then free old_p.

 Presumably the extra old_p pointer was there so that we
 could overwrite pack_data with new_p and still free old_p,
 but we don't do that. We just leave pack_data pointing to
 bogus memory, and don't overwrite it until we call
 start_packfile again (if ever).

 This can cause problems for our die routine, which calls
 end_packfile to clean things up. If we die at the wrong
 moment, we can end up looking at invalid memory in
 pack_data left after the last end_packfile().

 Instead, let's make sure we set pack_data to NULL after we
 free it, and make calling end_packfile() again with a NULL
 pack_data a noop (there is nothing to end).

 We can further make things less confusing by dropping old_p
 entirely, and moving new_p closer to its point of use.

 Signed-off-by: Jeff King p...@peff.net
 ---
 Noticed while running fast-import under valgrind to debug the next
 commit. :)

  fast-import.c | 11 +++
  1 file changed, 7 insertions(+), 4 deletions(-)

 diff --git a/fast-import.c b/fast-import.c
 index d73f58c..f25a4ae 100644
 --- a/fast-import.c
 +++ b/fast-import.c
 @@ -946,10 +946,12 @@ static void unkeep_all_packs(void)

  static void end_packfile(void)
  {
 -   struct packed_git *old_p = pack_data, *new_p;
 +   if (!pack_data)
 +   return;

 clear_delta_base_cache();
 if (object_count) {
 +   struct packed_git *new_p;
 unsigned char cur_pack_sha1[20];
 char *idx_name;
 int i;
 @@ -991,10 +993,11 @@ static void end_packfile(void)
 pack_id++;
 }
 else {
  -   close(old_p->pack_fd);
  -   unlink_or_warn(old_p->pack_name);
  +   close(pack_data->pack_fd);
  +   unlink_or_warn(pack_data->pack_name);
 }
 -   free(old_p);
 +   free(pack_data);
 +   pack_data = NULL;

 /* We can't carry a delta across packfiles. */
  strbuf_release(&last_blob.data);
 --
 2.1.0.346.ga0367b9



Re: [BUG] resolved deltas

2014-08-25 Thread Shawn Pearce
On Sat, Aug 23, 2014 at 3:56 AM, Jeff King p...@peff.net wrote:
 [+cc spearce, as I think this is a bug in code.google.com's sending
  side, and he can probably get the attention of the right folks]
...
 My guess is a bug on the sending side. We have seen duplicate pack
 objects before, but never (AFAIK) combined with a missing object. This
 really seems like the sender just sent the wrong data for one object.

 IIRC, code.google.com is backed by their custom Dulwich implementation
 which runs on BigTable. We'll see if Shawn can get this to the right
 people as a bug report. :)

Thanks. This is a bug in the code.google.com implementation that is
running on Bigtable. I forwarded the report to the team that manages
that service so they can investigate further.


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Junio C Hamano
Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 Subject: Re: [PATCH 1/2] Check order when reading index

Please be careful when crafting the commit title.  This single line
will be the only one that readers will have to identify the change
among hundreds of entries in git shortlog output when trying to
see what kind of change went into the project during the given
period.  Something like:

read_index_from(): catch out of order entries while reading an index file

perhaps?

 Signed-off-by: Jaime Soriano Pastor jsorianopas...@gmail.com
 ---
  read-cache.c | 18 ++
  1 file changed, 18 insertions(+)

 diff --git a/read-cache.c b/read-cache.c
 index 7f5645e..c1a9619 100644
 --- a/read-cache.c
 +++ b/read-cache.c
 @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct 
 ondisk_cache_entry *ondisk,
   return ce;
  }
  
 +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)

Does this have to be global, i.e. not static void ...?

 +{
 + int name_compare = strcmp(ce->name, next_ce->name);
 + if (0 > name_compare)
 + die("Unordered stage entries in index");
 + if (!name_compare) {
 + if (!ce_stage(ce))
 + die("Multiple stage entries for merged file '%s'",
 + ce->name);

OK.  If ce is at stage #0, no other entry can have the same name
regardless of the stage, and next_ce having the same name violates
that rule.

 + if (ce_stage(ce) >= ce_stage(next_ce))
 + die("Unordered stage entries for '%s'",
 + ce->name);

Not quite.  We do allow multiple higher stage entries; having two or
more stage #1 entries is perfectly fine during a merge resolution,
and both ce and next_ce may be pointing at the stage #1 entries of
the same path.  Replacing the comparison with > is sufficient, I
think.

Thanks.

 + }
 +}
 +
  /* remember to discard_cache() before reading a different cache! */
  int read_index_from(struct index_state *istate, const char *path)
  {
 @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const 
 char *path)
   ce = create_from_disk(disk_ce, &consumed, previous_name);
   set_index_entry(istate, i, ce);
 
 + if (i > 0)
 + check_ce_order(istate->cache[i - 1], ce);
 +
   src_offset += consumed;
   }
   strbuf_release(&previous_name_buf);
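
For illustration, the check with both of the suggested changes applied
(made static, and comparing with ">" so that repeated stage #1 entries
for the same path are still accepted) could look like this rough
sketch, not the patch as resubmitted:

    static void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
    {
            int name_compare = strcmp(ce->name, next_ce->name);

            if (0 > name_compare)
                    die("Unordered stage entries in index");
            if (!name_compare) {
                    if (!ce_stage(ce))
                            die("Multiple stage entries for merged file '%s'",
                                ce->name);
                    if (ce_stage(ce) > ce_stage(next_ce))
                            die("Unordered stage entries for '%s'", ce->name);
            }
    }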


Re: [PATCH 5/5] fast-import: stop using lock_ref_sha1

2014-08-25 Thread Ronnie Sahlberg
The next ref transaction series does a similar change and ends up
removing the function lock_ref_sha1() completely.
https://code-review.googlesource.com/#/c/1017/19/refs.c

So I think we can drop this patch.


ronnie sahlberg


On Fri, Aug 22, 2014 at 10:33 PM, Jeff King p...@peff.net wrote:
 We can use lock_any_ref_for_update instead. Besides being
 more flexible, the only difference between the two is that
 lock_ref_sha1 does not allow top-level refs like
 refs/foo to be updated. However, we know that we do not
 have such a ref, because we explicitly add the tags/
 prefix ourselves.

 Note that we now must feed the whole name "refs/tags/X"
 instead of just "tags/X" to the function. As a result, our
 failure error message uses the longer name. This is
 probably a good thing, though.

 As an interesting side note, if we forgot to switch this
 input to the function, the tests do not currently catch it.
 We import a tag X and then check that we can access it at
 tags/X. If we accidentally created tags/X at the
 top-level $GIT_DIR instead of under refs/, we would still
 find it due to our ref lookup procedure!

 We do not make such a mistake in this patch, of course, but
 while we're thinking about it, let's make the fast-import
 tests more robust by checking for fully qualified
 refnames.

 Signed-off-by: Jeff King p...@peff.net
 ---
 As I mentioned, I'd be OK with dropping this one in favor of just
 waiting for Ronnie's transaction patches to graduate.

  fast-import.c  | 4 ++--
  t/t9300-fast-import.sh | 8 
  2 files changed, 6 insertions(+), 6 deletions(-)

 diff --git a/fast-import.c b/fast-import.c
 index a1479e9..04a85a4 100644
 --- a/fast-import.c
 +++ b/fast-import.c
 @@ -1738,8 +1738,8 @@ static void dump_tags(void)

 for (t = first_tag; t; t = t->next_tag) {
 strbuf_reset(&ref_name);
 -   strbuf_addf(&ref_name, "tags/%s", t->name);
 -   lock = lock_ref_sha1(ref_name.buf, NULL);
 +   strbuf_addf(&ref_name, "refs/tags/%s", t->name);
 +   lock = lock_any_ref_for_update(ref_name.buf, NULL, 0, NULL);
 if (!lock || write_ref_sha1(lock, t->sha1, msg) < 0)
 failure |= error("Unable to update %s", ref_name.buf);
 }
 diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
 index 5fc9ef2..f4c6673 100755
 --- a/t/t9300-fast-import.sh
 +++ b/t/t9300-fast-import.sh
 @@ -153,7 +153,7 @@ tag series-A
  An annotated tag without a tagger
  EOF
  test_expect_success 'A: verify tag/series-A' '
 -   git cat-file tag tags/series-A >actual &&
 +   git cat-file tag refs/tags/series-A >actual &&
 test_cmp expect actual
  '

 @@ -165,7 +165,7 @@ tag series-A-blob
  An annotated tag that annotates a blob.
  EOF
  test_expect_success 'A: verify tag/series-A-blob' '
 -   git cat-file tag tags/series-A-blob >actual &&
 +   git cat-file tag refs/tags/series-A-blob >actual &&
 test_cmp expect actual
  '

 @@ -232,8 +232,8 @@ EOF
  test_expect_success \
 'A: tag blob by sha1' \
 'git fast-import <input &&
 -   git cat-file tag tags/series-A-blob-2 >actual &&
 -   git cat-file tag tags/series-A-blob-3 >>actual &&
 +   git cat-file tag refs/tags/series-A-blob-2 >actual &&
 +   git cat-file tag refs/tags/series-A-blob-3 >>actual &&
 test_cmp expect actual'

  test_tick
 --
 2.1.0.346.ga0367b9


Re: [PATCH 0/5] fix pack-refs pruning of refs/foo

2014-08-25 Thread Ronnie Sahlberg
On Fri, Aug 22, 2014 at 10:23 PM, Jeff King p...@peff.net wrote:
 I noticed that git pack-refs --all will pack a top-level ref like
 refs/foo, but will not actually prune $GIT_DIR/refs/foo. I do not
 see the point in having a policy not to pack refs/foo if --all is
 given. But even if we did have such a policy, this seems broken; we
 should either pack and prune, or leave them untouched. I don't see any
 indication that the existing behavior was intentional.

 The problem is that pack-refs's prune_ref calls lock_ref_sha1, which
 enforces this no toplevel behavior. I am not sure there is any real
 point to this, given that most callers use lock_any_ref_for_update,
 which is exactly equivalent but without the toplevel check.

 The first two patches deal with this by switching pack-refs to use
 lock_any_ref_for_update. This will conflict slightly with Ronnie's
 ref-transaction work, as he gets rid of lock_ref_sha1 entirely, and
 moves the code directly into prune_ref. This can be trivially resolved
 in favor of my patch, I think.

 The third patch is a cleanup I noticed while looking around, and I think
 should not conflict with anyone (and is a good thing to do).

 The last two are trickier. I wondered if we could get rid of
 lock_ref_sha1 entirely. After pack-refs, there are two callers:
 fast-import.c and walker.c. After converting the first, it occurred to
 me that Ronnie might be touching the same areas, and I see that yes,
 indeed, there's quite a bit of conflict (and he reaches the same end
 goal of dropping it entirely).

 So in that sense I do not mind dropping the final two patches. Ronnie's
 endpoint is much better, moving to a ref_transaction. However, there is
 actually a buffer overflow in the existing code. Ronnie's series fixes
 it in a similar way (moving to a strbuf), and I'm fine with that
 endpoint. But given that the ref transaction code is not yet merged (and
 would certainly never be maint-track), I think it is worth applying the
 buffer overflow fix separately.

 I think the final patch can probably be dropped, then. It is a clean-up,
 but one that we can just get when Ronnie's series is merged.

   [1/5]: git-prompt: do not look for refs/stash in $GIT_DIR
   [2/5]: pack-refs: prune top-level refs like refs/foo
   [3/5]: fast-import: clean up pack_data pointer in end_packfile
   [4/5]: fast-import: fix buffer overflow in dump_tags
   [5/5]: fast-import: stop using lock_ref_sha1


+1 on 1-3
+1 on 4. While I have a similar fix in the transaction series, you
should not need to wait for that series to address a security concern.
5: I think this one is not as urgent as the others so would prefer if
it is dropped, just so it doesn't cause more merge conflicts than is
already present in the transaction series.


1-4:
Reviewed-by: Ronnie Sahlberg sahlb...@google.com


 -Peff


Re: [PATCH v2 1/3] Push the NATIVE_CRLF Makefile variable to C and added a test for native.

2014-08-25 Thread Junio C Hamano
Torsten Bögershausen tbo...@web.de writes:

 On 2014-08-23 00.54, Junio C Hamano wrote:
 Torsten Bögershausen tbo...@web.de writes:
 
 Commit 95f31e9a correctly points out that the NATIVE_CRLF setting is
 incorrectly set on Mingw git. However, the Makefile variable is not
 propagated to the C preprocessor and results in no change. This patch
 pushes the definition to the C code and adds a test to validate that
 when core.eol as native is crlf, we actually normalize text files to this
 line ending convention when core.autocrlf is false.

 Signed-off-by: Pat Thoyts pattho...@users.sourceforge.net
 Signed-off-by: Stepan Kasal ka...@ucw.cz
 Signed-off-by: Torsten Bögershausen tbo...@web.de
 ---
 
 Who should I record as the author of this patch?
 

 Sorry for missing this, here is what Mingw says: 

 commit 0caba2cacbb9d8e6a31783b45f1a13e52dec6ce8
 Author: Pat Thoyts pattho...@users.sourceforge.net
 Date:   Mon Nov 26 00:24:00 2012 +

 Push the NATIVE_CRLF Makefile variable to C and added a test for native.
 
 Commit 95f31e9a correctly points out that the NATIVE_CRLF setting is
 []

When forwarding somebody else's patch, please start the *body* of
your message with the in-body header to force the author, followed
by a blank line and then the message, i.e.

From: Pat Thoyts pattho...@users.sourceforge.net

Commit 95f31e9a correctly points out that the NATIVE_CRLF setting is
incorrectly set on Mingw git. However, the Makefile variable is not...
...

The request applies to other patches in the series as well.  I
suspect that using send-email on format-patch output may do the
right thing automatically.

Thanks.


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-25 Thread Ronnie Sahlberg
On Fri, Aug 22, 2014 at 10:46 PM, Jeff King p...@peff.net wrote:
 On Fri, Aug 22, 2014 at 11:45:15AM -0700, Jonathan Nieder wrote:

 Junio C Hamano wrote:

 implication of which is that the 'at least one slash'
  rule was to expect things are 'refs/anything' so there will be at
  least one.  Even back then, that anything alone had at least one
  slash (e.g. heads/master), but the intention was *never* that we
  would forbid anything that does not have a slash by feeding
  anything part alone to check-ref-format, i.e. things like
  refs/stash were designed to be allowed.

 Now I'm more confused.  Until 5f7b202a (2008-01-01), there was a
 comment

   if (level < 2)
   return -2; /* at least of form "heads/blah" */

 and that behavior has been preserved since the beginning.

 Why do most old callers pass a string that doesn't start with refs/
 (e.g., see the callers in 03feddd6, 2005-10-13)?  Has the intent been
 to relax the requirement since then?

 Yeah, this weird "do not allow refs/foo" behavior has continually
 confused me. Coincidentally I just noticed a case today where
 pack-refs treats refs/foo specially for no good reason:

   http://thread.gmane.org/gmane.comp.version-control.git/255729

 After much head scratching over the years, I am of the opinion that
 nobody ever really _meant_ to prevent refs/foo, and that code
 comments like the one you quote above were an attempt to document
 existing buggy behavior that was really trying to differentiate HEAD
 from refs/*. That's just my opinion, though. :) I'd be happy if all of
 the special-treatment of refs/foo went away and check_refname_format
 always got the full ref.


There are also a lot of places where we assume that a refs will start
with refs/heads/ and not just refs/
for_each_branch_ref(), log_ref_setup() (so no reflogs) is_branch() to
name a few.

This makes the api a bit confusing and hard to predict. Which
functions allow refs/foo and which will ignore it?
Are there any compelling reasons why refs/foo should be allowed?


Re: [PATCH v2 3/3] t0026: Add missing &&

2014-08-25 Thread Junio C Hamano
Torsten Bögershausen tbo...@web.de writes:

 Fix the broken && chain

 Reported-By: Eric Sunshine sunsh...@sunshineco.com
 Signed-off-by: Torsten Bögershausen tbo...@web.de
 ---

Please fold this kind of "oops" fix for breakages discovered in a
version that hasn't reached 'next' into the patch that
introduces the breakage, with a Helped-by: crediting whoever spotted it.

Thanks.

  t/t0026-eol-config.sh | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

 diff --git a/t/t0026-eol-config.sh b/t/t0026-eol-config.sh
 index 43a580a..4806969 100755
 --- a/t/t0026-eol-config.sh
 +++ b/t/t0026-eol-config.sh
 @@ -84,9 +84,9 @@ test_expect_success NATIVE_CRLF 'eol native is crlf' '
  
  rm -rf native_eol && mkdir native_eol &&
  ( cd native_eol &&
 - printf "*.txt text\n" > .gitattributes
 - printf "one\r\ntwo\r\nthree\r\n" > filedos.txt
 - printf "one\ntwo\nthree\n" > fileunix.txt
 + printf "*.txt text\n" > .gitattributes &&
 + printf "one\r\ntwo\r\nthree\r\n" > filedos.txt &&
 + printf "one\ntwo\nthree\n" > fileunix.txt &&
  git init &&
  git config core.autocrlf false &&
  git config core.eol native &&


Re: [PATCH 18/18] signed push: final protocol update

2014-08-25 Thread Junio C Hamano
Shawn Pearce spea...@spearce.org writes:

 A stateless nonce could look like:

   nonce = HMAC_SHA1( SHA1(site+path) + '.' + now, site_key )

 where site_key is a private key known to the server. It doesn't have
 to be per-repo.

 receive-pack would then be willing to accept any nonce whose timestamp
 is within a window, e.g. 10 minutes of the current time, and whose
 signature verifies in the HMAC. The 10 minute window is important to
 allow clients time to generate the object list, perform delta
 compression, and begin transmitting to the server.

Hmph, don't you send the final "update this ref from old to new"
sequence and the packdata separately?  I
think we have a FLUSH in between, and the push certificate is given
before the FLUSH, so you do not have to wait for 10 minutes.
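
To make the quoted scheme concrete, a sketch of the nonce generation
could look like the following; the helper name, the exact encoding and
the fixed-size buffers are assumptions for illustration, not code from
receive-pack:

    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <openssl/evp.h>
    #include <openssl/hmac.h>

    /*
     * Sketch: nonce = HEX(HMAC_SHA1(identity "." timestamp, site_key)) "." timestamp
     * The receiver re-derives the MAC for the embedded timestamp and accepts
     * the nonce only if the MAC matches and the timestamp is within the
     * agreed window (e.g. 10 minutes).
     */
    static void make_nonce(char *out, size_t outlen,
                           const char *identity, const char *site_key)
    {
            unsigned char mac[EVP_MAX_MD_SIZE];
            unsigned int maclen = 0, i;
            char payload[256];
            unsigned long now = (unsigned long)time(NULL);

            snprintf(payload, sizeof(payload), "%s.%lu", identity, now);
            HMAC(EVP_sha1(), site_key, (int)strlen(site_key),
                 (const unsigned char *)payload, strlen(payload), mac, &maclen);
            for (i = 0; i < maclen && 2 * i + 2 < outlen; i++)
                    sprintf(out + 2 * i, "%02x", mac[i]);
            snprintf(out + 2 * i, outlen - 2 * i, ".%lu", now);
    }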



Re: [PATCH 00/18] Signed push

2014-08-25 Thread Junio C Hamano
Stefan Beller stefanbel...@gmail.com writes:

 burden is not an issue, as I'll be signing the push certificate
 anyway when I push.  A signed tag or a signed commit and signed push
 certificate solves two completely separate and orthogonal issues.
 
 What happens if you break into GitHub or k.org and did
 
 $ git tag maint_2014_08_22 master_2014_08_22

 Ok, I personally haven't used tags a lot.
 I just tried to
   git tag -s testbreaktag v2.1.0
   git show testbreaktag
   # However it would still read:
 tag v2.1.0
 Tagger: Junio C Hamano gits...@pobox.com
 Date:   Fri Aug 15 15:09:28 2014 -0700

 So as I do not possess your private key I could not create signed tags
 even if I were to break into github/k.org

The point was that after I push to 'maint', you break into the site
and copy or move that tag as if I pushed to 'master'.

You could argue that I could create a signed tag 'maint-2014-08-25',
push it, and if you moved it to tags/master-2014-08-25 as an
attacker, the refname would not match the "tag " line in the signed
tag object.  While that is true, nobody other than fsck checks the
contents on the "tag " line in practice.

But more importantly.

I may deem a commit a sensible version for the 'master' branch of
one repository but it would not be sensible for another repository's
'master' branch.  Imagine a world just like the kernel development
during 2.6 era used to be, where there was a separate tree 2.4
maintained with its own 'master' branch.  What is appropriate for
the tip of 'master' to one repository is not good for the other one,
and your timestamped "tag " line may say for which branch the push
was for but does not say for which repository.  The exact problem is
also shared with the desire to have a branch object expressed
elsewhere; as there is no identity for a branch in a distributed
world, trying to name branch as if it is a global entity without
mentioning what repository will lead to tears.

Besides, these tags/maint-2014-08-25 tags will be interesting only
for those who are auditing and not for general public, and we do not
have a good way to hide uninteresting refs until those with narrow
niche interest ask yet, which is something we may want to add soon,
but I do not want auditable push taken hostage to that unrelated
feature.


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-25 Thread Jonathan Nieder
Ronnie Sahlberg wrote:
 On Fri, Aug 22, 2014 at 10:46 PM, Jeff King p...@peff.net wrote:

 Yeah, this weird "do not allow refs/foo" behavior has continually
 confused me. Coincidentally I just noticed a case today where
 pack-refs treats refs/foo specially for no good reason:

   http://thread.gmane.org/gmane.comp.version-control.git/255729

 After much head scratching over the years, I am of the opinion that
 nobody ever really _meant_ to prevent refs/foo, and that code
 comments like the one you quote above were an attempt to document
 existing buggy behavior that was really trying to differentiate HEAD
 from refs/*. That's just my opinion, though. :)

It's still very puzzling to me.  The comment came at the same time as
the behavior, in v0.99.9~120 (git-check-ref-format: reject funny ref
names, 2005-10-13).  Before that, the behavior was even stranger ---
it checked that there was exactly one slash in the argument.

I'm willing to believe we might not want that check any more, though.

[...]
 There are also a lot of places where we assume that a refs will start
 with refs/heads/ and not just refs/
 for_each_branch_ref(), log_ref_setup() (so no reflogs) is_branch() to
 name a few.

for_each_branch_ref is for iterating over local branches, which are
defined as refs that start with refs/heads/*.  Likewise, the only
point of is_branch is to check whether a ref is under refs/heads/*.
That's not an assumption about all refs.

log_ref_setup implements the policy that there are only reflogs for:

 * refs where the reflog was explicitly created (git branch
   --create-reflog does this, but for some reason there's no
   corresponding git update-ref --create-reflog so people have
   to use mkdir directly for other refs), plus

 * if the '[core] logallrefupdates' configuration is enabled (and it
   is by default for non-bare repositories), then HEAD, refs/heads/*,
   refs/notes/*, and refs/remotes/*.

This is documented in git-config(1) --- see core.logAllRefUpdates.

That way, when tools internally use other refs (e.g., FETCH_HEAD),
git doesn't have to automatically incur the cost of maintaining the
reflog for those.  What other refs should there be reflogs for?  I
haven't thought carefully about this.

It definitely isn't an assumption that *all* refs will match that
pattern.  But it might be worth changing for other reasons.
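
The default policy described above boils down to roughly the following
sketch (the helper name is invented; the real decision lives in
log_ref_setup(), which additionally honours logs that were created by
hand):

    static int want_auto_reflog(const char *refname)
    {
            if (!log_all_ref_updates)
                    return 0;       /* core.logAllRefUpdates is off */
            return starts_with(refname, "refs/heads/") ||
                   starts_with(refname, "refs/remotes/") ||
                   starts_with(refname, "refs/notes/") ||
                   !strcmp(refname, "HEAD");
    }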

Jonathan


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Mon, Aug 25, 2014 at 04:06:52PM +0200, Christian Couder wrote:

  This allocation should be name_len + 1 for the NUL-terminator, no?
 
 I wondered about that too, but as struct name_decoration is defined like 
 this:
 
 struct name_decoration {
 struct name_decoration *next;
 int type;
 char name[1];
 };
 
 the .name field of this struct already has one char, so the allocation
 above should be ok.

 Yeah, you're right. I would argue it should just be FLEX_ARRAY for
 consistency with other spots, though (in which case add_name_decoration
 needs to be updated with a +1).

 Running git grep '^  char [^ ]*\[[01]]' -- '*.[ch]' shows that this
 is one of only two spots that don't use FLEX_ARRAY (and the other has a
 comment explaining why not).

Good digging, and I agree that it should use the FLEX_ARRAY for
consistency.
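
For reference, the FLEX_ARRAY form being suggested would look roughly
like this; the helper name is invented, and the real add_name_decoration()
needs the same "+ 1" for the trailing NUL:

    struct name_decoration {
            struct name_decoration *next;
            int type;
            char name[FLEX_ARRAY];
    };

    static struct name_decoration *alloc_decoration(const char *name, int type)
    {
            size_t len = strlen(name);
            /* FLEX_ARRAY contributes no size, so allocate len + 1 for the NUL */
            struct name_decoration *r = xmalloc(sizeof(*r) + len + 1);

            memcpy(r->name, name, len + 1);
            r->type = type;
            r->next = NULL;
            return r;
    }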



Re: [PATCH v5 4/4] convert: Stream from fd to required clean filter instead of mmap

2014-08-25 Thread Junio C Hamano
Steffen Prohaska proha...@zib.de writes:

 Couldn't we do that with an lseek (or even an mmap with offset 0)? That
 obviously would not work for non-file inputs, but I think we address
 that already in index_fd: we push non-seekable things off to index_pipe,
 where we spool them to memory.

 It could be handled that way, but we would be back to the original problem
 that 32-bit git fails for large files.

Correct, and you are making an incremental improvement so that such
a large blob can be handled _when_ the filters can successfully
munge it back and forth.  If we fail due to out of memory when the
filters cannot, that would be the same as without your improvement,
so you are still making progress.

 To implement something like the ideal strategy below, the entire convert 
 machinery for crlf and ident would have to be converted to a streaming
 approach.

Yes, that has always been the longer term vision since the day the
streaming infrastructure was introduced.

 So it seems like the ideal strategy would be:
 
  1. If it's seekable, try streaming. If not, fall back to lseek/mmap.
 
  2. If it's not seekable and the filter is required, try streaming. We
 die anyway if we fail.

Puzzled...  Is it assumed that any content the filters tell us to
"use the contents from the db as-is" (by exiting with a non-zero status)
will always be too large to fit in-core?  For small contents, isn't
this "ideal strategy" a regression?

  3. If it's not seekable and the filter is not required, decide based
 on file size:
 
   a. If it's small, spool to memory and proceed as we do now.
 
   b. If it's big, spool to a seekable tempfile.
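
Spelled out in code, that decision tree might look roughly like the
sketch below; every helper named here (stream_through_filter,
spool_to_memory, spool_to_tempfile) is hypothetical and only stands in
for the real convert/index_fd machinery:

    #include <sys/types.h>
    #include <unistd.h>

    /* hypothetical helpers standing in for the real machinery */
    static int stream_through_filter(int fd);
    static int spool_to_memory(int fd);
    static int spool_to_tempfile(int fd);

    static int filter_input(int fd, off_t size, int filter_required,
                            off_t big_file_threshold)
    {
            if (lseek(fd, 0, SEEK_CUR) != -1)
                    return stream_through_filter(fd); /* 1. seekable: can retry via lseek/mmap */
            if (filter_required)
                    return stream_through_filter(fd); /* 2. we die anyway if it fails */
            if (size <= big_file_threshold)
                    return spool_to_memory(fd);       /* 3a. small: spool to memory */
            return spool_to_tempfile(fd);             /* 3b. big: spool to a seekable tempfile */
    }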


RE: [PATCH 00/18] Signed push

2014-08-25 Thread Jason Pyeron
 -Original Message-
 From: Junio C Hamano
 Sent: Monday, August 25, 2014 13:55
 
 Stefan Beller stefanbel...@gmail.com writes:
 
  burden is not an issue, as I'll be signing the push certificate
  anyway when I push.  A signed tag or a signed commit and 
 signed push
  certificate solves two completely separate and orthogonal issues.
  
  What happens if you break into GitHub or k.org and did
  
  $ git tag maint_2014_08_22 master_2014_08_22
 
  Ok, I personally haven't used tags a lot.
  I just tried to
  git tag -s testbreaktag v2.1.0
  git show testbreaktag
  # However it would still read:
  tag v2.1.0
  Tagger: Junio C Hamano gits...@pobox.com
  Date:   Fri Aug 15 15:09:28 2014 -0700
 
  So as I do not possess your private key I could not create 
 signed tags
  even if I were to break into github/k.org
 
 The point was that after I push to 'maint', you break into the site
 and copy or move that tag as if I pushed to 'master'.

What is needed is not a signed push per se, but rather a need for a set of 
signed HEADS ...

 
 You could argue that I could create a signed tag 'maint-2014-08-25',
 push it, and if you moved it to tags/master-2014-08-25 as an
 attacker, the refname would not match the "tag " line in the signed
 tag object.  While that is true, nobody other than fsck checks the
 contents on the "tag " line in practice.
 
 But more importantly.
 
 I may deem a commit a sensible version for the 'master' branch of
 one repository but it would not be sensible for another repository's
 'master' branch.  Imagine a world just like the kernel development
 during 2.6 era used to be, where there was a separate tree 2.4
 maintained with its own 'master' branch.  What is appropriate for
 the tip of 'master' to one repository is not good for the other one,

... and these signed HEADS need to be tied to a particular repository instance. 
AFAIK git does not have any unique identifier per repository instance to 
leverage. If you were to make a repository instance id you could take that and 
the branch name as input to a signed hash for verification later. But this 
leads to deeper issues about new workflow, new configuration storage 
mechanisms, etc.
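
As a toy illustration of that idea (the instance id, the message layout
and the later signing step are all assumptions, not anything git
provides today):

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    /*
     * Hash "<instance-id> <refname> <tip>" so that a signature over the
     * result is bound to one repository instance; the digest would then
     * be GPG-signed and published for later verification.
     */
    static void bind_ref_to_instance(const char *instance_id,
                                     const char *refname, const char *tip,
                                     unsigned char out[SHA_DIGEST_LENGTH])
    {
            char buf[1024];

            snprintf(buf, sizeof(buf), "%s %s %s", instance_id, refname, tip);
            SHA1((const unsigned char *)buf, strlen(buf), out);
    }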

 and your timestamped "tag " line may say for which branch the push
 was for but does not say for which repository.  The exact problem is
 also shared with the desire to have a branch object expressed
 elsewhere; as there is no identity for a branch in a distributed
 world, trying to name branch as if it is a global entity without
 mentioning what repository will lead to tears.
 
 Besides, these tags/maint-2014-08-25 tags will be interesting only
 for those who are auditing and not for general public, and we do not
 have a good way to hide uninteresting refs until those with narrow
 niche interest ask yet, which is something we may want to add soon,
 but I do not want auditable push taken hostage to that unrelated
 feature.


Re: [PATCH 3/5] fast-import: clean up pack_data pointer in end_packfile

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 10:16:52AM -0700, Ronnie Sahlberg wrote:

 Print an error before returning when pack_data is NULL ?

I don't think so. We call end_packfile in some code paths (like the die
handler) as "close it if it's open". So I think it makes sense for it to be
a noop if nothing is open.

-Peff


Re: [PATCH 0/5] fix pack-refs pruning of refs/foo

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 10:38:56AM -0700, Ronnie Sahlberg wrote:

[1/5]: git-prompt: do not look for refs/stash in $GIT_DIR
[2/5]: pack-refs: prune top-level refs like refs/foo
[3/5]: fast-import: clean up pack_data pointer in end_packfile
[4/5]: fast-import: fix buffer overflow in dump_tags
[5/5]: fast-import: stop using lock_ref_sha1
 
 
 +1 on 1-3
 +1 on 4. While I have a similar fix in the transaction series, you
 should not need to wait for that series to address a security concern.
 5: I think this one is not as urgent as the others so would prefer if
 it is dropped, just so it doesn't cause more merge conflicts than is
 already present in the transaction series.

OK, I think we're in agreement then. Thanks for looking them over.

-Peff


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-25 Thread Junio C Hamano
Ronnie Sahlberg sahlb...@google.com writes:

 There are also a lot of places where we assume that a refs will start
 with refs/heads/ and not just refs/
 for_each_branch_ref(), log_ref_setup() (so no reflogs) is_branch() to
 name a few.

for-each-BRANCH-ref and is-BRANCH are explicitly about branches and
it is natural that they will work only on refs that start with
refs/heads/, no?  I am not sure about log-ref-setup (git grep pu
does not find it).

 This makes the api a bit confusing and hard to predict. Which
 functions allow refs/foo and which will ignore it?  Are there any
 compelling reasons why refs/foo should be allowed?

The only one I can think of offhand that we internally use is the
stash, but that does not give us any assurance that no third-party
tool is using refs/frotz for its own use X-<.


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 11:26:36AM -0700, Jonathan Nieder wrote:

 It's still very puzzling to me.  The comment came at the same time as
 the behavior, in v0.99.9~120 (git-check-ref-format: reject funny ref
 names, 2005-10-13).  Before that, the behavior was even stranger ---
 it checked that there was exactly one slash in the argument.
 
 I'm willing to believe we might not want that check any more, though.

Yeah, given that among experienced gits, nobody can figure out a
motivation or a history for the feature (and given that it causes
problems), I do not see any problem with loosening it.

 That way, when tools internally use other refs (e.g., FETCH_HEAD),
 git doesn't have to automatically incur the cost of maintaining the
 reflog for those.  What other refs should there be reflogs for?  I
 haven't thought carefully about this.

I think you'd in theory want a reflog for anything. The refs in
refs/tags are not meant to change, but if they do (e.g., somebody
force-pushes a tag to a server) it is nice to have a log of what
happened.  I think the same argument could apply to anything in refs/.

I think more ephemeral things (like MERGE_HEAD) tend to be in the root,
outside of refs. Reflogging those _could_ be useful, but probably isn't
(and in the case of something like FETCH_HEAD, would not record all of
the information anyway).

I wrote the patch below over a year ago and very nearly submitted it. At
GitHub we use reflogs frequently for auditing and forensics, and I
wanted to have such an audit trail for everything. However, we ended up
doing something a little more invasive that I do not think would be that
interesting to upstream (though I am happy to submit a patch if people
think otherwise): we maintain a separate audit log for all refs that
is never pruned, and lives on even when refs are deleted.

-- >8 --
Subject: [PATCH] teach core.logallrefupdates an always mode

When core.logallrefupdates is true, we only create a new
reflog for refs that are under certain well-known
hierarchies. The reason is that we know that some
hierarchies (like refs/tags) do not typically change, and
that unknown hierarchies might not want reflogs at all
(e.g., a hypothetical refs/foo might be meant to change
often and drop old history immediately).

However, sometimes it is useful to override this decision
and simply log for all refs, because the safety and audit
trail is more important than the performance implications of
keeping the log around.

This patch introduces a new "always" mode for the
core.logallrefupdates option which will log updates to
everything under refs/, regardless where in the hierarchy it
is (we still will not log things like ORIG_HEAD and
FETCH_HEAD, which are known to be transient).

Signed-off-by: Jeff King p...@peff.net
---
Looking over the code, I am not sure that it actually works as
advertised with respect to ORIG_HEAD, etc. That would be easy enough to
fix, though.

 Documentation/config.txt |  8 +---
 branch.c |  2 +-
 builtin/checkout.c   |  2 +-
 builtin/init-db.c|  2 +-
 cache.h  |  9 -
 config.c |  7 ++-
 environment.c|  2 +-
 refs.c   | 23 +--
 t/t1400-update-ref.sh| 31 +++
 9 files changed, 71 insertions(+), 15 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c55c22a..27629df 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -405,10 +405,12 @@ core.logAllRefUpdates::
$GIT_DIR/logs/ref, by appending the new and old
SHA-1, the date/time and the reason of the update, but
only when the file exists.  If this configuration
-   variable is set to true, missing $GIT_DIR/logs/ref
+   variable is set to `true`, a missing $GIT_DIR/logs/ref
file is automatically created for branch heads (i.e. under
-   refs/heads/), remote refs (i.e. under refs/remotes/),
-   note refs (i.e. under refs/notes/), and the symbolic ref HEAD.
+   `refs/heads/`), remote refs (i.e. under `refs/remotes/`),
+   note refs (i.e. under `refs/notes/`), and the symbolic ref `HEAD`.
+   If it is set to `always`, then a missing reflog is automatically
+   created for any ref under `refs/`.
 +
 This information can be used to determine what commit
 was the tip of a branch 2 days ago.
diff --git a/branch.c b/branch.c
index 46e8aa8..1d140b7 100644
--- a/branch.c
+++ b/branch.c
@@ -292,7 +292,7 @@ void create_branch(const char *head,
}
 
if (reflog)
-   log_all_ref_updates = 1;
+   log_all_ref_updates = LOG_REFS_NORMAL;
 
if (forcing)
snprintf(msg, sizeof msg, branch: Reset to %s,
diff --git a/builtin/checkout.c b/builtin/checkout.c
index f71e745..65bc066 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -586,7 +586,7 @@ static void update_refs_for_switch(const struct 
checkout_opts *opts,

Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Since we do not use the value $(LIB_H) unless either
 COMPUTE_HEADER_DEPENDENCIES is turned on or the user is
 building po/git.pot (where it comes in via $(LOCALIZED_C)),
 make is smart enough to not even run this find in most
 cases. However, we do need to stop using the immediate
 variable assignment := for $(LOCALIZED_C). That's OK,
 because it was not otherwise useful here.

 Signed-off-by: Jeff King p...@peff.net
 ---
 I cannot see any reason for the :=, but maybe I am missing something.

If the right-hand-side were something like $(shell find ...) that
was heavy-weight then it might have made sense, but I do not think
it is that.  It has stayed to be := ever since it was introduced by
cd5513a7 (i18n: Makefile: pot target to extract messages marked
for translation, 2011-02-22).

And now you use LIB_H only once ;-).

Also interestingly, I notice that it is very clear that it is not
LIB_H but ANY_H ;-)


  Makefile | 140 
 ---
  1 file changed, 8 insertions(+), 132 deletions(-)

 diff --git a/Makefile b/Makefile
 index cf0ccdf..f2b85c9 100644
 --- a/Makefile
 +++ b/Makefile
 ...
 @@ -2128,9 +2004,9 @@ XGETTEXT_FLAGS_C = $(XGETTEXT_FLAGS) --language=C \
  XGETTEXT_FLAGS_SH = $(XGETTEXT_FLAGS) --language=Shell \
   --keyword=gettextln --keyword=eval_gettextln
  XGETTEXT_FLAGS_PERL = $(XGETTEXT_FLAGS) --keyword=__ --language=Perl
 -LOCALIZED_C := $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
 -LOCALIZED_SH := $(SCRIPT_SH)
 -LOCALIZED_PERL := $(SCRIPT_PERL)
 +LOCALIZED_C = $(C_OBJ:o=c) $(GENERATED_H)
 +LOCALIZED_SH = $(SCRIPT_SH)
 +LOCALIZED_PERL = $(SCRIPT_PERL)
  
  ifdef XGETTEXT_INCLUDE_TESTS
  LOCALIZED_C += t/t0200/test.c


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 12:30:51PM -0700, Junio C Hamano wrote:

 Also interestingly, I notice that it is very clear that it is not
 LIB_H but ANY_H ;-)

Yeah, it has been that way for quite a while. I don't know if it is that
big a deal, but it would not be unreasonable to do a patch to rename on
top (I am not sure what the new one would be; ANY_H is probably OK).

-Peff


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote:

 Good digging, and I agree that it should use the FLEX_ARRAY for
 consistency.

I can produce a patch, but I did not want to steal Arjun's thunder.

Arjun, did my proposal make sense? Do you want to try implementing that?

-Peff


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 10:21:58AM -0700, Junio C Hamano wrote:

  +   if (ce_stage(ce) >= ce_stage(next_ce))
  +   die(Unordered stage entries for '%s',
  +   ce-name);
 
 Not quite.  We do allow multiple higher stage entries; having two or
 more stage #1 entries is perfectly fine during a merge resolution,
 and both ce and next_ce may be pointing at the stage #1 entries of
 the same path.  Replacing the comparison with > is sufficient, I
 think.

For my own curiosity, how do you get into this situation, and what does
it mean to have multiple stage#1 entries for the same path? What would
git cat-file :1:path output?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Jonathan Nieder
Jeff King wrote:

 -LOCALIZED_C := $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
 +LOCALIZED_C = $(C_OBJ:o=c) $(GENERATED_H)

Why is LIB_H dropped here?  This would mean that po/git.pot stops
including strings from macros and static inline functions in headers
(e.g., in parse-options.h).

The rest looks good.

Thanks,
Jonathan


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 12:46:41PM -0700, Jonathan Nieder wrote:

 Jeff King wrote:
 
  -LOCALIZED_C := $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
  +LOCALIZED_C = $(C_OBJ:o=c) $(GENERATED_H)
 
 Why is LIB_H dropped here?  This would mean that po/git.pot stops
 including strings from macros and static inline functions in headers
 (e.g., in parse-options.h).

Ick, this is an accidental leftover from the earlier iteration of the
patch, which moved that part to a separate line (inside my gross
GIT_REAL_POT conditional). The extra line went away, but I forgot to add
$(LIB_H) back in here. Thanks for noticing.

Here's a revised version of the patch which fixes it (and I
double-checked to make sure it continues to not execute the find unless
you are doing a make pot).

-- >8 --
Subject: Makefile: use `find` to determine static header dependencies

Most modern platforms will use automatically computed header
dependencies to figure out when a C file needs to be rebuilt due
to a header changing. With old compilers, however, we
fallback to a static list of header files. If any of them
changes, we recompile everything. This is overly
conservative, but the best we can do on older platforms.

It is unfortunately easy for our static header list to grow
stale, as none of the regular developers make use of it.
Instead of trying to keep it up to date, let's invoke find
to generate the list dynamically.

Since we do not use the value $(LIB_H) unless either
COMPUTE_HEADER_DEPENDENCIES is turned on or the user is
building po/git.pot (where it comes in via $(LOCALIZED_C)),
make is smart enough to not even run this find in most
cases. However, we do need to stop using the immediate
variable assignment := for $(LOCALIZED_C). That's OK,
because it was not otherwise useful here.

Signed-off-by: Jeff King p...@peff.net
---
 Makefile | 140 ---
 1 file changed, 8 insertions(+), 132 deletions(-)

diff --git a/Makefile b/Makefile
index cf0ccdf..a4fc440 100644
--- a/Makefile
+++ b/Makefile
@@ -432,7 +432,6 @@ XDIFF_OBJS =
 VCSSVN_OBJS =
 GENERATED_H =
 EXTRA_CPPFLAGS =
-LIB_H =
 LIB_OBJS =
 PROGRAM_OBJS =
 PROGRAMS =
@@ -631,131 +630,11 @@ VCSSVN_LIB = vcs-svn/lib.a
 
 GENERATED_H += common-cmds.h
 
-LIB_H += advice.h
-LIB_H += archive.h
-LIB_H += argv-array.h
-LIB_H += attr.h
-LIB_H += bisect.h
-LIB_H += blob.h
-LIB_H += branch.h
-LIB_H += builtin.h
-LIB_H += bulk-checkin.h
-LIB_H += bundle.h
-LIB_H += cache-tree.h
-LIB_H += cache.h
-LIB_H += color.h
-LIB_H += column.h
-LIB_H += commit.h
-LIB_H += compat/bswap.h
-LIB_H += compat/mingw.h
-LIB_H += compat/obstack.h
-LIB_H += compat/poll/poll.h
-LIB_H += compat/precompose_utf8.h
-LIB_H += compat/terminal.h
-LIB_H += compat/win32/dirent.h
-LIB_H += compat/win32/pthread.h
-LIB_H += compat/win32/syslog.h
-LIB_H += connected.h
-LIB_H += convert.h
-LIB_H += credential.h
-LIB_H += csum-file.h
-LIB_H += decorate.h
-LIB_H += delta.h
-LIB_H += diff.h
-LIB_H += diffcore.h
-LIB_H += dir.h
-LIB_H += exec_cmd.h
-LIB_H += ewah/ewok.h
-LIB_H += ewah/ewok_rlw.h
-LIB_H += fetch-pack.h
-LIB_H += fmt-merge-msg.h
-LIB_H += fsck.h
-LIB_H += gettext.h
-LIB_H += git-compat-util.h
-LIB_H += gpg-interface.h
-LIB_H += graph.h
-LIB_H += grep.h
-LIB_H += hashmap.h
-LIB_H += help.h
-LIB_H += http.h
-LIB_H += kwset.h
-LIB_H += levenshtein.h
-LIB_H += line-log.h
-LIB_H += line-range.h
-LIB_H += list-objects.h
-LIB_H += ll-merge.h
-LIB_H += log-tree.h
-LIB_H += mailmap.h
-LIB_H += merge-blobs.h
-LIB_H += merge-recursive.h
-LIB_H += mergesort.h
-LIB_H += notes-cache.h
-LIB_H += notes-merge.h
-LIB_H += notes-utils.h
-LIB_H += notes.h
-LIB_H += object.h
-LIB_H += pack-objects.h
-LIB_H += pack-revindex.h
-LIB_H += pack.h
-LIB_H += pack-bitmap.h
-LIB_H += parse-options.h
-LIB_H += patch-ids.h
-LIB_H += pathspec.h
-LIB_H += pkt-line.h
-LIB_H += prio-queue.h
-LIB_H += progress.h
-LIB_H += prompt.h
-LIB_H += quote.h
-LIB_H += reachable.h
-LIB_H += reflog-walk.h
-LIB_H += refs.h
-LIB_H += remote.h
-LIB_H += rerere.h
-LIB_H += resolve-undo.h
-LIB_H += revision.h
-LIB_H += run-command.h
-LIB_H += send-pack.h
-LIB_H += sequencer.h
-LIB_H += sha1-array.h
-LIB_H += sha1-lookup.h
-LIB_H += shortlog.h
-LIB_H += sideband.h
-LIB_H += sigchain.h
-LIB_H += strbuf.h
-LIB_H += streaming.h
-LIB_H += string-list.h
-LIB_H += submodule.h
-LIB_H += tag.h
-LIB_H += tar.h
-LIB_H += thread-utils.h
-LIB_H += transport.h
-LIB_H += tree-walk.h
-LIB_H += tree.h
-LIB_H += unpack-trees.h
-LIB_H += unicode_width.h
-LIB_H += url.h
-LIB_H += urlmatch.h
-LIB_H += userdiff.h
-LIB_H += utf8.h
-LIB_H += varint.h
-LIB_H += vcs-svn/fast_export.h
-LIB_H += vcs-svn/line_buffer.h
-LIB_H += vcs-svn/repo_tree.h
-LIB_H += vcs-svn/sliding_window.h
-LIB_H += vcs-svn/svndiff.h
-LIB_H += vcs-svn/svndump.h
-LIB_H += walker.h
-LIB_H += wildmatch.h
-LIB_H += wt-status.h
-LIB_H += xdiff-interface.h
-LIB_H += xdiff/xdiff.h
-LIB_H += xdiff/xdiffi.h
-LIB_H += xdiff/xemit.h
-LIB_H += xdiff/xinclude.h
-LIB_H += xdiff/xmacros.h
-LIB_H += 

Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 04:00:42PM -0400, Jeff King wrote:

 On Mon, Aug 25, 2014 at 12:46:41PM -0700, Jonathan Nieder wrote:
 
  Jeff King wrote:
  
   -LOCALIZED_C := $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
   +LOCALIZED_C = $(C_OBJ:o=c) $(GENERATED_H)
  
  Why is LIB_H dropped here?  This would mean that po/git.pot stops
  including strings from macros and static inline functions in headers
  (e.g., in parse-options.h).
 
 Ick, this is an accidental leftover from the earlier iteration of the
 patch, which moved that part to a separate line (inside my gross
 GIT_REAL_POT conditional). The extra line went away, but I forgot to add
 $(LIB_H) back in here. Thanks for noticing.

As an aside, this makes an interesting case study for our git.git
workflow.

In some workflows, I would have made the original unacceptable patch,
you would have reviewed it, and then I would have made the followup
patch to adjust it, and you would have reviewed that. But it's quite
hard to see my mistake in just the followup patch; the fact that
$(LIB_H) was originally part of $(LOCALIZED_C) does not appear in that
hunk at all.

But in our workflow, we squash out the unacceptable diff, and you review
from scratch the movement from the original working state (assuming the
status quo was working :) ) to the newly proposed state. And there the
mistake is quite a bit more obvious.

Just an interesting observation.

-Peff


Re: [PATCH/RFC] git-imap-send: use libcurl for implementation

2014-08-25 Thread Bernhard Reiter
On 2014-08-19 at 19:51, Junio C Hamano wrote:
 This looks strange and stands out like a sore thumb.  Do any of our
 other sources do this kind of macro tweaking inside C source before
 including git-compat-util.h (or its equivalent like cache.h)?

I haven't checked, but I agree that it's desirable to avoid.

 I think what you are trying to do is to change the meaning of
 NO_OPENSSL, which merely means "we do not have OpenSSL library and
 do not want to link with it", locally to "we may or may not have and
 use OpenSSL library elsewhere in Git, but in the code below we do
 not want to use the code written to work on top of OpenSSL and
 instead use libcurl."

...and we don't want to link to OpenSSL in that case. Yeah.

 Because of that, you define NO_OPENSSL before
 including any of our headers to cause us not to include the headers,
 and typedef away SSL, for example.

The SSL un-typedef'ing was there before, but it's true that I'm defining
NO_OPENSSL on the very top so the included headers don't require OpenSSL
(and so we don't have to link to it later).

  #include "cache.h"
  #include "credential.h"
  #include "exec_cmd.h"
 @@ -29,6 +33,9 @@
  #ifdef NO_OPENSSL
  typedef void *SSL;
  #endif
 +#ifdef USE_CURL_FOR_IMAP_SEND
 +#include "http.h"
 +#endif
 
 But does it have to be that way?  For one thing, doing it this way,
 the user has to make a compilation-time choice, but if you separate
 these compilation time macros into two, one for "can we even link
 with and use OpenSSL?" (which is what NO_OPENSSL is about) and
 another for "do we want an ability to talk to imap via libcurl?",
 wouldn't it make it possible for you to switch between them at
 runtime (e.g. you might want to go over the direct connection when
 tunneling, while letting libcurl do the heavy lifting in
 non-tunneled operation)?

Yeah, but I still need to wrap the non-tunneled operation in #ifdefs in
case we don't USE_CURL_FOR_IMAP_SEND in which case we fall back to the
handwritten IMAP code, don't I?

Maybe I'm not getting you entirely right, but if I don't typedef
NO_OPENSSL if USE_CURL_FOR_IMAP_SEND is defined, I don't see any way to
not link to OpenSSL, even if it's not required. I'm including a slightly
modified patch which does that (hoping that I've finally managed to send
a usable patch). Sorry it's nothing breathtakingly better than before,
even after giving this some thought I didn't arrive at a very elegant
new solution... (see below for more on that)

 @@ -1417,18 +1476,48 @@ int main(int argc, char **argv)
  }
  
  /* write it to the imap server */
 +
 +#ifdef USE_CURL_FOR_IMAP_SEND
 +if (!server.tunnel) {
 +curl = setup_curl(server);
 +curl_easy_setopt(curl, CURLOPT_READDATA, msgbuf);
 +} else {
 +#endif
  ctx = imap_open_store(&server);
  if (!ctx) {
  fprintf(stderr, "failed to open store\n");
  return 1;
  }
 +ctx->name = server.folder;
 +#ifdef USE_CURL_FOR_IMAP_SEND
 +}
 +#endif
  
  fprintf(stderr, "sending %d message%s\n", total, (total != 1) ? "s" : "");
 -ctx->name = imap_folder;
  while (1) {
  unsigned percent = n * 100 / total;
  
  fprintf(stderr, "%4u%% (%d/%d) done\r", percent, n, total);
 +#ifdef USE_CURL_FOR_IMAP_SEND
 +if (!server.tunnel) {
 ...
 +}
 +} else {
 +#endif
  if (!split_msg(&all_msgs, &msg, &ofs))
  break;
  if (server.use_html)
 @@ -1436,11 +1525,19 @@ int main(int argc, char **argv)
  r = imap_store_msg(ctx, &msg);
  if (r != DRV_OK)
  break;
 +#ifdef USE_CURL_FOR_IMAP_SEND
 +}
 +#endif
  n++;
  }
  fprintf(stderr, \n);
  
 +#ifdef USE_CURL_FOR_IMAP_SEND
 +curl_easy_cleanup(curl);
 +curl_global_cleanup();
 +#else
  imap_close_store(ctx);
 +#endif
  
  return 0;
  }
 
 Ugly.  Can we do this with less #ifdef/#else/#endif in the primary
 code path?

It is ugly, but as much as I'd love to put e.g.

+#ifdef USE_CURL_FOR_IMAP_SEND
+   if (!server.tunnel) {
+   curl = setup_curl(server);

etc into imap_open_store (and similarly for imap_store_msg etc), I don't
see any easy way to do it; imap_open_store's return type would still be
struct imap_store* in the non-tunneling case, and CURL* otherwise.

So the best I can come up with here is merging some of the #ifdef
blocks, but that means duplicating the code that applies in both cases.
But that isn't any better, is it?

 If we were to keep these two modes as a choice the users have to
 make at the compilation time, that is.

As stated above, I'm not sure how to do entirely without at least those
two compile time switches (NO_OPENSSL and USE_CURL_FOR_IMAP_SEND).

Sorry, perhaps I missed something obvious; grateful for any hints on how
to do it better.

Bernhard

 Documentation/git-imap-send.txt |  3 +-
 INSTALL | 15 ---
 Makefile 

Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Jaime Soriano Pastor
On Mon, Aug 25, 2014 at 7:21 PM, Junio C Hamano gits...@pobox.com wrote:
 Jaime Soriano Pastor jsorianopas...@gmail.com writes:

 Subject: Re: [PATCH 1/2] Check order when reading index

 Please be careful when crafting the commit title.  This single line
 will be the only one that readers will have to identify the change
 among hundreds of entries in git shortlog output when trying to
 see what kind of change went into the project during the given
 period.  Something like:

 read_index_from(): catch out of order entries while reading an index file

 perhaps?

Ok, rephrasing it.

 +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)

 Does this have to be global, i.e. not static void ...?

Not really, changing it to static.

 + if (ce_stage(ce) >= ce_stage(next_ce))
 + die(Unordered stage entries for '%s',
 + ce-name);

 Not quite.  We do allow multiple higher stage entries; having two or
 more stage #1 entries is perfectly fine during a merge resolution,
 and both ce and next_ce may be pointing at the stage #1 entries of
 the same path.  Replacing the comparison with > is sufficient, I
 think.

Ok, but like Jeff, I'm also curious about how to have multiple stage
#1 entries for the same path.

Thanks.


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Arjun Sreedharan
On 26 August 2014 01:05, Jeff King p...@peff.net wrote:
 On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote:

 Good digging, and I agree that it should use the FLEX_ARRAY for
 consistency.

 I can produce a patch, but I did not want to steal Arjun's thunder.

Please feel free to do so.


 Arjun, did my proposal make sense? Do you want to try implementing that?

 -Peff


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Jonathan Nieder
Jeff King wrote:

 It is unfortunately easy for our static header list to grow
 stale, as none of the regular developers make use of it.
 Instead of trying to keep it up to date, let's invoke find
 to generate the list dynamically.

Yep, I like this.

It does mean that people who run make pot have to be a little more
vigilant about not keeping around extra .h files by mistake.  But I
trust them.

[...]
 +LIB_H = $(shell $(FIND) . \
 + -name .git -prune -o \
 + -name t -prune -o \
 + -name Documentation -prune -o \
 + -name '*.h' -print)

Tiny nit: I might use

$(shell $(FIND) * \
-name . -o \
-name '.*' -prune -o \
-name t -prune -o \
-name Documentation -prune -o \
-name '*.h' -print)

or

$(shell $(FIND) * \
-name '.?*' -prune -o \
-name t -prune -o \
-name Documentation -prune -o \
-name '*.h' -print)

to avoid spending time looking in other dot-directories like .svn,
.hg, or .snapshot.

With or without such a change,
Reviewed-by: Jonathan Nieder jrnie...@gmail.com


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Junio C Hamano
On Mon, Aug 25, 2014 at 12:44 PM, Jeff King p...@peff.net wrote:
 For my own curiosity, how do you get into this situation, and what does
 it mean to have multiple stage#1 entries for the same path? What would
 git cat-file :1:path output?

That is how we natively (read: not with the funky virtual stuff
merge-recursive does) express a merge with multiple merge bases.
You also should be able to read this in the way how git merge invokes
merge strategies (one or more bases, double-dash and then current
HEAD and the other branches). I think there are some tests in 3way
merge tests that checks what should happen when our HEAD matches
one of the stage #1 while their branch matches a different one of the
stage #1, too.


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 Jeff King wrote:

 It is unfortunately easy for our static header list to grow
 stale, as none of the regular developers make use of it.
 Instead of trying to keep it up to date, let's invoke find
 to generate the list dynamically.

 Yep, I like this.

 It does mean that people who run make pot have to be a little more
 vigilant about not keeping around extra .h files by mistake.  But I
 trust them.

 [...]
 +LIB_H = $(shell $(FIND) . \
 +-name .git -prune -o \
 +-name t -prune -o \
 +-name Documentation -prune -o \
 +-name '*.h' -print)

 Tiny nit: I might use

   $(shell $(FIND) * \
   -name . -o \
   -name '.*' -prune -o \
   -name t -prune -o \
   -name Documentation -prune -o \
   -name '*.h' -print)

 or

   $(shell $(FIND) * \
   -name '.?*' -prune -o \
   -name t -prune -o \
   -name Documentation -prune -o \
   -name '*.h' -print)

 to avoid spending time looking in other dot-directories like .svn,
 .hg, or .snapshot.

Wouldn't it be sufficient to start digging not from "*" but from
"??*"?  That is, something like

find ??* \( -name Documentation -o -name .\?\* \) -prune -o -name \*.h

;-)

 With or without such a change,
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

Thanks.


Re: [PATCH/RFC] git-imap-send: use libcurl for implementation

2014-08-25 Thread Junio C Hamano
Bernhard Reiter ock...@raz.or.at writes:

 Yeah, but I still need to wrap the non-tunneled operation in #ifdefs in
 case we don't define USE_CURL_FOR_IMAP_SEND, in which case we fall back
 to the handwritten IMAP code, don't I?

We do not mind multiple implementations of the same helper function
that are guarded with #ifdef/#endif, and we do use that style quite
a lot.  Would it help?
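
As a sketch of the style being described (only USE_CURL_FOR_IMAP_SEND
comes from the thread; the function name and bodies below are invented
for illustration):

    #ifdef USE_CURL_FOR_IMAP_SEND
    /* libcurl-based implementation */
    static int deliver_message(const char *msg)
    {
            /* ... drive libcurl here ... */
            return 0;
    }
    #else
    /* fall back to the hand-written IMAP code */
    static int deliver_message(const char *msg)
    {
            /* ... speak the IMAP protocol by hand here ... */
            return 0;
    }
    #endif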


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote:

 Good digging, and I agree that it should use the FLEX_ARRAY for
 consistency.

 I can produce a patch, but I did not want to steal Arjun's thunder.

Hmph, would it have to overlap?  I think we can queue Arjun's patch
with +1 fix and FLEX_ARRAY thing separately, and they can go in in
any order, no?

 Arjun, did my proposal make sense? Do you want to try implementing that?



Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Junio C Hamano
Arjun Sreedharan arjun...@gmail.com writes:

 Find and allocate the required amount instead of allocating an extra
 100 bytes

 Signed-off-by: Arjun Sreedharan arjun...@gmail.com
 ---

Interesting.  How much memory do we typically waste (in other words,
how big did cnt grow in your repository where you noticed this)?

  bisect.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

 diff --git a/bisect.c b/bisect.c
 index d6e851d..c96aab0 100644
 --- a/bisect.c
 +++ b/bisect.c
 @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n
   }
   qsort(array, cnt, sizeof(*array), compare_commit_dist);
   for (p = list, i = 0; i < cnt; i++) {
 - struct name_decoration *r = xmalloc(sizeof(*r) + 100);
 + char name[100];
 + sprintf(name, "dist=%d", array[i].distance);
 + int name_len = strlen(name);

Decl-after-stmt.

You do not have to run a separate strlen() for this.  The return
value from sprintf() should tell you how much you need to allocate,
I think.

 + struct name_decoration *r = xmalloc(sizeof(*r) + name_len);
   struct object *obj = &(array[i].commit->object);
  
 - sprintf(r->name, "dist=%d", array[i].distance);
 + memcpy(r->name, name, name_len + 1);
   r->next = add_decoration(&name_decoration, obj, r);
   p->item = array[i].commit;
   p = p->next;
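
To illustrate the point above about using the sprintf() return value
instead of a separate strlen(), a rough sketch (the function name is
invented, and the "+ 1" leaves room for the terminating NUL, assuming
the flexible name member discussed elsewhere in this thread):

    #include "cache.h"
    #include "commit.h"

    /*
     * Sketch: size the allocation from sprintf()'s return value
     * rather than a separate strlen(); "dist=" plus any int easily
     * fits in the local buffer.
     */
    static struct name_decoration *make_dist_decoration(int distance)
    {
            char buf[32];
            int len = sprintf(buf, "dist=%d", distance);
            struct name_decoration *r = xmalloc(sizeof(*r) + len + 1);

            memcpy(r->name, buf, len + 1);  /* copies the NUL too */
            return r;
    }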


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Jonathan Nieder
Junio C Hamano wrote:
 Jonathan Nieder jrnie...@gmail.com writes:

 Tiny nit: I might use

  $(shell $(FIND) * \
  -name . -o \
  -name '.*' -prune -o \
  -name t -prune -o \
  -name Documentation -prune -o \
  -name '*.h' -print)

 or

  $(shell $(FIND) * \
  -name '.?*' -prune -o \
  -name t -prune -o \
  -name Documentation -prune -o \
  -name '*.h' -print)

 to avoid spending time looking in other dot-directories like .svn,
 .hg, or .snapshot.

 Wouldn't it be sufficient to start digging not from "*" but from
 "??*"?

Gah, the "*" was supposed to be "." in my examples (though it doesn't
hurt).

   find ??* \( -name Documentation -o -name .\?\* \) -prune -o -name \*.h

Heh.  Yeah, that would work. ;-)


Re: [PATCH] bisect: save heap memory. allocate only the required amount

2014-08-25 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 The string will always be "dist=" followed by decimal representation of
 a count that fits in int anyway, so I actually think use of strbuf is way
 overkill (and formatting it twice also is); the patch as posted should be
 just fine.

 I think you are right, and the patch is the right direction (assuming we
 want to do this; I question whether there are enough elements in the
 list for us to care about the size, and if there are, we are probably
 better off storing the int and formatting the strings on the fly).

;-)

 It would be nice if there was some way to abstract the idea of
 formatting a buffer directly into a flex-array. That would involve the
 double-format you mention, but we could use it in lots of places to make
 the code nicer
 ...
   struct name_decoration *r = fmt_flex_array(sizeof(*r),
                                              offsetof(*r, name),
                                              "dist=%d", x);

 which is a little less nice. You could make it nicer with a macro, but
 we don't assume variadic macros. sigh

At first I thought "Yuck.  A helper only to format into the flex
member that holds a string?", and I tried to change my mind, but I
couldn't quite convince myself.  At least not yet.

Among the flex arrays we use, some are arrays of bools, some others
are arrays of object names, and there are many arrays of even more
esoteric types.  Only a small number of them are "We want a struct
with a constant string, and we do not want to do two allocations and
to pay an extra dereference cost by using 'const char *'".

For them, by the time we allocate a struct, by definition we should
have sufficient information to compute how large to make that
structure and a printf-format plus its args would be the preferred
form of that sufficient information, I would think.

The name fmt_flex_array(), which stresses too much on the
formatting side without implying that it is the way to allocate
the thing, may be horrible, and I agree with you that without
variadic macros the end result may not read very well.  Unless we
have great many number of places we can use the helper to make the
code to create these objects look nicer, I am afraid that the
pros-and-cons may not be very favourable.

Thanks for an interesting tangent.
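
For what it's worth, a rough sketch of such a helper (not an existing
git API; the stdarg-based signature below is invented, it formats
twice, once to measure and once into the flex member, and it uses
plain stdio/stdlib rather than git's own allocation helpers):

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Sketch of the fmt_flex_array() idea: allocate a struct whose
     * trailing flex member starts at 'offset', large enough to hold
     * the formatted string, and format directly into it.
     */
    static void *fmt_flex_array(size_t base_size, size_t offset,
                                const char *fmt, ...)
    {
            va_list ap;
            int len;
            size_t need;
            char *p;

            va_start(ap, fmt);
            len = vsnprintf(NULL, 0, fmt, ap);      /* measure */
            va_end(ap);
            if (len < 0)
                    exit(1);        /* a real helper would die() */

            need = offset + len + 1;
            p = calloc(1, need > base_size ? need : base_size);
            if (!p)
                    exit(1);
            va_start(ap, fmt);
            vsnprintf(p + offset, len + 1, fmt, ap); /* fill flex member */
            va_end(ap);
            return p;
    }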


Re: [PATCH 2/3] Makefile: use `find` to determine static header dependencies

2014-08-25 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 Wouldn't it be sufficient to start digging not from "*" but from
 "??*"?

 Gah, the "*" was supposed to be "." in my examples (though it doesn't
 hurt).

  find ??* \( -name Documentation -o -name .\?\* \) -prune -o -name \*.h

 Heh.  Yeah, that would work. ;-)

Continuing useless discussion...

Actually as you are not excluding CVS, RCS, etc., and using "??*" as
the starting point will exclude .git, .hg, etc. at the top, I think
we can shorten it even further and say

find ??* -name Documentation -prune -o -name \*.h

or something.

...and time to go back to something more serious and practical.

Don't we want to exclude contrib/ by the way?


Re: [GIT PULL] l10n updates for 2.1.0 round 1

2014-08-25 Thread Junio C Hamano
Jiang Xin worldhello@gmail.com writes:

 The following changes since commit 49f1cb93a2f11845cfa2723611a729d3d7f02f0d:

   Git 2.1.0-rc0 (2014-07-27 15:22:22 -0700)

 are available in the git repository at:

   git://github.com/git-l10n/git-po

 for you to fetch changes up to f7fbc357f863ecc5323f3fcf2fc9cbf2aa2a8587:

   l10n: fr.po (2257t) update for version 2.1.0 (2014-08-07 09:07:18 +0200)

Thanks. I finally got around to pulling this one.

Sorry for the delay.


Re: [PATCH v5 1/4] convert: Refactor would_convert_to_git() to single arg 'path'

2014-08-25 Thread Junio C Hamano
Steffen Prohaska proha...@zib.de writes:

 It is only the path that matters in the decision whether to filter or
 not.  Clarify this by making path the single argument of
 would_convert_to_git().

 Signed-off-by: Steffen Prohaska proha...@zib.de
 ---

I've retitled this as:

convert: drop arguments other than 'path' from would_convert_to_git()

to match the output from "git shortlog --since=3.months --no-merges"
by using lowercase 'd' after the "convert: " area name, and also,
more importantly, to avoid calling it a "refactor", which this change is not.

Thanks.

  convert.h   | 5 ++---
  sha1_file.c | 2 +-
  2 files changed, 3 insertions(+), 4 deletions(-)

 diff --git a/convert.h b/convert.h
 index 0c2143c..c638b33 100644
 --- a/convert.h
 +++ b/convert.h
 @@ -40,10 +40,9 @@ extern int convert_to_working_tree(const char *path, const char *src,
  size_t len, struct strbuf *dst);
  extern int renormalize_buffer(const char *path, const char *src, size_t len,
 struct strbuf *dst);
 -static inline int would_convert_to_git(const char *path, const char *src,
 -size_t len, enum safe_crlf checksafe)
 +static inline int would_convert_to_git(const char *path)
  {
 - return convert_to_git(path, src, len, NULL, checksafe);
 + return convert_to_git(path, NULL, 0, NULL, 0);
  }
  
  /*
 diff --git a/sha1_file.c b/sha1_file.c
 index 3f70b1d..00c07f2 100644
 --- a/sha1_file.c
 +++ b/sha1_file.c
 @@ -3144,7 +3144,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
   if (!S_ISREG(st->st_mode))
   ret = index_pipe(sha1, fd, type, path, flags);
   else if (size <= big_file_threshold || type != OBJ_BLOB ||
 -  (path && would_convert_to_git(path, NULL, 0, 0)))
 +  (path && would_convert_to_git(path)))
   ret = index_core(sha1, fd, size, type, path, flags);
   else
   ret = index_stream(sha1, fd, size, type, path, flags);


Re: Show containing branches in log?

2014-08-25 Thread Robert Dailey
On Thu, Jul 3, 2014 at 2:41 PM, Øyvind A. Holm su...@sunbase.org wrote:
 On 2 July 2014 16:50, Robert Dailey rcdailey.li...@gmail.com wrote:
 I know that with the `git branch` command I can determine which
 branches contain a commit. Is there a way to represent this
 graphically with `git log`? Sometimes I just have a commit, and I need
 to find out what branch contains that commit. The reason why `git
 branch --contains` doesn't solve this problem for me is that it names
 almost all branches because of merge commits. Too much ancestry has
 been built since this commit, so there is no way to find the closest
 branch that contains that commit.

 Is there a way to graphically see what is the nearest named ref to
 the specified commit in the logs?

 I have created a script for just this functionality which I use very
 often, and have created a gist with the files at
 https://gist.github.com/sunny256/2eb583f21e0ffcfe994f, I think it
 should solve your problem. It contains these files:

   git-wn

 "wn" means "What's New" and will create a visual graph of all commits
 which have a specified ref as ancestor. It also needs the following
 script, just put it into your $PATH somewhere:

   git-lc

 "lc" means "List branches Containing this commit" and generates a list
 of all branches containing a specified ref.

 The files originates from https://github.com/sunny256/utils, but
 I've modified them in the gist to make your life easier. :)

 Hope that helps,
 Øyvind

I'm finally getting around to trying this out but it isn't working on
Windows because there is no fmt command in msysgit. Do you have a
workaround I can use? Thanks.


Re: Transaction patch series overview

2014-08-25 Thread Jonathan Nieder
Jonathan Nieder wrote:
 Ronnie Sahlberg wrote:

 ref-transaction-1 (2014-07-16) 20 commits
 -
 Second batch of ref transactions

  - refs.c: make delete_ref use a transaction
  - refs.c: make prune_ref use a transaction to delete the ref
  - refs.c: remove lock_ref_sha1
  - refs.c: remove the update_ref_write function
  - refs.c: remove the update_ref_lock function
  - refs.c: make lock_ref_sha1 static
  - walker.c: use ref transaction for ref updates
  - fast-import.c: use a ref transaction when dumping tags
  - receive-pack.c: use a reference transaction for updating the refs
  - refs.c: change update_ref to use a transaction
  - branch.c: use ref transaction for all ref updates
  - fast-import.c: change update_branch to use ref transactions
  - sequencer.c: use ref transactions for all ref updates
  - commit.c: use ref transactions for updates
  - replace.c: use the ref transaction functions for updates
  - tag.c: use ref transactions when doing updates
  - refs.c: add transaction.status and track OPEN/CLOSED/ERROR
  - refs.c: make ref_transaction_begin take an err argument
  - refs.c: update ref_transaction_delete to check for error and return status
  - refs.c: change ref_transaction_create to do error checking and return 
 status
  (this branch is used by rs/ref-transaction, rs/ref-transaction-multi,
 rs/ref-transaction-reflog and rs/ref-transaction-rename.)
[...]
 I'm having trouble keeping track of how patches change, which patches
 have been reviewed and which haven't, unaddressed comments, and so on,
 so as an experiment I've pushed this part of the series to the Gerrit
 server at

  https://code-review.googlesource.com/#/q/topic:ref-transaction-1

Outcome of the experiment: patches gained some minor changes and then

 1-12
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 13
Reviewed-by: Michael Haggerty mhag...@alum.mit.edu
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 14
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 15-16
Reviewed-by: Michael Haggerty mhag...@alum.mit.edu
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 17
Reviewed-by: Michael Haggerty mhag...@alum.mit.edu

 18
Reviewed-by: Michael Haggerty mhag...@alum.mit.edu
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 19
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 20
Reviewed-by: Michael Haggerty mhag...@alum.mit.edu
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

I found the web UI helpful in seeing how each patch evolved since I
last looked at it.  Interdiff relative to the version in pu is below.
I'm still hoping for a tweak in response to a minor comment and then I
can put up a copy of the updated series to pull.

The next series from Ronnie's collection is available at
https://code-review.googlesource.com/#/q/topic:ref-transaction in case
someone wants a fresh series to look at.

diff --git a/branch.c b/branch.c
index c1eae00..37ac555 100644
--- a/branch.c
+++ b/branch.c
@@ -305,6 +305,7 @@ void create_branch(const char *head,
ref_transaction_commit(transaction, msg, &err))
die("%s", err.buf);
ref_transaction_free(transaction);
+   strbuf_release(&err);
}
 
if (real_ref  track)
diff --git a/builtin/commit.c b/builtin/commit.c
index 668fa6a..9bf1003 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1765,8 +1765,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
transaction = ref_transaction_begin(&err);
if (!transaction ||
ref_transaction_update(transaction, "HEAD", sha1,
-  current_head ?
-  current_head->object.sha1 : NULL,
+  current_head
+  ? current_head->object.sha1 : NULL,
   0, !!current_head, &err) ||
ref_transaction_commit(transaction, sb.buf, &err)) {
rollback_index_files();
@@ -1801,5 +1801,6 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
if (!quiet)
print_summary(prefix, sha1, !current_head);
 
+   strbuf_release(&err);
return 0;
 }
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 91099ad..224fadc 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -194,7 +194,7 @@ static void write_head_info(void)
 
 struct command {
struct command *next;
-   char *error_string;
+   const char *error_string;
unsigned int skip_update:1,
 did_not_exist:1;
int index;
@@ -468,7 +468,7 @@ static int update_shallow_ref(struct command *cmd, struct shallow_info *si)
return 0;
 }
 
-static char *update(struct command *cmd, struct shallow_info *si)
+static const char *update(struct command *cmd, struct