Re: [PATCH] strtoul_ui: actually report error in case of negative input

2015-09-16 Thread Matthieu Moy
Max Kirillov  writes:

> On Tue, Sep 15, 2015 at 08:50:03AM +0200, Matthieu Moy wrote:
>> --- a/git-compat-util.h
>> +++ b/git-compat-util.h
>> @@ -814,6 +814,9 @@ static inline int strtoul_ui(char const *s, int base, 
>> unsigned int *result)
>> char *p;
>>  
>> errno = 0;
>> +   /* negative values would be accepted by strtoul */
>> +   if (strchr(s, '-'))
>> +   return -1;
>> ul = strtoul(s, , base);
>> if (errno || *p || p == s || (unsigned int) ul != ul)
>> return -1;
>> 
>> What do you think?
>
> Explicit rejection of '-' is of course useful addition.
>
> I still find "(unsigned int) ul != ul" bad. As far as I
> understand it makes no sense for i386.

Nothing would make sense here for i386: there's no case where you want
to reject anything on this architecture. Well, you may have expected
strtoul to reject big numbers, but it did not and it's too late.

> And even for 64-bit it's too obscure. In form of "(ul & 0xL)
> == 0" it would be more clear.

I disagree. "(unsigned int) ul != ul" reads immediately as "if casting
ul to unsigned int changes its value", regardless of sizeof(int). This
is exactly what the check is doing.

> Or just make explicit comparison with intended limit, like I did.

What you really want is to compare with UINT_MAX which would not make
sense on 32 bits architectures.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 5/8] branch: drop non-commit error reporting

2015-09-16 Thread Karthik Nayak
On Tue, Sep 15, 2015 at 1:19 AM, Junio C Hamano  wrote:
> Karthik Nayak  writes:
>
>> Remove the error reporting variable to make the code easier to port
>> over to using ref-filter APIs. This variable is not required as in
>> ref-filter we already check for possible errors and report them.
>
> H.  What do you exactly mean "possible errors" here?
>

The checking if it points to a commit.

> Unlike generic refs that can point at anything, refs/heads/* refs
> must point at a commit [*1*], and that is why the error message says
> 'does not point at a commit'.
>
> Does ref-filter API have corresponding check to treat the local and
> remote branch hierarchies differently from tags and others?
>

This check in the existing code is only done when one of these holds:
ref_list->verbose || ref_list->with_commit || merge_filter != NO_FILTER

There is a similar check in ref-filter when we filter and obtain refs.

if (filter->merge_commit || filter->with_commit || filter->verbose) {
commit = lookup_commit_reference_gently(oid->hash, 1);
if (!commit)
return 0;
/* We perform the filtering for the '--contains' option */
if (filter->with_commit &&
!commit_contains(filter, commit))
return 0;
}

> [Footnote]
>
> *1* Strictly speaking, use of lookup_commit_reference_gently() in
> the existing code allows a committish to be there and does not
> limit the tip objects to be commits, but I think it is a bug.
> At least, even with deref_tag(), we reject non committish found
> at the tip of branch refs and explain with the error message
> this patch removes.

Ah, didn't go through this.

-- 
Regards,
Karthik Nayak
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/8] branch: roll show_detached HEAD into regular ref_list

2015-09-16 Thread Karthik Nayak
On Tue, Sep 15, 2015 at 1:05 AM, Junio C Hamano  wrote:
> Karthik Nayak  writes:
>
>> + /*
>> +  * First we obtain all regular branch refs and if the HEAD is
>> +  * detached then we insert that ref to the end of the ref_fist
>> +  * so that it can be printed and removed first.
>> +  */
>>   for_each_rawref(append_ref, );
>> + if (detached)
>> + head_ref(append_ref, );
>> + index = ref_list.index;
>> +
>> + /* Print detached HEAD before sorting and printing the rest */
>> + if (detached && (ref_list.list[index - 1].kind == REF_DETACHED_HEAD) &&
>> + !strcmp(ref_list.list[index - 1].name, head)) {
>> + print_ref_item(_list.list[index - 1], maxwidth, verbose, 
>> abbrev,
>> +1, remote_prefix);
>> + index -= 1;
>> + }
>>
>> + qsort(ref_list.list, index, sizeof(struct ref_item), ref_cmp);
>
> This looks somewhat strange.  Wouldn't it be more consistent to
> teach ref_cmp that HEAD sorts where in the collection of refs (I
> presume that kind is checked first and then name, so if you give
> REF_DETACHED_HEAD a low number than others, it would automatically
> give you the ordering you want) without all of the above special
> casing?

Thats nice, we could do that, something like this perhaps:

qsort(ref_list.list, ref_list.index, sizeof(struct ref_item), ref_cmp);

for (i = 0; i < ref_list.index; i++) {
int current = !detached && (ref_list.list[i].kind ==
REF_LOCAL_BRANCH) &&
!strcmp(ref_list.list[i].name, head);
/*  If detached the first ref_item is the current ref */
if (detached && i == 0)
current = 1;
print_ref_item(_list.list[i], maxwidth, verbose,
   abbrev, current, remote_prefix);
}

-- 
Regards,
Karthik Nayak
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: "Medium" log format: change proposal for author != committer

2015-09-16 Thread Jacob Keller
On Tue, Sep 15, 2015 at 6:52 PM, Junio C Hamano  wrote:
>
>  * Enhance the "--pretty=format:" thing so that the current set of
>hardcoded --pretty=medium,short,... formats and your modified
>"medium" can be expressed as a custom format string.
>
>  * Introduce a configuration mechanism to allow users to define new
>short-hand, e.g. if you have this in your $HOME/.gitconfig:
>
> [pretty "robin"]
> format = "commit %H%nAuthor: %an <%ae>%n..."
>

Afiak there is already support for this.. from "git help config":

pretty.
Alias for a --pretty= format string, as specified in git-log(1). Any
aliases defined here can be used just as the built-in pretty formats
could. For example, running git config pretty.changelog "format:* %H
%s" would cause the invocation git log --pretty=changelog to be
equivalent to running git log "--pretty=format:* %H %s". Note that an
alias with the same name as a built-in format will be silently
ignored.

>and run "git log --pretty=robin", it would behave as if you said
>"git log --pretty="format:commit %H%nAuthor: %an <%ae>%n...".
>

So this should already be supported... but to support "robinsformat"
we'd need to be able to "show committer only if different from
author"... Not sure how that would work.

>  * (optional) Replace the hardcoded implementations of pretty
>formats with short-hand names like "medium", "short", etc. with a
>built-in set of pretty.$name.format using the configuration
>mechanism.  But we need to make sure this does not hurt
>performance for common cases.
>

This part obviously hasn't been done, I don't know if any particular
format is not expressable today by the pretty syntax or not..

But at least configuration does work. I use it as part of displaying the

Fixes:  ("name")

used by the upstream kernel for marking bug fixes of known commits.

Thus the only real thing would be implementing a % modifier which
allows showing commiter if it's not the same as author. (or vise
versa) Ideally we could take work from the ref-filter library and the
proposed "%if" stuff but I don't tihnk this was actually implemented
yet, and I don't know if that would even work in the pretty modifiers.

Regards,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 01:38:42PM -0700, Stefan Beller wrote:

> Some off topic thoughts:
> 
> Having an "assert" behavior is not a good user experience though
> and should be fixed. To fix it we need stack traces. And the git
> version. And telling the user to send it to the mailing list.

Yes, any assert (or die("BUG")) should generally result in a code change
to make it never happen again.

I have been tempted to replace our die("BUG") calls with a BUG() macro,
so that we can do more fancy things (even if it is just adding
__FILE__:__LINE__ and a message to contact the list).

> I wonder if we want to include a trace where possible (i.e.
> when compiled with gcc or other precompiler conditions)
> into these assertive behaviors.

That would be nice. I took a look at this back in April:

  http://thread.gmane.org/gmane.comp.version-control.git/267643/focus=267755

but I think I convinced myself that we are better off relying on people
running gdb. If we do anything, it should be to make doing so easier.

> I'd guess we have an assertive behavior if die("BUG:...") is called,
> so maybe we can just check inside of die if we want to print
> a stack trace additionally ?

I guess we could look for starts_with(fmt, "BUG: ") in die(). I think it
would be a bit less gross to convert those calls to a BUG() macro (we
can't get accurate __LINE__ information without it, though in practice
the die messages are usually unique enough to be helpful).

I think a core file is even more useful than a backtrace. Having BUG()
call abort() would be even more useful, I think.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Big path on git add file [windows bug]

2015-09-16 Thread Alexey Kasyanchuk
I tried add one of node.js module to git project. But add operation failed:

libgit2 returned: Invalid path for filesystem
'E:/Projects/vsteams/node_modules/gulp-imagemin/node_modules/imagemin/node_modules/imagemin-gifsicle/node_modules/gifsicle/node_modules/bin-build/node_modules/decompress/node_modules/decompress-tar/node_modules/strip-dirs/node_modules/is-natural-number/is-natural-number-cjs.js':
Data area passed to a system call is too small.

Is this windows problem or problem with libgit2 realization on windows
(windows 7 by the way)? It looked like problem with big system path
(276 symbols). May be some sulution for this?

Version: Git-2.5.2.2-64-bit

-- 
Best regards,
Alexey Kasyanchuk
C/C++ Developer


Skype: degitx
mobile: +375295777373
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 11:19:21PM -0400, Eric Sunshine wrote:

> > strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
> > -   sprintf(ownbuf[stage], "%o", ce->ce_mode);
> > +   xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
> > ce->ce_mode);
> 
> Interesting. I wonder if there are any (old/broken) compilers which
> would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead?

Good point. I've changed it to sizeof(ownbuf[0]).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Big path on git add file [windows bug]

2015-09-16 Thread Johannes Schindelin
Hi Alexey,

On 2015-09-16 10:01, Alexey Kasyanchuk wrote:
> I tried add one of node.js module to git project. But add operation failed:
> 
> libgit2 returned: Invalid path for filesystem
> 'E:/Projects/vsteams/node_modules/gulp-imagemin/node_modules/imagemin/node_modules/imagemin-gifsicle/node_modules/gifsicle/node_modules/bin-build/node_modules/decompress/node_modules/decompress-tar/node_modules/strip-dirs/node_modules/is-natural-number/is-natural-number-cjs.js':
> Data area passed to a system call is too small.

It is indeed too small for normal operations (MAX_PATH is somewhere around 
248). However, as far as I know, libgit2 has internal code to convert the path 
into a form that the Win32 API can handle.

> Is this windows problem or problem with libgit2 realization on windows
> (windows 7 by the way)? It looked like problem with big system path
> (276 symbols). May be some sulution for this?

Yes, there is a sulution for this. But it is unclear whether you have that 
already or not because you were a little parsimonious with information. For 
example, I have no idea which libgit2 version you use, in which context, what 
program you use to call libgit2, etc.

So I am left guessing, which is a little bit inefficient.

In any case, the work-around for the issue you described is in 
https://github.com/libgit2/libgit2/commit/cceae9a25d0bed8b00f4981e051d5f380ef54401
 which is part of v0.22.1.

> Version: Git-2.5.2.2-64-bit

No, that is not the correct version. This looks like the current Git for 
Windows version, but the error you reported is clearly a libgit2 error message. 
And that Git for Windows version contains no libgit2 at all.

Please clarify. You might want to err on the side of verbosity to avoid 
unnecessary back-and-forth.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/7] git-p4: add support for large file systems

2015-09-16 Thread Luke Diamand

On 14/09/15 14:26, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

Perforce repositories can contain large (binary) files. Migrating these
repositories to Git generates very large local clones. External storage
systems such as Git LFS [1], Git Fat [2], Git Media [3], git-annex [4]
try to address this problem.

Add a generic mechanism to detect large files based on extension,
uncompressed size, and/or compressed size.


Looks excellent! I've got a small few comments (below) and I need to 
come back and have another look through if that's OK.


Thanks!
Luke



[1] https://git-lfs.github.com/
[2] https://github.com/jedbrown/git-fat
[3] https://github.com/alebedev/git-media
[4] https://git-annex.branchable.com/

Signed-off-by: Lars Schneider 
---
  Documentation/git-p4.txt   |  32 +++
  git-p4.py  | 139 +
  t/t9823-git-p4-mock-lfs.sh |  96 +++
  3 files changed, 257 insertions(+), 10 deletions(-)
  create mode 100755 t/t9823-git-p4-mock-lfs.sh

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index 82aa5d6..3f21e95 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -510,6 +510,38 @@ git-p4.useClientSpec::
option '--use-client-spec'.  See the "CLIENT SPEC" section above.
This variable is a boolean, not the name of a p4 client.

+git-p4.largeFileSystem::
+   Specify the system that is used for large (binary) files. Please note
+   that large file systems do not support the 'git p4 submit' command.


Why is that? Is it just that you haven't implemented support, or is it 
fundamentally impossible?



+   Only Git LFS [1] is implemented right now. Download
+   and install the Git LFS command line extension to use this option
+   and configure it like this:
++
+-
+git config   git-p4.largeFileSystem GitLFS
+-
++
+   [1] https://git-lfs.github.com/
+
+git-p4.largeFileExtensions::
+   All files matching a file extension in the list will be processed
+   by the large file system. Do not prefix the extensions with '.'.
+
+git-p4.largeFileThreshold::
+   All files with an uncompressed size exceeding the threshold will be
+   processed by the large file system. By default the threshold is
+   defined in bytes. Add the suffix k, m, or g to change the unit.
+
+git-p4.largeFileCompressedThreshold::
+   All files with a compressed size exceeding the threshold will be
+   processed by the large file system. This option might slow down
+   your clone/sync process. By default the threshold is defined in
+   bytes. Add the suffix k, m, or g to change the unit.
+
+git-p4.pushLargeFiles::
+   Boolean variable which defines if large files are automatically
+   pushed to a server.
+
  Submit variables
  
  git-p4.detectRenames::
diff --git a/git-p4.py b/git-p4.py
index b465356..bfe71b5 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -22,6 +22,8 @@ import platform
  import re
  import shutil
  import stat
+import zipfile
+import zlib

  try:
  from subprocess import CalledProcessError
@@ -932,6 +934,111 @@ def wildcard_present(path):
  m = re.search("[*#@%]", path)
  return m is not None

+class LargeFileSystem(object):
+"""Base class for large file system support."""
+
+def __init__(self, writeToGitStream):
+self.largeFiles = set()
+self.writeToGitStream = writeToGitStream
+
+def generatePointer(self, cloneDestination, contentFile):
+"""Return the content of a pointer file that is stored in Git instead 
of
+   the actual content."""
+assert False, "Method 'generatePointer' required in " + 
self.__class__.__name__
+
+def pushFile(self, localLargeFile):
+"""Push the actual content which is not stored in the Git repository to
+   a server."""
+assert False, "Method 'pushFile' required in " + 
self.__class__.__name__
+
+def hasLargeFileExtension(self, relPath):
+return reduce(
+lambda a, b: a or b,
+[relPath.endswith('.' + e) for e in 
gitConfigList('git-p4.largeFileExtensions')],
+False
+)
+
+def generateTempFile(self, contents):
+contentFile = tempfile.NamedTemporaryFile(prefix='git-p4-large-file', 
delete=False)
+for d in contents:
+contentFile.write(d)
+contentFile.flush()
+contentFile.close()


close() should flush() automatically I would have thought.


+return contentFile.name
+
+def exceedsLargeFileThreshold(self, relPath, contents):
+if gitConfigInt('git-p4.largeFileThreshold'):
+contentsSize = sum(len(d) for d in contents)
+if contentsSize > gitConfigInt('git-p4.largeFileThreshold'):
+return True
+if gitConfigInt('git-p4.largeFileCompressedThreshold'):
+ 

Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time

2015-09-16 Thread Michael Haggerty
On 09/14/2015 07:37 PM, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
>> Thanks, will queue.
> 
> Ehh, I spoke a bit too early.
> 
>>> diff --git a/builtin/gc.c b/builtin/gc.c
>>> index bcc75d9..2c3aaeb 100644
>>> --- a/builtin/gc.c
>>> +++ b/builtin/gc.c
>>> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = 
>>> ARGV_ARRAY_INIT;
>>>  static struct argv_array rerere = ARGV_ARRAY_INIT;
>>>  
>>>  static char *pidfile;
>>> +static struct strbuf log_filename = STRBUF_INIT;
>>> +static int daemonized;
>>>  
>>>  static void remove_pidfile(void)
>>>  {
>>> +   if (daemonized && log_filename.len) {
>>> +   struct stat st;
>>> +
>>> +   close(2);
>>> +   if (stat(log_filename.buf, ) ||
>>> +   !st.st_size ||
>>> +   rename(log_filename.buf, git_path("gc.log")))
>>> +   unlink(log_filename.buf);
>>> +   }
> 
> Unfortuantely we cannot queue this as-is, as we let the tempfile API
> to automatically manage the pidfile since ebebeaea (gc: use tempfile
> module to handle gc.pid file, 2015-08-10), and you cannot piggy-back
> the log file finalization to this function that no longer exists.
> 
> Besides, it is obviously wrong to remove this log file in a function
> whose name is remove_pidfile() ;-)
> 
> Adding a new function to tempfile API that puts the file to a final
> place if it is non-empty and otherwise remove it, and using that to
> create this "gc.log" file, would be the cleanest from the point of
> view of _this_ codepath.  I however do not know if that is too
> specific for the need of this codepath or "leave it if non-empty,
> but otherwise remove as it is uninteresting" is fairly common thing
> we would want and it is a good addition to the API set.
> 
> Michael, what do you think?

I'm not sure what behavior you want. At one point you say "puts the file
to a final place if it is non-empty" but later you say "leave it if
non-empty". Should the file be written directly, or should it be written
to a lockfile and renamed into place only when complete?

Technically I don't see a problem implementing either behavior. POSIX
allows [1] calls to stat() and rename() from a signal handler. There is
a minor technical difficulty that commit_lock_file() allocates memory
via get_locked_file_path(), but this would be surmountable by
pre-allocating the memory for the locked file path and storing it in the
lock_file object.

This doesn't seem like a common thing to want (as in, this might be the
only caller), but it probably makes sense to build it into the
tempfile/lockfile API nevertheless, because implementing it externally
would require a lot of other code to be duplicated.

Another possibility that might work (maybe without requiring changes to
tempfile/lockfile): don't worry about deleting the log file if it is
empty, but make observers treat an empty log file the same as an absent one.

Michael

[1]
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-16 Thread Johannes Schindelin
Hi Junio, Jeff & Ramsay,

On 2015-09-16 03:32, Junio C Hamano wrote:
> Jeff King  writes:
> 
>>> Hmm, I haven't read any other patches yet (including those which use these
>>> new '_to' functions), but I can't help feeling they should be named 
>>> something
>>> like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead.  i.e. I 
>>> don't get
>>> the '_to' thing - not that I'm any good at naming things ...
>>
>> I meant it as a contrast with their original. sha1_to_hex() formats into
>> an internal buffer and returns it. But sha1_to_hex_to() formats "to" a
>> buffer of your choice.
> 
> I think that naming makes perfect sense.

I have to admit that I needed the explanation.

Maybe we should stick to the established practice of the many, many reentrant 
POSIX functions following the *_r() naming convention? I.e. the reentrant 
version of localtime() is called localtime_r(), the reentrant version of 
random() is called random_r() etc.

So I could see myself not needing an explanation if I had read 
sha1_to_hex_r(...).

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/67] strbuf: make strbuf_complete_line more generic

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 06:27:49PM -0700, Junio C Hamano wrote:

> Eric Sunshine  writes:
> 
> >> +static inline void strbuf_complete(struct strbuf *sb, char term)
> >> +{
> >> +   if (sb->len && sb->buf[sb->len - 1] != term)
> >> +   strbuf_addch(sb, term);
> >> +}
> >
> > Hmm, so this only adds 'term' if not already present *and* if 'sb' is
> > not empty, which doesn't seem to match the documentation which says
> > that it "ensures" termination.
> [...]
> So to these two plausible and different set of callers that would be
> helped by this function, the behaviour Peff gives it would match
> what the callers want better than your version.

Right. I think what the function is doing is the right thing (and
certainly it matches what the callers I'm changing are doing already
:) ).

But I agree the docstring is extremely misleading. I've changed it to:

diff --git a/strbuf.h b/strbuf.h
index aef2794..43f27c3 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -491,10 +491,21 @@ extern void strbuf_add_lines(struct strbuf *sb, const 
char *prefix, const char *
  */
 extern void strbuf_addstr_xml_quoted(struct strbuf *sb, const char *s);
 
+/**
+ * "Complete" the contents of `sb` by ensuring that either it ends with the
+ * character `term`, or it is empty.  This can be used, for example,
+ * to ensure that text ends with a newline, but without creating an empty
+ * blank line if there is no content in the first place.
+ */
+static inline void strbuf_complete(struct strbuf *sb, char term)
+{
+   if (sb->len && sb->buf[sb->len - 1] != term)
+   strbuf_addch(sb, term);
+}
+
 static inline void strbuf_complete_line(struct strbuf *sb)
 {
-   if (sb->len && sb->buf[sb->len - 1] != '\n')
-   strbuf_addch(sb, '\n');
+   strbuf_complete(sb, '\n');
 }
 
 extern int strbuf_branchname(struct strbuf *sb, const char *name);


It may be that we will not find other uses beyond completing slash and
newline, but at least this version should not actively mislead anybody.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 08:51:26PM -0400, Eric Sunshine wrote:

> > if (strbuf_getwholeline(, f, '\n')) {
> > -   error("cannot read mail %s (%s)", file, 
> > strerror(errno));
> > +   error("cannot read mail %s (%s)", file.buf, 
> > strerror(errno));
> > goto out;
> > }
> >
> > -   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
> > +   name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
> > split_one(f, name, 1);
> > +   free(name);
> 
> Hmm, why does 'file' become a strbuf which is re-used each time
> through the loop, but 'name' is treated differently and gets
> re-allocated upon each iteration? Why doesn't 'name' deserve the same
> treatment as 'file'?

My thinking was rather the other way around: why doesn't "file" get the
same treatment as "name"?

I generally prefer xstrfmt to strbufs in these patches for two reasons:

  1. The result has fewer lines.

  2. The variable switches from an array to a pointer, so accessing it
 doesn't change. Whereas with a strbuf, you have to s/foo/foo.buf/
 wherever it is accessed.

We can do that easily with "name"; we allocate it, use it, and free it.
But the lifetime of "file" crosses the "goto out" boundaries, and so
it's simplest to clean it up in the "out" section. Doing that correctly
with a bare pointer is tricky (you have to re-NULL it every time you
free the old value), whereas the strbuf's invariants make it trivial.

I guess we could get away with always calling free() right before
assigning (the equivalent of strbuf_reset()), and then rely on exiting
the loop to "out" to do the final free. And then the result (versus the
original code, not my patch) would look like:

diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index 9de06e3..a82dd0d 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -148,8 +155,7 @@ static int maildir_filename_cmp(const char *a, const char 
*b)
 static int split_maildir(const char *maildir, const char *dir,
int nr_prec, int skip)
 {
-   char file[PATH_MAX];
-   char name[PATH_MAX];
+   char *file = NULL;
FILE *f = NULL;
int ret = -1;
int i;
@@ -161,7 +167,11 @@ static int split_maildir(const char *maildir, const char 
*dir,
goto out;
 
for (i = 0; i < list.nr; i++) {
-   snprintf(file, sizeof(file), "%s/%s", maildir, 
list.items[i].string);
+   char *name;
+
+   free(file);
+   file = xstrfmt("%s/%s", maildir, list.items[i].string);
+
f = fopen(file, "r");
if (!f) {
error("cannot open mail %s (%s)", file, 
strerror(errno));
@@ -173,8 +183,9 @@ static int split_maildir(const char *maildir, const char 
*dir,
goto out;
}
 
-   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
+   name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
split_one(f, name, 1);
+   free(name);
 
fclose(f);
f = NULL;
@@ -184,6 +195,7 @@ static int split_maildir(const char *maildir, const char 
*dir,
 out:
if (f)
fclose(f);
+   free(file);
string_list_clear(, 1);
return ret;
 }

which is not so bad.

Of course this is more allocations per loop than using a strbuf. I doubt
it matters in practice (we are about to fopen() and read into a strbuf,
after all!), but we could also follow the opposite direction and use
strbufs for both.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 06:14:18AM -0400, Jeff King wrote:

> I guess we could get away with always calling free() right before
> assigning (the equivalent of strbuf_reset()), and then rely on exiting
> the loop to "out" to do the final free. And then the result (versus the
> original code, not my patch) would look like:

And here is the whole patch converted to that style. I'm planning to go
with this, as the resulting diff is much smaller and much more clear
that we are touching only the allocations.

I _hope_ the result is pretty easy to understand. There is some subtlety
to the loop assumptions:

 - on entering, the pointer is NULL or an allocated buffer to be
   recycled; either way, free() is OK

 - on leaving, the pointer is either NULL (if we never ran the loop) or
   a buffer to be freed. The free() at the end is necessary to handle
   this.

That is not too complicated, but it is not an idiom we use elsewhere
(whereas recycled strbufs are). I can switch the whole thing to strbufs
if that's the direction we want to go.

-- >8 --
Subject: [PATCH] mailsplit: make PATH_MAX buffers dynamic

There are several static PATH_MAX-sized buffers in
mailsplit, along with some questionable uses of sprintf.
These are not really of security interest, as local
mailsplit pathnames are not typically under control of an
attacker, and you could generally only overflow a few
numbers at the end of a path that approaches PATH_MAX (a
longer path would choke mailsplit long before). But it does
not hurt to be careful, and as a bonus we lift some limits
for systems with too-small PATH_MAX varibles.

Signed-off-by: Jeff King 
---
 builtin/mailsplit.c | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index 9de06e3..104277a 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -98,30 +98,37 @@ static int populate_maildir_list(struct string_list *list, 
const char *path)
 {
DIR *dir;
struct dirent *dent;
-   char name[PATH_MAX];
+   char *name = NULL;
char *subs[] = { "cur", "new", NULL };
char **sub;
+   int ret = -1;
 
for (sub = subs; *sub; ++sub) {
-   snprintf(name, sizeof(name), "%s/%s", path, *sub);
+   free(name);
+   name = xstrfmt("%s/%s", path, *sub);
if ((dir = opendir(name)) == NULL) {
if (errno == ENOENT)
continue;
error("cannot opendir %s (%s)", name, strerror(errno));
-   return -1;
+   goto out;
}
 
while ((dent = readdir(dir)) != NULL) {
if (dent->d_name[0] == '.')
continue;
-   snprintf(name, sizeof(name), "%s/%s", *sub, 
dent->d_name);
+   free(name);
+   name = xstrfmt("%s/%s", *sub, dent->d_name);
string_list_insert(list, name);
}
 
closedir(dir);
}
 
-   return 0;
+   ret = 0;
+
+out:
+   free(name);
+   return ret;
 }
 
 static int maildir_filename_cmp(const char *a, const char *b)
@@ -148,8 +155,7 @@ static int maildir_filename_cmp(const char *a, const char 
*b)
 static int split_maildir(const char *maildir, const char *dir,
int nr_prec, int skip)
 {
-   char file[PATH_MAX];
-   char name[PATH_MAX];
+   char *file = NULL;
FILE *f = NULL;
int ret = -1;
int i;
@@ -161,7 +167,11 @@ static int split_maildir(const char *maildir, const char 
*dir,
goto out;
 
for (i = 0; i < list.nr; i++) {
-   snprintf(file, sizeof(file), "%s/%s", maildir, 
list.items[i].string);
+   char *name;
+
+   free(file);
+   file = xstrfmt("%s/%s", maildir, list.items[i].string);
+
f = fopen(file, "r");
if (!f) {
error("cannot open mail %s (%s)", file, 
strerror(errno));
@@ -173,8 +183,9 @@ static int split_maildir(const char *maildir, const char 
*dir,
goto out;
}
 
-   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
+   name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
split_one(f, name, 1);
+   free(name);
 
fclose(f);
f = NULL;
@@ -184,6 +195,7 @@ static int split_maildir(const char *maildir, const char 
*dir,
 out:
if (f)
fclose(f);
+   free(file);
string_list_clear(, 1);
return ret;
 }
@@ -191,7 +203,6 @@ out:
 static int split_mbox(const char *file, const char *dir, int allow_bare,
  int nr_prec, int skip)
 {
-   char name[PATH_MAX];
int ret = -1;
int peek;
 
@@ -218,8 +229,9 @@ static int 

Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 10:15:02AM +0200, Johannes Schindelin wrote:

> Maybe we should stick to the established practice of the many, many
> reentrant POSIX functions following the *_r() naming convention? I.e.
> the reentrant version of localtime() is called localtime_r(), the
> reentrant version of random() is called random_r() etc.
> 
> So I could see myself not needing an explanation if I had read
> sha1_to_hex_r(...).

I like this suggestion. By itself, the "_r" does not communicate as much
as "_to" to me, but as long as the reader knows the "_r" idiom, it
communicates much more.

I'll switch to this unless there is any objection.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/67] war on sprintf, strcpy, etc

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 06:54:28PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Obviously this is not intended for v2.6.0. But all of the spots touched
> > here are relatively quiet right now, so I wanted to get it out onto the
> > list.  There are a few minor conflicts against "pu", but they're all
> > just from touching nearby lines.
> 
> Thanks.  Looks like a lot of good work you did ;-)

I forgot to mention my favorite part:

  $ git diff --stat @{u} | tail -1
   88 files changed, 872 insertions(+), 980 deletions(-)

:)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-svn: cat-file memory usage

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 04:00:48AM -0700, Victor Leschuk wrote:

>  * git svn clone of trac  takes about 1 hour 
>  * git svn clone of FreeBSD has already taken more than 3 days and
>  still running (currently has cloned about 40% of revisions)

I haven't worked with git-svn in a long time, but I doubt that it is the
fastest way to do a large repository import. You might want to look into
a tool like svn2git or reposurgeon to do the initial import.

> I have valgrind'ed the git-cat-file (which is running is --batch mode
> during the whole clone) and found no serious leaks (about 100 bytes
> definitely leaked), so all memory is carefully freed, but the heap
> usage grows maybe due to fragmentation or smth else. When I looked
> through the code I found out that most of heap allocations are called
> from batch_object_write() function (strbuf_expand -> realloc).

Certainly we will call strbuf_expand once per object. I would have
expected we would call read_sha1_file(), too. It looks like we always
try to stream blobs, but I think we have to fall back to reading the
whole object if there are deltas.

You can try this patch, which will reuse the same strbuf over and over:

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 07baad1..73f338c 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -256,7 +256,7 @@ static void print_object_or_die(struct batch_options *opt, 
struct expand_data *d
 static void batch_object_write(const char *obj_name, struct batch_options *opt,
   struct expand_data *data)
 {
-   struct strbuf buf = STRBUF_INIT;
+   static struct strbuf buf = STRBUF_INIT;
 
if (sha1_object_info_extended(data->sha1, >info, 
LOOKUP_REPLACE_OBJECT) < 0) {
printf("%s missing\n", obj_name ? obj_name : 
sha1_to_hex(data->sha1));
@@ -264,10 +264,10 @@ static void batch_object_write(const char *obj_name, 
struct batch_options *opt,
return;
}
 
+   strbuf_reset();
strbuf_expand(, opt->format, expand_format, data);
strbuf_addch(, '\n');
batch_write(opt, buf.buf, buf.len);
-   strbuf_release();
 
if (opt->print_contents) {
print_object_or_die(opt, data);

That will reduce your reallocs due to strbuf_expand, though I'm doubtful
that it will solve the problem (and if it does, I think the right
solution is probably to look into using a better allocator than what
your system malloc() is providing).

>  * In perl code do not run git cat-file in batch mode (in
>  Git::SVN::apply_textdelta) but rather run it as separate commands
>  each time
> 
>my $size = $self->command_oneline('cat-file', '-s', $sha1);
># .
>my ($in, $c) = $self->command_output_pipe('cat-file', 'blob', $sha1);
> 
> The second approach doesn't slow down the whole process at all (~72
> minutes to clone repo both with --batch mode and without).

I'm surprised the startup cost of the process doesn't make an impact,
but maybe it gets lost in the noise of the rest of the work (AFAICT, the
point of this cat-file is to retrieve a blob, apply a delta to it, and
then write out the resulting object; that write is probably a lot more
expensive).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/7] git-p4: add support for large file systems

2015-09-16 Thread Lars Schneider

On 16 Sep 2015, at 10:36, Luke Diamand  wrote:

> On 14/09/15 14:26, larsxschnei...@gmail.com wrote:
>> From: Lars Schneider 
>> 
>> Perforce repositories can contain large (binary) files. Migrating these
>> repositories to Git generates very large local clones. External storage
>> systems such as Git LFS [1], Git Fat [2], Git Media [3], git-annex [4]
>> try to address this problem.
>> 
>> Add a generic mechanism to detect large files based on extension,
>> uncompressed size, and/or compressed size.
> 
> Looks excellent! I've got a small few comments (below) and I need to come 
> back and have another look through if that's OK.
Sure! Thank you :-)

> 
> Thanks!
> Luke
> 
>> 
>> [1] https://git-lfs.github.com/
>> [2] https://github.com/jedbrown/git-fat
>> [3] https://github.com/alebedev/git-media
>> [4] https://git-annex.branchable.com/
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>>  Documentation/git-p4.txt   |  32 +++
>>  git-p4.py  | 139 
>> +
>>  t/t9823-git-p4-mock-lfs.sh |  96 +++
>>  3 files changed, 257 insertions(+), 10 deletions(-)
>>  create mode 100755 t/t9823-git-p4-mock-lfs.sh
>> 
>> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
>> index 82aa5d6..3f21e95 100644
>> --- a/Documentation/git-p4.txt
>> +++ b/Documentation/git-p4.txt
>> @@ -510,6 +510,38 @@ git-p4.useClientSpec::
>>  option '--use-client-spec'.  See the "CLIENT SPEC" section above.
>>  This variable is a boolean, not the name of a p4 client.
>> 
>> +git-p4.largeFileSystem::
>> +Specify the system that is used for large (binary) files. Please note
>> +that large file systems do not support the 'git p4 submit' command.
> 
> Why is that? Is it just that you haven't implemented support, or is it 
> fundamentally impossible?

If we detect LFS files only by file extension then we could make it work. But 
then we must not use any git-p4 settings. We would need to rely only on the 
“.gitattributes” file that is stored in the P4 repository. My implementation 
also looks at the file size and decides on a individual file basis if a file is 
stored in LFS. That means all clients need the same file size threshold. 

Junio explained the problem in the v4 thread:
> How is collaboration between those who talk to the same p4 depot
> backed by GitHub LFS expected to work?  You use config to set size
> limits and list of file extensions in your repository, and grab new
> changes from p4 and turn them into Git commits (with pointers to LFS
> and the .gitattributes file that records your choice of the config).
> I as a new member to the same project come, clone the resulting Git
> repository from you and then what do I do before talking to the same
> p4 depot to further update the Git history?  Are the values recorded
> in .gitattributes (which essentially were derived from your
> configuration) somehow reflected back automatically to my config so
> that our world view would become consistent?  Perhaps you added
> 'iso' to largeFileExtensions before I cloned from you, and that
> would be present in the copy of .gitattributes I obtained from you.
> I may be trying to add a new ".iso" file, and I presume that an
> existing .gitattributes entry you created based on the file
> extension automatically covers it from the LFS side, but does my
> "git p4 submit" also know what to do, or would it be broken until I
> add a set of configrations that matches yours?
 


> 
>> +Only Git LFS [1] is implemented right now. Download
>> +and install the Git LFS command line extension to use this option
>> +and configure it like this:
>> ++
>> +-
>> +git config   git-p4.largeFileSystem GitLFS
>> +-
>> ++
>> +[1] https://git-lfs.github.com/
>> +
>> +git-p4.largeFileExtensions::
>> +All files matching a file extension in the list will be processed
>> +by the large file system. Do not prefix the extensions with '.'.
>> +
>> +git-p4.largeFileThreshold::
>> +All files with an uncompressed size exceeding the threshold will be
>> +processed by the large file system. By default the threshold is
>> +defined in bytes. Add the suffix k, m, or g to change the unit.
>> +
>> +git-p4.largeFileCompressedThreshold::
>> +All files with a compressed size exceeding the threshold will be
>> +processed by the large file system. This option might slow down
>> +your clone/sync process. By default the threshold is defined in
>> +bytes. Add the suffix k, m, or g to change the unit.
>> +
>> +git-p4.pushLargeFiles::
>> +Boolean variable which defines if large files are automatically
>> +pushed to a server.
>> +
>>  Submit variables
>>  
>>  git-p4.detectRenames::
>> diff --git a/git-p4.py b/git-p4.py
>> index b465356..bfe71b5 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -22,6 +22,8 @@ import 

[PATCH v2] git-p4: improve path encoding verbose output

2015-09-16 Thread larsxschneider
From: Lars Schneider 

Follow up patch for a9e38359 (git-p4: add config git-p4.pathEncoding,
2015-09-03) which is already on 'next'.

diff to v1 (wrongly called v7 in "[PATCH v7] git-p4: add config 
git-p4.pathEncoding")
* make path encoding for utf-8 default case more explicit (thanks Junio!)
* improve commit message (thanks Luke!)

Cheers,
Lars

Lars Schneider (1):
  git-p4: improve path encoding verbose output

 git-p4.py | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

--
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 29/67] use strip_suffix and xstrfmt to replace suffix

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 12:38:12AM -0400, Eric Sunshine wrote:

> > @@ -1524,9 +1525,9 @@ int finish_http_pack_request(struct http_pack_request 
> > *preq)
> > lst = &((*lst)->next);
> > *lst = (*lst)->next;
> >
> > -   tmp_idx = xstrdup(preq->tmpfile);
> > -   strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"),
> > -  ".idx.temp");
> > +   if (!strip_suffix(preq->tmpfile, ".pack.temp", ))
> > +   die("BUG: pack tmpfile does not end in .pack.temp?");
> > +   tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile);
> 
> These instances of repeated replacement code may argue in favor of a
> general purpose replace_suffix() function:
> 
> char *replace_suffix(const char *s, const char *old, const char *new)
> {
> size_t n;
> if (!strip_suffix(s, old, ))
> die("BUG: '%s' does not end with '%s', s, old);
> return xstrfmt("%.*s%s", (int)n, s, new);
> }
> 
> or something.

Yeah, that is tempting, but I think the "die" here is not at all
appropriate in a reusable function. I'd probably write it as:

  char *replace_suffix(const char *s, const char *old, const char *new)
  {
size_t n;
if (!strip_suffix(s, old, ))
return NULL;
return xstrfmt("%.*s%s", (int)n, s, new);
  }

and do:

  tmp_idx = replace_suffix(preq->tmpfile, ".pack.temp", ".idx.temp");
  if (!tmp_idx)
die("BUG: pack tmpfile does not end in .pack.temp?");

but then we are not really saving much. And it is not clear whether
that is even a sane output for replace_suffix. I can easily imagine
three behaviors when we do not end in the original suffix:

  - return NULL to signal error

  - return the original with no replacement

  - return the original with "new" appended

So I'm not sure it makes a good reusable function beyond these three
call-sites.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git-svn: cat-file memory usage

2015-09-16 Thread Victor Leschuk
Hello all,

We are currently getting acquainted with git-svn tool and have experienced few 
problems with it. The main issue is memory usage during "git svn clone": on 
large repositories the perl and git processes are using significant amount of 
memory. 

I have conducted several tests with different repositories. I have created 
mirrors of Trac project (http://trac.edgewall.org/ - rather small repo, ~14000 
commits) and FreeBSD base repo (~28 commits). Here is the summary of my 
tests (to eliminate network issues all clones were performed for file:// repos):

 * git svn clone of trac  takes about 1 hour 
 * git svn clone of FreeBSD has already taken more than 3 days and still 
running (currently has cloned about 40% of revisions)
 * git cat-file process memory footprint keeps growing during the clone process 
(see figure attached)

The main issue here is git cat-file consuming memory. The attached figure is 
for small repository which takes about an hour to clone, however on my another 
machine where FreeBSD clone is currently running the git cat-file has already 
taken more than 1Gb of memory (RSS) and has overgrown the parent perl process 
(~300-400 Mb). 

I have valgrind'ed the git-cat-file (which is running is --batch mode during 
the whole clone) and found no serious leaks (about 100 bytes definitely 
leaked), so all memory is carefully freed, but the heap usage grows maybe due 
to fragmentation or smth else. When I looked through the code I found out that 
most of heap allocations are called from batch_object_write() function 
(strbuf_expand -> realloc).

So I have found two possible workarounds for the issue: 

 * Set GIT_ALLOC_LIMIT variable - it does reduce the memory footprint but slows 
down the process
 * In perl code do not run git cat-file in batch mode (in 
Git::SVN::apply_textdelta) but rather run it as separate commands each time

   my $size = $self->command_oneline('cat-file', '-s', $sha1);
   # .
   my ($in, $c) = $self->command_output_pipe('cat-file', 'blob', $sha1);

The second approach doesn't slow down the whole process at all (~72 minutes to 
clone repo both with --batch mode and without).

So the question is: what would be the correct approach to fight the problem 
with cat-file memory usage: maybe we should get rid of batch mode in perl code, 
or somehow tune allocation policy in C code?

Please let me know your thoughts. 

--
Best Regards,
Victor Leschuk

Re: [PATCH 11/67] trace: use strbuf for quote_crnl output

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 08:55:58PM -0400, Eric Sunshine wrote:

> On Tue, Sep 15, 2015 at 11:28 AM, Jeff King  wrote:
> > When we output GIT_TRACE_SETUP paths, we quote any
> > meta-characters. But our buffer to hold the result is only
> > PATH_MAX bytes, and we could double the size of the input
> > path (if every character needs quoted). We could use a
> 
> s/quoted/to be &/ ...or... s/quoted/quoting/

Thanks, fixed.

> >  static const char *quote_crnl(const char *path)
> >  {
> > -   static char new_path[PATH_MAX];
> > +   static struct strbuf new_path = STRBUF_INIT;
> > const char *p2 = path;
> > -   char *p1 = new_path;
> 
> It's a little sad that this leaves a variable named 'p2' when there is
> no corresponding 'p1'. Would this deserve a cleanup patch which
> renames 'p2' to 'p' or do we not care enough?

Yeah, you're right. The original had symmetry in p1 and p2, in that it
moved them forward together. Now that symmetry is gone, and I wonder if
the simplest cleanup is to just drop "p2" altogether and advance the
"path" source pointer? Like this:

-- >8 --
Subject: [PATCH] trace: use strbuf for quote_crnl output

When we output GIT_TRACE_SETUP paths, we quote any
meta-characters. But our buffer to hold the result is only
PATH_MAX bytes, and we could double the size of the input
path (if every character needs quoting). We could use a
2*PATH_MAX buffer, if we assume the input will never be more
than PATH_MAX. But it's easier still to just switch to a
strbuf and not worry about whether the input can exceed
PATH_MAX or not.

The original copied the "p2" pointer to "p1", advancing
both. Since this gets rid of "p1", let's also drop "p2",
whose name is now confusing. We can just advance the
original "path" pointer.

Signed-off-by: Jeff King 
---
 trace.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/trace.c b/trace.c
index 7393926..4aeea60 100644
--- a/trace.c
+++ b/trace.c
@@ -277,25 +277,24 @@ void trace_performance_fl(const char *file, int line, 
uint64_t nanos,
 
 static const char *quote_crnl(const char *path)
 {
-   static char new_path[PATH_MAX];
-   const char *p2 = path;
-   char *p1 = new_path;
+   static struct strbuf new_path = STRBUF_INIT;
 
if (!path)
return NULL;
 
-   while (*p2) {
-   switch (*p2) {
-   case '\\': *p1++ = '\\'; *p1++ = '\\'; break;
-   case '\n': *p1++ = '\\'; *p1++ = 'n'; break;
-   case '\r': *p1++ = '\\'; *p1++ = 'r'; break;
+   strbuf_reset(_path);
+
+   while (*path) {
+   switch (*path) {
+   case '\\': strbuf_addstr(_path, ""); break;
+   case '\n': strbuf_addstr(_path, "\\n"); break;
+   case '\r': strbuf_addstr(_path, "\\r"); break;
default:
-   *p1++ = *p2;
+   strbuf_addch(_path, *path);
}
-   p2++;
+   path++;
}
-   *p1 = '\0';
-   return new_path;
+   return new_path.buf;
 }
 
 /* FIXME: move prefix to startup_info struct and get rid of this arg */
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 26/67] replace trivial malloc + sprintf /strcpy calls to xstrfmt

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 12:24:38AM -0400, Eric Sunshine wrote:

> On Tue, Sep 15, 2015 at 11:45 AM, Jeff King  wrote:
> > replace trivial malloc + sprintf /strcpy calls to xstrfmt
> 
> s/to/with/
> 
> Also, do you want either to add a space after '/' or drop the one before it?

Thanks, fixed both.

> > --- a/imap-send.c
> > +++ b/imap-send.c
> > @@ -889,9 +889,8 @@ static char *cram(const char *challenge_64, const char 
> > *user, const char *pass)
> > }
> >
> > /* response: " " */
> > -   resp_len = strlen(user) + 1 + strlen(hex) + 1;
> > -   response = xmalloc(resp_len);
> > -   sprintf(response, "%s %s", user, hex);
> > +   response = xstrfmt("%s %s", user, hex);
> > +   resp_len = strlen(response);
> >
> > response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1);
> 
> The original resp_len calculation included the NUL but the revised
> does not. If I'm reading this correctly, the revised calculation is
> correct, and the original was over-allocating response_64, right?

Hrm, I didn't notice that. We actually feed "resp_len" directly to
EVP_EncodeBlock, meaning it it _would_ include the trailing NUL in the
encoded output. So my modification is not wrong memory-wise (we allocate
enough to hold the result), but does produce different output.

I'm not at all convinced that the original is correct, but if we are
going to fix it, it should definitely not be part of this patch. I'll
switch it to:

  resp_len = strlen(response) + 1;

which matches the original.

Thanks for being such a careful reviewer. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"

2015-09-16 Thread Lars Schneider

On 16 Sep 2015, at 00:12, Luke Diamand  wrote:

> On 15/09/15 16:38, Lars Schneider wrote:
>> 
>> On 15 Sep 2015, at 08:43, Luke Diamand  wrote:
>> 
> 
> 
>>> Do we know the mechanism by which we end up in this state?
>> Unfortunately no. I tried hard to reproduce the error with “conventional” 
>> methods. As you can see I ended up manipulating the P4 database…
>> 
>> However, it looks like this error happens in the wild, too:
>> https://stackoverflow.com/questions/5156909/translation-of-file-content-failed-error-in-perforce
>> https://stackoverflow.com/questions/887006/perforce-translation-of-file-content-failed-error
> 
> It's described in the Perforce FAQ here:
> 
> http://answers.perforce.com/articles/KB/3117
> 
> i.e. it looks to be caused by mixing old and new P4 clients.
Good find! No idea why I did not find this article before… I will copy the text 
from the KB into the git commit message to explain the problem.
 
> 
 
 Known issue: This works only if git-p4 is executed in verbose mode.
 In normal mode no exceptions are thrown and git-p4 just exits.
>>> 
>>> Does that mean that the error will only be detected in verbose mode? That 
>>> doesn't seem right!
>> Correct. I don’t like this either but I also don’t want to make huge changes 
>> to git-p4.
>> You can see the root problem here:
>> https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L110-L114
>> 
>> Any idea how to approach that best?
> 
> I guess what we have is not ideal but probably good enough.
ok. thanks!

I will add another test case without “—verbose" to document that there is work 
to do :-)

> 
> 
 +try:
 +text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % 
 (file['depotFile'], file['change'])])
 +except Exception as e:
>>> 
>>> Would it be better to specify which kind of Exception you are catching? 
>>> Looks like you could get OSError, ValueError and CalledProcessError; it's 
>>> the last of these you want (I think).
>> I think it is just a plain exception. See here:
>> https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L111
> 
> OK, you're right (probably less than ideal behaviour from read_pipe() and 
> die() but let's not try to fix that).
ok

> 
> 
 +if p4_version_string().find('/NT') >= 0:
 +text = text.replace('\r\n', '\n')
 +contents = [ text ]
>>> 
>>> The indentation on this bit doesn't look right to me.
>> I believe it is exactly how it was:
>> https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L2397-L2399
> 
> OK.
> 
>> 
>> 
>> In general, what is the appropriate way to reference code in this email 
>> list? Are GitHub links OK?
> 
> I'm not an expert, but it feels possibly a bit ephemeral - if someone is 
> digging through email archives in a future where that github project has been 
> moved elsewhere, the links will all be dead.

Right. However, you could disassemble the URL and use the commit hash, the 
filename and the line number. They are not ephemeral because they are part of 
the repo.

Thanks,
Lars--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: git-svn: cat-file memory usage

2015-09-16 Thread Victor Leschuk
Hello Jeff, thanks for the advice.

Unfortunately using patch didn't change the situation. I will run some tests 
with alternate allocators (looking at jemalloc and tcmalloc). As for alternate 
tools: as far as I understood svn2git calls 'git svn' itself. So I assume it 
can't fix the memory usage or speed up clone process... Correct me if I'm wrong.

Reposurgeon looks interesting... Will give it a try. 

Btw, what do you think of getting rid of batch mode for clone/fetch in perl 
code. It really hardly has any impact on performance but reduces memory usage a 
lot.

--
Best Regards,
Victor

From: Jeff King [p...@peff.net]
Sent: Wednesday, September 16, 2015 4:56 AM
To: Victor Leschuk
Cc: git@vger.kernel.org
Subject: Re: git-svn: cat-file memory usage

On Wed, Sep 16, 2015 at 04:00:48AM -0700, Victor Leschuk wrote:

>  * git svn clone of trac  takes about 1 hour
>  * git svn clone of FreeBSD has already taken more than 3 days and
>  still running (currently has cloned about 40% of revisions)

I haven't worked with git-svn in a long time, but I doubt that it is the
fastest way to do a large repository import. You might want to look into
a tool like svn2git or reposurgeon to do the initial import.

> I have valgrind'ed the git-cat-file (which is running is --batch mode
> during the whole clone) and found no serious leaks (about 100 bytes
> definitely leaked), so all memory is carefully freed, but the heap
> usage grows maybe due to fragmentation or smth else. When I looked
> through the code I found out that most of heap allocations are called
> from batch_object_write() function (strbuf_expand -> realloc).

Certainly we will call strbuf_expand once per object. I would have
expected we would call read_sha1_file(), too. It looks like we always
try to stream blobs, but I think we have to fall back to reading the
whole object if there are deltas.

You can try this patch, which will reuse the same strbuf over and over:

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 07baad1..73f338c 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -256,7 +256,7 @@ static void print_object_or_die(struct batch_options *opt, 
struct expand_data *d
 static void batch_object_write(const char *obj_name, struct batch_options *opt,
   struct expand_data *data)
 {
-   struct strbuf buf = STRBUF_INIT;
+   static struct strbuf buf = STRBUF_INIT;

if (sha1_object_info_extended(data->sha1, >info, 
LOOKUP_REPLACE_OBJECT) < 0) {
printf("%s missing\n", obj_name ? obj_name : 
sha1_to_hex(data->sha1));
@@ -264,10 +264,10 @@ static void batch_object_write(const char *obj_name, 
struct batch_options *opt,
return;
}

+   strbuf_reset();
strbuf_expand(, opt->format, expand_format, data);
strbuf_addch(, '\n');
batch_write(opt, buf.buf, buf.len);
-   strbuf_release();

if (opt->print_contents) {
print_object_or_die(opt, data);

That will reduce your reallocs due to strbuf_expand, though I'm doubtful
that it will solve the problem (and if it does, I think the right
solution is probably to look into using a better allocator than what
your system malloc() is providing).

>  * In perl code do not run git cat-file in batch mode (in
>  Git::SVN::apply_textdelta) but rather run it as separate commands
>  each time
>
>my $size = $self->command_oneline('cat-file', '-s', $sha1);
># .
>my ($in, $c) = $self->command_output_pipe('cat-file', 'blob', $sha1);
>
> The second approach doesn't slow down the whole process at all (~72
> minutes to clone repo both with --batch mode and without).

I'm surprised the startup cost of the process doesn't make an impact,
but maybe it gets lost in the noise of the rest of the work (AFAICT, the
point of this cat-file is to retrieve a blob, apply a delta to it, and
then write out the resulting object; that write is probably a lot more
expensive).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/7] git-p4: add support for large file systems

2015-09-16 Thread Eric Sunshine
On Wed, Sep 16, 2015 at 02:05:40PM +0200, Lars Schneider wrote:
> On 16 Sep 2015, at 10:36, Luke Diamand  wrote:
> > On 14/09/15 14:26, larsxschnei...@gmail.com wrote:
> >> +test_file_in_mock () {
> >> +  FILE="$1"
> > Missing &&
> > Plus the next few lines
> Are they strictly necessary? I believe you can define variables all
> in ?one line?. I found it like that in existing tests:

The reason for keeping the &&-chain intact even on these variable
assignment lines is to future-proof it against some programmer coming
along and inserting new code above or in between the assignment
lines, and not realizing that the &&-chain is not intact.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/67] strbuf: make strbuf_complete_line more generic

2015-09-16 Thread Eric Sunshine
On Wed, Sep 16, 2015 at 05:57:41AM -0400, Jeff King wrote:
> On Tue, Sep 15, 2015 at 06:27:49PM -0700, Junio C Hamano wrote:
> > Eric Sunshine  writes:
> > >> +static inline void strbuf_complete(struct strbuf *sb, char term)
> > >> +{
> > >> +   if (sb->len && sb->buf[sb->len - 1] != term)
> > >> +   strbuf_addch(sb, term);
> > >> +}
> > >
> > > Hmm, so this only adds 'term' if not already present *and* if 'sb' is
> > > not empty, which doesn't seem to match the documentation which says
> > > that it "ensures" termination.
> > [...]
> > So to these two plausible and different set of callers that would be
> > helped by this function, the behaviour Peff gives it would match
> > what the callers want better than your version.
> 
> Right. I think what the function is doing is the right thing (and
> certainly it matches what the callers I'm changing are doing already
> :) ).
> 
> But I agree the docstring is extremely misleading. I've changed it to:
>  
> +/**
> + * "Complete" the contents of `sb` by ensuring that either it ends with the
> + * character `term`, or it is empty.  This can be used, for example,
> + * to ensure that text ends with a newline, but without creating an empty
> + * blank line if there is no content in the first place.
> + */

Sounds better, thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/67] trace: use strbuf for quote_crnl output

2015-09-16 Thread Eric Sunshine
On Wed, Sep 16, 2015 at 06:31:14AM -0400, Jeff King wrote:
> On Tue, Sep 15, 2015 at 08:55:58PM -0400, Eric Sunshine wrote:
> > On Tue, Sep 15, 2015 at 11:28 AM, Jeff King  wrote:
> > >  static const char *quote_crnl(const char *path)
> > >  {
> > > -   static char new_path[PATH_MAX];
> > > +   static struct strbuf new_path = STRBUF_INIT;
> > > const char *p2 = path;
> > > -   char *p1 = new_path;
> > 
> > It's a little sad that this leaves a variable named 'p2' when there is
> > no corresponding 'p1'. Would this deserve a cleanup patch which
> > renames 'p2' to 'p' or do we not care enough?
> 
> Yeah, you're right. The original had symmetry in p1 and p2, in that it
> moved them forward together. Now that symmetry is gone, and I wonder if
> the simplest cleanup is to just drop "p2" altogether and advance the
> "path" source pointer? Like this:

Works for me. The diff is a bit noisier, but the final result feels clean.

> -- >8 --
> Subject: [PATCH] trace: use strbuf for quote_crnl output
> 
> When we output GIT_TRACE_SETUP paths, we quote any
> meta-characters. But our buffer to hold the result is only
> PATH_MAX bytes, and we could double the size of the input
> path (if every character needs quoting). We could use a
> 2*PATH_MAX buffer, if we assume the input will never be more
> than PATH_MAX. But it's easier still to just switch to a
> strbuf and not worry about whether the input can exceed
> PATH_MAX or not.
> 
> The original copied the "p2" pointer to "p1", advancing
> both. Since this gets rid of "p1", let's also drop "p2",
> whose name is now confusing. We can just advance the
> original "path" pointer.
> 
> Signed-off-by: Jeff King 
> ---
>  trace.c | 23 +++
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/trace.c b/trace.c
> index 7393926..4aeea60 100644
> --- a/trace.c
> +++ b/trace.c
> @@ -277,25 +277,24 @@ void trace_performance_fl(const char *file, int line, 
> uint64_t nanos,
>  
>  static const char *quote_crnl(const char *path)
>  {
> - static char new_path[PATH_MAX];
> - const char *p2 = path;
> - char *p1 = new_path;
> + static struct strbuf new_path = STRBUF_INIT;
>  
>   if (!path)
>   return NULL;
>  
> - while (*p2) {
> - switch (*p2) {
> - case '\\': *p1++ = '\\'; *p1++ = '\\'; break;
> - case '\n': *p1++ = '\\'; *p1++ = 'n'; break;
> - case '\r': *p1++ = '\\'; *p1++ = 'r'; break;
> + strbuf_reset(_path);
> +
> + while (*path) {
> + switch (*path) {
> + case '\\': strbuf_addstr(_path, ""); break;
> + case '\n': strbuf_addstr(_path, "\\n"); break;
> + case '\r': strbuf_addstr(_path, "\\r"); break;
>   default:
> - *p1++ = *p2;
> + strbuf_addch(_path, *path);
>   }
> - p2++;
> + path++;
>   }
> - *p1 = '\0';
> - return new_path;
> + return new_path.buf;
>  }
>  
>  /* FIXME: move prefix to startup_info struct and get rid of this arg */
> -- 
> 2.6.0.rc2.408.ga2926b9
> 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 29/67] use strip_suffix and xstrfmt to replace suffix

2015-09-16 Thread Eric Sunshine
On Wed, Sep 16, 2015 at 06:50:38AM -0400, Jeff King wrote:
> On Wed, Sep 16, 2015 at 12:38:12AM -0400, Eric Sunshine wrote:
> > > @@ -1524,9 +1525,9 @@ int finish_http_pack_request(struct 
> > > http_pack_request *preq)
> > > lst = &((*lst)->next);
> > > *lst = (*lst)->next;
> > >
> > > -   tmp_idx = xstrdup(preq->tmpfile);
> > > -   strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"),
> > > -  ".idx.temp");
> > > +   if (!strip_suffix(preq->tmpfile, ".pack.temp", ))
> > > +   die("BUG: pack tmpfile does not end in .pack.temp?");
> > > +   tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile);
> > 
> > These instances of repeated replacement code may argue in favor of a
> > general purpose replace_suffix() function:
> > 
> > char *replace_suffix(const char *s, const char *old, const char *new)
> > {
> > size_t n;
> > if (!strip_suffix(s, old, ))
> > die("BUG: '%s' does not end with '%s', s, old);
> > return xstrfmt("%.*s%s", (int)n, s, new);
> > }
> > 
> > or something.
> 
> Yeah, that is tempting, but I think the "die" here is not at all
> appropriate in a reusable function. I'd probably write it as:
> 
>   char *replace_suffix(const char *s, const char *old, const char *new)
>   {
>   size_t n;
>   if (!strip_suffix(s, old, ))
>   return NULL;
>   return xstrfmt("%.*s%s", (int)n, s, new);
>   }
> 
> and do:
> 
>   tmp_idx = replace_suffix(preq->tmpfile, ".pack.temp", ".idx.temp");
>   if (!tmp_idx)
>   die("BUG: pack tmpfile does not end in .pack.temp?");
> 
> but then we are not really saving much. And it is not clear whether
> that is even a sane output for replace_suffix. I can easily imagine
> three behaviors when we do not end in the original suffix:
> 
>   - return NULL to signal error
> 
>   - return the original with no replacement
> 
>   - return the original with "new" appended
> 
> So I'm not sure it makes a good reusable function beyond these three
> call-sites.

Indeed. I had a "gently" version in mind to satisfy callers not
interested in dying, but it's so little code that it likely isn't
worth it.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/7] git-p4: add support for large file systems

2015-09-16 Thread Junio C Hamano
Lars Schneider  writes:

>>> +git-p4.largeFileSystem::
>>> +   Specify the system that is used for large (binary) files. Please note
>>> +   that large file systems do not support the 'git p4 submit' command.
>> 
>> Why is that? Is it just that you haven't implemented support, or
>> is it fundamentally impossible?
>
> If we detect LFS files only by file extension then we could make
> it work. But then we must not use any git-p4 settings. We would
> need to rely only on the “.gitattributes” file that is stored in
> the P4 repository. My implementation also looks at the file size
> and decides on a individual file basis if a file is stored in
> LFS. That means all clients need the same file size threshold.
>
> Junio explained the problem in the v4 thread:
>> ...

Hmm, I am not sure if Luke's question was answered with the above,
and I do not think I explained anything, either.  I did point out
that with _your_ code I didn't see how "submit" would not work, but
that is quite different from the problem being fundamentally not
solvable.

>>> +test_file_in_mock () {
>>> +   FILE="$1"
>> Missing &&
>> Plus the next few lines

> Are they strictly necessary? I believe you can define variables all in “one 
> line”

Absolutely.  Making multiple assignments as a single statment like

X=A Y=B Z=C &&
...

is fine, but do not break &&-chain.

>> 
>>> +   CONTENT="$2"
>>> +   LOCAL_STORAGE=".git/mock-storage/local/$CONTENT"
>>> +   SERVER_STORAGE=".git/mock-storage/remote/$CONTENT"

... and to be sure, these 'echo' also must not break &&-chain.

>>> +   echo "pointer-$CONTENT" >expect_pointer
>>> +   echo "$CONTENT" >expect_content

>>> +test_file_count_in_dir () {
>>> +   DIR="$1"
>>> +   EXPECTED_COUNT="$2"
>>> +   find "$DIR" -type f >actual

... so are these.

>>> +   test_line_count = $EXPECTED_COUNT actual
>>> +}


Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time

2015-09-16 Thread Junio C Hamano
Michael Haggerty  writes:

> I'm not sure what behavior you want. At one point you say "puts the file
> to a final place if it is non-empty" but later you say "leave it if
> non-empty". Should the file be written directly, or should it be written
> to a lockfile and renamed into place only when complete?

I do not think we care that deeply either way, as we do not want to
run multiple auto-gc's at the same time in the first place.  So either
one of the following is perfectly fine:

 * We open the final destination directly, but with O_CREAT|O_EXCL,
   which as a side effect also detects the simultanous execution
   [*1*].  We do not do any "finalizing rename" if we go this route.
   When we are done, close it and remove it if we did not write
   anything, or leave it there if we did write something.

 * We open a lockfile the usual way.  When we are done, close it and
   and remove it if we did not write anything, or finalize it by
   renaming it if we did write something.

I think Duy's code wants to do the latter.

> This doesn't seem like a common thing to want (as in, this might be the
> only caller), but it probably makes sense to build it into the
> tempfile/lockfile API nevertheless, because implementing it externally
> would require a lot of other code to be duplicated.
>
> Another possibility that might work (maybe without requiring changes to
> tempfile/lockfile): don't worry about deleting the log file if it is
> empty, but make observers treat an empty log file the same as an absent one.

Probably your "don't remove and check for emptiness" approach would
be the simpler of the two, but I think we can go either way.

Thanks.

[Footnote]

*1* But one "gc" could be foreground gc that does not write to a log
file and the other "gc" could be background that does write to a
log file, so this cannot be the primary way to avoid double
execution.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> On Wed, Sep 16, 2015 at 10:15:02AM +0200, Johannes Schindelin wrote:
>
>> Maybe we should stick to the established practice of the many, many
>> reentrant POSIX functions following the *_r() naming convention? I.e.
>> the reentrant version of localtime() is called localtime_r(), the
>> reentrant version of random() is called random_r() etc.
>> 
>> So I could see myself not needing an explanation if I had read
>> sha1_to_hex_r(...).
>
> I like this suggestion. By itself, the "_r" does not communicate as much
> as "_to" to me, but as long as the reader knows the "_r" idiom, it
> communicates much more.
>
> I'll switch to this unless there is any objection.

Fine by me.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Sep 2015, #03; Mon, 14)

2015-09-16 Thread Junio C Hamano
"Philip Oakley"  writes:

> Oops..
> ...
>> Shall I just rework/resend the V2 patch-up ($gmane/277829) that also
>> links to 'merge's' usage as a fresh patch (would be tonight UK)?
>
> I now see that the full V2 patch is already there at 4934a96.

OK.  I was wondering what I missed.  4934a96^2 is a copy of your v2.

> I'd mistakenly just compared your note to the slightly fuller V2 commit
> message and in the morning rush hadn't had time to check.
>
> Sorry for the noise.

Thanks for being careful.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-svn: cat-file memory usage

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 06:40:23AM -0700, Victor Leschuk wrote:

> Unfortunately using patch didn't change the situation. I will run some
> tests with alternate allocators (looking at jemalloc and tcmalloc). As
> for alternate tools: as far as I understood svn2git calls 'git svn'
> itself. So I assume it can't fix the memory usage or speed up clone
> process... Correct me if I'm wrong.

I think there are actually several tools calling themselves svn2git.
There was a C tool once upon a time, but it looks fairly inactive, and
the top search hit for svn2git does turn up a git-svn wrapper. Like I
said, I am not very up on the current state of affairs.

> Btw, what do you think of getting rid of batch mode for clone/fetch in
> perl code. It really hardly has any impact on performance but reduces
> memory usage a lot.

I'd worry there are other cases where it does impact performance (e.g.,
perhaps smaller blobs), but I don't know enough about the git-svn
internals to say much more.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-16 Thread Junio C Hamano
Johannes Schindelin  writes:

> Hi Peff,
>
> On 2015-09-15 17:24, Jeff King wrote:
>> Commit 02976bf (fsck: introduce `git fsck --connectivity-only`,
>> 2015-06-22) recently gave fsck an option to perform only a
>> subset of the checks, by skipping the fsck_object_dir()
>> call. However, it does so only for the local object
>> directory, and we still do expensive checks on any alternate
>> repos. We should skip them in this case, too.
>> 
>> Signed-off-by: Jeff King 
>
> ACK!

Thanks, both.

Peff, I am inclined to take at least 1 and 4 outside the context of
this series and queue them on their own topics.  I do not think
either is too urgent to be in 2.6, but on the other hand they look
both trivially correct (that is a famous last word that often comes
back and haunt us, though), so...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-16 Thread Eric Sunshine
On Wed, Sep 16, 2015 at 3:12 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> Speaking of which, how do you want the next round of patches? I'm
>> hesitant to spam the list with 67 patches again, when only a fraction
>> have changed (and for all but the _to/_r thing, I've posted my changes
>> already).
>
> Cannot tell yet, as I am only halfway thru myself.

I'm also only about halfway through, plus trying dealing with other topics...

> If there is a
> significant update based on discussions, it may be worth sending
> only those so that you can sooner make sure that the resulting
> change and those who reviewed the first iteration are all on the
> same page, but a full resend, before giving enouth time to those who
> are willing to but have not found time to review the whole thing,
> would be a wasted mental bandwidth for everybody, I suspect.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 11:04:49AM -0700, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Hi Peff,
> >
> > On 2015-09-15 17:24, Jeff King wrote:
> >> Commit 02976bf (fsck: introduce `git fsck --connectivity-only`,
> >> 2015-06-22) recently gave fsck an option to perform only a
> >> subset of the checks, by skipping the fsck_object_dir()
> >> call. However, it does so only for the local object
> >> directory, and we still do expensive checks on any alternate
> >> repos. We should skip them in this case, too.
> >> 
> >> Signed-off-by: Jeff King 
> >
> > ACK!
> 
> Thanks, both.
> 
> Peff, I am inclined to take at least 1 and 4 outside the context of
> this series and queue them on their own topics.  I do not think
> either is too urgent to be in 2.6, but on the other hand they look
> both trivially correct (that is a famous last word that often comes
> back and haunt us, though), so...

Yeah, they are conceptually their own topics, and I do not mind doing it
that way. Note that a later patch in the sprintf-audit topic touches the
same spot in fsck, and we'll get a nasty conflict if they are done
separately.

Speaking of which, how do you want the next round of patches? I'm
hesitant to spam the list with 67 patches again, when only a fraction
have changed (and for all but the _to/_r thing, I've posted my changes
already).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git configure/make does not honor ARFLAGS

2015-09-16 Thread Eric Sunshine
On Sun, Sep 13, 2015 at 10:52:36PM -0700, Junio C Hamano wrote:
> On Sun, Sep 13, 2015 at 9:59 PM, Jeff King  wrote:
> >
> > My follow-up question was going to be: is this something we should be
> > setting in config.mak.uname for appropriate versions of Darwin? It
> > wasn't clear to me from Eric's description if this is something that
> > particular versions need, or just something that people who want to
> > build Universal binaries would choose to use.
> 
> My preference is not to worry anything about config.mak.uname
> ourselves, until somebody who does work on the ports proposes
> to do something concrete.

Normal 'ar' works for non-multi-architecture-binaries (MAB);
'libtool' is only needed when building Universal. Unfortunately,
there probably isn't a reliable way to auto-detect a Universal build.
Back in the NextStep days, projects would support MAB via a
TARGET_ARCHS variable:

make TARGET_ARCHS='m68k i386 sparc hppa'

And, for project's which didn't understand that, you'd just have to
specify build flags which the Makefile did understand:

make CFLAGS='-arch ppc -arch i386' LDFLAGS='-arch ppc -arch i386'

or, just make ad-hoc modifications to the Makefile if it didn't even
respect those variables. So, I don't think there's really a good way
to detect MAB builds.

On the other hand, as far as I know, it's *always* safe to replace
'ar' with 'libtool' on this platform, so we could just do it
unconditionally.

--- 8< ---
diff --git a/config.mak.uname b/config.mak.uname
index be5cbec..e7970dd 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -88,6 +88,8 @@ ifeq ($(uname_S),SCO_SV)
TAR = gtar
 endif
 ifeq ($(uname_S),Darwin)
+   AR = libtool
+   ARFLAGS = -static -o
NEEDS_CRYPTO_WITH_SSL = YesPlease
NEEDS_SSL_WITH_CRYPTO = YesPlease
NEEDS_LIBICONV = YesPlease
--- 8< ---

I've tested this on modern Mac OS X, Yosemite 10.10.5 (x86_64), and
ancient Snow Leopard 10.5.8 PowerPC (circa 2009), and it works fine
in both cases, so perhaps that's the way to go.

My one concern, however, would be people who've installed GNU libtool
and have that in PATH before Apple's tools.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] l10n: de.po: translate 2 messages

2015-09-16 Thread Phillip Sz
Acked-by: Phillip Sz 

> Signed-off-by: Ralf Thielow 
> ---
>  po/de.po | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/po/de.po b/po/de.po
> index e5b523d..c9b4d16 100644
> --- a/po/de.po
> +++ b/po/de.po
> @@ -10530,9 +10530,8 @@ msgstr ""
>  "hash[=]] [--abbrev[=]] [--tags] [--heads] [--] [...] "
>  
>  #: builtin/show-ref.c:11
> -#, fuzzy
>  msgid "git show-ref --exclude-existing[=] < "
> -msgstr "git show-ref --exclude-existing[=Muster] < ref-list"
> +msgstr "git show-ref --exclude-existing[=] < "
>  
>  #: builtin/show-ref.c:170
>  msgid "only show tags (can be combined with heads)"
> @@ -10761,9 +10760,8 @@ msgid "replace the tag if exists"
>  msgstr "das Tag ersetzen, wenn es existiert"
>  
>  #: builtin/tag.c:609 builtin/update-ref.c:368
> -#, fuzzy
>  msgid "create a reflog"
> -msgstr "create_reflog"
> +msgstr "Reflog erstellen"
>  
>  #: builtin/tag.c:611
>  msgid "Tag listing options"
> 




signature.asc
Description: OpenPGP digital signature


Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> Speaking of which, how do you want the next round of patches? I'm
> hesitant to spam the list with 67 patches again, when only a fraction
> have changed (and for all but the _to/_r thing, I've posted my changes
> already).

Cannot tell yet, as I am only halfway thru myself.  If there is a
significant update based on discussions, it may be worth sending
only those so that you can sooner make sure that the resulting
change and those who reviewed the first iteration are all on the
same page, but a full resend, before giving enouth time to those who
are willing to but have not found time to review the whole thing,
would be a wasted mental bandwidth for everybody, I suspect.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> I think a core file is even more useful than a backtrace. Having BUG()
> call abort() would be even more useful, I think.

Sounds like a plan ;-)
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 23/67] add_packed_git: convert strcpy into xsnprintf

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> + alloc = path_len + strlen(".pack") + 1;
> + p = alloc_packed_git(alloc);
> + memcpy(p->pack_name, path, path_len); /* NUL from zero-ed struct */

This comment is confusing, isn't it?  Yes, there is a NUL, but you
will going to overwrite it with "." in ".keep" immediately and more
importantly, that overwriting does not depend on NUL being there.

What's more important to comment on would probably be the line that
computes the "alloc".  It uses ".pack" but that is because it knows
that is the longest suffix we care about, and that deserves mention
more than the NUL termination of intermediate result that does not
matter, no?

If ".bitmap" were in the set of suffixes we care about (it isn't),
we would be using that instead when measuring "alloc".

Thanks.

> +
> + xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
>   if (!access(p->pack_name, F_OK))
>   p->pack_keep = 1;
>  
> - strcpy(p->pack_name + path_len, ".pack");
> + xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
>   if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) {
>   free(p);
>   return NULL;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Stefan Beller
On Wed, Sep 16, 2015 at 12:07 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> On Wed, Sep 16, 2015 at 11:24:27AM -0700, Junio C Hamano wrote:
>>
>>> Jeff King  writes:
>>>
>>> > On Tue, Sep 15, 2015 at 11:19:21PM -0400, Eric Sunshine wrote:
>>> >
>>> >> > strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
>>> >> > -   sprintf(ownbuf[stage], "%o", ce->ce_mode);
>>> >> > +   xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
>>> >> > ce->ce_mode);
>>> >>
>>> >> Interesting. I wonder if there are any (old/broken) compilers which
>>> >> would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead?
>>> >
>>> > Good point. I've changed it to sizeof(ownbuf[0]).
>>>
>>> Panda brain is lost here.  What's the difference, other than that we
>>> will now appear to be measuring the size of the thing at index 0
>>> while using that size to stuff data into a different location?  All
>>> elements of the array are of the same size so there wouldn't be any
>>> difference either way, no?
>>
>> Correct. The original is sane and gcc does the right thing. The question
>> is whether some compiler would complain that "stage" is not a constant
>> in the sizeof() expression. I don't know if any compiler would do so,
>> but it is easy enough to be conservative.
>
> Wouldn't such a compiler also complain if you did this, though?
>
> int *pointer_to_int;
> size_t sz = sizeof(*pointer_to_int);
>
> You (as a complier) do not know exactly where ownbuf[stage] is,
> because "stage" is unknown to you.  In the same way, you do not know
> exactly where *pointer_to_int is.  But of course, what the sizeof()
> operator is being asked is the size of the thing, which does not
> depend on where the thing is.  If you (as a compiler) does not know
> that and complain to ownbuf[stage], wouldn't you complain to the
> pointer dereference, too?
>
> A more important reason I am reluctant to see this:
>
> xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);
>
> is that it looks strange in the same way as this
>
> memcpy(ownbuf[stage], src, sizeof(ownbuf[0]));
>
> looks strange.  "We use the size of one thing to stuff into another".
>
> That will make future readers wonder "Is this a typo, and if it is,
> which index is a mistake I can fix?" and may lead to an unnecessary
> confusion.  I do not want to see a correctly written
>
> xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);
>
> turned into
>
> xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", ...);
>
> out of such a confusion.

So we could just not use the bracket notation, but the pointer then?

xsnprintf(ownbuf[stage], sizeof(*ownbuf), "%o", ...);

IMHO that would reasonably well tell you that we just care about the
size of one element there.

A funny thought:

 xsnprintf(ownbuf[stage], sizeof(ownbuf[-1]), "%o", ...);

should work as well as any reader would question the sanity
of a negative index.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 03:14:24PM -0400, Eric Sunshine wrote:

> On Wed, Sep 16, 2015 at 3:12 PM, Junio C Hamano  wrote:
> > Jeff King  writes:
> >
> >> Speaking of which, how do you want the next round of patches? I'm
> >> hesitant to spam the list with 67 patches again, when only a fraction
> >> have changed (and for all but the _to/_r thing, I've posted my changes
> >> already).
> >
> > Cannot tell yet, as I am only halfway thru myself.
> 
> I'm also only about halfway through, plus trying dealing with other topics...

OK, I will hold back a resend for a few days, then.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git rebase -i failing due to phantom index.lock

2015-09-16 Thread Giuseppe Bilotta
Hello all,

I've recently started to note an issue with git rebase -i failing with
alarming frequency, especially on one of my repositories, claiming
that index.lock could not be created because it exists, even though it
doesn't and nothing else seems to be locking the index. The rebase
bombs more usually during the initial checkout than during any other
of the steps.

The problem is somewhat randomly reproducible, as it doesn't happen
100% of the time. It also seems to happen more consistently with more
recent git versions, or at least with my custom built git than with
the distro-shipped one.

A somewhat problematic git bisect has allowed me to identify commit
03b86647722f11ccc321cd7279aa49b811d17cc2 as the first bad commit.

The problem has all signs of a timing issue, which I'm having problems
isolating, although simply providing a timeout on the index lock calls
seems to fix it. Making git stall instead of die on error allowed me
to obtain this backtrace which I suspect will not be particularly
useful:

#0 0x004c4666 in unable_to_lock_message
(path=path@entry=0x1bad940 ".git/index", err=,
buf=buf@entry=0x7ffd3b990900) at lockfile.c:158
#1 0x004c46c6 in unable_to_lock_die (path=path@entry=0x1bad940
".git/index", err=) at lockfile.c:167
#2 0x004c480b in hold_lock_file_for_update_timeout
(lk=lk@entry=0x1bd7be0, path=0x1bad940 ".git/index", flags=, timeout_ms=timeout_ms@entry=0) at lockfile.c:177
#3 0x004e6e2a in hold_lock_file_for_update (flags=1,
path=, lk=0x1bd7be0) at lockfile.h:165
#4 hold_locked_index (lk=lk@entry=0x1bd7be0,
die_on_error=die_on_error@entry=1) at read-cache.c:1411
#5 0x00420cb0 in merge_working_tree (old=0x7ffd3b990a20,
old=0x7ffd3b990a20, new=0x7ffd3b990a00, new=0x7ffd3b990a00,
writeout_error=0x7ffd3b9909c0, opts=0x7ffd3b992a31)
at builtin/checkout.c:471
#6 switch_branches (new=0x7ffd3b990a00, opts=0x7ffd3b992a31) at
builtin/checkout.c:826
#7 checkout_branch (new=0x7ffd3b990a00, opts=0x7ffd3b992a31) at
builtin/checkout.c:1123
#8 cmd_checkout (argc=, argv=,
prefix=) at builtin/checkout.c:1273
#9 0x00405e7e in run_builtin (argv=0x7ffd3b992480, argc=2,
p=0x7acab0 ) at git.c:350
#10 handle_builtin (argc=2, argv=0x7ffd3b992480) at git.c:535
#11 0x00405021 in run_argv (argv=0x7ffd3b9922d8,
argcp=0x7ffd3b9922bc) at git.c:581
#12 main (argc=2, av=) at git.c:689

Additional information: I have the bash git prompt enabled, I do not
have anything else accessing that repository at the same time, and the
repository is “spread out” across multiple workdirs created with git
new-workdir, each with a different branch checked out. It is also
usually accessed under a symlink (cd ~/shortcut/repo/branch rather
than ~/full/path/to/repo/branch). Running Debian unstable with Linux
4.1.0 on an SSD-backed ext4 partition mounted with data=ordered. I
can't thnk of any other detail that might be even just remotely
involved in this odd issue. Any suggestions on how to debug it are
welcome.

Best regards,

-- 
Giuseppe "Oblomov" Bilotta
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> On Wed, Sep 16, 2015 at 11:24:27AM -0700, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> > On Tue, Sep 15, 2015 at 11:19:21PM -0400, Eric Sunshine wrote:
>> >
>> >> > strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
>> >> > -   sprintf(ownbuf[stage], "%o", ce->ce_mode);
>> >> > +   xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
>> >> > ce->ce_mode);
>> >> 
>> >> Interesting. I wonder if there are any (old/broken) compilers which
>> >> would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead?
>> >
>> > Good point. I've changed it to sizeof(ownbuf[0]).
>> 
>> Panda brain is lost here.  What's the difference, other than that we
>> will now appear to be measuring the size of the thing at index 0
>> while using that size to stuff data into a different location?  All
>> elements of the array are of the same size so there wouldn't be any
>> difference either way, no?
>
> Correct. The original is sane and gcc does the right thing. The question
> is whether some compiler would complain that "stage" is not a constant
> in the sizeof() expression. I don't know if any compiler would do so,
> but it is easy enough to be conservative.

Wouldn't such a compiler also complain if you did this, though?

int *pointer_to_int;
size_t sz = sizeof(*pointer_to_int);

You (as a complier) do not know exactly where ownbuf[stage] is,
because "stage" is unknown to you.  In the same way, you do not know
exactly where *pointer_to_int is.  But of course, what the sizeof()
operator is being asked is the size of the thing, which does not
depend on where the thing is.  If you (as a compiler) does not know
that and complain to ownbuf[stage], wouldn't you complain to the
pointer dereference, too?

A more important reason I am reluctant to see this:

xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);

is that it looks strange in the same way as this

memcpy(ownbuf[stage], src, sizeof(ownbuf[0]));

looks strange.  "We use the size of one thing to stuff into another".

That will make future readers wonder "Is this a typo, and if it is,
which index is a mistake I can fix?" and may lead to an unnecessary
confusion.  I do not want to see a correctly written

xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);

turned into

xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", ...);

out of such a confusion.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 30/67] ref-filter: drop sprintf and strcpy calls

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> The ref-filter code comes from for-each-ref, and inherited a
> number of raw sprintf and strcpy calls. These are generally
> all safe, as we custom-size the buffers, or are formatting
> numbers into sufficiently large buffers. But we can make the
> resulting code even simpler and more obviously correct by
> using some of our helper functions.
>
> Signed-off-by: Jeff King 
> ---
>  ref-filter.c | 70 
> +++-
>  1 file changed, 22 insertions(+), 48 deletions(-)

The end result indeed is much easier to read.  Looks good.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git configure/make does not honor ARFLAGS

2015-09-16 Thread Eric Sunshine
On Wed, Sep 16, 2015 at 3:38 PM, Eric Sunshine  wrote:
> On the other hand, as far as I know, it's *always* safe to replace
> 'ar' with 'libtool' on this platform, so we could just do it
> unconditionally.
>
> --- 8< ---
>  ifeq ($(uname_S),Darwin)
> +   AR = libtool
> +   ARFLAGS = -static -o
> --- 8< ---
>
> I've tested this on modern Mac OS X, Yosemite 10.10.5 (x86_64), and
> ancient Snow Leopard 10.5.8 PowerPC (circa 2009), and it works fine
> in both cases, so perhaps that's the way to go.
>
> My one concern, however, would be people who've installed GNU libtool
> and have that in PATH before Apple's tools.

Although, perhaps specifying the full path to 'libtool' would be
sufficient so as not to worry about GNU libtool being picked up by
accident. Apple's command resides at /usr/bin/libtool on both modern
and ancient Mac OS X, so maybe that's all we need.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> + free(file);
> + file = xstrfmt("%s/%s", maildir, list.items[i].string);

Repeated pattern makes one wonder if a thin wrapper

xstrfmt_to(, "%s/%s", maildir, list.items[i].string);

that first frees the existing value and then overwrites is an
overall win.  Perhaps not, as you would (1) initialize the variable
to NULL before doing a series of xstrfmt_to(), and (2) free the final
one yourself.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Sep 15, 2015 at 11:19:21PM -0400, Eric Sunshine wrote:
>
>> > strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
>> > -   sprintf(ownbuf[stage], "%o", ce->ce_mode);
>> > +   xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
>> > ce->ce_mode);
>> 
>> Interesting. I wonder if there are any (old/broken) compilers which
>> would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead?
>
> Good point. I've changed it to sizeof(ownbuf[0]).

Panda brain is lost here.  What's the difference, other than that we
will now appear to be measuring the size of the thing at index 0
while using that size to stuff data into a different location?  All
elements of the array are of the same size so there wouldn't be any
difference either way, no?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 36/67] remote-ext: simplify git pkt-line generation

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

>  static void send_git_request(int stdin_fd, const char *serv, const char 
> *repo,
>   const char *vhost)
>  {
> - size_t bufferspace;
> - size_t wpos = 0;
> - char *buffer;
> + struct strbuf buffer = STRBUF_INIT;
>  
> - /*
> -  * Request needs 12 bytes extra if there is vhost ( \0host=\0) and
> -  * 6 bytes extra ( \0) if there is no vhost.
> -  */
> + /* Generate packet with a dummy size header */
> + strbuf_addf(, "%s %s%c", serv, repo, 0);
>   if (vhost)
> - bufferspace = strlen(serv) + strlen(repo) + strlen(vhost) + 12;
> - else
> - bufferspace = strlen(serv) + strlen(repo) + 6;
> + strbuf_addf(, "host=%s%c", vhost, 0);
>  
> - if (bufferspace > 0x)

> + /* Now go back and fill in the size */
> + if (buffer.len > 0x)
>   die("Request too large to send");
> + xsnprintf(buffer.buf, buffer.alloc, "%04x", (unsigned)buffer.len);

So we now write "something something\0host=something" into the buffer
and then try to overwrite the first four bytes?  Does this xsnprintf()
stop after writing the four hexadecimal, or does it clobber the first
byte of the payload (i.e. copy of serv[0]) by a NUL termination?

>  
> + if (write_in_full(stdin_fd, buffer.buf, buffer.len) < 0)
>   die_errno("Failed to send request");
>  
> + strbuf_release();
>  }
>  
>  static int run_child(const char *arg, const char *service)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 23/67] add_packed_git: convert strcpy into xsnprintf

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 11:43:49AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > +   alloc = path_len + strlen(".pack") + 1;
> > +   p = alloc_packed_git(alloc);
> > +   memcpy(p->pack_name, path, path_len); /* NUL from zero-ed struct */
> 
> This comment is confusing, isn't it?  Yes, there is a NUL, but you
> will going to overwrite it with "." in ".keep" immediately and more
> importantly, that overwriting does not depend on NUL being there.

Yeah, you're right. I was blindly making sure that the behavior did not
change from the original, without noticing that the original did not
care about the NUL either way.

> What's more important to comment on would probably be the line that
> computes the "alloc".  It uses ".pack" but that is because it knows
> that is the longest suffix we care about, and that deserves mention
> more than the NUL termination of intermediate result that does not
> matter, no?

Agreed. I'll add a comment to that effect.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 12:52:26PM -0700, Junio C Hamano wrote:

> > diff --git a/remote.c b/remote.c
> > index 5ab0f7f..1b69751 100644
> > --- a/remote.c
> > +++ b/remote.c
> > @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
> > int n = 1000;
> > FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
> > char *s, *p;
> > -   int len;
> 
> Hmm, we would punish those with ridiculously long remote name by
> truncating at n but that is OK.

Yeah, though that is nothing new.

In some of the cases, as you've seen, I dug further in cleaning things
up. But in others I did the minimal fix (especially in this case, the
limitations are only about the deprecated "branches" and "remotes"
file), mostly to try to keep the scope of work sane.

> We use buffer[BUFSIZ] to read various things in this file, not just
> $GIT_DIR/branches/* files, with fgets(), which may be better done if
> we switched to strbuf_getline().  Then we can also use trim family
> of calls from the strbuf API suite.
>
> Move to strbuf_getline() may be a doubly attractive proposition,
> with a possible change to strbuf_getline() to make it also remove CR
> that immediately precedes LF [*1*], helping DOSsy platforms.

I'll take a look and see how painful this is.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 1/3] worktree: add top-level worktree.c

2015-09-16 Thread Mike Rappazzo
On Wed, Sep 16, 2015 at 4:32 PM, Eric Sunshine  wrote:
> On Mon, Sep 14, 2015 at 8:20 AM, Mike Rappazzo  wrote:
>> On Sat, Sep 12, 2015 at 10:39 PM, Eric Sunshine  
>> wrote:
 +struct worktree {
 +   char *path;
 +   char *git_dir;
 +   char *head_ref;
 +   unsigned char head_sha1[20];
 +   int is_detached;
 +   int is_bare;
 +};
 +
 +struct worktree_list {
 +   struct worktree *worktree;
 +   struct worktree_list *next;
 +};
>>>
>>> I don't care too strongly, but an alternate approach (which I probably
>>> would have taken) would be to have get_worktrees() simply return an
>>> array of 'struct worktree' objects, hence no need for the additional
>>> 'struct worktree_list'. The slight complication with this approach,
>>> though, is that get_worktrees() either also needs to return the length
>>> of the array, or the array should end with some sort of end-of-array
>>> sentinel. An obvious sentinel would be path==NULL or git_dir==NULL or
>>> all of the above.
>>>
>>> Client iteration is just about the same with the array approach as
>>> with the linked-list approach.
>>
>> I can't see what benefit this would provide.  I would sooner change
>> the returned list into
>> an array-backed list struct.  Alternatively, I think adding a
>> list_head pointer to this structure
>> could benefit client code.
>
> The benefit is a reduction in complexity, which is an important goal.
> Linked lists are inherently more complex than arrays, requiring extra
> effort, and especially extra care, to manage the head, each "next"
> pointer (and possibly a tail pointer). Arrays are used often in Git,
> thus are familiar in this code-base, and Git has decent support for
> making their management relatively painless. Your suggestions of
> changing into "an array-backed list structure" or "adding list_head
> pointer" increase complexity, rather than reduce it.
>
> Aside from the complexity issue, arrays allow random access, whereas
> linked lists allow only sequential access. While this limitation may
> not impact your current, planned use of get_worktrees(), it places a
> potentially unnecessary restriction on future clients.
>
> And, as noted, client iteration is at least as convenient with arrays,
> if not moreso, due to the reduction in noise ("p++" rather than "p =
> p->next").
>
> struct worktree *p;
> for (p = get_worktrees(); p->path; p++)
> blarp(p);
>
> There are cases where linked lists are an obvious win, but this
> doesn't seem to be such a case. On the other hand, there are obvious
> benefits to making this an array, such as reduced complexity and
> removal of the sequential-access-only restriction.

What you are suggesting makes sense, but I feel it falls short when it
comes to the "sentinel".  Would the last element in the list be a
dummy worktree?  I would sooner add a NULL to the end.  Would that be
an acceptable implementation?

I have re-coded it to put the next pointer in the worktree structure
as Junio has suggested, but I am open to changing it to use the array
approach.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] notes: allow non-writable actions on refs outside of refs/notes

2015-09-16 Thread Jacob Keller
From: Jacob Keller 

Allow non-destructive notes actions which do not require write
permission to be performed on refs outside of refs/notes/. The primary
advantage of this is to allow fetching remote refs to such location as
"refs/remote-notes//foo" and then performing merges into
refs/notes/

It is not reasonable to put remote notes inside refs/notes as users may
already have conflicting names inside the notes namespace.

Remove one test case regarding merge from refs/heads/master, which will
now pass under current code. It may be worth looking how to prevent
some of these more obviously wrong merges.

Signed-off-by: Jacob Keller 
---
 builtin/notes.c| 13 +
 t/t3308-notes-merge.sh |  1 -
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index 0f55d38983f0..ff056014d953 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -333,14 +333,14 @@ static struct notes_tree *init_notes_check(const char 
*subcommand,
   int flags)
 {
struct notes_tree *t;
-   const char *ref;
init_notes(NULL, NULL, NULL, flags);
t = _notes_tree;
 
-   ref = (flags & NOTES_INIT_WRITABLE) ? t->update_ref : t->ref;
-   if (!starts_with(ref, "refs/notes/"))
-   die("Refusing to %s notes in %s (outside of refs/notes/)",
-   subcommand, ref);
+   if (flags & NOTES_INIT_WRITABLE) {
+   if (!starts_with(t->update_ref, "refs/notes"))
+   die("Refusing to %s notes in %s (outside of 
refs/notes/)",
+   subcommand, t->update_ref);
+   }
return t;
 }
 
@@ -810,9 +810,6 @@ static int merge(int argc, const char **argv, const char 
*prefix)
o.local_ref = default_notes_ref();
strbuf_addstr(_ref, argv[0]);
expand_notes_ref(_ref);
-   if (!starts_with(remote_ref.buf, "refs/notes"))
-   die("Refusing to merge notes from %s (outside of refs/notes/)",
-   remote_ref.buf);
 
o.remote_ref = remote_ref.buf;
 
diff --git a/t/t3308-notes-merge.sh b/t/t3308-notes-merge.sh
index 24d82b49bbea..f0feb64bae6e 100755
--- a/t/t3308-notes-merge.sh
+++ b/t/t3308-notes-merge.sh
@@ -90,7 +90,6 @@ test_expect_success 'fail to merge various non-note-trees' '
test_must_fail git notes merge refs/notes/ &&
test_must_fail git notes merge refs/notes/dir &&
test_must_fail git notes merge refs/notes/dir/ &&
-   test_must_fail git notes merge refs/heads/master &&
test_must_fail git notes merge x: &&
test_must_fail git notes merge x:foo &&
test_must_fail git notes merge foo^{bar
-- 
2.6.0.rc2.248.g5b5be23

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-16 Thread Jacob Keller
From: Jacob Keller 

The documentation for --refs says that it will treat unqualified refs as
under refs/notes. Current behavior is to prefix refs/notes to all
strings that do not start with refs/notes or notes/, resulting in
performing actions on refs such as "refs/notes/refs/foo/bar" instead of
attempting to perform actions on "refs/foo/bar". A future patch will
introduce the idea of performing non-writable actions to refs outside of
refs/notes. Change the behavior of expand_notes_ref to leave qualified
refs under refs/* alone.

In addition, fix git notes merge  to prevent merges from refs which
are not under refs/notes, in a similar way to how we already check note
refs in init_notes_check. This is required in order to keep current
tests passing without change. A future patch will change the behavior of
git notes merge so that it can merge from a ref outside of refs/notes.

Signed-off-by: Jacob Keller 
---
 builtin/notes.c | 4 
 notes.c | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index 6371d536d164..0f55d38983f0 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -810,6 +810,10 @@ static int merge(int argc, const char **argv, const char 
*prefix)
o.local_ref = default_notes_ref();
strbuf_addstr(_ref, argv[0]);
expand_notes_ref(_ref);
+   if (!starts_with(remote_ref.buf, "refs/notes"))
+   die("Refusing to merge notes from %s (outside of refs/notes/)",
+   remote_ref.buf);
+
o.remote_ref = remote_ref.buf;
 
t = init_notes_check("merge", NOTES_INIT_WRITABLE);
diff --git a/notes.c b/notes.c
index 6ef347ca3ac4..d49168fb3f01 100644
--- a/notes.c
+++ b/notes.c
@@ -1296,8 +1296,8 @@ int copy_note(struct notes_tree *t,
 
 void expand_notes_ref(struct strbuf *sb)
 {
-   if (starts_with(sb->buf, "refs/notes/"))
-   return; /* we're happy */
+   if (starts_with(sb->buf, "refs/"))
+   return; /* fully qualified, so we're happy */
else if (starts_with(sb->buf, "notes/"))
strbuf_insert(sb, 0, "refs/", 5);
else
-- 
2.6.0.rc2.248.g5b5be23

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] notes: allow read only notes actions on refs outside of refs/notes

2015-09-16 Thread Jacob Keller
From: Jacob Keller 

The primary purpose of this series is to allow fetching remote notes
into a namespace such as "refs/remote-notes//foo". Currently,
git-notes refuses to operate on refs outside of refs/notes/* including
merging from them, listing or showing them. This makes it difficult to
share notes from a remote repository.

Fix expand_notes_ref to not always prepend refs/notes to fully qualified
refs. Allow git-notes actions which do not specify NOTES_INIT_WRITABLE
to be performed on refs outside of refs/notes/*

Future work will include more coupling of "refs/remote-notes/"
into git-notes so that you can specify refs as "/foo" similar to
how remotes work for branches.

In addition, long term goal is to make it default to fetch notes into
refs/remote-notes//*, and possibly to add some status for
tracking similar to how tracking branches work.

The one downside currently is that a test case for prevention of merge
from "refs/heads/master" had to be removed as "git notes merge
refs/heads/master" now works. I am not sure how this could be fixed.. I
did not find any way to tell if a treeish actually was a notes tree or
not...

This topic depends on mh/notes-allow-reading-treeish and actually
expands what this topic allowed before. Previously, treeishes such as
notes@{1} were made allowable, but the ref still had to be found under
refs/notes.

No documentation changes were made as from the looks of it,
documentation for --ref and core.notesRef is already correct despite the
previous behavior of expand_notes_ref. In that sense, documentation was
wrong before.

Jacob Keller (2):
  notes: don't expand qualified refs in expand_notes_ref
  notes: allow non-writable actions on refs outside of refs/notes

 builtin/notes.c| 11 ++-
 notes.c|  4 ++--
 t/t3308-notes-merge.sh |  1 -
 3 files changed, 8 insertions(+), 8 deletions(-)

-- 
2.6.0.rc2.248.g5b5be23

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 66/67] use strbuf_complete to conditionally append slash

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> diff --git a/imap-send.c b/imap-send.c
> index 01aa227..f5d2b06 100644
> --- a/imap-send.c
> +++ b/imap-send.c
> @@ -1412,8 +1412,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc)
>   curl_easy_setopt(curl, CURLOPT_PASSWORD, server.pass);
>  
>   strbuf_addstr(, server.host);
> - if (!path.len || path.buf[path.len - 1] != '/')
> - strbuf_addch(, '/');
> + strbuf_complete(, '/');
>   strbuf_addstr(, server.folder);

Is this conversion correct?  This seems to me that the caller wants
to create an IMAP folder name immediately under the root hierarchy
and wants to have the leading slash in the result.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 11:13:37AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > +   free(file);
> > +   file = xstrfmt("%s/%s", maildir, list.items[i].string);
> 
> Repeated pattern makes one wonder if a thin wrapper
> 
>   xstrfmt_to(, "%s/%s", maildir, list.items[i].string);
> 
> that first frees the existing value and then overwrites is an
> overall win.  Perhaps not, as you would (1) initialize the variable
> to NULL before doing a series of xstrfmt_to(), and (2) free the final
> one yourself.

Yeah, exactly. If you want to wrap it up in something that understands
invariants, I think strbuf is the way to go. I dunno. Maybe I should
just have done this whole thing with strbufs.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 1/3] worktree: add top-level worktree.c

2015-09-16 Thread Eric Sunshine
On Mon, Sep 14, 2015 at 8:20 AM, Mike Rappazzo  wrote:
> On Sat, Sep 12, 2015 at 10:39 PM, Eric Sunshine  
> wrote:
>>> +struct worktree {
>>> +   char *path;
>>> +   char *git_dir;
>>> +   char *head_ref;
>>> +   unsigned char head_sha1[20];
>>> +   int is_detached;
>>> +   int is_bare;
>>> +};
>>> +
>>> +struct worktree_list {
>>> +   struct worktree *worktree;
>>> +   struct worktree_list *next;
>>> +};
>>
>> I don't care too strongly, but an alternate approach (which I probably
>> would have taken) would be to have get_worktrees() simply return an
>> array of 'struct worktree' objects, hence no need for the additional
>> 'struct worktree_list'. The slight complication with this approach,
>> though, is that get_worktrees() either also needs to return the length
>> of the array, or the array should end with some sort of end-of-array
>> sentinel. An obvious sentinel would be path==NULL or git_dir==NULL or
>> all of the above.
>>
>> Client iteration is just about the same with the array approach as
>> with the linked-list approach.
>
> I can't see what benefit this would provide.  I would sooner change
> the returned list into
> an array-backed list struct.  Alternatively, I think adding a
> list_head pointer to this structure
> could benefit client code.

The benefit is a reduction in complexity, which is an important goal.
Linked lists are inherently more complex than arrays, requiring extra
effort, and especially extra care, to manage the head, each "next"
pointer (and possibly a tail pointer). Arrays are used often in Git,
thus are familiar in this code-base, and Git has decent support for
making their management relatively painless. Your suggestions of
changing into "an array-backed list structure" or "adding list_head
pointer" increase complexity, rather than reduce it.

Aside from the complexity issue, arrays allow random access, whereas
linked lists allow only sequential access. While this limitation may
not impact your current, planned use of get_worktrees(), it places a
potentially unnecessary restriction on future clients.

And, as noted, client iteration is at least as convenient with arrays,
if not moreso, due to the reduction in noise ("p++" rather than "p =
p->next").

struct worktree *p;
for (p = get_worktrees(); p->path; p++)
blarp(p);

There are cases where linked lists are an obvious win, but this
doesn't seem to be such a case. On the other hand, there are obvious
benefits to making this an array, such as reduced complexity and
removal of the sequential-access-only restriction.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 12:07:30PM -0700, Junio C Hamano wrote:

> > Correct. The original is sane and gcc does the right thing. The question
> > is whether some compiler would complain that "stage" is not a constant
> > in the sizeof() expression. I don't know if any compiler would do so,
> > but it is easy enough to be conservative.
> 
> Wouldn't such a compiler also complain if you did this, though?
> 
>   int *pointer_to_int;
> size_t sz = sizeof(*pointer_to_int);
>
> You (as a complier) do not know exactly where ownbuf[stage] is,
> because "stage" is unknown to you.  In the same way, you do not know
> exactly where *pointer_to_int is.  But of course, what the sizeof()
> operator is being asked is the size of the thing, which does not
> depend on where the thing is.  If you (as a compiler) does not know
> that and complain to ownbuf[stage], wouldn't you complain to the
> pointer dereference, too?

I think it depends on how crappily the compiler is implemented. I agree
that sizeof(ownbuf[stage]) is a reasonable thing to ask for without
knowing the exact value of stage. But I could also see it being a
mistake an old or badly written compiler might make.

But we are theorizing without data. I'm happy to leave it as in my
original and wait to see if anybody ever complains.

> A more important reason I am reluctant to see this:
> 
>   xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);
> 
> is that it looks strange in the same way as this
> 
>   memcpy(ownbuf[stage], src, sizeof(ownbuf[0]));
> 
> looks strange.  "We use the size of one thing to stuff into another".
> 
> That will make future readers wonder "Is this a typo, and if it is,
> which index is a mistake I can fix?" and may lead to an unnecessary
> confusion.  I do not want to see a correctly written
> 
>   xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);
> 
> turned into
> 
>   xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", ...);
> 
> out of such a confusion.

I think that's a reasonable concern.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] worktree: move/refactor find_shared_symref from branch.c

2015-09-16 Thread Eric Sunshine
On Mon, Sep 14, 2015 at 1:44 PM, Mike Rappazzo  wrote:
> On Sat, Sep 12, 2015 at 11:19 PM, Eric Sunshine  
> wrote:
>> On Fri, Sep 4, 2015 at 5:39 PM, Michael Rappazzo  wrote:
>>> +   while (!matched && worktree_list) {
>>> +   if (strcmp("HEAD", symref)) {
>>> +   strbuf_reset();
>>> +   strbuf_reset();
>>> +   strbuf_addf(, "%s/%s", 
>>> worktree_list->worktree->git_dir, symref);
>>> +
>>> +   if (_parse_ref(path.buf, , NULL)) {
>>> +   continue;
>>> +   }
>>> +
>>> +   if (!strcmp(sb.buf, target))
>>> +   matched = worktree_list->worktree;
>>
>> The original code in branch.c, which this patch removes, did not need
>> to make a special case of HEAD, so it's not immediately clear why this
>> replacement code does so. This is the sort of issue which argues in
>> favor of mutating the existing code (slowly) over the course of
>> several patches into the final form, rather than having the final form
>> come into existence out of thin air. When the changes are made
>> incrementally, it is easier for reviewers to understand why such
>> modifications are needed, which (hopefully) should lead to fewer
>> questions such as this one.
>
> I reversed the the check here; it is intended to check if the symref
> is _not_ the head, since the head
> ref has already been parsed.  This is used in notes.c to find
> "NOTES_MERGE_REF".

I'm probably being dense, but I still don't understand why the code
now needs a special case for HEAD, whereas the original didn't. But,
my denseness my be indicative of this change not being well-described
(or described at all) by the commit message. Hopefully, when this is
refactored into finer changes, the purpose will become clear.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 46/67] write_loose_object: convert to strbuf

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 02:27:57PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > -   memcpy(buffer, filename, dirlen);
> > -   strcpy(buffer + dirlen, "tmp_obj_XX");
> > -   fd = git_mkstemp_mode(buffer, 0444);
> > +   strbuf_reset(tmp);
> > +   strbuf_add(tmp, filename, dirlen);
> > +   strbuf_addstr(tmp, "tmp_obj_XX");
> > +   fd = git_mkstemp_mode(tmp->buf, 0444);
> > if (fd < 0 && dirlen && errno == ENOENT) {
> > -   /* Make sure the directory exists */
> > -   memcpy(buffer, filename, dirlen);
> > -   buffer[dirlen-1] = 0;
> > -   if (mkdir(buffer, 0777) && errno != EEXIST)
> > +   /*
> > +* Make sure the directory exists; note that mkstemp will have
> > +* put a NUL in our buffer, so we have to rewrite the path,
> > +* rather than just chomping the length.
> > +*/
> > +   strbuf_reset(tmp);
> > +   strbuf_add(tmp, filename, dirlen - 1);
> > +   if (mkdir(tmp->buf, 0777) && errno != EEXIST)
> > return -1;
> 
> I had to read the patch three times before understanding what the
> business with NUL in this comment is about.
> 
> The old code was doing the same thing, i.e. instead of attempting to
> reuse the early part of buffer[] it copied the early part of
> filename[] there again, exactly for the same reason, but it didn't
> even explain why the copy was necessary.  Now the new code explains
> why strbuf_setlen() is not used here pretty nicely.

Exactly (I found this out the hard way by trying to clean that up, and
learned something new about mkstemp).

Mentioning the NUL is probably unnecessarily confusing. That is what our
gitmkstemp does, but mkstemp(3) says "undefined" on my system (POSIX
does not mention it at all, but the NUL seems like a reasonable safety
in case any callers ignore the return value).

I've updated this to:

   /*
* Make sure the directory exists; note that the contents
* of the buffer are undefined after mkstemp returns an
* error, so we have to rewrite the whole buffer from
* scratch.
*/

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 52/67] use sha1_to_hex_to() instead of strcpy

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 02:51:13PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> > index 1d66111..4ed0a83 100644
> > --- a/builtin/merge-index.c
> > +++ b/builtin/merge-index.c
> > @@ -9,7 +9,7 @@ static int merge_entry(int pos, const char *path)
> >  {
> > int found;
> > const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
> > -   char hexbuf[4][60];
> > +   char hexbuf[4][GIT_SHA1_HEXSZ + 1];
> > char ownbuf[4][60];
> 
> So you saved 19*4 = 76 bytes at runtime?
> 
> Looks good ;-).

I think we can save even more in ownbuf, which holds only octal
modes. That was out of scope for this patch, though. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 36/67] remote-ext: simplify git pkt-line generation

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 01:18:03PM -0700, Junio C Hamano wrote:

> > +   /* Now go back and fill in the size */
> > +   if (buffer.len > 0x)
> > die("Request too large to send");
> > +   xsnprintf(buffer.buf, buffer.alloc, "%04x", (unsigned)buffer.len);
> 
> So we now write "something something\0host=something" into the buffer
> and then try to overwrite the first four bytes?  Does this xsnprintf()
> stop after writing the four hexadecimal, or does it clobber the first
> byte of the payload (i.e. copy of serv[0]) by a NUL termination?

Argh, you're right, this is totally broken. We obviously have zero test
coverage of this feature. :(

This strategy cannot work at all without a way to format without adding
the NUL (or hackily saving and restoring the overwritten character).
The pkt-line code does its own hex formatting to get around this.

In fact...I think this whole patch can basically be replaced with a call
to packet_write().

Like this:

-- >8 --
Subject: [PATCH] remote-ext: simplify git pkt-line generation

We format a pkt-line into a heap buffer, which requires
manual computation of the required size, and uses some bare
sprintf calls. We could use a strbuf instead, which would
take care of the computation for us. But it's even easier
still to use packet_write(). Besides handling the formatting
and writing for us, it fixes two things:

  1. Our manual max-size check used 0x, while technically
 LARGE_PACKET_MAX is slightly smaller than this.

  2. Our packet will now be output as part of
 GIT_TRACE_PACKET debugging.

Unfortunately packet_write() does not let us build up the
buffer progressively, so we do have to repeat ourselves a
little depending on the "vhost" setting, but the end result
is still far more readable than the original.

Since there were no tests covering this feature at all,
we'll add a few into t5802.

Signed-off-by: Jeff King 
---
 builtin/remote-ext.c  | 34 +-
 t/t5802-connect-helper.sh | 28 
 2 files changed, 33 insertions(+), 29 deletions(-)

diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c
index 3b8c22c..e3cd25d 100644
--- a/builtin/remote-ext.c
+++ b/builtin/remote-ext.c
@@ -1,6 +1,7 @@
 #include "builtin.h"
 #include "transport.h"
 #include "run-command.h"
+#include "pkt-line.h"
 
 /*
  * URL syntax:
@@ -142,36 +143,11 @@ static const char **parse_argv(const char *arg, const 
char *service)
 static void send_git_request(int stdin_fd, const char *serv, const char *repo,
const char *vhost)
 {
-   size_t bufferspace;
-   size_t wpos = 0;
-   char *buffer;
-
-   /*
-* Request needs 12 bytes extra if there is vhost ( \0host=\0) and
-* 6 bytes extra ( \0) if there is no vhost.
-*/
-   if (vhost)
-   bufferspace = strlen(serv) + strlen(repo) + strlen(vhost) + 12;
+   if (!vhost)
+   packet_write(stdin_fd, "%s %s%c", serv, repo, 0);
else
-   bufferspace = strlen(serv) + strlen(repo) + 6;
-
-   if (bufferspace > 0x)
-   die("Request too large to send");
-   buffer = xmalloc(bufferspace);
-
-   /* Make the packet. */
-   wpos = sprintf(buffer, "%04x%s %s%c", (unsigned)bufferspace,
-   serv, repo, 0);
-
-   /* Add vhost if any. */
-   if (vhost)
-   sprintf(buffer + wpos, "host=%s%c", vhost, 0);
-
-   /* Send the request */
-   if (write_in_full(stdin_fd, buffer, bufferspace) < 0)
-   die_errno("Failed to send request");
-
-   free(buffer);
+   packet_write(stdin_fd, "%s %s%chost=%s%c", serv, repo, 0,
+vhost, 0);
 }
 
 static int run_child(const char *arg, const char *service)
diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
index 878faf2..b7a7f9d 100755
--- a/t/t5802-connect-helper.sh
+++ b/t/t5802-connect-helper.sh
@@ -69,4 +69,32 @@ test_expect_success 'update backfilled tag without primary 
transfer' '
test_cmp expect actual
 '
 
+
+test_expect_success 'set up fake git-daemon' '
+   mkdir remote &&
+   git init --bare remote/one.git &&
+   mkdir remote/host &&
+   git init --bare remote/host/two.git &&
+   write_script fake-daemon <<-\EOF &&
+   git daemon --inetd \
+   --informative-errors \
+   --export-all \
+   --base-path="$TRASH_DIRECTORY/remote" \
+   --interpolated-path="$TRASH_DIRECTORY/remote/%H%D" \
+   "$TRASH_DIRECTORY/remote"
+   EOF
+   export TRASH_DIRECTORY &&
+   PATH=$TRASH_DIRECTORY:$PATH
+'
+
+test_expect_success 'ext command can connect to git daemon (no vhost)' '
+   rm -rf dst &&
+   git clone "ext::fake-daemon %G/one.git" dst
+'
+
+test_expect_success 'ext command can connect to git daemon (vhost)' '
+   rm -rf dst &&
+   git clone "ext::fake-daemon %G/two.git %Vhost" dst
+'

Re: [PATCH 52/67] use sha1_to_hex_to() instead of strcpy

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> index 1d66111..4ed0a83 100644
> --- a/builtin/merge-index.c
> +++ b/builtin/merge-index.c
> @@ -9,7 +9,7 @@ static int merge_entry(int pos, const char *path)
>  {
>   int found;
>   const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
> - char hexbuf[4][60];
> + char hexbuf[4][GIT_SHA1_HEXSZ + 1];
>   char ownbuf[4][60];

So you saved 19*4 = 76 bytes at runtime?

Looks good ;-).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 1/3] worktree: add top-level worktree.c

2015-09-16 Thread Eric Sunshine
On Mon, Sep 14, 2015 at 1:41 PM, Junio C Hamano  wrote:
> Mike Rappazzo  writes:
>> On Sat, Sep 12, 2015 at 10:39 PM, Eric Sunshine  
>> wrote:
 +struct worktree_list *get_worktree_list()
>>>
>>> Can we be more concise and call this get_worktrees()?
>>
>> I prefer 'get_worktree_list' because I also added the 'get_worktree'
>> function, and I wanted to differentiate
>> the function names.
>
> I'd say that plural can be differentiating enough; it probably is a
> matter of taste.  How often do external callers want to call
> get_worktree() and not get_worktrees()?

The shorter name, get_worktrees(), also has the minor benefit of
concision, similar to the way we use short variable names (i, j, n, p,
s) to help reveal and (often) make code structure obvious at a glance;
whereas long, noisy, wordy names tend to obscure code structure.

The "_list" suffix doesn't add any value over the shorter pluralizing
"s"; in fact, it may be (very, very slightly) detrimental in implying
too strongly that the return value must be a linked list.

 +struct worktree {
 +   char *path;
 +   char *git_dir;
 +   char *head_ref;
 +   unsigned char head_sha1[20];
 +   int is_detached;
 +   int is_bare;
 +};
 +
 +struct worktree_list {
 +   struct worktree *worktree;
 +   struct worktree_list *next;
 +};
>>>
>>> I don't care too strongly, but an alternate approach (which I probably
>>> would have taken) would be to have get_worktrees() simply return an
>>> array of 'struct worktree' objects, hence no need for the additional
>>> 'struct worktree_list'.
>
> I do not think we are using this to hold thousands of worktree
> objects in core.  Adding "struct worktree *next" pointer to the
> worktree object itself would probably be sufficient for the need of
> codepaths that want to enumerate and iterate over them and that
> would be another way to lose the extra structure.

I was more concerned with the inherent (and, in this case,
unnecessary) complexity of a linked list. Being able to drop the extra
'worktree_list' structure was just an added benefit of moving to the
simpler array approach.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] worktree: move/refactor find_shared_symref from branch.c

2015-09-16 Thread Mike Rappazzo
On Wed, Sep 16, 2015 at 5:09 PM, Eric Sunshine  wrote:
> On Mon, Sep 14, 2015 at 1:44 PM, Mike Rappazzo  wrote:
>> On Sat, Sep 12, 2015 at 11:19 PM, Eric Sunshine  
>> wrote:
>>> On Fri, Sep 4, 2015 at 5:39 PM, Michael Rappazzo  wrote:
 +   while (!matched && worktree_list) {
 +   if (strcmp("HEAD", symref)) {
 +   strbuf_reset();
 +   strbuf_reset();
 +   strbuf_addf(, "%s/%s", 
 worktree_list->worktree->git_dir, symref);
 +
 +   if (_parse_ref(path.buf, , NULL)) {
 +   continue;
 +   }
 +
 +   if (!strcmp(sb.buf, target))
 +   matched = worktree_list->worktree;
>>>
>>> The original code in branch.c, which this patch removes, did not need
>>> to make a special case of HEAD, so it's not immediately clear why this
>>> replacement code does so. This is the sort of issue which argues in
>>> favor of mutating the existing code (slowly) over the course of
>>> several patches into the final form, rather than having the final form
>>> come into existence out of thin air. When the changes are made
>>> incrementally, it is easier for reviewers to understand why such
>>> modifications are needed, which (hopefully) should lead to fewer
>>> questions such as this one.
>>
>> I reversed the the check here; it is intended to check if the symref
>> is _not_ the head, since the head
>> ref has already been parsed.  This is used in notes.c to find
>> "NOTES_MERGE_REF".
>
> I'm probably being dense, but I still don't understand why the code
> now needs a special case for HEAD, whereas the original didn't. But,
> my denseness my be indicative of this change not being well-described
> (or described at all) by the commit message. Hopefully, when this is
> refactored into finer changes, the purpose will become clear.
>
> Thanks.

The special case for HEAD is because the HEAD is already included in
the worktree struct.  This block is intended to save from re-parsing.
If you think the code would be easier to read, the HEAD check could be
removed, and the ref will just be parsed always.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-16 Thread Junio C Hamano
Jacob Keller  writes:

> From: Jacob Keller 
>
> The documentation for --refs says that it will treat unqualified refs as
> under refs/notes. Current behavior is to prefix refs/notes to all
> strings that do not start with refs/notes or notes/, resulting in
> performing actions on refs such as "refs/notes/refs/foo/bar" instead of
> attempting to perform actions on "refs/foo/bar".

That actually sounds like a sensible thing to do, if you replace
'foo' with 'heads', for example, i.e. refs/notes/refs/heads/bar is a
notes about commits reachable from the branch whose name is 'bar'.

So given "refs/heads/bar", which is unqualified in the context of
talking about references that hold notes trees, the current
behaviour to turn it into "refs/notes/refs/heads/bar" is very
sensible, I would think.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 12:19:10PM -0700, Stefan Beller wrote:

> > That will make future readers wonder "Is this a typo, and if it is,
> > which index is a mistake I can fix?" and may lead to an unnecessary
> > confusion.  I do not want to see a correctly written
> >
> > xsnprintf(ownbuf[stage], sizeof(ownbuf[0]), "%o", ...);
> >
> > turned into
> >
> > xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", ...);
> >
> > out of such a confusion.
> 
> So we could just not use the bracket notation, but the pointer then?
> 
> xsnprintf(ownbuf[stage], sizeof(*ownbuf), "%o", ...);

IMHO that is even more confusing, as I expect sizeof(*ownbuf) to
generally be dereferencing a pointer to a struct, and we would be
writing to "ownbuf". There's not anything _wrong_ with what you've
written, it's just using a syntax that in my mind generally applies to a
different idiom, and I'd wonder if the writer got it wrong.

> IMHO that would reasonably well tell you that we just care about the
> size of one element there.
> 
> A funny thought:
> 
>  xsnprintf(ownbuf[stage], sizeof(ownbuf[-1]), "%o", ...);
> 
> should work as well as any reader would question the sanity
> of a negative index.

I'm not sure what the standard would have to say on that, as the
expression invokes undefined behavior (but of course we're not really
using the expression, only asking for the size). I tried to find any
mention of non-constant indices with sizeof() in the standard, but
couldn't.

I think at this point I'm inclined to switch it back to the original
sizeof(ownbuf[stage]), and we can deal with this if and when any
compiler actually complains.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 46/67] write_loose_object: convert to strbuf

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> - memcpy(buffer, filename, dirlen);
> - strcpy(buffer + dirlen, "tmp_obj_XX");
> - fd = git_mkstemp_mode(buffer, 0444);
> + strbuf_reset(tmp);
> + strbuf_add(tmp, filename, dirlen);
> + strbuf_addstr(tmp, "tmp_obj_XX");
> + fd = git_mkstemp_mode(tmp->buf, 0444);
>   if (fd < 0 && dirlen && errno == ENOENT) {
> - /* Make sure the directory exists */
> - memcpy(buffer, filename, dirlen);
> - buffer[dirlen-1] = 0;
> - if (mkdir(buffer, 0777) && errno != EEXIST)
> + /*
> +  * Make sure the directory exists; note that mkstemp will have
> +  * put a NUL in our buffer, so we have to rewrite the path,
> +  * rather than just chomping the length.
> +  */
> + strbuf_reset(tmp);
> + strbuf_add(tmp, filename, dirlen - 1);
> + if (mkdir(tmp->buf, 0777) && errno != EEXIST)
>   return -1;

I had to read the patch three times before understanding what the
business with NUL in this comment is about.

The old code was doing the same thing, i.e. instead of attempting to
reuse the early part of buffer[] it copied the early part of
filename[] there again, exactly for the same reason, but it didn't
even explain why the copy was necessary.  Now the new code explains
why strbuf_setlen() is not used here pretty nicely.

An unsuccessful call to git_mkstemp_mode() would destroy the
template buffer by placing a NUL at the beginning of it (and I was
confused because I did not read the unwritten "at the beginning" in
"put a NUL in our buffer" above).

The patch looks good.  Thanks.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 52/67] use sha1_to_hex_to() instead of strcpy

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> I think we can save even more in ownbuf, which holds only octal
> modes. That was out of scope for this patch, though. :)

Sure.  Also the variable is misnamed.  It is modebuf[], I think.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 66/67] use strbuf_complete to conditionally append slash

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 03:54:50PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> Is this conversion correct?  This seems to me that the caller wants
> >> to create an IMAP folder name immediately under the root hierarchy
> >> and wants to have the leading slash in the result.
> >
> > Ugh, you're right. This is the "other" style Eric mentioned earlier.
> >
> > This looks like the only one in the patch (there are many that did not
> > check buf.len at all, but if we assume they were not invoking undefined
> > behavior before, then they are fine under the new code).
> 
> Yes, I should have said that earlier to save one roundtrip.
> 
> Thanks for working on this.

For my re-roll, I've just omitted changing that caller. I think we can
leave it as-is; it is not worth trying to introduce a new helper for the
one site.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 66/67] use strbuf_complete to conditionally append slash

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 03:18:59PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > diff --git a/imap-send.c b/imap-send.c
> > index 01aa227..f5d2b06 100644
> > --- a/imap-send.c
> > +++ b/imap-send.c
> > @@ -1412,8 +1412,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc)
> > curl_easy_setopt(curl, CURLOPT_PASSWORD, server.pass);
> >  
> > strbuf_addstr(, server.host);
> > -   if (!path.len || path.buf[path.len - 1] != '/')
> > -   strbuf_addch(, '/');
> > +   strbuf_complete(, '/');
> > strbuf_addstr(, server.folder);
> 
> Is this conversion correct?  This seems to me that the caller wants
> to create an IMAP folder name immediately under the root hierarchy
> and wants to have the leading slash in the result.

Ugh, you're right. This is the "other" style Eric mentioned earlier.

This looks like the only one in the patch (there are many that did not
check buf.len at all, but if we assume they were not invoking undefined
behavior before, then they are fine under the new code).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5] remote: add get-url subcommand

2015-09-16 Thread Junio C Hamano
Ben Boeckel  writes:

> +get_url_test () {
> + cat >expect &&
> + test_expect_success "get-url $*" "
> + git remote get-url $* >actual &&
> + test_cmp expect actual
> + "
> +}

This makes any use of get_url_test inside test_expect_success wrong,
I suspect.  Try running the tests under prove.  I think it will
complain that the test numbers do not match or something.

Minimally, this would probably fix the breakage.

 t/t5505-remote.sh | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index f03ba4c..dfaf9d9 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -921,10 +921,8 @@ test_expect_success 'new remote' '
 
 get_url_test () {
cat >expect &&
-   test_expect_success "get-url $*" "
-   git remote get-url $* >actual &&
-   test_cmp expect actual
-   "
+   git remote get-url "$@" >actual &&
+   test_cmp expect actual
 }
 
 test_expect_success 'get-url on new remote' '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] notes: allow read only notes actions on refs outside of refs/notes

2015-09-16 Thread Mike Hommey
On Wed, Sep 16, 2015 at 03:06:32PM -0700, Jacob Keller wrote:
> This topic depends on mh/notes-allow-reading-treeish and actually
> expands what this topic allowed before. Previously, treeishes such as
> notes@{1} were made allowable, but the ref still had to be found under
> refs/notes.

I haven't found the time to finish that topic yet, but I haven't
forgotten it. Sorry about that.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] notes: allow read only notes actions on refs outside of refs/notes

2015-09-16 Thread Jacob Keller
On Wed, Sep 16, 2015 at 3:36 PM, Mike Hommey  wrote:
> On Wed, Sep 16, 2015 at 03:06:32PM -0700, Jacob Keller wrote:
>> This topic depends on mh/notes-allow-reading-treeish and actually
>> expands what this topic allowed before. Previously, treeishes such as
>> notes@{1} were made allowable, but the ref still had to be found under
>> refs/notes.
>
> I haven't found the time to finish that topic yet, but I haven't
> forgotten it. Sorry about that.
>
> Mike

That's ok. I don't think any of the remaining re-works left for your
patch impact mine..? I haven't really read it, but I know I make use
if NOTES_INIT_WRITABLE is the main reason.

Regards,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/43] refs lmdb backend

2015-09-16 Thread David Turner
On Fri, 2015-09-04 at 12:01 -0400, David Turner wrote:
> On Thu, 2015-09-03 at 16:10 -0700, Junio C Hamano wrote:
> > David Turner  writes:
> > 
> > > I think I've broken about all of the standalone stuff out, so here's
> > > the main enchilada.
> > >
> > > This series depends on at least the following topics in pu:
> > > dt/refs-bisection
> > > dt/refs-pseudo
> > > dt/reflog-tests
> > > kn/for-each-tag (patch 21 and corresponding bits of 42 depend on this;
> > > we could skip them, but I wanted this to apply on top of pu)
> > >
> > > As before, I tested by hacking the test suite to run under the lmdb
> > > backend and changing a few dozen tests.  The remaiing failures are
> > > documented in Documentation/technical/refs-be-lmdb.txt, except for one
> > > in t1404 where this version gives a different error message (but still
> > > an error).
> > >
> > > As Jeff King suggested last time I sent this around, I've made the
> > > reflog format slightly more efficient.  Now it stores shas in a binary
> > > format, and only uses a header entry if there are no real entries.
> > >
> > > Also, now per-worktree refs live in the filesystem.
> > >
> > > I've also made a number of fixes to memory leaks, formatting,
> > > factoring, etc.
> > >
> > > As Michael Haggerty suggested, I'm now using struct ref_transaction as
> > > a base struct for the ref transaction structs.
> > >
> > > Looking forward to reviews.
> > 
> > [03/43] seems to be missing
> 
> I just attempted to re-send it, but I still don't see it on gmane.
> Perhaps this is because it is greater than some size limit?  It's about
> 265k.  I've attached a gzipped version to this email.  

Just wanted to send a ping on this series.  I know it's a big set of
changes to review.  Let me know if there's anything else I can do to
help here.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] notes: don't expand qualified refs in expand_notes_ref

2015-09-16 Thread Jacob Keller
On Wed, Sep 16, 2015 at 3:34 PM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>> From: Jacob Keller 
>>
>> The documentation for --refs says that it will treat unqualified refs as
>> under refs/notes. Current behavior is to prefix refs/notes to all
>> strings that do not start with refs/notes or notes/, resulting in
>> performing actions on refs such as "refs/notes/refs/foo/bar" instead of
>> attempting to perform actions on "refs/foo/bar".
>
> That actually sounds like a sensible thing to do, if you replace
> 'foo' with 'heads', for example, i.e. refs/notes/refs/heads/bar is a
> notes about commits reachable from the branch whose name is 'bar'.
>
> So given "refs/heads/bar", which is unqualified in the context of
> talking about references that hold notes trees, the current
> behaviour to turn it into "refs/notes/refs/heads/bar" is very
> sensible, I would think.
>

The end goal is to allow refs inside "refs/remote-notes/".. it seems
really weird that other fully qualified refs don't get "expanded" in
the same way as notes, and documentation does not make it explicit
that this is how it would work. I think that users who actually want
this behavior are already free to say "refs/notes/refs/heads/bar"...
that wouldn't change..

How would you propose allowing merging from "refs/remote-notes//bar"?

We could easily just hard-code acceptance of refs/remote-notes/ as
well as refs/notes... But that felt weird to me...

But honestly I don't really care how it is done as long as we can "git
notes show", "git notes list" on refs/remote-notes//commits
(or similar, remote-notes may not be the actual location if someone
came up with a better name?)

How would you propose we allow that?

If we keep the current behavior of "expand_notes_ref" then we
absolutely can't because use of "--ref" will auto expand
"refs/remote-notes//commits" into
"refs/notes/refs/remote-notes//commits" which wouldn't work...

Regards,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/10] strbuf: Add strbuf_read_noblock

2015-09-16 Thread Stefan Beller
We need to read from pipes without blocking in a later patch.

Signed-off-by: Stefan Beller 
---
 strbuf.c | 25 +++--
 strbuf.h |  6 ++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index cce5eed..4130ee2 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -357,7 +357,10 @@ size_t strbuf_fread(struct strbuf *sb, size_t size, FILE 
*f)
return res;
 }
 
-ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
+#define IGNORE_EAGAIN (1)
+
+static ssize_t strbuf_read_internal(struct strbuf *sb, int fd,
+   size_t hint, int flags)
 {
size_t oldlen = sb->len;
size_t oldalloc = sb->alloc;
@@ -366,8 +369,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
for (;;) {
ssize_t cnt;
 
-   cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
+   cnt = read(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
if (cnt < 0) {
+   if (errno == EINTR)
+   continue;
+   if (errno == EAGAIN) {
+   if (flags & IGNORE_EAGAIN)
+   break;
+   else
+   continue;
+   }
if (oldalloc == 0)
strbuf_release(sb);
else
@@ -384,6 +395,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
return sb->len - oldlen;
 }
 
+ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
+{
+   return strbuf_read_internal(sb, fd, hint, 0);
+}
+
+ssize_t strbuf_read_noblock(struct strbuf *sb, int fd, size_t hint)
+{
+   return strbuf_read_internal(sb, fd, hint, IGNORE_EAGAIN);
+}
+
 #define STRBUF_MAXLINK (2*PATH_MAX)
 
 int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
diff --git a/strbuf.h b/strbuf.h
index aef2794..23ca7aa 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -367,6 +367,12 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE 
*);
 extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
 
 /**
+ * Same as strbuf_read, just returns non-blockingly by ignoring EAGAIN.
+ * The fd must have set O_NONBLOCK.
+ */
+extern ssize_t strbuf_read_noblock(struct strbuf *, int fd, size_t hint);
+
+/**
  * Read the contents of a file, specified by its path. The third argument
  * can be used to give a hint about the file size, to avoid reallocs.
  */
-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update"

2015-09-16 Thread Stefan Beller
> I didn't say this in the previous round because it smelled like an
> RFC, but for a real submission, 2/2 may be doing too many things at
> once.  I suspect this is more or less "taste" thing, so I won't mind
> too much as long as the reviewers are OK with it.

The patch 2/2 is now broken up into the first five patches.

There are only 2 minor changes:
* If a number of tasks <= 0 is specified, use the number of cpus instead.
* Renamed the command line option in test-run-command.c to 
"run-command-parallel-4"
  as the 4 is hardcoded there.
  
The patch 6,7 are small cleanups (6 should add a test in a reroll)

Patches 7,8,9 are a preview of how I want to proceed in the near future:
After these functions are split out, we can add another patch on top which
rewrites the short main loop in cmd_update to be in C in submodule--helper
running in parallel.
It took me a while to get the idea how to realize parallelism with the
parallel run command structure now as opposed to the thread pool I proposed
earlier, but I think it will be straightforward from here.

Stefan

Stefan Beller (10):
  strbuf: Add strbuf_read_noblock
  run-command: factor out return value computation
  run-command: add an asynchronous parallel child processor
  fetch_populated_submodules: use new parallel job processing
  submodules: Allow parallel fetching, add tests and documentation
  git submodule update: Redirect any output to stderr
  git submodule update: pass --prefix only with a non empty prefix
  git submodule update: cmd_update_recursive
  git submodule update: cmd_update_recursive
  git submodule update: cmd_update_fetch

 Documentation/fetch-options.txt |   7 +
 builtin/fetch.c |   6 +-
 builtin/pull.c  |   6 +
 git-submodule.sh| 242 ++
 run-command.c   | 281 
 run-command.h   |  36 +
 strbuf.c|  25 +++-
 strbuf.h|   6 +
 submodule.c | 119 -
 submodule.h |   2 +-
 t/t0061-run-command.sh  |  20 +++
 t/t5526-fetch-submodules.sh |  19 +++
 test-run-command.c  |  24 
 13 files changed, 620 insertions(+), 173 deletions(-)

-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/10] run-command: factor out return value computation

2015-09-16 Thread Stefan Beller
We will need computing the return value in a later patch without the
wait.

Signed-off-by: Stefan Beller 
---
 run-command.c | 54 --
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/run-command.c b/run-command.c
index 28e1d55..c892e9a 100644
--- a/run-command.c
+++ b/run-command.c
@@ -232,6 +232,35 @@ static inline void set_cloexec(int fd)
fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
 }
 
+static int determine_return_value(int wait_status,
+ int *result,
+ int *error_code,
+ const char *argv0)
+{
+   if (WIFSIGNALED(wait_status)) {
+   *result = WTERMSIG(wait_status);
+   if (*result != SIGINT && *result != SIGQUIT)
+   error("%s died of signal %d", argv0, *result);
+   /*
+* This return value is chosen so that code & 0xff
+* mimics the exit code that a POSIX shell would report for
+* a program that died from this signal.
+*/
+   *result += 128;
+   } else if (WIFEXITED(wait_status)) {
+   *result = WEXITSTATUS(wait_status);
+   /*
+* Convert special exit code when execvp failed.
+*/
+   if (*result == 127) {
+   *result = -1;
+   *error_code = ENOENT;
+   }
+   } else
+   return 1;
+   return 0;
+}
+
 static int wait_or_whine(pid_t pid, const char *argv0)
 {
int status, code = -1;
@@ -244,29 +273,10 @@ static int wait_or_whine(pid_t pid, const char *argv0)
if (waiting < 0) {
failed_errno = errno;
error("waitpid for %s failed: %s", argv0, strerror(errno));
-   } else if (waiting != pid) {
-   error("waitpid is confused (%s)", argv0);
-   } else if (WIFSIGNALED(status)) {
-   code = WTERMSIG(status);
-   if (code != SIGINT && code != SIGQUIT)
-   error("%s died of signal %d", argv0, code);
-   /*
-* This return value is chosen so that code & 0xff
-* mimics the exit code that a POSIX shell would report for
-* a program that died from this signal.
-*/
-   code += 128;
-   } else if (WIFEXITED(status)) {
-   code = WEXITSTATUS(status);
-   /*
-* Convert special exit code when execvp failed.
-*/
-   if (code == 127) {
-   code = -1;
-   failed_errno = ENOENT;
-   }
} else {
-   error("waitpid is confused (%s)", argv0);
+   if (waiting != pid
+  || (determine_return_value(status, , _errno, 
argv0) < 0))
+   error("waitpid is confused (%s)", argv0);
}
 
clear_child_for_cleanup(pid);
-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/10] fetch_populated_submodules: use new parallel job processing

2015-09-16 Thread Stefan Beller
In a later patch we enable parallel processing of submodules, this
only adds the possibility for it. So this change should not change
any user facing behavior.

Signed-off-by: Stefan Beller 
---
 builtin/fetch.c |   3 +-
 submodule.c | 119 +++-
 submodule.h |   2 +-
 3 files changed, 87 insertions(+), 37 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ee1f1a9..6620ed0 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1217,7 +1217,8 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
result = fetch_populated_submodules(,
submodule_prefix,
recurse_submodules,
-   verbosity < 0);
+   verbosity < 0,
+   0);
argv_array_clear();
}
 
diff --git a/submodule.c b/submodule.c
index 1d64e57..a0e06e8 100644
--- a/submodule.c
+++ b/submodule.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 #include "blob.h"
+#include "thread-utils.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static struct string_list changed_submodule_paths;
@@ -615,37 +616,79 @@ static void calculate_changed_submodule_paths(void)
initialized_fetch_ref_tips = 0;
 }
 
+struct submodule_parallel_fetch {
+   int count;
+   struct argv_array args;
+   const char *work_tree;
+   const char *prefix;
+   int command_line_option;
+   int quiet;
+   int result;
+};
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0}
+
+int get_next_submodule(void *data, struct child_process *cp,
+  struct strbuf *err);
+
+void handle_submodule_fetch_start_err(void *data, struct child_process *cp, 
struct strbuf *err)
+{
+   struct submodule_parallel_fetch *spf = data;
+   spf->result = 1;
+}
+
+void handle_submodule_fetch_finish( void *data, struct child_process *cp, int 
retvalue)
+{
+   struct submodule_parallel_fetch *spf = data;
+
+   if (retvalue)
+   spf->result = 1;
+}
+
 int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
-  int quiet)
+  int quiet, int max_parallel_jobs)
 {
-   int i, result = 0;
-   struct child_process cp = CHILD_PROCESS_INIT;
-   struct argv_array argv = ARGV_ARRAY_INIT;
-   const char *work_tree = get_git_work_tree();
-   if (!work_tree)
+   int i;
+   struct submodule_parallel_fetch spf = SPF_INIT;
+   spf.work_tree = get_git_work_tree();
+   spf.command_line_option = command_line_option;
+   spf.quiet = quiet;
+   spf.prefix = prefix;
+   if (!spf.work_tree)
goto out;
 
if (read_cache() < 0)
die("index file corrupt");
 
-   argv_array_push(, "fetch");
+   argv_array_push(, "fetch");
for (i = 0; i < options->argc; i++)
-   argv_array_push(, options->argv[i]);
-   argv_array_push(, "--recurse-submodules-default");
+   argv_array_push(, options->argv[i]);
+   argv_array_push(, "--recurse-submodules-default");
/* default value, "--submodule-prefix" and its value are added later */
 
-   cp.env = local_repo_env;
-   cp.git_cmd = 1;
-   cp.no_stdin = 1;
-
calculate_changed_submodule_paths();
+   run_processes_parallel(max_parallel_jobs, ,
+  get_next_submodule,
+  handle_submodule_fetch_start_err,
+  handle_submodule_fetch_finish);
+
+   argv_array_clear();
+out:
+   string_list_clear(_submodule_paths, 1);
+   return spf.result;
+}
+
+int get_next_submodule(void *data, struct child_process *cp,
+  struct strbuf *err)
+{
+   int ret = 0;
+   struct submodule_parallel_fetch *spf = data;
 
-   for (i = 0; i < active_nr; i++) {
+   for ( ; spf->count < active_nr; spf->count++) {
struct strbuf submodule_path = STRBUF_INIT;
struct strbuf submodule_git_dir = STRBUF_INIT;
struct strbuf submodule_prefix = STRBUF_INIT;
-   const struct cache_entry *ce = active_cache[i];
+   const struct cache_entry *ce = active_cache[spf->count];
const char *git_dir, *default_argv;
const struct submodule *submodule;
 
@@ -657,7 +700,7 @@ int fetch_populated_submodules(const struct argv_array 
*options,
submodule = submodule_from_name(null_sha1, ce->name);
 
default_argv = "yes";
-   if (command_line_option == RECURSE_SUBMODULES_DEFAULT) {
+   

[PATCH 09/10] git submodule update: cmd_update_recursive

2015-09-16 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 52c2967..c40d60f 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -607,6 +607,24 @@ cmd_update_recursive()
fi
 }
 
+cmd_update_clone()
+{
+   command="git checkout $subforce -q"
+   die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path 
'\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out 
'\$sha1'")"
+
+   git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix 
"$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" 
|| exit
+
+   if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
+   then
+   say "$say_msg"
+   else
+   err="${err};$die_msg"
+   return
+   fi
+   cmd_update_recursive
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as 
needed
 #
@@ -680,7 +698,6 @@ cmd_update()
cmd_init "--" "$@" || return
fi
 
-   cloned_modules=
git submodule--helper list --prefix "$wt_prefix" "$@" | {
err=
while read mode sha1 stage sm_path
@@ -725,9 +742,8 @@ Maybe you want to use 'update --init'?")"
 
if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
then
-   git submodule--helper clone ${GIT_QUIET:+--quiet} 
${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" 
"$reference" "$depth" || exit
-   cloned_modules="$cloned_modules;$name"
-   subsha1=
+   cmd_update_clone
+   continue
else
subsha1=$(clear_local_git_env; cd "$sm_path" &&
git rev-parse --verify HEAD) ||
@@ -767,13 +783,6 @@ Maybe you want to use 'update --init'?")"
die "$(eval_gettext "Unable to fetch in 
submodule path '\$displaypath'")"
fi
 
-   # Is this something we just cloned?
-   case ";$cloned_modules;" in
-   *";$name;"*)
-   # then there is no local change to integrate
-   update_module=checkout ;;
-   esac
-
must_die_on_failure=
case "$update_module" in
checkout)
-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix

2015-09-16 Thread Stefan Beller
We should not pass --prefix NULL into the helper. Although the helper
can deal with it, it's just messy.

Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 7ef3247..3ccb0b6 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -700,7 +700,7 @@ Maybe you want to use 'update --init'?")"
 
if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
then
-   git submodule--helper clone ${GIT_QUIET:+--quiet} 
--prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" 
"$depth" || exit
+   git submodule--helper clone ${GIT_QUIET:+--quiet} 
${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" 
"$reference" "$depth" || exit
cloned_modules="$cloned_modules;$name"
subsha1=
else
-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 66/67] use strbuf_complete to conditionally append slash

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

>> Is this conversion correct?  This seems to me that the caller wants
>> to create an IMAP folder name immediately under the root hierarchy
>> and wants to have the leading slash in the result.
>
> Ugh, you're right. This is the "other" style Eric mentioned earlier.
>
> This looks like the only one in the patch (there are many that did not
> check buf.len at all, but if we assume they were not invoking undefined
> behavior before, then they are fine under the new code).

Yes, I should have said that earlier to save one roundtrip.

Thanks for working on this.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6] remote: add get-url subcommand

2015-09-16 Thread Ben Boeckel
Expanding `insteadOf` is a part of ls-remote --url and there is no way
to expand `pushInsteadOf` as well. Add a get-url subcommand to be able
to query both as well as a way to get all configured urls.

Signed-off-by: Ben Boeckel 
---
 Documentation/git-remote.txt | 10 
 builtin/remote.c | 59 
 t/t5505-remote.sh| 37 +++
 3 files changed, 106 insertions(+)

diff --git a/Documentation/git-remote.txt b/Documentation/git-remote.txt
index 4c6d6de..3c9bf45 100644
--- a/Documentation/git-remote.txt
+++ b/Documentation/git-remote.txt
@@ -15,6 +15,7 @@ SYNOPSIS
 'git remote remove' 
 'git remote set-head'  (-a | --auto | -d | --delete | )
 'git remote set-branches' [--add]  ...
+'git remote get-url' [--push] [--all] 
 'git remote set-url' [--push]   []
 'git remote set-url --add' [--push]  
 'git remote set-url --delete' [--push]  
@@ -131,6 +132,15 @@ The named branches will be interpreted as if specified 
with the
 With `--add`, instead of replacing the list of currently tracked
 branches, adds to that list.
 
+'get-url'::
+
+Retrieves the URLs for a remote. Configurations for `insteadOf` and
+`pushInsteadOf` are expanded here. By default, only the first URL is listed.
++
+With '--push', push URLs are queried rather than fetch URLs.
++
+With '--all', all URLs for the remote will be listed.
+
 'set-url'::
 
 Changes URLs for the remote. Sets first URL for remote  that matches
diff --git a/builtin/remote.c b/builtin/remote.c
index 181668d..e4c3ea1 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -18,6 +18,7 @@ static const char * const builtin_remote_usage[] = {
N_("git remote prune [-n | --dry-run] "),
N_("git remote [-v | --verbose] update [-p | --prune] [( | 
)...]"),
N_("git remote set-branches [--add]  ..."),
+   N_("git remote get-url [--push] [--all] "),
N_("git remote set-url [--push]   []"),
N_("git remote set-url --add  "),
N_("git remote set-url --delete  "),
@@ -65,6 +66,11 @@ static const char * const builtin_remote_update_usage[] = {
NULL
 };
 
+static const char * const builtin_remote_geturl_usage[] = {
+   N_("git remote get-url [--push] [--all] "),
+   NULL
+};
+
 static const char * const builtin_remote_seturl_usage[] = {
N_("git remote set-url [--push]   []"),
N_("git remote set-url --add  "),
@@ -1467,6 +1473,57 @@ static int set_branches(int argc, const char **argv)
return set_remote_branches(argv[0], argv + 1, add_mode);
 }
 
+static int get_url(int argc, const char **argv)
+{
+   int i, push_mode = 0, all_mode = 0;
+   const char *remotename = NULL;
+   struct remote *remote;
+   const char **url;
+   int url_nr;
+   struct option options[] = {
+   OPT_BOOL('\0', "push", _mode,
+N_("query push URLs rather than fetch URLs")),
+   OPT_BOOL('\0', "all", _mode,
+N_("return all URLs")),
+   OPT_END()
+   };
+   argc = parse_options(argc, argv, NULL, options, 
builtin_remote_geturl_usage, 0);
+
+   if (argc != 1)
+   usage_with_options(builtin_remote_geturl_usage, options);
+
+   remotename = argv[0];
+
+   if (!remote_is_configured(remotename))
+   die(_("No such remote '%s'"), remotename);
+   remote = remote_get(remotename);
+
+   url_nr = 0;
+   if (push_mode) {
+   url = remote->pushurl;
+   url_nr = remote->pushurl_nr;
+   }
+   /* else fetch mode */
+
+   /* Use the fetch URL when no push URLs were found or requested. */
+   if (!url_nr) {
+   url = remote->url;
+   url_nr = remote->url_nr;
+   }
+
+   if (!url_nr)
+   die(_("no URLs configured for remote '%s'"), remotename);
+
+   if (all_mode) {
+   for (i = 0; i < url_nr; i++)
+   printf_ln("%s", url[i]);
+   } else {
+   printf_ln("%s", *url);
+   }
+
+   return 0;
+}
+
 static int set_url(int argc, const char **argv)
 {
int i, push_mode = 0, add_mode = 0, delete_mode = 0;
@@ -1576,6 +1633,8 @@ int cmd_remote(int argc, const char **argv, const char 
*prefix)
result = set_head(argc, argv);
else if (!strcmp(argv[0], "set-branches"))
result = set_branches(argc, argv);
+   else if (!strcmp(argv[0], "get-url"))
+   result = get_url(argc, argv);
else if (!strcmp(argv[0], "set-url"))
result = set_url(argc, argv);
else if (!strcmp(argv[0], "show"))
diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index 7a8499c..9a55389 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -919,6 +919,19 @@ test_expect_success 'new remote' '
cmp expect actual
 '
 
+get_url_test () {
+   cat >expect &&
+   git remote get-url $* 

[PATCH 03/10] run-command: add an asynchronous parallel child processor

2015-09-16 Thread Stefan Beller
This allows to run external commands in parallel with ordered output
on stderr.

If we run external commands in parallel we cannot pipe the output directly
to the our stdout/err as it would mix up. So each process's output will
flow through a pipe, which we buffer. One subprocess can be directly
piped to out stdout/err for a low latency feedback to the user.

Example:
Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
different amount of time as the different submodules vary in size, then
the output of fetches in sequential order might look like this:

 time -->
 output: |---A---|   |-B-|   |C---|   |-D-|   |-E-|

When we schedule these submodules into maximal two parallel processes,
a schedule and sample output over time may look like this:

thread 1: |---A---|   |-D-|   |-E-|

thread 2: |-B-|   |C---|

output:   |---A---|B|--C---|DE

So A will be perceived as it would run normally in the single child
version. As B has finished by the time A is done, we can dump its whole
progress buffer on stderr, such that it looks like it finished in no time.
Once that is done, C is determined to be the visible child and its progress
will be reported in real time.

So this way of output is really good for human consumption,
as it only changes the timing, not the actual output.

For machine consumption the output needs to be prepared in
the tasks, by either having a prefix per line or per block
to indicate whose tasks output is displayed.

Signed-off-by: Stefan Beller 
---
 run-command.c  | 228 +
 run-command.h  |  36 
 t/t0061-run-command.sh |  20 +
 test-run-command.c |  24 ++
 4 files changed, 308 insertions(+)

diff --git a/run-command.c b/run-command.c
index c892e9a..3af97ab 100644
--- a/run-command.c
+++ b/run-command.c
@@ -3,6 +3,7 @@
 #include "exec_cmd.h"
 #include "sigchain.h"
 #include "argv-array.h"
+#include "thread-utils.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -862,3 +863,230 @@ int capture_command(struct child_process *cmd, struct 
strbuf *buf, size_t hint)
close(cmd->out);
return finish_command(cmd);
 }
+
+struct parallel_processes {
+   int max_number_processes;
+   void *data;
+   get_next_task fn;
+   handle_child_starting_failure fn_err;
+   handle_child_return_value fn_exit;
+
+   int nr_processes;
+   int all_tasks_started;
+   int foreground_child;
+   char *slots;
+   struct child_process *children;
+   struct pollfd *pfd;
+   struct strbuf *err;
+   struct strbuf finished_children;
+};
+
+static void run_processes_parallel_init(struct parallel_processes *pp,
+   int n, void *data,
+   get_next_task fn,
+   handle_child_starting_failure fn_err,
+   handle_child_return_value fn_exit)
+{
+   int i;
+
+   if (n < 1)
+   n = online_cpus();
+
+   pp->max_number_processes = n;
+   pp->data = data;
+   pp->fn = fn;
+   pp->fn_err = fn_err;
+   pp->fn_exit = fn_exit;
+
+   pp->nr_processes = 0;
+   pp->all_tasks_started = 0;
+   pp->foreground_child = 0;
+   pp->slots = xcalloc(n, sizeof(*pp->slots));
+   pp->children = xcalloc(n, sizeof(*pp->children));
+   pp->pfd = xcalloc(n, sizeof(*pp->pfd));
+   pp->err = xcalloc(n, sizeof(*pp->err));
+   strbuf_init(>finished_children, 0);
+
+   for (i = 0; i < n; i++) {
+   strbuf_init(>err[i], 0);
+   pp->pfd[i].events = POLLIN;
+   pp->pfd[i].fd = -1;
+   }
+}
+
+static void run_processes_parallel_cleanup(struct parallel_processes *pp)
+{
+   int i;
+   for (i = 0; i < pp->max_number_processes; i++)
+   strbuf_release(>err[i]);
+
+   free(pp->children);
+   free(pp->slots);
+   free(pp->pfd);
+   free(pp->err);
+   strbuf_release(>finished_children);
+}
+
+static void unblock_fd(int fd)
+{
+   int flags = fcntl(fd, F_GETFL);
+   if (flags < 0) {
+   warning("Could not get file status flags, "
+   "output will be degraded");
+   return;
+   }
+   if (fcntl(fd, F_SETFL, flags | O_NONBLOCK)) {
+   warning("Could not set file status flags, "
+   "output will be degraded");
+   return;
+   }
+}
+
+static void run_processes_parallel_start_new(struct parallel_processes *pp)
+{
+   int i;
+   /* Start new processes. */
+   while (!pp->all_tasks_started
+  && pp->nr_processes < pp->max_number_processes) {
+   for (i = 0; i < pp->max_number_processes; i++)
+   if (!pp->slots[i])
+   break; /* found an empty slot */
+   if (i == 

[PATCH 10/10] git submodule update: cmd_update_fetch

2015-09-16 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 164 ---
 1 file changed, 84 insertions(+), 80 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index c40d60f..2c9f1f2 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -625,6 +625,89 @@ cmd_update_clone()
cmd_update_recursive
 }
 
+cmd_update_fetch()
+{
+   subsha1=$(clear_local_git_env; cd "$sm_path" &&
+   git rev-parse --verify HEAD) ||
+   die "$(eval_gettext "Unable to find current revision in submodule path 
'\$displaypath'")"
+
+   if test -n "$remote"
+   then
+   if test -z "$nofetch"
+   then
+   # Fetch remote before determining tracking $sha1
+   (clear_local_git_env; cd "$sm_path" && git-fetch) ||
+   die "$(eval_gettext "Unable to fetch in submodule path 
'\$sm_path'")"
+   fi
+   remote_name=$(clear_local_git_env; cd "$sm_path" && 
get_default_remote)
+   sha1=$(clear_local_git_env; cd "$sm_path" &&
+   git rev-parse --verify "${remote_name}/${branch}") ||
+   die "$(eval_gettext "Unable to find current 
${remote_name}/${branch} revision in submodule path '\$sm_path'")"
+   fi
+
+   if test "$subsha1" != "$sha1" || test -n "$force"
+   then
+   subforce=$force
+   # If we don't already have a -f flag and the submodule has 
never been checked out
+   if test -z "$subsha1" && test -z "$force"
+   then
+   subforce="-f"
+   fi
+
+   if test -z "$nofetch"
+   then
+   # Run fetch only if $sha1 isn't present or it
+   # is not reachable from a ref.
+   (clear_local_git_env; cd "$sm_path" &&
+   ( (rev=$(git rev-list -n 1 $sha1 --not --all 
2>/dev/null) &&
+test -z "$rev") || git-fetch)) ||
+   die "$(eval_gettext "Unable to fetch in submodule path 
'\$displaypath'")"
+   fi
+
+   must_die_on_failure=
+   case "$update_module" in
+   checkout)
+   command="git checkout $subforce -q"
+   die_msg="$(eval_gettext "Unable to checkout '\$sha1' in 
submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path 
'\$displaypath': checked out '\$sha1'")"
+   ;;
+   rebase)
+   command="git rebase"
+   die_msg="$(eval_gettext "Unable to rebase '\$sha1' in 
submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path 
'\$displaypath': rebased into '\$sha1'")"
+   must_die_on_failure=yes
+   ;;
+   merge)
+   command="git merge"
+   die_msg="$(eval_gettext "Unable to merge '\$sha1' in 
submodule path '\$displaypath'")"
+   say_msg="$(eval_gettext "Submodule path 
'\$displaypath': merged in '\$sha1'")"
+   must_die_on_failure=yes
+   ;;
+   !*)
+   command="${update_module#!}"
+   die_msg="$(eval_gettext "Execution of '\$command 
\$sha1' failed in submodule  path '\$prefix\$sm_path'")"
+   say_msg="$(eval_gettext "Submodule path 
'\$prefix\$sm_path': '\$command \$sha1'")"
+   must_die_on_failure=yes
+   ;;
+   *)
+   die "$(eval_gettext "Invalid update mode 
'$update_module' for submodule '$name'")"
+   esac
+
+   if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
+   then
+   say "$say_msg"
+   elif test -n "$must_die_on_failure"
+   then
+   die_with_status 2 "$die_msg"
+   else
+   err="${err};$die_msg"
+   return
+   fi
+   fi
+
+   cmd_update_recursive
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as 
needed
 #
@@ -743,88 +826,9 @@ Maybe you want to use 'update --init'?")"
if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
then
cmd_update_clone
-   continue
else
-   subsha1=$(clear_local_git_env; cd "$sm_path" &&
-   git rev-parse --verify HEAD) ||
-   die "$(eval_gettext "Unable to find current revision in 
submodule path '\$displaypath'")"
+   cmd_update_fetch
fi
-
-   if test -n 

[PATCH 05/10] submodules: Allow parallel fetching, add tests and documentation

2015-09-16 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 Documentation/fetch-options.txt |  7 +++
 builtin/fetch.c |  5 -
 builtin/pull.c  |  6 ++
 t/t5526-fetch-submodules.sh | 19 +++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 45583d8..6b109f6 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -100,6 +100,13 @@ ifndef::git-pull[]
reference to a commit that isn't already in the local submodule
clone.
 
+-j::
+--jobs=::
+   Number of parallel children to be used for fetching submodules.
+   Each will fetch from different submodules, such that fetching many
+   submodules will be faster. By default submodules will be fetched
+   one at a time.
+
 --no-recurse-submodules::
Disable recursive fetching of submodules (this has the same effect as
using the '--recurse-submodules=no' option).
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 6620ed0..f28eac6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -37,6 +37,7 @@ static int prune = -1; /* unspecified */
 static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity;
 static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow;
+static int max_children = 1;
 static const char *depth;
 static const char *upload_pack;
 static struct strbuf default_rla = STRBUF_INIT;
@@ -99,6 +100,8 @@ static struct option builtin_fetch_options[] = {
N_("fetch all tags and associated objects"), TAGS_SET),
OPT_SET_INT('n', NULL, ,
N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
+   OPT_INTEGER('j', "jobs", _children,
+   N_("number of submodules fetched in parallel")),
OPT_BOOL('p', "prune", ,
 N_("prune remote-tracking branches no longer on remote")),
{ OPTION_CALLBACK, 0, "recurse-submodules", NULL, N_("on-demand"),
@@ -1218,7 +1221,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
submodule_prefix,
recurse_submodules,
verbosity < 0,
-   0);
+   max_children);
argv_array_clear();
}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index 722a83c..f0af196 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -94,6 +94,7 @@ static int opt_force;
 static char *opt_tags;
 static char *opt_prune;
 static char *opt_recurse_submodules;
+static char *max_children;
 static int opt_dry_run;
 static char *opt_keep;
 static char *opt_depth;
@@ -177,6 +178,9 @@ static struct option pull_options[] = {
N_("on-demand"),
N_("control recursive fetching of submodules"),
PARSE_OPT_OPTARG),
+   OPT_PASSTHRU('j', "jobs", _children, N_("n"),
+   N_("number of submodules pulled in parallel"),
+   PARSE_OPT_OPTARG),
OPT_BOOL(0, "dry-run", _dry_run,
N_("dry run")),
OPT_PASSTHRU('k', "keep", _keep, NULL,
@@ -524,6 +528,8 @@ static int run_fetch(const char *repo, const char 
**refspecs)
argv_array_push(, opt_prune);
if (opt_recurse_submodules)
argv_array_push(, opt_recurse_submodules);
+   if (max_children)
+   argv_array_push(, max_children);
if (opt_dry_run)
argv_array_push(, "--dry-run");
if (opt_keep)
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 17759b1..1b4ce69 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -71,6 +71,16 @@ test_expect_success "fetch --recurse-submodules recurses 
into submodules" '
test_i18ncmp expect.err actual.err
 '
 
+test_expect_success "fetch --recurse-submodules -j2 has the same output 
behaviour" '
+   add_upstream_commit &&
+   (
+   cd downstream &&
+   git fetch --recurse-submodules -j2 2>../actual.err
+   ) &&
+   test_must_be_empty actual.out &&
+   test_i18ncmp expect.err actual.err
+'
+
 test_expect_success "fetch alone only fetches superproject" '
add_upstream_commit &&
(
@@ -140,6 +150,15 @@ test_expect_success "--quiet propagates to submodules" '
! test -s actual.err
 '
 
+test_expect_success "--quiet propagates to parallel submodules" '
+   (
+   cd downstream &&
+   git fetch --recurse-submodules -j 2 --quiet  >../actual.out 
2>../actual.err
+   ) &&
+   ! test -s actual.out &&
+   ! test -s actual.err
+'
+
 test_expect_success "--dry-run propagates to submodules" '

[PATCH 06/10] git submodule update: Redirect any output to stderr

2015-09-16 Thread Stefan Beller
There are no tests, which fail by this.

Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 8b0eb9a..7ef3247 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -663,7 +663,7 @@ cmd_update()
die_if_unmatched "$mode"
if test "$stage" = U
then
-   echo >&2 "Skipping unmerged submodule $prefix$sm_path"
+   say >&2 "Skipping unmerged submodule $prefix$sm_path"
continue
fi
name=$(git submodule--helper name "$sm_path") || exit
@@ -684,7 +684,7 @@ cmd_update()
 
if test "$update_module" = "none"
then
-   echo "Skipping submodule '$displaypath'"
+   say >&2 "Skipping submodule '$displaypath'"
continue
fi
 
-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/10] git submodule update: cmd_update_recursive

2015-09-16 Thread Stefan Beller
split the recursion part out to its own function

Signed-off-by: Stefan Beller 
---
 git-submodule.sh | 47 ++-
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 3ccb0b6..52c2967 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -582,6 +582,31 @@ cmd_deinit()
done
 }
 
+
+cmd_update_recursive()
+{
+   if test -n "$recursive"
+   then
+   (
+   prefix="$prefix$sm_path/"
+   clear_local_git_env
+   cd "$sm_path" &&
+   eval cmd_update
+   )
+   res=$?
+   if test $res -gt 0
+   then
+   die_msg="$(eval_gettext "Failed to recurse into 
submodule path '\$displaypath'")"
+   if test $res -eq 1
+   then
+   err="${err};$die_msg"
+   else
+   die_with_status $res "$die_msg"
+   fi
+   fi
+   fi
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as 
needed
 #
@@ -790,27 +815,7 @@ Maybe you want to use 'update --init'?")"
fi
fi
 
-   if test -n "$recursive"
-   then
-   (
-   prefix="$prefix$sm_path/"
-   clear_local_git_env
-   cd "$sm_path" &&
-   eval cmd_update
-   )
-   res=$?
-   if test $res -gt 0
-   then
-   die_msg="$(eval_gettext "Failed to recurse into 
submodule path '\$displaypath'")"
-   if test $res -eq 1
-   then
-   err="${err};$die_msg"
-   continue
-   else
-   die_with_status $res "$die_msg"
-   fi
-   fi
-   fi
+   cmd_update_recursive
done
 
if test -n "$err"
-- 
2.6.0.rc0.131.gf624c3d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 17/67] use xsnprintf for generating git object headers

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> We generally use 32-byte buffers to format git's "type size"
> header fields. These should not generally overflow unless
> you can produce some truly gigantic objects (and our types
> come from our internal array of constant strings). But it is
> a good idea to use xsnprintf to make sure this is the case.
>
> Note that we slightly modify the interface to
> write_sha1_file_prepare, which nows uses "hdrlen" as an "in"
> parameter as well as an "out" (on the way in it stores the
> allocated size of the header, and on the way out it returns
> the ultimate size of the header).

;-).  I skipped this paragraph and jumped directly into the code,
noticed that you did something funny in write-sha1-file-prepare and
did the right thing in hash-sha1-file to compensate for it, and then
came back here to find it properly described.  Now I read the patch
again and I see the other caller is also properly adjusted.

I think this change makes sense.  Thanks.

>
> Signed-off-by: Jeff King 
> ---
>  builtin/index-pack.c |  2 +-
>  bulk-checkin.c   |  4 ++--
>  fast-import.c|  4 ++--
>  http-push.c  |  2 +-
>  sha1_file.c  | 13 +++--
>  5 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 3431de2..1ad1bde 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -441,7 +441,7 @@ static void *unpack_entry_data(unsigned long offset, 
> unsigned long size,
>   int hdrlen;
>  
>   if (!is_delta_type(type)) {
> - hdrlen = sprintf(hdr, "%s %lu", typename(type), size) + 1;
> + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), 
> size) + 1;
>   git_SHA1_Init();
>   git_SHA1_Update(, hdr, hdrlen);
>   } else
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 7cffc3a..4347f5c 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -200,8 +200,8 @@ static int deflate_to_pack(struct bulk_checkin_state 
> *state,
>   if (seekback == (off_t) -1)
>   return error("cannot find the current offset");
>  
> - header_len = sprintf((char *)obuf, "%s %" PRIuMAX,
> -  typename(type), (uintmax_t)size) + 1;
> + header_len = xsnprintf((char *)obuf, sizeof(obuf), "%s %" PRIuMAX,
> +typename(type), (uintmax_t)size) + 1;
>   git_SHA1_Init();
>   git_SHA1_Update(, obuf, header_len);
>  
> diff --git a/fast-import.c b/fast-import.c
> index 6c7c3c9..d0c2502 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -1035,8 +1035,8 @@ static int store_object(
>   git_SHA_CTX c;
>   git_zstream s;
>  
> - hdrlen = sprintf((char *)hdr,"%s %lu", typename(type),
> - (unsigned long)dat->len) + 1;
> + hdrlen = xsnprintf((char *)hdr, sizeof(hdr), "%s %lu",
> +typename(type), (unsigned long)dat->len) + 1;
>   git_SHA1_Init();
>   git_SHA1_Update(, hdr, hdrlen);
>   git_SHA1_Update(, dat->buf, dat->len);
> diff --git a/http-push.c b/http-push.c
> index 154e67b..1f3788f 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -361,7 +361,7 @@ static void start_put(struct transfer_request *request)
>   git_zstream stream;
>  
>   unpacked = read_sha1_file(request->obj->sha1, , );
> - hdrlen = sprintf(hdr, "%s %lu", typename(type), len) + 1;
> + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1;
>  
>   /* Set it up */
>   git_deflate_init(, zlib_compression_level);
> diff --git a/sha1_file.c b/sha1_file.c
> index d295a32..f106091 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1464,7 +1464,7 @@ int check_sha1_signature(const unsigned char *sha1, 
> void *map,
>   return -1;
>  
>   /* Generate the header */
> - hdrlen = sprintf(hdr, "%s %lu", typename(obj_type), size) + 1;
> + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj_type), 
> size) + 1;
>  
>   /* Sha1.. */
>   git_SHA1_Init();
> @@ -2930,7 +2930,7 @@ static void write_sha1_file_prepare(const void *buf, 
> unsigned long len,
>   git_SHA_CTX c;
>  
>   /* Generate the header */
> - *hdrlen = sprintf(hdr, "%s %lu", type, len)+1;
> + *hdrlen = xsnprintf(hdr, *hdrlen, "%s %lu", type, len)+1;
>  
>   /* Sha1.. */
>   git_SHA1_Init();
> @@ -2993,7 +2993,7 @@ int hash_sha1_file(const void *buf, unsigned long len, 
> const char *type,
> unsigned char *sha1)
>  {
>   char hdr[32];
> - int hdrlen;
> + int hdrlen = sizeof(hdr);
>   write_sha1_file_prepare(buf, len, type, sha1, hdr, );
>   return 0;
>  }
> @@ -3139,7 +3139,7 @@ static int freshen_packed_object(const unsigned char 
> *sha1)
>  int write_sha1_file(const void *buf, unsigned long len, const char *type, 
> unsigned char *sha1)
>  {
>   char hdr[32];
> - int hdrlen;
> + int hdrlen = sizeof(hdr);
>  
>   /* 

Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 11:24:27AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Tue, Sep 15, 2015 at 11:19:21PM -0400, Eric Sunshine wrote:
> >
> >> > strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
> >> > -   sprintf(ownbuf[stage], "%o", ce->ce_mode);
> >> > +   xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
> >> > ce->ce_mode);
> >> 
> >> Interesting. I wonder if there are any (old/broken) compilers which
> >> would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead?
> >
> > Good point. I've changed it to sizeof(ownbuf[0]).
> 
> Panda brain is lost here.  What's the difference, other than that we
> will now appear to be measuring the size of the thing at index 0
> while using that size to stuff data into a different location?  All
> elements of the array are of the same size so there wouldn't be any
> difference either way, no?

Correct. The original is sane and gcc does the right thing. The question
is whether some compiler would complain that "stage" is not a constant
in the sizeof() expression. I don't know if any compiler would do so,
but it is easy enough to be conservative.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> This code is exactly replicating strdup, so let's just use
> that. It's shorter, and eliminates some confusion (such as
> whether "p - s" is really enough to hold the result; it is,
> because we write NULs as we shrink "p").
>
> Signed-off-by: Jeff King 
> ---
>  remote.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/remote.c b/remote.c
> index 5ab0f7f..1b69751 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
>   int n = 1000;
>   FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
>   char *s, *p;
> - int len;

Hmm, we would punish those with ridiculously long remote name by
truncating at n but that is OK.

We use buffer[BUFSIZ] to read various things in this file, not just
$GIT_DIR/branches/* files, with fgets(), which may be better done if
we switched to strbuf_getline().  Then we can also use trim family
of calls from the strbuf API suite.

Move to strbuf_getline() may be a doubly attractive proposition,
with a possible change to strbuf_getline() to make it also remove CR
that immediately precedes LF [*1*], helping DOSsy platforms.


[Reference]

*1* http://thread.gmane.org/gmane.comp.version-control.msysgit/21773/focus=21780



>  
>   if (!f)
>   return;
> @@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote)
>   p = s + strlen(s);
>   while (isspace(p[-1]))
>   *--p = 0;
> - len = p - s;
> - p = xmalloc(len + 1);
> - strcpy(p, s);
> + p = xstrdup(s);
>  
>   /*
>* The branches file would have URL and optionally
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 10:06:10AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Wed, Sep 16, 2015 at 10:15:02AM +0200, Johannes Schindelin wrote:
> >
> >> Maybe we should stick to the established practice of the many, many
> >> reentrant POSIX functions following the *_r() naming convention? I.e.
> >> the reentrant version of localtime() is called localtime_r(), the
> >> reentrant version of random() is called random_r() etc.
> >> 
> >> So I could see myself not needing an explanation if I had read
> >> sha1_to_hex_r(...).
> >
> > I like this suggestion. By itself, the "_r" does not communicate as much
> > as "_to" to me, but as long as the reader knows the "_r" idiom, it
> > communicates much more.
> >
> > I'll switch to this unless there is any objection.
> 
> Fine by me.  Thanks.

I started on this, but realized something interesting as I was updating
the docstrings. sha1_to_hex_r is truly reentrant. But
find_unique_abbrev_r is not, as it calls has_sha1_file(), which touches
lots of global data for the object lookup.

I think it's probably OK, as long as I don't make the claim that it is
truly reentrant.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] l10n: de.po: translate 2 messages

2015-09-16 Thread Ralf Thielow
Translate 2 messages came from git.pot update in e447091
(l10n: git.pot: v2.6.0 round 2 (3 improvements)).

Signed-off-by: Ralf Thielow 
---
 po/de.po | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/po/de.po b/po/de.po
index e5b523d..c9b4d16 100644
--- a/po/de.po
+++ b/po/de.po
@@ -10530,9 +10530,8 @@ msgstr ""
 "hash[=]] [--abbrev[=]] [--tags] [--heads] [--] [...] "
 
 #: builtin/show-ref.c:11
-#, fuzzy
 msgid "git show-ref --exclude-existing[=] < "
-msgstr "git show-ref --exclude-existing[=Muster] < ref-list"
+msgstr "git show-ref --exclude-existing[=] < "
 
 #: builtin/show-ref.c:170
 msgid "only show tags (can be combined with heads)"
@@ -10761,9 +10760,8 @@ msgid "replace the tag if exists"
 msgstr "das Tag ersetzen, wenn es existiert"
 
 #: builtin/tag.c:609 builtin/update-ref.c:368
-#, fuzzy
 msgid "create a reflog"
-msgstr "create_reflog"
+msgstr "Reflog erstellen"
 
 #: builtin/tag.c:611
 msgid "Tag listing options"
-- 
2.6.0.rc1.199.g678474c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Feature Request] git blame showing only revisions from git rev-list --first-parent

2015-09-16 Thread Jeff King
On Tue, Sep 15, 2015 at 06:14:26PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > It seems like nobody is actually that interested in what "blame
> > --first-parent --reverse" does in the first place, though, and there's
> > no reason for its complexity to hold up vanilla --first-parent. So what
> > do you think of:
> 
> I like the part that explicitly disables the combination of the two
> ;-)

Meaning you didn't like the other part, or that you'll pick up the patch
as-is? :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html