Re: [PATCH 2/2] refs_resolve_ref_unsafe: handle d/f conflicts for writes

2017-11-04 Thread Michael Haggerty
On 10/07/2017 06:36 AM, Michael Haggerty wrote:
> On 10/06/2017 07:16 PM, Jeff King wrote:
>> On Fri, Oct 06, 2017 at 07:09:10PM +0200, Michael Haggerty wrote:
>>
>>> I do have one twinge of uneasiness at a deeper level, that I haven't had
>>> time to check...
>>>
>>> Does this patch make it easier to *set* HEAD to an unborn branch that
>>> d/f conflicts with an existing reference? If so, that might be a
>>> slightly worse UI for users. I'd rather learn about such a problem when
>>> setting HEAD (when I am thinking about the new branch name and am in the
>>> frame of mind to solve the problem) rather than later, when I try to
>>> commit to the new branch.
>>
>> Good question. The answer is no, it's allowed both before and after my
>> patch. At least via git-symbolic-ref.
>>
>> I agree it would be nice to know earlier for such a case. For
>> symbolic-ref, we probably should allow it, because it's plumbing that
>> may be used for tricky things. For things like "checkout -b", you'd
>> generally get a timely warning as we try to create the ref.
>>
>> The odd man out is "checkout --orphan", which leaves the branch unborn.
>> It might be nice if it did a manual check that the ref is available (and
>> also that it's syntactically acceptable, though I think we may do that
>> already).
>>
>> But all of that is orthogonal to this fix, I think.
> 
> Thanks for checking. Yes, I totally agree that this is orthogonal.

I also just checked but there don't seem to be any docstrings that need
updating.

Reviewed-by: Michael Haggerty 

(both patches in this series).

Michael






Re: [PATCH 1/3] notes: move hex_to_bytes() to hex.c and export it

2017-11-04 Thread Kevin Daudt
On Tue, Oct 31, 2017 at 02:46:49PM +0100, René Scharfe wrote:
> Make the function for converting pairs of hexadecimal digits to binary
> available to other call sites.
> 
> Signed-off-by: Rene Scharfe 
> ---
>  cache.h |  7 +++
>  hex.c   | 12 
>  notes.c | 17 -
>  3 files changed, 19 insertions(+), 17 deletions(-)
> 
> diff --git a/cache.h b/cache.h
> index 6440e2bf21..f06bfbaf32 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1317,6 +1317,13 @@ extern int set_disambiguate_hint_config(const char 
> *var, const char *value);
>  extern int get_sha1_hex(const char *hex, unsigned char *sha1);
>  extern int get_oid_hex(const char *hex, struct object_id *sha1);
>  
> +/*
> + * Read `len` pairs of hexadecimal digits from `hex` and write the
> + * values to `binary` as `len` bytes. Return 0 on success, or -1 if

Is it correct to call the result binary? I would say that it's the value
that gets stored. To me, this value does not really have a base.

Kevin


Re: git grep -P fatal: pcre_exec failed with error code -8

2017-11-04 Thread Jeff King
On Sun, Nov 05, 2017 at 01:06:21AM +0100, Дилян Палаузов wrote:

> with git 2.14.3 linked with libpcre.so.1.2.9 when I do:
>   git clone https://github.com/django/django
>   cd django
>   git grep -P "if.*([^\s])+\s+and\s+\1"
> django/contrib/admin/static/admin/js/vendor/select2/select2.full.min.js
> the output is:
>   fatal: pcre_exec failed with error code -8

Code -8 is PCRE_ERROR_MATCHLIMIT. And "man pcreapi" has this to say:

  The match_limit field provides a means of preventing PCRE from
  using up a vast amount of resources when running patterns that
  are not going to match, but which have a very large number of
  possibilities in their search trees. The classic example is a
  pattern that uses nested unlimited repeats.

  Internally, pcre_exec() uses a function called match(), which
  it calls repeatedly (sometimes recursively). The limit set by
  match_limit is imposed on the number of times this function is
  called during a match, which has the effect of limiting the
  amount of backtracking that can take place. For patterns that
  are not anchored, the count restarts from zero for each posi‐
  tion in the subject string.

  When pcre_exec() is called with a pattern that was successfully
  studied with a JIT option, the way that the matching is exe‐
  cuted is entirely different. However, there is still the pos‐
  sibility of runaway matching that goes on for a very long time,
  and so the match_limit value is also used in this case (but in
  a different way) to limit how long the matching can continue.

  The default value for the limit can be set when PCRE is built;
  the default default is 10 million, which handles all but the
  most extreme cases. You can override the default by suppling
  pcre_exec() with a pcre_extra block in which match_limit is
  set, and PCRE_EXTRA_MATCH_LIMIT is set in the flags field. If
  the limit is exceeded, pcre_exec() returns PCRE_ERROR_MATCH‐
  LIMIT.

So your pattern is just really expensive and is running afoul of pcre's
backtracking limits (and it's not helped by the fact that the file is
basically one giant line).

There's no way to ask Git to specify a larger match_limit to pcre, but
you might be able to construct your pattern in a way that involves less
backtracking. It looks like you're trying to find things like "if foo
and foo"?

Should the captured term actually be "([^\s]+)" (with the "+" on the
_inside_ of the capture? Or maybe I'm just misunderstanding your goal.

-Peff


Re: [PATCH 1/2] sequencer: factor out rewrite_file()

2017-11-04 Thread Jeff King
On Sat, Nov 04, 2017 at 07:36:43PM +0100, Simon Ruderich wrote:

> On Fri, Nov 03, 2017 at 03:13:10PM -0400, Jeff King wrote:
> > I think we've been gravitating towards error strbufs, which would make
> > it something like:
> 
> I like this approach to store the error in a separate variable
> and let the caller handle it. This provides proper error messages
> and is cleaner than printing the error on the error site (what
> error_errno does).
> 
> However I wouldn't use strbuf directly and instead add a new
> struct error which provides a small set of helper functions.
> Using a separate type also makes it clear to the reader that is
> not a normal string and is more extendable in the future.

Yes, I think what you've written here (and below) is quite close to the
error_context patches I linked elsewhere in the thread. In other
words, I think it's a sane approach.

> We could also store the error condition in the error struct and
> don't use the return value to indicate and error like this:
> 
> void write_file_buf(const char *path, const char *buf, size_t len)
> {
> struct error err = ERROR_INIT;
> write_file_buf_gently2(path, buf, len, );
> if (err.error)
> error_die();
> }

I agree it might be nice for the error context to have a positive "there
was an error" flag. It's probably worth making it redundant with the
return code, though, so callers can use whichever style is most
convenient for them.

> > OTOH, if we went all-in on flexible error handling contexts, you could
> > imagine this function becoming:
> >
> >   void write_file_buf(const char *path, const char *buf, size_t len,
> >   struct error_context *err)
> >   {
> > int fd = xopen(path, err, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> > if (fd < 0)
> > return -1;
> > if (write_in_full(fd, buf, len, err) < 0)
> > return -1;
> > if (xclose(fd, err) < 0)
> > return -1;
> > return 0;
> >   }
> 
> This looks interesting as well, but it misses the feature of
> custom error messages which is really useful.

Right, I didn't think that example through. The functions after the
open() don't have enough information to make a good message.

-Peff


Re: [PATCH 5/6] t0021/rot13-filter: add capability functions

2017-11-04 Thread Junio C Hamano
Christian Couder  writes:

>>> + my ( $res, $buf ) = packet_bin_read();
>>> + return ( $res, @cap ) if ( $res != 0 );
>>
>> The original had the same "'list eq list' does not do what you may
>> think it does" issue.  This one corrects it, which is good.
>>
>> I am not sure if ($res != 0) is correct though.  What should happen
>> when you get an unexpected EOF at this point?  The original would
>> have died; this ignores and continues.
>
> Well if there is an unexpected EOF, then packet_bin_read() returns
> (-1, ""), so packet_read_capabilities() returns (-1, @cap) where @cap
> contains the capabilities already received. Then
> packet_read_and_check_capabilities() checks that we received all the
> capabilities we expect and dies if that is not the case. If we did
> receive all the capabilities, then
> packet_read_and_check_capabilities() still returns -1, so the caller
> may check that and die.

In other words, it happens, by accident, to stop before going very
far.  I think we'd be better off making it an explicitly checked
error.

> But yeah we could also just die in packet_read_capabilities() if $res
> is -1. I will make this change.

Good.

Thanks.


Re: [PATCH v1 2/2] log: add option to choose which refs to decorate

2017-11-04 Thread Junio C Hamano
Rafael Ascensão  writes:

>>> The pattern follows similar rules as `--glob` except it doesn't assume a
>>> trailing '/*' if glob characters are missing.
>>
>> Why should this be a special case that burdens users to remember one
>> more rule?  Wouldn't users find "--decorate-refs=refs/tags" useful
>> and it woulld be shorter and nicer than having to say "refs/tags/*"?
>
> I wanted to allow exact patterns like:
> "--decorate-refs=refs/heads/master" and for that I disabled the flag
> that adds the trailing '/*' if no globs are found. As a side effect, I
> lost the shortcut.
>
> Is adding a yet another flag that appends '/*' only if the pattern
> equals "refs/{heads,remotes,tags}" a good idea?

No.

> Because changing the default behavior of that function has
> implications on multiple commands which I think shouldn't change. But
> at the same time, would be nice to have the logic that deals with
> glob-ref patterns all in one place.
>
> What's the sane way to do this?

Learn to type "--decorate-refs="refs/heads/[m]aster", and not twewak
the code at all, perhaps.  The users of existing "with no globbing,
/* is appended" interface are already used to that way and they do
not have to learn a new and inconsistent interface.

After all, "I only want to see 'git log' output with 'master'
decorated" (i.e. not specifying "this class of refs I can glob by
using the naming convention I am using" and instead enumerating the
ones you care about) does not sound like a sensible thing people
often want to do, so making it follow the other codepath so that
people can say "refs/tags" to get "refs/tags/*", while still allowing
such a rare but specific and exact one possible, may not sound too
bad to me.



Re: [PATCH] l10n: es.po: fix typos

2017-11-04 Thread Christopher Díaz Riveros
On Sat, Nov 04, 2017 at 08:20:44PM -0300, Gaston Gonzalez wrote:

Thank you, applied to my personal 2.15-next branch, I'll work on other fixes and
send a bigger chunk to main repo.

Regards


git grep -P fatal: pcre_exec failed with error code -8

2017-11-04 Thread Дилян Палаузов

Hello,

with git 2.14.3 linked with libpcre.so.1.2.9 when I do:
  git clone https://github.com/django/django
  cd django
  git grep -P "if.*([^\s])+\s+and\s+\1" 
django/contrib/admin/static/admin/js/vendor/select2/select2.full.min.js

the output is:
  fatal: pcre_exec failed with error code -8


(But not with
git clone https://github.com/select2/select2
cd select2
git grep -P "if.*([^\s])+\s+and\s+\1" dist/js/select2.full.min.js

as the two select2.full.min.js files differ slightly in their size)

Any ideas?

Regards
  Dilian


[PATCH] l10n: es.po: fix typos

2017-11-04 Thread Gaston Gonzalez
Fix some typos in the spanish translation.

Signed-off-by: Gaston Gonzalez 
---
 po/es.po | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/po/es.po b/po/es.po
index 89a2dc014..43251cbc9 100644
--- a/po/es.po
+++ b/po/es.po
@@ -4746,7 +4746,7 @@ msgstr ""
 #: wt-status.c:1668
 #, c-format
 msgid "nothing to commit, working tree clean\n"
-msgstr "nada para hacer comit, el arbol de trabajo esta limpio\n"
+msgstr "nada para hacer commit, el arbol de trabajo esta limpio\n"
 
 #: wt-status.c:1780
 msgid "No commits yet on "
@@ -15518,17 +15518,17 @@ msgid ""
 "\n"
 "  git rebase --continue\n"
 msgstr ""
-"TIene cambios en el area de stage de su arbol de trabajo.\n"
+"Tiene cambios en el area de stage de su arbol de trabajo.\n"
 "Si estos cambios estan destinados a \n"
 "ser aplastados en el commit previo, ejecute:\n"
 "\n"
 "  git commit --amend $gpg_sign_opt_quoted\n"
 "\n"
-"Si estos estan destinados a ir en un nuevo comit, ejecute:\n"
+"Si estos estan destinados a ir en un nuevo commit, ejecute:\n"
 "\n"
 "  git commit $gpg_sign_opt_quoted\n"
 "\n"
-"En ambos casos, cuando termite, continue con:\n"
+"En ambos casos, cuando termine, continue con:\n"
 "\n"
 "  git rebase --continue\n"
 
-- 
2.15.0



Re: [PATCH v1 1/2] refs: extract function to normalize partial refs

2017-11-04 Thread Kevin Daudt
On Sat, Nov 04, 2017 at 11:27:39AM +0900, Junio C Hamano wrote:
> I however notice that addition of /* to the tail is trying to be
> careful by using strbuf_complete('/'), but prefixing with "refs/"
> does not and we would end up with a double-slash if pattern begins
> with a slash.  The contract between the caller of this function (or
> its original, which is for_each_glob_ref_in()) and the callee is
> that prefix must not begin with '/', so it may be OK, but we might
> want to add "if (*pattern == '/') BUG(...)" at the beginning.
>
> I dunno.  In any case, that is totally outside the scope of this two
> patch series.

I do think it's a good idea to make future readers of the code aware of
this contract, and adding a BUG assert does that quite well. Here is a
patch that implements it.

This applies of course on top of this patch series.

-- >8 --
Subject: [PATCH] normalize_glob_ref: assert implicit contract of prefix

normalize_glob_ref has an implicit contract of expecting 'prefix' to not
start with a '/', otherwise the pattern would end up with a
double-slash.

Mark it as a BUG when the prefix argument of normalize_glob_ref starts
with a '/' so that future callers will be aware of this contract.

Signed-off-by: Kevin Daudt 
---
 refs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/refs.c b/refs.c
index e9ae659ae..6747981d1 100644
--- a/refs.c
+++ b/refs.c
@@ -372,6 +372,8 @@ int head_ref_namespaced(each_ref_fn fn, void *cb_data)
 void normalize_glob_ref(struct strbuf *normalized_pattern, const char *prefix,
const char *pattern, int flags)
 {
+   if (prefix && *prefix == '/') BUG("prefix cannot not start with '/'");
+
if (!prefix && !starts_with(pattern, "refs/"))
strbuf_addstr(normalized_pattern, "refs/");
else if (prefix)
-- 
2.15.0.rc2.57.g2f899857a9



Re: [PATCH 6/7] builtin/describe.c: describe a blob

2017-11-04 Thread Philip Oakley

From: "Junio C Hamano" 
Sent: Thursday, November 02, 2017 4:23 AM

Junio C Hamano  writes:


The reason why we say "-ish" is "Yes we know v2.15.0 is *NOT* a
commit object, we very well know it is a tag object, but because we
allow it to be used in a context that calls for a commit object, we
mark that use context as 'this accepts commit-ish, not just
commit'".


Having said all that, there is a valid case in which we might want
to say "blob-ish".


Is this not also an alternative case, relative to the user, for the scenario 
where the user has an oid/sha1 value but does not know what it is, and would 
like to find its source and type relative to the `describe` command.


IIUC the existing `describe` command only accepts  values, and 
here we are extending that to be even more inclusive, but at the same time 
the options become more restricted. Thus the synopsis terminology would be 
more about suggesting the range of options available (search style/start 
points) that are applicable to blobs, than being exactly about the 
'allow-blobs' parameter.


Or have I misunderstood how the fast commit search and the slower 
potentially-a-blob searching are disambiguated?

--
Philip



To review, X-ish is the word we use when the command wants to take
an X, but tolerates a lazy user who gives a Y, which is *NOT* X,
without bothering to add ^{X} suffix, i.e. Y^{X}.  In such a case,
the command takes not just X but takes X-ish because it takes a Y
and converts it internally to an X to be extra nice.

When the command wants to take a blob, but tolerates something else
and does "^{blob}" internally, we can say it takes "blob-ish".
Technically that "something else" could be an annotated tag that
points at a blob object, without any intervening commit or tree (I
did not check if the "describe " code in this thread handles
this, though).

But because it is not usually done to tag a blob directly, it would
probably be not worth to say "blob-ish" in the document and cause
readers to wonder in what situation something that is not a blob can
be treated as if it were a blob.  It does feel like we would be
pursuing technical correctness too much and sacrificing the readability
of the document, at least to me, and a bad trade-off.






Re: How to resolve mixture of modified and deleted conflicts easily in git?

2017-11-04 Thread Philip Oakley

From: "Robert Dailey" 

When doing a rebase, sometimes I will get `DU` and `UU` conflicts
(locally deleted and locally modified, respectively). Furthermore, in
some of these cases, I want to take "ours" for all conflicts,
including ones where the local file is deleted. Ideally, it's just one
command:

   $ git checkout --ours .

However, this fails for the locally deleted conflicts:

   error: path 'foo.xml' does not have our version

Even more annoyingly, the fact that these failures occur prevents the
`UU` conflicts from being resolved. The whole operation fails
atomically. I am not aware of a straightforward and uniform way to
resolve conflicts with "ours" during a rebase when locally deleted
files exist in the list of conflicts. What is the most elegant
solution in this situation?

I'm running Git for Windows v2.13.1.


Can you give the example commands and response?

In particular note the comment in the rebase command that the ours/theirs 
viewpoints are swapped, so you may find your stated "ours" should be (mid 
rebase) seen as "theirs".

--
Philip 



Re: [PATCH 1/2] sequencer: factor out rewrite_file()

2017-11-04 Thread Simon Ruderich
On Fri, Nov 03, 2017 at 03:13:10PM -0400, Jeff King wrote:
> I think we've been gravitating towards error strbufs, which would make
> it something like:

I like this approach to store the error in a separate variable
and let the caller handle it. This provides proper error messages
and is cleaner than printing the error on the error site (what
error_errno does).

However I wouldn't use strbuf directly and instead add a new
struct error which provides a small set of helper functions.
Using a separate type also makes it clear to the reader that is
not a normal string and is more extendable in the future.

> I'm not excited that the amount of error-handling code is now double the
> amount of code that actually does something useful. Maybe this function
> simply isn't large/complex enough to merit flexible error handling, and
> we should simply go with René's original near-duplicate.

A separate struct (and helper functions) would help in this case
and could look like this, which is almost equal (in code size) to
the original solution using error_errno:

int write_file_buf_gently2(const char *path, const char *buf, size_t len, 
struct error *err)
{
int rc = 0;
int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
if (fd < 0)
return error_addf_errno(err, _("could not open '%s' for 
writing"), path);
if (write_in_full(fd, buf, len) < 0)
rc = error_addf_errno(err, _("could not write to '%s'"), 
path);
if (close(fd) && !rc)
rc = error_addf_errno(err, _("could not close '%s'"), path);
return rc;
}

(I didn't touch write_in_full here, but it could also take the
err and then the code would get a little shorter, however would
lose the "path" information, but see below.)

And in the caller:

void write_file_buf(const char *path, const char *buf, size_t len)
{
struct error err = ERROR_INIT;
if (write_file_buf_gently2(path, buf, len, ) < 0)
error_die();
}

For now struct error just contains the strbuf, but one could add
the call location (by using a macro for error_addf_errno) or the
original errno or more information in the future.

error_addf_errno() could also prepend the error the buffer so
that the caller can add more information if necessary and we get
something like: "failed to write file 'foo': write failed: errno
text" in the write_file_buf case (the first error string is from
write_file_buf_gently2, the second from write_in_full). However
I'm not sure how well this works with translations.

We could also store the error condition in the error struct and
don't use the return value to indicate and error like this:

void write_file_buf(const char *path, const char *buf, size_t len)
{
struct error err = ERROR_INIT;
write_file_buf_gently2(path, buf, len, );
if (err.error)
error_die();
}

> OTOH, if we went all-in on flexible error handling contexts, you could
> imagine this function becoming:
>
>   void write_file_buf(const char *path, const char *buf, size_t len,
>   struct error_context *err)
>   {
>   int fd = xopen(path, err, O_WRONLY | O_CREAT | O_TRUNC, 0666);
>   if (fd < 0)
>   return -1;
>   if (write_in_full(fd, buf, len, err) < 0)
>   return -1;
>   if (xclose(fd, err) < 0)
>   return -1;
>   return 0;
>   }

This looks interesting as well, but it misses the feature of
custom error messages which is really useful.

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


[PATCH] Git-cvsimport Improvement

2017-11-04 Thread patrick keshishian
Greetings,

I am attempting to improve CVS -> CVSps -> Git-cvsimport process.

The part involving Git-cvsimport has to do with parsing of CVSps
PatchSet file. Consider what happens if a CVS log/commit message
includes lines which start with "Members:", say from copy-and-paste
[2].

To avoid this issue, I have proposed that CVSps append the "Log:" tag
with line count of original CVS log/commit message [1].


The idea is if line-count is found after "Log:", that many (CVS log
message) lines get consumed before advancing $state to look for
"^Members:"

Current Git-cvsimport isn't strict in matching the "Log:" tag
(fortunately) and my proposed change to Git-cvsimport should be fully
backward compatible.

See attached patch.

Cheers,
--patrick

p.s., For reference: Why I'm doing this and RFC sent to CVS list:
http://lists.nongnu.org/archive/html/info-cvs/2017-11/msg0.html

[1] https://github.com/andreyvit/cvsps/pull/4

[2]  Example PatchSet with "Members:" line in original CVS commit message:
-
PatchSet 3
Date: 2017/10/30 23:25:20
Author: catbert
Branch: HEAD
Tag: (none)
Log:
This will confuse git-cvsimport's parser

Members:
somefile.c:1.1->1.2
another.h:1.7->1.8
foo.mk:1.22->1.23

Imagine these were lines pasted to note something

Members:
ABC:1.1->1.2
commit c3e406c54b8cd3a2bbf0aa729fef201e20fa6df5
Author: patrick keshishian 
Date:   Sat Nov 4 08:42:12 2017 -0700

Optionally parse line count out of PatchSets with "Log: count"

This is a change being suggested to CVSps where the line count of the
commit message gets added to the "Log:" tag to help Git cvsimport not
get confused if the CVS log/commit message included lines starting with
any of the tags found in CVSps PatchSet, e.g., Members:

This is part of a larger change to make CVS to Git import more robust.

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 36929921e..5d78c5e87 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -786,6 +786,13 @@ open(CVS, "<$cvspsfile") or die $!;
 #
 #-
 
+# NOTE:
+## pk, 2017/10/30
+# patched cvsps will output ^Log: line with number of lines of log
+# which are to follow. This makes parsing robust for cases where the
+# log message contains ^Members: lines! Happens in OpenBSD sources:
+# e.g., See src/usr.sbin/bgpd/rde.c
+
 my $state = 0;
 
 sub update_index (\@\@) {
@@ -816,7 +823,7 @@ sub write_tree () {
return $tree;
 }
 
-my 
($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg);
+my 
($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg,$loglines);
 my (@old,@new,@skipped,%ignorebranch,@commit_revisions);
 
 # commits that cvsps cannot place anywhere...
@@ -1005,8 +1012,13 @@ while () {
$tag = $_;
}
$state = 7;
-   } elsif ($state == 7 and /^Log:/) {
+   } elsif ($state == 7 and /^Log:\s*(\d+)?$/) {
+   $loglines = $1 // -1;
$logmsg = "";
+   while ($loglines-- > 0 && ($_ = )) {
+   chomp;
+   $logmsg .= "$_\n";
+   }
$state = 8;
} elsif ($state == 8 and /^Members:/) {
$branch = $opt_o if $branch eq "HEAD";


Re: Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified

2017-11-04 Thread Orgad Shaneh
On Fri, Nov 3, 2017 at 6:20 PM, Johannes Schindelin
 wrote:
> Hi Orgad,
>
> On Fri, 3 Nov 2017, Johannes Schindelin wrote:
>
>> On Thu, 2 Nov 2017, Orgad Shaneh wrote:
>>
>> > I can't reproduce this with a minimal example, but it happens in my 
>> > project.
>
> Whoa, I somehow overlooked the "can't". Sorry.
>
> I inserted a `git diff-files` here, and it printed exactly what I
> expected:
>
> ++ git diff-files
> :16 16 62cab94c8d8cf047bbb60c12def559339300efa4 
>  M  sub
>
>> + git rebase -i HEAD^^
>> + )
>> +'
>
> There must be something else going wrong that we did not replicate here.
> Maybe the `error: cannot rebase: You have unstaged changes.` message was
> caused not by a change in the submodule? Could you run `git diff-files`
> before the rebase?

It's the same before and during the rebase:
$ git diff-files
:16 16 c840225a7cf6bb2ec64da9d35d2c29210bc5e5e8
 M  sub


>
> This does *not* refresh the index, but maybe that is what is going wrong;
> you could call `git update-index --refresh` before the rebase and see
> whether that works around the issue?

Nope.

If I run git submodule update, then rebase --continue works fine, so
it's definitely somehow caused by the submodule.

- Orgad


Re: Git Open Source Weekend London 11th/12th November

2017-11-04 Thread Philip Oakley

From: "Jeff King" 

On Wed, Nov 01, 2017 at 04:36:24PM +, Thomas Gummerer wrote:


Normally attendees work in small groups on a specific task to
prevent anyone from getting stuck. Per usual, Bloomberg will
provide the venue, mentors, snacks and drinks.  Bring your
enthusiasm (and your laptop!) and come share in the fun!  The
event is also open to everyone, so feel free to pass on the
invite!


I think it will help if the experienced members of the community (both
those who will be at the event and not) can come up with some possible
topics to work on (though of course I'd be glad for participants to come
with their own itch to scratch).

We've started using the #leftoverbits tag to allow searching in the
archive:

 https://public-inbox.org/git/?q=leftoverbits

Some of those have since been completed, but others are left open.
There's not really a master list, but it's a potential source for
digging for gold (well, if you want to call leftover bugs gold :) ).

I started a list over the summer of larger items that people might want
to pick up. Here it is in no particular order:

- the pager. config is mis-designed, because our config keys
  cannot represent all possible command names (e.g., case folding and
  illegal characters). This should be pager..enable or similar.
  Some discussion in (this message and the surrounding thread):


https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/

  But I think you could find more by searching the archive.

- ditto for alias.* config, which has the same syntax problems.

- auto-gc is sometimes over-anxious to run if you have a lot of
  unreachable loose objects. We should pack unreachables into a single
  pack. That solves the --auto problem, and is also way more efficient.
  The catch is expiration. Some discussion here (and especially
  down-thread):


https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/

- git-config's "--get-color" is unlike all the other types in that it
  takes a "default" value if the config key isn't set. This makes it
annoyingly
  inconsistent, but there's also no way to ask Git to interpret other
  values (e.g., you might want it to expand "--path" or an "--int"). It
  would be nice to have a general "--default" option so you could do:

# same as git config --get-color color.diff.old red
git config --default red --color color.diff.old

  or

# not currently possible to ask git to interpret "10k"
git config --default 10k --int some.integer.key

- git's internal config can parse expiration dates (see
  parse_expiry_date()), but you can't do so from git-config. It should
  probably have a type for "--expiry-date" (which would of course be
  more useful with the --default option above).

- there's no efficient way to ask git-config for several keys with a
  specific type (or even multiple different types).  You can use
  "--list" to get their strings, but you can't get any interpretation
  (like colors, integers, etc). Invoking git-config many times can have
  a noticeable speed impact for a script. There should probably be a
  "--stdin" mode (or maybe "--get-stdin" if we would one day want to
  have a "--set-stdin") that takes a list of keys, optional types, and
  optional defaults (that "--default" again!) and outputs them to
  stdout.


Those were just sitting on my ideas list. I'm happy to go into more
detail if anybody's interested in discussing any of them. Some of them
may be half-baked.

And of course I'd be happy if people wanted to contribute more items to
the list.



A few I've seen recently are:

* The provison of a `git resolve -X  -- ` command to
simplify the manual resolution of remaining conflicts.
https://public-inbox.org/git/8737615iu5@javad.com/  Sergey Organov: How
to re-merge paths differently?

* (Git for Windows/HFS): Detect directory capitalisation changes when
switching branches, and rename them correctly on case preserving, case
insensitive file systems. Optimisation: If the underlying tree is identical
then do not update the modified dates.
https://github.com/git-for-windows/git/issues/1333 Chuck Lu: Folder name
should be case sensitive when switch branches.

* (Git for Windows/HFS) (long standing):
detect branch name capitalisation issues
- may need a struct to carry both the filename and pathname down the
different parts of the code base so that the FS name of the requested
ref/heads/ can be checked and warned.
e.g. https://github.com/git-for-windows/git/issues/852 "`git checkout head`
inconsistent behavior"
- ('head' finds 'HEAD', but also 'branch' finds 'Branch');
https://github.com/git-for-windows/git/issues/852#issuecomment-239675187 ->
"rambling notes" for partial analysis.

https://github.com/git-for-windows/git/issues/752 "git checkout creating
tracking branch using case-insensitive match?"
- Is this part of the same problem?

* Documentation:

There's always the newby role at the hackathon of 

[PATCH] fix typos in 2.15.0 release notes

2017-11-04 Thread Jean Carlo Machado
Signed-off-by: Jean Carlo Machado 
---
 Documentation/RelNotes/2.15.0.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/RelNotes/2.15.0.txt 
b/Documentation/RelNotes/2.15.0.txt
index 248ba70c3..cdd761bcc 100644
--- a/Documentation/RelNotes/2.15.0.txt
+++ b/Documentation/RelNotes/2.15.0.txt
@@ -65,7 +65,7 @@ UI, Workflows & Features
learned to take the 'unfold' and 'only' modifiers to normalize its
output, e.g. "git log --format=%(trailers:only,unfold)".
 
- * "gitweb" shows a link to visit the 'raw' contents of blbos in the
+ * "gitweb" shows a link to visit the 'raw' contents of blobs in the
history overview page.
 
  * "[gc] rerereResolved = 5.days" used to be invalid, as the variable
@@ -109,13 +109,13 @@ Performance, Internal Implementation, Development Support 
etc.
  * Conversion from uchar[20] to struct object_id continues.
 
  * Start using selected c99 constructs in small, stable and
-   essentialpart of the system to catch people who care about
+   essential part of the system to catch people who care about
older compilers that do not grok them.
 
  * The filter-process interface learned to allow a process with long
latency give a "delayed" response.
 
- * Many uses of comparision callback function the hashmap API uses
+ * Many uses of comparison callback function the hashmap API uses
cast the callback function type when registering it to
hashmap_init(), which defeats the compile time type checking when
the callback interface changes (e.g. gaining more parameters).
-- 
2.15.0



[no subject]

2017-11-04 Thread Jean Carlo Machado

I hope it's right this time. :)


Re: [PATCH 3/4] remote-mediawiki: show known namespace choices on failure

2017-11-04 Thread Thomas Adam
On Sun, Oct 29, 2017 at 12:08:56PM -0400, Antoine Beaupré wrote:
> if we fail to find a requested namespace, we should tell the user
> which ones we know about, since we already do. this allows users to
> feetch all namespaces by specifying a dummy namespace, failing, then
> copying the list of namespaces in the config.
> 
> eventually, we should have a flag that allows fetching all namespaces
> automatically.
> 
> Reviewed-by: Antoine Beaupré 
> Signed-off-by: Antoine Beaupré 
> ---
>  contrib/mw-to-git/git-remote-mediawiki.perl | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
> b/contrib/mw-to-git/git-remote-mediawiki.perl
> index fc48846a1..07cc74bac 100755
> --- a/contrib/mw-to-git/git-remote-mediawiki.perl
> +++ b/contrib/mw-to-git/git-remote-mediawiki.perl
> @@ -1334,7 +1334,9 @@ sub get_mw_namespace_id {
>   my $id;
>  
>   if (!defined $ns) {
> - print {*STDERR} "No such namespace ${name} on MediaWiki.\n";
> + my @namespaces = sort keys %namespace_id;
> + for (@namespaces) { s/ /_/g; }

I am sure we can improve upon the need to process @namespaces twice:

my @namespaces = map { s/ /_/g; $_; } sort keys %namespaces_id;

-- Thomas Adam


Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-04 Thread Thomas Adam
On Thu, Nov 02, 2017 at 07:10:17PM -0400, Antoine Beaupré wrote:
> Actually, is there a standard way to do this in git with Perl
> extensions? I know about "option verbosity N" but how should I translate
> this into Perl? Carp? Warn? Log::Any? Log4perl?

No, not really.  From a quick glance at some of the existing perl code in git,
a lot of it continues to use "print STDERR" -- but then to be fair, a lot of
the perl code also reads like it has been written by C programmers...

While there's nothing wrong with using "print STDERR", it's probably wiser to
transition this to using Carp in the long run -- it would decrease the
round-trip time to debugging should there be a situation where that was
needed, and hence I would recommend using "warn" for less-severe
errors/debugging.

-- Thomas Adam


Re: [PATCH 1/2] sequencer: factor out rewrite_file()

2017-11-04 Thread Jeff King
On Sat, Nov 04, 2017 at 10:05:43AM +0100, René Scharfe wrote:

> >> How about *not* printing the error at the place where you notice the
> >> error, and instead return an error code to the caller to be noticed
> >> which dies with an error message?
> > 
> > That ends up giving less-specific errors.
> 
> Not necessarily.  Function could return different codes for different
> errors, e.g. -1 for an open(2) error and -2 for a write(2) error, and
> the caller could use that to select the message to show.
> 
> Basically all of the messages in wrapper.c consist of some text mixed
> with the affected path path and a strerror(3) string, so they're
> compatible in that way.  A single function (get_path_error_format()?)
> could thus be used and callers would be able to combine its result with
> die(), error(), or warning().

I think we've had this discussion before, a while ago. Yes, returning an
integer error code is nice because you don't have pass in an extra
parameter. But I think there are two pitfalls:

  1. Integers may not be descriptive enough to cover all cases, which is
 how we ended up with the strbuf-passing strategy in the ref code.
 Certainly you could add an integer for every possible bespoke
 message, but then I'm not sure it's buying that much over having
 the function simply fill in a strbuf.

  2. For complex functions there may be multiple errors that need to
 stack. I think the refs code has cases like this, where a syscall
 fails, which causes a fundamental ref operation to fail, which
 causes a higher-level operation to fail. It's only the caller of
 the higher-level operation that knows how to report the error.

Certainly an integer error code would work for _this_ function, but I'd
rather see us grow towards consistent error handling.

-Peff


Re: Git Open Source Weekend London 11th/12th November

2017-11-04 Thread Jeff King
On Wed, Nov 01, 2017 at 04:36:24PM +, Thomas Gummerer wrote:

> Normally attendees work in small groups on a specific task to
> prevent anyone from getting stuck. Per usual, Bloomberg will
> provide the venue, mentors, snacks and drinks.  Bring your
> enthusiasm (and your laptop!) and come share in the fun!  The
> event is also open to everyone, so feel free to pass on the
> invite!

I think it will help if the experienced members of the community (both
those who will be at the event and not) can come up with some possible
topics to work on (though of course I'd be glad for participants to come
with their own itch to scratch).

We've started using the #leftoverbits tag to allow searching in the
archive:

  https://public-inbox.org/git/?q=leftoverbits

Some of those have since been completed, but others are left open.
There's not really a master list, but it's a potential source for
digging for gold (well, if you want to call leftover bugs gold :) ).

I started a list over the summer of larger items that people might want
to pick up. Here it is in no particular order:

 - the pager. config is mis-designed, because our config keys
   cannot represent all possible command names (e.g., case folding and
   illegal characters). This should be pager..enable or similar.
   Some discussion in (this message and the surrounding thread):

 
https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/

   But I think you could find more by searching the archive.

 - ditto for alias.* config, which has the same syntax problems.

 - auto-gc is sometimes over-anxious to run if you have a lot of
   unreachable loose objects. We should pack unreachables into a single
   pack. That solves the --auto problem, and is also way more efficient.
   The catch is expiration. Some discussion here (and especially
   down-thread):

 
https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/

 - git-config's "--get-color" is unlike all the other types in that it
   takes a "default" value if the config key isn't set. This makes it annoyingly
   inconsistent, but there's also no way to ask Git to interpret other
   values (e.g., you might want it to expand "--path" or an "--int"). It
   would be nice to have a general "--default" option so you could do:

 # same as git config --get-color color.diff.old red
 git config --default red --color color.diff.old

   or

 # not currently possible to ask git to interpret "10k"
 git config --default 10k --int some.integer.key

 - git's internal config can parse expiration dates (see
   parse_expiry_date()), but you can't do so from git-config. It should
   probably have a type for "--expiry-date" (which would of course be
   more useful with the --default option above).

 - there's no efficient way to ask git-config for several keys with a
   specific type (or even multiple different types).  You can use
   "--list" to get their strings, but you can't get any interpretation
   (like colors, integers, etc). Invoking git-config many times can have
   a noticeable speed impact for a script. There should probably be a
   "--stdin" mode (or maybe "--get-stdin" if we would one day want to
   have a "--set-stdin") that takes a list of keys, optional types, and
   optional defaults (that "--default" again!) and outputs them to
   stdout.


Those were just sitting on my ideas list. I'm happy to go into more
detail if anybody's interested in discussing any of them. Some of them
may be half-baked.

And of course I'd be happy if people wanted to contribute more items to
the list.

-Peff


Re: [PATCH 1/2] sequencer: factor out rewrite_file()

2017-11-04 Thread René Scharfe
Am 03.11.2017 um 20:13 schrieb Jeff King:
> On Fri, Nov 03, 2017 at 10:44:08PM +0900, Junio C Hamano wrote:
> 
>> Simon Ruderich  writes:
>>
>>> I tried looking into this by adding a new write_file_buf_gently()
>>> (or maybe renaming write_file_buf to write_file_buf_or_die) and
>>> using it from write_file_buf() but I don't know the proper way to
>>> handle the error-case in write_file_buf(). Just calling
>>> die("write_file_buf") feels ugly, as the real error was already
>>> printed on screen by error_errno() and I didn't find any function
>>> to just exit without writing a message (which still respects
>>> die_routine). Suggestions welcome.
>>
>> How about *not* printing the error at the place where you notice the
>> error, and instead return an error code to the caller to be noticed
>> which dies with an error message?
> 
> That ends up giving less-specific errors.

Not necessarily.  Function could return different codes for different
errors, e.g. -1 for an open(2) error and -2 for a write(2) error, and
the caller could use that to select the message to show.

Basically all of the messages in wrapper.c consist of some text mixed
with the affected path path and a strerror(3) string, so they're
compatible in that way.  A single function (get_path_error_format()?)
could thus be used and callers would be able to combine its result with
die(), error(), or warning().

René


Re: [PATCH 2/3] http-push: use hex_to_bytes()

2017-11-04 Thread René Scharfe
Am 01.11.2017 um 23:15 schrieb Jeff King:
> On Wed, Nov 01, 2017 at 10:59:49PM +0100, René Scharfe wrote:
> 
>>> The hex_to_bytes() function requires that the caller make sure they have
>>> the right number of bytes. But for many callers, I think they'd want to
>>> say "parse this oid, which might be truncated; I can't tell what the
>>> length is supposed to be".
>>
>> I'm confused by the word "many".  After this series there are three
>> callers of hex_to_bytes() and I don't expect that number to grow.
> 
> I meant only that most callers that parse oids, both in-file and not,
> would want to stop knowing about the length ahead of time.

That's a good idea.

> I'm not sure we know 100%
> yet what "new"-style hashes will look like, nor how their loose-object
> filenames would look.

So it's too early to implement a solution, but here's how a patch for
reducing dependency on hash lengths could look like today:

---
 cache.h |  2 ++
 hex.c   | 13 +
 http-push.c |  7 +++
 notes.c |  6 +-
 sha1_file.c |  4 +---
 5 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/cache.h b/cache.h
index f06bfbaf32..acd3804c21 100644
--- a/cache.h
+++ b/cache.h
@@ -1317,6 +1317,8 @@ extern int set_disambiguate_hint_config(const char *var, 
const char *value);
 extern int get_sha1_hex(const char *hex, unsigned char *sha1);
 extern int get_oid_hex(const char *hex, struct object_id *sha1);
 
+extern int get_oid_hex_tail(const char *hex, struct object_id *oid, size_t 
offset);
+
 /*
  * Read `len` pairs of hexadecimal digits from `hex` and write the
  * values to `binary` as `len` bytes. Return 0 on success, or -1 if
diff --git a/hex.c b/hex.c
index 8df2d63728..3e6abe4d5e 100644
--- a/hex.c
+++ b/hex.c
@@ -47,6 +47,19 @@ int hex_to_bytes(unsigned char *binary, const char *hex, 
size_t len)
return 0;
 }
 
+int get_oid_hex_tail(const char *hex, struct object_id *oid, size_t offset)
+{
+   for (; offset < GIT_SHA1_RAWSZ; offset++, hex += 2) {
+   int val = hex2chr(hex);
+   if (val < 0)
+   return -1;
+   oid->hash[offset] = val;
+   }
+   if (*hex)
+   return -1;
+   return 0;
+}
+
 int get_sha1_hex(const char *hex, unsigned char *sha1)
 {
int i;
diff --git a/http-push.c b/http-push.c
index 14435ab65d..a5512616b9 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1007,18 +1007,17 @@ static void remote_ls(const char *path, int flags,
  void (*userFunc)(struct remote_ls_ctx *ls),
  void *userData);
 
-/* extract hex from sharded "xx/x{38}" filename */
+/* extract hex from sharded "xx/x{N}" filename */
 static int get_oid_hex_from_objpath(const char *path, struct object_id *oid)
 {
-   if (strlen(path) != GIT_SHA1_HEXSZ + 1)
+   if (strnlen(path, 3) < 3)
return -1;
-
if (hex_to_bytes(oid->hash, path, 1))
return -1;
path += 2;
path++; /* skip '/' */
 
-   return hex_to_bytes(oid->hash + 1, path, GIT_SHA1_RAWSZ - 1);
+   return get_oid_hex_tail(path, oid, 1);
 }
 
 static void process_ls_object(struct remote_ls_ctx *ls)
diff --git a/notes.c b/notes.c
index 04f8c8613c..ff6ce57022 100644
--- a/notes.c
+++ b/notes.c
@@ -410,17 +410,13 @@ static void load_subtree(struct notes_tree *t, struct 
leaf_node *subtree,
struct leaf_node *l;
size_t path_len = strlen(entry.path);
 
-   if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) {
+   if (!get_oid_hex_tail(entry.path, _oid, prefix_len)) {
/* This is potentially the remainder of the SHA-1 */
 
if (!S_ISREG(entry.mode))
/* notes must be blobs */
goto handle_non_note;
 
-   if (hex_to_bytes(object_oid.hash + prefix_len, 
entry.path,
-GIT_SHA1_RAWSZ - prefix_len))
-   goto handle_non_note; /* entry.path is not a 
SHA1 */
-
type = PTR_TYPE_NOTE;
} else if (path_len == 2) {
/* This is potentially an internal node */
diff --git a/sha1_file.c b/sha1_file.c
index a3c32d91d1..0486696b0b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1911,9 +1911,7 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
strbuf_setlen(path, baselen);
strbuf_addf(path, "/%s", de->d_name);
 
-   if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 &&
-   !hex_to_bytes(oid.hash + 1, de->d_name,
- GIT_SHA1_RAWSZ - 1)) {
+   if (!get_oid_hex_tail(de->d_name, , 1)) {
if (obj_cb) {
r = obj_cb(, path->buf, data);
if (r)
-- 
2.15.0


Re: [PATCH 5/6] t0021/rot13-filter: add capability functions

2017-11-04 Thread Christian Couder
On Sun, Oct 22, 2017 at 3:46 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> Add functions to help read and write capabilities.
>> These functions will be reused in following patches.
>
> One more thing that is more noteworthy (read: do not forget to
> describe it in the proposed log message) is that the original used
> to require capabilities to come in a fixed order.
>
> The new code allows these capabilities to be declared in any order,

Yeah and I think it is good.

> it even allows duplicates (intended? shouldn't we be flagging it as
> an error?),

I think allowing duplicates is ok, as we allow duplicates in many
cases already, like duplicate command line options.
Or perhaps we should just warn?

> the helper can require a set of capabilities we do want
> to see and fail if the remote doesn't declare any one of them
> (good).

Yeah.

> It does not check if the remote declares any capability we do not
> know about (intended? the original noticed this situation and error
> out---shouldn't the more generalized helper that is meant to be
> reusable allow us to do so, too?).

I think that it is ok in general to just ignore capabilities we don't
know (as long as we don't advertise them). The protocol should work
using only the capabilities that both ends support.

Now if we just talk about testing, we might sometimes want to check
that one end sends only some specific capabilities. So in this case it
could be good if we could error out. On the other hand, if we test how
Git behaves when we advertise some specific capabilities, it might not
be a good idea to error out as it would break tests when Git learns a
new capability and advertise it.

In the specific case of rot13-filter.pl I think we are more in the
later case than in the former.

So I think it is ok to wait until we would really want to check that
one end sends only some specific capabilities, before we improve the
Packet.pm module to make it support that.

> Side note: my answer to the last question is "it is OK and
> even desirable to ignore the fact that they claim to support
> a capability we do not know about", but I may be mistaken.

Yeah I agree.

> The reasoning and the conclusion that is consistent with
> what the code does should be described in any case.

Ok I will document all the above in the commit message.

>> +sub packet_read_capabilities {
>> + my @cap;
>> + while (1) {
>> + my ( $res, $buf ) = packet_bin_read();
>> + return ( $res, @cap ) if ( $res != 0 );
>
> The original had the same "'list eq list' does not do what you may
> think it does" issue.  This one corrects it, which is good.
>
> I am not sure if ($res != 0) is correct though.  What should happen
> when you get an unexpected EOF at this point?  The original would
> have died; this ignores and continues.

Well if there is an unexpected EOF, then packet_bin_read() returns
(-1, ""), so packet_read_capabilities() returns (-1, @cap) where @cap
contains the capabilities already received. Then
packet_read_and_check_capabilities() checks that we received all the
capabilities we expect and dies if that is not the case. If we did
receive all the capabilities, then
packet_read_and_check_capabilities() still returns -1, so the caller
may check that and die.

But yeah we could also just die in packet_read_capabilities() if $res
is -1. I will make this change.

>> + unless ( $buf =~ s/\n$// ) {
>> + die "A non-binary line MUST be terminated by an LF.\n"
>> + . "Received: '$buf'";
>> + }
>
> It may make sense to extract this in a small helper and call it from
> here and from packet_txt_read().

Ok, I have done this in my current version, that I plan to send soon.

>> + die "bad capability buf: '$buf'" unless ( $buf =~ 
>> s/capability=// );
>
> This may merely be a style thing, but I somehow find statement
> modifiers hard to follow, unless its condition is almost always
> true.  If you follow the logic in a loop and see "die" at the
> beginning, a normal thing to expect is that there were conditionals
> that said "continue" (eh, 'next' or 'redo') to catch the normal case
> and the control would reach "die" only under exceptional error
> cases, but hiding a rare error condition after 'unless' statement
> modifier breaks that expectation.

Ok, I will use:

unless ( $buf =~ s/capability=// ) {
die "bad capability buf: '$buf'" ;
}

>> + push @cap, $buf;
>> + }
>> +}
>> +
>> +sub packet_read_and_check_capabilities {
>> + my @local_caps = @_;
>> + my @remote_res_caps = packet_read_capabilities();
>> + my $res = shift @remote_res_caps;
>> + my %remote_caps = map { $_ => 1 } @remote_res_caps;
>
> FYI:
>
> my ($res, @remote_caps) = packet_read_capabilities();
> my %remote_caps = map { $_ => 1 } @remote_caps;
>
> may be more 

Re: [PATCH v1 2/2] log: add option to choose which refs to decorate

2017-11-04 Thread Rafael Ascensão

On 04/11/17 03:49, Junio C Hamano wrote:

Rafael Ascensão  writes:


Using `--exclude=` can help mitigate that verboseness by
removing unnecessary 'branches' from the output. However, if the tip of
an excluded ref points to an ancestor of a non-excluded ref, git will
decorate it regardless.


Is this even relevant?  I think the above would only serve to
confuse the readers.  --exclude, --branches, etc. are ways to
specify what starting points "git log" history traversal should
begin and has nothing to do with what set of refs are to be used to
decorate the commits that are shown.  But the paragraph makes
readers wonder if it might have any effect in some circumstances.


With `--decorate-refs=`, only refs that match  are
decorated while `--decorate-refs-exclude=` allows to do the
reverse, remove ref decorations that match 


And "Only refs that match ... are decorated" is also confusing.  The
thing is, refs are never decorated, they are used for decorating
commits in the output from "git log".  For example, if you have 


---A---B---C---D

and B is at the tip of the 'master' branch, the output from "git log
D" would decorate B with 'master', even if you do not say 'master'
on the command line as the commit to start the traversal from. >
Perhaps drop the irrelevant paragraph about "--exclude" and write
something like this instead?

When "--decorate-refs=" is given, only the refs
that match the pattern is used in decoration.  The refs that
match the pattern, when "--decorate-refs-exclude="
is given, are never used in decoration.



What you explained was the reason I mentioned that. Because some users 
were wrongfully trying to remove decorations by trying to exclude the 
starting points. But I agree this adds little value and can generate 
further confusion. I will remove that section.



Both can be used together but --decorate-refs-exclude patterns have
precedence over --decorate-refs patterns.


A reasonable and an easy-to-explain way to mix zero or more positive
and zero or more negagive patterns that follows the convention used
elsewhere in the system (e.g. how negative pathspecs work) is

  (1) if there is no positive pattern given, pretend as if an
  inclusive default positive pattern was given;

  (2) for each candidate, reject it if it matches no positive
  pattern, or if it matches any one of negative patterns.

For pathspecs, we use "everything" as the inclusive default positive
pattern, I think, and for the set of refs used for decoration, a
reasonable choice would also be to use "everything", which matches
the current behaviour.



That's a nice explanation that fits the current "--decorate-refs" behavior.


The pattern follows similar rules as `--glob` except it doesn't assume a
trailing '/*' if glob characters are missing.


Why should this be a special case that burdens users to remember one
more rule?  Wouldn't users find "--decorate-refs=refs/tags" useful
and it woulld be shorter and nicer than having to say "refs/tags/*"?



I wanted to allow exact patterns like:
"--decorate-refs=refs/heads/master" and for that I disabled the flag 
that adds the trailing '/*' if no globs are found. As a side effect, I 
lost the shortcut.


Is adding a yet another flag that appends '/*' only if the pattern 
equals "refs/{heads,remotes,tags}" a good idea?


Because changing the default behavior of that function has implications 
on multiple commands which I think shouldn't change. But at the same 
time, would be nice to have the logic that deals with glob-ref patterns 
all in one place.


What's the sane way to do this?


diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 32246fdb0..314417d89 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -38,6 +38,18 @@ OPTIONS
are shown as if 'short' were given, otherwise no ref names are
shown. The default option is 'short'.
  
+--decorate-refs=::

+   Only print ref names that match the specified pattern. Uses the same
+   rules as `git rev-list --glob` except it doesn't assume a trailing a
+   trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['.
+   `--decorate-refs-exlclude` has precedence.
+
+--decorate-refs-exclude=::
+   Do not print ref names that match the specified pattern. Uses the same
+   rules as `git rev-list --glob` except it doesn't assume a trailing a
+   trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['.
+   Has precedence over `--decorate-refs`.



These two may be technically correct, but I wonder if we can make it
easier to understand (I found "precedence" bit hard to follow, as in
my mind, these are ANDed conditions and between (A & ~B), there is
no "precedence").  Also we'd want to clarify what happens when only
"--decorate-refs-exclude"s are given, which in turn necessitates us
to describe what happens when only "--decorate-refs"s are given.


I 

Re: [PATCH v1 1/2] refs: extract function to normalize partial refs

2017-11-04 Thread Rafael Ascensão

On 04/11/17 02:27, Junio C Hamano wrote:

Rafael Ascensão  writes:


Signed-off-by: Kevin Daudt 
Signed-off-by: Rafael Ascensão 


Could you explain Kevin's sign-off we see above?  It is a bit
unusual (I am not yet saying it is wrong---I cannot judge until I
find out why it is there) to see a patch from person X with sign off
from person Y and then person X in that order.  It is normal for a
patch authored by person X to have sign-off by X and then Y if X
wrote it, signed it off and passed to Y, and then Y resent it after
signing it off (while preserving the authorship of X by adding an
in-body From: header), but I do not think that is what we have here.

It could be that you did pretty much all the work on this patch
and Kevin helped you to polish this patch off-list, in which case
the usual thing to do is to use "Helped-by: Kevin" instead.


That's more or less what happened. I wouldn't say I did "pretty much all 
the work". Yes, I wrote the code but with great help of Kevin. The 
intention of the dual Signed-off-by was to equally attribute authorship 
of the patch. But if that creates ambiguity I will change it to 
"Helped-by" as suggested.



It is better to use "unsigned" for a single word "flags" used as a
collection of bits.  In older parts of the codebase, we have
codepaths that pass signed int as a flags word, simply because we
didn't know better, but we do not have to spread that practice to
new code.


I noticed this, but chose to "mimic" the code around me. I'll correct it.
On a related note is there a guideline for defining flags or are
`#define FLAG (1u << 0)`, `#define FLAG (1 << 0)`
`#define FLAG 1` and `#define FLAG 0x1` equally accepted?


  {
-   struct strbuf real_pattern = STRBUF_INIT;
-   struct ref_filter filter;
-   int ret;
-
if (!prefix && !starts_with(pattern, "refs/"))
-   strbuf_addstr(_pattern, "refs/");
+   strbuf_addstr(normalized_pattern, "refs/");
else if (prefix)
-   strbuf_addstr(_pattern, prefix);
-   strbuf_addstr(_pattern, pattern);
+   strbuf_addstr(normalized_pattern, prefix);
+   strbuf_addstr(normalized_pattern, pattern);
  
-	if (!has_glob_specials(pattern)) {

+   if (!has_glob_specials(pattern) && (flags & ENSURE_GLOB)) {
/* Append implied '/' '*' if not present. */
-   strbuf_complete(_pattern, '/');
+   strbuf_complete(normalized_pattern, '/');
/* No need to check for '*', there is none. */
-   strbuf_addch(_pattern, '*');
+   strbuf_addch(normalized_pattern, '*');
}
+}


The above looks like a pure and regression-free code movement (plus
a small new feature) that is faithful to the original, which is good.

I however notice that addition of /* to the tail is trying to be
careful by using strbuf_complete('/'), but prefixing with "refs/"
does not and we would end up with a double-slash if pattern begins
with a slash.  The contract between the caller of this function (or
its original, which is for_each_glob_ref_in()) and the callee is
that prefix must not begin with '/', so it may be OK, but we might
want to add "if (*pattern == '/') BUG(...)" at the beginning.

I dunno.  In any case, that is totally outside the scope of this two
patch series.


I guess it doesn't hurt adding that safety net.


Thanks.  Other than the above minor points, looks good to me.

I'll fix the mentioned issues. Thanks.