Re: [PATCH 2/2] refs_resolve_ref_unsafe: handle d/f conflicts for writes
On 10/07/2017 06:36 AM, Michael Haggerty wrote: > On 10/06/2017 07:16 PM, Jeff King wrote: >> On Fri, Oct 06, 2017 at 07:09:10PM +0200, Michael Haggerty wrote: >> >>> I do have one twinge of uneasiness at a deeper level, that I haven't had >>> time to check... >>> >>> Does this patch make it easier to *set* HEAD to an unborn branch that >>> d/f conflicts with an existing reference? If so, that might be a >>> slightly worse UI for users. I'd rather learn about such a problem when >>> setting HEAD (when I am thinking about the new branch name and am in the >>> frame of mind to solve the problem) rather than later, when I try to >>> commit to the new branch. >> >> Good question. The answer is no, it's allowed both before and after my >> patch. At least via git-symbolic-ref. >> >> I agree it would be nice to know earlier for such a case. For >> symbolic-ref, we probably should allow it, because it's plumbing that >> may be used for tricky things. For things like "checkout -b", you'd >> generally get a timely warning as we try to create the ref. >> >> The odd man out is "checkout --orphan", which leaves the branch unborn. >> It might be nice if it did a manual check that the ref is available (and >> also that it's syntactically acceptable, though I think we may do that >> already). >> >> But all of that is orthogonal to this fix, I think. > > Thanks for checking. Yes, I totally agree that this is orthogonal. I also just checked but there don't seem to be any docstrings that need updating. Reviewed-by: Michael Haggerty(both patches in this series). Michael
Re: [PATCH 1/3] notes: move hex_to_bytes() to hex.c and export it
On Tue, Oct 31, 2017 at 02:46:49PM +0100, René Scharfe wrote: > Make the function for converting pairs of hexadecimal digits to binary > available to other call sites. > > Signed-off-by: Rene Scharfe> --- > cache.h | 7 +++ > hex.c | 12 > notes.c | 17 - > 3 files changed, 19 insertions(+), 17 deletions(-) > > diff --git a/cache.h b/cache.h > index 6440e2bf21..f06bfbaf32 100644 > --- a/cache.h > +++ b/cache.h > @@ -1317,6 +1317,13 @@ extern int set_disambiguate_hint_config(const char > *var, const char *value); > extern int get_sha1_hex(const char *hex, unsigned char *sha1); > extern int get_oid_hex(const char *hex, struct object_id *sha1); > > +/* > + * Read `len` pairs of hexadecimal digits from `hex` and write the > + * values to `binary` as `len` bytes. Return 0 on success, or -1 if Is it correct to call the result binary? I would say that it's the value that gets stored. To me, this value does not really have a base. Kevin
Re: git grep -P fatal: pcre_exec failed with error code -8
On Sun, Nov 05, 2017 at 01:06:21AM +0100, Дилян Палаузов wrote: > with git 2.14.3 linked with libpcre.so.1.2.9 when I do: > git clone https://github.com/django/django > cd django > git grep -P "if.*([^\s])+\s+and\s+\1" > django/contrib/admin/static/admin/js/vendor/select2/select2.full.min.js > the output is: > fatal: pcre_exec failed with error code -8 Code -8 is PCRE_ERROR_MATCHLIMIT. And "man pcreapi" has this to say: The match_limit field provides a means of preventing PCRE from using up a vast amount of resources when running patterns that are not going to match, but which have a very large number of possibilities in their search trees. The classic example is a pattern that uses nested unlimited repeats. Internally, pcre_exec() uses a function called match(), which it calls repeatedly (sometimes recursively). The limit set by match_limit is imposed on the number of times this function is called during a match, which has the effect of limiting the amount of backtracking that can take place. For patterns that are not anchored, the count restarts from zero for each posi‐ tion in the subject string. When pcre_exec() is called with a pattern that was successfully studied with a JIT option, the way that the matching is exe‐ cuted is entirely different. However, there is still the pos‐ sibility of runaway matching that goes on for a very long time, and so the match_limit value is also used in this case (but in a different way) to limit how long the matching can continue. The default value for the limit can be set when PCRE is built; the default default is 10 million, which handles all but the most extreme cases. You can override the default by suppling pcre_exec() with a pcre_extra block in which match_limit is set, and PCRE_EXTRA_MATCH_LIMIT is set in the flags field. If the limit is exceeded, pcre_exec() returns PCRE_ERROR_MATCH‐ LIMIT. So your pattern is just really expensive and is running afoul of pcre's backtracking limits (and it's not helped by the fact that the file is basically one giant line). There's no way to ask Git to specify a larger match_limit to pcre, but you might be able to construct your pattern in a way that involves less backtracking. It looks like you're trying to find things like "if foo and foo"? Should the captured term actually be "([^\s]+)" (with the "+" on the _inside_ of the capture? Or maybe I'm just misunderstanding your goal. -Peff
Re: [PATCH 1/2] sequencer: factor out rewrite_file()
On Sat, Nov 04, 2017 at 07:36:43PM +0100, Simon Ruderich wrote: > On Fri, Nov 03, 2017 at 03:13:10PM -0400, Jeff King wrote: > > I think we've been gravitating towards error strbufs, which would make > > it something like: > > I like this approach to store the error in a separate variable > and let the caller handle it. This provides proper error messages > and is cleaner than printing the error on the error site (what > error_errno does). > > However I wouldn't use strbuf directly and instead add a new > struct error which provides a small set of helper functions. > Using a separate type also makes it clear to the reader that is > not a normal string and is more extendable in the future. Yes, I think what you've written here (and below) is quite close to the error_context patches I linked elsewhere in the thread. In other words, I think it's a sane approach. > We could also store the error condition in the error struct and > don't use the return value to indicate and error like this: > > void write_file_buf(const char *path, const char *buf, size_t len) > { > struct error err = ERROR_INIT; > write_file_buf_gently2(path, buf, len, ); > if (err.error) > error_die(); > } I agree it might be nice for the error context to have a positive "there was an error" flag. It's probably worth making it redundant with the return code, though, so callers can use whichever style is most convenient for them. > > OTOH, if we went all-in on flexible error handling contexts, you could > > imagine this function becoming: > > > > void write_file_buf(const char *path, const char *buf, size_t len, > > struct error_context *err) > > { > > int fd = xopen(path, err, O_WRONLY | O_CREAT | O_TRUNC, 0666); > > if (fd < 0) > > return -1; > > if (write_in_full(fd, buf, len, err) < 0) > > return -1; > > if (xclose(fd, err) < 0) > > return -1; > > return 0; > > } > > This looks interesting as well, but it misses the feature of > custom error messages which is really useful. Right, I didn't think that example through. The functions after the open() don't have enough information to make a good message. -Peff
Re: [PATCH 5/6] t0021/rot13-filter: add capability functions
Christian Couderwrites: >>> + my ( $res, $buf ) = packet_bin_read(); >>> + return ( $res, @cap ) if ( $res != 0 ); >> >> The original had the same "'list eq list' does not do what you may >> think it does" issue. This one corrects it, which is good. >> >> I am not sure if ($res != 0) is correct though. What should happen >> when you get an unexpected EOF at this point? The original would >> have died; this ignores and continues. > > Well if there is an unexpected EOF, then packet_bin_read() returns > (-1, ""), so packet_read_capabilities() returns (-1, @cap) where @cap > contains the capabilities already received. Then > packet_read_and_check_capabilities() checks that we received all the > capabilities we expect and dies if that is not the case. If we did > receive all the capabilities, then > packet_read_and_check_capabilities() still returns -1, so the caller > may check that and die. In other words, it happens, by accident, to stop before going very far. I think we'd be better off making it an explicitly checked error. > But yeah we could also just die in packet_read_capabilities() if $res > is -1. I will make this change. Good. Thanks.
Re: [PATCH v1 2/2] log: add option to choose which refs to decorate
Rafael Ascensãowrites: >>> The pattern follows similar rules as `--glob` except it doesn't assume a >>> trailing '/*' if glob characters are missing. >> >> Why should this be a special case that burdens users to remember one >> more rule? Wouldn't users find "--decorate-refs=refs/tags" useful >> and it woulld be shorter and nicer than having to say "refs/tags/*"? > > I wanted to allow exact patterns like: > "--decorate-refs=refs/heads/master" and for that I disabled the flag > that adds the trailing '/*' if no globs are found. As a side effect, I > lost the shortcut. > > Is adding a yet another flag that appends '/*' only if the pattern > equals "refs/{heads,remotes,tags}" a good idea? No. > Because changing the default behavior of that function has > implications on multiple commands which I think shouldn't change. But > at the same time, would be nice to have the logic that deals with > glob-ref patterns all in one place. > > What's the sane way to do this? Learn to type "--decorate-refs="refs/heads/[m]aster", and not twewak the code at all, perhaps. The users of existing "with no globbing, /* is appended" interface are already used to that way and they do not have to learn a new and inconsistent interface. After all, "I only want to see 'git log' output with 'master' decorated" (i.e. not specifying "this class of refs I can glob by using the naming convention I am using" and instead enumerating the ones you care about) does not sound like a sensible thing people often want to do, so making it follow the other codepath so that people can say "refs/tags" to get "refs/tags/*", while still allowing such a rare but specific and exact one possible, may not sound too bad to me.
Re: [PATCH] l10n: es.po: fix typos
On Sat, Nov 04, 2017 at 08:20:44PM -0300, Gaston Gonzalez wrote: Thank you, applied to my personal 2.15-next branch, I'll work on other fixes and send a bigger chunk to main repo. Regards
git grep -P fatal: pcre_exec failed with error code -8
Hello, with git 2.14.3 linked with libpcre.so.1.2.9 when I do: git clone https://github.com/django/django cd django git grep -P "if.*([^\s])+\s+and\s+\1" django/contrib/admin/static/admin/js/vendor/select2/select2.full.min.js the output is: fatal: pcre_exec failed with error code -8 (But not with git clone https://github.com/select2/select2 cd select2 git grep -P "if.*([^\s])+\s+and\s+\1" dist/js/select2.full.min.js as the two select2.full.min.js files differ slightly in their size) Any ideas? Regards Dilian
[PATCH] l10n: es.po: fix typos
Fix some typos in the spanish translation. Signed-off-by: Gaston Gonzalez--- po/es.po | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/po/es.po b/po/es.po index 89a2dc014..43251cbc9 100644 --- a/po/es.po +++ b/po/es.po @@ -4746,7 +4746,7 @@ msgstr "" #: wt-status.c:1668 #, c-format msgid "nothing to commit, working tree clean\n" -msgstr "nada para hacer comit, el arbol de trabajo esta limpio\n" +msgstr "nada para hacer commit, el arbol de trabajo esta limpio\n" #: wt-status.c:1780 msgid "No commits yet on " @@ -15518,17 +15518,17 @@ msgid "" "\n" " git rebase --continue\n" msgstr "" -"TIene cambios en el area de stage de su arbol de trabajo.\n" +"Tiene cambios en el area de stage de su arbol de trabajo.\n" "Si estos cambios estan destinados a \n" "ser aplastados en el commit previo, ejecute:\n" "\n" " git commit --amend $gpg_sign_opt_quoted\n" "\n" -"Si estos estan destinados a ir en un nuevo comit, ejecute:\n" +"Si estos estan destinados a ir en un nuevo commit, ejecute:\n" "\n" " git commit $gpg_sign_opt_quoted\n" "\n" -"En ambos casos, cuando termite, continue con:\n" +"En ambos casos, cuando termine, continue con:\n" "\n" " git rebase --continue\n" -- 2.15.0
Re: [PATCH v1 1/2] refs: extract function to normalize partial refs
On Sat, Nov 04, 2017 at 11:27:39AM +0900, Junio C Hamano wrote: > I however notice that addition of /* to the tail is trying to be > careful by using strbuf_complete('/'), but prefixing with "refs/" > does not and we would end up with a double-slash if pattern begins > with a slash. The contract between the caller of this function (or > its original, which is for_each_glob_ref_in()) and the callee is > that prefix must not begin with '/', so it may be OK, but we might > want to add "if (*pattern == '/') BUG(...)" at the beginning. > > I dunno. In any case, that is totally outside the scope of this two > patch series. I do think it's a good idea to make future readers of the code aware of this contract, and adding a BUG assert does that quite well. Here is a patch that implements it. This applies of course on top of this patch series. -- >8 -- Subject: [PATCH] normalize_glob_ref: assert implicit contract of prefix normalize_glob_ref has an implicit contract of expecting 'prefix' to not start with a '/', otherwise the pattern would end up with a double-slash. Mark it as a BUG when the prefix argument of normalize_glob_ref starts with a '/' so that future callers will be aware of this contract. Signed-off-by: Kevin Daudt--- refs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/refs.c b/refs.c index e9ae659ae..6747981d1 100644 --- a/refs.c +++ b/refs.c @@ -372,6 +372,8 @@ int head_ref_namespaced(each_ref_fn fn, void *cb_data) void normalize_glob_ref(struct strbuf *normalized_pattern, const char *prefix, const char *pattern, int flags) { + if (prefix && *prefix == '/') BUG("prefix cannot not start with '/'"); + if (!prefix && !starts_with(pattern, "refs/")) strbuf_addstr(normalized_pattern, "refs/"); else if (prefix) -- 2.15.0.rc2.57.g2f899857a9
Re: [PATCH 6/7] builtin/describe.c: describe a blob
From: "Junio C Hamano"Sent: Thursday, November 02, 2017 4:23 AM Junio C Hamano writes: The reason why we say "-ish" is "Yes we know v2.15.0 is *NOT* a commit object, we very well know it is a tag object, but because we allow it to be used in a context that calls for a commit object, we mark that use context as 'this accepts commit-ish, not just commit'". Having said all that, there is a valid case in which we might want to say "blob-ish". Is this not also an alternative case, relative to the user, for the scenario where the user has an oid/sha1 value but does not know what it is, and would like to find its source and type relative to the `describe` command. IIUC the existing `describe` command only accepts values, and here we are extending that to be even more inclusive, but at the same time the options become more restricted. Thus the synopsis terminology would be more about suggesting the range of options available (search style/start points) that are applicable to blobs, than being exactly about the 'allow-blobs' parameter. Or have I misunderstood how the fast commit search and the slower potentially-a-blob searching are disambiguated? -- Philip To review, X-ish is the word we use when the command wants to take an X, but tolerates a lazy user who gives a Y, which is *NOT* X, without bothering to add ^{X} suffix, i.e. Y^{X}. In such a case, the command takes not just X but takes X-ish because it takes a Y and converts it internally to an X to be extra nice. When the command wants to take a blob, but tolerates something else and does "^{blob}" internally, we can say it takes "blob-ish". Technically that "something else" could be an annotated tag that points at a blob object, without any intervening commit or tree (I did not check if the "describe " code in this thread handles this, though). But because it is not usually done to tag a blob directly, it would probably be not worth to say "blob-ish" in the document and cause readers to wonder in what situation something that is not a blob can be treated as if it were a blob. It does feel like we would be pursuing technical correctness too much and sacrificing the readability of the document, at least to me, and a bad trade-off.
Re: How to resolve mixture of modified and deleted conflicts easily in git?
From: "Robert Dailey"When doing a rebase, sometimes I will get `DU` and `UU` conflicts (locally deleted and locally modified, respectively). Furthermore, in some of these cases, I want to take "ours" for all conflicts, including ones where the local file is deleted. Ideally, it's just one command: $ git checkout --ours . However, this fails for the locally deleted conflicts: error: path 'foo.xml' does not have our version Even more annoyingly, the fact that these failures occur prevents the `UU` conflicts from being resolved. The whole operation fails atomically. I am not aware of a straightforward and uniform way to resolve conflicts with "ours" during a rebase when locally deleted files exist in the list of conflicts. What is the most elegant solution in this situation? I'm running Git for Windows v2.13.1. Can you give the example commands and response? In particular note the comment in the rebase command that the ours/theirs viewpoints are swapped, so you may find your stated "ours" should be (mid rebase) seen as "theirs". -- Philip
Re: [PATCH 1/2] sequencer: factor out rewrite_file()
On Fri, Nov 03, 2017 at 03:13:10PM -0400, Jeff King wrote: > I think we've been gravitating towards error strbufs, which would make > it something like: I like this approach to store the error in a separate variable and let the caller handle it. This provides proper error messages and is cleaner than printing the error on the error site (what error_errno does). However I wouldn't use strbuf directly and instead add a new struct error which provides a small set of helper functions. Using a separate type also makes it clear to the reader that is not a normal string and is more extendable in the future. > I'm not excited that the amount of error-handling code is now double the > amount of code that actually does something useful. Maybe this function > simply isn't large/complex enough to merit flexible error handling, and > we should simply go with René's original near-duplicate. A separate struct (and helper functions) would help in this case and could look like this, which is almost equal (in code size) to the original solution using error_errno: int write_file_buf_gently2(const char *path, const char *buf, size_t len, struct error *err) { int rc = 0; int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666); if (fd < 0) return error_addf_errno(err, _("could not open '%s' for writing"), path); if (write_in_full(fd, buf, len) < 0) rc = error_addf_errno(err, _("could not write to '%s'"), path); if (close(fd) && !rc) rc = error_addf_errno(err, _("could not close '%s'"), path); return rc; } (I didn't touch write_in_full here, but it could also take the err and then the code would get a little shorter, however would lose the "path" information, but see below.) And in the caller: void write_file_buf(const char *path, const char *buf, size_t len) { struct error err = ERROR_INIT; if (write_file_buf_gently2(path, buf, len, ) < 0) error_die(); } For now struct error just contains the strbuf, but one could add the call location (by using a macro for error_addf_errno) or the original errno or more information in the future. error_addf_errno() could also prepend the error the buffer so that the caller can add more information if necessary and we get something like: "failed to write file 'foo': write failed: errno text" in the write_file_buf case (the first error string is from write_file_buf_gently2, the second from write_in_full). However I'm not sure how well this works with translations. We could also store the error condition in the error struct and don't use the return value to indicate and error like this: void write_file_buf(const char *path, const char *buf, size_t len) { struct error err = ERROR_INIT; write_file_buf_gently2(path, buf, len, ); if (err.error) error_die(); } > OTOH, if we went all-in on flexible error handling contexts, you could > imagine this function becoming: > > void write_file_buf(const char *path, const char *buf, size_t len, > struct error_context *err) > { > int fd = xopen(path, err, O_WRONLY | O_CREAT | O_TRUNC, 0666); > if (fd < 0) > return -1; > if (write_in_full(fd, buf, len, err) < 0) > return -1; > if (xclose(fd, err) < 0) > return -1; > return 0; > } This looks interesting as well, but it misses the feature of custom error messages which is really useful. Regards Simon -- + privacy is necessary + using gnupg http://gnupg.org + public key id: 0x92FEFDB7E44C32F9
[PATCH] Git-cvsimport Improvement
Greetings, I am attempting to improve CVS -> CVSps -> Git-cvsimport process. The part involving Git-cvsimport has to do with parsing of CVSps PatchSet file. Consider what happens if a CVS log/commit message includes lines which start with "Members:", say from copy-and-paste [2]. To avoid this issue, I have proposed that CVSps append the "Log:" tag with line count of original CVS log/commit message [1]. The idea is if line-count is found after "Log:", that many (CVS log message) lines get consumed before advancing $state to look for "^Members:" Current Git-cvsimport isn't strict in matching the "Log:" tag (fortunately) and my proposed change to Git-cvsimport should be fully backward compatible. See attached patch. Cheers, --patrick p.s., For reference: Why I'm doing this and RFC sent to CVS list: http://lists.nongnu.org/archive/html/info-cvs/2017-11/msg0.html [1] https://github.com/andreyvit/cvsps/pull/4 [2] Example PatchSet with "Members:" line in original CVS commit message: - PatchSet 3 Date: 2017/10/30 23:25:20 Author: catbert Branch: HEAD Tag: (none) Log: This will confuse git-cvsimport's parser Members: somefile.c:1.1->1.2 another.h:1.7->1.8 foo.mk:1.22->1.23 Imagine these were lines pasted to note something Members: ABC:1.1->1.2 commit c3e406c54b8cd3a2bbf0aa729fef201e20fa6df5 Author: patrick keshishianDate: Sat Nov 4 08:42:12 2017 -0700 Optionally parse line count out of PatchSets with "Log: count" This is a change being suggested to CVSps where the line count of the commit message gets added to the "Log:" tag to help Git cvsimport not get confused if the CVS log/commit message included lines starting with any of the tags found in CVSps PatchSet, e.g., Members: This is part of a larger change to make CVS to Git import more robust. diff --git a/git-cvsimport.perl b/git-cvsimport.perl index 36929921e..5d78c5e87 100755 --- a/git-cvsimport.perl +++ b/git-cvsimport.perl @@ -786,6 +786,13 @@ open(CVS, "<$cvspsfile") or die $!; # #- +# NOTE: +## pk, 2017/10/30 +# patched cvsps will output ^Log: line with number of lines of log +# which are to follow. This makes parsing robust for cases where the +# log message contains ^Members: lines! Happens in OpenBSD sources: +# e.g., See src/usr.sbin/bgpd/rde.c + my $state = 0; sub update_index (\@\@) { @@ -816,7 +823,7 @@ sub write_tree () { return $tree; } -my ($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg); +my ($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg,$loglines); my (@old,@new,@skipped,%ignorebranch,@commit_revisions); # commits that cvsps cannot place anywhere... @@ -1005,8 +1012,13 @@ while () { $tag = $_; } $state = 7; - } elsif ($state == 7 and /^Log:/) { + } elsif ($state == 7 and /^Log:\s*(\d+)?$/) { + $loglines = $1 // -1; $logmsg = ""; + while ($loglines-- > 0 && ($_ = )) { + chomp; + $logmsg .= "$_\n"; + } $state = 8; } elsif ($state == 8 and /^Members:/) { $branch = $opt_o if $branch eq "HEAD";
Re: Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified
On Fri, Nov 3, 2017 at 6:20 PM, Johannes Schindelinwrote: > Hi Orgad, > > On Fri, 3 Nov 2017, Johannes Schindelin wrote: > >> On Thu, 2 Nov 2017, Orgad Shaneh wrote: >> >> > I can't reproduce this with a minimal example, but it happens in my >> > project. > > Whoa, I somehow overlooked the "can't". Sorry. > > I inserted a `git diff-files` here, and it printed exactly what I > expected: > > ++ git diff-files > :16 16 62cab94c8d8cf047bbb60c12def559339300efa4 > M sub > >> + git rebase -i HEAD^^ >> + ) >> +' > > There must be something else going wrong that we did not replicate here. > Maybe the `error: cannot rebase: You have unstaged changes.` message was > caused not by a change in the submodule? Could you run `git diff-files` > before the rebase? It's the same before and during the rebase: $ git diff-files :16 16 c840225a7cf6bb2ec64da9d35d2c29210bc5e5e8 M sub > > This does *not* refresh the index, but maybe that is what is going wrong; > you could call `git update-index --refresh` before the rebase and see > whether that works around the issue? Nope. If I run git submodule update, then rebase --continue works fine, so it's definitely somehow caused by the submodule. - Orgad
Re: Git Open Source Weekend London 11th/12th November
From: "Jeff King"On Wed, Nov 01, 2017 at 04:36:24PM +, Thomas Gummerer wrote: Normally attendees work in small groups on a specific task to prevent anyone from getting stuck. Per usual, Bloomberg will provide the venue, mentors, snacks and drinks. Bring your enthusiasm (and your laptop!) and come share in the fun! The event is also open to everyone, so feel free to pass on the invite! I think it will help if the experienced members of the community (both those who will be at the event and not) can come up with some possible topics to work on (though of course I'd be glad for participants to come with their own itch to scratch). We've started using the #leftoverbits tag to allow searching in the archive: https://public-inbox.org/git/?q=leftoverbits Some of those have since been completed, but others are left open. There's not really a master list, but it's a potential source for digging for gold (well, if you want to call leftover bugs gold :) ). I started a list over the summer of larger items that people might want to pick up. Here it is in no particular order: - the pager. config is mis-designed, because our config keys cannot represent all possible command names (e.g., case folding and illegal characters). This should be pager..enable or similar. Some discussion in (this message and the surrounding thread): https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/ But I think you could find more by searching the archive. - ditto for alias.* config, which has the same syntax problems. - auto-gc is sometimes over-anxious to run if you have a lot of unreachable loose objects. We should pack unreachables into a single pack. That solves the --auto problem, and is also way more efficient. The catch is expiration. Some discussion here (and especially down-thread): https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/ - git-config's "--get-color" is unlike all the other types in that it takes a "default" value if the config key isn't set. This makes it annoyingly inconsistent, but there's also no way to ask Git to interpret other values (e.g., you might want it to expand "--path" or an "--int"). It would be nice to have a general "--default" option so you could do: # same as git config --get-color color.diff.old red git config --default red --color color.diff.old or # not currently possible to ask git to interpret "10k" git config --default 10k --int some.integer.key - git's internal config can parse expiration dates (see parse_expiry_date()), but you can't do so from git-config. It should probably have a type for "--expiry-date" (which would of course be more useful with the --default option above). - there's no efficient way to ask git-config for several keys with a specific type (or even multiple different types). You can use "--list" to get their strings, but you can't get any interpretation (like colors, integers, etc). Invoking git-config many times can have a noticeable speed impact for a script. There should probably be a "--stdin" mode (or maybe "--get-stdin" if we would one day want to have a "--set-stdin") that takes a list of keys, optional types, and optional defaults (that "--default" again!) and outputs them to stdout. Those were just sitting on my ideas list. I'm happy to go into more detail if anybody's interested in discussing any of them. Some of them may be half-baked. And of course I'd be happy if people wanted to contribute more items to the list. A few I've seen recently are: * The provison of a `git resolve -X -- ` command to simplify the manual resolution of remaining conflicts. https://public-inbox.org/git/8737615iu5@javad.com/ Sergey Organov: How to re-merge paths differently? * (Git for Windows/HFS): Detect directory capitalisation changes when switching branches, and rename them correctly on case preserving, case insensitive file systems. Optimisation: If the underlying tree is identical then do not update the modified dates. https://github.com/git-for-windows/git/issues/1333 Chuck Lu: Folder name should be case sensitive when switch branches. * (Git for Windows/HFS) (long standing): detect branch name capitalisation issues - may need a struct to carry both the filename and pathname down the different parts of the code base so that the FS name of the requested ref/heads/ can be checked and warned. e.g. https://github.com/git-for-windows/git/issues/852 "`git checkout head` inconsistent behavior" - ('head' finds 'HEAD', but also 'branch' finds 'Branch'); https://github.com/git-for-windows/git/issues/852#issuecomment-239675187 -> "rambling notes" for partial analysis. https://github.com/git-for-windows/git/issues/752 "git checkout creating tracking branch using case-insensitive match?" - Is this part of the same problem? * Documentation: There's always the newby role at the hackathon of
[PATCH] fix typos in 2.15.0 release notes
Signed-off-by: Jean Carlo Machado--- Documentation/RelNotes/2.15.0.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/RelNotes/2.15.0.txt b/Documentation/RelNotes/2.15.0.txt index 248ba70c3..cdd761bcc 100644 --- a/Documentation/RelNotes/2.15.0.txt +++ b/Documentation/RelNotes/2.15.0.txt @@ -65,7 +65,7 @@ UI, Workflows & Features learned to take the 'unfold' and 'only' modifiers to normalize its output, e.g. "git log --format=%(trailers:only,unfold)". - * "gitweb" shows a link to visit the 'raw' contents of blbos in the + * "gitweb" shows a link to visit the 'raw' contents of blobs in the history overview page. * "[gc] rerereResolved = 5.days" used to be invalid, as the variable @@ -109,13 +109,13 @@ Performance, Internal Implementation, Development Support etc. * Conversion from uchar[20] to struct object_id continues. * Start using selected c99 constructs in small, stable and - essentialpart of the system to catch people who care about + essential part of the system to catch people who care about older compilers that do not grok them. * The filter-process interface learned to allow a process with long latency give a "delayed" response. - * Many uses of comparision callback function the hashmap API uses + * Many uses of comparison callback function the hashmap API uses cast the callback function type when registering it to hashmap_init(), which defeats the compile time type checking when the callback interface changes (e.g. gaining more parameters). -- 2.15.0
[no subject]
I hope it's right this time. :)
Re: [PATCH 3/4] remote-mediawiki: show known namespace choices on failure
On Sun, Oct 29, 2017 at 12:08:56PM -0400, Antoine Beaupré wrote: > if we fail to find a requested namespace, we should tell the user > which ones we know about, since we already do. this allows users to > feetch all namespaces by specifying a dummy namespace, failing, then > copying the list of namespaces in the config. > > eventually, we should have a flag that allows fetching all namespaces > automatically. > > Reviewed-by: Antoine Beaupré> Signed-off-by: Antoine Beaupré > --- > contrib/mw-to-git/git-remote-mediawiki.perl | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl > b/contrib/mw-to-git/git-remote-mediawiki.perl > index fc48846a1..07cc74bac 100755 > --- a/contrib/mw-to-git/git-remote-mediawiki.perl > +++ b/contrib/mw-to-git/git-remote-mediawiki.perl > @@ -1334,7 +1334,9 @@ sub get_mw_namespace_id { > my $id; > > if (!defined $ns) { > - print {*STDERR} "No such namespace ${name} on MediaWiki.\n"; > + my @namespaces = sort keys %namespace_id; > + for (@namespaces) { s/ /_/g; } I am sure we can improve upon the need to process @namespaces twice: my @namespaces = map { s/ /_/g; $_; } sort keys %namespaces_id; -- Thomas Adam
Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces
On Thu, Nov 02, 2017 at 07:10:17PM -0400, Antoine Beaupré wrote: > Actually, is there a standard way to do this in git with Perl > extensions? I know about "option verbosity N" but how should I translate > this into Perl? Carp? Warn? Log::Any? Log4perl? No, not really. From a quick glance at some of the existing perl code in git, a lot of it continues to use "print STDERR" -- but then to be fair, a lot of the perl code also reads like it has been written by C programmers... While there's nothing wrong with using "print STDERR", it's probably wiser to transition this to using Carp in the long run -- it would decrease the round-trip time to debugging should there be a situation where that was needed, and hence I would recommend using "warn" for less-severe errors/debugging. -- Thomas Adam
Re: [PATCH 1/2] sequencer: factor out rewrite_file()
On Sat, Nov 04, 2017 at 10:05:43AM +0100, René Scharfe wrote: > >> How about *not* printing the error at the place where you notice the > >> error, and instead return an error code to the caller to be noticed > >> which dies with an error message? > > > > That ends up giving less-specific errors. > > Not necessarily. Function could return different codes for different > errors, e.g. -1 for an open(2) error and -2 for a write(2) error, and > the caller could use that to select the message to show. > > Basically all of the messages in wrapper.c consist of some text mixed > with the affected path path and a strerror(3) string, so they're > compatible in that way. A single function (get_path_error_format()?) > could thus be used and callers would be able to combine its result with > die(), error(), or warning(). I think we've had this discussion before, a while ago. Yes, returning an integer error code is nice because you don't have pass in an extra parameter. But I think there are two pitfalls: 1. Integers may not be descriptive enough to cover all cases, which is how we ended up with the strbuf-passing strategy in the ref code. Certainly you could add an integer for every possible bespoke message, but then I'm not sure it's buying that much over having the function simply fill in a strbuf. 2. For complex functions there may be multiple errors that need to stack. I think the refs code has cases like this, where a syscall fails, which causes a fundamental ref operation to fail, which causes a higher-level operation to fail. It's only the caller of the higher-level operation that knows how to report the error. Certainly an integer error code would work for _this_ function, but I'd rather see us grow towards consistent error handling. -Peff
Re: Git Open Source Weekend London 11th/12th November
On Wed, Nov 01, 2017 at 04:36:24PM +, Thomas Gummerer wrote: > Normally attendees work in small groups on a specific task to > prevent anyone from getting stuck. Per usual, Bloomberg will > provide the venue, mentors, snacks and drinks. Bring your > enthusiasm (and your laptop!) and come share in the fun! The > event is also open to everyone, so feel free to pass on the > invite! I think it will help if the experienced members of the community (both those who will be at the event and not) can come up with some possible topics to work on (though of course I'd be glad for participants to come with their own itch to scratch). We've started using the #leftoverbits tag to allow searching in the archive: https://public-inbox.org/git/?q=leftoverbits Some of those have since been completed, but others are left open. There's not really a master list, but it's a potential source for digging for gold (well, if you want to call leftover bugs gold :) ). I started a list over the summer of larger items that people might want to pick up. Here it is in no particular order: - the pager. config is mis-designed, because our config keys cannot represent all possible command names (e.g., case folding and illegal characters). This should be pager..enable or similar. Some discussion in (this message and the surrounding thread): https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/ But I think you could find more by searching the archive. - ditto for alias.* config, which has the same syntax problems. - auto-gc is sometimes over-anxious to run if you have a lot of unreachable loose objects. We should pack unreachables into a single pack. That solves the --auto problem, and is also way more efficient. The catch is expiration. Some discussion here (and especially down-thread): https://public-inbox.org/git/20170711101942.h2uwxtgzvgguz...@sigill.intra.peff.net/ - git-config's "--get-color" is unlike all the other types in that it takes a "default" value if the config key isn't set. This makes it annoyingly inconsistent, but there's also no way to ask Git to interpret other values (e.g., you might want it to expand "--path" or an "--int"). It would be nice to have a general "--default" option so you could do: # same as git config --get-color color.diff.old red git config --default red --color color.diff.old or # not currently possible to ask git to interpret "10k" git config --default 10k --int some.integer.key - git's internal config can parse expiration dates (see parse_expiry_date()), but you can't do so from git-config. It should probably have a type for "--expiry-date" (which would of course be more useful with the --default option above). - there's no efficient way to ask git-config for several keys with a specific type (or even multiple different types). You can use "--list" to get their strings, but you can't get any interpretation (like colors, integers, etc). Invoking git-config many times can have a noticeable speed impact for a script. There should probably be a "--stdin" mode (or maybe "--get-stdin" if we would one day want to have a "--set-stdin") that takes a list of keys, optional types, and optional defaults (that "--default" again!) and outputs them to stdout. Those were just sitting on my ideas list. I'm happy to go into more detail if anybody's interested in discussing any of them. Some of them may be half-baked. And of course I'd be happy if people wanted to contribute more items to the list. -Peff
Re: [PATCH 1/2] sequencer: factor out rewrite_file()
Am 03.11.2017 um 20:13 schrieb Jeff King: > On Fri, Nov 03, 2017 at 10:44:08PM +0900, Junio C Hamano wrote: > >> Simon Ruderichwrites: >> >>> I tried looking into this by adding a new write_file_buf_gently() >>> (or maybe renaming write_file_buf to write_file_buf_or_die) and >>> using it from write_file_buf() but I don't know the proper way to >>> handle the error-case in write_file_buf(). Just calling >>> die("write_file_buf") feels ugly, as the real error was already >>> printed on screen by error_errno() and I didn't find any function >>> to just exit without writing a message (which still respects >>> die_routine). Suggestions welcome. >> >> How about *not* printing the error at the place where you notice the >> error, and instead return an error code to the caller to be noticed >> which dies with an error message? > > That ends up giving less-specific errors. Not necessarily. Function could return different codes for different errors, e.g. -1 for an open(2) error and -2 for a write(2) error, and the caller could use that to select the message to show. Basically all of the messages in wrapper.c consist of some text mixed with the affected path path and a strerror(3) string, so they're compatible in that way. A single function (get_path_error_format()?) could thus be used and callers would be able to combine its result with die(), error(), or warning(). René
Re: [PATCH 2/3] http-push: use hex_to_bytes()
Am 01.11.2017 um 23:15 schrieb Jeff King: > On Wed, Nov 01, 2017 at 10:59:49PM +0100, René Scharfe wrote: > >>> The hex_to_bytes() function requires that the caller make sure they have >>> the right number of bytes. But for many callers, I think they'd want to >>> say "parse this oid, which might be truncated; I can't tell what the >>> length is supposed to be". >> >> I'm confused by the word "many". After this series there are three >> callers of hex_to_bytes() and I don't expect that number to grow. > > I meant only that most callers that parse oids, both in-file and not, > would want to stop knowing about the length ahead of time. That's a good idea. > I'm not sure we know 100% > yet what "new"-style hashes will look like, nor how their loose-object > filenames would look. So it's too early to implement a solution, but here's how a patch for reducing dependency on hash lengths could look like today: --- cache.h | 2 ++ hex.c | 13 + http-push.c | 7 +++ notes.c | 6 +- sha1_file.c | 4 +--- 5 files changed, 20 insertions(+), 12 deletions(-) diff --git a/cache.h b/cache.h index f06bfbaf32..acd3804c21 100644 --- a/cache.h +++ b/cache.h @@ -1317,6 +1317,8 @@ extern int set_disambiguate_hint_config(const char *var, const char *value); extern int get_sha1_hex(const char *hex, unsigned char *sha1); extern int get_oid_hex(const char *hex, struct object_id *sha1); +extern int get_oid_hex_tail(const char *hex, struct object_id *oid, size_t offset); + /* * Read `len` pairs of hexadecimal digits from `hex` and write the * values to `binary` as `len` bytes. Return 0 on success, or -1 if diff --git a/hex.c b/hex.c index 8df2d63728..3e6abe4d5e 100644 --- a/hex.c +++ b/hex.c @@ -47,6 +47,19 @@ int hex_to_bytes(unsigned char *binary, const char *hex, size_t len) return 0; } +int get_oid_hex_tail(const char *hex, struct object_id *oid, size_t offset) +{ + for (; offset < GIT_SHA1_RAWSZ; offset++, hex += 2) { + int val = hex2chr(hex); + if (val < 0) + return -1; + oid->hash[offset] = val; + } + if (*hex) + return -1; + return 0; +} + int get_sha1_hex(const char *hex, unsigned char *sha1) { int i; diff --git a/http-push.c b/http-push.c index 14435ab65d..a5512616b9 100644 --- a/http-push.c +++ b/http-push.c @@ -1007,18 +1007,17 @@ static void remote_ls(const char *path, int flags, void (*userFunc)(struct remote_ls_ctx *ls), void *userData); -/* extract hex from sharded "xx/x{38}" filename */ +/* extract hex from sharded "xx/x{N}" filename */ static int get_oid_hex_from_objpath(const char *path, struct object_id *oid) { - if (strlen(path) != GIT_SHA1_HEXSZ + 1) + if (strnlen(path, 3) < 3) return -1; - if (hex_to_bytes(oid->hash, path, 1)) return -1; path += 2; path++; /* skip '/' */ - return hex_to_bytes(oid->hash + 1, path, GIT_SHA1_RAWSZ - 1); + return get_oid_hex_tail(path, oid, 1); } static void process_ls_object(struct remote_ls_ctx *ls) diff --git a/notes.c b/notes.c index 04f8c8613c..ff6ce57022 100644 --- a/notes.c +++ b/notes.c @@ -410,17 +410,13 @@ static void load_subtree(struct notes_tree *t, struct leaf_node *subtree, struct leaf_node *l; size_t path_len = strlen(entry.path); - if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) { + if (!get_oid_hex_tail(entry.path, _oid, prefix_len)) { /* This is potentially the remainder of the SHA-1 */ if (!S_ISREG(entry.mode)) /* notes must be blobs */ goto handle_non_note; - if (hex_to_bytes(object_oid.hash + prefix_len, entry.path, -GIT_SHA1_RAWSZ - prefix_len)) - goto handle_non_note; /* entry.path is not a SHA1 */ - type = PTR_TYPE_NOTE; } else if (path_len == 2) { /* This is potentially an internal node */ diff --git a/sha1_file.c b/sha1_file.c index a3c32d91d1..0486696b0b 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1911,9 +1911,7 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr, strbuf_setlen(path, baselen); strbuf_addf(path, "/%s", de->d_name); - if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 && - !hex_to_bytes(oid.hash + 1, de->d_name, - GIT_SHA1_RAWSZ - 1)) { + if (!get_oid_hex_tail(de->d_name, , 1)) { if (obj_cb) { r = obj_cb(, path->buf, data); if (r) -- 2.15.0
Re: [PATCH 5/6] t0021/rot13-filter: add capability functions
On Sun, Oct 22, 2017 at 3:46 AM, Junio C Hamanowrote: > Christian Couder writes: > >> Add functions to help read and write capabilities. >> These functions will be reused in following patches. > > One more thing that is more noteworthy (read: do not forget to > describe it in the proposed log message) is that the original used > to require capabilities to come in a fixed order. > > The new code allows these capabilities to be declared in any order, Yeah and I think it is good. > it even allows duplicates (intended? shouldn't we be flagging it as > an error?), I think allowing duplicates is ok, as we allow duplicates in many cases already, like duplicate command line options. Or perhaps we should just warn? > the helper can require a set of capabilities we do want > to see and fail if the remote doesn't declare any one of them > (good). Yeah. > It does not check if the remote declares any capability we do not > know about (intended? the original noticed this situation and error > out---shouldn't the more generalized helper that is meant to be > reusable allow us to do so, too?). I think that it is ok in general to just ignore capabilities we don't know (as long as we don't advertise them). The protocol should work using only the capabilities that both ends support. Now if we just talk about testing, we might sometimes want to check that one end sends only some specific capabilities. So in this case it could be good if we could error out. On the other hand, if we test how Git behaves when we advertise some specific capabilities, it might not be a good idea to error out as it would break tests when Git learns a new capability and advertise it. In the specific case of rot13-filter.pl I think we are more in the later case than in the former. So I think it is ok to wait until we would really want to check that one end sends only some specific capabilities, before we improve the Packet.pm module to make it support that. > Side note: my answer to the last question is "it is OK and > even desirable to ignore the fact that they claim to support > a capability we do not know about", but I may be mistaken. Yeah I agree. > The reasoning and the conclusion that is consistent with > what the code does should be described in any case. Ok I will document all the above in the commit message. >> +sub packet_read_capabilities { >> + my @cap; >> + while (1) { >> + my ( $res, $buf ) = packet_bin_read(); >> + return ( $res, @cap ) if ( $res != 0 ); > > The original had the same "'list eq list' does not do what you may > think it does" issue. This one corrects it, which is good. > > I am not sure if ($res != 0) is correct though. What should happen > when you get an unexpected EOF at this point? The original would > have died; this ignores and continues. Well if there is an unexpected EOF, then packet_bin_read() returns (-1, ""), so packet_read_capabilities() returns (-1, @cap) where @cap contains the capabilities already received. Then packet_read_and_check_capabilities() checks that we received all the capabilities we expect and dies if that is not the case. If we did receive all the capabilities, then packet_read_and_check_capabilities() still returns -1, so the caller may check that and die. But yeah we could also just die in packet_read_capabilities() if $res is -1. I will make this change. >> + unless ( $buf =~ s/\n$// ) { >> + die "A non-binary line MUST be terminated by an LF.\n" >> + . "Received: '$buf'"; >> + } > > It may make sense to extract this in a small helper and call it from > here and from packet_txt_read(). Ok, I have done this in my current version, that I plan to send soon. >> + die "bad capability buf: '$buf'" unless ( $buf =~ >> s/capability=// ); > > This may merely be a style thing, but I somehow find statement > modifiers hard to follow, unless its condition is almost always > true. If you follow the logic in a loop and see "die" at the > beginning, a normal thing to expect is that there were conditionals > that said "continue" (eh, 'next' or 'redo') to catch the normal case > and the control would reach "die" only under exceptional error > cases, but hiding a rare error condition after 'unless' statement > modifier breaks that expectation. Ok, I will use: unless ( $buf =~ s/capability=// ) { die "bad capability buf: '$buf'" ; } >> + push @cap, $buf; >> + } >> +} >> + >> +sub packet_read_and_check_capabilities { >> + my @local_caps = @_; >> + my @remote_res_caps = packet_read_capabilities(); >> + my $res = shift @remote_res_caps; >> + my %remote_caps = map { $_ => 1 } @remote_res_caps; > > FYI: > > my ($res, @remote_caps) = packet_read_capabilities(); > my %remote_caps = map { $_ => 1 } @remote_caps; > > may be more
Re: [PATCH v1 2/2] log: add option to choose which refs to decorate
On 04/11/17 03:49, Junio C Hamano wrote: Rafael Ascensãowrites: Using `--exclude=` can help mitigate that verboseness by removing unnecessary 'branches' from the output. However, if the tip of an excluded ref points to an ancestor of a non-excluded ref, git will decorate it regardless. Is this even relevant? I think the above would only serve to confuse the readers. --exclude, --branches, etc. are ways to specify what starting points "git log" history traversal should begin and has nothing to do with what set of refs are to be used to decorate the commits that are shown. But the paragraph makes readers wonder if it might have any effect in some circumstances. With `--decorate-refs=`, only refs that match are decorated while `--decorate-refs-exclude=` allows to do the reverse, remove ref decorations that match And "Only refs that match ... are decorated" is also confusing. The thing is, refs are never decorated, they are used for decorating commits in the output from "git log". For example, if you have ---A---B---C---D and B is at the tip of the 'master' branch, the output from "git log D" would decorate B with 'master', even if you do not say 'master' on the command line as the commit to start the traversal from. > Perhaps drop the irrelevant paragraph about "--exclude" and write something like this instead? When "--decorate-refs=" is given, only the refs that match the pattern is used in decoration. The refs that match the pattern, when "--decorate-refs-exclude=" is given, are never used in decoration. What you explained was the reason I mentioned that. Because some users were wrongfully trying to remove decorations by trying to exclude the starting points. But I agree this adds little value and can generate further confusion. I will remove that section. Both can be used together but --decorate-refs-exclude patterns have precedence over --decorate-refs patterns. A reasonable and an easy-to-explain way to mix zero or more positive and zero or more negagive patterns that follows the convention used elsewhere in the system (e.g. how negative pathspecs work) is (1) if there is no positive pattern given, pretend as if an inclusive default positive pattern was given; (2) for each candidate, reject it if it matches no positive pattern, or if it matches any one of negative patterns. For pathspecs, we use "everything" as the inclusive default positive pattern, I think, and for the set of refs used for decoration, a reasonable choice would also be to use "everything", which matches the current behaviour. That's a nice explanation that fits the current "--decorate-refs" behavior. The pattern follows similar rules as `--glob` except it doesn't assume a trailing '/*' if glob characters are missing. Why should this be a special case that burdens users to remember one more rule? Wouldn't users find "--decorate-refs=refs/tags" useful and it woulld be shorter and nicer than having to say "refs/tags/*"? I wanted to allow exact patterns like: "--decorate-refs=refs/heads/master" and for that I disabled the flag that adds the trailing '/*' if no globs are found. As a side effect, I lost the shortcut. Is adding a yet another flag that appends '/*' only if the pattern equals "refs/{heads,remotes,tags}" a good idea? Because changing the default behavior of that function has implications on multiple commands which I think shouldn't change. But at the same time, would be nice to have the logic that deals with glob-ref patterns all in one place. What's the sane way to do this? diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt index 32246fdb0..314417d89 100644 --- a/Documentation/git-log.txt +++ b/Documentation/git-log.txt @@ -38,6 +38,18 @@ OPTIONS are shown as if 'short' were given, otherwise no ref names are shown. The default option is 'short'. +--decorate-refs=:: + Only print ref names that match the specified pattern. Uses the same + rules as `git rev-list --glob` except it doesn't assume a trailing a + trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['. + `--decorate-refs-exlclude` has precedence. + +--decorate-refs-exclude=:: + Do not print ref names that match the specified pattern. Uses the same + rules as `git rev-list --glob` except it doesn't assume a trailing a + trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['. + Has precedence over `--decorate-refs`. These two may be technically correct, but I wonder if we can make it easier to understand (I found "precedence" bit hard to follow, as in my mind, these are ANDed conditions and between (A & ~B), there is no "precedence"). Also we'd want to clarify what happens when only "--decorate-refs-exclude"s are given, which in turn necessitates us to describe what happens when only "--decorate-refs"s are given. I
Re: [PATCH v1 1/2] refs: extract function to normalize partial refs
On 04/11/17 02:27, Junio C Hamano wrote: Rafael Ascensãowrites: Signed-off-by: Kevin Daudt Signed-off-by: Rafael Ascensão Could you explain Kevin's sign-off we see above? It is a bit unusual (I am not yet saying it is wrong---I cannot judge until I find out why it is there) to see a patch from person X with sign off from person Y and then person X in that order. It is normal for a patch authored by person X to have sign-off by X and then Y if X wrote it, signed it off and passed to Y, and then Y resent it after signing it off (while preserving the authorship of X by adding an in-body From: header), but I do not think that is what we have here. It could be that you did pretty much all the work on this patch and Kevin helped you to polish this patch off-list, in which case the usual thing to do is to use "Helped-by: Kevin" instead. That's more or less what happened. I wouldn't say I did "pretty much all the work". Yes, I wrote the code but with great help of Kevin. The intention of the dual Signed-off-by was to equally attribute authorship of the patch. But if that creates ambiguity I will change it to "Helped-by" as suggested. It is better to use "unsigned" for a single word "flags" used as a collection of bits. In older parts of the codebase, we have codepaths that pass signed int as a flags word, simply because we didn't know better, but we do not have to spread that practice to new code. I noticed this, but chose to "mimic" the code around me. I'll correct it. On a related note is there a guideline for defining flags or are `#define FLAG (1u << 0)`, `#define FLAG (1 << 0)` `#define FLAG 1` and `#define FLAG 0x1` equally accepted? { - struct strbuf real_pattern = STRBUF_INIT; - struct ref_filter filter; - int ret; - if (!prefix && !starts_with(pattern, "refs/")) - strbuf_addstr(_pattern, "refs/"); + strbuf_addstr(normalized_pattern, "refs/"); else if (prefix) - strbuf_addstr(_pattern, prefix); - strbuf_addstr(_pattern, pattern); + strbuf_addstr(normalized_pattern, prefix); + strbuf_addstr(normalized_pattern, pattern); - if (!has_glob_specials(pattern)) { + if (!has_glob_specials(pattern) && (flags & ENSURE_GLOB)) { /* Append implied '/' '*' if not present. */ - strbuf_complete(_pattern, '/'); + strbuf_complete(normalized_pattern, '/'); /* No need to check for '*', there is none. */ - strbuf_addch(_pattern, '*'); + strbuf_addch(normalized_pattern, '*'); } +} The above looks like a pure and regression-free code movement (plus a small new feature) that is faithful to the original, which is good. I however notice that addition of /* to the tail is trying to be careful by using strbuf_complete('/'), but prefixing with "refs/" does not and we would end up with a double-slash if pattern begins with a slash. The contract between the caller of this function (or its original, which is for_each_glob_ref_in()) and the callee is that prefix must not begin with '/', so it may be OK, but we might want to add "if (*pattern == '/') BUG(...)" at the beginning. I dunno. In any case, that is totally outside the scope of this two patch series. I guess it doesn't hurt adding that safety net. Thanks. Other than the above minor points, looks good to me. I'll fix the mentioned issues. Thanks.