Re: [PATCH 0/4] dropping support for older curl
On Thu, Aug 10, 2017 at 07:09:02PM -0400, Jeff King wrote: > The first is "should we eventually drop support for antiquated versions > of dependencies?". And the argument in favor is the one I was making > here: besides lowering maintenance cost, it is more honest to our users > about what to expect[1]. As usual, I forgot all my footnotes. [1] When I've talked about keeping expectations reasonable, I'm a lot less interested in "oops, I built Git and this particular feature didn't work". It's easy to write that off as "well, you have an old version of curl, patches welcome". I'm much more concerned about security issues. Curl is network-facing. Leaving aside security vulnerabilities in curl itself (which hopefully distros with 10-year support periods would fix), I wouldn't be surprised if there are bad interactions possible due to our tangle of ifdefs. One way to address that would be more careful auditing. But then that goes back to the cost/benefit thing. > One is to do it by date and what dependencies are in long-term OS > releases, and then compare that to the benefit. Requiring curl 7.11.1 > still keeps us working back to rhel4, which was already end-of-lifed > completely after a 12 year run. Bumping to 7.16.0 drops rhel4 and rhel5, > the latter of which is in its final "barely supported" phase after 10 > years. But it gives us a bit more bang for our buck by making CURL_MULTI > uconditional[2]. Requiring 7.19.4 actually doesn't drop any more rhel > releases. So by that metric, we might as well go there. [2] The line-count change from dropping CURL_MULTI isn't _too_ exciting. But a lot of the tangled design of our http code revolves around the abstractions we've introduced. I have a feeling that it will enable further cleanups as we move forward (OTOH, a lot of the worst parts of our design are because of _using_ curl_multi for dumb http, which of course hardly anyone does these days. But I have a feeling if I suggested removing that, people would really scream). -Peff
Re: [PATCH] apply: remove prefix_length member from apply_state
On Wed, Aug 09, 2017 at 05:54:46PM +0200, René Scharfe wrote: > Use a NULL-and-NUL check to see if we have a prefix and consistently use > C string functions on it instead of storing its length in a member of > struct apply_state. This avoids strlen() calls and simplifies the code. I had to read the code to figure out exactly what you meant by NULL-and-NUL (and even then it took me a minute). I thought at first the latter half just means "use starts_with to walk the string incrementally rather than bothering to find the length ahead of time". Which makes perfect sense to me. But actually, I think you mean the final block which makes sure we have a non-empty string. > diff --git a/apply.c b/apply.c > index 970bed6d41..168dfe3d16 100644 > --- a/apply.c > +++ b/apply.c > @@ -80,7 +80,6 @@ int init_apply_state(struct apply_state *state, > { > memset(state, 0, sizeof(*state)); > state->prefix = prefix; > - state->prefix_length = state->prefix ? strlen(state->prefix) : 0; So we suspect that state->prefix might be NULL here. > @@ -786,11 +785,11 @@ static int guess_p_value(struct apply_state *state, > const char *nameline) >* Does it begin with "a/$our-prefix" and such? Then this is >* very likely to apply to our directory. >*/ > - if (!strncmp(name, state->prefix, state->prefix_length)) > + if (starts_with(name, state->prefix)) > val = count_slashes(state->prefix); At first this looked wrong to me. Don't we need to check for NULL? But the check is simply just outside the context, so we are good. > else { > cp++; > - if (!strncmp(cp, state->prefix, state->prefix_length)) > + if (starts_with(cp, state->prefix)) > val = count_slashes(state->prefix) + 1; > } And likewise for this one, which is part of the same block. > @@ -2088,10 +2087,9 @@ static int use_patch(struct apply_state *state, struct > patch *p) > int i; > > /* Paths outside are not touched regardless of "--include" */ > - if (0 < state->prefix_length) { > - int pathlen = strlen(pathname); > - if (pathlen <= state->prefix_length || > - memcmp(state->prefix, pathname, state->prefix_length)) > + if (state->prefix && *state->prefix) { > + const char *rest; > + if (!skip_prefix(pathname, state->prefix, ) || !*rest) > return 0; > } The check for *state->prefix here makes sure the behavior remains identical. I wondered at first whether it's actually necessary. Wouldn't an empty prefix always match? But I think we're looking for the pathname to be a proper superset of the prefix (hence the "!*rest" check). So I guess an empty path would not be a proper superset of an empty prefix, even though with the current code it doesn't hit that block at all. I'm still not sure it's possible to have an empty pathname, so that corner case may be moot and we could simplify the condition a little. But certainly as you've written it, it could not be a regression. The use of skip_prefix also took me a minute. I wonder if it's worth a comment with the words "proper superset" or some other indicator (it also surprised me that we're ok with matching "foobar" in the prefix "foo", and not demanding "foo/bar". But I didn't look at the context to know whether that's right or not. This may be a case where the prefix is supposed to have "/" on it already). -Peff
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On 11/08/17 00:10, Jeff King wrote: > On Fri, Aug 11, 2017 at 12:02:49AM +0100, Ramsay Jones wrote: > >>> But some of those things are not 1:1 mappings with normalization. For >>> instance, --json presumably implies --only-trailers. Or are we proposing >>> to break the whole commit message down into components and output it all >>> as json? >> >> Hmm, to me the operation wasn't so much a normalization, rather >> it was an --unfold (or maybe --un-fold ;-D). I suppose going in >> the other direction could be --fold=72, or some such. > > But I really don't want callers to think of it as "unfold". I want it to > be "turn this into something I can parse simply". Hence if we were to > find another case where the output is irregular, I'd feel comfortable > calling that a bug and fixing it (one that I suspect but haven't tested > is that alternate separators probably should all be converted to > colons). Ah, yes, good point. ATB, Ramsay Jones
Re: [PATCH 0/4] dropping support for older curl
On 11/08/17 01:23, Jeff King wrote: On Fri, Aug 11, 2017 at 01:17:51AM +0200, Tom G. Christensen wrote: OK, thanks for double-checking. I'm still puzzled why your build succeeds and mine does not. I know what's going on now and it's so simple. Red Hats version of curl 7.15.5 includes a number of patches including one that backports support for CURLPROTO_* (as part of a fix for CVE-2009-0037). I haven't checked el6 but I would not be surprised if there where similar things going on there. el6 should have it already as part of 7.19.7, right? Yes of course. So in conclusion version based #ifdefs are misleading when used with curl as shipped with RHEL. Yeah, that's certainly an interesting finding. In this case your builds are missing out on redirect protection that we _could_ be providing. Yes and I'm looking into that right now. If we do keep the compat ifdefs around this feature, it may be worth converting them to "#ifdef CURLPROTO_HTTP" to more directly check the feature. Yes, a feature test would be better. -tgc
Re: [PATCH 0/4] dropping support for older curl
On Fri, Aug 11, 2017 at 01:17:51AM +0200, Tom G. Christensen wrote: > > OK, thanks for double-checking. I'm still puzzled why your build > > succeeds and mine does not. > > I know what's going on now and it's so simple. > Red Hats version of curl 7.15.5 includes a number of patches including one > that backports support for CURLPROTO_* (as part of a fix for CVE-2009-0037). > I haven't checked el6 but I would not be surprised if there where similar > things going on there. el6 should have it already as part of 7.19.7, right? > So in conclusion version based #ifdefs are misleading when used with curl as > shipped with RHEL. Yeah, that's certainly an interesting finding. In this case your builds are missing out on redirect protection that we _could_ be providing. If we do keep the compat ifdefs around this feature, it may be worth converting them to "#ifdef CURLPROTO_HTTP" to more directly check the feature. -Peff
Re: [PATCH v2 1/2] format-patch: have progress option while generating patches
On Thu, Aug 10, 2017 at 02:32:55PM -0400, Kevin Willford wrote: > @@ -1493,6 +1496,8 @@ int cmd_format_patch(int argc, const char **argv, const > char *prefix) > OPT_FILENAME(0, "signature-file", _file, > N_("add a signature from a file")), > OPT__QUIET(, N_("don't print the patch filenames")), > + OPT_BOOL(0, "progress", _progress, > + N_("show progress while generating patches")), Earlier I suggested allowing --progress="custom text" since this may be driven as plumbing for other commands. But I don't think there's any need to worry about it now. It can be added seamlessly later if we find such a caller. > @@ -1752,8 +1757,12 @@ int cmd_format_patch(int argc, const char **argv, > const char *prefix) > start_number--; > } > rev.add_signoff = do_signoff; > + > + if (show_progress) > + progress = start_progress_delay(_("Generating patches"), total, > 0, 1); I don't really have an opinion on a 1 second delay versus 2. I thought we used 2 pretty consistently, though grepping around I do see a couple of 1's. It probably doesn't matter, but just a curiosity. -Peff
Re: [PATCH 0/4] dropping support for older curl
On 11/08/17 00:54, Jeff King wrote: On Fri, Aug 11, 2017 at 12:23:42AM +0200, Tom G. Christensen wrote: Er, sorry if I'm being dense, but how? Are you suggesting that by removing the callsite of get_curl_allowed_protocols(), the compiler might elide the now-dead code completely? I could certainly see it being dropped after the compilation, but I'm surprised that it wouldn't complain about the undeclared identifiers in the first place. You're right, that should not be able to handle it. Can you please double-check that you're building against the correct version of curl, and that you are building the HTTP parts of Git (which _are_ optional, and the test suite will pass without them). I use a mock buildroot and there is no other curl than the vendor supplied 7.15.5 installed: [...] OK, thanks for double-checking. I'm still puzzled why your build succeeds and mine does not. I know what's going on now and it's so simple. Red Hats version of curl 7.15.5 includes a number of patches including one that backports support for CURLPROTO_* (as part of a fix for CVE-2009-0037). I haven't checked el6 but I would not be surprised if there where similar things going on there. So in conclusion version based #ifdefs are misleading when used with curl as shipped with RHEL. -tgc
Re: [PATCH v2 0/2] Add progress for format-patch and rebase
On Thu, Aug 10, 2017 at 03:48:31PM -0700, Junio C Hamano wrote: > Kevin Willfordwrites: > > > Changes since last patch: > > 1. Use start_progress_delay so progress isn't shown if generating > >the patches is fast enough > > 2. Updated to have text of "Generating patches" > > 3. Only show progress when the --progress flag is passed > > 4. In the rebase script check stderr and the quiet option is not > >set before propagating the progress flag to format-patch > > > > Kevin Willford (2): > > format-patch: have progress option while generating patches > > rebase: turn on progress option by default for format-patch > > Do you have a pointer to the previous discussion handy (if you do > not, that is OK---I think I can dig the list archive myself)? https://public-inbox.org/git/20170531150427.7820-1-kewi...@microsoft.com/ is what I turned up. Overall this version looks good to me, and addresses all of the previous points. I have two minor points which I'll make inline. -Peff
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Fri, Aug 11, 2017 at 12:02:49AM +0100, Ramsay Jones wrote: > > But some of those things are not 1:1 mappings with normalization. For > > instance, --json presumably implies --only-trailers. Or are we proposing > > to break the whole commit message down into components and output it all > > as json? > > Hmm, to me the operation wasn't so much a normalization, rather > it was an --unfold (or maybe --un-fold ;-D). I suppose going in > the other direction could be --fold=72, or some such. But I really don't want callers to think of it as "unfold". I want it to be "turn this into something I can parse simply". Hence if we were to find another case where the output is irregular, I'd feel comfortable calling that a bug and fixing it (one that I suspect but haven't tested is that alternate separators probably should all be converted to colons). > [blue is my favourite colour ... :-P ] Yes, I'm feeling that, too. :-/ -Peff
Re: [PATCH 0/4] dropping support for older curl
On Thu, Aug 10, 2017 at 03:17:06PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote: > > > >> Hopefully I had better luck expressing my concerns this time? > > > > I understand your argument much better now. I'm still not sure I agree. > > > > -Peff > > I do not think "there are a dozen #ifdefs and I don't know whether > they still work. I don't know whether anybody (who most likely has > better things to do than read the Git mailing list) is still using > those. So let's just remove them." was why you were suggesting to > clean up the (apparent) support of older curl in the code, though. > > Isn't the reason why your series simplifies these #ifdefs away > because we by accident started using some features that require a > version that is even newer than any of these #ifdef's try to cater > to and yet nobody complained? That is a lot more similar to the > removal of rsync transport that happened in a not so distant past, > where the reason for removal was "We have been shipping code that > couldn't have possibly worked for some time and nobody complained > ---we know nobody is depending on it." I think there are two questions to be asked, and their answers come from different lines of reasoning. The first is "should we eventually drop support for antiquated versions of dependencies?". And the argument in favor is the one I was making here: besides lowering maintenance cost, it is more honest to our users about what to expect[1]. The second is "how far back should we keep support?". And there are two lines of thought there. One is to do it by date and what dependencies are in long-term OS releases, and then compare that to the benefit. Requiring curl 7.11.1 still keeps us working back to rhel4, which was already end-of-lifed completely after a 12 year run. Bumping to 7.16.0 drops rhel4 and rhel5, the latter of which is in its final "barely supported" phase after 10 years. But it gives us a bit more bang for our buck by making CURL_MULTI uconditional[2]. Requiring 7.19.4 actually doesn't drop any more rhel releases. So by that metric, we might as well go there. And the second line of thought is: it was already broken and nobody reported it or offered up a fix. And that's where the "similar to rsync" thing comes in. Though in this case we do have some evidence that people (at least Tom) was patching and distributing behind the scenes. And our breakage period was much shorter (since v2.12.0, but that's only months or so). -Peff
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On 10/08/17 22:10, Jeff King wrote: > On Thu, Aug 10, 2017 at 11:06:40PM +0200, Christian Couder wrote: > Related to this, I wonder if people might want to "normalize" in different ways later. If that happens, we might regret having called this option "--normalize" instead of "--one-per-line" for example. >>> >>> What is normal? >>> >>> Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] >>> then? >> >> Yeah, we could go there right now (using perhaps "--pretty" or >> "--format" instead of "--style", so that we are more consistent with >> other commands), but on the other hand we don't know yet if the other >> formats will really be needed. > > But some of those things are not 1:1 mappings with normalization. For > instance, --json presumably implies --only-trailers. Or are we proposing > to break the whole commit message down into components and output it all > as json? Hmm, to me the operation wasn't so much a normalization, rather it was an --unfold (or maybe --un-fold ;-D). I suppose going in the other direction could be --fold=72, or some such. [blue is my favourite colour ... :-P ] ATB, Ramsay Jones
Re: [PATCH 0/4] dropping support for older curl
On Fri, Aug 11, 2017 at 12:23:42AM +0200, Tom G. Christensen wrote: > > > I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems > > > at > > > all neither with building nor with running the testsuite. > > > > As you can see, this does not compile for me. What's going on? > > > The call site for get_curl_allowed_protocols() in http.c is still protected > by an #if: > #if LIBCURL_VERSION_NUM >= 0x071304 > curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, > get_curl_allowed_protocols(0)); > curl_easy_setopt(result, CURLOPT_PROTOCOLS, > get_curl_allowed_protocols(-1)); > #else > warning("protocol restrictions not applied to curl redirects > because\n" > "your curl version is too old (>= 7.19.4)"); > #endif > > > I don't see how it could work, as CURLPROTO_HTTP is not defined at all > > in that version of curl. > > Indeed but the #if will handle that. Er, sorry if I'm being dense, but how? Are you suggesting that by removing the callsite of get_curl_allowed_protocols(), the compiler might elide the now-dead code completely? I could certainly see it being dropped after the compilation, but I'm surprised that it wouldn't complain about the undeclared identifiers in the first place. And if that _is_ what is happening...that seems like a very fragile and unportable thing to be depending on. > > Can you please double-check that you're > > building against the correct version of curl, and that you are building > > the HTTP parts of Git (which _are_ optional, and the test suite will > > pass without them). > > I use a mock buildroot and there is no other curl than the vendor supplied > 7.15.5 installed: > [...] OK, thanks for double-checking. I'm still puzzled why your build succeeds and mine does not. > > I saw that, too. But as I understand it, they provide no code updates: > > no bugfixes and no security updates. They just promise to answer the > > phone and help you with troubleshooting. It's possible my perception is > > wrong, though; I'm certainly not one of their customers. > > I am refering to the Extended Life-cycle Support product (ELS), which > promises: > "the ELS Add-On delivers certain critical-impact security fixes and selected > urgent priority bug fixes and troubleshooting for the last minor release" > > The full description is here: > https://access.redhat.com/support/policy/updates/errata#Extended_Life_Cycle_Phase That was the same page I was looking at. The bit I read was: For versions of products in the Extended Life Phase, Red Hat will provide limited ongoing technical support. No bug fixes, security fixes, hardware enablement or root-cause analysis will be available during this phase, and support will be provided on existing installations only. But I missed the bit about the "ELS add-on" below there, which I guess is an extra thing. I do suspect that "install arbitrary new versions of Git" is outside of their scope of "urgent priority bug fixes". But in a sense it doesn't really matter. What is much more interesting is whether there's a significant population that is running RHEL5 and has a strong need for newer versions of Git. That I'm not sure about. -Peff
Re: [PATCH v2 0/2] Add progress for format-patch and rebase
Kevin Willfordwrites: > Changes since last patch: > 1. Use start_progress_delay so progress isn't shown if generating >the patches is fast enough > 2. Updated to have text of "Generating patches" > 3. Only show progress when the --progress flag is passed > 4. In the rebase script check stderr and the quiet option is not >set before propagating the progress flag to format-patch > > Kevin Willford (2): > format-patch: have progress option while generating patches > rebase: turn on progress option by default for format-patch Do you have a pointer to the previous discussion handy (if you do not, that is OK---I think I can dig the list archive myself)? Also, your patch messages appear from you at gmail yet signed off by you at microsoft. Please make them consistent (I can fix them up while queuing _this_ time, but I do not want forever doing that for future patches from you or other people). One easy workaround to do so without convincing your mailer to put your microsoft address on the sender's From: line in the e-mail is to put an extra From: Kevin Without line, followed by a blank line, at the very beginning of the body of the e-mail message. Thanks.
Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
Jonathan Tanwrites: > On Thu, 10 Aug 2017 14:19:59 -0700 > Junio C Hamano wrote: > >> Jonathan Tan writes: >> >> > Here is the complete patch set. I have only moved the exported functions >> > that operate with packfiles and their static helpers - for example, >> > static functions like freshen_packed_object() that are used only by >> > non-pack-specific functions are not moved. >> >> This will interfere with smaller changes and fixes we want to have >> early in the 'master' branch, so while I think it is a good idea to >> do something like this in the longer term, I'd have to ask you to >> either hold on or rebase this on them (you'll know what else you are >> conflicting with when you try to merge this to 'pu' yourself). >> >> Thanks. > > OK, I'll wait until you have updated the master branch, then I'll try to > rebase on it. That will take a few weeks, and I do not think we want you idling during that time ;-). You'd need to double check, but I think the topics that cause trouble are rs/find-apck-entry-bisection and jk/drop-sha1-entry-pos; you can start from v2.14.1 and merge these topics on top and then build your change on top. That would allow you to start cooking before both of them graduate to 'master', as I expect they are both quick-to-next material. There might be other topics that interfere with what you are doing, but you can easily find out what they are if you do a trial merge to 'next' and 'pu' yourself. Thanks.
Re: [PATCH 5/9] Convert sha1_file.c to size_t
Martin Koeglerwrites: > From: Martin Koegler > > Signed-off-by: Martin Koegler > --- Please do not start your patch series from 5/9 when there is no 1/9, 2/9, 3/9, and 4/9. It is seriously confusing. I am guessing that you are trying to split the series into manageable pieces by going per call graph and codeflow. I think it is a more sensible approach than a single huge ball of wax we saw earlier. It may take me a while to get back to this topic before I finish reviewing other new topics in flight and also merging down existing topics so that the codebase will become reasonably stable for a topic that is invasive like this one can safely land. Please be patient. Thanks. > cache.h | 16 +++ > sha1_file.c | 68 > ++--- > streaming.c | 2 +- > 3 files changed, 43 insertions(+), 43 deletions(-) > > diff --git a/cache.h b/cache.h > index 9185763..9322303 100644 > --- a/cache.h > +++ b/cache.h > @@ -1189,15 +1189,15 @@ static inline const unsigned char > *lookup_replace_object(const unsigned char *sh > > /* Read and unpack a sha1 file into memory, write memory to a sha1 file */ > extern int sha1_object_info(const unsigned char *, size_t *); > -extern int hash_sha1_file(const void *buf, unsigned long len, const char > *type, unsigned char *sha1); > -extern int write_sha1_file(const void *buf, unsigned long len, const char > *type, unsigned char *return_sha1); > -extern int hash_sha1_file_literally(const void *buf, unsigned long len, > const char *type, unsigned char *sha1, unsigned flags); > -extern int pretend_sha1_file(void *, unsigned long, enum object_type, > unsigned char *); > +extern int hash_sha1_file(const void *buf, size_t len, const char *type, > unsigned char *sha1); > +extern int write_sha1_file(const void *buf, size_t len, const char *type, > unsigned char *return_sha1); > +extern int hash_sha1_file_literally(const void *buf, size_t len, const char > *type, unsigned char *sha1, unsigned flags); > +extern int pretend_sha1_file(void *, size_t, enum object_type, unsigned char > *); > extern int force_object_loose(const unsigned char *sha1, time_t mtime); > extern int git_open_cloexec(const char *name, int flags); > #define git_open(name) git_open_cloexec(name, O_RDONLY) > -extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size); > -extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, > unsigned long mapsize, void *buffer, unsigned long bufsiz); > +extern void *map_sha1_file(const unsigned char *sha1, size_t *size); > +extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, > size_t mapsize, void *buffer, size_t bufsiz); > extern int parse_sha1_header(const char *hdr, size_t *sizep); > > /* global flag to enable extra checks when accessing packed objects */ > @@ -1723,8 +1723,8 @@ extern off_t find_pack_entry_one(const unsigned char > *sha1, struct packed_git *) > > extern int is_pack_valid(struct packed_git *); > extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, > size_t *); > -extern unsigned long unpack_object_header_buffer(const unsigned char *buf, > unsigned long len, enum object_type *type, size_t *sizep); > -extern unsigned long get_size_from_delta(struct packed_git *, struct > pack_window **, off_t); > +extern size_t unpack_object_header_buffer(const unsigned char *buf, size_t > len, enum object_type *type, size_t *sizep); > +extern size_t get_size_from_delta(struct packed_git *, struct pack_window > **, off_t); > extern int unpack_object_header(struct packed_git *, struct pack_window **, > off_t *, size_t *); > > /* > diff --git a/sha1_file.c b/sha1_file.c > index 3428172..1b3efea 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -51,7 +51,7 @@ static struct cached_object { > unsigned char sha1[20]; > enum object_type type; > void *buf; > - unsigned long size; > + size_t size; > } *cached_objects; > static int cached_object_nr, cached_object_alloc; > > @@ -818,8 +818,8 @@ static int check_packed_git_idx(const char *path, struct > packed_git *p) >* variable sized table containing 8-byte entries >* for offsets larger than 2^31. >*/ > - unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; > - unsigned long max_size = min_size; > + size_t min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; > + size_t max_size = min_size; > if (nr) > max_size += (nr - 1)*8; > if (idx_size < min_size || idx_size > max_size) { > @@ -1763,7 +1763,7 @@ static int open_sha1_file(const unsigned char *sha1, > const char **path) > */ > static void *map_sha1_file_1(const char *path, >const unsigned char *sha1, > -
Re: [PATCH 0/4] dropping support for older curl
On 10/08/17 23:32, Jeff King wrote: On Thu, Aug 10, 2017 at 10:33:18PM +0200, Tom G. Christensen wrote: You've totally ignored the argument I made back then[1], and which I reiterated in this thread. So I'll say it one more time: the more compelling reason is not the #ifdefs, but the fact that the older versions are totally untested. Perhaps you forgot but I stated in the original thread that I build RPMS for RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every single time. I didn't forget. I actually double-checked the patches you sent at the time, but I didn't see one for the CURLPROTO issue. And indeed, it is still broken for me: $ cd /path/to/curl/repo $ git checkout curl-7_15_5 $ ./buildconf && ./configure --prefix=/tmp/foo && make install $ cd /path/to/git $ git checkout v2.14.0 $ make CURLDIR=/tmp/foo V=1 http.o gcc -o http.o -c -MF ./.depend/http.o.d -MQ http.o -MMD -MP -g -O0 -Wall -Werror -Wdeclaration-after-statement -Wpointer-arith -Wstrict-prototypes -Wvla -Wold-style-declaration -Wold-style-definition -Wno-error -Wno-cpp -Wno-unused-value -Wno-strict-prototypes -I. -DUSE_LIBPCRE1 -DHAVE_ALLOCA_H -I/tmp/foo/include -DUSE_CURL_FOR_IMAP_SEND -DNO_GETTEXT -DSHA1_DC -DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 -DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"sha1dc_git.c\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"sha1dc_git.h\"" -DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\"" -DHAVE_PATHS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM -DFREAD_READS_DIRECTORIES -DNO_STRLCPY -DSHELL_PATH=' "/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"' http.c http.c: In function ‘get_curl_allowed_protocols’: http.c:685:24: error: ‘CURLPROTO_HTTP’ undeclared (first use in this function); did you mean ‘CURLPROXY_HTTP’? allowed_protocols |= CURLPROTO_HTTP; ^~ CURLPROXY_HTTP [and so on] I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at all neither with building nor with running the testsuite. As you can see, this does not compile for me. What's going on? The call site for get_curl_allowed_protocols() in http.c is still protected by an #if: #if LIBCURL_VERSION_NUM >= 0x071304 curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, get_curl_allowed_protocols(0)); curl_easy_setopt(result, CURLOPT_PROTOCOLS, get_curl_allowed_protocols(-1)); #else warning("protocol restrictions not applied to curl redirects because\n" "your curl version is too old (>= 7.19.4)"); #endif I don't see how it could work, as CURLPROTO_HTTP is not defined at all in that version of curl. Indeed but the #if will handle that. Can you please double-check that you're building against the correct version of curl, and that you are building the HTTP parts of Git (which _are_ optional, and the test suite will pass without them). I use a mock buildroot and there is no other curl than the vendor supplied 7.15.5 installed: # pwd /var/lib/mock/jrpms-el5-x86_64/root # find . -name 'curlver.h' ./usr/include/curl/curlver.h # grep LIBCURL_VERSION_NUM ./usr/include/curl/curlver.h parsing and comparions by programs. The LIBCURL_VERSION_NUM define will #define LIBCURL_VERSION_NUM 0x070f05 # [root@c5-32bit-01 ~]# rpm -q git git-2.14.1-1.el5.jr [root@c5-32bit-01 ~]# ldd /usr/libexec/git-core/git-http-fetch |grep libcurl libcurl.so.3 => /usr/lib/libcurl.so.3 (0x001e7000) [root@c5-32bit-01 ~]# rpm -qf /usr/lib/libcurl.so.3 curl-7.15.5-17.el5_9 [root@c5-32bit-01 ~]# git --version git version 2.14.1 [root@c5-32bit-01 ~]# git clone https://github.com/tgc/tgcware-for-solaris.git Cloning into 'tgcware-for-solaris'... warning: protocol restrictions not applied to curl redirects because your curl version is too old (>= 7.19.4) remote: Counting objects: 2793, done. remote: Total 2793 (delta 0), reused 0 (delta 0), pack-reused 2793 Receiving objects: 100% (2793/2793), 780.88 KiB | 639.00 KiB/s, done. Resolving deltas: 100% (1233/1233), done. [root@c5-32bit-01 ~]# I also won't claim any absolutes. I think we all agree this is a cost/benefit tradeoff. But there are a lot of options for building on a very old system. For instance, building without http if you don't need it. Or building a more recent libcurl (and even linking statically for simplicity). Of course that is always an option but it does complicate things. I'd find arguments against the latter more compelling if recent curl were hard to compile on old systems. I don't know whether that's the case (certainly on a modern system, it's much easier to get newer versions of curl to compile than older ones). I have no experience with building curl on older Linux systems. I know that I can build it on old Solaris releases but that is not
Re: [PATCH] commit: skip discarding the index if there is no pre-commit hook
Kevin Willfordwrites: > If there is not a pre-commit hook, there is no reason to discard > the index and reread it. > > This change checks to presence of a pre-commit hook and then only > discards the index if there was one. > > Signed-off-by: Kevin Willford > --- Peff already has done a good job reviewing the patch text, and I agree that this is a worthwhile optimization. Could Microsoft folks all make sure that their signed-off-by lines match their From: address (or leave an in-body From: to override the From: address your MUA places in your messages)? Thanks. > builtin/commit.c | 29 + > 1 file changed, 21 insertions(+), 8 deletions(-) > > diff --git a/builtin/commit.c b/builtin/commit.c > index e7a2cb6285..443949d87b 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -671,12 +671,22 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > const char *hook_arg2 = NULL; > int clean_message_contents = (cleanup_mode != CLEANUP_NONE); > int old_display_comment_prefix; > + const char *precommit_hook = NULL; > > /* This checks and barfs if author is badly specified */ > determine_author_info(author_ident); > > - if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", > NULL)) > - return 0; > + > + if (!no_verify) { > + /* > + * Check to see if there is a pre-commit hook > + * If there not one we can skip discarding the index later on > + */ > + precommit_hook = find_hook("pre-commit"); > + if (precommit_hook && > + run_commit_hook(use_editor, index_file, "pre-commit", NULL)) > + return 0; > + } > > if (squash_message) { > /* > @@ -940,12 +950,15 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > return 0; > } > > - /* > - * Re-read the index as pre-commit hook could have updated it, > - * and write it out as a tree. We must do this before we invoke > - * the editor and after we invoke run_status above. > - */ > - discard_cache(); > + if (!no_verify && precommit_hook) { > + /* > + * Re-read the index as pre-commit hook could have updated it, > + * and write it out as a tree. We must do this before we invoke > + * the editor and after we invoke run_status above. > + */ > + discard_cache(); > + } > + > read_cache_from(index_file); > if (update_main_cache_tree(0)) { > error(_("Error building trees"));
Re: [PATCH 0/4] dropping support for older curl
Jeff Kingwrites: > On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote: > >> Hopefully I had better luck expressing my concerns this time? > > I understand your argument much better now. I'm still not sure I agree. > > -Peff I do not think "there are a dozen #ifdefs and I don't know whether they still work. I don't know whether anybody (who most likely has better things to do than read the Git mailing list) is still using those. So let's just remove them." was why you were suggesting to clean up the (apparent) support of older curl in the code, though. Isn't the reason why your series simplifies these #ifdefs away because we by accident started using some features that require a version that is even newer than any of these #ifdef's try to cater to and yet nobody complained? That is a lot more similar to the removal of rsync transport that happened in a not so distant past, where the reason for removal was "We have been shipping code that couldn't have possibly worked for some time and nobody complained ---we know nobody is depending on it." Or "We accidentally started shipping code with comma after the last element of enum decl and nobody compalined---everybody's compiler must be ready" ;-)
Re: [PATCH V2 2/2] Convert size datatype to size_t
Martin Koeglerwrites: > For next. As this touches core functions, it will likely produce > conflicts with other changes. Please provide the commit you want > to rebase the patch on and I'll produce a V3. No matter what base you pick, by the time the series is merged with other topics in flight to form an updated 'pu' branch, any series of this invasiveness will cause conflict. So from that point of view, picking 'master' or 'next' as the base would not make much difference. However, picking 'next' (or 'pu') as the base is definitely worse than 'master' for a different reason. Anything based on 'next', even though it may apply cleanly there, will not be able to graduate to 'master' without dragging all the other topics that are in 'next' with it. Immediately after a feature release is the worst time, as we will rewind and rebuild 'next' on top of 'master'. In practice, the only sensible base for an invasive change is the mimimum one you create yourself. You would: (1) Start from a reasonably stable base, like 'master'. (2) Among topics that are in flight but not in 'master', find the ones that materially interfere with your changes. Merge them on top of (1). (3) Then build your change on top. In the patch series you create in step 3, you would note which base you chosen (e.g. "v2.14.1") in step 1, plus the names of the topics you merged in step 2, after three-dash lines. The set of topics you find in step 2 might end up including a topic that is of dubious doneness (e.g. especially the ones that are not yet in 'next'). In such a case, you or the other topic may have to yield and wait for the other to stabilize. Git is not a substitute for inter-developer communication, and you'd talk to the author of the other topic and coordinate between yourselves when it happens. Thanks.
Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
On Thu, 10 Aug 2017 14:19:59 -0700 Junio C Hamanowrote: > Jonathan Tan writes: > > > Here is the complete patch set. I have only moved the exported functions > > that operate with packfiles and their static helpers - for example, > > static functions like freshen_packed_object() that are used only by > > non-pack-specific functions are not moved. > > This will interfere with smaller changes and fixes we want to have > early in the 'master' branch, so while I think it is a good idea to > do something like this in the longer term, I'd have to ask you to > either hold on or rebase this on them (you'll know what else you are > conflicting with when you try to merge this to 'pu' yourself). > > Thanks. OK, I'll wait until you have updated the master branch, then I'll try to rebase on it.
Re: [PATCH 3/4] Convert zlib.c to size_t
Martin Koeglerwrites: > From: Martin Koegler > > Signed-off-by: Martin Koegler > --- Thanks. I haven't thought things through but this looks sensible. Will queue.
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
René Scharfewrites: > getdelim(3) returns -1 at the end of the file and if it encounters an > error, but sets errno only in the latter case. Set errno to zero before > calling it to avoid misdiagnosing an out-of-memory condition due to a > left-over value from some other function call. > > Reported-by: Yaroslav Halchenko > Suggested-by: Junio C Hamano > Signed-off-by: Rene Scharfe Heh. I mumble something vague then people more capable than me jump in to take it to the conclusion, and still I get the credit. I wish all the debugging sessions were this easy ;-) > --- > Do we need to save and restore the original value of errno? I doubt it, > but didn't think deeply about it, yet. We probably don't need to---a caller who knows it got an error before calling this function and wants to use errno after doing so should be stashing it away; after all, this function will clobber errno when any of the library calls it makes fails and this is on the I/O codepath, so anything can go wrong. Thanks. > > strbuf.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/strbuf.c b/strbuf.c > index 89d22e3b09..323c49ceb3 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int > term) > /* Translate slopbuf to NULL, as we cannot call realloc on it */ > if (!sb->alloc) > sb->buf = NULL; > + errno = 0; > r = getdelim(>buf, >alloc, term, fp); > > if (r > 0) {
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, 10 Aug 2017, Jeff King wrote: > On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > > getdelim(3) returns -1 at the end of the file and if it encounters an > > error, but sets errno only in the latter case. Set errno to zero before > > calling it to avoid misdiagnosing an out-of-memory condition due to a > > left-over value from some other function call. > Looks good to me. > > Do we need to save and restore the original value of errno? I doubt it, > > but didn't think deeply about it, yet. > I'd say no. Anybody depending on strbuf_getwholeline() is clearly > already wrong in the error case. And in general I think we assume that > syscalls can clear errno on success if they choose to (this isn't a > syscall, but obviously it is calling some). Shouldn't ideally errno being reset to 0 upon check of the syscall successful run right after that syscall? (I also see some spots within git code where it sets errno to ENOMEM) -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH 0/4] dropping support for older curl
On Thu, Aug 10, 2017 at 11:36:41AM +0200, Johannes Schindelin wrote: > Hopefully I had better luck expressing my concerns this time? I understand your argument much better now. I'm still not sure I agree. -Peff
Re: [PATCH 0/4] dropping support for older curl
On Thu, Aug 10, 2017 at 10:33:18PM +0200, Tom G. Christensen wrote: > > You've totally ignored the argument I made back then[1], and which I > > reiterated in this thread. So I'll say it one more time: the more > > compelling reason is not the #ifdefs, but the fact that the older > > versions are totally untested. > > Perhaps you forgot but I stated in the original thread that I build RPMS for > RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every > single time. I didn't forget. I actually double-checked the patches you sent at the time, but I didn't see one for the CURLPROTO issue. And indeed, it is still broken for me: $ cd /path/to/curl/repo $ git checkout curl-7_15_5 $ ./buildconf && ./configure --prefix=/tmp/foo && make install $ cd /path/to/git $ git checkout v2.14.0 $ make CURLDIR=/tmp/foo V=1 http.o gcc -o http.o -c -MF ./.depend/http.o.d -MQ http.o -MMD -MP -g -O0 -Wall -Werror -Wdeclaration-after-statement -Wpointer-arith -Wstrict-prototypes -Wvla -Wold-style-declaration -Wold-style-definition -Wno-error -Wno-cpp -Wno-unused-value -Wno-strict-prototypes -I. -DUSE_LIBPCRE1 -DHAVE_ALLOCA_H -I/tmp/foo/include -DUSE_CURL_FOR_IMAP_SEND -DNO_GETTEXT -DSHA1_DC -DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 -DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"sha1dc_git.c\"" -DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"sha1dc_git.h\"" -DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\"" -DHAVE_PATHS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME -DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM -DFREAD_READS_DIRECTORIES -DNO_STRLCPY -DSHELL_PATH='"/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"' http.c http.c: In function ‘get_curl_allowed_protocols’: http.c:685:24: error: ‘CURLPROTO_HTTP’ undeclared (first use in this function); did you mean ‘CURLPROXY_HTTP’? allowed_protocols |= CURLPROTO_HTTP; ^~ CURLPROXY_HTTP [and so on] > I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at > all neither with building nor with running the testsuite. As you can see, this does not compile for me. What's going on? I don't see how it could work, as CURLPROTO_HTTP is not defined at all in that version of curl. Can you please double-check that you're building against the correct version of curl, and that you are building the HTTP parts of Git (which _are_ optional, and the test suite will pass without them). > > So IMHO this is about being honest with users about which versions we > > _actually_ support. > > I have no problem with you wanting to drop support for older curl releases > (such as 7.15.5) but don't use the argument that it doesn't currently build > and nobody cares. My argument isn't quite that nobody cares. It's that we do users a disservice by shipping a version of the code that very well may have hidden problems like security holes (for instance, we do not handle redirects safely in old versions of curl). So if you can get it to build it may _seem_ fine, but it's a bit of a booby-trap waiting to spring. I also won't claim any absolutes. I think we all agree this is a cost/benefit tradeoff. But there are a lot of options for building on a very old system. For instance, building without http if you don't need it. Or building a more recent libcurl (and even linking statically for simplicity). I'd find arguments against the latter more compelling if recent curl were hard to compile on old systems. I don't know whether that's the case (certainly on a modern system, it's much easier to get newer versions of curl to compile than older ones). > Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle > Support program until 2020-11-30. I saw that, too. But as I understand it, they provide no code updates: no bugfixes and no security updates. They just promise to answer the phone and help you with troubleshooting. It's possible my perception is wrong, though; I'm certainly not one of their customers. -Peff
Re: [RFC] clang-format: outline the git project's coding style
On 08/10, Junio C Hamano wrote: > Brandon Williamswrites: > > > On 08/10, Junio C Hamano wrote: > > > >> I vaguely recall that there was a discussion to have SubmitGit wait > >> for success from Travis CI; if that is already in place, then I can > >> sort of see how it would help individual contributors to have the > >> style checker in that pipeline as well. > >> > >> I have a mixed feelings about "fixing" styles automatically, though. > > > > I still think we are far away from a world where we can fix style > > automatically. If we do want to keep pursuing this there are a number > > steps we'd want to take first. > > > > 1. Settle on a concrete style and document it using a formatter's rules > >(in say a .clang-format file). This style would most likely need to > >be tuned a little bit, at least the 'Penalty' configuration would > >need to be tuned which (as far as I understand it) is used to > >determine which rule to break first to ensure a line isn't too long. > > Yes. I think this is what you started to get the ball rolling. > Together with what checkpatch.pl already diagnoses, I think we can > get a guideline that is more or less reasonable. > > > 2. Start getting contributors to use the tool to format their patches. > >This would include having some script or hook that a contributor > >could run to only format the sections of code that they touched. > > This, too. Running checkpatch.pl (possibly combined with a bit of > tweaking it to match our needs) already catches many of the issues, > so a tool with a similar interface would be easy to use, I would > imagine. > > > 3. Slowly the code base would begin to have a uniform style. At > >some point we may want to then reformat the remaining sections of the > >code base. At this point we could have some automated bot that fixes > >style. > > I suspect I am discussing this based on a different assumption. > > I think the primary goal of this effort is to make it easier to > cleanse the new patches that appear on the list of trivial style > issues, so that contributors and reviewers do not have to spend > bandwidth and brain cycles during the review. And I have been > assuming that we can do so even without waiting for a "tree wide" > code churn on existing code to complete. Yes that's one of the steps I missed we can call it 2.5 ;) (3) could be a long term goal which is what I was trying to get at by saying: > > 3. Slowly the code base would begin to have a uniform style. -- Brandon Williams
Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
Jonathan Tanwrites: > Here is the complete patch set. I have only moved the exported functions > that operate with packfiles and their static helpers - for example, > static functions like freshen_packed_object() that are used only by > non-pack-specific functions are not moved. This will interfere with smaller changes and fixes we want to have early in the 'master' branch, so while I think it is a good idea to do something like this in the longer term, I'd have to ask you to either hold on or rebase this on them (you'll know what else you are conflicting with when you try to merge this to 'pu' yourself). Thanks.
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 11:06:40PM +0200, Christian Couder wrote: > >> Related to this, I wonder if people might want to "normalize" in > >> different ways later. If that happens, we might regret having called > >> this option "--normalize" instead of "--one-per-line" for example. > > > > What is normal? > > > > Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] > > then? > > Yeah, we could go there right now (using perhaps "--pretty" or > "--format" instead of "--style", so that we are more consistent with > other commands), but on the other hand we don't know yet if the other > formats will really be needed. But some of those things are not 1:1 mappings with normalization. For instance, --json presumably implies --only-trailers. Or are we proposing to break the whole commit message down into components and output it all as json? -Peff
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 9:44 PM, Stefan Bellerwrote: > On Thu, Aug 10, 2017 at 12:39 PM, Christian Couder > wrote: >> On Thu, Aug 10, 2017 at 8:37 PM, Jeff King wrote: >>> On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: >>> On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: > The point of "--only-trailers" is to give a caller an output > that's easy for them to parse. Getting rid of the > non-trailer material helps, but we still may see more > complicated syntax like whitespace continuation. Let's add > an option to normalize the output into one "key: value" line > per trailer. > > As a bonus, this could be used even without --only-trailers > to clean up unusual formatting in the incoming data. This is useful for the parsing part, but for the writing part we'd rather want to have the opposite thing, such as '--line-break=rfc822'. But this doesn't have to be part of this series. With this in mind, I do not quite understand the latter use case how you would use normalized trailers without --only-trailers? >>> >>> If you prefer the normalized form (and the input was line-broken in a >>> way that you don't like), then this would convert to your preferred >>> form. I agree that you could potentially want the opposite (folding long >>> lines). Perhaps something like --wrap=72. >> >> Related to this, I wonder if people might want to "normalize" in >> different ways later. If that happens, we might regret having called >> this option "--normalize" instead of "--one-per-line" for example. > > What is normal? > > Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] > then? Yeah, we could go there right now (using perhaps "--pretty" or "--format" instead of "--style", so that we are more consistent with other commands), but on the other hand we don't know yet if the other formats will really be needed. > If you have --one-per-line, this may be orthogonal to e.g. json > (as json can be crammed into one line IIUC), but when given the > selection you cannot combine multiple styles. > > Scratch that, we actually want to combine these styles with each > other. Yeah, that's another possibility for the future. People might want a --json option that can be used both with and without --oneline. But as the future is difficult to predict, let's try to make it easy for us in both cases. And I think starting with just "--oneline" would be easier to deal with later than "--normalize" (or "--style" or "--pretty" or "--format") especially in the latter case.
Re: [PATCH 0/4] dropping support for older curl
On 09/08/17 23:47, Jeff King wrote: On Wed, Aug 09, 2017 at 11:42:12PM +0200, Johannes Schindelin wrote: I mean, if we even go out of our way to support the completely outdated and obsolete .git/branches/ for what is likely a single user, it may not be the worst to keep those couple of #ifdef guards to keep at least nominal support for older cURLs? You've totally ignored the argument I made back then[1], and which I reiterated in this thread. So I'll say it one more time: the more compelling reason is not the #ifdefs, but the fact that the older versions are totally untested. Perhaps you forgot but I stated in the original thread that I build RPMS for RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every single time. I currently have 2.13.3 up for el4, el5, el6 and el7. Only el4 requires any patches, the rest will build out of the box with the vendor supplied version of curl. The plan was to drop the el4 builds for 2.14.0 to get rid of the patches. In fact, they do not even compile, and yet I have not seen any patches to fix that. I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at all neither with building nor with running the testsuite. So IMHO this is about being honest with users about which versions we _actually_ support. I have no problem with you wanting to drop support for older curl releases (such as 7.15.5) but don't use the argument that it doesn't currently build and nobody cares. Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle Support program until 2020-11-30. -tgc
Re: [PATCH] strbuf: clear errno before calling getdelim(3)
On Thu, Aug 10, 2017 at 10:56:40PM +0200, René Scharfe wrote: > getdelim(3) returns -1 at the end of the file and if it encounters an > error, but sets errno only in the latter case. Set errno to zero before > calling it to avoid misdiagnosing an out-of-memory condition due to a > left-over value from some other function call. Looks good to me. > Do we need to save and restore the original value of errno? I doubt it, > but didn't think deeply about it, yet. I'd say no. Anybody depending on strbuf_getwholeline() is clearly already wrong in the error case. And in general I think we assume that syscalls can clear errno on success if they choose to (this isn't a syscall, but obviously it is calling some). -Peff
[PATCH] strbuf: clear errno before calling getdelim(3)
getdelim(3) returns -1 at the end of the file and if it encounters an error, but sets errno only in the latter case. Set errno to zero before calling it to avoid misdiagnosing an out-of-memory condition due to a left-over value from some other function call. Reported-by: Yaroslav HalchenkoSuggested-by: Junio C Hamano Signed-off-by: Rene Scharfe --- Do we need to save and restore the original value of errno? I doubt it, but didn't think deeply about it, yet. strbuf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/strbuf.c b/strbuf.c index 89d22e3b09..323c49ceb3 100644 --- a/strbuf.c +++ b/strbuf.c @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) /* Translate slopbuf to NULL, as we cannot call realloc on it */ if (!sb->alloc) sb->buf = NULL; + errno = 0; r = getdelim(>buf, >alloc, term, fp); if (r > 0) { -- 2.14.0
Re: [PATCH 4/4] Fix delta offset overflow
Martin Koeglerwrites: > From: Martin Koegler > > Prevent generating delta offsets beyond 4G. > > Signed-off-by: Martin Koegler > --- > diff-delta.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/diff-delta.c b/diff-delta.c > index 3d5e1ef..633883e 100644 > --- a/diff-delta.c > +++ b/diff-delta.c > @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index, > moff += msize; > msize = left; > > + if (moff > 0x) > + msize = 0; > + The lower 4-byte of moff (before incrementing it with msize) were already encoded to the output stream before this hunk. Shouldn't we be checking if moff would fit in uint32_t _before_ that happens? IOW, in a much earlier part of that "while (data < top)" loop, there is a code that tries to find a match that gives us a large msize by iterating over index->hash[], and it sets msize and moff as a potential location that we may want to use. If moff is already big there, then we shouldn't consider it a match because we cannot encode its location using 4-byte anyway, no? Cutting it off at here by resetting msize to 0 might help the next iteration (I didn't check, but is the effect of it is to corrupt the "val" rolling checksum and make it unlikely that the hash computation would not find a correct match?) but it somehow feels like closing the barn door after the horse has already bolted... > if (msize < 4096) { > int j; > val = 0;
Re: [PATCH V2 1/2] Fix delta integer overflows
On Thu, Aug 10, 2017 at 01:07:07PM -0700, Junio C Hamano wrote: > Perhaps we should teach the receiving end to notice that the varint > data it reads encodes a size that is too large for it to grok and > die. With that, we can safely move forward with whatever size_t > each platform uses. Yes, this is very important even for "unsigned long". I'd worry that malicious input could cause us to wrap to 0, and we'd potentially write into a too-small buffer[1]. There's some prior art with checking this against bitsizeof() in unpack_object_header_buffer() but get_delta_hdr_size() does not seem to have a check. -Peff [1] In most cases it's _probably_ not a vulnerability to wrap here, because we'd just read less data than we ought to. But it makes me nervous nonetheless.
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On Thu, Aug 10, 2017 at 12:57 PM, Kevin Willfordwrote: > Here are some of the performance numbers from the windows repo. > I will work on writing a perf test for this change so that we > have a better idea on smaller repo what the impact of this change > is on them. > > | w/o | with fix | > --- > git checkout | 36.08 s | 33.34 s | > --- > git checkout | 32.54 s | 28.26 s | > --- > git checkout | 44.10 s | 38.13 s | > --- > git merge| 32.90 s | 30.56 s | > --- > git rebase | 46.14 s | 42.18 s | > ~10-15% is impressive for this patch, I certainly did not expect as much. Thanks for providing the numbers! Stefan
Re: [PATCH 0/4] dropping support for older curl
[I am resending this since the original does not seem to have made it to the list, at least I cannot find it in any archives] On 09/08/17 23:47, Jeff King wrote: On Wed, Aug 09, 2017 at 11:42:12PM +0200, Johannes Schindelin wrote: I mean, if we even go out of our way to support the completely outdated and obsolete .git/branches/ for what is likely a single user, it may not be the worst to keep those couple of #ifdef guards to keep at least nominal support for older cURLs? You've totally ignored the argument I made back then[1], and which I reiterated in this thread. So I'll say it one more time: the more compelling reason is not the #ifdefs, but the fact that the older versions are totally untested. Perhaps you forgot but I stated in the original thread that I build RPMS for RHEL/CentOS 3, 4, 5, 6 and 7. I still do and I run the testsuite every single time. I currently have 2.13.3 up for el4, el5, el6 and el7. Only el4 requires any patches, the rest will build out of the box with the vendor supplied version of curl. The plan was to drop the el4 builds for 2.14.0 to get rid of the patches. In fact, they do not even compile, and yet I have not seen any patches to fix that. I just built a pristine 2.14.0 on CentOS 5 with curl 7.15.5. No problems at all neither with building nor with running the testsuite. So IMHO this is about being honest with users about which versions we _actually_ support. I have no problem with you wanting to drop support for older curl releases (such as 7.15.5) but don't use the argument that it doesn't currently build and nobody cares. Also FWIW Red Hat continues to support RHEL 5 with the Extended Life-cycle Support program until 2020-11-30. -tgc
Re: fatal: Out of memory, getdelim failed under NFS mounts
On Thu, 10 Aug 2017, Jeff King wrote: > On Thu, Aug 10, 2017 at 09:58:37PM +0200, René Scharfe wrote: > > > So the function is returning -1 and leaving ENOMEM in errno on > > > Yaroslav's system. > > > I wonder if we are truly hitting out of memory, though. The same > > > symptom could bee seen if getdelim() does not touch errno when it > > > returns -1, but some other system call earlier set it to ENOMEM, > > > for example. > > That can happen when the end of a file is reached; getdelim() returns > > -1 and leaves errno unchanged. So we need to set errno to 0 just > > before that call. > Good catch. That's a bug in my original conversoin of > strbuf_getwholeline(). I think this did it! at least on this simple test... yet to test a bit more in those scenarios we ran into it before while testing git-annex. commit 36ef5e3ad2c187d3be664c33dbc8c06e59bceaf4 (HEAD -> bf-seterrno0) Author: Yaroslav O. HalchenkoDate: Thu Aug 10 20:26:47 2017 + BF: set errno to 0 before getdelim call to get unbiased assesment later diff --git a/strbuf.c b/strbuf.c index 89d22e3b0..323c49ceb 100644 --- a/strbuf.c +++ b/strbuf.c @@ -476,6 +476,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) /* Translate slopbuf to NULL, as we cannot call realloc on it */ if (!sb->alloc) sb->buf = NULL; + errno = 0; r = getdelim(>buf, >alloc, term, fp); if (r > 0) { -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 9:42 PM, Jeff Kingwrote: > On Thu, Aug 10, 2017 at 09:39:21PM +0200, Christian Couder wrote: > >> > If you prefer the normalized form (and the input was line-broken in a >> > way that you don't like), then this would convert to your preferred >> > form. I agree that you could potentially want the opposite (folding long >> > lines). Perhaps something like --wrap=72. >> >> Related to this, I wonder if people might want to "normalize" in >> different ways later. If that happens, we might regret having called >> this option "--normalize" instead of "--one-per-line" for example. > > My assumption was that it would be OK to add other normalization later > if it brings us closer to the "key: value" form as a standard, and it > could fall under "--normalize", since that's what callers would want. > And that's why I didn't want to call it something like --one-per-line. > > But if you are arguing that there can be many "standards" to normalize > to, I agree that's a possibility. I think we have an out by extending to > "--normalize=whatever-form" in the future. If we take `git log` as an example, we now have "--oneline" which is a shorthand for "--pretty=oneline --abbrev-commit". And the default for "--pretty" is called "medium". So instead of your suggestion, we could call this option "--oneline" now, and if other normalizations are later required we could then create "--pretty=whatever" and say that "--oneline" is a shorthand for "--pretty=oneline".
Product Inquiry
Dear Sir/Madam We are interested in your product, can we have your MOQ and FOB prices for consideration and purchase waiting for your response. Best Regards, Ms Tina Wang (Purchasing Manager) Email:purchasing.sale...@gmail.com
[ANNOUNCE] Git for Windows 2.14.1
Dear Git users, It is my pleasure to announce that Git for Windows 2.14.1 is available from: https://git-for-windows.github.io/ Changes since Git for Windows v2.14.0(2) (August 7th 2017) Note: there have been MinGit-only releases v2.12.2(3) and v2.13.1(3) with backports of the important bug fix in v2.14.1 as well as the experimental --show-ignored-directory option of git status. New Features * Comes with Git v2.14.1. * Comes with cURL v7.55.0. * The Git Bash Here context menu item is now also available in the special Libraries folders. Filename | SHA-256 | --- Git-2.14.1-64-bit.exe | 0dc556503e3ce4699228fc910a8e4a8d81172635ac8e8e16a11be107254c4901 Git-2.14.1-32-bit.exe | 0129e21eaed8efa6d795f712656463ee4f90aa2b3b66168f29b0da98f74104f7 PortableGit-2.14.1-64-bit.7z.exe | 3c3270a9df5f3db1f7637d86b94fb54a96e9145ba43c98a3e993cdffb1a1842e PortableGit-2.14.1-32-bit.7z.exe | df3f9b6c2dd2b12e5cb7035b9ca48d13b973d054a35b0939953aa6e7a00a0659 MinGit-2.14.1-64-bit.zip | 65c12e4959b8874187b68ec37e532fe7fc526e10f6f0f29e699fa1d2449e7d92 MinGit-2.14.1-32-bit.zip | 77b468e0ead1e7da4cb3a1cf35dabab5210bf10457b4142f5e9430318217cdef MinGit-2.14.1-busybox-64-bit.zip | 7e72a78e0711d27d98f851ec81a6fe27b4159066d548c2013dd7ce57a1b8cd03 MinGit-2.14.1-busybox-32-bit.zip | 2f3a3ae26391e5e3487501b3b16ee1c6385259ebfdaafcbee9947d7513dc0a0f Git-2.14.1-64-bit.tar.bz2 | 544615e2ef5e2040a67878ce7aac42cb103f948d52989239b3715dd6023b1007 Git-2.14.1-32-bit.tar.bz2 | 0aede42a7ec7a6351a3f273ab519679f95e9341cb63899c54be18a57819da6aa Ciao, Johannes
Re: [PATCH V2 1/2] Fix delta integer overflows
Martin Koeglerwrites: > From: Martin Koegler Just a nitpick on the patch title. As "git shortlog --no-merges" output would tell you, we try to prefix the title with a short name of the area of codebase we are touching, followed by a colon and a space and then remainder without extra capitalization. Perhaps Subject: delta: fix enconding size larger than an "uint" can hold > The current delta code produces incorrect pack objects for files > 4GB. > > Signed-off-by: Martin Koegler I am a bit torn on this change. The original is indeed bad in that the code does not guarantee that an intermediate variable like 'l' is not large enough to hold the true size we know in index->src_size, and in that sense this change is an improvement. Given that this is not merely a local storage format but it also is an interchange format, we would probably want to make sure that the receiving end (e.g. get_delta_hdr_size() that is used at the beginning of patch_delta()) on a platform whose size_t is smaller than that of a platform that produced the delta stream with this code behaves "sensibly". If we replaced ulong we use in create/patch delta codepaths with uint32_t, that would be safer, just because the encoder would not be able to emit varint that is larger than the receivers to handle. But that defeats the whole point of using varint() to encode the sizes in the first place. It was partly done for space saving, but more for allowing larger sizes and larger ulong in the future without having to change the file format. Perhaps we should teach the receiving end to notice that the varint data it reads encodes a size that is too large for it to grok and die. With that, we can safely move forward with whatever size_t each platform uses. Thanks. > --- > For next. > > diff-delta.c | 24 +--- > 1 file changed, 13 insertions(+), 11 deletions(-) > > diff --git a/diff-delta.c b/diff-delta.c > index 3797ce6..cd238c8 100644 > --- a/diff-delta.c > +++ b/diff-delta.c > @@ -319,7 +319,9 @@ create_delta(const struct delta_index *index, >const void *trg_buf, unsigned long trg_size, >unsigned long *delta_size, unsigned long max_size) > { > - unsigned int i, outpos, outsize, moff, msize, val; > + unsigned int i, val; > + off_t outpos, moff; > + size_t l, outsize, msize; > int inscnt; > const unsigned char *ref_data, *ref_top, *data, *top; > unsigned char *out; > @@ -336,20 +338,20 @@ create_delta(const struct delta_index *index, > return NULL; > > /* store reference buffer size */ > - i = index->src_size; > - while (i >= 0x80) { > - out[outpos++] = i | 0x80; > - i >>= 7; > + l = index->src_size; > + while (l >= 0x80) { > + out[outpos++] = l | 0x80; > + l >>= 7; > } > - out[outpos++] = i; > + out[outpos++] = l; > > /* store target buffer size */ > - i = trg_size; > - while (i >= 0x80) { > - out[outpos++] = i | 0x80; > - i >>= 7; > + l = trg_size; > + while (l >= 0x80) { > + out[outpos++] = l | 0x80; > + l >>= 7; > } > - out[outpos++] = i; > + out[outpos++] = l; > > ref_data = index->src_buf; > ref_top = ref_data + index->src_size;
Re: fatal: Out of memory, getdelim failed under NFS mounts
On Thu, Aug 10, 2017 at 09:58:37PM +0200, René Scharfe wrote: > > So the function is returning -1 and leaving ENOMEM in errno on > > Yaroslav's system. > > > > I wonder if we are truly hitting out of memory, though. The same > > symptom could bee seen if getdelim() does not touch errno when it > > returns -1, but some other system call earlier set it to ENOMEM, > > for example. > > That can happen when the end of a file is reached; getdelim() returns > -1 and leaves errno unchanged. So we need to set errno to 0 just > before that call. Good catch. That's a bug in my original conversoin of strbuf_getwholeline(). -Peff
Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On Thu, Aug 10, 2017 at 11:49 AM, Jameson Millerwrote: Welcome to the Git mailing list. :) > Teach Git to optionally show ignored directories when showing all > untracked files. The git status command exposes the options to report > ignored and/or untracked files. However, when reporting all untracked > files (--untracked-files=all), all individual ignored files are reported > as well. It is not currently possible to get the reporting behavior of > the --ignored flag, while also reporting all untracked files. Trying to understand this based off the documentation for --untracked=all and --ignored, I realize that the documentation for --ignored seems to be lacking as I do not understand what the --ignored behavior is in combination with --untracked=[all, normal, no] > This > change exposes a flag to report all untracked files while not showing > individual files in ignored directories. By the description up to here, it sounds as if you want to introduce mode for --untracked, e.g. "normal-adjusted-for-ignored" (it's a bad suggestion)? However the patch seems to add an orthogonal flag, such that status --no-ignored --untracked=no --show-ignored-directory would also be possible. What is a reasonable expectation for the output of such? > Motivation: > Our application (Visual Studio) needs all untracked files listed > individually, but does not need all ignored files listed individually. For parsing output, I would strongly recommend --porcelain[=2], but that is a tangent. > Reporting all ignored files can affect the time it takes for status > to run. For a representative repository, here are some measurements > showing a large perf improvement for this scenario: > > | Command | Reported ignored entries | Time (s) | > | --- | | | > | 1 | 0| 1.3 | > | 2 | 1024 | 4.2 | > | 3 | 174904 | 7.5 | > | 4 | 1046 | 1.6 | > > Commands: > 1) status > 2) status --ignored > 3) status --ignored --untracked-files=all > 4) status --ignored --untracked-files=all --show-ignored-directory (2) is --untracked-files=normal I'd presume, such that this flag can be understood as a tweak to "normal" based on the similar size between 2 and 4? (The timing improvement from 2 to 4 is huge though). > This changes exposes a --show-ignored-directory flag to the git status s/changes/change/ > command. This flag is utilized when running git status with the > --ignored and --untracked-files options to not list ignored individual > ignored files contained in directories that match an ignore pattern. > > Part of the perf improvement comes from the tweak to > read_directory_recursive to stop scanning the file system after it > encounters the first file. When a directory is ignored, all it needs to > determine is if the directory is empty or not. The logic currently keeps > scanning the file system until it finds an untracked file. However, as > the directory is ignored, all the contained contents are also marked > excluded. For ignored directories that contain a large number of files, > this can take some time. I think it is possible to ignore a directory and still track files in it, what are the implications of this flag on a tracked (and changed) file in an ignored dir? What happens to empty directories that match an ignore pattern? > @@ -1362,6 +1363,8 @@ int cmd_status(int argc, const char **argv, const char > *prefix) > N_("ignore changes to submodules, optional when: all, > dirty, untracked. (Default: all)"), > PARSE_OPT_OPTARG, NULL, (intptr_t)"all" }, > OPT_COLUMN(0, "column", , N_("list untracked files > in columns")), > + OPT_BOOL(0, "show-ignored-directory", _ignored_directory, Is it possible to directly read into s.show_ignored_directory here? > +test_expect_success 'Verify behavior of status on folders with ignored > files' ' > + test_when_finished "git clean -fdx" && > + git status --porcelain=v2 --ignored --untracked-files=all > --show-ignored-directory >output && > + test_i18ncmp expect output > +' > + > +# Test status bahavior on folder with tracked and ignored files behavior > +cat >expect <<\EOF > +? expect > +? output > +! dir/tracked_ignored/ignored_1.ign > +! dir/tracked_ignored/ignored_2.ign > +! tracked_ignored/ignored_1.ign > +! tracked_ignored/ignored_2.ign > +EOF I think our latest 'best style' is to include these heredocs into the test.
Re: fatal: Out of memory, getdelim failed under NFS mounts
Am 10.08.2017 um 20:56 schrieb Junio C Hamano: > René Scharfewrites: > >> I doubt the type of file system matters. The questions are: How much >> main memory do you have, what is git trying to cram into it, is there >> a way to reduce the memory footprint or do you need to add more RAM? >> >>> any recommendations on how to pin point the "offender"? ;) >> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a >> good start, I think, to find out which of the different activities >> that pull is doing causes the out-of-memory error. >> >> "free" and "ulimit -a" can help you find out how much memory you can >> use. >> >> Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report? >> getdelim() is used mostly to read lines from files like these and in >> the admittedly unlikely case that they are *really* long such an >> error would be expected. > > There is only one getdelim() call, which was introduced in v2.5.0 > timeframe, and it is used like this: > > r = getdelim(>buf, >alloc, term, fp); > > if (r > 0) { > sb->len = r; > return 0; > } > assert(r == -1); > > /* >* Normally we would have called xrealloc, which will try to free >* memory and recover. But we have no way to tell getdelim() to do so. >* Worse, we cannot try to recover ENOMEM ourselves, because we have >* no idea how many bytes were read by getdelim. >* >* Dying here is reasonable. It mirrors what xrealloc would do on >* catastrophic memory failure. We skip the opportunity to free pack >* memory and retry, but that's unlikely to help for a malloc small >* enough to hold a single line of input, anyway. >*/ > if (errno == ENOMEM) > die("Out of memory, getdelim failed"); > > So the function is returning -1 and leaving ENOMEM in errno on > Yaroslav's system. > > I wonder if we are truly hitting out of memory, though. The same > symptom could bee seen if getdelim() does not touch errno when it > returns -1, but some other system call earlier set it to ENOMEM, > for example. That can happen when the end of a file is reached; getdelim() returns -1 and leaves errno unchanged. So we need to set errno to 0 just before that call. Still *some* function must have run into a memory shortage in that scenario, so adding/assigning more should help, right? > If the same version of Git is recompiled there without HAVE_GETDELIM > defined, would it still die with out of memory (presumably inside > the call to strbuf_grow() in the strbuf_getwholeline() function)? It's worth a try, if possible. René
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On 8/10/2017 3:03 PM, Jeff King wrote: On Thu, Aug 10, 2017 at 11:58:34AM -0700, Stefan Beller wrote: On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willfordwrote: String formatting can be a performance issue when there are hundreds of thousands of trees. When changing this for the sake of performance, could you give an example (which kind of repository you need for this to become a bottleneck? I presume the large Windows repo? Or can I reproduce it with a small repo such as linux.git or even git.git?) and some numbers how this improves the performance? I was about to say the same thing. Normally I don't mind a small optimization without numbers if the result is obviously an improvement. But in this case the result is a lot less readable, and it's not entirely clear to me that it would always be an improvement (we now always run 3 strbuf calls instead of one, and have to check the length for each one). What I'm wondering specifically is if vsnprintf() on Kevin's platform (which I'll assume is Windows) is slow, and we would do better to replace it with a faster compat/ routine. -Peff The strbuf_add call is essentially only having to do a memcpy whereas the strbuf_addf will have to parse the string, determine the types, convert the data, and then get it in the buffer. That could be made faster with a better compat/ routine but I fear still far from the length check and memcpy. void strbuf_add(struct strbuf *sb, const void *data, size_t len) { strbuf_grow(sb, len); memcpy(sb->buf + sb->len, data, len); strbuf_setlen(sb, sb->len + len); } Here are some of the performance numbers from the windows repo. I will work on writing a perf test for this change so that we have a better idea on smaller repo what the impact of this change is on them. | w/o | with fix | --- git checkout | 36.08 s | 33.34 s | --- git checkout | 32.54 s | 28.26 s | --- git checkout | 44.10 s | 38.13 s | --- git merge| 32.90 s | 30.56 s | --- git rebase | 46.14 s | 42.18 s |
Re: fatal: Out of memory, getdelim failed under NFS mounts
Thank you Junio On Thu, 10 Aug 2017, Junio C Hamano wrote: > There is only one getdelim() call, which was introduced in v2.5.0 > timeframe, and it is used like this: > r = getdelim(>buf, >alloc, term, fp); > if (r > 0) { > sb->len = r; > return 0; > } > assert(r == -1); > /* >* Normally we would have called xrealloc, which will try to free >* memory and recover. But we have no way to tell getdelim() to do so. >* Worse, we cannot try to recover ENOMEM ourselves, because we have >* no idea how many bytes were read by getdelim. >* >* Dying here is reasonable. It mirrors what xrealloc would do on >* catastrophic memory failure. We skip the opportunity to free pack >* memory and retry, but that's unlikely to help for a malloc small >* enough to hold a single line of input, anyway. >*/ > if (errno == ENOMEM) > die("Out of memory, getdelim failed"); > So the function is returning -1 and leaving ENOMEM in errno on > Yaroslav's system. > I wonder if we are truly hitting out of memory, though. The same > symptom could bee seen if getdelim() does not touch errno when it > returns -1, but some other system call earlier set it to ENOMEM, > for example. > If the same version of Git is recompiled there without HAVE_GETDELIM > defined, would it still die with out of memory (presumably inside > the call to strbuf_grow() in the strbuf_getwholeline() function)? will check now... for my own education (rotten by Python) -- how do you know which syscall set errno to be analyzed at this specific point? may be it was already set to ENOMEM before entry to this function? -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 12:39 PM, Christian Couderwrote: > On Thu, Aug 10, 2017 at 8:37 PM, Jeff King wrote: >> On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: >> >>> On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: >>> > The point of "--only-trailers" is to give a caller an output >>> > that's easy for them to parse. Getting rid of the >>> > non-trailer material helps, but we still may see more >>> > complicated syntax like whitespace continuation. Let's add >>> > an option to normalize the output into one "key: value" line >>> > per trailer. >>> > >>> > As a bonus, this could be used even without --only-trailers >>> > to clean up unusual formatting in the incoming data. >>> >>> This is useful for the parsing part, but for the writing part we'd >>> rather want to have the opposite thing, such as >>> '--line-break=rfc822'. But this doesn't have to be part of this >>> series. With this in mind, I do not quite understand the latter >>> use case how you would use normalized trailers without >>> --only-trailers? >> >> If you prefer the normalized form (and the input was line-broken in a >> way that you don't like), then this would convert to your preferred >> form. I agree that you could potentially want the opposite (folding long >> lines). Perhaps something like --wrap=72. > > Related to this, I wonder if people might want to "normalize" in > different ways later. If that happens, we might regret having called > this option "--normalize" instead of "--one-per-line" for example. What is normal? Maybe --style=[one-line, wrapped:72, rfc, json, xml, ...] then? If you have --one-per-line, this may be orthogonal to e.g. json (as json can be crammed into one line IIUC), but when given the selection you cannot combine multiple styles. Scratch that, we actually want to combine these styles with each other.
Re: fatal: Out of memory, getdelim failed under NFS mounts
Am 10.08.2017 um 16:43 schrieb Yaroslav Halchenko: > On Thu, 10 Aug 2017, René Scharfe wrote: >> Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko: >>> More context (may be different issue(s)) could be found at >>> http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/ >>> but currently I am consistently reproducing it while running >>> git (1:2.11.0-3 debian stretch build) within debian stretch singularity >>> environment [1]. > >>> External system is Centos 6.9, and git 1.7.1 (and installed in modules >>> 2.0.4) do not show similar buggy behavior. > >>> NFS mounted partitions are bind mounted inside the sinularity space and >>> when I try to do some git operations, I get that error inconsistently , e.g. > >>> yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin >>> master >>> fatal: Out of memory, getdelim failed >>> error: git://github.com/datalad/datalad did not send all necessary >>> objects > >>> yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin >>> master >>> fatal: Out of memory, getdelim failed >>> error: git://github.com/datalad/datalad did not send all necessary >>> objects > >>> yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin >>> master >>> From git://github.com/datalad/datalad >>> * branch master -> FETCH_HEAD >>> fatal: Out of memory, getdelim failed > >>> and some times it succeeds. So it smells that some race condition >>> somewhere...? > >> I doubt the type of file system matters. > > So far it has been a very consistent indicator. I did not manage to get > this error while performing the same operation under /tmp (bind to local > mounted drive), where it also feels going faster (again suggesting that > original issue is some kind of a race) Well, there have been bugs in getdelim() before, e.g.: https://bugzilla.redhat.com/show_bug.cgi?id=601071 https://bugzilla.redhat.com/show_bug.cgi?id=1332917 git v2.5.0 was the first version to use it. So if all else fails it may be worth compiling git without HAVE_GETDELIM. >> The questions are: How much >> main memory do you have, what is git trying to cram into it, is there >> a way to reduce the memory footprint or do you need to add more RAM? >> ... reordered ... >> "free" and "ulimit -a" can help you find out how much memory you can >> use. > > I think those aren't the reason: > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ free -h >totalusedfree shared buff/cache > available > Mem: 126G2.5G 90G652K 33G > 123G > Swap: 127G1.7M127G Is all of that available to the git in the Singularity container or is that the memory size of the host and there's some kind of limit for the guests? > yhalchen@discovery:/mnt/scratch/yoh/datalad$ ulimit > unlimited That's just the maximum file size; memory-related limits are more interesting for this case. "ulimit -a" will show all limits. >>> any recommendations on how to pin point the "offender"? ;) >> Running "GIT_TRACE=1 git pull --ff-only origin master" would be a >> good start, I think, to find out which of the different activities >> that pull is doing causes the out-of-memory error. > > samples of bad, and then good runs (from eyeballing -- the same until > error message): > > yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_bad.log > 14:05:25.782270 git.c:371 trace: built-in: git 'pull' > '--ff-only' 'origin' 'master' > 14:05:25.795036 run-command.c:350 trace: run_command: 'fetch' > '--update-head-ok' 'origin' 'master' > 14:05:25.795332 exec_cmd.c:116 trace: exec: 'git' 'fetch' > '--update-head-ok' 'origin' 'master' > 14:05:25.797212 git.c:371 trace: built-in: git 'fetch' > '--update-head-ok' 'origin' 'master' > 14:05:25.904088 run-command.c:350 trace: run_command: 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > 14:05:26.085954 run-command.c:350 trace: run_command: 'index-pack' > '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on > discovery.hpcc.dartmouth.edu' '--pack_header=2,103' > 14:05:26.086333 exec_cmd.c:116 trace: exec: 'git' 'index-pack' > '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on > discovery.hpcc.dartmouth.edu' '--pack_header=2,103' > 14:05:26.088382 git.c:371 trace: built-in: git 'index-pack' > '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on > discovery.hpcc.dartmouth.edu' '--pack_header=2,103' > 14:05:26.133326 run-command.c:350 trace: run_command: 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > 14:05:26.133688 exec_cmd.c:116 trace: exec: 'git' 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > 14:05:26.135493 git.c:371 trace: built-in: git 'rev-list' > '--objects' '--stdin' '--not' '--all' '--quiet' > fatal: Out of memory, getdelim failed
Re: [PATCH 0/5] make interpret-trailers useful for parsing
Stefan Bellerwrites: > On Thu, Aug 10, 2017 at 11:03 AM, Jeff King wrote: >> On Thu, Aug 10, 2017 at 04:02:46AM -0400, Jeff King wrote: >> >>> On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: >>> >>> > This series teaches interpret-trailers to parse and output just the >>> > trailers. So now you can do: >>> > >>> > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | >>> > git interpret-trailers --parse >>> > Signed-off-by: Hartmut Henkel >>> > Helped-by: Stefan Beller >>> > Signed-off-by: Ralf Thielow >>> > Acked-by: Matthias Rüster >>> >>> And here's a v2 that addresses all of the comments except one: Stefan >>> suggested that --only-existing wasn't a great name. I agree, but I like >>> everything else less. >> >> Here's a v3 that takes care of that (renaming it to --only-input). >> >> It's otherwise the same as v2, but since the name-change ripples through >> the remaining patches, I wanted to get v3 in front of people sooner >> rather than later. >> > > Looks good, Yeah, looks good to me too. Thanks for the "--only-input" discussion.
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 09:39:21PM +0200, Christian Couder wrote: > > If you prefer the normalized form (and the input was line-broken in a > > way that you don't like), then this would convert to your preferred > > form. I agree that you could potentially want the opposite (folding long > > lines). Perhaps something like --wrap=72. > > Related to this, I wonder if people might want to "normalize" in > different ways later. If that happens, we might regret having called > this option "--normalize" instead of "--one-per-line" for example. My assumption was that it would be OK to add other normalization later if it brings us closer to the "key: value" form as a standard, and it could fall under "--normalize", since that's what callers would want. And that's why I didn't want to call it something like --one-per-line. But if you are arguing that there can be many "standards" to normalize to, I agree that's a possibility. I think we have an out by extending to "--normalize=whatever-form" in the future. -Peff
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 8:37 PM, Jeff Kingwrote: > On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: > >> On Thu, Aug 10, 2017 at 1:03 AM, Jeff King wrote: >> > The point of "--only-trailers" is to give a caller an output >> > that's easy for them to parse. Getting rid of the >> > non-trailer material helps, but we still may see more >> > complicated syntax like whitespace continuation. Let's add >> > an option to normalize the output into one "key: value" line >> > per trailer. >> > >> > As a bonus, this could be used even without --only-trailers >> > to clean up unusual formatting in the incoming data. >> >> This is useful for the parsing part, but for the writing part we'd >> rather want to have the opposite thing, such as >> '--line-break=rfc822'. But this doesn't have to be part of this >> series. With this in mind, I do not quite understand the latter >> use case how you would use normalized trailers without >> --only-trailers? > > If you prefer the normalized form (and the input was line-broken in a > way that you don't like), then this would convert to your preferred > form. I agree that you could potentially want the opposite (folding long > lines). Perhaps something like --wrap=72. Related to this, I wonder if people might want to "normalize" in different ways later. If that happens, we might regret having called this option "--normalize" instead of "--one-per-line" for example.
Re: [PATCH] t1200: remove t1200-tutorial.sh
Johannes Schindelinwrites: > Hi, > > On Wed, 9 Aug 2017, Stefan Beller wrote: > >> v1.2.0~121 (New tutorial, 2006-01-22) rewrote the tutorial such that the >> original intent of 2ae6c70674 (Adapt tutorial to cygwin and add test case, >> 2005-10-13) to test the examples from the tutorial doesn't hold any more. >> >> There are dedicated tests for the commands used, even "git whatchanged", >> such that removing these tests doesn't seem like a reduction in test >> coverage. >> >> Signed-off-by: Stefan Beller >> --- > > ACK, > Dscho Thanks, both. Will queue.
[PATCH 5/9] Convert sha1_file.c to size_t
From: Martin KoeglerSigned-off-by: Martin Koegler --- cache.h | 16 +++ sha1_file.c | 68 ++--- streaming.c | 2 +- 3 files changed, 43 insertions(+), 43 deletions(-) diff --git a/cache.h b/cache.h index 9185763..9322303 100644 --- a/cache.h +++ b/cache.h @@ -1189,15 +1189,15 @@ static inline const unsigned char *lookup_replace_object(const unsigned char *sh /* Read and unpack a sha1 file into memory, write memory to a sha1 file */ extern int sha1_object_info(const unsigned char *, size_t *); -extern int hash_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *sha1); -extern int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *return_sha1); -extern int hash_sha1_file_literally(const void *buf, unsigned long len, const char *type, unsigned char *sha1, unsigned flags); -extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *); +extern int hash_sha1_file(const void *buf, size_t len, const char *type, unsigned char *sha1); +extern int write_sha1_file(const void *buf, size_t len, const char *type, unsigned char *return_sha1); +extern int hash_sha1_file_literally(const void *buf, size_t len, const char *type, unsigned char *sha1, unsigned flags); +extern int pretend_sha1_file(void *, size_t, enum object_type, unsigned char *); extern int force_object_loose(const unsigned char *sha1, time_t mtime); extern int git_open_cloexec(const char *name, int flags); #define git_open(name) git_open_cloexec(name, O_RDONLY) -extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size); -extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz); +extern void *map_sha1_file(const unsigned char *sha1, size_t *size); +extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, size_t mapsize, void *buffer, size_t bufsiz); extern int parse_sha1_header(const char *hdr, size_t *sizep); /* global flag to enable extra checks when accessing packed objects */ @@ -1723,8 +1723,8 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *) extern int is_pack_valid(struct packed_git *); extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, size_t *); -extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, size_t *sizep); -extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); +extern size_t unpack_object_header_buffer(const unsigned char *buf, size_t len, enum object_type *type, size_t *sizep); +extern size_t get_size_from_delta(struct packed_git *, struct pack_window **, off_t); extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, size_t *); /* diff --git a/sha1_file.c b/sha1_file.c index 3428172..1b3efea 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -51,7 +51,7 @@ static struct cached_object { unsigned char sha1[20]; enum object_type type; void *buf; - unsigned long size; + size_t size; } *cached_objects; static int cached_object_nr, cached_object_alloc; @@ -818,8 +818,8 @@ static int check_packed_git_idx(const char *path, struct packed_git *p) * variable sized table containing 8-byte entries * for offsets larger than 2^31. */ - unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; - unsigned long max_size = min_size; + size_t min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; + size_t max_size = min_size; if (nr) max_size += (nr - 1)*8; if (idx_size < min_size || idx_size > max_size) { @@ -1763,7 +1763,7 @@ static int open_sha1_file(const unsigned char *sha1, const char **path) */ static void *map_sha1_file_1(const char *path, const unsigned char *sha1, -unsigned long *size) +size_t *size) { void *map; int fd; @@ -1790,13 +1790,13 @@ static void *map_sha1_file_1(const char *path, return map; } -void *map_sha1_file(const unsigned char *sha1, unsigned long *size) +void *map_sha1_file(const unsigned char *sha1, size_t *size) { return map_sha1_file_1(NULL, sha1, size); } -unsigned long unpack_object_header_buffer(const unsigned char *buf, - unsigned long len, enum object_type *type, size_t *sizep) +size_t unpack_object_header_buffer(const unsigned char *buf, + size_t len, enum object_type *type, size_t *sizep) { unsigned shift; size_t size, c; @@ -1821,8 +1821,8 @@ unsigned long unpack_object_header_buffer(const
[PATCH 6/9] Use size_t for sha1
From: Martin KoeglerSigned-off-by: Martin Koegler --- block-sha1/sha1.c | 2 +- block-sha1/sha1.h | 2 +- ppc/sha1.c| 2 +- ppc/sha1.h| 2 +- sha1dc_git.c | 2 +- sha1dc_git.h | 2 +- sha1dc_git_ext.h | 2 +- 7 files changed, 7 insertions(+), 7 deletions(-) diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c index 22b125c..8681031 100644 --- a/block-sha1/sha1.c +++ b/block-sha1/sha1.c @@ -203,7 +203,7 @@ void blk_SHA1_Init(blk_SHA_CTX *ctx) ctx->H[4] = 0xc3d2e1f0; } -void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len) +void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, size_t len) { unsigned int lenW = ctx->size & 63; diff --git a/block-sha1/sha1.h b/block-sha1/sha1.h index 4df6747..9fb0441 100644 --- a/block-sha1/sha1.h +++ b/block-sha1/sha1.h @@ -13,7 +13,7 @@ typedef struct { } blk_SHA_CTX; void blk_SHA1_Init(blk_SHA_CTX *ctx); -void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, unsigned long len); +void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, size_t len); void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx); #define platform_SHA_CTX blk_SHA_CTX diff --git a/ppc/sha1.c b/ppc/sha1.c index ec6a192..f0dfcfb 100644 --- a/ppc/sha1.c +++ b/ppc/sha1.c @@ -25,7 +25,7 @@ int ppc_SHA1_Init(ppc_SHA_CTX *c) return 0; } -int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, unsigned long n) +int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, size_t n) { unsigned long nb; const unsigned char *p = ptr; diff --git a/ppc/sha1.h b/ppc/sha1.h index 9b24b32..52cac23 100644 --- a/ppc/sha1.h +++ b/ppc/sha1.h @@ -16,7 +16,7 @@ typedef struct { } ppc_SHA_CTX; int ppc_SHA1_Init(ppc_SHA_CTX *c); -int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, unsigned long n); +int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, size_t n); int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c); #define platform_SHA_CTX ppc_SHA_CTX diff --git a/sha1dc_git.c b/sha1dc_git.c index 4d32b4f..a9076bc 100644 --- a/sha1dc_git.c +++ b/sha1dc_git.c @@ -11,7 +11,7 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx) sha1_to_hex(hash)); } -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len) +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len) { const char *data = vdata; /* We expect an unsigned long, but sha1dc only takes an int */ diff --git a/sha1dc_git.h b/sha1dc_git.h index a8a5c1d..f6051aa 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -11,7 +11,7 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); /* * Same as SHA1DCUpdate, but adjust types to match git's usual interface. */ -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len); #define platform_SHA_CTX SHA1_CTX #define platform_SHA1_Init SHA1DCInit diff --git a/sha1dc_git_ext.h b/sha1dc_git_ext.h index d0ea8ce..aede828 100644 --- a/sha1dc_git_ext.h +++ b/sha1dc_git_ext.h @@ -17,7 +17,7 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); /* * Same as SHA1DCUpdate, but adjust types to match git's usual interface. */ -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len); #define platform_SHA_CTX SHA1_CTX #define platform_SHA1_Init git_SHA1DCInit -- 2.1.4
[PATCH 8/9] Convert fsck.c & commit.c to size_t
From: Martin KoeglerSigned-off-by: Martin Koegler --- builtin/replace.c | 2 +- commit.c | 14 +++--- commit.h | 8 fsck.c| 14 +++--- fsck.h| 2 +- 5 files changed, 20 insertions(+), 20 deletions(-) diff --git a/builtin/replace.c b/builtin/replace.c index f4a85a1..dcd0d1e 100644 --- a/builtin/replace.c +++ b/builtin/replace.c @@ -391,7 +391,7 @@ static int create_graft(int argc, const char **argv, int force) struct commit *commit; struct strbuf buf = STRBUF_INIT; const char *buffer; - unsigned long size; + size_t size; if (get_oid(old_ref, ) < 0) die(_("Not a valid object name: '%s'"), old_ref); diff --git a/commit.c b/commit.c index 79decc2..5ebac6a 100644 --- a/commit.c +++ b/commit.c @@ -231,19 +231,19 @@ int unregister_shallow(const struct object_id *oid) struct commit_buffer { void *buffer; - unsigned long size; + size_t size; }; define_commit_slab(buffer_slab, struct commit_buffer); static struct buffer_slab buffer_slab = COMMIT_SLAB_INIT(1, buffer_slab); -void set_commit_buffer(struct commit *commit, void *buffer, unsigned long size) +void set_commit_buffer(struct commit *commit, void *buffer, size_t size) { struct commit_buffer *v = buffer_slab_at(_slab, commit); v->buffer = buffer; v->size = size; } -const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep) +const void *get_cached_commit_buffer(const struct commit *commit, size_t *sizep) { struct commit_buffer *v = buffer_slab_peek(_slab, commit); if (!v) { @@ -256,7 +256,7 @@ const void *get_cached_commit_buffer(const struct commit *commit, unsigned long return v->buffer; } -const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) +const void *get_commit_buffer(const struct commit *commit, size_t *sizep) { const void *ret = get_cached_commit_buffer(commit, sizep); if (!ret) { @@ -291,7 +291,7 @@ void free_commit_buffer(struct commit *commit) } } -const void *detach_commit_buffer(struct commit *commit, unsigned long *sizep) +const void *detach_commit_buffer(struct commit *commit, size_t *sizep) { struct commit_buffer *v = buffer_slab_peek(_slab, commit); void *ret; @@ -1128,7 +1128,7 @@ int parse_signed_commit(const struct commit *commit, struct strbuf *payload, struct strbuf *signature) { - unsigned long size; + size_t size; const char *buffer = get_commit_buffer(commit, ); int in_signature, saw_signature = -1; const char *line, *tail; @@ -1284,7 +1284,7 @@ struct commit_extra_header *read_commit_extra_headers(struct commit *commit, const char **exclude) { struct commit_extra_header *extra = NULL; - unsigned long size; + size_t size; const char *buffer = get_commit_buffer(commit, ); extra = read_commit_extra_header_lines(buffer, size, exclude); unuse_commit_buffer(commit, buffer); diff --git a/commit.h b/commit.h index 82e966e..fd44de3 100644 --- a/commit.h +++ b/commit.h @@ -70,20 +70,20 @@ void parse_commit_or_die(struct commit *item); * Associate an object buffer with the commit. The ownership of the * memory is handed over to the commit, and must be free()-able. */ -void set_commit_buffer(struct commit *, void *buffer, unsigned long size); +void set_commit_buffer(struct commit *, void *buffer, size_t size); /* * Get any cached object buffer associated with the commit. Returns NULL * if none. The resulting memory should not be freed. */ -const void *get_cached_commit_buffer(const struct commit *, unsigned long *size); +const void *get_cached_commit_buffer(const struct commit *, size_t *size); /* * Get the commit's object contents, either from cache or by reading the object * from disk. The resulting memory should not be modified, and must be given * to unuse_commit_buffer when the caller is done. */ -const void *get_commit_buffer(const struct commit *, unsigned long *size); +const void *get_commit_buffer(const struct commit *, size_t *size); /* * Tell the commit subsytem that we are done with a particular commit buffer. @@ -102,7 +102,7 @@ void free_commit_buffer(struct commit *); * Disassociate any cached object buffer from the commit, but do not free it. * The buffer (or NULL, if none) is returned. */ -const void *detach_commit_buffer(struct commit *, unsigned long *sizep); +const void *detach_commit_buffer(struct commit *, size_t *sizep); /* Find beginning and length of commit subject. */ int find_commit_subject(const char *commit_buffer, const char **subject); diff --git a/fsck.c b/fsck.c index feca3a8..9039373 100644 --- a/fsck.c +++ b/fsck.c @@ -632,18 +632,18 @@
[PATCH 9/9] Convert cache functions to size_t
From: Martin KoeglerSigned-off-by: Martin Koegler --- cache-tree.c | 6 +++--- cache-tree.h | 2 +- cache.h | 6 +++--- convert.c | 18 +- environment.c | 4 ++-- read-cache.c | 18 +- 6 files changed, 27 insertions(+), 27 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 2440d1d..77b3253 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -485,10 +485,10 @@ void cache_tree_write(struct strbuf *sb, struct cache_tree *root) write_one(sb, root, "", 0); } -static struct cache_tree *read_one(const char **buffer, unsigned long *size_p) +static struct cache_tree *read_one(const char **buffer, size_t *size_p) { const char *buf = *buffer; - unsigned long size = *size_p; + size_t size = *size_p; const char *cp; char *ep; struct cache_tree *it; @@ -568,7 +568,7 @@ static struct cache_tree *read_one(const char **buffer, unsigned long *size_p) return NULL; } -struct cache_tree *cache_tree_read(const char *buffer, unsigned long size) +struct cache_tree *cache_tree_read(const char *buffer, size_t size) { if (buffer[0]) return NULL; /* not the whole tree */ diff --git a/cache-tree.h b/cache-tree.h index f7b9cab..df59e6e 100644 --- a/cache-tree.h +++ b/cache-tree.h @@ -28,7 +28,7 @@ void cache_tree_invalidate_path(struct index_state *, const char *); struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *); void cache_tree_write(struct strbuf *, struct cache_tree *root); -struct cache_tree *cache_tree_read(const char *buffer, unsigned long size); +struct cache_tree *cache_tree_read(const char *buffer, size_t size); int cache_tree_fully_valid(struct cache_tree *); int cache_tree_update(struct index_state *, int); diff --git a/cache.h b/cache.h index 9322303..f77d9ec 100644 --- a/cache.h +++ b/cache.h @@ -667,7 +667,7 @@ extern int chmod_index_entry(struct index_state *, struct cache_entry *ce, char extern int ce_same_name(const struct cache_entry *a, const struct cache_entry *b); extern void set_object_name_for_intent_to_add_entry(struct cache_entry *ce); extern int index_name_is_other(const struct index_state *, const char *, int); -extern void *read_blob_data_from_index(const struct index_state *, const char *, unsigned long *); +extern void *read_blob_data_from_index(const struct index_state *, const char *, size_t *); /* do stat comparison even if CE_VALID is true */ #define CE_MATCH_IGNORE_VALID 01 @@ -743,8 +743,8 @@ extern int pack_compression_level; extern size_t packed_git_window_size; extern size_t packed_git_limit; extern size_t delta_base_cache_limit; -extern unsigned long big_file_threshold; -extern unsigned long pack_size_limit_cfg; +extern size_t big_file_threshold; +extern size_t pack_size_limit_cfg; /* * Accessors for the core.sharedrepository config which lazy-load the value diff --git a/convert.c b/convert.c index 1012462..445599b 100644 --- a/convert.c +++ b/convert.c @@ -41,9 +41,9 @@ struct text_stat { unsigned printable, nonprintable; }; -static void gather_stats(const char *buf, unsigned long size, struct text_stat *stats) +static void gather_stats(const char *buf, size_t size, struct text_stat *stats) { - unsigned long i; + size_t i; memset(stats, 0, sizeof(*stats)); @@ -90,7 +90,7 @@ static void gather_stats(const char *buf, unsigned long size, struct text_stat * * The same heuristics as diff.c::mmfile_is_binary() * We treat files with bare CR as binary */ -static int convert_is_binary(unsigned long size, const struct text_stat *stats) +static int convert_is_binary(size_t size, const struct text_stat *stats) { if (stats->lonecr) return 1; @@ -101,7 +101,7 @@ static int convert_is_binary(unsigned long size, const struct text_stat *stats) return 0; } -static unsigned int gather_convert_stats(const char *data, unsigned long size) +static unsigned int gather_convert_stats(const char *data, size_t size) { struct text_stat stats; int ret = 0; @@ -118,7 +118,7 @@ static unsigned int gather_convert_stats(const char *data, unsigned long size) return ret; } -static const char *gather_convert_stats_ascii(const char *data, unsigned long size) +static const char *gather_convert_stats_ascii(const char *data, size_t size) { unsigned int convert_stats = gather_convert_stats(data, size); @@ -140,7 +140,7 @@ const char *get_cached_convert_stats_ascii(const struct index_state *istate, const char *path) { const char *ret; - unsigned long sz; + size_t sz; void *data = read_blob_data_from_index(istate, path, ); ret = gather_convert_stats_ascii(data, sz); free(data); @@ -222,7 +222,7 @@ static void check_safe_crlf(const char *path, enum crlf_action
Re: [PATCH] commit: skip discarding the index if there is no pre-commit hook
On Thu, Aug 10, 2017 at 02:54:16PM -0400, Kevin Willford wrote: > If there is not a pre-commit hook, there is no reason to discard > the index and reread it. > > This change checks to presence of a pre-commit hook and then only > discards the index if there was one. Sounds like a smart optimization. > diff --git a/builtin/commit.c b/builtin/commit.c > index e7a2cb6285..443949d87b 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -671,12 +671,22 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > const char *hook_arg2 = NULL; > int clean_message_contents = (cleanup_mode != CLEANUP_NONE); > int old_display_comment_prefix; > + const char *precommit_hook = NULL; The return value from find_hook() points to storage that may later be overwritten or even freed. I know your patch doesn't actually look at the contents of precommit_hook, but it seems like it's setting up a potential bomb for somebody later. Can we make this: int have_precommit_hook = 0; ... have_precommit_hook = !!find_hook("pre-commit"); ? Though see below. > - if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", > NULL)) > - return 0; > + > + if (!no_verify) { > + /* > + * Check to see if there is a pre-commit hook > + * If there not one we can skip discarding the index later on > + */ > + precommit_hook = find_hook("pre-commit"); > + if (precommit_hook && > + run_commit_hook(use_editor, index_file, "pre-commit", NULL)) > + return 0; > + } We'll find the hook again in run_commit_hook(), but it's not so expensive that it's worth trying to pass down the hook path we found (and if we switch to an integer flag we don't have the path anyway ;) ). But note that we don't even care about precommit_hook here. We could just call run_commit_hook() regardless. We only care later whether we ran it or not. So we could drop this hunk and the precommit_hook variable entirely, and... > @@ -940,12 +950,15 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > return 0; > } > > - /* > - * Re-read the index as pre-commit hook could have updated it, > - * and write it out as a tree. We must do this before we invoke > - * the editor and after we invoke run_status above. > - */ > - discard_cache(); > + if (!no_verify && precommit_hook) { > + /* > + * Re-read the index as pre-commit hook could have updated it, > + * and write it out as a tree. We must do this before we invoke > + * the editor and after we invoke run_status above. > + */ > + discard_cache(); > + } Just make this: if (!no_verify && find_hook("pre-commit")) ... discard cache ... That is racy if somebody removes the hook while we're running (so is your patch, but in the opposite direction). What we really want to know is "did we run the hook" and annoyingly run_hook_ve() doesn't tell us that. So I think the most robust solution would be refactoring that to pass out the information, and then use the flag here. But I'm not sure it's actually worth worrying about such a race in practice. -Peff
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On Thu, Aug 10, 2017 at 11:58:34AM -0700, Stefan Beller wrote: > On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willfordwrote: > > String formatting can be a performance issue when there are > > hundreds of thousands of trees. > > When changing this for the sake of performance, could you give > an example (which kind of repository you need for this to become > a bottleneck? I presume the large Windows repo? Or can I > reproduce it with a small repo such as linux.git or even git.git?) > and some numbers how this improves the performance? I was about to say the same thing. Normally I don't mind a small optimization without numbers if the result is obviously an improvement. But in this case the result is a lot less readable, and it's not entirely clear to me that it would always be an improvement (we now always run 3 strbuf calls instead of one, and have to check the length for each one). What I'm wondering specifically is if vsnprintf() on Kevin's platform (which I'll assume is Windows) is slow, and we would do better to replace it with a faster compat/ routine. -Peff
Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one
On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willfordwrote: > String formatting can be a performance issue when there are > hundreds of thousands of trees. When changing this for the sake of performance, could you give an example (which kind of repository you need for this to become a bottleneck? I presume the large Windows repo? Or can I reproduce it with a small repo such as linux.git or even git.git?) and some numbers how this improves the performance? > Change to stop using the strbuf_addf and just add the strings > or characters individually. > > There are a limited number of modes so added a switch for the > known ones and a default case if something comes through that > are not a known one for git. > > Signed-off-by: Kevin Willford > --- > cache-tree.c | 24 +++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/cache-tree.c b/cache-tree.c > index 2440d1dc89..41744b3db7 100644 > --- a/cache-tree.c > +++ b/cache-tree.c > @@ -390,7 +390,29 @@ static int update_one(struct cache_tree *it, > continue; > > strbuf_grow(, entlen + 100); > - strbuf_addf(, "%o %.*s%c", mode, entlen, path + > baselen, '\0'); > + > + switch (mode) { > + case 0100644: > + strbuf_add(, "100644 ", 7); > + break; > + case 0100664: > + strbuf_add(, "100664 ", 7); > + break; > + case 0100755: > + strbuf_add(, "100755 ", 7); > + break; > + case 012: > + strbuf_add(, "12 ", 7); > + break; > + case 016: > + strbuf_add(, "16 ", 7); > + break; Maybe it is worth spelling out the modes in non-numeric, but e.g. S_IFGITLINK. > + default: > + strbuf_addf(, "%o ", mode); Given the repository you are measuring, maybe we could get away with fewer entries here and only take the 2 or 3 most used entries and special case them? Or in case this is assumed to be the exhaustive list, we could issue a warning here? Thanks, Stefan
Re: fatal: Out of memory, getdelim failed under NFS mounts
René Scharfewrites: > I doubt the type of file system matters. The questions are: How much > main memory do you have, what is git trying to cram into it, is there > a way to reduce the memory footprint or do you need to add more RAM? > >> any recommendations on how to pin point the "offender"? ;) > Running "GIT_TRACE=1 git pull --ff-only origin master" would be a > good start, I think, to find out which of the different activities > that pull is doing causes the out-of-memory error. > > "free" and "ulimit -a" can help you find out how much memory you can > use. > > Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report? > getdelim() is used mostly to read lines from files like these and in > the admittedly unlikely case that they are *really* long such an > error would be expected. There is only one getdelim() call, which was introduced in v2.5.0 timeframe, and it is used like this: r = getdelim(>buf, >alloc, term, fp); if (r > 0) { sb->len = r; return 0; } assert(r == -1); /* * Normally we would have called xrealloc, which will try to free * memory and recover. But we have no way to tell getdelim() to do so. * Worse, we cannot try to recover ENOMEM ourselves, because we have * no idea how many bytes were read by getdelim. * * Dying here is reasonable. It mirrors what xrealloc would do on * catastrophic memory failure. We skip the opportunity to free pack * memory and retry, but that's unlikely to help for a malloc small * enough to hold a single line of input, anyway. */ if (errno == ENOMEM) die("Out of memory, getdelim failed"); So the function is returning -1 and leaving ENOMEM in errno on Yaroslav's system. I wonder if we are truly hitting out of memory, though. The same symptom could bee seen if getdelim() does not touch errno when it returns -1, but some other system call earlier set it to ENOMEM, for example. If the same version of Git is recompiled there without HAVE_GETDELIM defined, would it still die with out of memory (presumably inside the call to strbuf_grow() in the strbuf_getwholeline() function)?
[PATCH] commit: skip discarding the index if there is no pre-commit hook
If there is not a pre-commit hook, there is no reason to discard the index and reread it. This change checks to presence of a pre-commit hook and then only discards the index if there was one. Signed-off-by: Kevin Willford--- builtin/commit.c | 29 + 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index e7a2cb6285..443949d87b 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -671,12 +671,22 @@ static int prepare_to_commit(const char *index_file, const char *prefix, const char *hook_arg2 = NULL; int clean_message_contents = (cleanup_mode != CLEANUP_NONE); int old_display_comment_prefix; + const char *precommit_hook = NULL; /* This checks and barfs if author is badly specified */ determine_author_info(author_ident); - if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL)) - return 0; + + if (!no_verify) { + /* +* Check to see if there is a pre-commit hook +* If there not one we can skip discarding the index later on +*/ + precommit_hook = find_hook("pre-commit"); + if (precommit_hook && + run_commit_hook(use_editor, index_file, "pre-commit", NULL)) + return 0; + } if (squash_message) { /* @@ -940,12 +950,15 @@ static int prepare_to_commit(const char *index_file, const char *prefix, return 0; } - /* -* Re-read the index as pre-commit hook could have updated it, -* and write it out as a tree. We must do this before we invoke -* the editor and after we invoke run_status above. -*/ - discard_cache(); + if (!no_verify && precommit_hook) { + /* +* Re-read the index as pre-commit hook could have updated it, +* and write it out as a tree. We must do this before we invoke +* the editor and after we invoke run_status above. +*/ + discard_cache(); + } + read_cache_from(index_file); if (update_main_cache_tree(0)) { error(_("Error building trees")); -- 2.14.0.rc0.286.g44127d70e4
[PATCH v1 0/1] Teach status to show ignored directories
Our application (Visual Studio) needs to run git status with options to report untracked and ignored files. It needs all untracked files reported individually, but would rather not have all individual ignored files under explicitly ignored directories reported. Directories that match an ignore pattern should be reported, instead of all contained files. It is not possible to get this output with the current arguments available to git status. You can get ignored files (--ignored), all untracked files (--untracked-files=all), but if you specify both options then you will also get all individual ignored files. This change is to add the option to have git status report all untracked files while also reporting directories that match an ignore pattern. The logic in dir.c was also modified to not iterate over all files in an ignored directory. This change resulted in a performance improvement in work directories with a large number of files contained in ignored directories. For example, this setup is present when bin and obj directories are contained in the repository work dir. Jameson Miller (1): dir: teach status to show ignored directories Documentation/git-status.txt | 5 + Documentation/technical/api-directory-listing.txt | 6 + builtin/commit.c | 4 + dir.c | 48 +- dir.h | 3 +- t/t7519-status-show-ignored-directory.sh | 189 ++ wt-status.c | 4 + wt-status.h | 1 + 8 files changed, 253 insertions(+), 7 deletions(-) create mode 100755 t/t7519-status-show-ignored-directory.sh -- 2.11.0
[PATCH v1 1/1] dir: teach status to show ignored directories
Teach Git to optionally show ignored directories when showing all untracked files. The git status command exposes the options to report ignored and/or untracked files. However, when reporting all untracked files (--untracked-files=all), all individual ignored files are reported as well. It is not currently possible to get the reporting behavior of the --ignored flag, while also reporting all untracked files. This change exposes a flag to report all untracked files while not showing individual files in ignored directories. Motivation: Our application (Visual Studio) needs all untracked files listed individually, but does not need all ignored files listed individually. Reporting all ignored files can affect the time it takes for status to run. For a representative repository, here are some measurements showing a large perf improvement for this scenario: | Command | Reported ignored entries | Time (s) | | --- | | | | 1 | 0| 1.3 | | 2 | 1024 | 4.2 | | 3 | 174904 | 7.5 | | 4 | 1046 | 1.6 | Commands: 1) status 2) status --ignored 3) status --ignored --untracked-files=all 4) status --ignored --untracked-files=all --show-ignored-directory This changes exposes a --show-ignored-directory flag to the git status command. This flag is utilized when running git status with the --ignored and --untracked-files options to not list ignored individual ignored files contained in directories that match an ignore pattern. Part of the perf improvement comes from the tweak to read_directory_recursive to stop scanning the file system after it encounters the first file. When a directory is ignored, all it needs to determine is if the directory is empty or not. The logic currently keeps scanning the file system until it finds an untracked file. However, as the directory is ignored, all the contained contents are also marked excluded. For ignored directories that contain a large number of files, this can take some time. Signed-off-by: Jameson Miller--- Documentation/git-status.txt | 5 + Documentation/technical/api-directory-listing.txt | 6 + builtin/commit.c | 4 + dir.c | 48 +- dir.h | 3 +- t/t7519-status-show-ignored-directory.sh | 189 ++ wt-status.c | 4 + wt-status.h | 1 + 8 files changed, 253 insertions(+), 7 deletions(-) create mode 100755 t/t7519-status-show-ignored-directory.sh diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt index d47f198f15..48ebe18571 100644 --- a/Documentation/git-status.txt +++ b/Documentation/git-status.txt @@ -100,6 +100,11 @@ configuration variable documented in linkgit:git-config[1]. --ignored:: Show ignored files as well. +--show-ignored-directory:: +Show directories that are ignored, instead of individual files. +Does not recurse into excluded directories when listing all +untracked files. + -z:: Terminate entries with NUL, instead of LF. This implies the `--porcelain=v1` output format if no other format is given. diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt index 6c77b4920c..a9816df44c 100644 --- a/Documentation/technical/api-directory-listing.txt +++ b/Documentation/technical/api-directory-listing.txt @@ -33,6 +33,12 @@ The notable options are: Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]` in addition to untracked files in `entries[]`. +`DIR_SHOW_IGNORED_DIRECTORY`::: + + If this is set, non-empty directories that match an ignore pattern are + returned. The individual files contained in ignored directories are not + included. + `DIR_KEEP_UNTRACKED_CONTENTS`::: Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if this is set, the diff --git a/builtin/commit.c b/builtin/commit.c index 8e93802511..9fcf77ae3a 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1333,6 +1333,7 @@ static int git_status_config(const char *k, const char *v, void *cb) int cmd_status(int argc, const char **argv, const char *prefix) { + static int show_ignored_directory = 0; static struct wt_status s; int fd; struct object_id oid; @@ -1362,6 +1363,8 @@ int cmd_status(int argc, const char **argv, const char *prefix) N_("ignore changes to submodules, optional when: all, dirty, untracked. (Default: all)"), PARSE_OPT_OPTARG, NULL, (intptr_t)"all" }, OPT_COLUMN(0, "column", , N_("list untracked files in columns")), + OPT_BOOL(0,
Re: Not understanding with git wants to copy one file to another
On Thu, Aug 10, 2017 at 11:18 AM, Harry Putnamwrote: > Stefan Beller writes: > >> On Thu, Aug 10, 2017 at 10:03 AM, Harry Putnam wrote: > > [...] > > Harry wrote: >>> Here are two that are at least kind of similar but would never be seen >>> as the same: >>> >>> < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 192.168.1.43m2.local.lanm2 # win 7 > > Stefan B replied: >> The diff machinery has a threshold for when it assumes >> a copy/move of a file. (e.g. "A file is assumed copied when >> at least 55% of lines are equal") >> >> https://git-scm.com/docs/git-diff >> >> See -C and -M option. >> >> git-status seems to use this machinery as well, but does >> not expose the options? > > Well, now I'm even more confused. What actually happens? Is either > file changed? Is only one file kept? > > On the surface it sounds like complete anathema to what git is all > about. > > However, I know a tool this sophisticated is not doing something just > outright stupid... so must be really missing the point here. > > I get the way you can make -M stricter or not... but I didn't call > git-diff to see that copy thing comeup. > > I called git commit. Ah. Sorry for confusing even more. By pointing out the options for git-diff, I just wanted to point out that such a mechanism ("rename/copy detection") exists. The output of git-status is similar to a dry run of git-commit, and apparently this detection is used there. > > There must be some way to set stricter guidlines to calling things > copies. Well from Gits perspective it is really hard to tell if it was a copy, or if it was similar incidentally (because the format/content of these files happen to follow some strict guidelines). The user could have moved/copied a file outside of Git (instead of git-mv, you'd use tools provided by your operating system to copy a file). Or the user could have written a file that is similar by chance. However that doesn't really matter, as Git tracks the content, and not how the file evolved. Consider the copy/move/rename detection as a heuristic, that wants to help the user, but may be mistaken. > > But then I must really not get it because it still seems almost silly > to consider one file a copy of another if only 55% is the same. > > What am I missing? > https://www.reddit.com/r/git/comments/3ogkk1/beginner_disable_rename_detection/ "Rename detection is just GUI sugar".
[PATCH] cache-tree: remove use of strbuf_addf in update_one
String formatting can be a performance issue when there are hundreds of thousands of trees. Change to stop using the strbuf_addf and just add the strings or characters individually. There are a limited number of modes so added a switch for the known ones and a default case if something comes through that are not a known one for git. Signed-off-by: Kevin Willford--- cache-tree.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index 2440d1dc89..41744b3db7 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -390,7 +390,29 @@ static int update_one(struct cache_tree *it, continue; strbuf_grow(, entlen + 100); - strbuf_addf(, "%o %.*s%c", mode, entlen, path + baselen, '\0'); + + switch (mode) { + case 0100644: + strbuf_add(, "100644 ", 7); + break; + case 0100664: + strbuf_add(, "100664 ", 7); + break; + case 0100755: + strbuf_add(, "100755 ", 7); + break; + case 012: + strbuf_add(, "12 ", 7); + break; + case 016: + strbuf_add(, "16 ", 7); + break; + default: + strbuf_addf(, "%o ", mode); + break; + } + strbuf_add(, path + baselen, entlen); + strbuf_addch(, '\0'); strbuf_add(, sha1, 20); #if DEBUG -- 2.14.0.rc0.286.g44127d70e4
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
Jeff Kingwrites: > Perhaps "--exact-input" hits all of those. Or maybe "--only-input" to > match the other "--only". > > I think I like that last one the best. It makes it clear that we are > looking just at the input, and not anything else. Which is exactly what > the feature does. Sounds good. Thanks.
Re: [PATCH 0/5] make interpret-trailers useful for parsing
Jeff Kingwrites: >> > The above example made me wonder if we also want a format specifier >> > to do the above without piping, but it turns out that we already >> > have "log --format=%(trailers)", so we are good ;-) >> >> I was going to say, I thought we had a way to get trailers for a >> commit via the pretty format, since that is what i used in the past. > > I do like that you could get the trailers for many commits in a single > invocation. That doesn't matter for my current use-case, but obviously > piping through O(n) interpret-trailers invocations is a bad idea. > But there are a few difficulties with using %(trailers) for this,... I think it is clear to you, but it may not be clear to others, that I did not mean to say "because 'log --format' already knows about it, this change to interpret-trailers is unnecessary". > For (1) I think many callers would prefer to see the original > formatting. Maybe we'd need a %(trailers:normalize) or something. Thanks; that is exactly the line of thought I had in the back of my head without even realizing when I brought up %(trailers) format element.
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 11:35:00AM -0700, Stefan Beller wrote: > On Thu, Aug 10, 2017 at 1:03 AM, Jeff Kingwrote: > > The point of "--only-trailers" is to give a caller an output > > that's easy for them to parse. Getting rid of the > > non-trailer material helps, but we still may see more > > complicated syntax like whitespace continuation. Let's add > > an option to normalize the output into one "key: value" line > > per trailer. > > > > As a bonus, this could be used even without --only-trailers > > to clean up unusual formatting in the incoming data. > > This is useful for the parsing part, but for the writing part we'd > rather want to have the opposite thing, such as > '--line-break=rfc822'. But this doesn't have to be part of this > series. With this in mind, I do not quite understand the latter > use case how you would use normalized trailers without > --only-trailers? If you prefer the normalized form (and the input was line-broken in a way that you don't like), then this would convert to your preferred form. I agree that you could potentially want the opposite (folding long lines). Perhaps something like --wrap=72. -Peff
Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Thu, Aug 10, 2017 at 11:03 AM, Jeff Kingwrote: > On Thu, Aug 10, 2017 at 04:02:46AM -0400, Jeff King wrote: > >> On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: >> >> > This series teaches interpret-trailers to parse and output just the >> > trailers. So now you can do: >> > >> > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | >> > git interpret-trailers --parse >> > Signed-off-by: Hartmut Henkel >> > Helped-by: Stefan Beller >> > Signed-off-by: Ralf Thielow >> > Acked-by: Matthias Rüster >> >> And here's a v2 that addresses all of the comments except one: Stefan >> suggested that --only-existing wasn't a great name. I agree, but I like >> everything else less. > > Here's a v3 that takes care of that (renaming it to --only-input). > > It's otherwise the same as v2, but since the name-change ripples through > the remaining patches, I wanted to get v3 in front of people sooner > rather than later. > Looks good, Thanks, Stefan
Re: [PATCH 4/5] interpret-trailers: add an option to normalize output
On Thu, Aug 10, 2017 at 1:03 AM, Jeff Kingwrote: > The point of "--only-trailers" is to give a caller an output > that's easy for them to parse. Getting rid of the > non-trailer material helps, but we still may see more > complicated syntax like whitespace continuation. Let's add > an option to normalize the output into one "key: value" line > per trailer. > > As a bonus, this could be used even without --only-trailers > to clean up unusual formatting in the incoming data. This is useful for the parsing part, but for the writing part we'd rather want to have the opposite thing, such as '--line-break=rfc822'. But this doesn't have to be part of this series. With this in mind, I do not quite understand the latter use case how you would use normalized trailers without --only-trailers?
[PATCH v2 0/2] Add progress for format-patch and rebase
Changes since last patch: 1. Use start_progress_delay so progress isn't shown if generating the patches is fast enough 2. Updated to have text of "Generating patches" 3. Only show progress when the --progress flag is passed 4. In the rebase script check stderr and the quiet option is not set before propagating the progress flag to format-patch Kevin Willford (2): format-patch: have progress option while generating patches rebase: turn on progress option by default for format-patch Documentation/git-format-patch.txt | 4 builtin/log.c | 10 ++ git-rebase--am.sh | 1 + git-rebase.sh | 6 ++ 4 files changed, 21 insertions(+) -- 2.14.0.rc0.286.g44127d70e4
[PATCH v2 1/2] format-patch: have progress option while generating patches
When generating patches for the rebase command if the user does not realize the branch they are rebasing onto is thousands of commits different there is no progress indication after initial rewinding message. The progress meter as presented in this patch assumes the thousands of patches to have a fine granularity as well as assuming to require all the same amount of work/time for each, such that a steady progress bar is achieved. We do not want to estimate the time for each patch based e.g. on their size or number of touched files (or parents) as that is too expensive for just a progress meter. This patch allows a progress option to be passed to format-patch so that the user can be informed the progress of generating the patch. This option is then used by the rebase command when calling format-patch. Signed-off-by: Kevin Willford--- Documentation/git-format-patch.txt | 4 builtin/log.c | 10 ++ 2 files changed, 14 insertions(+) diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt index c890328b02..6cbe462a77 100644 --- a/Documentation/git-format-patch.txt +++ b/Documentation/git-format-patch.txt @@ -23,6 +23,7 @@ SYNOPSIS [(--reroll-count|-v) ] [--to=] [--cc=] [--[no-]cover-letter] [--quiet] [--notes[=]] + [--progress] [] [ | ] @@ -283,6 +284,9 @@ you can use `--suffix=-patch` to get `0001-description-of-my-change-patch`. range are always formatted as creation patches, independently of this flag. +--progress:: + Show progress reports on stderr as patches are generated. + CONFIGURATION - You can specify extra mail header lines to be added to each message, diff --git a/builtin/log.c b/builtin/log.c index 725c7b8a1a..b07a5529c2 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -27,6 +27,7 @@ #include "version.h" #include "mailmap.h" #include "gpg-interface.h" +#include "progress.h" /* Set a default date-time format for git log ("log.date" config variable) */ static const char *default_date_mode = NULL; @@ -1422,6 +1423,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) char *branch_name = NULL; char *base_commit = NULL; struct base_tree_info bases; + int show_progress = 0; + struct progress *progress = NULL; const struct option builtin_format_patch_options[] = { { OPTION_CALLBACK, 'n', "numbered", , NULL, @@ -1493,6 +1496,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) OPT_FILENAME(0, "signature-file", _file, N_("add a signature from a file")), OPT__QUIET(, N_("don't print the patch filenames")), + OPT_BOOL(0, "progress", _progress, +N_("show progress while generating patches")), OPT_END() }; @@ -1752,8 +1757,12 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) start_number--; } rev.add_signoff = do_signoff; + + if (show_progress) + progress = start_progress_delay(_("Generating patches"), total, 0, 1); while (0 <= --nr) { int shown; + display_progress(progress, total - nr); commit = list[nr]; rev.nr = total - nr + (start_number - 1); /* Make the second and subsequent mails replies to the first */ @@ -1818,6 +1827,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) if (!use_stdout) fclose(rev.diffopt.file); } + stop_progress(); free(list); free(branch_name); string_list_clear(_to, 0); -- 2.14.0.rc0.286.g44127d70e4
[PATCH v2 2/2] rebase: turn on progress option by default for format-patch
This change passes the progress option of format-patch checking that stderr is attached and rebase is not being run in quiet mode. Signed-off-by: Kevin Willford--- git-rebase--am.sh | 1 + git-rebase.sh | 6 ++ 2 files changed, 7 insertions(+) diff --git a/git-rebase--am.sh b/git-rebase--am.sh index 375239341f..ff98fe3a73 100644 --- a/git-rebase--am.sh +++ b/git-rebase--am.sh @@ -53,6 +53,7 @@ else git format-patch -k --stdout --full-index --cherry-pick --right-only \ --src-prefix=a/ --dst-prefix=b/ --no-renames --no-cover-letter \ + $git_format_patch_opt \ "$revisions" ${restrict_revision+^$restrict_revision} \ >"$GIT_DIR/rebased-patches" ret=$? diff --git a/git-rebase.sh b/git-rebase.sh index f8b3d1fd97..ad8415e3cf 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -74,6 +74,7 @@ test "$(git config --bool rebase.stat)" = true && diffstat=t autostash="$(git config --bool rebase.autostash || echo false)" fork_point=auto git_am_opt= +git_format_patch_opt= rebase_root= force_rebase= allow_rerere_autoupdate= @@ -445,6 +446,11 @@ else state_dir="$apply_dir" fi +if test -t 2 && test -z "$GIT_QUIET" +then + git_format_patch_opt="$git_format_patch_opt --progress" +fi + if test -z "$rebase_root" then case "$#" in -- 2.14.0.rc0.286.g44127d70e4
Re: [PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
On Thu, Aug 10, 2017 at 11:28:52AM -0700, Stefan Beller wrote: > > +test_expect_success 'only trailers' ' > > + git config trailer.sign.command "echo config-value" && > > You may want to use 'test_config' here, which keeps the config > only for one test. The subsequent tests seem to overwrite the > config, so this is not wrong, just not the best style. Yeah, I actually considered that but decided to keep style with the rest of the script. I agree the whole thing could possibly switch to test_config, but I suspect there may be some fallouts (the style of the rest of the script seems to assume some continuity between the tests). -Peff
Re: [PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
On Thu, Aug 10, 2017 at 11:04 AM, Jeff Kingwrote: > In theory it's easy for any reader who wants to parse > trailers to do so. But there are a lot of subtle corner > cases around what counts as a trailer, when the trailer > block begins and ends, etc. Since interpret-trailers already > has our parsing logic, let's let callers ask it to just > output the trailers. > > They still have to parse the "key: value" lines, but at > least they can ignore all of the other corner cases. > > Signed-off-by: Jeff King Sorry for a sloppy review last round, upon reviewing I found another nit. > > +test_expect_success 'only trailers' ' > + git config trailer.sign.command "echo config-value" && You may want to use 'test_config' here, which keeps the config only for one test. The subsequent tests seem to overwrite the config, so this is not wrong, just not the best style.
Re: Not understanding with git wants to copy one file to another
Stefan Bellerwrites: > On Thu, Aug 10, 2017 at 10:03 AM, Harry Putnam wrote: [...] Harry wrote: >> Here are two that are at least kind of similar but would never be seen >> as the same: >> >> < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 >>> 192.168.1.43m2.local.lanm2 # win 7 Stefan B replied: > The diff machinery has a threshold for when it assumes > a copy/move of a file. (e.g. "A file is assumed copied when > at least 55% of lines are equal") > > https://git-scm.com/docs/git-diff > > See -C and -M option. > > git-status seems to use this machinery as well, but does > not expose the options? Well, now I'm even more confused. What actually happens? Is either file changed? Is only one file kept? On the surface it sounds like complete anathema to what git is all about. However, I know a tool this sophisticated is not doing something just outright stupid... so must be really missing the point here. I get the way you can make -M stricter or not... but I didn't call git-diff to see that copy thing comeup. I called git commit. There must be some way to set stricter guidlines to calling things copies. But then I must really not get it because it still seems almost silly to consider one file a copy of another if only 55% is the same. What am I missing?
[PATCH 4/4] Fix delta offset overflow
From: Martin KoeglerPrevent generating delta offsets beyond 4G. Signed-off-by: Martin Koegler --- diff-delta.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/diff-delta.c b/diff-delta.c index 3d5e1ef..633883e 100644 --- a/diff-delta.c +++ b/diff-delta.c @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index, moff += msize; msize = left; + if (moff > 0x) + msize = 0; + if (msize < 4096) { int j; val = 0; -- 2.1.4
[PATCH 3/4] Convert zlib.c to size_t
From: Martin KoeglerSigned-off-by: Martin Koegler --- builtin/pack-objects.c | 10 +- cache.h| 12 ++-- pack-check.c | 6 +++--- pack.h | 2 +- sha1_file.c| 6 +++--- wrapper.c | 8 zlib.c | 8 7 files changed, 26 insertions(+), 26 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d94fd17..aa70f80 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -222,15 +222,15 @@ static void copy_pack_data(struct sha1file *f, struct packed_git *p, struct pack_window **w_curs, off_t offset, - off_t len) + size_t len) { unsigned char *in; - unsigned long avail; + size_t avail; while (len) { in = use_pack(p, w_curs, offset, ); if (avail > len) - avail = (unsigned long)len; + avail = len; sha1write(f, in, avail); offset += avail; len -= avail; @@ -1388,8 +1388,8 @@ static void check_object(struct object_entry *entry) struct pack_window *w_curs = NULL; const unsigned char *base_ref = NULL; struct object_entry *base_entry; - unsigned long used, used_0; - unsigned long avail; + size_t used, used_0; + size_t avail; off_t ofs; unsigned char *buf, c; diff --git a/cache.h b/cache.h index 26a3eaa..9185763 100644 --- a/cache.h +++ b/cache.h @@ -42,10 +42,10 @@ #include typedef struct git_zstream { z_stream z; - unsigned long avail_in; - unsigned long avail_out; - unsigned long total_in; - unsigned long total_out; + size_t avail_in; + size_t avail_out; + size_t total_in; + size_t total_out; unsigned char *next_in; unsigned char *next_out; } git_zstream; @@ -62,7 +62,7 @@ void git_deflate_end(git_zstream *); int git_deflate_abort(git_zstream *); int git_deflate_end_gently(git_zstream *); int git_deflate(git_zstream *, int flush); -unsigned long git_deflate_bound(git_zstream *, unsigned long); +size_t git_deflate_bound(git_zstream *, size_t); /* The length in bytes and in hex digits of an object name (SHA-1 value). */ #define GIT_SHA1_RAWSZ 20 @@ -1678,7 +1678,7 @@ extern int open_pack_index(struct packed_git *); */ extern void close_pack_index(struct packed_git *); -extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); +extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, size_t *); extern void close_pack_windows(struct packed_git *); extern void close_all_packs(void); extern void unuse_pack(struct pack_window **); diff --git a/pack-check.c b/pack-check.c index 6f7714f..8cc2364 100644 --- a/pack-check.c +++ b/pack-check.c @@ -24,13 +24,13 @@ static int compare_entries(const void *e1, const void *e2) } int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, - off_t offset, off_t len, unsigned int nr) + off_t offset, size_t len, unsigned int nr) { const uint32_t *index_crc; uint32_t data_crc = crc32(0, NULL, 0); do { - unsigned long avail; + size_t avail; void *data = use_pack(p, w_curs, offset, ); if (avail > len) avail = len; @@ -65,7 +65,7 @@ static int verify_packfile(struct packed_git *p, git_SHA1_Init(); do { - unsigned long remaining; + size_t remaining; unsigned char *in = use_pack(p, w_curs, offset, ); offset += remaining; if (!pack_sig_ofs) diff --git a/pack.h b/pack.h index 8294341..ca89ad0 100644 --- a/pack.h +++ b/pack.h @@ -78,7 +78,7 @@ struct progress; typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned long, void*, int*); extern const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, const unsigned char *sha1); -extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr); +extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, size_t len, unsigned int nr); extern int verify_pack_index(struct packed_git *); extern int verify_pack(struct packed_git *, verify_fn fn, struct progress *, uint32_t); extern off_t write_pack_header(struct sha1file *f, uint32_t); diff --git a/sha1_file.c b/sha1_file.c index 97b39b0..3428172 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1228,7 +1228,7 @@ static int in_window(struct
[PATCH v3 5/5] interpret-trailers: add --parse convenience option
The last few commits have added command line options that can turn interpret-trailers into a parsing tool. Since they'd most often be used together, let's provide a convenient single option for callers to invoke this mode. This is implemented as a callback rather than a boolean so that its effect is applied immediately, as if those options had been specified. Later options can then override them. E.g.: git interpret-trailers --parse --no-normalize would work. Let's also update the documentation to make clear that this parsing mode behaves quite differently than the normal "add trailers to the input" mode. Signed-off-by: Jeff King--- Documentation/git-interpret-trailers.txt | 21 ++--- builtin/interpret-trailers.c | 12 2 files changed, 26 insertions(+), 7 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 598b74efd7..54c8b081c6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -3,24 +3,27 @@ git-interpret-trailers(1) NAME -git-interpret-trailers - help add structured information into commit messages +git-interpret-trailers - add or parse structured information in commit messages SYNOPSIS [verse] -'git interpret-trailers' [--in-place] [--trim-empty] [(--trailer [(=|:)])...] [...] +'git interpret-trailers' [options] [(--trailer [(=|:)])...] [...] +'git interpret-trailers' [options] [--parse] [...] DESCRIPTION --- -Help adding 'trailers' lines, that look similar to RFC 822 e-mail +Help parsing or adding 'trailers' lines, that look similar to RFC 822 e-mail headers, at the end of the otherwise free-form part of a commit message. This command reads some patches or commit messages from either the - arguments or the standard input if no is specified. Then -this command applies the arguments passed using the `--trailer` -option, if any, to the commit message part of each input file. The -result is emitted on the standard output. + arguments or the standard input if no is specified. If +`--parse` is specified, the output consists of the parsed trailers. + +Otherwise, the this command applies the arguments passed using the +`--trailer` option, if any, to the commit message part of each input +file. The result is emitted on the standard output. Some configuration variables control the way the `--trailer` arguments are applied to each commit message and the way any existing trailer in @@ -93,6 +96,10 @@ OPTIONS line, with any existing whitespace continuation folded into a single line. +--parse:: + A convenience alias for `--only-trailers --only-input + --normalize`. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index b7592bedd9..010c181492 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -16,6 +16,16 @@ static const char * const git_interpret_trailers_usage[] = { NULL }; +static int parse_opt_parse(const struct option *opt, const char *arg, + int unset) +{ + struct process_trailer_options *v = opt->value; + v->only_trailers = 1; + v->only_input = 1; + v->normalize = 1; + return 0; +} + int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) { struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; @@ -27,6 +37,8 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "only-trailers", _trailers, N_("output only the trailers")), OPT_BOOL(0, "only-input", _input, N_("do not apply config rules")), OPT_BOOL(0, "normalize", , N_("normalize trailer formatting")), + { OPTION_CALLBACK, 0, "parse", , NULL, N_("set parsing options"), + PARSE_OPT_NOARG | PARSE_OPT_NONEG, parse_opt_parse }, OPT_STRING_LIST(0, "trailer", , N_("trailer"), N_("trailer(s) to add")), OPT_END() -- 2.14.0.614.g0beb26d5e9
[PATCH v3 4/5] interpret-trailers: add an option to normalize output
The point of "--only-trailers" is to give a caller an output that's easy for them to parse. Getting rid of the non-trailer material helps, but we still may see more complicated syntax like whitespace continuation. Let's add an option to normalize the output into one "key: value" line per trailer. As a bonus, this could be used even without --only-trailers to clean up unusual formatting in the incoming data. Signed-off-by: Jeff King--- Documentation/git-interpret-trailers.txt | 5 + builtin/interpret-trailers.c | 1 + t/t7513-interpret-trailers.sh| 21 trailer.c| 34 +++- trailer.h| 1 + 5 files changed, 61 insertions(+), 1 deletion(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 7cc43b0e3e..598b74efd7 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -88,6 +88,11 @@ OPTIONS from the command-line or by following configured `trailer.*` rules. +--normalize:: + Normalize trailer output so that trailers appear exactly one per + line, with any existing whitespace continuation folded into a + single line. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index 2d90e0e480..b7592bedd9 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -26,6 +26,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "trim-empty", _empty, N_("trim empty trailers")), OPT_BOOL(0, "only-trailers", _trailers, N_("output only the trailers")), OPT_BOOL(0, "only-input", _input, N_("do not apply config rules")), + OPT_BOOL(0, "normalize", , N_("normalize trailer formatting")), OPT_STRING_LIST(0, "trailer", , N_("trailer"), N_("trailer(s) to add")), OPT_END() diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 94b6c52473..37c4f37882 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1330,4 +1330,25 @@ test_expect_success 'only input' ' test_cmp expected actual ' +test_expect_success 'normalize' ' + cat >expected <<-\EOF && + foo: continued across several lines + EOF + # pass through tr to make leading and trailing whitespace more obvious + tr _ " " <<-\EOF | + my subject + + my body + + foo:_ + __continued + ___across + several + _lines + ___ + EOF + git interpret-trailers --only-trailers --only-input --normalize >actual && + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index 2b5269183a..07ef082284 100644 --- a/trailer.c +++ b/trailer.c @@ -887,7 +887,35 @@ static int ends_with_blank_line(const char *buf, size_t len) return is_blank_line(buf + ll); } +static void normalize_value(struct strbuf *val) +{ + struct strbuf out = STRBUF_INIT; + size_t i; + + strbuf_grow(, val->len); + i = 0; + while (i < val->len) { + char c = val->buf[i++]; + if (c == '\n') { + /* Collapse continuation down to a single space. */ + while (i < val->len && isspace(val->buf[i])) + i++; + strbuf_addch(, ' '); + } else { + strbuf_addch(, c); + } + } + + /* Empty lines may have left us with whitespace cruft at the edges */ + strbuf_trim(); + + /* output goes back to val as if we modified it in-place */ + strbuf_swap(, val); + strbuf_release(); +} + static int process_input_file(FILE *outfile, + int normalize, const char *str, struct list_head *head) { @@ -914,12 +942,16 @@ static int process_input_file(FILE *outfile, if (separator_pos >= 1) { parse_trailer(, , NULL, trailer, separator_pos); + if (normalize) + normalize_value(); add_trailer_item(head, strbuf_detach(, NULL), strbuf_detach(, NULL)); } else { strbuf_addstr(, trailer); strbuf_strip_suffix(, "\n"); + if (normalize) + normalize_value();
[PATCH v3 1/5] trailer: put process_trailers() options into a struct
We already have two options and are about to add a few more. To avoid having a huge number of boolean arguments, let's convert to an options struct which can be passed in. Signed-off-by: Jeff King--- builtin/interpret-trailers.c | 13 ++--- trailer.c| 10 ++ trailer.h| 10 +- 3 files changed, 21 insertions(+), 12 deletions(-) diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index 175f14797b..bb0d7b937a 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -18,13 +18,12 @@ static const char * const git_interpret_trailers_usage[] = { int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) { - int in_place = 0; - int trim_empty = 0; + struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; struct string_list trailers = STRING_LIST_INIT_NODUP; struct option options[] = { - OPT_BOOL(0, "in-place", _place, N_("edit files in place")), - OPT_BOOL(0, "trim-empty", _empty, N_("trim empty trailers")), + OPT_BOOL(0, "in-place", _place, N_("edit files in place")), + OPT_BOOL(0, "trim-empty", _empty, N_("trim empty trailers")), OPT_STRING_LIST(0, "trailer", , N_("trailer"), N_("trailer(s) to add")), OPT_END() @@ -36,11 +35,11 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) if (argc) { int i; for (i = 0; i < argc; i++) - process_trailers(argv[i], in_place, trim_empty, ); + process_trailers(argv[i], , ); } else { - if (in_place) + if (opts.in_place) die(_("no input file given for in-place editing")); - process_trailers(NULL, in_place, trim_empty, ); + process_trailers(NULL, , ); } string_list_clear(, 0); diff --git a/trailer.c b/trailer.c index 751b56c009..e21a0d1629 100644 --- a/trailer.c +++ b/trailer.c @@ -968,7 +968,9 @@ static FILE *create_in_place_tempfile(const char *file) return outfile; } -void process_trailers(const char *file, int in_place, int trim_empty, struct string_list *trailers) +void process_trailers(const char *file, + const struct process_trailer_options *opts, + struct string_list *trailers) { LIST_HEAD(head); LIST_HEAD(arg_head); @@ -980,7 +982,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str read_input_file(, file); - if (in_place) + if (opts->in_place) outfile = create_in_place_tempfile(file); /* Print the lines before the trailers */ @@ -990,14 +992,14 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str process_trailers_lists(, _head); - print_all(outfile, , trim_empty); + print_all(outfile, , opts->trim_empty); free_all(); /* Print the lines after the trailers as is */ fwrite(sb.buf + trailer_end, 1, sb.len - trailer_end, outfile); - if (in_place) + if (opts->in_place) if (rename_tempfile(_tempfile, file)) die_errno(_("could not rename temporary file to %s"), file); diff --git a/trailer.h b/trailer.h index 65cc5d79c6..9da00bedec 100644 --- a/trailer.h +++ b/trailer.h @@ -22,7 +22,15 @@ struct trailer_info { size_t trailer_nr; }; -void process_trailers(const char *file, int in_place, int trim_empty, +struct process_trailer_options { + int in_place; + int trim_empty; +}; + +#define PROCESS_TRAILER_OPTIONS_INIT {0} + +void process_trailers(const char *file, + const struct process_trailer_options *opts, struct string_list *trailers); void trailer_info_get(struct trailer_info *info, const char *str); -- 2.14.0.614.g0beb26d5e9
[PATCH v3 2/5] interpret-trailers: add an option to show only the trailers
In theory it's easy for any reader who wants to parse trailers to do so. But there are a lot of subtle corner cases around what counts as a trailer, when the trailer block begins and ends, etc. Since interpret-trailers already has our parsing logic, let's let callers ask it to just output the trailers. They still have to parse the "key: value" lines, but at least they can ignore all of the other corner cases. Signed-off-by: Jeff King--- Documentation/git-interpret-trailers.txt | 3 +++ builtin/interpret-trailers.c | 1 + t/t7513-interpret-trailers.sh| 39 trailer.c| 19 ++-- trailer.h| 1 + 5 files changed, 56 insertions(+), 7 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 31cdeaecdf..295dffbd21 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -80,6 +80,9 @@ OPTIONS trailer to the input messages. See the description of this command. +--only-trailers:: + Output only the trailers, not any other parts of the input. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index bb0d7b937a..afb12c11bc 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -24,6 +24,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) struct option options[] = { OPT_BOOL(0, "in-place", _place, N_("edit files in place")), OPT_BOOL(0, "trim-empty", _empty, N_("trim empty trailers")), + OPT_BOOL(0, "only-trailers", _trailers, N_("output only the trailers")), OPT_STRING_LIST(0, "trailer", , N_("trailer"), N_("trailer(s) to add")), OPT_END() diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 0c6f91c433..90d30037b7 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1275,4 +1275,43 @@ test_expect_success 'with cut line' ' test_cmp expected actual ' +test_expect_success 'only trailers' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + existing: existing-value + sign: config-value + added: added-value + EOF + git interpret-trailers \ + --trailer added:added-value \ + --only-trailers >actual <<-\EOF && + my subject + + my body + + existing: existing-value + EOF + test_cmp expected actual +' + +test_expect_success 'only-trailers omits non-trailer in middle of block' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + Signed-off-by: nobody + Signed-off-by: somebody + sign: config-value + EOF + git interpret-trailers --only-trailers >actual <<-\EOF && + subject + + it is important that the trailers below are signed-off-by + so that they meet the "25% trailers Git knows about" heuristic + + Signed-off-by: nobody + this is not a trailer + Signed-off-by: somebody + EOF + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index e21a0d1629..706351fc98 100644 --- a/trailer.c +++ b/trailer.c @@ -164,13 +164,15 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct list_head *head, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, + const struct process_trailer_options *opts) { struct list_head *pos; struct trailer_item *item; list_for_each(pos, head) { item = list_entry(pos, struct trailer_item, list); - if (!trim_empty || strlen(item->value) > 0) + if ((!opts->trim_empty || strlen(item->value) > 0) && + (!opts->only_trailers || item->token)) print_tok_val(outfile, item->token, item->value); } } @@ -897,9 +899,10 @@ static int process_input_file(FILE *outfile, trailer_info_get(, str); /* Print lines before the trailers as is */ - fwrite(str, 1, info.trailer_start - str, outfile); + if (outfile) + fwrite(str, 1, info.trailer_start - str, outfile); - if (!info.blank_line_before_trailer) + if (outfile && !info.blank_line_before_trailer) fprintf(outfile, "\n"); for (i = 0;
[PATCH v3 3/5] interpret-trailers: add an option to show only existing trailers
It can be useful to invoke interpret-trailers for the primary purpose of parsing existing trailers. But in that case, we don't want to apply existing ifMissing or ifExists rules from the config. Let's add a special mode where we avoid applying those rules. Coupled with --only-trailers, this gives us a reasonable parsing tool. Signed-off-by: Jeff King--- Documentation/git-interpret-trailers.txt | 5 + builtin/interpret-trailers.c | 7 +++ t/t7513-interpret-trailers.sh| 16 trailer.c| 9 + trailer.h| 1 + 5 files changed, 34 insertions(+), 4 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 295dffbd21..7cc43b0e3e 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -83,6 +83,11 @@ OPTIONS --only-trailers:: Output only the trailers, not any other parts of the input. +--only-input:: + Output only trailers that exist in the input; do not add any + from the command-line or by following configured `trailer.*` + rules. + CONFIGURATION VARIABLES --- diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index afb12c11bc..2d90e0e480 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -25,6 +25,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "in-place", _place, N_("edit files in place")), OPT_BOOL(0, "trim-empty", _empty, N_("trim empty trailers")), OPT_BOOL(0, "only-trailers", _trailers, N_("output only the trailers")), + OPT_BOOL(0, "only-input", _input, N_("do not apply config rules")), OPT_STRING_LIST(0, "trailer", , N_("trailer"), N_("trailer(s) to add")), OPT_END() @@ -33,6 +34,12 @@ int cmd_interpret_trailers(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, options, git_interpret_trailers_usage, 0); + if (opts.only_input && trailers.nr) + usage_msg_opt( + _("--trailer with --only-input does not make sense"), + git_interpret_trailers_usage, + options); + if (argc) { int i; for (i = 0; i < argc; i++) diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 90d30037b7..94b6c52473 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -1314,4 +1314,20 @@ test_expect_success 'only-trailers omits non-trailer in middle of block' ' test_cmp expected actual ' +test_expect_success 'only input' ' + git config trailer.sign.command "echo config-value" && + cat >expected <<-\EOF && + existing: existing-value + EOF + git interpret-trailers \ + --only-trailers --only-input >actual <<-\EOF && + my subject + + my body + + existing: existing-value + EOF + test_cmp expected actual +' + test_done diff --git a/trailer.c b/trailer.c index 706351fc98..2b5269183a 100644 --- a/trailer.c +++ b/trailer.c @@ -976,7 +976,6 @@ void process_trailers(const char *file, struct string_list *trailers) { LIST_HEAD(head); - LIST_HEAD(arg_head); struct strbuf sb = STRBUF_INIT; int trailer_end; FILE *outfile = stdout; @@ -992,9 +991,11 @@ void process_trailers(const char *file, trailer_end = process_input_file(opts->only_trailers ? NULL : outfile, sb.buf, ); - process_command_line_args(_head, trailers); - - process_trailers_lists(, _head); + if (!opts->only_input) { + LIST_HEAD(arg_head); + process_command_line_args(_head, trailers); + process_trailers_lists(, _head); + } print_all(outfile, , opts); diff --git a/trailer.h b/trailer.h index 3cf35ced00..76c3b571bf 100644 --- a/trailer.h +++ b/trailer.h @@ -26,6 +26,7 @@ struct process_trailer_options { int in_place; int trim_empty; int only_trailers; + int only_input; }; #define PROCESS_TRAILER_OPTIONS_INIT {0} -- 2.14.0.614.g0beb26d5e9
Re: [PATCH 0/5] make interpret-trailers useful for parsing
On Thu, Aug 10, 2017 at 04:02:46AM -0400, Jeff King wrote: > On Wed, Aug 09, 2017 at 08:21:47AM -0400, Jeff King wrote: > > > This series teaches interpret-trailers to parse and output just the > > trailers. So now you can do: > > > > $ git log --format=%B -1 8d44797cc91231cd44955279040dc4a1ee0a797f | > > git interpret-trailers --parse > > Signed-off-by: Hartmut Henkel> > Helped-by: Stefan Beller > > Signed-off-by: Ralf Thielow > > Acked-by: Matthias Rüster > > And here's a v2 that addresses all of the comments except one: Stefan > suggested that --only-existing wasn't a great name. I agree, but I like > everything else less. Here's a v3 that takes care of that (renaming it to --only-input). It's otherwise the same as v2, but since the name-change ripples through the remaining patches, I wanted to get v3 in front of people sooner rather than later. [1/5]: trailer: put process_trailers() options into a struct [2/5]: interpret-trailers: add an option to show only the trailers [3/5]: interpret-trailers: add an option to show only existing trailers [4/5]: interpret-trailers: add an option to normalize output [5/5]: interpret-trailers: add --parse convenience option Documentation/git-interpret-trailers.txt | 34 +++--- builtin/interpret-trailers.c | 34 +++--- t/t7513-interpret-trailers.sh| 76 trailer.c| 68 ++-- trailer.h| 13 +- 5 files changed, 196 insertions(+), 29 deletions(-)
[ANNOUNCE] Git v2.14.1, v2.13.5, and others
The latest maintenance release Git v2.14.1 is now available at the usual places, together with releases for older maintenance track for the same issue: v2.7.6, v2.8.6, v2.9.5, v2.10.4, v2.11.3, v2.12.4, and v2.13.5. These contain a security fix for CVE-2017-1000117, and are released in coordination with Subversion and Mercurial that share a similar issue. CVE-2017-9800 and CVE-2017-1000116 are assigned to these systems, respectively, for issues similar to it that are now addressed in their part of this coordinated release. The tarballs are found at: https://www.kernel.org/pub/software/scm/git/ The following public repositories all have a copy of these tags: url = https://kernel.googlesource.com/pub/scm/git/git url = git://repo.or.cz/alt-git.git url = https://github.com/gitster/git A malicious third-party can give a crafted "ssh://..." URL to an unsuspecting victim, and an attempt to visit the URL can result in any program that exists on the victim's machine being executed. Such a URL could be placed in the .gitmodules file of a malicious project, and an unsuspecting victim could be tricked into running "git clone --recurse-submodules" to trigger the vulnerability. Credits to find and fix the issue go to Brian Neel at GitLab, Joern Schneeweisz of Recurity Labs and Jeff King at GitHub. * A "ssh://..." URL can result in a "ssh" command line with a hostname that begins with a dash "-", which would cause the "ssh" command to instead (mis)treat it as an option. This is now prevented by forbidding such a hostname (which should not impact any real-world usage). * Similarly, when GIT_PROXY_COMMAND is configured, the command is run with host and port that are parsed out from "ssh://..." URL; a poorly written GIT_PROXY_COMMAND could be tricked into treating a string that begins with a dash "-" as an option. This is now prevented by forbidding such a hostname and port number (again, which should not impact any real-world usage). * In the same spirit, a repository name that begins with a dash "-" is also forbidden now.
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Thu, Aug 10, 2017 at 10:33 AM, Jeff Kingwrote: > On Thu, Aug 10, 2017 at 10:27:19AM -0700, Stefan Beller wrote: > >> > I'm not fond of that, as it's vague about which exact trailers we're >> > talking about. I also thought of something like --verbatim, but I'd >> > worry that would seem to conflict with --normalize. >> > >> > I dunno. All of the names seem not quite descriptive enough to me. >> >> I meant 'exact' as in 'exactly from the patch/commit, no external >> influence such as config', so maybe '--from-patch' or '--from-commit' >> (which says the same as --no-config just the other way round. >> Having --no- in config options as the standard is a UX disaster >> IMHO as then we have to forbid the --no-no-X or reintroduce X >> and flip the default) > > Yes, that was definitely the other reason I didn't want to call it > "--no-config". :) > > It's not always from a patch or commit. The most accurate along those > lines is "--from-input". > >> Maybe --genuine ? > > But in the greater context I think that's vague again; we don't know > which part of the command's operation is "genuine". The input of course. ;) --genuine-input. > > Perhaps "--exact-input" hits all of those. Or maybe "--only-input" to > match the other "--only". > > I think I like that last one the best. It makes it clear that we are > looking just at the input, and not anything else. Which is exactly what > the feature does. Makes sense to me, Thanks, Stefan
Re: Not understanding with git wants to copy one file to another
On Thu, Aug 10, 2017 at 10:03 AM, Harry Putnamwrote: > I ran into a line in git commit ouput I had not see before > > #copied: d0/etc/hosts -> misc/old-readerHOSTvcs-files/etc/hosts > > So googling I learned that this might happen if git thinks the two > files are the same. > > I was pretty sure they were not the same so checked them> > > > > diff d0/etc/host misc/old-readerHOSTvcs-files/etc/hosts > > The output is a bit long but shows them being quite different. > > Some 2 dozen or so lines that dramatically differ. > > Here are two that are at least kind of similar but would never be seen > as the same: > > < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 >> 192.168.1.43m2.local.lanm2 # win 7 > > Not to mention they are quite different lines as well. > > So what is going on and what should I be looking at? The diff machinery has a threshold for when it assumes a copy/move of a file. (e.g. "A file is assumed copied when at least 55% of lines are equal") https://git-scm.com/docs/git-diff See -C and -M option. git-status seems to use this machinery as well, but does not expose the options?
Re: [PATCH 3/4] http: drop support for curl < 7.19.4
On Thu, Aug 10, 2017 at 02:36:41PM +0200, Mischa POSLAWSKY wrote: > Jeff King wrote: > > -#if LIBCURL_VERSION_NUM >= 0x071301 > > curl_easy_setopt(result, CURLOPT_POSTREDIR, CURL_REDIR_POST_ALL); > > -#elif LIBCURL_VERSION_NUM >= 0x071101 > > curl_easy_setopt(result, CURLOPT_POST301, 1); > > -#endif > > This seems to be an unintended behavioural change: the second condition > wouldn't have applied previously and overrides the first option > (equivalent to CURLOPT_POSTREDIR = CURL_REDIR_POST_301). Thanks, you're right. I'll fix it in my re-roll. -Peff
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Thu, Aug 10, 2017 at 10:27:19AM -0700, Stefan Beller wrote: > > I'm not fond of that, as it's vague about which exact trailers we're > > talking about. I also thought of something like --verbatim, but I'd > > worry that would seem to conflict with --normalize. > > > > I dunno. All of the names seem not quite descriptive enough to me. > > I meant 'exact' as in 'exactly from the patch/commit, no external > influence such as config', so maybe '--from-patch' or '--from-commit' > (which says the same as --no-config just the other way round. > Having --no- in config options as the standard is a UX disaster > IMHO as then we have to forbid the --no-no-X or reintroduce X > and flip the default) Yes, that was definitely the other reason I didn't want to call it "--no-config". :) It's not always from a patch or commit. The most accurate along those lines is "--from-input". > Maybe --genuine ? But in the greater context I think that's vague again; we don't know which part of the command's operation is "genuine". Perhaps "--exact-input" hits all of those. Or maybe "--only-input" to match the other "--only". I think I like that last one the best. It makes it clear that we are looking just at the input, and not anything else. Which is exactly what the feature does. -Peff
Re: [RFC] clang-format: outline the git project's coding style
Brandon Williamswrites: > On 08/10, Junio C Hamano wrote: > >> I vaguely recall that there was a discussion to have SubmitGit wait >> for success from Travis CI; if that is already in place, then I can >> sort of see how it would help individual contributors to have the >> style checker in that pipeline as well. >> >> I have a mixed feelings about "fixing" styles automatically, though. > > I still think we are far away from a world where we can fix style > automatically. If we do want to keep pursuing this there are a number > steps we'd want to take first. > > 1. Settle on a concrete style and document it using a formatter's rules >(in say a .clang-format file). This style would most likely need to >be tuned a little bit, at least the 'Penalty' configuration would >need to be tuned which (as far as I understand it) is used to >determine which rule to break first to ensure a line isn't too long. Yes. I think this is what you started to get the ball rolling. Together with what checkpatch.pl already diagnoses, I think we can get a guideline that is more or less reasonable. > 2. Start getting contributors to use the tool to format their patches. >This would include having some script or hook that a contributor >could run to only format the sections of code that they touched. This, too. Running checkpatch.pl (possibly combined with a bit of tweaking it to match our needs) already catches many of the issues, so a tool with a similar interface would be easy to use, I would imagine. > 3. Slowly the code base would begin to have a uniform style. At >some point we may want to then reformat the remaining sections of the >code base. At this point we could have some automated bot that fixes >style. I suspect I am discussing this based on a different assumption. I think the primary goal of this effort is to make it easier to cleanse the new patches that appear on the list of trivial style issues, so that contributors and reviewers do not have to spend bandwidth and brain cycles during the review. And I have been assuming that we can do so even without waiting for a "tree wide" code churn on existing code to complete.
Re: [PATCH 3/5] interpret-trailers: add an option to show only existing trailers
On Thu, Aug 10, 2017 at 12:32 AM, Jeff Kingwrote: > On Wed, Aug 09, 2017 at 11:18:19AM -0700, Stefan Beller wrote: > >> On Wed, Aug 9, 2017 at 5:24 AM, Jeff King wrote: >> > It can be useful to invoke interpret-trailers for the >> > primary purpose of parsing existing trailers. But in that >> > case, we don't want to apply existing ifMissing or ifExists >> > rules from the config. Let's add a special mode where we >> > avoid applying those rules. Coupled with --only-trailers, >> > this gives us a reasonable parsing tool. >> >> I have the impression that the name is slightly misleading >> because 'only' just reduces the set. it does not enhance it. >> (Do we have a configuration that says "remove this trailer >> anytime"?) > > No, I think you can only add trailers via ifExists or ifMissing. > I actually called this --no-config originally, because to me it meant > "do not apply config". But the processing applies also to --trailer > arguments no the command line, which is how I ended up with > --only-existing. > >> So maybe this is rather worded as 'exact-trailers' ? > > I'm not fond of that, as it's vague about which exact trailers we're > talking about. I also thought of something like --verbatim, but I'd > worry that would seem to conflict with --normalize. > > I dunno. All of the names seem not quite descriptive enough to me. I meant 'exact' as in 'exactly from the patch/commit, no external influence such as config', so maybe '--from-patch' or '--from-commit' (which says the same as --no-config just the other way round. Having --no- in config options as the standard is a UX disaster IMHO as then we have to forbid the --no-no-X or reintroduce X and flip the default) Maybe --genuine ? > > -Peff
Re: [PATCH v2 00/25] Move exported packfile funcs to its own file
On Tue, Aug 8, 2017 at 6:22 PM, Jonathan Tanwrote: > Here is the complete patch set. I have only moved the exported functions > that operate with packfiles and their static helpers - for example, > static functions like freshen_packed_object() that are used only by > non-pack-specific functions are not moved. > > In the end, 3 functions needed to be made global. They are > find_pack_entry(), mark_bad_packed_object(), and has_packed_and_bad(). > > Of the 3, find_pack_entry() is probably legitimately promoted. But I > think that the latter two functions needing to be accessed from > sha1_file.c points to a design that could be improved - they are only > used when packed_object_info() detects corruption, and used for marking > as bad and printing messages to the user respectively, which > packed_object_info() should probably do itself. But I have not made this > change in this patch set. > > (Other than the 3 functions above, there are some variables and > functions that are temporarily made global, but reduced back to static > when the wide scope is no longer needed.) I read through the patches yesterday and had no comment. Thanks, Stefan
Re: [RFC] clang-format: outline the git project's coding style
On 08/10, Junio C Hamano wrote: > Johannes Schindelinwrites: > > > On Wed, 9 Aug 2017, Stefan Beller wrote: > > > >> > I am sure that something even better will be possible: a Continuous > >> > "Integration" that fixes the coding style automatically by using > >> > `filter-branch` (avoiding the merge conflicts that would arise if > >> > `rebase -i` was used). > >> > >> I do not quite follow. Is that to be used by Junio while integrating > >> branches? > > > > I was more thinking about a bot on GitHub. "Code cleanup as a service". > > I vaguely recall that there was a discussion to have SubmitGit wait > for success from Travis CI; if that is already in place, then I can > sort of see how it would help individual contributors to have the > style checker in that pipeline as well. > > I have a mixed feelings about "fixing" styles automatically, though. > I still think we are far away from a world where we can fix style automatically. If we do want to keep pursuing this there are a number steps we'd want to take first. 1. Settle on a concrete style and document it using a formatter's rules (in say a .clang-format file). This style would most likely need to be tuned a little bit, at least the 'Penalty' configuration would need to be tuned which (as far as I understand it) is used to determine which rule to break first to ensure a line isn't too long. 2. Start getting contributors to use the tool to format their patches. This would include having some script or hook that a contributor could run to only format the sections of code that they touched. 3. Slowly the code base would begin to have a uniform style. At some point we may want to then reformat the remaining sections of the code base. At this point we could have some automated bot that fixes style. I'm sure I missed a step in there somewhere though. -- Brandon Williams
Not understanding with git wants to copy one file to another
I ran into a line in git commit ouput I had not see before #copied: d0/etc/hosts -> misc/old-readerHOSTvcs-files/etc/hosts So googling I learned that this might happen if git thinks the two files are the same. I was pretty sure they were not the same so checked them> diff d0/etc/host misc/old-readerHOSTvcs-files/etc/hosts The output is a bit long but shows them being quite different. Some 2 dozen or so lines that dramatically differ. Here are two that are at least kind of similar but would never be seen as the same: < 192.168.1.43 m2.local.lan m2 # 00-90-F5-A1-F9-E5 > 192.168.1.43m2.local.lanm2 # win 7 Not to mention they are quite different lines as well. So what is going on and what should I be looking at?
[PATCH] merge: use skip_prefix()
Get rid of a magic string length constant by using skip_prefix() instead of starts_with(). Signed-off-by: Rene Scharfe--- builtin/merge.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/merge.c b/builtin/merge.c index 900bafdb45..4facb6fdd0 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1117,8 +1117,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix) * current branch. */ branch = branch_to_free = resolve_refdup("HEAD", 0, head_oid.hash, NULL); - if (branch && starts_with(branch, "refs/heads/")) - branch += 11; + if (branch) + skip_prefix(branch, "refs/heads/", ); if (!branch || is_null_oid(_oid)) head_commit = NULL; else -- 2.14.0
Re: [RFC] clang-format: outline the git project's coding style
Johannes Schindelinwrites: > On Wed, 9 Aug 2017, Stefan Beller wrote: > >> > I am sure that something even better will be possible: a Continuous >> > "Integration" that fixes the coding style automatically by using >> > `filter-branch` (avoiding the merge conflicts that would arise if >> > `rebase -i` was used). >> >> I do not quite follow. Is that to be used by Junio while integrating >> branches? > > I was more thinking about a bot on GitHub. "Code cleanup as a service". I vaguely recall that there was a discussion to have SubmitGit wait for success from Travis CI; if that is already in place, then I can sort of see how it would help individual contributors to have the style checker in that pipeline as well. I have a mixed feelings about "fixing" styles automatically, though.
Re: [PATCH] apply: remove prefix_length member from apply_state
On Wed, Aug 9, 2017 at 5:54 PM, René Scharfewrote: > Use a NULL-and-NUL check to see if we have a prefix and consistently use > C string functions on it instead of storing its length in a member of > struct apply_state. This avoids strlen() calls and simplifies the code. This looks like a good idea. > @@ -2088,10 +2087,9 @@ static int use_patch(struct apply_state *state, struct > patch *p) > int i; > > /* Paths outside are not touched regardless of "--include" */ > - if (0 < state->prefix_length) { > - int pathlen = strlen(pathname); > - if (pathlen <= state->prefix_length || > - memcmp(state->prefix, pathname, state->prefix_length)) > + if (state->prefix && *state->prefix) { > + const char *rest; > + if (!skip_prefix(pathname, state->prefix, ) || !*rest) > return 0; > } Yeah, or maybe declare "const char *rest;" just after "int i;" and then use: if (state->prefix && *state->prefix && (!skip_prefix(pathname, state->prefix, ) || !*rest)) return 0;
Re: [PATCH V2 2/2] Convert size datatype to size_t
Hi Martin, On Thu, 10 Aug 2017, Martin Koegler wrote: > From: Martin Koegler> > It changes the signature of the core object access function > including any other functions to assure a clean compile if > sizeof(size_t) != sizeof(unsigned long). > > Signed-off-by: Martin Koegler > --- > For next. As this touches core functions, it will likely produce > conflicts with other changes. Please provide the commit you want > to rebase the patch on and I'll produce a V3. > > Includes changes from Johannes Schindelin. Thank you so much! Dscho
Re: fatal: Out of memory, getdelim failed under NFS mounts
Thank you René! comments/answers embedded below On Thu, 10 Aug 2017, René Scharfe wrote: > Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko: > > More context (may be different issue(s)) could be found at > > http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/ > > but currently I am consistently reproducing it while running > > git (1:2.11.0-3 debian stretch build) within debian stretch singularity > > environment [1]. > > External system is Centos 6.9, and git 1.7.1 (and installed in modules > > 2.0.4) do not show similar buggy behavior. > > NFS mounted partitions are bind mounted inside the sinularity space and > > when I try to do some git operations, I get that error inconsistently , e.g. > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > > master > > fatal: Out of memory, getdelim failed > > error: git://github.com/datalad/datalad did not send all necessary > > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > > master > > fatal: Out of memory, getdelim failed > > error: git://github.com/datalad/datalad did not send all necessary > > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > > master > > From git://github.com/datalad/datalad > > * branch master -> FETCH_HEAD > > fatal: Out of memory, getdelim failed > > and some times it succeeds. So it smells that some race condition > > somewhere...? > I doubt the type of file system matters. So far it has been a very consistent indicator. I did not manage to get this error while performing the same operation under /tmp (bind to local mounted drive), where it also feels going faster (again suggesting that original issue is some kind of a race) > The questions are: How much > main memory do you have, what is git trying to cram into it, is there > a way to reduce the memory footprint or do you need to add more RAM? > ... reordered ... > "free" and "ulimit -a" can help you find out how much memory you can > use. I think those aren't the reason: yhalchen@discovery:/mnt/scratch/yoh/datalad$ free -h totalusedfree shared buff/cache available Mem: 126G2.5G 90G652K 33G123G Swap: 127G1.7M127G yhalchen@discovery:/mnt/scratch/yoh/datalad$ ulimit unlimited > > any recommendations on how to pin point the "offender"? ;) > Running "GIT_TRACE=1 git pull --ff-only origin master" would be a > good start, I think, to find out which of the different activities > that pull is doing causes the out-of-memory error. samples of bad, and then good runs (from eyeballing -- the same until error message): yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_bad.log 14:05:25.782270 git.c:371 trace: built-in: git 'pull' '--ff-only' 'origin' 'master' 14:05:25.795036 run-command.c:350 trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master' 14:05:25.795332 exec_cmd.c:116 trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master' 14:05:25.797212 git.c:371 trace: built-in: git 'fetch' '--update-head-ok' 'origin' 'master' 14:05:25.904088 run-command.c:350 trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' 14:05:26.085954 run-command.c:350 trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103' 14:05:26.086333 exec_cmd.c:116 trace: exec: 'git' 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103' 14:05:26.088382 git.c:371 trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 11652 on discovery.hpcc.dartmouth.edu' '--pack_header=2,103' 14:05:26.133326 run-command.c:350 trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' 14:05:26.133688 exec_cmd.c:116 trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' 14:05:26.135493 git.c:371 trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' fatal: Out of memory, getdelim failed error: git://github.com/datalad/datalad did not send all necessary objects 14:05:26.138838 run-command.c:350 trace: run_command: 'gc' '--auto' 14:05:26.139131 exec_cmd.c:116 trace: exec: 'git' 'gc' '--auto' 14:05:26.141108 git.c:371 trace: built-in: git 'gc' '--auto' yhalchen@discovery:/mnt/scratch/yoh$ cat git_trace_good.log 14:05:37.851862 git.c:371 trace: built-in: git 'pull' '--ff-only' 'origin' 'master' 14:05:37.854250 run-command.c:350 trace: run_command: 'fetch' '--update-head-ok' 'origin' 'master' 14:05:37.854527 exec_cmd.c:116 trace: exec: 'git' 'fetch' '--update-head-ok' 'origin' 'master' 14:05:37.856389
Re: fatal: Out of memory, getdelim failed under NFS mounts
Am 09.08.2017 um 19:39 schrieb Yaroslav Halchenko: > More context (may be different issue(s)) could be found at > http://git-annex.branchable.com/forum/git-annex_add_out_of_memory_error/ > but currently I am consistently reproducing it while running > git (1:2.11.0-3 debian stretch build) within debian stretch singularity > environment [1]. > > External system is Centos 6.9, and git 1.7.1 (and installed in modules > 2.0.4) do not show similar buggy behavior. > > NFS mounted partitions are bind mounted inside the sinularity space and > when I try to do some git operations, I get that error inconsistently , e.g. > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > master > fatal: Out of memory, getdelim failed > error: git://github.com/datalad/datalad did not send all necessary > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > master > fatal: Out of memory, getdelim failed > error: git://github.com/datalad/datalad did not send all necessary > objects > > yhalchen@discovery:/mnt/scratch/yoh/datalad$ git pull --ff-only origin > master > From git://github.com/datalad/datalad >* branch master -> FETCH_HEAD > fatal: Out of memory, getdelim failed > > and some times it succeeds. So it smells that some race condition > somewhere...? I doubt the type of file system matters. The questions are: How much main memory do you have, what is git trying to cram into it, is there a way to reduce the memory footprint or do you need to add more RAM? > any recommendations on how to pin point the "offender"? ;) Running "GIT_TRACE=1 git pull --ff-only origin master" would be a good start, I think, to find out which of the different activities that pull is doing causes the out-of-memory error. "free" and "ulimit -a" can help you find out how much memory you can use. Also: What does "wc -L .git/FETCH_HEAD .git/packed-refs" report? getdelim() is used mostly to read lines from files like these and in the admittedly unlikely case that they are *really* long such an error would be expected. René