Re: [PATCH v3] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 2:34 AM, Jeff King p...@peff.net wrote: Below is a another iteration on the patch. The actual code changes are the same as the strbuf one, but the tests take care to avoid assuming the filesystem can handle such a long path. Testing on Windows and OS X is appreciated. All three new tests fail on OS X. Thus far brief examination of the first failing tests shows that 'expect' and 'actual' differ: expect: long master actual: master Note that in addition to cheating on the creation of the long ref, I had to tweak the fetch test a little to avoid writing the loose ref there, too. That makes the test a little weaker (it is not as end to end, checking that all parts of fetch can handle it), but it does check the thing we are changing here, that the protocol code can handle it. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
On Tue, Dec 09, 2014 at 02:23:37PM -0800, Jonathan Nieder wrote: Jeff King wrote: Elsewhere I mentioned a tool to extract comments and format them. But do people actually care about the formatting step? If formatting means convert to a straightforward text document that I can read through, then I do. I meant run asciidoc in api-*. I am confused by your response. Here it implies to me that you asciidoc them into a text format to make it easier to read. But below you say I also view [the api-* files] as text, which implies you just look at them in your editor. Does anybody asciidoc the technical/api-* files? We did not even support building them until sometime in 2012. Personally, I only ever view them as text. I also view them as text. I tend to find it easier to read the technical/api-* files than headers. That might be for the same reason that some other people prefer header files --- a Documentation/technical/ file tends to be written in one chunk together and ends up saying exactly what I need to know about the calling sequence in a coherent way, while header files tend to evolve over time to match the implementation without maintaining that organization and usability. We could probably do a better job of keeping our header files neat and well-ordered. And perhaps would so if they had a coherent narrative in the first place. My feeling is that it is at least as much work to keep an api-*.txt file neat as it would be to keep the header file neat (generally more, because of the duplication of effort). The one exception might be that ordering in the header file may require some forward declarations. I'd guess that to be pretty rare, though (and certainly I did not run into it when converting the strbuf documentation below). I suspect a simple tool to extract the comments and produce something like the technical/api-* files would help, since then when someone writes a patch, they could try the tool and see if the result is still easy to read. That seems like wishful thinking to me. Some subset of developers will be happy reading the documentation in the header file, and will not commonly run the tool. Therefore they will also not bother to examine the output of the tool when making a change (either because they forget, or because it is simply a hassle that they do not care about). Anyway, the patch I wrote was an attempt at a minimal fix (and an attempt at getting out of a conversation that seemed to be moving toward bikeshedding). If someone wants to move the documentation from api-strbuf.txt to strbuf.h, I won't object, since I hope that a simple tool like described above isn't too far away. I do not mind a partial or minimal fix. This conversion is tedious, and it does not have to be done all at once. But IMHO, your patch is actively making things worse. The api-* files have slowly grown out of date over the years, and I believe that is because they are too far removed from the code people are actually touching. I've been making an active effort recently to document function interfaces in comments, and I feel like the presence of other comments has encouraged me to do so. Here's strbuf.h, with the data from api-strbuf.txt moved into it. I followed the ordering of api-strbuf.txt, since that has a more coherent grouping of functions (but it does make the latter parts of the diff messier, as we have to move function declarations around). I left the bottom part of strbuf.h, where there are several undocumented functions, largely untouched. As further work, it would make sense to document those functions and also fit them into the right spot amidst the already-grouped functions. This has almost every bit of text from api-strbuf.txt. I omitted a few descriptions where the documentation already in strbuf.h was actually _better_. I did remove the function names, sine they are redundant with the declarations below. E.g., rather than: /** * `foo`:: * Do something. */ void foo(int x); just: /** * Do something. */ void foo(int x); The declaration is nearby and has strictly more information, and the less boilerplate the better for convincing people to actually write something. An extraction tool can pretty-print the function name if it wants. I did drop some asciidoc-specific formatting from the main text (e.g., + lines for connecting blocks), which means that extracting the result and running it through asciidoc won't work. I think it makes the actual text much more readable, though. But we could do it the other way if people really want to asciidoc it. --- diff --git a/strbuf.h b/strbuf.h index 652b6c4..6143680 100644 --- a/strbuf.h +++ b/strbuf.h @@ -1,22 +1,105 @@ #ifndef STRBUF_H #define STRBUF_H -/* See Documentation/technical/api-strbuf.txt */ - -extern char strbuf_slopbuf[]; +/** + * strbuf's are meant to be used with all the usual C string and memory + * APIs. Given
Re: [PATCH] document string_list_clear
On Tue, Dec 09, 2014 at 03:18:28PM -0800, Junio C Hamano wrote: Jonathan Nieder jrnie...@gmail.com writes: Jeff King wrote: Elsewhere I mentioned a tool to extract comments and format them. But do people actually care about the formatting step? If formatting means convert to a straightforward text document that I can read through, then I do. If the result becomes unparsable by AsciiDoc(or), those who AsciiDoc'ified the api-*.txt may feel unhappy. I however strongly suspect that the primary consumer of these api-*.txt documents are consuming the text version, not AsciiDoc-to-html output. Yeah, that was my point. They were not even asciidoc'd for a long time, then one contributor, who does not otherwise work on the code, volunteered to asciidoc them. AFAIK there are no regular developers who wanted this feature (but I do not know for sure, which is why I asked). I do not _mind_ if they can be asciidoc'd, but if nobody does so, they are bound to develop formatting errors. And matching asciidoc syntax does come at a readability cost to the text. The documentation that was written with an explicit purpose to serve as documentation would explain how each pieces fit in the whole system much better than a list of pieces extracted from per-function comments, unless the header comment that serves as the source of generated document is written carefully. If we split the header files sanely (i.e., to match what would be one api-* file), then I had imagined each header file having a long free-form comment at the top giving the overview, followed by commented structures and functions, with possibly other free-form comments in the middle as necessary. That's what I did for the strbuf.h conversion I just sent. I am a bit hesitant to see us spending extra cycles either to reinvent doxygen (if we do our own) or working around quirks in third-party tools (if we decide to use existing ones). Me too. That's what I hoped that we could come up with a form whose in-header source is readable enough that people would use it. Then there is really no tool necessary. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] introduce git root
On Tue, Dec 9, 2014 at 7:17 PM, Junio C Hamano gits...@pobox.com wrote: Jeff King p...@peff.net writes: But the one place I do not agree is: I think sequence.editor and core.editor are better because: - they use the same syntax as the config variables, so they are easier to remember and to discover, and I really don't like using core.editor here, because it has the same name as a config variable, but it is _not_ the config variable. It happens to use the config variable as one of the inputs to its computation, but in many cases: git config core.editor and git var core.editor will produce different values. They are entirely different namespaces. Using the same syntax and name seems unnecessarily confusing to me. I think this is a valid concern. It is a tangent, the current definition of git_editor helper reads like this: git_editor() { if test -z ${GIT_EDITOR:+set} then GIT_EDITOR=$(git var GIT_EDITOR) || return $? fi eval $GIT_EDITOR '$@' } If git var editor were to compute a reasonable value from the various user settings, and because GIT_EDITOR is among the sources of user settings, I wonder if the surrounding if test -z then...fi should be there. The pager side seems to do (halfway) the right thing: git_pager() { if test -t 1 then GIT_PAGER=$(git var GIT_PAGER) else GIT_PAGER=cat fi : ${LESS=-FRX} : ${LV=-c} export LESS LV eval $GIT_PAGER '$@' } The initial test -t 1 is we do not page to non-terminal, but we ask git var to take care of PAGER/GIT_PAGER fallback/precedence. It is tempting to argue that we do not page to non-terminal should also be part of various user settings for git var to take into account and fold this if test -t 1...then...else...fi also into git var, but because a typical command line git var will be used on would look like this: GIT_PAGER=$(git var pager) eval $GIT_PAGER ... with the standard output stream of git var not connected to terminal, that would not fly very well, and I am not sure what should be done about it. This is another tangent that comes back to the original how to name them? question, but I wonder if a convention like this would work. * When asking for a feature (e.g. what editor to use), if there is a git-specific environment variable that can be set to override all other settings (e.g. $GIT_EDITOR trumps $EDITOR environment or core.editor configuration), git var can be queried with the name of that strong environment variable. * All other features without such a strong environment variables should not be spelled as if there is such an environment variable the user can set in order to avoid confusion. Does that mean that we would also have the following: - git var GIT_DIR - git var GIT_EXEC_PATH - git var GIT_HTML_PATH - git var GIT_WORK_TREE - git var GIT_PREFIX ... but not: - git var GIT_SHARED_INDEX_PATH - git var GIT_TOP_LEVEL - git var GIT_CDUP ? For the above 3 variables we would have: - git var shared-index-path - git var top-level - git var cdup And then what if we add the GIT_SHARED_INDEX_PATH env variable for example? Would we need to deprecate git var shared-index-path? We also would have: - git var GIT_EDITOR but: - git var web-browser (or git var web.browser like the config option?) And when we add a GIT_WEB_BROWSER or GIT_BROWSER env variable, we deprecate git var web-browser (or git var web.browser?) I doesn't look like user and future friendly to me, but maybe I misunderstood something. Thanks, Christian. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor push performance with large number of refs
On Wed, Dec 10, 2014 at 7:37 AM, brian m. carlson sand...@crustytoothpaste.net wrote: I have a repository that's just under 2 GiB in size and contains over 2 refs, with a copy of it on a server. Both sides are using Git 2.1.2. If I push a branch that contains a single commit, it takes about 15 seconds to push. However, if everything is up-to-date, it completes within 2 seconds. Notably, HTTPS performs the same as SSH. Most of the time is spent between the Pushing to remote machine and Counting objects, running git pack-objects: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress Unfortunately, -vvv doesn't provide any helpful output. I have some suspicions what's going on here, but no hard data. Where should I be looking to determine the bottleneck? Start with perf record, if this is on linux? -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
Jeff King wrote: Jeff King wrote: We could probably do a better job of keeping our header files neat and well-ordered. And perhaps would so if they had a coherent narrative in the first place. The example I was looking at before was refs.h. It is still a mess. Hopefully strbuf.h will work better. [...] On Tue, Dec 09, 2014 at 02:23:37PM -0800, Jonathan Nieder wrote: I suspect a simple tool to extract the comments and produce something like the technical/api-* files would help, since then when someone writes a patch, they could try the tool and see if the result is still easy to read. That seems like wishful thinking to me. Some subset of developers will be happy reading the documentation in the header file, and will not commonly run the tool. Therefore they will also not bother to examine the output of the tool when making a change (either because they forget, or because it is simply a hassle that they do not care about). That is okay as long as enough people extract documentation to ensure the result is sane. It is similar to what happens with Documentation/*.txt. A subset of developers just work with the source and make local changes without looking at the big picture (and sometimes buggy markup or documentation organized in a confusing way results). But that doesn't change much. When people forget to look at the manpage and write text that is awkward in context or markup errors, one of a few things happens: * a reviewer notices. The patch is fixed, and the patch author develops better habits for next time. * someone using the documentation notices later, and it gets fixed then, which is still fine. Both outcomes are much, much better than documentation that never gets organized. [...] The api-* files have slowly grown out of date over the years, and I believe that is because they are too far removed from the code people are actually touching. I don't think they ever were very comprehensive or up to date. As far as I understand, the api-* files are intended to be usually out of date. Their main purpose is to explain the gist of how to use the API. They were usually written way after the API was introduced (so they are behind already). They are clearly written and as long as they mostly match the code, that is enough for them to be useful and educational. Then every few years or so, someone updates the file with the latest API changes, still with an eye to overall organization and readability of the document. In other words, they prioritize clarity over currency. Of course if it's possible to have clear documentation that also matches the current code, that would be even better. I have hope that moving the documentation to the header files will get us there if we do it right. I did drop some asciidoc-specific formatting from the main text (e.g., + lines for connecting blocks), which means that extracting the result and running it through asciidoc won't work. This unfortunately lost some structure (e.g., which paragraphs were part of the same bulleted list). It would be more common to use a hanging indent: - The `buf` member is never NULL, so ... string operations ... by `strbuf_init()` ... Do not assume anything about what `buf` really is ... ... - ... [...] I think it makes the actual text much more readable, though. But we could do it the other way if people really want to asciidoc it. I also don't mind losing the asciidoc markup, especially given that it would be a pain to reconstruct and render. My reaction at first glance is that this is trying to do something good but the current implementation is less readable than Documentation/technical/api-strbuf.txt. Am I looking at a work in progress, or is this meant to be ready for application? Which aspects would you be interested in comments on? Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 3:36 AM, Eric Sunshine sunsh...@sunshineco.com wrote: On Wed, Dec 10, 2014 at 2:34 AM, Jeff King p...@peff.net wrote: Below is a another iteration on the patch. The actual code changes are the same as the strbuf one, but the tests take care to avoid assuming the filesystem can handle such a long path. Testing on Windows and OS X is appreciated. All three new tests fail on OS X. Thus far brief examination of the first failing tests shows that 'expect' and 'actual' differ: expect: long master actual: master The failure manifests as soon as the refname hits length 1024, at which point for-each-ref stops reporting it. MAX_PATH on OS X is 1024, so some part of the machinery invoked by for-each-ref likely is rejecting refnames longer than that (even when coming from packed-refs). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 03:36:31AM -0500, Eric Sunshine wrote: On Wed, Dec 10, 2014 at 2:34 AM, Jeff King p...@peff.net wrote: Below is a another iteration on the patch. The actual code changes are the same as the strbuf one, but the tests take care to avoid assuming the filesystem can handle such a long path. Testing on Windows and OS X is appreciated. All three new tests fail on OS X. Thus far brief examination of the first failing tests shows that 'expect' and 'actual' differ: expect: long master actual: master Ugh. It looks like the packed-refs reader uses a PATH_MAX-sized buffer to read each line, and simply omits the ref. That may be something we want to work on, but it's a separate topic. I think there are platforms whose PATH_MAX is much smaller than what they can actually represent. So ideally we would lift the arbitrary PATH_MAX limitations inside git, and just let the OS complain when we exceed its limits. Even if we fix that, though, I think the push test would still fail on systems that have a true limit on the filesystem. Writing to a ref requires taking a lock in the filesystem, which will involve creating the too-long path. I think there are simply some systems that will not support long refs well, and the tests need to be skipped there. So here's a re-roll that uses prerequisites. -Peff -- 8 -- Subject: pkt-line: allow writing of LARGE_PACKET_MAX buffers When we send out pkt-lines with refnames, we use a static 1000-byte buffer. This means that the maximum size of a ref over the git protocol is around 950 bytes (the exact size depends on the protocol line being written, but figure on a sha1 plus some boilerplate). This is enough for any sane workflow, but occasionally odd things happen (e.g., a bug may create a ref foo/foo/foo/... accidentally). With the current code, you cannot even use push to delete such a ref from a remote. Let's switch to using a strbuf, with a hard-limit of LARGE_PACKET_MAX (which is specified by the protocol). This matches the size of the readers, as of 74543a0 (pkt-line: provide a LARGE_PACKET_MAX static buffer, 2013-02-20). Versions of git older than that will complain about our large packets, but it's really no worse than the current behavior. Right now the sender barfs with impossibly long line trying to send the packet, and afterwards the reader will barf with protocol error: bad line length %d, which is arguably better anyway. Note that we're not really _solving_ the problem here, but just bumping the limits. In theory, the length of a ref is unbounded, and pkt-line can only represent sizes up to 65531 bytes. So we are just bumping the limit, not removing it. But hopefully 64K should be enough for anyone. As a bonus, by using a strbuf for the formatting we can eliminate an unnecessary copy in format_buf_write. Signed-off-by: Jeff King p...@peff.net --- pkt-line.c| 37 +++-- t/t5527-fetch-odd-refs.sh | 33 + 2 files changed, 52 insertions(+), 18 deletions(-) diff --git a/pkt-line.c b/pkt-line.c index 8bc89b1..187a229 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -64,44 +64,45 @@ void packet_buf_flush(struct strbuf *buf) } #define hex(a) (hexchar[(a) 15]) -static char buffer[1000]; -static unsigned format_packet(const char *fmt, va_list args) +static void format_packet(struct strbuf *out, const char *fmt, va_list args) { static char hexchar[] = 0123456789abcdef; - unsigned n; + size_t orig_len, n; - n = vsnprintf(buffer + 4, sizeof(buffer) - 4, fmt, args); - if (n = sizeof(buffer)-4) + orig_len = out-len; + strbuf_addstr(out, ); + strbuf_vaddf(out, fmt, args); + n = out-len - orig_len; + + if (n LARGE_PACKET_MAX) die(protocol error: impossibly long line); - n += 4; - buffer[0] = hex(n 12); - buffer[1] = hex(n 8); - buffer[2] = hex(n 4); - buffer[3] = hex(n); - packet_trace(buffer+4, n-4, 1); - return n; + + out-buf[orig_len + 0] = hex(n 12); + out-buf[orig_len + 1] = hex(n 8); + out-buf[orig_len + 2] = hex(n 4); + out-buf[orig_len + 3] = hex(n); + packet_trace(out-buf + orig_len + 4, n - 4, 1); } void packet_write(int fd, const char *fmt, ...) { + static struct strbuf buf = STRBUF_INIT; va_list args; - unsigned n; + strbuf_reset(buf); va_start(args, fmt); - n = format_packet(fmt, args); + format_packet(buf, fmt, args); va_end(args); - write_or_die(fd, buffer, n); + write_or_die(fd, buf.buf, buf.len); } void packet_buf_write(struct strbuf *buf, const char *fmt, ...) { va_list args; - unsigned n; va_start(args, fmt); - n = format_packet(fmt, args); + format_packet(buf, fmt, args); va_end(args); - strbuf_add(buf, buffer, n); } static int
Re: [PATCH v3] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 4:42 AM, Eric Sunshine sunsh...@sunshineco.com wrote: On Wed, Dec 10, 2014 at 3:36 AM, Eric Sunshine sunsh...@sunshineco.com wrote: On Wed, Dec 10, 2014 at 2:34 AM, Jeff King p...@peff.net wrote: Below is a another iteration on the patch. The actual code changes are the same as the strbuf one, but the tests take care to avoid assuming the filesystem can handle such a long path. Testing on Windows and OS X is appreciated. All three new tests fail on OS X. Thus far brief examination of the first failing tests shows that 'expect' and 'actual' differ: expect: long master actual: master The failure manifests as soon as the refname hits length 1024, at which point for-each-ref stops reporting it. MAX_PATH on OS X is 1024, so some part of the machinery invoked by for-each-ref likely is rejecting refnames longer than that (even when coming from packed-refs). Clarification: for-each-ref ignores the ref when the full line read from packed-refs hits length 1024 (not when the refname itself hits length 1024). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 04:49:38AM -0500, Eric Sunshine wrote: On Wed, Dec 10, 2014 at 4:42 AM, Eric Sunshine sunsh...@sunshineco.com wrote: On Wed, Dec 10, 2014 at 3:36 AM, Eric Sunshine sunsh...@sunshineco.com wrote: On Wed, Dec 10, 2014 at 2:34 AM, Jeff King p...@peff.net wrote: Below is a another iteration on the patch. The actual code changes are the same as the strbuf one, but the tests take care to avoid assuming the filesystem can handle such a long path. Testing on Windows and OS X is appreciated. All three new tests fail on OS X. Thus far brief examination of the first failing tests shows that 'expect' and 'actual' differ: expect: long master actual: master The failure manifests as soon as the refname hits length 1024, at which point for-each-ref stops reporting it. MAX_PATH on OS X is 1024, so some part of the machinery invoked by for-each-ref likely is rejecting refnames longer than that (even when coming from packed-refs). Clarification: for-each-ref ignores the ref when the full line read from packed-refs hits length 1024 (not when the refname itself hits length 1024). Yes, the problem is in read_packed_refs: char refline[PATH_MAX]; ... while (fgets(refline, sizeof(refline), f)) { ... } This could be trivially converted to strbuf_getwholeline, but I am not sure what else would break, or whether such a system would actually be _usable_ with such long refs (e.g., would it break the first time you Using fgets like this does shear lines, though. The next fgets call will see the second half of the line. I think we are saved from doing anything stupid by parse_ref_line, but it is mostly luck. So perhaps for that reason the trivial conversion to strbuf is worth it, even if it doesn't help any practical cases. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 4:47 AM, Jeff King p...@peff.net wrote: On Wed, Dec 10, 2014 at 03:36:31AM -0500, Eric Sunshine wrote: On Wed, Dec 10, 2014 at 2:34 AM, Jeff King p...@peff.net wrote: Below is a another iteration on the patch. The actual code changes are the same as the strbuf one, but the tests take care to avoid assuming the filesystem can handle such a long path. Testing on Windows and OS X is appreciated. All three new tests fail on OS X. Thus far brief examination of the first failing tests shows that 'expect' and 'actual' differ: expect: long master actual: master Ugh. It looks like the packed-refs reader uses a PATH_MAX-sized buffer to read each line, and simply omits the ref. That may be something we want to work on, but it's a separate topic. I think there are platforms whose PATH_MAX is much smaller than what they can actually represent. So ideally we would lift the arbitrary PATH_MAX limitations inside git, and just let the OS complain when we exceed its limits. Even if we fix that, though, I think the push test would still fail on systems that have a true limit on the filesystem. Writing to a ref requires taking a lock in the filesystem, which will involve creating the too-long path. I think there are simply some systems that will not support long refs well, and the tests need to be skipped there. So here's a re-roll that uses prerequisites. On OS X, this version correctly skips the two final tests as intended. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] read_packed_refs: use skip_prefix instead of static array
We want to recognize the packed-refs header and skip to the traits part of the line. We currently do it by feeding sizeof() a static const array to strncmp. However, it's a bit simpler to just skip_prefix, which expresses the intention more directly, and without remembering to account for the NUL-terminator in each sizeof() call. Signed-off-by: Jeff King p...@peff.net --- refs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/refs.c b/refs.c index 10f8247..c71553f 100644 --- a/refs.c +++ b/refs.c @@ -1135,10 +1135,9 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) while (strbuf_getwholeline(line, f, '\n') != EOF) { unsigned char sha1[20]; const char *refname; - static const char header[] = # pack-refs with:; + const char *traits; - if (!strncmp(line.buf, header, sizeof(header)-1)) { - const char *traits = line.buf + sizeof(header) - 1; + if (skip_prefix(line.buf, # pack-refs with:, traits)) { if (strstr(traits, fully-peeled )) peeled = PEELED_FULLY; else if (strstr(traits, peeled )) -- 2.2.0.454.g7eca6b7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] read_packed_refs: pass strbuf to parse_ref_line
Now that we have a strbuf in read_packed_refs, we can pass it straight to the line parser, which saves us an extra strlen. Signed-off-by: Jeff King p...@peff.net --- refs.c | 27 +++ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/refs.c b/refs.c index 6f31935..10f8247 100644 --- a/refs.c +++ b/refs.c @@ -1068,8 +1068,10 @@ static const char PACKED_REFS_HEADER[] = * Return a pointer to the refname within the line (null-terminated), * or NULL if there was a problem. */ -static const char *parse_ref_line(char *line, unsigned char *sha1) +static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1) { + const char *ref; + /* * 42: the answer to everything. * @@ -1078,22 +1080,23 @@ static const char *parse_ref_line(char *line, unsigned char *sha1) * +1 (space in between hex and name) * +1 (newline at the end of the line) */ - int len = strlen(line) - 42; - - if (len = 0) + if (line-len = 42) return NULL; - if (get_sha1_hex(line, sha1) 0) + + if (get_sha1_hex(line-buf, sha1) 0) return NULL; - if (!isspace(line[40])) + if (!isspace(line-buf[40])) return NULL; - line += 41; - if (isspace(*line)) + + ref = line-buf + 41; + if (isspace(*ref)) return NULL; - if (line[len] != '\n') + + if (line-buf[line-len - 1] != '\n') return NULL; - line[len] = 0; + line-buf[--line-len] = 0; - return line; + return ref; } /* @@ -1144,7 +1147,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) continue; } - refname = parse_ref_line(line.buf, sha1); + refname = parse_ref_line(line, sha1); if (refname) { int flag = REF_ISPACKED; -- 2.2.0.454.g7eca6b7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] convert read_packed_refs to use strbuf
On Wed, Dec 10, 2014 at 04:53:19AM -0500, Jeff King wrote: Clarification: for-each-ref ignores the ref when the full line read from packed-refs hits length 1024 (not when the refname itself hits length 1024). Yes, the problem is in read_packed_refs: char refline[PATH_MAX]; ... while (fgets(refline, sizeof(refline), f)) { ... } This could be trivially converted to strbuf_getwholeline, but I am not sure what else would break, or whether such a system would actually be _usable_ with such long refs (e.g., would it break the first time you I accidentally cut off the next line, but it was something like ...first time you actually tried writing to the ref). Using fgets like this does shear lines, though. The next fgets call will see the second half of the line. I think we are saved from doing anything stupid by parse_ref_line, but it is mostly luck. So perhaps for that reason the trivial conversion to strbuf is worth it, even if it doesn't help any practical cases. Here's a patch to do that. It still doesn't let you create long refs on OS X, as we get caught up in the PATH_MAX found in git_path() and friends. Still, I think it's a step in the right direction, and it fixes the shearing issue. Patches 2 and 3 are just follow-on cleanups. [1/3]: read_packed_refs: use a strbuf for reading lines [2/3]: read_packed_refs: pass strbuf to parse_ref_line [3/3]: read_packed_refs: use skip_prefix instead of static array I checked, and this miraculously does not conflict with any of the refs work in pu. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] read_packed_refs: use a strbuf for reading lines
We currently used a fixed PATH_MAX-sized buffer for reading packed-refs lines. This is a reasonable guess, in the sense that git generally cannot work with refs larger than PATH_MAX. However, there are a few cases where it is not great: 1. Some systems may have a low value of PATH_MAX, but can actually handle larger paths in practice. Fixing this code path probably isn't enough to make them work completely with long refs, but it is a step in the right direction. 2. We use fgets, which will happily give us half a line on the first read, and then the rest of the line on the second. This is probably OK in practice, because our refline parser is careful enough to look for the trailing newline on the first line. The second line may look like a peeled line to us, but since ^ is illegal in refnames, it is not likely to come up. Still, it does not hurt to be more careful. Signed-off-by: Jeff King p...@peff.net --- refs.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/refs.c b/refs.c index 5ff457e..6f31935 100644 --- a/refs.c +++ b/refs.c @@ -1126,16 +1126,16 @@ static const char *parse_ref_line(char *line, unsigned char *sha1) static void read_packed_refs(FILE *f, struct ref_dir *dir) { struct ref_entry *last = NULL; - char refline[PATH_MAX]; + struct strbuf line = STRBUF_INIT; enum { PEELED_NONE, PEELED_TAGS, PEELED_FULLY } peeled = PEELED_NONE; - while (fgets(refline, sizeof(refline), f)) { + while (strbuf_getwholeline(line, f, '\n') != EOF) { unsigned char sha1[20]; const char *refname; static const char header[] = # pack-refs with:; - if (!strncmp(refline, header, sizeof(header)-1)) { - const char *traits = refline + sizeof(header) - 1; + if (!strncmp(line.buf, header, sizeof(header)-1)) { + const char *traits = line.buf + sizeof(header) - 1; if (strstr(traits, fully-peeled )) peeled = PEELED_FULLY; else if (strstr(traits, peeled )) @@ -1144,7 +1144,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) continue; } - refname = parse_ref_line(refline, sha1); + refname = parse_ref_line(line.buf, sha1); if (refname) { int flag = REF_ISPACKED; @@ -1160,10 +1160,10 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) continue; } if (last - refline[0] == '^' - strlen(refline) == PEELED_LINE_LENGTH - refline[PEELED_LINE_LENGTH - 1] == '\n' - !get_sha1_hex(refline + 1, sha1)) { + line.buf[0] == '^' + line.len == PEELED_LINE_LENGTH + line.buf[PEELED_LINE_LENGTH - 1] == '\n' + !get_sha1_hex(line.buf + 1, sha1)) { hashcpy(last-u.value.peeled, sha1); /* * Regardless of what the file header said, @@ -1173,6 +1173,8 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) last-flag |= REF_KNOWS_PEELED; } } + + strbuf_release(line); } /* -- 2.2.0.454.g7eca6b7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git fetch --quiet inconsistently reports to STDERR
Hello, I'm using git 2.0.4 and 1.7.10.4 and have discovered an issue with quiet option on fetch. Scenario: I am fetching from one bare repo into another bare repo and fetching from branch A to branch B. In the case that the branches have deviated from one another, --quiet surpresses all output, incl. output to STDERR. $ pwd /home/ryan_p/gitrepos/git01.git $ git fetch --quiet ../git02.git working_branch:master $ echo $? 1 $ git fetch ../git02.git working_branch:master Von ../git02 ! [zurückgewiesen] working_branch - master (kein Vorspulen) However, if I pass a false remote name to git fetch --quiet I get the following: $ git fetch --quiet ../git02false.git working_branch:master fatal: '../git02false.git' does not appear to be a git repository fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. So, I see inconsistent behaviour here. Further, the git push behaviour with the --quiet option corresponds to what the user actually expects: -q, --quiet Suppress all output, including the listing of updated refs, unless an error occurs. Progress is not reported to the standard error stream. I perceive the --quiet option on git fetch as behaving incorrectly and would like to file this as a bug. Best Rgs, Phil. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 23/23] untracked cache: guard and disable on system changes
On Wed, Dec 10, 2014 at 12:08 PM, Torsten Bögershausen tbo...@web.de wrote: That opens another question: How flexible/extensible/self-describing is the format of the UNTR extension ? If we drop the OS name root dir check because it disallows network use, but later add a better method to verify that the underlying chain local OS - network - remote OS-remote FS is OK, do we need to change the file format of UNTR ? If yes, can old clients read the new format and vice versa? Do we need a version information of some kind, or does the old client skip unknown entries like we do with extensions in the index ? The way index extensions are done so far, there's no actual versioning inside an extension.Once an extension is out, its format is set in stone. If you change your mind, you make a new extension (with a different signature), so signatures are sort of version. Code is shared mostly so it should not be a problem. Old clients don't recognize new extensions, so they drop them. New clients either stick to old extensions or convert them to new ones. This is all local matters, so I don't think we need to worry too much. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Git commit amend empty emails
Git is having empty email problems I think, I'm on git v2.1.3. Steps to reproduce: $ git init Initialized empty Git repository in /tmp/test_git/.git/ $ echo 'test' abc $ git add --all 1 ↵ $ git commit --message 'test' [master (root-commit) 3cc2793] test 1 file changed, 1 insertion(+) create mode 100644 abc $ echo 'test2' abc $ git commit --amend fatal: Malformed ident string: 'admin 1418217345 +' -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: filter-branch performance
On 9 December 2014 at 18:59, Jeff King p...@peff.net wrote: On Tue, Dec 09, 2014 at 07:52:33PM +0100, Henning Moll wrote: I assume that there is a lot of process forking going on. Could that be the cause? Yes. filter-branch is a shell scripts, and it is probably running multiple git commands per commit it is filtering. Any ideas how to further improve? Depending on how much time you can sink into improving the performance (versus just allowing the process to run to completion), you could also look into a non-forking solution, as well as not bothering to load the commit trees. To me non-forking means putting everything into the JVM by using JGit, like the BFG does, though libgit2 might also be an option. Changing the BFG's code to do the transformation in your script is absolutely trivial - define a commit-node cleaner like this: object SetCommitterToAuthor extends CommitNodeCleaner { override def fixer(kit: CommitNodeCleaner.Kit) = c = c.copy(committer = c.author) // PersonIdent class holds name, email time } ...trivial if you don't mind compiling Scala with SBT that is, and I'm sure some people do! A DSL for non-Scala people to define their own BFG scripts would be good, I must get on that some day. The BFG is generally faster than filter-branch for 3 reasons: 1. No forking - everything stays in the JVM process 2. Embarrassingly parallel algorithm makes good use of multi-core machines 3. Memoization means no Git object (file or folder) is cleaned more than once In the case of your problem, only the first factor will be noticeably helpful. Unfortunately commits do need to be cleaned sequentially, as their hashes depend on the hashes of their parents, and filter-branch doesn't clean /commits/ more than once, the way it does with files or folders - so the last 2 reasons in the list won't be significant. For your specific use case tho', the fact that BFG doesn't load the file tree at all unless it needs to clean it will also help. I decided to knock up an egregious hack in the BFG to see what performance would be like. I ran it against a fairly large repo (https://github.com/bfg-repo-cleaner-demos/intellij-community-original), 100k commits, stored in /dev/shm, and used the SetCommitterToAuthor code above. The BFG run completed in 31.7 seconds, you can see the resulting repo here: https://github.com/rtyley/intellij-community-set-committer-to-author I started running the same test some time ago using filter-branch, unfortunately that test has not completed yet - the BFG appears to be substantially faster. Before: $ git cat-file -p b02bf46c4e93c2e8570910cdd68eb6f4ce21ff81 tree 7a412e49ecdbd966d7efe5fe746ff3ea3b6067d1 parent 8794219e3e84aed3cc8af926ffd74beafa51fb6b author peter pe...@jetbrains.com 1370854045 +0200 committer peter pe...@jetbrains.com 1370854098 +0200 After: $ git cat-file -p 3adb7b2a5c87320a5a028b6a59a7132c75a6e91c tree 7a412e49ecdbd966d7efe5fe746ff3ea3b6067d1 parent 5efcdb551789b0d0bb541de9325f09521c5fbcb6 author peter pe...@jetbrains.com 1370854045 +0200 committer peter pe...@jetbrains.com 1370854045 +0200 - time fixed The relevant code is in: https://github.com/rtyley/bfg-repo-cleaner/compare/set-committer-to-author -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: filter-branch performance
On Wed, Dec 10, 2014 at 02:18:24PM +, Roberto Tyley wrote: Depending on how much time you can sink into improving the performance (versus just allowing the process to run to completion), you could also look into a non-forking solution, as well as not bothering to load the commit trees. To me non-forking means putting everything into the JVM by using JGit, like the BFG does, though libgit2 might also be an option. Changing the BFG's code to do the transformation in your script is absolutely trivial - define a commit-node cleaner like this: object SetCommitterToAuthor extends CommitNodeCleaner { override def fixer(kit: CommitNodeCleaner.Kit) = c = c.copy(committer = c.author) // PersonIdent class holds name, email time } Thanks. I _almost_ mentioned BFG in the original email, but I didn't think it could do arbitrary fixes like this. Can you monkey-patch in arbitrary code, or do you have to rebuild all of BFG to include the snippet above? ...trivial if you don't mind compiling Scala with SBT that is, and I'm sure some people do! A DSL for non-Scala people to define their own BFG scripts would be good, I must get on that some day. That would be cool. Even if the DSL was just Java, if you could do something like: vi fix.java javac fix.java bfg --filter=fix.class that would be very useful (and I am probably showing my lack of Java chops by getting the compilation command or filenames wrong :) ). I started running the same test some time ago using filter-branch, unfortunately that test has not completed yet - the BFG appears to be substantially faster. No fair if you didn't run filter-branch on a PC and BFG on a Raspberry Pi. You have to give us a fighting chance. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFD/PATCH] Documentation: mention category
Junio C Hamano schrieb am 09.12.2014 um 21:26: Michael J Gruber g...@drmicha.warpmail.net writes: Rather than changing git-foo.txt, we could do the substitution magic from Documentation/Makefile, of course, to keep man pages and command-list in sync. Although this would keep me from submitting the final series with 1 patch per file :) I do not get that smiley. Are you saying that these noisy patches add to your karma points? Well, I didn't and I won't send them as 1p/f. Have I ever split my patches noisilly atomically? diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt index 9631526..b6a8bc6 100644 --- a/Documentation/git-add.txt +++ b/Documentation/git-add.txt @@ -13,6 +13,10 @@ SYNOPSIS [--intent-to-add | -N] [--refresh] [--ignore-errors] [--ignore-missing] [--] [pathspec...] +CATEGORY + +Main user interface command (porcelain) + DESCRIPTION --- While I do not have objection to adding this information, I have a few problems with the execution: - These four lines at the very beginning is a precious real estate. The new reader would not benefit from the distinction before reading the first paragraph of description to learn what it does and what it is for anyway. Move it much later, perhaps at the end. Then we can just leave it out. The aim is to give new(er) users some orientation about which commands to use and which to stay away from, unless they want to do scripting. There's no point in putting a stop sign at the end of the road. This series was triggered by the recent update-index discussion. I think it showed: Sometimes users get confused by the plumbing man pages who shouldn't even read them at all (and rather stick with porcelain commands). - A phrase Main user interface command to a new user does not help very much if it does not tell enough what that phrase really means (e.g. you should not be using it for scripting). Extend the description more, after moving it to the end. I'm all open for different wording, this is just a POC. Even though I think that it should be okay to use a term like porcelain (defined in the glossary) the same way as we use other central terms (revision, object) which we don't spell out in each man page. The point is to give some direction. - As you said, this should be done in a way to keep the two sources of information in sync. Either add these from command-list, or generate command-list from these. Does the categorisation change a lot? My impression is that this is a one-time business like adding a new command, which causes redundant work already. If we really want to automate it, Doc/Makefile would probably need to translate foo.txt to foo.txt+ and operate on that. I'm not sure automatisation is worth that effort. Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: filter-branch performance
On 10 December 2014 at 14:37, Jeff King p...@peff.net wrote: On Wed, Dec 10, 2014 at 02:18:24PM +, Roberto Tyley wrote: object SetCommitterToAuthor extends CommitNodeCleaner { override def fixer(kit: CommitNodeCleaner.Kit) = c = c.copy(committer = c.author) // PersonIdent class holds name, email time } Thanks. I _almost_ mentioned BFG in the original email, but I didn't think it could do arbitrary fixes like this. Can you monkey-patch in arbitrary code, or do you have to rebuild all of BFG to include the snippet above? Well, I publish a bfg-library jar to Maven Central, so you don't need to rebuild that: http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22bfg-library_2.11%22 ...in principle you can write a Java/Groovy/whatever project that calls that jar (your entry point would be com.madgag.git.bfg.cleaner.RepoRewriter) - tho' to be honest, I can't swear to how /friendly/ the API would be to call from non-Scala-land though, as I haven't tried it. Incidentally, if people want to try compiling this monkey-patched BFG at home, this is how you'd do it: * Install SBT - http://www.scala-sbt.org/download.html (or 'brew install sbt' for Mac OS X) * git clone https://github.com/rtyley/bfg-repo-cleaner.git --branch set-committer-to-author * cd bfg-repo-cleaner * sbt bfg/run --no-blob-protection There will be a lot of automated downloading of dependencies, and compilation will be slow the first time around, but at least there aren't that many steps. I do realise that being Scala/JVM based makes working on the BFG a bit of a specialist activity at the moment! A DSL for non-Scala people to define their own BFG scripts would be good, I must get on that some day. That would be cool. Even if the DSL was just Java, if you could do something like: vi fix.java javac fix.java bfg --filter=fix.class that would be very useful (and I am probably showing my lack of Java chops by getting the compilation command or filenames wrong :) ). Your syntax is right :) I'll give it some thought. I started running the same test some time ago using filter-branch, unfortunately that test has not completed yet - the BFG appears to be substantially faster. No fair if you didn't run filter-branch on a PC and BFG on a Raspberry Pi. You have to give us a fighting chance. :) I guess I made that rod for my own back :) http://youtu.be/Ir4IHzPhJuI for those who haven't seen it. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/18] Introduce fsck options
Johannes Schindelin johannes.schinde...@gmx.de writes: THis is not limited to this step, but Subject: Re: [PATCH 01/18] Introduce fsck options please make it easier to cluster and spot the series in the eventual shortlog by giving a common prefix to the patches, e.g. fsck: introduce fsck_options struct +static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT; +static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT; Is it a good idea to allow walker to be strict but obj verifier to be not (or vice versa)? I am wondering why this is not a single struct with two callback function pointers. +struct fsck_options { + fsck_walk_func walk; + fsck_error error_func; + int strict:1; A signed 1-bit-wide bitfield can hold its sign-bit and nothing else, no? unsigned strict:1; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git commit amend empty emails
On Thu, Dec 11, 2014 at 12:17:35AM +1100, Simon wrote: Git is having empty email problems I think, I'm on git v2.1.3. Steps to reproduce: $ git init Initialized empty Git repository in /tmp/test_git/.git/ $ echo 'test' abc $ git add --all 1 ↵ $ git commit --message 'test' [master (root-commit) 3cc2793] test 1 file changed, 1 insertion(+) create mode 100644 abc $ echo 'test2' abc $ git commit --amend fatal: Malformed ident string: 'admin 1418217345 +' Yes, this looks like a regression in v2.1.0, due to my 4701026 (commit: use split_ident_line to compare author/committer, 2014-05-01). That commit added a call to sane_ident_split() when preparing the commit message template, which is what is causing the error you see (and why your first commit succeeds, but the latter fails). The absolute simplest fix would be to remove the sane_ident_split call. Though I have a different patch prepared, which argues that we should not be doing any validation in this code path at all (see below). However, there's something else going on. I am surprised that we allow empty emails at all and the code here is quite strange. The first check on the ident format is when we feed the data to fmt_ident to generate the string that goes into the commit object. We disallow empty _names_ there, but not empty _emails_. I'm not sure if this is an oversight, or an intentional historic compatibility thing. And then it gets weirder. We take the output of fmt_ident, and then feed that back into split_ident_line, so that we can set the GIT_AUTHOR_* variables in the environment. And that does its _own_ set of checks, via sane_ident_split. Once upon a time, it relied only on split_ident_lane to report problems. But Junio's e27ddb6 (split_ident_line(): make best effort when parsing author/committer line, 2012-08-31) made split_ident_line more lenient, and introduced sane_ident_split to cover the difference. Except that it did more than that: besides checking whether the name is empty (which the original split_ident_line used to do), it also complains if the email is empty (which is new in that commit). So we now notice the empty email in this code path, but the only thing we do is avoid writing out the environment variables and continue. Which means that the actual string generated by fmt_ident (complete with empty email) is what goes into the commit. So why are we setting the environment variables at all? It looks like they are for the benefit of hooks, as evidenced by 7dfe8ad (commit: pass author/committer info to hooks, 2012-03-11). So that means that we are sometimes passing totally bogus data to hooks (i.e., an ident is good enough to make it into a commit message, but sane_ident_split prevents us from putting the data into the environment). So I think e27ddb6 was a little over-anxious in adding the email validation, but nobody noticed because those checks don't actually affect whether or not we commit. Hooks might see bogus data, but it is in such a rare set of cases that nobody has noticed. Here are two patches to improve this. These are on top of the jk/commit-date-approxidate topic, as that is where the regression was introduced. The first one fixes the regression and can stand by itself. The second fixes the GIT_AUTHOR problem, but AFAIK that has been there for years. So it is not as urgent, but is still maint-worthy, in my opinion. [1/2]: commit: loosen ident checks when generating template [2/2]: commit: always populate GIT_AUTHOR_* variables If we did want to truly disallow empty emails, we could do a follow-on 3/2 that teaches fmt_ident to reject them (that is the right place because it is where the validation checks for the author go, and also because we would probably want the same validation for the committer). But I do not think we should do that lightly. It has been this way for years, and clearly at least one person is depending on it. If we're going to change it, we might want a warning/deprecation period. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] commit: loosen ident checks when generating template
When we generate the commit-message template, we try to report an author or committer ident that will be of interest to the user: an author that does not match the committer, or a committer that was auto-configured. When doing so, if we encounter what we consider to be a bogus ident, we immediately die. This is a bad idea, because our use of the idents here is purely informational. Any ident rules should be enforced elsewhere, because commits that do not invoke the editor will not even hit this code path (e.g., git commit -mfoo would work, but git commit would not). So at best, we are redundant with other checks, and at worse, we actively prevent commits that should otherwise be allowed. We should therefore do the minimal parsing we can to get a value and not do any validation (i.e., drop the call to sane_ident_split()). In theory we could notice when even our minimal parsing fails to work, and do the sane thing for each check (e.g., if we have an author but can't parse the committer, assume they are different and print the author). But we can actually simplify this even further. We know that the author and committer strings we are parsing have been generated by us earlier in the program, and therefore they must be parseable. We could just call split_ident_line without even checking its return value, knowing that it will put _something_ in the name/mail fields. Of course, to protect ourselves against future changes to the code, it makes sense to turn this into an assert, so we are not surprised if our assumption fails. Signed-off-by: Jeff King p...@peff.net --- builtin/commit.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index d1c90db..2be5506 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -502,6 +502,12 @@ static int is_a_merge(const struct commit *current_head) return !!(current_head-parents current_head-parents-next); } +static void assert_split_ident(struct ident_split *id, const struct strbuf *buf) +{ + if (split_ident_line(id, buf-buf, buf-len)) + die(BUG: unable to parse our own ident: %s, buf-buf); +} + static void export_one(const char *var, const char *s, const char *e, int hack) { struct strbuf buf = STRBUF_INIT; @@ -608,13 +614,6 @@ static void determine_author_info(struct strbuf *author_ident) } } -static void split_ident_or_die(struct ident_split *id, const struct strbuf *buf) -{ - if (split_ident_line(id, buf-buf, buf-len) || - !sane_ident_split(id)) - die(_(Malformed ident string: '%s'), buf-buf); -} - static int author_date_is_interesting(void) { return author_message || force_date; @@ -822,8 +821,14 @@ static int prepare_to_commit(const char *index_file, const char *prefix, status_printf_ln(s, GIT_COLOR_NORMAL, %s, only_include_assumed); - split_ident_or_die(ai, author_ident); - split_ident_or_die(ci, committer_ident); + /* +* These should never fail because we they come from our own +* fmt_ident. They may fail the sane_ident test, but we know +* that the name and mail pointers will at least be valid, +* which is enough for our tests and printing here. +*/ + assert_split_ident(ai, author_ident); + assert_split_ident(ci, committer_ident); if (ident_cmp(ai, ci)) status_printf_ln(s, GIT_COLOR_NORMAL, -- 2.2.0.454.g7eca6b7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] commit: always populate GIT_AUTHOR_* variables
To figure out the author ident for a commit, we call determine_author_info(). This function collects information from the environment, other commits (in the case of --amend or -c/-C), and the --author option. It then uses fmt_ident to generate the final ident string that goes into the commit object. fmt_ident is therefore responsible for any quality or validation checks on what is allowed to go into a commit. Before returning, though, we call split_ident_line on the result, and feed the individual components to hooks via the GIT_AUTHOR_* variables. Furthermore, we do extra validation by feeding the split to sane_ident_split(), which is pickier than fmt_ident (in particular, it will complain about an empty email field). If this parsing or validation fails, we skip updating the environment variables. This is bad, because it means that hooks may silently see a different ident than what we are putting into the commit. We should drop the extra sane_ident_split checks entirely, and take whatever fmt_ident has fed us (and what will go into the commit object). If parsing fails, we should actually abort here rather than continuing (and feeding the hooks bogus data). However, split_ident_line should never fail here. The ident was just generated by fmt_ident, so we know that it's sane. We can use assert_split_ident to double-check this. Note that we also teach that assertion to check that we found a date (it always should, but until now, no caller cared whether we found a date or not). Checking the return value of sane_ident_split is enough to ensure we have the name/email pointers set, and checking date_begin is enough to know that all of the date/tz variables are set. Signed-off-by: Jeff King p...@peff.net --- builtin/commit.c | 26 +- 1 file changed, 5 insertions(+), 21 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 2be5506..f1a9e07 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -504,7 +504,7 @@ static int is_a_merge(const struct commit *current_head) static void assert_split_ident(struct ident_split *id, const struct strbuf *buf) { - if (split_ident_line(id, buf-buf, buf-len)) + if (split_ident_line(id, buf-buf, buf-len) || !id-date_begin) die(BUG: unable to parse our own ident: %s, buf-buf); } @@ -518,20 +518,6 @@ static void export_one(const char *var, const char *s, const char *e, int hack) strbuf_release(buf); } -static int sane_ident_split(struct ident_split *person) -{ - if (!person-name_begin || !person-name_end || - person-name_begin == person-name_end) - return 0; /* no human readable name */ - if (!person-mail_begin || !person-mail_end || - person-mail_begin == person-mail_end) - return 0; /* no usable mail */ - if (!person-date_begin || !person-date_end || - !person-tz_begin || !person-tz_end) - return 0; - return 1; -} - static int parse_force_date(const char *in, char *out, int len) { if (len 1) @@ -606,12 +592,10 @@ static void determine_author_info(struct strbuf *author_ident) } strbuf_addstr(author_ident, fmt_ident(name, email, date, IDENT_STRICT)); - if (!split_ident_line(author, author_ident-buf, author_ident-len) - sane_ident_split(author)) { - export_one(GIT_AUTHOR_NAME, author.name_begin, author.name_end, 0); - export_one(GIT_AUTHOR_EMAIL, author.mail_begin, author.mail_end, 0); - export_one(GIT_AUTHOR_DATE, author.date_begin, author.tz_end, '@'); - } + assert_split_ident(author, author_ident); + export_one(GIT_AUTHOR_NAME, author.name_begin, author.name_end, 0); + export_one(GIT_AUTHOR_EMAIL, author.mail_begin, author.mail_end, 0); + export_one(GIT_AUTHOR_DATE, author.date_begin, author.tz_end, '@'); } static int author_date_is_interesting(void) -- 2.2.0.454.g7eca6b7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: filter-branch performance
Roberto Tyley roberto.ty...@gmail.com writes: The BFG is generally faster than filter-branch for 3 reasons: 1. No forking - everything stays in the JVM process 2. Embarrassingly parallel algorithm makes good use of multi-core machines 3. Memoization means no Git object (file or folder) is cleaned more than once In the case of your problem, only the first factor will be noticeably helpful. Unfortunately commits do need to be cleaned sequentially, as their hashes depend on the hashes of their parents, and filter-branch doesn't clean /commits/ more than once, the way it does with files or folders - so the last 2 reasons in the list won't be significant. Just this part. If your history is bushy, you should be able to rewrite histories of merged branches in parallel up to the point they are merged---rewriting of the merge commit of course has to wait until all the branches have been rewritten, though. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Blobs not referenced by file (anymore) are not removed by GC
Jeff King p...@peff.net writes: ... I'd be happy to contribute a patch that gives 'gc' a flag to do the equivalent of: git reflog expire --expire=now --all git gc --prune=now --aggressive Maybe: git gc --purge Yeah, that is common enough that it might be worthwhile (you probably want --expire-unreachable in the reflog invocation, though). Also you would not want an unconditional --aggressive. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/14] copy_fd: pass error message back through a strbuf
On 12/04/2014 08:59 AM, Jeff King wrote: On Wed, Dec 03, 2014 at 01:38:58PM -0800, Jonathan Nieder wrote: The allocation of a variable-sized buffer is a small overhead that I don't mind incurring on error. In the non-error case, the caller doesn't actually have to free the buffer, and if they choose to, the overhead incurred is that of free(NULL)'. I don't care at all about overhead. I care about extra work on the part of the caller to avoid a leak. It turns: if (some_func(fd, err)) return error(%s, err.msg); into: if (some_func(fd, err)) { error(%s, err.buf); strbuf_release(err); return -1; } What if we go in the direction not of less infrastructure, but a little bit more? Like struct result { int code; struct strbuf msg; }; int report_errors(struct result *result) { int code = result-code; if (code) { error(result-msg.buf); } result-code = 0; strbuf_release(result-msg); return code; /* or alternatively (code ? -1 : 0) */ } int report_warnings(struct result *result) { ... } int report_with_prefix(struct result *result, const char *fmt, ...) { ... } Then a caller could look pretty much like before: struct result result = RESULT_INIT; if (some_func(fd, result)) return report_errors(result); Other callers might not even bother to check the return value of the function, relying instead on result.code via process_error(): char *ptr = some_func(fd, result); if (report_errors(result)) return -1; If the result code is considered superfluous, we could use naked strbufs and use msg.len as the indicator that there was an error. Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
http.proxy seems to not be propagated on `submodule update`
Hello, I'm trying to recursively update submodules in a repo on a server that requires a proxy to access the internet. I typically pass http.proxy with -c to tell git about the proxy, but with `submodule update`, it seems the argument isn't propagated and thus results in the process sitting doing nothing when it tries to connect. Is this expected behaviour? cdown@dev023:dotfiles:linux$ GIT_TRACE=2 git -c http.proxy=fwdproxy.any:8080 submodule update --init --recursive trace: exec: 'git-submodule' 'update' '--init' '--recursive' trace: run_command: 'git-submodule' 'update' '--init' '--recursive' trace: built-in: git 'rev-parse' '--git-dir' trace: built-in: git 'rev-parse' '--show-cdup' trace: built-in: git 'rev-parse' '-q' '--git-dir' trace: built-in: git 'rev-parse' '--sq-quote' '--init' trace: built-in: git 'rev-parse' '--sq-quote' '--recursive' trace: built-in: git 'ls-files' '--error-unmatch' '--stage' '--' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/auto-save.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/auto-save.update' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/ctrlp.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/ctrlp.update' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/easymotion.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/easymotion.update' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/fugitive.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/fugitive.update' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/markdown.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/markdown.update' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/pathogen.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/pathogen.update' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/supertab.url' trace: built-in: git 'config' '-f' '.gitmodules' 'submodule..vim/bundle/supertab.update' trace: built-in: git 'ls-files' '--error-unmatch' '--stage' '--' trace: built-in: git 'config' '-f' '.gitmodules' '--get-regexp' '^submodule\..*\.path$' trace: built-in: git 'config' 'submodule..vim/bundle/auto-save.url' trace: built-in: git 'config' 'submodule..vim/bundle/auto-save.update' trace: built-in: git 'rev-parse' '--git-dir' trace: built-in: git 'rev-parse' '--show-toplevel' trace: built-in: git 'rev-parse' '--show-toplevel' trace: built-in: git 'rev-parse' '--local-env-vars' trace: built-in: git 'config' 'core.worktree' '../../../../../.vim/bundle/auto-save' trace: built-in: git 'rev-parse' '--local-env-vars' trace: built-in: git 'fetch' trace: run_command: 'git-remote-https' 'origin' 'https://github.com/907th/vim-auto-save.git' ^C cdown@dev023:dotfiles:linux$ git --version git version 1.8.1 Thanks, Chris pgpiNdmbagSNx.pgp Description: PGP signature
Re: [PATCH v4] git-new-workdir: Don't fail if the target directory is empty
On Wed, 2014-11-26 at 15:16 -0800, Junio C Hamano wrote: Paul Smith p...@mad-scientist.net writes: This is what happens for a file: $ rm -f foo $ touch foo $ ./src/git/contrib/workdir/git-new-workdir src/git foo master mkdir: cannot create directory ‘foo’: Not a directory unable to create new workdir foo! ;-) That comes from mkdir || fail which is indeed sufficient. Hi Junio; Is this all set to be applied or is there more to do? Cheers! -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] convert read_packed_refs to use strbuf
Jeff King p...@peff.net writes: Here's a patch to do that. It still doesn't let you create long refs on OS X, as we get caught up in the PATH_MAX found in git_path() and friends. Still, I think it's a step in the right direction, and it fixes the shearing issue. Patches 2 and 3 are just follow-on cleanups. [1/3]: read_packed_refs: use a strbuf for reading lines [2/3]: read_packed_refs: pass strbuf to parse_ref_line [3/3]: read_packed_refs: use skip_prefix instead of static array I checked, and this miraculously does not conflict with any of the refs work in pu. :) Heh, I suspect that is largely because Michael's reflog expire is kept out of my tree while it is still under discusson. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.key = warn
Johannes Schindelin johannes.schinde...@gmx.de writes: For example, missing emails in commit and tag objects can be demoted to mere warnings with git config receive.fsck.missing-email warn No punctuations in the first and the last level of configuration variable names, please. I.e. s/missing-email/missingEmail/ or something. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/18] Provide a function to parse fsck message IDs
Johannes Schindelin johannes.schinde...@gmx.de writes: This function will be used in the next commits to allow the user to ask fsck to handle specific problems differently, e.g. demoting certain errors to warnings. It has to handle partial strings because we would like to be able to parse, say, '--strict=missing-email=warn' command lines. To make the parsing robust, we generate strings from the enum keys, and we will match both lower-case, dash-separated values as well as camelCased ones (e.g. both missing-email and missingEmail will match the MISSING_EMAIL key). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 32 1 file changed, 32 insertions(+) diff --git a/fsck.c b/fsck.c index 3cea034..05b146c 100644 --- a/fsck.c +++ b/fsck.c @@ -63,6 +63,38 @@ enum fsck_msg_id { FSCK_MSG_MAX }; +#define STR(x) #x +#define MSG_ID_STR(x) STR(x), +static const char *msg_id_str[FSCK_MSG_MAX + 1] = { + FOREACH_MSG_ID(MSG_ID_STR) + NULL +}; I wondered what the ugly macro was in the previous step, but as a way to keep these two lists in sync it makes sense. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: There are legacy repositories out there whose older commits and tags have issues that prevent pushing them when 'receive.fsckObjects' is set. One real-life example is a commit object that has been hand-crafted to list two authors. Often, it is not possible to fix those issues without disrupting the work with said repositories, yet it is still desirable to perform checks by setting `receive.fsckObjects = true`. This commit is the first step to allow demoting specific fsck issues to mere warnings. The function added by this commit parses a list of settings in the form: missing-email=warn,bad-name=warn,... Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by git fsck so far, but other call paths (e.g. git index-pack --strict) error out *always* no matter what type was specified. Therefore, we need to take extra care to default to all FSCK_ERROR in those cases. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 58 ++ fsck.h | 7 +-- 2 files changed, 63 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 05b146c..9e6d70f 100644 --- a/fsck.c +++ b/fsck.c @@ -97,9 +97,63 @@ static int parse_msg_id(const char *text, int len) int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { + if (options-strict_mode msg_id = 0 msg_id FSCK_MSG_MAX) + return options-strict_mode[msg_id]; + if (options-strict) + return FSCK_ERROR; return msg_id FIRST_WARNING ? FSCK_ERROR : FSCK_WARN; } Hmm, if you are later going to allow demoting (hopefully also promoting) error to warn, etc., would the comparison between msg_id and FIRST_WARNING make much sense? In other words, at some point wouldn't we be better off with something like this struct { enum id; const char *id_string; enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR }; } possible_fsck_errors[]; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/18] fsck: handle multiple authors in commits specially
Johannes Schindelin johannes.schinde...@gmx.de writes: This problem has been detected in the wild, and is the primary reason to introduce an option to demote certain fsck errors to warnings. Let's offer to ignore this particular problem specifically. ... + while (skip_prefix(buffer, author , buffer)) { + err = report(options, commit-object, FSCK_MSG_MULTIPLE_AUTHORS, invalid format - multiple 'author' lines); + if (err) + return err; If we have an option to demote this to a warning, wouldn't we want to do the same fsck_ident() on that secondary author line? + /* require_end_of_header() ensured that there is a newline */ + buffer = strchr(buffer, '\n') + 1; + } if (!skip_prefix(buffer, committer , buffer)) return report(options, commit-object, FSCK_MSG_MISSING_COMMITTER, invalid format - expected 'committer' line); err = fsck_ident(buffer, commit-object, options); -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/18] Disallow demoting grave fsck errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Some kinds of errors are intrinsically unrecoverable (e.g. errors while uncompressing objects). It does not make sense to allow demoting them to mere warnings. Makes sense, but the patch makes it look as if this is an oops, we should have done the list in patch 02/18 in this order from the beginning. Can we reorder the patches? Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 8 ++-- t/t5504-fetch-receive-strict.sh | 9 + 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index c1e7a85..f8339af 100644 --- a/fsck.c +++ b/fsck.c @@ -9,6 +9,9 @@ #include refs.h #define FOREACH_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER) \ + FUNC(UNTERMINATED_HEADER) \ /* errors */ \ FUNC(BAD_DATE) \ FUNC(BAD_EMAIL) \ @@ -39,10 +42,8 @@ FUNC(MISSING_TYPE_ENTRY) \ FUNC(MULTIPLE_AUTHORS) \ FUNC(NOT_SORTED) \ - FUNC(NUL_IN_HEADER) \ FUNC(TAG_OBJECT_NOT_TAG) \ FUNC(UNKNOWN_TYPE) \ - FUNC(UNTERMINATED_HEADER) \ FUNC(ZERO_PADDED_DATE) \ /* warnings */ \ FUNC(BAD_FILEMODE) \ @@ -56,6 +57,7 @@ FUNC(NULL_SHA1) \ FUNC(ZERO_PADDED_FILEMODE) +#define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE #define MSG_ID(x) FSCK_MSG_##x, @@ -150,6 +152,8 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode) } msg_id = parse_msg_id(mode, equal); + if (type != FSCK_ERROR msg_id FIRST_NON_FATAL_ERROR) + die(Cannot demote %.*s, len, mode); options-strict_mode[msg_id] = type; mode += len; } diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh index db79e56..8a47db2 100755 --- a/t/t5504-fetch-receive-strict.sh +++ b/t/t5504-fetch-receive-strict.sh @@ -135,4 +135,13 @@ test_expect_success 'push with receive.fsck.missing-mail = warn' ' grep missing-email act ' +test_expect_success 'receive.fsck.unterminated-header = warn triggers error' ' + rm -rf dst + git init dst + git --git-dir=dst/.git config receive.fsckobjects true + git --git-dir=dst/.git config receive.fsck.unterminated-header warn + test_must_fail git push --porcelain dst HEAD act 21 + grep Cannot demote unterminated-header=warn act +' + test_done -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/18] Optionally ignore specific fsck issues completely
Johannes Schindelin johannes.schinde...@gmx.de writes: An fsck issue in a legacy repository might be so common that one would like not to bother the user with mentioning it at all. With this change, that is possible by setting the respective error to ignore. Makes sense. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 14/18] fsck: allow upgrading fsck warnings to errors
Johannes Schindelin johannes.schinde...@gmx.de writes: The 'invalid tag name' and 'missing tagger entry' warnings can now be upgraded to errors by setting receive.fsck.invalid-tag-name and receive.fsck.missing-tagger-entry to 'error'. Hmm, why can't all warnings promotable to errors, or are the above two mentioned only as examples? Incidentally, the missing tagger warning is now really shown as a warning (as opposed to being reported with the error: prefix, as it used to be the case before this commit). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c| 24 t/t5302-pack-index.sh | 2 +- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/fsck.c b/fsck.c index abfd3af..154f361 100644 --- a/fsck.c +++ b/fsck.c @@ -52,13 +52,15 @@ FUNC(HAS_DOT) \ FUNC(HAS_DOTDOT) \ FUNC(HAS_DOTGIT) \ - FUNC(INVALID_TAG_NAME) \ - FUNC(MISSING_TAGGER_ENTRY) \ FUNC(NULL_SHA1) \ - FUNC(ZERO_PADDED_FILEMODE) + FUNC(ZERO_PADDED_FILEMODE) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(INVALID_TAG_NAME) \ + FUNC(MISSING_TAGGER_ENTRY) #define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE +#define FIRST_INFO FSCK_MSG_INVALID_TAG_NAME #define MSG_ID(x) FSCK_MSG_##x, enum fsck_msg_id { @@ -103,7 +105,7 @@ int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) if (options-strict_mode msg_id = 0 msg_id FSCK_MSG_MAX) return options-strict_mode[msg_id]; if (options-strict) - return FSCK_ERROR; + return msg_id FIRST_INFO ? FSCK_ERROR : FSCK_WARN; return msg_id FIRST_WARNING ? FSCK_ERROR : FSCK_WARN; } @@ -643,13 +645,19 @@ static int fsck_tag_buffer(struct tag *tag, const char *data, goto done; } strbuf_addf(sb, refs/tags/%.*s, (int)(eol - buffer), buffer); - if (check_refname_format(sb.buf, 0)) - report(options, tag-object, FSCK_MSG_INVALID_TAG_NAME, invalid 'tag' name: %s, buffer); + if (check_refname_format(sb.buf, 0)) { + ret = report(options, tag-object, FSCK_MSG_INVALID_TAG_NAME, invalid 'tag' name: %s, buffer); + if (ret) + goto done; + } buffer = eol + 1; - if (!skip_prefix(buffer, tagger , buffer)) + if (!skip_prefix(buffer, tagger , buffer)) { /* early tags do not contain 'tagger' lines; warn only */ - report(options, tag-object, FSCK_MSG_MISSING_TAGGER_ENTRY, invalid format - expected 'tagger' line); + ret = report(options, tag-object, FSCK_MSG_MISSING_TAGGER_ENTRY, invalid format - expected 'tagger' line); + if (ret) + goto done; + } else ret = fsck_ident(buffer, tag-object, options); diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh index 61bc8da..3dc5ec4 100755 --- a/t/t5302-pack-index.sh +++ b/t/t5302-pack-index.sh @@ -259,7 +259,7 @@ EOF thirtyeight=${tag#??} rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight git index-pack --strict tag-test-${pack1}.pack 2err -grep ^error:.* expected .tagger. line err +grep ^warning:.* expected .tagger. line err ' test_done -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missing-email=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. Hopefully I do not have to repeat myself, but please do not have punctuations in the first and the last level of configuration variable names, i.e. fsck.missingEmail, not mising-email. Should these be tied to receive-pack ones in any way? E.g. if you set fsck.missingEmail to ignore, you do not have to do the same for receive and accept a push with the specific error turned off? Not a rhetorical question. I can see it argued both ways. The justification to defend the position of not tying these two I would have is so that I can be more strict to newer breakages (i.e. not accepting a push that introduce a new breakage by not ignoring with receive.fsck.*) while allowing breakages that are already present. The justification for the opposite position is to make it more convenient to write a consistent policy. Whichever way is chosen, we would want to see the reason left in the log message so that people do not have to wonder what the original motivation was when they decide if it is a good idea to change this part of the code. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Show number of commits being rebased interactively
Hi, In the case of having 'exec' lines interspersed, the $commitcount becomes a lot less useful (no comparison to editor line numbers), though. Hmph, interesting. Then perhaps not filtering at all and instead labelling this new piece of information not commit(s) a better solution? You are counting the number of instructions in the insn sheet, so perhaps something like ($count todo items(s)) or something? Yes that sounds good. Though I lack the experience to say what would be the typical workflow for someone who uses rebase-i a lot with execs interspersed(?) In all cases, simply counting and noting '#todo item(s)' would always be a correct labelling and simple to understand, albeit not necessarily a very helpful value. But at least for pure picks/edit/squash lists, it will allow a quick sanity check. Also, I noted that filtering for ^pick will not be good in case one uses '--autosquash'. 8 8 8 8 8 8 8 8 8 That is not a scissors line, I suspect. I didn't try running git am on this message, though. Did you? Yes, I RTFMed on git-mailinfo and to be sure I tested this before sending the last mail out :) Below is the patch, changed to say 'TODO items', and with updated commit message. Cheers, Onno 8 8 8 8 8 8 8 8 8 Subject: [PATCH] Show number of TODO items for interactive rebase During 'rebase -i', one wrong edit in a long rebase session might inadvertently drop commits/items. This change shows the total number of TODO items in the comments after the list. After performing the rebase edit, total item counts can be compared to make sure that no changes have been lost in the edit. Signed-off-by: Onno Kortmann o...@gmx.net --- git-rebase--interactive.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index b64dd28..c6a4629 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -1031,9 +1031,11 @@ test -s $todo || echo noop $todo test -n $autosquash rearrange_squash $todo test -n $cmd add_exec_commands $todo +todocount=$(git stripspace --strip-comments $todo | wc -l) + cat $todo EOF -$comment_char Rebase $shortrevisions onto $shortonto +$comment_char Rebase $shortrevisions onto $shortonto ($todocount TODO item(s)) EOF append_todo_help git stripspace --comment-lines $todo \EOF -- 2.2.0.34.gb8f29bf -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing
Johannes Schindelin johannes.schinde...@gmx.de writes: The optional new config option `receive.fsck.skip-list` specifies the path to a file listing the names, i.e. SHA-1s, one per line, of objects that are to be ignored by `git receive-pack` when `receive.fsckObjects = true`. This is extremely handy in case of legacy repositories where it would cause more pain to change incorrect objects than to live with them (e.g. a duplicate 'author' line in an early commit object). The intended use case is for server administrators to inspect objects that are reported by `git push` as being too problematic to enter the repository, and to add the objects' SHA-1 to a (preferably sorted) file when the objects are legitimate, i.e. when it is determined that those problematic objects should be allowed to enter the server. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- builtin/receive-pack.c | 9 +++ fsck.c | 59 +++-- fsck.h | 2 ++ t/t5504-fetch-receive-strict.sh | 12 + 4 files changed, 80 insertions(+), 2 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 111e514..5169f1f 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -110,6 +110,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb) return 0; } + if (starts_with(var, receive.fsck.skip-list)) { s/skip-list/skiplist/; + const char *path = is_absolute_path(value) ? + value : git_path(%s, value); + if (fsck_strict_mode.len) + strbuf_addch(fsck_strict_mode, ','); + strbuf_addf(fsck_strict_mode, skip-list=%s, path); + return 0; + } + if (starts_with(var, receive.fsck.)) { if (fsck_strict_mode.len) strbuf_addch(fsck_strict_mode, ','); diff --git a/fsck.c b/fsck.c index 154f361..00693f2 100644 --- a/fsck.c +++ b/fsck.c @@ -7,6 +7,7 @@ #include tag.h #include fsck.h #include refs.h +#include sha1-array.h #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ @@ -56,7 +57,9 @@ FUNC(ZERO_PADDED_FILEMODE) \ /* infos (reported as warnings, but ignored by default) */ \ FUNC(INVALID_TAG_NAME) \ - FUNC(MISSING_TAGGER_ENTRY) + FUNC(MISSING_TAGGER_ENTRY) \ + /* special value */ \ + FUNC(SKIP_LIST) This feels like a kludge to me without comment on what special value means. Does it mean this object has an error (which by default is ignored) of being on the skip list? Should we be able to optionally warn an object on the skip-list exists with the same mechansim the rest of the series uses to tweak the error level? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/18] Introduce an internal API to interact with the fsck machinery
Johannes Schindelin johannes.schinde...@gmx.de writes: At the moment, the git-fsck's integrity checks are targeted toward the end user, i.e. the error messages are really just messages, intended for human consumption. Under certain circumstances, some of those errors should be allowed to be turned into mere warnings, though, because the cost of fixing the issues might well be larger than the cost of carrying those flawed objects. Overall I very much like what this series aims to do. Thanks for working on this. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git commit amend empty emails
Jeff King p...@peff.net writes: However, there's something else going on. I am surprised that we allow empty emails at all and the code here is quite strange. The first check on the ident format is when we feed the data to fmt_ident to generate the string that goes into the commit object. We disallow empty _names_ there, but not empty _emails_. I'm not sure if this is an oversight, or an intentional historic compatibility thing. Looking at e27ddb6 you cited, I think we knew about historical mistakes that allowed an empty names, but not an empty e-mail address. We probably have tried to kill both in one stone. Once upon a time, it relied only on split_ident_lane to report problems. But Junio's e27ddb6 (split_ident_line(): make best effort when parsing author/committer line, 2012-08-31) made split_ident_line more lenient, and introduced sane_ident_split to cover the difference. Except that it did more than that: besides checking whether the name is empty (which the original split_ident_line used to do), it also complains if the email is empty (which is new in that commit). So we now notice the empty email in this code path, but the only thing we do is avoid writing out the environment variables and continue. Which means that the actual string generated by fmt_ident (complete with empty email) is what goes into the commit. So why are we setting the environment variables at all? I think that part was more underthinking than oversight. We didn't want to abort the commit but we didn't want to contaminate the environment variables with known-to-be-bad values to spread the problem further. But there is no guarantee that not exporting the environment variables would give us more comformant name and e-mail address, so that thinking is flawed. Here are two patches to improve this. These are on top of the jk/commit-date-approxidate topic, as that is where the regression was introduced. The first one fixes the regression and can stand by itself. The second fixes the GIT_AUTHOR problem, but AFAIK that has been there for years. So it is not as urgent, but is still maint-worthy, in my opinion. [1/2]: commit: loosen ident checks when generating template [2/2]: commit: always populate GIT_AUTHOR_* variables If we did want to truly disallow empty emails, we could do a follow-on 3/2 that teaches fmt_ident to reject them (that is the right place because it is where the validation checks for the author go, and also because we would probably want the same validation for the committer). But I do not think we should do that lightly. It has been this way for years, and clearly at least one person is depending on it. If we're going to change it, we might want a warning/deprecation period. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/14] copy_fd: pass error message back through a strbuf
Michael Haggerty mhag...@alum.mit.edu writes: What if we go in the direction not of less infrastructure, but a little bit more? Like struct result { int code; struct strbuf msg; }; int report_errors(struct result *result) { int code = result-code; if (code) { error(result-msg.buf); } result-code = 0; strbuf_release(result-msg); return code; /* or alternatively (code ? -1 : 0) */ } int report_warnings(struct result *result) { ... } int report_with_prefix(struct result *result, const char *fmt, ...) { ... } Then a caller could look pretty much like before: struct result result = RESULT_INIT; if (some_func(fd, result)) return report_errors(result); Other callers might not even bother to check the return value of the function, relying instead on result.code via process_error(): char *ptr = some_func(fd, result); if (report_errors(result)) return -1; If the result code is considered superfluous, we could use naked strbufs and use msg.len as the indicator that there was an error. Two potential issues are: - Callers that ignore errors need to actively ignore errors with strbuf_release(result.msg); - Callers have to remember that once the report_errors() function is called on a struct result, the struct loses its information. Neither is insurmountable, but the latter might turn out to be cumbersome to work around in some codepaths. Other than that, I think it is OK. Another alternative may be to have the reporting storage on the side of the callee, and have callers that are interested in errors to supply a place to store a pointer to it, i.e. int some_func(..., struct result **errors) { static struct result mine; clear_result(mine); ... do its thing ... if (... error detected ...) { mine.code = E_SOMEFUNC; strbuf_addstr(mine.msg, some_func: foobared); } if (errors) *errors = mine; return mine.code; } And a caller would do this instead: struct result *result = NULL; if (some_func(fd, result)) return report_errors(result); where report_errors() and friends will only peek but not clear the result reporting storage. The clear_result() helper would trivially be: void clear_result(struct result *result) { result-code = 0; strbuf_release(result-msg); } -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/14] copy_fd: pass error message back through a strbuf
On Wed, Dec 10, 2014 at 11:00:18AM -0800, Junio C Hamano wrote: Two potential issues are: - Callers that ignore errors need to actively ignore errors with strbuf_release(result.msg); That was my first thought, too. If you want to do anything besides report_error, you have to deal with the strbuf. But I'd guess that they often fall into one of two cases: 1. You are just propagating the error to your caller. In which case it is not _your_ result struct in the first place, and you do not need to care about deallocating it either way. I.e.: int some_func(..., struct result *err) { if (some_other_func(..., err)) return -1; ... } 2. You want to ignore the error. I think anybody taking a result struct (or a strbuf, or whatever) should accept NULL as do not bother giving me your message. And the convenience wrappers should handle that (I think the mkerror example I sent earlier did), so callees can just do: return mkerror(err, whatever: %s, ...); The remainder could strbuf_release manually, but there would hopefully not be many of them. I think I could live with something like that. - Callers have to remember that once the report_errors() function is called on a struct result, the struct loses its information. Neither is insurmountable, but the latter might turn out to be cumbersome to work around in some codepaths. I suspect the message is not that interesting after calling report_errors(). The code flag could remain, as it does not require deallocation. Another alternative may be to have the reporting storage on the side of the callee, and have callers that are interested in errors to supply a place to store a pointer to it, i.e. int some_func(..., struct result **errors) { static struct result mine; This makes some_func not reentrant. Which is a hazard both for threaded code, but also for functions which want to do: if (some_func(foo, err_one)) { /* didn't work? Try an alternative. */ if (!some_func(bar, err_two)) and expect err_one to contain anything useful. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git's configure script --mandir doesn't work
On Fri, Dec 05, 2014 at 04:36:20AM -0500, Jeff King wrote: On Thu, Dec 04, 2014 at 04:25:32PM -0700, Stephen Fisher wrote: I'm installing Git 2.2.0 from source distribution on NetBSD 6.1.5 (amd64) and when I specify --mandir=/usr/local/man, it still installs man pages in the default /usr/local/share/man directory. Is there a fix available for this? It works fine for me here (Debian): Can you elaborate on the commands you're running? After running the configure script, can you confirm that mandir is set appropriately in config.mak.autogen? Thanks for your reply and sorry for my delay in responding. I'm executing ./configure --mandir=/usr/local/man --disable-pthreads then gmake and gmake install. I'm using gmake (GNU make) because I get Makefile errors with the regular BSD make, but that's another issue. I'm disabling pthreads because there is a linking error for undefined references to a few functions (I probably need to pass -lpthread in LDFLAGS, but haven't tried that yet). mandir is properly set in config.mak.autogen. When I set prefix to /tmp/foo and mandir to /tmp/bar like your example, it installs things into /tmp/foo, but /tmp/bar isn't even created. I noticed text files in Documentation/ that look like the content of man pages, and when I run gmake in that directory, I get an error about asciidoc missing to make an HTML file. Is asciidoc required for the man pages as well? I don't see any files that appear to be man page format other than in perl/blib/man3 and those are installed (but not under the mandir prefix, rather the default /usr/local/share/man prefix). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git commit amend empty emails
On Wed, Dec 10, 2014 at 10:46:16AM -0800, Junio C Hamano wrote: So we now notice the empty email in this code path, but the only thing we do is avoid writing out the environment variables and continue. Which means that the actual string generated by fmt_ident (complete with empty email) is what goes into the commit. So why are we setting the environment variables at all? I think that part was more underthinking than oversight. We didn't want to abort the commit but we didn't want to contaminate the environment variables with known-to-be-bad values to spread the problem further. But there is no guarantee that not exporting the environment variables would give us more comformant name and e-mail address, so that thinking is flawed. That sort of makes sense, but I agree it is flawed. The real spread is when the bogus data makes it into the commit objects themselves. The first one fixes the regression and can stand by itself. The second fixes the GIT_AUTHOR problem, but AFAIK that has been there for years. So it is not as urgent, but is still maint-worthy, in my opinion. [1/2]: commit: loosen ident checks when generating template [2/2]: commit: always populate GIT_AUTHOR_* variables By the way, as I said I built these on the original regression. They will have some minor textual conflicts if you merge them up to master, due to the jk/commit-author-parsing topic. Please me know if you run into any trouble merging. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git's configure script --mandir doesn't work
On Wed, Dec 10, 2014 at 12:41:50PM -0700, Stephen Fisher wrote: I'm executing ./configure --mandir=/usr/local/man --disable-pthreads then gmake and gmake install. I'm using gmake (GNU make) because I get Makefile errors with the regular BSD make, but that's another issue. You have to build git with GNU make; there are several GNU-isms in the Makefile. I'm disabling pthreads because there is a linking error for undefined references to a few functions (I probably need to pass -lpthread in LDFLAGS, but haven't tried that yet). We should link against -lpthread automatically unless pthreads are disabled. So that may be an issue worth investigating. I noticed text files in Documentation/ that look like the content of man pages, and when I run gmake in that directory, I get an error about asciidoc missing to make an HTML file. Is asciidoc required for the man pages as well? Yes, you need asciidoc to build the manpages. However, there is a make quick-install-man target which will install pre-formatted manpages. You'll need to: git clone git://git.kernel.org/pub/scm/git/git-manpages.git next to your git.git clone. If you are installing from tarballs, I think there are manpage tarballs on kernel.org, as well. I don't see any files that appear to be man page format other than in perl/blib/man3 and those are installed (but not under the mandir prefix, rather the default /usr/local/share/man prefix). It sounds like the manpage install bailed due to asciidoc failing. So the remaining bug is that the perl Makefile does not respect $(mandir). That does not surprise me too much. We use perl's MakeMaker to build that Makefile, and it looks like we just pass in the prefix, not individual paths. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
My vote is for putting the API docs in the header files: * Functions are documented right next to their declarations (which the compiler guarantees are up-to-date), removing ambiguity and avoiding some redundancy. * It is obvious at a glance which functions are documented and which still need docs. * It is easier to remember to update the documentation when it is near the code, and easier for reviewers to remember to pester authors to do so. I also like it when function declarations in header files include the names of their parameters, as (1) it's an efficient way to give the reader a good hint about how the function should be used, and (2) it makes it easier to refer to the parameters from the docstring. I agree that there is an important need for introductory/conceptual documentation about and API, but I think this fits fine as a large docstring at the top of the header file. I personally don't have much use for formatted documentation but if that were a need I would think that a standard tool like doxygen could be used. I have been trying hard to add good header-file docstrings for code that I have worked on, and even for code that I had to read because I needed to figure out how to use it. I would find it a pity for that work to be squashed into Documentation/technical/api-*.txt, where in my opinion it is less discoverable and more likely to fall into disrepair. Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] introduce git root
On Tue, Dec 09, 2014 at 10:17:13AM -0800, Junio C Hamano wrote: It is a tangent, the current definition of git_editor helper reads like this: git_editor() { if test -z ${GIT_EDITOR:+set} then GIT_EDITOR=$(git var GIT_EDITOR) || return $? fi eval $GIT_EDITOR '$@' } If git var editor were to compute a reasonable value from the various user settings, and because GIT_EDITOR is among the sources of user settings, I wonder if the surrounding if test -z then...fi should be there. I think it should not be. The point of git var is to say compute for me what the internal C code would compute. I think it happens to be a noop in this case, because the C code will give GIT_EDITOR preference over anything else. But it seems like a layering violation; the shell scripts should not have to know or care about the existence $GIT_EDITOR. The pager side seems to do (halfway) the right thing: git_pager() { if test -t 1 then GIT_PAGER=$(git var GIT_PAGER) else GIT_PAGER=cat fi : ${LESS=-FRX} : ${LV=-c} export LESS LV eval $GIT_PAGER '$@' } The initial test -t 1 is we do not page to non-terminal, but we ask git var to take care of PAGER/GIT_PAGER fallback/precedence. It is tempting to argue that we do not page to non-terminal should also be part of various user settings for git var to take into account and fold this if test -t 1...then...else...fi also into git var, but because a typical command line git var will be used on would look like this: GIT_PAGER=$(git var pager) eval $GIT_PAGER ... with the standard output stream of git var not connected to terminal, that would not fly very well, and I am not sure what should be done about it. Right, I think in an ideal world the is it a terminal context is handled by git var, but the reality of shell scripts makes it hard. We face the same problem with git config --get-colorbool. There we cheat a little and use the exit code to signal the result. That works for a bool, but not for a string. :) I think you'd have to do something a little gross like: pager=$(git var GIT_PAGER 3) 31 Except that doesn't work; the shell does not take redirections for a straight variable assignment. Something like: # git var uses fd 3 by convention for stdout-tty tests exec 31 at the top of git-sh-setup, and then: pager=$(git var GIT_PAGER 3) in the scripts themselves. That works, but is a bit ugly. This is another tangent that comes back to the original how to name them? question, but I wonder if a convention like this would work. * When asking for a feature (e.g. what editor to use), if there is a git-specific environment variable that can be set to override all other settings (e.g. $GIT_EDITOR trumps $EDITOR environment or core.editor configuration), git var can be queried with the name of that strong environment variable. I'd prefer to create a new namespace. When you set $GIT_EDITOR as a trump variable, then that is leaking information about the computation from git var out to the caller. What happens if the computation changes so that $GIT_EDITOR is no longer the last word? * All other features without such a strong environment variables should not be spelled as if there is such an environment variable the user can set in order to avoid confusion. So now you have two classes of variables. Some that look like environment variables, and some that do not. What does it buy you for the ones that do look like environment variables? I feel like either way you will still use them as: editor=$(git var editor) eval $editor $@ Does it matter whether you call it $editor or $GIT_EDITOR? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 4:47 AM, Jeff King p...@peff.net wrote: Subject: pkt-line: allow writing of LARGE_PACKET_MAX buffers When we send out pkt-lines with refnames, we use a static 1000-byte buffer. This means that the maximum size of a ref over the git protocol is around 950 bytes (the exact size depends on the protocol line being written, but figure on a sha1 plus some boilerplate). This is enough for any sane workflow, but occasionally odd things happen (e.g., a bug may create a ref foo/foo/foo/... accidentally). With the current code, you cannot even use push to delete such a ref from a remote. Let's switch to using a strbuf, with a hard-limit of LARGE_PACKET_MAX (which is specified by the protocol). This matches the size of the readers, as of 74543a0 (pkt-line: provide a LARGE_PACKET_MAX static buffer, 2013-02-20). Versions of git older than that will complain about our large packets, but it's really no worse than the current behavior. Right now the sender barfs with impossibly long line trying to send the packet, and afterwards the reader will barf with protocol error: bad line length %d, which is arguably better anyway. Note that we're not really _solving_ the problem here, but just bumping the limits. In theory, the length of a ref is unbounded, and pkt-line can only represent sizes up to 65531 bytes. So we are just bumping the limit, not removing it. But hopefully 64K should be enough for anyone. As a bonus, by using a strbuf for the formatting we can eliminate an unnecessary copy in format_buf_write. Signed-off-by: Jeff King p...@peff.net --- diff --git a/t/t5527-fetch-odd-refs.sh b/t/t5527-fetch-odd-refs.sh index edea9f9..85bcb2e 100755 --- a/t/t5527-fetch-odd-refs.sh +++ b/t/t5527-fetch-odd-refs.sh @@ -26,4 +26,37 @@ test_expect_success 'suffix ref is ignored during fetch' ' test_cmp expect actual ' +test_expect_success 'try to create repo with absurdly long refname' ' + ref240=$_z40/$_z40/$_z40/$_z40/$_z40/$_z40 Maybe you want to keep the -chain intact here? + ref1440=$ref240/$ref240/$ref240/$ref240/$ref240/$ref240 + git init long + ( + cd long + test_commit long + test_commit master + ) + if git -C long update-ref refs/heads/$ref1440 long; then + test_set_prereq LONG_REF + else + echo 2 long refs not supported + fi +' + +test_expect_success LONG_REF 'fetch handles extremely long refname' ' + git fetch long refs/heads/*:refs/remotes/long/* + cat expect -\EOF + long + master + EOF + git for-each-ref --format=%(subject) refs/remotes/long actual + test_cmp expect actual +' + +test_expect_success LONG_REF 'push handles extremely long refname' ' + git push long :refs/heads/$ref1440 + git -C long for-each-ref --format=%(subject) refs/heads actual + echo master expect + test_cmp expect actual +' + test_done -- 2.2.0.454.g7eca6b7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
local clone and submodules
Say I have a local repository with several submodules that point at remote repositories. All submodules are up-to-date. I want to clone everything to another location on disk, *without hitting the network to fetch the submodules*. Obviously a simple git clone will work for the superproject, but submodules will be re-fetched from the remote if I pass --recurse-submodules. Is there any way to do this offline? Reading through the documentation, it doesn't seem easy, but perhaps it is possible to do by recursing through the submodules manually. In Homebrew, we allow building packages directly from a project's source repository. Currently, the repositories are downloaded into a local cache, then checked out into a build directory using git checkout-index and git submodule foreach. However, some projects require access to the git repository to do things like compute a version string. We experimented with using GIT_DIR, but sometimes we are operating on multiple git repositories, and saving and restoring its value in the right places is tedious and error prone. Our current solution is to symlink the .git directory into the build directory, but I'd like to replace this manual checkout + symlink with something cleaner. I could just copy the entire repository from the cache into the build directory, but we deal with some fairly large repositories so I would prefer to be able to use git clone or git clone --shared. Thanks for any suggestions, Jack -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] pkt-line: allow writing of LARGE_PACKET_MAX buffers
On Wed, Dec 10, 2014 at 03:14:17PM -0500, Eric Sunshine wrote: +test_expect_success 'try to create repo with absurdly long refname' ' + ref240=$_z40/$_z40/$_z40/$_z40/$_z40/$_z40 Maybe you want to keep the -chain intact here? Thanks, yeah. It doesn't matter in practice, but we do try to -chain everything. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] introduce git root
Jeff King p...@peff.net writes: On Tue, Dec 09, 2014 at 10:17:13AM -0800, Junio C Hamano wrote: This is another tangent that comes back to the original how to name them? question, but I wonder if a convention like this would work. * When asking for a feature (e.g. what editor to use), if there is a git-specific environment variable that can be set to override all other settings (e.g. $GIT_EDITOR trumps $EDITOR environment or core.editor configuration), git var can be queried with the name of that strong environment variable. I'd prefer to create a new namespace. When you set $GIT_EDITOR as a trump variable, then that is leaking information about the computation from git var out to the caller. What happens if the computation changes so that $GIT_EDITOR is no longer the last word? Having two classes of variables, ones that have their corresponding trump environment variables and others (and making the users aware of the trump variables) is a way to teach users that they could use them when they want to tweak the specific aspect of Git. It of course casts what variable trumps all others in stone, which is a prerequisite if we want to have something stable and teachable to the users. Casting in stone will hinder future improvements, so the argument cuts both ways. Having said that, this was not even a suggestion or a proposal (in the sense that I would be happier if people agreed on this design rather than other design). I'd be happy as long as people agreed on anything sensible. So now you have two classes of variables. Some that look like environment variables, and some that do not. What does it buy you for the ones that do look like environment variables? Teachability. I personally do not care too much about it, though ;-) I would have used a separate namespace for the ones without trump variables, and would have given aliases in that namespace even for the ones with trump variables, so it does not make much difference to my mind---we will have the third namespace anyway whether we went the trumps can be queried via the environment variable name option or not. So it seems that we agree that a separate namespace that is clearly distinct from environment or config would be the way to go? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
Michael Haggerty wrote: I would find it a pity for that work to be squashed into Documentation/technical/api-*.txt, where in my opinion it is less discoverable and more likely to fall into disrepair. I think we're in violent agreement and keep repeating ourselves. All I said is that api-strbuf.txt is currently the most readable documentation of the strbuf API I can find. The patch to move the text to strbuf.h looked rough and incomplete. Therefore I don't think it's ready to be applied as is. If you'd like more details about why I say that, feel free to ask. Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] send-email: handle adjacent RFC 2047-encoded words properly
From: Роман Донченко d...@corrigendum.ru Sent: Sunday, December 07, 2014 6:17 PM Would that be worth a terse comment in the documentation change part of the patch? Multiple (RFC2047) encodings are not supported., or would that be bike shed noise. I didn't change any documentation... and in either case, they weren't supported in the first place, so I don't think it's anything I need to mention. I'd confused this with the crossing thread by Paolo Bonzini bonz...@gnu.org [PATCH 2/2] git-send-email: add --transfer-encoding option; 25 November 2014 14:00. $gmane/260217. Sorry for the noise. -- Philip -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
Jonathan Nieder jrnie...@gmail.com writes: Michael Haggerty wrote: I would find it a pity for that work to be squashed into Documentation/technical/api-*.txt, where in my opinion it is less discoverable and more likely to fall into disrepair. I think we're in violent agreement and keep repeating ourselves. Hmph, I am confused. I somehow had an impression that the move to doc and remove from header patch was to illustrate how unpleasant the result will be as a whole (i.e. results in a nice documentation as a starting point, but we can see that it will be hard to motivate and help people to keep it up to date during further development). Which would suggest that you are in favor of moving the other way around, to keep the header rich with documentation only at the higher level. Am I reading you correctly? All I said is that api-strbuf.txt is currently the most readable documentation of the strbuf API I can find. The patch to move the text to strbuf.h looked rough and incomplete. Therefore I don't think it's ready to be applied as is. If you'd like more details about why I say that, feel free to ask. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
Junio C Hamano wrote: Jonathan Nieder jrnie...@gmail.com writes: Michael Haggerty wrote: I would find it a pity for that work to be squashed into Documentation/technical/api-*.txt, where in my opinion it is less discoverable and more likely to fall into disrepair. I think we're in violent agreement and keep repeating ourselves. Hmph, I am confused. I somehow had an impression that the move to doc and remove from header patch was to illustrate how unpleasant the result will be as a whole (i.e. results in a nice documentation as a starting point, but we can see that it will be hard to motivate and help people to keep it up to date during further development). Which would suggest that you are in favor of moving the other way around, to keep the header rich with documentation only at the higher level. Am I reading you correctly? Sorry, I think I was unclear. Some possibilities, in order of my preference (earlier items are better): 1. Move documentation to header and provide a program to generate a nice standalone document. 2. Move documentation to header, being careful enough that the header sort of works as a standalone document. 3. Move documentation to Documentation/technical/ and keep the header bare-bones. 4. Status quo (comprehensive documentation for some functions in both places, for others in only one place, no reliable way for someone to find the information they need in one place). Since (3) is better than (4), I wrote simple patches to do that for strbuf.h and string-list.h. I meant them in earnest --- I hope they get applied. I think peff was working on (2), which is an admirable goal. The patch seemed to be incomplete. Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
What's cooking in git.git (Dec 2014, #02; Wed, 10)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to master] * jh/empty-notes (2014-11-14) 9 commits (merged to 'next' on 2014-11-18 at 9eeb338) + t3301: modernize style + notes: empty notes should be shown by 'git log' + builtin/notes: add --allow-empty, to allow storing empty notes + builtin/notes: split create_note() to clarify add vs. remove logic + builtin/notes: simplify early exit code in add() + builtin/notes: refactor note file path into struct note_data + builtin/notes: improve naming + t3301: verify that 'git notes' removes empty notes by default + builtin/notes: fix premature failure when trying to add the empty blob A request to store an empty note via git notes meant to remove note from the object but with --allow-empty we will store a (surprise!) note that is empty. In the longer run, we might want to deprecate the somewhat unintuitive emptying means deletion behaviour. * jk/checkout-from-tree (2014-11-13) 1 commit (merged to 'next' on 2014-11-14 at ddbffb0) + checkout $tree: do not throw away unchanged index entries git checkout $treeish $path, when $path in the index and the working tree already matched what is in $treeish at the $path, still overwrote the $path unnecessarily. * jk/gitweb-with-newer-cgi-multi-param (2014-11-18) 1 commit (merged to 'next' on 2014-11-18 at 6ac61fe) + gitweb: hack around CGI's list-context param() handling gitweb used to depend on a behaviour recent CGI.pm deprecated. * js/windows-open-eisdir-error (2014-11-17) 1 commit (merged to 'next' on 2014-11-18 at 57b0d49) + Windows: correct detection of EISDIR in mingw_open() open() emulated on Windows platforms did not give EISDIR upon an attempt to open a directory for writing. * mh/config-flip-xbit-back-after-checking (2014-11-18) 1 commit (merged to 'next' on 2014-11-18 at 45f7d71) + create_default_files(): don't set u+x bit on $GIT_DIR/config (this branch is used by tb/config-core-filemode-check-on-broken-fs.) git init (hence git clone) initialized the per-repository configuration file .git/config with x-bit by mistake. * rs/env-array-in-child-process (2014-11-10) 1 commit (merged to 'next' on 2014-11-14 at 3f6ba07) + use args member of struct child_process Code cleanup. * rs/maint-config-use-labs (2014-11-17) 1 commit (merged to 'next' on 2014-11-18 at 53c2404) + use labs() for variables of type long instead of abs() A few code paths used abs() when they should have used labs() on long integers. * rs/receive-pack-use-labs (2014-11-17) 1 commit (merged to 'next' on 2014-11-18 at c6d2d94) + use labs() for variables of type long instead of abs() A few code paths used abs() when they should have used labs() on long integers. * sv/get-builtin (2014-11-13) 1 commit (merged to 'next' on 2014-11-14 at 9497e17) + builtin: move builtin retrieval to get_builtin() Small code consolidation. * tq/git-ssh-command (2014-11-10) 1 commit (merged to 'next' on 2014-11-14 at 83f5dae) + git_connect: set ssh shell command in GIT_SSH_COMMAND Allow passing extra set of arguments when ssh is invoked to create an encrypted authenticated connection, which is not possible with existing GIT_SSH mechanism, which was designed more to match what other programs with similar variables did, not necessarily to be more useful. -- [New Topics] * dm/compat-s-ifmt-for-zos (2014-12-04) 1 commit - compat: convert modes to use portable file type values Long overdue departure from the assumption that S_IFMT is shared by everybody made in 2005. * jk/credential-quit (2014-12-04) 2 commits - prompt: respect GIT_TERMINAL_PROMPT to disable terminal prompts - credential: let helpers tell us to quit Credential helpers are asked in turn until one of them give positive response, which is cumbersome to turn off when you need to run Git in an automated setting. The credential helper interface learned to allow a helper to say stop, don't ask other helpers. Also GIT_TERMINAL_PROMPT environment can be set to false to disable our built-in prompt mechanism for passwords. Will merge to 'next'. * mg/branch-d-m-f (2014-12-09) 2 commits - branch: allow -f with -m and -d - t3200-branch: test -M git branch -d (delete) and git branch -m (move) learned to honor -f (force) flag; unlike many other subcommands, the way to force these have been with separate -D/-M options, which was inconsistent. Will merge to 'next'. * mg/doc-check-ignore-tracked-are-not-ignored (2014-12-04) 1 commit - check-ignore: clarify treatment of tracked files Will merge to 'next'. * rt/completion-tag
Re: [RFC/PATCH 2/5] glossary: introduce glossary lookup command
Michael J Gruber g...@drmicha.warpmail.net writes: When using a localised git, there are many reasons why a correspondence between English and localised git terms is needed: - connect localised messages with English ones (porcelain vs. plumbing) - connect localised messages with English man pages or online docs - help out someone in a different locale Introduce a `git glossary' command that leverages the existing infrastructure in three possible ways: - `git glossary' lists all glossary terms along with their translation - `git glossary foo' matches `foo' in the glossary (both English and localisation; partial matches shown) - `git glossary -a foo' matches `foo' in the git message catalogue (English, exact match only). Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net --- Some bike-shedding expected regarding the interface... Once I've learned how to test l10n stuff, this will be amended. This is interesting. I wondered if we want to also have the associated documentation in response to a query, but I am not sure how well that would go without having a translated glossary at least. I.e. pulling the original from glossary-content.txt would produce something like this: $ LANG=de git glossary -l blob object Blob-Objekten Untyped def_object,object, e.g. the contents of a file. which is not ideal. I noticed that you allow querying more than one term from the command line, so the above example would not work quite well, as we would end up querying blob and then object, not a single term blob object which does have N_() translation in glossary.h. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
Jonathan Nieder jrnie...@gmail.com writes: Some possibilities, in order of my preference (earlier items are better): 1. Move documentation to header and provide a program to generate a nice standalone document. 2. Move documentation to header, being careful enough that the header sort of works as a standalone document. 3. Move documentation to Documentation/technical/ and keep the header bare-bones. 4. Status quo (comprehensive documentation for some functions in both places, for others in only one place, no reliable way for someone to find the information they need in one place). Since (3) is better than (4), I wrote simple patches to do that for strbuf.h and string-list.h. I meant them in earnest --- I hope they get applied. I think peff was working on (2), which is an admirable goal. The patch seemed to be incomplete. Yeah, I agree with the above preferred ordering, and also think once we get to (2) it would be usable by the intended audience. Those who prefer (1) over (2) are the ones who somehow want to read hard copies ;-) and are likely to be different audiences and are better served by people with different skill-set and inclination. I am not sure if (2) and (3) are that incompatible, though. If you had an acceptable version of (3), wouldn't it be just the matter of indenting the whole thing with s/^/ */, sprinkle /** and */ at strategic paragraph breaks, and move them back to the corresponding header? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document string_list_clear
Junio C Hamano wrote: Jonathan Nieder jrnie...@gmail.com writes: 2. Move documentation to header, being careful enough that the header sort of works as a standalone document. 3. Move documentation to Documentation/technical/ and keep the header bare-bones. [...] I am not sure if (2) and (3) are that incompatible, though. If you had an acceptable version of (3), wouldn't it be just the matter of indenting the whole thing with s/^/ */, sprinkle /** and */ at strategic paragraph breaks, and move them back to the corresponding header? Presumably. There's also the question of whether to use asciidoc markup, which got mixed in somehow (but I don't see why it has to be --- a header with asciidoc would be fine with me, as would a file in Documentation/technical/ without asciidoc). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor push performance with large number of refs
On Tue, Dec 09, 2014 at 09:41:28PM -0800, Shawn Pearce wrote: On Tue, Dec 9, 2014 at 4:37 PM, brian m. carlson sand...@crustytoothpaste.net wrote: Most of the time is spent between the Pushing to remote machine and Counting objects, running git pack-objects: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress Unfortunately, -vvv doesn't provide any helpful output. I have some suspicions what's going on here, but no hard data. Where should I be looking to determine the bottleneck? My guess is the revision queue is struggling to insert 20,000 commits that the remote side has, are uninteresting, and should not be transmitted. This queue insertion usually requires parsing the commit object out of the local object store to get the commit timestamp, then bubble sort inserting that commit into the queue. I looked at this more in depth today and I found that the bottleneck is --thin. I tried git send-pack, which does not use --thin by default, which led me to further testing. A particular push went from 24 seconds with --thin to 4 seconds without. I agree that the large number of refs is at least part of the problem, because reducing the number of refs has a slight but noticeable impact. It's also the factor I can least control. I have a patch which allows per-remote configuration of whether to use thin packs (which I will send shortly), but I'm wondering if we can do better, especially since --thin is the default. It looks like --thin forces pack-objects to do its own lookup (essentially a rev-list) instead of using the values provided on stdin. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187 signature.asc Description: Digital signature
Re: Blobs not referenced by file (anymore) are not removed by GC
On 10 December 2014 at 16:07, Junio C Hamano gits...@pobox.com wrote: Jeff King p...@peff.net writes: git reflog expire --expire=now --all git gc --prune=now --aggressive Maybe: git gc --purge Yeah, that is common enough that it might be worthwhile (you probably want --expire-unreachable in the reflog invocation, though). Also you would not want an unconditional --aggressive. After a big rewrite deleting files the re-optimisation of --aggressive can make a big difference to packsize - for instance 1.2GB to 768MB in a test I just ran - but of course it is *much* slower, so I suspect you're right about not including it. I wasn't aware of the '--expire-unreachable=all' switch, though it seems like a 'milder' version of the '--expire=now' switch? - in that it would keep reflog entries if they haven't been changed, which is fair enough and compatible with the 'purge' goal. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: filter-branch performance
On 10 December 2014 at 16:05, Junio C Hamano gits...@pobox.com wrote: Roberto Tyley roberto.ty...@gmail.com writes: The BFG is generally faster than filter-branch for 3 reasons: 1. No forking - everything stays in the JVM process 2. Embarrassingly parallel algorithm makes good use of multi-core machines 3. Memoization means no Git object (file or folder) is cleaned more than once In the case of your problem, only the first factor will be noticeably helpful. Unfortunately commits do need to be cleaned sequentially, as their hashes depend on the hashes of their parents, and filter-branch doesn't clean /commits/ more than once, the way it does with files or folders - so the last 2 reasons in the list won't be significant. Just this part. If your history is bushy, you should be able to rewrite histories of merged branches in parallel up to the point they are merged---rewriting of the merge commit of course has to wait until all the branches have been rewritten, though. That's true, and the bfg does take advantage of that parallelism, so as well as point 1, point 2 will provide some benefit if history is bushy enough :) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] t1400: add some more tests of update-ref --stdin's verify command
Two of the tests fail because verify refs/heads/foo with no argument (not even zeros) actually *deletes* refs/heads/foo. This problem will be fixed in the next commit. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- The two failing tests have to restore the $m reference when they're done because otherwise the bug deletes it, causing subsequent tests to fail. t/t1400-update-ref.sh | 92 +++ 1 file changed, 92 insertions(+) diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index 7b4707b..6a3cdd1 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -619,6 +619,52 @@ test_expect_success 'stdin update/create/verify combination works' ' test_must_fail git rev-parse --verify -q $c ' +test_expect_success 'stdin verify succeeds for correct value' ' + git rev-parse $m expect + echo verify $m $m stdin + git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin verify succeeds for missing reference' ' + echo verify refs/heads/missing $Z stdin + git update-ref --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin verify treats no value as missing' ' + echo verify refs/heads/missing stdin + git update-ref --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin verify fails for wrong value' ' + git rev-parse $m expect + echo verify $m $m~1 stdin + test_must_fail git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin verify fails for mistaken null value' ' + git rev-parse $m expect + echo verify $m $Z stdin + test_must_fail git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_failure 'stdin verify fails for mistaken empty value' ' + M=$(git rev-parse $m) + test_when_finished git update-ref $m $M + git rev-parse $m expect + echo verify $m stdin + test_must_fail git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + test_expect_success 'stdin update refs works with identity updates' ' cat stdin -EOF update $a $m $m @@ -938,6 +984,52 @@ test_expect_success 'stdin -z update/create/verify combination works' ' test_must_fail git rev-parse --verify -q $c ' +test_expect_success 'stdin -z verify succeeds for correct value' ' + git rev-parse $m expect + printf $F verify $m $m stdin + git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin -z verify succeeds for missing reference' ' + printf $F verify refs/heads/missing $Z stdin + git update-ref -z --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin -z verify treats no value as missing' ' + printf $F verify refs/heads/missing stdin + git update-ref -z --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin -z verify fails for wrong value' ' + git rev-parse $m expect + printf $F verify $m $m~1 stdin + test_must_fail git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin -z verify fails for mistaken null value' ' + git rev-parse $m expect + printf $F verify $m $Z stdin + test_must_fail git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_failure 'stdin -z verify fails for mistaken empty value' ' + M=$(git rev-parse $m) + test_when_finished git update-ref $m $M + git rev-parse $m expect + printf $F verify $m stdin + test_must_fail git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + test_expect_success 'stdin -z update refs works with identity updates' ' printf $F update $a $m $m update $b $m $m update $c $Z stdin git update-ref -z --stdin stdin -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] Fix a bug with update-ref verify and no oldvalue
Ever since the --stdin option was added to git update-ref in c750ba9519 update-ref: support multiple simultaneous updates (2013-09-09) the verify command has been broken. If no oldvalue is specified, the documentation says that the verify command will verify that the reference doesn't currently exist. But in fact, it unconditionally *deletes* the reference (!) Hopefully this is not a common usage idiom, but this is nonetheless a serious bug. Add some tests for this and related functionality, then fix the bug. These patches are also available from my GitHub repository [1] as branch update-ref-verify-fix-v1. This fix applies to maint, for which I think it is appropriate. It also merges through to master with no conflicts, though it conflicts trivially with pu. [1] https://github.com/mhagger/git Michael Haggerty (2): t1400: add some more tests of update-ref --stdin's verify command update-ref: fix verify command with missing oldvalue builtin/update-ref.c | 14 +++- t/t1400-update-ref.sh | 92 +++ 2 files changed, 97 insertions(+), 9 deletions(-) -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] update-ref: fix verify command with missing oldvalue
If git update-ref --stdin was given a verify command with no newvalue at all (not even zeros), the code was mistakenly setting have_old=0 (and leaving old_sha1 uninitialized). But this is incorrect: this command is supposed to verify that the reference doesn't exist. So in this case we really need old_sha1 to be set to null_sha1 and have_old to be set to 1. Moreover, since have_old was being set to zero, *no* check of the old value was being done, so the new value of the reference was being set unconditionally to the value in new_sha1. new_sha1, in turn, was set to null_sha1 in the expectation that that was the old value and it shouldn't be changed. But because the precondition was not being checked, the result was that the reference was being deleted unconditionally. So, if oldvalue is missing, set have_old unconditionally and set old_sha1 to null_sha1. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- builtin/update-ref.c | 14 +- t/t1400-update-ref.sh | 4 ++-- 2 files changed, 7 insertions(+), 11 deletions(-) diff --git a/builtin/update-ref.c b/builtin/update-ref.c index 6c9be05..1993529 100644 --- a/builtin/update-ref.c +++ b/builtin/update-ref.c @@ -282,26 +282,22 @@ static const char *parse_cmd_verify(struct ref_transaction *transaction, char *refname; unsigned char new_sha1[20]; unsigned char old_sha1[20]; - int have_old; refname = parse_refname(input, next); if (!refname) die(verify: missing ref); if (parse_next_sha1(input, next, old_sha1, verify, refname, - PARSE_SHA1_OLD)) { - hashclr(new_sha1); - have_old = 0; - } else { - hashcpy(new_sha1, old_sha1); - have_old = 1; - } + PARSE_SHA1_OLD)) + hashclr(old_sha1); + + hashcpy(new_sha1, old_sha1); if (*next != line_termination) die(verify %s: extra input: %s, refname, next); if (ref_transaction_update(transaction, refname, new_sha1, old_sha1, - update_flags, have_old, msg, err)) + update_flags, 1, msg, err)) die(%s, err.buf); update_flags = 0; diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index 6a3cdd1..6805b9e 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -655,7 +655,7 @@ test_expect_success 'stdin verify fails for mistaken null value' ' test_cmp expect actual ' -test_expect_failure 'stdin verify fails for mistaken empty value' ' +test_expect_success 'stdin verify fails for mistaken empty value' ' M=$(git rev-parse $m) test_when_finished git update-ref $m $M git rev-parse $m expect @@ -1020,7 +1020,7 @@ test_expect_success 'stdin -z verify fails for mistaken null value' ' test_cmp expect actual ' -test_expect_failure 'stdin -z verify fails for mistaken empty value' ' +test_expect_success 'stdin -z verify fails for mistaken empty value' ' M=$(git rev-parse $m) test_when_finished git update-ref $m $M git rev-parse $m expect -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] push: add remote.pushThin to control thin pack generation
From: brian m. carlson brian.carl...@cpanel.net Thin packs are enabled when pushing by default, but with a large number of refs and a fast network, git may spend more time trying to find a good delta than it would spend simply sending data over the network. Add a per-remote option, pushThin, that controls the default for pushes on that remote. Signed-off-by: brian m. carlson brian.carl...@cpanel.net --- SOURCES/git/Documentation/config.txt | 6 ++ SOURCES/git/builtin/push.c | 6 -- SOURCES/git/remote.c | 3 +++ SOURCES/git/remote.h | 1 + SOURCES/git/transport.c | 2 ++ 5 files changed, 16 insertions(+), 2 deletions(-) diff --git a/SOURCES/git/Documentation/config.txt b/SOURCES/git/Documentation/config.txt index 9220725..7fededd 100644 --- a/SOURCES/git/Documentation/config.txt +++ b/SOURCES/git/Documentation/config.txt @@ -2178,6 +2178,12 @@ remote.name.push:: The default set of refspec for linkgit:git-push[1]. See linkgit:git-push[1]. +remote.name.pushThin:: + If true (the default), pass the \--thin option to + linkgit:git-pack-objects[1] during push. This results in a smaller + pack being sent and can improve push time over slow networks. Over + fast networks, setting this value to false may improve performance. + remote.name.mirror:: If true, pushing to this remote will automatically behave as if the `--mirror` option was given on the command line. diff --git a/SOURCES/git/builtin/push.c b/SOURCES/git/builtin/push.c index a076b19..ae39677 100644 --- a/SOURCES/git/builtin/push.c +++ b/SOURCES/git/builtin/push.c @@ -15,7 +15,7 @@ static const char * const push_usage[] = { NULL, }; -static int thin = 1; +static int thin = -1; static int deleterefs; static const char *receivepack; static int verbosity; @@ -347,7 +347,9 @@ static int push_with_options(struct transport *transport, int flags) if (receivepack) transport_set_option(transport, TRANS_OPT_RECEIVEPACK, receivepack); - transport_set_option(transport, TRANS_OPT_THIN, thin ? yes : NULL); + + if (thin != -1) + transport_set_option(transport, TRANS_OPT_THIN, thin ? yes : NULL); if (!is_empty_cas(cas)) { if (!transport-smart_options) diff --git a/SOURCES/git/remote.c b/SOURCES/git/remote.c index f624217..54777cb 100644 --- a/SOURCES/git/remote.c +++ b/SOURCES/git/remote.c @@ -175,6 +175,7 @@ static struct remote *make_remote(const char *name, int len) ret = xcalloc(1, sizeof(struct remote)); ret-prune = -1; /* unspecified */ + ret-push_thin = 1; /* default on */ ALLOC_GROW(remotes, remotes_nr + 1, remotes_alloc); remotes[remotes_nr++] = ret; ret-name = xstrndup(name, len); @@ -445,6 +446,8 @@ static int handle_config(const char *key, const char *value, void *cb) if (git_config_string(v, key, value)) return -1; add_push_refspec(remote, v); + } else if (!strcmp(subkey, .pushthin)) { + remote-push_thin = git_config_bool(key, value); } else if (!strcmp(subkey, .fetch)) { const char *v; if (git_config_string(v, key, value)) diff --git a/SOURCES/git/remote.h b/SOURCES/git/remote.h index 8b62efd..badf266 100644 --- a/SOURCES/git/remote.h +++ b/SOURCES/git/remote.h @@ -46,6 +46,7 @@ struct remote { int skip_default_update; int mirror; int prune; + int push_thin; const char *receivepack; const char *uploadpack; diff --git a/SOURCES/git/transport.c b/SOURCES/git/transport.c index 70d38e4..2f495fa 100644 --- a/SOURCES/git/transport.c +++ b/SOURCES/git/transport.c @@ -987,6 +987,8 @@ struct transport *transport_get(struct remote *remote, const char *url) ret-smart_options-receivepack = remote-receivepack; } + transport_set_option(ret, TRANS_OPT_THIN, remote-push_thin ? yes : NULL); + return ret; } -- 2.2.0 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] t1400: add some more tests of update-ref --stdin's verify command
On Thu, Dec 11, 2014 at 12:47:51AM +0100, Michael Haggerty wrote: Two of the tests fail because verify refs/heads/foo with no argument (not even zeros) actually *deletes* refs/heads/foo. This problem will be fixed in the next commit. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- Reviewed-By: Stefan Beller sbel...@google.com The two failing tests have to restore the $m reference when they're done because otherwise the bug deletes it, causing subsequent tests to fail. t/t1400-update-ref.sh | 92 +++ 1 file changed, 92 insertions(+) diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index 7b4707b..6a3cdd1 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -619,6 +619,52 @@ test_expect_success 'stdin update/create/verify combination works' ' test_must_fail git rev-parse --verify -q $c ' +test_expect_success 'stdin verify succeeds for correct value' ' + git rev-parse $m expect + echo verify $m $m stdin + git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin verify succeeds for missing reference' ' + echo verify refs/heads/missing $Z stdin + git update-ref --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin verify treats no value as missing' ' + echo verify refs/heads/missing stdin + git update-ref --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin verify fails for wrong value' ' + git rev-parse $m expect + echo verify $m $m~1 stdin + test_must_fail git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin verify fails for mistaken null value' ' + git rev-parse $m expect + echo verify $m $Z stdin + test_must_fail git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_failure 'stdin verify fails for mistaken empty value' ' + M=$(git rev-parse $m) + test_when_finished git update-ref $m $M + git rev-parse $m expect + echo verify $m stdin + test_must_fail git update-ref --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + test_expect_success 'stdin update refs works with identity updates' ' cat stdin -EOF update $a $m $m @@ -938,6 +984,52 @@ test_expect_success 'stdin -z update/create/verify combination works' ' test_must_fail git rev-parse --verify -q $c ' +test_expect_success 'stdin -z verify succeeds for correct value' ' + git rev-parse $m expect + printf $F verify $m $m stdin + git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin -z verify succeeds for missing reference' ' + printf $F verify refs/heads/missing $Z stdin + git update-ref -z --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin -z verify treats no value as missing' ' + printf $F verify refs/heads/missing stdin + git update-ref -z --stdin stdin + test_must_fail git rev-parse --verify -q refs/heads/missing +' + +test_expect_success 'stdin -z verify fails for wrong value' ' + git rev-parse $m expect + printf $F verify $m $m~1 stdin + test_must_fail git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_success 'stdin -z verify fails for mistaken null value' ' + git rev-parse $m expect + printf $F verify $m $Z stdin + test_must_fail git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + +test_expect_failure 'stdin -z verify fails for mistaken empty value' ' + M=$(git rev-parse $m) + test_when_finished git update-ref $m $M + git rev-parse $m expect + printf $F verify $m stdin + test_must_fail git update-ref -z --stdin stdin + git rev-parse $m actual + test_cmp expect actual +' + test_expect_success 'stdin -z update refs works with identity updates' ' printf $F update $a $m $m update $b $m $m update $c $Z stdin git update-ref -z --stdin stdin -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 22/23] lock_any_ref_for_update(): inline function
On 12/09/2014 12:34 AM, Stefan Beller wrote: On Fri, Dec 05, 2014 at 12:08:34AM +0100, Michael Haggerty wrote: From: Ronnie Sahlberg sahlb...@google.com Inline the function at its one remaining caller (which is within refs.c) and remove it. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu It's originally from Ronnie, but his sign off is missing? If that sign off is found again, Reviewed-by: Stefan Beller sbel...@google.com Sorry for the confusion. This patch ultimately descends from a patch by Ronnie that completely deleted lock_any_ref_for_update() because (in the context that he wrote it) there were no remaining callers of that function. In the context of this patch, there *is* a remaining caller. So this patch is different--it deletes the obvious stuff as in Ronnie's patch, but it also inlines the function at its last caller. And the commit message is completely different. Given those differences and the fact that the only overlap with Ronnie's original patch is the *deletion* of content, I thought it most appropriate to change the authorship of the patch. So I removed Ronnie's Signed-off-by line but I forgot to remove his authorship. If anybody thinks it would be more appropriate to leave Ronnie as author of the new patch or leave his Signed-off-by line, I am totally OK with doing so. I'm just trying to figure out what's right. Otherwise, I will change the authorship of the patch to myself in the upcoming reroll and include only my own Signed-off-by line. Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] update-ref: fix verify command with missing oldvalue
On Thu, Dec 11, 2014 at 12:47:52AM +0100, Michael Haggerty wrote: If git update-ref --stdin was given a verify command with no newvalue at all (not even zeros), the code was mistakenly setting have_old=0 (and leaving old_sha1 uninitialized). But this is incorrect: this command is supposed to verify that the reference doesn't exist. So in this case we really need old_sha1 to be set to null_sha1 and have_old to be set to 1. Moreover, since have_old was being set to zero, *no* check of the old value was being done, so the new value of the reference was being set unconditionally to the value in new_sha1. new_sha1, in turn, was set to null_sha1 in the expectation that that was the old value and it shouldn't be changed. But because the precondition was not being checked, the result was that the reference was being deleted unconditionally. So, if oldvalue is missing, set have_old unconditionally and set old_sha1 to null_sha1. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu This is reviewed by me as well. Reviewed-by: Stefan Beller sbel...@google.com --- builtin/update-ref.c | 14 +- t/t1400-update-ref.sh | 4 ++-- 2 files changed, 7 insertions(+), 11 deletions(-) diff --git a/builtin/update-ref.c b/builtin/update-ref.c index 6c9be05..1993529 100644 --- a/builtin/update-ref.c +++ b/builtin/update-ref.c @@ -282,26 +282,22 @@ static const char *parse_cmd_verify(struct ref_transaction *transaction, char *refname; unsigned char new_sha1[20]; unsigned char old_sha1[20]; - int have_old; refname = parse_refname(input, next); if (!refname) die(verify: missing ref); if (parse_next_sha1(input, next, old_sha1, verify, refname, - PARSE_SHA1_OLD)) { - hashclr(new_sha1); - have_old = 0; - } else { - hashcpy(new_sha1, old_sha1); - have_old = 1; - } + PARSE_SHA1_OLD)) + hashclr(old_sha1); + + hashcpy(new_sha1, old_sha1); if (*next != line_termination) die(verify %s: extra input: %s, refname, next); if (ref_transaction_update(transaction, refname, new_sha1, old_sha1, -update_flags, have_old, msg, err)) +update_flags, 1, msg, err)) die(%s, err.buf); update_flags = 0; diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index 6a3cdd1..6805b9e 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -655,7 +655,7 @@ test_expect_success 'stdin verify fails for mistaken null value' ' test_cmp expect actual ' -test_expect_failure 'stdin verify fails for mistaken empty value' ' +test_expect_success 'stdin verify fails for mistaken empty value' ' M=$(git rev-parse $m) test_when_finished git update-ref $m $M git rev-parse $m expect @@ -1020,7 +1020,7 @@ test_expect_success 'stdin -z verify fails for mistaken null value' ' test_cmp expect actual ' -test_expect_failure 'stdin -z verify fails for mistaken empty value' ' +test_expect_success 'stdin -z verify fails for mistaken empty value' ' M=$(git rev-parse $m) test_when_finished git update-ref $m $M git rev-parse $m expect -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] push: add remote.pushThin to control thin pack generation
On Wed, Dec 10, 2014 at 11:49:49PM +, brian m. carlson wrote: From: brian m. carlson brian.carl...@cpanel.net Thin packs are enabled when pushing by default, but with a large number of refs and a fast network, git may spend more time trying to find a good delta than it would spend simply sending data over the network. Add a per-remote option, pushThin, that controls the default for pushes on that remote. I just realized this patch doesn't apply outside of the particular repo I was using. Consider this more of an RFC, since I'd like to avoid going this route if possible, in favor of a more robust approach. Signed-off-by: brian m. carlson brian.carl...@cpanel.net --- SOURCES/git/Documentation/config.txt | 6 ++ SOURCES/git/builtin/push.c | 6 -- SOURCES/git/remote.c | 3 +++ SOURCES/git/remote.h | 1 + SOURCES/git/transport.c | 2 ++ 5 files changed, 16 insertions(+), 2 deletions(-) diff --git a/SOURCES/git/Documentation/config.txt b/SOURCES/git/Documentation/config.txt index 9220725..7fededd 100644 --- a/SOURCES/git/Documentation/config.txt +++ b/SOURCES/git/Documentation/config.txt @@ -2178,6 +2178,12 @@ remote.name.push:: The default set of refspec for linkgit:git-push[1]. See linkgit:git-push[1]. +remote.name.pushThin:: + If true (the default), pass the \--thin option to + linkgit:git-pack-objects[1] during push. This results in a smaller + pack being sent and can improve push time over slow networks. Over + fast networks, setting this value to false may improve performance. + remote.name.mirror:: If true, pushing to this remote will automatically behave as if the `--mirror` option was given on the command line. diff --git a/SOURCES/git/builtin/push.c b/SOURCES/git/builtin/push.c index a076b19..ae39677 100644 --- a/SOURCES/git/builtin/push.c +++ b/SOURCES/git/builtin/push.c @@ -15,7 +15,7 @@ static const char * const push_usage[] = { NULL, }; -static int thin = 1; +static int thin = -1; static int deleterefs; static const char *receivepack; static int verbosity; @@ -347,7 +347,9 @@ static int push_with_options(struct transport *transport, int flags) if (receivepack) transport_set_option(transport, TRANS_OPT_RECEIVEPACK, receivepack); - transport_set_option(transport, TRANS_OPT_THIN, thin ? yes : NULL); + + if (thin != -1) + transport_set_option(transport, TRANS_OPT_THIN, thin ? yes : NULL); if (!is_empty_cas(cas)) { if (!transport-smart_options) diff --git a/SOURCES/git/remote.c b/SOURCES/git/remote.c index f624217..54777cb 100644 --- a/SOURCES/git/remote.c +++ b/SOURCES/git/remote.c @@ -175,6 +175,7 @@ static struct remote *make_remote(const char *name, int len) ret = xcalloc(1, sizeof(struct remote)); ret-prune = -1; /* unspecified */ + ret-push_thin = 1; /* default on */ ALLOC_GROW(remotes, remotes_nr + 1, remotes_alloc); remotes[remotes_nr++] = ret; ret-name = xstrndup(name, len); @@ -445,6 +446,8 @@ static int handle_config(const char *key, const char *value, void *cb) if (git_config_string(v, key, value)) return -1; add_push_refspec(remote, v); + } else if (!strcmp(subkey, .pushthin)) { + remote-push_thin = git_config_bool(key, value); } else if (!strcmp(subkey, .fetch)) { const char *v; if (git_config_string(v, key, value)) diff --git a/SOURCES/git/remote.h b/SOURCES/git/remote.h index 8b62efd..badf266 100644 --- a/SOURCES/git/remote.h +++ b/SOURCES/git/remote.h @@ -46,6 +46,7 @@ struct remote { int skip_default_update; int mirror; int prune; + int push_thin; const char *receivepack; const char *uploadpack; diff --git a/SOURCES/git/transport.c b/SOURCES/git/transport.c index 70d38e4..2f495fa 100644 --- a/SOURCES/git/transport.c +++ b/SOURCES/git/transport.c @@ -987,6 +987,8 @@ struct transport *transport_get(struct remote *remote, const char *url) ret-smart_options-receivepack = remote-receivepack; } + transport_set_option(ret, TRANS_OPT_THIN, remote-push_thin ? yes : NULL); + return ret; } -- 2.2.0 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187 signature.asc Description: Digital signature
Re: Poor push performance with large number of refs
On Thu, Dec 11, 2014 at 6:34 AM, brian m. carlson sand...@crustytoothpaste.net wrote: I looked at this more in depth today and I found that the bottleneck is --thin. I tried git send-pack, which does not use --thin by default, which led me to further testing. A particular push went from 24 seconds with --thin to 4 seconds without. I agree that the large number of refs is at least part of the problem, because reducing the number of refs has a slight but noticeable impact. It's also the factor I can least control. I have a patch which allows per-remote configuration of whether to use thin packs (which I will send shortly), but I'm wondering if we can do better, especially since --thin is the default. It looks like --thin forces pack-objects to do its own lookup (essentially a rev-list) instead of using the values provided on stdin. It could be a regression by fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting - 2013-08-16). That commit makes --thin a lot more agressive (reading lots of trees). You can try to revert that commit (or use a git version without that commit) and see if it improves performance. If so, we probably want to enable that code for shallow repos only. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor push performance with large number of refs
On Thu, Dec 11, 2014 at 08:41:07AM +0700, Duy Nguyen wrote: It could be a regression by fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting - 2013-08-16). That commit makes --thin a lot more agressive (reading lots of trees). You can try to revert that commit (or use a git version without that commit) and see if it improves performance. If so, we probably want to enable that code for shallow repos only. That's exactly it. With Git 2.2.0, --no-thin was 2.295s, --thin was 8.769s, and --thin with the patch reverted was 3.645s. I'll come up with a patch. Thanks for the suggestion. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187 signature.asc Description: Digital signature
[PATCH] list-objects: mark fewer commits as edges for non-shallow clones
In commit fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting - 2013-08-16), we made --thin much more aggressive by reading lots more trees. This produces much smaller packs for shallow clones; however, it causes a significant performance regression for non-shallow clones with lots of refs (23.322 seconds vs. 4.785 seconds with 22400 refs). Limit this extra edge-marking to shallow clones to avoid poor performance. Suggested-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com Signed-off-by: brian m. carlson sand...@crustytoothpaste.net --- list-objects.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/list-objects.c b/list-objects.c index 2910bec..274ebb4 100644 --- a/list-objects.c +++ b/list-objects.c @@ -151,13 +151,14 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge) { struct commit_list *list; int i; + int shallow = is_repository_shallow(); for (list = revs-commits; list; list = list-next) { struct commit *commit = list-item; if (commit-object.flags UNINTERESTING) { mark_tree_uninteresting(commit-tree); - if (revs-edge_hint !(commit-object.flags SHOWN)) { + if (shallow revs-edge_hint !(commit-object.flags SHOWN)) { commit-object.flags |= SHOWN; show_edge(commit); } @@ -165,6 +166,8 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge) } mark_edge_parents_uninteresting(commit, revs, show_edge); } + if (!shallow) + return; if (revs-edge_hint) { for (i = 0; i revs-cmdline.nr; i++) { struct object *obj = revs-cmdline.rev[i].item; -- 2.2.0.rc0.207.ga3a616c -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] added git-config support for diff.relative setting
By default, git-diff shows changes and pathnames relative to the repository root. Setting the diff.relative config option to true shows pathnames relative to the current directory and excludes changes outside this directory. --- Documentation/diff-config.txt | 6 ++ diff.c| 8 t/t4045-diff-relative.sh | 21 + 3 files changed, 35 insertions(+) diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt index b001779..10f183f 100644 --- a/Documentation/diff-config.txt +++ b/Documentation/diff-config.txt @@ -182,3 +182,9 @@ diff.algorithm:: low-occurrence common elements. -- + + +diff.relative:: + By default, linkgit:git-diff[1] shows changes and pathnames + relative to the repository root. Setting this variable to + `true` shows pathnames relative to the current directory and + excludes changes outside this directory. diff --git a/diff.c b/diff.c index d1bd534..22daa2f 100644 --- a/diff.c +++ b/diff.c @@ -270,6 +270,14 @@ int git_diff_basic_config(const char *var, const char *value, void *cb) return 0; } + if (!strcmp(var, diff.relative)) { + if (git_config_bool(var, value)) + DIFF_OPT_SET(default_diff_options, RELATIVE_NAME); + else + DIFF_OPT_CLR(default_diff_options, RELATIVE_NAME); + return 0; + } + if (starts_with(var, submodule.)) return parse_submodule_config_option(var, value); diff --git a/t/t4045-diff-relative.sh b/t/t4045-diff-relative.sh index 3950f50..8c8fe0b 100755 --- a/t/t4045-diff-relative.sh +++ b/t/t4045-diff-relative.sh @@ -29,6 +29,23 @@ test_expect_success -p $* } +check_config() { +expect=$1; shift +cat expected EOF +diff --git a/$expect b/$expect +new file mode 100644 +index 000..25c05ef +--- /dev/null b/$expect +@@ -0,0 +1 @@ ++other content +EOF +test_expect_success git-config diff.relative=true in $1 + (cd $1; git -c diff.relative=true diff -p HEAD^ ../actual) + test_cmp expected actual + +} + check_numstat() { expect=$1; shift cat expected EOF @@ -69,5 +86,9 @@ for type in diff numstat stat raw; do check_$type file2 --relative=subdir check_$type dir/file2 --relative=sub done +for type in config; do + check_$type file2 subdir/ + check_$type file2 subdir +done test_done -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Interested in helping open source friends on HP-UX?
Hello, all. H. Merijn Brand runs a few HP-UX boxes to help perl5 and other open source communities, wants help porting more recent Git on these boxes, running HP-UX 10.20, 11.00, and 11.23, and looking for a volunteer. Please contact him directly if you are interested. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html