Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter
On Wed, May 21, 2014 at 05:07:38PM -0700, Kyle J. McKay wrote: +p = skip_prefix(type-buf, text/plain); +if (!p || (*p *p != ';')) +return 0; + +return 1; +} + I think that a strict reading of RFC 2616 allows text/plain ; charset=utf-8 as well as text/plain;charset=utf-8 and text/plain; charset=utf-8. So perhaps this if line instead: + if (!p || (*p *p != ';' *p != ' ' *p != '\t')) See RFC 2616 section 2.2 for the definition of linear white space (LWS) and the end of section 2.1 for the implied *LWS rule that allows it. You're right. There are a few exceptions in 3.7: Linear white space (LWS) MUST NOT be used between the type and subtype, nor between an attribute and its value. but it does not include between the subtype and parameter. So the find_parameter also needs to accept the collapsed whitespace, too, I guess. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 9/9] remote-curl: reencode http error messages
On Wed, May 21, 2014 at 05:07:40PM -0700, Kyle J. McKay wrote: +/* default charset from rfc2616 */ +if (!*charset) +*charset = xstrdup(iso8859-1); Actually the name should be ISO-8859-1. See RFC 2616 section 3.7.1. Since it's case insensitive iso-8859-1 would be fine too. Thanks, will fix. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 10/19] git-submodule.sh: convert test -a/-o to and ||
Elia Pinto gitter.spi...@gmail.com writes: This is version 2 of the patch to git-submodule of the patch series convert test -a/-o to and ||. It contains the fixes identified by Johannes and Matthieu. This version of the patch (not the whole series) is Reviewed-by: Matthieu Moy matthieu@imag.fr Some comments for next time: * I agree with Johannes that splitting the series with one patch per file is not the right way to split. As a reviewer, I want to - check that -a trivially translates to - check that -o trivially translates to || - check non-trivial cases Interleaving these cases (especially the trivial and non-trivial cases) makes the review much harder. For this kind of series, the change is trivial, but the review is not (Johannes caught a || Vs inversion that I didn't find for example, which is quite serious), so the optimize your patches for review is even more important than usual. * Again, to avoid mixing trivial and non-trivial changes, ... @@ -1059,13 +1059,17 @@ cmd_summary() { while read mod_src mod_dst sha1_src sha1_dst status sm_path do # Always show modules deleted or type-changed (blob-module) - test $status = D -o $status = T echo $sm_path continue + case $status in + [DT]) + printf '%s\n' $sm_path + continue + esac turning a echo into a printf is good, but would be better done separately. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter
Kyle J. McKay: I think that a strict reading of RFC 2616 allows text/plain ; charset=utf-8 as well as text/plain;charset=utf-8 and text/plain; charset=utf-8. It does indeed, and I have seen servers send both variants, so they do need to be catered for. The number of servers that would actually send the charset attribute here (for error messages) are probably not that many. It is probably a good idea to make the default user-configurable (I know the specs state that anything undeclared is iso-8859-1, but the real world doesn't agree to that). -- \\// Peter - http://www.softwolves.pp.se/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 9/9] remote-curl: reencode http error messages
Kyle J. McKay: + if (!*charset) + *charset = xstrdup(iso8859-1); Actually the name should be ISO-8859-1. See RFC 2616 section 3.7.1. Since it's case insensitive iso-8859-1 would be fine too. You'd be amazed at what you see in the wild... I'd recommend going with the recommended algorithm from WHATWG's Encoding Standard, if you want to make this robust: http://encoding.spec.whatwg.org/#names-and-labels. The spec is partly based on a lot of research I made in my previous $DAYJOB, with a lot of research added by the spec writer. There is also Unicode's attempt at it, but it does unfortunately produce too many false positives: http://www.unicode.org/reports/tr22/tr22-7.html#Charset_Alias_Matching -- \\// Peter - http://www.softwolves.pp.se/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter
On May 21, 2014, at 23:05, Jeff King wrote: On Wed, May 21, 2014 at 05:07:38PM -0700, Kyle J. McKay wrote: + p = skip_prefix(type-buf, text/plain); + if (!p || (*p *p != ';')) + return 0; + + return 1; +} + I think that a strict reading of RFC 2616 allows text/plain ; charset=utf-8 as well as text/plain;charset=utf-8 and text/plain; charset=utf-8. So perhaps this if line instead: + if (!p || (*p *p != ';' *p != ' ' *p != '\t')) See RFC 2616 section 2.2 for the definition of linear white space (LWS) and the end of section 2.1 for the implied *LWS rule that allows it. You're right. There are a few exceptions in 3.7: Linear white space (LWS) MUST NOT be used between the type and subtype, nor between an attribute and its value. but it does not include between the subtype and parameter. So the find_parameter also needs to accept the collapsed whitespace, too, I guess. Yeah I think so too. It's probably enough though just to just strip all and \t characters at the same time the content type is lowercased. While that would cause invalid content types such as text / plain to be recognized it would keep the rest of the code simpler. Since a producer of an invalid content type shouldn't really be depending on any particular behavior by a receiver of an invalid content type it's probably not an issue. --Kyle -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 10/19] git-submodule.sh: convert test -a/-o to and ||
2014-05-22 8:49 GMT+02:00 Matthieu Moy matthieu@grenoble-inp.fr: Elia Pinto gitter.spi...@gmail.com writes: This is version 2 of the patch to git-submodule of the patch series convert test -a/-o to and ||. It contains the fixes identified by Johannes and Matthieu. This version of the patch (not the whole series) is Reviewed-by: Matthieu Moy matthieu@imag.fr Some comments for next time: * I agree with Johannes that splitting the series with one patch per file is not the right way to split. As a reviewer, I want to - check that -a trivially translates to - check that -o trivially translates to || - check non-trivial cases Interleaving these cases (especially the trivial and non-trivial cases) makes the review much harder. For this kind of series, the change is trivial, but the review is not (Johannes caught a || Vs inversion that I didn't find for example, which is quite serious), so the optimize your patches for review is even more important than usual. * Again, to avoid mixing trivial and non-trivial changes, ... First of all, thank you. I will take note of your valuable suggestions for the future. @@ -1059,13 +1059,17 @@ cmd_summary() { while read mod_src mod_dst sha1_src sha1_dst status sm_path do # Always show modules deleted or type-changed (blob-module) - test $status = D -o $status = T echo $sm_path continue + case $status in + [DT]) + printf '%s\n' $sm_path + continue + esac turning a echo into a printf is good, but would be better done separately. I had thought, but the change was in the fix of Johannes, and it did not think was right to change this, especially that it was right anyway. But I understand very well the observation. Best Regards -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter
On Thu, May 22, 2014 at 12:27:38AM -0700, Kyle J. McKay wrote: Yeah I think so too. It's probably enough though just to just strip all and \t characters at the same time the content type is lowercased. While that would cause invalid content types such as text / plain to be recognized it would keep the rest of the code simpler. Since a producer of an invalid content type shouldn't really be depending on any particular behavior by a receiver of an invalid content type it's probably not an issue. I think you have to retain whitespace between parameters. Not that I expect there to be multiple parameters in a text/plain, but it's allowed. I started doing this path of trying to produce a normalized version, but found that it was just as much code as parsing at all (and it still leaves the calling code to do its own parse). Instead, what I ended up with was just doing the parsing in http.c into nice, normalized chunks, and then the calling code can compare them with simple strcmps. I'll post the patches in a few minutes. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter
On Thu, May 22, 2014 at 08:12:58AM +0100, Peter Krefting wrote: Kyle J. McKay: I think that a strict reading of RFC 2616 allows text/plain ; charset=utf-8 as well as text/plain;charset=utf-8 and text/plain; charset=utf-8. It does indeed, and I have seen servers send both variants, so they do need to be catered for. The number of servers that would actually send the charset attribute here (for error messages) are probably not that many. It is probably a good idea to make the default user-configurable (I know the specs state that anything undeclared is iso-8859-1, but the real world doesn't agree to that). I was really hoping to avoid getting into all of the real-world messiness that a real http client needs to deal with (as opposed to just following the standards). This is only used for an optional relay of error messages from the server. Most servers will send back text/html by default, and only those which specifically configure text/plain will have their messages shown. IOW, I expect this to be configured specifically for git messages, and the server admin can make sure they are following the standard and that it works with git. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 10/19] git-submodule.sh: convert test -a/-o to and ||
Am 5/22/2014 10:38, schrieb Elia Pinto: 2014-05-22 8:49 GMT+02:00 Matthieu Moy matthieu@grenoble-inp.fr: Elia Pinto gitter.spi...@gmail.com writes: @@ -1059,13 +1059,17 @@ cmd_summary() { while read mod_src mod_dst sha1_src sha1_dst status sm_path do # Always show modules deleted or type-changed (blob-module) - test $status = D -o $status = T echo $sm_path continue + case $status in + [DT]) + printf '%s\n' $sm_path + continue + esac turning a echo into a printf is good, but would be better done separately. I had thought, but the change was in the fix of Johannes, and it did not think was right to change this, especially that it was right anyway. But I understand very well the observation. My intent was to show the final state of this piece of code. I do agree with Matthieu that the change from 'echo' to 'printf' is a different topic (in particular, since this is not the only point in the script that would need that change). Sorry for having sent you in circles. -- Hannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/9] handle alternate charsets for remote http errors
On Wed, May 21, 2014 at 06:25:24AM -0400, Jeff King wrote: As of commit 426e70d (remote-curl: show server content on http errors, 2013-04-05), we relay any text/plain errors shown by the remote http server to the user. However, we were lazy back then and left this TODO in place: /* * We only show text/plain parts, as other types are likely * to be ugly to look at on the user's terminal. * * TODO should handle ; charset=XXX, and re-encode into * logoutputencoding */ This series actually implements that, along with a few other cleanups. Here's a second version based on feedback from Kyle and Peter. Thanks both for your comments. It drops the tolower patches, which are not used anymore, and makes the parsing of content-types and their parameters a bit more robust. [1/8]: test-lib: preserve GIT_CURL_VERBOSE from the environment [2/8]: t/lib-httpd: use write_script to copy CGI scripts [3/8]: t5550: test display of remote http error messages These three are the same as before. [4/8]: http: extract type/subtype portion of content-type Make our content-type matching more robust, both for the errors and for matching smart-http types. [5/8]: http: optionally extract charset parameter from content-type Feature work to support 7/8. [6/8]: strbuf: add strbuf_reencode helper Same as before (feature work to support 7/8). [7/8]: remote-curl: reencode http error messages The actual fix. [8/8]: http: default text charset to iso-8859-1 This could be part of 5/8, but I floated it to the top of the heap to make it easier to discuss/adjust it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/8] test-lib: preserve GIT_CURL_VERBOSE from the environment
Turning on this variable can be useful when debugging http tests. It does break a few tests in t5541, but it is not a variable that the user is likely to have enabled accidentally. Signed-off-by: Jeff King p...@peff.net --- t/test-lib.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index c081668..f7e55b1 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -91,6 +91,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $($PERL_PATH -e ' VALGRIND UNZIP PERF_ + CURL_VERBOSE )); my @vars = grep(/^GIT_/ !/^GIT_($ok)/o, @env); print join(\n, @vars); -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/8] t/lib-httpd: use write_script to copy CGI scripts
Using write_script will set our shebang line appropriately with $SHELL_PATH. The script that is there now is quite simple and likely to succeed even with a non-POSIX /bin/sh, but it does not hurt to be defensive. Signed-off-by: Jeff King p...@peff.net --- t/lib-httpd.sh | 6 +- t/lib-httpd/broken-smart-http.sh | 1 - 2 files changed, 5 insertions(+), 2 deletions(-) mode change 100755 = 100644 t/lib-httpd/broken-smart-http.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 252cbf1..8e680ef 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -105,10 +105,14 @@ else Could not identify web server at '$LIB_HTTPD_PATH' fi +install_script () { + write_script $HTTPD_ROOT_PATH/$1 $TEST_PATH/$1 +} + prepare_httpd() { mkdir -p $HTTPD_DOCUMENT_ROOT_PATH cp $TEST_PATH/passwd $HTTPD_ROOT_PATH - cp $TEST_PATH/broken-smart-http.sh $HTTPD_ROOT_PATH + install_script broken-smart-http.sh ln -s $LIB_HTTPD_MODULE_PATH $HTTPD_ROOT_PATH/modules diff --git a/t/lib-httpd/broken-smart-http.sh b/t/lib-httpd/broken-smart-http.sh old mode 100755 new mode 100644 index f7ebfff..82cc610 --- a/t/lib-httpd/broken-smart-http.sh +++ b/t/lib-httpd/broken-smart-http.sh @@ -1,4 +1,3 @@ -#!/bin/sh printf Content-Type: text/%s\n html echo printf %s\n 001e# service=git-upload-pack -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/8] t5550: test display of remote http error messages
Since commit 426e70d (remote-curl: show server content on http errors, 2013-04-05), we relay any text/plain error messages from the remote server to the user. However, we never tested it. Signed-off-by: Jeff King p...@peff.net --- t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf| 4 t/lib-httpd/error.sh | 17 + t/t5550-http-fetch-dumb.sh | 10 ++ 4 files changed, 32 insertions(+) create mode 100755 t/lib-httpd/error.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 8e680ef..f7640be 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -113,6 +113,7 @@ prepare_httpd() { mkdir -p $HTTPD_DOCUMENT_ROOT_PATH cp $TEST_PATH/passwd $HTTPD_ROOT_PATH install_script broken-smart-http.sh + install_script error.sh ln -s $LIB_HTTPD_MODULE_PATH $HTTPD_ROOT_PATH/modules diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 3a03e82..b384d79 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -97,12 +97,16 @@ Alias /auth/dumb/ www/auth/dumb/ /LocationMatch ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1 ScriptAlias /broken_smart/ broken-smart-http.sh/ +ScriptAlias /error/ error.sh/ Directory ${GIT_EXEC_PATH} Options FollowSymlinks /Directory Files broken-smart-http.sh Options ExecCGI /Files +Files error.sh + Options ExecCGI +/Files Files ${GIT_EXEC_PATH}/git-http-backend Options ExecCGI /Files diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh new file mode 100755 index 000..786f281 --- /dev/null +++ b/t/lib-httpd/error.sh @@ -0,0 +1,17 @@ +#!/bin/sh + +printf Status: 500 Intentional Breakage\n + +printf Content-Type: +case $PATH_INFO in +*html*) + printf text/html + ;; +*text*) + printf text/plain + ;; +esac +printf \n + +printf \n +printf this is the error message\n diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index 1a3a2b6..13defd3 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -171,5 +171,15 @@ test_expect_success 'did not use upload-pack service' ' test_cmp exp act ' +test_expect_success 'git client shows text/plain errors' ' + test_must_fail git clone $HTTPD_URL/error/text 2stderr + grep this is the error message stderr +' + +test_expect_success 'git client does not show html errors' ' + test_must_fail git clone $HTTPD_URL/error/html 2stderr + ! grep this is the error message stderr +' + stop_httpd test_done -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/8] http: extract type/subtype portion of content-type
When we get a content-type from curl, we get the whole header line, including any parameters, and without any normalization (like downcasing or whitespace) applied. If we later try to match it with strcmp() or even strcasecmp(), we may get false negatives. This could cause two visible behaviors: 1. We might fail to recognize a smart-http server by its content-type. 2. We might fail to relay text/plain error messages to users (especially if they contain a charset parameter). This patch teaches the http code to extract and normalize just the type/subtype portion of the string. This is technically passing out less information to the callers, who can no longer see the parameters. But none of the current callers cares, and a future patch will add back an easier-to-use method for accessing those parameters. Signed-off-by: Jeff King p...@peff.net --- http.c | 32 +--- remote-curl.c | 2 +- t/lib-httpd/error.sh | 8 +++- t/t5550-http-fetch-dumb.sh | 5 + 4 files changed, 42 insertions(+), 5 deletions(-) diff --git a/http.c b/http.c index 94e1afd..4edf5b9 100644 --- a/http.c +++ b/http.c @@ -906,6 +906,29 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf) return ret; } +/* + * Extract a normalized version of the content type, with any + * spaces suppressed, all letters lowercased, and no trailing ; + * or parameters. + * + * Example: + * TEXT/PLAIN; charset=utf-8 - text/plain + */ +static void extract_content_type(struct strbuf *raw, struct strbuf *type) +{ + const char *p; + + strbuf_reset(type); + strbuf_grow(type, raw-len); + for (p = raw-buf; *p; p++) { + if (isspace(*p)) + continue; + if (*p == ';') + break; + strbuf_addch(type, tolower(*p)); + } +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF0 #define HTTP_REQUEST_FILE 1 @@ -957,9 +980,12 @@ static int http_request(const char *url, ret = run_one_slot(slot, results); - if (options options-content_type) - curlinfo_strbuf(slot-curl, CURLINFO_CONTENT_TYPE, - options-content_type); + if (options options-content_type) { + struct strbuf raw = STRBUF_INIT; + curlinfo_strbuf(slot-curl, CURLINFO_CONTENT_TYPE, raw); + extract_content_type(raw, options-content_type); + strbuf_release(raw); + } if (options options-effective_url) curlinfo_strbuf(slot-curl, CURLINFO_EFFECTIVE_URL, diff --git a/remote-curl.c b/remote-curl.c index 52c2d96..a5ab977 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -205,7 +205,7 @@ static int show_http_message(struct strbuf *type, struct strbuf *msg) * TODO should handle ; charset=XXX, and re-encode into * logoutputencoding */ - if (strcasecmp(type-buf, text/plain)) + if (strcmp(type-buf, text/plain)) return -1; strbuf_trim(msg); diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh index 786f281..23cec97 100755 --- a/t/lib-httpd/error.sh +++ b/t/lib-httpd/error.sh @@ -3,6 +3,7 @@ printf Status: 500 Intentional Breakage\n printf Content-Type: +charset=iso-8859-1 case $PATH_INFO in *html*) printf text/html @@ -10,8 +11,13 @@ case $PATH_INFO in *text*) printf text/plain ;; +*charset*) + printf text/plain; charset=utf-8 + charset=utf-8 + ;; esac printf \n printf \n -printf this is the error message\n +printf this is the error message\n | +iconv -f us-ascii -t $charset diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index 13defd3..b35b261 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -181,5 +181,10 @@ test_expect_success 'git client does not show html errors' ' ! grep this is the error message stderr ' +test_expect_success 'git client shows text/plain with a charset' ' + test_must_fail git clone $HTTPD_URL/error/charset 2stderr + grep this is the error message stderr +' + stop_httpd test_done -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 5/8] http: optionally extract charset parameter from content-type
Since the previous commit, we now give a sanitized, shortened version of the content-type header to any callers who ask for it. This patch adds back a way for them to cleanly access specific parameters to the type. We could easily extract all parameters and make them available via a string_list, but: 1. That complicates the interface and memory management. 2. In practice, no planned callers care about anything except the charset. This patch therefore goes with the simplest thing, and we can expand or change the interface later if it becomes necessary. Signed-off-by: Jeff King p...@peff.net --- http.c | 54 ++ http.h | 7 +++ 2 files changed, 57 insertions(+), 4 deletions(-) diff --git a/http.c b/http.c index 4edf5b9..e26ee8b 100644 --- a/http.c +++ b/http.c @@ -907,14 +907,44 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf) } /* + * Check for and extract a content-type parameter. raw + * should be positioned at the start of the potential + * parameter, with any whitespace already removed. + * + * name is the name of the parameter. The value is appended + * to out. + */ +static int extract_param(const char *raw, const char *name, +struct strbuf *out) +{ + size_t len = strlen(name); + + if (strncasecmp(raw, name, len)) + return -1; + raw += len; + + if (*raw != '=') + return -1; + raw++; + + while (*raw !isspace(*raw)) + strbuf_addch(out, *raw++); + return 0; +} + +/* * Extract a normalized version of the content type, with any * spaces suppressed, all letters lowercased, and no trailing ; * or parameters. * + * If the charset argument is not NULL, store the value of any + * charset parameter there. + * * Example: - * TEXT/PLAIN; charset=utf-8 - text/plain + * TEXT/PLAIN; charset=utf-8 - text/plain, utf-8 */ -static void extract_content_type(struct strbuf *raw, struct strbuf *type) +static void extract_content_type(struct strbuf *raw, struct strbuf *type, +struct strbuf *charset) { const char *p; @@ -923,10 +953,25 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type) for (p = raw-buf; *p; p++) { if (isspace(*p)) continue; - if (*p == ';') + if (*p == ';') { + p++; break; + } strbuf_addch(type, tolower(*p)); } + + if (!charset) + return; + + strbuf_reset(charset); + while (*p) { + while (isspace(*p)) + p++; + if (!extract_param(p, charset, charset)) + return; + while (*p !isspace(*p)) + p++; + } } /* http_request() targets */ @@ -983,7 +1028,8 @@ static int http_request(const char *url, if (options options-content_type) { struct strbuf raw = STRBUF_INIT; curlinfo_strbuf(slot-curl, CURLINFO_CONTENT_TYPE, raw); - extract_content_type(raw, options-content_type); + extract_content_type(raw, options-content_type, +options-charset); strbuf_release(raw); } diff --git a/http.h b/http.h index e64084f..473179b 100644 --- a/http.h +++ b/http.h @@ -144,6 +144,13 @@ struct http_get_options { struct strbuf *content_type; /* +* If non-NULL, and content_type above is non-NULL, returns +* the charset parameter from the content-type. If none is +* present, returns an empty string. +*/ + struct strbuf *charset; + + /* * If non-NULL, returns the URL we ended up at, including any * redirects we followed. */ -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 6/8] strbuf: add strbuf_reencode helper
This is a convenience wrapper around `reencode_string_len` and `strbuf_attach`. Signed-off-by: Jeff King p...@peff.net --- Documentation/technical/api-strbuf.txt | 5 + strbuf.c | 17 + strbuf.h | 1 + 3 files changed, 23 insertions(+) diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt index 3350d97..9d28b03 100644 --- a/Documentation/technical/api-strbuf.txt +++ b/Documentation/technical/api-strbuf.txt @@ -125,6 +125,11 @@ Functions Strip whitespace from the end of a string. +`strbuf_reencode`:: + + Replace the contents of the strbuf with a reencoded form. Returns -1 + on error, 0 on success. + `strbuf_cmp`:: Compare two buffers. Returns an integer less than, equal to, or greater diff --git a/strbuf.c b/strbuf.c index ee96dcf..fc7290f 100644 --- a/strbuf.c +++ b/strbuf.c @@ -1,5 +1,6 @@ #include cache.h #include refs.h +#include utf8.h int starts_with(const char *str, const char *prefix) { @@ -106,6 +107,22 @@ void strbuf_ltrim(struct strbuf *sb) sb-buf[sb-len] = '\0'; } +int strbuf_reencode(struct strbuf *sb, const char *from, const char *to) +{ + char *out; + int len; + + if (same_encoding(from, to)) + return 0; + + out = reencode_string_len(sb-buf, sb-len, to, from, len); + if (!out) + return -1; + + strbuf_attach(sb, out, len, len); + return 0; +} + struct strbuf **strbuf_split_buf(const char *str, size_t slen, int terminator, int max) { diff --git a/strbuf.h b/strbuf.h index 39c14cf..4e9a2f8 100644 --- a/strbuf.h +++ b/strbuf.h @@ -45,6 +45,7 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) extern void strbuf_trim(struct strbuf *); extern void strbuf_rtrim(struct strbuf *); extern void strbuf_ltrim(struct strbuf *); +extern int strbuf_reencode(struct strbuf *sb, const char *from, const char *to); extern int strbuf_cmp(const struct strbuf *, const struct strbuf *); /* -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 7/8] remote-curl: reencode http error messages
We currently recognize an error message with a content-type text/plain; charset=utf-16 as text, but we ignore the charset parameter entirely. Let's encode it to log_output_encoding, which is presumably something the user's terminal can handle. Signed-off-by: Jeff King p...@peff.net --- remote-curl.c | 17 ++--- t/lib-httpd/error.sh | 4 t/t5550-http-fetch-dumb.sh | 5 + 3 files changed, 19 insertions(+), 7 deletions(-) diff --git a/remote-curl.c b/remote-curl.c index a5ab977..4493b38 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -194,19 +194,19 @@ static void free_discovery(struct discovery *d) } } -static int show_http_message(struct strbuf *type, struct strbuf *msg) +static int show_http_message(struct strbuf *type, struct strbuf *charset, +struct strbuf *msg) { const char *p, *eol; /* * We only show text/plain parts, as other types are likely * to be ugly to look at on the user's terminal. -* -* TODO should handle ; charset=XXX, and re-encode into -* logoutputencoding */ if (strcmp(type-buf, text/plain)) return -1; + if (charset-len) + strbuf_reencode(msg, charset-buf, get_log_output_encoding()); strbuf_trim(msg); if (!msg-len) @@ -225,6 +225,7 @@ static struct discovery* discover_refs(const char *service, int for_push) { struct strbuf exp = STRBUF_INIT; struct strbuf type = STRBUF_INIT; + struct strbuf charset = STRBUF_INIT; struct strbuf buffer = STRBUF_INIT; struct strbuf refs_url = STRBUF_INIT; struct strbuf effective_url = STRBUF_INIT; @@ -249,6 +250,7 @@ static struct discovery* discover_refs(const char *service, int for_push) memset(options, 0, sizeof(options)); options.content_type = type; + options.charset = charset; options.effective_url = effective_url; options.base_url = url; options.no_cache = 1; @@ -259,13 +261,13 @@ static struct discovery* discover_refs(const char *service, int for_push) case HTTP_OK: break; case HTTP_MISSING_TARGET: - show_http_message(type, buffer); + show_http_message(type, charset, buffer); die(repository '%s' not found, url.buf); case HTTP_NOAUTH: - show_http_message(type, buffer); + show_http_message(type, charset, buffer); die(Authentication failed for '%s', url.buf); default: - show_http_message(type, buffer); + show_http_message(type, charset, buffer); die(unable to access '%s': %s, url.buf, curl_errorstr); } @@ -310,6 +312,7 @@ static struct discovery* discover_refs(const char *service, int for_push) strbuf_release(refs_url); strbuf_release(exp); strbuf_release(type); + strbuf_release(charset); strbuf_release(effective_url); strbuf_release(buffer); last_discovery = last; diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh index 23cec97..eafc9d2 100755 --- a/t/lib-httpd/error.sh +++ b/t/lib-httpd/error.sh @@ -15,6 +15,10 @@ case $PATH_INFO in printf text/plain; charset=utf-8 charset=utf-8 ;; +*utf16*) + printf text/plain; charset=utf-16 + charset=utf-16 + ;; esac printf \n diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index b35b261..01b8aae 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -186,5 +186,10 @@ test_expect_success 'git client shows text/plain with a charset' ' grep this is the error message stderr ' +test_expect_success 'http error messages are reencoded' ' + test_must_fail git clone $HTTPD_URL/error/utf16 2stderr + grep this is the error message stderr +' + stop_httpd test_done -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 8/8] http: default text charset to iso-8859-1
This is specified by RFC 2616 as the default if no charset parameter is given. Signed-off-by: Jeff King p...@peff.net --- I'd prefer to do this simple, standard thing, and see how it works in the real world. We'll hand whatever we get off to iconv, and if it chokes, we'll pass through the data as-is. That should be enough for most ascii messages to make it through readable, even if we get the encoding wrong. If we do want to do magic like latin1 is really iso-8859-1, that seems like the domain of iconv to me. If iconv doesn't handle it itself, I'd rather have a wrapper there. Putting it at that layer keeps the code cleaner, and it means the wrapper would benefit the regular commit-log reencoding code. If anybody wants to go further in that direction, be my guest, but please make your suggestions in the form of patches which apply on top. :) http.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/http.c b/http.c index e26ee8b..a37e84e 100644 --- a/http.c +++ b/http.c @@ -972,6 +972,9 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, while (*p !isspace(*p)) p++; } + + if (!charset-len starts_with(type-buf, text/)) + strbuf_addstr(charset, ISO-8859-1); } /* http_request() targets */ -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] tolower cleanups
These two patches were pulled from the http charset series I posted nearby. The second iteration of that series did not need them, but they may have value as cleanups. [1/2]: daemon/config: factor out duplicate xstrdup_tolower [2/2]: strbuf: add strbuf_tolower function The first one is a real reduction of code. The second is just making code public that _might_ get used later, so it's a bit less exciting. Both are independent of each other, so we can take or leave them separately. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] daemon/config: factor out duplicate xstrdup_tolower
We have two implementations of the same function; let's drop that to one. We take the name from daemon.c, but the implementation (which is just slightly more efficient) from the config code. Signed-off-by: Jeff King p...@peff.net --- builtin/config.c | 15 +-- daemon.c | 8 strbuf.c | 13 + strbuf.h | 2 ++ 4 files changed, 16 insertions(+), 22 deletions(-) diff --git a/builtin/config.c b/builtin/config.c index 5677c94..fcd8474 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -395,19 +395,6 @@ static int urlmatch_collect_fn(const char *var, const char *value, void *cb) return 0; } -static char *dup_downcase(const char *string) -{ - char *result; - size_t len, i; - - len = strlen(string); - result = xmalloc(len + 1); - for (i = 0; i len; i++) - result[i] = tolower(string[i]); - result[i] = '\0'; - return result; -} - static int get_urlmatch(const char *var, const char *url) { char *section_tail; @@ -422,7 +409,7 @@ static int get_urlmatch(const char *var, const char *url) if (!url_normalize(url, config.url)) die(%s, config.url.err); - config.section = dup_downcase(var); + config.section = xstrdup_tolower(var); section_tail = strchr(config.section, '.'); if (section_tail) { *section_tail = '\0'; diff --git a/daemon.c b/daemon.c index eba1255..f9c63e9 100644 --- a/daemon.c +++ b/daemon.c @@ -475,14 +475,6 @@ static void make_service_overridable(const char *name, int ena) die(No such service %s, name); } -static char *xstrdup_tolower(const char *str) -{ - char *p, *dup = xstrdup(str); - for (p = dup; *p; p++) - *p = tolower(*p); - return dup; -} - static void parse_host_and_port(char *hostport, char **host, char **port) { diff --git a/strbuf.c b/strbuf.c index ee96dcf..854c725 100644 --- a/strbuf.c +++ b/strbuf.c @@ -570,3 +570,16 @@ int fprintf_ln(FILE *fp, const char *fmt, ...) return -1; return ret + 1; } + +char *xstrdup_tolower(const char *string) +{ + char *result; + size_t len, i; + + len = strlen(string); + result = xmalloc(len + 1); + for (i = 0; i len; i++) + result[i] = tolower(string[i]); + result[i] = '\0'; + return result; +} diff --git a/strbuf.h b/strbuf.h index 39c14cf..4de7531 100644 --- a/strbuf.h +++ b/strbuf.h @@ -183,4 +183,6 @@ extern int printf_ln(const char *fmt, ...); __attribute__((format (printf,2,3))) extern int fprintf_ln(FILE *fp, const char *fmt, ...); +char *xstrdup_tolower(const char *); + #endif /* STRBUF_H */ -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] strbuf: add strbuf_tolower function
This makes config's lowercase() function public. Note that we could continue to offer a pure-string lowercase, but there would be no callers (in most pure-string cases, we actually duplicate and lowercase the duplicate). Signed-off-by: Jeff King p...@peff.net --- Documentation/technical/api-strbuf.txt | 4 config.c | 8 +--- strbuf.c | 7 +++ strbuf.h | 1 + 4 files changed, 13 insertions(+), 7 deletions(-) diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt index 3350d97..8480f89 100644 --- a/Documentation/technical/api-strbuf.txt +++ b/Documentation/technical/api-strbuf.txt @@ -125,6 +125,10 @@ Functions Strip whitespace from the end of a string. +`strbuf_tolower`:: + + Lowercase each character in the buffer using `tolower`. + `strbuf_cmp`:: Compare two buffers. Returns an integer less than, equal to, or greater diff --git a/config.c b/config.c index a30cb5c..03ce5c6 100644 --- a/config.c +++ b/config.c @@ -147,12 +147,6 @@ int git_config_include(const char *var, const char *value, void *data) return ret; } -static void lowercase(char *p) -{ - for (; *p; p++) - *p = tolower(*p); -} - void git_config_push_parameter(const char *text) { struct strbuf env = STRBUF_INIT; @@ -180,7 +174,7 @@ int git_config_parse_parameter(const char *text, strbuf_list_free(pair); return error(bogus config parameter: %s, text); } - lowercase(pair[0]-buf); + strbuf_tolower(pair[0]); if (fn(pair[0]-buf, pair[1] ? pair[1]-buf : NULL, data) 0) { strbuf_list_free(pair); return -1; diff --git a/strbuf.c b/strbuf.c index 854c725..3da4f3e 100644 --- a/strbuf.c +++ b/strbuf.c @@ -106,6 +106,13 @@ void strbuf_ltrim(struct strbuf *sb) sb-buf[sb-len] = '\0'; } +void strbuf_tolower(struct strbuf *sb) +{ + char *p; + for (p = sb-buf; *p; p++) + *p = tolower(*p); +} + struct strbuf **strbuf_split_buf(const char *str, size_t slen, int terminator, int max) { diff --git a/strbuf.h b/strbuf.h index 4de7531..25328b9 100644 --- a/strbuf.h +++ b/strbuf.h @@ -45,6 +45,7 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) extern void strbuf_trim(struct strbuf *); extern void strbuf_rtrim(struct strbuf *); extern void strbuf_ltrim(struct strbuf *); +extern void strbuf_tolower(struct strbuf *sb); extern int strbuf_cmp(const struct strbuf *, const struct strbuf *); /* -- 2.0.0.rc1.436.g03cb729 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH v4 3/3] add command performance tracing to debug scripted commands
On Thu, May 22, 2014 at 02:40:48AM +0200, Karsten Blees wrote: E.g. if I'm interested in a particular code section, I throw in 2 lines of code (before and after the code section). This gives very accurate results, without significantly affecting overall performance. I can then push the changes to my Linux/Windows box and get comparable results there. No need to disable optimization. No worries that the profiling tool isn't available on the other platform. No analyzing megabytes of mostly irrelevant profiling data. Does that make sense? Ah, I see. I misunderstood from your example above. I do agree that automatically stamping with __FILE__ and __LINE__ is very helpful there. Could we maybe restrict that use of the variadic macros to a few known-good compilers (maybe #ifdef __GNUC__, which also hits clang, and something to catch MSVC)? On other systems it would become a compile-time noop, and they could live without the feature. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter
Jeff King: I was really hoping to avoid getting into all of the real-world messiness that a real http client needs to deal with (as opposed to just following the standards). Yeah, I agree, you're probably fine without all this detail in over 99% of the cases where this code would ever be exposed. I'm a bit damaged after 10+ years as a web browser developer, responsible for exactly this kind of stuff... :-) -- \\// Peter - http://www.softwolves.pp.se/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Get rid of the non portable shell export VAR=VALUE costruct
Found by check-non-portable-shell.pl Signed-off-by: Elia Pinto gitter.spi...@gmail.com --- contrib/subtree/t/t7900-subtree.sh |2 +- git-remote-testgit.sh |2 +- git-stash.sh |2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh index 66ce4b0..c1d0b23 100755 --- a/contrib/subtree/t/t7900-subtree.sh +++ b/contrib/subtree/t/t7900-subtree.sh @@ -8,7 +8,7 @@ This test verifies the basic operation of the merge, pull, add and split subcommands of git subtree. ' -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY . ../../../t/test-lib.sh diff --git a/git-remote-testgit.sh b/git-remote-testgit.sh index 1c006a0..9d1ce76 100755 --- a/git-remote-testgit.sh +++ b/git-remote-testgit.sh @@ -13,7 +13,7 @@ refspec=${GIT_REMOTE_TESTGIT_REFSPEC-$default_refspec} test -z $refspec prefix=refs -export GIT_DIR=$url/.git +GIT_DIR=$url/.git export GIT_DIR force= diff --git a/git-stash.sh b/git-stash.sh index 4798bcf..0bb1af8 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -94,7 +94,7 @@ create_stash () { # ease of unpacking later. u_commit=$( untracked_files | ( - export GIT_INDEX_FILE=$TMPindex + GIT_INDEX_FILE=$TMPindex export GIT_INDEX_FILE rm -f $TMPindex git update-index -z --add --remove --stdin u_tree=$(git write-tree) -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct
On 2014-05-22 14.48, Elia Pinto wrote: Found by check-non-portable-shell.pl Thanks for picking this up -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY Minor remark: Both commands should go on their own line, like this: TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY And, unrelated to this patch, there seem to be a lot of missing in git-remote-testgit.sh. But this should be a seperate patch, I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct
Torsten Bögershausen tbo...@web.de writes: On 2014-05-22 14.48, Elia Pinto wrote: Found by check-non-portable-shell.pl Thanks for picking this up -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY Minor remark: Both commands should go on their own line, like this: TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY And, unrelated to this patch, there seem to be a lot of missing in git-remote-testgit.sh. I have a hard time taking the above seriously. pwd is a shell builtin (when we are not talking about Version 3 UNIX or something) that can hardly fail. And when your shell does not support assignment to a shell variable, you'll have a hard time getting the shell script to run. That's stuff of the if (1+1 != 2) { fputs(Warning: your CPU may be broken, stderr); } variety. If you have to check for that, you have bigger problems... -- David Kastrup -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: add strbuf_tolower function
[re-adding list cc] On Thu, May 22, 2014 at 03:16:45PM +0200, Christian Couder wrote: +void strbuf_tolower(struct strbuf *sb) +{ + char *p; + for (p = sb-buf; *p; p++) + *p = tolower(*p); +} Last time I tried a change like the above, I was told that strbufs are buffers that can contain NUL bytes. So maybe it should be: for (p = sb-buf; p sb-buf + sb-len; p++) *p = tolower(*p); Hah. I wrote it like that originally, and in review was asked to switch it. I agree that it might be worth keeping strbuf functions 8bit-clean. The original intent of the strbuf code was to make C strings easier to use, but I think we sometimes use them as general buffers, too. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct
5-22 14.48, Elia Pinto wrote: Found by check-non-portable-shell.pl Thanks for picking this up -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY Minor remark: Both commands should go on their own line, like this: TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY Apparently they are both common idioms and I find no indication in the CodingGuideline. I have no problems rerolling this simple patch, but i need to know what is the (git) right style in this case. Thanks -- cd ./git-core grep .*export *.sh git-am.sh: GIT_MERGE_VERBOSITY=0 export GIT_MERGE_VERBOSITY git-filter-branch.sh: echo case \\$GIT_$1_NAME\ in \\) GIT_$1_NAME=\\${GIT_$1_EMAIL%%@*}\ export GIT_$1_NAME;; esac git-filter-branch.sh: GIT_DIR=$ORIG_GIT_DIR export GIT_DIR git-rebase--merge.sh: GIT_MERGE_VERBOSITY=1 export GIT_MERGE_VERBOSITY git-remote-testgit.sh:GIT_DIR=$url/.git export GIT_DIR git-stash.sh: GIT_INDEX_FILE=$TMPindex export GIT_INDEX_FILE git-stash.sh: GIT_MERGE_VERBOSITY=0 export GIT_MERGE_VERBOSITY grep 'export.*' *.sh git-stash.sh: GIT_INDEX_FILE=$TMPindex export GIT_INDEX_FILE git-stash.sh: export GIT_INDEX_FILE And, unrelated to this patch, there seem to be a lot of missing in git-remote-testgit.sh. But this should be a seperate patch, I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] completion: add a note that merge options are shared
This should avoid future confusion after a subsequent patch has added some options to __git_merge_options and some directly in _git_merge(). Signed-off-by: John Keeping j...@keeping.me.uk --- contrib/completion/git-completion.bash | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 2c59a76..ff97c20 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1472,6 +1472,7 @@ _git_log () __git_complete_revlist } +# Common merge options shared by git-merge(1) and git-pull(1). __git_merge_options= --no-commit --no-stat --log --no-log --squash --strategy --commit --stat --no-squash --ff --no-ff --ff-only --edit --no-edit -- 2.0.0.rc2.4.g1dc51c6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct
Am 5/22/2014 15:19, schrieb David Kastrup: Torsten Bögershausen tbo...@web.de writes: On 2014-05-22 14.48, Elia Pinto wrote: Found by check-non-portable-shell.pl Thanks for picking this up -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY Minor remark: Both commands should go on their own line, like this: TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY And, unrelated to this patch, there seem to be a lot of missing in git-remote-testgit.sh. I have a hard time taking the above seriously. pwd is a shell builtin (when we are not talking about Version 3 UNIX or something) that can hardly fail. And when your shell does not support assignment to a shell variable, you'll have a hard time getting the shell script to run. The after an assignment makes a big difference when the assignment is part of an chain. This is *very* common in our test suite, as you know. People tend to copy-and-paste. And then it is better to provide a more universally applicable precedent. -- Hannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct
Johannes Sixt j.s...@viscovery.net writes: Am 5/22/2014 15:19, schrieb David Kastrup: Torsten Bögershausen tbo...@web.de writes: On 2014-05-22 14.48, Elia Pinto wrote: Found by check-non-portable-shell.pl Thanks for picking this up -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY Minor remark: Both commands should go on their own line, like this: TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY And, unrelated to this patch, there seem to be a lot of missing in git-remote-testgit.sh. I have a hard time taking the above seriously. pwd is a shell builtin (when we are not talking about Version 3 UNIX or something) that can hardly fail. And when your shell does not support assignment to a shell variable, you'll have a hard time getting the shell script to run. The after an assignment makes a big difference when the assignment is part of an chain. This is *very* common in our test suite, as you know. People tend to copy-and-paste. And then it is better to provide a more universally applicable precedent. Copy-and-paste will not magically add the second that would be required for that usage, and the one in the line above might mislead you into thinking that the problem has been dealt with already. So I'm not convinced that using outside of a preexisting chain makes any sense in this context. -- David Kastrup -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 35/44] refs.c: make delete_ref use a transaction
On Wed, May 21, 2014 at 4:22 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: --- a/refs.c +++ b/refs.c [...] @@ -2515,24 +2510,18 @@ static int delete_ref_loose(struct ref_lock *lock, int flag, struct strbuf *err) int delete_ref(const char *refname, const unsigned char *sha1, int delopt) { - struct ref_lock *lock; - int ret = 0, flag = 0; + struct ref_transaction *transaction; [...] - lock = lock_ref_sha1_basic(refname, sha1, delopt, flag); + if (!transaction || + ref_transaction_delete(transaction, refname, sha1, delopt, +sha1 !is_null_sha1(sha1), err) || What should happen when is_null_sha1(sha1)? In that case the caller has asked us to check that the ref doesn't exist before deleting it, which doesn't make a lot of sense, but the old code did exactly that if I read it correctly. The new code seems to disable verification instead. Do we know that no callers call delete_ref with such an argument? Would it make sense to just die with a BUG: message to make the contract more clear? There are no callers that pass in null_sha1 explicitely and all tests pass. I have changed it to a die(BUG... to make it more explicit as you suggested. [...] - unlink_or_warn(git_path(logs/%s, lock-ref_name)); When does the reflog get deleted in the new code? It is deleted towards the end of ref_transaction_commit, after the ref itself has been deleted. Thanks! ronnie sahlberg -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 36/44] refs.c: pass the ref log message to _create/delete/update instead of _commit
On Wed, May 21, 2014 at 4:47 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: Change the reference transactions so that we pass the reflog message through to the create/delete/update function instead of the commit message. Nice. [...] --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -673,7 +673,6 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, } } } - if (rc STORE_REF_ERROR_DF_CONFLICT) error(_(some local refs could not be updated; try running\n 'git remote prune %s' to remove any old, conflicting Stray whitespace change? Fixed. [...] --- a/refs.c +++ b/refs.c [...] @@ -3264,6 +3264,7 @@ struct ref_update { int have_old; /* 1 if old_sha1 is valid, 0 otherwise */ struct ref_lock *lock; int type; + const char *msg; const char refname[FLEX_ARRAY]; Should be 'char *msg' since we own the memory (or perhaps a strbuf). I changed it to char *msg, but it still might/should be const. We own the memory but we are not supposed to modify the content (after we have copied what the caller gave us). Yes, we should change it into a strbuf at some stage. [...] @@ -3297,9 +3298,10 @@ void ref_transaction_free(struct ref_transaction *transaction) if (!transaction) return; - for (i = 0; i transaction-nr; i++) + for (i = 0; i transaction-nr; i++) { + free((char *)transaction-updates[i]-msg); free(transaction-updates[i]); Whitespace? Fixed. No need to cast. Done. Thanks! ronnie sahlberg -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 38/44] refs.c: pack all refs before we start to rename a ref
On Wed, May 21, 2014 at 4:57 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: This means that most loose refs will no longer be present after the rename Is this to handle the git branch -m foo/bar foo case or for some other purpose? Yes. That is the main reason. [...] --- a/t/t3200-branch.sh +++ b/t/t3200-branch.sh @@ -289,7 +289,7 @@ test_expect_success 'renaming a symref is not allowed' ' git symbolic-ref refs/heads/master2 refs/heads/master test_must_fail git branch -m master2 master3 git symbolic-ref refs/heads/master2 - test_path_is_file .git/refs/heads/master + test_path_is_missing .git/refs/heads/master test_path_is_missing .git/refs/heads/master3 It's kind of silly that this test is mucking about in the .git directory at all. Shouldn't the check be something like git rev-parse --verify refs/heads/master test_must_fail git symbolic-ref refs/heads/master3 test_must_fail git rev-parse refs/heads/master3 ? Thanks. I updated the test with your change. There is a whole bunch of tests that are like and access files directly.. Testing if a path exists or not, or checking that a reflog file contains x number of lines. etc. All these tests will be updated to not access the files directly once I start mucking around with a TDB based refs backend. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git reset for index restoration?
If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Before I try to dig in to why this should be so, does anyone happen to know off the top of their head? Does this constitute a bug in git, or a bug in my understanding of git? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 35/44] refs.c: make delete_ref use a transaction
On Thu, May 22, 2014 at 8:32 AM, Ronnie Sahlberg sahlb...@google.com wrote: On Wed, May 21, 2014 at 4:22 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: --- a/refs.c +++ b/refs.c [...] @@ -2515,24 +2510,18 @@ static int delete_ref_loose(struct ref_lock *lock, int flag, struct strbuf *err) int delete_ref(const char *refname, const unsigned char *sha1, int delopt) { - struct ref_lock *lock; - int ret = 0, flag = 0; + struct ref_transaction *transaction; [...] - lock = lock_ref_sha1_basic(refname, sha1, delopt, flag); + if (!transaction || + ref_transaction_delete(transaction, refname, sha1, delopt, +sha1 !is_null_sha1(sha1), err) || What should happen when is_null_sha1(sha1)? In that case the caller has asked us to check that the ref doesn't exist before deleting it, which doesn't make a lot of sense, but the old code did exactly that if I read it correctly. The new code seems to disable verification instead. Do we know that no callers call delete_ref with such an argument? Would it make sense to just die with a BUG: message to make the contract more clear? There are no callers that pass in null_sha1 explicitely and all tests pass. I have changed it to a die(BUG... to make it more explicit as you suggested. Strike that. While there are no callers I can see that deliberately send null_sha1 it can happen because resolve_ref_unsafe(reading=0) calls handle_missing_loose_ref which will clear sha1 if the ref is missing. This can happen for either symrefs or refs that are soft links that point to a nonexisting ref. [...] - unlink_or_warn(git_path(logs/%s, lock-ref_name)); When does the reflog get deleted in the new code? It is deleted towards the end of ref_transaction_commit, after the ref itself has been deleted. Thanks! ronnie sahlberg -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] read-cache: inform the daemon that the index has been updated
On Tue, 2014-05-13 at 18:15 +0700, Nguyễn Thái Ngọc Duy wrote: + if (run_command(cp)) + warning(_(failed to start read-cache--daemon: %s), + strerror(errno)); errno is not always (ever?) set, so if read-cache--daemon is missing, you get: warning: failed to start read-cache--daemon: Success -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote: If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Are you sure it's reading a packfile? I would expect that it is rather reading the files in the working tree. One of the functions of the index is to maintain a cache of the sha1 of the working tree file content and the stat information. Once that is populated, a subsequent git status can then just lstat() the files to see that its sha1 cache is up to date. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, May 22, 2014 at 9:22 AM, David Turner dtur...@twopensource.com wrote: If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Before I try to dig in to why this should be so, does anyone happen to know off the top of their head? Does this constitute a bug in git, or a bug in my understanding of git? It's not a bug. The index has additional stat-info it tracks about files -- inode number, mtime, etc. -- so that it can quickly check whether files are up to date without comparing full file contents in the working copy to the relevant version from .git/objects. 'git reset' means make the index match the commit in HEAD. It implies nothing about the working copy files, which could be different. Although you say that you have a clean working tree, git doesn't check to verify that, so it has to treat these files as stat-dirty until the next operation (e.g. git status) fills these in -- an operation that involves full comparison of the files in the working copy with the relevant version of the file from under .git/objects. (You may find 'git update-index --refresh' helpful here.) git reset --hard means not only make the index match the commit in HEAD, but change files in the working copy to match as well. In such a case, git will know that the index matches the working copy (since it is enforcing it), so it can update all the stat-info in the index and future git-status operations will be fast. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 32/44] refs.c: remove the update_ref_write function
On Wed, May 21, 2014 at 3:07 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: +++ b/refs.c [...] @@ -3518,14 +3499,16 @@ int ref_transaction_commit(struct ref_transaction *transaction, struct ref_update *update = updates[i]; if (!is_null_sha1(update-new_sha1)) { - ret = update_ref_write(msg, -update-refname, -update-new_sha1, -update-lock, err, -UPDATE_REFS_QUIET_ON_ERR); + ret = write_ref_sha1(update-lock, update-new_sha1, + msg); This changes the return value on error from 1 to -1. That seems like a good change. It's probably worth mentioning in the commit message. Done. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 34/44] refs.c: make prune_ref use a transaction to delete the ref
Added a comment that any flags =0x100 are reserved for internal use. On Wed, May 21, 2014 at 4:01 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: Change prune_ref to delete the ref using a ref transaction. To do this we also need to add a new flag REF_ISPRUNING that will tell the transaction that we do not want to delete this ref from the packed refs. Interesting. Since the flag is per ref update, it even would allow deleting some refs and pruning others in the same transaction. Makes sense. Looks like this doesn't batch up multiple ref-prunings into a single transaction. Makes sense (it would just make the period while refs are locked longer without having any real benefit). [...] +#define REF_ISPRUNING0x0100 Can this conflict with bit values declared elsewhere some day? It would be more comfortable if refs.h also had a note about bits = 0x100 being reserved for private use. The rest of the patch looks good. Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 30/44] refs.c: add transaction.status and track OPEN/CLOSED/ERROR
On Wed, May 21, 2014 at 3:22 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: Please pull my ref-transactions branch. I'm at bd5736cb (2014-05-21 13:46) now. On Wed, May 21, 2014 at 3:00 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: --- a/refs.c +++ b/refs.c @@ -3308,6 +3308,12 @@ struct ref_update { const char refname[FLEX_ARRAY]; }; +enum ref_transaction_status { + REF_TRANSACTION_OPEN = 0, + REF_TRANSACTION_CLOSED = 1, + REF_TRANSACTION_ERROR = 2, What is the difference between _TRANSACTION_CLOSED and _TRANSACTION_ERROR? Closed is a transaction that has been committed successfully, and which we can not do any more updates onto. Error is a transaction that has failed, and which we can not do any more updates onto. That means that both mean the caller cannot do any more updates, right? Why not call them both _CLOSED? The distinction could be useful if in the future we add support to reuse a transaction If the distinction becomes useful in the future, wouldn't that be the moment to add a new state? I don't mean that there has to be a big useful distinction. E.g., maybe the idea is that when using a debugger it can be useful to see the difference between _ERROR and _CLOSED or something, and I think that would be fine as long as the intended meaning is documented (e.g., using comments). My only complaint is that it's hard to understand the code and keep track of which status should be used in a given place unless there is some distinction between them. I have documented the transaction states in refs.c Thanks. ronnie sahlberg -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 39/44] refs.c: move the check for valid refname to lock_ref_sha1_basic
On Wed, May 21, 2014 at 6:42 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: --- a/refs.c +++ b/refs.c @@ -2044,6 +2044,9 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, int missing = 0; int attempts_remaining = 3; + if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) + return NULL; Are we sure that no callers are locking a poorly named ref as preparation for repairing it? To see what works already in that vein, I tried: $ git rev-parse HEAD .git/refs/heads/foo..bar $ git branch -m foo..bar something-saner fatal: Invalid branch name: 'foo..bar' git branch -m has an explicit codepath (recovery = 1;) to handle this case, but it looks like it was not well tested and regressed in v1.7.8-rc0~19^2~7 (resolve_ref(): verify that the input refname has the right format, 2011-09-15). Is what the recovery codepath of branch -m does misguided? One school of thought would be that people with malformed refs are responsible for recovering using low-level tools like mv and vi instead of normal git commands since normal git commands should never have created such a bad situation. Another school of thought would assert that * git commands can have bugs * the format checked by check_refname_format() keeps getting stricter over time which means it's nice when people can recover with 'update-ref -d' or 'branch -m'. It's not obvious to me what the right thing to do here is (maybe a special flag to be attached to a ref update during recovery?). I don't think we should allow external caller any other way to access/modify refs than through the transaction api. This may make it awkward to handle cases such as foo/../bar was created and how would you now delete it ? The exception to this I think would be 'git fsck'. We could allow fsck to have low level access to the backing store and allow it to access the files directly, or allow fsck to set magic flags that disable various checks. Hope that helps, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 39/44] refs.c: move the check for valid refname to lock_ref_sha1_basic
Ronnie Sahlberg wrote: On Wed, May 21, 2014 at 6:42 PM, Jonathan Nieder jrnie...@gmail.com wrote: $ git rev-parse HEAD .git/refs/heads/foo..bar $ git branch -m foo..bar something-saner fatal: Invalid branch name: 'foo..bar' git branch -m has an explicit codepath (recovery = 1;) to handle this case, but it looks like it was not well tested and regressed in v1.7.8-rc0~19^2~7 (resolve_ref(): verify that the input refname has the right format, 2011-09-15). Is what the recovery codepath of branch -m does misguided? [...] I don't think we should allow external caller any other way to access/modify refs than through the transaction api. This may make it awkward to handle cases such as foo/../bar was created and how would you now delete it ? The exception to this I think would be 'git fsck'. We could allow fsck to have low level access to the backing store and allow it to access the files directly, or allow fsck to set magic flags that disable various checks. Interesting. Do you mean that 'git fsck' should notice invalid refnames and have an option to repair them on its own? That would be a big change from what git fsck currently does --- it's currently read-only (except for fsck --lost-found, which is read-only except in the .git/lost-found/ directory). If there's an option to check refnames only and not have to wait for the full check of all objects then I think it makes sense. In the meantime, 'git branch -m' and 'git update-ref -d' (which were the historical ways to do this kind of repair) are already broken, so at least this patch doesn't seem to make anything worse. :/ (I haven't checked all callers, though.) Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 40/44] refs.c: call lock_ref_sha1_basic directly from commit
Ronnie Sahlberg wrote: Signed-off-by: Ronnie Sahlberg sahlb...@google.com --- refs.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) Reviewed-by: Jonathan Nieder jrnie...@gmail.com -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 39/44] refs.c: move the check for valid refname to lock_ref_sha1_basic
On Thu, May 22, 2014 at 10:44 AM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: On Wed, May 21, 2014 at 6:42 PM, Jonathan Nieder jrnie...@gmail.com wrote: $ git rev-parse HEAD .git/refs/heads/foo..bar $ git branch -m foo..bar something-saner fatal: Invalid branch name: 'foo..bar' git branch -m has an explicit codepath (recovery = 1;) to handle this case, but it looks like it was not well tested and regressed in v1.7.8-rc0~19^2~7 (resolve_ref(): verify that the input refname has the right format, 2011-09-15). Is what the recovery codepath of branch -m does misguided? [...] I don't think we should allow external caller any other way to access/modify refs than through the transaction api. This may make it awkward to handle cases such as foo/../bar was created and how would you now delete it ? The exception to this I think would be 'git fsck'. We could allow fsck to have low level access to the backing store and allow it to access the files directly, or allow fsck to set magic flags that disable various checks. Interesting. Do you mean that 'git fsck' should notice invalid refnames and have an option to repair them on its own? I think that would be useful. But it would prompt the user before it does any changes. Even for a fsck it would be rude to do any changes without asking the user. '.git/refs/heads/foo^G^G^G.bar' it not a valid ref. [s kip] [d elete] [r ename] ? That would be a big change from what git fsck currently does --- it's currently read-only (except for fsck --lost-found, which is read-only except in the .git/lost-found/ directory). If there's an option to check refnames only and not have to wait for the full check of all objects then I think it makes sense. In the meantime, 'git branch -m' and 'git update-ref -d' (which were the historical ways to do this kind of repair) are already broken, so at least this patch doesn't seem to make anything worse. :/ (I haven't checked all callers, though.) Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 38/44] refs.c: pack all refs before we start to rename a ref
On Thu, May 22, 2014 at 10:51 AM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: On Wed, May 21, 2014 at 4:57 PM, Jonathan Nieder jrnie...@gmail.com wrote: Ronnie Sahlberg wrote: This means that most loose refs will no longer be present after the rename Is this to handle the git branch -m foo/bar foo case or for some other purpose? Yes. That is the main reason. Thanks. Please say so in the log message. Remember that people of the future debugging and trying to figure out how to fix the latest regression without breaking something else and whether to just revert your patch cannot read your mind. ;-) Done. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote: On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote: If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Are you sure it's reading a packfile? Well, it's calling inflate(), and strace says it is reading e.g. .git/objects/pack/pack-{idx,pack}. So, I would say so. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, 2014-05-22 at 09:46 -0700, Elijah Newren wrote: On Thu, May 22, 2014 at 9:22 AM, David Turner dtur...@twopensource.com wrote: If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Before I try to dig in to why this should be so, does anyone happen to know off the top of their head? Does this constitute a bug in git, or a bug in my understanding of git? It's not a bug. The index has additional stat-info it tracks about files -- inode number, mtime, etc. -- so that it can quickly check whether files are up to date without comparing full file contents in the working copy to the relevant version from .git/objects. 'git reset' means make the index match the commit in HEAD. Sure, so why does git status the need to read the pack file, given that we already know that the index matches HEAD? It implies nothing about the working copy files, which could be different. Although you say that you have a clean working tree, git doesn't check to verify that, so it has to treat these files as stat-dirty until the next operation (e.g. git status) fills these in -- an operation that involves full comparison of the files in the working copy with the relevant version of the file from under .git/objects. (You may find 'git update-index --refresh' helpful here.) In fact, git status does not write the index (at least in this context). And what is slow is not (only) checking over the working tree, but reading the packs. There should be no need to read files from the ODB at all, since the index knows those objects' SHA1s, and it can check those against the working tree's SHA1s. Interestingly, if strace is to be believed, git status doesn't even do check the working trees SHA1s after a git reset; maybe reset already sets the stat bits? So that's why I'm confused. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only
Hi, Ronnie Sahlberg wrote: Add a new flag REF_ISPACKONLY that we can use in ref_transaction_delete. This flag indicates that the ref does not exist as a loose ref andf only as a packed ref. If this is the case we then change the commit code so that we skip taking out a lock file and we skip calling delete_ref_loose. This seems wrong. Can't someone else create a loose ref which will shadow the packed ref and break the serializability of updates to this ref? The above doesn't explain why we want to avoid locking the loose ref, but I assume it's for the sake of the git branch -m foo/bar foo case. For that case, wouldn't the following sequence of filesystem operations work? - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - if $GIT_DIR/refs/heads/foo/bar exists, add the ref to packed-refs (using the usual lockfile-update mechanism) and unlink $GIT_DIR/refs/heads/foo/bar - verify that current foo and foo/bar state are okay. If not, roll back. - unlink $GIT_DIR/refs/heads/foo/bar.lock - rmdir $GIT_DIR/refs/heads/foo - rename $GIT_DIR/refs/heads/foo.lock into place Or: - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - verify state of all relevant refs - update packed-refs to remove refs/heads/foo/bar and add refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar - unlink $GIT_DIR/refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar.lock - unlink $GIT_DIR/refs/heads/foo.lock Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, May 22, 2014 at 02:08:16PM -0400, David Turner wrote: On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote: On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote: If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Are you sure it's reading a packfile? Well, it's calling inflate(), and strace says it is reading e.g. .git/objects/pack/pack-{idx,pack}. So, I would say so. That seems odd that we would be spending extra time there. We do inflate() the trees in order to diff the index against HEAD, but we shouldn't need to inflate any blobs. Here it is for me (on linux.git): [before, warm cache] $ time perf record -q git status /dev/null real0m0.192s user0m0.080s sys 0m0.108s $ perf report | grep -v '#' | head -5 7.46% git [kernel.kallsyms] [k] __d_lookup_rcu 4.55% git libz.so.1.2.8 [.] inflate 3.53% git libc-2.18.so[.] __memcmp_sse4_1 3.46% git [kernel.kallsyms] [k] security_inode_getattr 3.29% git git [.] memihash $ time git reset real0m0.080s user0m0.036s sys 0m0.040s So status is pretty quick, and the time is going to lstat in the kernel, and some tree inflation. Reset is fast, because it has nothing much to do. Now let's kill off the index's stat cache: $ rm .git/index $ time perf record -q git reset real0m0.967s user0m0.780s sys 0m0.180s That took a while. What was it doing? $ perf report | grep -v '#' | head -5 3.23% git [kernel.kallsyms] [k] copy_user_enhanced_fast_string 1.74% git libcrypto.so.1.0.0 [.] 0x0007e010 1.60% git [kernel.kallsyms] [k] __d_lookup_rcu 1.51% git [kernel.kallsyms] [k] page_fault 1.44% git libc-2.18.so[.] __memcmp_sse4_1 Reading files and sha1. We hash the working-tree files here (reset doesn't technically need to refresh the index from the working tree to copy entries from HEAD into the index, but it does it so it can do fancy things like tell you about which files are now out-of-date). Now how does stat fare after this? $ time perf record -q git status /dev/null real0m0.189s user0m0.088s sys 0m0.096s Looks about the same as before to me. Note that if you use read-tree instead of reset, it _just_ loads the index, and doesn't touch the working tree. If you then run git status, then _that_ command has to refresh the index, and it will pay the hashing cost. Like: $ rm .git/index $ time git read-tree HEAD real0m0.084s user0m0.064s sys 0m0.016s $ time git status /dev/null real0m0.833s user0m0.712s sys 0m0.112s All of this is behaving as I would expect. Can you show us a set of commands that deviate from this? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] strbuf: add strbuf_tolower function
Jeff King p...@peff.net writes: On Wed, May 21, 2014 at 05:07:36PM -0700, Kyle J. McKay wrote: +void strbuf_tolower(struct strbuf *sb) +{ + size_t i; + for (i = 0; i sb-len; i++) + sb-buf[i] = tolower(sb-buf[i]); +} + Wouldn't a direct transfer of the lowercase function be something more like: void strbuf_tolower(struct strbuf *sb) { char *p = sb-buf; for (; *p; p++) *p = tolower(*p); } That seems to me to be a bit more efficient. According to the comments in strbuf.c, people can always assume buf is non NULL and -buf is NUL terminated even for a freshly initialized strbuf. Yes, and that would be fine with me (I actually wrote strbuf_tolower for my own use, and _then_ realized that we already had such a thing that could be replaced). Do we forbid that sb-buf[x] for some x sb-len to be NUL, and if there is such a byte we stop running tolower() on the remainder? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, May 22, 2014 at 02:17:22PM -0400, David Turner wrote: In fact, git status does not write the index (at least in this context). And what is slow is not (only) checking over the working tree, but reading the packs. There should be no need to read files from the ODB at all, since the index knows those objects' SHA1s, and it can check those against the working tree's SHA1s. Interestingly, if strace is to be believed, git status doesn't even do check the working trees SHA1s after a git reset; maybe reset already sets the stat bits? Keep in mind that status is running _two_ diffs. One between HEAD and the index, and one between the index and the working tree. Your timings might be more interesting just run git diff-index --cached HEAD, which is the index-HEAD half of that, ignoring the working tree state entirely. Naively, running that diff will involve walking the trees and the index in parallel, which means opening a bunch of tree objects. The cache-tree index extension, however, records tree sha1s, which means we can avoid recursing into parts of the comparison. Comparing the two: $ rm .git/index $ git reset $ time git diff-index --cached HEAD real0m0.044s user0m0.040s sys 0m0.000s $ git reset --hard $ time git diff-index --cached HEAD real0m0.012s user0m0.008s sys 0m0.000s does show some improvement. Perhaps git reset is not writing out the cache-tree extension? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] strbuf: add strbuf_tolower function
On Thu, May 22, 2014 at 11:36:37AM -0700, Junio C Hamano wrote: Yes, and that would be fine with me (I actually wrote strbuf_tolower for my own use, and _then_ realized that we already had such a thing that could be replaced). Do we forbid that sb-buf[x] for some x sb-len to be NUL, and if there is such a byte we stop running tolower() on the remainder? Christian brought this up elsewhere, and I agree it's probably better to work over the whole buffer, NULs included. I'm happy to re-roll (or you can just pick up the version of the patch in this thread), but I think the bigger question is: is this refactor worth doing, since there is only one caller? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 0/2] format-patch --signature-file file
Jeremiah Mahler jmmah...@gmail.com writes: I just notice that my patch is in 'pu'. But it is version 7 instead of the improved version 8. Yeah, I know. In a distributed environment, multiple people work independently and a sequence of event can go like this: - I read v7, comment, and queue it only so that I won't forget; - I attend to other topics; - I start the day's integration cycle, merging topics to the integration branches, maint, master, next and pu; - You reroll v8 and post it; - I may not notice v8, or I may notice v8 but think it is not worth to discard the integration work so far only to replace v7 with v8. - The integration result is pushed out. A reminder is very much appreciated, but on the other hand it is not something unusual. It happens all the time, and I usually am aware of it ;-) Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, 2014-05-22 at 14:39 -0400, Jeff King wrote: does show some improvement. Perhaps git reset is not writing out the cache-tree extension? Yes, that seems to be exactly what is going on; the two indexes are identical up to the point where the TREE extension appears. Thanks for clearing that up! Do you think that this is a bug in git reset? (I'm also going to reply to your other mail because it seems like it maybe points up some related bug) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, May 22, 2014 at 03:07:49PM -0400, David Turner wrote: On Thu, 2014-05-22 at 14:39 -0400, Jeff King wrote: does show some improvement. Perhaps git reset is not writing out the cache-tree extension? Yes, that seems to be exactly what is going on; the two indexes are identical up to the point where the TREE extension appears. Thanks for clearing that up! Do you think that this is a bug in git reset? Possibly. There is a call to prime_cache_tree in builtin/reset.c, which looks like it should trigger during a mixed or hard reset (and without arguments, you should have a mixed reset). But it doesn't seem to get called. I haven't traced it further. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only
On Thu, May 22, 2014 at 11:17 AM, Jonathan Nieder jrnie...@gmail.com wrote: Hi, Ronnie Sahlberg wrote: Add a new flag REF_ISPACKONLY that we can use in ref_transaction_delete. This flag indicates that the ref does not exist as a loose ref andf only as a packed ref. If this is the case we then change the commit code so that we skip taking out a lock file and we skip calling delete_ref_loose. This seems wrong. Can't someone else create a loose ref which will shadow the packed ref and break the serializability of updates to this ref? The above doesn't explain why we want to avoid locking the loose ref, but I assume it's for the sake of the git branch -m foo/bar foo case. For that case, wouldn't the following sequence of filesystem operations work? - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - if $GIT_DIR/refs/heads/foo/bar exists, add the ref to packed-refs (using the usual lockfile-update mechanism) and unlink $GIT_DIR/refs/heads/foo/bar - verify that current foo and foo/bar state are okay. If not, roll back. - unlink $GIT_DIR/refs/heads/foo/bar.lock - rmdir $GIT_DIR/refs/heads/foo - rename $GIT_DIR/refs/heads/foo.lock into place Or: - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - verify state of all relevant refs - update packed-refs to remove refs/heads/foo/bar and add refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar - unlink $GIT_DIR/refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar.lock - unlink $GIT_DIR/refs/heads/foo.lock I removed all the rename_ref related patches for now. rename_ref is probably best implemented specifically for each backend anyway. I will still produce a separate patch that will do something like this you suggested (since rename_ref is still racy and fragile) - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - verify state of all relevant refs - update packed-refs to remove refs/heads/foo/bar and add refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar - unlink $GIT_DIR/refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar.lock - unlink $GIT_DIR/refs/heads/foo.lock Thanks ronnie sahlberg -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, 2014-05-22 at 14:23 -0400, Jeff King wrote: On Thu, May 22, 2014 at 02:08:16PM -0400, David Turner wrote: On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote: On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote: If I have a git repository with a clean working tree, and I delete the index, then I can use git reset (with no arguments) to recreate it. However, when I do recreate it, it doesn't come back the same. I have not analyzed this in detail, but the effect is that commands like git status take much longer because they must read objects out of a pack file. In other words, the index seems to not realize that the index (or at least most of it) represents the same state as HEAD. If I do git reset --hard, the index is restored to the original state (it's byte-for-byte identical), and the pack file is no longer read. Are you sure it's reading a packfile? Well, it's calling inflate(), and strace says it is reading e.g. .git/objects/pack/pack-{idx,pack}. So, I would say so. That seems odd that we would be spending extra time there. We do inflate() the trees in order to diff the index against HEAD, but we shouldn't need to inflate any blobs. I just did a fresh clone of linux.git, and it doesn't have TREE (and spends time in inflate). Then I did a reset --hard, and it gained TREE (and stopped spending time in inflate). Then I did a checkout -b, and TREE was lost again (time in inflate). hard reset again (to HEAD, so no change), and TREE returns and status is fast again. Committing doesn't seem to create a TREE extension where one doesn't exist. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 42/44] refs.c: pass a skip list to name_conflict_fn
Ronnie Sahlberg wrote: --- a/refs.c +++ b/refs.c @@ -798,11 +798,19 @@ struct name_conflict_cb { const char *refname; const char *oldrefname; const char *conflicting_refname; + const char **skip; + int skipnum; Would a struct string_list make sense here? (See Documentation/technical/api-string-list.txt.) [...] }; static int name_conflict_fn(struct ref_entry *entry, void *cb_data) { struct name_conflict_cb *data = (struct name_conflict_cb *)cb_data; + int i; + for(i = 0; i data-skipnum; i++) { (style nit) missing space after 'for'. + if (!strcmp(entry-name, data-skip[i])) { + return 0; + } Style: git tends to avoid braces around a single-line if/for/etc body. [...] @@ -817,15 +825,21 @@ static int name_conflict_fn(struct ref_entry *entry, void *cb_data) * conflicting with the name of an existing reference in dir. If * oldrefname is non-NULL, ignore potential conflicts with oldrefname * (e.g., because oldrefname is scheduled for deletion in the same - * operation). + * operation). skip contains a list of refs we want to skip checking for + * conflicts with. Refs may be skipped due to us knowing that it will + * be deleted later during a transaction that deletes one reference and then + * creates a new conflicting reference. For example a rename from m to m/m. This example of Refs may be skipped due to seems overly complicated. Isn't the idea just that skip contains a list of refs scheduled for deletion in this transaction, since they shouldn't be treated as conflicts at all (for example when renamining m to m/m)? I wonder if there's some way to make use of the result of the naive refname_available check to decide what to do when creating a ref. E.g.: if a refname would be available except there's a ref being deleted in the way, we could do one of the following: a. delete all relevant loose refs and perform the transaction in packed-refs, or b. order operations to avoid the D/F conflict, even with loose refs (the hardest case is if the ref being deleted uses a directory and we want to create a file with the same name. But that's still doable if we're willing to rmdir when needed as part of the loop to commit changes) The packed-refs trick (a) seems much simpler, but either should work. This could be done e.g. by checking is_refname_available with an empty list first before doing the real thing with a list of exclusions. [...] @@ -2592,6 +2609,9 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms int log = !lstat(git_path(logs/%s, oldrefname), loginfo); const char *symref = NULL; + if (!strcmp(oldrefname, newrefname)) + return 0; What is the intended result if I try to rename a nonexistent ref or an existent symref to its own name? Sorry to be so fussy about this part. It's not that I think that this change is trying to do something bad --- in fact, it's more the opposite, that I'm excited to see git learning to have a better understanding and handling of refname D/F conflicts. That would allow: * git fetch --prune working as a single transaction even if the repository being fetched from removed a refs/heads/topic branch and created refs/heads/topic/1 and refs/heads/topic/2 * git fast-import and git fetch --mirror learning the same trick * fewer code paths having to be touched to be able to (optionally) let git actually tolerate D/F conflicts, for people who want to have 'topic', 'topic/1', and 'topic/2' branches at the same time. This could be turned on by default for remote-tracking refs. It would be especially nice for people on Windows and Mac OS where there can be D/F conflicts that people on Linux didn't notice due to case-sensitivity. Longer term, through a configuration that starts turned off by default and has the default flipped as more people have upgraded git, this could make D/F conflicts in refnames stop being an error altogether. So it's kind of exciting to see, even though it's fussy to get it right. Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
[+cc Junio for cache-tree expertise] On Thu, May 22, 2014 at 03:09:59PM -0400, Jeff King wrote: does show some improvement. Perhaps git reset is not writing out the cache-tree extension? [...] Possibly. There is a call to prime_cache_tree in builtin/reset.c, which looks like it should trigger during a mixed or hard reset (and without arguments, you should have a mixed reset). But it doesn't seem to get called. I haven't traced it further. So here's what's happening. The prime_cache_tree call is in reset_index, and was added by: commit 6c52ec8a9ab48b50fc8bf9559467d5a4cf7eee3b Author: Thomas Rast tr...@student.ethz.ch Date: Tue Dec 6 18:43:39 2011 +0100 reset: update cache-tree data when appropriate In the case of --mixed and --hard, we throw away the old index and rebuild everything from the tree argument (or HEAD). So we have an opportunity here to fill in the cache-tree data, just as read-tree did. Signed-off-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Junio C Hamano gits...@pobox.com But that was counteracted by: commit 3fde386a40f38dbaa684c17603e71909b862d021 Author: Martin von Zweigbergk martinv...@gmail.com Date: Mon Jan 14 21:47:51 2013 -0800 reset [--mixed]: use diff-based reset whether or not pathspec was given Thanks to b65982b (Optimize diff-index --cached using cache-tree, 2009-05-20), resetting with paths is much faster than resetting without paths. Some timings for the linux-2.6 repo to illustrate this (best of five, warm cache): reset reset . real0m0.219s0m0.080s user0m0.140s0m0.040s sys 0m0.070s0m0.030s These two commands should do the same thing, so instead of having the user type the trailing . to get the faster do_diff_cache()-based implementation, always use it when doing a mixed reset, with or without paths (so git reset $rev would also be faster). Timing git reset shows that it indeed becomes as fast as git reset . after this patch. Signed-off-by: Martin von Zweigbergk martinv...@gmail.com Signed-off-by: Junio C Hamano gits...@pobox.com We never call reset_index now, because we handle it via diff. We could call prime_cache_tree in this case, but I'm not sure if that is a good idea, because it primes it from scratch (and so it opens up all those trees that we are trying to avoid touching). I'm not sure if there's an easy way to update it incrementally; I don't know the cache-tree code very well. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 0/2] format-patch --signature-file file
Junio, On Thu, May 22, 2014 at 12:00:39PM -0700, Junio C Hamano wrote: Jeremiah Mahler jmmah...@gmail.com writes: I just notice that my patch is in 'pu'. But it is version 7 instead of the improved version 8. Yeah, I know. In a distributed environment, multiple people work independently and a sequence of event can go like this: - I read v7, comment, and queue it only so that I won't forget; - I attend to other topics; - I start the day's integration cycle, merging topics to the integration branches, maint, master, next and pu; - You reroll v8 and post it; - I may not notice v8, or I may notice v8 but think it is not worth to discard the integration work so far only to replace v7 with v8. - The integration result is pushed out. A reminder is very much appreciated, but on the other hand it is not something unusual. It happens all the time, and I usually am aware of it ;-) Thanks. Thanks for explaining how this process works. -- Jeremiah Mahler jmmah...@gmail.com http://github.com/jmahler -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 00/44] Use ref transactions for all ref updates
Ronnie Sahlberg wrote: This version completes the work to convert all ref updates to use transactions. Finally got through this. It had thorny bits but generally goes in a very good direction. Thanks for a pleasant read. Feel free to send another iteration if you'd like review for the newer code. I expect except for the part about renames that most of what's left is just nits so it should go faster. Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 00/44] Use ref transactions for all ref updates
Ronnie Sahlberg wrote: This patch series can also be found at https://github.com/rsahlberg/git/tree/ref-transactions Thoughts on 65a1cb7b (2014-05-22 12:08): 01/40 remove ref_transaction_rollback Reviewed-by: Jonathan Nieder jrnie...@gmail.com 02/40 constify the sha arguments for ref_transaction_create|delete|update still Reviewed-by: Jonathan Nieder jrnie...@gmail.com 03/40 allow passing NULL to ref_transaction_free The third paragraph (This allows us to write code like) is still confusing. Why not drop that paragraph? 04/40 --- oh, lunch time. Will pick up when I get back. Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] Add an explicit GIT_DIR to the list of excludes
When an explicit '--git-dir' option points to a directory inside the work tree, git treats it as if it were any other directory. In particular, 'git status' lists it as untracked, while 'git add -A' stages the metadata directory entirely Add GIT_DIR to the list of excludes in setup_standard_excludes(), while checking that GIT_DIR is not just '.git', in which case it would be ignored by default, and that GIT_DIR is inside GIT_WORK_TREE Although an analogous comparison of any given path against '.git' is done in treat_path(), this does not seem to be the right place to compare against GIT_DIR. Instead, the excludes provide an effective mechanism of ignoring a file/directory, and adding GIT_DIR as an exclude is equivalent of putting it into '.gitignore'. Function setup_standard_excludes() was chosen because that is the place where the excludes are initialized by the commands that are concerned about excludes Signed-off-by: Pasha Bolokhov pasha.bolok...@gmail.com --- Add a test and change the documentation which now reflects the fact that 'setup_standard_excludes()' has to be called for the setup purposes Documentation/technical/api-directory-listing.txt | 4 +- dir.c | 20 t/t2205-add-gitdir.sh | 61 +++ 3 files changed, 83 insertions(+), 2 deletions(-) create mode 100755 t/t2205-add-gitdir.sh diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt index 7f8e78d..fd4a178 100644 --- a/Documentation/technical/api-directory-listing.txt +++ b/Documentation/technical/api-directory-listing.txt @@ -90,8 +90,8 @@ marked. If you to exclude files, make sure you have loaded index first. `add_exclude()`. * To add patterns from a file (e.g. `.git/info/exclude`), call - `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`. A - short-hand function `setup_standard_excludes()` can be used to set + `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`. The + short-hand function `setup_standard_excludes()` must be used to set up the standard set of exclude settings. * Set options described in the Data Structure section above. diff --git a/dir.c b/dir.c index 98bb50f..07e36f3 100644 --- a/dir.c +++ b/dir.c @@ -1588,6 +1588,26 @@ void setup_standard_excludes(struct dir_struct *dir) { const char *path; char *xdg_path; + const char *gitdir = get_git_dir(); + + /* Add git directory to the ignores first */ + if (strcmp(gitdir, .git) != 0) { /* --git-dir has been given */ + char ngit[PATH_MAX + 1]; + + /* +* See if GIT_DIR is inside the work tree; need to normalize +* 'gitdir' but 'get_git_work_tree()' always appears absolute +* and normalized +*/ + normalize_path_copy(ngit, real_path(absolute_path(gitdir))); + + if (dir_inside_of(ngit, get_git_work_tree()) = 0) { + struct exclude_list *el = add_exclude_list(dir, EXC_CMDL, + --git-dir option); + + add_exclude(gitdir, , 0, el, 0); + } + } dir-exclude_per_dir = .gitignore; path = git_path(info/exclude); diff --git a/t/t2205-add-gitdir.sh b/t/t2205-add-gitdir.sh new file mode 100755 index 000..3c6b853 --- /dev/null +++ b/t/t2205-add-gitdir.sh @@ -0,0 +1,61 @@ +#!/bin/sh +# +# Copyright (c) 2014 Pasha Bolokhov +# + +test_description='alternative repository path specified by --git-dir is ignored by add and status' + +. ./test-lib.sh + +test_expect_success setup ' + git --git-dir=meta init + for f in a b c d + do + echo DATA $f || exit 1 + done + mkdir subdir1 + for f in e f g h + do + echo MORE DATA subdir1/$f || exit 1 + done + mkdir subdir1/meta + echo EVEN more Data subdir1/meta/aa + mkdir subdir1/ssubdir subdir1/ssubdir/meta + echo So much more data subdir1/ssubdir/meta/aaa +' + +test_expect_success 'git status' ignores the repository directory ' + git --git-dir=meta --work-tree=. status status.out + test_might_fail grep meta status.out out + ! test -s out +' + +test_expect_success 'git add -A' ignores the repository directory ' + git --git-dir=meta --work-tree=. add -A + git --git-dir=meta --work-tree=. status --porcelain status1.out + test_might_fail grep meta status1.out out1 + ! test -s out1 +' + +test_expect_success 'git status' acknowledges files 'meta' if repository is not within work tree ' + test_might_fail rm -rf meta/ + ( + cd subdir1 + git --git-dir=../meta init + git --git-dir=../meta --work-tree=. status --porcelain status2.out + test_might_fail
Re: [PATCH v8 2/2] format-patch --signature-file file
Jeremiah Mahler jmmah...@gmail.com writes: Added option that allows a signature file to be used with format-patch so that signatures with newlines and other special characters can be easily included. s/Added option/Add an option/. I do not think with newlines and other special characters is the primary issue---isn't it more about I have chosen to use this mail-signature; do not force me to retype the same all the time? $ git format-patch --signature-file ~/.signature -1 The recommended command-line convention (see gitcli(7)) is to use --option=value, so an example would be better to follow it, i.e. $ git format-patch -1 --signature-file=$HOME/.signature The config variable format.signaturefile is also provided so that it can be added by default. $ git config format.signaturefile ~/.signature $ git format-patch -1 Something like: To countermand the configuration variable for a specific run: $ git format-patch -1 --signature=This time only $ git format-patch -1 --signature;# to use the default $ git format-patch -1 --signature= ;# to add nothing is also needed here, I think. Similarly, these two needs to be tested in the test scripts you are modifying. Specifically: +test_expect_success 'format-patch --no-signature and --signature-file OK' ' + git format-patch --stdout --no-signature --signature-file=mail-signature -1 +' should not just make sure format-patch does _something_, but needs to make sure it does not contain the contents of the configured mail signagture file. I didn't see offhand if the tests make sure that a configured mail signature can be overriden from the command line. I think you would want to test, with format-patch.signature-file pointing at the mail-signature file, at least these three cases: - Run format-patch --no-signature and make sure that stops the contents from mail-signature file from being shown, and instead no mail-signature is given. - Run format-patch --signature='this time only' and make sure that stops the contents from mail-signature file from being shown and this time only is used instead. - Run format-patch --signature-file=another-mail-signature and make sure that stops the contents from mail-signature file from being shown and the contents from the other file is used instead. Test for these negative cases is often what we forget, when we are thrilled to show off that the shiny new feature works as expected. We need to ensure that the ways to stop the shiny new feature from kicking in will not be broken as well. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] strbuf: add strbuf_tolower function
Jeff King p...@peff.net writes: Yes, and that would be fine with me (I actually wrote strbuf_tolower for my own use, and _then_ realized that we already had such a thing that could be replaced). ... ... I think the bigger question is: is this refactor worth doing, since there is only one caller? If you wrote it for your own use and then realized that it is applicable to this codepath, wouldn't that say that there are multiple potential callers that would benefit from having it? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
Jeff King p...@peff.net writes: [+cc Junio for cache-tree expertise] ... We never call reset_index now, because we handle it via diff. We could call prime_cache_tree in this case, but I'm not sure if that is a good idea, because it primes it from scratch (and so it opens up all those trees that we are trying to avoid touching). I'm not sure if there's an easy way to update it incrementally; I don't know the cache-tree code very well. The cache-tree is designed to start in a well-populated state, allowing you to efficiently smudge the part you touched by invalidating while keeping the parts you haven't touched intact. What is missing in its API is a more fine-grained support to let us say it has degraded too much and we need to bring it into a well-populated state again for it to be truly useful as an optimization. There are only two modes of support to revive a degraded cache-tree, one being write_cache_as_tree(), in which case we have to compute necessary tree object names anyway (so there is no point discarding the result of the computation), and the other being calls to prime-cache-tree, in which we happen to know that the whole index contents must match the whole tree structure represented by one tree object. Both aim to restore the cache-tree into a fully-populated state, and there is no support to populate it well enough by doing anything incremental. You can call write-tree side incremental, because it does reuse what is still valid without recomputing tree objects for them---but the result is a fully-populated state. Adding a more fine-grain support is not against the overall design, but it was unclear what such additional API functions should look like, and where we can call them safely, at least back when we were actively improving it. Two that comes to my mind are: - We know that the subtrees down in this directory are degraded too much; write-tree only the subtrees that correspond to this directory without restoring other parts of the tree. - We just populated the index with the subtrees in this directory and know that they should match the tree hierarchy exactly. prime-cache-tree only the parts without affecting other parts of the tree. As with calls to existing (whole-tree) prime-cache-tree, the latter is an error-prone optimization---I think we had cases where we said after this operation, we know that the index must exactly match the tree we used to muck with the index and added a call, and later discovered that must exactly match was not true. The former forces recomputation, so there is much less safety concern. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, 2014-05-22 at 14:34 -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: [+cc Junio for cache-tree expertise] ... We never call reset_index now, because we handle it via diff. We could call prime_cache_tree in this case, but I'm not sure if that is a good idea, because it primes it from scratch (and so it opens up all those trees that we are trying to avoid touching). I'm not sure if there's an easy way to update it incrementally; I don't know the cache-tree code very well. The cache-tree is designed to start in a well-populated state, allowing you to efficiently smudge the part you touched by invalidating while keeping the parts you haven't touched intact. As far as I can tell, the cache-tree does not in fact ever get into a well-populated state (that is, it does not exist at all) under ordinary git operation except by git reset --hard. Perhaps this was already clear from the previous traffic on the thread, but I wanted to make sure Junio was also aware of this. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
David Turner dtur...@twopensource.com writes: On Thu, 2014-05-22 at 14:34 -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: [+cc Junio for cache-tree expertise] ... We never call reset_index now, because we handle it via diff. We could call prime_cache_tree in this case, but I'm not sure if that is a good idea, because it primes it from scratch (and so it opens up all those trees that we are trying to avoid touching). I'm not sure if there's an easy way to update it incrementally; I don't know the cache-tree code very well. The cache-tree is designed to start in a well-populated state, allowing you to efficiently smudge the part you touched by invalidating while keeping the parts you haven't touched intact. As far as I can tell, the cache-tree does not in fact ever get into a well-populated state (that is, it does not exist at all) under ordinary git operation except by git reset --hard. Perhaps this was already clear from the previous traffic on the thread, but I wanted to make sure Junio was also aware of this. Yes. As I said, that should not usually be a problem for those who do the real work (read: commit), at which time write-tree will fully populate the cache-tree. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 00/44] Use ref transactions for all ref updates
Ronnie Sahlberg wrote: This patch series can also be found at https://github.com/rsahlberg/git/tree/ref-transactions Thoughts on 65a1cb7b (2014-05-22 12:08): 04/40 add a strbuf argument to ref_transaction_commit for error logging Ideally this would come after the functions it calls so the comment If error is non-NULL we will add an error string to it to explain why the transaction failed would be true already. But rearranging the series for that would be more fuss than it's worth IMHO. The sanity check int ref_transaction_commit(...) { int ret = ref_transaction_commit_real(...); if (ret err !err-len) die(BUG: ref_transaction_commit forgot to write an error message); return ret; } shows no instances of forgotten error messages in cases exercised by tests, at least. And it's a step in the right direction. Reviewed-by: Jonathan Nieder jrnie...@gmail.com 05/40 add an err argument to repack_without_refs unable_to_lock_strbuf could be called unable_to_lock_message (which would make its behavior more obvious imho). This makes errno meaningful when commit_packed_refs returns an error. Probably its API documentation should say so to make it obvious to people modifying it in the future to preserve that property or audit callers. Via the new call to unable_to_lock_..., repack_without_refs cares about errno after a failed call to lock_packed_refs. lock_packed_refs can only fail in hold_lock_file_for_update. hold_lock_file_for_update is a thin wrapper around lockfile.c::lock_file. lock_file can error out because strlen(path) = max_path_len adjust_shared_perm failed [calls error(), clobbers errno] open failed So lock_file needs a similar kind of fix, and it's probably worth updating API documentation for these calls to make it clear that their errno is used (though that's not a new problem since repack_without_refs already called unable_to_lock_error). Could be a separate, earlier patch since it's fixing an existing bug. 06/40 make ref_update_reject_duplicates take a strbuf argument for errors still Reviewed-by: Jonathan Nieder jrnie...@gmail.com 07/40 add an err argument to delete_ref_loose The new unlink_or_err has an odd contract when the err argument is passed. On error: * if errno != ENOENT, it will append a message to err and return -1 (good) * if errno == ENOENT, it will not append a message to err but will still return -1. Perhaps it should return 0 when errno == ENOENT. After all, in that case the file does not exist any more, which is all we wanted. And it would save the caller from having to inspect errno. On failure we seem to add our own message to err, too, so the resulting message would be something like fatal: unable to unlink .git/refs/heads/master: \ Permission deniedfailed to delete loose ref '.git/refs/heads/master.lock' The second strbuf_addf is probably not needed. 08/40 make update_ref_write update a strbuf on failure still Reviewed-by: Jonathan Nieder jrnie...@gmail.com 09/40 log transaction error from the update_ref No actual functional change intended, right? I'd say something like update-ref: use err argument to get error from ref_transaction_commit or something similar to make it clearer that this is just about changing APIs. Or if there's an intended functional change, then the commit message could say something about that. 10/40 remove the onerr argument to ref_transaction_commit still Reviewed-by: Jonathan Nieder jrnie...@gmail.com -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
David Turner dtur...@twopensource.com writes: Yes. As I said, that should not usually be a problem for those who do the real work (read: commit), at which time write-tree will fully populate the cache-tree. Git commit does not in fact populate the cache-tree. If that is the case, we must have broken the write-tree codepath over time. Of course, git commit foo will need to prepare a temporary index where only the entry foo is different from the HEAD version, write the tree to create a commit, but we do not write out the main index as a tree after updating the same entry foo in it (there may be other changes relative to HEAD), so its cache-tree is only invalidated at foo when we updating the entry and we do not spend extra cycles to repopulate it. But at least my understanding has been that git commit (no partial commit, write the whole index as a commit) which uses the git write-tree machinery knows which subtree has what tree object name and populates the cache-tree fully. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
Junio C Hamano gits...@pobox.com writes: David Turner dtur...@twopensource.com writes: Yes. As I said, that should not usually be a problem for those who do the real work (read: commit), at which time write-tree will fully populate the cache-tree. Git commit does not in fact populate the cache-tree. If that is the case, we must have broken the write-tree codepath over time. Of course, git commit foo will need to prepare a temporary index where only the entry foo is different from the HEAD version, write the tree to create a commit, but we do not write out the main index as a tree after updating the same entry foo in it (there may be other changes relative to HEAD), so its cache-tree is only invalidated at foo when we updating the entry and we do not spend extra cycles to repopulate it. But at least my understanding has been that git commit (no partial commit, write the whole index as a commit) which uses the git write-tree machinery knows which subtree has what tree object name and populates the cache-tree fully. ... and the incrementally repair Peff talks about would be to cover more cases where we may know (either because we have already computed it to write out a subtree, or we have just read from a known tree to populate a part of the index and we know the paths in the index that correspond to that subtree are exactly the same ones as found in the tree we read from) parts of the cache-tree can be updated with tree object names for subtrees, but we don't do anything right now. git commit foo (where foo is a directory) may be able to say The cache-tree entry for 'foo' can be updated with the tree object of the new HEAD:foo because we know they must match in the main index, for example. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
Junio C Hamano gits...@pobox.com writes: But at least my understanding has been that git commit (no partial commit, write the whole index as a commit) which uses the git write-tree machinery knows which subtree has what tree object name and populates the cache-tree fully. Here is what I tried just now. $ rm .git/index $ git read-tree HEAD HEAD Note that a single-tree read-tree will populate the cache-tree and that is why I am forcing switch branches 2-way read-tree here, which I know will discard the cache-tree fully. $ ls -l .git/index -rw-r- 1 jch eng 249440 May 22 15:20 .git/index $ git checkout HEAD^0 $ ls -l .git/index -rw-r- 1 jch eng 249440 May 22 15:21 .git/index Still the same size, without cache-tree. $ git write-tree 57361c4add61b638dad1c1c2542edf877f515c48 $ ls -l .git/index -rw-r- 1 jch eng 254383 May 22 15:21 .git/index The size differences come from the recomputation of the cache tree. The result is the same if we replace git write-tree with a whole-index commit, e.g. $ git commit --allow-empty -m foo and test-dump-cache-tree seem to see a fully populated cache-tree after these steps. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] strbuf: add strbuf_tolower function
On May 22, 2014, at 11:41, Jeff King wrote: On Thu, May 22, 2014 at 11:36:37AM -0700, Junio C Hamano wrote: Yes, and that would be fine with me (I actually wrote strbuf_tolower for my own use, and _then_ realized that we already had such a thing that could be replaced). Do we forbid that sb-buf[x] for some x sb-len to be NUL, and if there is such a byte we stop running tolower() on the remainder? Christian brought this up elsewhere, and I agree it's probably better to work over the whole buffer, NULs included. I'm happy to re-roll (or you can just pick up the version of the patch in this thread), The only reason I brought up the code difference in the first place was that the comment was This makes config's lowercase() function public which made me expect to see basically the equivalent of replacing a static with an extern, but actually the function being made public was a different implementation than config's lowercase() function. So it looks like the original PATCH 2/9 version of the code is best, but perhaps the comment can be tweaked a bit to not convey the same impression I got. Maybe something like Replace config's lowercase() function with a public one. --Kyle -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/8] http: extract type/subtype portion of content-type
On May 22, 2014, at 02:29, Jeff King wrote: When we get a content-type from curl, we get the whole header line, including any parameters, and without any normalization (like downcasing or whitespace) applied. If we later try to match it with strcmp() or even strcasecmp(), we may get false negatives. This could cause two visible behaviors: 1. We might fail to recognize a smart-http server by its content-type. 2. We might fail to relay text/plain error messages to users (especially if they contain a charset parameter). This patch teaches the http code to extract and normalize just the type/subtype portion of the string. This is technically passing out less information to the callers, who can no longer see the parameters. But none of the current callers cares, and a future patch will add back an easier-to-use method for accessing those parameters. Signed-off-by: Jeff King p...@peff.net --- http.c | 32 +--- remote-curl.c | 2 +- t/lib-httpd/error.sh | 8 +++- t/t5550-http-fetch-dumb.sh | 5 + 4 files changed, 42 insertions(+), 5 deletions(-) diff --git a/http.c b/http.c index 94e1afd..4edf5b9 100644 --- a/http.c +++ b/http.c @@ -906,6 +906,29 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf) return ret; } +/* + * Extract a normalized version of the content type, with any + * spaces suppressed, all letters lowercased, and no trailing ; + * or parameters. + * + * Example: + * TEXT/PLAIN; charset=utf-8 - text/plain + */ +static void extract_content_type(struct strbuf *raw, struct strbuf *type) +{ + const char *p; + + strbuf_reset(type); + strbuf_grow(type, raw-len); + for (p = raw-buf; *p; p++) { + if (isspace(*p)) + continue; + if (*p == ';') + break; + strbuf_addch(type, tolower(*p)); + } +} + This will parse invalid content types as valid. Probably not important since the producer of an invalid content type shouldn't be depending on any particular behavior by the consumer of such a type, but I think it warrants a note in the comment block, perhaps something like: * Note that an invalid content-type may be converted to a valid one or some such. --Kyle -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only
I hate rename_ref :-) I have reworked the transaction code to special case the deletion of the old ref for n/n - n and n - n/n renames so that we can carefully avoid n/n.lock files to exist or prevent the directory - file transition for n during these renames. This should allow us to have rename_ref becoming reasonably implementation agnostic, aside for that it wants to lstat the ref to see if it is a soft link, and re-use the code for other refs backends. Please see the ref-transaction branch. On Thu, May 22, 2014 at 12:12 PM, Ronnie Sahlberg sahlb...@google.com wrote: On Thu, May 22, 2014 at 11:17 AM, Jonathan Nieder jrnie...@gmail.com wrote: Hi, Ronnie Sahlberg wrote: Add a new flag REF_ISPACKONLY that we can use in ref_transaction_delete. This flag indicates that the ref does not exist as a loose ref andf only as a packed ref. If this is the case we then change the commit code so that we skip taking out a lock file and we skip calling delete_ref_loose. This seems wrong. Can't someone else create a loose ref which will shadow the packed ref and break the serializability of updates to this ref? The above doesn't explain why we want to avoid locking the loose ref, but I assume it's for the sake of the git branch -m foo/bar foo case. For that case, wouldn't the following sequence of filesystem operations work? - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - if $GIT_DIR/refs/heads/foo/bar exists, add the ref to packed-refs (using the usual lockfile-update mechanism) and unlink $GIT_DIR/refs/heads/foo/bar - verify that current foo and foo/bar state are okay. If not, roll back. - unlink $GIT_DIR/refs/heads/foo/bar.lock - rmdir $GIT_DIR/refs/heads/foo - rename $GIT_DIR/refs/heads/foo.lock into place Or: - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - verify state of all relevant refs - update packed-refs to remove refs/heads/foo/bar and add refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar - unlink $GIT_DIR/refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar.lock - unlink $GIT_DIR/refs/heads/foo.lock I removed all the rename_ref related patches for now. rename_ref is probably best implemented specifically for each backend anyway. I will still produce a separate patch that will do something like this you suggested (since rename_ref is still racy and fragile) - create $GIT_DIR/refs/heads/foo/bar.lock - create $GIT_DIR/refs/heads/foo.lock - verify state of all relevant refs - update packed-refs to remove refs/heads/foo/bar and add refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar - unlink $GIT_DIR/refs/heads/foo - unlink $GIT_DIR/refs/heads/foo/bar.lock - unlink $GIT_DIR/refs/heads/foo.lock Thanks ronnie sahlberg -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Thu, 2014-05-22 at 15:29 -0700, Junio C Hamano wrote: Junio C Hamano gits...@pobox.com writes: But at least my understanding has been that git commit (no partial commit, write the whole index as a commit) which uses the git write-tree machinery knows which subtree has what tree object name and populates the cache-tree fully. Here is what I tried just now. $ rm .git/index $ git read-tree HEAD HEAD Note that a single-tree read-tree will populate the cache-tree and that is why I am forcing switch branches 2-way read-tree here, which I know will discard the cache-tree fully. $ ls -l .git/index -rw-r- 1 jch eng 249440 May 22 15:20 .git/index $ git checkout HEAD^0 $ ls -l .git/index -rw-r- 1 jch eng 249440 May 22 15:21 .git/index Still the same size, without cache-tree. $ git write-tree 57361c4add61b638dad1c1c2542edf877f515c48 $ ls -l .git/index -rw-r- 1 jch eng 254383 May 22 15:21 .git/index The size differences come from the recomputation of the cache tree. The result is the same if we replace git write-tree with a whole-index commit, e.g. $ git commit --allow-empty -m foo and test-dump-cache-tree seem to see a fully populated cache-tree after these steps. I get the same results as you with git write-tree. But I do not get the same results from a whole-index git commit (I tried your exact command-line). That is, when I do git commit with no cache-tree in place, it does not create one. To expand: even if git commit did work for me the way it seems to work for you, I still believe that the cache-tree behavior would be suboptimal, because every time a user switches branches, they lose their cache-tree, and thus all of their git status commands are slow until their first commit. But I am willing to believe that my workflow is atypical, and that most people commit enough soon after switching branches. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 00/44] Use ref transactions for all ref updates
Ronnie Sahlberg wrote: This patch series can also be found at https://github.com/rsahlberg/git/tree/ref-transactions Continuing with the review of 65a1cb7b (2014-05-22 12:08): 11/40 change ref_transaction_update() to do error checking and return status The there will be in the future sounds ominous. Do you have an example in mind? E.g., I suppose it would be nice if _update could notice D/F conflicts or connection to a database server closing early, but it's not clear to me whether the kind of errors you're talking about are that or something else. With or without such a clarification, Reviewed-by: Jonathan Nieder jrnie...@gmail.com 12/40 change ref_transaction_create to do error checking and return status What does On failure the err buffer will be updated mean? Will it clear err and replace it with a message, append to err, or something else? Does the message explain the context or is the caller responsible for adding to it? Does the message end with a newline or is the caller responsible for adding one when printing it out? For cases like this where lots of functions have a similar API, API comments start to become potentially repetitive. It might be better to explain conventions at the top of the file or in Documentation/technical/api-refs.txt and say See the top of the file for error handling conventions or Returns non-zero and appends a message to err on error. See Documentation/technical/api-refs.txt for more details on error handling. 13/40 ref_transaction_delete to check for error and return status Each successive commit dropped something from its subject. :) (First the (), then the verb.) Same comments as before about an example being useful for the log message and the API documentation on error handling being a bit vague. 14/40 make ref_transaction_begin take an err argument I found the failed to connect to mysql example instructive while doing reviews. Perhaps it would be worth mentioning in the commit message. Reviewed-by: Jonathan Nieder jrnie...@gmail.com 15/40 add transaction.status and track OPEN/CLOSED/ERROR It says an ERRORed transaction cannot be committed and can be rolled back by calling _free. Can a CLOSED transaction be committed or _freed? What does faild mean in the documentation comments? (Maybe non-OPEN?) In the previous version of this patch passing a non-OPEN transaction would die(BUG: ...) to diagnose the caller's mistake. Now I'm confused about the API: it seems you're allowed to pass a non-OPEN transaction but it doesn't append a message to 'err' in that case. Is this meant as a way to save the caller some typing, like fwrite/fclose do? (I've found people often make mistakes with the fwrite API fwiw but can understand the appeal of it.) Maybe with more context I'd like this. As is, it feels like a step in the wrong direction. 16/40 tag: use ref transactions when doing updates Reviewed-by: Jonathan Nieder jrnie...@gmail.com 17/40 replace: use ref transactions when doing updates Reviewed-by: Jonathan Nieder jrnie...@gmail.com 18/40 commit: use ref transactions for updates Reviewed-by: Jonathan Nieder jrnie...@gmail.com 19/40 sequencer: use ref transactions for all ref updates This would be a lot simpler if the ref_transaction_commit should not free the transaction patch came before it (yes, sorry, killing the fun). I can push the result of a rebase doing that somewhere if you like. 20/40 fast-import: change update_branch to use ref transactions Likewise. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
David Turner dtur...@twopensource.com writes: ... I still believe that the cache-tree behavior would be suboptimal, ... I do not think anybody doubts that suboptimal-ness in this thread. As you saw the incremental thing from Peff and my responses to it, there may be more things we could be doing. It just has not been at a ultra high priority, and if we can choose only one change from possibilities, losing the entire cache-tree upon switching branches, like in my two-way read-tree example, would be the thing that would give us the most benefit if we somehow change it. That however is unfortunately not a low-hanging fruit. The two-way merge machinery we use for switching branches wants to populate the index one entry at a time, without having any point where you can say OK the result in this subdirectory will exactly match this subtree which would allow us to say prime the cache tree for that subdirectory with this tree object. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: add strbuf_tolower function
On May 22, 2014, at 06:42, Jeff King wrote: [re-adding list cc] On Thu, May 22, 2014 at 03:16:45PM +0200, Christian Couder wrote: +void strbuf_tolower(struct strbuf *sb) +{ + char *p; + for (p = sb-buf; *p; p++) + *p = tolower(*p); +} Last time I tried a change like the above, I was told that strbufs are buffers that can contain NUL bytes. So maybe it should be: for (p = sb-buf; p sb-buf + sb-len; p++) *p = tolower(*p); Hah. I wrote it like that originally, and in review was asked to switch it. I agree that it might be worth keeping strbuf functions 8bit- clean. The original intent of the strbuf code was to make C strings easier to use, but I think we sometimes use them as general buffers, too. I didn't see this patch before I sent the other email, but this is the relevant part: On May 22, 2014, at 15:52, Kyle J. McKay wrote: The only reason I brought up the code difference in the first place was that the comment was This makes config's lowercase() function public which made me expect to see basically the equivalent of replacing a static with an extern, but actually the function being made public was a different implementation than config's lowercase() function. So it looks like the original PATCH 2/9 version of the code is best, but perhaps the comment can be tweaked a bit to not convey the same impression I got. Maybe something like Replace config's lowercase() function with a public one. Or even just delete the This makes config's lowercase() function public line after switching the implementation back. --Kyle -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Fri, May 23, 2014 at 5:18 AM, Junio C Hamano gits...@pobox.com wrote: ... and the incrementally repair Peff talks about would be to cover more cases where we may know (either because we have already computed it to write out a subtree, or we have just read from a known tree to populate a part of the index and we know the paths in the index that correspond to that subtree are exactly the same ones as found in the tree we read from) parts of the cache-tree can be updated with tree object names for subtrees, but we don't do anything right now. I wanted to do this but it's hard. For diff --cached (which should be where we find out and repair cache-tree), we flatten the trees in traverse_trees() and let unpack-trees figure out the differences against the index. The's no direct connection between a change and a tree. Maybe I missed something. David if you are interested in git status performance, this repairing thing could be important when the worktree is large because the diff cost increases accordingly if cache-tree is not fully populated. Empty cache-tree could cost us nearly the same time we save with avoiding opendir/readdir.. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git reset for index restoration?
On Fri, 2014-05-23 at 06:33 +0700, Duy Nguyen wrote: On Fri, May 23, 2014 at 5:18 AM, Junio C Hamano gits...@pobox.com wrote: ... and the incrementally repair Peff talks about would be to cover more cases where we may know (either because we have already computed it to write out a subtree, or we have just read from a known tree to populate a part of the index and we know the paths in the index that correspond to that subtree are exactly the same ones as found in the tree we read from) parts of the cache-tree can be updated with tree object names for subtrees, but we don't do anything right now. I wanted to do this but it's hard. For diff --cached (which should be where we find out and repair cache-tree), we flatten the trees in traverse_trees() and let unpack-trees figure out the differences against the index. The's no direct connection between a change and a tree. Maybe I missed something. David if you are interested in git status performance, this repairing thing could be important when the worktree is large because the diff cost increases accordingly if cache-tree is not fully populated. Empty cache-tree could cost us nearly the same time we save with avoiding opendir/readdir.. I am interested, and I believe I might be able to start looking into it in a week or two. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct
Elia Pinto gitter.spi...@gmail.com writes: I have no problems rerolling this simple patch, but i need to know what is the (git) right style in this case. If I were doing this... diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh index 66ce4b0..c1d0b23 100755 --- a/contrib/subtree/t/t7900-subtree.sh +++ b/contrib/subtree/t/t7900-subtree.sh @@ -8,7 +8,7 @@ This test verifies the basic operation of the merge, pull, add and split subcommands of git subtree. ' -export TEST_DIRECTORY=$(pwd)/../../../t +TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY ... I'd say I would make this two lines without , i.e. TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY because we are not doing anything about a failure of $(pwd), other than skipping the export. If we were failing the entire thing, it would be a different story, but at this point of the test, it is unreasonable to do something like: TEST_DIRECTORY=$(pwd)/../../../t export TEST_DIRECTORY || exit $? anyway. diff --git a/git-remote-testgit.sh b/git-remote-testgit.sh index 1c006a0..9d1ce76 100755 --- a/git-remote-testgit.sh +++ b/git-remote-testgit.sh @@ -13,7 +13,7 @@ refspec=${GIT_REMOTE_TESTGIT_REFSPEC-$default_refspec} test -z $refspec prefix=refs -export GIT_DIR=$url/.git +GIT_DIR=$url/.git export GIT_DIR force= Similarly, as we do not do anything if the assignment to GIT_DIR fails here, just like we ignore any failure from assignment to force, GIT_DIR=$url/.git export GIT_DIR would be sufficient. The logic before the mkdir -p of this script may want to be cleaned up (can you tell how if $refspec is empty, set prefix=refs makes any sense, and what its implications are, without thinking for more than 10 seconds?), but that is clearly a separate topic [*1*]. diff --git a/git-stash.sh b/git-stash.sh index 4798bcf..0bb1af8 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -94,7 +94,7 @@ create_stash () { # ease of unpacking later. u_commit=$( untracked_files | ( - export GIT_INDEX_FILE=$TMPindex + GIT_INDEX_FILE=$TMPindex export GIT_INDEX_FILE rm -f $TMPindex git update-index -z --add --remove --stdin u_tree=$(git write-tree) This one does care about failures. I'd rather have these on separate lines, i.e. GIT_INDEX_FILE=$TMPindex export GIT_INDEX_FILE rm -f $TMPindex ... [Footnote] *1* Perhaps something along these lines with the exact placements of blank lines to group similar things together: alias=$1 url=$2 if test -z ${GIT_REMOTE_TESTGIT_REFSPEC-notEmpty} then # only if it is explicitly set to an empty string... prefix=refs else prefix=refs/testgit/$alias fi dir=$GIT_DIR/testgit/$alias default_refspec=refs/heads/*:${prefix}/heads/* refspec=${GIT_REMOTE_TESTGIT_REFSPEC-$default_refspec} force= GIT_DIR=$url/.git export GIT_DIR -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only
Ronnie Sahlberg wrote: I hate rename_ref :-) I have reworked the transaction code to special case the deletion of the old ref for n/n - n and n - n/n renames so that we can carefully avoid n/n.lock files to exist or prevent the directory - file transition for n during these renames. Thanks. I suspect the REF_ISRENAME flag shouldn't be needed. Wouldn't something like the following work (in _commit)? Allocate work space Copy sort, and reject duplicate refs Acquire all locks while verifying old values This calls is_refname_available. If a refname is unavailable, goto slowpath. Perform updates first so live commits remain referenced. Perform deletes now that updates are safely completed. Done. slowpath: Acquire locks, telling is_refname_available not to worry about deleted refs. Lock packed-refs. Add all relevant refs to packed-refs (pack_if_possible_fn). Commit packed-refs. Unlink the corresponding loose refs so packed-refs becomes authoritative for them. Lock packed-refs. Perform updates and removals in the packed-refs cache. Commit packed-refs. Release locks. Done. This wouldn't be any slower for the case without D/F conflicts, and in the D/F conflict case, it should work for arbitrary transactions that want to remove one ref to make room for another. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only
Jonathan Nieder wrote: Ronnie Sahlberg wrote: I hate rename_ref :-) I have reworked the transaction code to special case the deletion of the old ref for n/n - n and n - n/n renames so that we can carefully avoid n/n.lock files to exist or prevent the directory - file transition for n during these renames. [...] Unlink the corresponding loose refs so packed-refs becomes authoritative for them. Lock packed-refs. Perform updates and removals in the packed-refs cache. Commit packed-refs. ... or is the problem that the reflogs conflict? How does rename_ref handle propagating the reflog from the old name to the new name, by the way? Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] completion: add missing options for git-merge
John Keeping j...@keeping.me.uk writes: The options added to __git_merge_options are those that git-pull also understands, since that variable is used by both commands. Those added directly in _git_merge() are specific to git-merge and are not supported by git-pull. Interesting. Technically, are not passed through would be more accurate than are not supported, as there may be a very good reason not to call git merge --frotz from git pull --frotz=nitfol. In such a case, git pull would _support_ the --frotz option, but in a different way and possibly with semantics different from git merge --frotz (i.e. option with value vs boolean), and completion of the --frotz=value option may need to be supported for git pull separately without using $__git_merge_options. The patch makes us think if --[no-]rerere-autoupdate should be a pass-thru option. You are unlikely to have seen the same conflict that will arise from a pull from another person for autoupdate to matter, but on the other hand, if you do have seen one and resolved it already, you may want the autoupdate to kick in, so accepting and passing it through may not be an unreasonable thing to do. As long as we declare that git pull --abort does not make any sense (which I think is a sane declaration), the distinction you made in [PATCH 1/2] will always be with us, whether we make --rerere-autoupdate pass-thru or not, so in that sense, both patches make sense to me. Thanks. Reported-by: Haralan Dobrev hkdob...@gmail.com Signed-off-by: John Keeping j...@keeping.me.uk --- contrib/completion/git-completion.bash | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index ff97c20..019026e 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1476,6 +1476,8 @@ _git_log () __git_merge_options= --no-commit --no-stat --log --no-log --squash --strategy --commit --stat --no-squash --ff --no-ff --ff-only --edit --no-edit + --verify-signatures --no-verify-signatures --gpg-sign + --quiet --verbose --progress --no-progress _git_merge () @@ -1484,7 +1486,8 @@ _git_merge () case $cur in --*) - __gitcomp $__git_merge_options + __gitcomp $__git_merge_options + --rerere-autoupdate --no-rerere-autoupdate --abort return esac __gitcomp_nl $(__git_refs) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[no subject]
-- Compliment of the day, I am Mrs. Jiang Ming, a staff of Lloyds TSB Group Plc. here in Hong Kong attached with Private Banking Services;I have a secured business proposal for you. Should you be interested please reach me on my private emailaddress (mrsjiangming1...@outlook.com) And after that I shall provide you with more details of my proposal. Your earliest response to this letter will be appreciated. Yours Sincerely, Jiang Ming -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 8/8] http: default text charset to iso-8859-1
On Thu, May 22, 2014 at 05:36:12AM -0400, Jeff King wrote: If we do want to do magic like latin1 is really iso-8859-1, that seems like the domain of iconv to me. If iconv doesn't handle it itself, I'd rather have a wrapper there. Putting it at that layer keeps the code cleaner, and it means the wrapper would benefit the regular commit-log reencoding code. I think being a little stricter in our character encoding actually benefits users. If someone claims that all their commit messages are in US-ASCII or ISO-8859-1, and then stuffs Windows-1252 in there, that's going to break a lot of stuff, especially if someone assumes US-ASCII means it's okay to use it where UTF-8 is required. It's much better to let people not insert broken stuff in the first place rather than deal with it afterwards. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187 signature.asc Description: Digital signature
Re: [PATCH v2] Add an explicit GIT_DIR to the list of excludes
On Thu, May 22, 2014 at 4:11 PM, Pasha Bolokhov pasha.bolok...@gmail.com wrote: diff --git a/t/t2205-add-gitdir.sh b/t/t2205-add-gitdir.sh new file mode 100755 index 000..3c6b853 --- /dev/null +++ b/t/t2205-add-gitdir.sh @@ -0,0 +1,61 @@ +#!/bin/sh +# +# Copyright (c) 2014 Pasha Bolokhov +# + +test_description='alternative repository path specified by --git-dir is ignored by add and status' + +. ./test-lib.sh + +test_expect_success setup ' + git --git-dir=meta init + for f in a b c d + do + echo DATA $f || exit 1 On this project, it's customary to say 'foo bar' (no whitespace after ''). Ditto for the rest of the file. + done + mkdir subdir1 + for f in e f g h + do + echo MORE DATA subdir1/$f || exit 1 + done + mkdir subdir1/meta + echo EVEN more Data subdir1/meta/aa + mkdir subdir1/ssubdir subdir1/ssubdir/meta + echo So much more data subdir1/ssubdir/meta/aaa +' + +test_expect_success 'git status' acknowledges files 'meta' if repository is not within work tree ' + test_might_fail rm -rf meta/ + ( + cd subdir1 + git --git-dir=../meta init Broken -chain. + git --git-dir=../meta --work-tree=. status --porcelain status2.out + test_might_fail grep meta status2.out out2 + test -s out2 + ) +' + +test_done -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 1/3] replace: add --graft option
The usage string for this option is: git replace [-f] --graft commit [parent...] First we create a new commit that is the same as commit except that its parents are [parents...] Then we create a replace ref that replace commit with the commit we just created. With this new option, it should be straightforward to convert grafts to replace refs, with something like: cat .git/info/grafts | while read line do git replace --graft $line; done Signed-off-by: Christian Couder chrisc...@tuxfamily.org --- builtin/replace.c | 84 ++- 1 file changed, 83 insertions(+), 1 deletion(-) diff --git a/builtin/replace.c b/builtin/replace.c index 1bb491d..9d1e82f 100644 --- a/builtin/replace.c +++ b/builtin/replace.c @@ -17,6 +17,7 @@ static const char * const git_replace_usage[] = { N_(git replace [-f] object replacement), N_(git replace [-f] --edit object), + N_(git replace [-f] --graft commit [parent...]), N_(git replace -d object...), N_(git replace [--format=format] [-l [pattern]]), NULL @@ -294,6 +295,76 @@ static int edit_and_replace(const char *object_ref, int force) return replace_object_sha1(object_ref, old, replacement, new, force); } +static int read_sha1_commit(const unsigned char *sha1, struct strbuf *dst) +{ + void *buf; + enum object_type type; + unsigned long size; + + buf = read_sha1_file(sha1, type, size); + if (!buf) + return error(_(cannot read object %s), sha1_to_hex(sha1)); + if (type != OBJ_COMMIT) { + free(buf); + return error(_(object %s is not a commit), sha1_to_hex(sha1)); + } + strbuf_attach(dst, buf, size, size + 1); + return 0; +} + +static int create_graft(int argc, const char **argv, int force) +{ + unsigned char old[20], new[20]; + const char *old_ref = argv[0]; + struct commit *commit; + struct strbuf buf = STRBUF_INIT; + struct strbuf new_parents = STRBUF_INIT; + const char *parent_start, *parent_end; + int i; + + if (get_sha1(old_ref, old) 0) + die(_(Not a valid object name: '%s'), old_ref); + commit = lookup_commit_or_die(old, old_ref); + if (read_sha1_commit(old, buf)) + die(_(Invalid commit: '%s'), old_ref); + + /* make sure the commit is not corrupt */ + if (parse_commit_buffer(commit, buf.buf, buf.len)) + die(_(Could not parse commit: '%s'), old_ref); + + /* find existing parents */ + parent_start = buf.buf; + parent_start += 46; /* tree + hex sha1 + \n */ + parent_end = parent_start; + + while (starts_with(parent_end, parent )) + parent_end += 48; /* parent + hex sha1 + \n */ + + /* prepare new parents */ + for (i = 1; i argc; i++) { + unsigned char sha1[20]; + if (get_sha1(argv[i], sha1) 0) + die(_(Not a valid object name: '%s'), argv[i]); + lookup_commit_or_die(sha1, argv[i]); + strbuf_addf(new_parents, parent %s\n, sha1_to_hex(sha1)); + } + + /* replace existing parents with new ones */ + strbuf_splice(buf, parent_start - buf.buf, parent_end - parent_start, + new_parents.buf, new_parents.len); + + if (write_sha1_file(buf.buf, buf.len, commit_type, new)) + die(_(could not write replacement commit for: '%s'), old_ref); + + strbuf_release(new_parents); + strbuf_release(buf); + + if (!hashcmp(old, new)) + return error(new commit is the same as the old one: '%s', sha1_to_hex(old)); + + return replace_object_sha1(old_ref, old, replacement, new, force); +} + int cmd_replace(int argc, const char **argv, const char *prefix) { int force = 0; @@ -303,12 +374,14 @@ int cmd_replace(int argc, const char **argv, const char *prefix) MODE_LIST, MODE_DELETE, MODE_EDIT, + MODE_GRAFT, MODE_REPLACE } cmdmode = MODE_UNSPECIFIED; struct option options[] = { OPT_CMDMODE('l', list, cmdmode, N_(list replace refs), MODE_LIST), OPT_CMDMODE('d', delete, cmdmode, N_(delete replace refs), MODE_DELETE), OPT_CMDMODE('e', edit, cmdmode, N_(edit existing object), MODE_EDIT), + OPT_CMDMODE('g', graft, cmdmode, N_(change a commit's parents), MODE_GRAFT), OPT_BOOL('f', force, force, N_(replace the ref if it exists)), OPT_STRING(0, format, format, N_(format), N_(use this format)), OPT_END() @@ -325,7 +398,10 @@ int cmd_replace(int argc, const char **argv, const char *prefix) usage_msg_opt(--format cannot be used when not listing, git_replace_usage, options); - if (force cmdmode != MODE_REPLACE
[PATCH v1 0/3] Add --graft option to git replace
Here is a small patch series to implement: git replace [-f] --graft commit [parent...] The changes since the RFC/PATCH are the following: - in patch 1/3, parse_commit_buffer() is now used to make sure commit is not corrupt - patch 2/3 add some tests - patch 3/3 add some documentation About the documentation, maybe we should add that --graft can now be used instead of grafts in .git/info/grafts, and maybe we could add an example that shows how it can be done. Christian Couder (3): replace: add --graft option replace: add test for --graft Documentation: replace: add --graft option Documentation/git-replace.txt | 8 + builtin/replace.c | 84 ++- t/t6050-replace.sh| 12 +++ 3 files changed, 103 insertions(+), 1 deletion(-) -- 1.9.rc0.17.g651113e -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 2/3] replace: add test for --graft
Signed-off-by: Christian Couder chrisc...@tuxfamily.org --- t/t6050-replace.sh | 12 1 file changed, 12 insertions(+) diff --git a/t/t6050-replace.sh b/t/t6050-replace.sh index 68b3cb2..ca45a84 100755 --- a/t/t6050-replace.sh +++ b/t/t6050-replace.sh @@ -351,4 +351,16 @@ test_expect_success 'replace ref cleanup' ' test -z $(git replace) ' +test_expect_success '--graft with and without already replaced object' ' + test $(git log --oneline | wc -l) = 7 + git replace --graft $HASH5 + test $(git log --oneline | wc -l) = 3 + git cat-file -p $HASH5 | test_must_fail grep parent + test_must_fail git replace --graft $HASH5 $HASH4 $HASH3 + git replace --force -g $HASH5 $HASH4 $HASH3 + git cat-file -p $HASH5 | grep parent $HASH4 + git cat-file -p $HASH5 | grep parent $HASH3 + git replace -d $HASH5 +' + test_done -- 1.9.rc0.17.g651113e -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html