Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter

2014-05-22 Thread Jeff King
On Wed, May 21, 2014 at 05:07:38PM -0700, Kyle J. McKay wrote:

 +p = skip_prefix(type-buf, text/plain);
 +if (!p || (*p  *p != ';'))
 +return 0;
 +
 +return 1;
 +}
 +
 
 I think that a strict reading of RFC 2616 allows text/plain ;
 charset=utf-8 as well as text/plain;charset=utf-8 and text/plain;
 charset=utf-8.  So perhaps this if line instead:
 
 + if (!p || (*p  *p != ';'  *p != ' '  *p != '\t'))
 
 See RFC 2616 section 2.2 for the definition of linear white space (LWS) and
 the end of section 2.1 for the implied *LWS rule that allows it.

You're right. There are a few exceptions in 3.7:

  Linear white space (LWS) MUST NOT be used between the type and
  subtype, nor between an attribute and its value.

but it does not include between the subtype and parameter. So the
find_parameter also needs to accept the collapsed whitespace, too, I
guess.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] remote-curl: reencode http error messages

2014-05-22 Thread Jeff King
On Wed, May 21, 2014 at 05:07:40PM -0700, Kyle J. McKay wrote:

 +/* default charset from rfc2616 */
 +if (!*charset)
 +*charset = xstrdup(iso8859-1);
 
 Actually the name should be ISO-8859-1.  See RFC 2616 section 3.7.1.
 Since it's case insensitive iso-8859-1 would be fine too.

Thanks, will fix.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 10/19] git-submodule.sh: convert test -a/-o to and ||

2014-05-22 Thread Matthieu Moy
Elia Pinto gitter.spi...@gmail.com writes:

 This is version 2 of the patch to git-submodule of the 
 patch series convert  test -a/-o to  and ||. 
 It contains the fixes identified by Johannes and
 Matthieu. 

This version of the patch (not the whole series) is

Reviewed-by: Matthieu Moy matthieu@imag.fr

Some comments for next time:

* I agree with Johannes that splitting the series with one patch per
  file is not the right way to split. As a reviewer, I want to

  - check that -a trivially translates to 
  - check that -o trivially translates to ||
  - check non-trivial cases

  Interleaving these cases (especially the trivial and non-trivial
  cases) makes the review much harder. For this kind of series, the
  change is trivial, but the review is not (Johannes caught a || Vs 
  inversion that I didn't find for example, which is quite serious), so
  the optimize your patches for review is even more important than
  usual.

* Again, to avoid mixing trivial and non-trivial changes, ...

 @@ -1059,13 +1059,17 @@ cmd_summary() {
   while read mod_src mod_dst sha1_src sha1_dst status sm_path
   do
   # Always show modules deleted or type-changed 
 (blob-module)
 - test $status = D -o $status = T  echo $sm_path  
 continue
 + case $status in
 + [DT])
 + printf '%s\n' $sm_path 
 + continue
 + esac

turning a echo into a printf is good, but would be better done
separately.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter

2014-05-22 Thread Peter Krefting

Kyle J. McKay:

I think that a strict reading of RFC 2616 allows text/plain ; charset=utf-8 
as well as text/plain;charset=utf-8 and text/plain; charset=utf-8.


It does indeed, and I have seen servers send both variants, so they do 
need to be catered for.


The number of servers that would actually send the charset attribute 
here (for error messages) are probably not that many. It is probably a 
good idea to make the default user-configurable (I know the specs 
state that anything undeclared is iso-8859-1, but the real world 
doesn't agree to that).


--
\\// Peter - http://www.softwolves.pp.se/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] remote-curl: reencode http error messages

2014-05-22 Thread Peter Krefting

Kyle J. McKay:


+   if (!*charset)
+   *charset = xstrdup(iso8859-1);


Actually the name should be ISO-8859-1.  See RFC 2616 section 3.7.1.  Since 
it's case insensitive iso-8859-1 would be fine too.


You'd be amazed at what you see in the wild... I'd recommend going 
with the recommended algorithm from WHATWG's Encoding Standard, if you 
want to make this robust: 
http://encoding.spec.whatwg.org/#names-and-labels.


The spec is partly based on a lot of research I made in my previous 
$DAYJOB, with a lot of research added by the spec writer.


There is also Unicode's attempt at it, but it does unfortunately 
produce too many false positives: http://www.unicode.org/reports/tr22/tr22-7.html#Charset_Alias_Matching


--
\\// Peter - http://www.softwolves.pp.se/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter

2014-05-22 Thread Kyle J. McKay

On May 21, 2014, at 23:05, Jeff King wrote:


On Wed, May 21, 2014 at 05:07:38PM -0700, Kyle J. McKay wrote:


+   p = skip_prefix(type-buf, text/plain);
+   if (!p || (*p  *p != ';'))
+   return 0;
+
+   return 1;
+}
+


I think that a strict reading of RFC 2616 allows text/plain ;
charset=utf-8 as well as text/plain;charset=utf-8 and text/plain;
charset=utf-8.  So perhaps this if line instead:

+   if (!p || (*p  *p != ';'  *p != ' '  *p != '\t'))

See RFC 2616 section 2.2 for the definition of linear white space  
(LWS) and

the end of section 2.1 for the implied *LWS rule that allows it.


You're right. There are a few exceptions in 3.7:

 Linear white space (LWS) MUST NOT be used between the type and
 subtype, nor between an attribute and its value.

but it does not include between the subtype and parameter. So the
find_parameter also needs to accept the collapsed whitespace, too, I
guess.


Yeah I think so too.  It's probably enough though just to just strip  
all   and \t characters at the same time the content type is  
lowercased.  While that would cause invalid content types such as  
text / plain to be recognized it would keep the rest of the code  
simpler.  Since a producer of an invalid content type shouldn't really  
be depending on any particular behavior by a receiver of an invalid  
content type it's probably not an issue.


--Kyle
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 10/19] git-submodule.sh: convert test -a/-o to and ||

2014-05-22 Thread Elia Pinto
2014-05-22 8:49 GMT+02:00 Matthieu Moy matthieu@grenoble-inp.fr:
 Elia Pinto gitter.spi...@gmail.com writes:

 This is version 2 of the patch to git-submodule of the
 patch series convert  test -a/-o to  and ||.
 It contains the fixes identified by Johannes and
 Matthieu.

 This version of the patch (not the whole series) is

 Reviewed-by: Matthieu Moy matthieu@imag.fr

 Some comments for next time:

 * I agree with Johannes that splitting the series with one patch per
   file is not the right way to split. As a reviewer, I want to

   - check that -a trivially translates to 
   - check that -o trivially translates to ||
   - check non-trivial cases

   Interleaving these cases (especially the trivial and non-trivial
   cases) makes the review much harder. For this kind of series, the
   change is trivial, but the review is not (Johannes caught a || Vs 
   inversion that I didn't find for example, which is quite serious), so
   the optimize your patches for review is even more important than
   usual.

 * Again, to avoid mixing trivial and non-trivial changes, ...

First of all, thank you. I will take note of your valuable suggestions
for the future.


 @@ -1059,13 +1059,17 @@ cmd_summary() {
   while read mod_src mod_dst sha1_src sha1_dst status sm_path
   do
   # Always show modules deleted or type-changed 
 (blob-module)
 - test $status = D -o $status = T  echo $sm_path  
 continue
 + case $status in
 + [DT])
 + printf '%s\n' $sm_path 
 + continue
 + esac

 turning a echo into a printf is good, but would be better done
 separately.

I had thought, but the change was in the fix of Johannes, and it did
not think was right to change this, especially that it was right
anyway. But I understand very well the observation.

Best Regards

 --
 Matthieu Moy
 http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 12:27:38AM -0700, Kyle J. McKay wrote:

 Yeah I think so too.  It's probably enough though just to just strip all  
 and \t characters at the same time the content type is lowercased.  While
 that would cause invalid content types such as text / plain to be
 recognized it would keep the rest of the code simpler.  Since a producer of
 an invalid content type shouldn't really be depending on any particular
 behavior by a receiver of an invalid content type it's probably not an
 issue.

I think you have to retain whitespace between parameters. Not that I
expect there to be multiple parameters in a text/plain, but it's
allowed.

I started doing this path of trying to produce a normalized version, but
found that it was just as much code as parsing at all (and it still
leaves the calling code to do its own parse). Instead, what I ended up
with was just doing the parsing in http.c into nice, normalized chunks,
and then the calling code can compare them with simple strcmps.

I'll post the patches in a few minutes.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 08:12:58AM +0100, Peter Krefting wrote:

 Kyle J. McKay:
 
 I think that a strict reading of RFC 2616 allows text/plain ;
 charset=utf-8 as well as text/plain;charset=utf-8 and text/plain;
 charset=utf-8.
 
 It does indeed, and I have seen servers send both variants, so they do need
 to be catered for.
 
 The number of servers that would actually send the charset attribute here
 (for error messages) are probably not that many. It is probably a good idea
 to make the default user-configurable (I know the specs state that anything
 undeclared is iso-8859-1, but the real world doesn't agree to that).

I was really hoping to avoid getting into all of the real-world
messiness that a real http client needs to deal with (as opposed to just
following the standards). This is only used for an optional relay of
error messages from the server. Most servers will send back text/html by
default, and only those which specifically configure text/plain will
have their messages shown. IOW, I expect this to be configured
specifically for git messages, and the server admin can make sure they
are following the standard and that it works with git.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 10/19] git-submodule.sh: convert test -a/-o to and ||

2014-05-22 Thread Johannes Sixt
Am 5/22/2014 10:38, schrieb Elia Pinto:
 2014-05-22 8:49 GMT+02:00 Matthieu Moy matthieu@grenoble-inp.fr:
 Elia Pinto gitter.spi...@gmail.com writes:
 @@ -1059,13 +1059,17 @@ cmd_summary() {
   while read mod_src mod_dst sha1_src sha1_dst status sm_path
   do
   # Always show modules deleted or type-changed 
 (blob-module)
 - test $status = D -o $status = T  echo $sm_path  
 continue
 + case $status in
 + [DT])
 + printf '%s\n' $sm_path 
 + continue
 + esac

 turning a echo into a printf is good, but would be better done
 separately.
 
 I had thought, but the change was in the fix of Johannes, and it did
 not think was right to change this, especially that it was right
 anyway. But I understand very well the observation.

My intent was to show the final state of this piece of code. I do agree
with Matthieu that the change from 'echo' to 'printf' is a different topic
(in particular, since this is not the only point in the script that would
need that change). Sorry for having sent you in circles.

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/9] handle alternate charsets for remote http errors

2014-05-22 Thread Jeff King
On Wed, May 21, 2014 at 06:25:24AM -0400, Jeff King wrote:

 As of commit 426e70d (remote-curl: show server content on http errors,
 2013-04-05), we relay any text/plain errors shown by the remote http
 server to the user. However, we were lazy back then and left this TODO
 in place:
 
/*
 * We only show text/plain parts, as other types are likely
 * to be ugly to look at on the user's terminal.
 *
 * TODO should handle ; charset=XXX, and re-encode into
 * logoutputencoding
 */
 
 This series actually implements that, along with a few other cleanups.

Here's a second version based on feedback from Kyle and Peter. Thanks
both for your comments.

It drops the tolower patches, which are not used anymore, and makes
the parsing of content-types and their parameters a bit more robust.

  [1/8]: test-lib: preserve GIT_CURL_VERBOSE from the environment
  [2/8]: t/lib-httpd: use write_script to copy CGI scripts
  [3/8]: t5550: test display of remote http error messages

These three are the same as before.

  [4/8]: http: extract type/subtype portion of content-type

Make our content-type matching more robust, both for the errors and
for matching smart-http types.

  [5/8]: http: optionally extract charset parameter from content-type

Feature work to support 7/8.

  [6/8]: strbuf: add strbuf_reencode helper

Same as before (feature work to support 7/8).

  [7/8]: remote-curl: reencode http error messages

The actual fix.

  [8/8]: http: default text charset to iso-8859-1

This could be part of 5/8, but I floated it to the top of the heap
to make it easier to discuss/adjust it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/8] test-lib: preserve GIT_CURL_VERBOSE from the environment

2014-05-22 Thread Jeff King
Turning on this variable can be useful when debugging http
tests. It does break a few tests in t5541, but it is not
a variable that the user is likely to have enabled
accidentally.

Signed-off-by: Jeff King p...@peff.net
---
 t/test-lib.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index c081668..f7e55b1 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -91,6 +91,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $($PERL_PATH -e '
VALGRIND
UNZIP
PERF_
+   CURL_VERBOSE
));
my @vars = grep(/^GIT_/  !/^GIT_($ok)/o, @env);
print join(\n, @vars);
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/8] t/lib-httpd: use write_script to copy CGI scripts

2014-05-22 Thread Jeff King
Using write_script will set our shebang line appropriately
with $SHELL_PATH. The script that is there now is quite
simple and likely to succeed even with a non-POSIX /bin/sh,
but it does not hurt to be defensive.

Signed-off-by: Jeff King p...@peff.net
---
 t/lib-httpd.sh   | 6 +-
 t/lib-httpd/broken-smart-http.sh | 1 -
 2 files changed, 5 insertions(+), 2 deletions(-)
 mode change 100755 = 100644 t/lib-httpd/broken-smart-http.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 252cbf1..8e680ef 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -105,10 +105,14 @@ else
Could not identify web server at '$LIB_HTTPD_PATH'
 fi
 
+install_script () {
+   write_script $HTTPD_ROOT_PATH/$1 $TEST_PATH/$1
+}
+
 prepare_httpd() {
mkdir -p $HTTPD_DOCUMENT_ROOT_PATH
cp $TEST_PATH/passwd $HTTPD_ROOT_PATH
-   cp $TEST_PATH/broken-smart-http.sh $HTTPD_ROOT_PATH
+   install_script broken-smart-http.sh
 
ln -s $LIB_HTTPD_MODULE_PATH $HTTPD_ROOT_PATH/modules
 
diff --git a/t/lib-httpd/broken-smart-http.sh b/t/lib-httpd/broken-smart-http.sh
old mode 100755
new mode 100644
index f7ebfff..82cc610
--- a/t/lib-httpd/broken-smart-http.sh
+++ b/t/lib-httpd/broken-smart-http.sh
@@ -1,4 +1,3 @@
-#!/bin/sh
 printf Content-Type: text/%s\n html
 echo
 printf %s\n 001e# service=git-upload-pack
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/8] t5550: test display of remote http error messages

2014-05-22 Thread Jeff King
Since commit 426e70d (remote-curl: show server content on
http errors, 2013-04-05), we relay any text/plain error
messages from the remote server to the user. However, we
never tested it.

Signed-off-by: Jeff King p...@peff.net
---
 t/lib-httpd.sh |  1 +
 t/lib-httpd/apache.conf|  4 
 t/lib-httpd/error.sh   | 17 +
 t/t5550-http-fetch-dumb.sh | 10 ++
 4 files changed, 32 insertions(+)
 create mode 100755 t/lib-httpd/error.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 8e680ef..f7640be 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -113,6 +113,7 @@ prepare_httpd() {
mkdir -p $HTTPD_DOCUMENT_ROOT_PATH
cp $TEST_PATH/passwd $HTTPD_ROOT_PATH
install_script broken-smart-http.sh
+   install_script error.sh
 
ln -s $LIB_HTTPD_MODULE_PATH $HTTPD_ROOT_PATH/modules
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 3a03e82..b384d79 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -97,12 +97,16 @@ Alias /auth/dumb/ www/auth/dumb/
 /LocationMatch
 ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
 ScriptAlias /broken_smart/ broken-smart-http.sh/
+ScriptAlias /error/ error.sh/
 Directory ${GIT_EXEC_PATH}
Options FollowSymlinks
 /Directory
 Files broken-smart-http.sh
Options ExecCGI
 /Files
+Files error.sh
+  Options ExecCGI
+/Files
 Files ${GIT_EXEC_PATH}/git-http-backend
Options ExecCGI
 /Files
diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh
new file mode 100755
index 000..786f281
--- /dev/null
+++ b/t/lib-httpd/error.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+printf Status: 500 Intentional Breakage\n
+
+printf Content-Type: 
+case $PATH_INFO in
+*html*)
+   printf text/html
+   ;;
+*text*)
+   printf text/plain
+   ;;
+esac
+printf \n
+
+printf \n
+printf this is the error message\n
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 1a3a2b6..13defd3 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -171,5 +171,15 @@ test_expect_success 'did not use upload-pack service' '
test_cmp exp act
 '
 
+test_expect_success 'git client shows text/plain errors' '
+   test_must_fail git clone $HTTPD_URL/error/text 2stderr 
+   grep this is the error message stderr
+'
+
+test_expect_success 'git client does not show html errors' '
+   test_must_fail git clone $HTTPD_URL/error/html 2stderr 
+   ! grep this is the error message stderr
+'
+
 stop_httpd
 test_done
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/8] http: extract type/subtype portion of content-type

2014-05-22 Thread Jeff King
When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.

This could cause two visible behaviors:

  1. We might fail to recognize a smart-http server by its
 content-type.

  2. We might fail to relay text/plain error messages to
 users (especially if they contain a charset parameter).

This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.

Signed-off-by: Jeff King p...@peff.net
---
 http.c | 32 +---
 remote-curl.c  |  2 +-
 t/lib-httpd/error.sh   |  8 +++-
 t/t5550-http-fetch-dumb.sh |  5 +
 4 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/http.c b/http.c
index 94e1afd..4edf5b9 100644
--- a/http.c
+++ b/http.c
@@ -906,6 +906,29 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, 
struct strbuf *buf)
return ret;
 }
 
+/*
+ * Extract a normalized version of the content type, with any
+ * spaces suppressed, all letters lowercased, and no trailing ;
+ * or parameters.
+ *
+ * Example:
+ *   TEXT/PLAIN; charset=utf-8 - text/plain
+ */
+static void extract_content_type(struct strbuf *raw, struct strbuf *type)
+{
+   const char *p;
+
+   strbuf_reset(type);
+   strbuf_grow(type, raw-len);
+   for (p = raw-buf; *p; p++) {
+   if (isspace(*p))
+   continue;
+   if (*p == ';')
+   break;
+   strbuf_addch(type, tolower(*p));
+   }
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF0
 #define HTTP_REQUEST_FILE  1
@@ -957,9 +980,12 @@ static int http_request(const char *url,
 
ret = run_one_slot(slot, results);
 
-   if (options  options-content_type)
-   curlinfo_strbuf(slot-curl, CURLINFO_CONTENT_TYPE,
-   options-content_type);
+   if (options  options-content_type) {
+   struct strbuf raw = STRBUF_INIT;
+   curlinfo_strbuf(slot-curl, CURLINFO_CONTENT_TYPE, raw);
+   extract_content_type(raw, options-content_type);
+   strbuf_release(raw);
+   }
 
if (options  options-effective_url)
curlinfo_strbuf(slot-curl, CURLINFO_EFFECTIVE_URL,
diff --git a/remote-curl.c b/remote-curl.c
index 52c2d96..a5ab977 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -205,7 +205,7 @@ static int show_http_message(struct strbuf *type, struct 
strbuf *msg)
 * TODO should handle ; charset=XXX, and re-encode into
 * logoutputencoding
 */
-   if (strcasecmp(type-buf, text/plain))
+   if (strcmp(type-buf, text/plain))
return -1;
 
strbuf_trim(msg);
diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh
index 786f281..23cec97 100755
--- a/t/lib-httpd/error.sh
+++ b/t/lib-httpd/error.sh
@@ -3,6 +3,7 @@
 printf Status: 500 Intentional Breakage\n
 
 printf Content-Type: 
+charset=iso-8859-1
 case $PATH_INFO in
 *html*)
printf text/html
@@ -10,8 +11,13 @@ case $PATH_INFO in
 *text*)
printf text/plain
;;
+*charset*)
+   printf text/plain; charset=utf-8
+   charset=utf-8
+   ;;
 esac
 printf \n
 
 printf \n
-printf this is the error message\n
+printf this is the error message\n |
+iconv -f us-ascii -t $charset
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 13defd3..b35b261 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -181,5 +181,10 @@ test_expect_success 'git client does not show html errors' 
'
! grep this is the error message stderr
 '
 
+test_expect_success 'git client shows text/plain with a charset' '
+   test_must_fail git clone $HTTPD_URL/error/charset 2stderr 
+   grep this is the error message stderr
+'
+
 stop_httpd
 test_done
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 5/8] http: optionally extract charset parameter from content-type

2014-05-22 Thread Jeff King
Since the previous commit, we now give a sanitized,
shortened version of the content-type header to any callers
who ask for it.

This patch adds back a way for them to cleanly access
specific parameters to the type. We could easily extract all
parameters and make them available via a string_list, but:

  1. That complicates the interface and memory management.

  2. In practice, no planned callers care about anything
 except the charset.

This patch therefore goes with the simplest thing, and we
can expand or change the interface later if it becomes
necessary.

Signed-off-by: Jeff King p...@peff.net
---
 http.c | 54 ++
 http.h |  7 +++
 2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/http.c b/http.c
index 4edf5b9..e26ee8b 100644
--- a/http.c
+++ b/http.c
@@ -907,14 +907,44 @@ static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO 
info, struct strbuf *buf)
 }
 
 /*
+ * Check for and extract a content-type parameter. raw
+ * should be positioned at the start of the potential
+ * parameter, with any whitespace already removed.
+ *
+ * name is the name of the parameter. The value is appended
+ * to out.
+ */
+static int extract_param(const char *raw, const char *name,
+struct strbuf *out)
+{
+   size_t len = strlen(name);
+
+   if (strncasecmp(raw, name, len))
+   return -1;
+   raw += len;
+
+   if (*raw != '=')
+   return -1;
+   raw++;
+
+   while (*raw  !isspace(*raw))
+   strbuf_addch(out, *raw++);
+   return 0;
+}
+
+/*
  * Extract a normalized version of the content type, with any
  * spaces suppressed, all letters lowercased, and no trailing ;
  * or parameters.
  *
+ * If the charset argument is not NULL, store the value of any
+ * charset parameter there.
+ *
  * Example:
- *   TEXT/PLAIN; charset=utf-8 - text/plain
+ *   TEXT/PLAIN; charset=utf-8 - text/plain, utf-8
  */
-static void extract_content_type(struct strbuf *raw, struct strbuf *type)
+static void extract_content_type(struct strbuf *raw, struct strbuf *type,
+struct strbuf *charset)
 {
const char *p;
 
@@ -923,10 +953,25 @@ static void extract_content_type(struct strbuf *raw, 
struct strbuf *type)
for (p = raw-buf; *p; p++) {
if (isspace(*p))
continue;
-   if (*p == ';')
+   if (*p == ';') {
+   p++;
break;
+   }
strbuf_addch(type, tolower(*p));
}
+
+   if (!charset)
+   return;
+
+   strbuf_reset(charset);
+   while (*p) {
+   while (isspace(*p))
+   p++;
+   if (!extract_param(p, charset, charset))
+   return;
+   while (*p  !isspace(*p))
+   p++;
+   }
 }
 
 /* http_request() targets */
@@ -983,7 +1028,8 @@ static int http_request(const char *url,
if (options  options-content_type) {
struct strbuf raw = STRBUF_INIT;
curlinfo_strbuf(slot-curl, CURLINFO_CONTENT_TYPE, raw);
-   extract_content_type(raw, options-content_type);
+   extract_content_type(raw, options-content_type,
+options-charset);
strbuf_release(raw);
}
 
diff --git a/http.h b/http.h
index e64084f..473179b 100644
--- a/http.h
+++ b/http.h
@@ -144,6 +144,13 @@ struct http_get_options {
struct strbuf *content_type;
 
/*
+* If non-NULL, and content_type above is non-NULL, returns
+* the charset parameter from the content-type. If none is
+* present, returns an empty string.
+*/
+   struct strbuf *charset;
+
+   /*
 * If non-NULL, returns the URL we ended up at, including any
 * redirects we followed.
 */
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 6/8] strbuf: add strbuf_reencode helper

2014-05-22 Thread Jeff King
This is a convenience wrapper around `reencode_string_len`
and `strbuf_attach`.

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/technical/api-strbuf.txt |  5 +
 strbuf.c   | 17 +
 strbuf.h   |  1 +
 3 files changed, 23 insertions(+)

diff --git a/Documentation/technical/api-strbuf.txt 
b/Documentation/technical/api-strbuf.txt
index 3350d97..9d28b03 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -125,6 +125,11 @@ Functions
 
Strip whitespace from the end of a string.
 
+`strbuf_reencode`::
+
+   Replace the contents of the strbuf with a reencoded form.  Returns -1
+   on error, 0 on success.
+
 `strbuf_cmp`::
 
Compare two buffers. Returns an integer less than, equal to, or greater
diff --git a/strbuf.c b/strbuf.c
index ee96dcf..fc7290f 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,5 +1,6 @@
 #include cache.h
 #include refs.h
+#include utf8.h
 
 int starts_with(const char *str, const char *prefix)
 {
@@ -106,6 +107,22 @@ void strbuf_ltrim(struct strbuf *sb)
sb-buf[sb-len] = '\0';
 }
 
+int strbuf_reencode(struct strbuf *sb, const char *from, const char *to)
+{
+   char *out;
+   int len;
+
+   if (same_encoding(from, to))
+   return 0;
+
+   out = reencode_string_len(sb-buf, sb-len, to, from, len);
+   if (!out)
+   return -1;
+
+   strbuf_attach(sb, out, len, len);
+   return 0;
+}
+
 struct strbuf **strbuf_split_buf(const char *str, size_t slen,
 int terminator, int max)
 {
diff --git a/strbuf.h b/strbuf.h
index 39c14cf..4e9a2f8 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -45,6 +45,7 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t 
len)
 extern void strbuf_trim(struct strbuf *);
 extern void strbuf_rtrim(struct strbuf *);
 extern void strbuf_ltrim(struct strbuf *);
+extern int strbuf_reencode(struct strbuf *sb, const char *from, const char 
*to);
 extern int strbuf_cmp(const struct strbuf *, const struct strbuf *);
 
 /*
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 7/8] remote-curl: reencode http error messages

2014-05-22 Thread Jeff King
We currently recognize an error message with a content-type
text/plain; charset=utf-16 as text, but we ignore the
charset parameter entirely. Let's encode it to
log_output_encoding, which is presumably something the
user's terminal can handle.

Signed-off-by: Jeff King p...@peff.net
---
 remote-curl.c  | 17 ++---
 t/lib-httpd/error.sh   |  4 
 t/t5550-http-fetch-dumb.sh |  5 +
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index a5ab977..4493b38 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -194,19 +194,19 @@ static void free_discovery(struct discovery *d)
}
 }
 
-static int show_http_message(struct strbuf *type, struct strbuf *msg)
+static int show_http_message(struct strbuf *type, struct strbuf *charset,
+struct strbuf *msg)
 {
const char *p, *eol;
 
/*
 * We only show text/plain parts, as other types are likely
 * to be ugly to look at on the user's terminal.
-*
-* TODO should handle ; charset=XXX, and re-encode into
-* logoutputencoding
 */
if (strcmp(type-buf, text/plain))
return -1;
+   if (charset-len)
+   strbuf_reencode(msg, charset-buf, get_log_output_encoding());
 
strbuf_trim(msg);
if (!msg-len)
@@ -225,6 +225,7 @@ static struct discovery* discover_refs(const char *service, 
int for_push)
 {
struct strbuf exp = STRBUF_INIT;
struct strbuf type = STRBUF_INIT;
+   struct strbuf charset = STRBUF_INIT;
struct strbuf buffer = STRBUF_INIT;
struct strbuf refs_url = STRBUF_INIT;
struct strbuf effective_url = STRBUF_INIT;
@@ -249,6 +250,7 @@ static struct discovery* discover_refs(const char *service, 
int for_push)
 
memset(options, 0, sizeof(options));
options.content_type = type;
+   options.charset = charset;
options.effective_url = effective_url;
options.base_url = url;
options.no_cache = 1;
@@ -259,13 +261,13 @@ static struct discovery* discover_refs(const char 
*service, int for_push)
case HTTP_OK:
break;
case HTTP_MISSING_TARGET:
-   show_http_message(type, buffer);
+   show_http_message(type, charset, buffer);
die(repository '%s' not found, url.buf);
case HTTP_NOAUTH:
-   show_http_message(type, buffer);
+   show_http_message(type, charset, buffer);
die(Authentication failed for '%s', url.buf);
default:
-   show_http_message(type, buffer);
+   show_http_message(type, charset, buffer);
die(unable to access '%s': %s, url.buf, curl_errorstr);
}
 
@@ -310,6 +312,7 @@ static struct discovery* discover_refs(const char *service, 
int for_push)
strbuf_release(refs_url);
strbuf_release(exp);
strbuf_release(type);
+   strbuf_release(charset);
strbuf_release(effective_url);
strbuf_release(buffer);
last_discovery = last;
diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh
index 23cec97..eafc9d2 100755
--- a/t/lib-httpd/error.sh
+++ b/t/lib-httpd/error.sh
@@ -15,6 +15,10 @@ case $PATH_INFO in
printf text/plain; charset=utf-8
charset=utf-8
;;
+*utf16*)
+   printf text/plain; charset=utf-16
+   charset=utf-16
+   ;;
 esac
 printf \n
 
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index b35b261..01b8aae 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -186,5 +186,10 @@ test_expect_success 'git client shows text/plain with a 
charset' '
grep this is the error message stderr
 '
 
+test_expect_success 'http error messages are reencoded' '
+   test_must_fail git clone $HTTPD_URL/error/utf16 2stderr 
+   grep this is the error message stderr
+'
+
 stop_httpd
 test_done
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 8/8] http: default text charset to iso-8859-1

2014-05-22 Thread Jeff King
This is specified by RFC 2616 as the default if no charset
parameter is given.

Signed-off-by: Jeff King p...@peff.net
---
I'd prefer to do this simple, standard thing, and see how it works in
the real world. We'll hand whatever we get off to iconv, and if it
chokes, we'll pass through the data as-is. That should be enough for
most ascii messages to make it through readable, even if we get the
encoding wrong.

If we do want to do magic like latin1 is really iso-8859-1, that seems
like the domain of iconv to me. If iconv doesn't handle it itself, I'd
rather have a wrapper there. Putting it at that layer keeps the code
cleaner, and it means the wrapper would benefit the regular commit-log
reencoding code.

If anybody wants to go further in that direction, be my guest, but please
make your suggestions in the form of patches which apply on top. :)

 http.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/http.c b/http.c
index e26ee8b..a37e84e 100644
--- a/http.c
+++ b/http.c
@@ -972,6 +972,9 @@ static void extract_content_type(struct strbuf *raw, struct 
strbuf *type,
while (*p  !isspace(*p))
p++;
}
+
+   if (!charset-len  starts_with(type-buf, text/))
+   strbuf_addstr(charset, ISO-8859-1);
 }
 
 /* http_request() targets */
-- 
2.0.0.rc1.436.g03cb729
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] tolower cleanups

2014-05-22 Thread Jeff King
These two patches were pulled from the http charset series I posted
nearby. The second iteration of that series did not need them, but they
may have value as cleanups.

  [1/2]: daemon/config: factor out duplicate xstrdup_tolower
  [2/2]: strbuf: add strbuf_tolower function

The first one is a real reduction of code. The second is just making
code public that _might_ get used later, so it's a bit less exciting.
Both are independent of each other, so we can take or leave them
separately.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] daemon/config: factor out duplicate xstrdup_tolower

2014-05-22 Thread Jeff King
We have two implementations of the same function; let's drop
that to one. We take the name from daemon.c, but the
implementation (which is just slightly more efficient) from
the config code.

Signed-off-by: Jeff King p...@peff.net
---
 builtin/config.c | 15 +--
 daemon.c |  8 
 strbuf.c | 13 +
 strbuf.h |  2 ++
 4 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 5677c94..fcd8474 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -395,19 +395,6 @@ static int urlmatch_collect_fn(const char *var, const char 
*value, void *cb)
return 0;
 }
 
-static char *dup_downcase(const char *string)
-{
-   char *result;
-   size_t len, i;
-
-   len = strlen(string);
-   result = xmalloc(len + 1);
-   for (i = 0; i  len; i++)
-   result[i] = tolower(string[i]);
-   result[i] = '\0';
-   return result;
-}
-
 static int get_urlmatch(const char *var, const char *url)
 {
char *section_tail;
@@ -422,7 +409,7 @@ static int get_urlmatch(const char *var, const char *url)
if (!url_normalize(url, config.url))
die(%s, config.url.err);
 
-   config.section = dup_downcase(var);
+   config.section = xstrdup_tolower(var);
section_tail = strchr(config.section, '.');
if (section_tail) {
*section_tail = '\0';
diff --git a/daemon.c b/daemon.c
index eba1255..f9c63e9 100644
--- a/daemon.c
+++ b/daemon.c
@@ -475,14 +475,6 @@ static void make_service_overridable(const char *name, int 
ena)
die(No such service %s, name);
 }
 
-static char *xstrdup_tolower(const char *str)
-{
-   char *p, *dup = xstrdup(str);
-   for (p = dup; *p; p++)
-   *p = tolower(*p);
-   return dup;
-}
-
 static void parse_host_and_port(char *hostport, char **host,
char **port)
 {
diff --git a/strbuf.c b/strbuf.c
index ee96dcf..854c725 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -570,3 +570,16 @@ int fprintf_ln(FILE *fp, const char *fmt, ...)
return -1;
return ret + 1;
 }
+
+char *xstrdup_tolower(const char *string)
+{
+   char *result;
+   size_t len, i;
+
+   len = strlen(string);
+   result = xmalloc(len + 1);
+   for (i = 0; i  len; i++)
+   result[i] = tolower(string[i]);
+   result[i] = '\0';
+   return result;
+}
diff --git a/strbuf.h b/strbuf.h
index 39c14cf..4de7531 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -183,4 +183,6 @@ extern int printf_ln(const char *fmt, ...);
 __attribute__((format (printf,2,3)))
 extern int fprintf_ln(FILE *fp, const char *fmt, ...);
 
+char *xstrdup_tolower(const char *);
+
 #endif /* STRBUF_H */
-- 
2.0.0.rc1.436.g03cb729

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] strbuf: add strbuf_tolower function

2014-05-22 Thread Jeff King
This makes config's lowercase() function public.

Note that we could continue to offer a pure-string
lowercase, but there would be no callers (in most
pure-string cases, we actually duplicate and lowercase the
duplicate).

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/technical/api-strbuf.txt | 4 
 config.c   | 8 +---
 strbuf.c   | 7 +++
 strbuf.h   | 1 +
 4 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/api-strbuf.txt 
b/Documentation/technical/api-strbuf.txt
index 3350d97..8480f89 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -125,6 +125,10 @@ Functions
 
Strip whitespace from the end of a string.
 
+`strbuf_tolower`::
+
+   Lowercase each character in the buffer using `tolower`.
+
 `strbuf_cmp`::
 
Compare two buffers. Returns an integer less than, equal to, or greater
diff --git a/config.c b/config.c
index a30cb5c..03ce5c6 100644
--- a/config.c
+++ b/config.c
@@ -147,12 +147,6 @@ int git_config_include(const char *var, const char *value, 
void *data)
return ret;
 }
 
-static void lowercase(char *p)
-{
-   for (; *p; p++)
-   *p = tolower(*p);
-}
-
 void git_config_push_parameter(const char *text)
 {
struct strbuf env = STRBUF_INIT;
@@ -180,7 +174,7 @@ int git_config_parse_parameter(const char *text,
strbuf_list_free(pair);
return error(bogus config parameter: %s, text);
}
-   lowercase(pair[0]-buf);
+   strbuf_tolower(pair[0]);
if (fn(pair[0]-buf, pair[1] ? pair[1]-buf : NULL, data)  0) {
strbuf_list_free(pair);
return -1;
diff --git a/strbuf.c b/strbuf.c
index 854c725..3da4f3e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -106,6 +106,13 @@ void strbuf_ltrim(struct strbuf *sb)
sb-buf[sb-len] = '\0';
 }
 
+void strbuf_tolower(struct strbuf *sb)
+{
+   char *p;
+   for (p = sb-buf; *p; p++)
+   *p = tolower(*p);
+}
+
 struct strbuf **strbuf_split_buf(const char *str, size_t slen,
 int terminator, int max)
 {
diff --git a/strbuf.h b/strbuf.h
index 4de7531..25328b9 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -45,6 +45,7 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t 
len)
 extern void strbuf_trim(struct strbuf *);
 extern void strbuf_rtrim(struct strbuf *);
 extern void strbuf_ltrim(struct strbuf *);
+extern void strbuf_tolower(struct strbuf *sb);
 extern int strbuf_cmp(const struct strbuf *, const struct strbuf *);
 
 /*
-- 
2.0.0.rc1.436.g03cb729
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH v4 3/3] add command performance tracing to debug scripted commands

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 02:40:48AM +0200, Karsten Blees wrote:

 E.g. if I'm interested in a particular code section, I throw in 2
 lines of code (before and after the code section). This gives very
 accurate results, without significantly affecting overall performance.
 I can then push the changes to my Linux/Windows box and get comparable
 results there. No need to disable optimization. No worries that the
 profiling tool isn't available on the other platform. No analyzing
 megabytes of mostly irrelevant profiling data.
 
 Does that make sense?

Ah, I see. I misunderstood from your example above.

I do agree that automatically stamping with __FILE__ and __LINE__ is
very helpful there. Could we maybe restrict that use of the variadic
macros to a few known-good compilers (maybe #ifdef __GNUC__, which also
hits clang, and something to catch MSVC)? On other systems it would
become a compile-time noop, and they could live without the feature.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] remote-curl: recognize text/plain with a charset parameter

2014-05-22 Thread Peter Krefting

Jeff King:


I was really hoping to avoid getting into all of the real-world
messiness that a real http client needs to deal with (as opposed to just
following the standards).


Yeah, I agree, you're probably fine without all this detail in over 
99% of the cases where this code would ever be exposed. I'm a bit 
damaged after 10+ years as a web browser developer, responsible for 
exactly this kind of stuff... :-)


--
\\// Peter - http://www.softwolves.pp.se/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread Elia Pinto
Found by check-non-portable-shell.pl

Signed-off-by: Elia Pinto gitter.spi...@gmail.com
---
 contrib/subtree/t/t7900-subtree.sh |2 +-
 git-remote-testgit.sh  |2 +-
 git-stash.sh   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/subtree/t/t7900-subtree.sh 
b/contrib/subtree/t/t7900-subtree.sh
index 66ce4b0..c1d0b23 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -8,7 +8,7 @@ This test verifies the basic operation of the merge, pull, add
 and split subcommands of git subtree.
 '
 
-export TEST_DIRECTORY=$(pwd)/../../../t
+TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY
 
 . ../../../t/test-lib.sh
 
diff --git a/git-remote-testgit.sh b/git-remote-testgit.sh
index 1c006a0..9d1ce76 100755
--- a/git-remote-testgit.sh
+++ b/git-remote-testgit.sh
@@ -13,7 +13,7 @@ refspec=${GIT_REMOTE_TESTGIT_REFSPEC-$default_refspec}
 
 test -z $refspec  prefix=refs
 
-export GIT_DIR=$url/.git
+GIT_DIR=$url/.git  export GIT_DIR
 
 force=
 
diff --git a/git-stash.sh b/git-stash.sh
index 4798bcf..0bb1af8 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -94,7 +94,7 @@ create_stash () {
# ease of unpacking later.
u_commit=$(
untracked_files | (
-   export GIT_INDEX_FILE=$TMPindex
+   GIT_INDEX_FILE=$TMPindex  export 
GIT_INDEX_FILE 
rm -f $TMPindex 
git update-index -z --add --remove --stdin 
u_tree=$(git write-tree) 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread Torsten Bögershausen
On 2014-05-22 14.48, Elia Pinto wrote:
 Found by check-non-portable-shell.pl

Thanks for picking this up
 -export TEST_DIRECTORY=$(pwd)/../../../t
 +TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY
Minor remark:
Both commands should go on their own line, like this:

TEST_DIRECTORY=$(pwd)/../../../t 
export TEST_DIRECTORY


And, unrelated to this patch, 
there seem to be a lot of  missing in git-remote-testgit.sh.
But this should be a seperate patch, I think.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread David Kastrup
Torsten Bögershausen tbo...@web.de writes:

 On 2014-05-22 14.48, Elia Pinto wrote:
 Found by check-non-portable-shell.pl

 Thanks for picking this up
 -export TEST_DIRECTORY=$(pwd)/../../../t
 +TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY
 Minor remark:
 Both commands should go on their own line, like this:

 TEST_DIRECTORY=$(pwd)/../../../t 
 export TEST_DIRECTORY


 And, unrelated to this patch, 
 there seem to be a lot of  missing in git-remote-testgit.sh.

I have a hard time taking the above  seriously.  pwd is a shell
builtin (when we are not talking about Version 3 UNIX or something) that
can hardly fail.  And when your shell does not support assignment to a
shell variable, you'll have a hard time getting the shell script to run.

That's stuff of the

if (1+1 != 2) { fputs(Warning: your CPU may be broken, stderr); }

variety.  If you have to check for that, you have bigger problems...

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: add strbuf_tolower function

2014-05-22 Thread Jeff King
[re-adding list cc]

On Thu, May 22, 2014 at 03:16:45PM +0200, Christian Couder wrote:

  +void strbuf_tolower(struct strbuf *sb)
  +{
  +   char *p;
  +   for (p = sb-buf; *p; p++)
  +   *p = tolower(*p);
  +}
 
 Last time I tried a change like the above, I was told that strbufs are
 buffers that can contain NUL bytes. So maybe it should be:
 
for (p = sb-buf; p  sb-buf + sb-len; p++)
   *p = tolower(*p);

Hah. I wrote it like that originally, and in review was asked to switch
it. I agree that it might be worth keeping strbuf functions 8bit-clean.
The original intent of the strbuf code was to make C strings easier to
use, but I think we sometimes use them as general buffers, too.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread Elia Pinto
5-22 14.48, Elia Pinto wrote:
 Found by check-non-portable-shell.pl

 Thanks for picking this up
 -export TEST_DIRECTORY=$(pwd)/../../../t
 +TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY
 Minor remark:
 Both commands should go on their own line, like this:

 TEST_DIRECTORY=$(pwd)/../../../t 
 export TEST_DIRECTORY

Apparently they are both common idioms and I find no indication in the
CodingGuideline.

I have no problems rerolling  this simple patch, but i need to know
what is the (git) right style in this case.

Thanks
--

cd ./git-core

 grep .*export *.sh
git-am.sh:  GIT_MERGE_VERBOSITY=0  export GIT_MERGE_VERBOSITY
git-filter-branch.sh:   echo case \\$GIT_$1_NAME\ in \\)
GIT_$1_NAME=\\${GIT_$1_EMAIL%%@*}\  export GIT_$1_NAME;; esac
git-filter-branch.sh:   GIT_DIR=$ORIG_GIT_DIR  export GIT_DIR
git-rebase--merge.sh:   GIT_MERGE_VERBOSITY=1  export
GIT_MERGE_VERBOSITY
git-remote-testgit.sh:GIT_DIR=$url/.git  export GIT_DIR
git-stash.sh:   GIT_INDEX_FILE=$TMPindex 
export GIT_INDEX_FILE 
git-stash.sh:   GIT_MERGE_VERBOSITY=0  export GIT_MERGE_VERBOSITY

grep 'export.*' *.sh
git-stash.sh:   GIT_INDEX_FILE=$TMPindex 
export GIT_INDEX_FILE 
git-stash.sh:   export GIT_INDEX_FILE 


 And, unrelated to this patch,
 there seem to be a lot of  missing in git-remote-testgit.sh.
 But this should be a seperate patch, I think.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] completion: add a note that merge options are shared

2014-05-22 Thread John Keeping
This should avoid future confusion after a subsequent patch has added
some options to __git_merge_options and some directly in _git_merge().

Signed-off-by: John Keeping j...@keeping.me.uk
---
 contrib/completion/git-completion.bash | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 2c59a76..ff97c20 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1472,6 +1472,7 @@ _git_log ()
__git_complete_revlist
 }
 
+# Common merge options shared by git-merge(1) and git-pull(1).
 __git_merge_options=
--no-commit --no-stat --log --no-log --squash --strategy
--commit --stat --no-squash --ff --no-ff --ff-only --edit --no-edit
-- 
2.0.0.rc2.4.g1dc51c6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread Johannes Sixt
Am 5/22/2014 15:19, schrieb David Kastrup:
 Torsten Bögershausen tbo...@web.de writes:
 
 On 2014-05-22 14.48, Elia Pinto wrote:
 Found by check-non-portable-shell.pl

 Thanks for picking this up
 -export TEST_DIRECTORY=$(pwd)/../../../t
 +TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY
 Minor remark:
 Both commands should go on their own line, like this:

 TEST_DIRECTORY=$(pwd)/../../../t 
 export TEST_DIRECTORY


 And, unrelated to this patch, 
 there seem to be a lot of  missing in git-remote-testgit.sh.
 
 I have a hard time taking the above  seriously.  pwd is a shell
 builtin (when we are not talking about Version 3 UNIX or something) that
 can hardly fail.  And when your shell does not support assignment to a
 shell variable, you'll have a hard time getting the shell script to run.

The  after an assignment makes a big difference when the assignment is
part of an  chain. This is *very* common in our test suite, as you know.

People tend to copy-and-paste. And then it is better to provide a more
universally applicable precedent.

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread David Kastrup
Johannes Sixt j.s...@viscovery.net writes:

 Am 5/22/2014 15:19, schrieb David Kastrup:
 Torsten Bögershausen tbo...@web.de writes:
 
 On 2014-05-22 14.48, Elia Pinto wrote:
 Found by check-non-portable-shell.pl

 Thanks for picking this up
 -export TEST_DIRECTORY=$(pwd)/../../../t
 +TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY
 Minor remark:
 Both commands should go on their own line, like this:

 TEST_DIRECTORY=$(pwd)/../../../t 
 export TEST_DIRECTORY


 And, unrelated to this patch, 
 there seem to be a lot of  missing in git-remote-testgit.sh.
 
 I have a hard time taking the above  seriously.  pwd is a shell
 builtin (when we are not talking about Version 3 UNIX or something) that
 can hardly fail.  And when your shell does not support assignment to a
 shell variable, you'll have a hard time getting the shell script to run.

 The  after an assignment makes a big difference when the assignment is
 part of an  chain. This is *very* common in our test suite, as you know.

 People tend to copy-and-paste. And then it is better to provide a more
 universally applicable precedent.

Copy-and-paste will not magically add the second  that would be
required for that usage, and the one in the line above might mislead you
into thinking that the problem has been dealt with already.

So I'm not convinced that using  outside of a preexisting  chain
makes any sense in this context.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 35/44] refs.c: make delete_ref use a transaction

2014-05-22 Thread Ronnie Sahlberg
On Wed, May 21, 2014 at 4:22 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 [...]
 @@ -2515,24 +2510,18 @@ static int delete_ref_loose(struct ref_lock *lock, 
 int flag, struct strbuf *err)

  int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
  {
 - struct ref_lock *lock;
 - int ret = 0, flag = 0;
 + struct ref_transaction *transaction;

 [...]
 - lock = lock_ref_sha1_basic(refname, sha1, delopt, flag);
 + if (!transaction ||
 + ref_transaction_delete(transaction, refname, sha1, delopt,
 +sha1  !is_null_sha1(sha1), err) ||

 What should happen when is_null_sha1(sha1)?  In that case the
 caller has asked us to check that the ref doesn't exist before
 deleting it, which doesn't make a lot of sense, but the old
 code did exactly that if I read it correctly.  The new code
 seems to disable verification instead.

 Do we know that no callers call delete_ref with such an argument?
 Would it make sense to just die with a BUG: message to make
 the contract more clear?

There are no callers that pass in null_sha1 explicitely and all tests pass.
I have changed it to a die(BUG... to make it more explicit as you suggested.


 [...]
 - unlink_or_warn(git_path(logs/%s, lock-ref_name));

 When does the reflog get deleted in the new code?

It is deleted towards the end of ref_transaction_commit, after the ref
itself has been deleted.

Thanks!

ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 36/44] refs.c: pass the ref log message to _create/delete/update instead of _commit

2014-05-22 Thread Ronnie Sahlberg
On Wed, May 21, 2014 at 4:47 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Change the reference transactions so that we pass the reflog message
 through to the create/delete/update function instead of the commit message.

 Nice.

 [...]
 --- a/builtin/fetch.c
 +++ b/builtin/fetch.c
 @@ -673,7 +673,6 @@ static int store_updated_refs(const char *raw_url, const 
 char *remote_name,
   }
   }
   }
 -
   if (rc  STORE_REF_ERROR_DF_CONFLICT)
   error(_(some local refs could not be updated; try running\n
  'git remote prune %s' to remove any old, conflicting 

 Stray whitespace change?

Fixed.


 [...]
 --- a/refs.c
 +++ b/refs.c
 [...]
 @@ -3264,6 +3264,7 @@ struct ref_update {
   int have_old; /* 1 if old_sha1 is valid, 0 otherwise */
   struct ref_lock *lock;
   int type;
 + const char *msg;
   const char refname[FLEX_ARRAY];

 Should be 'char *msg' since we own the memory (or perhaps a strbuf).

I changed it to char *msg, but it still might/should be const. We own
the memory but we are not supposed to modify the content
(after we have copied what the caller gave us).

Yes, we should change it into a strbuf at some stage.


 [...]
 @@ -3297,9 +3298,10 @@ void ref_transaction_free(struct ref_transaction 
 *transaction)
   if (!transaction)
   return;

 - for (i = 0; i  transaction-nr; i++)
 + for (i = 0; i  transaction-nr; i++) {
 +   free((char *)transaction-updates[i]-msg);
   free(transaction-updates[i]);

 Whitespace?

Fixed.


 No need to cast.

Done.

Thanks!
ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 38/44] refs.c: pack all refs before we start to rename a ref

2014-05-22 Thread Ronnie Sahlberg
On Wed, May 21, 2014 at 4:57 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 This means that most loose refs will no longer be present after the rename

 Is this to handle the git branch -m foo/bar foo case or for some other
 purpose?

Yes. That is the main reason.


 [...]
 --- a/t/t3200-branch.sh
 +++ b/t/t3200-branch.sh
 @@ -289,7 +289,7 @@ test_expect_success 'renaming a symref is not allowed' '
   git symbolic-ref refs/heads/master2 refs/heads/master 
   test_must_fail git branch -m master2 master3 
   git symbolic-ref refs/heads/master2 
 - test_path_is_file .git/refs/heads/master 
 + test_path_is_missing .git/refs/heads/master 
   test_path_is_missing .git/refs/heads/master3

 It's kind of silly that this test is mucking about in the .git directory
 at all.  Shouldn't the check be something like

 git rev-parse --verify refs/heads/master 
 test_must_fail git symbolic-ref refs/heads/master3 
 test_must_fail git rev-parse refs/heads/master3

 ?

Thanks. I updated the test with your change.

There is a whole bunch of tests that are like and access files directly..
Testing if a path exists or not, or checking that a reflog file
contains x number of lines. etc.

All these tests will be updated to not access the files directly once
I start mucking around with a TDB based refs backend.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git reset for index restoration?

2014-05-22 Thread David Turner
If I have a git repository with a clean working tree, and I delete the
index, then I can use git reset (with no arguments) to recreate it.
However, when I do recreate it, it doesn't come back the same.  I have
not analyzed this in detail, but the effect is that commands like git
status take much longer because they must read objects out of a pack
file.  In other words, the index seems to not realize that the index (or
at least most of it) represents the same state as HEAD.  If I do git
reset --hard, the index is restored to the original state (it's
byte-for-byte identical), and the pack file is no longer read.

Before I try to dig in to why this should be so, does anyone happen to
know off the top of their head?  Does this constitute a bug in git, or a
bug in my understanding of git?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 35/44] refs.c: make delete_ref use a transaction

2014-05-22 Thread Ronnie Sahlberg
On Thu, May 22, 2014 at 8:32 AM, Ronnie Sahlberg sahlb...@google.com wrote:
 On Wed, May 21, 2014 at 4:22 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 [...]
 @@ -2515,24 +2510,18 @@ static int delete_ref_loose(struct ref_lock *lock, 
 int flag, struct strbuf *err)

  int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
  {
 - struct ref_lock *lock;
 - int ret = 0, flag = 0;
 + struct ref_transaction *transaction;

 [...]
 - lock = lock_ref_sha1_basic(refname, sha1, delopt, flag);
 + if (!transaction ||
 + ref_transaction_delete(transaction, refname, sha1, delopt,
 +sha1  !is_null_sha1(sha1), err) ||

 What should happen when is_null_sha1(sha1)?  In that case the
 caller has asked us to check that the ref doesn't exist before
 deleting it, which doesn't make a lot of sense, but the old
 code did exactly that if I read it correctly.  The new code
 seems to disable verification instead.

 Do we know that no callers call delete_ref with such an argument?
 Would it make sense to just die with a BUG: message to make
 the contract more clear?

 There are no callers that pass in null_sha1 explicitely and all tests pass.
 I have changed it to a die(BUG... to make it more explicit as you suggested.

Strike that.

While there are no callers I can see that deliberately send null_sha1
it can happen because resolve_ref_unsafe(reading=0) calls
handle_missing_loose_ref which will clear sha1
if the ref is missing.

This can happen for either symrefs or refs that are soft links that
point to a nonexisting ref.





 [...]
 - unlink_or_warn(git_path(logs/%s, lock-ref_name));

 When does the reflog get deleted in the new code?

 It is deleted towards the end of ref_transaction_commit, after the ref
 itself has been deleted.

 Thanks!

 ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] read-cache: inform the daemon that the index has been updated

2014-05-22 Thread David Turner
On Tue, 2014-05-13 at 18:15 +0700, Nguyễn Thái Ngọc Duy wrote:
 + if (run_command(cp))
 + warning(_(failed to start read-cache--daemon: %s),
 + strerror(errno));

errno is not always (ever?) set, so if read-cache--daemon is missing,
you get:
warning: failed to start read-cache--daemon: Success


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote:

 If I have a git repository with a clean working tree, and I delete the
 index, then I can use git reset (with no arguments) to recreate it.
 However, when I do recreate it, it doesn't come back the same.  I have
 not analyzed this in detail, but the effect is that commands like git
 status take much longer because they must read objects out of a pack
 file.  In other words, the index seems to not realize that the index (or
 at least most of it) represents the same state as HEAD.  If I do git
 reset --hard, the index is restored to the original state (it's
 byte-for-byte identical), and the pack file is no longer read.

Are you sure it's reading a packfile? I would expect that it is rather
reading the files in the working tree. One of the functions of the index
is to maintain a cache of the sha1 of the working tree file content and
the stat information. Once that is populated, a subsequent git status
can then just lstat() the files to see that its sha1 cache is up to
date.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Elijah Newren
On Thu, May 22, 2014 at 9:22 AM, David Turner dtur...@twopensource.com wrote:
 If I have a git repository with a clean working tree, and I delete the
 index, then I can use git reset (with no arguments) to recreate it.
 However, when I do recreate it, it doesn't come back the same.  I have
 not analyzed this in detail, but the effect is that commands like git
 status take much longer because they must read objects out of a pack
 file.  In other words, the index seems to not realize that the index (or
 at least most of it) represents the same state as HEAD.  If I do git
 reset --hard, the index is restored to the original state (it's
 byte-for-byte identical), and the pack file is no longer read.

 Before I try to dig in to why this should be so, does anyone happen to
 know off the top of their head?  Does this constitute a bug in git, or a
 bug in my understanding of git?

It's not a bug.  The index has additional stat-info it tracks about
files -- inode number, mtime, etc. -- so that it can quickly check
whether files are up to date without comparing full file contents in
the working copy to the relevant version from .git/objects.

'git reset' means make the index match the commit in HEAD.  It implies
nothing about the working copy files, which could be different.
Although you say that you have a clean working tree, git doesn't check
to verify that, so it has to treat these files as stat-dirty until the
next operation (e.g. git status) fills these in -- an operation that
involves full comparison of the files in the working copy with the
relevant version of the file from under .git/objects. (You may find
'git update-index --refresh' helpful here.)

git reset --hard means not only make the index match the commit in
HEAD, but change files in the working copy to match as well.  In such
a case, git will know that the index matches the working copy (since
it is enforcing it), so it can update all the stat-info in the index
and future git-status operations will be fast.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 32/44] refs.c: remove the update_ref_write function

2014-05-22 Thread Ronnie Sahlberg
On Wed, May 21, 2014 at 3:07 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 +++ b/refs.c
 [...]
 @@ -3518,14 +3499,16 @@ int ref_transaction_commit(struct ref_transaction 
 *transaction,
   struct ref_update *update = updates[i];

   if (!is_null_sha1(update-new_sha1)) {
 - ret = update_ref_write(msg,
 -update-refname,
 -update-new_sha1,
 -update-lock, err,
 -UPDATE_REFS_QUIET_ON_ERR);
 + ret = write_ref_sha1(update-lock, update-new_sha1,
 +  msg);

 This changes the return value on error from 1 to -1.  That seems like a
 good change.  It's probably worth mentioning in the commit message.

Done.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 34/44] refs.c: make prune_ref use a transaction to delete the ref

2014-05-22 Thread Ronnie Sahlberg
Added a comment that any flags =0x100 are reserved for internal use.


On Wed, May 21, 2014 at 4:01 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Change prune_ref to delete the ref using a ref transaction. To do this we 
 also
 need to add a new flag REF_ISPRUNING that will tell the transaction that we
 do not want to delete this ref from the packed refs.

 Interesting.  Since the flag is per ref update, it even would allow
 deleting some refs and pruning others in the same transaction.  Makes
 sense.

 Looks like this doesn't batch up multiple ref-prunings into a single
 transaction.  Makes sense (it would just make the period while refs
 are locked longer without having any real benefit).

 [...]
 +#define REF_ISPRUNING0x0100

 Can this conflict with bit values declared elsewhere some day?  It
 would be more comfortable if refs.h also had a note about bits =
 0x100 being reserved for private use.

 The rest of the patch looks good.

 Thanks,
 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 30/44] refs.c: add transaction.status and track OPEN/CLOSED/ERROR

2014-05-22 Thread Ronnie Sahlberg
On Wed, May 21, 2014 at 3:22 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Please pull my ref-transactions branch.

 I'm at bd5736cb (2014-05-21 13:46) now.

 On Wed, May 21, 2014 at 3:00 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 @@ -3308,6 +3308,12 @@ struct ref_update {
   const char refname[FLEX_ARRAY];
  };

 +enum ref_transaction_status {
 + REF_TRANSACTION_OPEN   = 0,
 + REF_TRANSACTION_CLOSED = 1,
 + REF_TRANSACTION_ERROR  = 2,

 What is the difference between _TRANSACTION_CLOSED and
 _TRANSACTION_ERROR?

 Closed is a transaction that has been committed successfully, and
 which we can not do any more updates onto.
 Error is a transaction that has failed, and which we can not do any
 more updates onto.

 That means that both mean the caller cannot do any more updates,
 right?  Why not call them both _CLOSED?

 The distinction could be useful if in the future we add support to
 reuse a transaction

 If the distinction becomes useful in the future, wouldn't that
 be the moment to add a new state?

 I don't mean that there has to be a big useful distinction.  E.g.,
 maybe the idea is that when using a debugger it can be useful to see
 the difference between _ERROR and _CLOSED or something, and I think
 that would be fine as long as the intended meaning is documented
 (e.g., using comments).  My only complaint is that it's hard to
 understand the code and keep track of which status should be used in a
 given place unless there is some distinction between them.

I have documented the transaction states in refs.c

Thanks.
ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 39/44] refs.c: move the check for valid refname to lock_ref_sha1_basic

2014-05-22 Thread Ronnie Sahlberg
On Wed, May 21, 2014 at 6:42 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:
 --- a/refs.c
 +++ b/refs.c
 @@ -2044,6 +2044,9 @@ static struct ref_lock *lock_ref_sha1_basic(const char 
 *refname,
   int missing = 0;
   int attempts_remaining = 3;

 + if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
 + return NULL;

 Are we sure that no callers are locking a poorly named ref
 as preparation for repairing it?

 To see what works already in that vein, I tried:

 $ git rev-parse HEAD .git/refs/heads/foo..bar
 $ git branch -m foo..bar something-saner
 fatal: Invalid branch name: 'foo..bar'

 git branch -m has an explicit codepath (recovery = 1;) to handle
 this case, but it looks like it was not well tested and regressed in
 v1.7.8-rc0~19^2~7 (resolve_ref(): verify that the input refname has
 the right format, 2011-09-15).

 Is what the recovery codepath of branch -m does misguided?  One
 school of thought would be that people with malformed refs are
 responsible for recovering using low-level tools like mv and vi
 instead of normal git commands since normal git commands should never
 have created such a bad situation.  Another school of thought would
 assert that

  * git commands can have bugs
  * the format checked by check_refname_format() keeps getting stricter
over time

 which means it's nice when people can recover with 'update-ref -d'
 or 'branch -m'.  It's not obvious to me what the right thing to do
 here is (maybe a special flag to be attached to a ref update during
 recovery?).


I don't think we should allow external caller any other way to
access/modify refs than through
the transaction api. This may make it awkward to handle cases such as
foo/../bar was created  and how would you now delete it ?

The exception to this I think would be 'git fsck'. We could allow fsck
to have low level access to the backing store and allow it to access
the files directly,
or allow fsck to set magic flags that disable various checks.




 Hope that helps,
 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 39/44] refs.c: move the check for valid refname to lock_ref_sha1_basic

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:
 On Wed, May 21, 2014 at 6:42 PM, Jonathan Nieder jrnie...@gmail.com wrote:

 $ git rev-parse HEAD .git/refs/heads/foo..bar
 $ git branch -m foo..bar something-saner
 fatal: Invalid branch name: 'foo..bar'

 git branch -m has an explicit codepath (recovery = 1;) to handle
 this case, but it looks like it was not well tested and regressed in
 v1.7.8-rc0~19^2~7 (resolve_ref(): verify that the input refname has
 the right format, 2011-09-15).

 Is what the recovery codepath of branch -m does misguided?
[...]
 I don't think we should allow external caller any other way to
 access/modify refs than through
 the transaction api. This may make it awkward to handle cases such as
 foo/../bar was created  and how would you now delete it ?

 The exception to this I think would be 'git fsck'. We could allow fsck
 to have low level access to the backing store and allow it to access
 the files directly,
 or allow fsck to set magic flags that disable various checks.

Interesting.  Do you mean that 'git fsck' should notice invalid refnames
and have an option to repair them on its own?

That would be a big change from what git fsck currently does --- it's
currently read-only (except for fsck --lost-found, which is read-only
except in the .git/lost-found/ directory).  If there's an option to
check refnames only and not have to wait for the full check of all
objects then I think it makes sense.

In the meantime, 'git branch -m' and 'git update-ref -d' (which were
the historical ways to do this kind of repair) are already broken, so
at least this patch doesn't seem to make anything worse. :/  (I
haven't checked all callers, though.)

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 40/44] refs.c: call lock_ref_sha1_basic directly from commit

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  refs.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

Reviewed-by: Jonathan Nieder jrnie...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 39/44] refs.c: move the check for valid refname to lock_ref_sha1_basic

2014-05-22 Thread Ronnie Sahlberg
On Thu, May 22, 2014 at 10:44 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:
 On Wed, May 21, 2014 at 6:42 PM, Jonathan Nieder jrnie...@gmail.com wrote:

 $ git rev-parse HEAD .git/refs/heads/foo..bar
 $ git branch -m foo..bar something-saner
 fatal: Invalid branch name: 'foo..bar'

 git branch -m has an explicit codepath (recovery = 1;) to handle
 this case, but it looks like it was not well tested and regressed in
 v1.7.8-rc0~19^2~7 (resolve_ref(): verify that the input refname has
 the right format, 2011-09-15).

 Is what the recovery codepath of branch -m does misguided?
 [...]
 I don't think we should allow external caller any other way to
 access/modify refs than through
 the transaction api. This may make it awkward to handle cases such as
 foo/../bar was created  and how would you now delete it ?

 The exception to this I think would be 'git fsck'. We could allow fsck
 to have low level access to the backing store and allow it to access
 the files directly,
 or allow fsck to set magic flags that disable various checks.

 Interesting.  Do you mean that 'git fsck' should notice invalid refnames
 and have an option to repair them on its own?

I think that would be useful.
But it would prompt the user before it does any changes. Even for a
fsck it would be rude to do any changes without asking the user.

'.git/refs/heads/foo^G^G^G.bar' it not a valid ref.
[s kip] [d elete] [r ename] ?



 That would be a big change from what git fsck currently does --- it's
 currently read-only (except for fsck --lost-found, which is read-only
 except in the .git/lost-found/ directory).  If there's an option to
 check refnames only and not have to wait for the full check of all
 objects then I think it makes sense.

 In the meantime, 'git branch -m' and 'git update-ref -d' (which were
 the historical ways to do this kind of repair) are already broken, so
 at least this patch doesn't seem to make anything worse. :/  (I
 haven't checked all callers, though.)

 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 38/44] refs.c: pack all refs before we start to rename a ref

2014-05-22 Thread Ronnie Sahlberg
On Thu, May 22, 2014 at 10:51 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:
 On Wed, May 21, 2014 at 4:57 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 This means that most loose refs will no longer be present after the rename

 Is this to handle the git branch -m foo/bar foo case or for some other
 purpose?

 Yes. That is the main reason.

 Thanks.  Please say so in the log message.  Remember that people of
 the future debugging and trying to figure out how to fix the latest
 regression without breaking something else and whether to just revert
 your patch cannot read your mind. ;-)


Done.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote:
 On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote:
 
  If I have a git repository with a clean working tree, and I delete the
  index, then I can use git reset (with no arguments) to recreate it.
  However, when I do recreate it, it doesn't come back the same.  I have
  not analyzed this in detail, but the effect is that commands like git
  status take much longer because they must read objects out of a pack
  file.  In other words, the index seems to not realize that the index (or
  at least most of it) represents the same state as HEAD.  If I do git
  reset --hard, the index is restored to the original state (it's
  byte-for-byte identical), and the pack file is no longer read.
 
 Are you sure it's reading a packfile?

Well, it's calling inflate(), and strace says it is reading
e.g. .git/objects/pack/pack-{idx,pack}.

So, I would say so.




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Thu, 2014-05-22 at 09:46 -0700, Elijah Newren wrote:
 On Thu, May 22, 2014 at 9:22 AM, David Turner dtur...@twopensource.com 
 wrote:
  If I have a git repository with a clean working tree, and I delete the
  index, then I can use git reset (with no arguments) to recreate it.
  However, when I do recreate it, it doesn't come back the same.  I have
  not analyzed this in detail, but the effect is that commands like git
  status take much longer because they must read objects out of a pack
  file.  In other words, the index seems to not realize that the index (or
  at least most of it) represents the same state as HEAD.  If I do git
  reset --hard, the index is restored to the original state (it's
  byte-for-byte identical), and the pack file is no longer read.
 
  Before I try to dig in to why this should be so, does anyone happen to
  know off the top of their head?  Does this constitute a bug in git, or a
  bug in my understanding of git?
 
 It's not a bug.  The index has additional stat-info it tracks about
 files -- inode number, mtime, etc. -- so that it can quickly check
 whether files are up to date without comparing full file contents in
 the working copy to the relevant version from .git/objects.
 
 'git reset' means make the index match the commit in HEAD. 

Sure, so why does git status the need to read the pack file, given that
we already know that the index matches HEAD?

  It implies
 nothing about the working copy files, which could be different.
 Although you say that you have a clean working tree, git doesn't check
 to verify that, so it has to treat these files as stat-dirty until the
 next operation (e.g. git status) fills these in -- an operation that
 involves full comparison of the files in the working copy with the
 relevant version of the file from under .git/objects. (You may find
 'git update-index --refresh' helpful here.)

In fact, git status does not write the index (at least in this context).
And what is slow is not (only) checking over the working tree, but
reading the packs.  There should be no need to read files from the ODB
at all, since the index knows those objects' SHA1s, and it can check
those against the working tree's SHA1s.  Interestingly, if strace is to
be believed, git status doesn't even do check the working trees SHA1s
after a git reset; maybe reset already sets the stat bits?

So that's why I'm confused.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-22 Thread Jonathan Nieder
Hi,

Ronnie Sahlberg wrote:

 Add a new flag REF_ISPACKONLY that we can use in ref_transaction_delete.
 This flag indicates that the ref does not exist as a loose ref andf only as
 a packed ref. If this is the case we then change the commit code so that
 we skip taking out a lock file and we skip calling delete_ref_loose.

This seems wrong.  Can't someone else create a loose ref which will
shadow the packed ref and break the serializability of updates to this
ref?

The above doesn't explain why we want to avoid locking the loose ref,
but I assume it's for the sake of the git branch -m foo/bar foo
case.  For that case, wouldn't the following sequence of filesystem
operations work?

- create $GIT_DIR/refs/heads/foo/bar.lock
- create $GIT_DIR/refs/heads/foo.lock
- if $GIT_DIR/refs/heads/foo/bar exists, add the ref to
  packed-refs (using the usual lockfile-update mechanism)
  and unlink $GIT_DIR/refs/heads/foo/bar
- verify that current foo and foo/bar state are okay.  If
  not, roll back.
- unlink $GIT_DIR/refs/heads/foo/bar.lock
- rmdir $GIT_DIR/refs/heads/foo
- rename $GIT_DIR/refs/heads/foo.lock into place

Or:

- create $GIT_DIR/refs/heads/foo/bar.lock
- create $GIT_DIR/refs/heads/foo.lock
- verify state of all relevant refs

- update packed-refs to remove refs/heads/foo/bar and
  add refs/heads/foo

- unlink $GIT_DIR/refs/heads/foo/bar
- unlink $GIT_DIR/refs/heads/foo
- unlink $GIT_DIR/refs/heads/foo/bar.lock
- unlink $GIT_DIR/refs/heads/foo.lock

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 02:08:16PM -0400, David Turner wrote:

 On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote:
  On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote:
 
   If I have a git repository with a clean working tree, and I delete the
   index, then I can use git reset (with no arguments) to recreate it.
   However, when I do recreate it, it doesn't come back the same.  I have
   not analyzed this in detail, but the effect is that commands like git
   status take much longer because they must read objects out of a pack
   file.  In other words, the index seems to not realize that the index (or
   at least most of it) represents the same state as HEAD.  If I do git
   reset --hard, the index is restored to the original state (it's
   byte-for-byte identical), and the pack file is no longer read.
 
  Are you sure it's reading a packfile?

 Well, it's calling inflate(), and strace says it is reading
 e.g. .git/objects/pack/pack-{idx,pack}.

 So, I would say so.

That seems odd that we would be spending extra time there. We do
inflate() the trees in order to diff the index against HEAD, but we
shouldn't need to inflate any blobs.

Here it is for me (on linux.git):

  [before, warm cache]
  $ time perf record -q git status /dev/null
  real0m0.192s
  user0m0.080s
  sys 0m0.108s

  $ perf report | grep -v '#' | head -5
 7.46%  git  [kernel.kallsyms]   [k] __d_lookup_rcu
 4.55%  git  libz.so.1.2.8   [.] inflate
 3.53%  git  libc-2.18.so[.] __memcmp_sse4_1
 3.46%  git  [kernel.kallsyms]   [k] security_inode_getattr
 3.29%  git  git [.] memihash

  $ time git reset
  real0m0.080s
  user0m0.036s
  sys 0m0.040s

So status is pretty quick, and the time is going to lstat in the kernel,
and some tree inflation. Reset is fast, because it has nothing much to
do. Now let's kill off the index's stat cache:

  $ rm .git/index
  $ time perf record -q git reset
  real0m0.967s
  user0m0.780s
  sys 0m0.180s

That took a while. What was it doing?

  $ perf report | grep -v '#' | head -5
 3.23%  git  [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
 1.74%  git  libcrypto.so.1.0.0  [.] 0x0007e010
 1.60%  git  [kernel.kallsyms]   [k] __d_lookup_rcu
 1.51%  git  [kernel.kallsyms]   [k] page_fault
 1.44%  git  libc-2.18.so[.] __memcmp_sse4_1   

Reading files and sha1. We hash the working-tree files here (reset
doesn't technically need to refresh the index from the working tree to
copy entries from HEAD into the index, but it does it so it can do fancy
things like tell you about which files are now out-of-date).

Now how does stat fare after this?

  $ time perf record -q git status /dev/null
  real0m0.189s
  user0m0.088s
  sys 0m0.096s

Looks about the same as before to me.

Note that if you use read-tree instead of reset, it _just_ loads the
index, and doesn't touch the working tree. If you then run git status,
then _that_ command has to refresh the index, and it will pay the
hashing cost. Like:

  $ rm .git/index
  $ time git read-tree HEAD
  real0m0.084s
  user0m0.064s
  sys 0m0.016s
  $ time git status /dev/null
  real0m0.833s
  user0m0.712s
  sys 0m0.112s

All of this is behaving as I would expect. Can you show us a set of
commands that deviate from this?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] strbuf: add strbuf_tolower function

2014-05-22 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Wed, May 21, 2014 at 05:07:36PM -0700, Kyle J. McKay wrote:

 +void strbuf_tolower(struct strbuf *sb)
 +{
 +   size_t i;
 +   for (i = 0; i  sb-len; i++)
 +   sb-buf[i] = tolower(sb-buf[i]);
 +}
 +
 
 Wouldn't a direct transfer of the lowercase function be something more like:
 
 
 void strbuf_tolower(struct strbuf *sb)
 {
  char *p = sb-buf;
  for (; *p; p++)
  *p = tolower(*p);
 }
 
 That seems to me to be a bit more efficient.  According to the comments in
 strbuf.c, people can always assume buf is non NULL and -buf is NUL
 terminated even for a freshly initialized strbuf.

 Yes, and that would be fine with me (I actually wrote strbuf_tolower for
 my own use, and _then_ realized that we already had such a thing that
 could be replaced).

Do we forbid that sb-buf[x] for some x  sb-len to be NUL, and if
there is such a byte we stop running tolower() on the remainder?


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 02:17:22PM -0400, David Turner wrote:

 In fact, git status does not write the index (at least in this context).
 And what is slow is not (only) checking over the working tree, but
 reading the packs.  There should be no need to read files from the ODB
 at all, since the index knows those objects' SHA1s, and it can check
 those against the working tree's SHA1s.  Interestingly, if strace is to
 be believed, git status doesn't even do check the working trees SHA1s
 after a git reset; maybe reset already sets the stat bits?

Keep in mind that status is running _two_ diffs. One between HEAD and
the index, and one between the index and the working tree.

Your timings might be more interesting just run git diff-index --cached
HEAD, which is the index-HEAD half of that, ignoring the working tree
state entirely.

Naively, running that diff will involve walking the trees and the index
in parallel, which means opening a bunch of tree objects. The cache-tree
index extension, however, records tree sha1s, which means we can avoid
recursing into parts of the comparison.

Comparing the two:

  $ rm .git/index
  $ git reset
  $ time git diff-index --cached HEAD
  real0m0.044s
  user0m0.040s
  sys 0m0.000s

  $ git reset --hard
  $ time git diff-index --cached HEAD
  real0m0.012s
  user0m0.008s
  sys 0m0.000s

does show some improvement. Perhaps git reset is not writing out the
cache-tree extension?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] strbuf: add strbuf_tolower function

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 11:36:37AM -0700, Junio C Hamano wrote:

  Yes, and that would be fine with me (I actually wrote strbuf_tolower for
  my own use, and _then_ realized that we already had such a thing that
  could be replaced).
 
 Do we forbid that sb-buf[x] for some x  sb-len to be NUL, and if
 there is such a byte we stop running tolower() on the remainder?

Christian brought this up elsewhere, and I agree it's probably better to
work over the whole buffer, NULs included. I'm happy to re-roll (or you
can just pick up the version of the patch in this thread), but I think
the bigger question is: is this refactor worth doing, since there is
only one caller?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 0/2] format-patch --signature-file file

2014-05-22 Thread Junio C Hamano
Jeremiah Mahler jmmah...@gmail.com writes:

 I just notice that my patch is in 'pu'.
 But it is version 7 instead of the improved version 8.

Yeah, I know.  In a distributed environment, multiple people work
independently and a sequence of event can go like this:

 - I read v7, comment, and queue it only so that I won't forget;
 - I attend to other topics;
 - I start the day's integration cycle, merging topics to the
   integration branches, maint, master, next and pu;
 - You reroll v8 and post it;
 - I may not notice v8, or I may notice v8 but think it is not
   worth to discard the integration work so far only to replace
   v7 with v8.
 - The integration result is pushed out.

A reminder is very much appreciated, but on the other hand it is not
something unusual.  It happens all the time, and I usually am aware
of it ;-)

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Thu, 2014-05-22 at 14:39 -0400, Jeff King wrote:
 does show some improvement. Perhaps git reset is not writing out the
 cache-tree extension?
 
Yes, that seems to be exactly what is going on; the two indexes are
identical up to the point where the TREE extension appears.

Thanks for clearing that up!  

Do you think that this is a bug in git reset?

(I'm also going to reply to your other mail because it seems like it
maybe points up some related bug)

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Jeff King
On Thu, May 22, 2014 at 03:07:49PM -0400, David Turner wrote:

 On Thu, 2014-05-22 at 14:39 -0400, Jeff King wrote:
  does show some improvement. Perhaps git reset is not writing out the
  cache-tree extension?
  
 Yes, that seems to be exactly what is going on; the two indexes are
 identical up to the point where the TREE extension appears.
 
 Thanks for clearing that up!  
 
 Do you think that this is a bug in git reset?

Possibly. There is a call to prime_cache_tree in builtin/reset.c, which
looks like it should trigger during a mixed or hard reset (and
without arguments, you should have a mixed reset). But it doesn't seem
to get called. I haven't traced it further.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-22 Thread Ronnie Sahlberg
On Thu, May 22, 2014 at 11:17 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Ronnie Sahlberg wrote:

 Add a new flag REF_ISPACKONLY that we can use in ref_transaction_delete.
 This flag indicates that the ref does not exist as a loose ref andf only as
 a packed ref. If this is the case we then change the commit code so that
 we skip taking out a lock file and we skip calling delete_ref_loose.

 This seems wrong.  Can't someone else create a loose ref which will
 shadow the packed ref and break the serializability of updates to this
 ref?

 The above doesn't explain why we want to avoid locking the loose ref,
 but I assume it's for the sake of the git branch -m foo/bar foo
 case.  For that case, wouldn't the following sequence of filesystem
 operations work?

 - create $GIT_DIR/refs/heads/foo/bar.lock
 - create $GIT_DIR/refs/heads/foo.lock
 - if $GIT_DIR/refs/heads/foo/bar exists, add the ref to
   packed-refs (using the usual lockfile-update mechanism)
   and unlink $GIT_DIR/refs/heads/foo/bar
 - verify that current foo and foo/bar state are okay.  If
   not, roll back.
 - unlink $GIT_DIR/refs/heads/foo/bar.lock
 - rmdir $GIT_DIR/refs/heads/foo
 - rename $GIT_DIR/refs/heads/foo.lock into place

 Or:

 - create $GIT_DIR/refs/heads/foo/bar.lock
 - create $GIT_DIR/refs/heads/foo.lock
 - verify state of all relevant refs

 - update packed-refs to remove refs/heads/foo/bar and
   add refs/heads/foo

 - unlink $GIT_DIR/refs/heads/foo/bar
 - unlink $GIT_DIR/refs/heads/foo
 - unlink $GIT_DIR/refs/heads/foo/bar.lock
 - unlink $GIT_DIR/refs/heads/foo.lock



I removed all the rename_ref related patches for now. rename_ref is
probably best implemented specifically for each backend anyway.

I will still produce a separate patch that will do something like this
you suggested
(since rename_ref is still racy and fragile)

 - create $GIT_DIR/refs/heads/foo/bar.lock
 - create $GIT_DIR/refs/heads/foo.lock
 - verify state of all relevant refs

 - update packed-refs to remove refs/heads/foo/bar and
   add refs/heads/foo

 - unlink $GIT_DIR/refs/heads/foo/bar
 - unlink $GIT_DIR/refs/heads/foo
 - unlink $GIT_DIR/refs/heads/foo/bar.lock
 - unlink $GIT_DIR/refs/heads/foo.lock


Thanks
ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Thu, 2014-05-22 at 14:23 -0400, Jeff King wrote:
 On Thu, May 22, 2014 at 02:08:16PM -0400, David Turner wrote:
 
  On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote:
   On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote:
  
If I have a git repository with a clean working tree, and I delete the
index, then I can use git reset (with no arguments) to recreate it.
However, when I do recreate it, it doesn't come back the same.  I have
not analyzed this in detail, but the effect is that commands like git
status take much longer because they must read objects out of a pack
file.  In other words, the index seems to not realize that the index (or
at least most of it) represents the same state as HEAD.  If I do git
reset --hard, the index is restored to the original state (it's
byte-for-byte identical), and the pack file is no longer read.
  
   Are you sure it's reading a packfile?
 
  Well, it's calling inflate(), and strace says it is reading
  e.g. .git/objects/pack/pack-{idx,pack}.
 
  So, I would say so.
 
 That seems odd that we would be spending extra time there. We do
 inflate() the trees in order to diff the index against HEAD, but we
 shouldn't need to inflate any blobs.

I just did a fresh clone of linux.git, and it doesn't have TREE (and
spends time in inflate).  Then I did a reset --hard, and it gained TREE
(and stopped spending time in inflate).  Then I did a checkout -b, and
TREE was lost again (time in inflate).  hard reset again (to HEAD, so no
change), and TREE returns and status is fast again.

Committing doesn't seem to create a TREE extension where one doesn't
exist.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 42/44] refs.c: pass a skip list to name_conflict_fn

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 @@ -798,11 +798,19 @@ struct name_conflict_cb {
   const char *refname;
   const char *oldrefname;
   const char *conflicting_refname;
 + const char **skip;
 + int skipnum;

Would a struct string_list make sense here?  (See
Documentation/technical/api-string-list.txt.)

[...]
  };
  
  static int name_conflict_fn(struct ref_entry *entry, void *cb_data)
  {
   struct name_conflict_cb *data = (struct name_conflict_cb *)cb_data;
 + int i;
 + for(i = 0; i  data-skipnum; i++) {

(style nit) missing space after 'for'.

 + if (!strcmp(entry-name, data-skip[i])) {
 + return 0;
 + }

Style: git tends to avoid braces around a single-line if/for/etc body.

[...]
 @@ -817,15 +825,21 @@ static int name_conflict_fn(struct ref_entry *entry, 
 void *cb_data)
   * conflicting with the name of an existing reference in dir.  If
   * oldrefname is non-NULL, ignore potential conflicts with oldrefname
   * (e.g., because oldrefname is scheduled for deletion in the same
 - * operation).
 + * operation). skip contains a list of refs we want to skip checking for
 + * conflicts with. Refs may be skipped due to us knowing that it will
 + * be deleted later during a transaction that deletes one reference and then
 + * creates a new conflicting reference. For example a rename from m to m/m.

This example of Refs may be skipped due to seems overly complicated.
Isn't the idea just that skip contains a list of refs scheduled for
deletion in this transaction, since they shouldn't be treated as
conflicts at all (for example when renamining m to m/m)?

I wonder if there's some way to make use of the result of the naive
refname_available check to decide what to do when creating a ref.

E.g.: if a refname would be available except there's a ref being
deleted in the way, we could do one of the following:

 a. delete all relevant loose refs and perform the transaction in
packed-refs, or

 b. order operations to avoid the D/F conflict, even with loose refs
(the hardest case is if the ref being deleted uses a directory
and we want to create a file with the same name.  But that's
still doable if we're willing to rmdir when needed as part of
the loop to commit changes)

The packed-refs trick (a) seems much simpler, but either should work.

This could be done e.g. by checking is_refname_available with an empty
list first before doing the real thing with a list of exclusions.

[...]
 @@ -2592,6 +2609,9 @@ int rename_ref(const char *oldrefname, const char 
 *newrefname, const char *logms
   int log = !lstat(git_path(logs/%s, oldrefname), loginfo);
   const char *symref = NULL;
  
 + if (!strcmp(oldrefname, newrefname))
 + return 0;

What is the intended result if I try to rename a nonexistent ref or an
existent symref to its own name?

Sorry to be so fussy about this part.  It's not that I think that this
change is trying to do something bad --- in fact, it's more the
opposite, that I'm excited to see git learning to have a better
understanding and handling of refname D/F conflicts.

That would allow:

 * git fetch --prune working as a single transaction even if
   the repository being fetched from removed a refs/heads/topic
   branch and created refs/heads/topic/1 and refs/heads/topic/2

 * git fast-import and git fetch --mirror learning the same trick

 * fewer code paths having to be touched to be able to (optionally)
   let git actually tolerate D/F conflicts, for people who want to
   have 'topic', 'topic/1', and 'topic/2' branches at the same time.
   This could be turned on by default for remote-tracking refs.  It
   would be especially nice for people on Windows and Mac OS where
   there can be D/F conflicts that people on Linux didn't notice due
   to case-sensitivity.

   Longer term, through a configuration that starts turned off by
   default and has the default flipped as more people have upgraded
   git, this could make D/F conflicts in refnames stop being an error
   altogether.

So it's kind of exciting to see, even though it's fussy to get it
right.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Jeff King
[+cc Junio for cache-tree expertise]

On Thu, May 22, 2014 at 03:09:59PM -0400, Jeff King wrote:

   does show some improvement. Perhaps git reset is not writing out the
   cache-tree extension?
 [...]
 
 Possibly. There is a call to prime_cache_tree in builtin/reset.c, which
 looks like it should trigger during a mixed or hard reset (and
 without arguments, you should have a mixed reset). But it doesn't seem
 to get called. I haven't traced it further.

So here's what's happening. The prime_cache_tree call is in reset_index,
and was added by:

  commit 6c52ec8a9ab48b50fc8bf9559467d5a4cf7eee3b
  Author: Thomas Rast tr...@student.ethz.ch
  Date:   Tue Dec 6 18:43:39 2011 +0100
  
  reset: update cache-tree data when appropriate
  
  In the case of --mixed and --hard, we throw away the old index and
  rebuild everything from the tree argument (or HEAD).  So we have an
  opportunity here to fill in the cache-tree data, just as read-tree
  did.
  
  Signed-off-by: Thomas Rast tr...@student.ethz.ch
  Signed-off-by: Junio C Hamano gits...@pobox.com

But that was counteracted by:

  commit 3fde386a40f38dbaa684c17603e71909b862d021
  Author: Martin von Zweigbergk martinv...@gmail.com
  Date:   Mon Jan 14 21:47:51 2013 -0800
  
  reset [--mixed]: use diff-based reset whether or not pathspec was given
  
  Thanks to b65982b (Optimize diff-index --cached using cache-tree,
  2009-05-20), resetting with paths is much faster than resetting
  without paths. Some timings for the linux-2.6 repo to illustrate this
  (best of five, warm cache):
  
  reset   reset .
  real0m0.219s0m0.080s
  user0m0.140s0m0.040s
  sys 0m0.070s0m0.030s
  
  These two commands should do the same thing, so instead of having the
  user type the trailing  . to get the faster do_diff_cache()-based
  implementation, always use it when doing a mixed reset, with or
  without paths (so git reset $rev would also be faster).
  
  Timing git reset shows that it indeed becomes as fast as
  git reset . after this patch.
  
  Signed-off-by: Martin von Zweigbergk martinv...@gmail.com
  Signed-off-by: Junio C Hamano gits...@pobox.com

We never call reset_index now, because we handle it via diff.  We could
call prime_cache_tree in this case, but I'm not sure if that is a good
idea, because it primes it from scratch (and so it opens up all those
trees that we are trying to avoid touching). I'm not sure if there's an
easy way to update it incrementally; I don't know the cache-tree code
very well.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 0/2] format-patch --signature-file file

2014-05-22 Thread Jeremiah Mahler
Junio,

On Thu, May 22, 2014 at 12:00:39PM -0700, Junio C Hamano wrote:
 Jeremiah Mahler jmmah...@gmail.com writes:
 
  I just notice that my patch is in 'pu'.
  But it is version 7 instead of the improved version 8.
 
 Yeah, I know.  In a distributed environment, multiple people work
 independently and a sequence of event can go like this:
 
  - I read v7, comment, and queue it only so that I won't forget;
  - I attend to other topics;
  - I start the day's integration cycle, merging topics to the
integration branches, maint, master, next and pu;
  - You reroll v8 and post it;
  - I may not notice v8, or I may notice v8 but think it is not
worth to discard the integration work so far only to replace
v7 with v8.
  - The integration result is pushed out.
 
 A reminder is very much appreciated, but on the other hand it is not
 something unusual.  It happens all the time, and I usually am aware
 of it ;-)
 
 Thanks.

Thanks for explaining how this process works.

-- 
Jeremiah Mahler
jmmah...@gmail.com
http://github.com/jmahler
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 00/44] Use ref transactions for all ref updates

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 This version completes the work to convert all ref updates to use 
 transactions.

Finally got through this.  It had thorny bits but generally goes in a
very good direction.  Thanks for a pleasant read.

Feel free to send another iteration if you'd like review for the newer
code.  I expect except for the part about renames that most of what's
left is just nits so it should go faster.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 00/44] Use ref transactions for all ref updates

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 This patch series can also be found at
 https://github.com/rsahlberg/git/tree/ref-transactions

Thoughts on 65a1cb7b (2014-05-22 12:08):

 01/40 remove ref_transaction_rollback
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 02/40 constify the sha arguments for ref_transaction_create|delete|update
 still Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 03/40 allow passing NULL to ref_transaction_free
 The third paragraph (This allows us to write code like) is still
 confusing.  Why not drop that paragraph?

 04/40 --- oh, lunch time.  Will pick up when I get back.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] Add an explicit GIT_DIR to the list of excludes

2014-05-22 Thread Pasha Bolokhov
When an explicit '--git-dir' option points to a directory inside
the work tree, git treats it as if it were any other directory.
In particular, 'git status' lists it as untracked, while 'git add -A'
stages the metadata directory entirely

Add GIT_DIR to the list of excludes in setup_standard_excludes(),
while checking that GIT_DIR is not just '.git', in which case it
would be ignored by default, and that GIT_DIR is inside GIT_WORK_TREE

Although an analogous comparison of any given path against '.git'
is done in treat_path(), this does not seem to be the right place
to compare against GIT_DIR. Instead, the excludes provide an
effective mechanism of ignoring a file/directory, and adding GIT_DIR
as an exclude is equivalent of putting it into '.gitignore'. Function
setup_standard_excludes() was chosen because that is the place where
the excludes are initialized by the commands that are concerned about
excludes

Signed-off-by: Pasha Bolokhov pasha.bolok...@gmail.com
---
Add a test and change the documentation which now reflects the
fact that 'setup_standard_excludes()' has to be called for
the setup purposes

 Documentation/technical/api-directory-listing.txt |  4 +-
 dir.c | 20 
 t/t2205-add-gitdir.sh | 61 +++
 3 files changed, 83 insertions(+), 2 deletions(-)
 create mode 100755 t/t2205-add-gitdir.sh

diff --git a/Documentation/technical/api-directory-listing.txt 
b/Documentation/technical/api-directory-listing.txt
index 7f8e78d..fd4a178 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -90,8 +90,8 @@ marked. If you to exclude files, make sure you have loaded 
index first.
   `add_exclude()`.
 
 * To add patterns from a file (e.g. `.git/info/exclude`), call
-  `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`.  A
-  short-hand function `setup_standard_excludes()` can be used to set
+  `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`.  The
+  short-hand function `setup_standard_excludes()` must be used to set
   up the standard set of exclude settings.
 
 * Set options described in the Data Structure section above.
diff --git a/dir.c b/dir.c
index 98bb50f..07e36f3 100644
--- a/dir.c
+++ b/dir.c
@@ -1588,6 +1588,26 @@ void setup_standard_excludes(struct dir_struct *dir)
 {
const char *path;
char *xdg_path;
+   const char *gitdir = get_git_dir();
+
+   /* Add git directory to the ignores first */
+   if (strcmp(gitdir, .git) != 0) { /* --git-dir has been given */
+   char ngit[PATH_MAX + 1];
+
+   /*
+* See if GIT_DIR is inside the work tree; need to normalize
+* 'gitdir' but 'get_git_work_tree()' always appears absolute
+* and normalized
+*/
+   normalize_path_copy(ngit, real_path(absolute_path(gitdir)));
+
+   if (dir_inside_of(ngit, get_git_work_tree()) = 0) {
+   struct exclude_list *el = add_exclude_list(dir, 
EXC_CMDL,
+   --git-dir option);
+
+   add_exclude(gitdir, , 0, el, 0);
+   }
+   }
 
dir-exclude_per_dir = .gitignore;
path = git_path(info/exclude);
diff --git a/t/t2205-add-gitdir.sh b/t/t2205-add-gitdir.sh
new file mode 100755
index 000..3c6b853
--- /dev/null
+++ b/t/t2205-add-gitdir.sh
@@ -0,0 +1,61 @@
+#!/bin/sh
+#
+# Copyright (c) 2014 Pasha Bolokhov
+#
+
+test_description='alternative repository path specified by --git-dir is 
ignored by add and status'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+   git --git-dir=meta init 
+   for f in a b c d
+   do
+   echo DATA  $f || exit 1
+   done 
+   mkdir subdir1 
+   for f in e f g h
+   do
+   echo MORE DATA  subdir1/$f || exit 1
+   done 
+   mkdir subdir1/meta 
+   echo EVEN more Data  subdir1/meta/aa 
+   mkdir subdir1/ssubdir subdir1/ssubdir/meta 
+   echo So much more data  subdir1/ssubdir/meta/aaa
+'
+
+test_expect_success 'git status' ignores the repository directory '
+   git --git-dir=meta --work-tree=. status  status.out 
+   test_might_fail grep meta status.out  out 
+   ! test -s out
+'
+
+test_expect_success 'git add -A' ignores the repository directory '
+   git --git-dir=meta --work-tree=. add -A 
+   git --git-dir=meta --work-tree=. status --porcelain  status1.out 
+   test_might_fail grep meta status1.out  out1 
+   ! test -s out1
+'
+
+test_expect_success 'git status' acknowledges files 'meta' if repository is 
not within work tree '
+   test_might_fail rm -rf meta/ 
+   (
+   cd subdir1 
+   git --git-dir=../meta init
+   git --git-dir=../meta --work-tree=. status --porcelain  
status2.out 
+   test_might_fail 

Re: [PATCH v8 2/2] format-patch --signature-file file

2014-05-22 Thread Junio C Hamano
Jeremiah Mahler jmmah...@gmail.com writes:

 Added option that allows a signature file to be used with format-patch
 so that signatures with newlines and other special characters can be
 easily included.

s/Added option/Add an option/.  I do not think with newlines and
other special characters is the primary issue---isn't it more about
I have chosen to use this mail-signature; do not force me to retype
the same all the time?

   $ git format-patch --signature-file ~/.signature -1

The recommended command-line convention (see gitcli(7)) is to use
--option=value, so an example would be better to follow it, i.e.

$ git format-patch -1 --signature-file=$HOME/.signature

 The config variable format.signaturefile is also provided so that it
 can be added by default.

   $ git config format.signaturefile ~/.signature
   $ git format-patch -1

Something like:

To countermand the configuration variable for a specific run:

$ git format-patch -1 --signature=This time only
$ git format-patch -1 --signature;# to use the default
$ git format-patch -1 --signature= ;# to add nothing

is also needed here, I think.  Similarly, these two needs to be
tested in the test scripts you are modifying.  Specifically:

 +test_expect_success 'format-patch --no-signature and --signature-file OK' '
 + git format-patch --stdout --no-signature 
 --signature-file=mail-signature -1
 +'

should not just make sure format-patch does _something_, but needs
to make sure it does not contain the contents of the configured mail
signagture file.

I didn't see offhand if the tests make sure that a configured mail
signature can be overriden from the command line.  I think you would
want to test, with format-patch.signature-file pointing at the
mail-signature file, at least these three cases:

 - Run format-patch --no-signature and make sure that stops the
   contents from mail-signature file from being shown, and instead
   no mail-signature is given.

 - Run format-patch --signature='this time only' and make sure
   that stops the contents from mail-signature file from being shown
   and this time only is used instead.

 - Run format-patch --signature-file=another-mail-signature and
   make sure that stops the contents from mail-signature file from
   being shown and the contents from the other file is used instead.

Test for these negative cases is often what we forget, when we are
thrilled to show off that the shiny new feature works as expected.
We need to ensure that the ways to stop the shiny new feature from
kicking in will not be broken as well.

Thanks.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] strbuf: add strbuf_tolower function

2014-05-22 Thread Junio C Hamano
Jeff King p...@peff.net writes:

  Yes, and that would be fine with me (I actually wrote strbuf_tolower for
  my own use, and _then_ realized that we already had such a thing that
  could be replaced).
 ...
 ... I think
 the bigger question is: is this refactor worth doing, since there is
 only one caller?

If you wrote it for your own use and then realized that it is
applicable to this codepath, wouldn't that say that there are
multiple potential callers that would benefit from having it?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 [+cc Junio for cache-tree expertise]
 ...
 We never call reset_index now, because we handle it via diff.  We could
 call prime_cache_tree in this case, but I'm not sure if that is a good
 idea, because it primes it from scratch (and so it opens up all those
 trees that we are trying to avoid touching). I'm not sure if there's an
 easy way to update it incrementally; I don't know the cache-tree code
 very well.

The cache-tree is designed to start in a well-populated state,
allowing you to efficiently smudge the part you touched by
invalidating while keeping the parts you haven't touched intact.

What is missing in its API is a more fine-grained support to let us
say it has degraded too much and we need to bring it into a
well-populated state again for it to be truly useful as an
optimization.  There are only two modes of support to revive a
degraded cache-tree, one being write_cache_as_tree(), in which case
we have to compute necessary tree object names anyway (so there is
no point discarding the result of the computation), and the other
being calls to prime-cache-tree, in which we happen to know that the
whole index contents must match the whole tree structure represented
by one tree object.

Both aim to restore the cache-tree into a fully-populated state, and
there is no support to populate it well enough by doing anything
incremental.  You can call write-tree side incremental, because it
does reuse what is still valid without recomputing tree objects for
them---but the result is a fully-populated state.

Adding a more fine-grain support is not against the overall design,
but it was unclear what such additional API functions should look
like, and where we can call them safely, at least back when we were
actively improving it.  Two that comes to my mind are:

 - We know that the subtrees down in this directory are degraded too
   much; write-tree only the subtrees that correspond to this
   directory without restoring other parts of the tree.

 - We just populated the index with the subtrees in this directory
   and know that they should match the tree hierarchy exactly.
   prime-cache-tree only the parts without affecting other parts of
   the tree.

As with calls to existing (whole-tree) prime-cache-tree, the latter
is an error-prone optimization---I think we had cases where we said
after this operation, we know that the index must exactly match the
tree we used to muck with the index and added a call, and later
discovered that must exactly match was not true.

The former forces recomputation, so there is much less safety
concern.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Thu, 2014-05-22 at 14:34 -0700, Junio C Hamano wrote:
 Jeff King p...@peff.net writes:
 
  [+cc Junio for cache-tree expertise]
  ...
  We never call reset_index now, because we handle it via diff.  We could
  call prime_cache_tree in this case, but I'm not sure if that is a good
  idea, because it primes it from scratch (and so it opens up all those
  trees that we are trying to avoid touching). I'm not sure if there's an
  easy way to update it incrementally; I don't know the cache-tree code
  very well.
 
 The cache-tree is designed to start in a well-populated state,
 allowing you to efficiently smudge the part you touched by
 invalidating while keeping the parts you haven't touched intact.

As far as I can tell, the cache-tree does not in fact ever get into a
well-populated state (that is, it does not exist at all) under ordinary
git operation except by git reset --hard.  Perhaps this was already
clear from the previous traffic on the thread, but I wanted to make sure
Junio was also aware of this.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 On Thu, 2014-05-22 at 14:34 -0700, Junio C Hamano wrote:
 Jeff King p...@peff.net writes:
 
  [+cc Junio for cache-tree expertise]
  ...
  We never call reset_index now, because we handle it via diff.  We could
  call prime_cache_tree in this case, but I'm not sure if that is a good
  idea, because it primes it from scratch (and so it opens up all those
  trees that we are trying to avoid touching). I'm not sure if there's an
  easy way to update it incrementally; I don't know the cache-tree code
  very well.
 
 The cache-tree is designed to start in a well-populated state,
 allowing you to efficiently smudge the part you touched by
 invalidating while keeping the parts you haven't touched intact.

 As far as I can tell, the cache-tree does not in fact ever get into a
 well-populated state (that is, it does not exist at all) under ordinary
 git operation except by git reset --hard.  Perhaps this was already
 clear from the previous traffic on the thread, but I wanted to make sure
 Junio was also aware of this.

Yes.  As I said, that should not usually be a problem for those who
do the real work (read: commit), at which time write-tree will fully
populate the cache-tree.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 00/44] Use ref transactions for all ref updates

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 This patch series can also be found at
 https://github.com/rsahlberg/git/tree/ref-transactions

Thoughts on 65a1cb7b (2014-05-22 12:08):

 04/40 add a strbuf argument to ref_transaction_commit for error logging

 Ideally this would come after the functions it calls so the comment
 If error is non-NULL we will add an error string to it to explain
 why the transaction failed would be true already.  But rearranging
 the series for that would be more fuss than it's worth IMHO.

 The sanity check

int ref_transaction_commit(...)
{
int ret = ref_transaction_commit_real(...);
if (ret  err  !err-len)
die(BUG: ref_transaction_commit forgot to write an 
error message);
return ret;
}

 shows no instances of forgotten error messages in cases exercised by
 tests, at least.  And it's a step in the right direction.

 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 05/40 add an err argument to repack_without_refs
 unable_to_lock_strbuf could be called unable_to_lock_message (which
 would make its behavior more obvious imho).

 This makes errno meaningful when commit_packed_refs returns an error.
 Probably its API documentation should say so to make it obvious to
 people modifying it in the future to preserve that property or audit
 callers.

 Via the new call to unable_to_lock_..., repack_without_refs cares
 about errno after a failed call to lock_packed_refs.  lock_packed_refs
 can only fail in hold_lock_file_for_update.  hold_lock_file_for_update
 is a thin wrapper around lockfile.c::lock_file.  lock_file can error
 out because

strlen(path) = max_path_len
adjust_shared_perm failed [calls error(), clobbers errno]
open failed

 So lock_file needs a similar kind of fix, and it's probably worth
 updating API documentation for these calls to make it clear that
 their errno is used (though that's not a new problem since
 repack_without_refs already called unable_to_lock_error).  Could be
 a separate, earlier patch since it's fixing an existing bug.

 06/40 make ref_update_reject_duplicates take a strbuf argument for errors
 still Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 07/40 add an err argument to delete_ref_loose
 The new unlink_or_err has an odd contract when the err argument is passed.
 On error:

  * if errno != ENOENT, it will append a message to err and return -1 (good)
  * if errno == ENOENT, it will not append a message to err but will
still return -1.

 Perhaps it should return 0 when errno == ENOENT.  After all, in that
 case the file does not exist any more, which is all we wanted.  And it
 would save the caller from having to inspect errno.

 On failure we seem to add our own message to err, too, so the resulting
 message would be something like

fatal: unable to unlink .git/refs/heads/master: \
Permission deniedfailed to delete loose ref 
'.git/refs/heads/master.lock'

 The second strbuf_addf is probably not needed.

 08/40 make update_ref_write update a strbuf on failure
 still Reviewed-by: Jonathan Nieder jrnie...@gmail.com
 
 09/40 log transaction error from the update_ref
 No actual functional change intended, right?  I'd say something like
 update-ref: use err argument to get error from ref_transaction_commit
 or something similar to make it clearer that this is just about
 changing APIs.  Or if there's an intended functional change, then the
 commit message could say something about that.

 10/40 remove the onerr argument to ref_transaction_commit
 still Reviewed-by: Jonathan Nieder jrnie...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 Yes.  As I said, that should not usually be a problem for those who
 do the real work (read: commit), at which time write-tree will fully
 populate the cache-tree.

 Git commit does not in fact populate the cache-tree.

If that is the case, we must have broken the write-tree codepath
over time.

Of course, git commit foo will need to prepare a temporary index
where only the entry foo is different from the HEAD version, write
the tree to create a commit, but we do not write out the main index
as a tree after updating the same entry foo in it (there may be
other changes relative to HEAD), so its cache-tree is only
invalidated at foo when we updating the entry and we do not spend
extra cycles to repopulate it.

But at least my understanding has been that git commit (no partial
commit, write the whole index as a commit) which uses the git
write-tree machinery knows which subtree has what tree object name
and populates the cache-tree fully.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 David Turner dtur...@twopensource.com writes:

 Yes.  As I said, that should not usually be a problem for those who
 do the real work (read: commit), at which time write-tree will fully
 populate the cache-tree.

 Git commit does not in fact populate the cache-tree.

 If that is the case, we must have broken the write-tree codepath
 over time.

 Of course, git commit foo will need to prepare a temporary index
 where only the entry foo is different from the HEAD version, write
 the tree to create a commit, but we do not write out the main index
 as a tree after updating the same entry foo in it (there may be
 other changes relative to HEAD), so its cache-tree is only
 invalidated at foo when we updating the entry and we do not spend
 extra cycles to repopulate it.

 But at least my understanding has been that git commit (no partial
 commit, write the whole index as a commit) which uses the git
 write-tree machinery knows which subtree has what tree object name
 and populates the cache-tree fully.

... and the incrementally repair Peff talks about would be to
cover more cases where we may know (either because we have already
computed it to write out a subtree, or we have just read from a
known tree to populate a part of the index and we know the paths in
the index that correspond to that subtree are exactly the same ones
as found in the tree we read from) parts of the cache-tree can be
updated with tree object names for subtrees, but we don't do
anything right now.  git commit foo (where foo is a directory)
may be able to say The cache-tree entry for 'foo' can be updated
with the tree object of the new HEAD:foo because we know they must
match in the main index, for example.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 But at least my understanding has been that git commit (no partial
 commit, write the whole index as a commit) which uses the git
 write-tree machinery knows which subtree has what tree object name
 and populates the cache-tree fully.

Here is what I tried just now.

$ rm .git/index
$ git read-tree HEAD HEAD

Note that a single-tree read-tree will populate the cache-tree and
that is why I am forcing switch branches 2-way read-tree here,
which I know will discard the cache-tree fully.

$ ls -l .git/index
-rw-r- 1 jch eng 249440 May 22 15:20 .git/index
$ git checkout HEAD^0
$ ls -l .git/index
-rw-r- 1 jch eng 249440 May 22 15:21 .git/index

Still the same size, without cache-tree.

$ git write-tree
57361c4add61b638dad1c1c2542edf877f515c48
$ ls -l .git/index
-rw-r- 1 jch eng 254383 May 22 15:21 .git/index

The size differences come from the recomputation of the cache tree.
The result is the same if we replace git write-tree with a
whole-index commit, e.g.

$ git commit --allow-empty -m foo

and test-dump-cache-tree seem to see a fully populated cache-tree
after these steps.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] strbuf: add strbuf_tolower function

2014-05-22 Thread Kyle J. McKay

On May 22, 2014, at 11:41, Jeff King wrote:


On Thu, May 22, 2014 at 11:36:37AM -0700, Junio C Hamano wrote:

Yes, and that would be fine with me (I actually wrote  
strbuf_tolower for
my own use, and _then_ realized that we already had such a thing  
that

could be replaced).


Do we forbid that sb-buf[x] for some x  sb-len to be NUL, and if
there is such a byte we stop running tolower() on the remainder?


Christian brought this up elsewhere, and I agree it's probably  
better to
work over the whole buffer, NULs included. I'm happy to re-roll (or  
you

can just pick up the version of the patch in this thread),


The only reason I brought up the code difference in the first place  
was that the comment was This makes config's lowercase() function  
public which made me expect to see basically the equivalent of  
replacing a static with an extern, but actually the function being  
made public was a different implementation than config's lowercase()  
function.  So it looks like the original PATCH 2/9 version of the code  
is best, but perhaps the comment can be tweaked a bit to not convey  
the same impression I got.  Maybe something like Replace config's  
lowercase() function with a public one.


--Kyle
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 4/8] http: extract type/subtype portion of content-type

2014-05-22 Thread Kyle J. McKay


On May 22, 2014, at 02:29, Jeff King wrote:


When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.

This could cause two visible behaviors:

 1. We might fail to recognize a smart-http server by its
content-type.

 2. We might fail to relay text/plain error messages to
users (especially if they contain a charset parameter).

This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.

Signed-off-by: Jeff King p...@peff.net
---
http.c | 32 +---
remote-curl.c  |  2 +-
t/lib-httpd/error.sh   |  8 +++-
t/t5550-http-fetch-dumb.sh |  5 +
4 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/http.c b/http.c
index 94e1afd..4edf5b9 100644
--- a/http.c
+++ b/http.c
@@ -906,6 +906,29 @@ static CURLcode curlinfo_strbuf(CURL *curl,  
CURLINFO info, struct strbuf *buf)

return ret;
}

+/*
+ * Extract a normalized version of the content type, with any
+ * spaces suppressed, all letters lowercased, and no trailing ;
+ * or parameters.
+ *
+ * Example:
+ *   TEXT/PLAIN; charset=utf-8 - text/plain
+ */
+static void extract_content_type(struct strbuf *raw, struct strbuf  
*type)

+{
+   const char *p;
+
+   strbuf_reset(type);
+   strbuf_grow(type, raw-len);
+   for (p = raw-buf; *p; p++) {
+   if (isspace(*p))
+   continue;
+   if (*p == ';')
+   break;
+   strbuf_addch(type, tolower(*p));
+   }
+}
+


This will parse invalid content types as valid.  Probably not  
important since the producer of an invalid content type shouldn't be  
depending on any particular behavior by the consumer of such a type,  
but I think it warrants a note in the comment block, perhaps something  
like:


  * Note that an invalid content-type may be converted to a valid one

or some such.

--Kyle
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-22 Thread Ronnie Sahlberg
I hate rename_ref :-)


I have reworked the transaction code to special case the deletion of
the old ref for n/n - n  and n - n/n renames
so that we can carefully avoid n/n.lock files to exist or prevent the
directory - file transition for n during these renames.

This should allow us to have rename_ref becoming reasonably
implementation agnostic, aside for that it wants to lstat the ref to
see if it is a soft link, and re-use the code for other refs backends.


Please see the ref-transaction branch.


On Thu, May 22, 2014 at 12:12 PM, Ronnie Sahlberg sahlb...@google.com wrote:
 On Thu, May 22, 2014 at 11:17 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Ronnie Sahlberg wrote:

 Add a new flag REF_ISPACKONLY that we can use in ref_transaction_delete.
 This flag indicates that the ref does not exist as a loose ref andf only as
 a packed ref. If this is the case we then change the commit code so that
 we skip taking out a lock file and we skip calling delete_ref_loose.

 This seems wrong.  Can't someone else create a loose ref which will
 shadow the packed ref and break the serializability of updates to this
 ref?

 The above doesn't explain why we want to avoid locking the loose ref,
 but I assume it's for the sake of the git branch -m foo/bar foo
 case.  For that case, wouldn't the following sequence of filesystem
 operations work?

 - create $GIT_DIR/refs/heads/foo/bar.lock
 - create $GIT_DIR/refs/heads/foo.lock
 - if $GIT_DIR/refs/heads/foo/bar exists, add the ref to
   packed-refs (using the usual lockfile-update mechanism)
   and unlink $GIT_DIR/refs/heads/foo/bar
 - verify that current foo and foo/bar state are okay.  If
   not, roll back.
 - unlink $GIT_DIR/refs/heads/foo/bar.lock
 - rmdir $GIT_DIR/refs/heads/foo
 - rename $GIT_DIR/refs/heads/foo.lock into place

 Or:

 - create $GIT_DIR/refs/heads/foo/bar.lock
 - create $GIT_DIR/refs/heads/foo.lock
 - verify state of all relevant refs

 - update packed-refs to remove refs/heads/foo/bar and
   add refs/heads/foo

 - unlink $GIT_DIR/refs/heads/foo/bar
 - unlink $GIT_DIR/refs/heads/foo
 - unlink $GIT_DIR/refs/heads/foo/bar.lock
 - unlink $GIT_DIR/refs/heads/foo.lock



 I removed all the rename_ref related patches for now. rename_ref is
 probably best implemented specifically for each backend anyway.

 I will still produce a separate patch that will do something like this
 you suggested
 (since rename_ref is still racy and fragile)

 - create $GIT_DIR/refs/heads/foo/bar.lock
 - create $GIT_DIR/refs/heads/foo.lock
 - verify state of all relevant refs

 - update packed-refs to remove refs/heads/foo/bar and
   add refs/heads/foo

 - unlink $GIT_DIR/refs/heads/foo/bar
 - unlink $GIT_DIR/refs/heads/foo
 - unlink $GIT_DIR/refs/heads/foo/bar.lock
 - unlink $GIT_DIR/refs/heads/foo.lock


 Thanks
 ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Thu, 2014-05-22 at 15:29 -0700, Junio C Hamano wrote:
 Junio C Hamano gits...@pobox.com writes:
 
  But at least my understanding has been that git commit (no partial
  commit, write the whole index as a commit) which uses the git
  write-tree machinery knows which subtree has what tree object name
  and populates the cache-tree fully.
 
 Here is what I tried just now.
 
 $ rm .git/index
 $ git read-tree HEAD HEAD
 
 Note that a single-tree read-tree will populate the cache-tree and
 that is why I am forcing switch branches 2-way read-tree here,
 which I know will discard the cache-tree fully.
 
 $ ls -l .git/index
 -rw-r- 1 jch eng 249440 May 22 15:20 .git/index
 $ git checkout HEAD^0
 $ ls -l .git/index
 -rw-r- 1 jch eng 249440 May 22 15:21 .git/index
 
 Still the same size, without cache-tree.
 
 $ git write-tree
 57361c4add61b638dad1c1c2542edf877f515c48
 $ ls -l .git/index
 -rw-r- 1 jch eng 254383 May 22 15:21 .git/index
 
 The size differences come from the recomputation of the cache tree.
 The result is the same if we replace git write-tree with a
 whole-index commit, e.g.
 
 $ git commit --allow-empty -m foo
 
 and test-dump-cache-tree seem to see a fully populated cache-tree
 after these steps.

I get the same results as you with git write-tree.  But I do not get the
same results from a whole-index git commit (I tried your exact
command-line).  That is, when I do git commit with no cache-tree in
place, it does not create one. 

To expand: even if git commit did work for me the way it seems to work
for you, I still believe that the cache-tree behavior would be
suboptimal, because every time a user switches branches, they lose their
cache-tree, and thus all of their git status commands are slow until
their first commit. But I am willing to believe that my workflow is
atypical, and that most people commit enough soon after switching
branches.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 00/44] Use ref transactions for all ref updates

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 This patch series can also be found at
 https://github.com/rsahlberg/git/tree/ref-transactions

Continuing with the review of 65a1cb7b (2014-05-22 12:08):

 11/40 change ref_transaction_update() to do error checking and return status
 The there will be in the future sounds ominous.  Do you have an
 example in mind?  E.g., I suppose it would be nice if _update could
 notice D/F conflicts or connection to a database server closing early,
 but it's not clear to me whether the kind of errors you're talking
 about are that or something else.

 With or without such a clarification,
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 12/40 change ref_transaction_create to do error checking and return status
 What does On failure the err buffer will be updated mean?  Will
 it clear err and replace it with a message, append to err, or
 something else?  Does the message explain the context or is the
 caller responsible for adding to it?  Does the message end with a
 newline or is the caller responsible for adding one when printing it
 out?

 For cases like this where lots of functions have a similar API,
 API comments start to become potentially repetitive.  It might be
 better to explain conventions at the top of the file or in
 Documentation/technical/api-refs.txt and say See the top of the
 file for error handling conventions or Returns non-zero and
 appends a message to err on error.  See
 Documentation/technical/api-refs.txt for more details on error
 handling.

 13/40 ref_transaction_delete to check for error and return status
 Each successive commit dropped something from its subject. :)
 (First the (), then the verb.)

 Same comments as before about an example being useful for the
 log message and the API documentation on error handling being a
 bit vague.

 14/40 make ref_transaction_begin take an err argument
 I found the failed to connect to mysql example instructive while
 doing reviews.  Perhaps it would be worth mentioning in the commit
 message.

 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 15/40 add transaction.status and track OPEN/CLOSED/ERROR
 It says an ERRORed transaction cannot be committed and can be rolled
 back by calling _free.  Can a CLOSED transaction be committed or
 _freed?

 What does faild mean in the documentation comments?  (Maybe
 non-OPEN?)

 In the previous version of this patch passing a non-OPEN transaction
 would die(BUG: ...) to diagnose the caller's mistake.  Now I'm
 confused about the API: it seems you're allowed to pass a non-OPEN
 transaction but it doesn't append a message to 'err' in that case.
 Is this meant as a way to save the caller some typing, like
 fwrite/fclose do?  (I've found people often make mistakes with the
 fwrite API fwiw but can understand the appeal of it.)

 Maybe with more context I'd like this.  As is, it feels like a step
 in the wrong direction.

 16/40 tag: use ref transactions when doing updates
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 17/40 replace: use ref transactions when doing updates
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 18/40 commit: use ref transactions for updates
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com
 
 19/40 sequencer: use ref transactions for all ref updates
 This would be a lot simpler if the ref_transaction_commit should not
 free the transaction patch came before it (yes, sorry, killing the
 fun).  I can push the result of a rebase doing that somewhere if you
 like.

 20/40 fast-import: change update_branch to use ref transactions
 Likewise.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 ... I still believe that the cache-tree behavior would be
 suboptimal, ...

I do not think anybody doubts that suboptimal-ness in this thread.
As you saw the incremental thing from Peff and my responses to it,
there may be more things we could be doing.  It just has not been at
a ultra high priority, and if we can choose only one change from
possibilities, losing the entire cache-tree upon switching branches,
like in my two-way read-tree example, would be the thing that would
give us the most benefit if we somehow change it.

That however is unfortunately not a low-hanging fruit.  The two-way
merge machinery we use for switching branches wants to populate the
index one entry at a time, without having any point where you can
say OK the result in this subdirectory will exactly match this
subtree which would allow us to say prime the cache tree for that
subdirectory with this tree object.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: add strbuf_tolower function

2014-05-22 Thread Kyle J. McKay

On May 22, 2014, at 06:42, Jeff King wrote:


[re-adding list cc]

On Thu, May 22, 2014 at 03:16:45PM +0200, Christian Couder wrote:


+void strbuf_tolower(struct strbuf *sb)
+{
+   char *p;
+   for (p = sb-buf; *p; p++)
+   *p = tolower(*p);
+}


Last time I tried a change like the above, I was told that strbufs  
are

buffers that can contain NUL bytes. So maybe it should be:

  for (p = sb-buf; p  sb-buf + sb-len; p++)
 *p = tolower(*p);


Hah. I wrote it like that originally, and in review was asked to  
switch
it. I agree that it might be worth keeping strbuf functions 8bit- 
clean.

The original intent of the strbuf code was to make C strings easier to
use, but I think we sometimes use them as general buffers, too.



I didn't see this patch before I sent the other email, but this is the  
relevant part:


On May 22, 2014, at 15:52, Kyle J. McKay wrote:
The only reason I brought up the code difference in the first place  
was that the comment was This makes config's lowercase() function  
public which made me expect to see basically the equivalent of  
replacing a static with an extern, but actually the function  
being made public was a different implementation than config's  
lowercase() function.  So it looks like the original PATCH 2/9  
version of the code is best, but perhaps the comment can be tweaked  
a bit to not convey the same impression I got.  Maybe something like  
Replace config's lowercase() function with a public one.



Or even just delete the This makes config's lowercase() function  
public line after switching the implementation back.


--Kyle
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread Duy Nguyen
On Fri, May 23, 2014 at 5:18 AM, Junio C Hamano gits...@pobox.com wrote:
 ... and the incrementally repair Peff talks about would be to
 cover more cases where we may know (either because we have already
 computed it to write out a subtree, or we have just read from a
 known tree to populate a part of the index and we know the paths in
 the index that correspond to that subtree are exactly the same ones
 as found in the tree we read from) parts of the cache-tree can be
 updated with tree object names for subtrees, but we don't do
 anything right now.

I wanted to do this but it's hard. For diff --cached (which should
be where we find out and repair cache-tree), we flatten the trees in
traverse_trees() and let unpack-trees figure out the differences
against the index. The's no direct connection between a change and a
tree. Maybe I missed something. David if you are interested in git
status performance, this repairing thing could be important when the
worktree is large because the diff cost increases accordingly if
cache-tree is not fully populated. Empty cache-tree could cost us
nearly the same time we save with avoiding opendir/readdir..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git reset for index restoration?

2014-05-22 Thread David Turner
On Fri, 2014-05-23 at 06:33 +0700, Duy Nguyen wrote:
 On Fri, May 23, 2014 at 5:18 AM, Junio C Hamano gits...@pobox.com wrote:
  ... and the incrementally repair Peff talks about would be to
  cover more cases where we may know (either because we have already
  computed it to write out a subtree, or we have just read from a
  known tree to populate a part of the index and we know the paths in
  the index that correspond to that subtree are exactly the same ones
  as found in the tree we read from) parts of the cache-tree can be
  updated with tree object names for subtrees, but we don't do
  anything right now.
 
 I wanted to do this but it's hard. For diff --cached (which should
 be where we find out and repair cache-tree), we flatten the trees in
 traverse_trees() and let unpack-trees figure out the differences
 against the index. The's no direct connection between a change and a
 tree. Maybe I missed something. David if you are interested in git
 status performance, this repairing thing could be important when the
 worktree is large because the diff cost increases accordingly if
 cache-tree is not fully populated. Empty cache-tree could cost us
 nearly the same time we save with avoiding opendir/readdir..

I am interested, and I believe I might be able to start looking into it
in a week or two.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Get rid of the non portable shell export VAR=VALUE costruct

2014-05-22 Thread Junio C Hamano
Elia Pinto gitter.spi...@gmail.com writes:

 I have no problems rerolling  this simple patch, but i need to know
 what is the (git) right style in this case.

If I were doing this...

diff --git a/contrib/subtree/t/t7900-subtree.sh 
b/contrib/subtree/t/t7900-subtree.sh
index 66ce4b0..c1d0b23 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -8,7 +8,7 @@ This test verifies the basic operation of the merge, pull, add
 and split subcommands of git subtree.
 '
 
-export TEST_DIRECTORY=$(pwd)/../../../t
+TEST_DIRECTORY=$(pwd)/../../../t  export TEST_DIRECTORY

... I'd say I would make this two lines without , i.e.

TEST_DIRECTORY=$(pwd)/../../../t
export TEST_DIRECTORY

because we are not doing anything about a failure of $(pwd), other
than skipping the export.  If we were failing the entire thing, it
would be a different story, but at this point of the test, it is
unreasonable to do something like:

TEST_DIRECTORY=$(pwd)/../../../t 
export TEST_DIRECTORY || exit $?

anyway.
 
diff --git a/git-remote-testgit.sh b/git-remote-testgit.sh
index 1c006a0..9d1ce76 100755
--- a/git-remote-testgit.sh
+++ b/git-remote-testgit.sh
@@ -13,7 +13,7 @@ refspec=${GIT_REMOTE_TESTGIT_REFSPEC-$default_refspec}
 
 test -z $refspec  prefix=refs
 
-export GIT_DIR=$url/.git
+GIT_DIR=$url/.git  export GIT_DIR
 
 force=

Similarly, as we do not do anything if the assignment to GIT_DIR
fails here, just like we ignore any failure from assignment to
force,

GIT_DIR=$url/.git
export GIT_DIR

would be sufficient.

The logic before the mkdir -p of this script may want to be
cleaned up (can you tell how if $refspec is empty, set prefix=refs
makes any sense, and what its implications are, without thinking for
more than 10 seconds?), but that is clearly a separate topic [*1*].


diff --git a/git-stash.sh b/git-stash.sh
index 4798bcf..0bb1af8 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -94,7 +94,7 @@ create_stash () {
# ease of unpacking later.
u_commit=$(
untracked_files | (
-   export GIT_INDEX_FILE=$TMPindex
+   GIT_INDEX_FILE=$TMPindex  export 
GIT_INDEX_FILE 
rm -f $TMPindex 
git update-index -z --add --remove --stdin 
u_tree=$(git write-tree) 

This one does care about failures.  I'd rather have these on
separate lines, i.e.

GIT_INDEX_FILE=$TMPindex 
export GIT_INDEX_FILE 
rm -f $TMPindex 
...


[Footnote]

*1* Perhaps something along these lines with the exact placements of
blank lines to group similar things together:

alias=$1 url=$2

if test -z ${GIT_REMOTE_TESTGIT_REFSPEC-notEmpty}
then
# only if it is explicitly set to an empty string...
prefix=refs
else
prefix=refs/testgit/$alias
fi
dir=$GIT_DIR/testgit/$alias
default_refspec=refs/heads/*:${prefix}/heads/*
refspec=${GIT_REMOTE_TESTGIT_REFSPEC-$default_refspec}
force=

GIT_DIR=$url/.git
export GIT_DIR
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-22 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 I hate rename_ref :-)

 I have reworked the transaction code to special case the deletion of
 the old ref for n/n - n  and n - n/n renames
 so that we can carefully avoid n/n.lock files to exist or prevent the
 directory - file transition for n during these renames.

Thanks.

I suspect the REF_ISRENAME flag shouldn't be needed.  Wouldn't
something like the following work (in _commit)?

Allocate work space
Copy sort, and reject duplicate refs
Acquire all locks while verifying old values
This calls is_refname_available.
If a refname is unavailable, goto slowpath.
Perform updates first so live commits remain referenced.
Perform deletes now that updates are safely completed.
Done.

 slowpath:
Acquire locks, telling is_refname_available not to worry
about deleted refs.
Lock packed-refs.
Add all relevant refs to packed-refs (pack_if_possible_fn).
Commit packed-refs.
Unlink the corresponding loose refs so packed-refs
becomes authoritative for them.
Lock packed-refs.
Perform updates and removals in the packed-refs cache.
Commit packed-refs.
Release locks.
Done.

This wouldn't be any slower for the case without D/F conflicts, and in
the D/F conflict case, it should work for arbitrary transactions
that want to remove one ref to make room for another.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-22 Thread Jonathan Nieder
Jonathan Nieder wrote:
 Ronnie Sahlberg wrote:

 I hate rename_ref :-)

 I have reworked the transaction code to special case the deletion of
 the old ref for n/n - n  and n - n/n renames
 so that we can carefully avoid n/n.lock files to exist or prevent the
 directory - file transition for n during these renames.
[...]
   Unlink the corresponding loose refs so packed-refs
   becomes authoritative for them.
   Lock packed-refs.
   Perform updates and removals in the packed-refs cache.
   Commit packed-refs.

... or is the problem that the reflogs conflict?

How does rename_ref handle propagating the reflog from the old
name to the new name, by the way?

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] completion: add missing options for git-merge

2014-05-22 Thread Junio C Hamano
John Keeping j...@keeping.me.uk writes:

 The options added to __git_merge_options are those that git-pull also
 understands, since that variable is used by both commands.  Those added
 directly in _git_merge() are specific to git-merge and are not
 supported by git-pull.

Interesting.

Technically, are not passed through would be more accurate than
are not supported, as there may be a very good reason not to call
git merge --frotz from git pull --frotz=nitfol.  In such a case,
git pull would _support_ the --frotz option, but in a different
way and possibly with semantics different from git merge --frotz
(i.e. option with value vs boolean), and completion of the
--frotz=value option may need to be supported for git pull
separately without using $__git_merge_options.

The patch makes us think if --[no-]rerere-autoupdate should be a
pass-thru option.  You are unlikely to have seen the same conflict
that will arise from a pull from another person for autoupdate to
matter, but on the other hand, if you do have seen one and resolved
it already, you may want the autoupdate to kick in, so accepting and
passing it through may not be an unreasonable thing to do.

As long as we declare that git pull --abort does not make any
sense (which I think is a sane declaration), the distinction you
made in [PATCH 1/2] will always be with us, whether we make
--rerere-autoupdate pass-thru or not, so in that sense, both
patches make sense to me.

Thanks.


 Reported-by: Haralan Dobrev hkdob...@gmail.com
 Signed-off-by: John Keeping j...@keeping.me.uk
 ---
  contrib/completion/git-completion.bash | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

 diff --git a/contrib/completion/git-completion.bash 
 b/contrib/completion/git-completion.bash
 index ff97c20..019026e 100644
 --- a/contrib/completion/git-completion.bash
 +++ b/contrib/completion/git-completion.bash
 @@ -1476,6 +1476,8 @@ _git_log ()
  __git_merge_options=
   --no-commit --no-stat --log --no-log --squash --strategy
   --commit --stat --no-squash --ff --no-ff --ff-only --edit --no-edit
 + --verify-signatures --no-verify-signatures --gpg-sign
 + --quiet --verbose --progress --no-progress
  
  
  _git_merge ()
 @@ -1484,7 +1486,8 @@ _git_merge ()
  
   case $cur in
   --*)
 - __gitcomp $__git_merge_options
 + __gitcomp $__git_merge_options
 + --rerere-autoupdate --no-rerere-autoupdate --abort
   return
   esac
   __gitcomp_nl $(__git_refs)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[no subject]

2014-05-22 Thread Mrs. Jiang Ming



-- 
Compliment of the day,

I am Mrs. Jiang Ming, a staff of Lloyds TSB Group Plc. here in
Hong Kong attached with Private Banking Services;I have a secured business
proposal for you. Should you be interested please reach me on my private
emailaddress (mrsjiangming1...@outlook.com) And after that I shall provide
you with
more details of my proposal.
Your earliest response to this letter will be appreciated.

Yours Sincerely,
Jiang Ming

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 8/8] http: default text charset to iso-8859-1

2014-05-22 Thread brian m. carlson
On Thu, May 22, 2014 at 05:36:12AM -0400, Jeff King wrote:
 If we do want to do magic like latin1 is really iso-8859-1, that seems
 like the domain of iconv to me. If iconv doesn't handle it itself, I'd
 rather have a wrapper there. Putting it at that layer keeps the code
 cleaner, and it means the wrapper would benefit the regular commit-log
 reencoding code.

I think being a little stricter in our character encoding actually
benefits users.  If someone claims that all their commit messages are in
US-ASCII or ISO-8859-1, and then stuffs Windows-1252 in there, that's
going to break a lot of stuff, especially if someone assumes US-ASCII
means it's okay to use it where UTF-8 is required.

It's much better to let people not insert broken stuff in the first
place rather than deal with it afterwards.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187


signature.asc
Description: Digital signature


Re: [PATCH v2] Add an explicit GIT_DIR to the list of excludes

2014-05-22 Thread Eric Sunshine
On Thu, May 22, 2014 at 4:11 PM, Pasha Bolokhov
pasha.bolok...@gmail.com wrote:
 diff --git a/t/t2205-add-gitdir.sh b/t/t2205-add-gitdir.sh
 new file mode 100755
 index 000..3c6b853
 --- /dev/null
 +++ b/t/t2205-add-gitdir.sh
 @@ -0,0 +1,61 @@
 +#!/bin/sh
 +#
 +# Copyright (c) 2014 Pasha Bolokhov
 +#
 +
 +test_description='alternative repository path specified by --git-dir is 
 ignored by add and status'
 +
 +. ./test-lib.sh
 +
 +test_expect_success setup '
 +   git --git-dir=meta init 
 +   for f in a b c d
 +   do
 +   echo DATA  $f || exit 1

On this project, it's customary to say 'foo bar' (no whitespace after
''). Ditto for the rest of the file.

 +   done 
 +   mkdir subdir1 
 +   for f in e f g h
 +   do
 +   echo MORE DATA  subdir1/$f || exit 1
 +   done 
 +   mkdir subdir1/meta 
 +   echo EVEN more Data  subdir1/meta/aa 
 +   mkdir subdir1/ssubdir subdir1/ssubdir/meta 
 +   echo So much more data  subdir1/ssubdir/meta/aaa
 +'
 +
 +test_expect_success 'git status' acknowledges files 'meta' if repository is 
 not within work tree '
 +   test_might_fail rm -rf meta/ 
 +   (
 +   cd subdir1 
 +   git --git-dir=../meta init

Broken -chain.

 +   git --git-dir=../meta --work-tree=. status --porcelain  
 status2.out 
 +   test_might_fail grep meta status2.out  out2 
 +   test -s out2
 +   )
 +'
 +
 +test_done
 --
 1.9.1
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v1 1/3] replace: add --graft option

2014-05-22 Thread Christian Couder
The usage string for this option is:

git replace [-f] --graft commit [parent...]

First we create a new commit that is the same as commit
except that its parents are [parents...]

Then we create a replace ref that replace commit with
the commit we just created.

With this new option, it should be straightforward to
convert grafts to replace refs, with something like:

cat .git/info/grafts | while read line
do git replace --graft $line; done

Signed-off-by: Christian Couder chrisc...@tuxfamily.org
---
 builtin/replace.c | 84 ++-
 1 file changed, 83 insertions(+), 1 deletion(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index 1bb491d..9d1e82f 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -17,6 +17,7 @@
 static const char * const git_replace_usage[] = {
N_(git replace [-f] object replacement),
N_(git replace [-f] --edit object),
+   N_(git replace [-f] --graft commit [parent...]),
N_(git replace -d object...),
N_(git replace [--format=format] [-l [pattern]]),
NULL
@@ -294,6 +295,76 @@ static int edit_and_replace(const char *object_ref, int 
force)
return replace_object_sha1(object_ref, old, replacement, new, force);
 }
 
+static int read_sha1_commit(const unsigned char *sha1, struct strbuf *dst)
+{
+   void *buf;
+   enum object_type type;
+   unsigned long size;
+
+   buf = read_sha1_file(sha1, type, size);
+   if (!buf)
+   return error(_(cannot read object %s), sha1_to_hex(sha1));
+   if (type != OBJ_COMMIT) {
+   free(buf);
+   return error(_(object %s is not a commit), sha1_to_hex(sha1));
+   }
+   strbuf_attach(dst, buf, size, size + 1);
+   return 0;
+}
+
+static int create_graft(int argc, const char **argv, int force)
+{
+   unsigned char old[20], new[20];
+   const char *old_ref = argv[0];
+   struct commit *commit;
+   struct strbuf buf = STRBUF_INIT;
+   struct strbuf new_parents = STRBUF_INIT;
+   const char *parent_start, *parent_end;
+   int i;
+
+   if (get_sha1(old_ref, old)  0)
+   die(_(Not a valid object name: '%s'), old_ref);
+   commit = lookup_commit_or_die(old, old_ref);
+   if (read_sha1_commit(old, buf))
+   die(_(Invalid commit: '%s'), old_ref);
+
+   /* make sure the commit is not corrupt */
+   if (parse_commit_buffer(commit, buf.buf, buf.len))
+   die(_(Could not parse commit: '%s'), old_ref);
+
+   /* find existing parents */
+   parent_start = buf.buf;
+   parent_start += 46; /* tree  + hex sha1 + \n */
+   parent_end = parent_start;
+
+   while (starts_with(parent_end, parent ))
+   parent_end += 48; /* parent  + hex sha1 + \n */
+
+   /* prepare new parents */
+   for (i = 1; i  argc; i++) {
+   unsigned char sha1[20];
+   if (get_sha1(argv[i], sha1)  0)
+   die(_(Not a valid object name: '%s'), argv[i]);
+   lookup_commit_or_die(sha1, argv[i]);
+   strbuf_addf(new_parents, parent %s\n, sha1_to_hex(sha1));
+   }
+
+   /* replace existing parents with new ones */
+   strbuf_splice(buf, parent_start - buf.buf, parent_end - parent_start,
+ new_parents.buf, new_parents.len);
+
+   if (write_sha1_file(buf.buf, buf.len, commit_type, new))
+   die(_(could not write replacement commit for: '%s'), old_ref);
+
+   strbuf_release(new_parents);
+   strbuf_release(buf);
+
+   if (!hashcmp(old, new))
+   return error(new commit is the same as the old one: '%s', 
sha1_to_hex(old));
+
+   return replace_object_sha1(old_ref, old, replacement, new, force);
+}
+
 int cmd_replace(int argc, const char **argv, const char *prefix)
 {
int force = 0;
@@ -303,12 +374,14 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
MODE_LIST,
MODE_DELETE,
MODE_EDIT,
+   MODE_GRAFT,
MODE_REPLACE
} cmdmode = MODE_UNSPECIFIED;
struct option options[] = {
OPT_CMDMODE('l', list, cmdmode, N_(list replace refs), 
MODE_LIST),
OPT_CMDMODE('d', delete, cmdmode, N_(delete replace refs), 
MODE_DELETE),
OPT_CMDMODE('e', edit, cmdmode, N_(edit existing object), 
MODE_EDIT),
+   OPT_CMDMODE('g', graft, cmdmode, N_(change a commit's 
parents), MODE_GRAFT),
OPT_BOOL('f', force, force, N_(replace the ref if it 
exists)),
OPT_STRING(0, format, format, N_(format), N_(use this 
format)),
OPT_END()
@@ -325,7 +398,10 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
usage_msg_opt(--format cannot be used when not listing,
  git_replace_usage, options);
 
-   if (force  cmdmode != MODE_REPLACE 

[PATCH v1 0/3] Add --graft option to git replace

2014-05-22 Thread Christian Couder
Here is a small patch series to implement:

git replace [-f] --graft commit [parent...]

The changes since the RFC/PATCH are the following:

- in patch 1/3, parse_commit_buffer() is now used to
  make sure commit is not corrupt
- patch 2/3 add some tests
- patch 3/3 add some documentation

About the documentation, maybe we should add that --graft
can now be used instead of grafts in .git/info/grafts,
and maybe we could add an example that shows how it can
be done.

Christian Couder (3):
  replace: add --graft option
  replace: add test for --graft
  Documentation: replace: add --graft option

 Documentation/git-replace.txt |  8 +
 builtin/replace.c | 84 ++-
 t/t6050-replace.sh| 12 +++
 3 files changed, 103 insertions(+), 1 deletion(-)

-- 
1.9.rc0.17.g651113e

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v1 2/3] replace: add test for --graft

2014-05-22 Thread Christian Couder
Signed-off-by: Christian Couder chrisc...@tuxfamily.org
---
 t/t6050-replace.sh | 12 
 1 file changed, 12 insertions(+)

diff --git a/t/t6050-replace.sh b/t/t6050-replace.sh
index 68b3cb2..ca45a84 100755
--- a/t/t6050-replace.sh
+++ b/t/t6050-replace.sh
@@ -351,4 +351,16 @@ test_expect_success 'replace ref cleanup' '
test -z $(git replace)
 '
 
+test_expect_success '--graft with and without already replaced object' '
+   test $(git log --oneline | wc -l) = 7 
+   git replace --graft $HASH5 
+   test $(git log --oneline | wc -l) = 3 
+   git cat-file -p $HASH5 | test_must_fail grep parent 
+   test_must_fail git replace --graft $HASH5 $HASH4 $HASH3 
+   git replace --force -g $HASH5 $HASH4 $HASH3 
+   git cat-file -p $HASH5 | grep parent $HASH4 
+   git cat-file -p $HASH5 | grep parent $HASH3 
+   git replace -d $HASH5
+'
+
 test_done
-- 
1.9.rc0.17.g651113e


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html