[PATCH 0/5] replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
From signal(2)

  The behavior of signal() varies across UNIX versions, and has also var‐
  ied historically across different versions of Linux.   Avoid  its  use:
  use sigaction(2) instead.  See Portability below.

This patch set replaces calls to signal() with sigaction() in all files
except sigchain.c.  sigchain.c is a bit more complicated than the others
and will be done in a separate patch.

Jeremiah Mahler (5):
  progress.c: replace signal() with sigaction()
  daemon.c run_service(): replace signal() with sigaction()
  daemon.c child_handler(): replace signal() with sigaction()
  daemon.c service_loop(): replace signal() with sigaction()
  connect.c: replace signal() with sigaction()

 connect.c  |  5 -
 daemon.c   | 15 ---
 progress.c |  6 +-
 3 files changed, 21 insertions(+), 5 deletions(-)

-- 
2.0.0.rc4.6.g127602c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/5] daemon.c service_loop(): replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
From signal(2)

  The behavior of signal() varies across UNIX versions, and has also var‐
  ied historically across different versions of Linux.   Avoid  its  use:
  use sigaction(2) instead.  See Portability below.

Replaced signal() with sigaction() in daemon.c service_loop()

Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
---
 daemon.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/daemon.c b/daemon.c
index 3c48dc0..8c28b3d 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1002,6 +1002,7 @@ static int service_loop(struct socketlist *socklist)
 {
struct pollfd *pfd;
int i;
+   struct sigaction sa;
 
pfd = xcalloc(socklist-nr, sizeof(struct pollfd));
 
@@ -1010,7 +1011,9 @@ static int service_loop(struct socketlist *socklist)
pfd[i].events = POLLIN;
}
 
-   signal(SIGCHLD, child_handler);
+   memset(sa, 0, sizeof(sa));
+   sa.sa_handler = child_handler;
+   sigaction(SIGCHLD, sa, 0);
 
for (;;) {
int i;
-- 
2.0.0.rc4.6.g127602c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/5] daemon.c child_handler(): replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
From signal(2)

  The behavior of signal() varies across UNIX versions, and has also var‐
  ied historically across different versions of Linux.   Avoid  its  use:
  use sigaction(2) instead.  See Portability below.

Replaced signal() with sigaction() in daemon.c child_handler()

Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
---
 daemon.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/daemon.c b/daemon.c
index 78c4a1c..3c48dc0 100644
--- a/daemon.c
+++ b/daemon.c
@@ -791,12 +791,15 @@ static void handle(int incoming, struct sockaddr *addr, 
socklen_t addrlen)
 
 static void child_handler(int signo)
 {
+   struct sigaction sa;
/*
 * Otherwise empty handler because systemcalls will get interrupted
 * upon signal receipt
 * SysV needs the handler to be rearmed
 */
-   signal(SIGCHLD, child_handler);
+   memset(sa, 0, sizeof(sa));
+   sa.sa_handler = child_handler;
+   sigaction(SIGCHLD, sa, 0);
 }
 
 static int set_reuse_addr(int sockfd)
-- 
2.0.0.rc4.6.g127602c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/5] connect.c: replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
From signal(2)

  The behavior of signal() varies across UNIX versions, and has also var‐
  ied historically across different versions of Linux.   Avoid  its  use:
  use sigaction(2) instead.  See Portability below.

Replaced signal() with sigaction() in connect.c

Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
---
 connect.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index a983d06..b2a33c9 100644
--- a/connect.c
+++ b/connect.c
@@ -665,11 +665,14 @@ struct child_process *git_connect(int fd[2], const char 
*url,
enum protocol protocol;
const char **arg;
struct strbuf cmd = STRBUF_INIT;
+   struct sigaction sa;
 
/* Without this we cannot rely on waitpid() to tell
 * what happened to our children.
 */
-   signal(SIGCHLD, SIG_DFL);
+   memset(sa, 0, sizeof(sa));
+   sa.sa_handler = SIG_DFL;
+   sigaction(SIGCHLD, sa, 0);
 
protocol = parse_connect_url(url, hostandport, path);
if (flags  CONNECT_DIAG_URL) {
-- 
2.0.0.rc4.6.g127602c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
From signal(2)

  The behavior of signal() varies across UNIX versions, and has also var‐
  ied historically across different versions of Linux.   Avoid  its  use:
  use sigaction(2) instead.  See Portability below.

Replaced signal() with sigaction() in progress.c

Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
---
 progress.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 261314e..24df263 100644
--- a/progress.c
+++ b/progress.c
@@ -66,8 +66,12 @@ static void set_progress_signal(void)
 static void clear_progress_signal(void)
 {
struct itimerval v = {{0,},};
+   struct sigaction sa;
+
+   memset(sa, 0, sizeof(sa));
+   sa.sa_handler = SIG_IGN;
setitimer(ITIMER_REAL, v, NULL);
-   signal(SIGALRM, SIG_IGN);
+   sigaction(SIGALRM, sa, 0);
progress_update = 0;
 }
 
-- 
2.0.0.rc4.6.g127602c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/5] daemon.c run_service(): replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
From signal(2)

  The behavior of signal() varies across UNIX versions, and has also var‐
  ied historically across different versions of Linux.   Avoid  its  use:
  use sigaction(2) instead.  See Portability below.

Replaced signal() with sigaction() in daemon.c run_service()

Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
---
 daemon.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/daemon.c b/daemon.c
index eba1255..78c4a1c 100644
--- a/daemon.c
+++ b/daemon.c
@@ -322,6 +322,7 @@ static int run_service(char *dir, struct daemon_service 
*service)
 {
const char *path;
int enabled = service-enabled;
+   struct sigaction sa;
 
loginfo(Request %s for '%s', service-name, dir);
 
@@ -376,7 +377,9 @@ static int run_service(char *dir, struct daemon_service 
*service)
 * We'll ignore SIGTERM from now on, we have a
 * good client.
 */
-   signal(SIGTERM, SIG_IGN);
+   memset(sa, 0, sizeof(sa));
+   sa.sa_handler = SIG_IGN;
+   sigaction(SIGTERM, sa, 0);
 
return service-fn();
 }
-- 
2.0.0.rc4.6.g127602c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Summary of the problems with git pull

2014-05-28 Thread Philippe Vaucher
 Felipe Contreras wrote:
 == git update ==

 Another proposed solution is to have a new command: `git update`. This
 command would be similar to `git pull --ff-only` by default, but it
 could be configured to do merges instead, and when doing so in reverse.

 And here it is:

 https://github.com/felipec/git/commits/fc/update

 Here's the documentation:

 https://github.com/felipec/git/wiki/git-update

 Works exactly as expected: non-fast-forwards are rejected by default,
 but can be configured, and parents are reversed when merging.

I know patches from Felipe are sensitive subject and that a lot of
people probably simply ignore him entirely, however I think this patch
deserves more attention.

From what I read, it seems to really improve the pull issues and is
more or less in line with what Junio suggested what should be done in
[1]. So the goal of this mail is just to make sure people consider it
for inclusion.

Sorry if I missed a thread where it was already decided not to include it.

Felipe, please don't use this to start any non-constructive behavior
(rant on who is right/wrong, my patches are not accepted, etc).

Philippe

[1] http://article.gmane.org/gmane.comp.version-control.git/248258
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] replace signal() with sigaction()

2014-05-28 Thread Johannes Sixt
Am 5/28/2014 8:14, schrieb Jeremiah Mahler:
 From signal(2)
 
   The behavior of signal() varies across UNIX versions, and has also var‐
   ied historically across different versions of Linux.   Avoid  its  use:
   use sigaction(2) instead.  See Portability below.
 
 This patch set replaces calls to signal() with sigaction() in all files
 except sigchain.c.  sigchain.c is a bit more complicated than the others
 and will be done in a separate patch.

In compat/mingw.c we have:

int sigaction(int sig, struct sigaction *in, struct sigaction *out)
{
if (sig != SIGALRM)
return errno = EINVAL,
error(sigaction only implemented for SIGALRM);
if (out != NULL)
return errno = EINVAL,
error(sigaction: param 3 != NULL not implemented);

timer_fn = in-sa_handler;
return 0;
}

Notice only implemented for SIGALRM. Are adding the missing signals
somewhere (here or in a later patch)?

 Jeremiah Mahler (5):
   progress.c: replace signal() with sigaction()
   daemon.c run_service(): replace signal() with sigaction()
   daemon.c child_handler(): replace signal() with sigaction()
   daemon.c service_loop(): replace signal() with sigaction()
   connect.c: replace signal() with sigaction()
 
  connect.c  |  5 -
  daemon.c   | 15 ---
  progress.c |  6 +-
  3 files changed, 21 insertions(+), 5 deletions(-)

Isn't this a bit too much of code churn, given that there were no bug
reports where signal handling is identified as the culprit despite
the warning you cited above?

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread Chris Packham
On 28/05/14 18:14, Jeremiah Mahler wrote:
 From signal(2)
 
   The behavior of signal() varies across UNIX versions, and has also var‐
   ied historically across different versions of Linux.   Avoid  its  use:
   use sigaction(2) instead.  See Portability below.

Minor nit. The last sentence applies to the man page you're quoting and
doesn't really make sense when viewed in the context of this commit
message. Same applies to other patches in this series.

 
 Replaced signal() with sigaction() in progress.c
 
 Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
 ---
  progress.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)
 
 diff --git a/progress.c b/progress.c
 index 261314e..24df263 100644
 --- a/progress.c
 +++ b/progress.c
 @@ -66,8 +66,12 @@ static void set_progress_signal(void)
  static void clear_progress_signal(void)
  {
   struct itimerval v = {{0,},};
 + struct sigaction sa;
 +
 + memset(sa, 0, sizeof(sa));
 + sa.sa_handler = SIG_IGN;

A C99 initialiser here would save the call to memset. Unfortunately
Documentation/CodingGuidelines is fairly clear on not using C99
initialisers, given the fact we're now at git 2.0 maybe it's time to
revisit this policy?

   setitimer(ITIMER_REAL, v, NULL);
 - signal(SIGALRM, SIG_IGN);
 + sigaction(SIGALRM, sa, 0);
   progress_update = 0;
  }
  
 

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] replace signal() with sigaction()

2014-05-28 Thread Chris Packham
On 28/05/14 19:40, Johannes Sixt wrote:
 Am 5/28/2014 8:14, schrieb Jeremiah Mahler:
 From signal(2)

   The behavior of signal() varies across UNIX versions, and has also var‐
   ied historically across different versions of Linux.   Avoid  its  use:
   use sigaction(2) instead.  See Portability below.

 This patch set replaces calls to signal() with sigaction() in all files
 except sigchain.c.  sigchain.c is a bit more complicated than the others
 and will be done in a separate patch.
 
 In compat/mingw.c we have:
 
 int sigaction(int sig, struct sigaction *in, struct sigaction *out)
 {
   if (sig != SIGALRM)
   return errno = EINVAL,
   error(sigaction only implemented for SIGALRM);
   if (out != NULL)
   return errno = EINVAL,
   error(sigaction: param 3 != NULL not implemented);
 
   timer_fn = in-sa_handler;
   return 0;
 }
 
 Notice only implemented for SIGALRM. Are adding the missing signals
 somewhere (here or in a later patch)?
 

* note: not a windows/mingw programmer *

Will the ones setting SIG_IGN be OK? Presumably we won't get these
signals on windows anyway so we're already getting what we want.

 Jeremiah Mahler (5):
   progress.c: replace signal() with sigaction()
   daemon.c run_service(): replace signal() with sigaction()
   daemon.c child_handler(): replace signal() with sigaction()
   daemon.c service_loop(): replace signal() with sigaction()
   connect.c: replace signal() with sigaction()

  connect.c  |  5 -
  daemon.c   | 15 ---
  progress.c |  6 +-
  3 files changed, 21 insertions(+), 5 deletions(-)
 
 Isn't this a bit too much of code churn, given that there were no bug
 reports where signal handling is identified as the culprit despite
 the warning you cited above?
 
 -- Hannes
 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread David Kastrup
Chris Packham judge.pack...@gmail.com writes:

 On 28/05/14 18:14, Jeremiah Mahler wrote:
 From signal(2)
 
   The behavior of signal() varies across UNIX versions, and has also var‐
   ied historically across different versions of Linux.   Avoid  its  use:
   use sigaction(2) instead.  See Portability below.

 Minor nit. The last sentence applies to the man page you're quoting and
 doesn't really make sense when viewed in the context of this commit
 message. Same applies to other patches in this series.

 
 Replaced signal() with sigaction() in progress.c
 
 Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
 ---
  progress.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)
 
 diff --git a/progress.c b/progress.c
 index 261314e..24df263 100644
 --- a/progress.c
 +++ b/progress.c
 @@ -66,8 +66,12 @@ static void set_progress_signal(void)
  static void clear_progress_signal(void)
  {
  struct itimerval v = {{0,},};
 +struct sigaction sa;
 +
 +memset(sa, 0, sizeof(sa));
 +sa.sa_handler = SIG_IGN;

 A C99 initialiser here would save the call to memset. Unfortunately
 Documentation/CodingGuidelines is fairly clear on not using C99
 initialisers, given the fact we're now at git 2.0 maybe it's time to
 revisit this policy?

If I look at the initialization of v in the context immediately above
the new code, it would appear that somebody already revisited this
policy.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] replace signal() with sigaction()

2014-05-28 Thread Erik Faye-Lund
On Wed, May 28, 2014 at 10:11 AM, Chris Packham judge.pack...@gmail.com wrote:
 On 28/05/14 19:40, Johannes Sixt wrote:
 Am 5/28/2014 8:14, schrieb Jeremiah Mahler:
 From signal(2)

   The behavior of signal() varies across UNIX versions, and has also var‐
   ied historically across different versions of Linux.   Avoid  its  use:
   use sigaction(2) instead.  See Portability below.

 This patch set replaces calls to signal() with sigaction() in all files
 except sigchain.c.  sigchain.c is a bit more complicated than the others
 and will be done in a separate patch.

 In compat/mingw.c we have:

 int sigaction(int sig, struct sigaction *in, struct sigaction *out)
 {
   if (sig != SIGALRM)
   return errno = EINVAL,
   error(sigaction only implemented for SIGALRM);
   if (out != NULL)
   return errno = EINVAL,
   error(sigaction: param 3 != NULL not implemented);

   timer_fn = in-sa_handler;
   return 0;
 }

 Notice only implemented for SIGALRM. Are adding the missing signals
 somewhere (here or in a later patch)?


 * note: not a windows/mingw programmer *

 Will the ones setting SIG_IGN be OK? Presumably we won't get these
 signals on windows anyway so we're already getting what we want.

We'll still emit an useless error unless someone cooks up a fix.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread Erik Faye-Lund
On Wed, May 28, 2014 at 10:19 AM, David Kastrup d...@gnu.org wrote:
 Chris Packham judge.pack...@gmail.com writes:

 On 28/05/14 18:14, Jeremiah Mahler wrote:
 From signal(2)

   The behavior of signal() varies across UNIX versions, and has also var‐
   ied historically across different versions of Linux.   Avoid  its  use:
   use sigaction(2) instead.  See Portability below.

 Minor nit. The last sentence applies to the man page you're quoting and
 doesn't really make sense when viewed in the context of this commit
 message. Same applies to other patches in this series.


 Replaced signal() with sigaction() in progress.c

 Signed-off-by: Jeremiah Mahler jmmah...@gmail.com
 ---
  progress.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

 diff --git a/progress.c b/progress.c
 index 261314e..24df263 100644
 --- a/progress.c
 +++ b/progress.c
 @@ -66,8 +66,12 @@ static void set_progress_signal(void)
  static void clear_progress_signal(void)
  {
  struct itimerval v = {{0,},};
 +struct sigaction sa;
 +
 +memset(sa, 0, sizeof(sa));
 +sa.sa_handler = SIG_IGN;

 A C99 initialiser here would save the call to memset. Unfortunately
 Documentation/CodingGuidelines is fairly clear on not using C99
 initialisers, given the fact we're now at git 2.0 maybe it's time to
 revisit this policy?

 If I look at the initialization of v in the context immediately above
 the new code, it would appear that somebody already revisited this
 policy.

Huh, the initialization of v doesn't use C99-features...?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Summary of the problems with git pull

2014-05-28 Thread Felipe Contreras
Philippe Vaucher wrote:
 Sorry if I missed a thread where it was already decided not to include
 it.
 
 Felipe, please don't use this to start any non-constructive behavior
 (rant on who is right/wrong, my patches are not accepted, etc).

I never sent those patches. I gave up on the Git project.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread David Kastrup
Erik Faye-Lund kusmab...@gmail.com writes:

 On Wed, May 28, 2014 at 10:19 AM, David Kastrup d...@gnu.org wrote:
 Chris Packham judge.pack...@gmail.com writes:

 On 28/05/14 18:14, Jeremiah Mahler wrote:
  static void clear_progress_signal(void)
  {
  struct itimerval v = {{0,},};
 +struct sigaction sa;
 +
 +memset(sa, 0, sizeof(sa));
 +sa.sa_handler = SIG_IGN;

 A C99 initialiser here would save the call to memset. Unfortunately
 Documentation/CodingGuidelines is fairly clear on not using C99
 initialisers, given the fact we're now at git 2.0 maybe it's time to
 revisit this policy?

 If I look at the initialization of v in the context immediately above
 the new code, it would appear that somebody already revisited this
 policy.

 Huh, the initialization of v doesn't use C99-features...?

Well, for me anything post-KR apparently is C99.

Cf
URL:http://computer-programming-forum.com/47-c-language/859a1b6693a0ddc5.htm

I have to admit that gcc -c -ansi -std=c89 -pedantic does not complain,
so that makes it quite probable that I was erring somewhat on the side
of the ancient ones and zeros.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/2] config: Add cache for config value querying

2014-05-28 Thread Eric Sunshine
On Mon, May 26, 2014 at 1:33 PM, Tanay Abhra tanay...@gmail.com wrote:
 Add an internal cache with the all variable value pairs read from the usual
 config files(repo specific .git/config, user wide ~/.gitconfig and the global

s/(/ (/

 /etc/gitconfig). Also, add two external functions `git_config_get_string` and
 `git_config_get_string_multi` for querying in an non callback manner from the

s/an/a/
s/non/non-/

More below.

 cache.

 Signed-off-by: Tanay Abhra tanay...@gmail.com
 ---
  cache.h  |   2 ++
  config.c | 105 
 +++
  2 files changed, 107 insertions(+)

 diff --git a/cache.h b/cache.h
 index 107ac61..2038150 100644
 --- a/cache.h
 +++ b/cache.h
 @@ -1271,6 +1271,8 @@ extern int check_repository_format_version(const char 
 *var, const char *value, v
  extern int git_env_bool(const char *, int);
  extern int git_config_system(void);
  extern int config_error_nonbool(const char *);
 +extern const char *git_config_get_string(const char *);
 +extern const struct string_list *git_config_get_string_multi(const char *);
  #if defined(__GNUC__)  ! defined(__clang__)
  #define config_error_nonbool(s) (config_error_nonbool(s), -1)
  #endif
 diff --git a/config.c b/config.c
 index a30cb5c..f6515c2 100644
 --- a/config.c
 +++ b/config.c
 @@ -9,6 +9,8 @@
  #include exec_cmd.h
  #include strbuf.h
  #include quote.h
 +#include hashmap.h
 +#include string-list.h

  struct config_source {
 struct config_source *prev;
 @@ -37,6 +39,102 @@ static struct config_source *cf;

  static int zlib_compression_seen;

 +struct hashmap config_cache;

This should be 'static'.

 +struct config_cache_node {
 +   struct hashmap_entry ent;
 +   struct strbuf key;

Upon reading this, I had the same question as Torsten: Why isn't 'key'
a plain 'char *' allocated when the entry is created? Your answer:

To maintain consistency with config.c. config.c uses strbuf for
both key and value throughout. I found the reason by git-blaming
config.c. Key length is flexible so it would be better to use a
api construct such as strbuf for it.

doesn't make sense. The existing callback-based code re-uses the
strbuf for each key as it parses the file, however, you are just
storing each key once in the hashmap and never altering it, thus a
strbuf is overkill.

 +   struct string_list *value_list ;

Why isn't this just a plain 'struct string_list'? Why the extra
indirection of a pointer?

Drop space before semicolon.

 +};
 +
 +static int hashmap_init_v = 0;

It's not obvious what _v is meant to convey.

 +static int config_cache_node_cmp(const struct config_cache_node *e1,
 +  const struct config_cache_node *e2, const char 
 *key)
 +{
 +   return strcmp(e1-key.buf, key ? key : e2-key.buf);
 +}
 +
 +static int config_cache_node_cmp_icase(const struct config_cache_node *e1,
 +const struct config_cache_node *e2, const 
 char* key)
 +{
 +   return strcasecmp(e1-key.buf, key ? key : e2-key.buf);
 +}
 +
 +static void config_cache_init(const int icase)

'const int' as a function argument is very unusual in this code base.
(I found only a single instance of it in
log-tree.[ch]:parse_decorate_color_config.)

 +{
 +   hashmap_init(config_cache, (hashmap_cmp_fn) (icase ? 
 config_cache_node_cmp_icase
 +   : config_cache_node_cmp), 0);
 +}
 +
 +static void config_cache_free(void)
 +{
 +   hashmap_free(config_cache, 1);
 +}

Doesn't this leak the strbuf 'key' and string_list 'value_list'?

 +static struct config_cache_node *config_cache_find_entry(const char *key)
 +{
 +   struct hashmap_entry k;
 +   hashmap_entry_init(k, strihash(key));

How does this unconditional use of case-insensitive strihash()
interact with the 'icase' argument of config_cache_init()?

 +   return hashmap_get(config_cache, k, key);
 +}
 +
 +static struct string_list *config_cache_get_value(const char *key)
 +{
 +   struct config_cache_node *e = config_cache_find_entry(key);
 +   if (e == NULL)

Rephrase: if (!e)

 +   return NULL;
 +   else
 +   return (e-value_list);

Unnecessary parentheses.

 +}

A purely subject rewrite of the above:

return e ? e-value_list : NULL;

 +static void config_cache_set_value(const char *key, const char *value)
 +{
 +   struct config_cache_node *e;
 +
 +   if (!value)
 +   return;

Again, I had the same questions as Torsten. (Is this supposed to
delete the entry?, Is it a programmer error?, etc.) From a pure
reading, it's not obvious why a NULL 'value' is handled this way. The
intent might be clearer if the caller instead performs the check, and
invokes config_cache_set_value() only if 'value' is non-NULL.

 +   e = config_cache_find_entry(key);
 +   if (!e) {
 +   e = malloc(sizeof(*e));
 +   hashmap_entry_init(e, strihash(key));

Same question as above regarding 

Re: [RFC/PATCH 2/2] config: Add new query functions to the api

2014-05-28 Thread Eric Sunshine
On Mon, May 26, 2014 at 1:33 PM, Tanay Abhra tanay...@gmail.com wrote:
 Subject: config: Add new query functions to the api

This sounds as if you are adding new functions to the header file.
Saying ... to the api docs would been clearer. Better:

config: document new query functions

 Add explanations for `git_config_get_string_multi` and `git_config_get_string`
 which utilize the config cache for querying in an non callback manner for a

s/an/a/
s/non/non-/

 specific variable.

 Signed-off-by: Tanay Abhra tanay...@gmail.com

One might expect these functions to be documented in the same patch
which introduces them. Since this patch is so small, it might make
sense just to squash it into patch 1.

More below.

 ---
  Documentation/technical/api-config.txt | 19 +++
  1 file changed, 19 insertions(+)

 diff --git a/Documentation/technical/api-config.txt 
 b/Documentation/technical/api-config.txt
 index 230b3a0..33b5b90 100644
 --- a/Documentation/technical/api-config.txt
 +++ b/Documentation/technical/api-config.txt
 @@ -77,6 +77,25 @@ To read a specific file in git-config format, use
  `git_config_from_file`. This takes the same callback and data parameters
  as `git_config`.

 +Querying for specific variables
 +---

All other section headers in this file capitalize each word.

 +For programs wanting to query for specific variables in a non callback

s/non-/

 +manner, the config API provides two functions `git_config_get_string`
 +and `git_config_get_string_multi`. They both take a single parameter,
 +
 +- a variable as the key string for which the corresponding values will
 +  be retrieved and returned.
 +
 +They both read value from an internal cache generated previously from

Minor observation: As this file is documenting the API, mentioning
private implementation details such, as the internal cache, may be
frowned upon (though probably doesn't matter in practice since this
_is_ a technical document).

 +reading the config files. `git_config_get_string` returns the value with
 +the highest priority(i.e. value in the repo config will be preferred over
 +value in user wide config for the same variable).
 +
 +`git_config_get_string_multi` returns a `string_list` containing all the
 +values for that particular variable, sorted in order of increasing
 +priority.
 +
  Value Parsing Helpers
  -

 --
 1.9.0.GIT
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes

2014-05-28 Thread Duy Nguyen
On Tue, May 27, 2014 at 10:56 AM, Pasha Bolokhov
pasha.bolok...@gmail.com wrote:
 @@ -1588,6 +1588,38 @@ void setup_standard_excludes(struct dir_struct *dir)
  {
 const char *path;
 char *xdg_path;
 +   const char *r_git, *gitdir = get_git_dir();
 +   char *n_git, *basename;
 +   int len, i;
 +
 +   /*
 +* Add git directory to the ignores; do this only if
 +* GIT_DIR does not end with /.git
 +*/
 +   r_git = real_path(absolute_path(gitdir));
 +   n_git = xmalloc(strlen(r_git) + 1 + 1);
 +   normalize_path_copy(n_git, r_git);
 +
 +   len = strlen(n_git); /* real_path() has stripped trailing slash */
 +   for (i = len - 1; i  0  !is_dir_sep(n_git[i]); i--) ;
 +   basename = n_git + i;
 +   if (is_dir_sep(*basename))
 +   basename++;
 +   if (strcmp(basename, .git)) {

I think normalize_path_copy makes sure that dir sep is '/', so this
code may be simplified to if (strcmp(n_git, .git)  (len == 4 ||
strcmp(n_git + len - 5, /.git)))?

 +   const char *worktree = get_git_work_tree();
 +
 +   if (worktree == NULL ||
 +   dir_inside_of(n_git, get_git_work_tree()) = 0) {
 +   struct exclude_list *el = add_exclude_list(dir, 
 EXC_CMDL,
 +   GIT_DIR setup);
 +
 +   /* append a trailing slash to exclude directories */
 +   n_git[len] = '/';
 +   n_git[len + 1] = '\0';
 +   add_exclude(basename, , 0, el, 0);
 +   }
 +   }
 +   free(n_git);

All this add-only code makes me think it may be nice to make it a
separate function. A good function name could replace the comment near
the beginning of the block.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Duy Nguyen
On Tue, May 27, 2014 at 11:47 PM, Dale R. Worley wor...@alum.mit.edu wrote:
 I've discovered a problem using Git.  It's not clear to me what the
 correct behavior should be, but it seems to me that Git is failing
 in an undesirable way.

 The problem arises when trying to handle a very large file.  For
 example:

 $ git --version
 git version 1.8.3.1
 $ mkdir $$
 $ cd $$
 $ git init
 Initialized empty Git repository in 
 /common/not-replicated/worley/temp/5627/.git/
 $ truncate --size=20G big_file
 $ ls -l
 total 0
 -rw-rw-r--. 1 worley worley 21474836480 May 27 11:59 big_file
 $ time git add big_file

 real4m48.752s
 user4m31.295s
 sys 0m16.747s
 $

 At this point, either 'git fsck' or 'git commit' fails:

 $ git fsck --full --strict
 notice: HEAD points to an unborn branch (master)
 Checking object directories: 100% (256/256), done.
 fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

Back trace for this one

#3  0x0055cf39 in xmalloc (size=21474836481) at wrapper.c:49
#4  0x0055cffd in xmallocz (size=21474836480) at wrapper.c:73
#5  0x00537858 in unpack_compressed_entry (p=0x858ac0,
w_curs=0x7fffc0f8, curpos=18, size=21474836480) at
sha1_file.c:1924
#6  0x00538364 in unpack_entry (p=0x858ac0, obj_offset=12,
final_type=0x7fffc1e4, final_size=0x7fffc1d8) at
sha1_file.c:2206
#7  0x004fb0a2 in verify_packfile (p=0x858ac0,
w_curs=0x7fffc320, fn=0x43f5f2 fsck_obj_buffer,
progress=0x858a90, base_count=0) at pack-check.c:119
#8  0x004fb3f4 in verify_pack (p=0x858ac0, fn=0x43f5f2
fsck_obj_buffer, progress=0x858a90, base_count=0) at
pack-check.c:177
#9  0x004401d7 in cmd_fsck (argc=0, argv=0x7fffd650,
prefix=0x0) at builtin/fsck.c:677

Not easy to fix. I started working on converting fsck to use
index-pack code for pack verification. index-pack supports large files
well, so in the end it might fix this (as well as speeding up fsck).
But that work has stalled for a long time.


 $ git commit -m Test.
 [master (root-commit) 3df3655] Test.
 fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

And back trace

#11 0x004b9da0 in read_sha1_file (sha1=0x8558a0
\256/s\324\370\304\344\212\304I\v\342\334MS\002\352\214\061\222,
type=0x7fffc6c4, size=0x8558d0) at cache.h:820
#12 0x004c1b98 in diff_populate_filespec (s=0x8558a0,
size_only=0) at diff.c:2749
#13 0x004c0110 in diff_filespec_is_binary (one=0x8558a0) at diff.c:2188
#14 0x004c0f0b in builtin_diffstat (name_a=0x858530
big_file, name_b=0x0, one=0x8584e0, two=0x8558a0,
diffstat=0x7fffc8a0, o=0x7fffce88, p=0x855910) at diff.c:2435
#15 0x004c2fd4 in run_diffstat (p=0x855910, o=0x7fffce88,
diffstat=0x7fffc8a0) at diff.c:3168
#16 0x004c603a in diff_flush_stat (p=0x855910,
o=0x7fffce88, diffstat=0x7fffc8a0) at diff.c:4081
#17 0x004c70e4 in diff_flush (options=0x7fffce88) at diff.c:4520
#18 0x004e5d59 in log_tree_diff_flush (opt=0x7fffcaf0) at
log-tree.c:715
#19 0x004e5e5a in log_tree_diff (opt=0x7fffcaf0,
commit=0x8585b0, log=0x7fffc9a0) at log-tree.c:747
#20 0x004e60b1 in log_tree_commit (opt=0x7fffcaf0,
commit=0x8585b0) at log-tree.c:810
#21 0x0042c45c in print_summary (prefix=0x0,
sha1=0x7fffd300 .Gȑ\360\243\202\351!\035\312q\374\345\314LL),
initial_commit=1) at builtin/commit.c:1426
#22 0x0042d213 in cmd_commit (argc=0, argv=0x7fffd650,
prefix=0x0) at builtin/commit.c:1750

If we could have an option in read_sha1_file to read max to n bytes
(enough for binary detection purpose), it would fix this. Another
option is declare all files larger than core.bigfilethreshold binary.
Easier in both senses of implementation cost and looseness.

 Even doing a 'git reset' does not put the repository in a state where
 'git fsck' will complete:

 $ git reset
 $ git fsck --full --strict
 notice: HEAD points to an unborn branch (master)
 Checking object directories: 100% (256/256), done.
 fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

I don't know how many commands are hit by this. If you have time and
gdb, please put a break point in die_builtin() function and send
backtraces for those that fail. You could speed up the process by
creating a smaller file and set the environment variable
GIT_ALLOC_LIMIT (in kilobytes) to a number lower than that size. If
git attempts to allocate a block larger than that limit it'll die.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug in shallow clone?

2014-05-28 Thread Thomas Kieffer

Hi there Git developers,

I'm not sure if I found a bug in the command

git clone repo.git cloned_repo --depth 1

I follow these steps:

git init
echo First commit  test.txt
git add -A
git commit -am First commit

echo Second commit  test.txt
git commit -am Second commit

echo Third commit  test.txt
git commit -am Third commit

git clone --bare . ./bare.git

I then clone the bare repository with --depth 1.

git clone file:///path/to/bare.git ./clone --depth 1

It always returns the last two commits. If I specify --depth 2 it 
returns the last 3 commits.


If I use --depth 1 on a Github repository it works as expected.

Am I doing something wrong or is it really a bug?

Kind Regards,

Thomas Kieffer


BTW.: Git is amazing and I love it :)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in shallow clone?

2014-05-28 Thread Duy Nguyen
On Wed, May 28, 2014 at 9:02 PM, Thomas Kieffer thomaskief...@web.de wrote:
 I then clone the bare repository with --depth 1.

 git clone file:///path/to/bare.git ./clone --depth 1

 It always returns the last two commits. If I specify --depth 2 it returns
 the last 3 commits.

 If I use --depth 1 on a Github repository it works as expected.

 Am I doing something wrong or is it really a bug?

Depth calculation has been corrected lately. It depends on your
version, maybe it's older than 1.8.2? If it's the latest, we screwed
something up again..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-28 Thread Michael Haggerty
On 05/27/2014 08:27 PM, Junio C Hamano wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 This suggests to me that our current structure is best modeled as two
 independent reference back ends, with a third implementation of the same
 reference API whose job it is to compose the first two.  In pseudocode,
 ...
 
 That is an interesting view.
 
 How does reflog fit into the picture?  Is it a completely
 independent thing that is called from any implementation of
 ReferenceBackend interface?

That's a good question.  It certainly wouldn't work for each of the
loose and packed backends to try to do logging.  I think the best
solution would be to have a logging wrapper as explained below.

 From this point of view it is clear that packing refs is not an
 operation that belongs in the ReferenceBackend API, but rather in the
 StackedReferenceBackend interface.
 
 When an implementation of ReferenceBackend has skewed performance
 characteristics (e.g. PackedReferenceBackend really prefers to be
 modified in bulk), how would that interact with the abstraction?
 
 For example, when the application does:
 
 begin_transaction()
 for ref in many_refs():
   delete_reference(ref)
 commit_transaction()
 
 StackedReferenceBackend() that consists of Loose on top of Packed
 may want to implement the commit phase like so:
 
 - tell Packed backend to repack without the deleted refs
 - tell Loose backend to delete the deleted refs

I think for any two backends that are stacked, you would need to break
down a transaction as follows (here generalized to include not only
deletions):

packed-begin_transaction()
loose-begin_transaction()

# And this part is done within stacked-commit_transaction():
for entry in many_ref_updates():
if have_old:
stacked-verify_reference(ref, old_sha1)

if entry is a delete:
packed-delete_reference(entry)
loose-update_reference(entry)

if (!packed-commit_transaction())
loose-commit_transaction()

Verifying old values is impossible to do batchwise with the current API,
because the old value of the packed ref has to be verified if and only
if there is no corresponding loose ref.

 But the above would not quite work, as somebody needs to remove logs
 for refs that were only in the Packed backend, and repack without
 these refs API supported by the Packed backend cannot be that
 somebody---repack packed-refs without A B C cannot unconditionally
 remove logs for A B C without checking if A B C exists as Loose.

Correct.  That's another reason that logging has to be the
responsibility of something at the stacked level of abstraction or higher.

I think the logging should be done by yet another outer layer of wrapper
that only does the logging, while also passing all updates down 1:1 to
the backend that it wraps (which in our case would be a stacked
backend).  Then the loose and packed backends could remain completely
ignorant of the fact that reference updates can be logged.  I think the
logging layer could implement the same reference backend API as the
other backends.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in shallow clone?

2014-05-28 Thread Dennis Kaarsemaker
On wo, 2014-05-28 at 21:16 +0700, Duy Nguyen wrote:
 On Wed, May 28, 2014 at 9:02 PM, Thomas Kieffer thomaskief...@web.de wrote:
  I then clone the bare repository with --depth 1.
 
  git clone file:///path/to/bare.git ./clone --depth 1
 
  It always returns the last two commits. If I specify --depth 2 it returns
  the last 3 commits.
 
  If I use --depth 1 on a Github repository it works as expected.
 
  Am I doing something wrong or is it really a bug?
 
 Depth calculation has been corrected lately. It depends on your
 version, maybe it's older than 1.8.2? If it's the latest, we screwed
 something up again..

2.0.0-rc4 does this correctly.
-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 08/41] refs.c: add an err argument to delete_ref_loose

2014-05-28 Thread Ronnie Sahlberg
On Tue, May 27, 2014 at 5:25 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Comments from http://marc.info/?l=gitm=140079653930751w=2:

 Ronnie Sahlberg wrote:

 [...]
 --- a/refs.c
 +++ b/refs.c
 @@ -2491,17 +2491,43 @@ static int repack_without_ref(const char *refname)
   return repack_without_refs(refname, 1, NULL);
  }

 -static int delete_ref_loose(struct ref_lock *lock, int flag)
 +static int add_err_if_unremovable(const char *op, const char *file,
 +   struct strbuf *e, int rc)
 +{
 + int err = errno;
 + if (rc  0  err != ENOENT) {
 + strbuf_addf(e, unable to %s %s: %s,
 + op, file, strerror(errno));
 + errno = err;
 + }
 + return rc;
 +}
 +
 +static int unlink_or_err(const char *file, struct strbuf *err)
 +{
 + if (err)
 + return add_err_if_unremovable(unlink, file, err,
 +   unlink(file));
 + else
 + return unlink_or_warn(file);
 +}

 The new unlink_or_err has an odd contract when the err argument is passed.
 On error:

  * if errno != ENOENT, it will append a message to err and return -1 (good)
  * if errno == ENOENT, it will not append a message to err but will
still return -1.

 This means the caller has to check errno to figure out whether err is
 meaningful when it returns an error.

 Perhaps it should return 0 when errno == ENOENT.  After all, in that
 case the file does not exist any more, which is all we wanted.

Good idea.
Thanks. Done.



 +
 +static int delete_ref_loose(struct ref_lock *lock, int flag, struct strbuf 
 *err)
  {
   if (!(flag  REF_ISPACKED) || flag  REF_ISSYMREF) {
   /* loose */
 - int err, i = strlen(lock-lk-filename) - 5; /* .lock */
 + int res, i = strlen(lock-lk-filename) - 5; /* .lock */

   lock-lk-filename[i] = 0;
 - err = unlink_or_warn(lock-lk-filename);
 + res = unlink_or_err(lock-lk-filename, err);
   lock-lk-filename[i] = '.';
 - if (err  errno != ENOENT)
 + if (res  errno != ENOENT) {
 + if (err)
 + strbuf_addf(err,
 + failed to delete loose ref '%s',
 + lock-lk-filename);

 On failure we seem to add our own message to err, too, so the
 resulting message would be something like

 fatal: unable to unlink .git/refs/heads/master: \
 Permission deniedfailed to delete loose ref 
 '.git/refs/heads/master.lock'

 Is the second strbuf_addf a typo?

Yeah, we don't need it.
Removed.


 Thanks,
 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 10/41] update-ref.c: log transaction error from the update_ref

2014-05-28 Thread Ronnie Sahlberg
On Tue, May 27, 2014 at 5:27 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Comments from http://marc.info/?l=gitm=140079653930751w=2:

 Ronnie Sahlberg wrote:

 [Subject: update-ref.c: log transaction error from the update_ref]

 The above description suggests that this is going to add new logging,
 or in other words that update_ref was being silent about transaction
 errors before.

 The actual intent is for there to be no functional change, right?  I'd
 say something like update-ref: use err argument to get error from
 ref_transaction_commit or something similar to make it clearer that
 this is just about changing APIs.  Or if there's an intended
 functional change, then the commit message could say something about
 that.

Thanks.
I reworded the commit message.


 Thanks,
 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 13/41] refs.c: change ref_transaction_create to do error checking and return status

2014-05-28 Thread Ronnie Sahlberg
On Tue, May 27, 2014 at 5:42 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Ronnie Sahlberg wrote:

 --- a/refs.h
 +++ b/refs.h
 @@ -215,6 +215,15 @@ enum action_on_err {
  };

  /*
 + * Transaction functions that take an err argument will append an error
 + * string to this buffer if there was a failure.
 + * This string is not cleared on each call and may contain an aggregate of
 + * errors from several previous calls.
 + * If the caller needs a guarantee that the buffer will only contain the
 + * current or most recent error it must call strbuf_reset before calling
 + * the transaction function.
 + */
 +/*

 If I look at the documentation for ref_transaction_create(), it is not
 obvious I should look up here for how it handles errors.  Not sure how
 to fix that --- maybe this should go in a new
 Documentation/technical/api-ref-transactions.txt file?  Or maybe the
 top of refs.h where struct ref_transaction is declared.

 The content seems odd to me.  It says the string will contain an
 aggregate of errors from previous calls, but what it doesn't say is
 that that aggregate is a bunch of error messages juxtaposed without a
 newline or space between.  Is the idea that some callers will want
 this aggregate?

 Wouldn't it be clearer to say how the err argument is meant to be used
 from the caller's perspective?  E.g.,

 On error, transaction functions append a message about what
 went wrong to the 'err' argument.  The message mentions what
 ref was being updated (if any) when the error occurred so it
 can be passed to 'die' or 'error' as-is:

 if (ref_transaction_update(..., err)) {
 ret = error(%s, err.buf);
 goto cleanup;
 }

 If it's expected that the err argument will be used to aggregate
 multiple messages in some callers then it would be useful to give an
 example of that, too, but I don't think that's needed.

Thanks. Updated the comment in refs.h


 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Thomas Braun
Am 27.05.2014 18:47, schrieb Dale R. Worley:
 Even doing a 'git reset' does not put the repository in a state where
 'git fsck' will complete:

You have to remove the offending commit also from the reflog.

The following snipped creates an offending commit, big_file is 2GB which
is too large for git on windows, and later removes it completely so that
git fsck passes again.

---
git init
echo 1  some_file
git add some_file
git commit -m add some_file
git add big_file
git commit -m add big_file # reports malloc error without the -q flag
git log --stat # malloc error
git reset HEAD^1
git fsck --full --strict --verbose # fails
git reflog expire --expire=now --all # remove all reflog entries
git gc --prune=now
git fsck --full --strict --verbose # passes
---
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 06/41] refs.c: add an err argument to repack_without_refs

2014-05-28 Thread Ronnie Sahlberg
On Tue, May 27, 2014 at 5:11 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Comments from http://marc.info/?l=gitm=140079653930751w=2:

 Ronnie Sahlberg wrote:

 --- a/cache.h
 +++ b/cache.h
 @@ -559,6 +559,8 @@ struct lock_file {
  #define LOCK_DIE_ON_ERROR 1
  #define LOCK_NODEREF 2
  extern int unable_to_lock_error(const char *path, int err);
 +extern void unable_to_lock_strbuf(const char *path, int err,
 +   struct strbuf *buf);

 unable_to_lock_strbuf could be called unable_to_lock_message (which
 would make its behavior more obvious IMHO).

Renamed.  Thanks.


 [...]
 --- a/refs.c
 +++ b/refs.c
 @@ -2208,6 +2208,7 @@ int commit_packed_refs(void)
   struct packed_ref_cache *packed_ref_cache =
   get_packed_ref_cache(ref_cache);
   int error = 0;
 + int save_errno = 0;

 This is about making errno meaningful when commit_packed_refs returns
 an error.  Probably its API documentation should say so to make it
 obvious when people modify it in the future that they should preserve
 that property or audit callers.

 [...]
 @@ -2444,6 +2448,11 @@ static int repack_without_refs(const char **refnames, 
 int n)
   return 0; /* no refname exists in packed refs */

   if (lock_packed_refs(0)) {
 + if (err) {
 + unable_to_lock_strbuf(git_path(packed-refs), errno,
 +   err);
 + return -1;
 + }
   unable_to_lock_error(git_path(packed-refs), errno);

 Via the new call to unable_to_lock_..., repack_without_refs cares
 about errno after a failed call to lock_packed_refs.  lock_packed_refs
 can only fail in hold_lock_file_for_update.  hold_lock_file_for_update
 is a thin wrapper around lockfile.c::lock_file.  lock_file can error
 out because

 strlen(path) = max_path_len
 adjust_shared_perm failed [calls error(), clobbers errno]
 open failed

 So lock_file needs a similar kind of fix, and it's probably worth
 updating API documentation for these calls to make it clear that their
 errno is used (though that's not a new problem since
 repack_without_refs already called unable_to_lock_error).  Could be a
 separate, earlier patch (or a TODO comment in this patch to be
 addressed with a later patch) since it's fixing an existing bug.

I will add it to my todo list.
I think passing of errno around is too fragile and that we should
avoid ad-hoc save_errno hacks and implement dedicated return codes to
replace errno.
We should only inspect errno immediately after a syscall has failed.


 Hope that helps,
 Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
On Wed, May 28, 2014 at 08:07:38PM +1200, Chris Packham wrote:
 On 28/05/14 18:14, Jeremiah Mahler wrote:
  From signal(2)
  
The behavior of signal() varies across UNIX versions, and has also var‐
ied historically across different versions of Linux.   Avoid  its  use:
use sigaction(2) instead.  See Portability below.
 
 Minor nit. The last sentence applies to the man page you're quoting and
 doesn't really make sense when viewed in the context of this commit
 message. Same applies to other patches in this series.
 
Yes, that is confusing.

...
  @@ -66,8 +66,12 @@ static void set_progress_signal(void)
   static void clear_progress_signal(void)
   {
  struct itimerval v = {{0,},};
  +   struct sigaction sa;
  +
  +   memset(sa, 0, sizeof(sa));
  +   sa.sa_handler = SIG_IGN;
 
 A C99 initialiser here would save the call to memset. Unfortunately
 Documentation/CodingGuidelines is fairly clear on not using C99
 initialisers, given the fact we're now at git 2.0 maybe it's time to
 revisit this policy?
 

struct sigaction sa = { .sa_handler = SIG_IGN };

I do like that.

This brings up another issue.  memset(sa, 0, sizeof(sa)); The sigaction
examples I have seen always use memset.  The manpage for sigaction(2)
doesn't mention it.  Other code it Git uses memset with sigaction (see
fast-import.c line 530).

Is this struct guaranteed to be initialized to zero so I don't have to
use memset?

-- 
Jeremiah Mahler
jmmah...@gmail.com
http://github.com/jmahler
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] Git clone of a bundle fails, but works (somewhat) when run with strace

2014-05-28 Thread Bram Geron
Jeff King peff at peff.net writes:
 [..]
 Secondly, I do get the same warning about HEAD:
 
   $ git clone repo.bundle repofrombundle
   Cloning into 'repofrombundle'...
   Receiving objects: 100% (3/3), done.
   warning: remote HEAD refers to nonexistent ref, unable to checkout.
 
 but that warning makes sense. You did not create a bundle that contains
 HEAD, therefore when we clone it, we do not know what to point HEAD to.
 You probably wanted git bundle create ../repo.bundle --all which
 includes both master and HEAD.

I'd like to revive this discussion and submit a patch, as I just spent
significant time wondering why git clone failed. It's been a while, so I'll
summarize: when you make a git bundle without including HEAD explicitly,
then clone from that bundle, Git throws a warning and leaves you with a
broken HEAD.

I do not agree that the warning makes sense. It implies that HEAD exists but
is invalid. In reality, no ref is referred to by HEAD in the first place.
Furthermore, .git/HEAD in the clone is autocorrected to be
refs/heads/master, so the error message is even more misleading.

It's like saying Our CEO's guitar is actually an air guitar, then
explaining where he stores his guitar, when I don't work in a company in the
first place.

 It would be slightly more accurate to say the remote HEAD does not
 exist, rather than refers to nonexistent ref.  It would perhaps be
 nicer still for git clone to make a guess about the correct HEAD when
 one is not present (especially in the single-branch case, it is easy to
 make the right guess).

Seems sensible at first sight, though it seems orthogonal to the warning
message.

 Patches welcome. In the meantime, you can clone with -b master to tell
 it explicitly, or you can git checkout master inside the newly-cloned
 repository.

Alright :) See below.

Cheers, Bram


From bc799b12b659d7ab20df7fe420d5f1f1c90450aa Mon Sep 17 00:00:00 2001
From: Bram Geron bge...@bgeron.nl
Date: Wed, 28 May 2014 15:54:37 +0100
Subject: [PATCH] Clearer error message when cloning a bundle without a HEAD
 ref.

---
 builtin/clone.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 9b3c04d..e3c1447 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -623,7 +623,7 @@ static int checkout(void)
 
head = resolve_refdup(HEAD, sha1, 1, NULL);
if (!head) {
-   warning(_(remote HEAD refers to nonexistent ref, 
+   warning(_(no HEAD in remote or HEAD refers to nonexistent ref, 

  unable to checkout.\n));
return 0;
}
-- 
1.9.1



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] replace signal() with sigaction()

2014-05-28 Thread Jeremiah Mahler
On Wed, May 28, 2014 at 09:40:55AM +0200, Johannes Sixt wrote:
 Am 5/28/2014 8:14, schrieb Jeremiah Mahler:
  From signal(2)
  
The behavior of signal() varies across UNIX versions, and has also var‐
ied historically across different versions of Linux.   Avoid  its  use:
use sigaction(2) instead.  See Portability below.
  
  This patch set replaces calls to signal() with sigaction() in all files
  except sigchain.c.  sigchain.c is a bit more complicated than the others
  and will be done in a separate patch.
 
 In compat/mingw.c we have:
 
 int sigaction(int sig, struct sigaction *in, struct sigaction *out)
 {
   if (sig != SIGALRM)
   return errno = EINVAL,
   error(sigaction only implemented for SIGALRM);
   if (out != NULL)
   return errno = EINVAL,
   error(sigaction: param 3 != NULL not implemented);
 
   timer_fn = in-sa_handler;
   return 0;
 }
 
 Notice only implemented for SIGALRM. Are adding the missing signals
 somewhere (here or in a later patch)?
 
Good catch Johannes, I did not account for this.

If I am reading this correctly, sigaction() can only be used for
SIGALRM, and no other signals.  But signal() it would have worked for
these other signals.

My first idea is why not call signal() from within this sigaction()
call?  The handler could be read from the struct sigaction passed in.

  Jeremiah Mahler (5):
progress.c: replace signal() with sigaction()
daemon.c run_service(): replace signal() with sigaction()
daemon.c child_handler(): replace signal() with sigaction()
daemon.c service_loop(): replace signal() with sigaction()
connect.c: replace signal() with sigaction()
  
   connect.c  |  5 -
   daemon.c   | 15 ---
   progress.c |  6 +-
   3 files changed, 21 insertions(+), 5 deletions(-)
 
 Isn't this a bit too much of code churn, given that there were no bug
 reports where signal handling is identified as the culprit despite
 the warning you cited above?
 
We might get bugs from continuing to use signal().  We might get bugs from
changing to sigaction().  I feel more comfortable using sigaction() since
its man page is so adamant about not using signal().

-- 
Jeremiah Mahler
jmmah...@gmail.com
http://github.com/jmahler
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 10/41] update-ref.c: log transaction error from the update_ref

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 I reworded the commit message.

Thanks much.  Looks good.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-28 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 I think for any two backends that are stacked, you would need to break
 down a transaction as follows (here generalized to include not only
 deletions):

 packed-begin_transaction()
 loose-begin_transaction()

 # And this part is done within stacked-commit_transaction():
 for entry in many_ref_updates():
 if have_old:
 stacked-verify_reference(ref, old_sha1)

 if entry is a delete:
 packed-delete_reference(entry)
 loose-update_reference(entry)

 if (!packed-commit_transaction())
 loose-commit_transaction()

Ugggly.  In general you would need to worry about the case where the
first commit succeeds and then the second commit fails, wouldn't
you?

The above happens not to break horribly wrong for the Loose on top
of Packed layering, in that the refs you wanted to delete are only
gone from Packed but still are in Loose, and the end result would be
some of them are really gone (because they weren't in Loose) and
some others remain (because they were in Loose), and at least you
did not lose any ref you did not want to discard.  But it still is
not what should happen in a proper transaction.

 But the above would not quite work, as somebody needs to remove logs
 for refs that were only in the Packed backend, and repack without
 these refs API supported by the Packed backend cannot be that
 somebody---repack packed-refs without A B C cannot unconditionally
 remove logs for A B C without checking if A B C exists as Loose.

 Correct.  That's another reason that logging has to be the
 responsibility of something at the stacked level of abstraction or higher.

 I think the logging should be done by yet another outer layer of
 wrapper that only does the logging, while also passing all updates
 down 1:1 to the backend that it wraps (which in our case would be
 a stacked backend). ... Then the loose and packed backends could
 remain completely ignorant of the fact that reference updates can
 be logged.

That would mean that Loose (or Packed) class cannot be used as-is
and always needs to be wrapped with the layer that does the logging,
no?  In a system with only packed-refs, no loose, you would want
Packed.deleteRefs() to remove the named refs ref and their reflogs,
but that would mean that the Layered wrapper that uses Loose
overlayed on Packed cannot call that method, because it does not
want reflogs of the refs in Packed covered by the ones in Loose.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 13/41] refs.c: change ref_transaction_create to do error checking and return status

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Updated the comment in refs.h

Thanks.

 +++ b/refs.h
 @@ -215,6 +215,31 @@ enum action_on_err {
  };
  
  /*
 + * On error, transaction functions append a message about what
 + * went wrong to the 'err' argument.  The message mentions what
 + * ref was being updated (if any) when the error occurred so it
 + * can be passed to 'die' or 'error' as-is:
 + *
 + * if (ref_transaction_update(..., err)) {
 + * ret = error(%s, err.buf);
 + * goto cleanup;
 + * }
 + *
 + * The message is appended to err without first clearing err.
 + * This allow shte caller to prepare preamble text to the generated

s/allow shte/allows the/

 + * error message:
 + *
 + * strbuf_addf(err, Error while doing foo-bar: );
 + * if (ref_transaction_update(..., err)) {
 + * ret = error(%s, err.buf);
 + * goto cleanup;
 + * }

Nice.

 + *
 + * If the caller wants err to only contain the message for the current error
 + * and nothing else caller will need to guarantee that err is empty or
 + * call strbuf_reset before calling the transaction function.

I don't think this paragraph is needed --- especially with the
clarification about how to add a preamble, the contract is clear.

 + */
 +/*
   * Begin a reference transaction.  The reference transaction must
   * be freed by calling ref_transaction_free().
   */

Now that the comment is longer, it's harder to miss, but it still is
in an odd place for someone looking to understand what the 'err'
argument to ref_transaction_update() means.

How about this patch for squashing in?

diff --git i/refs.h w/refs.h
index c741a6c..913ca93 100644
--- i/refs.h
+++ w/refs.h
@@ -10,6 +10,45 @@ struct ref_lock {
int force_write;
 };
 
+/*
+ * A ref_transaction represents a collection of ref updates
+ * that should succeed or fail together.
+ *
+ * Calling sequence
+ * 
+ * - Allocate and initialize a `struct ref_transaction` by calling
+ *   `ref_transaction_begin()`.
+ *
+ * - List intended ref updates by calling functions like
+ *   `ref_transaction_update()` and `ref_transaction_create()`.
+ *
+ * - Call `ref_transaction_commit()` to execute the transaction.
+ *   If this succeeds, the ref updates will have taken place and
+ *   the transaction cannot be rolled back.
+ *
+ * - At any time call `ref_transaction_free()` to discard the
+ *   transaction and free associated resources.  In particular,
+ *   this rolls back the transaction if it has not been
+ *   successfully committed.
+ *
+ * Error handling
+ * --
+ *
+ * On error, transaction functions append a message about what
+ * went wrong to the 'err' argument.  The message mentions what
+ * ref was being updated (if any) when the error occurred so it
+ * can be passed to 'die' or 'error' as-is.
+ *
+ * The message is appended to err without first clearing err.
+ * This allows the caller to prepare preamble text to the generated
+ * error message:
+ *
+ * strbuf_addf(err, Error while doing foo-bar: );
+ * if (ref_transaction_update(..., err)) {
+ * ret = error(%s, err.buf);
+ * goto cleanup;
+ * }
+ */
 struct ref_transaction;
 
 /*
@@ -215,31 +254,6 @@ enum action_on_err {
 };
 
 /*
- * On error, transaction functions append a message about what
- * went wrong to the 'err' argument.  The message mentions what
- * ref was being updated (if any) when the error occurred so it
- * can be passed to 'die' or 'error' as-is:
- *
- * if (ref_transaction_update(..., err)) {
- * ret = error(%s, err.buf);
- * goto cleanup;
- * }
- *
- * The message is appended to err without first clearing err.
- * This allow shte caller to prepare preamble text to the generated
- * error message:
- *
- * strbuf_addf(err, Error while doing foo-bar: );
- * if (ref_transaction_update(..., err)) {
- * ret = error(%s, err.buf);
- * goto cleanup;
- * }
- *
- * If the caller wants err to only contain the message for the current error
- * and nothing else caller will need to guarantee that err is empty or
- * call strbuf_reset before calling the transaction function.
- */
-/*
  * Begin a reference transaction.  The reference transaction must
  * be freed by calling ref_transaction_free().
  */
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 $ git fsck --full --strict
 notice: HEAD points to an unborn branch (master)
 Checking object directories: 100% (256/256), done.
 fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

 Back trace for this one
 ...
 Not easy to fix. I started working on converting fsck to use
 index-pack code for pack verification. index-pack supports large files
 well, so in the end it might fix this (as well as speeding up fsck).
 But that work has stalled for a long time.

You need to have enough memory (virtual is fine if you have enough
time) to do fsck.  Some part of index-pack could be refactored into
a common helper function that could be called from fsck, but I think
it would be a lot of work.

 $ git commit -m Test.
 [master (root-commit) 3df3655] Test.
 fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

I suspect that this one is only because you are letting the status
which involves diff to kick in (and for that you would need to have
enough memory).  As you suggested, it might be a good idea to take
bigfilethreshold account when deciding if we would want to run diff.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: optionally reject case-clone branch names

2014-05-28 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 RFC follows:

 1. On a case-insensitive server, git receive-pack ought to always reject
 branches which are same-but-for-case of existing branches.
 2. On a case-sensitive server, the same rule by default, with an option
 to allow the old behavior.

 Let me know if, should I write these patches, they would be likely to be
 accepted.

There is another a lot simpler possiblity, no?

3. On any server whose administrator chooses to enforce one case
   variant only rule, install a pre-receive hook that enforces the
   rule.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 13/41] refs.c: change ref_transaction_create to do error checking and return status

2014-05-28 Thread Ronnie Sahlberg
On Wed, May 28, 2014 at 10:07 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Updated the comment in refs.h

 Thanks.

 +++ b/refs.h
 @@ -215,6 +215,31 @@ enum action_on_err {
  };

  /*
 + * On error, transaction functions append a message about what
 + * went wrong to the 'err' argument.  The message mentions what
 + * ref was being updated (if any) when the error occurred so it
 + * can be passed to 'die' or 'error' as-is:
 + *
 + * if (ref_transaction_update(..., err)) {
 + * ret = error(%s, err.buf);
 + * goto cleanup;
 + * }
 + *
 + * The message is appended to err without first clearing err.
 + * This allow shte caller to prepare preamble text to the generated

 s/allow shte/allows the/

 + * error message:
 + *
 + * strbuf_addf(err, Error while doing foo-bar: );
 + * if (ref_transaction_update(..., err)) {
 + * ret = error(%s, err.buf);
 + * goto cleanup;
 + * }

 Nice.

 + *
 + * If the caller wants err to only contain the message for the current error
 + * and nothing else caller will need to guarantee that err is empty or
 + * call strbuf_reset before calling the transaction function.

 I don't think this paragraph is needed --- especially with the
 clarification about how to add a preamble, the contract is clear.

 + */
 +/*
   * Begin a reference transaction.  The reference transaction must
   * be freed by calling ref_transaction_free().
   */

 Now that the comment is longer, it's harder to miss, but it still is
 in an odd place for someone looking to understand what the 'err'
 argument to ref_transaction_update() means.

 How about this patch for squashing in?

 diff --git i/refs.h w/refs.h
 index c741a6c..913ca93 100644
 --- i/refs.h
 +++ w/refs.h
 @@ -10,6 +10,45 @@ struct ref_lock {
 int force_write;
  };

 +/*
 + * A ref_transaction represents a collection of ref updates
 + * that should succeed or fail together.
 + *
 + * Calling sequence
 + * 
 + * - Allocate and initialize a `struct ref_transaction` by calling
 + *   `ref_transaction_begin()`.
 + *
 + * - List intended ref updates by calling functions like
 + *   `ref_transaction_update()` and `ref_transaction_create()`.
 + *
 + * - Call `ref_transaction_commit()` to execute the transaction.
 + *   If this succeeds, the ref updates will have taken place and
 + *   the transaction cannot be rolled back.
 + *
 + * - At any time call `ref_transaction_free()` to discard the
 + *   transaction and free associated resources.  In particular,
 + *   this rolls back the transaction if it has not been
 + *   successfully committed.
 + *
 + * Error handling
 + * --
 + *
 + * On error, transaction functions append a message about what
 + * went wrong to the 'err' argument.  The message mentions what
 + * ref was being updated (if any) when the error occurred so it
 + * can be passed to 'die' or 'error' as-is.
 + *
 + * The message is appended to err without first clearing err.
 + * This allows the caller to prepare preamble text to the generated
 + * error message:
 + *
 + * strbuf_addf(err, Error while doing foo-bar: );
 + * if (ref_transaction_update(..., err)) {
 + * ret = error(%s, err.buf);
 + * goto cleanup;
 + * }
 + */
  struct ref_transaction;

  /*
 @@ -215,31 +254,6 @@ enum action_on_err {
  };

  /*
 - * On error, transaction functions append a message about what
 - * went wrong to the 'err' argument.  The message mentions what
 - * ref was being updated (if any) when the error occurred so it
 - * can be passed to 'die' or 'error' as-is:
 - *
 - * if (ref_transaction_update(..., err)) {
 - * ret = error(%s, err.buf);
 - * goto cleanup;
 - * }
 - *
 - * The message is appended to err without first clearing err.
 - * This allow shte caller to prepare preamble text to the generated
 - * error message:
 - *
 - * strbuf_addf(err, Error while doing foo-bar: );
 - * if (ref_transaction_update(..., err)) {
 - * ret = error(%s, err.buf);
 - * goto cleanup;
 - * }
 - *
 - * If the caller wants err to only contain the message for the current error
 - * and nothing else caller will need to guarantee that err is empty or
 - * call strbuf_reset before calling the transaction function.
 - */
 -/*
   * Begin a reference transaction.  The reference transaction must
   * be freed by calling ref_transaction_free().
   */

Done.
Please re-review.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 13/41] refs.c: change ref_transaction_create to do error checking and return status

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Please re-review.

06df8942 is
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] git diff --(src|dst)-prefix=// causes spurious (new|deleted) file mode

2014-05-28 Thread Junio C Hamano
Noam Postavsky npost...@users.sourceforge.net writes:

 % git init
 Initialized empty Git repository in /home/npostavs/tmp/scratch/.git/
 % echo foo  x
 % git add x
 % git commit -m x
 [master (root-commit) 41be1f2] x
  1 file changed, 1 insertion(+)
  create mode 100644 x
 % echo bar  x
 % git diff  | head -3
 diff --git i/x w/x
 index 257cc56..5716ca5 100644
 --- i/x
 % git diff --dst-prefix=// | head -3
 diff --git i/x //x

The feature these options implement was never designed to accept
anything other than foo/bar/ (i.e. a relative path-looking thing
that ends with / and no funnies such as duplicated slashes, in
order to replace the standard a/ and b/).  I think the command
line parsing code of src/dst-prefix trusts the user too much not to
feed nonsense like the above.  They may want to be tightened.

 Background: trying to find a prefix that can't show up in file names
 in order to make parsing easier.
 https://github.com/magit/magit/pull/1379
 https://github.com/magit/magit/pull/1383

It may be worth studying how git apply finds what the paths are
and use the same rule for consistency.  IIRC, the rules are roughly:

 - In a renaming/copying patch, you will have rename/copy from/to
   header separately.  There is no need to parse the diff --git
   line at all;

 - Otherwise, you will have a/SOMETHING b/SOMETHING (SOMETHING are
   repeated because there is no rename involved).

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] progress.c: replace signal() with sigaction()

2014-05-28 Thread Junio C Hamano
David Kastrup d...@gnu.org writes:

 diff --git a/progress.c b/progress.c
 index 261314e..24df263 100644
 --- a/progress.c
 +++ b/progress.c
 @@ -66,8 +66,12 @@ static void set_progress_signal(void)
  static void clear_progress_signal(void)
  {
 struct itimerval v = {{0,},};
 +   struct sigaction sa;
 +
 +   memset(sa, 0, sizeof(sa));
 +   sa.sa_handler = SIG_IGN;

 A C99 initialiser here would save the call to memset. Unfortunately
 Documentation/CodingGuidelines is fairly clear on not using C99
 initialisers, given the fact we're now at git 2.0 maybe it's time to
 revisit this policy?

 If I look at the initialization of v in the context immediately above
 the new code, it would appear that somebody already revisited this
 policy.

The existing structure initialization that says the first field of
the structure is set to 0 implying everything else will also be
set to 0 is not what we avoid.  That is straight C89.

What we avoid is an initializer with a designator, e.g.

struct sigaction sa = {
.sa_handler = NULL,
.sa_flags = 0,
};

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: optionally reject case-clone branch names

2014-05-28 Thread David Turner
On Wed, 2014-05-28 at 10:14 -0700, Junio C Hamano wrote:
 David Turner dtur...@twopensource.com writes:
 
  RFC follows:
 
  1. On a case-insensitive server, git receive-pack ought to always reject
  branches which are same-but-for-case of existing branches.
  2. On a case-sensitive server, the same rule by default, with an option
  to allow the old behavior.
 
  Let me know if, should I write these patches, they would be likely to be
  accepted.
 
 There is another a lot simpler possiblity, no?
 
 3. On any server whose administrator chooses to enforce one case
variant only rule, install a pre-receive hook that enforces the
rule.

The reason I discovered this issue is that a user came to me to complain
that suddenly their pulls were failing.  Then I had to track down what
the actual problem was (a colleague actually pointed it out to me).

We could add some hooks, but we have a lot of repos, some of which
already have unique hooks that we would have to edit.  And this approach
wouldn't help the next person who gets into this situation, who would
have to again figure out what went wrong, and add the appropriate hook.

Basically, I'm trying to take a poka-yoke approach. Does this seem
reasonable?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 13/41] refs.c: change ref_transaction_create to do error checking and return status

2014-05-28 Thread Ronnie Sahlberg
Thanks

Can you re-review these patches for me :

please review

67b8fce refs.c: add an err argument to repack_without_refs
738ac43 refs.c: add an err argument to delete_ref_loose
b78b0e0 refs.c: update ref_transaction_delete to check for error and
return status
e558f96 refs.c: add transaction.status and track OPEN/CLOSED/ERROR
5a7d9bf sequencer.c: use ref transactions for all ref updates
0ea69df fast-import.c: change update_branch to use ref transactions
57c92fb refs.c: change update_ref to use a transaction





On Wed, May 28, 2014 at 10:25 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Please re-review.

 06df8942 is
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 13/41] refs.c: change ref_transaction_create to do error checking and return status

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Can you re-review these patches for me :

 please review

 67b8fce refs.c: add an err argument to repack_without_refs
 738ac43 refs.c: add an err argument to delete_ref_loose
 b78b0e0 refs.c: update ref_transaction_delete to check for error and
 return status
 e558f96 refs.c: add transaction.status and track OPEN/CLOSED/ERROR
 5a7d9bf sequencer.c: use ref transactions for all ref updates
 0ea69df fast-import.c: change update_branch to use ref transactions
 57c92fb refs.c: change update_ref to use a transaction

The easiest way for me is if you send them to the list so I can
comment inline (and others can chime in with other comments, etc).
This works like a reroll of an entire series (using format-patch -v if
you use format-patch, etc) except that instead of sending as a new
thread the updated patch is just sent in response to some review
comments or the earlier version of that patch.

git send-email has an --in-reply-to option for this.  That involves
getting the Message-Id, which can be a little fussy.

Others who use send-email might know better tricks for this.  I never
ended up learning how to use git send-email (and I'm a little scared
of it) so I just hit reply in mutt and put the patch there.

Thanks and sorry for the fuss,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
 From: Duy Nguyen pclo...@gmail.com

 I don't know how many commands are hit by this. If you have time and
 gdb, please put a break point in die_builtin() function and send
 backtraces for those that fail. You could speed up the process by
 creating a smaller file and set the environment variable
 GIT_ALLOC_LIMIT (in kilobytes) to a number lower than that size. If
 git attempts to allocate a block larger than that limit it'll die.

I don't use Git enough to exercise it well.  And there are dozens of
commands with hundreds of options.

As someone else has noted, if I run 'git commit -q --no-status', it
doesn't crash.

It seems that much of Git was coded under the assumption that any file
could always be held entirely in RAM.  Who made that mistake?  Are
people so out of touch with reality?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

 You need to have enough memory (virtual is fine if you have enough
 time) to do fsck.  Some part of index-pack could be refactored into
 a common helper function that could be called from fsck, but I think
 it would be a lot of work.

How much memory is enough?  And how is it that fsck is coded so that
the available RAM is a limit on which repositories can be checked?
Did someone forget that RAM may be considerably smaller than the
amount of data a program may have to deal with?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread David Lang

On Wed, 28 May 2014, Dale R. Worley wrote:


From: Duy Nguyen pclo...@gmail.com



I don't know how many commands are hit by this. If you have time and
gdb, please put a break point in die_builtin() function and send
backtraces for those that fail. You could speed up the process by
creating a smaller file and set the environment variable
GIT_ALLOC_LIMIT (in kilobytes) to a number lower than that size. If
git attempts to allocate a block larger than that limit it'll die.


I don't use Git enough to exercise it well.  And there are dozens of
commands with hundreds of options.

As someone else has noted, if I run 'git commit -q --no-status', it
doesn't crash.

It seems that much of Git was coded under the assumption that any file
could always be held entirely in RAM.  Who made that mistake?  Are
people so out of touch with reality?


Git was designed to track source code, there are warts that show up in the 
implementation when you use individual files 4GB


such files tend to also not diff well. git-annex and other offshoots hae methods 
boled on that handle such large files better than core git does.


David Lang
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 06/41] refs.c: add an err argument to repack_without_refs

2014-05-28 Thread Jonathan Nieder
Hi,

Looking at 67b8fcee:

Ronnie Sahlberg wrote:
 On Tue, May 27, 2014 at 5:11 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 @@ -2208,6 +2208,7 @@ int commit_packed_refs(void)
   struct packed_ref_cache *packed_ref_cache =
   get_packed_ref_cache(ref_cache);
   int error = 0;
 + int save_errno = 0;

 This is about making errno meaningful when commit_packed_refs returns
 an error.  Probably its API documentation should say so to make it
 obvious when people modify it in the future that they should preserve
 that property or audit callers.

67b8fcee doesn't address this.  While it's often even better to return
an error message or meaningful error number so the caller doesn't have
to worry about errno at all, I think being explicit in the comment
above a declaration about which functions leave behind a meaningful
errno can make error handling saner.

So I still think that the above would be a good idea.

[...]
 [...]
 @@ -2444,6 +2448,11 @@ static int repack_without_refs(const char 
 **refnames, int n)
   return 0; /* no refname exists in packed refs */

   if (lock_packed_refs(0)) {
 + if (err) {
 + unable_to_lock_strbuf(git_path(packed-refs), errno,
 +   err);
 + return -1;
 + }
   unable_to_lock_error(git_path(packed-refs), errno);

 Via the new call to unable_to_lock_..., repack_without_refs cares
 about errno after a failed call to lock_packed_refs.  lock_packed_refs
 can only fail in hold_lock_file_for_update.  hold_lock_file_for_update
 is a thin wrapper around lockfile.c::lock_file.  lock_file can error
 out because

 strlen(path) = max_path_len
[...]
Could be a
 separate, earlier patch (or a TODO comment in this patch to be
 addressed with a later patch) since it's fixing an existing bug.

 I will add it to my todo list.

My worry with that is that it is too easy to forget that there is a
problem at all.  That's not *that* bad --- if no one remembers, how
serious of a problem was it, really?

Except that it makes reading code difficult.  It's easier to read
some code that prints strerror(errno) in a place that is known to
sometimes have errno==0 if there is a comment

/*
 * NEEDSWORK: Tweak lock_packed_refs to either reliably
 * set errno to a sane value on error or to propagate
 * back error information another way.
 */
perror(...);

Otherwise, the reader has to keep scratching their head, wondering
how did this ever work?.

That's why I suggested adding a TODO comment.

 I think passing of errno around is too fragile and that we should
 avoid ad-hoc save_errno hacks and implement dedicated return codes to
 replace errno.
 We should only inspect errno immediately after a syscall has failed.

Sure, agreed in principle. ;-)

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 14/41] refs.c: update ref_transaction_delete to check for error and return status

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  builtin/update-ref.c |  5 +++--
  refs.c   | 16 +++-
  refs.h   | 12 
  3 files changed, 22 insertions(+), 11 deletions(-)

Reviewed-by: Jonathan Nieder jrnie...@gmail.com

[...]
 +++ b/refs.c
 @@ -3417,19 +3417,25 @@ int ref_transaction_create(struct ref_transaction 
 *transaction,
   return 0;
  }
  
 -void ref_transaction_delete(struct ref_transaction *transaction,
 - const char *refname,
 - const unsigned char *old_sha1,
 - int flags, int have_old)
 +int ref_transaction_delete(struct ref_transaction *transaction,
 +const char *refname,
 +const unsigned char *old_sha1,
 +int flags, int have_old,
 +struct strbuf *err)
  {
 - struct ref_update *update = add_update(transaction, refname);
 + struct ref_update *update;
  
 + if (have_old  !old_sha1)
 + die(BUG: have_old is true but old_sha1 is NULL);
 +
 + update = add_update(transaction, refname);
   update-flags = flags;
   update-have_old = have_old;
   if (have_old) {
   assert(!is_null_sha1(old_sha1));

Could combine this into the 'if (have_old ' check.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 16/41] refs.c: add transaction.status and track OPEN/CLOSED/ERROR

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
[...]
 @@ -3385,6 +3408,9 @@ int ref_transaction_update(struct ref_transaction 
 *transaction,
  {
   struct ref_update *update;
  
 + if (transaction-state != REF_TRANSACTION_OPEN)
 + return -1;

I still think this is a step in the wrong direction.  It means that
instead of being able to do

if (ref_transaction_update(..., err))
die(%s, err.buf);

and be certain that err.buf has a meaningful message, now I have to
worry that if the state were not REF_TRANSACTION_OPEN (e.g., due to
someone else's code forgetting to handle an error) then the result
would be a plain

fatal:

The behavior would be much easier to debug if the code were

if (transaction-state != REF_TRANSACTION_OPEN)
die(BUG: update on transaction that is not open);

since then the symptom would be

fatal: BUG: update on transaction that is not open

which is easier to work with.

If a non-OPEN state were a normal, recoverable error, then it could
append a message to the *err argument.  But there's no message that
could be put there that would be meaningful to the end-user.  So I
suspect it's not a normal, recoverable error.

If we want to collect errors to make _commit() later fail with a
message like '38 refs failed to update' or something (or a
string_list, if the API is to change that way in the future), then
_update() should not fail.  It can record information about what is
wrong with this update in the transaction and succeed so the caller
knows to keep going and collect other updates before the _commit()
that will fail.

Of course it's easily possible I'm missing something.  That's just how
I see it now.

Does that make sense?

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
 From: David Lang da...@lang.hm

 Git was designed to track source code, there are warts that show up
 in the implementation when you use individual files 4GB

I'd expect that if you want to deal with files over 100k, you should
assume that it doesn't all fit in memory.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 20/41] sequencer.c: use ref transactions for all ref updates

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  sequencer.c | 24 
  1 file changed, 16 insertions(+), 8 deletions(-)

Reviewed-by: Jonathan Nieder jrnie...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Add an explicit GIT_DIR to the list of excludes

2014-05-28 Thread Jakub Narębski

W dniu 2014-05-27 19:16, Pasha Bolokhov pisze:

Add GIT_DIR to the list of excludes in setup_standard_excludes(),
while checking that GIT_DIR is not just '.git', in which case it
would be ignored by default, and that GIT_DIR is inside GIT_WORK_TREE


gives git-grep.txt and git-ls-files.txt. I don't know if we need to
add something about this extra exclude rule to those .txt. If it's so
obvious that this should be the expected behavior, then probably not.


 I suggest this. There appears to be a notion of standard
excludes both in the code (dir.c) and in the man pages (git-grep,
git-ls-files). However, it doesn't actually appear to be defined
strictly speaking. So my suggestion is to define those standard
excludes in one place (say gitignore(5)), and make other man pages
(git-config, git-grip, git-ls-files) have references to that place
instead of explaining every time in detail what is being excluded.


Or define those standard excludes in standard-excludes.txt partial,
and include said file from git-grep(1), git-ls-files(1), and perhaps 
gitignore(5).


--
Jakub Narębski


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Junio C Hamano
David Lang da...@lang.hm writes:

 On Wed, 28 May 2014, Dale R. Worley wrote:

 It seems that much of Git was coded under the assumption that any file
 could always be held entirely in RAM.  Who made that mistake?  Are
 people so out of touch with reality?

 Git was designed to track source code, there are warts that show up in
 the implementation when you use individual files 4GB

 such files tend to also not diff well. git-annex and other offshoots
 hae methods boled on that handle such large files better than core git
 does.

Very well explained, but perhaps you went a bit too far, I am
afraid.

The fact that our primary focus being the source code does not at
all mean that we are not interested to enhance the system to also
cater to those who want to put materials that are traditionally
considered non-source to it, now that we have become fairly good at
doing the source code.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 21/41] fast-import.c: change update_branch to use ref transactions

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  fast-import.c | 25 +++--
  1 file changed, 15 insertions(+), 10 deletions(-)

Reviewed-by: Jonathan Nieder jrnie...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread David Lang

On Wed, 28 May 2014, Dale R. Worley wrote:


From: David Lang da...@lang.hm



Git was designed to track source code, there are warts that show up
in the implementation when you use individual files 4GB


I'd expect that if you want to deal with files over 100k, you should
assume that it doesn't all fit in memory.


well, as others noted, the problem is actually caused by doing the diffs, and 
that is something that is a very common thing to do with source code.


And I would assume that files of several MB would be able to fit in memory 
(again, this was assumed to be for development, and compilers take a lot of ram 
to run, so having enough ram to hold any individual source file while the 
compiler is _not_ using ram doesn't seem likely to be a problem)


David Lang
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread David Lang

On Wed, 28 May 2014, Junio C Hamano wrote:


David Lang da...@lang.hm writes:


On Wed, 28 May 2014, Dale R. Worley wrote:


It seems that much of Git was coded under the assumption that any file
could always be held entirely in RAM.  Who made that mistake?  Are
people so out of touch with reality?


Git was designed to track source code, there are warts that show up in
the implementation when you use individual files 4GB

such files tend to also not diff well. git-annex and other offshoots
hae methods boled on that handle such large files better than core git
does.


Very well explained, but perhaps you went a bit too far, I am
afraid.

The fact that our primary focus being the source code does not at
all mean that we are not interested to enhance the system to also
cater to those who want to put materials that are traditionally
considered non-source to it, now that we have become fairly good at
doing the source code.


Correct, I didn't mean to imply that git is only for source files, just noting 
it's origional purpose.


now that there are multiple different add-ons for git to handle large files in 
different ways, I'm watching to see what can get folded back into the core.


David Lang
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 08/11] trailer: add tests for git interpret-trailers

2014-05-28 Thread Junio C Hamano
Christian Couder chrisc...@tuxfamily.org writes:

 +test_expect_success 'using where = before for a token in the middle of the 
 message' '
 + git config trailer.review.key Reviewed-by: 

Don't you want to adjust this to have trailing SP, just like you
adjusted other ones like Fixes:  in this round?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 23/41] refs.c: change update_ref to use a transaction

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 @@ -3474,11 +3474,28 @@ int update_ref(const char *action, const char 
 *refname,
  const unsigned char *sha1, const unsigned char *oldval,
  int flags, enum action_on_err onerr)
  {
 - struct ref_lock *lock;
 - lock = update_ref_lock(refname, oldval, flags, NULL, onerr);
 - if (!lock)
 + struct ref_transaction *t;
 + struct strbuf err = STRBUF_INIT;
 +
 + t = ref_transaction_begin(err);
 + if ((!t ||
 + ref_transaction_update(t, refname, sha1, oldval, flags,
 +!!oldval, err)) ||

(style) Extra parens.

 + (ref_transaction_commit(t, action, err)  !(t = NULL))) {

No need for this assignment-in-if.

With the following squashed in,
Reviewed-by: Jonathan Nieder jrnie...@gmail.com

diff --git i/refs.c w/refs.c
index 568b358..fb462a3 100644
--- i/refs.c
+++ w/refs.c
@@ -3474,10 +3474,10 @@ int update_ref(const char *action, const char *refname,
struct strbuf err = STRBUF_INIT;
 
t = ref_transaction_begin(err);
-   if ((!t ||
+   if (!t ||
ref_transaction_update(t, refname, sha1, oldval, flags,
-  !!oldval, err)) ||
-   (ref_transaction_commit(t, action, err)  !(t = NULL))) {
+  !!oldval, err) ||
+   ref_transaction_commit(t, action, err)) {
const char *str = update_ref failed for ref '%s': %s;
 
ref_transaction_free(t);
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 11/11] Documentation: add documentation for 'git interpret-trailers'

2014-05-28 Thread Junio C Hamano
Christian Couder chrisc...@tuxfamily.org writes:

 +The trailers are recognized in the input message using the following
 +rules:
 +
 +* only lines that contains a ':' (colon) are considered trailers,
 +
 +* the trailer lines must all be next to each other,
 +
 +* after them it's only possible to have some lines that contain only
 +  spaces, and then a patch; the patch part is recognized using the
 +  fact that its first line starts with '---' (three minus signs),
 +
 +* before them there must be at least one line with only spaces.

While I agree with Michael on the other thread that we should limit
the syntax and start with ':' only, if you really want to allow
random syntax like Bug #12345 and Acked-by= Peff, for which you
have demonstrations in the tests in the other patch, the above rule
should be updated to also allow prefix matches to possible set of
keys defined by the user, so that an existing line that is produced
by your tool, e.g. Acked-by= Peff, can be picked up as matching
with some token having a key Acked-by= .  Otherwise, the input
side of your tool is inconsistent with the output side of your own
tool, and that will make the flexiblity of the output side useless.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 11/11] Documentation: add documentation for 'git interpret-trailers'

2014-05-28 Thread Junio C Hamano
Christian Couder chrisc...@tuxfamily.org writes:

 +trailer.token.key::
 + This `key` will be used instead of token in the
 + trailer. After the last alphanumeric character, this variable
 + can contain some non alphanumeric characters, like for example
 + `'%'` (one percent sign), `' = '` (one space followed by one
 + equal sign and then one space), `' #'` (one space followed by
 + one number sign) or even just `' '` (one space), that will be
 + used to separate the token from the value in the
 + trailer. The default separator, `': '` (one colon followed by
 + one space), which is the usual standard, is added only if the
 + last character in `key` is alphanumeric, so watch out for
 + unwanted trailing spaces in this variable.

Perhaps corollary to the other review comment to this patch, I think
this is overly complex without giving more value to the users than
it causes confusion.

If the goal is to allow random syntax on the output side, why do you
even need to list percent, pound, etc., when you can just say The
key is emitted and then your value and the user will get exactly
what they specified to be emitted?

Any magic applied on top (namely, we would add :  only in certain
circumstances) only makes things unnecessarily complex; I think your
having to say so watch out for... is a clear indication of that.

If we use the separator, so that we can recognise a line with an
unseen key as a trailer we do not know about, we should stick to
just a single standard one ':' and I think it is OK to add a missing
:  by magic if that is the direction we are going to take, though.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 25/41] fast-import.c: use a ref transaction when dumping tags

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 --- a/fast-import.c
 +++ b/fast-import.c
 @@ -1735,15 +1735,22 @@ static void dump_tags(void)
  {
   static const char *msg = fast-import;
   struct tag *t;
 - struct ref_lock *lock;
   char ref_name[PATH_MAX];
 + struct strbuf err = STRBUF_INIT;
 + struct ref_transaction *transaction;
  
 + transaction = ref_transaction_begin(err);
   for (t = first_tag; t; t = t-next_tag) {
 - sprintf(ref_name, tags/%s, t-name);
 + snprintf(ref_name, PATH_MAX, refs/tags/%s, t-name);

That ignores the error if the refname happens to go longer than
PATH_MAX.

This is the part of fast-import that doesn't need to be super
fast ;-).  (The objects have been written to the pack, and now
we just need to write some refs.)  Could this use a strbuf?  By that,
I mean something like

diff --git i/fast-import.c w/fast-import.c
index 3db5b3d..d5f6e63 100644
--- i/fast-import.c
+++ w/fast-import.c
@@ -1735,21 +1735,28 @@ static void dump_tags(void)
 {
static const char *msg = fast-import;
struct tag *t;
-   char ref_name[PATH_MAX];
+   struct strbuf ref_name = STRBUF_INIT;
struct strbuf err = STRBUF_INIT;
struct ref_transaction *transaction;
 
+   strbuf_addstr(ref_name, refs/tags/);
+
transaction = ref_transaction_begin(err);
for (t = first_tag; t; t = t-next_tag) {
-   snprintf(ref_name, PATH_MAX, refs/tags/%s, t-name);
+   strbuf_setlen(ref_name, strlen(refs/tags/));
+   strbuf_addstr(ref_name, t-name);
 
-   if (ref_transaction_update(transaction, ref_name, t-sha1,
-  NULL, 0, 0, err))
-   break;
+   if (ref_transaction_update(transaction, ref_name.buf, t-sha1,
+  NULL, 0, 0, err)) {
+   failure |= error(%s, err.buf);
+   goto done;
+   }
}
if (ref_transaction_commit(transaction, msg, err))
failure |= error(%s, err.buf);
+done:
ref_transaction_free(transaction);
+   strbuf_release(ref_name);
strbuf_release(err);
 }
 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 26/41] walker.c: use ref transaction for ref updates

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Switch to using ref transactions in walker_fetch(). As part of the refactoring
 to use ref transactions we also fix a potential memory leak where in the
 original code if write_ref_sha1() would fail we would end up returning from
 the function without free()ing the msg string.

Sounds good.

 This changes the locking slightly for walker_fetch. Previously the code would
 lock all refs before writing them but now we do not lock the refs until the
 commit stage.

I don't think this actually changes locking in any significant way.
(Timing changes, but that's to be expected whenever code changes.)

The old contract was:

You run git http-fetch with refnames passed with -w.  http-fetch
will overwrite the refs with their new values.

The new contract is:

You run git http-fetch with refnames passed with -w.  http-fetch
will overwrite the refs with their new values.

What changed?

[...]
 Note that this function is only called when fetching from a remote HTTP
 repository onto the local (most of the time single-user) repository which
 likely means that the type of collissions that the previous locking would
 protect against and cause the fetch to fail for to be even more rare.

More to the point, while this function is used by 'git remote-http' (and
hence by git fetch https://foo/bar/baz;), it is only used to fetch
objects by that code path.  The remote helper machinery handles the ref
updates on its own (and passes NULL as the write_ref argument to the
dumb walker).

[...]
 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  walker.c | 56 +---
  1 file changed, 33 insertions(+), 23 deletions(-)

The code change looks good from a quick glance.

gcc 4.6.3 doesn't seem to be smart enough to notice that 'transaction'
is always initialized before it is used, so it (optionally) could be
worth initializing 'transition = NULL' to work around that.

So I think this just needs a simpler commit message and then it would be
good to go.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] check_refname_component: Optimize

2014-05-28 Thread David Turner
In a repository with tens of thousands of refs, the command
~/git/git-diff-index --cached --quiet --ignore-submodules [revision]
is a bit slow.  check_refname_component is a major contributor to the
runtime.  Provide two optimized versions of this function: one for
machines with SSE4.2, and one for machines without.  The speedup for
this command on one particular repo (with about 60k refs) is about 12%
for the SSE version and 8% for the non-SSE version.

Signed-off-by: David Turner dtur...@twitter.com
---
 Makefile   |   6 +++
 configure.ac   |   6 +++
 refs.c | 143 +++--
 t/t5511-refspec.sh |  13 +
 4 files changed, 163 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index a53f3a8..123e2fc 100644
--- a/Makefile
+++ b/Makefile
@@ -1326,6 +1326,11 @@ else
COMPAT_OBJS += compat/win32mmap.o
endif
 endif
+ifdef NO_SSE
+   BASIC_CFLAGS += -DNO_SSE
+else
+   BASIC_CFLAGS += -msse4
+endif
 ifdef OBJECT_CREATION_USES_RENAMES
COMPAT_CFLAGS += -DOBJECT_CREATION_MODE=1
 endif
@@ -2199,6 +2204,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' $@
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' $@
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' $@
+   @echo NO_SSE=\''$(subst ','\'',$(subst ','\'',$(NO_SSE)))'\' $@
 ifdef TEST_OUTPUT_DIRECTORY
@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst 
','\'',$(TEST_OUTPUT_DIRECTORY)))'\' $@
 endif
diff --git a/configure.ac b/configure.ac
index b711254..71a9429 100644
--- a/configure.ac
+++ b/configure.ac
@@ -382,6 +382,11 @@ AS_HELP_STRING([],[Tcl/Tk interpreter will be found in a 
system.]),
 GIT_PARSE_WITH(tcltk))
 #
 
+# Declare the with-sse/without-sse options.
+AC_ARG_WITH(sse,
+AS_HELP_STRING([--with-sse],[use SSE instructions (default is YES)]),
+GIT_PARSE_WITH(sse))
+
 
 ## Checks for programs.
 AC_MSG_NOTICE([CHECKS for programs])
@@ -449,6 +454,7 @@ else
   fi
 fi
 GIT_CONF_SUBST([TCLTK_PATH])
+GIT_CONF_SUBST([NO_SSE])
 AC_CHECK_PROGS(ASCIIDOC, [asciidoc])
 if test -n $ASCIIDOC; then
AC_MSG_CHECKING([for asciidoc version])
diff --git a/refs.c b/refs.c
index 28d5eca..8ca124c 100644
--- a/refs.c
+++ b/refs.c
@@ -5,6 +5,8 @@
 #include dir.h
 #include string-list.h
 
+#include nmmintrin.h
+
 /*
  * Make sure ref is something reasonable to have under .git/refs/;
  * We do not like it if:
@@ -29,30 +31,160 @@ static inline int bad_ref_char(int ch)
return 0;
 }
 
+char refname_disposition[] = {
+   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+   0, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9, 2, 1,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9, 9, 0,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 0, 9, 0, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 3, 9, 9, 0, 0,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9
+};
+
 /*
  * Try to read one refname component from the front of refname.  Return
  * the length of the component found, or -1 if the component is not
  * legal.
  */
-static int check_refname_component(const char *refname, int flags)
+static int check_refname_component_1(const char *refname, int flags)
 {
const char *cp;
char last = '\0';
 
for (cp = refname; ; cp++) {
-   char ch = *cp;
-   if (ch == '\0' || ch == '/')
+   unsigned char ch = (unsigned char) *cp;
+   char disp = refname_disposition[ch];
+   switch(disp) {
+   case 0:
+  return -1;
+   case 1:
+   goto out;
+   case 2:
+   if (last == '.')
+   return -1;
break;
-   if (bad_ref_char(ch))
-   return -1; /* Illegal character in refname. */
+   case 3:
+  if (last == '@')
+  return -1;
+  break;
+  }
+
if (last == '.'  ch == '.')
return -1; /* Refname contains ... */
if (last == '@'  ch == '{')
return -1; /* Refname contains @{. */
last = ch;
}
+out:
+   if (cp == refname)
+   return 0; /* Component 

Re: [PATCH v11 29/41] refs.c: remove the update_ref_write function

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  refs.c | 35 +--
  1 file changed, 9 insertions(+), 26 deletions(-)

Reviewed-by: Jonathan Nieder jrnie...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 26/41] walker.c: use ref transaction for ref updates

2014-05-28 Thread Ronnie Sahlberg
On Wed, May 28, 2014 at 12:56 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Switch to using ref transactions in walker_fetch(). As part of the 
 refactoring
 to use ref transactions we also fix a potential memory leak where in the
 original code if write_ref_sha1() would fail we would end up returning from
 the function without free()ing the msg string.

 Sounds good.

 This changes the locking slightly for walker_fetch. Previously the code would
 lock all refs before writing them but now we do not lock the refs until the
 commit stage.

 I don't think this actually changes locking in any significant way.
 (Timing changes, but that's to be expected whenever code changes.)

 The old contract was:

 You run git http-fetch with refnames passed with -w.  http-fetch
 will overwrite the refs with their new values.

 The new contract is:

 You run git http-fetch with refnames passed with -w.  http-fetch
 will overwrite the refs with their new values.

 What changed?

Timing for the locks basically.
Which means the outcome for certain races could change slightly.
But not important.


 [...]
 Note that this function is only called when fetching from a remote HTTP
 repository onto the local (most of the time single-user) repository which
 likely means that the type of collissions that the previous locking would
 protect against and cause the fetch to fail for to be even more rare.

 More to the point, while this function is used by 'git remote-http' (and
 hence by git fetch https://foo/bar/baz;), it is only used to fetch
 objects by that code path.  The remote helper machinery handles the ref
 updates on its own (and passes NULL as the write_ref argument to the
 dumb walker).

 [...]
 Signed-off-by: Ronnie Sahlberg sahlb...@google.com
 ---
  walker.c | 56 +---
  1 file changed, 33 insertions(+), 23 deletions(-)

 The code change looks good from a quick glance.

 gcc 4.6.3 doesn't seem to be smart enough to notice that 'transaction'
 is always initialized before it is used, so it (optionally) could be
 worth initializing 'transition = NULL' to work around that.

 So I think this just needs a simpler commit message and then it would be
 good to go.


Done, thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] check_refname_component: Optimize

2014-05-28 Thread David Turner
In a repository with tens of thousands of refs, the command
~/git/git-diff-index --cached --quiet --ignore-submodules [revision]
is a bit slow.  check_refname_component is a major contributor to the
runtime.  Provide two optimized versions of this function: one for
machines with SSE4.2, and one for machines without.  The speedup for
this command on one particular repo (with about 60k refs) is about 12%
for the SSE version and 8% for the non-SSE version.

Signed-off-by: David Turner dtur...@twitter.com
---
 Makefile   |   6 +++
 configure.ac   |   6 +++
 refs.c | 152 +++--
 t/t5511-refspec.sh |  13 +
 4 files changed, 172 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index a53f3a8..123e2fc 100644
--- a/Makefile
+++ b/Makefile
@@ -1326,6 +1326,11 @@ else
COMPAT_OBJS += compat/win32mmap.o
endif
 endif
+ifdef NO_SSE
+   BASIC_CFLAGS += -DNO_SSE
+else
+   BASIC_CFLAGS += -msse4
+endif
 ifdef OBJECT_CREATION_USES_RENAMES
COMPAT_CFLAGS += -DOBJECT_CREATION_MODE=1
 endif
@@ -2199,6 +2204,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' $@
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' $@
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' $@
+   @echo NO_SSE=\''$(subst ','\'',$(subst ','\'',$(NO_SSE)))'\' $@
 ifdef TEST_OUTPUT_DIRECTORY
@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst 
','\'',$(TEST_OUTPUT_DIRECTORY)))'\' $@
 endif
diff --git a/configure.ac b/configure.ac
index b711254..71a9429 100644
--- a/configure.ac
+++ b/configure.ac
@@ -382,6 +382,11 @@ AS_HELP_STRING([],[Tcl/Tk interpreter will be found in a 
system.]),
 GIT_PARSE_WITH(tcltk))
 #
 
+# Declare the with-sse/without-sse options.
+AC_ARG_WITH(sse,
+AS_HELP_STRING([--with-sse],[use SSE instructions (default is YES)]),
+GIT_PARSE_WITH(sse))
+
 
 ## Checks for programs.
 AC_MSG_NOTICE([CHECKS for programs])
@@ -449,6 +454,7 @@ else
   fi
 fi
 GIT_CONF_SUBST([TCLTK_PATH])
+GIT_CONF_SUBST([NO_SSE])
 AC_CHECK_PROGS(ASCIIDOC, [asciidoc])
 if test -n $ASCIIDOC; then
AC_MSG_CHECKING([for asciidoc version])
diff --git a/refs.c b/refs.c
index 28d5eca..8f0de04 100644
--- a/refs.c
+++ b/refs.c
@@ -5,6 +5,8 @@
 #include dir.h
 #include string-list.h
 
+#include nmmintrin.h
+
 /*
  * Make sure ref is something reasonable to have under .git/refs/;
  * We do not like it if:
@@ -29,30 +31,169 @@ static inline int bad_ref_char(int ch)
return 0;
 }
 
+char refname_disposition[] = {
+   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+   0, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9, 2, 1,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9, 9, 0,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 0, 9, 0, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 3, 9, 9, 0, 0,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9
+};
+
 /*
  * Try to read one refname component from the front of refname.  Return
  * the length of the component found, or -1 if the component is not
  * legal.
  */
-static int check_refname_component(const char *refname, int flags)
+static int check_refname_component_1(const char *refname, int flags)
 {
const char *cp;
char last = '\0';
 
for (cp = refname; ; cp++) {
-   char ch = *cp;
-   if (ch == '\0' || ch == '/')
+   unsigned char ch = (unsigned char) *cp;
+   char disp = refname_disposition[ch];
+   switch(disp) {
+   case 0:
+  return -1;
+   case 1:
+   goto out;
+   case 2:
+   if (last == '.')
+   return -1;
break;
-   if (bad_ref_char(ch))
-   return -1; /* Illegal character in refname. */
+   case 3:
+  if (last == '@')
+  return -1;
+  break;
+  }
+
if (last == '.'  ch == '.')
return -1; /* Refname contains ... */
if (last == '@'  ch == '{')
return -1; /* Refname contains @{. */
last = ch;
}
+out:
+   if (cp == refname)
+   return 0; /* Component 

[PATCH v2 0/1] check_refname_component: Optimize

2014-05-28 Thread David Turner
I just tested the previous patch on a Mac with clang and it needed
some tweaks.

Also, I should clarify that this represents a real use-case: we really
do have tens of thousands of branches on some repos.  It would be nice
if people would clean up after themselves, but they don't.

(Also, it's probably worth adding a shortcut for HEAD at some point,
since that is the most-commonly-used ref, but that's for a later
patch)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 23/41] refs.c: change update_ref to use a transaction

2014-05-28 Thread Ronnie Sahlberg
Done.
Thanks.

On Wed, May 28, 2014 at 12:31 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 @@ -3474,11 +3474,28 @@ int update_ref(const char *action, const char 
 *refname,
  const unsigned char *sha1, const unsigned char *oldval,
  int flags, enum action_on_err onerr)
  {
 - struct ref_lock *lock;
 - lock = update_ref_lock(refname, oldval, flags, NULL, onerr);
 - if (!lock)
 + struct ref_transaction *t;
 + struct strbuf err = STRBUF_INIT;
 +
 + t = ref_transaction_begin(err);
 + if ((!t ||
 + ref_transaction_update(t, refname, sha1, oldval, flags,
 +!!oldval, err)) ||

 (style) Extra parens.

 + (ref_transaction_commit(t, action, err)  !(t = NULL))) {

 No need for this assignment-in-if.

 With the following squashed in,
 Reviewed-by: Jonathan Nieder jrnie...@gmail.com

 diff --git i/refs.c w/refs.c
 index 568b358..fb462a3 100644
 --- i/refs.c
 +++ w/refs.c
 @@ -3474,10 +3474,10 @@ int update_ref(const char *action, const char 
 *refname,
 struct strbuf err = STRBUF_INIT;

 t = ref_transaction_begin(err);
 -   if ((!t ||
 +   if (!t ||
 ref_transaction_update(t, refname, sha1, oldval, flags,
 -  !!oldval, err)) ||
 -   (ref_transaction_commit(t, action, err)  !(t = NULL))) {
 +  !!oldval, err) ||
 +   ref_transaction_commit(t, action, err)) {
 const char *str = update_ref failed for ref '%s': %s;

 ref_transaction_free(t);
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] check_refname_component: Optimize

2014-05-28 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 In a repository with tens of thousands of refs, the command
 ~/git/git-diff-index --cached --quiet --ignore-submodules [revision]
 is a bit slow.  check_refname_component is a major contributor to the
 runtime.  Provide two optimized versions of this function: one for
 machines with SSE4.2, and one for machines without.  The speedup for
 this command on one particular repo (with about 60k refs) is about 12%
 for the SSE version and 8% for the non-SSE version.

 Signed-off-by: David Turner dtur...@twitter.com

Just a few quick impressions (I do not have time today to look at
new patches).

 - The title seems a bit strange.
   Perhaps refs.c: optimize check_refname_component() or something?

 - ~/git/git-diff-index looks doubly strange in that the issue you
   are addressing would not depend on your compiled Git being
   installed in your $HOME/git at all.  For that matter, from the
   command line and the patch, I am not sure if this is specific to
   git diff-index, or the same issue exists for anything that
   takes revs and pathspecs from the command line.





 ---
  Makefile   |   6 +++
  configure.ac   |   6 +++
  refs.c | 143 
 +++--
  t/t5511-refspec.sh |  13 +
  4 files changed, 163 insertions(+), 5 deletions(-)

 diff --git a/Makefile b/Makefile
 index a53f3a8..123e2fc 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -1326,6 +1326,11 @@ else
   COMPAT_OBJS += compat/win32mmap.o
   endif
  endif
 +ifdef NO_SSE
 + BASIC_CFLAGS += -DNO_SSE
 +else
 + BASIC_CFLAGS += -msse4
 +endif
  ifdef OBJECT_CREATION_USES_RENAMES
   COMPAT_CFLAGS += -DOBJECT_CREATION_MODE=1
  endif
 @@ -2199,6 +2204,7 @@ GIT-BUILD-OPTIONS: FORCE
   @echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' $@
   @echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' $@
   @echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
 ','\'',$(NO_UNIX_SOCKETS)))'\' $@
 + @echo NO_SSE=\''$(subst ','\'',$(subst ','\'',$(NO_SSE)))'\' $@
  ifdef TEST_OUTPUT_DIRECTORY
   @echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst 
 ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' $@
  endif
 diff --git a/configure.ac b/configure.ac
 index b711254..71a9429 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -382,6 +382,11 @@ AS_HELP_STRING([],[Tcl/Tk interpreter will be found in a 
 system.]),
  GIT_PARSE_WITH(tcltk))
  #
  
 +# Declare the with-sse/without-sse options.
 +AC_ARG_WITH(sse,
 +AS_HELP_STRING([--with-sse],[use SSE instructions (default is YES)]),
 +GIT_PARSE_WITH(sse))
 +
  
  ## Checks for programs.
  AC_MSG_NOTICE([CHECKS for programs])
 @@ -449,6 +454,7 @@ else
fi
  fi
  GIT_CONF_SUBST([TCLTK_PATH])
 +GIT_CONF_SUBST([NO_SSE])
  AC_CHECK_PROGS(ASCIIDOC, [asciidoc])
  if test -n $ASCIIDOC; then
   AC_MSG_CHECKING([for asciidoc version])
 diff --git a/refs.c b/refs.c
 index 28d5eca..8ca124c 100644
 --- a/refs.c
 +++ b/refs.c
 @@ -5,6 +5,8 @@
  #include dir.h
  #include string-list.h
  
 +#include nmmintrin.h
 +

We would prefer not to add inclusion of any system header files in
random *.c files, as there often are system dependencies (order of
inclusion, definition of feature macros, etc.) we would rather want
to encapsulate in one place, that is git-compat-util.h.

  /*
   * Make sure ref is something reasonable to have under .git/refs/;
   * We do not like it if:
 @@ -29,30 +31,160 @@ static inline int bad_ref_char(int ch)
   return 0;
  }
  
 +char refname_disposition[] = {
 +   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 +   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 ...
 +   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9
 +};
 +

Does this array need to be extern?

Is this table designed to take both positive and negative values?
If the answer is I expect we add only positive, then please make
it explicitly unsigned char.

What do these magic numbers in the array mean?

How were the values derived?  What are you doing in this commit to
help others who later need to debug, fix and enhance this table?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] check_refname_component: Optimize

2014-05-28 Thread Michael Haggerty
On 05/28/2014 11:04 PM, David Turner wrote:
 In a repository with tens of thousands of refs, the command
 ~/git/git-diff-index --cached --quiet --ignore-submodules [revision]
 is a bit slow.  check_refname_component is a major contributor to the
 runtime.  Provide two optimized versions of this function: one for
 machines with SSE4.2, and one for machines without.  The speedup for
 this command on one particular repo (with about 60k refs) is about 12%
 for the SSE version and 8% for the non-SSE version.

Interesting.  Thanks for the patch.

Why do you use git diff-index to benchmark your change?  Is there
something particular about that command that leads to especially bad
performance with lots of references?

I assume that most of the time spent in check_refname_component() is
while reading the packed-refs file, right?  If so, there are probably
other git commands that would better measure that time, with less other
work.  For example, git rev-parse refname will read the packed-refs
file, too (assuming that refname is not a loose reference).  If you
test the speedup of a cheaper command like that, it seems to me that you
will get a better idea of how much you are speeding up the ref-reading
part.  It would also be interesting to see the difference in
milliseconds (rather than a percentage) for some specified number of
references.

I think it's worth using your LUT-based approach to save 8%.  I'm less
sure it's worth the complication and unreadability of using SSE to get
the next 4% speedup.  But the gains might be proven to be more
significant if you benchmark them differently.

I won't critique the code in detail until we have thrashed out the
metaissues, but I made a few comments below about nits that I noticed
when I scanned through.

 Signed-off-by: David Turner dtur...@twitter.com
 ---
  Makefile   |   6 +++
  configure.ac   |   6 +++
  refs.c | 152 
 +++--
  t/t5511-refspec.sh |  13 +
  4 files changed, 172 insertions(+), 5 deletions(-)
 
 diff --git a/Makefile b/Makefile
 index a53f3a8..123e2fc 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -1326,6 +1326,11 @@ else
   COMPAT_OBJS += compat/win32mmap.o
   endif
  endif
 +ifdef NO_SSE
 + BASIC_CFLAGS += -DNO_SSE
 +else
 + BASIC_CFLAGS += -msse4
 +endif
  ifdef OBJECT_CREATION_USES_RENAMES
   COMPAT_CFLAGS += -DOBJECT_CREATION_MODE=1
  endif
 @@ -2199,6 +2204,7 @@ GIT-BUILD-OPTIONS: FORCE
   @echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' $@
   @echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' $@
   @echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
 ','\'',$(NO_UNIX_SOCKETS)))'\' $@
 + @echo NO_SSE=\''$(subst ','\'',$(subst ','\'',$(NO_SSE)))'\' $@
  ifdef TEST_OUTPUT_DIRECTORY
   @echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst 
 ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' $@
  endif
 diff --git a/configure.ac b/configure.ac
 index b711254..71a9429 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -382,6 +382,11 @@ AS_HELP_STRING([],[Tcl/Tk interpreter will be found in a 
 system.]),
  GIT_PARSE_WITH(tcltk))
  #
  
 +# Declare the with-sse/without-sse options.
 +AC_ARG_WITH(sse,
 +AS_HELP_STRING([--with-sse],[use SSE instructions (default is YES)]),
 +GIT_PARSE_WITH(sse))
 +
  
  ## Checks for programs.
  AC_MSG_NOTICE([CHECKS for programs])
 @@ -449,6 +454,7 @@ else
fi
  fi
  GIT_CONF_SUBST([TCLTK_PATH])
 +GIT_CONF_SUBST([NO_SSE])
  AC_CHECK_PROGS(ASCIIDOC, [asciidoc])
  if test -n $ASCIIDOC; then
   AC_MSG_CHECKING([for asciidoc version])
 diff --git a/refs.c b/refs.c
 index 28d5eca..8f0de04 100644
 --- a/refs.c
 +++ b/refs.c
 @@ -5,6 +5,8 @@
  #include dir.h
  #include string-list.h
  
 +#include nmmintrin.h
 +

You include this file unconditionally, but I doubt that it is present on
all platforms.  I guess it has to be protected with #ifdef or something
at a minimum.

  /*
   * Make sure ref is something reasonable to have under .git/refs/;
   * We do not like it if:
 @@ -29,30 +31,169 @@ static inline int bad_ref_char(int ch)
   return 0;
  }
  

Please add a comment here about what the values in refname_disposition
signify.  And maybe add some line comments explaining unusual values,
for people who haven't memorized the ASCII encoding; e.g., on the third
line,

/* SP - 0, '.' - 2, '-' - 1 */

Also, this variable could be static.  And if you change your encoding to
use 0 instead of 9 for valid characters, then you could define the table
with an explicit length of 256 and omit the second half of the
initializers, letting those values be initialized to zero automatically.

 +char refname_disposition[] = {
 +   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 +   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 +   0, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9, 2, 1,
 +   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9, 9, 0,
 +   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
 + 

Re: [PATCH v11 16/41] refs.c: add transaction.status and track OPEN/CLOSED/ERROR

2014-05-28 Thread Ronnie Sahlberg
On Wed, May 28, 2014 at 11:51 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/refs.c
 +++ b/refs.c
 [...]
 @@ -3385,6 +3408,9 @@ int ref_transaction_update(struct ref_transaction 
 *transaction,
  {
   struct ref_update *update;

 + if (transaction-state != REF_TRANSACTION_OPEN)
 + return -1;

 I still think this is a step in the wrong direction.  It means that
 instead of being able to do

 if (ref_transaction_update(..., err))
 die(%s, err.buf);

 and be certain that err.buf has a meaningful message, now I have to
 worry that if the state were not REF_TRANSACTION_OPEN (e.g., due to
 someone else's code forgetting to handle an error) then the result
 would be a plain

 fatal:

 The behavior would be much easier to debug if the code were

 if (transaction-state != REF_TRANSACTION_OPEN)
 die(BUG: update on transaction that is not open);

 since then the symptom would be

 fatal: BUG: update on transaction that is not open

 which is easier to work with.

 If a non-OPEN state were a normal, recoverable error, then it could
 append a message to the *err argument.  But there's no message that
 could be put there that would be meaningful to the end-user.  So I
 suspect it's not a normal, recoverable error.

 If we want to collect errors to make _commit() later fail with a
 message like '38 refs failed to update' or something (or a
 string_list, if the API is to change that way in the future), then
 _update() should not fail.

Agreed.
I have removed the if (transaction-state != REF_TRANSACTION_OPEN)
check from _update/_delete/_create and documented this usecase.

Thanks.!

  It can record information about what is
 wrong with this update in the transaction and succeed so the caller
 knows to keep going and collect other updates before the _commit()
 that will fail.

 Of course it's easily possible I'm missing something.  That's just how
 I see it now.

 Does that make sense?

Makes perfect sense.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 31/41] refs.c: make prune_ref use a transaction to delete the ref

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 Change prune_ref to delete the ref using a ref transaction. To do this we also
 need to add a new flag REF_ISPRUNING that will tell the transaction that we
 do not want to delete this ref from the packed refs.

s/from the packed refs/from packed-refs, nor delete the ref's reflog/

[...]
 --- a/refs.h
 +++ b/refs.h
 @@ -235,6 +235,11 @@ struct ref_transaction *ref_transaction_begin(struct 
 strbuf *err);
   * The following functions add a reference check or update to a
   * ref_transaction.  In all of them, refname is the name of the
   * reference to be affected.  The functions make internal copies of
   * refname, so the caller retains ownership of the parameter.  flags
   * can be REF_NODEREF; it is passed to update_ref_lock().
   */
  
  /*
 + * ref_transaction_update ref_transaction_create and ref_transaction_delete
 + * all take a flag argument. Currently the only public flag is REF_NODEREF.
 + * Flag values = 0x100 are reserved for internal use.
 + */
 +/*
   * Add a reference update to transaction.  new_sha1 is the value that

The comment right before here already tries to explain the flag argument,
though it isn't in an obvious place so it's easy to miss.  Maybe the flag
argument should be explained in the overview documentation for the
ref_transaction API near the top of the file (but I haven't thought that
through, so leaving it alone).

How about this as a way to make the reserved flag values easier to
find when adding new flags?

diff --git i/refs.h w/refs.h
index 25ac4a9..dee7c8f 100644
--- i/refs.h
+++ w/refs.h
@@ -171,8 +171,17 @@ extern int ref_exists(const char *);
  */
 extern int peel_ref(const char *refname, unsigned char *sha1);
 
-/** Locks any ref (for 'HEAD' type refs). */
+/*
+ * Flags controlling lock_any_ref_for_update(), ref_transaction_update(),
+ * ref_transaction_create(), etc.
+ * REF_NODEREF: act on the ref directly, instead of dereferencing
+ *  symbolic references.
+ *
+ * Flags = 0x100 are reserved for internal use.
+ */
 #define REF_NODEREF0x01
+
+/** Locks any ref (for 'HEAD' type refs). */
 extern struct ref_lock *lock_any_ref_for_update(const char *refname,
const unsigned char *old_sha1,
int flags, int *type_p);
@@ -265,11 +274,6 @@ struct ref_transaction *ref_transaction_begin(struct 
strbuf *err);
  */
 
 /*
- * ref_transaction_update ref_transaction_create and ref_transaction_delete
- * all take a flag argument. Currently the only public flag is REF_NODEREF.
- * Flag values = 0x100 are reserved for internal use.
- */
-/*
  * Add a reference update to transaction.  new_sha1 is the value that
  * the reference should have after the update, or zeros if it should
  * be deleted.  If have_old is true, then old_sha1 holds the value
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 31/41] refs.c: make prune_ref use a transaction to delete the ref

2014-05-28 Thread Ronnie Sahlberg
On Wed, May 28, 2014 at 2:51 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 Change prune_ref to delete the ref using a ref transaction. To do this we 
 also
 need to add a new flag REF_ISPRUNING that will tell the transaction that we
 do not want to delete this ref from the packed refs.

 s/from the packed refs/from packed-refs, nor delete the ref's reflog/

 [...]
 --- a/refs.h
 +++ b/refs.h
 @@ -235,6 +235,11 @@ struct ref_transaction *ref_transaction_begin(struct 
 strbuf *err);
   * The following functions add a reference check or update to a
   * ref_transaction.  In all of them, refname is the name of the
   * reference to be affected.  The functions make internal copies of
   * refname, so the caller retains ownership of the parameter.  flags
   * can be REF_NODEREF; it is passed to update_ref_lock().
   */

  /*
 + * ref_transaction_update ref_transaction_create and ref_transaction_delete
 + * all take a flag argument. Currently the only public flag is REF_NODEREF.
 + * Flag values = 0x100 are reserved for internal use.
 + */
 +/*
   * Add a reference update to transaction.  new_sha1 is the value that

 The comment right before here already tries to explain the flag argument,
 though it isn't in an obvious place so it's easy to miss.  Maybe the flag
 argument should be explained in the overview documentation for the
 ref_transaction API near the top of the file (but I haven't thought that
 through, so leaving it alone).

 How about this as a way to make the reserved flag values easier to
 find when adding new flags?

 diff --git i/refs.h w/refs.h
 index 25ac4a9..dee7c8f 100644
 --- i/refs.h
 +++ w/refs.h
 @@ -171,8 +171,17 @@ extern int ref_exists(const char *);
   */
  extern int peel_ref(const char *refname, unsigned char *sha1);

 -/** Locks any ref (for 'HEAD' type refs). */
 +/*
 + * Flags controlling lock_any_ref_for_update(), ref_transaction_update(),
 + * ref_transaction_create(), etc.
 + * REF_NODEREF: act on the ref directly, instead of dereferencing
 + *  symbolic references.
 + *
 + * Flags = 0x100 are reserved for internal use.
 + */
  #define REF_NODEREF0x01
 +
 +/** Locks any ref (for 'HEAD' type refs). */
  extern struct ref_lock *lock_any_ref_for_update(const char *refname,
 const unsigned char *old_sha1,
 int flags, int *type_p);
 @@ -265,11 +274,6 @@ struct ref_transaction *ref_transaction_begin(struct 
 strbuf *err);
   */

  /*
 - * ref_transaction_update ref_transaction_create and ref_transaction_delete
 - * all take a flag argument. Currently the only public flag is REF_NODEREF.
 - * Flag values = 0x100 are reserved for internal use.
 - */
 -/*
   * Add a reference update to transaction.  new_sha1 is the value that
   * the reference should have after the update, or zeros if it should
   * be deleted.  If have_old is true, then old_sha1 holds the value

Thanks. Changed.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: optionally reject case-clone branch names

2014-05-28 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 On Wed, 2014-05-28 at 10:14 -0700, Junio C Hamano wrote:
 David Turner dtur...@twopensource.com writes:
 
  RFC follows:
 
  1. On a case-insensitive server, git receive-pack ought to always reject
  branches which are same-but-for-case of existing branches.
  2. On a case-sensitive server, the same rule by default, with an option
  to allow the old behavior.
 
  Let me know if, should I write these patches, they would be likely to be
  accepted.
 
 There is another a lot simpler possiblity, no?
 
 3. On any server whose administrator chooses to enforce one case
variant only rule, install a pre-receive hook that enforces the
rule.

 The reason I discovered this issue is that a user came to me to complain
 that suddenly their pulls were failing.  Then I had to track down what
 the actual problem was (a colleague actually pointed it out to me).

 We could add some hooks, but we have a lot of repos, some of which
 already have unique hooks that we would have to edit.  And this approach
 wouldn't help the next person who gets into this situation, who would
 have to again figure out what went wrong, and add the appropriate hook.

 Basically, I'm trying to take a poka-yoke approach. Does this seem
 reasonable?

Sort of.  I think #1 is uncontroversial; there is nothing else that
can be done that is more sensible.  As to #2, as your Subject line
says, I think it should be optionally reject, that is, the old
behaviour by default, with an option to allow the same ruleas #1.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 25/41] fast-import.c: use a ref transaction when dumping tags

2014-05-28 Thread Ronnie Sahlberg
On Wed, May 28, 2014 at 12:47 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:

 --- a/fast-import.c
 +++ b/fast-import.c
 @@ -1735,15 +1735,22 @@ static void dump_tags(void)
  {
   static const char *msg = fast-import;
   struct tag *t;
 - struct ref_lock *lock;
   char ref_name[PATH_MAX];
 + struct strbuf err = STRBUF_INIT;
 + struct ref_transaction *transaction;

 + transaction = ref_transaction_begin(err);
   for (t = first_tag; t; t = t-next_tag) {
 - sprintf(ref_name, tags/%s, t-name);
 + snprintf(ref_name, PATH_MAX, refs/tags/%s, t-name);

 That ignores the error if the refname happens to go longer than
 PATH_MAX.

 This is the part of fast-import that doesn't need to be super
 fast ;-).  (The objects have been written to the pack, and now
 we just need to write some refs.)  Could this use a strbuf?  By that,
 I mean something like

 diff --git i/fast-import.c w/fast-import.c
 index 3db5b3d..d5f6e63 100644
 --- i/fast-import.c
 +++ w/fast-import.c
 @@ -1735,21 +1735,28 @@ static void dump_tags(void)
  {
 static const char *msg = fast-import;
 struct tag *t;
 -   char ref_name[PATH_MAX];
 +   struct strbuf ref_name = STRBUF_INIT;
 struct strbuf err = STRBUF_INIT;
 struct ref_transaction *transaction;

 +   strbuf_addstr(ref_name, refs/tags/);
 +
 transaction = ref_transaction_begin(err);
 for (t = first_tag; t; t = t-next_tag) {
 -   snprintf(ref_name, PATH_MAX, refs/tags/%s, t-name);
 +   strbuf_setlen(ref_name, strlen(refs/tags/));
 +   strbuf_addstr(ref_name, t-name);

 -   if (ref_transaction_update(transaction, ref_name, t-sha1,
 -  NULL, 0, 0, err))
 -   break;
 +   if (ref_transaction_update(transaction, ref_name.buf, t-sha1,
 +  NULL, 0, 0, err)) {
 +   failure |= error(%s, err.buf);
 +   goto done;
 +   }
 }
 if (ref_transaction_commit(transaction, msg, err))
 failure |= error(%s, err.buf);
 +done:
 ref_transaction_free(transaction);
 +   strbuf_release(ref_name);
 strbuf_release(err);
  }

Changed to strbuf.  Thanks.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes

2014-05-28 Thread Pasha Bolokhov
On Wed, May 28, 2014 at 5:36 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Tue, May 27, 2014 at 10:56 AM, Pasha Bolokhov
 pasha.bolok...@gmail.com wrote:
 @@ -1588,6 +1588,38 @@ void setup_standard_excludes(struct dir_struct *dir)
  {
 const char *path;
 char *xdg_path;
 +   const char *r_git, *gitdir = get_git_dir();
 +   char *n_git, *basename;
 +   int len, i;
 +
 +   /*
 +* Add git directory to the ignores; do this only if
 +* GIT_DIR does not end with /.git
 +*/
 +   r_git = real_path(absolute_path(gitdir));
 +   n_git = xmalloc(strlen(r_git) + 1 + 1);
 +   normalize_path_copy(n_git, r_git);
 +
 +   len = strlen(n_git); /* real_path() has stripped trailing slash */
 +   for (i = len - 1; i  0  !is_dir_sep(n_git[i]); i--) ;
 +   basename = n_git + i;
 +   if (is_dir_sep(*basename))
 +   basename++;
 +   if (strcmp(basename, .git)) {

 I think normalize_path_copy makes sure that dir sep is '/', so this
 code may be simplified to if (strcmp(n_git, .git)  (len == 4 ||
 strcmp(n_git + len - 5, /.git)))?

Then if n_git is /ab  =  coredump. But I agree there is logic to
this (if we check len = 4 first). However, we still need the
basename. So I've just shortened it a bit, what do you think: (notice
the condition i = 0 btw)

for (i = len - 1; i = 0  n_git[i] != '/'; i--) ;
basename = n_git + i + 1;
if (strcmp(basename, .git)) {


 +   const char *worktree = get_git_work_tree();
 +
 +   if (worktree == NULL ||
 +   dir_inside_of(n_git, get_git_work_tree()) = 0) {
 +   struct exclude_list *el = add_exclude_list(dir, 
 EXC_CMDL,
 +   GIT_DIR setup);
 +
 +   /* append a trailing slash to exclude directories */
 +   n_git[len] = '/';
 +   n_git[len + 1] = '\0';
 +   add_exclude(basename, , 0, el, 0);
 +   }
 +   }
 +   free(n_git);

 All this add-only code makes me think it may be nice to make it a
 separate function. A good function name could replace the comment near
 the beginning of the block.

Reasonable
I'll send the all-updated patch including doc when ready

 --
 Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 25/41] fast-import.c: use a ref transaction when dumping tags

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:
 On Wed, May 28, 2014 at 12:47 PM, Jonathan Nieder jrnie...@gmail.com wrote:

 --- i/fast-import.c
 +++ w/fast-import.c
 @@ -1735,21 +1735,28 @@ static void dump_tags(void)
  {
 static const char *msg = fast-import;
 struct tag *t;
 -   char ref_name[PATH_MAX];
 +   struct strbuf ref_name = STRBUF_INIT;
 struct strbuf err = STRBUF_INIT;
 struct ref_transaction *transaction;

 +   strbuf_addstr(ref_name, refs/tags/);
 +
 transaction = ref_transaction_begin(err);
 for (t = first_tag; t; t = t-next_tag) {
 -   snprintf(ref_name, PATH_MAX, refs/tags/%s, t-name);
 +   strbuf_setlen(ref_name, strlen(refs/tags/));
 +   strbuf_addstr(ref_name, t-name);

 -   if (ref_transaction_update(transaction, ref_name, t-sha1,
 -  NULL, 0, 0, err))
 -   break;
 +   if (ref_transaction_update(transaction, ref_name.buf, 
 t-sha1,
 +  NULL, 0, 0, err)) {
 +   failure |= error(%s, err.buf);
 +   goto done;
 +   }
 }
 if (ref_transaction_commit(transaction, msg, err))
 failure |= error(%s, err.buf);
 +done:
 ref_transaction_free(transaction);
 +   strbuf_release(ref_name);
 strbuf_release(err);
  }

 Changed to strbuf.  Thanks.

Thanks.  The semantics when ref_transaction_update() fail seem weird ---
see above.  (refs.h tells me A failure to update means that the
transaction as a whole has failed and will need to be rolled back, so I
assume that the function should be rolling back instead of calling
_commit at that point.)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 41/44] refs.c: add a new flag for transaction delete for refs we know are packed only

2014-05-28 Thread Michael Haggerty
On 05/28/2014 06:58 PM, Junio C Hamano wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 I think for any two backends that are stacked, you would need to break
 down a transaction as follows (here generalized to include not only
 deletions):

 packed-begin_transaction()
 loose-begin_transaction()

 # And this part is done within stacked-commit_transaction():
 for entry in many_ref_updates():
 if have_old:
 stacked-verify_reference(ref, old_sha1)

 if entry is a delete:
 packed-delete_reference(entry)
 loose-update_reference(entry)

 if (!packed-commit_transaction())
 loose-commit_transaction()
 
 Ugggly.

The ugliness would be encapsulated in the stacked reference backend.  It
wouldn't have to appear in client code.

 In general you would need to worry about the case where the
 first commit succeeds and then the second commit fails, wouldn't
 you?
 
 The above happens not to break horribly wrong for the Loose on top
 of Packed layering, in that the refs you wanted to delete are only
 gone from Packed but still are in Loose, and the end result would be
 some of them are really gone (because they weren't in Loose) and
 some others remain (because they were in Loose), and at least you
 did not lose any ref you did not want to discard.  But it still is
 not what should happen in a proper transaction.

True, but we don't have proper transactions anyway.  If anything goes
wrong when renaming one of the loose reference lockfiles, we have no way
of reliably rolling back the loose references that have already been
changed.  The word transaction as we use it really only suggests that
we are doing our best to change the references all-or-nothing even in
the face of concurrent modifications by other processes.  It certainly
doesn't protect against power failures and stuff like that.

I suppose if you wanted to make reference deletions a little bit more
transaction-like, you could split them into two steps to loosen any
packed references and a deletion step that deletes the loose references.
 The advantage of this scheme is that each step requires a transaction
on only a single back end at a time, though it requires part or all of
the other back end to be locked simultaneously.

# The first two iterations loosen any packed references
# corresponding to references that are to be deleted.  It is purely
# an internal reorganization that doesn't change the externally-
# visible values of any references and can be done within a
# separate transaction:
for each reference to delete:
if reference exists packed but not loose:
create a loose ref with the value from the packed ref
for each reference to delete:
if reference exists packed:
delete packed version of reference

# Now we only have to delete the loose version of the references.
# This should be done after activating the packed reference file
# but while continuing to hold the packed-refs lock:
for each reference to delete:
delete loose version of reference

I'd have to think more about whether this all really works generically
without requiring lots of extra round-trips to a possibly-remote
backend.  But I'm not sure we really need this composability in its full
generality.  For example, maybe it is impossible to stack a low-latency
and a high-latency backend together while minimizing round-trips to the
latter.

 But the above would not quite work, as somebody needs to remove logs
 for refs that were only in the Packed backend, and repack without
 these refs API supported by the Packed backend cannot be that
 somebody---repack packed-refs without A B C cannot unconditionally
 remove logs for A B C without checking if A B C exists as Loose.

 Correct.  That's another reason that logging has to be the
 responsibility of something at the stacked level of abstraction or higher.

 I think the logging should be done by yet another outer layer of
 wrapper that only does the logging, while also passing all updates
 down 1:1 to the backend that it wraps (which in our case would be
 a stacked backend). ... Then the loose and packed backends could
 remain completely ignorant of the fact that reference updates can
 be logged.
 
 That would mean that Loose (or Packed) class cannot be used as-is
 and always needs to be wrapped with the layer that does the logging,
 no?

Correct.

 In a system with only packed-refs, no loose, you would want
 Packed.deleteRefs() to remove the named refs ref and their reflogs,
 but that would mean that the Layered wrapper that uses Loose
 overlayed on Packed cannot call that method, because it does not
 want reflogs of the refs in Packed covered by the ones in Loose.

The two scenarios would be roughly

refhandler = new LoggingWrapper(new PackedRefs())

and

refhandler = new LoggingWrapper(
new StackedRefs(new LooseRefs(), new PackedRefs())
)

Michael

-- 

[ANNOUNCE] Git v2.0.0

2014-05-28 Thread Junio C Hamano
The latest feature release Git v2.0.0 is now available at the
usual places.

We had to delay the final release by a week or so because we found a
few problems in earlier release candidates (request-pull had a
regression that stopped it from showing the tags/ prefix in
Please pull tags/frotz when the user asked to compose a request
for 'frotz' to be pulled; a code path in git-gui to support ancient
versions of Git incorrectly triggered for Git 2.0), which we had to
fix in an extra unplanned release candidate.

Hopefully the next cycle will become shorter, as topics that have
been cooking on the 'next' branch had extra time to mature, so it
all evens out in the end ;-).

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the 'v2.0.0'
tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://code.google.com/p/git-core/
  url = git://git.sourceforge.jp/gitroot/git-core/git.git
  url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
  url = https://github.com/gitster/git

Git v2.0 Release Notes
==

Backward compatibility notes


When git push [$there] does not say what to push, we have used the
traditional matching semantics so far (all your branches were sent
to the remote as long as there already are branches of the same name
over there).  In Git 2.0, the default is now the simple semantics,
which pushes:

 - only the current branch to the branch with the same name, and only
   when the current branch is set to integrate with that remote
   branch, if you are pushing to the same remote as you fetch from; or

 - only the current branch to the branch with the same name, if you
   are pushing to a remote that is not where you usually fetch from.

You can use the configuration variable push.default to change
this.  If you are an old-timer who wants to keep using the
matching semantics, you can set the variable to matching, for
example.  Read the documentation for other possibilities.

When git add -u and git add -A are run inside a subdirectory
without specifying which paths to add on the command line, they
operate on the entire tree for consistency with git commit -a and
other commands (these commands used to operate only on the current
subdirectory).  Say git add -u . or git add -A . if you want to
limit the operation to the current directory.

git add path is the same as git add -A path now, so that
git add dir/ will notice paths you removed from the directory and
record the removal.  In older versions of Git, git add path used
to ignore removals.  You can say git add --ignore-removal path to
add only added or modified paths in path, if you really want to.

The -q option to git diff-files, which does *NOT* mean quiet,
has been removed (it told Git to ignore deletion, which you can do
with git diff-files --diff-filter=d).

git request-pull lost a few heuristics that often led to mistakes.

The default prefix for git svn has changed in Git 2.0.  For a long
time, git svn created its remote-tracking branches directly under
refs/remotes, but it now places them under refs/remotes/origin/ unless
it is told otherwise with its --prefix option.


Updates since v1.9 series
-

UI, Workflows  Features

 * The multi-mail post-receive hook (in contrib/) has been updated
   to a more recent version from upstream.

 * The remote-hg/bzr remote-helper interfaces (used to be in
   contrib/) are no more.  They are now maintained separately as
   third-party plug-ins in their own repositories.

 * git gc --aggressive learned --depth option and
   gc.aggressiveDepth configuration variable to allow use of a less
   insane depth than the built-in default value of 250.

 * git log learned the --show-linear-break option to show where a
   single strand-of-pearls is broken in its output.

 * The rev-parse --parseopt mechanism used by scripted Porcelains to
   parse command-line options and to give help text learned to take
   the argv-help (the placeholder string for an option parameter,
   e.g. key-id in --gpg-sign=key-id).

 * The pattern to find where the function begins in C/C++ used in
   diff and grep -p has been updated to improve viewing C++
   sources.

 * git rebase learned to interpret a lone - as @{-1}, the
   branch that we were previously on.

 * git commit --cleanup=mode learned a new mode, scissors.

 * git tag --list output can be sorted using version sort with
   --sort=version:refname.

 * Discard the accumulated heuristics to guess from which branch the
   result wants to be pulled from and make sure that what the end user
   specified is not second-guessed by git request-pull, to avoid
   mistakes.  When you pushed out your 'master' branch to your public
   repository as 'for-linus', use the new master:for-linus syntax to
   denote the branch 

Re: [ANNOUNCE] Git v2.0.0

2014-05-28 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 The tarballs are found at:

 https://www.kernel.org/pub/software/scm/git/testing/

Heh, I do not know how this slipped in, but the real release tarball
is not in git/testing/ but found at:

 https://www.kernel.org/pub/software/scm/git/

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 25/41] fast-import.c: use a ref transaction when dumping tags

2014-05-28 Thread Ronnie Sahlberg
On Wed, May 28, 2014 at 3:17 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Ronnie Sahlberg wrote:
 On Wed, May 28, 2014 at 12:47 PM, Jonathan Nieder jrnie...@gmail.com wrote:

 --- i/fast-import.c
 +++ w/fast-import.c
 @@ -1735,21 +1735,28 @@ static void dump_tags(void)
  {
 static const char *msg = fast-import;
 struct tag *t;
 -   char ref_name[PATH_MAX];
 +   struct strbuf ref_name = STRBUF_INIT;
 struct strbuf err = STRBUF_INIT;
 struct ref_transaction *transaction;

 +   strbuf_addstr(ref_name, refs/tags/);
 +
 transaction = ref_transaction_begin(err);
 for (t = first_tag; t; t = t-next_tag) {
 -   snprintf(ref_name, PATH_MAX, refs/tags/%s, t-name);
 +   strbuf_setlen(ref_name, strlen(refs/tags/));
 +   strbuf_addstr(ref_name, t-name);

 -   if (ref_transaction_update(transaction, ref_name, t-sha1,
 -  NULL, 0, 0, err))
 -   break;
 +   if (ref_transaction_update(transaction, ref_name.buf, 
 t-sha1,
 +  NULL, 0, 0, err)) {
 +   failure |= error(%s, err.buf);
 +   goto done;
 +   }
 }
 if (ref_transaction_commit(transaction, msg, err))
 failure |= error(%s, err.buf);
 +done:
 ref_transaction_free(transaction);
 +   strbuf_release(ref_name);
 strbuf_release(err);
  }

 Changed to strbuf.  Thanks.

 Thanks.  The semantics when ref_transaction_update() fail seem weird ---
 see above.  (refs.h tells me A failure to update means that the
 transaction as a whole has failed and will need to be rolled back, so I
 assume that the function should be rolling back instead of calling
 _commit at that point.)


diff --git a/fast-import.c b/fast-import.c
index 4a7b196..e24c7e4 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1735,15 +1735,24 @@ static void dump_tags(void)
 {
static const char *msg = fast-import;
struct tag *t;
-   struct ref_lock *lock;
-   char ref_name[PATH_MAX];
+   struct strbuf ref_name = STRBUF_INIT;
+   struct strbuf err = STRBUF_INIT;
+   struct ref_transaction *transaction;

+   transaction = ref_transaction_begin(err);
for (t = first_tag; t; t = t-next_tag) {
-   sprintf(ref_name, tags/%s, t-name);
-   lock = lock_ref_sha1(ref_name, NULL);
-   if (!lock || write_ref_sha1(lock, t-sha1, msg)  0)
-   failure |= error(Unable to update %s, ref_name);
+   strbuf_reset(ref_name);
+   strbuf_addf(ref_name, refs/tags/%s, t-name);
+
+   if (ref_transaction_update(transaction, ref_name.buf, t-sha1,
+  NULL, 0, 0, err))
+   break;
}
+   if (ref_transaction_commit(transaction, msg, err))
+   failure |= error(%s, err.buf);
+   ref_transaction_free(transaction);
+   strbuf_release(ref_name);
+   strbuf_release(err);
 }


I rely on the fact that if the transaction has failed then it is safe
to call ref_transaction_commit since it is guaranteed to return an
error too.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [ANNOUNCE] Git v2.0.0

2014-05-28 Thread Felipe Contreras
This is the last mail I sent to you, because you ignore them anyway, and
remove them from the mailing list.

I'm am cc'ing the git-fc mailing list so there's a public track record
of this, so you can't claim it didn't happen.

Junio C Hamano wrote:
  * The remote-hg/bzr remote-helper interfaces (used to be in
contrib/) are no more.  They are now maintained separately as
third-party plug-ins in their own repositories.

This is a lie. You are saying they don't exist any more, but they do
exist, they graduated according to your same words.

Also, in mail [1] Jeff said:

  But that being said, this is Felipe's code. While we have a legal
  right to distribute it in v2.0, if he would really prefer it out for
  v2.0, I would respect that.

To which you agreed, however, after you fucked up v2.0 and removed an
important patch, I said I changed my wish that I didn't want to disturb
v2.0 in mail [2], a mail you conveniently removed from the tracked
record.

In mail [3] you acknowledged my wish, and you said you were going to put
stubs for v2.0, and you didn't. You also conveniently removed this mail
from the archives.

Your rhetoric to make it appear as if the code is gone, your repeated
failures to accomplish my wish, even you said you would, and your
deliberate removal of the relevant discussion gives me no choice.

I threat all this as deliberate acts of aggression, and I will respond
as I said I would, and bring as much public attention to what you are
doing as possible.

[1] 20140516084126.gb21...@sigill.intra.peff.net
[2] 537bbd6c1daf_a6f166b308b0@nysa.notmuch
[3] xmqqlhtwrufq@gitster.dls.corp.google.com

For reference, here's the full content of mail [2]:

From: Felipe Contreras felipe.contre...@gmail.com
Subject: Re: [PATCH] remote-helpers: point at their upstream repositories
To: Junio C Hamano gits...@pobox.com,  Felipe Contreras 
felipe.contre...@gmail.com
Cc: Jeff King p...@peff.net,  git@vger.kernel.org
Date: Tue, 20 May 2014 15:39:08 -0500
--- text/plain ---
Junio C Hamano wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:
 
  Junio C Hamano wrote:
 
2. add warning that is given every time the scripts are run and
   give the same instruction as in README.
  
3. (optional) cripple the script to make them always fail after
   showing the same warning as above.
 
  This is what I want, and I already sent the patches for; the scripts
  will be stubs. At this point you would have effectively removed the
  code, which what I want.
   
4. Keep README and retire everything else.
 
  After you've removed the code, I don't care what you do, but I'd say you
  should remove the stubs after a long period of time.
 
 Let's try this in a different way, as I sense there is a
 misunderstanding somewhere about your wish.
 
  that does not refer to remove them at v2.0 (unconditional).  It
  refers to If Felipe really wants for the removal for v2.0, I would
  respect that.  And I saw you said you did not want to disrupt v2.0.
  
  If the options I listed all meant removal at v2.0, then I would
  understand your complaints, but that is not the case, so I am not
  sure what to make of that.
 
  It is a weird choice of semantics then. You said you would respect my
  wish, but your proposals did not follow my wish.
 
 I understand you do not want to disrupt v2.0.  My assumption of that
 not disrupting v2.0 has been there still are git-remote-{hg,bzr}
 that work just like what they had in v1.9.x, perhaps with some
 enhancements and regressions you added in the meantime, and I
 understood Peff's comment If Felipe wants the removal to mean that
 kind of disruption, i.e. there is no git-remote-{hg,bzr} that
 work., which would be either step 3 or 4.
 
 But your After you've removed the code comment above makes me
 wonder that perhaps your definition of not disrupting was
 different from ours (which is not good or bad, just different) and
 you consider that step 3. is removal but not distupting v2.0?
 
 If that is what you want in v2.0, then please say so, and I already
 said I am fine with that.

No, I already said I do not want the code removed from v2.0, that's why
I sent patches that simply added a warning, and I specifically said
those were for 2.0.

However, after seeing this commit:

10e1fee (Revert Merge branch 'fc/transport-helper-sync-error-fix')

Which is:

 1) Inaccurate
 2) A lie (*you* broke 2.0, not me)
 3) A disservice to users

I therefore change my wish for you to remove all the remote helpers code
and a replace them with stubs (the patches I originally sent for
post-2.0).

It was a mistake from me to believe you would do the sensible thing for
2.0.

So to make it clear, I now request that you do:

 1) Remove all the code.

Since my patches were removed from the list, here's an updated patch
that applies on top of 'master'

https://github.com/felipec/git/commits/up/remote/remove

 2) Reapply d508e4a (Merge branch 'fc/transport-helper-sync-error-fix')


Re: [PATCH v11 25/41] fast-import.c: use a ref transaction when dumping tags

2014-05-28 Thread Jonathan Nieder
Ronnie Sahlberg wrote:

 I rely on the fact that if the transaction has failed then it is safe
 to call ref_transaction_commit since it is guaranteed to return an
 error too.

Yes, I am saying that behavior for ref_transaction_commit is weird.

Usually when ref_transaction_commit is called I can do

struct strbuf err = STRBUF_INIT;
if (ref_transaction_commit(..., err))
die(%s, err.buf);

and I know that since ref_transaction_commit has returned a nonzero
result, err.buf is populated with a sensible message that will
describe what went wrong.

That's true even if there's a bug elsewhere in code I didn't write
(e.g., someone forgot to check the return value from
ref_transaction_update).

But the guarantee you are describing removes that property.  It
creates a case where ref_transaction_commit can return nonzero without
updating err.  So I get the following message:

fatal:

I don't think that's a good outcome.

Sure, if I am well acquainted with the API, I can make sure to use the
same strbuf for all transaction API calls.  But that would result in
strange behavior, too: if multiple _update calls fail, then I get
concatenated messages.

Okay, I can make sure to do at most one failing _update, before
calling _commit and printing the error.  But at that point, what is
the advantage over normal exception handling, where the error gets
handled at the _update call site?

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Improve function dir.c:trim_trailing_spaces()

2014-05-28 Thread Pasha Bolokhov
Move backwards from the end of the string (more efficient for
lines which do not have trailing spaces or have just a couple).
Slightly more rare occurrences of 'text  \' with a backslash
in between spaces are handled correctly.
Namely, the code in 8ba87adad6 does not reset 'last_space' when
a backslash is encountered and the above line stays intact as
a result
---
How about trailing tabs?

 dir.c | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/dir.c b/dir.c
index eb6f581..3315eea 100644
--- a/dir.c
+++ b/dir.c
@@ -508,21 +508,18 @@ void clear_exclude_list(struct exclude_list *el)
 
 static void trim_trailing_spaces(char *buf)
 {
-   int i, last_space = -1, nr_spaces, len = strlen(buf);
-   for (i = 0; i  len; i++)
-   if (buf[i] == '\\')
-   i++;
-   else if (buf[i] == ' ') {
-   if (last_space == -1) {
-   last_space = i;
-   nr_spaces = 1;
-   } else
-   nr_spaces++;
-   } else
-   last_space = -1;
-
-   if (last_space != -1  last_space + nr_spaces == len)
+   int i, last_space, bslash = 0, len = strlen(buf);
+
+   if (len == 0 || buf[len - 1] != ' ')
+   return;
+   for (i = len - 2; i = 0  buf[i] == ' '; i--) ;
+   last_space = i + 1;
+   for ( ; i =0  buf[i] == '\\'; i--) bslash ^= 1;
+
+   if (!bslash)
buf[last_space] = '\0';
+   else if (bslash  last_space  len - 1)
+   buf[last_space + 1] = '\0';
 }
 
 int add_excludes_from_file_to_list(const char *fname,
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] check_refname_component: Optimize

2014-05-28 Thread David Turner
On Wed, 2014-05-28 at 23:44 +0200, Michael Haggerty wrote:
 On 05/28/2014 11:04 PM, David Turner wrote:
  In a repository with tens of thousands of refs, the command
  ~/git/git-diff-index --cached --quiet --ignore-submodules [revision]
  is a bit slow.  check_refname_component is a major contributor to the
  runtime.  Provide two optimized versions of this function: one for
  machines with SSE4.2, and one for machines without.  The speedup for
  this command on one particular repo (with about 60k refs) is about 12%
  for the SSE version and 8% for the non-SSE version.
 
 Interesting.  Thanks for the patch.

Thanks for your thoughtful comments.

 Why do you use git diff-index to benchmark your change?  Is there
 something particular about that command that leads to especially bad
 performance with lots of references?

Because a colleague specifically asked me to look at it because he uses
it as part of the zsh/bash git prompt dirty bit display. But that is not
actually a good reason to use it in the commit-message.  This is also
the reason why milliseconds matter, although at present other parts of
git are slow enough that the prompt stuff is still somewhat infeasible
for large repos.

 I assume that most of the time spent in check_refname_component() is
 while reading the packed-refs file, right?  

Yes.

 If so, there are probably
 other git commands that would better measure that time, with less other
 work.  For example, git rev-parse refname will read the packed-refs
 file, too (assuming that refname is not a loose reference).  If you
 test the speedup of a cheaper command like that, it seems to me that you
 will get a better idea of how much you are speeding up the ref-reading
 part.  It would also be interesting to see the difference in
 milliseconds (rather than a percentage) for some specified number of
 references.

I'll change it to rev-parse and to show the difference in milliseconds.

The times on rev-parse are 35/29/25ms on one particular computer, for
original, LUT, SSE.  So, somewhat larger speedup in percentage terms.  I
should also note that this represents a real use-case, and I expect that
the number of refs will be somewhat larger in the future.

 I think it's worth using your LUT-based approach to save 8%.  I'm less
 sure it's worth the complication and unreadability of using SSE to get
 the next 4% speedup.  But the gains might be proven to be more
 significant if you benchmark them differently.

I was actually hoping at some point to use SSE to speed up a small few
of the other slow bits e.g. get_sha1_hex.  I have not yet tested if this
will actually be faster, but I bet it will.  And my watchman branch uses
it to speed up the hashmap (but it seems CityHash works about as well so
I might just port that instead).

But actually speaking of SSE: is there a minimum compiler version for
git?  Because I have just discovered gcc-4.2 on the Mac has a bug which
causes this code to misbehave.  Yet again, compiler intrinsics prove
less portable than straight assembly language.  I would be just as happy
to write it in assembly, but I suspect that nobody would enjoy that.  I
could also work around the bug with a compiler-specific ifdef, but Apple
has been shipping clang for some years now, so I think it's OK to leave
the code as-is.

 I won't critique the code in detail until we have thrashed out the
 metaissues, but I made a few comments below about nits that I noticed
 when I scanned through.

I'll go ahead and roll a new version with the nits picked.

 Please add a comment here about what the values in refname_disposition
 signify.  And maybe add some line comments explaining unusual values,
 for people who haven't memorized the ASCII encoding; e.g., on the third
 line,

I think I'm going to say, see below for the list of illegal characters,
from which this table is derived., if that's OK with you. 

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [ANNOUNCE] Git v2.0.0

2014-05-28 Thread Felipe Contreras
Felipe Contreras wrote:
 In mail [3] you acknowledged my wish, and you said you were going to put
 stubs for v2.0, and you didn't. You also conveniently removed this mail
 from the archives.

My bad. You actually did it. I missed it because the commit is named
'Revert Merge branch 'jc/graduate-remote-hg-bzr' (early part)', which
is not at all what it is doing.

So it's just the release notes that are misleading.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] refs.c: optimize check_refname_component()

2014-05-28 Thread David Turner
In a repository with many refs, check_refname_component can be a major
contributor to the runtime of some git commands.  Provide two
optimized versions of this function: one for machines with SSE4.2, and
one for machines without.  One such command is

git rev-parse HEAD

Timings for one particular repo, with about 60k refs, almost all
packed, are:

Old: 35ms
New (no SSE): 29 ms
New (SSE): 25 ms

Many other commands which read refs are also sped up.

Add some tests around 16-byte boundaries to ensure that the SSE
code correctly handles edge-cases.

Signed-off-by: David Turner dtur...@twitter.com
---
 Makefile   |   6 ++
 configure.ac   |   6 ++
 git-compat-util.h  |  20 +++
 refs.c | 164 +
 t/t5511-refspec.sh |  13 +
 5 files changed, 185 insertions(+), 24 deletions(-)

diff --git a/Makefile b/Makefile
index a53f3a8..123e2fc 100644
--- a/Makefile
+++ b/Makefile
@@ -1326,6 +1326,11 @@ else
COMPAT_OBJS += compat/win32mmap.o
endif
 endif
+ifdef NO_SSE
+   BASIC_CFLAGS += -DNO_SSE
+else
+   BASIC_CFLAGS += -msse4
+endif
 ifdef OBJECT_CREATION_USES_RENAMES
COMPAT_CFLAGS += -DOBJECT_CREATION_MODE=1
 endif
@@ -2199,6 +2204,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' $@
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' $@
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' $@
+   @echo NO_SSE=\''$(subst ','\'',$(subst ','\'',$(NO_SSE)))'\' $@
 ifdef TEST_OUTPUT_DIRECTORY
@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst 
','\'',$(TEST_OUTPUT_DIRECTORY)))'\' $@
 endif
diff --git a/configure.ac b/configure.ac
index b711254..71a9429 100644
--- a/configure.ac
+++ b/configure.ac
@@ -382,6 +382,11 @@ AS_HELP_STRING([],[Tcl/Tk interpreter will be found in a 
system.]),
 GIT_PARSE_WITH(tcltk))
 #
 
+# Declare the with-sse/without-sse options.
+AC_ARG_WITH(sse,
+AS_HELP_STRING([--with-sse],[use SSE instructions (default is YES)]),
+GIT_PARSE_WITH(sse))
+
 
 ## Checks for programs.
 AC_MSG_NOTICE([CHECKS for programs])
@@ -449,6 +454,7 @@ else
   fi
 fi
 GIT_CONF_SUBST([TCLTK_PATH])
+GIT_CONF_SUBST([NO_SSE])
 AC_CHECK_PROGS(ASCIIDOC, [asciidoc])
 if test -n $ASCIIDOC; then
AC_MSG_CHECKING([for asciidoc version])
diff --git a/git-compat-util.h b/git-compat-util.h
index f6d3a46..cb49850 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -668,6 +668,26 @@ void git_qsort(void *base, size_t nmemb, size_t size,
 #endif
 #endif
 
+#ifndef NO_SSE
+#include nmmintrin.h
+/* Clang ships with a version of nmmintrin.h that's incomplete; if
+ * necessary, we define the constants that we're going to use. */
+#ifndef _SIDD_UBYTE_OPS
+#define _SIDD_UBYTE_OPS 0x00
+#define _SIDD_CMP_EQUAL_ANY 0x00
+#define _SIDD_CMP_RANGES0x04
+#define _SIDD_CMP_EQUAL_ORDERED 0x0c
+#define _SIDD_NEGATIVE_POLARITY 0x10
+#endif
+
+/* This is the system memory page size; it's used so that we can read
+ * outside the bounds of an allocation without segfaulting. It is
+ * assumed to be a power of 2. */
+#ifndef PAGE_SIZE
+#define PAGE_SIZE 4096
+#endif
+#endif
+
 #ifdef UNRELIABLE_FSTAT
 #define fstat_is_reliable() 0
 #else
diff --git a/refs.c b/refs.c
index 28d5eca..61a483a 100644
--- a/refs.c
+++ b/refs.c
@@ -5,9 +5,32 @@
 #include dir.h
 #include string-list.h
 
+/* How to handle various characters in refnames:
+   0: An acceptable character for refs
+   1: End-of-component
+   2: ., look for a following . to reject .. in refs
+   3: @, look for a following { to reject @{ in refs
+   9: A bad character, reject ref
+
+   See below for the list of illegal characters, from which
+   this table is derived.
+ */
+static unsigned char refname_disposition[] = {
+   1, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+   9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 2, 1,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 9,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 0, 9, 0,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 9, 9
+};
+
 /*
- * Make sure ref is something reasonable to have under .git/refs/;
- * We do not like it if:
+ * Try to read one refname component from the front of refname.
+ * Return the length of the component found, or -1 if the component is
+ * not legal.  It is legal if it is something reasonable to have under
+ * .git/refs/; We do not like it if:
  *
  * - any path component of it begins with ., or
  * - it has double dots .., or
@@ -15,44 +38,41 @@
  * - it ends with a /.
  * - it ends with .lock
  * - it contains a \ (backslash)
- */
 
-/* Return true iff ch is not allowed in reference names. */
-static inline int bad_ref_char(int ch)
-{
-   

Kedves felhasználók e-mailben;

2014-05-28 Thread webmail administrator®2014

--

Kedves felhasználók e-mailben;

Túllépte 23432 box set
Web Service / Admin, és akkor nem lesz probléma a küldő és
fogadhat e-maileket, amíg újra ellenőrizni. Kérjük, frissítse kattintva
linkre, és töltse ki az adatokat, hogy ellenőrizze a számla
Kérjük, kövesse az alábbi linkre, és majd másolja és illessze be a 
böngésző

jelölőnégyzetet.

http://mailupdateww.jigsy.com/
Figyelem!
Ha nem, csak korlátozott hozzáférést az e-mail postafiókját. ha
frissíteni? számla frissül három napon belül
Értesítés a számla véglegesen be kell zárni.
Tisztelettel,
rendszergazda
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


.gitmodules containing SSH clone URLs should fall back to HTTPS when SSH key is not valid/existent

2014-05-28 Thread Jonathan Leonard
The title pretty much says it all.  Lack of this feature (or presence
of this bug [depending on your perspective]) is a major PITA.

--Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] Add an explicit GIT_DIR to the list of excludes

2014-05-28 Thread Pasha Bolokhov
Agree, but partial here means... what? just a little doc?


On Wed, May 28, 2014 at 11:53 AM, Jakub Narębski jna...@gmail.com wrote:
 W dniu 2014-05-27 19:16, Pasha Bolokhov pisze:

 Add GIT_DIR to the list of excludes in setup_standard_excludes(),
 while checking that GIT_DIR is not just '.git', in which case it
 would be ignored by default, and that GIT_DIR is inside GIT_WORK_TREE

 gives git-grep.txt and git-ls-files.txt. I don't know if we need to
 add something about this extra exclude rule to those .txt. If it's so
 obvious that this should be the expected behavior, then probably not.


  I suggest this. There appears to be a notion of standard
 excludes both in the code (dir.c) and in the man pages (git-grep,
 git-ls-files). However, it doesn't actually appear to be defined
 strictly speaking. So my suggestion is to define those standard
 excludes in one place (say gitignore(5)), and make other man pages
 (git-config, git-grip, git-ls-files) have references to that place
 instead of explaining every time in detail what is being excluded.


 Or define those standard excludes in standard-excludes.txt partial,
 and include said file from git-grep(1), git-ls-files(1), and perhaps
 gitignore(5).

 --
 Jakub Narębski


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html