Re: Unexpected behavior with :/ references

2018-07-08 Thread William Chargin
After further investigation, it appears that ":/foo" indeed resolves to
the commit with message "foobar" (in the above example) if the commits
are not all created at the same time: e.g., by adding `sleep 1` between
the commit commands, or exporting `GIT_AUTHOR_DATE`.

This leaves only the question of why :/ references don't resolve
to commits reachable only from HEAD, and whether this is the best
behavior.


Re: [PATCH 01/17] cache: update object ID functions for the_hash_algo

2018-07-08 Thread Jacob Keller
On Sun, Jul 8, 2018 at 9:05 PM Eric Sunshine  wrote:
>
> On Sun, Jul 8, 2018 at 10:38 PM Jacob Keller  wrote:
> > On Sun, Jul 8, 2018 at 4:39 PM brian m. carlson
> >  wrote:
> > >  static inline int oidcmp(const struct object_id *oid1, const struct 
> > > object_id *oid2)
> > >  {
> > > -   return hashcmp(oid1->hash, oid2->hash);
> > > +   return memcmp(oid1->hash, oid2->hash, the_hash_algo->rawsz);
> > >  }
> >
> > Just curious, what's the reasoning for not using the hashcmp anymore?
>
> hashcmp() is specific to SHA-1 (for instance, it hardocdes
> GIT_SHA1_RAWSZ). oidcmp() is meant as the hash-agnostic replacement
> for hashcmp(), so it doesn't make sense to continue implementing
> oidcmp() in terms of hashcmp() (the latter of which will eventually be
> retired, presumably).

Fair. I just saw that hashcmp was also updated to use the_hash_algo,
but if we're going to drop it eventually, then there's zero reason to
keep implementing oidcmp in terms of it, so... makes sense to me!

Thanks,
Jake


Re: [PATCH 01/17] cache: update object ID functions for the_hash_algo

2018-07-08 Thread Eric Sunshine
On Sun, Jul 8, 2018 at 10:38 PM Jacob Keller  wrote:
> On Sun, Jul 8, 2018 at 4:39 PM brian m. carlson
>  wrote:
> >  static inline int oidcmp(const struct object_id *oid1, const struct 
> > object_id *oid2)
> >  {
> > -   return hashcmp(oid1->hash, oid2->hash);
> > +   return memcmp(oid1->hash, oid2->hash, the_hash_algo->rawsz);
> >  }
>
> Just curious, what's the reasoning for not using the hashcmp anymore?

hashcmp() is specific to SHA-1 (for instance, it hardocdes
GIT_SHA1_RAWSZ). oidcmp() is meant as the hash-agnostic replacement
for hashcmp(), so it doesn't make sense to continue implementing
oidcmp() in terms of hashcmp() (the latter of which will eventually be
retired, presumably).


Unexpected behavior with :/ references

2018-07-08 Thread William Chargin
Hello,

I'm experiencing strange behavior with :/ references, which seems
to be inconsistent with the explanation in the docs on two counts.
First, sometimes the matched commit is not the youngest. Second, some
commits cannot be found at all, even if they are reachable from HEAD.

Here is a script to reproduce the behavior (Git built from current pu):

#!/bin/sh
export GIT_CONFIG_NOSYSTEM=1
export GIT_ATTR_NOSYSTEM=1
cd "$(mktemp -d --suffix .gitrepro)"

git --version
git init

git commit -q --allow-empty -m initial
git commit -q --allow-empty -m foo
git checkout -q -b early-branch
git commit -q --allow-empty -m foobar
git checkout -q --detach
git commit -q --allow-empty -m foobarbaz

echo
echo "The following should all print 'foobarbaz':"
git show --format=%s ':/foo' --
git show --format=%s ':/foobar' --
git show --format=%s ':/foobarbaz' --

echo
echo "With an explicit branch:"
git branch late-branch
git show --format=%s ':/foo' --
git show --format=%s ':/foobar' --
git show --format=%s ':/foobarbaz' --

Here is the output on my machine:

git version 2.18.0.516.g6fb7f6652
Initialized empty Git repository in /tmp/tmp.WeCD0QZPIf.gitrepro/.git/

The following should all print 'foobarbaz':
foo
foobar
fatal: bad revision ':/foobarbaz'

With an explicit branch:
foo
foobarbaz
foobarbaz

First, the commit with message "foobar" clearly matches the regular
expression /foo/ as well as /foobar/, but ":/foo" resolves to an older
commit. However, Documentation/revisions.txt indicates that a :/
reference should resolve to the _youngest_ matching commit.

Second, the commit with message "foobarbaz" is reachable from HEAD, and
yet the regular expression /foobarbaz/ fails to match it. The same
documentation indicates that :/ references find commits reachable
from any ref, and the glossary entry for "ref" states that HEAD is a
"special-purpose ref" even though it does not begin with "refs/".

It looks to me like references reachable from `master` are always picked
over other references, and that references reachable only from HEAD are
not matched at all.

Is the observed behavior intentional? If so, what am I misunderstanding?

Thanks!
WC

(for searchability: colon slash text references)


Re: [PATCH 00/17] object_id part 14

2018-07-08 Thread Jacob Keller
On Sun, Jul 8, 2018 at 4:39 PM brian m. carlson
 wrote:
>
> This is the fourteenth series of patches to switch to using struct
> object_id and the_hash_algo.  This series converts several core pieces
> to use struct object_id, including the oid* and hex functions.
>
> All of these patches have been tested with both SHA-1 and a 256-bit
> hash.
>

I read through the series, and didn't spot anything odd, except for
the question about reasoning for why we use memcmp directly over using
hashcmp. I don't think that's any sort of blocker, it just seemed an
odd decision to me.

Thanks,
Jake


Re: [PATCH 01/17] cache: update object ID functions for the_hash_algo

2018-07-08 Thread Jacob Keller
On Sun, Jul 8, 2018 at 4:39 PM brian m. carlson
 wrote:
>  static inline int oidcmp(const struct object_id *oid1, const struct 
> object_id *oid2)
>  {
> -   return hashcmp(oid1->hash, oid2->hash);
> +   return memcmp(oid1->hash, oid2->hash, the_hash_algo->rawsz);
>  }
>

Just curious, what's the reasoning for not using the hashcmp anymore?

Thanks,
Jake


[PATCH 02/17] tree-walk: replace hard-coded constants with the_hash_algo

2018-07-08 Thread brian m. carlson
Remove the hard-coded 20-based values and replace them with uses of
the_hash_algo.

Signed-off-by: brian m. carlson 
---
 tree-walk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tree-walk.c b/tree-walk.c
index 8f5090862b..c1f27086a9 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -26,8 +26,9 @@ static int decode_tree_entry(struct tree_desc *desc, const 
char *buf, unsigned l
 {
const char *path;
unsigned int mode, len;
+   const unsigned hashsz = the_hash_algo->rawsz;
 
-   if (size < 23 || buf[size - 21]) {
+   if (size < hashsz + 3 || buf[size - (hashsz + 1)]) {
strbuf_addstr(err, _("too-short tree object"));
return -1;
}


[PATCH 03/17] hex: switch to using the_hash_algo

2018-07-08 Thread brian m. carlson
Instead of using the GIT_SHA1_* constants, switch to using the_hash_algo
to convert object IDs to and from hex format.

Signed-off-by: brian m. carlson 
---
 hex.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hex.c b/hex.c
index 8df2d63728..10af1a29e8 100644
--- a/hex.c
+++ b/hex.c
@@ -50,7 +50,7 @@ int hex_to_bytes(unsigned char *binary, const char *hex, 
size_t len)
 int get_sha1_hex(const char *hex, unsigned char *sha1)
 {
int i;
-   for (i = 0; i < GIT_SHA1_RAWSZ; i++) {
+   for (i = 0; i < the_hash_algo->rawsz; i++) {
int val = hex2chr(hex);
if (val < 0)
return -1;
@@ -69,7 +69,7 @@ int parse_oid_hex(const char *hex, struct object_id *oid, 
const char **end)
 {
int ret = get_oid_hex(hex, oid);
if (!ret)
-   *end = hex + GIT_SHA1_HEXSZ;
+   *end = hex + the_hash_algo->hexsz;
return ret;
 }
 
@@ -79,7 +79,7 @@ char *sha1_to_hex_r(char *buffer, const unsigned char *sha1)
char *buf = buffer;
int i;
 
-   for (i = 0; i < GIT_SHA1_RAWSZ; i++) {
+   for (i = 0; i < the_hash_algo->rawsz; i++) {
unsigned int val = *sha1++;
*buf++ = hex[val >> 4];
*buf++ = hex[val & 0xf];


[PATCH 15/17] log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz

2018-07-08 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log-tree.c b/log-tree.c
index d3a43e29cd..9655de8ad7 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -545,7 +545,7 @@ void show_log(struct rev_info *opt)
struct strbuf msgbuf = STRBUF_INIT;
struct log_info *log = opt->loginfo;
struct commit *commit = log->commit, *parent = log->parent;
-   int abbrev_commit = opt->abbrev_commit ? opt->abbrev : GIT_SHA1_HEXSZ;
+   int abbrev_commit = opt->abbrev_commit ? opt->abbrev : 
the_hash_algo->hexsz;
const char *extra_headers = opt->extra_headers;
struct pretty_print_context ctx = {0};
 


[PATCH 09/17] builtin/update-index: convert to using the_hash_algo

2018-07-08 Thread brian m. carlson
Switch from using GIT_SHA1_HEXSZ to the_hash_algo to make the parsing of
the index information hash independent.

Signed-off-by: brian m. carlson 
---
 builtin/update-index.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index a8709a26ec..031cef5229 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -492,6 +492,7 @@ static void update_one(const char *path)
 
 static void read_index_info(int nul_term_line)
 {
+   const int hexsz = the_hash_algo->hexsz;
struct strbuf buf = STRBUF_INIT;
struct strbuf uq = STRBUF_INIT;
strbuf_getline_fn getline_fn;
@@ -529,7 +530,7 @@ static void read_index_info(int nul_term_line)
mode = ul;
 
tab = strchr(ptr, '\t');
-   if (!tab || tab - ptr < GIT_SHA1_HEXSZ + 1)
+   if (!tab || tab - ptr < hexsz + 1)
goto bad_line;
 
if (tab[-2] == ' ' && '0' <= tab[-1] && tab[-1] <= '3') {
@@ -542,8 +543,8 @@ static void read_index_info(int nul_term_line)
ptr = tab + 1; /* point at the head of path */
}
 
-   if (get_oid_hex(tab - GIT_SHA1_HEXSZ, ) ||
-   tab[-(GIT_SHA1_HEXSZ + 1)] != ' ')
+   if (get_oid_hex(tab - hexsz, ) ||
+   tab[-(hexsz + 1)] != ' ')
goto bad_line;
 
path_name = ptr;
@@ -571,7 +572,7 @@ static void read_index_info(int nul_term_line)
 * ptr[-1] points at tab,
 * ptr[-41] is at the beginning of sha1
 */
-   ptr[-(GIT_SHA1_HEXSZ + 2)] = ptr[-1] = 0;
+   ptr[-(hexsz + 2)] = ptr[-1] = 0;
if (add_cacheinfo(mode, , path_name, stage))
die("git update-index: unable to update %s",
path_name);


[PATCH 08/17] refs/files-backend: use the_hash_algo for writing refs

2018-07-08 Thread brian m. carlson
In order to ensure we write the correct amount, use the_hash_algo to
find the correct number of bytes for the current hash.

Signed-off-by: brian m. carlson 
---
 refs/files-backend.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 252f835bae..4a724f20a9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1676,7 +1676,7 @@ static int write_ref_to_lockfile(struct ref_lock *lock,
return -1;
}
fd = get_lock_file_fd(>lk);
-   if (write_in_full(fd, oid_to_hex(oid), GIT_SHA1_HEXSZ) < 0 ||
+   if (write_in_full(fd, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
write_in_full(fd, , 1) < 0 ||
close_ref_gently(lock) < 0) {
strbuf_addf(err,
@@ -3070,7 +3070,7 @@ static int files_reflog_expire(struct ref_store 
*ref_store,
rollback_lock_file(_lock);
} else if (update &&
   (write_in_full(get_lock_file_fd(>lk),
-   oid_to_hex(_kept_oid), GIT_SHA1_HEXSZ) 
< 0 ||
+   oid_to_hex(_kept_oid), 
the_hash_algo->hexsz) < 0 ||
write_str_in_full(get_lock_file_fd(>lk), 
"\n") < 0 ||
close_ref_gently(lock) < 0)) {
status |= error("couldn't write %s",


[PATCH 05/17] strbuf: allocate space with GIT_MAX_HEXSZ

2018-07-08 Thread brian m. carlson
In order to be sure we have enough space to use with any hash algorithm,
use GIT_MAX_HEXSZ to allocate space.

Signed-off-by: brian m. carlson 
---
 strbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index b0716ac585..030556111d 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -921,7 +921,7 @@ void strbuf_add_unique_abbrev(struct strbuf *sb, const 
struct object_id *oid,
  int abbrev_len)
 {
int r;
-   strbuf_grow(sb, GIT_SHA1_HEXSZ + 1);
+   strbuf_grow(sb, GIT_MAX_HEXSZ + 1);
r = find_unique_abbrev_r(sb->buf + sb->len, oid, abbrev_len);
strbuf_setlen(sb, sb->len + r);
 }


[PATCH 13/17] builtin/merge-recursive: make hash independent

2018-07-08 Thread brian m. carlson
Use GIT_MAX_HEXSZ instead of GIT_SHA1_HEXSZ for an allocation so that it
is sufficiently large.  Switch a comparison to use the_hash_algo to
determine the length of a hex object ID.

Signed-off-by: brian m. carlson 
---
 builtin/merge-recursive.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index 0dd9021958..9b2f707c29 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -9,10 +9,10 @@ static const char builtin_merge_recursive_usage[] =
 
 static const char *better_branch_name(const char *branch)
 {
-   static char githead_env[8 + GIT_SHA1_HEXSZ + 1];
+   static char githead_env[8 + GIT_MAX_HEXSZ + 1];
char *name;
 
-   if (strlen(branch) != GIT_SHA1_HEXSZ)
+   if (strlen(branch) != the_hash_algo->hexsz)
return branch;
xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
name = getenv(githead_env);


[PATCH 16/17] sha1-file: convert constants to uses of the_hash_algo

2018-07-08 Thread brian m. carlson
Convert one use of 20 and several uses of GIT_SHA1_HEXSZ into references
to the_hash_algo.

Signed-off-by: brian m. carlson 
---
 sha1-file.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sha1-file.c b/sha1-file.c
index de4839e634..1f66b9594f 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -336,7 +336,7 @@ int raceproof_create_file(const char *path, create_file_fn 
fn, void *cb)
 static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1)
 {
int i;
-   for (i = 0; i < 20; i++) {
+   for (i = 0; i < the_hash_algo->rawsz; i++) {
static char hex[] = "0123456789abcdef";
unsigned int val = sha1[i];
strbuf_addch(buf, hex[val >> 4]);
@@ -1473,7 +1473,7 @@ void *read_object_with_reference(const struct object_id 
*oid,
}
ref_length = strlen(ref_type);
 
-   if (ref_length + GIT_SHA1_HEXSZ > isize ||
+   if (ref_length + the_hash_algo->hexsz > isize ||
memcmp(buffer, ref_type, ref_length) ||
get_oid_hex((char *) buffer + ref_length, _oid)) {
free(buffer);
@@ -2062,9 +2062,9 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
namelen = strlen(de->d_name);
strbuf_setlen(path, baselen);
strbuf_add(path, de->d_name, namelen);
-   if (namelen == GIT_SHA1_HEXSZ - 2 &&
+   if (namelen == the_hash_algo->hexsz - 2 &&
!hex_to_bytes(oid.hash + 1, de->d_name,
- GIT_SHA1_RAWSZ - 1)) {
+ the_hash_algo->rawsz - 1)) {
if (obj_cb) {
r = obj_cb(, path->buf, data);
if (r)


[PATCH 07/17] commit: increase commit message buffer size

2018-07-08 Thread brian m. carlson
100 bytes is not sufficient to ensure we can write a commit message
buffer when using a 32-byte hash algorithm.  Increase the buffer size to
ensure we have sufficient space.

Signed-off-by: brian m. carlson 
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a9a066dcfb..252f835bae 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1587,7 +1587,7 @@ static int log_ref_write_fd(int fd, const struct 
object_id *old_oid,
char *logrec;
 
msglen = msg ? strlen(msg) : 0;
-   maxlen = strlen(committer) + msglen + 100;
+   maxlen = strlen(committer) + msglen + 200;
logrec = xmalloc(maxlen);
len = xsnprintf(logrec, maxlen, "%s %s %s\n",
oid_to_hex(old_oid),


[PATCH 17/17] pretty: switch hard-coded constants to the_hash_algo

2018-07-08 Thread brian m. carlson
Switch several hard-coded constants into expressions based either on
GIT_MAX_HEXSZ or the_hash_algo.

Signed-off-by: brian m. carlson 
---
 pretty.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pretty.c b/pretty.c
index 703fa6ff7b..b0e653ff25 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1575,7 +1575,7 @@ static void pp_header(struct pretty_print_context *pp,
}
 
if (starts_with(line, "parent ")) {
-   if (linelen != 48)
+   if (linelen != the_hash_algo->hexsz + 8)
die("bad parent line in commit");
continue;
}
@@ -1583,7 +1583,7 @@ static void pp_header(struct pretty_print_context *pp,
if (!parents_shown) {
unsigned num = commit_list_count(commit->parents);
/* with enough slop */
-   strbuf_grow(sb, num * 50 + 20);
+   strbuf_grow(sb, num * (GIT_MAX_HEXSZ + 10) + 20);
add_merge_info(pp, sb, commit);
parents_shown = 1;
}


[PATCH 12/17] builtin/merge: switch to use the_hash_algo

2018-07-08 Thread brian m. carlson
Switch uses of GIT_SHA1_HEXSZ to use the_hash_algo instead.

Signed-off-by: brian m. carlson 
---
 builtin/merge.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 4a4c09496c..916c9f0569 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1034,6 +1034,7 @@ static void handle_fetch_head(struct commit_list 
**remotes, struct strbuf *merge
const char *filename;
int fd, pos, npos;
struct strbuf fetch_head_file = STRBUF_INIT;
+   const unsigned hexsz = the_hash_algo->hexsz;
 
if (!merge_names)
merge_names = _head_file;
@@ -1059,16 +1060,16 @@ static void handle_fetch_head(struct commit_list 
**remotes, struct strbuf *merge
else
npos = merge_names->len;
 
-   if (npos - pos < GIT_SHA1_HEXSZ + 2 ||
+   if (npos - pos < hexsz + 2 ||
get_oid_hex(merge_names->buf + pos, ))
commit = NULL; /* bad */
-   else if (memcmp(merge_names->buf + pos + GIT_SHA1_HEXSZ, 
"\t\t", 2))
+   else if (memcmp(merge_names->buf + pos + hexsz, "\t\t", 2))
continue; /* not-for-merge */
else {
-   char saved = merge_names->buf[pos + GIT_SHA1_HEXSZ];
-   merge_names->buf[pos + GIT_SHA1_HEXSZ] = '\0';
+   char saved = merge_names->buf[pos + hexsz];
+   merge_names->buf[pos + hexsz] = '\0';
commit = get_merge_parent(merge_names->buf + pos);
-   merge_names->buf[pos + GIT_SHA1_HEXSZ] = saved;
+   merge_names->buf[pos + hexsz] = saved;
}
if (!commit) {
if (ptr)


[PATCH 06/17] sha1-name: use the_hash_algo when parsing object names

2018-07-08 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 sha1-name.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 60d9ef3c7e..ba6a5a689f 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -310,7 +310,7 @@ static int init_object_disambiguation(const char *name, int 
len,
 {
int i;
 
-   if (len < MINIMUM_ABBREV || len > GIT_SHA1_HEXSZ)
+   if (len < MINIMUM_ABBREV || len > the_hash_algo->hexsz)
return -1;
 
memset(ds, 0, sizeof(*ds));
@@ -576,6 +576,8 @@ int find_unique_abbrev_r(char *hex, const struct object_id 
*oid, int len)
struct disambiguate_state ds;
struct min_abbrev_data mad;
struct object_id oid_ret;
+   const unsigned hexsz = the_hash_algo->hexsz;
+
if (len < 0) {
unsigned long count = approximate_object_count();
/*
@@ -599,8 +601,8 @@ int find_unique_abbrev_r(char *hex, const struct object_id 
*oid, int len)
}
 
oid_to_hex_r(hex, oid);
-   if (len == GIT_SHA1_HEXSZ || !len)
-   return GIT_SHA1_HEXSZ;
+   if (len == hexsz || !len)
+   return hexsz;
 
mad.init_len = len;
mad.cur_len = len;
@@ -706,7 +708,7 @@ static int get_oid_basic(const char *str, int len, struct 
object_id *oid,
int refs_found = 0;
int at, reflog_len, nth_prior = 0;
 
-   if (len == GIT_SHA1_HEXSZ && !get_oid_hex(str, oid)) {
+   if (len == the_hash_algo->hexsz && !get_oid_hex(str, oid)) {
if (warn_ambiguous_refs && warn_on_object_refname_ambiguity) {
refs_found = dwim_ref(str, len, _oid, _ref);
if (refs_found > 0) {
@@ -750,7 +752,7 @@ static int get_oid_basic(const char *str, int len, struct 
object_id *oid,
int detached;
 
if (interpret_nth_prior_checkout(str, len, ) > 0) {
-   detached = (buf.len == GIT_SHA1_HEXSZ && 
!get_oid_hex(buf.buf, oid));
+   detached = (buf.len == the_hash_algo->hexsz && 
!get_oid_hex(buf.buf, oid));
strbuf_release();
if (detached)
return 0;


[PATCH 01/17] cache: update object ID functions for the_hash_algo

2018-07-08 Thread brian m. carlson
Update the hashcpy and hashclr functions to use the_hash_algo, since
they are used in a variety of places to copy and manipulate buffers that
need to move data into or out of struct object_id.  Update oidcmp so
that it is implemented on its own and similarly uses the_hash_algo.

Signed-off-by: brian m. carlson 
---
 cache.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index d49092d94d..c4a64278a1 100644
--- a/cache.h
+++ b/cache.h
@@ -977,7 +977,7 @@ static inline int hashcmp(const unsigned char *sha1, const 
unsigned char *sha2)
 
 static inline int oidcmp(const struct object_id *oid1, const struct object_id 
*oid2)
 {
-   return hashcmp(oid1->hash, oid2->hash);
+   return memcmp(oid1->hash, oid2->hash, the_hash_algo->rawsz);
 }
 
 static inline int is_null_sha1(const unsigned char *sha1)
@@ -992,7 +992,7 @@ static inline int is_null_oid(const struct object_id *oid)
 
 static inline void hashcpy(unsigned char *sha_dst, const unsigned char 
*sha_src)
 {
-   memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ);
+   memcpy(sha_dst, sha_src, the_hash_algo->rawsz);
 }
 
 static inline void oidcpy(struct object_id *dst, const struct object_id *src)
@@ -1009,7 +1009,7 @@ static inline struct object_id *oiddup(const struct 
object_id *src)
 
 static inline void hashclr(unsigned char *hash)
 {
-   memset(hash, 0, GIT_SHA1_RAWSZ);
+   memset(hash, 0, the_hash_algo->rawsz);
 }
 
 static inline void oidclr(struct object_id *oid)


[PATCH 00/17] object_id part 14

2018-07-08 Thread brian m. carlson
This is the fourteenth series of patches to switch to using struct
object_id and the_hash_algo.  This series converts several core pieces
to use struct object_id, including the oid* and hex functions.

All of these patches have been tested with both SHA-1 and a 256-bit
hash.

brian m. carlson (17):
  cache: update object ID functions for the_hash_algo
  tree-walk: replace hard-coded constants with the_hash_algo
  hex: switch to using the_hash_algo
  commit: express tree entry constants in terms of the_hash_algo
  strbuf: allocate space with GIT_MAX_HEXSZ
  sha1-name: use the_hash_algo when parsing object names
  commit: increase commit message buffer size
  refs/files-backend: use the_hash_algo for writing refs
  builtin/update-index: convert to using the_hash_algo
  builtin/update-index: simplify parsing of cacheinfo
  builtin/fmt-merge-msg: make hash independent
  builtin/merge: switch to use the_hash_algo
  builtin/merge-recursive: make hash independent
  diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
  log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
  sha1-file: convert constants to uses of the_hash_algo
  pretty: switch hard-coded constants to the_hash_algo

 builtin/fmt-merge-msg.c   | 19 ++-
 builtin/merge-recursive.c |  4 ++--
 builtin/merge.c   | 11 ++-
 builtin/update-index.c| 14 --
 cache.h   |  6 +++---
 commit.c  |  4 ++--
 diff.c|  6 +++---
 hex.c |  6 +++---
 log-tree.c|  2 +-
 pretty.c  |  4 ++--
 refs/files-backend.c  |  6 +++---
 sha1-file.c   |  8 
 sha1-name.c   | 12 +++-
 strbuf.c  |  2 +-
 tree-walk.c   |  3 ++-
 15 files changed, 57 insertions(+), 50 deletions(-)



[PATCH 11/17] builtin/fmt-merge-msg: make hash independent

2018-07-08 Thread brian m. carlson
Convert several uses of GIT_SHA1_HEXSZ into references to the_hash_algo.
Switch other uses into a use of parse_oid_hex and uses of its computed
pointer.

Signed-off-by: brian m. carlson 
---
 builtin/fmt-merge-msg.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index bd680be687..e8c13a2c03 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -108,14 +108,15 @@ static int handle_line(char *line, struct merge_parents 
*merge_parents)
struct string_list_item *item;
int pulling_head = 0;
struct object_id oid;
+   const unsigned hexsz = the_hash_algo->hexsz;
 
-   if (len < GIT_SHA1_HEXSZ + 3 || line[GIT_SHA1_HEXSZ] != '\t')
+   if (len < hexsz + 3 || line[hexsz] != '\t')
return 1;
 
-   if (starts_with(line + GIT_SHA1_HEXSZ + 1, "not-for-merge"))
+   if (starts_with(line + hexsz + 1, "not-for-merge"))
return 0;
 
-   if (line[GIT_SHA1_HEXSZ + 1] != '\t')
+   if (line[hexsz + 1] != '\t')
return 2;
 
i = get_oid_hex(line, );
@@ -130,7 +131,7 @@ static int handle_line(char *line, struct merge_parents 
*merge_parents)
 
if (line[len - 1] == '\n')
line[len - 1] = 0;
-   line += GIT_SHA1_HEXSZ + 2;
+   line += hexsz + 2;
 
/*
 * At this point, line points at the beginning of comment e.g.
@@ -342,7 +343,7 @@ static void shortlog(const char *name,
const struct object_id *oid = _data->oid;
int limit = opts->shortlog_len;
 
-   branch = deref_tag(parse_object(oid), oid_to_hex(oid), GIT_SHA1_HEXSZ);
+   branch = deref_tag(parse_object(oid), oid_to_hex(oid), 
the_hash_algo->hexsz);
if (!branch || branch->type != OBJ_COMMIT)
return;
 
@@ -545,6 +546,7 @@ static void find_merge_parents(struct merge_parents *result,
int len;
char *p = in->buf + pos;
char *newline = strchr(p, '\n');
+   const char *q;
struct object_id oid;
struct commit *parent;
struct object *obj;
@@ -552,10 +554,9 @@ static void find_merge_parents(struct merge_parents 
*result,
len = newline ? newline - p : strlen(p);
pos += len + !!newline;
 
-   if (len < GIT_SHA1_HEXSZ + 3 ||
-   get_oid_hex(p, ) ||
-   p[GIT_SHA1_HEXSZ] != '\t' ||
-   p[GIT_SHA1_HEXSZ + 1] != '\t')
+   if (parse_oid_hex(p, , ) ||
+   q[0] != '\t' ||
+   q[1] != '\t')
continue; /* skip not-for-merge */
/*
 * Do not use get_merge_parent() here; we do not have


[PATCH 04/17] commit: express tree entry constants in terms of the_hash_algo

2018-07-08 Thread brian m. carlson
Specify these constants in terms of the size of the hash algorithm
currently in use.

Signed-off-by: brian m. carlson 
---
 commit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 0c3b75aeff..ff05d04570 100644
--- a/commit.c
+++ b/commit.c
@@ -364,8 +364,8 @@ int parse_commit_buffer(struct commit *item, const void 
*buffer, unsigned long s
struct object_id parent;
struct commit_list **pptr;
struct commit_graft *graft;
-   const int tree_entry_len = GIT_SHA1_HEXSZ + 5;
-   const int parent_entry_len = GIT_SHA1_HEXSZ + 7;
+   const int tree_entry_len = the_hash_algo->hexsz + 5;
+   const int parent_entry_len = the_hash_algo->hexsz + 7;
 
if (item->object.parsed)
return 0;


[PATCH 14/17] diff: switch GIT_SHA1_HEXSZ to use the_hash_algo

2018-07-08 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 diff.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 639eb646b9..485ff6c264 100644
--- a/diff.c
+++ b/diff.c
@@ -3832,7 +3832,7 @@ static const char *diff_abbrev_oid(const struct object_id 
*oid, int abbrev)
char *hex = oid_to_hex(oid);
if (abbrev < 0)
abbrev = FALLBACK_DEFAULT_ABBREV;
-   if (abbrev > GIT_SHA1_HEXSZ)
+   if (abbrev > the_hash_algo->hexsz)
BUG("oid abbreviation out of range: %d", abbrev);
if (abbrev)
hex[abbrev] = '\0';
@@ -4947,7 +4947,7 @@ const char *diff_aligned_abbrev(const struct object_id 
*oid, int len)
const char *abbrev;
 
/* Do we want all 40 hex characters? */
-   if (len == GIT_SHA1_HEXSZ)
+   if (len == the_hash_algo->hexsz)
return oid_to_hex(oid);
 
/* An abbreviated value is fine, possibly followed by an ellipsis. */
@@ -4977,7 +4977,7 @@ const char *diff_aligned_abbrev(const struct object_id 
*oid, int len)
 * the automatic sizing is supposed to give abblen that ensures
 * uniqueness across all objects (statistically speaking).
 */
-   if (abblen < GIT_SHA1_HEXSZ - 3) {
+   if (abblen < the_hash_algo->hexsz - 3) {
static char hex[GIT_MAX_HEXSZ + 1];
if (len < abblen && abblen <= len + 2)
xsnprintf(hex, sizeof(hex), "%s%.*s", abbrev, 
len+3-abblen, "..");


[PATCH 10/17] builtin/update-index: simplify parsing of cacheinfo

2018-07-08 Thread brian m. carlson
Switch from using get_oid_hex to parse_oid_hex to simplify pointer
operations and avoid the need for a hash-related constant.

Signed-off-by: brian m. carlson 
---
 builtin/update-index.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 031cef5229..3206c5ad45 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -827,6 +827,7 @@ static int parse_new_style_cacheinfo(const char *arg,
 {
unsigned long ul;
char *endp;
+   const char *p;
 
if (!arg)
return -1;
@@ -837,9 +838,9 @@ static int parse_new_style_cacheinfo(const char *arg,
return -1; /* not a new-style cacheinfo */
*mode = ul;
endp++;
-   if (get_oid_hex(endp, oid) || endp[GIT_SHA1_HEXSZ] != ',')
+   if (parse_oid_hex(endp, oid, ) || *p != ',')
return -1;
-   *path = endp + GIT_SHA1_HEXSZ + 1;
+   *path = p + 1;
return 0;
 }
 


Re: [PATCH v2 0/4] Automatic transfer encoding for patches

2018-07-08 Thread Drew DeVault
LGTM, thanks for the v2.


[PATCH v2 2/4] send-email: accept long lines with suitable transfer encoding

2018-07-08 Thread brian m. carlson
With --validate (which is the default), we warn about lines exceeding
998 characters due to the limits specified in RFC 5322.  However, if
we're using a suitable transfer encoding (quoted-printable or base64),
we're guaranteed not to have lines exceeding 76 characters, so there's
no need to fail in this case.  The auto transfer encoding handles this
specific case, so accept it as well.

Signed-off-by: brian m. carlson 
---
 Documentation/git-send-email.txt |  7 +--
 git-send-email.perl  | 18 +++---
 t/t9001-send-email.sh| 13 +
 3 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 385c7de9e2..0e648075bb 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -401,8 +401,11 @@ have been specified, in which case default to 'compose'.
 +
 --
*   Invoke the sendemail-validate hook if present (see 
linkgit:githooks[5]).
-   *   Warn of patches that contain lines longer than 998 
characters; this
-   is due to SMTP limits as described by 
http://www.ietf.org/rfc/rfc2821.txt.
+   *   Warn of patches that contain lines longer than
+   998 characters unless a suitable transfer encoding
+   ('auto', 'base64', or 'quoted-printable') is used;
+   this is due to SMTP limits as described by
+   http://www.ietf.org/rfc/rfc2821.txt.
 --
 +
 Default is the value of `sendemail.validate`; if this is not set,
diff --git a/git-send-email.perl b/git-send-email.perl
index 1736a09d21..e6bcc55827 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -645,7 +645,7 @@ sub is_format_patch_arg {
 if ($validate) {
foreach my $f (@files) {
unless (-p $f) {
-   my $error = validate_patch($f);
+   my $error = validate_patch($f, $target_xfer_encoding);
$error and die sprintf(__("fatal: %s: %s\nwarning: no 
patches were sent\n"),
  $f, $error);
}
@@ -1879,7 +1879,7 @@ sub unique_email_list {
 }
 
 sub validate_patch {
-   my $fn = shift;
+   my ($fn, $xfer_encoding) = @_;
 
if ($repo) {
my $validate_hook = catfile(catdir($repo->repo_path(), 'hooks'),
@@ -1899,11 +1899,15 @@ sub validate_patch {
return $hook_error if $hook_error;
}
 
-   open(my $fh, '<', $fn)
-   or die sprintf(__("unable to open %s: %s\n"), $fn, $!);
-   while (my $line = <$fh>) {
-   if (length($line) > 998) {
-   return sprintf(__("%s: patch contains a line longer 
than 998 characters"), $.);
+   # Any long lines will be automatically fixed if we use a suitable 
transfer
+   # encoding.
+   unless ($xfer_encoding =~ /^(?:auto|quoted-printable|base64)$/) {
+   open(my $fh, '<', $fn)
+   or die sprintf(__("unable to open %s: %s\n"), $fn, $!);
+   while (my $line = <$fh>) {
+   if (length($line) > 998) {
+   return sprintf(__("%s: patch contains a line 
longer than 998 characters"), $.);
+   }
}
}
return;
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index a35cba6a4c..1b474cca28 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -479,6 +479,19 @@ test_expect_success $PREREQ 'long lines with auto encoding 
are quoted-printable'
grep "Content-Transfer-Encoding: quoted-printable" msgtxt1
 '
 
+for enc in auto quoted-printable base64
+do
+   test_expect_success $PREREQ "--validate passes with encoding $enc" '
+   git send-email \
+   --from="Example " \
+   --to=nob...@example.com \
+   --smtp-server="$(pwd)/fake.sendmail" \
+   --transfer-encoding=$enc \
+   --validate \
+   $patches longline.patch
+   '
+done
+
 test_expect_success $PREREQ 'Invalid In-Reply-To' '
clean_fake_sendmail &&
git send-email \


[PATCH v2 4/4] docs: correct RFC specifying email line length

2018-07-08 Thread brian m. carlson
The git send-email documentation specifies RFC 2821 (the SMTP RFC) as
providing line length limits, but the specification that restricts line
length to 998 octets is RFC 2822 (the email message format RFC).  Since
RFC 2822 has been obsoleted by RFC 5322, update the text to refer to RFC
5322 instead of RFC 2821.

Signed-off-by: brian m. carlson 
---
 Documentation/git-send-email.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 2f32dbf16d..465a4ecbed 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -404,7 +404,7 @@ have been specified, in which case default to 'compose'.
998 characters unless a suitable transfer encoding
('auto', 'base64', or 'quoted-printable') is used;
this is due to SMTP limits as described by
-   http://www.ietf.org/rfc/rfc2821.txt.
+   http://www.ietf.org/rfc/rfc5322.txt.
 --
 +
 Default is the value of `sendemail.validate`; if this is not set,


[PATCH v2 3/4] send-email: automatically determine transfer-encoding

2018-07-08 Thread brian m. carlson
git send-email, when invoked without a --transfer-encoding option, sends
8bit data without a MIME version or a transfer encoding.  This has
several downsides.

First, unless the transfer encoding is specified, it defaults to 7bit,
meaning that non-ASCII data isn't allowed.  Second, if lines longer than
998 bytes are used, we will send an message that is invalid according to
RFC 5322.  The --validate option, which is the default, catches this
issue, but it isn't clear to many people how to resolve this.

To solve these issues, default the transfer encoding to "auto", so that
we explicitly specify 8bit encoding when lines don't exceed 998 bytes
and quoted-printable otherwise.  This means that we now always emit
Content-Transfer-Encoding and MIME-Version headers, so remove the
conditionals from this portion of the code.

It is unlikely that the unconditional inclusion of these two headers
will affect the deliverability of messages in anything but a positive
way, since MIME is already widespread and well understood by most email
programs.

Signed-off-by: brian m. carlson 
---
 Documentation/git-send-email.txt |  3 +--
 git-send-email.perl  | 18 ++
 t/t9001-send-email.sh| 21 +
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 0e648075bb..2f32dbf16d 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -147,8 +147,7 @@ Note that no attempts whatsoever are made to validate the 
encoding.
otherwise.
 +
 Default is the value of the `sendemail.transferEncoding` configuration
-value; if that is unspecified, git will use 8bit and not add a
-Content-Transfer-Encoding header.
+value; if that is unspecified, default to `auto`.
 
 --xmailer::
 --no-xmailer::
diff --git a/git-send-email.perl b/git-send-email.perl
index e6bcc55827..f4c07908d2 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -231,7 +231,7 @@ sub do_edit {
 my (@suppress_cc);
 my ($auto_8bit_encoding);
 my ($compose_encoding);
-my ($target_xfer_encoding);
+my $target_xfer_encoding = 'auto';
 
 my ($debug_net_smtp) = 0;  # Net::SMTP, see send_message()
 
@@ -1737,17 +1737,11 @@ sub process_file {
}
}
}
-   if (defined $target_xfer_encoding) {
-   $xfer_encoding = '8bit' if not defined $xfer_encoding;
-   ($message, $xfer_encoding) = apply_transfer_encoding(
-   $message, $xfer_encoding, $target_xfer_encoding);
-   }
-   if (defined $xfer_encoding) {
-   push @xh, "Content-Transfer-Encoding: $xfer_encoding";
-   }
-   if (defined $xfer_encoding or $has_content_type) {
-   unshift @xh, 'MIME-Version: 1.0' unless $has_mime_version;
-   }
+   $xfer_encoding = '8bit' if not defined $xfer_encoding;
+   ($message, $xfer_encoding) = apply_transfer_encoding(
+   $message, $xfer_encoding, $target_xfer_encoding);
+   push @xh, "Content-Transfer-Encoding: $xfer_encoding";
+   unshift @xh, 'MIME-Version: 1.0' unless $has_mime_version;
 
$needs_confirm = (
$confirm eq "always" or
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 1b474cca28..1da282c415 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -225,6 +225,8 @@ X-Mailer: X-MAILER-STRING
 In-Reply-To: 
 References: 
 Reply-To: Reply 
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
 
 Result: OK
 EOF
@@ -415,6 +417,7 @@ test_expect_success $PREREQ 'reject long lines' '
--from="Example " \
--to=nob...@example.com \
--smtp-server="$(pwd)/fake.sendmail" \
+   --transfer-encoding=8bit \
$patches longline.patch \
2>errors &&
grep longline.patch errors
@@ -609,6 +612,8 @@ Subject: [PATCH 1/1] Second.
 Date: DATE-STRING
 Message-Id: MESSAGE-ID-STRING
 X-Mailer: X-MAILER-STRING
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
 
 Result: OK
 EOF
@@ -653,6 +658,8 @@ Subject: [PATCH 1/1] Second.
 Date: DATE-STRING
 Message-Id: MESSAGE-ID-STRING
 X-Mailer: X-MAILER-STRING
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
 
 Result: OK
 EOF
@@ -688,6 +695,8 @@ Subject: [PATCH 1/1] Second.
 Date: DATE-STRING
 Message-Id: MESSAGE-ID-STRING
 X-Mailer: X-MAILER-STRING
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
 
 Result: OK
 EOF
@@ -714,6 +723,8 @@ Subject: [PATCH 1/1] Second.
 Date: DATE-STRING
 Message-Id: MESSAGE-ID-STRING
 X-Mailer: X-MAILER-STRING
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
 
 Result: OK
 EOF
@@ -748,6 +759,8 @@ Subject: [PATCH 1/1] Second.
 Date: DATE-STRING
 Message-Id: MESSAGE-ID-STRING
 X-Mailer: X-MAILER-STRING
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
 
 Result: OK
 EOF
@@ -779,6 +792,8 @@ Subject: [PATCH 1/1] Second.
 Date: DATE-STRING
 

[PATCH v2 0/4] Automatic transfer encoding for patches

2018-07-08 Thread brian m. carlson
This series introduces an "auto" value for git send-email
--transfer-encoding that uses 8bit when possible (i.e. when lines are
998 octets or shorter) and quoted-printable otherwise; it then makes
this the default behavior.  It also makes --validate aware of transfer
encoding so it doesn't complain when using quoted-printable or base64.

Changes from v1:
* Update commit messages to refer to RFC 5322.
* Add a missing space.
* Remove the needless capture of stderr.
* Define "suitable transfer encoding".
* Invert test to better capture failures.
* Wrap --validate code in an if block instead of returning early.
* Update documentation to reflect correct, modern RFC.

brian m. carlson (4):
  send-email: add an auto option for transfer encoding
  send-email: accept long lines with suitable transfer encoding
  send-email: automatically determine transfer-encoding
  docs: correct RFC specifying email line length

 Documentation/git-send-email.txt | 17 ++
 git-send-email.perl  | 46 +-
 t/t9001-send-email.sh| 57 
 3 files changed, 91 insertions(+), 29 deletions(-)



[PATCH v2 1/4] send-email: add an auto option for transfer encoding

2018-07-08 Thread brian m. carlson
For most patches, using a transfer encoding of 8bit provides good
compatibility with most servers and makes it as easy as possible to view
patches.  However, there are some patches for which 8bit is not a valid
encoding: RFC 5322 specifies that a message must not have lines
exceeding 998 octets.

Add a transfer encoding value, auto, which indicates that a patch should
use 8bit where allowed and quoted-printable otherwise.  Choose
quoted-printable instead of base64, since base64-encoded plain text is
treated as suspicious by some spam filters.

Signed-off-by: brian m. carlson 
---
 Documentation/git-send-email.txt | 11 +++
 git-send-email.perl  | 12 +++-
 t/t9001-send-email.sh| 23 +++
 3 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 4f3efde80c..385c7de9e2 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -137,15 +137,18 @@ Note that no attempts whatsoever are made to validate the 
encoding.
Specify encoding of compose message. Default is the value of the
'sendemail.composeencoding'; if that is unspecified, UTF-8 is assumed.
 
---transfer-encoding=(7bit|8bit|quoted-printable|base64)::
+--transfer-encoding=(7bit|8bit|quoted-printable|base64|auto)::
Specify the transfer encoding to be used to send the message over SMTP.
7bit will fail upon encountering a non-ASCII message.  quoted-printable
can be useful when the repository contains files that contain carriage
returns, but makes the raw patch email file (as saved from a MUA) much
harder to inspect manually.  base64 is even more fool proof, but also
-   even more opaque.  Default is the value of the 
`sendemail.transferEncoding`
-   configuration value; if that is unspecified, git will use 8bit and not
-   add a Content-Transfer-Encoding header.
+   even more opaque.  auto will use 8bit when possible, and 
quoted-printable
+   otherwise.
++
+Default is the value of the `sendemail.transferEncoding` configuration
+value; if that is unspecified, git will use 8bit and not add a
+Content-Transfer-Encoding header.
 
 --xmailer::
 --no-xmailer::
diff --git a/git-send-email.perl b/git-send-email.perl
index 8ec70e58ed..1736a09d21 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1739,9 +1739,8 @@ sub process_file {
}
if (defined $target_xfer_encoding) {
$xfer_encoding = '8bit' if not defined $xfer_encoding;
-   $message = apply_transfer_encoding(
+   ($message, $xfer_encoding) = apply_transfer_encoding(
$message, $xfer_encoding, $target_xfer_encoding);
-   $xfer_encoding = $target_xfer_encoding;
}
if (defined $xfer_encoding) {
push @xh, "Content-Transfer-Encoding: $xfer_encoding";
@@ -1852,13 +1851,16 @@ sub apply_transfer_encoding {
$message = MIME::Base64::decode($message)
if ($from eq 'base64');
 
+   $to = ($message =~ /.{999,}/) ? 'quoted-printable' : '8bit'
+   if $to eq 'auto';
+
die __("cannot send message as 7bit")
if ($to eq '7bit' and $message =~ /[^[:ascii:]]/);
-   return $message
+   return ($message, $to)
if ($to eq '7bit' or $to eq '8bit');
-   return MIME::QuotedPrint::encode($message, "\n", 0)
+   return (MIME::QuotedPrint::encode($message, "\n", 0), $to)
if ($to eq 'quoted-printable');
-   return MIME::Base64::encode($message, "\n")
+   return (MIME::Base64::encode($message, "\n"), $to)
if ($to eq 'base64');
die __("invalid transfer encoding");
 }
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e80eacbb1b..a35cba6a4c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -456,6 +456,29 @@ test_expect_success $PREREQ 'allow long lines with 
--no-validate' '
2>errors
 '
 
+test_expect_success $PREREQ 'short lines with auto encoding are 8bit' '
+   clean_fake_sendmail &&
+   git send-email \
+   --from="A " \
+   --to=nob...@example.com \
+   --smtp-server="$(pwd)/fake.sendmail" \
+   --transfer-encoding=auto \
+   $patches &&
+   grep "Content-Transfer-Encoding: 8bit" msgtxt1
+'
+
+test_expect_success $PREREQ 'long lines with auto encoding are 
quoted-printable' '
+   clean_fake_sendmail &&
+   git send-email \
+   --from="Example " \
+   --to=nob...@example.com \
+   --smtp-server="$(pwd)/fake.sendmail" \
+   --transfer-encoding=auto \
+   --no-validate \
+   longline.patch &&
+   grep "Content-Transfer-Encoding: quoted-printable" msgtxt1
+'
+
 test_expect_success $PREREQ 'Invalid In-Reply-To' '

Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-08 Thread Johannes Schindelin
Hi Paul,

On Sun, 8 Jul 2018, Paul Smith wrote:

> On Fri, 2018-07-06 at 09:18 -0400, Daniel Jacques wrote:
> > I forewent autoconf because I was concerned that the option was too
> > obscure and the configuration too nuanced to be worth adding via
> > flag, as RUNTIME_PREFIX requires some degree of path alignment and is
> > fairly special-case. If you prefer autoconf, though, it sounds like a
> > good thing to add, and I'm happy that you are finding the feature
> > useful!
> 
> Well, far from obscure, I actually think that RUNTIME_PREFIX should be
> the default behavior on all platforms.  In fact speaking for myself, I
> see no value at all in the hardcoded path behavior and it could be
> removed and RUNTIME_PREFIX be the only option and that would be fine
> with me.
> 
> The only possible advantage I can see to the current default that you
> can copy the Git binary alone somewhere else, but that's of very little
> value IMO: you could instead create a symbolic link or a two-line shell
> script wrapper if you wanted to have "git" available outside of its
> normal relation to the rest of the installation for some reason.

In theory, I agree with you, I would love for RUNTIME_PREFIX not even to
be needed.

In practice, however, a *lng* time ago it was decided that it was okay
to implement parts of Git as shell scripts, and when those shell scripts
finally became too many, in order not to clutter the `PATH`, they were
moved to the libexec/git-core/ directory.

Obviously, for this to work, Git needs to prefix the `PATH` variable
internally, and for that it has to know where that libexec/git-core/
directory lives.

Now, if you care to have a look at Dan's (and my) patches to implement
RUNTIME_PREFIX so that it looks for a directory *relative to the Git
binary*, you will see that it is far from portable. In fact, it is very
definitely not portable, and needs specific support for *every single
supported Operating System*. And while we covered a lot, we did not cover
all of them.

So unfortunately, it is impossible to make it the default, I am afraid.
Until the time when we can ship a single `git` binary (which is sadly
unlikely to happen, as there has been a *lot* of pushback against that
e.g. on the grounds that having to (lazy-)load the cURL library adds a
tiny bit to the startup time of the `git` binary).

It is all a bit complex, due to non-technical reasons.

Ciao,
Dscho


Re: [PATCH v4 4/4] builtin/rebase: support running "git rebase "

2018-07-08 Thread Johannes Schindelin
Hi Pratik,

On Sun, 8 Jul 2018, Pratik Karki wrote:

> + strbuf_addf(_snippet,
> + ". git-rebase--common && . %s && %s",
> + backend, backend_func);

In my quick test (on top of `wip-rebase`), this line needed this change:

-- snip --
@@ -269,7 +279,8 @@ static int run_specific_rebase(struct rebase_options *opts)
}

strbuf_addf(_snippet,
-   "set -x && . git-rebase--common && . %s && %s",
+   "set -x && "
+   ". git-sh-setup && . git-rebase--common && . %s && %s",
backend, backend_func);
argv[0] = script_snippet.buf;
-- snap --

Obviously, the `set -x && ` part was not part of the patches you sent to
the Git mailing list, so please do not let that distract you from the fact
that I had to source also `git-sh-setup` (it implies `git-sh-i18n`, and
the `eval_gettext` function is defined there and used in the
`move_to_original_branch` fnuction).

With this (and the REF_NO_DEREF change), t3404-rebase-skip.sh passes,
which is pretty cool.

Ciao,
Dscho


Re: [PATCH v4 3/4] sequencer: refactor the code to detach HEAD to checkout.c

2018-07-08 Thread Johannes Schindelin
Hi Pratik,

On Sun, 8 Jul 2018, Pratik Karki wrote:

> diff --git a/checkout.c b/checkout.c
> index bdefc888ba..da68915fd7 100644
> --- a/checkout.c
> +++ b/checkout.c
> @@ -2,6 +2,11 @@
>  #include "remote.h"
>  #include "refspec.h"
>  #include "checkout.h"
> +#include "unpack-trees.h"
> +#include "lockfile.h"
> +#include "refs.h"
> +#include "tree.h"
> +#include "cache-tree.h"
>  
>  struct tracking_name_data {
>   /* const */ char *src_ref;
> @@ -42,3 +47,62 @@ const char *unique_tracking_name(const char *name, struct 
> object_id *oid)
>   free(cb_data.dst_ref);
>   return NULL;
>  }
> +
> +int detach_head_to(struct object_id *oid, const char *action,
> +const char *reflog_message)
> +{
> + struct strbuf ref_name = STRBUF_INIT;
> + struct tree_desc desc;
> + struct lock_file lock = LOCK_INIT;
> + struct unpack_trees_options unpack_tree_opts;
> + struct tree *tree;
> + int ret = 0;
> +
> + if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
> + return -1;
> +
> + memset(_tree_opts, 0, sizeof(unpack_tree_opts));
> + setup_unpack_trees_porcelain(_tree_opts, action);
> + unpack_tree_opts.head_idx = 1;
> + unpack_tree_opts.src_index = _index;
> + unpack_tree_opts.dst_index = _index;
> + unpack_tree_opts.fn = oneway_merge;
> + unpack_tree_opts.merge = 1;
> + unpack_tree_opts.update = 1;
> +
> + if (read_cache_unmerged()) {
> + rollback_lock_file();
> + strbuf_release(_name);
> + return error_resolve_conflict(_(action));
> + }
> +
> + if (!fill_tree_descriptor(, oid)) {
> + error(_("failed to find tree of %s"), oid_to_hex(oid));
> + rollback_lock_file();
> + free((void *)desc.buffer);
> + strbuf_release(_name);
> + return -1;
> + }
> +
> + if (unpack_trees(1, , _tree_opts)) {
> + rollback_lock_file();
> + free((void *)desc.buffer);
> + strbuf_release(_name);
> + return -1;
> + }
> +
> + tree = parse_tree_indirect(oid);
> + prime_cache_tree(_index, tree);
> +
> + if (write_locked_index(_index, , COMMIT_LOCK) < 0)
> + ret = error(_("could not write index"));
> + free((void *)desc.buffer);
> +
> + if (!ret)
> + ret = update_ref(reflog_message, "HEAD", oid,
> +  NULL, 0, UPDATE_REFS_MSG_ON_ERR);

I noticed that this does not actually detach the HEAD. That is my fault,
of course, as I should have not only suggested refactoring the
`do_reset()` function from `sequencer.c`, but I should also have
remembered that that function has the benefit of *always* acting on a
detached HEAD (because it runs during an interactive rebase), and
therefore does not need to detach it explicitly.

In light of the `reset_hard()` function that you added in a `wip` (see
https://github.com/git/git/pull/505/files#diff-c7361e406139e8cd3a300b80b8f8cc8dR296),
I could imagine that it might be better, after all, to leave `do_reset()`
alone and implement a `reset_hard()` function that also optionally
detaches the `HEAD` (I *think* that the flag `REF_NO_DEREF` would do that
for you).

Alternatively, just update the code in `do_reset()` to use that flag
first, and only *then* extract the code to `checkout.c`.

(I could not resist, and made this quick change on top of your
`wip-rebase`, and together with a couple more, obvious fixups, this lets
t3403 pass. It still needs some things that you have not yet sent to the
mailing list, such as support for `--skip`.)

Ciao,
Dscho



Re: [RFC PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-08 Thread Philip Oakley

From: "Eric Sunshine" 
To: "Beat Bolli" 

On Sun, Jul 8, 2018 at 10:44 AM Beat Bolli  wrote:

The marco GIT_PATH_FUNC expands to a complete statement including the


s/marco/macro/


semicolon. Remove two extra trailing semicolons.

Signed-off-by: Beat Bolli 
---
 sequencer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


While you're at it, perhaps it would be a good idea to fix the example
in path.h which teaches the "wrong" way:

/*
* You can define a static memoized git path like:
*
*static GIT_PATH_FUNC(git_path_foo, "FOO");
*
* or use one of the global ones below.
*/



Re: [GSoC] [PATCH v4 0/4] rebase: rewrite rebase in C

2018-07-08 Thread Johannes Schindelin
Hi Pratik,

On Sun, 8 Jul 2018, Pratik Karki wrote:

> As a GSoC project, I have been working on the builtin rebase.
> 
> The motivation behind the rewrite of rebase i.e. from shell script to C
> are for following reasons:
> 
> 1.  Writing shell scripts and getting it to production is much faster
> than doing the equivalent in C but lacks in performance and extra
> workarounds are needed for non-POSIX platforms.
> 
> 2.  Git for Windows is at loss as the installer size increases due to
> addition of extra dependencies for the shell scripts which are usually
> available in POSIX compliant platforms.
> 
> This series of patches serves to demonstrate a minimal builtin rebase
> which supports running `git rebase ` and also serves to ask for
> reviews.
> 
> Changes since v3:
> 
>   -  Fix commit message of `rebase: start implementing it as a builtin`.
> 
>   -  Acknowledge Junio's style reviews.
> 
>   -  Acknowledge Johannes Schindelin's review.

The range-diff looks like this (and makes sense to me; you might want to
fix the typo s/retun/return/, but that's all for now):

-- snipsnap --
 1:  7baec70f219 !  1:  42778b20edf rebase: start implementing it as a builtin
@@ -13,6 +13,12 @@
 be able to conveniently test new features by configuring
 `rebase.useBuiltin`.

+In the original difftool conversion, if sane_execvp() that attempts to
+run the legacy scripted version returned with non-negative status, the
+command silently exited without doing anything with success, but
+sane_execvp() should not retun with non-negative status in the first
+place, so we use die() to notice such an abnormal case.
+
 We intentionally avoid reading the config directly to avoid
 messing up the GIT_* environment variables when we need to fall back to
 exec()ing the shell script. The test of builtin rebase can be done by
 2:  f385f42dc56 !  2:  a28be7308e6 rebase: refactor common shell functions 
into their own file
@@ -45,6 +45,20 @@
 diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
 --- a/git-legacy-rebase.sh
 +++ b/git-legacy-rebase.sh
+@@
+ LF='
+ '
+ ok_to_skip_pre_rebase=
+-resolvemsg="
+-$(gettext 'Resolve all conflicts manually, mark them as resolved with
+-"git add/rm ", then run "git rebase --continue".
+-You can instead skip this commit: run "git rebase --skip".
+-To abort and get back to the state before "git rebase", run "git rebase 
--abort".')
+-"
++
+ squash_onto=
+ unset onto
+ unset restrict_revision
 @@
  true) gpg_sign_opt=-S ;;
  *)gpg_sign_opt= ;;
@@ -128,6 +142,13 @@
 +++ b/git-rebase--common.sh
 @@
 +
++resolvemsg="
++$(gettext 'Resolve all conflicts manually, mark them as resolved with
++"git add/rm ", then run "git rebase --continue".
++You can instead skip this commit: run "git rebase --skip".
++To abort and get back to the state before "git rebase", run "git rebase 
--abort".')
++"
++
 +write_basic_state () {
 +  echo "$head_name" > "$state_dir"/head-name &&
 +  echo "$onto" > "$state_dir"/onto &&
 3:  147699bd195 =  3:  7591098c4d1 sequencer: refactor the code to detach HEAD 
to checkout.c
 4:  bbaa4264caa !  4:  f8429e950a4 builtin/rebase: support running "git rebase 
"
@@ -232,13 +232,14 @@
 +  }
 +
 +  /*
-+  * If the branch to rebase is given, that is the branch we will rebase
-+  * branch_name -- branch/commit being rebased, or HEAD (already detached)
-+  * orig_head -- commit object name of tip of the branch before rebasing
-+  * head_name -- refs/heads/ or "detached HEAD"
-+  */
++   * If the branch to rebase is given, that is the branch we will rebase
++   * branch_name -- branch/commit being rebased, or
++   *HEAD (already detached)
++   * orig_head -- commit object name of tip of the branch before rebasing
++   * head_name -- refs/heads/ or "detached HEAD"
++   */
 +  if (argc > 1)
-+   die ("TODO: handle switch_to");
++   die("TODO: handle switch_to");
 +  else {
 +  /* Do not need to switch branches, we are already on it.  */
 +  options.head_name =



Re: [PATCH 2/2] t3430: update to test with custom commentChar

2018-07-08 Thread brian m. carlson
On Sun, Jul 08, 2018 at 09:41:11PM +0300, Daniel Harding wrote:
> Signed-off-by: Daniel Harding 

I think maybe, as you suggested, a separate test for this would be
beneficial.  It might be as simple as modifying 'script-from-scratch' by
doing "sed 's/#/>/'".

> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> index 78f7c9958..ff474d033 100755
> --- a/t/t3430-rebase-merges.sh
> +++ b/t/t3430-rebase-merges.sh
> @@ -56,12 +56,12 @@ test_expect_success 'create completely different 
> structure' '
>   cat >script-from-scratch <<-\EOF &&
>   label onto
>  
> - # onebranch
> + > onebranch
>   pick G
>   pick D
>   label onebranch
>  
> - # second
> + > second
>   reset onto
>   pick B
>   label second

Should this affect the "# Merge the topic branch" line (and the "# C",
"# E", and "# H" lines in the next test) that appears below this?  It
would seem those would qualify as comments as well.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-08 Thread Eric Sunshine
On Sun, Jul 8, 2018 at 10:44 AM Beat Bolli  wrote:
> The marco GIT_PATH_FUNC expands to a complete statement including the
> semicolon. Remove two extra trailing semicolons.
>
> Signed-off-by: Beat Bolli 
> ---
>  sequencer.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

While you're at it, perhaps it would be a good idea to fix the example
in path.h which teaches the "wrong" way:

/*
 * You can define a static memoized git path like:
 *
 *static GIT_PATH_FUNC(git_path_foo, "FOO");
 *
 * or use one of the global ones below.
 */


Re: Git 2.18: RUNTIME_PREFIX... is it working?

2018-07-08 Thread Paul Smith
On Fri, 2018-07-06 at 09:18 -0400, Daniel Jacques wrote:
> I forewent autoconf because I was concerned that the option was too
> obscure and the configuration too nuanced to be worth adding via
> flag, as RUNTIME_PREFIX requires some degree of path alignment and is
> fairly special-case. If you prefer autoconf, though, it sounds like a
> good thing to add, and I'm happy that you are finding the feature
> useful!

Well, far from obscure, I actually think that RUNTIME_PREFIX should be
the default behavior on all platforms.  In fact speaking for myself, I
see no value at all in the hardcoded path behavior and it could be
removed and RUNTIME_PREFIX be the only option and that would be fine
with me.

The only possible advantage I can see to the current default that you
can copy the Git binary alone somewhere else, but that's of very little
value IMO: you could instead create a symbolic link or a two-line shell
script wrapper if you wanted to have "git" available outside of its
normal relation to the rest of the installation for some reason.

Thanks for making this work in any event, Daniel!


[PATCH 1/2] sequencer: fix --rebase-merges with custom commentChar

2018-07-08 Thread Daniel Harding
Prefix the "Branch " comments in the todo list with the configured
comment character instead of hard-coding '#'.

Signed-off-by: Daniel Harding 
---
 sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index 4034c0461..caf91af29 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3991,7 +3991,7 @@ static int make_script_with_merges(struct 
pretty_print_context *pp,
entry = oidmap_get(, >object.oid);
 
if (entry)
-   fprintf(out, "\n# Branch %s\n", entry->string);
+   fprintf(out, "\n%c Branch %s\n", comment_line_char, 
entry->string);
else
fprintf(out, "\n");
 
-- 
2.18.0



[PATCH 0/2] Fix --rebase-merges with custom commentChar

2018-07-08 Thread Daniel Harding
I have core.commentChar set in my .gitconfig, and when I tried to run
git rebase -i -r, I received an error message like the following:

error: invalid line 3: # Branch 

To fix this, I updated sequencer.c to use the configured commentChar
for the Branch  comments.  I also tweaked the tests in t3430 to
verify todo list generation with a custom commentChar.  I'm not sure
if I took the right approach with that, or if it would be better to
add additional tests for that case, so feel free to
tweak/replace/ignore the second commit as appropriate.

Thanks,

Daniel Harding




[PATCH 2/2] t3430: update to test with custom commentChar

2018-07-08 Thread Daniel Harding
Signed-off-by: Daniel Harding 
---
 t/t3430-rebase-merges.sh | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 78f7c9958..ff474d033 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -56,12 +56,12 @@ test_expect_success 'create completely different structure' 
'
cat >script-from-scratch <<-\EOF &&
label onto
 
-   # onebranch
+   > onebranch
pick G
pick D
label onebranch
 
-   # second
+   > second
reset onto
pick B
label second
@@ -70,6 +70,7 @@ test_expect_success 'create completely different structure' '
merge -C H second
merge onebranch # Merge the topic branch '\''onebranch'\''
EOF
+   test_config core.commentChar ">" &&
test_config sequence.editor \""$PWD"/replace-editor.sh\" &&
test_tick &&
git rebase -i -r A &&
@@ -107,10 +108,10 @@ test_expect_success 'generate correct todo list' '
pick 12bd07b D
merge -C 2051b56 E # E
merge -C 233d48a H # H
-
EOF
 
-   grep -v "^#" <.git/ORIGINAL-TODO >output &&
+   test_config core.commentChar ">" &&
+   git stripspace -s <.git/ORIGINAL-TODO >output &&
test_cmp expect output
 '
 
-- 
2.18.0



Re: [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes

2018-07-08 Thread Lars Schneider



> On Jul 8, 2018, at 8:30 PM, larsxschnei...@gmail.com wrote:
> 
> From: Lars Schneider 
> 
> In 107642fe26 ("convert: add 'working-tree-encoding' attribute",
> 2018-04-15) we added an attribute which defines the working tree
> encoding of a file.
> 
> Some platforms might spell the name of a certain encoding differently or
> some users might want to use different encodings on different platforms.
> Add the Git config "encoding..insteadOf = " to
> support these use-cases with a user specific mapping. If the alias
> matches an existing encoding name, then the alias will take precedence.
> The alias is case insensitive.
> 
> Example:
> 
>   (in .gitattributes)
>   *.c working-tree-encoding=foo
> 
>   (in config)
>   [encoding "UTF-16"]
>   insteadOf = foo
> 
> Signed-off-by: Lars Schneider 
> ---
> Documentation/gitattributes.txt  | 19 
> convert.c| 50 
> t/t0028-working-tree-encoding.sh | 28 ++
> 3 files changed, 97 insertions(+)
> 
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 92010b062e..3628f0e5cf 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -366,6 +366,25 @@ command to guess the encoding:
> file foo.ps1
> 
> 
> ...
> 
>   return 0;
> @@ -1225,6 +1264,17 @@ static const char *git_path_check_encoding(struct 
> attr_check_item *check)
>   die(_("true/false are no valid working-tree-encodings"));
>   }
> 
> + /* Check if an alias was defined for the encoding in the Git config */
> + if (encoding_aliases_initialized) {
> + struct alias2enc hashkey;
> + struct alias2enc *entry;
> + hashmap_entry_init(, strihash(value));
> + hashkey.alias = value;
> + entry = hashmap_get(_map, , NULL);
> + if (entry)
> + value = entry->encoding;

Here I reuse the char* pointer from the hashmap.
The hashmap is static and no entry is ever removed.
Is this OK or should I rather create a copy of the string?

Thanks,
Lars



Re: [PATCH v1 1/2] convert: refactor conversion driver config parsing

2018-07-08 Thread Lars Schneider



> On Jul 8, 2018, at 8:30 PM, larsxschnei...@gmail.com wrote:
> 
> From: Lars Schneider 
> 
> Refactor conversion driver config parsing to ease the parsing of new
> configs in a subsequent patch.
> 
> No functional change intended.
> 
> Signed-off-by: Lars Schneider 
> ---
> convert.c | 64 +++
> 1 file changed, 32 insertions(+), 32 deletions(-)
> 
> diff --git a/convert.c b/convert.c
> index 64d0d30e08..949bc783e4 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -1003,43 +1003,43 @@ static int read_convert_config(const char *var, const 
> char *value, void *cb)
>   int namelen;
>   struct convert_driver *drv;
> 
> ...
> 
> - /*
> -  * filter..smudge and filter..clean specifies
> -  * the command line:
> -  *
> -  *  command-line
> -  *
> -  * The command-line will not be interpolated in any way.
> -  */
> + /*
> +  * filter..smudge and filter..clean specifies
> +  * the command line:
> +  *
> +  *  command-line
> +  *
> +  * The command-line will not be interpolated in any way.
> +  */

I stumbled over this comment introduced in aa4ed402c9 
("Add 'filter' attribute and external filter driver definition.", 2007-04-21).

Is the middle "command-line" intentional?

- Lars

[PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes

2018-07-08 Thread larsxschneider
From: Lars Schneider 

In 107642fe26 ("convert: add 'working-tree-encoding' attribute",
2018-04-15) we added an attribute which defines the working tree
encoding of a file.

Some platforms might spell the name of a certain encoding differently or
some users might want to use different encodings on different platforms.
Add the Git config "encoding..insteadOf = " to
support these use-cases with a user specific mapping. If the alias
matches an existing encoding name, then the alias will take precedence.
The alias is case insensitive.

Example:

(in .gitattributes)
*.c working-tree-encoding=foo

(in config)
[encoding "UTF-16"]
insteadOf = foo

Signed-off-by: Lars Schneider 
---
 Documentation/gitattributes.txt  | 19 
 convert.c| 50 
 t/t0028-working-tree-encoding.sh | 28 ++
 3 files changed, 97 insertions(+)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 92010b062e..3628f0e5cf 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -366,6 +366,25 @@ command to guess the encoding:
 file foo.ps1
 
 
+The encoding in all examples above was directly defined in the Git
+attributes. In addition, it is possible to define encodings indirectly
+using aliases set via Git config:
+
+
+[encoding "UTF-16"]
+insteadOf = my-custom-alias
+
+
+The alias name can be used in the Git attributes instead of the actual
+encoding name:
+
+
+*.ps1   text working-tree-encoding=my-custom-alias
+
+
+This mapping can be useful if equivalent encodings are spelled
+differently across platforms. It can also be useful if a user wants to
+use different encodings on different platforms for the same file.
 
 `ident`
 ^^^
diff --git a/convert.c b/convert.c
index 949bc783e4..4f19ce1a04 100644
--- a/convert.c
+++ b/convert.c
@@ -997,6 +997,15 @@ static int apply_filter(const char *path, const char *src, 
size_t len,
return 0;
 }
 
+struct alias2enc {
+   struct hashmap_entry ent; /* must be the first member! */
+   const char *alias;
+   const char *encoding;
+};
+
+static int encoding_aliases_initialized;
+static struct hashmap encoding_map;
+
 static int read_convert_config(const char *var, const char *value, void *cb)
 {
const char *key, *name;
@@ -1040,6 +1049,36 @@ static int read_convert_config(const char *var, const 
char *value, void *cb)
drv->required = git_config_bool(var, value);
return 0;
}
+   } else if (
+   parse_config_key(var, "encoding", , , ) >= 0 &&
+   name && !strcmp(key, "insteadof")) {
+   /*
+* Encoding aliases are configured using
+* "encoding..insteadOf = ".
+*/
+   struct alias2enc *entry;
+   if (!value)
+   return config_error_nonbool(key);
+
+   if (!encoding_aliases_initialized) {
+   encoding_aliases_initialized = 1;
+   hashmap_init(_map, NULL, NULL, 0);
+   entry = NULL;
+   } else {
+   struct alias2enc hashkey;
+   hashmap_entry_init(, strihash(value));
+   hashkey.alias = value;
+   entry = hashmap_get(_map, , NULL);
+   }
+
+   if (!entry) {
+   entry = xmalloc(sizeof(*entry));
+   entry->encoding = xstrndup(name, namelen);
+   entry->alias = xstrdup(value);
+
+   hashmap_entry_init(entry, strihash(value));
+   hashmap_add(_map, entry);
+   }
}
 
return 0;
@@ -1225,6 +1264,17 @@ static const char *git_path_check_encoding(struct 
attr_check_item *check)
die(_("true/false are no valid working-tree-encodings"));
}
 
+   /* Check if an alias was defined for the encoding in the Git config */
+   if (encoding_aliases_initialized) {
+   struct alias2enc hashkey;
+   struct alias2enc *entry;
+   hashmap_entry_init(, strihash(value));
+   hashkey.alias = value;
+   entry = hashmap_get(_map, , NULL);
+   if (entry)
+   value = entry->encoding;
+   }
+
/* Don't encode to the default encoding */
if (same_encoding(value, default_encoding))
return NULL;
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index 12b8eb963a..d803e00cbe 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -242,4 +242,32 @@ test_expect_success 'check roundtrip 

[PATCH v1 1/2] convert: refactor conversion driver config parsing

2018-07-08 Thread larsxschneider
From: Lars Schneider 

Refactor conversion driver config parsing to ease the parsing of new
configs in a subsequent patch.

No functional change intended.

Signed-off-by: Lars Schneider 
---
 convert.c | 64 +++
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/convert.c b/convert.c
index 64d0d30e08..949bc783e4 100644
--- a/convert.c
+++ b/convert.c
@@ -1003,43 +1003,43 @@ static int read_convert_config(const char *var, const 
char *value, void *cb)
int namelen;
struct convert_driver *drv;
 
-   /*
-* External conversion drivers are configured using
-* "filter..variable".
-*/
-   if (parse_config_key(var, "filter", , , ) < 0 || !name)
-   return 0;
-   for (drv = user_convert; drv; drv = drv->next)
-   if (!strncmp(drv->name, name, namelen) && !drv->name[namelen])
-   break;
-   if (!drv) {
-   drv = xcalloc(1, sizeof(struct convert_driver));
-   drv->name = xmemdupz(name, namelen);
-   *user_convert_tail = drv;
-   user_convert_tail = &(drv->next);
-   }
+   if (parse_config_key(var, "filter", , , ) >= 0 && 
name) {
+   /*
+* External conversion drivers are configured using
+* "filter..variable".
+*/
+   for (drv = user_convert; drv; drv = drv->next)
+   if (!strncmp(drv->name, name, namelen) && 
!drv->name[namelen])
+   break;
+   if (!drv) {
+   drv = xcalloc(1, sizeof(struct convert_driver));
+   drv->name = xmemdupz(name, namelen);
+   *user_convert_tail = drv;
+   user_convert_tail = &(drv->next);
+   }
 
-   /*
-* filter..smudge and filter..clean specifies
-* the command line:
-*
-*  command-line
-*
-* The command-line will not be interpolated in any way.
-*/
+   /*
+* filter..smudge and filter..clean specifies
+* the command line:
+*
+*  command-line
+*
+* The command-line will not be interpolated in any way.
+*/
 
-   if (!strcmp("smudge", key))
-   return git_config_string(>smudge, var, value);
+   if (!strcmp("smudge", key))
+   return git_config_string(>smudge, var, value);
 
-   if (!strcmp("clean", key))
-   return git_config_string(>clean, var, value);
+   if (!strcmp("clean", key))
+   return git_config_string(>clean, var, value);
 
-   if (!strcmp("process", key))
-   return git_config_string(>process, var, value);
+   if (!strcmp("process", key))
+   return git_config_string(>process, var, value);
 
-   if (!strcmp("required", key)) {
-   drv->required = git_config_bool(var, value);
-   return 0;
+   if (!strcmp("required", key)) {
+   drv->required = git_config_bool(var, value);
+   return 0;
+   }
}
 
return 0;
-- 
2.18.0



[PATCH v1 0/2] convert: add alias support for 'working-tree-encoding' attributes

2018-07-08 Thread larsxschneider
From: Lars Schneider 

Hi,

this series adds Git config based alias support for
'working-tree-encoding' attributes that were introduced in 107642fe26
("convert: add 'working-tree-encoding' attribute", 2018-04-15).
The feature was suggested by Steve Groeger in [1].

The first patch is a refactoring with no functional change intended.
The second patch is the actual change.

Thanks,
Lars

[1] 
https://public-inbox.org/git/of5d40fe06.c18cd7cd-on002582b9.002b7a02-002582b9.002b7...@notes.na.collabserv.com/

Base Ref: master
Web-Diff: https://github.com/larsxschneider/git/commit/0cbee9bd8d
Checkout: git fetch https://github.com/larsxschneider/git encoding-alias-v1 && 
git checkout 0cbee9bd8d

Lars Schneider (2):
  convert: refactor conversion driver config parsing
  convert: add alias support for 'working-tree-encoding' attributes

 Documentation/gitattributes.txt  |  19 ++
 convert.c| 114 ++-
 t/t0028-working-tree-encoding.sh |  28 
 3 files changed, 129 insertions(+), 32 deletions(-)


base-commit: e3331758f12da22f4103eec7efe1b5304a9be5e9
--
2.18.0



[PATCH v4 4/4] builtin/rebase: support running "git rebase "

2018-07-08 Thread Pratik Karki
This patch gives life to the skeleton added in the previous patches:
With this change, we can perform a elementary rebase (without any
options).

It can be tested thusly by:

git -c rebase.usebuiltin=true rebase HEAD~2

The rebase backends (i.e. the shell script functions defined in
`git-rebase--`) are still at work here and the "builtin
rebase"'s purpose is simply to parse the options and set
everything up so that those rebase backends can do their work.

Note: We take an alternative approach here which is *not* to set the
environment variables via `run_command_v_opt_cd_env()` because those
variables would then be visible by processes spawned from the rebase
backends. Instead, we work hard to set them in the shell script snippet.

The next commits will handle and support all the wonderful rebase
options.

Signed-off-by: Pratik Karki 
---
 builtin/rebase.c | 238 ++-
 1 file changed, 237 insertions(+), 1 deletion(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index fab7fca37e..e38ea80874 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -9,6 +9,20 @@
 #include "exec-cmd.h"
 #include "argv-array.h"
 #include "dir.h"
+#include "packfile.h"
+#include "checkout.h"
+#include "refs.h"
+#include "quote.h"
+
+static GIT_PATH_FUNC(apply_dir, "rebase-apply");
+static GIT_PATH_FUNC(merge_dir, "rebase-merge");
+
+enum rebase_type {
+   REBASE_AM,
+   REBASE_MERGE,
+   REBASE_INTERACTIVE,
+   REBASE_PRESERVE_MERGES
+};
 
 static int use_builtin_rebase(void)
 {
@@ -28,8 +42,136 @@ static int use_builtin_rebase(void)
return ret;
 }
 
+static int apply_autostash(void)
+{
+   warning("TODO");
+   return 0;
+}
+
+struct rebase_options {
+   enum rebase_type type;
+   const char *state_dir;
+   struct commit *upstream;
+   const char *upstream_name;
+   char *head_name;
+   struct object_id orig_head;
+   struct commit *onto;
+   const char *onto_name;
+   const char *revisions;
+   const char *root;
+};
+
+static int finish_rebase(struct rebase_options *opts)
+{
+   struct strbuf dir = STRBUF_INIT;
+   const char *argv_gc_auto[] = { "gc", "--auto", NULL };
+
+   delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
+   apply_autostash();
+   close_all_packs(the_repository->objects);
+   /*
+* We ignore errors in 'gc --auto', since the
+* user should see them.
+*/
+   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
+   strbuf_addstr(, opts->state_dir);
+   remove_dir_recursively(, 0);
+   strbuf_release();
+
+   return 0;
+}
+
+static struct commit *peel_committish(const char *name)
+{
+   struct object *obj;
+   struct object_id oid;
+
+   if (get_oid(name, ))
+   return NULL;
+   obj = parse_object();
+   return (struct commit *)peel_to_type(name, 0, obj, OBJ_COMMIT);
+}
+
+static void add_var(struct strbuf *buf, const char *name, const char *value)
+{
+   strbuf_addstr(buf, name);
+   strbuf_addstr(buf, "=");
+   sq_quote_buf(buf, value);
+   strbuf_addstr(buf, "; ");
+}
+
+static int run_specific_rebase(struct rebase_options *opts)
+{
+   const char *argv[] = { NULL, NULL };
+   struct strbuf script_snippet = STRBUF_INIT;
+   int status;
+   const char *backend, *backend_func;
+
+   add_var(_snippet, "GIT_DIR", absolute_path(get_git_dir()));
+
+   add_var(_snippet, "upstream_name", opts->upstream_name);
+   add_var(_snippet, "upstream",
+oid_to_hex(>upstream->object.oid));
+   add_var(_snippet, "head_name", opts->head_name);
+   add_var(_snippet, "orig_head", oid_to_hex(>orig_head));
+   add_var(_snippet, "onto", oid_to_hex(>onto->object.oid));
+   add_var(_snippet, "onto_name", opts->onto_name);
+   add_var(_snippet, "revisions", opts->revisions);
+
+   switch (opts->type) {
+   case REBASE_AM:
+   backend = "git-rebase--am";
+   backend_func = "git_rebase__am";
+   break;
+   case REBASE_INTERACTIVE:
+   backend = "git-rebase--interactive";
+   backend_func = "git_rebase__interactive";
+   break;
+   case REBASE_MERGE:
+   backend = "git-rebase--merge";
+   backend_func = "git_rebase__merge";
+   break;
+   case REBASE_PRESERVE_MERGES:
+   backend = "git-rebase--preserve-merges";
+   backend_func = "git_rebase__preserve_merges";
+   break;
+   default:
+   BUG("Unhandled rebase type %d", opts->type);
+   break;
+   }
+
+   strbuf_addf(_snippet,
+   ". git-rebase--common && . %s && %s",
+   backend, backend_func);
+   argv[0] = script_snippet.buf;
+
+   status = run_command_v_opt(argv, RUN_USING_SHELL);
+   if (status == 0)
+   finish_rebase(opts);
+ 

[PATCH v4 3/4] sequencer: refactor the code to detach HEAD to checkout.c

2018-07-08 Thread Pratik Karki
In the upcoming builtin rebase, we will have to start by detaching
the HEAD, just like shell script version does. Essentially, we have
to do the same thing as `git checkout -q ^0 --`, in pure C.

The aforementioned functionality was already present in `sequencer.c`
in `do_reset()` function. But `do_reset()` performs more than detaching
the HEAD, and performs action specific to `sequencer.c`.

So this commit refactors out that part from `do_reset()`, and moves it
to a new function called `detach_head_to()`. As this function has
nothing to do with the sequencer, and everything to do with what `git
checkout -q ^0 --` does, we move that function to checkout.c.

This refactoring actually introduces a slight change in behavior:
previously, the index was locked before parsing the argument to the
todo command `reset`, while it now gets locked *after* that, in the
`detach_head_to()` function.

It does not make a huge difference, and the upside is that this closes
a few (unlikely) code paths where the index would not be unlocked upon
error.

Signed-off-by: Pratik Karki 
---
 checkout.c  | 64 +
 checkout.h  |  3 +++
 sequencer.c | 58 +---
 3 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/checkout.c b/checkout.c
index bdefc888ba..da68915fd7 100644
--- a/checkout.c
+++ b/checkout.c
@@ -2,6 +2,11 @@
 #include "remote.h"
 #include "refspec.h"
 #include "checkout.h"
+#include "unpack-trees.h"
+#include "lockfile.h"
+#include "refs.h"
+#include "tree.h"
+#include "cache-tree.h"
 
 struct tracking_name_data {
/* const */ char *src_ref;
@@ -42,3 +47,62 @@ const char *unique_tracking_name(const char *name, struct 
object_id *oid)
free(cb_data.dst_ref);
return NULL;
 }
+
+int detach_head_to(struct object_id *oid, const char *action,
+  const char *reflog_message)
+{
+   struct strbuf ref_name = STRBUF_INIT;
+   struct tree_desc desc;
+   struct lock_file lock = LOCK_INIT;
+   struct unpack_trees_options unpack_tree_opts;
+   struct tree *tree;
+   int ret = 0;
+
+   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
+   return -1;
+
+   memset(_tree_opts, 0, sizeof(unpack_tree_opts));
+   setup_unpack_trees_porcelain(_tree_opts, action);
+   unpack_tree_opts.head_idx = 1;
+   unpack_tree_opts.src_index = _index;
+   unpack_tree_opts.dst_index = _index;
+   unpack_tree_opts.fn = oneway_merge;
+   unpack_tree_opts.merge = 1;
+   unpack_tree_opts.update = 1;
+
+   if (read_cache_unmerged()) {
+   rollback_lock_file();
+   strbuf_release(_name);
+   return error_resolve_conflict(_(action));
+   }
+
+   if (!fill_tree_descriptor(, oid)) {
+   error(_("failed to find tree of %s"), oid_to_hex(oid));
+   rollback_lock_file();
+   free((void *)desc.buffer);
+   strbuf_release(_name);
+   return -1;
+   }
+
+   if (unpack_trees(1, , _tree_opts)) {
+   rollback_lock_file();
+   free((void *)desc.buffer);
+   strbuf_release(_name);
+   return -1;
+   }
+
+   tree = parse_tree_indirect(oid);
+   prime_cache_tree(_index, tree);
+
+   if (write_locked_index(_index, , COMMIT_LOCK) < 0)
+   ret = error(_("could not write index"));
+   free((void *)desc.buffer);
+
+   if (!ret)
+   ret = update_ref(reflog_message, "HEAD", oid,
+NULL, 0, UPDATE_REFS_MSG_ON_ERR);
+
+   strbuf_release(_name);
+   return ret;
+
+}
diff --git a/checkout.h b/checkout.h
index 9980711179..6ce46cf4e8 100644
--- a/checkout.h
+++ b/checkout.h
@@ -10,4 +10,7 @@
  */
 extern const char *unique_tracking_name(const char *name, struct object_id 
*oid);
 
+int detach_head_to(struct object_id *oid, const char *action,
+  const char *reflog_message);
+
 #endif /* CHECKOUT_H */
diff --git a/sequencer.c b/sequencer.c
index 5354d4d51e..106d518f4d 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -29,6 +29,7 @@
 #include "oidset.h"
 #include "commit-slab.h"
 #include "alias.h"
+#include "checkout.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2756,14 +2757,7 @@ static int do_reset(const char *name, int len, struct 
replay_opts *opts)
 {
struct strbuf ref_name = STRBUF_INIT;
struct object_id oid;
-   struct lock_file lock = LOCK_INIT;
-   struct tree_desc desc;
-   struct tree *tree;
-   struct unpack_trees_options unpack_tree_opts;
-   int ret = 0, i;
-
-   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
-   return -1;
+   int i;
 
if (len == 10 && !strncmp("[new root]", name, len)) {
if (!opts->have_squash_onto) {
@@ -2789,56 +2783,14 @@ static int do_reset(const char *name, int len, struct 
replay_opts *opts)
 

[PATCH v4 2/4] rebase: refactor common shell functions into their own file

2018-07-08 Thread Pratik Karki
The functions present in `git-legacy-rebase.sh` are used by the rebase
backends as they are implemented as shell script functions in the
`git-rebase--` files.

To make the `builtin/rebase.c` work, we have to provide support via
a Unix shell script snippet that uses these functions and so, we
want to use the rebase backends *directly* from the builtin rebase
without going through `git-legacy-rebase.sh`.

This commit extracts the functions to a separate file,
`git-rebase--common`, that will be read by `git-legacy-rebase.sh` and
by the shell script snippets which will be used extensively in the
following commits.

Signed-off-by: Pratik Karki 
---
 .gitignore|  1 +
 Makefile  |  1 +
 git-legacy-rebase.sh  | 69 ++-
 git-rebase--common.sh | 68 ++
 4 files changed, 72 insertions(+), 67 deletions(-)
 create mode 100644 git-rebase--common.sh

diff --git a/.gitignore b/.gitignore
index ec23959014..824141cba1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -117,6 +117,7 @@
 /git-read-tree
 /git-rebase
 /git-rebase--am
+/git-rebase--common
 /git-rebase--helper
 /git-rebase--interactive
 /git-rebase--merge
diff --git a/Makefile b/Makefile
index e88fe2e5fb..163e52ad1a 100644
--- a/Makefile
+++ b/Makefile
@@ -619,6 +619,7 @@ SCRIPT_SH += git-web--browse.sh
 SCRIPT_LIB += git-mergetool--lib
 SCRIPT_LIB += git-parse-remote
 SCRIPT_LIB += git-rebase--am
+SCRIPT_LIB += git-rebase--common
 SCRIPT_LIB += git-rebase--interactive
 SCRIPT_LIB += git-rebase--preserve-merges
 SCRIPT_LIB += git-rebase--merge
diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
index 19bdebb480..fbaf16420d 100755
--- a/git-legacy-rebase.sh
+++ b/git-legacy-rebase.sh
@@ -57,12 +57,7 @@ cd_to_toplevel
 LF='
 '
 ok_to_skip_pre_rebase=
-resolvemsg="
-$(gettext 'Resolve all conflicts manually, mark them as resolved with
-"git add/rm ", then run "git rebase --continue".
-You can instead skip this commit: run "git rebase --skip".
-To abort and get back to the state before "git rebase", run "git rebase 
--abort".')
-"
+
 squash_onto=
 unset onto
 unset restrict_revision
@@ -102,6 +97,7 @@ case "$(git config --bool commit.gpgsign)" in
 true)  gpg_sign_opt=-S ;;
 *) gpg_sign_opt= ;;
 esac
+. git-rebase--common
 
 read_basic_state () {
test -f "$state_dir/head-name" &&
@@ -132,67 +128,6 @@ read_basic_state () {
}
 }
 
-write_basic_state () {
-   echo "$head_name" > "$state_dir"/head-name &&
-   echo "$onto" > "$state_dir"/onto &&
-   echo "$orig_head" > "$state_dir"/orig-head &&
-   echo "$GIT_QUIET" > "$state_dir"/quiet &&
-   test t = "$verbose" && : > "$state_dir"/verbose
-   test -n "$strategy" && echo "$strategy" > "$state_dir"/strategy
-   test -n "$strategy_opts" && echo "$strategy_opts" > \
-   "$state_dir"/strategy_opts
-   test -n "$allow_rerere_autoupdate" && echo "$allow_rerere_autoupdate" > 
\
-   "$state_dir"/allow_rerere_autoupdate
-   test -n "$gpg_sign_opt" && echo "$gpg_sign_opt" > 
"$state_dir"/gpg_sign_opt
-   test -n "$signoff" && echo "$signoff" >"$state_dir"/signoff
-}
-
-output () {
-   case "$verbose" in
-   '')
-   output=$("$@" 2>&1 )
-   status=$?
-   test $status != 0 && printf "%s\n" "$output"
-   return $status
-   ;;
-   *)
-   "$@"
-   ;;
-   esac
-}
-
-move_to_original_branch () {
-   case "$head_name" in
-   refs/*)
-   message="rebase finished: $head_name onto $onto"
-   git update-ref -m "$message" \
-   $head_name $(git rev-parse HEAD) $orig_head &&
-   git symbolic-ref \
-   -m "rebase finished: returning to $head_name" \
-   HEAD $head_name ||
-   die "$(eval_gettext "Could not move back to \$head_name")"
-   ;;
-   esac
-}
-
-apply_autostash () {
-   if test -f "$state_dir/autostash"
-   then
-   stash_sha1=$(cat "$state_dir/autostash")
-   if git stash apply $stash_sha1 >/dev/null 2>&1
-   then
-   echo "$(gettext 'Applied autostash.')" >&2
-   else
-   git stash store -m "autostash" -q $stash_sha1 ||
-   die "$(eval_gettext "Cannot store \$stash_sha1")"
-   gettext 'Applying autostash resulted in conflicts.
-Your changes are safe in the stash.
-You can run "git stash pop" or "git stash drop" at any time.
-' >&2
-   fi
-   fi
-}
-
 finish_rebase () {
rm -f "$(git rev-parse --git-path REBASE_HEAD)"
apply_autostash &&
diff --git a/git-rebase--common.sh b/git-rebase--common.sh
new file mode 100644
index 00..7e39d22871
--- /dev/null
+++ b/git-rebase--common.sh
@@ -0,0 +1,68 @@
+
+resolvemsg="
+$(gettext 'Resolve all conflicts manually, 

[PATCH v4 1/4] rebase: start implementing it as a builtin

2018-07-08 Thread Pratik Karki
This commit imitates the strategy that was used to convert the
difftool to a builtin. We start by renaming the shell script
`git-rebase.sh` to `git-legacy-rebase.sh` and introduce a
`builtin/rebase.c` that simply executes the shell script version,
unless the config setting `rebase.useBuiltin` is set to `true`.

The motivation behind this is to rewrite all the functionality of the
shell script version in the aforementioned `rebase.c`, one by one and
be able to conveniently test new features by configuring
`rebase.useBuiltin`.

In the original difftool conversion, if sane_execvp() that attempts to
run the legacy scripted version returned with non-negative status, the
command silently exited without doing anything with success, but
sane_execvp() should not retun with non-negative status in the first
place, so we use die() to notice such an abnormal case.

We intentionally avoid reading the config directly to avoid
messing up the GIT_* environment variables when we need to fall back to
exec()ing the shell script. The test of builtin rebase can be done by
`git -c rebase.useBuiltin=true rebase ...`

Signed-off-by: Pratik Karki 
---
 .gitignore|  1 +
 Makefile  |  3 +-
 builtin.h |  1 +
 builtin/rebase.c  | 56 +++
 git-rebase.sh => git-legacy-rebase.sh |  0
 git.c |  6 +++
 6 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (100%)

diff --git a/.gitignore b/.gitignore
index 3284a1e9b1..ec23959014 100644
--- a/.gitignore
+++ b/.gitignore
@@ -78,6 +78,7 @@
 /git-init-db
 /git-interpret-trailers
 /git-instaweb
+/git-legacy-rebase
 /git-log
 /git-ls-files
 /git-ls-remote
diff --git a/Makefile b/Makefile
index 0cb6590f24..e88fe2e5fb 100644
--- a/Makefile
+++ b/Makefile
@@ -609,7 +609,7 @@ SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
-SCRIPT_SH += git-rebase.sh
+SCRIPT_SH += git-legacy-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
@@ -1059,6 +1059,7 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
+BUILTIN_OBJS += builtin/rebase.o
 BUILTIN_OBJS += builtin/rebase--helper.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
diff --git a/builtin.h b/builtin.h
index 0362f1ce25..44651a447f 100644
--- a/builtin.h
+++ b/builtin.h
@@ -202,6 +202,7 @@ extern int cmd_prune_packed(int argc, const char **argv, 
const char *prefix);
 extern int cmd_pull(int argc, const char **argv, const char *prefix);
 extern int cmd_push(int argc, const char **argv, const char *prefix);
 extern int cmd_read_tree(int argc, const char **argv, const char *prefix);
+extern int cmd_rebase(int argc, const char **argv, const char *prefix);
 extern int cmd_rebase__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_receive_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_reflog(int argc, const char **argv, const char *prefix);
diff --git a/builtin/rebase.c b/builtin/rebase.c
new file mode 100644
index 00..fab7fca37e
--- /dev/null
+++ b/builtin/rebase.c
@@ -0,0 +1,56 @@
+/*
+ * "git rebase" builtin command
+ *
+ * Copyright (c) 2018 Pratik Karki
+ */
+
+#include "builtin.h"
+#include "run-command.h"
+#include "exec-cmd.h"
+#include "argv-array.h"
+#include "dir.h"
+
+static int use_builtin_rebase(void)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   struct strbuf out = STRBUF_INIT;
+   int ret;
+
+   argv_array_pushl(,
+"config", "--bool", "rebase.usebuiltin", NULL);
+   cp.git_cmd = 1;
+   if (capture_command(, , 6))
+   return 0;
+
+   strbuf_trim();
+   ret = !strcmp("true", out.buf);
+   strbuf_release();
+   return ret;
+}
+
+int cmd_rebase(int argc, const char **argv, const char *prefix)
+{
+   /*
+* NEEDSWORK: Once the builtin rebase has been tested enough
+* and git-legacy-rebase.sh is retired to contrib/, this preamble
+* can be removed.
+*/
+
+   if (!use_builtin_rebase()) {
+   const char *path = mkpath("%s/git-legacy-rebase",
+ git_exec_path());
+
+   if (sane_execvp(path, (char **)argv) < 0)
+   die_errno("could not exec %s", path);
+   else
+   die("sane_execvp() returned???");
+   }
+
+   if (argc != 2)
+   die("Usage: %s ", argv[0]);
+   prefix = setup_git_directory();
+   trace_repo_setup(prefix);
+   setup_work_tree();
+
+   die("TODO");
+}
diff --git a/git-rebase.sh b/git-legacy-rebase.sh
similarity index 100%
rename from 

[GSoC] [PATCH v4 0/4] rebase: rewrite rebase in C

2018-07-08 Thread Pratik Karki
As a GSoC project, I have been working on the builtin rebase.

The motivation behind the rewrite of rebase i.e. from shell script to C
are for following reasons:

1.  Writing shell scripts and getting it to production is much faster
than doing the equivalent in C but lacks in performance and extra
workarounds are needed for non-POSIX platforms.

2.  Git for Windows is at loss as the installer size increases due to
addition of extra dependencies for the shell scripts which are usually
available in POSIX compliant platforms.

This series of patches serves to demonstrate a minimal builtin rebase
which supports running `git rebase ` and also serves to ask for
reviews.

Changes since v3:

  -  Fix commit message of `rebase: start implementing it as a builtin`.

  -  Acknowledge Junio's style reviews.

  -  Acknowledge Johannes Schindelin's review.

Pratik Karki (4):
  rebase: start implementing it as a builtin
  rebase: refactor common shell functions into their own file
  sequencer: refactor the code to detach HEAD to checkout.c
  builtin/rebase: support running "git rebase "

 .gitignore|   2 +
 Makefile  |   4 +-
 builtin.h |   1 +
 builtin/rebase.c  | 292 ++
 checkout.c|  64 ++
 checkout.h|   3 +
 git-rebase.sh => git-legacy-rebase.sh |  69 +-
 git-rebase--common.sh |  68 ++
 git.c |   6 +
 sequencer.c   |  58 +
 10 files changed, 446 insertions(+), 121 deletions(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (89%)
 create mode 100644 git-rebase--common.sh

-- 
2.18.0



[RFC PATCH 3/6] convert.c: replace "\e" escapes with "\033".

2018-07-08 Thread Beat Bolli
The "\e" escape is not defined in ISO C.

While on this line, add a missing space after the comma.

Signed-off-by: Beat Bolli 
---
 convert.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index 64d0d30e08..edebb946f5 100644
--- a/convert.c
+++ b/convert.c
@@ -334,7 +334,7 @@ static void trace_encoding(const char *context, const char 
*path,
strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
encoding);
for (i = 0; i < len && buf; ++i) {
strbuf_addf(
-   ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   , "| \033[2m%2i:\033[0m %2x \033[2m%c\033[0m%c",
i,
(unsigned char) buf[i],
(buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
-- 
2.15.0.rc1.299.gda03b47c3



[RFC PATCH 5/6] string-list.c: avoid conversion from void * to function pointer

2018-07-08 Thread Beat Bolli
ISO C forbids the conversion of void pointers to function pointers.
Introduce a context struct that encapsulates the function pointer.

Signed-off-by: Beat Bolli 
---
 string-list.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/string-list.c b/string-list.c
index a0cf0cfe88..771c455098 100644
--- a/string-list.c
+++ b/string-list.c
@@ -224,18 +224,28 @@ struct string_list_item *string_list_append(struct 
string_list *list,
list->strdup_strings ? xstrdup(string) : (char 
*)string);
 }
 
+/*
+ * Encapsulate the compare function pointer because ISO C99 forbids
+ * casting from void * to a function pointer and vice versa.
+ */
+struct string_list_sort_ctx
+{
+   compare_strings_fn cmp;
+};
+
 static int cmp_items(const void *a, const void *b, void *ctx)
 {
-   compare_strings_fn cmp = ctx;
+   struct string_list_sort_ctx *sort_ctx = ctx;
const struct string_list_item *one = a;
const struct string_list_item *two = b;
-   return cmp(one->string, two->string);
+   return sort_ctx->cmp(one->string, two->string);
 }
 
 void string_list_sort(struct string_list *list)
 {
-   QSORT_S(list->items, list->nr, cmp_items,
-   list->cmp ? list->cmp : strcmp);
+   struct string_list_sort_ctx sort_ctx = {list->cmp ? list->cmp : strcmp};
+
+   QSORT_S(list->items, list->nr, cmp_items, _ctx);
 }
 
 struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
-- 
2.15.0.rc1.299.gda03b47c3



[RFC PATCH 2/6] refs/refs-internal.h: avoid forward declaration of an enum

2018-07-08 Thread Beat Bolli
Include iterator.h to define enum iterator_selection.

Signed-off-by: Beat Bolli 
---
 refs/refs-internal.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index dd834314bd..a78b5cb803 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -1,6 +1,8 @@
 #ifndef REFS_REFS_INTERNAL_H
 #define REFS_REFS_INTERNAL_H
 
+#include "iterator.h"   /* for enum iterator_selection */
+
 /*
  * Data structures and functions for the internal use of the refs
  * module. Code outside of the refs module should use only the public
-- 
2.15.0.rc1.299.gda03b47c3



[RFC PATCH 6/6] utf8.c: avoid char overflow

2018-07-08 Thread Beat Bolli
In ISO C, char constants must be in the range -128..127. Change the BOM
constants to unsigned char to avoid overflow.

Signed-off-by: Beat Bolli 
---
 utf8.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/utf8.c b/utf8.c
index d55e20c641..833ce00617 100644
--- a/utf8.c
+++ b/utf8.c
@@ -561,15 +561,15 @@ char *reencode_string_len(const char *in, int insz,
 #endif
 
 static int has_bom_prefix(const char *data, size_t len,
- const char *bom, size_t bom_len)
+ const unsigned char *bom, size_t bom_len)
 {
return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
 }
 
-static const char utf16_be_bom[] = {0xFE, 0xFF};
-static const char utf16_le_bom[] = {0xFF, 0xFE};
-static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
-static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
+static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
+static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
+static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
 
 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
 {
-- 
2.15.0.rc1.299.gda03b47c3



[RFC PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-08 Thread Beat Bolli
The marco GIT_PATH_FUNC expands to a complete statement including the
semicolon. Remove two extra trailing semicolons.

Signed-off-by: Beat Bolli 
---
 sequencer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5354d4d51e..66e7073995 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -62,12 +62,12 @@ static GIT_PATH_FUNC(rebase_path_done, "rebase-merge/done")
  * The file to keep track of how many commands were already processed (e.g.
  * for the prompt).
  */
-static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum");
+static GIT_PATH_FUNC(rebase_path_msgnum, "rebase-merge/msgnum")
 /*
  * The file to keep track of how many commands are to be processed in total
  * (e.g. for the prompt).
  */
-static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end");
+static GIT_PATH_FUNC(rebase_path_msgtotal, "rebase-merge/end")
 /*
  * The commit message that is planned to be used for any changes that
  * need to be committed following a user interaction.
-- 
2.15.0.rc1.299.gda03b47c3



[RFC PATCH 1/6] connect.h: avoid forward declaration of an enum

2018-07-08 Thread Beat Bolli
Include protocol.h to define enum protocol_version.

Signed-off-by: Beat Bolli 
---
 connect.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/connect.h b/connect.h
index 0e69c6709c..c86f862f2f 100644
--- a/connect.h
+++ b/connect.h
@@ -1,6 +1,8 @@
 #ifndef CONNECT_H
 #define CONNECT_H
 
+#include "protocol.h"   /* for enum protocol_version */
+
 #define CONNECT_VERBOSE   (1u << 0)
 #define CONNECT_DIAG_URL  (1u << 1)
 #define CONNECT_IPV4  (1u << 2)
-- 
2.15.0.rc1.299.gda03b47c3



[RFC PATCH 0/6] Compile cleanly in pedantic mode

2018-07-08 Thread Beat Bolli
While developing 6aaded550 ("builtin/config: work around an unsized
array forward declaration", 2018-07-05), I have compiled Git with
CFLAGS="-std=c99 -pedantic".

This is an RFC patch series that fixes a few compiler warnings when
compiling with these options, always assuming that this is a worthwile
goal.

Note that all warnings were produced by -pedantic; the C99 standard
option by itself didn't cause any of them.

The warnings were:

1) Char arrays initialized from a parenthesized string.

Suppressed by defining USE_PARENS_AROUND_GETTEXT_N to 0
globally. This was done just to keep the amount of warnings
manageable; this series leaves that knob alone. The advantage of
not mistakenly concatenating two translated strings is greater.

2) connect.h, refs/refs-internal.h: Forward reference to an enum.

Added two #includes that define the enums. This was already
(inconclusively) talked about in [0].

3) convert.c: Invalid escape sequence "\e".

Replaced with "\033".

4) seqencer.c: Empty statements at top level.

Removed the extra semicolons.

5) string-list.c: Forbidden to cast from void * to a function pointer and
   vice versa.

Encapsulated the function pointer in a context struct. This is
controversial because it has a performance impact, namely one
additional pointer dereference per string comparison. An
alternative might be to use multiple casts via intptr_t. But
I'm not sure if this is worth the trouble.

6) utf8.c: overflow of char values.

Use unsigned char for the BOM constants.

This series has patches for 2) to 6).

Regards,
Beat

[0] 
https://public-inbox.org/git/53ab8626-f862-a732-b369-abeab69a4...@ramsayjones.plus.com/T/


Re: Croatian version of Pro Git book

2018-07-08 Thread josip zamboni

Dear Eric,

Thank you for the prompt response and the links.

Cheers, Josip


On 8/07/2018 19:08, Eric Sunshine wrote:

On Sun, Jul 8, 2018 at 4:23 AM josip zamboni
 wrote:

I've noticed there is no Croatian version of Pro Git Book and I'd like
to produce it. If you agree, do let me know the contact details of
person(s) I should be dealing with.

The Pro Git book is a separate project from Git itself. They have a
general guide for contributing here [1], and a guide for translators
here [2].

[1]: https://github.com/progit/progit2/blob/master/CONTRIBUTING.md
[2]: https://github.com/progit/progit2/blob/master/TRANSLATING.md


Re: Croatian version of Pro Git book

2018-07-08 Thread Eric Sunshine
On Sun, Jul 8, 2018 at 4:23 AM josip zamboni
 wrote:
> I've noticed there is no Croatian version of Pro Git Book and I'd like
> to produce it. If you agree, do let me know the contact details of
> person(s) I should be dealing with.

The Pro Git book is a separate project from Git itself. They have a
general guide for contributing here [1], and a guide for translators
here [2].

[1]: https://github.com/progit/progit2/blob/master/CONTRIBUTING.md
[2]: https://github.com/progit/progit2/blob/master/TRANSLATING.md


Croatian version of Pro Git book

2018-07-08 Thread josip zamboni

Dear all,

I've noticed there is no Croatian version of Pro Git Book and I'd like 
to produce it. If you agree, do let me know the contact details of 
person(s) I should be dealing with.


With kind regards,

Josip Zamboni

Castle Hill, NSW 2154, Australia

Phone: +614 03 234 333