Re: [GSoC] Proposal: turn git-add--interactive.perl into a builtin

2017-03-30 Thread Daniel Ferreira (theiostream)
Hi Stefan & Johannes,

Thank you for the precious feedback on the proposal. I don't see much
sense in sending a full "v2" of it and have you read it all over
again, so I'll just answer to your comments directly.

Also, although the GSoC website allows me to send a "proposal draft"
to you through the website, since I've already sent it here that
shouldn't be necessary, correct? I intend to use it just to send the
final thing.

On Wed, Mar 29, 2017 at 9:01 PM, Johannes Schindelin
 wrote:
> On Tue, 28 Mar 2017, Stefan Beller wrote:
>
>> On Sat, Mar 25, 2017 at 8:15 PM, Daniel Ferreira (theiostream)
>>  wrote:
>>
>> > SYNOPSIS
>> > There are many advantages to converting parts of git that are still
>> > scripts to C builtins, among which execution speed, improved
>> > compatibility and code deduplication.
>>
>> agreed.
>
> I would even add portability. But yeah, speed is a big thing. I am an
> extensive user of `git add -p` (which is backed by
> git-add--interactive.perl) and it is slow as molasses on Windows, just
> because it is a Perl script (and the Perl interpreter needs to emulate
> POSIX functionality that is frequently not even needed, such as: copying
> all memory and reopening all file descriptors in a fork() call only to
> exec() git.exe right away, tossing all of the diligently work into the
> dustbin).

Thanks for this example – it hadn't come to my mind since I don't use
Git on Windows. I'll be sure to complement the synopsis with it. :)

>
>> > FEASIBILITY
>> >
>> > There was only one discussion regarding the feasibility of its porting
>> > (https://public-inbox.org/git/CAP8UFD2PcBsU6=FK4OHVrB7E98ycohS_0pYcbCBar=of1hl...@mail.gmail.com/).
>> > It resulted in a consensus that doing it would be a task too large –
>> > although interesting – for GSoC 2015 based on the amount of its lines
>> > of code. It is, however, only a few lines larger than
>> > git-rebase--interactive, which has been considered an appropriate
>> > idea. As such, it looks like a possible project for three months of
>> > full-time work.
>>
>> ok, it sounds a challenging project. (currently counting 1750 lines of
>> code). Scrolling over the source code, there are quite a couple of
>> functions, where the direct equivalent in C springs to mind.
>>
>> run_cmd_pipe -> see run-command.h
>> unquote_path -> unquote_c_style ?
>> refresh -> update_index_if_able()
>> list_modified -> iterate over "const struct cache_entry *ce = 
>> active_cache[i];"

Thank you for these functions. I don't think I will be able to specify
them in detail as part of the projected timeline (e.g. "June 1:
convert calls to refresh() to use update_index_if_able()") already
because there is not enough time prior to the proposal deadline to
study their behavior in detail, and I like to avoid talking about
things I don't fully understand. Although I think I can cite them as
examples for a thesis I had put elsewhere in the proposal that "Git
APIs in Perl already have functional equivalents in C".

Also, they will be great for my early investigation stage into
git-add--interactive. :) Once more, thanks for having listed them.

> Yes, I think it would be more important to acquaint oneself with the
> idiosynchracies of Git's internal "API" than to get familiar with Perl:
> interpreting what obscure Perl code does is something I would gladly do as
> a mentor.

That's really nice! I usually don't get stuck when trying to
understand code in languages I'm not too well acquainted with, but I
figured getting more familiar with Perl would speed development up.
But it does make sense that this "prior to May 4" might be better
invested learning about git's internals than Perl.

Question: do you suggest any pending bugfix to git-add--interactive or
to something related that might give some useful knowledge in advance?
(for the pre-code period). My microproject involves playing with the
dir_iterator interface, which is a great exercise in code refactoring
but really does not teach me too much about Git's architecture.

Even if you do not have an answer to this, I'm pretty sure I'll keep
this commitment to submitting some patch series somehow related to
git-add before GSoC begins, especially after this comment from
Johannes.

>
>> > PROJECTED TIMELINE
>> > - Prior to May 4
>> > -- Refine my basic knowledge of Perl
>> > -- Craft one or two small patches to some of Git's Perl components
>> > (preferentially to git-add--interactive itself) to improve my
>> > understanding of the language and of how Git's Perl scripts actually
>> > work

So yeah, I think this could be rewritten as:

- Prior to May 4
-- Craft two or three small patch series to git-add--interactive or
related components to improve my understanding of Git's internal
architecture, especially that related to git-add.

>
>> > - May 4 - May 30
>> > -- Clarify implementation details with my mentor, and work on a more
>> > detailed roadmap for the project
>> > -- Investigate 

Re: [PATCH] http.postbuffer: make a size_t

2017-03-30 Thread Torsten Bögershausen



On 30/03/17 22:29, David Turner wrote:

Unfortunately, in order to push some large repos, the http postbuffer
must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
we just malloc a larger buffer.

Signed-off-by: David Turner 
---
 cache.h  |  1 +
 config.c | 17 +
 http.c   |  2 +-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index fbdf7a815a..a8c1b65db0 100644
--- a/cache.h
+++ b/cache.h
@@ -1900,6 +1900,7 @@ extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
 extern int64_t git_config_int64(const char *, const char *);
 extern unsigned long git_config_ulong(const char *, const char *);
+extern size_t git_config_size_t(const char *, const char *);
 extern int git_config_bool_or_int(const char *, const char *, int *);
 extern int git_config_bool(const char *, const char *);
 extern int git_config_maybe_bool(const char *, const char *);
diff --git a/config.c b/config.c
index 1a4d85537b..7b706cf27a 100644
--- a/config.c
+++ b/config.c
@@ -834,6 +834,15 @@ int git_parse_ulong(const char *value, unsigned long *ret)
return 1;
 }

+static size_t git_parse_size_t(const char *value, unsigned long *ret)
+{
+   size_t tmp;
+   if (!git_parse_signed(value, , 
maximum_unsigned_value_of_type(size_t)))
+   return 0;
+   *ret = tmp;
+   return 1;
+}

What is the return value here ?
Isn't it a size_t we want ?
(There was a recent discussion about "unsigned long" vs "size_t", which
 are the same on many systems, but not under Win64)
Would the following work ?

static int git_parse_size_t(const char *value, size_t *ret)
{
if (!git_parse_signed(value, ret, 
maximum_unsigned_value_of_type(size_t)))
return 0;
return 1;
}

[]

+size_t git_config_size_t(const char *name, const char *value)
+{
+   unsigned long ret;
+   if (!git_parse_size_t(value, ))
+   die_bad_number(name, value);
+   return ret;
+}
+

Same here:
size_t git_config_size_t(const char *name, const char *value)
{
size_t ret;
if (!git_parse_size_t(value, ))
die_bad_number(name, value);
return ret;
}


Dear, I need your help to invest in your country,

2017-03-30 Thread aishagaddaf...@ono.com
I came across your contact during my private search
Mrs Aisha  Al-Qaddafi is my name, the only daughter of late Libyan 
president, I have funds the sum
of $27.5 million USD for investment, I am interested in you for 
investment project assistance in your country, 
i shall compensate you  30% of the total sum after the funds are 
transfer into your account,
kindly reply for more information via my mailbox: mrs.aishaalqaddafi0
(at)gmail.com
Greetings from Mrs Aisha Al-Qaddafi
Mrs Aisha Al-Qaddafi


[PATCH v3 02/20] Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ

2017-03-30 Thread brian m. carlson
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 40 hex characters, use the
constant GIT_MAX_HEXSZ, which is designed to be suitable for
allocations, instead of GIT_SHA1_HEXSZ.  This will ease the transition
down the line by distinguishing between places where we need to allocate
memory suitable for the largest hash from those where we need to handle
the current hash.

Signed-off-by: brian m. carlson 
---
 bisect.c  | 2 +-
 builtin/blame.c   | 4 ++--
 builtin/merge-index.c | 2 +-
 builtin/merge.c   | 2 +-
 builtin/rev-list.c| 2 +-
 diff.c| 4 ++--
 hex.c | 2 +-
 sha1_file.c   | 2 +-
 sha1_name.c   | 6 +++---
 transport.c   | 2 +-
 10 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/bisect.c b/bisect.c
index 30808cadf7..21c3e34636 100644
--- a/bisect.c
+++ b/bisect.c
@@ -682,7 +682,7 @@ static int is_expected_rev(const struct object_id *oid)
 
 static int bisect_checkout(const unsigned char *bisect_rev, int no_checkout)
 {
-   char bisect_rev_hex[GIT_SHA1_HEXSZ + 1];
+   char bisect_rev_hex[GIT_MAX_HEXSZ + 1];
 
memcpy(bisect_rev_hex, sha1_to_hex(bisect_rev), GIT_SHA1_HEXSZ + 1);
update_ref(NULL, "BISECT_EXPECTED_REV", bisect_rev, NULL, 0, 
UPDATE_REFS_DIE_ON_ERR);
diff --git a/builtin/blame.c b/builtin/blame.c
index f7aa95f4ba..07506a3e45 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1890,7 +1890,7 @@ static void emit_porcelain(struct scoreboard *sb, struct 
blame_entry *ent,
int cnt;
const char *cp;
struct origin *suspect = ent->suspect;
-   char hex[GIT_SHA1_HEXSZ + 1];
+   char hex[GIT_MAX_HEXSZ + 1];
 
oid_to_hex_r(hex, >commit->object.oid);
printf("%s %d %d %d\n",
@@ -1928,7 +1928,7 @@ static void emit_other(struct scoreboard *sb, struct 
blame_entry *ent, int opt)
const char *cp;
struct origin *suspect = ent->suspect;
struct commit_info ci;
-   char hex[GIT_SHA1_HEXSZ + 1];
+   char hex[GIT_MAX_HEXSZ + 1];
int show_raw_time = !!(opt & OUTPUT_RAW_TIMESTAMP);
 
get_commit_info(suspect->commit, , 1);
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 2d1b6db6bd..c99443b095 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -9,7 +9,7 @@ static int merge_entry(int pos, const char *path)
 {
int found;
const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-   char hexbuf[4][GIT_SHA1_HEXSZ + 1];
+   char hexbuf[4][GIT_MAX_HEXSZ + 1];
char ownbuf[4][60];
 
if (pos >= active_nr)
diff --git a/builtin/merge.c b/builtin/merge.c
index 7554b8d412..a2cceea3fb 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1296,7 +1296,7 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
if (verify_signatures) {
for (p = remoteheads; p; p = p->next) {
struct commit *commit = p->item;
-   char hex[GIT_SHA1_HEXSZ + 1];
+   char hex[GIT_MAX_HEXSZ + 1];
struct signature_check signature_check;
memset(_check, 0, sizeof(signature_check));
 
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 0aa93d5891..bcf77f0b8a 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -212,7 +212,7 @@ static void print_var_int(const char *var, int val)
 static int show_bisect_vars(struct rev_list_info *info, int reaches, int all)
 {
int cnt, flags = info->flags;
-   char hex[GIT_SHA1_HEXSZ + 1] = "";
+   char hex[GIT_MAX_HEXSZ + 1] = "";
struct commit_list *tried;
struct rev_info *revs = info->revs;
 
diff --git a/diff.c b/diff.c
index 58cb72d7e7..cad9cc9661 100644
--- a/diff.c
+++ b/diff.c
@@ -398,7 +398,7 @@ static struct diff_tempfile {
 */
const char *name;
 
-   char hex[GIT_SHA1_HEXSZ + 1];
+   char hex[GIT_MAX_HEXSZ + 1];
char mode[10];
 
/*
@@ -4219,7 +4219,7 @@ const char *diff_aligned_abbrev(const struct object_id 
*oid, int len)
 * uniqueness across all objects (statistically speaking).
 */
if (abblen < GIT_SHA1_HEXSZ - 3) {
-   static char hex[GIT_SHA1_HEXSZ + 1];
+   static char hex[GIT_MAX_HEXSZ + 1];
if (len < abblen && abblen <= len + 2)
xsnprintf(hex, sizeof(hex), "%s%.*s", abbrev, 
len+3-abblen, "..");
else
diff --git a/hex.c b/hex.c
index eab7b626ee..28b44118cb 100644
--- a/hex.c
+++ b/hex.c
@@ -85,7 +85,7 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)
 char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
-   static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
+   static char hexbuffer[4][GIT_MAX_HEXSZ + 1];
bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);

[PATCH v3 10/20] sha1_name: convert struct disambiguate_state to object_id

2017-03-30 Thread brian m. carlson
Convert struct disambiguate_state to use struct object_id by changing
the structure definition and applying the following semantic patch:

@@
struct disambiguate_state E1;
@@
- E1.bin_pfx
+ E1.bin_pfx.hash

@@
struct disambiguate_state *E1;
@@
- E1->bin_pfx
+ E1->bin_pfx.hash

@@
struct disambiguate_state E1;
@@
- E1.candidate
+ E1.candidate.hash

@@
struct disambiguate_state *E1;
@@
- E1->candidate
+ E1->candidate.hash

This conversion is needed so we can convert disambiguate_hint_fn later.

Signed-off-by: brian m. carlson 
---
 sha1_name.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 3db166b40b..cf6f4be0c6 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -16,11 +16,11 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, 
void *);
 struct disambiguate_state {
int len; /* length of prefix in hex chars */
char hex_pfx[GIT_MAX_HEXSZ + 1];
-   unsigned char bin_pfx[GIT_MAX_RAWSZ];
+   struct object_id bin_pfx;
 
disambiguate_hint_fn fn;
void *cb_data;
-   unsigned char candidate[GIT_MAX_RAWSZ];
+   struct object_id candidate;
unsigned candidate_exists:1;
unsigned candidate_checked:1;
unsigned candidate_ok:1;
@@ -37,10 +37,10 @@ static void update_candidates(struct disambiguate_state 
*ds, const unsigned char
}
if (!ds->candidate_exists) {
/* this is the first candidate */
-   hashcpy(ds->candidate, current);
+   hashcpy(ds->candidate.hash, current);
ds->candidate_exists = 1;
return;
-   } else if (!hashcmp(ds->candidate, current)) {
+   } else if (!hashcmp(ds->candidate.hash, current)) {
/* the same as what we already have seen */
return;
}
@@ -52,14 +52,14 @@ static void update_candidates(struct disambiguate_state 
*ds, const unsigned char
}
 
if (!ds->candidate_checked) {
-   ds->candidate_ok = ds->fn(ds->candidate, ds->cb_data);
+   ds->candidate_ok = ds->fn(ds->candidate.hash, ds->cb_data);
ds->disambiguate_fn_used = 1;
ds->candidate_checked = 1;
}
 
if (!ds->candidate_ok) {
/* discard the candidate; we know it does not satisfy fn */
-   hashcpy(ds->candidate, current);
+   hashcpy(ds->candidate.hash, current);
ds->candidate_checked = 0;
return;
}
@@ -151,7 +151,7 @@ static void unique_in_pack(struct packed_git *p,
int cmp;
 
current = nth_packed_object_sha1(p, mid);
-   cmp = hashcmp(ds->bin_pfx, current);
+   cmp = hashcmp(ds->bin_pfx.hash, current);
if (!cmp) {
first = mid;
break;
@@ -170,7 +170,7 @@ static void unique_in_pack(struct packed_git *p,
 */
for (i = first; i < num && !ds->ambiguous; i++) {
current = nth_packed_object_sha1(p, i);
-   if (!match_sha(ds->len, ds->bin_pfx, current))
+   if (!match_sha(ds->len, ds->bin_pfx.hash, current))
break;
update_candidates(ds, current);
}
@@ -213,12 +213,12 @@ static int finish_object_disambiguation(struct 
disambiguate_state *ds,
 * same repository!
 */
ds->candidate_ok = (!ds->disambiguate_fn_used ||
-   ds->fn(ds->candidate, ds->cb_data));
+   ds->fn(ds->candidate.hash, ds->cb_data));
 
if (!ds->candidate_ok)
return SHORT_NAME_AMBIGUOUS;
 
-   hashcpy(sha1, ds->candidate);
+   hashcpy(sha1, ds->candidate.hash);
return 0;
 }
 
@@ -332,7 +332,7 @@ static int init_object_disambiguation(const char *name, int 
len,
ds->hex_pfx[i] = c;
if (!(i & 1))
val <<= 4;
-   ds->bin_pfx[i >> 1] |= val;
+   ds->bin_pfx.hash[i >> 1] |= val;
}
 
ds->len = len;


[PATCH v3 03/20] Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ

2017-03-30 Thread brian m. carlson
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 20 bytes, use the constant
GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead
of GIT_SHA1_RAWSZ.  This will ease the transition down the line by
distinguishing between places where we need to allocate memory suitable
for the largest hash from those where we need to handle the current
hash.

Signed-off-by: brian m. carlson 
---
 builtin/patch-id.c | 2 +-
 builtin/receive-pack.c | 2 +-
 cache.h| 2 +-
 patch-ids.c| 2 +-
 patch-ids.h| 2 +-
 sha1_file.c| 4 ++--
 sha1_name.c| 4 ++--
 wt-status.h| 2 +-
 8 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index a84d0003a3..81552e02e4 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -55,7 +55,7 @@ static int scan_hunk_header(const char *p, int *p_before, int 
*p_after)
 
 static void flush_one_hunk(struct object_id *result, git_SHA_CTX *ctx)
 {
-   unsigned char hash[GIT_SHA1_RAWSZ];
+   unsigned char hash[GIT_MAX_RAWSZ];
unsigned short carry = 0;
int i;
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index fb2a090a0c..feafb076a4 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1162,7 +1162,7 @@ static void check_aliased_update(struct command *cmd, 
struct string_list *list)
const char *dst_name;
struct string_list_item *item;
struct command *dst_cmd;
-   unsigned char sha1[GIT_SHA1_RAWSZ];
+   unsigned char sha1[GIT_MAX_RAWSZ];
int flag;
 
strbuf_addf(, "%s%s", get_git_namespace(), cmd->ref_name);
diff --git a/cache.h b/cache.h
index 02b6c753a3..db14464bce 100644
--- a/cache.h
+++ b/cache.h
@@ -980,7 +980,7 @@ extern char *sha1_pack_index_name(const unsigned char 
*sha1);
 extern const char *find_unique_abbrev(const unsigned char *sha1, int len);
 extern int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len);
 
-extern const unsigned char null_sha1[GIT_SHA1_RAWSZ];
+extern const unsigned char null_sha1[GIT_MAX_RAWSZ];
 extern const struct object_id null_oid;
 
 static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
diff --git a/patch-ids.c b/patch-ids.c
index ce285c2e0c..fa8f11de82 100644
--- a/patch-ids.c
+++ b/patch-ids.c
@@ -71,7 +71,7 @@ static int init_patch_id_entry(struct patch_id *patch,
   struct commit *commit,
   struct patch_ids *ids)
 {
-   unsigned char header_only_patch_id[GIT_SHA1_RAWSZ];
+   unsigned char header_only_patch_id[GIT_MAX_RAWSZ];
 
patch->commit = commit;
if (commit_patch_id(commit, >diffopts, header_only_patch_id, 1))
diff --git a/patch-ids.h b/patch-ids.h
index 0f34ea11ea..b9e5751f8e 100644
--- a/patch-ids.h
+++ b/patch-ids.h
@@ -3,7 +3,7 @@
 
 struct patch_id {
struct hashmap_entry ent;
-   unsigned char patch_id[GIT_SHA1_RAWSZ];
+   unsigned char patch_id[GIT_MAX_RAWSZ];
struct commit *commit;
 };
 
diff --git a/sha1_file.c b/sha1_file.c
index e763000d34..b4666ee5c2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1606,7 +1606,7 @@ static void mark_bad_packed_object(struct packed_git *p,
if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
return;
p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
- st_mult(GIT_SHA1_RAWSZ,
+ st_mult(GIT_MAX_RAWSZ,
  st_add(p->num_bad_objects, 1)));
hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
p->num_bad_objects++;
@@ -3913,7 +3913,7 @@ static int check_stream_sha1(git_zstream *stream,
 const unsigned char *expected_sha1)
 {
git_SHA_CTX c;
-   unsigned char real_sha1[GIT_SHA1_RAWSZ];
+   unsigned char real_sha1[GIT_MAX_RAWSZ];
unsigned char buf[4096];
unsigned long total_read;
int status = Z_OK;
diff --git a/sha1_name.c b/sha1_name.c
index 964201bc26..3db166b40b 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -16,11 +16,11 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, 
void *);
 struct disambiguate_state {
int len; /* length of prefix in hex chars */
char hex_pfx[GIT_MAX_HEXSZ + 1];
-   unsigned char bin_pfx[GIT_SHA1_RAWSZ];
+   unsigned char bin_pfx[GIT_MAX_RAWSZ];
 
disambiguate_hint_fn fn;
void *cb_data;
-   unsigned char candidate[GIT_SHA1_RAWSZ];
+   unsigned char candidate[GIT_MAX_RAWSZ];
unsigned candidate_exists:1;
unsigned candidate_checked:1;
unsigned candidate_ok:1;
diff --git a/wt-status.h b/wt-status.h
index 54fec77032..6018c627b1 100644
--- a/wt-status.h
+++ b/wt-status.h

[PATCH v3 01/20] Define new hash-size constants for allocating memory

2017-03-30 Thread brian m. carlson
Since we will want to transition to a new hash at some point in the
future, and that hash may be larger in size than 160 bits, introduce two
constants that can be used for allocating a sufficient amount of memory.
They can be increased to reflect the largest supported hash size.

Signed-off-by: brian m. carlson 
---
 cache.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index db4120c233..02b6c753a3 100644
--- a/cache.h
+++ b/cache.h
@@ -66,8 +66,12 @@ unsigned long git_deflate_bound(git_zstream *, unsigned 
long);
 #define GIT_SHA1_RAWSZ 20
 #define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
 
+/* The length in byte and in hex digits of the largest possible hash value. */
+#define GIT_MAX_RAWSZ GIT_SHA1_RAWSZ
+#define GIT_MAX_HEXSZ GIT_SHA1_HEXSZ
+
 struct object_id {
-   unsigned char hash[GIT_SHA1_RAWSZ];
+   unsigned char hash[GIT_MAX_RAWSZ];
 };
 
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)


[PATCH v3 06/20] builtin/receive-pack: convert portions to struct object_id

2017-03-30 Thread brian m. carlson
Convert some hardcoded constants into uses of parse_oid_hex.
Additionally, convert all uses of struct command, and miscellaneous
other functions necessary for that.  This work is necessary to be able
to convert sha1_array_append later on.

To avoid needing to specify a constant, reject shallow lines with the
wrong length instead of simply ignoring them.

Note that in queue_command we are guaranteed to have a NUL-terminated
buffer or at least one byte of overflow that we can safely read, so the
linelen check can be elided.  We would die in such a case, but not read
invalid memory.

Signed-off-by: brian m. carlson 
---
 builtin/receive-pack.c | 98 +-
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index feafb076a4..62b7c5e25c 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -309,8 +309,8 @@ struct command {
unsigned int skip_update:1,
 did_not_exist:1;
int index;
-   unsigned char old_sha1[20];
-   unsigned char new_sha1[20];
+   struct object_id old_oid;
+   struct object_id new_oid;
char ref_name[FLEX_ARRAY]; /* more */
 };
 
@@ -723,7 +723,7 @@ static int feed_receive_hook(void *state_, const char 
**bufp, size_t *sizep)
return -1; /* EOF */
strbuf_reset(>buf);
strbuf_addf(>buf, "%s %s %s\n",
-   sha1_to_hex(cmd->old_sha1), sha1_to_hex(cmd->new_sha1),
+   oid_to_hex(>old_oid), oid_to_hex(>new_oid),
cmd->ref_name);
state->cmd = cmd->next;
if (bufp) {
@@ -764,8 +764,8 @@ static int run_update_hook(struct command *cmd)
return 0;
 
argv[1] = cmd->ref_name;
-   argv[2] = sha1_to_hex(cmd->old_sha1);
-   argv[3] = sha1_to_hex(cmd->new_sha1);
+   argv[2] = oid_to_hex(>old_oid);
+   argv[3] = oid_to_hex(>new_oid);
argv[4] = NULL;
 
proc.no_stdin = 1;
@@ -988,8 +988,8 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
const char *name = cmd->ref_name;
struct strbuf namespaced_name_buf = STRBUF_INIT;
const char *namespaced_name, *ret;
-   unsigned char *old_sha1 = cmd->old_sha1;
-   unsigned char *new_sha1 = cmd->new_sha1;
+   struct object_id *old_oid = >old_oid;
+   struct object_id *new_oid = >new_oid;
 
/* only refs/... are allowed */
if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) {
@@ -1014,20 +1014,20 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
refuse_unconfigured_deny();
return "branch is currently checked out";
case DENY_UPDATE_INSTEAD:
-   ret = update_worktree(new_sha1);
+   ret = update_worktree(new_oid->hash);
if (ret)
return ret;
break;
}
}
 
-   if (!is_null_sha1(new_sha1) && !has_sha1_file(new_sha1)) {
+   if (!is_null_oid(new_oid) && !has_object_file(new_oid)) {
error("unpack should have generated %s, "
- "but I can't find it!", sha1_to_hex(new_sha1));
+ "but I can't find it!", oid_to_hex(new_oid));
return "bad pack";
}
 
-   if (!is_null_sha1(old_sha1) && is_null_sha1(new_sha1)) {
+   if (!is_null_oid(old_oid) && is_null_oid(new_oid)) {
if (deny_deletes && starts_with(name, "refs/heads/")) {
rp_error("denying ref deletion for %s", name);
return "deletion prohibited";
@@ -1053,14 +1053,14 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
}
}
 
-   if (deny_non_fast_forwards && !is_null_sha1(new_sha1) &&
-   !is_null_sha1(old_sha1) &&
+   if (deny_non_fast_forwards && !is_null_oid(new_oid) &&
+   !is_null_oid(old_oid) &&
starts_with(name, "refs/heads/")) {
struct object *old_object, *new_object;
struct commit *old_commit, *new_commit;
 
-   old_object = parse_object(old_sha1);
-   new_object = parse_object(new_sha1);
+   old_object = parse_object(old_oid->hash);
+   new_object = parse_object(new_oid->hash);
 
if (!old_object || !new_object ||
old_object->type != OBJ_COMMIT ||
@@ -1081,10 +1081,10 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
return "hook declined";
}
 
-   if (is_null_sha1(new_sha1)) {
+   if (is_null_oid(new_oid)) {
struct strbuf err = STRBUF_INIT;
-   if (!parse_object(old_sha1)) {
-   old_sha1 = NULL;

[PATCH v3 08/20] parse-options-cb: convert sha1_array_append caller to struct object_id

2017-03-30 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 parse-options-cb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/parse-options-cb.c b/parse-options-cb.c
index b7d8f7dcb2..40ece4d8c2 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -96,7 +96,7 @@ int parse_opt_commits(const struct option *opt, const char 
*arg, int unset)
 
 int parse_opt_object_name(const struct option *opt, const char *arg, int unset)
 {
-   unsigned char sha1[20];
+   struct object_id oid;
 
if (unset) {
sha1_array_clear(opt->value);
@@ -104,9 +104,9 @@ int parse_opt_object_name(const struct option *opt, const 
char *arg, int unset)
}
if (!arg)
return -1;
-   if (get_sha1(arg, sha1))
+   if (get_oid(arg, ))
return error(_("malformed object name '%s'"), arg);
-   sha1_array_append(opt->value, sha1);
+   sha1_array_append(opt->value, oid.hash);
return 0;
 }
 


[PATCH v3 05/20] builtin/pull: convert portions to struct object_id

2017-03-30 Thread brian m. carlson
Convert the caller of sha1_array_append to struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/pull.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index 3ecb881b0b..a9f7553f30 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -335,16 +335,16 @@ static void get_merge_heads(struct sha1_array 
*merge_heads)
const char *filename = git_path("FETCH_HEAD");
FILE *fp;
struct strbuf sb = STRBUF_INIT;
-   unsigned char sha1[GIT_SHA1_RAWSZ];
+   struct object_id oid;
 
if (!(fp = fopen(filename, "r")))
die_errno(_("could not open '%s' for reading"), filename);
while (strbuf_getline_lf(, fp) != EOF) {
-   if (get_sha1_hex(sb.buf, sha1))
+   if (get_oid_hex(sb.buf, ))
continue;  /* invalid line: does not start with SHA1 */
if (starts_with(sb.buf + GIT_SHA1_HEXSZ, "\tnot-for-merge\t"))
continue;  /* ref is not-for-merge */
-   sha1_array_append(merge_heads, sha1);
+   sha1_array_append(merge_heads, oid.hash);
}
fclose(fp);
strbuf_release();


[PATCH v3 17/20] Convert sha1_array_lookup to take struct object_id

2017-03-30 Thread brian m. carlson
Convert this function by changing the declaration and definition and
applying the following semantic patch to update the callers:

@@
expression E1, E2;
@@
- sha1_array_lookup(E1, E2.hash)
+ sha1_array_lookup(E1, )

@@
expression E1, E2;
@@
- sha1_array_lookup(E1, E2->hash)
+ sha1_array_lookup(E1, E2)

Signed-off-by: brian m. carlson 
---
 bisect.c   | 7 +++
 builtin/pack-objects.c | 2 +-
 fsck.c | 2 +-
 ref-filter.c   | 4 ++--
 sha1-array.c   | 4 ++--
 sha1-array.h   | 2 +-
 t/helper/test-sha1-array.c | 2 +-
 7 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/bisect.c b/bisect.c
index a25d008693..f193257509 100644
--- a/bisect.c
+++ b/bisect.c
@@ -499,8 +499,7 @@ struct commit_list *filter_skipped(struct commit_list *list,
while (list) {
struct commit_list *next = list->next;
list->next = NULL;
-   if (0 <= sha1_array_lookup(_revs,
-  list->item->object.oid.hash)) {
+   if (0 <= sha1_array_lookup(_revs, 
>item->object.oid)) {
if (skipped_first && !*skipped_first)
*skipped_first = 1;
/* Move current to tried list */
@@ -790,9 +789,9 @@ static void check_merge_bases(int no_checkout)
const struct object_id *mb = >item->object.oid;
if (!oidcmp(mb, current_bad_oid)) {
handle_bad_merge_base();
-   } else if (0 <= sha1_array_lookup(_revs, mb->hash)) {
+   } else if (0 <= sha1_array_lookup(_revs, mb)) {
continue;
-   } else if (0 <= sha1_array_lookup(_revs, mb->hash)) {
+   } else if (0 <= sha1_array_lookup(_revs, mb)) {
handle_skipped_merge_base(mb);
} else {
printf(_("Bisecting: a merge base must be tested\n"));
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index dca1b68e69..028c7be9a2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2677,7 +2677,7 @@ static int loosened_object_can_be_discarded(const struct 
object_id *oid,
return 0;
if (mtime > unpack_unreachable_expiration)
return 0;
-   if (sha1_array_lookup(_objects, oid->hash) >= 0)
+   if (sha1_array_lookup(_objects, oid) >= 0)
return 0;
return 1;
 }
diff --git a/fsck.c b/fsck.c
index 6682de1de5..24daedd6cc 100644
--- a/fsck.c
+++ b/fsck.c
@@ -280,7 +280,7 @@ static int report(struct fsck_options *options, struct 
object *object,
return 0;
 
if (options->skiplist && object &&
-   sha1_array_lookup(options->skiplist, object->oid.hash) 
>= 0)
+   sha1_array_lookup(options->skiplist, >oid) >= 0)
return 0;
 
if (msg_type == FSCK_FATAL)
diff --git a/ref-filter.c b/ref-filter.c
index d3dcb53dd5..4ee7ebcda3 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1684,14 +1684,14 @@ static const struct object_id *match_points_at(struct 
sha1_array *points_at,
const struct object_id *tagged_oid = NULL;
struct object *obj;
 
-   if (sha1_array_lookup(points_at, oid->hash) >= 0)
+   if (sha1_array_lookup(points_at, oid) >= 0)
return oid;
obj = parse_object(oid->hash);
if (!obj)
die(_("malformed object at '%s'"), refname);
if (obj->type == OBJ_TAG)
tagged_oid = &((struct tag *)obj)->tagged->oid;
-   if (tagged_oid && sha1_array_lookup(points_at, tagged_oid->hash) >= 0)
+   if (tagged_oid && sha1_array_lookup(points_at, tagged_oid) >= 0)
return tagged_oid;
return NULL;
 }
diff --git a/sha1-array.c b/sha1-array.c
index 26e596b264..1082b3dc11 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -26,11 +26,11 @@ static const unsigned char *sha1_access(size_t index, void 
*table)
return array[index].hash;
 }
 
-int sha1_array_lookup(struct sha1_array *array, const unsigned char *sha1)
+int sha1_array_lookup(struct sha1_array *array, const struct object_id *oid)
 {
if (!array->sorted)
sha1_array_sort(array);
-   return sha1_pos(sha1, array->oid, array->nr, sha1_access);
+   return sha1_pos(oid->hash, array->oid, array->nr, sha1_access);
 }
 
 void sha1_array_clear(struct sha1_array *array)
diff --git a/sha1-array.h b/sha1-array.h
index 4e60576a82..24347e7655 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -11,7 +11,7 @@ struct sha1_array {
 #define SHA1_ARRAY_INIT { NULL, 0, 0, 0 }
 
 void sha1_array_append(struct sha1_array *array, const struct object_id *oid);
-int sha1_array_lookup(struct sha1_array *array, const unsigned char *sha1);
+int sha1_array_lookup(struct sha1_array *array, const struct object_id *oid);
 void 

[PATCH v3 07/20] fsck: convert init_skiplist to struct object_id

2017-03-30 Thread brian m. carlson
Convert a hardcoded constant buffer size to a use of GIT_MAX_HEXSZ, and
use parse_oid_hex to reduce the dependency on the size of the hash.
This function is a caller of sha1_array_append, which will be converted
later.

Signed-off-by: brian m. carlson 
---
 fsck.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 939792752b..aff4ae6fd4 100644
--- a/fsck.c
+++ b/fsck.c
@@ -134,8 +134,8 @@ static void init_skiplist(struct fsck_options *options, 
const char *path)
 {
static struct sha1_array skiplist = SHA1_ARRAY_INIT;
int sorted, fd;
-   char buffer[41];
-   unsigned char sha1[20];
+   char buffer[GIT_MAX_HEXSZ + 1];
+   struct object_id oid;
 
if (options->skiplist)
sorted = options->skiplist->sorted;
@@ -148,17 +148,18 @@ static void init_skiplist(struct fsck_options *options, 
const char *path)
if (fd < 0)
die("Could not open skip list: %s", path);
for (;;) {
+   const char *p;
int result = read_in_full(fd, buffer, sizeof(buffer));
if (result < 0)
die_errno("Could not read '%s'", path);
if (!result)
break;
-   if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+   if (parse_oid_hex(buffer, , ) || *p != '\n')
die("Invalid SHA-1: %s", buffer);
-   sha1_array_append(, sha1);
+   sha1_array_append(, oid.hash);
if (sorted && skiplist.nr > 1 &&
hashcmp(skiplist.sha1[skiplist.nr - 2],
-   sha1) > 0)
+   oid.hash) > 0)
sorted = 0;
}
close(fd);


[PATCH v3 11/20] sha1_name: convert disambiguate_hint_fn to take object_id

2017-03-30 Thread brian m. carlson
Convert this function pointer type and the functions that implement it
to take a struct object_id.  Introduce a temporary in
show_ambiguous_object to avoid having to convert for_each_abbrev at this
point.

Signed-off-by: brian m. carlson 
---
 sha1_name.c | 64 -
 1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index cf6f4be0c6..2e38aedfa5 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -11,7 +11,7 @@
 
 static int get_sha1_oneline(const char *, unsigned char *, struct commit_list 
*);
 
-typedef int (*disambiguate_hint_fn)(const unsigned char *, void *);
+typedef int (*disambiguate_hint_fn)(const struct object_id *, void *);
 
 struct disambiguate_state {
int len; /* length of prefix in hex chars */
@@ -29,7 +29,7 @@ struct disambiguate_state {
unsigned always_call_fn:1;
 };
 
-static void update_candidates(struct disambiguate_state *ds, const unsigned 
char *current)
+static void update_candidates(struct disambiguate_state *ds, const struct 
object_id *current)
 {
if (ds->always_call_fn) {
ds->ambiguous = ds->fn(current, ds->cb_data) ? 1 : 0;
@@ -37,10 +37,10 @@ static void update_candidates(struct disambiguate_state 
*ds, const unsigned char
}
if (!ds->candidate_exists) {
/* this is the first candidate */
-   hashcpy(ds->candidate.hash, current);
+   oidcpy(>candidate, current);
ds->candidate_exists = 1;
return;
-   } else if (!hashcmp(ds->candidate.hash, current)) {
+   } else if (!oidcmp(>candidate, current)) {
/* the same as what we already have seen */
return;
}
@@ -52,14 +52,14 @@ static void update_candidates(struct disambiguate_state 
*ds, const unsigned char
}
 
if (!ds->candidate_checked) {
-   ds->candidate_ok = ds->fn(ds->candidate.hash, ds->cb_data);
+   ds->candidate_ok = ds->fn(>candidate, ds->cb_data);
ds->disambiguate_fn_used = 1;
ds->candidate_checked = 1;
}
 
if (!ds->candidate_ok) {
/* discard the candidate; we know it does not satisfy fn */
-   hashcpy(ds->candidate.hash, current);
+   oidcpy(>candidate, current);
ds->candidate_checked = 0;
return;
}
@@ -107,15 +107,15 @@ static void find_short_object_filename(struct 
disambiguate_state *ds)
continue;
 
while (!ds->ambiguous && (de = readdir(dir)) != NULL) {
-   unsigned char sha1[20];
+   struct object_id oid;
 
-   if (strlen(de->d_name) != 38)
+   if (strlen(de->d_name) != GIT_SHA1_HEXSZ - 2)
continue;
if (memcmp(de->d_name, ds->hex_pfx + 2, ds->len - 2))
continue;
-   memcpy(hex + 2, de->d_name, 38);
-   if (!get_sha1_hex(hex, sha1))
-   update_candidates(ds, sha1);
+   memcpy(hex + 2, de->d_name, GIT_SHA1_HEXSZ - 2);
+   if (!get_oid_hex(hex, ))
+   update_candidates(ds, );
}
closedir(dir);
}
@@ -140,7 +140,7 @@ static void unique_in_pack(struct packed_git *p,
   struct disambiguate_state *ds)
 {
uint32_t num, last, i, first = 0;
-   const unsigned char *current = NULL;
+   const struct object_id *current = NULL;
 
open_pack_index(p);
num = p->num_objects;
@@ -169,8 +169,9 @@ static void unique_in_pack(struct packed_git *p,
 * 0, 1 or more objects that actually match(es).
 */
for (i = first; i < num && !ds->ambiguous; i++) {
-   current = nth_packed_object_sha1(p, i);
-   if (!match_sha(ds->len, ds->bin_pfx.hash, current))
+   struct object_id oid;
+   current = nth_packed_object_oid(, p, i);
+   if (!match_sha(ds->len, ds->bin_pfx.hash, current->hash))
break;
update_candidates(ds, current);
}
@@ -213,7 +214,7 @@ static int finish_object_disambiguation(struct 
disambiguate_state *ds,
 * same repository!
 */
ds->candidate_ok = (!ds->disambiguate_fn_used ||
-   ds->fn(ds->candidate.hash, ds->cb_data));
+   ds->fn(>candidate, ds->cb_data));
 
if (!ds->candidate_ok)
return SHORT_NAME_AMBIGUOUS;
@@ -222,57 +223,57 @@ static int finish_object_disambiguation(struct 
disambiguate_state *ds,
return 0;
 }
 
-static int disambiguate_commit_only(const unsigned char *sha1, 

[PATCH v3 14/20] sha1-array: convert internal storage for struct sha1_array to object_id

2017-03-30 Thread brian m. carlson
Make the internal storage for struct sha1_array use an array of struct
object_id internally.  Update the users of this struct which inspect its
internals.

Signed-off-by: brian m. carlson 
---
 bisect.c   | 14 +++---
 builtin/pull.c | 22 +++---
 builtin/receive-pack.c |  4 ++--
 combine-diff.c |  6 +++---
 fetch-pack.c   | 12 ++--
 fsck.c |  4 ++--
 remote-curl.c  |  2 +-
 send-pack.c|  2 +-
 sha1-array.c   | 22 +++---
 sha1-array.h   |  2 +-
 shallow.c  | 26 +-
 11 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/bisect.c b/bisect.c
index 21c3e34636..ebaf7b05ba 100644
--- a/bisect.c
+++ b/bisect.c
@@ -457,7 +457,7 @@ static char *join_sha1_array_hex(struct sha1_array *array, 
char delim)
int i;
 
for (i = 0; i < array->nr; i++) {
-   strbuf_addstr(_hexs, sha1_to_hex(array->sha1[i]));
+   strbuf_addstr(_hexs, oid_to_hex(array->oid + i));
if (i + 1 < array->nr)
strbuf_addch(_hexs, delim);
}
@@ -621,7 +621,7 @@ static void bisect_rev_setup(struct rev_info *revs, const 
char *prefix,
argv_array_pushf(_argv, bad_format, oid_to_hex(current_bad_oid));
for (i = 0; i < good_revs.nr; i++)
argv_array_pushf(_argv, good_format,
-sha1_to_hex(good_revs.sha1[i]));
+oid_to_hex(good_revs.oid + i));
argv_array_push(_argv, "--");
if (read_paths)
read_bisect_paths(_argv);
@@ -701,11 +701,11 @@ static int bisect_checkout(const unsigned char 
*bisect_rev, int no_checkout)
return run_command_v_opt(argv_show_branch, RUN_GIT_CMD);
 }
 
-static struct commit *get_commit_reference(const unsigned char *sha1)
+static struct commit *get_commit_reference(const struct object_id *oid)
 {
-   struct commit *r = lookup_commit_reference(sha1);
+   struct commit *r = lookup_commit_reference(oid->hash);
if (!r)
-   die(_("Not a valid commit name %s"), sha1_to_hex(sha1));
+   die(_("Not a valid commit name %s"), oid_to_hex(oid));
return r;
 }
 
@@ -715,9 +715,9 @@ static struct commit **get_bad_and_good_commits(int *rev_nr)
int i, n = 0;
 
ALLOC_ARRAY(rev, 1 + good_revs.nr);
-   rev[n++] = get_commit_reference(current_bad_oid->hash);
+   rev[n++] = get_commit_reference(current_bad_oid);
for (i = 0; i < good_revs.nr; i++)
-   rev[n++] = get_commit_reference(good_revs.sha1[i]);
+   rev[n++] = get_commit_reference(good_revs.oid + i);
*rev_nr = n;
 
return rev;
diff --git a/builtin/pull.c b/builtin/pull.c
index 704ce1f042..c007900ab5 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -514,7 +514,7 @@ static int run_fetch(const char *repo, const char 
**refspecs)
 /**
  * "Pulls into void" by branching off merge_head.
  */
-static int pull_into_void(const unsigned char *merge_head,
+static int pull_into_void(const struct object_id *merge_head,
const struct object_id *curr_head)
 {
/*
@@ -523,10 +523,10 @@ static int pull_into_void(const unsigned char *merge_head,
 * index/worktree changes that the user already made on the unborn
 * branch.
 */
-   if (checkout_fast_forward(EMPTY_TREE_SHA1_BIN, merge_head, 0))
+   if (checkout_fast_forward(EMPTY_TREE_SHA1_BIN, merge_head->hash, 0))
return 1;
 
-   if (update_ref("initial pull", "HEAD", merge_head, curr_head->hash, 0, 
UPDATE_REFS_DIE_ON_ERR))
+   if (update_ref("initial pull", "HEAD", merge_head->hash, 
curr_head->hash, 0, UPDATE_REFS_DIE_ON_ERR))
return 1;
 
return 0;
@@ -693,13 +693,13 @@ static int get_rebase_fork_point(struct object_id 
*fork_point, const char *repo,
  */
 static int get_octopus_merge_base(struct object_id *merge_base,
const struct object_id *curr_head,
-   const unsigned char *merge_head,
+   const struct object_id *merge_head,
const struct object_id *fork_point)
 {
struct commit_list *revs = NULL, *result;
 
commit_list_insert(lookup_commit_reference(curr_head->hash), );
-   commit_list_insert(lookup_commit_reference(merge_head), );
+   commit_list_insert(lookup_commit_reference(merge_head->hash), );
if (!is_null_oid(fork_point))
commit_list_insert(lookup_commit_reference(fork_point->hash), 
);
 
@@ -718,7 +718,7 @@ static int get_octopus_merge_base(struct object_id 
*merge_base,
  * appropriate arguments and returns its exit status.
  */
 static int run_rebase(const struct object_id *curr_head,
-   const unsigned char *merge_head,
+   const struct object_id *merge_head,
const struct 

[PATCH v3 18/20] Convert sha1_array_for_each_unique and for_each_abbrev to object_id

2017-03-30 Thread brian m. carlson
Make sha1_array_for_each_unique take a callback using struct object_id.
Since one of these callbacks is an argument to for_each_abbrev, convert
those as well.  Rename various functions, replacing "sha1" with "oid".

Signed-off-by: brian m. carlson 
---
 builtin/cat-file.c |  4 ++--
 builtin/receive-pack.c | 12 ++--
 builtin/rev-parse.c|  4 ++--
 cache.h|  2 +-
 sha1-array.c   |  4 ++--
 sha1-array.h   |  6 +++---
 sha1_name.c| 14 ++
 submodule.c| 20 ++--
 t/helper/test-sha1-array.c |  6 +++---
 9 files changed, 35 insertions(+), 37 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 8fbb667170..eb0043231d 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -401,10 +401,10 @@ struct object_cb_data {
struct expand_data *expand;
 };
 
-static int batch_object_cb(const unsigned char sha1[20], void *vdata)
+static int batch_object_cb(const struct object_id *oid, void *vdata)
 {
struct object_cb_data *data = vdata;
-   hashcpy(data->expand->oid.hash, sha1);
+   oidcpy(>expand->oid, oid);
batch_object_write(NULL, data->opt, data->expand);
return 0;
 }
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 2853ea0f99..51b52a6041 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -225,10 +225,10 @@ static int receive_pack_config(const char *var, const 
char *value, void *cb)
return git_default_config(var, value, cb);
 }
 
-static void show_ref(const char *path, const unsigned char *sha1)
+static void show_ref(const char *path, const struct object_id *oid)
 {
if (sent_capabilities) {
-   packet_write_fmt(1, "%s %s\n", sha1_to_hex(sha1), path);
+   packet_write_fmt(1, "%s %s\n", oid_to_hex(oid), path);
} else {
struct strbuf cap = STRBUF_INIT;
 
@@ -244,7 +244,7 @@ static void show_ref(const char *path, const unsigned char 
*sha1)
strbuf_addstr(, " push-options");
strbuf_addf(, " agent=%s", git_user_agent_sanitized());
packet_write_fmt(1, "%s %s%c%s\n",
-sha1_to_hex(sha1), path, 0, cap.buf);
+oid_to_hex(oid), path, 0, cap.buf);
strbuf_release();
sent_capabilities = 1;
}
@@ -271,7 +271,7 @@ static int show_ref_cb(const char *path_full, const struct 
object_id *oid,
} else {
oidset_insert(seen, oid);
}
-   show_ref(path, oid->hash);
+   show_ref(path, oid);
return 0;
 }
 
@@ -284,7 +284,7 @@ static void show_one_alternate_ref(const char *refname,
if (oidset_insert(seen, oid))
return;
 
-   show_ref(".have", oid->hash);
+   show_ref(".have", oid);
 }
 
 static void write_head_info(void)
@@ -295,7 +295,7 @@ static void write_head_info(void)
for_each_alternate_ref(show_one_alternate_ref, );
oidset_clear();
if (!sent_capabilities)
-   show_ref("capabilities^{}", null_sha1);
+   show_ref("capabilities^{}", _oid);
 
advertise_shallow_grafts(1);
 
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 9e53a1a7ca..36268b511c 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -205,9 +205,9 @@ static int anti_reference(const char *refname, const struct 
object_id *oid, int
return 0;
 }
 
-static int show_abbrev(const unsigned char *sha1, void *cb_data)
+static int show_abbrev(const struct object_id *oid, void *cb_data)
 {
-   show_rev(NORMAL, sha1, NULL);
+   show_rev(NORMAL, oid->hash, NULL);
return 0;
 }
 
diff --git a/cache.h b/cache.h
index db14464bce..c1882112f6 100644
--- a/cache.h
+++ b/cache.h
@@ -1362,7 +1362,7 @@ extern int get_sha1_with_context(const char *str, 
unsigned flags, unsigned char
 
 extern int get_oid(const char *str, struct object_id *oid);
 
-typedef int each_abbrev_fn(const unsigned char *sha1, void *);
+typedef int each_abbrev_fn(const struct object_id *oid, void *);
 extern int for_each_abbrev(const char *prefix, each_abbrev_fn, void *);
 
 extern int set_disambiguate_hint_config(const char *var, const char *value);
diff --git a/sha1-array.c b/sha1-array.c
index 1082b3dc11..82a7f4435c 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -43,7 +43,7 @@ void sha1_array_clear(struct sha1_array *array)
 }
 
 int sha1_array_for_each_unique(struct sha1_array *array,
-   for_each_sha1_fn fn,
+   for_each_oid_fn fn,
void *data)
 {
int i;
@@ -55,7 +55,7 @@ int sha1_array_for_each_unique(struct sha1_array *array,
int ret;
if (i > 0 && !oidcmp(array->oid + i, array->oid + i - 1))
continue;
-   ret = 

[PATCH v3 19/20] Rename sha1_array to oid_array

2017-03-30 Thread brian m. carlson
Since this structure handles an array of object IDs, rename it to struct
oid_array.  Also rename the accessor functions and the initialization
constant.

This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:

perl -pi -E 's/struct sha1_array/struct oid_array/g'
perl -pi -E 's/\bsha1_array_/oid_array_/g'
perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'

Signed-off-by: brian m. carlson 
---
 bisect.c   | 16 
 builtin/cat-file.c | 10 +-
 builtin/diff.c |  6 +++---
 builtin/fetch-pack.c   |  2 +-
 builtin/pack-objects.c | 10 +-
 builtin/pull.c |  6 +++---
 builtin/receive-pack.c | 24 +++
 builtin/send-pack.c|  4 ++--
 combine-diff.c | 12 ++--
 commit.h   | 14 +++---
 connect.c  |  8 
 diff.h |  4 ++--
 fetch-pack.c   | 26 -
 fetch-pack.h   |  4 ++--
 fsck.c |  6 +++---
 fsck.h |  2 +-
 parse-options-cb.c |  4 ++--
 ref-filter.c   |  6 +++---
 ref-filter.h   |  2 +-
 remote-curl.c  |  2 +-
 remote.h   |  6 +++---
 send-pack.c|  4 ++--
 send-pack.h|  2 +-
 sha1-array.c   | 14 +++---
 sha1-array.h   | 12 ++--
 sha1_name.c|  8 
 shallow.c  | 12 ++--
 submodule.c| 48 +++---
 submodule.h|  6 +++---
 t/helper/test-sha1-array.c | 10 +-
 transport.c| 20 +--
 31 files changed, 155 insertions(+), 155 deletions(-)

diff --git a/bisect.c b/bisect.c
index f193257509..54d69e77b9 100644
--- a/bisect.c
+++ b/bisect.c
@@ -12,8 +12,8 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 
-static struct sha1_array good_revs;
-static struct sha1_array skipped_revs;
+static struct oid_array good_revs;
+static struct oid_array skipped_revs;
 
 static struct object_id *current_bad_oid;
 
@@ -413,9 +413,9 @@ static int register_ref(const char *refname, const struct 
object_id *oid,
current_bad_oid = xmalloc(sizeof(*current_bad_oid));
oidcpy(current_bad_oid, oid);
} else if (starts_with(refname, good_prefix.buf)) {
-   sha1_array_append(_revs, oid);
+   oid_array_append(_revs, oid);
} else if (starts_with(refname, "skip-")) {
-   sha1_array_append(_revs, oid);
+   oid_array_append(_revs, oid);
}
 
strbuf_release(_prefix);
@@ -451,7 +451,7 @@ static void read_bisect_paths(struct argv_array *array)
fclose(fp);
 }
 
-static char *join_sha1_array_hex(struct sha1_array *array, char delim)
+static char *join_sha1_array_hex(struct oid_array *array, char delim)
 {
struct strbuf joined_hexs = STRBUF_INIT;
int i;
@@ -499,7 +499,7 @@ struct commit_list *filter_skipped(struct commit_list *list,
while (list) {
struct commit_list *next = list->next;
list->next = NULL;
-   if (0 <= sha1_array_lookup(_revs, 
>item->object.oid)) {
+   if (0 <= oid_array_lookup(_revs, 
>item->object.oid)) {
if (skipped_first && !*skipped_first)
*skipped_first = 1;
/* Move current to tried list */
@@ -789,9 +789,9 @@ static void check_merge_bases(int no_checkout)
const struct object_id *mb = >item->object.oid;
if (!oidcmp(mb, current_bad_oid)) {
handle_bad_merge_base();
-   } else if (0 <= sha1_array_lookup(_revs, mb)) {
+   } else if (0 <= oid_array_lookup(_revs, mb)) {
continue;
-   } else if (0 <= sha1_array_lookup(_revs, mb)) {
+   } else if (0 <= oid_array_lookup(_revs, mb)) {
handle_skipped_merge_base(mb);
} else {
printf(_("Bisecting: a merge base must be tested\n"));
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index eb0043231d..1890d7a639 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -413,7 +413,7 @@ static int batch_loose_object(const struct object_id *oid,
  const char *path,
  void *data)
 {
-   sha1_array_append(data, oid);
+   oid_array_append(data, oid);
return 0;
 }
 
@@ -422,7 +422,7 @@ static int batch_packed_object(const struct object_id *oid,
   uint32_t pos,
   void *data)
 {
-   sha1_array_append(data, oid);
+   oid_array_append(data, oid);
return 0;
 

[PATCH v3 20/20] Documentation: update and rename api-sha1-array.txt

2017-03-30 Thread brian m. carlson
Since the structure and functions have changed names, update the code
examples and the documentation.  Rename the file to match the new name
of the API.

Signed-off-by: brian m. carlson 
---
 .../{api-sha1-array.txt => api-oid-array.txt}  | 44 +++---
 1 file changed, 22 insertions(+), 22 deletions(-)
 rename Documentation/technical/{api-sha1-array.txt => api-oid-array.txt} (61%)

diff --git a/Documentation/technical/api-sha1-array.txt 
b/Documentation/technical/api-oid-array.txt
similarity index 61%
rename from Documentation/technical/api-sha1-array.txt
rename to Documentation/technical/api-oid-array.txt
index dcc52943a5..b0c11f868d 100644
--- a/Documentation/technical/api-sha1-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -1,7 +1,7 @@
-sha1-array API
+oid-array API
 ==
 
-The sha1-array API provides storage and manipulation of sets of SHA-1
+The oid-array API provides storage and manipulation of sets of object
 identifiers. The emphasis is on storage and processing efficiency,
 making them suitable for large lists. Note that the ordering of items is
 not preserved over some operations.
@@ -9,10 +9,10 @@ not preserved over some operations.
 Data Structures
 ---
 
-`struct sha1_array`::
+`struct oid_array`::
 
-   A single array of SHA-1 hashes. This should be initialized by
-   assignment from `SHA1_ARRAY_INIT`.  The `sha1` member contains
+   A single array of object IDs. This should be initialized by
+   assignment from `OID_ARRAY_INIT`.  The `oid` member contains
the actual data. The `nr` member contains the number of items in
the set.  The `alloc` and `sorted` members are used internally,
and should not be needed by API callers.
@@ -20,22 +20,22 @@ Data Structures
 Functions
 -
 
-`sha1_array_append`::
-   Add an item to the set. The sha1 will be placed at the end of
+`oid_array_append`::
+   Add an item to the set. The object ID will be placed at the end of
the array (but note that some operations below may lose this
ordering).
 
-`sha1_array_lookup`::
-   Perform a binary search of the array for a specific sha1.
+`oid_array_lookup`::
+   Perform a binary search of the array for a specific object ID.
If found, returns the offset (in number of elements) of the
-   sha1. If not found, returns a negative integer. If the array is
-   not sorted, this function has the side effect of sorting it.
+   object ID. If not found, returns a negative integer. If the array
+   is not sorted, this function has the side effect of sorting it.
 
-`sha1_array_clear`::
+`oid_array_clear`::
Free all memory associated with the array and return it to the
initial, empty state.
 
-`sha1_array_for_each_unique`::
+`oid_array_for_each_unique`::
Efficiently iterate over each unique element of the list,
executing the callback function for each one. If the array is
not sorted, this function has the side effect of sorting it. If
@@ -47,25 +47,25 @@ Examples
 
 
 -
-int print_callback(const unsigned char sha1[20],
+int print_callback(const struct object_id *oid,
void *data)
 {
-   printf("%s\n", sha1_to_hex(sha1));
+   printf("%s\n", oid_to_hex(oid));
return 0; /* always continue */
 }
 
 void some_func(void)
 {
-   struct sha1_array hashes = SHA1_ARRAY_INIT;
-   unsigned char sha1[20];
+   struct sha1_array hashes = OID_ARRAY_INIT;
+   struct object_id oid;
 
/* Read objects into our set */
-   while (read_object_from_stdin(sha1))
-   sha1_array_append(, sha1);
+   while (read_object_from_stdin(oid.hash))
+   oid_array_append(, );
 
/* Check if some objects are in our set */
-   while (read_object_from_stdin(sha1)) {
-   if (sha1_array_lookup(, sha1) >= 0)
+   while (read_object_from_stdin(oid.hash)) {
+   if (oid_array_lookup(, ) >= 0)
printf("it's in there!\n");
 
/*
@@ -75,6 +75,6 @@ void some_func(void)
 * Instead, this will sort once and then skip duplicates
 * in linear time.
 */
-   sha1_array_for_each_unique(, print_callback, NULL);
+   oid_array_for_each_unique(, print_callback, NULL);
 }
 -


[PATCH v3 12/20] submodule: convert check_for_new_submodule_commits to object_id

2017-03-30 Thread brian m. carlson
All of the callers of this function have been converted, so convert this
function and update the callers.  This function also calls
sha1_array_append, which we'll convert shortly.

Signed-off-by: brian m. carlson 
---
 builtin/fetch.c | 6 +++---
 submodule.c | 4 ++--
 submodule.h | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b5ad09d046..a41b892dcc 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -659,7 +659,7 @@ static int update_local_ref(struct ref *ref,
 
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
-   check_for_new_submodule_commits(ref->new_oid.hash);
+   check_for_new_submodule_commits(>new_oid);
r = s_update_ref(msg, ref, 0);
format_display(display, r ? '!' : '*', what,
   r ? _("unable to update local ref") : NULL,
@@ -675,7 +675,7 @@ static int update_local_ref(struct ref *ref,
strbuf_add_unique_abbrev(, ref->new_oid.hash, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
-   check_for_new_submodule_commits(ref->new_oid.hash);
+   check_for_new_submodule_commits(>new_oid);
r = s_update_ref("fast-forward", ref, 1);
format_display(display, r ? '!' : ' ', quickref.buf,
   r ? _("unable to update local ref") : NULL,
@@ -690,7 +690,7 @@ static int update_local_ref(struct ref *ref,
strbuf_add_unique_abbrev(, ref->new_oid.hash, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
-   check_for_new_submodule_commits(ref->new_oid.hash);
+   check_for_new_submodule_commits(>new_oid);
r = s_update_ref("forced-update", ref, 1);
format_display(display, r ? '!' : '+', quickref.buf,
   r ? _("unable to update local ref") : _("forced 
update"),
diff --git a/submodule.c b/submodule.c
index 3200b7bb2b..5c5c18ec3d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -821,14 +821,14 @@ static int add_sha1_to_array(const char *ref, const 
struct object_id *oid,
return 0;
 }
 
-void check_for_new_submodule_commits(unsigned char new_sha1[20])
+void check_for_new_submodule_commits(struct object_id *oid)
 {
if (!initialized_fetch_ref_tips) {
for_each_ref(add_sha1_to_array, _tips_before_fetch);
initialized_fetch_ref_tips = 1;
}
 
-   sha1_array_append(_tips_after_fetch, new_sha1);
+   sha1_array_append(_tips_after_fetch, oid->hash);
 }
 
 static int add_sha1_to_argv(const unsigned char sha1[20], void *data)
diff --git a/submodule.h b/submodule.h
index c8a0c9cb29..9c32b28b12 100644
--- a/submodule.h
+++ b/submodule.h
@@ -58,7 +58,7 @@ extern void show_submodule_inline_diff(FILE *f, const char 
*path,
const char *del, const char *add, const char *reset,
const struct diff_options *opt);
 extern void set_config_fetch_recurse_submodules(int value);
-extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
+extern void check_for_new_submodule_commits(struct object_id *oid);
 extern int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
   int quiet, int max_parallel_jobs);


[PATCH v3 16/20] Convert remaining callers of sha1_array_lookup to object_id

2017-03-30 Thread brian m. carlson
There are a very small number of callers which don't already use struct
object_id.  Convert them.

Signed-off-by: brian m. carlson 
---
 bisect.c   | 14 +++---
 builtin/pack-objects.c | 16 
 ref-filter.c   | 22 +++---
 3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/bisect.c b/bisect.c
index 886e630884..a25d008693 100644
--- a/bisect.c
+++ b/bisect.c
@@ -754,9 +754,9 @@ static void handle_bad_merge_base(void)
exit(1);
 }
 
-static void handle_skipped_merge_base(const unsigned char *mb)
+static void handle_skipped_merge_base(const struct object_id *mb)
 {
-   char *mb_hex = sha1_to_hex(mb);
+   char *mb_hex = oid_to_hex(mb);
char *bad_hex = oid_to_hex(current_bad_oid);
char *good_hex = join_sha1_array_hex(_revs, ' ');
 
@@ -787,16 +787,16 @@ static void check_merge_bases(int no_checkout)
result = get_merge_bases_many(rev[0], rev_nr - 1, rev + 1);
 
for (; result; result = result->next) {
-   const unsigned char *mb = result->item->object.oid.hash;
-   if (!hashcmp(mb, current_bad_oid->hash)) {
+   const struct object_id *mb = >item->object.oid;
+   if (!oidcmp(mb, current_bad_oid)) {
handle_bad_merge_base();
-   } else if (0 <= sha1_array_lookup(_revs, mb)) {
+   } else if (0 <= sha1_array_lookup(_revs, mb->hash)) {
continue;
-   } else if (0 <= sha1_array_lookup(_revs, mb)) {
+   } else if (0 <= sha1_array_lookup(_revs, mb->hash)) {
handle_skipped_merge_base(mb);
} else {
printf(_("Bisecting: a merge base must be tested\n"));
-   exit(bisect_checkout(mb, no_checkout));
+   exit(bisect_checkout(mb->hash, no_checkout));
}
}
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index dfeacd5c37..dca1b68e69 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2670,14 +2670,14 @@ static int has_sha1_pack_kept_or_nonlocal(const 
unsigned char *sha1)
  */
 static struct sha1_array recent_objects;
 
-static int loosened_object_can_be_discarded(const unsigned char *sha1,
+static int loosened_object_can_be_discarded(const struct object_id *oid,
unsigned long mtime)
 {
if (!unpack_unreachable_expiration)
return 0;
if (mtime > unpack_unreachable_expiration)
return 0;
-   if (sha1_array_lookup(_objects, sha1) >= 0)
+   if (sha1_array_lookup(_objects, oid->hash) >= 0)
return 0;
return 1;
 }
@@ -2686,7 +2686,7 @@ static void loosen_unused_packed_objects(struct rev_info 
*revs)
 {
struct packed_git *p;
uint32_t i;
-   const unsigned char *sha1;
+   struct object_id oid;
 
for (p = packed_git; p; p = p->next) {
if (!p->pack_local || p->pack_keep)
@@ -2696,11 +2696,11 @@ static void loosen_unused_packed_objects(struct 
rev_info *revs)
die("cannot open pack index");
 
for (i = 0; i < p->num_objects; i++) {
-   sha1 = nth_packed_object_sha1(p, i);
-   if (!packlist_find(_pack, sha1, NULL) &&
-   !has_sha1_pack_kept_or_nonlocal(sha1) &&
-   !loosened_object_can_be_discarded(sha1, p->mtime))
-   if (force_object_loose(sha1, p->mtime))
+   nth_packed_object_oid(, p, i);
+   if (!packlist_find(_pack, oid.hash, NULL) &&
+   !has_sha1_pack_kept_or_nonlocal(oid.hash) &&
+   !loosened_object_can_be_discarded(, p->mtime))
+   if (force_object_loose(oid.hash, p->mtime))
die("unable to force loose object");
}
}
diff --git a/ref-filter.c b/ref-filter.c
index 9c82b5b9d6..d3dcb53dd5 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1677,22 +1677,22 @@ static int filter_pattern_match(struct ref_filter 
*filter, const char *refname)
  * the need to parse the object via parse_object(). peel_ref() might be a
  * more efficient alternative to obtain the pointee.
  */
-static const unsigned char *match_points_at(struct sha1_array *points_at,
-   const unsigned char *sha1,
-   const char *refname)
+static const struct object_id *match_points_at(struct sha1_array *points_at,
+  const struct object_id *oid,
+  const char *refname)
 {
-   const unsigned char *tagged_sha1 = NULL;
+   const struct object_id *tagged_oid = NULL;

[PATCH v3 00/20] object_id part 7

2017-03-30 Thread brian m. carlson
This is part 7 in the continuing transition to use struct object_id.

This series focuses on two main areas: adding two constants for the
maximum hash size we'll be using (which will be suitable for allocating
memory) and converting struct sha1_array to struct oid_array.

The rationale for adding separate constants for allocating memory is
that with a new 256-bit hash function, we're going to need two different
items: a constant for allocating memory that's as large as the largest
hash, and a global variable telling us size the current hash is.  I've
opted to provide GIT_MAX_RAWSZ and GIT_MAX_HEXSZ for allocating memory,
and leave GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ as values that can be later
replaced by the aforementioned global.

Replacing struct sha1_array with struct oid_array necessarily involves
converting the shallow code, so I did that.  The structure now handles
objects of struct object_id.  While I renamed the documentation (since
people will search for that), I chose not to rename the sha1-array.[ch]
files or the test helper because I didn't think it was worth the hassle,
especially for people who don't have rename support turned on by
default.

I chose to use Coccinelle quite a bit in this series, as it automates a
lot of the manual work and aides in review.  There is also some use of
Perl one-liners.

This series is available at https://github.com/bk2204/git under
object-id-part7; it may be rebased.

Changes from v2:
* Drop the patch that Junio has picked up separately.
* Change one struct object_id * parameter from "sha1" to "oid".
* Convert E2[E3].hash to E2[E3] instead of E2 + E3.

Changes from v1:
* Rebase on current master (no changes).
* Remove check for empty line in queue_command.
* Add patch 6 to fix invalid pointer arithmetic.

brian m. carlson (20):
  Define new hash-size constants for allocating memory
  Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
  Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
  builtin/diff: convert to struct object_id
  builtin/pull: convert portions to struct object_id
  builtin/receive-pack: convert portions to struct object_id
  fsck: convert init_skiplist to struct object_id
  parse-options-cb: convert sha1_array_append caller to struct object_id
  test-sha1-array: convert most code to struct object_id
  sha1_name: convert struct disambiguate_state to object_id
  sha1_name: convert disambiguate_hint_fn to take object_id
  submodule: convert check_for_new_submodule_commits to object_id
  builtin/pull: convert to struct object_id
  sha1-array: convert internal storage for struct sha1_array to
object_id
  Make sha1_array_append take a struct object_id *
  Convert remaining callers of sha1_array_lookup to object_id
  Convert sha1_array_lookup to take struct object_id
  Convert sha1_array_for_each_unique and for_each_abbrev to object_id
  Rename sha1_array to oid_array
  Documentation: update and rename api-sha1-array.txt

 .../{api-sha1-array.txt => api-oid-array.txt}  |  44 +++
 bisect.c   |  43 ---
 builtin/blame.c|   4 +-
 builtin/cat-file.c |  14 +--
 builtin/diff.c |  40 +++---
 builtin/fetch-pack.c   |   2 +-
 builtin/fetch.c|   6 +-
 builtin/merge-index.c  |   2 +-
 builtin/merge.c|   2 +-
 builtin/pack-objects.c |  24 ++--
 builtin/patch-id.c |   2 +-
 builtin/pull.c |  98 +++
 builtin/receive-pack.c | 134 ++---
 builtin/rev-list.c |   2 +-
 builtin/rev-parse.c|   4 +-
 builtin/send-pack.c|   4 +-
 cache.h|  10 +-
 combine-diff.c |  18 +--
 commit.h   |  14 +--
 connect.c  |   8 +-
 diff.c |   4 +-
 diff.h |   4 +-
 fetch-pack.c   |  32 ++---
 fetch-pack.h   |   4 +-
 fsck.c |  17 +--
 fsck.h |   2 +-
 hex.c  |   2 +-
 parse-options-cb.c |   8 +-
 patch-ids.c|   2 +-
 patch-ids.h|   2 +-
 ref-filter.c   |  22 ++--
 ref-filter.h   |   2 +-
 remote-curl.c

[PATCH v3 09/20] test-sha1-array: convert most code to struct object_id

2017-03-30 Thread brian m. carlson
This helper is very small, so convert the entire thing.

Signed-off-by: brian m. carlson 
---
 t/helper/test-sha1-array.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/t/helper/test-sha1-array.c b/t/helper/test-sha1-array.c
index f7a53c4ad6..b4bb97fccc 100644
--- a/t/helper/test-sha1-array.c
+++ b/t/helper/test-sha1-array.c
@@ -14,16 +14,16 @@ int cmd_main(int argc, const char **argv)
 
while (strbuf_getline(, stdin) != EOF) {
const char *arg;
-   unsigned char sha1[20];
+   struct object_id oid;
 
if (skip_prefix(line.buf, "append ", )) {
-   if (get_sha1_hex(arg, sha1))
+   if (get_oid_hex(arg, ))
die("not a hexadecimal SHA1: %s", arg);
-   sha1_array_append(, sha1);
+   sha1_array_append(, oid.hash);
} else if (skip_prefix(line.buf, "lookup ", )) {
-   if (get_sha1_hex(arg, sha1))
+   if (get_oid_hex(arg, ))
die("not a hexadecimal SHA1: %s", arg);
-   printf("%d\n", sha1_array_lookup(, sha1));
+   printf("%d\n", sha1_array_lookup(, oid.hash));
} else if (!strcmp(line.buf, "clear"))
sha1_array_clear();
else if (!strcmp(line.buf, "for_each_unique"))


[PATCH v3 04/20] builtin/diff: convert to struct object_id

2017-03-30 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 builtin/diff.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 3d64b85337..398eee00d5 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -21,7 +21,7 @@
 #define DIFF_NO_INDEX_IMPLICIT 2
 
 struct blobinfo {
-   unsigned char sha1[20];
+   struct object_id oid;
const char *name;
unsigned mode;
 };
@@ -31,22 +31,22 @@ static const char builtin_diff_usage[] =
 
 static void stuff_change(struct diff_options *opt,
 unsigned old_mode, unsigned new_mode,
-const unsigned char *old_sha1,
-const unsigned char *new_sha1,
-int old_sha1_valid,
-int new_sha1_valid,
+const struct object_id *old_oid,
+const struct object_id *new_oid,
+int old_oid_valid,
+int new_oid_valid,
 const char *old_name,
 const char *new_name)
 {
struct diff_filespec *one, *two;
 
-   if (!is_null_sha1(old_sha1) && !is_null_sha1(new_sha1) &&
-   !hashcmp(old_sha1, new_sha1) && (old_mode == new_mode))
+   if (!is_null_oid(old_oid) && !is_null_oid(new_oid) &&
+   !oidcmp(old_oid, new_oid) && (old_mode == new_mode))
return;
 
if (DIFF_OPT_TST(opt, REVERSE_DIFF)) {
SWAP(old_mode, new_mode);
-   SWAP(old_sha1, new_sha1);
+   SWAP(old_oid, new_oid);
SWAP(old_name, new_name);
}
 
@@ -57,8 +57,8 @@ static void stuff_change(struct diff_options *opt,
 
one = alloc_filespec(old_name);
two = alloc_filespec(new_name);
-   fill_filespec(one, old_sha1, old_sha1_valid, old_mode);
-   fill_filespec(two, new_sha1, new_sha1_valid, new_mode);
+   fill_filespec(one, old_oid->hash, old_oid_valid, old_mode);
+   fill_filespec(two, new_oid->hash, new_oid_valid, new_mode);
 
diff_queue(_queued_diff, one, two);
 }
@@ -89,7 +89,7 @@ static int builtin_diff_b_f(struct rev_info *revs,
 
stuff_change(>diffopt,
 blob[0].mode, canon_mode(st.st_mode),
-blob[0].sha1, null_sha1,
+[0].oid, _oid,
 1, 0,
 path, path);
diffcore_std(>diffopt);
@@ -114,7 +114,7 @@ static int builtin_diff_blobs(struct rev_info *revs,
 
stuff_change(>diffopt,
 blob[0].mode, blob[1].mode,
-blob[0].sha1, blob[1].sha1,
+[0].oid, [1].oid,
 1, 1,
 blob[0].name, blob[1].name);
diffcore_std(>diffopt);
@@ -160,7 +160,7 @@ static int builtin_diff_tree(struct rev_info *revs,
 struct object_array_entry *ent0,
 struct object_array_entry *ent1)
 {
-   const unsigned char *(sha1[2]);
+   const struct object_id *(oid[2]);
int swap = 0;
 
if (argc > 1)
@@ -172,9 +172,9 @@ static int builtin_diff_tree(struct rev_info *revs,
 */
if (ent1->item->flags & UNINTERESTING)
swap = 1;
-   sha1[swap] = ent0->item->oid.hash;
-   sha1[1 - swap] = ent1->item->oid.hash;
-   diff_tree_sha1(sha1[0], sha1[1], "", >diffopt);
+   oid[swap] = >item->oid;
+   oid[1 - swap] = >item->oid;
+   diff_tree_sha1(oid[0]->hash, oid[1]->hash, "", >diffopt);
log_tree_diff_flush(revs);
return 0;
 }
@@ -408,7 +408,7 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
} else if (obj->type == OBJ_BLOB) {
if (2 <= blobs)
die(_("more than two blobs given: '%s'"), name);
-   hashcpy(blob[blobs].sha1, obj->oid.hash);
+   hashcpy(blob[blobs].oid.hash, obj->oid.hash);
blob[blobs].name = name;
blob[blobs].mode = entry->mode;
blobs++;


[PATCH v3 13/20] builtin/pull: convert to struct object_id

2017-03-30 Thread brian m. carlson
Convert virtually all uses of unsigned char [20] to struct object_id.
Leave all the arguments that come from struct sha1_array, as these will
be converted in a later patch.

Signed-off-by: brian m. carlson 
---
 builtin/pull.c | 72 +-
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index a9f7553f30..704ce1f042 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -515,7 +515,7 @@ static int run_fetch(const char *repo, const char 
**refspecs)
  * "Pulls into void" by branching off merge_head.
  */
 static int pull_into_void(const unsigned char *merge_head,
-   const unsigned char *curr_head)
+   const struct object_id *curr_head)
 {
/*
 * Two-way merge: we treat the index as based on an empty tree,
@@ -526,7 +526,7 @@ static int pull_into_void(const unsigned char *merge_head,
if (checkout_fast_forward(EMPTY_TREE_SHA1_BIN, merge_head, 0))
return 1;
 
-   if (update_ref("initial pull", "HEAD", merge_head, curr_head, 0, 
UPDATE_REFS_DIE_ON_ERR))
+   if (update_ref("initial pull", "HEAD", merge_head, curr_head->hash, 0, 
UPDATE_REFS_DIE_ON_ERR))
return 1;
 
return 0;
@@ -647,7 +647,7 @@ static const char *get_tracking_branch(const char *remote, 
const char *refspec)
  * current branch forked from its remote tracking branch. Returns 0 on success,
  * -1 on failure.
  */
-static int get_rebase_fork_point(unsigned char *fork_point, const char *repo,
+static int get_rebase_fork_point(struct object_id *fork_point, const char 
*repo,
const char *refspec)
 {
int ret;
@@ -678,7 +678,7 @@ static int get_rebase_fork_point(unsigned char *fork_point, 
const char *repo,
if (ret)
goto cleanup;
 
-   ret = get_sha1_hex(sb.buf, fork_point);
+   ret = get_oid_hex(sb.buf, fork_point);
if (ret)
goto cleanup;
 
@@ -691,24 +691,24 @@ static int get_rebase_fork_point(unsigned char 
*fork_point, const char *repo,
  * Sets merge_base to the octopus merge base of curr_head, merge_head and
  * fork_point. Returns 0 if a merge base is found, 1 otherwise.
  */
-static int get_octopus_merge_base(unsigned char *merge_base,
-   const unsigned char *curr_head,
+static int get_octopus_merge_base(struct object_id *merge_base,
+   const struct object_id *curr_head,
const unsigned char *merge_head,
-   const unsigned char *fork_point)
+   const struct object_id *fork_point)
 {
struct commit_list *revs = NULL, *result;
 
-   commit_list_insert(lookup_commit_reference(curr_head), );
+   commit_list_insert(lookup_commit_reference(curr_head->hash), );
commit_list_insert(lookup_commit_reference(merge_head), );
-   if (!is_null_sha1(fork_point))
-   commit_list_insert(lookup_commit_reference(fork_point), );
+   if (!is_null_oid(fork_point))
+   commit_list_insert(lookup_commit_reference(fork_point->hash), 
);
 
result = reduce_heads(get_octopus_merge_bases(revs));
free_commit_list(revs);
if (!result)
return 1;
 
-   hashcpy(merge_base, result->item->object.oid.hash);
+   oidcpy(merge_base, >item->object.oid);
return 0;
 }
 
@@ -717,16 +717,16 @@ static int get_octopus_merge_base(unsigned char 
*merge_base,
  * fork point calculated by get_rebase_fork_point(), runs git-rebase with the
  * appropriate arguments and returns its exit status.
  */
-static int run_rebase(const unsigned char *curr_head,
+static int run_rebase(const struct object_id *curr_head,
const unsigned char *merge_head,
-   const unsigned char *fork_point)
+   const struct object_id *fork_point)
 {
int ret;
-   unsigned char oct_merge_base[GIT_SHA1_RAWSZ];
+   struct object_id oct_merge_base;
struct argv_array args = ARGV_ARRAY_INIT;
 
-   if (!get_octopus_merge_base(oct_merge_base, curr_head, merge_head, 
fork_point))
-   if (!is_null_sha1(fork_point) && !hashcmp(oct_merge_base, 
fork_point))
+   if (!get_octopus_merge_base(_merge_base, curr_head, merge_head, 
fork_point))
+   if (!is_null_oid(fork_point) && !oidcmp(_merge_base, 
fork_point))
fork_point = NULL;
 
argv_array_push(, "rebase");
@@ -756,8 +756,8 @@ static int run_rebase(const unsigned char *curr_head,
argv_array_push(, "--onto");
argv_array_push(, sha1_to_hex(merge_head));
 
-   if (fork_point && !is_null_sha1(fork_point))
-   argv_array_push(, sha1_to_hex(fork_point));
+   if (fork_point && !is_null_oid(fork_point))
+   argv_array_push(, oid_to_hex(fork_point));
else
argv_array_push(, sha1_to_hex(merge_head));
 
@@ -770,8 +770,8 @@ int cmd_pull(int 

[PATCH v3 15/20] Make sha1_array_append take a struct object_id *

2017-03-30 Thread brian m. carlson
Convert the callers to pass struct object_id by changing the function
declaration and definition and applying the following semantic patch:

@@
expression E1, E2;
@@
- sha1_array_append(E1, E2.hash)
+ sha1_array_append(E1, )

@@
expression E1, E2;
@@
- sha1_array_append(E1, E2->hash)
+ sha1_array_append(E1, E2)

Signed-off-by: brian m. carlson 
---
 bisect.c   | 4 ++--
 builtin/cat-file.c | 4 ++--
 builtin/diff.c | 2 +-
 builtin/pack-objects.c | 4 ++--
 builtin/pull.c | 2 +-
 builtin/receive-pack.c | 6 +++---
 combine-diff.c | 2 +-
 connect.c  | 4 ++--
 fetch-pack.c   | 8 
 fsck.c | 2 +-
 parse-options-cb.c | 2 +-
 sha1-array.c   | 4 ++--
 sha1-array.h   | 2 +-
 sha1_name.c| 2 +-
 submodule.c| 6 +++---
 t/helper/test-sha1-array.c | 2 +-
 transport.c| 6 --
 17 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/bisect.c b/bisect.c
index ebaf7b05ba..886e630884 100644
--- a/bisect.c
+++ b/bisect.c
@@ -413,9 +413,9 @@ static int register_ref(const char *refname, const struct 
object_id *oid,
current_bad_oid = xmalloc(sizeof(*current_bad_oid));
oidcpy(current_bad_oid, oid);
} else if (starts_with(refname, good_prefix.buf)) {
-   sha1_array_append(_revs, oid->hash);
+   sha1_array_append(_revs, oid);
} else if (starts_with(refname, "skip-")) {
-   sha1_array_append(_revs, oid->hash);
+   sha1_array_append(_revs, oid);
}
 
strbuf_release(_prefix);
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 8b85cb8cf0..8fbb667170 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -413,7 +413,7 @@ static int batch_loose_object(const struct object_id *oid,
  const char *path,
  void *data)
 {
-   sha1_array_append(data, oid->hash);
+   sha1_array_append(data, oid);
return 0;
 }
 
@@ -422,7 +422,7 @@ static int batch_packed_object(const struct object_id *oid,
   uint32_t pos,
   void *data)
 {
-   sha1_array_append(data, oid->hash);
+   sha1_array_append(data, oid);
return 0;
 }
 
diff --git a/builtin/diff.c b/builtin/diff.c
index 398eee00d5..a5b34eb156 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -193,7 +193,7 @@ static int builtin_diff_combined(struct rev_info *revs,
if (!revs->dense_combined_merges && !revs->combine_merges)
revs->dense_combined_merges = revs->combine_merges = 1;
for (i = 1; i < ents; i++)
-   sha1_array_append(, ent[i].item->oid.hash);
+   sha1_array_append(, [i].item->oid);
diff_tree_combined(ent[0].item->oid.hash, ,
   revs->dense_combined_merges, revs);
sha1_array_clear();
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 16517f2637..dfeacd5c37 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2739,12 +2739,12 @@ static void record_recent_object(struct object *obj,
 const char *name,
 void *data)
 {
-   sha1_array_append(_objects, obj->oid.hash);
+   sha1_array_append(_objects, >oid);
 }
 
 static void record_recent_commit(struct commit *commit, void *data)
 {
-   sha1_array_append(_objects, commit->object.oid.hash);
+   sha1_array_append(_objects, >object.oid);
 }
 
 static void get_object_list(int ac, const char **av)
diff --git a/builtin/pull.c b/builtin/pull.c
index c007900ab5..183e377147 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -344,7 +344,7 @@ static void get_merge_heads(struct sha1_array *merge_heads)
continue;  /* invalid line: does not start with SHA1 */
if (starts_with(sb.buf + GIT_SHA1_HEXSZ, "\tnot-for-merge\t"))
continue;  /* ref is not-for-merge */
-   sha1_array_append(merge_heads, oid.hash);
+   sha1_array_append(merge_heads, );
}
fclose(fp);
strbuf_release();
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index d6097ee08c..2853ea0f99 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -842,7 +842,7 @@ static int update_shallow_ref(struct command *cmd, struct 
shallow_info *si)
if (si->used_shallow[i] &&
(si->used_shallow[i][cmd->index / 32] & mask) &&
!delayed_reachability_test(si, i))
-   sha1_array_append(, si->shallow->oid[i].hash);
+   sha1_array_append(, >shallow->oid[i]);
 
opt.env = tmp_objdir_env(tmp_objdir);
setup_alternate_shallow(_lock, _file, );
@@ -1546,7 +1546,7 @@ static struct 

Re: Issue with 2.11.0 and GIT_EXEC_PATH with multiple entries

2017-03-30 Thread Nate Mueller
Got it.  Thanks!

On Thu, Mar 30, 2017 at 4:32 PM, Jeff King  wrote:
> On Thu, Mar 30, 2017 at 04:00:41PM -0700, Nate Mueller wrote:
>
>> Really?  My config has been set this way for years and it's never
>> caused problems before.  I have subcommands in both of those
>> directories and all work.
>
> Really. It did happen to work most of the time before (because most uses
> involved just appending it to $PATH). But it was never intended to work
> with multiple paths. The:
>
>   . "$(git --exec-path)/git-sh-whatever"
>
> advice has been advertised in the documentation for years. E.g., see
> bd870878f (Documentation: don't assume git-sh-setup and git-parse-remote
> are in PATH, 2008-06-29). So even if we wanted to relax the rules in our
> scripts, it seems like a potential hazard for 3rd party scripts.
>
> The recommended way is to just put your ~/.git-exec into your $PATH.
>
> -Peff



-- 
Nate Mueller - Head of Engineering - RetailNext - 406-356-6283


Re: Issue with 2.11.0 and GIT_EXEC_PATH with multiple entries

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 04:00:41PM -0700, Nate Mueller wrote:

> Really?  My config has been set this way for years and it's never
> caused problems before.  I have subcommands in both of those
> directories and all work.

Really. It did happen to work most of the time before (because most uses
involved just appending it to $PATH). But it was never intended to work
with multiple paths. The:

  . "$(git --exec-path)/git-sh-whatever"

advice has been advertised in the documentation for years. E.g., see
bd870878f (Documentation: don't assume git-sh-setup and git-parse-remote
are in PATH, 2008-06-29). So even if we wanted to relax the rules in our
scripts, it seems like a potential hazard for 3rd party scripts.

The recommended way is to just put your ~/.git-exec into your $PATH.

-Peff


RE:

2017-03-30 Thread Mikhail Fridman



I have a Charitable Donation proposal for you. Reply to this email for
more information


Check the link below for confirmation:

http://www.ibtimes.co.uk/russias-second-wealthiest-man-mikhail-fridman-plans-leaving-14-2bn-fortune-charity-1561604

Best Regards
Mikhail Fridman.



Re: Issue with 2.11.0 and GIT_EXEC_PATH with multiple entries

2017-03-30 Thread Nate Mueller
Really?  My config has been set this way for years and it's never
caused problems before.  I have subcommands in both of those
directories and all work.

On Thu, Mar 30, 2017 at 3:57 PM, Junio C Hamano  wrote:
> Nate Mueller  writes:
>
>> This fails for me because my GIT_EXEC_PATH is set to
>> "/Library/Developer/CommandLineTools/usr/libexec/git-core:/Users/nate/.git-exec".
>
> That environment variable is designed to hold a single path, not
> like $PATH that lists multiple places in a colon separated list.



-- 
Nate Mueller - Head of Engineering - RetailNext - 406-356-6283


Re: Issue with 2.11.0 and GIT_EXEC_PATH with multiple entries

2017-03-30 Thread Junio C Hamano
Nate Mueller  writes:

> This fails for me because my GIT_EXEC_PATH is set to
> "/Library/Developer/CommandLineTools/usr/libexec/git-core:/Users/nate/.git-exec".

That environment variable is designed to hold a single path, not
like $PATH that lists multiple places in a colon separated list.


Re: /bin/bash: /usr/ucb/install: No such file or directory

2017-03-30 Thread Junio C Hamano
Jeffrey Walton  writes:

> I think this is the last of the issues for Git 2.12.2 on Solaris 11.3.
>
> It looks like 'install' is located in a few places, but not in
> '/usr/ucb'. I believe /usr/ucb is Solaris 9 or Solaris 10. I think the
> equivalent place to look on Solaris 11 is /usr/gnu (but I only have
> limited experience on Solaris).
>
> solaris:~$ find /usr -name install 2>/dev/null
> /usr/share/install
> /usr/dtrace/DTT/install
> /usr/sadm/install
> /usr/gnu/bin/install
> /usr/sbin/install
>
> solaris:~$ ls /usr/ucb
> /usr/ucb: No such file or directory
>
> Here's the default one based on default paths using Bash. I change the
> default shell, but not the default paths:
>
> solaris:~$ sudo su -
> Oracle Corporation  SunOS 5.11  11.3September 2015
> solaris:~# which install
> /sbin/install

$ make INSTALL=/usr/gnu/install

perhaps?  Look into config.mak.uname; it probably needs an update to
SunOS section, as we seem to only have entries up to 5.9.


RE:

2017-03-30 Thread Mikhail Fridman



I have a Charitable Donation proposal for you. Reply to this email for
more information


Check the link below for confirmation:

http://www.ibtimes.co.uk/russias-second-wealthiest-man-mikhail-fridman-plans-leaving-14-2bn-fortune-charity-1561604

Best Regards
Mikhail Fridman.



Issue with 2.11.0 and GIT_EXEC_PATH with multiple entries

2017-03-30 Thread Nate Mueller
I ran into this after upgrading to 2.11.0 through Xcode.  I assumed it
was a packaging issue but it looks like it's been in the mainline
since 1073094f30 (on October 29).

In 2.11.0, git-sh-setup switched it's call of git-sh-i18n from:

. git-sh-i18n

to:

. "$(git --exec-path)/git-sh-i18n"

This fails for me because my GIT_EXEC_PATH is set to
"/Library/Developer/CommandLineTools/usr/libexec/git-core:/Users/nate/.git-exec".
If I remove the second entry git-sh-setup works just fine.

Am I doing something wrong here?  I can't see what but I'm surprised
that I'm the first person to hit this.



-- 
Nate Mueller - Head of Engineering - RetailNext - 406-356-6283


/bin/bash: /usr/ucb/install: No such file or directory

2017-03-30 Thread Jeffrey Walton
I think this is the last of the issues for Git 2.12.2 on Solaris 11.3.

It looks like 'install' is located in a few places, but not in
'/usr/ucb'. I believe /usr/ucb is Solaris 9 or Solaris 10. I think the
equivalent place to look on Solaris 11 is /usr/gnu (but I only have
limited experience on Solaris).

solaris:~$ find /usr -name install 2>/dev/null
/usr/share/install
/usr/dtrace/DTT/install
/usr/sadm/install
/usr/gnu/bin/install
/usr/sbin/install

solaris:~$ ls /usr/ucb
/usr/ucb: No such file or directory

Here's the default one based on default paths using Bash. I change the
default shell, but not the default paths:

solaris:~$ sudo su -
Oracle Corporation  SunOS 5.11  11.3September 2015
solaris:~# which install
/sbin/install

Jeff



...
Writing MYMETA.yml and MYMETA.json
GEN git-add--interactive
GEN git-archimport
GEN git-cvsexportcommit
GEN git-cvsimport
GEN git-cvsserver
GEN git-send-email
GEN git-svn
SUBDIR git-gui
SUBDIR gitk-git
SUBDIR perl
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/GlobSpec.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Error.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/I18N.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Editor.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>"
blib/lib/Git/SVN/Memoize/YAML.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Fetcher.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Ra.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Utils.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Prompt.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Migration.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/IndexInfo.pm
"/usr/local/bin/perl5.24.0" -pe
"s<\Q++LOCALEDIR++\E>" blib/lib/Git/SVN/Log.pm
Manifying 9 pod documents
SUBDIR templates
/usr/ucb/install -d -m 755 '/usr/local/bin'
/bin/bash: /usr/ucb/install: No such file or directory
gmake: *** [install] Error 127


[PATCH v2] http.postbuffer: allow full range of ssize_t values

2017-03-30 Thread David Turner
Unfortunately, in order to push some large repos, the http postbuffer
must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
we just malloc a larger buffer.

Signed-off-by: David Turner 
---
 cache.h  |  1 +
 config.c | 17 +
 http.c   |  4 ++--
 http.h   |  2 +-
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index fbdf7a815a..5e6747dbb4 100644
--- a/cache.h
+++ b/cache.h
@@ -1900,6 +1900,7 @@ extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
 extern int64_t git_config_int64(const char *, const char *);
 extern unsigned long git_config_ulong(const char *, const char *);
+extern ssize_t git_config_ssize_t(const char *, const char *);
 extern int git_config_bool_or_int(const char *, const char *, int *);
 extern int git_config_bool(const char *, const char *);
 extern int git_config_maybe_bool(const char *, const char *);
diff --git a/config.c b/config.c
index 1a4d85537b..5ead880d5b 100644
--- a/config.c
+++ b/config.c
@@ -834,6 +834,15 @@ int git_parse_ulong(const char *value, unsigned long *ret)
return 1;
 }
 
+static ssize_t git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+   ssize_t tmp;
+   if (!git_parse_signed(value, , 
maximum_signed_value_of_type(ssize_t)))
+   return 0;
+   *ret = tmp;
+   return 1;
+}
+
 NORETURN
 static void die_bad_number(const char *name, const char *value)
 {
@@ -892,6 +901,14 @@ unsigned long git_config_ulong(const char *name, const 
char *value)
return ret;
 }
 
+ssize_t git_config_ssize_t(const char *name, const char *value)
+{
+   ssize_t ret;
+   if (!git_parse_ssize_t(value, ))
+   die_bad_number(name, value);
+   return ret;
+}
+
 int git_parse_maybe_bool(const char *value)
 {
if (!value)
diff --git a/http.c b/http.c
index 96d84bbed3..22f8167ba2 100644
--- a/http.c
+++ b/http.c
@@ -19,7 +19,7 @@ long int git_curl_ipresolve;
 #endif
 int active_requests;
 int http_is_verbose;
-size_t http_post_buffer = 16 * LARGE_PACKET_MAX;
+ssize_t http_post_buffer = 16 * LARGE_PACKET_MAX;
 
 #if LIBCURL_VERSION_NUM >= 0x070a06
 #define LIBCURL_CAN_HANDLE_AUTH_ANY
@@ -331,7 +331,7 @@ static int http_options(const char *var, const char *value, 
void *cb)
}
 
if (!strcmp("http.postbuffer", var)) {
-   http_post_buffer = git_config_int(var, value);
+   http_post_buffer = git_config_ssize_t(var, value);
if (http_post_buffer < LARGE_PACKET_MAX)
http_post_buffer = LARGE_PACKET_MAX;
return 0;
diff --git a/http.h b/http.h
index 02bccb7b0c..f7bd3b26b0 100644
--- a/http.h
+++ b/http.h
@@ -111,7 +111,7 @@ extern struct curl_slist *http_copy_default_headers(void);
 extern long int git_curl_ipresolve;
 extern int active_requests;
 extern int http_is_verbose;
-extern size_t http_post_buffer;
+extern ssize_t http_post_buffer;
 extern struct credential http_auth;
 
 extern char curl_errorstr[CURL_ERROR_SIZE];
-- 
2.11.GIT



Re: [PATCH] http.postbuffer: make a size_t

2017-03-30 Thread Junio C Hamano
David Turner  writes:

> Unfortunately, in order to push some large repos, the http postbuffer
> must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
> we just malloc a larger buffer.
>
> Signed-off-by: David Turner 
> ---
>  cache.h  |  1 +
>  config.c | 17 +
>  http.c   |  2 +-
>  3 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/cache.h b/cache.h
> index fbdf7a815a..a8c1b65db0 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1900,6 +1900,7 @@ extern int git_parse_maybe_bool(const char *);
>  extern int git_config_int(const char *, const char *);
>  extern int64_t git_config_int64(const char *, const char *);
>  extern unsigned long git_config_ulong(const char *, const char *);
> +extern size_t git_config_size_t(const char *, const char *);
>  extern int git_config_bool_or_int(const char *, const char *, int *);
>  extern int git_config_bool(const char *, const char *);
>  extern int git_config_maybe_bool(const char *, const char *);
> diff --git a/config.c b/config.c
> index 1a4d85537b..7b706cf27a 100644
> --- a/config.c
> +++ b/config.c
> @@ -834,6 +834,15 @@ int git_parse_ulong(const char *value, unsigned long 
> *ret)
>   return 1;
>  }
>  
> +static size_t git_parse_size_t(const char *value, unsigned long *ret)
> +{
> + size_t tmp;
> + if (!git_parse_signed(value, , 
> maximum_unsigned_value_of_type(size_t)))

I am getting these:

config.c:840:2: error: pointer targets in passing argument 2 of 
'git_parse_signed' differ in signedness [-Werror=pointer-sign]
  if (!git_parse_signed(value, , maximum_unsigned_value_of_type(size_t)))
  ^

config.c:753:12: note: expected 'intmax_t *' but argument is of type 'size_t *'
 static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
^

Changing "size_t tmp" to "intmax_t tmp" squelches it but the maximum
unsigned value of size_t type would probably overflow "intmax_t max"
which is signed, so...


Re: Git and PCRE2 vs PCRE?

2017-03-30 Thread Jeffrey Walton
On Thu, Mar 30, 2017 at 5:23 PM, Junio C Hamano  wrote:
> Jeffrey Walton  writes:
>
>> Is it possible to use PCRE2 with Git? If so, how do I tell Git to use PCRE2?
>
> Given that pcre2's symbols are all prefixed with pcre2_ (I only
> checked http://www.pcre.org/current/doc/html/pcre2api.html) and we
> do not see any hits from "git grep pcre2", I do not think you can
> just "configure" Git to use it.  Unless pcre2 library has a drop-in
> replacement backward compatibility mode with pcre library, that is.
>
> It probably is possible to use PCRE2 with Git by adding similar
> amount of code to grep.[ch] as we have support for pcre and that
> would be the way you tell Git to use PCRE2, but I think that is
> probably not the questino you are asking.

Ack, thanks Jeff and Junio. Its no big deal to me.

I'm not a PCRE user, so I'm not familiar with the extra gyrations
needed for the migration.

I'll get the original PCRE installed.

Thanks again.


Re: [PATCH v2] log: if --decorate is not given, default to --decorate=auto

2017-03-30 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Mar 30, 2017 at 11:03:51AM -0700, Junio C Hamano wrote:
>
>> With the "--decorate=auto" option becoming the default for "git
>> log", "git tbdiff" will be broken.
>> ...
> I'm confused. I thought "auto" would kick in only when we are outputting
> to a terminal. Or is the problem that the "is it a terminal" check is
> fooled by $GIT_PAGER_IN_USE, because you are running "git -p tbdiff"?

Interesting.  Yes, I do use 

[pager]
tbdiff

in my ~/.gitconfig file.

$ git tbdiff ..@{-1} @{-1}..

is one of the most frequently used commands in my ~/.bash_history
these days [*1*].  I by accident has been running the 'master'
version (not my private edition 'jch' that is a bit ahead of 'next')
for the past few weeks, and I just switched back to using the 'jch'
version so that I can say

$ git tbdiff ..- -..

instead, and that is when I noticed we broke "tbdiff".

> If so, this is the symptom of a more general problem, which is that
> a script outputting to a pager will have confused sub-processes, who do
> not know if their pipe is the pager one or not. Perhaps it is time to
> resurrect my patch from:
>
>   http://public-inbox.org/git/20150810052353.gb15...@sigill.intra.peff.net/
>
> I think it would need a Windows-specific variant, but the general idea
> is sound.

Yes, that might be necessary.


[Footnote]

*1* The general flow to accept a reroll of a topic "au/topic" goes
like this:

$ git checkout au/topic
$ git log master.. ;# to remind me what it was about
$ git checkout master... ;# to go back to the original base
$ Meta/CP ./+au-topic.mbox ;# run checkpatch
$ git am -s3c
$ git tbdiff ..@{-1} @{-1}..

Then if the initial N patches are identical, e.g. when the
output of tbdiff begins like this:

 1: f6d8dfd8b6 =  1: d681cf5ada do not check odb_mkstemp return value for 
errors
 2: 52dcad2c2e =  2: abf30edce4 odb_mkstemp: write filename into strbuf
 3: 033d6ae6cb =  3: 38fceca547 odb_mkstemp: use git_path_buf
 4: 55e3179076 !  4: 344267b632 diff: avoid fixed-size buffer for patch-ids
@@ ... @@

$ git rebase --onto 033d6ae6cb 38fceca547
$ git tbdiff ..@{-1} @{-1}..

That way, I can preserve the author and committer timestamps of
the earlier part that did not change.


Re: [PATCH v2] log: if --decorate is not given, default to --decorate=auto

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 11:03:51AM -0700, Junio C Hamano wrote:

> With the "--decorate=auto" option becoming the default for "git
> log", "git tbdiff" will be broken.
> 
> The configuration variable has been already there, so in that sense
> this is not a new breakage (tbdiff wouldn't have worked well for
> those with configured default).  A fix is trivial (attached).
> 
> I suspect that Alex's change may uncover similar breakages in
> people's scripts.  Perhaps the topic should be cooked a bit longer
> than other topics on 'next'?

I'm confused. I thought "auto" would kick in only when we are outputting
to a terminal. Or is the problem that the "is it a terminal" check is
fooled by $GIT_PAGER_IN_USE, because you are running "git -p tbdiff"?

If so, this is the symptom of a more general problem, which is that
a script outputting to a pager will have confused sub-processes, who do
not know if their pipe is the pager one or not. Perhaps it is time to
resurrect my patch from:

  http://public-inbox.org/git/20150810052353.gb15...@sigill.intra.peff.net/

I think it would need a Windows-specific variant, but the general idea
is sound.

-Peff

PS I've been running git-tbdiff occasionally for years with log.decorate
   set to "true". I wonder why I haven't noticed any problems.


Re: ttk error when starting git gui

2017-03-30 Thread David Shrader



On 03/30/2017 01:54 PM, Peter van der Does wrote:

On 3/30/17 1:01 PM, David Shrader wrote:

Hello,

I get the following error when trying to start git gui:

Error in startup script: wrong # args: should be "ttk::style theme use
theme"
 while executing
"ttk::style theme use"
 (procedure "ttext" line 4)
 invoked from within
"ttext $ui_workdir -background white -foreground black \
 -borderwidth 0 \
 -width 20 -height 10 \
 -wrap none \
 -takefocus 1 -highlightthickness 1\
 ..."
 (file
"/home/dshrader/opt/toss2/common/git/2.12.2/libexec/git-core/git-gui"
line 3190)

I get this error with the latest released version 2.12.2. Two older git
versions are also available on this system, and neither has this issue.
Those older versions are 1.7.1 and 2.3.3. I don't see a call to ttext in
those corresponding git-gui executables, so that is probably why they work.

Here are the steps to reproduce:

1) cd to existing git repository
2) run 'git gui' (no gui comes up, and the error is printed in the
terminal)

I'm running on a RHEL6 based system. Do I have an insufficient version
of whatever git gui uses for graphics in the later versions of git? When
I try 2.12.2 on my personal workstation running Fedora 25, I don't see
the same issue.

Thank you very much for your time,
David



It looks like the git gui needs TCL/TK 8.6.0 or higher. Since that
version the command 'ttk::style theme use' has been changed, which
allows the command to be run without an argument and then returning the
current theme used.
I believe RHEL6 use Tk-8.5.7 but I can't be 100% sure.


Yep, 8.5.7 is what I have on the RHEL6-based system. Thanks for the info!
David

--
David Shrader
HPC-ENV High Performance Computer Systems
Los Alamos National Lab
Email: dshrader  lanl.gov



[RFC/PATCH] Make “git tag --contains ” less chatty if is invalid (gsoc)

2017-03-30 Thread Varun Garg
Hi Git devs,

I am a student interested for Gsoc. With my patch I am able to produce 
following output.

    $ git tag --contains qq
    error: malformed object name qq

    $ git tag --contains HEAD qq
    fatal: --contains option is only allowed with -l.

    $ git tag --contains 5c5c16af33f3cba2f7ce905514c04c4e173b11dc
    error: no such commit 5c5c16af33f3cba2f7ce905514c04c4e173b11dc

    $ git rev-parse --verify qq
    fatal: Needed a single revision

This makes git less chatty, and solves the problem with  
https://public-inbox.org/git/20160323224113.gb12...@sigill.intra.peff.net/.

Here is the patch

$cat less_chatty.patch 
diff --git a/parse-options.c b/parse-options.c
index a23a1e67f..603c77c61 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -160,7 +160,7 @@ static int get_value(struct parse_opt_ctx_t *p,
             return (*opt->callback)(opt, NULL, 0) ? (-1) : 0;
         if (get_arg(p, opt, flags, ))
             return -1;
-        return (*opt->callback)(opt, arg, 0) ? (-1) : 0;
+        return (*opt->callback)(opt, arg, 0) ? (-3) : 0;
 
     case OPTION_INTEGER:
         if (unset) {
@@ -506,6 +506,8 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
             goto show_usage_error;
         case -2:
             goto unknown;
+    case -3:
+    return PARSE_OPT_HELP;
         }
         continue;
 unknown:

What I have done is, created a new case(-3) which is error but not to be 
displayed. In case 1, function usage_with_options_internal() is called which 
returns PARSE_OPT_HELP (and doesen't modify given parameters). So directly 
returning it from 3rd case should directly skip help text without affecting 
program behavior. This gave me the desired results.

However, when I run the tests, I get following behavior with cherry picking,


$ ./t3502-cherry-pick-merge.sh 
ok 1 - setup
not ok 2 - cherry-pick -m complains of bogus numbers
#    
#        # expect 129 here to distinguish between cases where
#        # there was nothing to cherry-pick
#        test_expect_code 129 git cherry-pick -m &&
#        test_expect_code 129 git cherry-pick -m foo b &&
#        test_expect_code 129 git cherry-pick -m -1 b &&
#        test_expect_code 129 git cherry-pick -m 0 b
#    
ok 3 - cherry-pick a non-merge with -m should fail

I have no idea where and why it is coming. (maybe it wants a more chatty git). 
Any help would be really appreciated since gsoc deadline is very close and I 
have to write a proposal.

Regards
Varun garg


Re: Git and PCRE2 vs PCRE?

2017-03-30 Thread Junio C Hamano
Jeffrey Walton  writes:

> Is it possible to use PCRE2 with Git? If so, how do I tell Git to use PCRE2?

Given that pcre2's symbols are all prefixed with pcre2_ (I only
checked http://www.pcre.org/current/doc/html/pcre2api.html) and we
do not see any hits from "git grep pcre2", I do not think you can
just "configure" Git to use it.  Unless pcre2 library has a drop-in
replacement backward compatibility mode with pcre library, that is.

It probably is possible to use PCRE2 with Git by adding similar
amount of code to grep.[ch] as we have support for pcre and that
would be the way you tell Git to use PCRE2, but I think that is
probably not the questino you are asking.


Re: Git and PCRE2 vs PCRE?

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 04:59:27PM -0400, Jeffrey Walton wrote:

> Configure has an option for libpcre, but its not clear to me how to
> fine tune it for libpcre2:
> 
> $ ./configure --help | /usr/gnu/bin/grep -A 2 -i pcre
>   --with-libpcre  support Perl-compatible regexes (default is NO)
>   ARG can be also prefix for libpcre library and
>   headers
> 
> Is it possible to use PCRE2 with Git? If so, how do I tell Git to use PCRE2?

No, I don't think so. According to the release announcement for pcre2,
the API is not compatible with the original:

  https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html

So even if you could tweak the build to use it, it would not do the
right thing. Somebody would have to add code for the new library, and
probably have a USE_LIBPCRE2 flag.

I don't know offhand if there are compelling reasons to do that work.

-Peff


[PATCH v2] read-cache: avoid using git_path() in freshen_shared_index()

2017-03-30 Thread Christian Couder
When performing an interactive rebase in split-index mode,
the commit message that one should rework when squashing commits
can contain some garbage instead of the usual concatenation of
both of the commit messages.

Bisecting shows that c3a0082502 (read-cache: use
freshen_shared_index() in read_index_from(), 2017-03-06) is involved,
which points to the unsafe use of git_path() in
freshen_shared_index().

Signed-off-by: Christian Couder 
---
The only changes in this v2 compared to the previous version are
in the title and commit message, which have been changed according
to:

http://public-inbox.org/git/CAP8UFD29xWU5bHwewUSq26pVTwK-gy3uSVG2aGTCrDak3620=a...@mail.gmail.com/

 read-cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index e447751823..2f10242c24 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1682,9 +1682,10 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
  */
 static void freshen_shared_index(char *base_sha1_hex, int warn)
 {
-   const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+   char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
if (!check_and_freshen_file(shared_index, 1) && warn)
warning("could not freshen shared index '%s'", shared_index);
+   free(shared_index);
 }
 
 int read_index_from(struct index_state *istate, const char *path)
-- 
2.12.0.rc0.23.g4ed838fb8d



Git and PCRE2 vs PCRE?

2017-03-30 Thread Jeffrey Walton
Some more 2.12.2 testing on Solaris 11.3 x86_64:

...
CC ident.o
CC kwset.o
CC line-log.o
CC levenshtein.o
CC line-range.o
CC list-objects.o
In file included from revision.h:5:0,
 from line-log.c:10:
grep.h:5:18: fatal error: pcre.h: No such file or directory
 #include 
  ^
compilation terminated.
gmake: *** [line-log.o] Error 1


Configure has an option for libpcre, but its not clear to me how to
fine tune it for libpcre2:

$ ./configure --help | /usr/gnu/bin/grep -A 2 -i pcre
  --with-libpcre  support Perl-compatible regexes (default is NO)
  ARG can be also prefix for libpcre library and
  headers

Is it possible to use PCRE2 with Git? If so, how do I tell Git to use PCRE2?

Thanks in advance.

**

$ ls /usr/local/include/pcre*
/usr/local/include/pcre2.h   /usr/local/include/pcre2posix.h

$ ls /usr/local/lib64/libpcre*
/usr/local/lib64/libpcre2-16.a
/usr/local/lib64/libpcre2-16.la
/usr/local/lib64/libpcre2-16.so
/usr/local/lib64/libpcre2-16.so.0
/usr/local/lib64/libpcre2-16.so.0.5.0
/usr/local/lib64/libpcre2-32.a
/usr/local/lib64/libpcre2-32.la
/usr/local/lib64/libpcre2-32.so
/usr/local/lib64/libpcre2-32.so.0
/usr/local/lib64/libpcre2-32.so.0.5.0
/usr/local/lib64/libpcre2-8.a
/usr/local/lib64/libpcre2-8.la
/usr/local/lib64/libpcre2-8.so
/usr/local/lib64/libpcre2-8.so.0
/usr/local/lib64/libpcre2-8.so.0.5.0
/usr/local/lib64/libpcre2-posix.a
/usr/local/lib64/libpcre2-posix.la
/usr/local/lib64/libpcre2-posix.so
/usr/local/lib64/libpcre2-posix.so.1
/usr/local/lib64/libpcre2-posix.so.1.0.1


RE: [PATCH] http.postbuffer: make a size_t

2017-03-30 Thread David Turner
GitLab.  I can't speak to our particular configuration of it -- but if you have 
a specific question about what the config is, I can ask our gitlab admins.

> -Original Message-
> From: Shawn Pearce [mailto:spea...@spearce.org]
> Sent: Thursday, March 30, 2017 4:42 PM
> To: David Turner 
> Cc: git 
> Subject: Re: [PATCH] http.postbuffer: make a size_t
> 
> On Thu, Mar 30, 2017 at 1:29 PM, David Turner 
> wrote:
> > Unfortunately, in order to push some large repos, the http postbuffer
> > must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
> > we just malloc a larger buffer.
> 
> I'm slightly curious what server you are pushing to that needs the entire 
> thing
> buffered to compute a Content-Length, rather than using
> Transfer-Encoding: chunked. Most Git-over-HTTP should be accepting
> Transfer-Encoding: chunked when the stream exceeds postBuffer.


Re: git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'

2017-03-30 Thread Jeffrey Walton
On Thu, Mar 30, 2017 at 4:30 PM, Junio C Hamano  wrote:
> Jeffrey Walton  writes:
>
>> On Wed, Mar 29, 2017 at 1:11 PM, Junio C Hamano  wrote:
>>> Jeffrey Walton  writes:
>>>
 Some more 2.12.2 testing on Solaris 11.3 x86_64:

 $ make V=1
 gcc -o credential-store.o -c -MF ./.depend/credential-store.o.d -MQ
 credential-store.o -MMD -MP -I/usr/local/include -m64 -m64 -I.
 -D__EXTENSIONS__ -D__sun__ -DUSE_LIBPCRE -I/usr/local/include
 -DHAVE_ALLOCA_H -I/usr/local/include -DUSE_CURL_FOR_IMAP_SEND
 -I/usr/local/include -I/usr/local/include -DNO_D_TYPE_IN_DIRENT
 -DNO_INET_NTOP -DNO_INET_PTON  -DHAVE_PATHS_H -DHAVE_LIBCHARSET_H
>>>
>>> Looking at config.mak.uname, nothing in SunOS section seems to set
>>> NO_INET_NTOP or NO_INET_PTON.  Why is your build setting them?
>>
>> Thanks. It looks like the following is the culprit (from config.log).
>> Am I supposed to specify the socket library, or is Autotools supposed
>> to specify it?
>>
>> To date, I've been specify the libraries I request, like IDN2, PCRE,
>> cURL and OpenSSL.
>>
>> I don't recall specifying a socket library in the past, so I'm not
>> sure what is supposed to happen here.
>
> I'd be hated for saying this by many people, but my suspicion is
> that those who build Git are often better off ignoring the autoconf
> part of the build procedure, as it is my impression that ./configure
> we ship is not as well maintained as the Makefile.
>
> Looking through the first part of Makefile, I notice that INET_NTOP
> and INET_PTON bits are not described; we probably should add these
> two close to where we say "Define NO_IPV6 if you lack IPv6 support".

Lol... I avoid Autotools when I can, too.

In this case, since Git supplies Autotools and the fix looks easy,
maybe -lnsl -lsockets is the path to chose on Solaris. It ensures
things "just work" under the common case. It will avoid future
problems, and keep the mailing list dark and silent.

One other thing I try to engineer around... folks usually don't read
README's. I fell if README's were going to work, then it would have
happened by now in the last 40 or 50 years. Since the indicators tell
me they don't work, I don't depend on them.

I guess another option is to remove Autotools. I'm fine with using GNU
Make as the primary build system. But like you said, other folks may
want to tar and feather you for it.

Jeff


Re: ttk error when starting git gui

2017-03-30 Thread Dennis Kaarsemaker
On Thu, 2017-03-30 at 15:54 -0400, Peter van der Does wrote:

> It looks like the git gui needs TCL/TK 8.6.0 or higher. Since that
> version the command 'ttk::style theme use' has been changed, which
> allows the command to be run without an argument and then returning the
> current theme used.
> I believe RHEL6 use Tk-8.5.7 but I can't be 100% sure.

It does.

$ rpm -q tk
tk-8.5.7-5.el6.x86_64

D.


Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread

2017-03-30 Thread Junio C Hamano
Jeff King  writes:

> Still, I'm not sure the extra layer of cache is all that valuable. It
> should be a single hash lookup in the config cache (in an operation that
> otherwise reads the entire index).

OK, let's drop that part, then.


Re: [PATCH] http.postbuffer: make a size_t

2017-03-30 Thread Shawn Pearce
On Thu, Mar 30, 2017 at 1:29 PM, David Turner  wrote:
> Unfortunately, in order to push some large repos, the http postbuffer
> must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
> we just malloc a larger buffer.

I'm slightly curious what server you are pushing to that needs the
entire thing buffered to compute a Content-Length, rather than using
Transfer-Encoding: chunked. Most Git-over-HTTP should be accepting
Transfer-Encoding: chunked when the stream exceeds postBuffer.


Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 09:06:48PM +0100, Thomas Gummerer wrote:

> > Yeah, I think that would be fine. You _could_ write a t/perf test and
> > then use your 400MB monstrosity as GIT_PERF_LARGE_REPO. But given that
> > most people don't have such a thing, there's not much value over you
> > just showing off the perf improvement in the commit message.
> 
> Sorry if this was already discussed, but we already do have a perf
> test for the index (p0002), and a corresponding helper program which
> just does read_cache() and discard_cache().  Maybe we could re-use
> that and add a second test running the same using the new config?

Oh, indeed. Yes, I would think the results of p0002 would probably show
off Jeff's improvements.

-Peff


Re: git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'

2017-03-30 Thread Junio C Hamano
Jeffrey Walton  writes:

> On Wed, Mar 29, 2017 at 1:11 PM, Junio C Hamano  wrote:
>> Jeffrey Walton  writes:
>>
>>> Some more 2.12.2 testing on Solaris 11.3 x86_64:
>>>
>>> $ make V=1
>>> gcc -o credential-store.o -c -MF ./.depend/credential-store.o.d -MQ
>>> credential-store.o -MMD -MP -I/usr/local/include -m64 -m64 -I.
>>> -D__EXTENSIONS__ -D__sun__ -DUSE_LIBPCRE -I/usr/local/include
>>> -DHAVE_ALLOCA_H -I/usr/local/include -DUSE_CURL_FOR_IMAP_SEND
>>> -I/usr/local/include -I/usr/local/include -DNO_D_TYPE_IN_DIRENT
>>> -DNO_INET_NTOP -DNO_INET_PTON  -DHAVE_PATHS_H -DHAVE_LIBCHARSET_H
>>
>> Looking at config.mak.uname, nothing in SunOS section seems to set
>> NO_INET_NTOP or NO_INET_PTON.  Why is your build setting them?
>
> Thanks. It looks like the following is the culprit (from config.log).
> Am I supposed to specify the socket library, or is Autotools supposed
> to specify it?
>
> To date, I've been specify the libraries I request, like IDN2, PCRE,
> cURL and OpenSSL.
>
> I don't recall specifying a socket library in the past, so I'm not
> sure what is supposed to happen here.

I'd be hated for saying this by many people, but my suspicion is
that those who build Git are often better off ignoring the autoconf
part of the build procedure, as it is my impression that ./configure
we ship is not as well maintained as the Makefile.  

Looking through the first part of Makefile, I notice that INET_NTOP
and INET_PTON bits are not described; we probably should add these
two close to where we say "Define NO_IPV6 if you lack IPv6 support".



[PATCH] http.postbuffer: make a size_t

2017-03-30 Thread David Turner
Unfortunately, in order to push some large repos, the http postbuffer
must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
we just malloc a larger buffer.

Signed-off-by: David Turner 
---
 cache.h  |  1 +
 config.c | 17 +
 http.c   |  2 +-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index fbdf7a815a..a8c1b65db0 100644
--- a/cache.h
+++ b/cache.h
@@ -1900,6 +1900,7 @@ extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
 extern int64_t git_config_int64(const char *, const char *);
 extern unsigned long git_config_ulong(const char *, const char *);
+extern size_t git_config_size_t(const char *, const char *);
 extern int git_config_bool_or_int(const char *, const char *, int *);
 extern int git_config_bool(const char *, const char *);
 extern int git_config_maybe_bool(const char *, const char *);
diff --git a/config.c b/config.c
index 1a4d85537b..7b706cf27a 100644
--- a/config.c
+++ b/config.c
@@ -834,6 +834,15 @@ int git_parse_ulong(const char *value, unsigned long *ret)
return 1;
 }
 
+static size_t git_parse_size_t(const char *value, unsigned long *ret)
+{
+   size_t tmp;
+   if (!git_parse_signed(value, , 
maximum_unsigned_value_of_type(size_t)))
+   return 0;
+   *ret = tmp;
+   return 1;
+}
+
 NORETURN
 static void die_bad_number(const char *name, const char *value)
 {
@@ -892,6 +901,14 @@ unsigned long git_config_ulong(const char *name, const 
char *value)
return ret;
 }
 
+size_t git_config_size_t(const char *name, const char *value)
+{
+   unsigned long ret;
+   if (!git_parse_size_t(value, ))
+   die_bad_number(name, value);
+   return ret;
+}
+
 int git_parse_maybe_bool(const char *value)
 {
if (!value)
diff --git a/http.c b/http.c
index 96d84bbed3..ab6080835f 100644
--- a/http.c
+++ b/http.c
@@ -331,7 +331,7 @@ static int http_options(const char *var, const char *value, 
void *cb)
}
 
if (!strcmp("http.postbuffer", var)) {
-   http_post_buffer = git_config_int(var, value);
+   http_post_buffer = git_config_size_t(var, value);
if (http_post_buffer < LARGE_PACKET_MAX)
http_post_buffer = LARGE_PACKET_MAX;
return 0;
-- 
2.11.GIT



Re: git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'

2017-03-30 Thread Jeffrey Walton
On Thu, Mar 30, 2017 at 4:06 PM, Jeffrey Walton  wrote:
> On Wed, Mar 29, 2017 at 1:11 PM, Junio C Hamano  wrote:
>> Jeffrey Walton  writes:
>>
>>> Some more 2.12.2 testing on Solaris 11.3 x86_64:
>>>
>>> $ make V=1
>>> gcc -o credential-store.o -c -MF ./.depend/credential-store.o.d -MQ
>>> credential-store.o -MMD -MP -I/usr/local/include -m64 -m64 -I.
>>> -D__EXTENSIONS__ -D__sun__ -DUSE_LIBPCRE -I/usr/local/include
>>> -DHAVE_ALLOCA_H -I/usr/local/include -DUSE_CURL_FOR_IMAP_SEND
>>> -I/usr/local/include -I/usr/local/include -DNO_D_TYPE_IN_DIRENT
>>> -DNO_INET_NTOP -DNO_INET_PTON  -DHAVE_PATHS_H -DHAVE_LIBCHARSET_H
>>
>> Looking at config.mak.uname, nothing in SunOS section seems to set
>> NO_INET_NTOP or NO_INET_PTON.  Why is your build setting them?
>
> Thanks. It looks like the following is the culprit (from config.log).
> Am I supposed to specify the socket library, or is Autotools supposed
> to specify it?
>
> To date, I've been specify the libraries I request, like IDN2, PCRE,
> cURL and OpenSSL.
>
> I don't recall specifying a socket library in the past, so I'm not
> sure what is supposed to happen here.

It looks like adding -lnsl and -lsocket clears the issue.

Maybe Git on Solaris should test with both of the libraries, and not
just -lsockets.

Jeff


Re: [PATCH] Docs: Add some missing options to git-diff.txt

2017-03-30 Thread Junio C Hamano
Andreas Heiduk  writes:

> git-diff understands "--ours", "--theirs" and "--base" for files with
> conflicts. But so far they were not documented for the central diff
> command but only for diff-files.

This is probably a shared issue with the original text for
"diff-files", but I think we must stress that these options make
sense only when you are in the middle of conflict resolution.  

In addition, unlike "diff-files", if these were to appear in the
general "git diff" documentation, it also must stress that these
options are only about comparing the index and the working tree
files, e.g. "git diff --ours HEAD^ HEAD" does not make sense.



Re: ttk error when starting git gui

2017-03-30 Thread Peter van der Does
On 3/30/17 1:01 PM, David Shrader wrote:
> Hello,
> 
> I get the following error when trying to start git gui:
> 
> Error in startup script: wrong # args: should be "ttk::style theme use
> theme"
> while executing
> "ttk::style theme use"
> (procedure "ttext" line 4)
> invoked from within
> "ttext $ui_workdir -background white -foreground black \
> -borderwidth 0 \
> -width 20 -height 10 \
> -wrap none \
> -takefocus 1 -highlightthickness 1\
> ..."
> (file
> "/home/dshrader/opt/toss2/common/git/2.12.2/libexec/git-core/git-gui"
> line 3190)
> 
> I get this error with the latest released version 2.12.2. Two older git
> versions are also available on this system, and neither has this issue.
> Those older versions are 1.7.1 and 2.3.3. I don't see a call to ttext in
> those corresponding git-gui executables, so that is probably why they work.
> 
> Here are the steps to reproduce:
> 
> 1) cd to existing git repository
> 2) run 'git gui' (no gui comes up, and the error is printed in the
> terminal)
> 
> I'm running on a RHEL6 based system. Do I have an insufficient version
> of whatever git gui uses for graphics in the later versions of git? When
> I try 2.12.2 on my personal workstation running Fedora 25, I don't see
> the same issue.
> 
> Thank you very much for your time,
> David
> 


It looks like the git gui needs TCL/TK 8.6.0 or higher. Since that
version the command 'ttk::style theme use' has been changed, which
allows the command to be run without an argument and then returning the
current theme used.
I believe RHEL6 use Tk-8.5.7 but I can't be 100% sure.

-- 
Peter van der Does

Facebook : https://www.facebook.com/petervanderdoes
Twitter  : https://twitter.com/petervanderdoes
GitHub   : https://github.com/petervanderdoes
About Me : https://about.me/petervanderdoes


Re: [PATCH v4 0/8] refactor the filter process code into a reusable module

2017-03-30 Thread Junio C Hamano
Now 1 & 2 are as equally pleasant to read as others ;-).

Let's wait for a few days and then merge to 'next'.  I didn't see
anything questionable.

Thanks.


Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread

2017-03-30 Thread Thomas Gummerer
On 03/28, Jeff King wrote:
> On Tue, Mar 28, 2017 at 03:50:34PM -0400, Jeff Hostetler wrote:
> 
> > It was a convenient way to isolate, average, and compare
> > read_index() times, but I suppose we could do something
> > like that.
> > 
> > I did confirm that a ls-files does show a slight 0.008
> > second difference on the 58K file Linux tree when toggled
> > on or off.
> 
> Yeah, I agree it helps isolate the change. I'm just not sure we want to
> carry a bunch of function-specific perf-testing code. And one of the
> nice things about testing a real command is that it's...a real command.
> So it's an actual improvement a user might see.
> 
> > But I'm tempted to suggest that we just omit my helper exe
> > and not worry about a test -- since we don't have any test
> > repos large enough to really demonstrate the differences.
> > My concern is that that 0.008 would be lost in the noise
> > of the rest of the test and make for an unreliable result.
> 
> Yeah, I think that would be fine. You _could_ write a t/perf test and
> then use your 400MB monstrosity as GIT_PERF_LARGE_REPO. But given that
> most people don't have such a thing, there's not much value over you
> just showing off the perf improvement in the commit message.

Sorry if this was already discussed, but we already do have a perf
test for the index (p0002), and a corresponding helper program which
just does read_cache() and discard_cache().  Maybe we could re-use
that and add a second test running the same using the new config?

> We could also have a t/perf test that generates a monstrous index and
> shows that it's faster. But frankly, I don't think this is all that
> interesting as a performance regression test. It's not like there's
> something subtle about the performance improvement; we stopped computing
> the SHA-1, and (gasp!) it takes exactly one SHA-1 computation's less
> time.
> 
> So just mentioning the test case and the improvement in the commit
> message is sufficient, IMHO.
> 
> -Peff


Re: git-compat-util.h:735:13: error: conflicting types for 'inet_ntop'

2017-03-30 Thread Jeffrey Walton
On Wed, Mar 29, 2017 at 1:11 PM, Junio C Hamano  wrote:
> Jeffrey Walton  writes:
>
>> Some more 2.12.2 testing on Solaris 11.3 x86_64:
>>
>> $ make V=1
>> gcc -o credential-store.o -c -MF ./.depend/credential-store.o.d -MQ
>> credential-store.o -MMD -MP -I/usr/local/include -m64 -m64 -I.
>> -D__EXTENSIONS__ -D__sun__ -DUSE_LIBPCRE -I/usr/local/include
>> -DHAVE_ALLOCA_H -I/usr/local/include -DUSE_CURL_FOR_IMAP_SEND
>> -I/usr/local/include -I/usr/local/include -DNO_D_TYPE_IN_DIRENT
>> -DNO_INET_NTOP -DNO_INET_PTON  -DHAVE_PATHS_H -DHAVE_LIBCHARSET_H
>
> Looking at config.mak.uname, nothing in SunOS section seems to set
> NO_INET_NTOP or NO_INET_PTON.  Why is your build setting them?

Thanks. It looks like the following is the culprit (from config.log).
Am I supposed to specify the socket library, or is Autotools supposed
to specify it?

To date, I've been specify the libraries I request, like IDN2, PCRE,
cURL and OpenSSL.

I don't recall specifying a socket library in the past, so I'm not
sure what is supposed to happen here.

Thanks in advance.

**

It was created by git configure 2.12.2, which was
generated by GNU Autoconf 2.68.  Invocation command line was

  $ ./configure --enable-pthreads --with-lib=/usr/local/lib64
--with-openssl=/usr/local --with-curl=/usr/local
--with-libpcre=/usr/local --with-zlib=/usr/local
--with-perl=/usr/local/bin/perl --prefix=/usr/local
...

configure:5552: checking for inet_ntop
configure:5552: gcc -o conftest -m64 -I/usr/local/include -m64
-Wl,-rpath,/usr/local/lib64 -L/usr/local/lib64 conftest.c -lidn2
-lcurl -lssl -lcrypto -lz -ldl -lpthread -lsocket >&5
Undefinedfirst referenced
 symbol  in file
inet_ntop   /var/tmp//cc2WaWwg.o  (symbol
belongs to implicit dependency /lib/amd64/libnsl.so.1)
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
configure:5552: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "git"
| #define PACKAGE_TARNAME "git"
| #define PACKAGE_VERSION "2.12.2"
| #define PACKAGE_STRING "git 2.12.2"
| #define PACKAGE_BUGREPORT "git@vger.kernel.org"
| #define PACKAGE_URL ""
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_ALLOCA_H 1
| #define HAVE_ALLOCA 1
| /* end confdefs.h.  */
| /* Define inet_ntop to an innocuous variant, in case 
declares inet_ntop.
|For example, HP-UX 11i  declares gettimeofday.  */
| #define inet_ntop innocuous_inet_ntop
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char inet_ntop (); below.
| Prefer  to  if __STDC__ is defined, since
|  exists even on freestanding compilers.  */
|
| #ifdef __STDC__
| # include 
| #else
| # include 
| #endif
|
| #undef inet_ntop
|
| /* Override any GCC internal prototype to avoid an error.
|Use char because int might match the return type of a GCC
|builtin and then its argument prototype would still apply.  */
| #ifdef __cplusplus
| extern "C"
| #endif
| char inet_ntop ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS.  Some functions are actually named
| something starting with __ and the normal name is an alias.  */
| #if defined __stub_inet_ntop || defined __stub___inet_ntop
| choke me
| #endif
|
| int
| main ()
| {
| return inet_ntop ();
|   ;
|   return 0;
| }
configure:5552: result: no
configure:5556: checking for inet_ntop in -lresolv
configure:5581: gcc -o conftest -m64 -I/usr/local/include -m64
-Wl,-rpath,/usr/local/lib64 -L/usr/local/lib64 conftest.c -lresolv
-lidn2 -lcurl -lssl -lcrypto -lz -ldl -lpthread -lsocket >&5
Undefinedfirst referenced
 symbol  in file
inet_ntop   /var/tmp//ccZYayyg.o  (symbol
belongs to implicit dependency /lib/amd64/libnsl.so.1)
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
configure:5581: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "git"
| #define PACKAGE_TARNAME "git"
| #define PACKAGE_VERSION "2.12.2"
| #define PACKAGE_STRING "git 2.12.2"
| #define PACKAGE_BUGREPORT "git@vger.kernel.org"
| #define PACKAGE_URL ""
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_ALLOCA_H 1
| #define HAVE_ALLOCA 1
| /* end confdefs.h.  */
|
| /* Override any GCC internal prototype to avoid an error.
|Use char because int might match the return type of a GCC
|builtin and then its argument prototype 

Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 09:35:27PM +0200, Jakub Narębski wrote:

> And everything would be all right... if not the fact that Git appends
> spurious ^M to added lines in the `git diff` output.  Files use CRLF
> end-of-line convention (the native MS Windows one).
> 
>   $ git diff test.tex
>   diff --git a/test.tex b/test.tex
>   index 029646e..250ab16 100644
>   --- a/test.tex
>   +++ b/test.tex
>   @@ -1,4 +1,4 @@
>   -\documentclass{article}
>   +\documentclass{mwart}^M
>   
>\usepackage[cp1250]{inputenc}
>\usepackage{polski}
> 
> What gives?  Why there is this ^M tacked on the end of added lines,
> while it is not present in deleted lines, nor in content lines?

Perhaps it's trailing whitespace highlighting for added lines? You can
add "cr-at-eol" to core.whitespace to suppress it.

I suspect in the normal case that git is doing line-ending conversion,
but it's suppressed when textconv is in use.

-Peff


Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 12:49:15PM -0700, Junio C Hamano wrote:

> Notable suggested changes I have in this one are:
> 
>  * I stole the numbers from the cover letter of v2 and added them at
>the end of the log message.

Yeah, good.

>  * As the checksum is not a useless relic, but is an integrity
>check, I dropped the "ancient relic" from the proposed log
>message.  It is just that the modern disks are reliable enough to
>make it worthwhile to think about a trade-off this patch makes
>between performance and integrity.

Yeah, I'd agree this is more of a tradeoff than a cleanup.

>  * As it is customary, the configuration variable starts as an opt
>in feature.  In a few releases, perhaps we can flip the default,
>but we do not do so from day one.

I think this is so user-invisible that it doesn't really matter much,
but I'm OK with taking it slow.

>  * Updated the code around the call to config-get-bool to avoid
>asking the same question twice.

A few comments on that below.

>  * Added minimum documentation.

Yep, looks good.

> By the way, are we sure we have something that validates the
> checksum regardless of the configuration setting?  If not, we may
> want to tweak this further so that we can force the validation from
> "git fsck" or something.  I am not going to do that myself, but it
> may be necessary before this graduates to 'master'.

Yes, I'd agree this shouldn't graduate without the matching change to
teach fsck to flip the switch back.

> + static int do_checksum = -1;
> [...]
> + if (do_checksum < 0) {
> + /*
> +  * Since we run very early in command startup, git_config()
> +  * may not have been called yet and the various "core_*"
> +  * global variables haven't been set.  So look it up
> +  * explicitly.
> +  */
> + git_config_get_bool("core.checksumindex", _checksum);
> + if (do_checksum < 0)
> + do_checksum = 0; /* default to false */
> + }
> + if (!do_checksum)
> + return 0;

This is basically introducing a new caching layer, but there's no way to
invalidate it (if, say, we looked at the config once and then we knew
that the config changed).  I think it's probably OK, because the main
reason for the config to change is reading it before/after repository
setup. But since this is index code, it shouldn't be called at all
before repository setup.

Still, I'm not sure the extra layer of cache is all that valuable. It
should be a single hash lookup in the config cache (in an operation that
otherwise reads the entire index).

-Peff


[PATCH] Docs: Add some missing options to git-diff.txt

2017-03-30 Thread Andreas Heiduk
git-diff understands "--ours", "--theirs" and "--base" for files with
conflicts. But so far they were not documented for the central diff
command but only for diff-files.

Signed-off-by: Andreas Heiduk 
---
 Documentation/git-diff.txt | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index bbab35f..91ced4f 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -97,6 +97,14 @@ OPTIONS
 :git-diff: 1
 include::diff-options.txt[]
 
+-1 --base::
+-2 --ours::
+-3 --theirs::
+-0::
+   Diff against the "base" version, "our branch" or "their
+   branch" respectively. The option -0 can be given to omit diff
+   output for unmerged entries and just show "Unmerged".
+
 ...::
The  parameters, when given, are used to limit
the diff to the named paths (you can give directory
-- 
2.7.4



Re: [ANNOUNCE] Git for Windows 2.12.2

2017-03-30 Thread Mike Rappazzo
Forwarding to the lists, as my original message was rejected for html.

On Thu, Mar 30, 2017 at 3:44 PM, Andrew Witte  wrote:
> Just updated back to git 2.12.2 and git-lfs 2.0.2 and everything worked
> fine. Wish I could have gotten more info when it happened as its happened on
> a different computer as well. Will keep an eye out.
>
> Also another note that I really don't like with Windows for Git since 2.12
> is that It packages git-lfs with it. When I use the cmd it overrides the
> other git-lfs install I have. I have to manually go and remove the old
> git-lfs file in "program files" for things to work correctly.
>
> On top of this git-lfs needs to be registered in the environment vars
> because this is what the main git-lfs install does and apps Iv'e made like
> Git-It-GUI (https://github.com/reignstudios/Git-It-GUI) invoke git-lfs
> directly for some stuff. Because of this issue, the app will think a newer
> version is installed thats different from what the normal git cmd reports.
> Also doing git clone outside of the windows cmd with only git for windows
> installed doesn't invoke git-lfs correctly as its not registered in the
> system environment vars.  In short I don't think it should be shipped with
> the installer as it just creates confusion.
>
> On Thu, Mar 30, 2017 at 8:41 AM, Michael Rappazzo 
> wrote:
>>
>> I suspect that this is a problem in the windows credential manager.  I
>> tried this on:
>>   - git 2.12.2.windows.1 => failure
>>   - git 2.12.1.windows.1 => success
>>
>> More Details:
>> I have a perl script which uses (a copy of Git.pm) to invoke the
>> credential manager.  While debugging that script, I dumped the hash that I
>> read from the credential manager:
>>
>> $git->credential($cred, 'fill');
>> print Data::Dumper->Dump( [ $cred ] , [ "cred" ] );
>>
>> In 2.12.2, this produces output like this:
>>
>> $cred = {
>>   'path' => '',
>>   'protocol' => 'https',
>>   'username' => '',
>>   'host' => 'some.host.com',
>>   'password' => ''
>> };
>>
>> In 2.12.1, this produces output like this:
>>
>> $cred = {
>>   'path' => '',
>>   'host' => 'some.host.com',
>>   'protocol' => 'https',
>>   'password' => 'my.password',
>>   'username' => 'mrappazzo'
>> };
>>
>> While debugging this, I did something to get it to work on 2.12.2.  After
>> downgrading to 2.12.1, I manually removed the credentials from Credential
>> Manager (in Control Panel).  After successful authentication, they were back
>> in the credential manager.  I then upgraded to 2.12.2, and I was able to
>> successfully authenticate.
>>
>> To try to recreate the problem scenario again (in 2.12.2), I cleared the
>> credentials in Credential Manager.  Reattempting to authenticate gave the
>> credentials prompt.  The output of the perl hash was missing the password
>> again (thus, reproducing the error condition).
>>
>> I hope this helps.
>> _Mike
>>
>>
>> On Wednesday, March 29, 2017 at 10:06:03 PM UTC-4, Andrew Witte wrote:
>>>
>>> I'll try to get more info tomorrow.
>>>
>>>
>>> On Wednesday, March 29, 2017 at 2:59:10 PM UTC-7, Johannes Schindelin
>>> wrote:

 Hi Andrew,

 On Wed, 29 Mar 2017, Andrew Witte wrote:

 > The git 2.12 GCM for Windows is broken. I tried doing a git clone and
 > got "*remote: HTTP Basic: Access denied*".
 > I downgraded to git 2.11.0 and everything worked fine.

 Could you test v2.12.1, too, and open a bug report at:
 https://github.com/git-for-windows/git/issues/new ?

 I am particularly interested in any details you can share that would
 help
 other developers like me to reproduce the issue.

 Thank you,
 Johannes
>
>


Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread

2017-03-30 Thread Junio C Hamano
Jeff King  writes:

> So just mentioning the test case and the improvement in the commit
> message is sufficient, IMHO.

So here is how I butchered [v3 1/2] to tentatively queue it on 'pu'.

Notable suggested changes I have in this one are:

 * I stole the numbers from the cover letter of v2 and added them at
   the end of the log message.

 * As the checksum is not a useless relic, but is an integrity
   check, I dropped the "ancient relic" from the proposed log
   message.  It is just that the modern disks are reliable enough to
   make it worthwhile to think about a trade-off this patch makes
   between performance and integrity.

 * As it is customary, the configuration variable starts as an opt
   in feature.  In a few releases, perhaps we can flip the default,
   but we do not do so from day one.

 * Updated the code around the call to config-get-bool to avoid
   asking the same question twice.

 * Added minimum documentation.

By the way, are we sure we have something that validates the
checksum regardless of the configuration setting?  If not, we may
want to tweak this further so that we can force the validation from
"git fsck" or something.  I am not going to do that myself, but it
may be necessary before this graduates to 'master'.

Thanks.

-- >8 --
From: Jeff Hostetler 
Date: Tue, 28 Mar 2017 19:07:31 +
Subject: [PATCH] read-cache: core.checksumindex

Teach git to skip verification of the SHA-1 checksum at the end of
the index file in verify_hdr() called from read_index() when the
core.checksumIndex configuration variable is set to false.

The checksum verification is for detecting disk corruption, and for
small projects, the time it takes to compute SHA-1 is not that
significant, but for gigantic repositories this calculation adds
significant time to every command.

On the Linux kernel repository, the effect is rather trivial.
The time to reading its index with 58k entries drops from 0.0284 sec
down to 0.0155 sec.

On my Windows source tree (450MB index), I'm seeing a savings of 0.6
seconds -- read_index() went from 1.2 to 0.6 seconds.

Signed-off-by: Jeff Hostetler 
Signed-off-by: Junio C Hamano 
---
 Documentation/config.txt |  8 
 read-cache.c | 16 
 2 files changed, 24 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1df1965457..bc7b216d43 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -329,6 +329,14 @@ advice.*::
show directions on how to proceed from the current state.
 --
 
+core.checksumIndex::
+   Tell Git to validate the checksum at the end of the index
+   file to detect corruption.  Defaults to `true`.  Those who
+   work on a project with too many files may want to set this
+   variable to `false` to make it faster to load the index (in
+   exchange for reliability, but in general modern disks are
+   reliable enough for most people).
+
 core.fileMode::
Tells Git if the executable bit of files in the working tree
is to be honored.
diff --git a/read-cache.c b/read-cache.c
index e447751823..3195702cf7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1376,12 +1376,28 @@ static int verify_hdr(struct cache_header *hdr, 
unsigned long size)
git_SHA_CTX c;
unsigned char sha1[20];
int hdr_version;
+   static int do_checksum = -1;
 
if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))
return error("bad signature");
hdr_version = ntohl(hdr->hdr_version);
if (hdr_version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < hdr_version)
return error("bad index version %d", hdr_version);
+
+   if (do_checksum < 0) {
+   /*
+* Since we run very early in command startup, git_config()
+* may not have been called yet and the various "core_*"
+* global variables haven't been set.  So look it up
+* explicitly.
+*/
+   git_config_get_bool("core.checksumindex", _checksum);
+   if (do_checksum < 0)
+   do_checksum = 0; /* default to false */
+   }
+   if (!do_checksum)
+   return 0;
+
git_SHA1_Init();
git_SHA1_Update(, hdr, size - 20);
git_SHA1_Final(sha1, );
-- 
2.12.2-727-gf32eb5229d



[BUG?] iconv used as textconv, and spurious ^M on added lines on Windows

2017-03-30 Thread Jakub Narębski
Hello,

Recently I had to work on a project which uses legacy 8-bit encoding
(namely cp1250 encoding) instead of utf-8 for text files (LaTeX
documents).  My terminal, that is Git Bash from Git for Windows is set
up for utf-8.

I wanted for "git diff" and friends to return something sane on said
utf-8 terminal, instead of mojibake.  There is 'encoding'
gitattribute... but it works only for GUI ('git gui', that is).

Therefore I have (ab)used textconv facility to convert from cp1250 of
file encoding to utf-8 encoding of console.

I have set the following in .gitattributes file:

  ## LaTeX documents in cp1250 encoding
  *.tex text diff=mylatex

The 'mylatex' driver is defined as:

  [diff "mylatex"]
xfuncname = "^(((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
wordRegex = "[a-zA-Z]+|[{}]|.|[^\\{}[:space:]]+"
textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t 
utf-8
cachetextconv = true

And everything would be all right... if not the fact that Git appends
spurious ^M to added lines in the `git diff` output.  Files use CRLF
end-of-line convention (the native MS Windows one).

  $ git diff test.tex
  diff --git a/test.tex b/test.tex
  index 029646e..250ab16 100644
  --- a/test.tex
  +++ b/test.tex
  @@ -1,4 +1,4 @@
  -\documentclass{article}
  +\documentclass{mwart}^M
  
   \usepackage[cp1250]{inputenc}
   \usepackage{polski}

What gives?  Why there is this ^M tacked on the end of added lines,
while it is not present in deleted lines, nor in content lines?

Puzzled.

P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
that it doesn't supports in core `encoding` attribute together with
having `i18n.outputEncoding`.
--
Jakub Narębski




Re: [RFC] should these two topics graduate to 'master' soon?

2017-03-30 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Mar 28, 2017 at 11:51:49AM -0700, Jonathan Nieder wrote:
>
>> > * jc/merge-drop-old-syntax (2015-04-29) 1 commit
>> >
>> >   This topic stops "git merge  HEAD " syntax that
>> >   has been deprecated since October 2007 (and we have issued a
>> >   warning message since around v2.5.0 when the ancient syntax was
>> >   used).
>> >
>> > * jk/no-looking-at-dotgit-outside-repo-final (2016-10-26) 1 commit
>> >
>> >   This is the endgame of the topic to avoid blindly falling back to
>> >   ".git" when the setup sequence said we are _not_ in Git repository.
>> >   A corner case that happens to work right now may be broken by a call
>> >   to die("BUG").
>> >
>> > I am leaning toward including the former in the upcoming release,
>> > whose -rc0 is tentatively scheduled to happen on Apr 20th.  I think
>> > the rest of the system is also ready for the latter (back when we
>> > merged it to 'next' and started cooking, there were still a few
>> > codepaths that triggered its die(), which have been fixed).
>> >
>> > Opinions?
>> 
>> Google has been running with both of these for a while.  Any problems
>> we ran into were already reported and fixed.  I would be all for
>> including them in the next release.
>
> Thanks, I was wondering how much exposure the latter got. It might be a
> good idea to merge it to "master" early in the post-2.13 cycle to get a
> little more exposure (since the point of it is really to flush out
> unusual cases, the more people run it before we make a release the
> better). But I'm also OK if it's merged to master this cycle, as long as
> it's soon-ish. It's much better to flush out problems in pre-release
> master than in a released version.

OK, let's do both during this cycle, before I take a short break at
the beginning of next month.

Thanks.


I need your urgent help...

2017-03-30 Thread Sgt. Britta Lopez
Hello,
I have a personal Project in which i need your assistance I would like to
be sure of your willingness and commitment to execute this
transaction with me. I seek your partnership in receiving this fund
worth (Twenty five million United States Dollars). If interested, reply
immediately for detailed information.
Regards,
Sgt. Britta Lopez.

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



Re: [PATCH 3/3] WIP - Allow custom printf function for column printing

2017-03-30 Thread Stefan Beller
On Wed, Mar 29, 2017 at 8:22 PM, Jeff King  wrote:
> On Wed, Mar 29, 2017 at 06:42:38PM -0700, Stefan Beller wrote:
>
>> Ever wondered why column.ui applies the untracked files in git-status,
>> but not for the help text comment in git-commit? Nobody wrote the code!
>>
>> This is marked as WIP, as it barely demonstrates how the code may look
>> like. No tests, no documentation.
>
> I'm confused about what this patch is trying to do. Is it just to turn
> on column.status support for the template shown by git-commit? Or is it
> columnizing more than the untracked files list?

It is just columnizing the untracked files list.

>
> If the former, why isn't just setting s.colopts enough? I guess because
> we write into a strbuf?

Yes, because it is a strbuf, and not stdout. the column code could only write
to stdout (which was solved in prior patches), but we still have to provide a
"printf like" function, which is the majority of this patch.

The remaining part of the patch, which could go into a separate patch
is the removal of:
-   s.colopts = 0;

> If the latter, then isn't that a separate logical patch from "should
> commit use columns"?

yes, I was too quick there.


> I don't have an opinion on the whole thing myself, as I do not use
> columns nor really know how they are supposed to work. You found my
> 4d2292e9a9, but I was explicitly trying _not_ to get involved in any
> behavior changes there due to my cluelessness. :)

eh, ok. I thought you had deeper reasons than that.
I did not use columns either, but now it looks like I am
a new user of that feature. If only Git would users easily
discover cool features. ;)

>
> I think Duy, who wrote all of the column code, is a better person to cc.
>
> -Peff


Re: [PATCH 2/2] config: resolve symlinks in conditional include's patterns

2017-03-30 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> $GIT_DIR returned by get_git_dir() is normalized, with all symlinks
> resolved (see setup_work_tree function). In order to match paths (or
> patterns) against $GIT_DIR char-by-char, they have to be normalized
> too. There is a note in config.txt about this, that the user need to
> resolve symlinks by themselves if needed.
>
> The problem is, we allow certain path expansion, '~/' and './', for
> convenience and can't ask the user to resolve symlinks in these
> expansions. Make sure the expanded paths have all symlinks resolved.

That sounds sensible but I fail to see why 1/2 is the right approach
to do this, and I must be missing something.  Wouldn't you get the
same result if you run realpath() yourself on expanded, after
receiving the return value of expand_user_path() in it?

Can you add a test to demonstrate the issue (which would need to be
protected with SYMLINKS prereq)?

Thanks.

> PS. The strbuf_realpath(, get_git_dir(), 1) is still needed because
> get_git_dir() may return relative path.
>
> Noticed-by: Torsten Bögershausen 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  config.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/config.c b/config.c
> index f036c721e6..d5ba848b65 100644
> --- a/config.c
> +++ b/config.c
> @@ -177,7 +177,7 @@ static int prepare_include_condition_pattern(struct 
> strbuf *pat)
>   char *expanded;
>   int prefix = 0;
>  
> - expanded = expand_user_path(pat->buf, 0);
> + expanded = expand_user_path(pat->buf, 1);
>   if (expanded) {
>   strbuf_reset(pat);
>   strbuf_addstr(pat, expanded);
> @@ -191,7 +191,7 @@ static int prepare_include_condition_pattern(struct 
> strbuf *pat)
>   return error(_("relative config include "
>  "conditionals must come from files"));
>  
> - strbuf_add_absolute_path(, cf->path);
> + strbuf_realpath(, cf->path, 1);
>   slash = find_last_dir_sep(path.buf);
>   if (!slash)
>   die("BUG: how is this possible?");
> @@ -213,7 +213,7 @@ static int include_by_gitdir(const char *cond, size_t 
> cond_len, int icase)
>   struct strbuf pattern = STRBUF_INIT;
>   int ret = 0, prefix;
>  
> - strbuf_add_absolute_path(, get_git_dir());
> + strbuf_realpath(, get_git_dir(), 1);
>   strbuf_add(, cond, cond_len);
>   prefix = prepare_include_condition_pattern();


Re: [PATCH 2/2] submodule.c: correctly handle nested submodules in is_submodule_modified

2017-03-30 Thread Junio C Hamano
Jonathan Nieder  writes:

> Stefan Beller wrote:
>
>> This bug fix also affects the default output (non-short, non-porcelain)
>> of git-status, which is not tested here.
>
> Do you have an example?  (In just the commit message would be fine, in
> tests would be even better.)
>
>> Signed-off-by: Stefan Beller 
>> ---
>>  Documentation/git-status.txt |  2 ++
>>  submodule.c  | 21 +++--
>>  t/t3600-rm.sh|  2 +-
>>  t/t7506-status-submodule.sh  |  4 ++--
>>  4 files changed, 24 insertions(+), 5 deletions(-)
>
> Reviewed-by: Jonathan Nieder 
>
> but I would be a lot more comfortable after looking at the change to
> "git status" output.  (E.g. a test demonstrating it can happen in a
> followup change if that's simpler.)
>
> Thanks for your patient work on this.

Thank you both.  I re-read the whole thing and it feels to me that
the topic is now ready for 'next'.


Re: [PATCH v5 4/6] dir_iterator: add tests for dir_iterator API

2017-03-30 Thread Daniel Ferreira (theiostream)
On Thu, Mar 30, 2017 at 5:05 AM, Michael Haggerty  wrote:
> Oh I forgot to mention, in the Git project we don't allow declarations
> to be mixed with code. Apparently there's some ancient compiler
> somewhere that doesn't allow it. Declarations always have to be
> together, at the top of a block. (Compile with
> `-Werror=declaration-after-statement` to detect this.)

Sorry about that. I'm compiling git on clang and it has a bug that
ignores this warning (it shows up on gcc). I'll watch out for it.


Re: [PATCH v5 4/6] dir_iterator: add tests for dir_iterator API

2017-03-30 Thread Daniel Ferreira (theiostream)
On Thu, Mar 30, 2017 at 4:46 AM, Michael Haggerty  wrote:
> Is there a special reason to write the date to the file as opposed to, say
>
> touch dir/b
>
> ? (Some people use `: >dir/b` for this purpose, though I've never found
> out why.) If you write the date to the file, the reader will be
> distracted unnecessarily wondering whether the contents are important to
> the test.
>

There's no reason. They will be `touch`ed instead of written in a next version.

`:` is a bash builtin that that returns an exit code of zero and
produces no output. On my Mac at least:

bash-3.2$ type :
: is a shell builtin
bash-3.2$ type touch
touch is /usr/bin/touch

I suppose there are reasons to try to keep the most of a shell
script's logic within the shell itself, without involving external
binaries.


Re: [PATCH 0/18] snprintf cleanups

2017-03-30 Thread Jeff King
On Thu, Mar 30, 2017 at 10:24:36AM -0700, Junio C Hamano wrote:

> > I'm OK either with the series I posted, or wrapping up the alternative
> > in a commit message.
> 
> I do find the updated one easier to follow (if anything it is more
> compact); I do not think it is worth a reroll, but it is easy enough
> to replace the patch part of the original with the updated patch and
> tweak "it's easy to fix by moving to a strbuf" in its log message to
> something like "But it's easy to eliminate the allocation with a few
> helper functions, and it makes the result easier to follow", so I am
> tempted to go that route myself...

Yeah, I think that's all that would be needed. Here it is wrapped up as
a patch:

-- >8 --
Subject: [PATCH] diff: avoid fixed-size buffer for patch-ids

To generate a patch id, we format the diff header into a
fixed-size buffer, and then feed the result to our sha1
computation. The fixed buffer has size '4*PATH_MAX + 20',
which in theory accommodates the four filenames plus some
extra data. Except:

  1. The filenames may not be constrained to PATH_MAX. The
 static value may not be a real limit on the current
 filesystem. Moreover, we may compute patch-ids for
 names stored only in git, without touching the current
 filesystem at all.

  2. The 20 bytes is not nearly enough to cover the
 extra content we put in the buffer.

As a result, the data we feed to the sha1 computation may be
truncated, and it's possible that a commit with a very long
filename could erroneously collide in the patch-id space
with another commit. For instance, if one commit modified
"really-long-filename/foo" and another modified "bar" in the
same directory.

In practice this is unlikely. Because the filenames are
repeated, and because there's a single cutoff at the end of
the buffer, the offending filename would have to be on the
order of four times larger than PATH_MAX.

We could fix this by moving to a strbuf. However, we can
observe that the purpose of formatting this in the first
place is to feed it to git_SHA1_Update(). So instead, let's
just feed each part of the formatted string directly. This
actually ends up more readable, and we can even factor out
some duplicated bits from the various conditional branches.

Technically this may change the output of patch-id for very
long filenames, but it's not worth making an exception for
this in the --stable output. It was a bug, and one that only
affected an unlikely set of paths.  And anyway, the exact
value would have varied from platform to platform depending
on the value of PATH_MAX, so there is no "stable" value.

Signed-off-by: Jeff King 
---
 diff.c | 68 --
 1 file changed, 37 insertions(+), 31 deletions(-)

diff --git a/diff.c b/diff.c
index 58cb72d7e..92d78e459 100644
--- a/diff.c
+++ b/diff.c
@@ -4570,6 +4570,19 @@ static void patch_id_consume(void *priv, char *line, 
unsigned long len)
data->patchlen += new_len;
 }
 
+static void patch_id_add_string(git_SHA_CTX *ctx, const char *str)
+{
+   git_SHA1_Update(ctx, str, strlen(str));
+}
+
+static void patch_id_add_mode(git_SHA_CTX *ctx, unsigned mode)
+{
+   /* large enough for 2^32 in octal */
+   char buf[12];
+   int len = xsnprintf(buf, sizeof(buf), "%06o", mode);
+   git_SHA1_Update(ctx, buf, len);
+}
+
 /* returns 0 upon success, and writes result into sha1 */
 static int diff_get_patch_id(struct diff_options *options, unsigned char 
*sha1, int diff_header_only)
 {
@@ -4577,7 +4590,6 @@ static int diff_get_patch_id(struct diff_options 
*options, unsigned char *sha1,
int i;
git_SHA_CTX ctx;
struct patch_id_t data;
-   char buffer[PATH_MAX * 4 + 20];
 
git_SHA1_Init();
memset(, 0, sizeof(struct patch_id_t));
@@ -4609,36 +4621,30 @@ static int diff_get_patch_id(struct diff_options 
*options, unsigned char *sha1,
 
len1 = remove_space(p->one->path, strlen(p->one->path));
len2 = remove_space(p->two->path, strlen(p->two->path));
-   if (p->one->mode == 0)
-   len1 = snprintf(buffer, sizeof(buffer),
-   "diff--gita/%.*sb/%.*s"
-   "newfilemode%06o"
-   "---/dev/null"
-   "+++b/%.*s",
-   len1, p->one->path,
-   len2, p->two->path,
-   p->two->mode,
-   len2, p->two->path);
-   else if (p->two->mode == 0)
-   len1 = snprintf(buffer, sizeof(buffer),
-   "diff--gita/%.*sb/%.*s"
-   "deletedfilemode%06o"
-   "---a/%.*s"
-   

Re: [PATCH 7/7] submodule.c: correctly handle nested submodules in is_submodule_modified

2017-03-30 Thread Junio C Hamano
Jonathan Nieder  writes:

> Junio C Hamano wrote:
>
>> Shouldn't this done as part of 4/7 where is_submodule_modified()
>> starts reading from the porcelain v2 output?  4/7 does adjust for
>> the change from double question mark (porcelain v1) to a single one
>> for untracked, but at the same time it needs to prepare for these
>> 'u' (unmerged), '1' (normal modification) and '2' (mods with rename)
>> to appear in the output, no?
>>
>> IOW, with 4/7 and 7/7 done as separate steps, isn't the system
>> broken between these steps?
>
> No.  Both before and after patch 4, this code has to determine two
> details from a submodule:
> ...
> Question (3) didn't come up before because when there are no nested
> submodules, the diff machinery answers it (saving us from getting the
> answer from the status --porcelain we recurse to).

Thanks.


[PATCH v2] userdiff: add build-in pattern for shell

2017-03-30 Thread Ivan Tham
Shell are widely used but comes with lots of different patterns. The
build-in pattern aim for POSIX-like shells with some additions:

- Notably ${g//re/s} and ${g#cut}
- Bashisms such as "function"

Signed-off-by: Ivan Tham 
---
 Documentation/gitattributes.txt |  2 ++
 t/t4034-diff-words.sh   |  1 +
 t/t4034/sh/expect   | 14 ++
 t/t4034/sh/post |  7 +++
 t/t4034/sh/pre  |  7 +++
 userdiff.c  |  5 +
 6 files changed, 36 insertions(+)
 create mode 100644 t/t4034/sh/expect
 create mode 100644 t/t4034/sh/post
 create mode 100644 t/t4034/sh/pre

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index a53d093ca..f8440119d 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -706,6 +706,8 @@ patterns are available:
 
 - `ruby` suitable for source code in the Ruby language.
 
+- `sh` suitable for source code in POSIX-like shells.
+
 - `tex` suitable for source code for LaTeX documents.
 
 
diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index 912df9122..2eb662f89 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -313,6 +313,7 @@ test_language_driver perl
 test_language_driver php
 test_language_driver python
 test_language_driver ruby
+test_language_driver sh
 test_language_driver tex
 
 test_expect_success 'word-diff with diff.sbe' '
diff --git a/t/t4034/sh/expect b/t/t4034/sh/expect
new file mode 100644
index 0..e7b0a9ae3
--- /dev/null
+++ b/t/t4034/sh/expect
@@ -0,0 +1,14 @@
+diff --git a/pre b/post
+index 7bb0d15..df3845b 100644
+--- a/pre
 b/post
+@@ -1,7 +1,7 @@
+echo "Hello world!
+bomb?"
+fork(){ 
bombfork|bombfork& }
+; bomb
+ax=1 a2 
x=$((ax+12)) 
ax=$((a-1x-2)) 
ax=$((ax*12))
 
ax=$((ax/12))
+ax=$(ax) 
ax=`ax` 
ax=${ax#ax*}
 
ax=${ax%ax*}
 
ax=${ax//ax/ax}
+command -h -v--help=all -q | xargs -- echo 
2>&1 &/dev/null
+[ $a -eq $b$x -ne $y ]& 
aaxx||echo bbyy
+[ "$a$x"!=12 ] && echo 
ax || echo by
diff --git a/t/t4034/sh/post b/t/t4034/sh/post
new file mode 100644
index 0..df3845b4f
--- /dev/null
+++ b/t/t4034/sh/post
@@ -0,0 +1,7 @@
+echo "Hello world?"
+fork(){ fork|fork& }
+x=2 x=$((x+2)) x=$((x-2)) x=$((x*2)) x=$((x/2))
+x=$(x) x=`x` x=${x#x*} x=${x%x*} x=${x//x/x}
+command --help=all -q | xargs -- echo 2>/dev/null
+[ $x -ne $y ]& xx||echo yy
+[ "$x"!=2 ] && echo x || echo y
diff --git a/t/t4034/sh/pre b/t/t4034/sh/pre
new file mode 100644
index 0..7bb0d1562
--- /dev/null
+++ b/t/t4034/sh/pre
@@ -0,0 +1,7 @@
+echo Hello world!
+bomb(){ bomb|bomb& }; bomb
+a=1 a=$((a+1)) a=$((a-1)) a=$((a*1)) a=$((a/1))
+a=$(a) a=`a` a=${a#a*} a=${a%a*} a=${a//a/a}
+command -h -v | xargs -- echo >&1 &
+[ $a -eq $b ]& aa||echo bb
+[ "$a"!=1 ] && echo a || echo b
diff --git a/userdiff.c b/userdiff.c
index 8b732e40b..72ffd85f1 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -148,6 +148,11 @@ PATTERNS("csharp",
 "[a-zA-Z_][a-zA-Z0-9_]*"
 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->"),
+PATTERNS("sh",
+"^[ \t]*(function )?[A-Za-z_][A-Za-z_0-9]*[ \t]*\\(\\)[\t]*\\{?$",
+/* -- */
+"(\\$|--?)?([a-zA-Z_][a-zA-Z0-9._]*|[0-9]+|#|\\?)|--" /* command/param 
*/
+"|\\$[({]|[)}]|[-+*/=!]=?|[\\]&%#/|]{1,2}|[<>]{1,3}|[ \t]#.*"),
 IPATTERN("css",
 "![:;][[:space:]]*$\n"
 "^[_a-z0-9].*$",
-- 
2.12.2.609.gf7d0c115f



Re: [PATCH v2] log: if --decorate is not given, default to --decorate=auto

2017-03-30 Thread Junio C Hamano
With the "--decorate=auto" option becoming the default for "git
log", "git tbdiff" will be broken.

The configuration variable has been already there, so in that sense
this is not a new breakage (tbdiff wouldn't have worked well for
those with configured default).  A fix is trivial (attached).

I suspect that Alex's change may uncover similar breakages in
people's scripts.  Perhaps the topic should be cooked a bit longer
than other topics on 'next'?


 git-tbdiff.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-tbdiff.py b/git-tbdiff.py
index ccf7a0b..ae7cbb5 100755
--- a/git-tbdiff.py
+++ b/git-tbdiff.py
@@ -75,7 +75,7 @@ def read_patches(rev_list_args):
 series = []
 diffs = {}
 p = subprocess.Popen(['git', 'log', '--no-color', '-p', '--no-merges',
-  '--reverse', '--date-order']
+  '--no-decorate', '--reverse', '--date-order']
  + rev_list_args,
  stdout=subprocess.PIPE)
 sha1 = None


Re: [RFC PATCH] change default for status.submoduleSummary to true

2017-03-30 Thread Stefan Beller
On Wed, Mar 29, 2017 at 11:04 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> A user complained about the workflow with submodules:
>>> Re submodules pain, I've seen a lot of people get confused about
>>> how and when to commit submodule changes. The main thing missing
>>> in the related UIs is some way to summarize the subproject commit
>>> diff in a human readable way. Maybe last log message would be better
>>> than just sha?
>>
>> We could advise all the confused users to turn on
>> status.submoduleSummary.  However there is no downside from turning
>> it on by default apart from a slight change in behavior and bit
>> longer output of git-status and the help in git-commit.
>
> Is "there is no downside" substantiated or just hand-waving?

It's the best kind of hand waving.

> Any pros-and-cons analysis, e.g. performance implications, etc.?

Performance will be terrible. The submodule summary is gathered by
running "git submodule summary", which is shell (that is slower than
the C code in status).

In gerrit (which has 6 submodules, that I modified for the test by adding
one commit on top of each), we find

$ time git -c status.submoduleSummary=false status
...
real 0m0.043s

$ time git -c status.submoduleSummary=true status
...
real 0m0.359s

So it is slower by an order of magnitude.
Maybe it is not the right time to propose this patch then, but we'
rather want to wait until "git submodule summary" is converted to C.
(Maybe we can even parallelize the data gathering before output.)

Even for git.git (which has no submodules), we have

$ time git -c status.submoduleSummary=false status
...
real 0m0.014s
$ time git -c status.submoduleSummary=true status
...
real 0m0.125s

So additionally to porting that part to C, we may want to condition it
to a new function "int do_we_have_submodules_at_all(void)",
which has a similar heuristic as the once proposed patch
to recurse into submodules all the time
https://public-inbox.org/git/20161006193725.31553-3-sbel...@google.com/

Now that I explained why this patch is unacceptable as-is,
let's think of potential upsides:

git-status is the one command to deliver a summary on the state
of a repo with the granularity on the commit
(e.g. it reports the relation to its upstream tracking branch)
and file level (which files are change, untracked).

And I think a submodule is a weird between-state of these
two as it is seen like a file in the superproject, but it also
has commits, which are the smallest unit of logical change.

So maybe we'd want to report just a number in git-status,
which indicates how much the current detached head
is behind/before the remote tracking branch, and if it can be
fast forwarded.

Thanks,
Stefan


Re: [PATCH] strbuf: support long paths w/o read rights in strbuf_getcwd() on FreeBSD

2017-03-30 Thread René Scharfe

Am 29.03.2017 um 06:54 schrieb Christian Couder:

On Tue, Mar 28, 2017 at 11:49 PM, Jeff King  wrote:

On Tue, Mar 28, 2017 at 11:15:12PM +0200, Christian Couder wrote:


On Sun, Mar 26, 2017 at 3:43 PM, René Scharfe  wrote:

FreeBSD implements getcwd(3) as a syscall, but falls back to a version
based on readdir(3) if it fails for some reason.  The latter requires
permissions to read and execute path components, while the former does
not.  That means that if our buffer is too small and we're missing
rights we could get EACCES, but we may succeed with a bigger buffer.

Keep retrying if getcwd(3) indicates lack of permissions until our
buffer can fit PATH_MAX bytes, as that's the maximum supported by the
syscall on FreeBSD anyway.  This way we do what we can to be able to
benefit from the syscall, but we also won't loop forever if there is a
real permission issue.


Sorry to be late and maybe I missed something obvious, but the above
and the patch seem complex to me compared with something like:

diff --git a/strbuf.c b/strbuf.c
index ace58e7367..25eadcbedc 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -441,7 +441,7 @@ int strbuf_readlink(struct strbuf *sb, const char
*path, size_t hint)
 int strbuf_getcwd(struct strbuf *sb)
 {
size_t oldalloc = sb->alloc;
-   size_t guessed_len = 128;
+   size_t guessed_len = PATH_MAX > 128 ? PATH_MAX : 128;

for (;; guessed_len *= 2) {
strbuf_grow(sb, guessed_len);


I think the main reason is just that we do not have to pay the price to
allocate PATH_MAX-sized buffers when they are rarely used.

I doubt it matters all that much in practice, though.


Yeah, I can understand that.


Right, using PATH_MAX as initial size (no ?: operator needed) is the 
easiest fix.  And yes, I wanted to avoid wasting memory just to fix an 
odd special case.


Given that the function isn't called very often the impact is probably 
not that high either way, but increasing memory usage by 3200% on Linux 
just didn't sit right with me -- even though 4KB isn't much in absolute 
terms, of course.



But I also wonder if René's patch relies on PATH_MAX being a multiple
of 128 on FreeBSD and what would happen for a path between 129 and
PATH_MAX if PATH_MAX was something like 254.


We can use any value bigger than zero to initialize guessed_len. 
getcwd() won't have any problems fitting e.g. a path with a length of 
130 bytes into a buffer of 256 bytes in the second round of the loop.


René


Re: [PATCH v8 04/11] update-index: add untracked cache notifications

2017-03-30 Thread Stefan Beller
On Thu, Mar 30, 2017 at 10:33 AM, Junio C Hamano  wrote:
>>
>> Coverity points out that this is a leak (xgetcwd returns an allocated
>> buffer).
>
> I saw that quite recently and was wondering why it reported it this
> late.  It has been in 'master' for more than a month.

Because the repo/script for coverity was broken. It was missing
a hard reset, such that the merge conflict that occurred once upon
a time stayed in the repo and the checkout and merge of the newer
version did not succeed for quite some time.

I fixed it yesterday and will monitor the script more carefully now.

Stefan


Re: [PATCH v8 04/11] update-index: add untracked cache notifications

2017-03-30 Thread Junio C Hamano
Jeff King  writes:

> On Wed, Jan 27, 2016 at 07:58:00AM +0100, Christian Couder wrote:
>
>> diff --git a/builtin/update-index.c b/builtin/update-index.c
>> index 6dd..369c207 100644
>> --- a/builtin/update-index.c
>> +++ b/builtin/update-index.c
>> @@ -130,7 +130,7 @@ static int test_if_untracked_cache_is_supported(void)
>>  if (!mkdtemp(mtime_dir.buf))
>>  die_errno("Could not make temporary directory");
>>  
>> -fprintf(stderr, _("Testing "));
>> +fprintf(stderr, _("Testing mtime in '%s' "), xgetcwd());
>
> Coverity points out that this is a leak (xgetcwd returns an allocated
> buffer).

I saw that quite recently and was wondering why it reported it this
late.  It has been in 'master' for more than a month.

In any case, thanks for noticing and I agree the fix in the other
message is obviously correct.



Re: [PATCH v2 0/3] name-rev sanity

2017-03-30 Thread Junio C Hamano
Michael J Gruber  writes:

> Junio C Hamano venit, vidit, dixit 29.03.2017 19:43:
> ...
>> Ah, of course, you are depending on your other topic ;-)
>> I'll wiggle these in.
>> 
>> Thanks.
>
> Yes, sorry, isn't that in next already? I should have meantioned it anyways.

No worries.  jc/name-rev was meant to be replaced.

> Also, I should split patch 3 into name-rev and describe related parts
> and update the name-rev documentation. We don't have tests for describe
> --debug, that should be ok.

OK.  I do not think these three patches are in 'next' yet, so feel
free to further replace the last one (or for that matter the first
two as necessary).  I do not think we want to cast the debugging
information in stone, but it would not hurt if you made sure that
"describe --debug [--contains]" does not dump core ;-)

Thanks.


Re: [PATCH v4 2/5] dir_iterator: iterate over dir after its contents

2017-03-30 Thread Junio C Hamano
Michael Haggerty  writes:

> I don't think any of this needs to be implemented now, but maybe keep it
> in mind if/when `dir_iterator` gets more users.

OK.  One thing that was missing in your list was the opposite of "do
not show directories", i.e. "show only directories".  That should
also be easy to do (but we need to figure out a way to encode it in
the flag somehow) if it turns out to be useful later.




Re: [PATCH 0/18] snprintf cleanups

2017-03-30 Thread Junio C Hamano
Jeff King  writes:

> On Wed, Mar 29, 2017 at 09:05:33AM -0700, Junio C Hamano wrote:
>
>> > I think there are two things going on in your example.
>> >
>> > One is that obviously patch_id_addf() removes the spaces from the
>> > result. But we could do that now by keeping the big strbuf_addf(), and
>> > then just walking the result and feeding non-spaces.
>> >
>> > The second is that your addf means we are back to formatting everything
>> > into a buffer again
>> 
>> You are right to point out that I was blinded by the ugliness of
>> words stuck together without spaces in between, which was inherited
>> from the original code, and failed to see the sole point of this
>> series, which is to remove truncation without adding unnecessary
>> allocation and freeing.
>> 
>> Thanks for straighten my thinking out.  I think the seeming
>> ugliness, if it ever becomes a real problem, should be handled
>> outside this series after the dust settles.
>
> Yeah, the no-spaces thing should almost certainly wait.
>
> There is still the minor question of whether skipping the strbuf
> entirely is nicer, even if you still have to feed it strings without
> spaces (i.e., what I posted in my initial reply).
>
> I'm OK either with the series I posted, or wrapping up the alternative
> in a commit message.

I do find the updated one easier to follow (if anything it is more
compact); I do not think it is worth a reroll, but it is easy enough
to replace the patch part of the original with the updated patch and
tweak "it's easy to fix by moving to a strbuf" in its log message to
something like "But it's easy to eliminate the allocation with a few
helper functions, and it makes the result easier to follow", so I am
tempted to go that route myself...


ttk error when starting git gui

2017-03-30 Thread David Shrader

Hello,

I get the following error when trying to start git gui:

Error in startup script: wrong # args: should be "ttk::style theme use 
theme"

while executing
"ttk::style theme use"
(procedure "ttext" line 4)
invoked from within
"ttext $ui_workdir -background white -foreground black \
-borderwidth 0 \
-width 20 -height 10 \
-wrap none \
-takefocus 1 -highlightthickness 1\
..."
(file 
"/home/dshrader/opt/toss2/common/git/2.12.2/libexec/git-core/git-gui" 
line 3190)


I get this error with the latest released version 2.12.2. Two older git 
versions are also available on this system, and neither has this issue. 
Those older versions are 1.7.1 and 2.3.3. I don't see a call to ttext in 
those corresponding git-gui executables, so that is probably why they work.


Here are the steps to reproduce:

1) cd to existing git repository
2) run 'git gui' (no gui comes up, and the error is printed in the terminal)

I'm running on a RHEL6 based system. Do I have an insufficient version 
of whatever git gui uses for graphics in the later versions of git? When 
I try 2.12.2 on my personal workstation running Fedora 25, I don't see 
the same issue.


Thank you very much for your time,
David

--
David Shrader
HPC-ENV High Performance Computer Systems
Los Alamos National Lab
Email: dshrader  lanl.gov



Re: [PATCH v2 1/2] pkt-line: add packet_writel() and packet_read_line_gently()

2017-03-30 Thread Torsten Bögershausen
On 30.03.17 18:01, Ben Peart wrote:
>> From: Torsten Bögershausen [mailto:tbo...@web.de]
>>
>>
>> Does this work ?
>> I would have expected
>> packet_writel(fd, "line one", "line two", "line n"), NULL; 
Typo.
Should have been:
packet_writel(fd, "line one", "line two", "line n", NULL);
>>
> 
> No, that's actually not valid C syntax.
> 
>>>
>>> which requires the use of variable number of arguments.  With your
>> proposal that convenience is lost as you have to create an array of strings
>> and pass that instead.  The usage just isn't as simple as the current model.
>>>
>> What is wrong with
>>
>> int packet_write_fmt_gently(int fd, const char *fmt, ...) and we use it like
>> this:
>> if packet_write_fmt_gently(fd, "%s%s%s", "line one", "line two", "line n")
>>
> 
> Packets are not just strings; see pkt-line.h for more details-
>  but basically they are a length packet, followed by the data (in this 
> particular case a string). 
> The packet_writel function is a convenience function to write out a variable 
> number of
> packetized strings followed by a flush packet.
> You're sample above would simply concatenate the three strings and then write 
> a single packet. 
> A very different outcome. :)

Got it.


Re: [PATCH 2/7] Makefile: optionally compile with both SHA1DC and SHA1_OPENSSL

2017-03-30 Thread Junio C Hamano
Junio C Hamano  writes:

> ...
> The use of union is a good ingredient for a solution.  I would have
> chosen to do this slightly differently if I were doing it.
>
> typedef struct {
> int safe;
> union {
> SHA1_CTX_SAFE safe;
> SHA1_CTX_FAST fast;
> } u;
> } git_SHA_CTX;
>
> void git_SHA1_Init(git_SHA_CTX *ctx, int safe);
>   void git_SHA1_Update(git_SHA_CTX *ctx, const void *, unsigned long);
>   git_SHA1_Final(uchar [20], git_SHA_CTX *ctx);
>
> where SHA1_CTX_FAST may be chosen from the Makefile just like we
> currently choose platform_SHA_CTX.  SHA1_CTX_SAFE could also be made
> configurable but it may be OK to hardcode it to refer to SHA1_CTX of
> DC's.
>
> As you already know, I am assuming that each codepath pretty much
> knows if it needs safe or fast one (e.g. the one used in csum-file.c
> knows it does not have to), so each git_SHA_CTX is told which one to
> use when it gets initialized.

And if we wanted to declare "git add" is always safe, we could still
do

int sha1_safety_global_override = -1; /* unspecified */

void git_SHA1_Init(git_SHA_CTX *ctx, int safe)
{
if (sha1_safety_global_override >= 0)
ctx->safe = sha1_safety_global_override;
else
ctx->safe = safe;

if (ctx->safe)
SHA1DCInit(&(ctx->u.safe));
else
platform_SHA1_Init(&(ctx->u.fast));
   }

and then have cmd_add() in builtin/add.c to flip that global
override bit to say "this does not have to be safe".  I personally
do not think it is a good idea, but I am showing that it is still
doable.  

And as long as assignment to sha1_safety_global_override is done in
a thread-friendly way, such a scheme would be more thread-friendly
as a whole compared to the "toggle_sha1dc()" approach where each CTX
instance does not know which side of the union it is being used us
(which, if mixed-up, of course would lead to a funny behaviour).



Re: [PATCH 2/7] Makefile: optionally compile with both SHA1DC and SHA1_OPENSSL

2017-03-30 Thread Junio C Hamano
Johannes Schindelin  writes:

> +#ifdef SHA1_DC_AND_OPENSSL
> +void (*SHA1_Init_func)(SHA_CTX_union *ctx) = (void *)SHA1DCInit;
> +void (*SHA1_Update_func)(SHA_CTX_union *ctx, const void *pointer, size_t 
> size) =
> + (void *)git_SHA1DCUpdate;
> +int (*SHA1_Final_func)(unsigned char sha1[20], SHA_CTX_union *ctx) =
> + (void *)git_SHA1DCFinal;
> +
> +void toggle_sha1dc(int enable)
> +{
> + if (enable) {
> + SHA1_Init_func = (void *)SHA1DCInit;
> + SHA1_Update_func = (void *)git_SHA1DCUpdate;
> + SHA1_Final_func = (void *)git_SHA1DCFinal;
> + } else {
> + SHA1_Init_func = (void *)SHA1_Init;
> + SHA1_Update_func = (void *)SHA1_Update;
> + SHA1_Final_func = (void *)SHA1_Final;
> + }
> +}
> +#endif

As I understand that this is a demonstration series, the approach
above is OK as an expedite way to illustrate one way how run-time
switching could be done.  The approach however is not very thread
friendly, though.

> diff --git a/sha1dc/sha1.h b/sha1dc/sha1.h
> index bd8bd928fb3..243c2fe0b6b 100644
> --- a/sha1dc/sha1.h
> +++ b/sha1dc/sha1.h
> @@ -110,10 +110,26 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
>   */
>  void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
>  
> +#ifdef SHA1_DC_AND_OPENSSL
> +extern void toggle_sha1dc(int enable);
> +
> +typedef union {
> + SHA1_CTX dc;
> + SHA_CTX openssl;
> +} SHA_CTX_union;

The use of union is a good ingredient for a solution.  I would have
chosen to do this slightly differently if I were doing it.

typedef struct {
int safe;
union {
SHA1_CTX_SAFE safe;
SHA1_CTX_FAST fast;
} u;
} git_SHA_CTX;

void git_SHA1_Init(git_SHA_CTX *ctx, int safe);
void git_SHA1_Update(git_SHA_CTX *ctx, const void *, unsigned long);
git_SHA1_Final(uchar [20], git_SHA_CTX *ctx);

where SHA1_CTX_FAST may be chosen from the Makefile just like we
currently choose platform_SHA_CTX.  SHA1_CTX_SAFE could also be made
configurable but it may be OK to hardcode it to refer to SHA1_CTX of
DC's.

As you already know, I am assuming that each codepath pretty much
knows if it needs safe or fast one (e.g. the one used in csum-file.c
knows it does not have to), so each git_SHA_CTX is told which one to
use when it gets initialized.


RE: [PATCH v2 1/2] pkt-line: add packet_writel() and packet_read_line_gently()

2017-03-30 Thread Ben Peart
> From: Torsten Bögershausen [mailto:tbo...@web.de]
> 
> 
> Does this work ?
> I would have expected
> packet_writel(fd, "line one", "line two", "line n"), NULL;
> 

No, that's actually not valid C syntax.

> >
> > which requires the use of variable number of arguments.  With your
> proposal that convenience is lost as you have to create an array of strings
> and pass that instead.  The usage just isn't as simple as the current model.
> >
> What is wrong with
> 
> int packet_write_fmt_gently(int fd, const char *fmt, ...) and we use it like
> this:
> if packet_write_fmt_gently(fd, "%s%s%s", "line one", "line two", "line n")
> 

Packets are not just strings; see pkt-line.h for more details but basically 
they are a length packet, followed by the data (in this particular case a 
string).  The packet_writel function is a convenience function to write out a 
variable number of packetized strings followed by a flush packet.  You're 
sample above would simply concatenate the three strings and then write a single 
packet.  A very different outcome. :)

> 
> 
> (Or do we need another one like)
> int packet_write_fmt_gently_flush(int fd, const char *fmt, ...)
> 
> Sorry if I am lost here


Re: [PATCH] read-cache: avoid git_path() race in freshen_shared_index()

2017-03-30 Thread Junio C Hamano
Christian Couder  writes:

> On Wed, Mar 29, 2017 at 7:56 PM, Jeff King  wrote:
> ...
>> Yeah, it looks like that is what happened. I see that Christian bisected
>> the rebase to find the commit in the series that introduces the problem.
>> I'm mildly curious which commit upstream created the problem[1].
>
> I bisected it to 18633e1a22 (rebase -i: use the rebase--helper
> builtin, 2017-02-09).
> This commit is indeed changing how the interactive rebase works, but
> it is not easy to see how it impact git_path() usage.

That small series activates a large body of code that hasn't been exercised
so it is not surprising.  

> Yeah, I am very tempted to just rewrite the commit message like this:
>
> 
>
> When performing an interactive rebase in split-index mode,
> the commit message that one should rework when squashing commits
> can contain some garbage instead of the usual concatenation of
> both of the commit messages.
>
> Bisecting shows that c3a0082502 (read-cache: use
> freshen_shared_index() in read_index_from(), 2017-03-06) is involved,
> which points to the unsafe use of git_path() in
> freshen_shared_index().
>
> 
>
> and change the subject to "read-cache: avoid using git_path() in
> freshen_shared_index()".

Sure.  It may not be even worth mentioning a not-so-useful bisect
result, and the potential riskiness of the original code (iow why
this fix is the right thing) can be seen from the patch alone.

Thanks for following it through.


[PATCH v4 7/8] sub-process: move sub-process functions into separate files

2017-03-30 Thread Ben Peart
Move the sub-proces functions into sub-process.h/c.  Add documentation
for the new module in Documentation/technical/api-sub-process.txt

Signed-off-by: Ben Peart 
---
 Documentation/technical/api-sub-process.txt |  54 +
 Makefile|   1 +
 convert.c   | 119 +---
 sub-process.c   | 116 +++
 sub-process.h   |  46 +++
 5 files changed, 218 insertions(+), 118 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

diff --git a/Documentation/technical/api-sub-process.txt 
b/Documentation/technical/api-sub-process.txt
new file mode 100644
index 00..eb5005aa72
--- /dev/null
+++ b/Documentation/technical/api-sub-process.txt
@@ -0,0 +1,54 @@
+sub-process API
+===
+
+The sub-process API makes it possible to run background sub-processes
+that should run until the git command exits and communicate with it
+through stdin and stdout.  This reduces the overhead of having to fork
+a new process each time it needs to be communicated with.
+
+The sub-processes are kept in a hashmap by command name and looked up
+via the subprocess_find_entry function.  If an existing instance can not
+be found then a new process should be created and started.  When the
+parent git command terminates, all sub-processes are also terminated.
+
+This API is based on the run-command API.
+
+Data structures
+---
+
+* `struct subprocess_entry`
+
+The sub-process structure.  Members should not be accessed directly.
+
+Types
+-
+
+'int(*subprocess_start_fn)(struct subprocess_entry *entry)'::
+
+   User-supplied function to initialize the sub-process.  This is
+   typically used to negoiate the interface version and capabilities.
+
+
+Functions
+-
+
+`subprocess_start`::
+
+   Start a subprocess and add it to the subprocess hashmap.
+
+`subprocess_stop`::
+
+   Kill a subprocess and remove it from the subprocess hashmap.
+
+`subprocess_find_entry`::
+
+   Find a subprocess in the subprocess hashmap.
+
+`subprocess_get_child_process`::
+
+   Get the underlying `struct child_process` from a subprocess.
+
+`subprocess_read_status`::
+
+   Helper function to read packets looking for the last "status="
+   key/value pair.
diff --git a/Makefile b/Makefile
index 9f8b35ad41..add945b560 100644
--- a/Makefile
+++ b/Makefile
@@ -838,6 +838,7 @@ LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += submodule.o
 LIB_OBJS += submodule-config.o
+LIB_OBJS += sub-process.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += tempfile.o
diff --git a/convert.c b/convert.c
index f68a7be622..baa41da760 100644
--- a/convert.c
+++ b/convert.c
@@ -4,6 +4,7 @@
 #include "quote.h"
 #include "sigchain.h"
 #include "pkt-line.h"
+#include "sub-process.h"
 
 /*
  * convert.c - convert a file when checking it out and checking it in.
@@ -496,86 +497,11 @@ static int apply_single_file_filter(const char *path, 
const char *src, size_t le
 #define CAP_CLEAN(1u<<0)
 #define CAP_SMUDGE   (1u<<1)
 
-struct subprocess_entry {
-   struct hashmap_entry ent; /* must be the first member! */
-   const char *cmd;
-   struct child_process process;
-};
-
 struct cmd2process {
struct subprocess_entry subprocess; /* must be the first member! */
unsigned int supported_capabilities;
 };
 
-static int subprocess_map_initialized;
-static struct hashmap subprocess_map;
-
-static int cmd2process_cmp(const struct subprocess_entry *e1,
-  const struct subprocess_entry *e2,
-  const void *unused)
-{
-   return strcmp(e1->cmd, e2->cmd);
-}
-
-static struct subprocess_entry *subprocess_find_entry(const char *cmd)
-{
-   struct subprocess_entry key;
-
-   if (!subprocess_map_initialized) {
-   subprocess_map_initialized = 1;
-   hashmap_init(_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
-   return NULL;
-   }
-
-   hashmap_entry_init(, strhash(cmd));
-   key.cmd = cmd;
-   return hashmap_get(_map, , NULL);
-}
-
-static void subprocess_read_status(int fd, struct strbuf *status)
-{
-   struct strbuf **pair;
-   char *line;
-   for (;;) {
-   line = packet_read_line(fd, NULL);
-   if (!line)
-   break;
-   pair = strbuf_split_str(line, '=', 2);
-   if (pair[0] && pair[0]->len && pair[1]) {
-   /* the last "status=" line wins */
-   if (!strcmp(pair[0]->buf, "status=")) {
-   strbuf_reset(status);
-   strbuf_addbuf(status, pair[1]);
-   }
-   }
-   strbuf_list_free(pair);

[PATCH v4 8/8] convert: Update subprocess_read_status to not die on EOF

2017-03-30 Thread Ben Peart
Enable sub-processes to gracefully handle when the process dies by
updating subprocess_read_status to return an error on EOF instead of
dying.

Update apply_multi_file_filter to take advantage of the revised
subprocess_read_status.

Signed-off-by: Ben Peart 
---
 convert.c | 10 --
 sub-process.c | 10 +++---
 sub-process.h |  2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index baa41da760..9e181e27ad 100644
--- a/convert.c
+++ b/convert.c
@@ -629,7 +629,10 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (err)
goto done;
 
-   subprocess_read_status(process->out, _status);
+   err = subprocess_read_status(process->out, _status);
+   if (err)
+   goto done;
+
err = strcmp(filter_status.buf, "success");
if (err)
goto done;
@@ -638,7 +641,10 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
if (err)
goto done;
 
-   subprocess_read_status(process->out, _status);
+   err = subprocess_read_status(process->out, _status);
+   if (err)
+   goto done;
+
err = strcmp(filter_status.buf, "success");
 
 done:
diff --git a/sub-process.c b/sub-process.c
index a9e998cd75..ce919383fa 100644
--- a/sub-process.c
+++ b/sub-process.c
@@ -30,13 +30,15 @@ struct subprocess_entry *subprocess_find_entry(const char 
*cmd)
return hashmap_get(_map, , NULL);
 }
 
-void subprocess_read_status(int fd, struct strbuf *status)
+int subprocess_read_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
char *line;
+   int len;
+
for (;;) {
-   line = packet_read_line(fd, NULL);
-   if (!line)
+   len = packet_read_line_gently(fd, NULL, );
+   if ((len == -1) || !line)
break;
pair = strbuf_split_str(line, '=', 2);
if (pair[0] && pair[0]->len && pair[1]) {
@@ -48,6 +50,8 @@ void subprocess_read_status(int fd, struct strbuf *status)
}
strbuf_list_free(pair);
}
+
+   return len == -1 ? len : 0;
 }
 
 void subprocess_stop(struct subprocess_entry *entry)
diff --git a/sub-process.h b/sub-process.h
index 0cf1760a0a..5a1ce0 100644
--- a/sub-process.h
+++ b/sub-process.h
@@ -41,6 +41,6 @@ static inline struct child_process 
*subprocess_get_child_process(
  * key/value pairs and return the value from the last "status" packet
  */
 
-void subprocess_read_status(int fd, struct strbuf *status);
+int subprocess_read_status(int fd, struct strbuf *status);
 
 #endif
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v4 1/8] pkt-line: add packet_read_line_gently()

2017-03-30 Thread Ben Peart
Add packet_read_line_gently() to enable reading a line without dying on
EOF.

Signed-off-by: Ben Peart 
---
 pkt-line.c | 12 
 pkt-line.h | 10 ++
 2 files changed, 22 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index d4b6bfe076..58842544b4 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -323,6 +323,18 @@ char *packet_read_line(int fd, int *len_p)
return packet_read_line_generic(fd, NULL, NULL, len_p);
 }
 
+int packet_read_line_gently(int fd, int *dst_len, char** dst_line)
+{
+   int len = packet_read(fd, NULL, NULL,
+   packet_buffer, sizeof(packet_buffer),
+   PACKET_READ_CHOMP_NEWLINE|PACKET_READ_GENTLE_ON_EOF);
+   if (dst_len)
+   *dst_len = len;
+   if (dst_line)
+   *dst_line = (len > 0) ? packet_buffer : NULL;
+   return len;
+}
+
 char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len)
 {
return packet_read_line_generic(-1, src, src_len, dst_len);
diff --git a/pkt-line.h b/pkt-line.h
index 18eac64830..12b18991f6 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -74,6 +74,16 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, 
char
 char *packet_read_line(int fd, int *size);
 
 /*
+ * Convenience wrapper for packet_read that sets the PACKET_READ_GENTLE_ON_EOF
+ * and CHOMP_NEWLINE options. The return value specifies the number of bytes
+ * read into the buffer or -1 on truncated input. if the *dst_line parameter
+ * is not NULL it will return NULL for a flush packet and otherwise points to
+ * a static buffer (that may be overwritten by subsequent calls). If the size
+ * parameter is not NULL, the length of the packet is written to it.
+ */
+int packet_read_line_gently(int fd, int *size, char** dst_line);
+
+/*
  * Same as packet_read_line, but read from a buf rather than a descriptor;
  * see packet_read for details on how src_* is used.
  */
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v4 3/8] convert: Split start_multi_file_filter into two separate functions

2017-03-30 Thread Ben Peart
To enable future reuse of the filter..process infrastructure,
split start_multi_file_filter into two separate parts.

start_multi_file_filter will now only contain the generic logic to
manage the creation and tracking of the child process in a hashmap.

start_multi_file_filter_fn is a protocol specific initialization
function that will negotiate the multi-file-filter interface version
and capabilities.

Signed-off-by: Ben Peart 
---
 convert.c | 63 ++-
 1 file changed, 38 insertions(+), 25 deletions(-)

diff --git a/convert.c b/convert.c
index 793c29ebfd..404757eac9 100644
--- a/convert.c
+++ b/convert.c
@@ -565,35 +565,14 @@ static void stop_multi_file_filter(struct child_process 
*process)
finish_command(process);
 }
 
-static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+static int start_multi_file_filter_fn(struct cmd2process *entry)
 {
int err;
-   struct cmd2process *entry;
-   struct child_process *process;
-   const char *argv[] = { cmd, NULL };
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-
-   entry = xmalloc(sizeof(*entry));
-   entry->cmd = cmd;
-   entry->supported_capabilities = 0;
-   process = >process;
-
-   child_process_init(process);
-   process->argv = argv;
-   process->use_shell = 1;
-   process->in = -1;
-   process->out = -1;
-   process->clean_on_exit = 1;
-   process->clean_on_exit_handler = stop_multi_file_filter;
-
-   if (start_command(process)) {
-   error("cannot fork to run external filter '%s'", cmd);
-   return NULL;
-   }
-
-   hashmap_entry_init(entry, strhash(cmd));
+   struct child_process *process = >process;
+   const char *cmd = entry->cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -642,7 +621,41 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 done:
sigchain_pop(SIGPIPE);
 
-   if (err || errno == EPIPE) {
+   if (err || errno == EPIPE)
+   err = err ? err : errno;
+
+   return err;
+}
+
+static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+{
+   int err;
+   struct cmd2process *entry;
+   struct child_process *process;
+   const char *argv[] = { cmd, NULL };
+
+   entry = xmalloc(sizeof(*entry));
+   entry->cmd = cmd;
+   entry->supported_capabilities = 0;
+   process = >process;
+
+   child_process_init(process);
+   process->argv = argv;
+   process->use_shell = 1;
+   process->in = -1;
+   process->out = -1;
+   process->clean_on_exit = 1;
+   process->clean_on_exit_handler = stop_multi_file_filter;
+
+   if (start_command(process)) {
+   error("cannot fork to run external filter '%s'", cmd);
+   return NULL;
+   }
+
+   hashmap_entry_init(entry, strhash(cmd));
+
+   err = start_multi_file_filter_fn(entry);
+   if (err) {
error("initialization for external filter '%s' failed", cmd);
kill_multi_file_filter(hashmap, entry);
return NULL;
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v4 5/8] convert: Update generic functions to only use generic data structures

2017-03-30 Thread Ben Peart
Update all functions that are going to be moved into a reusable module
so that they only work with the reusable data structures.  Move code
that is specific to the filter out into the filter specific functions.

Signed-off-by: Ben Peart 
---
 convert.c | 46 --
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/convert.c b/convert.c
index f569026511..747c0c363b 100644
--- a/convert.c
+++ b/convert.c
@@ -576,14 +576,15 @@ static void stop_multi_file_filter(struct child_process 
*process)
finish_command(process);
 }
 
-static int start_multi_file_filter_fn(struct cmd2process *entry)
+static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
 {
int err;
+   struct cmd2process *entry = (struct cmd2process *)subprocess;
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-   struct child_process *process = >subprocess.process;
-   const char *cmd = entry->subprocess.cmd;
+   struct child_process *process = >process;
+   const char *cmd = subprocess->cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -638,17 +639,21 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
return err;
 }
 
-static struct cmd2process *start_multi_file_filter(const char *cmd)
+typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
+int start_multi_file_filter(struct subprocess_entry *entry, const char *cmd,
+   subprocess_start_fn startfn)
 {
int err;
-   struct cmd2process *entry;
struct child_process *process;
const char *argv[] = { cmd, NULL };
 
-   entry = xmalloc(sizeof(*entry));
-   entry->subprocess.cmd = cmd;
-   entry->supported_capabilities = 0;
-   process = >subprocess.process;
+   if (!cmd_process_map_initialized) {
+   cmd_process_map_initialized = 1;
+   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   }
+
+   entry->cmd = cmd;
+   process = >process;
 
child_process_init(process);
process->argv = argv;
@@ -658,22 +663,23 @@ static struct cmd2process *start_multi_file_filter(const 
char *cmd)
process->clean_on_exit = 1;
process->clean_on_exit_handler = stop_multi_file_filter;
 
-   if (start_command(process)) {
+   err = start_command(process);
+   if (err) {
error("cannot fork to run external filter '%s'", cmd);
-   return NULL;
+   return err;
}
 
hashmap_entry_init(entry, strhash(cmd));
 
-   err = start_multi_file_filter_fn(entry);
+   err = startfn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(>subprocess);
-   return NULL;
+   kill_multi_file_filter(entry);
+   return err;
}
 
hashmap_add(_process_map, entry);
-   return entry;
+   return 0;
 }
 
 static int apply_multi_file_filter(const char *path, const char *src, size_t 
len,
@@ -692,9 +698,13 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
fflush(NULL);
 
if (!entry) {
-   entry = start_multi_file_filter(cmd);
-   if (!entry)
+   entry = xmalloc(sizeof(*entry));
+   entry->supported_capabilities = 0;
+
+   if (start_multi_file_filter(>subprocess, cmd, 
start_multi_file_filter_fn)) {
+   free(entry);
return 0;
+   }
}
process = >subprocess.process;
 
@@ -767,7 +777,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
 * Force shutdown and restart if another blob requires 
filtering.
 */
error("external filter '%s' failed", cmd);
-   kill_multi_file_filter(>subprocess);
+   kill_multi_file_filter((struct subprocess_entry 
*)entry);
}
} else {
strbuf_swap(dst, );
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v4 2/8] convert: move packet_write_list() into pkt-line as packet_writel()

2017-03-30 Thread Ben Peart
Add packet_writel() which writes multiple lines in a single call and
then calls packet_flush_gently(). Update convert.c to use the new
packet_writel() function from pkt-line.

Signed-off-by: Ben Peart 
---
 convert.c  | 23 ++-
 pkt-line.c | 19 +++
 pkt-line.h |  1 +
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 8d652bf27c..793c29ebfd 100644
--- a/convert.c
+++ b/convert.c
@@ -521,25 +521,6 @@ static struct cmd2process 
*find_multi_file_filter_entry(struct hashmap *hashmap,
return hashmap_get(hashmap, , NULL);
 }
 
-static int packet_write_list(int fd, const char *line, ...)
-{
-   va_list args;
-   int err;
-   va_start(args, line);
-   for (;;) {
-   if (!line)
-   break;
-   if (strlen(line) > LARGE_PACKET_DATA_MAX)
-   return -1;
-   err = packet_write_fmt_gently(fd, "%s\n", line);
-   if (err)
-   return err;
-   line = va_arg(args, const char*);
-   }
-   va_end(args);
-   return packet_flush_gently(fd);
-}
-
 static void read_multi_file_filter_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
@@ -616,7 +597,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 
sigchain_push(SIGPIPE, SIG_IGN);
 
-   err = packet_write_list(process->in, "git-filter-client", "version=2", 
NULL);
+   err = packet_writel(process->in, "git-filter-client", "version=2", 
NULL);
if (err)
goto done;
 
@@ -632,7 +613,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
if (err)
goto done;
 
-   err = packet_write_list(process->in, "capability=clean", 
"capability=smudge", NULL);
+   err = packet_writel(process->in, "capability=clean", 
"capability=smudge", NULL);
 
for (;;) {
cap_buf = packet_read_line(process->out, NULL);
diff --git a/pkt-line.c b/pkt-line.c
index 58842544b4..2788aa1af6 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -171,6 +171,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
return status;
 }
 
+int packet_writel(int fd, const char *line, ...)
+{
+   va_list args;
+   int err;
+   va_start(args, line);
+   for (;;) {
+   if (!line)
+   break;
+   if (strlen(line) > LARGE_PACKET_DATA_MAX)
+   return -1;
+   err = packet_write_fmt_gently(fd, "%s\n", line);
+   if (err)
+   return err;
+   line = va_arg(args, const char*);
+   }
+   va_end(args);
+   return packet_flush_gently(fd);
+}
+
 static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 {
static char packet_write_buffer[LARGE_PACKET_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 12b18991f6..cb3eda9695 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -25,6 +25,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_writel(int fd, const char *line, ...);
 int write_packetized_from_fd(int fd_in, int fd_out);
 int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 
-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v4 4/8] convert: Separate generic structures and variables from the filter specific ones

2017-03-30 Thread Ben Peart
To enable future reuse of the filter..process infrastructure,
split the cmd2process structure into two separate parts.

subprocess_entry will now contain the generic data required to manage
the creation and tracking of the child process in a hashmap. Also move
all knowledge of the hashmap into the generic functions.

cmd2process is a filter protocol specific structure that is used to
track the negotiated capabilities of the filter.

Signed-off-by: Ben Peart 
---
 convert.c | 57 +++--
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/convert.c b/convert.c
index 404757eac9..f569026511 100644
--- a/convert.c
+++ b/convert.c
@@ -496,29 +496,40 @@ static int apply_single_file_filter(const char *path, 
const char *src, size_t le
 #define CAP_CLEAN(1u<<0)
 #define CAP_SMUDGE   (1u<<1)
 
-struct cmd2process {
+struct subprocess_entry {
struct hashmap_entry ent; /* must be the first member! */
-   unsigned int supported_capabilities;
const char *cmd;
struct child_process process;
 };
 
+struct cmd2process {
+   struct subprocess_entry subprocess; /* must be the first member! */
+   unsigned int supported_capabilities;
+};
+
 static int cmd_process_map_initialized;
 static struct hashmap cmd_process_map;
 
-static int cmd2process_cmp(const struct cmd2process *e1,
-  const struct cmd2process *e2,
+static int cmd2process_cmp(const struct subprocess_entry *e1,
+  const struct subprocess_entry *e2,
   const void *unused)
 {
return strcmp(e1->cmd, e2->cmd);
 }
 
-static struct cmd2process *find_multi_file_filter_entry(struct hashmap 
*hashmap, const char *cmd)
+static struct subprocess_entry *find_multi_file_filter_entry(const char *cmd)
 {
-   struct cmd2process key;
+   struct subprocess_entry key;
+
+   if (!cmd_process_map_initialized) {
+   cmd_process_map_initialized = 1;
+   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   return NULL;
+   }
+
hashmap_entry_init(, strhash(cmd));
key.cmd = cmd;
-   return hashmap_get(hashmap, , NULL);
+   return hashmap_get(_process_map, , NULL);
 }
 
 static void read_multi_file_filter_status(int fd, struct strbuf *status)
@@ -541,7 +552,7 @@ static void read_multi_file_filter_status(int fd, struct 
strbuf *status)
}
 }
 
-static void kill_multi_file_filter(struct hashmap *hashmap, struct cmd2process 
*entry)
+static void kill_multi_file_filter(struct subprocess_entry *entry)
 {
if (!entry)
return;
@@ -550,7 +561,7 @@ static void kill_multi_file_filter(struct hashmap *hashmap, 
struct cmd2process *
kill(entry->process.pid, SIGTERM);
finish_command(>process);
 
-   hashmap_remove(hashmap, entry, NULL);
+   hashmap_remove(_process_map, entry, NULL);
free(entry);
 }
 
@@ -571,8 +582,8 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
struct string_list cap_list = STRING_LIST_INIT_NODUP;
char *cap_buf;
const char *cap_name;
-   struct child_process *process = >process;
-   const char *cmd = entry->cmd;
+   struct child_process *process = >subprocess.process;
+   const char *cmd = entry->subprocess.cmd;
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -627,7 +638,7 @@ static int start_multi_file_filter_fn(struct cmd2process 
*entry)
return err;
 }
 
-static struct cmd2process *start_multi_file_filter(struct hashmap *hashmap, 
const char *cmd)
+static struct cmd2process *start_multi_file_filter(const char *cmd)
 {
int err;
struct cmd2process *entry;
@@ -635,9 +646,9 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
const char *argv[] = { cmd, NULL };
 
entry = xmalloc(sizeof(*entry));
-   entry->cmd = cmd;
+   entry->subprocess.cmd = cmd;
entry->supported_capabilities = 0;
-   process = >process;
+   process = >subprocess.process;
 
child_process_init(process);
process->argv = argv;
@@ -657,11 +668,11 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
err = start_multi_file_filter_fn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(hashmap, entry);
+   kill_multi_file_filter(>subprocess);
return NULL;
}
 
-   hashmap_add(hashmap, entry);
+   hashmap_add(_process_map, entry);
return entry;
 }
 
@@ -676,22 +687,16 @@ static int apply_multi_file_filter(const char *path, 
const char *src, size_t len
struct strbuf filter_status = STRBUF_INIT;
const char *filter_type;
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized 

[PATCH v4 6/8] convert: rename reusable sub-process functions

2017-03-30 Thread Ben Peart
Do a mechanical rename of the functions that will become the reusable
sub-process module.

Signed-off-by: Ben Peart 
---
 convert.c | 47 ---
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/convert.c b/convert.c
index 747c0c363b..f68a7be622 100644
--- a/convert.c
+++ b/convert.c
@@ -507,8 +507,8 @@ struct cmd2process {
unsigned int supported_capabilities;
 };
 
-static int cmd_process_map_initialized;
-static struct hashmap cmd_process_map;
+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;
 
 static int cmd2process_cmp(const struct subprocess_entry *e1,
   const struct subprocess_entry *e2,
@@ -517,22 +517,22 @@ static int cmd2process_cmp(const struct subprocess_entry 
*e1,
return strcmp(e1->cmd, e2->cmd);
 }
 
-static struct subprocess_entry *find_multi_file_filter_entry(const char *cmd)
+static struct subprocess_entry *subprocess_find_entry(const char *cmd)
 {
struct subprocess_entry key;
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized = 1;
-   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   if (!subprocess_map_initialized) {
+   subprocess_map_initialized = 1;
+   hashmap_init(_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
return NULL;
}
 
hashmap_entry_init(, strhash(cmd));
key.cmd = cmd;
-   return hashmap_get(_process_map, , NULL);
+   return hashmap_get(_map, , NULL);
 }
 
-static void read_multi_file_filter_status(int fd, struct strbuf *status)
+static void subprocess_read_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
char *line;
@@ -552,7 +552,7 @@ static void read_multi_file_filter_status(int fd, struct 
strbuf *status)
}
 }
 
-static void kill_multi_file_filter(struct subprocess_entry *entry)
+static void subprocess_stop(struct subprocess_entry *entry)
 {
if (!entry)
return;
@@ -561,11 +561,11 @@ static void kill_multi_file_filter(struct 
subprocess_entry *entry)
kill(entry->process.pid, SIGTERM);
finish_command(>process);
 
-   hashmap_remove(_process_map, entry, NULL);
+   hashmap_remove(_map, entry, NULL);
free(entry);
 }
 
-static void stop_multi_file_filter(struct child_process *process)
+static void subprocess_exit_handler(struct child_process *process)
 {
sigchain_push(SIGPIPE, SIG_IGN);
/* Closing the pipe signals the filter to initiate a shutdown. */
@@ -640,16 +640,16 @@ static int start_multi_file_filter_fn(struct 
subprocess_entry *subprocess)
 }
 
 typedef int(*subprocess_start_fn)(struct subprocess_entry *entry);
-int start_multi_file_filter(struct subprocess_entry *entry, const char *cmd,
+int subprocess_start(struct subprocess_entry *entry, const char *cmd,
subprocess_start_fn startfn)
 {
int err;
struct child_process *process;
const char *argv[] = { cmd, NULL };
 
-   if (!cmd_process_map_initialized) {
-   cmd_process_map_initialized = 1;
-   hashmap_init(_process_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
+   if (!subprocess_map_initialized) {
+   subprocess_map_initialized = 1;
+   hashmap_init(_map, (hashmap_cmp_fn)cmd2process_cmp, 
0);
}
 
entry->cmd = cmd;
@@ -661,7 +661,7 @@ int start_multi_file_filter(struct subprocess_entry *entry, 
const char *cmd,
process->in = -1;
process->out = -1;
process->clean_on_exit = 1;
-   process->clean_on_exit_handler = stop_multi_file_filter;
+   process->clean_on_exit_handler = subprocess_exit_handler;
 
err = start_command(process);
if (err) {
@@ -674,11 +674,11 @@ int start_multi_file_filter(struct subprocess_entry 
*entry, const char *cmd,
err = startfn(entry);
if (err) {
error("initialization for external filter '%s' failed", cmd);
-   kill_multi_file_filter(entry);
+   subprocess_stop(entry);
return err;
}
 
-   hashmap_add(_process_map, entry);
+   hashmap_add(_map, entry);
return 0;
 }
 
@@ -693,7 +693,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
struct strbuf filter_status = STRBUF_INIT;
const char *filter_type;
 
-   entry = (struct cmd2process *)find_multi_file_filter_entry(cmd);
+   entry = (struct cmd2process *)subprocess_find_entry(cmd);
 
fflush(NULL);
 
@@ -701,7 +701,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len
entry = xmalloc(sizeof(*entry));
entry->supported_capabilities = 0;
 
-   if (start_multi_file_filter(>subprocess, cmd, 
start_multi_file_filter_fn)) {
+   if 

[PATCH v4 0/8] refactor the filter process code into a reusable module

2017-03-30 Thread Ben Peart
Refactor the filter..process code into a separate sub-process
module that can be used to reduce the cost of starting up a sub-process
for multiple commands.  It does this by keeping the external process
running and processing all commands by communicating over standard input
and standard output using the packet format (pkt-line) based protocol.
Full documentation is in Documentation/technical/api-sub-process.txt.

This code is refactored from:

Commit edcc85814c ("convert: add filter..process option", 
2016-10-16)
keeps the external process running and processes all commands

Ben Peart (8):
  pkt-line: add packet_read_line_gently()
  convert: move packet_write_list() into pkt-line as packet_writel()
  convert: Split start_multi_file_filter into two separate functions
  convert: Separate generic structures and variables from the filter
specific ones
  convert: Update generic functions to only use generic data structures
  convert: rename reusable sub-process functions
  sub-process: move sub-process functions into separate files
  convert: Update subprocess_read_status to not die on EOF

 Documentation/technical/api-sub-process.txt |  54 ++
 Makefile|   1 +
 convert.c   | 159 +---
 pkt-line.c  |  31 ++
 pkt-line.h  |  11 ++
 sub-process.c   | 120 +
 sub-process.h   |  46 
 7 files changed, 292 insertions(+), 130 deletions(-)
 create mode 100644 Documentation/technical/api-sub-process.txt
 create mode 100644 sub-process.c
 create mode 100644 sub-process.h

-- 
2.12.1.gvfs.1.18.ge47db72



[PATCH v4 2/8] convert: move packet_write_list() into pkt-line as packet_writel()

2017-03-30 Thread Ben Peart
Add packet_writel() which writes multiple lines in a single call and
then calls packet_flush_gently(). Update convert.c to use the new
packet_writel() function from pkt-line.

Signed-off-by: Ben Peart 
---
 convert.c  | 23 ++-
 pkt-line.c | 19 +++
 pkt-line.h |  1 +
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 8d652bf27c..793c29ebfd 100644
--- a/convert.c
+++ b/convert.c
@@ -521,25 +521,6 @@ static struct cmd2process 
*find_multi_file_filter_entry(struct hashmap *hashmap,
return hashmap_get(hashmap, , NULL);
 }
 
-static int packet_write_list(int fd, const char *line, ...)
-{
-   va_list args;
-   int err;
-   va_start(args, line);
-   for (;;) {
-   if (!line)
-   break;
-   if (strlen(line) > LARGE_PACKET_DATA_MAX)
-   return -1;
-   err = packet_write_fmt_gently(fd, "%s\n", line);
-   if (err)
-   return err;
-   line = va_arg(args, const char*);
-   }
-   va_end(args);
-   return packet_flush_gently(fd);
-}
-
 static void read_multi_file_filter_status(int fd, struct strbuf *status)
 {
struct strbuf **pair;
@@ -616,7 +597,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 
sigchain_push(SIGPIPE, SIG_IGN);
 
-   err = packet_write_list(process->in, "git-filter-client", "version=2", 
NULL);
+   err = packet_writel(process->in, "git-filter-client", "version=2", 
NULL);
if (err)
goto done;
 
@@ -632,7 +613,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
if (err)
goto done;
 
-   err = packet_write_list(process->in, "capability=clean", 
"capability=smudge", NULL);
+   err = packet_writel(process->in, "capability=clean", 
"capability=smudge", NULL);
 
for (;;) {
cap_buf = packet_read_line(process->out, NULL);
diff --git a/pkt-line.c b/pkt-line.c
index 58842544b4..2788aa1af6 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -171,6 +171,25 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
return status;
 }
 
+int packet_writel(int fd, const char *line, ...)
+{
+   va_list args;
+   int err;
+   va_start(args, line);
+   for (;;) {
+   if (!line)
+   break;
+   if (strlen(line) > LARGE_PACKET_DATA_MAX)
+   return -1;
+   err = packet_write_fmt_gently(fd, "%s\n", line);
+   if (err)
+   return err;
+   line = va_arg(args, const char*);
+   }
+   va_end(args);
+   return packet_flush_gently(fd);
+}
+
 static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 {
static char packet_write_buffer[LARGE_PACKET_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 12b18991f6..cb3eda9695 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -25,6 +25,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_writel(int fd, const char *line, ...);
 int write_packetized_from_fd(int fd_in, int fd_out);
 int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 
-- 
2.12.1.gvfs.1.18.ge47db72



RE: [PATCH v3 7/8] sub-process: move sub-process functions into separate files

2017-03-30 Thread Ben Peart
> From: Junio C Hamano [mailto:gits...@pobox.com]
> 
> > diff --git a/sub-process.c b/sub-process.c new file mode 100644 index
> > 00..2c4d27c193
> > --- /dev/null
> > +++ b/sub-process.c
> > @@ -0,0 +1,116 @@
> > +/*
> > + * Generic implementation of background process infrastructure.
> > + */
> > +#include "sub-process.h"
> > +#include "sigchain.h"
> > +#include "pkt-line.h"
> > + ...
> > +void subprocess_exit_handler(struct child_process *process) {
> 
> This is not only undocumented in the above, but it does not seem to be
> necessary to be a public function.  The only thing that uses this is
> subprocess_start(), which is in this file.  Perhaps make it static?

OK.  Missed that somehow. I'll fix it and send another patch series.


RE: [PATCH v3 0/8] refactor the filter process code into a reusable module

2017-03-30 Thread Ben Peart
> From: Junio C Hamano [mailto:gits...@pobox.com]
> 
> Ben Peart  writes:
> 
> > Ben Peart (8):
> >   pkt-line: add packet_writel() and packet_read_line_gently()
> >   convert: Update convert to use new packet_writel() function
> >   convert: Split start_multi_file_filter into two separate functions
> >   convert: Separate generic structures and variables from the filter
> > specific ones
> >   convert: Update generic functions to only use generic data structures
> >   convert: rename reusable sub-process functions
> >   sub-process: move sub-process functions into separate files
> >   convert: Update subprocess_read_status to not die on EOF
> 
> This presentation is much easier to digest, compared to the large ball of wax
> we saw previously.  It highlights the key modification that cmd2process is
> now "subclassed" from subprocess_entry which is a more generic structure
> by embedding the latter at the beginning, and have its user
> start_multi_file_filter_fn() explicitly downcast the latter to the former 
> around
> patches 4/8 and 5/8.
> 
> If I were doing this series, I would organize the first two slightly 
> differently,
> namely:
> 
>  * 1/8 just adds packet_read_line_gently().
> 
>  * 2/8 moves packet_write_line() from convert.c to pkt-line.c while
>renaming it, with the justification that this function must be
>made more widely available.  It would naturally involves
>adjusting existing callers.
> 
> because write and read done in your 1/8 are independent and orthogonal
> changes, and doing it that way also avoids needless temporary duplication of
> the same function.
> 

I'll go ahead and do it that way.
 
> I may later have further comments on 3-8/8 after giving them another read,
> but I haven't seen anything questionable in them so far.
> 
> Thanks.



  1   2   >