Re: [PATCH 0/6] Convert hash-object to struct object_id

2017-08-20 Thread Junio C Hamano
"brian m. carlson"  writes:

> On Sun, Aug 20, 2017 at 10:09:25PM +0200, Patryk Obara wrote:
>> This enabled conversion of few functions in sha1_file, which
>> had almost all callers converted already.
>> 
>> I hope I'm not stepping on anyone's toes with this patch series.
>> If I do - is there some email thread or document in which I can
>> coordinate with other developers, regarding which code regions
>> are being converted to struct object_id next?
>
> We don't have a coordinated thread at the moment.  You can see what I'm
> working on at https://github.com/bk2204/git.git in the object-id-part10
> and object-id-part11 branches (based on an older next).
>
> However, having said that, I don't mind if you or others pick up various
> parts of the codebase.  At worst, I drop a few patches for things others
> have already converted.

Thanks for working well together ;-)

I've scanned these patches and they looked OK.  There still are
places that dereferences oid->hash when making a call instead of
passing a pointer to the whole oid, but that is not making things
worse.

As to the naming (your comments on 5/6), I agree that we would need
to switch s/sha1/oid/ in the names in the endgame.  It may be OK to
leave it to later rounds when we do use something like your hash
algorithm abstraction throughout the codebase.




Re: [PATCH 2/2] rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved

2017-08-20 Thread Junio C Hamano
Ramsay Jones  writes:

> On 19/08/17 21:30, Junio C Hamano wrote:
>
>> In the future, we may find other variables that only allow an
>> integer that specifies "this many days" (or other unit of time) and
>> allow them to also do the same, and at that point we probably would
>> want to move the helper to a place that is not specific to the
>> rerere machinery.  Perhaps config.c would be such a good neutral
>> place, as it will allow git_parse_signed() to go back to static to
>> the file.
>
> You make git_parse_unsigned() external in this patch, in addition
> to git_parse_signed(), but don't actually call it. Was that intended?

Yes, it was done on purpose for symmetry and was done before I had
that "perhaps this can be moved to config.c" vision.

Perhaps I should follow up the topic with [PATCH 3/2] to really move
it to config.c to clean it up.  But not today.

Thanks.


Re: [RFC PATCH 0/6] Hash Abstraction

2017-08-20 Thread Junio C Hamano
"brian m. carlson"  writes:

> brian m. carlson (6):
>   vcs-svn: remove unused prototypes
>   vcs-svn: rename repo functions to "svn_repo"
>   setup: expose enumerated repo info
>   Add structure representing hash algorithm
>   Integrate hash algorithm support with repo setup
>   Switch empty tree and blob lookups to use hash abstraction

The first two are about very quiescent code that can go in and sail
thru 'next' very quickly without problems, I would think.

After the third one alone, the discovered format information may not
be fully plumbed through (i.e. after setup finishes there is no way
for outside callers to peek at the repo_fmt instance that was on
stack), but it seems that this is sufficient for the purpose of this
step and it looks a very sensible first step.,

As to patch 4, I tend to agree with the analysis you had in this
cover letter--it did make me wonder if we want to have union of hash
contenxt structures, for example.  But the enumeration of virtual
functions looks about right.

I wondered if we wanted "binery size" and "hex size" separately,
because the latter will always be twice as big as the former as long
as the latter's definition is "hex", but that is a minor point.

Thanks for starting this.

>  builtin/am.c|  2 +-
>  builtin/checkout.c  |  2 +-
>  builtin/diff.c  |  2 +-
>  builtin/pull.c  |  2 +-
>  cache.h | 48 
>  diff-lib.c  |  2 +-
>  merge-recursive.c   |  2 +-
>  notes-merge.c   |  2 +-
>  repository.c|  7 +++
>  repository.h|  5 +
>  sequencer.c |  6 +++---
>  setup.c | 48 +++-
>  sha1_file.c | 29 +
>  submodule.c |  2 +-
>  vcs-svn/repo_tree.c |  6 +++---
>  vcs-svn/repo_tree.h | 13 +++--
>  vcs-svn/svndump.c   |  8 
>  17 files changed, 133 insertions(+), 53 deletions(-)


[RFC PATCH 3/6] setup: expose enumerated repo info

2017-08-20 Thread brian m. carlson
We enumerate several different items as part of struct
repository_format, but then actually set up those values using the
global variables we've initialized from them.  Instead, let's pass a
pointer to the structure down to the code where we enumerate these
values, so we can later on use those values directly to perform setup.

This technique makes it easier for us to determine additional items
about the repository format (such as the hash algorithm) and then use
them for setup later on, without needing to add additional global
variables.  We can't avoid using the existing global variables since
they're intricately intertwined with how things work at the moment, but
this improves things for the future.

Signed-off-by: brian m. carlson 
---
 setup.c | 46 +-
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/setup.c b/setup.c
index 860507e1fd..115e70a4e8 100644
--- a/setup.c
+++ b/setup.c
@@ -437,16 +437,15 @@ static int check_repo_format(const char *var, const char 
*value, void *vdata)
return 0;
 }
 
-static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
+static int check_repository_format_gently(const char *gitdir, struct 
repository_format *candidate, int *nongit_ok)
 {
struct strbuf sb = STRBUF_INIT;
struct strbuf err = STRBUF_INIT;
-   struct repository_format candidate;
int has_common;
 
has_common = get_common_dir(, gitdir);
strbuf_addstr(, "/config");
-   read_repository_format(, sb.buf);
+   read_repository_format(candidate, sb.buf);
strbuf_release();
 
/*
@@ -454,10 +453,10 @@ static int check_repository_format_gently(const char 
*gitdir, int *nongit_ok)
 * we treat a missing config as a silent "ok", even when nongit_ok
 * is unset.
 */
-   if (candidate.version < 0)
+   if (candidate->version < 0)
return 0;
 
-   if (verify_repository_format(, ) < 0) {
+   if (verify_repository_format(candidate, ) < 0) {
if (nongit_ok) {
warning("%s", err.buf);
strbuf_release();
@@ -467,21 +466,21 @@ static int check_repository_format_gently(const char 
*gitdir, int *nongit_ok)
die("%s", err.buf);
}
 
-   repository_format_precious_objects = candidate.precious_objects;
-   string_list_clear(_extensions, 0);
+   repository_format_precious_objects = candidate->precious_objects;
+   string_list_clear(>unknown_extensions, 0);
if (!has_common) {
-   if (candidate.is_bare != -1) {
-   is_bare_repository_cfg = candidate.is_bare;
+   if (candidate->is_bare != -1) {
+   is_bare_repository_cfg = candidate->is_bare;
if (is_bare_repository_cfg == 1)
inside_work_tree = -1;
}
-   if (candidate.work_tree) {
+   if (candidate->work_tree) {
free(git_work_tree_cfg);
-   git_work_tree_cfg = candidate.work_tree;
+   git_work_tree_cfg = candidate->work_tree;
inside_work_tree = -1;
}
} else {
-   free(candidate.work_tree);
+   free(candidate->work_tree);
}
 
return 0;
@@ -627,6 +626,7 @@ const char *read_gitfile_gently(const char *path, int 
*return_error_code)
 
 static const char *setup_explicit_git_dir(const char *gitdirenv,
  struct strbuf *cwd,
+ struct repository_format *repo_fmt,
  int *nongit_ok)
 {
const char *work_tree_env = getenv(GIT_WORK_TREE_ENVIRONMENT);
@@ -652,7 +652,7 @@ static const char *setup_explicit_git_dir(const char 
*gitdirenv,
die("Not a git repository: '%s'", gitdirenv);
}
 
-   if (check_repository_format_gently(gitdirenv, nongit_ok)) {
+   if (check_repository_format_gently(gitdirenv, repo_fmt, nongit_ok)) {
free(gitfile);
return NULL;
}
@@ -725,9 +725,10 @@ static const char *setup_explicit_git_dir(const char 
*gitdirenv,
 
 static const char *setup_discovered_git_dir(const char *gitdir,
struct strbuf *cwd, int offset,
+   struct repository_format *repo_fmt,
int *nongit_ok)
 {
-   if (check_repository_format_gently(gitdir, nongit_ok))
+   if (check_repository_format_gently(gitdir, repo_fmt, nongit_ok))
return NULL;
 
/* --work-tree is set without --git-dir; use discovered one */
@@ -739,7 +740,7 @@ static const char *setup_discovered_git_dir(const char 
*gitdir,
gitdir = 

[RFC PATCH 5/6] Integrate hash algorithm support with repo setup

2017-08-20 Thread brian m. carlson
In future versions of Git, we plan to support an additional hash
algorithm.  Integrate the enumeration of hash algorithms with repository
setup, and store a pointer to the enumerated data in struct repository.
Of course, we currently only support SHA-1, so hard-code this value in
read_repository_format.  In the future, we'll enumerate this value from
the configuration.

Add a constant, current_hash, which points to the hash_algo structure
pointer in the repository global.  Include repository.h in cache.h since
we now need to have access to these struct and variable definitions.

Signed-off-by: brian m. carlson 
---
 cache.h  | 4 
 repository.c | 7 +++
 repository.h | 5 +
 setup.c  | 2 ++
 4 files changed, 18 insertions(+)

diff --git a/cache.h b/cache.h
index 375a7fb15e..d759824803 100644
--- a/cache.h
+++ b/cache.h
@@ -13,6 +13,7 @@
 #include "hash.h"
 #include "path.h"
 #include "sha1-array.h"
+#include "repository.h"
 
 #ifndef platform_SHA_CTX
 /*
@@ -112,6 +113,8 @@ struct git_hash_algo {
 };
 extern const struct git_hash_algo hash_algos[1];
 
+#define current_hash the_repository->hash_algo
+
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
 #define DTYPE(de)  ((de)->d_type)
 #else
@@ -894,6 +897,7 @@ struct repository_format {
int version;
int precious_objects;
int is_bare;
+   int hash_algo;
char *work_tree;
struct string_list unknown_extensions;
 };
diff --git a/repository.c b/repository.c
index 1617467568..37764f627a 100644
--- a/repository.c
+++ b/repository.c
@@ -62,6 +62,11 @@ void repo_set_gitdir(struct repository *repo, const char 
*path)
repo_setup_env(repo);
 }
 
+void repo_set_hash_algo(struct repository *repo, int hash_algo)
+{
+   repo->hash_algo = _algos[hash_algo];
+}
+
 /*
  * Attempt to resolve and set the provided 'gitdir' for repository 'repo'.
  * Return 0 upon success and a non-zero value upon failure.
@@ -134,6 +139,8 @@ int repo_init(struct repository *repo, const char *gitdir, 
const char *worktree)
if (read_and_verify_repository_format(, repo->commondir))
goto error;
 
+   repo->hash_algo = _algos[format.hash_algo];
+
if (worktree)
repo_set_worktree(repo, worktree);
 
diff --git a/repository.h b/repository.h
index 417787f3ef..f171172150 100644
--- a/repository.h
+++ b/repository.h
@@ -4,6 +4,7 @@
 struct config_set;
 struct index_state;
 struct submodule_cache;
+struct git_hash_algo;
 
 struct repository {
/* Environment */
@@ -67,6 +68,9 @@ struct repository {
 */
struct index_state *index;
 
+   /* Repository's current hash algorithm. */
+   const struct git_hash_algo *hash_algo;
+
/* Configurations */
/*
 * Bit used during initialization to indicate if repository state (like
@@ -86,6 +90,7 @@ extern struct repository *the_repository;
 
 extern void repo_set_gitdir(struct repository *repo, const char *path);
 extern void repo_set_worktree(struct repository *repo, const char *path);
+extern void repo_set_hash_algo(struct repository *repo, int algo);
 extern int repo_init(struct repository *repo, const char *gitdir, const char 
*worktree);
 extern int repo_submodule_init(struct repository *submodule,
   struct repository *superproject,
diff --git a/setup.c b/setup.c
index 115e70a4e8..289e24811c 100644
--- a/setup.c
+++ b/setup.c
@@ -491,6 +491,7 @@ int read_repository_format(struct repository_format 
*format, const char *path)
memset(format, 0, sizeof(*format));
format->version = -1;
format->is_bare = -1;
+   format->hash_algo = GIT_HASH_SHA1;
string_list_init(>unknown_extensions, 1);
git_config_from_file(check_repo_format, path, format);
return format->version;
@@ -1125,6 +1126,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
repo_set_gitdir(the_repository, gitdir);
setup_git_env();
}
+   repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
}
 
strbuf_release();


[RFC PATCH 4/6] Add structure representing hash algorithm

2017-08-20 Thread brian m. carlson
Since in the future we want to support an additional hash algorithm, add
a structure that represents a hash algorithm and all the data that must
go along with it.  Add a constant to allow easy enumeration of hash
algorithms.  Implement function typedefs to create an abstract API that
can be used by any hash algorithm, and wrappers for the existing SHA1
functions that conform to this API.

Don't include an entry in the hash algorithm structure for the null
object ID.  As this value is all zeros, any suitably sized all-zero
object ID can be used, and there's no need to store a given one on a
per-hash basis.

Signed-off-by: brian m. carlson 
---
 cache.h | 36 
 sha1_file.c | 29 +
 2 files changed, 65 insertions(+)

diff --git a/cache.h b/cache.h
index 1c69d2a05a..375a7fb15e 100644
--- a/cache.h
+++ b/cache.h
@@ -76,6 +76,42 @@ struct object_id {
unsigned char hash[GIT_MAX_RAWSZ];
 };
 
+#define GIT_HASH_SHA1 0
+
+typedef void (*git_hash_init_fn)(void *ctx);
+typedef void (*git_hash_update_fn)(void *ctx, const void *in, size_t len);
+typedef void (*git_hash_final_fn)(unsigned char *hash, void *ctx);
+
+struct git_hash_algo {
+   /* The name of the algorithm, as appears in the config file. */
+   const char *name;
+
+   /* The size of a hash context (e.g. git_SHA_CTX). */
+   size_t ctxsz;
+
+   /* The length of the hash in bytes. */
+   size_t rawsz;
+
+   /* The length of the hash in hex characters. */
+   size_t hexsz;
+
+   /* The hash initialization function. */
+   git_hash_init_fn init_fn;
+
+   /* The hash update function. */
+   git_hash_update_fn update_fn;
+
+   /* The hash finalization function. */
+   git_hash_final_fn final_fn;
+
+   /* The OID of the empty tree. */
+   const struct object_id *empty_tree;
+
+   /* The OID of the empty blob. */
+   const struct object_id *empty_blob;
+};
+extern const struct git_hash_algo hash_algos[1];
+
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
 #define DTYPE(de)  ((de)->d_type)
 #else
diff --git a/sha1_file.c b/sha1_file.c
index b60ae15f70..bd9ff02802 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -41,6 +41,35 @@ const struct object_id empty_blob_oid = {
EMPTY_BLOB_SHA1_BIN_LITERAL
 };
 
+static inline void git_hash_sha1_init(void *ctx)
+{
+   git_SHA1_Init((git_SHA_CTX *)ctx);
+}
+
+static inline void git_hash_sha1_update(void *ctx, const void *data, size_t 
len)
+{
+   git_SHA1_Update((git_SHA_CTX *)ctx, data, len);
+}
+
+static inline void git_hash_sha1_final(unsigned char *hash, void *ctx)
+{
+   git_SHA1_Final(hash, (git_SHA_CTX *)ctx);
+}
+
+const struct git_hash_algo hash_algos[1] = {
+   [GIT_HASH_SHA1] = {
+   .name = "sha-1",
+   .ctxsz = sizeof(git_SHA_CTX),
+   .rawsz = GIT_SHA1_RAWSZ,
+   .hexsz = GIT_SHA1_HEXSZ,
+   .init_fn = git_hash_sha1_init,
+   .update_fn = git_hash_sha1_update,
+   .final_fn = git_hash_sha1_final,
+   .empty_tree = _tree_oid,
+   .empty_blob = _blob_oid,
+   },
+};
+
 /*
  * This is meant to hold a *small* number of objects that you would
  * want read_sha1_file() to be able to return, but yet you do not want


[RFC PATCH 2/6] vcs-svn: rename repo functions to "svn_repo"

2017-08-20 Thread brian m. carlson
There were several functions in the Subversion code that started with
"repo_".  This namespace is also used by the Git struct repository code.
Rename these functions to start with "svn_repo" to avoid any future
conflicts.

Signed-off-by: brian m. carlson 
---
 vcs-svn/repo_tree.c | 6 +++---
 vcs-svn/repo_tree.h | 6 +++---
 vcs-svn/svndump.c   | 8 
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/vcs-svn/repo_tree.c b/vcs-svn/repo_tree.c
index 67d27f0b6c..d77cb0ada7 100644
--- a/vcs-svn/repo_tree.c
+++ b/vcs-svn/repo_tree.c
@@ -8,7 +8,7 @@
 #include "repo_tree.h"
 #include "fast_export.h"
 
-const char *repo_read_path(const char *path, uint32_t *mode_out)
+const char *svn_repo_read_path(const char *path, uint32_t *mode_out)
 {
int err;
static struct strbuf buf = STRBUF_INIT;
@@ -25,7 +25,7 @@ const char *repo_read_path(const char *path, uint32_t 
*mode_out)
return buf.buf;
 }
 
-void repo_copy(uint32_t revision, const char *src, const char *dst)
+void svn_repo_copy(uint32_t revision, const char *src, const char *dst)
 {
int err;
uint32_t mode;
@@ -42,7 +42,7 @@ void repo_copy(uint32_t revision, const char *src, const char 
*dst)
fast_export_modify(dst, mode, data.buf);
 }
 
-void repo_delete(const char *path)
+void svn_repo_delete(const char *path)
 {
fast_export_delete(path);
 }
diff --git a/vcs-svn/repo_tree.h b/vcs-svn/repo_tree.h
index d0f5690dca..555b64bbb6 100644
--- a/vcs-svn/repo_tree.h
+++ b/vcs-svn/repo_tree.h
@@ -9,8 +9,8 @@ struct strbuf;
 #define REPO_MODE_LNK 012
 
 uint32_t next_blob_mark(void);
-void repo_copy(uint32_t revision, const char *src, const char *dst);
-const char *repo_read_path(const char *path, uint32_t *mode_out);
-void repo_delete(const char *path);
+void svn_repo_copy(uint32_t revision, const char *src, const char *dst);
+const char *svn_repo_read_path(const char *path, uint32_t *mode_out);
+void svn_repo_delete(const char *path);
 
 #endif
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 1846685a21..37c4a36b92 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -225,15 +225,15 @@ static void handle_node(void)
if (have_text || have_props || node_ctx.srcRev)
die("invalid dump: deletion node has "
"copyfrom info, text, or properties");
-   repo_delete(node_ctx.dst.buf);
+   svn_repo_delete(node_ctx.dst.buf);
return;
}
if (node_ctx.action == NODEACT_REPLACE) {
-   repo_delete(node_ctx.dst.buf);
+   svn_repo_delete(node_ctx.dst.buf);
node_ctx.action = NODEACT_ADD;
}
if (node_ctx.srcRev) {
-   repo_copy(node_ctx.srcRev, node_ctx.src.buf, node_ctx.dst.buf);
+   svn_repo_copy(node_ctx.srcRev, node_ctx.src.buf, 
node_ctx.dst.buf);
if (node_ctx.action == NODEACT_ADD)
node_ctx.action = NODEACT_CHANGE;
}
@@ -249,7 +249,7 @@ static void handle_node(void)
old_data = NULL;
} else if (node_ctx.action == NODEACT_CHANGE) {
uint32_t mode;
-   old_data = repo_read_path(node_ctx.dst.buf, );
+   old_data = svn_repo_read_path(node_ctx.dst.buf, );
if (mode == REPO_MODE_DIR && type != REPO_MODE_DIR)
die("invalid dump: cannot modify a directory into a 
file");
if (mode != REPO_MODE_DIR && type == REPO_MODE_DIR)


[RFC PATCH 0/6] Hash Abstraction

2017-08-20 Thread brian m. carlson
= Overview

This is an RFC series proposing a basic abstraction for hash functions.

As we get closer to converting the remainder of the codebase to use
struct object_id, we should think about the design we want our hash
function abstraction to take.  This series is a proposal for one idea to
start discussion.  Input on any aspect of this proposal is welcome.

This series exposes a struct git_hash_algo that contains basic
information about a given hash algorithm that distinguishes it from
other algorithms: name, lengths, implementing functions, and empty tree
and blob constants.  It also exposes an array of hash algorithms, and a
constant for indexing them.

The series also demonstrates a simple conversion using the abstraction
over empty blob and tree values.

In order to avoid conflicting with the struct repository work and with
the goal of avoiding global variables as much as possible, I've pushed
the hash algorithm into struct repository and exposed it via a #define.
This necessitiates pulling repository.h into cache.h, which I don't
think is fatal.  Doing that, in turn, necessitated some work on the
Subversion code to avoid conflicts.

It should be fine for Junio to pick up the first two patches from this
series, as they're relatively independent and valuable without the rest
of the series.  The rest should not be applied immediately, although
they do pass the testsuite.

I proposed this series now as it will inform the way we go about
converting other parts of the codebase, especially some of the pack
algorithms.  Because we share some hash computation code between pack
checksums and object hashing, we need to decide whether to expose pack
checksums as struct object_id, even though they are technically not
object IDs.  Furthermore, if we end up needing to stuff an algorithm
value into struct object_id, we'll no longer be able to directly
reference object IDs in a pack without a copy.

This series is available from the usual places as branch hash-struct,
based against master.

= Naming

The length names are similar to the current constant names
intentionally.  I've used the "hash_algo" name for both the integer
constant and the pointer to struct, although we could change the latter
to "hash_impl" or such as people like.

I chose to name the define "current_hash" and expose no other defines.
The name is relatively short since we're going to be typing it a lot.
However, if people like, we can capitalize it or expose other defines
(say, a GIT_HASH_RAWSZ or GIT_HASH_HEXSZ) instead of or in addition to
current_hash, which would make this name less interesting.

Feel free to propose alternatives to the naming of anything in this
series.

= Open Issues

I originally decided to convert hex.c as an example, but quickly found
out that this caused segfaults.  As part of setup, we call
is_git_directory, which calls validate_headref, which ends up calling
get_sha1_hex.  Obviously, we don't have a repository, so the hash
algorithm isn't set up yet.  This is an area we'll need to consider
making hash function agnostic, and we may also need to consider
inserting a hash constant integer into struct object_id if we're going
to do that.  Alternatively, we could just paper over this issue as a
special case.

Clearly we're going to want to expose some sort of lookup functionality
for hash algorithms.  We'll need to expose lookup by name (for the
.git/config file and any command-line options), but we may want other
functions as well.  What functions should those be?  Should we expose
the structure or the constant for those lookup functions?  If the
structure, we'll probably need to expose the constant in the structure
as well for easy use.

Should we avoid exposing the array of structure altogether and wrap this
in a function?

We could expose a union of hash context structures and take that as the
pointer type for the API calls.  That would probably obviate the need
for ctxsz.

We could expose hex versions of the blob constants if desired.  This
might make converting the remaining pieces of code that use them easier.

There are probably dozens of other things I haven't thought of yet as
well.

brian m. carlson (6):
  vcs-svn: remove unused prototypes
  vcs-svn: rename repo functions to "svn_repo"
  setup: expose enumerated repo info
  Add structure representing hash algorithm
  Integrate hash algorithm support with repo setup
  Switch empty tree and blob lookups to use hash abstraction

 builtin/am.c|  2 +-
 builtin/checkout.c  |  2 +-
 builtin/diff.c  |  2 +-
 builtin/pull.c  |  2 +-
 cache.h | 48 
 diff-lib.c  |  2 +-
 merge-recursive.c   |  2 +-
 notes-merge.c   |  2 +-
 repository.c|  7 +++
 repository.h|  5 +
 sequencer.c |  6 +++---
 setup.c | 48 +++-
 sha1_file.c | 29 +
 submodule.c |  2 +-
 

[RFC PATCH 6/6] Switch empty tree and blob lookups to use hash abstraction

2017-08-20 Thread brian m. carlson
Switch the uses of empty_tree_oid and empty_blob_oid to use the
current_hash abstraction that represents the current hash algorithm in
use.

Signed-off-by: brian m. carlson 
---
 builtin/am.c   | 2 +-
 builtin/checkout.c | 2 +-
 builtin/diff.c | 2 +-
 builtin/pull.c | 2 +-
 cache.h| 8 
 diff-lib.c | 2 +-
 merge-recursive.c  | 2 +-
 notes-merge.c  | 2 +-
 sequencer.c| 6 +++---
 submodule.c| 2 +-
 10 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 40cc6d6fe8..347b30bd1f 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1435,7 +1435,7 @@ static void write_index_patch(const struct am_state 
*state)
if (!get_oid_tree("HEAD", ))
tree = lookup_tree();
else
-   tree = lookup_tree(_tree_oid);
+   tree = lookup_tree(current_hash->empty_tree);
 
fp = xfopen(am_path(state, "patch"), "w");
init_revisions(_info, NULL);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 2d75ac66c7..efdec232b7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -515,7 +515,7 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
}
tree = parse_tree_indirect(old->commit ?
   >commit->object.oid :
-  _tree_oid);
+  current_hash->empty_tree);
init_tree_desc([0], tree->buffer, tree->size);
tree = parse_tree_indirect(>commit->object.oid);
init_tree_desc([1], tree->buffer, tree->size);
diff --git a/builtin/diff.c b/builtin/diff.c
index 7cde6abbcf..4cb5d23899 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -382,7 +382,7 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
add_head_to_pending();
if (!rev.pending.nr) {
struct tree *tree;
-   tree = lookup_tree(_tree_oid);
+   tree = 
lookup_tree(current_hash->empty_tree);
add_pending_object(, >object, 
"HEAD");
}
break;
diff --git a/builtin/pull.c b/builtin/pull.c
index 9b86e519b1..4fa5db2fc9 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -537,7 +537,7 @@ static int pull_into_void(const struct object_id 
*merge_head,
 * index/worktree changes that the user already made on the unborn
 * branch.
 */
-   if (checkout_fast_forward(_tree_oid, merge_head, 0))
+   if (checkout_fast_forward(current_hash->empty_tree, merge_head, 0))
return 1;
 
if (update_ref("initial pull", "HEAD", merge_head->hash, 
curr_head->hash, 0, UPDATE_REFS_DIE_ON_ERR))
diff --git a/cache.h b/cache.h
index d759824803..23d2e29fcd 100644
--- a/cache.h
+++ b/cache.h
@@ -1051,22 +1051,22 @@ extern const struct object_id empty_blob_oid;
 
 static inline int is_empty_blob_sha1(const unsigned char *sha1)
 {
-   return !hashcmp(sha1, EMPTY_BLOB_SHA1_BIN);
+   return !hashcmp(sha1, current_hash->empty_blob->hash);
 }
 
 static inline int is_empty_blob_oid(const struct object_id *oid)
 {
-   return !hashcmp(oid->hash, EMPTY_BLOB_SHA1_BIN);
+   return !oidcmp(oid, current_hash->empty_blob);
 }
 
 static inline int is_empty_tree_sha1(const unsigned char *sha1)
 {
-   return !hashcmp(sha1, EMPTY_TREE_SHA1_BIN);
+   return !hashcmp(sha1, current_hash->empty_tree->hash);
 }
 
 static inline int is_empty_tree_oid(const struct object_id *oid)
 {
-   return !hashcmp(oid->hash, EMPTY_TREE_SHA1_BIN);
+   return !oidcmp(oid, current_hash->empty_tree);
 }
 
 /* set default permissions by passing mode arguments to open(2) */
diff --git a/diff-lib.c b/diff-lib.c
index 2a52b07954..0575f4cb1b 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -216,7 +216,7 @@ int run_diff_files(struct rev_info *revs, unsigned int 
option)
} else if (revs->diffopt.ita_invisible_in_index &&
   ce_intent_to_add(ce)) {
diff_addremove(>diffopt, '+', ce->ce_mode,
-  _tree_oid, 0,
+  current_hash->empty_tree, 0,
   ce->name, 0);
continue;
}
diff --git a/merge-recursive.c b/merge-recursive.c
index 1494ffdb82..981fb668f9 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2043,7 +2043,7 @@ int merge_recursive(struct merge_options *o,
/* if there is no common ancestor, use an empty tree */
struct tree *tree;
 
-   tree = lookup_tree(_tree_oid);
+  

[RFC PATCH 1/6] vcs-svn: remove unused prototypes

2017-08-20 Thread brian m. carlson
The Subversion code had prototypes for several functions which were not
ever defined or used.  These functions all had names starting with
"repo_", some of which conflict with those in repository.h.  To avoid
the conflict, remove those unused prototypes.

Signed-off-by: brian m. carlson 
---
 vcs-svn/repo_tree.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/vcs-svn/repo_tree.h b/vcs-svn/repo_tree.h
index 889c6a3c95..d0f5690dca 100644
--- a/vcs-svn/repo_tree.h
+++ b/vcs-svn/repo_tree.h
@@ -10,14 +10,7 @@ struct strbuf;
 
 uint32_t next_blob_mark(void);
 void repo_copy(uint32_t revision, const char *src, const char *dst);
-void repo_add(const char *path, uint32_t mode, uint32_t blob_mark);
 const char *repo_read_path(const char *path, uint32_t *mode_out);
 void repo_delete(const char *path);
-void repo_commit(uint32_t revision, const char *author,
-   const struct strbuf *log, const char *uuid, const char *url,
-   long unsigned timestamp);
-void repo_diff(uint32_t r1, uint32_t r2);
-void repo_init(void);
-void repo_reset(void);
 
 #endif


Re: Please fix the useless email prompts

2017-08-20 Thread Junio C Hamano
Andrew Ardill  writes:

> Is there any reason `git pull` can't delay that check until the point
> where it actually tries to create a new commit? It's fair enough to
> error if a new commit needs to be made, and there is no user
> configured, but for the use cases discussed here it seems a little
> eager to error on the chance that the user will be needed.

I personally do not think it is a good trade-off.

In theory [*1*], it _is_ possible to delay the "the given identity
looks bogus" check so that the underlying "git fetch" is done
without it, and bypass the check when "pull" fast-forwards, as there
is no need for an extra merge commit in such a case as you noticed.
We still do record the bogus identity in the reflog of the HEAD ref,
but IIRC, we do not trigger a severe error when a bogus identity is
only needed for reflogs.

But before running "fetch", you cannot tell if the "pull" will
fast-forward, so such an "optimization" may actually be a net loss
for end users, who have to wait for network delay only to be told
that you'd end up with a history with bogus identity and need to
redo the operation after fixing your identity.  Then after that,
they are likely to do another "git pull", which will avoid the cost
of retransmission of objects if (and only if) the initial "git pull"
uses remote-tracking branches.


[Footnote]

*1* We actually might already do such an "optimization"; I didn't
check.


Re: Please fix the useless email prompts

2017-08-20 Thread Andrew Ardill
Hi Anatoli,

On 21 August 2017 at 07:57, Anatolii Borodin  wrote:
> On Sun, Aug 20, 2017 at 2:40 PM, Andrew Ardill  
> wrote:
>> Maybe I am missing something obvious, but if that's the case then
>> can't we just do the identity check when trying to make new commits,
>> in which case you should be able to pull without setting your
>> identity?
>
> `git pull` is `git fetch + git merge / git rebase` in disguise, so we
> should be ready if git will want to create a merge commit or do a
> rebase automatically (and potentially create new commits with
> `Committer` set to the current user). `git fetch` and `git clone`
> alone, `git branch`, `git checkout` etc don't care about the email (as
> of 2.14.1), even if `user.useConfigOnly` is set to `true`.

Is there any reason `git pull` can't delay that check until the point
where it actually tries to create a new commit? It's fair enough to
error if a new commit needs to be made, and there is no user
configured, but for the use cases discussed here it seems a little
eager to error on the chance that the user will be needed.

It seems nicer for the user if the `git fetch` happens first, and if
the merge is not a fast forward, and there is no user configured, that
the error pops then. I don't know if this idea of "do as much as
possible before erroring" is consistent with any other errors we
handle.

Regards,

Andrew Ardill


Re: Please fix the useless email prompts

2017-08-20 Thread Anatolii Borodin
Hi Andrew,

On Sun, Aug 20, 2017 at 2:40 PM, Andrew Ardill  wrote:
> Maybe I am missing something obvious, but if that's the case then
> can't we just do the identity check when trying to make new commits,
> in which case you should be able to pull without setting your
> identity?

`git pull` is `git fetch + git merge / git rebase` in disguise, so we
should be ready if git will want to create a merge commit or do a
rebase automatically (and potentially create new commits with
`Committer` set to the current user). `git fetch` and `git clone`
alone, `git branch`, `git checkout` etc don't care about the email (as
of 2.14.1), even if `user.useConfigOnly` is set to `true`.


-- 
Mit freundlichen Grüßen,
Anatolii Borodin


Re: Please fix the useless email prompts

2017-08-20 Thread Anatolii Borodin
Hi Jeffrey,


On Sat, Aug 19, 2017 at 5:10 PM, Jeffrey Walton  wrote:
> *** Please tell me who you are.

Which version of git do you use? Do you have user.useConfigOnly set to
true anywhere in the config files?

-- 
Mit freundlichen Grüßen,
Anatolii Borodin


Re: [PATCH 2/2] rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved

2017-08-20 Thread Ramsay Jones


On 19/08/17 21:30, Junio C Hamano wrote:
> These two configuration variables are described in the documentation
> to take an expiry period expressed in the number of days:
> 
> gc.rerereResolved::
>   Records of conflicted merge you resolved earlier are
>   kept for this many days when 'git rerere gc' is run.
>   The default is 60 days.
> 
> gc.rerereUnresolved::
>   Records of conflicted merge you have not resolved are
>   kept for this many days when 'git rerere gc' is run.
>   The default is 15 days.
> 
> There is no strong reason not to allow a more general "approxidate"
> expiry specification, e.g. "5.days.ago", or "never".
> 
> Tweak the config_get_expiry() helper introduced in the previous step
> to use date.c::parse_expiry_date() to do so.
> 
> In the future, we may find other variables that only allow an
> integer that specifies "this many days" (or other unit of time) and
> allow them to also do the same, and at that point we probably would
> want to move the helper to a place that is not specific to the
> rerere machinery.  Perhaps config.c would be such a good neutral
> place, as it will allow git_parse_signed() to go back to static to
> the file.

You make git_parse_unsigned() external in this patch, in addition
to git_parse_signed(), but don't actually call it. Was that intended?

ATB,
Ramsay Jones

> 
> But this will do for now.
> 
> Signed-off-by: Junio C Hamano 
> ---
>  Documentation/config.txt |  2 ++
>  config.c |  4 ++--
>  config.h |  3 +++
>  rerere.c | 14 --
>  4 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index d5c9c4cab6..ac95f5f954 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1553,11 +1553,13 @@ gc..reflogExpireUnreachable::
>  gc.rerereResolved::
>   Records of conflicted merge you resolved earlier are
>   kept for this many days when 'git rerere gc' is run.
> + You can also use more human-readable "1.month.ago", etc.
>   The default is 60 days.  See linkgit:git-rerere[1].
>  
>  gc.rerereUnresolved::
>   Records of conflicted merge you have not resolved are
>   kept for this many days when 'git rerere gc' is run.
> + You can also use more human-readable "1.month.ago", etc.
>   The default is 15 days.  See linkgit:git-rerere[1].
>  
>  gitcvs.commitMsgAnnotation::
> diff --git a/config.c b/config.c
> index 231f9a750b..ac9071c5cf 100644
> --- a/config.c
> +++ b/config.c
> @@ -769,7 +769,7 @@ static int parse_unit_factor(const char *end, uintmax_t 
> *val)
>   return 0;
>  }
>  
> -static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
> +int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
>  {
>   if (value && *value) {
>   char *end;
> @@ -799,7 +799,7 @@ static int git_parse_signed(const char *value, intmax_t 
> *ret, intmax_t max)
>   return 0;
>  }
>  
> -static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t 
> max)
> +int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
>  {
>   if (value && *value) {
>   char *end;
> diff --git a/config.h b/config.h
> index 0352da117b..039a9295de 100644
> --- a/config.h
> +++ b/config.h
> @@ -215,4 +215,7 @@ struct key_value_info {
>  extern NORETURN void git_die_config(const char *key, const char *err, ...) 
> __attribute__((format(printf, 2, 3)));
>  extern NORETURN void git_die_config_linenr(const char *key, const char 
> *filename, int linenr);
>  
> +int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max);
> +int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
> +
>  #endif /* CONFIG_H */
> diff --git a/rerere.c b/rerere.c
> index f0b4bce881..8bbdfe8569 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -1178,11 +1178,21 @@ static void prune_one(struct rerere_id *id,
>  
>  static void config_get_expiry(const char *key, timestamp_t *cutoff, 
> timestamp_t now)
>  {
> - int days;
> + char *expiry_string;
> + intmax_t days;
> + timestamp_t when;
>  
> - if (!git_config_get_int(key, )) {
> + if (git_config_get_string(key, _string))
> + return;
> +
> + if (git_parse_signed(expiry_string, , 
> maximum_signed_value_of_type(int))) {
>   const int scale = 86400;
>   *cutoff = now - days * scale;
> + return;
> + }
> +
> + if (!parse_expiry_date(expiry_string, )) {
> + *cutoff = when;
>   }
>  }
>  
> 


Re: [PATCH 5/6] sha1_file: convert hash_sha1_file_literally to struct object_id

2017-08-20 Thread brian m. carlson
On Sun, Aug 20, 2017 at 10:09:30PM +0200, Patryk Obara wrote:
> Convert all remaining callers as well.
> 
> Signed-off-by: Patryk Obara 
> ---
>  builtin/hash-object.c | 2 +-
>  cache.h   | 2 +-
>  sha1_file.c   | 8 
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/builtin/hash-object.c b/builtin/hash-object.c
> index 8a58ce0..c532ff9 100644
> --- a/builtin/hash-object.c
> +++ b/builtin/hash-object.c
> @@ -24,7 +24,7 @@ static int hash_literally(struct object_id *oid, int fd, 
> const char *type, unsig
>   if (strbuf_read(, fd, 4096) < 0)
>   ret = -1;
>   else
> - ret = hash_sha1_file_literally(buf.buf, buf.len, type, 
> oid->hash, flags);
> + ret = hash_sha1_file_literally(buf.buf, buf.len, type, oid, 
> flags);
>   strbuf_release();
>   return ret;
>  }
> diff --git a/cache.h b/cache.h
> index eaf3603..237adb5 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1199,7 +1199,7 @@ static inline const unsigned char 
> *lookup_replace_object(const unsigned char *sh
>  extern int sha1_object_info(const unsigned char *, unsigned long *);
>  extern int hash_sha1_file(const void *buf, unsigned long len, const char 
> *type, unsigned char *sha1);
>  extern int write_sha1_file(const void *buf, unsigned long len, const char 
> *type, unsigned char *return_sha1);
> -extern int hash_sha1_file_literally(const void *buf, unsigned long len, 
> const char *type, unsigned char *sha1, unsigned flags);
> +extern int hash_sha1_file_literally(const void *buf, unsigned long len, 
> const char *type, struct object_id *oid, unsigned flags);

We probably want to rename this function, since it no longer handles
exclusively SHA-1.  When I've made changes to the "_sha1_file"
functions, I've converted them to "_object_file" instead.  However, if
people like "_oid_file", we could do that instead.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 0/6] Convert hash-object to struct object_id

2017-08-20 Thread brian m. carlson
On Sun, Aug 20, 2017 at 10:09:25PM +0200, Patryk Obara wrote:
> This enabled conversion of few functions in sha1_file, which
> had almost all callers converted already.
> 
> I hope I'm not stepping on anyone's toes with this patch series.
> If I do - is there some email thread or document in which I can
> coordinate with other developers, regarding which code regions
> are being converted to struct object_id next?

We don't have a coordinated thread at the moment.  You can see what I'm
working on at https://github.com/bk2204/git.git in the object-id-part10
and object-id-part11 branches (based on an older next).

However, having said that, I don't mind if you or others pick up various
parts of the codebase.  At worst, I drop a few patches for things others
have already converted.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


[PATCH 5/6] sha1_file: convert hash_sha1_file_literally to struct object_id

2017-08-20 Thread Patryk Obara
Convert all remaining callers as well.

Signed-off-by: Patryk Obara 
---
 builtin/hash-object.c | 2 +-
 cache.h   | 2 +-
 sha1_file.c   | 8 
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index 8a58ce0..c532ff9 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -24,7 +24,7 @@ static int hash_literally(struct object_id *oid, int fd, 
const char *type, unsig
if (strbuf_read(, fd, 4096) < 0)
ret = -1;
else
-   ret = hash_sha1_file_literally(buf.buf, buf.len, type, 
oid->hash, flags);
+   ret = hash_sha1_file_literally(buf.buf, buf.len, type, oid, 
flags);
strbuf_release();
return ret;
 }
diff --git a/cache.h b/cache.h
index eaf3603..237adb5 100644
--- a/cache.h
+++ b/cache.h
@@ -1199,7 +1199,7 @@ static inline const unsigned char 
*lookup_replace_object(const unsigned char *sh
 extern int sha1_object_info(const unsigned char *, unsigned long *);
 extern int hash_sha1_file(const void *buf, unsigned long len, const char 
*type, unsigned char *sha1);
 extern int write_sha1_file(const void *buf, unsigned long len, const char 
*type, unsigned char *return_sha1);
-extern int hash_sha1_file_literally(const void *buf, unsigned long len, const 
char *type, unsigned char *sha1, unsigned flags);
+extern int hash_sha1_file_literally(const void *buf, unsigned long len, const 
char *type, struct object_id *oid, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned 
char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
 extern int git_open_cloexec(const char *name, int flags);
diff --git a/sha1_file.c b/sha1_file.c
index 11995e5..3e2ef4e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3437,7 +3437,7 @@ int write_sha1_file(const void *buf, unsigned long len, 
const char *type, unsign
 }
 
 int hash_sha1_file_literally(const void *buf, unsigned long len, const char 
*type,
-unsigned char *sha1, unsigned flags)
+ struct object_id *oid, unsigned flags)
 {
char *header;
int hdrlen, status = 0;
@@ -3445,13 +3445,13 @@ int hash_sha1_file_literally(const void *buf, unsigned 
long len, const char *typ
/* type string, SP, %lu of the length plus NUL must fit this */
hdrlen = strlen(type) + 32;
header = xmalloc(hdrlen);
-   write_sha1_file_prepare(buf, len, type, sha1, header, );
+   write_sha1_file_prepare(buf, len, type, oid->hash, header, );
 
if (!(flags & HASH_WRITE_OBJECT))
goto cleanup;
-   if (freshen_packed_object(sha1) || freshen_loose_object(sha1))
+   if (freshen_packed_object(oid->hash) || freshen_loose_object(oid->hash))
goto cleanup;
-   status = write_loose_object(sha1, header, hdrlen, buf, len, 0);
+   status = write_loose_object(oid->hash, header, hdrlen, buf, len, 0);
 
 cleanup:
free(header);
-- 
2.9.5



[PATCH 4/6] sha1_file: convert index_fd to struct object_id

2017-08-20 Thread Patryk Obara
Convert all remaining callers as well.

Signed-off-by: Patryk Obara 
---
 builtin/difftool.c|  2 +-
 builtin/hash-object.c |  2 +-
 builtin/replace.c |  2 +-
 cache.h   |  2 +-
 read-cache.c  |  2 +-
 sha1_file.c   | 14 +++---
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/builtin/difftool.c b/builtin/difftool.c
index 8864d84..b2d3ba7 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -111,7 +111,7 @@ static int use_wt_file(const char *workdir, const char 
*name,
int fd = open(buf.buf, O_RDONLY);
 
if (fd >= 0 &&
-   !index_fd(wt_oid.hash, fd, , OBJ_BLOB, name, 0)) {
+   !index_fd(_oid, fd, , OBJ_BLOB, name, 0)) {
if (is_null_oid(oid)) {
oidcpy(oid, _oid);
use = 1;
diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index 1c0f0f3..8a58ce0 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -38,7 +38,7 @@ static void hash_fd(int fd, const char *type, const char 
*path, unsigned flags,
if (fstat(fd, ) < 0 ||
(literally
 ? hash_literally(, fd, type, flags)
-: index_fd(oid.hash, fd, , type_from_string(type), path, 
flags)))
+: index_fd(, fd, , type_from_string(type), path, flags)))
die((flags & HASH_WRITE_OBJECT)
? "Unable to add %s to database"
: "Unable to hash %s", path);
diff --git a/builtin/replace.c b/builtin/replace.c
index f4a85a1..3e71a77 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -269,7 +269,7 @@ static void import_object(struct object_id *oid, enum 
object_type type,
 
if (fstat(fd, ) < 0)
die_errno("unable to fstat %s", filename);
-   if (index_fd(oid->hash, fd, , type, NULL, flags) < 0)
+   if (index_fd(oid, fd, , type, NULL, flags) < 0)
die("unable to write object to database");
/* index_fd close()s fd for us */
}
diff --git a/cache.h b/cache.h
index 380868d..eaf3603 100644
--- a/cache.h
+++ b/cache.h
@@ -684,7 +684,7 @@ extern int ie_modified(const struct index_state *, const 
struct cache_entry *, s
 
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
-extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum 
object_type type, const char *path, unsigned flags);
+extern int index_fd(struct object_id *oid, int fd, struct stat *st, enum 
object_type type, const char *path, unsigned flags);
 extern int index_path(struct object_id *oid, const char *path, struct stat 
*st, unsigned flags);
 
 /*
diff --git a/read-cache.c b/read-cache.c
index 17f19a1..9b41058 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -161,7 +161,7 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)
 
if (fd >= 0) {
struct object_id oid;
-   if (!index_fd(oid.hash, fd, st, OBJ_BLOB, ce->name, 0))
+   if (!index_fd(, fd, st, OBJ_BLOB, ce->name, 0))
match = oidcmp(, >oid);
/* index_fd() closed the file descriptor already */
}
diff --git a/sha1_file.c b/sha1_file.c
index 6a2a48b..11995e5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3662,8 +3662,8 @@ static int index_stream(unsigned char *sha1, int fd, 
size_t size,
return index_bulk_checkin(sha1, fd, size, type, path, flags);
 }
 
-int index_fd(unsigned char *sha1, int fd, struct stat *st,
-enum object_type type, const char *path, unsigned flags)
+int index_fd(struct object_id *oid, int fd, struct stat *st,
+ enum object_type type, const char *path, unsigned flags)
 {
int ret;
 
@@ -3672,15 +3672,15 @@ int index_fd(unsigned char *sha1, int fd, struct stat 
*st,
 * die() for large files.
 */
if (type == OBJ_BLOB && path && would_convert_to_git_filter_fd(path))
-   ret = index_stream_convert_blob(sha1, fd, path, flags);
+   ret = index_stream_convert_blob(oid->hash, fd, path, flags);
else if (!S_ISREG(st->st_mode))
-   ret = index_pipe(sha1, fd, type, path, flags);
+   ret = index_pipe(oid->hash, fd, type, path, flags);
else if (st->st_size <= big_file_threshold || type != OBJ_BLOB ||
 (path && would_convert_to_git(_index, path)))
-   ret = index_core(sha1, fd, xsize_t(st->st_size), type, path,
+   ret = index_core(oid->hash, fd, xsize_t(st->st_size), type, 
path,
 flags);
else
-   ret = index_stream(sha1, fd, xsize_t(st->st_size), type, path,
+   ret = index_stream(oid->hash, fd, xsize_t(st->st_size), type, 
path,
   flags);
close(fd);
return ret;
@@ -3696,7 +3696,7 

[PATCH 2/6] read-cache: convert to struct object_id

2017-08-20 Thread Patryk Obara
Replace hashcmp with oidcmp.

Signed-off-by: Patryk Obara 
---
 read-cache.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index acfb028..7285608 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -160,9 +160,9 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)
int fd = git_open_cloexec(ce->name, O_RDONLY);
 
if (fd >= 0) {
-   unsigned char sha1[20];
-   if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
-   match = hashcmp(sha1, ce->oid.hash);
+   struct object_id oid;
+   if (!index_fd(oid.hash, fd, st, OBJ_BLOB, ce->name, 0))
+   match = oidcmp(, >oid);
/* index_fd() closed the file descriptor already */
}
return match;
-- 
2.9.5



[PATCH 6/6] sha1_file: convert index_stream to struct object_id

2017-08-20 Thread Patryk Obara
Signed-off-by: Patryk Obara 
---
 sha1_file.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 3e2ef4e..8d6960a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3655,11 +3655,11 @@ static int index_core(unsigned char *sha1, int fd, 
size_t size,
  * binary blobs, they generally do not want to get any conversion, and
  * callers should avoid this code path when filters are requested.
  */
-static int index_stream(unsigned char *sha1, int fd, size_t size,
+static int index_stream(struct object_id *oid, int fd, size_t size,
enum object_type type, const char *path,
unsigned flags)
 {
-   return index_bulk_checkin(sha1, fd, size, type, path, flags);
+   return index_bulk_checkin(oid->hash, fd, size, type, path, flags);
 }
 
 int index_fd(struct object_id *oid, int fd, struct stat *st,
@@ -3680,7 +3680,7 @@ int index_fd(struct object_id *oid, int fd, struct stat 
*st,
ret = index_core(oid->hash, fd, xsize_t(st->st_size), type, 
path,
 flags);
else
-   ret = index_stream(oid->hash, fd, xsize_t(st->st_size), type, 
path,
+   ret = index_stream(oid, fd, xsize_t(st->st_size), type, path,
   flags);
close(fd);
return ret;
-- 
2.9.5



[PATCH 3/6] sha1_file: convert index_path to struct object_id

2017-08-20 Thread Patryk Obara
Convert all remaining callers as well.

Signed-off-by: Patryk Obara 
---
 builtin/update-index.c |  2 +-
 cache.h|  2 +-
 diff.c |  2 +-
 notes-merge.c  |  2 +-
 read-cache.c   |  2 +-
 sha1_file.c| 10 +-
 6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 56721cf..d562f2e 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -280,7 +280,7 @@ static int add_one_path(const struct cache_entry *old, 
const char *path, int len
fill_stat_cache_info(ce, st);
ce->ce_mode = ce_mode_from_stat(old, st->st_mode);
 
-   if (index_path(ce->oid.hash, path, st,
+   if (index_path(>oid, path, st,
   info_only ? 0 : HASH_WRITE_OBJECT)) {
free(ce);
return -1;
diff --git a/cache.h b/cache.h
index 1c69d2a..380868d 100644
--- a/cache.h
+++ b/cache.h
@@ -685,7 +685,7 @@ extern int ie_modified(const struct index_state *, const 
struct cache_entry *, s
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum 
object_type type, const char *path, unsigned flags);
-extern int index_path(unsigned char *sha1, const char *path, struct stat *st, 
unsigned flags);
+extern int index_path(struct object_id *oid, const char *path, struct stat 
*st, unsigned flags);
 
 /*
  * Record to sd the data from st that we use to check whether a file
diff --git a/diff.c b/diff.c
index 9c38258..65f8d13 100644
--- a/diff.c
+++ b/diff.c
@@ -3246,7 +3246,7 @@ static void diff_fill_oid_info(struct diff_filespec *one)
}
if (lstat(one->path, ) < 0)
die_errno("stat '%s'", one->path);
-   if (index_path(one->oid.hash, one->path, , 0))
+   if (index_path(>oid, one->path, , 0))
die("cannot hash %s", one->path);
}
}
diff --git a/notes-merge.c b/notes-merge.c
index c12b354..744c685 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -709,7 +709,7 @@ int notes_merge_commit(struct notes_merge_options *o,
/* write file as blob, and add to partial_tree */
if (stat(path.buf, ))
die_errno("Failed to stat '%s'", path.buf);
-   if (index_path(blob_oid.hash, path.buf, , HASH_WRITE_OBJECT))
+   if (index_path(_oid, path.buf, , HASH_WRITE_OBJECT))
die("Failed to write blob object from '%s'", path.buf);
if (add_note(partial_tree, _oid, _oid, NULL))
die("Failed to add resolved note '%s' to notes tree",
diff --git a/read-cache.c b/read-cache.c
index 7285608..17f19a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -689,7 +689,7 @@ int add_to_index(struct index_state *istate, const char 
*path, struct stat *st,
return 0;
}
if (!intent_only) {
-   if (index_path(ce->oid.hash, path, st, HASH_WRITE_OBJECT)) {
+   if (index_path(>oid, path, st, HASH_WRITE_OBJECT)) {
free(ce);
return error("unable to index file %s", path);
}
diff --git a/sha1_file.c b/sha1_file.c
index b60ae15..6a2a48b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3686,7 +3686,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
return ret;
 }
 
-int index_path(unsigned char *sha1, const char *path, struct stat *st, 
unsigned flags)
+int index_path(struct object_id *oid, const char *path, struct stat *st, 
unsigned flags)
 {
int fd;
struct strbuf sb = STRBUF_INIT;
@@ -3696,7 +3696,7 @@ int index_path(unsigned char *sha1, const char *path, 
struct stat *st, unsigned
fd = open(path, O_RDONLY);
if (fd < 0)
return error_errno("open(\"%s\")", path);
-   if (index_fd(sha1, fd, st, OBJ_BLOB, path, flags) < 0)
+   if (index_fd(oid->hash, fd, st, OBJ_BLOB, path, flags) < 0)
return error("%s: failed to insert into database",
 path);
break;
@@ -3704,14 +3704,14 @@ int index_path(unsigned char *sha1, const char *path, 
struct stat *st, unsigned
if (strbuf_readlink(, path, st->st_size))
return error_errno("readlink(\"%s\")", path);
if (!(flags & HASH_WRITE_OBJECT))
-   hash_sha1_file(sb.buf, sb.len, blob_type, sha1);
-   else if (write_sha1_file(sb.buf, sb.len, blob_type, sha1))
+   hash_sha1_file(sb.buf, sb.len, blob_type, oid->hash);
+   else if (write_sha1_file(sb.buf, sb.len, blob_type, oid->hash))
return error("%s: failed to insert into database",
   

[PATCH 1/6] builtin/hash-object: convert to struct object_id

2017-08-20 Thread Patryk Obara
Signed-off-by: Patryk Obara 
---
 builtin/hash-object.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index d04baf9..1c0f0f3 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -16,7 +16,7 @@
  * needs to bypass the data conversion performed by, and the type
  * limitation imposed by, index_fd() and its callees.
  */
-static int hash_literally(unsigned char *sha1, int fd, const char *type, 
unsigned flags)
+static int hash_literally(struct object_id *oid, int fd, const char *type, 
unsigned flags)
 {
struct strbuf buf = STRBUF_INIT;
int ret;
@@ -24,7 +24,7 @@ static int hash_literally(unsigned char *sha1, int fd, const 
char *type, unsigne
if (strbuf_read(, fd, 4096) < 0)
ret = -1;
else
-   ret = hash_sha1_file_literally(buf.buf, buf.len, type, sha1, 
flags);
+   ret = hash_sha1_file_literally(buf.buf, buf.len, type, 
oid->hash, flags);
strbuf_release();
return ret;
 }
@@ -33,16 +33,16 @@ static void hash_fd(int fd, const char *type, const char 
*path, unsigned flags,
int literally)
 {
struct stat st;
-   unsigned char sha1[20];
+   struct object_id oid;
 
if (fstat(fd, ) < 0 ||
(literally
-? hash_literally(sha1, fd, type, flags)
-: index_fd(sha1, fd, , type_from_string(type), path, flags)))
+? hash_literally(, fd, type, flags)
+: index_fd(oid.hash, fd, , type_from_string(type), path, 
flags)))
die((flags & HASH_WRITE_OBJECT)
? "Unable to add %s to database"
: "Unable to hash %s", path);
-   printf("%s\n", sha1_to_hex(sha1));
+   printf("%s\n", oid_to_hex());
maybe_flush_or_die(stdout, "hash to stdout");
 }
 
-- 
2.9.5



[PATCH 0/6] Convert hash-object to struct object_id

2017-08-20 Thread Patryk Obara
This enabled conversion of few functions in sha1_file, which
had almost all callers converted already.

I hope I'm not stepping on anyone's toes with this patch series.
If I do - is there some email thread or document in which I can
coordinate with other developers, regarding which code regions
are being converted to struct object_id next?

I focused on this builtin in particular because it's probably
the smallest functionality, that can be converted to different
hashing algorithm, at least partly.

Patryk Obara (6):
  builtin/hash-object: convert to struct object_id
  read-cache: convert to struct object_id
  sha1_file: convert index_path to struct object_id
  sha1_file: convert index_fd to struct object_id
  sha1_file: convert hash_sha1_file_literally to struct object_id
  sha1_file: convert index_stream to struct object_id

 builtin/difftool.c |  2 +-
 builtin/hash-object.c  | 12 ++--
 builtin/replace.c  |  2 +-
 builtin/update-index.c |  2 +-
 cache.h|  6 +++---
 diff.c |  2 +-
 notes-merge.c  |  2 +-
 read-cache.c   |  8 
 sha1_file.c| 34 +-
 9 files changed, 35 insertions(+), 35 deletions(-)

-- 
2.9.5



What does 'git blame --before -- ' supposed to mean?

2017-08-20 Thread Kaartic Sivaraam
Hello all,

I tried to do a 'git blame' on a file and wanted to see the blame
before a particular revision of that file. I initially didn't know that
you could achieve that with,

$ git blame  

I thought I need to pass the revision as a parameter so I tried (before
looking into the documentation) the command found in the subject. It
worked but to my surprise it had the same result as,

$ git blame -- 

I was confused and came to know from the documentation that blame
doesn't have any '--before' option. That was even more surprising. Why
does blame accept an option which it doesn't identify? Shouldn't it
have warned that it doesn't accept the '--before' option? I guess it
should not accept it because it confuses the user a lot as the could
make it hard time for him to identify the issue.

'git blame' doesn't seem to be the only command that accepts options
not specified in the documentation there's 'git show' for it's company,

$ git show --grep 'regex'

But the good thing with the above command is it behaves as expected. I
suspect this should be documented, anyway.

Thoughts ?

-- 
Kaartic


FREE DONATION

2017-08-20 Thread Elisabeth Mohn
Hello,
I, Elisabeth Mohn intend to give out a Portion of my Wealth as charity hoping 
it would be of help to you and others too.
Respond for confirmation.
elisabethmohnfoundat...@gmail.com


Si avvisano i destinatari che questo messaggio e' aziendale e non ha natura 
personale. Le risposte a questo indirizzo mittente potranno essere conosciute, 
per motivi aziendali, da altre persone della Fondazione IRCCS Ca Granda 
Ospedale Maggiore Policlinico autorizzate a tal fine dal Titolare.
Nota di riservatezza: il presente messaggio e' strettamente riservato al 
destinatario sopraindicato. Chiunque ricevesse questo messaggio per errore e' 
pregato di notificarlo al mittente e quindi cancellarlo. E' severamente 
proibito copiarlo, divulgarne i contenuti o usarlo per qualsivoglia scopo. 
Grazie.



Re: [PATCH] Add 'raw' blob_plain link in history overview

2017-08-20 Thread Job Snijders
bump? 

On Wed, Aug 02, 2017 at 08:59:01PM +0200, Job Snijders wrote:
> We often work with very large plain text files in our repositories and
> found it friendlier to the users if we can click directly to the raw
> version of such files.
> 
> This patch adds a 'raw' blob_plain link in history overview.
> 
> Signed-off-by: Job Snijders 
> ---
>  gitweb/gitweb.perl | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 3d4a8ee27..ad79c518e 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -5961,8 +5961,11 @@ sub git_history_body {
> href(action=>"commit", 
> hash=>$commit), $ref);
>   print "\n" .
> "" .
> -   $cgi->a({-href => href(action=>$ftype, 
> hash_base=>$commit, file_name=>$file_name)}, $ftype) . " | " .
> -   $cgi->a({-href => href(action=>"commitdiff", 
> hash=>$commit)}, "commitdiff");
> +   $cgi->a({-href => href(action=>$ftype, 
> hash_base=>$commit, file_name=>$file_name)}, $ftype) . " | ";
> + if ($ftype eq 'blob') {
> + print $cgi->a({-href => href(action=>"blob_plain", 
> hash_base=>$commit, file_name=>$file_name)}, "raw") . " | ";
> + }
> + print $cgi->a({-href => href(action=>"commitdiff", 
> hash=>$commit)}, "commitdiff");
>  
>   if ($ftype eq 'blob') {
>   my $blob_current = $file_hash;


Re: git-svn: Handling of branches created from subfolders

2017-08-20 Thread Jan Teske

> On 20. Aug 2017, at 12:27 , Andreas Heiduk  wrote:
> 
> Am 19.08.2017 um 14:45 schrieb Jan Teske:
>> Is there any way to fix such branches from subfolders in a way that they 
>> integrate correctly with the converted git repository, without losing any 
>> (or at least too much) history? If this is not possible with git-svn 
>> directly, maybe I could prepare the SVN repo or post-process the converted 
>> git repository somehow?
> 
> You can use `git replace --graft` to connect the first commit of the
> loose branches with their source. After all connections are in place you
> can use `git filter-branch` to make the replacements permanent.
> 
> This will not change the content or directory structure of branch1 or
> branch2 but the diff with their parent commits will show up as a huge
> delete/rename operation. So merging/Cherry-picking between trunk and
> branch1/branch2 will be ... challenging.

That’s really helpful, thanks!

I even solved the problem of the challenging merging between the converted 
branches and trunk by using another filter-branch to rewrite all the commits in 
branch1/branch2 to make all their modifications in the respective subfolders 
(effectively fixing the directory structure to fit trunk). This answer was 
helpful for how to do this: https://unix.stackexchange.com/a/280229

So, for future reference, the following post-processing seems to work:

1. Use filter-branch to move all the commits the the correct subfolder.
2. Use `git replace --graft` to connect the first commit of the branch to its 
correct parent commit.
3. Use `git replace --graft` to add the missing parents of any merge commits 
the branch was part of.
4. Use filter-branch again to make the replacements permanent.




Re: What's cooking in git.git (Aug 2017, #04; Fri, 18)

2017-08-20 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Aug 18, 2017 at 02:26:14PM -0700, Junio C Hamano wrote:
>
>> * jk/trailers-parse (2017-08-15) 8 commits
>>  - pretty: support normalization options for %(trailers)
>>  - t4205: refactor %(trailers) tests
>>  - pretty: move trailer formatting to trailer.c
>>  - interpret-trailers: add --parse convenience option
>>  - interpret-trailers: add an option to unfold values
>>  - interpret-trailers: add an option to show only existing trailers
>>  - interpret-trailers: add an option to show only the trailers
>>  - trailer: put process_trailers() options into a struct
>> 
>>  "git interpret-trailers" has been taught a "--parse" and a few
>>  other options to make it easier for scripts to grab existing
>>  trailer lines from a commit log message.
>> 
>>  Will merge to 'next'.
>
> I saw that this was merged and ended up with a few conflicts related to
> the other interpret-trailers series (sorry). Your resolution looks good
> to me.

Thanks for double checking.  This was an easy conflict--the only
remotely tricky part was "list.nr -> !list_empty()".

> There are a few leftover bits I think we could do on top:
>
>   - we disallow "--trailer" with "--only-input", because the former is
> quietly ignored. After the two series are merged, I think "--where",
> etc should probably get the same treatment (the behavior without it
> isn't horrible, but it's nice to warn the user that what they asked
> for is bogus).

Sounds good.  I didn't think about that case.

The implementation would have to add another variable whose sole
purpose is to keep track of "have we ever seen any of these
options?" and flip it in the callbacks that implement these options,
as there is no way to tell between WHERE_DEFAULT that is set from
the command line, from the configuration, or left as-is without
either.

>   - Martin pointed out a typo in the new documentation
>
>   - I just noticed that ref-filter understands "%(trailer)", too
> (courtesy of Jacob's original series from last year; I didn't think
> to even check for it). It probably should learn to support the
> additional options, too. I didn't look, but it probably could just
> reuse the new trailer.c formatting function I added.
>
> There's a patch for the second one below.
>
> None for the other two right now, as I'm just trying to clear out
> backlog before going offline for a few days (but I'd be happy if anybody
> wanted to take a crack at them in the meantime).

Thanks.  Have a good time (I am assuming you'd be having fun, as
opposed to having to become offline to tend to unpleasant duties).

We probably want some convention to mark the messages with these
"leftover bits that are relatively low hanging fruits", so that
archive search can find them easily.  #leftoverbits perhaps?

That way, hopefully people can look for messages with such mark that
came from those list participants with known-good taste.  That would
be far easier than a general bug tracker anybody can throw garbage
at, and requires constant attention to winnow only good ones out of
chaff.

> -- >8 --
> From: =?UTF-8?q?Martin=20=C3=85gren?= 
> Subject: [PATCH] doc/interpret-trailers: fix "the this" typo
>
> Signed-off-by: Jeff King 
> ---
> I put Martin as the author since he noticed the bug, but I think we are
> OK without a signoff for this trivial change (normally I'd have just
> squashed, but the topic is in 'next' now).
>
>  Documentation/git-interpret-trailers.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/git-interpret-trailers.txt 
> b/Documentation/git-interpret-trailers.txt
> index 1df8aabf51..2e22210734 100644
> --- a/Documentation/git-interpret-trailers.txt
> +++ b/Documentation/git-interpret-trailers.txt
> @@ -21,7 +21,7 @@ This command reads some patches or commit messages from 
> either the
>   arguments or the standard input if no  is specified. If
>  `--parse` is specified, the output consists of the parsed trailers.
>  
> -Otherwise, the this command applies the arguments passed using the
> +Otherwise, this command applies the arguments passed using the
>  `--trailer` option, if any, to the commit message part of each input
>  file. The result is emitted on the standard output.


Re: [PATCH 0/2] http: handle curl with vendor backports

2017-08-20 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Aug 11, 2017 at 03:15:06PM -0700, Junio C Hamano wrote:
>
>> "Tom G. Christensen"  writes:
>> 
>> > The curl packages provided by Red Hat for RHEL contain several
>> > backports of features from later curl releases.
>> > This causes problems with current version based checks in http.c.
>> >
>> > Here is an overview of the features that have been backported:
>> > 7.10.6 (el3) Backports CURLPROTO_*
>> > 7.12.1 (el4) Backports CURLPROTO_*
>> > 7.15.5 (el5) Backports GSSAPI_DELEGATION_*
>> >  Backports CURLPROTO_*
>> > 7.19.7 (el6) Backports GSSAPI_DELEGATION_*
>> >  Backports CURL_SSL_VERSION_TLSv1_{0,1,2}
>> > 7.29.0 (el7) Backports CURL_SSL_VERSION_TLSv1_{0,1,2}
>> >
>> > This patch series will update the current version based checks for
>> > protocol restriction and GSSAPI delegation control support to ones
>> > based on features to properly deal with the above listed backports.
>> > The fine grained TLS version support does not seem to be
>> > distinguishable via a preprocessor macro so I've left that alone.
>> 
>> Thanks; these feature macros ought to be more dependable, and I
>> think this moves things in the right direction (regardless of which
>> features we might later pick as mandatory and cut off supports for
>> older versions).
>
> Yes, I agree that these are an improvement regardless. If we follow
> through on the cut-off to 7.19.4, then the CURLPROTO ones all go away.
> But I don't mind rebasing any cut-off proposal on top of this work.

Yeah I came to a similar conclusion and was about asking if you feel
the same way that your series should be made on top of Tom's fixes.

The aspect of that series I do like the most is to base our
decisions on features, not versions, and I also wonder if we can do
similar in your "abandon too old ones" series, too.

Thanks.


Re: git add -p breaks after split on change at the top of the file

2017-08-20 Thread Junio C Hamano
Jeff King  writes:

> So that's one question I'm puzzled by: why does it work without an edit,
> but fail with one.

As this is the part of the system I long time ago declared a "works
only some of the time" hack, I do not recall the exact details, but
a key phrase "DIRTY" in the old discussion thread made me look.

Doesn't coalesce_overlapping_hunks in add-interactive.perl play a
role in the difference?  I _think_ that one tries to do the right
thing by enabling the logic before the "(e)dit" hack was introduced;
I haven't actually checked out and looked the code before the
addition of the option, but "add-p" should back then have actually
understood the offsets and hunk length and merged the surviving
hunks as needed to avoid feeding overlaps to "git apply"---it may
even have fixed the offset before doing so.  Edited hunks whose
mechanical integrity has become dubious are marked with DIRTY, as
the hunk touched by "(e)dit" codepath, by the time it reaches that
helper function, no longer has trustable information in the correct
offset information there.

> My second question is how "add -p" could generate the corrected hunk
> headers.
>
> In the non-edit case, we know that our second hunk's leading context
> must always match the first hunk's trailing context (since that's how we
> do a split). So we'd know that the preimage "-pos" field of the second
> hunk header should be offset by the net sum of the added/changed lines
> for any previous split hunks.
>
> In the original case we had net two added lines in the first hunk. So if
> it's selected, that's how our "-1" becomes "-3". And after the edit, we
> should be able to apply the same, I think? We have a single added line,
> so it becomes "-2". That doesn't require knowing what was in the hunk
> before the user edited at all. We just need to know which other hunks
> it's split from, so we know which overlapping hunks we need to take into
> account when recomputing.
>
> "add -p" certainly has that information. I _think_ git-apply also has
> that information if it bothers to find out.

I am not sure about the latter.  If we declare that we assume the
user never touches the hunk header line (i.e. "@@ ... @@") when
editing the patch text (i.e. " /-/+" lines), the receiving end can
probably guess by counting how many preimage and postimage lines are
described in the hunk and then seeing the change from the numbers on
the hunk header.  

I do not think the "--recount/--overlap" code trusts its input that
much in the current code, and there may be a room to tweak that band
aid with such an additional assumption.

But I dunno.  Once we go that route, and then if we want to assume
less about random things the end-user may do to make it more robust,
"add -p" would need to keep the hunk header before giving the hunk
to the user, and then replace the (possibly modified) hunk header in
the edited patch with the original to tell the receiving end what
the original numbers were to help --recount logic.  At that point, I
do not think it is too big a stretch for the "(e)dit" to actually
correct the hunk headers to not just express the correct offset but
also show the right number of lines, so that it does not have to be
marked with the "DIRTY" bit.  That way, coalesce_overlapping_hunks
logic can finish it off just like other "picked but unedited" hunks.

And when that happens, we may not even need to run "apply" with the
"--recount" and "--allow-overlap" options from this codepath to go
back to the state before 8cbd4310 ("git-add--interactive: replace
hunk recounting with apply --recount", 2008-07-02), where "add -p"
did not fly blind and always knew (or at least "supposed to know")
everything that is happening, which IMO is a preferrable endgame.
If the language spoken between "add -p" and "apply" is "patch", we
should try to stick to that language.  Butchering the listener to
allow this speaker to speak in loose and ambiguous terms, especially
when we can teach the speaker to do so, was a big mistake.



Re: Add --ignore-missing to git-pack-objects?

2017-08-20 Thread ch

Hi Jeff.

Thanks a lot for your response.

Jeff King wrote:


So if I understand correctly, you are only using these for the negative
side of the traversal?


Yes.

Jeff King wrote:


rev-list should ignore missing objects in such a
case even without --ignore-missing, and I think it may simply be a bug
if pack-objects is not.

Do you have a simple reproduction recipe?


Here's a small bash script to illustrate the core issue:


add_file()
{
echo "$1" > "$1"
git add "$1"
git commit -m "$1"
}

git init .
git config core.logAllRefUpdates false

add_file "test-1"
add_file "test-2"

git checkout -b feature

add_file "test-3"
add_file "test-4"

git checkout master

add_file "test-5"
add_file "test-6"

feature_tip=$(git rev-list -1 feature)

echo -e "\nDeleting branch 'feature' ($feature_tip)..."
git branch -D feature
git gc --prune=now

echo -e "\nCalling git-pack-objects with (now deleted) ^$feature_tip..."
git pack-objects --all --revs --stdout --thin --delta-base-offset --all-progress-implied 
<<< ^$feature_tip > pack

echo -e "\nCalling git-rev-list with (now deleted) ^$feature_tip..."
git rev-list --all ^$feature_tip

echo -e "\nCalling git-rev-list --ignore-missing with (now deleted) 
^$feature_tip..."
git rev-list --all --ignore-missing ^$feature_tip


Both, git-pack-objects and git-rev-list (without --ignore-missing) fail with

fatal: bad object 

on git version 2.14.1.windows.1.


Re: git fetch with refspec does not include tags?

2017-08-20 Thread Junio C Hamano
Jeff King  writes:

> On Sun, Aug 20, 2017 at 03:47:28AM -0400, Jeff King wrote:
>
>> Yeah, I don't think we want to go back to the original behavior. I agree
>> that it is partially to blame for the inconsistency that started this
>> thread, but I think on balance it has saved much more confusion than it
>> has started. And we can address that inconsistency with better tag rules
>> (like the "autofollow if we wrote any real refs" thing).
>> 
>> I don't have a patch for that yet, so if anybody feels like taking a
>> look, it would be much appreciated.
>
> Also: I don't think we've seen a patch yet for documenting the current
> auto-follow behavior.  Even if we don't make a behavior change, let's
> not forget to improve that, which should be much less work. :)

Thanks for two thoughtful follow-ups.


[PATCH v1] convert: display progress for filtered objects that have been delayed

2017-08-20 Thread Lars Schneider
In 2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) we taught the filter process protocol to delayed responses.
These responses are processed after the "Checking out files" phase.
If the processing takes noticeable time, then the user might think Git
is stuck.

Display the progress of the delayed responses to let the user know that
Git is still processing objects. This works very well for objects that
can be filtered quickly. If filtering of an individual object takes
noticeable time, then the user might still think that Git is stuck.
However, in that case the user would at least know what Git is doing.

It would be technical more correct to display "Checking out files whose
content filtering has been delayed". For brevity we only print
"Filtering content".

The finish_delayed_checkout() call was moved below the stop_progress()
call in unpack-trees.c to ensure that the "Checking out files" progress
is properly stopped before the "Filtering content" progress starts in
finish_delayed_checkout().

Signed-off-by: Lars Schneider 
Suggested-by: Taylor Blau 
---

Notes:
Base Ref: master
Web-Diff: https://github.com/larsxschneider/git/commit/8c3f433083
Checkout: git fetch https://github.com/larsxschneider/git 
filter-process/progress-v1 && git checkout 8c3f433083

 entry.c| 16 +++-
 unpack-trees.c |  2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/entry.c b/entry.c
index 65458f07a4..1d1a09f47e 100644
--- a/entry.c
+++ b/entry.c
@@ -3,6 +3,7 @@
 #include "dir.h"
 #include "streaming.h"
 #include "submodule.h"
+#include "progress.h"
 
 static void create_directories(const char *path, int path_len,
   const struct checkout *state)
@@ -161,16 +162,23 @@ static int remove_available_paths(struct string_list_item 
*item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state)
 {
int errs = 0;
+   unsigned delayed_object_count;
+   off_t filtered_bytes = 0;
struct string_list_item *filter, *path;
+   struct progress *progress;
struct delayed_checkout *dco = state->delayed_checkout;
 
if (!state->delayed_checkout)
return errs;
 
dco->state = CE_RETRY;
+   delayed_object_count = dco->paths.nr;
+   progress = start_progress_delay(
+   _("Filtering content"), delayed_object_count, 50, 1);
while (dco->filters.nr > 0) {
for_each_string_list_item(filter, >filters) {
struct string_list available_paths = 
STRING_LIST_INIT_NODUP;
+   display_progress(progress, delayed_object_count - 
dco->paths.nr);
 
if (!async_query_available_blobs(filter->string, 
_paths)) {
/* Filter reported an error */
@@ -216,11 +224,17 @@ int finish_delayed_checkout(struct checkout *state)
}
ce = index_file_exists(state->istate, 
path->string,
   strlen(path->string), 0);
-   errs |= (ce ? checkout_entry(ce, state, NULL) : 
1);
+   if (ce) {
+   errs |= checkout_entry(ce, state, NULL);
+   filtered_bytes += 
ce->ce_stat_data.sd_size;
+   display_throughput(progress, 
filtered_bytes);
+   } else
+   errs = 1;
}
}
string_list_remove_empty_items(>filters, 0);
}
+   stop_progress();
string_list_clear(>filters, 0);
 
/* At this point we should not have any delayed paths anymore. */
diff --git a/unpack-trees.c b/unpack-trees.c
index 862cfce661..90fb270d77 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -395,8 +395,8 @@ static int check_updates(struct unpack_trees_options *o)
}
}
}
-   errs |= finish_delayed_checkout();
stop_progress();
+   errs |= finish_delayed_checkout();
if (o->update)
git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
return errs != 0;

base-commit: b3622a4ee94e4916cd05e6d96e41eeb36b941182
-- 
2.14.1



Re: Please fix the useless email prompts

2017-08-20 Thread Andrew Ardill
On 20 August 2017 at 19:18, Jeff King  wrote:
> On Sat, Aug 19, 2017 at 02:02:09PM -0400, Jeffrey Walton wrote:
>
>> > Hasn't this been asked and answered already?
>> >
>> > 
>> > https://public-inbox.org/git/cacbzzx4veod-4a-ek-ubxmfrb1glsvjkxhw51whcsbczdh7...@mail.gmail.com/
>>
>> Its 2017. I'd like the tools to work for me instead of me working for the 
>> tools.
>
> Ironically, Git used to behave as you requested in 2005. After being
> bombarded with complaints about how Git was too lax in creating commits
> with bogus ident information, we changed it in 2012.

Maybe I am missing something obvious, but if that's the case then
can't we just do the identity check when trying to make new commits,
in which case you should be able to pull without setting your
identity?

Regards,

Andrew Ardill


Re: What's cooking in git.git (Aug 2017, #04; Fri, 18)

2017-08-20 Thread Martin Ågren
On 20 August 2017 at 11:40, Jeff King  wrote:
> -- >8 --
> From: =?UTF-8?q?Martin=20=C3=85gren?= 
> Subject: [PATCH] doc/interpret-trailers: fix "the this" typo
>
> Signed-off-by: Jeff King 
> ---
> I put Martin as the author since he noticed the bug, but I think we are
> OK without a signoff for this trivial change (normally I'd have just
> squashed, but the topic is in 'next' now).

FWIW, adding my sign-off would be ok with me. Also, obviously, I'd be
just as ok if this patch was not "From" me at all.

>  Documentation/git-interpret-trailers.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/git-interpret-trailers.txt 
> b/Documentation/git-interpret-trailers.txt
> index 1df8aabf51..2e22210734 100644
> --- a/Documentation/git-interpret-trailers.txt
> +++ b/Documentation/git-interpret-trailers.txt
> @@ -21,7 +21,7 @@ This command reads some patches or commit messages from 
> either the
>   arguments or the standard input if no  is specified. If
>  `--parse` is specified, the output consists of the parsed trailers.
>
> -Otherwise, the this command applies the arguments passed using the
> +Otherwise, this command applies the arguments passed using the
>  `--trailer` option, if any, to the commit message part of each input
>  file. The result is emitted on the standard output.


Re: [PATCH/RFC 0/5] Some ThreadSanitizer-results

2017-08-20 Thread Martin Ågren
On 20 August 2017 at 12:06, Jeff King  wrote:
> On Tue, Aug 15, 2017 at 02:53:00PM +0200, Martin Ågren wrote:
>
>> I tried running the test suite on Git compiled with ThreadSanitizer
>> (SANITIZE=thread). Maybe this series could be useful for someone else
>> trying to do the same. I needed the first patch to avoid warnings when
>> compiling, although it actually has nothing to do with threads.
>>
>> The last four patches are about avoiding some issues where
>> ThreadSanitizer complains for reasonable reasons, but which to the best
>> of my understanding are not real problems. These patches could be useful
>> to make "actual" problems stand out more. Of course, if no-one ever runs
>> ThreadSanitizer, they are of little to no (or even negative) value...
>
> I think it's a chicken-and-egg. I'd love to run the test suite with tsan
> from time to time, but there's no point if it turns up a bunch of false
> positives.
>
> The general direction here looks good to me (and I agree with the
> comments made so far, especially that we should stop writing to
> strbuf_slopbuf entirely).
>
>>   ThreadSanitizer: add suppressions
>
> This one is the most interesting because it really is just papering over
> the issues. I "solved" the transfer_debug one with actual code in:
>
>   
> https://public-inbox.org/git/20170710133040.yom65mjol3nmf...@sigill.intra.peff.net/

Hmm, I did search for related posts, but obviously not good enough.
Sorry that I missed this.

> but it just feels really dirty. I'd be inclined to go with suppressions
> for now until somebody can demonstrate or argue for an actual breakage
> (just because it makes the tool more useful for finding _real_
> problems).

I actually had a more intrusive version of my patch, but which I didn't
send, where I extracted transport_debug_enabled() exactly like that,
except I did it in order to minimize the scope of the suppression. In
the end, I figured the scope was already small enough.

The obvious risk of introducing any kind of "temporary" suppression is
that it remains forever... :-) I think I'll add a couple of NEEDSWORK in
the code to make it a bit more likely that someone stumbles over the
problem and addresses it.

Thanks.


Re: git-svn: Handling of branches created from subfolders

2017-08-20 Thread Andreas Heiduk
Am 19.08.2017 um 14:45 schrieb Jan Teske:
> Is there any way to fix such branches from subfolders in a way that they 
> integrate correctly with the converted git repository, without losing any (or 
> at least too much) history? If this is not possible with git-svn directly, 
> maybe I could prepare the SVN repo or post-process the converted git 
> repository somehow?

You can use `git replace --graft` to connect the first commit of the
loose branches with their source. After all connections are in place you
can use `git filter-branch` to make the replacements permanent.

This will not change the content or directory structure of branch1 or
branch2 but the diff with their parent commits will show up as a huge
delete/rename operation. So merging/Cherry-picking between trunk and
branch1/branch2 will be ... challenging.


Re: Please fix the useless email prompts

2017-08-20 Thread Kaartic Sivaraam
On Sun, 2017-08-20 at 05:18 -0400, Jeff King wrote:
> Ironically, Git used to behave as you requested in 2005. After being
> bombarded with complaints about how Git was too lax in creating commits
> with bogus ident information, we changed it in 2012. So I don't think
> "it's 2017" carries any weight as an argument.
> 

I would like to go with the "don't create commits with bogus
ident". "bogus idents" seem to go against the notion of a "content
tracker" which should *help* identifying the user who was the reason
behind the change. It shoudln't encourage users to create "anonymous
changes". Thus an "anonymous user" seems to completely obscure the
meaning of a "content tracker", at least in my opinion

Moreover, there's no tool that can satisfy *all* of it's users because
Humans are so diverse in their thoughts and behaviour.


-- 
Kaartic


Re: Add --ignore-missing to git-pack-objects?

2017-08-20 Thread Jeff King
On Tue, Aug 15, 2017 at 12:51:01AM +0200, ch wrote:

> Is it possible to add an option akin to git-rev-list's '--ignore-missing' to
> git-pack-objects?
> 
> I use git bundles to (incrementally) backup my repositories. My script 
> inspects
> all bundles in the backup and passes their contained refs as excludes to
> git-pack-objects to build the pack for the new bundle. This works fine as long
> as none of these refs have been garbage-collected in the source repository.
> Something like '--ignore-missing' would be really handy here to ask
> git-pack-objects to simply ignore any of the passed revs that are not present
> (anymore).

So if I understand correctly, you are only using these for the negative
side of the traversal? rev-list should ignore missing objects in such a
case even without --ignore-missing, and I think it may simply be a bug
if pack-objects is not.

Do you have a simple reproduction recipe?

-Peff


Re: [PATCH/RFC 0/5] Some ThreadSanitizer-results

2017-08-20 Thread Jeff King
On Tue, Aug 15, 2017 at 02:53:00PM +0200, Martin Ågren wrote:

> I tried running the test suite on Git compiled with ThreadSanitizer
> (SANITIZE=thread). Maybe this series could be useful for someone else
> trying to do the same. I needed the first patch to avoid warnings when
> compiling, although it actually has nothing to do with threads.
> 
> The last four patches are about avoiding some issues where
> ThreadSanitizer complains for reasonable reasons, but which to the best
> of my understanding are not real problems. These patches could be useful
> to make "actual" problems stand out more. Of course, if no-one ever runs
> ThreadSanitizer, they are of little to no (or even negative) value...

I think it's a chicken-and-egg. I'd love to run the test suite with tsan
from time to time, but there's no point if it turns up a bunch of false
positives.

The general direction here looks good to me (and I agree with the
comments made so far, especially that we should stop writing to
strbuf_slopbuf entirely).

>   ThreadSanitizer: add suppressions

This one is the most interesting because it really is just papering over
the issues. I "solved" the transfer_debug one with actual code in:

  
https://public-inbox.org/git/20170710133040.yom65mjol3nmf...@sigill.intra.peff.net/

but it just feels really dirty. I'd be inclined to go with suppressions
for now until somebody can demonstrate or argue for an actual breakage
(just because it makes the tool more useful for finding _real_
problems).

-Peff


Re: Git *accepts* a branch name, it can't identity in the future?

2017-08-20 Thread Kaartic Sivaraam
Thanks, but Johannes has already found the issue and given a solution.
Regardless, replying to the questions just for the note.

On Sun, 2017-08-20 at 04:33 -0400, Jeff King wrote:
> What does "git for-each-ref" say about which branches you _do_ have?
> 
> Also, what platform are you on?
> 

I use a "Debian GNU/Linux buster/sid 64-bit"

> I'm wondering specifically if you have a filesystem (like HFS+ on MacOS)
> that silently rewrites invalid unicode in filenames we create. That
> would mean your branches are still there, but probably with some funny
> filename like "done/%xxdoc-fix". Git wouldn't know that name because the
> filesystem rewriting happened behinds its back (though I'd think that a
> further open() call would find the same file, so maybe this is barking
> up the wrong tree).
> 

That sounds dangerous!


> Another line of thinking: are you sure the � you are writing on the
> command line is identical to the one generated by the corruption (and if
> you cut and paste, is perhaps a generic glyph placed in the buffer by
> your terminal to replace an invalid codepoint, rather than the actual
> bytes)?
> 

This was the issue. I wasn't providing git with the actual bytes that
resulted as a consequence of the sloppy script.


>   [you didn't say how your script works, so let's use git to rename]

I know of no other way to rename a branch, so I didn't mention it :)


>   $ broken=$(printf '\223')
> 
>   [and we can rename it using that knowledge]
>   $ git branch ${broken}doc-fix doc-fix
> 

Johannes has already given a solution, this one works too.


-- 
Kaartic


[PATCH] doc: fix typo in sendemail.identity

2017-08-20 Thread Jeff King
On Sun, Aug 20, 2017 at 05:40:09AM -0400, Jeff King wrote:

> Subject: [PATCH] doc/interpret-trailers: fix "the this" typo

Looks like this is a not-uncommon error. Here's another one (not part of
the same patch since the interpret-trailers one isn't even in 'master'
yet).

-- >8 --
Subject: [PATCH] doc: fix typo in sendemail.identity

Saying "the this" is an obvious typo. But while we're here,
let's polish the English on the second half of the sentence,
too.

Signed-off-by: Jeff King 
---
 Documentation/config.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a6a589a735..012919938f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2895,8 +2895,8 @@ sendemail.smtpsslcertpath::
 
 sendemail..*::
Identity-specific versions of the 'sendemail.*' parameters
-   found below, taking precedence over those when the this
-   identity is selected, through command-line or
+   found below, taking precedence over those when this
+   identity is selected, through either the command-line or
`sendemail.identity`.
 
 sendemail.aliasesFile::
-- 
2.14.1.495.ged40d10435



Re: What's cooking in git.git (Aug 2017, #04; Fri, 18)

2017-08-20 Thread Jeff King
On Fri, Aug 18, 2017 at 02:26:14PM -0700, Junio C Hamano wrote:

> * jk/trailers-parse (2017-08-15) 8 commits
>  - pretty: support normalization options for %(trailers)
>  - t4205: refactor %(trailers) tests
>  - pretty: move trailer formatting to trailer.c
>  - interpret-trailers: add --parse convenience option
>  - interpret-trailers: add an option to unfold values
>  - interpret-trailers: add an option to show only existing trailers
>  - interpret-trailers: add an option to show only the trailers
>  - trailer: put process_trailers() options into a struct
> 
>  "git interpret-trailers" has been taught a "--parse" and a few
>  other options to make it easier for scripts to grab existing
>  trailer lines from a commit log message.
> 
>  Will merge to 'next'.

I saw that this was merged and ended up with a few conflicts related to
the other interpret-trailers series (sorry). Your resolution looks good
to me.

There are a few leftover bits I think we could do on top:

  - we disallow "--trailer" with "--only-input", because the former is
quietly ignored. After the two series are merged, I think "--where",
etc should probably get the same treatment (the behavior without it
isn't horrible, but it's nice to warn the user that what they asked
for is bogus).

  - Martin pointed out a typo in the new documentation

  - I just noticed that ref-filter understands "%(trailer)", too
(courtesy of Jacob's original series from last year; I didn't think
to even check for it). It probably should learn to support the
additional options, too. I didn't look, but it probably could just
reuse the new trailer.c formatting function I added.

There's a patch for the second one below.

None for the other two right now, as I'm just trying to clear out
backlog before going offline for a few days (but I'd be happy if anybody
wanted to take a crack at them in the meantime).

-Peff

-- >8 --
From: =?UTF-8?q?Martin=20=C3=85gren?= 
Subject: [PATCH] doc/interpret-trailers: fix "the this" typo

Signed-off-by: Jeff King 
---
I put Martin as the author since he noticed the bug, but I think we are
OK without a signoff for this trivial change (normally I'd have just
squashed, but the topic is in 'next' now).

 Documentation/git-interpret-trailers.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-interpret-trailers.txt 
b/Documentation/git-interpret-trailers.txt
index 1df8aabf51..2e22210734 100644
--- a/Documentation/git-interpret-trailers.txt
+++ b/Documentation/git-interpret-trailers.txt
@@ -21,7 +21,7 @@ This command reads some patches or commit messages from 
either the
  arguments or the standard input if no  is specified. If
 `--parse` is specified, the output consists of the parsed trailers.
 
-Otherwise, the this command applies the arguments passed using the
+Otherwise, this command applies the arguments passed using the
 `--trailer` option, if any, to the commit message part of each input
 file. The result is emitted on the standard output.
 
-- 
2.14.1.495.ged40d10435



Re: What's cooking in git.git (Aug 2017, #04; Fri, 18)

2017-08-20 Thread Jeff King
On Fri, Aug 18, 2017 at 02:26:14PM -0700, Junio C Hamano wrote:

> * jk/drop-ancient-curl (2017-08-09) 5 commits
>  - http: #error on too-old curl
>  - curl: remove ifdef'd code never used with curl >=7.19.4
>  - http: drop support for curl < 7.19.4
>  - http: drop support for curl < 7.16.0
>  - http: drop support for curl < 7.11.1
> 
>  Some code in http.c that has bitrot is being removed.
> 
>  What is the status of the discussion around this area

This definitely should not be merged to 'next'; there were a few minor
issues people pointed out. I haven't re-rolled because I was trying to
figure out if we were interested in moving forward with this or not.

I'm still inclined in that direction, though I agree the information
from Tom (and his series) does weaken my argument somewhat.

> * tc/curl-with-backports (2017-08-11) 2 commits
>  - http: use a feature check to enable GSSAPI delegation control
>  - http: fix handling of missing CURLPROTO_*
> 
>  Updates to the HTTP layer we made recently unconditionally used
>  features of libCurl without checking the existence of them, causing
>  compilation errors, which has been fixed.  Also migrate the code to
>  check feature macros, not version numbers, to cope better with
>  libCurl that vendor ships with backported features.
> 
>  What is the doneness of this topic

These look OK to me, and I just dropped a few comments in the thread.

-Peff


Re: Please fix the useless email prompts

2017-08-20 Thread Jeff King
On Sat, Aug 19, 2017 at 02:02:09PM -0400, Jeffrey Walton wrote:

> > Hasn't this been asked and answered already?
> >
> > 
> > https://public-inbox.org/git/cacbzzx4veod-4a-ek-ubxmfrb1glsvjkxhw51whcsbczdh7...@mail.gmail.com/
> 
> Its 2017. I'd like the tools to work for me instead of me working for the 
> tools.

Ironically, Git used to behave as you requested in 2005. After being
bombarded with complaints about how Git was too lax in creating commits
with bogus ident information, we changed it in 2012. So I don't think
"it's 2017" carries any weight as an argument.

There are numerous discussions in the archive on this. But my
recollection is that the conclusion is:

  - stricter ident checking on balance saves more headaches than it
causes, just in terms of numbers

  - the headaches solved by stricter checking are hard to fix (because
you unknowingly bake cruft into commit objects, and fixing that
requires a history rewrite)

  - the headaches caused by stricter checking are easy to fix (they're
obvious when they happen, and "git -c" lets you provide a dummy
value to override).

We can revisit any of those conclusions, but I'd want to see a
thoughtful discussion of the tradeoffs raised in past rounds, not just
"the current behavior is inconvenient for me".

-Peff


Re: Git *accepts* a branch name, it can't identity in the future?

2017-08-20 Thread Kaartic Sivaraam
On Sun, 2017-08-20 at 10:20 +0200, Johannes Sixt wrote:
> It is not Git's fault that your terminal converts an invalid UTF-8 
> sequence (that your script produces) to �. Nor is it when you paste that 
> character onto the command line, that it is passed as a (correct) UTF-8 
> character.
> 

You're right. I just now realise how I missed the line between "what's
seen by us" and "what's seen by the program".


> Perhaps this helps (untested):
> 
> $ git branch -m done/$(printf '\x93')doc-fix done/dic-fix
> 

This one helped, thanks.

-- 
Kaartic


Re: [PATCH 4/4] archive: queue directories for all types of pathspecs

2017-08-20 Thread Jeff King
On Sat, Aug 19, 2017 at 06:53:26PM +0200, René Scharfe wrote:

> Am 19.08.2017 um 07:33 schrieb René Scharfe:
> > When read_tree_recursive() encounters a directory excluded by a pathspec
> > then it enters it anyway because it might contain included entries.  It
> > calls the callback function before it is able to decide if the directory
> > is actually needed.
> > 
> > For that reason git archive adds directories to a queue and writes
> > entries for them only when it encounters the first child item -- but
> > only if pathspecs with wildcards are used.  Do the same for literal
> > pathspecs as well, as the reasoning above applies to them, too.  This
> > prevents git archive from writing entries for excluded directories.
> 
> This breaks the test "archive empty subtree with no pathspec" in t5004 by
> omitting the empty directory from the archive.  Sorry for missing that!
> 
> This is kind of a bonus patch, so please discard it for now; the first
> three are OK IMHO.
> 
> A better version of this patch would at least update t5004 as well, but
> there might be a better way.

I actually think it would be reasonable to omit the empty directory in
t5004. The main thing we care about there is that we produce an archive
with no files rather than barfing. Checking that the empty directory is
actually there was mostly "this is what it happens to produce" rather
than any conscious decision, I think.

If the new rule is "we omit empty directories", then it would make sense
for us to follow that, regardless of whether it happened by pathspec
limiting or if the tree was empty in the first place (and such an
empty tree is insane anyway; Git tries hard not to create them).

-Peff


Re: [PATCH 1/2] http: Fix handling of missing CURLPROTO_*

2017-08-20 Thread Jeff King
On Fri, Aug 11, 2017 at 05:30:34PM -0700, Junio C Hamano wrote:

> > -#if LIBCURL_VERSION_NUM >= 0x071304
> > +#ifdef CURLPROTO_HTTP
> > curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
> >  get_curl_allowed_protocols(0));
> > curl_easy_setopt(result, CURLOPT_PROTOCOLS,
> 
> This may make the code to _compile_, but is it sensible to let the
> code build and be used by the end users without the "these protocols
> are safe" filter, I wonder?  
> 
> Granted, ancient code was unsafe and people were happily using it,
> but now we know better, and more importantly, we have since added
> users of transport (e.g. blindly fetch submodules recursively) that
> may _rely_ on this layer of the code safely filtering unsafe
> protocols, so...

I don't think Tom's patch changes this in any meaningful way. The
"fallback to skipping the safety and showing a warning" dates back to
the original introduction of the feature.

But FWIW, that is exactly the kind of thing that led me to wanting to
implement a hard cutoff in the first place. The warning is small
consolation if Git allows an attack through anyway. Or worse, you don't
even see the warning because it's an automated process that is being
exploited.

There's a good chance if you have such an antique curl that it is also
riddled with other curl-specific bugs that have since been fixed. And
that would argue that we don't need to care that much anyway; people
running old curl have decided that it's not worth caring about the
security implications.

But in the case of RHEL, in theory they are patching security bugs in
curl but just not implementing new features. So if we have a
vulnerability introduced by using an old version of curl, we really are
making things worse. And that argues for having a hard cutoff.

But as Tom's series demonstrates, they are backporting _some_ features
(presumably ones needed by other programs like Git to fix security
bugs). Which argues for having #ifdefs that handle those backports,
which in theory gives us a secure Git on systems that do careful
backporting, and gives us an insecure-but-not-worse-than-it-already-was
Git on systems that don't do backporting.

I dunno. There were a lot of assumptions and mental gymnastics there.
I'm still tempted to target curl >= 7.19.4 just based on timing and
RHEL5's support life-cycle.

-Peff


Re: [PATCH 0/2] http: handle curl with vendor backports

2017-08-20 Thread Jeff King
On Fri, Aug 11, 2017 at 03:15:06PM -0700, Junio C Hamano wrote:

> "Tom G. Christensen"  writes:
> 
> > The curl packages provided by Red Hat for RHEL contain several
> > backports of features from later curl releases.
> > This causes problems with current version based checks in http.c.
> >
> > Here is an overview of the features that have been backported:
> > 7.10.6 (el3) Backports CURLPROTO_*
> > 7.12.1 (el4) Backports CURLPROTO_*
> > 7.15.5 (el5) Backports GSSAPI_DELEGATION_*
> >  Backports CURLPROTO_*
> > 7.19.7 (el6) Backports GSSAPI_DELEGATION_*
> >  Backports CURL_SSL_VERSION_TLSv1_{0,1,2}
> > 7.29.0 (el7) Backports CURL_SSL_VERSION_TLSv1_{0,1,2}
> >
> > This patch series will update the current version based checks for
> > protocol restriction and GSSAPI delegation control support to ones
> > based on features to properly deal with the above listed backports.
> > The fine grained TLS version support does not seem to be
> > distinguishable via a preprocessor macro so I've left that alone.
> 
> Thanks; these feature macros ought to be more dependable, and I
> think this moves things in the right direction (regardless of which
> features we might later pick as mandatory and cut off supports for
> older versions).

Yes, I agree that these are an improvement regardless. If we follow
through on the cut-off to 7.19.4, then the CURLPROTO ones all go away.
But I don't mind rebasing any cut-off proposal on top of this work.

-Peff


Re: Git *accepts* a branch name, it can't identity in the future?

2017-08-20 Thread Jeff King
On Sun, Aug 20, 2017 at 01:21:29PM +0530, Kaartic Sivaraam wrote:

> I made a small assumption in the script which turned out to be false. I
> thought the unicode prefixes I used corresponded to only two bytes.
> This lead to the issue. The unicode character '✓' corresponds to three
> characters and as a result instead of removing it, my script replaced
> it with the unknown character '�'. So, the branch named '✓doc-fix'
> became 'done/�doc-fix'. Here's the issue. I couldn't use 
> 
> $ git branch -m done/�doc-fix done/dic-fix 
> 
> to rename the branch. Nor could I refer to it in anyway. Git simply
> says,
> 
> error: pathspec 'done/�doc-fix' did not match any file(s) known to git.

What does "git for-each-ref" say about which branches you _do_ have?

Also, what platform are you on?

I'm wondering specifically if you have a filesystem (like HFS+ on MacOS)
that silently rewrites invalid unicode in filenames we create. That
would mean your branches are still there, but probably with some funny
filename like "done/%xxdoc-fix". Git wouldn't know that name because the
filesystem rewriting happened behinds its back (though I'd think that a
further open() call would find the same file, so maybe this is barking
up the wrong tree).

Another line of thinking: are you sure the � you are writing on the
command line is identical to the one generated by the corruption (and if
you cut and paste, is perhaps a generic glyph placed in the buffer by
your terminal to replace an invalid codepoint, rather than the actual
bytes)?

> I just wanted to know why git accepted a branch name which it can't
> identify later?
> 
> If it had rejected that name in the first place it would have been
> better. In case you would like to know how I got that weird name,
> here's a way to get that
> 
> $ echo '✓doc-fix' | cut -c3-100

  [a few defines to make it easy to prod git]
  $ check=$(printf '\342\234\223')
  $ broken=$(printf '\223')

  [this is your starting state, a branch with the unicode name]
  $ git branch ${check}doc-fix

  [you didn't say how your script works, so let's use git to rename]
  $ git branch -m ${check}doc-fix ${broken}doc-fix

  [my terminal doesn't show the unknown-character glyph, but we
   can see the funny character with "cat -A"]:
  $ git for-each-ref --format='%(refname)' | cat -A
  refs/heads/master$
  refs/heads/M-^Sdoc-fix$

  [and we can rename it using that knowledge]
  $ git branch ${broken}doc-fix doc-fix

-Peff


Re: Git *accepts* a branch name, it can't identity in the future?

2017-08-20 Thread Johannes Sixt

Am 20.08.2017 um 09:51 schrieb Kaartic Sivaraam:

I made a small assumption in the script which turned out to be false. I
thought the unicode prefixes I used corresponded to only two bytes.
This lead to the issue. The unicode character '✓' corresponds to three
characters and as a result instead of removing it, my script replaced
it with the unknown character '�'. So, the branch named '✓doc-fix'
became 'done/�doc-fix'. Here's the issue. I couldn't use

 $ git branch -m done/�doc-fix done/dic-fix

to rename the branch. Nor could I refer to it in anyway. Git simply
says,

 error: pathspec 'done/�doc-fix' did not match any file(s) known to git.

It's not a big issue as I haven't lost anything out of it. The branches
have been merged into 'master'.

I just wanted to know why git accepted a branch name which it can't
identify later?

If it had rejected that name in the first place it would have been
better. In case you would like to know how I got that weird name,
here's a way to get that

 $ echo '✓doc-fix' | cut -c3-100



See, these two are different:

$ echo '✓doc-fix' | cut -c3-100 | od -t x1
000 93 64 6f 63 2d 66 69 78 0a
011
$ echo '�doc-fix' | od -t x1
000 64 6f bd 64 6f 63 2d 66 69 78 0a
013

It is not Git's fault that your terminal converts an invalid UTF-8 
sequence (that your script produces) to �. Nor is it when you paste that 
character onto the command line, that it is passed as a (correct) UTF-8 
character.


Perhaps this helps (untested):

$ git branch -m done/$(printf '\x93')doc-fix done/dic-fix

In Git's database, branch names are just sequences of bytes. It is 
outside the scope to verify that all input is encoded correctly.


-- Hannes


Re: git add -p breaks after split on change at the top of the file

2017-08-20 Thread Jeff King
On Thu, Aug 17, 2017 at 12:22:19PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Of course this is a pretty ridiculous input in the first place. In
> > theory it _could_ be figured out, but overlapping hunks like this are
> > always going to cause problems (in this case, context from the second
> > hunk became a removal, and the second hunk no longer applies).
> 
> To be honest, I do not think it could be figured out by "git apply",
> but "git add -p" _should_ be able to.  I've said already this
> earlier in the discussion:
> 
>   https://public-inbox.org/git/7vab5cn7wr@alter.siamese.dyndns.org/
> 
> "add -p" knows how the hunk _before_ it gave the user to edit looked
> like, and it knows how the hunk after the user edited looks like, so
> it may have enough information to figure it out.  But the "(e)dit"
> feature was done in a way to discard that information and forced an
> impossible "guess what the correct shape of the patch would be, out
> of this broken patch input" with --recount and --allow-overlap.
> 
> If we want to make "add -p/(e)dit" work in the corner cases and make
> it trustworthy, "apply" is a wrong tree to back at.  A major part of
> the effort to fix would go to "add -p", not to "apply".

In theory "add -p" could give all the information it has to git-apply in
some form, and it could figure it out. That may sound crazy, but I just
wonder if it makes things easier because git-apply already knows quite a
lot about how to parse and interpret diffs. Whereas "add -p" is largely
stupid and just parroting back lines.

But I certainly wouldn't make such a decision until figuring out the
actual algorithm, which should then make it obvious where is the best
place to implement it. :)

So just thinking about this particular situation. We have two hunks
after the split:

  @@ -1, +1,3 @@
  +a
  +b
   x

and

  @@ -1, +3,2 @@
   x
  +c

The root of the issue, I think, is that the hunk header for the second
one is bogus. It claims to start at the beginning, but of course if the
other hunk is applied first, it doesn't. The right header in that case
would be (assuming we do not copy the extra context):

  @@ -3, +3,2 @@
   x
  +c

One point of interest is that without an (e)dit, we get this case right,
despite the bogus hunk header. I'm not quite sure why that is. After the
failing edit we would end up with:

  @@ -1, +1,3 @@
  +b
   x

  @@ -1, +3,2 @@
   x
  +c

Clearly the count in the first one is now wrong, but we fix that with
--recount. But it seems to me that the second hunk is equally wrong as
it would be in the non-edit case. I guess in addition to the "-1" being
a "-2", the "+3" should now also be a "+2". But I didn't think that
would impact us finding the preimage side.

So that's one question I'm puzzled by: why does it work without an edit,
but fail with one.

My second question is how "add -p" could generate the corrected hunk
headers.

In the non-edit case, we know that our second hunk's leading context
must always match the first hunk's trailing context (since that's how we
do a split). So we'd know that the preimage "-pos" field of the second
hunk header should be offset by the net sum of the added/changed lines
for any previous split hunks.

In the original case we had net two added lines in the first hunk. So if
it's selected, that's how our "-1" becomes "-3". And after the edit, we
should be able to apply the same, I think? We have a single added line,
so it becomes "-2". That doesn't require knowing what was in the hunk
before the user edited at all. We just need to know which other hunks
it's split from, so we know which overlapping hunks we need to take into
account when recomputing.

"add -p" certainly has that information. I _think_ git-apply also has
that information if it bothers to find out. Right now --allow-overlap is
just "do not complain of overlap". But it could actively recount the
offsets when it detects overlap.

I'm assuming here that the edit process isn't changing the hunk headers,
introducing new hunks, etc. I feel like all bets are off, then.

-Peff


Re: git fetch with refspec does not include tags?

2017-08-20 Thread Jeff King
On Sun, Aug 20, 2017 at 03:47:28AM -0400, Jeff King wrote:

> Yeah, I don't think we want to go back to the original behavior. I agree
> that it is partially to blame for the inconsistency that started this
> thread, but I think on balance it has saved much more confusion than it
> has started. And we can address that inconsistency with better tag rules
> (like the "autofollow if we wrote any real refs" thing).
> 
> I don't have a patch for that yet, so if anybody feels like taking a
> look, it would be much appreciated.

Also: I don't think we've seen a patch yet for documenting the current
auto-follow behavior.  Even if we don't make a behavior change, let's
not forget to improve that, which should be much less work. :)

-Peff


Git *accepts* a branch name, it can't identity in the future?

2017-08-20 Thread Kaartic Sivaraam
Hello all,

First of all, I would like to tell that this happened completely by
accident and it's partly my mistake. Here's what happened.

I recently started creating 'feature branches' a lot for the few
patches that I sent to this mailing list. To identify the status of the
patch corresponding to that branch I prefixed them with special unicode
characters like ✓, ˅ etc. instead of using conventional hierarchical
names like, 'done/', 'archived/'.

Then I started finding it difficult to distinguish these unicode-
prefixed names probably because they had only one unicode character in
common. So, I thought of switching to the conventional way of using
scoped branch names (old is gold, you see). I wrote a tiny script to
rename the branches by replacing a specific unicode prefix with a
corresponding hierachy. For example, the script would convert a branch
named '✓doc-fix' to 'done/doc-fix'.

I made a small assumption in the script which turned out to be false. I
thought the unicode prefixes I used corresponded to only two bytes.
This lead to the issue. The unicode character '✓' corresponds to three
characters and as a result instead of removing it, my script replaced
it with the unknown character '�'. So, the branch named '✓doc-fix'
became 'done/�doc-fix'. Here's the issue. I couldn't use 

$ git branch -m done/�doc-fix done/dic-fix 

to rename the branch. Nor could I refer to it in anyway. Git simply
says,

error: pathspec 'done/�doc-fix' did not match any file(s) known to git.

It's not a big issue as I haven't lost anything out of it. The branches
have been merged into 'master'.

I just wanted to know why git accepted a branch name which it can't
identify later?

If it had rejected that name in the first place it would have been
better. In case you would like to know how I got that weird name,
here's a way to get that

$ echo '✓doc-fix' | cut -c3-100

-- 
Kaartic


Re: git fetch with refspec does not include tags?

2017-08-20 Thread Jeff King
On Thu, Aug 17, 2017 at 10:43:12PM +0200, Kevin Daudt wrote:

> > No, you are not misunderstanding anything.  The "pretend that we
> > immediately turned around and fetched" done by "git push" was
> > already breaking the predictability, but the change in 1.8.4 made it
> > even worse.  I am saying that going back to the old behaviour may be
> > one option to address the issue being discussed in this thread.
> 
> Ok. The reason I'm bring this up is because we often had to tell users
> in the irc channel that git fetch   did not update the
> remote tracking branches, which confused them, so reverting back might
> reintroduce this confusion again.

Yeah, I don't think we want to go back to the original behavior. I agree
that it is partially to blame for the inconsistency that started this
thread, but I think on balance it has saved much more confusion than it
has started. And we can address that inconsistency with better tag rules
(like the "autofollow if we wrote any real refs" thing).

I don't have a patch for that yet, so if anybody feels like taking a
look, it would be much appreciated.

-Peff


Re: [PATCH] progress: simplify "delayed" progress API

2017-08-20 Thread Jeff King
On Sat, Aug 19, 2017 at 10:39:41AM -0700, Junio C Hamano wrote:

> We used to expose the full power of the delayed progress API to the
> callers, so that they can specify, not just the message to show and
> expected total amount of work that is used to compute the percentage
> of work performed so far, the percent-threshold parameter P and the
> delay-seconds parameter N.  The progress meter starts to show at N
> seconds into the operation only if the amount of work completed
> exceeds P.
> 
> Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there
> are oddballs that chose more random-looking values like 95%.
> 
> For a smoother workload, (50%, 1s) would allow us to start showing
> the progress meter earlier than (0%, 2s), while keeping the chance
> of not showing progress meter for long running operation the same as
> the latter.  For a task that would take 2s or more to complete, it
> is likely that less than half of it would complete within the first
> second, if the workload is smooth.  But for a spiky workload whose
> earlier part is easier, such a setting is likely to fail to show the
> progress meter entirely and (0%, 2s) is more appropriate.
> 
> But that is merely a theory.  Realistically, it is of dubious value
> to ask each codepath to carefully consider smoothness of their
> workload and specify their own setting by passing two extra
> parameters.  Let's simplify the API by dropping both parameters and
> have everybody use (0%, 2s).

Nicely explained (modulo the reversal you noticed in the first
paragraph). The patch looks good to me.

> Oh, by the way, the percent-threshold parameter and the structure
> member were consistently misspelled, which also is now fixed ;-)

Heh, that was bugging me, too. :)

>  * So before I forget all about this topic, here is a patch to
>actually do this.
> 
>> So I'd vote to drop that parameter entirely. And if 1 second seems
>> noticeably snappier, then we should probably just move everything to a 1
>> second delay (I don't have a strong feeling either way).
> 
>I was also tempted to do this, but decided to keep it closer to
>the original for majority of callers by leaving the delay at 2s.
>With this patch, such a change as a follow-up is made a lot
>easier (somebody may want to even make a configuration out of it,
>but that would not be me).

Yeah, I agree that it can come later on top (if at all).

Thanks for finishing this off.

-Peff


Re: [PATCH v3 00/23] Move exported packfile funcs to its own file

2017-08-20 Thread Junio C Hamano
Junio C Hamano  writes:

> I have to say that this was a painful topic to integrate.
>
> As you may know, the mk/use-size-t-in-zlib topic is being retracted
> and getting rerolled as a larger size_t series, most of which still
> needs help in reviewing.
>
> The jt/sha1-file-cleanup topic is the only one among the other four
> that are still not in 'next', and I think that topic, as well as the
> other three, are all good and serve as a good base to build on top.
> So I first rebuilt your patches on top of these four topics.  This
> took some time but it wasn't all that painful.

... but it turns out that I screwed it up in at least one place,
making Linux32 build fail (Thanks Lars and folks who pushed hard to
arrange Travis to build all my pushes to 'pu').  I'm pushing out my
second attempt.  Let's see how it goes.

A change like this that only moves code around without changing
anything is painful to everybody to keep around, as nobody can
safely touch the affected code while it is in flight.  On the other
hand, as long as it is only moving code around, such a change is
reasonably safe, and it is relatively easy to ensure that there is
no change other than code movement is involved.  So let's 

 (1) make sure that the topics this depends on are sound by
 re-reading them once again, and merge them quickly down to
 'master';

 (2) merge this topic to 'next', optionally after rebasing it on
 'master', after (1) happens; and

 (3) quickly merge it to 'master', to get it over with.

In the meantime we'd need to refrain from taking code that touch
things that are moved by this series.

I plan to be offline for a week or so near the end of this month, so
I am hoping that we can do all of the above before that. That may
make us break our usual "tip of the 'master' is more stable and
robust than any released version" promise by potentially leaving it
broken for a while, but nobody can build on top of a fluid codebase
that is in the process of moving things around in a big way, so it
might not be such a bad idea to make it coincide with the period
when the tree must become quiescent due to my being offline.  We'll
see.

Thanks.