date:20160825

Re: [PATCH v1 0/3] Update eol documentation

2016-08-25 Thread Jacob Keller

On Thu, Aug 25, 2016 at 1:31 PM, Junio C Hamano  wrote:
> tbo...@web.de writes:
>
>> From: Torsten Bögershausen 
>>
>> Sorry for posting this so late:
>> While reviewing another patch I realized that the eol related
>> documentation was not updated as it should be.
>>
>> Torsten Bögershausen (2):
>>   git ls-files: text=auto eol=lf is supported in Git 2.10
>>   gitattributes: Document the unified "auto" handling
>>
>>  Documentation/git-ls-files.txt  |  3 +--
>>  Documentation/gitattributes.txt | 24 
>>  2 files changed, 17 insertions(+), 10 deletions(-)
>
> This [0/3] is meant to be a cover for [1/2] and [2/2]?
>
> I am trying to see if we broke format-patch recently, or it is a
> manual editing error.  The latter I do not care about; the former I
> do.
> --

Yes. I recently changed some of format patch and would like to make
sure I didn't break it...

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 06/13] pkt-line: add functions to read/write flush terminated packet streams

2016-08-25 Thread Jacob Keller

On Thu, Aug 25, 2016 at 3:31 PM, Junio C Hamano  wrote:
> What is wrong about that?  4*80k = 320kB overhead for length fields
> to transfer 5GB worth of data?  I do not think it is worth worrying
> about it.
>
> But I am more surprised by seeing that "why not a single huge
> packet" suggestion immediately after you talked about "without the
> possibility to intervene".  They do not seem to be remotely related;
> in fact, they are going into opposite directions.
>
> Puzzled.

Stefan's argument to me is thus "If we're already going to ignore
sideband packets here, why not go all the way and make variable length
packets and send a single packet of a maximum length? Doing thus will
solve some set of future problems nicely and makes this code easier."

I'm not sure I agree myself, but that's the logic as I understand it.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128

2016-08-25 Thread Michael Haggerty

On 08/25/2016 06:01 PM, Alex Nauda wrote:
> On Thu, Aug 25, 2016 at 2:28 AM, Michael Haggerty  
> wrote:
>> On 08/24/2016 11:39 PM, Jeff King wrote:
>>> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>>>
 Elastic File System (EFS) is Amazon's scalable filesystem product that
 is exposed to the OS as an NFS mount. We're using EFS to host the
 filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
 to git fetch, we get this error:
 $ git -c core.askpass=true fetch --tags --progress
 g...@github.com:mediasilo/dodo.git
 +refs/pull/*:refs/remotes/origin/pr/*
 fatal: Reference directory conflict: refs/heads/
 $ echo $? 128

 Has anyone seen anything like this before? Any tips on how to troubleshoot 
 it?
>>>
>>> No, I haven't seen it before. That's an internal assertion in the refs
>>> code that shouldn't ever happen. It looks like it happens when the loose
>>> refs end up with duplicate directory entries. While a bug in git is an
>>> obvious culprit, I wonder if it's possible that your filesystem might
>>> expose the same name twice in one set of readdir() results.
>>>
>>> +cc Michael, who added this assertion long ago (and since this is the
>>> first report in all these years, it does make me suspect that the
>>> filesystem is a critical part of reproducing).
>>
>> Thanks for the CC.
>>
>> I've never heard of this problem before.
>>
>> What Git version are you using?
> Git client 2.7.4 against GitHub (Git 2.6.5)
> 
>>
>> I tried to provoke the problem by hand-corrupting the packed-refs file,
>> but wasn't successful.
>>
>> So Peff's suggestion that the problem originates in your filesystem
>> seems to be to be the most likely cause. A quick Google search found,
>> for example,
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=739222
>>
>> http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs
>>
>> though these reports seem connected with having lots of files in the
>> directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
>> more careful search, and it is easily possible that there are other bugs
>> in NFS (or EFS) that could be affecting you.
>>
>> If this were repeatable, you could run Git under strace to test Peff's
>> hypothesis. But I suppose it only happens rarely, right?
> Actually it seems to be reproducible. Here's the last portion of an strace:
> 
> [...]
> stat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> lstat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> open(".git/refs/remotes/origin/pr/7/head", O_RDONLY) = 4
> read(4, "5d82811a248900efd8e201c6d9232de5"..., 256) = 41
> read(4, "", 215)= 0
> close(4)= 0
> getdents(3, /* 0 entries */, 32768) = 0
> close(3)= 0
> open(".git/refs/remotes/origin/pr/16/",
> O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> getdents(3, /* 3 entries */, 32768) = 72
> stat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> lstat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644,
> st_size=41, ...}) = 0
> open(".git/refs/remotes/origin/pr/16/head", O_RDONLY) = 4
> read(4, "2886c4f3ba8c3b5c2306029f6e39498d"..., 256) = 41
> read(4, "", 215)= 0
> close(4)= 0
> getdents(3, /* 0 entries */, 32768) = 0
> close(3)= 0
> open(".git/refs/tags/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> getdents(3, /* 2 entries */, 32768) = 48
> getdents(3, /* 0 entries */, 32768) = 0
> close(3)= 0
> open(".git/refs/bisect/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) =
> -1 ENOENT (No such file or directory)
> open(".git/packed-refs", O_RDONLY)  = -1 ENOENT (No such file or 
> directory)
> fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
> write(2, "fatal: Reference directory confl"..., 58fatal: Reference
> directory conflict: refs/remotes/origin/
> ) = 58
> exit_group(128) = ?
> +++ exited with 128 +++

Thanks for the additional information.

>From the strace output it is clear that there is no packed-refs file at
the time of the problem, so the problem must be among the loose refs.

The error is a "Reference directory conflict", which suggests that
"refs/remotes/origin/" appears in two entries; once as a reference
directory and once as a reference. But in fact it could also mean that
"refs/remotes/origin/" appears twice, both as directories. Neither one
should happen in normal operation.

Unfortunately there is not enough strace output to see whether (in this
case) path `refs/remotes/origin` was

[PATCH v11 0/8] submodule inline diff format

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

Modify the changes to do_submodule_path so that we properly call
gitmodules_config() before the lookup of submodule_from_path. This may
need to be modified so that we only call it the first time as I'm not
sure what sort of performance hit we'll see. Note that we only need this
call if we no longer have the checkout so it may not be so bad.

Also make the function propagate an error code up to the callers so that
they don't try to use an invalid path. This is better than die()'ing
because a failure can occur if the submodule configuration can't be
found for some reason (such as committing the submodule but not the
.gitmodules file which we previously did in the test).

interdiff between v10 and v11:
diff --git w/cache.h c/cache.h
index 70428e92d7ed..4f6693afa387 100644
--- w/cache.h
+++ c/cache.h
@@ -819,8 +819,8 @@ extern void strbuf_git_common_path(struct strbuf *sb, const 
char *fmt, ...)
__attribute__((format (printf, 2, 3)));
 extern char *git_path_buf(struct strbuf *buf, const char *fmt, ...)
__attribute__((format (printf, 2, 3)));
-extern void strbuf_git_path_submodule(struct strbuf *sb, const char *path,
- const char *fmt, ...)
+extern int strbuf_git_path_submodule(struct strbuf *sb, const char *path,
+const char *fmt, ...)
__attribute__((format (printf, 3, 4)));
 extern char *git_pathdup(const char *fmt, ...)
__attribute__((format (printf, 1, 2)));
diff --git w/path.c c/path.c
index 07dd0f62eb82..e9369f75319d 100644
--- w/path.c
+++ c/path.c
@@ -469,13 +469,16 @@ const char *worktree_git_path(const struct worktree *wt, 
const char *fmt, ...)
return pathname->buf;
 }
 
-static void do_submodule_path(struct strbuf *buf, const char *path,
- const char *fmt, va_list args)
+/* Returns 0 on success, non-zero on failure. */
+#define SUBMODULE_PATH_ERR_NOT_CONFIGURED -1
+static int do_submodule_path(struct strbuf *buf, const char *path,
+const char *fmt, va_list args)
 {
const char *git_dir;
struct strbuf git_submodule_common_dir = STRBUF_INIT;
struct strbuf git_submodule_dir = STRBUF_INIT;
const struct submodule *sub;
+   int err = 0;
 
strbuf_addstr(buf, path);
strbuf_complete(buf, '/');
@@ -487,12 +490,14 @@ static void do_submodule_path(struct strbuf *buf, const 
char *path,
strbuf_addstr(buf, git_dir);
}
if (!is_git_directory(buf->buf)) {
+   gitmodules_config();
sub = submodule_from_path(null_sha1, path);
-   if (sub) {
-   strbuf_reset(buf);
-   strbuf_git_path(buf, "%s/%s", "modules",
-   sub->name);
+   if (!sub) {
+   err = SUBMODULE_PATH_ERR_NOT_CONFIGURED;
+   goto cleanup;
}
+   strbuf_reset(buf);
+   strbuf_git_path(buf, "%s/%s", "modules", sub->name);
}
 
strbuf_addch(buf, '/');
@@ -505,27 +510,36 @@ static void do_submodule_path(struct strbuf *buf, const 
char *path,
 
strbuf_cleanup_path(buf);
 
+cleanup:
strbuf_release(_submodule_dir);
strbuf_release(_submodule_common_dir);
+
+   return err;
 }
 
 char *git_pathdup_submodule(const char *path, const char *fmt, ...)
 {
+   int err;
va_list args;
struct strbuf buf = STRBUF_INIT;
va_start(args, fmt);
-   do_submodule_path(, path, fmt, args);
+   err = do_submodule_path(, path, fmt, args);
va_end(args);
+   if (err)
+   return NULL;
return strbuf_detach(, NULL);
 }
 
-void strbuf_git_path_submodule(struct strbuf *buf, const char *path,
-  const char *fmt, ...)
+int strbuf_git_path_submodule(struct strbuf *buf, const char *path,
+ const char *fmt, ...)
 {
+   int err;
va_list args;
va_start(args, fmt);
-   do_submodule_path(buf, path, fmt, args);
+   err = do_submodule_path(buf, path, fmt, args);
va_end(args);
+
+   return err;
 }
 
 static void do_git_common_path(struct strbuf *buf,
diff --git w/refs/files-backend.c c/refs/files-backend.c
index 12290d249643..1f34b444af8d 100644
--- w/refs/files-backend.c
+++ c/refs/files-backend.c
@@ -1225,13 +1225,19 @@ static void read_loose_refs(const char *dirname, struct 
ref_dir *dir)
struct strbuf refname;
struct strbuf path = STRBUF_INIT;
size_t path_baselen;
+   int err = 0;
 
if (*refs->name)
-   strbuf_git_path_submodule(, refs->name, "%s", dirname);
+   err = strbuf_git_path_submodule(, refs->name, "%s", 
dirname);
else
strbuf_git_path(, "%s", dirname);
path_baselen = path.len;
 
+   if (err) {
+

[PATCH v11 1/8] cache: add empty_tree_oid object and helper function

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

Similar to is_null_oid(), and is_empty_blob_sha1() add an
empty_tree_oid along with helper function is_empty_tree_oid(). For
completeness, also add an "is_empty_tree_sha1()",
"is_empty_blob_sha1()", "is_empty_tree_oid()" and "is_empty_blob_oid()"
helpers.

To ensure we only get one singleton, implement EMPTY_BLOB_SHA1_BIN as
simply getting the hash of empty_blob_oid structure.

Signed-off-by: Jacob Keller 
---
 cache.h | 25 +
 sha1_file.c |  6 ++
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index f30a4417efdf..70428e92d7ed 100644
--- a/cache.h
+++ b/cache.h
@@ -953,22 +953,39 @@ static inline void oidclr(struct object_id *oid)
 #define EMPTY_TREE_SHA1_BIN_LITERAL \
 "\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60" \
 "\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04"
-#define EMPTY_TREE_SHA1_BIN \
-((const unsigned char *) EMPTY_TREE_SHA1_BIN_LITERAL)
+extern const struct object_id empty_tree_oid;
+#define EMPTY_TREE_SHA1_BIN (empty_tree_oid.hash)
 
 #define EMPTY_BLOB_SHA1_HEX \
"e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
 #define EMPTY_BLOB_SHA1_BIN_LITERAL \
"\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b" \
"\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91"
-#define EMPTY_BLOB_SHA1_BIN \
-   ((const unsigned char *) EMPTY_BLOB_SHA1_BIN_LITERAL)
+extern const struct object_id empty_blob_oid;
+#define EMPTY_BLOB_SHA1_BIN (empty_blob_oid.hash)
+
 
 static inline int is_empty_blob_sha1(const unsigned char *sha1)
 {
return !hashcmp(sha1, EMPTY_BLOB_SHA1_BIN);
 }
 
+static inline int is_empty_blob_oid(const struct object_id *oid)
+{
+   return !hashcmp(oid->hash, EMPTY_BLOB_SHA1_BIN);
+}
+
+static inline int is_empty_tree_sha1(const unsigned char *sha1)
+{
+   return !hashcmp(sha1, EMPTY_TREE_SHA1_BIN);
+}
+
+static inline int is_empty_tree_oid(const struct object_id *oid)
+{
+   return !hashcmp(oid->hash, EMPTY_TREE_SHA1_BIN);
+}
+
+
 int git_mkstemp(char *path, size_t n, const char *template);
 
 /* set default permissions by passing mode arguments to open(2) */
diff --git a/sha1_file.c b/sha1_file.c
index 1e23fc186a02..21cf923bcf1f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -38,6 +38,12 @@ static inline uintmax_t sz_fmt(size_t s) { return s; }
 
 const unsigned char null_sha1[20];
 const struct object_id null_oid;
+const struct object_id empty_tree_oid = {
+   EMPTY_TREE_SHA1_BIN_LITERAL
+};
+const struct object_id empty_blob_oid = {
+   EMPTY_BLOB_SHA1_BIN_LITERAL
+};
 
 /*
  * This is meant to hold a *small* number of objects that you would
-- 
2.10.0.rc0.259.g83512d9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v11 8/8] diff: teach diff to display submodule difference with an inline diff

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

Teach git-diff and friends a new format for displaying the difference
of a submodule. The new format is an inline diff of the contents of the
submodule between the commit range of the update. This allows the user
to see the actual code change caused by a submodule update.

Add tests for the new format and option.

Signed-off-by: Jacob Keller 
---
 Documentation/diff-config.txt|   9 +-
 Documentation/diff-options.txt   |  17 +-
 diff.c   |  31 +-
 diff.h   |   3 +-
 submodule.c  |  69 +++
 submodule.h  |   6 +
 t/t4060-diff-submodule-option-diff-format.sh | 749 +++
 7 files changed, 863 insertions(+), 21 deletions(-)
 create mode 100755 t/t4060-diff-submodule-option-diff-format.sh

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index d5a5b17d5088..0eded24034b5 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -122,10 +122,11 @@ diff.suppressBlankEmpty::
 
 diff.submodule::
Specify the format in which differences in submodules are
-   shown.  The "log" format lists the commits in the range like
-   linkgit:git-submodule[1] `summary` does.  The "short" format
-   format just shows the names of the commits at the beginning
-   and end of the range.  Defaults to short.
+   shown.  The "short" format just shows the names of the commits
+   at the beginning and end of the range. The "log" format lists
+   the commits in the range like linkgit:git-submodule[1] `summary`
+   does. The "diff" format shows an inline diff of the changed
+   contents of the submodule. Defaults to "short".
 
 diff.wordRegex::
A POSIX Extended Regular Expression used to determine what is a "word"
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index cc4342e2034d..7805a0ccadf2 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -210,13 +210,16 @@ any of those replacements occurred.
of the `--diff-filter` option on what the status letters mean.
 
 --submodule[=]::
-   Specify how differences in submodules are shown.  When `--submodule`
-   or `--submodule=log` is given, the 'log' format is used.  This format 
lists
-   the commits in the range like linkgit:git-submodule[1] `summary` does.
-   Omitting the `--submodule` option or specifying `--submodule=short`,
-   uses the 'short' format. This format just shows the names of the commits
-   at the beginning and end of the range.  Can be tweaked via the
-   `diff.submodule` configuration variable.
+   Specify how differences in submodules are shown.  When specifying
+   `--submodule=short` the 'short' format is used.  This format just
+   shows the names of the commits at the beginning and end of the range.
+   When `--submodule` or `--submodule=log` is specified, the 'log'
+   format is used.  This format lists the commits in the range like
+   linkgit:git-submodule[1] `summary` does.  When `--submodule=diff`
+   is specified, the 'diff' format is used.  This format shows an
+   inline diff of the changes in the submodule contents between the
+   commit range.  Defaults to `diff.submodule` or the 'short' format
+   if the config option is unset.
 
 --color[=]::
Show colored diff.
diff --git a/diff.c b/diff.c
index 16253b191f53..b38d95eb249c 100644
--- a/diff.c
+++ b/diff.c
@@ -135,6 +135,8 @@ static int parse_submodule_params(struct diff_options 
*options, const char *valu
options->submodule_format = DIFF_SUBMODULE_LOG;
else if (!strcmp(value, "short"))
options->submodule_format = DIFF_SUBMODULE_SHORT;
+   else if (!strcmp(value, "diff"))
+   options->submodule_format = DIFF_SUBMODULE_INLINE_DIFF;
else
return -1;
return 0;
@@ -2300,6 +2302,15 @@ static void builtin_diff(const char *name_a,
struct strbuf header = STRBUF_INIT;
const char *line_prefix = diff_line_prefix(o);
 
+   diff_set_mnemonic_prefix(o, "a/", "b/");
+   if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
+   a_prefix = o->b_prefix;
+   b_prefix = o->a_prefix;
+   } else {
+   a_prefix = o->a_prefix;
+   b_prefix = o->b_prefix;
+   }
+
if (o->submodule_format == DIFF_SUBMODULE_LOG &&
(!one->mode || S_ISGITLINK(one->mode)) &&
(!two->mode || S_ISGITLINK(two->mode))) {
@@ -2311,6 +2322,17 @@ static void builtin_diff(const char *name_a,
two->dirty_submodule,
meta, del, add, reset);
return;
+   } else if (o->submodule_format == DIFF_SUBMODULE_INLINE_DIFF &&
+

[PATCH v11 7/8] submodule: refactor show_submodule_summary with helper function

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

A future patch is going to add a new submodule diff format which
displays an inline diff of the submodule changes. To make this easier,
and to ensure that both submodule diff formats use the same initial
header, factor out show_submodule_header() function which will print the
current submodule header line, and then leave the show_submodule_summary
function to lookup and print the submodule log format.

This does create one format change in that "(revision walker failed)"
will now be displayed on its own line rather than as part of the message
because we no longer perform this step directly in the header display
flow. However, this is a rare case as most causes of the failure will be
due to a missing commit which we already check for and avoid previously.
flow. However, this is a rare case and shouldn't impact much.

Signed-off-by: Jacob Keller 
---
 submodule.c | 115 +++-
 1 file changed, 82 insertions(+), 33 deletions(-)

diff --git a/submodule.c b/submodule.c
index 7cb236b0a108..2d88c555895d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -280,9 +280,9 @@ void handle_ignore_submodules_arg(struct diff_options 
*diffopt,
 
 static int prepare_submodule_summary(struct rev_info *rev, const char *path,
struct commit *left, struct commit *right,
-   int *fast_forward, int *fast_backward)
+   struct commit_list *merge_bases)
 {
-   struct commit_list *merge_bases, *list;
+   struct commit_list *list;
 
init_revisions(rev, NULL);
setup_revisions(0, NULL, rev, NULL);
@@ -291,13 +291,6 @@ static int prepare_submodule_summary(struct rev_info *rev, 
const char *path,
left->object.flags |= SYMMETRIC_LEFT;
add_pending_object(rev, >object, path);
add_pending_object(rev, >object, path);
-   merge_bases = get_merge_bases(left, right);
-   if (merge_bases) {
-   if (merge_bases->item == left)
-   *fast_forward = 1;
-   else if (merge_bases->item == right)
-   *fast_backward = 1;
-   }
for (list = merge_bases; list; list = list->next) {
list->item->object.flags |= UNINTERESTING;
add_pending_object(rev, >item->object,
@@ -335,31 +328,23 @@ static void print_submodule_summary(struct rev_info *rev, 
FILE *f,
strbuf_release();
 }
 
-void show_submodule_summary(FILE *f, const char *path,
+/* Helper function to display the submodule header line prior to the full
+ * summary output. If it can locate the submodule objects directory it will
+ * attempt to lookup both the left and right commits and put them into the
+ * left and right pointers.
+ */
+static void show_submodule_header(FILE *f, const char *path,
const char *line_prefix,
struct object_id *one, struct object_id *two,
unsigned dirty_submodule, const char *meta,
-   const char *del, const char *add, const char *reset)
+   const char *reset,
+   struct commit **left, struct commit **right,
+   struct commit_list **merge_bases)
 {
-   struct rev_info rev;
-   struct commit *left = NULL, *right = NULL;
const char *message = NULL;
struct strbuf sb = STRBUF_INIT;
int fast_forward = 0, fast_backward = 0;
 
-   if (is_null_oid(two))
-   message = "(submodule deleted)";
-   else if (add_submodule_odb(path))
-   message = "(not initialized)";
-   else if (is_null_oid(one))
-   message = "(new submodule)";
-   else if (!(left = lookup_commit_reference(one->hash)) ||
-!(right = lookup_commit_reference(two->hash)))
-   message = "(commits not present)";
-   else if (prepare_submodule_summary(, path, left, right,
-  _forward, _backward))
-   message = "(revision walker failed)";
-
if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
fprintf(f, "%sSubmodule %s contains untracked content\n",
line_prefix, path);
@@ -367,11 +352,46 @@ void show_submodule_summary(FILE *f, const char *path,
fprintf(f, "%sSubmodule %s contains modified content\n",
line_prefix, path);
 
+   if (is_null_oid(one))
+   message = "(new submodule)";
+   else if (is_null_oid(two))
+   message = "(submodule deleted)";
+
+   if (add_submodule_odb(path)) {
+   if (!message)
+   message = "(not initialized)";
+   goto output_header;
+   }
+
+   /*
+* Attempt to lookup the commit references, and determine if this is
+* a fast forward or fast backwards update.
+*/
+   *left = lookup_commit_reference(one->hash);
+   *right =

[PATCH v11 6/8] submodule: convert show_submodule_summary to use struct object_id *

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

Since we're going to be changing this function in a future patch, lets
go ahead and convert this to use object_id now.

Signed-off-by: Jacob Keller 
---
 diff.c  |  2 +-
 submodule.c | 16 
 submodule.h |  2 +-
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/diff.c b/diff.c
index d6b321da3d1d..16253b191f53 100644
--- a/diff.c
+++ b/diff.c
@@ -2307,7 +2307,7 @@ static void builtin_diff(const char *name_a,
const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
show_submodule_summary(o->file, one->path ? one->path : 
two->path,
line_prefix,
-   one->oid.hash, two->oid.hash,
+   >oid, >oid,
two->dirty_submodule,
meta, del, add, reset);
return;
diff --git a/submodule.c b/submodule.c
index 6096cf428be7..7cb236b0a108 100644
--- a/submodule.c
+++ b/submodule.c
@@ -337,7 +337,7 @@ static void print_submodule_summary(struct rev_info *rev, 
FILE *f,
 
 void show_submodule_summary(FILE *f, const char *path,
const char *line_prefix,
-   unsigned char one[20], unsigned char two[20],
+   struct object_id *one, struct object_id *two,
unsigned dirty_submodule, const char *meta,
const char *del, const char *add, const char *reset)
 {
@@ -347,14 +347,14 @@ void show_submodule_summary(FILE *f, const char *path,
struct strbuf sb = STRBUF_INIT;
int fast_forward = 0, fast_backward = 0;
 
-   if (is_null_sha1(two))
+   if (is_null_oid(two))
message = "(submodule deleted)";
else if (add_submodule_odb(path))
message = "(not initialized)";
-   else if (is_null_sha1(one))
+   else if (is_null_oid(one))
message = "(new submodule)";
-   else if (!(left = lookup_commit_reference(one)) ||
-!(right = lookup_commit_reference(two)))
+   else if (!(left = lookup_commit_reference(one->hash)) ||
+!(right = lookup_commit_reference(two->hash)))
message = "(commits not present)";
else if (prepare_submodule_summary(, path, left, right,
   _forward, _backward))
@@ -367,16 +367,16 @@ void show_submodule_summary(FILE *f, const char *path,
fprintf(f, "%sSubmodule %s contains modified content\n",
line_prefix, path);
 
-   if (!hashcmp(one, two)) {
+   if (!oidcmp(one, two)) {
strbuf_release();
return;
}
 
strbuf_addf(, "%s%sSubmodule %s %s..", line_prefix, meta, path,
-   find_unique_abbrev(one, DEFAULT_ABBREV));
+   find_unique_abbrev(one->hash, DEFAULT_ABBREV));
if (!fast_backward && !fast_forward)
strbuf_addch(, '.');
-   strbuf_addf(, "%s", find_unique_abbrev(two, DEFAULT_ABBREV));
+   strbuf_addf(, "%s", find_unique_abbrev(two->hash, DEFAULT_ABBREV));
if (message)
strbuf_addf(, " %s%s\n", message, reset);
else
diff --git a/submodule.h b/submodule.h
index 2af939099819..d83df57e24ff 100644
--- a/submodule.h
+++ b/submodule.h
@@ -43,7 +43,7 @@ const char *submodule_strategy_to_string(const struct 
submodule_update_strategy
 void handle_ignore_submodules_arg(struct diff_options *diffopt, const char *);
 void show_submodule_summary(FILE *f, const char *path,
const char *line_prefix,
-   unsigned char one[20], unsigned char two[20],
+   struct object_id *one, struct object_id *two,
unsigned dirty_submodule, const char *meta,
const char *del, const char *add, const char *reset);
 void set_config_fetch_recurse_submodules(int value);
-- 
2.10.0.rc0.259.g83512d9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v11 3/8] graph: add support for --line-prefix on all graph-aware output

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

Add an extension to git-diff and git-log (and any other graph-aware
displayable output) such that "--line-prefix=" will print the
additional line-prefix on every line of output.

To make this work, we have to fix a few bugs in the graph API that force
graph_show_commit_msg to be used only when you have a valid graph.
Additionally, we extend the default_diff_output_prefix handler to work
even when no graph is enabled.

This is somewhat of a hack on top of the graph API, but I think it
should be acceptable here.

This will be used by a future extension of submodule display which
displays the submodule diff as the actual diff between the pre and post
commit in the submodule project.

Add some tests for both git-log and git-diff to ensure that the prefix
is honored correctly.

Signed-off-by: Jacob Keller 
---
 Documentation/diff-options.txt |   3 +
 builtin/rev-list.c |  70 ++---
 diff.c |   7 +
 diff.h |   2 +
 graph.c|  98 ---
 graph.h|  22 +-
 log-tree.c |   5 +-
 t/t4013-diff-various.sh|   6 +
 ...diff.diff_--line-prefix=abc_master_master^_side |  29 ++
 t/t4013/diff.diff_--line-prefix_--cached_--_file0  |  15 +
 t/t4202-log.sh | 323 +
 11 files changed, 502 insertions(+), 78 deletions(-)
 create mode 100644 t/t4013/diff.diff_--line-prefix=abc_master_master^_side
 create mode 100644 t/t4013/diff.diff_--line-prefix_--cached_--_file0

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 705a87394200..cc4342e2034d 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -569,5 +569,8 @@ endif::git-format-patch[]
 --no-prefix::
Do not show any source or destination prefix.
 
+--line-prefix=::
+   Prepend an additional prefix to every line of output.
+
 For more detailed explanation on these common options, see also
 linkgit:gitdiffcore[7].
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 0ba82b1635b6..8479f6ed28aa 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -122,48 +122,40 @@ static void show_commit(struct commit *commit, void *data)
ctx.fmt = revs->commit_format;
ctx.output_encoding = get_log_output_encoding();
pretty_print_commit(, commit, );
-   if (revs->graph) {
-   if (buf.len) {
-   if (revs->commit_format != CMIT_FMT_ONELINE)
-   graph_show_oneline(revs->graph);
+   if (buf.len) {
+   if (revs->commit_format != CMIT_FMT_ONELINE)
+   graph_show_oneline(revs->graph);
 
-   graph_show_commit_msg(revs->graph, );
+   graph_show_commit_msg(revs->graph, stdout, );
 
-   /*
-* Add a newline after the commit message.
-*
-* Usually, this newline produces a blank
-* padding line between entries, in which case
-* we need to add graph padding on this line.
-*
-* However, the commit message may not end in a
-* newline.  In this case the newline simply
-* ends the last line of the commit message,
-* and we don't need any graph output.  (This
-* always happens with CMIT_FMT_ONELINE, and it
-* happens with CMIT_FMT_USERFORMAT when the
-* format doesn't explicitly end in a newline.)
-*/
-   if (buf.len && buf.buf[buf.len - 1] == '\n')
-   graph_show_padding(revs->graph);
-   putchar('\n');
-   } else {
-   /*
-* If the message buffer is empty, just show
-* the rest of the graph output for this
-* commit.
-*/
-   if (graph_show_remainder(revs->graph))
-   putchar('\n');
-   if (revs->commit_format == CMIT_FMT_ONELINE)
-   putchar('\n');
-   }
+   /*
+

[PATCH v11 4/8] diff: prepare for additional submodule formats

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

A future patch will add a new format for displaying the difference of
a submodule. Make it easier by changing how we store the current
selected format. Replace the DIFF_OPT flag with an enumeration, as each
format will be mutually exclusive.

Signed-off-by: Jacob Keller 
---
 diff.c | 12 ++--
 diff.h |  7 ++-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
index e57cf39ad109..d6b321da3d1d 100644
--- a/diff.c
+++ b/diff.c
@@ -132,9 +132,9 @@ static int parse_dirstat_params(struct diff_options 
*options, const char *params
 static int parse_submodule_params(struct diff_options *options, const char 
*value)
 {
if (!strcmp(value, "log"))
-   DIFF_OPT_SET(options, SUBMODULE_LOG);
+   options->submodule_format = DIFF_SUBMODULE_LOG;
else if (!strcmp(value, "short"))
-   DIFF_OPT_CLR(options, SUBMODULE_LOG);
+   options->submodule_format = DIFF_SUBMODULE_SHORT;
else
return -1;
return 0;
@@ -2300,9 +2300,9 @@ static void builtin_diff(const char *name_a,
struct strbuf header = STRBUF_INIT;
const char *line_prefix = diff_line_prefix(o);
 
-   if (DIFF_OPT_TST(o, SUBMODULE_LOG) &&
-   (!one->mode || S_ISGITLINK(one->mode)) &&
-   (!two->mode || S_ISGITLINK(two->mode))) {
+   if (o->submodule_format == DIFF_SUBMODULE_LOG &&
+   (!one->mode || S_ISGITLINK(one->mode)) &&
+   (!two->mode || S_ISGITLINK(two->mode))) {
const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
show_submodule_summary(o->file, one->path ? one->path : 
two->path,
@@ -3916,7 +3916,7 @@ int diff_opt_parse(struct diff_options *options,
DIFF_OPT_SET(options, OVERRIDE_SUBMODULE_CONFIG);
handle_ignore_submodules_arg(options, arg);
} else if (!strcmp(arg, "--submodule"))
-   DIFF_OPT_SET(options, SUBMODULE_LOG);
+   options->submodule_format = DIFF_SUBMODULE_LOG;
else if (skip_prefix(arg, "--submodule=", ))
return parse_submodule_opt(options, arg);
else if (skip_prefix(arg, "--ws-error-highlight=", ))
diff --git a/diff.h b/diff.h
index 1f57aad25c71..43b353aea091 100644
--- a/diff.h
+++ b/diff.h
@@ -83,7 +83,6 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct 
diff_options *opt, void *data)
 #define DIFF_OPT_DIRSTAT_BY_FILE (1 << 20)
 #define DIFF_OPT_ALLOW_TEXTCONV  (1 << 21)
 #define DIFF_OPT_DIFF_FROM_CONTENTS  (1 << 22)
-#define DIFF_OPT_SUBMODULE_LOG   (1 << 23)
 #define DIFF_OPT_DIRTY_SUBMODULES(1 << 24)
 #define DIFF_OPT_IGNORE_UNTRACKED_IN_SUBMODULES (1 << 25)
 #define DIFF_OPT_IGNORE_DIRTY_SUBMODULES (1 << 26)
@@ -110,6 +109,11 @@ enum diff_words_type {
DIFF_WORDS_COLOR
 };
 
+enum diff_submodule_format {
+   DIFF_SUBMODULE_SHORT = 0,
+   DIFF_SUBMODULE_LOG
+};
+
 struct diff_options {
const char *orderfile;
const char *pickaxe;
@@ -157,6 +161,7 @@ struct diff_options {
int stat_count;
const char *word_regex;
enum diff_words_type word_diff;
+   enum diff_submodule_format submodule_format;
 
/* this is set by diffcore for DIFF_FORMAT_PATCH */
int found_changes;
-- 
2.10.0.rc0.259.g83512d9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v11 5/8] allow do_submodule_path to work even if submodule isn't checked out

2016-08-25 Thread Jacob Keller

From: Jacob Keller 

Currently, do_submodule_path will attempt locating the .git directory by
using read_gitfile on /.git. If this fails it just assumes the
/.git is actually a git directory.

This is good because it allows for handling submodules which were cloned
in a regular manner first before being added to the parent project.

Unfortunately this fails if the  is not actually checked out any
longer, such as by removing the directory.

Fix this by checking if the directory we found is actually a gitdir. In
the case it is not, attempt to lookup the submodule configuration and
find the name of where it is stored in the .git/modules/ folder of the
parent project.

If we can't locate the submodule configuration this might occur because
for example a submodule gitlink was added but the corresponding
.gitmodules file was not properly updated. A die() here would not be
pleasant to the users of submodule diff formats, so instead, modify
do_submodule_path to return an error code. For git_pathdup_submodule,
just return NULL when we fail to find a path. For strbuf_git_path_submodule
propagate the error code to the caller.

Modify the callers of these functions to check the error code and fail
properly. This ensures we don't attempt to use a bad path that doesn't
match the corresponding submodule.

Because this change fixes add_submodule_odb to work even if the
submodule is not checked out, update the wording of the submodule log
diff format to correctly display that the submodule is "not initialized"
instead of "not checked out"

Add tests to ensure this change works as expected.

Signed-off-by: Jacob Keller 
---
 cache.h   |   4 +-
 path.c|  37 +++--
 refs/files-backend.c  |   8 +-
 submodule.c   |   6 +-
 t/t4059-diff-submodule-not-initialized.sh | 127 ++
 5 files changed, 171 insertions(+), 11 deletions(-)
 create mode 100755 t/t4059-diff-submodule-not-initialized.sh

diff --git a/cache.h b/cache.h
index 70428e92d7ed..4f6693afa387 100644
--- a/cache.h
+++ b/cache.h
@@ -819,8 +819,8 @@ extern void strbuf_git_common_path(struct strbuf *sb, const 
char *fmt, ...)
__attribute__((format (printf, 2, 3)));
 extern char *git_path_buf(struct strbuf *buf, const char *fmt, ...)
__attribute__((format (printf, 2, 3)));
-extern void strbuf_git_path_submodule(struct strbuf *sb, const char *path,
- const char *fmt, ...)
+extern int strbuf_git_path_submodule(struct strbuf *sb, const char *path,
+const char *fmt, ...)
__attribute__((format (printf, 3, 4)));
 extern char *git_pathdup(const char *fmt, ...)
__attribute__((format (printf, 1, 2)));
diff --git a/path.c b/path.c
index fe3c4d96c6d8..e9369f75319d 100644
--- a/path.c
+++ b/path.c
@@ -6,6 +6,7 @@
 #include "string-list.h"
 #include "dir.h"
 #include "worktree.h"
+#include "submodule-config.h"
 
 static int get_st_mode_bits(const char *path, int *mode)
 {
@@ -468,12 +469,16 @@ const char *worktree_git_path(const struct worktree *wt, 
const char *fmt, ...)
return pathname->buf;
 }
 
-static void do_submodule_path(struct strbuf *buf, const char *path,
- const char *fmt, va_list args)
+/* Returns 0 on success, non-zero on failure. */
+#define SUBMODULE_PATH_ERR_NOT_CONFIGURED -1
+static int do_submodule_path(struct strbuf *buf, const char *path,
+const char *fmt, va_list args)
 {
const char *git_dir;
struct strbuf git_submodule_common_dir = STRBUF_INIT;
struct strbuf git_submodule_dir = STRBUF_INIT;
+   const struct submodule *sub;
+   int err = 0;
 
strbuf_addstr(buf, path);
strbuf_complete(buf, '/');
@@ -484,6 +489,17 @@ static void do_submodule_path(struct strbuf *buf, const 
char *path,
strbuf_reset(buf);
strbuf_addstr(buf, git_dir);
}
+   if (!is_git_directory(buf->buf)) {
+   gitmodules_config();
+   sub = submodule_from_path(null_sha1, path);
+   if (!sub) {
+   err = SUBMODULE_PATH_ERR_NOT_CONFIGURED;
+   goto cleanup;
+   }
+   strbuf_reset(buf);
+   strbuf_git_path(buf, "%s/%s", "modules", sub->name);
+   }
+
strbuf_addch(buf, '/');
strbuf_addbuf(_submodule_dir, buf);
 
@@ -494,27 +510,36 @@ static void do_submodule_path(struct strbuf *buf, const 
char *path,
 
strbuf_cleanup_path(buf);
 
+cleanup:
strbuf_release(_submodule_dir);
strbuf_release(_submodule_common_dir);
+
+   return err;
 }
 
 char *git_pathdup_submodule(const char *path, const char *fmt, ...)
 {
+   int err;
va_list args;
struct strbuf buf = STRBUF_INIT;

[PATCH v11 2/8] diff.c: remove output_prefix_length field

2016-08-25 Thread Jacob Keller

From: Junio C Hamano 

"diff/log --stat" has a logic that determines the display columns
available for the diffstat part of the output and apportions it for
pathnames and diffstat graph automatically.

5e71a84a (Add output_prefix_length to diff_options, 2012-04-16)
added the output_prefix_length field to diff_options structure to
allow this logic to subtract the display columns used for the
history graph part from the total "terminal width"; this matters
when the "git log --graph -p" option is in use.

The field must be set to the number of display columns needed to
show the output from the output_prefix() callback, which is error
prone.  As there is only one user of the field, and the user has the
actual value of the prefix string, let's get rid of the field and
have the user count the display width itself.

Signed-off-by: Junio C Hamano 
---
 diff.c  | 2 +-
 diff.h  | 1 -
 graph.c | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index 534c12e28ea8..50bef1f07658 100644
--- a/diff.c
+++ b/diff.c
@@ -1625,7 +1625,7 @@ static void show_stats(struct diffstat_t *data, struct 
diff_options *options)
 */
 
if (options->stat_width == -1)
-   width = term_columns() - options->output_prefix_length;
+   width = term_columns() - strlen(line_prefix);
else
width = options->stat_width ? options->stat_width : 80;
number_width = decimal_width(max_change) > number_width ?
diff --git a/diff.h b/diff.h
index 7883729edf10..747a204d75a4 100644
--- a/diff.h
+++ b/diff.h
@@ -174,7 +174,6 @@ struct diff_options {
diff_format_fn_t format_callback;
void *format_callback_data;
diff_prefix_fn_t output_prefix;
-   int output_prefix_length;
void *output_prefix_data;
 
int diff_path_counter;
diff --git a/graph.c b/graph.c
index dd1720148dc5..a46803840511 100644
--- a/graph.c
+++ b/graph.c
@@ -197,7 +197,6 @@ static struct strbuf *diff_output_prefix_callback(struct 
diff_options *opt, void
assert(opt);
assert(graph);
 
-   opt->output_prefix_length = graph->width;
strbuf_reset();
graph_padding_line(graph, );
return 
@@ -245,7 +244,6 @@ struct git_graph *graph_init(struct rev_info *opt)
 */
opt->diffopt.output_prefix = diff_output_prefix_callback;
opt->diffopt.output_prefix_data = graph;
-   opt->diffopt.output_prefix_length = 0;
 
return graph;
 }
-- 
2.10.0.rc0.259.g83512d9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 06/13] pkt-line: add functions to read/write flush terminated packet streams

2016-08-25 Thread Lars Schneider


> On 25 Aug 2016, at 20:46, Stefan Beller  wrote:
> 
>> On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
>> From: Lars Schneider 
>> 
>> packet_write_stream_with_flush_from_fd() and
>> packet_write_stream_with_flush_from_buf() write a stream of packets. All
>> content packets use the maximal packet size except for the last one.
>> After the last content packet a `flush` control packet is written.
>> 
>> packet_read_till_flush() reads arbitrary sized packets until it detects
>> a `flush` packet.
> 
> So the API provided by these read/write functions is intended
> to move a huge chunks of data. And as it puts the data on the wire one
> packet after the other without the possibility to intervene and e.g. send
> a side channel progress bar update, I would question the design of this.
> If I understand correctly this will be specifically  used for large
> files locally,
> so e.g. a file of 5 GB (such as a virtual machine tracked in Git), would
> require about 80k packets.

Peff suggested this approach arguing that the overhead is neglectable:
http://public-inbox.org/git/20160720134916.gb19...@sigill.intra.peff.net/


> Instead of having many packets of max length and then a remainder,
> I would suggest to invent larger packets for this use case. Then we can
> just send one packet instead.
> 
> Currently a packet consists of 4 bytes indicating the length in hex
> and then the payload of length-4 bytes. As the length is in hex
> the characters in the first 4 bytes are [0-9a-f], we can easily add another
> meaning for the length, e.g.:
> 
>  A packet starts with the overall length and then the payload.
>  If the first character of the length is 'v' the length is encoded as a
>  variable length quantity[1]. The high bit of the char indicates if
>  the next char is still part of the length field. The length must not exceed
>  LLONG_MAX (which results in a payload of 9223 Petabyte, so
>  enough for the foreseeable future).

Eventually I would like to resurrect Joey's cleanFromFile/smudgeToFile idea:

http://public-inbox.org/git/1468277112-9909-3-git-send-email-jo...@joeyh.name/

Then we would not need to transfer that much data over the pipes. However, I 
wonder if the large amount of packets would actually be a problem. Honestly, I 
would prefer to not change Git's packet format in this already large series ;-)


>  [1] A variable-length quantity (VLQ) is a universal code that uses
>  an arbitrary number of bytes to represent an arbitrarily large integer.
>  https://en.wikipedia.org/wiki/Variable-length_quantity
> 
> The neat thing about the packet system is we can dedicate packets
> to different channels (such as the side channels), but with the provided
> API here this makes it impossible to later add in these side channel
> as it is a pure streaming API now. So let's remove the complication
> of having to send multiple packets and just go with one large packet
> instead.

I tried to design the protocol as flexible as possible for the future with a 
version negotiation and a capabilities list. Therefore, I would think it should 
be possible to implement these ideas in the future if they are required.


> --
>I understand that my proposal would require writing code again,
>but it has also some long term advantages in the networking stack
>of Git: There are some worries that a capabilities line in fetch/push
>might overflow in the far future, when there are lots of capabilities.
> 
>Also a few days ago there was a proposal to add all symbolic refs
>to a capabilities line, which Peff shot down as "the packet may be
>too small".
> 
>There is an incredible hack that allows transporting refs > 64kB IIRC.
> 
>All these things could go away with the variable length encoded
>packets. But to make them go away in the future we would need
>to start with these variable length packets today. ;)
> 
> Just food for thought.

Thanks for thinking it through that thoroughly! I understand your point of view 
and I am curious what others thing.

Cheers,
Lars

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/15] sequencer: lib'ify read_populate_todo()

2016-08-25 Thread Junio C Hamano

Johannes Schindelin  writes:

> To be truly useful, the sequencer should never die() but always return
> an error.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Instead of dying there, you let the caller high up in the callchain
to notice the error and handle it (by dying).

The only caller of read_populate_todo(), sequencer_continue() can
already return errors, so its caller must be already prepared to
handle error returns, and with this step, you make it notice an
error return from this function.  So this is a safe conversion to
make read_populate_todo() callable from new callers that want it not
to die, without changing the external behaviour of anything
existing.

Good.

By the way, I am writing these as review comments because I do not
want to keep repeating this kind of analysis as a reviewer.  I am
demonstrating what should have been in the commit log message
instead, so that the reviewer does not have to spend extra time, if
the reviewer trusts the author's diligence well enough, to see if
the conversion makes sense.

Please follow the example when/if you have to reroll.  I want the
patches to show the evidence of careful analysis to reviewers so
that they can gauge the trustworthiness of the patches.  With this
round of patches, honestly, I cannot tell if it is a mechanical
substitution alone, or such a substitution followed by a careful
verification of the callers.

> diff --git a/sequencer.c b/sequencer.c
> index a8c3a48..5f6b020 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -746,7 +746,7 @@ static int parse_insn_buffer(char *buf, struct 
> commit_list **todo_list,
>   return 0;
>  }
>  
> -static void read_populate_todo(struct commit_list **todo_list,
> +static int read_populate_todo(struct commit_list **todo_list,
>   struct replay_opts *opts)
>  {
>   struct strbuf buf = STRBUF_INIT;
> @@ -754,18 +754,21 @@ static void read_populate_todo(struct commit_list 
> **todo_list,
>  
>   fd = open(git_path_todo_file(), O_RDONLY);
>   if (fd < 0)
> - die_errno(_("Could not open %s"), git_path_todo_file());
> + return error(_("Could not open %s (%s)"),
> + git_path_todo_file(), strerror(errno));
>   if (strbuf_read(, fd, 0) < 0) {
>   close(fd);
>   strbuf_release();
> - die(_("Could not read %s."), git_path_todo_file());
> + return error(_("Could not read %s."), git_path_todo_file());
>   }
>   close(fd);
>  
>   res = parse_insn_buffer(buf.buf, todo_list, opts);
>   strbuf_release();
>   if (res)
> - die(_("Unusable instruction sheet: %s"), git_path_todo_file());
> + return error(_("Unusable instruction sheet: %s"),
> + git_path_todo_file());
> + return 0;
>  }
>  
>  static int populate_opts_cb(const char *key, const char *value, void *data)
> @@ -1015,7 +1018,8 @@ static int sequencer_continue(struct replay_opts *opts)
>   if (!file_exists(git_path_todo_file()))
>   return continue_single_pick();
>   read_populate_opts();
> - read_populate_todo(_list, opts);
> + if (read_populate_todo(_list, opts))
> + return -1;
>  
>   /* Verify that the conflict has been resolved */
>   if (file_exists(git_path_cherry_pick_head()) ||
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 04/15] sequencer: lib'ify prepare_revs()

2016-08-25 Thread Junio C Hamano

Johannes Schindelin  writes:

> To be truly useful, the sequencer should never die() but always return
> an error.
>
> Signed-off-by: Johannes Schindelin 
> ---

I am still looking at sequencer.c in 'master', but I do not think
that the sole caller of this function, walk_revs_populate_todo(),
is prepared to act on an error return from this function and instead
it expects this to die() when in trouble.  And I do not think I saw
the function touched in the steps so far.

So this step smells like a fishy conversion to me.

>  sequencer.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 6ac2187..b90294f 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -621,7 +621,7 @@ static int do_pick_commit(struct commit *commit, struct 
> replay_opts *opts)
>   return res;
>  }
>  
> -static void prepare_revs(struct replay_opts *opts)
> +static int prepare_revs(struct replay_opts *opts)
>  {
>   /*
>* picking (but not reverting) ranges (but not individual revisions)
> @@ -631,10 +631,11 @@ static void prepare_revs(struct replay_opts *opts)
>   opts->revs->reverse ^= 1;
>  
>   if (prepare_revision_walk(opts->revs))
> - die(_("revision walk setup failed"));
> + return error(_("revision walk setup failed"));
>  
>   if (!opts->revs->commits)
> - die(_("empty commit set passed"));
> + return error(_("empty commit set passed"));
> + return 0;
>  }
>  
>  static void read_and_refresh_cache(struct replay_opts *opts)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/15] sequencer: lib'ify read_and_refresh_cache()

2016-08-25 Thread Junio C Hamano

Johannes Schindelin  writes:

> To be truly useful, the sequencer should never die() but always return
> an error.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)

Instead of dying there, you let the caller high up in the callchain
to notice the error and handle it (by dying).  There are two call
sites of read_and_refresh_cache(), one of which is pick_commits(),
whose callers you already verified that they are prepared to do the
right thing given an "error" return from it when you did 3/15, so
the conversion is safe.  The other one, sequencer_pick_revisions()
is also prepared to relay an error return back to its caller and you
made sure its callers are all safe when you did 3/15.

Good.

> diff --git a/sequencer.c b/sequencer.c
> index b90294f..a8c3a48 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -638,18 +638,21 @@ static int prepare_revs(struct replay_opts *opts)
>   return 0;
>  }
>  
> -static void read_and_refresh_cache(struct replay_opts *opts)
> +static int read_and_refresh_cache(struct replay_opts *opts)
>  {
>   static struct lock_file index_lock;
>   int index_fd = hold_locked_index(_lock, 0);
>   if (read_index_preload(_index, NULL) < 0)
> - die(_("git %s: failed to read the index"), action_name(opts));
> + return error(_("git %s: failed to read the index"),
> + action_name(opts));
>   refresh_index(_index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, 
> NULL);
>   if (the_index.cache_changed && index_fd >= 0) {
>   if (write_locked_index(_index, _lock, COMMIT_LOCK))
> - die(_("git %s: failed to refresh the index"), 
> action_name(opts));
> + return error(_("git %s: failed to refresh the index"),
> + action_name(opts));
>   }
>   rollback_lock_file(_lock);
> + return 0;
>  }
>  
>  static int format_todo(struct strbuf *buf, struct commit_list *todo_list,
> @@ -977,7 +980,8 @@ static int pick_commits(struct commit_list *todo_list, 
> struct replay_opts *opts)
>   if (opts->allow_ff)
>   assert(!(opts->signoff || opts->no_commit ||
>   opts->record_origin || opts->edit));
> - read_and_refresh_cache(opts);
> + if (read_and_refresh_cache(opts))
> + return -1;
>  
>   for (cur = todo_list; cur; cur = cur->next) {
>   save_todo(cur, opts);
> @@ -1041,7 +1045,8 @@ int sequencer_pick_revisions(struct replay_opts *opts)
>   if (opts->subcommand == REPLAY_NONE)
>   assert(opts->revs);
>  
> - read_and_refresh_cache(opts);
> + if (read_and_refresh_cache(opts))
> + return -1;
>  
>   /*
>* Decide what to do depending on the arguments; a fresh
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 0/9] submodule inline diff format

2016-08-25 Thread Jacob Keller

On Thu, Aug 25, 2016 at 3:38 PM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>> So we should support the gitlink to a repository stored at 
>> without stuff inside the .git/modules, and we should support submodule
>> gitlinks with a proper .gitmodules setup. I don't think we should
>> die() but we should error properly so I will introduce a _gently()
>> variant of these functions, and die properly in the regular flow.
>
> Because "git diff [--cached] []" in the top-level is
> driven by a gitlink in the index, immediately after adding a new
> submodule to the index but before describing it in .gitmodules you
> might not have a name (and you know in that case the path will
> become the name when adding it to .gitmodules).  Also a gitlink in
> the index may correspond to a submodule the user of the top-level is
> not interested in, so there may not be anything in .git/modules/
> that corresponds to it.  In these cases, I suspect that you do not
> want to die, but you can just tell the user "I do not have enough
> information to tell you a useful story yet".
>

Right. submodule_from_path() fails to find a config. I don't think
die() is right here, because there is no easy way to make this into a
gently() variant I can still do it if we think a die() is
worthwhile otherwise for the other callers of do_submodule_path...

However, I think the safest thing is to just:

a) read_gitfile on /.git
b) if read_gitfile succeeds, use it's contents, otherwise use
/.git for next steps
c) check if the resulting file is a git directory, we're fine.. we
found a gitdir, so stop.
d) otherwise,  empty the buffer, then lookup submodules
e) when submodules lookup succeeds.. see if we found a name. If so, use that.
f) if we didn't just exit with an empty buffer.

That empty buffer *should* trigger  revision error codes since it
won't point to any valid path and it also triggers the regular error
code in add_submodule_odb so it handles that with showing not
initizlied.

This method is less work then re-implementing a _gently() variant for
all of these functions.

Stefan, does this make sense and seem reasonable?

If we just die when submodule_from_path fails I think it's bad for the
submodule=* formats beside short. I don't know if it causes problems
for revision code or not... I think falling back to resetting the
buffer as our way of indicating error is reasonable enough...

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 06/13] pkt-line: add functions to read/write flush terminated packet streams

2016-08-25 Thread Junio C Hamano

larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> packet_write_stream_with_flush_from_fd() and
> packet_write_stream_with_flush_from_buf() write a stream of packets. All
> content packets use the maximal packet size except for the last one.
> After the last content packet a `flush` control packet is written.
> packet_read_till_flush() reads arbitrary sized packets until it detects
> a `flush` packet.

These are awkwardly named and I couldn't guess what the input is (I
can tell one is to read from fd and the other is  buffer,
but it is unclear if that is in packetized form or just raw data
stream to be copied to the end from their names) without reading the
implementation.  I _think_ you read a raw stream of data through the
end (either EOF or length limit) and write it out packetized, and
use the flush packet to mark the end of the stream.  In my mind,
that is "writing a packetized stream".  The words "packetizing" and
"stream" imply that the stream could consist of more data than what
would fit in a single packet, which in turn implies that there needs
a way to mark the end of one data item, so with_flush does not
necessarily have to be their names.

The counter-part would be "reading a packetized stream".

> +int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out)
> +{

Especially this one I am tempted to suggest "copy-to-packetized-stream",
as it reads a stream from one fd and then copies out while packetizing.

> + int err = 0;
> + ssize_t bytes_to_write;
> +
> + while (!err) {
> + bytes_to_write = xread(fd_in, packet_write_buffer, 
> sizeof(packet_write_buffer) - 4);
> + if (bytes_to_write < 0)
> + return COPY_READ_ERROR;
> + if (bytes_to_write == 0)
> + break;
> + if (bytes_to_write > sizeof(packet_write_buffer) - 4)
> + return COPY_WRITE_ERROR;

... and you seem to agree with me by using COPY here.

> + err = packet_write_gently(fd_out, packet_write_buffer, 
> bytes_to_write);
> + }
> + if (!err)
> + err = packet_flush_gently(fd_out);
> + return err;
> +}
> +
> +int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, 
> int fd_out)
> +{
> + int err = 0;
> + size_t bytes_written = 0;
> + size_t bytes_to_write;
> +
> + while (!err) {
> + if ((len - bytes_written) > sizeof(packet_write_buffer) - 4)
> + bytes_to_write = sizeof(packet_write_buffer) - 4;
> + else
> + bytes_to_write = len - bytes_written;
> + if (bytes_to_write == 0)
> + break;

The lack of COPY_WRITE_ERROR puzzled me briefly here.  If you are
assuming that your math at the beginning of this loop is correct and
bytes_to_write will never exceed the write-buffer size, I think you
should be able to (and it would be better to) assume that the math
you do to tell xread() up to how many bytes it is allowed to read at
once is also correct, losing the COPY_WRITE_ERROR check in the other
function.  You can choose to play safer and do a check in this
function, too.  Either way, we would want to be consistent.

> + err = packet_write_gently(fd_out, src_in + bytes_written, 
> bytes_to_write);
> + bytes_written += bytes_to_write;
> + }
> + if (!err)
> + err = packet_flush_gently(fd_out);
> + return err;
> +}

> +ssize_t packet_read_till_flush(int fd_in, struct strbuf *sb_out)
> +{
> + int len, ret;
> + int options = PACKET_READ_GENTLE_ON_EOF;
> + char linelen[4];
> +
> + size_t oldlen = sb_out->len;
> + size_t oldalloc = sb_out->alloc;
> +
> + for (;;) {
> + /* Read packet header */
> + ret = get_packet_data(fd_in, NULL, NULL, linelen, 4, options);
> + if (ret < 0)
> + goto done;
> + len = packet_length(linelen);
> + if (len < 0)
> + die("protocol error: bad line length character: %.4s", 
> linelen);
> + if (!len) {
> + /* Found a flush packet - Done! */
> + packet_trace("", 4, 0);
> + break;
> + }
> + len -= 4;
> +
> + /* Read packet content */
> + strbuf_grow(sb_out, len);
> + ret = get_packet_data(fd_in, NULL, NULL, sb_out->buf + 
> sb_out->len, len, options);
> + if (ret < 0)
> + goto done;
> + if (ret != len) {
> + error("protocol error: incomplete read (expected %d, 
> got %d)", len, ret);
> + goto done;
> + }
> +
> + packet_trace(sb_out->buf + sb_out->len, len, 0);

All of the above seems to pretty much duplicate the logic in
packet_read(), except that this user does not need options handling
it

Re: [PATCH v10 0/9] submodule inline diff format

2016-08-25 Thread Junio C Hamano

Jacob Keller  writes:

> So we should support the gitlink to a repository stored at 
> without stuff inside the .git/modules, and we should support submodule
> gitlinks with a proper .gitmodules setup. I don't think we should
> die() but we should error properly so I will introduce a _gently()
> variant of these functions, and die properly in the regular flow.

Because "git diff [--cached] []" in the top-level is
driven by a gitlink in the index, immediately after adding a new
submodule to the index but before describing it in .gitmodules you
might not have a name (and you know in that case the path will
become the name when adding it to .gitmodules).  Also a gitlink in
the index may correspond to a submodule the user of the top-level is
not interested in, so there may not be anything in .git/modules/
that corresponds to it.  In these cases, I suspect that you do not
want to die, but you can just tell the user "I do not have enough
information to tell you a useful story yet".

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 06/13] pkt-line: add functions to read/write flush terminated packet streams

2016-08-25 Thread Junio C Hamano

Stefan Beller  writes:

> So the API provided by these read/write functions is intended
> to move a huge chunks of data. And as it puts the data on the wire one
> packet after the other without the possibility to intervene and e.g. send
> a side channel progress bar update, I would question the design of this.

Hmph, I didn't think about it.

But shouldn't one be able to set up sideband and channel one such
large transfer on one band, while multiplexing other payload on
other bands?

> If I understand correctly this will be specifically  used for large
> files locally,
> so e.g. a file of 5 GB (such as a virtual machine tracked in Git), would
> require about 80k packets.

What is wrong about that?  4*80k = 320kB overhead for length fields
to transfer 5GB worth of data?  I do not think it is worth worrying
about it.

But I am more surprised by seeing that "why not a single huge
packet" suggestion immediately after you talked about "without the
possibility to intervene".  They do not seem to be remotely related;
in fact, they are going into opposite directions.

Puzzled.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 0/9] submodule inline diff format

2016-08-25 Thread Jacob Keller

On Thu, Aug 25, 2016 at 1:46 PM, Stefan Beller  wrote:
> On Thu, Aug 25, 2016 at 1:39 PM, Jacob Keller  wrote:
>> On Tue, Aug 23, 2016 at 10:47 AM, Stefan Beller  wrote:
>>> On Tue, Aug 23, 2016 at 10:25 AM, Junio C Hamano  wrote:
 I am not so sure about that.  If there is an existing place that is
 buggy, shouldn't we fix that, instead of spreading the same bug
 (assuming that it is a bug in the first place, which I do not have a
 strong opinion on, at least not yet)?

 Can there be .git/modules// repository that is pointed at an
 in-tree .git file when there is no "name" defined?
>>>
>>> If you're holding it wrong we can come into that state.
>>> * Checkout the submodule,
>>> * then remove .gitmodules as well as relevant config in .git/config.
>>> Result: Then we have a only a gitlink recorded as well as connected
>>> working tree to a gitdir inside a superprojects .git/modules/.
>>>
>>
>> Yea, but I think you're right that we shouldn't support that state.
>> What I'm worried about is the case where we can get this state doing
>> something sane/acceptable but I don't think we can.
>>
>> So we should support the gitlink to a repository stored at 
>> without stuff inside the .git/modules, and we should support submodule
>> gitlinks with a proper .gitmodules setup. I don't think we should
>> die() but we should error properly so I will introduce a _gently()
>> variant of these functions, and die properly in the regular flow.
>
> ok, thanks!


Ok so a die() doesn't really work. If you use submodule_from_path here
on a newly checked out repository that has sub modules but you haven't
yet initialized the repository, then the result is that
submodule_from_path will fail.. Is that expected?

That causes us to die every time on a newly checked out repository.

If we go with this behavior, then I have to refactor the whole set of
path() functions to have a _gently() variant which handles this
gracefully, and the regular revision code would still end up die()ing
as well..

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 05/13] pkt-line: add packet_write_gently()

2016-08-25 Thread Junio C Hamano

larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> packet_write_fmt() has two shortcomings. First, it uses format_packet()
> which lets the caller only send string data via "%s". That means it
> cannot be used for arbitrary data that may contain NULs. Second, it will
> always die on error.

As you introduced _gently in 3/13, the latter is no longer a valid
excuse to add this function.  Just remove the sentence "Second, ...".

> Add packet_write_gently() which writes arbitrary data and returns `0`
> for success and `-1` for an error. This function is used by other
> pkt-line functions in a subsequent patch.
>
> Signed-off-by: Lars Schneider 
> ---
>  pkt-line.c | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/pkt-line.c b/pkt-line.c
> index cad26df..7e8a803 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -3,6 +3,7 @@
>  #include "run-command.h"
>  
>  char packet_buffer[LARGE_PACKET_MAX];
> +static char packet_write_buffer[LARGE_PACKET_MAX];
>  static const char *packet_trace_prefix = "git";
>  static struct trace_key trace_packet = TRACE_KEY_INIT(PACKET);
>  static struct trace_key trace_pack = TRACE_KEY_INIT(PACKFILE);
> @@ -155,6 +156,17 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
>   return (write_in_full(fd, buf.buf, buf.len) == buf.len ? 0 : -1);
>  }
>  
> +int packet_write_gently(const int fd_out, const char *buf, size_t size)
> +{
> + if (size > sizeof(packet_write_buffer) - 4)
> + return -1;
> + packet_trace(buf, size, 1);
> + memmove(packet_write_buffer + 4, buf, size);
> + size += 4;
> + set_packet_header(packet_write_buffer, size);

It may not matter all that much, but from code-reader's point of
view, when you know packet_write_buffer[] will contain things A and
B in this order, and when you have enough information to compute A
before stasrting to fill packet_write_buffer[], I would prefer to
see you actually fill the buffer in that natural order.

Do you anticipate future need of non-gently variant of this
function?  If so, perhaps a helper that takes a boolean "am I
working for the gently variant?" may help share more code.

> + return (write_in_full(fd_out, packet_write_buffer, size) == size ? 0 : 
> -1);
> +}
> +
>  void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
>  {
>   va_list args;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 03/13] pkt-line: add packet_write_fmt_gently()

2016-08-25 Thread Junio C Hamano

larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> packet_write_fmt() would die in case of a write error even though for
> some callers an error would be acceptable. Add packet_write_fmt_gently()
> which writes a formatted pkt-line and returns `0` for success and `-1`
> for an error.
>
> Signed-off-by: Lars Schneider 
> ---
>  pkt-line.c | 12 
>  pkt-line.h |  1 +
>  2 files changed, 13 insertions(+)
>
> diff --git a/pkt-line.c b/pkt-line.c
> index e8adc0f..3e8b2fb 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -137,6 +137,18 @@ void packet_write_fmt(int fd, const char *fmt, ...)
>   write_or_die(fd, buf.buf, buf.len);
>  }
>  
> +int packet_write_fmt_gently(int fd, const char *fmt, ...)
> +{
> + static struct strbuf buf = STRBUF_INIT;
> + va_list args;
> +
> + strbuf_reset();
> + va_start(args, fmt);
> + format_packet(, fmt, args);
> + va_end(args);
> + return (write_in_full(fd, buf.buf, buf.len) == buf.len ? 0 : -1);
> +}

Even though its only a handful lines, it is a bit ugly to have a
completely copied implementation only to have _gently().  I suspect
that you should be able to

static int packet_write_fmt_1(int fd, int gently,
const char *fmt, va_list args)
{
struct strbuf buf = STRBUF_INIT;
size_t count;

format_packet(, fmt, args);

count = write_in_full(fd, buf.buf, buf.len);
if (count == buf.len)
return 0;
if (!gently) {
check_pipe(errno);
die_errno("write error");
}
return -1;
}

and then share that between the existing one:

void packet_write_fmt(int fd, const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
packet_write_fmt_1(fd, 0, fmt, args);
va_end(args);
}

and the new one:

void packet_write_fmt_gently(int fd, const char *fmt, ...)
{
int status;
va_list args;
va_start(args, fmt);
status = packet_write_fmt_1(fd, 1, fmt, args);
va_end(args);
return status;
}

>  void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
>  {
>   va_list args;
> diff --git a/pkt-line.h b/pkt-line.h
> index 1902fb3..3caea77 100644
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -23,6 +23,7 @@ void packet_flush(int fd);
>  void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format 
> (printf, 2, 3)));
>  void packet_buf_flush(struct strbuf *buf);
>  void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
> __attribute__((format (printf, 2, 3)));
> +int packet_write_fmt_gently(int fd, const char *fmt, ...) 
> __attribute__((format (printf, 2, 3)));
>  
>  /*
>   * Read a packetized line into the buffer, which must be at least size bytes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Feature Request: Branch-Aware Submodules

2016-08-25 Thread Junio C Hamano

Stefan Beller  writes:

>>> So you roughly do
>>>
>>> git checkout -b new-topic
>>> # change the submodule to point at the latest upstream version:
>>> git submodule update --remote 
>>> git commit -a -m "update submodule"
>>> git checkout master
>>> git merge new-topic
>>> # here seems to be your point of critic?
>>> # now the submodule pointer would still point to the latest
>>> upstream version?
>>
>> Isn't  subject to the usual 3-way merge when the
>> last step (i.e. a merge of new-topic branch into master in the
>> superproject) is made?  If 'master' hasn't changed 
>> since 'new-topic' forked from it, because 'new-topic' updated the
>> commit bound at , doesn't "git merge new-topic" just
>> take that change as the normal "One side updated, the other did not
>> touch; take the update" merge?
>
> Yes. I was unclear here.
> By "latest upstream version" I meant the version you pulled in in the 
> new-topic
> branch via the "submodule update --remote" and that is preserved as is.

I do not think you were unclear at all.

What else is desired?  "git merge new-topic" leaves a result that is
not a merge of the changes made on that new-topic branch, by leaving
a stale  that was in 'master' as-is?

After all, the new-topic branch committed that "update submodule",
showing its desire that the latest-from-upstream commit it just
obtained must be at  from then on in the top-level
project.  If that change is not propagated (or at least "taken into
account") when merging it to 'master', the result is not a proper
"merge".  If new-topic didn't want the updated commit from the
submodule, it shouldn't have recorded that in its commit in the
first place.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/15] sequencer: lib'ify do_pick_commit()

2016-08-25 Thread Junio C Hamano

Johannes Schindelin  writes:

> To be truly useful, the sequencer should never die() but always return
> an error.
>
> Signed-off-by: Johannes Schindelin 
> ---

Instead of dying there, you let the caller high up in the callchain
to notice the error and handle it (by dying).  The eventual caller
of do_pick_commit() is sequencer_pick_revisions(), which already
relays an reported error from its helper functions (including this
one), and both of its two callers know how to react to a negative
return correctly.  So this makes do_pick_commit() callable from new
callers that want it not to die, without changing the external
behaviour of anything existing.

Good.

>  sequencer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 0c8c955..6ac2187 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -464,7 +464,7 @@ static int do_pick_commit(struct commit *commit, struct 
> replay_opts *opts)
>* to work on.
>*/
>   if (write_cache_as_tree(head, 0, NULL))
> - die (_("Your index file is unmerged."));
> + return error (_("Your index file is unmerged."));

While you are touching the line, it is a good idea to correct an
obvious style error like this one.  "Do one thing and one thing well
in a commit" is a good discipline, but it is absurd to take it to
the extreme.

>   } else {
>   unborn = get_sha1("HEAD", head);
>   if (unborn)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Feature Request: Branch-Aware Submodules

2016-08-25 Thread Stefan Beller

On Thu, Aug 25, 2016 at 1:50 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> +cc Jacob and Lars who work with submodules as well.
>>
>> On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
>>  wrote:
>>>
>>> Right now updating a submodule in a topic branch and merging it into master
>>> will not change the submodule index in master leading to at least two commit
>>> for the same change (one in any active branch). This happened to me quite a 
>>> few
>>> times. To a newcomer this behavior is confusing and it leads to unnecessary
>>> commits.
>>
>> So you roughly do
>>
>> git checkout -b new-topic
>> # change the submodule to point at the latest upstream version:
>> git submodule update --remote 
>> git commit -a -m "update submodule"
>> git checkout master
>> git merge new-topic
>> # here seems to be your point of critic?
>> # now the submodule pointer would still point to the latest
>> upstream version?
>
> Isn't  subject to the usual 3-way merge when the
> last step (i.e. a merge of new-topic branch into master in the
> superproject) is made?  If 'master' hasn't changed 
> since 'new-topic' forked from it, because 'new-topic' updated the
> commit bound at , doesn't "git merge new-topic" just
> take that change as the normal "One side updated, the other did not
> touch; take the update" merge?

Yes. I was unclear here.
By "latest upstream version" I meant the version you pulled in in the new-topic
branch via the "submodule update --remote" and that is preserved as is.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Feature Request: Branch-Aware Submodules

2016-08-25 Thread Junio C Hamano

Stefan Beller  writes:

> +cc Jacob and Lars who work with submodules as well.
>
> On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
>  wrote:
>>
>> Right now updating a submodule in a topic branch and merging it into master
>> will not change the submodule index in master leading to at least two commit
>> for the same change (one in any active branch). This happened to me quite a 
>> few
>> times. To a newcomer this behavior is confusing and it leads to unnecessary
>> commits.
>
> So you roughly do
>
> git checkout -b new-topic
> # change the submodule to point at the latest upstream version:
> git submodule update --remote 
> git commit -a -m "update submodule"
> git checkout master
> git merge new-topic
> # here seems to be your point of critic?
> # now the submodule pointer would still point to the latest
> upstream version?

Isn't  subject to the usual 3-way merge when the
last step (i.e. a merge of new-topic branch into master in the
superproject) is made?  If 'master' hasn't changed 
since 'new-topic' forked from it, because 'new-topic' updated the
commit bound at , doesn't "git merge new-topic" just
take that change as the normal "One side updated, the other did not
touch; take the update" merge?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 2/2] gitattributes: Document the unified "auto" handling

2016-08-25 Thread Junio C Hamano

tbo...@web.de writes:

> +If you want to ensure that text files that any contributor introduces to
> +the repository have their line endings normalized, you could set the
> +`text` attribute to "auto" for _all_ files.
> +
> +
> +*text=auto
> +
> +

That is very understandable, especially that the text before this
added paragraph is about "core.autocrlf" configuration that is about
"your" changes.  It contrasts gitconfig vs gitattributes very well.

However, it is no longer clear what "you should instead" in the
existing paragraph attempts to contrast with.  "If you want all text
files, then use '* text=auto'" is what is said previously.  And your
updated example below says "If you do not want that, and instead you
want X, do '*.txt text'".  But the value of X is reads the same as
the above one: "you want all text files to be normalized".

>  If you want to interoperate with a source code management system that
>  enforces end-of-line normalization, or you simply want all text files
>  in your repository to be normalized, you should instead set the `text`
> -attribute to "auto" for _all_ files.
> +attribute to "text" for text files.
>  
>  
> -*text=auto
> +*.txttext
>  

In short, the above is incoherent and not understandable, without
updating the three lines of introductory text you left untouched at
the beginning of the paragraph, when read in the (updated) context.

> -This ensures that all files that Git considers to be text will have
> +This ensures that all files marked as text will have

This is a good update of the description to match the updated example.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 0/9] submodule inline diff format

2016-08-25 Thread Stefan Beller

On Thu, Aug 25, 2016 at 1:39 PM, Jacob Keller  wrote:
> On Tue, Aug 23, 2016 at 10:47 AM, Stefan Beller  wrote:
>> On Tue, Aug 23, 2016 at 10:25 AM, Junio C Hamano  wrote:
>>> I am not so sure about that.  If there is an existing place that is
>>> buggy, shouldn't we fix that, instead of spreading the same bug
>>> (assuming that it is a bug in the first place, which I do not have a
>>> strong opinion on, at least not yet)?
>>>
>>> Can there be .git/modules// repository that is pointed at an
>>> in-tree .git file when there is no "name" defined?
>>
>> If you're holding it wrong we can come into that state.
>> * Checkout the submodule,
>> * then remove .gitmodules as well as relevant config in .git/config.
>> Result: Then we have a only a gitlink recorded as well as connected
>> working tree to a gitdir inside a superprojects .git/modules/.
>>
>
> Yea, but I think you're right that we shouldn't support that state.
> What I'm worried about is the case where we can get this state doing
> something sane/acceptable but I don't think we can.
>
> So we should support the gitlink to a repository stored at 
> without stuff inside the .git/modules, and we should support submodule
> gitlinks with a proper .gitmodules setup. I don't think we should
> die() but we should error properly so I will introduce a _gently()
> variant of these functions, and die properly in the regular flow.

ok, thanks!

>> Stepping back a bit, I think we'd want to document this expectation more
>> in the man pages
>> The name unlike the path of a submodule must not be changed (as the
>> name is used internally to point at the submodules git dir)
>
> Agreed, this should be documented. Do you mind doing the documentation
> patch for this?

Well that documentation is sort of unrelated to this series, so no pressure.
I have a revamp of the whole submodule documentation on my wish/TODO list,
including this.

Thanks,
Stefan

>
> Thanks,
> Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 1/2] git ls-files: text=auto eol=lf is supported in Git 2.10

2016-08-25 Thread Junio C Hamano

tbo...@web.de writes:

> From: Torsten Bögershausen 
>
> The man page for `git ls-files --eol` mentions the combination
> of text attributes "text=auto eol=lf" or "text=auto eol=crlf" as not
> supported yet, but may be in the future.
> Now they are supported

Thanks. I'll finish the sentence with a full-stop here ;-).

>
> Signed-off-by: Torsten Bögershausen 
> ---
>  Documentation/git-ls-files.txt | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 078b556..0d933ac 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -159,8 +159,7 @@ not accessible in the working tree.
>  +
>   is the attribute that is used when checking out or committing,
>  it is either "", "-text", "text", "text=auto", "text eol=lf", "text 
> eol=crlf".
> -Note: Currently Git does not support "text=auto eol=lf" or "text=auto 
> eol=crlf",
> -that may change in the future.
> +Since Git 2.10 "text=auto eol=lf" and "text=auto eol=crlf" are supported.

It may be a good idea to have this for a while.  Having this
sentence would only help those who have been dissuaded by the
existing Note by telling them that the limitation is no longer
there, but that will quickly become unnecessary.

We'd eventually want to remove this sentence.  I wonder if it is a
better alternative to just remove the Note without adding new text,
though.

>  Both the  in the index ("i/")
>  and in the working tree ("w/") are shown for regular files,
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 0/9] submodule inline diff format

2016-08-25 Thread Jacob Keller

On Tue, Aug 23, 2016 at 10:47 AM, Stefan Beller  wrote:
> On Tue, Aug 23, 2016 at 10:25 AM, Junio C Hamano  wrote:
>> I am not so sure about that.  If there is an existing place that is
>> buggy, shouldn't we fix that, instead of spreading the same bug
>> (assuming that it is a bug in the first place, which I do not have a
>> strong opinion on, at least not yet)?
>>
>> Can there be .git/modules// repository that is pointed at an
>> in-tree .git file when there is no "name" defined?
>
> If you're holding it wrong we can come into that state.
> * Checkout the submodule,
> * then remove .gitmodules as well as relevant config in .git/config.
> Result: Then we have a only a gitlink recorded as well as connected
> working tree to a gitdir inside a superprojects .git/modules/.
>

Yea, but I think you're right that we shouldn't support that state.
What I'm worried about is the case where we can get this state doing
something sane/acceptable but I don't think we can.

So we should support the gitlink to a repository stored at 
without stuff inside the .git/modules, and we should support submodule
gitlinks with a proper .gitmodules setup. I don't think we should
die() but we should error properly so I will introduce a _gently()
variant of these functions, and die properly in the regular flow.

>> I thought we
>> errored out in module_name helper function in git-submodule.sh when
>> we need a name and only have path (I just checked in the maint-2.6
>> track); did we break it recently? submodule--helper.c::module_name()
>> seems to error out when submodule_from_path() fails to find one and
>> will segfault if it does not have name, so it is not likely.
>
> The name is literally the only thing that is not optional in a struct 
> submodule
> (see submodule-config.c:182 In lookup_or_create_by_name, these structs are
> added to the internal cache.
>
> Stepping back a bit, I think we'd want to document this expectation more
> in the man pages
> The name unlike the path of a submodule must not be changed (as the
> name is used internally to point at the submodules git dir)

Agreed, this should be documented. Do you mind doing the documentation
patch for this?

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 0/9] submodule inline diff format

2016-08-25 Thread Jacob Keller

On Tue, Aug 23, 2016 at 10:25 AM, Junio C Hamano  wrote:
> I am not so sure about that.  If there is an existing place that is
> buggy, shouldn't we fix that, instead of spreading the same bug
> (assuming that it is a bug in the first place, which I do not have a
> strong opinion on, at least not yet)?
>

I was saying that I'm not sure it is a bug so I sided with preserving
behavior instead.

If it is indeed a bug we should fix it, and I'm trying to determine
whether it actually is a bug here, and what the best solution is.

If there is a bug and we should die, should we also introduce a
"_gently()" variant of these functions and thus fail properly if they
don't work so that we can report an error in producing a diff instead
of die()ing?

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 0/3] Update eol documentation

2016-08-25 Thread Junio C Hamano

tbo...@web.de writes:

> From: Torsten Bögershausen 
>
> Sorry for posting this so late:
> While reviewing another patch I realized that the eol related
> documentation was not updated as it should be.
>
> Torsten Bögershausen (2):
>   git ls-files: text=auto eol=lf is supported in Git 2.10
>   gitattributes: Document the unified "auto" handling
>
>  Documentation/git-ls-files.txt  |  3 +--
>  Documentation/gitattributes.txt | 24 
>  2 files changed, 17 insertions(+), 10 deletions(-)

This [0/3] is meant to be a cover for [1/2] and [2/2]?

I am trying to see if we broke format-patch recently, or it is a
manual editing error.  The latter I do not care about; the former I
do.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C

2016-08-25 Thread Junio C Hamano

Pranit Bauva  writes:

> A lot of parts of bisect.c uses exit() and these signals are then
> trapped in the `bisect_start` function. Since the shell script ceases
> its existence it would be necessary to convert those exit() calls to
> return statements so that errors can be reported efficiently in C code.

Is efficiency really an issue?  I think the real reason is that it
would make it impossible for the callers to handle errors, if you do
not convert and let the error codepaths exit().

> @@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int 
> *rev_nr)
>   return rev;
>  }
>  
> -static void handle_bad_merge_base(void)
> +static int handle_bad_merge_base(void)
>  {
>   if (is_expected_rev(current_bad_oid)) {
>   char *bad_hex = oid_to_hex(current_bad_oid);
> @@ -750,17 +756,18 @@ static void handle_bad_merge_base(void)
>   "between %s and [%s].\n"),
>   bad_hex, term_bad, term_good, bad_hex, 
> good_hex);
>   }
> - exit(3);
> + return 3;
>   }
>  
>   fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n"
>   "git bisect cannot work properly in this case.\n"
>   "Maybe you mistook %s and %s revs?\n"),
>   term_good, term_bad, term_good, term_bad);
> - exit(1);
> + bisect_clean_state();
> + return 1;

What is the logic behind this function sometimes clean the state,
and some other times do not, when it makes an error-return?  We see
above that "return 3" codepath leaves the state behind.

Either you forgot a necessary clean_state in "return 3" codepath,
or you forgot to document why the distinction exists in the in-code
comment for the function.  I cannot tell which, but I am leaning
towards guessing that it is the former.

> -static void check_good_are_ancestors_of_bad(const char *prefix, int 
> no_checkout)
> +static int check_good_are_ancestors_of_bad(const char *prefix, int 
> no_checkout)
>  {
>   char *filename = git_pathdup("BISECT_ANCESTORS_OK");
>   struct stat st;
> - int fd;
> + int fd, res = 0;
>  
>   if (!current_bad_oid)
>   die(_("a %s revision is needed"), term_bad);

Can you let it die yere?

> @@ -873,8 +890,11 @@ static void check_good_are_ancestors_of_bad(const char 
> *prefix, int no_checkout)
> filename);
>   else
>   close(fd);
> +
> + return 0;
>   done:
>   free(filename);
> + return 0;
>  }

Who owns "filename"?  The first "return 0" leaves it unfreed, and
when "goto done" is done, it is freed.

The above two may indicate that "perhaps 'retval + goto finish'
pattern?" is a really relevant suggestion for the earlier steps in
this series.

>   if (!all) {
>   fprintf(stderr, _("No testable commit found.\n"
>   "Maybe you started with bad path parameters?\n"));
> - exit(4);
> + return 4;
>   }
>  
>   bisect_rev = revs.commits->item->object.oid.hash;
>  
>   if (!hashcmp(bisect_rev, current_bad_oid->hash)) {
> - exit_if_skipped_commits(tried, current_bad_oid);
> + res = exit_if_skipped_commits(tried, current_bad_oid);
> + if (res)
> + return res;
> +
>   printf("%s is the first %s commit\n", sha1_to_hex(bisect_rev),
>   term_bad);
>   show_diff_tree(prefix, revs.commits->item);
>   /* This means the bisection process succeeded. */
> - exit(10);
> + return 10;
>   }
>  
>   nr = all - reaches - 1;
> @@ -1005,7 +1033,11 @@ int bisect_next_all(const char *prefix, int 
> no_checkout)
> "Bisecting: %d revisions left to test after this %s\n",
> nr), nr, steps_msg);
>  
> - return bisect_checkout(bisect_rev, no_checkout);
> + res = bisect_checkout(bisect_rev, no_checkout);
> + if (res)
> + bisect_clean_state();
> +
> + return res;
>  }

There were tons of "exit_if" that was converted to "if (res) return
res" above, instead of jumping here to cause clean_state to be
called.  I cannot tell if this new call to clean_state() is wrong,
or all the earlier "return res" should come here.  I am guessing the
latter.

> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> index c64996a..ef7b3a1 100644
> --- a/builtin/bisect--helper.c
> +++ b/builtin/bisect--helper.c
> @@ -8,6 +8,7 @@
>  #include "run-command.h"
>  #include "prompt.h"
>  #include "quote.h"
> +#include "revision.h"
>  
>  static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
>  static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
> @@ -29,6 +30,8 @@ static const char * const git_bisect_helper_usage[] = {
>   N_("git bisect--helper --bisect-terms [--term-good | --term-old | 
> --term-bad | --term-new]"),
>

Re: [PATCH v6 10/13] convert: generate large test files only once

2016-08-25 Thread Lars Schneider

> On 25 Aug 2016, at 21:17, Stefan Beller  wrote:
> 
>> On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
>> From: Lars Schneider 
>> 
>> Generate more interesting large test files
> 
> How are the large test files more interesting?
> (interesting in the notion of covering more potential bugs?
> easier to debug? better to maintain, or just a pleasant read?)

The old large test file was 1MB of zeros and 1 byte with a one, repeated 2048 
times.

Since the filter uses 64k packets we would test a large number of equally 
looking packets.

That's why I thought the pseudo random content is more interesting.

>> with pseudo random characters
>> in between and reuse these test files in multiple tests. Run tests
>> formerly marked as EXPENSIVE every time but with a smaller data set.
> 
> Sounds good to me.

Thank you.
Lars--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v14 13/27] bisect--helper: `bisect_start` shell function partially in C

2016-08-25 Thread Junio C Hamano

Junio C Hamano  writes:

>> +for (i = 0; i < revs.nr; i++) {
>> +if (bad_seen)
>> +string_list_append(, terms->term_good.buf);
>> +else {
>> +bad_seen = 1;
>> +string_list_append(, terms->term_bad.buf);
>> +}
>> +}
>
> This is certainly different from the original.  We used to do one
> "bisect_write" per element in revs in the loop; we no longer do that
> and instead collect them in states list.  Each element in these two
> lists, i.e. revs.item[i] and states.item[i], corresponds to each
> other.
>
> That explains why the variable is renamed to states.  We haven't
> seen enough to say if this behaviour change is a good idea or not.

Ahh, I misread the original.  It accumulates the writes to be
executed in $eval and does so at the end.  So there is no change in
behaviour.

So please ignore that point in the previous message.  That leaves
only the following points:

 * Perhaps 'retval' with 'goto finish' pattern?

 * ref_exists()?  Perhaps use skip_prefix(head, "refs/heads/, )?

 * if (clean-state) { return -1 }?

 * Is comment on trap relevant here?

Sorry, and thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 07/13] pack-protocol: fix maximum pkt-line size

2016-08-25 Thread Lars Schneider


> On 25 Aug 2016, at 20:59, Stefan Beller  wrote:
> 
>> On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
>> From: Lars Schneider 
>> 
>> According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
>> pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
>> therefore the pkt-line data component must not exceed 65516 bytes.
>> 
>> Signed-off-by: Lars Schneider 
>> ---
> 
> I was recently encouraged off list to really try hard to go with
> shorter series's.
> (Duh! Everyone knows that shorter series go in faster ;)
> 
> And as this is a strict bug fix of Documentation that makes sense
> outside this series,
> I'd like to encourage you to send this as a separate patch in case you
> need further
> rerolls.

Agreed!

Thanks,
Lars
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 10/13] convert: generate large test files only once

2016-08-25 Thread Stefan Beller

On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
> From: Lars Schneider 
>
> Generate more interesting large test files

How are the large test files more interesting?
(interesting in the notion of covering more potential bugs?
easier to debug? better to maintain, or just a pleasant read?)

> with pseudo random characters
> in between and reuse these test files in multiple tests. Run tests
> formerly marked as EXPENSIVE every time but with a smaller data set.

Sounds good to me.

>
> Signed-off-by: Lars Schneider 
> ---
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v14 13/27] bisect--helper: `bisect_start` shell function partially in C

2016-08-25 Thread Junio C Hamano

Pranit Bauva  writes:

> +static int bisect_start(struct bisect_terms *terms, int no_checkout,
> + const char **argv, int argc)
> +{
> + int i, has_double_dash = 0, must_write_terms = 0, bad_seen = 0;
> + int flags, pathspec_pos;
> + struct string_list revs = STRING_LIST_INIT_DUP;
> + struct string_list states = STRING_LIST_INIT_DUP;

The original has a single state, not states.  Let's see if there is
a reason behind the name change

> + unsigned char sha1[20];
> + struct object_id oid;

More on these below...

> + ...
> + for (i = 0; i < argc; i++) {
> + const char *commit_id = xstrfmt("%s^{commit}", argv[i]);
> + if (!strcmp(argv[i], "--")) {
> + has_double_dash = 1;
> + break;
> + } else if (!strcmp(argv[i], "--no-checkout"))
> + no_checkout = 1;
> + else if (!strcmp(argv[i], "--term-good") ||
> + ...
> + } else if (starts_with(argv[i], "--") &&
> +  !one_of(argv[i], "--term-good", "--term-bad", NULL)) {
> + string_list_clear(, 0);
> + string_list_clear(, 0);
> + die(_("unrecognised option: '%s'"), argv[i]);
> + } else if (get_oid(commit_id, ) && has_double_dash) {
> + string_list_clear(, 0);
> + string_list_clear(, 0);
> + die(_("'%s' does not appear to be a valid revision"), 
> argv[i]);
> + } else {
> + string_list_append(, oid_to_hex());
> + }
> + }

What I do not understand is clearing the string list "states" inside
this loop.  It may have been INIT_DUPed, but nothing in this loop
adds anything to it.  Because "revs" does get extended in the loop,
it is understandable if you wanted to clear it before dying, but "if
you are dying anyway why bother clearing?" is also a valid stance to
take.

The same "perhaps want to use the 'retval' with goto 'finish:' pattern?"
comment applies here, too.

> + pathspec_pos = i;
> +
> + /*
> +  * The user ran "git bisect start  ", hence did not
> +  * explicitly specify the terms, but we are already starting to
> +  * set references named with the default terms, and won't be able
> +  * to change afterwards.
> +  */
> + must_write_terms |= !!revs.nr;
> + for (i = 0; i < revs.nr; i++) {
> + if (bad_seen)
> + string_list_append(, terms->term_good.buf);
> + else {
> + bad_seen = 1;
> + string_list_append(, terms->term_bad.buf);
> + }
> + }

This is certainly different from the original.  We used to do one
"bisect_write" per element in revs in the loop; we no longer do that
and instead collect them in states list.  Each element in these two
lists, i.e. revs.item[i] and states.item[i], corresponds to each
other.

That explains why the variable is renamed to states.  We haven't
seen enough to say if this behaviour change is a good idea or not.

> + /*
> +  * Verify HEAD
> +  */
> + head = resolve_ref_unsafe("HEAD", 0, sha1, );
> + if (!head) {
> + if (get_sha1("HEAD", sha1)) {
> + string_list_clear(, 0);
> + string_list_clear(, 0);
> + die(_("Bad HEAD - I need a HEAD"));
> + }
> + }
> + if (!is_empty_or_missing_file(git_path_bisect_start())) {
> + /* Reset to the rev from where we started */
> + strbuf_read_file(_head, git_path_bisect_start(), 0);
> + strbuf_trim(_head);
> + if (!no_checkout) {
> + struct argv_array argv = ARGV_ARRAY_INIT;
> + argv_array_pushl(, "checkout", start_head.buf,
> +  "--", NULL);
> + if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) {
> + error(_("checking out '%s' failed. Try 'git "
> + "bisect start '."),
> +   start_head.buf);
> + strbuf_release(_head);
> + string_list_clear(, 0);
> + string_list_clear(, 0);
> + return -1;

The original died here, but you expect the caller to respond to a
negative return.  I haven't read enough to judge if that is a good
idea, but doesn't it make sense to do the same throughout the
function consistently?  I saw a few die()'s already in the command
line parsing loop--shouldn't they also return -1?

The original has called bisect_write already when we attempt this
checkout and the user will see the states in the filesystem.  This
rewrite does not.  What effect does this behaviour change have to
the end-user experience?  The error

Re: [PATCH v6 07/13] pack-protocol: fix maximum pkt-line size

2016-08-25 Thread Stefan Beller

On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
> From: Lars Schneider 
>
> According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
> pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
> therefore the pkt-line data component must not exceed 65516 bytes.
>
> Signed-off-by: Lars Schneider 
> ---

I was recently encouraged off list to really try hard to go with
shorter series's.
(Duh! Everyone knows that shorter series go in faster ;)

And as this is a strict bug fix of Documentation that makes sense
outside this series,
I'd like to encourage you to send this as a separate patch in case you
need further
rerolls.

Thanks,
Stefan

>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git push origin BRANCHNAME question

2016-08-25 Thread Jeff King

On Thu, Aug 25, 2016 at 10:50:23AM -0700, Junio C Hamano wrote:

> Ed Greenberg  writes:
> 
> > I think I understand this from the git-push man page, but I want to
> > make sure:
> >
> > I have two branches, master and develop.
> >
> > If I am (accidentally) sitting on master, and issue 'git push origin
> > develop', does this properly push develop to remote develop, or does
> > it push master to remote develop (which seems to be bad, in the most
> > common use case.)  ?
> 
> You can find it out yourself quite easily, I would think.
> 
> $ git init src
> $ git init dst
> $ cd src
> $ git commit --allow-empty -m initial
> $ git checkout -b develop
> $ git commit --allow-empty -m second
> 
> $ git checkout master
> $ git push ../dst develop
> Counting objects: 3, done.
> Delta compression using up to 6 threads.
> Compressing objects: 100% (2/2), done.
> Writing objects: 100% (3/3), 226 bytes | 0 bytes/s, done.
> Total 3 (delta 1), reused 0 (delta 0)
> To ../dst
>  * [new branch]  develop -> develop

Yes, though I think the "why" is interesting here, too.

And the answer is that "develop" is a shortened refspec, whose full form
is more like "develop:develop". A name with no colon is always the local
source, and the implied destination is usually the same name on the
remote (though it can be changed via config). The third paragraph of
"..." under OPTIONS in "git help push" covers this, though I do
think it is a little hard to follow.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 03/13] pkt-line: add packet_write_fmt_gently()

2016-08-25 Thread Lars Schneider

On 25 Aug 2016, at 20:12, Stefan Beller  wrote:

>> +int packet_write_fmt_gently(int fd, const char *fmt, ...)
>> +{
>> +   static struct strbuf buf = STRBUF_INIT;
>> +   va_list args;
>> +
>> +   strbuf_reset();
>> +   va_start(args, fmt);
>> +   format_packet(, fmt, args);
> 
> format_packet also takes care of tracing the contents,
> which was a bit unexpected to me.

To me, too :)
http://public-inbox.org/git/20160810143321.q7mjirgr5ynml...@sigill.intra.peff.net/

The series is already pretty large and therefore I decided to leave this as-is.

> Do we also want to trace failure?

You mean in all new *_gently() functions? Good idea!

Thanks,
Lars--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 06/13] pkt-line: add functions to read/write flush terminated packet streams

2016-08-25 Thread Stefan Beller

On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
> From: Lars Schneider 
>
> packet_write_stream_with_flush_from_fd() and
> packet_write_stream_with_flush_from_buf() write a stream of packets. All
> content packets use the maximal packet size except for the last one.
> After the last content packet a `flush` control packet is written.
>
> packet_read_till_flush() reads arbitrary sized packets until it detects
> a `flush` packet.

So the API provided by these read/write functions is intended
to move a huge chunks of data. And as it puts the data on the wire one
packet after the other without the possibility to intervene and e.g. send
a side channel progress bar update, I would question the design of this.
If I understand correctly this will be specifically  used for large
files locally,
so e.g. a file of 5 GB (such as a virtual machine tracked in Git), would
require about 80k packets.

Instead of having many packets of max length and then a remainder,
I would suggest to invent larger packets for this use case. Then we can
just send one packet instead.

Currently a packet consists of 4 bytes indicating the length in hex
and then the payload of length-4 bytes. As the length is in hex
the characters in the first 4 bytes are [0-9a-f], we can easily add another
meaning for the length, e.g.:

  A packet starts with the overall length and then the payload.
  If the first character of the length is 'v' the length is encoded as a
  variable length quantity[1]. The high bit of the char indicates if
  the next char is still part of the length field. The length must not exceed
  LLONG_MAX (which results in a payload of 9223 Petabyte, so
  enough for the foreseeable future).

  [1] A variable-length quantity (VLQ) is a universal code that uses
  an arbitrary number of bytes to represent an arbitrarily large integer.
  https://en.wikipedia.org/wiki/Variable-length_quantity

The neat thing about the packet system is we can dedicate packets
to different channels (such as the side channels), but with the provided
API here this makes it impossible to later add in these side channel
as it is a pure streaming API now. So let's remove the complication
of having to send multiple packets and just go with one large packet
instead.

--
I understand that my proposal would require writing code again,
but it has also some long term advantages in the networking stack
of Git: There are some worries that a capabilities line in fetch/push
might overflow in the far future, when there are lots of capabilities.

Also a few days ago there was a proposal to add all symbolic refs
to a capabilities line, which Peff shot down as "the packet may be
too small".

There is an incredible hack that allows transporting refs > 64kB IIRC.

All these things could go away with the variable length encoded
packets. But to make them go away in the future we would need
to start with these variable length packets today. ;)

Just food for thought.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: Smart HTTP push permissions failure

2016-08-25 Thread David McGough

Thank you for your reply Jeff.  I have moved on to installing GitLab.  It has 
been a success so far.

Thanks,
Dave

-Original Message-
From: Jeff King [mailto:p...@peff.net] 
Sent: Wednesday, August 24, 2016 1:00 PM
To: David McGough 
Cc: git@vger.kernel.org
Subject: Re: Smart HTTP push permissions failure

On Tue, Aug 23, 2016 at 03:45:33PM +, David McGough wrote:

> When I try to push to the server I get this message:
> remote: error: insufficient permission for adding an object to 
> repository database ./objects
> remote: fatal: failed to write object
> [...]
> So I am pretty confused about what the issue.  Which OS user is git 
> using to write the files?  I hope somebody can help me understand why 
> the project cannot be pushed to the git server.

For a smart-http push, it will be whatever user the web server execs the CGI 
as. So I'd think "apache" would be the default, but it's possible that it runs 
CGIs as a different user, depending on your config.

One possibility may be to add a simple shell script CGI that does something 
like:

  #!/bin/sh
  echo "Content-type: text/plain"
  echo
  id

just to see what's happening.

Based on the data you showed, here are some wild possibilities I can think of:

  - the CGI runs as "apache", but your files are owned by "git".
"apache" is in the "staff" group, and the directories all have write
permission for that group. But are we sure that apache does not shed
any group permissions when running a CGI? The "id" script above
should hopefully show that.

  - You mentioned CentOS. It has been a while since I dealt with RHEL
and its derivatives, but I think selinux is turned on by default
there. Is it possible that the webserver runs in an selinux profile
that does not allow writing to the repository directory?

I don't recall the specifics of debugging selinux problems, but
there may be logs there.

Sorry those are just stabs in the dark, but I don't see anything else obviously 
wrong with what you've posted.

-Peff

Re: [PATCH v14 12/27] bisect--helper: `get_terms` & `bisect_terms` shell function in C

2016-08-25 Thread Junio C Hamano

Pranit Bauva  writes:

> +static int bisect_terms(struct bisect_terms *terms, const char **argv, int 
> argc)
> +{
> + int i;
> +
> + if (get_terms(terms)) {
> + fprintf(stderr, _("no terms defined\n"));
> + return -1;
> + }
> + if (argc == 0) {
> + printf(_("Your current terms are %s for the old state\nand "
> +"%s for the new state.\n"), terms->term_good.buf,
> +terms->term_bad.buf);
> + return 0;
> + }
> +
> + for (i = 0; i < argc; i++) {
> + if (!strcmp(argv[i], "--term-good"))
> + printf("%s\n", terms->term_good.buf);
> + else if (!strcmp(argv[i], "--term-bad"))
> + printf("%s\n", terms->term_bad.buf);
> + else
> + printf(_("invalid argument %s for 'git bisect "
> +   "terms'.\nSupported options are: "
> +   "--term-good|--term-old and "
> +   "--term-bad|--term-new."), argv[i]);
> + }

The original took only one and gave one answer (and errored out when
the user asked for more), but this one loops.  I can see either way
is OK and do not think of a good reason to favor one over the other;
unless there is a strong reason why you need this extended behaviour
that allows users to ask multiple questions, I'd say we should keep
the original behaviour.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 03/13] pkt-line: add packet_write_fmt_gently()

2016-08-25 Thread Stefan Beller

> +int packet_write_fmt_gently(int fd, const char *fmt, ...)
> +{
> +   static struct strbuf buf = STRBUF_INIT;
> +   va_list args;
> +
> +   strbuf_reset();
> +   va_start(args, fmt);
> +   format_packet(, fmt, args);

format_packet also takes care of tracing the contents,
which was a bit unexpected to me.
Do we also want to trace failure?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git push origin BRANCHNAME question

2016-08-25 Thread Junio C Hamano

Ed Greenberg  writes:

> I think I understand this from the git-push man page, but I want to
> make sure:
>
> I have two branches, master and develop.
>
> If I am (accidentally) sitting on master, and issue 'git push origin
> develop', does this properly push develop to remote develop, or does
> it push master to remote develop (which seems to be bad, in the most
> common use case.)  ?

You can find it out yourself quite easily, I would think.

$ git init src
$ git init dst
$ cd src
$ git commit --allow-empty -m initial
$ git checkout -b develop
$ git commit --allow-empty -m second

$ git checkout master
$ git push ../dst develop
Counting objects: 3, done.
Delta compression using up to 6 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 226 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To ../dst
 * [new branch]  develop -> develop

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Feature Request: Branch-Aware Submodules

2016-08-25 Thread Junio C Hamano

"Hedges  Alexander"  writes:

> Right now updating a submodule in a topic branch and merging it into master
> will not change the submodule index in master leading to at least two commit
> for the same change (one in any active branch).

I stopped reading here because I am not getting this.

I guess I am confused because I do not understand what you mean by
"the submodule index in master".  The concept of "index" does not
belong to each branch (or even a commit), so by "index" you are
trying to point at something else, but I cannot guess what it is.

You have a top-level superproject that has another project as its
submodule.  The superproject has topic and master branches (or it
may only have master).  The project that is used as its submodule
also has topic and master branches (it may have more).  You do your
development in the submodule, e.g.

cd submoduledir
git checkout topic
hack hack hack
git commit
git checkout master
git merge topic

and merge the topic branch into its master when the topic is
polished enough.

And then?  The 'master' in the submodule is good enough, so you'd
go back to the top-level superproject and bind that merged result 
in its place?  e.g.

cd ..
git add submoduledir
git commit -m "Updated submoduledir with the topic"

That is only one commit each in the superproject and the submodule
project.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Feature Request: Branch-Aware Submodules

2016-08-25 Thread Stefan Beller

+cc Jacob and Lars who work with submodules as well.

On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
 wrote:
>
> Right now updating a submodule in a topic branch and merging it into master
> will not change the submodule index in master leading to at least two commit
> for the same change (one in any active branch). This happened to me quite a 
> few
> times. To a newcomer this behavior is confusing and it leads to unnecessary
> commits.

So you roughly do

git checkout -b new-topic
# change the submodule to point at the latest upstream version:
git submodule update --remote 
git commit -a -m "update submodule"
git checkout master
git merge new-topic
# here seems to be your point of critic?
# now the submodule pointer would still point to the latest
upstream version?

>
>
> The proposed change would be to have a submodule either ignored or tracked by
> the .gitmodules file.
> If it is ignored, as for instance after a clone of the superproject, git 
> simply
> ignores all files in the submodule directory. The content of the gitmodules
> file is then also not updated by git.
> If it is not ignored, the .gitmodules is updated every time a commit happens 
> in
> the submodule.

So

git -C  commit

should trigger a commit in the superproject as well, that changes the gitmodules
file? What do you record in the git modules file that needs updating?
As the version is tracked via the gitlink entry, I do not see the
information that
needs tracking here?

> On branch switches the revision shown in the gitmodules from
> that branch is checked out.

So you are proposing to put the revision into the gitmodules file?
That would be redundant with the actual gitlink entry in your tree.
(as shown via `git submodule status`)
What would happen if the recorded revision in the gitmodules file and the
gitlink are out of sync?

Oh, are you just proposing to actually make `git checkout` aware of the
submodules? See[1]. I would welcome such a change and be happy th

[1] https://github.com/jlehmann/git-submod-enhancements
which has some attempts for checkout including the submodules.
I also tried writing some patches which integrate checking out submodules
via checkout as well. A quicker `solution` would be a config option that
just runs `git submodule update` after each checkout/pull etc.

> This change would have submodules conceptually behave more like files to the
> superproject.
>
>
> Like current behavior, git status would display whether the submodule has
> uncommitted changes or is at a new commit.

See config options diff.submodule and status.submoduleSummary.

>
> I couldn't find any discussions on the initial implementation of git-submodule
> or any previous proposals related to this in nature due to gmane being down
> right now and the mailing list archives on the other sites are not great for
> searching. So please excuse me if I'm bringing up already discussed stuff.

https://public-inbox.org/git for reading on the web, or

git clone https://public-inbox.org/git

for reading offline.

>
> Until now I only worked on projects with few submodules. I expect the
> proposed changes to have a larger effect on projects containing lots of
> submodules. So it would be nice if maybe somebody with experience working on
> projects with lots of submodules could weigh into the discussion.
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] checkout: swap the order of ambiguity check for :/ syntax

2016-08-25 Thread Junio C Hamano

Duy Nguyen  writes:

> On Wed, Aug 24, 2016 at 11:35 PM, Junio C Hamano  wrote:
>> Nguyễn Thái Ngọc Duy   writes:
>>
>>> It's not wonderful, but it's in line with how git-checkout stops caring
>>> about ambiguity after the first argument can be resolved as a ref
>>> (there's even a test for it, t2010.6).
>>
>> But that is justifiable because checkout can only ever take one
>> revision.  What follows, if there are any, must be paths, and more
>> importantly, it would be perfectly reasonable if some of them were
>> missing in the working tree ("ow, I accidentally removed that file,
>> I need to resurrect it from the index").  Does the same justification
>> apply to this change?
>
> I think there is a misunderstanding. My "after" is in "after the first
> argument can be resolved, check if it exists in worktree too, if so
> it's ambiguous and bail". This is usually how we detect ambiguation.
> But git-checkout does not do the "check if it exists..." clause.

Hmph.  The "case 4" in the function you touched says

 * case 4: git checkout  
 *
 *   The first argument must not be ambiguous.
 *   - If it's *only* a reference, treat it like case (1).
 *   - If it's only a path, treat it like case (2).
 *   - else: fail.

Did we break it recently?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128

2016-08-25 Thread Alex Nauda

On Thu, Aug 25, 2016 at 2:28 AM, Michael Haggerty  wrote:
> On 08/24/2016 11:39 PM, Jeff King wrote:
>> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>>
>>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>>> is exposed to the OS as an NFS mount. We're using EFS to host the
>>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>>> to git fetch, we get this error:
>>> $ git -c core.askpass=true fetch --tags --progress
>>> g...@github.com:mediasilo/dodo.git
>>> +refs/pull/*:refs/remotes/origin/pr/*
>>> fatal: Reference directory conflict: refs/heads/
>>> $ echo $? 128
>>>
>>> Has anyone seen anything like this before? Any tips on how to troubleshoot 
>>> it?
>>
>> No, I haven't seen it before. That's an internal assertion in the refs
>> code that shouldn't ever happen. It looks like it happens when the loose
>> refs end up with duplicate directory entries. While a bug in git is an
>> obvious culprit, I wonder if it's possible that your filesystem might
>> expose the same name twice in one set of readdir() results.
>>
>> +cc Michael, who added this assertion long ago (and since this is the
>> first report in all these years, it does make me suspect that the
>> filesystem is a critical part of reproducing).
>
> Thanks for the CC.
>
> I've never heard of this problem before.
>
> What Git version are you using?
Git client 2.7.4 against GitHub (Git 2.6.5)

>
> I tried to provoke the problem by hand-corrupting the packed-refs file,
> but wasn't successful.
>
> So Peff's suggestion that the problem originates in your filesystem
> seems to be to be the most likely cause. A quick Google search found,
> for example,
>
> https://bugzilla.redhat.com/show_bug.cgi?id=739222
>
> http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs
>
> though these reports seem connected with having lots of files in the
> directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
> more careful search, and it is easily possible that there are other bugs
> in NFS (or EFS) that could be affecting you.
>
> If this were repeatable, you could run Git under strace to test Peff's
> hypothesis. But I suppose it only happens rarely, right?
Actually it seems to be reproducible. Here's the last portion of an strace:

[...]
stat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644,
st_size=41, ...}) = 0
lstat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644,
st_size=41, ...}) = 0
open(".git/refs/remotes/origin/pr/7/head", O_RDONLY) = 4
read(4, "5d82811a248900efd8e201c6d9232de5"..., 256) = 41
read(4, "", 215)= 0
close(4)= 0
getdents(3, /* 0 entries */, 32768) = 0
close(3)= 0
open(".git/refs/remotes/origin/pr/16/",
O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents(3, /* 3 entries */, 32768) = 72
stat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644,
st_size=41, ...}) = 0
lstat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644,
st_size=41, ...}) = 0
open(".git/refs/remotes/origin/pr/16/head", O_RDONLY) = 4
read(4, "2886c4f3ba8c3b5c2306029f6e39498d"..., 256) = 41
read(4, "", 215)= 0
close(4)= 0
getdents(3, /* 0 entries */, 32768) = 0
close(3)= 0
open(".git/refs/tags/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents(3, /* 2 entries */, 32768) = 48
getdents(3, /* 0 entries */, 32768) = 0
close(3)= 0
open(".git/refs/bisect/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) =
-1 ENOENT (No such file or directory)
open(".git/packed-refs", O_RDONLY)  = -1 ENOENT (No such file or directory)
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
write(2, "fatal: Reference directory confl"..., 58fatal: Reference
directory conflict: refs/remotes/origin/
) = 58
exit_group(128) = ?
+++ exited with 128 +++

>
> Is it possible that multiple clients have the same NFS filesystem
> mounted while Git is running? That would seem like an especially bad
> idea and I could imagine it leading to problems like this.
>
> It's surprising that you are seeing this problem in directory `refs`,
> because (1) that directory is unlikely to have very many entries, and
> (2) as far as I remember, Git will never delete the directories
> `refs/heads` and `refs/tags`.
Seems like sometimes it happens on other directories:
refs/remotes/origin/ or refs/remotes/origin/pr/1
Then as I was stracing it again, suddenly it succeeded. Some kind of
race condition?

>
> Michael
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at

[PATCH v1 0/3] Update eol documentation

2016-08-25 Thread tboegi

From: Torsten Bögershausen 

Sorry for posting this so late:
While reviewing another patch I realized that the eol related
documentation was not updated as it should be.

Torsten Bögershausen (2):
  git ls-files: text=auto eol=lf is supported in Git 2.10
  gitattributes: Document the unified "auto" handling

 Documentation/git-ls-files.txt  |  3 +--
 Documentation/gitattributes.txt | 24 
 2 files changed, 17 insertions(+), 10 deletions(-)

-- 
2.9.3.599.g2376d31.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v1 1/2] git ls-files: text=auto eol=lf is supported in Git 2.10

2016-08-25 Thread tboegi

From: Torsten Bögershausen 

The man page for `git ls-files --eol` mentions the combination
of text attributes "text=auto eol=lf" or "text=auto eol=crlf" as not
supported yet, but may be in the future.
Now they are supported

Signed-off-by: Torsten Bögershausen 
---
 Documentation/git-ls-files.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 078b556..0d933ac 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -159,8 +159,7 @@ not accessible in the working tree.
 +
  is the attribute that is used when checking out or committing,
 it is either "", "-text", "text", "text=auto", "text eol=lf", "text eol=crlf".
-Note: Currently Git does not support "text=auto eol=lf" or "text=auto 
eol=crlf",
-that may change in the future.
+Since Git 2.10 "text=auto eol=lf" and "text=auto eol=crlf" are supported.
 +
 Both the  in the index ("i/")
 and in the working tree ("w/") are shown for regular files,
-- 
2.9.3.599.g2376d31.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v1 2/2] gitattributes: Document the unified "auto" handling

2016-08-25 Thread tboegi

From: Torsten Bögershausen 

Update the documentation about text=auto:
text=auto now follows the core.autocrlf handling when files are not
normalized in the repository.

For a cross platform project recommend the usage of attributes for
line-ending conversions.

Signed-off-by: Torsten Bögershausen 
---
 Documentation/gitattributes.txt | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 807577a..4012661 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -213,27 +213,35 @@ that text files that you introduce to the repository have 
their line
 endings normalized to LF when they are added, and that files that are
 already normalized in the repository stay normalized.
 
+If you want to ensure that text files that any contributor introduces to
+the repository have their line endings normalized, you could set the
+`text` attribute to "auto" for _all_ files.
+
+
+*  text=auto
+
+
 If you want to interoperate with a source code management system that
 enforces end-of-line normalization, or you simply want all text files
 in your repository to be normalized, you should instead set the `text`
-attribute to "auto" for _all_ files.
+attribute to "text" for text files.
 
 
-*  text=auto
+*.txt  text
 
 
-This ensures that all files that Git considers to be text will have
+This ensures that all files marked as text will have
 normalized (LF) line endings in the repository.  The `core.eol`
 configuration variable controls which line endings Git will use for
 normalized files in your working directory; the default is to use the
 native line ending for your platform, or CRLF if `core.autocrlf` is
 set.
 
-NOTE: When `text=auto` normalization is enabled in an existing
-repository, any text files containing CRLFs should be normalized.  If
-they are not they will be normalized the next time someone tries to
-change them, causing unfortunate misattribution.  From a clean working
-directory:
+NOTE: When you have a cross-platform project using push and pull
+to a central repository the text files containing CRLFs should be
+normalized. All text files should have a text attribute, either
+`text` or `text=auto`.
+From a clean working directory:
 
 -
 $ echo "* text=auto" >>.gitattributes
-- 
2.9.3.599.g2376d31.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 6/6] wt-status: teach has_{unstaged,uncommitted}_changes() about submodules

2016-08-25 Thread Johannes Schindelin

Sometimes we are *actually* interested in those changes... For
example when an interactive rebase wants to continue with a staged
submodule update.

Signed-off-by: Johannes Schindelin 
---
 builtin/pull.c |  2 +-
 wt-status.c| 16 +---
 wt-status.h|  7 ---
 3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index c35c6e8..843ff19 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -810,7 +810,7 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
 
if (!autostash)
require_clean_work_tree("pull with rebase",
-   "Please commit or stash them.", 0);
+   "Please commit or stash them.", 1, 0);
 
if (get_rebase_fork_point(rebase_fork_point, repo, *refspecs))
hashclr(rebase_fork_point);
diff --git a/wt-status.c b/wt-status.c
index 1c3fabf..8ba0b4d 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1762,13 +1762,14 @@ void wt_porcelain_print(struct wt_status *s)
 /**
  * Returns 1 if there are unstaged changes, 0 otherwise.
  */
-int has_unstaged_changes(void)
+int has_unstaged_changes(int ignore_submodules)
 {
struct rev_info rev_info;
int result;
 
init_revisions(_info, NULL);
-   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
+   if (ignore_submodules)
+   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
DIFF_OPT_SET(_info.diffopt, QUICK);
diff_setup_done(_info.diffopt);
result = run_diff_files(_info, 0);
@@ -1778,7 +1779,7 @@ int has_unstaged_changes(void)
 /**
  * Returns 1 if there are uncommitted changes, 0 otherwise.
  */
-int has_uncommitted_changes(void)
+int has_uncommitted_changes(int ignore_submodules)
 {
struct rev_info rev_info;
int result;
@@ -1787,7 +1788,8 @@ int has_uncommitted_changes(void)
return 0;
 
init_revisions(_info, NULL);
-   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
+   if (ignore_submodules)
+   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
DIFF_OPT_SET(_info.diffopt, QUICK);
add_head_to_pending(_info);
diff_setup_done(_info.diffopt);
@@ -1799,7 +1801,7 @@ int has_uncommitted_changes(void)
  * If the work tree has unstaged or uncommitted changes, dies with the
  * appropriate message.
  */
-int require_clean_work_tree(const char *action, const char *hint, int gently)
+int require_clean_work_tree(const char *action, const char *hint, int 
ignore_submodules, int gently)
 {
struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
int err = 0;
@@ -1816,12 +1818,12 @@ int require_clean_work_tree(const char *action, const 
char *hint, int gently)
update_index_if_able(_index, lock_file);
rollback_lock_file(lock_file);
 
-   if (has_unstaged_changes()) {
+   if (has_unstaged_changes(ignore_submodules)) {
error(_("Cannot %s: You have unstaged changes."), action);
err = 1;
}
 
-   if (has_uncommitted_changes()) {
+   if (has_uncommitted_changes(ignore_submodules)) {
if (err)
error(_("Additionally, your index contains uncommitted 
changes."));
else
diff --git a/wt-status.h b/wt-status.h
index 75833c1..ba56c01 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -115,8 +115,9 @@ void status_printf_ln(struct wt_status *s, const char 
*color, const char *fmt, .
 __attribute__((format (printf, 3, 4)))
 void status_printf(struct wt_status *s, const char *color, const char *fmt, 
...);
 
-int has_unstaged_changes(void);
-int has_uncommitted_changes(void);
-int require_clean_work_tree(const char *action, const char *hint, int gently);
+int has_unstaged_changes(int ignore_submodules);
+int has_uncommitted_changes(int ignore_submodules);
+int require_clean_work_tree(const char *action, const char *hint,
+   int ignore_submodules, int gently);
 
 #endif /* STATUS_H */
-- 
2.10.0.rc1.99.gcd66998
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 5/6] Export also the has_un{staged,committed}_changed() functions

2016-08-25 Thread Johannes Schindelin

They will be used in the upcoming rebase helper.

Signed-off-by: Johannes Schindelin 
---
 wt-status.c | 4 ++--
 wt-status.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 33dd76f..1c3fabf 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1762,7 +1762,7 @@ void wt_porcelain_print(struct wt_status *s)
 /**
  * Returns 1 if there are unstaged changes, 0 otherwise.
  */
-static int has_unstaged_changes(void)
+int has_unstaged_changes(void)
 {
struct rev_info rev_info;
int result;
@@ -1778,7 +1778,7 @@ static int has_unstaged_changes(void)
 /**
  * Returns 1 if there are uncommitted changes, 0 otherwise.
  */
-static int has_uncommitted_changes(void)
+int has_uncommitted_changes(void)
 {
struct rev_info rev_info;
int result;
diff --git a/wt-status.h b/wt-status.h
index cc4e5a3..75833c1 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -115,6 +115,8 @@ void status_printf_ln(struct wt_status *s, const char 
*color, const char *fmt, .
 __attribute__((format (printf, 3, 4)))
 void status_printf(struct wt_status *s, const char *color, const char *fmt, 
...);
 
+int has_unstaged_changes(void);
+int has_uncommitted_changes(void);
 int require_clean_work_tree(const char *action, const char *hint, int gently);
 
 #endif /* STATUS_H */
-- 
2.10.0.rc1.99.gcd66998


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/6] require_clean_work_tree: ensure that the index was read

2016-08-25 Thread Johannes Schindelin

The function would otherwise pretend to work fine, but totally ignore
the working directory.

Signed-off-by: Johannes Schindelin 
---
 wt-status.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/wt-status.c b/wt-status.c
index 792dda9..33dd76f 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1804,6 +1804,13 @@ int require_clean_work_tree(const char *action, const 
char *hint, int gently)
struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
int err = 0;
 
+   if (read_cache() < 0) {
+   error(_("Could not read index"));
+   if (gently)
+   return -1;
+   exit(1);
+   }
+
hold_locked_index(lock_file, 0);
refresh_cache(REFRESH_QUIET);
update_index_if_able(_index, lock_file);
-- 
2.10.0.rc1.99.gcd66998


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] Proposed questions for "Git User's Survey 2016"

2016-08-25 Thread Jakub Narębski

W dniu 22.08.2016 o 01:59, Andrew Ardill pisze:
> On 21 August 2016 at 04:56, Jakub Narębski  wrote:
>> 25. What [channel(s)] do you use to request/get help about Git [(if any)]
> 
> It may also be useful to ask how people hear news about git, such as
> when a new release comes out. Not sure if worth a separate question,
> as there is a lot of crossover in the resources available for this and
> for requesting help, but knowing this information would help us
> understand what kinds of users are responding and which communication
> channels are effective for git news.

How would you propose such question would look like, and what proposed
answers would be (if it were not a free-text / essay question)?

Note that there might be a problem of severe bias: people who heard
about Git User's Survey 2016 are probably ones that watch news about Git.
Still, it would be useful to know if people read RelNotes...
 
> Related, it might be worth asking how often people upgrade their git
> clients and servers, particularly in corporate/managed environments.
> This question would ask two things, how long after a new release comes
> out do you install it, and do you install every update that comes out
> or do you skip versions. I suspect many would just use whatever is
> released in their distro and update at the same time as they update
> other packages, but it would be interesting to know if people, for
> example, only upgrade their managed environments every year/6 months
> or something to avoid introducing changes to their users.

That is, if people have a pattern to their upgrade of Git, and can
tell how often they upgrade.

XX. How often you upgrade Git?
(multiple choice or single choice?)

 * as soon as new version is released
 * when there is new binary package / distribution package
 * when updating distribution / system
 * around every month, or more often
 * around every 6 months or more often
 * I use what is installed on system

Something like that?

Regards,
-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/6] Pull out require_clean_work_tree() functionality from builtin/pull.c

2016-08-25 Thread Johannes Schindelin

This is the 5th last patch series of my work to accelerate interactive
rebases in particular on Windows.

Basically, all it does is to make reusable some functions that were
ported over from git-pull.sh but made private to builtin/pull.c.

An earlier attempt included only the first patch, and somehow it failed
to convince our good Git maintainer without mentioning that it is part
of something much bigger:

http://public-inbox.org/git/974d0bfed38e8aa410e97e05022bc5dbbd78d915.1457615785.git.johannes.schinde...@gmx.de/

However, now that I have this big carrot (3x speedup on Linux, 4x
speedup on MacOSX and 5x speedup on Windows), it cannot possibly fail.
*thumbs-crossed*


Johannes Schindelin (6):
  pull: drop confusing prefix parameter of die_on_unclean_work_tree()
  pull: make code more similar to the shell script again
  Make the require_clean_work_tree() function truly reusable
  require_clean_work_tree: ensure that the index was read
  Export also the has_un{staged,committed}_changed() functions
  wt-status: teach has_{unstaged,uncommitted}_changes() about submodules

 builtin/pull.c | 71 +++--
 wt-status.c| 83 ++
 wt-status.h|  5 
 3 files changed, 91 insertions(+), 68 deletions(-)

Published-As: 
https://github.com/dscho/git/releases/tag/require-clean-work-tree-v1
Fetch-It-Via: git fetch https://github.com/dscho/git require-clean-work-tree-v1

-- 
2.10.0.rc1.99.gcd66998

base-commit: 2632c897f74b1cc9b5533f467da459b9ec725538
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/6] Make the require_clean_work_tree() function truly reusable

2016-08-25 Thread Johannes Schindelin

It is remarkable that libgit.a did not sport this function yet... Let's
move it into a more prominent (and into an actually reusable) spot:
wt-status.[ch].

Signed-off-by: Johannes Schindelin 
---
 builtin/pull.c | 75 +-
 wt-status.c| 74 +
 wt-status.h|  2 ++
 3 files changed, 77 insertions(+), 74 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index 4d1f9c8..c35c6e8 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -17,6 +17,7 @@
 #include "revision.h"
 #include "tempfile.h"
 #include "lockfile.h"
+#include "wt-status.h"
 
 enum rebase_type {
REBASE_INVALID = -1,
@@ -326,80 +327,6 @@ static int git_pull_config(const char *var, const char 
*value, void *cb)
 }
 
 /**
- * Returns 1 if there are unstaged changes, 0 otherwise.
- */
-static int has_unstaged_changes(void)
-{
-   struct rev_info rev_info;
-   int result;
-
-   init_revisions(_info, NULL);
-   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
-   DIFF_OPT_SET(_info.diffopt, QUICK);
-   diff_setup_done(_info.diffopt);
-   result = run_diff_files(_info, 0);
-   return diff_result_code(_info.diffopt, result);
-}
-
-/**
- * Returns 1 if there are uncommitted changes, 0 otherwise.
- */
-static int has_uncommitted_changes(void)
-{
-   struct rev_info rev_info;
-   int result;
-
-   if (is_cache_unborn())
-   return 0;
-
-   init_revisions(_info, NULL);
-   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
-   DIFF_OPT_SET(_info.diffopt, QUICK);
-   add_head_to_pending(_info);
-   diff_setup_done(_info.diffopt);
-   result = run_diff_index(_info, 1);
-   return diff_result_code(_info.diffopt, result);
-}
-
-/**
- * If the work tree has unstaged or uncommitted changes, dies with the
- * appropriate message.
- */
-static int require_clean_work_tree(const char *action, const char *hint,
-   int gently)
-{
-   struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
-   int err = 0;
-
-   hold_locked_index(lock_file, 0);
-   refresh_cache(REFRESH_QUIET);
-   update_index_if_able(_index, lock_file);
-   rollback_lock_file(lock_file);
-
-   if (has_unstaged_changes()) {
-   error(_("Cannot %s: You have unstaged changes."), action);
-   err = 1;
-   }
-
-   if (has_uncommitted_changes()) {
-   if (err)
-   error(_("Additionally, your index contains uncommitted 
changes."));
-   else
-   error(_("Cannot %s: Your index contains uncommitted 
changes."), action);
-   err = 1;
-   }
-
-   if (err) {
-   if (hint)
-   error("%s", hint);
-   if (!gently)
-   exit(err);
-   }
-
-   return err;
-}
-
-/**
  * Appends merge candidates from FETCH_HEAD that are not marked not-for-merge
  * into merge_heads.
  */
diff --git a/wt-status.c b/wt-status.c
index 6225a2d..792dda9 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -16,6 +16,7 @@
 #include "strbuf.h"
 #include "utf8.h"
 #include "worktree.h"
+#include "lockfile.h"
 
 static const char cut_line[] =
 " >8 \n";
@@ -1757,3 +1758,76 @@ void wt_porcelain_print(struct wt_status *s)
s->no_gettext = 1;
wt_shortstatus_print(s);
 }
+
+/**
+ * Returns 1 if there are unstaged changes, 0 otherwise.
+ */
+static int has_unstaged_changes(void)
+{
+   struct rev_info rev_info;
+   int result;
+
+   init_revisions(_info, NULL);
+   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
+   DIFF_OPT_SET(_info.diffopt, QUICK);
+   diff_setup_done(_info.diffopt);
+   result = run_diff_files(_info, 0);
+   return diff_result_code(_info.diffopt, result);
+}
+
+/**
+ * Returns 1 if there are uncommitted changes, 0 otherwise.
+ */
+static int has_uncommitted_changes(void)
+{
+   struct rev_info rev_info;
+   int result;
+
+   if (is_cache_unborn())
+   return 0;
+
+   init_revisions(_info, NULL);
+   DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
+   DIFF_OPT_SET(_info.diffopt, QUICK);
+   add_head_to_pending(_info);
+   diff_setup_done(_info.diffopt);
+   result = run_diff_index(_info, 1);
+   return diff_result_code(_info.diffopt, result);
+}
+
+/**
+ * If the work tree has unstaged or uncommitted changes, dies with the
+ * appropriate message.
+ */
+int require_clean_work_tree(const char *action, const char *hint, int gently)
+{
+   struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
+   int err = 0;
+
+   hold_locked_index(lock_file, 0);
+   refresh_cache(REFRESH_QUIET);
+   update_index_if_able(_index, lock_file);
+   rollback_lock_file(lock_file);
+
+   if (has_unstaged_changes()) {
+

[PATCH 2/6] pull: make code more similar to the shell script again

2016-08-25 Thread Johannes Schindelin

When converting the pull command to a builtin, the
require_clean_work_tree() function was renamed and the pull-specific
parts hard-coded.

This makes it impossible to reuse the code, so let's modify the code to
make it more similar to the original shell script again.

Signed-off-by: Johannes Schindelin 
---
 builtin/pull.c | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index d4bd635..4d1f9c8 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -365,10 +365,11 @@ static int has_uncommitted_changes(void)
  * If the work tree has unstaged or uncommitted changes, dies with the
  * appropriate message.
  */
-static void die_on_unclean_work_tree(void)
+static int require_clean_work_tree(const char *action, const char *hint,
+   int gently)
 {
struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
-   int do_die = 0;
+   int err = 0;
 
hold_locked_index(lock_file, 0);
refresh_cache(REFRESH_QUIET);
@@ -376,20 +377,26 @@ static void die_on_unclean_work_tree(void)
rollback_lock_file(lock_file);
 
if (has_unstaged_changes()) {
-   error(_("Cannot pull with rebase: You have unstaged changes."));
-   do_die = 1;
+   error(_("Cannot %s: You have unstaged changes."), action);
+   err = 1;
}
 
if (has_uncommitted_changes()) {
-   if (do_die)
+   if (err)
error(_("Additionally, your index contains uncommitted 
changes."));
else
-   error(_("Cannot pull with rebase: Your index contains 
uncommitted changes."));
-   do_die = 1;
+   error(_("Cannot %s: Your index contains uncommitted 
changes."), action);
+   err = 1;
}
 
-   if (do_die)
-   exit(1);
+   if (err) {
+   if (hint)
+   error("%s", hint);
+   if (!gently)
+   exit(err);
+   }
+
+   return err;
 }
 
 /**
@@ -875,7 +882,8 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
die(_("Updating an unborn branch with changes added to 
the index."));
 
if (!autostash)
-   die_on_unclean_work_tree();
+   require_clean_work_tree("pull with rebase",
+   "Please commit or stash them.", 0);
 
if (get_rebase_fork_point(rebase_fork_point, repo, *refspecs))
hashclr(rebase_fork_point);
-- 
2.10.0.rc1.99.gcd66998


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/6] pull: drop confusing prefix parameter of die_on_unclean_work_tree()

2016-08-25 Thread Johannes Schindelin

In cmd_pull(), when verifying that there are no changes preventing a
rebasing pull, we diligently pass the prefix parameter to the
die_on_unclean_work_tree() function which in turn diligently passes it
to the has_unstaged_changes() and has_uncommitted_changes() functions.

The casual reader might now be curious (as this developer was) whether
that means that calling `git pull --rebase` in a subdirectory will
ignore unstaged changes in other parts of the working directory. And be
puzzled that `git pull --rebase` (correctly) complains about those
changes outside of the current directory.

The puzzle is easily resolved: while we take pains to pass around the
prefix and even pass it to init_revisions(), the fact that no paths are
passed to init_revisions() ensures that the prefix is simply ignored.

That, combined with the fact that we will *always* want a *full* working
directory check before running a rebasing pull, is reason enough to
simply do away with the actual prefix parameter and to pass NULL
instead, as if we were running this from the top-level working directory
anyway.

Signed-off-by: Johannes Schindelin 
---
 builtin/pull.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index 398aae1..d4bd635 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -328,12 +328,12 @@ static int git_pull_config(const char *var, const char 
*value, void *cb)
 /**
  * Returns 1 if there are unstaged changes, 0 otherwise.
  */
-static int has_unstaged_changes(const char *prefix)
+static int has_unstaged_changes(void)
 {
struct rev_info rev_info;
int result;
 
-   init_revisions(_info, prefix);
+   init_revisions(_info, NULL);
DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
DIFF_OPT_SET(_info.diffopt, QUICK);
diff_setup_done(_info.diffopt);
@@ -344,7 +344,7 @@ static int has_unstaged_changes(const char *prefix)
 /**
  * Returns 1 if there are uncommitted changes, 0 otherwise.
  */
-static int has_uncommitted_changes(const char *prefix)
+static int has_uncommitted_changes(void)
 {
struct rev_info rev_info;
int result;
@@ -352,7 +352,7 @@ static int has_uncommitted_changes(const char *prefix)
if (is_cache_unborn())
return 0;
 
-   init_revisions(_info, prefix);
+   init_revisions(_info, NULL);
DIFF_OPT_SET(_info.diffopt, IGNORE_SUBMODULES);
DIFF_OPT_SET(_info.diffopt, QUICK);
add_head_to_pending(_info);
@@ -365,7 +365,7 @@ static int has_uncommitted_changes(const char *prefix)
  * If the work tree has unstaged or uncommitted changes, dies with the
  * appropriate message.
  */
-static void die_on_unclean_work_tree(const char *prefix)
+static void die_on_unclean_work_tree(void)
 {
struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
int do_die = 0;
@@ -375,12 +375,12 @@ static void die_on_unclean_work_tree(const char *prefix)
update_index_if_able(_index, lock_file);
rollback_lock_file(lock_file);
 
-   if (has_unstaged_changes(prefix)) {
+   if (has_unstaged_changes()) {
error(_("Cannot pull with rebase: You have unstaged changes."));
do_die = 1;
}
 
-   if (has_uncommitted_changes(prefix)) {
+   if (has_uncommitted_changes()) {
if (do_die)
error(_("Additionally, your index contains uncommitted 
changes."));
else
@@ -875,7 +875,7 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
die(_("Updating an unborn branch with changes added to 
the index."));
 
if (!autostash)
-   die_on_unclean_work_tree(prefix);
+   die_on_unclean_work_tree();
 
if (get_rebase_fork_point(rebase_fork_point, repo, *refspecs))
hashclr(rebase_fork_point);
-- 
2.10.0.rc1.99.gcd66998


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 07/13] pack-protocol: fix maximum pkt-line size

2016-08-25 Thread larsxschneider

From: Lars Schneider 

According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
therefore the pkt-line data component must not exceed 65516 bytes.

Signed-off-by: Lars Schneider 
---
 Documentation/technical/protocol-common.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/protocol-common.txt 
b/Documentation/technical/protocol-common.txt
index bf30167..ecedb34 100644
--- a/Documentation/technical/protocol-common.txt
+++ b/Documentation/technical/protocol-common.txt
@@ -67,9 +67,9 @@ with non-binary data the same whether or not they contain the 
trailing
 LF (stripping the LF if present, and not complaining when it is
 missing).
 
-The maximum length of a pkt-line's data component is 65520 bytes.
-Implementations MUST NOT send pkt-line whose length exceeds 65524
-(65520 bytes of payload + 4 bytes of length data).
+The maximum length of a pkt-line's data component is 65516 bytes.
+Implementations MUST NOT send pkt-line whose length exceeds 65520
+(65516 bytes of payload + 4 bytes of length data).
 
 Implementations SHOULD NOT send an empty pkt-line ("0004").
 
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[ANNOUNCE] Git for Windows 2.9.3(2)

2016-08-25 Thread Johannes Schindelin

Dear Git users,

It is my pleasure to announce that Git for Windows 2.9.3(2) is available from:

https://git-for-windows.github.io/

Changes since Git for Windows v2.9.3 (August 13th 2016)

New Features

  • Comes with Git Credential Manager v1.6.1.
  • The feature introduced with Git for Windows v2.9.3 where cat-file
can apply smudge filters was renamed to --filters and made
compatible with the --batch mode (the former option name --smudge
has been deprecated and will go away in v2.10.0).
  • Comes with OpenSSH 7.3p1.
  • Git's .exe files are now code-signed, helping with performance when
being run with Windows File Protection.

Filename | SHA-256
 | ---
Git-2.9.3.2-64-bit.exe | 
d23629ec9f89a6bbddc0459108296b4fdfa08abd98e07485134a9a30ad40486a
Git-2.9.3.2-32-bit.exe | 
7d9645093925fee8de5308c7d6d53d9f6b070e95e4369fef02b4493da782475f
PortableGit-2.9.3.2-64-bit.7z.exe | 
2115cbd45b20efc62dcfa57ce86d5b6f3f15e8a887f253baeb43e3bff836810e
PortableGit-2.9.3.2-32-bit.7z.exe | 
a173516ba51d7b1d926b78243911c55962c007cc6ea822d1cca37963a4794c68
MinGit-2.9.3.2-64-bit.zip | 
4616ef2cef84d19ccc0325d6326107fe82a7c487dafbe7ee2248878cea32ea1a
MinGit-2.9.3.2-32-bit.zip | 
7178963b663d91abc5d101eab6a24552fdbc2913a312edaa8291266451bcf1d1
Git-2.9.3.2-64-bit.tar.bz2 | 
9cbb3019bc614fa2769b6d64298b339a2ef66d475602c1d572b849283b62ad15
Git-2.9.3.2-32-bit.tar.bz2 | 
be55791fbd02bd7d51b7b93eed0ac8193bbd996e8590f0f0ac01b65c9a85e5b0

Ciao,
Johannes

Re: core.autocrlf, was Re: [git-for-windows] Re: [ANNOUNCE] Git for Windows 2.9.3

2016-08-25 Thread Johannes Schindelin

Hi Torsten,

On Thu, 25 Aug 2016, Torsten Bögershausen wrote:

> > I was not talking about the cost of correcting mistakes. Running --filters
> > is potentially very costly. Just so you understand what I am talking
> > about: I have a report that says that checking out a sizeable worktree
> > with core.autocrlf=true is 58% slower than with core.autocrlf=false. That
> > is horrible. []
> 
> Is this a public repo ?

No.

> Or is there a benchmark repo somewhere ?

Unfortunately not. The only information I have is that it contains
gazillions of files and that most of that time was spent in figuring out
whether the files contain CR/LF, LF, or both.

I hope to get back to some performance benchmarking soon. I have some
experimental code to generate Git repositories of a specific size, and I
hope to be able to replicate the issues with that infrastructure.

Should be fun.

Ciao,
Dscho

Re: Working with public-inbox.org [Was: [PATCH] rev-parse: respect core.hooksPath in --git-path]

2016-08-25 Thread Johannes Schindelin

Hi Kuba,

On Mon, 22 Aug 2016, Jakub Narębski wrote:

> W dniu 22.08.2016 o 15:18, Johannes Schindelin pisze:
> 
> > So unfortunately this thread has devolved. Which is sad. Because all I
> > wanted is to have a change in Git's submission process that would not
> > exclude *so many* developers. That is really all I care about. Not about
> > tools. Not about open vs proprietary, or standards.
> > 
> > I just want developers who are already familiar with Git, and come up with
> > an improvement to Git itself, to be able to contribute it without having
> > to pull out their hair in despair.
> 
> What is lacking in using submitGit tool for those that have problems
> with sending patches via email?

Where do I start? And where do I stop? Here is a *very* brief list of
issues from the top of my head (and the history of my web browser):

- You cannot open a PR on GitHub and include the PR's cover letter as
  cover letter: https://github.com/rtyley/submitgit/issues/9

- You cannot Cc: people explicitly:
  https://github.com/rtyley/submitgit/issues/31

- submitGit does not include any interdiff

- it is really hard to get back from mails to the corresponding commits

- you have to register with yet another service to send mails on your
  behalf. Would be nicer if the mails could be sent from a submitGit
  address (moderated, of course) and did not need a separate registration
  step with some scary permission granting.

- submitGit requires you to go to a separate website to interact with the
  submitGit web app. Would be so much nicer if it were a bot operating on
  PRs.

- comments sent as replies have no connection to the PR *nor* the commits
  they refer to (making submitGit basically a pimped up git-send-email,
  nothing more).

- submitGit would require a substantial effort from me to learn how to
  extend/fix it, to run the web app locally and run its tests. That is a
  rather steep hurdle.

This is an incomplete list, of course.

> Submitting changes in Git comes in three phases:
>  - submit email with patches
>  - review and discuss patch
>  - apply patches from email

You forgot a really crucial step. Maybe you did not go through dozens of
iterations in your patch series as I regularly do, or something, so it is
probably easy for you to forget:

  - find the commit in question, run rebase -i and patch it as suggested

This is something that costs me quite some time to do. It is easily the
most annoying aspect of the mail list-based approach for me.

> Pull request via GitHub / Bitbucket / GitLab is easier than sending
> patches via email (pity that GitHub et al. do not have such submitGit-like
> automation built-in).  But I think email, with threaded view, careful
> trimming of quoted contents, multi-level quotes is superior to exiting
> web-based solutions.

They are not exiting, but I know what you meant.

The thing is: GitHub does not need such an automation. Because most
projects are pretty happy with the process centered around the web app.

It is only projects such as Linux, Cygwin and Git itself who refuse to
allow for tools that would let the majority of potential contributors
stick with their favorite way to read and write mails (I am talking about
users of GMail and Outlook, of course).

Ciao,
Dscho

git push origin BRANCHNAME question

2016-08-25 Thread Ed Greenberg

I think I understand this from the git-push man page, but I want to make 
sure:


I have two branches, master and develop.

If I am (accidentally) sitting on master, and issue 'git push origin 
develop', does this properly push develop to remote develop, or does it 
push master to remote develop (which seems to be bad, in the most common 
use case.)  ?


Thanks,

Ed


--
Ed Greenberg
Glens Falls, NY USA

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

git push origin BRANCHNAME question

2016-08-25 Thread Ed Greenberg

I think I understand this from the git-push man page, but I want to make 
sure:


I have two branches, master and develop.

If I am (accidentally) sitting on master, and issue 'git push origin 
develop', does this properly push develop to remote develop, or does it 
push master to remote develop (which seems to be bad, in the most common 
use case.)  ?


Thanks,

Ed


--
Ed Greenberg
Glens Falls, NY USA

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Working with public-inbox.org [Was: [PATCH] rev-parse: respect core.hooksPath in --git-path]

2016-08-25 Thread Johannes Schindelin

Hi Eric,

On Wed, 24 Aug 2016, Eric Wong wrote:

> Johannes Schindelin  wrote:
> 
> > Now, with somebody like me who would lose a lot when destroying trust,
> > it is highly unlikely. But it is possible that in between the hundreds
> > of sincere contributors a bad apple tries to sneak in bad stuff.
> 
> Yes, I would never mix reviews + patch applications of emails vs
> git-fetched data.

Well, such a categorical statement seems to exclude all convenience I had
in mind.

My idea was that, say, a web service running on a trusted server with a
trusted code base could send mails that would be trusted to contain
correct SHA-1 information, allowing for a review in the mail, but still
use the much, much more convenient Git tools to actually work on the code.

I really hope that not everybody is so categorically against introducing
much needed convenience.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Working with public-inbox.org [Was: [PATCH] rev-parse: respect core.hooksPath in --git-path]

2016-08-25 Thread Johannes Schindelin

Hi Arif,

On Thu, 25 Aug 2016, Arif Khokar wrote:

> On 08/24/2016 09:04 AM, Johannes Schindelin wrote:
> >
> > On Mon, 22 Aug 2016, Philip Oakley wrote:
> 
> >> I do note that dscho's patches now have the extra footer (below the
> >> three dashes) e.g.
> >>
> >> Published-As: https://github.com/dscho/git/releases/tag/cat-file-filters-v1
> >> Fetch-It-Via: git fetch https://github.com/dscho/git cat-file-filters-v1
> 
> 
> 
> > I considered recommending this as some way to improve the review process.
> > The problem, of course, is that it is very easy to craft an email with an
> > innocuous patch and then push some malicious patch to the linked
> > repository.
> 
> It should be possible to verify the SHA1 of the blob before and after 
> the patch is applied given the values listed near the beginning of the 
> git diff output.

There is no guarantee that the SHA-1 has not been tampered with.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Working with public-inbox.org [Was: [PATCH] rev-parse: respect core.hooksPath in --git-path]

2016-08-25 Thread Johannes Schindelin

Hi Eric,

On Mon, 22 Aug 2016, Eric Wong wrote:

> Johannes Schindelin  wrote:
>
> > I just want developers who are already familiar with Git, and come up with
> > an improvement to Git itself, to be able to contribute it without having
> > to pull out their hair in despair.
> 
> We want the same thing.  I just want to go farther and get
> people familiar with (federated|decentralized) tools instead of
> proprietary and centralized ones.

Why require users to get familiar with (federated|decentralized) tools
*unless* they make things provably more convenient? So far, I only see
that this would add to the hurdle, not improve things.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 04/13] pkt-line: add packet_flush_gently()

2016-08-25 Thread larsxschneider

From: Lars Schneider 

packet_flush() would die in case of a write error even though for some
callers an error would be acceptable. Add packet_flush_gently() which
writes a pkt-line flush packet and returns `0` for success and `-1` for
failure.

Signed-off-by: Lars Schneider 
---
 pkt-line.c | 6 ++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 3e8b2fb..cad26df 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
write_or_die(fd, "", 4);
 }
 
+int packet_flush_gently(int fd)
+{
+   packet_trace("", 4, 1);
+   return (write_in_full(fd, "", 4) == 4 ? 0 : -1);
+}
+
 void packet_buf_flush(struct strbuf *buf)
 {
packet_trace("", 4, 1);
diff --git a/pkt-line.h b/pkt-line.h
index 3caea77..3fa0899 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -23,6 +23,7 @@ void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format 
(printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 
 /*
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: core.autocrlf, was Re: [git-for-windows] Re: [ANNOUNCE] Git for Windows 2.9.3

2016-08-25 Thread Torsten Bögershausen




I was not talking about the cost of correcting mistakes. Running --filters
is potentially very costly. Just so you understand what I am talking
about: I have a report that says that checking out a sizeable worktree
with core.autocrlf=true is 58% slower than with core.autocrlf=false. That
is horrible. []


Is this a public repo ?

Or is there a benchmark repo somewhere ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 08/13] convert: quote filter names in error messages

2016-08-25 Thread larsxschneider

From: Lars Schneider 

Git filter driver commands with spaces (e.g. `filter.sh foo`) are hard to
read in error messages. Quote them to improve the readability.

Signed-off-by: Lars Schneider 
---
 convert.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index b1614bf..522e2c5 100644
--- a/convert.c
+++ b/convert.c
@@ -397,7 +397,7 @@ static int filter_buffer_or_fd(int in, int out, void *data)
child_process.out = out;
 
if (start_command(_process))
-   return error("cannot fork to run external filter %s", 
params->cmd);
+   return error("cannot fork to run external filter '%s'", 
params->cmd);
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -415,13 +415,13 @@ static int filter_buffer_or_fd(int in, int out, void 
*data)
if (close(child_process.in))
write_err = 1;
if (write_err)
-   error("cannot feed the input to external filter %s", 
params->cmd);
+   error("cannot feed the input to external filter '%s'", 
params->cmd);
 
sigchain_pop(SIGPIPE);
 
status = finish_command(_process);
if (status)
-   error("external filter %s failed %d", params->cmd, status);
+   error("external filter '%s' failed %d", params->cmd, status);
 
strbuf_release();
return (write_err || status);
@@ -462,15 +462,15 @@ static int apply_filter(const char *path, const char 
*src, size_t len, int fd,
return 0;   /* error was already reported */
 
if (strbuf_read(, async.out, len) < 0) {
-   error("read from external filter %s failed", cmd);
+   error("read from external filter '%s' failed", cmd);
ret = 0;
}
if (close(async.out)) {
-   error("read from external filter %s failed", cmd);
+   error("read from external filter '%s' failed", cmd);
ret = 0;
}
if (finish_async()) {
-   error("external filter %s failed", cmd);
+   error("external filter '%s' failed", cmd);
ret = 0;
}
 
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 09/13] convert: modernize tests

2016-08-25 Thread larsxschneider

From: Lars Schneider 

Use `test_config` to set the config, check that files are empty with
`test_must_be_empty`, compare files with `test_cmp`, and remove spaces
after ">" and "<".

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 62 +--
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7bac2bc..7b45136 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -13,8 +13,8 @@ EOF
 chmod +x rot13.sh
 
 test_expect_success setup '
-   git config filter.rot13.smudge ./rot13.sh &&
-   git config filter.rot13.clean ./rot13.sh &&
+   test_config filter.rot13.smudge ./rot13.sh &&
+   test_config filter.rot13.clean ./rot13.sh &&
 
{
echo "*.t filter=rot13"
@@ -38,8 +38,8 @@ script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
 
 test_expect_success check '
 
-   cmp test.o test &&
-   cmp test.o test.t &&
+   test_cmp test.o test &&
+   test_cmp test.o test.t &&
 
# ident should be stripped in the repository
git diff --raw --exit-code :test :test.i &&
@@ -47,10 +47,10 @@ test_expect_success check '
embedded=$(sed -ne "$script" test.i) &&
test "z$id" = "z$embedded" &&
 
-   git cat-file blob :test.t > test.r &&
+   git cat-file blob :test.t >test.r &&
 
-   ./rot13.sh < test.o > test.t &&
-   cmp test.r test.t
+   ./rot13.sh test.t &&
+   test_cmp test.r test.t
 '
 
 # If an expanded ident ever gets into the repository, we want to make sure that
@@ -130,7 +130,7 @@ test_expect_success 'filter shell-escaped filenames' '
 
# delete the files and check them out again, using a smudge filter
# that will count the args and echo the command-line back to us
-   git config filter.argc.smudge "sh ./argc.sh %f" &&
+   test_config filter.argc.smudge "sh ./argc.sh %f" &&
rm "$normal" "$special" &&
git checkout -- "$normal" "$special" &&
 
@@ -141,7 +141,7 @@ test_expect_success 'filter shell-escaped filenames' '
test_cmp expect "$special" &&
 
# do the same thing, but with more args in the filter expression
-   git config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" &&
+   test_config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" &&
rm "$normal" "$special" &&
git checkout -- "$normal" "$special" &&
 
@@ -154,9 +154,9 @@ test_expect_success 'filter shell-escaped filenames' '
 '
 
 test_expect_success 'required filter should filter data' '
-   git config filter.required.smudge ./rot13.sh &&
-   git config filter.required.clean ./rot13.sh &&
-   git config filter.required.required true &&
+   test_config filter.required.smudge ./rot13.sh &&
+   test_config filter.required.clean ./rot13.sh &&
+   test_config filter.required.required true &&
 
echo "*.r filter=required" >.gitattributes &&
 
@@ -165,17 +165,17 @@ test_expect_success 'required filter should filter data' '
 
rm -f test.r &&
git checkout -- test.r &&
-   cmp test.o test.r &&
+   test_cmp test.o test.r &&
 
./rot13.sh expected &&
git cat-file blob :test.r >actual &&
-   cmp expected actual
+   test_cmp expected actual
 '
 
 test_expect_success 'required filter smudge failure' '
-   git config filter.failsmudge.smudge false &&
-   git config filter.failsmudge.clean cat &&
-   git config filter.failsmudge.required true &&
+   test_config filter.failsmudge.smudge false &&
+   test_config filter.failsmudge.clean cat &&
+   test_config filter.failsmudge.required true &&
 
echo "*.fs filter=failsmudge" >.gitattributes &&
 
@@ -186,9 +186,9 @@ test_expect_success 'required filter smudge failure' '
 '
 
 test_expect_success 'required filter clean failure' '
-   git config filter.failclean.smudge cat &&
-   git config filter.failclean.clean false &&
-   git config filter.failclean.required true &&
+   test_config filter.failclean.smudge cat &&
+   test_config filter.failclean.clean false &&
+   test_config filter.failclean.required true &&
 
echo "*.fc filter=failclean" >.gitattributes &&
 
@@ -197,8 +197,8 @@ test_expect_success 'required filter clean failure' '
 '
 
 test_expect_success 'filtering large input to small output should use little 
memory' '
-   git config filter.devnull.clean "cat >/dev/null" &&
-   git config filter.devnull.required true &&
+   test_config filter.devnull.clean "cat >/dev/null" &&
+   test_config filter.devnull.required true &&
for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
echo "30MB filter=devnull" >.gitattributes &&
GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
@@ -207,7 +207,7 @@ test_expect_success 'filtering large input to small output 
should use little mem

[PATCH v6 00/13] Git filter protocol

2016-08-25 Thread larsxschneider

From: Lars Schneider 

The goal of this series is to avoid launching a new clean/smudge filter
process for each file that is filtered.

A short summary about v1 to v5 can be found here:
https://git.github.io/rev_news/2016/08/17/edition-18/

This series is also published on web:
https://github.com/larsxschneider/git/pull/10

Thanks a lot for your reviews,
Lars


## Changes since v5

### Peff
* 
http://public-inbox.org/git/20160810131541.ovpvgwdxjibae5gy%40sigill.intra.peff.net/
* drop patch "pkt-line: add `gentle` parameter to format_packet()"
* 
http://public-inbox.org/git/20160810143321.q7mjirgr5ynml...@sigill.intra.peff.net/
* drop patch "pkt-line: call packet_trace() only if a packet is actually 
send"
* 
http://public-inbox.org/git/20160810132814.gqnipsdwyzjmuqjy%40sigill.intra.peff.net/
* make pkt-line write buffer static
* replace PKTLINE_DATA_MAXLEN with sizeof(packet_write_buffer) - 4)
  in pkt-line.c to --> makes it easier to see that things are correct
* 
http://public-inbox.org/git/20160810133745.wagccvvf35o3pbwb%40sigill.intra.peff.net/
* check max content length before using packet_write_fmt_gently()


### Junio
* http://public-inbox.org/git/xmqqpopg5yqf.fsf%40gitster.mtv.corp.google.com/
*  change packet_write_gently_fmt() to packet_write_fmt_gently()
* http://public-inbox.org/git/xmqq8tw45vtg@gitster.mtv.corp.google.com/
* rename packet_write() to packet_write_fmt() before adding new functions


## Stefan
* 
http://public-inbox.org/git/CAGZ79kboxgBRHSa2s7CKZ1Uo%3D13WT%3DrT8VHCNJNj_Q9jQzZAYw%40mail.gmail.com/
* state more clearly that everything after version=2 is specific to 
version=2
* do not duplicate protocol in commit message,
  summarize design decisions instead
* use single line comment
* remove unnessary braces


## Dscho
* http://public-inbox.org/git/alpine.DEB.2.20.1608181617240.4924@virtualbox/
* wrap commit messages at 72
* s/CLOEXEX/CLOEXEC/


## Interdiff (v5..v6)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 6e563a6..6346700 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -393,13 +393,20 @@ standard input and standard output.

 Git starts the filter when it encounters the first file
 that needs to be cleaned or smudged. After the filter started
-Git sends a welcome message, a list of supported protocol
-version numbers, and a flush packet. Git expects to read the
-welcome message and one protocol version number from the
-previously sent list. Afterwards Git sends a list of supported
-capabilities and a flush packet. Git expects to read a list of
-desired capabilities, which must be a subset of the supported
-capabilities list, and a flush packet as response:
+Git sends a welcome message ("git-filter-client"), a list of
+supported protocol version numbers, and a flush packet. Git expects
+to read a welcome response message ("git-filter-server") and exactly
+one protocol version number from the previously sent list. All further
+communication will be based on the selected version. The remaining
+protocol description below documents "version=2". Please note that
+"version=42" in the example below does not exist and is only there
+to illustrate how the protocol would look like with more than one
+version.
+
+After the version negotiation Git sends a list of supported capabilities
+and a flush packet. Git expects to read a list of desired capabilities,
+which must be a subset of the supported capabilities list, and a flush
+packet as response:
 
 packet:  git> git-filter-client
 packet:  git> version=2
diff --git a/convert.c b/convert.c
index e421f4a..362a0af 100644
--- a/convert.c
+++ b/convert.c
@@ -530,7 +530,9 @@ static int packet_write_list(int fd, const char *line, ...)
  {
if (!line)
  break;
-   err = packet_write_gently_fmt(fd, "%s", line);
+   if (strlen(line) > PKTLINE_DATA_MAXLEN)
+ return -1;
+   err = packet_write_fmt_gently(fd, "%s", line);
if (err)
  return err;
line = va_arg(args, const char*);
@@ -686,11 +688,18 @@ static int apply_multi_file_filter(const char *path, 
const char *src, size_t len

  sigchain_push(SIGPIPE, SIG_IGN);

- err = packet_write_gently_fmt(process->in, "command=%s\n", filter_type);
+
+ err = (strlen(filter_type) > PKTLINE_DATA_MAXLEN);
+ if (err)
+   goto done;
+ err = packet_write_fmt_gently(process->in, "command=%s\n", filter_type);
  if (err)
goto done;

- err = packet_write_gently_fmt(process->in, "pathname=%s\n", path);
+ err = (strlen(path) > PKTLINE_DATA_MAXLEN);
+ if (err)
+   goto done;
+ err = packet_write_fmt_gently(process->in, "pathname=%s\n", path);
  if (err)
goto done;

@@ -722,9 +731,7 @@ static int apply_multi_file_filter(const char *path, const 
char *src, size_t len

  if (err || errno == EPIPE) {
if (!strcmp(filter_status.buf, "error")) {
- /*
-* The filter signaled a

Re: core.autocrlf, was Re: [git-for-windows] Re: [ANNOUNCE] Git for Windows 2.9.3

2016-08-25 Thread Johannes Schindelin

Hi Junio,

On Wed, 24 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> In any case, in the ideal future, I would imagine that we would want
> >> to have "cat-file blob" to enable "--filters" by default; that would
> >> make cat-file and hash-objects a pair of symmetric operations.
> >
> > I would advocate against that. It is not like the terms "hash-object" and
> > "cat-file" even *look* like they are opposites.
> 
> I do not quite understand your objection.
> 
> hash-object is "I have data somewhere on the filesystem, and I want
> to store it in the object store even though I am not ready to add it
> to the index yet (or I may not even add it to the index ever), just
> to make it available to Git tools".

That is not how I read it. I read "hash-object" as: "hash this object".
There was not a thought in my mind that it would apply filters. Since that
was so clear in my mind, I failed to understand that you do not consider
it a design mistake to turn on --filters by default in hash-object.

I read "cat-file" just the same: concatenate files and print on the
standard output. Now, it is confusing enough that it does not concatenate
files in unless in batch mode, and it would be even more confusing if it
started to behave as if the user had called "git checkout --dry-run
 -- " (which does not exist, but for which I would
understand the --filters default).

> Yes, correcting ancient mistakes is costly.  Such is life.

I was not talking about the cost of correcting mistakes. Running --filters
is potentially very costly. Just so you understand what I am talking
about: I have a report that says that checking out a sizeable worktree
with core.autocrlf=true is 58% slower than with core.autocrlf=false. That
is horrible. And it is a cost that is entirely born by Windows users.

In short: I think letting hash-object default to --filters was a mistake,
and doing the same for cat-file would be a mistake, too.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git for Windows documentation, was Re: [git-for-windows] Re: [ANNOUNCE] Git for Windows 2.9.3

2016-08-25 Thread Johannes Schindelin

Hi Dakota,

On Wed, 24 Aug 2016, Dakota Hawkins wrote:

> On Wed, Aug 24, 2016 at 11:41 AM, Johannes Schindelin
>  wrote:
> >
> > On Tue, 23 Aug 2016, Dakota Hawkins wrote:
> >
> >> I use GFW almost exclusively, but I pretty much always consult the
> >> upstream documentation anyway (because I find it easier).
> >
> > Oh... I thought that typing "git help git-commit" opening a nice HTML
> > page in your favorite browser was good enough.
> >
> > Do you have any suggestion how to improve the user experience?
> 
> Just a small one, and that's that I'd prefer the option to have help
> display in my terminal (that option might exist and I don't know how to
> turn it on). That would be very convenient for me.

Ah, okay... The reason why this is not that easy is: we ship with HTML
documentation (and skip `man` altogether, also to conserve space in the
already large installer: it is ~30MB, which might seem acceptable to you
until you are stuck in a country where the download is at 30-70 kB/s).

So I am afraid that the only solution in that case would be to install the
Git for Windows SDK (https://git-for-windows.github.io/#download-sdk, as
pointed out by Philip).

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 05/13] pkt-line: add packet_write_gently()

2016-08-25 Thread larsxschneider

From: Lars Schneider 

packet_write_fmt() has two shortcomings. First, it uses format_packet()
which lets the caller only send string data via "%s". That means it
cannot be used for arbitrary data that may contain NULs. Second, it will
always die on error.

Add packet_write_gently() which writes arbitrary data and returns `0`
for success and `-1` for an error. This function is used by other
pkt-line functions in a subsequent patch.

Signed-off-by: Lars Schneider 
---
 pkt-line.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index cad26df..7e8a803 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -3,6 +3,7 @@
 #include "run-command.h"
 
 char packet_buffer[LARGE_PACKET_MAX];
+static char packet_write_buffer[LARGE_PACKET_MAX];
 static const char *packet_trace_prefix = "git";
 static struct trace_key trace_packet = TRACE_KEY_INIT(PACKET);
 static struct trace_key trace_pack = TRACE_KEY_INIT(PACKFILE);
@@ -155,6 +156,17 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
return (write_in_full(fd, buf.buf, buf.len) == buf.len ? 0 : -1);
 }
 
+int packet_write_gently(const int fd_out, const char *buf, size_t size)
+{
+   if (size > sizeof(packet_write_buffer) - 4)
+   return -1;
+   packet_trace(buf, size, 1);
+   memmove(packet_write_buffer + 4, buf, size);
+   size += 4;
+   set_packet_header(packet_write_buffer, size);
+   return (write_in_full(fd_out, packet_write_buffer, size) == size ? 0 : 
-1);
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
va_list args;
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes

2016-08-25 Thread larsxschneider

From: Lars Schneider 

Consider the case of a file that requires filtering and is present in
branch A but not in branch B. If A is the current HEAD and we checkout B
then the following happens:

1. ce_compare_data() opens the file
2.   index_fd() detects that the file requires to run a clean filter and
 calls index_stream_convert_blob()
4. index_stream_convert_blob() calls convert_to_git_filter_fd()
5.   convert_to_git_filter_fd() calls apply_filter() which creates a
 new long running filter process (in case it is the first file
 of this kind to be filtered)
6.   The new filter process inherits all file handles. This is the
 default on Linux/OSX and is explicitly defined in the
 `CreateProcessW` call in `mingw.c` on Windows.
7. ce_compare_data() closes the file
8. Git unlinks the file as it is not present in B

The unlink operation does not work on Windows because the filter process
has still an open handle to the file. Apparently that is no problem on
Linux/OSX. Probably because "[...] the two file descriptors share open
file status flags" (see fork(2)).

Fix this problem by opening files in read-cache with the `O_CLOEXEC`
flag to ensure that the file descriptor does not remain open in a newly
spawned process. `O_CLOEXEC` is defined as `O_NOINHERIT` on Windows. A
similar fix for temporary file handles was applied on Git for Windows
already:
https://github.com/git-for-windows/git/commit/667b8b51ec850c3e1c7d75dee69dc13c29d1f162

Signed-off-by: Lars Schneider 
---
 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index db27766..f481dee 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -159,7 +159,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
int match = -1;
-   int fd = open(ce->name, O_RDONLY);
+   int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
 
if (fd >= 0) {
unsigned char sha1[20];
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 02/13] pkt-line: extract set_packet_header()

2016-08-25 Thread larsxschneider

From: Lars Schneider 

set_packet_header() converts an integer to a 4 byte hex string. Make
this function locally available so that other pkt-line functions can
use it.

Signed-off-by: Lars Schneider 
---
 pkt-line.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 0a9b61c..e8adc0f 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -97,10 +97,20 @@ void packet_buf_flush(struct strbuf *buf)
strbuf_add(buf, "", 4);
 }
 
-#define hex(a) (hexchar[(a) & 15])
-static void format_packet(struct strbuf *out, const char *fmt, va_list args)
+static void set_packet_header(char *buf, const int size)
 {
static char hexchar[] = "0123456789abcdef";
+
+   #define hex(a) (hexchar[(a) & 15])
+   buf[0] = hex(size >> 12);
+   buf[1] = hex(size >> 8);
+   buf[2] = hex(size >> 4);
+   buf[3] = hex(size);
+   #undef hex
+}
+
+static void format_packet(struct strbuf *out, const char *fmt, va_list args)
+{
size_t orig_len, n;
 
orig_len = out->len;
@@ -111,10 +121,7 @@ static void format_packet(struct strbuf *out, const char 
*fmt, va_list args)
if (n > LARGE_PACKET_MAX)
die("protocol error: impossibly long line");
 
-   out->buf[orig_len + 0] = hex(n >> 12);
-   out->buf[orig_len + 1] = hex(n >> 8);
-   out->buf[orig_len + 2] = hex(n >> 4);
-   out->buf[orig_len + 3] = hex(n);
+   set_packet_header(>buf[orig_len], n);
packet_trace(out->buf + orig_len + 4, n - 4, 1);
 }
 
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 12/13] convert: add filter..process option

2016-08-25 Thread larsxschneider

From: Lars Schneider 

Git's clean/smudge mechanism invokes an external filter process for
every single blob that is affected by a filter. If Git filters a lot of
blobs then the startup time of the external filter processes can become
a significant part of the overall Git execution time.

In a preliminary performance test this developer used a clean/smudge
filter written in golang to filter 12,000 files. This process took 364s
with the existing filter mechanism and 5s with the new mechanism. See
details here: https://github.com/github/git-lfs/pull/1382

This patch adds the `filter..process` string option which, if
used, keeps the external filter process running and processes all blobs
with the packet format (pkt-line) based protocol over standard input and
standard output. The full protocol is explained in detail in
`Documentation/gitattributes.txt`.

A few key decisions:

* The long running filter process is referred to as filter protocol
  version 2 because the existing single shot filter invocation is
  considered version 1.
* Git sends a welcome message and expects a response right after the
  external filter process has started. This ensures that Git will not
  hang if a version 1 filter is incorrectly used with the
  filter..process option for version 2 filters. In addition,
  Git can detect this kind of error and warn the user.
* The status of a filter operation (e.g. "success" or "error) is set
  before the actual response and (if necessary!) re-set after the
  response. The advantage of this two step status response is that if
  the filter detects an error early, then the filter can communicate
  this and Git does not even need to create structures to read the
  response.
* All status responses are pkt-line lists terminated with a flush
  packet. This allows us to send other status fields with the same
  protocol in the future.

Helped-by: Martin-Louis Bright 
Reviewed-by: Jakub Narebski 
Signed-off-by: Lars Schneider 
---
 Documentation/gitattributes.txt | 146 +++-
 convert.c   | 349 +
 pkt-line.h  |   1 +
 t/t0021-conversion.sh   | 372 
 t/t0021/rot13-filter.pl | 176 +++
 unpack-trees.c  |   1 +
 6 files changed, 1015 insertions(+), 30 deletions(-)
 create mode 100755 t/t0021/rot13-filter.pl

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8882a3e..6346700 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -300,7 +300,13 @@ checkout, when the `smudge` command is specified, the 
command is
 fed the blob object from its standard input, and its standard
 output is used to update the worktree file.  Similarly, the
 `clean` command is used to convert the contents of worktree file
-upon checkin.
+upon checkin. By default these commands process only a single
+blob and terminate.  If a long running `process` filter is used
+in place of `clean` and/or `smudge` filters, then Git can process
+all blobs with a single filter command invocation for the entire
+life of a single Git command, for example `git add --all`.  See
+section below for the description of the protocol used to
+communicate with a `process` filter.
 
 One use of the content filtering is to massage the content into a shape
 that is more convenient for the platform, filesystem, and the user to use.
@@ -375,6 +381,144 @@ substitution.  For example:
 
 
 
+Long Running Filter Process
+^^^
+
+If the filter command (a string value) is defined via
+`filter..process` then Git can process all blobs with a
+single filter invocation for the entire life of a single Git
+command. This is achieved by using the following packet format
+(pkt-line, see technical/protocol-common.txt) based protocol over
+standard input and standard output.
+
+Git starts the filter when it encounters the first file
+that needs to be cleaned or smudged. After the filter started
+Git sends a welcome message ("git-filter-client"), a list of
+supported protocol version numbers, and a flush packet. Git expects
+to read a welcome response message ("git-filter-server") and exactly
+one protocol version number from the previously sent list. All further
+communication will be based on the selected version. The remaining
+protocol description below documents "version=2". Please note that
+"version=42" in the example below does not exist and is only there
+to illustrate how the protocol would look like with more than one
+version.
+
+After the version negotiation Git sends a list of supported capabilities
+and a flush packet. Git expects to read a list of desired capabilities,
+which must be a subset of the supported capabilities list, and a flush
+packet as response:
+
+packet:

[PATCH v6 01/13] pkt-line: rename packet_write() to packet_write_fmt()

2016-08-25 Thread larsxschneider

From: Lars Schneider 

packet_write() should be called packet_write_fmt() as the string
parameter can be formatted.

Suggested-by: Junio C Hamano 
Signed-off-by: Lars Schneider 
---
 builtin/archive.c|  4 ++--
 builtin/receive-pack.c   |  4 ++--
 builtin/remote-ext.c |  4 ++--
 builtin/upload-archive.c |  4 ++--
 connect.c|  2 +-
 daemon.c |  2 +-
 http-backend.c   |  2 +-
 pkt-line.c   |  2 +-
 pkt-line.h   |  2 +-
 shallow.c|  2 +-
 upload-pack.c| 30 +++---
 11 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/builtin/archive.c b/builtin/archive.c
index a1e3b94..49f4914 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -47,10 +47,10 @@ static int run_remote_archiver(int argc, const char **argv,
if (name_hint) {
const char *format = archive_format_from_filename(name_hint);
if (format)
-   packet_write(fd[1], "argument --format=%s\n", format);
+   packet_write_fmt(fd[1], "argument --format=%s\n", 
format);
}
for (i = 1; i < argc; i++)
-   packet_write(fd[1], "argument %s\n", argv[i]);
+   packet_write_fmt(fd[1], "argument %s\n", argv[i]);
packet_flush(fd[1]);
 
buf = packet_read_line(fd[0], NULL);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 15c323a..d60a412 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -199,7 +199,7 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
 static void show_ref(const char *path, const unsigned char *sha1)
 {
if (sent_capabilities) {
-   packet_write(1, "%s %s\n", sha1_to_hex(sha1), path);
+   packet_write_fmt(1, "%s %s\n", sha1_to_hex(sha1), path);
} else {
struct strbuf cap = STRBUF_INIT;
 
@@ -212,7 +212,7 @@ static void show_ref(const char *path, const unsigned char 
*sha1)
if (push_cert_nonce)
strbuf_addf(, " push-cert=%s", push_cert_nonce);
strbuf_addf(, " agent=%s", git_user_agent_sanitized());
-   packet_write(1, "%s %s%c%s\n",
+   packet_write_fmt(1, "%s %s%c%s\n",
 sha1_to_hex(sha1), path, 0, cap.buf);
strbuf_release();
sent_capabilities = 1;
diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c
index 88eb8f9..11b48bf 100644
--- a/builtin/remote-ext.c
+++ b/builtin/remote-ext.c
@@ -128,9 +128,9 @@ static void send_git_request(int stdin_fd, const char 
*serv, const char *repo,
const char *vhost)
 {
if (!vhost)
-   packet_write(stdin_fd, "%s %s%c", serv, repo, 0);
+   packet_write_fmt(stdin_fd, "%s %s%c", serv, repo, 0);
else
-   packet_write(stdin_fd, "%s %s%chost=%s%c", serv, repo, 0,
+   packet_write_fmt(stdin_fd, "%s %s%chost=%s%c", serv, repo, 0,
 vhost, 0);
 }
 
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index 2caedf1..dc872f6 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -88,11 +88,11 @@ int cmd_upload_archive(int argc, const char **argv, const 
char *prefix)
writer.git_cmd = 1;
if (start_command()) {
int err = errno;
-   packet_write(1, "NACK unable to spawn subprocess\n");
+   packet_write_fmt(1, "NACK unable to spawn subprocess\n");
die("upload-archive: %s", strerror(err));
}
 
-   packet_write(1, "ACK\n");
+   packet_write_fmt(1, "ACK\n");
packet_flush(1);
 
while (1) {
diff --git a/connect.c b/connect.c
index 722dc3f..5330d9c 100644
--- a/connect.c
+++ b/connect.c
@@ -730,7 +730,7 @@ struct child_process *git_connect(int fd[2], const char 
*url,
 * Note: Do not add any other headers here!  Doing so
 * will cause older git-daemon servers to crash.
 */
-   packet_write(fd[1],
+   packet_write_fmt(fd[1],
 "%s %s%chost=%s%c",
 prog, path, 0,
 target_host, 0);
diff --git a/daemon.c b/daemon.c
index e647254..2646d0f 100644
--- a/daemon.c
+++ b/daemon.c
@@ -281,7 +281,7 @@ static int daemon_error(const char *dir, const char *msg)
 {
if (!informative_errors)
msg = "access denied or repository not exported";
-   packet_write(1, "ERR %s: %s", msg, dir);
+   packet_write_fmt(1, "ERR %s: %s", msg, dir);
return -1;
 }
 
diff --git a/http-backend.c b/http-backend.c
index 0d59499..aa61c18 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -460,7 +460,7 @@ static void get_info_refs(char *arg)

[PATCH v6 03/13] pkt-line: add packet_write_fmt_gently()

2016-08-25 Thread larsxschneider

From: Lars Schneider 

packet_write_fmt() would die in case of a write error even though for
some callers an error would be acceptable. Add packet_write_fmt_gently()
which writes a formatted pkt-line and returns `0` for success and `-1`
for an error.

Signed-off-by: Lars Schneider 
---
 pkt-line.c | 12 
 pkt-line.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index e8adc0f..3e8b2fb 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -137,6 +137,18 @@ void packet_write_fmt(int fd, const char *fmt, ...)
write_or_die(fd, buf.buf, buf.len);
 }
 
+int packet_write_fmt_gently(int fd, const char *fmt, ...)
+{
+   static struct strbuf buf = STRBUF_INIT;
+   va_list args;
+
+   strbuf_reset();
+   va_start(args, fmt);
+   format_packet(, fmt, args);
+   va_end(args);
+   return (write_in_full(fd, buf.buf, buf.len) == buf.len ? 0 : -1);
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 1902fb3..3caea77 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -23,6 +23,7 @@ void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format 
(printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 
 /*
  * Read a packetized line into the buffer, which must be at least size bytes
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 06/13] pkt-line: add functions to read/write flush terminated packet streams

2016-08-25 Thread larsxschneider

From: Lars Schneider 

packet_write_stream_with_flush_from_fd() and
packet_write_stream_with_flush_from_buf() write a stream of packets. All
content packets use the maximal packet size except for the last one.
After the last content packet a `flush` control packet is written.

packet_read_till_flush() reads arbitrary sized packets until it detects
a `flush` packet.

Signed-off-by: Lars Schneider 
---
 pkt-line.c | 91 ++
 pkt-line.h |  7 +
 2 files changed, 98 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 7e8a803..3033aa3 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -176,6 +176,47 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, 
...)
va_end(args);
 }
 
+int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out)
+{
+   int err = 0;
+   ssize_t bytes_to_write;
+
+   while (!err) {
+   bytes_to_write = xread(fd_in, packet_write_buffer, 
sizeof(packet_write_buffer) - 4);
+   if (bytes_to_write < 0)
+   return COPY_READ_ERROR;
+   if (bytes_to_write == 0)
+   break;
+   if (bytes_to_write > sizeof(packet_write_buffer) - 4)
+   return COPY_WRITE_ERROR;
+   err = packet_write_gently(fd_out, packet_write_buffer, 
bytes_to_write);
+   }
+   if (!err)
+   err = packet_flush_gently(fd_out);
+   return err;
+}
+
+int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, 
int fd_out)
+{
+   int err = 0;
+   size_t bytes_written = 0;
+   size_t bytes_to_write;
+
+   while (!err) {
+   if ((len - bytes_written) > sizeof(packet_write_buffer) - 4)
+   bytes_to_write = sizeof(packet_write_buffer) - 4;
+   else
+   bytes_to_write = len - bytes_written;
+   if (bytes_to_write == 0)
+   break;
+   err = packet_write_gently(fd_out, src_in + bytes_written, 
bytes_to_write);
+   bytes_written += bytes_to_write;
+   }
+   if (!err)
+   err = packet_flush_gently(fd_out);
+   return err;
+}
+
 static int get_packet_data(int fd, char **src_buf, size_t *src_size,
   void *dst, unsigned size, int options)
 {
@@ -285,3 +326,53 @@ char *packet_read_line_buf(char **src, size_t *src_len, 
int *dst_len)
 {
return packet_read_line_generic(-1, src, src_len, dst_len);
 }
+
+ssize_t packet_read_till_flush(int fd_in, struct strbuf *sb_out)
+{
+   int len, ret;
+   int options = PACKET_READ_GENTLE_ON_EOF;
+   char linelen[4];
+
+   size_t oldlen = sb_out->len;
+   size_t oldalloc = sb_out->alloc;
+
+   for (;;) {
+   /* Read packet header */
+   ret = get_packet_data(fd_in, NULL, NULL, linelen, 4, options);
+   if (ret < 0)
+   goto done;
+   len = packet_length(linelen);
+   if (len < 0)
+   die("protocol error: bad line length character: %.4s", 
linelen);
+   if (!len) {
+   /* Found a flush packet - Done! */
+   packet_trace("", 4, 0);
+   break;
+   }
+   len -= 4;
+
+   /* Read packet content */
+   strbuf_grow(sb_out, len);
+   ret = get_packet_data(fd_in, NULL, NULL, sb_out->buf + 
sb_out->len, len, options);
+   if (ret < 0)
+   goto done;
+
+   if (ret != len) {
+   error("protocol error: incomplete read (expected %d, 
got %d)", len, ret);
+   goto done;
+   }
+
+   packet_trace(sb_out->buf + sb_out->len, len, 0);
+   sb_out->len += len;
+   }
+
+done:
+   if (ret < 0) {
+   if (oldalloc == 0)
+   strbuf_release(sb_out);
+   else
+   strbuf_setlen(sb_out, oldlen);
+   return ret;  /* unexpected EOF */
+   }
+   return sb_out->len - oldlen;
+}
diff --git a/pkt-line.h b/pkt-line.h
index 3fa0899..9616117 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -25,6 +25,8 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+int packet_write_stream_with_flush_from_fd(int fd_in, int fd_out);
+int packet_write_stream_with_flush_from_buf(const char *src_in, size_t len, 
int fd_out);
 
 /*
  * Read a packetized line into the buffer, which must be at least size bytes
@@ -77,6 +79,11 @@ char *packet_read_line(int fd, int *size);
  */
 char

[PATCH v6 11/13] convert: make apply_filter() adhere to standard Git error handling

2016-08-25 Thread larsxschneider

From: Lars Schneider 

apply_filter() returns a boolean that tells the caller if it
"did convert or did not convert". The variable `ret` was used throughout
the function to track errors whereas `1` denoted success and `0`
failure. This is unusual for the Git source where `0` denotes success.

Rename the variable and flip its value to make the function easier
readable for Git developers.

Signed-off-by: Lars Schneider 
---
 convert.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index 522e2c5..bd17340 100644
--- a/convert.c
+++ b/convert.c
@@ -436,7 +436,7 @@ static int apply_filter(const char *path, const char *src, 
size_t len, int fd,
 *
 * (child --> cmd) --> us
 */
-   int ret = 1;
+   int err = 0;
struct strbuf nbuf = STRBUF_INIT;
struct async async;
struct filter_params params;
@@ -463,22 +463,22 @@ static int apply_filter(const char *path, const char 
*src, size_t len, int fd,
 
if (strbuf_read(, async.out, len) < 0) {
error("read from external filter '%s' failed", cmd);
-   ret = 0;
+   err = -1;
}
if (close(async.out)) {
error("read from external filter '%s' failed", cmd);
-   ret = 0;
+   err = -1;
}
if (finish_async()) {
error("external filter '%s' failed", cmd);
-   ret = 0;
+   err = -1;
}
 
-   if (ret) {
+   if (!err) {
strbuf_swap(dst, );
}
strbuf_release();
-   return ret;
+   return !err;
 }
 
 static struct convert_driver {
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 10/13] convert: generate large test files only once

2016-08-25 Thread larsxschneider

From: Lars Schneider 

Generate more interesting large test files with pseudo random characters
in between and reuse these test files in multiple tests. Run tests
formerly marked as EXPENSIVE every time but with a smaller data set.

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 48 ++--
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7b45136..34c8eb9 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -4,6 +4,15 @@ test_description='blob conversion via gitattributes'
 
 . ./test-lib.sh
 
+if test_have_prereq EXPENSIVE
+then
+   T0021_LARGE_FILE_SIZE=2048
+   T0021_LARGISH_FILE_SIZE=100
+else
+   T0021_LARGE_FILE_SIZE=30
+   T0021_LARGISH_FILE_SIZE=2
+fi
+
 cat test.i &&
git add test test.t test.i &&
rm -f test test.t test.i &&
-   git checkout -- test test.t test.i
+   git checkout -- test test.t test.i &&
+
+   mkdir generated-test-data &&
+   for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
+   do
+   RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )"
+   ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )"
+   # Generate 1MB of empty data and 100 bytes of random characters
+   # printf "$(test-genrandom start $i)"
+   printf "%1048576d" 1 >>generated-test-data/large.file &&
+   printf "$RANDOM_STRING" >>generated-test-data/large.file &&
+   printf "%1048576d" 1 >>generated-test-data/large.file.rot13 &&
+   printf "$ROT_RANDOM_STRING" 
>>generated-test-data/large.file.rot13 &&
+
+   if test $i = $T0021_LARGISH_FILE_SIZE
+   then
+   cat generated-test-data/large.file 
>generated-test-data/largish.file &&
+   cat generated-test-data/large.file.rot13 
>generated-test-data/largish.file.rot13
+   fi
+   done
 '
 
 script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -199,9 +227,9 @@ test_expect_success 'required filter clean failure' '
 test_expect_success 'filtering large input to small output should use little 
memory' '
test_config filter.devnull.clean "cat >/dev/null" &&
test_config filter.devnull.required true &&
-   for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
-   echo "30MB filter=devnull" >.gitattributes &&
-   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
+   cp generated-test-data/large.file large.file &&
+   echo "large.file filter=devnull" >.gitattributes &&
+   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add large.file
 '
 
 test_expect_success 'filter that does not read is fine' '
@@ -214,15 +242,15 @@ test_expect_success 'filter that does not read is fine' '
test_cmp expect actual
 '
 
-test_expect_success EXPENSIVE 'filter large file' '
+test_expect_success 'filter large file' '
test_config filter.largefile.smudge cat &&
test_config filter.largefile.clean cat &&
-   for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
-   echo "2GB filter=largefile" >.gitattributes &&
-   git add 2GB 2>err &&
+   echo "large.file filter=largefile" >.gitattributes &&
+   cp generated-test-data/large.file large.file &&
+   git add large.file 2>err &&
test_must_be_empty err &&
-   rm -f 2GB &&
-   git checkout -- 2GB 2>err &&
+   rm -f large.file &&
+   git checkout -- large.file 2>err &&
test_must_be_empty err
 '
 
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] Proposed questions for "Git User's Survey 2016"

2016-08-25 Thread Jakub Narębski

W dniu 20.08.2016 o 23:29, Eric Wong pisze:
> Jakub Narębski  wrote:
>>  Other version control systems
>>
>> 20. What other version control systems (SCM) do you use beside Git?
>>(multiple choice, with other)
>>
>> Explanation: "using" version control system here means using
>> it to actively contribute (propose changes or accept proposals),
>> and not only e.g. using it to download software.
>>
>> JN> Perhaps we should split it into two questions, one about
>> JN> centralized version control systems, one about distributed
>> JN> ones.
> 
> Perhaps there can be a question about use and interest of other
> decentralized/federated systems which could be potential
> collaboration tools or transports for git.
> e.g. ipfs, gpg, tor, diaspora, *coin, tent, xmpp, matrix, ...
> 
> And another about how they use email: webmail, GUI client,
> console client, phone app, none at all.

I am of two minds about those (and similar) questions.  One
one hand side, these are quite interesting (especially correlated
with other answers).  On the other hand, they are not about Git,
and we have large number of questions already - I'd prefer if
number of questions was below 50-60.

That said, those questions could be added as a separate section:

 Other tools

XX. How do you read and answer email (check all that apply)?
(multiple choice, possibly with other)

 + GUI client (e.g. Outlook, Thunderbird, Evolution, KMail)
 + console client (e.g. pine, alpine, mutt)
 + webmail or web client (e.g. GMail, Hotmail; HyperKitty)
 + phone app (e.g. K-9 Mail, Airmail, CloudMagic)
 + I don't use email

XX. Which of the decentralized/federated systems do you use
or are interested in?

JN> Have I missed some interesting and Git-relevant federated system?

 + IPFS
 + PGP / GPG
 + Tor
 + diaspora*
 + Bitcoin, Litecoin, Etherium, etc.
 + tent.io
 + XMPP / Jabber
 + OMEMO
 + Matrix.org
 + pump.io
 + other, please specify

There are a few other questions that we might want to ask if
such section is to be added to the Git User's Survey 2016:

XX. Which of IDEs and programmer's editors do you use [with Git]?
(multiple choice, with other)

JN> Have I missed some popular IDE or programmers editor?

 + Visual Studio
 + Eclipse
 + NetBeans
 + Xcode
 + IntelliJ IDEA / PhpStorm / WebStorm
 + KDevelop
 + Anjuta

 + Sublime Text
 + TextMate
 + Emacs
 + Vim
 + Atom
 + Brackets
 + Geany

 + other IDE or editor, please specify

XX. Which of the programming languages are you proficient with?
(multiple choices, with other)

JN> Based on TIOBE index from August 2016, Language Trends on GitHub
JN> 2015, GitHut (languages in GitHub), Stack Overflow Developer
JN> Survey 2016, and my own preferences; in no particular order

 + C
 + C++
 + C#
 + Java
 + VisualBasic.NET 
 + Objective-C

 + Python
 + Perl
 + PHP
 + JavaScript
 + Ruby
 + shell scripe

 + CSS, LESS, SASS etc.
 + HTML, HTML5
 + TeX, LaTeX, ConTeXt
 + SQL

 + Go
 + Rust
 + Swift
 + Scala
 + Haskell

-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] checkout: swap the order of ambiguity check for :/ syntax

2016-08-25 Thread Duy Nguyen

On Wed, Aug 24, 2016 at 11:35 PM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> It's not wonderful, but it's in line with how git-checkout stops caring
>> about ambiguity after the first argument can be resolved as a ref
>> (there's even a test for it, t2010.6).
>
> But that is justifiable because checkout can only ever take one
> revision.  What follows, if there are any, must be paths, and more
> importantly, it would be perfectly reasonable if some of them were
> missing in the working tree ("ow, I accidentally removed that file,
> I need to resurrect it from the index").  Does the same justification
> apply to this change?

I think there is a misunderstanding. My "after" is in "after the first
argument can be resolved, check if it exists in worktree too, if so
it's ambiguous and bail". This is usually how we detect ambiguation.
But git-checkout does not do the "check if it exists..." clause.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Feature Request: Branch-Aware Submodules

2016-08-25 Thread Hedges Alexander

Dear Git Developers,

First of all, thanks for this great project, it has made my life a lot easier
as a developer!

I'm writing this email to suggest some improvements to git submodules. In my
eyes how git handles submodules could be improved to be more intuitive to a
novice and require less manual management.


Right now updating a submodule in a topic branch and merging it into master
will not change the submodule index in master leading to at least two commit
for the same change (one in any active branch). This happened to me quite a few
times. To a newcomer this behavior is confusing and it leads to unnecessary
commits.


The proposed change would be to have a submodule either ignored or tracked by
the .gitmodules file.
If it is ignored, as for instance after a clone of the superproject, git simply
ignores all files in the submodule directory. The content of the gitmodules
file is then also not updated by git.
If it is not ignored, the .gitmodules is updated every time a commit happens in
the submodule. On branch switches the revision shown in the gitmodules from
that branch is checked out.
This change would have submodules conceptually behave more like files to the
superproject.


Like current behavior, git status would display whether the submodule has
uncommitted changes or is at a new commit. A repository is in a dirty state if
there are changes to the gitmodules file or any tracked submodule is in a dirty
state. Every time a commit happens in a submodule, the parents gitmodules is
updated. Uncommitted changes are not reflected in the parent's gitmodules file.

When the user manually edits the .gitmodules, git switches to that revision
after commit. But the user would have to stash or commit all uncommitted
changes in the submodule first.

When checking out a commit in a submodule, if there is currently a branch
pointing to that commit, HEAD could point to that branch instead (Is there a
case where that doesn't make sense? What about multiple branches pointing to
the commit?). It could also support branch names as references where the branch
(or tag) would be checked out instead.

With git submodule init you could have the submodule tracked. Using deinit
would put the submodule into the ignored state.

And while we're at it, it is quite some work to completely delete a submodule.
You have to manually remove all the associated files in the git repository
(StackOverflow lists 7 steps). Obviously it's not encouraged, as everything
that removes data without recovery method, but it should be possible.
git submodule rm --force could remove the repository and the associated nested
.git tree. git submodule rm could keep the .git directory but move it to another
location.

The behavior of git submodule sync and git submodule update would stay the same.


Migrating existing repositories to the new behavior should be quite straight
forward. Submodules that are not init'ed yet would be ignored. All others
behave accordingly to the new rules. Maybe a message with a note about the
changes could be displayed by the appropriate git-submodule commands or even by
git status.


An alternative considered was to have submodules decoupled stronger from the
superproject. That would mean having the .gitmodules only tracked by master and
leaving the other behaviors unchanged. For consistency one could do the same
thing for the .gitignore.

The drawback of this option are obviously no per branch submodules, if you want
to experiment with external libraries, topic branches would not be the place to
go. Also there would be a lot of intricacies that would have to be worked out.


I couldn't find any discussions on the initial implementation of git-submodule
or any previous proposals related to this in nature due to gmane being down
right now and the mailing list archives on the other sites are not great for
searching. So please excuse me if I'm bringing up already discussed stuff.

Until now I only worked on projects with few submodules. I expect the
proposed changes to have a larger effect on projects containing lots of
submodules. So it would be nice if maybe somebody with experience working on
projects with lots of submodules could weigh into the discussion.


Best Regards,
Alexander Hedges



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] for-each-ref: add %(upstream:gone) to mark missing refs

2016-08-25 Thread Øystein Walle

On 25 August 2016 at 07:56, Karthik Nayak  wrote:
>
> I'm thinking more on the lines of `%(upstream)` being an atom and the
> `:track` being an option under that atom. I like sub-atom though ;)
>

On second thought maybe "quark" is better :P

> On Thu, Aug 25, 2016 at 12:03 AM, Jeff King  wrote:
>>
>> Ah, right. I was feeling like this was all vaguely familiar. I think
>> it would be better to push forward kn/ref-filter-branch-list.
>> According to the last "what's cooking", I think that topic is waiting
>> on more review. If you're willing and able to do so, that would be a
>> big help.
>>
>
> It's been waiting for review for a _long_ time now.
>

To be perfectly honest my C skills and familiarity with the git source
code is not much to speak of. I very much want to take a close look but
I cannot promise anything worth your time...

But if I do find something I'd like to point out should I just reply
directly to the e-mails containing the patches as one usually does even
though they're months old at this point?

Øsse
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128

2016-08-25 Thread Michael Haggerty

On 08/24/2016 11:39 PM, Jeff King wrote:
> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
> 
>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>> is exposed to the OS as an NFS mount. We're using EFS to host the
>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>> to git fetch, we get this error:
>> $ git -c core.askpass=true fetch --tags --progress
>> g...@github.com:mediasilo/dodo.git
>> +refs/pull/*:refs/remotes/origin/pr/*
>> fatal: Reference directory conflict: refs/heads/
>> $ echo $? 128
>>
>> Has anyone seen anything like this before? Any tips on how to troubleshoot 
>> it?
> 
> No, I haven't seen it before. That's an internal assertion in the refs
> code that shouldn't ever happen. It looks like it happens when the loose
> refs end up with duplicate directory entries. While a bug in git is an
> obvious culprit, I wonder if it's possible that your filesystem might
> expose the same name twice in one set of readdir() results.
> 
> +cc Michael, who added this assertion long ago (and since this is the
> first report in all these years, it does make me suspect that the
> filesystem is a critical part of reproducing).

Thanks for the CC.

I've never heard of this problem before.

What Git version are you using?

I tried to provoke the problem by hand-corrupting the packed-refs file,
but wasn't successful.

So Peff's suggestion that the problem originates in your filesystem
seems to be to be the most likely cause. A quick Google search found,
for example,

https://bugzilla.redhat.com/show_bug.cgi?id=739222

http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs

though these reports seem connected with having lots of files in the
directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
more careful search, and it is easily possible that there are other bugs
in NFS (or EFS) that could be affecting you.

If this were repeatable, you could run Git under strace to test Peff's
hypothesis. But I suppose it only happens rarely, right?

Is it possible that multiple clients have the same NFS filesystem
mounted while Git is running? That would seem like an especially bad
idea and I could imagine it leading to problems like this.

It's surprising that you are seeing this problem in directory `refs`,
because (1) that directory is unlikely to have very many entries, and
(2) as far as I remember, Git will never delete the directories
`refs/heads` and `refs/tags`.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

92 matches

Mail list logo