Re: [PATCH] specify encoding for sed command

2018-04-12 Thread Matthew Coleman
>> I think the best way to move forward with this is a new patch that uses
>> `awk` instead of `sed`: I tested several `awk` variants and the command
>> was portable without requiring any changes to LANG or LC_ALL.
>> 
>> Does that sound like a good plan?
> 
> No ;)
> Could you please give the patch below a try?
Your patch works great and is clearly a way better approach than parsing the 
output of `set`.

> I'm just not sure whether we should burden this otherwise nice and short
> commit message with the details of this Powerline-macOS-Bash-sed issue...
It might be worth mentioning it briefly, in order to reduce the chances of a 
regression in the future. Maybe something like...
"In Bash we can do better: run the 'compgen -v __gitcomp_builtin_' builtin 
command, which lists the same variables, but without the risk of errors caused 
by poor Unicode handling in some shells' 'set' builtin command or the overhead 
of executing 'sed'."


Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Elijah Newren
On Thu, Apr 12, 2018 at 2:14 PM, Linus Torvalds
 wrote:
> So I just had an interesting experience that has happened before too,
> but this time I decided to try to figure out *why* it happened.
>

> include/linux/mm.h
>
> and yeah, that rather core header file causes pretty much everything
> to be re-built.
>
> Now, the reason it was marked as changed is that the xfs branch _had_
> in fact changed it, but the changes were already upstream and got
> merged away. But the file still got written out (with the same
> contents it had before the merge), and 'make' obviously only looks at
> modification time, so make rebuilt everything.


Wow, timing.  I've been looking at this independently -- it turns out
to be in the exact same code area involved in the breakage my recent
series caused.  The bug you are observing here will happen with
current git (going back about seven years, at which time it had
different bugs) whenever no rename is involved and the modifications
on the other branch are a subset of the changes made on HEAD's side.

My series (reverted yesterday morning) made this particular code break
in a new, totally different way.  This is a code path that would be
trivial to get right with Junio's suggested re-design of
merge-recursive[1], but the current design just makes this a bit of a
mess, thus resulting in various code changes over the years with
different breakages, each with its own curious flavor of
incorrectness.

Anyway, I'm writing a bunch of test cases, and will try to get a patch
(or patches) out soon.  Pretty busy this week, but I should definitely
have something out next week.

Elijah

[1] https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/


Re: [PATCH] specify encoding for sed command

2018-04-12 Thread SZEDER Gábor

> I did a little more digging into this issue today.

> The issue isn't actually caused by `sed`: it's caused by the way that
> the `set` builtin outputs characters in the Unicode Private Use Area
> (PUA) in the build of Bash 3.2.57 that macOS Sierra ships with.
> 
> Powerline uses several PUA code points to make some of its pretty text
> UI elements:
> 
> Code point (hex value): description
> U+E0A0 (0xEE82A0): Version control branch
> U+E0A1 (0xEE82A1): LN (line) symbol
> U+E0A2 (0xEE82A2): Closed padlock
> U+E0B0 (0xEE82B0): Rightwards black arrowhead
> U+E0B1 (0xEE82B1): Rightwards arrowhead
> U+E0B2 (0xEE82B2): Leftwards black arrowhead
> U+E0B3 (0xEE82B3): Leftwards arrowhead
> 
> macOS Bash 3.2.57's `set` builtin has garbled output where Powerline's
> special symbols should be in the PS1 variable, but Bash 4.4.19
> (installed on macOS via homebrew) and Bash 4.3.38 (Ubuntu 16.04) both
> display it correctly in the output of `set`. `echo $PS1` does display
> the symbols correctly on these versions of Bash.

Thanks for digging.

> I think the best way to move forward with this is a new patch that uses
> `awk` instead of `sed`: I tested several `awk` variants and the command
> was portable without requiring any changes to LANG or LC_ALL.
> 
> Does that sound like a good plan?

No ;)
Could you please give the patch below a try?

I intended it as a fix for a minor performance regression introduced in
the same commit that triggered this issue, but if it can work around
this issue, too, all the better.

I'm just not sure whether we should burden this otherwise nice and short
commit message with the details of this Powerline-macOS-Bash-sed issue...


  -- >8 --

Subject: [PATCH] completion: reduce overhead of clearing cached --options

To get the names of all '$__git_builtin_*' variables caching --options
of builtin commands in order to unset them, 8b0eaa41f2 (completion:
clear cached --options when sourcing the completion script,
2018-03-22) runs a 'set |sed s///' pipeline.  This works both in Bash
and in ZSH, but has a higher than necessasry overhead with the extra
processes.

In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
builtin command, which lists the same variables, but without a
pipeline and 'sed' it can do so with lower overhead.

Signed-off-by: SZEDER Gábor 

TODO: as an unexpected bonus, this might work around the issue with
'sed' and invalid UTF-8 characters in the environment as well, at least
in Bash.

  
https://public-inbox.org/git/0102016293c8dca7-6626fcde-548d-476e-b61f-c83ecdeedfe1-000...@eu-west-1.amazonses.com/
---
 contrib/completion/git-completion.bash | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index b09c8a2362..4ef59a51be 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -282,7 +282,11 @@ __gitcomp ()
 
 # Clear the variables caching builtins' options when (re-)sourcing
 # the completion script.
-unset $(set |sed -ne 
's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+if [[ -n ${ZSH_VERSION-} ]]; then
+   unset $(set |sed -ne 
's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+else
+   unset $(compgen -v __gitcomp_builtin_)
+fi
 
 # This function is equivalent to
 #
-- 
2.17.0.366.gbe216a3084



Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Linus Torvalds
[ Still talking to myself. Very soothing. ]

On Thu, Apr 12, 2018 at 4:55 PM, Linus Torvalds
 wrote:
> [ Talking to myself ]
>
> Did it perhaps mean to say
>
> path_renamed_outside_HEAD = path2 && !strcmp(path, path2);
>
> instead?

Probably not correct, but making that change makes my test-case work.

It probably breaks something important, though. I didn't look at why
that code happened in the first place.

But it's nice to know I'm at least looking at the right code while I'm
talking to myself.

   Linus


Re: [RFC 02/10] submodule: fix getting custom gitmodule file in fetch command

2018-04-12 Thread Stefan Beller
Hi Antonio,

the subject line could also be
  fetch: fix custom submodule location
as I was not expecting this patch from the subject line.

On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite  wrote:
> Import the default git configuration before accessing the gitmodules
> file.
>
> This is important when a value different from the default one is set via
> the 'core.submodulesfile' config.
>
> Signed-off-by: Antonio Ospite 
> ---
>  builtin/fetch.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index dcdfc66f0..d56636f74 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1428,8 +1428,8 @@ int cmd_fetch(int argc, const char **argv, const char 
> *prefix)
> for (i = 1; i < argc; i++)
> strbuf_addf(_rla, " %s", argv[i]);
>
> -   config_from_gitmodules(gitmodules_fetch_config, NULL);
> git_config(git_fetch_config, NULL);
> +   config_from_gitmodules(gitmodules_fetch_config, NULL);

Wouldn't this break the overwriting behavior?
e.g. you configure a submodule URL in .gitmodules and then override it
in .git/config,
then we'd currently first load config from the modules file and then override it
in git_config(git_fetch_config,..), whereas swapping them might have
unintended consequences? Do the tests pass with this patch?


Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Linus Torvalds
[ Talking to myself ]

On Thu, Apr 12, 2018 at 4:41 PM, Linus Torvalds
 wrote:
>
> Oddly, that *already* has the check:
>
> if (mfi.clean && !df_conflict_remains &&
> oid_eq(, a_oid) && mfi.mode == a_mode) {
> int path_renamed_outside_HEAD;
> output(o, 3, _("Skipped %s (merged same as existing)"), path);
>
> but that doesn't seem to actually trigger for some reason.

Actually, that check triggers just fine.

> But the code really seems to have the _intention_ of skipping the case
> where the result ended up the same as the source.
>
> Maybe I'm missing something.

The later check that does

/*
 * The content merge resulted in the same file contents we
 * already had.  We can return early if those file contents
 * are recorded at the correct path (which may not be true
 * if the merge involves a rename).
 */
path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
if (!path_renamed_outside_HEAD) {

will see that 'path2' is NULL, and not trigger this early out case,
and then this all falls back to the normal cases after all.

So I think that 'path_renamed_outside_HEAD' logic is somehow broken.

Did it perhaps mean to say

path_renamed_outside_HEAD = path2 && !strcmp(path, path2);

instead?

See commit 5b448b853 ("merge-recursive: When we detect we can skip an
update, actually skip it") which really implies we want to actually
skip it (but then we don't anyway).

Also see commit b2c8c0a76 ("merge-recursive: When we detect we can
skip an update, actually skip it") which was an earlier version, and
which *actually* skipped it, but it was reverted because of that
rename issue.

Adding Elijah Newren to the cc, because he was working on this back
then, an dis still around, and still working on merging ;)

   Linus


Re: [RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-12 Thread Stefan Beller
Hi Antonio,

On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite  wrote:
> When multiple repositories with detached work-trees take turns using the
> same directory as their work-tree, and more than one of them want to use
> submodules, there will be conflicts about the '.gitmodules' file.

unlike other files which would not conflict?
There might be file names such as LICENSE, Readme.md etc,
which are common enough that they would produce conflicts as well?
I find this argument on its own rather weak. ("Just delete everything in
the working dir before using it with another repository"). I might be
missing a crucial bit here?

> git hardcodes this path so it's not possible to override its location on
> a per-repository basis to allow such repositories to coexists
> peacefully.
>
> Make the path of the "gitmodules file" customizable exposing
> a 'core.submodulesFile' configuration setting.
>
> The default value will still be '.gitmodules' when 'core.submodulesFile'
> is not set.

ok.


> --- a/cache.h
> +++ b/cache.h
> @@ -1774,6 +1774,7 @@ extern void prepare_pager_args(struct child_process *, 
> const char *pager);
>  extern const char *editor_program;
>  extern const char *askpass_program;
>  extern const char *excludes_file;
> +extern const char *submodules_file;

Could you place this variable in repository.h in struct repository?
(Some developers currently try to move any global state to that place,
as that makes working with e.g. nested submodules easier in-process
and you would not need to spawn processes for submodules)

Once migrated to the repository struct mentioned above, you'd access
it via the_repository->submodules_file for the main repository.


> diff --git a/git-submodule.sh b/git-submodule.sh
> index 24914963c..610fd0dc5 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -71,7 +71,9 @@ get_submodule_config () {
> value=$(git config submodule."$name"."$option")
> if test -z "$value"
> then
> -   value=$(git config -f .gitmodules submodule."$name"."$option")
> +   gitmodules_file=$(git config core.submodulesfile)
> +   : ${gitmodules_file:=.gitmodules}
> +   value=$(git config -f "$gitmodules_file" 
> submodule."$name"."$option")

I wonder if it would be cheaper to write a special config lookup now, e.g.
in builtin/submodule--helper.c we could have a "config-from-gitmodules"
subcommand that is looking up the modules file and then running the config
on that file.

I am surprised how little access of the .gitmodules is left in git-submodule.sh
(which is partially ported to the builtin/submodule--helper.c)

> diff --git a/submodule-config.c b/submodule-config.c
> index 3f2075764..8a3396ade 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -468,7 +468,7 @@ static int gitmodule_oid_from_commit(const struct 
> object_id *treeish_name,
> return 1;
> }
>
> -   strbuf_addf(rev, "%s:.gitmodules", oid_to_hex(treeish_name));
> +   strbuf_addf(rev, "%s:%s", oid_to_hex(treeish_name), submodules_file);
> if (get_oid(rev->buf, gitmodules_oid) >= 0)
> ret = 1;
>
> @@ -583,7 +583,7 @@ void repo_read_gitmodules(struct repository *repo)
> if (repo_read_index(repo) < 0)
> return;
>
> -   gitmodules = repo_worktree_path(repo, GITMODULES_FILE);
> +   gitmodules = repo_worktree_path(repo, submodules_file);
>
> if (!is_gitmodules_unmerged(repo->index))
> git_config_from_file(gitmodules_cb, gitmodules, repo);
> diff --git a/submodule.c b/submodule.c
> index 9a50168b2..2afbdb644 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -36,13 +36,13 @@ static struct oid_array ref_tips_after_fetch;
>   */
>  int is_gitmodules_unmerged(const struct index_state *istate)
>  {
> -   int pos = index_name_pos(istate, GITMODULES_FILE, 
> strlen(GITMODULES_FILE));
> +   int pos = index_name_pos(istate, submodules_file, 
> strlen(submodules_file));

Ah, regarding the coverletter: This clearly assumes the modules
file is in the tree. So at least here we would make an exception
for files outside the tree to either not check for un-merged-ness or
disallow that case entirely.



There are quite a few functions in submodule.c which access the new global. :/
So moving them to the_repository should be fine, but eventually (not
in this series)
all these functions would would want to take a repository argument as well
such that they work on more than the_repository.

Thanks,
Stefan


Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Linus Torvalds
On Thu, Apr 12, 2018 at 4:35 PM, Linus Torvalds
 wrote:
>
> in process_entry(), and I think  we could just there add a test for if
> o_old,o_mod == a_oid,a_mode or something?

Actually, not process_entry, but merge_content().

Oddly, that *already* has the check:

if (mfi.clean && !df_conflict_remains &&
oid_eq(, a_oid) && mfi.mode == a_mode) {
int path_renamed_outside_HEAD;
output(o, 3, _("Skipped %s (merged same as existing)"), path);

but that doesn't seem to actually trigger for some reason.

But the code really seems to have the _intention_ of skipping the case
where the result ended up the same as the source.

Maybe I'm missing something.

 Linus


Re: [RFC 00/10] Make .the gitmodules file path configurable

2018-04-12 Thread Stefan Beller
Hi Antonio,

On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite  wrote:
> Hi,
>
> vcsh[1] uses bare git repositories and detached work-trees to manage
> *distinct* sets of configuration files directly into $HOME.
>
> In general, submodules have worked perfectly fine with detached
> work-trees for some time[2,3,4].
>
> However when multiple repositories take turns using the same directory
> as their work-tree, and more than one of them want to use submodules,
> there could still be conflicts about the '.gitmodules' file because git
> hardcodes this path.
>
> For comparison, in case of '.gitignore' a similar conflict might arise,
> but git has alternative ways to specify exclude files, so vcsh solves
> this by setting core.excludesFile for each repository and track ignored
> files somewhere else (in ~/.gitignore.d/$VCSH_REPO_NAME).
>
> This is currently not possible with submodules configuration.

So far I agree w.r.t. Gits capabilities.

> So this series proposes a mechanism to set an alternative path for the
> submodules configuration file (from now on "gitmodules file").

That sounds interesting, so let's read on.

> Patches are based on fe0a9eaf31dd0c349ae4308498c33a5c3794b293.

... which is the current master branch by Junio.

> In commit 4c0eeafe4755 (cache.h: add GITMODULES_FILE macro)[5] the
> gitmodules file path definition was centralized, AFAIU this was done
> mainly to prevent typos, as checking a symbolic constant is something
> the compiler will do for us.

+cc Brandon author of said patch.

Digging up the discussion, this was indeed only done to prevent typos.
https://public-inbox.org/git/20170802172633.ga36...@google.com/

> Expanding on that change the first patch in the series makes the path
> customizable exposing a 'core.submodulesFile' configuration setting.

I guess the similarity to core.ignoreFile is desired here. Although these
mechanisms are very different.

> The new configuration setting can be used to set an *alternative*
> location for the gitmodules file; IMHO there is no need to provide
> *additional* locations like in the case of exclude files.

I think there *may* be a need for additional files.
Currently there is only the .gitmodules file and the configuration
in .git/config overriding settings from .gitmodules.

There was some discussion on the mailing list in the past, which
presented a intermediate layer in between these two places, in
a special ref, such that:
base is in .gitmodules
overwritten via refs/meta/submodules:.gitmodules
overwritten via the .git/config

The intermediate would be a config file that is tracked on another
ref. This (a) decouples main project history from submodule history
and (b) makes it easier to distribute as it is part of the repository.

For example (a) is desired if you dig up an old project and the
submodules have all moved from one git hosting provider to another.
Another example would be when you fork a project with submodules
and don't want to mess with the main history but you just want to
adjust the submodule URLs. That is possible with such an intermediate
additional place.

For (b) you can imagine the fork that you want to distribute in your
community and you don't want to tell everyone to change the
submodule URLs, but instead you can provide them with a prepared
.gitmodules file, that they have to place into that special ref (which
can be done via fetching).

I digress as these ideas seem to be orthogonal to your patch series,
just FYI. prior discussion starting at:
https://public-inbox.org/git/1462317985-640-1-git-send-email-sbel...@google.com/
I recall there was a better discussion even prior to that, but have no
link handy.


> For instance vcsh could set the location to
> '~/.gitmodules.d/$VCSH_REPO_NAME' to avoid conflicts.
>
> Since the gitmodules file is meant to be checked in into the repository,
> the overridden file path should be relative to the work-tree; is there
> a way to enforce this constraint at run time (i.e. validate the config
> value), or is it enough to have it documented?

I think we'd want to check at run time, if we need this constraint.

Thanks,
Stefan


Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Linus Torvalds
On Thu, Apr 12, 2018 at 4:17 PM, Junio C Hamano  wrote:
>
> A bit of detour.  "Change in side branch happened to be a subset of
> the change in trunk and gets subsumed, but we end up writing the
> same result" happens also with the simpler resolve strategy.
>
> Here is a fix.

Heh. Except git-merge-one-file.sh is *only* used for that case, so
this doesn't fix the normal case.

I verified that the normal case situation is that
"update_file_flags()" thing that checks out the end result.

It's called by this case:

} else if (a_oid && b_oid) {
/* Case C: Added in both (check for same permissions) and */
/* case D: Modified in both, but differently. */
clean_merge = merge_content(o, path,
o_oid, o_mode, a_oid,
a_mode, b_oid, b_mode,
NULL);

in process_entry(), and I think  we could just there add a test for if
o_old,o_mod == a_oid,a_mode or something?

  Linus


Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Linus Torvalds
On Thu, Apr 12, 2018 at 2:46 PM, Junio C Hamano  wrote:
>
> Thanks for a clear description of the issue.  It does sound
> interesting.

I decided to show it with a simpler case that could be scripted and
doesn't need the kernel.

NOTE! This obviously doesn't happen for files with the trivial merge
cases (ie the ones that don't require any real merging at all).

There *is* a three-way merge going on, it's just that the end result
is the same as the original file.

Example script just appended here at the end.

NOTE: The script uses "ls -l --full-time", which afaik is a GNU ls
extension. So the script below is not portable.

 Linus

---

  # Create throw-away test repository
  mkdir merge-test
  cd merge-test
  git init

  # Create silly baseline file with 10 lines of numbers in it
  for i in $(seq 1 10); do echo $i; done > a
  git add a
  git commit -m"Original"

  # Make a branch that changes '9' to 'nine'
  git checkout -b branch
  sed -i 's/9/nine/' a
  git commit -m "Nine" a

  # On the master, change '2' to 'two' _and_ '9' to 'nine'
  git checkout master
  sed -i 's/9/nine/' a
  sed -i 's/2/two/' a
  git commit -m "Two and nine" a

  # sleep to show the time difference
  sleep 1

  # show the date on 'a' and do the merge
  ls -l --full-time a
  git merge -m "Merge contents" branch

  # The merge didn't change the contents, but did rewrite the file
  ls -l --full-time a


Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Junio C Hamano
Junio C Hamano  writes:

> Linus Torvalds  writes:
>
>> Now, the reason it was marked as changed is that the xfs branch _had_
>> in fact changed it, but the changes were already upstream and got
>> merged away. But the file still got written out (with the same
>> contents it had before the merge), and 'make' obviously only looks at
>> modification time, so make rebuilt everything.
>
> Thanks for a clear description of the issue.  It does sound
> interesting.

A bit of detour.  "Change in side branch happened to be a subset of
the change in trunk and gets subsumed, but we end up writing the
same result" happens also with the simpler resolve strategy.

Here is a fix.

 git-merge-one-file.sh | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 9879c59395..aa7392f7ff 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -137,11 +137,21 @@ case "${1:-.}${2:-.}${3:-.}" in
ret=1
fi
 
+   # Does the merge result happen to be identical to what we
+   # already have?  Then do not cause unnecessary recompilation.
+   # But don't bother the optimization if we need to chmod
+   if cmp -s "$4" "$src1" && test "$6" = "$7"
+   then
+   :; # happy
+   else
+
# Create the working tree file, using "our tree" version from the
# index, and then store the result of the merge.
git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
rm -f -- "$orig" "$src1" "$src2"
 
+   fi
+
if test "$6" != "$7"
then
if test -n "$msg"

   


[RFC 02/10] submodule: fix getting custom gitmodule file in fetch command

2018-04-12 Thread Antonio Ospite
Import the default git configuration before accessing the gitmodules
file.

This is important when a value different from the default one is set via
the 'core.submodulesfile' config.

Signed-off-by: Antonio Ospite 
---
 builtin/fetch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index dcdfc66f0..d56636f74 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1428,8 +1428,8 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
for (i = 1; i < argc; i++)
strbuf_addf(_rla, " %s", argv[i]);
 
-   config_from_gitmodules(gitmodules_fetch_config, NULL);
git_config(git_fetch_config, NULL);
+   config_from_gitmodules(gitmodules_fetch_config, NULL);
 
argc = parse_options(argc, argv, prefix,
 builtin_fetch_options, builtin_fetch_usage, 0);
-- 
2.17.0



[RFC 03/10] submodule: use the 'submodules_file' variable in output messages

2018-04-12 Thread Antonio Ospite
The gitmodules file path can be customized by setting the
'core.submodulesFile' config option.

Use the 'submodules_file' variable instead of the hardcoded
'.gitmodules' in output messages, to reflect the actual path of the
gitmodules file.

NOTE: the default git configuration has to be initialized in
'builtin/submodule--helper.c' to make the actual value of
'submodules_file', overridden by 'core.submodulesFile', available to the
helper command.

Signed-off-by: Antonio Ospite 
---
 builtin/mv.c|  3 ++-
 builtin/rm.c|  3 ++-
 builtin/submodule--helper.c | 18 ++
 submodule-config.c  |  4 ++--
 submodule.c | 14 +++---
 5 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 6d141f7a5..ec8f139c4 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -82,7 +82,8 @@ static void prepare_move_submodule(const char *src, int first,
if (!S_ISGITLINK(active_cache[first]->ce_mode))
die(_("Directory %s is in index and no submodule?"), src);
if (!is_staging_gitmodules_ok(_index))
-   die(_("Please stage your changes to .gitmodules or stash them 
to proceed"));
+   die(_("Please stage your changes to %s or stash them to 
proceed"),
+   submodules_file);
strbuf_addf(_dotgit, "%s/.git", src);
*submodule_gitfile = read_gitfile(submodule_dotgit.buf);
if (*submodule_gitfile)
diff --git a/builtin/rm.c b/builtin/rm.c
index 5b6fc7ee8..6fd015d86 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -286,7 +286,8 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
if (list.entry[list.nr++].is_submodule &&
!is_staging_gitmodules_ok(_index))
-   die (_("Please stage your changes to .gitmodules or 
stash them to proceed"));
+   die (_("Please stage your changes to %s or stash them 
to proceed"),
+submodules_file);
}
 
if (pathspec.nr) {
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index a404df3ea..72b95d27b 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -458,8 +458,8 @@ static void init_submodule(const char *path, const char 
*prefix,
sub = submodule_from_path(_oid, path);
 
if (!sub)
-   die(_("No url found for submodule path '%s' in .gitmodules"),
-   displaypath);
+   die(_("No url found for submodule path '%s' in %s"),
+   displaypath, submodules_file);
 
/*
 * NEEDSWORK: In a multi-working-tree world, this needs to be
@@ -481,8 +481,8 @@ static void init_submodule(const char *path, const char 
*prefix,
strbuf_addf(, "submodule.%s.url", sub->name);
if (git_config_get_string(sb.buf, )) {
if (!sub->url)
-   die(_("No url found for submodule path '%s' in 
.gitmodules"),
-   displaypath);
+   die(_("No url found for submodule path '%s' in %s"),
+   displaypath, submodules_file);
 
url = xstrdup(sub->url);
 
@@ -623,8 +623,8 @@ static void status_submodule(const char *path, const struct 
object_id *ce_oid,
int diff_files_result;
 
if (!submodule_from_path(_oid, path))
-   die(_("no submodule mapping found in .gitmodules for path 
'%s'"),
- path);
+   die(_("no submodule mapping found in %s for path '%s'"),
+ submodules_file, path);
 
displaypath = get_submodule_displaypath(path, prefix);
 
@@ -749,8 +749,8 @@ static int module_name(int argc, const char **argv, const 
char *prefix)
sub = submodule_from_path(_oid, argv[1]);
 
if (!sub)
-   die(_("no submodule mapping found in .gitmodules for path 
'%s'"),
-   argv[1]);
+   die(_("no submodule mapping found in %s for path '%s'"),
+   submodules_file, argv[1]);
 
printf("%s\n", sub->name);
 
@@ -1855,6 +1855,8 @@ int cmd_submodule__helper(int argc, const char **argv, 
const char *prefix)
if (argc < 2 || !strcmp(argv[1], "-h"))
usage("git submodule--helper ");
 
+   git_config(git_default_config, NULL);
+
for (i = 0; i < ARRAY_SIZE(commands); i++) {
if (!strcmp(argv[1], commands[i].cmd)) {
if (get_super_prefix() &&
diff --git a/submodule-config.c b/submodule-config.c
index 8a3396ade..620d522ee 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -347,9 +347,9 @@ static void warn_multiple_config(const unsigned char 
*treeish_name,
const char *commit_string = "WORKTREE";
if (treeish_name)
commit_string = 

[RFC 00/10] Make .the gitmodules file path configurable

2018-04-12 Thread Antonio Ospite
Hi,

vcsh[1] uses bare git repositories and detached work-trees to manage
*distinct* sets of configuration files directly into $HOME.

In general, submodules have worked perfectly fine with detached
work-trees for some time[2,3,4].

However when multiple repositories take turns using the same directory
as their work-tree, and more than one of them want to use submodules,
there could still be conflicts about the '.gitmodules' file because git
hardcodes this path.

For comparison, in case of '.gitignore' a similar conflict might arise,
but git has alternative ways to specify exclude files, so vcsh solves
this by setting core.excludesFile for each repository and track ignored
files somewhere else (in ~/.gitignore.d/$VCSH_REPO_NAME).

This is currently not possible with submodules configuration.

So this series proposes a mechanism to set an alternative path for the
submodules configuration file (from now on "gitmodules file").

Patches are based on fe0a9eaf31dd0c349ae4308498c33a5c3794b293.

In commit 4c0eeafe4755 (cache.h: add GITMODULES_FILE macro)[5] the
gitmodules file path definition was centralized, AFAIU this was done
mainly to prevent typos, as checking a symbolic constant is something
the compiler will do for us.

Expanding on that change the first patch in the series makes the path
customizable exposing a 'core.submodulesFile' configuration setting.

The new configuration setting can be used to set an *alternative*
location for the gitmodules file; IMHO there is no need to provide
*additional* locations like in the case of exclude files.

For instance vcsh could set the location to
'~/.gitmodules.d/$VCSH_REPO_NAME' to avoid conflicts.

Since the gitmodules file is meant to be checked in into the repository,
the overridden file path should be relative to the work-tree; is there
a way to enforce this constraint at run time (i.e. validate the config
value), or is it enough to have it documented?

The last patch adds a hacky 'test-custom-gitmodules-file.sh' script
which patches the test suite to fix all the hardcoded occurrences of
'.gitmodules' and runs it twice: once with the default value and once
with a custom path, passed via an environmental variable.

I guess in the final version just testing for a custom path (e.g.
'.gitmodules_custom') could be enough, as the default value can be seen
as a particular case.

The last thing I noticed is that git does not create config files
automatically if they are under a subdirectory which does not exist
already; for instance the following command would fail:

  git config -f newsubdir/test-config user.name Antonio

This means that if the user wanted to use something like:

  git -c core.submodulesFile=.gitmodules.d/repo_submodules ...

the subdirectory would have to be created beforehand. This is not a big
deal of course, but I was wondering if this is mentioned anywhere.
Fixing the current behavior to create the leading directories does not
seem hard, but I am not sure it is worth it.

Thanks,
   Antonio

[1] https://github.com/RichiH/vcsh
[2] 
http://git.661346.n2.nabble.com/git-submodule-vs-GIT-WORK-TREE-td7562165.html
[3] 
http://git.661346.n2.nabble.com/PATCH-Solve-git-submodule-issues-with-detached-work-trees-td7563377.html
[4] https://github.com/git/git/commit/be8779f7ac9a3be9aa783df008d59082f4054f67
[5] https://github.com/git/git/commit/4c0eeafe4755345b0f4636bf09904cf689703e11

Antonio Ospite (10):
  submodule: add 'core.submodulesFile' to override the '.gitmodules'
path
  submodule: fix getting custom gitmodule file in fetch command
  submodule: use the 'submodules_file' variable in output messages
  submodule: document 'core.submodulesFile' and fix references to
'.gitmodules'
  submodule: adjust references to '.gitmodules' in comments
  completion: add 'core.submodulesfile' to the git-completion.bash file
  XXX: wrap-for-bin.sh: set 'core.submodulesFile' for each git
invocation
  XXX: submodule: fix t1300-repo-config.sh to take into account the new
config
  XXX: submodule: pass custom gitmodules file to 'test-tool
submodule-config'
  XXX: add a hacky script to test the changes with a patched test suite

 Documentation/config.txt  | 18 +++--
 Documentation/git-add.txt |  4 +-
 Documentation/git-submodule.txt   | 45 +--
 Documentation/gitmodules.txt  | 15 ++--
 Documentation/gitsubmodules.txt   | 18 ++---
 .../technical/api-submodule-config.txt|  6 +-
 Makefile  |  3 +-
 builtin/fetch.c   |  2 +-
 builtin/mv.c  |  3 +-
 builtin/rm.c  |  3 +-
 builtin/submodule--helper.c   | 20 ++---
 cache.h   |  1 +
 config.c  | 13 ++--
 config.h  |  7 +-
 contrib/completion/git-completion.bash|  1 

[RFC 04/10] submodule: document 'core.submodulesFile' and fix references to '.gitmodules'

2018-04-12 Thread Antonio Ospite
The gitmodules file location can be overridden by setting the
'core.submodulesFile' config option.

Document this new option and reflect the new behavior in the
documentation by using the more generic formula "gitmodules file"
instead of the hardcoded '.gitmodules' file path.

Signed-off-by: Antonio Ospite 
---
 Documentation/config.txt  | 18 +---
 Documentation/git-add.txt |  4 +-
 Documentation/git-submodule.txt   | 45 ++-
 Documentation/gitmodules.txt  | 15 ---
 Documentation/gitsubmodules.txt   | 18 
 .../technical/api-submodule-config.txt|  6 +--
 contrib/subtree/git-subtree.txt   |  2 +-
 7 files changed, 59 insertions(+), 49 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb..647e33646 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -742,6 +742,12 @@ core.excludesFile::
If `$XDG_CONFIG_HOME` is either not set or empty, 
`$HOME/.config/git/ignore`
is used instead. See linkgit:gitignore[5].
 
+core.submodulesFile::
+   Specifies the pathname to the file that contains the submodules
+   configuration. The path must be relative to the repository work tree.
+   The specified path replaces the default per-repository '.gitmodules'
+   file. See linkgit:gitmodules[5].
+
 core.askPass::
Some commands (e.g. svn and http interfaces) that interactively
ask for a password can be told to use an external program given
@@ -3170,7 +3176,7 @@ stash.showStat::
See description of 'show' command in linkgit:git-stash[1].
 
 submodule..url::
-   The URL for a submodule. This variable is copied from the .gitmodules
+   The URL for a submodule. This variable is copied from the gitmodules
file to the git config via 'git submodule init'. The user can change
the configured URL before obtaining the submodule via 'git submodule
update'. If neither submodule..active or submodule.active are
@@ -3191,7 +3197,7 @@ submodule..update::
 submodule..branch::
The remote branch name for a submodule, used by `git submodule
update --remote`.  Set this option to override the value found in
-   the `.gitmodules` file.  See linkgit:git-submodule[1] and
+   the gitmodules file.  See linkgit:git-submodule[1] and
linkgit:gitmodules[5] for details.
 
 submodule..fetchRecurseSubmodules::
@@ -3212,10 +3218,10 @@ submodule..ignore::
let submodules with modified tracked files in their work tree show up.
Using "none" (the default when this option is not set) also shows
submodules that have untracked files in their work tree as changed.
-   This setting overrides any setting made in .gitmodules for this 
submodule,
-   both settings can be overridden on the command line by using the
-   "--ignore-submodules" option. The 'git submodule' commands are not
-   affected by this setting.
+   This setting overrides any setting made in the gitmodules file for
+   this submodule, both settings can be overridden on the command line by
+   using the "--ignore-submodules" option. The 'git submodule' commands
+   are not affected by this setting.
 
 submodule..active::
Boolean value indicating if the submodule is of interest to git
diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index d50fa339d..8add7e7c0 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -171,8 +171,8 @@ for "git add --no-all ...", i.e. ignored removed 
files.
 --no-warn-embedded-repo::
By default, `git add` will warn when adding an embedded
repository to the index without using `git submodule add` to
-   create an entry in `.gitmodules`. This option will suppress the
-   warning (e.g., if you are manually performing operations on
+   create an entry in the gitmodules file. This option will suppress
+   the warning (e.g., if you are manually performing operations on
submodules).
 
 --renormalize::
diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 71c5618e8..6bd8e3d44 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -58,13 +58,13 @@ for commit without cloning. The  is also used as the 
submodule's
 logical name in its configuration entries unless `--name` is used
 to specify a logical name.
 +
-The given URL is recorded into `.gitmodules` for use by subsequent users
-cloning the superproject. If the URL is given relative to the
+The given URL is recorded in the gitmodules file for use by subsequent
+users cloning the superproject. If the URL is given relative to the
 superproject's repository, the presumption is the superproject and
 submodule repositories will be kept together in the same relative
 location, and only the superproject's URL needs to be 

[RFC 06/10] completion: add 'core.submodulesfile' to the git-completion.bash file

2018-04-12 Thread Antonio Ospite
Signed-off-by: Antonio Ospite 
---
 contrib/completion/git-completion.bash | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index a75707394..19beb6735 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2382,6 +2382,7 @@ _git_config ()
core.sparseCheckout
core.splitIndex
core.sshCommand
+   core.submodulesfile
core.symlinks
core.trustctime
core.untrackedCache
-- 
2.17.0



[RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-12 Thread Antonio Ospite
When multiple repositories with detached work-trees take turns using the
same directory as their work-tree, and more than one of them want to use
submodules, there will be conflicts about the '.gitmodules' file.

git hardcodes this path so it's not possible to override its location on
a per-repository basis to allow such repositories to coexists
peacefully.

Make the path of the "gitmodules file" customizable exposing
a 'core.submodulesFile' configuration setting.

The default value will still be '.gitmodules' when 'core.submodulesFile'
is not set.

Signed-off-by: Antonio Ospite 
---
 cache.h|  1 +
 config.c   |  6 +-
 environment.c  |  1 +
 git-submodule.sh   | 14 +-
 submodule-config.c |  4 ++--
 submodule.c| 20 ++--
 unpack-trees.c |  2 +-
 7 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/cache.h b/cache.h
index bbaf5c349..f4dd8bc00 100644
--- a/cache.h
+++ b/cache.h
@@ -1774,6 +1774,7 @@ extern void prepare_pager_args(struct child_process *, 
const char *pager);
 extern const char *editor_program;
 extern const char *askpass_program;
 extern const char *excludes_file;
+extern const char *submodules_file;
 
 /* base85 */
 int decode_85(char *dst, const char *line, int linelen);
diff --git a/config.c b/config.c
index c698988f5..6ffb1d501 100644
--- a/config.c
+++ b/config.c
@@ -1199,6 +1199,10 @@ static int git_default_core_config(const char *var, 
const char *value)
if (!strcmp(var, "core.excludesfile"))
return git_config_pathname(_file, var, value);
 
+   if (!strcmp(var, "core.submodulesfile")) {
+   return git_config_pathname(_file, var, value);
+   }
+
if (!strcmp(var, "core.whitespace")) {
if (!value)
return config_error_nonbool(var);
@@ -2084,7 +2088,7 @@ int git_config_get_pathname(const char *key, const char 
**dest)
 void config_from_gitmodules(config_fn_t fn, void *data)
 {
if (the_repository->worktree) {
-   char *file = repo_worktree_path(the_repository, 
GITMODULES_FILE);
+   char *file = repo_worktree_path(the_repository, 
submodules_file);
git_config_from_file(fn, file, data);
free(file);
}
diff --git a/environment.c b/environment.c
index 39b3d906c..69cae6d68 100644
--- a/environment.c
+++ b/environment.c
@@ -49,6 +49,7 @@ int pager_use_color = 1;
 const char *editor_program;
 const char *askpass_program;
 const char *excludes_file;
+const char *submodules_file = GITMODULES_FILE;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 int check_replace_refs = 1;
 char *git_replace_ref_base;
diff --git a/git-submodule.sh b/git-submodule.sh
index 24914963c..610fd0dc5 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -71,7 +71,9 @@ get_submodule_config () {
value=$(git config submodule."$name"."$option")
if test -z "$value"
then
-   value=$(git config -f .gitmodules submodule."$name"."$option")
+   gitmodules_file=$(git config core.submodulesfile)
+   : ${gitmodules_file:=.gitmodules}
+   value=$(git config -f "$gitmodules_file" 
submodule."$name"."$option")
fi
printf '%s' "${value:-$default}"
 }
@@ -271,13 +273,15 @@ or you are unsure what this means choose another name 
with the '--name' option."
git add --no-warn-embedded-repo $force "$sm_path" ||
die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
 
-   git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
-   git config -f .gitmodules submodule."$sm_name".url "$repo" &&
+   gitmodules_file=$(git config core.submodulesfile)
+   : ${gitmodules_file:=.gitmodules}
+   git config -f "$gitmodules_file" submodule."$sm_name".path "$sm_path" &&
+   git config -f "$gitmodules_file" submodule."$sm_name".url "$repo" &&
if test -n "$branch"
then
-   git config -f .gitmodules submodule."$sm_name".branch "$branch"
+   git config -f "$gitmodules_file" submodule."$sm_name".branch 
"$branch"
fi &&
-   git add --force .gitmodules ||
+   git add --force "$gitmodules_file" ||
die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
 
# NEEDSWORK: In a multi-working-tree world, this needs to be
diff --git a/submodule-config.c b/submodule-config.c
index 3f2075764..8a3396ade 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -468,7 +468,7 @@ static int gitmodule_oid_from_commit(const struct object_id 
*treeish_name,
return 1;
}
 
-   strbuf_addf(rev, "%s:.gitmodules", oid_to_hex(treeish_name));
+   strbuf_addf(rev, "%s:%s", oid_to_hex(treeish_name), submodules_file);
if (get_oid(rev->buf, gitmodules_oid) >= 0)
ret = 1;
 
@@ -583,7 +583,7 @@ void repo_read_gitmodules(struct repository *repo)
if 

[RFC 05/10] submodule: adjust references to '.gitmodules' in comments

2018-04-12 Thread Antonio Ospite
In the comments refer to a more generic "gitmodules file" instead of
using the '.gitmodules' file name directly.

Signed-off-by: Antonio Ospite 
---
 builtin/submodule--helper.c |  2 +-
 config.c|  7 +++
 config.h|  7 +++
 git-submodule.sh| 10 +-
 repository.h|  2 +-
 submodule-config.c  |  8 
 submodule-config.h  |  2 +-
 submodule.c | 24 
 8 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 72b95d27b..7e8fa26c9 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -476,7 +476,7 @@ static void init_submodule(const char *path, const char 
*prefix,
/*
 * Copy url setting when it is not set yet.
 * To look up the url in .git/config, we must not fall back to
-* .gitmodules, so look it up directly.
+* the gitmodules file, so look it up directly.
 */
strbuf_addf(, "submodule.%s.url", sub->name);
if (git_config_get_string(sb.buf, )) {
diff --git a/config.c b/config.c
index 6ffb1d501..9921b6c2c 100644
--- a/config.c
+++ b/config.c
@@ -2079,11 +2079,10 @@ int git_config_get_pathname(const char *key, const char 
**dest)
 
 /*
  * Note: This function exists solely to maintain backward compatibility with
- * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
- * NOT be used anywhere else.
+ * 'fetch' and 'update_clone' storing configuration in the gitmodules file and
+ * should NOT be used anywhere else.
  *
- * Runs the provided config function on the '.gitmodules' file found in the
- * working directory.
+ * Runs the provided config function on the gitmodules file.
  */
 void config_from_gitmodules(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index ef70a9cac..37aef35d5 100644
--- a/config.h
+++ b/config.h
@@ -191,11 +191,10 @@ extern int repo_config_get_pathname(struct repository 
*repo,
 
 /*
  * Note: This function exists solely to maintain backward compatibility with
- * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
- * NOT be used anywhere else.
+ * 'fetch' and 'update_clone' storing configuration in the gitmodules file and
+ * should NOT be used anywhere else.
  *
- * Runs the provided config function on the '.gitmodules' file found in the
- * working directory.
+ * Runs the provided config function on the gitmodules file.
  */
 extern void config_from_gitmodules(config_fn_t fn, void *data);
 
diff --git a/git-submodule.sh b/git-submodule.sh
index 610fd0dc5..615a65be9 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -59,10 +59,10 @@ die_if_unmatched ()
 # $3 = default value
 #
 # Checks in the usual git-config places first (for overrides),
-# otherwise it falls back on .gitmodules.  This allows you to
-# distribute project-wide defaults in .gitmodules, while still
+# otherwise it falls back to the gitmodules file.  This allows you to
+# distribute project-wide defaults in the gitmodules file, while still
 # customizing individual repositories if necessary.  If the option is
-# not in .gitmodules either, print a default value.
+# not in the gitmodules file either, print a default value.
 #
 get_submodule_config () {
name="$1"
@@ -95,7 +95,7 @@ sanitize_submodule_env()
 }
 
 #
-# Add a new submodule to the working tree, .gitmodules and the index
+# Add a new submodule to the working tree, the gitmodules file and the index
 #
 # $@ = repo path
 #
@@ -960,7 +960,7 @@ cmd_status()
 #
 # Sync remote urls for submodules
 # This makes the value for remote.$remote.url match the value
-# specified in .gitmodules.
+# specified in the gitmodules file.
 #
 cmd_sync()
 {
diff --git a/repository.h b/repository.h
index 09df94a47..2db62af0e 100644
--- a/repository.h
+++ b/repository.h
@@ -59,7 +59,7 @@ struct repository {
 */
struct config_set *config;
 
-   /* Repository's submodule config as defined by '.gitmodules' */
+   /* Repository's submodule config as defined by the gitmodules file */
struct submodule_cache *submodule_cache;
 
/*
diff --git a/submodule-config.c b/submodule-config.c
index 620d522ee..18984473b 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -9,7 +9,7 @@
 /*
  * submodule cache lookup structure
  * There is one shared set of 'struct submodule' entries which can be
- * looked up by their sha1 blob id of the .gitmodules file and either
+ * looked up by their sha1 blob id of the gitmodules file and either
  * using path or name as key.
  * for_path stores submodule entries with path as key
  * for_name stores submodule entries with name as key
@@ -91,7 +91,7 @@ static void submodule_cache_clear(struct submodule_cache 
*cache)
/*
 * We iterate over the name hash here to be symmetric with the
 * allocation of struct 

Re: [PATCH] specify encoding for sed command

2018-04-12 Thread Matthew Coleman
I did a little more digging into this issue today.

> On Apr 11, 2018, at 4:42 PM, Matt Coleman  wrote:
> 
> I found another (possibly better) way to fix this:
> 
>> On Apr 10, 2018, at 3:18 AM, Matt Coleman  wrote:
>> 
>>> 1) What platform OS / version / sed version is this on?
>> I'm experiencing this on macOS Sierra (10.12.6). The issue occurs with the 
>> OS's native sed, which is FreeBSD sed so the version number is kind of 
>> ambiguous.
>> 
>> The error goes away if I set LANG=C or LC_ALL=C or change it to use GNU sed 
>> (installed via homebrew as gsed).
> 
> If I change it to use awk instead of sed, it works with mawk, gawk, and macOS 
> awk:
> unset $(set | awk -F '=' '/^__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*=/ 
> {print $1}') 2>/dev/null
> 
> I compared sed vs. awk on Linux and Mac and they all take about the same 
> amount of time to run (within 0.002ms).

The issue isn't actually caused by `sed`: it's caused by the way that the `set` 
builtin outputs characters in the Unicode Private Use Area (PUA) in the build 
of Bash 3.2.57 that macOS Sierra ships with.

Powerline uses several PUA code points to make some of its pretty text UI 
elements:

Code point (hex value): description
U+E0A0 (0xEE82A0): Version control branch
U+E0A1 (0xEE82A1): LN (line) symbol
U+E0A2 (0xEE82A2): Closed padlock
U+E0B0 (0xEE82B0): Rightwards black arrowhead
U+E0B1 (0xEE82B1): Rightwards arrowhead
U+E0B2 (0xEE82B2): Leftwards black arrowhead
U+E0B3 (0xEE82B3): Leftwards arrowhead

macOS Bash 3.2.57's `set` builtin has garbled output where Powerline's special 
symbols should be in the PS1 variable, but Bash 4.4.19 (installed on macOS via 
homebrew) and Bash 4.3.38 (Ubuntu 16.04) both display it correctly in the 
output of `set`. `echo $PS1` does display the symbols correctly on these 
versions of Bash.

So...

> 3) There's other invocations of "sed" in the file, aren't those affected as 
> well?
The short answer: no. Slightly longer: not by the same thing that's affecting 
the line in the patch, at least.

Long: As described above, the problem isn't actually `sed`: it's the `set` 
builtin in macOS's build of Bash. The other invocations of `sed` should be 
safe, because `sed` properly handles the PUA code points on its own:

$ echo $'\xee\x82\xb0' | sed 's/./@/'
@

The way that `set` is displaying the PS1 variable seems to be sending them to 
`sed` individually or somehow split up:

$ for character in $'\xee' $'\x82' $'\xb0' $'\xee\x82' $'\x82\xb0'; do echo 
$character | sed 's/./@/'; done
sed: RE error: illegal byte sequence
sed: RE error: illegal byte sequence
sed: RE error: illegal byte sequence
sed: RE error: illegal byte sequence
sed: RE error: illegal byte sequence

Interestingly, Bash 3.2.25's `set` builtin on CentOS 5 correctly displays the 
octal values for the symbols (prompt edited to keep this email ASCII):

$ PS1=$'\xee\x82\xb0 '
>  set | grep PS1
PS1=$'\356\202\260 '

I haven't started digging through Bash's codebase, but it could either be a bug 
that was introduced between 3.2.25 and 3.2.57 or Apple used some silly flags 
when compiling Bash. Ideally, this should be fixed in Bash, but since Apple's 
using such an old version of Bash for license reasons, I think it's unlikely 
that they'll fix the issue.

I think the best way to move forward with this is a new patch that uses `awk` 
instead of `sed`: I tested several `awk` variants and the command was portable 
without requiring any changes to LANG or LC_ALL.

Does that sound like a good plan?


git blame completion add completion?

2018-04-12 Thread Jacob Keller
Hi,

I use git blame sometimes to blame against specific revisions, i.e.
"git blame  -- " and it would be nice if blame could
figure out how to complete the  or revision reference..

I tried to take a stab at adding support for this, but I couldn't
figure out which completion function to use...

I want "git blame " to complete files like normal, (as this is
the most general use case), but to have "git blame xyz" to
complete revisions if there's any that match, (while still also
completing files). Additionally, it should only complete a revision
for the first argument, and go back to strict file names after.

In an *ideal* world, it'd be nice if it could figure out say:

git blame  xyz

and complete xyz as a file found in branch, not actually checking
local files but the ls-files of that branch instead. (i.e. it should
be able to complete a file that was deleted in the current checkout
but exists on that ref).

any ideas on where to start with that?

Thanks,
Jake


Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-12 Thread Jacob Keller
On Thu, Apr 12, 2018 at 3:02 PM, Johannes Schindelin
 wrote:
> Hi Jake,
>
> On Thu, 12 Apr 2018, Jacob Keller wrote:
>
>> On Wed, Apr 11, 2018 at 10:42 PM, Sergey Organov  wrote:
>> >
>> > Jacob Keller  writes:
>> >> On Wed, Apr 11, 2018 at 6:13 AM, Sergey Organov  
>> >> wrote:
>> >>> It was rather --recreate-merges just a few weeks ago, and I've seen
>> >>> nobody actually commented either in favor or against the
>> >>> --rebase-merges.
>> >>>
>> >>> git rebase --rebase-merges
>> >>
>> >> I'm going to jump in here and say that *I* prefer --rebase-merges, as
>> >> it clearly mentions merge commits (which is the thing that changes).
>> >
>> > OK, thanks, it's fair and the first argument in favor of
>> > --rebase-merges I see.
>>
>> I'd be ok with "--keep-merges" also.
>
> My main argument against --keep-merges is that there is no good short
> option for it: -k and -m are already taken. And I really like my `git
> rebase -kir` now...
>

Right, that's unfortunate.

> A minor argument in favor of `--rebase-merges` vs `--keep-merges` is that
> we do not really keep the merge commits, we rewrite them. In the version
> as per this here patch series, we really create recursive merges from
> scratch.

I also don't have a strong opinion in regards to --keep vs --rebase..

>
> In the later patch series on which I am working, we use a variation of
> Phillip's strategy which can be construed as a generalization of the
> cherry-pick to include merges: for a cherry-pick, we perform a 3-way merge
> between the commit and HEAD, with the commit's parent commit as merge
> base. With Phillip's strategy, we perform a 3-way merge between the merge
> commit and HEAD (i.e. the rebased first parent), with the merge commit's
> first parent as merge base, followed by a 3-way merge with the rebased
> 2nd parent (with the original 2nd parent as merge base), etc
>
> However. This strategy, while it performed well in my initial tests (and
> in Buga's initial tests, too), *does* involve more than one 3-way merge,
> and therefore it risks something very, very nasty: *nested* merge
> conflicts.

Yea, it could. Finding an elegant solution around this would be ideal!
(By elegant, I mean a solution which produces merge conflicts that
users can resolve relatively easily).

>
> Now, I did see nested merge conflicts in the past, very rarely, but that
> can happen, when two developers criss-cross merge each others' `master`
> branch and are really happy to perform invasive changes that our merge
> does not deal well with, such as indentation changes.
>
> When rebasing a merge conflict, however, such nested conflicts can happen
> relatively easily. Not rare at all.

Right. This would be true regardless of what strategy we choose, I think.

>
> I found out about this by doing what I keep preaching in this thred:
> theory is often very nice *right* until the point where it hits reality,
> and then frequently turns really useless, real quickly. Theoretical
> musings can therefore be an utter waste of time, unless accompanied by
> concrete examples.

Agreed.

>
> To start, I built on the example for an "evil merge" that I gave already
> in the very beginning of this insanely chatty thread: if one branch
> changes the signature of a function, and a second branch adds a caller to
> that function, then by necessity a merge between those two branches has to
> change the caller to accommodate the signature change. Otherwise it would
> end up in a broken state.
>
> In my `sequencer-shears` branch at https://github.com/dscho/git, I added
> this as a test case, where I start out with a main.c containing a single
> function called core(). I then create one branch where this function is
> renamed to hi(), and another branch where the function caller() is added
> that calls core(). Then I merge both, amending the merge commit so that
> caller() now calls hi(). So this is the main.c after merging:
>
> int hi(void) {
> printf("Hello, world!\n");
> }
> /* caller */
> void caller(void) {
> hi();
> }
>
> To create the kind of problems that are all too common in my daily work
> (seemingly every time some stable patch in Git for Windows gets
> upstreamed, it changes into an incompatible version, causing merge
> conflicts, and sometimes not only that... but I digress...), I then added
> an "upstream" where some maintainer decided that core() is better called
> greeting(), and also that a placeholder function for an event loop should
> be added. So in upstream, main.c looks like this:
>
> int greeting(void) {
> printf("Hello, world!\n");
> }
> /* main event loop */
> void event_loop(void) {
> /* TODO: place holder for now */
> }
>
> Keep in mind: while this is a minimal example of disagreeing changes that
> may look unrealistic, in 

Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-12 Thread Johannes Schindelin
Hi Jake,

On Thu, 12 Apr 2018, Jacob Keller wrote:

> On Wed, Apr 11, 2018 at 10:42 PM, Sergey Organov  wrote:
> >
> > Jacob Keller  writes:
> >> On Wed, Apr 11, 2018 at 6:13 AM, Sergey Organov  wrote:
> >>> It was rather --recreate-merges just a few weeks ago, and I've seen
> >>> nobody actually commented either in favor or against the
> >>> --rebase-merges.
> >>>
> >>> git rebase --rebase-merges
> >>
> >> I'm going to jump in here and say that *I* prefer --rebase-merges, as
> >> it clearly mentions merge commits (which is the thing that changes).
> >
> > OK, thanks, it's fair and the first argument in favor of
> > --rebase-merges I see.
> 
> I'd be ok with "--keep-merges" also.

My main argument against --keep-merges is that there is no good short
option for it: -k and -m are already taken. And I really like my `git
rebase -kir` now...

A minor argument in favor of `--rebase-merges` vs `--keep-merges` is that
we do not really keep the merge commits, we rewrite them. In the version
as per this here patch series, we really create recursive merges from
scratch.

In the later patch series on which I am working, we use a variation of
Phillip's strategy which can be construed as a generalization of the
cherry-pick to include merges: for a cherry-pick, we perform a 3-way merge
between the commit and HEAD, with the commit's parent commit as merge
base. With Phillip's strategy, we perform a 3-way merge between the merge
commit and HEAD (i.e. the rebased first parent), with the merge commit's
first parent as merge base, followed by a 3-way merge with the rebased
2nd parent (with the original 2nd parent as merge base), etc

However. This strategy, while it performed well in my initial tests (and
in Buga's initial tests, too), *does* involve more than one 3-way merge,
and therefore it risks something very, very nasty: *nested* merge
conflicts.

Now, I did see nested merge conflicts in the past, very rarely, but that
can happen, when two developers criss-cross merge each others' `master`
branch and are really happy to perform invasive changes that our merge
does not deal well with, such as indentation changes.

When rebasing a merge conflict, however, such nested conflicts can happen
relatively easily. Not rare at all.

I found out about this by doing what I keep preaching in this thred:
theory is often very nice *right* until the point where it hits reality,
and then frequently turns really useless, real quickly. Theoretical
musings can therefore be an utter waste of time, unless accompanied by
concrete examples.

To start, I built on the example for an "evil merge" that I gave already
in the very beginning of this insanely chatty thread: if one branch
changes the signature of a function, and a second branch adds a caller to
that function, then by necessity a merge between those two branches has to
change the caller to accommodate the signature change. Otherwise it would
end up in a broken state.

In my `sequencer-shears` branch at https://github.com/dscho/git, I added
this as a test case, where I start out with a main.c containing a single
function called core(). I then create one branch where this function is
renamed to hi(), and another branch where the function caller() is added
that calls core(). Then I merge both, amending the merge commit so that
caller() now calls hi(). So this is the main.c after merging:

int hi(void) {
printf("Hello, world!\n");
}
/* caller */
void caller(void) {
hi();
}

To create the kind of problems that are all too common in my daily work
(seemingly every time some stable patch in Git for Windows gets
upstreamed, it changes into an incompatible version, causing merge
conflicts, and sometimes not only that... but I digress...), I then added
an "upstream" where some maintainer decided that core() is better called
greeting(), and also that a placeholder function for an event loop should
be added. So in upstream, main.c looks like this:

int greeting(void) {
printf("Hello, world!\n");
}
/* main event loop */
void event_loop(void) {
/* TODO: place holder for now */
}

Keep in mind: while this is a minimal example of disagreeing changes that
may look unrealistic, in practice this is the exact type of problem I am
dealing with on a daily basis, in Git for Windows as well as in GVFS Git
(which adds a thicket of branches on top of Git for Windows) and with the
MSYS2 runtime (where Git for Windows stacks patches on top of MSYs2, which
in turn maintains their set of patches on top of the Cygwin runtime), and
with BusyBox, and probably other projects I forgot spontaneously. This
makes me convinced that this is the exact type of problem that will
challenge whatever --rebase-merges has to deal with, or better put: what
the user of --rebase-merges will have to deal with.

(If I got a penny for 

Re: Optimizing writes to unchanged files during merges?

2018-04-12 Thread Junio C Hamano
Linus Torvalds  writes:

> Now, the reason it was marked as changed is that the xfs branch _had_
> in fact changed it, but the changes were already upstream and got
> merged away. But the file still got written out (with the same
> contents it had before the merge), and 'make' obviously only looks at
> modification time, so make rebuilt everything.

Thanks for a clear description of the issue.  It does sound
interesting.


[PATCH 1/2] daemon: use timeout for uninterruptible poll

2018-04-12 Thread Kim Gybels
The poll provided in compat/poll.c is not interrupted by receiving
SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

Signed-off-by: Kim Gybels 
---
 daemon.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/daemon.c b/daemon.c
index fe833ea7de..6dc95c1b2f 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
 {
struct pollfd *pfd;
int i;
+   int poll_timeout = -1;
 
pfd = xcalloc(socklist->nr, sizeof(struct pollfd));
 
@@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
int i;
 
check_dead_children();
-
-   if (poll(pfd, socklist->nr, -1) < 0) {
+#ifdef NO_POLL
+   poll_timeout = live_children ? 100 : -1;
+#endif
+   int ret = poll(pfd, socklist->nr, poll_timeout);
+   if  (ret == 0) {
+   continue;
+   } else if (ret < 0) {
if (errno != EINTR) {
logerror("Poll failed, resuming: %s",
  strerror(errno));
-- 
2.17.0.windows.1



[PATCH 0/2] Fix early EOF with GfW daemon

2018-04-12 Thread Kim Gybels
It has been reported [1] that cloning from a Git-for-Windows daemon would
sometimes fail with an early EOF error:

  $ git clone git://server/test
  Cloning into 'test'...
  remote: Counting objects: 36, done.
  remote: Compressing objects: 100% (24/24), done.
  fatal: read error: Invalid argument
  fatal: early EOF
  fatal: index-pack failed

These patches solve the issue by only changing git-daemon, its child processes
can remain unaware that stdin/stdout are actually network connections.

[1] https://github.com/git-for-windows/git/issues/304

Kim Gybels (2):
  daemon: use timeout for uninterruptible poll
  daemon: graceful shutdown of client connection

 daemon.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

-- 
2.17.0.windows.1



[PATCH 2/2] daemon: graceful shutdown of client connection

2018-04-12 Thread Kim Gybels
On Windows, a connection is shutdown when the last open handle to it is
closed. When that last open handle is stdout of our child process, an
abortive shutdown is triggered when said process exits. Ensure a
graceful shutdown of the client connection by keeping an open handle
until we detect our child process has finished. This allows all the data
to be sent to the client, instead of being discarded.

Fixes https://github.com/git-for-windows/git/issues/304

Signed-off-by: Kim Gybels 
---
 daemon.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/daemon.c b/daemon.c
index 6dc95c1b2f..97fadd62d1 100644
--- a/daemon.c
+++ b/daemon.c
@@ -834,9 +834,10 @@ static struct child {
struct child *next;
struct child_process cld;
struct sockaddr_storage address;
+   int connection;
 } *firstborn;
 
-static void add_child(struct child_process *cld, struct sockaddr *addr, 
socklen_t addrlen)
+static void add_child(struct child_process *cld, struct sockaddr *addr, 
socklen_t addrlen, int connection)
 {
struct child *newborn, **cradle;
 
@@ -844,6 +845,7 @@ static void add_child(struct child_process *cld, struct 
sockaddr *addr, socklen_
live_children++;
memcpy(>cld, cld, sizeof(*cld));
memcpy(>address, addr, addrlen);
+   newborn->connection = connection;
for (cradle =  *cradle; cradle = &(*cradle)->next)
if (!addrcmp(&(*cradle)->address, >address))
break;
@@ -888,6 +890,7 @@ static void check_dead_children(void)
*cradle = blanket->next;
live_children--;
child_process_clear(>cld);
+   close(blanket->connection);
free(blanket);
} else
cradle = >next;
@@ -928,13 +931,13 @@ static void handle(int incoming, struct sockaddr *addr, 
socklen_t addrlen)
}
 
cld.argv = cld_argv.argv;
-   cld.in = incoming;
+   cld.in = dup(incoming);
cld.out = dup(incoming);
 
if (start_command())
logerror("unable to fork");
else
-   add_child(, addr, addrlen);
+   add_child(, addr, addrlen, incoming);
 }
 
 static void child_handler(int signo)
-- 
2.17.0.windows.1



Re: File versioning based on shallow Git repositories?

2018-04-12 Thread Hallvard Breien Furuseth

On 12. april 2018 23:07, Rafael Ascensao wrote:

Would initiating a repo with a empty root commit, tag it with 'base' then
use $ git rebase --onto base master@{30 days ago} master;
be viable?


No... my question was confused from the beginning.  With such large files
I _shouldn't_ have history (or grafts), otherwise Git spends a lot of CPU
time creating diffs when I look at a commit, or worse, when I try git log.
Which I discovered quickly when trying real data instead of test-data:-)

Ævar's suggestion was exactly right in that respect.  Thanks again!

--
Hallvard


Optimizing writes to unchanged files during merges?

2018-04-12 Thread Linus Torvalds
So I just had an interesting experience that has happened before too,
but this time I decided to try to figure out *why* it happened.

I'm obviously in the latter part of the kernel merge window, and
things are slowly calming down. I do the second XFS merge during this
window, and it brings in updates just to the fs/xfs/ subdirectory, so
I expect that my test build for the full kernel configuration should
be about a minute.

Instead of recompiles pretty much *everything*, and the test build
takes 22 minutes.

This happens occasionally, and I blame gremlins. But this time I
decided to look at what the gremlins actually *are*.

The diff that git shows for the pull was very clear: only fs/xfs/
changes. But doing

  ls -tr $(git ls-files '*.[chS]') | tail -10

gave the real reason: in between all the fs/xfs/xyz files was this:

include/linux/mm.h

and yeah, that rather core header file causes pretty much everything
to be re-built.

Now, the reason it was marked as changed is that the xfs branch _had_
in fact changed it, but the changes were already upstream and got
merged away. But the file still got written out (with the same
contents it had before the merge), and 'make' obviously only looks at
modification time, so make rebuilt everything.

Now, because it's still the merge window, I don't have much time to
look into this, but I was hoping somebody in git land would like to
give it a quick look. I'm sure I'm not the only one to have ever been
hit by this, and I don't think the kernel is the only project to hit
it either.

Because it would be lovely if the merging logic would just notice "oh,
that file doesn't change", and not even write out the end result.

For testing, the merge that triggered this git introspection is kernel
commit 80aa76bcd364 ("Merge tag 'xfs-4.17-merge-4' of
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux"), which can act as a
test-case. It's a clean merge, so no kernel knowledge necessary: just
re-merge the parents and see if the modification time of
include/linux/mm.h changes.

I'm guessing some hack in update_file_flags() to say "if the new
object matches the old one, don't do anything" might work. Although I
didn't look if we perhaps end up writing to the working tree copy
earlier already.

Looking at the blame output of that function, most of it is really
old, so this "problem" goes back a long long time.

Just to clarify: this is absolutely not a correctness issue. It's not
even a *git* performance issue. It's literally just a "not updating
the working tree when things don't change results in better behavior
for other tools".

So if somebody gets interested in this problem, that would be great.
And if not, I'll hopefully remember to come back to this next week
when the merge window is over, and take a second look.

 Linus


Re: File versioning based on shallow Git repositories?

2018-04-12 Thread Rafael Ascensao
Would initiating a repo with a empty root commit, tag it with 'base' then

use $ git rebase --onto base master@{30 days ago} master;

be viable?

The --orphan & tag is perhaps more robust, since it's "harder" to move
tags around.

--
Rafael Ascensão


Re: File versioning based on shallow Git repositories?

2018-04-12 Thread Ævar Arnfjörð Bjarmason

On Thu, Apr 12 2018, Hallvard Breien Furuseth wrote:

> On 12. april 2018 20:47, Ævar Arnfjörð Bjarmason wrote:
>> 1. Create a backup.git repo
>> 2. Each time you make a backup, checkout a new orphan branch, see "git
>> checkout --orphan"
>> 3. You copy the files over, commit them, "git log" at this point shows
>> one commit no matter if you've done this before.
>> 4. You create a tag for this backup, e.g. one named after the current
>> time, delete the branch.
>> 5. You then have a retention period for the tags, e.g. only keep the
>> last 30 tags if you do daily backups for 30 days of backups.
>>
>> Then as soon as you delete the tags the old commit will be unreferenced,
>> and you can make git-gc delete the data.
>
> Nice!
> Why the tags though, instead of branches named after the current time?

Because tags are idiomatic in git for a reference that doesn't change,
but sure, if you'd like branches that'll work too.

> One --orphan branch/tag per day with several commits would work for me.
>
> Also maybe it'll be worthwhile to generate .git/info/grafts in a local
> clone of the repo to get back easily visible history.  No grafts in
> the original repo, grafts mess things up.

Maybe, I have not tried this with grafts.


Re: File versioning based on shallow Git repositories?

2018-04-12 Thread Hallvard Breien Furuseth

On 12. april 2018 20:47, Ævar Arnfjörð Bjarmason wrote:

1. Create a backup.git repo
2. Each time you make a backup, checkout a new orphan branch, see "git
checkout --orphan"
3. You copy the files over, commit them, "git log" at this point shows
one commit no matter if you've done this before.
4. You create a tag for this backup, e.g. one named after the current
time, delete the branch.
5. You then have a retention period for the tags, e.g. only keep the
last 30 tags if you do daily backups for 30 days of backups.

Then as soon as you delete the tags the old commit will be unreferenced,
and you can make git-gc delete the data.


Nice!
Why the tags though, instead of branches named after the current time?

One --orphan branch/tag per day with several commits would work for me.

Also maybe it'll be worthwhile to generate .git/info/grafts in a local
clone of the repo to get back easily visible history.  No grafts in
the original repo, grafts mess things up.

--
Hallvard


Re: fixup! [PATCH 1/6] doc: fix formatting inconsistency in githooks.txt

2018-04-12 Thread Martin Ågren
On 11 April 2018 at 23:08, Andreas Heiduk  wrote:
> - reflow some paragraphs
> ---
>  Documentation/githooks.txt | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)

I have reviewed the resulting githooks.txt. See the diff below for two
more instances that I found. For the second hunk, I have difficulties
parsing that paragraph, but I still claim those should be backticks and
*git* read-tree...

Martin

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index ab5ce80e13..e3c283a174 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -80,7 +80,7 @@ This hook is invoked by linkgit:git-am[1].  It takes
no parameter,
 and is invoked after the patch is applied and a commit is made.

 This hook is meant primarily for notification, and cannot affect
-the outcome of 'git am'.
+the outcome of `git am`.

 pre-commit
 ~~
@@ -400,8 +400,8 @@ when the tip of the current branch is updated to
the new commit, and
 exit with a zero status.

 For example, the hook can simply run `git read-tree -u -m HEAD "$1"`
-in order to emulate 'git fetch' that is run in the reverse direction
-with `git push`, as the two-tree form of `read-tree -u -m` is
+in order to emulate `git fetch` that is run in the reverse direction
+with `git push`, as the two-tree form of `git read-tree -u -m` is
 essentially the same as `git checkout` that switches branches while
 keeping the local changes in the working tree that do not interfere
 with the difference between the branches.


Re: File versioning based on shallow Git repositories?

2018-04-12 Thread Ævar Arnfjörð Bjarmason

On Thu, Apr 12 2018, Hallvard Breien Furuseth wrote:

> Can I use a shallow Git repo for file versioning, and regularly purge
> history older than e.g. 2 weeks?  Purged data MUST NOT be recoverable.
>
> Or is there a backup tool based on shallow Git cloning which does this?
> Push/pull to another shallow repo would be nice but is not required.
> The files are text files up to 1/4 Gb, usually with few changes.
>
>
> If using Git - I see "git fetch --depth" can shorten history now.
> How do I do that without 'fetch', in the origin repo?
> Also Documentation/technical/shallow.txt describes some caveats, I'm
> not sure how relevant they are.
>
> To purge old data -
>   git config core.logallrefupdates false
>   git gc --prune=now --aggressive
> Anything else?
>
> I'm guessing that without --aggressive, some expired info might be
> deduced from studying the packing of the remaining objects.  Don't
> know if we'll be required to be that paranoid.

The shallow feature is not for this use-case, but there's a much easier
solution that I've used for exactly this use-case, e.g. taking backups
of SQL dumps that delta-compress well, and then throwing out old
backups.

You:

1. Create a backup.git repo
2. Each time you make a backup, checkout a new orphan branch, see "git
   checkout --orphan"
3. You copy the files over, commit them, "git log" at this point shows
   one commit no matter if you've done this before.
4. You create a tag for this backup, e.g. one named after the current
   time, delete the branch.
5. You then have a retention period for the tags, e.g. only keep the
   last 30 tags if you do daily backups for 30 days of backups.

Then as soon as you delete the tags the old commit will be unreferenced,
and you can make git-gc delete the data.

You'll still be able to `git diff` between tags, even though they have
unrelated histories, and the files will still delta-compress.


Re: [PATCH v6 15/15] rebase -i --rebase-merges: add a section to the man page

2018-04-12 Thread Jacob Keller
On Thu, Apr 12, 2018 at 2:30 AM, Johannes Schindelin
 wrote:
>> I think it would be worthwhile to point out that unlike the other commands
>> this will not preserve untracked files. Maybe something like
>> "Note that unlike the `pick` or `merge` commands or initial checkout when the
>> rebase starts the `reset` command will overwrite any untracked files."
>
> You know what? You just pointed out a bug in my thinking. Previously, I
> thought that this is impossible, that you cannot overwrite untracked files
> because we labeled this revision previously, so the only new files to
> write by `reset` were tracked files previous. But that forgets `exec` and
> `reset` with unlabeled revisions (e.g. for cousins).
>
> So I changed the `reset` command to refuse overwriting untracked files...
>
> Thank you for improving this patch series!
> Dscho

Good catch! This could possibly have bitten someone badly.


File versioning based on shallow Git repositories?

2018-04-12 Thread Hallvard Breien Furuseth
Can I use a shallow Git repo for file versioning, and regularly purge
history older than e.g. 2 weeks?  Purged data MUST NOT be recoverable.

Or is there a backup tool based on shallow Git cloning which does this?
Push/pull to another shallow repo would be nice but is not required.
The files are text files up to 1/4 Gb, usually with few changes. 


If using Git - I see "git fetch --depth" can shorten history now.
How do I do that without 'fetch', in the origin repo?
Also Documentation/technical/shallow.txt describes some caveats, I'm
not sure how relevant they are.

To purge old data -
  git config core.logallrefupdates false
  git gc --prune=now --aggressive
Anything else?

I'm guessing that without --aggressive, some expired info might be
deduced from studying the packing of the remaining objects.  Don't
know if we'll be required to be that paranoid.

-- 
Hallvard


Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-12 Thread Jacob Keller
On Wed, Apr 11, 2018 at 10:42 PM, Sergey Organov  wrote:
> Hi Jacob,
>
> Jacob Keller  writes:
>> On Wed, Apr 11, 2018 at 6:13 AM, Sergey Organov  wrote:
>>> It was rather --recreate-merges just a few weeks ago, and I've seen
>>> nobody actually commented either in favor or against the
>>> --rebase-merges.
>>>
>>> git rebase --rebase-merges
>>>
>>
>> I'm going to jump in here and say that *I* prefer --rebase-merges, as
>> it clearly mentions merge commits (which is the thing that changes).
>
> OK, thanks, it's fair and the first argument in favor of --rebase-merges
> I see.
>

I'd be ok with "--keep-merges" also. I don't like the idea of
"flatten" as it, to me, means that anyone who wants to understand the
option without prior knowledge must immediately read the man page or
they will be confused. Something like "--rebase-merges" at least my
coworkers got it instantly. The same could be said for "--keep-merges"
too, but so far no one I asked said the immediately understood
"--no-flatten".

Thanks,
Jake


Re: Great Investment Offer

2018-04-12 Thread Gagum Melvin Sikze Kakha
Hello

In my search for a business partner i got your contact in google 
search. My client is willing to invest $10 Million to $500 
million but my client said he need a trusted partner who he can 
have a meeting at the point of releasing his funds. 

I told my client that you have a good profile with your company 
which i got details about you on my search on google lookup. Can 
we trust you. 

Can we make a plan for a long term business relationship.

Please reply. 

Regards,
Gagum Melvin Sikze Kakha
Tel: 703197576


git gui issue

2018-04-12 Thread Isaac Kaplan
I'm having a bug with git gui. Please help me find the right place to report it.


version stats:
$ git gui version
git-gui version 0.21.GITGUI

$ git version
git version 2.16.2

how the issue is produced:
using 
  Git Gui
preferences
  Additional Diff Parameters   set to --ignore-all-space
save, exit and relanch gui. 

select an unstaged change where the only differences are ignored based on 
--ignore-all-space
one example of such a change is adding 4 space to an existing blank line
this gives the misleading message "No differences detected."   and triggers a 
rescan. This results in losing the place in the unstaged changes list and a 
rescan that does not change the list.

Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-12 Thread Johannes Schindelin
Hi Sergey,

On Thu, 12 Apr 2018, Sergey Organov wrote:

> Johannes Schindelin  writes:
>
> > On Wed, 11 Apr 2018, Sergey Organov wrote:
> >
> >> The RFC v2 and Phillip's strategy are essentially the same, as has been
> >> already shown multiple times, both theoretically and by testing.
> >
> > No, they are not.
> 
> It's off-topic here.

Well, you directed the discussion there. So there.

> If you _really_ want to discuss it further [...]

I am always interested in a constructive discussion toward the goal of
making Git better, to improve its user experience, to give users more
powerful options, and to make things easier to use.

I'll let you know when I detect a change in this discussion in that vague
direction.

> Abrupt change of the topic of discussion indicates your intention to
> take attention off the apparent ugliness of 
> 
> git rebase --rebase-merges

If you want to discuss ugly things in Git, that is really an abrupt
diversion, but I would not fault you: there is plenty of that in Git.

As to `git rebase --rebase-merges`? I do not actually find that really
ugly. I find that it says what I want it to say. And after how many people
agreed, I find it rather pointless and time-wasting to discuss this
further. Naming is hard, and you seem to have a knack for coming up with
names that are really terrible. That is why I stopped discussing this with
you.

> I also get it as an indication that there are no more arguments in favor
> of --rebase-merges on your side, at least for now.

You seem to misinterpret your own arguments against --rebase-merges to be
anywhere in the realm of convincing. They are not.

Did I say "flatten history" to you in this discussion? Sure I did. We also
talked about Darcs. About the theory of patches. About the inner workings
of recursive merges. About commit graphs. And topologies. And we threw
around many terms that we know people understand who are deep into the
inner workings of merges and cherry-picks.

Does this mean that we should expose all the terms we used in this
technical discussion to the user interface?

No, it does not. We should not absolutely not do that.

So it is not at all a convincing argument to say "but you said XYZ". *In
this mail thread*. Which is necessarily full of technical lingo.

Also, I am still waiting for something tangible from your side. Something
non-theoretic. Something practical. Something like taking that FAKE_INIT
example at heart, studying it, deducing from it what weaknesses we cannot
tolerate in strategies to "cherry-pick merge commits" or "forward-port
merges" or "re-apply amendments in merge commits" or whatever you want to
call it.

Your suggestions so far are heavily biased by your own preferences, based
in theoretical musings, not in practical examples. I do not see any focus
on the Git user base at large. "What? They don't know what a topology is?"
is a question I could see you asking.

There has been a lot of talk in this mail thread, and the only actual
outcome I see is my own work, and Buga's tireless efforts to test
approaches for their practicality. There is zilch concrete testing from
your side. No implementation of anything. No demonstration what kinds of
merge conflicts are produced, how often they would have to be resolved by
the user. None.

The important thing to keep in mind is that all my efforts here are spent
in order to come up with a feature in Git that empowers users. And I want
this feature to be as usable as possible. And I want it to use as simple
language and option names as possible. That is what I will keep focusing
on, like it or not.

Ciao,
Johannes


Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Tiago Botelho
On Thu, Apr 12, 2018 at 12:53 PM, Johannes Schindelin
 wrote:
> Hi Tiago,
>
> On Thu, 12 Apr 2018, Tiago Botelho wrote:
>
>> On Thu, Apr 12, 2018 at 9:58 AM, Christian Couder
>>  wrote:
>> > On Thu, Apr 12, 2018 at 9:49 AM, Harald Nordgren
>> >  wrote:
>> >> I think it looks similar. But if I'm reading that thread correctly
>> >> then there was never a patch created, right?
>> >
>> > (It is customary on this mailing list to reply after the sentences we
>> > reply to. We don't "top post".)
>> >
>> > On the GSoC idea pages (like https://git.github.io/SoC-2018-Ideas/) we
>> > have been suggesting "Implement git bisect --first-parent" and there
>> > are the following related links:
>> >
>> > https://public-inbox.org/git/2015030405.ga9...@peff.net/
>> > https://public-inbox.org/git/4d3cddf9.6080...@intel.com/
>> >
>> > Tiago in Cc also tried at a recent London hackathon to implement it
>> > and came up with the following:
>> >
>> > https://github.com/tiagonbotelho/git/pull/1/files
>> >
>> > I tried to help him by reworking his commit in the following branch:
>> >
>> > https://github.com/chriscool/git/commits/myfirstparent
>>
>> Thank you for the cc Christian, I’ve been quite busy and was not able
>> to work on the PR for quite some time.
>>
>> I intended to pick it back up again next week. If it is ok with Harald
>> I would love to finish the PR that I started,
>> since it is quite close to being finished (I think it was just specs
>> missing if I am not mistaken).
>
> That looks promising. Just like I suggested to Harald in another reply
> [*1*] on this thread, you probably want to use `int flags` instead, and
> turn `find_all` into BISECT_FIND_ALL in a preparatory commit.
>
> Also, you will definitely want to add a test. Again, my reply to Harald
> [*1*] should give you a head start there... You will want to imitate the
> test case I outlined there, maybe something like:
>
> # A - B - C - F
> #   \   \   /   \
> # D - E - G - H
>
> [... 'setup' as in my mail to Harald ...]
>
> test_expect_success '--first-parent' '
> write_script find-e.sh <<-\EOF &&
> case "$(git show --format=%s -s)" in
> B|C|F) ;; # first parent lineage: okay
> *) git show -s --oneline HEAD >unexpected;;
> esac
> # detect whether we are "before" or "after" E
> test ! -f E.t
> EOF
>
> git bisect start --first-parent HEAD A &&
> git bisect run ./find-e.sh >actual &&
> test_path_is_missing unexpected &&
> grep "$(git rev-parse F) is the first bad commit" actual
> '
>
> Also, Tiago, reading through your patch (as on chriscool/git; do you have
> your own fork? That would make it much easier to collaborate with you by
> offering PRs), it looks more straight-forward than editing the commit_list
> after the fact and adding magic weights ;-)
>
> Except for one thing. I wonder why `bisect_next_all()` does not set
> revs.first_parent_only after calling `bisect_rev_setup()`? You would still
> need the changes in `count_distance()`, as it performs its own commit
> graph traversal, but there is no need to enumerate too many commits in the
> first place, right?
>
> Harald, maybe --merges-only can be implemented on top of --first-parent,
> with the `int flags` change I suggested?
>
> Ciao,
> Johannes
>
> Footnote *1*:
> https://public-inbox.org/git/nycvar.qro.7.76.6.1804121143120...@zvavag-6oxh6da.rhebcr.pbec.zvpebfbsg.pbz/

Hi Johannes,

Thank you for the pointers!

I do have my own fork, you can see it here:
https://github.com/tiagonbotelho/git/pull/1/commits
which applies the changes Chris made to my first draft of the solution.

I still have to add him as co-author of that commit.

Cheers,
Tiago Botelho


Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Johannes Schindelin
Hi Tiago,

On Thu, 12 Apr 2018, Tiago Botelho wrote:

> On Thu, Apr 12, 2018 at 9:58 AM, Christian Couder
>  wrote:
> > On Thu, Apr 12, 2018 at 9:49 AM, Harald Nordgren
> >  wrote:
> >> I think it looks similar. But if I'm reading that thread correctly
> >> then there was never a patch created, right?
> >
> > (It is customary on this mailing list to reply after the sentences we
> > reply to. We don't "top post".)
> >
> > On the GSoC idea pages (like https://git.github.io/SoC-2018-Ideas/) we
> > have been suggesting "Implement git bisect --first-parent" and there
> > are the following related links:
> >
> > https://public-inbox.org/git/2015030405.ga9...@peff.net/
> > https://public-inbox.org/git/4d3cddf9.6080...@intel.com/
> >
> > Tiago in Cc also tried at a recent London hackathon to implement it
> > and came up with the following:
> >
> > https://github.com/tiagonbotelho/git/pull/1/files
> >
> > I tried to help him by reworking his commit in the following branch:
> >
> > https://github.com/chriscool/git/commits/myfirstparent
> 
> Thank you for the cc Christian, I’ve been quite busy and was not able
> to work on the PR for quite some time.
> 
> I intended to pick it back up again next week. If it is ok with Harald
> I would love to finish the PR that I started,
> since it is quite close to being finished (I think it was just specs
> missing if I am not mistaken).

That looks promising. Just like I suggested to Harald in another reply
[*1*] on this thread, you probably want to use `int flags` instead, and
turn `find_all` into BISECT_FIND_ALL in a preparatory commit.

Also, you will definitely want to add a test. Again, my reply to Harald
[*1*] should give you a head start there... You will want to imitate the
test case I outlined there, maybe something like:

# A - B - C - F
#   \   \   /   \
# D - E - G - H

[... 'setup' as in my mail to Harald ...]

test_expect_success '--first-parent' '
write_script find-e.sh <<-\EOF &&
case "$(git show --format=%s -s)" in
B|C|F) ;; # first parent lineage: okay
*) git show -s --oneline HEAD >unexpected;;
esac
# detect whether we are "before" or "after" E
test ! -f E.t
EOF

git bisect start --first-parent HEAD A &&
git bisect run ./find-e.sh >actual &&
test_path_is_missing unexpected &&
grep "$(git rev-parse F) is the first bad commit" actual
'

Also, Tiago, reading through your patch (as on chriscool/git; do you have
your own fork? That would make it much easier to collaborate with you by
offering PRs), it looks more straight-forward than editing the commit_list
after the fact and adding magic weights ;-)

Except for one thing. I wonder why `bisect_next_all()` does not set
revs.first_parent_only after calling `bisect_rev_setup()`? You would still
need the changes in `count_distance()`, as it performs its own commit
graph traversal, but there is no need to enumerate too many commits in the
first place, right?

Harald, maybe --merges-only can be implemented on top of --first-parent,
with the `int flags` change I suggested?

Ciao,
Johannes

Footnote *1*:
https://public-inbox.org/git/nycvar.qro.7.76.6.1804121143120...@zvavag-6oxh6da.rhebcr.pbec.zvpebfbsg.pbz/

Re: [PATCH v6 15/15] rebase -i --rebase-merges: add a section to the man page

2018-04-12 Thread Sergey Organov
Johannes Schindelin  writes:
> +
> +*   Merge branch 'report-a-bug'
> +|\
> +| * Add the feedback button
> +* | Merge branch 'refactor-button'
> +|\ \
> +| |/
> +| * Use the Button class for all buttons
> +| * Extract a generic Button class from the DownloadButton one
> +

Consider to put SHA1s into the diagram, as they are then used in
explanaitions. Hopefully I got them right here:


*   6f5e4d Merge branch 'report-a-bug'
|\
| * abcdef Add the feedback button
* | a1b2c3 Merge branch 'refactor-button'
|\ \
| |/
| * 654321 Use the Button class for all buttons
| * 123456 Extract a generic Button class from the DownloadButton one


Original explanation, just for reference, unchanged:

> +
> +label onto
> +
> +# Branch: refactor-button
> +reset onto
> +pick 123456 Extract a generic Button class from the DownloadButton one
> +pick 654321 Use the Button class for all buttons
> +label refactor-button
> +
> +# Branch: report-a-bug
> +reset refactor-button # Use the Button class for all buttons
> +pick abcdef Add the feedback button
> +label report-a-bug
> +
> +reset onto
> +merge -C a1b2c3 refactor-button # Merge 'refactor-button'
> +merge -C 6f5e4d report-a-bug # Merge 'report-a-bug'
> +

-- Sergey


Re: [PATCHv3 00/15] replace_object.c: rename to use dash in file name

2018-04-12 Thread Derrick Stolee

On 4/11/2018 8:21 PM, Stefan Beller wrote:

v3:
* interdiff below,
   the only changes are renaming the variable
   -   struct ref_store *main_ref_store;
   +   struct ref_store *refs;
   in struct repository.
   as well as dropping the file rename patch.
* improved commit messages from discussion on the single patches.

Looks good to me. Thanks!
-Stolee


Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Johannes Schindelin
Hi Stefan,

On Wed, 11 Apr 2018, Stefan Beller wrote:

> On Wed, Apr 11, 2018 at 3:55 PM, Harald Nordgren
>  wrote:
> > When ran with '--merges-only', git bisect will only look at merge commits 
> > -- commits with 2 or more parents or the initial commit.
> 
> There has been quite some talk on the mailing list, e.g.
> https://public-inbox.org/git/20160427204551.GB4613@virgo.localdomain/
> which suggests a --first-parent mode instead.

I like that mode, but I would love to have *both*. And from what I see, it
should be relatively easy to add the --first-parent mode on top of
Harald's patches.

> For certain histories these are the same, but merges-only is more
> restrictive for back-and-forth-cross merges.

You mean merges-only tests *more* in back-and-forth-cross-merges
scenarios?

Ciao,
Dscho


Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-12 Thread Derrick Stolee

On 4/12/2018 5:12 AM, Junio C Hamano wrote:

Derrick Stolee  writes:


+Here is a diagram to visualize the shape of the full commit graph, and
+how different generation numbers relate:
+
++-+
+| GENERATION_NUMBER_INFINITY = 0x |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   | [gen(A) = gen(B)]
+   V
++-+
+| 0 < commit->generation < 0x4000 |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   |[gen(A) > gen(B)]
+   V
++-+
+| GENERATION_NUMBER_ZERO = 0  |
++-+
+|  ^
+|  |
++--+
+[gen(A) = gen(B)]

It may be just me but all I can read out of the above is that
commit->generation may store 0x, a value between 0 and
0x4000, or 0.  I cannot quite tell what the notation [gen(A)
 gen(B)] is trying to say.  I am guessing "Two generation
numbers within the 'valid' range can be compared" is what the second
one is trying to say, but it is much less interesting to know that
two infinities compare equal than how generation numbers from
different classes compare, which cannot be depicted in the above
notation, I am afraid.  For example, don't we want to say that a
commit with INF can never be reached by a commit with a valid
generation number, or something like that?


My intention with the arrows was to demonstrate where parent 
relationships can go, and the generation-number relation between a 
commit A with parent B. Clearly, this diagram is less than helpful.





  Design Details
  --
  
@@ -98,17 +141,12 @@ Future Work

  - The 'commit-graph' subcommand does not have a "verify" mode that is
necessary for integration with fsck.
  
-- The file format includes room for precomputed generation numbers. These

-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
  - After computing and storing generation numbers, we must make graph
walks aware of generation numbers to gain the performance benefits they
enable. This will mostly be accomplished by swapping a commit-date-ordered
priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:

Good.




Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Johannes Schindelin
Hi Harald,

Thank you for working on this.

On Thu, 12 Apr 2018, Harald Nordgren wrote:

> When ran with '--merges-only', git bisect will only look at merge
> commits -- commits with 2 or more parents or the initial commit.

This is an excellent idea!

> Proof of concept of a feature that I have wanted in Git for a while.
> In my daily work my company uses GitHub, which creates lots of merge
> commits. In general, tests are only ran on the tip of a branch
> before merging, so the different commits within a merge commit are
> allowed not to be buildable. Therefore 'git bisect' often doesn't
> work since it will give lots of false positives for anything that is
> not a merge commit. If we could have a feature to only bisect merge
> commits then it would be easier to pinpoint which merge causes any
> particular issue. After that, a bisect could be done within the
> merge to pinpoint futher. As a follow-up to this patch -- we could
> potentially create a feature that automatically continues into
> regular bisect within the bad merge commit after completed
> '--merges-only' bisection.

It also helps when bisecting breakages in `pu` (mainly because the
branches in `pu` use branch points that are often insanely far in the
past).

> diff --git a/bisect.c b/bisect.c
> index a579b5088..e8a2f77c7 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -254,7 +254,7 @@ static struct commit_list *best_bisection_sorted(struct 
> commit_list *list, int n
>   */
>  static struct commit_list *do_find_bisection(struct commit_list *list,
>int nr, int *weights,
> -  int find_all)
> +  int find_all, int 
> only_merge_commits)

How about `int flags`, and defining FIND_ALL and ONLY_MERGE_COMMITS?

That way, it will be easy to add ONLY_FIRST_PARENTS without changing the
signature of do_find_bisection().

Ideally, a preparatory commit would change find_all -> flags (adding
FIND_ALL), and the second commit would then add support for
ONLY_MERGE_COMMITS.

> @@ -266,6 +266,13 @@ static struct commit_list *do_find_bisection(struct 
> commit_list *list,
>   unsigned flags = commit->object.flags;
>  
>   p->item->util = [n++];
> +
> + if (only_merge_commits) {
> + weight_set(p, -2);
> + counted++;
> + continue;
> + }
> +

This hunk is a little hard to understand when you come from elsewhere, as
I do. Could you maybe explain a little bit what the `weight_set(p, -2)`
does? This is probably good material for enhancing the commit message
(seeing as we understand the commit message to be kind of the "convince me
that this is a good change, and explain things that are not immediately
obvious from the diff" document).

Maybe it would be enough to describe the role of the weight, and what the
typical values look like.

> @@ -305,11 +312,17 @@ static struct commit_list *do_find_bisection(struct 
> commit_list *list,
>* way, and then fill the blanks using cheaper algorithm.
>*/
>   for (p = list; p; p = p->next) {
> + int distance;
>   if (p->item->object.flags & UNINTERESTING)
>   continue;
>   if (weight(p) != -2)
>   continue;
> - weight_set(p, count_distance(p));
> +
> + if (only_merge_commits)
> + distance = count_distance(p) - 1;
> + else
> + distance = count_distance(p);
> + weight_set(p, distance);

A shorter way:

weight_set(p, count_distance(p) - !!only_merge_commits);

Could you add a code comment above this line to explain why this is
needed? (I have to admit that I have no clue what the weights or the
distances are in this context, but I think that a 3-line explanation could
probably give me enough of an idea that I do not have to study an hour or
three to learn enough to understand this change.)

>   clear_distance(list);
>  
>   /* Does it happen to be at exactly half-way? */
> @@ -330,11 +343,17 @@ static struct commit_list *do_find_bisection(struct 
> commit_list *list,
>   for (q = p->item->parents; q; q = q->next) {
>   if (q->item->object.flags & UNINTERESTING)
>   continue;
> + if (!q->item->util)
> + break;

So we use the `util` field now to do things, probably to flag some
property of the commit.

Maybe the commit message could prepare the reader for this, with a
paragraph starting with "We use the commits' `util` field to store the
information that [...]"?

Seeing as this loop iterates over the commit's parents, I guess we are
looking at merge commits, and drop out early if we find a 

Re: [PATCH v6 14/15] rebase -i: introduce --rebase-merges=[no-]rebase-cousins

2018-04-12 Thread Sergey Organov
Johannes Schindelin  writes:

[...]

> ++
> +By default, or when `no-rebase-cousins` was specified, commits which do not
> +have `` as direct ancestor will keep their original branch point.



sans quotes, <...> are used without them throughout the manual page.

> +If the `rebase-cousins` mode is turned on, such commits are rebased onto
> +`` (or ``, if specified).

 (or , if --onto is specified).

sans quotes, and there is no  defined.

-- Sergey


Re: [PATCH v8 03/14] commit-graph: add format document

2018-04-12 Thread Derrick Stolee

On 4/11/2018 4:58 PM, Jakub Narebski wrote:

Derrick Stolee  writes:


+CHUNK DATA:
+
+  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+  The ith entry, F[i], stores the number of OIDs with first
+  byte at most i. Thus F[255] stores the total
+  number of commits (N).
+
+  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+  The OIDs for all commits in the graph, sorted in ascending order.
+
+  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)

I think it is a typo, and it should be CDAT, not CGET
(CDAT seem to me to stand for Commit DATa):

   +  Commit Data (ID: {'C', 'D', 'A', 'T' }) (N * (H + 16) bytes)

This is what you use in actual implementation, in PATCH v8 06/14

DS> +#define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
DS> +#define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
DS> +#define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */
DS> +#define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */
DS> +#define GRAPH_CHUNKID_LARGEEDGES 0x45444745 /* "EDGE" */



Documentation bugs are hard to diagnose. Thanks for finding this. I 
double checked that the hex int "0x43444154" matches "CDAT".


Here is a diff to make it match.

diff --git a/Documentation/technical/commit-graph-format.txt 
b/Documentation/technical/commit-graph-format.txt

index ad6af8105c..af03501834 100644
--- a/Documentation/technical/commit-graph-format.txt
+++ b/Documentation/technical/commit-graph-format.txt
@@ -70,7 +70,7 @@ CHUNK DATA:
   OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
   The OIDs for all commits in the graph, sorted in ascending order.

-  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
+  Commit Data (ID: {'C', 'D', 'A', 'T' }) (N * (H + 16) bytes)
 * The first H bytes are for the OID of the root tree.
 * The next 8 bytes are for the positions of the first two parents
   of the ith commit. Stores value 0x if no parent in that



Re: [PATCH v6 15/15] rebase -i --rebase-merges: add a section to the man page

2018-04-12 Thread Johannes Schindelin
Hi Phillip,

On Wed, 11 Apr 2018, Phillip Wood wrote:

> On 10/04/18 13:30, Johannes Schindelin wrote:
> 
> Firstly let me say that I think expanding the documentation and having an
> example is an excellent idea.

Thanks! At first, I meant to leave this for others to contribute, but I
think it makes sense for me to describe it, as I do have a little bit of
experience with rebasing merges.

> > +
> > +
> > +label onto
> > +
> > +# Branch: refactor-button
> > +reset onto
> > +pick 123456 Extract a generic Button class from the DownloadButton one
> > +pick 654321 Use the Button class for all buttons
> > +label refactor-button
> > +
> > +# Branch: report-a-bug
> > +reset refactor-button # Use the Button class for all buttons
> > +pick abcdef Add the feedback button
> > +label report-a-bug
> > +
> > +reset onto
> > +merge -C a1b2c3 refactor-button # Merge 'refactor-button'
> > +merge -C 6f5e4d report-a-bug # Merge 'report-a-bug'
> > +
> > +
> > +In contrast to a regular interactive rebase, there are `label`, `reset` and
> > +`merge` commands in addition to `pick` ones.
> > +
> > +The `label` command puts a label to whatever will be the current
> 
> s/puts a label to/associates a label with/ would be clearer I think. Maybe
> s/whatever will be the current revision/the current HEAD/ an well?

Thanks, I incorporated both changes here.

> > +revision when that command is executed. Internally, these labels are
> > +worktree-local refs that will be deleted when the rebase finishes or
> > +when it is aborted.
> 
> I agree they should be deleted when the rebase is aborted but I cannot see any
> changes to git-rebase.sh to make that happen. I think they should also be
> deleted by 'rebase --quit'.

Oh right! For some reason I thought I already hooked up rebase--helper
--abort when rebase was called with --abort or quit, but I had not managed
yet. I think I will leave this for later, or for GSoC, or something.

In the meantime, I'll just drop the "or when it is aborted.".

> > That way, rebase operations in multiple worktrees
> > +linked to the same repository do not interfere with one another.
> > +
> > +The `reset` command is essentially a `git reset --hard` to the specified
> > +revision (typically a previously-labeled one).
> 
> s/labeled/labelled/

As Eric pointed out, I am using 'murricane spelling here (or is it
speling? Ya never know these days).

> I think it would be worthwhile to point out that unlike the other commands
> this will not preserve untracked files. Maybe something like
> "Note that unlike the `pick` or `merge` commands or initial checkout when the
> rebase starts the `reset` command will overwrite any untracked files."

You know what? You just pointed out a bug in my thinking. Previously, I
thought that this is impossible, that you cannot overwrite untracked files
because we labeled this revision previously, so the only new files to
write by `reset` were tracked files previous. But that forgets `exec` and
`reset` with unlabeled revisions (e.g. for cousins).

So I changed the `reset` command to refuse overwriting untracked files...

Thank you for improving this patch series!
Dscho


Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Tiago Botelho
On Thu, Apr 12, 2018 at 9:58 AM, Christian Couder
 wrote:
> On Thu, Apr 12, 2018 at 9:49 AM, Harald Nordgren
>  wrote:
>> I think it looks similar. But if I'm reading that thread correctly
>> then there was never a patch created, right?
>
> (It is customary on this mailing list to reply after the sentences we
> reply to. We don't "top post".)
>
> On the GSoC idea pages (like https://git.github.io/SoC-2018-Ideas/) we
> have been suggesting "Implement git bisect --first-parent" and there
> are the following related links:
>
> https://public-inbox.org/git/2015030405.ga9...@peff.net/
> https://public-inbox.org/git/4d3cddf9.6080...@intel.com/
>
> Tiago in Cc also tried at a recent London hackathon to implement it
> and came up with the following:
>
> https://github.com/tiagonbotelho/git/pull/1/files
>
> I tried to help him by reworking his commit in the following branch:
>
> https://github.com/chriscool/git/commits/myfirstparent

Thank you for the cc Christian, I’ve been quite busy and was not able
to work on the PR for quite some time.

I intended to pick it back up again next week. If it is ok with Harald
I would love to finish the PR that I started,
since it is quite close to being finished (I think it was just specs
missing if I am not mistaken).

Kind regards,

Tiago Botelho


Re: [PATCH v4] git{,-blame}.el: remove old bitrotting Emacs code

2018-04-12 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> On the other hand, the 6-lines of e-lisp you wrote for git.el
>> replacement is something the packagers could have written for their
>> users, so (1) if we really want to go extra mile without trusting
>> that distro packagers are less competent than us in helping their
>> users, we'd be better off to leave Makefile in, or (2) if we trust
>> packagers and leave possible end-user confusion as their problem
>> (not ours), then we can just remove as your previous round did.
>>
>> And from that point of view, I find this round slightly odd.
>
> I think the way it is makes sense. In Debian debian/git-el.install just
> does:
> ...
> RedHat does use contrib/emacs/Makefile:
> ...
> But they can either just do their own byte compilation as they surely do
> for other elisp packages,...

In short, Debian happens to be OK, but RedHat folks need to do work
and cannot use what we ship out of the box, *IF* they care about end
user experience.

That was exactly why I felt it was "odd" (iow, "uneven").  We bother
to give a stub git.el; we do not bother to make sure it would keep
being installed if the packagers do not bother to update their
procedure.

If we do not change anything other than making *.el into stubs, then
it is a lot more likely that end user experience on *any* distro
that have been shipping contrib/emacs/ stuff will by default
(i.e. if the packagers do not do anything to adjust) be what we
design here---upon loading they'd see (error) triggering that nudge
them towards modern and maintained alternatives.  If we do more than
that, e.g. remove Makefile, then some distros need to adjust, or
their build would be broken.

I suspect that the set of people Cc'ed on the thread are a lot more
familiar than I am with how distro packagers prefer us to deliber,
so I'll stop speculating at this point.

Thanks.


Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-12 Thread Junio C Hamano
Derrick Stolee  writes:

> +Here is a diagram to visualize the shape of the full commit graph, and
> +how different generation numbers relate:
> +
> ++-+
> +| GENERATION_NUMBER_INFINITY = 0x |
> ++-+
> + ||  ^
> + ||  |
> + |+--+
> + | [gen(A) = gen(B)]
> + V
> ++-+
> +| 0 < commit->generation < 0x4000 |
> ++-+
> + ||  ^
> + ||  |
> + |+--+
> + |[gen(A) > gen(B)]
> + V
> ++-+
> +| GENERATION_NUMBER_ZERO = 0  |
> ++-+
> +  |  ^
> +  |  |
> +  +--+
> +  [gen(A) = gen(B)]

It may be just me but all I can read out of the above is that
commit->generation may store 0x, a value between 0 and
0x4000, or 0.  I cannot quite tell what the notation [gen(A)
 gen(B)] is trying to say.  I am guessing "Two generation
numbers within the 'valid' range can be compared" is what the second
one is trying to say, but it is much less interesting to know that
two infinities compare equal than how generation numbers from
different classes compare, which cannot be depicted in the above
notation, I am afraid.  For example, don't we want to say that a
commit with INF can never be reached by a commit with a valid
generation number, or something like that?

>  Design Details
>  --
>  
> @@ -98,17 +141,12 @@ Future Work
>  - The 'commit-graph' subcommand does not have a "verify" mode that is
>necessary for integration with fsck.
>  
> -- The file format includes room for precomputed generation numbers. These
> -  are not currently computed, so all generation numbers will be marked as
> -  0 (or "uncomputed"). A later patch will include this calculation.
> -
>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
>priority queue with one ordered by generation number. The following
> -  operations are important candidates:
> +  operation is an important candidate:

Good.


Re: [PATCH v6 15/15] rebase -i --rebase-merges: add a section to the man page

2018-04-12 Thread Johannes Schindelin
Hi Eric & Phillip,

On Wed, 11 Apr 2018, Eric Sunshine wrote:

> On Wed, Apr 11, 2018 at 11:35 AM, Phillip Wood
>  wrote:
> > On 10/04/18 13:30, Johannes Schindelin wrote:
> >> +The `reset` command is essentially a `git reset --hard` to the specified
> >> +revision (typically a previously-labeled one).
> >
> > s/labeled/labelled/
> 
> American vs. British English spelling.
> 
> CodingGuidelines and SubmittingPatches talk about this. Junio
> summarizes the issue well in [1]. The TL;DR is to lean toward the
> American English spelling.
> 
> [1]: 
> https://public-inbox.org/git/xmqq4m9gpebm@gitster.mtv.corp.google.com/

Thanks, I meant to look that up because I was not sure, and now I do not
have to ;-)

No worries, Phillip, I will keep spelling your name with two `l`s. :0)

Ciao,
Dscho


Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Christian Couder
On Thu, Apr 12, 2018 at 9:49 AM, Harald Nordgren
 wrote:
> I think it looks similar. But if I'm reading that thread correctly
> then there was never a patch created, right?

(It is customary on this mailing list to reply after the sentences we
reply to. We don't "top post".)

On the GSoC idea pages (like https://git.github.io/SoC-2018-Ideas/) we
have been suggesting "Implement git bisect --first-parent" and there
are the following related links:

https://public-inbox.org/git/2015030405.ga9...@peff.net/
https://public-inbox.org/git/4d3cddf9.6080...@intel.com/

Tiago in Cc also tried at a recent London hackathon to implement it
and came up with the following:

https://github.com/tiagonbotelho/git/pull/1/files

I tried to help him by reworking his commit in the following branch:

https://github.com/chriscool/git/commits/myfirstparent


Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-12 Thread Sergey Organov
Hi Johannes,

Johannes Schindelin  writes:
> Hi Sergey,
>
> On Wed, 11 Apr 2018, Sergey Organov wrote:
>
>> The RFC v2 and Phillip's strategy are essentially the same, as has been
>> already shown multiple times, both theoretically and by testing.
>
> No, they are not.

It's off-topic here. If you _really_ want to discuss it further, you are
still welcome to come back to where you ran away from and continue:

https://public-inbox.org/git/87po3oddl1@javad.com/

Abrupt change of the topic of discussion indicates your intention to
take attention off the apparent ugliness of 

git rebase --rebase-merges

I also get it as an indication that there are no more arguments in favor
of --rebase-merges on your side, at least for now.

-- Sergey


Re: [PATCH] Create '--merges-only' option for 'git bisect'

2018-04-12 Thread Harald Nordgren
I think it looks similar. But if I'm reading that thread correctly
then there was never a patch created, right?

On Thu, Apr 12, 2018 at 1:33 AM, Stefan Beller  wrote:
> On Wed, Apr 11, 2018 at 3:55 PM, Harald Nordgren
>  wrote:
>> When ran with '--merges-only', git bisect will only look at merge commits -- 
>> commits with 2 or more parents or the initial commit.
>
> There has been quite some talk on the mailing list, e.g.
> https://public-inbox.org/git/20160427204551.GB4613@virgo.localdomain/
> which suggests a --first-parent mode instead. For certain histories
> these are the same,
> but merges-only is more restrictive for back-and-forth-cross merges.
>
>
>
>>
>> Signed-off-by: Harald Nordgren 
>> ---
>>
>> Notes:
>> Proof of concept of a feature that I have wanted in Git for a while. In 
>> my daily work my company uses GitHub, which creates lots of merge commits. 
>> In general, tests are only ran on the tip of a branch before merging, so the 
>> different commits within a merge commit are allowed not to be buildable. 
>> Therefore 'git bisect' often doesn't work since it will give lots of false 
>> positives for anything that is not a merge commit. If we could have a 
>> feature to only bisect merge commits then it would be easier to pinpoint 
>> which merge causes any particular issue. After that, a bisect could be done 
>> within the merge to pinpoint futher. As a follow-up to this patch -- we 
>> could potentially create a feature that automatically continues into regular 
>> bisect within the bad merge commit after completed '--merges-only' bisection.
>
> The github workflow you mention sounds as if --first-parent would do, too?


Can i trust you?

2018-04-12 Thread Sgt. Britta Lopez
Good Day,
How are u doing today ? Apologies! I am a military woman ,seeking
your kind assistance to move the sum of ($7M USD) to you, as far
as i can be assured that my money will be safe in your care until
i complete my service here in Afghanistan and come over next
month.
This is legitimate, and there is no danger involved.If
interested, reply immediately for detailed information.
 
Reply to this email sgt.brittalo...@gmail.com
Regards ,
Sgt. Britta Lopez


Re: [PATCH v4] git{,-blame}.el: remove old bitrotting Emacs code

2018-04-12 Thread Ævar Arnfjörð Bjarmason

On Thu, Apr 12 2018, Junio C. Hamano wrote:

> Ævar Arnfjörð Bjarmason   writes:
>
>> However, since downstream packagers such as Debian are packaging this
>> as git-el it's less disruptive to still carry these files as Elisp
>> code that'll error out with a message suggesting alternatives, rather
>> than drop the files entirely[2].
>>
>> Then rather than receive a cryptic load error when they upgrade
>> existing users will get an error directing them to the README file, or
>> to just stop requiring these modes. I think it makes sense to link to
>> GitHub's hosting of contrib/emacs/README (which'll be updated by the
>> time users see this) so they don't have to hunt down the packaged
>> README on their local system.
>> ...
>>
>>  contrib/emacs/.gitignore   |1 -
>>  contrib/emacs/Makefile |   21 -
>>  contrib/emacs/README   |   32 +-
>>  contrib/emacs/git-blame.el |  489 +--
>>  contrib/emacs/git.el   | 1710 +---
>>  5 files changed, 25 insertions(+), 2228 deletions(-)
>>  delete mode 100644 contrib/emacs/.gitignore
>>  delete mode 100644 contrib/emacs/Makefile
>
> I know I am to blame for prodding you to reopen this topic, but I am
> wondering if removal of Makefile is sensible.  Is the assumption
> that the distro packagers won't be using this Makefile at all and
> have their own (e.g. debian/rules for Debian) build procedure, hence
> *.el files are all they need to have?
>
> The reason why I am wondering is because I do not know what distro
> folks would do when they find that their build procedure does not
> work---I suspect the would punt, and end users of the distro would
> find that git-el package is no longer with them.  These end users
> are whom this discussion is trying to help, but then to these
> packagers, the reason why their build procedure no longer works does
> not really matter, whether git.el is not shipped, or Makefile that
> their debian/rules-equivalent depends on is not there, for them to
> decide dropping the git-el package.
>
> On the other hand, the 6-lines of e-lisp you wrote for git.el
> replacement is something the packagers could have written for their
> users, so (1) if we really want to go extra mile without trusting
> that distro packagers are less competent than us in helping their
> users, we'd be better off to leave Makefile in, or (2) if we trust
> packagers and leave possible end-user confusion as their problem
> (not ours), then we can just remove as your previous round did.
>
> And from that point of view, I find this round slightly odd.

I think the way it is makes sense. In Debian debian/git-el.install just
does:

contrib/emacs/git-blame.el usr/share/git-core/emacs
contrib/emacs/git.el usr/share/git-core/emacs

Which means on installation they don't even use our
contrib/emacs/Makefile, but instead on installation do:

Setting up git-el (1:2.16.3-1) ...
Install git for emacs
Install git for emacs24
install/git: Handling install of emacsen flavor emacs24
install/git: Byte-compiling for emacs24
+ emacs24 -batch -q -no-site-file -f batch-byte-compile git.el git-blame.el
Wrote /usr/share/emacs24/site-lisp/git/git.elc
Wrote /usr/share/emacs24/site-lisp/git/git-blame.elc
Install git for emacs25
install/git: Handling install of emacsen flavor emacs25
install/git: Byte-compiling for emacs25
+ emacs25 -batch -q -no-site-file -f batch-byte-compile git.el git-blame.el

RedHat does use contrib/emacs/Makefile:

https://src.fedoraproject.org/cgit/rpms/git.git/tree/git.spec#n493

But they can either just do their own byte compilation as they surely do
for other elisp packages, or just do this:

 git-init.el | 5 -
 git.spec| 9 ++---
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/git-init.el b/git-init.el
deleted file mode 100644
index d2a96a7..000
--- a/git-init.el
+++ /dev/null
@@ -1,5 +0,0 @@
-;; Git VC backend
-(add-to-list 'vc-handled-backends 'GIT t)
-(autoload 'git-status "git" "GIT mode." t)
-(autoload 'git-blame-mode "git-blame"
-   "Minor mode for incremental blame for Git." t)
diff --git a/git.spec b/git.spec
index ee60d3e..a82c1aa 100644
--- a/git.spec
+++ b/git.spec
@@ -87,7 +87,6 @@ Source1:
https://www.kernel.org/pub/software/scm/git/%{?rcrev:testing/}%{
 Source9:gpgkey-junio.asc

 # Local sources begin at 10 to allow for additional future upstream sources
-Source10:   git-init.el
 Source11:   git.xinetd.in
 Source12:   git-gui.desktop
 Source13:   gitweb-httpd.conf
@@ -491,14 +490,10 @@ for i in git git-shell git-upload-pack; do
 done

 %global elispdir %{_emacs_sitelispdir}/git
-make -C contrib/emacs install \
-emacsdir=%{buildroot}%{elispdir}
-for elc in %{buildroot}%{elispdir}/*.elc ; do
-install -pm 644 contrib/emacs/$(basename