Re: [PATCH v3] run-command: add hint when a hook is ignored

2018-03-02 Thread Jeff King
On Fri, Jan 05, 2018 at 11:36:39AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > The problem is that I was bisecting an unrelated change in old code, and
> > I built a commit which predates f98f8cbac0 (Ship sample hooks with .sample
> > suffix, 2008-06-24). That wrote a bare "update" file into templates/blt
> > (without the ".sample" suffix). After that, the cruft is forever in my
> > build directory until I "git clean -x".
> >
> > The t5523 script tries to run "git push --quiet" and make sure that it
> > produces no output, but with this setup it produces a round of "hint"
> > lines (actually, they come from receive-pack on the remote end but still
> > end up on stderr).
> >
> > So that raises a few interesting questions to me:
> >
> >   1. Should receive-pack generate these hints? They get propagated by
> >  the client, but there's a reasonable chance that the user can't
> >  actually fix the situation.
> 
> True.  The user could tell the server operator to rename them or
> remove them because they are not doing anything useful, but then
> as long as everybody knows they are not doing anything, it is OK
> to leave that sleeping dog lie, as they are not doing anything
> harmful anyway.
> 
> That brings us one step further back to question if the hints are
> useful in the first place, though ;-).

Yes, that last paragraph definitely crossed my mind. Do we have an
opinion on doing anything here (e.g., declaring the hints not worth the
trouble and reverting them, versus just living with it)?

-Peff


Re: [PATCH 00/11] Make the test suite pass with '-x' and /bin/sh

2018-03-02 Thread Jeff King
On Sat, Feb 24, 2018 at 12:39:40AM +0100, SZEDER Gábor wrote:

> The first patch is the most important: with a couple of well-placed file
> descriptor redirections it ensures that the stderr of the test helper
> functions running git commands only contains the stderr of the tested
> command, thereby resolving over 90% of the failures resulting from
> running the test suite with '-x' and /bin/sh.

I dunno. It seems like this requires a lot of caveats for people using
subshells and shell functions, and I suspect it's going to be an
on-going maintenance burden.

That said, I'm not opposed if you want to do the work to try to get the
whole test-suite clean, and we can see how it goes from there. It
shouldn't be hurting anything, I don't think, aside from some
mysterious-looking redirects (but your commit messages seem to explain
it, so anybody can dig).

Does it make descriptor 7 magical, and something that scripts should
avoid touching? That would mean we have 2 magical descriptors now.
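For anyone following along, here is a rough sketch of the kind of
construct under discussion (my illustration, not necessarily the series'
exact code; it assumes test-lib has fd 4 open as the verbose log):

  # hypothetical helper: inside the body, stderr (including the '-x'
  # trace) is rerouted to the log on fd 4, while the tested command
  # writes to fd 7, a copy of whatever stderr the calling test set up
  test_must_fail_sketch () {
	"$@" 2>&7
	test $? -ne 0
  } 7>&2 2>&4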

-Peff


Re: [PATCH] t: send verbose test-helper output to fd 4

2018-03-02 Thread Jeff King
On Wed, Feb 28, 2018 at 03:30:29AM +0100, SZEDER Gábor wrote:

> > -   echo >&2 "test_expect_code: command exited with $exit_code, we wanted $want_code $*"
> > +   echo >&4 "test_expect_code: command exited with $exit_code, we wanted $want_code $*"
> > return 1
> >  }
> 
> The parts above changing the fds used for error messages in
> 'test_must_fail' and 'test_expect_code' will become unnecessary if we
> take my patch from
> 
>   https://public-inbox.org/git/20180225134015.26964-1-szeder@gmail.com/
> 
> because that patch also ensures that error messages from this function
> will go to fd 4 even if the stderr of the functions is redirected in the
> test.  Though, arguably, your patch makes it more readily visible that
> those error messages go to fd 4, so maybe it's still worth having.

Yeah, I could take it or leave it if we follow that other route.

> >  # not output anything when they fail.
> >  verbose () {
> > "$@" && return 0
> > -   echo >&2 "command failed: $(git rev-parse --sq-quote "$@")"
> > +   echo >&4 "command failed: $(git rev-parse --sq-quote "$@")"
> > return 1
> >  }
> 
> I'm not so sure about these changes to 'test_i18ngrep' and 'verbose'.  It
> seems that they are the result of a simple s/>&2/>&4/ on
> 'test-lib-functions.sh'?  The problem described in the commit message
> doesn't really apply to them, because their _only_ output to stderr
> is the error messages intended for the user, so no sane test would
> redirect and check their stderr.  (OK, technically the command run by
> 'verbose' could output anything to stderr, but 'verbose' was meant for
> commands failing silently in the first place; that's why my patch linked
> above left 'verbose' unchanged.)

Yes, I agree it's not likely to help anybody. I did it mostly for
consistency, i.e., to say "we mean to send this directly to
the user". It actually might make sense to wrap that concept in a helper
function. Arguably "say" should do this redirection automatically (we do
redirect it elsewhere, but mostly to try to accomplish the same thing).
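If we did want a helper for that, it could be as small as this
(hypothetical name, purely illustrative; fd 4 is test-lib's descriptor
for messages meant for the user):

  say_to_user () {
	echo >&4 "$@"
  }

  verbose () {
	"$@" && return 0
	say_to_user "command failed: $(git rev-parse --sq-quote "$@")"
	return 1
  }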

> Also note that there are a lot of other test helper functions that
> output something primarily intended for the user on failure (e.g.
> 'test_path_is_*' and the like), though those messages go to stdout instead
> of stderr.  Perhaps the messages of those functions should go to stderr
> as well (or even fd 4)?

Yeah, I just missed those. I agree they should redirect to fd 4, too.

I'm trying to clear my inbox before traveling, so I probably won't dig
into this further for a while. If you think this is the right direction,
do you want to add more ">&4" redirects to helpers as part of your
series?

-Peff


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 01:53:32PM -0800, Junio C Hamano wrote:

> Sam Kuper  writes:
> 
> > 1. It would yield, IIUC, less flexibility to create new kinds of view
> > based on a consistent, standardised underlying model.
> >
> > 2. It is harder to read, for some types of input (e.g. prose) than the
> > view generated by the existing word-diff algorithm.
> 
> The loss of line-end information in the lossy "word-diff" output does not
> matter if you never split hunks, but to support splitting a hunk at an
> in-between context line (which you can already do) and new features
> like per-line selection that are emerging, keeping a 1:1 line
> correspondence between what is shown and what is applied is a must.
> 
> Unless you are volunteering to design (notice that I am not saying
> "implement") both diff generation/coloration side _and_ patch
> application side, that is.  In which case, you may be able to come
> up with a magic ;-)

IIRC, we do the word-diff by first finding the individual hunks with a
normal line diff, and then doing a word diff inside those hunks. So one
easy option would be to add a --color-words option that gets passed
along to the underlying display diff and just disables hunk splitting. I
think we'd also need to start parsing the DISPLAY hunks more carefully
to make sure we re-sync at hunk boundaries.

-Peff


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 05:30:28PM +, Sam Kuper wrote:

> On 02/03/2018, Jeff King  wrote:
> > Unfortunately, I don't think there's an easy way to do what you want
> > (show word-diffs but apply the full diff).
> 
> Oh :(
> 
> That would be a *very* useful feature to have, especially where
> multiple small (e.g. single character or single word) changes are
> sprinkled throughout a large file.

You might want to try diff-highlight from Git's contrib/ directory. It
was the reason I added interactive.diffFilter in the first place. E.g.:

  cd contrib/diff-highlight
  make
  cp diff-highlight /to/your/$PATH
  git config --global interactive.diffFilter diff-highlight

It's not quite the same as a word-diff, because it doesn't find matches
across word boundaries. So if you re-wrap a paragraph, for example, it
won't be much help. But I do find it useful (and also as a general
filter for log and diff output).
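And for the "general filter" use, the pager config is one way to wire it
up outside of "add -p" (roughly what the contrib README suggests; adjust
the path to wherever you installed the script):

  git config --global pager.log  'diff-highlight | less'
  git config --global pager.show 'diff-highlight | less'
  git config --global pager.diff 'diff-highlight | less'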

> Should I start a separate thread for this as a feature request?

I think this thread can serve.

-Peff


[PATCH 2/2] add--interactive: detect bogus diffFilter output

2018-03-02 Thread Jeff King
It's important that the diff-filter only filter the
individual lines, and that there remain a one-to-one mapping
between the input and output lines. Otherwise, things like
hunk-splitting will behave quite unexpectedly (e.g., you
think you are splitting at one point, but it has a different
effect in the text patch we apply).

We can't detect all problematic cases, but we can at least
catch the obvious case where we don't even have the correct
number of lines.
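To make the failure mode concrete (my examples, not part of the patch):
a filter that rewrites each line in place keeps the mapping intact,
while one that drops lines does not, and is what the new check rejects.

  # fine: exactly one output line per input line
  git config interactive.diffFilter 'sed "s/^/filtered: /"'

  # broken: context lines are dropped, so the hunk you see is no longer
  # the hunk that would be applied
  git config interactive.diffFilter 'grep -v "^ "'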

Signed-off-by: Jeff King 
---
 git-add--interactive.perl  | 8 
 t/t3701-add-interactive.sh | 8 
 2 files changed, 16 insertions(+)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 964c3a7542..ff02008abe 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -705,6 +705,14 @@ sub parse_diff {
}
my (@hunk) = { TEXT => [], DISPLAY => [], TYPE => 'header' };
 
+   if (@colored && @colored != @diff) {
+   print STDERR
+ "fatal: mismatched output from interactive.diffFilter\n",
+ "hint: Your filter must maintain a one-to-one 
correspondence\n",
+ "hint: between its input and output lines.\n";
+   exit 1;
+   }
+
for (my $i = 0; $i < @diff; $i++) {
if ($diff[$i] =~ /^@@ /) {
push @hunk, { TEXT => [], DISPLAY => [],
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 64fe56c3d5..9bb17f91b2 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -404,6 +404,14 @@ test_expect_success TTY 'diffFilter filters diff' '
grep foo:.*content output
 '
 
+test_expect_success TTY 'detect bogus diffFilter output' '
+   git reset --hard &&
+
+   echo content >test &&
+   test_config interactive.diffFilter "echo too-short" &&
+   printf y | test_must_fail test_terminal git add -p
+'
+
 test_expect_success 'patch-mode via -i prompts for files' '
git reset --hard &&
 
-- 
2.16.2.708.g0b2ed7f536


[PATCH 1/2] t3701: add a test for interactive.diffFilter

2018-03-02 Thread Jeff King
This feature was added in 01143847db (add--interactive:
allow custom diff highlighting programs, 2016-02-27) but
never tested. Let's add a basic test.

Note that we only apply the filter when color is enabled,
so we have to use test_terminal. This is an open limitation
explicitly mentioned in the original commit. So take this
commit as testing the status quo, and not making a statement
on whether we'd want to enhance that in the future.
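(For anyone trying this out by hand, the limitation looks roughly like
this; illustrative commands with the color settings spelled out:)

  # with a terminal (color enabled), the filter runs:
  git -c interactive.diffFilter='sed s/^/foo:/' add -p

  # with color disabled, the plain diff is shown and the filter is
  # never invoked:
  git -c color.ui=never -c interactive.diffFilter='sed s/^/foo:/' add -p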

Signed-off-by: Jeff King 
---
 t/t3701-add-interactive.sh | 12 
 1 file changed, 12 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 058698df6a..64fe56c3d5 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -392,6 +392,18 @@ test_expect_success TTY 'diffs can be colorized' '
grep "$(printf "\\033")" output
 '
 
+test_expect_success TTY 'diffFilter filters diff' '
+   git reset --hard &&
+
+   echo content >test &&
+   test_config interactive.diffFilter "sed s/^/foo:/" &&
+   printf y | test_terminal git add -p >output 2>&1 &&
+
+   # avoid depending on the exact coloring or content of the prompts,
+   # and just make sure we saw our diff prefixed
+   grep foo:.*content output
+'
+
 test_expect_success 'patch-mode via -i prompts for files' '
git reset --hard &&
 
-- 
2.16.2.708.g0b2ed7f536



Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 09:15:44AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > That's probably a reasonable sanity check, but I think we need to abort
> > and not just have a too-small DISPLAY array. Because later code like the
> > hunk-splitting is going to assume that there's a 1:1 line
> > correspondence. We definitely don't want to end up in a situation where
> > we show one thing but apply another.
> 
> Yes, agreed completely.

Let's add this sanity check while we're thinking about it. Here's a
series.

  [1/2]: t3701: add a test for interactive.diffFilter
  [2/2]: add--interactive: detect bogus diffFilter output

 git-add--interactive.perl  |  8 
 t/t3701-add-interactive.sh | 20 
 2 files changed, 28 insertions(+)

-Peff


[PATCH] smart-http: document flush after "# service" line

2018-03-02 Thread Jeff King
On Thu, Feb 22, 2018 at 12:12:54PM -0800, Dorian Taylor wrote:

> This patch exists because I was asked to write it. I don’t know squat
> about this protocol other than the fact that I followed the spec and
> it didn’t work. I traced a known-good protocol endpoint and found it
> contained content that didn’t agree with the spec. I then obliged the
> request to submit a patch with *what I knew to be true* about the
> sample that actually worked. I then followed the recommendations
> *advertised on GitHub* for submitting the patch.

I take it that a revised patch is not forthcoming, then. :-/

Let's wrap up this topic with this, then, which adds a commit message
and fixes the flush/version-1 ordering issue.

-- >8 --
Subject: smart-http: document flush after "# service" line

The http-protocol.txt spec fails to mention that a flush
packet comes in the smart server response after the
"# service" header.

Technically the client code is actually ready to receive an
arbitrary number of headers here, but since we haven't
introduced any other headers in the past decade (and the
client would just throw them away), let's not mention it in
the spec.

This fixes both the BNF and the example. While we're fixing the
latter, let's also add the missing flush after the ref list.
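For readers not steeped in pkt-lines, a quick illustration of the
framing being documented (my sketch, not part of the patch): the four
hex digits count themselves plus the payload, and a flush-pkt is the
literal "0000" with no payload at all.

  # pkt_line STRING: frame STRING plus a trailing LF as a pkt-line
  pkt_line () {
	printf '%04x%s\n' $((4 + ${#1} + 1)) "$1"
  }
  pkt_line '# service=git-upload-pack'  # -> "001e# service=git-upload-pack\n"
  printf '0000'                         # the flush-pkt added above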

Reported-by: Dorian Taylor 
Signed-off-by: Jeff King 
---
 Documentation/technical/http-protocol.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index a0e45f2889..64f49d0bbb 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -214,10 +214,12 @@ smart server reply:
S: Cache-Control: no-cache
S:
S: 001e# service=git-upload-pack\n
+   S: 0000
   S: 004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\0multi_ack\n
S: 0042d049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n
S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
+   S: 0000
 
 The client may send Extra Parameters (see
 Documentation/technical/pack-protocol.txt) as a colon-separated string
@@ -277,6 +279,7 @@ The returned response contains "version 1" if "version=1" was sent as an
 Extra Parameter.
 
   smart_reply =  PKT-LINE("# service=$servicename" LF)
+""
 *1("version 1")
 ref_list
 ""
-- 
2.16.2.708.g0b2ed7f536



Re: [PATCH 0/4] Speed up git tag --contains

2018-03-02 Thread Jeff King
On Fri, Jan 12, 2018 at 10:56:00AM -0800, csilvers wrote:

> > This is a resubmission of Jeff King's patch series to speed up git tag
> > --contains with some changes. It's been cooking for a while as:
> 
> Replying to this 6-year-old thread:
> 
> Is there any chance this could be resurrected?  We are using
> phabricator, which uses `git branch --contains` as part of its
> workflow.  Our repo has ~1000 branches on it, and the contains
> operation is eating up all our CPU (and time).  It would be very
> helpful to us to make this faster!
> 
> (The original thread is at
> https://public-inbox.org/git/e1ou82h-0001xy...@closure.thunk.org/

Sorry, this got thrown on my "to respond" pile and languished.

There are actually three things that make "git branch --contains" slow.

First, if you're filtering 1000 branches, we'll run 1000 merge-base
traversals, which may walk over the same commits multiple times.

These days "tag --contains" uses a different algorithm that can look at
all heads in a single traversal. But the downside is that it's
depth-first, so it tends to walk down to the roots. That's generally OK
for tags, since you often have ancient tags that mean getting close to
the roots anyway.

But for branches, they're more likely to be recent, and you can get away
without going very deep into the history.

So it's a tradeoff. There's no run-time switch to flip between them, but
a patch like this:

diff --git a/builtin/branch.c b/builtin/branch.c
index 8dcc2ed058..4d674e86d5 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -404,6 +404,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sorting)
 
 	memset(&array, 0, sizeof(array));
 
+	filter->with_commit_tag_algo = 1;
 	filter_refs(&array, filter, filter->kind | FILTER_REFS_INCLUDE_BROKEN);
 
if (filter->verbose)

drops my run of "git branch -a --contains HEAD~100" from 8.6s to
0.4s on a repo with ~1800 branches. That sounds good, but on a repo with
a smaller number of branches, we may actually end up slower (because we
dig further down in history, and don't benefit from the multiple-branch
speedup).
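If you want to see which side of that tradeoff your repository falls on,
a crude comparison is to time the same query once with a stock build and
once with the one-liner above applied:

  time git branch -a --contains HEAD~100 >/dev/null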

I tried to do a "best of both" algorithm in:

 https://public-inbox.org/git/20140625233429.ga20...@sigill.intra.peff.net/

which finds arbitrary numbers of merge bases in a single traversal.  It
did seem to work, but I felt uneasy about some of the corner cases.
I've been meaning to revisit it, but obviously have never gotten around
to it.

The second slow thing is that during the traversal we load each commit
object from disk. The solution there is to keep the parent information
in a faster cache. I had a few proposals over the years, but I won't
even bother to dig them up, because there's quite recent and promising
work in this area from Derrick Stolee:

  
https://public-inbox.org/git/1519698787-190494-1-git-send-email-dsto...@microsoft.com/

And finally, the thing that the patches you linked are referencing is
about using commit timestamps as a proxy for generation numbers. And
Stolee's patches actually leave room for real, trustable generation
numbers.

Once we have the serialized commit graph and generation numbers, I think
the final step would just be to teach the "tag --contains" algorithm to
stop walking down unproductive lines of history. And in fact, I think we
can forget about the best-of-both multi-tip merge-base idea entirely.
Because if you can use the generation numbers to avoid going too deep,
then a depth-first approach is fine. And we'd just want to flip
git-branch over to using that algorithm by default.

-Peff


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-02 Thread Jeff King
On Wed, Feb 28, 2018 at 03:22:30PM -0800, Brandon Williams wrote:

> +static void add_pattern(struct pattern_list *patterns, const char *pattern)
> +{
> + struct ref_pattern p;
> + const char *wildcard;
> +
> + p.pattern = strdup(pattern);

xstrdup?

> + wildcard = strchr(pattern, '*');
> + if (wildcard) {
> + p.wildcard_pos = wildcard - pattern;
> + } else {
> + p.wildcard_pos = -1;
> + }

Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
ignore the "/foo" part.

It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
it's worth going for the most-restrictive thing to start with, since
that enables a lot more server operations without worrying about
breaking compatibility.

-Peff


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-02 Thread Jeff King
On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote:

> I think this is the price of extending the protocol in a backward
> compatible way.  If we don't want to be backwards compatible (allowing
> for graceful fallback to v1) then we could design this differently.
> Even so we're not completely out of luck just yet.
> 
> Back when I introduced the GIT_PROTOCOL side-channel I was able to
> demonstrate that arbitrary data could be sent to the server and it would
> only respect the stuff it knows about.  This means that we can do a
> follow up to v2 at some point to introduce an optimization where we can
> stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
> if the server supports it.

If that's our end-game, it does make me wonder if we'd be happier just
jumping to that at first. Before you started the v2 protocol work, I had
a rough patch series passing what I called "early capabilities". The
idea was to let the client speak a few optional capabilities before the
ref advertisement, and be ready for the server to ignore them
completely. That doesn't clean up all the warts with the v0 protocol,
but it handles the major one (allowing more efficient ref
advertisements).

I dunno. There's a lot more going on here in v2 and I'm not sure I've
fully digested it.

> The great thing about this is that from the POV of the git-client, it
> doesn't care if its speaking using the git://, ssh://, file://, or
> http:// transport; it's all the same protocol.  In my next re-roll I'll
> even drop the "# service" bit from the http server response and then the
> responses will truly be identical in all cases.

This part has me a little confused still. The big difference between
http and the other protocols is that the other ones are full-duplex, and
http is a series of stateless request/response pairs.

Are the other protocols becoming stateless request/response pairs, too?
Or will they be "the same protocol" only in the sense of using the same
transport?

(There are a lot of reasons not to like the stateless pair thing; it has
some horrid corner cases during want/have negotiation).

-Peff


Re: [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper

2018-03-02 Thread Jeff King
On Fri, Feb 23, 2018 at 01:22:31PM -0800, Brandon Williams wrote:

> On 02/22, Stefan Beller wrote:
> > On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams  wrote:
> > 
> > > +static void pack_line(const char *line)
> > > +{
> > > +   if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
> > 
> > From our in-office discussion:
> > v1/v0 packs pktlines twice in http, which is not possible to
> > construct using this test helper when using the same string
> > for the packed and unpacked representation of flush and delim packets,
> > i.e. test-pkt-line --pack $(test-pkt-line --pack 0000) would produce
> > '0000' instead of '00090000\n'.
> > To fix it we'd have to replace the unpacked versions of these pkts to
> > something else such as "FLUSH" "DELIM".
> > 
> > However as we do not anticipate the test helper to be used in further
> > tests for v0, this ought to be no big issue.
> > Maybe someone else cares though?
> 
> I'm going to punt and say, if someone cares enough they can update this
> test-helper when they want to use it for v1/v0 stuff.

I recently added packetize and depacketize helpers for testing v0 streams;
see 4414a15002 (t/lib-git-daemon: add network-protocol helpers,
2018-01-24). Is it worth folding these together?

-Peff


Re: [PATCH v3 04/35] upload-pack: convert to a builtin

2018-03-02 Thread Jeff King
On Fri, Feb 23, 2018 at 01:09:04PM -0800, Brandon Williams wrote:

> > By the way, any decision here would presumably need to be extended to
> > git-serve, etc. The current property is that it's safe to fetch from an
> > untrusted repository, even over ssh. If we're keeping that for protocol
> > v1, we'd want it to apply to protocol v2, as well.
> 
> This may be more complicated.  Right now (for backward compatibility)
> all fetches for v2 are issued to the upload-pack endpoint. So even
> though I've introduced git-serve it doesn't have requests issued to it
> and no requests can be issued to it currently (support isn't added to
> http-backend or git-daemon).  This just means that the command already
> exists to make it easy for testing specific v2 stuff and if we want to
> expose it as an endpoint (like when we have a brand new server command
> that is completely incompatible with v1) its already there and support
> just needs to be plumbed in.
> 
> This whole notion of treating upload-pack differently from receive-pack
> has bad consequences for v2 though.  The idea for v2 is to be able to
> run any number of commands via the same endpoint, so at the end of the
> day the endpoint you used is irrelevant.  So you could issue both fetch
> and push commands via the same endpoint in v2 whether its git-serve,
> receive-pack, or upload-pack.  So really, like Jonathan has said
> elsewhere, we need to figure out how to be ok with having receive-pack
> and upload-pack builtins, or having neither of them builtins, because it
> doesn't make much sense for v2 to straddle that line.

It seems like it would be OK if the whole code path of git-serve
invoking upload-pack happened without being a builtin, even if it would
be possible to run a builtin receive-pack from that same (non-builtin)
git-serve.

Remember that the client is driving the whole operation here, and we can
assume that git-serve is operating on the client's behalf. So a client
who chooses not to trigger receive-pack would be fine.

-Peff


[PATCH] git.el: handle default excludesfile properly

2018-03-02 Thread Dorab Patel
The previous version only looked at core.excludesfile for locating the
excludesfile.  So, when core.excludesfile was not defined, it did not
use the possible default locations.

The current version uses either $XDG_CONFIG_HOME/git/ignore or
$HOME/.config/git/ignore for the default excludesfile location,
depending on whether XDG_CONFIG_HOME is defined or not, as per the
gitignore documentation.

Signed-off-by: Dorab Patel 
---
 contrib/emacs/git.el | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/contrib/emacs/git.el b/contrib/emacs/git.el
index 97919f2d7..16c6b4c05 100644
--- a/contrib/emacs/git.el
+++ b/contrib/emacs/git.el
@@ -757,8 +757,12 @@ Return the list of files that haven't been handled."
 
 (defun git-get-exclude-files ()
   "Get the list of exclude files to pass to git-ls-files."
-  (let (files
-(config (git-config "core.excludesfile")))
+  (let* (files
+(defaultconfig (if (getenv "XDG_CONFIG_HOME")
+   (concat (file-name-as-directory (getenv "XDG_CONFIG_HOME")) "git/ignore")
+ (concat (file-name-as-directory (getenv "HOME")) ".config/git/ignore")))
+(coreexcludes (git-config "core.excludesfile"))
+(config (if coreexcludes coreexcludes defaultconfig)))
 (when (file-readable-p ".git/info/exclude")
   (push ".git/info/exclude" files))
 (when (and config (file-readable-p config))
-- 
2.16.2



[PATCH 3/3] worktree prune: improve prune logic when worktree is moved

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Manual worktree move support is actually gone as of 618244e160 (worktree:
stop supporting moving worktrees manually - 2016-01-22). Before that,
the gitdir file could be updated often when the worktree was accessed,
which kept the worktree from being pruned by this logic.

"git worktree move" is coming, so we don't really need this, but since
it's easy to do, perhaps we could keep supporting manual worktree moves a
bit longer. Notice that when a worktree is active, the "index" file
should be updated pretty often in the common case. The logic is updated to
check the index mtime to see if the worktree is alive.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/worktree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 60440c4106..4d4404e97f 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -101,6 +101,9 @@ static int prune_worktree(const char *id, struct strbuf *reason)
if (!file_exists(path)) {
free(path);
if (st.st_mtime <= expire) {
+   if (!stat(git_path("worktrees/%s/index", id), &st) &&
+   st.st_mtime > expire)
+   return 0;
   strbuf_addf(reason, _("Removing worktrees/%s: gitdir file points to non-existent location"), id);
return 1;
} else {
-- 
2.16.1.435.g8f24da2e1a



[PATCH 0/3] git worktree prune improvements

2018-03-02 Thread Nguyễn Thái Ngọc Duy
This is something we could do to improve the situation when a user
manually moves a worktree and does not follow the update process (we
have had the first reported case [1]). Plus a bit of cleanup in gc.

I think this is something we should do until we somehow make the user
aware that the worktree is broken as soon as they move a worktree
manually. But there's some more work to get there.

[1] 
http://public-inbox.org/git/%3caa98f187-4b1a-176d-2a1b-826c99577...@aegee.org%3E

Nguyễn Thái Ngọc Duy (3):
  gc.txt: more details about what gc does
  worktree: delete dead code
  worktree prune: improve prune logic when worktree is moved

 Documentation/git-gc.txt   | 12 ++--
 Documentation/gitrepository-layout.txt |  5 -
 builtin/worktree.c | 11 +++
 3 files changed, 13 insertions(+), 15 deletions(-)

-- 
2.16.1.435.g8f24da2e1a



[PATCH 2/3] worktree: delete dead code

2018-03-02 Thread Nguyễn Thái Ngọc Duy
This "link" was a feature in early iterations of multiple worktree
functionality but for some reason it was dropped. Since nobody creates
this "link", there's no need to check it.

This is mostly used to let the user moves a worktree manually [1]. If
you move a worktree within the same file system, this hard link count
lets us know the worktree is still there even if we don't know where it
is.

We're having proper worktree move support now and don't need this
anymore.

[1] 23af91d102 (prune: strategies for linked checkouts - 2014-11-30)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/gitrepository-layout.txt | 5 -
 builtin/worktree.c | 8 
 2 files changed, 13 deletions(-)

diff --git a/Documentation/gitrepository-layout.txt b/Documentation/gitrepository-layout.txt
index c60bcad44a..e85148f05e 100644
--- a/Documentation/gitrepository-layout.txt
+++ b/Documentation/gitrepository-layout.txt
@@ -275,11 +275,6 @@ worktrees//locked::
or manually by `git worktree prune`. The file may contain a string
explaining why the repository is locked.
 
-worktrees//link::
-   If this file exists, it is a hard link to the linked .git
-   file. It is used to detect if the linked repository is
-   manually removed.
-
 SEE ALSO
 
 linkgit:git-init[1],
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 4e7c98758f..60440c4106 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -99,15 +99,7 @@ static int prune_worktree(const char *id, struct strbuf *reason)
}
path[len] = '\0';
if (!file_exists(path)) {
-   struct stat st_link;
free(path);
-   /*
-* the repo is moved manually and has not been
-* accessed since?
-*/
-   if (!stat(git_path("worktrees/%s/link", id), &st_link) &&
-   st_link.st_nlink > 1)
-   return 0;
if (st.st_mtime <= expire) {
   strbuf_addf(reason, _("Removing worktrees/%s: gitdir file points to non-existent location"), id);
return 1;
-- 
2.16.1.435.g8f24da2e1a



[PATCH 1/3] gc.txt: more details about what gc does

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-gc.txt | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 571b5a7e3c..862c931104 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -15,8 +15,9 @@ DESCRIPTION
 ---
 Runs a number of housekeeping tasks within the current repository,
 such as compressing file revisions (to reduce disk space and increase
-performance) and removing unreachable objects which may have been
-created from prior invocations of 'git add'.
+performance), removing unreachable objects which may have been
+created from prior invocations of 'git add', packing refs, pruning
+reflog, rerere or stale working trees.
 
 Users are encouraged to run this task on a regular basis within
 each repository to maintain good disk space utilization and good
@@ -59,6 +60,10 @@ then existing packs (except those marked with a `.keep` file)
 are consolidated into a single pack by using the `-A` option of
 'git repack'. Setting `gc.autoPackLimit` to 0 disables
 automatic consolidation of packs.
++
+If `git gc --auto` goes ahead because there are either too many loose
+objects or too many packs, all other housekeeping tasks (e.g. rerere,
+working trees, reflog...) will also be performed.
 
 --prune=::
Prune loose objects older than date (default is 2 weeks ago,
@@ -133,6 +138,9 @@ The optional configuration variable `gc.pruneExpire` controls how old
 the unreferenced loose objects have to be before they are pruned.  The
 default is "2 weeks ago".
 
+The optional gc.worktreePruneExpire controls how old a stale working
+tree must be before `git worktree prune` deletes it. The default is "3
+months ago".
 
 Notes
 -
-- 
2.16.1.435.g8f24da2e1a



Re: [PATCH 00/11] Moving global state into the repository object (part 2)

2018-03-02 Thread Duy Nguyen
On Thu, Mar 1, 2018 at 2:09 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> On Wed, Feb 28, 2018 at 9:59 AM, Junio C Hamano  wrote:
>>> Duy Nguyen  writes:
>>>
 Looking at the full-series diff though, it makes me wonder if we
 should keep prepare_packed_git() private (i.e. how we initialize the
 object store, packfile included, is a private matter). How about
 something like this on top?
>>>
>>> Yup, that looks cleaner.
>>
>> I agree that it looks cleaner. So we plan on just putting
>> it on top of that series?
>
> We tend to avoid "oops, that was wrong and here is a band aid on
> top" for things that are still mushy, so it would be preferrable to
> get it fixed inline, especially if there are more changes to the
> other parts of the series coming.

I agree with this. Stefan, if you're getting bored with rerolling
refactor patches, I can update this series and send out v2.
-- 
Duy


[PATCH/RFC v2 8/9] pack-objects: refer to delta objects by index instead of pointer

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Notice that packing_data::nr_objects is uint32_t, so we can only handle
a maximum of 4G objects and can address all of them with a uint32_t. If
we use a pointer here, we waste 4 bytes per entry on 64-bit
architectures.

Convert these delta pointers to indexes. Since we need to handle NULL
pointers as well, the index is shifted by one [1].

There are holes in this struct but this patch is already big. Struct
packing can be done separately. Even with holes, we save 8 bytes per
object_entry.

[1] This means we can only index 2^32-2 objects even though nr_objects
could contain 2^32-1 objects. It should not be a problem in
practice because when we grow objects[], nr_alloc would probably
blow up long before nr_objects hits the wall.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 124 +++--
 pack-objects.h |  14 +++--
 2 files changed, 78 insertions(+), 60 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5818bf73ca..55f19a1f18 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -29,6 +29,20 @@
 #include "list.h"
 #include "packfile.h"
 
+#define DELTA(obj) \
+   ((obj)->delta_idx ? &to_pack.objects[(obj)->delta_idx - 1] : NULL)
+#define DELTA_CHILD(obj) \
+   ((obj)->delta_child_idx ? &to_pack.objects[(obj)->delta_child_idx - 1] : NULL)
+#define DELTA_SIBLING(obj) \
+   ((obj)->delta_sibling_idx ? &to_pack.objects[(obj)->delta_sibling_idx - 1] : NULL)
+
+#define CLEAR_DELTA(obj) (obj)->delta_idx = 0
+#define CLEAR_DELTA_CHILD(obj) (obj)->delta_child_idx = 0
+#define CLEAR_DELTA_SIBLING(obj) (obj)->delta_sibling_idx = 0
+
+#define SET_DELTA(obj, val) (obj)->delta_idx = ((val) - to_pack.objects) + 1
+#define SET_DELTA_CHILD(obj, val) (obj)->delta_child_idx = ((val) - to_pack.objects) + 1
+
 static const char *pack_usage[] = {
N_("git pack-objects --stdout [...] [<  | < 
]"),
N_("git pack-objects [...]  [<  | < 
]"),
@@ -125,11 +139,11 @@ static void *get_delta(struct object_entry *entry)
 	buf = read_sha1_file(entry->idx.oid.hash, &type, &size);
 	if (!buf)
 		die("unable to read %s", oid_to_hex(&entry->idx.oid));
-	base_buf = read_sha1_file(entry->delta->idx.oid.hash, &type,
+	base_buf = read_sha1_file(DELTA(entry)->idx.oid.hash, &type,
 				  &base_size);
 	if (!base_buf)
 		die("unable to read %s",
-		    oid_to_hex(&entry->delta->idx.oid));
+		    oid_to_hex(&DELTA(entry)->idx.oid));
 	delta_buf = diff_delta(base_buf, base_size,
 			       buf, size, &delta_size, 0);
 	if (!delta_buf || delta_size != entry->delta_size)
@@ -286,12 +300,12 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_entry *entry,
size = entry->delta_size;
buf = entry->delta_data;
entry->delta_data = NULL;
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
} else {
buf = get_delta(entry);
size = entry->delta_size;
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
}
 
@@ -315,7 +329,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_entry *entry,
 * encoding of the relative offset for the delta
 * base from this object's position in the pack.
 */
-   off_t ofs = entry->idx.offset - entry->delta->idx.offset;
+   off_t ofs = entry->idx.offset - DELTA(entry)->idx.offset;
unsigned pos = sizeof(dheader) - 1;
dheader[pos] = ofs & 127;
while (ofs >>= 7)
@@ -341,7 +355,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_entry *entry,
return 0;
}
hashwrite(f, header, hdrlen);
-   hashwrite(f, entry->delta->idx.oid.hash, 20);
+   hashwrite(f, DELTA(entry)->idx.oid.hash, 20);
hdrlen += 20;
} else {
if (limit && hdrlen + datalen + 20 >= limit) {
@@ -377,8 +391,8 @@ static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
  dheader[MAX_PACK_OBJECT_HEADER];
unsigned hdrlen;
 
-   if (entry->delta)
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   if (DELTA(entry))
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
hdrlen = encode_in_pack_object_header(header, sizeof(header),
  type, entry->size);
@@ -406,7 +420,7 @@ static off_t 

[PATCH/RFC v2 9/9] pack-objects: reorder 'hash' to pack struct object_entry

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/pack-objects.h b/pack-objects.h
index db50e56223..a57aca5f03 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -38,12 +38,10 @@ struct object_entry {
uint32_t delta_sibling_idx; /* other deltified objects who
 * uses the same base as me
 */
-   /* XXX 4 bytes hole, try to pack */
-
+   uint32_t hash;  /* name hint hash */
void *delta_data;   /* cached delta (uncompressed) */
unsigned long delta_size;   /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
-   uint32_t hash;  /* name hint hash */
unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
unsigned type:TYPE_BITS;
@@ -62,7 +60,7 @@ struct object_entry {
 
unsigned depth:OE_DEPTH_BITS;
 
-   /* size: 104, padding: 4, bit_padding: 18 bits */
+   /* size: 96, bit_padding: 18 bits */
 };
 
 struct packing_data {
-- 
2.16.1.435.g8f24da2e1a



[PATCH/RFC v2 4/9] pack-objects: use bitfield for object_entry::depth

2018-03-02 Thread Nguyễn Thái Ngọc Duy
This does not give us any savings due to padding. But we will be able
to save once we cut 4 bytes out of this struct in a subsequent patch.

Because of struct packing, from now on we can only handle a max depth of
4095 (or even lower when new booleans are added to this struct). This
should be ok since long delta chains cause significant slowdown anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt   | 1 +
 Documentation/git-pack-objects.txt | 4 +++-
 Documentation/git-repack.txt   | 4 +++-
 builtin/pack-objects.c | 4 
 pack-objects.h | 8 +++-
 5 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10c..9bd3f5a789 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2412,6 +2412,7 @@ pack.window::
 pack.depth::
The maximum delta depth used by linkgit:git-pack-objects[1] when no
maximum depth is given on the command line. Defaults to 50.
+   Maximum value is 4095.
 
 pack.windowMemory::
The maximum size of memory that is consumed by each thread
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 81bc490ac5..3503c9e3e6 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -96,7 +96,9 @@ base-name::
it too deep affects the performance on the unpacker
side, because delta data needs to be applied that many
times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
 --window-memory=::
This option provides an additional limit on top of `--window`;
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ae750e9e11..25c83c4927 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -90,7 +90,9 @@ other objects in that pack they already have locally.
space. `--depth` limits the maximum delta depth; making it too deep
affects the performance on the unpacker side, because delta data needs
to be applied that many times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
 --threads=::
This option is passed through to `git pack-objects`.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index a4dbb40824..cfd97da7db 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3068,6 +3068,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
if (pack_to_stdout != !base_name || argc)
usage_with_options(pack_usage, pack_objects_options);
 
+   if (depth > (1 << OE_DEPTH_BITS))
+   die(_("delta chain depth %d is greater than maximum limit %d"),
+   depth, (1 << OE_DEPTH_BITS));
+
argv_array_push(, "pack-objects");
if (thin) {
use_internal_rev_list = 1;
diff --git a/pack-objects.h b/pack-objects.h
index 6a85cc60c9..2050a05a0b 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -2,6 +2,7 @@
 #define PACK_OBJECTS_H
 
 #define OE_DFS_STATE_BITS 2
+#define OE_DEPTH_BITS 12
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -43,12 +44,9 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
unsigned dfs_state:OE_DFS_STATE_BITS;
+   unsigned depth:OE_DEPTH_BITS;
 
-   /* XXX 20 bits hole, try to pack */
-
-   int depth;
-
-   /* size: 120 */
+   /* size: 120, bit_padding: 8 bits */
 };
 
 struct packing_data {
-- 
2.16.1.435.g8f24da2e1a



[PATCH/RFC v2 6/9] pack-objects: move in_pack_pos out of struct object_entry

2018-03-02 Thread Nguyễn Thái Ngọc Duy
This field is only need for pack-bitmap, which is an optional
feature. Move it to a separate array that is only allocated when
pack-bitmap is used (it's not freed in the same way that objects[] is
not). This saves us 8 bytes in struct object_entry.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 3 ++-
 pack-bitmap-write.c| 8 +---
 pack-bitmap.c  | 2 +-
 pack-bitmap.h  | 4 +++-
 pack-objects.h | 8 ++--
 5 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index cfd97da7db..7bb5544883 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -878,7 +878,8 @@ static void write_pack_file(void)
 
if (write_bitmap_index) {
 			bitmap_writer_set_checksum(oid.hash);
-			bitmap_writer_build_type_index(written_list, nr_written);
+			bitmap_writer_build_type_index(
+				&to_pack, written_list, nr_written);
}
 
finish_tmp_packfile(, pack_tmp_name,
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index e01f992884..1360a93311 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -48,7 +48,8 @@ void bitmap_writer_show_progress(int show)
 /**
  * Build the initial type index for the packfile
  */
-void bitmap_writer_build_type_index(struct pack_idx_entry **index,
+void bitmap_writer_build_type_index(struct packing_data *to_pack,
+   struct pack_idx_entry **index,
uint32_t index_nr)
 {
uint32_t i;
@@ -57,12 +58,13 @@ void bitmap_writer_build_type_index(struct pack_idx_entry **index,
writer.trees = ewah_new();
writer.blobs = ewah_new();
writer.tags = ewah_new();
+   ALLOC_ARRAY(to_pack->in_pack_pos, to_pack->nr_objects);
 
for (i = 0; i < index_nr; ++i) {
struct object_entry *entry = (struct object_entry *)index[i];
enum object_type real_type;
 
-   entry->in_pack_pos = i;
+   IN_PACK_POS(to_pack, entry) = i;
 
switch (entry->type) {
case OBJ_COMMIT:
@@ -147,7 +149,7 @@ static uint32_t find_object_pos(const unsigned char *sha1)
"(object %s is missing)", sha1_to_hex(sha1));
}
 
-   return entry->in_pack_pos;
+   return IN_PACK_POS(writer.to_pack, entry);
 }
 
 static void show_object(struct object *object, const char *name, void *data)
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 9270983e5f..f21479fe16 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1032,7 +1032,7 @@ int rebuild_existing_bitmaps(struct packing_data *mapping,
oe = packlist_find(mapping, sha1, NULL);
 
if (oe)
-   reposition[i] = oe->in_pack_pos + 1;
+   reposition[i] = IN_PACK_POS(mapping, oe) + 1;
}
 
rebuild = bitmap_new();
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3742a00e14..5ded2f139a 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -44,7 +44,9 @@ int rebuild_existing_bitmaps(struct packing_data *mapping, khash_sha1 *reused_bitmaps,
 
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(unsigned char *sha1);
-void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t index_nr);
+void bitmap_writer_build_type_index(struct packing_data *to_pack,
+   struct pack_idx_entry **index,
+   uint32_t index_nr);
 void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack);
 void bitmap_writer_select_commits(struct commit **indexed_commits,
unsigned int indexed_commits_nr, int max_bitmaps);
diff --git a/pack-objects.h b/pack-objects.h
index fb2a3c8f48..737e89b665 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -4,6 +4,9 @@
 #define OE_DFS_STATE_BITS 2
 #define OE_DEPTH_BITS 12
 
+#define IN_PACK_POS(to_pack, obj) \
+   (to_pack)->in_pack_pos[(struct object_entry *)(obj) - (to_pack)->objects]
+
 /*
  * State flags for depth-first search used for analyzing delta cycles.
  *
@@ -31,7 +34,6 @@ struct object_entry {
unsigned long delta_size;   /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
uint32_t hash;  /* name hint hash */
-   unsigned int in_pack_pos;
unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned type:TYPE_BITS;
unsigned in_pack_type:TYPE_BITS; /* could be delta */
@@ -46,7 +48,7 @@ struct object_entry {
unsigned dfs_state:OE_DFS_STATE_BITS;
unsigned depth:OE_DEPTH_BITS;
 
-   /* size: 120, bit_padding: 8 bits */
+   /* size: 112, bit_padding: 8 bits */
 };
 
 struct packing_data 

[PATCH/RFC v2 0/9] Reduce pack-objects memory footprint

2018-03-02 Thread Nguyễn Thái Ngọc Duy
The array of object_entry in pack-objects can take a lot of memory
when pack-objects is run in "pack everything" mode. On linux-2.6.git,
this array alone takes roughly 800MB.

This series reorders some fields and reduces field size... to keep
this struct smaller. Its size goes from 136 bytes to 96 bytes (29%) on
64-bit linux and saves 260MB on linux-2.6.git.

v2 only differs in limits, documentation and a new escape hatch for
the pack file limit.

Now the bad side:

- the number of pack files pack-objects can handle is reduced to 16k
  (previously unlimited, v1 has it at 4k)
- max delta chain is also limited to 4096 (previously practically
  unlimited), same as v1
- some patches are quite invasive (e.g. replacing pointers with
  uint32_t) and reduce readability a bit.
- it may be trickier to add more data to object_entry in the future.

In v1, if the total pack count is above the 4k limit, pack-objects
dies. v2 is a bit smarter and only counts packs that do not have the
companion .keep files. This allows users with 16k+ pack files to
continue to use pack-objects by first adding .keep files to reduce the
pack count, repacking, removing (some) .keep files, and repacking again...

While this process could be automated at least by 'git repack', given
the unbelievably high 16k limit, I don't think it's worth doing.
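Roughly, the add-.keep/repack workflow described above would look
something like this (illustrative only; the cutoff and paths are mine,
run from the top of the working tree):

  # mark everything beyond the first ~16k packs with .keep for now
  for p in $(ls .git/objects/pack/pack-*.pack | tail -n +16000)
  do
	: >"${p%.pack}.keep"
  done &&
  git repack -ad &&                  # consolidate the packs that are not kept
  rm -f .git/objects/pack/pack-*.keep &&
  git repack -ad                     # then repack everything together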

Interdiff

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10c..9bd3f5a789 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2412,6 +2412,7 @@ pack.window::
 pack.depth::
The maximum delta depth used by linkgit:git-pack-objects[1] when no
maximum depth is given on the command line. Defaults to 50.
+   Maximum value is 4095.
 
 pack.windowMemory::
The maximum size of memory that is consumed by each thread
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 81bc490ac5..b8d936ccf5 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -96,7 +96,9 @@ base-name::
it too deep affects the performance on the unpacker
side, because delta data needs to be applied that many
times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
 --window-memory=::
This option provides an additional limit on top of `--window`;
@@ -267,6 +269,15 @@ Unexpected missing object will raise an error.
locally created objects [without .promisor] and objects from the
promisor remote [with .promisor].)  This is used with partial clone.
 
+LIMITATIONS
+---
+
+This command can only handle 16384 existing pack files at a time.
+If you have more than this, you need to exclude some pack files with
+".keep" files and the --honor-pack-keep option, combine the remaining
+pack files into one, then remove the .keep files and run pack-objects
+one more time.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ae750e9e11..25c83c4927 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -90,7 +90,9 @@ other objects in that pack they already have locally.
space. `--depth` limits the maximum delta depth; making it too deep
affects the performance on the unpacker side, because delta data needs
to be applied that many times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
 --threads=::
This option is passed through to `git pack-objects`.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 45076f2523..55f19a1f18 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1038,7 +1038,7 @@ static int want_object_in_pack(const struct object_id *oid,
if (*found_pack) {
want = want_found_object(exclude, *found_pack);
if (want != -1)
-   return want;
+   goto done;
}
 
	list_for_each(pos, &packed_git_mru) {
@@ -1061,11 +1061,27 @@ static int want_object_in_pack(const struct object_id *oid,
if (!exclude && want > 0)
	list_move(&p->mru, &packed_git_mru);
if (want != -1)
-   return want;
+   goto done;
}
}
 
-   return 1;
+   want = 1;
+done:
+   if (want && *found_pack && !(*found_pack)->index) {
+   struct packed_git *p = *found_pack;
+
+   if (to_pack.in_pack_count >= (1 << OE_IN_PACK_BITS))
+   die(_("too many packs to handle in one go. "
+ "Please add .keep files to exclude\n"
+ "some pack files and keep the 

[PATCH/RFC v2 3/9] pack-objects: use bitfield for object_entry::dfs_state

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c |  3 +++
 pack-objects.h | 33 -
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fd217cb51f..a4dbb40824 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3049,6 +3049,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
OPT_END(),
};
 
+   if (DFS_NUM_STATES > (1 << OE_DFS_STATE_BITS))
+   die("BUG: too many dfs states, increase OE_DFS_STATE_BITS");
+
check_replace_refs = 0;
 
reset_pack_idx_option(_idx_opts);
diff --git a/pack-objects.h b/pack-objects.h
index f8b06e2521..6a85cc60c9 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -1,6 +1,21 @@
 #ifndef PACK_OBJECTS_H
 #define PACK_OBJECTS_H
 
+#define OE_DFS_STATE_BITS 2
+
+/*
+ * State flags for depth-first search used for analyzing delta cycles.
+ *
+ * The depth is measured in delta-links to the base (so if A is a delta
+ * against B, then A has a depth of 1, and B a depth of 0).
+ */
+enum dfs_state {
+   DFS_NONE = 0,
+   DFS_ACTIVE,
+   DFS_DONE,
+   DFS_NUM_STATES
+};
+
 struct object_entry {
struct pack_idx_entry idx;
unsigned long size; /* uncompressed size */
@@ -27,21 +42,13 @@ struct object_entry {
unsigned no_try_delta:1;
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
+   unsigned dfs_state:OE_DFS_STATE_BITS;
+
+   /* XXX 20 bits hole, try to pack */
 
-   /* XXX 22 bits hole, try to pack */
-   /*
-* State flags for depth-first search used for analyzing delta cycles.
-*
-* The depth is measured in delta-links to the base (so if A is a delta
-* against B, then A has a depth of 1, and B a depth of 0).
-*/
-   enum {
-   DFS_NONE = 0,
-   DFS_ACTIVE,
-   DFS_DONE
-   } dfs_state;
int depth;
-   /* size: 128, padding: 4 */
+
+   /* size: 120 */
 };
 
 struct packing_data {
-- 
2.16.1.435.g8f24da2e1a



[PATCH/RFC v2 5/9] pack-objects: note about in_pack_header_size

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Object headers in a pack are packed really tight (see
pack-format.txt). Even with an 8-byte length, we need 9-10 bytes at
most, plus a hash (20 bytes). Which means this field only needs to
store a number as big as 32 (5 bits).

This is trickier to pack tight, though: since a new hash algorithm is
coming, the number of bits needed may quickly increase. So leave it
for now.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pack-objects.h b/pack-objects.h
index 2050a05a0b..fb2a3c8f48 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -32,7 +32,7 @@ struct object_entry {
unsigned long z_delta_size; /* delta data size (compressed) */
uint32_t hash;  /* name hint hash */
unsigned int in_pack_pos;
-   unsigned char in_pack_header_size;
+   unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned type:TYPE_BITS;
unsigned in_pack_type:TYPE_BITS; /* could be delta */
unsigned preferred_base:1; /*
-- 
2.16.1.435.g8f24da2e1a



[PATCH/RFC v2 2/9] pack-objects: turn type and in_pack_type to bitfields

2018-03-02 Thread Nguyễn Thái Ngọc Duy
This saves 8 bytes in sizeof(struct object_entry). On a large
repository like linux-2.6.git (6.5M objects), this saves us 52MB
memory.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 14 --
 cache.h|  2 ++
 object.h   |  1 -
 pack-objects.h |  8 
 4 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..fd217cb51f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1407,6 +1407,7 @@ static void check_object(struct object_entry *entry)
unsigned long avail;
off_t ofs;
unsigned char *buf, c;
+   enum object_type type;
 
buf = use_pack(p, _curs, entry->in_pack_offset, );
 
@@ -1415,11 +1416,15 @@ static void check_object(struct object_entry *entry)
 * since non-delta representations could still be reused.
 */
	used = unpack_object_header_buffer(buf, avail,
-					   &entry->in_pack_type,
+					   &type,
					   &entry->size);
if (used == 0)
goto give_up;
 
+   if (type < 0)
+   die("BUG: invalid type %d", type);
+   entry->in_pack_type = type;
+
/*
 * Determine if this is a delta and if so whether we can
 * reuse it or not.  Otherwise let's find out as cheaply as
@@ -1559,6 +1564,7 @@ static void drop_reused_delta(struct object_entry *entry)
 {
	struct object_entry **p = &entry->delta->delta_child;
struct object_info oi = OBJECT_INFO_INIT;
+   enum object_type type;
 
while (*p) {
if (*p == entry)
@@ -1570,7 +1576,7 @@ static void drop_reused_delta(struct object_entry *entry)
entry->depth = 0;
 
oi.sizep = &entry->size;
-   oi.typep = &entry->type;
+   oi.typep = &type;
if (packed_object_info(entry->in_pack, entry->in_pack_offset, &oi) < 0) {
/*
 * We failed to get the info from this pack for some reason;
@@ -1580,6 +1586,10 @@ static void drop_reused_delta(struct object_entry *entry)
 */
entry->type = sha1_object_info(entry->idx.oid.hash,
   &entry->size);
+   } else {
+   if (type < 0)
+   die("BUG: invalid type %d", type);
+   entry->type = type;
}
 }
 
diff --git a/cache.h b/cache.h
index 21fbcc2414..862bdff83a 100644
--- a/cache.h
+++ b/cache.h
@@ -373,6 +373,8 @@ extern void free_name_hash(struct index_state *istate);
 #define read_blob_data_from_cache(path, sz) 
read_blob_data_from_index(_index, (path), (sz))
 #endif
 
+#define TYPE_BITS 3
+
 enum object_type {
OBJ_BAD = -1,
OBJ_NONE = 0,
diff --git a/object.h b/object.h
index 87563d9056..8ce294d6ec 100644
--- a/object.h
+++ b/object.h
@@ -25,7 +25,6 @@ struct object_array {
 
 #define OBJECT_ARRAY_INIT { 0, 0, NULL }
 
-#define TYPE_BITS   3
 /*
  * object flag allocation:
  * revision.h:  0-1026
diff --git a/pack-objects.h b/pack-objects.h
index 720a8e8756..f8b06e2521 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -14,11 +14,11 @@ struct object_entry {
void *delta_data;   /* cached delta (uncompressed) */
unsigned long delta_size;   /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
-   enum object_type type;
-   enum object_type in_pack_type;  /* could be delta */
uint32_t hash;  /* name hint hash */
unsigned int in_pack_pos;
unsigned char in_pack_header_size;
+   unsigned type:TYPE_BITS;
+   unsigned in_pack_type:TYPE_BITS; /* could be delta */
unsigned preferred_base:1; /*
* we do not pack this, but is available
* to be used as the base object to delta
@@ -28,7 +28,7 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
 
-   /* XXX 28 bits hole, try to pack */
+   /* XXX 22 bits hole, try to pack */
/*
 * State flags for depth-first search used for analyzing delta cycles.
 *
@@ -41,7 +41,7 @@ struct object_entry {
DFS_DONE
} dfs_state;
int depth;
-   /* size: 136, padding: 4 */
+   /* size: 128, padding: 4 */
 };
 
 struct packing_data {
-- 
2.16.1.435.g8f24da2e1a



[PATCH/RFC v2 1/9] pack-objects: document holes in struct object_entry.h

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/pack-objects.h b/pack-objects.h
index 03f1191659..720a8e8756 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -28,6 +28,7 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
 
+   /* XXX 28 bits hole, try to pack */
/*
 * State flags for depth-first search used for analyzing delta cycles.
 *
@@ -40,6 +41,7 @@ struct object_entry {
DFS_DONE
} dfs_state;
int depth;
+   /* size: 136, padding: 4 */
 };
 
 struct packing_data {
-- 
2.16.1.435.g8f24da2e1a



[PATCH/RFC v2 7/9] pack-objects: move in_pack out of struct object_entry

2018-03-02 Thread Nguyễn Thái Ngọc Duy
Instead of using 8 bytes (on a 64-bit arch) to store a pointer to a
pack, use an index, since the number of packs should be relatively
small.

This limits the number of packs we can handle to 16k. For now, if you
hit the 16k pack file limit, pack-objects will simply fail [1].

This technically saves 7 bytes, but we don't see any of that in
practice due to padding. The saving becomes real when we pack this
struct tighter later.

[1] The escape hatch is to use .keep files to keep the number of
non-kept pack files below the 16k limit, then do another pack-objects
run to combine another batch of up to 16k pack files. Repeat until
you're satisfied.
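
As a rough standalone illustration of the idea (simplified names, not
the code in this patch), the entry keeps a 14-bit index into a table
owned by the packing data, so at most 1 << 14 = 16384 packs can be
referenced, roughly mirroring the IN_PACK() lookups in the diff below:

#include <stdio.h>

#define DEMO_IN_PACK_BITS 14

struct demo_pack { const char *name; };

struct demo_entry {
	unsigned in_pack_idx : DEMO_IN_PACK_BITS;	/* was: struct demo_pack * */
};

struct demo_packing_data {
	struct demo_pack *in_pack[1 << DEMO_IN_PACK_BITS];
	unsigned in_pack_count;
};

static struct demo_packing_data to_pack_demo;

/* the pointer is recovered by indexing into the shared table */
static struct demo_pack *demo_in_pack(const struct demo_entry *e)
{
	return to_pack_demo.in_pack[e->in_pack_idx];
}

int main(void)
{
	static struct demo_pack p = { "pack-1234.pack" };
	struct demo_entry e;

	to_pack_demo.in_pack[to_pack_demo.in_pack_count] = &p;
	e.in_pack_idx = to_pack_demo.in_pack_count++;
	printf("%s\n", demo_in_pack(&e)->name);	/* pack-1234.pack */
	return 0;
}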

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-pack-objects.txt |  9 +
 builtin/pack-objects.c | 59 +++---
 cache.h|  1 +
 pack-objects.h | 18 -
 4 files changed, 71 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 3503c9e3e6..b8d936ccf5 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -269,6 +269,15 @@ Unexpected missing object will raise an error.
locally created objects [without .promisor] and objects from the
promisor remote [with .promisor].)  This is used with partial clone.
 
+LIMITATIONS
+---
+
+This command could only handle 16384 existing pack files at a time.
+If you have more than this, you need to exclude some pack files with
+".keep" file and --honor-pack-keep option, to combine 16k pack files
+in one, then remove these .keep files and run pack-objects one more
+time.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7bb5544883..5818bf73ca 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -367,7 +367,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
 static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
unsigned long limit, int usable_delta)
 {
-   struct packed_git *p = entry->in_pack;
+   struct packed_git *p = IN_PACK(&to_pack, entry);
struct pack_window *w_curs = NULL;
struct revindex_entry *revidx;
off_t offset;
@@ -478,7 +478,7 @@ static off_t write_object(struct hashfile *f,
 
if (!reuse_object)
to_reuse = 0;   /* explicit */
-   else if (!entry->in_pack)
+   else if (!IN_PACK(&to_pack, entry))
to_reuse = 0;   /* can't reuse what we don't have */
else if (entry->type == OBJ_REF_DELTA || entry->type == OBJ_OFS_DELTA)
/* check_object() decided it for us ... */
@@ -1024,7 +1024,7 @@ static int want_object_in_pack(const struct object_id 
*oid,
if (*found_pack) {
want = want_found_object(exclude, *found_pack);
if (want != -1)
-   return want;
+   goto done;
}
 
list_for_each(pos, &packed_git_mru) {
@@ -1047,11 +1047,27 @@ static int want_object_in_pack(const struct object_id 
*oid,
if (!exclude && want > 0)
list_move(&p->mru, &packed_git_mru);
if (want != -1)
-   return want;
+   goto done;
}
}
 
-   return 1;
+   want = 1;
+done:
+   if (want && *found_pack && !(*found_pack)->index) {
+   struct packed_git *p = *found_pack;
+
+   if (to_pack.in_pack_count >= (1 << OE_IN_PACK_BITS))
+   die(_("too many packs to handle in one go. "
+ "Please add .keep files to exclude\n"
+ "some pack files and keep the number "
+ "of non-kept files below %d."),
+   1 << OE_IN_PACK_BITS);
+
+   p->index = to_pack.in_pack_count++;
+   to_pack.in_pack[p->index] = p;
+   }
+
+   return want;
 }
 
 static void create_object_entry(const struct object_id *oid,
@@ -1074,7 +1090,9 @@ static void create_object_entry(const struct object_id 
*oid,
else
nr_result++;
if (found_pack) {
-   entry->in_pack = found_pack;
+   if (found_pack->index <= 0)
+   die("BUG: found_pack should be NULL instead of having 
non-positive index");
+   entry->in_pack_idx = found_pack->index;
entry->in_pack_offset = found_offset;
}
 
@@ -1399,8 +1417,8 @@ static void cleanup_preferred_base(void)
 
 static void check_object(struct object_entry *entry)
 {
-   if (entry->in_pack) {
-   struct packed_git *p = entry->in_pack;
+   if (IN_PACK(&to_pack, entry)) {
+   struct packed_git *p = IN_PACK(&to_pack, 

Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-03-02 Thread Duy Nguyen
On Tue, Feb 27, 2018 at 6:09 AM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> Yes that's the intention. But after writing cover letter for v2 and
>> sending it out, it looks to me that this thing must stay until all our
>> code is converted to using the_hash_algo (I don't know if there are
>> more to convert or it's finished already). So an alternative is we do
>> the opposite: default to GIT_HASH_SHA1, but when an env variable is
>> set, reset it back to NULL. This env variable will _always_ be set by
>> the test suite to help us catch problems.
>
> If we were to do the "do not blindly initialize and always
> initialize explicitly for debugging" (which I doubt is a good
> direction to go in), then what you outlined above certainly is a
> less evil way to do so than what the patch under discussion did, I
> would think.

So v3 [1] should be in good condition to go to 'pu', I think?

[1] https://public-inbox.org/git/20180225111840.16421-1-pclo...@gmail.com/
-- 
Duy


[PATCH] travis-ci: enable more warnings on travis linux-gcc job

2018-03-02 Thread Nguyễn Thái Ngọc Duy
We have the DEVELOPER config to enable more warnings, but since we can't
impose a fixed gcc version on all devs, that has to be a bit more
conservative. On Travis we know almost exactly which version will be
used (bumped to 6.x for more warnings), so we can be more aggressive.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 .travis.yml |  3 +++
 ci/run-build.sh | 15 +++
 2 files changed, 18 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 4684b3f4f3..273b1d508a 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -16,10 +16,13 @@ compiler:
 
 addons:
   apt:
+sources:
+- ubuntu-toolchain-r-test
 packages:
 - language-pack-is
 - git-svn
 - apache2
+- gcc-6
 
 matrix:
   include:
diff --git a/ci/run-build.sh b/ci/run-build.sh
index 4f940d1032..04e163359c 100755
--- a/ci/run-build.sh
+++ b/ci/run-build.sh
@@ -5,4 +5,19 @@
 
 . ${0%/*}/lib-travisci.sh
 
+if [ "$jobname" = linux-gcc ]; then
+   gcc-6 --version
+   cat >config.mak <<-EOF
+   CC=gcc-6
+   CFLAGS = -g -O2 -Wall
+   CFLAGS += -Wextra
+   CFLAGS += -Wmissing-prototypes
+   CFLAGS += -Wno-empty-body
+   CFLAGS += -Wno-maybe-uninitialized
+   CFLAGS += -Wno-missing-field-initializers
+   CFLAGS += -Wno-sign-compare
+   CFLAGS += -Wno-unused-function
+   CFLAGS += -Wno-unused-parameter
+   EOF
+fi
 make --jobs=2
-- 
2.16.1.435.g8f24da2e1a



Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-02 Thread Igor Djordjevic
Hi Phillip,

On 02/03/2018 12:31, Phillip Wood wrote:
> 
> > Thinking about it overnight, I now suspect that original proposal had a
> > mistake in the final merge step. I think that what you did is a way to
> > fix it, and I want to try to figure what exactly was wrong in the
> > original proposal and to find simpler way of doing it right.
> >
> > The likely solution is to use original UM as a merge-base for final
> > 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
> > though, as that's exactly UM from which both U1' and U2' have diverged
> > due to rebasing and other history editing.
> 
> Hi Sergey, I've been following this discussion from the sidelines,
> though I haven't had time to study all the posts in this thread in
> detail. I wonder if it would be helpful to think of rebasing a merge as
> merging the changes in the parents due to the rebase back into the
> original merge. So for a merge M with parents A B C that are rebased to
> A' B' C' the rebased merge M' would be constructed by (ignoring shell
> quoting issues)
> 
> git checkout --detach M
> git merge-recursive A -- M A'
> tree=$(git write-tree)
> git merge-recursive B -- $tree B'
> tree=$(git write-tree)
> git merge-recursive C -- $tree C'
> tree=$(git write-tree)
> M'=$(git log --pretty=%B -1 M | git commit-tree $tree -p A' -p B' -p C')
> 
> This should pull in all the changes from the parents while preserving
> any evil conflict resolution in the original merge. It superficially
> reminds me of incremental merging [1] but it's so long since I looked at
> that I'm not sure if there are any significant similarities.
> 
> [1] https://github.com/mhagger/git-imerge

Interesting - from a quick test[3], this seems to produce the same
result as that other test I previously provided[2], where temporary
commits U1' and U2' are finally merged with the original M as a base :)

Just that this looks like an even more straightforward approach...?

The only thing I wonder about here is how we would check whether the
"rebased" merge M' was "clean", or whether we should stop for user
amendment? With that other approach Sergey described, we have U1'==U2'
to test with.

By the way, is there documentation for `git merge-recursive` 
anywhere, besides the code itself...? :$

Thanks, Buga

[2] https://public-inbox.org/git/f1a960dc-cc5c-e7b0-10b6-39e551665...@gmail.com/
[3] Quick test script:
-- >8 --
#!/bin/sh

# rm -rf ./.git
# rm -f ./test.txt

git init

touch ./test.txt
git add -- test.txt

# prepare repository
for i in 1 2 3 4 5 6 7 8
do
echo X$i >>test.txt
git commit -am "X$i"
done

# prepare branch A
git checkout -b A
sed -i '2iA1' test.txt
git commit -am "A1"
sed -i '4iA2' test.txt
git commit -am "A2"
sed -i '6iA3' test.txt
git commit -am "A3"

# prepare branch B
git checkout -b B master
sed -i '5iB1' test.txt
git commit -am "B1"
sed -i '7iB2' test.txt
git commit -am "B2"
sed -i '9iB3' test.txt
git commit -am "B3"

git checkout -b topic A
git merge -s ours --no-commit B # merge A and B with `-s ours`
sed -i '8iM' test.txt   # amend merge commit ("evil merge")
git commit -am "M"
git tag M #original-merge

# master moves on...
git checkout master
git cherry-pick B^ # cherry-pick B2 into master
sed -i "1iX9" test.txt # add X9
git commit -am "X9"

# (0) ---X8--B2'--X9 (master)
#|\
#| A1---A2---A3 (A)
#| \
#|  M (topic)
#| /
#\-B1---B2---B3 (B)

# simple/naive demonstration of proposed merge rebasing logic
# using iterative merge-recursive, preserving merge commit manual
# amendments, testing `-s ours` merge with cherry-picking from
# obsoleted part, but still respecting interactively rebased
# added/modified/dropped/cherry-picked commits :)

git checkout -b A-prime A
git reset --hard HEAD^ # drop A3 from A
sed -i '/A1/c\A12' test.txt# amend A1 to A12
git commit -a --amend --no-edit
git rebase master  # rebase A onto master
git cherry-pick B  # cherry-pick B3 into A

git checkout -b B-prime B
git rebase master  # rebase B onto master
sed -i '12iB4' test.txt# add B4
git commit -am "B4"

git checkout --detach M
git merge-recursive A -- M A-prime
tree="$(git write-tree)"
git merge-recursive B -- $tree B-prime
tree="$(git write-tree)"
git tag M-prime "$(git log --pretty=%B -1 M | git commit-tree $tree -p A-prime 
-p B-prime)"

git update-ref refs/heads/topic "$(git rev-parse M-prime)"
git checkout topic

# (1) ---X8--B2'--X9 (master)
# |\
# | A12--A2'---B3' (A)
# | \
# |  M' (topic)
# | /
# \-B1'--B3'---B4  (B)

# show resulting graph
# echo
# git log --all --decorate --oneline --graph

# in comparison to original merge commit M, rebased merge commit 
# M' is expected to:
#
# - Add X9, from updated "master"
# - Have A1 changed to A12, due to A12 commit 

Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-02 Thread Igor Djordjevic
Hi Phillip,

> On Fri, Mar 2, 2018 at 4:36 AM, Phillip Wood wrote:
> > 
> > > It is interesting to think what it means to faithfully rebase a '-s
> > > ours' merge.
> > 
> > I should have explained what I meant: is a faithful rebase one that
> > adheres to the semantics of '-s ours' (i.e. ignores any changes in the
> > side branch) or one that merges new changes from the side branch into
> > the content of the original merge? In your example you add B4 to B. If
> > M' preserves the semantics of '-s ours' then it will not contain the
> > changes in B4. I think your result does (correct me if I'm wrong) so it
> > is preserving the content of the original merge rather than the
> > semantics of it.

Yeah, I understood what you meant, and I see you noticed that B4
commit, for which I did anticipate possibly bringing up a discussion
like this ;)

I agree with Jake here, my thoughts exactly (what I wrote in that 
other subthread[1], too):

On 02/03/2018 17:02, Jacob Keller wrote:
> 
> We only have the content, and we don't know the semantics (nor, I
> think, should we attempt to understand or figure out the semantics).

Hmm, I wanted to elaborate a bit here, but that sentence seems to 
summarize the pure essence of it, and whatever I write looks like 
just repeating the same stuff again...

That`s just it. And stopping to give the user a chance to 
review/amend the result, where he might decide he actually did want 
something else - so all good.

Otherwise, I would be interested to learn how context/semantics
guessing could provide a better default action (without introducing
more complexity for what might not be much benefit, if any).

But in the end, I guess we can just discuss the "most sane default" 
to present to the user (introduce or obsolete that new commit B4, in 
the discussed example[2]), as we should definitely stop for amending 
anyway, not proceeding automatically whenever U1' != U2'.

Oh, and what about situations where we introduce new or drop existing 
branches (which should be possible with new `--recreate-merges`)...? 
"Preserving old branch semantics" may have even less sense here - the 
(possibly heavily reorganized) content is the only thing we have, 
where context will (and should) be provided by the user.

And I guess being consistent is pretty important, too - if you add
new content during a merge rebase, it should always show up in the
merge, period. It seems pretty confusing to find one of the branches
"declared special" (even more so if it`s based on uncertain
guess-work), so that when you add something to it, it`s just
"swallowed", as the whole branch is always obsoleted, now and forever.

I might even see a value in such behavior, but only as a conscious 
user action, not something done automatically... I guess? :)

Regards, Buga

[1] https://public-inbox.org/git/f26cdbe2-1bc3-02ff-7b99-12a6ebab5...@gmail.com/
[2] https://public-inbox.org/git/f1a960dc-cc5c-e7b0-10b6-39e551665...@gmail.com/


Re: [PATCH v4 19/35] push: pass ref patterns when pushing

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> Construct a list of ref patterns to be passed to 'get_refs_list()' from
> the refspec to be used during the push.  This list of ref patterns will
> be used to allow the server to filter the ref advertisement when
> communicating using protocol v2.
>
> Signed-off-by: Brandon Williams 
> ---
>  transport.c | 26 +-
>  1 file changed, 25 insertions(+), 1 deletion(-)

When you are pushing 'master', we no longer hear what the other end
has at 'next', with this change, no?

In a project whose 'master' is extended primarily by merging topics
that have been cooking in 'next', the old way of pushing would only
have transferred the merge commits and resulting trees but not the bulk
of the blob data, because they are all available on 'next'; would this
make the object transfer far less efficient, I wonder?

I guess it is OK only because the push side of the current protocol
does not do common ancestor discovery exchange ;-)

>
> diff --git a/transport.c b/transport.c
> index dfc603b36..bf7ba6879 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1026,11 +1026,35 @@ int transport_push(struct transport *transport,
>   int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
>   int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
>   int push_ret, ret, err;
> + struct refspec *tmp_rs;
> + struct argv_array ref_patterns = ARGV_ARRAY_INIT;
> + int i;
>  
>   if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
>   return -1;
>  
> - remote_refs = transport->vtable->get_refs_list(transport, 1, 
> NULL);
> + tmp_rs = parse_push_refspec(refspec_nr, refspec);
> + for (i = 0; i < refspec_nr; i++) {
> + const char *pattern = NULL;
> +
> + if (tmp_rs[i].dst)
> + pattern = tmp_rs[i].dst;
> + else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
> + pattern = tmp_rs[i].src;
> +
> + if (pattern) {
> + if (tmp_rs[i].pattern)
> + argv_array_push(&ref_patterns, pattern);
> + else
> + expand_ref_pattern(&ref_patterns, pattern);
> + }
> + }
> +
> + remote_refs = transport->vtable->get_refs_list(transport, 1,
> +  &ref_patterns);
> +
> + argv_array_clear(&ref_patterns);
> + free_refspec(refspec_nr, tmp_rs);
>  
>   if (flags & TRANSPORT_PUSH_ALL)
>   match_flags |= MATCH_REFS_ALL;


Re: [PATCH v4 18/35] fetch: pass ref patterns when fetching

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 850382f55..695fafe06 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -332,11 +332,25 @@ static struct ref *get_ref_map(struct transport *transport,
>   struct ref *rm;
>   struct ref *ref_map = NULL;
>   struct ref **tail = &ref_map;
> + struct argv_array ref_patterns = ARGV_ARRAY_INIT;
>  
>   /* opportunistically-updated references: */
>   struct ref *orefs = NULL, **oref_tail = &orefs;
>  
> - const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
> + const struct ref *remote_refs;
> +
> + for (i = 0; i < refspec_count; i++) {
> + if (!refspecs[i].exact_sha1) {
> + if (refspecs[i].pattern)
> + argv_array_push(&ref_patterns, refspecs[i].src);
> + else
> + expand_ref_pattern(&ref_patterns, refspecs[i].src);
> + }
> + }
> +
> + remote_refs = transport_get_remote_refs(transport, &ref_patterns);
> +
> + argv_array_clear(&ref_patterns);

Is the idea here, which is shared with 17/35 about ls-remote, that
we used to grab literally everything they have in remote_refs, but
we have code in place to filter that set using the refspecs given in
the remote.*.fetch configuration, so it is OK as long as we grab
everything that would match the remote.*.fetch pattern?  That is,
grabbing too much is acceptable, but if we populated ref_patterns[]
with too few patterns and failed to ask for refs that would match our
refspec, it would be a bug?

The reason behind this question is that I am wondering if/how we can
take advantage of this remote-side pre-filtering while doing "fetch
--prune".

Thanks.

>  
>   if (refspec_count) {
>   struct refspec *fetch_refspec;


Re: [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> Construct an argv_array of the ref patterns supplied via the command
> line and pass them to 'transport_get_remote_refs()' to be used when
> communicating protocol v2 so that the server can limit the ref
> advertisement based on the supplied patterns.
>
> Signed-off-by: Brandon Williams 
> ---
>  builtin/ls-remote.c| 12 ++--
>  refs.c | 14 ++
>  refs.h |  7 +++
>  t/t5702-protocol-v2.sh | 26 ++
>  4 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
> index c6e9847c5..083ba8b29 100644
> --- a/builtin/ls-remote.c
> +++ b/builtin/ls-remote.c
> @@ -2,6 +2,7 @@
>  #include "cache.h"
>  #include "transport.h"
>  #include "remote.h"
> +#include "refs.h"
>  
>  static const char * const ls_remote_usage[] = {
>   N_("git ls-remote [--heads] [--tags] [--refs] [--upload-pack=]\n"
> @@ -43,6 +44,7 @@ int cmd_ls_remote(int argc, const char **argv, const char 
> *prefix)
>   int show_symref_target = 0;
>   const char *uploadpack = NULL;
>   const char **pattern = NULL;
> + struct argv_array ref_patterns = ARGV_ARRAY_INIT;
>  
>   struct remote *remote;
>   struct transport *transport;
> @@ -74,8 +76,14 @@ int cmd_ls_remote(int argc, const char **argv, const char 
> *prefix)
>   if (argc > 1) {
>   int i;
>   pattern = xcalloc(argc, sizeof(const char *));
> - for (i = 1; i < argc; i++)
> + for (i = 1; i < argc; i++) {
>   pattern[i - 1] = xstrfmt("*/%s", argv[i]);
> +
> + if (strchr(argv[i], '*'))
> + argv_array_push(&ref_patterns, argv[i]);
> + else
> + expand_ref_pattern(&ref_patterns, argv[i]);
> + }
>   }
>  
>   remote = remote_get(dest);
> @@ -96,7 +104,7 @@ int cmd_ls_remote(int argc, const char **argv, const char 
> *prefix)
>   if (uploadpack != NULL)
>   transport_set_option(transport, TRANS_OPT_UPLOADPACK, 
> uploadpack);
>  
> - ref = transport_get_remote_refs(transport, NULL);
> + ref = transport_get_remote_refs(transport, &ref_patterns);

Yup, this is a logical and obvious conclusion of the past handful
of steps ;-) I actually was wondering why the previous step didn't
do this already, but the resulting series is easier to understand if
this is kept as a separate step.

However, this also means that the traditional pattern language
ls-remote used to support dictates what the ls-refs command over the
wire can take, which may not be optimal.


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Junio C Hamano
Sam Kuper  writes:

> 1. It would yield, IIUC, less flexibility to create new kinds of view
> based on a consistent, standardised underlying model.
>
> 2. It is harder to read, for some types of input (e.g. prose) than the
> view generated by the existing word-diff algorithm.

The loss of line ends in the lossy "word-diff" output does not matter
if you never split hunks, but to be able to split a hunk at an
in-between context line (which you can already do), and for new
features like per-line selection that are emerging, keeping a 1:1 line
correspondence between what is shown and what is applied is a must.

Unless you are volunteering to design (notice that I am not saying
"implement") both diff generation/coloration side _and_ patch
application side, that is.  In which case, you may be able to come
up with a magic ;-)



Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> + ls-refs
> +-
> +
> +`ls-refs` is the command used to request a reference advertisement in v2.
> +Unlike the current reference advertisement, ls-refs takes in arguments
> +which can be used to limit the refs sent from the server.

OK.

> +Additional features not supported in the base command will be advertised
> +as the value of the command in the capability advertisement in the form
> +of a space separated list of features, e.g.  "<command>=<feature 1>
> +<feature 2>".

Doesn't this explain the general convention that applies to any
command, not just the ls-refs command?  As a part of the ls-refs section,
<command> in the above explanation is always a constant "ls-refs",
right?

It is a bit unclear how the <feature>s in the above description are
related to "arguments" in the following paragraph.  Does the server
that can show symref and peeled tags and that can limit the output
with ref-pattern advertise these three as supported features, i.e.

ls-refs=symrefs peel ref-pattern

or something?  Would there a case where a "feature" does not
correspond 1:1 to an argument to the command, and if so how would
the server and the client negotiate use of such a feature?

> +ref-pattern 
> + When specified, only references matching one of the provided
> + patterns are displayed.  A pattern is either a valid refname
> + (e.g.  refs/heads/master), in which a ref must match the pattern
> + exactly, or a prefix of a ref followed by a single '*' wildcard
> + character (e.g. refs/heads/*), in which a ref must have a prefix
> + equal to the pattern up to the wildcard character.

I thought the recent consensus was a left-anchored prefix match that
honors the /-directory boundary, i.e. no explicit asterisk, and just
saying "refs/heads" is enough to match "refs/heads" itself and
"refs/heads/master" but not "refs/headscarf"?


Re: [PATCH v4 12/35] serve: introduce git-serve

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> + /*
> +  * Function queried to see if a capability should be advertised.
> +  * Optionally a value can be specified by adding it to 'value'.
> +  * If a value is added to 'value', the server will advertise this
> +  * capability as "<name>=<value>" instead of "<name>".
> +  */
> + int (*advertise)(struct repository *r, struct strbuf *value);

So this is "do we tell them about this capability?"

> +static void advertise_capabilities(void)
> +{
> + ...
> + for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
> + struct protocol_capability *c = &capabilities[i];
> +
> + if (c->advertise(the_repository, &value)) {
> + strbuf_addstr(&capability, c->name);
> + ...

And used as such in this function.  We tell the other side about the
capability only when .advertise returns true.

> +static int is_valid_capability(const char *key)
> +{
> + const struct protocol_capability *c = get_capability(key);
> +
> + return c && c->advertise(the_repository, NULL);
> +}

But this is different---the other side mentioned a capability's
name, and we looked it up from our table to see if we know about it
(i.e. NULL-ness of c), but in addition, we ask if we would tell them
about it if we were advertising.  I am not sure how I should feel
about it (yet).

> +static int is_command(const char *key, struct protocol_capability **command)
> +{
> + const char *out;
> +
> + if (skip_prefix(key, "command=", )) {
> + struct protocol_capability *cmd = get_capability(out);
> +
> + if (!cmd || !cmd->advertise(the_repository, NULL) || 
> !cmd->command)
> + die("invalid command '%s'", out);
> + if (*command)
> + die("command already requested");

Shouldn't these two checks that lead to die be the other way around?
When they give us "command=frotz" and we already have *command, it
would be an error whether we understand 'frotz' or not.
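
As a standalone sketch with simplified stand-ins (not the code in this
patch), the suggested ordering would reject a duplicate "command="
request before asking whether the named command is even known:

#include <stdio.h>
#include <string.h>

struct cap { const char *name; };

static struct cap known[] = { { "ls-refs" }, { "fetch" } };

static struct cap *lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
		if (!strcmp(known[i].name, name))
			return &known[i];
	return NULL;
}

static int is_command(const char *key, struct cap **command)
{
	const char *out;
	struct cap *cmd;

	if (strncmp(key, "command=", 8))
		return 0;
	out = key + 8;
	if (*command) {		/* duplicate request: an error regardless */
		fprintf(stderr, "command '%s' requested after '%s'\n",
			out, (*command)->name);
		return -1;
	}
	cmd = lookup(out);	/* only then: do we know this command? */
	if (!cmd) {
		fprintf(stderr, "invalid command '%s'\n", out);
		return -1;
	}
	*command = cmd;
	return 1;
}

int main(void)
{
	struct cap *cmd = NULL;

	is_command("command=ls-refs", &cmd);	/* accepted */
	is_command("command=frotz", &cmd);	/* rejected: already requested */
	return 0;
}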

Who is the target audience of these "die"s?  Are they meant to be
communicated back to the other side of the connection, or are they
only to be sent to the "server log"?

The latter may want to say what the two conflicting commands are in
the log message, perhaps?

> + *command = cmd;



Re: [PATCH v4 12/35] serve: introduce git-serve

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> + Capabilities
> +~~
> +
> +There are two different types of capabilities: normal capabilities,
> +which can be used to to convey information or alter the behavior of a
> +request, and commands, which are the core actions that a client wants to
> +perform (fetch, push, etc).
> +
> +All commands must only last a single round and be stateless from the
> +perspective of the server side.  All state MUST be retained and managed
> +by the client process.  This permits simple round-robin load-balancing
> +on the server side, without needing to worry about state management.
> +
> +Clients MUST NOT require state management on the server side in order to
> +function correctly.

This somehow feels like a bit too HTTP-centric a worldview, one that
may potentially penalize those who do not mind stateful services.

> + agent
> +---
> +
> +The server can advertise the `agent` capability with a value `X` (in the
> +form `agent=X`) to notify the client that the server is running version
> +`X`.  The client may optionally send its own agent string by including
> +the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> +request to the server (but it MUST NOT do so if the server did not
> +advertise the agent capability).

Are there different degrees of permissiveness between "The server
CAN" and "The client MAY" above, or is the above paragraph merely
being fuzzy?

I notice that, with the above "MUST NOT", it is impossible for a
server to collect voluntary census information from client without
revealing its own "version".  Because in principle it is not
sensible to allow one side to send random capabilities without first
making sure that the other side understands them, unsolicited
"agent" from the client over a channel where the server did not say
it would accept one is quite fine, and the server can always say
something silly like "agent=undisclosed" to allow the clients to
volunteer their own version, but the definition of this capability
smells like conflating two unrelated things (i.e. advertising your
own version vs permission to announce yourself).



Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Sam Kuper
On 02/03/2018, Junio C Hamano  wrote:
> Jeff King  writes:
>> Unfortunately, I don't think there's an easy way to do what you want
>> (show word-diffs but apply the full diff).
>
> The current "word-diff" discards the information on where the lines
> end, and it is pretty much hopeless/useless in the context of "add
> -p".

This is unfortunate.

I am familiar with the model-view-controller ("MVC") paradigm used in
some software packages. I had hoped that Git followed this paradigm
somewhat, and cleanly separated the underlying model of the diff (i.e.
a representation that would be generated in all cases where a diff is
needed), and the view of the diff (i.e. the visual representation,
e.g. word-diff, line diff, colored, non-colored, etc, as requested by
the user).

Such separation of concerns strikes me as being the best approach
here. It could be a lot of work, but I would be grateful if the Git
developers/maintainers could work towards it as an end goal for this
aspect of Git's architecture.

Unfortunately, I am not very familiar with the Git codebase, nor
well-versed in its primary languages, so I can't be of much help here
:(

> It would be a good addition to revamp it so that it keeps the
> original lines in pre/post images but only colors/highlights the
> byte ranges in such a way that (1) on a pre-image line, paint only
> the byte ranges that do not appear in the post-image, and (2) on a
> post-image line, paint only the byte ranges that do not appear in the
> pre-image.
>
> E.g. Given these two
>
> diff --git a/1 b/2
> index aa49234b68..1cd9f6b5fd 100644
> --- a/1
> +++ b/2
> @@ -1,2 +1 @@
> -a b c e
> -f g i j
> +a b d f g h
>
> the current word-diff would give (I cannot do colors here, so I'll
> upcase the colored letters):
>
> @@ -1,2 +1 @@
> a b C ED f g I JH
>
> as the output, but if we produced
>
> @@ -1,2 +1 @@
> -a b C E
> -f g I J
> +a b D f g H
>
> instead, then colored "add -p" would become easier to see.
>
> And that would be useful outside the context of "add -p", which is a
> huge plus.

This would be much better than the current situation, where the visual
representation gives misleading feedback about the underlying diff,
and where the error I reported crops up.

However, in my opinion your proposed solution would still be not as
good as separating the two concerns, as described earlier in this
email, on two fronts:

1. It would yield, IIUC, less flexibility to create new kinds of view
based on a consistent, standardised underlying model.

2. It is harder to read, for some types of input (e.g. prose) than the
view generated by the existing word-diff algorithm. Your approach, as
outlined in your example above, requires the reader to visually jump
(saccade) between two lines that are separated by an intervening line,
in order to see what has changed. The existing word-diff is much
easier to use: it puts the changes right next to each other, avoiding
line-skipping and allowing horizontal saccades (which are much more
familiar to people used to languages written in left-to-right or
right-to-left scripts).

I don't want to sound negative. As I say, I think your solution is a
big improvement on what currently exists. But I would see it as an
intermediate solution - a stopgap - rather than an endpoint of
development.

In other words, if your solution would be much quicker to implement
than the one I proposed, then please go ahead with yours first, and
please add mine to the longer-term to-do list :)

Thank you again for helping to make Git such a great tool, and for
working tirelessly to improve it further!

P.S. There is a related StackOverflow question here:
https://stackoverflow.com/questions/49058817/git-add-patch-with-word-diff


Re: [PATCH v5 01/12] sequencer: avoid using errno clobbered by rollback_lock_file()

2018-03-02 Thread Johannes Schindelin
Hi Martin,

On Tue, 27 Feb 2018, Martin Ågren wrote:

> On 26 February 2018 at 22:29, Johannes Schindelin
>  wrote:
> > As pointed out in a review of the `--recreate-merges` patch series,
> > `rollback_lock_file()` clobbers errno. Therefore, we have to report the
> > error message that uses errno before calling said function.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  sequencer.c | 13 -
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index e9baaf59bd9..5aa3dc3c95c 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -345,12 +345,14 @@ static int write_message(const void *buf, size_t len, 
> > const char *filename,
> > if (msg_fd < 0)
> > return error_errno(_("could not lock '%s'"), filename);
> > if (write_in_full(msg_fd, buf, len) < 0) {
> > +   error_errno(_("could not write to '%s'"), filename);
> > rollback_lock_file(&msg_file);
> > -   return error_errno(_("could not write to '%s'"), filename);
> > +   return -1;
> > }
> > if (append_eol && write(msg_fd, "\n", 1) < 0) {
> > +   error_errno(_("could not write eol to '%s'"), filename);
> > rollback_lock_file(&msg_file);
> > -   return error_errno(_("could not write eol to '%s'"), filename);
> > +   return -1;
> > }
> > if (commit_lock_file(&msg_file) < 0) {
> > rollback_lock_file(&msg_file);
> > @@ -2106,16 +2108,17 @@ static int save_head(const char *head)
> >
> > fd = hold_lock_file_for_update(&head_lock, git_path_head_file(), 0);
> > if (fd < 0) {
> > +   error_errno(_("could not lock HEAD"));
> > rollback_lock_file(&head_lock);
> > -   return error_errno(_("could not lock HEAD"));
> > +   return -1;
> > }
> 
> I just noticed this when test-merging my series of lockfile-fixes to pu.
> This `rollback_lock_file()` is not needed, since failure to take the
> lock leaves it unlocked. If one wants to roll back the lock "for
> clarity" or "just to be safe", then the same should arguably be done in
> `write_message()`, just barely visible at the top of this diff.
> 
> Perhaps not worth a reroll. The conflict resolution between this and my
> patch would be to take my hunk.
> 
> https://public-inbox.org/git/cover.1519763396.git.martin.ag...@gmail.com/T/#t

Thank you for working on this!
Dscho

Re: [PATCH] git-p4: Fix depot path stripping

2018-03-02 Thread Nuno Subtil
Yes, that's where we left off. I owe you better testing of the
clientspec-side path remapping case, which I hope to get to soon. Will
get back to you as soon as I can.

Thanks,
Nuno


On Fri, Mar 2, 2018 at 2:09 AM, Luke Diamand  wrote:
> On 27 February 2018 at 19:00, Nuno Subtil  wrote:
>> I originally thought this had been designed such that the p4 client spec
>> determines the paths, which would make sense. However, git-p4 still ignores
>> the local path mappings in the client spec when syncing p4 depot paths
>> w/--use-client-spec. Effectively, it looks as though --use-client-spec
>> always implies --keep-paths, which didn't seem to me like what was intended.
>>
>> For the use case I'm dealing with (syncing a p4 path but ignoring some
>> directories inside it), there seems to be no way today to generate a git
>> tree rooted at the p4 path location instead of the root of the depot, which
>> looks like a bug to me. I don't really understand the overall design well
>> enough to tell whether the bug lies in stripRepoPath or somewhere else, to
>> be honest, but that's where I saw the inconsistency manifest itself.
>
> I replied but managed to drop git-users off the thread. So trying again!
>
> The behavior is a touch surprising, but I _think_ it's correct.
>
> With --use-client-spec enabled, paths in the git repo get mapped as
> if you had used the file mapping in the client spec, using "p4 where".
>
> So, for example, if you have a client spec which looks like:
>//depot/... //my_client_spec/...
>
> then you're going to get the full repo structure, even if you only
> clone a subdirectory.
>
> e.g. if you clone //depot/a/b/c then with "--use-client-spec" enabled,
> you will get a/b/c in your git repo, and with it not enabled, you will
> just get c/.
>
> If instead your client spec looks like:
>   //depot/a/b/... //my_client_spec/...
>
> then you should only get c/d. (And a quick test confirms that).
>
> I think Nuno's original question comes from wanting to map some files
> into place which the clientspec mapping emulation in git-p4 was
> possibly not handling - if we can get a test case for that (or an
> example) then we can see if it's just that the client mapping code
> that Pete put in is insufficient, or if there's some other way around.
> Or if indeed I'm wrong, and there's a bug! There may be some weird
> corner-cases where it might do the wrong thing.
>
> Thanks,
> Luke
>
>
>>
>> Nuno
>>
>>
>> On Tue, Feb 27, 2018 at 3:12 AM, Luke Diamand  wrote:
>>>
>>> On 27 February 2018 at 08:43, Nuno Subtil  wrote:
>>> > The issue is that passing in --use-client-spec will always cause git-p4
>>> > to
>>> > preserve the full depot path in the working tree that it creates, even
>>> > when
>>> > --keep-path is not used.
>>> >
>>> > The repro case is fairly simple: cloning a given p4 path that is nested
>>> > 2 or
>>> > more levels below the depot root will have different results with and
>>> > without --use-client-spec (even when the client spec is just a
>>> > straightforward map of the entire depot).
>>> >
>>> > E.g., 'git p4 clone //p4depot/path/to/some/project/...' will create a
>>> > working tree rooted at project. However, 'git p4 clone --use-client-spec
>>> > //p4depot/path/to/some/project/...' will create a working tree rooted at
>>> > the
>>> > root of the depot.
>>>
>>> I think it _might_ be by design.
>>>
>>> At least, the test ./t9809-git-p4-client-view.sh seems to fail for me
>>> with your change, although I haven't investigated what's going on:
>>>
>>> $ ./t9809-git-p4-client-view.sh
>>> ...
>>> ...
>>> Doing initial import of //depot/dir1/ from revision #head into
>>> refs/remotes/p4/master
>>> ./t9809-git-p4-client-view.sh: 10: eval: cannot create dir1/file13:
>>> Directory nonexistent
>>> not ok 23 - subdir clone, submit add
>>>
>>> I think originally the logic came from this change:
>>>
>>>21ef5df43 git p4: make branch detection work with --use-client-spec
>>>
>>> which was fixing what seems like the same problem but with branch
>>> detection enabled.
>>>
>>>
>>> >
>>> > Thanks,
>>> > Nuno
>>> >
>>> >
>>> > On Tue, Feb 27, 2018 at 12:10 AM, Luke Diamand  wrote:
>>> >>
>>> >> On 27 February 2018 at 04:16, Nuno Subtil  wrote:
>>> >> > When useClientSpec=true, stripping of P4 depot paths doesn't happen
>>> >> > correctly on sync. Modifies stripRepoPath to handle this case better.
>>> >>
>>> >> Can you give an example of how this shows up? I could quickly write a
>>> >> test case for this if I knew what was going on.
>>> >>
>>> >> Thanks
>>> >> Luke
>>> >>
>>> >>
>>> >> >
>>> >> > Signed-off-by: Nuno Subtil 
>>> >> > ---
>>> >> >  git-p4.py | 12 +---
>>> >> >  1 file changed, 9 insertions(+), 3 deletions(-)
>>> >> >
>>> >> > diff --git a/git-p4.py b/git-p4.py
>>> >> > index 7bb9cadc69738..3df95d0fd1d83 100755
>>> >> > --- a/git-p4.py
>>> >> > +++ b/git-p4.py


Re: [PATCH v4 34/35] remote-curl: implement stateless-connect command

2018-03-02 Thread Johannes Schindelin
Hi Brandon,

On Wed, 28 Feb 2018, Brandon Williams wrote:

> diff --git a/remote-curl.c b/remote-curl.c
> index 66a53f74b..3f882d766 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -188,7 +188,12 @@ static struct ref *parse_git_refs(struct discovery 
> *heads, int for_push)
> [...]
> +static size_t proxy_in(char *buffer, size_t eltsize,
> +size_t nmemb, void *userdata)
> +{
> + size_t max;
> + struct proxy_state *p = userdata;
> + size_t avail = p->request_buffer.len - p->pos;
> +
> +
> + if (eltsize != 1)
> + BUG("curl read callback called with size = %zu != 1", eltsize);

The format specifier %z is not actually portable. Please use PRIuMAX and
cast to (uintmax_t) instead.
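
For illustration, the message above could be written portably along
these lines (standalone sketch, not the actual patch):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	size_t eltsize = 4;

	/* PRIuMAX plus a cast to uintmax_t instead of the non-portable %zu */
	fprintf(stderr,
		"curl read callback called with size = %"PRIuMAX" != 1\n",
		(uintmax_t)eltsize);
	return 0;
}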

This breaks the Windows build of `pu` (before that, there was still a test
failure that I did not have the time to chase down).

Ciao,
Dscho


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread SZEDER Gábor
On Fri, Mar 2, 2018 at 6:11 PM, Junio C Hamano  wrote:
> OK, I think I now understand what happened.  I misread the "fold"
> discussion and thought we were still exploring the possibility, to
> avoid showing uninteresting zero-status case to the users.
>
> If we do not care about that part, then it seems that the discussion
> thread is complete.  Let's move on.

I think we don't care much about folding, at least not in conjunction
with the patch in question.  And even if we change our minds in the
future, we can always add that on top.


Note that nowadays there is a _different_ issue that may make folding
worthwhile.

Recently[1] we started to run the test suite twice in one of the Linux
build jobs: once "as usual" and once with split index enabled.  If a
test fails in this build job, then it's impossible to tell which one of
the two test runs failed just by looking at the end of the build job's
log; one has to scroll back to the start of the failed test run to see
the executed command shown by the 'set -x' trace.  Now, if the failure
happened in the split index enabled test run, then this line is
somewhere in the middle of the looong log (even without the patch in
question!), which is a bit inconvenient to find.  If the outputs of
building Git _and_ the "as usual" test run were folded, then this line
would be closer to the top and easier to find.

Though, arguably, if all or most of the other build jobs have succeeded,
then it's quite likely that the failure happened in the split index
enabled test run.  And if the other build jobs have failed as well,
then, on one hand, the failure is almost certainly in the "as usual"
test run, and, on the other hand, one can check the output of any of
those other failed build jobs to see what went wrong.  As far as I
observed, the failure tends to happen in the split index enabled test
run most of the time ;) and I have a couple of patches almost ready for
submission to address some of the transient failures related to split
index.

[1] - ae59a4e44f (travis: run tests with GIT_TEST_SPLIT_INDEX,
  2018-01-07)


Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-02 Thread Igor Djordjevic
Hi Phillip,

On 02/03/2018 17:00, Jacob Keller wrote:
> 
> > It is interesting to think what it means to faithfully rebase a '-s
> > ours' merge. In your example the rebase does not introduce any new
> > changes into branch B that it doesn't introduce to branch A. Had it
> > added a fixup to branch B1 for example or if the topology was more
> > complex so that B ended up with some other changes that the rebase did
> > not introduce into A, then M' would contain those extra changes whereas
> > '--recreate-merges' with '-s ours' (once it supports it) would not.
> 
> Unless the method of merging was stored, I don't think we *can*
> correctly automate resolving of "-s ours" because all we store is the
> resulting content, and we don't know how or why the user generated it
> as such. I believe the "correct" solution in any case would be to take
> the content we DO know and then ask the user to stop for amendments.
 
I agree with Jake, and for the exact same reasons.

That said, I`d like to see what the mentioned '--recreate-merges' with
'-s ours' does (or would do, once it supports it) - would you have a
pointer for me on where to look?

But if that`s something yet to come, maybe it`s still open for
discussion. I mean, even this topic started inside the original
`--recreate-merges` one[1], and hopefully it can still bring
improvements there (sooner or later).

Thanks, Buga

[1] 
https://public-inbox.org/git/cover.1516225925.git.johannes.schinde...@gmx.de/


Re: [PATCH 1/2] ref-filter: mark a file-local symbol as static

2018-03-02 Thread Ramsay Jones


On 02/03/18 17:19, Junio C Hamano wrote:
> Ramsay Jones  writes:
> 
>> Junio, do you want me to re-roll, or would you mind tweaking the
>> commit message while queueing?
> 
> Perfect timing ;-)  I was about to get to these two patches.  Here
> is what is queued.

Thanks!

ATB,
Ramsay Jones

> commit 2d7cb07e3718d0af6547e2abb35f9cff9b10c1f5
> Author: Ramsay Jones 
> Date:   Fri Mar 2 02:54:02 2018 +
> 
> ref-filter: mark a file-local symbol as static
> 
> Commit fcfba37337 ('ref-filter: make "--contains <id>" less chatty if
> <id> is invalid', 2018-02-23) added the add_str_to_commit_list()
> function, which causes sparse to issue a "... not declared. Should it
> be static?" warning for that symbol.
> 
> Indeed, the function is only used in this one compilation unit. Mark
> it static.
> 
> Signed-off-by: Ramsay Jones 
> Signed-off-by: Junio C Hamano 
> 
> diff --git a/ref-filter.c b/ref-filter.c
> index aa282a27f4..ed993d32d8 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2000,7 +2000,7 @@ static void do_merge_filter(struct ref_filter_cbdata 
> *ref_cbdata)
>   free(to_clear);
>  }
>  
> -int add_str_to_commit_list(struct string_list_item *item, void *commit_list)
> +static int add_str_to_commit_list(struct string_list_item *item, void 
> *commit_list)
>  {
>   struct object_id oid;
>   struct commit *commit;
> 
> 



Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Junio C Hamano
Jeff King  writes:

> Unfortunately, I don't think there's an easy way to do what you want
> (show word-diffs but apply the full diff).

The current "word-diff" discards the information on where the lines
end, and it is pretty much hopeless/useless in the context of "add
-p".  It would be a good addition to revamp it so that it keeps the
original lines in pre/post images but only colors/highlights the
byte ranges in such a way that (1) on a pre-image line, paint only
the byte ranges that do not appear in the post-image, and (2) on a
post-image line, paint only the byte ranges that do not appear in the
pre-image.  

E.g. Given these two

diff --git a/1 b/2
index aa49234b68..1cd9f6b5fd 100644
--- a/1
+++ b/2
@@ -1,2 +1 @@
-a b c e
-f g i j
+a b d f g h

the current word-diff would give (I cannot do colors here, so I'll
upcase the colored letters):

@@ -1,2 +1 @@
a b C ED f g I JH

as the output, but if we produced

@@ -1,2 +1 @@
-a b C E
-f g I J
+a b D f g H

instead, then colored "add -p" would become easier to see.

And that would be useful outside the context of "add -p", which is a
huge plus.



Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-02 Thread Igor Djordjevic
Hi Sergey,

On 02/03/2018 06:40, Sergey Organov wrote:
> 
> > So... In comparison to original merge commit M, rebased merge commit 
> > M' is expected to:
> > 
> >  - Add X9, from updated "master"
> >  - Have A1 changed to A12, due to A12 commit amendment
> >  - Keep A2, rebased as A2'
> >  - Remove A3, due to dropped A3 commit
> >  - Keep amendment from original (evil) merge commit M
> >  - Miss B1' like M does B, due to original `-s ours` merge strategy
> >  - Add B2, cherry-picked as B2' into "master"
> >  - Add B3, cherry-picked as B3' into "A"
> >  - Add B4, added to "B"
> >  - Most important, provide safety mechanism to "fail loud", being 
> >aware of non-trivial things going on, allowing to stop for user 
> >inspection/decision
> > 
> > 
> > There, I hope I didn`t miss any expectation. And, it _seems_ to work 
> > exactly as expected :D
> 
> That's very nice, to the level of being even suspect! :-)

Heh, indeed :) I`m already thinking of some even more complex 
situations. The more use/test cases, the merrier.

Like if the number of merge parents (branches) gets changed during an
interactive rebase...

> To avoid falling into euphoria though, we need to keep in mind that
> "expectations" is rather vague concept, and we likely still need to stop
> for user amendment unless we absolutely sure nothing surprising happens.
> I.e., we better require U1'==U2' test to succeed to proceed non-stop
> automatically. Besides, it will be somewhat inline with what 'rerere'
> does.

I totally agree, and I think whatever we come up with, we`ll always
be missing some deeper context of the original merge, so the U1'==U2'
test is a must - if it fails, even if we didn`t get any conflicts and
could otherwise proceed automatically, better to stop for a user sanity
check.

I`m still thinking whether there could be a situation where the test
passes, but the result might still be suspicious (and worth
inspecting), though.

Regards, Buga


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Lars Schneider

> On 02 Mar 2018, at 18:11, Junio C Hamano  wrote:
> 
> Junio C Hamano  writes:
> 
>> SZEDER Gábor  writes:
>> 
>>> On Thu, Mar 1, 2018 at 11:20 PM, Junio C Hamano  wrote:
>>> 
 * sg/travis-build-during-script-phase (2018-01-08) 1 commit
 - travis-ci: build Git during the 'script' phase
 
 Stalled for too long without any response; will discard.
>>> 
>>> I still think this is a good change as-is and it does make checking the
>>> results of Travis CI builds easier.  The issue Lars raised in [1] is in
>>> my opinion a non-issue [2] in practice and Lars hasn't yet disagreed
>>> with that.
>> 
>> OK, so I simply misread the discussion and did not realize that it
>> reached a conclusion?  If that's the case, good ;-).  Thanks.
> 
> OK, I think I now understand what happened.  I misread the "fold"
> discussion and thought we were still exploring the possibility, to
> avoid showing uninteresting zero-status case to the users.
> 
> If we do not care about that part, then it seems that the discussion
> thread is complete.  Let's move on.

All good with me. I just wanted to explain my reasoning for the initial
implementation. I do understand Szeder's reasoning too and I am OK
with the change.

Thanks for improving the TravisCI integration Szeder!

- Lars



Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Sam Kuper
On 02/03/2018, Jeff King  wrote:
> Unfortunately, I don't think there's an easy way to do what you want
> (show word-diffs but apply the full diff).

Oh :(

That would be a *very* useful feature to have, especially where
multiple small (e.g. single character or single word) changes are
sprinkled throughout a large file.

Should I start a separate thread for this as a feature request?


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Sam Kuper
On 02/03/2018, Jonathan Nieder  wrote:
> Is this reproducible for you?

Yes. It seems to occur consistently, given the same input.

> Do you have more details about how I can reproduce it?

Unfortunately, the particular git repo I encountered it on is private,
otherwise I would point you to it.

I haven't attempted to create a MWE. Am I correct that doing so is now
not needed, in the light of Jeff King's email below?


> What arch are you on?  What perl version do you use?  Can you report
> this using "reportbug git"?

All perfectly decent questions. For a modicum of security/privacy, I
would prefer to avoid answering them unless necessary.

Am I right in thinking that these answers are no longer needed, in the
light of Jeff King's email below?



On 02/03/2018, Jeff King  wrote:
>   3. Your invocation in particular is a problem because it uses
>  --word-diff, which will not have a one-to-one line correspondence
>  with the bare diff.
>
>  add--interactive handles pretty-printing by running the diff
>  command twice: once with no special options, and once with
>  "--color" and piped through the diffFilter. It assumes that the two
>  match each other line for line, so it shows you the "DISPLAY"
>  variant, but then ultimately applies the "TEXT" variant.
>
> And that last one is the cause of the errors you see:
>
>> Use of uninitialized value $_ in print at
>> /usr/lib/git-core/git-add--interactive line 1371,  line 74.
>
> The "DISPLAY" run for your case generates fewer lines than the "TEXT"
> run, and we complain on trying to show those extra lines.


Re: [PATCH 1/2] ref-filter: mark a file-local symbol as static

2018-03-02 Thread Junio C Hamano
Ramsay Jones  writes:

> Junio, do you want me to re-roll, or would you mind tweaking the
> commit message while queueing?

Perfect timing ;-)  I was about to get to these two patches.  Here
is what is queued.

commit 2d7cb07e3718d0af6547e2abb35f9cff9b10c1f5
Author: Ramsay Jones 
Date:   Fri Mar 2 02:54:02 2018 +

ref-filter: mark a file-local symbol as static

Commit fcfba37337 ('ref-filter: make "--contains <id>" less chatty if
<id> is invalid', 2018-02-23) added the add_str_to_commit_list()
function, which causes sparse to issue a "... not declared. Should it
be static?" warning for that symbol.

Indeed, the function is only used in this one compilation unit. Mark
it static.

Signed-off-by: Ramsay Jones 
Signed-off-by: Junio C Hamano 

diff --git a/ref-filter.c b/ref-filter.c
index aa282a27f4..ed993d32d8 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2000,7 +2000,7 @@ static void do_merge_filter(struct ref_filter_cbdata 
*ref_cbdata)
free(to_clear);
 }
 
-int add_str_to_commit_list(struct string_list_item *item, void *commit_list)
+static int add_str_to_commit_list(struct string_list_item *item, void 
*commit_list)
 {
struct object_id oid;
struct commit *commit;



Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Junio C Hamano
Jeff King  writes:

> That's probably a reasonable sanity check, but I think we need to abort
> and not just have a too-small DISPLAY array. Because later code like the
> hunk-splitting is going to assume that there's a 1:1 line
> correspondence. We definitely don't want to end up in a situation where
> we show one thing but apply another.

Yes, agreed completely.


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Junio C Hamano
Junio C Hamano  writes:

> SZEDER Gábor  writes:
>
>> On Thu, Mar 1, 2018 at 11:20 PM, Junio C Hamano  wrote:
>>
>>> * sg/travis-build-during-script-phase (2018-01-08) 1 commit
>>>  - travis-ci: build Git during the 'script' phase
>>>
>>>  Stalled for too long without any response; will discard.
>>
>> I still think this is a good change as-is and it does make checking the
>> results of Travis CI builds easier.  The issue Lars raised in [1] is in
>> my opinion a non-issue [2] in practice and Lars hasn't yet disagreed
>> with that.
>
> OK, so I simply misread the discussion and did not realize that it
> reached a conclusion?  If that's the case, good ;-).  Thanks.

OK, I think I now understand what happened.  I misread the "fold"
discussion and thought we were still exploring the possibility, to
avoid showing uninteresting zero-status case to the users.

If we do not care about that part, then it seems that the discussion
thread is complete.  Let's move on.


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Thu, Mar 1, 2018 at 11:20 PM, Junio C Hamano  wrote:
>
>> * sg/travis-build-during-script-phase (2018-01-08) 1 commit
>>  - travis-ci: build Git during the 'script' phase
>>
>>  Stalled for too long without any response; will discard.
>
> I still think this is a good change as-is and it does make checking the
> results of Travis CI builds easier.  The issue Lars raised in [1] is in
> my opinion a non-issue [2] in practice and Lars hasn't yet disagreed
> with that.

OK, so I simply misread the discussion and did not realize that it
reached a conclusion?  If that's the case, good ;-).  Thanks.

>
> Not sure what else I could say that's not already in the commit message
> or in my response in [2].
>
>
> [1] - 
> https://public-inbox.org/git/5de3fa05-2347-4be7-8a1a-a6e5feec7...@gmail.com/
>
> [2] - 
> https://public-inbox.org/git/cam0vkjnszoc+e408ifkcg+qptagrnl3e3jvdrn573kdcbsz...@mail.gmail.com/


Re: [PATCH 1/2] ref-filter: mark a file-local symbol as static

2018-03-02 Thread Ramsay Jones


On 02/03/18 03:59, Jonathan Nieder wrote:
> Hi,
> 
> Ramsay Jones wrote:
> 
>> Commit fcfba37337 ('ref-filter: make "--contains <id>" less chatty if
>> <id> is invalid', 2018-02-23) added the add_str_to_commit_list()
>> function, which causes sparse to issue a "... not declared. Should it
>> be static?" warning for that symbol.
> 
> Thanks for catching it!
> 
>> In order to suppress the warning, mark that function as static.
> 
> Isn't this closer to
> 
>   Indeed, the function is only used in this one compilation
>   unit. Mark it static.
> 
> ?  In other words, sparse's warning is accurate, and this is not about
> trying to quiet a false positive but about addressing a true positive.

I thought that was implied by the commit subject line. :-D

However, it certainly doesn't hurt to be more explicit.

Junio, do you want me to re-roll, or would you mind tweaking the
commit message while queueing?

Thanks!

ATB,
Ramsay Jones




Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 08:53:34AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Because the array is full of "undef". See parse_diff(), which does this:
> >
> >   my @diff = run_cmd_pipe("git", @diff_cmd, "--", $path);
> >   ...
> >   @colored = run_cmd_pipe(@display_cmd);
> >   ...
> >   for (my $i = 0; $i < @diff; $i++) {
> >   ...
> >   push @{$hunk[-1]{TEXT}}, $diff[$i];
> >   push @{$hunk[-1]{DISPLAY}},
> >(@colored ? $colored[$i] : $diff[$i]);
> >   }
> >
> > If the @colored output is shorter than the normal @diff output, we'll
> > push a bunch of "undef" onto the DISPLAY array (the ternary there is
> > because sometimes @colored is completely empty if the user did not ask
> > for color).
> 
> An obvious sanity check would be to ensure @colored == @diff
> 
> push @{$hunk[-1]{DISPLAY}},
> -(@colored ? $colored[$i] : $diff[$i]);
> +(@colored && @colored == @diff ? $colored[$i] : 
> $diff[$i]);
>  }
> 
> or something like that?

That's probably a reasonable sanity check, but I think we need to abort
and not just have a too-small DISPLAY array. Because later code like the
hunk-splitting is going to assume that there's a 1:1 line
correspondence. We definitely don't want to end up in a situation where
we show one thing but apply another.

-Peff


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Junio C Hamano
Jeff King  writes:

> Because the array is full of "undef". See parse_diff(), which does this:
>
>   my @diff = run_cmd_pipe("git", @diff_cmd, "--", $path);
>   ...
>   @colored = run_cmd_pipe(@display_cmd);
>   ...
>   for (my $i = 0; $i < @diff; $i++) {
>   ...
>   push @{$hunk[-1]{TEXT}}, $diff[$i];
>   push @{$hunk[-1]{DISPLAY}},
>(@colored ? $colored[$i] : $diff[$i]);
>   }
>
> If the @colored output is shorter than the normal @diff output, we'll
> push a bunch of "undef" onto the DISPLAY array (the ternary there is
> because sometimes @colored is completely empty if the user did not ask
> for color).

An obvious sanity check would be to ensure @colored == @diff

push @{$hunk[-1]{DISPLAY}},
-(@colored ? $colored[$i] : $diff[$i]);
+(@colored && @colored == @diff ? $colored[$i] : $diff[$i]);
 }

or something like that?


Re: [RFC] Contributing to Git (on Windows)

2018-03-02 Thread Junio C Hamano
Jonathan Nieder  writes:

>> +When adding a helper, be sure to add a line to `t/Makefile` and to the 
>> `.gitignore` for the
>> +binary file you add. The Git community prefers functional tests using the 
>> full `git`
>> +executable, so be sure the test helper is required.
>
> Maybe s/low-level/unit/?

Yup.  

Also "be sure the test helper is required" at the end did not click
until I read it twice and realized it wanted to say that addition of
new test helper executable for function level unit testing is
discouraged and should be limited to cases where it is absolutely
necessary.

>> +Test Your Changes on Linux
>> +--
>> +
>> +It can be important to work directly on the [core Git 
>> codebase](https://github.com/git/git),
>> +such as a recent commit into the `master` or `next` branch that has not 
>> been incorporated
>> +into Git for Windows. Also, it can help to run functional and performance 
>> tests on your
>> +code in Linux before submitting patches to the Linux-focused mailing list.
>
> I'm surprised at this advice.  Does it actually come up?  When I was
> on Mac I never worried about this, nor when I was on Windows.
>
> I'm personally happy to review patches that haven't been tested on
> Linux, though it's of course even nicer if the patch author mentions
> what platforms they've tested on.

s/Linux/other platforms/, perhaps?  In fact, portability issues in a
patch originally written for one platform are rather quickly discovered
if the original platform is more minor than the others, so while it
is good advice to test your wares before you make them public, singling
out the portability issues may not add much value.  The fewer obvious
bugs remain, the fewer iterations it takes for a series to get into
good shape.

>> +When preparing your patch, it is important to put yourself in the shoes of 
>> the maintainer.
>
> ... and in the shoes of other users and developers working with Git down
> the line who will interact with the patch later!
>
> Actually, the maintainer is one of the least important audiences for a
> commit message.  But maybe 'the maintainer' is a stand-in for 'the
> project' here?

I tend to agree.  The reviewers all comment on the proposed log
messages and code changes to help make the patch understandable by
future readers (i.e. the most important audiences).  The maintainer
and senior reviewers may happen to be better at putting themselves in
the shoes of future readers to spot potential problems than other
reviewers are, simply because they have been working on the project
longer than others.  But we should stress that the patch should be
written for future readers of the code and history.

> [...]
>> +* Make sure the commits are signed off using `git commit (-s|--signoff)`. 
>> This means that you
>> +  testify to be allowed to contribute your changes, in particular that if 
>> you did this on company
>> +  time, said company approved to releasing your work as Open Source under 
>> the GNU Public License v2.
>
> Can this either point to or quote the relevant legal text from
> Documentation/SubmittingPatches?  It feels dangerous (in the sense of,
> potentially leading to misunderstandings and major future pain) to ask
> people to sign off without them knowing exactly what that means.

When you can point at existing text in a legal context, it is safer
to do so rather than attempting to "rephrase" it yourself (unless
you are a lawyer, of course, in which case you can assess the best
course of action yourself).


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread SZEDER Gábor
On Thu, Mar 1, 2018 at 11:20 PM, Junio C Hamano  wrote:

> * sg/travis-build-during-script-phase (2018-01-08) 1 commit
>  - travis-ci: build Git during the 'script' phase
>
>  Stalled for too long without any response; will discard.

I still think this is a good change as-is and it does make checking the
results of Travis CI builds easier.  The issue Lars raised in [1] is in
my opinion a non-issue [2] in practice and Lars hasn't yet disagreed
with that.

Not sure what else I could say that's not already in the commit message
or in my response in [2].


[1] - 
https://public-inbox.org/git/5de3fa05-2347-4be7-8a1a-a6e5feec7...@gmail.com/

[2] - 
https://public-inbox.org/git/cam0vkjnszoc+e408ifkcg+qptagrnl3e3jvdrn573kdcbsz...@mail.gmail.com/


Re: [PATCH v4 4/9] t3701: don't hard code sha1 hash values

2018-03-02 Thread Junio C Hamano
SZEDER Gábor  writes:

>> +diff_cmp () {
>> +for x
>> +do
>> +sed  -e '/^index/s/[0-9a-f]*[1-9a-f][0-9a-f]*\.\./1234567../' \
>> + -e '/^index/s/\.\.[0-9a-f]*[1-9a-f][0-9a-f]*/..9abcdef/' \
>> + -e '/^index/s/ 00*\.\./ 000../' \
>> + -e '/^index/s/\.\.00*$/..000/' \
>> + -e '/^index/s/\.\.00* /..000 /' \
>> + "$x" >"$x.filtered"
>> +done
>> +test_cmp "$1.filtered" "$2.filtered"
>> +}
>
> t3701 is not the only test script comparing diffs, many other
> tests do so:
>
>   $ git grep 'diff --git' t/ |wc -l
>   835
>
> It's totally inaccurate, but a good ballpark figure.
>
> I think this function should be a widely available test helper
> function in 'test-lib-functions.sh', perhaps renamed to
> 'test_cmp_diff', so those other tests could use it as well.

I am on the fence.  

The above does not redact enough (e.g. rename/copy score should be
redacted while comparing) to serve as a generic filter.
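(To make that concrete, a generic helper would at least also need to
normalize the rename/copy scores; a hedged sketch, with "actual" as a
placeholder file name and the expressions purely illustrative:

	sed -e 's/^\(similarity index \)[0-9][0-9]*%/\1NN%/' \
	    -e 's/^\(dissimilarity index \)[0-9][0-9]*%/\1NN%/' \
	    actual >actual.filtered

and possibly more, depending on which diff features the callers
exercise.)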

We could certainly extend it when we make code in other tests use
the helper, but then we can do the moving to test-lib-functions when
that happens, too.

Those who will be writing new tests that need to compare two diff
outputs are either (1) people who are careful and try to find
existing tests that do the same, to locate the helper we already
have to be reused in theirs, or (2) people who don't and roll their
own.  The latter camp cannot be helped either way, but the former
will run "git grep 'diff --git' t/" and find the helper whether it
is in this script or in test-lib-functions, I would think.

So, I certainly do not mind a reroll to move it to a more generic
place, but I do not think I would terribly mind if we leave it in
its current place, later to be moved by the first new caller that
wants to use it from outside this script.


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-02 Thread Jacob Keller
On Fri, Mar 2, 2018 at 4:36 AM, Phillip Wood  wrote:
> On 02/03/18 11:17, Phillip Wood wrote:
>>
>> On 02/03/18 01:16, Igor Djordjevic wrote:
>>>
>>> Hi Sergey,
>>>
>>> On 01/03/2018 06:39, Sergey Organov wrote:

>> (3) ---X1---o---o---o---o---o---X2
>>|\   |\
>>| A1---A2---A3---U1  | A1'--A2'--A3'--U1'
>>| \  |
>>|  M |
>>| /  |
>>\-B1---B2---B3---U2  \-B1'--B2'--B3'--U2'
>>
>
> Meh, I hope I`m rushing it now, but for example, if we had decided to
> drop commit A2 during an interactive rebase (so losing A2' from
> diagram above), wouldn`t U2' still introduce those changes back, once
> U1' and U2' are merged, being incorrect/unwanted behavior...? :/
>
> [...]

 Yeah, I now see it myself. I'm sorry for being lazy and not inspecting
 this more carefully in the first place.
>>>
>>> No problem, that`s why we`re discussing it, and I`m glad we`re
>>> aligned now, so we can move forward :)
>>>
> So while your original proposal currently seems like it could be
> working nicely for non-interactive rebase (and might be some simpler
> interactive ones), now hitting/acknowledging its first real use
> limit, my additional quick attempt[1] just tries to aid pretty
> interesting case of complicated interactive rebase, too, where we
> might be able to do better as well, still using you original proposal
> as a base idea :)

 Yes, thank you for pushing me back to reality! :-) The work and thoughts
 you are putting into solving the puzzle are greatly appreciated!
>>>
>>> You`re welcome, and I am enjoying it :)
>>>
 Thinking about it overnight, I now suspect that original proposal had a
 mistake in the final merge step. I think that what you did is a way to
 fix it, and I want to try to figure what exactly was wrong in the
 original proposal and to find simpler way of doing it right.

 The likely solution is to use original UM as a merge-base for final
 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
 though, as that's exactly UM from which both U1' and U2' have diverged
 due to rebasing and other history editing.
>>>
>>> Yes, this might be it...! ;)
>>>
>>> To prove myself it works, I`ve assembled a pretty crazy `-s ours`
>>> merge interactive rebase scenario, and it seems this passes the test,
>>> ticking all the check boxes (I could think of) :P
>
> Hi Igor
>
>> It is interesting to think what it means to faithfully rebase a '-s
>> ours' merge.
> I should have explained that I mean is a faithful rebase one that
> adheres to the semantics of '-s ours' (i.e. ignores any changes in the
> side branch) or one that merges new changes from the side branch into
> the content of the original merge? In your example you add B4 to B. If
> M' preserves the semantics of '-s ours' then it will not contain the
> changes in B4. I think your result does (correct me if I'm wrong) so it
> is preserving the content of the original merge rather than the
> semantics of it.
>
> Best Wishes
>
> Phillip
>

I believe this was always the outline of this type of approach, as per
Sergey's original email.

We only have the content, and we don't know the semantics (nor, I
think, should we attempt to understand or figure out the semantics).

Regards,
Jake


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-02 Thread Jacob Keller
On Fri, Mar 2, 2018 at 3:17 AM, Phillip Wood  wrote:
>
> It is interesting to think what it means to faithfully rebase a '-s
> ours' merge. In your example the rebase does not introduce any new
> changes into branch B that it doesn't introduce to branch A. Had it
> added a fixup to branch B1 for example or if the topology was more
> complex so that B ended up with some other changes that the rebase did
> not introduce into A, then M' would contain those extra changes whereas
> '--recreate-merges' with '-s ours' (once it supports it) would not.
>

Unless the method of merging was stored, I don't think we *can*
correctly automate resolving of "-s ours" because all we store is the
resulting content, and we don't know how or why the user generated it
as such. I believe the "correct" solution in any case would be to take
the content we DO know and then ask the user to stop for amendments.

Thanks,
Jake


Re: [PATCH v4 4/9] t3701: don't hard code sha1 hash values

2018-03-02 Thread SZEDER Gábor

> Use a filter when comparing diffs to fix the value of non-zero hashes
> in diff index lines so we're not hard coding sha1 hash values in the
> expected output. This makes it easier to change the expected output if
> a test is edited as we don't need to worry about the exact hash value
> and means the tests will work when the hash algorithm is transitioned
> away from sha1.

> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 836ce346ed..f95714230b 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -10,6 +10,19 @@ then
>   test_done
>  fi
>  
> +diff_cmp () {
> + for x
> + do
> + sed  -e '/^index/s/[0-9a-f]*[1-9a-f][0-9a-f]*\.\./1234567../' \
> +  -e '/^index/s/\.\.[0-9a-f]*[1-9a-f][0-9a-f]*/..9abcdef/' \
> +  -e '/^index/s/ 00*\.\./ 000../' \
> +  -e '/^index/s/\.\.00*$/..000/' \
> +  -e '/^index/s/\.\.00* /..000 /' \
> +  "$x" >"$x.filtered"
> + done
> + test_cmp "$1.filtered" "$2.filtered"
> +}

t3701 is not the only test script comparing diffs, many other
tests do so:

  $ git grep 'diff --git' t/ |wc -l
  835

It's totally inaccurate, but a good ballpark figure.

I think this function should be a widely available test helper
function in 'test-lib-functions.sh', perhaps renamed to
'test_cmp_diff', so those other tests could use it as well.
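As a rough sketch of what callers could then write (the helper name is
the one suggested above; "expect" and "actual" are just placeholder
file names):

	test_expect_success 'diff output matches expectation' '
		git diff HEAD >actual &&
		test_cmp_diff expect actual
	'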



Re: [PATCH 00/11] Make the test suite pass with '-x' and /bin/sh

2018-03-02 Thread SZEDER Gábor
On Sat, Feb 24, 2018 at 12:39 AM, SZEDER Gábor  wrote:
> This patch series makes '-x' tracing of tests work reliably even when
> running them with /bin/sh, and setting TEST_SHELL_PATH=bash will be
> unnecessary.
>
>   make GIT_TEST_OPTS='-x --verbose-log' test
>
> passes on my setup and on all Travis CI build jobs (though neither me
> nor Travis CI run all tests, e.g. CVS).

I installed 'cvs' and whatnot to run t94* and t96* tests, and sure
enough, 5 tests in 2 test scripts fail with '-x' tracing and /bin/sh.
I think I will be able to get around to sending v2 during the weekend.
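(For anyone who wants to reproduce just those, something like the
following works; the glob patterns are the ones mentioned above, and
/bin/sh is used as the test shell, i.e. without setting TEST_SHELL_PATH:

	cd t &&
	for s in t94*.sh t96*.sh
	do
		./"$s" -x --verbose-log || break
	done
)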


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Derrick Stolee

On 3/1/2018 5:20 PM, Junio C Hamano wrote:

--
[Graduated to "master"]

* jt/binsearch-with-fanout (2018-02-15) 2 commits
   (merged to 'next' on 2018-02-15 at 7648891022)
  + packfile: refactor hash search with fanout table
  + packfile: remove GIT_DEBUG_LOOKUP log statements
  (this branch is used by ds/commit-graph.)

  Refactor the code to binary search starting from a fan-out table
  (which is how the packfile is indexed with object names) into a
  reusable helper.

[...]
--
[New Topics]

* jk/cached-commit-buffer (2018-02-22) 2 commits
   (merged to 'next' on 2018-02-27 at af791d9a1e)
  + revision: drop --show-all option
  + commit: drop uses of get_cached_commit_buffer()

  Code clean-up.

  Will merge to 'master'.


Thanks! These resolve the dependencies for ds/commit-graph


* ds/commit-graph (2018-02-20) 13 commits
  - commit-graph: build graph from starting commits
  - commit-graph: read only from specific pack-indexes
  - commit: integrate commit graph with commit parsing
  - commit-graph: close under reachability
  - commit-graph: add core.commitGraph setting
  - commit-graph: implement --delete-expired
  - commit-graph: implement --set-latest
  - commit-graph: implement git commit-graph read
  - commit-graph: implement 'git-commit-graph write'
  - commit-graph: implement write_commit_graph()
  - commit-graph: create git-commit-graph builtin
  - graph: add commit graph design document
  - commit-graph: add format document

  Precompute and store information necessary for ancestry traversal
  in a separate file to optimize graph walking.

  Reroll exists, but it appears that there will be a further reroll.
  cf. <1519698787-190494-1-git-send-email-dsto...@microsoft.com>


v5 reroll (mentioned above) has limited review, so I won't be rerolling 
soon. I'm particularly interested in review on [PATCH 04/13] csum-file: 
add CSUM_KEEP_OPEN flag [1].


Thanks,

-Stolee

[1] 
https://public-inbox.org/git/1519698787-190494-5-git-send-email-dsto...@microsoft.com/


Re: Reduce pack-objects memory footprint?

2018-03-02 Thread Duy Nguyen
On Fri, Mar 2, 2018 at 5:54 PM, Jeff King  wrote:
> On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:
>
>> On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen  wrote:
>> > linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
>> > consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
>> > all apps nearly unusable (granted the problem is partly Linux I/O
>> > scheduler too). So I wonder if we can reduce pack-objects memory
>> > footprint a bit.
>>
>> Next low hanging fruit item:
>>
>> struct revindex_entry {
>> off_t offset;
>> unsigned int nr;
>> };
>>
>> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
>> we break this struct apart and store two arrays of offset and nr in
>> struct packed_git, we save 4 bytes per struct, 26 MB total.
>>
>> It's getting low but every megabyte counts for me, and it does not
>> look like breaking this struct will make horrible code (we recreate
>> the struct at find_pack_revindex()) so I'm going to do this too unless
>> someone objects. There will be slight performance regression due to
>> cache effects, but hopefully it's ok.
>
> Maybe you will prove me wrong, but I don't think splitting them is going
> to work. The point of the revindex_entry is that we sort the (offset,nr)
> tuple as a unit.
>
> Or are you planning to sort it, and then copy the result into two
> separate arrays? I think that would work, but it sounds kind of nasty
> (arcane code, and extra CPU work for systems that don't care about the
> 26MB).


How about two level lookups? We have

struct revindex_entry_l2 {
uint32_t offset; /* the lowest 32 bits */
uint32_t nr;
};

struct revindex {
struct revindex_entry_l2 *level1[256]; /* 8 high bits */
};

This limits us to 1024GB pack files, which should give us some time
before we have to worry about it again, and most of the time we'll need
just one or two items in level1[], so cache effects are not that bad.
Preparing/sorting this could be a problem though.
-- 
Duy


Re: [RFC] Contributing to Git (on Windows)

2018-03-02 Thread Derrick Stolee

On 3/1/2018 11:44 PM, Jonathan Nieder wrote:

Hi,

Derrick Stolee wrote:


Now, we'd like to make that document publicly available. These steps are
focused on a Windows user, so we propose putting them in the
git-for-windows/git repo under CONTRIBUTING.md. I have a pull request open
for feedback [1]. I'll read comments on that PR or in this thread.

Thanks!  I wonder if we can put this in the standard Documentation/
directory as well.  E.g. the Windows CONTRIBUTING.md could say
"See Documentation/Contributing-On-Windows.md" so that the bulk of the
text would not have to be git-for-windows specific.


That's a good idea. After this review stabilizes, I'll send a patch to 
add the Windows-specific instructions as you recommend.


What precedent do we have for referencing forks like git-for-windows/git?



[...]

+++ b/CONTRIBUTING.md
@@ -0,0 +1,401 @@
+How to Contribute to Git for Windows
+

Would it make sense for this to be checked in with LF instead of CRLF
line endings, so that clones use the user's chosen / platform native
line ending?  The .gitattributes file could include

/CONTRIBUTING.md text

so that line ending conversion takes place even if the user hasn't
enabled core.autocrlf.


Oops! I turned off the CRLF munging in my repo because I had an issue 
with a script somewhere, but forgot to turn it back on again. Thanks for 
checking this.




[...]

+The SDK uses a different credential manager, so you may still want to use 
Visual Studio
+or normal Git Bash for interacting with your remotes.  Alternatively, use 
SSH rather
+than HTTPS and avoid credential manager problems.

Where can I read more about the problems in question?


I'll try to see if we can elaborate here. The Git for Windows client 
ships with Git Credential Manager for Windows [1] but the SDK does not. 
At the very least, credential interactions are not the same and you do 
not have access to the credentials stored in Windows Credential Manager.


[1] https://github.com/Microsoft/Git-Credential-Manager-for-Windows



[...]

+Many new contributors ask: What should I start working on?
+
+One way to win big with the maintainer of an open-source project is to look at 
the
+[issues page](https://github.com/git-for-windows/git/issues) and see if there 
are any issues that
+you can fix quickly, or if anything catches your eye.

You can also look at https://crbug.com/git for non-
Windows-specific issues.  And you can look at recent user questions
on the mailing list: https://public-inbox.org/git

[...]

+If you are adding new functionality, you may need to create low-level tests by 
creating
+helper commands that test a very limited action. These commands are stored in 
`t/helpers`.
+When adding a helper, be sure to add a line to `t/Makefile` and to the 
`.gitignore` for the
+binary file you add. The Git community prefers functional tests using the full 
`git`
+executable, so be sure the test helper is required.

Maybe s/low-level/unit/?

[...]

+Read [`t/README`](https://github.com/git/git/blob/master/t/README) for more 
details.

Forgive my ignorance: does github flavored markdown allow relative
links?  (I.e., could this say [`t/README`](t/README)?)

[...]

+You can also set certain environment variables to help test the performance on 
different
+repositories or with more repetitions. The full list is available in
+[the `t/perf/README` 
file](https://github.com/git/git/blob/master/t/perf/README),

Likewise.

[...]

+Test Your Changes on Linux
+--
+
+It can be important to work directly on the [core Git 
codebase](https://github.com/git/git),
+such as a recent commit into the `master` or `next` branch that has not been 
incorporated
+into Git for Windows. Also, it can help to run functional and performance 
tests on your
+code in Linux before submitting patches to the Linux-focused mailing list.

I'm surprised at this advice.  Does it actually come up?  When I was
on Mac I never worried about this, nor when I was on Windows.

I'm personally happy to review patches that haven't been tested on
Linux, though it's of course even nicer if the patch author mentions
what platforms they've tested on.

Maybe this can be reframed to refer specifically to cases where you've
modified some Linux platform-specific code (e.g. to add a new feature
to run-command.c)?


I recently had a bug in my local copy of the commit-graph patch that was 
only caught by our Mac OSX automated builds: I was passing the 
edge-value for a parent into a lookup to get an octopus edge from the 
graph, but I forgot to drop the most-significant bit. This silently cast 
the "uint32_t" into an "int"; because we multiplied by 4, the Windows 
and Linux compilers somehow turned this into a left shift, while the Mac 
compiler did not, and the test failed there.


Now this is an example of something that probably would have been caught 
in review, but stuff gets through.


The Windows/Linux boundary is usually 

Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-02 Thread Phillip Wood
On 02/03/18 11:17, Phillip Wood wrote:
> 
> On 02/03/18 01:16, Igor Djordjevic wrote:
>>
>> Hi Sergey,
>>
>> On 01/03/2018 06:39, Sergey Organov wrote:
>>>
> (3) ---X1---o---o---o---o---o---X2
>|\   |\
>| A1---A2---A3---U1  | A1'--A2'--A3'--U1'
>| \  |
>|  M |
>| /  |
>\-B1---B2---B3---U2  \-B1'--B2'--B3'--U2'
>

 Meh, I hope I`m rushing it now, but for example, if we had decided to 
 drop commit A2 during an interactive rebase (so losing A2' from 
 diagram above), wouldn`t U2' still introduce those changes back, once 
 U1' and U2' are merged, being incorrect/unwanted behavior...? :/

 [...]
>>>
>>> Yeah, I now see it myself. I'm sorry for being lazy and not inspecting
>>> this more carefully in the first place.
>>
>> No problem, that`s why we`re discussing it, and I`m glad we`re 
>> aligned now, so we can move forward :)
>>
 So while your original proposal currently seems like it could be 
 working nicely for non-interactive rebase (and might be some simpler 
 interactive ones), now hitting/acknowledging its first real use 
 limit, my additional quick attempt[1] just tries to aid pretty 
 interesting case of complicated interactive rebase, too, where we 
 might be able to do better as well, still using you original proposal 
 as a base idea :)
>>>
>>> Yes, thank you for pushing me back to reality! :-) The work and thoughts
>>> you are putting into solving the puzzle are greatly appreciated!
>>
>> You`re welcome, and I am enjoying it :)
>>
>>> Thinking about it overnight, I now suspect that original proposal had a
>>> mistake in the final merge step. I think that what you did is a way to
>>> fix it, and I want to try to figure what exactly was wrong in the
>>> original proposal and to find simpler way of doing it right.
>>>
>>> The likely solution is to use original UM as a merge-base for final
>>> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
>>> though, as that's exactly UM from which both U1' and U2' have diverged
>>> due to rebasing and other history editing.
>>
>> Yes, this might be it...! ;)
>>
>> To prove myself it works, I`ve assembled a pretty crazy `-s ours`  
>> merge interactive rebase scenario, and it seems this passes the test, 
>> ticking all the check boxes (I could think of) :P

Hi Igor

> It is interesting to think what it means to faithfully rebase a '-s
> ours' merge.
I should have explained that I mean is a faithful rebase one that
adheres to the semantics of '-s ours' (i.e. ignores any changes in the
side branch) or one that merges new changes from the side branch into
the content of the original merge? In your example you add B4 to B. If
M' preserves the semantics of '-s ours' then it will not contain the
changes in B4. I think your result does (correct me if I'm wrong) so it
is preserving the content of the original merge rather than the
semantics of it.

Best Wishes

Phillip

(ignore the rest of what I wrote earlier I don't think it's correct)
 In your example the rebase does not introduce any new
> changes into branch B that it doesn't introduce to branch A. Had it
> added a fixup to branch B1 for example or if the topology was more
> complex so that B ended up with some other changes that the rebase did
> not introduce into A, then M' would contain those extra changes whereas
> '--recreate-merges' with '-s ours' (once it supports it) would not.
> 
>>
>> Let`s see our starting situation:
>>
>>  (0) ---X8--B2'--X9 (master)
>> |\
>> | A1---A2---A3 (A)
>> | \
>> |  M (topic)
>> | /
>> \-B1---B2---B3 (B)
>>
>>
>> Here, merge commit M is done with `-s ours` (obsoleting branch "B"), 
>> plus amended to make it an "evil merge", where a commit B2 from 
>> obsoleted branch "B" is cherry picked to "master".
>>
>> Now, we want to rebase "topic" (M) onto updated "master" (X9), but to 
>> make things more interesting, we`ll do it interactively, with some 
>> amendments, drops, additions and even more cherry-picks!
>>
>> This is what the final result looks like:
>>
>>  (1) ---X8--B2'--X9 (master)
>>  |\
>>  | A12--A2'---B3' (A)
>>  | \
>>  |  M' (topic)
>>  | /
>>  \-B1'--B3'---B4  (B)
>>
>>
>> During interactive rebase, on branch "A", we amended A1 into A12, 
>> dropped A3 and cherry-picked B3. On branch "B", B4 is added, B2' being 
>> omitted automatically as already present in "master".
>>
>> So... In comparison to original merge commit M, rebased merge commit 
>> M' is expected to:
>>
>>  - Add X9, from updated "master"
>>  - Have A1 changed to A12, due to A12 commit amendment
>>  - Keep A2, rebased as A2'
>>  - Remove A3, due to 


Re: Worktree silently deleted on git fetch/gc/log

2018-03-02 Thread Duy Nguyen
On Fri, Mar 2, 2018 at 6:34 PM, Дилян Палаузов
 wrote:
> Hello Duy,
>
> thanks for your answer.
>
> Your assumption is correct: when renaming the directory apparently I have
> not adjusted /git/A/.git/worktrees/B/gitdir to the new location.
>
> I fixed the situation by renaming /git/B to /git/B.bak, creating a new
> worktree pointing to /git/B (with branch g), then deleting /git/B and moving
> /git/B.bak to /git/B .

Phew.. I was worried there was some serious bug that I missed even
after the last code inspection.

The good news is "git worktree move" command is coming so you don't
have to do this manual (and risky) operation.
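For illustration, once that lands the relocation becomes a single,
supported step (paths taken from this thread):

	git worktree move /somewhere/B /git/B

and git updates the .git file and worktrees/B/gitdir links itself.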

I'm going to improve this case a bit either way; I think I have an
idea (mostly to Eric): since worktree B is alive and kicking, it
should keep at least its HEAD or index updated fairly often. We can
delay deleting a worktree until both of these are past the expire time.
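As a hedged aside for anyone bitten by this in the meantime, the expiry
knob referred to here can already be adjusted or effectively disabled:

	# keep stale-looking worktree metadata around indefinitely
	git config gc.worktreePruneExpire never

	# check what a manual prune would remove before letting it run
	git worktree prune --dry-run --verbose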

Also Eric (and this is getting off topic), I wonder if the decision to
store absolute path in worktrees/../gitdir and .git files is a bad one
(and is my bad).

If we stored a relative path in the ".git" file at least, the worktree
would immediately fail after the user moves either the linked checkout
or the main one. That should send a loud and clear signal to the user
that "something has gone horribly wrong" and hopefully make them
connect it to the last rename and undo it. "git gc" would then have
near zero chance to kick in and destroy stale worktrees.
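To make the above concrete, in a healthy setup the two link files look
something like this (paths are the ones from this thread):

	$ cat /git/B/.git
	gitdir: /git/A/.git/worktrees/B

	$ cat /git/A/.git/worktrees/B/gitdir
	/git/B/.git

With a relative "gitdir: ../A/.git/worktrees/B" the first link would
stop resolving as soon as either side moved, which is exactly the loud,
early failure described above.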

Another bad thing about absolute paths: if you back up both the main
and linked worktrees in a tar file, restoring from the backup won't
work unless you restore to the exact same place.

Hmm?

> Consider adjusting the documentation as suggested below, possibly using a
> different wording.

Will do.

>
> Regards
>   Дилян
>
> On 03/02/18 00:56, Duy Nguyen wrote:
>>
>> On Fri, Mar 2, 2018 at 3:14 AM, Eric Sunshine 
>> wrote:
>>>
>>> As far as I know, there isn't any code in Git which would
>>> automatically remove the .git/worktrees/B directory, so it's not clear
>>> how that happened.
>>
>>
>> "git worktree prune" does, which is called by "git gc" and that was
>> actually executed from the command output in the original mail.
>>
>> Дилян Палаузов did you move /git/B away manually? For example, you
>> originally created "B" with
>>
>> git worktree add /somewhere/B g
>>
>> then you do this at some point
>>
>> mv /somewhere/B /git/B
>>
>
> Signed-off-by: Дилян Палаузов
> ---
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 571b5a7..33655c4 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -15,8 +15,8 @@ DESCRIPTION
>  -----------
>  Runs a number of housekeeping tasks within the current repository,
>  such as compressing file revisions (to reduce disk space and increase
> -performance) and removing unreachable objects which may have been
> -created from prior invocations of 'git add'.
> +performance), removing unreachable objects, which may have been
> +created from prior invocations of 'git add', and calling 'git worktree prune'.
>   Users are encouraged to run this task on a regular basis within
>  each repository to maintain good disk space utilization and good
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index 5ac3f68..4b0d32f 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -82,7 +82,7 @@ with `--reason`.
>   prune::
>  -Prune working tree information in $GIT_DIR/worktrees.
> +Prune working tree information in $GIT_DIR/worktrees.  Called implicitly by 'git gc'.
>   unlock::
>  --
> 2.10.0.298.gc59cecb



-- 
Duy


Re: Worktree silently deleted on git fetch/gc/log

2018-03-02 Thread Eric Sunshine
On Fri, Mar 2, 2018 at 6:34 AM, Дилян Палаузов
 wrote:
> Your assumption is correct: when renaming the directory apparently I have
> not adjusted /git/A/.git/worktrees/B/gitdir to the new location.
>
> I fixed the situation by renaming /git/B to /git/B.bak, creating a new
> worktree pointing to /git/B (with branch g), then deleting /git/B and moving
> /git/B.bak to /git/B .

Duy's patch series to add a "git worktree move" command[1] should help
avoid such problems in the future.

[1]: https://public-inbox.org/git/20180124095357.19645-1-pclo...@gmail.com/


Re: Worktree silently deleted on git fetch/gc/log

2018-03-02 Thread Дилян Палаузов

Hello Duy,

thanks for your answer.

Your assumption is correct: when renaming the directory apparently I have not 
adjusted /git/A/.git/worktrees/B/gitdir to the new location.

I fixed the situation by renaming /git/B to /git/B.bak, creating a new worktree 
pointing to /git/B (with branch g), then deleting /git/B and moving /git/B.bak 
to /git/B .

Consider adjusting the documentation as suggested below, possibly using a 
different wording.

Regards
  Дилян

On 03/02/18 00:56, Duy Nguyen wrote:

On Fri, Mar 2, 2018 at 3:14 AM, Eric Sunshine  wrote:

As far as I know, there isn't any code in Git which would
automatically remove the .git/worktrees/B directory, so it's not clear
how that happened.


"git worktree prune" does, which is called by "git gc" and that was
actually executed from the command output in the original mail.

Дилян Палаузов did you move /git/B away manually? For example, you
originally created "B" with

git worktree add /somewhere/B g

then you do this at some point

mv /somewhere/B /git/B



Signed-off-by: Дилян Палаузов
---
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 571b5a7..33655c4 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -15,8 +15,8 @@ DESCRIPTION
 -----------
 Runs a number of housekeeping tasks within the current repository,
 such as compressing file revisions (to reduce disk space and increase
-performance) and removing unreachable objects which may have been
-created from prior invocations of 'git add'.
+performance), removing unreachable objects, which may have been
+created from prior invocations of 'git add', and calling 'git worktree prune'.
 
 Users are encouraged to run this task on a regular basis within
 each repository to maintain good disk space utilization and good
diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index 5ac3f68..4b0d32f 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -82,7 +82,7 @@ with `--reason`.
 
 prune::
 
-Prune working tree information in $GIT_DIR/worktrees.
+Prune working tree information in $GIT_DIR/worktrees.  Called implicitly by 'git gc'.
 
 unlock::
 
--
2.10.0.298.gc59cecb


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-02 Thread Phillip Wood
On 01/03/18 05:39, Sergey Organov wrote:
> 
> Hi Igor,
> 
> Igor Djordjevic  writes:
> 
>> Hi Sergey,
>>
>> On 28/02/2018 06:19, Sergey Organov wrote:
>>>
> (3) ---X1---o---o---o---o---o---X2
>|\   |\
>| A1---A2---A3---U1  | A1'--A2'--A3'--U1'
>| \  |
>|  M |
>| /  |
>\-B1---B2---B3---U2  \-B1'--B2'--B3'--U2'
>

 Meh, I hope I`m rushing it now, but for example, if we had decided to 
 drop commit A2 during an interactive rebase (so losing A2' from 
 diagram above), wouldn`t U2' still introduce those changes back, once 
 U1' and U2' are merged, being incorrect/unwanted behavior...? :/
>>>
>>> I think the method will handle this nicely.
>>
>> That`s what I thought as well. And then I made a test. And then the 
>> test broke... Now, might be the test was flawed in the first place, 
>> but thinking about it a bit more, it does seem to make sense not to 
>> handle this case nicely :/
> 
> Yeah, I now see it myself. I'm sorry for being lazy and not inspecting
> this more carefully in the first place.
> 
> [...]
> 
>> So while your original proposal currently seems like it could be 
>> working nicely for non-interactive rebase (and might be some simpler 
>> interactive ones), now hitting/acknowledging its first real use 
>> limit, my additional quick attempt[1] just tries to aid pretty 
>> interesting case of complicated interactive rebase, too, where we 
>> might be able to do better as well, still using you original proposal 
>> as a base idea :)
> 
> Yes, thank you for pushing me back to reality! :-) The work and thoughts
> you are putting into solving the puzzle are greatly appreciated!
> 
> Thinking about it overnight, I now suspect that original proposal had a
> mistake in the final merge step. I think that what you did is a way to
> fix it, and I want to try to figure what exactly was wrong in the
> original proposal and to find simpler way of doing it right.
> 
> The likely solution is to use original UM as a merge-base for final
> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
> though, as that's exactly UM from which both U1' and U2' have diverged
> due to rebasing and other history editing.

Hi Sergey, I've been following this discussion from the sidelines,
though I haven't had time to study all the posts in this thread in
detail. I wonder if it would be helpful to think of rebasing a merge as
merging the changes in the parents due to the rebase back into the
original merge. So for a merge M with parents A B C that are rebased to
A' B' C' the rebased merge M' would be constructed by (ignoring shell
quoting issues)

git checkout --detach M
git merge-recursive A -- M A'
tree=$(git write-tree)
git merge-recursive B -- $tree B'
tree=$(git write-tree)
git merge-recursive C -- $tree C'
tree=$(git write-tree)
M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')

This should pull in all the changes from the parents while preserving
any evil conflict resolution in the original merge. It superficially
reminds me of incremental merging [1] but it's so long since I looked at
that I'm not sure if there are any significant similarities.

Best Wishes

Phillip

[1] https://github.com/mhagger/git-imerge



Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-02 Thread Phillip Wood

On 02/03/18 01:16, Igor Djordjevic wrote:
> 
> Hi Sergey,
> 
> On 01/03/2018 06:39, Sergey Organov wrote:
>>
 (3) ---X1---o---o---o---o---o---X2
|\   |\
| A1---A2---A3---U1  | A1'--A2'--A3'--U1'
| \  |
|  M |
| /  |
\-B1---B2---B3---U2  \-B1'--B2'--B3'--U2'

>>>
>>> Meh, I hope I`m rushing it now, but for example, if we had decided to 
>>> drop commit A2 during an interactive rebase (so losing A2' from 
>>> diagram above), wouldn`t U2' still introduce those changes back, once 
>>> U1' and U2' are merged, being incorrect/unwanted behavior...? :/
>>>
>>> [...]
>>
>> Yeah, I now see it myself. I'm sorry for being lazy and not inspecting
>> this more carefully in the first place.
> 
> No problem, that`s why we`re discussing it, and I`m glad we`re 
> aligned now, so we can move forward :)
> 
>>> So while your original proposal currently seems like it could be 
>>> working nicely for non-interactive rebase (and might be some simpler 
>>> interactive ones), now hitting/acknowledging its first real use 
>>> limit, my additional quick attempt[1] just tries to aid pretty 
>>> interesting case of complicated interactive rebase, too, where we 
>>> might be able to do better as well, still using you original proposal 
>>> as a base idea :)
>>
>> Yes, thank you for pushing me back to reality! :-) The work and thoughts
>> you are putting into solving the puzzle are greatly appreciated!
> 
> You`re welcome, and I am enjoying it :)
> 
>> Thinking about it overnight, I now suspect that original proposal had a
>> mistake in the final merge step. I think that what you did is a way to
>> fix it, and I want to try to figure what exactly was wrong in the
>> original proposal and to find simpler way of doing it right.
>>
>> The likely solution is to use original UM as a merge-base for final
>> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
>> though, as that's exactly UM from which both U1' and U2' have diverged
>> due to rebasing and other history editing.
> 
> Yes, this might be it...! ;)
> 
> To prove myself it works, I`ve assembled a pretty crazy `-s ours`  
> merge interactive rebase scenario, and it seems this passes the test, 
> ticking all the check boxes (I could think of) :P

It is interesting to think what it means to faithfully rebase a '-s
ours' merge. In your example the rebase does not introduce any new
changes into branch B that it doesn't introduce to branch A. Had it
added a fixup to branch B1 for example or if the topology was more
complex so that B ended up with some other changes that the rebase did
not introduce into A, then M' would contain those extra changes whereas
'--recreate-merges' with '-s ours' (once it supports it) would not.

> 
> Let`s see our starting situation:
> 
>  (0) ---X8--B2'--X9 (master)
> |\
> | A1---A2---A3 (A)
> | \
> |  M (topic)
> | /
> \-B1---B2---B3 (B)
> 
> 
> Here, merge commit M is done with `-s ours` (obsoleting branch "B"), 
> plus amended to make it an "evil merge", where a commit B2 from 
> obsoleted branch "B" is cherry picked to "master".
> 
> Now, we want to rebase "topic" (M) onto updated "master" (X9), but to 
> make things more interesting, we`ll do it interactively, with some 
> amendments, drops, additions and even more cherry-picks!
> 
> This is what the final result looks like:
> 
>  (1) ---X8--B2'--X9 (master)
>  |\
>  | A12--A2'---B3' (A)
>  | \
>  |  M' (topic)
>  | /
>  \-B1'--B3'---B4  (B)
> 
> 
> During interactive rebase, on branch "A", we amended A1 into A12, 
> dropped A3 and cherry-picked B3. On branch "B", B4 is added, B2' being 
> omitted automatically as already present in "master".
> 
> So... In comparison to original merge commit M, rebased merge commit 
> M' is expected to:
> 
>  - Add X9, from updated "master"
>  - Have A1 changed to A12, due to A12 commit amendment
>  - Keep A2, rebased as A2'
>  - Remove A3, due to dropped A3 commit
>  - Keep amendment from original (evil) merge commit M
>  - Miss B1' like M does B, due to original `-s ours` merge strategy
>  - Add B2, cherry-picked as B2' into "master"
>  - Add B3, cherry-picked as B3' into "A"
>  - Add B4, added to "B"
>  - Most important, provide safety mechanism to "fail loud", being 
>aware of non-trivial things going on, allowing to stop for user 
>inspection/decision
> 
> 
> There, I hope I didn`t miss any expectation. And, it _seems_ to work 
> exactly as expected :D
> 
> Not to leave this to imagination only, and hopefully helping others 
> to get to speed and possibly discuss this, pointing to still possible 
> flaws, I`m adding a demo script[1], showing how this exact 

Re: [PATCH 00/11] Reduce pack-objects memory footprint

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 07:14:01AM +0700, Duy Nguyen wrote:

> > We have a big repo, and this gets repacked on 6-8GB of memory on dev
> > KVMs, so we're under a fair bit of memory pressure. git-gc slows things
> > down a lot.
> >
> > It would be really nice to have something that made it use drastically
> > less memory at the cost of less efficient packs. Is the property that
> 
> Ahh.. less efficient. You may be more interested in [1] then. It
> avoids rewriting the base pack. Without the base pack, book keeping
> becomes much much cheaper.
> 
> We still read every single byte in all packs though (I think, unless
> you use pack-bitmap) and this amount of I/O affect the rest of the
> system too. Perhaps reducing core.packedgitwindowsize might make it
> friendlier to the OS, I don't know.

Yes, the ".keep" thing is actually quite expensive. We still do a
complete rev-list to find all the objects we want, and then for each
object say "is this in a pack with .keep?". And worse, the mru doesn't
help there because even if we find it in the first pack, we have to keep
looking to see if it's in _another_ pack.

There are probably some low-hanging optimizations there (e.g., only
looking in the .keep packs if that's all we're looking for; we may even
do that already).

But I think fundamentally you'd do much better to generate the partial
list of objects outside of pack-objects entirely, and then just feed it
to pack-objects without using "--revs".
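A hedged sketch of that last suggestion (the revision range and output
path are placeholders; the point is only that pack-objects accepts a
pre-computed object list on stdin when "--revs" is not used):

	# enumerate only the objects you actually want packed ...
	git rev-list --objects v2.16.0..HEAD >objects.txt

	# ... and feed the list straight to pack-objects
	git pack-objects --delta-base-offset .git/objects/pack/pack <objects.txt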

-Peff


Re: Reduce pack-objects memory footprint?

2018-03-02 Thread Duy Nguyen
On Fri, Mar 2, 2018 at 5:54 PM, Jeff King  wrote:
> On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:
>
>> On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen  wrote:
>> > linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
>> > consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
>> > all apps nearly unusable (granted the problem is partly Linux I/O
>> > scheduler too). So I wonder if we can reduce pack-objects memory
>> > footprint a bit.
>>
>> Next low hanging fruit item:
>>
>> struct revindex_entry {
>> off_t offset;
>> unsigned int nr;
>> };
>>
>> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
>> we break this struct apart and store two arrays of offset and nr in
>> struct packed_git, we save 4 bytes per struct, 26 MB total.
>>
>> It's getting low but every megabyte counts for me, and it does not
>> look like breaking this struct will make horrible code (we recreate
>> the struct at find_pack_revindex()) so I'm going to do this too unless
>> someone objects. There will be slight performance regression due to
>> cache effects, but hopefully it's ok.
>
> Maybe you will prove me wrong, but I don't think splitting them is going
> to work. The point of the revindex_entry is that we sort the (offset,nr)
> tuple as a unit.
>
> Or are you planning to sort it, and then copy the result into two
> separate arrays?

Yep.

> I think that would work, but it sounds kind of nasty

Yeah :(

> (arcane code, and extra CPU work for systems that don't care about the
> 26MB).
-- 
Duy


Re: Reduce pack-objects memory footprint?

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:

> On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen  wrote:
> > linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
> > consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
> > all apps nearly unusable (granted the problem is partly Linux I/O
> > scheduler too). So I wonder if we can reduce pack-objects memory
> > footprint a bit.
> 
> Next low hanging fruit item:
> 
> struct revindex_entry {
> off_t offset;
> unsigned int nr;
> };
> 
> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
> we break this struct apart and store two arrays of offset and nr in
> struct packed_git, we save 4 bytes per struct, 26 MB total.
> 
> It's getting low but every megabyte counts for me, and it does not
> look like breaking this struct will make horrible code (we recreate
> the struct at find_pack_revindex()) so I'm going to do this too unless
> someone objects. There will be slight performance regression due to
> cache effects, but hopefully it's ok.

Maybe you will prove me wrong, but I don't think splitting them is going
to work. The point of the revindex_entry is that we sort the (offset,nr)
tuple as a unit.

Or are you planning to sort it, and then copy the result into two
separate arrays? I think that would work, but it sounds kind of nasty
(arcane code, and extra CPU work for systems that don't care about the
26MB).

-Peff


Re: [PATCH v4 8/9] add -p: fix counting when splitting and coalescing

2018-03-02 Thread Phillip Wood
On 01/03/18 20:29, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> @@ -887,8 +892,8 @@ sub merge_hunk {
>>  $o_cnt = $n_cnt = 0;
>>  for ($i = 1; $i < @{$prev->{TEXT}}; $i++) {
>>  my $line = $prev->{TEXT}[$i];
>> -if ($line =~ /^\+/) {
>> -$n_cnt++;
>> +if ($line =~ /^[+\\]/) {
>> +$n_cnt++ if ($line =~ /^\+/);
>>  push @line, $line;
>>  next;
>>  }
> 
> Hmm, the logic may be correct, but this looks like a result of
> attempting to minimize the number of changed lines and ending up
> with less-than-readable code.  "If the line begins with a plus or
> backslash, do these things, the first of which is done only when
> the line begins with a plus."  The same comment for the other hunk
> that counts the $this side.
> 
Right, I'll re-roll with a separate clause for the "\ No new line .." lines.


Re: [PATCH v4 9/9] add -p: don't rely on apply's '--recount' option

2018-03-02 Thread Phillip Wood
On 01/03/18 20:30, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> From: Phillip Wood 
>>
>> Now that add -p counts patches properly it should be possible to turn
>> off the '--recount' option when invoking 'git apply'
> 
> Sounds good ;-)
> 
Let's hope it works! Thanks for your comments, I'll send a re-roll of
this series at the beginning of next week

Best Wishes

Phillip


Re: [PATCH v2 0/5] roll back locks in various code paths

2018-03-02 Thread Jeff King
On Thu, Mar 01, 2018 at 09:40:20PM +0100, Martin Ågren wrote:

> After thinking about it, I tend to agree. That rewrite loses an
> indentation level and makes it a bit clearer that we have two steps,
> "maybe bail" and "write". But at the cost of duplicating logic -- after
> all, those two steps are very closely related, so there's no need to
> artificially separate them.
> 
> Here it is again, without that hunk, and without the commit message
> claim that it'd be a good thing to have just a few uses of
> "active_cache_changed" remaining.

Thanks, this version looks good to me. The name SKIP_IF_UNCHANGED is
generic and may result in clashes down the road. But then so is the name
COMMIT_LOCK. I'm OK to punt on that until we do see such a collision, at
which point we may want to provide a consistent namespace for these
flags.

-Peff


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Jeff King
On Thu, Mar 01, 2018 at 11:04:34PM -0800, Jonathan Nieder wrote:

> > Use of uninitialized value $_ in print at
> > /usr/lib/git-core/git-add--interactive line 1371,  line 75.
> [...]
> 
> Strange.  The relevant line, for reference:
> 
> $ dpkg-deb --fsys-tarfile git_2.11.0-3+deb9u2_amd64.deb |
>   tar Oxf - ./usr/lib/git-core/git-add--interactive |
>   sed -n '1370,1372 p'
> 
>   for (@{$hunk[$ix]{DISPLAY}}) {
>   print; < this one
>   }
> 
> This is a foreach loop, so it's supposed to have set $_ to each value
> in the list @{$hunk[$ix]{DISPLAY}).  So why is Perl considering it
> uninitialized?

Because the array is full of "undef". See parse_diff(), which does this:

  my @diff = run_cmd_pipe("git", @diff_cmd, "--", $path);
  ...
  @colored = run_cmd_pipe(@display_cmd);
  ...
  for (my $i = 0; $i < @diff; $i++) {
  ...
  push @{$hunk[-1]{TEXT}}, $diff[$i];
  push @{$hunk[-1]{DISPLAY}},
   (@colored ? $colored[$i] : $diff[$i]);
  }

If the @colored output is shorter than the normal @diff output, we'll
push a bunch of "undef" onto the DISPLAY array (the ternary there is
because sometimes @colored is completely empty if the user did not ask
for color).

-Peff


Re: Bug report: "Use of uninitialized value $_ in print"

2018-03-02 Thread Jeff King
On Fri, Mar 02, 2018 at 01:19:35AM +, Sam Kuper wrote:

> The bug is that in the midst of running
> 
> git -c interactive.diffFilter="git diff --word-diff --color" add --patch

That's not how interactive.diffFilter is supposed to work.

It's meant to have the output of an existing diff piped into it.
Generating the diff anew will appear to work for some simple cases, but
fall down for others:

  1. Any of the flavors besides "add -p" will be running the wrong diff
 (e.g., "reset -p" runs a diff between the index and HEAD).

  2. Any pathspec limiters would be ignored (e.g., "add -p file").

  3. Your invocation in particular is a problem because it uses
 --word-diff, which will not have a one-to-one line correspondence
 with the bare diff.

 add--interactive handles pretty-printing by running the diff
 command twice: once with no special options, and once with
 "--color" and piped through the diffFilter. It assumes that the two
 match each other line for line, so it shows you the "DISPLAY"
 variant, but then ultimately applies the "TEXT" variant.

And that last one is the cause of the errors you see:

> Use of uninitialized value $_ in print at
> /usr/lib/git-core/git-add--interactive line 1371,  line 74.

The "DISPLAY" run for your case generates fewer lines than the "TEXT"
run, and we complain on trying to show those extra lines.

Unfortunately, I don't think there's an easy way to do what you want
(show word-diffs but apply the full diff).

-Peff


Re: [PATCH v4 7/9] add -p: calculate offset delta for edited patches

2018-03-02 Thread Phillip Wood
On 01/03/18 20:14, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> From: Phillip Wood 
>>
>> Recount the number of preimage and postimage lines in a hunk after it
>> has been edited so any change in the number of insertions or deletions
>> can be used to adjust the offsets of subsequent hunks. If an edited
>> hunk is subsequently split then the offset correction will be lost. It
>> would be possible to fix this if it is a problem, however the code
>> here is still an improvement on the status quo for the common case
>> where an edited hunk is applied without being split.
>>
>> This is also a necessary step to removing '--recount' and
>> '--allow-overlap' from the invocation of 'git apply'. Before
>> '--recount' can be removed the splitting and coalescing counting needs
>> to be fixed to handle a missing newline at the end of a file. In order
>> to remove '--allow-overlap' there needs to be i) some way of verifying
>> the offset data in the edited hunk (probably by correlating the
>> preimage (or postimage if the patch is going to be applied in reverse)
>> lines of the edited and unedited versions to see if they are offset or
>> if any leading/trailing context lines have been removed) and ii) a way of
>> dealing with edited hunks that change context lines that are shared
>> with neighbouring hunks.
>>
>> Signed-off-by: Phillip Wood 
>> ---
> 
> Thanks for clear description of what is going on in the series.
> 
>> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
>> index 7a0a5896bb..0df0c2aa06 100755
>> --- a/git-add--interactive.perl
>> +++ b/git-add--interactive.perl
>> @@ -938,13 +938,19 @@ sub coalesce_overlapping_hunks {
>>  			parse_hunk_header($text->[0]);
>>  		unless ($_->{USE}) {
>>  			$ofs_delta += $o_cnt - $n_cnt;
>> +			# If this hunk has been edited then subtract
>> +			# the delta that is due to the edit.
>> +			$_->{OFS_DELTA} and $ofs_delta -= $_->{OFS_DELTA};
> 
> The pattern
> 
>   <condition> and <action>;
> 
> is something you are newly introducing to this script.  I am not
> sure if we want to see them.  I somehow find them harder to read
> than the more straight-forward and naïve
> 
>   if (<condition>) {
>   	<action>;
>   }
> 

Fair enough, I think I was suffering from brace fatigue when I wrote
it. If you can hold off merging this into 'next', I'll re-roll with
"if's" instead of "and's".
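
Concretely, for the first hunk I'm thinking of something like this
(sketch of the intended re-roll, untested):

	$ofs_delta += $o_cnt - $n_cnt;
	# If this hunk has been edited then subtract
	# the delta that is due to the edit.
	if ($_->{OFS_DELTA}) {
		$ofs_delta -= $_->{OFS_DELTA};
	}

and similarly for the '+=' adjustment in the second hunk.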

>> +		# If this hunk was edited then adjust the offset delta
>> +		# to reflect the edit.
>> +		$_->{OFS_DELTA} and $ofs_delta += $_->{OFS_DELTA};
> 
> Likewise.
> 
>> +sub recount_edited_hunk {
>> +	local $_;
>> +	my ($oldtext, $newtext) = @_;
>> +	my ($o_cnt, $n_cnt) = (0, 0);
>> +	for (@{$newtext}[1..$#{$newtext}]) {
>> +		my $mode = substr($_, 0, 1);
>> +		if ($mode eq '-') {
>> +			$o_cnt++;
>> +		} elsif ($mode eq '+') {
>> +			$n_cnt++;
>> +		} elsif ($mode eq ' ') {
>> +			$o_cnt++;
>> +			$n_cnt++;
>> +		}
>> +	}
>> +	my ($o_ofs, undef, $n_ofs, undef) =
>> +		parse_hunk_header($newtext->[0]);
>> +	$newtext->[0] = format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
>> +	my (undef, $orig_o_cnt, undef, $orig_n_cnt) =
>> +		parse_hunk_header($oldtext->[0]);
>> +	# Return the change in the number of lines inserted by this hunk
>> +	return $orig_o_cnt - $orig_n_cnt - $o_cnt + $n_cnt;
>> +}
> 
> OK.
> 
>> @@ -1114,25 +1144,32 @@ sub prompt_yesno {
>>  }
>>  
>>  sub edit_hunk_loop {
>> -	my ($head, $hunk, $ix) = @_;
>> -	my $text = $hunk->[$ix]->{TEXT};
>> +	my ($head, $hunks, $ix) = @_;
>> +	my $hunk = $hunks->[$ix];
>> +	my $text = $hunk->{TEXT};
>> ...
>> +		$newhunk->{OFS_DELTA} = recount_edited_hunk($text, $newtext);
>> +		# If this hunk has already been edited then add the
>> +		# offset delta of the previous edit to get the real
>> +		# delta from the original unedited hunk.
>> +		$hunk->{OFS_DELTA} and
>> +			$newhunk->{OFS_DELTA} += $hunk->{OFS_DELTA};
> 
> Ahh, good point.
> 



Re: Reduce pack-objects memory footprint?

2018-03-02 Thread Eric Wong
Duy Nguyen  wrote:
> struct revindex_entry {
> 	off_t offset;
> 	unsigned int nr;
> };
> 
> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
> we break this struct apart and store two arrays of offset and nr in
> struct packed_git, we save 4 bytes per struct, 26 MB total.

Can the offset array be a union which stores
int32_t/uint32_t instead of off_t for projects which never
exceed 2/4GB?

Likewise, places in object_entry where "unsigned long" and off_t
are 64-bit could benefit from being 32-bit.  Testing/maintenance
overhead could be bad for those, though.


Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Luke Diamand
On 1 March 2018 at 22:20, Junio C Hamano  wrote:
> Here are the topics that have been cooking.  Commits prefixed with
> '-' are only in 'pu' (proposed updates) while commits prefixed with
> '+' are in 'next'.  The ones marked with '.' do not appear in any of
> the integration branches, but I am still holding onto them.
>
> You can find the changes described here in the integration branches
> of the repositories listed at
>
> http://git-blame.blogspot.com/p/git-public-repositories.html
>
> --
> --
> [New Topics]
>
>
> * ld/p4-unshelve (2018-02-22) 1 commit
>  - git-p4: add unshelve command
>
>  "git p4" learned to "unshelve" shelved commit from P4.
>
>  Will merge to 'next'.

The unshelve change should be left off next for now.

The problem with it is that it can't easily find a sensible consistent
point prior to the shelved changelist to generate the diff from (P4
has no concept of a tree revision). So you can end up "unshelving" and
picking up not only the shelved changelist, but also a bunch of
intervening changes (or the effect of some missing changelists). That
can be quite surprising.

This is actually pretty close to the behaviour of P4 unshelve itself,
which does somewhat the same thing. From the p4 manual page:

https://www.perforce.com/perforce/doc.current/manuals/cmdref/Content/CmdRef/p4_unshelve.html

   " Unshelving copies the shelved files into the user’s workspace as
they existed when they were shelved. (For example, a file open for
edit when shelved will also be open for edit in the unshelving user’s
workspace.)"

There's a better change which I posted that adds a "git p4
format-change" command using the diffs from Perforce. I think
that has a better chance of working properly. I had some review
comments which I need to address, after which it could be a
candidate for 'next'.

It _might_, though, be possible to resurrect the unshelve code by
doing something like extracting the previous versions of the files
(which is kind of doable) and then constructing a temporary branch in
git to do the comparison against. It sounds pretty nasty, though.

Thanks
Luke


Re: Reduce pack-objects memory footprint?

2018-03-02 Thread Duy Nguyen
On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen  wrote:
> linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
> consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
> all apps nearly unusable (granted the problem is partly Linux I/O
> scheduler too). So I wonder if we can reduce pack-objects memory
> footprint a bit.

Next low hanging fruit item:

struct revindex_entry {
	off_t offset;
	unsigned int nr;
};

We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
we break this struct apart and store two arrays of offset and nr in
struct packed_git, we save 4 bytes per struct, 26 MB total.

It's getting low but every megabyte counts for me, and it does not
look like breaking this struct will make horrible code (we recreate
the struct at find_pack_revindex()) so I'm going to do this too unless
someone objects. There will be slight performance regression due to
cache effects, but hopefully it's ok.
-- 
Duy


Re: [PATCH] git-p4: Fix depot path stripping

2018-03-02 Thread Luke Diamand
On 27 February 2018 at 19:00, Nuno Subtil  wrote:
> I originally thought this had been designed such that the p4 client spec
> determines the paths, which would make sense. However, git-p4 still ignores
> the local path mappings in the client spec when syncing p4 depot paths
> w/--use-client-spec. Effectively, it looks as though --use-client-spec
> always implies --keep-paths, which didn't seem to me like what was intended.
>
> For the use case I'm dealing with (syncing a p4 path but ignoring some
> directories inside it), there seems to be no way today to generate a git
> tree rooted at the p4 path location instead of the root of the depot, which
> looks like a bug to me. I don't really understand the overall design well
> enough to tell whether the bug lies in stripRepoPath or somewhere else, to
> be honest, but that's where I saw the inconsistency manifest itself.

I replied but managed to drop git-users off the thread. So trying again!

The behavior is a touch surprising, but I _think_ it's correct.

With --use-client-spec enabled, paths in the git repo get mapped as
if you had used the file mapping in the client spec, using "p4 where".

So, for example, if you have a client spec which looks like:
   //depot/... //my_client_spec/...

then you're going to get the full repo structure, even if you only
clone a subdirectory.

E.g., if you clone //depot/a/b/c, then with "--use-client-spec"
enabled you will get a/b/c in your git repo, and with it not enabled
you will just get c/.

If instead your client spec looks like:
  //depot/a/b/... //my_client_spec/...

then you should only get c/d. (And a quick test confirms that).

I think Nuno's original question comes from wanting to map some files
into place in a way that the clientspec mapping emulation in git-p4
was possibly not handling. If we can get a test case for that (or an
example) then we can see whether it's just that the client mapping
code that Pete put in is insufficient, or if there's some other way
around it. Or if indeed I'm wrong, and there's a bug! There may be
some weird corner cases where it might do the wrong thing.

Thanks,
Luke


>
> Nuno
>
>
> On Tue, Feb 27, 2018 at 3:12 AM, Luke Diamand  wrote:
>>
>> On 27 February 2018 at 08:43, Nuno Subtil  wrote:
>> > The issue is that passing in --use-client-spec will always cause git-p4
>> > to
>> > preserve the full depot path in the working tree that it creates, even
>> > when
>> > --keep-path is not used.
>> >
>> > The repro case is fairly simple: cloning a given p4 path that is nested
>> > 2 or
>> > more levels below the depot root will have different results with and
>> > without --use-client-spec (even when the client spec is just a
>> > straightforward map of the entire depot).
>> >
>> > E.g., 'git p4 clone //p4depot/path/to/some/project/...' will create a
>> > working tree rooted at project. However, 'git p4 clone --use-client-spec
>> > //p4depot/path/to/some/project/...' will create a working tree rooted at
>> > the
>> > root of the depot.
>>
>> I think it _might_ be by design.
>>
>> At least, the test ./t9809-git-p4-client-view.sh seems to fail for me
>> with your change, although I haven't investigated what's going on:
>>
>> $ ./t9809-git-p4-client-view.sh
>> ...
>> ...
>> Doing initial import of //depot/dir1/ from revision #head into
>> refs/remotes/p4/master
>> ./t9809-git-p4-client-view.sh: 10: eval: cannot create dir1/file13:
>> Directory nonexistent
>> not ok 23 - subdir clone, submit add
>>
>> I think originally the logic came from this change:
>>
>>21ef5df43 git p4: make branch detection work with --use-client-spec
>>
>> which was fixing what seems like the same problem but with branch
>> detection enabled.
>>
>>
>> >
>> > Thanks,
>> > Nuno
>> >
>> >
>> > On Tue, Feb 27, 2018 at 12:10 AM, Luke Diamand  wrote:
>> >>
>> >> On 27 February 2018 at 04:16, Nuno Subtil  wrote:
>> >> > When useClientSpec=true, stripping of P4 depot paths doesn't happen
>> >> > correctly on sync. Modifies stripRepoPath to handle this case better.
>> >>
>> >> Can you give an example of how this shows up? I could quickly write a
>> >> test case for this if I knew what was going on.
>> >>
>> >> Thanks
>> >> Luke
>> >>
>> >>
>> >> >
>> >> > Signed-off-by: Nuno Subtil 
>> >> > ---
>> >> >  git-p4.py | 12 +---
>> >> >  1 file changed, 9 insertions(+), 3 deletions(-)
>> >> >
>> >> > diff --git a/git-p4.py b/git-p4.py
>> >> > index 7bb9cadc69738..3df95d0fd1d83 100755
>> >> > --- a/git-p4.py
>> >> > +++ b/git-p4.py
>> >> > @@ -2480,7 +2480,7 @@ def stripRepoPath(self, path, prefixes):
>> >> >                      if path.startswith(b + "/"):
>> >> >                          path = path[len(b)+1:]
>> >> >
>> >> > -        elif self.keepRepoPath:
>> >> > +        if self.keepRepoPath:
>> >> >              # Preserve everything in relative path name except leading
>> >> >              # //depot/; just look 

[PATCH v2] git-gui: bind CTRL/CMD+numpad ENTER to do_commit

2018-03-02 Thread Birger Skogeng Pedersen
CTRL/CMD+ENTER is bound to do_commit, but this did not apply to the
numpad ENTER key. To make CTRL/CMD+ENTER and CTRL/CMD+(numpad ENTER)
yield the same behaviour, CTRL/CMD+(numpad ENTER) has also been bound
to do_commit.

Signed-off-by: Birger Skogeng Pedersen 
---
 git-gui/git-gui.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
index 91c00e648..6de74ce63 100755
--- a/git-gui/git-gui.sh
+++ b/git-gui/git-gui.sh
@@ -3867,6 +3867,7 @@ bind .   <$M1B-Key-equal> {show_more_context;break}
 bind .   <$M1B-Key-plus> {show_more_context;break}
 bind .   <$M1B-Key-KP_Add> {show_more_context;break}
 bind .   <$M1B-Key-Return> do_commit
+bind .   <$M1B-Key-KP_Enter> do_commit
 foreach i [list $ui_index $ui_workdir] {
 	bind $i <Button-1>   { toggle_or_diff click %W %x %y; break }
 	bind $i <$M1B-Button-1>  { add_one_to_selection %W %x %y; break }
-- 
2.16.2.270.gdc6133cf5



Re: [PATCH] git-gui: bind CTRL/CMD+numpad ENTER to do_commit

2018-03-02 Thread Birger Skogeng Pedersen
Sorry about that. Version 2 coming right up.

On Thu, Mar 1, 2018 at 7:31 PM, Eric Sunshine  wrote:
> On Thu, Mar 1, 2018 at 9:39 AM, Birger Skogeng Pedersen
>  wrote:
>> ---
>
> Please sign-off on your patch. See Documentation/SubmittingPatches.
>
> Also, it would be helpful to write at least a short commit message
> justifying the change. The reason you gave in your lead-in email[1]
> might be sufficient:
>
> In git-gui, we can hit CTRL/CMD+ENTER to create a commit. However,
> using the numpad ENTER does not invoke the same command.
>
> (assuming people don't argue that numpad ENTER should be saved for
> some other function).
>
> Thanks.
>
> [1]: 
> https://public-inbox.org/git/CAGr--=lxmtz5rrp4742u3vsradrsware2sitcsowatyson2...@mail.gmail.com/
>
>> diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
>> index 91c00e648..6de74ce63 100755
>> --- a/git-gui/git-gui.sh
>> +++ b/git-gui/git-gui.sh
>> @@ -3867,6 +3867,7 @@ bind .   <$M1B-Key-equal> {show_more_context;break}
>>  bind .   <$M1B-Key-plus> {show_more_context;break}
>>  bind .   <$M1B-Key-KP_Add> {show_more_context;break}
>>  bind .   <$M1B-Key-Return> do_commit
>> +bind .   <$M1B-Key-KP_Enter> do_commit
>>  foreach i [list $ui_index $ui_workdir] {
>>  	bind $i <Button-1>   { toggle_or_diff click %W %x %y; break }
>>  	bind $i <$M1B-Button-1>  { add_one_to_selection %W %x %y; break }


From: Joy Kone

2018-03-02 Thread joy kone
From: Joy Kone

I have sent you this email because I need to discuss something with
you. I do not want you to misunderstand this offer in any respect...
if it is acceptable to you, I ask for your full cooperation. I have
contacted you to manage an investment in your country/company on my
behalf as a potential partner.

My name is Joy Konei, a citizen but residing here. It may interest
you to know that I have deposited US $10,500,000.00 with a financial
institution, to be invested in your country/company. It is pertinent
that you let me know whether you can handle this fund/investment in
your country, so that I can provide you with all the necessary
information about the financial institution for further details.
Meanwhile, I am very honest in my dealings with people and I ask the
same of you as a partner. Can I entrust this fund to you?

I would like to point out to you that this is a joint venture, as
there is a reward for your assistance. I will let you know your
benefit for your assistance as we proceed. For more comprehensive
details and the source of the fund, please contact me as soon as
possible. If you find this letter offensive, please ignore it and
accept my apology.

Regards,
Joy Kone