git@vger.kernel.org Quality Petroleum Products

2018-08-24 Thread PROMISKI VOLGr
Dear Sir/Madam,

We "VASUGANNEFT LLC" is an official registered 
Representatives/Mandate Company, on behalf of our End 
Seller/Refinery Company; offer oil products from the leading 
companies in the Russian Federation producing the highest quality 
of Petroleum products e.g.


*MAZUT M100 GOST -10585-99 RUSSIA ORIGIN: 
*MAZUT M100 GOST -10585-75 RUSSIA ORIGIN: 
*RUSSIAN D2 GAS OIL GOST 305-82: 
*RUSSIAN PETROLEUM COKE (PETCOKE):
*JET FUEL A1 91/91:
*AVIATION KEROSENE COLONIAL GRADE 54 JET FUEL RUSSIAN ORIGIN: 
*LIQUEFIED NATURAL GAS: 
*LIQUEFIED PETROLEUM GAS, 50% PROPANE & 50% BUTANE MIX:
*DIESEL D6 VIRGIN LOW POUR FUEL OIL:
*AUTOMOTIVE GAS OIL (AGO): 
*RUSSIAN EXPORT BLEND CRUDE GOST 9965-76: 
*GASOLINE 95 OCTANE:
*RUSSIAN BITUMEN OF ALL SPECIFICATIONS:
*DIESEL FUEL EN590:
*UREA GRANULAR AND PRILLED RUSSIAN ORIGIN:

 

Guaranteeing the high quality of our oil with full corporate and 
legal responsibility, under penalty of perjury we confirm/certify 
that our seller has the availability and capability to sell and 
supply the above-mentioned commodities according to the terms and 
conditions stipulated in our commercial procedure. For inquiries, 
please contact us at (vasugann...@rerms.eu). Our working procedures 
are below; please kindly review and revert to us for further 
proceedings. Thanks.


FOB (DIP/PAY) TRANSACTION 
PROCEDURE 1 :

1. Buyer company issues LOI/ICPO to purchase products, along with 
seller's procedure acceptance letter.

2. Seller issues CI for the available quantity in the storage tank 
to buyer; buyer signs and returns it to seller with the TSA signed 
by buyer and their appointed storage company.

3. Seller issues dip test authorization, which shall be endorsed 
by seller, buyer and their storage tank company.

4. Upon receipt of the signed dip test authorization, buyer's 
storage company issues a notice of readiness (NOR) to commence 
injection into buyer's rented tanks. Once injection is concluded, 
buyer commences a dip test of the injected product.

5. Buyer confirms the dip test result and makes 100% payment by 
MT103 TT wire transfer within 24 hours. After confirmation of 
buyer's payment, seller pays out the facilitators involved in the 
transaction.

6. Seller issues final agreement to buyer company to review the 
contract on R/E monthly deliveries.

7. Buyer reviews and approves the contract and issues an SBLC/IRDLC 
(irrevocable, non-transferable, auto-revolving for 12 months' 
shipment value) documentary letter of credit for the length of the 
contract and for each lifting as scheduled.

8. Buyer pays by MT103 wire transfer after a successful dip test 
on each monthly delivery.

9. Subsequent deliveries shall commence according to the terms 
and conditions of the contract, and seller pays commissions to 
seller-side and buyer-side intermediaries via MT103 for each 
monthly delivery.



FOB TRANSACTION PROCEDURE 2 :

1. Buyer company issues LOI/ICPO to purchase products, along with 
seller's procedure acceptance letter.

2. Buyer pays for ICPO registration to enable the refinery 
export department to approve and allocate product to buyer.

3. On receipt of buyer's payment by seller's attorney, seller issues 
CI for the available quantity in the storage tank to buyer; buyer 
signs and returns it to seller with the NCNDA/IMFPA signed by all 
groups, with commission structures.

4. Seller issues an unconditional DTA to buyer for an immediate dip 
test of the product in the seller's storage tank where the product 
is stored.

5. Buyer confirms the dip test result and makes 100% payment by 
MT103 TT wire transfer within 24 hours, deducting the amount paid 
for the registration of the ICPO. After confirmation of buyer's 
payment, seller pays out the facilitators involved in the 
transaction. Buyer will either extend use of the seller's storage 
tank or immediately inject the product into their own rented 
storage tanks.

6. Seller issues final agreement to buyer company to review the 
contract on R/E monthly deliveries.

7. Buyer reviews and approves the contract and issues an SBLC/IRDLC 
(irrevocable, non-transferable, auto-revolving for 12 months' 
shipment value) documentary letter of credit for the length of the 
contract and for each lifting as scheduled.

8. Buyer pays by MT103 wire transfer after a successful dip test 
on each monthly delivery.

9. Subsequent deliveries shall commence according to the terms 
and conditions of the contract, and seller pays commissions to 
seller-side and buyer-side intermediaries via MT103 for each 
monthly delivery.
 


CIF TRANSACTION PROCEDURE 1:

1. Buyer issues an irrevocable corporate purchase order (ICPO) with 
seller's procedure inserted in the ICPO.

2. Seller issues a sales and purchase agreement (SPA) to buyer; 
buyer countersigns and sends it back to seller.

3. Seller signs and seals the draft contract and sends it back to 
buyer.

4. Buyer legalizes and registers the transaction with the 
ministries, which will enable the shipping company to accept 
anchoring 

Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff

2018-08-24 Thread Jeff King
On Fri, Aug 24, 2018 at 06:55:24PM -0400, Eric Sunshine wrote:

> On Fri, Aug 24, 2018 at 10:47 AM Duy Nguyen  wrote:
> > On Thu, Aug 23, 2018 at 8:21 PM Eric Sunshine  
> > wrote:
> > > Peff wrote:
> > > > Yes, but then what's the next step for my script? I can't "remove" since
> > > > the worktree isn't there. I can't blow away any directory that I know
> > > > about, since there isn't one.
> > >
> > > I was thinking that "worktree add" could start respecting the --force
> > > option as an escape hatch.
> > >
> > > > What about refusing by default, but forcing an overwrite with "-f"?
> > >
> > > My thought, also.
> >
> > Sounds good. Eric are you going to implement this? Just checking so
> > that I can (hopefully) cross this off my backlog ;-)
> 
> It wasn't something I was planning on working on (at least not
> immediately) since it's still a bit fuzzy for me whether this is
> enough to help Peff's use-case (and because I have several other
> things in my queue, already).

I'm pretty sure it would just be a one-liner to "worktree add -f" in the
doc-diff script. So I think it does solve the problem.

> However, before even considering implementing it, there's at least one
> question (and possibly others) needing answering. For instance, how
> should "add --force" interact with a locked (not-present) worktree?
> Should it blast it despite the lock? Or would that need --force
> specified twice ("git worktree add -f -f foo")?

Yes, I think that should probably be two forces.

> As for the actual implementation, I haven't yet looked at how much
> surgery will be needed to make 'add' respect --force.

Me either. I may take a look this weekend. I got sucked into an asm and
coccinelle rabbit hole the last few days.

-Peff


Re: [PATCH/RFC] commit: new option to abort -a when something is already staged

2018-08-24 Thread Jacob Keller
On Fri, Aug 24, 2018 at 7:42 AM Duy Nguyen  wrote:
>
> On Fri, Aug 24, 2018 at 5:02 AM Jacob Keller  wrote:
> >
> > On Thu, Aug 23, 2018 at 9:28 AM Junio C Hamano  wrote:
> > > I think the above example forgets "-a" on the final "git commit"
> > > step.  With it added, I can understand the concern (and I am sure
> > > you would, too).
> > >
> > > The user is trying to add everything done in the working tree, and
> > > "commit -a" would catch all changes to paths that were already
> > > tracked, but a separate "add" is necessary for newly created paths.
> > > But adding a new path means the index no longer matches HEAD, and
> > > the "commit -a" at the final step sweeps the changes to already
> > > tracked paths---failing that because there "already is something
> > > staged" will break the workflow.
> >
> > Right. I think this would need to be able to understand the case of
> > "different only by new files".
>
> OK so the rules I'm going to try to implement is, if the  version in
> the index is not the same as one in HEAD or one in worktree, then "git
> commit -a" fails.
>
> The unwritten line is, if the path in question does not exist in HEAD,
> then the first condition "as one in HEAD" is dropped, naturally. This
> addresses the "git add new-file" problem, but if you have made more
> changes in new-file in worktree after "git add" and do "git commit -a"
> then you still get rejected because it fails the second condition.
>
> File removal should be considered as well. But I don't foresee any
> problem there. Resolving merges, replacing higher stage entries with
> stage 0 will be accepted at "git commit -a" as usual.
>
> Let me know if we should tweak these rules (and how).
> --
> Duy

This seems reasonable to me. It might trigger a few issues with people
doing "git add new_file ; $EDITOR new_file ; git commit -a" but... I
know I've accidentally done "git commit -a" and had problems.

What about "git commit "? That's another case where I've
accidentally lost some work for a similar reason.

Thanks,
Jake
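
Duy's proposed rules above can be sketched in a few lines. This is an
illustrative model only (not git's implementation; the function name and
string-based content comparison are assumptions for the sketch): "git
commit -a" proceeds only when the staged version of a path matches HEAD
or the worktree, with the HEAD comparison dropped for new files.

```python
# Sketch of the proposed "commit -a" staging check: reject when the
# staged version of a path matches neither HEAD nor the worktree.
def commit_a_allowed(head, index, worktree):
    """Return True if 'git commit -a' should proceed for one path.

    Each argument is the file content for that path, or None if the
    path does not exist there.
    """
    if index is None:
        return True  # nothing staged for this path
    if head is None:
        # New file: only the worktree comparison applies.
        return index == worktree
    return index == head or index == worktree
```

Under these rules, "git add new-file; git commit -a" is accepted, but
editing new-file again between the add and the commit is rejected, as
the thread describes.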


Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff

2018-08-24 Thread Eric Sunshine
On Fri, Aug 24, 2018 at 10:47 AM Duy Nguyen  wrote:
> On Thu, Aug 23, 2018 at 8:21 PM Eric Sunshine  wrote:
> > Peff wrote:
> > > Yes, but then what's the next step for my script? I can't "remove" since
> > > the worktree isn't there. I can't blow away any directory that I know
> > > about, since there isn't one.
> >
> > I was thinking that "worktree add" could start respecting the --force
> > option as an escape hatch.
> >
> > > What about refusing by default, but forcing an overwrite with "-f"?
> >
> > My thought, also.
>
> Sounds good. Eric are you going to implement this? Just checking so
> that I can (hopefully) cross this off my backlog ;-)

It wasn't something I was planning on working on (at least not
immediately) since it's still a bit fuzzy for me whether this is
enough to help Peff's use-case (and because I have several other
things in my queue, already).

However, before even considering implementing it, there's at least one
question (and possibly others) needing answering. For instance, how
should "add --force" interact with a locked (not-present) worktree?
Should it blast it despite the lock? Or would that need --force
specified twice ("git worktree add -f -f foo")?

As for the actual implementation, I haven't yet looked at how much
surgery will be needed to make 'add' respect --force.
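
The escape-hatch rules being discussed can be summarized as a small
decision table. This is a sketch of the *proposal* in this thread, not
of any shipped git behavior; the function name and return convention
are invented for illustration:

```python
# Sketch of the proposed "worktree add" force rules: one -f to reuse a
# registered-but-missing worktree path, two to override a lock as well.
def forces_required(registered, directory_present, locked):
    """Return how many -f flags 'git worktree add <path>' would need,
    or None if the add should be refused outright."""
    if not registered:
        return 0                  # fresh path: no force needed
    if directory_present:
        return None               # path already has a checkout: refuse
    return 2 if locked else 1     # "git worktree add -f -f foo" case
```

A script like doc-diff would then only ever need the single-force case.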


Re: Request for testing v2.19.0-rc0 *with builtin stash/rebase*

2018-08-24 Thread Bryan Turner
On Fri, Aug 24, 2018 at 5:14 AM Johannes Schindelin
 wrote:
>
> For that reason, I was delighted to see that our Google Summer of Code
> pushed pretty hard in that direction. And I could not help myself so I had
> to test how much faster things got. Here is the result of my first, really
> quick and dirty test:
>
>                     without builtin stash/rebase   with builtin stash/rebase
> t3400 (rebase)      1m27s                          32s
> t3404 (rebase -i)   13m15s                         3m59s
> t3903 (stash)       8m37s                          1m18s
>
> What can I say? Even if the numbers are off by as much as 10%, these are
> impressive improvements: keep in mind that there is a lot of churn going
> on in the test suite because it is itself implemented in Unix shell
> scripts (and hence I won't even bother to try more correct performance
> benchmarking because that is simply not possible when Unix shell scripts
> are in the equation). So the speed improvements of the stash/rebase
> commands are *even higher* than this.

Thanks for taking the time to make these pre-releases available. I
appreciate the effort. And the same to Junio, for always posting
release candidates. We rely on them heavily to find changes that might
cause issues before admins start upgrading in the wild and find them
for us.

I downloaded both the rc0.1 and rc0.2 builds, as well as 2.18.0, and
ran them all through Bitbucket Server's test suite a few times (to
ensure warm disk for comparable numbers). I added support for some
"simple" rebase cases a few releases ago, so we have a set of tests
that verify the rebase behaviors we use. (We don't use git stash, so
we don't have any tests in our suite for that.)

Running our entire Git test suite (~1,600 tests) against Git for
Windows 2.18.0 takes ~5 minutes, and 2.19.0-rc0.1 produced an almost
identical duration. Running our tests against rc0.2 cut the duration
down to 4 minutes. There were no test failures on either pre-release
build.

To try and get a better sense of the rebase performance improvement
specifically, I filtered down to a set of 14 specific tests which use
it and ran those. On Git 2.18, those 14 tests take just over 19
seconds. On 2.19.0-rc0.2, they take just over 8 seconds.

When they do ship, whether it's in 2.19 (by default or optional) or
later, the changes definitely offer some significant performance wins.

Thanks again, to everyone involved, for all the effort that went into
designing, implementing, reviewing and releasing these improvements.
As someone who develops under Windows most of the time, they make a
big difference in my day to day work. And that's not to mention all
the Bitbucket Server and Bitbucket Data Center users who will enjoy a
snappier experience as these changes make their way out into the wild.

Best regards,
Bryan Turner


[PATCH 1/2] rerere: remove documentation for "nested conflicts"

2018-08-24 Thread Thomas Gummerer
4af32207bc ("rerere: teach rerere to handle nested conflicts",
2018-08-05) introduced slightly better behaviour if the user commits
conflict markers and then gets another conflict in 'git rerere'.
However, this is just a heuristic for punting on such conflicts more
gracefully, and the documentation might mislead users if we change the
heuristic in the future.

Remove this documentation to avoid being potentially misleading in the
documentation.

Suggested-by: Junio C Hamano 
Signed-off-by: Thomas Gummerer 
---

The original series already made it into 'next', so these patches are
on top of that.  I also see it is marked as "will merge to master" in
the "What's cooking" email, so these two patches would be on top of
that.  If you are not planning to merge the series down to master
before 2.19, we could squash this into 10/11, otherwise I'm happy with
the patches on top.

 Documentation/technical/rerere.txt | 42 --
 1 file changed, 42 deletions(-)

diff --git a/Documentation/technical/rerere.txt b/Documentation/technical/rerere.txt
index e65ba9b0c6..3d10dbfa67 100644
--- a/Documentation/technical/rerere.txt
+++ b/Documentation/technical/rerere.txt
@@ -138,45 +138,3 @@ SHA1('BC').
 If there are multiple conflicts in one file, the sha1 is calculated
 the same way with all hunks appended to each other, in the order in
 which they appear in the file, separated by a  character.
-
-Nested conflicts
-
-
-Nested conflicts are handled very similarly to "simple" conflicts.
-Similar to simple conflicts, the conflict is first normalized by
-stripping the labels from conflict markers, stripping the common ancestor
-version, and the sorting the conflict hunks, both for the outer and the
-inner conflict.  This is done recursively, so any number of nested
-conflicts can be handled.
-
-The only difference is in how the conflict ID is calculated.  For the
-inner conflict, the conflict markers themselves are not stripped out
-before calculating the sha1.
-
-Say we have the following conflict for example:
-
-<<< HEAD
-1
-===
-<<< HEAD
-3
-===
-2
->>> branch-2
->>> branch-3~
-
-After stripping out the labels of the conflict markers, and sorting
-the hunks, the conflict would look as follows:
-
-<<<
-1
-===
-<<<
-2
-===
-3
->>>
->>>
-
-and finally the conflict ID would be calculated as:
-`sha1('1<<<\n3\n===\n2\n>>>')`
-- 
2.18.0.1088.ge017bf2cd1
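
The conflict-ID scheme the surviving documentation describes (sort the
normalized hunks and SHA-1 the concatenation, i.e. SHA1('BC') in the
document's example) can be sketched as below. This is illustrative only;
the real code lives in git's rerere.c, and the function name here is
invented:

```python
import hashlib

# Sketch: the two label-stripped sides of a conflict hunk are sorted,
# appended, and hashed to form the conflict ID.
def conflict_id(side_a, side_b):
    """ID for a single normalized conflict hunk."""
    first, second = sorted([side_a, side_b])
    return hashlib.sha1((first + second).encode()).hexdigest()
```

The sorting step is what makes the ID stable regardless of which branch
contributed which side of the conflict.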



[PATCH 2/2] rerere: add note about files with existing conflict markers

2018-08-24 Thread Thomas Gummerer
When a file contains lines that look like conflict markers, 'git
rerere' may not be able to record a conflict resolution.
Emphasize that in the man page.

Helped-by: Junio C Hamano 
Signed-off-by: Thomas Gummerer 
---

Not sure if there may be a better place in the man page for this, but
this is the best I could come up with.

 Documentation/git-rerere.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/git-rerere.txt b/Documentation/git-rerere.txt
index 031f31fa47..036ea11528 100644
--- a/Documentation/git-rerere.txt
+++ b/Documentation/git-rerere.txt
@@ -211,6 +211,12 @@ would conflict the same way as the test merge you resolved earlier.
 'git rerere' will be run by 'git rebase' to help you resolve this
 conflict.
 
+[NOTE]
+'git rerere' relies on the conflict markers in the file to detect the
+conflict.  If the file already contains lines that look the same as
+lines with conflict markers, 'git rerere' may fail to record a
+conflict resolution.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
2.18.0.1088.ge017bf2cd1
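
Given the note above, content lines that merely resemble conflict
markers could be flagged before relying on rerere. The helper below is
an illustrative sketch, not part of git; the regex assumes the default
7-character markers:

```python
import re

# Lines starting with exactly seven '<', '=', '>' or '|' characters
# look like conflict markers and can confuse rerere's detection.
MARKER = re.compile(r"^(<{7}|={7}|>{7}|\|{7})( |$)")

def marker_like_lines(text):
    """Yield (lineno, line) for lines resembling conflict markers."""
    for i, line in enumerate(text.splitlines(), 1):
        if MARKER.match(line):
            yield i, line
```

Note that the scan cannot distinguish real markers from lookalike
content, which is exactly why the man-page note says "don't do that".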



Re: [PATCH v4 10/11] rerere: teach rerere to handle nested conflicts

2018-08-24 Thread Thomas Gummerer
On 08/22, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Hmm, it does describe what happens in the code, which is what this
> > patch implements.  Maybe we should rephrase the title here?
> >
> > Or are you suggesting dropping this patch (and the next one)
> > completely, as we don't want to try and handle the case where this
> > kind of garbage is thrown at 'rerere'?
> 
> I consider these two patches as merely attempting to punt a bit
> better.  Once users start committing conflict-marker-looking lines
> in the contents, and getting them involved in actual conflicts, I do
> not think any approach (including what the original rerere uses
> before this patch) that assumes the markers will neatly form set of
> blocks of text enclosed in << == >> will reliably step around such
> broken contents.  E.g. it is entirely conceivable both branches have
> the <<< beginning of conflict marker plus contents from the HEAD
> before they recorded the marker that are identical, that diverge as
> you scan the text down and get closer to ===, something like:
> 
> side A  side B
> 
> 
> shared  shared
> <<< <<<
> version before  version before
> these guys merged   these guys merged
> their ancestor  their ancestor
> versions            versions.
> but some            now some
> lines are different lines are different
> ===                 ===
> and other           totally different
> contents            contents
> ...                 ...
> 
> And a merge of these may make <<< part shared (i.e. outside the
> conflicted region) while lines near and below  part of conflict,
> which would give us something like
> 
> merge of side A & B
> ---
> 
> shared  
> <<< (this is part of contents)
> version before  
> these guys merged   
> their ancestor  
> <<< HEAD            (conflict marker)
> versions
> but some
> lines are different
> === (this is part of contents)
> and other
> contents
> ...
> === (conflict marker)
> versions.
> now some
> lines are different
> === (this is part of contents)
> totally different
> contents
> ...
> >>> theirs  (conflict marker)
> 
> Depending on the shape of the original conflict that was committed,
> we may have two versions of <<<, together with the real conflict
> marker, but shared closing >>> marker.  With contents like that,
> there is no way for us to split these lines into two groups at a
> line '=' (which one?) and swap to come up with the normalized
> shape.
> 
> The original rerere algorithm would punt when such an unmatched
> markers are found, and deals with "nested conflict" situation by
> avoiding to create such a thing altogether.  I am sure your two
> patches may make the code punt less, but I suspect that is not a
> foolproof "solution" but more of a workaround, as I do not think it
> is solvable, once you allow users to commit conflict-marker looking
> strings in contents.

Agreed.  I think it may be solvable if we'd actually get the
information about what belongs to which side from the merge algorithm
directly.  But that sounds way more involved than what I'm able to
commit to for something that I don't foresee running into myself :)

>   As the heuristics used in such a workaround
> are very likely to change, and something the end-users should not
> even rely on, I'd rather not document and promise the exact
> behaviour---perhaps we should stress "don't do that" even stronger
> instead.

Fair enough.  I thought of the technical documentation as something
that doesn't promise users anything, but rather describes how the
internals work right now, which is what this bit of documentation
attempted to write down.  But if we are worried about this giving end
users ideas then I definitely agree and we should get rid of this bit
of documentation.  I'll send a patch for that, and for adding a note
about "don't do that" in the man page.


Re: [PATCH v4 4/6] tests: use shorter here-docs in chainlint.sed for AIX sed

2018-08-24 Thread Eric Sunshine
On Fri, Aug 24, 2018 at 11:20 AM Ævar Arnfjörð Bjarmason
 wrote:
> Improve the portability of chainlint by using shorter here-docs. On
> AIX sed will complain about:
>
> sed: 0602-417 The label :hereslurp is greater than eight
> characters

Shortening the names makes them ugly and often unreadable. That's not
a complaint with this patch; just a general observation regarding
8-byte limitation with this platform's "sed" (and POSIX). See a few
suggested improvements below, but probably not worth a re-roll.

> This, in combination with the previous fix to this file makes
> GIT_TEST_CHAIN_LINT=1 (which is the default) working again on AIX
> without issues, and the "gmake check-chainlint" test also passes.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 

FWIW,
Acked-by: Eric Sunshine 

> diff --git a/t/chainlint.sed b/t/chainlint.sed
> -:checkchain
> +:chkchn

":chkchain"

>  # found incomplete line "...\" -- slurp up next line
> -:incomplete
> +:icmplte

":fold" (for "fold out NL")

>  # found nested multi-line "(...\n...)" -- pass through untouched
> -:nestslurp
> +:nstslurp

":nesteat"

> -:nestcontinue
> +:nstcnt

":nestcont"

> -:nestclose
> +:nstclose

":nestend" or ":endnest"

>  # found closing ")" on own line -- drop "suspect" from final line of subshell
>  # since that line legitimately lacks "&&" and exit subshell loop
> -:closesolo
> +:clssolo

":endsolo"


Re: [PATCH v4 3/6] tests: fix comment syntax in chainlint.sed for AIX sed

2018-08-24 Thread Eric Sunshine
On Fri, Aug 24, 2018 at 11:20 AM Ævar Arnfjörð Bjarmason
 wrote:
> Change a comment in chainlint.sed to appease AIX sed, which would
> previously print this error:
>
> sed:# stash for later printing is not a recognized function
>
> 1. 
> https://public-inbox.org/git/CAPig+cTTbU5HFMKgNyrxTp3+kcK46-Fn=4zh6zdt1oqchac...@mail.gmail.com/

It would have been nice for the commit message to explain that the
problem was caused by the comment being indented, since it's otherwise
not obvious to the reader without chasing the link to the email
exchange, but probably not worth a re-roll.

> Signed-off-by: Ævar Arnfjörð Bjarmason 

FWIW,
Acked-by: Eric Sunshine 


Re: [PATCH v4 5/6] tests: fix version-specific portability issue in Perl JSON

2018-08-24 Thread Eric Sunshine
On Fri, Aug 24, 2018 at 11:20 AM Ævar Arnfjörð Bjarmason
 wrote:
> The test guarded by PERLJSON added in 75459410ed ("json_writer: new
> routines to create JSON data", 2018-07-13) assumed that a JSON boolean
> value like "true" or "false" would be represented as "1" or "0" in
> Perl.
>
> This behavior can't be relied upon, e.g. with JSON.pm 2.50 and
> JSON::PP A JSON::PP::Boolean object will be represented as "true" or

s/PP A/PP. A/

(Not worth a re-roll.)

> "false". To work around this let's check if we have any refs left
> after we check for hashes and arrays, assume those are JSON objects,
> and coerce them to a known boolean value.
>
> The behavior of this test still looks odd to me. Why implement our own
> ad-hoc encoder just for some one-off test, as opposed to say Perl's
> own Data::Dumper with Sortkeys et al? But with this change it works,
> so let's leave it be.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
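
The coercion the patch describes has a direct analog outside Perl: test
code that compares decoded JSON should normalize booleans to one known
representation rather than relying on how the decoder spells them. A
sketch in Python (the function and the "1"/"0" convention here mirror
the test's expectations and are assumptions of this illustration, not
git's code):

```python
import json

def canonicalize(value):
    """Reduce a decoded JSON value to a comparison-stable form."""
    if isinstance(value, bool):     # must check bool before int
        return "1" if value else "0"
    if isinstance(value, dict):
        return {k: canonicalize(v) for k, v in sorted(value.items())}
    if isinstance(value, list):
        return [canonicalize(v) for v in value]
    return value
```

With this, a decoded `true` always compares as "1" no matter which
decoder (or decoder version) produced it.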


Re: clone, hardlinks, and file modes (and CAP_FOWNER)

2018-08-24 Thread Andreas Krey
On Fri, 24 Aug 2018 16:48:37 +, Ævar Arnfjörð Bjarmason wrote:
...
> I don't understand how this hardlink approach would work (doesn't mean
> it won't, just that I don't get it).

I just detect whether there is insufficient sharing (df is quite handy
here; 'df this/.git that/.git' tells the unshared part of that/.git only).

When I detect 'unsharedness', I just hardlink the biggest .pack and the
corresponding .idx into the target repo, create a .keep file for that,
run 'git gc', and remove the .keep file. Effect: repack uses the .kept
file and only creates a small additional pack file for the remaining
objects, thus the biggest part of the objects are now shared between
the cache and the target repo.

This is going to be run once a week over all the repos on a machine
(those created by our tooling, which thus have known locations),
to keep eventual repacks of repos from gradually and completely
losing the sharedness of the objects/packs.

> If you have such a tightly coupled approach, isn't --reference close to
> what you want in that case?

Close, but not quite. --reference et al. all need the promise that the
referenced repo isn't going away, and I don't want to rely on this
(if someone thinks they can drop the cache, this should not lead to
breakage in the work repos).

- Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds 
Date: Fri, 22 Jan 2010 07:29:21 -0800
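
The "unsharedness" detection described above can be done at file
granularity with link counts: a pack that is hardlinked from both the
cache repo and a work repo has st_nlink greater than one. A sketch
(invented helper, not Andreas' actual tooling):

```python
import os

def unshared_files(directory):
    """Return files under 'directory' that no other path links to.

    In the scenario above, scanning objects/pack for st_nlink == 1
    would flag packs that have lost their sharing with the cache repo.
    """
    result = []
    for root, _, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if os.stat(path).st_nlink == 1:
                result.append(path)
    return result
```

(df, as the message notes, gives the same answer at whole-repo
granularity without walking the tree.)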


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Ben Peart




On 8/24/2018 3:00 PM, Duy Nguyen wrote:

On Fri, Aug 24, 2018 at 8:40 PM Ben Peart  wrote:




On 8/24/2018 2:20 PM, Duy Nguyen wrote:

On Fri, Aug 24, 2018 at 5:37 PM Duy Nguyen  wrote:

On Thu, Aug 23, 2018 at 10:36 PM Ben Peart  wrote:

Nice to see this done without a new index extension that records
offsets, so that we can load existing index files in parallel.



Yes, I prefer this simpler model as well.  I wasn't sure it would
produce a significant improvement given the primary thread still has to
run through the variable length cache entries but was pleasantly surprised.


Out of curiosity, how much time saving could we gain by recording
offsets as an extension (I assume we need, like 4 offsets if the
system has 4 cores)? Much much more than this simpler model (which may
justify the complexity) or just "meh" compared to this?


To answer my own question, I ran a patched git to precalculate
individual thread parameters, removed the scheduler code and hard
coded these parameters (I ran just 4 threads, one per core). I got
0m2.949s (webkit.git, 275k files, 100 read-cache runs). Compared to
0m4.996s from Ben's patch (same test settings of course) I think it's
definitely worth adding some extra complexity.



I took a run at doing that last year [1] but that was before the
mem_pool work that allowed us to avoid the thread contention on the heap
so the numbers aren't an apples to apples comparison (they would be
better today).


Ah.. sorry I was not aware. A big chunk of 2017 is blank to me when it
comes to git.


The trade-off is the additional complexity to be able to load the index
extension without having to parse through all the variable length cache
entries.  My patch worked but there was feedback requested to make it
more generic and robust that I haven't gotten around to yet.


One more comment. Instead of forcing this special index at the bottom,
add a generic one that gives positions of all extensions and put that
one at the bottom. Then you can still quickly locate your offset table
extension, and you could load UNTR and TREE extensions in parallel too
(those scale up to worktree size)



That is pretty much what Junio's feedback was and what I was referring 
to as making it "more generic."  The "more robust" part was the request 
to add a SHA to the extension to ensure it wasn't corrupt and was a 
valid extension.



This patch series went for simplicity over absolutely the best possible
performance.


Well, you know my stance on this now :) Not that it really matters.


[1]
https://public-inbox.org/git/20171109141737.47976-1-benpe...@microsoft.com/


PS. I still think it's worth bringing v4's performance back to v2. It's
low-hanging fruit because I'm pretty sure Junio did not add the v4 code
with CPU performance in mind. It was about file size at that time, and
CPU consumption was still dwarfed by hashing.



I see that as a nice follow up patch.  If the extension exists, use it 
and jump directly to the blocks and spin up threads.  If it doesn't 
exist, fall back to the code in this patch that has to find/compute the 
blocks on the fly.




Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Duy Nguyen
On Fri, Aug 24, 2018 at 8:40 PM Ben Peart  wrote:
>
>
>
> On 8/24/2018 2:20 PM, Duy Nguyen wrote:
> > On Fri, Aug 24, 2018 at 5:37 PM Duy Nguyen  wrote:
> >> On Thu, Aug 23, 2018 at 10:36 PM Ben Peart  wrote:
>  Nice to see this done without a new index extension that records
>  offsets, so that we can load existing index files in parallel.
> 
> >>>
> >>> Yes, I prefer this simpler model as well.  I wasn't sure it would
> >>> produce a significant improvement given the primary thread still has to
> >>> run through the variable length cache entries but was pleasantly 
> >>> surprised.
> >>
> >> Out of curiosity, how much time saving could we gain by recording
> >> offsets as an extension (I assume we need, like 4 offsets if the
> >> system has 4 cores)? Much much more than this simpler model (which may
> >> justify the complexity) or just "meh" compared to this?
> >
> > To answer my own question, I ran a patched git to precalculate
> > individual thread parameters, removed the scheduler code and hard
> > coded these parameters (I ran just 4 threads, one per core). I got
> > 0m2.949s (webkit.git, 275k files, 100 read-cache runs). Compared to
> > 0m4.996s from Ben's patch (same test settings of course) I think it's
> > definitely worth adding some extra complexity.
> >
>
> I took a run at doing that last year [1] but that was before the
> mem_pool work that allowed us to avoid the thread contention on the heap
> so the numbers aren't an apples to apples comparison (they would be
> better today).

Ah.. sorry I was not aware. A big chunk of 2017 is blank to me when it
comes to git.

> The trade-off is the additional complexity to be able to load the index
> extension without having to parse through all the variable length cache
> entries.  My patch worked but there was feedback requested to make it
> more generic and robust that I haven't gotten around to yet.

One more comment. Instead of forcing this special index at the bottom,
add a generic one that gives positions of all extensions and put that
one at the bottom. Then you can still quickly locate your offset table
extension, and you could load UNTR and TREE extensions in parallel too
(those scale up to worktree size)

> This patch series went for simplicity over absolutely the best possible
> performance.

Well, you know my stance on this now :) Not that it really matters.

> [1]
> https://public-inbox.org/git/20171109141737.47976-1-benpe...@microsoft.com/

PS. I still think it's worth bringing v4's performance back to v2. It's
low-hanging fruit because I'm pretty sure Junio did not add the v4 code
with CPU performance in mind. It was about file size at that time, and
CPU consumption was still dwarfed by hashing.
-- 
Duy
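
The offset-extension idea discussed in this thread can be modeled with
a toy format: variable-length records stored back to back, plus a small
table of start offsets that lets each thread begin parsing at a record
boundary instead of scanning from the top. This is purely illustrative
(the encoding, function names, and one-byte length prefix are all
inventions of the sketch, not git's index format):

```python
import threading

def encode(records):
    """Pack length-prefixed records (each < 256 bytes); return
    (blob, list of start offsets, one per record)."""
    blob, offsets = b"", []
    for rec in records:
        offsets.append(len(blob))
        blob += bytes([len(rec)]) + rec
    return blob, offsets

def parse_range(blob, start, end, out, slot):
    """Parse records in blob[start:end] into out[slot]."""
    pos, recs = start, []
    while pos < end:
        n = blob[pos]
        recs.append(blob[pos + 1:pos + 1 + n])
        pos += 1 + n
    out[slot] = recs

def parallel_parse(blob, offsets, nthreads=4):
    """Parse the blob with nthreads threads, using recorded offsets
    to pick valid record-aligned start positions."""
    bounds = [offsets[i * len(offsets) // nthreads] for i in range(nthreads)]
    bounds.append(len(blob))
    out = [None] * nthreads
    threads = [threading.Thread(target=parse_range,
                                args=(blob, bounds[i], bounds[i + 1], out, i))
               for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [rec for chunk in out for rec in chunk]
```

Without the offset table, only one thread can find the record
boundaries, which is exactly the scan the simpler patch in this thread
still has to pay for.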


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Ben Peart




On 8/24/2018 2:20 PM, Duy Nguyen wrote:
> On Fri, Aug 24, 2018 at 5:37 PM Duy Nguyen  wrote:
> > On Thu, Aug 23, 2018 at 10:36 PM Ben Peart  wrote:
> > > > Nice to see this done without a new index extension that records
> > > > offsets, so that we can load existing index files in parallel.
> > > >
> > >
> > > Yes, I prefer this simpler model as well.  I wasn't sure it would
> > > produce a significant improvement given the primary thread still has to
> > > run through the variable length cache entries but was pleasantly surprised.
> >
> > Out of curiosity, how much time saving could we gain by recording
> > offsets as an extension (I assume we need, like 4 offsets if the
> > system has 4 cores)? Much much more than this simpler model (which may
> > justify the complexity) or just "meh" compared to this?
>
> To answer my own question, I ran a patched git to precalculate
> individual thread parameters, removed the scheduler code and hard
> coded these parameters (I ran just 4 threads, one per core). I got
> 0m2.949s (webkit.git, 275k files, 100 read-cache runs). Compared to
> 0m4.996s from Ben's patch (same test settings of course) I think it's
> definitely worth adding some extra complexity.



I took a run at doing that last year [1] but that was before the 
mem_pool work that allowed us to avoid the thread contention on the heap 
so the numbers aren't an apples to apples comparison (they would be 
better today).


The trade-off is the additional complexity to be able to load the index 
extension without having to parse through all the variable length cache 
entries.  My patch worked but there was feedback requested to make it 
more generic and robust that I haven't gotten around to yet.


This patch series went for simplicity over absolutely the best possible 
performance.


[1] 
https://public-inbox.org/git/20171109141737.47976-1-benpe...@microsoft.com/


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Duy Nguyen
On Thu, Aug 23, 2018 at 7:33 PM Stefan Beller  wrote:
> > +core.fastIndex::
> > +   Enable parallel index loading
> > ++
> > +This can speed up operations like 'git diff' and 'git status' especially
> > +when the index is very large.  When enabled, Git will do the index
> > +loading from the on disk format to the in-memory format in parallel.
> > +Defaults to true.
>
> "fast" is a non-descriptive word as we try to be fast in any operation?
> Maybe core.parallelIndexReading as that just describes what it
> turns on/off, without second guessing its effects?

Another option is index.threads (the "index" section currently only
has one item, index.version). The value could be the same as
grep.threads or pack.threads.

(and if you're thinking about parallelizing writes as well, which may
need to be tuned differently, then perhaps index.readThreads, but I
don't think we need to go that far)
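For reference, a sketch of the convention grep.threads and pack.threads follow, which index.threads would presumably inherit: 0 (or unset) means auto-detect one thread per core, a positive N means exactly N. The helper name here is invented:

```python
import os

def resolve_index_threads(configured):
    """Hypothetical helper mirroring the grep.threads / pack.threads
    convention: 0 (or unset) means auto-detect one thread per core,
    a positive N means exactly N threads."""
    if configured is None or configured == 0:
        return os.cpu_count() or 1
    if configured < 0:
        raise ValueError("index.threads must be >= 0")
    return configured

assert resolve_index_threads(4) == 4
assert resolve_index_threads(0) >= 1
```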
-- 
Duy


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Duy Nguyen
On Fri, Aug 24, 2018 at 5:37 PM Duy Nguyen  wrote:
> On Thu, Aug 23, 2018 at 10:36 PM Ben Peart  wrote:
> > > Nice to see this done without a new index extension that records
> > > offsets, so that we can load existing index files in parallel.
> > >
> >
> > Yes, I prefer this simpler model as well.  I wasn't sure it would
> > produce a significant improvement given the primary thread still has to
> > run through the variable length cache entries but was pleasantly surprised.
>
> Out of curiosity, how much time saving could we gain by recording
> offsets as an extension (I assume we need, like 4 offsets if the
> system has 4 cores)? Much much more than this simpler model (which may
> justify the complexity) or just "meh" compared to this?

To answer my own question, I ran a patched git to precalculate
individual thread parameters, removed the scheduler code and hard
coded these parameters (I ran just 4 threads, one per core). I got
0m2.949s (webkit.git, 275k files, 100 read-cache runs). Compared to
0m4.996s from Ben's patch (same test settings of course) I think it's
definitely worth adding some extra complexity.
-- 
Duy


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Ben Peart




On 8/24/2018 11:57 AM, Duy Nguyen wrote:
> On Fri, Aug 24, 2018 at 05:37:20PM +0200, Duy Nguyen wrote:
> > Since we're cutting corners to speed things up, could you try
> > something like this?
> >
> > I notice that reading v4 is significantly slower than v2 and
> > apparently strlen() (at least from glibc) is much cleverer and at
> > least gives me a few percentage time saving.
> >
> > diff --git a/read-cache.c b/read-cache.c
> > index 7b1354d759..d10cccaed0 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -1755,8 +1755,7 @@ static unsigned long expand_name_field(struct
> > strbuf *name, const char *cp_)
> >  if (name->len < len)
> >  die("malformed name field in the index");
> >  strbuf_remove(name, name->len - len, len);
> > -   for (ep = cp; *ep; ep++)
> > -   ; /* find the end */
> > +   ep = cp + strlen(cp);
> >  strbuf_add(name, cp, ep - cp);
> >  return (const char *)ep + 1 - cp_;
> >   }
>
> No, try this instead. It's halfway back to v2 numbers for me (tested
> with "test-tool read-cache 100" on webkit.git). For the record, v4 is
> about 30% slower than v2 in my tests.

Thanks Duy, this helped on my system as well.

Interestingly, simply reading the cache tree extension in read_one() now
takes about double the CPU on the primary thread as does
load_cache_entries().

Hmm, that gives me an idea.  I could kick off another thread to load
that extension in parallel and cut off another ~160 ms.  I'll add that
to my list of future patches to investigate...

> We could probably do better too. Instead of preparing the string in a
> separate buffer (previous_name_buf), we could just assemble it directly
> to the newly allocated "ce".
>
> -- 8< --
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..237f60a76c 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1754,9 +1754,8 @@ static unsigned long expand_name_field(struct strbuf
> *name, const char *cp_)
>
> 	if (name->len < len)
> 		die("malformed name field in the index");
> -   strbuf_remove(name, name->len - len, len);
> -   for (ep = cp; *ep; ep++)
> -   ; /* find the end */
> +   strbuf_setlen(name, name->len - len);
> +   ep = cp + strlen(cp);
> 	strbuf_add(name, cp, ep - cp);
> 	return (const char *)ep + 1 - cp_;
>  }
> -- 8< --
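For readers following along: index v4 prefix-compresses path names, with each entry storing only "strip N bytes from the previous path, then append this suffix", so every name has to be rebuilt entry by entry; v2's full paths avoid that work. A rough illustrative sketch (Python, not the C implementation):

```python
def expand_name(previous_name, strip_len, suffix):
    """Rebuild one v4-style path from the previous entry's path:
    drop strip_len bytes from its end, then append the new suffix."""
    if strip_len > len(previous_name):
        raise ValueError("malformed name field in the index")
    return previous_name[:len(previous_name) - strip_len] + suffix

# First entry starts from the empty string; later entries share prefixes.
name = expand_name(b"", 0, b"Documentation/git.txt")
name = expand_name(name, len(b"git.txt"), b"config.txt")
assert name == b"Documentation/config.txt"
```

That per-entry copy is the cost the strlen()/strbuf changes in the patches above are trying to cheapen.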



Re: [PATCH v4 1/9] submodule: add a print_config_from_gitmodules() helper

2018-08-24 Thread Antonio Ospite
On Fri, 24 Aug 2018 16:32:38 +0200
Ævar Arnfjörð Bjarmason  wrote:

> 
> On Fri, Aug 24 2018, Antonio Ospite wrote:
[...]
> > +static int config_print_callback(const char *key_, const char *value_, 
> > void *cb_data)
> > +{
> > +   char *key = cb_data;
> > +
> > +   if (!strcmp(key, key_))
> > +   printf("%s\n", value_);
> > +
> > +   return 0;
> > +}
> 
> No problem with the code itself, but I'd find this a lot easier to read
> in context if it was:
> 
> key_ -> var
> value_ -> value
> key -> wanted_key, perhaps?
> 
> I.e. the rest of the file uses the convention of the
> config_from_gitmodules callbacks getting "var" and "value",
> respectively.
> 
> We don't use this convention of suffixing variables with "_" if they're
> arguments to the function anywhere else.
>

I was new to git when I first wrote the code; I picked up the style
by copying from builtin/config.c (collect_config() and
show_all_config()) because I was focusing more on the functionality and
didn't bother to harmonize the style with the destination file.

> > +int print_config_from_gitmodules(const char *key)
> > +{
> > +   int ret;
> > +   char *store_key;
> > +
> > +   ret = git_config_parse_key(key, &store_key, NULL);
> > +   if (ret < 0)
> > +   return CONFIG_INVALID_KEY;
> 
> Isn't this a memory leak? I.e. we should free() and return here, no?
>

It is true that git_config_parse_key_1() allocates some storage even
for an invalid key, and uses the space to lowercase the key as the
parsing progresses; however, it also frees the memory in the *error*
path.

So unless I am missing something I don't think there is a leak here.

[...]
> > diff --git a/submodule-config.h b/submodule-config.h
> > index dc7278eea4..ed40e9a478 100644
> > --- a/submodule-config.h
> > +++ b/submodule-config.h
> > @@ -56,6 +56,8 @@ void submodule_free(struct repository *r);
> >   */
> >  int check_submodule_name(const char *name);
> >
> > +int print_config_from_gitmodules(const char *key);
> > +
> >  /*
> >   * Note: these helper functions exist solely to maintain backward
> >   * compatibility with 'fetch' and 'update_clone' storing configuration in
> 
> Another style nit: Makes more sense to put this new function right
> underneath "submodule_free" above, instead of under 1/2 of the functions
> that have a big comment describing how they work, since that comment
> doesn't apply to this new function.

You are probably right, IIRC the function was firstly written before
check_submodule_name() was there. And I didn't think of moving it up
when I rebased after check_submodule_name() was added.

The same remark may apply to the new function added in patch 02.

I'll wait for other comments to see if a v5 is really needed.

Thanks for the review Ævar.

Ciao,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?


Re: [ANNOUNCE] Git v2.19.0-rc0

2018-08-24 Thread Derrick Stolee

On 8/23/2018 4:59 PM, Derrick Stolee wrote:


When I choose my own metrics for performance tests, I like to run at 
least 10 runs, remove the largest AND smallest runs from the samples, 
and then take the average. I did this manually for 'git rev-list --all 
--objects' on git/git and got the following results:


| v2.18.0 | v2.19.0-rc0 | HEAD    |
|---------|-------------|---------|
| 3.126 s | 3.308 s     | 3.170 s |

For full disclosure, here is a full table including all samples:

|  | v2.18.0 | v2.19.0-rc0 | HEAD    |
|--|-|-|-|
|  | 4.58    | 3.302   | 3.239   |
|  | 3.13    | 3.337   | 3.133   |
|  | 3.213   | 3.291   | 3.159   |
|  | 3.219   | 3.318   | 3.131   |
|  | 3.077   | 3.302   | 3.163   |
|  | 3.074   | 3.328   | 3.119   |
|  | 3.022   | 3.277   | 3.125   |
|  | 3.083   | 3.259   | 3.203   |
|  | 3.057   | 3.311   | 3.223   |
|  | 3.155   | 3.413   | 3.225   |
| Max  | 4.58    | 3.413   | 3.239   |
| Min  | 3.022   | 3.259   | 3.119   |
| Avg* | 3.126   | 3.30825 | 3.17025 |

(Note that the largest one was the first run, on v2.18.0, which is due 
to a cold disk.)
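The Avg* row follows the recipe described above: drop the single largest and smallest samples, then average the rest. As a sketch:

```python
def trimmed_mean(samples):
    """Average after discarding the one largest and one smallest run."""
    if len(samples) < 3:
        raise ValueError("need at least 3 samples")
    trimmed = sorted(samples)[1:-1]
    return sum(trimmed) / len(trimmed)

# The v2.18.0 column from the table above.
v2_18_0 = [4.58, 3.13, 3.213, 3.219, 3.077, 3.074, 3.022, 3.083, 3.057, 3.155]
assert abs(trimmed_mean(v2_18_0) - 3.126) < 1e-9
```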


I just kicked off a script that will run this test on the Linux repo 
while I drive home. I'll be able to report a similar table of data 
easily.


Here are the numbers for Linux:

|  | v2.18.0  | v2.19.0-rc0 | HEAD   |
|--|--|-||
|  | 86.5 | 70.739  | 57.266 |
|  | 60.582   | 101.928 | 56.641 |
|  | 58.964   | 60.139  | 60.258 |
|  | 59.47    | 61.141  | 58.213 |
|  | 62.554   | 60.73   | 84.54  |
|  | 59.139   | 85.424  | 57.745 |
|  | 58.487   | 59.31   | 59.979 |
|  | 58.653   | 69.845  | 60.181 |
|  | 58.085   | 102.777 | 61.455 |
|  | 58.304   | 60.459  | 62.551 |
| Max  | 86.5 | 102.777 | 84.54  |
| Min  | 58.085   | 59.31   | 56.641 |
| Avg* | 59.51913 | 71.30063    | 59.706 |
| Med  | 59.0515  | 65.493  | 60.08  |

Thanks,

-Stolee



Re: [git-for-windows] Request for testing v2.19.0-rc0 *with builtin stash/rebase*

2018-08-24 Thread Johannes Schindelin
Team,

On Fri, 24 Aug 2018, Johannes Schindelin wrote:

> while this mail talks about Git for Windows, please keep in mind that we
> try *very* hard to keep Git for Windows' master working correctly not only
> on Windows but also on macOS and Linux.
> 
> I, for one, run Git built from Git for Windows' `master` branch in my
> Linux VMs all the time.
> 
> As all of you know by now, the fact that Git was pretty much designed to
> work well on Linux is not exactly helping Git for Windows. (Or for that
> matter, macOS: Git is substantially slower on macOS than on Linux when
> running on the same hardware.) The quickest pay-off comes from converting
> Unix shell scripts (which are designed to spawn processes, whereas Windows
> is more optimized for multi-threaded applications).
> 
> For that reason, I was delighted to see that our Google Summer of Code
> pushed pretty hard in that direction. And I could not help myself so I had
> to test how much faster things got. Here is the result of my first, really
> quick and dirty test:
> 
>                       without builtin stash/rebase    with builtin stash/rebase
> t3400 (rebase)        1m27s                           32s
> t3404 (rebase -i)     13m15s                          3m59s
> t3903 (stash)         8m37s                           1m18s
> 
> What can I say? Even if the numbers are off by as much as 10%, these are
> impressive improvements: keep in mind that there is a lot of churn going
> on in the test suite because it is itself implemented in Unix shell
> scripts (and hence I won't even bother to try more correct performance
> benchmarking because that is simply not possible when Unix shell scripts
> are in the equation). So the speed improvements of the stash/rebase
> commands are *even higher* than this.
> 
> So I really, really, really want those builtins in Git for Windows
> v2.19.0. At *least* as an option.
> 
> Therefore, after creating a pre-release of Git for Windows corresponding
> to Git v2.19.0-rc0, I created another one (note the .2 at the end), *with*
> those new builtins:
> https://github.com/git-for-windows/git/releases/tag/v2.19.0-rc0.windows.2
> 
> I would like to ask you for help in dog-fooding this. In particular if you
> are a heavy stash/rebase user (like I am), it would be helpful to really
> kick those tires.
> 
> Of course, those are only Windows binaries on that web page, but it should
> be easy to compile from that tag on Linux and macOS. I could also build a
> macOS installer and add it to that pre-release, is there interest in that?
> 
> Currently I am uncertain whether I should spend the time to reinstate the
> old scripted `git stash` guarded by `stash.useBuiltin` and the old
> scripted `git rebase -i` (guarded by the same `rebase.useBuiltin` that
> guards the scripted `git rebase`), as a fall-back (or make the new
> builtins opt-ins instead of opt-outs).
> 
> So far, I have not encountered any serious bug, but then, I did not really
> have a chance to use it yet (I installed it, of course, but I have not
> done any serious rebasing yet). So your help will be crucial in
> determining where I need to spend time.

Thomas Gummerer pointed out just how dangerously I want to live, so I did
implement the `stash.useBuiltin` option (and extended the
`rebase.useBuiltin` option to also cover `rebase -i`), and made the use of
those builtins an opt-in, i.e. still default to the scripted versions.

You can see my progress in this PR (which I mainly opened to have this
tested on Windows, macOS and Linux):
https://github.com/git-for-windows/git/pull/1800

Ciao,
Johannes


Would a config var for --force-with-lease be useful?

2018-08-24 Thread Scott Johnson
Hello Everyone:

I'm considering writing a patch that adds a configuration variable
that will allow the user to default the command:

git push --force

to:

git push --force-with-lease

As discussed here:

https://stackoverflow.com/questions/30542491/push-force-with-lease-by-default

Now, I understand that there are downsides to having this enabled,
namely that a user who has this enabled might forget that they have it
enabled, and, as such, on a machine that _doesn't_ have it enabled (of
which they are unfamiliar) might then run the more consequential
command "git push --force", but my thinking is that adding this as a
feature to the git codebase as an _optional_ (i.e. not enabled by
default) configuration variable would then save some of us who use a
"rebase-then-force-push for pull request" workflow some time and
headaches.

Of course, I don't want to submit a patch if this is a feature that
isn't likely to be accepted, so I wanted to get some thoughts from the
mailing list regarding this idea.

Thank you,

~Scott Johnson


Re: [PATCH 01/11] builtin rebase: support --onto

2018-08-24 Thread Johannes Schindelin
Hi Junio,

On Wed, 8 Aug 2018, Junio C Hamano wrote:

> Pratik Karki  writes:
> 
> > The `--onto` option is important, as it allows to rebase a range of
> > commits onto a different base commit (which gave the command its odd
> > name: "rebase").
> 
> Is there anything unimportant?  A rhetorical question, of course.

You might think it is a rhetorical question, but obviously it is not, as
your reaction testifies.

But certainly there are more important options and less important options!
The most important options are those that are frequently used.

> The quite casual and natural use of "to rebase" as a verb in the
> first sentence contradicts with what the parenthetical "its odd
> name" comment says.  Perhaps drop everything after "(which..."?
> 
> i.e.
> 
>   The `--onto` option allows to rebase a range of commits onto
>   a different base commit.  Port the support for the option to
>   the C re-implementation.

I'd rather keep it.

Remember, a story is easier to read than a dull academic treatise. I want
to have a little bit of a personal touch when I stumble over these commit
messages again. And I know I will.

> > This commit introduces options parsing so that different options can
> > be added in future commits.
> 
> We usually do not say "This commit does X", or (worse) "I do X in
> this commit".

Oh, don't we now? ;-)

(This *was* a rhetorical question, as *I* use this tense all the time, and
unless you have quietly rewritten my commit messages without my knowledge
nor consent, the Git commit history is full of these instances.)

> Instead, order the codebase to be like so, e.g.  "Support command line
> options by adding a call to parse_options(); later commits will add more
> options by building on top." or something like that.

To be quite frank with you, I hoped for a review that would focus a teeny
tiny bit on the correctness of the code.

If you want to continue to nit-pick the commit messages, that's fine, of
course, but do understand that I am not really prepared to change a whole
lot there, unless you point out outright errors or false statements. Those
naturally need fixing.

Also, please note that I will now *definitely* focus on bug fixes, as I am
really eager to get those speed improvements into Git for Windows v2.19.0.

And I don't know whether I have said this publicly yet: I will send the
next iterations of Pratik's patch series. He is busy with exams (GSoC
really caters for US schedules, students who are in countries with very
different university schedules are a bit out of luck here), and I really
want these patches.

Ciao,
Dscho




Re: [PATCH 09/11] builtin rebase: start a new rebase only if none is in progress

2018-08-24 Thread Johannes Schindelin
Hi Stefan,

On Wed, 8 Aug 2018, Stefan Beller wrote:

> On Wed, Aug 8, 2018 at 6:51 AM Pratik Karki  wrote:
> >
> > diff --git a/builtin/rebase.c b/builtin/rebase.c
> > index 8a7bf3d468..a261f552f1 100644
> > --- a/builtin/rebase.c
> > +++ b/builtin/rebase.c
> > @@ -455,6 +481,26 @@ int cmd_rebase(int argc, const char **argv, const char 
> > *prefix)
> > usage_with_options(builtin_rebase_usage,
> >builtin_rebase_options);
> >
> > +   /* Make sure no rebase is in progress */
> 
> The faithful conversion doesn't even stop at the comments. ;-)

Yes, I insisted on it.

TBH it is a bit of a shame that we cannot fix all those error messages
going to stdout, but... you know... One step after the other.

> I shortly wondered if this is the best place for this comment,
> but let's just keep it here to have the 1:1 rewrite.

It should probably be inside the conditional block, but as you say: the
original had it in a funny spot, and so does the converted code.

> > +   if (in_progress) {
> [...]
> > +   state_dir_base, cmd_live_rebase,buf.buf);
> 
> In case a resend is needed, add a whitespace after the
> comma and buf.buf, please.

I will fix this before sending the next iteration.

Thanks for the review!
Dscho


Re: [PATCH 08/11] builtin rebase: support --force-rebase

2018-08-24 Thread Johannes Schindelin
Hi Stefan,

On Wed, 8 Aug 2018, Stefan Beller wrote:

> On Wed, Aug 8, 2018 at 6:51 AM Pratik Karki  wrote:
> 
> > @@ -551,10 +560,21 @@ int cmd_rebase(int argc, const char **argv, const 
> > char *prefix)
> [...]
> > ; /* be quiet */
> > else if (!strcmp(branch_name, "HEAD") &&
> > -   resolve_ref_unsafe("HEAD", 0, NULL, &flags))
> > +resolve_ref_unsafe("HEAD", 0, NULL, &flags))
> 
> This line is changing only the indentation whitespace?
> Would it make sense to have it in the previous patch?

Correct. I will fix this before sending the next iteration,
Dscho


Re: [PATCH 11/18] builtin rebase: support `--autostash` option

2018-08-24 Thread Johannes Schindelin
Hi Duy,

On Sat, 18 Aug 2018, Duy Nguyen wrote:

> On Wed, Aug 8, 2018 at 5:26 PM Pratik Karki  wrote:
> > @@ -224,13 +219,56 @@ static int read_basic_state(struct rebase_options 
> > *opts)
> > return 0;
> >  }
> >
> > +static int apply_autostash(struct rebase_options *opts)
> > +{
> > +   const char *path = state_dir_path("autostash", opts);
> > +   struct strbuf autostash = STRBUF_INIT;
> > +   struct child_process stash_apply = CHILD_PROCESS_INIT;
> > +
> > +   if (!file_exists(path))
> > +   return 0;
> > +
> > +   if (read_one(state_dir_path("autostash", opts), &autostash))
> > +   return error(_("Could not read '%s'"), path);
> > +   argv_array_pushl(&stash_apply.args,
> > +"stash", "apply", autostash.buf, NULL);
> > +   stash_apply.git_cmd = 1;
> > +   stash_apply.no_stderr = stash_apply.no_stdout =
> > +   stash_apply.no_stdin = 1;
> > +   if (!run_command(&stash_apply))
> > +   printf("Applied autostash.\n");
> 
> I think you need _() here.

Good catch.

Will fix before sending v2,
Dscho

> 
> > +   else {
> > +   struct argv_array args = ARGV_ARRAY_INIT;
> > +   int res = 0;
> > +
> > +   argv_array_pushl(&args,
> > +"stash", "store", "-m", "autostash", "-q",
> > +autostash.buf, NULL);
> > +   if (run_command_v_opt(args.argv, RUN_GIT_CMD))
> > +   res = error(_("Cannot store %s"), autostash.buf);
> > +   argv_array_clear(&args);
> > +   strbuf_release(&autostash);
> > +   if (res)
> > +   return res;
> > +
> > +   fprintf(stderr,
> > +   _("Applying autostash resulted in conflicts.\n"
> > + "Your changes are safe in the stash.\n"
> > + "You can run \"git stash pop\" or \"git stash drop\" "
> > + "at any time.\n"));
> > +   }
> > +
> > +   strbuf_release(&autostash);
> > +   return 0;
> > +}
> > +
> >  static int finish_rebase(struct rebase_options *opts)
> >  {
> > struct strbuf dir = STRBUF_INIT;
> > const char *argv_gc_auto[] = { "gc", "--auto", NULL };
> >
> > delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
> > -   apply_autostash();
> > +   apply_autostash(opts);
> > close_all_packs(the_repository->objects);
> > /*
> >  * We ignore errors in 'gc --auto', since the
> -- 
> Duy
> 


Re: [PATCH 07/18] builtin rebase: support `keep-empty` option

2018-08-24 Thread Johannes Schindelin
Hi,

On Wed, 8 Aug 2018, Pratik Karki wrote:

> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 42ee040da3..fd9ad8efae 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -588,6 +607,8 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
> &options.allow_rerere_autoupdate,
>N_("allow rerere to update index  with resolved "
>   "conflict")),
> + OPT_BOOL(0, "keep-empty", &options.keep_empty,
> +  N_("preserve empty commits during rebase")),
>   OPT_END(),

This forgets the short option `-k`, I just noticed. I then looked at the
rest of the options, and they seem not to forget the short option
anywhere.

I will fix this before sending v2.

Ciao,
Dscho


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Duy Nguyen
On Fri, Aug 24, 2018 at 05:37:20PM +0200, Duy Nguyen wrote:
> Since we're cutting corners to speed things up, could you try
> something like this?
> 
> I notice that reading v4 is significantly slower than v2 and
> apparently strlen() (at least from glibc) is much cleverer and at
> least gives me a few percentage time saving.
> 
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..d10cccaed0 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1755,8 +1755,7 @@ static unsigned long expand_name_field(struct
> strbuf *name, const char *cp_)
> if (name->len < len)
> die("malformed name field in the index");
> strbuf_remove(name, name->len - len, len);
> -   for (ep = cp; *ep; ep++)
> -   ; /* find the end */
> +   ep = cp + strlen(cp);
> strbuf_add(name, cp, ep - cp);
> return (const char *)ep + 1 - cp_;
>  }

No, try this instead. It's halfway back to v2 numbers for me (tested
with "test-tool read-cache 100" on webkit.git). For the record, v4 is
about 30% slower than v2 in my tests.

We could probably do better too. Instead of preparing the string in a
separate buffer (previous_name_buf), we could just assemble it directly
to the newly allocated "ce".

-- 8< --
diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..237f60a76c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1754,9 +1754,8 @@ static unsigned long expand_name_field(struct strbuf 
*name, const char *cp_)
 
if (name->len < len)
die("malformed name field in the index");
-   strbuf_remove(name, name->len - len, len);
-   for (ep = cp; *ep; ep++)
-   ; /* find the end */
+   strbuf_setlen(name, name->len - len);
+   ep = cp + strlen(cp);
strbuf_add(name, cp, ep - cp);
return (const char *)ep + 1 - cp_;
 }
-- 8< --


Re: [PATCH v1] read-cache: speed up index load through parallelization

2018-08-24 Thread Duy Nguyen
Since we're cutting corners to speed things up, could you try
something like this?

I notice that reading v4 is significantly slower than v2 and
apparently strlen() (at least from glibc) is much cleverer and at
least gives me a few percentage time saving.

diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..d10cccaed0 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1755,8 +1755,7 @@ static unsigned long expand_name_field(struct
strbuf *name, const char *cp_)
if (name->len < len)
die("malformed name field in the index");
strbuf_remove(name, name->len - len, len);
-   for (ep = cp; *ep; ep++)
-   ; /* find the end */
+   ep = cp + strlen(cp);
strbuf_add(name, cp, ep - cp);
return (const char *)ep + 1 - cp_;
 }

On Thu, Aug 23, 2018 at 10:36 PM Ben Peart  wrote:
> > Nice to see this done without a new index extension that records
> > offsets, so that we can load existing index files in parallel.
> >
>
> Yes, I prefer this simpler model as well.  I wasn't sure it would
> produce a significant improvement given the primary thread still has to
> run through the variable length cache entries but was pleasantly surprised.

Out of curiosity, how much time saving could we gain by recording
offsets as an extension (I assume we need, like 4 offsets if the
system has 4 cores)? Much much more than this simpler model (which may
justify the complexity) or just "meh" compared to this?
-- 
Duy


ag/rebase-i-in-c, was Re: What's cooking in git.git (Aug 2018, #05; Mon, 20)

2018-08-24 Thread Alban Gruin
Hi Junio,

On 21/08/2018 at 00:15, Junio C Hamano wrote:
> * ag/rebase-i-in-c (2018-08-10) 20 commits
>  - rebase -i: move rebase--helper modes to rebase--interactive
>  - rebase -i: remove git-rebase--interactive.sh
>  - rebase--interactive2: rewrite the submodes of interactive rebase in C
>  - rebase -i: implement the main part of interactive rebase as a builtin
>  - rebase -i: rewrite init_basic_state() in C
>  - rebase -i: rewrite write_basic_state() in C
>  - rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in C
>  - rebase -i: implement the logic to initialize $revisions in C
>  - rebase -i: remove unused modes and functions
>  - rebase -i: rewrite complete_action() in C
>  - t3404: todo list with commented-out commands only aborts
>  - sequencer: change the way skip_unnecessary_picks() returns its result
>  - sequencer: refactor append_todo_help() to write its message to a buffer
>  - rebase -i: rewrite checkout_onto() in C
>  - rebase -i: rewrite setup_reflog_action() in C
>  - sequencer: add a new function to silence a command, except if it fails
>  - rebase -i: rewrite the edit-todo functionality in C
>  - editor: add a function to launch the sequence editor
>  - rebase -i: rewrite append_todo_help() in C
>  - sequencer: make three functions and an enum from sequencer.c public
> 
>  Rewrite of the remaining "rebase -i" machinery in C.
> 
>  With "rebase -i" machinery being rewritten to C, with a different
>  interface between "rebase" proper and its backends, this and the
>  other topics need a bit more work to play with each other better.

I am preparing to send a new version of this patch series, but it
conflicts a bit with js/range-diff in the Makefile and in git.c.  They
are not really hard to solve, though, a 3-way merge can handle this
fine.  Should I rebase my work onto origin/master, or should I let you
handle this?

Cheers,
Alban



[PATCH v4 2/6] tests: fix and add lint for non-portable seq

2018-08-24 Thread Ævar Arnfjörð Bjarmason
The seq command is not in POSIX, and doesn't exist on
e.g. OpenBSD. We've had the test_seq wrapper since d17cf5f3a3 ("tests:
Introduce test_seq", 2012-08-04), but use of it keeps coming back,
e.g. in the recently added "fetch negotiator" tests being added here.

So let's also add a check to "make test-lint". The regex is aiming to
capture the likes of $(seq ..) and "seq" as a stand-alone command,
without capturing some existing cases where we e.g. have files called
"seq", as \bseq\b would do.
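The pattern happens to be valid Python re syntax as well, so its intent can be sanity-checked quickly:

```python
import re

# The new check-non-portable-shell.pl pattern: flag "$(seq" anywhere,
# or "seq" as the first word of a command line.
pattern = re.compile(r"(?:\$\(seq|^\s*seq\b)")

assert pattern.search("for i in $(seq 7)")            # caught
assert pattern.search("\tseq 1 10 >expect &&")        # caught
assert not pattern.search("for i in $(test_seq 7)")   # wrapper is fine
assert not pattern.search("cat seq >actual")          # a file named "seq" is fine
```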

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/check-non-portable-shell.pl|  1 +
 t/t5552-skipping-fetch-negotiator.sh | 12 ++--
 t/t5703-upload-pack-ref-in-want.sh   |  4 ++--
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index c8f10d40a1..75f38298d7 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -42,6 +42,7 @@ sub err {
/\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)';
/\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)';
/\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)';
+   /(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)';
/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
/^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
	err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
diff --git a/t/t5552-skipping-fetch-negotiator.sh b/t/t5552-skipping-fetch-negotiator.sh
index 5ad5bece55..30857b84a8 100755
--- a/t/t5552-skipping-fetch-negotiator.sh
+++ b/t/t5552-skipping-fetch-negotiator.sh
@@ -46,7 +46,7 @@ test_expect_success 'commits with no parents are sent regardless of skip distanc
test_commit -C server to_fetch &&
 
git init client &&
-   for i in $(seq 7)
+   for i in $(test_seq 7)
do
test_commit -C client c$i
done &&
@@ -89,7 +89,7 @@ test_expect_success 'when two skips collide, favor the larger one' '
test_commit -C server to_fetch &&
 
git init client &&
-   for i in $(seq 11)
+   for i in $(test_seq 11)
do
test_commit -C client c$i
done &&
@@ -168,14 +168,14 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC
test_commit -C server to_fetch &&
 
git init client &&
-   for i in $(seq 8)
+   for i in $(test_seq 8)
do
git -C client checkout --orphan b$i &&
test_commit -C client b$i.c0
done &&
-   for j in $(seq 19)
+   for j in $(test_seq 19)
do
-   for i in $(seq 8)
+   for i in $(test_seq 8)
do
git -C client checkout b$i &&
test_commit -C client b$i.c$j
@@ -205,7 +205,7 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC
 
# fetch-pack should thus not send any more commits in the b1 branch, but
# should still send the others (in this test, just check b2).
-   for i in $(seq 0 8)
+   for i in $(test_seq 0 8)
do
have_not_sent b1.c$i
done &&
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
index a73c55a47e..d1ccc22331 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -176,7 +176,7 @@ test_expect_success 'setup repos for change-while-negotiating test' '
	git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo" "$LOCAL_PRISTINE" &&
cd "$LOCAL_PRISTINE" &&
git checkout -b side &&
-   for i in $(seq 1 33); do test_commit s$i; done &&
+   for i in $(test_seq 1 33); do test_commit s$i; done &&
 
# Add novel commits to upstream
git checkout master &&
@@ -289,7 +289,7 @@ test_expect_success 'setup repos for fetching with ref-in-want tests' '
git clone "file://$REPO" "$LOCAL_PRISTINE" &&
cd "$LOCAL_PRISTINE" &&
git checkout -b side &&
-   for i in $(seq 1 33); do test_commit s$i; done &&
+   for i in $(test_seq 1 33); do test_commit s$i; done &&
 
# Add novel commits to upstream
git checkout master &&
-- 
2.18.0.865.gffc8e1a3cd6
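The `test_seq` helper that the patch above swaps in for `seq` can be approximated with nothing but POSIX shell constructs. The sketch below is a hypothetical simplified stand-in, not git's actual helper (which lives in t/test-lib-functions.sh):

```shell
# my_seq: portable stand-in for seq(1) using only POSIX shell constructs.
# With one argument it counts from 1; with two, from first to last.
my_seq () {
	case $# in
	1) set -- 1 "$1" ;;
	2) ;;
	*) echo "usage: my_seq [first] last" >&2; return 1 ;;
	esac
	i=$1
	while test "$i" -le "$2"
	do
		echo "$i"
		i=$(($i + 1))
	done
}

my_seq 3     # prints 1, 2, 3 on separate lines
my_seq 5 7   # prints 5, 6, 7
```

Because it is a shell function rather than an external command, it behaves identically on platforms (like OpenBSD and AIX) whose base systems lack `seq`.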



[PATCH v4 5/6] tests: fix version-specific portability issue in Perl JSON

2018-08-24 Thread Ævar Arnfjörð Bjarmason
The test guarded by PERLJSON added in 75459410ed ("json_writer: new
routines to create JSON data", 2018-07-13) assumed that a JSON boolean
value like "true" or "false" would be represented as "1" or "0" in
Perl.

This behavior can't be relied upon, e.g. with JSON.pm 2.50 and
JSON::PP, a JSON::PP::Boolean object will be represented as "true" or
"false". To work around this let's check if we have any refs left
after we check for hashes and arrays, assume those are JSON objects,
and coerce them to a known boolean value.

The behavior of this test still looks odd to me. Why implement our own
ad-hoc encoder just for some one-off test, as opposed to say Perl's
own Data::Dumper with Sortkeys et al? But with this change it works,
so let's leave it be.
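The coercion the patch relies on can be demonstrated outside the test suite. Assuming a perl with the core JSON::PP module is available, a decoded JSON boolean comes back as a blessed reference, and `$value ? 1 : 0` forces it to a stable representation regardless of how the object stringifies:

```shell
# Decode a JSON boolean and show that it is a blessed ref, then coerce it
# to a plain 0/1 the way the patched parse_json.perl does.
perl -MJSON::PP -e '
	my $v = decode_json("[true]")->[0];
	print "ref: ", ref($v), "\n";          # a boolean object, not a plain scalar
	print "coerced: ", ($v ? 1 : 0), "\n"; # stable across JSON module versions
'
```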

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t0019/parse_json.perl | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/t0019/parse_json.perl b/t/t0019/parse_json.perl
index ca4e5bfa78..fea87fb81b 100644
--- a/t/t0019/parse_json.perl
+++ b/t/t0019/parse_json.perl
@@ -34,6 +34,9 @@ sub dump_item {
 } elsif (ref($value) eq 'HASH') {
print "$label_in hash\n";
dump_hash($label_in, $value);
+} elsif (ref $value) {
+   my $bool = $value ? 1 : 0;
+   print "$label_in $bool\n";
 } elsif (defined $value) {
print "$label_in $value\n";
 } else {
-- 
2.18.0.865.gffc8e1a3cd6



[PATCH v4 0/6] OpenBSD & AIX etc. portability fixes

2018-08-24 Thread Ævar Arnfjörð Bjarmason
On Thu, Aug 23 2018, Eric Sunshine wrote:

> On Thu, Aug 23, 2018 at 4:36 PM Ævar Arnfjörð Bjarmason
>  wrote:
>> As noted in [1] there's still a remaining recently introduced
>> portability issue also introduced in 878f988350 ("t/test-lib: teach
>> --chain-lint to detect broken &&-chains in subshells", 2018-07-11), so
>> under AIX the tests must be run with GIT_TEST_CHAIN_LINT=0.
>>
>> I don't know how to solve the other issue, and this gets us some of
>> the way to GIT_TEST_CHAIN_LINT=1 working again on AIX.
>
> Does unindenting the comment, as I suggested in [1], fix the remaining
> problem for you?
>
> [1]: https://public-inbox.org/git/CAPig+cTTbU5HFMKgNyrxTp3+kcK46-Fn=4zh6zdt1oqchac...@mail.gmail.com/

I didn't notice Eric's E-mail before I sent my v3, so going back and
testing this revealed two things:

 1) Yes, his suggestion works
 2) AIX sed will complain about one thing at a time, so we had a lot
more of these "labels too long" problems once I got past fixing
that issue.

So here's a version, which as noted in 4/6 makes GIT_TEST_CHAIN_LINT=1
fully work on AIX again.

As an aside, the reason I have access to AIX is because I requested
access to the GCC compile farm as suggested by someone on-list here
the other day: https://cfarm.tetaneutral.net/

They accepted my account request on the basis that I was going to hack
on git & perl on those boxes, so if anyone else here is interested in
testing stuff for portability...

Ævar Arnfjörð Bjarmason (6):
  tests: fix and add lint for non-portable head -c N
  tests: fix and add lint for non-portable seq
  tests: fix comment syntax in chainlint.sed for AIX sed
  tests: use shorter here-docs in chainlint.sed for AIX sed
  tests: fix version-specific portability issue in Perl JSON
  tests: fix and add lint for non-portable grep --file

 t/chainlint.sed  | 59 ++--
 t/check-non-portable-shell.pl|  3 ++
 t/t0019/parse_json.perl  |  3 ++
 t/t5310-pack-bitmaps.sh  |  2 +-
 t/t5318-commit-graph.sh  |  2 +-
 t/t5552-skipping-fetch-negotiator.sh | 12 +++---
 t/t5703-upload-pack-ref-in-want.sh   |  4 +-
 t/test-lib.sh|  4 +-
 8 files changed, 47 insertions(+), 42 deletions(-)

-- 
2.18.0.865.gffc8e1a3cd6



[PATCH v4 1/6] tests: fix and add lint for non-portable head -c N

2018-08-24 Thread Ævar Arnfjörð Bjarmason
The "head -c BYTES" option is non-portable (not in POSIX[1]). Change
such invocations to use the test_copy_bytes wrapper added in
48860819e8 ("t9300: factor out portable "head -c" replacement",
2016-06-30).

This fixes a test added in 9d2e330b17 ("ewah_read_mmap: bounds-check
mmap reads", 2018-06-14), which has been breaking
t5310-pack-bitmaps.sh on OpenBSD since 2.18.0. The OpenBSD ports
already have a similar workaround after their upgrade to 2.18.0[2].

I have not tested this on IRIX, but according to 4de0bbd898 ("t9300:
use perl "head -c" clone in place of "dd bs=1 count=16000" kluge",
2010-12-13) this invocation would have broken things there too.

Also, change a valgrind-specific codepath in test-lib.sh to use this
wrapper. Given where valgrind runs I don't think this would ever
become a portability issue in practice, but it's easier to just use
the wrapper than introduce some exception for the "make test-lint"
check being added here.

1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/head.html
2. https://github.com/openbsd/ports/commit/08d5d82eaefe5cf2f125ecc0c6a57df9cf91350c#diff-f7d3c4fabeed1691620d608f1534f5e5
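For illustration, one portable way to copy an exact number of bytes is POSIX `dd` with a one-byte block size. This is a hypothetical stand-in, slow for large counts, and not necessarily how git's real `test_copy_bytes` helper works:

```shell
# copy_bytes N: copy exactly N bytes from stdin to stdout.
# bs=1 makes each read/write one byte, so "count" is an exact byte count.
copy_bytes () {
	dd bs=1 count="$1" 2>/dev/null
}

printf 'hello world' | copy_bytes 5   # prints "hello"
```

The point of the lint rule is the same either way: route all byte-exact reads through one wrapper so a single portable implementation can be maintained.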

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/check-non-portable-shell.pl | 1 +
 t/t5310-pack-bitmaps.sh   | 2 +-
 t/test-lib.sh | 4 ++--
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index d5823f71d8..c8f10d40a1 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -41,6 +41,7 @@ sub err {
/^\s*[^#]\s*which\s/ and err 'which is not portable (use type)';
/\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)';
/\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)';
+   /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES <file >out)';
/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
/^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
	err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 557bd0d0c0..7bff7923f2 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -335,7 +335,7 @@ test_expect_success 'truncated bitmap fails gracefully' '
git rev-list --use-bitmap-index --count --all >expect &&
bitmap=$(ls .git/objects/pack/*.bitmap) &&
test_when_finished "rm -f $bitmap" &&
-   head -c 512 <$bitmap >$bitmap.tmp &&
+   test_copy_bytes 512 <$bitmap >$bitmap.tmp &&
mv -f $bitmap.tmp $bitmap &&
git rev-list --use-bitmap-index --count --all >actual 2>stderr &&
test_cmp expect actual &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 8bb0f4348e..44288cbb59 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -867,7 +867,7 @@ then
# handle only executables, unless they are shell libraries that
# need to be in the exec-path.
test -x "$1" ||
-   test "# " = "$(head -c 2 <"$1")" ||
+   test "# " = "$(test_copy_bytes 2 <"$1")" ||
return;
 
base=$(basename "$1")
@@ -882,7 +882,7 @@ then
# do not override scripts
if test -x "$symlink_target" &&
test ! -d "$symlink_target" &&
-   test "#!" != "$(head -c 2 < "$symlink_target")"
+   test "#!" != "$(test_copy_bytes 2 <"$symlink_target")"
then
symlink_target=../valgrind.sh
fi
-- 
2.18.0.865.gffc8e1a3cd6



[PATCH v4 4/6] tests: use shorter here-docs in chainlint.sed for AIX sed

2018-08-24 Thread Ævar Arnfjörð Bjarmason
Improve the portability of chainlint by using shorter here-docs. On
AIX sed will complain about:

    sed: 0602-417 The label :hereslurp is greater than eight characters

This, in combination with the previous fix to this file, makes
GIT_TEST_CHAIN_LINT=1 (which is the default) work again on AIX
without issues, and the "gmake check-chainlint" test also passes.
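The eight-character limit is the minimum that POSIX requires a sed implementation to support for labels; GNU sed accepts longer names, which is why the problem only surfaces on platforms like AIX. A label within the limit works everywhere, e.g. this classic line-joining loop:

```shell
# ":loop" is well under eight characters, so even a strictly POSIX sed
# accepts it. The script slurps all input lines and joins them with commas.
printf 'a\nb\nc\n' | sed -e ':loop' -e 'N' -e '$!bloop' -e 's/\n/,/g'
# prints "a,b,c"
```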

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/chainlint.sed | 56 -
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/t/chainlint.sed b/t/chainlint.sed
index dcb4b333ed..c80d2fad7a 100644
--- a/t/chainlint.sed
+++ b/t/chainlint.sed
@@ -97,11 +97,11 @@
 /<<[   ]*[-\\']*[A-Za-z0-9_]/ {
s/^\(.*\)<<[]*[-\\']*\([A-Za-z0-9_][A-Za-z0-9_]*\)'*/<\2>\1<]*\)>.*\n[  ]*\1[   ]*$/!{
s/\n.*$//
-   bhereslurp
+   bhered
}
s/^<[^>]*>//
s/\n.*$//
@@ -149,7 +149,7 @@ s/.*\n//
 
 :slurp
 # incomplete line "...\"
-/\\$/bincomplete
+/\\$/bicmplte
 # multi-line quoted string "...\n..."?
 /"/bdqstring
 # multi-line quoted string '...\n...'? (but not contraction in string "it's")
@@ -171,7 +171,7 @@ s/.*\n//
/"[^"]*#[^"]*"/!s/[ ]#.*$//
 }
 # one-liner "case ... esac"
-/^[]*case[ ]*..*esac/bcheckchain
+/^[]*case[ ]*..*esac/bchkchn
 # multi-line "case ... esac"
 /^[]*case[ ]..*[   ]in/bcase
 # multi-line "for ... done" or "while ... done"
@@ -200,32 +200,32 @@ s/.*\n//
 /^[]*fi[   ]*[<>|]/bdone
 /^[]*fi[   ]*)/bdone
 # nested one-liner "(...) &&"
-/^[]*(.*)[ ]*&&[   ]*$/bcheckchain
+/^[]*(.*)[ ]*&&[   ]*$/bchkchn
 # nested one-liner "(...)"
-/^[]*(.*)[ ]*$/bcheckchain
+/^[]*(.*)[ ]*$/bchkchn
 # nested one-liner "(...) >x" (or "2>x" or "|]/bcheckchain
+/^[]*(.*)[ ]*[0-9]*[<>|]/bchkchn
 # nested multi-line "(...\n...)"
 /^[]*(/bnest
 # multi-line "{...\n...}"
 /^[]*{/bblock
 # closing ")" on own line -- exit subshell
-/^[]*)/bclosesolo
+/^[]*)/bclssolo
 # "$((...))" -- arithmetic expansion; not closing ")"
-/\$(([^)][^)]*))[^)]*$/bcheckchain
+/\$(([^)][^)]*))[^)]*$/bchkchn
 # "$(...)" -- command substitution; not closing ")"
-/\$([^)][^)]*)[^)]*$/bcheckchain
+/\$([^)][^)]*)[^)]*$/bchkchn
 # multi-line "$(...\n...)" -- command substitution; treat as nested subshell
 /\$([^)]*$/bnest
 # "=(...)" -- Bash array assignment; not closing ")"
-/=(/bcheckchain
+/=(/bchkchn
 # closing "...) &&"
 /)[]*&&[   ]*$/bclose
 # closing "...)"
 /)[]*$/bclose
 # closing "...) >x" (or "2>x" or "|]/bclose
-:checkchain
+:chkchn
 # mark suspect if line uses ";" internally rather than "&&" (but not ";" in a
 # string and not ";;" in one-liner "case...esac")
 /;/{
@@ -244,7 +244,7 @@ n
 bslurp
 
 # found incomplete line "...\" -- slurp up next line
-:incomplete
+:icmplte
 N
 s/\\\n//
 bslurp
@@ -282,11 +282,11 @@ bfolded
 :heredoc
 s/^\(.*\)<<[   ]*[-\\']*\([A-Za-z0-9_][A-Za-z0-9_]*\)'*/<\2>\1<]*\)>.*\n[ ]*\1[   ]*$/!{
s/\n.*$//
-   bhereslurpsub
+   bheredsub
 }
 s/^<[^>]*>//
 s/\n.*$//
@@ -316,43 +316,43 @@ x
 # is 'done' or 'fi' cuddled with ")" to close subshell?
 /done.*)/bclose
 /fi.*)/bclose
-bcheckchain
+bchkchn
 
 # found nested multi-line "(...\n...)" -- pass through untouched
 :nest
 x
-:nestslurp
+:nstslurp
 n
 # closing ")" on own line -- stop nested slurp
-/^[]*)/bnestclose
+/^[]*)/bnstclose
 # comment -- not closing ")" if in comment
-/^[]*#/bnestcontinue
+/^[]*#/bnstcnt
 # "$((...))" -- arithmetic expansion; not closing ")"
-/\$(([^)][^)]*))[^)]*$/bnestcontinue
+/\$(([^)][^)]*))[^)]*$/bnstcnt
 # "$(...)" -- command substitution; not closing ")"
-/\$([^)][^)]*)[^)]*$/bnestcontinue
+/\$([^)][^)]*)[^)]*$/bnstcnt
 # closing "...)" -- stop nested slurp
-/)/bnestclose
-:nestcontinue
+/)/bnstclose
+:nstcnt
 x
-bnestslurp
-:nestclose
+bnstslurp
+:nstclose
 s/^/>>/
 # is it "))" which closes nested and parent subshells?
 /)[]*)/bslurp
-bcheckchain
+bchkchn
 
 # found multi-line "{...\n...}" block -- pass through untouched
 :block
 x
 n
 # closing "}" -- stop block slurp
-/}/bcheckchain
+/}/bchkchn
 bblock
 
 # found closing ")" on own line -- drop "suspect" from final line of subshell
 # since that line legitimately lacks "&&" and exit subshell loop
-:closesolo
+:clssolo
 x
 s/?!AMP?!//
 p
-- 
2.18.0.865.gffc8e1a3cd6



[PATCH v4 6/6] tests: fix and add lint for non-portable grep --file

2018-08-24 Thread Ævar Arnfjörð Bjarmason
The --file option to grep isn't in POSIX[1], but -f is[1]. Let's check
for that in the future, and fix the portability regression in
f237c8b6fe ("commit-graph: implement git-commit-graph write",
2018-04-02) that broke e.g. AIX.

1. http://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html
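The difference is easy to check against any POSIX grep: `-f FILE` reads patterns from a file, while the `--file=FILE` spelling is a GNU long-option extension (file names below are just for the example):

```shell
# Build a pattern file, then filter input with the portable -f option.
printf 'foo\nbar\n' >patterns.txt
printf 'foo\nbaz\nbar\n' | grep -f patterns.txt
# prints "foo" and "bar"
rm -f patterns.txt
```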

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/check-non-portable-shell.pl | 1 +
 t/t5318-commit-graph.sh   | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index 75f38298d7..b45bdac688 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -43,6 +43,7 @@ sub err {
/\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)';
/\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES <file >out)';
/(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)';
+   /\bgrep\b.*--file\b/ and err 'grep --file FILE is not portable (use grep -f FILE)';
/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
/^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
	err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 3c1ffad491..0c500f7ca2 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -134,7 +134,7 @@ test_expect_success 'Add one more commit' '
git branch commits/8 &&
ls $objdir/pack | grep idx >existing-idx &&
git repack &&
-   ls $objdir/pack| grep idx | grep -v --file=existing-idx >new-idx
+   ls $objdir/pack| grep idx | grep -v -f existing-idx >new-idx
 '
 
 # Current graph structure:
-- 
2.18.0.865.gffc8e1a3cd6



[PATCH v4 3/6] tests: fix comment syntax in chainlint.sed for AIX sed

2018-08-24 Thread Ævar Arnfjörð Bjarmason
Change a comment in chainlint.sed to appease AIX sed, which would
previously print this error:

sed:# stash for later printing is not a recognized function

1. https://public-inbox.org/git/CAPig+cTTbU5HFMKgNyrxTp3+kcK46-Fn=4zh6zdt1oqchac...@mail.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/chainlint.sed | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/t/chainlint.sed b/t/chainlint.sed
index 8544df38df..dcb4b333ed 100644
--- a/t/chainlint.sed
+++ b/t/chainlint.sed
@@ -131,9 +131,8 @@ b
 b
 
 :subshell
-# bare "(" line?
+# bare "(" line? -- stash for later printing
 /^[]*([]*$/ {
-   # stash for later printing
h
bnextline
 }
-- 
2.18.0.865.gffc8e1a3cd6



Re: clone, hardlinks, and file modes (and CAP_FOWNER)

2018-08-24 Thread Ævar Arnfjörð Bjarmason


On Fri, Aug 24 2018, Andreas Krey wrote:

> I'm currently looking into more aggressively sharing space between multiple 
> repositories,
> and into getting them to share again after one did a repack (which costs us 
> 15G space).
>
> One thing I stumbled on is the /proc/sys/fs/protected_hardlinks stuff which 
> disallows
> hardlinking pack files belonging to someone else. This consequently inhibits 
> sharing
> when first cloning from a common shared cache repo.
>
> Installing git with CAP_FOWNER is probably too dangerous;
> at least the capability should only be enabled during the directory copying.
>
> *
>
> And the next thing is that copied object/pack files are created with mode 
> rw-rw-r--,
> unlike those that come out of the regular transports.
>
> Apparent patch:
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index fd2c3ef090..6ffb4db4da 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -448,7 +448,7 @@ static void copy_or_link_directory(struct strbuf *src, 
> struct strbuf *dest,
> die_errno(_("failed to create link '%s'"), 
> dest->buf);
> option_no_hardlinks = 1;
> }
> -   if (copy_file_with_time(dest->buf, src->buf, 0666))
> +   if (copy_file_with_time(dest->buf, src->buf, 0444))
> die_errno(_("failed to copy file to '%s'"), 
> dest->buf);
> }
> closedir(dir);
>
> Alas, copy_file takes the mode just as a crude hint to executability, so also:
>
> diff --git a/copy.c b/copy.c
> index 4de6a110f0..883060009c 100644
> --- a/copy.c
> +++ b/copy.c
> @@ -32,7 +32,7 @@ int copy_file(const char *dst, const char *src, int mode)
>  {
> int fdi, fdo, status;
>
> -   mode = (mode & 0111) ? 0777 : 0666;
> +   mode = (mode & 0111) ? 0777 : (mode & 0222) ? 0666 : 0444;
> if ((fdi = open(src, O_RDONLY)) < 0)
> return fdi;
> if ((fdo = open(dst, O_WRONLY | O_CREAT | O_EXCL, mode)) < 0) {
>
> (copy_file is also used with 0644 instead of the usual 0666 in 
> refs/files-backend.c)
>
> Will submit as patch if acceptable; I'm not sure what the mode casing will
> do with other users.
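The proposed three-way mapping can be mimicked in shell arithmetic to see what it yields for common modes (illustration only; the actual change is the one-line edit to copy.c quoted above, and the function name here is made up):

```shell
# map_mode MODE: mirror the proposed copy.c logic -- any execute bit gives
# 0777, otherwise any write bit gives 0666, otherwise read-only 0444.
map_mode () {
	if [ $(( $1 & 0111 )) -ne 0 ]; then
		echo 0777
	elif [ $(( $1 & 0222 )) -ne 0 ]; then
		echo 0666
	else
		echo 0444
	fi
}

map_mode 0755   # prints 0777
map_mode 0644   # prints 0666
map_mode 0444   # prints 0444
```

The final open() still applies the process umask on top of these, which is why the wide 0777/0666/0444 values are safe to pass.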

This is mostly unrelated to your suggestion, but you might be interested
in this thread I started a while ago of doing this with an approach
unrelated to hardlinks, although you'll need a FS that does block
de-duplication (and it won't work at all currently, needs some
patching):
https://public-inbox.org/git/87bmhiykvw@evledraar.gmail.com/

I don't understand how this hardlink approach would work (doesn't mean
it won't, just that I don't get it).

Are you meaning to clone without --reference and instead via file:// and
rely on FS-local hardlinks, but then how will that work once one of the
repos does a full repack? Are you going to inhibit that in some way,
e.g. with gc.bigPackThreshold (but then why doesn't that work already?).

If you have such a tightly coupled approach, isn't --reference close
to what you want in that case?


Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff

2018-08-24 Thread Duy Nguyen
Jeff, you're doing crazy things beyond my (admittedly very limited)
imagination :P I did not see this at all when I implemented the
worktree stuff.

On Thu, Aug 23, 2018 at 8:21 PM Eric Sunshine  wrote:
> > > In this case, it might make sense for "git worktree add" to refuse to
> > > operate if an existing worktree entry still points at the directory
> > > that you're trying to add. That should prevent those duplicate
> > > worktree entries you saw.
> >
> > Yes, but then what's the next step for my script? I can't "remove" since
> > the worktree isn't there. I can't blow away any directory that I know
> > about, since there isn't one.
>
> I was thinking that "worktree add" could start respecting the --force
> option as an escape hatch.
>
> > I need to somehow know that an existing
> > "$GIT_DIR/worktrees/foo" is the problem. But "foo" is not even
> > deterministic. Looking at the duplicates, it seems to be the basename of
> > the working tree, but then mutated to avoid collisions with other
> > worktrees.
>
> If the worktree directory still existed, "git -C rev-parse --git-dir"
> inside the worktree would give you the proper path of
> $GIT_DIR/worktrees/foo, but the directory doesn't exist, so...
> nothing.
>
> > What about refusing by default, but forcing an overwrite with "-f"?
>
> My thought, also.

Sounds good. Eric are you going to implement this? Just checking so
that I can (hopefully) cross this off my backlog ;-)
-- 
Duy


Re: [PATCH/RFC] commit: new option to abort -a if something is already staged

2018-08-24 Thread Duy Nguyen
On Fri, Aug 24, 2018 at 5:02 AM Jacob Keller  wrote:
>
> On Thu, Aug 23, 2018 at 9:28 AM Junio C Hamano  wrote:
> > I think the above example forgets "-a" on the final "git commit"
> > step.  With it added, I can understand the concern (and I am sure
> > you would, too).
> >
> > The user is trying to add everything done in the working tree, and
> > "commit -a" would catch all changes to paths that were already
> > tracked, but a separate "add" is necessary for newly created paths.
> > But adding a new path means the index no longer matches HEAD, and
> > the "commit -a" at the final step sweeps the changes to already
> > tracked paths---failing that because there "already is something
> > staged" will break the workflow.
>
> Right. I think this would need to be able to understand the case of
> "different only by new files".

OK, so the rule I'm going to try to implement is: if the version in
the index is not the same as the one in HEAD or the one in the
worktree, then "git commit -a" fails.

The unwritten line is, if the path in question does not exist in HEAD,
then the first condition "as one in HEAD" is dropped, naturally. This
addresses the "git add new-file" problem, but if you have made more
changes in new-file in worktree after "git add" and do "git commit -a"
then you still get rejected because it fails the second condition.

File removal should be considered as well. But I don't foresee any
problem there. Resolving merges, replacing higher stage entries with
stage 0 will be accepted at "git commit -a" as usual.

Let me know if we should tweak these rules (and how).
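The rule sketched above reduces to a three-way comparison per path. Here it is as a toy shell function (all names and the content stand-ins are hypothetical, not git code; an empty HEAD argument means the path does not exist in HEAD):

```shell
# allow_commit_a HEAD INDEX WORKTREE: apply the proposed rule using
# content stand-ins for the three copies of a path.
allow_commit_a () {
	head=$1 index=$2 worktree=$3
	if [ -n "$head" ] && [ "$index" = "$head" ]; then
		echo allow    # nothing staged beyond HEAD
	elif [ "$index" = "$worktree" ]; then
		echo allow    # e.g. a freshly "git add"-ed new file
	else
		echo reject   # staged content matches neither side
	fi
}

allow_commit_a A A B    # prints allow
allow_commit_a "" B B   # prints allow (new file, index == worktree)
allow_commit_a "" B C   # prints reject (staged, then edited again)
```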
-- 
Duy


Re: [PATCH v4 0/9] Make submodules work if .gitmodules is not checked out

2018-08-24 Thread Ævar Arnfjörð Bjarmason


On Fri, Aug 24 2018, Antonio Ospite wrote:

> this series teaches git to try and read the .gitmodules file from the
> index (:.gitmodules) and the current branch (HEAD:.gitmodules) when it
> is not readily available in the working tree.

FWIW I didn't read any of the earlier series, and I'm not that
familiar with the submodule code, but having skimmed all of this (not
live tested at all) nothing jumped out at me as an issue; I just had
some minor style / potential memory leak nits on 1/9.


Re: [PATCH v4 1/9] submodule: add a print_config_from_gitmodules() helper

2018-08-24 Thread Ævar Arnfjörð Bjarmason


On Fri, Aug 24 2018, Antonio Ospite wrote:

> Add a new print_config_from_gitmodules() helper function to print values
> from .gitmodules just like "git config -f .gitmodules" would.
>
> This will be used by a new submodule--helper subcommand to be able to
> access the .gitmodules file in a more controlled way.
>
> Signed-off-by: Antonio Ospite 
> ---
>  submodule-config.c | 25 +
>  submodule-config.h |  2 ++
>  2 files changed, 27 insertions(+)
>
> diff --git a/submodule-config.c b/submodule-config.c
> index fc2c41b947..eef96c4198 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -682,6 +682,31 @@ void submodule_free(struct repository *r)
>   submodule_cache_clear(r->submodule_cache);
>  }
>
> +static int config_print_callback(const char *key_, const char *value_, void *cb_data)
> +{
> + char *key = cb_data;
> +
> + if (!strcmp(key, key_))
> + printf("%s\n", value_);
> +
> + return 0;
> +}

No problem with the code itself, but I'd find this a lot easier to read
in context if it was:

key_ -> var
value_ -> value
key -> wanted_key, perhaps?

I.e. the rest of the file uses the convention of the
config_from_gitmodules callbacks getting "var" and "value",
respectively.

We don't use this convention of suffixing variables with "_" if they're
arguments to the function anywhere else.

> +int print_config_from_gitmodules(const char *key)
> +{
> + int ret;
> + char *store_key;
> +
> +	ret = git_config_parse_key(key, &store_key, NULL);
> + if (ret < 0)
> + return CONFIG_INVALID_KEY;

Isn't this a memory leak? I.e. we should free() and return here, no?

> +	config_from_gitmodules(config_print_callback, the_repository, store_key);
> +
> + free(store_key);
> + return 0;
> +}
> +
>  struct fetch_config {
>   int *max_children;
>   int *recurse_submodules;
> diff --git a/submodule-config.h b/submodule-config.h
> index dc7278eea4..ed40e9a478 100644
> --- a/submodule-config.h
> +++ b/submodule-config.h
> @@ -56,6 +56,8 @@ void submodule_free(struct repository *r);
>   */
>  int check_submodule_name(const char *name);
>
> +int print_config_from_gitmodules(const char *key);
> +
>  /*
>   * Note: these helper functions exist solely to maintain backward
>   * compatibility with 'fetch' and 'update_clone' storing configuration in

Another style nit: Makes more sense to put this new function right
underneath "submodule_free" above, instead of under 1/2 of the functions
that have a big comment describing how they work, since that comment
doesn't apply to this new function.


Thanks

2018-08-24 Thread Michelle Goodman
Thanks

Hello! I sent you this letter 2 days ago, but I am not sure
it reached you, so I am repeating it again. Please
get back to me for more information.
Regards,
michelle


Re: [PATCH v3 5/5] tests: fix and add lint for non-portable grep --file

2018-08-24 Thread Derrick Stolee

On 8/23/2018 4:44 PM, Junio C Hamano wrote:

Ævar Arnfjörð Bjarmason   writes:


The --file option to grep isn't in POSIX[1], but -f is[1]. Let's check
for that in the future, and fix the portability regression in
f237c8b6fe ("commit-graph: implement git-commit-graph write",
2018-04-02) that broke e.g. AIX.

1. http://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Thanks.


Yes, thanks! I'll keep this in mind for the future.


  t/check-non-portable-shell.pl | 1 +
  t/t5318-commit-graph.sh   | 2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index 75f38298d7..b45bdac688 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -43,6 +43,7 @@ sub err {
/\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)';
/\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES <file >out)';
/(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)';
+   /\bgrep\b.*--file\b/ and err 'grep --file FILE is not portable (use grep -f FILE)';
/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
/^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
	err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 3c1ffad491..0c500f7ca2 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -134,7 +134,7 @@ test_expect_success 'Add one more commit' '
git branch commits/8 &&
ls $objdir/pack | grep idx >existing-idx &&
git repack &&
-   ls $objdir/pack| grep idx | grep -v --file=existing-idx >new-idx
+   ls $objdir/pack| grep idx | grep -v -f existing-idx >new-idx
  '
  
  # Current graph structure:


[PATCH v4 1/9] submodule: add a print_config_from_gitmodules() helper

2018-08-24 Thread Antonio Ospite
Add a new print_config_from_gitmodules() helper function to print values
from .gitmodules just like "git config -f .gitmodules" would.

This will be used by a new submodule--helper subcommand to be able to
access the .gitmodules file in a more controlled way.

Signed-off-by: Antonio Ospite 
---
 submodule-config.c | 25 +
 submodule-config.h |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/submodule-config.c b/submodule-config.c
index fc2c41b947..eef96c4198 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -682,6 +682,31 @@ void submodule_free(struct repository *r)
submodule_cache_clear(r->submodule_cache);
 }
 
+static int config_print_callback(const char *key_, const char *value_, void *cb_data)
+{
+   char *key = cb_data;
+
+   if (!strcmp(key, key_))
+   printf("%s\n", value_);
+
+   return 0;
+}
+
+int print_config_from_gitmodules(const char *key)
+{
+   int ret;
+   char *store_key;
+
+	ret = git_config_parse_key(key, &store_key, NULL);
+   if (ret < 0)
+   return CONFIG_INVALID_KEY;
+
+	config_from_gitmodules(config_print_callback, the_repository, store_key);
+
+   free(store_key);
+   return 0;
+}
+
 struct fetch_config {
int *max_children;
int *recurse_submodules;
diff --git a/submodule-config.h b/submodule-config.h
index dc7278eea4..ed40e9a478 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -56,6 +56,8 @@ void submodule_free(struct repository *r);
  */
 int check_submodule_name(const char *name);
 
+int print_config_from_gitmodules(const char *key);
+
 /*
  * Note: these helper functions exist solely to maintain backward
  * compatibility with 'fetch' and 'update_clone' storing configuration in
-- 
2.18.0



[PATCH v4 7/9] t7506: clean up .gitmodules properly before setting up new scenario

2018-08-24 Thread Antonio Ospite
In t/t7506-status-submodule.sh at some point a new scenario is set up to
test different things, in particular new submodules are added which are
meant to completely replace the previous ones.

However before calling the "git submodule add" commands for the new
layout, the .gitmodules file is removed only from the working tree still
leaving the previous content in current branch.

This can break if, in the future, "git submodule add" starts
differentiating between the following two cases:

  - .gitmodules is not in the working tree but it is in the current
branch (it may not be safe to add new submodules in this case);

  - .gitmodules is neither in the working tree nor anywhere in the
current branch (it is safe to add new submodules).

Since the test intends to get rid of .gitmodules anyways, let's
completely remove it from the current branch, to actually start afresh
in the new scenario.

This is more future-proof and does not break current tests.

Signed-off-by: Antonio Ospite 
---
 t/t7506-status-submodule.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 943708fb04..08629a6e70 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -325,7 +325,8 @@ test_expect_success 'setup superproject with untracked file in nested submodule'
(
cd super &&
git clean -dfx &&
-   rm .gitmodules &&
+   git rm .gitmodules &&
+   git commit -m "remove .gitmodules" &&
git submodule add -f ./sub1 &&
git submodule add -f ./sub2 &&
git submodule add -f ./sub1 sub3 &&
-- 
2.18.0



[PATCH v4 3/9] t7411: merge tests 5 and 6

2018-08-24 Thread Antonio Ospite
Tests 5 and 6 check for the effects of the same commit; merge the two
tests to make it more straightforward to clean things up after the
test has finished.

The cleanup will be added in a future commit.

Signed-off-by: Antonio Ospite 
---
 t/t7411-submodule-config.sh | 18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index 0bde5850ac..f2cd1f4a2c 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -82,29 +82,21 @@ Submodule name: 'a' for path 'b'
 Submodule name: 'submodule' for path 'submodule'
 EOF
 
-test_expect_success 'error in one submodule config lets continue' '
+test_expect_success 'error in history of one submodule config lets continue, stderr message contains blob ref' '
(cd super &&
cp .gitmodules .gitmodules.bak &&
echo "  value = \"" >>.gitmodules &&
git add .gitmodules &&
mv .gitmodules.bak .gitmodules &&
git commit -m "add error" &&
-   test-tool submodule-config \
-   HEAD b \
-   HEAD submodule \
-   >actual &&
-   test_cmp expect_error actual
-   )
-'
-
-test_expect_success 'error message contains blob reference' '
-   (cd super &&
sha1=$(git rev-parse HEAD) &&
test-tool submodule-config \
HEAD b \
HEAD submodule \
-   2>actual_err &&
-   test_i18ngrep "submodule-blob $sha1:.gitmodules" actual_err >/dev/null
+   >actual \
+   2>actual_stderr &&
+   test_cmp expect_error actual &&
+   test_i18ngrep "submodule-blob $sha1:.gitmodules" actual_stderr >/dev/null
)
 '
 
-- 
2.18.0



[PATCH v4 5/9] submodule--helper: add a new 'config' subcommand

2018-08-24 Thread Antonio Ospite
Add a new 'config' subcommand to 'submodule--helper'; this extra level
of indirection makes it possible to add some flexibility to how the
submodules configuration is handled.

Signed-off-by: Antonio Ospite 
---
 builtin/submodule--helper.c | 14 ++
 t/t7411-submodule-config.sh | 27 +++
 2 files changed, 41 insertions(+)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 2bcc70fdfe..a461a1107c 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2028,6 +2028,19 @@ static int connect_gitdir_workingtree(int argc, const char **argv, const char *p
return 0;
 }
 
+static int module_config(int argc, const char **argv, const char *prefix)
+{
+   /* Equivalent to ACTION_GET in builtin/config.c */
+   if (argc == 2)
+   return print_config_from_gitmodules(argv[1]);
+
+   /* Equivalent to ACTION_SET in builtin/config.c */
+   if (argc == 3)
+   return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
+
+   die("submodule--helper config takes 1 or 2 arguments: name [value]");
+}
+
 #define SUPPORT_SUPER_PREFIX (1<<0)
 
 struct cmd_struct {
@@ -2056,6 +2069,7 @@ static struct cmd_struct commands[] = {
{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
{"is-active", is_active, 0},
{"check-name", check_name, 0},
+   {"config", module_config, 0},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index b1f3c6489b..791245f18d 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -134,4 +134,31 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
)
 '
 
+test_expect_success 'reading submodules config with "submodule--helper config"' '
+   (cd super &&
+   echo "../submodule" >expect &&
+   git submodule--helper config submodule.submodule.url >actual &&
+   test_cmp expect actual
+   )
+'
+
+test_expect_success 'writing submodules config with "submodule--helper config"' '
+   (cd super &&
+   echo "new_url" >expect &&
+   git submodule--helper config submodule.submodule.url "new_url" &&
+   git submodule--helper config submodule.submodule.url >actual &&
+   test_cmp expect actual
+   )
+'
+
+test_expect_success 'overwriting unstaged submodules config with "submodule--helper config"' '
+   test_when_finished "git -C super checkout .gitmodules" &&
+   (cd super &&
+   echo "newer_url" >expect &&
+   git submodule--helper config submodule.submodule.url "newer_url" &&
+   git submodule--helper config submodule.submodule.url >actual &&
+   test_cmp expect actual
+   )
+'
+
 test_done
-- 
2.18.0



[PATCH v4 0/9] Make submodules work if .gitmodules is not checked out

2018-08-24 Thread Antonio Ospite
Hi,

this series teaches git to try and read the .gitmodules file from the
index (:.gitmodules) and the current branch (HEAD:.gitmodules) when it
is not readily available in the working tree.

This can be used, together with sparse checkouts, to enable submodule
usage with programs like vcsh[1] which manage multiple repositories with
their working trees sharing the same path.

[1] https://github.com/RichiH/vcsh

In v4 patches 1, 2, 5, 6, and 7 are basically the same as in the
previous series; the review can concentrate on patches 3, 4, 8, and 9.

v3 of the series is here:
https://public-inbox.org/git/20180814110525.17801-1-...@ao2.it/

v2 of the series is here:
https://public-inbox.org/git/20180802134634.10300-1-...@ao2.it/

v1 of the series, with some background info, is here:
https://public-inbox.org/git/20180514105823.8378-1-...@ao2.it/

Changes since v3:

  * Improve robustness of current tests in t7411-submodule-config.sh:
  - merge two tests that check for effects of the same commit.
  - reset to a well defined point in history when exiting the tests.

  * Fix style issues in new tests added to t7411-submodule-config.sh:
  - use test_when_finished in new tests.
  - name the output file 'expect' instead of 'expected'.

  * Add a new "submodule--helper config --check-writeable" command.

  * Use "s--h config --check-writeable" in git-submodule.sh to share the
code with the C implementation instead of duplicating the safety
check in shell script.

  * Add the ability to read .gitmodules from the index and then
    fall back to the current branch if the file is not in the index.
Add also more tests to validate all the possible scenarios.

  * Fix style issues in t7416-submodule-sparse-gitmodules.sh:
  - name the output file 'expect' instead of 'expected'.
  - remove white space after the redirection operator.
  - indent the HEREDOC block.
  - use "git -C super" instead of a subshell when there is only one
command in the test.

  * Remove a stale file named 'new' which erroneously slipped into
    a commit.

  * Update some comments and commit messages.


Thanks,
   Antonio

Antonio Ospite (9):
  submodule: add a print_config_from_gitmodules() helper
  submodule: factor out a config_set_in_gitmodules_file_gently function
  t7411: merge tests 5 and 6
  t7411: be nicer to future tests and really clean things up
  submodule--helper: add a new 'config' subcommand
  submodule: use the 'submodule--helper config' command
  t7506: clean up .gitmodules properly before setting up new scenario
  submodule: add a helper to check if it is safe to write to .gitmodules
  submodule: support reading .gitmodules when it's not in the working
tree

 builtin/submodule--helper.c|  40 +
 cache.h|   2 +
 git-submodule.sh   |  13 ++-
 submodule-config.c |  55 -
 submodule-config.h |   3 +
 submodule.c|  28 +--
 submodule.h|   1 +
 t/t7411-submodule-config.sh| 107 +
 t/t7416-submodule-sparse-gitmodules.sh |  78 ++
 t/t7506-status-submodule.sh|   3 +-
 10 files changed, 301 insertions(+), 29 deletions(-)
 create mode 100755 t/t7416-submodule-sparse-gitmodules.sh

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?


[PATCH v4 8/9] submodule: add a helper to check if it is safe to write to .gitmodules

2018-08-24 Thread Antonio Ospite
Introduce a helper function named is_writing_gitmodules_ok() to verify
that the .gitmodules file is safe to write.

The function name follows the scheme of is_staging_gitmodules_ok().

The two symbolic constants GITMODULES_INDEX and GITMODULES_HEAD are used
to get help from the C preprocessor in preventing typos, especially for
future users.

This is in preparation for a future change which teaches git how to read
.gitmodules from the index or from the current branch if the file is not
available in the working tree.

The rationale behind the check is that writing to .gitmodules requires
the file to be present in the working tree, unless a brand new
.gitmodules is being created (in which case the .gitmodules file would
not exist at all: neither in the working tree nor in the index or in the
current branch).

Expose the functionality also via a "submodule--helper config
--check-writeable" command, as git scripts may want to perform the check
before modifying submodules configuration.
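For scripts that cannot call into the helper, the same safety check can be approximated with stable plumbing. The sketch below is illustrative only: `gitmodules_writeable` is a made-up name, and this is not the series' code.

```shell
# Approximation of is_writing_gitmodules_ok() in plain shell (hypothetical
# helper name; not from this series).  Writing is OK if .gitmodules is in
# the working tree, or if it exists nowhere at all (brand new file).
gitmodules_writeable () {
	test -f .gitmodules ||
	{
		! git cat-file -e :.gitmodules 2>/dev/null &&
		! git cat-file -e HEAD:.gitmodules 2>/dev/null
	}
}
```

In a fresh repository the function succeeds (a brand new .gitmodules may be created); once the file is staged or committed but missing from the working tree it fails, matching the rationale above.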

Signed-off-by: Antonio Ospite 
---
 builtin/submodule--helper.c | 24 +++-
 cache.h |  2 ++
 submodule.c | 18 ++
 submodule.h |  1 +
 t/t7411-submodule-config.sh | 31 +++
 5 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index a461a1107c..1bb168e814 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2030,6 +2030,28 @@ static int connect_gitdir_workingtree(int argc, const char **argv, const char *p
 
 static int module_config(int argc, const char **argv, const char *prefix)
 {
+   enum {
+   CHECK_WRITEABLE = 1
+   } command = 0;
+
+   struct option module_config_options[] = {
+   OPT_CMDMODE(0, "check-writeable", &command,
+   N_("check if it is safe to write to the .gitmodules file"),
+   CHECK_WRITEABLE),
+   OPT_END()
+   };
+   const char *const git_submodule_helper_usage[] = {
+   N_("git submodule--helper config name [value]"),
+   N_("git submodule--helper config --check-writeable"),
+   NULL
+   };
+
+   argc = parse_options(argc, argv, prefix, module_config_options,
+git_submodule_helper_usage, PARSE_OPT_KEEP_ARGV0);
+
+   if (argc == 1 && command == CHECK_WRITEABLE)
+   return is_writing_gitmodules_ok() ? 0 : -1;
+
/* Equivalent to ACTION_GET in builtin/config.c */
if (argc == 2)
return print_config_from_gitmodules(argv[1]);
@@ -2038,7 +2060,7 @@ static int module_config(int argc, const char **argv, const char *prefix)
if (argc == 3)
return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
 
-   die("submodule--helper config takes 1 or 2 arguments: name [value]");
+   usage_with_options(git_submodule_helper_usage, module_config_options);
 }
 
 #define SUPPORT_SUPER_PREFIX (1<<0)
diff --git a/cache.h b/cache.h
index b1fd3d58ab..e3fe1e6191 100644
--- a/cache.h
+++ b/cache.h
@@ -486,6 +486,8 @@ static inline enum object_type object_type(unsigned int mode)
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
 #define GITMODULES_FILE ".gitmodules"
+#define GITMODULES_INDEX ":.gitmodules"
+#define GITMODULES_HEAD "HEAD:.gitmodules"
 #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF"
 #define GIT_NOTES_DEFAULT_REF "refs/notes/commits"
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
diff --git a/submodule.c b/submodule.c
index 36a9465acb..d875b512df 100644
--- a/submodule.c
+++ b/submodule.c
@@ -50,6 +50,24 @@ int is_gitmodules_unmerged(const struct index_state *istate)
return 0;
 }
 
+/*
+ * Check if the .gitmodules file is safe to write.
+ *
+ * Writing to the .gitmodules file requires that the file exists in the
+ * working tree or, if it doesn't, that a brand new .gitmodules file is going
+ * to be created (i.e. it's neither in the index nor in the current branch).
+ *
+ * It is not safe to write to .gitmodules if it's not in the working tree but
+ * it is in the index or in the current branch, because writing new values
+ * (and staging them) would blindly overwrite ALL the old content.
+ */
+int is_writing_gitmodules_ok(void)
+{
+   struct object_id oid;
+   return file_exists(GITMODULES_FILE) ||
+   (get_oid(GITMODULES_INDEX, &oid) < 0 && get_oid(GITMODULES_HEAD, &oid) < 0);
+}
+
 /*
  * Check if the .gitmodules file has unstaged modifications.  This must be
  * checked before allowing modifications to the .gitmodules file with the
diff --git a/submodule.h b/submodule.h
index 7d476cefa7..5106d4597a 100644
--- a/submodule.h
+++ b/submodule.h
@@ -40,6 +40,7 @@ struct submodule_update_strategy {
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
 int 

[PATCH v4 6/9] submodule: use the 'submodule--helper config' command

2018-08-24 Thread Antonio Ospite
Use the 'submodule--helper config' command in git-submodule.sh to avoid
referring explicitly to .gitmodules by the hardcoded file path.

This makes it possible to access the submodules configuration in a more
controlled way.

Signed-off-by: Antonio Ospite 
---
 git-submodule.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index f7fd80345c..9e47dc9574 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -72,7 +72,7 @@ get_submodule_config () {
value=$(git config submodule."$name"."$option")
if test -z "$value"
then
-   value=$(git config -f .gitmodules submodule."$name"."$option")
+   value=$(git submodule--helper config submodule."$name"."$option")
fi
printf '%s' "${value:-$default}"
 }
@@ -283,11 +283,11 @@ or you are unsure what this means choose another name with the '--name' option."
git add --no-warn-embedded-repo $force "$sm_path" ||
die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
 
-   git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
-   git config -f .gitmodules submodule."$sm_name".url "$repo" &&
+   git submodule--helper config submodule."$sm_name".path "$sm_path" &&
+   git submodule--helper config submodule."$sm_name".url "$repo" &&
if test -n "$branch"
then
-   git config -f .gitmodules submodule."$sm_name".branch "$branch"
+   git submodule--helper config submodule."$sm_name".branch "$branch"
fi &&
git add --force .gitmodules ||
die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
-- 
2.18.0



[PATCH v4 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function

2018-08-24 Thread Antonio Ospite
Introduce a new config_set_in_gitmodules_file_gently() function to write
config values to the .gitmodules file.

This is in preparation for a future change which will use the function
to write to the .gitmodules file in a more controlled way instead of
using "git config -f .gitmodules".

The purpose of the change is mainly to centralize the code that writes
to the .gitmodules file to avoid some duplication.

The naming follows git_config_set_in_file_gently() but the git_ prefix
is removed to communicate that this is not a generic git-config API.

Signed-off-by: Antonio Ospite 
---
 submodule-config.c | 12 
 submodule-config.h |  1 +
 submodule.c| 10 +++---
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index eef96c4198..b7ef055c63 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -707,6 +707,18 @@ int print_config_from_gitmodules(const char *key)
return 0;
 }
 
+int config_set_in_gitmodules_file_gently(const char *key, const char *value)
+{
+   int ret;
+
+   ret = git_config_set_in_file_gently(GITMODULES_FILE, key, value);
+   if (ret < 0)
+   /* Maybe the user already did that, don't error out here */
+   warning(_("Could not update .gitmodules entry %s"), key);
+
+   return ret;
+}
+
 struct fetch_config {
int *max_children;
int *recurse_submodules;
diff --git a/submodule-config.h b/submodule-config.h
index ed40e9a478..9957bcbbfa 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -57,6 +57,7 @@ void submodule_free(struct repository *r);
 int check_submodule_name(const char *name);
 
 int print_config_from_gitmodules(const char *key);
+int config_set_in_gitmodules_file_gently(const char *key, const char *value);
 
 /*
  * Note: these helper functions exist solely to maintain backward
diff --git a/submodule.c b/submodule.c
index 50cbf5f13e..36a9465acb 100644
--- a/submodule.c
+++ b/submodule.c
@@ -89,6 +89,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 {
struct strbuf entry = STRBUF_INIT;
const struct submodule *submodule;
+   int ret;
 
if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
return -1;
@@ -104,14 +105,9 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
strbuf_addstr(&entry, "submodule.");
strbuf_addstr(&entry, submodule->name);
strbuf_addstr(&entry, ".path");
-   if (git_config_set_in_file_gently(GITMODULES_FILE, entry.buf, newpath) < 0) {
-   /* Maybe the user already did that, don't error out here */
-   warning(_("Could not update .gitmodules entry %s"), entry.buf);
-   strbuf_release(&entry);
-   return -1;
-   }
+   ret = config_set_in_gitmodules_file_gently(entry.buf, newpath);
strbuf_release(&entry);
-   return 0;
+   return ret;
 }
 
 /*
-- 
2.18.0



[PATCH v4 9/9] submodule: support reading .gitmodules when it's not in the working tree

2018-08-24 Thread Antonio Ospite
When the .gitmodules file is not available in the working tree, try
using the content from the index and from the current branch. This
covers the case when the file is part of the repository but for some
reason it is not checked out, for example because of a sparse checkout.

This makes it possible to use at least the 'git submodule' commands
which *read* the gitmodules configuration file without fully populating
the working tree.
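The read fallback can be demonstrated today with stable plumbing; this sketch uses made-up names ("demo", "sub") and `git config --blob` rather than the new helper:

```shell
# Sketch: read submodule configuration from HEAD:.gitmodules when the file
# is not in the working tree ("demo" and "sub" are made-up example names).
cd "$(mktemp -d)"
git init -q demo && cd demo
printf '[submodule "sub"]\n\turl = ../sub\n' >.gitmodules
git add .gitmodules
git -c user.name=t -c user.email=t@example.com commit -q -m 'add .gitmodules'
rm .gitmodules	# e.g. left out by a sparse checkout
git config --blob HEAD:.gitmodules --get submodule.sub.url	# prints "../sub"
```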

Writing to .gitmodules will still require that the file is checked out,
so check for that before calling config_set_in_gitmodules_file_gently.

Add a similar check also in git-submodule.sh::cmd_add() to anticipate
the eventual failure of the "git submodule add" command when .gitmodules
is not safely writeable; this prevents the command from leaving the
repository in a spurious state (e.g. the submodule repository was cloned
but .gitmodules was not updated because
config_set_in_gitmodules_file_gently failed).

Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
from .gitmodules succeeds and that writing to it fails when the file is
not checked out.

Signed-off-by: Antonio Ospite 
---
 builtin/submodule--helper.c|  6 +-
 git-submodule.sh   |  5 ++
 submodule-config.c | 18 +-
 t/t7411-submodule-config.sh| 26 -
 t/t7416-submodule-sparse-gitmodules.sh | 78 ++
 5 files changed, 129 insertions(+), 4 deletions(-)
 create mode 100755 t/t7416-submodule-sparse-gitmodules.sh

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 1bb168e814..6bc44d87dc 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2057,8 +2057,12 @@ static int module_config(int argc, const char **argv, const char *prefix)
return print_config_from_gitmodules(argv[1]);
 
/* Equivalent to ACTION_SET in builtin/config.c */
-   if (argc == 3)
+   if (argc == 3) {
+   if (!is_writing_gitmodules_ok())
+   die(_("please make sure that the .gitmodules file is in the working tree"));
+
return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
+   }
 
usage_with_options(git_submodule_helper_usage, module_config_options);
 }
diff --git a/git-submodule.sh b/git-submodule.sh
index 9e47dc9574..0e44487135 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -159,6 +159,11 @@ cmd_add()
shift
done
 
+   if ! git submodule--helper config --check-writeable >/dev/null 2>&1
+   then
+die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")"
+   fi
+
if test -n "$reference_path"
then
is_absolute_path "$reference_path" ||
diff --git a/submodule-config.c b/submodule-config.c
index b7ef055c63..edba68ac85 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "dir.h"
 #include "repository.h"
 #include "config.h"
 #include "submodule-config.h"
@@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
 static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
 {
if (repo->worktree) {
-   char *file = repo_worktree_path(repo, GITMODULES_FILE);
-   git_config_from_file(fn, file, data);
+   struct git_config_source config_source = { 0 };
+   const struct config_options opts = { 0 };
+   struct object_id oid;
+   char *file;
+
+   file = repo_worktree_path(repo, GITMODULES_FILE);
+   if (file_exists(file))
+   config_source.file = file;
+   else if (get_oid(GITMODULES_INDEX, ) >= 0)
+   config_source.blob = GITMODULES_INDEX;
+   else if (get_oid(GITMODULES_HEAD, ) >= 0)
+   config_source.blob = GITMODULES_HEAD;
+
+   config_with_options(fn, data, &config_source, &opts);
+
free(file);
}
 }
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index 45953f9300..2cfabb18bc 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -134,7 +134,7 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
)
 '
 
-test_expect_success 'reading submodules config with "submodule--helper config"' '
+test_expect_success 'reading submodules config from the working tree with "submodule--helper config"' '
(cd super &&
echo "../submodule" >expect &&
git submodule--helper config submodule.submodule.url >actual &&
@@ -192,4 +192,28 @@ test_expect_success 'non-writeable .gitmodules when it is in the current branch
)
 '
 
+test_expect_success 'reading submodules config from the index when .gitmodules is not in the working tree' '
+   ORIG=$(git -C 

[PATCH v4 4/9] t7411: be nicer to future tests and really clean things up

2018-08-24 Thread Antonio Ospite
Tests 5 and 7 in t/t7411-submodule-config.sh add two commits with
invalid lines in .gitmodules but then only the second commit is removed.

This may affect future subsequent tests if they assume that the
.gitmodules file has no errors.

Remove both the commits as soon as they are not needed anymore.

Signed-off-by: Antonio Ospite 
---
 t/t7411-submodule-config.sh | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index f2cd1f4a2c..b1f3c6489b 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -83,6 +83,8 @@ Submodule name: 'submodule' for path 'submodule'
 EOF
 
 test_expect_success 'error in history of one submodule config lets continue, stderr message contains blob ref' '
+   ORIG=$(git -C super rev-parse HEAD) &&
+   test_when_finished "git -C super reset --hard $ORIG" &&
(cd super &&
cp .gitmodules .gitmodules.bak &&
echo "  value = \"" >>.gitmodules &&
@@ -115,6 +117,8 @@ test_expect_success 'using different treeishs works' '
 '
 
 test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
+   ORIG=$(git -C super rev-parse HEAD) &&
+   test_when_finished "git -C super reset --hard $ORIG" &&
(cd super &&
git config -f .gitmodules \
submodule.submodule.fetchrecursesubmodules blabla &&
@@ -126,8 +130,7 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
HEAD b \
HEAD submodule \
>actual &&
-   test_cmp expect_error actual  &&
-   git reset --hard HEAD^
+   test_cmp expect_error actual
)
 '
 
-- 
2.18.0



clone, hardlinks, and file modes (and CAP_FOWNER)

2018-08-24 Thread Andreas Krey
Hi everybody,

I'm currently looking into more aggressively sharing space between multiple 
repositories,
and into getting them to share again after one did a repack (which costs us 15G 
space).

One thing I stumbled on is the /proc/sys/fs/protected_hardlinks stuff which 
disallows
hardlinking pack files belonging to someone else. This consequently inhibits 
sharing
when first cloning from a common shared cache repo.

Installing git with CAP_FOWNER is probably too dangerous;
at least the capability should only be enabled during the directory copying.

*

And the next thing is that copied object/pack files are created with mode 
rw-rw-r--,
unlike those that come out of the regular transports.

Apparent patch:

diff --git a/builtin/clone.c b/builtin/clone.c
index fd2c3ef090..6ffb4db4da 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -448,7 +448,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
die_errno(_("failed to create link '%s'"), dest->buf);
option_no_hardlinks = 1;
}
-   if (copy_file_with_time(dest->buf, src->buf, 0666))
+   if (copy_file_with_time(dest->buf, src->buf, 0444))
die_errno(_("failed to copy file to '%s'"), dest->buf);
}
closedir(dir);

Alas, copy_file takes the mode just as a crude hint to executability, so also:

diff --git a/copy.c b/copy.c
index 4de6a110f0..883060009c 100644
--- a/copy.c
+++ b/copy.c
@@ -32,7 +32,7 @@ int copy_file(const char *dst, const char *src, int mode)
 {
int fdi, fdo, status;
 
-   mode = (mode & 0111) ? 0777 : 0666;
+   mode = (mode & 0111) ? 0777 : (mode & 0222) ? 0666 : 0444;
if ((fdi = open(src, O_RDONLY)) < 0)
return fdi;
if ((fdo = open(dst, O_WRONLY | O_CREAT | O_EXCL, mode)) < 0) {

(copy_file is also used with 0644 instead of the usual 0666 in 
refs/files-backend.c)
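The proposed three-way mapping can be sanity-checked in shell; `map_mode` is a made-up illustration of the same expression, not code from copy.c:

```shell
# Shell rendering of: mode = (mode & 0111) ? 0777 : (mode & 0222) ? 0666 : 0444;
# (map_mode is a hypothetical helper for illustration only.)
map_mode () {
	if [ $(( $1 & 0111 )) -ne 0 ]; then echo 0777	# any execute bit -> 0777
	elif [ $(( $1 & 0222 )) -ne 0 ]; then echo 0666	# any write bit -> 0666
	else echo 0444					# read-only -> 0444
	fi
}
map_mode 0444	# prints "0444"
map_mode 0644	# prints "0666"
map_mode 0755	# prints "0777"
```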

Will submit as patch if acceptable; I'm not sure what the mode casing will
do with other users.

- Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds 
Date: Fri, 22 Jan 2010 07:29:21 -0800


Request for testing v2.19.0-rc0 *with builtin stash/rebase*

2018-08-24 Thread Johannes Schindelin
Team,

while this mail talks about Git for Windows, please keep in mind that we
try *very* hard to keep Git for Windows' master working correctly not only
on Windows but also on macOS and Linux.

I, for one, run Git built from Git for Windows' `master` branch in my
Linux VMs all the time.

As all of you know by now, the fact that Git was pretty much designed to
work well on Linux is not exactly helping Git for Windows. (Or for that
matter, macOS: Git is substantially slower on macOS than on Linux when
running on the same hardware.) The quickest pay-off comes from converting
Unix shell scripts (which are designed to spawn processes, whereas Windows
is more optimized for multi-threaded applications).

For that reason, I was delighted to see that our Google Summer of Code
pushed pretty hard in that direction. And I could not help myself so I had
to test how much faster things got. Here is the result of my first, really
quick and dirty test:

                    without builtin stash/rebase   with builtin stash/rebase
t3400 (rebase)      1m27s                          32s
t3404 (rebase -i)   13m15s                         3m59s
t3903 (stash)       8m37s                          1m18s

What can I say? Even if the numbers are off by as much as 10%, these are
impressive improvements: keep in mind that there is a lot of churn going
on in the test suite because it is itself implemented in Unix shell
scripts (and hence I won't even bother to try more correct performance
benchmarking because that is simply not possible when Unix shell scripts
are in the equation). So the speed improvements of the stash/rebase
commands are *even higher* than this.

So I really, really, really want those builtins in Git for Windows
v2.19.0. At *least* as an option.

Therefore, after creating a pre-release of Git for Windows corresponding
to Git v2.19.0-rc0, I created another one (note the .2 at the end), *with*
those new builtins:
https://github.com/git-for-windows/git/releases/tag/v2.19.0-rc0.windows.2

I would like to ask you for help in dog-fooding this. In particular if you
are a heavy stash/rebase user (like I am), it would be helpful to really
kick those tires.

Of course, those are only Windows binaries on that web page, but it should
be easy to compile from that tag on Linux and macOS. I could also build a
macOS installer and add it to that pre-release, is there interest in that?

Currently I am uncertain whether I should spend the time to reinstate the
old scripted `git stash` guarded by `stash.useBuiltin` and the old
scripted `git rebase -i` (guarded by the same `rebase.useBuiltin` that
guards the scripted `git rebase`), as a fall-back (or make the new
builtins opt-ins instead of opt-outs).

So far, I have not encountered any serious bug, but then, I did not really
have a chance to use it yet (I installed it, of course, but I have not
done any serious rebasing yet). So your help will be crucial in
determining where I need to spend time.

Thanks,
Johannes



Re: [ANNOUNCE] Git v2.19.0-rc0

2018-08-24 Thread Derrick Stolee

On 8/24/2018 2:45 AM, Jeff King wrote:

On Thu, Aug 23, 2018 at 10:59:55PM -0400, Jeff King wrote:


So I think we have a winner. I'll polish that up into patches and send
it out later tonight.

Oof. This rabbit hole keeps going deeper and deeper. I wrote up my
coccinelle findings separately in:

   https://public-inbox.org/git/20180824064228.ga3...@sigill.intra.peff.net/

which is possibly a coccinelle bug (there I talked about oidcmp, since
it can be demonstrated with the existing transformations, but the same
thing happens with my hasheq patches). I'll wait to see how that
discussion plays out, but I do otherwise have hasheq() patches ready to
go, so it's probably not worth anybody else digging in further.

I was trying this myself yesterday, and found a doc _somewhere_ that 
said what we should do is this:


@@
expression e1, e2;
@@
if (
- oidcmp(e1, e2)
+ !oideq(e1, e2)
 )

(Don't forget the space before the last end-paren!)

-Stolee


Re: [Cocci] excluding a function from coccinelle transformation

2018-08-24 Thread Julia Lawall



On Fri, 24 Aug 2018, Jeff King wrote:

> In Git's Coccinelle patches, we sometimes want to suppress a
> transformation inside a particular function. For example, in finding
> conversions of hashcmp() to oidcmp(), we should not convert the call in
> oidcmp() itself, since that would cause infinite recursion. We write the
> semantic patch like this:
>
>   @@
>   identifier f != oidcmp;
>   expression E1, E2;
>   @@
> f(...) {...
>   - hashcmp(E1->hash, E2->hash)
>   + oidcmp(E1, E2)
> ...}

The problem is with how ... works.  For transformation, A ... B
requires that B occur on every execution path starting with A, unless that
execution path ends up in error handling code
(e.g., if (...) { ... return; }).  Here your A is the start of the function.
So you need a call to hashcmp on every path through the function, which
fails when you add ifs.

If you use * (searching) instead of - and + (transformation) it will only
require that a path exists.  * is meant for bug finding, where you often
want to find, e.g., whether there exists a path that is missing a free.

If you want the exists behavior with a transformation rule, then you can
put exists at the top of the rule between the initial @@.  I don't suggest
this in general, as it can lead to inconsistencies.

What you want is what you ended up using, which is <... P ...> which
allows zero or more occurrences of P.

However, this can all be very expensive, because you are matching paths
through the function definition which you don't really care about.  All
you care about here is the name.  So another approach is

@@
position p : script:python() { p[0].current_element != "oidcmp" };
expression E1,E2;
@@

- hashcmp(E1->hash, E2->hash)
+ oidcmp(E1, E2)

(I assume that "not equals" is written != in python)

Another issue with A ... B is that by default A and B should not appear in
the matched region.  So your original rule matches only the case where
every execution path contains exactly one call to hashcmp, not more than
one.  So that was another problem with it.

julia

>
> This catches some cases, but not all. For instance, there's one case in
> sequencer.c which it does not convert. Now here's where it gets weird.
> If I instead use the angle-bracket form of ellipses, like this:
>
>   @@
>   identifier f != oidcmp;
>   expression E1, E2;
>   @@
> f(...) {<...
>   - hashcmp(E1->hash, E2->hash)
>   + oidcmp(E1, E2)
> ...>}
>
> then we do generate the expected diff! Here's a much more cut-down
> source file that demonstrates the same behavior:
>
>   int foo(void)
>   {
> if (1)
>   if (!hashcmp(x, y))
> return 1;
> return 0;
>   }
>
> If I remove the initial "if (1)" then a diff is generated with either
> semantic patch (and the particulars of the "if" are not important; the
> same thing happens if it's a while-loop. The key thing seems to be that
> the code is not in the top-level block of the function).
>
> And here's some double-weirdness. I get those results with spatch 1.0.4,
> which is what's in Debian unstable. If I then upgrade to 1.0.6 from
> Debian experimental, then _neither_ patch produces any results! Instead
> I get:
>
>   init_defs_builtins: /usr/lib/coccinelle/standard.h
>   (ONCE) Expected tokens oidcmp hashcmp hash
>   Skipping:foo.c
>
> (whereas before, even the failing case said "HANDLING: foo.c").
>
> And then one final check: I built coccinelle from the current tip of
> https://github.com/coccinelle/coccinelle (1.0.7-00504-g670b2243).
> With my cut-down case, that version generates a diff with either
> semantic patch. But for the full-blown case in sequencer.c, it still
> only works with the angle brackets.
>
> So my questions are:
>
>   - is this a bug in coccinelle? Or I not understand how "..." is
> supposed to work here?
>
> (It does seem like there was possibly a separate bug introduced in
> 1.0.6 that was later fixed; we can probably ignore that and just
> focus on the behavior in the current tip of master).
>
>   - is there a better way to represent this kind of "transform this
> everywhere _except_ in this function" semantic patch? (preferably
> one that does not tickle this bug, if it is indeed a bug ;) ).
>
> -Peff
> ___
> Cocci mailing list
> co...@systeme.lip6.fr
> https://systeme.lip6.fr/mailman/listinfo/cocci
>


Re: [ANNOUNCE] Git v2.19.0-rc0

2018-08-24 Thread Ævar Arnfjörð Bjarmason


On Fri, Aug 24 2018, Jeff King wrote:

> On Thu, Aug 23, 2018 at 04:59:27PM -0400, Derrick Stolee wrote:
>
>> Using git/git:
>>
>> Test      v2.18.0           v2.19.0-rc0             HEAD
>> ---------------------------------------------------------------------
>> 0001.2:   3.10(3.02+0.08)   3.27(3.17+0.09) +5.5%   3.14(3.02+0.11) +1.3%
>>
>>
>> Using torvalds/linux:
>>
>> Test      v2.18.0             v2.19.0-rc0               HEAD
>> -----------------------------------------------------------------------
>> 0001.2:   56.08(45.91+1.50)   56.60(46.62+1.50) +0.9%   54.61(45.47+1.46) -2.6%
>
> Interesting that these timings aren't as dramatic as the ones you got
> the other day (mine seemed to shift, too; for whatever reason it seems
> like under load the difference is larger).
>
>> Now here is where I get on my soapbox (and create a TODO for myself later).
>> I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively suggests
>> that the results should be _more_ accurate than the default of 3. However, I
>> then remember that we only report the *minimum* time from all the runs,
>> which is likely to select an outlier from the distribution. To test this, I
>> ran a few tests manually and found the variation between runs to be larger
>> than 3%.
>
> Yes, I agree it's not a great system. The whole "best of 3" thing is
> OK for throwing out cold-cache warmups, but it's really bad for teasing
> out the significance of small changes, or even understanding how much
> run-to-run noise there is.
>
>> When I choose my own metrics for performance tests, I like to run at least
>> 10 runs, remove the largest AND smallest runs from the samples, and then
>> take the average. I did this manually for 'git rev-list --all --objects' on
>> git/git and got the following results:
>
> I agree that technique is better. I wonder if there's something even
> more statistically rigorous we could do. E.g., to compute the variance
> and throw away outliers based on standard deviations. And also to report
> the variance to give a sense of the significance of any changes.
>
> Obviously more runs gives greater confidence in the results, but 10
> sounds like a lot. Many of these tests take minutes to run. Letting it
> go overnight is OK if you're doing a once-per-release mega-run, but it's
> pretty painful if you just want to generate some numbers to show off
> your commit.

An ex-coworker who's a *lot* smarter about these things than I am wrote
this module: https://metacpan.org/pod/Dumbbench

A while ago I dabbled briefly with integrating it into t/perf/ but got
distracted by something else:

The module currently works similar to [more traditional iterative
benchmark modules], except (in layman terms) it will run the command
many times, estimate the uncertainty of the result and keep
iterating until a certain user-defined precision has been
reached. Then, it calculates the resulting uncertainty and goes
through some pain to discard bad runs and subtract overhead from the
timings. The reported timing includes an uncertainty, so that
multiple benchmarks can more easily be compared.

Details of how it works here:
https://metacpan.org/pod/Dumbbench#HOW-IT-WORKS-AND-WHY-IT-DOESN'T

Something like that seems to me to be an inherently better
approach. I.e. we have lots of test cases that take 500ms, and some that
take maybe 5 minutes (depending on the size of the repository).

Indiscriminately repeating all of those for GIT_PERF_REPEAT_COUNT must
be dumber than something like the above method.

We could also speed up the runtime of the perf tests a lot with such a
method, by e.g. saying that we're OK with less certainty on tests that
take a longer time than those that take a shorter time.
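
In rough terms, the loop I have in mind looks something like this (a
toy sketch I just made up, not Dumbbench's actual algorithm; names and
thresholds are invented):

```python
import statistics

def benchmark(run, rel_precision=0.01, min_runs=5, max_runs=100):
    """Call run() repeatedly until the standard error of the mean,
    relative to the mean, drops below rel_precision (or we give up
    at max_runs). run() returns one wall-clock timing in seconds."""
    times = []
    while len(times) < max_runs:
        times.append(run())
        if len(times) >= min_runs:
            mean = statistics.mean(times)
            # standard error of the mean shrinks as sqrt(n) grows
            sem = statistics.stdev(times) / len(times) ** 0.5
            if sem <= rel_precision * mean:
                break
    return statistics.mean(times), statistics.stdev(times)
```

So a fast, stable test converges after a handful of runs, while a noisy
five-minute test keeps being repeated only until its uncertainty is
acceptable (or max_runs is hit).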


>> v2.18.0    v2.19.0-rc0    HEAD
>> --------------------------------
>> 3.126 s    3.308 s        3.170 s
>
> So that's 5% worsening in 2.19, and we reclaim all but 1.4% of it. Those
> numbers match what I expect (and what I was seeing in some of my earlier
> timings).
>
>> I just kicked off a script that will run this test on the Linux repo while I
>> drive home. I'll be able to report a similar table of data easily.
>
> Thanks, I'd expect it to come up with similar percentages. So we'll see
> if that holds true. :)
>
>> My TODO is to consider aggregating the data this way (or with a median)
>> instead of reporting the minimum.
>
> Yes, I think that would be a great improvement for t/perf.
>
> -Peff


Re: [PATCH 0/9] trailer-parsing false positives

2018-08-24 Thread Jeff King
On Thu, Aug 23, 2018 at 11:30:54AM -0700, Junio C Hamano wrote:

> > Now here's the tricky part. I think patches 1-8 are mostly sensible.
> 
> Yeah, nothing that made me go "Huh?" in these 8 patches.  Thanks.
> 
> > So I think there may be further opportunities for cleanup here. I'm not
> > sure if we'd need to retain this behavior for git-interpret-trailers.
> > AFAICT it is not documented, and I suspect is mostly historical
> > accident, and not anything anybody ever wanted.
> 
> I tend to think the behaviour was not designed but it "just happens
> to work that way".

OK, so I've slept on it as promised, and looked at ignore_non_trailer()
a bit more closely.

It actually ignores three things:

 - comment blocks

 - "commit -v" cut lines

 - "Conflicts:" blocks for merges
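
For illustration, here's a made-up commit message buffer containing all
three; when looking for the end of the message proper, the function
skips past everything from the "Conflicts:" line down:

```
Merge branch 'xy/topic' into maint

Conflicts:
	sequencer.c

# Please enter the commit message for your changes.
# ------------------------ >8 ------------------------
diff --git a/sequencer.c b/sequencer.c
```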

I think I could see an argument for handling the third type everywhere,
if we wanted trailers to go above such a block (since they really are
part of the actual commit message). Of course, we try to make those
"Conflicts" blocks into comments these days. And looking at our history,
I had trouble finding many examples, since most of the old merges do not
have sign-offs in the first place.

There are a handful with the S-o-b below the conflict block (e.g.,
a24a32ddb). There are even some gems (like 425b139313 and a5db4b127b)
where there is one sign-off above the block and one below. In our
history, there's nothing more recent than 2015, which is not incredibly
long after 261f315beb (merge & sequencer: turn "Conflicts:" hint into a
comment, 2014-10-28).

But in linux.git, I can find many examples from this year of Conflicts
blocks with the signoffs afterwards. And a few with the signoffs before
(e.g., ed09f6d05c).  I think the current code will do the right thing
either way when parsing those (if the trailers are after, we won't
exclude those, but if they're before, then we'll skip over the Conflicts
block).

So frankly, that all makes me afraid of touching any of it. I do think
it's probably doing the wrong thing in some cases, and we should
probably just make a rule like "trailers always go at the bottom, end of
story". But there are enough weird and historical cases, not to mention
potential interactions with git-commit, that I'd be quite likely to
regress some other case.

So my inclination is to punt on it for now, and go with my patches 1-8,
which fix an actual case that we saw in the real world, without creating
other problems.

I think patch 9 is not hurting anything and may later help us, but I
could take or leave it.

> > If we do keep it by default, then the "--no-divider" option I added in
> > patch 4 should probably get a more generic name and cover this.
> > Something like "--verbatim-input".
> 
> Perhaps.  Even if this is not covered, --verbatim-input would be a
> good name for the option ;-)

Possibly. :) What I was worried about is realizing that it's not really
"verbatim", though, but rather "some mystical set of rules including
nonsense like git-commit cut-lines". And so we should not over-promise
with the option name.

I'd also be OK to call it "verbatim" and consider it a to-be-fixed bug
that it still respects these weird rules. I'm just not sure I want to
spend more time digging on those weird rules now.

-Peff


Re: [ANNOUNCE] Git v2.19.0-rc0

2018-08-24 Thread Jeff King
On Thu, Aug 23, 2018 at 04:59:27PM -0400, Derrick Stolee wrote:

> Using git/git:
> 
> Test      v2.18.0           v2.19.0-rc0             HEAD
> ---------------------------------------------------------------------
> 0001.2:   3.10(3.02+0.08)   3.27(3.17+0.09) +5.5%   3.14(3.02+0.11) +1.3%
> 
> 
> Using torvalds/linux:
> 
> Test      v2.18.0             v2.19.0-rc0               HEAD
> -----------------------------------------------------------------------
> 0001.2:   56.08(45.91+1.50)   56.60(46.62+1.50) +0.9%   54.61(45.47+1.46) -2.6%

Interesting that these timings aren't as dramatic as the ones you got
the other day (mine seemed to shift, too; for whatever reason it seems
like under load the difference is larger).

> Now here is where I get on my soapbox (and create a TODO for myself later).
> I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively suggests
> that the results should be _more_ accurate than the default of 3. However, I
> then remember that we only report the *minimum* time from all the runs,
> which is likely to select an outlier from the distribution. To test this, I
> ran a few tests manually and found the variation between runs to be larger
> than 3%.

Yes, I agree it's not a great system. The whole "best of 3" thing is
OK for throwing out cold-cache warmups, but it's really bad for teasing
out the significance of small changes, or even understanding how much
run-to-run noise there is.

> When I choose my own metrics for performance tests, I like to run at least
> 10 runs, remove the largest AND smallest runs from the samples, and then
> take the average. I did this manually for 'git rev-list --all --objects' on
> git/git and got the following results:

I agree that technique is better. I wonder if there's something even
more statistically rigorous we could do. E.g., to compute the variance
and throw away outliers based on standard deviations. And also to report
the variance to give a sense of the significance of any changes.
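
For concreteness, the kind of aggregation I mean is something like this
(a throwaway sketch, not anything t/perf actually does; the cutoff of
two standard deviations is arbitrary):

```python
import statistics

def summarize(times, k=2.0):
    """Drop samples more than k standard deviations from the mean,
    then report (mean, stddev) of the survivors."""
    mean = statistics.mean(times)
    sd = statistics.stdev(times)
    kept = [t for t in times if abs(t - mean) <= k * sd]
    return statistics.mean(kept), statistics.stdev(kept)
```

Reporting the stddev alongside the mean would at least let a reader
judge whether a 1-2% delta between versions is signal or noise.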

Obviously more runs gives greater confidence in the results, but 10
sounds like a lot. Many of these tests take minutes to run. Letting it
go overnight is OK if you're doing a once-per-release mega-run, but it's
pretty painful if you just want to generate some numbers to show off
your commit.

> v2.18.0    v2.19.0-rc0    HEAD
> --------------------------------
> 3.126 s    3.308 s        3.170 s

So that's 5% worsening in 2.19, and we reclaim all but 1.4% of it. Those
numbers match what I expect (and what I was seeing in some of my earlier
timings).

> I just kicked off a script that will run this test on the Linux repo while I
> drive home. I'll be able to report a similar table of data easily.

Thanks, I'd expect it to come up with similar percentages. So we'll see
if that holds true. :)

> My TODO is to consider aggregating the data this way (or with a median)
> instead of reporting the minimum.

Yes, I think that would be a great improvement for t/perf.

-Peff


Re: [ANNOUNCE] Git v2.19.0-rc0

2018-08-24 Thread Jeff King
On Thu, Aug 23, 2018 at 10:59:55PM -0400, Jeff King wrote:

> So I think we have a winner. I'll polish that up into patches and send
> it out later tonight.

Oof. This rabbit hole keeps going deeper and deeper. I wrote up my
coccinelle findings separately in:

  https://public-inbox.org/git/20180824064228.ga3...@sigill.intra.peff.net/

which is possibly a coccinelle bug (there I talked about oidcmp, since
it can be demonstrated with the existing transformations, but the same
thing happens with my hasheq patches). I'll wait to see how that
discussion plays out, but I do otherwise have hasheq() patches ready to
go, so it's probably not worth anybody else digging in further.

-Peff