Re: What exactly is a "initial checkout"

2018-11-07 Thread Philip Oakley

On 07/11/2018 08:50, Christian Halstrick wrote:

Ok, I now understand the problems which are solved by this
special behaviour of an "initial checkout". And, also important, I understand
when exactly I should do an "initial checkout" - when the index file does
not exist. I'll share my new knowledge with JGit :-)

Given that the initial query was about the lack of documentation for the 
term "initial checkout", do you have any suggestion of how it might best 
be incorporated into the documentation to assist future readers?

--
Philip
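
A minimal sketch of the condition discussed in this message, assuming (as the thread concludes) that an "initial checkout" is the case where the index file does not yet exist. The helper name `has_index` is hypothetical and not part of Git; `git rev-parse --git-path` needs a reasonably recent Git.

```shell
# Hypothetical helper: succeeds if the repository at "$1" already has an index file.
# The body runs in a subshell so the caller's working directory is untouched.
has_index() (
    cd "$1" && test -f "$(git rev-parse --git-path index)"
)

repo=$(mktemp -d)
git init -q "$repo"

# Freshly initialised repository: nothing has been staged, so no index file yet.
if has_index "$repo"; then before=present; else before=absent; fi

# Staging a file makes Git write the index.
echo hello > "$repo/file.txt"
git -C "$repo" add file.txt
if has_index "$repo"; then after=present; else after=absent; fi

echo "index before add: $before, after add: $after"
```

This only illustrates the "index file does not exist" test; it says nothing about what Git's checkout machinery does differently in that state.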


Re: if YOU use a Windows GUI for Git, i would appreciate knowing which one and why

2018-11-05 Thread Philip Oakley

Hi Gerry,
I'll give my view, as someone approaching retirement, but who worked as 
an Engineer in a mainly Windows environment.


On 04/11/2018 17:48, _g e r r y _ _l o w r y _ wrote:

PREAMBLE [START] - please feel free to skip this first section

Forgive me for asking this question on a mailing list.

stackoverflow would probably kill such a question before the bits were fully 
saved to a server drive.

Let me explain why i am asking and why i am not being a troll.

[a] i'm "old school", i.e., > 50% on my way to being age 72 [born 1947]


8 years behind..


[b] when i started programming in 1967, most of my work input was via punched 
cards

'69, at school, post/compile/run/wait for post; 1 week
 (Maths club)



[c] punching my own cards was cool

Pin punching individual chads ;-)



[d] IBM System/360 mainframe assembler was cool and patching previously punched 
card encoded machine code output was a fun risky but
at times necessary challenge.

Eventually the 370 at university.



[e] using command windows and coding batch files for Gary Kildall's CP/M and 
the evil empire's PC/MS-DOS was how i accomplished many
tasks for early non-GUI environments (i still continue this practice even in 
Windows 10 (a.k.a. please don't update my PC O/S behind
my back again versions of MS Windows)).

Engineer in electronics; software was an interlinked part of electronics 
back then


[f] my introduction to Git was via a command line based awesome video that has 
disappeared (i asked this community about that in a
previous thread).

Discovered in 2011 via a 'Code News' article. I spotted immediately that it 
solved the engineer's version control issue because it 'distributed' the 
control. I've tried a few of the GUIs.




BOTTOM LINE:  virtually 100% of my Git use has been via Git Bash command line 
[probably downloaded from https://git-scm.com/]

For me, and i suspect even for most people who live with GUI platforms, [a well 
kept secret fact] using the keyboard is faster than
using the mouse [especially when one's fingers are already over one's keyboard; for example, 
closing one or more "windows" via Alt+F4].

Also for me, i am happy to change some code and/or write some new code, Alt+Tab 
to Git Bash frequently, ADD/COMMIT, then Alt+Tab
back to whatever IDE i'm using [mostly LINQPad and vs2017]; i know that's quite 
a bit schizophrenic of me: command line Git but GUI
IDE.

PREAMBLE [END]


QUESTION:  if YOU use a Windows GUI for Git, i would appreciate knowing which 
one and why

i have been asked to look at GUI versions of Git for Windows.

I presume that this is for a client who isn't sure what they want 
http://www.abilitybusinesscomputerservices.com/home.html




https://git-scm.com/download/gui/windows currently lists 22 options.

That's nearly as bad as choosing a Linux distro ;-)



if i had more time left in my life and the option, because of my own nature, 
i'd likely download and evaluate all 22 - Mr.T would
pity the fool that i often can be.

CAUTION:  i am not looking for anyone to disparage other Git Windows GUIs.

Let me break down the question into 4 parts:

[1a] Which do you prefer:  Git GUI, Git command line?

I use the three parts provided as part of regular Git and Git for 
Windows, that is git-gui, gitk and git cli in a terminal (mintty).



[1b] What is your reason for your [1a] preference?

I have been in a general Windows environment for decades. The GUI format 
with single buttons/drop downs that do one thing well, without finger 
trouble side effects, is good in such environments. One cannot be master 
of everything.


The cli is good for specialists and special actions, especially 
precision surgery. The key is to avoid the "the surgery was a success 
but the patient died" results.


[2a] if applicable, which Git GUI do you prefer?

git-gui and gitk are now the only two I use.


[2b] What is your reason for your [2a] preference?

Many of the other GUIs hide the power of Git and its new abstraction of 
no longer actually being about "Control" (by 'management'). Now it is 
about veracity. If you have the right object ID (sha1/sha256) you have 
an identical original [there are no 'copies', all Mona Lisas with the 
hash are the same]. Management can choose which hash to accept upstream.


Most other GUIs try to hide behind the old school Master-copy view 
point that was developed in the 19th century for drawing office control. 
If you damaged the master drawing the ability to make things and do 
business was lost. Protecting the master drawing was everything. They 
were traced before they went to the blue print machine. Changes were 
batched up before the master could be touched (that risk again).


Too many GUIs (and their Managements!) still try to work the old way, 
losing all the potential benefits. They are still hammer wielders 
looking for nails, and only finding screws to smash.


I've heard reasonable things about SmartGit but that costs money so I 

Re: Git Slowness on Windows w/o Internet

2018-11-03 Thread Philip Oakley



On 03/11/2018 16:44, brian m. carlson wrote:

On Fri, Nov 02, 2018 at 11:10:51AM -0500, Peter Kostyukov wrote:

Wanted to bring to your attention an issue that we discovered on our
Windows Jenkins nodes with git scm installed (git.exe). Our Jenkins
servers don't have Internet access. It appears that git.exe is trying
to connect to various Cloudflare and Akamai CDN instances over the
Internet when it first runs and it keeps trying to connect to these
CDNs every git.exe execution until it makes a successful attempt. See
the screenshot attached with the details.

Enabling Internet access via proxy fixes the issue and git.exe
continues to work fast on the next attempts to run git.exe

Is there any configuration setting that can disable this git's
behavior or is there any other workaround without allowing Internet
access? Otherwise, every git command run on a server without the
Internet takes about 30 seconds to complete.


Git itself doesn't make any attempt to access those systems unless it's
configured to do so (e.g. a remote is set up to talk to those systems
and fetch or pull is used).

It's possible that you're using a distribution package that performs
this behavior, say, to check for updates.  I'd recommend that you
contact the distributor, which in this case might be Git for Windows,
and see if they can tell you more about what's going on.  The URL for
that project is at https://github.com/git-for-windows/git.



The normal Git for Windows install includes an option to check for 
updates at a suitable rate. Maybe you are hitting that. It can be 
switched off.


--
Philip


Re: git projects with submodules in different sites - in txt format (:+(

2018-10-02 Thread Philip Oakley

On 02/10/2018 06:47, Michele Hallak wrote:

Hi,

I am running out of ideas about how to change the methodology we are using in 
order to ease our integration process... Close to despair, I am throwing the 
question to you...

We have 6 infrastructure repositories [A, B, C, D, E, F ?].



Each project [W,X,Y,Z] is composed of 4 repositories [1-4], each one using one 
or two infrastructure repositories as sub-modules. (Not the same)


e.g. W1-W4; with say B & D as submodules


The infrastructure repositories are common to several projects and in the case 
we have to make a change in the infrastructure for a specific project, we are 
doing it on a specific branch until properly merged.


Do you also have remotes setup that provide backup and central authority 
to the projects..?


Everything is fine (more or less) and somehow working.


Good..


Now, we have one project that will be developed in another site and with 
another git server physically separated from the main site.


Is it networked? Internal control, external internet, sneakernet?


I copied the infrastructure repositories to the new site and removed and re-added 
the sub-modules in order for them to point to the url in the separated git 
server.

Every 2 weeks, the remotely developed code has to be integrated back in the 
main site.
My idea was to format GIT patches, integrate in the main site, tag the whole 
thing and ship back the integrated tagged code to the remote site.
... and now the nightmare starts:

yep, you have lost the validation & verification capability of Git's 
sha1/oid and DAG.




Since the .gitmodules is different, I cannot have the same SHA and then same 
tag and I am never sure that the integrated code is proper.


Remotes, remotes...


Maybe there is a simple solution that I don't know about to my problem? Is 
there something else than GIT patches? Should I simply ship to the remote site 
the code as is and change the submodules each time?



I think the solution you need is `git bundle` 
https://git-scm.com/docs/git-bundle. This is designed for the case where 
you do not have the regular git transport infrastructure. Instead it 
records the expected data that would be 'on the wire', which is then 
read in at the far end. The bundle can contain excess data to ensure 
overlap between site transmissions.
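
A minimal sketch of that round trip, under stated assumptions: the directory names `main` and `remote`, the file name `project.bundle`, and the throwaway commit identity are all hypothetical, and a real setup would bundle each project and infrastructure repository separately.

```shell
work=$(mktemp -d)

# Stand-in for a repository at the main site, with a single commit.
# The identity is passed inline only so the demo does not depend on user config.
git init -q "$work/main"
git -C "$work/main" -c user.name=Demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "first commit"

# Pack every ref and its history into one file that can travel on a CD
# or inside a password protected archive.
git -C "$work/main" bundle create "$work/project.bundle" --all

# At the far end the bundle file is used like a remote URL.
git clone -q "$work/project.bundle" "$work/remote"

# Check that the bundle is well formed and that the receiving repository
# has everything it depends on.
git -C "$work/remote" bundle verify "$work/project.bundle"

# Object IDs are preserved, unlike with formatted patches.
main_head=$(git -C "$work/main" rev-parse HEAD)
remote_head=$(git -C "$work/remote" rev-parse HEAD)
echo "main:   $main_head"
echo "remote: $remote_head"
```

Later shipments can be limited to a revision range so that each bundle only carries the new commits plus whatever overlap is wanted; `git fetch` from the bundle file then brings them in at the receiving site.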


You just run the projects in the same way but add the courier step for 
shipping the CD, or some password protected archive as per your security 
needs.


Everything should be just fine (more or less) and somehow it will 
just work. ;-)


--
Philip
https://stackoverflow.com/questions/11792671/how-to-git-bundle-a-complete-repo


[PATCH 0/1] Re: git silently ignores include directive with single quotes

2018-09-24 Thread Philip Oakley
Rather than attacking the problem with code, I decided to simply update
the config file documentation.

As the userbase expands the documentation will need to be more comprehensive
about exclusions and omissions, along with better highlighting for core
areas.

It would be useful if Stas could comment on whether these changes would
have assisted in debugging the faulty config file. 

Philip Oakley (1):
  config doc: highlight the name=value syntax

 Documentation/config.txt | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

-- 
2.17.1.windows.2



[PATCH 1/1] config doc: highlight the name=value syntax

2018-09-24 Thread Philip Oakley
Stas Bekman reported [1] that Git config was not accepting single quotes
around a filename as may have been expected by shell users.

Highlight the 'name = value' syntax with its own heading. Clarify that
single quotes are not special here. Also point to this paragraph in the
'include' section regarding pathnames.

In addition clarify that missing include file paths are not an error, but
rather an implicit 'if found' for include files.

[1] 
https://public-inbox.org/git/ca2b192e-1722-092e-2c54-d79d21a66...@stason.org/

Reported-by: Stas Bekman 
Signed-off-by: Philip Oakley 
---
 Documentation/config.txt | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1264d91fa3..b65fd6138d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -19,8 +19,8 @@ characters and `-`, and must start with an alphabetic character.  Some
 variables may appear multiple times; we say then that the variable is
 multivalued.
 
-Syntax
-~~~~~~
+Config file Syntax
+~~~~~~~~~~~~~~~~~~
 
 The syntax is fairly flexible and permissive; whitespaces are mostly
 ignored.  The '#' and ';' characters begin comments to the end of line,
@@ -56,6 +56,9 @@ syntax, the subsection name is converted to lower-case and is also
 compared case sensitively. These subsection names follow the same
 restrictions as section names.
 
+Variable name/value syntax
+^^
+
 All the other lines (and the remainder of the line after the section
 header) are recognized as setting variables, in the form
 'name = value' (or just 'name', which is a short-hand to say that
@@ -69,7 +72,8 @@ stripped.  Leading whitespaces after 'name =', the remainder of the
 line after the first comment character '#' or ';', and trailing
 whitespaces of the line are discarded unless they are enclosed in
 double quotes.  Internal whitespaces within the value are retained
-verbatim.
+verbatim. Single quotes are not special and form part of the
+variable's value.
 
 Inside double quotes, double quote `"` and backslash `\` characters
 must be escaped: use `\"` for `"` and `\\` for `\`.
@@ -89,10 +93,14 @@ each other with the exception that `includeIf` sections may be ignored
 if their condition does not evaluate to true; see "Conditional includes"
 below.
 
+Both the `include` and `includeIf` sections implicitly apply an 'if found'
+condition to the given path names.
+
 You can include a config file from another by setting the special
 `include.path` (or `includeIf.*.path`) variable to the name of the file
 to be included. The variable takes a pathname as its value, and is
-subject to tilde expansion. These variables can be given multiple times.
+subject to tilde expansion and the value syntax detailed above.
+These variables can be given multiple times.
 
 The contents of the included file are inserted immediately, as if they
 had been found at the location of the include directive. If the value of the
-- 
2.17.1.windows.2
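
The two behaviours this patch documents can be checked from the command line. The following is a small sketch, not part of the patch: the section name `demo`, its keys, and the file names are hypothetical, and a scratch config file is used so no real configuration is touched.

```shell
cfg=$(mktemp)

# Store one value wrapped in shell-style single quotes and one without.
git config --file "$cfg" demo.quoted "'~/extra.cfg'"
git config --file "$cfg" demo.plain "~/extra.cfg"

# The single quotes come back as part of the value: they are not special.
quoted=$(git config --file "$cfg" --get demo.quoted)
plain=$(git config --file "$cfg" --get demo.plain)
printf 'quoted: %s\nplain:  %s\n' "$quoted" "$plain"

# Point include.path at a file that does not exist. With includes enabled,
# the lookup still succeeds: the missing file is skipped without an error.
git config --file "$cfg" include.path "$cfg.does-not-exist"
still=$(git config --file "$cfg" --includes --get demo.plain)
printf 'after missing include: %s\n' "$still"
```

A quoted `include.path` value therefore names a file whose path literally starts with a quote character, which normally does not exist, and the include is then skipped silently. That combination is what made the reported config file hard to debug.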



Re: Receiving console output from GIT 10mins after abort/termination?

2018-07-22 Thread Philip Oakley

From: "Frank Wolf" 
Sent: Wednesday, July 18, 2018 7:38 AM

Hi @ll,

I hope I'm posting to the right group (not sure if it's Windows related) 
but I've got

a weird problem using GIT:

By accident I've tried to push a repository (containing an already
commited but not yet pushed submodule reference).
This fails immediately with an error of course BUT

after 10 mins I get an output on the console though the command exited!?
(... $Received disconnect from : User session has timed out 
idling after 600 ms)


Does anyone have an explanation why I still get an output after the 
command was aborted?


/Frank

I think this is a Windows environment issue. I have added a reply to the 
GitHub git-for-windows tracker. 
https://github.com/git-for-windows/git/issues/1762#issuecomment-406851107
I think you may have found a special case so will need extra details from 
you about the setup and hopefully an MVCE.


Philip 



Re: git-gui ignores core.hooksPath

2018-07-10 Thread Philip Oakley

From: "Johannes Schindelin" 

Hi Phillip,

On Wed, 14 Jun 2017, Philipp Gortan wrote:


thanks for following up,

> Indeed. Why don't you give it a try?

Actually, I already did: https://github.com/patthoyts/git-gui/pull/12

You might want to post your analysis and patch there as well...


I wonder what good posting my analysis did, if nothing changed as a
consequence.

FWIW I opened this PR with Git for Windows to fix it properly:

https://github.com/git-for-windows/git/pull/1757

I plan on consolidating all of the PRs at
https://github.com/patthoyts/git-gui, too, and to try to get them into
git.git.




   I guess that means that I just volunteered as interim maintainer
of the git-gui repository. However, I will really act as maintainer, not
as "cleaner upper".


"Curator" is a useful intermediate level concept between active maintenance 
and passive benign neglect, if that term is a help...


--
Philip 



Re: [RFC PATCH 4/6] sequencer.c: avoid empty statements at top level

2018-07-08 Thread Philip Oakley

From: "Eric Sunshine" 
To: "Beat Bolli" 

On Sun, Jul 8, 2018 at 10:44 AM Beat Bolli  wrote:

The marco GIT_PATH_FUNC expands to a complete statement including the


s/marco/macro/


semicolon. Remove two extra trailing semicolons.

Signed-off-by: Beat Bolli 
---
 sequencer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


While you're at it, perhaps it would be a good idea to fix the example
in path.h which teaches the "wrong" way:

/*
 * You can define a static memoized git path like:
 *
 *    static GIT_PATH_FUNC(git_path_foo, "FOO");
 *
 * or use one of the global ones below.
 */



Re: [msysGit] Possible git status problem at case insensitive file system

2018-06-20 Thread Philip Oakley

Hi Frank,

Your system Clock looks to be providing the wrong date for your emails.

The last XP version was 
https://github.com/git-for-windows/git/releases/tag/v2.10.0.windows.1 so you 
may want to upgrade to that. (see FAQs 
https://github.com/git-for-windows/git/wiki/FAQ)


It won't solve the capitalisation problem - that is a Windows FS issue. Git 
assumes case matters, but the FS will fetch directories and branches case 
insensitively.
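
To see which behaviour a given machine has, a quick probe can help. This is only a sketch: the file name `probe` is arbitrary, and the assumption (from the `core.ignoreCase` documentation) is that `git init` performs a similar probe and records `core.ignorecase = true` when the filesystem folds case.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Probe the filesystem directly: create a lower-case name, look it up in upper case.
: > "$repo/probe"
if [ -e "$repo/PROBE" ]; then fs=case-insensitive; else fs=case-sensitive; fi

# Read what git init recorded; the setting may simply be absent.
ignorecase=$(git -C "$repo" config --get core.ignorecase || echo unset)

echo "filesystem: $fs, core.ignorecase: $ignorecase"
```

Knowing the answer does not fix the rename problem quoted below; it only confirms which side of the mismatch the working directory is on.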


Philip


- Original Message - 
From: "Frank Li" 

To: "Git List" ; "msysGit" 
Sent: Monday, August 09, 2010 5:22 AM
Subject: [msysGit] Possible git status problem at case insensitive file 
system




All:
   I use msysgit 1.7.0.2 at windows xp.
   Problem: git status will list tracked directory as untracked dir.
   Duplicate:
   1. mkdir test, cd test
   2. git init-db
   3. mkdir d, cd d
   4. touch a.c
   5. git add a.c
   6. git commit -a -m "test"
   7. cd ..
   8.  mv d d1
   9.  mv d1 D
  10. git status


# On branch master
# Untracked files:
#   (use "git add ..." to include in what will be committed)
#
#   D/
nothing added to commit but untracked files present (use "git add" to 
track)


   D/ should be same as d/ at case insensitive file system.
   D/ should not listed by git status.

best regards
Frank Li





Re: GDPR compliance best practices?

2018-06-09 Thread Philip Oakley

From: "Theodore Y. Ts'o" 
Sent: Friday, June 08, 2018 3:53 AM

On Fri, Jun 08, 2018 at 01:21:29AM +0200, Peter Backes wrote:

On Thu, Jun 07, 2018 at 03:38:49PM -0700, David Lang wrote:
> > Again: The GDPR certainly allows you to keep a proof of copyright
> > privately if you have it. However, it does not allow you to keep
> > publishing it if someone exercises his right to be forgotten.
> someone is granting the world the right to use the code and you are
> claiming
> that the evidence that they have granted this right is illegal to have?

Hell no! Please read what I wrote:

- "allows you to keep a proof ... privately"
- "However, it does not allow you to keep publishing it"


The problem is you've left undefined who is "you"?  With an open
source project, anyone who has contributed to the open source project has
a copyright interest.  That hobbyist in Germany who submitted a patch?
They have a copyright interest.  That US Company based in Redmond,
Washington?  They own a copyright interest.  Huawei in China?  They
have a copyright interest.

So there is no "privately".  And "you" numbers in the thousands and
thousands of copyright holders of portions of the open source code.

And of course, that's the other thing you seem to fundamentally not
understand about how git works.  Every developer in the world working
on that open source project has their own copy.  There is
fundamentally no way that you can expunge that information from every
single git repository in the world.  You can remove a git note from a
single repository.  But that doesn't affect my copy of the repository
on my laptop.  And if I push that repository to my server, that git note
will be out there for the whole world to see.

So someone could *try* sending a public request to the entire world,
saying, "I am a European and I demand that you disassociate commit
DEADBEF12345 from my name".  They could try serving legal papers on
everyone.  But at this point, it's going to trigger something called
the "Streisand Effect".  If you haven't heard of it, I suggest you
look it up:

http://mentalfloss.com/article/67299/how-barbra-streisand-inspired-streisand-effect

Regards,

- Ted


Hi Ted,

I just want to remind folks that Gmane disappeared as a regular list because
of a legal challenge, and the SCO v IBM Unix court case keeps rumbling on, so
the legal case for:
a) holding the 'personal git meta data', and
b) disclosing (publishing) 'personal git meta data'
under various copyright and other legal issue scenarios relative to GDPR is
worth clarifying.

I'm of the opinion that the GPL should be able to allow both holding and
disclosing that data, though it may need a few more clarifications as to
verifying that the author is 'correct' (e.g. not a child) and if a DCO is
needed, etc.

We are already looking at a change to the hash, so the technical challenge
could be addressed, but may create too many logical conflicts if 'right to
be forgotten' is allowed (one hash change is enough;-)

Philip



Re: GDPR compliance best practices?

2018-06-07 Thread Philip Oakley

Hi Peter, David,

I thought that the legal notice (aka 'disclaimer') was pretty reasonable.

Some of Peter's fine distinctions may be technically valid, but that does 
not stop there being legal grounds. The proof of copyright is one such legal 
ground.


Unfortunately once one gets into legal nitpicking the wording becomes 
tortuous and helps no-one.


If one starts from an absolute "right to be forgotten" perspective one can 
demand all evidence of wrongdoing, or authority to do something, be 
forgotten. The GDPR provides a right to retain such evidence.


I'll try and comment where I see the distinctions to be.

From: "Peter Backes" 


Hi David,

thanks for your input on the issue.


LEGAL GDPR NOTICE:
According to the European data protection laws (GDPR), we would like to 
make you

aware that contributing to rsyslog via git will permanently store the
name and email address you provide as well as the actual commit and the
time and date you made it inside git's version history.


This is simply an information statement


  This is inevitable,

because it is a main feature git.


The "inevitable" word creates a point of argument within the GDPR. Removing 
the word (and 'because/main') brings the sentence back to be an informative 
statement without a GDPR claim.


As we can, see, rsyslog tries to solve the issue by the already
discussed legal "technology" of disclaimers (which is certainly not
accepted as state of the art technology by the GDPR). In essence, they
are giving excuses for why they are not honoring the right to be
forgotten.

Disclaimers do not work. They have no legal effect, they are placebos.

The GDPR does not accept such excuses. If it would, companies could
arbitrarily design their data storage such as to make it "the main
feature" to not honor the right to be forgotten and/or other GDPR
rights. It is obvious that this cannot work, as it would completely
undermine those rights.

The GDPR honors technology as a means to protect the individual's
rights, not as a means to subvert them.


If you are concerned about your
privacy, we strongly recommend to use

--author "anonymous "

together with your commit.


The [key] missing information here is whether rsyslog has a DCO (Developer 
Certificate of Origin) and what that contains.


The git.git DCO is here 
https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L304-L349


This will also help discriminate between the "name" part and the  
identifier, as both could be separately anonymised (given the right DCO). 
Thus it may be that the name is recorded as "anonymous", but with a 
 that bridges the legal evidence/right to be forgotten 
bridge.


This can only be a solution if the project rejects any commits which
are not anonymous.


However, we have valid reasons why we cannot remove that information
later on. The reasons are:

* this would break git history and make future merges unworkable


This is not a valid excuse (see above).


Within the GDPR, that is correct. It (breaking history validation), of 
itself, should not be the reason.



 The technology has to be
designed or applied in such a way that the individuals rights are
honored, not the other way around.

In absence of other means, the project has to rewrite history if it
gets a valid request by someone exercising his right to be forgotten,
even if that causes a lot of hassle for everyone.

* the rsyslog projects has legitimate interest to keep a permanent record 
of the

  contributor identity, once given, for
  - copyright verification
  - being able to provide proof should a malicious commit be made


True, but that doesn't justify publishing that information and keeping
it published even when someone exercises his right to be forgotten.


Publishing (the meta data) is *distinct* from having it.

However publishing the content and its legal copyright is also associated 
with identifying the copyright holder (who has released it). This can be the 
uid if they hide behind a legal entity. This creates the catch 22 scenario. 
You either start off public and stay public, or you start off private and 
stay there.


Whether the rsyslog folk want to accept copyrighted work without appropriate 
legal release (who guards the guards, what's their badge number?) is part of 
the same information requirement.


Malicious intent makes the submission (commit) part of a legal evidence one 
needs to retain, so is supported by GDPR.




In that case, "legitimate interest" is not enough. There need to be
"overriding legitimate grounds". I don't see them here.

Please also note that your commit is public and as such will potentially 
be
processed by many third-parties. Git's distributed nature makes it 
impossible
to track where exactly your commit, and thus your personal data, will be 
stored
and be processed. If you would not like to accept this risk, please do 
either

commit anonymously or refrain from contributing to the rsyslog project.


The onward publishing and release 

Re: GDPR compliance best practices?

2018-06-04 Thread Philip Oakley

Hi Peter,
(lost the cc's)

From: "Peter Backes" 

On Sun, Jun 03, 2018 at 11:28:43PM +0100, Philip Oakley wrote:

It is here that Article 6 kicks in as to whether the 'organisation' can
retain the data and continue to use it.


Article 6 is not about continuing to use data. Article 6 is about
having and even obtaining it in the first place.


Correct, and that is the part I was referring to. Recipients of the
particular meta data require it for the licencing purpose. Thus they can
continue to have (and 'need') that data. It is that 'other side of the
fence' view I mentioned.



Article 17 and article 21 are about continuing to use data.


For an open source project with an open source licence then an implicit
DCO applies for the meta data. It is the legal basis for the release.


Neither article 6 nor 17 or 21 have anything remotely like an "implicit
DCO" as a legitimization for publishing employee data.


I was referring to 'implicit' in a reverse direction, that is, the DCO
supports the legal basis to have and hold the data. The express licence
terms in the various open source licences give the permission, and become
one of these legally conflicting aspects.



The GDPR is very explicit about implicit stuff never being a basis for
consent, if you want to imply that is your basis. And consent can be
withdrawn at any time anyway.

An open source license has nothing whatsoever to do with the question
of version control metadata. A public version control system is not
necessary to publish open source software.


> - copyright is about distributing the program, not about distributing
> version control metadata.
It is specifically about giving that right to copy by Jane Doe (but git
gives no other information other than that supposedly globally unique
'author email').


I don't get what you are saying. As I said, a public version control
system is not necessary to publish open source software. The two things
may be intimately related in practice, but not in theory.


Such is the law. It's the practice that is legal/illegal, decided in court
(if it gets there)




> - Being named is a right, not an obligation of the author. Hence, if
> the author doesn't want his name published, the company doesn't have
> legitimate grounds based in copyright for doing it anyway, against his
> or her will.
Git for Open Source is about open licencing by name. I'd agree that a
closed
corporate licence stays closed, but not forgotten.


Again I don't get what you are saying. The author has a right to be
named as the author, not an obligation. This has nothing whatsoever to
do with the question of Open Source vs. closed corporate licenses.



The question is which clause is being used to justify an action. Those
corporate organisations want a legal basis for holding data, not a voluntary
permission (because folk may try and rescind that permission... ). Those in
open source want to ensure that their licence is a legal basis for other
folk to have copies, and that folk can show they have that permission.

Those with a personal data view, will focus on the hope that they can remove
permission, especially for companies that are doing things they find
unacceptable, and maybe 'illegal' or unethical. The GDPR attempts to balance
the different set of expectations, and the overlaps will need to be
negotiated. Different nations (and individuals) have different perceptions
as to what is normal and reasonable thus focus on different aspects, not
appreciating the Competing Values that are present in the different
Frameworks of their Weltanschauung.

If a closed source corporate does publish their closed data, they have real
internal problems anyway regarding that contradiction!


> Let's be honest: We do not know what legitimization exactly in each
> specific case the git metadata is being distributed under.

We should know, already. A specific licence [or limit] should be in
place.
We don't really want to have to let a court decide ;-)


It is insufficient to have a license for distributing the program. The
license is not a GDPR legitimization for git metadata. Distributing the
program can be done without distributing the author's identity as part
of the metadata of his commits.


The law is never decided by technical means, unfortunately.


It is. The GDPR refers to the state of the art of technology without
defining it. Thus, technical means are very important in the GDPR. This
may be something new for lawyers. If technology changes tomorrow, even
without anything else changing, you may be breaking the GDPR by this
simple fact tomorrow, while not breaking it today.



They will still argue about what is the state of the art, and that if the
art is hidden in some lab, then it's not available to meet the criteria.


Again: Technology is very important in the GDPR.


We know quantum computing can crack the codes, but when does it become
the state of the art. SHA1 has been 'cracked' once in one 

Re: GDPR compliance best practices?

2018-06-03 Thread Philip Oakley

From: "Peter Backes" 

On Sun, Jun 03, 2018 at 04:28:31PM +0100, Philip Oakley wrote:

In most Git cases that legal/legitimate purpose is the copyright licence,
and/or corporate employment. That is, Jane wrote it, hence X has a legal
rights of use, and we need to have a record of that (Jane wrote it) as
evidence of that (I'm X, I can use it) right. That would mean that Jane
cannot just ask to have that record removed and expect it to be removed.


Re corporate employment:

For sure nobody would dare to question that a company has a right to
keep an internal record that Jane wrote it.

The issue is publishing that information. This is an entirely different
story.


It is here that Article 6 kicks in as to whether the 'organisation' can 
retain the data and continue to use it.

https://gdpr-info.eu/art-6-gdpr/
https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/
https://www.lawscot.org.uk/news-and-events/news/gdpr-legal-basis-and-why-it-matters/

For an open source project with an open source licence then an implicit DCO 
applies for the meta data. It is the legal basis for the release.


If a corporate project has a closed source project, then yes, open 
publishing of that personal data within a repo's meta data would be 
incorrect, even though the internal repo would be kept.





I already stressed that from the very beginning.

Re copyright license:

No, a copyright license does not provide a legitimization.

- copyright is about distributing the program, not about distributing
version control metadata.


It is specifically about giving that right to copy by Jane Doe (but git gives 
no other information other than that supposedly globally unique 'author 
email').




- Being named is a right, not an obligation of the author. Hence, if
the author doesn't want his name published, the company doesn't have
legitimate grounds based in copyright for doing it anyway, against his
or her will.


Git for Open Source is about open licencing by name. I'd agree that a closed 
corporate licence stays closed, but not forgotten.





From a personal view, many folk want it to be that corporates (and open
source organisations) should hold no personal information with having
explicit permission that can then be withdrawn, with deletion to follow.
However that 'legal' clause does [generally] win.


Let's be honest: We do not know what legitimization exactly in each
specific case the git metadata is being distributed under.


We should know, already. A specific licence [or limit] should be in place. 
We don't really want to have to let a court decide ;-)




It may be copyright, it may be employment, but it may also be revocable
consent. This is, we cannot safely assume that no git user will ever
have to deal with a legitimate request based on the right to be
forgotten.



The law is never decided by technical means, unfortunately. Regular git 
users should have no issues - they just need to point their finger at the 
responsible authority. (Beware though, of the one-way trap door whereby the 
user's mistakes can become the problem for the responsible authority!)



In the git.git case (and linux.git) there is the DCO (to back up the GPL2)
as an explicit requirement/certification that puts the information into the
legal evidence category. IIUC almost all copyright ends up with a similar
evidential trail for the meta data.


This makes things more complicated, not less. You have yet more meta
data to cope with, yet more opportunities to be bitten by the right to
be forgotten. Since I proposed a list of metadata where each entry can
be anonymized independently of each other, it would be able to deal
with this perfectly.


The DCO/GPL2 are the legitimate data record that recipients should have for 
their copy. There is no right to be forgotten at that point.




The more likely problem is if the content of the repo, rather than the meta
data, is subject to GDPR, and that could easily ruin any storage method.
Being able to mark an object as  would help here(*).


My proposal supports any part of the commit, including the contents of
individual files, as eraseable, yet verifiable data.


Also remember that most EU legislation is 'intent' based, rather than
'letter of', for the style of legal arguments (which is where some of the UK
Brexit misunderstandings come from), so it is more than possible to get into
the situation where an action is both mandated and illegal at the same time,
so plenty of snake oil salesmen continue to sell magic fixes according to the
customer's local biases.


This may be true. I am not trying to sell snake oil, however. To have
erasure and verifiability at the same time is a highly generic feature
that may be desirable to have for a multitude of reasons, including but
not limited to legal ones like GDPR and copyright violations.


I do not believe Git has anything to worry about that wasn't already an
is

Re: GDPR compliance best practices?

2018-06-03 Thread Philip Oakley

correcting a negative /with/without/ and inserting a comma.
- Original Message - 
From: "Philip Oakley" 

[snip]


From a personal view, many folk want it to be that corporates (and open
source organisations) should hold no personal information with having

s/with/without/


explicit permission that can then be withdrawn, with deletion to follow.
s/permission/permission,/  


However that 'legal' clause does [generally] win.





Re: git glob pattern in .gitignore and git command

2018-06-03 Thread Philip Oakley

Hi Yubin,

From: "Yubin Ruan" 

To ignore all .js files under a directory `lib', I can use "lib/**/js" to
match them. But when using a git command such as "git add", using
"git add lib/\*.js" is sufficient. Why is there this difference in glob mode?

I have heard that there are many different glob modes out there (e.g., bash
has many different glob modes). So, which classes of glob mode do these two
belong to? Do they have a name?



Is this a question about `git add` being able to add a file that is marked
as being ignored in the .gitignore file? [Yes it can.]

Or, is this simply about the many different globbing capabilities of one's
shell, and of Git?

The double asterisk (star) is specific/local to Git. It is described in the
various commands that use it, especially the gitignore man page `git help
ignore` or  https://git-scm.com/docs/gitignore.
"Two consecutive asterisks ("**") in patterns matched against full pathname
may have special meaning: ... "

The single asterisk does have two modes depending on how you quote it. It is
described in the command line interface (cli) man page `git help cli` or
https://git-scm.com/docs/gitcli.
"Many commands allow wildcards in paths, but you need to protect them from
getting globbed by the shell. These two mean different things: ... "

A common proper name for these asterisk style characters is "wildcards".
Try 'bash wildcards' or 'linux wildcards' in your favourite search engine.
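
As a rough illustration of the two quoting modes, here is a small sketch
using a throwaway repository. The directory and file names are made up for
the example; it only assumes a POSIX-like shell without any recursive
globbing option switched on.

```shell
# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p lib/sub
touch lib/a.js lib/sub/b.js

# Unquoted: the shell expands lib/*.js before Git runs, so Git is only
# handed lib/a.js.
git add lib/*.js
shell_glob=$(git ls-files)

# Quoted: Git receives the literal pattern and does the matching itself;
# in a pathspec the '*' may also match across '/'.
git add 'lib/*.js'
git_glob=$(git ls-files)

printf 'shell glob:\n%s\n\nquoted for git:\n%s\n' "$shell_glob" "$git_glob"
```

The first listing should contain only lib/a.js, the second should also
pick up lib/sub/b.js, which is the difference the gitcli page warns about.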

--
Philip



Re: [PATCH v2] t/perf/run: Use proper "--get-regexp", not "--get-regex"

2018-06-03 Thread Philip Oakley

From: "Robert P. J. Day" 

On Sun, 3 Jun 2018, Thomas Gummerer wrote:


> Subject: [PATCH v2] t/perf/run: Use proper "--get-regexp", not

micronit: we prefer starting with a lowercase letter after the "area:"
prefix in commit messages.   Junio can probably fix that while
queuing, so no need to resend.


 argh, i actually know that, i just screwed up.


On 06/03, Robert P. J. Day wrote:
>
> Even though "--get-regex" appears to work with "git config", the
> clear standard is to spell out the action in full.

--get-regex works as the parse-option API allows abbreviations of the
full option to be specified as long as the abbreviation is
unambiguous.  I don't know if this is documented anywhere other than
'Documentation/technical/api-parse-options.txt' though.


it's in `git help cli`:

many commands allow a long option --option to be abbreviated only to their 
unique prefix (e.g. if there is no other option whose name begins with opt, 
you may be able to spell --opt to invoke the --option flag), but you should 
fully spell them out when writing your scripts;


It's a worthwhile read, even if the man page isn't flagged up that often.
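
A quick sketch of that behaviour, in a scratch repository with two sample
config values (the values and the temporary directory are assumptions for
the demo, not anything from the patch):

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config perf.repeatcount 3      # sample values for the demo only
git config perf.dirs "a b"

# Fully spelled out: the form to use in scripts.
full=$(git config --get-regexp '^perf\.')

# Unique prefix: accepted as long as no other option of the command
# starts with the same letters.
abbrev=$(git config --get-regex '^perf\.')

printf '%s\n' "$full"
```

Both spellings should return the same two lines today; the abbreviated one
would stop working, or become ambiguous, if `git config` ever grew another
option beginning with "get-regex", which is why scripts should spell it out.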



> Signed-off-by: Robert P. J. Day 
>
> ---

It took me a bit to figure out why there is a v2, and what changed
between the versions.  This space after the '---' would be a good
place to describe that to help reviewers.

For others that are curious, it seems like the word "clear" was added
in the commit message.

The change itself looks good to me.


 the actual rationale for v2 was in the subject, i originally put
just "get-regex" rather than "--get-regex"; i resubmitted for
consistency.


--
Philip 



Re: GDPR compliance best practices?

2018-06-03 Thread Philip Oakley

From: "Peter Backes" 

On Sun, Jun 03, 2018 at 02:59:26PM +0200, Ævar Arnfjörð Bjarmason wrote:

I'm not trying to be selfish, I'm just trying to counter your literal
reading of the law with a comment of "it'll depend".

Just like there's a law against public urination in many places, but
this is applied very differently to someone taking a piss in front of
parliament v.s. someone taking a piss in the forest on a hike, even
though the law itself usually makes no distinction about the two.


We have huge companies using git now. This is not the tool used by a
few kernel hackers anymore.


In this example once you'd delete the UUID ref you don't have the UUID
-> author mapping anymore (and b.t.w. that could be a many to one
mapping).


It is not relevant whether you have that mapping or not, it is enough
that with additional information you could obtain it. For example, say,
you have 5000 commits with the same UUID. Now you delete the mapping.
But your friend still has it on his local copy. Now your friend
merely needs to tell you who is behind that UUID and instantly you can
associate all 5000 commits with that person again.

The GDPR is very explicit about this, see recital 26. It says that
pseudonymization is not enough, you need anonymization if you want to
be free from regulation.

In addition, and in contrast to my proposal, your solution doesn't
allow verification of the author field.


I think again that this is taking too much of a literalist view. The
intent of that policy is to ensure that companies like Google can't just
close down their EU offices and weasel out of compliance by saying "we're
just doing business from the US, it doesn't apply to us".

It will not be used against anyone who's taking every reasonable
precaution from doing business with EU customers.


I think you are underestimating the political intention behind the
GDPR. It has kind of an imperialist goal, to set international
standards, to enforce them against foreign companies and to pressure
other nations to establish the same standards.

If I would read the GDPR in a literal sense, I would in fact come to
the same conclusion as you: It's about companies doing substantial
business in the EU. But the GDPR is carefully constructed in such a way
that it is hard not to be affected by the GDPR in one way or another,
and the obvious way to cope with that risk is to more or less obey the
GDPR rules even if one does not have substantial business interests in
the EU.


What do you imagine that this is going to be like? That some EU citizen
is going to walk into a small business in South America one day, which
somehow is violating the GDPR, and when that business owner goes on
holiday to the EU they're going to get detained? Not even the US policy
against Cuba is anywhere remotely close to that.


Well not if he's locally interacting with that business, a situation
which I am sure is not regulated by the GDPR.

However, if a large US website accepts users from the EU and uses the
data gathered in conflict with the GDPR, perhaps selling it for use in
political campaigns, and it gets several fines for this by EU
authorities but ignores them and doesn't pay them, and the CEO one day
takes a flight to Frankfurt to continue by train to Switzerland to get
some cash from his bank account, then he will most likely not reach
Swiss territory.



--
Having been through corporate training and read up a number of the
conflicting views in the press, one of the issues is that there are two
viewpoints, one from each side of the fence.


From a corporate/organisation viewpoint, it is best if every case of holding
user information is for a legitimate purpose, which then means the company
has 'protection' from requests for removal because the data *is* held
legally/legitimately (which includes acting as evidence).

In most Git cases that legal/legitimate purpose is the copyright licence,
and/or corporate employment. That is, Jane wrote it, hence X has a legal
rights of use, and we need to have a record of that (Jane wrote it) as
evidence of that (I'm X, I can use it) right. That would mean that Jane
cannot just ask to have that record removed and expect it to be removed.


From a personal view, many folk want it to be that corporates (and open
source organisations) should hold no personal information with having
explicit permission that can then be withdrawn, with deletion to follow.
However that 'legal' clause does [generally] win.

In the git.git case (and linux.git) there is the DCO (to back up the GPL2)
as an explicit requirement/certification that puts the information into the
legal evidence category. IIUC almost all copyright ends up with a similar
evidential trail for the meta data.


The more likely problem is if the content of the repo, rather than the meta
data, is subject to GDPR, and that could easily ruin any storage method.
Being able to mark an object as  would help here(*).

Also remember that most EU legislation is 'intent' based, 

Re: git rebase -i --exec and changing directory

2018-05-27 Thread Philip Oakley

Hi Ondrej, Phillip,

From: "Phillip Wood" <phillip.w...@talktalk.net>

Hi Ondrej

On 27/05/18 13:53, Ondrej Mosnáček wrote:


Hi Philip,

2018-05-27 14:28 GMT+02:00 Philip Oakley <philipoak...@iee.org>:
You may need to give a bit more background of things that seem obvious to
you.
So where is the src directory you are cd'ing to relative to the
directory/repository you are creating?


It is located in the top-level directory of the working tree (in the
same directory that .git is in).

 From git-rebase(1):

 The "exec" command launches the command in a shell (the one
 specified in $SHELL, or the default shell if $SHELL is not set), so
 you can use shell features (like "cd", ">", ";" ...). The command is
 run from the root of the working tree.

So I need to run 'cd src' if I want to run a command in there
(regardless of the working directory of the git rebase command
itself).


What is [the name of] the directory you are currently in, etc. ?


I don't think that is relevant here. FWIW, when verifying the problem
I ran the reproducer from my original message in a directory whose
path did not contain any spaces or special characters.

Did you try to run the reproducing commands I posted? Did you get a
different result? You should see the following in the output of 'cd
dir && git status':


At the time, I hadn't run the command. I was more interested in 
understanding the problem setup, as understanding often brings 
enlightenment.


I was just starting to do my own setup and saw Phillip had responded, which 
prompted me to think it could be that there was no tty attached to the exec, 
so output wasn't being seen (or something like that).




I tried your recipe and got the same result as you. However I think it 
could be a problem with 'git status' rather than 'git rebase --exec'. If I 
run your recipe in /tmp/a and do


cd dir
GIT_DIR=/tmp/a/.git git status

I get the same result as when running 'git status' from 'git 
rebase --exec'. So I think the problem might have something to do with 
GIT_DIR being set in the environment when 'git status' is run.
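
That effect can be reproduced without rebase at all. The following is only
a sketch in a scratch repository (paths, file names and the demo identity
are made up); it relies on the documented rule that when GIT_DIR is set and
no work tree is specified, Git takes the current directory as the top of
the working tree.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
touch a
mkdir dir
touch dir/x
git add --all
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m commit1

cd dir || exit 1

# Normal discovery: Git walks up to find .git and the work tree root.
plain=$(git status --porcelain)

# With GIT_DIR set and no work tree specified, the current directory
# ("dir") is taken as the top of the working tree.
forced=$(GIT_DIR="$repo/.git" git status --porcelain)

printf 'plain: [%s]\nforced:\n%s\n' "$plain" "$forced"
```

The plain run should report nothing, while the forced run should list the
tracked paths as deleted and `x` as untracked, matching the report in this
thread.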




I too got the same results.
I also tried duplicating the exec line and placing it before the pick line, 
just to check it wasn't an issue about termination. Same result.



Best Wishes

Phillip



[...]
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        deleted:    a
        deleted:    b
        deleted:    dir/x
        deleted:    reproduce.sh

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        x
[...]

When I drop the 'cd dir && ' from before 'git status', the output is
as expected:

You are currently editing a commit while rebasing branch 'master' on '19765db'.

  (use "git commit --amend" to amend the current commit)
  (use "git rebase --continue" once you are satisfied with your changes)

nothing to commit, working tree clean


So I extended the command to be exec'd to `cd dir && ls && git status`, 
again with duplication of the exec, which then gives a bit more.


Finally I extended the status to redirect its output to a file, again 
duplicated.

--
Philip@PhilipOakley MINGW32 /usr/src/mosnacek (master)

$ git rebase -i --exec 'cd dir && ls && git status >stat.txt' base

Executing: cd dir && ls && git status >stat0.txt

x

Executing: cd dir && ls && git status >stat.txt

stat0.txt x

Successfully rebased and updated refs/heads/master.

--
the stat0, stat files can then be investigated.

Summary: status is, I think, being clever and dropping the verbiage when not 
directly attached to the terminal. (or it is being intelligent and adding a 
lot more status details just because it _is_ within the rebase..)






Philip
--

From: "Ondrej Mosnáček" <omosna...@gmail.com>
Bump? Has anyone had time to look at this?

2018-05-19 18:38 GMT+02:00 Ondrej Mosnáček <omosna...@gmail.com>:


Hello,

I am trying to run a script to edit multiple commits using 'git rebase
-i --exec ...' and I ran into a strange behavior when I run 'cd'
inside the --exec command and subsequently run a git command. For
example, if the command is 'cd src && git status', then git status
reports as if all files in the repository are deleted.


What does that particular report look like? I see no special report of 
deletions, or additions.





Example command sequence to reproduce the problem:

 # Setup:
 touch a
 mkdir dir
 touch dir/x

 git init .
 git add --all
 git commit -m commit1
 git tag base
 touch b
 git add --all
 git commit -m commit2

 # Here we go:
 git rebase -i --exec 'cd dir && git status' base

 # Spawning a s

Re: git rebase -i --exec and changing directory

2018-05-27 Thread Philip Oakley
You may need to give a bit more background of things that seem obvious to 
you.
So where is the src directory you are cd'ing to relative to the 
directory/repository you are creating?

What is [the name of] the directory you are currently in, etc. ?

Philip
--

From: "Ondrej Mosnáček" 
Bump? Has anyone had time to look at this?

2018-05-19 18:38 GMT+02:00 Ondrej Mosnáček :

Hello,

I am trying to run a script to edit multiple commits using 'git rebase
-i --exec ...' and I ran into a strange behavior when I run 'cd'
inside the --exec command and subsequently run a git command. For
example, if the command is 'cd src && git status', then git status
reports as if all files in the repository are deleted.

Example command sequence to reproduce the problem:

# Setup:
touch a
mkdir dir
touch dir/x

git init .
git add --all
git commit -m commit1
git tag base
touch b
git add --all
git commit -m commit2

# Here we go:
git rebase -i --exec 'cd dir && git status' base

# Spawning a sub-shell doesn't help:
git rebase -i --exec '(cd dir && git status)' base

Is this expected behavior or did I find a bug? Is there any
workaround, other than cd'ing to the toplevel directory every time I
want to run a git command when I am inside a subdirectory?

$ git --version
git version 2.17.0

Thanks,

Ondrej Mosnacek




Re: Troubles with picking an editor during Git update

2018-05-17 Thread Philip Oakley, CEng MIET

Hi Bartosz,

From: "Bartosz Konikiewicz" 

Hi there!

I had an issue with Git installer for Windows while trying to update


The Git for Windows package is managed, via https://gitforwindows.org/, as a 
separate application, based on Git.



my instance of the software. My previous version was "git version
2.15.1.windows.2", while my operating system prompted me to upgrade to
"2.17.0". The installer asked me to "choose the default editor for
Git". One of these options was Notepad++ - my editor of choice. Vim
was selected by default and I've picked Notepad++ from a drop-down
list. As soon as I did it, a "next" button greyed out. When I moved
back to the previous step and then forward to the editor choice, the
"Notepad++" option was still highlighted, and the "next" button wasn't
greyed out anymore - it was active and I was able to press it and
continue installation.

Steps to reproduce:

1. Have Notepad++ 6.6.9 installed on Windows 10 64-bit 10.0.17134 Build 
17134.

2. Use an installer for version 2.17.0 to upgrade from version 2.15.1.
3. On an editor selection screen, choose Notepad++ instead of Vim. You
should be unable to continue installation because of the "next" button
being disabled.
4. Press "prev".
5. Press "next". Notepad++ should be still highlighted, and the "next"
button should be active, allowing to continue installation.

I find it to be a crafty trick to make me use Vim. I have considered
it for a good moment.

The best place to report the issue, and perhaps contribute, is via the 'GfW' 
Issue tracker https://github.com/git-for-windows/git/issues.


Building Git for Windows via the SDK has become even easier with recent 
updates, so it should be relatively easy to spot the offending line in the 
installer and perhaps even propose a PR (Pull Request) to fix the issue.


regards
Philip



Re: Re: [PATCH 1/3] checkout.c: add strict usage of -- before file_path

2018-05-13 Thread Philip Oakley

From: "Dannier Castro L" 

On 13/05/2018 00:03, Duy Nguyen wrote:


On Sun, May 13, 2018 at 4:23 AM, Dannier Castro L 
wrote:

For GIT new users, this complicated versatility of  could
be very confused, also considering that actually the flag '--' is
completely useless (added or not, there is not any difference for
this command), when the same program messages promote the use of
this flag.

I would like an option to revert back to current behavior. I'm not a
new user. I know what I'm doing. Please don't make me type more.

And '--" is not completely useless. If you have  and 
with the same name, you have to give "--" to tell git what the
first argument means.


Sure Duy, you're right, probably "completely useless" is not the correct
definition, even according with the code I didn't find another useful
case that is not file and branch with the same name. The program is able
to know the type using only the name, turning "--" into an extra flag in
most of cases.

I think this solution could please you more: By default the configuration
is the current, but the user has the chance to set this, for example:

git config --global flag.strictdashdash true

Thank you so much for the spent time reviewing the patch, this is my
first one in this repository.


It may be that after review you could suggest an appropriate rewording or
re-arrangement of the man page to better highlight the proper use of the
'--' disambiguation.

Perhaps frame the man page as if it is normal for the '--' to be included
within command lines (which should be the case for scripts anyway!).

Then indicate that it isn't mandatory if the file/branch/dwim distinction is
obvious. i.e. make sure that the man page is educational as well as being a
reference that may be misunderstood.

Those well versed in the Git cli will normally omit the '--', only using it
where necessary, however for new users/readers of the man page, it may be
better to be more explicit and avoid future misunderstandings.
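
For the man page, a short worked example along these lines might help. It
is only a sketch: the name `topic` is made up, and is deliberately used for
both a branch and a file so that the role of '--' is visible.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
echo original > topic              # a file that shares its name with a branch
git add topic
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m init
start=$(git symbolic-ref --short HEAD)
git branch topic

# After "--" the argument can only be a path: restore the file from the index.
echo edited > topic
git checkout -- topic
content=$(cat topic)
on_branch=$(git symbolic-ref --short HEAD)

# Before "--" the argument can only be a revision: switch to the branch.
git checkout -q topic --
switched=$(git symbolic-ref --short HEAD)

printf 'file: %s, stayed on: %s, then switched to: %s\n' \
    "$content" "$on_branch" "$switched"
```

Written this way, the reader sees '--' as the normal separator and its
omission as the shortcut, rather than the other way round.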

--
Philip



Re: [PATCH v6 11/13] command-list.txt: documentation and guide line

2018-05-12 Thread Philip Oakley

Hi Duy,

From: "Nguyễn Thái Ngọc Duy"  : Monday, May 07, 2018

This is intended to help anybody who needs to update command-list.txt.
It gives a brief introduction of all attributes a command can take.
---
command-list.txt | 44 
1 file changed, 44 insertions(+)

diff --git a/command-list.txt b/command-list.txt
index 99ddc231c1..9c70c69193 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -1,3 +1,47 @@
+# Command classification list
+# ---
+# All supported commands, builtin or external, must be described in
+# here. This info is used to list commands in various places. Each
+# command is on one line followed by one or more attributes.
+#
+# The first attribute group is mandatory and indicates the command
+# type. This group includes:
+#
+#   mainporcelain
+#   ancillarymanipulators
+#   ancillaryinterrogators
+#   foreignscminterface
+#   plumbingmanipulators
+#   plumbinginterrogators
+#   synchingrepositories
+#   synchelpers
+#   purehelpers
+#
+# The type names are self explanatory. But if you want to see what
+# command belongs to what group to get a better picture, have a look
+# at "git" man page, "GIT COMMANDS" section.
+#
+# Commands of type mainporcelain can also optionally have one of these
+# attributes:
+#
+#   init
+#   worktree
+#   info
+#   history
+#   remote
+#
+# These commands are considered "common" and will show up in "git
+# help" output in groups. Uncommon porcelain commands must not
+# specify any of these attributes.
+#
+# "complete" attribute is used to mark that the command should be
+# completable by git-completion.bash. Note that by default,
+# mainporcelain commands are completable so you don't need this
+# attribute.
+#
+# While not true commands, guides are also specified here, which can
+# only have "guide" attribute and nothing else.


While the file is called ~ "Command List", the list is here as a support to
the Help function, and ultimately to the user's reading of the man pages,
including the man(5/7) guides, so I'd view the man page guides as first
class citizens.

Perhaps:
# As part of the Git man page list, the man(5/7) guides are also specified
# here, which can only have "guide" attribute and nothing else.

--
Philip


+#
### command list (do not change this line, also do not change alignment)
# command name  category [category] [category]
git-add mainporcelain   worktree
--
2.17.0.705.g3525833791






Re: [PATCH 11/18] branch-diff: add tests

2018-05-03 Thread Philip Oakley

From: "Johannes Schindelin" 

From: Thomas Rast 

These are essentially lifted from https://github.com/trast/tbdiff, with
light touch-ups to account for the new command name.

Apart from renaming `tbdiff` to `branch-diff`, only one test case needed
to be adjusted: 11 - 'changed message'.

The underlying reason it had to be adjusted is that diff generation is
sometimes ambiguous. In this case, a comment line and an empty line are
added, but it is ambiguous whether they were added after the existing
empty line, or whether an empty line and the comment line are added
*before* the existing emtpy line. And apparently xdiff picks a different


s/emtpy/empty/


option here than Python's difflib.

Signed-off-by: Johannes Schindelin 

[...]
Philip


Re: [PATCH 0/4] subtree: move out of contrib

2018-04-30 Thread Philip Oakley

From: "Ævar Arnfjörð Bjarmason" 

I think at this point git-subtree is widely used enough to move out of
contrib/, maybe others disagree, but patches are always better for
discussion than patch-less ML posts.



Assuming this lands in Git, then there will also need to be a simple follow 
on into Duy's series that is updating the command-list.txt (Message-Id: 
<20180429181844.21325-10-pclo...@gmail.com>). Duy's series also does the 
completions thing IIUC;-).

--
Philip


Ævar Arnfjörð Bjarmason (4):
 git-subtree: move from contrib/subtree/
 subtree: remove support for git version <1.7
 subtree: fix a test failure under GETTEXT_POISON
 i18n: translate the git-subtree command

.gitignore|   1 +
Documentation/git-submodule.txt   |   2 +-
.../subtree => Documentation}/git-subtree.txt |   3 +
Makefile  |   1 +
contrib/subtree/.gitignore|   7 -
contrib/subtree/COPYING   | 339 --
contrib/subtree/INSTALL   |  28 --
contrib/subtree/Makefile  |  97 -
contrib/subtree/README|   8 -
contrib/subtree/t/Makefile|  86 -
contrib/subtree/todo  |  48 ---
.../subtree/git-subtree.sh => git-subtree.sh  | 109 +++---
{contrib/subtree/t => t}/t7900-subtree.sh |  21 +-
13 files changed, 78 insertions(+), 672 deletions(-)
rename {contrib/subtree => Documentation}/git-subtree.txt (99%)
delete mode 100644 contrib/subtree/.gitignore
delete mode 100644 contrib/subtree/COPYING
delete mode 100644 contrib/subtree/INSTALL
delete mode 100644 contrib/subtree/Makefile
delete mode 100644 contrib/subtree/README
delete mode 100644 contrib/subtree/t/Makefile
delete mode 100644 contrib/subtree/todo
rename contrib/subtree/git-subtree.sh => git-subtree.sh (84%)
rename {contrib/subtree/t => t}/t7900-subtree.sh (99%)

--
2.17.0.290.gded63e768a






Re: Branch deletion question / possible bug?

2018-04-28 Thread Philip Oakley

From: "Jacob Keller" 

On Fri, Apr 27, 2018 at 5:29 PM, Tang (US), Pik S 
wrote:

Hi,

I discovered that I was able to delete the feature branch I was in, due
to some fat fingering on my part and case insensitivity.  I never
realized this could be done before.  A quick google search did not give
me a whole lot to work with...

Steps to reproduce:
1. Create a feature branch, "editCss"
2. git checkout master
3. git checkout editCSS
4. git checkout editCss
5. git branch -d editCSS



Are you running on a case-insensitive file system? What version of
git? I thought I recalled seeing commits to help avoid creating
branches of the same name with separate case when we know we're on a
file system which is case-insensitive..


Normally, it should have been impossible for a user to delete the branch
they're on.  And the deletion left me in a weird state that took a while
to dig out of.

I know this was a user error, but I was also wondering if this was a bug.


If we have not yet done this, I think we should. Long term this would
be fixed by using a separate format to store refs than the filesystem,
which has a few projects being worked on but none have been put into a
release.


Yes, this is an on-going problem on Windows and other case insensitive
systems. At the moment the branch name becomes embedded as a file name, so
when Git requests details of a branch from the filesystem, it can get a case
insensitive equivalent. Meanwhile, internally Git is checking for equality
in a case sensitive [Linux] way with obvious consequences such as this - The
most obvious being when there is no "*" current branch marker in the branch
status list.

It's a bit tricky to fix (internally the name and the path are passed down
different call chains), and depends on how one expects the case
insensitivity to work - the kicker is when someone does an edit of the name
via the file system and expects Git to cope (i.e. devs knowing, or thinking
they know, too much detail ;-).

The refs can also get packed, so the "bad spelling" gets baked in.
Ultimately it probably means that GfW and other systems will need a case
sensitivity check when opening paths...
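
The mismatch described above can be modelled in a few lines. This is a toy sketch, not Git's code: `CaseInsensitiveRefDir` and `delete_branch` are invented names, with the class standing in for loose refs stored as files on a case-insensitive filesystem and the function standing in for a case-sensitive "is this the current branch?" check.

```python
# Toy model (assumed names, not Git internals) of the reported steps.

class CaseInsensitiveRefDir:
    """Loose refs as files on a filesystem that folds case on lookup."""

    def __init__(self):
        self._files = {}  # folded name -> (name as first created, commit)

    def write(self, name, commit):
        key = name.casefold()
        created_as = self._files.get(key, (name, None))[0]
        self._files[key] = (created_as, commit)

    def exists(self, name):
        return name.casefold() in self._files

    def delete(self, name):
        del self._files[name.casefold()]

    def names(self):
        return [created_as for created_as, _ in self._files.values()]


def delete_branch(refs, head, name):
    """Refuse to delete the current branch, comparing names case-sensitively."""
    if name == head:
        raise ValueError("cannot delete the checked-out branch")
    if not refs.exists(name):
        raise KeyError(name)
    refs.delete(name)


refs = CaseInsensitiveRefDir()
refs.write("master", "c0")
refs.write("editCss", "c1")      # step 1: create the feature branch

head = "editCSS"                 # step 3: the filesystem happily resolves it
marker_shown = head in refs.names()  # no "*" marker: no listed name matches

head = "editCss"                 # step 4: back on the correctly spelled name
try:
    delete_branch(refs, head, "editCss")
    refused = False
except ValueError:
    refused = True               # exact spelling is protected

delete_branch(refs, head, "editCSS")  # step 5: slips past the check
print(refused, marker_shown, refs.names())
```

After the last call the only ref file left is `master`, while `head` still names `editCss`, which is the "weird state" from the report.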

Philip


Thanks,
Jake




Thanks,

Pik Tang







Re: [PATCH v6 11/11] Remove obsolete script to convert grafts to replace refs

2018-04-28 Thread Philip Oakley

From: "Johannes Schindelin" 

The functionality is now implemented as `git replace
--convert-graft-file`.


A rather late in the day thought: Should this go through the same
deprecation dance?

I.e. replace the body of the script with the new `git
replace --convert-graft-file` and echo (or die!) a warning message that this
script is now deprecated and will be removed?

At least it will catch those who arrive via random web advice!

--
Philip


Signed-off-by: Johannes Schindelin 
---
contrib/convert-grafts-to-replace-refs.sh | 28 ---
1 file changed, 28 deletions(-)
delete mode 100755 contrib/convert-grafts-to-replace-refs.sh

diff --git a/contrib/convert-grafts-to-replace-refs.sh
b/contrib/convert-grafts-to-replace-refs.sh
deleted file mode 100755
index 0cbc917b8cf..000
--- a/contrib/convert-grafts-to-replace-refs.sh
+++ /dev/null
@@ -1,28 +0,0 @@
-#!/bin/sh
-
-# You should execute this script in the repository where you
-# want to convert grafts to replace refs.
-
-GRAFTS_FILE="${GIT_DIR:-.git}/info/grafts"
-
-. $(git --exec-path)/git-sh-setup
-
-test -f "$GRAFTS_FILE" || die "Could not find graft file: '$GRAFTS_FILE'"
-
-grep '^[^# ]' "$GRAFTS_FILE" |
-while read definition
-do
- if test -n "$definition"
- then
- echo "Converting: $definition"
- git replace --graft $definition ||
- die "Conversion failed for: $definition"
- fi
-done
-
-mv "$GRAFTS_FILE" "$GRAFTS_FILE.bak" ||
- die "Could not rename '$GRAFTS_FILE' to '$GRAFTS_FILE.bak'"
-
-echo "Success!"
-echo "All the grafts in '$GRAFTS_FILE' have been converted to replace
refs!"
-echo "The grafts file '$GRAFTS_FILE' has been renamed:
'$GRAFTS_FILE.bak'"
--
2.17.0.windows.1.33.gfcbb1fa0445





Re: [PATCH v3 09/11] technical/shallow: describe the relationship with replace refs

2018-04-24 Thread Philip Oakley

Hi dscho

From: "Johannes Schindelin" <johannes.schinde...@gmx.de> : Tuesday, April 
24, 2018 8:10 PM

On Sun, 22 Apr 2018, Philip Oakley wrote:


From: "Johannes Schindelin" <johannes.schinde...@gmx.de>
> Now that grafts are deprecated, we should start to assume that readers
> have no idea what grafts are. So it makes more sense to describe the
> "shallow" feature in terms of replace refs.


Here we say we should drop the term "grafts"

>
> Suggested-by: Eric Sunshine <sunsh...@sunshineco.com>
> Signed-off-by: Johannes Schindelin <johannes.schinde...@gmx.de>
> ---
> Documentation/technical/shallow.txt | 19 +++
> 1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/technical/shallow.txt
> b/Documentation/technical/shallow.txt
> index 5183b154229..b3ff23c25f6 100644
> --- a/Documentation/technical/shallow.txt
> +++ b/Documentation/technical/shallow.txt
> @@ -9,14 +9,17 @@ these commits have no parents.
> *
>
> The basic idea is to write the SHA-1s of shallow commits into
> -$GIT_DIR/shallow, and handle its contents like the contents
> -of $GIT_DIR/info/grafts (with the difference that shallow
> -cannot contain parent information).
> -
> -This information is stored in a new file instead of grafts, or
> -even the config, since the user should not touch that file
> -at all (even throughout development of the shallow clone, it
> -was never manually edited!).
> +$GIT_DIR/shallow, and handle its contents similar to replace
> +refs (with the difference that shallow does not actually
> +create those replace refs) and

If grafts are deprecated, why not also get rid of this mention and simply
leave the 'what it does' part.


Internally, shallow commits are implemented using the graft code path, and


however the change here is just to the documentation, independent of the code 
path's name.



they always will be: we will always need a list of the shallow commits,
and we will always need to be able to lift the "shallow" attribute
quickly, when deepening a shallow clone.

So it makes sense to mention that here, because we are deep in technical
details in Documentation/technical/.

>   very much like the 
> deprecated

> +graft file (with


I was looking to snip this 'graft' reference, as per the commit message..



>   the difference that shallow commits will
> +always have their parents grafted away, not replaced by
s/their parents grafted away/no parents/ (rather than being replaced..)


Then I botched this substitution



But the commits will typically have parents. So they really will have
their parents grafted away as long as they are marked "shallow"...


OK, maybe I mis-used the figurative 'no parents', when it means the literal 
'parents not present'.


Perhaps something like:
+$GIT_DIR/shallow, and handle its contents similar to replace
+refs (with the difference that shallow does not actually
+create those replace refs) with the difference that shallow commits will
+always have their parents not present.
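
As a rough illustration of "parents not present", the sketch below treats every commit listed in a shallow file as having no parents during a history walk, and shows that clearing the list (deepening) restores the recorded parents. The helper names and the short commit ids are made up for the example; real entries are full SHA-1s, and this is not Git's implementation.

```python
# Minimal sketch under stated assumptions: ids like "c1" stand in for SHA-1s,
# and the functions are hypothetical, not part of Git.

def read_shallow(text):
    """One commit id per line, as in $GIT_DIR/shallow."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def effective_parents(commit, recorded_parents, shallow):
    """Parents as a history walk sees them: cut off at shallow commits."""
    if commit in shallow:
        return []
    return list(recorded_parents[commit])


def reachable(tip, recorded_parents, shallow):
    """Walk history from tip, honouring the shallow boundary."""
    seen, stack = [], [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.append(commit)
        stack.extend(effective_parents(commit, recorded_parents, shallow))
    return seen


# The commit objects still record their real parents ...
recorded = {"c3": ["c2"], "c2": ["c1"], "c1": []}

# ... but while c2 is marked shallow, the walk stops there.
shallow = read_shallow("c2\n")
print(reachable("c3", recorded, shallow))

# Deepening lifts the "shallow" attribute; nothing is rewritten.
print(reachable("c3", recorded, set()))
```

The commit object for `c2` is never modified, which is why the literal wording "parents not present" fits better than "no parents".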

--
Philip 



Re: [PATCH v8 06/16] sequencer: introduce the `merge` command

2018-04-24 Thread Philip Oakley

From: "Johannes Schindelin" <johannes.schinde...@gmx.de>

On Mon, 23 Apr 2018, Philip Oakley wrote:

From: "Johannes Schindelin" <johannes.schinde...@gmx.de> : Monday, April 23,
2018 1:03 PM
Subject: Re: [PATCH v8 06/16] sequencer: introduce the `merge` command

[...]
>
> > > label onto
> > >
> > > # Branch abc
> > > reset onto
> >
> > Is this reset strictly necessary. We are already there @head.
>
> No, this is not strictly necessary, but

I've realised my misunderstanding. I was thinking this (and others) was
equivalent to

$  git reset <thatHead'onto'> # maybe even --hard,

i.e. affecting the worktree


Oh, but it *is* affecting the worktree. In this case, since we label HEAD
and then immediately reset to the label, there is just nothing to change.

Consider this example, though:

label onto

# Branch: from-philip
reset onto
pick abcdef something
label from-philip

# Branch: with-love
reset onto
pick 012345 else
label with-love

reset onto
merge -C 98765 from-philip
merge -C 43210 with-love

Only in the first instance is the `reset onto` a no-op, an incidental one.
After picking `something` and labeling the result as `from-philip`,
though, the next `reset onto` really resets the worktree.
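
To see which `reset onto` lines are incidental no-ops, the example above can be replayed against a small model. This is only a sketch: `run_todo` is an invented helper that tracks where HEAD points and which parents each new commit gets, it handles only the `merge -C <commit> <label>` form, and it names each rewritten commit by appending `'` to the original id. It does not model the worktree itself.

```python
# Toy interpreter (assumed, not the sequencer) for label/reset/pick/merge.

TODO = """\
label onto

# Branch: from-philip
reset onto
pick abcdef something
label from-philip

# Branch: with-love
reset onto
pick 012345 else
label with-love

reset onto
merge -C 98765 from-philip
merge -C 43210 with-love
"""


def run_todo(todo, start="base"):
    head = start
    labels = {}
    parents = {start: []}
    reset_was_noop = []
    for line in todo.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        cmd, *args = line.split()
        if cmd == "label":
            labels[args[0]] = head
        elif cmd == "reset":
            target = labels[args[0]]
            reset_was_noop.append(target == head)
            head = target
        elif cmd == "pick":
            new = args[0] + "'"
            parents[new] = [head]
            head = new
        elif cmd == "merge":
            # Only "merge -C <commit> <label>" is handled in this sketch.
            original, label = args[1], args[2]
            new = original + "'"
            parents[new] = [head, labels[label]]
            head = new
        else:
            raise ValueError("unknown command: " + cmd)
    return head, parents, reset_was_noop


head, parents, reset_was_noop = run_todo(TODO)
print(reset_was_noop)   # one entry per `reset onto`, in order
print(parents[head])    # parents of the final merge
```

Only the first `reset` finds HEAD already at `onto`; the later ones move HEAD back from a freshly picked commit, which is where the worktree has to change.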


rather than just being a movement of the Head rev (though I may be having
brain fade here regarding untracked files etc..)


The current way of doing things does not allow the `reset` to overwrite
untracked, nor ignored files (I think, I only verified the former, not the
latter).

But yeah, it is not just a movement of HEAD. It does reset the worktree,
although quite a bit more gently (and safely) than `git reset --hard`. In
that respect, this patch series is a drastic improvement over the Git
garden shears (which is the shell script I use in Git for Windows which
inspired this here patch series).

Thanks for clarifying. Yes, my reasoning was a total brain fade ... Along 
with the fact that it's a soft/safe/gentle reset.

--
Philip 



Re: [PATCH v8 06/16] sequencer: introduce the `merge` command

2018-04-23 Thread Philip Oakley
From: "Johannes Schindelin"  : Monday, April 23, 
2018 1:03 PM

Subject: Re: [PATCH v8 06/16] sequencer: introduce the `merge` command



Hi Philip,


[...]



> label onto
>
> # Branch abc
> reset onto

Is this reset strictly necessary? We are already there @head.


No, this is not strictly necessary, but


I've realised my misunderstanding. I was thinking this (and others) was 
equivalent to


$  git reset  # maybe even --hard,

i.e. affecting the worktree

rather than just being a movement of the Head rev (though I may be having 
brain fade here regarding untracked files etc..)




- it makes it easier to auto-generate (otherwise you would have to keep
 track of the "current HEAD" while generating that todo list, and

- if I keep the `reset onto` there, then it is *a lot* easier to reorder
 topic branches.

Ciao,
Dscho


Thanks

Philip 



Re: [PATCH v8 06/16] sequencer: introduce the `merge` command

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" 

This patch is part of the effort to reimplement `--preserve-merges` with
a substantially improved design, a design that has been developed in the
Git for Windows project to maintain the dozens of Windows-specific patch
series on top of upstream Git.

The previous patch implemented the `label` and `reset` commands to label


The previous patch was [Patch 05/16] git-rebase--interactive: clarify 
arguments, so this statement doesn't appear to be true. Has a patch been 
missed or re-ordered? Or should it be simply "This patch implements" ? 
Likewise the patch subject would be updated.



commits and to reset to labeled commits. This patch adds the `merge`


s/adds/also adds/ ?


command, with the following syntax:

merge [-C <commit>] <rev> # <oneline>

The <commit> parameter in this instance is the *original* merge commit,
whose author and message will be used for the merge commit that is about
to be created.

The <rev> parameter refers to the (possibly rewritten) revision to
merge. Let's see an example of a todo list:


The example ought to also note that `label onto` is to
`# label current HEAD with a name`, seeing as this is the first occurrence.
It may be obvious in retrospect, but not at first reading.


label onto

# Branch abc
reset onto


Is this reset strictly necessary? We are already there @head.


pick deadbeef Hello, world!
label abc

reset onto
pick cafecafe And now for something completely different
merge -C baaabaaa abc # Merge the branch 'abc' into master

To edit the merge commit's message (a "reword" for merges, if you will),
use `-c` (lower-case) instead of `-C`; this convention was borrowed from
`git commit` that also supports `-c` and `-C` with similar meanings.

To create *new* merges, i.e. without copying the commit message from an
existing commit, simply omit the `-C <commit>` parameter (which will
open an editor for the merge message):

merge abc

This comes in handy when splitting a branch into two or more branches.

Note: this patch only adds support for recursive merges, to keep things
simple. Support for octopus merges will be added later in a separate
patch series, support for merges using strategies other than the
recursive merge is left for the future.
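
The three forms of the command described above (`-C`, `-c`, and no commit at all) can be told apart with a small parser. This is a hedged sketch: `parse_merge` and the dictionary it returns are invented for illustration and do not mirror the parsing code in sequencer.c.

```python
# Hypothetical parser for the todo-list `merge` line described above.

def parse_merge(line):
    """Split a `merge` todo line into its parts.

    Returns a dict with:
      commit       -- original merge commit to take the message from, or None
      rev          -- the (label of the) revision to merge
      edit_message -- True when an editor should open (-c, or no commit given)
      oneline      -- trailing "# ..." text, or None
    """
    body, _, oneline = line.partition("#")
    words = body.split()
    if not words or words[0] not in ("m", "merge"):
        raise ValueError("not a merge command: " + line)
    words = words[1:]

    commit, edit_message = None, True
    if words and words[0] in ("-C", "-c"):
        if len(words) < 2:
            raise ValueError("missing commit after " + words[0])
        edit_message = words[0] == "-c"   # lower-case: reword the message
        commit = words[1]
        words = words[2:]

    if len(words) != 1:
        # Exactly one revision: octopus merges are out of scope here.
        raise ValueError("expected exactly one revision to merge")

    return {
        "commit": commit,
        "rev": words[0],
        "edit_message": edit_message,
        "oneline": oneline.strip() or None,
    }


reuse = parse_merge("merge -C baaabaaa abc # Merge the branch 'abc' into master")
reword = parse_merge("merge -c baaabaaa abc")
fresh = parse_merge("merge abc")
print(reuse)
print(reword)
print(fresh)
```

Rejecting anything other than a single revision matches the note that only two-parent merges are supported for now.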

Signed-off-by: Johannes Schindelin 
---
git-rebase--interactive.sh |   6 +
sequencer.c| 407 -
2 files changed, 406 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index e1b865f43f2..ccd5254d1c9 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -162,6 +162,12 @@ s, squash  = use commit, but meld into 
previous commit

f, fixup  = like \"squash\", but discard this commit's log message
x, exec  = run command (the rest of the line) using shell
d, drop  = remove commit
+l, label  = label current HEAD with a name
+t, reset  = reset HEAD to a label
+m, merge [-C  | -c ]  [# ]
+.   create a merge commit using the original merge commit's
+.   message (or the oneline, if no original merge commit was
+.   specified). Use -c  to reword the commit message.

These lines can be re-ordered; they are executed from top to bottom.
" | git stripspace --comment-lines >>"$todo"
diff --git a/sequencer.c b/sequencer.c
index 01443e0f245..35fcacbdf0f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -23,6 +23,8 @@
#include "hashmap.h"
#include "notes-utils.h"
#include "sigchain.h"
+#include "unpack-trees.h"
+#include "worktree.h"

#define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"

@@ -120,6 +122,13 @@ static GIT_PATH_FUNC(rebase_path_stopped_sha, 
"rebase-merge/stopped-sha")
static GIT_PATH_FUNC(rebase_path_rewritten_list, 
"rebase-merge/rewritten-list")

static GIT_PATH_FUNC(rebase_path_rewritten_pending,
 "rebase-merge/rewritten-pending")
+
+/*
+ * The path of the file listing refs that need to be deleted after the 
rebase
+ * finishes. This is used by the `label` command to record the need for 
cleanup.

+ */
+static GIT_PATH_FUNC(rebase_path_refs_to_delete, 
"rebase-merge/refs-to-delete")

+
/*
 * The following files are written by git-rebase just after parsing the
 * command-line (and are only consumed, not modified, by the sequencer).
@@ -244,18 +253,34 @@ static const char *gpg_sign_opt_quoted(struct 
replay_opts *opts)


int sequencer_remove_state(struct replay_opts *opts)
{
- struct strbuf dir = STRBUF_INIT;
+ struct strbuf buf = STRBUF_INIT;
 int i;

+ if (is_rebase_i(opts) &&
+ strbuf_read_file(&buf, rebase_path_refs_to_delete(), 0) > 0) {
+ char *p = buf.buf;
+ while (*p) {
+ char *eol = strchr(p, '\n');
+ if (eol)
+ *eol = '\0';
+ if (delete_ref("(rebase -i) cleanup", p, NULL, 0) < 0)
+ warning(_("could not delete '%s'"), p);
+ if (!eol)
+ break;
+ p = eol + 1;
+ }
+ }
+
 free(opts->gpg_sign);
 free(opts->strategy);
 for (i = 0; i < opts->xopts_nr; i++)
 free(opts->xopts[i]);
 free(opts->xopts);

- strbuf_addstr(&dir, get_dir(opts));
- remove_dir_recursively(&dir, 0);
- 

Re: [PATCH 3/3] Avoid multiple PREFIX definitions

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" <johannes.schinde...@gmx.de>

From: Philip Oakley <philipoak...@iee.org>

The short and sweet PREFIX can be confused when used in many places.

Rename both usages to better describe their purpose. EXEC_CMD_PREFIX is
used in full to disambiguate it from the nearby GIT_EXEC_PATH.


@dscho: Thanks for keeping up with this and all your work. LGTM Philip.



The PREFIX in sideband.c, while nominally independent of the exec_cmd
PREFIX, does reside within libgit[1], so the definitions would clash
when taken together with a PREFIX given on the command line for use by
exec_cmd.c.

Noticed when compiling Git for Windows using MSVC/Visual Studio [1] which
reports the conflict between the command line definition and the
definition in sideband.c within the libgit project.

[1] the libgit functions are brought into a single sub-project
within the Visual Studio construction script provided in contrib,
and hence uses a single command for both exec_cmd.c and sideband.c.

Signed-off-by: Philip Oakley <philipoak...@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schinde...@gmx.de>
---
Makefile   |  2 +-
exec-cmd.c |  4 ++--
sideband.c | 10 +-
3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/Makefile b/Makefile
index 111e93d3bea..49cec672242 100644
--- a/Makefile
+++ b/Makefile
@@ -2271,7 +2271,7 @@ exec-cmd.sp exec-cmd.s exec-cmd.o: EXTRA_CPPFLAGS =
\
 '-DGIT_EXEC_PATH="$(gitexecdir_SQ)"' \
 '-DGIT_LOCALE_PATH="$(localedir_relative_SQ)"' \
 '-DBINDIR="$(bindir_relative_SQ)"' \
- '-DPREFIX="$(prefix_SQ)"'
+ '-DFALLBACK_RUNTIME_PREFIX="$(prefix_SQ)"'

builtin/init-db.sp builtin/init-db.s builtin/init-db.o: GIT-PREFIX
builtin/init-db.sp builtin/init-db.s builtin/init-db.o: EXTRA_CPPFLAGS = \
diff --git a/exec-cmd.c b/exec-cmd.c
index 3b0a039083a..02d31ee8971 100644
--- a/exec-cmd.c
+++ b/exec-cmd.c
@@ -48,7 +48,7 @@ static const char *system_prefix(void)
 !(prefix = strip_path_suffix(executable_dirname, GIT_EXEC_PATH)) &&
 !(prefix = strip_path_suffix(executable_dirname, BINDIR)) &&
 !(prefix = strip_path_suffix(executable_dirname, "git"))) {
- prefix = PREFIX;
+ prefix = FALLBACK_RUNTIME_PREFIX;
 trace_printf("RUNTIME_PREFIX requested, "
 "but prefix computation failed.  "
 "Using static fallback '%s'.\n", prefix);
@@ -243,7 +243,7 @@ void git_resolve_executable_dir(const char *argv0)
 */
static const char *system_prefix(void)
{
- return PREFIX;
+ return FALLBACK_RUNTIME_PREFIX;
}

/*
diff --git a/sideband.c b/sideband.c
index 6d7f943e438..325bf0e974a 100644
--- a/sideband.c
+++ b/sideband.c
@@ -13,7 +13,7 @@
 * the remote died unexpectedly.  A flush() concludes the stream.
 */

-#define PREFIX "remote: "
+#define DISPLAY_PREFIX "remote: "

#define ANSI_SUFFIX "\033[K"
#define DUMB_SUFFIX ""
@@ -49,7 +49,7 @@ int recv_sideband(const char *me, int in_stream, int
out)
 switch (band) {
 case 3:
 strbuf_addf(&outbuf, "%s%s%s", outbuf.len ? "\n" : "",
- PREFIX, buf + 1);
+ DISPLAY_PREFIX, buf + 1);
 retval = SIDEBAND_REMOTE_ERROR;
 break;
 case 2:
@@ -67,7 +67,7 @@ int recv_sideband(const char *me, int in_stream, int
out)
 int linelen = brk - b;

 if (!outbuf.len)
- strbuf_addstr(&outbuf, PREFIX);
+ strbuf_addstr(&outbuf, DISPLAY_PREFIX);
 if (linelen > 0) {
 strbuf_addf(&outbuf, "%.*s%s%c",
 linelen, b, suffix, *brk);
@@ -81,8 +81,8 @@ int recv_sideband(const char *me, int in_stream, int
out)
 }

 if (*b)
- strbuf_addf(&outbuf, "%s%s",
- outbuf.len ? "" : PREFIX, b);
+ strbuf_addf(&outbuf, "%s%s", outbuf.len ?
+ "" : DISPLAY_PREFIX, b);
 break;
 case 1:
 write_or_die(out, buf + 1, len);
--
2.17.0.windows.1.15.gaa56ade3205





Re: [PATCH v3 09/11] technical/shallow: describe the relationship with replace refs

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" 

Now that grafts are deprecated, we should start to assume that readers
have no idea what grafts are. So it makes more sense to describe the
"shallow" feature in terms of replace refs.

Suggested-by: Eric Sunshine 
Signed-off-by: Johannes Schindelin 
---
Documentation/technical/shallow.txt | 19 +++
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/Documentation/technical/shallow.txt 
b/Documentation/technical/shallow.txt

index 5183b154229..b3ff23c25f6 100644
--- a/Documentation/technical/shallow.txt
+++ b/Documentation/technical/shallow.txt
@@ -9,14 +9,17 @@ these commits have no parents.
*

The basic idea is to write the SHA-1s of shallow commits into
-$GIT_DIR/shallow, and handle its contents like the contents
-of $GIT_DIR/info/grafts (with the difference that shallow
-cannot contain parent information).
-
-This information is stored in a new file instead of grafts, or
-even the config, since the user should not touch that file
-at all (even throughout development of the shallow clone, it
-was never manually edited!).
+$GIT_DIR/shallow, and handle its contents similar to replace
+refs (with the difference that shallow does not actually
+create those replace refs) and


If grafts are deprecated, why not also get rid of this mention and simply 
leave the 'what it does' part.


  very much like the 
deprecated

+graft file (with



  the difference that shallow commits will
+always have their parents grafted away, not replaced by

s/their parents grafted away/no parents/ (rather than being replaced..)


+different parents).
+
+This information is stored in a special-purpose file because the
+user should not touch that file at all (even throughout
+development of the shallow clone, it was never manually
+edited!).

Each line contains exactly one SHA-1. When read, a commit_graft
will be constructed, which has nr_parent < 0 to make it easier
--
2.17.0.windows.1.15.gaa56ade3205







Re: [PATCH v8 09/16] rebase: introduce the --rebase-merges option

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" 

Once upon a time, this here developer thought: wouldn't it be nice if,
say, Git for Windows' patches on top of core Git could be represented as
a thicket of branches, and be rebased on top of core Git in order to
maintain a cherry-pick'able set of patch series?

The original attempt to answer this was: git rebase --preserve-merges.

However, that experiment was never intended as an interactive option,
and it only piggy-backed on git rebase --interactive because that
command's implementation looked already very, very familiar: it was
designed by the same person who designed --preserve-merges: yours truly.

Some time later, some other developer (I am looking at you, Andreas!
;-)) decided that it would be a good idea to allow --preserve-merges to
be combined with --interactive (with caveats!) and the Git maintainer
(well, the interim Git maintainer during Junio's absence, that is)
agreed, and that is when the glamor of the --preserve-merges design
started to fall apart rather quickly and unglamorously.

The reason? In --preserve-merges mode, the parents of a merge commit (or
for that matter, of *any* commit) were not stated explicitly, but were
*implied* by the commit name passed to the `pick` command.

This made it impossible, for example, to reorder commits. Not to mention
to flatten the branch topology or, deity forbid, to split topic branches


Aside: The idea of a "flattened" topology is, to my mind, not actually
defined though may be understood by devs working in the area. Hopefully it's
going away as a term, though the new 'cousins' will need clarification
(there's no dot notation for that area of topology).


into two.

Alas, these shortcomings also prevented that mode (whose original
purpose was to serve Git for Windows' needs, with the additional hope
that it may be useful to others, too) from serving Git for Windows'
needs.

Five years later, when it became really untenable to have one unwieldy,
big hodge-podge patch series of partly related, partly unrelated patches
in Git for Windows that was rebased onto core Git's tags from time to
time (earning the undeserved wrath of the developer of the ill-fated
git-remote-hg series that first obsoleted Git for Windows' competing
approach, only to be abandoned without maintainer later) was really
untenable, the "Git garden shears" were born [*1*/*2*]: a script,
piggy-backing on top of the interactive rebase, that would first
determine the branch topology of the patches to be rebased, create a
pseudo todo list for further editing, transform the result into a real
todo list (making heavy use of the `exec` command to "implement" the
missing todo list commands) and finally recreate the patch series on
top of the new base commit.

That was in 2013. And it took about three weeks to come up with the
design and implement it as an out-of-tree script. Needless to say, the
implementation needed quite a few years to stabilize, all the while the
design itself proved itself sound.

With this patch, the goodness of the Git garden shears comes to `git
rebase -i` itself. Passing the `--rebase-merges` option will generate
a todo list that can be understood readily, and where it is obvious
how to reorder commits. New branches can be introduced by inserting
`label` commands and calling `merge `. And once this mode will
have become stable and universally accepted, we can deprecate the design
mistake that was `--preserve-merges`.

Link *1*:
https://github.com/msysgit/msysgit/blob/master/share/msysGit/shears.sh
Link *2*:
https://github.com/git-for-windows/build-extra/blob/master/shears.sh

Signed-off-by: Johannes Schindelin 
---
Documentation/git-rebase.txt   |  20 ++-
contrib/completion/git-completion.bash |   2 +-
git-rebase--interactive.sh |   1 +
git-rebase.sh  |   6 +
t/t3430-rebase-merges.sh   | 179 +
5 files changed, 206 insertions(+), 2 deletions(-)
create mode 100755 t/t3430-rebase-merges.sh

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 3277ca14327..34e0f6a69c1 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -378,6 +378,23 @@ The commit list format can be changed by setting the
configuration option
rebase.instructionFormat.  A customized instruction format will
automatically
have the long commit hash prepended to the format.

+-r::
+--rebase-merges::
+ By default, a rebase will simply drop merge commits and only rebase
+ the non-merge commits. With this option, it will try to preserve
+ the branching structure within the commits that are to be rebased,
+ by recreating the merge commits. If a merge commit resolved any merge
+ or contained manual amendments, then they will have to be re-applied
+ manually.
++
+This mode is similar in spirit to `--preserve-merges`, but in contrast to
+that option works well in interactive rebases: commits can 

Re: [PATCH v8 09/16] rebase: introduce the --rebase-merges option

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" 

Once upon a time, this here developer thought: wouldn't it be nice if,
say, Git for Windows' patches on top of core Git could be represented as
a thicket of branches, and be rebased on top of core Git in order to
maintain a cherry-pick'able set of patch series?

The original attempt to answer this was: git rebase --preserve-merges.

However, that experiment was never intended as an interactive option,
and it only piggy-backed on git rebase --interactive because that
command's implementation looked already very, very familiar: it was
designed by the same person who designed --preserve-merges: yours truly.

Some time later, some other developer (I am looking at you, Andreas!
;-)) decided that it would be a good idea to allow --preserve-merges to
be combined with --interactive (with caveats!) and the Git maintainer
(well, the interim Git maintainer during Junio's absence, that is)
agreed, and that is when the glamor of the --preserve-merges design
started to fall apart rather quickly and unglamorously.

The reason? In --preserve-merges mode, the parents of a merge commit (or
for that matter, of *any* commit) were not stated explicitly, but were
*implied* by the commit name passed to the `pick` command.

Aside: I think this para should be extracted to the --preserve-merges 
documentation to highlight what it does / why it is 'wrong' (not what would 
be expected in some cases). It may also need to discuss the (figurative) 
Cousins vs. Siblings distinction [merge of branches external, or internal, 
to the rebase].


"In --preserve-merges, the commit being selected for merging is implied by 
the commit name  passed to the `pick` command (i.e. of the original merge 
commit), not that of the rebased version of that parent."


A similar issue occurs with (figuratively) '--ancestry-path --first parent' 
searches, which lack the alternate '--lead parent' post-walk selection [1]. 
I don't think there is a dot notation to select the merge cousins, nor merge 
siblings either A.,B ? (that's dot-comma ;-)



This made it impossible, for example, to reorder commits. Not to mention
to flatten the branch topology or, deity forbid, to split topic branches
into two.

Alas, these shortcomings also prevented that mode (whose original
purpose was to serve Git for Windows' needs, with the additional hope
that it may be useful to others, too) from serving Git for Windows'
needs.

Five years later, when it became really untenable to have one unwieldy,
big hodge-podge patch series of partly related, partly unrelated patches
in Git for Windows that was rebased onto core Git's tags from time to
time (earning the undeserved wrath of the developer of the ill-fated
git-remote-hg series that first obsoleted Git for Windows' competing
approach, only to be abandoned without maintainer later) was really
untenable, the "Git garden shears" were born [*1*/*2*]: a script,
piggy-backing on top of the interactive rebase, that would first
determine the branch topology of the patches to be rebased, create a
pseudo todo list for further editing, transform the result into a real
todo list (making heavy use of the `exec` command to "implement" the
missing todo list commands) and finally recreate the patch series on
top of the new base commit.

That was in 2013. And it took about three weeks to come up with the
design and implement it as an out-of-tree script. Needless to say, the
implementation needed quite a few years to stabilize, all the while the
design itself proved itself sound.

With this patch, the goodness of the Git garden shears comes to `git
rebase -i` itself. Passing the `--rebase-merges` option will generate
a todo list that can be understood readily, and where it is obvious
how to reorder commits. New branches can be introduced by inserting
`label` commands and calling `merge `. And once this mode will
have become stable and universally accepted, we can deprecate the design
mistake that was `--preserve-merges`.

Link *1*:
https://github.com/msysgit/msysgit/blob/master/share/msysGit/shears.sh
Link *2*:
https://github.com/git-for-windows/build-extra/blob/master/shears.sh

Signed-off-by: Johannes Schindelin 
---
Documentation/git-rebase.txt   |  20 ++-
contrib/completion/git-completion.bash |   2 +-
git-rebase--interactive.sh |   1 +
git-rebase.sh  |   6 +
t/t3430-rebase-merges.sh   | 179 +
5 files changed, 206 insertions(+), 2 deletions(-)
create mode 100755 t/t3430-rebase-merges.sh

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 3277ca14327..34e0f6a69c1 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -378,6 +378,23 @@ The commit list format can be changed by setting the 
configuration option
rebase.instructionFormat.  A customized instruction format will 
automatically

have the long commit hash prepended to the 

Re: [PATCH v8 08/16] rebase-helper --make-script: introduce a flag to rebase merges

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" 

Sorry for commenting so late in the series..


The sequencer just learned new commands intended to recreate branch
structure (similar in spirit to --preserve-merges, but with a
substantially less-broken design).

Let's allow the rebase--helper to generate todo lists making use of
these commands, triggered by the new --rebase-merges option. For a
commit topology like this (where the HEAD points to C):

- A - B - C
\   /
  D

the generated todo list would look like this:

# branch D
pick 0123 A
label branch-point
pick 1234 D
label D

reset branch-point
pick 2345 B
merge -C 3456 D # C

To keep things simple, we first only implement support for merge commits
with exactly two parents, leaving support for octopus merges to a later
patch series.
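
To make the semantics of the todo list above concrete, here is a minimal
sketch in Python of how `label`, `reset`, `pick` and `merge` could recreate
the topology. It is a toy model, not Git's sequencer: the `Commit` class and
the `replay()` function are hypothetical names, commits are identified by
their subject only, and only two-parent merges are handled, as in the patch.

```python
# Toy model of the label/reset/pick/merge todo commands; not Git's sequencer.
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    subject: str
    parents: tuple = ()


def replay(todo, onto):
    """Replay a todo list on top of `onto` and return the final HEAD."""
    head = onto
    labels = {}
    for line in todo.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        command, _, rest = line.partition(" ")
        if command == "pick":
            # "pick <oid> <subject>": new commit on top of the current HEAD
            _oid, _, subject = rest.partition(" ")
            head = Commit(subject, (head,))
        elif command == "label":
            labels[rest] = head  # remember the current HEAD under a name
        elif command == "reset":
            head = labels[rest]  # jump back to a labeled revision
        elif command == "merge":
            # "merge -C <oid> <label> # <subject>": two parents only
            spec, _, subject = rest.partition(" # ")
            label = spec.split()[-1]
            head = Commit(subject, (head, labels[label]))
        else:
            raise ValueError(f"unknown command: {command}")
    return head


TODO = """\
# branch D
pick 0123 A
label branch-point
pick 1234 D
label D

reset branch-point
pick 2345 B
merge -C 3456 D # C
"""

base = Commit("base")
tip = replay(TODO, base)
first, second = tip.parents
print(tip.subject, "=", first.subject, "+", second.subject)  # C = B + D
```

Both parents of the recreated merge share A as their parent, which is the
shape drawn in the diagram above.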


For a first-time reader, this (below) isn't as obvious as might be thought.
Maybe we should be a little more explicit here.


As a special, hard-coded label, all merge-rebasing todo lists start with
the command `label onto`


.. which labels the start point head with the name 'onto' ...

Maybe even:
"All merge-rebasing todo lists start with, as a convenience, a hard-coded
`label onto` line which will label the start point's head" ...


   so that we can later always refer to the revision
onto which everything is rebased.

Signed-off-by: Johannes Schindelin 
---
builtin/rebase--helper.c |   4 +-
sequencer.c  | 351 ++-
sequencer.h  |   1 +
3 files changed, 353 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index ad074705bb5..781782e7272 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -12,7 +12,7 @@ static const char * const builtin_rebase_helper_usage[]
= {
int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
{
 struct replay_opts opts = REPLAY_OPTS_INIT;
- unsigned flags = 0, keep_empty = 0;
+ unsigned flags = 0, keep_empty = 0, rebase_merges = 0;
 int abbreviate_commands = 0;
 enum {
 CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
@@ -24,6 +24,7 @@ int cmd_rebase__helper(int argc, const char **argv,
const char *prefix)
 OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
 OPT_BOOL(0, "allow-empty-message", &opts.allow_empty_message,
 N_("allow commits with empty messages")),
+ OPT_BOOL(0, "rebase-merges", &rebase_merges, N_("rebase merge commits")),
 OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
 CONTINUE),
 OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
@@ -57,6 +58,7 @@ int cmd_rebase__helper(int argc, const char **argv,
const char *prefix)

 flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0;
 flags |= abbreviate_commands ? TODO_LIST_ABBREVIATE_CMDS : 0;
+ flags |= rebase_merges ? TODO_LIST_REBASE_MERGES : 0;
 flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;

 if (command == CONTINUE && argc == 1)
diff --git a/sequencer.c b/sequencer.c
index 5944d3a34eb..1e17a11ca32 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -25,6 +25,8 @@
#include "sigchain.h"
#include "unpack-trees.h"
#include "worktree.h"
+#include "oidmap.h"
+#include "oidset.h"

#define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"

@@ -3436,6 +3438,343 @@ void append_signoff(struct strbuf *msgbuf, int
ignore_footer, unsigned flag)
 strbuf_release(&sob);
}

+struct labels_entry {
+ struct hashmap_entry entry;
+ char label[FLEX_ARRAY];
+};
+
+static int labels_cmp(const void *fndata, const struct labels_entry *a,
+   const struct labels_entry *b, const void *key)
+{
+ return key ? strcmp(a->label, key) : strcmp(a->label, b->label);
+}
+
+struct string_entry {
+ struct oidmap_entry entry;
+ char string[FLEX_ARRAY];
+};
+
+struct label_state {
+ struct oidmap commit2label;
+ struct hashmap labels;
+ struct strbuf buf;
+};
+
+static const char *label_oid(struct object_id *oid, const char *label,
+  struct label_state *state)
+{
+ struct labels_entry *labels_entry;
+ struct string_entry *string_entry;
+ struct object_id dummy;
+ size_t len;
+ int i;
+
+ string_entry = oidmap_get(&state->commit2label, oid);
+ if (string_entry)
+ return string_entry->string;
+
+ /*
+ * For "uninteresting" commits, i.e. commits that are not to be
+ * rebased, and which can therefore not be labeled, we use a unique
+ * abbreviation of the commit name. This is slightly more complicated
+ * than calling find_unique_abbrev() because we also need to make
+ * sure that the abbreviation does not conflict with any other
+ * label.
+ *
+ * We disallow "interesting" commits to be labeled by a string that
+ * is a valid full-length hash, to ensure that we always can find an
+ * abbreviation for any uninteresting commit's names that does not
+ * clash with any other label.
+ */
+ if (!label) {
+ char *p;
+
+ strbuf_reset(&state->buf);
+ strbuf_grow(&state->buf, GIT_SHA1_HEXSZ);
+ label = p = state->buf.buf;
+
+ find_unique_abbrev_r(p, oid, default_abbrev);
+
+ /*
+ * We may need to extend the 
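
The constraint described in the comment above (an abbreviation must be
unique among object names and must not collide with any label) can be
sketched as follows. This is a rough Python illustration, not the patch's C
code: `unique_abbrev()`, its arguments and the default minimum length of 7
are assumptions made for the example.

```python
def unique_abbrev(oid, all_oids, labels, min_len=7):
    """Return the shortest prefix of `oid` (at least `min_len` characters)
    that is unambiguous among `all_oids` and not already used as a label."""
    others = [o for o in all_oids if o != oid]
    for length in range(min_len, len(oid) + 1):
        prefix = oid[:length]
        if any(o.startswith(prefix) for o in others):
            continue  # still ambiguous as an object name
        if prefix in labels:
            continue  # clashes with a label; extend by one more digit
        return prefix
    raise ValueError("no usable abbreviation for " + oid)


# Two made-up 40-digit object names that share the prefix "deadbee".
oids = ["deadbeef" + "0" * 32, "deadbee0" + "1" * 32]
labels = {"deadbeef", "onto"}
print(unique_abbrev(oids[0], oids, labels))  # deadbeef0
```

The loop can only run out of candidates if the full-length hash is itself
taken as a label, which is why the comment insists that labels must never
look like a full-length hash.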

Re: [PATCH v8 06/16] sequencer: introduce the `merge` command

2018-04-22 Thread Philip Oakley

From: "Johannes Schindelin" 

This patch is part of the effort to reimplement `--preserve-merges` with
a substantially improved design, a design that has been developed in the
Git for Windows project to maintain the dozens of Windows-specific patch
series on top of upstream Git.

The previous patch implemented the `label` and `reset` commands to label


The previous patch was [Patch 05/16] git-rebase--interactive: clarify
arguments, so this statement doesn't appear to be true. Has a patch been
missed or re-ordered? Or should it be simply "This patch implements" ?
Likewise the patch subject would be updated.


commits and to reset to labeled commits. This patch adds the `merge`


s/adds/also adds/ ?


command, with the following syntax:

merge [-C <commit>] <rev> # <oneline>

The <commit> parameter in this instance is the *original* merge commit,
whose author and message will be used for the merge commit that is about
to be created.

The <rev> parameter refers to the (possibly rewritten) revision to
merge. Let's see an example of a todo list:


The example ought also to note that `label onto` means
`# label current HEAD with a name`, seeing as this is its first occurrence.
It may be obvious in retrospect, but not at first reading.


label onto

# Branch abc
reset onto


Is this reset strictly necessary? We are already there, at HEAD.


pick deadbeef Hello, world!
label abc

reset onto
pick cafecafe And now for something completely different
merge -C baaabaaa abc # Merge the branch 'abc' into master

To edit the merge commit's message (a "reword" for merges, if you will),
use `-c` (lower-case) instead of `-C`; this convention was borrowed from
`git commit` that also supports `-c` and `-C` with similar meanings.

To create *new* merges, i.e. without copying the commit message from an
existing commit, simply omit the `-C <commit>` parameter (which will
open an editor for the merge message):

merge abc

This comes in handy when splitting a branch into two or more branches.

Note: this patch only adds support for recursive merges, to keep things
simple. Support for octopus merges will be added later in a separate
patch series; support for merges using strategies other than the
recursive merge is left for the future.
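
As an illustration of the syntax described above, here is a small Python
sketch that splits a `merge` line into its parts. It is not the sequencer's
parser: `MergeCommand` and `parse_merge()` are hypothetical names, and the
sketch assumes that labels contain neither whitespace nor `#`.

```python
from typing import NamedTuple, Optional


class MergeCommand(NamedTuple):
    label: str              # the revision (label) to merge
    commit: Optional[str]   # the original merge commit, if one was given
    edit_message: bool      # True when an editor should be opened
    oneline: str            # trailing comment, informational only


def parse_merge(line):
    words, _, oneline = line.partition("#")
    args = words.split()
    if not args or args[0] not in ("merge", "m"):
        raise ValueError("not a merge command: " + line)
    args = args[1:]
    commit, edit = None, True  # a brand-new merge opens the editor
    if args and args[0] in ("-C", "-c"):
        if len(args) < 2:
            raise ValueError("missing commit after " + args[0])
        edit = args[0] == "-c"  # -C reuses the message as-is, -c rewords it
        commit = args[1]
        args = args[2:]
    if len(args) != 1:
        raise ValueError("expected exactly one label: " + line)
    return MergeCommand(args[0], commit, edit, oneline.strip())


print(parse_merge("merge -C baaabaaa abc # Merge the branch 'abc' into master"))
print(parse_merge("merge abc"))
```

The three cases from the text map onto two fields: `commit` says whether an
existing message is reused, and `edit_message` says whether the user gets to
change it.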

Signed-off-by: Johannes Schindelin 
---
git-rebase--interactive.sh |   6 +
sequencer.c| 407 -
2 files changed, 406 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index e1b865f43f2..ccd5254d1c9 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -162,6 +162,12 @@ s, squash <commit> = use commit, but meld into previous commit
f, fixup <commit> = like \"squash\", but discard this commit's log message
x, exec <command> = run command (the rest of the line) using shell
d, drop <commit> = remove commit
+l, label <label> = label current HEAD with a name
+t, reset <label> = reset HEAD to a label
+m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
+.   create a merge commit using the original merge commit's
+.   message (or the oneline, if no original merge commit was
+.   specified). Use -c <commit> to reword the commit message.

These lines can be re-ordered; they are executed from top to bottom.
" | git stripspace --comment-lines >>"$todo"
diff --git a/sequencer.c b/sequencer.c
index 01443e0f245..35fcacbdf0f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -23,6 +23,8 @@
#include "hashmap.h"
#include "notes-utils.h"
#include "sigchain.h"
+#include "unpack-trees.h"
+#include "worktree.h"

#define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"

@@ -120,6 +122,13 @@ static GIT_PATH_FUNC(rebase_path_stopped_sha,
"rebase-merge/stopped-sha")
static GIT_PATH_FUNC(rebase_path_rewritten_list,
"rebase-merge/rewritten-list")
static GIT_PATH_FUNC(rebase_path_rewritten_pending,
 "rebase-merge/rewritten-pending")
+
+/*
+ * The path of the file listing refs that need to be deleted after the
rebase
+ * finishes. This is used by the `label` command to record the need for
cleanup.
+ */
+static GIT_PATH_FUNC(rebase_path_refs_to_delete,
"rebase-merge/refs-to-delete")
+
/*
 * The following files are written by git-rebase just after parsing the
 * command-line (and are only consumed, not modified, by the sequencer).
@@ -244,18 +253,34 @@ static const char *gpg_sign_opt_quoted(struct
replay_opts *opts)

int sequencer_remove_state(struct replay_opts *opts)
{
- struct strbuf dir = STRBUF_INIT;
+ struct strbuf buf = STRBUF_INIT;
 int i;

+ if (is_rebase_i(opts) &&
+ strbuf_read_file(&buf, rebase_path_refs_to_delete(), 0) > 0) {
+ char *p = buf.buf;
+ while (*p) {
+ char *eol = strchr(p, '\n');
+ if (eol)
+ *eol = '\0';
+ if (delete_ref("(rebase -i) cleanup", p, NULL, 0) < 0)
+ warning(_("could not delete '%s'"), p);
+ if (!eol)
+ break;
+ p = eol + 1;
+ }
+ }
+
 free(opts->gpg_sign);
 free(opts->strategy);
 for (i = 0; i < opts->xopts_nr; i++)
 free(opts->xopts[i]);
 free(opts->xopts);

- strbuf_addstr(&dir, get_dir(opts));
- remove_dir_recursively(&dir, 0);
- strbuf_release(&dir);
+ 

Re: [PATCH/RFC 0/5] Keep all info in command-list.txt in git binary

2018-04-19 Thread Philip Oakley

From: "Duy Nguyen" <pclo...@gmail.com>

On Wed, Apr 18, 2018 at 12:47 AM, Philip Oakley <philipoak...@iee.org>
wrote:

> Is that something I should add to my todo to add a 'guide' category >
> etc.?

I added it too [1]. Not sure if you want anything more on top though.



What I've seen is looking good - I've not had as much time as I'd like..

I'm not sure of the status of the git/generate-cmdlist.sh though. Should
that also be updated, or did I miss that?


Yes it's updated by other patches in the same thread.
--

Thanks. Hopefully I'll have some time this weekend/coming week as my wife is
away
Philip



Re: [PATCH/RFC 0/5] Keep all info in command-list.txt in git binary

2018-04-18 Thread Philip Oakley

From: "Philip Oakley" <philipoak...@iee.org> : Tuesday, April 17, 2018 11:47
PM

From: "Duy Nguyen" <pclo...@gmail.com> : Tuesday, April 17, 2018 5:48 PM

On Tue, Apr 17, 2018 at 06:24:41PM +0200, Duy Nguyen wrote:

On Sun, Apr 15, 2018 at 11:21 PM, Philip Oakley <philipoak...@iee.org>
wrote:
> From: "Duy Nguyen" <pclo...@gmail.com> : Saturday, April 14, 2018 4:44
> PM
>
>> On Thu, Apr 12, 2018 at 12:06 AM, Philip Oakley
>> <philipoak...@iee.org>
>> wrote:
>>>
>>> I'm only just catching up, but does/can this series also capture the
>>> non-command guides that are available in git so that the 'git
>>> help -g'
>>> can
>>> begin to list them all?
>>
>>
>> It currently does not. But I don't see why it should not. This should
>> allow git.txt to list all the guides too, for people who skip "git
>> help" and go hard core mode with "man git". Thanks for bringing this
>> up.
>> --
>> Duy
>>
> Is that something I should add to my todo to add a 'guide' category
> etc.?

I added it too [1]. Not sure if you want anything more on top though.


What I've seen is looking good - I've not had as much time as I'd like..

I'm not sure of the status of the git/generate-cmdlist.sh though. Should
that also be updated, or did I miss that?
--
Philip


I may be misremembering the order in which `git help` determines the list
of commands and guides. There was at least one place where the list of
commands was generated programmatically that I may be confusing it with
(I've not had time to delve into the code :-( )
--






The "anything more" that at least I had in mind was something like
this. Though I'm not sure if it's a good thing to replace a hand
crafted section with an automatically generated one. This patch on top
combines "SEE ALSO" and "FURTHER DOCUMENTATION" into one section, with
most of the documents/guides extracted from command-list.txt

-- 8< --
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 6232143cb9..3e0ecd2e11 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -292,6 +292,7 @@ doc.dep : $(docdep_prereqs) $(wildcard *.txt)
build-docdep.perl

cmds_txt = cmds-ancillaryinterrogators.txt \
 cmds-ancillarymanipulators.txt \
+ cmds-guide.txt \
 cmds-mainporcelain.txt \
 cmds-plumbinginterrogators.txt \
 cmds-plumbingmanipulators.txt \
diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
index 5aa73cfe45..e158bd9b96 100755
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -54,6 +54,7 @@ for (sort <>) {

for my $cat (qw(ancillaryinterrogators
 ancillarymanipulators
+ guide
 mainporcelain
 plumbinginterrogators
 plumbingmanipulators
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 4767860e72..d60d2ae0c7 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -808,29 +808,6 @@ The index is also capable of storing multiple
entries (called "stages")
for a given pathname.  These stages are used to hold the various
unmerged version of a file when a merge is in progress.

-FURTHER DOCUMENTATION
--
-
-See the references in the "description" section to get started
-using Git.  The following is probably more detail than necessary
-for a first-time user.
-
-The link:user-manual.html#git-concepts[Git concepts chapter of the
-user-manual] and linkgit:gitcore-tutorial[7] both provide
-introductions to the underlying Git architecture.
-
-See linkgit:gitworkflows[7] for an overview of recommended workflows.
-
-See also the link:howto-index.html[howto] documents for some useful
-examples.
-
-The internals are documented in the
-link:technical/api-index.html[Git API documentation].
-
-Users migrating from CVS may also want to
-read linkgit:gitcvs-migration[7].
-
-
Authors
---
Git was started by Linus Torvalds, and is currently maintained by Junio
@@ -854,11 +831,16 @@ the Git Security mailing list
<git-secur...@googlegroups.com>.

SEE ALSO

-linkgit:gittutorial[7], linkgit:gittutorial-2[7],
-linkgit:giteveryday[7], linkgit:gitcvs-migration[7],
-linkgit:gitglossary[7], linkgit:gitcore-tutorial[7],
-linkgit:gitcli[7], link:user-manual.html[The Git User's Manual],
-linkgit:gitworkflows[7]
+
+See the references in the "description" section to get started
+using Git.  The following is probably more detail than necessary
+for a first-time user.
+
+include::cmds-guide.txt[]
+
+See also the link:howto-index.html[howto] documents for some useful
+examples. The internals are documented in the
+link:technical/api-index.html[Git API documentation].

GIT
---
diff --git a/command-list.txt b/command-list.txt
index 1835f1a928..f26b8acd52 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -150,10 +150,14 @@ git-whatchanged

Re: [PATCH/RFC 0/5] Keep all info in command-list.txt in git binary

2018-04-17 Thread Philip Oakley

From: "Duy Nguyen" <pclo...@gmail.com> : Tuesday, April 17, 2018 5:48 PM

On Tue, Apr 17, 2018 at 06:24:41PM +0200, Duy Nguyen wrote:
On Sun, Apr 15, 2018 at 11:21 PM, Philip Oakley <philipoak...@iee.org> 
wrote:
> From: "Duy Nguyen" <pclo...@gmail.com> : Saturday, April 14, 2018 4:44 
> PM

>
>> On Thu, Apr 12, 2018 at 12:06 AM, Philip Oakley <philipoak...@iee.org>
>> wrote:
>>>
>>> I'm only just catching up, but does/can this series also capture the
>>> non-command guides that are available in git so that the 'git 
>>> help -g'

>>> can
>>> begin to list them all?
>>
>>
>> It currently does not. But I don't see why it should not. This should
>> allow git.txt to list all the guides too, for people who skip "git
>> help" and go hard core mode with "man git". Thanks for bringing this
>> up.
>> --
>> Duy
>>
> Is that something I should add to my todo to add a 'guide' category 
> etc.?


I added it too [1]. Not sure if you want anything more on top though.


What I've seen is looking good - I've not had as much time as I'd like..

I'm not sure of the status of the git/generate-cmdlist.sh though. Should 
that also be updated, or did I miss that?

--
Philip



The "anything more" that at least I had in mind was something like
this. Though I'm not sure if it's a good thing to replace a hand
crafted section with an automatically generated one. This patch on top
combines "SEE ALSO" and "FURTHER DOCUMENTATION" into one section, with
most of the documents/guides extracted from command-list.txt

-- 8< --
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 6232143cb9..3e0ecd2e11 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -292,6 +292,7 @@ doc.dep : $(docdep_prereqs) $(wildcard *.txt) 
build-docdep.perl


cmds_txt = cmds-ancillaryinterrogators.txt \
 cmds-ancillarymanipulators.txt \
+ cmds-guide.txt \
 cmds-mainporcelain.txt \
 cmds-plumbinginterrogators.txt \
 cmds-plumbingmanipulators.txt \
diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
index 5aa73cfe45..e158bd9b96 100755
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -54,6 +54,7 @@ for (sort <>) {

for my $cat (qw(ancillaryinterrogators
 ancillarymanipulators
+ guide
 mainporcelain
 plumbinginterrogators
 plumbingmanipulators
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 4767860e72..d60d2ae0c7 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -808,29 +808,6 @@ The index is also capable of storing multiple entries 
(called "stages")

for a given pathname.  These stages are used to hold the various
unmerged version of a file when a merge is in progress.

-FURTHER DOCUMENTATION
--
-
-See the references in the "description" section to get started
-using Git.  The following is probably more detail than necessary
-for a first-time user.
-
-The link:user-manual.html#git-concepts[Git concepts chapter of the
-user-manual] and linkgit:gitcore-tutorial[7] both provide
-introductions to the underlying Git architecture.
-
-See linkgit:gitworkflows[7] for an overview of recommended workflows.
-
-See also the link:howto-index.html[howto] documents for some useful
-examples.
-
-The internals are documented in the
-link:technical/api-index.html[Git API documentation].
-
-Users migrating from CVS may also want to
-read linkgit:gitcvs-migration[7].
-
-
Authors
---
Git was started by Linus Torvalds, and is currently maintained by Junio
@@ -854,11 +831,16 @@ the Git Security mailing list 
<git-secur...@googlegroups.com>.


SEE ALSO

-linkgit:gittutorial[7], linkgit:gittutorial-2[7],
-linkgit:giteveryday[7], linkgit:gitcvs-migration[7],
-linkgit:gitglossary[7], linkgit:gitcore-tutorial[7],
-linkgit:gitcli[7], link:user-manual.html[The Git User's Manual],
-linkgit:gitworkflows[7]
+
+See the references in the "description" section to get started
+using Git.  The following is probably more detail than necessary
+for a first-time user.
+
+include::cmds-guide.txt[]
+
+See also the link:howto-index.html[howto] documents for some useful
+examples. The internals are documented in the
+link:technical/api-index.html[Git API documentation].

GIT
---
diff --git a/command-list.txt b/command-list.txt
index 1835f1a928..f26b8acd52 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -150,10 +150,14 @@ git-whatchanged 
ancillaryinterrogators

git-worktree                            mainporcelain
git-write-tree                          plumbingmanipulators
gitattributes                           guide
+gitcvs-migration                        guide
+gitcli                                  guide
+gitcore-tutorial                        guide
giteveryday                             guide
gi

Re: [PATCH/RFC 0/5] Keep all info in command-list.txt in git binary

2018-04-15 Thread Philip Oakley

From: "Duy Nguyen" <pclo...@gmail.com> : Saturday, April 14, 2018 4:44 PM

On Thu, Apr 12, 2018 at 12:06 AM, Philip Oakley <philipoak...@iee.org>
wrote:

I'm only just catching up, but does/can this series also capture the
non-command guides that are available in git so that the 'git help -g'
can
begin to list them all?


It currently does not. But I don't see why it should not. This should
allow git.txt to list all the guides too, for people who skip "git
help" and go hard core mode with "man git". Thanks for bringing this
up.
--
Duy


Is that something I should add to my todo to add a 'guide' category etc.?

A quick search of public-inbox suggests
https://public-inbox.org/git/1361660761-1932-1-git-send-email-philipoak...@iee.org/
as being where I first made the suggestions, but it got trimmed back to not
update (be embedded in) the command-list.txt

Philip



Re: [PATCH v6 04/15] sequencer: introduce new commands to reset the revision

2018-04-15 Thread Philip Oakley

From: "Phillip Wood" 
: Friday, April 13, 2018 11:03 AM

If a label or reset command fails it is likely to be due to a
typo. Rescheduling the command would make it easier for the user to fix
the problem as they can just run 'git rebase --edit-todo'. 


Is this worth noting in the command documentation?
"If the label or reset command fails then fix
the problem by running 'git rebase --edit-todo'." ?

Just a thought.


It also
ensures that the problem has actually been fixed when the rebase
continues. I think you could do it like this



--
Philip
(also @dunelm, 73-79..)


Re: [PATCH/RFC 0/5] Keep all info in command-list.txt in git binary

2018-04-11 Thread Philip Oakley
From: "Eric Sunshine"  Monday, April 09, 2018 6:17 
AM

On Mon, Mar 26, 2018 at 12:55 PM, Nguyễn Thái Ngọc Duy
 wrote:

This is pretty rough but I'd like to see how people feel about this
first.

I notice we have two places for command classification. One in
command-list.txt, one in __git_list_porcelain_commands() in
git-completion.bash. People who are following nd/parseopt-completion
probably know that I'm trying to reduce duplication in this script as
much as possible; this is another step towards that.

By keeping all information of command-list.txt in git binary, we could
provide the porcelain list to git-completion.bash via "git
--list-cmds=porcelain", so we don't need a separate command
classification in git-completion.bash anymore.
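
As a rough illustration of what "one classification, several consumers"
could look like, the Python sketch below groups command-list.txt style
entries by category. It is only a sketch: `commands_by_category()` is a
hypothetical helper, the sample lines are a hand-picked excerpt, and the
assumed format is a name followed by a category, with `#` comment lines.

```python
from collections import defaultdict

SAMPLE = """\
# command name          category
git-worktree            mainporcelain
git-write-tree          plumbingmanipulators
gitattributes           guide
giteveryday             guide
"""


def commands_by_category(text):
    """Map each category to the list of names declared for it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip comments, blank and malformed lines
        name, category = fields[0], fields[1]
        groups[category].append(name)
    return dict(groups)


groups = commands_by_category(SAMPLE)
print(groups["guide"])  # ['gitattributes', 'giteveryday']
```

A completion script and a help listing could then both be fed from the same
table, which is the duplication the series is trying to remove.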


I like the direction this series is taking.


Because we have all command synopsis as a side effect, we could
now support "git help -a --verbose" which prints something like "git
help", a command name and a description, but we could do it for _all_
recognized commands. This could help people look for a command even if
we don't provide "git apropos".


Nice idea, and you practically get this for free (aside from the
obvious new code) since generate-cmdlist.sh already plucks the summary
for each command directly from Documentation/git-*.txt.

I'm only just catching up, but does/can this series also capture the 
non-command guides that are available in git so that the 'git help -g' can 
begin to list them all?


It was something I looked at some years ago (when I added the -g option) but 
at the time the idea of updating the command-list.txt was too invasive.


Just a thought.

Philip



Re: Bug: duplicate sections in .git/config after remote removal

2018-03-28 Thread Philip Oakley

From: "Ævar Arnfjörð Bjarmason" 

On Tue, Mar 27 2018, Jason Frey wrote:


While the impact of this bug is minimal, and git itself is not
affected, it can affect external tools that want to read the
.git/config file, expecting unique section names.

To reproduce:

Given the following example .git/config file (I am leaving out the
[core] section for brevity):

[remote "origin"]
url = g...@github.com:Fryguy/example.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master

Running `git remote rm origin` will result in the following contents:

[branch "master"]

Running `git remote add origin g...@github.com:Fryguy/example.git` will
result in the following contents:

[branch "master"]
[remote "origin"]
url = g...@github.com:Fryguy/example.git
fetch = +refs/heads/*:refs/remotes/origin/*

And finally, running `git fetch origin; git branch -u origin/master`
will result in the following contents:

[branch "master"]
[remote "origin"]
url = g...@github.com:Fryguy/example.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master

at which point you can see the duplicate sections (even though one is
empty).  Also note that if you do the steps again, you will be left
with 3 sections, 2 of which are empty.  This process can be repeated
over and over.


This can be annoying and result in some very verbose config files when
we automatically edit them, e.g.:

   (rm -v /tmp/test.ini; for i in {1..3}; do git config -f /tmp/test.ini 
foo.bar 0 && git config -f /tmp/test.ini --unset foo.bar; done; cat 
/tmp/test.ini)

   removed '/tmp/test.ini'
   [foo]
   [foo]
   [foo]

But it's not so clear that it should be called a bug, yes we could be a
bit smarter and not add obvious crap like the example above (duplicate
sections at the end), but it gets less obvious in more complex cases,
see my c8b2cec09e ("branch: add test for -m renaming multiple config
sections", 2017-06-18) for one such example.

Git has a config format that's hybrid human/machine editable. Consider a
case like:

   [gc]
   ;; Here's all the gc config we set up to avoid the great outage of 2015
   autoDetach = false
   ;; Our aliases
   [alias]
   st = status

Now, if I run `git config gc.auto 0` is it better if we end up with:

   [gc]
   ;; Here's all the gc config we set up to avoid the great outage of 2015
   autoDetach = false
   auto = 0
   ;; Our aliases
   [alias]
   st = status

Or something that makes it more clear that a machine added something at
the end:

   [gc]
   ;; Here's all the gc config we set up to avoid the great outage of 2015
   autoDetach = false
   ;; Our aliases
   [alias]
   st = status
   [gc]
   auto = 0

Most importantly though, regardless of what we decide to do when we
machine-edit the file, it's also human-editable, and being able to
repeat sections is part of our config format that you're simply going to
have to deal with.


One option may be to create a simple 'lint' style checker that simply
highlights and suggests options so the user can decide for themselves what
they need to do. This would help span the gap between the hard format and
the soft format capabilities of machine-readable ini files, the Git config
reader, and being human readable.


Thus duplicate sections would be noted, likewise the presence of comments
immediately preceding a section header, or terminating a section (with or
without spacing?), etc. Such a config_lint could reside in contrib as a
support tool, and may in the long term be a guide to a common format.
However, as noted, it would be more of a long-term aspiration.
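
To give the idea some shape, here is a minimal Python sketch of such a
checker. It is purely illustrative: `lint_config()` and the wording of its
messages are made up, it covers only the two checks named above, and it only
reports, leaving every decision to the user.

```python
import re

# A section header such as [gc] or [branch "master"].
SECTION = re.compile(r'^\s*\[\s*([^\]\s"]+)(?:\s+"((?:[^"\\]|\\.)*)")?\s*\]')


def lint_config(text):
    """Return (line number, message) pairs; nothing is modified."""
    findings = []
    first_seen = {}
    previous = ""
    for number, line in enumerate(text.splitlines(), start=1):
        match = SECTION.match(line)
        if match:
            # Section names are case-insensitive, subsection names are not.
            key = (match.group(1).lower(), match.group(2))
            if key in first_seen:
                findings.append(
                    (number, f"section repeats the one on line {first_seen[key]}"))
            else:
                first_seen[key] = number
            if previous.lstrip().startswith(("#", ";")):
                findings.append(
                    (number, "comment directly above; which section owns it?"))
        previous = line
    return findings


SAMPLE = """\
[gc]
;; Here's all the gc config we set up
autoDetach = false
;; Our aliases
[alias]
st = status
[gc]
auto = 0
"""

for number, message in lint_config(SAMPLE):
    print(f"line {number}: {message}")
```

On the sample, which mirrors the example quoted earlier in this message, the
sketch flags the comment sitting above `[alias]` and the second `[gc]`.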





The external tool (presumably some generic *.ini parser) you're trying
to point at git's config is broken for that purpose if it doesn't handle
duplicate sections. You're probably better off trying to parse `git
config --list --null` than trying to make it work.
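
For what it's worth, consuming that output takes only a few lines. The
Python sketch below assumes the documented `--null` framing: each entry ends
with a NUL byte, and the key is separated from its value by the first
newline. `parse_config_z()` and `read_config()` are hypothetical helper
names, and `read_config()` needs a `git` executable on PATH.

```python
import subprocess


def parse_config_z(data):
    """Parse `git config --list --null` output into (key, value) pairs.

    A key that was written without "= value" has no newline after it;
    its value is reported as None.
    """
    entries = []
    for chunk in data.split(b"\0"):
        if not chunk:
            continue  # the trailing NUL leaves one empty chunk
        key, sep, value = chunk.partition(b"\n")
        entries.append((key.decode(), value.decode() if sep else None))
    return entries


def read_config(path):
    """Let Git itself read `path`, so repeated sections are merged for us."""
    out = subprocess.run(
        ["git", "config", "--file", path, "--list", "--null"],
        check=True, stdout=subprocess.PIPE).stdout
    return parse_config_z(out)


sample = b"gc.autodetach\nfalse\0alias.st\nstatus\0gc.auto\n0\0"
print(parse_config_z(sample))
```

Because Git does the reading, the tool never sees section headers at all,
only fully qualified keys, so duplicate sections stop being its problem.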

I don't think we'd ever want to get rid of this feature, it's *very*
useful. Both for config via the include macro, and for people to
manually paste some config they want to try out to the end of their
config, without having to manually edit it to incorporate it into their
already existing sections.



--
Philip 



Re: Bug: duplicate sections in .git/config after remote removal

2018-03-28 Thread Philip Oakley

From: "Ævar Arnfjörð Bjarmason" 

On Tue, Mar 27 2018, Jason Frey wrote:


While the impact of this bug is minimal, and git itself is not
affected, it can affect external tools that want to read the
.git/config file, expecting unique section names.

To reproduce:

Given the following example .git/config file (I am leaving out the
[core] section for brevity):

[remote "origin"]
url = g...@github.com:Fryguy/example.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master

Running `git remote rm origin` will result in the following contents:

[branch "master"]

Running `git remote add origin g...@github.com:Fryguy/example.git` will
result in the following contents:

[branch "master"]
[remote "origin"]
url = g...@github.com:Fryguy/example.git
fetch = +refs/heads/*:refs/remotes/origin/*

And finally, running `git fetch origin; git branch -u origin/master`
will result in the following contents:

[branch "master"]
[remote "origin"]
url = g...@github.com:Fryguy/example.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master

at which point you can see the duplicate sections (even though one is
empty).  Also note that if you do the steps again, you will be left
with 3 sections, 2 of which are empty.  This process can be repeated
over and over.


This can be annoying and result in some very verbose config files when
we automatically edit them, e.g.:

   (rm -v /tmp/test.ini; for i in {1..3}; do git config -f /tmp/test.ini 
foo.bar 0 && git config -f /tmp/test.ini --unset foo.bar; done; cat 
/tmp/test.ini)

   removed '/tmp/test.ini'
   [foo]
   [foo]
   [foo]

But it's not so clear that it should be called a bug, yes we could be a
bit smarter and not add obvious crap like the example above (duplicate
sections at the end), but it gets less obvious in more complex cases,
see my c8b2cec09e ("branch: add test for -m renaming multiple config
sections", 2017-06-18) for one such example.

Git has a config format that's hybrid human/machine editable. Consider a
case like:

   [gc]
   ;; Here's all the gc config we set up to avoid the great outage of 2015
   autoDetach = false
   ;; Our aliases
   [alias]
   st = status

Now, if I run `git config gc.auto 0` is it better if we end up with:

   [gc]
   ;; Here's all the gc config we set up to avoid the great outage of 2015
   autoDetach = false
   auto = 0
   ;; Our aliases
   [alias]
   st = status

Or something that makes it more clear that a machine added something at
the end:

   [gc]
   ;; Here's all the gc config we set up to avoid the great outage of 2015
   autoDetach = false
   ;; Our aliases
   [alias]
   st = status
   [gc]
   auto = 0

Most importantly though, regardless of what we decide to do when we
machine-edit the file, it's also human-editable, and being able to
repeat sections is part of our config format that you're simply going to
have to deal with.


One option may be to create  a simple 'lint' style checker that simply 
hiughlights and suggests options so the user can decide for themselves what 
they need to do. This would help span the gap between hard format and the 
soft format capabiulities of machine readable ini files, the Git config 
reader and being human readable.


Thus duplicate sections would be noted, likewise the presence of comments
immediately preceding a section header, or terminating a section (with or
without spacing?), etc. Such a config_lint could reside in contrib as a
support tool, and may in the long term be a guide to a common format.
However, as noted, it would be more of a long-term aspiration.
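
As a rough illustration of the duplicate-section part of such a checker,
here is a minimal Python sketch. The name `find_duplicate_sections` and the
regular expression are assumptions made for this example (nothing like this
exists in Git), and the pattern only covers the `[section]` and
`[section "subsection"]` header forms.

```python
import re
from collections import defaultdict

# Hypothetical header pattern: [section] or [section "subsection"].
SECTION_RE = re.compile(
    r'^\s*\[\s*([A-Za-z0-9.-]+)(?:\s+"((?:[^"\\]|\\.)*)")?\s*\]'
)


def find_duplicate_sections(text):
    """Map (section, subsection) to the line numbers where it is opened,
    keeping only the sections that are opened more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SECTION_RE.match(line)
        if match:
            # Section names are case-insensitive, subsections are not.
            key = (match.group(1).lower(), match.group(2))
            seen[key].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

It only reports where a section is repeated; what to do about it is left to
the user, in keeping with the 'lint' idea.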





The external tool (presumably some generic *.ini parser) you're trying
to point at git's config is broken for that purpose if it doesn't handle
duplicate sections. You're probably better off trying to parse `git
config --list --null` than trying to make it work.
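
For an external tool, reading that output could look roughly like the
Python sketch below. The helper names `parse_null_separated` and
`read_git_config` are made up for this example. It relies on the documented
`--null` format: each entry ends with a NUL byte, and the key is separated
from its value by a newline (a key with no value has no newline).

```python
import subprocess


def parse_null_separated(data):
    """Turn the bytes printed by `git config --list --null` into a list
    of (key, value) pairs; value is None for a key without a value."""
    entries = []
    for record in data.split(b"\0"):
        if not record:
            continue  # empty chunk after the final NUL terminator
        key, sep, value = record.partition(b"\n")
        entries.append((
            key.decode("utf-8", "replace"),
            value.decode("utf-8", "replace") if sep else None,
        ))
    return entries


def read_git_config(path):
    """Ask git to parse the file, so duplicate sections are its problem."""
    result = subprocess.run(
        ["git", "config", "--file", path, "--list", "--null"],
        check=True, stdout=subprocess.PIPE,
    )
    return parse_null_separated(result.stdout)
```

Repeated keys come back as repeated pairs, in file order, so the caller
never sees section headers at all.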

I don't think we'd ever want to get rid of this feature, it's *very*
useful. Both for config via the include macro, and for people to
manually paste some config they want to try out to the end of their
config, without having to manually edit it to incorporate it into their
already existing sections.



--
Philip 



Re: [ANNOUNCE] Git Rev News edition 37

2018-03-24 Thread Philip Oakley

From: "Christian Couder" 

Hi everyone,

The 37th edition of Git Rev News is now published:

 https://git.github.io/rev_news/2018/03/21/edition-37/

Thanks a lot to all the contributors!

Enjoy,
Christian, Jakub, Markus and Gabriel.



Thank you for the Git Rev News. I've been off-line for 5 weeks, so seeing 
the newsletter is great.


Next is to peruse Junio's "What's Cooking" lists.

Thanks to all.

Philip 



Re: Crash when clone includes magic filenames on Windows

2018-02-10 Thread Philip Oakley

From: "Philip Oakley" <philipoak...@iee.org>

From: "Jeffrey Walton" <noloa...@gmail.com>

Hi Everyone,

I'm seeing this issue on Windows: https://pastebin.com/YfB25E4T . It
seems the filename AUX is the culprit. Also see
https://blogs.msdn.microsoft.com/oldnewthing/20031022-00/?p=42073 .
(Thanks to Milleneumbug on Stack Overflow).

I did not name the file, someone else did. I doubt the filename will be
changed.

Searching is not turning up much information:
https://www.google.com/search?q=git+"magic+filenames"+windows

Does anyone know how to sidestep the issue on Windows?

Jeff


This comes up on the Git-for-Windows (GfW) issues fairly often
https://github.com/git-for-windows/git/issues.

The fetch part of the clone is successful, but the final checkout step
fails: when AUX (or any other prohibited filename - that's proper
backward compatibility for you) is to be checked out, the file system
(FS) refuses and the checkout fails. You do, however, have the full repo
locally.

The trick is probably then to set up a sparse checkout so the AUX is never
included on the FS.

However it is an open 'up-for-grabs' project to add such a check in GfW.

Philip

One option may be to extend the $GIT_DIR/info/sparse-checkout capability and
add a specific $GIT_DIR/info/never-sparse-checkout file that could carry the
complement (files & dirs) options that are platform applicable (no AUX, no
COM1, no colons, etc.;-), so that it does not conflict with the users'
regular sparse checkout selection in $GIT_DIR/info/sparse-checkout. It's
probably easier to understand that way.
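
To make the idea a little more concrete, here is a small Python sketch
that flags paths Windows cannot check out and turns them into ordinary
sparse-checkout exclusion patterns. The function names are invented for
this example, the proposed never-sparse-checkout file does not exist, and
the reserved-name list (CON, PRN, AUX, NUL, COM1-9, LPT1-9) is only the
commonly documented set.

```python
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)
FORBIDDEN_CHARS = set('<>:"|?*\\')


def is_windows_unsafe(path):
    """True if any component of a '/'-separated repository path is a
    reserved device name or contains a character Windows rejects."""
    for component in path.split("/"):
        # "aux.c" is as reserved as "AUX": only the part before the
        # first dot counts, case-insensitively.
        stem = component.split(".", 1)[0].rstrip(" ").upper()
        if stem in RESERVED_NAMES or FORBIDDEN_CHARS & set(component):
            return True
    return False


def sparse_checkout_patterns(paths):
    """Include everything, then exclude each unsafe path."""
    return ["/*"] + ["!/" + p for p in paths if is_windows_unsafe(p)]
```

The path list could come from something like
`git ls-tree -r --name-only HEAD`. Paths containing glob characters would
need escaping before being written to $GIT_DIR/info/sparse-checkout, which
this sketch does not attempt.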
--
Philip



Re: Crash when clone includes magic filenames on Windows

2018-02-10 Thread Philip Oakley

From: "Jeffrey Walton" 

Hi Everyone,

I'm seeing this issue on Windows: https://pastebin.com/YfB25E4T . It
seems the filename AUX is the culprit. Also see
https://blogs.msdn.microsoft.com/oldnewthing/20031022-00/?p=42073 .
(Thanks to Milleneumbug on Stack Overflow).

I did not name the file, someone else did. I doubt the filename will be 
changed.


Searching is not turning up much information:
https://www.google.com/search?q=git+"magic+filenames"+windows

Does anyone know how to sidestep the issue on Windows?

Jeff

This comes up on the Git-for-Windows (GfW) issues fairly often 
https://github.com/git-for-windows/git/issues.


The fetch part of the clone is successful, but the final checkout step fails:
when AUX (or any other prohibited filename - that's proper backward
compatibility for you) is to be checked out, the file system (FS) refuses
and the checkout fails. You do, however, have the full repo locally.


The trick is probably then to set up a sparse checkout so the AUX is never 
included on the FS.


However it is an open 'up-for-grabs' project to add such a check in GfW.

Philip 



Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-09 Thread Philip Oakley

From: "Robert P. J. Day" <rpj...@crashcourse.ca>

On Fri, 9 Feb 2018, Philip Oakley, CEng MIET wrote:

(apologies for using the fancy letters after the name ID...)



From: "Robert P. J. Day" <rpj...@crashcourse.ca>
>
> writing a short tutorial on "git bisect" and, all the details of
> special exit code 125 aside, if one wanted to locate the first
> unbuildable commit, would it be sufficient to just run?
>
>  $ git bisect run make
>
> as i read it, make returns either 0, 1 or 2 so there doesn't appear
> to be any possibility of weirdness with clashing with a 125 exit code.
> am i overlooking some subtle detail here i should be aware of? thanks.
>
> rday

In the spirit of pedanticism, one should also clarify the word
"first", in that it's not a linear search for _an_ unbuildable
commit, but that one is looking for the transition between an
unbroken sequence of unbuildable commits, which transitions to
buildable commits, and it's the transition that is sought. (there
could be many random unbuildable commits within a sequence in some
folks' processes!)


 quite so, i should have been more precise.

rday


The other two things that may be happening (in the wider bisect discussion)
that I've heard of are:
1. There may be feature branches that bypass the known good starting commit,
which can cause confusion, as those side branches that predate the start
point are also considered potential bug commits.
2. If you just want the first-parent check for the bad commit point, then
mark the second parents of merges as being good.


Also, I'd expect that the skipped commits aren't 'counted' (too hard?) for 
the bisect algorithm's reporting.


https://stackoverflow.com/questions/5638211/how-do-you-get-git-bisect-to-ignore-merged-branches 
contains a number of the ideas..


Philip



Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-09 Thread Philip Oakley, CEng MIET

From: "Robert P. J. Day" 


 writing a short tutorial on "git bisect" and, all the details of
special exit code 125 aside, if one wanted to locate the first
unbuildable commit, would it be sufficient to just run?

 $ git bisect run make

 as i read it, make returns either 0, 1 or 2 so there doesn't appear
to be any possibility of weirdness with clashing with a 125 exit code.
am i overlooking some subtle detail here i should be aware of? thanks.

rday
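
For reference, the way `git bisect run` interprets the exit status of the
command it runs can be summarised in a few lines of Python. The function
name `bisect_run_verdict` is made up for this sketch; the ranges follow the
git-bisect documentation (0 is good, 125 is skip, any other code from 1 to
127 is bad, anything else aborts the bisection).

```python
def bisect_run_verdict(exit_code):
    """Classify an exit status the way `git bisect run` is documented to."""
    if exit_code == 0:
        return "good"
    if exit_code == 125:
        return "skip"   # this commit cannot be tested
    if 1 <= exit_code <= 127:
        return "bad"
    return "abort"      # e.g. 128 and above: stop bisecting


# make's usual exit codes, as listed in the question
for code in (0, 1, 2):
    print(code, bisect_run_verdict(code))
```

With make returning only 0, 1 or 2, a build failure always maps to "bad"
and never to "skip" or "abort".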



In the spirit of pedanticism, one should also clarify the word "first", in 
that it's not a linear search for _an_ unbuildable commit, but that one is 
looking for the transition between an unbroken sequence of unbuildable 
commits, which transitions to buildable commits, and it's the transition that
is sought. (there could be many random unbuildable commits within a sequence 
in some folks' processes!)
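
The point can be shown with a toy binary search in Python. This is a
deliberately simplified sketch over a linear history (real `git bisect`
works on the commit graph), and `find_transition` is a name invented for
the example.

```python
def find_transition(is_bad, good, bad):
    """Binary search between a known-good and a known-bad index.

    Returns the index of a bad commit whose predecessor tested good,
    i.e. a good-to-bad transition, not necessarily the earliest bad one.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad


# False = builds, True = does not build.
# Index 1 is an isolated unbuildable commit that was fixed at index 2.
history = [False, True, False, False, True, True]
print(find_transition(lambda i: history[i], 0, len(history) - 1))
```

Here the search only ever tests indexes 2, 3 and 4, so it reports the
transition at index 4 and never notices the isolated unbuildable commit at
index 1.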

--
Philip 



RE: git send-email sets date

2018-01-28 Thread Philip Oakley
Behalf Of brian m. carlson
> On Fri, Jan 26, 2018 at 06:32:30PM +0100, Michal Suchánek wrote:
> > git send-email sets the message date to author date.
> >
> > This is wrong because the message will most likely not get delivered
> > when the author date differs from current time. It might give slightly
> > better results with commit date instead of author date but can't it
> > just skip that header and leave it to the mailer?
> >
> > It does not even seem to have an option to suppress adding the date
> > header.
> 
> I'm pretty sure it's intended to work this way.
> 
> Without the Date header, we have no way of providing the author date
> when sending a patch.  git am will read this date and use it as the
> author date when applying patches, so if it's omitted, the author date
> will be wrong.
> 
> If you want to send patches with a different date, you can always insert
> the patch inline in your mailer using the scissors notation, which will
> allow your mailer to insert its own date while keeping the patch date
> separate.
> --

Michal, you may want to hack up an option that can automatically create 
that format if it is of use. I sometimes find the sort order an issue in 
some of my mail clients.
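
A very rough idea of what producing that format involves, as a Python
sketch. The function name and parameters are hypothetical, and it assumes
the receiving side applies the mail with scissors handling enabled (for
example `git am --scissors`), so that everything above the `-- >8 --` line
is dropped and the in-body From/Date/Subject headers are used.

```python
def scissors_body(note, author, date, subject, patch_text):
    """Build a mail body: free-form note, scissors line, in-body headers,
    then the commit message and diff (patch_text) unchanged."""
    return (
        "{note}\n"
        "\n"
        "-- >8 --\n"
        "From: {author}\n"
        "Date: {date}\n"
        "Subject: {subject}\n"
        "\n"
        "{patch}"
    ).format(note=note, author=author, date=date,
             subject=subject, patch=patch_text)
```

The mailer is then free to put the current time in the real Date header,
while the author date travels inside the body.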
--
Philip



Re: [PATCH 3/3] perf/aggregate: sort JSON fields in output

2018-01-28 Thread Philip Oakley

From: "Christian Couder" 

It is much easier to diff the output against a preivous


s/preivous/previous/


one when the fields are sorted.

Signed-off-by: Christian Couder 
---
t/perf/aggregate.perl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index d616d31ca8..fcc0313e65 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -253,7 +253,7 @@ sub print_codespeed_results {
 }
 }

- print to_json(\@data, {utf8 => 1, pretty => 1}), "\n";
+ print to_json(\@data, {utf8 => 1, pretty => 1, canonical => 1}), "\n";
}

binmode STDOUT, ":utf8" or die "PANIC on binmode: $!";
--
2.16.0.rc2.45.g09a1bbd803
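
For readers more familiar with Python than Perl: the `canonical => 1`
option makes the encoder emit hash keys in sorted order, and the closest
Python analogue is `sort_keys=True`. The sketch below is only an
illustration of why that helps diffing; the sample records are made up and
it is not part of the patch.

```python
import json


def dump_sorted(data):
    """Serialise with sorted keys so two runs diff cleanly."""
    return json.dumps(data, indent=3, sort_keys=True, ensure_ascii=False)


# Same content, different insertion order (hypothetical field names).
run_a = [{"benchmark": "p0001", "result_value": 0.12}]
run_b = [{"result_value": 0.12, "benchmark": "p0001"}]

print(dump_sorted(run_a) == dump_sorted(run_b))
```

Without key sorting the two dumps could differ textually even though they
describe the same results.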



Re: cygwin git and golang: how @{u} is handled

2018-01-21 Thread Philip Oakley

From: "John Cheng" 

I am experiencing a strange behavior and I'm not certain if it is a
problem with golang or the cygwin version of git.

Steps to reproduce:
Use golang's os/exec library to execute
exec.Command(os.Args[1],"log","@{u}") // where os.Args[1] is either
cygwin git or Windows git

Expected result:
commit 09357db3a29909c3498143b0d06989e00f5e2442
Author: John Cheng 
Date:   Sun Jan 14 10:57:01 2018 -0800
...

Actual result:
Suppose that cygwin git is specified, the result becomes:
exit status 128 fatal: ambiguous argument '@u': unknown revision or
path not in the working tree.

Version:
git version 2.15.1.windows.2
git version 2.15.1

I'm not certain if this is a git problem, as I could not reproduce
this problem using python to script cygwin git.

A list of scenarios I've tested are
1. golang + cygwin git = "exit code 128"
2. golang + windows git = "exit code 0"
3. python + cygwin git = "exit code 0"
4. python + windows git = "exit code 0"

I've tried to write a simple program to echo the command line
parameters passed by go into the process it executes - and it appears
that go itself does not change "@{u}" into "@u". I'm a bit stuck at
this point trying to figure out which may be the cause: golang or git.
I figured I'd start here.
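
The same kind of echo check can be done from Python, which the report
lists as a working combination. This is a minimal sketch with an invented
helper name; it only shows what the parent process hands to a child that
prints its first argument, and it does not reproduce the Cygwin behaviour
itself.

```python
import subprocess
import sys


def echo_argument(arg):
    """Spawn a child that writes back its first argument verbatim."""
    child = [
        sys.executable, "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        arg,
    ]
    result = subprocess.run(child, check=True, stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout


print(echo_argument("@{u}"))
```

Replacing the child command with a small program built against the runtime
under suspicion would show whether the braces are lost on the receiving
side rather than in the caller.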

There is a similar problem that a user is experiencing on Git-for-Windows,
which we/the user haven't got to the bottom of, but it appears to have a
similar form, where the braces appear to be, in some form, parsed twice
(though that's still a guess / hypothesis).


"Aliases in git are stripping curly-brackets (#1220)" 
https://github.com/git-for-windows/git/issues/1220#issuecomment-340341336




Philip



Re: [PATCH 10/8] [DO NOT APPLY, but improve?] rebase--interactive: introduce "stop" command

2018-01-18 Thread Philip Oakley

From: "Jacob Keller" 
On Thu, Jan 18, 2018 at 10:36 AM, Stefan Beller  
wrote:

Jake suggested using "x false" instead of "edit" for some corner cases.

I do prefer using "x false" for all kinds of things such as stopping
before a commit (edit only let's you stop after a commit), and the
knowledge that "x false" does the least amount of actions behind my back.

We should have that command as well, maybe?




I agree. I use "x false" very often, and I think stop is probably a
better solution since it avoids spawning an extra shell that will just
fail. Not sure if stop implies too much about "stop the whole thing"
as opposed to "stop here and let me do something manual", but I think
it's clear enough.

'hold' or 'pause' may be options (leads to
http://www.thesaurus.com/browse/put+on+hold offering procrastinate etc.)

'adjourn'.




Signed-off-by: Stefan Beller 
---
 git-rebase--interactive.sh |  1 +
 sequencer.c| 10 ++
 2 files changed, 11 insertions(+)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 3cd7446d0b..9eac53f0c5 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -166,6 +166,7 @@ l, label = label current HEAD with a name
 t, reset  = reset HEAD to a label
 b, bud = reset HEAD to the revision labeled 'onto', no arguments
 m, merge []* = create a merge commit using a given 
commit's message

+y, stay = stop for  shortcut for

 These lines can be re-ordered; they are executed from top to bottom.
 " | git stripspace --comment-lines >>"$todo"
diff --git a/sequencer.c b/sequencer.c
index 2b4e6b1232..4b3b9fe59d 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -782,6 +782,7 @@ enum todo_command {
TODO_RESET,
TODO_BUD,
TODO_MERGE,
+   TODO_STOP,
/* commands that do nothing but are counted for reporting 
progress */

TODO_NOOP,
TODO_DROP,
@@ -803,6 +804,7 @@ static struct {
{ 'l', "label" },
{ 't', "reset" },
{ 'b', "bud" },
+   { 'y', "stay" },
{ 'm', "merge" },
{ 0,   "noop" },
{ 'd', "drop" },
@@ -1307,6 +1309,12 @@ static int parse_insn_line(struct todo_item *item, 
const char *bol, char *eol)

return 0;
}

+   if (item->command == TODO_STOP) {
+   item->commit = NULL;
+   item->arg = "";
+   item->arg_len = 0;
+   }
+
end_of_object_name = (char *) bol + strcspn(bol, " \t\n");
item->arg = end_of_object_name + strspn(end_of_object_name, " 
\t");

item->arg_len = (int)(eol - item->arg);
@@ -2407,6 +2415,8 @@ static int pick_commits(struct todo_list 
*todo_list, struct replay_opts *opts)

/* `current` will be incremented below */
todo_list->current = -1;
}
+   } else if (item->command == TODO_STOP) {
+   todo_list->current = -1;
} else if (item->command == TODO_LABEL)
res = do_label(item->arg, item->arg_len);
else if (item->command == TODO_RESET)
--
2.16.0.rc1.238.g530d649a79-goog





Re: [PATCH 8/8] rebase -i: introduce --recreate-merges=no-rebase-cousins

2018-01-18 Thread Philip Oakley

From: "Johannes Schindelin" 

This one is a bit tricky to explain, so let's try with a diagram:

   C
 /   \
A - B - E - F
 \   /
   D

To illustrate what this new mode is all about, let's consider what
happens upon `git rebase -i --recreate-merges B`, in particular to
the commit `D`. In the default mode, the new branch structure is:

  --- C' --
 / \
A - B -- E' - F'
 \/
   D'

This is not really preserving the branch topology from before! The
reason is that the commit `D` does not have `B` as ancestor, and
therefore it gets rebased onto `B`.

However, when recreating branch structure, there are legitimate use
cases where one might want to preserve the branch points of commits that
do not descend from the  commit that was passed to the rebase
command, e.g. when a branch from core Git's `next` was merged into Git
for Windows' master we will not want to rebase those commits on top of a
Windows-specific commit. In the example above, the desired outcome would
look like this:

  --- C' --
 / \
A - B -- E' - F'
 \/
  -- D' --


I'm not understanding this. I see that D properly starts from A, but don't 
see why it is now D'. Surely it's unchanged.
Maybe it's the arc/node confusion. Maybe even spell out that the rebased 
commits from the command are B..HEAD, but that includes D, which may not be 
what folk had expected. (not even sure if the reflog comes into determining 
merge-bases here..)


I do think an exact definition is needed (e.g. via --ancestry-path or its 
equivalent?).




Let's introduce the term "cousins" for such commits ("D" in the
example), and the "no-rebase-cousins" mode of the merge-recreating
rebase, to help those use cases.
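
One possible way to pin the term down, sketched in Python: the cousins are
the commits in the rebased range that do not have the upstream commit as an
ancestor. The parent map below is an assumption, namely one reading of the
first diagram (C forks from B, D forks from A), and the function names are
invented for this example.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def cousins(parents, upstream, head):
    """Commits in upstream..head that do not descend from upstream."""
    rebased = ancestors(parents, head) - ancestors(parents, upstream)
    return {c for c in rebased if upstream not in ancestors(parents, c)}


# Assumed reading of the diagram: E merges D, F merges C.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["A"],
    "E": ["B", "D"], "F": ["E", "C"],
}
print(cousins(PARENTS, "B", "F"))
```

Under that reading the range B..F contains C, D, E and F, and only D lacks
B as an ancestor, which is the set an `--ancestry-path` style restriction
would leave out.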

Signed-off-by: Johannes Schindelin 
---
Documentation/git-rebase.txt  |  7 ++-
builtin/rebase--helper.c  |  9 -
git-rebase--interactive.sh|  1 +
git-rebase.sh | 12 +++-
sequencer.c   |  4 
sequencer.h   |  8 
t/t3430-rebase-recreate-merges.sh | 23 +++
7 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 1d061373288..ac07a5c3fc9 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -368,10 +368,15 @@ The commit list format can be changed by setting the 
configuration option
rebase.instructionFormat.  A customized instruction format will 
automatically

have the long commit hash prepended to the format.

---recreate-merges::
+--recreate-merges[=(rebase-cousins|no-rebase-cousins)]::
 Recreate merge commits instead of flattening the history by replaying
 merges. Merge conflict resolutions or manual amendments to merge
 commits are not preserved.
++
+By default, or when `rebase-cousins` was specified, commits which do not 
have
+`` as direct ancestor are rebased onto `` (or 
``,
+if specified). If the `rebase-cousins` mode is turned off, such commits 
will

+retain their original branch point.

-p::
--preserve-merges::
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index a34ab5c0655..ef08fef4d14 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,7 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, 
const char *prefix)

{
 struct replay_opts opts = REPLAY_OPTS_INIT;
 unsigned flags = 0, keep_empty = 0, recreate_merges = 0;
- int abbreviate_commands = 0;
+ int abbreviate_commands = 0, no_rebase_cousins = -1;
 enum {
 CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
 CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
@@ -23,6 +23,8 @@ int cmd_rebase__helper(int argc, const char **argv, 
const char *prefix)

 OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
 OPT_BOOL(0, "keep-empty", _empty, N_("keep empty commits")),
 OPT_BOOL(0, "recreate-merges", _merges, N_("recreate merge 
commits")),

+ OPT_BOOL(0, "no-rebase-cousins", _rebase_cousins,
+ N_("keep original branch points of cousins")),
 OPT_CMDMODE(0, "continue", , N_("continue rebase"),
 CONTINUE),
 OPT_CMDMODE(0, "abort", , N_("abort rebase"),
@@ -57,8 +59,13 @@ int cmd_rebase__helper(int argc, const char **argv, 
const char *prefix)

 flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0;
 flags |= abbreviate_commands ? TODO_LIST_ABBREVIATE_CMDS : 0;
 flags |= recreate_merges ? TODO_LIST_RECREATE_MERGES : 0;
+ flags |= no_rebase_cousins > 0 ? TODO_LIST_NO_REBASE_COUSINS : 0;
 flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;

+ if (no_rebase_cousins >= 0&& !recreate_merges)
+ warning(_("--[no-]rebase-cousins has no effect without "
+   "--recreate-merges"));
+
 if (command == CONTINUE && argc == 1)
 return !!sequencer_continue();
 if (command == ABORT && argc == 1)
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 3459ec5a018..23184c77e88 100644
--- 

Re: [PATCH 4/8] rebase-helper --make-script: introduce a flag to recreate merges

2018-01-18 Thread Philip Oakley

From: "Johannes Schindelin" 

The sequencer just learned a new commands intended to recreate branch
structure (similar in spirit to --preserve-merges, but with a
substantially less-broken design).

Let's allow the rebase--helper to generate todo lists making use of
these commands, triggered by the new --recreate-merges option. For a
commit topology like this:

A - B - C
  \   /
D


Could the topology include the predecessor for context? Also, it is easy for
readers to become confused between the arcs of the graphs and the nodes of
the graphs, such that we confuse 'commits as patches' with 'commits as
snapshots'. It might need an 'Aa' distinction between the two types,
especially around merges and potential evilness.




the generated todo list would look like this:

# branch D
pick 0123 A
label branch-point
pick 1234 D
label D

reset branch-point
pick 2345 B
merge 3456 D C

To keep things simple, we first only implement support for merge commits
with exactly two parents, leaving support for octopus merges to a later
patch in this patch series.
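
To see how such a todo list recreates the topology, here is a toy replay in
Python. It is a sketch of the idea only, not the sequencer's logic: the
argument order `merge <original> <label> <subject>` is inferred from the
example above, new commits are named by their subject plus a prime, and
"onto" stands for whatever commit the rebase starts from.

```python
TODO = """
# branch D
pick 0123 A
label branch-point
pick 1234 D
label D

reset branch-point
pick 2345 B
merge 3456 D C
"""


def replay(todo, onto="onto"):
    """Return {new_commit: [parents]} produced by replaying the list."""
    parents, labels, head = {}, {}, onto
    for line in todo.strip().splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue
        command = words[0]
        if command == "pick":
            new = words[2] + "'"
            parents[new] = [head]
            head = new
        elif command == "label":
            labels[words[1]] = head
        elif command == "reset":
            head = labels[words[1]]
        elif command == "merge":
            new = words[3] + "'"
            parents[new] = [head, labels[words[2]]]
            head = new
        else:
            raise ValueError("unknown command: " + command)
    return parents


print(replay(TODO))
```

The result has B' and D' both sitting on A', and C' as a merge of B' and
D', which mirrors the original graph.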

Signed-off-by: Johannes Schindelin 
---
builtin/rebase--helper.c |   4 +-
sequencer.c  | 343 
++-

sequencer.h  |   1 +
3 files changed, 345 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 7daee544b7b..a34ab5c0655 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -12,7 +12,7 @@ static const char * const builtin_rebase_helper_usage[] 
= {

int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
{
 struct replay_opts opts = REPLAY_OPTS_INIT;
- unsigned flags = 0, keep_empty = 0;
+ unsigned flags = 0, keep_empty = 0, recreate_merges = 0;
 int abbreviate_commands = 0;
 enum {
 CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
@@ -22,6 +22,7 @@ int cmd_rebase__helper(int argc, const char **argv, 
const char *prefix)

 struct option options[] = {
 OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
 OPT_BOOL(0, "keep-empty", _empty, N_("keep empty commits")),
+ OPT_BOOL(0, "recreate-merges", _merges, N_("recreate merge 
commits")),

 OPT_CMDMODE(0, "continue", , N_("continue rebase"),
 CONTINUE),
 OPT_CMDMODE(0, "abort", , N_("abort rebase"),
@@ -55,6 +56,7 @@ int cmd_rebase__helper(int argc, const char **argv, 
const char *prefix)


 flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0;
 flags |= abbreviate_commands ? TODO_LIST_ABBREVIATE_CMDS : 0;
+ flags |= recreate_merges ? TODO_LIST_RECREATE_MERGES : 0;
 flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;

 if (command == CONTINUE && argc == 1)
diff --git a/sequencer.c b/sequencer.c
index a96255426e7..1bef16647b4 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -23,6 +23,8 @@
#include "hashmap.h"
#include "unpack-trees.h"
#include "worktree.h"
+#include "oidmap.h"
+#include "oidset.h"

#define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"

@@ -2785,6 +2787,335 @@ void append_signoff(struct strbuf *msgbuf, int 
ignore_footer, unsigned flag)

 strbuf_release();
}

+struct labels_entry {
+ struct hashmap_entry entry;
+ char label[FLEX_ARRAY];
+};
+
+static int labels_cmp(const void *fndata, const struct labels_entry *a,
+   const struct labels_entry *b, const void *key)
+{
+ return key ? strcmp(a->label, key) : strcmp(a->label, b->label);
+}
+
+struct string_entry {
+ struct oidmap_entry entry;
+ char string[FLEX_ARRAY];
+};
+
+struct label_state {
+ struct oidmap commit2label;
+ struct hashmap labels;
+ struct strbuf buf;
+};
+
+static const char *label_oid(struct object_id *oid, const char *label,
+  struct label_state *state)
+{
+ struct labels_entry *labels_entry;
+ struct string_entry *string_entry;
+ struct object_id dummy;
+ size_t len;
+ int i;
+
+ string_entry = oidmap_get(>commit2label, oid);
+ if (string_entry)
+ return string_entry->string;
+
+ /*
+ * For "uninteresting" commits, i.e. commits that are not to be
+ * rebased, and which can therefore not be labeled, we use a unique
+ * abbreviation of the commit name. This is slightly more complicated
+ * than calling find_unique_abbrev() because we also need to make
+ * sure that the abbreviation does not conflict with any other
+ * label.
+ *
+ * We disallow "interesting" commits to be labeled by a string that
+ * is a valid full-length hash, to ensure that we always can find an
+ * abbreviation for any uninteresting commit's names that does not
+ * clash with any other label.
+ */
+ if (!label) {
+ char *p;
+
+ strbuf_reset(>buf);
+ strbuf_grow(>buf, GIT_SHA1_HEXSZ);
+ label = p = state->buf.buf;
+
+ find_unique_abbrev_r(p, oid->hash, default_abbrev);
+
+ /*
+ * We may need to extend the abbreviated hash so that there is
+ * no conflicting label.
+ */
+ if (hashmap_get_from_hash(&state->labels, strihash(p), p)) {
+ size_t i = strlen(p) + 1;
+
+ oid_to_hex_r(p, oid);
+ for (; i < GIT_SHA1_HEXSZ; i++) {
+ char save = p[i];
+ p[i] = '\0';
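
[The diff is cut off here in the archive.]

Read on its own, the comment in label_oid() describes a small search: start from the default abbreviation and lengthen it one character at a time until it no longer collides with an existing label. A minimal standalone sketch of that idea follows. It is not the patch's code: unique_label_len() and label_taken() are hypothetical names, and a linear scan over an array stands in for the labels hashmap.

```c
#include <string.h>

/*
 * Hypothetical stand-in for the label hashmap lookup: returns 1 if
 * the first `len` characters of `s` are exactly one of the labels.
 */
static int label_taken(const char *const *labels, size_t n,
		       const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strlen(labels[i]) == len && !strncmp(labels[i], s, len))
			return 1;
	return 0;
}

/*
 * Return the shortest prefix length of `hex`, at least `min_len`,
 * that does not equal any existing label. If every shorter prefix
 * is taken, the full length of `hex` is returned.
 */
size_t unique_label_len(const char *hex, size_t min_len,
			const char *const *labels, size_t n)
{
	size_t full = strlen(hex);
	size_t len = min_len;

	while (len < full && label_taken(labels, n, hex, len))
		len++;
	return len;
}
```

The full-length fallback is only safe because, as the comment says, ordinary labels are not allowed to look like a full-length hash.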

Re: [PATCH 1/8] sequencer: introduce new commands to reset the revision

2018-01-18 Thread Philip Oakley

From: "Jacob Keller" 

On Thu, Jan 18, 2018 at 7:35 AM, Johannes Schindelin
 wrote:

This commit implements the commands to label, and to reset to, given
revisions. The syntax is:

label 
reset 

As a convenience shortcut, also to improve readability of the generated
todo list, a third command is introduced: bud. It simply resets to the
"onto" revision, i.e. the commit onto which we currently rebase.



The code looks good, but I'm a little wary of adding bud which
hard-codes a specific label. I suppose it does grant a bit of
readability to the resulting script... ? It doesn't seem that
important compared to use using "reset onto"? At least when
documenting this it should be made clear that the "onto" label is
special.

Thanks,
Jake.


I'd agree.

The special 'onto' label should be fully documented, and the commit message 
should indicate which patch actually defines it (and all its corner cases 
and fall backs if --onto isn't explicitly given..)


Likewise the choice of 'bud' should be explained with some nice phraseology 
indicating that we are growing the new flowering from the bud, otherwise the 
word is a bit too short and sudden for easy explanation.


Philip 



Re: [PATCH] Remoted unnecessary void* from hashmap.h that caused compile warnings

2018-01-14 Thread Philip Oakley

From: 
Subject: [PATCH] Remoted unnecessary void* from hashmap.h that caused 
compile warnings


s/Remoted/Removed/ ?

Maybe shorten to " hashmap.h: remove unnecessary void* " (without the 
superfluous spaces)

--
Philip



From: "Randall S. Becker" 

* The while loop in the inline method hashmap_enable_item_counting
 used an unneeded variable. The loop has been revised accordingly.

Signed-off-by: Randall S. Becker 
---
hashmap.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hashmap.h b/hashmap.h
index 7ce79f3..d375d9c 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -400,7 +400,6 @@ static inline void hashmap_disable_item_counting(struct hashmap *map)

 */
static inline void hashmap_enable_item_counting(struct hashmap *map)
{
- void *item;
 unsigned int n = 0;
 struct hashmap_iter iter;

@@ -408,7 +407,7 @@ static inline void hashmap_enable_item_counting(struct hashmap *map)
 return;

 hashmap_iter_init(map, &iter);
- while ((item = hashmap_iter_next(&iter)))
+ while (hashmap_iter_next(&iter))
 n++;

 map->do_count_items = 1;
--
2.8.5.23.g6fa7ec3





Re: [PATCH/RFC] diff: add --compact-summary option to complement --stat

2018-01-13 Thread Philip Oakley

(one spelling spotted)..
From: "Nguyễn Thái Ngọc Duy" 

This is partly inspired by gerrit web interface which shows diffstat
like this, e.g. with commit 0433d533f1 (notice the "A" column on the
third line):

Documentation/merge-config.txt |  4 +
builtin/merge.c|  2 +
  A t/t5573-pull-verify-signatures.sh  | 81 ++
t/t7612-merge-verify-signatures.sh | 45 ++
  4 files changed, 132 insertions(+)

In other words, certain information currently shown with --summary is
embedded in the diffstat. This helps reading (all information of the
same file in the same line instead of two) and can reduce the number of
lines if you add/delete a lot of files.

The new option --compact-summary implements this with a tweak to support
mode change, which is shown in --summary too.

For mode changes, executable bit is denoted as "(+x)" or "(-x)" when
it's added or removed respectively. The same for when a regular file is
replaced with a symlink "(+l)" or the other way "(-l)". This also
applies to new files. New regular files are "A", while new executable
files or symlinks are "A+x" or "A+l".
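
[The marker rules in the paragraph above amount to a small decision table. The following is a rough sketch of that table only, written from the commit message rather than taken from the patch. compact_status() and the MODE_* names are hypothetical, and 0 is used here to mean "file absent".]

```c
#include <stddef.h>

/* Git-style file modes; MODE_NONE means "no file" in this sketch. */
#define MODE_NONE    0
#define MODE_REGULAR 0100644
#define MODE_EXEC    0100755
#define MODE_SYMLINK 0120000

static int is_symlink(unsigned mode)
{
	return (mode & 0170000) == 0120000;
}

static int is_exec(unsigned mode)
{
	return !is_symlink(mode) && (mode & 0100) != 0;
}

/*
 * Return the marker to show in front of the file name, or NULL
 * when neither a creation, a deletion nor a mode change happened.
 */
const char *compact_status(unsigned old_mode, unsigned new_mode)
{
	if (old_mode == MODE_NONE && new_mode == MODE_NONE)
		return NULL;
	if (old_mode == MODE_NONE) {
		if (is_symlink(new_mode))
			return "A+l";
		if (is_exec(new_mode))
			return "A+x";
		return "A";
	}
	if (new_mode == MODE_NONE)
		return "D";
	if (is_symlink(old_mode) != is_symlink(new_mode))
		return is_symlink(new_mode) ? "(+l)" : "(-l)";
	if (is_exec(old_mode) != is_exec(new_mode))
		return is_exec(new_mode) ? "(+x)" : "(-x)";
	return NULL;
}
```

[With this table, the t/t5573 line in the example diffstat would get "A" if the new script were a plain file and "A+x" if it were executable.]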

Note, there is still one piece of information missing from --summary,
the rename/copy percentage. That could probably be added later. It's not
as useful as the others anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
I have had something similar for years but the data is shown after
the path name instead (it's incidentally shown in the diffstat right
below). I was going to clean it up and submit it again, but my recent
experience with Gerrit changed my mind a bit about the output.

Documentation/diff-options.txt | 11 
diff.c | 64 +-
diff.h |  1 +
t/t4013-diff-various.sh|  5 ++
...y_--root_--stat_--compact-summary_initial (new) | 12 
...R_--root_--stat_--compact-summary_initial (new) | 12 
...ree_--stat_--compact-summary_initial_mode (new) |  4 ++
..._-R_--stat_--compact-summary_initial_mode (new) |  4 ++
8 files changed, 110 insertions(+), 3 deletions(-)
create mode 100644 t/t4013/diff.diff-tree_--pretty_--root_--stat_--compact-summary_initial
create mode 100644 t/t4013/diff.diff-tree_--pretty_-R_--root_--stat_--compact-summary_initial
create mode 100644 t/t4013/diff.diff-tree_--stat_--compact-summary_initial_mode
create mode 100644 t/t4013/diff.diff-tree_-R_--stat_--compact-summary_initial_mode


diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 9d1586b956..ff93ff74d0 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -188,6 +188,17 @@ and accumulating child directory counts in the parent directories:
 Output a condensed summary of extended header information
 such as creations, renames and mode changes.

+--compact-summary::
+ Output a condensed summary of extended header information in
+ front of the file name part of diffstat. This option is
+ ignored if --stat is not specified.
++
+Fle creations or deletions are denoted with "A" or "D" respectively,


s/Fle/File/ ?


+optionally "+l" if it's a symlink, or "+x" if it's executable.
+Mode changes are put in brackets, e.g. "+x" or "-x" for adding or
+removing executable bit respectively, "+l" or "-l" for becoming a
+symlink or a regular file.
+
ifndef::git-format-patch[]
--patch-with-stat::
 Synonym for `-p --stat`.
diff --git a/diff.c b/diff.c
index fb22b19f09..3f676d 100644
--- a/diff.c
+++ b/diff.c
@@ -2131,6 +2131,7 @@ struct diffstat_t {
 char *from_name;
 char *name;
 char *print_name;
+ const char *status_code;
 unsigned is_unmerged:1;
 unsigned is_binary:1;
 unsigned is_renamed:1;
@@ -2271,6 +2272,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
{
 int i, len, add, del, adds = 0, dels = 0;
 uintmax_t max_change = 0, max_len = 0;
+ int max_status_len = 0;
 int total_files = data->nr, count;
 int width, name_width, graph_width, number_width = 0, bin_width = 0;
 const char *reset, *add_c, *del_c;
@@ -2287,6 +2289,18 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
 add_c = diff_get_color_opt(options, DIFF_FILE_NEW);
 del_c = diff_get_color_opt(options, DIFF_FILE_OLD);

+ for (i = 0; (i < count) && (i < data->nr); i++) {
+ const struct diffstat_file *file = data->files[i];
+ int len;
+
+ if (!file->status_code)
+ continue;
+ len = strlen(file->status_code) + 1;
+
+ if (len > max_status_len)
+ max_status_len = len;
+ }
+
 /*
 * Find the longest filename and max number of changes
 */
@@ -2383,6 +2397,8 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
   options->stat_name_width < max_len) ?
 options->stat_name_width : max_len;

+ name_width += max_status_len;
+
 /*
 * Adjust adjustable widths not to exceed maximum width
 */
@@ -2402,6 +2418,8 

Re: Errors and other unpleasant things found by Cppcheck

2018-01-08 Thread Philip Oakley

From: "Friedrich Spee von Langenfeld" 

Hi,

I analyzed the GitHub repository with Cppcheck. The resulting XML file
is attached. Please open it in Cppcheck to view it comfortably.

Especially the bunch of errors could be of interest to you.


Hi,

Thanks for the submission.

The list prefers that useful information is in plain text so as to avoid 
opening file types that may hide undesirable effects.


Was your analysis part of an organised scan, or a personal insight? It would 
help to know the background.


The project does have a number of known and accepted cases of 'uninitialised 
variables' and known memory leaks which are acceptable in those cases.


If you picked out the few key issues that you feel should be addressed then 
a patch can be considered, e.g. the suggestion of the wildmatch macro (L263) 
that depends on the order of evaluation of side effects.


--
Philip 



Re: Re: Unify annotated and non-annotated tags

2017-12-24 Thread Philip Oakley

From: "anatoly techtonik" <techto...@gmail.com>

From: Philip Oakley
> So if I understand correctly, the hope is that `git show-ref --tags` could
> get an alternate option `--all-tags` [proper option name required...] such
> that the user would not have to develop the rather over-complicated
> expression that used a newish capability of a different command.

> Would that be right?



That's correct.



> Or at least update the man page docs to clarify the annotated vs
> non-annotated tags issue (many SO questions!).



Are there stats how many users read man pages and what is their
reading session length? I mean docs may not help much,


The "reading the manual" question is fairly well answered in the Human Error 
literature in terms of clarity and effectiveness, and the normal human error 
rates (for interest search for "Panko" "Spreadsheet errors" [1]). Typical 
human error rate is 1%. Most pilot error ends up being, in part, caused by 
confusing / incomplete manuals (i.e. we fail to support them).


If the manuals are the peak of perfection then they are well visited and the 
supporting material is usually good. If manuals are a sprawling upland with 
bogs, fissures and islands of inaccessibility, then they are rarely used.


Git does suffer from having a lot of separate commands, which makes seeing 
the woods for the trees difficult sometimes, especially as its core concepts 
are not always well understood.


Improving the manuals (as reference material) will always help, even if the 
trickle down effect is slow (made worse by alternate sources of error - 
Stackoverflow and blogs... ;-)


> And indicate if the --dereference and/or --hash options would do the trick!
> - maybe the "^{}" appended would be part of the problem (and need that new
> option "--objectreference" ).



--dereference would work if it didn't require extra processing.
It is hard to think about other option name that would give
desired result.
---
anatoly t.

--
Philip

[1] https://arxiv.org/abs/1602.02601  https://arxiv.org/pdf/1602.02601
"This paper reviews human cognition processes and shows first that humans 
cannot be error free no matter how hard they try, and second that our 
intuition about errors and how we can reduce them is based on appallingly 
bad knowledge." 



Re: [WIP 12/15] ls-refs: introduce ls-refs server command

2017-12-13 Thread Philip Oakley

From: "Brandon Williams" 
Sent: Monday, December 04, 2017 11:58 PM

Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a clinet can send a number of parameters in its request to limit the ref


s/clinet/client/


advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams 
---

Philip


Re: [PATCH 0/8] Codespeed perf results

2017-12-13 Thread Philip Oakley

From: "Christian Couder" 

This patch series is built on top of cc/perf-run-config which recently
graduated to master.

It makes it possible to send perf results to a Codespeed server. See
https://github.com/tobami/codespeed/ and web sites like
http://speed.pypy.org/ which are using Codespeed.

The end goal would be to have such a server always available to track
how the different git commands perform over time on different kind of
repos (small, medium, large, ...) with different optimizations on and
off (split-index, libpcre2, BLK_SHA1, ...)


Dumb question: is this expected to also be able to do a retrospective on the 
performance of appropriate past releases? That would allow immediate 
performance comparisons, rather than needing to wait for a few releases to 
see the trends.


Philip



With this series and a config file like:

$ cat perf.conf
[perf]
   dirsOrRevs = v2.12.0 v2.13.0
   repeatCount = 10
   sendToCodespeed = http://localhost:8000
   repoName = Git repo
[perf "with libpcre"]
   makeOpts = "DEVELOPER=1 USE_LIBPCRE=YesPlease"
[perf "without libpcre"]
   makeOpts = "DEVELOPER=1"

One should be able to just launch:

$ ./run --config perf.conf p7810-grep.sh

and then get nice graphs in a Codespeed instance running on
http://localhost:8000.

Caveat
~~~~~~

For now one has to create the "Git repo" environment in the Codespeed
admin interface. (We send the perf.repoName config variable in the
"environment" Codespeed field.) This is because Codespeed requires the
environment fields to be created and does not provide a simple way to
create these fields programmatically.

I might try to work around this problem in the future.

Links
~~~~~

This patch series is available here:

https://github.com/chriscool/git/commits/codespeed

The cc/perf-run-config patch series was discussed here:

v1: 
https://public-inbox.org/git/20170713065050.19215-1-chrisc...@tuxfamily.org/
v2: 
https://public-inbox.org/git/cap8ufd2j-ufh+9awz91gtz-jusq7euoexmguro59vpf29jx...@mail.gmail.com/


Christian Couder (8):
 perf/aggregate: fix checking ENV{GIT_PERF_SUBSECTION}
 perf/aggregate: refactor printing results
 perf/aggregate: implement codespeed JSON output
 perf/run: use $default_value instead of $4
 perf/run: add conf_opts argument to get_var_from_env_or_config()
 perf/run: learn about perf.codespeedOutput
 perf/run: learn to send output to codespeed server
 perf/run: read GIT_TEST_REPO_NAME from perf.repoName

t/perf/aggregate.perl | 164 +++---
t/perf/run            |  29 +++--
2 files changed, 140 insertions(+), 53 deletions(-)

--
2.15.1.361.g8b07d831d0





Re: [PATCH] partial-clone: design doc

2017-12-13 Thread Philip Oakley

From: "Junio C Hamano" <gits...@pobox.com>

"Philip Oakley" <philipoak...@iee.org> writes:


+  These filtered packfiles are incomplete in the traditional sense because
+  they may contain trees that reference blobs that the client does not have.


Is a comment needed here noting that currently, IIUC, the complete
trees are fetched in the packfiles, and it's just the unnecessary blobs
that are omitted?


I probably am misreading what you meant to say, but the above
statement with "currently" taken literally to mean the system
without JeffH's changes, is false.


I meant JeffH's current V6 series, rather than the last Git 
release.


In one of the previous discussions Jeff had noted that (at that time) his 
partial design would provide a full set of trees for the selected commits 
(excluding the trees already available locally), but only a few of the file 
blobs (based on the filter spec).


So yes, I should have been clearer to avoid talking at cross purposes.



When the receiver says it has commit A and the sender wants to send
a commit B (because the receiver said it does not have it, and it
wants it), trees in A are not sent in the pack the sender sends to
give objects sufficient to complete B, which the receiver wanted to
have, even if B also has those trees.  If you fetch from me twice
and between that time Documentation/ directory did not change, the
second fetch will not have the tree object that corresponds to that
hierarchy (and of course no blobs and sub trees inside it).


Though, after the fetch has completed (v2.15 Git), the receiver will have 
the 'full set of trees and blobs'. In Jeff's design (V6) the receiver would 
still have a full set of trees, but only a partial set of the blobs. So my 
viewpoint was not of the pack file but of the receiver's object store after 
the fetch.




So "the complete trees are fetched" is not true.  What is true (and
what matters more in JeffH's document) is that fetching is done in
such a way that objects resulting in the receiving repository are
complete in the current system that does not allow promised objects.
If some objects resulting in the receiving repository are incomplete,
the current system considers that we corrupted the repository.

The promise mechanism says that it is fine for the receiving end to
lack blobs, trees or commits, as long as the promisor repository
tells it that these "missing" objects can be obtained from it later.


True. (though I'm not sure exactly how Jeff decides about commits - I 
thought they were not part of this optimisation)



The way the receiving end which notices that it does not have an
otherwise required blob, tree or commit is one promised by the
promisor repository is to see if it is referenced by a pack that
came from such a promisor repository.


.. and marked as such with the ".promisor" extension.



Thanks. 



Re: [PATCH] partial-clone: design doc

2017-12-12 Thread Philip Oakley

From: "Jeff Hostetler" 

From: Jeff Hostetler 

First draft of design document for partial clone feature.

Signed-off-by: Jeff Hostetler 
Signed-off-by: Jonathan Tan 
---
Documentation/technical/partial-clone.txt | 240 ++
1 file changed, 240 insertions(+)
create mode 100644 Documentation/technical/partial-clone.txt

diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt
new file mode 100644
index 000..7ab39d8
--- /dev/null
+++ b/Documentation/technical/partial-clone.txt
@@ -0,0 +1,240 @@
+Partial Clone Design Notes
+==========================
+
+The "Partial Clone" feature is a performance optimization for git that
+allows git to function without having a complete copy of the repository.
+


I think it would be worthwhile at least listing the issues that make the 
'optimisation' necessary, and then the available factors that make the 
optimisation possible. This helps for future adjustments when those issues 
and factors change.


I think the issues are:
* the size of the repository that is being cloned, both in the width of a 
commit (you mentioned 100M trees) and the time (hours to days) / size to 
clone over the connection.


While the supporting factor is:
* the remote is always on-line and available for on-demand object fetching 
(seconds)


The solution choice then should fall out fairly obviously, and we can 
separate out the other optimisations that are based on other views about the 
issues. E.g. my desire for a solution in the off-line case.


In fact the current design, apart from some terminology, does look well 
matched, with only a couple of places that would be affected.


The airplane-mode expectations of a partial clone should also be stated.



+During clone and fetch operations, git normally downloads the complete
+contents and history of the repository.  That is, during clone the client
+receives all of the commits, trees, and blobs in the repository into a
+local ODB.  Subsequent fetches extend the local ODB with any new objects.
+For large repositories, this can take significant time to download and
+large amounts of diskspace to store.
+
+The goal of this work is to allow git to better handle extremely large
+repositories.


Shouldn't this goal be nearer the top?


   Often in these repositories there are many files that the
+user does not need such as ancient versions of source files, files in
+portions of the worktree outside of the user's work area, or large binary
+assets.  If we can avoid downloading such unneeded objects *in advance*
+during clone and fetch operations, we can decrease download times and
+reduce ODB disk usage.
+


Does this need to distinguish the shallow clone mechanism, which reduces the 
cloning of old history, from the desire for a width-wise partial clone of 
only the user's narrow work area, and/or one without large files/blobs?



+
+Non-Goals
+---------
+
+Partial clone is independent of and not intended to conflict with
+shallow-clone, refspec, or limited-ref mechanisms since these all operate
+at the DAG level whereas partial clone and fetch works *within* the set
+of commits already chosen for download.
+
+
+Design Overview
+---------------
+
+Partial clone logically consists of the following parts:
+
+- A mechanism for the client to describe unneeded or unwanted objects to
+  the server.
+
+- A mechanism for the server to omit such unwanted objects from packfiles
+  sent to the client.
+
+- A mechanism for the client to gracefully handle missing objects (that
+  were previously omitted by the server).
+
+- A mechanism for the client to backfill missing objects as needed.
+
+
+Design Details
+--------------
+
+- A new pack-protocol capability "filter" is added to the fetch-pack and
+  upload-pack negotiation.
+
+  This uses the existing capability discovery mechanism.
+  See "filter" in Documentation/technical/pack-protocol.txt.
+
+- Clients pass a "filter-spec" to clone and fetch which is passed to the
+  server to request filtering during packfile construction.
+
+  There are various filters available to accommodate different situations.
+  See "--filter=" in Documentation/rev-list-options.txt.
+
+- On the server pack-objects applies the requested filter-spec as it
+  creates "filtered" packfiles for the client.
+
+  These filtered packfiles are incomplete in the traditional sense because
+  they may contain trees that reference blobs that the client does not have.


Is a comment needed here noting that currently, IIUC, the complete trees are 
fetched in the packfiles, and it's just the unnecessary blobs that are 
omitted?



+
+
+ How the local repository gracefully handles missing objects
+
+With partial clone, the fact that objects can be missing makes such
+repositories incompatible with older versions of Git, necessitating a
+repository extension (see the 

Re: Re: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-12 Thread Philip Oakley

From: "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de>

Hi!

Sorry for the late response:
On a somewhat not-up-to date manual:

  -d, --delete
  Delete a branch. The branch must be fully merged in its upstream
  branch, or in HEAD if no upstream was set with --track or
  --set-upstream.


Maybe the topic of multiple branches pointing to the same commit could be 
mentioned (regarding the status of each such branch being considered to be 
merged or not). Also "fully merged" could be made a bit more precise, 
maybe.


Maybe gitglossary could have definitions for "merged" and "fully merged" 
with manual pages referring to it.


Thanks, I'll add your note to my list of clarifications.

Philip



Regards,
Ulrich


"Philip Oakley" <philipoak...@iee.org> wrote on 08.12.2017 at 21:26 in
message <582105F8768F4DA6AF4EC82888F0BFBE@PhilipOakley>:

From: "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de>

Hi Philip!

I'm unsure what you are asking for...

Ulrich


Hi Ulrich,

I was doing a retrospective follow up (of the second kind [1]).

In your initial email
https://public-inbox.org/git/5a1d70fd02a100029...@gwsmtp1.uni-regensburg.de/
you said

"I wanted to delete the temporary branch (which is of no use now), I got a
message that the branch is unmerged.
I think if more than one branches are pointing to the same commit, one
should be allowed to delete all but the last one without warning."

My retrospective question was to find out what part of the documentation
could be improved to assist fellow coders and Git users in gaining a better
understanding here. I think it's an easy mistake [2] to make and that we
should try to make the man pages more assistive.

I suspect that the description for the `git branch -d` needs a few more
words to clarify the 'merged/unmerged' issue for those who receive the
warning message. Or maybe the git-glossary, etc. I tend to believe that most
users will read some of the man pages, and would continue to do so if they
are useful.

I'd welcome any feedback or suggestions you could provide.
--
Philip


>>> "Philip Oakley" <philipoak...@iee.org> 04.12.17, 0.30 >>>
From: "Junio C Hamano" <gits...@pobox.com>
> "Philip Oakley" <philipoak...@iee.org> writes:
>
>> I think it was that currently you are on M, and neither A nor B are
>> ancestors (i.e. merged) of M.
>>
>> As Junio said:- "branch -d" protects branches that are yet to be
>> merged to the **current branch**.
>
> Actually, I think people loosened this over time and removal of
> branch X is not rejected even if the range HEAD..X is not empty, as
> long as X is marked to integrate with/build on something else with
> branch.X.{remote,merge} and the range X@{upstream}..X is empty.
>
> So the stress of "current branch" above you added is a bit of a
> white lie.

Ah, thanks. [I haven't had chance to check the code]

The man page does say:
.-d
.Delete a branch. The branch must be fully merged in its upstream
.branch, or in HEAD if no upstream was set with --track
.or --set-upstream.

It's whether or not Ulrich had joined the two aspects together, and if the
doc was sufficient to help recognise the 'unmerged' issue. Ulrich?
--
Philip




[1] Retrospective Second Directive, section 3.4.2 of (15th Ed) Agile
Processes in software engineering and extreme programming. ISBN 1628251042
(for the perspective of the retrospective..)
[2] 'mistake' colloquial part of the error categories of slips lapses and
mistakes : Human Error, by Reason (James, prof) ISBN 0521314194 (worthwhile)






Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-12 Thread Philip Oakley

From: "Christian Couder" 

On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano  wrote:


* jh/object-filtering (2017-12-05) 9 commits
  (merged to 'next' on 2017-12-05 at 3a56b51085)
 + rev-list: support --no-filter argument
 + list-objects-filter-options: support --no-filter
 + list-objects-filter-options: fix 'keword' typo in comment
  (merged to 'next' on 2017-11-27 at e5008c3b28)
 + pack-objects: add list-objects filtering
 + rev-list: add list-objects filtering support
 + list-objects: filter objects in traverse_commit_list
 + oidset: add iterator methods to oidset
 + oidmap: add oidmap iterator methods
 + dir: allow exclusions from blob in addition to file
 (this branch is used by jh/fsck-promisors and jh/partial-clone.)

 In preparation for implementing narrow/partial clone, the object
 walking machinery has been taught a way to tell it to "filter" some
 objects from enumeration.


* jh/fsck-promisors (2017-12-05) 12 commits
 - gc: do not repack promisor packfiles
 - rev-list: support termination at promisor objects
 - fixup: sha1_file: add TODO
 - fixup: sha1_file: convert gotos to break/continue
 - sha1_file: support lazily fetching missing objects
 - introduce fetch-object: fetch one promisor object
 - index-pack: refactor writing of .keep files
 - fsck: support promisor objects as CLI argument
 - fsck: support referenced promisor objects
 - fsck: support refs pointing to promisor objects
 - fsck: introduce partialclone extension
 - extension.partialclone: introduce partial clone extension
 (this branch is used by jh/partial-clone; uses jh/object-filtering.)

 In preparation for implementing narrow/partial clone, the machinery
 for checking object connectivity used by gc and fsck has been
 taught that a missing object is OK when it is referenced by a
 packfile specially marked as coming from trusted repository that
 promises to make them available on-demand and lazily.


I am currently working on integrating this series with my external odb
series 
(https://public-inbox.org/git/20170916080731.13925-1-chrisc...@tuxfamily.org/).


I too had seen that, as currently configured, the 'partialClone' could be 
seen as a method for using the remote as if it were an object database (odb) 
that was part of an 'always on-line' capability. However I'm cautious about 
locking out the original DVCS capability of being off-line relative to some, 
or all, remotes and still needing to work in 'airplane mode'.


It should be OK for the local narrowClone (my term) to be totally off-line 
for a while and still be able to work when back on line with other suitable 
remotes, even after the original remote has gone.




Instead of using an "extension.partialclone" config variable, an odb
will be configured like using an "odb..promisorRemote" (the
name might still change) config variable. Other odbs could still be
configured using "odb..scriptCommand" and
"odb..subprocessCommand".


The future work Jeff had indicated, IIRC, should be able to cope with 
multiple promisor remotes, which it is to be hoped this could handle. I'm not 
sure how the odb code would handle a partial failure where a partition of 
the odb stops being available.




The current work is still very much WIP and some tests fail, but you
can take a look there:

https://github.com/chriscool/git/tree/gl-promisor-external-odb440

--
Philip 



Re: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-08 Thread Philip Oakley

From: "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de>

Hi Philip!

I'm unsure what you are asking for...

Ulrich


Hi Ulrich,

I was doing a retrospective follow up (of the second kind [1]).

In your initial email
https://public-inbox.org/git/5a1d70fd02a100029...@gwsmtp1.uni-regensburg.de/
you said

"I wanted to delete the temporary branch (which is of no use now), I got a
message that the branch is unmerged.
I think if more than one branches are pointing to the same commit, one
should be allowed to delete all but the last one without warning."

My retrospective question was to find out what part of the documentation
could be improved to assist fellow coders and Git users in gaining a better
understanding here. I think it's an easy mistake [2] to make and that we
should try to make the man pages more assistive.

I suspect that the description for the `git branch -d` needs a few more
words to clarify the 'merged/unmerged' issue for those who receive the
warning message. Or maybe the git-glossary, etc. I tend to believe that most
users will read some of the man pages, and would continue to do so if they
are useful.

I'd welcome any feedback or suggestions you could provide.
--
Philip


>>> "Philip Oakley" <philipoak...@iee.org> 04.12.17, 0.30 >>>
From: "Junio C Hamano" <gits...@pobox.com>
> "Philip Oakley" <philipoak...@iee.org> writes:
>
>> I think it was that currently you are on M, and neither A nor B are
>> ancestors (i.e. merged) of M.
>>
>> As Junio said:- "branch -d" protects branches that are yet to be
>> merged to the **current branch**.
>
> Actually, I think people loosened this over time and removal of
> branch X is not rejected even if the range HEAD..X is not empty, as
> long as X is marked to integrate with/build on something else with
> branch.X.{remote,merge} and the range X@{upstream}..X is empty.
>
> So the stress of "current branch" above you added is a bit of a
> white lie.

Ah, thanks. [I haven't had chance to check the code]

The man page does say:
.-d
.Delete a branch. The branch must be fully merged in its upstream
.branch, or in HEAD if no upstream was set with --track
.or --set-upstream.

It's whether or not Ulrich had joined the two aspects together, and if the
doc was sufficient to help recognise the 'unmerged' issue. Ulrich?
--
Philip




[1] Retrospective Second Directive, section 3.4.2 of (15th Ed) Agile
Processes in software engineering and extreme programming. ISBN 1628251042
(for the perspective of the retrospective..)
[2] 'mistake' colloquial part of the error categories of slips lapses and
mistakes : Human Error, by Reason (James, prof) ISBN 0521314194 (worthwhile)
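
The man page wording quoted above can be read as a reachability test: "X is fully merged" means the range base..X is empty, where base is X's upstream if one is configured and HEAD otherwise. A toy sketch of that reading follows. It is not Git's implementation: the commit graph is a plain array, and toy_commit, reachable() and branch_is_merged() are names made up for this illustration.

```c
#define MAX_PARENTS 2
#define NO_COMMIT (-1)

/* Toy commit: indices of its parents in the graph array. */
struct toy_commit {
	int parents[MAX_PARENTS];
};

/* Return 1 if `target` can be reached from `from` via parent links. */
static int reachable(const struct toy_commit *graph, int from, int target)
{
	int i;

	if (from == NO_COMMIT)
		return 0;
	if (from == target)
		return 1;
	for (i = 0; i < MAX_PARENTS; i++)
		if (reachable(graph, graph[from].parents[i], target))
			return 1;
	return 0;
}

/*
 * The range base..tip is empty exactly when tip is reachable from
 * base. Use the upstream as base if there is one, else HEAD.
 */
int branch_is_merged(const struct toy_commit *graph, int tip,
		     int upstream, int head)
{
	int base = (upstream != NO_COMMIT) ? upstream : head;

	return reachable(graph, base, tip);
}
```

Under this reading, two branches that point at the same commit are each still "unmerged" when that commit is not reachable from HEAD and neither has an upstream, which matches the warning Ulrich received even though deleting one of them would lose nothing.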



Re: How hard would it be to implement sparse fetching/pulling?

2017-12-05 Thread Philip Oakley

From: "Jeff Hostetler" <g...@jeffhostetler.com>
Sent: Monday, December 04, 2017 3:36 PM


On 12/2/2017 11:30 AM, Philip Oakley wrote:

From: "Jeff Hostetler" <g...@jeffhostetler.com>
Sent: Friday, December 01, 2017 2:30 PM

On 11/30/2017 8:51 PM, Vitaly Arbuzov wrote:

I think it would be great if we high level agree on desired user
experience, so let me put a few possible use cases here.

1. Init and fetch into a new repo with a sparse list.
Preconditions: origin blah exists and has a lot of folders inside of
src including "bar".
Actions:
git init foo && cd foo
git config core.sparseAll true # New flag to activate all sparse
operations by default so you don't need to pass options to each
command.
echo "src/bar" > .git/info/sparse-checkout
git remote add origin blah
git pull origin master
Expected results: foo contains src/bar folder and nothing else,
objects that are unrelated to this tree are not fetched.
Notes: This should work same when fetch/merge/checkout operations are
used in the right order.
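
For comparison, the nearest thing available with stock git can be sketched as below. Note that `core.sparseAll` above is the proposed flag and does not exist; the sketch uses the existing `core.sparseCheckout` setting instead, which only narrows the working tree, so objects unrelated to src/bar are still fetched. `sparse_pull` is a hypothetical helper name.

```shell
# Sketch only: populate <dir> with just <path> from <url>'s master branch.
# Unlike the proposed behaviour, all objects are still downloaded.
sparse_pull() {
    url=$1; dir=$2; path=$3
    git init -q "$dir" &&
    git -C "$dir" config core.sparseCheckout true &&
    mkdir -p "$dir/.git/info" &&
    # One pattern per line, same syntax as .gitignore
    printf '%s\n' "$path" > "$dir/.git/info/sparse-checkout" &&
    git -C "$dir" remote add origin "$url" &&
    git -C "$dir" pull -q origin master
}
```

For the use case above this would be invoked as `sparse_pull blah foo src/bar`.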


With the current patches (parts 1,2,3) we can pass a blob-ish
to the server during a clone that refers to a sparse-checkout
specification.


I hadn't appreciated this capability. I see it as important, and it should
be available both ways, so that a .gitNarrow spec can be imposed from the
server side, as well as by the requester.


It could also be used to assist in the 'precious/secret' blob problem, so 
that AWS keys are never pushed, nor available for fetching!


To be honest, I've always considered partial clone/fetch as
a client-side request as a performance feature to minimize
download times and disk space requirements on the client.


Mine was a two way view where one side or other specified an extent for the 
narrow clone to achieve either the speed/space improvement or partitioning 
capability.



I've not thought of it from the "server has secrets" point
of view.


My potential for "secrets" was a little softer than some of the 'hard'
security that is often discussed. I'm for the layered risk approach (Swiss
cheese model).


We can talk about it, but I'd like to keep it outside the
scope of the current effort.


Agreed.


 My concerns are that that is
not the appropriate mechanism to enforce MAC/DAC like security
mechanisms.  For example:
[a] The client will still receive the containing trees that
refer to the sensitive blobs, so the user can tell when
the secret blobs change -- they wouldn't have either blob,
but can tell when they are changed.  This event by itself
may or may not leak sensitive information depending on the
terms of the security policy in place.
[b] The existence of such missing blobs would tell the client
which blobs are significant and secret and allow them to
focus their attack.  It would be better if those assets
were completely hidden and not in the tree at all.
[c] The client could push a fake secret blob to replace the
valid one on the server.  You would have to audit the
server to ensure that it never accepts a push containing
a change to any secret blob.  And the server would need
an infrastructure to know about all secrets in the tree.
[d] When a secret blob does change, any local merges by the
user lack information to complete the merge -- they can't
merge the secrets and they can't be trusted to correctly
pick-ours or pick-theirs -- so their workflows are broken.
I'm not trying to blindly spread FUD here, but it is arguments
like these that make me suggest that the partial clone mechanism
is not the right vehicle for such "secret" blobs.


I'm on the 'a little security is better than no security' side, but all the 
points are valid.






There's a bit of a chicken-n-egg problem getting
things set up. So if we assume your team would create a series
of "known enlistments" under version control, then you could


s/enlistments/entitlements/ I presume?


Within my org we speak of "enlistments" as subset of the tree
that you plan to work on.  For example, you might enlist in the
"file system" portion of the tree or in the "device drivers"
portion.  If the Makefiles have good partitioning, you should
only need one of the above portions to do productive work within
a feature area.

Ah, so it's the things that have been requested by the client (I'd like to
enlist...)




I'm not sure what you mean by "entitlements".


It is like having the title deeds to a house - a list of things you have, or
can have. (e.g. a father saying: you can have the car on Saturday 6pm-11pm)


At the end of the day the particular lists would be the same, they guide 
what is sent.







just reference one by : during your clone. The
server can lookup that blob and just use it.

git clone --filter=sparse:oid=master:templates/bar URL

And then the server will filter-out the unwanted blobs dur

Re: [RFE] Inverted sparseness (amended)

2017-12-05 Thread Philip Oakley

From: "Randall S. Becker" <rsbec...@nexbridge.com>
On December 3, 2017 6:14 PM, Philip Oakley wrote a nugget of wisdom:

From: "Randall S. Becker" <rsbec...@nexbridge.com>

[...]


If using the empty tree part doesn't pass muster (i.e. showing nothing
isn't sufficient), then the narrow clone could come into play to limit
what parts of the trees are widely visible, but mainly it's using the
grafts to cover the regulatory gap, and (for the moment) using
fast-export to transfer the singleton commit / tags.


Oh, just remembered, there is the newish capability to fetch random blobs,
so that may help.


I think you hit the nail on the head pretty well. We're currently at 2.3.7, 
with a push to 2.15.1 this week, so I'm looking forward to trying this. My 
two worries are whether the empty tree is acceptable (it should be to the 
client, and might be to the vendor), and doing this reliably 
(semi-automated) so the user base does not have to worry about the gory 
details of doing this. The unit tests for it are undoubtedly going to give 
me headaches.


Thanks for the advice. Islands of shallowness are a really descriptive image 
for what this is. So identifying that there are shoals (to extend the 
metaphor somewhat), will be crucial to this adventure.


These islands of shallowness, however, are also concerns as described in the 
[Re: How hard would it be to implement sparse fetching/pulling?] thread. The 
matter of the security audit is important here also:
I'm just thinking that even if we get a *perfectly working* partial 
clone/fetch/push/etc. that it would not pass a security audit.



Philip says:
I'd totally disagree in the sense that if we had a submodule anywhere in
the repo that would be an independent island of code, and we are quite happy 
with that - we use the web of trust with the auditors for them to go check, 
separately, the oid of the independent portion, which may be at another site 
or another vendor/client. That's OK, so what's the problem here...


We do the same for pinning the tips and tails of the lines of development 
that make for the shallowness and narrowness that create these shoals, and 
oxbows of development. Managing them is normal human activity, with the 
technical support that the Git chain provides - so much better than previous 
'versioning systems' that we see regularly in engineering, with backdoor 
tweaks etc.


The key is to ensure that there is proper hand-holding across the air
gaps, such that the oids exist on both sides of the gaps and are properly
built on, so that the hash chain is unbroken. It's a similar negotiation to
those used for establishing web security between IP clients, so it is
doable. But you are right to have concerns and suspicions, to ensure that it
is all tested and verified.

--
Philip (sorry about the poor quoting of the reply)




Not having the capability would similarly cause a failure of a security 
audit.


Cheers,
Randall

-- Brief whoami: NonStop developer since approximately 
UNIX(421664400)/NonStop(2112884442)

-- In my real life, I talk too much.





Re: [RFE] Inverted sparseness

2017-12-04 Thread Philip Oakley

From: "Randall S. Becker"
Sent: December 03, 2017 11:44 PM
On December 3, 2017 6:14 PM, Philip Oakley wrote a nugget of wisdom:

From: "Randall S. Becker" <rsbec...@nexbridge.com>
Sent: Friday, December 01, 2017 6:31 PM

On December 1, 2017 1:19 PM, Jeff Hostetler wrote:

On 12/1/2017 12:21 PM, Randall S. Becker wrote:

I recently encountered a really strange use-case relating to sparse
clone/fetch that is really backwards from the discussion that has
been going on, and well, I'm a bit embarrassed to bring it up, but I
have no good solution including building a separate data store that
will end up inconsistent with repositories (a bad solution).  The
use-case is as
follows:

Given a backbone of multiple git repositories spread across an
organization with a server farm and upstream vendors.
The vendor delivers code by having the client perform git pull into
a specific branch.
The customer may take the code as is or merge in customizations.
The vendor wants to know exactly what commit of theirs is installed
on each server, in near real time.
The customer is willing to push the commit-ish to the vendor's
upstream repo but does not want, by default, to share the actual
commit contents for security reasons.
Realistically, the vendor needs to know that their own commit id was
put somewhere (process exists to track this, so not part of the
use-case) and whether there is a subsequent commit contributed by
the customer, but the content is not relevant initially.

After some time, the vendor may request the commit contents from the
customer in order to satisfy support requirements - a.k.a. a defect
was found but has to be resolved.
The customer would then perform a deeper push that looks a lot like
a "slightly" symmetrical operation of a deep fetch following a prior
sparse fetch to supply the vendor with the specific commit(s).



Perhaps I'm not understanding the subtleties of what you're
describing, but could you do this with stock git functionality?



Let the vendor publish a "well known branch" for the client.
Let the client pull that and build.
Let the client create a branch set to the same commit that they fetched.
Let the client push that branch as a client-specific branch to the
vendor to indicate that that is the official release they are based on.



Then the vendor would know the official commit that the client was using.

This is the easy part, and it doesn't require anything sparse to exist.


If the client makes local changes, does the vendor really need the SHA
of those -- without the actual content?
I mean any SHA would do right?  Perhaps let the client create a second
client-specific branch (set to  the same commit as the first) to
indicate they had mods.
Later, when the vendor needs the actual client changes, the client
does a normal push to this 2nd client-specific branch at the vendor.
This would send everything that the client has done to the code since
the official release.


What I should have added to the use-case was that there is a strong
audit requirement (regulatory, actually) involved that the SHA is
exact, immutable, and cannot be substituted or forged (one of the
reasons git is in such high regard). So, no, I can't arrange a fake SHA
to represent a SHA to be named later. The SHA of the installed commit
is part of the official record of what happened on the specific server,
so I'm stuck with it.


I'm not sure what you mean about "it is inside a tree".


m---a---b---c---H1
 `---d---H2

d would be at a head. b would be inside. Determining content of c is
problematic if b is sparse, so I'm really unsure that any of this is
possible.



I think I get the gist of your use case. Would I be right that you don't
have a true working solution yet? i.e. that it's a problem that is almost
sorted but falls down at the last step.


If one pretended that this was a single development shop, with the various
vendors, clients and customers being independent developers, each of whom
is over-protective of their code, it may give a better view that maps onto
classic feature development diagrams. (i.e. draw the answer for local devs,
then mark where the splits happen)


In particular, I think you could use a notional regulator's view that the
whole code base is part of a large Git hierarchy of branches and merges, and
that some of the feature loops are only available via the particular
developer that worked on that feature.


This would mean that from a regulatory overview there is a merge commit in
the 'main' (master) hierarchy that has the main and feature commits listed,
and the feature commit is probably an --allow-empty commit (that has an
empty tree if they are that paranoid) that says 'function X released' (and
probably tagged), and that release commit then has, as its parent, the true
release commit, with the true code tree. The latter commit isn't actually
being shown to you!



At this point the potential for using the graft capa

Re: bug deleting "unmerged" branch (2.12.3)

2017-12-03 Thread Philip Oakley

From: "Junio C Hamano" <gits...@pobox.com>

"Philip Oakley" <philipoak...@iee.org> writes:


I think it was that currently you are on M, and neither A nor B are
ancestors (i.e. merged) of M.

As Junio said:- "branch -d" protects branches that are yet to be
merged to the **current branch**.


Actually, I think people loosened this over time and removal of
branch X is not rejected even if the range HEAD..X is not empty, as
long as X is marked to integrate with/build on something else with
branch.X.{remote,merge} and the range X@{upstream}..X is empty.

So the stress of "current branch" above you added is a bit of a
white lie.


Ah, thanks. [I haven't had a chance to check the code]

The man page does say:
.-d
.Delete a branch. The branch must be fully merged in its upstream
.branch, or in HEAD if no upstream was set with --track 
.or --set-upstream.


It's whether or not Ulrich had joined the two aspects together, and if the
doc was sufficient to help recognise the 'unmerged' issue. Ulrich?
--
Philip



Re: [RFE] Inverted sparseness

2017-12-03 Thread Philip Oakley

From: "Randall S. Becker" 
Sent: Friday, December 01, 2017 6:31 PM

On December 1, 2017 1:19 PM, Jeff Hostetler wrote:

On 12/1/2017 12:21 PM, Randall S. Becker wrote:
I recently encountered a really strange use-case relating to sparse 
clone/fetch that is really backwards from the discussion that has been 
going on, and well, I'm a bit embarrassed to bring it up, but I have no 
good solution including building a separate data store that will end up 
inconsistent with repositories (a bad solution).  The use-case is as 
follows:


Given a backbone of multiple git repositories spread across an 
organization with a server farm and upstream vendors.
The vendor delivers code by having the client perform git pull into a 
specific branch.

The customer may take the code as is or merge in customizations.
The vendor wants to know exactly what commit of theirs is installed on 
each server, in near real time.
The customer is willing to push the commit-ish to the vendor's upstream 
repo but does not want, by default, to share the actual commit contents 
for security reasons.
Realistically, the vendor needs to know that their own commit id was put 
somewhere (process exists to track this, so not part of the use-case) 
and whether there is a subsequent commit contributed by the customer,
but the content is not relevant initially.


After some time, the vendor may request the commit contents from the 
customer in order to satisfy support requirements - a.k.a. a defect was 
found but has to be resolved.
The customer would then perform a deeper push that looks a lot like a 
"slightly" symmetrical operation of a deep fetch following a prior 
sparse fetch to supply the vendor with the specific commit(s).


Perhaps I'm not understanding the subtleties of what you're describing, 
but could you do this with stock git functionality?



Let the vendor publish a "well known branch" for the client.
Let the client pull that and build.
Let the client create a branch set to the same commit that they fetched.
Let the client push that branch as a client-specific branch to the vendor 
to indicate that that is the official release they are based on.



Then the vendor would know the official commit that the client was using.

This is the easy part, and it doesn't require anything sparse to exist.

If the client makes local changes, does the vendor really need the SHA of 
those -- without the actual content?
I mean any SHA would do right?  Perhaps let the client create a second 
client-specific branch (set to

the same commit as the first) to indicate they had mods.
Later, when the vendor needs the actual client changes, the client does a 
normal push to this 2nd client-specific branch at the vendor.
This would send everything that the client has done to the code since the 
official release.


What I should have added to the use-case was that there is a strong audit
requirement (regulatory, actually) involved that the SHA is exact,
immutable, and cannot be substituted or forged (one of the reasons git is
in such high regard). So, no, I can't arrange a fake SHA to represent a SHA
to be named later. The SHA of the installed commit is part of the official
record of what happened on the specific server, so I'm stuck with it.



I'm not sure what you mean about "it is inside a tree".


m---a---b---c---H1
 `---d---H2

d would be at a head. b would be inside. Determining content of c is 
problematic if b is sparse, so I'm really unsure that any of this is 
possible.


Cheers,
Randall

-- Brief whoami: NonStop developer since approximately 
UNIX(421664400)/NonStop(2112884442)

-- In my real life, I talk too much.


I think I get the gist of your use case. Would I be right that you don't
have a true working solution yet? i.e. that it's a problem that is almost
sorted but falls down at the last step.


If one pretended that this was a single development shop, with the various
vendors, clients and customers being independent developers, each of whom
is over-protective of their code, it may give a better view that maps onto
classic feature development diagrams. (i.e. draw the answer for local devs,
then mark where the splits happen)


In particular, I think you could use a notional regulator's view that the
whole code base is part of a large Git hierarchy of branches and merges, and
that some of the feature loops are only available via the particular
developer that worked on that feature.


This would mean that from a regulatory overview there is a merge commit in
the 'main' (master) hierarchy that has the main and feature commits listed,
and the feature commit is probably an --allow-empty commit (that has an
empty tree if they are that paranoid) that says 'function X released' (and
probably tagged), and that release commit then has, as its parent, the true
release commit, with the true code tree. The latter commit isn't actually
being shown to you!
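
As a rough sketch of the "paranoid" variant described above, a placeholder release commit with an empty tree can be built with plumbing commands. `make_placeholder_release` is a hypothetical helper name, and the sketch only covers creating the commit; how the true parent is then withheld from the other party (grafts, shallow boundaries, etc.) is not shown here.

```shell
# Sketch: create a commit that says "released" but carries no content.
# Its tree is the empty tree; its parent is the real release commit.
make_placeholder_release() {
    real_release=$1; message=$2
    # -w writes the empty tree object so that commit-tree can refer to it
    empty_tree=$(git hash-object -t tree -w /dev/null) &&
    # Prints the id of the new commit; no branch or tag is updated
    git commit-tree "$empty_tree" -p "$real_release" -m "$message"
}
```

The printed commit id could then be tagged, e.g. `git tag released-X "$(make_placeholder_release HEAD 'function X released')"`, where the tag name is again just an example.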


At this point the potential for using 

Re: Re: Unify annotated and non-annotated tags

2017-12-02 Thread Philip Oakley

From: "anatoly techtonik" 

comment at end - Philip

On Fri, Nov 24, 2017 at 1:24 PM, Ævar Arnfjörð Bjarmason
 wrote:
On Fri, Nov 24, 2017 at 10:52 AM, anatoly techtonik  
wrote:

On Thu, Nov 23, 2017 at 6:08 PM, Randall S. Becker
 wrote:

On 2017-11-23 02:31 (GMT-05:00) anatoly techtonik wrote

Subject: Re: Unify annotated and non-annotated tags
On Sat, Nov 11, 2017 at 5:06 AM, Junio C Hamano  
wrote:

Igor Djordjevic  writes:


If you would like to mimic output of "git show-ref", repeating
commits for each tag pointing to it and showing full tag name as
well, you could do something like this, for example:

  # print "<commit> <refname>" for each tag, peeling annotated tags
  for tag in $(git for-each-ref --format="%(refname)" refs/tags)
  do
    printf '%s %s\n' "$(git rev-parse "$tag^0")" "$tag"
  done


Hope that helps a bit.


If you use for-each-ref's --format option, you could do something
like (pardon a long line):

  git for-each-ref --format='%(if)%(*objectname)%(then)%(*objectname)%(else)%(objectname)%(end) %(refname)' refs/tags


without any loop, I would think.

Thanks. That helps.
So my proposal is to get rid of non-annotated tags, so to get all
tags with commits that they point to, one would use:
git for-each-ref --format='%(*objectname) %(refname)' refs/tags>
For so-called non-annotated tags just leave the message empty.
I don't see why anyone would need non-annotated tags though.


I have seen non-annotated tags used in automations (not necessarily well 
written ones) that create tags as a record of automation activity. I am 
not sure we should be writing off the concept of unannotated tags 
entirely. This may cause breakage based on existing expectations of how 
tags work at present. My take is that tags should include whodunnit, 
even if it's just the version of the automation being used, but I don't 
always get to have my wishes fulfilled. In essence, whatever behaviour a 
non-annotated tag has now may need to be emulated in future even if 
reconciliation happens. An option to preserve empty tag compatibility 
with pre-2.16 behaviour, perhaps? Sadly, I cannot supply examples of 
this usage based on a human memory page-fault and NDAs.


Are there any windows for backward compatibility breaks, or is git
doomed to preserve it forever?
Automation without support won't survive for long, and people who rely
on that, like Chromium team, usually hard set the version used.


Git is not doomed to preserve anything forever. We've gradually broken
backwards compatibility for a few core things like these.

However, just as a bystander reading this thread I haven't seen any
compelling reason for why these should be removed. You initially had
questions about how to extract info about them, which you got answers
to.

So what reasons remain for why they need to be removed?


To reduce complexity and prior knowledge when dealing with Git tags.

For example, http://readthedocs.io/ site contains a lot of broken
"Edit on GitHub" links, for example - 
http://git-memo.readthedocs.io/en/stable/


And it appeared that the reason for that is discrepancy between git
annotated and non-annotated tags. The pull request that fixes the issue
after it was researched and understood is simple
https://github.com/rtfd/readthedocs.org/pull/3302

However, while looking through linked issues and PRs, one can try to
imagine how many days it took for people to come up with the solution,
which came from this thread.
--
anatoly t.





So if I understand correctly, the hope is that `git show-ref --tags` could
get an alternate option `--all-tags` [proper option name required...] such
that the user would not have to develop the rather over-complicated
expression that used a newish capability of a different command.


Would that be right?

Or at least update the man page docs to clarify the annotated vs 
non-annotated tags issue (many SO questions!).


And indicate if the --dereference and/or --hash options would do the 
trick! - maybe the "^{}" appended would be part of the problem (and need 
that new option "--objectreference" ).
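
To make the `--dereference` question concrete, here is a hedged sketch of how its output could be post-processed today. `git show-ref --tags --dereference` prints an extra `^{}` line for each annotated tag, so something still has to pick the peeled line where one exists. `tag_commits_via_show_ref` is a hypothetical helper name and the awk filter is just one way of doing it.

```shell
# Sketch: print "<object> <refname>" once per tag, preferring the peeled
# "^{}" entry that --dereference emits for annotated tags.
tag_commits_via_show_ref() {
    git show-ref --tags --dereference |
    awk '{
        name = $2
        peeled = sub(/\^\{\}$/, "", name)    # 1 when this is a "^{}" line
        if (peeled || !(name in commit)) commit[name] = $1
        if (!(name in seen)) { seen[name] = 1; order[++n] = name }
    }
    END { for (i = 1; i <= n; i++) print commit[order[i]], order[i] }'
}
```

That a filter like this is needed at all is arguably the point being made: the "^{}" suffix has to be understood and stripped before the output is usable.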


Philip 



Re: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-02 Thread Philip Oakley

From: "Ulrich Windl" 
To: 
Cc: 
Sent: Wednesday, November 29, 2017 8:32 AM
Subject: Antw: Re: bug deleting "unmerged" branch (2.12.3)





"Ulrich Windl"  writes:


I think if more than one branch is pointing to the same commit,
one should be allowed to delete all but the last one without
warning. Do you agree?


That comes from a viewpoint that the only purpose "branch -d" exists
in addition to "branch -D" is to protect objects from "gc".  Those
who added the safety feature may have shared that view originally,
but it turns out that it protects another important thing you are
forgetting.

Imagine that two topics, 'topicA' and 'topicB', were independently
forked from 'master', and then later we wanted to add a feature that
depends on these two topics.  Since the 'feature' forked, there may
have been other developments, and we ended up in this topology:

    ---o---o---o---o---o---M
        \   \
         \   o---A---o---F
          \         /
           o---o---o---o---B

where A, B and F are the tips of 'topicA', 'topicB' and 'feature'
branches right now [*1*].

Now imagine we are on 'master' and just made 'topicB' graduate.  We
would have this topology.

    ---o---o---o---o---o---o---M
        \   \                 /
         \   o---A---o---F   /
          \         /       /
           o---o---o---o---B

While we do have 'topicA' and 'feature' branches still in flight,
we are done with 'topicB'.  Even though the tip of 'topicA' is
reachable from the tip of 'feature', the fact that the branch points
at 'A' is still relevant.  If we lose that information right now,
we'd have to go find it when we (1) want to further enhance the
topic by checking out and building on 'topicA', and (2) want to
finally get 'topicA' graduate to 'master'.

Because removal of a topic (in this case 'topicB') is often done
after a merge of that topic is made into an integration branch,
"branch -d" that protects branches that are yet to be merged to the
current branch catches you if you said "branch -d topic{A,B}" (or
other equivalent forms, most likely you'd have a script that spits
out list of branches and feed it to "xargs branch -d").

So, no, I do not agree.
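
A way to preview which topics are in the same position as 'topicB' above, before running "branch -d", might look like the following. It is a sketch under a simplifying assumption: it only compares against HEAD and ignores any configured upstream. `deletable_branches` is a hypothetical helper name.

```shell
# Sketch: list local branches whose tips are reachable from HEAD,
# excluding the branch that is currently checked out.
deletable_branches() {
    current=$(git symbolic-ref --quiet --short HEAD || true)
    git for-each-ref --merged=HEAD --format='%(refname:short)' refs/heads |
    grep -v -x -F -e "$current"    # exits non-zero when nothing is left
}
```

In the second topology, run from 'master', this should list 'topicB' but neither 'topicA' nor 'feature'.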


Hi!

I can follow your argumentation, but I fail to see that your branches A 
and B point to the same commit (which is what I was talking about). So my 
situation would be:


o---oA,B

I still think I could safely remove either A or B, even when the branch 
(identified by the commit, not by the name) is unmerged. What did I miss?


I think it was that currently you are on M, and neither A nor B are 
ancestors (i.e. merged) of M.


As Junio said:- "branch -d" protects branches that are yet to be merged to 
the **current branch**.


[I said the same in another part of the thread. The question now would be 
what needs changing? the error/warning message, the docs, something else?]




Regards,
Ulrich




[Footnotes]

*1* Since the 'feature' started developing, there were a few commits
added to 'topicB' but because the feature does not depend on
these enhancements to that topic, B is ahead of the commit that
was originally merged with the tip of 'topicA' to form the
'feature' branch.






Re: Antw: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-02 Thread Philip Oakley

Hi Ulrich

From: "Johannes Schindelin" 
To: "Ulrich Windl" 
Cc: 
Sent: Wednesday, November 29, 2017 12:27 PM
Subject: Re: Antw: Re: bug deleting "unmerged" branch (2.12.3)



Hi Ulrich,

On Wed, 29 Nov 2017, Ulrich Windl wrote:


> On Tue, 28 Nov 2017, Ulrich Windl wrote:
>
>> During a rebase that turned out to be heavier than expected 8-( I
>> decided to keep the old branch by creating a temporary branch name to
>> the commit of the branch to rebase (which was still the old commit ID
>> at that time).
>>
>> When done rebasing, I attached a new name to the new (rebased)
>> branch, deleted the old name (pointing at the same rebase commit),
>> then recreated the old branch from the temporary branch name (created
>> to remember the commit id).
>>
>> When I wanted to delete the temporary branch (which is of no use
>> now), I got a message that the branch is unmerged.
>
> This is actually as designed, at least for performance reasons (it is
> not exactly cheap to figure out whether a given commit is contained in
> any other branch).
>
>> I think if more than one branches are pointing to the same commit,
>> one should be allowed to delete all but the last one without warning.
>> Do you agree?
>
> No, respectfully disagree, because I have found myself with branches
> pointing to the same commit, even if the branches served different
> purposes. I really like the current behavior where you can delete a
> branch with `git branch -d` as long as it is contained in its upstream
> branch.

I'm not talking about the intention of a branch, but of the state of a
branch: If multiple branches point to (not "contain") the same commit, they
are equivalent (besides the name) at that moment.


I did a poor job of explaining myself, please let me try again. I'll give
you one concrete example:

Recently, while working on some topic, I stumbled over a bug and committed
a bug fix, then committed that and branched off a new branch to remind
myself to rebase the bug fix and contribute it.

At that point, those branches were at the same revision, but distinctly
not equivalent (except in just one, very narrow sense of the word, which I
would argue is the wrong interpretation in this context).

Sadly, I was called away at that moment to take care of something
completely different. Even if I had not been, the worktree with the first
branch would still have been at that revision for a longer time, as I had
to try out a couple of changes before I could commit.

This is just one example where the idea backfires that you can safely
delete one of two branches that happen to point at the same commit at the
same time.

I am sure that you possess vivid enough of an imagination to come up with
plenty more examples where that is the case.


As no program can predict the future or the intentions of the user, it
should be safe to delete the branch, because it can easily be recreated
(from the remaining branches pointing to the same commit).


Yes, no program can predict the future (at least *accurately*).

No, it is not safe to delete that branch. Especially if you take the
current paradigm of "it is safe to delete a branch if it is up-to-date
with, or at least fast-forwardable to, its upstream branch" into account.

And no, a branch cannot easily be recreated from the remaining branches in
the future, as branches can have different reflogs (and they are lost when
deleting the branch).


It shouldn't need a lot of computational power to find out when multiple
branches point to the same commit.


Sure, that test can even be scripted easily by using the `git for-each-ref
--points-at=` command.
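
For illustration, such a script could be as small as the sketch below. `branches_at_same_commit` is a hypothetical helper name. It answers only the narrow question of which other local branches currently point at the same commit, not whether deleting one of them is wise.

```shell
# Sketch: list the other local branches pointing at the same commit as <branch>.
branches_at_same_commit() {
    branch=$1
    git for-each-ref --points-at="$branch" --format='%(refname:short)' refs/heads |
    grep -v -x -F -e "$branch"    # exits non-zero when there are none
}
```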

By the way, if you are still convinced that my argument is flawed and that
it should be considered safe to delete a branch if any other branch points
to the same revision, I encourage you to work on a patch to make it so.

For maximum chance of getting included, you would want to guard this
behind a new config setting, say, branch.deleteRedundantIsSafe, parse it
here:

https://github.com/git/git/blob/v2.15.1/config.c#L1260-L1288

or here:

https://github.com/git/git/blob/v2.15.1/builtin/branch.c#L78-L97



I'd agree that it is easy to misinterpret the message. After close reading 
of the thread, Junio put his finger on the scenario with:


-  "branch -d" protects branches that are yet to be merged to the 
**current** branch.   (my emphasis)


Maybe the error message could say that (what exactly was the error 
message?),

or the documentation be improved to clarify.



document it here:

https://github.com/git/git/blob/v2.15.1/Documentation/git-branch.txt

and here:

https://github.com/git/git/blob/v2.15.1/Documentation/config.txt#L969

and handle it here:

https://github.com/git/git/blob/v2.15.1/builtin/branch.c#L185-L288

(look for the places where `force` is used, likely just before the call to
`check_branch_commit()`).

The way you'd want it to handle is most likely by 

Re: [add-default-config] add --default option to git config.

2017-12-02 Thread Philip Oakley

From: "Soukaina NAIT HMID" 

From: Soukaina NAIT HMID 




From a cursory read, this does need a bit more explanation.


I see you also add a --color description and code, and don't say what the 
problem being solved is.


If it is tricky to explain, then a two patch series may tease apart the
issues. Perhaps add the --color option first (noting you'll use it in the
next patch), then a second patch that explains about the --default problem.


The patch title should be something like "[PATCH 1/n] config: add --default 
option"


You may also want to explain the test rationale, and maybe split them if 
appropriate.


--
Philip



Signed-off-by: Soukaina NAIT HMID 
---
Documentation/git-config.txt |   4 ++
builtin/config.c |  34 -
config.c |  10 +++
config.h |   1 +
t/t1300-repo-config.sh   | 161 
+++

5 files changed, 209 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 4edd09fc6b074..5d5cd58fdae37 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -179,6 +179,10 @@ See also <>.
 specified user.  This option has no effect when setting the
 value (but you can use `git config section.variable ~/`
 from the command line to let your shell do the expansion).
+--color::
+ Find the color configured for `name` (e.g. `color.diff.new`) and
+ output it as the ANSI color escape sequence to the standard
+ output.

-z::
--null::
diff --git a/builtin/config.c b/builtin/config.c
index d13daeeb55927..5e5b998b7c892 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -30,6 +30,7 @@ static int end_null;
static int respect_includes_opt = -1;
static struct config_options config_options;
static int show_origin;
+static const char *default_value;

#define ACTION_GET (1<<0)
#define ACTION_GET_ALL (1<<1)
@@ -52,6 +53,8 @@ static int show_origin;
#define TYPE_INT (1<<1)
#define TYPE_BOOL_OR_INT (1<<2)
#define TYPE_PATH (1<<3)
+#define TYPE_COLOR (1<<4)
+

static struct option builtin_config_options[] = {
 OPT_GROUP(N_("Config file location")),
@@ -80,11 +83,13 @@ static struct option builtin_config_options[] = {
 OPT_BIT(0, "int", , N_("value is decimal number"), TYPE_INT),
 OPT_BIT(0, "bool-or-int", , N_("value is --bool or --int"), 
TYPE_BOOL_OR_INT),
 OPT_BIT(0, "path", , N_("value is a path (file or directory 
name)"), TYPE_PATH),
+ OPT_BIT(0, "color", , N_("find the color configured"), 
TYPE_COLOR),

 OPT_GROUP(N_("Other")),
 OPT_BOOL('z', "null", _null, N_("terminate values with NUL byte")),
 OPT_BOOL(0, "name-only", _values, N_("show variable names only")),
 OPT_BOOL(0, "includes", _includes_opt, N_("respect include 
directives on lookup")),
 OPT_BOOL(0, "show-origin", _origin, N_("show origin of config (file, 
standard input, blob, command line)")),
+ OPT_STRING(0, "default", _value, N_("default-value"), N_("sets 
default value when no value is returned from config")),

 OPT_END(),
};

@@ -159,6 +164,13 @@ static int format_config(struct strbuf *buf, const 
char *key_, const char *value

 return -1;
 strbuf_addstr(buf, v);
 free((char *)v);
+ }
+ else if (types == TYPE_COLOR) {
+ char *v = xmalloc(COLOR_MAXLEN);
+ if (git_config_color(, key_, value_) < 0)
+ return -1;
+ strbuf_addstr(buf, v);
+ free((char *)v);
 } else if (value_) {
 strbuf_addstr(buf, value_);
 } else {
@@ -244,8 +256,16 @@ static int get_value(const char *key_, const char 
*regex_)

 config_with_options(collect_config, ,
 _config_source, _options);

- ret = !values.nr;
+ if (!values.nr && default_value && types) {
+ struct strbuf *item;
+ ALLOC_GROW(values.items, values.nr + 1, values.alloc);
+ item = [values.nr++];
+ if(format_config(item, key_, default_value) < 0){
+ values.nr = 0;
+ }
+ }

+ ret = !values.nr;
 for (i = 0; i < values.nr; i++) {
 struct strbuf *buf = values.items + i;
 if (do_all || i == values.nr - 1)
@@ -268,6 +288,7 @@ static int get_value(const char *key_, const char 
*regex_)

 return ret;
}

+
static char *normalize_value(const char *key, const char *value)
{
 if (!value)
@@ -281,6 +302,17 @@ static char *normalize_value(const char *key, const 
char *value)

 * when retrieving the value.
 */
 return xstrdup(value);
+ if (types == TYPE_COLOR)
+ {
+ char *v = xmalloc(COLOR_MAXLEN);
+ if (git_config_color(, key, value) == 0)
+ {
+ free((char *)v);
+ return xstrdup(value);
+ }
+ free((char *)v);
+ die("cannot parse color '%s'", value);
+ }
 if (types == TYPE_INT)
 return xstrfmt("%"PRId64, git_config_int64(key, value));
 if (types == TYPE_BOOL)
diff --git a/config.c b/config.c
index 903abf9533b18..5c5daffeb6723 100644
--- a/config.c
+++ b/config.c
@@ -16,6 +16,7 @@
#include "string-list.h"
#include "utf8.h"
#include "dir.h"
+#include "color.h"

struct config_source {
 struct config_source *prev;
@@ -990,6 +991,15 @@ int git_config_pathname(const char **dest, const 

Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

From: "Jeff Hostetler" <g...@jeffhostetler.com>
Sent: Friday, December 01, 2017 5:23 PM

On 11/30/2017 6:43 PM, Philip Oakley wrote:

From: "Vitaly Arbuzov" <v...@uber.com>

[...]

comments below..


On Thu, Nov 30, 2017 at 9:01 AM, Vitaly Arbuzov <v...@uber.com> wrote:

Hey Jeff,

It's great, I didn't expect that anyone is actively working on this.
I'll check out your branch, meanwhile do you have any design docs that
describe these changes or can you define high level goals that you
want to achieve?

On Thu, Nov 30, 2017 at 6:24 AM, Jeff Hostetler <g...@jeffhostetler.com>
wrote:



On 11/29/2017 10:16 PM, Vitaly Arbuzov wrote:

[...]




I have, for separate reasons been _thinking_ about the issue ($dayjob is in
defence, so a similar partition would be useful).

The changes would almost certainly need to be server side (as well as client
side), as it is the server that decides what is sent over the wire in the
pack files, which would need to be a 'narrow' pack file.


Yes, there will need to be both client and server changes.
In the current 3 part patch series, the client sends a "filter_spec"
to the server as part of the fetch-pack/upload-pack protocol.
If the server chooses to honor it, upload-pack passes the filter_spec
to pack-objects to build an "incomplete" packfile omitting various
objects (currently blobs).  Proprietary servers will need similar
changes to support this feature.

Discussing this feature in the context of the defense industry
makes me a little nervous.  (I used to be in that area.)


I'm viewing the desire for codebase partitioning from a soft layering of 
risk view (perhaps a more UK than USA approach ;-)



What we have in the code so far may be a nice start, but
probably doesn't have the assurances that you would need
for actual deployment.  But it's a start


True. I need to get some of my colleagues more engaged...





If we had such a feature then all we would need on top is a separate
tool that builds the right "sparse" scope for the workspace based on
paths that developer wants to work on.

In the world where more and more companies are moving towards large
monorepos this improvement would provide a good way of scaling git to
meet this demand.


The 'companies' problem is that it tends to force a client-server, always-on
on-line mentality. I'm also wanting the original DVCS off-line capability to
still be available, with _user_ control, in a generic sense, of what they
have locally available (including files/directories they have not yet looked
at, but expect to have). IIUC Jeff's work is that on-line view, without the
off-line capability.

I'd commented early in the series at [1,2,3].


Yes, this does tend to lead towards an always-online mentality.
However, there are 2 parts:
[a] dynamic object fetching for missing objects, such as during a
random command like diff or blame or merge.  We need this
regardless of usage -- because we can't always predict (or
dry-run) every command the user might run in advance.


Making something "useful" happen here when off-line is an obvious goal.


[b] batch fetch mode, such as using partial-fetch to match your
sparse-checkout so that you always have the blobs of interest
to you.  And assuming you don't wander outside of this subset
of the tree, you should be able to work offline as usual.
If you can work within the confines of [b], you wouldn't need to
always be online.


I feel this is the area that does need to ensure a capability to avoid any 
perception of the much maligned 'Embrace, extend, and extinguish' by 
accidental lockout.


I don't think this should be viewed as a type of sparse checkout - it's just 
a checkout of what you have (under the hood it could use the same code 
though).




We might also add a part [c] with explicit commands to back-fill or
alter your incomplete view of the ODB (as I explained in response
to the "git diff  " comment later in this thread).



At its core, my idea was to use the object store to hold markers for the
'not yet fetched' objects (mainly trees and blobs). These would be in a 
known fixed format, and have the same effect (conceptually) as the 
sub-module markers - they _confirm_ the oid, yet say 'not here, try 
elsewhere'.


We do have something like this.  Jonathan can explain better than I, but
basically, we denote possibly incomplete packfiles from partial clones
and fetches as "promisor" and have special rules in the code to assert
that a missing blob referenced from a "promisor" packfile is OK and can
be fetched later if necessary from the "promising" remote.
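
A rough way to picture that rule, as a Python toy and not the actual implementation: a missing object is only an error when nothing in a "promisor" packfile refers to it. The function `check_connectivity`, its arguments and the object names are all invented for this sketch.

```python
def check_connectivity(present, references, promisor_referrers):
    """Return (referrer, oid) pairs for missing objects that are not excused.

    present            -- set of oids available in the local object store
    references         -- dict mapping a referrer oid to the oids it points at
    promisor_referrers -- referrer oids that live in "promisor" packfiles
    """
    errors = []
    for referrer, targets in references.items():
        for oid in targets:
            missing = oid not in present
            promised = referrer in promisor_referrers
            if missing and not promised:
                errors.append((referrer, oid))
    return errors


present = {"tree1", "tree2", "blobA"}
references = {"tree1": ["blobA", "blobB"], "tree2": ["blobC"]}
promisor_referrers = {"tree1"}  # tree1 arrived via a partial clone

# blobB is missing but promised; blobC is missing and nobody promised it.
print(check_connectivity(present, references, promisor_referrers))
```

The point of doing it this way round is that nothing has to be recorded per missing object; the excuse hangs off the packfile that holds the referrer.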


The remote interaction is one area that may need thought, especially in a 
triangle workflow, of which there are a few.




The main problem with markers or other lists of missing objects is
that it has scale problems for large repos.  Suppose I have 100M
blobs in my repo

Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

Hi Jonathan,

Thanks for the outline. It has helped clarify some points and shown the very 
similar alignments.


The one thing I wasn't clear about is the "promised" objects/remote. Is that 
"promisor" remote a fixed entity, or could it be one of many remotes that 
could be a "provider"? (sort of like fetching sub-modules...)


Philip

From: "Jonathan Nieder" 
Sent: Friday, December 01, 2017 2:51 AM

Hi Vitaly,

Vitaly Arbuzov wrote:


I think it would be great if we high level agree on desired user
experience, so let me put a few possible use cases here.


I think one thing this thread is pointing to is a lack of overview
documentation about how the 'partial clone' series currently works.
The basic components are:

1. extending git protocol to (1) allow fetching only a subset of the
   objects reachable from the commits being fetched and (2) later,
   going back and fetching the objects that were left out.

   We've also discussed some other protocol changes, e.g. to allow
   obtaining the sizes of un-fetched objects without fetching the
   objects themselves

2. extending git's on-disk format to allow having some objects not be
   present but only be "promised" to be obtainable from a remote
   repository.  When running a command that requires those objects,
   the user can choose to have it either (a) error out ("airplane
   mode") or (b) fetch the required objects.

   It is still possible to work fully locally in such a repo, make
   changes, get useful results out of "git fsck", etc.  It is kind of
   similar to the existing "shallow clone" feature, except that there
   is a more straightforward way to obtain objects that are outside
   the "shallow" clone when needed on demand.

3. improving everyday commands to require fewer objects.  For
   example, if I run "git log -p", then I want to see the history of
   most files but I don't necessarily want to download large binary
   files just to print 'Binary files differ' for them.

   And by the same token, we might want to have a mode for commands
   like "git log -p" to default to restricting to a particular
   directory, instead of downloading files outside that directory.

   There are some fundamental changes to make in this category ---
   e.g. modifying the index format to not require entries for files
   outside the sparse checkout, to avoid having to download the
   trees for them.
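
Component 2 above (error out in "airplane mode" versus fetch on demand) can be sketched in a few lines of Python. This is only a model of the behaviour described: the class `LazyStore`, the exception `MissingObject` and the dict standing in for the promising remote are assumptions for illustration, not part of the series.

```python
class MissingObject(Exception):
    """Raised when a promised object is absent and fetching is not allowed."""


class LazyStore:
    def __init__(self, local, remote, offline=False):
        self.local = dict(local)  # objects we actually have
        self.remote = remote      # stand-in for the promising remote
        self.offline = offline    # True means "airplane mode"

    def read(self, oid):
        if oid in self.local:
            return self.local[oid]
        if self.offline:
            raise MissingObject(oid)  # (a) error out
        data = self.remote[oid]       # (b) fetch the required object
        self.local[oid] = data        # keep it for later offline use
        return data


remote = {"blobA": b"small", "blobB": b"large binary"}
store = LazyStore({"blobA": b"small"}, remote)
print(store.read("blobB"))  # fetched on demand, then cached locally
```

Once an object has been fetched it stays in the local store, so a later offline session can still read it.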

The overall goal is to make git scale better.

The existing patches do (1) and (2), though it is possible to do more
in those categories. :)  We have plans to work on (3) as well.

These are overall changes that happen at a fairly low level in git.
They mostly don't require changes command-by-command.

Thanks,
Jonathan 




Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

From: "Jeff Hostetler" 
Sent: Friday, December 01, 2017 2:30 PM

On 11/30/2017 8:51 PM, Vitaly Arbuzov wrote:

I think it would be great if we high level agree on desired user
experience, so let me put a few possible use cases here.

1. Init and fetch into a new repo with a sparse list.
Preconditions: origin blah exists and has a lot of folders inside of
src including "bar".
Actions:
git init foo && cd foo
git config core.sparseAll true # New flag to activate all sparse
operations by default so you don't need to pass options to each
command.
echo "src/bar" > .git/info/sparse-checkout
git remote add origin blah
git pull origin master
Expected results: foo contains src/bar folder and nothing else,
objects that are unrelated to this tree are not fetched.
Notes: This should work same when fetch/merge/checkout operations are
used in the right order.


With the current patches (parts 1,2,3) we can pass a blob-ish
to the server during a clone that refers to a sparse-checkout
specification.


I hadn't appreciated this capability. I see it as important, and should be 
available both ways, so that a .gitNarrow spec can be imposed from the 
server side, as well as by the requester.


It could also be used to assist in the 'precious/secret' blob problem, so 
that AWS keys are never pushed, nor available for fetching!



   There's a bit of a chicken-n-egg problem getting
things set up.  So if we assume your team would create a series
of "known enlistments" under version control, then you could


s/enlistments/entitlements/ I presume?


just reference one by : during your clone.  The
server can lookup that blob and just use it.

git clone --filter=sparse:oid=master:templates/bar URL

And then the server will filter-out the unwanted blobs during
the clone.  (The current version only filters blobs; you still
get full commits and trees.  That will be revisited later.)
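
To illustrate the blob filtering only, here is a small Python sketch. It assumes the sparse specification is a plain list of directory prefixes, which is a simplification: real sparse-checkout files use gitignore-style patterns. The function `filter_blobs` and the sample paths are hypothetical.

```python
def filter_blobs(tree, sparse_dirs):
    """Split a flat {path: blob oid} listing into wanted and omitted blobs."""
    prefixes = [d.rstrip("/") + "/" for d in sparse_dirs]
    wanted, omitted = {}, []
    for path, oid in sorted(tree.items()):
        if any(path.startswith(p) for p in prefixes):
            wanted[path] = oid    # blob goes into the packfile
        else:
            omitted.append(path)  # blob is left out, to be fetched later
    return wanted, omitted


tree = {
    "README": "oid4",
    "src/bar/a.c": "oid1",
    "src/barn/c.c": "oid3",
    "src/baz/b.c": "oid2",
}
wanted, omitted = filter_blobs(tree, ["src/bar"])
print(wanted)
print(omitted)
```

Note the trailing slash on the prefix: without it, "src/bar" would also match "src/barn/c.c".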


I'm for the idea that only the in-hierarchy trees should be sent.
It should also be possible that the server replies that it is only sending a 
narrow clone, with the given (accessible?) spec.




On the client side, the partial clone installs local config
settings into the repo so that subsequent fetches default to
the same filter criteria as used in the clone.


I don't currently have provision to send a full sparse-checkout
specification to the server during a clone or fetch.  That
seemed like too much to try to squeeze into the protocols.
We can revisit this later if there is interest, but it wasn't
critical for the initial phase.

Agreed. I think it should be somewhere 'visible' to the user, but could be 
setup by the server admin / repo maintainer if they don't have write access. 
But there could still be the catch-22 - maybe one starts with a toptree> :  pair to define an origin point (it's not as refined as a 
.gitNarrow spec file, but is definitive). The toptree option could even 
allow sub-tree clones.. maybe..






2. Add a file and push changes.
Preconditions: all steps above followed.
touch src/bar/baz.txt && git add -A && git commit -m "added a file"
git push origin master
Expected results: changes are pushed to remote.


I don't believe partial clone and/or partial fetch will cause
any changes for push.


I suspect that pushes could be rejected if the user 'pretends' to modify 
files or trees outside their area. It does need the user to be able to spoof 
part of a tree they don't have, so an upstream / remote would immediately 
know it was a spoof but locally the narrow clone doesn't have enough detail 
about the 'bad' oid. It would be right to reject such attempts!






3. Clone a repo with a sparse list as a filter.
Preconditions: same as for #1
Actions:
echo "src/bar" > /tmp/blah-sparse-checkout
git clone --sparse /tmp/blah-sparse-checkout blah # Clone should be
the only command that would require a specific option key to be passed.
Expected results: same as for #1 plus /tmp/blah-sparse-checkout is
copied into .git/info/sparse-checkout


I presume clone and fetch are treated equivalently here.



There are 2 independent concepts here: clone and checkout.
Currently, there isn't any automatic linkage of the partial clone to
the sparse-checkout settings, so you could do something like this:

I see an implicit link that clearly one cannot checkout (inflate/populate) a 
file/directory that one does not have in the object store. But that does not 
imply the reverse linkage. The regular sparse checkout should be available 
independently of the local clone being a narrow one.



git clone --no-checkout --filter=sparse:oid=master:templates/bar URL
git cat-file ... templates/bar >.git/info/sparse-checkout
git config core.sparsecheckout true
git checkout ...

I've been focused on the clone/fetch issues and have not looked
into the automation to couple them.



I foresee that large files and certain files need to be filterable for 
fetch-clone, and that might not be (backward) compatible with the 

Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

From: "Vitaly Arbuzov" <v...@uber.com>
Sent: Friday, December 01, 2017 1:27 AM

Jonathan, thanks for references, that is super helpful, I will follow

your suggestions.


Philip, I agree that keeping original DVCS off-line capability is an

important point. Ideally this feature should work even with remotes
that are located on the local disk.

And with any other remote. (even to the extent that the other remote 
may indicate it has no capability, sorry, go away..)
E.g. One ought to be able to have/create a Github narrow fork of only the 
git.git/Documentation repo, and interact with that. (how much nicer if it was 
git.git/Documentation/ManPages/ to ease the exclusion of RelNotes/, howto/ 
and technical/ )



Which part of Jeff's work do you think wouldn't work offline after

repo initialization is done and sparse fetch is performed? All the
stuff that I've seen seems to be quite usable without GVFS.

I think it's that initial download that may be different, and what is 
expected of it. In my case, one may never connect to that server again, yet 
still be able to work both off-line and with other remotes (push and pull as 
per capabilities). Below I note that I'd only fetch the needed trees, not 
all of them. Also one needs to fetch a complete (pre-defined) subset, rather 
than an on-demand subset.



I'm not sure if we need to store markers/tombstones on the client,

what problem does it solve?

The part that the markers hope to solve is the part that I hadn't said, 
that they should also show in the work tree so that users can see what is 
missing and where.


Importantly I would also trim the directory (tree) structure so only the 
direct hierarchy of those files the user sees are visible, though at each 
level they would see side directory names (which are embedded in the 
hierarchical tree objects). (IIUC Jeff H's scheme downloads *all* trees, not 
just a few)


It would mean that users can create a complete fresh tree and commit that 
can be merged and picked onto the upstream tree from the _directory worktree 
alone_, because the oid's of all the parts are listed in the worktree. The 
actual objects for the missing oids being available in the appropriate 
upstream.


It also means the index can be deleted, and with only the local narrow pack 
files and the current worktree the index can be recreated at the current 
sparseness level. (I'm hoping I've understood the dispersal of data 
between index and narrow packs correctly here ;-)


--
Philip

On Thu, Nov 30, 2017 at 3:43 PM, Philip Oakley <philipoak...@iee.org> wrote:

From: "Vitaly Arbuzov" <v...@uber.com>


Found some details here: https://github.com/jeffhostetler/git/pull/3

Looking at commits I see that you've done a lot of work already,
including packing, filtering, fetching, cloning etc.
What are some areas that aren't complete yet? Do you need any help
with implementation?



comments below..



On Thu, Nov 30, 2017 at 9:01 AM, Vitaly Arbuzov <v...@uber.com> wrote:


Hey Jeff,

It's great, I didn't expect that anyone is actively working on this.
I'll check out your branch, meanwhile do you have any design docs that
describe these changes or can you define high level goals that you
want to achieve?

On Thu, Nov 30, 2017 at 6:24 AM, Jeff Hostetler <g...@jeffhostetler.com>
wrote:




On 11/29/2017 10:16 PM, Vitaly Arbuzov wrote:



Hi guys,

I'm looking for ways to improve fetch/pull/clone time for large git
(mono)repositories with unrelated source trees (that span across
multiple services).
I've found sparse checkout approach appealing and helpful for most of
client-side operations (e.g. status, reset, commit, etc.)
The problem is that there is no feature like sparse fetch/pull in git,
this means that ALL objects in unrelated trees are always fetched.
It may take a lot of time for large repositories and results in some
practical scalability limits for git.
This forced some large companies like Facebook and Google to move to
Mercurial as they were unable to improve client-side experience with
git while Microsoft has developed GVFS, which seems to be a step back
to CVCS world.

I want to get a feedback (from more experienced git users than I am)
on what it would take to implement sparse fetching/pulling.
(Downloading only objects related to the sparse-checkout list)
Are there any issues with missing hashes?
Are there any fundamental problems why it can't be done?
Can we get away with only client-side changes or would it require
special features on the server side?



I have, for separate reasons been _thinking_ about the issue ($dayjob is in
defence, so a similar partition would be useful).

The changes would almost certainly need to be server side (as well as client
side), as it is the server that decides what is sent over the wire in the
pack files, which would need to be a 'narrow' pack file.


If we had such a feature then all we would need on top is a separate
tool that build

Re: How hard would it be to implement sparse fetching/pulling?

2017-11-30 Thread Philip Oakley

From: "Vitaly Arbuzov" 

Found some details here: https://github.com/jeffhostetler/git/pull/3

Looking at commits I see that you've done a lot of work already,
including packing, filtering, fetching, cloning etc.
What are some areas that aren't complete yet? Do you need any help
with implementation?



comments below..


On Thu, Nov 30, 2017 at 9:01 AM, Vitaly Arbuzov  wrote:

Hey Jeff,

It's great, I didn't expect that anyone is actively working on this.
I'll check out your branch, meanwhile do you have any design docs that
describe these changes or can you define high level goals that you
want to achieve?

On Thu, Nov 30, 2017 at 6:24 AM, Jeff Hostetler 
wrote:



On 11/29/2017 10:16 PM, Vitaly Arbuzov wrote:


Hi guys,

I'm looking for ways to improve fetch/pull/clone time for large git
(mono)repositories with unrelated source trees (that span across
multiple services).
I've found sparse checkout approach appealing and helpful for most of
client-side operations (e.g. status, reset, commit, etc.)
The problem is that there is no feature like sparse fetch/pull in git,
this means that ALL objects in unrelated trees are always fetched.
It may take a lot of time for large repositories and results in some
practical scalability limits for git.
This forced some large companies like Facebook and Google to move to
Mercurial as they were unable to improve client-side experience with
git while Microsoft has developed GVFS, which seems to be a step back
to CVCS world.

I want to get a feedback (from more experienced git users than I am)
on what it would take to implement sparse fetching/pulling.
(Downloading only objects related to the sparse-checkout list)
Are there any issues with missing hashes?
Are there any fundamental problems why it can't be done?
Can we get away with only client-side changes or would it require
special features on the server side?



I have, for separate reasons been _thinking_ about the issue ($dayjob is in
defence, so a similar partition would be useful).

The changes would almost certainly need to be server side (as well as client
side), as it is the server that decides what is sent over the wire in the 
pack files, which would need to be a 'narrow' pack file.



If we had such a feature then all we would need on top is a separate
tool that builds the right "sparse" scope for the workspace based on
paths that developer wants to work on.

In the world where more and more companies are moving towards large
monorepos this improvement would provide a good way of scaling git to
meet this demand.


The 'companies' problem is that it tends to force a client-server, always-on
on-line mentality. I'm also wanting the original DVCS off-line capability to
still be available, with _user_ control, in a generic sense, of what they
have locally available (including files/directories they have not yet looked
at, but expect to have). IIUC Jeff's work is that on-line view, without the
off-line capability.

I'd commented early in the series at [1,2,3].


At its core, my idea was to use the object store to hold markers for the
'not yet fetched' objects (mainly trees and blobs). These would be in a 
known fixed format, and have the same effect (conceptually) as the 
sub-module markers - they _confirm_ the oid, yet say 'not here, try 
elsewhere'.


The comparison with submodules means there is the same chance of
de-synchronisation with triangular and upstream servers, unless managed.

The server side, as noted, will need to be included as it is the one that
decides the pack file.

Options for a server management are:

- "I accept narrow packs?" No; yes

- "I serve narrow packs?" No; yes.

- "Repo completeness checks on receipt": (must be complete) || (allow narrow 
to nothing).


For server farms (e.g. Github..) the settings could be global, or by repo.
(note that the completeness requirement and narrow receipt option are not
incompatible - the recipient server can reject the pack from a narrow
subordinate as incomplete - see below)
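
The three server options could be modelled roughly as below. This is a Python sketch of the decision only; the names `NarrowPolicy` and `on_receive` and the wording of the replies are invented here, and nothing like them exists in git.

```python
from dataclasses import dataclass


@dataclass
class NarrowPolicy:
    accept_narrow: bool = False    # "I accept narrow packs?"
    serve_narrow: bool = False     # "I serve narrow packs?"
    require_complete: bool = True  # completeness check on receipt


def on_receive(policy, pack_is_narrow, repo_complete_after):
    """Decide whether the server takes an incoming pack."""
    if pack_is_narrow and not policy.accept_narrow:
        return "reject: narrow packs not accepted"
    if policy.require_complete and not repo_complete_after:
        return "reject: repository would be incomplete"
    return "accept"


policy = NarrowPolicy(accept_narrow=True, require_complete=True)
# A narrow pack is welcome only if the server ends up with everything.
print(on_receive(policy, pack_is_narrow=True, repo_complete_after=False))
print(on_receive(policy, pack_is_narrow=True, repo_complete_after=True))
```

This shows why the two settings are not incompatible: a server can accept narrow packs in principle and still refuse one that would leave it with holes.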

* Marking of 'missing' objects in the local object store, and on the wire.
The missing objects are replaced by a place holder object, which uses the
same oid/sha1, but has a short fixed length, with content “GitNarrowObject
”. The chance that that string would actually have such an oid clash is
the same as all other object hashes, so is a *safe* self-referential device.


* The stored object already includes length (and inferred type), so we do
know what it stands in for. Thus the local index (index file) should be able
to be recreated from the object store alone (including the ‘promised /
narrow / missing’ files/directory markers)
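
A minimal Python sketch of the place holder idea, under stated assumptions: the store is a plain dict keyed by oid, the helper names are mine, and the marker is just the literal text "GitNarrowObject" (anything that followed it in the original proposal is not reproduced here). The blob id calculation is the usual SHA-1 over a "blob <length>" header, a NUL byte and the content.

```python
import hashlib

MARKER = b"GitNarrowObject"  # short, fixed-format place holder content


def git_blob_oid(content):
    """Object id git would give a blob with this content (SHA-1 repos)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def store_placeholder(store, oid):
    """Record 'not here, try elsewhere' under the real object's own oid."""
    store[oid] = MARKER


def is_placeholder(store, oid):
    return store.get(oid) == MARKER


store = {}
real_oid = git_blob_oid(b"secret design notes\n")  # object we do not hold
store_placeholder(store, real_oid)

print(is_placeholder(store, real_oid))
# A genuine blob whose content is the marker text hashes to a different id.
print(git_blob_oid(MARKER) != real_oid)
```

The marker confirms the oid without holding the content, which is the same trick the sub-module entries play.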

* the ‘same’ as sub-modules.
The potential for loss of synchronisation with a golden complete repo is
just the same as for sub-modules. (We expected object/commit X here, but it’s 
not in the store). This could happen with a small user group who have 
locally narrow clones, who interact with their 

Re: [PATCH v3 5/5] Testing: provide tests requiring them with ellipses after SHA-1 values

2017-11-22 Thread Philip Oakley

From: "Junio C Hamano" <gits...@pobox.com>

"Philip Oakley" <philipoak...@iee.org> writes:


From: "Junio C Hamano" <gits...@pobox.com>

Ann T Ropea <bedhan...@gmx.de> writes:


*1* We are being overly generous in t4013-diff-various.sh because we do
not want to destroy/take apart the here-document.  Given that all this is a
temporary measure, we should get away with it.


So, the need to reformat the test for the future post-deprecation
period is being deferred to the time that the PRINT_SHA1_ELLIPSIS env
variable, and all ellipses, are removed - is that the case? Maybe it
just needs saying plainly.


And if we say it that way, it is clear that with this series, we are
shipping a new feature with a test that does not protect the output
format we claim to be the improved and preferred one.  That sounds
quite bad.

Having said that, I have already queued this to 'pu' and I do not
terribly mind to merge it down to 'next', leaving the test updates
to cover the new output format as well as the backward compatible
one at the same time for a later follow-up patch.


I'd agree. I just wanted to ensure that I had the right understanding.


I'd however hate it if I have to carry the topic in the current
shape in 'next' forever, waiting for such an update to come, that
may never materialize, and be forced to do it myself without being
explicitly asked by (and thanked for) anybody, especially because
this is not exactly my itch X-<.


True.



Or is the env variable being retained as a fallback 'forever'? I'm
half guessing that it may tend toward the latter as it's an easier
backward compatibility decision.


We do not know until this change is released to the wild, at which
time we will hear noises about the lack of expected ellipses their
(poorly written) scripts rely on and tell them to set the workaround
environment variable.  We may not hear from such people at all, in
which case we may be able to remove it within a year or so, but it
is too early to tell.
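
For readers who have not followed the series, the workaround amounts to something like the following Python toy. The function `abbrev` is invented, and the variable name GIT_PRINT_SHA1_ELLIPSIS with the value "yes" is my assumption about the final spelling; the thread itself only says PRINT_SHA1_ELLIPSIS.

```python
import os


def abbrev(oid, length=7, env=None):
    """Abbreviate an object id, adding '...' only if the workaround is set."""
    env = os.environ if env is None else env
    # Assumed spelling of the workaround variable and its enabling value.
    wants_ellipsis = env.get("GIT_PRINT_SHA1_ELLIPSIS") == "yes"
    return oid[:length] + ("..." if wants_ellipsis else "")


oid = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
print(abbrev(oid, env={}))                                  # new output
print(abbrev(oid, env={"GIT_PRINT_SHA1_ELLIPSIS": "yes"}))  # old output
```

A script that parses the old output only needs the variable set in its environment; nothing else about the invocation changes.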


I was wondering if there should be a small documentation change for the env 
variable that states that it is a temporary measure for short term 
compatibility. Though I'm not sure where the 'right' place would be for it.





Re: [PATCH] git-send-email: fix get_maintainer.pl regression

2017-11-20 Thread Philip Oakley

From: "Eric Sunshine" 
On Sat, Nov 18, 2017 at 9:54 PM, Eric Sunshine 
wrote:

On Thu, Nov 16, 2017 at 10:48 AM, Alex Bennée 
wrote:

+test_expect_success $PREREQ 'cc trailer with get_maintainer output' '
+   [...]
+   git send-email -1 --to=recipi...@example.com \
+   --cc-cmd="$(pwd)/expected-cc-script.sh" \
+   [...]
+'
OK I'm afraid I don't fully understand the test harness as this breaks a
bunch of other tests. If anyone can offer some pointers on how to fix
I'd be grateful.


There are several problems:
[...]
* The directory in which the expected-cc-script.sh is created contains
a space; this is intentional to catch bugs in tests and Git itself. In
this case, your test is exposing what might be considered a bug in
git-send-email itself, in which it invokes the --cc-cmd as "/path/with
space/expected-cc-script.sh", which is interpreted as trying to invoke
program "/path/with" with argument "space/expected-cc-script.sh". One
fix (which you could submit as a preparatory patch, making this a
2-patch series) would be this:

--- 8< ---
diff --git a/git-send-email.perl b/git-send-email.perl
@@ -1724,7 +1724,7 @@ sub recipients_cmd {
-   open my $fh, "-|", "$cmd \Q$file\E"
+   open my $fh, "-|", "\Q$cmd\E \Q$file\E"
--- 8< ---

However, it's possible that might break existing users who rely on
--cc-cmd="myscript --option arg" working. It's not clear which
behavior is correct.

The more I think about this, the less I consider this a bug in
git-send-email. As noted, people might legitimately use a complex
command (--cc-cmd="myscript --option arg"), so changing git-send-email
to treat cc-cmd as an atomic string seems like a bad idea.
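
The trade-off is easy to see with Python's shlex module standing in for the shell's word splitting (git-send-email itself is Perl; the helper `words` is only for this illustration). Unquoted, the path with a space falls apart; quoted as one atom, the "script plus options" form stops working.

```python
import shlex


def words(command):
    """Split a command line the way a POSIX shell would split it."""
    return shlex.split(command)


path = "/path/with space/expected-cc-script.sh"
complex_cmd = "myscript --option arg"

print(words(path))                      # two words: program and a bogus arg
print(words(shlex.quote(path)))         # one word: the intended script
print(words(complex_cmd))               # three words: script and options
print(words(shlex.quote(complex_cmd)))  # one word: no such program
```

Whichever way git-send-email goes, one of the two users has to do their own quoting.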


A while back I proposed some documentation updates
https://public-inbox.org/git/1437416790-5792-1-git-send-email-philipoak...@iee.org/
regarding what is (should be) allowed in the cc-cmd etc., and at the time
Junio suggested that possible existing uses of the current code would be
abuses. I didn't pursue it further, but it may be useful guidance here as to
potential real world command lines..



Assuming no changes to git-send-email, to get your test working, you
could try to figure out how to quote the script's path you're
specifying with --cc-cmd, however, even easier would be to drop $(pwd)
altogether. That is, instead of:

--cc-cmd="$(pwd)/expected-cc-script.sh"

just use:

--cc-cmd=./expected-cc-script.sh




Re: [PATCH 6/7] builtin/describe.c: describe a blob

2017-11-20 Thread Philip Oakley

From: "Philip Oakley" <philipoak...@iee.org>

s/with/without/  ...


From: "Junio C Hamano" <gits...@pobox.com>
: Friday, November 10, 2017 1:24 AM
[catch up]


"Philip Oakley" <philipoak...@iee.org> writes:


From: "Stefan Beller" <sbel...@google.com>
Rereading this discussion, there is currently no urgent thing to 
address?


True.


Then the state as announced by the last cooking email, to just cook
it, seems
about right and we'll wait for further feedback.


A shiny new toy that is not a fix for a grave bug is rarely urgent,
so with that criterion, we'd end up with hundreds of topics not in
'next' but in 'pu' waiting for the original contributor to get out
of his or her procrastination, which certainly is not what I want to
see, as I'd have to throw them into the Stalled bin and then
eventually discard them, while having to worry about possible
mismerges with remaining good topics caused by these topics
appearing and disappearing from 'pu'.

I'd rather see any topic that consumed reviewers' time to be
polished enough to get into 'next' while we all recall the issues
raised during previous reviews.  I consider the process to further
incrementally polish it after that happens a true "cooking".

For this topic, aside from "known issues" that we decided to punt
for now, my impression was that the code is in good enough shape,
and we need a bit of documentation polishes before I can mark it
as "Will merge to 'next'".


Possibly only checking the documentation aspects, so folks don't fall
into the same trap as me.. ;-)


Yup, so let's resolve that documentation thing while we remember
that the topic has that issue, and what part of the documentation
we find needs improvement.

I am not sure what "trap" you fell into, though.  Are you saying
that giving

git describe [...] 
git describe [...] 

in the synopsis is not helpful, because the user may not know what
kind of object s/he has, and cannot decide from which set of options
to pick?  Then an alternative would be to list


(If I remember correctly) My nit pick was roughly along the lines you 
suggest, and that the two option lists (for commit-ish and blob) were 
shown in different ways, which could lead to the scenario that, with 
knowing the


s/with/without/  ...

oid object type (or knowing how to get it), the user could give an invalid 
option, and think the command failure was because the oid was invalid, not 
that the option was not appropriate, along with variations on that theme.


The newer synopsis (v5) looks Ok in that it avoids digging the hole by not 
mentioning the blob options. Personally I'm more for manuals that tend 
toward instructional, rather than being expert references. I'd sneak in a 
line saying "The object type can be determined using `git cat-file`.", but 
maybe that's my work environment...




git describe [...] 

in the synopsis, say upfront that most options are applicable only
when describing a commit-ish, and when describing a blob, we do
quite a different thing and a separate set of options applies, perhaps?


--
Philip 




Re: [PATCH 6/7] builtin/describe.c: describe a blob

2017-11-20 Thread Philip Oakley

From: "Junio C Hamano" <gits...@pobox.com>
: Friday, November 10, 2017 1:24 AM
[catch up]


"Philip Oakley" <philipoak...@iee.org> writes:


From: "Stefan Beller" <sbel...@google.com>
Rereading this discussion, there is currently no urgent thing to 
address?


True.


Then the state as announced by the last cooking email, to just cook
it, seems
about right and we'll wait for further feedback.


A shiny new toy that is not a fix for a grave bug is rarely urgent,
so with that criterion, we'd end up with hundreds of topics not in
'next' but in 'pu' waiting for the original contributor to get out
of his or her procrastination, which certainly is not what I want to
see, as I'd have to throw them into the Stalled bin and then
eventually discard them, while having to worry about possible
mismerges with remaining good topics caused by these topics
appearing and disappearing from 'pu'.

I'd rather see any topic that consumed reviewers' time to be
polished enough to get into 'next' while we all recall the issues
raised during previous reviews.  I consider the process to further
incrementally polish it after that happens a true "cooking".

For this topic, aside from "known issues" that we decided to punt
for now, my impression was that the code is in good enough shape,
and we need a bit of documentation polishes before I can mark it
as "Will merge to 'next'".


Possibly only checking the documentation aspects, so folks don't fall
into the same trap as me.. ;-)


Yup, so let's resolve that documentation thing while we remember
that the topic has that issue, and what part of the documentation
we find needs improvement.

I am not sure what "trap" you fell into, though.  Are you saying
that giving

git describe [...] 
git describe [...] 

in the synopsis is not helpful, because the user may not know what
kind of object s/he has, and cannot decide from which set of options
to pick?  Then an alternative would be to list


(If I remember correctly) My nit pick was roughly along the lines you 
suggest, and that the two option lists (for commit-ish and blob) were shown 
in different ways, which could lead to the scenario that, with knowing the 
oid object type (or knowing how to get it), the user could give an invalid 
option, and think the command failure was because the oid was invalid, not 
that the option was not appropriate, along with variations on that theme.


The newer synopsis (v5) looks Ok in that it avoids digging the hole by not 
mentioning the blob options. Personally I'm more for manuals that tend 
toward instructional, rather than being expert references. I'd sneak in a 
line saying "The object type can be determined using `git cat-file`.", but 
maybe that's my work environment...




git describe [...] 

in the synopsis, say upfront that most options are applicable only
when describing a commit-ish, and when describing a blob, we do
quite a different thing and a separate set of options applies, perhaps?


--
Philip 



Re: [PATCHv5 7/7] builtin/describe.c: describe a blob

2017-11-20 Thread Philip Oakley

From: "Stefan Beller" 
Sent: Thursday, November 16, 2017 2:00 AM

[in catch up mode..]


Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambiguous, there might
be different tags or refs to anchor to, and there might be different
paths in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how do we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob. For
example

 git describe --tags v0.99:Makefile
 conversion-901-g7672db20c2:Makefile

tells us the Makefile as it was in v0.99 was introduced in commit 
7672db20.


The walking is performed in reverse order to show the introduction of a
blob rather than its last occurrence.

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob

Signed-off-by: Stefan Beller 
---
Documentation/git-describe.txt | 18 ++--
builtin/describe.c             | 62 ++
t/t6120-describe.sh            | 34 +++
3 files changed, 107 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt

index c924c945ba..e027fb8c4b 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,14 +3,14 @@ git-describe(1)

NAME

-git-describe - Describe a commit using the most recent tag reachable from it
-
+git-describe - Give an object a human readable name based on an available ref


SYNOPSIS

[verse]
'git describe' [--all] [--tags] [--contains] [--abbrev=] 
[...]
'git describe' [--all] [--tags] [--contains] 
[--abbrev=] --dirty[=]

+'git describe' 

DESCRIPTION
---
@@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` only shows
annotated tags.  For more information about creating annotated tags
see the -a and -s options to linkgit:git-tag[1].

+If the given object refers to a blob, it will be described
+as `:`, such that the blob can be found
+at `` in the ``, which itself describes the
+first commit in which this blob occurs in a reverse revision walk
+from HEAD.
+
OPTIONS
---
...::
@@ -186,6 +192,14 @@ selected and output.  Here fewest commits different is defined as
the number of commits which would be shown by `git log tag..input`
will be the smallest number of commits possible.

+BUGS
+
+
+Tree objects as well as tag objects not pointing at commits, cannot be described.


Is this true? Is it stand alone from the describing of a blob? If so should 
it be its own patchlet. - I thought I'd read that within the series there is 
now a tree / tag (of blob/trees) description capability.


I'd prefer that we don't start with the "can't" view (relative to the 
subsequent sentences of the paragraph). It puts off the reader - we are 
about to say what can be described but in a limited way - the limitation 
being the bug. Maybe just swap the line to form a second paragraph.


+When describing blobs, the lightweight tags pointing at blobs are ignored,
+but the blob is still described as : despite the lightweight
+tag being favorable.
+

--
Philip


GIT
---
Part of the linkgit:git[1] suite
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..5b4bfaba3f 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
#include "lockfile.h"
#include "commit.h"
#include "tag.h"
+#include "blob.h"
#include "refs.h"
#include "builtin.h"
#include "exec_cmd.h"
@@ -11,8 +12,9 @@
#include "hashmap.h"
#include "argv-array.h"
#include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"

-#define SEEN (1u << 0)
#define MAX_TAGS (FLAG_BITS - 1)

static const char * const describe_usage[] = {
@@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 strbuf_addstr(dst, suffix);
}

+struct process_commit_data {
+ struct object_id current_commit;
+ struct object_id looking_for;
+ struct strbuf *dst;
+ struct rev_info *revs;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+ struct process_commit_data *pcd = data;
+ pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct 

Re: [PATCH v3 5/5] Testing: provide tests requiring them with ellipses after SHA-1 values

2017-11-20 Thread Philip Oakley

From: "Junio C Hamano" 

Ann T Ropea  writes:


*1* We are being overly generous in t4013-diff-various.sh because we do
not want to destroy/take apart the here-document.  Given that all this is a
temporary measure, we should get away with it.


So, the need to reformat the test for the future post-deprecation period is 
being deferred to the time that the PRINT_SHA1_ELLIPSIS env variable, and 
all ellipses, are removed - is that the case? Maybe it just needs saying 
plainly.


Or is the env variable being retained as a fallback 'forever'? I'm half 
guessing that it may tend toward the latter as it's an easier backward 
compatibility decision.


[apologies, this is mid thread; I'm catching up on 2 weeks of emails]



I do not think the patch is being particularly generous.  If
anything, it is being unnecessarily sloppy by not adding new checks
to verify the updated behaviour.

The above comment mentions "destroy/take apart" the here-document,
but I do see no need to destroy anything.  All you need to do is to
enhance and extend.  For example, you could do it like so (this is
written in my e-mail client, and not an output of diff, so the
indentation etc. may be all off, but should be sufficient to
illustrate the idea):

   while read cmd
   do
   case "$cmd" in
   '' | '#'*) continue ;;
   esac
   test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g')
   pfx=$(printf "%04d" $test_count)
   expect="$TEST_DIRECTORY/t4013/diff.$test"
   actual="$pfx-diff.$test"
  +case "$cmd" in
  +X*) cmd=${cmd#X}; no_ellipses=" (no ellipses)" ;;
  +*) no_ellipses= ;;
  +esac
  -test_expect_success "git $cmd" '
  +test_expect_success "git $cmd$no_ellipses" '
   {
   echo "\$ git $cmd"
  -git $cmd |
  +if test -n "$no_ellipses"
  +then
  +git $cmd
  +else
  +PRINT_SHA1_ELLIPSES=yes git $cmd
  +fi |
   sed -e 
   ...
   done <<\EOF
   diff-tree initial
   diff-tree -r initial
   diff-tree -r --abbrev initial
   diff-tree -r --abbrev=4 initial
  +Xdiff-tree -r --abbrev=4 initial
   ...
   EOF

There is a new and duplicated line with a prefix X for one existing
test in the above.  The idea is that the ones marked as such will
test and verify the effect of this new behaviour by not setting the
environment variable.  The expected and actual test output for the
new test will have X prefixed to it.  t4013 is arranged in such a
way that it is easy to add a new test like this---you only need to
add an expected output in a new file in the t/t4013/ directory.  And
as the output with these ellipses removed will be something we would
expect to see in the new world (without the escape hatch environment
variable), we would need to add a new file there to record what the
expected output from the command is.

I singled out the diff-tree invocation with --abbrev=4 as an example
in the above, but in a more thorough final version, we'd need to
cover both "abbreviation with ellipses" and "abbreviation without
ellipses" output for other lines in the test case listed in the
here-document. 
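
The "X" prefix handling sketched above can be tried in isolation, without running git at all. The following is only an illustration under that assumption: `parse_case` is a hypothetical helper name, and it prints the test title and the expected-output file name that the loop would derive for each line of the here-document.

```shell
#!/bin/sh
# parse_case LINE: hypothetical helper mirroring the "X" prefix idea.
# It reports the test title and the name of the expected-output file.
parse_case () {
	line=$1
	# The file name is derived from the line as written, X prefix included.
	test_name=$(printf '%s\n' "$line" | sed -e 's|[/ ][/ ]*|_|g')
	case "$line" in
	X*) stripped=${line#X}; no_ellipses=" (no ellipses)" ;;
	*) stripped=$line; no_ellipses= ;;
	esac
	echo "title=git $stripped$no_ellipses file=diff.$test_name"
}

while read cmd
do
	case "$cmd" in
	'' | '#'*) continue ;;
	esac
	parse_case "$cmd"
done <<\EOF
diff-tree -r --abbrev=4 initial
Xdiff-tree -r --abbrev=4 initial
EOF
```

Because the file name keeps the X, the two variants of the same command get separate expected-output files, which matches the description of adding one new file per X-marked line.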




Re: more pedantry ... what means a file "known to Git"?

2017-11-13 Thread Philip Oakley

From: "Robert P. J. Day" 


 apologies for more excruciating nitpickery, but i ask since it seems
that phrase means slightly different things depending on where you
read it.

 first, i assume that there are only two categories:

 1) files known to Git
 2) files unknown to Git

and that there is no fuzzy, grey area middle ground, yes?


sort of...


 now, in "man git-clean", one reads (near the top):

   Cleans the working tree by recursively removing files that are
   not under version control, starting from the current directory.

   Normally, only files unknown to Git are removed, but if the -x
   option is specified, ignored files are also removed.

the way that's worded suggests that ignored files are "known" to Git,
yes?


You've hit the three way binary problem of +1, 0, -1 ! The lsb is still 0 or 
1, but we have the two assertions of:

Positively known to git -- added to the index and the object store
Negatively 'known' to git -- paths we actively ignore, thus not in the index 
or object store.


Unknown files are those that could be added.
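
As a rough illustration of those three groups, the listings below use `git ls-files`, which can report each of them separately. The helper names `list_known`, `list_ignored` and `list_unknown` are invented here for the sketch; they are not Git terminology, and this is only one way of looking at it.

```shell
#!/bin/sh
# Positively known: paths recorded in the index.
list_known () {
	git ls-files
}

# Negatively 'known': paths present in the worktree but matched by the
# standard ignore rules (.gitignore, .git/info/exclude, core.excludesFile).
list_ignored () {
	git ls-files --others --ignored --exclude-standard
}

# Unknown: paths that are neither in the index nor ignored,
# i.e. the ones that could simply be added.
list_unknown () {
	git ls-files --others --exclude-standard
}
```

Run inside a work tree, the three lists should not overlap, which is the three-way split described above.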


 that is, if, by default, "git clean" removes only files "unknown"
to Git, and "-x" extends that to ignored files, the conclusion is that
ignored files are *known* to Git.


but only in a negative sense ...



 if, however, you check out "man git-rm", you read:

   The  list given to the command can be exact pathnames,
   file glob patterns, or leading directory names. The command
   removes only the paths that are known to Git. Giving the name
   of a file that you have not told Git about does not remove that file.

so "git rm" removes only files "known to Git", but from the above
regarding how "git clean" sees this, that should include ignored
files, which of course it doesn't.


The man page description starts with the key "Remove files from the index", 
so this is the positive 'knowing' part. Clearly it can never remove other 
ignored files as they can't be in the index (but note the 'other' caveat. 
P->Q # Q->P).




 given that this phrase occurs in a number of places:

$ grep -ir "known to git" *
builtin/difftool.c: /* The symlink is unknown to Git so read from the filesystem */
dir.c: error("pathspec '%s' did not match any file(s) known to git.",
Documentation/git-rm.txt:removes only the paths that are known to Git. Giving the name of
Documentation/git-commit.txt:   be known to Git);
Documentation/user-manual.txt:error: pathspec '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to git.
Documentation/gitattributes.txt: Notice all types of potential whitespace errors known to Git.
Documentation/git-clean.txt:Normally, only files unknown to Git are removed, but if the `-x`
Documentation/RelNotes/1.8.2.1.txt: * The code to keep track of what directory names are known to Git on
Documentation/RelNotes/1.8.1.6.txt: * The code to keep track of what directory names are known to Git on
Documentation/RelNotes/2.9.0.txt:   known to Git.  They have been taught to do the normalization.
Documentation/RelNotes/2.8.4.txt:   known to Git.  They have been taught to do the normalization.
Documentation/RelNotes/1.8.3.txt: * The code to keep track of what directory names are known to Git on
t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq did not match any file(s) known to git."
t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq did not match any file(s) known to git."
$

it might be useful to define precisely what it means. or is it assumed
to be context dependent?



A little bit of clarification may be useful. You can't be/aren't the only 
one who is willing to note these subtle inconsistencies (Git knows things 
via the index (staging area) and the object store (repository)).


rday

--
Philip



Re: [PATCH 00/30] Add directory rename detection to git

2017-11-13 Thread Philip Oakley

From: "Elijah Newren" <new...@gmail.com>
: Friday, November 10, 2017 11:26 PM
On Fri, Nov 10, 2017 at 2:27 PM, Philip Oakley <philipoak...@iee.org> 
wrote:

From: "Elijah Newren" <new...@gmail.com>


In this patchset, I introduce directory rename detection to merge-recursive,
predominantly so that when files are added to directories on one side of
history and those directories are renamed on the other side of history, the
files will end up in the proper location after a merge or cherry-pick.

However, this isn't limited to that simplistic case.  More interesting
possibilities exist, such as:

 * a file being renamed into a directory which is renamed on the other
   side of history, causing the need for a transitive rename.



How does this cope with the case insensitive case preserving file systems on
Mac and Windows, esp when core.ignorecase is true? If it's a bigger problem
than the series already covers, would the likely changes be reasonably
localised?

This came up recently on GfW for `git checkout` of a branch where the case
changed ("Test" <-> "test"), but git didn't notice that it needed to rename
the directories on such a file system.
https://github.com/git-for-windows/git/issues/1333


I wasn't aware there were problems with git on case insensitive case
preserving filesystems; fixing them wasn't something I had in mind
when writing this series.


I was mainly ensuring awareness of the potential issue, as it's not easy to 
solve.



However, the particular bug you mention is
actually completely orthogonal to this series; it talks about
git-checkout without the -m/--merge option, which doesn't touch any
code path I modified in my series, so my series can't really fix or
worsen that particular issue.


That's good.


But, if there are further issues with such filesystems that also
affect merges/cherry-picks/rebases, then I don't think my series will
either help or hurt there either.  The recursive merge machinery
already has remove_file() and update_file() wrappers that it uses
whenever it needs to remove/add/update a file in the working directory
and/or index, and I have simply continued using those, so the number
of places you'd need to modify to fix issues would remain just as
localized as before.


It's when the working directory path/filename has a case change that goes 
undetected (one way or another) that can cause issues. I think that part of 
the problem (after awareness) is not having a canonical expectation of 
which way is 'right', and what options there may be. E.g. if a project is 
wholly on a case insensitive system then the filenames in the worktree never 
matter, but aligning the path/filenames in the repository would still be a 
problem.



 Also, I continue to depend on the reading of the
index & trees that unpack_trees() does, which I haven't modified, so
again it'd be the same number of places that someone would need to
fix.  (However, the whole design to have unpack_trees() do the initial
work and then have recursive merge try to "fix it up" is really
starting to strain.


Interesting point.


 I'm starting to think, again, that merge
recursive needs a redesign, and have some arguments I wanted to float
out there...but I've dumped enough on the list for a day.)

It's possible that this series fixes one particular issue -- namely
when merging, if the merge-base contained a "Test" directory, one side
added a file to that directory, and the other side renamed "Test" to
"test", and if the presence of both "Test" and "test" directories in
the merge result is problematic, then at least with my fixes you
wouldn't end up with both directories and could thus avoid that
problem in a narrow set of cases.


I'll think on that. It may provide extra clues as to what the right 
solutions could be!


Sorry that I don't have any better news than that for you.

Elijah


Thanks
--
Philip 



Re: [PATCH 00/30] Add directory rename detection to git

2017-11-10 Thread Philip Oakley

From: "Elijah Newren" 

[This series is entirely independent of my rename detection limits series.
However, I have a separate rename detection performance series that depends
on both this series and the rename detection limits series.]

In this patchset, I introduce directory rename detection to merge-recursive,
predominantly so that when files are added to directories on one side of
history and those directories are renamed on the other side of history, the
files will end up in the proper location after a merge or cherry-pick.

However, this isn't limited to that simplistic case.  More interesting
possibilities exist, such as:

 * a file being renamed into a directory which is renamed on the other
   side of history, causing the need for a transitive rename.



How does this cope with the case insensitive case preserving file systems on 
Mac and Windows, esp when core.ignorecase is true? If it's a bigger problem 
than the series already covers, would the likely changes be reasonably 
localised?


This came up recently on GfW for `git checkout` of a branch where the case 
changed ("Test" <-> "test"), but git didn't notice that it needed to rename 
the directories on such a file system. 
https://github.com/git-for-windows/git/issues/1333




--
Philip 



Re: [PATCH 6/7] builtin/describe.c: describe a blob

2017-11-09 Thread Philip Oakley

From: "Stefan Beller" 

Rereading this discussion, there is currently no urgent thing to address?


True.

Then the state as announced by the last cooking email, to just cook it, seems
about right and we'll wait for further feedback.


Possibly only checking the documentation aspects, so folks don't fall into 
the same trap as me.. ;-)

--
Philip 



Re: [PATCH 1/3] checkout: describe_detached_head: remove 3dots after committish

2017-11-09 Thread Philip Oakley

From: "Junio C Hamano" <gits...@pobox.com>
Sent: Wednesday, November 08, 2017 1:59 AM



"Philip Oakley" <philipoak...@iee.org> writes:


But...

...
This change causes quite a few tests to fall over; however, they
all have truncated-something-longer-ellipses in their
raw-diff-output expected sections, and removing the ellipses
from there makes the tests pass again, :-)


The number of failures you report in the test suite suggests that
someone somewhere will be expecting that notation, and that we may
need a deprecation period, perhaps with an 'ellipsis' config variable
whose default value can later be flipped, though that leaves a config
value needing support forever!


Hmmm, never thought about that.

I have been assuming that tools reading "--raw" output that is
abbreviated would be crazy, because they have to strip the dots and
the number of dots may not always be three [*1*].

But you are right.  It would be very unlikely that there are no such
crazy tools, so it deserves consideration if we would be breaking
such tools.

On the other hand, if such a crazy tool was still written correctly
(it is debatable what the definition of "correct" is, though), it
would be stripping any number of dots at the end, not just insisting on
seeing exactly three dots, and splitting these fields at SP.
Otherwise they would already be broken as they cannot handle
occasional object names that have less than three dots because they
happen to be longer than the more common abbreviation length used by
other objects.  So in practice it might not be _too_ bad.

Thinking on this, I'd suggest that the patch series does remove the ellipsis 
dots immediately, but retains a config option that can be set to get back 
the old 'dots' display for those who have badly written scripts that maybe 
haven't failed yet. i.e. no deprecation period, just a fall back option; and 
if nobody shouts then remove the config option after a respectable period.


It would also mean the existing tests can be re-used...



[Footnote]

*1* When we ask for --abbrev=7, we allocate 10 places and fill the
rest with necessary number of dots after the result of
find_unique_abbrev(), so if an object name turns out to require 8
hexdigits to make it unique, we'll append only two dots to it to
make it 10 so that it aligns nicely with others) and they would
always be reading the full, non abbreviated output.  The story does
not change that much when we do not explicitly ask for a specific
abbreviation length in that we add variable number of dots for
aligning in that case, too.
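
A small sketch of the padding described in this footnote, and of the "strip any number of dots" approach a careful reader of the output would need. `pad_abbrev` and `strip_dots` are hypothetical helpers written for illustration; they only imitate the alignment rule as stated here (abbreviation length plus three columns) and are not taken from diff.c.

```shell
#!/bin/sh
# pad_abbrev NAME LEN: pad NAME with dots up to LEN + 3 columns.
# A NAME already that long (or longer) gets no dots at all.
pad_abbrev () {
	out=$1
	width=$(($2 + 3))
	while [ ${#out} -lt "$width" ]
	do
		out="$out."
	done
	printf '%s\n' "$out"
}

# strip_dots FIELD: drop however many trailing dots the field carries.
strip_dots () {
	printf '%s\n' "$1" | sed -e 's/\.*$//'
}

pad_abbrev 7672db2 7     # 7 hexdigits: three dots are appended
pad_abbrev 7672db20 7    # 8 hexdigits needed: only two dots are appended
strip_dots "$(pad_abbrev 7672db20 7)"
```

A script that insists on exactly three dots would mis-parse the second case, which is the breakage discussed above.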


The --abbrev=7 does cater for many smaller repos, so there is a possibility 
that the bad script issue hasn't been hit yet by those repos.

--

Philip 



Re: [PATCH 1/3] checkout: describe_detached_head: remove 3dots after committish

2017-11-07 Thread Philip Oakley

From: "Ann T Ropea" 

Thanks for all the feedback provided!

I'd like to summarise what consensus we have reached so far and
then propose a way forward:

  * we'll use the term "ellipsis (pl. ellipses)" for what's
been referred to as "3dots", "n-dots", "many dots" and so
forth


Using a consistent  term for the *display* of shortened oid's is good.



  * we would like to use ellipses when attached to SHA-1
values only for the purpose of specifying a symmetric
difference (as per gitrevisions(7))


The symmetric difference (three-dots) is a specific Git *cli* notation that 
is distinct from the use of ellipsis for displaying oid's




  * the usage of ellipses as a "here we truncated something
longer" is a relic which should be phased out.


I think that is true.



To get there, preventing describe_detached_head from appending
an ellipsis to the SHA-1 values it prints is one important step.

This change does not cause any test to fall over.


But...


The other important step is dealing with the "git diff --raw"
output which features ellipses in the relic-fashion no longer
desired.

It would appear that simplifying diff.c's diff_aligned_abbrev
routine to something like:

	/* Do we want all 40 hex characters? */
	if (len == GIT_SHA1_HEXSZ)
		return oid_to_hex(oid);

	/* An abbreviated value is fine. */
	return diff_abbrev_oid(oid, len);

does do the trick.

This change causes quite a few tests to fall over; however, they
all have truncated-something-longer-ellipses in their
raw-diff-output expected sections, and removing the ellipses
from there makes the tests pass again, :-)


The number of failures you report in the test suite suggests that someone 
somewhere will be expecting that notation, and that we may need a 
deprecation period, perhaps with an 'ellipsis' config variable whose default 
value can later be flipped, though that leaves a config value needing 
support forever!


Junio should be able to better advise on his preferred approach.



If we can agree that this is a way forward, i'll create & send
v2 of the patch series to the mailing list (it'll include the
fixed tests) and we'll see where we go from there.


--
Philip 



Re: [PATCH 1/3] checkout: describe_detached_head: remove 3dots after committish

2017-11-06 Thread Philip Oakley

From: "Junio C Hamano" 

Ann T Ropea  writes:


This could be confusing not only for novices; in either case, no range
should be insinuated by describe_detached_head.


We actually do not insinuate any range in these output.  These dots
denote "truncated at the end, instead of giving full length."

Another place these "many dots" appear is "git diff --raw", for
example.



 The fancy word for the three dots is an `ellipsis`
- the omission from speech or writing of a word or words that are 
superfluous or able to be understood from contextual clues.
- from the Ancient Greek: ἔλλειψις, élleipsis, "omission" or "falling 
short".


The user/reader confusion may still be there though.
 



Re: [PATCH 6/7] builtin/describe.c: describe a blob

2017-11-06 Thread Philip Oakley

From: "Junio C Hamano" <gits...@pobox.com>
Sent: Sunday, November 05, 2017 6:28 AM

"Philip Oakley" <philipoak...@iee.org> writes:


Is this not also an alternative case, relative to the user, for the
scenario where the user has an oid/sha1 value but does not know what
it is, and would like to find its source and type relative to the
`describe` command.


I am not sure what you wanted to say with "source and type RELATIVE TO
the describe command".


The 'relative to' was meaning the user's expectation about this particular 
command. For a non-expert user, who may not have come across cat-file yet, 
their world view may not extend beyond 'Git describe ' for me.




The first thing the combination of the user and the describe command
would do when the user has a 40-hex string would be to do the
equivalent of "cat-file -t" to learn if it even exists and what its
type is.  With Stefan's patch, that is what describe command does in
order to choose quite a different codeflow from the traditional mode
when it learns that it was given a blob.
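
For the user's side of that first step, a minimal sketch: ask `git cat-file -t` what the object is before choosing options. The helper name `object_kind` and its messages are invented for illustration; it does not call `git describe` and makes no claim about which options the new blob mode accepts.

```shell
#!/bin/sh
# object_kind OID: hypothetical helper that reports the object type,
# or fails if the name does not resolve to an object in this repository.
object_kind () {
	objtype=$(git cat-file -t "$1" 2>/dev/null) || {
		echo "unknown object"
		return 1
	}
	case "$objtype" in
	blob) echo "blob: the blob code path would apply" ;;
	commit) echo "commit: the traditional code path would apply" ;;
	*) echo "$objtype: check what it points at first" ;;
	esac
}
```

Something like `object_kind "$oid"` tells a non-expert which part of the synopsis to read, without needing to know the internals.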


I realised, after sending, that this was probably the method for 
non-ambiguous shortened oid's. Thanks for the reminder.



IIUC the existing `describe` command only accepts  values,
and here we are extending that to be even more inclusive, but at the
same time the options become more restricted.


Do you mean that the command should check if it was given an option
that would not be applicable to the "find a commit that has the
blob" mode, once it learns that it was given a blob and needs to go
in that codepath?  I think that would make sense.


Correct, it was the option selection aspect.



Or have I misunderstood how the fast commit search and the slower
potentially-a-blob searching are disambiguated?


I do not think so.  We used to barf when we got anything but
commit-ish, but Stefan's new code kicks in if the object turns out
to be a blob---I think that is what you mean by the disambiguation.


Correct. We ask to describe an object, but then the option choices may vary 
by type.


The new [blob] synopys only lists , while the old [commit-ish] 
shows specifics. It wasn't clear if the options are the same for both. I 
quess they are the same once the cat-file -t has done its bit. Its only the 
speed that's affected.


As a side note, the commit message examples don't show any pathspec that is 
not in the top-level directory.


--
Philip 



Re: [PATCH 6/7] builtin/describe.c: describe a blob

2017-11-04 Thread Philip Oakley

From: "Junio C Hamano" 
Sent: Thursday, November 02, 2017 4:23 AM

Junio C Hamano  writes:


The reason why we say "-ish" is "Yes we know v2.15.0 is *NOT* a
commit object, we very well know it is a tag object, but because we
allow it to be used in a context that calls for a commit object, we
mark that use context as 'this accepts commit-ish, not just
commit'".


Having said all that, there is a valid case in which we might want
to say "blob-ish".


Is this not also an alternative case, relative to the user, for the scenario 
where the user has an oid/sha1 value but does not know what it is, and would 
like to find its source and type relative to the `describe` command?


IIUC the existing `describe` command only accepts <commit-ish> values, and 
here we are extending that to be even more inclusive, but at the same time 
the options become more restricted. Thus the synopsis terminology would be 
more about suggesting the range of options available (search style/start 
points) that are applicable to blobs, than being exactly about the 
'allow-blobs' parameter.


Or have I misunderstood how the fast commit search and the slower 
potentially-a-blob searching are disambiguated?

--
Philip



To review, X-ish is the word we use when the command wants to take
an X, but tolerates a lazy user who gives a Y, which is *NOT* X,
without bothering to add ^{X} suffix, i.e. Y^{X}.  In such a case,
the command takes not just X but takes X-ish because it takes a Y
and converts it internally to an X to be extra nice.

When the command wants to take a blob, but tolerates something else
and does "^{blob}" internally, we can say it takes "blob-ish".
Technically that "something else" could be an annotated tag that
points at a blob object, without any intervening commit or tree (I
did not check if the "describe <blob>" code in this thread handles
this, though).

But because it is not usual to tag a blob directly, it would
probably not be worth saying "blob-ish" in the document and causing
readers to wonder in what situation something that is not a blob can
be treated as if it were a blob.  To me at least, it feels like we
would be pursuing technical correctness too far and sacrificing the
readability of the document, which is a bad trade-off.
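
The "X-ish" rule reviewed above (take a Y, apply ^{X} internally) can be
illustrated with a toy model. This is only a sketch in Python, not Git's
implementation: the `Obj` class, its `kind` and `target` fields, and the
`peel` function are hypothetical names invented for the illustration. It
assumes that a tag dereferences to whatever it points at, that a commit
dereferences to its tree, and that trees and blobs do not dereference
any further.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obj:
    """Hypothetical stand-in for a Git object (not a real Git API)."""
    kind: str                        # "tag", "commit", "tree" or "blob"
    target: Optional["Obj"] = None   # tag -> tagged object, commit -> its tree


def peel(obj: Obj, want: str) -> Obj:
    """Toy version of obj^{want}: dereference until the type matches."""
    while obj.kind != want:
        if obj.target is None:
            # Nothing left to dereference, so obj is not "want-ish".
            raise ValueError(f"cannot peel {obj.kind} to {want}")
        obj = obj.target
    return obj


# Sample objects for the illustration.
blob = Obj("blob")
tree = Obj("tree")
commit = Obj("commit", tree)
release_tag = Obj("tag", commit)   # the usual case: a tag on a commit
blob_tag = Obj("tag", blob)        # the unusual case: a tag directly on a blob

# A tag object used where a commit is wanted: this is "commit-ish".
peeled_commit = peel(release_tag, "commit")

# A tag that points straight at a blob: this would be "blob-ish".
peeled_blob = peel(blob_tag, "blob")
```

In this model `peel(commit, "blob")` raises `ValueError`, because a commit
leads only to its tree. The tag placed directly on a blob is the one case
where something that is not a blob peels to a blob, which is the rarely
seen situation that the term "blob-ish" would have to cover.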





