Re: Expanding Includes in .gitignore

2016-10-28 Thread Aaron Pelly
On 28/10/16 15:54, Junio C Hamano wrote:
> Jeff King  writes:
> 
>> However, as I said elsewhere, I'm not convinced this feature is all that
>> helpful for in-repository .gitignore files, and I think it does
>> introduce compatibility complications. People with older git will not
>> respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local
>> matter.
> 
> As I do not see the point of making in-tree .gitignore to a forest
> of .gitignore.d/ at all, compatibility complications is not worth
> even thinking about, I would have to say.

Well; that saves some work. :)

I do not suggesting making this mandatory. I think it adds value and it
is a common and understood mechanism. But, if it is abhorrent, consider:

There is precedent for including files in git-config. This could be
extended to ignore files. The code is not similar, but the concept is. I
could live with it.

Or how about a new githook that can intelligently create or return the
details? This would be my least favourite option unless it was
configured in an obvious place.

Finally, if this is a bad-idea, as I asked in the beginning, I will
consider the equine expired, cease flagellation and apologise for the noise.



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:07, Jeff King wrote:
> I think it does
> introduce compatibility complications.

If this is not a show stopper, I am happy to knock out a short
functional spec. I'll give it some hours before I begin.



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:07, Jeff King wrote:
> I think it does
> introduce compatibility complications. People with older git will not
> respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local
> matter.

I know. I don't think it's a serious compatibility issue, but more
important people may disagree, in which case I'll have to review my
stance. I think the pros of it working everywhere outweigh the cons.

I will say that I am a proponent of consistency and obviousness. I would
rather this worked the same everywhere or it was a completely novel
component in $GIT_DIR.

I don't think I like the idea much, but it would be possible to
magically generate an ignore file from a directory.

> But perhaps there is a use case I'm missing.

A new repo where the dev tools haven't been standardised yet. New tool,
new file. Simple, and somewhat self documenting as a table of contents
in .gitignore.d



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:55, Aaron Pelly wrote:
> 2) I fetch a repo with a hostile ignore file. It includes files from
> $GIT_DIR/test-data/ssl/private or some such. Change. Don't pay
> attention. Commit. Push. Problems if my test data comes from production.
> 
> Is this mitigated currently?
> 
> Not that git should be an enabler, but surely it falls on the user of
> untrusted software to ensure their own security?

Balls, I meant $GIT_WORK_TREE not $GIT_DIR



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 09:55, Jeff King wrote:
> On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote:
> 
>>>   - we parse possibly-hostile .gitignore files from cloned repositories.
>>> What happens when I include ask to include /etc/passwd? Probably
>>> nothing, but there are setups where it might matter (e.g., something
>>> like Travis that auto-builds untrusted repositories, and you could
>>> potentially leak the contents of files via error messages). It's
>>> nice to avoid the issue entirely.
>>
>> I understand the issue.
>>
>> It's not obvious to me how using a .d solves this problem though.
> 
> It doesn't by itself. But we are worried only about tracked .gitignore
> files (recall that even repo-level files in $GIT_DIR/info are generated
> fresh by the clone process, and don't come from the remote). If we apply
> the feature only to core.excludeFile and $GIT_DIR/info/exclude, those
> are already under the user's control.

The things you say make sense from this perspective.

I was hoping to employ this mechanism throughout the git ecosystem.

Thinking out loud for a minute:

1) I clone a repo with a hostile ignore file. It includes files from
/etc/ssl/private or some such. Change. Don't pay attention. Commit.
Push. Problems.

What is the use case for reaching out of the repo in the first place?

2) I fetch a repo with a hostile ignore file. It includes files from
$GIT_DIR/test-data/ssl/private or some such. Change. Don't pay
attention. Commit. Push. Problems if my test data comes from production.

Is this mitigated currently?

Not that git should be an enabler, but surely it falls on the user of
untrusted software to ensure their own security?

> It's true that we could make a similar exception for an "include"
> feature, and respect include directives only in those "safe" files.
> Somehow that seems more confusing to me, though, than doing adding the
> feature at the file level, as it introduces slightly varying syntax
> between the locations.

I'm quickly getting over the include file idea. But yes, that would be
non obvious.

>>> Whereas letting any of the user- or repo-level exclude files be a
>>> directory, and simply reading all of the files inside, seems simple and
>>> obvious.
>>
>> Apart from backwards compatibility, unless there's something I'm missing.
> 
> I'm not sure what you mean. If we make:
> 
>   mkdir .git/info/exclude
>   echo whatever >.git/info/exclude/local
> 
> work, I don't think we have to care about backwards compatibility. That
> was nonsensical before, and never did anything useful (so somebody might
> have done it, but we can assume anybody relying on it not to read the
> contents is crazy).

Seeing your perspective, now, I can see why you didn't understand me. In
your context this makes perfect sense.



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:04, Jeff King wrote:
> On Thu, Oct 27, 2016 at 12:48:34PM -0700, Jacob Keller wrote:
> 
>>> I think the normal behavior in such "foo.d" directory is to just sort
>>> the contents lexically and read them in order, as if they were all
>>> concatenated together, and with no recursion. I.e., behave "as if" the
>>> user had run "cat $dir/*".
>>
>> Yea, this is the normal behavior, and the user is expected to order
>> their files lexically such as "00-name", "50-name" and so on. Pretty
>> traditional for a lot of newer configurations.
> 
> One thing I will say about this approach is that you can implement it
> without any changes in git by doing:
> 
>   path=.git/info/exclude
>   cat $path.d/* >$path
> 
> and I have seen several config mechanisms basically do that (e.g.,
> Debian packaging for a program that doesn't have its own ".d" mechanism,
> but needs to grab config provided by several separate packages).
> 
> The reason to teach that trick to git is convenience; you don't have to
> remember to build the master file from its parts because it's done
> dynamically whenever git needs to look at it.

Precisely.

>> One thing to keep in mind would be that we should make sure we can
>> handle the .gitignore being a submodule or a git repository, so that
>> users could just do something like
> 
> I'm not convinced this is needed for in-repo .gitignore files. The point
> is that you are pulling together separate files that may be administered
> independently. But git repositories inherently have a whole-project
> view. I'm not sure that separate files buy you a lot there. And the
> compatibility issues are more complicated.
> 
> I do agree that:
> 
>   cd .git/info
>   git clone /my/exclude/repo exclude ;# or exclude.d
> 
> should work; ignoring dotfiles when reading the directory solves that,
> and is a pretty standard solution.

You raise a good point about the requirement. I was about to make an
argument in favour of it, but I argued myself out of an argument. I will
say; why have it work in one place and not others?



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 08:48, Jacob Keller wrote:
> I would strongly prefer rc.d style directories either with a "if the
> .gitignore is a directory treat it like rc.d" or even "add support for
> .gitignore.d as well as .gitignore"

I think adding .gitignore.d shouldn't break existing systems, is
intuitive, and solves my issue.

Does git know when it is in a repo that is too new to comprehend?

My current thinking is that anywhere a .gitignore can go, so can a
.gitignore.d (named appropriately of course.) Any existing .gitignore
should take precedence to the result of parsing the directory.

I haven't looked at the implementation of precedence yet, but I'd be
surprised if the existing mechanism can't be employed.

> One thing to keep in mind would be that we should make sure we can
> handle the .gitignore being a submodule or a git repository, so that
> users could just do something like
> 
> "git submodule add  .gitignore and then track git ignore
> contents from a repository in a nice way.
> 
> By this I mean that the reading of files in .gitignore directory
> should exclude reading .git or other hidden files in some documented
> manor so as to avoid problems when linking to a git directory for its
> contents.

Nice! I like this a lot.



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 27/10/16 23:50, Jeff King wrote:
> I'd shy away from an actual include directive, as it raises a lot of
> complications:

I'm leaning that way now too.

>   - we parse possibly-hostile .gitignore files from cloned repositories.
> What happens when I include ask to include /etc/passwd? Probably
> nothing, but there are setups where it might matter (e.g., something
> like Travis that auto-builds untrusted repositories, and you could
> potentially leak the contents of files via error messages). It's
> nice to avoid the issue entirely.

I understand the issue.

It's not obvious to me how using a .d solves this problem though.

>   - finding a backwards-compatible syntax

using .d directories solves this nicely in my opinion

> Whereas letting any of the user- or repo-level exclude files be a
> directory, and simply reading all of the files inside, seems simple and
> obvious.

Apart from backwards compatibility, unless there's something I'm missing.

> If you go that route, it probably makes sense to teach
> gitattributes the same trick.

Understood. I'll keep that in mind.

>> In the case of a directory the plan would be to add links to files
>> stored/sourced elsewhere. This does pose a precedence question which I
>> haven't thought about yet, but probably makes it too hard for the
>> limited value it brings.
> 
> I think the normal behavior in such "foo.d" directory is to just sort
> the contents lexically and read them in order, as if they were all
> concatenated together, and with no recursion. I.e., behave "as if" the
> user had run "cat $dir/*".
> 
> That lets you handle precedence via the filenames (or symlink names).

That was my thinking at first, but I didn't want to bias the discussion.

> It
> can't handle all cases (some items in "00foo" want precedence over "01bar"
> and vice versa), but I don't think there's an easy solution. That's a
> good sign that one or more of the files should be broken up.

I've been burned by this myself by packages interfering with each other
in /etc/sysctl.d

Could we put this down to caveat emptor? I think this sorting should be
intuitive to most people these days, and simple to document and comprehend.





Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 27/10/16 21:19, Alexei Lozovsky wrote:
>> I'm thinking something like ". path/to/include/file" in an ignore file,
>> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore
>> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent
>> mixture of these things.
>
> I think the rc.d-like approach with directories is better as it
> does not add new magical filenames (what if I absolutely do need
> to name my directories ". path", with a space? :)

Yes. Another alternative I thought of was #include path/to/include/file.
That'd be backwards compatible, which is a good thing, but would involve
parsing comments, which obviously could start with #include. Or lines
like ^Include /path/file$ In the case of finding an invalid file,
passing it over and issuing a simple warning should surface any issues
with existing gitignores. Anyway, this conversation is why I bring it up
on the list.

Coming back to this, maybe :(include)/path/file might be more git-like

>> In the case of a directory the plan would be to add links to files
>> stored/sourced elsewhere. This does pose a precedence question which I
>> haven't thought about yet, but probably makes it too hard for the
>> limited value it brings.
>
> Now, if we consider the case of multiple .gitignore files, it
> could be unexpected and possibly annoying for negative patterns
> in one file to affect the patterns added by some other files.

That is a concern. It is non obvious; the worst kind of annoying.

> I would find it more conceptually simple to apply individual
> .gitignores one by one, as opposed to parsing them all and
> creating one giant exclusion rule. (In technical terms, this
> means keeping one struct exclude_list for each .gitignore,
> not merging them all into one single list.)

I agree. I haven't looked, but that sounds like touching significantly
more code though. Actually, thinking about it for 20 seconds more, it
shouldn't be too hard, should it?

> In this case there should be no precendence problems as applied
> gitignores only add new ignored files, without un-ignoring
> anything previously ignored by other files.

Again, I haven't looked yet, but there is still an issue of precedence
with other gitignore files in $HOME and the repo.

> However, if we allow textual inclusion, then it means that we
> can put a gitignore into our gitignore so that we can unignore
> while we ignore, which again brings us the question of whether
> it is actually needed and expected.

Gah! Yes. One way or the other.

>> I would like to know the desirability/practicality/stupidity of such a
>> feature as I believe it is within my skillset to implement it.
>
> However, I do not recall any precendent of git using rc.d-like
> configs.

Many things have adopted this technique recently. Well, the last 15
years. It is common, understood, and fairly simple. I see no issue with it.

> And some can argue that your goal can be achieved by
> generating the .gitignore by some external means and symlinking
> the result into .git/info/exclude, so this is not Git's problem
> and we should not be overcomplicating things with something as
> simple as a list exclude patterns. This line of argument also
> can be used to opposes any textual inclusion as well, because
> it can be expanded into 'why don't we add a Turing-complete
> programming language then to specify the patterns to ignore'.

I know. Another reason to bring the idea to the list. I sort of have
this attitude myself. My main objection to it is that I can't think of a
hook to automate it with.

But: What about some kind of :(exec) that executes a script and returns
a gitignore file? You write the script; you're responsible. And the
behaviour is obvious. I haven't thought that through. It just came to me
then, and might present security issues, but could greatly simplify things.







Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 27/10/16 15:22, Stefan Beller wrote:
>> The use case for this is where I did not write my own rules, but I want
>> to keep them updated. https://github.com/github/gitignore is a damn good
>> resource, but I want to pull it and include relevant bits project by
>> project and/or system wide. I don't want to have to update many projects
>> manually if that, or any other, repo changes.
>
> .git/info/exclude could be a (sym)link to an up to date version
> of the gitignore repo as a hack?
>

Using links isn't a bad idea, but you still end up at some stage
combining the contents of several files that already exist. Well, in my
example, anyway.

I accept that I'm being pretty trivial, and once it's set up there's
never any pressing need to change anything, but it still irks me.

Even with a linked .gitignore, or .git/info/exclude there will be
sections that are project, language, editor, machine, whatever,
specific. So I still need to copy stuff from one file to another by
hand. By allowing includes, I only have to have a link to each file
describing the data types for each component of the environment.

And they are community maintained, so I don't have to google every time
I try a new editor.

note to self: reply-to isn't the list.


Expanding Includes in .gitignore

2016-10-26 Thread Aaron Pelly
I want a feature. It may be a bad-idea(tm). Advice appreciated.

I want git to be able to include, in its gitignore files, sub-files of
ignores or have it understand a directory of ignore files. Or both.

The use case for this is where I did not write my own rules, but I want
to keep them updated. https://github.com/github/gitignore is a damn good
resource, but I want to pull it and include relevant bits project by
project and/or system wide. I don't want to have to update many projects
manually if that, or any other, repo changes.

A very brief look at dir.c would indicate that a recursive call from
add_excludes to itself when it parses some sort of include tag would do
it within a file. I'm sure it'd be pretty straight forward to hook into
something in dir.c to parse directories too.

I'm thinking something like ". path/to/include/file" in an ignore file,
and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore
and $GIT_DIR/info/exclude to be directories. Or some sane and consistent
mixture of these things.

In the case of a directory the plan would be to add links to files
stored/sourced elsewhere. This does pose a precedence question which I
haven't thought about yet, but probably makes it too hard for the
limited value it brings.

There is also the issue of malicious/accidental recursion which I
haven't thought about deeply either.

I would like to know the desirability/practicality/stupidity of such a
feature as I believe it is within my skillset to implement it.