Re: problem with not being able to enforce git content filters

2018-10-16 Thread Stas Bekman
On 2018-10-16 05:54 PM, brian m. carlson wrote:
[...]
>> It doesn't help with direct commits to master, since CI would be
>> detecting it after it was committed. And when that happens we all know
>> that already because 'git pull' fails.
> 
> Typically projects that have CI set up don't allow direct pushes to
> master, since that tends to allow to bypass CI, or they allow it only in
> exceptional circumstances (e.g., reverts).  Also, typically most
> projects want to have some sort of review before code (resp. documents,
> other contributions) are merged.  Most Git hosting platforms, including
> GitHub, offer protected branches, so it's something to consider.
> 
> There is a possible alternative, and that's that if your project has
> some sort of build file (e.g., a Makefile), you can set it up to
> automatically insert hooks or git configuration into developers'
> systems, optionally only if it's in a Git repository.  For example, you
> could add a pre-commit hook that fails if the filters are disabled.
> 
> These are the approaches that tend to work well for most projects.  I
> tend to prefer the CI approach with restricted branches because often I
> want to customize my hooks, but your project will decide what works best
> for it.

Yes, Brian, what you're sharing makes total sense when things are setup
this way, but this is not the case with the project I'm contributing to.
This one is setup, commit first, review later, due to having too few hands.

And I have already setup a CI to detect misconfigured systems. It'll
catch RPs in time, and everything else post-commit. Let's hope the
developers will watch the status of their commits and will react quickly
to fix their setup and mend the broken commit, when they see their
commit broke things. That's as good as it can get in this particular
situation I understand.

Thank you, Brian and Ævar for your support and very helpful suggestions.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: problem with not being able to enforce git content filters

2018-10-16 Thread Stas Bekman


>> And the devs honestly try to do their best to remember to configure the
>> filters, but for some reason they disappear for them, don't ask me why,
>> I don't know. This is an open source project team, not a work place.
> 
> This sounds like it could be easily solved by continuous integration.
> You could set up a job on any of a variety of services that checks that
> a pull request or other commit is clean when when the filter runs.  If
> it doesn't pass, the code doesn't merge.

This is an excellent idea wrt to PRs. Thank you, Brian! I will implement
that.

It doesn't help with direct commits to master, since CI would be
detecting it after it was committed. And when that happens we all know
that already because 'git pull' fails.


-- 
____
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: problem with not being able to enforce git content filters

2018-10-16 Thread Stas Bekman
On 2018-10-16 02:17 PM, Ævar Arnfjörð Bjarmason wrote:
[...]
>> We can't use server-side hooks to enforce this because the project is on
>> github.
> 
> Ultimately git's a distributed system and we won't ever be able to
> enforce that users in their local copies use filters, and they might
> edit stuff e.g. in the GitHub UI directly, or with some other git
> implementation.

In this particular case github won't be a problem, since for the problem
to appear it has to be executed on the user side. Editing directly in
github UI is not a problem.

Just to give a little big more context to the issue in first place. A
jupyter notebook is a json file that contains the source code, the
outputs from executing that code and the unique user environment bits.
We want to "git" the source code but not the rest. Otherwise merging is
a hell. Hence the stripping.

There are other approaches to solve this problem, besides stripping, but
 they all require some kind of pre-processing before committing the file.

> So if you have such special needs maybe consider hosting your own setup
> where you can have pre-receive hooks, or within GitHub e.g. enforce that
> all things must go through merge requests, and have some robot that
> checks that the content to be merged has gone through filters before
> being approved.

Yes, that would be ideal. But I doubt this is going to happen, I'm just
a contributing developer, not the creator/owner of the project. And as I
said this affects anybody who collaborates on jupyter notebooks, not
just in our project. I think there are several millions of them on github.

> *Picks up flag*. For the purposes of us trying to understand this report
> it would be really useful to boil what's happening down to some
> step-by-step reproducible recipe.
> 
> I.e. with some dummy filter configured & not configured how does "git
> pull" end up breaking, and in this case you're alluding to git simply
> forgetting config, how would that happen?

The problem of 'git pull' and 'git status' and 'git stash' is presented
in details here:
https://stackoverflow.com/questions/51883227/git-pull-stash-conflicts-with-a-git-filter
and more here:
https://github.com/kynan/nbstripout/issues/65
https://github.com/jupyter/nbdime/issues/410#issuecomment-412999758

The problem of configuration disappearing I sadly have no input. It
doesn't happen to me, and all I get from others is that it first works,
and then it doesn't. Perhaps it has something to do with some of them
using windows. I don't know, sorry.

>> The bottom line it sucks and I hope that you can help with offering a
>> programmatic solution, rather than recommending creating a police
>> department.
> 
> I think it would be great to have .gitconfig in-repo support, but a lot
> of security & UI problems need to be surmounted before that would
> happen, here's some previous ramblings of mine on that issue:
> https://public-inbox.org/git/?q=87zi6eakkt.fsf%40evledraar.gmail.com
> 
> It now occurs to me that a rather minimal proof-of-concept version of
> that would be:
> 
>  1. On repository clone, detect if HEAD:.gitconfig exists
> 
>  2. If it does, and we trust $(git config -f  -l)
> enough, emit some advice() output saying the project suggests
> setting config so-and-so, and that you could run the following one
> liner(s) to set it if you agree.
> 
>  3. Once we have that we could eventually nudge our way towards
> something like what I suggested in the linked threads above,
> i.e. consuming some subset of that config directly from the repo's
> HEAD:.gitconfig

I like it, Ævar!

I feel doing the check and prompting the user on the first push/commit
after clone would be a better. It'd be too easy for the user to skip
that step on git clone. In our particular case we want it where the
problem is introduced, which is on commit, and not on clone. I hope it
makes sense.


-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


problem with not being able to enforce git content filters

2018-10-16 Thread Stas Bekman
Hi,

TL;DR

Our open source project dev team has a continuous problem with git
content filters, because developers don't always have them configured.

We need a way for git to support content filters w/o using user's
.gitconfig. Otherwise it leads to an inconsistent behavior and messed up
git checkouts.

=

Full story:

We use a version of the nbstripout content filter
(https://github.com/kynan/nbstripout), which removes user-specific
information from the jupyter notebooks during commit. But the problem
would be the same with any one way clean filter.

First the setup:
https://github.com/kynan/nbstripout#manual-filter-installation

=
Set up a git filter using nbstripout as follows:

git config filter.nbstripout.clean '/path/to/nbstripout'
git config filter.nbstripout.smudge cat
git config filter.nbstripout.required true

Create a file .gitattributes or .git/info/attributes with:

*.ipynb filter=nbstripout

Apply the filter for git diff of *.ipynb files:

git config diff.ipynb.textconv '/path/to/nbstripout -t'

In file .gitattributes or .git/info/attributes add:

*.ipynb diff=ipynb

=

The problem is that it can't be enforced.

When it's not enforced, we end up with some devs using it and others
don't, or more often is the same dev sometimes doesn't have it configured.

When a person has a stripped out notebook checked out, when another
person commits un-stripped out notebook, it leads to: invalid `git
status` reports, `git pull` breaks, `git stash` doesn't work, since it
tries to stash using the filters, and `git pull' can never succeed
because it thinks that it'll overwrite the local changes, but `git diff`
returns no changes.

So the only solution when this happens is to disable the filters, clean
up the mess, re-enable the filters. Many people just make a new clone -
ouch!

And the biggest problem is that it affects all users who may have the
filters enabled, e.g. because they worked on a PR, and not just devs -
i.e. the repercussions are much bigger than just a few devs affected.

We can't use server-side hooks to enforce this because the project is on
github.

And the devs honestly try to do their best to remember to configure the
filters, but for some reason they disappear for them, don't ask me why,
I don't know. This is an open source project team, not a work place.

You can see some related complaints here:
https://github.com/kynan/nbstripout/issues/65#issuecomment-430346894
https://stackoverflow.com/questions/51883227/git-pull-stash-conflicts-with-a-git-filter
and I can find you a whole bunch more if you need more evidence.

==

Proposed solution:

There needs to be a way for a project to define content filters w/o
going via user's .gitconfig configuration, since due to git's security
it can't be distributed to all users and must be enabled manually by
each user, which is just not the right solution in this case.

Of course, I'm open to other suggestions that may help this issue.

"Tell your developers they must configure the filters" is not it - I
tried it for a long time and in vain. If you look at our install
instructions: https://github.com/fastai/fastai#developer-install

  git clone https://github.com/fastai/fastai
  cd fastai
  tools/run-after-git-clone

It already includes an instruction to run a script which enables the
filters, but this doesn't seem to help (and no, it's not a problem with
the script). The devs report that the configuration is there for a
while, and then suddenly it is not there, I don't know why. Perhaps they
make a new clone and forget to re-enable the filters, perhaps they
disable them to clean up and forget to reenable them, I can't tell.

The bottom line it sucks and I hope that you can help with offering a
programmatic solution, rather than recommending creating a police
department.

Thank you for listening.

-- 
____
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-24 Thread Stas Bekman
On 2018-09-24 02:08 PM, Ævar Arnfjörð Bjarmason wrote:
[...]

> Posting to this mailing list is generally how it's done

Thank you, Ævar, for clarifying that there is no issue tracker for the
git project.

> The best way to fix stuff in git that you can't interest others in is to
> do it yourself. Take a look at Documentation/SubmittingPatches in the
> git.git repository for how to do that.

Based on the initial rich discussion this post created I had a feeling
that there was a lot of interest. But you're correct, it's easier to
share one's thought and patches take a lot more effort and time.

> In particular, starting by clarifying the docs around this as I
> suggested upthread might be a good and easy start to your first
> contribution to git!

That's an excellent idea. Philip started the process and hopefully it
will lead to better documentation of the issue.

Thanks again.


-- 
____
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: [PATCH 0/1] Re: git silently ignores include directive with single quotes

2018-09-24 Thread Stas Bekman
On 2018-09-24 03:24 PM, Philip Oakley wrote:
> Rather than attaching the problem with code, I decided to simply update
> the config file documentation.
> 
> As the userbase expands the documentation will need to be more comprehensive
> about exclusions and omissions, along with better highlighting for core
> areas.
> 
> I would be useful if Stas could comment on whether these changes would
> have assisted in debugging the faulty config file. 

Thank you for writing this doc patch, Philip.

The documentation improvement would be most useful in conjunction with a
an improved debugging/tracing facility. So that a user can see what git
is seeing. Once a user sees that their configuration is broken then they
can peruse the improved documentation to find why it is broken. Without
the debugging ability, the docs would help but it'll be a much longer
journey, since words like:

"Single quotes are not special and form part of the variable's value."

aren't necessarily going to stand out as an indicator of a potential
problem, when you won't think twice that quotes could even be a suspect,
even though the docs say so explicitly.

A trace saying:

"./.git/'.gitconfig'" is not found

would speak volumes and be self-documenting.

In lieu of that, the docs would be need to have more examples.

Here are the potential expansions to the patch you shared:

1. "Single quotes are not special and form part of the variable's value.
For example, if the configuration includes:

  include = '.gitconfig'

then git will look for "'.gitconfig'", single quotes included. Also note
that it'll look for the file relative to "REPO/.git/", hence it'll look
for "REPO/.git/'.gitconfig'", which is most likely incorrect, since you
can't check in files under "REPO/.git/". The correct configuration for
including "REPO/.gitconfig" is:

  include = ../.gitconfig

2. Same with:

"Both the `include` and `includeIf` sections implicitly apply an 'if
found' condition to the given path names."

To a user this would be a difficult statement to make sense of. An
example would fix that:

"Both the `include` and `includeIf` sections implicitly apply an 'if
found' condition to the given path names. For example, if the
configuration includes:

  include = ../.gitconfig

and git finds "REPO/.gitconfig", it will include its configuration. If
git can't find it, it will silently ignore this include statement until
this file appears. It has been designed this way to allow for optional
user-specific configuration facilities."

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git diff-tree ignores --textconv

2018-09-23 Thread Stas Bekman
On 2018-09-23 06:23 PM, Jeff King wrote:
> On Sun, Sep 23, 2018 at 05:56:03PM -0700, Stas Bekman wrote:
> 
>>> You probably want "--ext-diff", not "--textconv".
>> [...]
>> Would it be safe to ask the maintainer of the application to include
>> both --textconv and --ext-diff in that 'git diff-tree' call? I only need
>> the latter, but someone needed --textconv there as it's in the code.
> 
> I think so. The main reason that they are not the default for plumbing
> commands such as diff-tree is that the output may be quite surprising to
> anything trying to parse the output. Using --textconv will always
> produce a diff, but one that may not be applied to the original content.
> Using --ext-diff may produce output that doesn't even look like a diff,
> though in practice they often do.
> 
>> This is for this package:
>> https://github.com/rsmmr/git-notifier
> 
> It looks like the output is meant to be read by humans, so yeah, I think
> it would be fine (and preferred) to enable both.

Fantastic. Thank you so much, Jeff.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git diff-tree ignores --textconv

2018-09-23 Thread Stas Bekman
On 2018-09-23 05:43 PM, Jeff King wrote:
> On Sun, Sep 23, 2018 at 03:41:45PM -0700, Stas Bekman wrote:
> 
>> $ git config --get diff.jupyternotebook.command
>> git-nbdiffdriver diff
> 
> That's an "external diff driver", not a textconv driver.
> 
> So here:
> 
>> $ GIT_TRACE=1 git diff-tree -p HEAD --textconv test/test.ipynb
>> 
> 
> You probably want "--ext-diff", not "--textconv".
> 
> There's some discussion in the gitattributes manpage, but the short of
> it is that textconv converts binary input to text, which is then fed
> through the normal diff mechanism. Whereas an external diff driver is
> given both sides and can produce whatever output it wants. Textconv is
> less flexible, but generally way easier to write.

Thank you, Jeff, for explaining my misunderstanding and how to fix it.

Would it be safe to ask the maintainer of the application to include
both --textconv and --ext-diff in that 'git diff-tree' call? I only need
the latter, but someone needed --textconv there as it's in the code.

This is for this package:
https://github.com/rsmmr/git-notifier

It was added here:
https://github.com/rsmmr/git-notifier/search?q=textconv_q=textconv

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-23 Thread Stas Bekman
Apologies for I don't know how this project manages issues, so I'm not
sure whether it is my responsibility to make sure this issue gets
resolved, or do you have some tracking mechanism where you have it
registered? There is no rush, I'm asking because the discussion about
this issue has suddenly dropped about 2 weeks ago, hence my ping.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


git diff-tree ignores --textconv

2018-09-23 Thread Stas Bekman
Hi,

I'm using a 3rd party application that internally uses 'git diff-tree'
instead of 'git diff'. I'm trying to add filter and it works with 'git
diff' but it gets ignored with 'git diff-tree' despite having --textconv.

I was able to reproduce the problem with the following much more
simplified setup:


$ git check-attr diff test/test.ipynb
test/test.ipynb: diff: jupyternotebook


$ git config --get diff.jupyternotebook.command
git-nbdiffdriver diff


$ GIT_TRACE=1 git diff test/test.ipynb
[...]
run_command: GIT_DIFF_PATH_COUNTER=1 GIT_DIFF_PATH_TOTAL=1
'git-nbdiffdriver diff'
[...]



$ GIT_TRACE=1 git diff-tree -p HEAD --textconv test/test.ipynb


Git tracing shows nothing relevant as in the command above.

According to https://git-scm.com/docs/git-diff-tree adding --textconv
should have respected the configured diff filter, but it does nothing.

Is this a bug or do I have something misconfigured?

Thank you.

git version 2.17.1

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-10 Thread Stas Bekman
To add another report of a similar problem, of silent skipping and not
of filepath quoting, I found this one:

https://stackoverflow.com/questions/31203634/git-clean-filter-python-script-on-windows/52264440#52264440

The user created .gitconfig and added to .git/config:

[include]
path = .gitconfig

Not realizing that the two were not in the same folder. And probably
assuming that .git/config was referring to the root of repository, and
not relative to .git/, which is a reasonable assumption.

Of course he had no way of resolving this as git wasn't telling him
where it wasn't finding the file. i.e.

Can't find: ~/myrepo/.git/.gitconfig

which would have instantly told him where the problem was.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 07:51 PM, Paul Smith wrote:
[...]
> What I personally think would be more useful would be some sort of
> "verbose parsing" option to git config, that would parse the
> configuration just as a normal Git command would and show diagnostic
> output as the entire config is parsed: for each action line the config
> file name and line number, and the operation performed (and any message
> about it) would be printed.  This could be useful in a variety of
> situations, for instance to discover conflicts between local, global,
> and system configuration, easily see where settings are coming from,
> etc.
> 
> And as part of this output, when an include file was not present or we
> didn't have permissions or whatever, an appropriate error message would
> be generated.

I was thinking along the same lines, Paul - i.e. no need to change
anything in the config syntax, but to provide better diagnostics.

I quote below what I suggested in an earlier email, but I like Paul's
idea even better as it'd be useful to many situations and not just the
one that started this thread.

> 1) I suggest this is done via:
>
>   git config --list --show-origin
>
> where the new addition would be to also show configuration parts that
> are not active and indicating why it is so.
>
> So for example currently I get on a valid configuration setup and having
> git/../.gitconfig in place the following output:
>
> [...]
> file:/home/stas/.gitconfig  mergetool.prompt=false
> [...]
> file:.git/configinclude.path=../.gitconfig
> [...]
> file:.git/../.gitconfig
> filter.fastai-nbstripout-code.clean=tools/fastai-nbstripout
> [...]
>
> Now, if include.path=../.gitconfig is there and file:.git/../.gitconfig
> is not found, it will indicate that in some way that stands out for the
> user. Perhaps:
>
> [...]
> file:/home/stas/.gitconfig  mergetool.prompt=false
> [...]
> file:.git/configinclude.path=../.gitconfig
> [...]
> file:.git/../.gitconfig FILE NOT FOUND! Ignored configuration
> [...]
>
> So that would allow things to work as before, but now we have a way to
> debug user-side configuration. And of course hoping that the docs would
> indicate that method for debugging configuration problems.
>
> I hope this is a reasonable suggestion that doesn't require any
> modification on the users' part who rely on this silent ignoring
> "feature", yet lending to a configuration debug feature.



-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 02:22 PM, Jeff King wrote:
[...]
>> The original problem cropped up due to using:
>>
>>  git config --local include.path '../.gitconfig'
>>
>> which on linux stripped the single quotes, but on some windows git bash
>> emulation it kept them.
> 
> That sounds like a bug in git bash, if it is not treating single quotes
> in the usual shell way. But I'd also expect such a bug to cause loads of
> problems in all of the shell scripts. Are you sure it wasn't cmd.exe or
> some other interpreter?

I don't know, Jeff. I think the user said it was first anaconda shell.
And then the user tried gitforwindows with same results. I don't know
MSwindows at all.

But it doesn't matter at the end of the day, since we can't cover all
possible unix shell emulations out there. What matters is that there is
a way to flag the misconfiguration, either by default, or through a
special check - some ideas I suggested in my previous email, but surely
you have a much better insight of how to deal with that.

Thank you.

-- 
________
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 01:28 PM, Ævar Arnfjörð Bjarmason wrote:
[...]
> Yeah, some version of this is sensible. There's at least a doc patch in
> here somewhere, if not some "warn if missing" mode.
> 
> So don't take any of this as minimizing that aspect of your bug report.
> 
> *But*
> 
> There's just no way that "git" the tool can somehow in a sane way rescue
> you from knowing the quoting rules of the shell on your system, which
> differ wildly between the likes of Windows and Linux.

I understand. All your explanations are perfectly reasonable, Ævar.
Thank you.

Yet, there needs to be some way for a user to know that git ignored
something if their configuration doesn't work as expected.

1) I suggest this is done via:

  git config --list --show-origin

where the new addition would be to also show configuration parts that
are not active and indicating why it is so.

So for example currently I get on a valid configuration setup and having
git/../.gitconfig in place the following output:

[...]
file:/home/stas/.gitconfig  mergetool.prompt=false
[...]
file:.git/configinclude.path=../.gitconfig
[...]
file:.git/../.gitconfig
filter.fastai-nbstripout-code.clean=tools/fastai-nbstripout
[...]

Now, if include.path=../.gitconfig is there and file:.git/../.gitconfig
is not found, it will indicate that in some way that stands out for the
user. Perhaps:

[...]
file:/home/stas/.gitconfig  mergetool.prompt=false
[...]
file:.git/configinclude.path=../.gitconfig
[...]
file:.git/../.gitconfig FILE NOT FOUND! Ignored configuration
[...]

So that would allow things to work as before, but now we have a way to
debug user-side configuration. And of course hoping that the docs would
indicate that method for debugging configuration problems.

I hope this is a reasonable suggestion that doesn't require any
modification on the users' part who rely on this silent ignoring
"feature", yet lending to a configuration debug feature.

2) And a secondary suggestion I mentioned earlier is to also have a flag
for git config to validate the path as it is being configured:

 git config --local include.path '../.gitconfig' --validate-path

so that on shells that deal with quoting differently, than what git
expects, this git command will fail saying:

error: can't find file:.git/'../.gitconfig'

or at the very least give a warning if we don't want it be fatal. Though
I see no problem with it being fatal if a user uses a special flag.

I made this second suggestion since it will help users to detect the
problem early on. Before they need to search for another debug solution
such as the first one suggested in this email.

3) Finally, it'd be useful to have GIT_TRACE=1 state that so and so
include path wasn't found and was ignored during various 'git whatever'
commands.

I am open to any or all of these solutions, or alternative suggestions
of course.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 01:02 PM, Ævar Arnfjörð Bjarmason wrote:

> Aside from other issues here, in the "wold of unix" (not that we only
> use the git config syntax on those sort of systems) you can't assume
> that just because some quoting construct works in the shell, that it
> works the same way in some random config format. If you look in your
> /etc/ you'll find plenty of config formats where you can't use single,
> double and no quotes interchangeably, so I don't see what hte confusion
> is with that particular aspect of this.
> 
> Although as I mentioned in <87bm97rcih@evledraar.gmail.com> the fact
> that we ignore missing includes definitely needs to be documented, but
> that our quoting constructs in our config format behave like they do in
> POSIX shells I see as a non-issue.

I agree that I should make no such assumptions. Thank you. But it is a
cross-platform problem. I remind that the original problem came from a
simple command:

 git config --local include.path '../.gitconfig'

Which on linux removed the quotes and all was fine, and on windows the
same command kept the quotes and the user was tearing his hair out
trying to understand why the custom config was ignored.

So you can say, don't use the quotes in first place. But what if you have:

 git config --local include.path 'somewhere on the system/.gitconfig'

you have to use single or double quotes inside the shell to keep it as a
single argument, yet on some windows set ups it'll result in git
ignoring this configuration directive, as the quotes will end up in git
config file.

I'd say at the very least 'git config' could have an option
--verify-path or something similar and for it to validate that the path
is there exactly as it adds it to .git/config at the time of running
this command to help the user debug the situation. Of course this won't
help if .git/config is modified manually. But it's a step towards
supporting users.

I hope this clarifies the situation.

-- 
____
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 12:54 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Sat, Sep 08 2018, Martin Ågren wrote:
> 
>> Hi Stas
>>
>> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
>>> [include]
>>> path = '../.gitconfig'
>>>
>>> Notice the single quotes around the filename. When this is the case git
>>> silently (!) ignores the custom configuration, which is clearly a bug.
>>
>> Thanks for reporting and describing out your expectations and what you
>> observed.
>>
>> Actually, there is a test explicitly testing that 'missing include files
>> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
>> (config: add include directive, 2012-02-06).
>>
>>> The original problem cropped up due to using:
>>>
>>>  git config --local include.path '../.gitconfig'
>>>
>>> which on linux stripped the single quotes, but on some windows git bash
>>> emulation it kept them.
>>
>> Huh, I wouldn't have expected them to be kept. You learn something
>> new every day...
>>
>>> What am I suggesting is that git:
>>>
>>> (1) should complain if it encounters an invalid configuration and not
>>> silently ignore it. It took quite some effort and time to figure the
>>> culprit.
>>
>> Sounds reasonable to me, but I might be missing something. I'm cc-ing
>> the original author. Maybe he can recall why he made sure it silently
>> ignores missing files.
>>
>>> (2) probably allow the quoted location of the file, but it's much less
>>> important, as it's easy to rectify once git gives user #1
>>
>> I don't think this will work. Allowing quoting for just this one item,
>> or for all? Any and all quoting or just at the first and last character?
>> What about those config items where quotes might legitimately occur,
>> i.e., we'd need some escaping? Actually, something like '.gitconfig'
>> *with* *those* *quotes* is a valid filename on my machine.
> 
> The reason missing includes are ignored is that the way this is expected
> to be used is e.g.:
> 
> [include]
> path ~/.gitconfig.work
> 
> Where .gitconfig.work is some configuration you're going to drop into
> place on your $dayjob servers, but not on your personal machine, even
> though you sync the same ~/.gitconfig everywhere.

Thank you for clarifying why this is done silently, Ævar. It makes sense
then.

> If we were to make nonexisting files an error, we'd need something like
> an extension of the includeIf syntax added in 3efd0bedc6 ("config: add
> conditional include", 2017-03-01) 3efd0bedc6 ("config: add conditional
> include", 2017-03-01). I.e.:
> 
> [includeIfcond "test -e ~/.gitconfig.work"]
> path = ~/.gitconfig.work
> 
> Or something like that, this is getting increasingly harder to shove
> into the *.ini config syntax.

This suggestion won't solve the real problem. The real problem is that
git can't find '.gitconfig' even though it's there, due to single quotes
around the filepath. So the suggested check will still ignore the
configuration even if it's there.

This also leads me to think what if the include path has spaces in it?

path = ~/somewhere on my system/.gitconfig.work

most people would assume quotes are needed around the filepath.


-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 12:30 PM, Martin Ågren wrote:
> Hi Stas
> 
> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
>> [include]
>> path = '../.gitconfig'

> Actually, there is a test explicitly testing that 'missing include files
> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
> (config: add include directive, 2012-02-06).

And also to stress out, that the file is not missing.  At least in the
world of unix, in particular its many shells, - command line arguments
"xyz", 'xyz', xyz are often deemed to be the same if there are no spaces
in the word. So that's why it took us a lot of trial and error to even
consider the quotes in '../.gitconfig' as a problem. While git deems it
different, to me:

path = '../.gitconfig'
path = "../.gitconfig"
path = ../.gitconfig

appear to be the "same". So git needs to have a way to say otherwise.

I realize I am going back to the issue of quoting here, after suggesting
to ignore it. So to clarify I'm bringing it up only in the context of
wanting git to tell the user what it wants, and not necessarily asking
to support all the possible ways one could quote a filepath.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 12:30 PM, Martin Ågren wrote:

> Actually, there is a test explicitly testing that 'missing include files
> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
> (config: add include directive, 2012-02-06).

Thank you for the follow up, Martin. And discovering that it is by design.

I suppose this could have been done to optimize run-time performance.
But there must be a way for a user to validate their custom
configuration. So perhaps there should be a specific directive to do so?
One could argue that:

  git config --list --show-origin

does exactly that. Except it should probably also indicate that some
configuration file or parts of were ignored - and clearly indicate the
exact nature of the problem. In which case it'd be sufficient.

>> (2) probably allow the quoted location of the file, but it's much less
>> important, as it's easy to rectify once git gives user #1
> 
> I don't think this will work. Allowing quoting for just this one item,
> or for all? Any and all quoting or just at the first and last character?
> What about those config items where quotes might legitimately occur,
> i.e., we'd need some escaping? Actually, something like '.gitconfig'
> *with* *those* *quotes* is a valid filename on my machine.

Let's ignore this sub-issue for now. If we can get git to report when
something is mis-configured, this issue can then be easily resolved.


-- 
________
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
Hi,

One of the windows users discovered this bug, and I was able to
reproduce it on linux.

We are using a custom content filter configuration REPO/.gitconfig which
needs to be enabled inside REPO/.git/config:

This works:

[include]
path = ../.gitconfig

This doesn’t:

[include]
path = '../.gitconfig'

Notice the single quotes around the filename. When this is the case git
silently (!) ignores the custom configuration, which is clearly a bug.

I found the easiest to debug this is by using:

git config --list --show-origin

In the former case it shows the custom config, in the latter it does not.

Yet, git gives no indication of any errors, not even with GIT_TRACE and
other debug vars.

The original problem cropped up due to using:

 git config --local include.path '../.gitconfig'

which on linux stripped the single quotes, but on some windows git bash
emulation it kept them.

What am I suggesting is that git:

(1) should complain if it encounters an invalid configuration and not
silently ignore it. It took quite some effort and time to figure the
culprit.

(2) probably allow the quoted location of the file, but it's much less
important, as it's easy to rectify once git gives user #1

I don't have the details about the windows user setup, but I was able to
reproduce this bug with git version 2.17.1 on linux.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: GIT_TRACE doesn't show content filter files it's operating on

2018-08-28 Thread Stas Bekman
On 2018-08-27 06:13 PM, Stas Bekman wrote:
[...]
> I now know how get the filenames for "clean/smudge" filters. Can you
> please help with the same for "textconv". %f doesn't work - it gets
> stuck there waiting for stdin, the following seems to pass, but I'm not
> sure it's correct:

On a closer look this is not needed for "textconv" as it already logs
the filename in the GIT_TRACE=1 trace.

I have no more remaining questions, Thank you for your help, Jeff.

-- 
________
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: GIT_TRACE doesn't show content filter files it's operating on

2018-08-27 Thread Stas Bekman
On 2018-08-27 05:58 PM, Jeff King wrote:
[...]
>> 2. Is there no way to get git to do the filename reporting as a normal
>> GIT_TRACE behavior? I don't know anything about its internal workings,
>> but it surely must knows which file it operates on when it opens it and
>> sends its data as stdin to the content filter. It makes the debugging so
>> much easier when one can see what files are being worked on. So perhaps
>> this utility can be made available to all not just as a hack/workaround.
> 
> No, because GIT_TRACE itself only reports on the execution of commands
> and sub-processes. There are other GIT_TRACE_* variables for various
> subsystems, but AFAIK nobody has instrumented the smudge/clean filter
> code. IMHO it would be reasonable to have a GIT_TRACE_CONVERT
> that covered convert.c (so these filters, but also newline conversion,
> etc).

Thank you, Jeff.

Would you please add this to the TODO list (if that's how new feature
requests are made)?

I now know how get the filenames for "clean/smudge" filters. Can you
please help with the same for "textconv". %f doesn't work - it gets
stuck there waiting for stdin, the following seems to pass, but I'm not
sure it's correct:

[diff "ipynb-code"]
textconv = "f() { echo >&2 \"textconv: nbstripout -t $*\"; nbstripout
-t; }; cat $1 > f"

In a trace (git diff w/ one modified file) it gets invoked twice:

18:05:27.640955 run-command.c:640   trace: run_command: 'f() { echo
>&2 "textconv: nbstripout $*"; nbstripout -t; }; cat $1 > f'
/tmp/4nCYaT_004_callbacks.ipynb
18:05:27.642205 run-command.c:640   trace: run_command: 'f() { echo
>&2 "textconv: nbstripout $*"; nbstripout -t; }; cat $1 > f'
dev_nb/004_callbacks.ipynb

The first trace gives a tmp filename, whereas I need the actual filename.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: GIT_TRACE doesn't show content filter files it's operating on

2018-08-27 Thread Stas Bekman
On 2018-08-27 04:53 PM, Jeff King wrote:
> On Mon, Aug 27, 2018 at 04:23:34PM -0700, Stas Bekman wrote:
[...]
>> How can I get GIT_TRACE's run_command to show the arguments passed to
>> the filter? I looked at various other debug environment variables in
>> git's manual, but I don't see anything that would enable that level of
>> debug.
[...]
> You can work around it with some shell hackery:
> 
>   git config filter.foo.clean 'f() { echo >&2 "cleaning $1"; myfilter ...; }; 
> f %f'
> 
> and then even without GIT_TRACE, you get:
> 
>   $ git add .
>   cleaning .gitattributes
> 
> Or if you really just want to trigger for GIT_TRACE, try just this:
> 
>   $ git config filter.foo.clean 'f() { myfilter; }; f %f'
>   19:52:52.874064 [pid=14719] git.c:415 trace: built-in: git add .
>   19:52:52.875115 [pid=14719] run-command.c:637 trace: run_command: 'f() 
> { myfilter; }; f '\''.gitattributes'\'''

Your suggestions do the trick, Jeff. Thank you.

1. To benefit others who might be looking for something similar may I
post your suggestions as an answer to:
<https://stackoverflow.com/questions/51995773/getting-git-to-show-specific-filenames-it-is-running-content-filters-on>?

2. Is there no way to get git to do the filename reporting as a normal
GIT_TRACE behavior? I don't know anything about its internal workings,
but it surely must knows which file it operates on when it opens it and
sends its data as stdin to the content filter. It makes the debugging so
much easier when one can see what files are being worked on. So perhaps
this utility can be made available to all not just as a hack/workaround.


-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


GIT_TRACE doesn't show content filter files it's operating on

2018-08-27 Thread Stas Bekman
Hello,

I'm debugging the workings of a configured git content filter
(nbstripout) and I'm trying to get GIT_TRACE to show me the files it's
operating on, but it doesn't. Consider:

$ GIT_TRACE=1 git pull origin master
[...] removed irrelevant sections of the output
16:49:28.846707 run-command.c:640   trace: run_command: git merge
FETCH_HEAD
16:49:28.849309 git.c:344   trace: built-in: git merge
FETCH_HEAD
Updating 1ea49ad..ae0ba93
16:49:28.863291 run-command.c:640   trace: run_command: nbstripout
16:49:28.864700 run-command.c:640   trace: run_command: nbstripout
16:49:28.866060 run-command.c:640   trace: run_command: nbstripout
[...] many more of the same

How can I get GIT_TRACE's run_command to show the arguments passed to
the filter? I looked at various other debug environment variables in
git's manual, but I don't see anything that would enable that level of
debug.

I'm aware of git check-attr, but I'd like to see the run time trace on
which files it's running the filter on and with which arguments.

git version 2.17.1

I originally posted this as a question here:
<https://stackoverflow.com/questions/51995773/getting-git-to-show-specific-filenames-it-is-running-content-filters-on>
and it appears that sq_quote_argv_pretty() doesn't get argv when a
content filter is invoked, so it doesn't show this information.

Would it be possible to make the files and any other passed arguments
show in the trace?

Thank you.

-- 
____
Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books