Re: Support character classes in glob authz rules

2018-12-14 Thread Doug Robinson
Brane:

I just read through this thread.  Your proposal makes a lot of sense.
To me, it's one of those things that should go into a new version (not a
patch).
And there should be a nit-picky script to point out "strange stuff" (like
the example
that Julian posed).

Cheers.

Doug

On Mon, Dec 3, 2018 at 10:31 AM Branko Čibej  wrote:

> On 03.12.2018 16:07, Michael Pilato wrote:
> > On 12/3/18 3:15 AM, Julian Foad wrote:
> >> It makes me uncomfortable to depart from standard parsing. What if
> >> users are relying on Python ConfigParser or other compatible parsing
> >> as part of their Subversion authz infrastructure?
> > We needn't keep this hypothetical.  ViewVC is using (a slightly
> > modified[*]) Python ConfigParser in this way.
> >
> > -- Mike
> >
> > [*] By default, ConfigParser (well, really the RawConfigParser it
> > subclasses) lowercases option names, which can cause username/group
> > matching to fail.  So ViewVC's code replaces the optionxform method of
> > ConfigParser with a noop lambda function.  (See
> >
> https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.optionxform
> > for official docs.)
>
>
> Interesting. Of course, our ConfigParser-like implementation already has
> the option to treat section names as case-sensitive, and this option is
> used ... that's righjt, by the authz file parser.
>
> -- Brane
>
>

-- 
*DOUGLAS B ROBINSON* SENIOR PRODUCT MANAGER

T +1 925 396 1125
*E* doug.robin...@wandisco.com

-- 


* *

**The LIVE DATA Company
*Find out more 
*wandisco.com *



 

*


THIS MESSAGE 
AND ANY ATTACHMENTS ARE CONFIDENTIAL, PROPRIETARY AND MAY BE PRIVILEGED

If 
this message was misdirected, WANdisco, Inc. and its subsidiaries, 
("WANdisco") does not waive any confidentiality or privilege. If you are 
not the intended recipient, please notify us immediately and destroy the 
message without disclosing its contents to anyone. Any distribution, use or 
copying of this email or the information it contains by other than an 
intended recipient is unauthorized. The views and opinions expressed in 
this email message are the author's own and may not reflect the views and 
opinions of WANdisco, unless the author is authorized by WANdisco to 
express such views or opinions on its behalf. All email sent to or from 
this address is subject to electronic storage and review by WANdisco. 
Although WANdisco operates anti-virus programs, it does not accept 
responsibility for any damage whatsoever caused by viruses being passed.


Re: Support character classes in glob authz rules

2018-12-03 Thread Branko Čibej
On 03.12.2018 16:07, Michael Pilato wrote:
> On 12/3/18 3:15 AM, Julian Foad wrote:
>> It makes me uncomfortable to depart from standard parsing. What if
>> users are relying on Python ConfigParser or other compatible parsing
>> as part of their Subversion authz infrastructure?
> We needn't keep this hypothetical.  ViewVC is using (a slightly
> modified[*]) Python ConfigParser in this way.
>
> -- Mike
>
> [*] By default, ConfigParser (well, really the RawConfigParser it
> subclasses) lowercases option names, which can cause username/group
> matching to fail.  So ViewVC's code replaces the optionxform method of
> ConfigParser with a noop lambda function.  (See
> https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.optionxform
> for official docs.)


Interesting. Of course, our ConfigParser-like implementation already has
the option to treat section names as case-sensitive, and this option is
used ... that's righjt, by the authz file parser.

-- Brane



Re: Support character classes in glob authz rules

2018-12-03 Thread Branko Čibej
On 03.12.2018 11:42, Julian Foad wrote:
> Branko Čibej wrote:
>>> If we treat the authz format as its own format, that does make me wonder if 
>>> there are any other limitations the current ConfigParser format is 
>>> applying, that should also be lifted.
>>>
>>> For example, if "@harry" is parsed as a reference to group name "harry", 
>>> then is there also a way to specify a username that is literally "@harry"?
>> Well, first of all, this has nothing to do with the ConfigParser, but
>> with how our authz infrastructure interprets the access entry selectors.
> Accepted.
>
>> And no, there's no way to do this.
>>> In general, is there a programmatic transformation from arbitrary valid 
>>> user names and paths to corresponding authz entries?
>> Define "arbitrary valid" and I'll answer that. :)
> Ones that Subversion otherwise accepts, apart from in this context.
>
>> We've always had the following restictions: usernames can't begin with
>> '@' or '&' or '~',
> There must be also limitations with '#', '*', '=' and whitespace characters, 
> and I don't see a way to find a complete list.

True. Also '$' since we have the magic '$authenticated' and '$anonymous'
tokens.

The rules are not explicitly documented anywhere, but they're implied by
the documentation in The Book. With a bit of luck, I'll have these
spelled out on the wiki some day.

>> and similar limitations apply to group and alias
>> names. Such identifiers have always been rejected at one stage or
>> another, and this has not been a problem.
> We haven't heard vocal complaints. It could well have caused headaches for 
> those implementing authz in their systems. (Not because they need to use such 
> usernames, but because they need to defensively program against an 
> incompletely known set of problems.)

Well, realistically, user identifiers are likely to take one of the
following forms, with reasonable variations:

  * usernme
  * name.surname
  * usern...@example.com
  * CN=Name Surname,OU=marketing,DC=example,DC=com


While some authentication systems (pam_unix FTW) theoretically allow
funny usernames with leading '@', '&', '$' etc., such names are very,
very unlikely to be used in practice. Our authz parser will accept any
of the forms I listed above.

(The Distinguished Name has to be mapped through an alias because it
containe '=', this is documented in The Book.)

> Would I be totally off the mark in suggesting an enhancement request for an 
> authz file format that addresses these issues? It seems to me this is the 
> sort of thing "enterprise" users would value.

I don't think it's necessary. And until and unless we get an actual bug
report, I'd leave well enough alone. Inventing some escaping mechanism
post-hoc always leads to strange side effects, making some currently
valid identifiers suddenly invalid or interpreted differently.

-- Brane

P.S.: Users can't have ':' in the name of a repository for
repository-specific rules, either; this, too have never been a problem.



Re: Support character classes in glob authz rules

2018-12-03 Thread Michael Pilato
On 12/3/18 3:15 AM, Julian Foad wrote:
> It makes me uncomfortable to depart from standard parsing. What if
> users are relying on Python ConfigParser or other compatible parsing
> as part of their Subversion authz infrastructure?

We needn't keep this hypothetical.  ViewVC is using (a slightly
modified[*]) Python ConfigParser in this way.

-- Mike

[*] By default, ConfigParser (well, really the RawConfigParser it
subclasses) lowercases option names, which can cause username/group
matching to fail.  So ViewVC's code replaces the optionxform method of
ConfigParser with a noop lambda function.  (See
https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.optionxform
for official docs.)


Re: Support character classes in glob authz rules

2018-12-03 Thread Julian Foad
Branko Čibej wrote:
>> If we treat the authz format as its own format, that does make me wonder if 
>> there are any other limitations the current ConfigParser format is applying, 
>> that should also be lifted.
>>
>> For example, if "@harry" is parsed as a reference to group name "harry", 
>> then is there also a way to specify a username that is literally "@harry"?
> 
> Well, first of all, this has nothing to do with the ConfigParser, but
> with how our authz infrastructure interprets the access entry selectors.

Accepted.

> And no, there's no way to do this.
> > In general, is there a programmatic transformation from arbitrary valid 
> > user names and paths to corresponding authz entries?
> 
> Define "arbitrary valid" and I'll answer that. :)

Ones that Subversion otherwise accepts, apart from in this context.

> We've always had the following restictions: usernames can't begin with
> '@' or '&' or '~',

There must be also limitations with '#', '*', '=' and whitespace characters, 
and I don't see a way to find a complete list.

> and similar limitations apply to group and alias
> names. Such identifiers have always been rejected at one stage or
> another, and this has not been a problem.

We haven't heard vocal complaints. It could well have caused headaches for 
those implementing authz in their systems. (Not because they need to use such 
usernames, but because they need to defensively program against an incompletely 
known set of problems.)

Would I be totally off the mark in suggesting an enhancement request for an 
authz file format that addresses these issues? It seems to me this is the sort 
of thing "enterprise" users would value.

-- 
- Julian


Re: Support character classes in glob authz rules

2018-12-03 Thread Branko Čibej
On 03.12.2018 10:39, Julian Foad wrote:
> Branko Čibej wrote:
>> I can't think of a solution that would not depart from ConfigParser
>> semantics in one way or another. As for "the system" ... well, the fact
>> that config and authz parsing shares common code is really just an
>> accident, and certainly an implementation detail.
> OK, that's fair. Your solution does seem to be a backward-compatible way that 
> practically should work well to cope with this case.
>
> I was going to say "this new case", but the square-bracket problem is not new 
> with glob rules: any filename can have square brackets in it, on Unix.


Yes, the original issue #4204 predates glob rules.


> If we treat the authz format as its own format, that does make me wonder if 
> there are any other limitations the current ConfigParser format is applying, 
> that should also be lifted.
>
> For example, if "@harry" is parsed as a reference to group name "harry", then 
> is there also a way to specify a username that is literally "@harry"?


Well, first of all, this has nothing to do with the ConfigParser, but
with how our authz infrastructure interprets the access entry selectors.
And no, there's no way to do this.


> In general, is there a programmatic transformation from arbitrary valid user 
> names and paths to corresponding authz entries?

Define "arbitrary valid" and I'll answer that. :)


We've always had the following restictions: usernames can't begin with
'@' or '&' or '~', and similar limitations apply to group and alias
names. Such identifiers have always been rejected at one stage or
another, and this has not been a problem.

Paths in rules are interpreted literally (either as literal paths or as
literal glob patterns), without any excaping mechanism. This hasn't
changed since the inception of authz rules, either. The only really
problematic case is having ']' in the rule, and with glob patterns, this
becomes more likely (or desired).

-- Brane


Re: Support character classes in glob authz rules

2018-12-03 Thread Julian Foad
Branko Čibej wrote:
> I can't think of a solution that would not depart from ConfigParser
> semantics in one way or another. As for "the system" ... well, the fact
> that config and authz parsing shares common code is really just an
> accident, and certainly an implementation detail.

OK, that's fair. Your solution does seem to be a backward-compatible way that 
practically should work well to cope with this case.

I was going to say "this new case", but the square-bracket problem is not new 
with glob rules: any filename can have square brackets in it, on Unix.

If we treat the authz format as its own format, that does make me wonder if 
there are any other limitations the current ConfigParser format is applying, 
that should also be lifted.

For example, if "@harry" is parsed as a reference to group name "harry", then 
is there also a way to specify a username that is literally "@harry"?

In general, is there a programmatic transformation from arbitrary valid user 
names and paths to corresponding authz entries?

-- 
- Julian


Re: Support character classes in glob authz rules

2018-12-03 Thread Branko Čibej
On 03.12.2018 09:39, Stefan Sperling wrote:
> On Mon, Dec 03, 2018 at 09:30:53AM +0100, Stefan Sperling wrote:
>> On Mon, Dec 03, 2018 at 08:15:24AM +, Julian Foad wrote:
>>> Branko Čibej wrote:
 The proposed change in the parser is only enabled for parsing authz and
 global group files, other Subversion configuration files will use the
 current semantics.
>>> These sorts of quirks tend to make a system hard to maintain in the long 
>>> run. What if we find another case where we'd like to depart from that 
>>> parsing? How far would we go?
>>>
>>> What alternative solution could we consider?
>> Some sort of escaping mechanism might work.
> See also https://issues.apache.org/jira/browse/SVN-4784
>
> It would be nice if a solution to the authz issues would address
> that issue as well.

"Some sort of excaping" also silently changes existing rules and, worse,
does it in far more subtle ways than my proposed change.

SVN-4784 is a different issue and should be solved in the auto-props
code; I added a proposal in Jira but haven't had time to bring it to
this list.

-- Brane


Re: Support character classes in glob authz rules

2018-12-03 Thread Branko Čibej
On 03.12.2018 09:15, Julian Foad wrote:
> Branko Čibej wrote:
>> https://issues.apache.org/jira/browse/SVN-4204
>> https://issues.apache.org/jira/browse/SVN-4795
>>
>> Even though apr_fnmatch(), which we use for matching glob patterns in
>> authz rules, supports character classes ([A-Z] etc.), we can't use them
>> in the glob rules because of the way the config parser works. For
>> example, the following rule:
>>
>> [:glob:/**/*.[Dd]oc]
>>
>> will be silently parsed to ":glob:/**/*.[Dd" [because] our config parser
>> strictly follows the semantics of Python's ConfigParser and treats the
>> first ']' it finds as the section terminator, ignoring any remaining
>> characters to the end of the line.
>>
>> The proposed patch changes this behaviour as follows:
>>
>>   * the /last/ ']' in the line is the section terminator;
>>   * only whitespace is allowed after the terminator to the end of the line.
> Glad to see a proposal.
>
> It makes me uncomfortable to depart from standard parsing. What if users are 
> relying on Python ConfigParser or other compatible parsing as part of their 
> Subversion authz infrastructure?

Then they're not using character classes in glob patterns.

The problem is that ConfigParser does not define an escaping method for
the ], so we can't even adopt that in our parser — and if we did, then
existing rules could silently change their meaning in really nasty ways.

> First I wondered if anything bad could happen if there's a silent change in 
> meaning where a user has written, let's say,
>
>> [:glob:/**/secret1]# was [:glob:/**/secret2]

Yes, this is the sort of case where the meaning would change.

> It's hard to find any plausible example that would successfully parse and 
> actually match something, but may be possible.

If one stretches "plausible" far enough ...

>> The proposed change in the parser is only enabled for parsing authz and
>> global group files, other Subversion configuration files will use the
>> current semantics.
> These sorts of quirks tend to make a system hard to maintain in the long run. 
> What if we find another case where we'd like to depart from that parsing? How 
> far would we go?
>
> What alternative solution could we consider?

I can't think of a solution that would not depart from ConfigParser
semantics in one way or another. As for "the system" ... well, the fact
that config and authz parsing shares common code is really just an
accident, and certainly an implementation detail.

-- Brane



Re: Support character classes in glob authz rules

2018-12-03 Thread Stefan Sperling
On Mon, Dec 03, 2018 at 09:30:53AM +0100, Stefan Sperling wrote:
> On Mon, Dec 03, 2018 at 08:15:24AM +, Julian Foad wrote:
> > Branko Čibej wrote:
> > > The proposed change in the parser is only enabled for parsing authz and
> > > global group files, other Subversion configuration files will use the
> > > current semantics.
> > 
> > These sorts of quirks tend to make a system hard to maintain in the long 
> > run. What if we find another case where we'd like to depart from that 
> > parsing? How far would we go?
> > 
> > What alternative solution could we consider?
> 
> Some sort of escaping mechanism might work.

See also https://issues.apache.org/jira/browse/SVN-4784

It would be nice if a solution to the authz issues would address
that issue as well.


Re: Support character classes in glob authz rules

2018-12-03 Thread Stefan Sperling
On Mon, Dec 03, 2018 at 08:15:24AM +, Julian Foad wrote:
> Branko Čibej wrote:
> > The proposed change in the parser is only enabled for parsing authz and
> > global group files, other Subversion configuration files will use the
> > current semantics.
> 
> These sorts of quirks tend to make a system hard to maintain in the long run. 
> What if we find another case where we'd like to depart from that parsing? How 
> far would we go?
> 
> What alternative solution could we consider?

Some sort of escaping mechanism might work.


Re: Support character classes in glob authz rules

2018-12-03 Thread Julian Foad
Branko Čibej wrote:
> https://issues.apache.org/jira/browse/SVN-4204
> https://issues.apache.org/jira/browse/SVN-4795
> 
> Even though apr_fnmatch(), which we use for matching glob patterns in
> authz rules, supports character classes ([A-Z] etc.), we can't use them
> in the glob rules because of the way the config parser works. For
> example, the following rule:
> 
> [:glob:/**/*.[Dd]oc]
> 
> will be silently parsed to ":glob:/**/*.[Dd" [because] our config parser
> strictly follows the semantics of Python's ConfigParser and treats the
> first ']' it finds as the section terminator, ignoring any remaining
> characters to the end of the line.
> 
> The proposed patch changes this behaviour as follows:
> 
>   * the /last/ ']' in the line is the section terminator;
>   * only whitespace is allowed after the terminator to the end of the line.

Glad to see a proposal.

It makes me uncomfortable to depart from standard parsing. What if users are 
relying on Python ConfigParser or other compatible parsing as part of their 
Subversion authz infrastructure?

First I wondered if anything bad could happen if there's a silent change in 
meaning where a user has written, let's say,

> [:glob:/**/secret1]# was [:glob:/**/secret2]

It's hard to find any plausible example that would successfully parse and 
actually match something, but may be possible.

> The proposed change in the parser is only enabled for parsing authz and
> global group files, other Subversion configuration files will use the
> current semantics.

These sorts of quirks tend to make a system hard to maintain in the long run. 
What if we find another case where we'd like to depart from that parsing? How 
far would we go?

What alternative solution could we consider?

- Julian