Re: [PATCH v1 0/2] urlmatch: allow regexp-based matches

2017-01-24 Thread Patrick Steinhardt
On Mon, Jan 23, 2017 at 11:53:43AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt writes:
> 
> > This patch is mostly a request for comments. The use case is to
> > be able to configure an HTTP proxy for all subdomains of a
> > certain domain where there are hundreds of subdomains. The most
> > flexible way I could imagine was by using regular expressions for
> > the matching, which is how I implemented it for now. So users can
> > now create a configuration key like
> > `http.?http://.*\\.example\\.com.*` to apply settings for all
> > subdomains of `example.com`.
> 
> While reading 2/2, I got the impression that this is "too" flexible
> and possibly operates at the wrong level.  I would have expected
> the wildcarding to be limited to the host part only and to hook into
> match_urls(), allowing users of the new feature to still take
> advantage of the existing support for "http://m...@example.com", which
> limits the match to the case where the connection is authenticated
> for a user, for example by newly allowing "http://me@*.example.com"
> or something like that.
> 
> Because you cannot have a literal '*' in your hostname, I would
> imagine that supporting a match pattern "http://me@*.example.com"
> would already be backward compatible without requiring a leading
> question mark.
> 
> I also personally would prefer this textual matching to be done
> with globs, not with regexps, by the way, as my description above
> shows.
> 
> Thanks.

Thanks for your feedback. Using globs in the hostname only was my
first intent as well. I later switched to regular expressions to
give the user more flexibility. The reasoning was that there might
be other use cases that cannot be solved with globs alone, even
though I am not aware of any myself. So using regular expressions
might indeed be over-engineered.

There are several questions regarding the semantics of globs,
though, on which I'd like additional opinions.

- should a glob only be allowed for actual subdomains, allowing
  "http://*.example.com" but not "http://*example.com"?

- should a glob also match multiple levels of nested subdomains?
  E.g. should "http://*.example.com" match "http://foo.example.com"
  but not "http://foo.bar.example.com"?
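
To make the two questions concrete, here is a minimal sketch of the
two candidate semantics. It is written in Python for brevity and is
not the urlmatch.c implementation; the function names
`match_any_depth` and `match_per_label` are hypothetical.

```python
from fnmatch import fnmatchcase


def match_any_depth(pattern_host: str, host: str) -> bool:
    """Glob semantics where '*' may span dots (any nesting depth)."""
    return fnmatchcase(host, pattern_host)


def match_per_label(pattern_host: str, host: str) -> bool:
    """Glob semantics where each '*' stays within one DNS label."""
    pattern_labels = pattern_host.split(".")
    host_labels = host.split(".")
    # A per-label glob can never change the number of labels.
    if len(pattern_labels) != len(host_labels):
        return False
    return all(fnmatchcase(h, p) for p, h in zip(pattern_labels, host_labels))


if __name__ == "__main__":
    for host in ("foo.example.com", "foo.bar.example.com", "badexample.com"):
        for pattern in ("*.example.com", "*example.com"):
            print(pattern, host,
                  match_any_depth(pattern, host),
                  match_per_label(pattern, host))
```

Under both semantics a pattern like "*example.com" also matches an
unrelated host such as "badexample.com", which is the concern behind
the first question; rejecting it would need an extra rule that a
wildcard label must consist of '*' alone.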

I'll send a version 2 soon-ish.

Regards
Patrick



Re: [PATCH v1 0/2] urlmatch: allow regexp-based matches

2017-01-23 Thread Junio C Hamano
Patrick Steinhardt writes:

> This patch is mostly a request for comments. The use case is to
> be able to configure an HTTP proxy for all subdomains of a
> certain domain where there are hundreds of subdomains. The most
> flexible way I could imagine was by using regular expressions for
> the matching, which is how I implemented it for now. So users can
> now create a configuration key like
> `http.?http://.*\\.example\\.com.*` to apply settings for all
> subdomains of `example.com`.

While reading 2/2, I got the impression that this is "too" flexible
and possibly operates at the wrong level.  I would have expected
the wildcarding to be limited to the host part only and to hook into
match_urls(), allowing users of the new feature to still take
advantage of the existing support for "http://m...@example.com", which
limits the match to the case where the connection is authenticated
for a user, for example by newly allowing "http://me@*.example.com"
or something like that.

Because you cannot have a literal '*' in your hostname, I would
imagine that supporting a match pattern "http://me@*.example.com"
would already be backward compatible without requiring a leading
question mark.

I also personally would prefer this textual matching to be done
with globs, not with regexps, by the way, as my description above
shows.
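
A rough sketch of what host-only glob matching could look like,
written in Python rather than as a change to match_urls(). The
function name `host_glob_match` is hypothetical, and ports and paths
are deliberately ignored here; only the scheme, the optional user
part, and the host are compared.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def host_glob_match(pattern: str, url: str) -> bool:
    """Match url against pattern, allowing '*' in the host part only."""
    p, u = urlsplit(pattern), urlsplit(url)
    # The scheme is compared literally, no wildcards.
    if p.scheme != u.scheme:
        return False
    # A user in the pattern restricts the match to that user;
    # a pattern without a user matches any (or no) user.
    if p.username is not None and p.username != u.username:
        return False
    # Only the host is matched as a glob.
    return fnmatchcase(u.hostname or "", p.hostname or "")


if __name__ == "__main__":
    print(host_glob_match("http://me@*.example.com",
                          "http://me@foo.example.com/repo.git"))
    print(host_glob_match("http://me@*.example.com",
                          "http://you@foo.example.com/repo.git"))
```

Because the wildcard never leaves the host component, the user part
keeps its existing meaning and a pattern without '*' behaves as a
plain literal host.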

Thanks.




[PATCH v1 0/2] urlmatch: allow regexp-based matches

2017-01-23 Thread Patrick Steinhardt
Hi,

Short disclaimer: this patch results from work for a client at my
day job at elego Software Solutions GmbH. As such, I'm using my
work mail address and have added a new mailmap entry. I wasn't
sure whether the mailmap entry should have gone into a separate
patch series, as it has nothing to do with the actual topic --
I can re-send it separately if requested.

This patch is mostly a request for comments. The use case is to
be able to configure an HTTP proxy for all subdomains of a
certain domain where there are hundreds of subdomains. The most
flexible way I could imagine was by using regular expressions for
the matching, which is how I implemented it for now. So users can
now create a configuration key like
`http.?http://.*\\.example\\.com.*` to apply settings for all
subdomains of `example.com`.

I tried to make this feature as backwards-compatible as possible
by requiring the '?' prefix. Older clients will barf when trying to
normalize the URL, as '?' is not in the set of allowed characters
for a URL, and for newer clients there will be no change in
behavior for previously configured `http.<url>.*` keys.
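
As a rough illustration of the '?' prefix rule, here is a simplified
sketch in Python. It is not the code from urlmatch.c: the function
name `key_matches` is hypothetical, the regex is assumed to be
anchored against the whole URL, and keys without the prefix are
reduced to a plain string-prefix comparison, which skips the URL
normalization the real matcher performs. The doubled backslashes in
the key above are assumed to be config-file escaping, so the regex
itself contains `\.`.

```python
import re


def key_matches(key: str, url: str) -> bool:
    """Decide whether a configured URL key applies to url."""
    if key.startswith("?"):
        # Leading '?' marks the rest of the key as a regular expression.
        return re.fullmatch(key[1:], url) is not None
    # Keys without the prefix keep their literal meaning (simplified
    # here to a string prefix test).
    return url.startswith(key)


if __name__ == "__main__":
    regex_key = r"?http://.*\.example\.com.*"
    print(key_matches(regex_key, "http://foo.example.com/repo.git"))
    print(key_matches(regex_key, "http://example.org/"))
    print(key_matches("http://example.com", "http://example.com/repo.git"))
```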

Regards
Patrick Steinhardt

Patrick Steinhardt (2):
  mailmap: add Patrick Steinhardt's work address
  urlmatch: allow regex-based URL matching

 .mailmap                 |  1 +
 Documentation/config.txt |  6 -
 t/t1300-repo-config.sh   | 31 ++
 urlmatch.c               | 57 ++--
 4 files changed, 82 insertions(+), 13 deletions(-)

-- 
2.11.0