Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-26 Thread Willy Tarreau
On Fri, Jun 26, 2020 at 11:28:09AM +0200, Tim Düsterhus wrote:
> I just filed a tracking bug for it on GitHub, so that the discussion
> will not get lost in the depths of the email archive:
> 
> https://github.com/haproxy/haproxy/issues/714

Thank you!

Willy



Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-26 Thread Tim Düsterhus
Ilya,
Willy,

On 26.06.20 at 11:19, Willy Tarreau wrote:
>> Tim, can we schedule this for 2.3 ? It seems to be "too much" for 2.2
> 
> Rest assured that for me it's not even imaginable to break 2.2 with
> such sort of things. We have sufficient issues to address right now!

Agreed. This is something non-trivial and the solution should not be
rushed. Personally I've already adjusted my rules.

I just filed a tracking bug for it on GitHub, so that the discussion
will not get lost in the depths of the email archive:

https://github.com/haproxy/haproxy/issues/714

Best regards
Tim Düsterhus



Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-26 Thread Илья Шипицин
On Fri, 26 Jun 2020 at 14:19, Willy Tarreau wrote:

> Hi Ilya,
>
> On Fri, Jun 26, 2020 at 02:04:41PM +0500, Илья Шипицин wrote:
> > On Fri, 26 Jun 2020 at 11:00, Willy Tarreau wrote:
> >
> > > Hi Tim,
> > >
> > > On Thu, Jun 25, 2020 at 04:30:37PM +0200, Tim Düsterhus wrote:
> > > (...)
> > > > Willy: Please correct me if I misrepresented your arguments or left out
> > > > something important.
> > >
> > > I think it's well summarized. There are other more painful points not
> > > mentioned here:
> > >
> >
> > Tim, can we schedule this for 2.3 ? It seems to be "too much" for 2.2
>
> Rest assured that for me it's not even imaginable to break 2.2 with
> such sort of things. We have sufficient issues to address right now!
>
> > As for normalization, I'd suggest comparing against nginx's normalization
> > rules. (As I recall, only "merge_slashes off;" was ever an issue, and only
> > rarely; the rest of the normalization rules seem to be just fine.)
>
> Be careful that nginx is a web server, not a gateway, so it doesn't have
>

I have 10+ years of first-hand experience using nginx as a reverse proxy at
1 billion queries a day :)


> to care about how the next hop would interpret the request since there
> isn't such "next hop" so it only has to be consistent with itself. And
> by the way, in case you'd still use it as a reverse-proxy using proxy_pass
> you have to be aware that it only normalizes during analysis but forwards
> the unprocessed request, leading to some of the well-known things I
> mentioned:
>
>
> https://www.acunetix.com/blog/articles/a-fresh-look-on-reverse-proxy-related-attacks/
>
> This article by the way also mentions the funny things with some
> application servers which incorrectly use ";" as a query string
> delimiter, which is yet another thing breaking normalization!
>
> Willy
>


Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-26 Thread Willy Tarreau
Hi Ilya,

On Fri, Jun 26, 2020 at 02:04:41PM +0500, Илья Шипицин wrote:
> On Fri, 26 Jun 2020 at 11:00, Willy Tarreau wrote:
> 
> > Hi Tim,
> >
> > On Thu, Jun 25, 2020 at 04:30:37PM +0200, Tim Düsterhus wrote:
> > (...)
> > > Willy: Please correct me if I misrepresented your arguments or left out
> > > something important.
> >
> > I think it's well summarized. There are other more painful points not
> > mentioned here:
> >
> 
> Tim, can we schedule this for 2.3 ? It seems to be "too much" for 2.2

Rest assured that for me it's not even imaginable to break 2.2 with
such sort of things. We have sufficient issues to address right now!

> As for normalization, I'd suggest comparing against nginx's normalization
> rules. (As I recall, only "merge_slashes off;" was ever an issue, and only
> rarely; the rest of the normalization rules seem to be just fine.)

Be careful that nginx is a web server, not a gateway, so it doesn't have
to care about how the next hop would interpret the request since there
isn't such "next hop" so it only has to be consistent with itself. And
by the way, in case you'd still use it as a reverse-proxy using proxy_pass
you have to be aware that it only normalizes during analysis but forwards
the unprocessed request, leading to some of the well-known things I
mentioned:

  
https://www.acunetix.com/blog/articles/a-fresh-look-on-reverse-proxy-related-attacks/

This article by the way also mentions the funny things with some
application servers which incorrectly use ";" as a query string
delimiter, which is yet another thing breaking normalization!

Willy



Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-26 Thread Илья Шипицин
On Fri, 26 Jun 2020 at 11:00, Willy Tarreau wrote:

> Hi Tim,
>
> On Thu, Jun 25, 2020 at 04:30:37PM +0200, Tim Düsterhus wrote:
> (...)
> > Willy: Please correct me if I misrepresented your arguments or left out
> > something important.
>
> I think it's well summarized. There are other more painful points not
> mentioned here:
>

Tim, can we schedule this for 2.3 ? It seems to be "too much" for 2.2

As for normalization, I'd suggest comparing against nginx's normalization
rules.

(As I recall, only "merge_slashes off;" was ever an issue, and only rarely; the
rest of the normalization rules seem to be just fine.)


>
>   - RFC3986's path normalization algorithm is bogus when it sees
> multiple slashes such as in "/static//images/12.jpg". This happens
> very frequently in URLs built by concatenation. The problem is that
> when it meets a "../" it suggests to remove only one level of slash
> and will end up in a different directory than the one a server that
> does simplistic normalization would do (or a cache which would first
> merge consecutive slashes to increase cache hit ratio). So:
>     /static//images/../../css/main.css would become:
>       /static/css/main.css according to RFC3986
>       /css/main.css        according to most file systems or simplifications
>
>   - some operating systems also support the backslash "\" as a directory
> delimiter. So you try to normalize your path correctly and leave
> "\..\admin/" and you're screwed again.
>
>   - "+" and "%20" are equivalent in the query string, but given that in
> many simple applications these ones will only appear in a single form,
> such applications might not even check for the other one. So if you
> replace "%20" with "+" you will break some of them and if you replace
> "+" with "%20" you will break others. I've seen quite a number of
> implementations in the past perform the decoding just like haproxy
> used to until recently, which is: feed the whole URL string to the
> percent-decoder and decode the "+" as a space in the path part. And
> by normalizing that we'd also break some of them.
>
>   - some servers support a non-standard UTF-16 encoding (the same ones as
> those using case-insensitive matching). For this they use "%u" followed
> by 4 digits. So your "A" could be encoded "%61", "%41", "%u0061",
> "%u0041", "%U0041" or "%U0061" there and will also match. But this is
> not standard and must not be decoded as such, at the risk of breaking
> yet other applications which do not expect that "%u" is transcoded. And
> it's even possible that in some of these servers' configurations there
> are rules matching "%UFEDC" but not "%FE%DC".
>
>   - and I wouldn't even be surprised if some servers using some internal
> normalization functions would also resolve unicode homoglyphs to valid
> characters! Just check if your server accepts "/%ef%bd%81dmin/", that
> would be fun!
>
>   - actually, even browsers DO NOT normalize URLs, in order to preserve
> them as much as possible and not to fail on broken servers! This should
> be heard as a strong warning! Try it by yourself, just direct your
> browser to /a%%b%31c%xyz/?brightness=10% and see it send:
>
>    GET /a%%b%31c%xyz/?brightness=10% HTTP/1.1
>
> you'll note that even "%31" wasn't turned into a "1".
>
>   - there are other aspects (some mentioned in RFC3986). The authority
> part of the URI can have various forms. The best known one is the
> split of the net and the host in the IPv4 address representation, by
> which "127.0.0.1" is also "127.0.1", "127.1", "2130706433" or even
> "0x7f000001" (with X or x and F or f). You can even add leading
> zeroes. And you can use octal encoding: 017700000001. You can try to
> ping all of them, they'll likely work on your OS. At least my browser
> rewrites them in the URL bar before sending the request. This might be
> normalized... or not, for the same reasons of not breaking the next
> step in the chain, which possibly expects to have a different behavior
> when dealing with "16bit.16bit" representation, since host names made
> of digits only are permitted if there's a domain behind :-/  The port
> number can accept leading zeroes, so ":80" and ":080" are aliases.
>
>   - last, those running haproxy in front of a WAF certainly don't want
> haproxy to wipe this precious information before the WAF has a chance
> to raise its awareness on this request!
>
> The problem with normalization is that it would work if everyone was doing
> it, but RFC3986 was specified long after 99% of the internet was already
> deployed and used, so at best it can be used as a guideline and a reference
> of traps to avoid. And things are getting worse with IoT. You can just try
> to run an HTTP server on an ESP8266 and you'll see that most of the time,
> percent-decoding is not the web server's problem at all and it 

Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-26 Thread Willy Tarreau
Hi Tim,

On Thu, Jun 25, 2020 at 04:30:37PM +0200, Tim Düsterhus wrote:
(...)
> Willy: Please correct me if I misrepresented your arguments or left out
> something important.

I think it's well summarized. There are other more painful points not
mentioned here:

  - RFC3986's path normalization algorithm is bogus when it sees
multiple slashes such as in "/static//images/12.jpg". This happens
very frequently in URLs built by concatenation. The problem is that
when it meets a "../" it suggests to remove only one level of slash
and will end up in a different directory than the one a server that
does simplistic normalization would do (or a cache which would first
merge consecutive slashes to increase cache hit ratio). So:
    /static//images/../../css/main.css would become:
      /static/css/main.css according to RFC3986
      /css/main.css        according to most file systems or simplifications

  - some operating systems also support the backslash "\" as a directory
delimiter. So you try to normalize your path correctly and leave
"\..\admin/" and you're screwed again.

  - "+" and "%20" are equivalent in the query string, but given that in
many simple applications these ones will only appear in a single form,
such applications might not even check for the other one. So if you
replace "%20" with "+" you will break some of them and if you replace
"+" with "%20" you will break others. I've seen quite a number of
implementations in the past perform the decoding just like haproxy
used to until recently, which is: feed the whole URL string to the
percent-decoder and decode the "+" as a space in the path part. And
by normalizing that we'd also break some of them.

  - some servers support a non-standard UTF-16 encoding (the same ones as
those using case-insensitive matching). For this they use "%u" followed
by 4 digits. So your "A" could be encoded "%61", "%41", "%u0061",
"%u0041", "%U0041" or "%U0061" there and will also match. But this is
not standard and must not be decoded as such, at the risk of breaking
yet other applications which do not expect that "%u" is transcoded. And
it's even possible that in some of these servers' configurations there
are rules matching "%UFEDC" but not "%FE%DC".

  - and I wouldn't even be surprised if some servers using some internal
normalization functions would also resolve unicode homoglyphs to valid
characters! Just check if your server accepts "/%ef%bd%81dmin/", that
would be fun!

  - actually, even browsers DO NOT normalize URLs, in order to preserve
them as much as possible and not to fail on broken servers! This should
be heard as a strong warning! Try it by yourself, just direct your
browser to /a%%b%31c%xyz/?brightness=10% and see it send:

   GET /a%%b%31c%xyz/?brightness=10% HTTP/1.1

you'll note that even "%31" wasn't turned into a "1".

  - there are other aspects (some mentioned in RFC3986). The authority
part of the URI can have various forms. The best known one is the
split of the net and the host in the IPv4 address representation, by
which "127.0.0.1" is also "127.0.1", "127.1", "2130706433" or even
"0x7f000001" (with X or x and F or f). You can even add leading
zeroes. And you can use octal encoding: 017700000001. You can try to
ping all of them, they'll likely work on your OS. At least my browser
rewrites them in the URL bar before sending the request. This might be
normalized... or not, for the same reasons of not breaking the next
step in the chain, which possibly expects to have a different behavior
when dealing with "16bit.16bit" representation, since host names made
of digits only are permitted if there's a domain behind :-/  The port
number can accept leading zeroes, so ":80" and ":080" are aliases.

  - last, those running haproxy in front of a WAF certainly don't want
haproxy to wipe this precious information before the WAF has a chance
to raise its awareness on this request!
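
To make the first point in the list above concrete, here is a minimal Python
sketch (not HAProxy code) of the "remove_dot_segments" procedure from RFC 3986
section 5.2.4, compared with Python's posixpath.normpath, which merges
consecutive slashes the way most file systems do. The function name
remove_dot_segments is chosen here for illustration only.

```python
import posixpath


def remove_dot_segments(path: str) -> str:
    """Sketch of RFC 3986 section 5.2.4; empty segments ("//") are kept."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./") or path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if out:
                out.pop()  # drop exactly one previous segment
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            start = 1 if path.startswith("/") else 0
            idx = path.find("/", start)
            if idx == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:idx])
                path = path[idx:]
    return "".join(out)


url_path = "/static//images/../../css/main.css"
print(remove_dot_segments(url_path))   # RFC 3986 view
print(posixpath.normpath(url_path))    # slash-merging, file-system-like view
```

The two calls disagree on the same input because the empty segment between
the two slashes counts as one directory level for the RFC algorithm but
disappears when slashes are merged first.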

The problem with normalization is that it would work if everyone was doing
it, but RFC3986 was specified long after 99% of the internet was already
deployed and used, so at best it can be used as a guideline and a reference
of traps to avoid. And things are getting worse with IoT. You can just try
to run an HTTP server on an ESP8266 and you'll see that most of the time,
percent-decoding is not the web server's problem at all and it will pass
it unmodified. So you can definitely expect that your light bulb's web
server will take requests like "GET /dim?brightness=10%" and expect that
to work out of the box. Just install a normalizing load balancer in front
of a gateway managing tens of thousands of heating controllers based on
such devices and suddenly nobody can adjust the temperature in their homes
anymore (or worse, the '%' gets dropped and becomes degrees C so when you
ask for 50% heating you 

Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-25 Thread Tim Düsterhus
Hi List,

Am 22.06.20 um 21:13 schrieb Tim Düsterhus:
> What kind of (configuration) advice would you give me? Do you have any
> concerns? I consider *anything* a valid answer here and I'd like to hear
> from both experienced admins and "newbies".
> 
> I'll give the "solution" once I get some replies :-)
> 

I've also forwarded the question to the #haproxy channel on Freenode and
I also got some replies in private. The suggestions mostly looked
something like Jonathan's reply:

  acl internal_net src 192.168.0.0/16
  acl admin_request path_beg /admin/
  http-request deny if admin_request !internal_net

Indeed that's also something I did myself within my production
configurations. However it's insecure.

One correct solution, the one given by Ilya and hinted at by Jonathan, is:
Don't do this in HAProxy; instead do it in the backend, which knows
exactly how it's going to interpret the request.

However that might be painful if authentication is not going to be
performed based on IP address, but e.g. TLS client certificates, because
you would need to ship the DN within a header to the backend.

Another correct solution would be using a strict whitelist using
`http-request allow [...] if [...]` with an unconditional `http-request
deny` at the end.
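
As a rough model of why a default-deny allow-list is robust, here is a small
Python sketch. It is not HAProxy configuration; the set PUBLIC_PATHS and the
function is_allowed are hypothetical and only reuse the file names from the
scenario in the initial mail.

```python
# Hypothetical allow-list built from the public files named in the scenario.
PUBLIC_PATHS = frozenset({"/login.php", "/register.php", "/create-thread.php"})


def is_allowed(raw_path: str, from_internal_net: bool) -> bool:
    """Default-deny check on the path exactly as received (no decoding)."""
    if raw_path in PUBLIC_PATHS:
        return True            # explicitly allowed for everyone
    return from_internal_net   # everything else: internal network only


print(is_allowed("/login.php", False))          # public file, external client
print(is_allowed("/%61dmin/index.php", False))  # obfuscated admin path
print(is_allowed("/new-feature.php", False))    # file added by an upgrade
```

An obfuscated path cannot slip through, because anything that is not spelled
exactly like an allowed entry is rejected. The same property means that a
newly added public file stays blocked until the list is updated.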

However that might be painful for the situation I described in the
initial mail ("During upgrades of this off-the-shelf software new files
might be added for new features.").

So what exactly is the issue with the http-request deny rule as given above?

The issue I was concerned about is that HAProxy does not perform
normalization when comparing the path against something within an ACL.
Specifically, HAProxy does not normalize percent-encoded unreserved
characters (RFC 3986, section 2.3). Nginx does.

So a simple `curl localhost/%61dmin` would circumvent the rule in
HAProxy. But nginx happily interprets this as a request to the /admin/
folder, allowing access to unauthorized users.
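
The mismatch can be modelled in a few lines of Python. This is only a sketch
under stated assumptions: proxy_rule_matches stands in for a byte-wise prefix
ACL such as `path_beg /admin/`, and backend_view stands in for a backend that
percent-decodes the path before mapping it to a directory. Neither function is
actual HAProxy or nginx code, and both names are hypothetical.

```python
from urllib.parse import unquote

ADMIN_PREFIX = "/admin/"


def proxy_rule_matches(raw_path: str) -> bool:
    """Model of a prefix ACL: compares the path exactly as received."""
    return raw_path.startswith(ADMIN_PREFIX)


def backend_view(raw_path: str) -> str:
    """Model of a backend that percent-decodes before resolving the path."""
    return unquote(raw_path)


for raw in ("/admin/index.php", "/%61dmin/index.php"):
    print(raw,
          "deny rule matches:", proxy_rule_matches(raw),
          "backend serves admin:", backend_view(raw).startswith(ADMIN_PREFIX))
```

For the second path the deny rule does not match, yet the modelled backend
still resolves it to the admin directory.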

While this percent normalization could possibly be done generically,
because these URLs are defined in RFC 3986 to be equivalent, other types
of normalization would be harder.

So a request to `curl localhost/public/../admin/` might or might not
cause the backend to interpret the request as a request to /admin/, but
a blanket normalization would be incorrect here, because API backends
might not refer to paths on a file system. This can be fixed with a
`http-request deny if { path_sub ../ }` or something similar.

On Windows a request to `curl localhost/ADMIN` might also access the
admin/ folder, but that is easily fixed with the `-i` flag (unless
unicode characters are used, I guess?).

However the percent-encoding normalization can not be implemented within
the configuration as of right now (to the best of my knowledge). Using
the url_dec converter is incorrect, because it also decodes stuff like
`%2F` to `/` which is disallowed during normalization and can introduce
issues on its own.
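
To illustrate the difference, the Python sketch below decodes only
percent-encoded unreserved characters (RFC 3986, section 2.3) and leaves
everything else, including `%2F`, encoded. The helper name
normalize_unreserved is hypothetical, and urllib.parse.unquote merely stands in
for a "decode everything" converter; it is not HAProxy's url_dec
implementation.

```python
import re
from urllib.parse import unquote

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_unreserved(path: str) -> str:
    """Decode %XX only when it encodes an unreserved character."""
    def repl(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        # Keep reserved or other characters encoded; uppercase the hex digits.
        return "%" + match.group(1).upper()
    return _PCT.sub(repl, path)


print(normalize_unreserved("/%61dmin/"))         # unreserved "a" is decoded
print(normalize_unreserved("/public%2fadmin/"))  # "%2f" stays encoded
print(unquote("/public%2fadmin/"))               # full decoding adds a "/"
```

Malformed sequences such as "%%" or "%xy" are left untouched by this sketch,
since the pattern only matches a percent sign followed by two hex digits.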

The disagreement I had with Willy was that I said that percent
normalization should happen by default (because of the RFC). However
Willy responded that this would take away some visibility, because
you're never going to see these unnormalized URLs in the logs, not
noticing an ongoing attack. Also some bogus clients and backends might
rely on these unnormalized forms.

Willy: Please correct me if I misrepresented your arguments or left out
something important.

Concluding:
- The documentation should be updated to warn administrators that
http-request deny must be used very carefully when combining it with a path.
- HAProxy should gain the ability to correctly normalize URLs (i.e. not
using the url_dec version). How exactly that would look is not yet clear.
  - It could be a `http-request normalize-path percent,dotdot,Y` action.
  - It could be a `normalize(percent)` converter
  - The `path` fetch could be extended to gain an argument, specifying
the normalization requested.
  - An option within the `defaults` section could enable normalization
for everything.

If you have anything to add after reading this mail: Please do!

Best regards
Tim Düsterhus



Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-22 Thread Jonathan Matthews
On Mon, 22 Jun 2020 at 20:16, Tim Düsterhus  wrote:
> This off-the-shelf PHP application has an integrated admin control panel
> within the /admin/ directory. The frontend consists of several "old
> style" PHP files, handling the various paths (e.g. login.php,
> register.php, create-thread.php). During upgrades of this off-the-shelf
> software new files might be added for new features.
>
> My boss asked me to restrict the access to the admin control panel to
> our internal network (192.168.0.0/16) for security reasons. Access to
> the user frontend files must not be restricted.

If I were solving this problem solely at the haproxy layer, I'd do
something like this:

 acl internal_net src 192.168.0.0/16
 acl admin_request path_beg /admin/
 http-request deny if admin_request !internal_net

Though by preference I'd put app policy logic as close to, or best of
all inside, the app itself; which would have X-Forwarded-For
implications. I may have misunderstood your question though!

I'm intrigued by what common problems you foresee here. I suppose the
Front Controller pattern might be ... interesting to deal with?

J
-- 
Jonathan Matthews
London, UK
https://jpluscplusm.com



Re: Doing directory based access control (Survey / Poll of admin expectations)

2020-06-22 Thread Илья Шипицин
On Tue, 23 Jun 2020 at 00:16, Tim Düsterhus wrote:

> Hi List,
>
> I was having a bit of off-list disagreement with Willy regarding how
> HAProxy ACLs should work and what (experienced) administrators may or
> may not expect from them. I am arguing about something I believe many
> administrators might accidentally do incorrectly. I'm intentionally
> being vague here, to not spoil any results of this survey.
>
> Let's pretend I'm new to this list and send the following request for help:
>
> ---
>
> We are using an off-the-shelf PHP 7.2 application (think some bulletin
> board software), running behind nginx as the FastCGI gateway and static
> file server. In front of that nginx we are running HAProxy 2.0 in 'mode
> http'.
>


nginx can serve both static files and dynamic PHP using the "try_files"
approach, i.e. "try to serve the file statically if it exists, and pass the
request to PHP FastCGI if it is not found".

For example:

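# Public part: serve static files directly, fall back to PHP.
location / {
    try_files $uri $uri/ @php;
}

# Admin part: only reachable from the internal network.
location /admin/ {
    allow 192.168.0.0/16;
    deny all;
    try_files $uri $uri/ @php;
}

location @php {
    fastcgi_pass .
}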


As for seeing the real client IP, there are a couple of options: the PROXY
protocol or set_real_ip_from. Alternatively, the restriction could be
enforced on haproxy.



>
> This off-the-shelf PHP application has an integrated admin control panel
> within the /admin/ directory. The frontend consists of several "old
> style" PHP files, handling the various paths (e.g. login.php,
> register.php, create-thread.php). During upgrades of this off-the-shelf
> software new files might be added for new features.
>
> My boss asked me to restrict the access to the admin control panel to
> our internal network (192.168.0.0/16) for security reasons. Access to
> the user frontend files must not be restricted.
>
> How can I do this?
>
> ---
>
> What kind of (configuration) advice would you give me? Do you have any
> concerns? I consider *anything* a valid answer here and I'd like to hear
> from both experienced admins and "newbies".
>
> I'll give the "solution" once I get some replies :-)
>
> Best regards
> Tim Düsterhus
>
>


Doing directory based access control (Survey / Poll of admin expectations)

2020-06-22 Thread Tim Düsterhus
Hi List,

I was having a bit of off-list disagreement with Willy regarding how
HAProxy ACLs should work and what (experienced) administrators may or
may not expect from them. I am arguing about something I believe many
administrators might accidentally do incorrectly. I'm intentionally
being vague here, to not spoil any results of this survey.

Let's pretend I'm new to this list and send the following request for help:

---

We are using an off-the-shelf PHP 7.2 application (think some bulletin
board software), running behind nginx as the FastCGI gateway and static
file server. In front of that nginx we are running HAProxy 2.0 in 'mode
http'.

This off-the-shelf PHP application has an integrated admin control panel
within the /admin/ directory. The frontend consists of several "old
style" PHP files, handling the various paths (e.g. login.php,
register.php, create-thread.php). During upgrades of this off-the-shelf
software new files might be added for new features.

My boss asked me to restrict the access to the admin control panel to
our internal network (192.168.0.0/16) for security reasons. Access to
the user frontend files must not be restricted.

How can I do this?

---

What kind of (configuration) advice would you give me? Do you have any
concerns? I consider *anything* a valid answer here and I'd like to hear
from both experienced admins and "newbies".

I'll give the "solution" once I get some replies :-)

Best regards
Tim Düsterhus