Hello! On Mon, Mar 27, 2023 at 04:19:27PM +0000, Michael Kourlas via nginx-devel wrote:
> > As far as I understand, it will irreversibly corrupt URIs with > > double-encoded reserved characters. For example, "%252F" will > > become "%2F" when proxying in the following configuration: > > > > location /foo/ { > > proxy_pass http://upstream/foo/; > > } > > I believe you are correct. When the "all-except-reserved" option is used, > nginx > will not recognize, during the re-encoding step, that the "%" in "%2F" was > originally percent-encoded and not part of a percent-encoded "/". > > However, I think this issue can be addressed by renaming "all-except-reserved" > to "all-except-percent-and-reserved" and adding "%" to the corresponding set > of > characters that is not decoded or encoded automatically when that option is > used. "%252F" will then be left untouched throughout the entire process. So it will prevent access to the files with "%", such as in "foo%bar", unless specifically handled (see below). > > Further, requests to static files with (properly escaped) reserved > > characters will simply fail, because nginx won't decode these > > characters. For example, in the following trivial configuration a > > request to "/foo%3Fbar" won't be decoded to match "/foo?bar" file > > under the document root: > > > > location / { > > # static files > > } > > Yes, I agree it doesn't make sense to keep "%" and reserved characters > percent-encoded when performing a filesystem lookup, because at that point we > can be certain they're not being used as delimiters or anything other than > data. > > However, I think this issue can be addressed by decoding any remaining > percent-encoded characters at the point of filesystem lookup, as well as any > other place where you think it is appropriate. (Of course, this should only be > done when using the "all-except-percent-and-reserved" option, and only with > "%" > and reserved characters, to avoid double-decoding double-encoded characters.) This implies, basically, that there are 3 forms of the request URI: 1) fully encoded, as in $request_uri, 2) fully decoded, as in $uri now, and 3) "all-except-percent-and-reserved". To implement this correctly, it needs clear definition when each form is used, and it is going to be a non-trivial task to do this safely. For example, using fully decoded form for file system lookups while using "all-except-percent-and-reserved" form for location matching is going to be unsafe: restrictions like "location /private/ { deny all; }" can be easily bypassed. The same issue applies to other cases not involving file system access directly in nginx: for example, consider $fastcgi_script_name in FastCGI proxying. > > Please also note that the configuration directive you've > > introduced in this patch applies to URI parsing from not-yet-final > > server block (see [1] for details), but the configuration from the > > final server block will be used for URI escaping. These > > configuration can be different, and this might result in various > > additional issues. > > I think this issue can be addressed by making the new directive available only > in http blocks. This ensures that all virtual servers share the same > configuration. This approach basically precludes using different settings in different server blocks, even completely independent, such as IP-based virtual servers, and is not generally the way to go. Instead, this should be implemented correctly, if at all. > If you agree with these suggestions, I'd be happy to submit an updated patch. > > Reserved characters are often used as delimiters in APIs, and I believe it is > important that nginx be able to distinguish between usages as delimiters and > usages as data. First of all, you may want to focus on the logic you are going to implement. -- Maxim Dounin http://mdounin.ru/ _______________________________________________ nginx-devel mailing list nginx-devel@nginx.org https://mailman.nginx.org/mailman/listinfo/nginx-devel