Hi Mat, thanks for starting this discussion!

Quick question: why do you want to normalize the URI? I assume request URIs
already have to follow a strict format in the HTTP case, so they are ready to
use as is and any sort of normalization would be additional work. We could
perform some minimal validation but, if so, what should it be?
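
To make the question concrete, here is a rough sketch of what "minimal
validation" could look like for a request URI. The module name, the function
name and the three checks are assumptions for illustration only, not an
existing or proposed API:

```elixir
defmodule RequestURISketch do
  # Hypothetical helper, not part of the URI module.
  # Accepts a %URI{} and checks only the fields an HTTP server would
  # typically need before routing a request.
  def validate(%URI{} = uri) do
    cond do
      # Assumption: only http and https requests are expected.
      uri.scheme not in ["http", "https"] -> {:error, :scheme}
      # The host must be a non-empty string.
      not is_binary(uri.host) or uri.host == "" -> {:error, :host}
      # Assumption: the path must be present and start with "/".
      not (is_binary(uri.path) and String.starts_with?(uri.path, "/")) -> {:error, :path}
      true -> {:ok, uri}
    end
  end
end
```

Whether checks like these belong in URI or in the web server is the open
question.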


On Fri, Feb 18, 2022 at 6:29 PM Mat Trudel <m...@geeky.net> wrote:

> When implementing an HTTP server, one of the most underspecified parts of
> handling a request is the building and canonicalization of the requested
> URI. The constituent parts of a request URI are spread out across multiple
> sources. For example, the hostname of a request can be any of (possibly
> multiple!) Host header(s), an authority pseudo-header in HTTP/2, a
> statically configured value for IP-based hosting, or even something derived
> from upstream X- headers. Assembling these parts into a canonical request
> URI is non-trivial.
>
> The URI module as currently implemented does not provide supported ways to
> construct a URI from constituent parts (though that is changing [1]).
> Nor does it provide methods to validate or meaningfully normalize an
> extant URI struct. Without these methods, HTTP servers need to resort to
> ad hoc methods to build and canonicalize request URIs (see [2], [3]).
>
> To help alleviate this, it is proposed to add the following changes to the
> URI module:
>
> 1. Explicitly allow for the building of URI structs directly in the module
> documentation (subject to warnings about the use of the authority field).
>
> 2. Add a normalize(%{})/2 function which will return a normalized version
> of an existing URI struct (this can plumb through to
> :uri_string.normalize/2 [4]).
>
> 3. Add an absolute?/1 function which returns whether or not the URI is
> absolute (that is, does it contain sufficient information to discretely
> represent a complete, unambiguous request).
>
> Along with the existing new/1 and merge/2 functions, I believe that this
> should be sufficient to cleanly implement request URI construction within a
> web server such as Bandit. This will allow the web server to determine
> where to source the various components of a URI from, while deferring
> assembly, normalization and validation of those components to the URI
> module where it belongs.
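
As a rough illustration of the flow described above, the sketch below
assembles a request URI using only functions that exist today (URI.new/1,
URI.merge/2, URI.to_string/1) plus :uri_string.normalize/1 from Erlang/OTP.
The module name, the function name and its arguments are hypothetical; a real
server would pick the scheme and host from the sources listed earlier.

```elixir
defmodule RequestURIBuildSketch do
  # Hypothetical helper, not part of the URI module.
  # `scheme` and `host` stand in for whatever the server decided to trust;
  # `target` is the request target, for example "/a/b?x=1".
  def build(scheme, host, target) do
    with {:ok, base} <- URI.new("#{scheme}://#{host}"),
         {:ok, rel} <- URI.new(target) do
      # Merge the target onto the base, then let Erlang normalize the result.
      normalized =
        base
        |> URI.merge(rel)
        |> URI.to_string()
        |> :uri_string.normalize()

      case normalized do
        # On success, normalize/1 returns a string of the same type it was given.
        string when is_binary(string) -> URI.new(string)
        # On failure, it returns an error tuple.
        {:error, reason, _term} -> {:error, reason}
      end
    end
  end
end
```

The round trip through a string is the part that a URI-level normalize
function would make unnecessary.
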
>
> Subject to debate and approval I'm happy to work this up.
>
> m.
>
> [1] https://twitter.com/josevalim/status/1494208355732275200
> [2]
> https://github.com/mtrudel/bandit/blob/main/lib/bandit/http2/stream_task.ex#L101-L113
> [3]
> https://github.com/ninenines/cowboy/blob/8795233c57f1f472781a22ffbf186ce38cc5b049/src/cowboy_http.erl#L490-L553
> [4] https://www.erlang.org/doc/man/uri_string.html#normalize-2
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elixir-lang-core+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elixir-lang-core/8c4e9d5d-f83a-43dc-82e7-171730f19724n%40googlegroups.com
> <https://groups.google.com/d/msgid/elixir-lang-core/8c4e9d5d-f83a-43dc-82e7-171730f19724n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

