Although I'm hardly a web "expert", I think net/uri-codec is currently a little confusing.
I get the impression that it was originally written prior to 2005, because the detailed introduction talks only about RFCs 1738 and 2396.[1] It looks like functions such as uri-path-segment-encode were perhaps added at a later date to support RFC 3986. Although these functions' docs tersely link to RFC 3986, the overall net/uri-codec introduction wasn't revised accordingly, nor is there a simple explanation like "these also encode #\( #\) ...". (As a result, I overlooked them and actually ended up writing my own variation.)

Aside from the history of the documentation and organization, another point is the treatment of +, which the docs say intentionally doesn't follow RFC 2396, without really explaining why. (One of my earliest experiments with Racket was a simple web crawler, and this #\+ <-> #\space translation caused difficulties, although it's possible I was confused in other ways.)

Wikipedia (usual caveats apply) says RFC 3986 has been the current standard since 2005.[2]

I almost wonder if there should be a brand-new module that implements RFC 3986 strictly (either just that, or with any options/parameters defaulting to 3986), with the current net/uri-codec deprecated but preserved for backward compatibility. I wonder if that would be best because the functions and documentation may already be confusing, and this is a topic where it's easy for people to get confused to begin with and choose the wrong function.

[1]: http://docs.racket-lang.org/net/uri-codec.html
[2]: http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_in_a_URI

On Mon, Dec 17, 2012 at 9:59 AM, Eli Barzilay <[email protected]> wrote:
> For many people there is a constant source of annoyance when you
> copy+paste doc URLs into a markdown context as with stackoverflow and
> others. The problem is that these URLs have parens in them and at
> least in Chrome, the copied URL still has them -- and because markdown
> texts use parens for URLs "[text](url)" they get confused which means
> that you have to manually replace parens with %28 and %29.
>
> Danny submitted a pull request that eventually got changed by Matthew
> into a new parameter that controls which characters get encoded by
> `net/uri-codec', so it can escape these too. The result on Chrome is
> that the copied URL has the escapes instead of parens, and clicking
> such a URL makes the copy-able address have the escapes too. The
> actual page that is displayed is still the same one, of course, it's
> just weird that Chrome has a certain context where the original URL
> string is preserved as is. (It even considered the escaped URL as one
> that I didn't visit, even though I visited the one with the unescaped
> parens.)
>
> In any case, given all of this I thought that maybe the default mode
> could do the extra escaping -- it seems to me that there is no damage
> with doing that, since in theory every character could be escaped
> anyway. There's a minor overhead of a few extra characters, but
> there's the above benefit of doing it (which might be a temporary
> thing for all I know).
>
> Neither Matthew nor I feel confident enough to have this encoding be
> the default without consulting some potential web standard gurus.
>
> So?
>
> --
> ((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
> http://barzilay.org/ Maze is Life!
> _________________________
> Racket Developers list:
> http://lists.racket-lang.org/dev
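
For concreteness, here is a rough sketch of what "strict RFC 3986" encoding could mean for the two cases discussed in this thread: parens get escaped, and + is never treated as a space. It is written in Python only as a language-neutral illustration. The names strict_encode and strict_decode are hypothetical, and this is not a description of how net/uri-codec behaves. The one fact it relies on is RFC 3986's "unreserved" set (letters, digits, and - . _ ~).

```python
# Hypothetical sketch, not net/uri-codec: escape every byte that is not
# in RFC 3986's "unreserved" set (ALPHA / DIGIT / "-" / "." / "_" / "~").
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-._~"
)
HEX_DIGITS = frozenset("0123456789abcdefABCDEF")


def strict_encode(text: str) -> str:
    """Percent-encode each UTF-8 byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in UNRESERVED else f"%{byte:02X}")
    return "".join(out)


def strict_decode(text: str) -> str:
    """Decode %XX escapes only; '+' is kept as-is, never read as a space."""
    raw = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            pair = text[i + 1:i + 3]
            # Reject truncated or non-hex escapes instead of guessing.
            if len(pair) != 2 or not all(c in HEX_DIGITS for c in pair):
                raise ValueError(f"malformed escape at index {i}")
            raw.append(int(pair, 16))
            i += 3
        else:
            raw.extend(text[i].encode("utf-8"))
            i += 1
    return raw.decode("utf-8")


print(strict_encode("a (b)+c"))   # a%20%28b%29%2Bc
print(strict_decode("a+b%20c"))   # a+b c
```

Under these assumptions the parens come out as %28 and %29, which is what the markdown use case wants, and a literal + survives a round trip because only %20 ever means a space.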

