p.s. Also, the current docs[1] say this in the second paragraph:

  The URI encoding allows a few characters to be represented as-is:
  a through z, A through Z, 0-9, -, _, ., !, ~, *, ', ( and ).

But this in the final sentence:

  In addition, since there appear to be some brain-dead decoders on
  the web, the library also encodes !, ~, ', (, and ) using their hex
  representation, which is the same choice as made by Java's
  URLEncoder.

These two statements seem to contradict each other with respect to
!, ~, ', ( and ).

[1]: http://docs.racket-lang.org/net/uri-codec.html

On Mon, Dec 17, 2012 at 11:55 AM, Greg Hendershott <greghendersh...@gmail.com> wrote:
> Although I'm hardly a web "expert", I think net/uri-codec is currently
> a little confusing.
>
> I get the impression that it was originally written prior to 2005,
> because the detailed introduction talks only about RFCs 1738 and
> 2396.[1]
>
> It looks like perhaps functions such as uri-path-segment-encode were
> added at a later date, to support RFC 3986. Although these functions'
> docs tersely link to RFC 3986, the overall net/uri-codec introduction
> wasn't revised accordingly, nor is there a simple explanation like
> "these also encode #\( #\) ...". (As a result, I actually ended up
> writing my own variation because I overlooked them.)
>
> Aside from the history of the documentation and organization, another
> point is the treatment of +, which the docs say intentionally doesn't
> follow RFC 2396, but don't really explain why. (One of my earliest
> experiments with Racket was a simple web crawler, and this #\+ <->
> #\space translation caused difficulties (although it's possible I was
> confused in other ways).)
>
> Wikipedia (usual caveats apply) says RFC 3986 is the current
> standard since 2005.[2]
>
> I almost wonder if there should be a brand-new module that implements
> RFC 3986 strictly. (Either just that, or, any options/parameters
> default to 3986). With the current net/uri-codec deprecated but
> preserved for backward compatibility.
>
> I wonder if that would be best because the functions and documentation
> may already be confusing.
> And this is a topic where it's easy for
> people to get confused to begin with and choose the wrong function.
>
> [1]: http://docs.racket-lang.org/net/uri-codec.html
> [2]: http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_in_a_URI
>
> On Mon, Dec 17, 2012 at 9:59 AM, Eli Barzilay <e...@barzilay.org> wrote:
>> For many people there is a constant source of annoyance when you
>> copy+paste doc URLs into a markdown context as with stackoverflow and
>> others. The problem is that these URLs have parens in them and at
>> least in Chrome, the copied URL still has them -- and because markdown
>> texts use parens for URLs "[text](url)" they get confused which means
>> that you have to manually replace parens with %28 and %29.
>>
>> Danny submitted a pull request that eventually got changed by Matthew
>> into a new parameter that controls which characters get encoded by
>> `net/uri-codec', so it can escape these too. The result on Chrome is
>> that the copied URL has the escapes instead of parens, and clicking
>> such a URL makes the copy-able address have the escapes too. The
>> actual page that is displayed is still the same one, of course, it's
>> just weird that Chrome has a certain context where the original URL
>> string is preserved as is. (It even considered the escaped URL as one
>> that I didn't visit, even though I visited the one with the unescaped
>> parens.)
>>
>> In any case, given all of this I thought that maybe the default mode
>> could do the extra escaping -- it seems to me that there is no damage
>> with doing that, since in theory every character could be escaped
>> anyway. There's a minor overhead of a few extra characters, but
>> there's the above benefit of doing it (which might be a temporary
>> thing for all I know).
>>
>> Neither Matthew nor I feel confident enough to have this encoding be
>> the default without consulting some potential web standard gurus.
>>
>> So?
>>
>> --
>>           ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>>                     http://barzilay.org/                   Maze is Life!
>> _________________________
>>   Racket Developers list:
>>   http://lists.racket-lang.org/dev
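
To make the apparent contradiction about !, ~, ', ( and ) concrete, here is a minimal sketch in Python. It does not call net/uri-codec and makes no claim about what that library actually emits. The two character sets are only transcriptions of the two doc sentences quoted in the p.s., and `percent_encode` is a hypothetical helper written for this illustration.

```python
import string

# Characters that both quoted doc sentences agree are left alone.
ALWAYS_AS_IS = set(string.ascii_letters + string.digits + "-_.")

# Extra as-is characters according to the second paragraph of the docs.
AS_IS_SECOND_PARAGRAPH = "!~*'()"

# What is left once the final sentence hex-encodes !, ~, ', ( and ).
AS_IS_FINAL_SENTENCE = "*"


def percent_encode(text, extra_as_is):
    """Percent-encode the UTF-8 bytes of text, keeping ALWAYS_AS_IS and
    the characters in extra_as_is unchanged (hypothetical helper)."""
    allowed = ALWAYS_AS_IS | set(extra_as_is)
    pieces = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if byte < 0x80 and char in allowed:
            pieces.append(char)
        else:
            # Everything else becomes %XX with uppercase hex digits.
            pieces.append("%{:02X}".format(byte))
    return "".join(pieces)


sample = "f(x)!~*'"
print(percent_encode(sample, AS_IS_SECOND_PARAGRAPH))  # f(x)!~*'
print(percent_encode(sample, AS_IS_FINAL_SENTENCE))    # f%28x%29%21%7E*%27
```

Under the second reading the parentheses come out as %28 and %29, which is the form that survives inside a markdown "[text](url)" link without manual editing.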