Looks like I broke something.. :p
On 5/18/07, Philip Neustrom <[EMAIL PROTECTED]> wrote:
> I haven't seen anything screw up with the quote functions in a while.
> What was the edge case you saw?
>
> It's mad important to me to make sure URLs never break, so it's
> important to be careful when playing with the URL encoding stuff. Old
> URLs should keep working forever, unless there's something insane that
> needs to happen. I'm not sure if that's an issue in this case, but I
> just thought I'd throw it out there. See:
> http://daviswiki.org/index.scgi/Front_20Page :)
>
> --Philip
>
> On 5/18/07, Rottenchester <[EMAIL PROTECTED]> wrote:
> > Scott, thanks for the reference. It looks like that encoding is the
> > intent of quoteFilename and the fix I checked in this a.m. should
> > handle edge cases that were causing an error in some testing we were
> > doing.
> >
> > The remaining issue in UTF-8 handling is another error in search.py
> > that apparently Philip has a fix for, but hasn't checked in [1].
> > According to a note in that ticket (dated 1/26), Philip has it fixed
> > in his wikis branch but was planning to port it to trunk.
> >
> > Maybe it would be better for the project if Philip would check in his
> > wikis branch "as is" and then we could work on merging it as a
> > community. Philip, do you have some thoughts on that?
> >
> > I'll move on to other bugs until I hear back.
> >
> > ----------------
> >
> > [1] http://sycamore.devjavu.com/projects/sycamore/ticket/17
> >
> >
> >
> > On 5/18/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > > >> There are a couple of bugs in trac related to UTF-8. It looks like
> > > >> all file names and URLs are run through the pretty restrictive
> > > >> quoteFilename in wikiutil.py. This recodes all characters that aren't
> > > >> in (A-Z,a-z,1-9). In a UTF-8 environment, it doesn't work on UTF-8
> > > >> URLs.
> > > >
> > > > It looks like[1] only these ASCII characters are allowed in a URI:
> > > >
> > > > Unreserved Characters (no encoding needed)
> > > > A-Z (uppercase letters)
> > > > a-z (lowercase letters)
> > > > 0-9 (numbers)
> > > > - (dash)
> > > > _ (underscore)
> > > > . (period)
> > > > ~ (tilde)
> > > >
> > > > Reserved Characters (allowed only if encoded)
> > > > ! = %21
> > > > * = %2A
> > > > ' = %27
> > > > ( = %28
> > > > ) = %29
> > > > ; = %3B
> > > > : = %3A
> > > > @ = %40
> > > > & = %26
> > > > = = %3D
> > > > + = %2B
> > > > $ = %24
> > > > , = %2C
> > > > / = %2F
> > > > ? = %3F
> > > > % = %25
> > > > # = %23
> > > > [ = %5B
> > > > ] = %5D
> > > >
> > > > If the filename is meant to be displayed in the browser it makes sense to
> > > > encode it using percent encoding.
> > > >
> > > To clarify[1]...
> > >
> > > "For worldwide interoperability, URIs have to be encoded uniformly. To map
> > > the wide range of characters used worldwide into the 60 or so allowed
> > > characters in a URI, a two-step process is used:
> > >
> > > * Convert the character string into a sequence of bytes using the
> > >   UTF-8 encoding
> > > * Convert each byte that is not an ASCII letter or digit to %HH, where
> > >   HH is the hexadecimal value of the byte"
> > >
> > > Scott
> > > --------
> > > [1] http://www.w3.org/International/O-URL-code.html
> > >
> > > _______________________________________________
> > > Sycamore-Dev mailing list
> > > [EMAIL PROTECTED]
> > > http://www.projectsycamore.org/
> > > https://tools.cernio.com/mailman/listinfo/sycamore-dev
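
For reference, the two-step process quoted above (UTF-8 bytes first, then %HH for anything outside the unreserved set) can be sketched in Python roughly like this. This is only an illustration under assumptions: percent_encode is a made-up helper name, it is not Sycamore's quoteFilename, and it leaves the unreserved characters from the list above (letters, digits, dash, underscore, period, tilde) untouched rather than only letters and digits.

```python
import string

# Unreserved characters from the list quoted above; stored as byte values
# so they can be compared directly against UTF-8 encoded input.
UNRESERVED = frozenset(
    (string.ascii_letters + string.digits + "-_.~").encode("ascii")
)


def percent_encode(text):
    """Hypothetical helper: percent-encode text using the two-step process.

    This is an illustrative sketch, not the quoteFilename in wikiutil.py.
    """
    pieces = []
    # Step 1: convert the character string into a sequence of UTF-8 bytes.
    for byte in text.encode("utf-8"):
        if byte in UNRESERVED:
            # Unreserved bytes are passed through as-is.
            pieces.append(chr(byte))
        else:
            # Step 2: every other byte becomes %HH (uppercase hex).
            pieces.append("%{:02X}".format(byte))
    return "".join(pieces)


if __name__ == "__main__":
    print(percent_encode("Front Page"))  # Front%20Page
    print(percent_encode("café"))        # caf%C3%A9
```

On Python 3.7 or later, urllib.parse.quote(text, safe="") from the standard library should give the same results, so a hand-rolled loop like this is mostly useful for seeing what happens to each byte.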