In light of some of the recent discussions regarding the TurboGears
community contributing to other projects, I would like some feedback on
a CherryPy issue.

I have been playing with a TurboGears-based REST interface to tagging:
http://tasty.python-hosting.com/ and ran into a possible bug in the way
CherryPy handles mapping URLs.

The basic format of the REST URL looks like:
    GET /service/<servicename>/user/<username>/item/<itemname>/

This works great for tagging my own content, such as
    POST /service/photo_gallery/user/aaron/item/killington_2005/tag/vacation

but gets a bit ugly when dealing with tagging arbitrary content. For
example, let's say I want to tag the following URL:
 http://www.killington.com/podcast/music/

If we take that item to be one part of the URL, then our URL to post the
tags would be something like:

http://localhost:9980/service/test/user/aaron/item/http%3A%2F%2Fwww.killington.com%2Fpodcast%2Fmusic%2F/tag/vacation
      >>> urllib.quote_plus('http://www.killington.com/podcast/music/')
      'http%3A%2F%2Fwww.killington.com%2Fpodcast%2Fmusic%2F'
That seems OK: some potentially long URLs will be passed around, but it
looks like a sane plan.
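
For anyone who wants to reproduce the encoding step, here is a minimal
sketch. It assumes Python 3, where urllib.parse.quote with safe="" stands
in for the urllib.quote_plus call above; the variable names are just for
illustration.

```python
from urllib.parse import quote, unquote

# The arbitrary URL we want to use as a single <itemname> path segment.
item = "http://www.killington.com/podcast/music/"

# safe="" makes quote() escape "/" as well, so the whole URL becomes
# one opaque segment instead of several path components.
encoded = quote(item, safe="")

# Build the request path used in the example above.
post_path = "/service/test/user/aaron/item/" + encoded + "/tag/vacation"

print(encoded)
print(post_path)

# Decoding the single segment gives back the original URL.
assert unquote(encoded) == item
```
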

Unfortunately CherryPy does not like that and returns a 404:

    NotFound: 404: The path '/service/test/user/aaron/item/http://www.killington.com/podcast/music/tag/fred' was not found.

It looks like CherryPy unquotes the incoming URL and then splits it,
around line 232 of _cphttptools.py:
        # Unquote the path (e.g. "/this%20path" -> "this path").
        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2
        # Note that cgi.parse_qs will decode the querystring for us.
        path = urllib.unquote(path)

and a handful of lines later it extracts the path:
        scheme, location, p, pm, q, f = urlparse(path)
        path = path[len(scheme + "://" + location):]

        # Save original value (in case it gets modified by filters)
        request.path = request.originalPath = path


So, setting aside the fact that I could give up on strict REST and put
the item in the query string:

http://localhost:9980/service/test/user/aaron?item=http%3A%2F%2Fwww.killington.com%2Fpodcast%2Fmusic%2F&tag=vacation
or md5 sum the item like:
    http://del.icio.us/url/8b7fec48fcb35763c9f8e1a8061eb124
or maybe something simpler like SOAP...
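
For comparison, the first two workarounds can be sketched like this,
again assuming Python 3. The variable names are mine, and I am only
assuming that a del.icio.us-style key is the md5 hex digest of the URL
string; I have not checked that against their service.

```python
import hashlib
from urllib.parse import urlencode, parse_qs

item = "http://www.killington.com/podcast/music/"

# Workaround 1: move the item into the query string. urlencode()
# escapes "/" and ":" in the values for us.
query = urlencode({"item": item, "tag": "vacation"})
url_with_query = "http://localhost:9980/service/test/user/aaron?" + query

# The server side can recover the item unchanged from the query string.
recovered = parse_qs(query)["item"][0]

# Workaround 2: replace the item with a fixed-length md5 hex digest,
# which contains no characters that need escaping in a path segment.
item_key = hashlib.md5(item.encode("utf-8")).hexdigest()
url_with_hash = ("http://localhost:9980/service/test/user/aaron/item/"
                 + item_key + "/tag/vacation")

print(url_with_query)
print(url_with_hash)
```

The hashed form keeps the strict REST path layout, but the server then
needs its own mapping from the digest back to the original URL.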


Should CherryPy be patched to support the type of URL encoding I need?
I read the spec and I think CherryPy is actually doing the right thing,
but when I run tests in Apache, Apache seems not to treat %2F as a path
separator.

Any thoughts on this?
1) Can/Should I legally encode URLs like this?
2) Should CherryPy be patched?

Thanks,
-Aaron
