Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Marius Gedminas
On Sun, Feb 28, 2010 at 05:05:51PM +0100, Wichert Akkerman wrote:
 On 2010-2-26 18:25, Tres Seaver wrote:
  Wichert Akkerman wrote:
  I see this as naming confusion. In this day and age every URL is
  effectively an IRI, and every modern browser treats them that way. If
  you look at http://jp.wikipedia.org/ you can see how well that works. I
  do not see why zope.publisher should not be able to support that
  transparently. Other systems such as Routes and repoze.bfg do.
 
  Browsers *display* what looks like unicode to the user, but they *pass*
  URL-encoded ASCII bytes to the server.
 
 But why can't zope.publisher do that conversion? I don't see the point
 in requiring all the thousands of routines that call those functions to 
 do that conversion when zope.publisher can easily do so itself.

+1

Just like zope.publisher converts Unicode strings returned by views into
UTF-8 (or whatever encoding negotiated via Accept-Charset),
response.redirect() ought to Do The Right Thing with Unicode URLs or
IRLs or whatever they're called.

Marius Gedminas
-- 
http://pov.lt/ -- Zope 3 consulting and development


___
Zope-Dev maillist  -  Zope-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 https://mail.zope.org/mailman/listinfo/zope-announce
 https://mail.zope.org/mailman/listinfo/zope )


Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Marius Gedminas wrote:
 On Sun, Feb 28, 2010 at 05:05:51PM +0100, Wichert Akkerman wrote:
 On 2010-2-26 18:25, Tres Seaver wrote:
 Wichert Akkerman wrote:
 I see this as naming confusion. In this day and age every URL is
 effectively an IRI, and every modern browser treats them that way. If
 you look at http://jp.wikipedia.org/ you can see how well that works. I
 do not see why zope.publisher should not be able to support that
 transparently. Other systems such as Routes and repoze.bfg do.
 Browsers *display* what looks like unicode to the user, but they *pass*
 URL-encoded ASCII bytes to the server.
 But why can't zope.publisher do that conversion? I don't see the point
 in requiring all the thousands of routines that call those functions to 
 do that conversion when zope.publisher can easily do so itself.
 
 +1
 
 Just like zope.publisher converts Unicode strings returned by views into
 UTF-8 (or whatever encoding negotiated via Accept-Charset),
 response.redirect() ought to Do The Right Thing with Unicode URLs or
 IRLs or whatever they're called.

- -1.

Where is this unicode URL coming from?  URLs generated from code
should already be correct.


Tres.
- --
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design    http://palladion.com
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkuLtdgACgkQ+gerLs4ltQ7lHwCgh//aPrrcaZ6StKVBGr8K1JaF
whIAoLheGkJ3w439F+FmLCrIv7NhIxqp
=7M8c
-END PGP SIGNATURE-



Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Wichert Akkerman
On 3/1/10 13:41 , Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Marius Gedminas wrote:
 On Sun, Feb 28, 2010 at 05:05:51PM +0100, Wichert Akkerman wrote:
 On 2010-2-26 18:25, Tres Seaver wrote:
 Wichert Akkerman wrote:
 I see this as naming confusion. In this day and age every URL is
 effectively an IRI, and every modern browser treats them that way. If
 you look at http://jp.wikipedia.org/ you can see how well that works. I
 do not see why zope.publisher should not be able to support that
 transparently. Other systems such as Routes and repoze.bfg do.
 Browsers *display* what looks like unicode to the user, but they *pass*
 URL-encoded ASCII bytes to the server.
 But why can't zope.publisher do that conversion? I don't see the point
 in requiring all the thousands of routines that call those functions to
 do that conversion when zope.publisher can easily do so itself.

 +1

 Just like zope.publisher converts Unicode strings returned by views into
 UTF-8 (or whatever encoding negotiated via Accept-Charset),
 response.redirect() ought to Do The Right Thing with Unicode URLs or
 IRLs or whatever they're called.

 - -1.

--1 is the same as +1, but I suspect that is not what you meant.


 Where is this unicode URL coming from?  URLs generated from code
 should already be correct.

The only change is moving the point where 'correct' switches from
unicode to an escaped, UTF-8 encoded string. That change can be made without
breaking any backwards compatibility.

Wichert.


Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Martin Aspeli
Wichert Akkerman wrote:
 On 3/1/10 13:41 , Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Marius Gedminas wrote:
 On Sun, Feb 28, 2010 at 05:05:51PM +0100, Wichert Akkerman wrote:
 On 2010-2-26 18:25, Tres Seaver wrote:
 Wichert Akkerman wrote:
 I see this as naming confusion. In this day and age every URL is
 effectively an IRI, and every modern browser treats them that way. If
 you look at http://jp.wikipedia.org/ you can see how well that works. I
 do not see why zope.publisher should not be able to support that
 transparently. Other systems such as Routes and repoze.bfg do.
 Browsers *display* what looks like unicode to the user, but they *pass*
 URL-encoded ASCII bytes to the server.
 But why can't zope.publisher do that conversion? I don't see the point
 in requiring all the thousands of routines that call those functions to
 do that conversion when zope.publisher can easily do so itself.
 +1

 Just like zope.publisher converts Unicode strings returned by views into
 UTF-8 (or whatever encoding negotiated via Accept-Charset),
 response.redirect() ought to Do The Right Thing with Unicode URLs or
 IRLs or whatever they're called.
 - -1.

 --1 is the same as +1, but I suspect that is not what you meant.


 Where is this unicode URL coming from?  URLs generated from code
 should already be correct.

 The only change is changing the point where 'correct' changes from
 unicode to an escaped UTF-8 encoded string. That change can be made without
 breaking any backwards compatibility.

I'm with Wichert here.

In most places, we tend to carry around unicode strings internally, and 
only encode on the boundaries, e.g. when the URL is rendered. I don't 
see why redirect() can't have a sensible and predictable policy for 
unicode strings, making life easier for everyone.

If we think that non-ASCII URLs are illegal, then maybe we should 
validate for that and throw an error. However, I don't think that's the 
case (anymore?). In that case, passing a unicode object to the function 
seems entirely consistent with other places, e.g. when we pass unicode 
to the page template engine or return unicode from a view, which the 
publisher then encodes before it's pushed down to the client.
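
A minimal sketch of what such a policy could look like, written for Python 3 rather than the Python 2 code under discussion. The helper name `ascii_location` and the set of characters left untouched are assumptions for illustration, not anything zope.publisher provides. It percent-escapes non-ASCII characters as UTF-8 and leaves existing escapes and URL delimiters alone:

```python
from urllib.parse import quote

# Characters left untouched: RFC 3986 delimiters and unreserved punctuation,
# plus "%" so that escapes already present are not escaped a second time.
_KEEP = "%/:?#[]@!$&'()*+,;=-._~"


def ascii_location(location):
    """Hypothetical helper: return an ASCII-only form of a redirect target.

    Non-ASCII characters are encoded as UTF-8 and percent-escaped.
    """
    return quote(location, safe=_KEEP)


print(ascii_location("http://example.com/d\u00f6c?x=1&y=2"))
```

This sketch treats the whole string uniformly, so a non-ASCII hostname would be percent-escaped rather than converted to its IDNA form, and a literal "%" that is not part of an escape is passed through unchanged.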

Martin

-- 
Author of `Professional Plone Development`, a book for developers who
want to work with Plone. See http://martinaspeli.net/plone-book



Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Christian Theune
Hi,

On 03/01/2010 02:28 PM, Martin Aspeli wrote:

 I'm with Wichert here.
 
 In most places, we tend to carry around unicode strings internally, and 
 only encode on the boundaries, e.g. when the URL is rendered. I don't 
 see why redirect() can't have a sensible and predictable policy for 
 unicode strings, making life easier for everyone.
 
 If we think that non-ASCII URLs are illegal, then maybe we should 
 validate for that and throw an error. However, I don't think that's the 
 case (anymore?). In that case, passing a unicode object to the function 
 seems entirely consistent with other places, e.g. when we pass unicode 
 to the page template engine or return unicode from a view, which the 
 publisher then encodes before it's pushed down to the client.

I opened a question in another part of the thread, but haven't gotten an
answer yet. In my understanding, a Unicode string is not able to
represent the structural properties of a URL in http scheme properly,
thus encoding back to ASCII is not possible.

Can someone confirm or disprove this?

Christian

-- 
Christian Theune · c...@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
Zope and Plone consulting and development



Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Wichert Akkerman
On 3/1/10 15:09 , Christian Theune wrote:
 Hi,

 On 03/01/2010 02:28 PM, Martin Aspeli wrote:

 I'm with Wichert here.

 In most places, we tend to carry around unicode strings internally, and
 only encode on the boundaries, e.g. when the URL is rendered. I don't
 see why redirect() can't have a sensible and predictable policy for
 unicode strings, making life easier for everyone.

 If we think that non-ASCII URLs are illegal, then maybe we should
 validate for that and throw an error. However, I don't think that's the
 case (anymore?). In that case, passing a unicode object to the function
 seems entirely consistent with other places, e.g. when we pass unicode
 to the page template engine or return unicode from a view, which the
 publisher then encodes before it's pushed down to the client.

 I opened a question in another part of the thread, but haven't gotten an
 answer yet. In my understanding, a Unicode string is not able to
 represent the structural properties of a URL in http scheme properly,
 thus encoding back to ASCII is not possible.

 Can someone confirm or disprove this?

I am not sure what you mean. On the wire you get a path component in an
HTTP GET request which is UTF-8 encoded and escaped. For example,
http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
is a Japanese string if you decode it back to unicode. That
encoding works fine in both directions, and all other parts of
the http scheme, such as query strings and fragments, work normally. Can
you provide an example of something that might not work?
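
The two directions described above can be checked with a short sketch. It uses Python 3's `urllib.parse` (the thread itself is about Python 2) and only the escaped path segment from the example URL:

```python
from urllib.parse import quote, unquote

# The escaped last path segment of the example URL.
escaped = "%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8"

title = unquote(escaped)     # percent-decode, then decode the bytes as UTF-8
round_trip = quote(title)    # encode as UTF-8, then percent-escape

print(title)
print(round_trip == escaped)
```

This particular segment contains no reserved characters, which is why the round trip is lossless here.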

Wichert.


Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Christian Theune
On 03/01/2010 03:34 PM, Wichert Akkerman wrote:
 On 3/1/10 15:09 , Christian Theune wrote:
 Hi,

 On 03/01/2010 02:28 PM, Martin Aspeli wrote:

 I'm with Wichert here.

 In most places, we tend to carry around unicode strings internally, and
 only encode on the boundaries, e.g. when the URL is rendered. I don't
 see why redirect() can't have a sensible and predictable policy for
 unicode strings, making life easier for everyone.

 If we think that non-ASCII URLs are illegal, then maybe we should
 validate for that and throw an error. However, I don't think that's the
 case (anymore?). In that case, passing a unicode object to the function
 seems entirely consistent with other places, e.g. when we pass unicode
 to the page template engine or return unicode from a view, which the
 publisher then encodes before it's pushed down to the client.

 I opened a question in another part of the thread, but haven't gotten an
 answer yet. In my understanding, a Unicode string is not able to
 represent the structural properties of a URL in http scheme properly,
 thus encoding back to ASCII is not possible.

 Can someone confirm or disprove this?
 
 I am not sure what you mean. On the wire you get a path component in a 
 HTTP get request which is UTF-8 encoded and escaped. For example 
 http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
  
 , which is a Japanese string if you decode it back to unicode. That 
 encoding works fine in two directions, and all other properties used in 
 the http scheme such as query strings and fragments work normally. Can 
 you provide an example of something that might not work?

The problem is that a URI has internal structure which looks to me like
it can't be reconstructed properly if it was decoded into a regular
unicode string.

E.g. reserved characters are probably decoded into their regular symbols
(e.g. a slash embedded in a path component or ampersands used in query
arguments), so escaping needs to be done (manually) before encoding.
Also, some parts of a URI can use other ways to encode symbols.
Hostnames would like to be encoded to punycode whereas URIs don't even
say what character set unicode characters should be encoded to. That
would be up to the application (e.g. our publisher, so that's manageable).

I have the feeling that the roundtrip URI -> unicode string -> URI
can't be done fully correctly and thus may be susceptible to
interference from the outside.

I still hope we can do better than doing nothing about it. I just think
it's more complex than calling encode('something'). ;)
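
A small sketch of the two concerns raised here, in Python 3 and with made-up paths and hostnames: an escaped slash inside a path segment does not survive a decode and re-encode cycle, and hostnames use a different encoding (IDNA, built on punycode) from the rest of the URI.

```python
from urllib.parse import quote, unquote

# A path whose last segment is the single name "a/b", escaped as %2F.
original = "/folder/a%2Fb"
decoded = unquote(original)      # the escaped slash becomes a plain "/"
re_encoded = quote(decoded)      # quote() leaves "/" alone by default

print(decoded)                   # now indistinguishable from two segments
print(re_encoded == original)    # the original structure is lost

# Hostnames are not percent-escaped; Python's built-in "idna" codec
# produces the ASCII form instead. The hostname is hypothetical.
host = "b\u00fccher.example"
ascii_host = host.encode("idna")
print(ascii_host)
```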

Christian

-- 
Christian Theune · c...@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
Zope and Plone consulting and development



Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Adam GROSZER
Hello,

Thinking about the problem, the following comes to mind. I guess most
of our redirect() calls use absoluteURL(). And what does absoluteURL do?
It converts unicode object names to a URL, seemingly in a simple way.
We then feed this URL to redirect(). The edge case that happened with
the loginform is when the URL does not come from absoluteURL.

My assumption is that doing the same in redirect as absoluteURL does
should be OK (unless Tres finds this out of line with the RFC).

Excerpts from the source:

class AbsoluteURL(BrowserView):
    implements(IAbsoluteURL)

    def __unicode__(self):
        return urllib.unquote(self.__str__()).decode('utf-8')
    ...
    def __str__(self):
        ...
        name = getattr(context, '__name__', None)
        ...
        if name:
            url += '/' + urllib.quote(name.encode('utf-8'), _safe)

        return url
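
For readers on Python 3, the per-name quoting in the excerpt can be sketched as below. This is not the zope.traversing code: the function `absolute_url`, its arguments, and the value of `_SAFE` are assumptions made for illustration (the excerpt does not show what `_safe` contains).

```python
from urllib.parse import quote

# Assumption: characters left unescaped inside a name. The real value of
# _safe is not shown in the excerpt above.
_SAFE = "@+"


def absolute_url(base, names):
    """Hypothetical stand-in: append each object name as one escaped segment."""
    url = base
    for name in names:
        if name:
            # quote() encodes str input as UTF-8 before percent-escaping,
            # matching name.encode('utf-8') in the Python 2 excerpt.
            url += "/" + quote(name, safe=_SAFE)
    return url


print(absolute_url("http://localhost", ["docs", "\u00d6l.txt"]))
```

Because "/" is not in the safe set, a name containing a slash stays a single segment, which is the structural information a plain unicode string of the whole URL would lose.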

Monday, March 1, 2010, 3:09:33 PM, you wrote:

CT Hi,

CT On 03/01/2010 02:28 PM, Martin Aspeli wrote:

 I'm with Wichert here.
 
 In most places, we tend to carry around unicode strings internally, and 
 only encode on the boundaries, e.g. when the URL is rendered. I don't 
 see why redirect() can't have a sensible and predictable policy for 
 unicode strings, making life easier for everyone.
 
 If we think that non-ASCII URLs are illegal, then maybe we should 
 validate for that and throw an error. However, I don't think that's the 
 case (anymore?). In that case, passing a unicode object to the function 
 seems entirely consistent with other places, e.g. when we pass unicode 
 to the page template engine or return unicode from a view, which the 
 publisher then encodes before it's pushed down to the client.

CT I opened a question in another part of the thread, but haven't gotten an
CT answer yet. In my understanding, a Unicode string is not able to
CT represent the structural properties of a URL in http scheme properly,
CT thus encoding back to ASCII is not possible.

CT Can someone confirm or disprove this?

CT Christian



-- 
Best regards,
 Adam GROSZER  mailto:agros...@gmail.com
--
Quote of the day:
Death is God's way of telling you not to be such a wise guy.



Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Adam GROSZER
Hello Christian,

Isn't it the case that anything below chr(128) converts to utf-8 as the
same character? That would mean that slash and ampersand stay as they
are.
OTOH, encoding is done only on non-ascii characters, assuming that the
encoding is utf-8, which is what's hardwired into absoluteURL.

Monday, March 1, 2010, 4:40:30 PM, you wrote:

CT On 03/01/2010 03:34 PM, Wichert Akkerman wrote:
 On 3/1/10 15:09 , Christian Theune wrote:
 Hi,

 On 03/01/2010 02:28 PM, Martin Aspeli wrote:

 I'm with Wichert here.

 In most places, we tend to carry around unicode strings internally, and
 only encode on the boundaries, e.g. when the URL is rendered. I don't
 see why redirect() can't have a sensible and predictable policy for
 unicode strings, making life easier for everyone.

 If we think that non-ASCII URLs are illegal, then maybe we should
 validate for that and throw an error. However, I don't think that's the
 case (anymore?). In that case, passing a unicode object to the function
 seems entirely consistent with other places, e.g. when we pass unicode
 to the page template engine or return unicode from a view, which the
 publisher then encodes before it's pushed down to the client.

 I opened a question in another part of the thread, but haven't gotten an
 answer yet. In my understanding, a Unicode string is not able to
 represent the structural properties of a URL in http scheme properly,
 thus encoding back to ASCII is not possible.

 Can someone confirm or disprove this?
 
 I am not sure what you mean. On the wire you get a path component in a 
 HTTP get request which is UTF-8 encoded and escaped. For example 
 http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
  
 , which is a Japanese string if you decode it back to unicode. That 
 encoding works fine in two directions, and all other properties used in 
 the http scheme such as query strings and fragments work normally. Can 
 you provide an example of something that might not work?

CT The problem is that a URI has internal structure which looks to me like
CT it can't be reconstructed properly if it was decoded into a regular
CT unicode string.

CT E.g. reserved characters are probably decoded into their regular symbols
CT (e.g. a slash embedded in a path component or ampersands used in query
CT arguments), so escaping needs to be done (manually) before encoding.
CT Also, some parts of a URI can use other ways to encode symbols.
CT Hostnames would like to be encoded to punycode whereas URIs don't even
CT say what character set unicode characters should be encoded to. That
CT would be up to the application (e.g. our publisher, so that's manageable).

CT I have the feeling that roundtrip behaviour of URI - unicode string -
CT URI won't be possible fully correctly and thus may be susceptible to
CT interference from the outside.

CT I still hope we can do better than doing nothing about it. I just think
CT it's more complex than calling encode('something'). ;)

CT Christian



-- 
Best regards,
 Adam GROSZER  mailto:agros...@gmail.com
--
Quote of the day:
Reflect upon your present blessings - of which every man has many- not on your 
past misfortunes, of which all men have some. 
- Charles Dickens 



Re: [Zope-dev] redirect burps on unicode URLs

2010-03-01 Thread Christian Theune
Hi,

On 03/01/2010 05:04 PM, Adam GROSZER wrote:
 Hello Christian,
 
 Isn't it that anything below chr(128) converts to utf-8 as the same
 character? That would mean that slash and ampersand will stay as it
 is.

No. The spec says that if you want to use a reserved character
(depending on the scheme) you need to quote it.

 OTOH encoding is done only on non-ascii characters. Supposed that the
 encoding is utf-8. What's hardwired into absoluteURL.

But then again, it's not UTF-8 for all of the URL. No spec ever says
to encode path elements as UTF-8.
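
Both points in this exchange can be seen side by side in a short Python 3 sketch (the sample string is made up): UTF-8 encoding alone leaves ASCII characters such as "/" and "&" unchanged, while quoting a single path segment has to escape them.

```python
from urllib.parse import quote

segment = "a/b&c"   # a single name that happens to contain reserved characters

# UTF-8 encoding maps every code point below 128 to the same single byte.
as_utf8 = segment.encode("utf-8")

# Percent-escaping with an empty safe set escapes the reserved characters too.
as_segment = quote(segment, safe="")

print(as_utf8)
print(as_segment)
```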

Christian

-- 
Christian Theune · c...@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
Zope and Plone consulting and development



Re: [Zope-dev] redirect burps on unicode URLs

2010-02-28 Thread Wichert Akkerman
On 2010-2-26 18:25, Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Wichert Akkerman wrote:
 On 2/25/10 17:08 , Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Adam GROSZER wrote:
 Hello,

 Looks like zope.publisher burps on unicode URL which contain non-ascii
 chars. This is from a KGS 3.4 application, but looking at the source
 it still seems to have the same problems.

 opinions?

 ...
   self.request.response.redirect(url)
 File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py,
  line
 729, in redirect
   return super(BrowserResponse, self).redirect(location, status)
 File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 882,
 in redirect
   self.setHeader('Location', location)
 File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 676,
 in setHeader
   value = str(value)
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in 
 position 71: ordinal not in
 range(128)
 Two issues:

 - - Technically there is no such thing as a unicode URL:  URLs are
 always ASCII, with other characters encoded[1].  IRIs and IRLs are
 a different thing altogether.

 I see this as naming confusion. In this day and age every URL is
 effectively an IRI, and every modern browser treats them that way. If
 you look at http://jp.wikipedia.org/ you can see how well that works. I
 do not see why zope.publisher should not be able to support that
 transparently. Other systems such as Routes and repoze.bfg do.

 Browsers *display* what looks like unicode to the user, but they *pass*
 URL-encoded ASCII bytes to the server.

But why can't zope.publisher do that conversion? I don't see the point
in requiring all the thousands of routines that call those functions to 
do that conversion when zope.publisher can easily do so itself.

Wichert.

-- 
Wichert Akkerman wich...@wiggy.net   It is simple to make things.
http://www.wiggy.net/  It is hard to make things simple.


Re: [Zope-dev] redirect burps on unicode URLs

2010-02-26 Thread Wichert Akkerman
On 2/25/10 17:08 , Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Adam GROSZER wrote:
 Hello,

 Looks like zope.publisher burps on unicode URL which contain non-ascii
 chars. This is from a KGS 3.4 application, but looking at the source
 it still seems to have the same problems.

 opinions?

 ...
  self.request.response.redirect(url)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py,
  line
 729, in redirect
  return super(BrowserResponse, self).redirect(location, status)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 882,
 in redirect
  self.setHeader('Location', location)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 676,
 in setHeader
  value = str(value)
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in position 
 71: ordinal not in
 range(128)

 Two issues:

 - - Technically there is no such thing as a unicode URL:  URLs are
always ASCII, with other characters encoded[1].  IRIs and IRLs are
a different thing altogether.

I see this as naming confusion. In this day and age every URL is 
effectively an IRI, and every modern browser treats them that way. If 
you look at http://jp.wikipedia.org/ you can see how well that works. I 
do not see why zope.publisher should not be able to support that 
transparently. Other systems such as Routes and repoze.bfg do.

Wichert.


Re: [Zope-dev] redirect burps on unicode URLs

2010-02-26 Thread Adam GROSZER
Hello,

Some background:
This arises when you try to access a URL which has non-ascii characters in it
(but is well encoded), usually a document uploaded by a user.
Then the loginform comes with its camefrom parameter.
On successful login you get redirected to the camefrom URL, which gets
decoded to unicode; that's where it burps (for me).

Do we want to fix this?
Which part is bogus? The loginform with its redirect, or the redirect
itself?
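
The flow described above can be sketched in Python 3 as follows. The form path `/login` and the sample document path are hypothetical, and the assumption that the application holds the requested path as decoded text is mine. The sketch shows the camefrom value arriving as unicode and being escaped again before it could be used as a redirect target:

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit

# Well-encoded path of the originally requested document (made up).
wire_path = "/docs/%C3%96l.txt"

# Assumption: the application keeps the requested path as decoded text.
camefrom = unquote(wire_path)

# The login form URL carries it as a query parameter.
login_url = "/login?" + urlencode({"camefrom": camefrom})

# After login, form parsing hands the parameter back as unicode text.
recovered = parse_qs(urlsplit(login_url).query)["camefrom"][0]

# Escaping it again yields an ASCII-only value for a Location header.
location = quote(recovered)

print(login_url)
print(location)
```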

Friday, February 26, 2010, 9:29:54 AM, you wrote:

WA On 2/25/10 17:08 , Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Adam GROSZER wrote:
 Hello,

 Looks like zope.publisher burps on unicode URL which contain non-ascii
 chars. This is from a KGS 3.4 application, but looking at the source
 it still seems to have the same problems.

 opinions?

 ...
  self.request.response.redirect(url)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py,
  line
 729, in redirect
  return super(BrowserResponse, self).redirect(location, status)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 882,
 in redirect
  self.setHeader('Location', location)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 676,
 in setHeader
  value = str(value)
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in 
 position 71: ordinal not in
 range(128)

 Two issues:

 - - Technically there is no such thing as a unicode URL:  URLs are
always ASCII, with other characters encoded[1].  IRIs and IRLs are
a different thing altogether.

WA I see this as naming confusion. In this day and age every URL is 
WA effectively an IRI, and every modern browser treats them that way. If 
WA you look at http://jp.wikipedia.org/ you can see how well that works. I
WA do not see why zope.publisher should not be able to support that 
WA transparently. Other systems such as Routes and repoze.bfg do.

WA Wichert.

-- 
Best regards,
 Adam GROSZER  mailto:agros...@gmail.com
--
Quote of the day:
Do not make friends who are comfortable to be with. Make friends who will force 
you to lever yourself up. 
- Thomas J. Watson, Sr. 



Re: [Zope-dev] redirect burps on unicode URLs

2010-02-26 Thread Christian Theune
On 02/26/2010 04:12 PM, Adam GROSZER wrote:
 Hello,
 
 Some background:
 This arises when you try to access a url which has non-ascii in it
 (but is well encoded), usually a document uploaded by a user.
 Then the loginform comes with its camefrom parameter.
 On successful login you get redirected to the camefrom URL, which gets
 decoded to unicode; that's where it burps (for me).
 
 Do we want to fix this?

Yes, please.

 Which part is bogus? The loginform with its redirect, or the redirect
 itself?

According to the RFC Tres referenced we can't just represent a URL as a
unicode string but would have to deal with the schemes' structure and
specific encoding issues.

To solve the round-trip issue, one way that I can see would be to use
a specific URL object instead of a unicode string. That would be quite a
change, but we could probably make it work in a backwards compatible way.
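
A rough sketch of what such a URL object might look like, in Python 3. The class name `URL`, its fields, and its encoding choices are all assumptions for illustration and do not correspond to any existing Zope API. The point is that keeping the components separate lets each one be encoded by its own rules:

```python
from dataclasses import dataclass, field
from urllib.parse import quote, urlencode


@dataclass
class URL:
    """Hypothetical structured URL: components are kept apart as text."""
    scheme: str
    host: str
    segments: list = field(default_factory=list)   # unescaped path segments
    query: list = field(default_factory=list)      # (key, value) pairs

    def __str__(self):
        # Hostnames use IDNA, not percent-escaping.
        host = self.host.encode("idna").decode("ascii")
        # Each segment is escaped on its own, so an embedded "/" survives.
        path = "".join("/" + quote(s, safe="") for s in self.segments)
        url = f"{self.scheme}://{host}{path}"
        if self.query:
            url += "?" + urlencode(self.query)
        return url


print(URL("http", "example.com", ["docs", "a/b", "\u00d6l"], [("q", "x&y")]))
```

Choosing UTF-8 for path segments and query values is a decision made by this sketch, not something the URI specification mandates.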

Other suggestions?

Christian

-- 
Christian Theune · c...@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
Zope and Plone consulting and development



Re: [Zope-dev] redirect burps on unicode URLs

2010-02-26 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Wichert Akkerman wrote:
 On 2/25/10 17:08 , Tres Seaver wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Adam GROSZER wrote:
 Hello,

 Looks like zope.publisher burps on unicode URL which contain non-ascii
 chars. This is from a KGS 3.4 application, but looking at the source
 it still seems to have the same problems.

 opinions?

 ...
  self.request.response.redirect(url)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py,
  line
 729, in redirect
  return super(BrowserResponse, self).redirect(location, status)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 882,
 in redirect
  self.setHeader('Location', location)
File 
 d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py,
  line 676,
 in setHeader
  value = str(value)
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in 
 position 71: ordinal not in
 range(128)
 Two issues:

 - - Technically there is no such thing as a unicode URL:  URLs are
always ASCII, with other characters encoded[1].  IRIs and IRLs are
a different thing altogether.
 
 I see this as naming confusion. In this day and age every URL is 
 effectively an IRI, and every modern browser treats them that way. If 
 you look at http://jp.wikipedia.org/ you can see how well that works. I 
 do not see why zope.publisher should not be able to support that 
 transparently. Other systems such as Routes and repoze.bfg do.

Browsers *display* what looks like unicode to the user, but they *pass*
URL-encoded ASCII bytes to the server.  Even if that weren't so, HTTP
header values (the 'Location:' header, in this case) still have to be
ASCII, period.
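
The display-versus-wire distinction above can be illustrated with a small
sketch. It is written for Python 3's standard library (this thread predates
it and concerns Python 2.5), and the path used is a made-up example, not one
taken from the thread.

```python
from urllib.parse import quote, unquote

# Hypothetical path as a user would see it in the address bar.
displayed = "/wiki/\u00d6sterreich"

# Percent-encode the UTF-8 bytes of every non-ASCII character; "/" is kept.
on_the_wire = quote(displayed, safe="/")
print(on_the_wire)  # /wiki/%C3%96sterreich

# The wire form is pure ASCII, so it is usable as a header value as-is.
on_the_wire.encode("ascii")

# Decoding the percent-escapes recovers the displayed form.
print(unquote(on_the_wire) == displayed)  # True
```

The choice of UTF-8 for the percent-encoded bytes is an assumption here; it
is what quote() does by default, but a server is free to expect something
else.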


Tres.
--
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design   http://palladion.com



[Zope-dev] redirect burps on unicode URLs

2010-02-25 Thread Adam GROSZER
Hello,

Looks like zope.publisher burps on unicode URL which contain non-ascii
chars. This is from a KGS 3.4 application, but looking at the source
it still seems to have the same problems.

opinions?

...
    self.request.response.redirect(url)
  File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py", line 729, in redirect
    return super(BrowserResponse, self).redirect(location, status)
  File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py", line 882, in redirect
    self.setHeader('Location', location)
  File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py", line 676, in setHeader
    value = str(value)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in position 71: ordinal not in range(128)
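
For readers who want to see the failure in isolation: the last frame of the
traceback calls str(value) on a unicode object, which on Python 2 implies an
ASCII encode. The sketch below reproduces the equivalent explicit encode on
Python 3; the URL is a hypothetical stand-in, since the real one is not
shown in the report.

```python
# Hypothetical redirect target containing U+00D6 (the character named in the
# traceback); the actual URL from the report is not known.
location = "http://example.com/\u00d6sterreich"

try:
    # Python 2's str(unicode_value) performed this encode implicitly.
    location.encode("ascii")
except UnicodeEncodeError as exc:
    failure = (exc.encoding, exc.object[exc.start], exc.reason)
else:
    failure = None

print(failure)
```

The caught exception reports the same codec and the same "ordinal not in
range(128)" reason as the traceback; only the character position differs,
because the stand-in URL is shorter.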

-- 
Best regards,
 Adam GROSZER  mailto:agros...@gmail.com
--
Quote of the day:
There is not a heart that but has its moments of longing, yearning for 
something better. 
- Henry Ward Beecher 



Re: [Zope-dev] redirect burps on unicode URLs

2010-02-25 Thread Wichert Akkerman
I suspect many parts of Zope are not up to handling unicode URLs. A bug,
I'd say :)

Wichert.

On 2/25/10 15:00 , Adam GROSZER wrote:
 Hello,

 Looks like zope.publisher burps on unicode URL which contain non-ascii
 chars. This is from a KGS 3.4 application, but looking at the source
 it still seems to have the same problems.

 opinions?

 ...
     self.request.response.redirect(url)
   File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py", line 729, in redirect
     return super(BrowserResponse, self).redirect(location, status)
   File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py", line 882, in redirect
     self.setHeader('Location', location)
   File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py", line 676, in setHeader
     value = str(value)
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in position 71: ordinal not in range(128)




Re: [Zope-dev] redirect burps on unicode URLs

2010-02-25 Thread Tres Seaver

Adam GROSZER wrote:
 Hello,
 
 Looks like zope.publisher burps on unicode URL which contain non-ascii
 chars. This is from a KGS 3.4 application, but looking at the source
 it still seems to have the same problems.
 
 opinions?
 
 ...
     self.request.response.redirect(url)
   File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\browser.py", line 729, in redirect
     return super(BrowserResponse, self).redirect(location, status)
   File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py", line 882, in redirect
     self.setHeader('Location', location)
   File "d:\home\.buildout\eggs\zope.publisher-3.4.6-py2.5.egg\zope\publisher\http.py", line 676, in setHeader
     value = str(value)
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in position 71: ordinal not in range(128)

Two issues:

- Technically there is no such thing as a unicode URL:  URLs are
  always ASCII, with other characters encoded[1].  IRIs and IRLs are
  a different thing altogether.

- Headers in responses must *not* be Unicode.

Your application needs to make the URL header-safe before calling
redirect, likely by using 'urlencode'.


[1] http://tools.ietf.org/html/rfc1738#section-2.2
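
One way an application might make a URL header-safe before redirecting is
sketched below. It targets Python 3's standard library rather than the
Python 2.5 of the traceback, and header_safe_url is a hypothetical helper,
not something zope.publisher provides. It uses quote() instead of
urlencode(), since in the standard library urlencode() builds a query string
from key/value pairs while quote() percent-encodes an existing string.
Assumptions: UTF-8 for the percent-encoded bytes, IDNA for the host name,
and no "user:password@" prefix or bracketed IPv6 host in the input (neither
is preserved by this sketch).

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched besides ASCII letters and digits. "%" is included
# so that input which is already percent-encoded is not encoded a second time.
SAFE = "/%:@!$&'()*+,;=?~-._"


def header_safe_url(url):
    """Return an ASCII-only form of url (hypothetical helper)."""
    parts = urlsplit(url)
    # Host names are mapped with IDNA, not percent-encoding.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    # Everything after the host: non-ASCII becomes %XX escapes of UTF-8 bytes.
    path = quote(parts.path, safe=SAFE)
    query = quote(parts.query, safe=SAFE)
    fragment = quote(parts.fragment, safe=SAFE)
    return urlunsplit((parts.scheme, host, path, query, fragment))


print(header_safe_url("http://example.com/caf\u00e9?q=\u00d6#top"))
# http://example.com/caf%C3%A9?q=%C3%96#top
```

In the view from the traceback this would be used as
self.request.response.redirect(header_safe_url(url)). Because "%" is treated
as safe, calling the helper on an already encoded URL returns it unchanged.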



Tres.
--
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design   http://palladion.com



Re: [Zope-dev] redirect burps on unicode URLs

2010-02-25 Thread Andreas Jung

Adam GROSZER wrote:


As Tres pointed out: URLs must be properly encoded, and it is not safe
to pass Python unicode strings to APIs expecting byte strings.

Andreas

