Re: [Zope-CMF] Charsets

2009-01-19 Thread Charlie Clark

On 18.01.2009 at 23:00, yuppie wrote:

 Hi Charlie!

Hiya Yuppie,

 Charlie Clark wrote:
 On 29.12.2008 at 15:01, Charlie Clark wrote:

 CMFDefault.utils

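 from zope.i18n.interfaces import IUserPreferredCharsets

 def getBrowserCharset(request):
     """Get charset preferred by the browser."""
     # Adapt the request to look up the charsets it accepts.
     envadapter = IUserPreferredCharsets(request)
     # Fall back to UTF-8 when the browser states no preference.
     charsets = envadapter.getPreferredCharsets() or ['utf-8']
     return charsets[0]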

 This will always be iso-8859-1 for Opera and Firefox because all
 charsets have the same quality, again even if UTF-8 encoding is
 specified.

 getBrowserCharset does almost the same as
 zope.publisher.http.getCharsetUsingRequest. And it is only used for
 encoding and decoding 'portal_status_message'. It is not relevant for
 the issue you noticed.

Okay.

 I haven't been able to track where the decoding of form
 data occurs for Zope 2 stuff but I can identify the problem in
 zpublisher.browser.BrowserRequest

 You mean zope.publisher.browser.BrowserRequest. The Zope 2 version is in
 Products.Five.browser.decode.

Thanks - I thought it must have been in Five but didn't know where to  
look.

 AFAICS the fallback to other charsets is usually not required in Zope 3.
 If the publisher encodes responses using
 zope.publisher.http.getCharsetUsingRequest, the first charset will be
 the right one.

That seems reasonable.

 I would suggest that we work towards enforcing UTF-8 where possible
 but at the very least add the accept-charset attribute to forms and
 use the portal's default_charset for this.

 I'd very much appreciate your comments on this.

 I can't see a need to implement this in a different way than Zope 3.
 So I propose to fix the encoding of forms sent to the browser.


I agree that this shouldn't be implemented in a different way than
for Zope 3. And if we can solve the problems by fixing form encoding
I'm happy. Although I'd like to see UTF-8 always the first charset
returned if the quality is the same.

One thing that did strike me when working on this is quite how often
getPreferredCharsets() is called on a single request.

Charlie
--
Charlie Clark
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-938-5360
GSM: +49-178-782-6226



___
Zope-CMF maillist  -  Zope-CMF@lists.zope.org
http://mail.zope.org/mailman/listinfo/zope-cmf

See https://bugs.launchpad.net/zope-cmf/ for bug reports and feature requests


Re: [Zope-CMF] Charsets

2009-01-19 Thread yuppie
Charlie Clark wrote:
 On 18.01.2009 at 23:00, yuppie wrote:
 I agree that this shouldn't be implemented in a different way than
 for Zope 3. And if we can solve the problems by fixing form encoding
 I'm happy. Although I'd like to see UTF-8 always the first charset
 returned if the quality is the same.

zope.publisher.http.HTTPCharsets explicitly prefers utf-8. Are you sure
getPreferredCharsets()[0] is iso-8859-1 with your browser? Or do you
override the Content-Type header set by setPageEncoding() somewhere?
AFAICS CMFDefault works exactly the way you expect it to.
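
A minimal sketch of the ordering described here, written as a
standalone function rather than the actual zope.publisher code. The
name preferred_charsets and the parsing details are assumptions made
for illustration: quality values are honoured, iso-8859-1 is implied
when no '*' is sent (as RFC 2616 specifies), and utf-8 is moved to the
front whenever the browser accepts it at all.

```python
def preferred_charsets(accept_charset):
    """Return charsets from an Accept-Charset header value, best first.

    Hypothetical helper for illustration; not the zope.publisher code.
    """
    charsets = []
    saw_star = saw_latin1 = False
    for item in accept_charset.split(','):
        name, _, param = item.strip().lower().partition(';')
        name = name.strip()
        if not name:
            continue
        quality = 1.0
        param = param.strip()
        if param.startswith('q='):
            try:
                quality = float(param[2:])
            except ValueError:
                continue  # skip entries with a malformed quality value
        if name == '*':
            saw_star = True
        elif name == 'iso-8859-1':
            saw_latin1 = True
        if quality == 0.0:
            continue  # q=0 means "not acceptable"
        charsets.append((quality, name))
    # RFC 2616: without '*', an unmentioned iso-8859-1 counts as q=1.
    if not saw_star and not saw_latin1:
        charsets.append((1.0, 'iso-8859-1'))
    # utf-8 first if it is acceptable, then descending quality.
    charsets.sort(key=lambda pair: (pair[1] != 'utf-8', -pair[0]))
    return [name for _quality, name in charsets]


# A header of the kind Firefox sent at the time:
print(preferred_charsets('ISO-8859-1,utf-8;q=0.7,*;q=0.7'))
# ['utf-8', 'iso-8859-1', '*']
```

With this ordering the first entry is utf-8 even though iso-8859-1 has
the higher quality value in the header.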

Cheers, Yuppie



[Zope-CMF] CMF Tests: 6 OK

2009-01-19 Thread CMF Tests Summarizer
Summary of messages to the cmf-tests list.
Period Sun Jan 18 12:00:00 2009 UTC to Mon Jan 19 12:00:00 2009 UTC.
There were 6 messages: 6 from CMF Tests.


Tests passed OK
---

Subject: OK : CMF-2.1 Zope-2.10 Python-2.4.5 : Linux
From: CMF Tests
Date: Sun Jan 18 21:15:18 EST 2009
URL: http://mail.zope.org/pipermail/cmf-tests/2009-January/010759.html

Subject: OK : CMF-2.1 Zope-2.11 Python-2.4.5 : Linux
From: CMF Tests
Date: Sun Jan 18 21:16:49 EST 2009
URL: http://mail.zope.org/pipermail/cmf-tests/2009-January/010760.html

Subject: OK : CMF-trunk Zope-2.10 Python-2.4.5 : Linux
From: CMF Tests
Date: Sun Jan 18 21:18:19 EST 2009
URL: http://mail.zope.org/pipermail/cmf-tests/2009-January/010761.html

Subject: OK : CMF-trunk Zope-2.11 Python-2.4.5 : Linux
From: CMF Tests
Date: Sun Jan 18 21:19:49 EST 2009
URL: http://mail.zope.org/pipermail/cmf-tests/2009-January/010762.html

Subject: OK : CMF-trunk Zope-trunk Python-2.4.5 : Linux
From: CMF Tests
Date: Sun Jan 18 21:21:19 EST 2009
URL: http://mail.zope.org/pipermail/cmf-tests/2009-January/010763.html

Subject: OK : CMF-trunk Zope-trunk Python-2.5.2 : Linux
From: CMF Tests
Date: Sun Jan 18 21:22:49 EST 2009
URL: http://mail.zope.org/pipermail/cmf-tests/2009-January/010764.html



Re: [Zope-CMF] Charsets

2009-01-19 Thread Charlie Clark

On 19.01.2009 at 11:32, yuppie wrote:

 zope.publisher.http.HTTPCharsets explicitly prefers utf-8. Are you sure
 getPreferredCharsets()[0] is iso-8859-1 with your browser? Or do you
 override the Content-Type header set by setPageEncoding() somewhere?
 AFAICS CMFDefault works exactly the way you expect it to.


No, I don't override anything. I'll run some tests and post the results.

Charlie





Re: [Zope-CMF] Charsets

2009-01-19 Thread Dieter Maurer
yuppie wrote at 2009-1-19 11:32 +0100:
Charlie Clark wrote:
 On 18.01.2009 at 23:00, yuppie wrote:
 I agree that this shouldn't be implemented in a different way than
 for Zope 3. And if we can solve the problems by fixing form encoding
 I'm happy. Although I'd like to see UTF-8 always the first charset
 returned if the quality is the same.

zope.publisher.http.HTTPCharsets explicitly prefers utf-8. Are you sure 
getPreferredCharsets()[0] is iso-8859-1 with your browser?

This might be true for the Zope 3 publisher. However, the Zope 2
HTTPResponse uses default_encoding (configured in zope.conf) unless an
encoding is prescribed by the response content type -- and this has
nothing to do with the Accept-Charset request header.
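
A rough sketch of that rule, not the Zope 2 HTTPResponse code itself.
The function name response_encoding and the regular expression are
assumptions for illustration: the charset parameter of the response
content type wins, otherwise the configured default is used, and the
request is never looked at.

```python
import re

# Matches the charset parameter in a Content-Type value,
# e.g. "text/html; charset=utf-8".
_CHARSET_RE = re.compile(r'charset\s*=\s*"?([\w.:-]+)"?', re.IGNORECASE)


def response_encoding(content_type, default_encoding='utf-8'):
    """Pick the encoding used for a response body.

    Hypothetical helper: the Accept-Charset request header is
    deliberately not an input.
    """
    match = _CHARSET_RE.search(content_type or '')
    if match:
        return match.group(1).lower()
    return default_encoding
```

So a response with content type 'text/html; charset=iso-8859-15' is
encoded as iso-8859-15, while plain 'text/html' falls back to whatever
default was configured.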



-- 
Dieter


Re: [Zope-CMF] Charsets

2009-01-19 Thread Dieter Maurer
Charlie Clark wrote at 2009-1-18 22:30 +0100:
On 18.01.2009 at 20:36, Dieter Maurer wrote:
 ...
 From the current HTML specification:

accept-charset = charset list [CI]
This attribute specifies the list of character encodings for input  
data that is accepted by the server processing this form. The value is  
a space- and/or comma-delimited list of charset values. The client  
must interpret this list as an exclusive-or list, i.e., the server is  
able to accept any single character encoding per entity received.

i.e. exactly as you have suggested: it is possible to force a client to
encode data in a particular charset before sending it to the server.
All references I have come across suggest that this, together with the
content-type meta tag, can and should be used to coerce browsers to use
UTF-8.

I fear that the accept-charset form control attribute can only easily
be used with method=post and content-type=multipart/form-data,
as only then does the browser have a chance to specify how it has
encoded the value.

I am not sure whether Zope handles the charset information
in this case correctly.


As the Accept-Charset request header has (almost) nothing to do
with the accept-charset form control attribute, it must of course
not be used to interpret form data, even when that data was encoded
according to an accept-charset attribute.


If the server chooses its output encoding based on the Accept-Charset
request header (and Yuppie indicated that the Zope 3 publisher does this),
then the same algorithm can be used for normal form data
(where normal means that you do not explicitly specify an accept-charset
form control attribute).
That's one sensible mode of operation.
Another one is choosing a fixed encoding and using it as both input and
output encoding.
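
The second mode might look roughly like this. It is a sketch only:
SITE_CHARSET, content_type_header and decode_form are made-up names,
and it uses the Python 3 standard library rather than anything from
Zope. One fixed charset is announced on every page and the same
charset is used to decode urlencoded form data coming back.

```python
from urllib.parse import parse_qsl

SITE_CHARSET = 'utf-8'  # hypothetical site-wide setting


def content_type_header(mime_type='text/html'):
    """Content-Type value announcing the fixed output encoding."""
    return '%s; charset=%s' % (mime_type, SITE_CHARSET)


def decode_form(query_string):
    """Decode an urlencoded form string using the fixed input encoding.

    errors='strict' makes a wrongly encoded submission fail loudly
    instead of being silently garbled.
    """
    pairs = parse_qsl(query_string, keep_blank_values=True,
                      encoding=SITE_CHARSET, errors='strict')
    return dict(pairs)
```

Because pages are served in SITE_CHARSET, browsers will normally submit
forms in that charset too, so no negotiation is needed on input.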



-- 
Dieter


Re: [Zope-CMF] Charsets

2009-01-19 Thread yuppie
Dieter Maurer wrote:
 yuppie wrote at 2009-1-19 11:32 +0100:
 Charlie Clark wrote:
 On 18.01.2009 at 23:00, yuppie wrote:
 I agree that this shouldn't be implemented in a different way than
 for Zope 3. And if we can solve the problems by fixing form encoding
 I'm happy. Although I'd like to see UTF-8 always the first charset
 returned if the quality is the same.
 zope.publisher.http.HTTPCharsets explicitly prefers utf-8. Are you sure 
 getPreferredCharsets()[0] is iso-8859-1 with your browser?
 
 This might be true for the Zope 3 publisher. However, the Zope 2
 HTTPResponse uses default_encoding (configured in zope.conf) unless an
 encoding is prescribed by the response content type -- and this has
 nothing to do with the Accept-Charset request header.

Products.Five.browser.decode.setPageEncoding sets the response content 
type charset based on zope.publisher.http.HTTPCharsets. And 
setPageEncoding is called by the update method of formlib forms in Zope 
2. So in this case the response encoding has something to do with the 
Accept-Charset request header.
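
In outline, and only as a sketch: the names set_page_encoding and
FakeResponse below are invented stand-ins, not the Five code, and the
list of preferred charsets is passed in directly rather than looked up
from the request.

```python
class FakeResponse:
    """Stand-in for a response object; only records headers."""

    def __init__(self):
        self.headers = {}

    def setHeader(self, name, value):
        self.headers[name.lower()] = value


def set_page_encoding(preferred_charsets, response):
    """Set the response charset from the browser's preferred charsets.

    Falls back to utf-8 when the list is empty.
    """
    charsets = preferred_charsets or ['utf-8']
    response.setHeader('content-type',
                       'text/html; charset=%s' % charsets[0])
    return charsets[0]
```

Once the content type carries a charset, that charset is what the
response gets encoded with, which is how the Accept-Charset header ends
up influencing the output for these forms.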

Cheers, Yuppie
