Re: [Zope-dev] [ZOPE 2.6 B1] Unicode/locale problems with OFS/dtml/properties.dtml

2002-09-30 Thread Toby Dickenson

On Saturday 28 Sep 2002 4:38 pm, Florent Guillaume wrote:

   (yes, this sucks. The problem is that browsers don't specify the
   character encoding used in form submissions. At some point we need to
   discuss the way forward on this issue)

 Just to be clear, this is an HTML attribute of the form tag. For
 instance:

 <form action=foo ... accept-charset=UTF-8>
  ...
 </form>

 This instructs the browser it should send the content of the form in the
 accepted charset.

Yes, accept-charset could be part of a full solution to this problem, but I 
don't think it is the whole solution.

Are you suggesting that a method could assume its form submissions would 
always be made in utf-8? That would cause problems if a submission was made 
from:
* some other form that didn't have an accept-charset
* some non-browser code that synthesizes http requests
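To illustrate why that assumption is fragile, here is a minimal Python sketch. It is not ZPublisher code, and the helper name decode_assuming_utf8 is hypothetical. A value submitted as ISO-8859-1 is usually not valid UTF-8, so a blanket UTF-8 decode fails for it:

```python
def decode_assuming_utf8(raw: bytes) -> str:
    """Decode a raw form value under the blanket assumption that it is UTF-8."""
    return raw.decode("utf-8")


# A form that honours accept-charset=UTF-8 submits the character as two bytes.
from_utf8_form = "æ".encode("utf-8")
# A form without accept-charset on an ISO-8859-1 page submits it as one byte.
from_latin1_form = "æ".encode("iso-8859-1")

print(decode_assuming_utf8(from_utf8_form))

try:
    decode_assuming_utf8(from_latin1_form)
    latin1_ok = True
except UnicodeDecodeError:
    # The single byte 0xE6 is not a complete UTF-8 sequence.
    latin1_ok = False
print(latin1_ok)
```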

A further problem is that we want this decoding to be performed in ZPublisher, 
but at that point in the publishing process it doesn't know which method is 
going to be called. That means the UTF-8 assumption can't be made independently 
for each method.


One answer to this problem arises when browsers include the charset attribute in 
multipart/form-data POST requests: ZPublisher then knows unambiguously what 
encoding the browser used.
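As a rough sketch of that idea (again not ZPublisher code), the following Python reads the charset parameter from the Content-Type header of one form-data part and decodes the value with it. The function names part_charset and decode_part, and the ISO-8859-1 fallback used when no charset is present, are assumptions made for this example:

```python
from email.message import Message


def part_charset(content_type_header: str, default: str = "iso-8859-1") -> str:
    """Return the charset parameter of a Content-Type header, or a default."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_charset() lower-cases the value and returns the
    # fallback when the header carries no charset parameter.
    return msg.get_content_charset(default)


def decode_part(raw: bytes, content_type_header: str) -> str:
    """Decode one form value using the charset its part declares."""
    return raw.decode(part_charset(content_type_header))


print(decode_part(b"\xc3\xa6", "text/plain; charset=UTF-8"))
print(decode_part(b"\xe6", "text/plain"))
```

The fallback is exactly the guess that the rest of this thread is about: it only matters when the client does not declare a charset.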

Sadly I can't see a nice way to do the same for GET requests.





___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] [ZOPE 2.6 B1] Unicode/locale problems with OFS/dtml/properties.dtml

2002-09-28 Thread Florent Guillaume

  (yes, this sucks. The problem is that browsers don't specify the
  character encoding used in form submissions. At some point we need to
  discuss the way forward on this issue)
 
 There is a standard accept-charset attribute of forms, which says what
 encodings are accepted by the form handler (Zope here). I think we
 should use it and set it to UTF-8 in those cases.

Just to be clear, this is an HTML attribute of the form tag. For
instance:

<form action=foo ... accept-charset=UTF-8>
 ...
</form>

This instructs the browser it should send the content of the form in the
accepted charset. As a default, it is recommended that user agents use
the encoding of the document, but this is not a strict requirement in
HTML4.

Florent

-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87  http://nuxeo.com  mailto:[EMAIL PROTECTED]




RE: [Zope-dev] [ZOPE 2.6 B1] Unicode/locale problems with OFS/dtml/properties.dtml

2002-09-26 Thread Andreas Kostyrka

On Thu, 2002-09-26 at 18:47, Arnar Lundesgaard wrote:
  what do you mean by inserted into the form?
 
 Put there by the dtml-method in a dtml-var statement.
I can second this. With CVS Zope (I ran cvs up just now)
I'm getting a very curious thing:
Displaying .../index_html is OK.
But 
return context.index_html(context,request)
produces broken characters instead of isolatin1 umlauts.
In my case (Konqueror on Linux) it seems that the
text/html;charset=UTF-8 header breaks the page, because the byte values are
correct ISO-8859-1 for the umlauts. This is further confirmed by the fact that
forcing Konq to display iso8859-1 fixes the display.

So how are these Unicode changes supposed to work? Are non-ascii
characters forbidden now? And how do I get UTF-8 text into Zope?

While I'm quite sure that this will help Zope in the Asiatic region, it
seems quite inconvenient for the isolatin1 world :(


 
  That the response contains a single byte where your properties 
  contain a character whose Unicode value is greater than 127?
 
 Yes
 
  How have you checked this? If so, that's a bug.
 
 In a Hex editor...
 
 The character 'æ' for instance is inserted as '0xE6' in the
 returned HTML. 
The same here with umlauts.
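For reference, a small Python sketch of the byte values involved (an illustration only; the helper is_valid_utf8 is a hypothetical name). The single byte 0xE6 is the ISO-8859-1 encoding of 'æ', while UTF-8 would use two bytes, so a response labelled charset=UTF-8 that contains a bare 0xE6 is mislabelled:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ch = "æ"  # U+00E6
latin1_bytes = ch.encode("iso-8859-1")
utf8_bytes = ch.encode("utf-8")

# Hex dump of both encodings, comparable to what a hex editor shows.
print(latin1_bytes.hex(), utf8_bytes.hex())

# The ISO-8859-1 byte on its own is not valid UTF-8 ...
print(is_valid_utf8(latin1_bytes))
# ... and UTF-8 bytes read as ISO-8859-1 turn into two wrong characters.
mojibake = utf8_bytes.decode("iso-8859-1")
print(mojibake)
```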

I've used wget and less on the result to verify that the umlauts display
correctly.

Andreas
-- 
Andreas Kostyrka [EMAIL PROTECTED]





RE: [Zope-dev] [ZOPE 2.6 B1] Unicode/locale problems with OFS/dtml/properties.dtml

2002-09-26 Thread Arnar Lundesgaard

 I can second this. With CVS Zope (I ran cvs up just now)
 I'm getting a very curious thing:
 Displaying .../index_html is OK.
 But 
 return context.index_html(context,request)
 produces broken characters instead of isolatin1 umlauts.
 In my case (Konqueror on Linux) it seems that the
 text/html;charset=UTF-8 header breaks the page, because the byte values are
 correct ISO-8859-1 for the umlauts. This is further confirmed by the fact that
 forcing Konq to display iso8859-1 fixes the display.

Hmm, you may check out http://collector.zope.org/Zope/517 but it could be the same 
difficulties as we experienced earlier.

The problem here was that Zope thought it was returning UTF-8, while it was really 
returning ISO-8859-1. This was due to the dtml-var u'' statement not having the 
desired effect. dtml-var u' ' (notice the space) seemed to work brilliantly.

 So how are these Unicode changes supposed to work? Are non-ascii
 characters forbidden now? And how do I get UTF-8 text into Zope?

There are converters inside Zope. UTF-8 is simply a transport format, although it may 
be used for storage to save space. A lot of software supports UTF-8 today. This is 
the future.

 While I'm quite sure that this will help Zope in the Asiatic region, it
 seems quite inconvenient for the isolatin1 world :(

This will be a win in Europe as well, especially for multilingual sites. IIRC there 
are 15 variants of ISO-8859.

I18N is *very* important, and Unicode is an essential ingredient.

 
  Arnar Lundesgaard










Re: [Zope-dev] [ZOPE 2.6 B1] Unicode/locale problems with OFS/dtml/properties.dtml

2002-09-26 Thread Hajime Nakagami

Hi 

I'm Japanese.
(Sorry, I don't write English well.)

  While I'm quite sure that this will help Zope in the Asiatic region, it
  seems quite inconvenient for the isolatin1 world :(

 This will be a win in Europe as well, especially for multilingual sites. IIRC there 
 are 15 variants of ISO-8859.

 I18N is *very* important, and Unicode is an essential ingredient.

[1] In these 3 files (3 lines in total), change
   encode('latin1') to encode('utf-8'):
$ find . -name '*.py' -exec grep -l 'encode.*latin1' {} \;
./lib/python/ZPublisher/Converters.py
./lib/python/ZPublisher/HTTPRequest.py
./lib/python/ZPublisher/HTTPResponse.py

[2] In this file, change 'iso-8859-1' to 'utf-8' on this line:
lib/python/App/dtml/manage_page_header.dtml
<dtml-call "REQUEST.set('management_page_charset','iso-8859-1')">

These changes seem to work well for me,
but I have not tested them enough.

Is there some reason to treat Unicode strings as 'latin1' (iso-8859-1)
rather than as 'utf-8'?
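The practical difference can be sketched in Python as follows. This is an illustration, not the Zope code that the patch touches, and the helper name can_encode is hypothetical. Encoding to latin1 only succeeds for characters in the range U+0000 to U+00FF, while UTF-8 can encode any Unicode text:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = {
    "German": "Umlaute: äöü",
    "Norwegian": "æøå",
    "Japanese": "日本語",
}

for name, text in samples.items():
    # Print whether each sample survives latin1 and utf-8 encoding.
    print(name, can_encode(text, "latin1"), can_encode(text, "utf-8"))
```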

Additionally,
the Japanese language has several encodings:
euc-jp, shift_jis, iso-2022-jp and utf-8 (and others?).
I want a mechanism to change the encoding dynamically.
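One possible shape for such a mechanism, as a minimal Python sketch and not an existing Zope API: the function name encode_response and the idea of passing the charset per response are assumptions for illustration. The body is encoded with the requested charset and the Content-Type header is built from the same value, so the two cannot disagree:

```python
import codecs
from typing import Tuple


def encode_response(text: str, charset: str) -> Tuple[str, bytes]:
    """Return a Content-Type header value and the body encoded to match it."""
    codecs.lookup(charset)  # raises LookupError for an unknown charset name
    body = text.encode(charset)
    return "text/html; charset=%s" % charset, body


page = "日本語"
for charset in ("euc-jp", "shift_jis", "iso-2022-jp", "utf-8"):
    header, body = encode_response(page, charset)
    # The same text yields different bytes under each encoding.
    print(header, body.hex())
```

The charset could come from a site setting or a per-request property; where it should live in Zope is left open here.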


