On Sun, Nov 17, 2002 at 03:17:13 -0500, Arie Folger wrote:
> On Sunday 17 November 2002 13:46, Ilya Konstantinov wrote:
> > On Sun, Nov 17, 2002 at 10:12:39AM +0200, [EMAIL PROTECTED] wrote:
> > > Has your input come from Mozilla? It does that. To make sure, write a cgi
> > > script (if you don't trust PHP) that displays its input as text/plain,
> > > and create a form in UTF8 that sends to that script.
> >
> > Actually, both IE5 and Mozilla will encode characters which aren't
> > present in the charset of the page which contains the HTML form, as
> > Unicode "entities" (e.g. &#blah;).
> <snip>
> > To avoid this behavior, simply make this page, which contains the
> > HTML form, in any Unicode encoding -- UTF-8, UTF-7 or UCS-2 (yuck!).
> 
> But I stated in my first email that I patched phpnuke to do utf-8. The first 
> thing I did was to change the charset= attribute. I even patched the xml 
> container (standard feature of xhtml) because it also specifies 
> a charset, but there is no difference.
> 
> I must say that I now tested the site with konqueror (see below why I am using 
> galeon for this site) and indeed it is a mozilla unexpected (to me) behaviour 
> which caused the translation of utf8 to numbered entities. I think I will 
> write a maintenance script that will access the database directly and change 
> numbered entities back to utf8, easier than changing parts of the php code 
> which I didn't research sufficiently. Is there a way to disable this 
> translation "feature" of mozilla and ie?

[NOTE: I ASSUME THAT THE DISCUSSION IS TOTALLY OFF-TOPIC]

I do not think that Mozilla translates UTF-8 to HTML entities.  I work
constantly with UTF-8-based applications (which I've written myself)
that submit data via HTTP and store it in a database.  Working with
Mozilla (1.0 and later) and MS SQL, PostgreSQL or Oracle, the
"real"/"correct" UTF-8 is what ends up in the database.  I assume that
the problem is related to how PHP handles the request from your browser.
Mozilla probably doesn't state the character set of the submitted data
during form submission and Konqueror does, hence the difference between
them.  Now, when PHP doesn't have a character set for the submitted
data, it probably assumes some default (say, iso-8859-1) and escapes
every out-of-range character as an HTML entity.
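
To show the kind of escaping I am guessing at, here is a small Python
sketch.  It is only an illustration, not what PHP or any browser
actually runs, and the function name is made up for the example.  It
encodes text into a given charset and turns every character that does
not fit into a numeric entity:

```python
def encode_with_entities(text, charset="iso-8859-1"):
    """Encode text into charset; characters the charset cannot
    represent are replaced by decimal numeric entities (&#NNNN;)."""
    return text.encode(charset, errors="xmlcharrefreplace")


# Hebrew letters do not exist in iso-8859-1, so they come out as entities.
print(encode_with_entities("שלום"))
# With a Unicode charset nothing needs escaping.
print(encode_with_entities("שלום", "utf-8"))
```

With iso-8859-1 the four Hebrew letters become &#1513;&#1500;&#1493;&#1501;,
while with utf-8 they stay ordinary UTF-8 bytes.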

The conclusion is that you need to persuade PHP that the submitted
data is encoded in UTF-8.  I don't know exactly how to do it.  Maybe
disabling magic quotes or related settings will help...
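
If you do go for the maintenance script, the conversion itself is
simple.  Below is a minimal sketch in Python, assuming the text has
already been read from the database as a string; the function name is
hypothetical and the database access is left out entirely.  It touches
only numeric entities (decimal and hex) and leaves named ones such as
&amp; alone, since those may be intentional markup:

```python
import re

# Matches &#1513; (decimal) and &#x5e9; (hexadecimal) references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_entities(text):
    """Replace numeric character references with the characters
    they stand for; anything invalid is left exactly as it was."""

    def _replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Surrogate code points cannot be stored as UTF-8, so skip them.
        if 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Number is outside the Unicode range.
            return match.group(0)

    return _NUMERIC_REF.sub(_replace, text)


print(decode_numeric_entities("&#1513;&#1500;&#1493;&#1501; &amp; more"))
```

Run it on a copy of the data first; any text that deliberately
contains a literal numeric entity would be converted as well.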

> The whole matter is complicated by the fact that konqueror will crash on large 
> texts pasted into textarea boxes of forms, and I am posting long papers 
> (20-30 pages), so I have been using galeon.
> 
> Arie
> -- 
> It is absurd to seek to give an account of the matter to a man 
> who cannot himself give an account of anything; for insofar as
> he is already like this, such a man is no better than a vegetable.
>            -- Book IV of Aristotle's Metaphysics

-- 
==========================================================
#                 Andre E. Bar'yudin                     #
#       Phone: (972)-54-882-026       ICQ: 48036924      #
#     Home page: http://www.cs.huji.ac.il/~baryudin/     #
==========================================================
