Jakub,
Thanks for the note. I appreciate your comments, as they suggest that my
guesses about what was going on were incorrect. I have a couple of
questions and comments, though.
You write:
"Java can perfectly load files as stream of bytes and decide how to
interpret it later."
I agree that you can load a file as a stream of bytes and decide on the
interpretation later. However, it is not clear to me that you could make
this decision based on the contents of the file itself. You must interpret
the stream of bytes in some way in order to understand the portions of the
file that say how to interpret the file. This creates a circular
dependency: you must already be interpreting the file correctly in order
to learn how to interpret it correctly.
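To make the problem concrete, here is a minimal sketch showing that the same
byte reads as a different character depending on the charset chosen before
decoding. The class and method names are hypothetical, made up for this
illustration, and it assumes the JVM ships the ISO-8859-2 charset (common,
but not one of the charsets Java guarantees).

```java
import java.nio.charset.Charset;

class SameBytesTwoReadings {
    // Decode the given bytes with the named charset.
    // Charset.forName throws if the JVM does not support the name.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // One byte, 0xB9, whose meaning depends on the charset chosen.
        byte[] raw = { (byte) 0xB9 };
        String latin1 = decode(raw, "ISO-8859-1");
        String latin2 = decode(raw, "ISO-8859-2");
        // prints "ISO-8859-1: U+00B9" (superscript one)
        System.out.printf("ISO-8859-1: U+%04X%n", (int) latin1.charAt(0));
        // prints "ISO-8859-2: U+0161" (s with caron)
        System.out.printf("ISO-8859-2: U+%04X%n", (int) latin2.charAt(0));
    }
}
```

Nothing in the byte itself says which reading is intended, which is why the
charset has to come from somewhere other than the undecoded data.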
You write:
"If you think that it "must" be like that why do you think there's the tag
<%@ page contentType="text/html; charset= ... %>"
This tag corresponds to the servlet call response.setContentType("text/html").
(If you look at the generated servlets, you can see that this is how the
tag is translated.) What this tag does is tell the browser which encoding
to use to display the data. For the reason given above, it cannot, as far
as I can see, be used to determine how to interpret the file in the first
place.
That said, it seems to me that the portion of the JSP engine responsible
for reading the file could use a default encoding to read the initial
portion of the text. This might work well for the many encodings that
share the same codes for the regular ASCII characters (this would allow
the contentType parameter to be parsed sensibly). The engine may therefore
appear to work for all of these encodings. However, as you noted, use of
Unicode, SJIS, BIG-5, or any other encoding in which the standard ASCII
characters do not correspond to the standard ASCII codes would still
result in errors. I don't see any easy way around this problem (except by
including code to loop through all of the encodings available to Java each
time a JSP file is read in...).
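The default-encoding idea might be sketched roughly as follows. This is only
an illustration under my own assumptions, not how any particular JSP engine
works: the class name, method names, and the simplified regular expression
are all hypothetical. The first pass decodes the bytes as ISO-8859-1 (which
maps every byte to exactly one character) to find the charset declaration,
and the second pass re-decodes the same bytes with the declared charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharsetSniffer {
    // Simplified, hypothetical pattern for charset=NAME inside contentType="...".
    private static final Pattern CHARSET =
        Pattern.compile("contentType\\s*=\\s*\"[^\"]*charset\\s*=\\s*([\\w.:-]+)");

    // First pass: provisional ISO-8859-1 decode, so ASCII text survives
    // for any ASCII-compatible encoding. Returns null if nothing is declared.
    static String sniffCharset(byte[] raw) {
        String provisional = new String(raw, StandardCharsets.ISO_8859_1);
        Matcher m = CHARSET.matcher(provisional);
        return m.find() ? m.group(1) : null;
    }

    // Second pass: re-decode the same bytes with the declared charset,
    // falling back to ISO-8859-1 when no declaration was found.
    // Charset.forName throws if the declared name is unsupported.
    static String decodePage(byte[] raw) {
        String name = sniffCharset(raw);
        Charset cs = (name == null) ? StandardCharsets.ISO_8859_1 : Charset.forName(name);
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        String page = "<%@ page contentType=\"text/html; charset=ISO-8859-2\" %>\n\u0161";

        // ASCII-compatible encoding: the declaration is found and honored.
        byte[] latin2 = page.getBytes(Charset.forName("ISO-8859-2"));
        System.out.println(sniffCharset(latin2));             // prints ISO-8859-2
        System.out.println(decodePage(latin2).equals(page));  // prints true

        // UTF-16: a zero byte sits beside every ASCII character, so the
        // provisional decode never sees a contiguous "contentType".
        byte[] utf16 = page.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(sniffCharset(utf16));               // prints null
    }
}
```

The UTF-16 case illustrates the limitation described above: when the ASCII
characters are not stored as single ASCII bytes, the provisional read cannot
find the declaration at all.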
If anyone has a better understanding of how this works, and of where my
reasoning may be incorrect, I would very much appreciate hearing your
thoughts.
-AMT
> -----Original Message-----
> From: Jakub Murin [mailto:[EMAIL PROTECTED]]
> Sent: Monday, March 06, 2000 6:50 AM
> To: Arun Thomas
> Subject: RE: need help with setting charsets in JSP pages
>
>
> Hello Arun
>
> thanks for writing me. I've been looking for a solution in mailing list
> archives since I posted to that mailing list. The readme file that comes
> with the latest jswdk from Sun says that this version fixes the bug in
> previous versions that weren't able to deal with some of the character
> encodings. However it's not true altogether; as I mentioned it works well
> with iso-8859-1 and with win-1250 but not with iso-8859-2. What you said
> may be true but it's ridiculous - Java can perfectly load files as stream
> of bytes and decide how to interpret it later. The only trouble could be
> Unicode and multibyte character sets because Java only supports UTF-8.
> But I don't think that's the real problem - I even found postings from
> some Chinese programmer that complained that jswdk (probably an old
> version) can't handle some of their encodings, I guess the problem was
> with BIG-5, whereas it works well with another one, actually multibyte
> encoding! He said he'd done a bit of reverse-engineering and found a
> stupid bug in Sun's code. In addition ISO-8859-2, the encoding that I
> want to use, has the first 128 characters the same as ascii and it's an
> 8-bit encoding as 8859-1.
>
> If you think that it "must" be like that why do you think there's the
> tag <%@ page contentType="text/html; charset= ... %> ???
>
> Doesn't matter, I found that there are other jsp engines (Resin,
> PolyJSP) that work correctly with any character encoding recognized by
> Java.
>
> Jakub