Costin Manolache wrote:

Sorry, I forgot there are two meanings of 'xml syntax' :-). I was thinking of
the case where the output is an XML file, with the encoding in its declaration,
but written as a regular JSP. (Well, the patch is not dealing with jspx anyway.)
I was referring to the fact that <?xml encoding="iso-8859-2"?> is treated as
template text, and pageEncoding (or web.xml) takes precedence.
In JSP-XML (jspx) it seems we report an error if the web.xml encoding doesn't
match the <?xml?> encoding. I can't see many use cases for having an explicit
encoding in the XML header and yet reading the file with a different encoding.
In my case the xml header is:
<?xml version="1.0" encoding="OSD_EBCDIC_DF04_1"?> (In EBCDIC...)
Reading the file with ISO-8859-1 encoding only gives garbage.
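
The effect can be reproduced with a small sketch. It is only an illustration: the four bytes are the EBCDIC form of "<?xm" as listed in the encoding autodetection appendix of the XML 1.0 specification, and it is an assumption that an OSD_EBCDIC_DF04_1 file starts with the same bytes. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class EbcdicMisread {
    // "<?xm" in EBCDIC, per the XML 1.0 autodetection table (4C 6F A7 94).
    // Assumption: OSD_EBCDIC_DF04_1 uses the same code points for these characters.
    static final byte[] EBCDIC_PROLOG = {0x4C, 0x6F, (byte) 0xA7, (byte) 0x94};

    // Decode the raw bytes as if the file had been written in ISO-8859-1.
    static String decodeAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = decodeAsLatin1(EBCDIC_PROLOG);
        // The XML declaration is no longer recognizable after the wrong decode.
        System.out.println(text.startsWith("<?xm")); // prints "false"
    }
}
```

Decoded as ISO-8859-1, the four bytes become 'L', 'o', a section sign and a C1 control character, so neither the XML declaration nor any later directive can be recognized.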

But the patch prevents reading the <%@ page pageEncoding="bla" %> directive, so it is bad.

The old code should be improved to use sourceEnc when pageEncoding is not specified, and ISO-8859-1 if neither is specified.
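
A minimal sketch of that fallback order, assuming both values are available as plain strings that are null when absent. It is not the code in ParserController.java; the class and method names are hypothetical, and it ignores any page-encoding configured in web.xml.

```java
public class EncodingFallback {
    static final String DEFAULT_ENC = "ISO-8859-1";

    // pageEncoding: value of the page directive attribute, or null if absent.
    // sourceEnc: encoding detected from the start of the file, or null if unknown.
    static String choose(String pageEncoding, String sourceEnc) {
        if (pageEncoding != null && !pageEncoding.isEmpty()) {
            return pageEncoding;   // an explicit directive always wins
        }
        if (sourceEnc != null && !sourceEnc.isEmpty()) {
            return sourceEnc;      // otherwise fall back to the detected encoding
        }
        return DEFAULT_ENC;        // nothing specified, nothing detected
    }

    public static void main(String[] args) {
        System.out.println(choose("utf8", "ISO-8859-2"));      // prints "utf8"
        System.out.println(choose(null, "OSD_EBCDIC_DF04_1")); // prints "OSD_EBCDIC_DF04_1"
        System.out.println(choose(null, null));                // prints "ISO-8859-1"
    }
}
```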

Cheers

Jean-Frederic


Costin


On 3/17/06, Bill Barker <[EMAIL PROTECTED]> wrote:

-----Original Message-----
From: Costin Manolache [mailto:[EMAIL PROTECTED]
Sent: Friday, March 17, 2006 11:57 AM
To: Tomcat Developers List
Subject: Re: svn commit: r386315 -
/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java

In his example (where both XML and JSP declare encodings), which one
would win?
The patch only affects pages in JSP syntax, so the <?xml ... ?> is just
another piece of template text :).

IMO the XML encoding should win, i.e. if the file uses XML syntax and starts
with <?xml version="1.0" encoding="iso-8859-2" ?>, then the JSP pageEncoding
should be ignored.
If a JSP is written using the XML syntax, it is supposed to follow the XML
rules; there is no exception in the XML spec for JSPs specifying a different
syntax for encoding.

The JSP expert group agrees with you:).  In XML syntax, the XML encoding
should win out over <jsp:directive.page pageEncoding="..." />.

For non-XML JSPs, I think respecting pageEncoding is a must. The JSP reader
must scan the file to find the pageEncoding string, which is not trivial
(there is a reason why XML requires the encoding to be the first thing in the
file, at the top). I wouldn't bet on Jasper implementing it correctly :-)

In JSP syntax, the spec (Appendix D) says that pageEncoding should win (at
least when there is no matching <page-encoding /> in web.xml :).  What the
patch breaks is that with it Jasper won't even look for the pageEncoding
most of the time.

Jasper looks like it does a pretty good job of guessing the encoding to set up
the Reader that scans for the pageEncoding directive.  And JFC seems to agree,
since the patch is to use the guessed encoding rather than the one that was
specified :).
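
The two-step idea in this paragraph (guess an encoding, scan for the directive with it, then honor what the directive says) can be sketched as follows. This is an illustration under stated assumptions, not Jasper's implementation: the regular expression, the class and the method names are hypothetical, and it assumes the directive is readable under the guessed charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PageEncodingScan {
    // Matches <%@ page ... pageEncoding="..." in JSP syntax (hypothetical pattern).
    static final Pattern DIRECTIVE = Pattern.compile(
        "<%@\\s*page\\b[^%]*?\\bpageEncoding\\s*=\\s*[\"']([^\"']+)[\"']");

    // Step 1: decode with the guessed charset and look for the directive.
    // Returns the declared encoding name, or null if no directive is found.
    static String findPageEncoding(byte[] raw, Charset guessed) {
        Matcher m = DIRECTIVE.matcher(new String(raw, guessed));
        return m.find() ? m.group(1) : null;
    }

    // Step 2: decode again with the declared encoding when it is usable.
    static String decode(byte[] raw, Charset guessed) {
        Charset cs = guessed;
        String declared = findPageEncoding(raw, guessed);
        if (declared != null) {
            try {
                cs = Charset.forName(declared);
            } catch (IllegalArgumentException e) {
                // Unknown or illegal charset name: keep the guessed charset.
            }
        }
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        byte[] raw = ("<%@page pageEncoding=\"utf8\"%>\n"
            + "<?xml version=\"1.0\" encoding=\"ISO-8859-2\"?>\n")
            .getBytes(StandardCharsets.ISO_8859_1);
        // The XML declaration is template text; only the directive is consulted.
        System.out.println(findPageEncoding(raw, StandardCharsets.ISO_8859_1)); // prints "utf8"
    }
}
```

The guess only has to be good enough to make the directive readable; if the guess is wrong for an EBCDIC source, step 1 finds nothing and the declared encoding is never applied.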

Costin

On 3/17/06, Bill Barker <[EMAIL PROTECTED]> wrote:

-----Original Message-----
From: Jean-frederic Clere [mailto:[EMAIL PROTECTED]
Sent: Friday, March 17, 2006 4:13 AM
To: Tomcat Developers List
Subject: Re: svn commit: r386315 -
/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java

Bill Barker wrote:



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 16, 2006 3:55 AM
To: tomcat-dev@jakarta.apache.org
Subject: svn commit: r386315 -
/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java

Author: jfclere
Date: Thu Mar 16 03:54:29 2006
New Revision: 386315

URL: http://svn.apache.org/viewcvs?rev=386315&view=rev
Log:
If the encoding is not specified use the detected one.



-1.
If it gets to this point, the detected encoding is *wrong* (e.g.
<?xml version="1.0" encoding="iso-8859-2" ?> in JSP syntax).

Why wrong?
Because the right encoding is the one specified in the
<%@ page pageEncoding="utf8"%>.

+++
Connected to localhost.
Escape character is '^]'.
GET /try1.jsp
<?xml version="1.0" encoding="ISO-8859-2"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+++

This is about pageEncoding, so I don't see the relevance.

I don't have access to an EBCDIC machine to know what the problem is, but
this isn't the fix.  Possibly a better way to guess the encoding of the
Reader?


Thinking about it, the patch is not perfect, but the old code is worse: we
have a piece of code that correctly detects the source encoding and then
destroys it...

However, the old code adheres to the JSP spec, whereas your patch breaks the
JSP spec (Appendix D).  That automatically makes the old code better than
your patch.

In doParse() in ParserController.java the following happens:
parse() is called with pageEnc = sourceEnc
jspConfigPageEnc = null
isDefaultPageEncoding = false.
But the line before, the jspReader uses the sourceEnc to create the
InputStreamReader, so the content of the file is translated to utf-8 when
reading it.
In Validator.java the charset will be set to the detected encoding... In the
example above, iso-8859-2. Bad; for me that will be OSD_EBCDIC_DF04_1.

The only issue is why Jasper can't recognize your
<%@ page pageEncoding="OSD_EBCDIC_DF04_1" %> statement.  That's the part that
I can't figure out (and your patch is masking :).

Cheers

Jean-Frederic



This message is intended only for the use of the person(s)
listed above as the intended recipient(s), and may contain
information that is PRIVILEGED and CONFIDENTIAL.  If you are
not an intended recipient, you may not read, copy, or
distribute this message or any attachment. If you received
this communication in error, please notify us immediately by
e-mail and then delete all copies of this message and any
attachments.
In addition you should be aware that ordinary (unencrypted)
e-mail sent through the Internet is not secure. Do not send
confidential or sensitive information, such as social
security numbers, account numbers, personal identification
numbers and passwords, to us via ordinary (unencrypted) e-mail.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




