On 3/17/06, Jean-frederic Clere <[EMAIL PROTECTED]> wrote:
>
> Costin Manolache wrote:
>
> >Sorry, I forgot there are 2 meanings of 'xml syntax' :-). I was thinking
> >of the case where the output is an xml file - with the encoding in the
> >declaration - but written as a regular jsp. (Well, the patch is not
> >dealing with jspx anyway.)
> >I was referring to the fact that <?xml encoding="iso-8859-2"?> is treated
> >as template text, and pageEncoding (or web.xml) takes precedence.
> >In jsp-xml (jspx) it seems we report an error if the web.xml encoding
> >doesn't match the <?xml?> encoding. I can't see many use cases for having
> >an explicit encoding in the xml header and yet reading the file with a
> >different encoding.
> >
> >
> In my case the xml header is:
> <?xml version="1.0" encoding="OSD_EBCDIC_DF04_1"?> (In EBCDIC...)
> Reading the file with ISO-8859-1 encoding only gives garbage.
>
> But the patch prevents reading the <%@page pageEncoding="bla" %>, so it
> is bad.



Yes, the patch is bad - but what would be a good patch?

- if pageEncoding is not specified but the document starts with
<?xml encoding=...?> - use the xml encoding
- if pageEncoding is specified and so is <?xml encoding?> - report an error
(like jspx does), or a warning, or choose the xml encoding
- leave the current behavior - use pageEncoding only, defaulting to 8859-1.

<?xml encoding?> is probably more widely used and supported (i.e. more
'standard' :-) than jsp pageEncoding.
The jsp spec is clear that the last option should be used - but having 2
conflicting encodings is a source of problems,
and if we can't follow the 'higher' standard, we can at least warn.
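
To make the options above concrete, here is a minimal Java sketch of the decision. It is not Jasper code: the class EncodingChoice, its Policy enum and both method names are hypothetical, and the regular expression assumes the start of the page has already been decoded well enough to read the declaration (which is exactly the hard part on EBCDIC).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper illustrating the three options; not part of Jasper. */
final class EncodingChoice {

    /** How to treat an <?xml encoding?> declaration in a JSP-syntax page. */
    enum Policy {
        IGNORE_XML,  // current behavior: pageEncoding only, else the default
        PREFER_XML,  // xml declaration wins over a conflicting pageEncoding
        ERROR        // conflicting declarations are rejected (like jspx)
    }

    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Matches an XML declaration at the very start of the decoded text
    // and captures the value of its encoding pseudo-attribute.
    private static final Pattern XML_DECL = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the encoding named in a leading XML declaration, if any. */
    static Optional<String> xmlDeclEncoding(String head) {
        Matcher m = XML_DECL.matcher(head);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /**
     * Picks the encoding to use. Either argument may be null when the
     * corresponding declaration is absent.
     */
    static String resolve(String pageEncoding, String xmlEncoding, Policy policy) {
        if (policy == Policy.IGNORE_XML || xmlEncoding == null) {
            return pageEncoding != null ? pageEncoding : DEFAULT_ENCODING;
        }
        if (pageEncoding == null || pageEncoding.equalsIgnoreCase(xmlEncoding)) {
            return xmlEncoding;
        }
        if (policy == Policy.ERROR) {
            throw new IllegalStateException("pageEncoding '" + pageEncoding
                    + "' conflicts with xml encoding '" + xmlEncoding + "'");
        }
        return xmlEncoding; // PREFER_XML
    }
}
```

With IGNORE_XML this reproduces what the spec asks for; the other two policies only differ when both declarations are present and disagree. A "warn" variant would log at the point where ERROR throws and then return pageEncoding.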

Well - not a big deal, but encodings tend to be a headache area for many
people, in particular when different parts of the system have different
'standards' and defaults plus autodetection (in the browser, http, html,
xml, or jsp).

Costin


> The old code should be improved to use the sourceEnc when the
> pageEncoding is not specified, and ISO-8859-1 if neither is specified.
>
> Cheers
>
> Jean-Frederic
>
> >
> >Costin
> >
> >
> >On 3/17/06, Bill Barker <[EMAIL PROTECTED]> wrote:
> >
> >
> >>
> >>
> >>
> >>>-----Original Message-----
> >>>From: Costin Manolache [mailto:[EMAIL PROTECTED]
> >>>Sent: Friday, March 17, 2006 11:57 AM
> >>>To: Tomcat Developers List
> >>>Subject: Re: svn commit: r386315 -
> >>>/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java
> >>>
> >>>In his example ( where both XML and JSP declare encodings ) - which one
> >>>would win ?
> >>>
> >>>
> >>The patch only affects pages in JSP syntax, so the <?xml ... ?> is just
> >>another piece of template text :).
> >>
> >>
> >>
> >>>IMO the XML encoding should win, i.e. if the file uses xml syntax and
> >>>starts with <?xml version="1.0" encoding="iso-8859-2" ?>, then jsp
> >>>pageEncoding should be ignored.
> >>>If a jsp is written using the XML syntax, it is supposed to follow the
> >>>XML rules - there is no exception in the XML spec for jsps specifying
> >>>their own, different syntax for encoding.
> >>>
> >>>
> >>>
> >>The JSP expert group agrees with you:).  In XML syntax, the XML encoding
> >>should win out over <jsp:directive.page pageEncoding="..." />.
> >>
> >>
> >>
> >>>For non-XML jsps - I think respecting pageEncoding is a must. The jsp
> >>>reader must scan the file to find the pageEncoding string, which is not
> >>>trivial (there is a reason why XML requires the encoding to be the first
> >>>thing in the file, at the top). I wouldn't bet on jasper implementing it
> >>>correctly :-)
> >>>
> >>>
> >>>
> >>In JSP syntax, the spec (Appendix D) says that pageEncoding should win (at
> >>least when there is no matching <page-encoding /> in web.xml :).  What the
> >>patch breaks is that with it Jasper won't even look for the pageEncoding
> >>most of the time.
> >>
> >>Jasper looks like it does a pretty good job of guessing to set up the
> >>Reader that scans for the pageEncoding directive.  And JFC seems to agree,
> >>since the patch is to use the guessed encoding rather than the one that
> >>was specified :).
> >>
> >>
> >>
> >>>Costin
> >>>
> >>>On 3/17/06, Bill Barker <[EMAIL PROTECTED]> wrote:
> >>>
> >>>
> >>>>
> >>>>
> >>>>
> >>>>>-----Original Message-----
> >>>>>From: Jean-frederic Clere [mailto:[EMAIL PROTECTED]
> >>>>>Sent: Friday, March 17, 2006 4:13 AM
> >>>>>To: Tomcat Developers List
> >>>>>Subject: Re: svn commit: r386315 -
> >>>>>/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java
> >>>>>
> >>>>>Bill Barker wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>-----Original Message-----
> >>>>>>>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> >>>>>>>Sent: Thursday, March 16, 2006 3:55 AM
> >>>>>>>To: tomcat-dev@jakarta.apache.org
> >>>>>>>Subject: svn commit: r386315 -
> >>>>>>>/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java
> >>>>>>>
> >>>>>>>Author: jfclere
> >>>>>>>Date: Thu Mar 16 03:54:29 2006
> >>>>>>>New Revision: 386315
> >>>>>>>
> >>>>>>>URL: http://svn.apache.org/viewcvs?rev=386315&view=rev
> >>>>>>>Log:
> >>>>>>>If the encoding is not specified use the detected one.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>-1.
> >>>>>>If it gets to this point, the detected encoding is *wrong* (e.g.
> >>>>>><?xml version="1.0" encoding="iso-8859-2" ?> in JSP syntax).
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>Why wrong?
> >>>>>
> >>>>>
> >>>>Because the right encoding is the one specified in the <%@page
> >>>>pageEncoding="utf8"%>.
> >>>>
> >>>>
> >>>>
> >>>>>+++
> >>>>>Connected to localhost.
> >>>>>Escape character is '^]'.
> >>>>>GET /try1.jsp
> >>>>><?xml version="1.0" encoding="ISO-8859-2"?>
> >>>>><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> >>>>>   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> >>>>>+++
> >>>>>
> >>>>>
> >>>>>
> >>>>This is about pageEncoding, so I don't see the relevance.
> >>>>
> >>>>
> >>>>
> >>>>>>I don't have access to an EBCDIC machine to know what the problem is,
> >>>>>>but this isn't the fix.  Possibly a better way to guess the encoding
> >>>>>>of the Reader?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>Thinking about it, the patch is not perfect, but the old code is worse:
> >>>>>we have a piece of code that correctly detects the source encoding and
> >>>>>then destroys it...
> >>>>>
> >>>>>
> >>>>>
> >>>>However, the old code adheres to the JSP spec, whereas your patch breaks
> >>>>the JSP spec (Appendix D).  That automatically makes the old code better
> >>>>than your patch.
> >>>>
> >>>>
> >>>>
> >>>>>In doParse() in ParserController.java the following happens:
> >>>>>parse() is called with pageEnc = sourceEnc
> >>>>>jspConfigPageEnc = null
> >>>>>isDefaultPageEncoding = false.
> >>>>>But on the line before, the jspReader uses the sourceEnc to create the
> >>>>>InputStreamReader, so the content of the file is translated to utf-8
> >>>>>when reading it.
> >>>>>In Validator.java the charset will be set to the detected encoding... In
> >>>>>the example above, iso-8859-2. Bad for me: that will be
> >>>>>OSD_EBCDIC_DF04_1.
> >>>>>
> >>>>>
> >>>>>
> >>>>The only issue is why Jasper can't recognize your <%@page
> >>>>pageEncoding="OSD_EBCDIC_DF04_1" %> statement.  That's the part that I
> >>>>can't figure out (and your patch is masking :).
> >>>>
> >>>>
> >>>>
> >>>>>Cheers
> >>>>>
> >>>>>Jean-Frederic
> >>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>>This message is intended only for the use of the person(s) listed
> >>>>>>above as the intended recipient(s), and may contain information that
> >>>>>>is PRIVILEGED and CONFIDENTIAL.  If you are not an intended recipient,
> >>>>>>you may not read, copy, or distribute this message or any attachment.
> >>>>>>If you received this communication in error, please notify us
> >>>>>>immediately by e-mail and then delete all copies of this message and
> >>>>>>any attachments.
> >>>>>>
> >>>>>>In addition you should be aware that ordinary (unencrypted) e-mail
> >>>>>>sent through the Internet is not secure. Do not send confidential or
> >>>>>>sensitive information, such as social security numbers, account
> >>>>>>numbers, personal identification numbers and passwords, to us via
> >>>>>>ordinary (unencrypted) e-mail.
> >>>>>>
> >>>>>>---------------------------------------------------------------------
> >>>>>>To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>>>>>For additional commands, e-mail: [EMAIL PROTECTED]
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>
>
>