On 15/04/2021 09:03, Jean-Louis MONTEIRO wrote:
I've got an answer from JSTL team.
Here it is

<snip>

    1. Jakarta Tags Specification Section 7.4 details the <c:import> tag:
    c:import
    <https://github.com/eclipse-ee4j/jstl-api/blob/master/spec/src/main/asciidoc/jakarta-stl.adoc#74-cimport>

Within this, it details the following: Character Encoding : The
<c:import/> for import-encoded.txt does not include an encoding so the
default encoding is used: ISO-8859-1

</snip>

At this point from a Jakarta Tags perspective, I believe the golden file
is correct.

Thanks for passing that on.

The Default Servlet improvements were written from the perspective of including static content with a variety of encodings, where the correct encoding was not always known (or where maintaining an accurate mapping of encoding to resource would impose a significant overhead).

JSTL is coming from the perspective that the encoding of the included target is always known.

Currently Tomcat provides the 'useBomIfPresent' option to control the BoM handling. The current values are:

- true  - BoM is stripped if present and any BoM found is used to
          determine the encoding used to read the resource. This is the
          default.

- false - BoM is stripped if present and the resource is read using the
          configured file encoding (which will be the platform default
          if not explicitly configured)

If we wanted to address this and give JSTL the control over the included content that it needs to pass this TCK test, we could modify 'useBomIfPresent' as follows:

- true   - no change - remains the default

- false  - no change

- ignore - as the current false, but the BoM is not stripped from the output

This would have no impact on existing users, but using the new 'ignore' option would allow the JSTL TCK to pass.
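
To make the three values concrete, here is a minimal sketch of the semantics described above. It is not the Default Servlet code: the class, the BomMode enum and the read() helper are hypothetical names made up for illustration, and only UTF-8 and UTF-16 BoMs are checked to keep it short.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomModeSketch {

    // Hypothetical enum standing in for the three proposed 'useBomIfPresent' values.
    enum BomMode { TRUE, FALSE, IGNORE }

    // Charset implied by a leading BoM, or null if there is none.
    // Assumption: only UTF-8 and UTF-16 BoMs are checked in this sketch.
    static Charset bomCharset(byte[] b) {
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(b, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(b, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;
        return null;
    }

    static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // Decode a static resource according to the selected mode.
    static String read(byte[] content, BomMode mode, Charset fileEncoding) {
        Charset fromBom = bomCharset(content);
        if (mode == BomMode.IGNORE || fromBom == null) {
            // ignore (or no BoM): every byte is decoded with the configured encoding.
            return new String(content, fileEncoding);
        }
        // true and false both strip the BoM; they differ only in the charset used.
        int bomLength = fromBom.equals(StandardCharsets.UTF_8) ? 3 : 2;
        byte[] body = Arrays.copyOfRange(content, bomLength, content.length);
        Charset charset = (mode == BomMode.TRUE) ? fromBom : fileEncoding;
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // A UTF-8 BoM followed by the ASCII text "hi".
        byte[] resource = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i' };
        for (BomMode mode : BomMode.values()) {
            String text = read(resource, mode, StandardCharsets.ISO_8859_1);
            System.out.println(mode + " -> " + text.length() + " chars");
        }
    }
}
```

With ISO-8859-1 as the configured encoding, TRUE and FALSE both give the two characters "hi", while IGNORE gives five characters because the three BoM bytes are passed through as content.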

I do wonder if this use case has any real world consequences. For that to be the case there would need to be an application where:
- JSTL was importing static resources
- the content of the static resource started with the same bytes as a
  valid BoM

That seems unlikely as the BoM values look to have been chosen to avoid this. While it is (very?) unlikely, it isn't impossible so I'm not against this change. Normally, I'd worry about regressions but the test case coverage is good in this area.
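
As a rough illustration of why such content is unlikely, the snippet below decodes the common BoM byte sequences as ISO-8859-1 (the JSTL default mentioned above) and prints the resulting code points. The class and helper names are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class BomAsLatin1 {

    // Decode raw byte values as ISO-8859-1, where each byte maps to the
    // code point with the same numeric value.
    static String asLatin1(int... bytes) {
        byte[] raw = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            raw[i] = (byte) bytes[i];
        }
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Render a string as a space-separated list of U+XXXX code points.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 BoM:    " + codePoints(asLatin1(0xEF, 0xBB, 0xBF)));
        System.out.println("UTF-16BE BoM: " + codePoints(asLatin1(0xFE, 0xFF)));
        System.out.println("UTF-16LE BoM: " + codePoints(asLatin1(0xFF, 0xFE)));
    }
}
```

A static resource would have to begin with exactly one of these character sequences (U+00EF U+00BB U+00BF, U+00FE U+00FF or U+00FF U+00FE) for the difference between 'false' and 'ignore' to matter.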

Any objections to implementing this? Thoughts on a better solution?

Mark