Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/31/22 12:17, Christopher Schultz wrote:

Greg,

On 3/29/22 13:41, gelo1234 wrote:

Have you also tried HTMLT or XHTMLT Serializers?
Default HTMLSerializer cannot handle some unicode characters: 
https://issues.apache.org/jira/browse/SLING-5973?attachmentOrder=asc 


Hmm. Are the HTMLT / XHTMLT serializers built-in? I have disabled all 
blocks during the build, so I'm just using Cocoon core.


I tried using a view, and it's not perfect but what I ended up with is 
Cocoon dumping-out the originally-generated (from the generator) XML and 
the US flag is already broken.


So it's definitely not being broken by the convoluted pipeline.

I'll try to put together an SSCCE[1]

-chris

[1] http://sscce.org/

On Tue, 29 Mar 2022 at 19:37, gelo1234 wrote:


    Hello Chris,

    I think you will not get any icon-type character on output without
    using proper font rendering - like Emoji support? Emoji might not be
    supported by default in Cocoon.
    So this might be the reason why you get HTML entities instead of
    Emoji-icons.
    Also notice:
    https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html
    

    Greetings,
    Greg



    On Tue, 29 Mar 2022 at 18:36, Christopher Schultz
    <ch...@christopherschultz.net> wrote:

    Cédric,

    On 3/29/22 12:06, Cédric Damioli wrote:
 > Could you provide more details ?
 > How is your XML processed before outputting the wrong UTF-8
    sequence ?

    It's somewhat straightforward:

    
    https://source/ " />

Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Cédric Damioli

Hi,

To help isolate the issue, could you test with a simpler pipeline with 
only generator/single simple XSLT/xml serializer ?
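
A reduced test along these lines can also be run outside Cocoon. The sketch below is only an illustration: the class and method names (`IdentityTransformCheck`, `serialize`, `classify`) are hypothetical, and it uses whatever XSLT processor JAXP resolves by default, which may not be the one Cocoon is configured to use. It pushes a document containing the US flag through an identity transform and reports which form the serializer emitted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Cocoon.
class IdentityTransformCheck {
    // US flag: U+1F1FA U+1F1F8, written as two UTF-16 surrogate pairs.
    static final String FLAG = "\uD83C\uDDFA\uD83C\uDDF8";

    /** Runs the XML through a JAXP identity transform and returns the serialized text. */
    static String serialize(String xml) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Identity transform failed", e);
        }
    }

    /** Reports how the flag appears in already-serialized output. */
    static String classify(String serialized) {
        String lower = serialized.toLowerCase(Locale.ROOT);
        if (serialized.contains(FLAG)) {
            return "raw characters";
        }
        if (lower.contains("&#x1f1fa;&#x1f1f8;") || serialized.contains("&#127482;&#127480;")) {
            return "code point references";
        }
        if (lower.contains("&#xd83c;") || serialized.contains("&#55356;")) {
            return "surrogate halves as references (broken)";
        }
        return "unrecognized";
    }

    public static void main(String[] args) {
        String output = serialize("<flag>" + FLAG + "</flag>");
        System.out.println(output);
        System.out.println(classify(output));
    }
}
```

Either of the first two results would be acceptable output; the third would point at the serializer rather than at the sitemap.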


Cédric

On 31/03/2022 at 17:54, Christopher Schultz wrote:

Cédric,

On 3/29/22 12:52, Cédric Damioli wrote:

Do you use Xalan as XSLT Processor ?
If so, I remember https://issues.apache.org/jira/browse/XALANJ-2617 
which could be a cause of your issue.
I resolved it on my side years ago by compiling my own patched version
of Xalan.

I'm using whatever Cocoon uses natively. For example, I don't throw-in 
Jackson or StaX or whatever other options there are.
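
One way to answer the "which XSLT processor" question is to ask JAXP directly. This is a minimal sketch with a hypothetical class name (`XsltProcessorProbe`); it only reports the factory that `TransformerFactory.newInstance()` resolves in the current classloader, and Cocoon may be configured to pick a different one, so the result is a hint rather than proof.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic, not part of Cocoon.
class XsltProcessorProbe {
    /** Names the TransformerFactory implementation JAXP resolves and where it was loaded from. */
    static String describe() {
        TransformerFactory factory = TransformerFactory.newInstance();
        Class<?> impl = factory.getClass();
        // The code source is typically null for classes bundled with the JDK.
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "(JDK built-in or unknown)"
                : source.getLocation().toString();
        return impl.getName() + " loaded from " + location;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run from inside the web application (for example from a scratch servlet or JSP), this shows whether a bundled Xalan jar or the JDK's built-in processor is being picked up.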


For "markers", you may use labels on your sitemap steps associated 
with a cocoon view.


Yeah, that sounds familiar.

Thanks,
-chris


On 29/03/2022 at 18:36, Christopher Schultz wrote:

Cédric,

On 3/29/22 12:06, Cédric Damioli wrote:

Could you provide more details ?
How is your XML processed before outputting the wrong UTF-8 sequence ?


It's somewhat straightforward:


  https://source/; />

Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/31/22 12:13, Christopher Schultz wrote:

On 3/29/22 13:37, gelo1234 wrote:

Hello Chris,

I think you will not get any icon-type character on output without 
using proper font rendering - like Emoji support? Emoji might not be 
supported by default in Cocoon.


This isn't a font-rendering issue; it's just ... wrong. Either the raw 
character should be output, or the proper set of HTML entities should be 
output. Neither is happening. It's just mojibake somewhere in the pipeline.
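
For reference, here is what "correct" looks like for this particular character. The sketch below uses hypothetical names (`FlagEncodingDemo`, `utf8Hex`, `toCharacterReferences`) and plain JDK calls only. The US flag is two code points (U+1F1FA and U+1F1F8), each of which needs a surrogate pair in UTF-16 and four bytes in UTF-8, so valid output is either those eight bytes or one numeric character reference per code point.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Cocoon.
class FlagEncodingDemo {
    // Regional indicator symbols U and S, which together render as the US flag.
    static final String FLAG =
            new String(Character.toChars(0x1F1FA)) + new String(Character.toChars(0x1F1F8));

    /** One numeric character reference per code point, never one per UTF-16 char. */
    static String toCharacterReferences(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("&#x%X;", cp)));
        return sb.toString();
    }

    /** Space-separated hex dump of the UTF-8 encoding. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("UTF-16 chars: " + FLAG.length());                       // 4
        System.out.println("code points:  " + FLAG.codePointCount(0, FLAG.length())); // 2
        System.out.println("UTF-8 bytes:  " + utf8Hex(FLAG));   // F0 9F 87 BA F0 9F 87 B8
        System.out.println("references:   " + toCharacterReferences(FLAG)); // &#x1F1FA;&#x1F1F8;
    }
}
```

Anything else in the response, such as references to the individual surrogate halves (D83C, DDFA, DDF8) or a different byte sequence, is the kind of breakage described above.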


So this might be the reason why you get HTML entities instead of 
Emoji-icons.
Also notice: 
https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html 


I read that, and was hopeful that 2.1.13 would resolve this issue, but 
it hasn't.


Hmm... strangely, the X-Cocoon-Version header still says 2.1.11. Perhaps 
I didn't upgrade properly...


Yeah, I had Cocoon 2.1.11 as a compile-time dependency which was 
dropping cocoon-2.1.11.jar into the web application along with all the 
other artifacts from the 2.1.13 build. Whoops.
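
A quick check for this kind of leftover jar could look like the sketch below. Everything in it is an assumption for illustration: the class name `DuplicateJarCheck`, the `name-version.jar` naming convention, and the default `WEB-INF/lib` path. It groups jar file names by artifact and reports any artifact that appears with more than one version.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon or any build tool.
class DuplicateJarCheck {
    // Assumes "artifact-version.jar" where the version starts with a digit.
    private static final Pattern JAR = Pattern.compile("^(.+?)-(\\d[\\w.\\-]*)\\.jar$");

    /** Returns artifact name -> versions, only for artifacts present in two or more versions. */
    static Map<String, List<String>> conflictingVersions(Collection<String> jarNames) {
        Map<String, List<String>> byArtifact = new TreeMap<>();
        for (String name : jarNames) {
            Matcher m = JAR.matcher(name);
            if (m.matches()) {
                byArtifact.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(m.group(2));
            }
        }
        // Drop artifacts that have only a single distinct version.
        byArtifact.values().removeIf(versions -> new HashSet<>(versions).size() < 2);
        return byArtifact;
    }

    public static void main(String[] args) {
        File lib = new File(args.length > 0 ? args[0] : "WEB-INF/lib");
        String[] names = lib.list();
        if (names == null) {
            System.err.println("Not a readable directory: " + lib);
            return;
        }
        conflictingVersions(Arrays.asList(names))
                .forEach((artifact, versions) -> System.out.println(artifact + " -> " + versions));
    }
}
```

Pointed at the deployed `WEB-INF/lib`, it would flag a directory holding both `cocoon-2.1.11.jar` and a 2.1.13 jar, assuming the newer jar follows the same naming pattern.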


I got that all fixed-up, but the behavior is still the same. I was 
pretty hopeful that was the only thing missing.


-chris

On Tue, 29 Mar 2022 at 18:36, Christopher Schultz
<ch...@christopherschultz.net> wrote:


    Cédric,

    On 3/29/22 12:06, Cédric Damioli wrote:
 > Could you provide more details ?
 > How is your XML processed before outputting the wrong UTF-8
    sequence ?

    It's somewhat straightforward:

    
    https://source/ " />

Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/29/22 13:41, gelo1234 wrote:

Have you also tried HTMLT or XHTMLT Serializers?
Default HTMLSerializer cannot handle some unicode characters: 
https://issues.apache.org/jira/browse/SLING-5973?attachmentOrder=asc 


Hmm. Are the HTMLT / XHTMLT serializers built-in? I have disabled all 
blocks during the build, so I'm just using Cocoon core.


Thanks,
-chris

On Tue, 29 Mar 2022 at 19:37, gelo1234 wrote:


Hello Chris,

I think you will not get any icon-type character on output without
using proper font rendering - like Emoji support? Emoji might not be
supported by default in Cocoon.
So this might be the reason why you get HTML entities instead of
Emoji-icons.
Also notice:
https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html


Greetings,
Greg



On Tue, 29 Mar 2022 at 18:36, Christopher Schultz
<ch...@christopherschultz.net> wrote:

Cédric,

On 3/29/22 12:06, Cédric Damioli wrote:
 > Could you provide more details ?
 > How is your XML processed before outputting the wrong UTF-8
sequence ?

It's somewhat straightforward:


    https://source/ " />

-
To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org
For additional commands, e-mail: users-h...@cocoon.apache.org


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/29/22 13:37, gelo1234 wrote:

Hello Chris,

I think you will not get any icon-type character on output without using 
proper font rendering - like Emoji support? Emoji might not be supported 
by default in Cocoon.


This isn't a font-rendering issue; it's just ... wrong. Either the raw 
character should be output, or the proper set of HTML entities should be 
output. Neither is happening. It's just mojibake somewhere in the pipeline.


So this might be the reason why you get HTML entities instead of 
Emoji-icons.
Also notice: 
https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html 


I read that, and was hopeful that 2.1.13 would resolve this issue, but 
it hasn't.


Hmm... strangely, the X-Cocoon-Version header still says 2.1.11. Perhaps 
I didn't upgrade properly...


Thanks,
-chris

On Tue, 29 Mar 2022 at 18:36, Christopher Schultz
<ch...@christopherschultz.net> wrote:


Cédric,

On 3/29/22 12:06, Cédric Damioli wrote:
 > Could you provide more details ?
 > How is your XML processed before outputting the wrong UTF-8
sequence ?

It's somewhat straightforward:


    https://source/ " />


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Cédric,

On 3/29/22 12:52, Cédric Damioli wrote:

Do you use Xalan as XSLT Processor ?
If so, I remember https://issues.apache.org/jira/browse/XALANJ-2617 
which could be a cause of your issue.
I resolved it on my side years ago by compiling my own patched version
of Xalan.

I'm using whatever Cocoon uses natively. For example, I don't throw-in 
Jackson or StaX or whatever other options there are.


For "markers", you may use labels on your sitemap steps associated with 
a cocoon view.


Yeah, that sounds familiar.

Thanks,
-chris


On 29/03/2022 at 18:36, Christopher Schultz wrote:

Cédric,

On 3/29/22 12:06, Cédric Damioli wrote:

Could you provide more details ?
How is your XML processed before outputting the wrong UTF-8 sequence ?


It's somewhat straightforward:


  https://source/; />