[jira] [Updated] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs

2024-01-24 Thread Max (Jira)


 [ 
https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max updated XALANJ-2725:

Attachment: astral-chars-split-buffer.patch

> Possible buffer-boundry issue when serializing surrogate pairs
> --
>
> Key: XALANJ-2725
> URL: https://issues.apache.org/jira/browse/XALANJ-2725
> Project: XalanJ2
>  Issue Type: Improvement
>  Security Level: No security risk; visible to anyone(Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>  Components: Serialization
>Reporter: Joe Kesselman
>Assignee: Joe Kesselman
>Priority: Major
>  Labels: Surrogates, escaping, unicode, utf
> Attachments: astral-chars-split-buffer.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a 
> surrogate pair (two UTF-16 units), were not being serialized correctly. We 
> have a proposed fix for that.
> There is reported to still be an edge case when a surrogate pair which 
> crosses buffer boundaries might not be handled correctly. [~maxfortun] 
> offered what looks like a reasonable proposed fix 
> (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607),
>  but in my testing this was not serializing the surrogate pairs correctly, 
> causing regression on the tests XALANJ-2419 introduced. I don't know whether 
> that's because we're taking multiple paths through
> But the edge case does appear to be real, and if so we will need some such 
> solution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org
For additional commands, e-mail: dev-h...@xalan.apache.org



[jira] [Updated] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs

2024-01-24 Thread Max (Jira)


 [ 
https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max updated XALANJ-2725:

Attachment: (was: astral-chars-split-buffer.patch)

> Possible buffer-boundry issue when serializing surrogate pairs
> --
>
> Key: XALANJ-2725
> URL: https://issues.apache.org/jira/browse/XALANJ-2725
> Project: XalanJ2
>  Issue Type: Improvement
>  Security Level: No security risk; visible to anyone(Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>  Components: Serialization
>Reporter: Joe Kesselman
>Assignee: Joe Kesselman
>Priority: Major
>  Labels: Surrogates, escaping, unicode, utf
> Attachments: astral-chars-split-buffer.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a 
> surrogate pair (two UTF-16 units), were not being serialized correctly. We 
> have a proposed fix for that.
> There is reported to still be an edge case when a surrogate pair which 
> crosses buffer boundaries might not be handled correctly. [~maxfortun] 
> offered what looks like a reasonable proposed fix 
> (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607),
>  but in my testing this was not serializing the surrogate pairs correctly, 
> causing regression on the tests XALANJ-2419 introduced. I don't know whether 
> that's because we're taking multiple paths through
> But the edge case does appear to be real, and if so we will need some such 
> solution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org
For additional commands, e-mail: dev-h...@xalan.apache.org



[jira] [Updated] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs

2024-01-23 Thread Max (Jira)


 [ 
https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max updated XALANJ-2725:

Attachment: astral-chars-split-buffer.patch

> Possible buffer-boundry issue when serializing surrogate pairs
> --
>
> Key: XALANJ-2725
> URL: https://issues.apache.org/jira/browse/XALANJ-2725
> Project: XalanJ2
>  Issue Type: Improvement
>  Security Level: No security risk; visible to anyone(Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>  Components: Serialization
>Reporter: Joe Kesselman
>Assignee: Joe Kesselman
>Priority: Major
>  Labels: Surrogates, escaping, unicode, utf
> Attachments: astral-chars-split-buffer.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a 
> surrogate pair (two UTF-16 units), were not being serialized correctly. We 
> have a proposed fix for that.
> There is reported to still be an edge case when a surrogate pair which 
> crosses buffer boundaries might not be handled correctly. [~maxfortun] 
> offered what looks like a reasonable proposed fix 
> (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607),
>  but in my testing this was not serializing the surrogate pairs correctly, 
> causing regression on the tests XALANJ-2419 introduced. I don't know whether 
> that's because we're taking multiple paths through
> But the edge case does appear to be real, and if so we will need some such 
> solution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org
For additional commands, e-mail: dev-h...@xalan.apache.org