[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346957#comment-16346957
 ] 

Julian Reschke commented on OAK-5506:
-------------------------------------

With respect to compatibility with segment-tar storage written by previous 
versions: my understanding is that any string value (be it an item name or a 
string property) persisted to segment-tar will have gone through 
{{String.getBytes(UTF8)}}. For a string that doesn't round-trip through UTF-8 
(for example, one containing an unpaired surrogate), this means that the 
offending characters will already have been replaced by a replacement 
character (here: "?") at write time. So the persisted state *does* represent 
valid UTF-8, and reading it back simply yields that replacement character.
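
To illustrate the replacement behaviour described above, here is a minimal, standalone sketch using only the JDK. The class and method names are made up for this example and are not Oak code; the encode/decode pair merely stands in for a write/read cycle.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Oak.
public class Utf8RoundTrip {

    // Encode to UTF-8 and decode again, as a stand-in for persisting
    // a string and reading it back.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True if the string survives the encode/decode cycle unchanged.
    static boolean roundTrips(String s) {
        return s.equals(roundTrip(s));
    }

    public static void main(String[] args) {
        String bad = "foo\ud800"; // ends with an unpaired high surrogate

        // String.getBytes(Charset) substitutes the charset's replacement
        // bytes for malformed input; for UTF-8 that is a single '?' (63).
        byte[] bytes = bad.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(bytes)); // [102, 111, 111, 63]

        System.out.println(roundTrip(bad));  // foo?
        System.out.println(roundTrips(bad)); // false
    }
}
```

The encoded bytes are valid UTF-8, which matches the point above: what ends up on disk is well-formed, it just no longer equals the original string.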

The proposed patch only affects the writing of strings that are invalid, so 
garbage written by previous versions shouldn't be an issue.
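
For reference, the kind of check that would catch such strings before they are written could look roughly like the sketch below. This is not the code from the attached patches; the class and method names are hypothetical, and only standard {{java.lang.Character}} methods are used.

```java
// Hypothetical helper, not taken from the OAK-5506 patches.
public class SurrogateCheck {

    // Returns true if the string contains a high surrogate that is not
    // followed by a low surrogate, or a low surrogate that is not
    // preceded by a high surrogate.
    static boolean hasUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // skip the low half of a well-formed pair
                } else {
                    return true; // high surrogate without a partner
                }
            } else if (Character.isLowSurrogate(c)) {
                return true; // low surrogate without a preceding high one
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasUnpairedSurrogate("foo\ud800"));    // true
        System.out.println(hasUnpairedSurrogate("foo"));          // false
        System.out.println(hasUnpairedSurrogate("\ud83d\ude00")); // false
    }
}
```

A writer could call such a check on a name and reject it when it returns true, leaving already persisted (and therefore already valid) strings untouched.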

> reject item names with unpaired surrogates early
> ------------------------------------------------
>
>                 Key: OAK-5506
>                 URL: https://issues.apache.org/jira/browse/OAK-5506
>             Project: Jackrabbit Oak
>          Issue Type: Wish
>          Components: core, jcr, segment-tar
>    Affects Versions: 1.5.18
>            Reporter: Julian Reschke
>            Assignee: Francesco Mari
>            Priority: Minor
>             Fix For: 1.10
>
>         Attachments: OAK-5506-01.patch, OAK-5506-02.patch, OAK-5506-4.diff, 
> OAK-5506-bench.diff, OAK-5506-name-conversion.diff, OAK-5506-segment.diff, 
> OAK-5506-segment2.diff, OAK-5506.diff, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>    {{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not exist anymore
>     at org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
>     at org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
>     at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
>     at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
>     at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
>     at org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
>     at org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
>     at org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
>     at org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
