[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2018-01-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334475#comment-16334475
 ] 

Michael Dürig commented on OAK-5506:


[~reschke], I think the JCR layer is the right place to fix this. From quickly 
looking at the patch the approach seems sound. I cannot say what impact it will 
have on other areas (e.g. performance) though.

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Wish
>  Components: core, jcr, segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
>Priority: Minor
>  Labels: patch-available
> Fix For: 1.10
>
> Attachments: OAK-5506-01.patch, OAK-5506-02.patch, 
> OAK-5506-name-conversion.diff, OAK-5506.diff, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2018-01-22 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334452#comment-16334452
 ] 

Julian Reschke commented on OAK-5506:
-

[^OAK-5506.diff] contains the updated change introducing name validation on the 
JCR layer.

[~mduerig], [~frm], [~mreutegg] - WDYT?

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Wish
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
>Priority: Minor
>  Labels: patch-available
> Fix For: 1.10
>
> Attachments: OAK-5506-01.patch, OAK-5506-02.patch, 
> OAK-5506-name-conversion.diff, OAK-5506.diff, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-02-06 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854299#comment-15854299
 ] 

Julian Reschke commented on OAK-5506:
-

Throw an exception?

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8
>
> Attachments: OAK-5506-01.patch, OAK-5506-02.patch, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-02-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854263#comment-15854263
 ] 

Michael Dürig commented on OAK-5506:


And what do we do if garbage is detected? 

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8
>
> Attachments: OAK-5506-01.patch, OAK-5506-02.patch, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-02-06 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854212#comment-15854212
 ] 

Julian Reschke commented on OAK-5506:
-

Segment store already does the conversion to UTF-8 anyway, so I believe it's 
worthwhile to make that detect garbage -- independently of other considerations.

That said, 

 contains interesting information. On my system, I've been able to match the 
performance of {{getBytes}} by changing {{encode}} to:

{noformat}
ThreadLocal cse = new ThreadLocal() {
@Override
protected CharsetEncoder initialValue() {
CharsetEncoder e = Charsets.UTF_8.newEncoder();
e.onUnmappableCharacter(CodingErrorAction.REPORT);
e.onMalformedInput(CodingErrorAction.REPORT);
return e;
}
};

private static byte[] bytes(ByteBuffer b) {
byte[] a = new byte[b.remaining()];
b.get(a);
return a;
}

private byte[] encode(String in) throws IOException {
CharsetEncoder e = cse.get();
e.reset();
return bytes(e.encode(CharBuffer.wrap(in.toCharArray(;
}
{noformat}

Note that the more important part might be to {{CharBuffer.wrap()}} a char 
array instead of the source string, as counter-intuitively that may sound.

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8
>
> Attachments: OAK-5506-01.patch, OAK-5506-02.patch, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837931#comment-15837931
 ] 

Francesco Mari commented on OAK-5506:
-

To be honest I'm a bit reluctant in adding this logic to the Segment Store. 
This is basically user input validation, which shouldn't be a concern of the 
Segment Store. Probably this stuff should better be added to a {{Validator}}.

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837790#comment-15837790
 ] 

Michael Dürig commented on OAK-5506:


[~reschke], if so then yes. But shouldn't we detect them further up the stack 
where we iterate over the characters anyway? Even if this is only supposed to 
be a temporary solution until the name validation discussion is sorted out, why 
not put the work around where it makes more sense? 

Re. performance: let's just keep an eye on it and run the benchmarks to see how 
much this costs. My experience is that more than a few percent slowdown will 
escalate. 

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837784#comment-15837784
 ] 

Julian Reschke commented on OAK-5506:
-

{{o.a.j.o.segment.SegmentWriter.SegmentWriteOperation#writeString}} doesn't 
seem to be involved (set a breakpoint, didn't get there).

FWIW; this (or something like this) is the code that would need to be added:
{noformat}
private static void checkValidString(String s) throws IOException {
for (int i = 0; i < s.length(); i++) {
char c1 = s.charAt(i);
if (Character.isSurrogate(c1)) {
try {
char c2 = s.charAt(i + 1);
if (Character.isSurrogatePair(c1, c2)) {
// proceed
i += 1;
} else {
throw new IOException("Invalid surrogate pair sequence: 
" + (int) c1 + " " + (int) c2);
}
} catch (IndexOutOfBoundsException ex) {
throw new IOException("String ends in unpaired surrogate 
character.", ex);
}
}
}
}
{noformat}

So, in general a single pass checking every char in the string.

[~mduerig]: agreed, but if we want to reject these values, then we'll have to 
detect them, right?

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837613#comment-15837613
 ] 

Michael Dürig commented on OAK-5506:


Please be sure to look at the performance impact of such a change (i.e. run the 
benchmarks). Name parsing and validation has always been on the critical path. 

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837604#comment-15837604
 ] 

Francesco Mari commented on OAK-5506:
-

It all boils down to the methods 
{{o.a.j.o.segment.SegmentWriter.SegmentWriteOperation#writeString}} when 
writing and {{o.a.j.o.segment.Segment#readString}} when reading. These methods 
are used for node names, property names and property values.

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837557#comment-15837557
 ] 

Julian Reschke commented on OAK-5506:
-

I agree that, optimally, all persistence implementations accept the same node 
names. We should discuss this in the context of supported characters in node 
names.

(FWIW, the same issue may affect string property values)

In the meantime I would suggest that the segment store does it's own sanity 
check - we can remove this later on. The sanity check can be implemented with a 
single scan over the node name - if you can point me to the right code location 
for that I can give it a try.

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5506) Segment store apparently doesn't round trip node names with unpaired surrogates

2017-01-25 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837537#comment-15837537
 ] 

Francesco Mari commented on OAK-5506:
-

[~reschke], after some debugging session in the Segment Store, I figured out 
that there is no bad behaviour from that side. The failing test can be reduced 
to the following lines, because this is all the Segment Store is actually doing 
to that string. This test also matches the values that you could observe while 
debugging.

{noformat}
String in = "foo\ud800";
byte[] raw = in.getBytes(Charsets.UTF_8);
String out = new String(raw, Charsets.UTF_8);
assertEquals(in, out);
{noformat}

In other words, the Segment Store expects valid UTF-8 strings and returns UTF-8 
strings, following a garbage in, garbage out policy. Shouldn't this malformed 
string be caught further up the stack?

> Segment store apparently doesn't round trip node names with unpaired 
> surrogates
> ---
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Attachments: ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)