Julian Sedding created OAK-4473:
-----------------------------------
Summary: MarkSweepGarbageCollector#saveBatchToFile should escape IDs
Key: OAK-4473
URL: https://issues.apache.org/jira/browse/OAK-4473
Project: Jackrabbit Oak
Issue Type: Bug
Components: core
Affects Versions: 1.2.16, 1.5.3, 1.4.3, 1.0.31
Reporter: Julian Sedding
Assignee: Julian Sedding
Datastore garbage collection (DS GC) can fail if it encounters IDs containing
backslashes. This can happen, for example, when a file is uploaded and its
absolute (Windows) path is mistakenly stored as the file name.
This is because IDs are written to temporary files and then sorted. The sorting
algorithm assumes the lines to be escaped and throws an exception otherwise.
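To illustrate the mismatch, here is a minimal, self-contained sketch of a line escaping scheme of the kind the sorter appears to expect (backslash, CR and LF escaped with a leading backslash). The class name LineEscapeSketch and its methods are hypothetical and are not Oak's EscapeUtils API; the error wording only loosely mirrors the message in the log. It shows that unescaping a raw ID with Windows-style backslashes fails, while an ID that was escaped before being written round-trips cleanly.

```java
public class LineEscapeSketch {

    /** Escapes backslash, CR and LF so the value always fits on a single line. */
    public static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\r': sb.append("\\r"); break;
                case '\n': sb.append("\\n"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverses escape(); rejects any backslash not followed by '\\', 'r' or 'n'. */
    public static String unescape(String line) {
        StringBuilder sb = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != '\\') {
                sb.append(c);
                continue;
            }
            if (i + 1 >= line.length()) {
                throw new IllegalArgumentException(
                        "Dangling backslash at " + i + " of [" + line + "]");
            }
            char next = line.charAt(++i);
            switch (next) {
                case '\\': sb.append('\\'); break;
                case 'r':  sb.append('\r'); break;
                case 'n':  sb.append('\n'); break;
                default:
                    throw new IllegalArgumentException(
                            "Unexpected char [" + next + "] found at " + i + " of [" + line + "]");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The ID from the log above, written as a Java string literal.
        String id = "92c3bcd2270655a9c911bec9f7a4851860f05c79#553941,"
                + "/content/dam/\\\\MAPPED_DRIVE\\JOHN$\\ABC.pdf";

        // Reading the raw ID as if it were escaped fails at the first "\J".
        try {
            unescape(id);
            System.out.println("raw ID was accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("raw ID rejected: " + e.getMessage());
        }

        // Escaping before writing makes the line safe to read back.
        String written = escape(id);
        System.out.println("round trip ok: " + unescape(written).equals(id));
    }
}
```

Under this assumption, the fix would be for the code that writes the IDs to escape each ID before it reaches the temporary file, so that the reader's unescape step sees only valid escape sequences.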
{noformat}
org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector Blob garbage collection error
java.lang.IllegalArgumentException: Unexpected char [J] found at 78 of [92c3bcd2270655a9c911bec9f7a4851860f05c79#553941,/content/dam/\\MAPPED_DRIVE\JOHN$\ABC.pdf]. Expected '\' or 'r' or 'n
    at org.apache.jackrabbit.oak.commons.sort.EscapeUtils.unescape(EscapeUtils.java:126)
    at org.apache.jackrabbit.oak.commons.sort.EscapeUtils.unescapeLineBreaks(EscapeUtils.java:51)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.readLine(ExternalSort.java:633)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:204)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:257)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:159)
    at org.apache.jackrabbit.oak.plugins.blob.GarbageCollectorFileState.sort(GarbageCollectorFileState.java:147)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.iterateNodeTree(MarkSweepGarbageCollector.java:538)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.mark(MarkSweepGarbageCollector.java:278)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.markAndSweep(MarkSweepGarbageCollector.java:248)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.collectGarbage(MarkSweepGarbageCollector.java:163)
    at org.apache.jackrabbit.oak.plugins.blob.BlobGC$1.call(BlobGC.java:87)
    at org.apache.jackrabbit.oak.plugins.blob.BlobGC$1.call(BlobGC.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}