[ https://issues.apache.org/jira/browse/FOP-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kelly H Wilkerson updated FOP-2920:
-----------------------------------
    Description:
fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java writeBFCharEntries runs through the codepoint entries in sections of 100 at a time. It looks like there's an edge case here where the last entry in the section is a surrogate pair.

Here's my steps to reproduce from the latest trunk:

{{java -cp fop/target/fop-2.5.0-SNAPSHOT.jar:fop/lib/commons-logging-1.0.4.jar:fop/lib/commons-io-1.3.1.jar:fop/lib/xmlgraphics-commons-svn-trunk.jar org.apache.fop.fonts.apps.TTFReader TwitterColorEmoji-SVGinOT.ttf twe.xml}}

{{java -cp fop/target/fop-2.5.0-SNAPSHOT.jar:fop/lib/commons-logging-1.0.4.jar:fop/lib/commons-io-1.3.1.jar:fop/lib/xmlgraphics-commons-svn-trunk.jar:fop/lib/batik-all-1.11.0-SNAPSHOT.jar org.apache.fop.cli.Main -c twe_userconfig.xml -xsl twe_template.fo -xml fail.xml -pdf fail.pdf}}

Here's the temporary way I resolved it for my own build:

{{index ee773dcec..37c21803e 100644}}
{{--- a/fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java}}
{{+++ b/fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java}}
{{@@ -128,6 +128,18 @@ public class PDFToUnicodeCMap extends PDFCMap {}}
{{ while (partOfRange(charArray, charIndex)) {}}
{{ charIndex++;}}
{{ }}}
{{/*}}
{{ * If this entry is going to overflow the entriesThisSection}}
{{ * array, then don't use it. This happens if there are}}
{{ * non-pair entries in the table mixed with pair entries.}}
{{ */}}
{{ if (Character.codePointAt(charArray, charIndex) > 0xFFFF}}
{{ && i+1 >= entriesThisSection) {}}
{{ entriesThisSection--;}}
{{ break;}}
{{ }}}
{{writer.write("<" + padCharIndex(charIndex) + "> ");}}
{{if (Character.codePointAt(charArray, charIndex) > 0xFFFF) {}}

  was:
fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java writeBFCharEntries runs through the codepoint entries in sections of 100 at a time. It looks like there's an edge case here where the last entry in the section is a surrogate pair.

Here's my steps to reproduce from the latest trunk:

{{java -cp fop/target/fop-2.5.0-SNAPSHOT.jar:fop/lib/commons-logging-1.0.4.jar:fop/lib/commons-io-1.3.1.jar:fop/lib/xmlgraphics-commons-svn-trunk.jar org.apache.fop.fonts.apps.TTFReader TwitterColorEmoji-SVGinOT.ttf twe.xml}}

{{java -cp fop/target/fop-2.5.0-SNAPSHOT.jar:fop/lib/commons-logging-1.0.4.jar:fop/lib/commons-io-1.3.1.jar:fop/lib/xmlgraphics-commons-svn-trunk.jar:fop/lib/batik-all-1.11.0-SNAPSHOT.jar org.apache.fop.cli.Main -c twe_userconfig.xml -xsl twe_template.fo -xml fail.xml -pdf fail.pdf}}

Here's the temporary way I resolved it for my own build:

{{--- a/fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java}}
{{+++ b/fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java}}
{{@@ -128,6 +128,18 @@ public class PDFToUnicodeCMap extends PDFCMap {}}
{{ while (partOfRange(charArray, charIndex)) {}}
{{ charIndex++;}}
{{ }}}
{{+}}
{{+ /*}}
{{+ * If this entry is going to overflow the entriesThisSection}}
{{+ * array, then don't use it. This happens if there are}}
{{+ * non-pair entries in the table mixed with pair entries.}}
{{+ */}}
{{+ if (Character.codePointAt(charArray, charIndex) > 0xFFFF}}
{{+ && i+1 >= entriesThisSection) {}}
{{+ entriesThisSection--;}}
{{+ break;}}
{{+ }}}
{{+}}
{{ writer.write("<" + padCharIndex(charIndex) + "> ");}}
{{ if (Character.codePointAt(charArray, charIndex) > 0xFFFF) {}}


> Surrogate pair edge-case causes java.lang.ArrayIndexOutOfBoundsException
> ------------------------------------------------------------------------
>
>                 Key: FOP-2920
>                 URL: https://issues.apache.org/jira/browse/FOP-2920
>             Project: FOP
>          Issue Type: Bug
>          Components: renderer/pdf
>    Affects Versions: trunk
>         Environment: macOS Mojave, java 11
> java version "1.8.0_192-ea"
> Java(TM) SE Runtime Environment (build 1.8.0_192-ea-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b04, mixed mode)
>            Reporter: Kelly H Wilkerson
>            Priority: Minor
>         Attachments: TwitterColorEmoji-SVGinOT.ttf, fail.xml,
> twe_template.fo, twe_userconfig.xml
>
>
> fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java
> writeBFCharEntries runs through the codepoint entries in sections of 100 at a
> time. It looks like there's an edge case here where the last entry in the
> section is a surrogate pair.
>
> Here's my steps to reproduce from the latest trunk:
> {{java -cp
> fop/target/fop-2.5.0-SNAPSHOT.jar:fop/lib/commons-logging-1.0.4.jar:fop/lib/commons-io-1.3.1.jar:fop/lib/xmlgraphics-commons-svn-trunk.jar
> org.apache.fop.fonts.apps.TTFReader TwitterColorEmoji-SVGinOT.ttf twe.xml}}
> {{java -cp
> fop/target/fop-2.5.0-SNAPSHOT.jar:fop/lib/commons-logging-1.0.4.jar:fop/lib/commons-io-1.3.1.jar:fop/lib/xmlgraphics-commons-svn-trunk.jar:fop/lib/batik-all-1.11.0-SNAPSHOT.jar
> org.apache.fop.cli.Main -c twe_userconfig.xml -xsl twe_template.fo -xml
> fail.xml -pdf fail.pdf}}
>
> Here's the temporary way I resolved it for my own build:
>
>
> {{index ee773dcec..37c21803e 100644}}
> {{--- a/fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java}}
> {{+++ b/fop-core/src/main/java/org/apache/fop/pdf/PDFToUnicodeCMap.java}}
> {{@@ -128,6 +128,18 @@ public class PDFToUnicodeCMap extends PDFCMap {}}
> {{ while (partOfRange(charArray, charIndex)) {}}
> {{ charIndex++;}}
> {{ }}}
> {{/*}}
> {{ * If this entry is going to overflow the entriesThisSection}}
> {{ * array, then don't use it. This happens if there are}}
> {{ * non-pair entries in the table mixed with pair entries.}}
> {{ */}}
> {{ if (Character.codePointAt(charArray, charIndex) > 0xFFFF}}
> {{ && i+1 >= entriesThisSection) {}}
> {{ entriesThisSection--;}}
> {{ break;}}
> {{ }}}
> {{writer.write("<" + padCharIndex(charIndex) + "> ");}}
> {{if (Character.codePointAt(charArray, charIndex) > 0xFFFF) {}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
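
For readers unfamiliar with the edge case in the report, here is a minimal standalone sketch of the general idea behind the guard in the patch: a code point above 0xFFFF occupies two chars, so a fixed-size section must end early if such an entry would otherwise straddle its boundary. This is not FOP code. The class name `SurrogateSections`, the method `sectionSizes`, and the `maxUnits` parameter are hypothetical names made up for illustration, and the real writeBFCharEntries loop counts its entries differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical names, not part of FOP.
class SurrogateSections {

    /**
     * Splits chars into consecutive sections of at most maxUnits chars
     * and returns the size of each section. A surrogate pair is never
     * split across two sections: if it does not fit, the section ends early.
     */
    static List<Integer> sectionSizes(char[] chars, int maxUnits) {
        if (maxUnits < 2) {
            // A surrogate pair needs two chars; a smaller limit could never fit one.
            throw new IllegalArgumentException("maxUnits must be at least 2");
        }
        List<Integer> sizes = new ArrayList<>();
        int index = 0;
        while (index < chars.length) {
            int used = 0;
            while (index < chars.length) {
                // 1 for BMP code points, 2 for code points above 0xFFFF.
                int width = Character.charCount(Character.codePointAt(chars, index));
                if (used + width > maxUnits) {
                    break; // entry would overflow this section, defer it to the next one
                }
                used += width;
                index += width;
            }
            sizes.add(used);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // "ab", then U+1F600 (a surrogate pair), then "c": five chars in total.
        String text = "ab" + new String(Character.toChars(0x1F600)) + "c";
        System.out.println(sectionSizes(text.toCharArray(), 3));
    }
}
```

With a limit of 3 chars per section, the pair does not fit after "ab", so the first section closes with 2 chars and the pair opens the next one; the program prints `[2, 3]`. The patch in the report does the analogous thing by decrementing entriesThisSection and breaking out of the loop.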