[ https://issues.apache.org/jira/browse/XERCESJ-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912876#comment-17912876 ]
Elliotte Rusty Harold commented on XERCESJ-1768:
------------------------------------------------

I think the bug is in this line in skipSpaces:

{{ } while (XMLChar.isSpace(c = fCurrentEntity.ch[fCurrentEntity.position]));}}

If we have a very long run of spaces that exceeds the buffer length, we run right off the end. This condition also needs to check for the length of fCurrentEntity.ch. E.g. something like

{{ } while (fCurrentEntity.position < fCurrentEntity.ch.length && XMLChar.isSpace(c = fCurrentEntity.ch[fCurrentEntity.position]));}}

A unit test can prove or disprove this hypothesis. If anyone sends a patch, please make sure you write a unit test that exposes the bug *first* so we're sure this is really the problem.

> ArrayIndexOutOfBoundsException when parsing a file
> --------------------------------------------------
>
>                 Key: XERCESJ-1768
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1768
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: SAX
>    Affects Versions: 2.12.2
>         Environment: Windows / jdk-17.0.13.11
>            Reporter: Julien De Murcia
>            Priority: Major
>
> I am using Apache Xerces v2.12.2 (packaged with jdk-17.0.13.11-hotspot from
> Eclipse Temurin on Windows) to parse an XML file with SAX.
> On one particular xml file, I get this exception:
> java.lang.ArrayIndexOutOfBoundsException: Index 8192 out of bounds for length 8192
>     at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipSpaces(XMLEntityScanner.java:1503) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$TrailingMiscDriver.next(XMLDocumentScannerImpl.java:1374) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:542) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:889) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:825) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1224) ~[?:?]
>     at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:637) ~[?:?]
>
> So far the exception has only occurred with one particular file. Unfortunately it
> is confidential, so I cannot share it. The file size is 200 MB (rather a small
> file by our standards).
> If I remove a large part at the beginning of the file, it is parsed without
> error. The same goes if I remove a large part at the end of the file, so
> it does not seem to be caused by a single line of the XML.
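
For anyone who wants to try the hypothesis from the comment outside the parser, here is a minimal standalone sketch. It is not Xerces code: the class SkipSpacesSketch, the isSpace helper (a stand-in for XMLChar.isSpace) and both skip methods are hypothetical names made up for illustration, and the buffer is a plain char array that is never refilled. It only shows that a do/while of the quoted shape reads one past the end of a buffer filled with spaces, and that putting the bounds check before the array read avoids that read.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the loop discussed above; not the real scanner.
final class SkipSpacesSketch {

    // Stand-in for XMLChar.isSpace: the four XML 1.0 whitespace characters.
    static boolean isSpace(int c) {
        return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
    }

    // Same shape as the quoted condition: the array is read with no bounds check.
    static int skipUnguarded(char[] ch, int position) {
        do {
            position++;
        } while (isSpace(ch[position]));
        return position;
    }

    // Proposed shape: && short-circuits, so ch[position] is only read when in range.
    static int skipGuarded(char[] ch, int position) {
        do {
            position++;
        } while (position < ch.length && isSpace(ch[position]));
        return position;
    }

    public static void main(String[] args) {
        // A buffer of the size seen in the report, holding nothing but spaces.
        char[] buffer = new char[8192];
        Arrays.fill(buffer, ' ');

        System.out.println("guarded stops at position " + skipGuarded(buffer, 0));

        try {
            skipUnguarded(buffer, 0);
            System.out.println("unguarded returned normally");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded threw: " + e.getMessage());
        }
    }
}
```

The guarded version only stops the out-of-range read. Whether a real scanner should compare against the array length or against the number of characters actually loaded, and how it should refill the buffer and keep skipping, is not modelled here and would need to be settled by a unit test against the actual code.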