[ https://issues.apache.org/jira/browse/TIKA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Slava G updated TIKA-2727:
--------------------------
    Attachment: 1_6e4b115e-7d2d-45f1-a842-35b5ad7ba559

> Parsing and detect mime type of XML file stuck in infinite loop
> ---------------------------------------------------------------
>
>                 Key: TIKA-2727
>                 URL: https://issues.apache.org/jira/browse/TIKA-2727
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>    Affects Versions: 1.17
>            Reporter: Slava G
>            Assignee: Tim Allison
>            Priority: Major
>             Fix For: 1.19, 2.0.0
>
>         Attachments: 1_6e4b115e-7d2d-45f1-a842-35b5ad7ba559,
>                      1_e3e13f0e-7085-4000-a558-5d255ed7a944.xml
>
>
> Hi,
> I'm trying to parse (or even just detect the MIME type of) an XML file that
> is not large but is somewhat tricky, and my process hangs at:
>
> XMLStringBuffer.append(char[], int, int) line: not available
> XMLStringBuffer.append(XMLString) line: not available
> XMLNSDocumentScannerImpl(XMLScanner).scanAttributeValue(XMLString, XMLString, String, boolean, String) line: not available
> XMLNSDocumentScannerImpl.scanAttribute(XMLAttributesImpl) line: not available
> XMLNSDocumentScannerImpl.scanStartElement() line: not available
> XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook() line: not available
> XMLNSDocumentScannerImpl$NSContentDispatcher(XMLDocumentFragmentScannerImpl$FragmentContentDispatcher).dispatch(boolean) line: not available
> XMLNSDocumentScannerImpl(XMLDocumentFragmentScannerImpl).scanDocument(boolean) line: not available
> XIncludeAwareParserConfiguration(XML11Configuration).parse(boolean) line: not available
> XIncludeAwareParserConfiguration(XML11Configuration).parse(XMLInputSource) line: not available
> SAXParserImpl$JAXPSAXParser(XMLParser).parse(XMLInputSource) line: not available
> SAXParserImpl$JAXPSAXParser(AbstractSAXParser).parse(InputSource) line: not available
> SAXParserImpl$JAXPSAXParser.parse(InputSource) line: not available
> SAXParserImpl.parse(InputSource, DefaultHandler) line: not available
> SAXParserImpl(SAXParser).parse(InputStream, DefaultHandler) line: 195
> XmlRootExtractor.extractRootElement(InputStream) line: 62
> XmlRootExtractor.extractRootElement(byte[]) line: 42
> MimeTypes.getMimeType(byte[]) line: 212
> MimeTypes.detect(InputStream, Metadata) line: 494
> DefaultDetector(CompositeDetector).detect(InputStream, Metadata) line: 84
>
> Please see the attached XML file.
> Please advise.
> Thanks

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
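
The lower frames of the trace above show MIME detection passing a byte array to a SAX parser in order to read the root element, with the scanner still inside the root element's attribute value when the process hangs. As a rough illustration only (this is not Tika's code, and the class `RootElementProbe` and its methods are hypothetical names), the following JDK-only sketch performs a similar root-element probe and bounds the caller's wait with a timeout. It assumes the caller can treat a timed-out probe as "unknown type". Note that a parser stuck in a CPU-bound loop may not react to interruption, so the worker thread is a daemon thread and may keep running after the caller has moved on.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Tika: probes the root element of an XML
// document and gives up waiting after a caller-supplied timeout.
public class RootElementProbe {

    // Records the first element seen, then aborts the parse with an exception.
    private static final class RootHandler extends DefaultHandler {
        String rootName;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            rootName = qName;
            throw new SAXException("stop after root element");
        }
    }

    // Returns the root element's qualified name, or null if none could be read.
    public static String extractRoot(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        SAXParser parser = factory.newSAXParser();
        RootHandler handler = new RootHandler();
        try {
            parser.parse(in, handler);
        } catch (SAXException stopOrMalformed) {
            // Either our deliberate abort or a parse error; rootName tells which.
        }
        return handler.rootName;
    }

    // Runs the probe on a daemon thread; returns null if it does not finish in time.
    public static String extractRootWithTimeout(byte[] data, long timeoutMillis)
            throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "xml-root-probe");
            t.setDaemon(true); // a stuck parse must not keep the JVM alive
            return t;
        });
        try {
            Callable<String> task = () -> extractRoot(new ByteArrayInputStream(data));
            Future<String> result = pool.submit(task);
            try {
                return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                result.cancel(true); // best effort; the parser may ignore interrupts
                return null;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<?xml version=\"1.0\"?><root attr=\"v\"><child/></root>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(extractRootWithTimeout(xml, 2000));
    }
}
```

In this sketch the handler cannot stop a parse that never reaches `startElement`, which is the situation the trace suggests, so the timeout is what protects the caller. Whether this is an acceptable workaround on an affected version depends on how many stuck threads the application can tolerate.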