Hi Patrick, good to know you fixed the issue so quickly, before anybody had a chance to help! :)
As a side note, I would suggest considering a different approach to the XML generation than the StringBuffer: since you're parsing a large dataset, streaming the output while parsing would improve performance and reduce memory consumption.

Just my 2 cents, have a nice day,
Simo

http://people.apache.org/~simonetripodi/
http://www.99soft.org/

On Mon, Mar 28, 2011 at 2:28 PM, Patrick Diviacco <[email protected]> wrote:
> I've solved it. The issue was a row in the train.xml file. To find it, I
> printed the source file rows while processing. However, that was possible
> only because the parsing takes 4 minutes.
>
> I'm wondering how to debug such issues with a much bigger text file.
>
> thanks
>
> On 28 March 2011 14:14, Patrick Diviacco <[email protected]> wrote:
>
>> And these are the files:
>>
>> http://dl.dropbox.com/u/72686/test.xml
>>
>> http://dl.dropbox.com/u/72686/train.xml
>>
>> thanks
>>
>>
>> On 28 March 2011 14:13, Patrick Diviacco <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have a 74MB XML document and I've split it into 2 docs: 52MB and 22MB
>>> respectively.
>>>
>>> I'm parsing the files using the Commons Digester library, and everything
>>> works perfectly for the small file, but I get a NullPointerException with
>>> the big one.
>>>
>>> I don't think the issue is the code because it works for the small file...
>>> I guess the problem is with the file itself.
>>>
>>> I've parsed the files with the same parser, so I don't think the files
>>> have issues either.
>>>
>>> In conclusion I dunno where the issue is.
>>>
>>> This is the code:
>>> http://pastie.org/1726063
>>>
>>> This is the exception:
>>> SEVERE: End event threw exception
>>> java.lang.reflect.InvocationTargetException
>>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.commons.beanutils.MethodUtils.invokeMethod(MethodUtils.java:216)
>>>     at org.apache.commons.digester.SetNextRule.end(SetNextRule.java:220)
>>>     at org.apache.commons.digester.Rule.end(Rule.java:257)
>>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1345)
>>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2938)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
>>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
>>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
>>>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
>>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
>>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
>>>     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
>>>     at CentroidGenerator.main(CentroidGenerator.java:137)
>>> Caused by: java.lang.NullPointerException
>>>     at CentroidGenerator.nextItem(CentroidGenerator.java:62)
>>>     ... 19 more
>>> Exception in thread "main" java.lang.NullPointerException
>>>     at org.apache.commons.digester.Digester.createSAXException(Digester.java:3363)
>>>     at org.apache.commons.digester.Digester.createSAXException(Digester.java:3389)
>>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1348)
>>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2938)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
>>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
>>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
>>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
>>>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
>>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
>>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
>>>     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
>>>     at CentroidGenerator.main(CentroidGenerator.java:137)
>>> Caused by: java.lang.NullPointerException
>>>     at CentroidGenerator.nextItem(CentroidGenerator.java:62)
>>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.commons.beanutils.MethodUtils.invokeMethod(MethodUtils.java:216)
>>>     at org.apache.commons.digester.SetNextRule.end(SetNextRule.java:220)
>>>     at org.apache.commons.digester.Rule.end(Rule.java:257)
>>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1345)
>>>     ... 12 more
>>>
>>> thanks
>>>
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
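
To make the streaming suggestion in the reply above concrete, here is a minimal sketch of what writing XML incrementally could look like with the JDK's StAX `XMLStreamWriter` instead of accumulating everything in a StringBuffer. The class name `StreamingXmlSketch`, the method `writeCentroids`, and the element names `centroids` and `centroid` are hypothetical, chosen only for illustration; the actual structure of the pastie code is not known here.

```java
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Arrays;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StreamingXmlSketch {

    // Writes one <centroid> element per item directly to the given Writer,
    // so the full document never has to be held in memory as one string.
    // Element names are hypothetical placeholders.
    static void writeCentroids(Iterable<String> items, Writer out)
            throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("centroids");
        for (String item : items) {
            xml.writeStartElement("centroid");
            xml.writeCharacters(item); // escapes &, < and > in the text
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close(); // flushes the XML writer; does not close the underlying Writer
    }

    public static void main(String[] args) throws Exception {
        // Demo only: stream to standard output. In practice the Writer would
        // typically be a BufferedWriter over the output file.
        Writer out = new OutputStreamWriter(System.out, "UTF-8");
        writeCentroids(Arrays.asList("a & b", "c"), out);
        out.flush();
    }
}
```

In a Digester-based parser, the same idea would mean emitting each element from the callback that receives a parsed item (for example `nextItem`), rather than appending to a buffer and writing it out at the end.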
