Hi Patrick,

take a look at this example [1]: all you have to do is obtain a ContentHandler instance as shown, then fire SAX events on it while parsing the original document. It's more efficient and consumes less memory than building the whole output in a StringBuffer.

Simo
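P.S. In case the link goes stale, here is a minimal sketch of the idea. It is not the code from [1], just my own illustration using the standard JAXP TransformerHandler (which is a ContentHandler). The class name, the method name, and the element names "items" and "item" are made up for the example; in your case the events would be fired from your Digester callbacks, and the Writer would wrap a file.

```java
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch: stream XML out through SAX events instead of a StringBuffer.
class StreamingXmlSketch {

    // Writes one <item id="...">text</item> element; escaping is done by the serializer.
    static void writeItem(TransformerHandler out, String id, String text) throws SAXException {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "id", "id", "CDATA", id);
        out.startElement("", "item", "item", atts);
        char[] chars = text.toCharArray();
        out.characters(chars, 0, chars.length);
        out.endElement("", "item", "item");
    }

    // Each row of 'items' is {id, text}; the output goes to 'target' as events are fired.
    static void writeItems(Writer target, String[][] items) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        // With no stylesheet, the handler is an identity transformer: events in, XML text out.
        TransformerHandler out = factory.newTransformerHandler();
        out.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        out.setResult(new StreamResult(target));

        out.startDocument();
        out.startElement("", "items", "items", new AttributesImpl());
        for (String[] item : items) {
            writeItem(out, item[0], item[1]);
        }
        out.endElement("", "items", "items");
        out.endDocument();
    }
}
```

Since the serializer writes to the Writer as the events arrive, you never hold the full generated document as one in-memory string, which is the point when the input is tens of megabytes.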
[1] http://www.stylusstudio.com/xmldev/200502/post20440.html

http://people.apache.org/~simonetripodi/
http://www.99soft.org/

On Mon, Mar 28, 2011 at 4:56 PM, Patrick Diviacco <[email protected]> wrote:
> hi!
>
> What should I use instead of StringBuffer?
>
> Any example or tutorial?
>
> thanks
> Patrick
>
> On 28 March 2011 16:53, Simone Tripodi <[email protected]> wrote:
>
>> Hi Patrick,
>> nice to know you quickly fixed the issue before anybody could
>> offer help! :)
>>
>> As a side note, I would suggest you consider a different solution
>> for the XML generation than the StringBuffer: since you're parsing
>> a large dataset, streaming data while parsing would improve
>> performance and reduce memory consumption.
>>
>> Just my 2 cents, have a nice day,
>> Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://www.99soft.org/
>>
>>
>>
>> On Mon, Mar 28, 2011 at 2:28 PM, Patrick Diviacco
>> <[email protected]> wrote:
>> > I've solved it. The issue was a row in the train.xml file. To find it,
>> > I printed the source file rows while processing. However, that was
>> > possible only because the parsing takes 4 minutes.
>> >
>> > I'm wondering how to debug such issues with a much bigger text file.
>> >
>> > thanks
>> >
>> > On 28 March 2011 14:14, Patrick Diviacco <[email protected]> wrote:
>> >
>> >> And these are the files:
>> >>
>> >> http://dl.dropbox.com/u/72686/test.xml
>> >>
>> >> http://dl.dropbox.com/u/72686/train.xml
>> >>
>> >> thanks
>> >>
>> >>
>> >> On 28 March 2011 14:13, Patrick Diviacco <[email protected]> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I have a 74MB XML document and I've split it into 2 docs: 52MB and 22MB
>> >>> respectively.
>> >>>
>> >>> I'm parsing the files using the Commons Digester library, and everything
>> >>> works perfectly for the small file, but I get a NullPointerException
>> >>> with the big one.
>> >>>
>> >>> I don't think the issue is the code because it works for the small file...
>> >>> I guess the problem is with the file itself.
>> >>>
>> >>> I've parsed the files with the same parser, so I don't think the files
>> >>> have issues either.
>> >>>
>> >>> In conclusion I dunno where the issue is. This is the code:
>> >>> http://pastie.org/1726063
>> >>>
>> >>> This is the exception:
>> >>> SEVERE: End event threw exception
>> >>> java.lang.reflect.InvocationTargetException
>> >>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>     at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>     at org.apache.commons.beanutils.MethodUtils.invokeMethod(MethodUtils.java:216)
>> >>>     at org.apache.commons.digester.SetNextRule.end(SetNextRule.java:220)
>> >>>     at org.apache.commons.digester.Rule.end(Rule.java:257)
>> >>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1345)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2938)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
>> >>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
>> >>>     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
>> >>>     at CentroidGenerator.main(CentroidGenerator.java:137)
>> >>> Caused by: java.lang.NullPointerException
>> >>>     at CentroidGenerator.nextItem(CentroidGenerator.java:62)
>> >>>     ... 19 more
>> >>> Exception in thread "main" java.lang.NullPointerException
>> >>>     at org.apache.commons.digester.Digester.createSAXException(Digester.java:3363)
>> >>>     at org.apache.commons.digester.Digester.createSAXException(Digester.java:3389)
>> >>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1348)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2938)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
>> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
>> >>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
>> >>>     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
>> >>>     at CentroidGenerator.main(CentroidGenerator.java:137)
>> >>> Caused by: java.lang.NullPointerException
>> >>>     at CentroidGenerator.nextItem(CentroidGenerator.java:62)
>> >>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>     at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>     at org.apache.commons.beanutils.MethodUtils.invokeMethod(MethodUtils.java:216)
>> >>>     at org.apache.commons.digester.SetNextRule.end(SetNextRule.java:220)
>> >>>     at org.apache.commons.digester.Rule.end(Rule.java:257)
>> >>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1345)
>> >>>     ... 12 more
>> >>>
>> >>> thanks
>> >>>
>> >>
>> >>
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>>
>
