Unfortunately, I'm getting an OutOfMemoryError when using XPath splitting the way you showed. I'm parsing a file with about 500,000 XML messages.
How can we use Apache Digester instead?


Claus Ibsen-2 wrote:
>
> Hi
>
> This is as far as I got with the xpath expression for splitting
> http://svn.apache.org/viewvc?rev=825156&view=rev
>
> On Wed, Oct 14, 2009 at 4:40 PM, Claus Ibsen <claus.ib...@gmail.com> wrote:
>> On Wed, Oct 14, 2009 at 4:21 PM, Claus Ibsen <claus.ib...@gmail.com> wrote:
>>> Hi
>>>
>>> On Wed, Oct 14, 2009 at 4:16 PM, mcarson <mcar...@amsa.com> wrote:
>>>>
>>>> It looks like the scanner might provide me with the capabilities I was
>>>> looking for regarding reading in a file in delimited chunks. I'm assuming I
>>>> would implement this as a bean... can the bean component be used as a "from"
>>>> in a camel route? I'm new to Camel, and I have never seen that done. Is
>>>> there an example bean (that is a consumer of some sort) that I could use to
>>>> model my code after?
>>>>
>>>
>>> Since you use xpath, I took a dive into how to split big files.
>>> Using InputSource seems to do the trick, as it allows xpath to use SAX
>>> events, which fits with streaming.
>>>
>>> I will work a bit to get it supported nicely out of the box, and provide
>>> details on how to do it in 2.0.
>>>
>>
>> Ah yeah, the xpath will still at least hold the whole result in memory.
>>
>> As you can only get a result of the types listed here:
>> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/XPathConstants.html
>>
>> And none of them is stream based.
>>
>> So even with SAX parsing the big xml file, the xpath expression
>> evaluation will result in all data being loaded into memory, or at
>> least the NodeList which contains all the split entries.
>>
>> So maybe the Scanner is better if you can do some custom clipping. I
>> believe it's regexp based, so you may be able to find a good regexp that
>> can split on </person> or something.
>>
>>>
>>>> Claus Ibsen-2 wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> How do you want to split the file?
>>>>> Is there a special character that denotes a new "record"?
>>>>>
>>>>> Using java.util.Scanner is great as it can do streaming. It is also what
>>>>> Camel can do if you, for example, want to split by new line etc.
>>>>>
>>>>> --
>>>>> Claus Ibsen
>>>>> Apache Camel Committer
>>>>>
>>>>> Open Source Integration: http://fusesource.com
>>>>> Blog: http://davsclaus.blogspot.com/
>>>>> Twitter: http://twitter.com/davsclaus
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/handling-large-files-tp25826380p25891924.html
>>>> Sent from the Camel - Users mailing list archive at Nabble.com.
>>>>
>

--
View this message in context: http://old.nabble.com/handling-large-files-tp25826380p28005868.html
Sent from the Camel - Users mailing list archive at Nabble.com.
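
For reference, the java.util.Scanner suggestion quoted above (splitting on </person>) could look roughly like the sketch below. It is plain Java with no Camel or Digester wiring, and it is only an illustration: the class name RecordSplitter, the method split, and the <person> element name are assumptions, not anything from Camel. It assumes Java 8 or later, that records are not nested, and that no other element name starts with "<person". Only one record is handed to the handler at a time, so memory use is bounded by the largest single record rather than by the whole file.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper: streams <person>...</person> records one at a time.
class RecordSplitter {

    // Assumed record terminator; replace with the real closing tag.
    static final String END_TAG = "</person>";
    // Assumed start of a record; a simplification that would also match
    // element names that merely begin with "person".
    static final String START_TAG = "<person";

    /**
     * Reads the source incrementally and passes each complete record to the
     * handler. Returns the number of records handed over.
     */
    static int split(Reader source, Consumer<String> handler) {
        int count = 0;
        try (Scanner scanner = new Scanner(source)) {
            // Quote the tag so it is treated as a literal, not a regexp.
            scanner.useDelimiter(Pattern.quote(END_TAG));
            while (scanner.hasNext()) {
                String chunk = scanner.next();
                int start = chunk.indexOf(START_TAG);
                if (start < 0) {
                    // Text outside any record, e.g. the closing root tag.
                    continue;
                }
                // The delimiter is consumed by the Scanner, so add it back.
                handler.accept(chunk.substring(start) + END_TAG);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<people><person><name>A</name></person>\n"
                + "<person><name>B</name></person></people>";
        List<String> records = new ArrayList<>();
        int n = split(new StringReader(xml), records::add);
        System.out.println(n + " records: " + records);
    }
}
```

For a real file, a java.io.FileReader (or an InputStreamReader with an explicit charset) would be passed in instead of the StringReader, and the handler would process or forward each record instead of collecting them in a list. A truncated last record is not detected by this sketch.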