Unfortunately, I'm getting an OutOfMemoryError when using XPath splitting the
way you showed. I'm parsing a file with about 500,000 XML messages.

How can we use Apache Digester instead?
 

Claus Ibsen-2 wrote:
> 
> Hi
> 
> This is as far as I got with the XPath expression for splitting:
> http://svn.apache.org/viewvc?rev=825156&view=rev
> 
> 
> 
> On Wed, Oct 14, 2009 at 4:40 PM, Claus Ibsen <claus.ib...@gmail.com>
> wrote:
>> On Wed, Oct 14, 2009 at 4:21 PM, Claus Ibsen <claus.ib...@gmail.com>
>> wrote:
>>> Hi
>>>
>>> On Wed, Oct 14, 2009 at 4:16 PM, mcarson <mcar...@amsa.com> wrote:
>>>>
>>>> It looks like the scanner might provide me with the capabilities I was
>>>> looking for regarding reading in a file in delimited chunks.  I'm
>>>> assuming I would implement this as a bean... can the bean component be
>>>> used as a "from" in a Camel route?  I'm new to Camel, and I have never
>>>> seen that done.  Is there an example bean (that is a consumer of some
>>>> sort) that I could use to model my code after?
>>>>
>>>
>>> Since you use XPath, I took a dive into how to split big files.
>>> Using InputSource seems to do the trick, as it allows XPath to use SAX
>>> events, which fits with streaming.
>>>
>>> I will work a bit on getting it supported nicely out of the box, and
>>> provide details on how to do it in 2.0.
>>>
>>
>> Ah yeah, the XPath evaluation will still hold at least the whole result
>> in memory.
>>
>> You can only get a result of one of the types listed here:
>> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/XPathConstants.html
>>
>> and none of them is stream based.
>>
>> So even with SAX to parse the big XML file, the XPath expression
>> evaluation will result in all the data being loaded into memory, or at
>> least the NodeList that contains all the split entries.
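
For illustration, here is a minimal sketch of that limitation using the plain
JDK XPath API (this is not the Camel code from the commit linked above). The
class and method names are made up for the example. Even though the input is
handed over as an InputSource, asking for XPathConstants.NODESET returns a
fully built NodeList, so every matched entry is in memory before the first one
can be processed:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class XPathSplitSketch {

    // Evaluates the expression and returns how many nodes matched.
    // The NodeList is complete when evaluate() returns; nothing is streamed.
    static int countMatches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(xml));
        try {
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, source, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<people><person>a</person><person>b</person><person>c</person></people>";
        System.out.println(countMatches(xml, "/people/person")); // prints 3
    }
}
```

With 500,000 entries, that NodeList alone can be enough to exhaust the heap.
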
>>
>> So maybe the Scanner is better if you can do some custom clipping. I
>> believe it's regexp based, so you may be able to find a good regexp that
>> can split on </person> or something.
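
A rough sketch of that idea with java.util.Scanner, using the closing tag as
the delimiter. Everything here is an assumption made for the example: the
class and method names are hypothetical, the records are assumed to be flat
<person> elements without attributes, and the tag text is assumed not to
appear inside the data. It is plain text clipping, not XML parsing:

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class ScannerSplitSketch {

    // Hands each record to the handler as soon as it has been read, so only
    // the current record needs to be held in memory at any time.
    static void forEachRecord(Reader input, String startTag, String endTag,
                              Consumer<String> handler) {
        try (Scanner scanner = new Scanner(input)) {
            // Quote the tag so it is matched literally, not as a regexp.
            scanner.useDelimiter(Pattern.quote(endTag));
            while (scanner.hasNext()) {
                String chunk = scanner.next();
                int start = chunk.indexOf(startTag);
                if (start < 0) {
                    continue; // trailing text such as the closing root tag
                }
                // Drop anything before the record (e.g. the opening root tag)
                // and put back the end tag that the delimiter consumed.
                handler.accept(chunk.substring(start) + endTag);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<people><person>a</person><person>b</person></people>";
        forEachRecord(new StringReader(xml), "<person>", "</person>",
                System.out::println);
    }
}
```

For a real file, the StringReader would be replaced by a buffered file reader.
Note that Scanner swallows IOExceptions during reading; they can be checked
afterwards with ioException().
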
>>
>>>
>>>
>>>>
>>>>
>>>> Claus Ibsen-2 wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> How do you want to split the file?
>>>>> Is there a special character that denotes a new "record"?
>>>>>
>>>>> Using java.util.Scanner is great, as it can do streaming. That is also
>>>>> what Camel can do if, for example, you want to split by new line.
>>>>>
>>>>> --
>>>>> Claus Ibsen
>>>>> Apache Camel Committer
>>>>>
>>>>> Open Source Integration: http://fusesource.com
>>>>> Blog: http://davsclaus.blogspot.com/
>>>>> Twitter: http://twitter.com/davsclaus
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/handling-large-files-tp25826380p25891924.html
>>>> Sent from the Camel - Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/handling-large-files-tp25826380p28005868.html
Sent from the Camel - Users mailing list archive at Nabble.com.