On Mon, Jan 25, 2010 at 4:22 PM, Kevin Jackson <[email protected]> wrote:
> Hi,
>
>> The JDK offers a java.util.Scanner which allows you to split a stream
>> on-the-fly. Camel leverages this scanner under the covers as well.
>>
>> For example, suppose you want to split a 700 MB file per line. Then you
>> can use the Camel splitter and have it tokenize on \n, which
>> should leverage that Scanner under the covers. You can also enable the
>> streaming mode of the Splitter, which should prevent reading the 700 MB
>> into memory.
>>
>> So enabling streaming and having the big message split by the
>> Scanner should allow you to do this with low memory usage.
>>
>>
>> It's the createIterator method on ObjectHelper that the Camel splitter
>> will use if you use body().tokenize("\n") as the split
>> expression.
>
> Is this also the case when you use the POJO splitting method? I
> assumed so, as it made the most sense, so I have
> followed the example in the SplitterPOJOTest.
>
> For the Java DSL:
> from("direct:start").split().method("mySplitter",
>     "splitBody").streaming().to("mock:result");
>
> The equivalent Spring XML:
> <route>
>   <from uri="direct:start"/>
>   <split streaming="true">
>     <bean ref="mySplitter" method="splitBody"/>
>     <to uri="mock:result"/>
>   </split>
> </route>
>
> Given that I cannot tokenize on a simple \n
How do you want to split your file? Can you split it on some sort of
token, which is what the Scanner requires?
You can use a POJO to return a String value, which will then be used by
the Scanner.
For example, the splitBody method in the sample above could be defined as:
    public String splitBody() {
        return "\n";
    }
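
To illustrate what splitting on a token with the Scanner looks like, here is a
minimal standalone sketch that uses only the JDK. It is not Camel's actual
code. The class name ScannerSplitSketch, the method forEachToken and the
|END| delimiter are all made up for this example, and it assumes the
delimiter is a literal string rather than a regular expression.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Camel: reads a stream token by token
// with java.util.Scanner.
class ScannerSplitSketch {

    // Hands each token to the handler as soon as it has been read.
    // The Scanner pulls data from the stream incrementally, so the input
    // does not have to be loaded as one big String first.
    static void forEachToken(InputStream in, String delimiter, Consumer<String> handler) {
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            // Pattern.quote makes the delimiter match as literal text.
            scanner.useDelimiter(Pattern.compile(Pattern.quote(delimiter)));
            while (scanner.hasNext()) {
                handler.accept(scanner.next());
            }
        }
    }

    public static void main(String[] args) {
        // Small in-memory stand-in for a large file; |END| is an assumed delimiter.
        byte[] data = "rec1|END|rec2|END|rec3".getBytes(StandardCharsets.UTF_8);
        forEachToken(new ByteArrayInputStream(data), "|END|",
                token -> System.out.println(token));
    }
}
```

For a real file you would pass a FileInputStream instead of the in-memory
stream. Each token still has to fit in memory, so this only helps if the
records between delimiters are reasonably small.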
>
> Thanks,
> Kev
>
--
Claus Ibsen
Apache Camel Committer
Author of Camel in Action: http://www.manning.com/ibsen/
Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus