If we implement what has been proposed in this thread, can we guarantee that, if a problem occurs while parsing the file, the messages already created (by the batch or the tokenisation) will be rolled back?
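
To make the concern concrete, here is a minimal sketch in plain Java. It is not Camel code, and the class and method names are made up for illustration. It assumes the split is done lazily with java.util.Scanner, as suggested in the thread below, and it treats a row starting with "BAD" as a stand-in for a parsing problem.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Scanner;

// Hypothetical illustration only; none of these names exist in Camel.
class LazySplitSketch {

    // Scanner implements Iterator<String> and reads the source incrementally,
    // so the whole input is never held in memory as one String.
    static Iterator<String> split(Readable source) {
        return new Scanner(source).useDelimiter("\r\n");
    }

    // Hands rows on one at a time and stops at the first row that fails.
    // The returned list holds the rows that were already handed on.
    static List<String> deliverUntilFailure(Readable source) {
        List<String> delivered = new ArrayList<String>();
        Iterator<String> rows = split(source);
        try {
            while (rows.hasNext()) {
                String row = rows.next();
                if (row.startsWith("BAD")) {
                    // Assumption: stands in for any error while parsing a row.
                    throw new IllegalArgumentException("cannot parse: " + row);
                }
                delivered.add(row); // stands in for sending a message downstream
            }
        } catch (IllegalArgumentException e) {
            // Nothing in the iteration itself undoes the earlier deliveries.
        }
        return delivered;
    }

    public static void main(String[] args) {
        String csv = "a,1\r\nb,2\r\nBAD\r\nc,3\r\n";
        System.out.println(deliverUntilFailure(new StringReader(csv)));
    }
}
```

In this sketch the first two rows have already been handed on when the third one fails, so any rollback would have to come from somewhere other than the splitter loop.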

Kind regards,

Claus Ibsen wrote:
>
> Hi
>
> I have created 2 tickets to track this:
> CAMEL-875, CAMEL-876
>
> Kind regards
>
> Claus Ibsen
> ......................................
> Silverbullet
> Skovsgårdsvænget 21
> 8362 Hørning
> Tlf. +45 2962 7576
> Web: www.silverbullet.dk
>
> -----Original Message-----
> From: Claus Ibsen [mailto:[EMAIL PROTECTED]
> Sent: 2 September 2008 21:44
> To: camel-user@activemq.apache.org
> Subject: RE: Splitter for big files
>
> Ah, of course, well spotted. The tokenize is the memory hog. Good idea with
> the java.util.Scanner.
>
> So combined with the batch stuff we should be able to operate on really
> big files without consuming too much memory ;)
>
> Kind regards
>
> Claus Ibsen
>
> -----Original Message-----
> From: Gert Vanthienen [mailto:[EMAIL PROTECTED]
> Sent: 2 September 2008 21:28
> To: camel-user@activemq.apache.org
> Subject: Re: Splitter for big files
>
> L.S.,
>
> Just added my pair of eyes ;). One part of the problem is indeed the
> list of exchanges that is returned by the expression, but I think you're
> also reading the entire file into memory a first time to tokenize
> it. ExpressionBuilder.tokenizeExpression() converts the type to String
> and then uses a StringTokenizer on that. I think we could add support
> there for tokenizing Files, InputStreams and Readers directly using a
> Scanner.
>
> Regards,
>
> Gert
>
> Claus Ibsen wrote:
>> Hi
>>
>> Looking into the source code of the splitter, it looks like it creates the
>> list of split exchanges before they are processed. That is why
>> it consumes memory for big files.
>>
>> Maybe some kind of batch size option is needed, so you can set, for
>> instance, 20 as the batch size.
>>
>> .splitter(body(InputStream.class).tokenize("\r\n").batchSize(20))
>>
>> Could you create a JIRA ticket for this improvement?
>> Btw, how big are the files you use?
>>
>> The file component uses a File as the object.
>> So when you split using the input stream, Camel should use the type
>> converter from File -> InputStream, which doesn't read the entire content
>> into memory. That happens in the splitter, where it creates the entire
>> list of new exchanges to fire.
>>
>> At least that is what I can read from the source code after a long day's
>> work, so please read the code, as 4 eyes are better than 2 ;)
>>
>> Kind regards
>>
>> Claus Ibsen
>>
>> -----Original Message-----
>> From: Bart Frackiewicz [mailto:[EMAIL PROTECTED]
>> Sent: 2 September 2008 17:40
>> To: camel-user@activemq.apache.org
>> Subject: Splitter for big files
>>
>> Hi,
>>
>> I am using this route for a couple of CSV file routes:
>>
>> from("file:/tmp/input/?delete=true")
>> .splitter(body(InputStream.class).tokenize("\r\n"))
>> .beanRef("myBean", "process")
>> .to("file:/tmp/output/?append=true")
>>
>> This works fine for small CSV files, but for big files I noticed
>> that Camel uses a lot of memory; it seems that Camel is reading
>> the file into memory. What is the configuration to use a stream
>> in the splitter?
>>
>> I noticed the same behaviour in the xpath splitter:
>>
>> from("file:/tmp/input/?delete=true")
>> .splitter(ns.xpath("//member"))
>> ...
>>
>> BTW, I found a posting from March, where James suggested the following
>> implementation for a custom splitter:
>>
>> -- quote --
>>
>> from("file:///c:/temp?noop=true").
>> splitter().method("myBean", "split").
>> to("activemq:someQueue")
>>
>> Then register "myBean" with a split method...
>>
>> class SomeBean {
>>   public Iterator split(File file) {
>>     /// figure out how to split this file into rows...
>>   }
>> }
>> -- quote --
>>
>> But this won't work for me (Camel 1.4).
>>
>> Bart
>>
>>
>
>

-----
Enterprise Architect

Xpectis
12, route d'Esch
L-1470 Luxembourg

Phone +352 25 10 70 470
Mobile +352 621 45 36 22

e-mail : [EMAIL PROTECTED]
web site : www.xpectis.com
My Blog : http://cmoulliard.blogspot.com/
--
View this message in context: http://www.nabble.com/Splitter-for-big-files-tp19272583s22882p19289425.html
Sent from the Camel - Users mailing list archive at Nabble.com.