6 hours? I'm currently streaming a 50k-line file through BeanIO with SEDA queues and doing similar processing in under a minute.
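For context, here is a minimal sketch of the kind of streaming BeanIO + SEDA setup I mean (the mapping file, stream name, directory, queue names and pool sizes are placeholders from my side, not your configuration; header and error handling omitted):

    <!-- BeanIO data format; mapping file and stream name are placeholders -->
    <dataFormats>
        <beanio id="csvRecord" mapping="classpath:beanio-mappings.xml" streamName="csvStream"/>
    </dataFormats>

    <!-- Read the file and split it line by line in streaming mode,
         pushing each single line onto a bounded SEDA queue -->
    <route id="ReadLargeCsv_Route">
        <from uri="file:data/inbound?noop=true"/>
        <split streaming="true">
            <tokenize token="\n"/>
            <to uri="seda:processLine?size=1000&amp;blockWhenFull=true"/>
        </split>
    </route>

    <!-- Several concurrent consumers unmarshal and process the lines in parallel -->
    <route id="ProcessLine_Route">
        <from uri="seda:processLine?concurrentConsumers=20&amp;size=1000&amp;blockWhenFull=true"/>
        <unmarshal ref="csvRecord"/>
        <to uri="log:processedLines?groupSize=1000"/>
    </route>

The point is that only one line at a time travels through the splitter, while many consumers drain the queue in parallel.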
You have a queue here that appears to have something on it that is being unmarshalled and then split. How big is that object on the queue? Is it only a single line of data? Try this:

<route id="ProcessAndStoreInQueue_Route">
    <from uri="seda:processAndStoreInQueue?concurrentConsumers=30" />

On Fri, Apr 15, 2016 at 8:25 AM, Jens Breitenstein <mailingl...@j-b-s.de> wrote:

> Hi Michele
>
> Reading a CSV with 40k lines using Camel in streaming mode takes a few
> seconds. As you limit the queue size to avoid OOM, the overall performance
> depends on how fast you can empty the queue.
> How long does processing of ONE message take on average? To me it looks
> like roughly 1.6 messages per second, i.e. about 0.6 secs per message
> (35000 / (6 * 60 * 60)). Is the process responsible for reading the queue
> single-threaded?
>
> Jens
>
>
> On 15/04/16 at 14:59, Michele wrote:
>
>> Hi,
>>
>> I spent a bit of time reading different topics on this issue, and I
>> changed my route like this, reducing the memory usage by about 300 MB:
>>
>> <route id="FileRetriever_Route">
>>     <from
>>         uri="{{uri.inbound}}?scheduler=quartz2&scheduler.cron={{poll.consumer.scheduler}}&scheduler.triggerId=FileRetriever&scheduler.triggerGroup=IF_CBIKIT{{uri.inbound.options}}" />
>>     <setHeader headerName="ImportDateTime"><simple>${date:now:yyyyMMdd-HHmmss}</simple></setHeader>
>>     <setHeader headerName="MsgCorrelationId"><simple>CBIKIT_INBOUND_${in.header.ImportDateTime}</simple></setHeader>
>>     <setHeader headerName="breadcrumbId">
>>         <simple>Import-${in.header.CamelFileName}-${in.header.ImportDateTime}-${in.header.breadcrumbId}</simple>
>>     </setHeader>
>>     <to uri="seda:processAndStoreInQueue" />
>>     <log message="END - FileRetriever_Route" />
>> </route>
>>
>> <route id="ProcessAndStoreInQueue_Route">
>>     <from uri="seda:processAndStoreInQueue" />
>>     <unmarshal>
>>         <bindy type="Csv" classType="com.fincons.ingenico.crt2.cbikit.inbound.model.RowData"/>
>>     </unmarshal>
>>     <split streaming="true" executorServiceRef="myThreadPoolExecutor">
>>         <simple>${body}</simple>
>>         <choice>
>>             <when>
>>                 <simple></simple>
>>                 <setHeader headerName="CamelSplitIndex"><simple>${in.header.CamelSplitIndex}</simple></setHeader>
>>                 <process ref="BodyEnricherProcessor" />
>>                 <to
>>                     uri="dozer:transform?mappingFile=file:{{crt2.apps.home}}{{dozer.mapping.path}}&targetModel=com.fincons.ingenico.crt2.cbikit.inbound.model.SerialNumber" />
>>                 <marshal ref="Gson" />
>>                 <to uri="activemq:queue:CBIKIT" />
>>             </when>
>>             <otherwise>
>>                 <log message="Message discarded ${in.header.CamelSplitIndex} - ${body}" />
>>             </otherwise>
>>         </choice>
>>     </split>
>> </route>
>>
>> The last test successfully processed 35000 lines of the CSV file in about
>> 6 h with an average memory usage of 1400 MB. Can I improve the processing
>> performance further?
>>
>> In addition, I noticed that the queue depth of the queue stays low. Why?
>> (Is the producer slower than the consumer?)
>>
>> Thanks in advance.
>>
>> Best Regards
>>
>> Michele
>>
>>
>>
>> --
>> View this message in context:
>> http://camel.465427.n5.nabble.com/Best-Strategy-to-process-a-large-number-of-rows-in-File-tp5779856p5781168.html
>> Sent from the Camel - Users mailing list archive at Nabble.com.
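PS: to spell out the suggestion above, only the consumer side of the SEDA endpoint changes; everything inside your ProcessAndStoreInQueue_Route stays as you already have it. A sketch:

    <route id="ProcessAndStoreInQueue_Route">
        <!-- 30 consumer threads drain the queue instead of the default single one -->
        <from uri="seda:processAndStoreInQueue?concurrentConsumers=30" />
        <unmarshal>
            <bindy type="Csv" classType="com.fincons.ingenico.crt2.cbikit.inbound.model.RowData"/>
        </unmarshal>
        <split streaming="true" executorServiceRef="myThreadPoolExecutor">
            <simple>${body}</simple>
            <!-- choice / BodyEnricherProcessor / dozer / Gson / activemq:queue:CBIKIT as in your route -->
        </split>
    </route>

Bear in mind that concurrentConsumers only pays off if the queue actually holds many messages. If FileRetriever_Route puts the whole file on the queue as a single message, the extra consumers have nothing to do and the split's thread pool determines throughput, hence my question about what exactly sits on the queue.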