6 hours? I'm currently streaming a 50k-line file through BeanIO with SEDA queues and doing similar processing in under a minute.
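For context, here is a minimal sketch of the kind of streaming BeanIO + SEDA setup I mean (the mapping file, stream name, directory, queue names and pool sizes are placeholders from my side, not your configuration; header and error handling omitted):

    <!-- BeanIO data format; mapping file and stream name are placeholders -->
    <dataFormats>
        <beanio id="csvRecord" mapping="classpath:beanio-mappings.xml" streamName="csvStream"/>
    </dataFormats>

    <!-- Read the file and split it line by line in streaming mode,
         pushing each single line onto a bounded SEDA queue -->
    <route id="ReadLargeCsv_Route">
        <from uri="file:data/inbound?noop=true"/>
        <split streaming="true">
            <tokenize token="\n"/>
            <to uri="seda:processLine?size=1000&amp;blockWhenFull=true"/>
        </split>
    </route>

    <!-- Several concurrent consumers unmarshal and process the lines in parallel -->
    <route id="ProcessLine_Route">
        <from uri="seda:processLine?concurrentConsumers=20&amp;size=1000&amp;blockWhenFull=true"/>
        <unmarshal ref="csvRecord"/>
        <to uri="log:processedLines?groupSize=1000"/>
    </route>

The point is that only one line at a time travels through the splitter, while many consumers drain the queue in parallel.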
You have a queue here that appears to have something on it that is being unmarshalled and then split. How big is that object on the queue? Is it only a single line of data? Try this:

<route id="ProcessAndStoreInQueue_Route">
    <from uri="seda:processAndStoreInQueue?concurrentConsumers=30" />

On Fri, Apr 15, 2016 at 8:25 AM, Jens Breitenstein <mailingl...@j-b-s.de> wrote:

> Hi Michele
>
> Reading a CSV with 40k lines using Camel in streaming mode takes a few
> seconds. As you limit the queue size to avoid OOM, the overall performance
> depends on how fast you can empty the queue.
> How long does processing of ONE message take on average? To me it looks
> like roughly 1.6 messages per second, i.e. about 0.6 secs per message
> (35000 / (6 * 60 * 60)). Is the process responsible for reading the queue
> single-threaded?
>
> Jens
>
>
> On 15/04/16 at 14:59, Michele wrote:
>
>> Hi,
>>
>> I spent a bit of time reading different topics on this issue, and I
>> changed my route like this, reducing the memory usage by about 300 MB:
>>
>> <route id="FileRetriever_Route">
>>     <from
>>         uri="{{uri.inbound}}?scheduler=quartz2&scheduler.cron={{poll.consumer.scheduler}}&scheduler.triggerId=FileRetriever&scheduler.triggerGroup=IF_CBIKIT{{uri.inbound.options}}" />
>>     <setHeader headerName="ImportDateTime"><simple>${date:now:yyyyMMdd-HHmmss}</simple></setHeader>
>>     <setHeader headerName="MsgCorrelationId"><simple>CBIKIT_INBOUND_${in.header.ImportDateTime}</simple></setHeader>
>>     <setHeader headerName="breadcrumbId">
>>         <simple>Import-${in.header.CamelFileName}-${in.header.ImportDateTime}-${in.header.breadcrumbId}</simple>
>>     </setHeader>
>>     <to uri="seda:processAndStoreInQueue" />
>>     <log message="END - FileRetriever_Route" />
>> </route>
>>
>> <route id="ProcessAndStoreInQueue_Route">
>>     <from uri="seda:processAndStoreInQueue" />
>>     <unmarshal>
>>         <bindy type="Csv" classType="com.fincons.ingenico.crt2.cbikit.inbound.model.RowData"/>
>>     </unmarshal>
>>     <split streaming="true" executorServiceRef="myThreadPoolExecutor">
>>         <simple>${body}</simple>
>>         <choice>
>>             <when>
>>                 <simple></simple>
>>                 <setHeader headerName="CamelSplitIndex"><simple>${in.header.CamelSplitIndex}</simple></setHeader>
>>                 <process ref="BodyEnricherProcessor" />
>>                 <to
>>                     uri="dozer:transform?mappingFile=file:{{crt2.apps.home}}{{dozer.mapping.path}}&targetModel=com.fincons.ingenico.crt2.cbikit.inbound.model.SerialNumber" />
>>                 <marshal ref="Gson" />
>>                 <to uri="activemq:queue:CBIKIT" />
>>             </when>
>>             <otherwise>
>>                 <log message="Message discarded ${in.header.CamelSplitIndex} - ${body}" />
>>             </otherwise>
>>         </choice>
>>     </split>
>> </route>
>>
>> The last test successfully processed 35000 lines of the CSV file in about
>> 6 h with an average memory usage of 1400 MB. Can I improve the processing
>> performance further?
>>
>> In addition, I noticed that the queue depth of the queue stays low. Why?
>> (Is the producer slower than the consumer?)
>>
>> Thanks in advance.
>>
>> Best Regards
>>
>> Michele
>>
>>
>>
>> --
>> View this message in context:
>> http://camel.465427.n5.nabble.com/Best-Strategy-to-process-a-large-number-of-rows-in-File-tp5779856p5781168.html
>> Sent from the Camel - Users mailing list archive at Nabble.com.
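PS: to spell out the suggestion above, only the consumer side of the SEDA endpoint changes; everything inside your ProcessAndStoreInQueue_Route stays as you already have it. A sketch:

    <route id="ProcessAndStoreInQueue_Route">
        <!-- 30 consumer threads drain the queue instead of the default single one -->
        <from uri="seda:processAndStoreInQueue?concurrentConsumers=30" />
        <unmarshal>
            <bindy type="Csv" classType="com.fincons.ingenico.crt2.cbikit.inbound.model.RowData"/>
        </unmarshal>
        <split streaming="true" executorServiceRef="myThreadPoolExecutor">
            <simple>${body}</simple>
            <!-- choice / BodyEnricherProcessor / dozer / Gson / activemq:queue:CBIKIT as in your route -->
        </split>
    </route>

Bear in mind that concurrentConsumers only pays off if the queue actually holds many messages. If FileRetriever_Route puts the whole file on the queue as a single message, the extra consumers have nothing to do and the split's thread pool determines throughput, hence my question about what exactly sits on the queue.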