Re: Best Strategy to process a large number of rows in File

Brad Johnson Mon, 18 Apr 2016 09:36:54 -0700

If that doesn't work for you, there is another approach I use when I have
to read complex files that aren't simply one line = one record, but it isn't
really necessary if you are just reading a CSV or fixed-width file with
single-line records.  I don't know what your record format or BeanIO
mapping looks like, so I can't really tell.  But if it is set up to
marshal/unmarshal single lines, then you simply have to tell the tokenizer
what those lines look like and what the line-ending delimiters are.
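
As a rough sketch of what that could look like (assuming the Spring XML
DSL, the endpoint URIs from your shorter snippet, and a file whose line
endings are unknown or mixed; the route id is made up and I haven't run
this against your data), a regex token lets the splitter accept \r\n, \r,
or \n while still streaming:

```xml
<!-- Hypothetical route id; endpoint URIs copied from the snippet in this thread. -->
<route id="SplitAnyLineEnding_Route">
  <from uri="file:inbox"/>
  <!-- streaming="true" hands rows on one at a time instead of loading the whole file -->
  <split streaming="true">
    <!-- regex="true" treats the token as a pattern; \r\n is listed first
         so a Windows line ending is consumed as a single delimiter -->
    <tokenize token="\r\n|\r|\n" regex="true"/>
    <to uri="activemq:queue:store"/>
  </split>
</route>
```

If the file always uses a single known ending, a plain token for just that
ending is simpler and avoids the regex.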


On Mon, Apr 18, 2016 at 11:17 AM, Brad Johnson <brad.john...@mediadriver.com
> wrote:

> The tokenization may require a different line ending: \r, \n, or \r\n,
> for example.  The file reader has to understand what it is parsing. I take
> it that when you used the splitter with the tokenization, it was reading
> the whole file in one big slurp and never finding a line ending, so you
> ended up doing a convertBodyTo in order to convert the whole file into a
> String?  Obviously that's not what you're after.  What are the line endings?
>
> On Mon, Apr 18, 2016 at 3:26 AM, Michele <
> michele.mazzi...@finconsgroup.com> wrote:
>
>> Hi Brad,
>>
>> First of all, thank you very much for the time you are dedicating to me.
>>
>> Are you getting the entire file in memory? I think so. I thought BeanIO
>> worked in lazy mode...
>>
>> A question: I noticed that with some files the Splitter doesn't work with
>> <from uri="file:inbox"/>
>>   <split streaming="true">
>>     <tokenize token="\n" />
>>     <to uri="activemq:queue:store"/>
>>   </split>
>>
>> but it required a convertBodyTo type="java.lang.String" before the split,
>> and we don't want that (the entire file in memory). How can I split the
>> file without the converter?
>>
>> I tried to introduce some changes to the Route as you suggested, making
>> good use of memory and avoiding OOM.
>>
>> Current Route:
>>
>> <route id="FileRetriever_Route">
>>   <from uri="{{uri.inbound}}?scheduler=quartz2&amp;scheduler.cron={{poll.consumer.scheduler}}&amp;scheduler.triggerId=FileRetriever&amp;scheduler.triggerGroup=IF_SAP{{uri.inbound.options}}" />
>>   <split streaming="true" executorServiceRef="ThreadPoolExecutor">
>>     <tokenize token="\n" />
>>     <choice>
>>       <when>
>>         <simple></simple>
>>         <log message="Store Message in Queue ${in.header.CamelSplitIndex} - ${body}" loggingLevel="DEBUG" />
>>         <to uri="activemq:queue:SAP.Product" />
>>       </when>
>>       <otherwise>
>>         <log message="Invalid Row ${in.header.CamelSplitIndex} - Discarded: ${body}" />
>>       </otherwise>
>>     </choice>
>>   </split>
>> </route>
>> <route id="ProcessMessageData_Route" errorHandlerRef="DLQErrorHandler">
>>   <from uri="activemq:queue:SAP.Product?destination.consumer.prefetchSize=1" />
>>   <throttle timePeriodMillis="1000" asyncDelayed="true">
>>     <constant>3</constant>
>>     <log message="START - ProcessMessageData_Route" />
>>     <unmarshal ref="RecordParser" />
>>     <setBody><simple>${body[0]}</simple></setBody>
>>     <to uri="dozer:transform?mappingFile=file:{{crt2.apps.home}}{{dozer.mapping.path}}&amp;targetModel=java.util.LinkedHashMap" />
>>     <marshal ref="Gson" />
>>     <enrich uri="direct:crm-login" strategyRef="OAuthStrategy" />
>>     <recipientList>
>>       <header>CompleteActionPath</header>
>>     </recipientList>
>>   </throttle>
>> </route>
>>
>> Thanks a lot again
>>
>> Best Regards
>>
>> Michele
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://camel.465427.n5.nabble.com/Best-Strategy-to-process-a-large-number-of-rows-in-File-tp5779856p5781251.html
>> Sent from the Camel - Users mailing list archive at Nabble.com.
>>
>
>
