Hi,

from("file:../../../data/clients?noop=true")
    .convertBodyTo(Client.class)
    .to("jpa:au.com.interlated.server.domain.Client");

I uploaded 300k records - 90 MB in total. The Java image said it had 2.7 GB of RAM allocated before it bombed out due to heap space.
I had a similar problem. I resolved it using:

<from uri="file://">
<split streaming="true">   (with a tokenize expression, typically on '\n')
    <to>
</split>

Essentially, without the split + streaming, the entire file is retained in memory, and any objects created from that file hold a strong reference to the input data, so they cannot be GC'd. By streaming the file (I guess each line of input represents a 'record'), you only retain one 'record' in RAM at a time; once the object has been created and persisted, it becomes eligible for GC, because the underlying byte stream it was attached to no longer has any references.

Another advantage of this approach is that downstream processing can take place before the entire file has been read. This is how SAX works (as opposed to DOM, which must read the entire file before being useful).

The caveat is that each record in the input must be self-contained, so that it is sensible to split the file on a record boundary. If the file has a header section containing lookup data for the records, followed by the records themselves, then a simple split won't work: you would have to pre-process the file to get the lookup data, then post-process it (splitting and streaming) while ignoring the header records. (You can tell this is fresh in my mind, as I have had exactly the same problem.)

Thanks,
Kev
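For what it's worth, here is a minimal sketch of the same idea in the Java DSL, reusing the file/JPA endpoints and the Client class from the original question (those are assumptions taken from that post, not something I've run against your data). The splitter tokenizes on newlines with streaming enabled, so only one record is in memory at a time:

import org.apache.camel.builder.RouteBuilder;

public class ClientImportRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Read the file and split it line by line in streaming mode,
        // so the whole body is never materialised in memory at once.
        from("file:../../../data/clients?noop=true")
            .split(body().tokenize("\n")).streaming()
                // Each exchange now carries a single line ('record');
                // once converted and persisted it becomes eligible for GC.
                .convertBodyTo(Client.class)
                .to("jpa:au.com.interlated.server.domain.Client")
            .end();
    }
}

Because of streaming(), the splitter doesn't build the full list of lines up front, so the JPA inserts start before the file has been fully read.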