I'm not sure how to block the polling. Here is what seems like an ideal approach... the SFTP polling always runs on schedule and downloads files with a single thread to a folder. This won't use much memory as it's just copying one file at a time to the folder. Then I'd have X threads take those files and start the decrypting/processing. Since this part uses a lot of memory, it seems I'd want to limit the number of threads that can do this task so the max memory is contained.
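In plain JDK terms (not Camel), the shape described above can be sketched roughly as follows. This is only an illustration under assumptions: `BoundedPipeline`, `run`, and the `STOP` marker are hypothetical names, the 2 ms sleep stands in for the memory-heavy decrypt step, and a bounded in-memory queue takes the place of the intermediate folder so that the single producer blocks when the workers fall behind.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one producer ("poller") feeding a fixed number of workers.
final class BoundedPipeline {
    private static final String STOP = "<stop>"; // marker telling a worker to exit

    /** Returns {files processed, peak number of files processed at the same time}. */
    static int[] run(int fileCount, int workers, int queueCapacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    String name;
                    while (!(name = queue.take()).equals(STOP)) {
                        // Track how many workers are in the heavy step right now.
                        peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                        Thread.sleep(2); // stand-in for decrypting/processing one file
                        active.decrementAndGet();
                        processed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            // Single "polling" thread: put() blocks while the queue is full.
            for (int i = 0; i < fileCount; i++) {
                queue.put("file-" + i);
            }
            for (int i = 0; i < workers; i++) {
                queue.put(STOP); // one stop marker per worker
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { processed.get(), peak.get() };
    }

    public static void main(String[] args) {
        int[] r = run(50, 4, 8);
        System.out.println("processed=" + r[0] + ", peak concurrent=" + r[1]);
    }
}
```

The point of the sketch is that the number of files in the heavy step at once can never exceed `workers`, no matter how many files arrive, which is what keeps the maximum memory contained.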
However, I don't know how to do this as I'm new to Camel.

Yes, I'd really like to use streaming instead of byte[] at every step of the processing, but I have no idea if that's possible in my use case. Sounds like it worked in yours.

-Dave

On Sat, Nov 21, 2015 at 10:22 AM, mailingl...@j-b-s.de <mailingl...@j-b-s.de> wrote:

> I guess you need to block the polling while you process files in parallel. A seda queue with a capacity limit will at least block the consumer. As I do not know exactly what you are doing with the files, or whether the same amount of memory is always required per file, it's hard to tell what memory settings to use. Always providing more memory is not a solution from my point of view, because you hit the same limit, just later.
>
> Limiting messages and using streaming / splitting will keep memory usage low (at least in our environment it works that way, and we reduced memory usage from 1G to 128M per VM). But whether this is something for you, I don't know.
>
> Jens
>
> Sent from my iPhone
>
> > On 21.11.2015 at 16:40, David Hoffer <dhoff...@gmail.com> wrote:
> >
> > Yes, when the sftp read thread stops it was still processing files it had previously downloaded. And since we can get so many files on each poll (~1000), and we have to do a lot of decrypting of these files in subsequent routes, it's possible that the processing of the 1000 files is not done before the next poll, where we get another 1000 files. Eventually the SFTP endpoint will have fewer/no files and the rest of the routes can catch up. All the rest of the routes are file based (except the very last), so there is no harm if intermediate folders get backed up with files.
> >
> > We only have one SFTP connection for reading in this case.
> >
> > Do you think the seda approach is right for this case? I can look into it.
> > Note my previous post: in my dev environment the reason it stopped was an out of memory error... I doubt that is the same case in production, as the rest of the routes do not stop.
> >
> > -Dave
> >
> > On Sat, Nov 21, 2015 at 1:36 AM, mailingl...@j-b-s.de <mailingl...@j-b-s.de> wrote:
> >
> >> Hi!
> >>
> >> When your sftp read thread stops, are the files still in process? In our environment we had something similar in conjunction with splitting large files, because the initial message is pending until all processing is completed. We solved it using a seda queue (limited in size) in between our sftp consumer and processing route, and "parallel" execution.
> >>
> >> one sftp consumer -> seda (size limit) -> processing route (with dsl parallel)
> >>
> >> and this works without any problems.
> >>
> >> Maybe you have too many sftp connections? Maybe it's entirely independent from Camel and you reached a file handle limit?
> >>
> >> Jens
> >>
> >> Sent from my iPhone
> >>
> >>> On 20.11.2015 at 23:09, David Hoffer <dhoff...@gmail.com> wrote:
> >>>
> >>> This part I'm not clear on and it raises more questions.
> >>>
> >>> When using the JDK one generally uses the Executors factory methods to create either a Fixed, Single or Cached thread pool. These will use a SynchronousQueue for Cached pools and LinkedBlockingQueue for Fixed or Single pools. In the case of SynchronousQueue there is no size... it simply hands the new request off to either a thread in the pool or it creates a new one. And in the case of LinkedBlockingQueue it uses an unbounded queue size. Now it is possible to create a hybrid, e.g. LinkedBlockingQueue with a max size, but it's not part of the factory methods or common. Another option is the ArrayBlockingQueue, which does use a max size, but none of the factory methods use this type.
> >>>
> >>> So what type of thread pool does Camel create for the default thread pool? Since it's not fixed size I assumed it would use SynchronousQueue and not have a separate worker queue. However, if Camel is creating a hybrid using a LinkedBlockingQueue or ArrayBlockingQueue, is there a way I can change that to be a SynchronousQueue so there is no queue? Or is there a compelling reason to use LinkedBlockingQueue in a cached pool?
> >>>
> >>> Now this gets to the problem I am trying to solve. We have a Camel app that deals with files, lots of them... e.g. all the routes deal with files. It starts with an sftp URL that gets files off a remote server and then does a lot of subsequent file processing. The problem is that if the SFTP server has 55 files (example) and I start the Camel app, it processes them fine until about 14 or 15 files are left and then it just stops. The thread that does the polling of the server stops (at least it appears to have stopped) and the processing of the 55 files stops, i.e. it does not continue to process all of the original 55 files; it stops with 14-15 left to process (and it never picks them up again on the next poll). And I have a breakpoint on my custom SftpChangedExclusiveReadLockStrategy and it is never called again.
> >>>
> >>> Now, getting back to the default thread pool and changing it: I would like to change it so it uses more threads and no worker queue (like a standard Executors cached thread pool), but I'm not certain that would even help, as in the debugger & thread dumps I see that the SFTP endpoint appears to use a Scheduled Thread Pool instead, which makes sense since it's a polling (every 60 seconds in my case) operation. So is there another default pool that I can configure for Camel's scheduled threads?
> >>>
> >>> All that being said, why would the SFTP endpoint just quit? I don't see any blocked threads and no deadlock. I'm new to Camel and just don't know where to look for possible causes of this.
> >>>
> >>> Thanks,
> >>> -Dave
> >>>
> >>>> On Thu, Nov 19, 2015 at 11:40 PM, Claus Ibsen <claus.ib...@gmail.com> wrote:
> >>>>
> >>>> Yes, it's part of the JDK, as it specifies the size of the worker queue of the thread pool (ThreadPoolExecutor).
> >>>>
> >>>> For more docs see
> >>>> http://camel.apache.org/threading-model.html
> >>>>
> >>>> Or the Camel in Action books
> >>>>
> >>>>> On Fri, Nov 20, 2015 at 12:22 AM, David Hoffer <dhoff...@gmail.com> wrote:
> >>>>>
> >>>>> I'm trying to understand the default Camel Thread Pool and how the maxQueueSize is used, or more precisely, what it's for.
> >>>>>
> >>>>> I can't find any documentation on what this really is or how it's used. I understand all the other parameters as they match what I'd expect from the JDK... poolSize is the minimum threads to keep in the pool for new tasks and maxPoolSize is the maximum number of the same.
> >>>>>
> >>>>> So how does maxQueueSize fit into this? This isn't part of the JDK thread pool so I don't know how Camel uses this.
> >>>>>
> >>>>> The context of my question is that we have a from sftp route that seems to be getting thread starved. E.g. the thread that polls the sftp connection is slowing/stopping at times when it is busy processing other files that were previously downloaded.
> >>>>>
> >>>>> We are using the default Camel thread pool, which I see has a max of only 20 threads yet a maxQueueSize of 1000. That doesn't make any sense to me yet.
> >>>>> I would think one would want a much larger pool of threads (as we are processing lots of files) but no queue at all... but not sure on that as I don't understand how the queue is used.
> >>>>>
> >>>>> -Dave
> >>>>
> >>>> --
> >>>> Claus Ibsen
> >>>> -----------------
> >>>> http://davsclaus.com @davsclaus
> >>>> Camel in Action 2: https://www.manning.com/ibsen2
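
On the maxQueueSize question in the first message of this thread: Claus's reply says it is the size of the work queue of the JDK ThreadPoolExecutor. A small JDK-only sketch may help show why a max of 20 threads with a queue of 1000 behaves the way it does: a ThreadPoolExecutor only starts threads beyond its core size once the queue is full. The class and method names below (`QueueBeforeThreads`, `poolSizeAfterSubmitting`) are hypothetical, the numbers are scaled down, and nothing here is Camel's own code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of how a bounded work queue interacts with core/max pool size.
final class QueueBeforeThreads {

    /** Submits `tasks` blocking tasks and returns how many threads the pool started. */
    static int poolSizeAfterSubmitting(int core, int max, int queueCapacity, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity)); // bounded work queue
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep every task busy until we have measured
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let the tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // core=2, max=4, queue=10: 12 tasks fit in 2 threads plus 10 queue slots.
        System.out.println(poolSizeAfterSubmitting(2, 4, 10, 12)); // 2
        // Only once the queue is full are extra threads created, up to max.
        System.out.println(poolSizeAfterSubmitting(2, 4, 10, 14)); // 4
    }
}
```

Under this model, a pool with a large queue stays at its core size until that many tasks are already waiting, so raising only the max thread count changes little. A SynchronousQueue, as used by the Executors cached pool, has no capacity, so each new task either goes to an idle thread or starts a new one.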