On 12/15/06, RNobrega <[EMAIL PROTECTED]> wrote:
gnodet wrote:
>
>> 3) FTPPollingEndpoint scales poorly because
>> pollFileOrDirectory(...) does not distribute
>> the load across nodes: the files are fetched
>> locally and sequentially
>
> I don't follow here. The directory listing is
> done by one thread, but all actual file reads and
> jbi stuff is delegated to the thread pool.
> There's really no reason why it would not scale.
> Furthermore, the recent changes I made should
> allow clustering ftp poller endpoints -- provided
> that we implement a distributed locking mechanism
> ;)
>

What you say is true, within one jvm; imagine you have 4 separate nodes and 100 files to download. Node 1 will be stressed out, while the other ones are idle (correct?)
If you deploy the same poller endpoint on multiple JVMs, all of them will regularly poll the directories for files to process. Every node will queue one job for each file, but the locking mechanism will prevent several nodes from handling the same file, while still allowing several nodes to process different files.

However, I agree that there is no load-balancing, as one node may start downloading all the files while the other nodes do nothing. I guess this could be tuned by configuring the FTPClientPool or the Executor to allow only a limited number of concurrent connections (I think that the default for the FTPClientPool is 8).

I'm not quite sure how to handle that in the case where we use the new active polling endpoint. If you have a clustered quartz timer (this would ensure that the timer is fired only once in the cluster) which lists the available files and sends exchanges to download them, these exchanges may be put in a JMS queue (using a jms BC or jms/jca flow) to achieve load-balancing. However, I don't know how ActiveMQ behaves when load-balancing a small number of messages. This needs some tuning, I guess.
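To make the behaviour described above concrete, here is a minimal sketch in plain Java. It is not ServiceMix code: FileLockRegistry and PollerNode are hypothetical names, the "distributed" lock is just an in-memory set standing in for a shared store, and the fixed-size thread pool stands in for a limit on concurrent connections.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a distributed lock: a shared set of claimed file names.
// In a real cluster this would have to live in a shared store (a database, for example).
class FileLockRegistry {
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    // Returns true only for the first caller that claims the given file.
    boolean tryClaim(String fileName) {
        return claimed.add(fileName);
    }
}

// Hypothetical poller node: queues one job per listed file on a bounded pool.
class PollerNode {
    private final FileLockRegistry locks;
    private final ExecutorService pool;
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    PollerNode(FileLockRegistry locks, int maxConcurrent) {
        this.locks = locks;
        // The pool size caps how many files this node handles at the same time.
        this.pool = Executors.newFixedThreadPool(maxConcurrent);
    }

    // Every node queues a job for every file; the lock decides which node handles it.
    void poll(List<String> listing) {
        for (String file : listing) {
            pool.submit(() -> {
                if (locks.tryClaim(file)) {
                    processed.add(file); // placeholder for the actual download
                }
            });
        }
    }

    // Waits for the queued jobs to finish and returns the files this node handled.
    Set<String> shutdownAndGetProcessed() {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }
}
```

With two nodes polling the same listing, each file ends up handled by exactly one node, but nothing in this sketch forces an even split: a node that starts first can still claim most of the files, which is the load-balancing gap mentioned above.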
gnodet wrote:
>
> I guess the FTPDir could be triggered by a quartz
> component. However, this won't be easily
> clusterable: if you put the same FTPDir component
> on several nodes, all of them will list the
> available files and start downloading them, so
> imho, it won't solve this problem. In both cases,
> the only way to solve the problem (imho) is to use
> a distributed store (on top of a database
> for example) or a DUP remover (which would be a
> good idea to implement for other use cases too).
>
> Anyway, this is an interesting approach and we have
> already discussed it in other threads (see [1] for
> example). However, I'd like this service to offer
> a WSDL description of its operations (list files,
> upload, download) which would be independent of
> the ftp protocol, so that it can be implemented by
> other services (file, webdav, etc...).
>
> What do you think ?
>

I agree 100%, and would very much like to see ServiceMix evolve in the direction I consider (perhaps wrongly) the best one for an ESB: JBI message-based services with great support for clustering/fail-over and configuration (some generic way of using xbean.xml for defaults, and then using a datastore to override properties).
Cool ! Let's design and code :)
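As a starting point for that design, the protocol-independent operations mentioned above (list files, upload, download) might look roughly like the interface below. This is only a sketch: RemoteFileService and InMemoryFileService are hypothetical names, not existing ServiceMix classes, and the in-memory implementation is there purely to show that the contract does not depend on ftp.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical protocol-neutral contract; an ftp, file or webdav
// service could each provide its own implementation.
interface RemoteFileService {
    List<String> listFiles(String directory);
    void upload(String path, byte[] content);
    byte[] download(String path);
}

// In-memory implementation, used only to exercise the contract.
class InMemoryFileService implements RemoteFileService {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    // Returns the sorted paths of all files stored under the given directory.
    public List<String> listFiles(String directory) {
        String prefix = directory.endsWith("/") ? directory : directory + "/";
        List<String> result = new ArrayList<>();
        for (String path : files.keySet()) {
            if (path.startsWith(prefix)) {
                result.add(path);
            }
        }
        Collections.sort(result);
        return result;
    }

    // Stores a defensive copy of the content under the given path.
    public void upload(String path, byte[] content) {
        files.put(path, content.clone());
    }

    // Returns a copy of the stored content, or fails if the path is unknown.
    public byte[] download(String path) {
        byte[] content = files.get(path);
        if (content == null) {
            throw new IllegalArgumentException("No such file: " + path);
        }
        return content.clone();
    }
}
```

A caller written against RemoteFileService would not need to know which transport sits behind it, which is the property the WSDL description is meant to give at the JBI level.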
--
View this message in context: http://www.nabble.com/Some-comments-on-Ftp...Endpoint-tf2825900s12049.html#a7891991
Sent from the ServiceMix - User mailing list archive at Nabble.com.

--
Cheers,
Guillaume Nodet
