http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20103

FileSet horrible performance when dir has huge number of subdirs

------- Additional Comments From [EMAIL PROTECTED]  2003-06-04 16:10 -------

I agree with the thoughts presented on revising the way the various DirectoryScanner implementations do their work: scan only the directories required to satisfy the wildcard patterns, and include files named by pattern-free includes directly (unless they have been excluded). A minimal sketch of that idea follows at the end of this comment.

I created a quick-and-dirty override of the <ftp/> task that provides a remoteScan switch, allowing one to turn off remote scanning completely. When the switch is off, it uses a DirectoryNoScanner instead of FTPDirectoryScanner. It is not very smart; it really creates the opposite of the situation we have today. But since I know the domain of my <fileset/> (no patterns), it makes a decent performance test.

With the remoteScan attribute set to the default of "yes", I see the following behavior:

- A list of 10 files takes approx. 5 minutes.
- A list of 35 files takes approx. 10 minutes.
- A list of 100 files takes approx. 30 minutes.

If the <fileset/> gets much larger than this, the server times out (during the scanning) before any files are downloaded.

With the remoteScan attribute set to "no":

- A list of 1000 files takes approx. 40 minutes.
- A list of 2500 files takes approx. 100 minutes.

Downloading begins almost immediately, once the <ftp/> task connects to the server.

These numbers are of course tied to my connection speed, the FTP server's responsiveness, and the file sizes (approx. 25 KB each), but they give a good indication of the potential performance gain.
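To make the proposed scanning change concrete, here is a minimal Java sketch of the idea, assuming nothing about the real DirectoryScanner internals: resolve the literal, wildcard-free prefix of each include pattern directly and walk only that subtree. The class and method names (PrefixScanner, literalPrefix) are illustrative, not Ant's actual API.

    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    public class PrefixScanner {

        /** Longest chain of leading path segments containing no wildcard. */
        static String literalPrefix(String pattern) {
            StringBuilder prefix = new StringBuilder();
            for (String segment : pattern.split("/")) {
                if (segment.indexOf('*') >= 0 || segment.indexOf('?') >= 0) {
                    break; // first wildcard ends the literal prefix
                }
                if (prefix.length() > 0) {
                    prefix.append('/');
                }
                prefix.append(segment);
            }
            return prefix.toString();
        }

        /** Collects files for one include, touching as little of the tree as possible. */
        static List<File> scan(File baseDir, String pattern) {
            List<File> result = new ArrayList<File>();
            String prefix = literalPrefix(pattern);
            File start = prefix.isEmpty() ? baseDir : new File(baseDir, prefix);
            if (prefix.equals(pattern)) {
                // Pattern-free include: stat it directly, no directory listing at all.
                if (start.exists()) {
                    result.add(start);
                }
            } else if (start.isDirectory()) {
                // Only the subtree under the literal prefix is ever listed.
                walk(start, result);
            }
            return result;
        }

        static void walk(File dir, List<File> result) {
            File[] entries = dir.listFiles();
            if (entries == null) {
                return;
            }
            for (File entry : entries) {
                if (entry.isDirectory()) {
                    walk(entry, result);
                } else {
                    // A real scanner would match against the pattern tail here.
                    result.add(entry);
                }
            }
        }
    }

For an include like "src/main/**/*.java" this walks only src/main; for "lib/foo.jar" it issues a single stat, which is exactly the behavior that would save the remote round-trips in the FTP case.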
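For comparison, the "no-scan" scanner behind the remoteScan switch can be almost trivial. This is only a guess at the shape of the DirectoryNoScanner mentioned above, not the actual override:

    public class DirectoryNoScanner {

        private String[] includes = new String[0];

        public void setIncludes(String[] includes) {
            this.includes = includes.clone();
        }

        /** No remote round-trips: the pattern-free includes are the result. */
        public String[] getIncludedFiles() {
            return includes.clone();
        }
    }

With something like this wired into the <ftp/> task, each include maps straight to one download request, which is consistent with the near-immediate download start reported above.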