http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20103

FileSet horrible performance when dir has huge number of subdirs

------- Additional Comments From [EMAIL PROTECTED]  2003-06-04 16:10 -------
I agree with the thoughts presented on revising the way the various
DirectoryScanner implementations do their work: scan only the directories
required to satisfy the wildcard patterns, and include files named without
wildcards directly (unless they have been excluded).
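
To make that concrete, here is a minimal, self-contained sketch of the idea.
This is not Ant's actual DirectoryScanner API; the class and method names
(SelectiveScanner, scan, hasWildcards) are illustrative only. Literal
includes are resolved with a single existence check, and only wildcard
includes pay for a recursive walk:

    import java.io.IOException;
    import java.nio.file.FileSystems;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.PathMatcher;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.stream.Stream;

    public class SelectiveScanner {

        // A pattern with no wildcards can be resolved without listing
        // any directories at all.
        static boolean hasWildcards(String pattern) {
            return pattern.indexOf('*') >= 0 || pattern.indexOf('?') >= 0;
        }

        static List<Path> scan(Path basedir, List<String> includes)
                throws IOException {
            List<Path> matched = new ArrayList<>();
            for (String pattern : includes) {
                if (!hasWildcards(pattern)) {
                    // Literal include: one stat call, no scan.
                    Path candidate = basedir.resolve(pattern);
                    if (Files.exists(candidate)) {
                        matched.add(candidate);
                    }
                } else {
                    // Wildcard include: only now walk the tree. (A smarter
                    // version would descend only into directories whose
                    // path can still match a prefix of the pattern.)
                    PathMatcher m = FileSystems.getDefault()
                            .getPathMatcher("glob:" + pattern);
                    try (Stream<Path> files = Files.walk(basedir)) {
                        files.filter(Files::isRegularFile)
                             .filter(p -> m.matches(basedir.relativize(p)))
                             .forEach(matched::add);
                    }
                }
            }
            return matched;
        }
    }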

I created a quick-and-dirty override of the <ftp/> task that adds a
remoteScan switch, allowing one to turn off remote scanning completely. When
the switch is off it uses a DirectoryNoScanner instead of FTPDirectoryScanner.
It is not very smart; it really creates the exact opposite of the situation we
have today. But since I know the domain of my <fileset/> (no wildcard
patterns), it makes a decent performance test.
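
For readers without the patch, here is roughly what the remoteScan="no" path
boils down to, written directly against Jakarta Commons Net's FTPClient rather
than through the task. The class name (NoScanFetch) and the host, credentials,
and file list are placeholders; only FTPClient and its calls are real API.
Each fileset entry is retrieved by name, with no remote directory listing:

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.List;
    import org.apache.commons.net.ftp.FTPClient;

    public class NoScanFetch {

        static void fetch(String host, String user, String pass,
                          List<String> remotePaths) throws IOException {
            FTPClient ftp = new FTPClient();
            ftp.connect(host);
            try {
                ftp.login(user, pass);
                ftp.enterLocalPassiveMode();
                for (String remote : remotePaths) {
                    // No LIST/NLST round trips before the transfer: go
                    // straight to RETR for each known path. This is why
                    // downloading starts almost immediately.
                    String local =
                            remote.substring(remote.lastIndexOf('/') + 1);
                    try (OutputStream out = new FileOutputStream(local)) {
                        if (!ftp.retrieveFile(remote, out)) {
                            System.err.println("failed: " + remote);
                        }
                    }
                }
                ftp.logout();
            } finally {
                ftp.disconnect();
            }
        }
    }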

With the remoteScan attribute left at its default of "yes", I see the
following behavior:

  10 files:   approx. 5 minutes
  35 files:   approx. 10 minutes
  100 files:  approx. 30 minutes

If the <fileset/> gets much larger than this, the server times out (during
the scanning) before any files are downloaded.

With the remoteScan attribute set to "no":

  1000 files:  approx. 40 minutes
  2500 files:  approx. 100 minutes

Downloading begins almost immediately once the <ftp/> task connects to the
server.

These performance stats are of course tied to my connection speed, the FTP
server's responsiveness, and the file sizes (approx. 25KB each). But they do
give a good indication of the potential gain: roughly 3 files per minute with
remote scanning versus about 25 per minute without.
