> From: Stefan Bodewig [mailto:[EMAIL PROTECTED]
>
> > I'm sure I'm probably not understanding your change correctly, so I
> > guess I need an explanation for how your change works.
>
> Basically, if I want to be able to keep some stuff in the target
> directory, I'll need a way to exclude them from the purge process and
> thus using the full power of DirectoryScanner seemed to be the natural
> choice.
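Just to make sure I picture the mechanics correctly, here is roughly how
I imagine the purge side would end up looking - purely my own sketch
against the DirectoryScanner API, not your actual patch, and the names
are made up:

    import java.io.File;
    import org.apache.tools.ant.DirectoryScanner;

    public class PurgeSketch {
        /**
         * Everything under targetDir that does not match one of the
         * "keep" patterns (user-supplied excludes plus the non-orphan
         * files) would be a candidate for removal.
         */
        static String[] findRemovalCandidates(File targetDir, String[] keepPatterns) {
            DirectoryScanner ds = new DirectoryScanner();
            ds.setBasedir(targetDir);
            // with no includes set, everything is included by default,
            // so whatever survives the excludes below is returned
            ds.setExcludes(keepPatterns);
            ds.scan();
            return ds.getIncludedFiles();
        }
    }

If that is the shape of it, the user-visible part is what I'm wondering
about next.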
So you plan on adding a patternset or an implicit/explicit fileset for
things that should not be touched?

I myself thought about this issue, and thought that the <zip> mechanism
of being able to specify a mapper or destination directory (kind of like
zipfileset's prefix) for each input (source) fileset would be another
way to look at the same problem. It would allow having a single <sync>
with several filesets, instead of several <sync>s with a single fileset
each, with excludes.

> > I believe toArray doesn't care if the array provided is larger than
> > needed.
>
> This I am not sure about - and I worked on my iBook with only two JDKs
> to test as opposed to my Linux desktop with seven 8-)
>
> I wanted to be safe here first.

This is just one data point with JDK 1.4.2, but toArray basically puts a
null in the one-past-last element if the array is bigger. So in all
cases it will be safe to write elements after calling toArray, with
indices bigger than the size of the Collection converted.

    public Object[] toArray(Object a[]) {
        int size = size();
        if (a.length < size)
            a = (Object[]) java.lang.reflect.Array.newInstance(
                    a.getClass().getComponentType(), size);

        Iterator it = iterator();
        for (int i = 0; i < size; i++)
            a[i] = it.next();

        if (a.length > size)
            a[size] = null;

        return a;
    }

> >> +        ds.setExcludes(excls);
> >> +        ds.scan();
>
> > I guess this is where I'm the most concerned. As I've written above,
> > the nonOrphans Set will be quite large for large syncs, and even
> > though I know Antoine optimized DirectoryScanner a lot, I'm doubtful
> > a scanner with thousands of excludes is as fast as a lookup in a Set,
> > as the previous implementation was.
>
> Most time spent in DirectoryScanner is file system scanning AFAIU.
> Antoine's changes improved performance for large lists of include
> patterns by avoiding scans of directories completely. Since the
> original code had to scan the directories as well, I don't think the
> impact will be significant (though I agree it will be slower).

Which is precisely why I'm worried! We need to scan everything in any
case, as you rightly point out, so we end up probably doing a linear
search against all the excludes for every file of the dest. dir.,
instead of a lookup in a Set.

Like I said, I think the performance will suffer a lot for large syncs,
and I'd rather not add this feature if it can't be implemented more
efficiently. For a relatively small 1,000-file sync, with roughly as
many exclude patterns, that's already 1,000,000 checks against the
exclude patterns. Make that 10,000 files, and we're already at 100
million checks... This will end up far outweighing the recursive
directory visit, I'm afraid.

Maybe DirectoryScanner is smarter than I think it is!? --DD
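P.S. To make the cost I'm worried about a bit more concrete, here is a
toy comparison I threw together - not Ant code, and the matching is
simplified to String.equals (real pattern matching per exclude is more
expensive, not less):

    import java.util.Set;

    public class SyncCostSketch {

        // Previous implementation, roughly: one hash lookup per file
        // found in the target directory.
        static int countOrphansWithSet(String[] targetFiles, Set nonOrphans) {
            int orphans = 0;
            for (int i = 0; i < targetFiles.length; i++) {
                if (!nonOrphans.contains(targetFiles[i])) {
                    orphans++;
                }
            }
            return orphans;             // ~ files checks in total
        }

        // Exclude-based approach as I read it: each file is compared
        // against every exclude pattern until one matches.
        static int countOrphansWithExcludes(String[] targetFiles, String[] excludes) {
            int orphans = 0;
            for (int i = 0; i < targetFiles.length; i++) {
                boolean excluded = false;
                for (int j = 0; j < excludes.length && !excluded; j++) {
                    excluded = excludes[j].equals(targetFiles[i]);
                }
                if (!excluded) {
                    orphans++;
                }
            }
            return orphans;             // ~ files * excludes checks in total
        }
    }

With 10,000 target files and roughly as many exclude patterns, the
second version does on the order of 100,000,000 comparisons where the
first one does 10,000 lookups.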