> From: Stefan Bodewig [mailto:[EMAIL PROTECTED]
> > I'm sure I'm probably not understanding your change correctly, so I
> > guess I need an explanation of how your change works.
> 
> Basically, if I want to be able to keep some stuff in the target
> directory, I'll need a way to exclude them from the purge process and
> thus using the full power of DirectoryScanner seemed to be the natural
> choice.

So you plan on adding a patternset or an implicit/explicit fileset for
things that should not be touched?

I thought about this issue myself, and it seemed to me that a <zip>-like
mechanism, where each input (source) fileset can specify a mapper or a
destination directory (much like zipfileset's prefix), would be another
way to look at the same problem.

It would allow a single <sync> with several filesets, instead of
several <sync>s with a single fileset each, plus excludes.

> > I believe toArray doesn't care if the array provided is larger than
> > needed.
> 
> This I am not sure about - and I worked on my iBook with only two JDKs
> to test as opposed to my Linux desktop with seven 8-)
> 
> I wanted to be safe here first.

This is just one data point, from JDK 1.4.2, but when the array is
bigger than the collection, toArray only puts a null at the index one
past the last element. So it should always be safe, after calling
toArray, to write elements at indices beyond the size of the converted
Collection.

    public Object[] toArray(Object a[]) {
        int size = size();
        if (a.length < size)
            a = (Object[])java.lang.reflect.Array.newInstance(
                                  a.getClass().getComponentType(), size);

        Iterator it=iterator();
        for (int i=0; i<size; i++)
            a[i] = it.next();

        if (a.length > size)
            a[size] = null;

        return a;
    }
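
A quick way to check the same behavior on whatever JDK is at hand is a
sketch like the one below. The class and method names are made up for
illustration, and it only exercises the List implementations passed to
it, so treat it as one more data point rather than a proof:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Ant: shows what toArray does with an
// array that is larger than the collection being converted.
class ToArrayDemo {

    // Copies items into an array with `extra` spare slots, pre-filled
    // with a marker so we can see which slots toArray overwrites.
    static String[] copyIntoOversized(List<String> items, int extra) {
        String[] target = new String[items.size() + extra];
        Arrays.fill(target, "stale");
        return items.toArray(target);
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<String>();
        items.add("a");
        items.add("b");

        String[] result = copyIntoOversized(items, 3);
        // Only the slot right after the last element is nulled out;
        // the remaining spare slots keep their previous content.
        System.out.println(Arrays.toString(result));

        // Writing past the collection's size afterwards is fine.
        result[items.size()] = "appended";
        System.out.println(Arrays.toString(result));
    }
}
```

The slots beyond index size() + 1 are left alone by toArray, so anything
that needs them empty has to clear them itself.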

> 
> >>   +        ds.setExcludes(excls);
> >>   +        ds.scan();
> >
> > I guess this is where I'm the most concerned. As I've written above,
> > the nonOrphans Set will be quite large for large syncs, and even
> > though I know Antoine optimized DirectoryScanner a lot, I'm doubtful
> > a scanner with thousands of excludes is as fast as a lookup in a Set,
> > as the previous implementation was.
> 
> Most time spent in DirectoryScanner is file system scanning AFAIU.
> Antoine's changes improved performance for large lists of include
> patterns by avoiding scans of directories completely.  Since the
> original code had to scan the directories as well, I don't think the
> impact will be significant (though I agree it will be slower).

Which is precisely why I'm worried! We need to scan everything in any
case, as you rightly point out, so we probably end up doing a linear
search against all the excludes for every file in the destination
directory, instead of a lookup in a set.

Like I said, I think the performance will suffer a lot for large syncs,
and I'd rather not add this feature if it can't be implemented more
efficiently.

For a relatively small sync of 1,000 files, which means roughly 1,000
excludes, that's already up to 1,000,000 checks against the exclude
patterns. Make that 10,000 files, and we're already at 100 million
checks...

This will end up far outweighing the cost of the recursive directory
visit, I'm afraid.
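
To make the arithmetic concrete, here is a rough sketch that just counts
the work in both approaches. It is not DirectoryScanner code: the class
and method names are invented, plain String.equals stands in for real
pattern matching (which is more expensive per check), and it models the
worst case where no file in the destination matches any exclude:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical cost model, not Ant code.
class ExcludeCostSketch {

    // Builds n made-up file names such as "src-0", "src-1", ...
    static List<String> names(String prefix, int n) {
        List<String> result = new ArrayList<String>(n);
        for (int i = 0; i < n; i++) {
            result.add(prefix + i);
        }
        return result;
    }

    // Every destination file is tested against each exclude in turn,
    // stopping at the first match. Returns the number of comparisons.
    static long linearChecks(List<String> destFiles, List<String> excludes) {
        long checks = 0;
        for (String file : destFiles) {
            for (String exclude : excludes) {
                checks++;
                if (exclude.equals(file)) {
                    break;
                }
            }
        }
        return checks;
    }

    // Previous approach: one hash lookup per destination file.
    static long setLookups(List<String> destFiles, Set<String> nonOrphans) {
        long lookups = 0;
        for (String file : destFiles) {
            lookups++;
            nonOrphans.contains(file);
        }
        return lookups;
    }

    public static void main(String[] args) {
        List<String> dest = names("orphan-", 1000);
        List<String> excludes = names("src-", 1000);
        System.out.println("linear checks: " + linearChecks(dest, excludes));
        System.out.println("set lookups:   "
            + setLookups(dest, new HashSet<String>(excludes)));
    }
}
```

When most destination files do match an exclude, the linear count drops
to roughly half on average, but it still grows with the product of the
two sizes, whereas the set lookup count grows with the number of files
only.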

Maybe DirectoryScanner is smarter than I think it is!? --DD
