[ https://issues.apache.org/jira/browse/IO-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029669#comment-13029669 ]
Sebb commented on IO-271:
-------------------------
Using list() instead of listFiles() would be possible, but at best it would
only roughly double the size of directory that could be processed.
The only way to truly fix the problem would be to use a method that provided
access to the file names one by one, but there does not appear to be a method
to do this.
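
For readers on Java 7 or later: the NIO.2 API (java.nio.file), which was not
part of Java 5 or 6, does offer this kind of one-at-a-time access through
DirectoryStream. The following is only a minimal sketch, assuming a flat
(non-recursive) copy; the class name StreamingCopy and the method copyFlat are
hypothetical, and this is not how FileUtils.copyDirectory is implemented.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Commons IO.
public class StreamingCopy {

    /**
     * Copies the regular files directly inside source into target,
     * visiting directory entries one at a time instead of loading the
     * whole listing into an array. Subdirectories are skipped.
     *
     * @return the number of files copied
     */
    public static long copyFlat(Path source, Path target) throws IOException {
        Files.createDirectories(target);
        long copied = 0;
        // DirectoryStream iterates lazily; try-with-resources closes it.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(source)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    Files.copy(entry, target.resolve(entry.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING,
                            StandardCopyOption.COPY_ATTRIBUTES);
                    copied++;
                }
            }
        }
        return copied;
    }
}
```

COPY_ATTRIBUTES is used here so that the copy keeps the source file's
last-modified time where the platform supports it.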
AFAICT FileUtils does not override anything. In any case, why would it be
necessary to delay updating the modification date on the target file?
Personally, I don't think this is worth implementing. Users can always
implement their own filtering to split the transfer into chunks. Or just make
sure that directories don't contain so many files - this is likely to cause
problems elsewhere as well.
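
As a rough illustration of the user-side filtering mentioned above, here is a
sketch of a java.io.FileFilter that assigns each file to one of N buckets by
hashing its name, so that each pass copies only one bucket. The class name
BucketFileFilter is hypothetical. Because the bucket depends only on the file
name, the split does not rely on listing order being the same between runs.
Directories are always accepted so that every pass still descends into them.

```java
import java.io.File;
import java.io.FileFilter;

// Hypothetical helper, not part of Commons IO.
public class BucketFileFilter implements FileFilter {

    private final int bucket;
    private final int bucketCount;

    /**
     * @param bucket      index of the bucket this filter accepts (0-based)
     * @param bucketCount total number of buckets, i.e. number of passes
     */
    public BucketFileFilter(int bucket, int bucketCount) {
        if (bucketCount <= 0 || bucket < 0 || bucket >= bucketCount) {
            throw new IllegalArgumentException(
                    "bucket must be in [0, bucketCount) and bucketCount > 0");
        }
        this.bucket = bucket;
        this.bucketCount = bucketCount;
    }

    @Override
    public boolean accept(File file) {
        // Always accept directories so nested files are reachable in every pass.
        if (file.isDirectory()) {
            return true;
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(file.getName().hashCode(), bucketCount) == bucket;
    }
}
```

With Commons IO on the classpath, each pass could then be something like
FileUtils.copyDirectory(src, dest, new BucketFileFilter(i, n)) for i from 0 to
n - 1. This limits how many files each pass selects; whether it lowers peak
memory depends on how the underlying directory listing is implemented, since
the full list of names may still be read on every pass.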
> FileUtils.copyDirectory should be able to handle arbitrary number of files
> --------------------------------------------------------------------------
>
> Key: IO-271
> URL: https://issues.apache.org/jira/browse/IO-271
> Project: Commons IO
> Issue Type: Improvement
> Components: Utilities
> Affects Versions: 2.0.1
> Reporter: Stephen Kestle
>
> File.listFiles() uses up to a bit over 2 times as much memory as File.list().
> The latter should be used in doCopyDirectory where there is no filter
> specified.
> This memory usage is a problem when copying directories with hundreds of
> thousands of files.
> I was also thinking of the option of implementing a file filter (that could
> be composed with the inputted filter) that would batch the file copy
> operation; copy the first 10000 (that match), then the next 10000 etc etc.
> Because of the lack of ordering consistency (between runs) of
> File.listFiles(), there would need to be a final file filter that would
> accept files that have not successfully been copied.
> I'm primarily concerned about copying into an empty directory (I validate
> this beforehand), but for general operation where it's a merge, the
> modification date re-writing should only be done in the final run of copies
> so that while batching occurs (and indeed the final "missed" filtering) files
> do not get copied if they have been modified after the start time. (I presume
> that I'm reading FileUtils correctly in that it overrides files...)