[ 
https://issues.apache.org/jira/browse/IO-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029669#comment-13029669
 ] 

Sebb commented on IO-271:
-------------------------

Using list() instead of listFiles() would be possible, but would at best only 
double the size of a directory that could be processed.
The only way to truly fix the problem would be to use a method that provides 
access to the file names one by one, but there does not appear to be such a 
method.
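
A minimal sketch of what one-by-one access could look like, assuming a Java 7 
or later runtime where java.nio.file.Files.newDirectoryStream is available 
(java.io.File itself offers no such method). The class and method names are 
hypothetical, it only handles a flat directory, and it is not how 
FileUtils.copyDirectory is implemented.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Commons IO.
final class StreamingCopy {

    /**
     * Copies the regular files directly inside src into dest, visiting the
     * entries one at a time instead of loading them all into an array.
     * Subdirectories are skipped. Returns the number of files copied.
     */
    static long copyFlat(Path src, Path dest) throws IOException {
        Files.createDirectories(dest);
        long copied = 0;
        // DirectoryStream must be closed, hence try-with-resources.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(src)) {
            for (Path entry : entries) {
                if (!Files.isRegularFile(entry)) {
                    continue; // flat copy only: ignore subdirectories
                }
                Files.copy(entry, dest.resolve(entry.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING,
                        StandardCopyOption.COPY_ATTRIBUTES);
                copied++;
            }
        }
        return copied;
    }
}
```

Only one entry is held by the loop at a time, so memory use should not grow 
with the number of files in the directory.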

AFAICT FileUtils does not override anything - anyway, why would it be necessary 
to delay updating the mod. date on the target file?

Personally, I don't think this is worth implementing. Users can always 
implement their own filtering to split the transfer into chunks. Or just make 
sure that directories don't contain so many files; a directory that large is 
likely to cause problems elsewhere as well.
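
A rough sketch of such user-side filtering, assuming a flat source directory 
and a destination that starts out empty. BatchFilter is a hypothetical class, 
not part of Commons IO; it accepts at most batchSize files per pass and skips 
any file whose name already exists in the destination.

```java
import java.io.File;
import java.io.FileFilter;

// Hypothetical user-side filter, not part of Commons IO.
final class BatchFilter implements FileFilter {
    private final File destDir;
    private final int batchSize;
    private int accepted = 0;

    BatchFilter(File destDir, int batchSize) {
        this.destDir = destDir;
        this.batchSize = batchSize;
    }

    @Override
    public boolean accept(File candidate) {
        if (accepted >= batchSize) {
            return false; // this pass already has a full batch
        }
        if (new File(destDir, candidate.getName()).exists()) {
            return false; // already copied by an earlier pass
        }
        accepted++;
        return true;
    }

    /** Number of files accepted so far in this pass. */
    int acceptedCount() {
        return accepted;
    }
}
```

A fresh BatchFilter would be passed to 
FileUtils.copyDirectory(srcDir, destDir, filter) on each pass, repeating until 
acceptedCount() stays at 0. Each pass still enumerates the whole source 
directory, so this limits how many files are copied per pass rather than 
removing the cost of listing them.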

> FileUtils.copyDirectory should be able to handle arbitrary number of files
> --------------------------------------------------------------------------
>
>                 Key: IO-271
>                 URL: https://issues.apache.org/jira/browse/IO-271
>             Project: Commons IO
>          Issue Type: Improvement
>          Components: Utilities
>    Affects Versions: 2.0.1
>            Reporter: Stephen Kestle
>
> File.listFiles() uses up to a bit over 2 times as much memory as File.list(). 
>  The latter should be used in doCopyDirectory where there is no filter 
> specified.
> This memory usage is a problem when copying directories with hundreds of 
> thousands of files.
> I was also thinking of the option of implementing a file filter (that could 
> be composed with the inputted filter) that would batch the file copy 
> operation; copy the first 10000 (that match), then the next 10000 etc etc.
> Because of the lack of ordering consistency (between runs) of 
> File.listFiles(), there would need to be a final file filter that would 
> accept files that have not successfully been copied.
> I'm primarily concerned about copying into an empty directory (I validate 
> this beforehand), but for general operation where it's a merge, the 
> modification date re-writing should only be done in the final run of copies 
> so that while batching occurs (and indeed the final "missed" filtering) files 
> do not get copied if they have been modified after the start time. (I presume 
> that I'm reading FileUtils correctly in that it overrides files...)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
