[
https://issues.apache.org/jira/browse/IO-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030476#comment-13030476
]
Sebb commented on IO-271:
-
I'm not sure the memory usage checking strategy is appropriate. If you are near
the limits of memory, creating the original list may well tip you over the
limit anyway.
Further, for very large directories, even a String[] array may be too much.
As I wrote earlier, the only sure way to fix this is to process the file
entries one by one, but Java does not seem to provide this.
As already explained, listFiles() is more efficient at creating the File
entries than list() plus new File(), so I don't think the general case should
be changed even in the non-filter case.
AFAICT, your use case is very unusual. Given the difficulties that such large
directories are likely to cause other applications, and the fact that it is not
possible to support arbitrarily large numbers of files, I would look to see if
I could reduce the directory size, e.g. by splitting into subdirectories. That
would probably improve file system performance too.
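
For reference, the one-entry-at-a-time processing described above is not
available through java.io.File, but java.nio.file (Java 7 and later) offers
DirectoryStream, which iterates over a directory without building the full
listing first. The following is only a rough sketch under that assumption.
The class and method names (StreamingDirectoryCopy, copyFlat) are
hypothetical and not part of Commons IO, and the sketch copies regular files
from a single directory level only.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Commons IO.
final class StreamingDirectoryCopy {

    /**
     * Copies the regular files directly inside srcDir into destDir,
     * visiting one directory entry at a time. Subdirectories are skipped.
     * Returns the number of files copied.
     */
    static long copyFlat(Path srcDir, Path destDir) throws IOException {
        Files.createDirectories(destDir);
        long copied = 0;
        // The stream yields entries lazily, so no array of all names is held.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(srcDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    Files.copy(entry, destDir.resolve(entry.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }
}
```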
FileUtils.copyDirectory should be able to handle arbitrary number of files
--
Key: IO-271
URL: https://issues.apache.org/jira/browse/IO-271
Project: Commons IO
Issue Type: Improvement
Components: Utilities
Affects Versions: 2.0.1
Reporter: Stephen Kestle
Priority: Minor
File.listFiles() uses up to a bit over 2 times as much memory as File.list().
The latter should be used in doCopyDirectory where there is no filter
specified.
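
A minimal sketch of this idea, for illustration only: it lists names with
File.list() and creates each File just before it is needed, so the full
File[] array is never held. The class and method names (NameBasedCopy,
copyByName) are hypothetical, this is not how doCopyDirectory is written,
and only regular files in a single directory level are handled.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Commons IO.
final class NameBasedCopy {

    /**
     * Copies regular files from srcDir to destDir using File.list(),
     * creating one File object per entry as it is processed.
     * Returns the number of files copied.
     */
    static int copyByName(File srcDir, File destDir) throws IOException {
        String[] names = srcDir.list();
        if (names == null) { // not a directory, or an I/O error occurred
            throw new IOException("Failed to list contents of " + srcDir);
        }
        if (!destDir.mkdirs() && !destDir.isDirectory()) {
            throw new IOException("Cannot create directory " + destDir);
        }
        int copied = 0;
        for (String name : names) {
            File src = new File(srcDir, name); // short-lived, one at a time
            if (src.isFile()) {
                Files.copy(src.toPath(), new File(destDir, name).toPath(),
                        StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }
}
```

Note that the String[] of names is still held in full, so this reduces
memory use but does not bound it.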
This memory usage is a problem when copying directories with hundreds of
thousands of files.
I was also thinking of the option of implementing a file filter (that could
be composed with the inputted filter) that would batch the file copy
operation; copy the first 1 (that match), then the next 1 etc etc.
Because of the lack of ordering consistency (between runs) of
File.listFiles(), there would need to be a final file filter that would
accept files that have not successfully been copied.
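
One way the batching filter might look, as a sketch under stated
assumptions: the destination starts out empty, "already copied" is judged
by the file existing in the destination, and only regular files in a single
directory level are copied. BatchedCopy, BatchingFilter and copyInBatches
are hypothetical names, not Commons IO API. Because every pass skips
whatever is already present, the last pass also picks up files missed
earlier due to inconsistent listing order.

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper, not part of Commons IO.
final class BatchedCopy {

    /** Accepts at most batchSize regular files that are not yet in destDir. */
    static final class BatchingFilter implements FileFilter {
        private final File destDir;
        private final int batchSize;
        private int accepted;

        BatchingFilter(File destDir, int batchSize) {
            this.destDir = destDir;
            this.batchSize = batchSize;
        }

        @Override
        public boolean accept(File f) {
            if (accepted >= batchSize || !f.isFile()) {
                return false;
            }
            if (new File(destDir, f.getName()).exists()) {
                return false; // copied in an earlier pass
            }
            accepted++;
            return true;
        }
    }

    /** Copies srcDir into destDir in passes of at most batchSize files. */
    static int copyInBatches(File srcDir, File destDir, int batchSize)
            throws IOException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        if (!destDir.mkdirs() && !destDir.isDirectory()) {
            throw new IOException("Cannot create directory " + destDir);
        }
        int total = 0;
        while (true) {
            // A fresh filter per pass resets the accepted counter.
            File[] batch = srcDir.listFiles(new BatchingFilter(destDir, batchSize));
            if (batch == null) {
                throw new IOException("Failed to list contents of " + srcDir);
            }
            if (batch.length == 0) {
                return total; // nothing left that is missing from destDir
            }
            for (File f : batch) {
                Files.copy(f.toPath(), new File(destDir, f.getName()).toPath());
                total++;
            }
        }
    }
}
```

Each pass lists the whole source directory again, so this trades extra
directory scans for a bounded number of File objects per pass.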
I'm primarily concerned about copying into an empty directory (I validate
this beforehand), but for general operation where it's a merge, the
modification date re-writing should only be done in the final run of copies
so that while batching occurs (and indeed the final missed filtering) files
do not get copied if they have been modified after the start time. (I presume
that I'm reading FileUtils correctly in that it overwrites files...)