[ https://issues.apache.org/jira/browse/YETUS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435186#comment-15435186 ]
Allen Wittenauer commented on YETUS-421:
----------------------------------------
bq. Replacing the find to a temp file followed by a while loop with just a find
command may not provide much speedup.
For extremely large builds (hai hadoop), it should, because a single find
doesn't traverse the same files multiple times.
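
As a rough sketch of the single-pass idea (not the actual Yetus code): the
function name find_archive_files and its arguments are hypothetical. It builds
one find expression out of all the requested file names, so the tree is walked
once no matter how many patterns are in the list.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Yetus: match every pattern in one tree walk.

# find_archive_files ROOT PATTERN...
# Prints the matching file paths, one per line.
find_archive_files() {
  local root=$1
  shift
  local -a expr=()
  local pattern
  for pattern in "$@"; do
    # Join the patterns with -o so each file is tested against all of them.
    if (( ${#expr[@]} > 0 )); then
      expr+=(-o)
    fi
    expr+=(-name "${pattern}")
  done
  # No patterns given: print nothing rather than matching every file.
  if (( ${#expr[@]} == 0 )); then
    return 0
  fi
  find "${root}" -type f \( "${expr[@]}" \) -print
}

# Example (assumed layout): find_archive_files . checkstyle-errors.xml findbugsXml.xml
```

Running find once per pattern would instead cost one full traversal per entry
in the list, which is where the time goes on a big repo.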
bq. Is there a reason rsync is used instead of a cp command?
Given that the archiver gets called a lot, my thinking was that this would save
on IO over the long haul if the same file is being asked to be copied twice.
For example, if clean is never run, the contents of the target directory may
have the same files still in there. Rather than do a full copy every time,
rsync should short circuit that.
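
A minimal sketch of that trade-off, assuming a hypothetical helper named
archive_copy (this is not the Yetus implementation). With -a, rsync preserves
modification times, so on a repeat run its default size-and-mtime check lets it
skip files that are already in place. The cp fallback is only there for
machines without rsync and always does the full copy.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Yetus: copy one file into an archive directory.

# archive_copy SRC DESTDIR
archive_copy() {
  local src=$1
  local destdir=$2
  mkdir -p "${destdir}" || return 1
  if command -v rsync >/dev/null 2>&1; then
    # -a keeps mtimes, so an unchanged file is skipped on the next call.
    rsync -a -- "${src}" "${destdir}/"
  else
    # Assumed fallback: cp has no skip logic, so it rewrites the file every time.
    cp -p -- "${src}" "${destdir}/"
  fi
}

# Example (assumed paths): archive_copy target/findbugsXml.xml "${PATCH_DIR}/archiver"
```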
bq. Is there a tunable parameter in YETUS for the number of concurrent threads
to run at the same time?
The closest is probably --test-threads.
To put this into perspective, Hadoop's qbt job on builds
(https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86) runs with this
option:
--archive-list=checkstyle-errors.xml,findbugsXml.xml
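
For illustration only, here is one way a comma-separated value like that could
be split into individual file names in bash. The function parse_archive_list
and the array ARCHIVE_PATTERNS are made-up names for this sketch, not Yetus
internals.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Yetus: split a comma-separated list.

# parse_archive_list LIST
# Fills the global array ARCHIVE_PATTERNS with the entries of LIST.
parse_archive_list() {
  local list=$1
  ARCHIVE_PATTERNS=()
  # Limit the IFS change to this function so callers are unaffected.
  local IFS=,
  read -r -a ARCHIVE_PATTERNS <<< "${list}"
}

# Example: parse_archive_list "checkstyle-errors.xml,findbugsXml.xml"
```

Each entry then becomes one name to look for, so the length of the list
directly drives how much work a per-pattern loop has to do.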
> archiver should be smarter
> --------------------------
>
> Key: YETUS-421
> URL: https://issues.apache.org/jira/browse/YETUS-421
> Project: Yetus
> Issue Type: Bug
> Components: Test Patch
> Affects Versions: 0.4.0
> Reporter: Allen Wittenauer
> Priority: Minor
>
> Ideally, the archiver functionality should use a single find statement rather
> than running in a for loop. As the list gets bigger and the repo gets more
> clogged with files, the slower it runs.