Another issue is that rsync and find are single-threaded applications.  No 
matter how many processors/cores/threads the system has, each invocation of 
find or rsync will use only one thread.

You can gain some parallelism by stepping up a level in the directory tree and 
running a separate find or rsync for each first-level subdirectory.  I do this 
when transferring files between systems over a modern gigabit LAN.
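
As a rough sketch of the above, assuming a find and xargs that support 
-maxdepth, -print0, -0 and -P (GNU and BSD do; these are not POSIX).  The 
function name, the paths, the host name and the job count of 4 are made up 
for illustration.  Each first-level subdirectory gets its own process, and 
xargs substitutes the subdirectory path for {}:

```shell
#!/bin/sh
# Hypothetical helper: run COMMAND once per first-level subdirectory of SRC,
# with at most JOBS invocations running at the same time.
# Usage: per_subdir_parallel SRC JOBS COMMAND [ARGS...]
# Every {} in COMMAND/ARGS is replaced by the subdirectory path.
per_subdir_parallel() {
    src=$1
    jobs=$2
    shift 2
    # -print0 / -0 keep names containing spaces or newlines intact.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I {} "$@"
}

# Example invocations (placeholder paths and host, commented out on purpose):
# per_subdir_parallel /srv/data 4 rsync -a {} backuphost:/srv/data/
# per_subdir_parallel /srv/data 4 find {} -type f -mtime +30 -print
```

Files that sit directly in the top-level directory are not covered by this 
and would need one extra non-recursive pass.  Whether more jobs actually help 
depends on where the bottleneck is (CPU, disk or network), so the job count 
is something to experiment with.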

From: [email protected] [mailto:[email protected]] On 
Behalf Of Cristian Bichis
Sent: Thursday, March 28, 2013 1:31 AM
To: [email protected]
Subject: Best organizing hundreds of thousands files for rsync and find

Hi,

I need to organize about 100 million small files (and the number keeps 
growing) on a server, which should then be copied to another server.

I am wondering how many files it is recommended to keep in one folder for 
optimal performance. Also, if a folder contains only subfolders (no files), 
how many subfolders is it recommended to have?

The same question applies to the "find" command, not just to rsync, as I am 
doing some cleanups using find (or a for loop over find).


I made a mistake earlier: I greatly increased the number of subfolders (with 
just a few files in each), and rsync performance decreased considerably. I 
will try to correct that mistake.

So now, as the number of files is constantly increasing, I need to find a 
long-term solution to correct the current issues.

Cristian
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
