Re: rsync very slow with large include/exclude file list

2015-06-15 Thread ray vantassle
I investigated the rsync code and found the reason why. For every file in the source, it searches the entire filter-list looking to see if that filename is on the exclude/include list. Most aren't, so it compares (350K - 72K) * 72K names (the non-listed files) plus (72K * 72K/2) names (the ones
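
To put rough numbers on the estimate above, here is a minimal Python sketch (not rsync's actual C code). It takes the 350K source files and 72K listed names from the message and assumes that a listed file is found, on average, halfway through the list. The function names are hypothetical. The set-based lookup assumes every list entry is a literal file name. Real rsync filter rules can contain wildcards and are order-sensitive, so this only illustrates why a linear scan is costly, not a drop-in fix.

```python
def linear_scan_comparisons(total_files, listed_files):
    """Estimate name comparisons when every file is checked against the whole list."""
    unlisted = total_files - listed_files
    # A file that is not on the list is compared against every entry.
    misses = unlisted * listed_files
    # A file that is on the list is found, on average, halfway through.
    hits = listed_files * listed_files // 2
    return misses + hits


def is_listed_linear(name, filter_list):
    """Linear scan: cost grows with the length of the filter list."""
    for entry in filter_list:
        if entry == name:
            return True
    return False


def build_lookup(filter_list):
    """Hash-based alternative (assumes literal names only, no wildcard patterns)."""
    return set(filter_list)


if __name__ == "__main__":
    total = linear_scan_comparisons(350_000, 72_000)
    print(f"estimated comparisons: {total:,}")

    # Hypothetical file names, for illustration only.
    excludes = [f"sensor_{i}.dat" for i in range(1000)]
    lookup = build_lookup(excludes)
    print(is_listed_linear("sensor_999.dat", excludes), "sensor_999.dat" in lookup)
```

With a set, each membership test takes roughly constant time on average, so the total work grows with the number of source files rather than with files times list entries.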

Re: rsync very slow with large include/exclude file list

2015-06-15 Thread Ken Chase
This is similar to using fuzzy / -y in a large directory. O(n^2) behaviour occurs and can be incredibly slow. It would seem that no caching of MD5s for the directory occurs (or, even if it does, there are still O(n^2) comparisons). /kc On Mon, Jun 15, 2015 at 06:02:14PM -0500, ray vantassle said: I

rsync very slow with large include/exclude file list

2015-06-15 Thread ray vantassle
I have a sensor collector system (a very low-powered, slow ARM CPU) and another system that pulls the data files from it daily for processing. There are about 1000 new files each day. As part of the processing, it decides that certain files are of no interest and adds them to an exclude