** Description changed: - In the last few months my home directory backup stopped completing. - I've been able to reproduce the problem on a single subdirectory - although I had to add the --debug=all flag to reproduce it on that - smaller directory. Specifically, this command never completes: + [Impact] + + What the user suffering from this bug experiences is that the big amount + of informative messages related to the copy process with the three + spawned processes(sender, receiver and generator) exhaust the I/O + buffer and the sync gets stuck, either because there are too many files + to synchronise and/or because too many detail messages (levels of + verbose mode) have been requested in the output. + + The fix, that comes from upstream and is applied there since version + 3.2.0., increments the size of the receiver's I/O buffer. + + + [Test Plan] + This test plan is for Focal, but it's the same for Bionic. + + 0.Preparing the test environment: + + #Preparing the container + lxc launch images:ubuntu/focal rsync-iobuffer-focal + lxc shell rsync-iobuffer-focal + apt update -y + apt upgrade -y + + #Installing necessary tools + apt install rsync + apt install wget + + #Get test cases from comments #16 and #19 on this LP bug + + cd /tmp/ + + #19 + wget https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1528921/+attachment/5211950/+files/html2.tgz + tar -xvzf /tmp/html2.tgz + mkdir /tmp/rsynctest + + #16 + Paste the contents of https://pastebin.com/raw/ctzJJGwt: + + #!/bin/bash + mkdir source_dir + pushd source_dir + dd if=/dev/zero of=source bs=600K count=1 + + for i in `seq 1 11500`; + do + cp -v source file_$i; + done + + rm source + + for i in `seq 1 10`; + do + dd if=/dev/zero of=file_large_$i bs=200M count=1 + done + + popd + + echo "Created 11500 files with size 600K and 10 files with size 200M, try the following command:" + echo "rsync -avvvz --delete source_dir target_dir" + + in a new file script_comment16.sh + + chmod +x script_comment16.sh + ./script_comment16.sh + + + 1. Bad cases (without and with using strace): + + # Scenario from comment 19 + $ rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ + (Client) Protocol versions: remote=31, negotiated=31 + sending incremental file list + [sender] change_dir(/tmp) + send_files starting + server_recv(2) starting pid=49029 + get_local_name count=7070 /tmp/rsynctest/ + created directory /tmp/rsynctest + [Receiver] change_dir(/tmp/rsynctest) + generator starting pid=49029 + delta-transmission disabled for local transfer or --whole-file + recv_generator(html2,1) + recv_generator(html2,2) + set uid of html2 from 0 to 1000 + set gid of html2 from 0 to 1000 + set modtime of html2 to (1447205118) Wed Nov 11 01:25:18 2015 + [...] + sender finished /tmp/html2/annotator__raw_8h__incl.md5 + send_files(282, /tmp/html2/annotator__raw_8h__incl.png) + html2/annotator__raw_8h__incl.png + + It hangs here, where using strace we can see: + + $ strace rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ + [...] + read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\v\4\0\0\2\233\10\6\0\0\0\361\177\254"..., 262144) = 262144 + select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 2 (in [5], out [4], left {tv_sec=59, tv_usec=999998}) + read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) = 95 + write(4, "K\374\0\7\177\377\207\343\335\345+{W\335{K\371y\211w`Ysl\336B{\312\340}\320\301"..., 64591) = 64591 + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 1 (out [4], left {tv_sec=59, tv_usec=999998}) + write(4, "\336\322\0\7\177\377\255\371\367\215v\321-\224\323+\363\261\243\7\211Do\230\256\257O\372\367:\357O"..., 53986) = 53986 + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0} + + + # Scenario from comment 16 + $ rsync -avvvz --delete source_dir target_dir + sending incremental file list + [sender] make_file(source_dir,*,0) + send_file_list done + [sender] pushing local filters for /root/source_dir/ + [sender] make_file(source_dir/file_3048,*,2) + [sender] make_file(source_dir/file_11358,*,2) + [sender] make_file(source_dir/file_5914,*,2) + [sender] make_file(source_dir/file_5880,*,2) + [sender] make_file(source_dir/file_9318,*,2) + [sender] make_file(source_dir/file_5539,*,2) + [...] + sending file_sum + false_alarms=0 hash_hits=0 matches=0 + sender finished source_dir/file_10807 + send_files(903, source_dir/file_10808) + send_files mapped source_dir/file_10808 of size 614400 + calling match_sums source_dir/file_10808 + source_dir/file_10808 + + It hangs here, where using strace we can see: + + $ strace rsync -avvvz --delete source_dir target_dir + source_dir/file_11280 + read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 + read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 + read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 90112) = 90112 + select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 1 (in [5], left {tv_sec=59, tv_usec=999996}) + read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 1900) = 1900 + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) + select(5, [], [4], [], {tv_sec=60, tv_usec=0} + + + 1. Good cases: + + # Scenario from comment 19 + + $ rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ + (Client) Protocol versions: remote=31, negotiated=31 + sending incremental file list + [sender] change_dir(/tmp) + send_files starting + server_recv(2) starting pid=25249 + get_local_name count=7070 /tmp/rsynctest/ + created directory /tmp/rsynctest + [Receiver] change_dir(/tmp/rsynctest) + generator starting pid=25249 + delta-transmission disabled for local transfer or --whole-file + recv_generator(html2,1) + recv_generator(html2,2) + set uid of html2 from 0 to 1000 + set gid of html2 from 0 to 1000 + set modtime of html2 to (1447205118) Wed Nov 11 01:25:18 2015 + recv_generator(html2/ConstellationDisplayPlot_8h.html,3) + recv_generator(html2/ConstellationDisplayPlot_8h__dep__incl.map,4) + recv_generator(html2/ConstellationDisplayPlot_8h__dep__incl.md5,5) + [...] + generate_files finished + + sent 140,246,278 bytes received 4,053,586 bytes 15,189,459.37 bytes/sec + total size is 252,298,375 speedup is 1.75 + [sender] _exit_cleanup(code=0, file=main.c, line=1207): about to call exit(0) + + + # Scenario from comment 16 + + $ rsync -avvvz --delete source_dir target_dir + sending incremental file list + [sender] make_file(source_dir,*,0) + send_file_list done + [sender] pushing local filters for /tmp/source_dir/ + [sender] make_file(source_dir/file_3052,*,2) + [sender] make_file(source_dir/file_1766,*,2) + [sender] make_file(source_dir/file_10466,*,2) + [sender] make_file(source_dir/file_9375,*,2) + [sender] make_file(source_dir/file_7260,*,2) + [sender] make_file(source_dir/file_5554,*,2) + [sender] make_file(source_dir/file_5523,*,2) + [sender] make_file(source_dir/file_1685,*,2) + [sender] make_file(source_dir/file_7217,*,2) + [sender] make_file(source_dir/file_10411,*,2) + [...] + generate_files finished + + sent 9,555,678 bytes received 3,599,560 bytes 124,694.20 bytes/sec + total size is 9,162,752,000 speedup is 696.51 + + + [Where problems could occur] + + Perhaps the buffer size may not be sufficient for an operation involving + a very huge amount of data, as reported upstream here ( + https://bugzilla.samba.org/show_bug.cgi?id=11166#c9 ): the reporter + asked for a parameter to manually change the size of the receiver's + buffer I/O for messaging for those cases, but it may lend itself to a + memory overflow more easily even than this solution which also has this + risk. + + [Other Info] + + The fix has been picked from upstream here https://github.com/WayneD/rsync/commit/01b9bbb0f94e4ddd4b1e3ad678b220ca09bc0ad0 . I've checked on bugzilla that there aren't bugs related to "out of memory" issues after 3.2.0 release. + + + [Original Report] + ------------------------------------------------------------------- + In the last few months my home directory backup stopped completing. I've been able to reproduce the problem on a single subdirectory although I had to add the --debug=all flag to reproduce it on that smaller directory. Specifically, this command never completes: rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ The html2 directory is a copy of gnuradio-3.7.8.1/build/docs/doxygen/html . When I strace the command, I see this: write(1, "sender finished /tmp/html2/atsc_"..., 58sender finished /tmp/html2/atsc__interleaver_8h__incl.md5 ) = 58 write(1, "send_files(338, /tmp/html2/atsc_"..., 59send_files(338, /tmp/html2/atsc__interleaver_8h__incl.png) ) = 59 open("html2/atsc__interleaver_8h__incl.png", O_RDONLY|O_LARGEFILE) = 3 fstat64(3, {st_mode=S_IFREG|0664, st_size=264657, ...}) = 0 write(1, "html2/atsc__interleaver_8h__incl"..., 37html2/atsc__interleaver_8h__incl.png ) = 37 read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\n\253\0\0\2\233\10\6\0\0\0h\242\""..., 262144) = 262144 select(6, [5], [4], [5], {60, 0}) = 2 (in [5], out [4], left {59, 999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) = 95 write(4, "r\311\0\7\177\377\232\237\264\272e\300\300\240\316\264&\314\301\252*\37\256y\225g\373^\315j\370\350"..., 51574) = 51574 select(5, [], [4], [], {60, 0}) = 1 (out [4], left {59, 999997}) write(4, "\7\320\0\7\177\377\234|\7X\223Y\273\255c\27\25f\306\212\202\214#E\272\212t\1\225A\fU"..., 53259) = 53259 select(5, [], [4], [], {60, 0} The select command times out over and over. I get the same behavior when trying to back up my entire home directory but I don't need the --debug=all flag in that case. - lsb_release -rd Description: Ubuntu 14.04.3 LTS Release: 14.04 apt-cache policy rsync rsync: - Installed: 3.1.0-2ubuntu0.1 - Candidate: 3.1.0-2ubuntu0.1 - Version table: - *** 3.1.0-2ubuntu0.1 0 - 500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 Packages - 500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 Packages - 100 /var/lib/dpkg/status - 3.1.0-2 0 - 500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages + Installed: 3.1.0-2ubuntu0.1 + Candidate: 3.1.0-2ubuntu0.1 + Version table: + *** 3.1.0-2ubuntu0.1 0 + 500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 Packages + 500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 Packages + 100 /var/lib/dpkg/status + 3.1.0-2 0 + 500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: rsync 3.1.0-2ubuntu0.1 ProcVersionSignature: Ubuntu 3.13.0-74.118-generic 3.13.11-ckt30 Uname: Linux 3.13.0-74-generic i686 NonfreeKernelModules: nvidia ApportVersion: 2.14.1-0ubuntu3.19 Architecture: i386 CurrentDesktop: KDE Date: Wed Dec 23 09:44:17 2015 EcryptfsInUse: Yes InstallationDate: Installed on 2010-09-18 (1922 days ago) InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta i386 (20100901.1) SourcePackage: rsync UpgradeStatus: Upgraded to trusty on 2014-12-27 (361 days ago)
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1528921 Title: rsync hangs on select(5, [], [4], [], {60, 0} To manage notifications about this bug go to: https://bugs.launchpad.net/rsync/+bug/1528921/+subscriptions -- ubuntu-bugs mailing list [email protected] https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
