** Description changed:

- In the last few months my home directory backup stopped completing.
- I've been able to reproduce the problem on a single subdirectory
- although I had to add the --debug=all flag to reproduce it on that
- smaller directory.  Specifically, this command never completes:
+ [Impact]
+ 
+ What the user suffering from this bug experiences is that the big amount
+ of informative messages related to the copy process with the three
+ spawned processes(sender, receiver and generator)  exhaust the I/O
+ buffer and the sync gets stuck, either because there are too many files
+ to synchronise and/or because too many detail messages (levels of
+ verbose mode) have been requested in the output.
+ 
+ The fix, that comes from upstream and is applied there since version
+ 3.2.0., increments the size of the receiver's I/O buffer.
+ 
+ 
+ [Test Plan]
+ This test plan is for Focal, but it's the same for Bionic.
+ 
+ 0.Preparing the test environment:
+ 
+ #Preparing the container
+ lxc launch images:ubuntu/focal rsync-iobuffer-focal
+ lxc shell rsync-iobuffer-focal
+ apt update -y
+ apt upgrade -y
+ 
+ #Installing necessary tools
+ apt install rsync 
+ apt install wget
+ 
+ #Get test cases from comments #16 and #19 on this LP bug
+ 
+ cd /tmp/
+ 
+ #19
+ wget 
https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1528921/+attachment/5211950/+files/html2.tgz
+ tar -xvzf /tmp/html2.tgz
+ mkdir /tmp/rsynctest
+ 
+ #16 
+ Paste the contents of https://pastebin.com/raw/ctzJJGwt:
+ 
+ #!/bin/bash
+ mkdir source_dir
+ pushd source_dir
+ dd if=/dev/zero of=source bs=600K count=1
+ 
+ for i in `seq 1 11500`;
+ do
+     cp -v source file_$i;
+ done
+ 
+ rm source
+ 
+ for i in `seq 1 10`;
+ do
+      dd if=/dev/zero of=file_large_$i bs=200M count=1
+ done
+ 
+ popd
+ 
+ echo "Created 11500 files with size 600K and 10 files with size 200M, try the 
following command:"
+ echo "rsync -avvvz --delete source_dir target_dir"
+ 
+ in a new file script_comment16.sh
+  
+ chmod +x script_comment16.sh 
+ ./script_comment16.sh
+ 
+ 
+ 1. Bad cases (without and with using strace):
+ 
+ # Scenario from comment 19 
+ $ rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/
+ (Client) Protocol versions: remote=31, negotiated=31
+ sending incremental file list
+ [sender] change_dir(/tmp)
+ send_files starting
+ server_recv(2) starting pid=49029
+ get_local_name count=7070 /tmp/rsynctest/
+ created directory /tmp/rsynctest
+ [Receiver] change_dir(/tmp/rsynctest)
+ generator starting pid=49029
+ delta-transmission disabled for local transfer or --whole-file
+ recv_generator(html2,1)
+ recv_generator(html2,2)
+ set uid of html2 from 0 to 1000
+ set gid of html2 from 0 to 1000
+ set modtime of html2 to (1447205118) Wed Nov 11 01:25:18 2015
+ [...]
+ sender finished /tmp/html2/annotator__raw_8h__incl.md5
+ send_files(282, /tmp/html2/annotator__raw_8h__incl.png)
+ html2/annotator__raw_8h__incl.png
+ 
+ It hangs here, where using strace we can see:
+ 
+ $ strace rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/
+ [...]
+ read(3, 
"\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\v\4\0\0\2\233\10\6\0\0\0\361\177\254"..., 
262144) = 262144
+ select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 2 (in [5], out [4], left 
{tv_sec=59, tv_usec=999998})
+ read(5, 
"\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) 
= 95
+ write(4, 
"K\374\0\7\177\377\207\343\335\345+{W\335{K\371y\211w`Ysl\336B{\312\340}\320\301"...,
 64591) = 64591
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 1 (out [4], left {tv_sec=59, 
tv_usec=999998})
+ write(4, 
"\336\322\0\7\177\377\255\371\367\215v\321-\224\323+\363\261\243\7\211Do\230\256\257O\372\367:\357O"...,
 53986) = 53986
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}
+ 
+ 
+ # Scenario from comment 16
+ $ rsync -avvvz --delete source_dir target_dir
+ sending incremental file list
+ [sender] make_file(source_dir,*,0)
+ send_file_list done
+ [sender] pushing local filters for /root/source_dir/
+ [sender] make_file(source_dir/file_3048,*,2)
+ [sender] make_file(source_dir/file_11358,*,2)
+ [sender] make_file(source_dir/file_5914,*,2)
+ [sender] make_file(source_dir/file_5880,*,2)
+ [sender] make_file(source_dir/file_9318,*,2)
+ [sender] make_file(source_dir/file_5539,*,2)
+ [...]
+ sending file_sum
+ false_alarms=0 hash_hits=0 matches=0
+ sender finished source_dir/file_10807
+ send_files(903, source_dir/file_10808)
+ send_files mapped source_dir/file_10808 of size 614400
+ calling match_sums source_dir/file_10808
+ source_dir/file_10808
+ 
+ It hangs here, where using strace we can see:
+ 
+ $ strace rsync -avvvz --delete source_dir target_dir 
+ source_dir/file_11280
+ read(3, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) 
= 262144
+ read(3, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) 
= 262144
+ read(3, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 90112) = 
90112
+ select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 1 (in [5], left 
{tv_sec=59, tv_usec=999996})
+ read(5, 
"\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 
1900) = 1900
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout)
+ select(5, [], [4], [], {tv_sec=60, tv_usec=0}
+ 
+ 
+ 1. Good cases:
+ 
+ # Scenario from comment 19
+ 
+ $ rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/
+ (Client) Protocol versions: remote=31, negotiated=31
+ sending incremental file list
+ [sender] change_dir(/tmp)
+ send_files starting
+ server_recv(2) starting pid=25249
+ get_local_name count=7070 /tmp/rsynctest/
+ created directory /tmp/rsynctest
+ [Receiver] change_dir(/tmp/rsynctest)
+ generator starting pid=25249
+ delta-transmission disabled for local transfer or --whole-file
+ recv_generator(html2,1)
+ recv_generator(html2,2)
+ set uid of html2 from 0 to 1000
+ set gid of html2 from 0 to 1000
+ set modtime of html2 to (1447205118) Wed Nov 11 01:25:18 2015
+ recv_generator(html2/ConstellationDisplayPlot_8h.html,3)
+ recv_generator(html2/ConstellationDisplayPlot_8h__dep__incl.map,4)
+ recv_generator(html2/ConstellationDisplayPlot_8h__dep__incl.md5,5)
+ [...]
+ generate_files finished
+ 
+ sent 140,246,278 bytes  received 4,053,586 bytes  15,189,459.37 bytes/sec
+ total size is 252,298,375  speedup is 1.75
+ [sender] _exit_cleanup(code=0, file=main.c, line=1207): about to call exit(0)
+ 
+ 
+ # Scenario from comment 16
+ 
+ $ rsync -avvvz --delete source_dir target_dir
+ sending incremental file list
+ [sender] make_file(source_dir,*,0)
+ send_file_list done
+ [sender] pushing local filters for /tmp/source_dir/
+ [sender] make_file(source_dir/file_3052,*,2)
+ [sender] make_file(source_dir/file_1766,*,2)
+ [sender] make_file(source_dir/file_10466,*,2)
+ [sender] make_file(source_dir/file_9375,*,2)
+ [sender] make_file(source_dir/file_7260,*,2)
+ [sender] make_file(source_dir/file_5554,*,2)
+ [sender] make_file(source_dir/file_5523,*,2)
+ [sender] make_file(source_dir/file_1685,*,2)
+ [sender] make_file(source_dir/file_7217,*,2)
+ [sender] make_file(source_dir/file_10411,*,2)
+ [...]
+ generate_files finished
+ 
+ sent 9,555,678 bytes  received 3,599,560 bytes  124,694.20 bytes/sec
+ total size is 9,162,752,000  speedup is 696.51
+ 
+ 
+ [Where problems could occur]
+ 
+ Perhaps the buffer size may not be sufficient for an operation involving
+ a very huge amount of data, as reported upstream here (
+ https://bugzilla.samba.org/show_bug.cgi?id=11166#c9 ): the reporter
+ asked for a parameter to manually change the size of the receiver's
+ buffer I/O for messaging for those cases, but it may lend itself to a
+ memory overflow more easily even than this solution which also has this
+ risk.
+ 
+ [Other Info]
+  
+ The fix has been picked from upstream here 
https://github.com/WayneD/rsync/commit/01b9bbb0f94e4ddd4b1e3ad678b220ca09bc0ad0 
. I've checked on bugzilla that there aren't bugs related to "out of memory" 
issues after 3.2.0 release.
+ 
+ 
+ [Original Report]
+ -------------------------------------------------------------------
+ In the last few months my home directory backup stopped completing.  I've 
been able to reproduce the problem on a single subdirectory although I had to 
add the --debug=all flag to reproduce it on that smaller directory.  
Specifically, this command never completes:
  
  rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/
  
  The html2 directory is a copy of
  gnuradio-3.7.8.1/build/docs/doxygen/html .
  
  When I strace the command, I see this:
  write(1, "sender finished /tmp/html2/atsc_"..., 58sender finished 
/tmp/html2/atsc__interleaver_8h__incl.md5
  ) = 58
  write(1, "send_files(338, /tmp/html2/atsc_"..., 59send_files(338, 
/tmp/html2/atsc__interleaver_8h__incl.png)
  ) = 59
  open("html2/atsc__interleaver_8h__incl.png", O_RDONLY|O_LARGEFILE) = 3
  fstat64(3, {st_mode=S_IFREG|0664, st_size=264657, ...}) = 0
  write(1, "html2/atsc__interleaver_8h__incl"..., 
37html2/atsc__interleaver_8h__incl.png
  ) = 37
  read(3, 
"\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\n\253\0\0\2\233\10\6\0\0\0h\242\""..., 
262144) = 262144
  select(6, [5], [4], [5], {60, 0})       = 2 (in [5], out [4], left {59, 
999996})
  read(5, 
"\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) 
= 95
  write(4, 
"r\311\0\7\177\377\232\237\264\272e\300\300\240\316\264&\314\301\252*\37\256y\225g\373^\315j\370\350"...,
 51574) = 51574
  select(5, [], [4], [], {60, 0})         = 1 (out [4], left {59, 999997})
  write(4, 
"\7\320\0\7\177\377\234|\7X\223Y\273\255c\27\25f\306\212\202\214#E\272\212t\1\225A\fU"...,
 53259) = 53259
  select(5, [], [4], [], {60, 0}
  
  The select command times out over and over.  I get the same behavior
  when trying to back up my entire home directory but I don't need the
  --debug=all flag in that case.
  
- 
  lsb_release -rd
  Description:    Ubuntu 14.04.3 LTS
  Release:        14.04
  
  apt-cache policy rsync
  rsync:
-   Installed: 3.1.0-2ubuntu0.1
-   Candidate: 3.1.0-2ubuntu0.1
-   Version table:
-  *** 3.1.0-2ubuntu0.1 0
-         500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 
Packages
-         500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 
Packages
-         100 /var/lib/dpkg/status
-      3.1.0-2 0
-         500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages
+   Installed: 3.1.0-2ubuntu0.1
+   Candidate: 3.1.0-2ubuntu0.1
+   Version table:
+  *** 3.1.0-2ubuntu0.1 0
+         500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 
Packages
+         500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 
Packages
+         100 /var/lib/dpkg/status
+      3.1.0-2 0
+         500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages
  
  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: rsync 3.1.0-2ubuntu0.1
  ProcVersionSignature: Ubuntu 3.13.0-74.118-generic 3.13.11-ckt30
  Uname: Linux 3.13.0-74-generic i686
  NonfreeKernelModules: nvidia
  ApportVersion: 2.14.1-0ubuntu3.19
  Architecture: i386
  CurrentDesktop: KDE
  Date: Wed Dec 23 09:44:17 2015
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2010-09-18 (1922 days ago)
  InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta i386 (20100901.1)
  SourcePackage: rsync
  UpgradeStatus: Upgraded to trusty on 2014-12-27 (361 days ago)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1528921

Title:
  rsync hangs on select(5, [], [4], [], {60, 0}

To manage notifications about this bug go to:
https://bugs.launchpad.net/rsync/+bug/1528921/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to