[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #28 from Dave Gordon--- (In reply to Carson Gaspar from comment #27) Hmm? If you're referring to line 810 of io.c, which is the only write(2) call I can see in perform_io(), in the current HEAD it looks like this: 810 if ((n = write(iobuf.out_fd, out->buf + out->pos, len)) <= 0) { ... 822 } ... 835 if ((out->pos += n) == out->size) { ... so that while the immediate test is only for <= 0, the actual result is stored in n and subsequently used to adjust counters, pointers, etc. I don't think there's a bug there. BTW, AFAICT perform_io() is only used for IPC. The received file is written out by write_file() in fileio.c: 140 /* write_file does not allow incomplete writes. It loops internally 141 * until len bytes are written or errno is set. Note that use_seek and 142 * offset are only used in sparse processing (see write_sparse()). */ 143 int write_file(int f, int use_seek, OFF_T offset, const char *buf, int len) .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #27 from Carson Gaspar--- (In reply to Dave Gordon from comment #23) Reading this, I took a look at the rsync sources, and, indeed, rsync has a bug. perform_io() does not correctly check the return code from write(). safe_write() does - please steal the correct logic from it. perform_io() only checks for write returning <=0, not for write returning something != len. So in the case of a partial write() it will corrupt the destination file. I haven't looked for additional coding errors of this type. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #26 from Rui DeSousa--- (In reply to Ben RUBSON from comment #25) That is awesome. Thanks you for all of your efforts! -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #24 from Ben RUBSON--- ZFS only shares between files with dedup on. So first rsync / diff succeeded, second gave same result as before : # rsync -a --inplace $tmpfs/f1 $f/f3 ; echo $? ; diff $tmpfs/f1 $f/f3 0 Files /mnt/f1 and /test/f3 differ # ls -l $f total 1048593 -rw--- 1 root wheel 629145600 6 Mar 09:38 f2 -rw--- 1 root wheel 629145600 6 Mar 09:40 f3 # du -sm $f/* 601/test/f2 424/test/f3 -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #23 from Dave Gordon--- Looks like this ZFS problem could be a FreeBSD-specific issue; one of the commits mentioned in this FreeNAS bug report has the subject zfs_write: fix problem with writes appearing to succeed when over quota See https://redmine.ixsystems.com/issues/26650 FreeNAS Bug #26650: Correctly report ZFS dataset quota overflows Updated by Alexander Motin 4 months ago Investigation of the problem brought me to FreeBSD-specific change r298105 by avg@ on 2016-04-16. If quota overflow detected during write, the write will fail, but the error status can be lost, falsely reporting partial completion. As result written data are flowing to nowhere and indefinitely, as fast as CPU can handle the loop. HTH, .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #22 from Dave Gordon--- (In reply to Ben RUBSON from comment #19) Just to be totally certain about what ZFS may or may not share between files, could you try this variant of your testcase: # zfs destroy $z # zfs create $z # zfs set compression=off $z # zfs set dedup=off $z # zfs set quota=1G $z Then with "tmpfs" referring to any non-ZFS filesystem: # dd if=/dev/random bs=1M count=600 of=$tmpfs/f1 # rsync -a --inplace $tmpfs/f1 $f/f2 ; echo $? ; diff $tmpfs/f1 $f/f2 # dd if=/dev/random bs=1M count=600 of=$tmpfs/f1 # rsync -a --inplace $tmpfs/f1 $f/f3 ; echo $? ; diff $tmpfs/f1 $f/f3 The first rsync should succeed, and the files should be identical. The second should fail, but if it doesn't then at least it can't be down to ZFS finding matching datablocks. .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #21 from Dave Gordon--- Created attachment 14019 --> https://bugzilla.samba.org/attachment.cgi?id=14019=edit Test patch to see whether fdatasync() or fsync() detects a write failure -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #20 from Dave Gordon--- So, looking like a ZFS issue but triggered in some way by the specific behaviour of rsync (e.g. writing a certain block size/pattern causes the quota error to be lost). The truss listing from a failing case should show exactly what operations were performed on the destination filesystem; maybe someone from the ZFS team could see whether there's some pathological case that's not handled properly. .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #19 from Ben RUBSON--- I managed to reproduce the issue on 11.0-RELEASE-p16. Below a simple test case, without compression, without deduplication. Note that issue is reproductible with quota, but not with userquota. # f=/test # z=zroot/ROOT/default$f ### quota / refquota (same results) # zfs destroy $z # zfs create $z # zfs set compression=off $z # zfs set dedup=off $z # zfs set quota=1G $z # dd if=/dev/random bs=1M count=600 of=$f/f1 # rsync -a --inplace $f/f1 $f/f2 ; echo $? ; diff $f/f1 $f/f2 ; rm $f/f2 0 Files /test/f1 and /test/f2 differ # rsync -a $f/f1 $f/f2 ; echo $? ; rm $f/f2 rsync: rename "/test/.f2.6NVNwD" -> "f2": Disc quota exceeded (69) rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1192) [sender=3.1.3] 23 // the quota error should have triggered on write instead // ### userquota # zfs destroy $z # zfs create $z # zfs set compression=off $z # zfs set dedup=off $z # zfs set userquota@root=1G $z # dd if=/dev/random bs=1M count=600 of=$f/f1 # rsync -a --inplace $f/f1 $f/f2 ; echo $? ; diff $f/f1 $f/f2 ; rm $f/f2 rsync: [sender] write error: Broken pipe (32) rsync: write failed on "/test/f2": Disc quota exceeded (69) rsync error: error in file IO (code 11) at receiver.c(404) [receiver=3.1.3] 11 Files /test/f1 and /test/f2 differ # rsync -a $f/f1 $f/f2 ; echo $? ; rm $f/f2 rsync: [sender] write error: Broken pipe (32) rsync: write failed on "/test/f2": Disc quota exceeded (69) rsync error: error in file IO (code 11) at receiver.c(404) [receiver=3.1.3] 11 # dd if=/dev/random bs=1M count=535 of=$f/f1 # rsync -a --inplace $f/f1 $f/f2 ; echo $? ; diff $f/f1 $f/f2 0 # zfs get -Hp -o value used,userquota@root $z 1123188736 1073741824 # touch $f/file touch: /test/file: Disc quota exceeded -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #18 from Rui DeSousa--- I also wrote a little util as well; I get the correct error in write spot. [postgres@hades ~]$ cat 0001005E0017 | ./fwrite/fwrite arch/0001005E0017 fwrite: write: Disc quota exceeded [postgres@hades ~]$ echo $? 1 [postgres@hades ~]$ du -h arch/0001005E0017.GghJVR 1.9March/0001005E0017.GghJVR Here's the code: #include #include #include #include #include #include #include #defineBUFSIZE 131072 int main(int argc, char *argv[]) { int fd, r, w; char *buf; char *name; if (argc != 2) { fprintf(stderr, "usage: fwrite [file]\n"); exit(1); } if ((buf = malloc(BUFSIZE)) == NULL) err(1, "malloc"); ++argv; if ((name = (char *) malloc(strlen(*argv) + 10)) == NULL) err(1, "malloc"); strcat(strcpy(name, *argv), ".XX"); if ((fd = mkstemp(name)) < 0) err(1, "mkstemp"); while ((r = read(STDIN_FILENO, buf, BUFSIZE)) > 0) if ((w = write(fd, buf, r)) == -1) err(1, "write"); if (r < 0) err(1, "read"); if (fsync(fd) != 0) err(1, "fsync"); if (close(fd) != 0) err(1, "close"); if (rename(name, *argv) != 0) err(1, "rename"); free(name); exit(0); } -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #17 from Rui DeSousa--- (In reply to Dave Gordon from comment #14) Here's the output you requested. ZFS would use the same block even if it's the same data as don't have dedup enabled. [postgres@hades ~]$ ls arch/ dbc1 [postgres@hades ~]$ du -h 0001005E0017 ; df -h arch/ 19M0001005E0017 Filesystem SizeUsed Avail Capacity Mounted on hydra/home/postgres/arch977M975M1.9M 100% /usr/home/postgres/arch [postgres@hades ~]$ rsync -avi --inplace --debug=deltasum2,recv2,io2 0001005E0017 arch/0001005E0017 [Receiver] safe_read(0)=4 [sender] safe_read(5)=4 [sender] safe_read(5)=1 [sender] safe_read(5)=4 sending incremental file list server_recv(2) starting pid=64788 get_local_name count=1 arch/0001005E0017 send_files mapped 0001005E0017 of size 67108864 calling match_sums 0001005E0017 >f+ 0001005E0017 sending file_sum false_alarms=0 hash_hits=0 matches=0 recv_files(1) starting recv_files(0001005E0017) got file_sum finishing 0001005E0017 [receiver] send_msg_int(100, 1) recv_files phase=1 total: matches=0 hash_hits=0 false_alarms=0 data=67108864 recv_files phase=2 recv_files finished [receiver] send_msg(10, 8) sent 67,125,370 bytes received 313 bytes 2,200,842.07 bytes/sec total size is 67,108,864 speedup is 1.00 [postgres@hades ~]$ echo $? 0 [postgres@hades ~]$ du -h arch/0001005E0017; md5 0001005E0017 arch/0001005E0017 1.9March/0001005E0017 MD5 (0001005E0017) = b607d345c5527152450c02c23d778cf2 MD5 (arch/0001005E0017) = f90772e440b04b63490a88e7dafeac84 [postgres@hades ~]$ touch 0001005E0017 [postgres@hades ~]$ rsync -avic --inplace --debug=deltasum3,recv2,io2 0001005E0017 arch/0001005E0017 [Receiver] safe_read(0)=4 [sender] safe_read(5)=4 [sender] safe_read(5)=1 [sender] safe_read(5)=4 sending incremental file list server_recv(2) starting pid=67722 get_local_name count=1 arch/0001005E0017 count=0 n=0 rem=0 send_files mapped 0001005E0017 of size 67108864 calling match_sums 0001005E0017 >fc.t.. 0001005E0017 recv_files(1) starting recv_files(0001005E0017) recv mapped 0001005E0017 of size 67108864 data recv 32768 at 0 data recv 32768 at 32768 data recv 32768 at 65536 data recv 32768 at 98304 data recv 32768 at 131072 data recv 32768 at 163840 data recv 32768 at 196608 data recv 32768 at 229376 data recv 32768 at 262144 data recv 32768 at 294912 data recv 32768 at 327680 data recv 32768 at 360448 data recv 32768 at 393216 data recv 32768 at 425984 data recv 32768 at 458752 data recv 32768 at 491520 data recv 32768 at 524288 data recv 32768 at 557056 data recv 32768 at 589824 data recv 32768 at 622592 data recv 32768 at 655360 data recv 32768 at 688128 data recv 32768 at 720896 data recv 32768 at 753664 data recv 32768 at 786432 data recv 32768 at 819200 data recv 32768 at 851968 data recv 32768 at 884736 data recv 32768 at 917504 data recv 32768 at 950272 data recv 32768 at 983040 data recv 32768 at 1015808 data recv 32768 at 1048576 data recv 32768 at 1081344 data recv 32768 at 1114112 data recv 32768 at 1146880 data recv 32768 at 1179648 data recv 32768 at 1212416 data recv 32768 at 1245184 data recv 32768 at 1277952 data recv 32768 at 1310720 data recv 32768 at 1343488 data recv 32768 at 1376256 data recv 32768 at 1409024 data recv 32768 at 1441792 data recv 32768 at 1474560 data recv 32768 at 1507328 data recv 32768 at 1540096 data recv 32768 at 1572864 data recv 32768 at 1605632 data recv 32768 at 1638400 data recv 32768 at 1671168 data recv 32768 at 1703936 data recv 32768 at 1736704 data recv 32768 at 1769472 data recv 32768 at 1802240 data recv 32768 at 1835008 data recv 32768 at 1867776 data recv 32768 at 1900544 data recv 32768 at 1933312 data recv 32768 at 1966080 data recv 32768 at 1998848 data recv 32768 at 2031616 data recv 32768 at 2064384 data recv 32768 at 2097152 data recv 32768 at 2129920 data recv 32768 at 2162688 data recv 32768 at 2195456 data recv 32768 at 2228224 data recv 32768 at 2260992 data recv 32768 at 2293760 data recv 32768 at 2326528 data recv 32768 at 2359296 data recv 32768 at 2392064 data recv 32768 at 2424832 data recv 32768 at 2457600 data recv 32768 at 2490368 data recv 32768 at 2523136 data recv 32768 at 2555904 data recv 32768 at 2588672 data recv 32768 at 2621440 data recv 32768 at 2654208 data recv 32768 at 2686976 data recv 32768 at 2719744 data recv 32768 at 2752512 data recv 32768 at 2785280 data recv 32768 at 2818048 data recv 32768 at 2850816 data recv 32768 at 2883584 data recv 32768 at 2916352 data recv 32768 at 2949120 data recv 32768 at 2981888 data recv 32768 at 3014656 data recv 32768 at 3047424 data recv 32768 at
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #16 from Rui DeSousa--- (In reply to Ben RUBSON from comment #15) I just set the quota property. NAME PROPERTY VALUE SOURCE hydra/home/postgres/arch quota 1Glocal hydra/home/postgres/arch reservationnone default hydra/home/postgres/arch refquota none default hydra/home/postgres/arch compressionlz4 inherited from hydra/home/postgres hydra/home/postgres/arch compressratio 3.66x - -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #15 from Ben RUBSON--- Rui just to be sure, which type of ZFS quota are you using ? quota ? refquota ? userquota ? -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #14 from Dave Gordon--- (In reply to Rui DeSousa from comment #6) So you now have an example which reliably produces a bad outcome (corrupted file)? With the combination of (a) insufficient space before copying, and (b) --inplace so that no rename is needed you get no error from rsync but the file uses far less space than it should? And the contents are not as expected? Perhaps you could try this sequence: $ src=0001005E0017 $ dst=arch/0001005E0017 $ rm $dst $ rsync -avi --inplace --debug=deltasum2,recv2,io2 $src $dst $ touch $src $ rsync -avic --inplace --debug=deltasum3,recv2,io2 $src $dst $ md5sum $src $dst and if the md5sum shows a discrepancy then also capture a truss listing of all the syscalls made by the first run of rsync (where the destination does not exist). Rationale: I'm wondering whether ZFS if able to detect that the destination file is identical to the source file (even though it's on a different logical filesystem, it's probably still in the same pool) and merge the disk blocks (retrospectively?) So we start with a file whose logical size is 64M, but which is compressed to 19M on disk. ZFS uncompresses it on the fly and delivers 64M of data to the first rsync. rsync sequentially writes 64M, checking the success of each write. The last write should end at an offset of 64M, then the destination file is closed (and the return from that is checked). The truss listing will show how many writes were made, and whether any of them failed. ZFS will recompress (and dedup?) the written data, resulting in an unknown amount of new space being used. The second rsync will read both the source and destination files -- presumably both now being decompressed on demand -- and compare checksums. Any mismatch here will result in (parts of) the file being retransferred; obviously, this shouldn't happen if the first transfer succeeded. md5sum will check whether the files really do match. HTH, .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #13 from Rui DeSousa--- (In reply to Rui DeSousa from comment #12) Running truss on the --sparse option does show the error being is returned during the write call. [postgres@hades ~]$ truss -f -o sparse.log rsync -av --sparse 0001005E0017 arch/0001005E0017 sending incremental file list 0001005E0017 rsync: write failed on "/usr/home/postgres/arch/0001005E0017": Disc quota exceeded (69) rsync error: error in file IO (code 11) at receiver.c(400) [receiver=3.1.2] rsync: [sender] write error: Broken pipe (32) [postgres@hades ~]$ grep 68408 sparse.log |grep ERR 68408: openat(AT_FDCWD,"0001005E0017",O_RDONLY,00) ERR#2 'No such file or directory' 68408: write(1,"\M^W\M-P\^E\0\^A\0\0\0\0\0\^P\\^"...,1024) ERR#69 'Disc quota exceeded' 68408: stat("/usr/share/nls/C/libc.cat",0x7fff9dd8) ERR#2 'No such file or directory' 68408: stat("/usr/share/nls/libc/C",0x7fff9dd8) ERR#2 'No such file or directory' 68408: stat("/usr/local/share/nls/C/libc.cat",0x7fff9dd8) ERR#2 'No such file or directory' 68408: stat("/usr/local/share/nls/libc/C",0x7fff9dd8) ERR#2 'No such file or directory' -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #12 from Rui DeSousa--- (In reply to Dave Gordon from comment #10) The sparse option errors out :). [postgres@hades ~]$ rsync -av --sparse 0001005E0017 arch/0001005E0017 sending incremental file list 0001005E0017 rsync: write failed on "/usr/home/postgres/arch/0001005E0017": Disc quota exceeded (69) rsync error: error in file IO (code 11) at receiver.c(400) [receiver=3.1.2] rsync: [sender] write error: Broken pipe (32) [postgres@hades ~]$ rsync -av --preallocate 0001005E0017 arch/0001005E0017 preallocation is not supported on this Server rsync error: syntax or usage error (code 1) at compat.c(205) [Receiver=3.1.2] rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(226) [sender=3.1.2] -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #11 from Rui DeSousa--- (In reply to Dave Gordon from comment #7) This is the result of hard link on the temp file where the rename failed. root@hades:~postgres/arch # ls -lh rsync.temp ; du -h rsync.temp -rw--- 1 postgres postgres 1.1M Mar 5 16:02 rsync.temp 329Krsync.temp The file system has LZ4 compression; thus the file is actually 19MB on disk. root@hades:~postgres # ls -lh 0001005E0017 -rw--- 1 postgres postgres64M Mar 5 16:02 0001005E0017 root@hades:~postgres # du -h 0001005E0017 19M0001005E0017 Since it's using compression; I thought /dev/zero would be a bad option to test with. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #10 from Dave Gordon--- BTW, have you tried *either* --sparse *or* --preallocate (but not both together, please, as that will trigger bug 13320 - https://bugzilla.samba.org/show_bug.cgi?id=13320) Does you get the same problem (i.e. file corruption with no reported error) with each of these, or an error, or some different type of misbehaviour? .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #9 from Dave Gordon--- (In reply to Rui DeSousa from comment #6) In your example: $ rsync -av --inplace 0001005E0017 arch/0001005E0017 sending incremental file list 0001005E0017 sent 67,125,370 bytes received 35 bytes 8,950,054.00 bytes/sec total size is 67,108,864 speedup is 1.00 $ ls -lh arch/0001005E0017 -rw--- 1 postgres postgres64M Mar 5 16:02 arch/0001005E0017 $ du -h arch/0001005E0017 362Karch/0001005E0017 how much of the source file is non-sparse? 'Cos ZFS can "sparsify" a file if it detects that you've got big chunks of zeros. For that matter, if you've got dedup enabled, it should be able to detect any repeated pattern that's block-sized and -aligned. That might let you create files that go over quota, since they're not really using very much space. In your dd(1) example, do you get a different result if you source /dev/zero rather than /dev/(u)random? .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #8 from Ben RUBSON--- (In reply to Dave Gordon from comment #3) > ZFS probably notices the quota problem somewhere between (b) and (c), drops > the excess data, and returns EDQUOT to the close(2) call. (In reply to Dave Gordon from comment #7) > So if ZFS did return a delayed error at that point, it would be detected and > cause the transfer to fail. I confirm ZFS does not return delayed errors. userquota are delayed, so user may store more than the userquota, however what is stored is stored. The other types of ZFS quotas, quota and refquota, are not delayed and ZFS extremely takes care to the last bytes stored around the limit to return EDQUOT after the very last allowed bytes (throughput is then here terribly slow). -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [Bug 13317] rsync returns success when target filesystem is full
On Mon, Mar 5, 2018 at 3:09 PM, just subscribed for rsync-qa from bugzilla via rsyncwrote: > https://bugzilla.samba.org/show_bug.cgi?id=13317 > > --- Comment #6 from Rui DeSousa --- > (In reply to Rui DeSousa from comment #5) > > It looks like no error is returned and result is a sparse file. I think a > sync() would be required otherwise the file is truncated on close to meet the > quota. IINM, to create a sparse file, you have to seek past the concrete part of a file (if any), and write something. I'm not sure that a sync would change whether you've gone over your quota. At least, if I were designing a quota system, I'd make it depend on what you've written to the buffer cache and disk, not disk alone. Here's an example of creating a sparse file: $ /usr/local/pypy3-5.10.0/bin/pypy3 below cmd output started 2018 Mon Mar 05 04:01:55 PM PST Python 3.5.3 (09f9160b643e, Dec 22 2017, 10:10:27) [PyPy 5.10.0 with GCC 6.2.0 20160901] on linux Type "help", "copyright", "credits" or "license" for more information. file_ = open('/tmp/sparse-file', 'wb') file_.seek(1024*1024*8) 8388608 file_.write(b'a') 1 file_.close() above cmd output done2018 Mon Mar 05 04:02:14 PM PST dstromberg@zareason2:~ x86_64-unknown-linux-gnu 4805 $ ls -l /tmp/sparse-file below cmd output started 2018 Mon Mar 05 04:02:18 PM PST -rw-rw-r-- 1 dstromberg dstromberg 8388609 Mar 5 16:02 /tmp/sparse-file above cmd output done2018 Mon Mar 05 04:02:18 PM PST dstromberg@zareason2:~ x86_64-unknown-linux-gnu 4805 $ du -sh /tmp/sparse-file below cmd output started 2018 Mon Mar 05 04:02:22 PM PST 4.0K/tmp/sparse-file -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #7 from Dave Gordon--- (In reply to Rui DeSousa from comment #5) That was a run where the rename failed. Do you know whether the temporary file was truncated or corrupted in that scenario? [HINT: one can stop the rsync with a signal, find the temporary file, create a hardlink to it, and then let rsync resume; in the event of an error, the temporary won't disappear because of the hardlink, and then one can examine it to see what state it was in when rsync quit]. Another thing to try would be adding the options --itemize-changes and --debug=deltasum3,recv2,io2 BTW I checked whether the return code from close() is verified, and, while there are many close() calls whose result is ignored, the specific one at the end of a file transfer *is* checked. So if ZFS did return a delayed error at that point, it would be detected and cause the transfer to fail. .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #6 from Rui DeSousa--- (In reply to Rui DeSousa from comment #5) It looks like no error is returned and result is a sparse file. I think a sync() would be required otherwise the file is truncated on close to meet the quota. [postgres@hades ~]$ df -h arch Filesystem SizeUsed Avail Capacity Mounted on hydra/home/postgres/arch1.0G1.0G316K 100% /usr/home/postgres/arch [postgres@hades ~]$ rsync -av --inplace 0001005E0017 arch/0001005E0017 sending incremental file list 0001005E0017 sent 67,125,370 bytes received 35 bytes 8,950,054.00 bytes/sec total size is 67,108,864 speedup is 1.00 [postgres@hades ~]$ ls -lh arch/0001005E0017 -rw--- 1 postgres postgres64M Mar 5 16:02 arch/0001005E0017 [postgres@hades ~]$ du -h arch/0001005E0017 362Karch/0001005E0017 Here's example of dd exceeding the quota; however, it ends up with a truncated file instead of it being sparse. [postgres@hades ~/arch]$ dd if=/dev/random of=test.dmp bs=256k ^C3080+0 records in 2+6156 records out 807403520 bytes transferred in 10.843296 secs (74461081 bytes/sec) [postgres@hades ~/arch]$ ls -lh test.dmp -rw-r--r-- 1 postgres postgres 384K Mar 5 18:05 test.dmp [postgres@hades ~/arch]$ du -h test.dmp 393Ktest.dmp -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #5 from Rui DeSousa--- (In reply to Dave Gordon from comment #4) Hi Dave, I'm not seeing any errors on the write calls. Would an fsync() be required to force the error? [postgres@hades ~]$ grep ERR rsync.test.log 52419: lstat("/usr/local/etc/libmap.d",0x7fffc728) ERR#2 'No such file or directory' 52419: access("/lib/libiconv.so.2",F_OK) ERR#2 'No such file or directory' 52419: access("/usr/lib/libiconv.so.2",F_OK) ERR#2 'No such file or directory' 52419: access("/usr/lib/compat/libiconv.so.2",F_OK) ERR#2 'No such file or directory' 52419: readlink("/etc/malloc.conf",0x7fffd7e0,1024) ERR#2 'No such file or directory' 52419: openat(AT_FDCWD,"/etc/popt",O_RDONLY,00) ERR#2 'No such file or directory' 52419: openat(AT_FDCWD,"/usr/home/postgres/.popt",O_RDONLY,00) ERR#2 'No such file or directory' 52881: stat("arch/0001005E0017",0x7fffca10) ERR#2 'No such file or directory' 52881: __acl_get_file(0x455d12,0x3,0x8012ca000) ERR#22 'Invalid argument' 52881: lstat("0001005E0017",0x7fffbc60) ERR#2 'No such file or directory' 52987: openat(AT_FDCWD,"0001005E0017",O_RDONLY,00) ERR#2 'No such file or directory' 52987: __acl_get_file(0x455d12,0x3,0x8012ca000) ERR#22 'Invalid argument' 52987: rename(".0001005E0017.kmkocG","0001005E0017") ERR#69 'Disc quota exceeded' 52987: stat("/usr/share/nls/C/libc.cat",0x7fff9e28) ERR#2 'No such file or directory' 52987: stat("/usr/share/nls/libc/C",0x7fff9e28) ERR#2 'No such file or directory' 52987: stat("/usr/local/share/nls/C/libc.cat",0x7fff9e28) ERR#2 'No such file or directory' 52987: stat("/usr/local/share/nls/libc/C",0x7fff9e28) ERR#2 'No such file or directory' 52987: select(1,{ 0 },{ },{ 0 },{ 60.00 }) ERR#4 'Interrupted system call' 52881: nanosleep({ 0.02000 }) ERR#4 'Interrupted system call' 52881: wait4(-1,0x7fffc3d4,WNOHANG,0x0) ERR#10 'No child processes' 52881: sigreturn(0x7fffc400) ERR#4 'Interrupted system call' 52881: wait4(52987,0x7fffc9d4,WNOHANG,0x0) ERR#10 'No child processes' 52419: nanosleep({ 0.02000 }) ERR#4 'Interrupted system call' 52419: wait4(-1,0x7fffc4d4,WNOHANG,0x0) ERR#10 'No child processes' 52419: sigreturn(0x7fffc500) ERR#4 'Interrupted system call' 52419: wait4(52881,0x7fffcaf4,WNOHANG,0x0) ERR#10 'No child processes' 52419: wait4(52881,0x7fffcaf4,WNOHANG,0x0) ERR#10 'No child processes' -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #4 from Dave Gordon--- To see whether rsync is getting any errors reported by any system calls it makes, one could run it under strace(1) on Linux, or dtrace on Solaris. Presumably FreeBSD has at least one of these, or something similar? Linux: $ strace -ff -o rsync-trace -e trace=file,desc rsync ... $ grep -w 'E[[:upper:][:digit:]]*' rsync-trace.* access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) open("/usr/etc/popt", O_RDONLY) = -1 ENOENT (No such file or directory) open("/etc/popt", O_RDONLY) = -1 ENOENT (No such file or directory) stat("/etc/popt.d", 0x7ffdafe82860) = -1 ENOENT (No such file or directory) The output is a list of all filename- or descriptor-related syscalls that failed. If any of them show up as EDQUOT or relate to the problematic output file, that might be a big clue :) HTH, .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #3 from Dave Gordon--- (In reply to Rui DeSousa from comment #2) Here's a snippet from Oracle's ZFS help ... https://docs.oracle.com/cd/E19253-01/819-5461/gitfx/index.html "Enforcement of user and group quotas might be delayed by several seconds. This delay means that users might exceed their quota before the system notices that they are over quota and refuses additional writes with the EDQUOT error message." So, rsync(1) might (a) successfully write a bunch of data that falls just short of the quota, (b) "successfully" (i.e. with no reported error) write some more that takes it over quota (c) close the file (ignoring any result) (d) rename the file -- which may or may not work. ZFS probably notices the quota problem somewhere between (b) and (c), drops the excess data, and returns EDQUOT to the close(2) call. The success of the rename depends only on whether there is sufficient free space at that instant; any previous failure in writing the file won't affect the rename directly. Rename may not always need any extra space anyway. BTW: apropos your comment about "cat" in thread https://www.postgresql.org/message-id/E6AB850C-D05E-405B-8D4E-DE18E128F402%40icloud.com it's not the cat that makes this reliable, it's the ssh! :) HTH, .Dave. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #2 from Rui DeSousa--- (In reply to Kevin Korb from comment #1) I saying that in some cases the rename does not fail; and in those cases it returns success despite there not being enough space to store the original file. It looks like it should have error before the rename occurs but does not; thus leaving a truncated or sparsely populated file. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317 --- Comment #1 from Kevin Korb--- Are you saying that it is exiting with an exit code of 0 after outputting that error or that sometimes in the same condition it shows no error and exits code 0? Either way it would probably be helpful to see an output with -v and -i. Also, in your link, that isn't what -c does. It does not verify that the file was transferred correctly. -- You are receiving this mail because: You are the QA Contact for the bug. -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html