[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

Ben RUBSON changed:

           What    |Removed                     |Added
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #9 from Ben RUBSON ---
Patch moved: https://github.com/WayneD/rsync/pull/4

--
You are receiving this mail because:
You are the QA Contact for the bug.

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [Bug 12819] [PATCH] sync() on receiving side for data consistency
On Fri, 16 Jun 2017 12:34:40 +0200 Ben RUBSON via rsync wrote:

> > On 15 Jun 2017, at 19:29, Karl O. Pinc via rsync
> > wrote:

> > The problem is that the --server (and, especially,
> > --daemon) documentation has gone away. Or at least
> > left the man page. (v3.1.1, Debian 8, Jessie) Except
> > for a hint that --server exists at the bottom.
>
> Are you looking for `man rsyncd.conf` ?

No, that tells me what --daemon does; how to run rsync as a server. It does not tell me how to invoke rsync at the remote end manually without doing server-side things such as the reading of rsyncd.conf.

What I want documented is how to use a customized transport that does not allow the client side to send arbitrary commands to the remote end. The sort of thing done when using ssh with keys and the command= option within an authorized_keys file.

As mentioned, now I use command="rsync --server --daemon ." in my authorized_keys file. I once figured this out from old rsync man pages, but don't see how to glean this command sequence from a more recent man page.

Again, I might (eventually) get around to sending in a man page patch if somebody explains how it's done.

Regards,

Karl
Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: [Bug 12819] [PATCH] sync() on receiving side for data consistency
> On 15 Jun 2017, at 19:29, Karl O. Pinc via rsync wrote:
>
> On Thu, 15 Jun 2017 13:23:44 +
> just subscribed for rsync-qa from bugzilla via rsync wrote:
>
>> https://bugzilla.samba.org/show_bug.cgi?id=12819
>>
>> --- Comment #7 from Ben RUBSON ---
>
>> Note that my patch simply adds a sync() just after recv_files(), so
>> one sync() per connection, not per write operation.
>
>> But we could make this a rsync option, so that one can enable /
>> disable it on its own.
>
> I think the "right" rsync option to add (because rsync does
> not have enough options already ;-) is a --hook-post option.
> It would run something (a `sync` in your case) on the
> remote end after finishing. There are clear security issues
> here.
>
> Rather than having --hook-post and having to do something
> (a server side config option that says what --hook-post
> can do?) to address the security concerns it seems much
> simpler to improve the rsync documentation regarding running
> the rsync server side.

--daemon (if used) already has a post-xfer option, but as explained in the bug report, it could be hard to use when the daemon is chrooted.

> I'm still using command="rsync --server --daemon ." in my
> ~/.ssh/authorized_keys file on the remote end. It'd be simple
> enough to add, say, a "sync" to the end of this to force a sync
> when rsync finishes.

It would however sync() even if the client only read files.

> The problem is that the --server (and, especially,
> --daemon) documentation has gone away. Or at least
> left the man page. (v3.1.1, Debian 8, Jessie) Except
> for a hint that --server exists at the bottom.

Are you looking for `man rsyncd.conf` ?

> If the server side of rsync was better documented then
> perhaps a simple inetd rsync service (or --rsync-path
> or -e value, etc.) would be easy for the end-user to
> cobble together to meet needs such as this.
>
> Can somebody please explain --server? (And --sender, I guess.)
> I might (possibly) be motivated to send in a man page patch.
>
> Regards,
>
> Karl

Thank you for your feedback Karl !

Ben
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #8 from Brian K. White ---
You tell me, what ABOUT a power failure between 2 zfs, or any other fs operations?

This does not improve or solve any problem that the fs and all the other layers aren't already handling.

This is simply a misguided idea, however sensible and attractive it seems.
Re: [Bug 12819] [PATCH] sync() on receiving side for data consistency
On Thu, 15 Jun 2017 13:23:44 +
just subscribed for rsync-qa from bugzilla via rsync wrote:

> https://bugzilla.samba.org/show_bug.cgi?id=12819
>
> --- Comment #7 from Ben RUBSON ---

> Note that my patch simply adds a sync() just after recv_files(), so
> one sync() per connection, not per write operation.

> But we could make this a rsync option, so that one can enable /
> disable it on its own.

I think the "right" rsync option to add (because rsync does not have enough options already ;-) is a --hook-post option. It would run something (a `sync` in your case) on the remote end after finishing. There are clear security issues here.

Rather than having --hook-post and having to do something (a server side config option that says what --hook-post can do?) to address the security concerns it seems much simpler to improve the rsync documentation regarding running the rsync server side.

I'm still using command="rsync --server --daemon ." in my ~/.ssh/authorized_keys file on the remote end. It'd be simple enough to add, say, a "sync" to the end of this to force a sync when rsync finishes.

The problem is that the --server (and, especially, --daemon) documentation has gone away. Or at least left the man page. (v3.1.1, Debian 8, Jessie) Except for a hint that --server exists at the bottom.

If the server side of rsync was better documented then perhaps a simple inetd rsync service (or --rsync-path or -e value, etc.) would be easy for the end-user to cobble together to meet needs such as this.

Can somebody please explain --server? (And --sender, I guess.) I might (possibly) be motivated to send in a man page patch.

Regards,

Karl
Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
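The idea of appending a "sync" to the forced command could be sketched as a small wrapper that the command= entry points at instead of rsync itself. This is only an illustration: the function name `run_then_sync` and the injectable `sync` parameter are hypothetical, the rsync arguments are copied from the authorized_keys line quoted in the message, and `os.sync()` is assumed to be available (Unix only).

```python
import os
import subprocess
from typing import Callable, Optional, Sequence

# Arguments copied from the authorized_keys entry quoted in the message.
RSYNC_SERVER_CMD = ("rsync", "--server", "--daemon", ".")


def run_then_sync(cmd: Sequence[str] = RSYNC_SERVER_CMD,
                  sync: Optional[Callable[[], None]] = None) -> int:
    """Run cmd with inherited stdin/stdout, then flush filesystem buffers once.

    `sync` is a hypothetical hook so the flush can be replaced in tests;
    by default the system-wide os.sync() is used.
    """
    # The child inherits the ssh session's stdin/stdout, so the rsync
    # protocol flows through unchanged.
    status = subprocess.run(list(cmd)).returncode
    # One flush after the transfer ends, whatever its direction or outcome.
    (sync or os.sync)()
    return status
```

A wrapper script would finish with `sys.exit(run_then_sync())` so the client still sees rsync's exit status. Note that this flushes after every session, including sessions in which the client only read files.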
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #7 from Ben RUBSON ---
And what about a power failure between 2 ZFS transaction groups ?

Note that my patch simply adds a sync() just after recv_files(), so one sync() per connection, not per write operation. Quite a low workload actually :)

But we could make this an rsync option, so that one can enable / disable it on its own.
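The shape of "one sync() per connection, not per write operation" can be illustrated roughly as follows. This is not the patch itself (rsync is written in C); `receive_files` is a hypothetical stand-in for recv_files(), and `receive_then_sync` and its `sync` parameter are likewise made up for the sketch. `os.sync()` is assumed to be available (Unix only).

```python
import os
from typing import Callable, Dict, Optional


def receive_files(dest_dir: str, files: Dict[str, bytes]) -> int:
    """Hypothetical stand-in for recv_files(): plain writes, no flushing."""
    count = 0
    for name, data in files.items():
        with open(os.path.join(dest_dir, name), "wb") as f:
            f.write(data)
        count += 1
    return count


def receive_then_sync(dest_dir: str, files: Dict[str, bytes],
                      sync: Optional[Callable[[], None]] = None) -> int:
    """Receive every file, then issue a single flush for the whole connection."""
    count = receive_files(dest_dir, files)
    # One system-wide flush after all files have been written,
    # rather than one per write or per file.
    (sync or os.sync)()
    return count
```

Passing a counting function as `sync` shows that the flush happens exactly once regardless of how many files were received.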
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #6 from Brian K. White ---
Think of it this way: write() already makes a certain promise that it will not return until it's done its job, and it will not assert success when it can't. Essentially the man page for any syscall is a contract. In fact all APIs are contracts.

write() in turn relies on various other calls to even lower layers to keep their promises too, to manage the in-kernel buffer or the cache on a raid card etc.

All of these things MUST be relied on rather than second-guessed.

It would be insane, for example, for write() to say "I can't really be sure this disk driver has really done its thing. I better force it to sync before I return to the application." or "I can't really be sure malloc() really allocated the memory, I better malloc 3 or 4 copies and compare them and use whichever copies agree with each other..."

It's insane. You write(), you check the return value, and you're done. The low level hardware is someone else's job, and you won't be doing a better job than they already did.
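"You write(), you check the return value, and you're done" looks roughly like this in practice. The helper name `write_all` is made up for the illustration; `os.write` is the real Python wrapper around write(2), which returns the number of bytes accepted and raises OSError on failure.

```python
import os


def write_all(fd: int, data: bytes) -> int:
    """Write all of data to fd, retrying after short writes.

    os.write raises OSError if the kernel reports a failure, so reaching
    the return statement means every byte was accepted.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # The return value is the contract: it says how many bytes were taken.
        total += os.write(fd, view[total:])
    return total
```

Nothing here forces data to the platters; it only honours the return value, which is the point being made in this comment.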
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #5 from Brian K. White ---
Any program could make this same "just to be safe" argument practically every time they ever close-on-write for any reason. If they wrote anything, it was always for some reason, and they want to know for sure that it really got safely written. There is nothing special about rsync in that regard. cp might as well have it. The ">" operator in bash might as well have it.

The kernel and vfs and hardware drivers all already do whatever is necessary in that regard, and it's generally wrong for any application to try to do it itself. Otherwise the disk would be in a constant state of sync()'ing and never actually manage to get any other work done. Consider a multiuser host with 500 rsync receivers. Each individual sync() is incredibly disruptive to all other processes. "Everyone hold up while we flush the disk buffer...". The entire system waits while that happens.

That way just leads to things like the example you just used: lower layers that just start lying about sync() to upper layers because too many apps use it when they shouldn't. "Fine, if apps are going to sync all the time, that ends up being 86 times a second between all procs running at any given moment, which is unsupportable, so we'll just make sync() a no-op stub and we'll do it when it's actually required, and apps can sync()-away to their hearts' content".

I think the only reason rsync might have to sync is if you built rsync as a self-contained bootable executable like memtest86, or possibly as an MS-DOS executable.
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #4 from Ben RUBSON ---
Yes Paul, I thought about it, but the sync command may not be available if the server (receiver) is chrooted (for example using the patch proposed in #12817).
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #3 from Paul Slootman ---
How about just using a post-xfer command on the server side that does 'sync'?
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #2 from Ben RUBSON ---
Thank you for your feedback Brian.

I don't have any problem. I just want to be sure that when the client (sender) has finished its transfer, its data is on the server's (receiver's) disks before it disconnects, so that when it disconnects successfully, its data is for sure on disk. On disk means on platters, so that if there is a failure (hardware, power...), the data is safe, not lost.

Of course disks which do not lie about the sync() command must be used (data must be on platters, not only in the disks' cache). As well as a robust filesystem, some redundancy... (but here that's off-topic).

Perhaps we could make it an option, so that those whose OS fails to manage write buffers would not be degraded even more... But certainly they should have a look at their performance issue first.
[Bug 12819] [PATCH] sync() on receiving side for data consistency
https://bugzilla.samba.org/show_bug.cgi?id=12819

--- Comment #1 from Brian K. White ---
This seems wrong to me. If the OS is failing to manage write buffers and file access between processes, you would have a lot bigger problems in every process all through the system, and this wouldn't fix it.

Similarly, if rsync were corrupting data, a lot of people would already know about it. It gets used way too much and too heavily for anything like this to go unnoticed for more than a day, let alone 15 or more years.

It's almost axiomatic: No matter what problem you think you have, no matter what language or OS or platform, if you think it's fixed by either sleep() or sync(), it's not.