On Mon, 27 Mar 2017 08:25:41 -0700, Gregory Szorc wrote:
> > On Mar 27, 2017, at 08:13, Yuya Nishihara <y...@tcha.org> wrote:
> >> On Sun, 26 Mar 2017 11:56:59 -0700, Gregory Szorc wrote:
> >> # HG changeset patch
> >> # User Gregory Szorc <gregory.sz...@gmail.com>
> >> # Date 1490554582 25200
> >> #      Sun Mar 26 11:56:22 2017 -0700
> >> # Branch stable
> >> # Node ID 414128ddd876eb33f70b0dd95d110e29a9308c93
> >> # Parent  ed5b25874d998ababb181a939dd37a16ea644435
> >> changegroup: retry read() when partial data returned
[...]

> >>  def readexactly(stream, n):
> >>      '''read n bytes from stream.read and abort if less was available'''
> >> -    s = stream.read(n)
> >> -    if len(s) < n:
> >> +    # Most of the time, stream.read() returns exactly the number of bytes
> >> +    # requested. Various Python APIs will perform multiple system calls
> >> +    # to retrieve more bytes if the first call did not return enough. This
> >> +    # includes cases where the system call is interrupted.
> >> +    #
> >> +    # Even with this retry logic in the Python APIs, we still see failure
> >> +    # to retrieve the requested number of bytes. So, we build in our own
> >> +    # retry layer here.
> >> +    left = n
> >> +    chunk = stream.read(n)
> >> +    left -= len(chunk)
> >> +    assert left >= 0
> >> +
> >> +    # Common case is it just works.
> >> +    if not left:
> >> +        return chunk
> >> +
> >> +    chunks = [chunk]
> >> +    # Retry to get remaining data. In cases where the stream has errored or
> >> +    # is at EOF, this will do a bit of redundant work. But it helps prevent
> >> +    # intermittent failures and isn't common. So the overhead is acceptable.
> >> +    for i in range(3):
> >> +        chunk = stream.read(left)
> >> +        chunks.append(chunk)
> >> +        left -= len(chunk)
> >> +        assert left >= 0
> >> +
> >> +        if not left:
> >> +            break
> >
> > Do you have numbers showing that this change actually mitigates the
> > problem?
>
> No. I could try to get that this week.

Many thanks.

> > Supposing the underlying functions have bugs, I don't think we can always
> > get contiguous chunks with no data loss. If a few bytes are lost, for
> > example, we could read an arbitrary part as a length field and call
> > readexactly() with that invalid length value.
>
> Hmm. My assumption is that any error that would result in data loss
> effectively poisons the stream and prevents future reads.

I was thinking of e.g. an intermediate layer that does not handle EINTR
appropriately, so that EINTR is instead caught by some upper layer.
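As an illustration of the retry approach in the quoted patch, here is a
minimal, self-contained sketch. It is not Mercurial's actual implementation:
the `retries` parameter and the `ShortReadStream` test double are invented
for this example, which simulates a stream that returns fewer bytes than
requested per read() call.

```python
import io


def readexactly(stream, n, retries=3):
    """Read exactly n bytes from a stream, retrying on short reads.

    'retries' bounds the number of follow-up read() calls; it is a
    parameter invented for this sketch, not part of Mercurial's API.
    """
    chunk = stream.read(n)
    left = n - len(chunk)
    if not left:
        return chunk  # common case: a single read() returns everything

    chunks = [chunk]
    for _ in range(retries):
        chunk = stream.read(left)
        if not chunk:
            break  # EOF or a dead stream; further retries cannot help
        chunks.append(chunk)
        left -= len(chunk)
        if not left:
            return b''.join(chunks)
    raise EOFError('stream ended after %d of %d bytes' % (n - left, n))


class ShortReadStream(object):
    """Fake stream returning at most chunk_size bytes per read(),
    simulating the partial reads discussed in this thread."""

    def __init__(self, data, chunk_size):
        self._buf = io.BytesIO(data)
        self._chunk_size = chunk_size

    def read(self, n):
        return self._buf.read(min(n, self._chunk_size))
```

For example, `readexactly(ShortReadStream(b'abcdefgh', 3), 8)` assembles the
full 8 bytes across three read() calls, while a stream that runs out of data
raises EOFError instead of silently returning a short result.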
If such an intermediate layer had a temporary buffer, its content would be
lost on the interrupted call.

_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel