On 5/22/2017 2:30 AM, Wouter Verhelst wrote:
> On Fri, May 12, 2017 at 09:53:24AM -0400, Menke, Gregory D.
> (GSFC-582.0)[Arctic Slope Technical Services, Inc.] wrote:
>> Hi all,
>>
>> I traced the issue some more, it is related to the client side- it
>> appears the client connection to the localhost end of the tunnel drops,
>> but if the tunnel is connected from a different computer on the local
>> subnet, and nbd-client sends its connection thru that, then nbp is stable.
>>
>> So I'm pursing why nbp-client making a connection to a localhost tunnel
> nbd, not nbp ;-)
hmm yes indeed :)

>
>> endpoint is fragile. I'm going to try ssh tunnels on the local subnet
>> so they are fast, to see if the behavior is related to wan
>> latency/bandwidth or not.
>>
>> In the circumstance of the localhost connection dropping it tends to
>> leave the nbp-client and mount point difficult to close, SIGKILL on the
>> entire stack of related software is sometimes unable to exit the
>> processes so things can be unwound. When SIGKILL does work then use of
>> the nbp device can be recovered. It has the appearance of deadlock in
>> the nbp kernel module.
> Does this seem more likely to happen under memory stress? Are you
> swapping to the device, or running programs from it?
>
> If so, this might be related to what
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f338fe4540b1d0600b02314c7d885fd358e9eca
> fixed for direct NBD connections. There is little to nothing that can be
> done about that.
>

I have not seen it obviously associated with memory pressure, though I
could have been missing such a factor. It is definitely associated with
the nbd client connection to localhost. I was able to duplicate the
behavior both on my slow wan ssh tunnel and on a fast local network ssh
tunnel; in both cases, once the nbd client's connection is to another
host on the network, behavior is much better.

Leaving nbd-client (and nbd-client -d) dead and unkillable, with the nbd
kernel module impossible to unload, remains the characteristic symptom.

I'll try adjusting min_free_kbytes as per the thread and see if the
behavior changes.

Thanks!

_______________________________________________
Nbd-general mailing list
Nbd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nbd-general
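
For anyone repeating the min_free_kbytes experiment, it may help to record the current reserve before changing it. The following is only a rough sketch, not something taken from the thread: it assumes a Linux host exposing /proc/sys/vm/min_free_kbytes and /proc/meminfo, and the helper names parse_meminfo and reserve_percent are made up for illustration. It only reads the value; changing it needs root, for example via sysctl -w vm.min_free_kbytes=<value>, and no particular value is suggested here.

```python
from pathlib import Path

# Standard Linux procfs locations (assumed to be present on the test host).
MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")
MEMINFO_PATH = Path("/proc/meminfo")


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reserve_percent(min_free_kb, mem_total_kb):
    """Return min_free_kbytes as a percentage of total RAM."""
    return 100.0 * min_free_kb / mem_total_kb


if __name__ == "__main__" and MIN_FREE_PATH.exists():
    # Hypothetical usage: print the current reserve so it can be noted
    # before and after any adjustment.
    min_free = int(MIN_FREE_PATH.read_text().split()[0])
    total = parse_meminfo(MEMINFO_PATH.read_text())["MemTotal"]
    print(f"vm.min_free_kbytes = {min_free} kB "
          f"({reserve_percent(min_free, total):.2f}% of MemTotal)")
```

Running it once before and once after the change gives two numbers to quote alongside any report of whether the hang still reproduces.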