On 5/22/2017 2:30 AM, Wouter Verhelst wrote:
> On Fri, May 12, 2017 at 09:53:24AM -0400, Menke, Gregory D. 
> (GSFC-582.0)[Arctic Slope Technical Services, Inc.] wrote:
>> Hi all,
>>
>> I traced the issue some more, it is related to the client side- it
>> appears the client connection to the localhost end of the tunnel drops,
>> but if the tunnel is connected from a different computer on the local
>> subnet, and nbd-client sends its connection thru that, then nbp is stable.
>>
>> So I'm pursuing why nbp-client making a connection to a localhost tunnel
> nbd, not nbp ;-)

hmm yes indeed :)

>
>> endpoint is fragile.  I'm going to try ssh tunnels on the local subnet
>> so they are fast, to see if the behavior is related to wan
>> latency/bandwidth or not.
>>
>> In the circumstance of the localhost connection dropping it tends to
>> leave the nbp-client and mount point difficult to close, SIGKILL on the
>> entire stack of related software is sometimes unable to exit the
>> processes so things can be unwound.  When SIGKILL does work then use of
>> the nbp device can be recovered.  It has the appearance of deadlock in
>> the nbp kernel module.
> Does this seem more likely to happen under memory stress? Are you
> swapping to the device, or running programs from it?
>
> If so, this might be related to what
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f338fe4540b1d0600b02314c7d885fd358e9eca
> fixed for direct NBD connections. There is little to nothing that can be
> done about that.
>

I have not seen an obvious association with memory pressure, though I
could be missing such a factor.  It is definitely associated with the
nbd client connecting to localhost.  I was able to reproduce the
behavior on both my slow WAN ssh tunnel and a fast local-network ssh
tunnel; in both cases, once the nbd client connects to another host on
the network instead, behavior is much better.  The characteristic
symptom remains nbd-client (and nbd-client -d) left dead and unkillable,
with the nbd kernel module impossible to unload.  I'll try adjusting
min_free_kbytes as per the thread and see if the behavior changes.
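
For anyone trying to reproduce this, here is a rough Python sketch of how the two things mentioned above could be checked: the current vm.min_free_kbytes value, and which processes are stuck. It assumes the unkillable nbd-client is sitting in uninterruptible sleep (state "D" in /proc/<pid>/stat), which I have not confirmed. The function names are made up for illustration and are not part of nbd.

```python
import os


def parse_stat(text):
    """Split the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so take the first '(' and the last ')' as its boundaries.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


def read_min_free_kbytes(path="/proc/sys/vm/min_free_kbytes"):
    """Read the current vm.min_free_kbytes setting as an integer."""
    with open(path) as fh:
        return int(fh.read().strip())
```

On a Linux box, calling read_min_free_kbytes() before and after changing the setting, and uninterruptible_tasks() while the hang is in progress, should show whether nbd-client is among the stuck tasks. Changing min_free_kbytes itself needs root and is left out here on purpose.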


Thanks!


_______________________________________________
Nbd-general mailing list
Nbd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nbd-general