Dear David,

i have not been able to look into this yet. However, i've checked our logs on several servers using nsproxy intensively, but did not see any problems. How is the proxy setup (config section and "ns_proxy configure" command) in your environment?

-g
Am 10.05.17 um 10:50 schrieb David Osborne:
Has anyone got any ideas on debugging this issue?
Turns out it's causing us many more operational issues that we first thought...

Regards,
--
David

On 18 April 2017 at 15:44, David Osborne <da...@qcode.co.uk <mailto:da...@qcode.co.uk>> wrote:

    Hi,

    We're encountering an occasional problem which means calls to
    ns_proxy hang forever.
    It seems to be when an ns_proxy eval command timeouts during
    evaluation & it produces a large amount of output.
    Under those circumstances ns_proxy becomes inoperable.

    Following is a contrived test case which I think demonstrates what
    we are experiencing.
    (The ns_proxy timeout here is set unrealistically small to force
    the timeout - in production the timeout is 1 second)


    Normal test case using command:

    |ns_proxy eval $handle [list exec dd if=/dev/urandom bs=1K count=23
    | base64] 1|


    % set handle [ns_proxy get exec]
    [07/Apr/2017:13:06:57][28009.7fa7b760b700][-command-]
    Debug(nsproxy): set expire in 300000 ms for pool exec slave 28197
    [07/Apr/2017:13:06:57][28009.7fa7b760b700][-command-]
    Debug(nsproxy): nsproxy: slave 28197 started
    [07/Apr/2017:13:06:57][28197.7f3f5dcc3700][-main-] Notice: OpenSSL
    1.0.1t  3 May 2016 initialized
    [07/Apr/2017:13:06:57][28009.7fa78bfff700][-nsproxy:reap-] Notice:
    starting
    [07/Apr/2017:13:06:57][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper run
    [07/Apr/2017:13:06:57][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper checks pool exec
    [07/Apr/2017:13:06:57][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper sets timeout based on idle diff
    1491567117.835940 of pool exec
    [07/Apr/2017:13:06:57][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper waits for cond with timeout 1491567117.835940
    exec-7

    % set result [ns_proxy eval $handle [list exec dd if=/dev/urandom
    bs=1K count=23 | base64] 1]
    [07/Apr/2017:13:07:09][28009.7fa7b760b700][-command-]
    Debug(nsproxy): proxy send pool exec slave 28197: exec dd
    if=/dev/urandom bs=1K count=23 | base64
    could not wait for proxy "exec-7": timeout waiting for evaluation

    % ns_proxy active exec
    {handle exec-7 slave 28197 start 1491566829:518164 script {exec dd
    if=/dev/urandom bs=1K count=23 | base64}}

    % ns_proxy cleanup
    [07/Apr/2017:13:07:26][28009.7fa7b760b700][-command-]
    Debug(nsproxy): nsproxy [exec]: close slave 28197 (expire 1000 ms)
    [07/Apr/2017:13:07:26][28009.7fa7b760b700][-command-]
    Debug(nsproxy): set expire in 1000 ms for pool exec slave 28197
    [07/Apr/2017:13:07:26][28009.7fa7b760b700][-command-]
    Debug(nsproxy): nsproxy [exec]: slave 28197 closed
    [07/Apr/2017:13:07:26][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper run
    [07/Apr/2017:13:07:26][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper checks pool exec
    [07/Apr/2017:13:07:26][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper sets timeout based on idle diff
    1491567146.613112 of pool exec
    [07/Apr/2017:13:07:26][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper waits for cond with timeout 1491567146.613112


    Hanging test case using command (note: same command except with
    slightly larger output):

    |ns_proxy eval $handle [list exec dd if=/dev/urandom bs=1K count=24
    | base64] 1|


    % set handle [ns_proxy get exec]
    [07/Apr/2017:13:08:13][28009.7fa7b760b700][-command-]
    Debug(nsproxy): set expire in 300000 ms for pool exec slave 28968
    [07/Apr/2017:13:08:13][28009.7fa7b760b700][-command-]
    Debug(nsproxy): nsproxy: slave 28968 started
    [07/Apr/2017:13:08:13][28968.7f0059a9e700][-main-] Notice: OpenSSL
    1.0.1t  3 May 2016 initialized
    [07/Apr/2017:13:08:13][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper run
    [07/Apr/2017:13:08:13][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper checks pool exec
    [07/Apr/2017:13:08:13][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper sets timeout based on idle diff
    1491567193.223279 of pool exec
    [07/Apr/2017:13:08:13][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper waits for cond with timeout 1491567193.223279
    exec-7

    % set result [ns_proxy eval $handle [list exec dd if=/dev/urandom
    bs=1K count=24 | base64] 1]
    [07/Apr/2017:13:08:21][28009.7fa7b760b700][-command-]
    Debug(nsproxy): proxy send pool exec slave 28968: exec dd
    if=/dev/urandom bs=1K count=24 | base64
    could not wait for proxy "exec-7": timeout waiting for evaluation

    % ns_proxy active exec
    {handle exec-7 slave 28968 start 1491566901:234806 script {exec dd
    if=/dev/urandom bs=1K count=24 | base64}}

    % ns_proxy cleanup
    [07/Apr/2017:13:08:40][28009.7fa7b760b700][-command-]
    Debug(nsproxy): nsproxy [exec]: close slave 28968 (expire 1000 ms)
    [07/Apr/2017:13:08:40][28009.7fa7b760b700][-command-]
    Debug(nsproxy): set expire in 1000 ms for pool exec slave 28968
    [07/Apr/2017:13:08:40][28009.7fa7b760b700][-command-]
    Debug(nsproxy): nsproxy [exec]: slave 28968 closed
    [07/Apr/2017:13:08:40][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper run
    [07/Apr/2017:13:08:40][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper checks pool exec
    [07/Apr/2017:13:08:40][28009.7fa78bfff700][-nsproxy:reap-]
    Debug(nsproxy): reaper sets timeout based on idle diff
    1491567220.342132 of pool exec

    ns_proxy cleanup never returns.


    The location of the hang is during waitpid() called via
    Ns_WaitProcess
    
<https://bitbucket.org/naviserver/naviserver/src/5b852119ea5f3e3e796eb9385c22162a070a6b24/nsproxy/nsproxylib.c?at=default&fileviewer=file-view-default#nsproxylib.c-2935>

    The system seems to think that the process hasn't exited.
    Any ideas what is going on here?


    Regards,
-- David






------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel


--
Univ.Prof. Dr. Gustaf Neumann
WU Vienna
Institute of Information Systems and New Media
Welthandelsplatz 1, A-1020 Vienna, Austria

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel

Reply via email to