Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-31 Thread Dan Kenigsberg
On Fri, Jan 25, 2013 at 12:30:39PM +0800, Royce Lv wrote: 3. Actions we will take: (1) As a workaround, we can first remove the zombie reaper from the supervdsm server. (2) I'll see whether Python has a fixed version for this. It does not seem so:

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-25 Thread Mark Wu
On Fri 25 Jan 2013 05:23:24 PM CST, Royce Lv wrote: I patched the Python source managers.py to retry recv() after EINTR; supervdsm works well and the issue is gone. The Python docs even declare that only the main thread can set a new signal handler, and the main thread will be the only one to
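
The retry-after-EINTR idea mentioned in this message can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual managers.py patch: `retry_on_eintr` and `FlakyConn` are hypothetical names made up for the example, and the fake connection simply simulates a recv() that is interrupted by a signal a couple of times. On Python 3.5 and later the interpreter already retries most interrupted system calls itself (PEP 475), so a wrapper like this mainly matters on older interpreters such as the Python 2 of this thread.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying for as long as it fails with EINTR.

    Hypothetical helper for illustration; any other error is re-raised.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as e:
            if e.errno != errno.EINTR:
                raise
            # Interrupted by a signal (e.g. SIGCHLD): just try again.


class FlakyConn(object):
    """Hypothetical stand-in for a connection whose recv() gets interrupted."""

    def __init__(self, failures):
        self.failures = failures

    def recv(self):
        if self.failures > 0:
            self.failures -= 1
            raise OSError(errno.EINTR, "Interrupted system call")
        return "payload"


if __name__ == "__main__":
    conn = FlakyConn(failures=2)
    print(retry_on_eintr(conn.recv))
```

The loop retries only on EINTR, so genuine failures (a closed socket, a bad descriptor) still surface to the caller instead of spinning forever.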

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-24 Thread Dan Kenigsberg
On Wed, Jan 23, 2013 at 04:44:29PM -0600, Dead Horse wrote: I narrowed down on the commit where the originally reported issue crept in: commit fc3a44f71d2ef202cff18d7203b9e4165b546621. Building and testing with this commit or subsequent commits yields the original issue. Could you provide more

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-24 Thread Royce Lv
On 01/24/2013 05:21 PM, Dan Kenigsberg wrote: Hi, will you provide the log or let me access the test env if possible (because we don't have IB in our lab)? I'll look at it immediately. Sorry for the inconvenience if I have introduced the regression.

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-24 Thread ybronhei
On 01/24/2013 12:44 AM, Dead Horse wrote: I narrowed down on the commit where the originally reported issue crept in: commit fc3a44f71d2ef202cff18d7203b9e4165b546621. Building and testing with this commit or subsequent commits yields the original issue. Interesting... it might be related to this

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-24 Thread Dead Horse
The test harness setup here consists of two servers tied to NFS storage via IB (NFS mounts are via IPoIB; NFS over RDMA is disabled). All storage domains are NFS. The issue does occur with both servers on when attempting to bring them out of maintenance mode with the end result being

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-24 Thread Dead Horse
Tried some manual edits to SD states in the database. The net result was that I was able to get a node active. However, as reconstructing the master storage domain kicked in, it was unable to do so. It was also not able to recognize the other SD, with similar failure modes to the unrecognized master above.

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-24 Thread Mark Wu
Great work! The default action for SIGCHLD is ignore, so no problems were reported before a signal handler was installed by the zombie reaper. But I still have one problem: the Python multiprocessing.manager code is running a new thread and, according to the implementation of Python's signal,

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-23 Thread Dan Kenigsberg
On Tue, Jan 22, 2013 at 04:02:24PM -0600, Dead Horse wrote: Any ideas on this one? (from VDSM log): Thread-25::DEBUG::2013-01-22 15:35:29,065::BindingXMLRPC::914::vds::(wrapper) client [3.57.111.30]::call getCapabilities with () {} Thread-25::ERROR::2013-01-22

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-23 Thread Dead Horse
Indeed, reverting back to an older vdsm clears up the above issue. However, now the issue I see is: Thread-18::ERROR::2013-01-23 15:50:42,885::task::833::TaskManager.Task::(_setError) Task=`08709e68-bcbc-40d8-843a-d69d4df40ac6`::Unexpected error Traceback (most recent call last): File

Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-23 Thread Dead Horse
I narrowed down on the commit where the originally reported issue crept in: commit fc3a44f71d2ef202cff18d7203b9e4165b546621. Building and testing with this commit or subsequent commits yields the original issue. - DHC On Wed, Jan 23, 2013 at 3:56 PM, Dead Horse deadhorseconsult...@gmail.com wrote:

[Users] latest vdsm cannot read ib device speeds causing storage attach fail

2013-01-22 Thread Dead Horse
Any ideas on this one? (from VDSM log): Thread-25::DEBUG::2013-01-22 15:35:29,065::BindingXMLRPC::914::vds::(wrapper) client [3.57.111.30]::call getCapabilities with () {} Thread-25::ERROR::2013-01-22 15:35:29,113::netinfo::159::root::(speed) cannot read ib0 speed Traceback (most recent call