Check the pyOpenSSL version. Upgrade pyOpenSSL on all machines to
pyOpenSSL-0.6-2.el5.
Check the hosts. Make sure all machines can resolve each other.
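
A quick way to check name resolution is a small script run on each machine.
This is only a sketch: the function name unresolved_hosts and the host list
you pass in are made up for illustration and are not part of func. It is
written for a current Python and uses only the standard socket module.

```python
import socket


def unresolved_hosts(hostnames):
    """Return the names from `hostnames` that fail a forward lookup.

    Hypothetical helper for illustration only; not part of func.
    """
    failed = []
    for name in hostnames:
        try:
            # Uses the system resolver, so /etc/hosts and DNS both count.
            socket.gethostbyname(name)
        except socket.error:
            failed.append(name)
    return failed
```

Call it with the names of your master and minions; an empty list back means
every name resolved from that machine.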

On Thu, Oct 20, 2011 at 12:07 AM, Alison Young <[email protected]> wrote:

>  Hello,
>
> We are seeing an occasional problem where restarts of funcd on the minions
> are not successful and the func daemon is stopped but not able to start
> again.
>
> Checking func.log gives:
>
> 2011-10-02 04:02:04,321 - INFO - Exception occured: socket.error
> 2011-10-02 04:02:04,321 - INFO - Exception value: (98, 'Address already in
> use')
> 2011-10-02 04:02:04,322 - INFO - Exception Info:
>   File "/usr/bin/funcd", line 23, in ?
>     server.main(sys.argv)
>    File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 413,
> in main
>     serve()
>    File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 225,
> in serve
>     server = setup_server()
>    File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 220,
> in setup_server
>     server = FuncSSLXMLRPCServer((listen_addr, listen_port),
> config.module_list)
>    File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 279,
> in __init__
>     self.ca)
>    File
> "/usr/lib/python2.4/site-packages/func/minion/AuthedXMLRPCServer.py", line
> 74, in __init__
>     SimpleXMLRPCServer.SimpleXMLRPCServer.__init__(self, address,
> AuthedSimpleXMLRPCRequestHandler)
>    File "/usr/lib64/python2.4/SimpleXMLRPCServer.py", line 473, in __init__
>     SocketServer.TCPServer.__init__(self, addr, requestHandler)
>    File "/usr/lib64/python2.4/SocketServer.py", line 330, in __init__
>     self.server_bind()
>    File "/usr/lib64/python2.4/SocketServer.py", line 341, in server_bind
>     self.socket.bind(self.server_address)
>    File "<string>", line 1, in bind
>
>
> As you may guess from the timestamp we are seeing this problem most often
> at 4:02am on Sundays, i.e. as part of the logrotate of func logs. Logging in
> to the server and starting the func service once we spot it is stopped has
> always worked so far without needing manual removal of any pid or lock file.
>
> One theory is that this problem occurred when the func minion was
> processing a command and told to restart part way through. From watching
> netstat, it looks like the func daemon stops listening on the minion port to
> allow the spawned process to communicate with the master. If the daemon
> stops, the spawned process blocks a new daemon from starting ('Address
> already in use') but that spawned process then exits and we're left with no
> daemons.
>
> Does this ring any bells with anyone? Is this a known bug?
>
> We've already added monit to mop up after this, but it'd be much preferable
> to find a proper fix.
>
> Alison
>
> _______________________________________________
> Func-list mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/func-list
>
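
On the theory in the quoted message: the error itself (98, 'Address already
in use') is what any process gets when it binds a TCP port that another
socket still holds. The sketch below only reproduces that error on a
throwaway local port; it does not show what funcd actually does, and the
helper name try_bind is made up for illustration.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind a TCP socket; return (socket, None) or (None, errno).

    Hypothetical helper for illustration only; not part of func.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except socket.error as exc:
        sock.close()
        return None, exc.errno
    return sock, None


# The first holder takes a free port chosen by the OS (port 0).
holder, _ = try_bind(0)
holder.listen(1)
port = holder.getsockname()[1]

# While the holder is alive, a second bind on the same port is refused.
second, err = try_bind(port)
blocked = second is None and err == errno.EADDRINUSE

# Once the holder closes its socket, the same bind succeeds.
holder.close()
third, err_after = try_bind(port)
freed = third is not None
if third is not None:
    third.close()
```

If the theory is right, the restarted daemon would be in the position of the
second bind here, and a manual start later succeeds because the old holder
has exited by then.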



-- 
--------------------------

马新成 | Jackie Ma

MSN: [email protected]   QQ: 2252339967
Twitter: @JackieMa2   G+:  Jackie Ma
My_web: http://jackiema.blog.chinaunix.net

              http://cn.linkedin.com/in/jacknet

Making IT operations simple, convenient, and intelligent; improving operational efficiency and saving manpower