Comment #1 on issue 145 by [email protected]: Cluster init/node join has
often problems with stale node daemons
http://code.google.com/p/ganeti/issues/detail?id=145
This happens to me as well when running gnt-cluster init on Gentoo with
Ganeti 2.4.2. Here is the debug output:
vm0 ~ # gnt-cluster init --debug --secondary-ip=$drbd_ip --vg-name=$vgname
--master-netdev=$interfacename --enabled-hypervisors=$hypervisors
--prealloc-wipe-disks=no $clustername
2011-08-02 15:23:46,953: gnt-cluster init pid=2332 cli:1927 INFO run with
arguments '--debug --secondary-ip=10.1.0.23 --vg-name=ganeti
--master-netdev=lan --enabled-hypervisors=kvm --prealloc-wipe-disks=no
ganeti.hq.*deleted*.hu'
2011-08-02 15:23:46,953: gnt-cluster init pid=2332 rpc:93 INFO Using PycURL
libcurl/7.21.4 OpenSSL/1.0.0d zlib/1.2.5
2011-08-02 15:23:49,961: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
vgs --noheadings --units m --nosuffix -o name,size
2011-08-02 15:23:50,006: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
ip link show dev lan
2011-08-02 15:23:50,077: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
ssh-keygen -t dsa -f /root/.ssh/id_dsa -q -N ''
2011-08-02 15:23:51,662: gnt-cluster init pid=2332 bootstrap:124 DEBUG
Generating new cluster certificate at /var/lib/ganeti/server.pem
2011-08-02 15:23:51,853: gnt-cluster init pid=2332 bootstrap:129 DEBUG
Writing new confd HMAC key to /var/lib/ganeti/hmac.key
2011-08-02 15:23:51,902: gnt-cluster init pid=2332 bootstrap:144 DEBUG
Generating new RAPI certificate at /var/lib/ganeti/rapi.pem
2011-08-02 15:23:52,542: gnt-cluster init pid=2332 bootstrap:153 DEBUG
Generating new cluster domain secret at
/var/lib/ganeti/cluster-domain-secret
2011-08-02 15:23:52,584: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
/usr/lib64/ganeti/daemon-util start ganeti-noded
2011-08-02 15:23:52,829: gnt-cluster init pid=2332 client:335 DEBUG
Starting request <ganeti.http.client.HttpClientRequest 10.0.0.23:1811 PUT
/version at 0x24e0b10>
2011-08-02 15:23:52,829: gnt-cluster init pid=2332 client:320 DEBUG Created
new client <ganeti.http.client._PooledHttpClient id=10.0.0.23/1811
lastuse=0 <ganeti.http.client._HttpClient object at 0x24e0810> at 0x24eaf38>
2011-08-02 15:23:52,864: gnt-cluster init pid=2332 client:232 DEBUG Request
<ganeti.http.client.HttpClientRequest 10.0.0.23:1811 PUT /version at
0x24e0b10> finished, errmsg=None
2011-08-02 15:23:52,865: gnt-cluster init pid=2332 client:350 DEBUG
Returning client <ganeti.http.client._PooledHttpClient id=10.0.0.23/1811
lastuse=1 <ganeti.http.client._HttpClient object at 0x24e0810> at
0x24eaf38> to pool
2011-08-02 15:23:52,865: gnt-cluster init pid=2332 bootstrap:446 DEBUG
Starting daemons
2011-08-02 15:23:52,866: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
/usr/lib64/ganeti/daemon-util start-all
2011-08-02 15:23:52,964: gnt-cluster init pid=2332 process:116 DEBUG
Command '/usr/lib64/ganeti/daemon-util start-all' failed (exited with exit
code 1); output: * start-stop-daemon: /usr/sbin/ganeti-noded is already
running
exit code 1
2011-08-02 15:23:52,965: gnt-cluster init pid=2332 cli:1936 ERROR Error
during command processing
Traceback (most recent call last):
File "/usr/lib64/python2.7/site-packages/ganeti/cli.py", line 1932, in
GenericMain
result = func(options, args)
File "/usr/lib64/python2.7/site-packages/ganeti/rpc.py", line 176, in
wrapper
return fn(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/ganeti/client/gnt_cluster.py",
line 146, in InitCluster
prealloc_wipe_disks=opts.prealloc_wipe_disks,
File "/usr/lib64/python2.7/site-packages/ganeti/bootstrap.py", line 451,
in InitCluster
(result.cmd, result.exit_code, result.output))
OpExecError: Could not start daemons, command /usr/lib64/ganeti/daemon-util
start-all had exitcode 1 and error * start-stop-daemon:
/usr/sbin/ganeti-noded is already running
exit code 1
Failure: command execution error:
Could not start daemons, command /usr/lib64/ganeti/daemon-util start-all
had exitcode 1 and error * start-stop-daemon: /usr/sbin/ganeti-noded is
already running
exit code 1
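Please note that no ganeti daemons were running before I ran this command.
Init tries to start ganeti-noded twice: first at 15:23:52,584
(daemon-util start ganeti-noded) and again at 15:23:52,866
(daemon-util start-all). The second attempt fails because ganeti-noded is
already running.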
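
To make the double start easier to spot in a long debug log, something like the following could be used. This is only a rough sketch and not part of Ganeti: the function name daemon_util_calls is made up for illustration, and the sample text is just the relevant records copied from the output above. It assumes every log record begins with a timestamp and that wrapped lines belong to the preceding record.

```python
import re

# Each log record starts with a timestamp such as "2011-08-02 15:23:52,584:".
RECORD_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): ")


def daemon_util_calls(log_text):
    """Return (timestamp, command) pairs for every daemon-util RunCmd record.

    Hypothetical helper for illustration only; not a Ganeti API.
    """
    records = []
    for line in log_text.splitlines():
        if RECORD_START.match(line):
            records.append(line.strip())
        elif records:
            # Continuation of a record that was wrapped onto the next line.
            records[-1] += " " + line.strip()

    calls = []
    marker = "RunCmd "
    for record in records:
        timestamp = RECORD_START.match(record).group(1)
        if marker in record and "daemon-util" in record:
            calls.append((timestamp, record.split(marker, 1)[1]))
    return calls


# Records copied from the debug output above.
sample = """\
2011-08-02 15:23:52,584: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
/usr/lib64/ganeti/daemon-util start ganeti-noded
2011-08-02 15:23:52,865: gnt-cluster init pid=2332 bootstrap:446 DEBUG
Starting daemons
2011-08-02 15:23:52,866: gnt-cluster init pid=2332 process:195 DEBUG RunCmd
/usr/lib64/ganeti/daemon-util start-all
"""

for timestamp, command in daemon_util_calls(sample):
    print(timestamp, command)
```

On the records above this lists two daemon-util invocations, the explicit start of ganeti-noded and the later start-all, which matches the "already running" error.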