LGTM, thanks!

The only thing I'd suggest is to expand the docstrings for _VerifyCertificateSoft and _VerifyCertificateStrong, as currently the text is identical and a bit confusing. But I think it'd be better to do that as a separate patch for 2.13, not as part of the merge.

On Tue, Jul 07, 2015 at 10:40:05AM +0000, 'Helga Velroyen' via ganeti-devel wrote:
commit a338c3f4114e58e70cabeffa453b393d54363415
Merge: 8e50042 d2050bd
Author: Helga Velroyen <[email protected]>
Date:   Tue Jul 7 11:52:47 2015 +0200

   Merge branch 'stable-2.12' into stable-2.13

   * stable-2.12
     Handle SSL setup when downgrading
     Write SSH ports to ssconf files
     Noded: Consider certificate chain in callback
     Cluster-keys-replacement: update documentation
     Backend: Use timestamp as serial no for server cert
     UPGRADE: add note about 2.12.5
     NEWS: Mention issue 1094
     man: mention changes in renew-crypto
     Verify: warn about self-signed client certs
     Bootstrap: validate SSL setup before starting noded
     Clean up configuration of curl request
     Renew-crypto: remove superflous copying of node certs
     Renew-crypto: propagate verbose and debug option
     Noded: log the certificate and digest on noded startup
     QA: reload rapi cert after renew crypto
     Prepare-node-join: use common functions
     Renew-crypto: remove dead code
     Init: add master client certificate to configuration
     Renew-crypto: rebuild digest map of all nodes
     Noded: make "bootstrap" a constant
     node-daemon-setup: generate client certificate
     tools: Move (Re)GenerateClientCert to common
     Renew cluster and client certificates together
     Init: create the master's client cert in bootstrap
     Renew client certs using ssl_update tool
     Run functions while (some) daemons are stopped
     Back up old client.pem files
     Introduce ssl_update tool
     x509 function for creating signed certs
     Add tools/common.py from 2.13
     Consider ECDSA in SSH setup
     Update documentation of watcher and RAPI daemon
     Watcher: add option for setting RAPI IP
     When connecting to Metad fails, log the full stack trace
     Set up the Metad client with allow_non_master
     Set up the configuration client properly on non-masters
     Add the 'allow_non_master' option to the WConfd RPC client
     Add the option to disable master checks to the RPC client
     Add 'allow_non_master' to the Luxi test transport class too
     Add 'allow_non_master' to FdTransport for compatibility
     Properly document all constructor arguments of Transport
     Allow the Transport class to be used for non-master nodes
     Don't define the set of all daemons twice

   Conflicts:
     Makefile.am
     NEWS
     UPGRADE
     lib/client/gnt_cluster.py
     lib/cmdlib/cluster.py
     lib/tools/common.py
     lib/tools/prepare_node_join.py
     lib/watcher/__init__.py
     man/ganeti-watcher.rst
     src/Ganeti/OpCodes.hs
     test/hs/Test/Ganeti/OpCodes.hs
     test/py/cmdlib/cluster_unittest.py
     test/py/ganeti.tools.prepare_node_join_unittest.py
     tools/cfgupgrade

   Resolutions:
     Makefile.am:
       add ssl_update and ssh_update
     NEWS:
       add new sections from 2.12 and 2.13
     UPGRADE:
       add notes for both 2.12 and 2.13
     lib/client/gnt_cluster.py:
       add all new options to RenewCluster, remove version-specific
       downgrade code
     lib/tools/common.py:
       split the two mismatching versions of _VerifyCertificate
       and VerifyCertificate up into [_]VerifyCertificate{Soft,Strong}
       and update usages accordingly
     lib/tools/prepare_node_join.py:
       use the correct VerifyCertificate function
     lib/watcher/__init__.py:
       add both new options, --rapi-ip and --no-verify-disks
     man/ganeti-watcher.rst:
       update docs for both new options (see above)
     src/Ganeti/OpCodes.hs:
       add all new options to OpRenewCrypto
     test/hs/Test/Ganeti/OpCodes.hs:
       add enough 'arbitrary' for all new options of OpRenewCrypto
     test/py/cmdlib/cluster_unittest.py:
       use changes from 2.12
     test/py/ganeti.tools.prepare_node_join_unittest.py:
       remove tests that were moved to common_unittest.py
     tools/cfgupgrade:
       use only downgrade code of 2.13

   Signed-off-by: Helga Velroyen <[email protected]>

diff --cc Makefile.am
index 55be4ea,79c964e..c04c6ef
--- a/Makefile.am
+++ b/Makefile.am
@@@ -322,9 -300,8 +322,10 @@@ CLEANFILES =
 tools/vif-ganeti-metad \
 tools/net-common \
 tools/users-setup \
+ tools/ssl-update \
 tools/vcluster-setup \
+ tools/prepare-node-join \
+ tools/ssh-update \
 $(python_scripts_shebang) \
 stamp-directories \
 stamp-srclinks \
@@@ -576,8 -553,7 +578,8 @@@ pytools_PYTHON =
 lib/tools/node_cleanup.py \
 lib/tools/node_daemon_setup.py \
 lib/tools/prepare_node_join.py \
- lib/tools/common.py \
- lib/tools/ssh_update.py
++ lib/tools/ssh_update.py \
+ lib/tools/ssl_update.py

 utils_PYTHON = \
 lib/utils/__init__.py \
@@@ -1225,8 -1161,8 +1227,9 @@@ PYTHON_BOOTSTRAP =
 tools/ensure-dirs \
 tools/node-cleanup \
 tools/node-daemon-setup \
- tools/ssl-update \
- tools/prepare-node-join
+ tools/prepare-node-join \
- tools/ssh-update
++ tools/ssh-update \
++ tools/ssl-update

 qa_scripts = \
 qa/__init__.py \
@@@ -1471,7 -1406,7 +1474,8 @@@ nodist_pkglib_python_scripts =
 tools/ensure-dirs \
 tools/node-daemon-setup \
 tools/prepare-node-join \
- tools/ssh-update
++ tools/ssh-update \
+ tools/ssl-update

 pkglib_python_basenames = \
 $(patsubst daemons/%,%,$(patsubst tools/%,%,\
@@@ -2391,8 -2310,8 +2395,9 @@@ tools/burnin: MODULE = ganeti.tools.bur
 tools/ensure-dirs: MODULE = ganeti.tools.ensure_dirs
 tools/node-daemon-setup: MODULE = ganeti.tools.node_daemon_setup
 tools/prepare-node-join: MODULE = ganeti.tools.prepare_node_join
+tools/ssh-update: MODULE = ganeti.tools.ssh_update
 tools/node-cleanup: MODULE = ganeti.tools.node_cleanup
+ tools/ssl-update: MODULE = ganeti.tools.ssl_update
 $(HS_BUILT_TEST_HELPERS): TESTROLE = $(patsubst test/hs/%,%,$@)

 $(PYTHON_BOOTSTRAP) $(gnt_scripts) $(gnt_python_sbin_SCRIPTS): Makefile |
stamp-directories
diff --cc UPGRADE
index 7c0ae77,e7cc46a..18efd65
--- a/UPGRADE
+++ b/UPGRADE
@@@ -39,34 -39,14 +39,42 @@@ the Ganeti binaries should happen in th
 your system.


+2.13
+----
+
+When upgrading to 2.13, first apply the instructions of ``2.11 and
+above``. 2.13 comes with the new feature of enhanced SSH security
+through individual SSH keys. This feature needs to be enabled
+after the upgrade by::
+
+   $ gnt-cluster renew-crypto --new-ssh-keys --no-ssh-key-check
+
+Note that new SSH keys are generated automatically without warning when
+upgrading with ``gnt-cluster upgrade``.
+
+If you instructed Ganeti to not touch the SSH setup (by using the
+``--no-ssh-init`` option of ``gnt-cluster init``), the changes in the
+handling of SSH keys will not affect your cluster.
+
+If you want to be prompted for each newly created SSH key, leave out
+the ``--no-ssh-key-check`` option in the command listed above.
+
+Note that after a downgrade from 2.13 to 2.12, the individual SSH keys
+will not get removed automatically. This can lead to reachability
+errors under very specific circumstances (Issue 1008). In case you plan
+on keeping 2.12 for a while and not upgrading to 2.13 again soon, we
+recommend replacing all SSH key pairs of non-master nodes with the
+master node's SSH key pair.
+
+
+ 2.12
+ ----
+
+ Due to issue #1094 in Ganeti 2.11 and 2.12 up to version 2.12.4, we
+ advise to rerun 'gnt-cluster renew-crypto --new-node-certificates'
+ after an upgrade to 2.12.5 or higher.
+
+
 2.11
 ----

diff --cc lib/backend.py
index e977e81,5a04c7f..7c82d8d
--- a/lib/backend.py
+++ b/lib/backend.py
@@@ -953,139 -957,15 +959,143 @@@ def _VerifyClientCertificate(cert_file=
   (errcode, msg) = utils.VerifyCertificate(cert_file)
   if errcode is not None:
     return (errcode, msg)
-   else:
-     # if everything is fine, we return the digest to be compared to the config
-     return (None, utils.GetCertificateDigest(cert_filename=cert_file))
+
+   (errcode, msg) = utils.IsCertificateSelfSigned(cert_file)
+   if errcode is not None:
+     return (errcode, msg)
+
+   # if everything is fine, we return the digest to be compared to the config
+   return (None, utils.GetCertificateDigest(cert_filename=cert_file))


+def _VerifySshSetup(node_status_list, my_name,
+                    pub_key_file=pathutils.SSH_PUB_KEYS):
+  """Verifies the state of the SSH key files.
+
+  @type node_status_list: list of tuples
+  @param node_status_list: list of nodes of the cluster associated with a
+    couple of flags: (uuid, name, is_master_candidate,
+    is_potential_master_candidate, online)
+  @type my_name: str
+  @param my_name: name of this node
+  @type pub_key_file: str
+  @param pub_key_file: filename of the public key file
+
+  """
+  if node_status_list is None:
+    return ["No node list to check against the pub_key_file received."]
+
+  my_status_list = [(my_uuid, name, mc, pot_mc) for (my_uuid, name, mc, pot_mc)
+                    in node_status_list if name == my_name]
+  if len(my_status_list) == 0:
+    return ["Cannot find node information for node '%s'." % my_name]
+  (my_uuid, _, _, potential_master_candidate) = \
+     my_status_list[0]
+
+  result = []
+
+  if not os.path.exists(pub_key_file):
+    result.append("The public key file '%s' does not exist. Consider running"
+                  " 'gnt-cluster renew-crypto --new-ssh-keys"
+                  " [--no-ssh-key-check]' to fix this." % pub_key_file)
+    return result
+
+  pot_mc_uuids = [uuid for (uuid, _, _, _) in node_status_list]
+  pub_keys = ssh.QueryPubKeyFile(None)
+
+  if potential_master_candidate:
+    # Check that the set of potential master candidates matches the
+    # public key file
+    pub_uuids_set = set(pub_keys.keys())
+    pot_mc_uuids_set = set(pot_mc_uuids)
+    missing_uuids = set([])
+    if pub_uuids_set != pot_mc_uuids_set:
+      unknown_uuids = pub_uuids_set - pot_mc_uuids_set
+      if unknown_uuids:
+        result.append("The following node UUIDs are listed in the public key"
+                      " file on node '%s', but are not potential master"
+                      " candidates: %s."
+                      % (my_name, ", ".join(list(unknown_uuids))))
+      missing_uuids = pot_mc_uuids_set - pub_uuids_set
+      if missing_uuids:
+        result.append("The following node UUIDs of potential master candidates"
+                      " are missing in the public key file on node %s: %s."
+                      % (my_name, ", ".join(list(missing_uuids))))
+
+    (_, key_files) = \
+      ssh.GetAllUserFiles(constants.SSH_LOGIN_USER, mkdir=False, dircheck=False)
+    (_, dsa_pub_key_filename) = key_files[constants.SSHK_DSA]
+
+    my_keys = pub_keys[my_uuid]
+
+    dsa_pub_key = utils.ReadFile(dsa_pub_key_filename)
+    if dsa_pub_key.strip() not in my_keys:
+      result.append("The dsa key of node %s does not match this node's key"
+                    " in the pub key file." % (my_name))
+    if len(my_keys) != 1:
+      result.append("There is more than one key for node %s in the public key"
+                    " file." % my_name)
+  else:
+    if len(pub_keys.keys()) > 0:
+      result.append("The public key file of node '%s' is not empty, although"
+                    " the node is not a potential master candidate."
+                    % my_name)
+
+  # Check that all master candidate keys are in the authorized_keys file
+  (auth_key_file, _) = \
+    ssh.GetAllUserFiles(constants.SSH_LOGIN_USER, mkdir=False, dircheck=False)
+  for (uuid, name, mc, _) in node_status_list:
+    if uuid in missing_uuids:
+      continue
+    if mc:
+      for key in pub_keys[uuid]:
+        if not ssh.HasAuthorizedKey(auth_key_file, key):
+          result.append("A SSH key of master candidate '%s' (UUID: '%s') is"
+                        " not in the 'authorized_keys' file of node '%s'."
+                        % (name, uuid, my_name))
+    else:
+      for key in pub_keys[uuid]:
+        if name != my_name and ssh.HasAuthorizedKey(auth_key_file, key):
+          result.append("A SSH key of normal node '%s' (UUID: '%s') is in the"
+                        " 'authorized_keys' file of node '%s'."
+                        % (name, uuid, my_name))
+        if name == my_name and not ssh.HasAuthorizedKey(auth_key_file, key):
+          result.append("A SSH key of normal node '%s' (UUID: '%s') is
not"
+                        " in the 'authorized_keys' file of itself."
+                        % (my_name, uuid))
+
+  return result
+
+
+def _VerifySshClutter(node_status_list, my_name):
+  """Verifies that the 'authorized_keys' files are not cluttered up.
+
+  @type node_status_list: list of tuples
+  @param node_status_list: list of nodes of the cluster associated with a
+    couple of flags: (uuid, name, is_master_candidate,
+    is_potential_master_candidate, online)
+  @type my_name: str
+  @param my_name: name of this node
+
+  """
+  result = []
+  (auth_key_file, _) = \
+    ssh.GetAllUserFiles(constants.SSH_LOGIN_USER, mkdir=False, dircheck=False)
+  node_names = [name for (_, name, _, _) in node_status_list]
+  multiple_occurrences = ssh.CheckForMultipleKeys(auth_key_file, node_names)
+  if multiple_occurrences:
+    msg = "There are hosts which have more than one SSH key stored for the" \
+          " same user in the 'authorized_keys' file of node %s. This can be" \
+          " due to an unsuccessful operation which cluttered up the" \
+          " 'authorized_keys' file. We recommend to clean this up manually. " \
+          % my_name
+    for host, occ in multiple_occurrences.items():
+      msg += "Entry for '%s' in lines %s. " % (host, utils.CommaJoin(occ))
+    result.append(msg)
+
+  return result
+
+
 def VerifyNode(what, cluster_name, all_hvparams, node_groups, groups_cfg):
   """Verify the status of the local node.

diff --cc lib/bootstrap.py
index eaaaed0,3beefa0..c962daa
--- a/lib/bootstrap.py
+++ b/lib/bootstrap.py
@@@ -188,8 -227,30 +202,29 @@@ def _InitGanetiServerSetup(master_name

   """
   # Generate cluster secrets
-   GenerateClusterCrypto(True, False, False, False, False)
+   GenerateClusterCrypto(True, False, False, False, False, False, master_name)
+
+   # Add the master's SSL certificate digest to the configuration.
+   master_uuid = cfg.GetMasterNode()
+   master_digest = utils.GetCertificateDigest()
+   cfg.AddNodeToCandidateCerts(master_uuid, master_digest)
+   cfg.Update(cfg.GetClusterInfo(), logging.error)
+   ssconf.WriteSsconfFiles(cfg.GetSsconfValues())

-  if not os.path.exists(
-      os.path.join(pathutils.DATA_DIR,
-      "%s%s" % (constants.SSCONF_FILEPREFIX,
-                constants.SS_MASTER_CANDIDATES_CERTS))):
++  if not os.path.exists(os.path.join(pathutils.DATA_DIR,
++                        "%s%s" % (constants.SSCONF_FILEPREFIX,
++                                  constants.SS_MASTER_CANDIDATES_CERTS))):
+     raise errors.OpExecError("Ssconf file for master candidate certificates"
+                              " was not written.")
+
+   if not os.path.exists(pathutils.NODED_CERT_FILE):
+     raise errors.OpExecError("The server certificate was not created properly.")
+
+   if not os.path.exists(pathutils.NODED_CLIENT_CERT_FILE):
+     raise errors.OpExecError("The client certificate was not created"
+                              " properly.")
+
+   # set up the inter-node password and certificate
   result = utils.RunCmd([pathutils.DAEMON_UTIL, "start", constants.NODED])
   if result.failed:
     raise errors.OpExecError("Could not start the node daemon, command %s"
@@@ -776,11 -917,8 +811,11 @@@ def InitCluster(cluster_name, mac_prefi
   cfg.Update(cfg.GetClusterInfo(), logging.error)
   ssconf.WriteSsconfFiles(cfg.GetSsconfValues())

+  master_uuid = cfg.GetMasterNode()
+  if modify_ssh_setup:
+    ssh.InitPubKeyFile(master_uuid)
   # set up the inter-node password and certificate
-   _InitGanetiServerSetup(hostname.name)
+   _InitGanetiServerSetup(hostname.name, cfg)

   logging.debug("Starting daemons")
   result = utils.RunCmd([pathutils.DAEMON_UTIL, "start-all"])
@@@ -897,16 -1034,14 +932,17 @@@ def SetupNodeDaemon(opts, cluster_name
       utils.ReadFile(pathutils.NODED_CERT_FILE),
     constants.NDS_SSCONF: ssconf.SimpleStore().ReadAll(),
     constants.NDS_START_NODE_DAEMON: True,
+     constants.NDS_NODE_NAME: node,
     }

-  RunNodeSetupCmd(cluster_name, node, pathutils.NODE_DAEMON_SETUP,
-                  opts.debug, opts.verbose,
-                  True, opts.ssh_key_check, opts.ssh_key_check,
-                  ssh_port, data)
+  ssh.RunSshCmdWithStdin(cluster_name, node, pathutils.NODE_DAEMON_SETUP,
+                         ssh_port, data,
+                         debug=opts.debug, verbose=opts.verbose,
+                         use_cluster_key=True, ask_key=opts.ssh_key_check,
+                         strict_host_check=opts.ssh_key_check,
+                         ensure_version=True)

+  _WaitForSshDaemon(node, ssh_port)
   _WaitForNodeDaemon(node)


diff --cc lib/client/gnt_cluster.py
index 031bfbf,c56e2bb..54b0b80
--- a/lib/client/gnt_cluster.py
+++ b/lib/client/gnt_cluster.py
@@@ -966,7 -941,7 +967,8 @@@ def _ReadAndVerifyCert(cert_filename, v
 def _RenewCrypto(new_cluster_cert, new_rapi_cert, # pylint: disable=R0911
                  rapi_cert_filename, new_spice_cert, spice_cert_filename,
                  spice_cacert_filename, new_confd_hmac_key, new_cds,
-                  cds_filename, force, new_node_cert, new_ssh_keys):
-                 cds_filename, force, new_node_cert, verbose, debug):
++                 cds_filename, force, new_node_cert, new_ssh_keys,
++                 verbose, debug):
   """Renews cluster certificates, keys and secrets.

   @type new_cluster_cert: bool
@@@ -990,15 -965,14 +992,19 @@@
   @param cds_filename: Path to file containing new cluster domain secret
   @type force: bool
   @param force: Whether to ask user for confirmation
-  @type new_node_cert: string
+  @type new_node_cert: bool
   @param new_node_cert: Whether to generate new node certificates
+  @type new_ssh_keys: bool
+  @param new_ssh_keys: Whether to generate new node SSH keys
+   @type verbose: boolean
+   @param verbose: show verbose output
+   @type debug: boolean
+   @param debug: show debug output

   """
+  ToStdout("Updating certificates now. Running \"gnt-cluster verify\""
+           " is recommended after this operation.")
+
   if new_rapi_cert and rapi_cert_filename:
     ToStderr("Only one of the --new-rapi-certificate and --rapi-certificate"
              " options can be specified at the same time.")
@@@ -1086,63 -1059,101 +1091,151 @@@
         for file_name in files_to_copy:
           ctx.ssh.CopyFileToNode(node_name, port, file_name)

-   RunWhileClusterStopped(ToStdout, _RenewCryptoInner)
-
-   if new_node_cert or new_ssh_keys:
+   def _RenewClientCerts(ctx):
+     ctx.feedback_fn("Updating client SSL certificates.")
+
+     cluster_name = ssconf.SimpleStore().GetClusterName()
+
+     for node_name in ctx.nonmaster_nodes + [ctx.master_node]:
+       ssh_port = ctx.ssh_ports[node_name]
+       data = {
+         constants.NDS_CLUSTER_NAME: cluster_name,
+         constants.NDS_NODE_DAEMON_CERTIFICATE:
+           utils.ReadFile(pathutils.NODED_CERT_FILE),
+         constants.NDS_NODE_NAME: node_name,
+         constants.NDS_ACTION: constants.CRYPTO_ACTION_CREATE,
+         }
+
-      bootstrap.RunNodeSetupCmd(
++      ssh.RunSshCmdWithStdin(
+           cluster_name,
+           node_name,
+           pathutils.SSL_UPDATE,
-          ctx.debug,
-          ctx.verbose,
-          True, # use cluster key
-          False, # ask key
-          True, # strict host check
+           ssh_port,
-          data)
++          data,
++          debug=ctx.debug,
++          verbose=ctx.verbose,
++          use_cluster_key=True,
++          ask_key=False,
++          strict_host_check=True)
+
+     # Create a temporary ssconf file using the master's client cert digest
+     # and the 'bootstrap' keyword to enable distribution of all nodes' digests.
+     master_digest = utils.GetCertificateDigest()
+     ssconf_master_candidate_certs_filename = os.path.join(
+         pathutils.DATA_DIR, "%s%s" %
+         (constants.SSCONF_FILEPREFIX, constants.SS_MASTER_CANDIDATES_CERTS))
+     utils.WriteFile(
+         ssconf_master_candidate_certs_filename,
+         data="%s=%s" % (constants.CRYPTO_BOOTSTRAP, master_digest))
+     for node_name in ctx.nonmaster_nodes:
+       port = ctx.ssh_ports[node_name]
+       ctx.feedback_fn("Copying %s to %s:%d" %
+                       (ssconf_master_candidate_certs_filename, node_name, port))
+       ctx.ssh.CopyFileToNode(node_name, port,
+                              ssconf_master_candidate_certs_filename)
+
+     # Write the bootstrap entry to the config using wconfd.
+     config_live_lock = utils.livelock.LiveLock("renew_crypto")
+     cfg = config.GetConfig(None, config_live_lock)
+     cfg.AddNodeToCandidateCerts(constants.CRYPTO_BOOTSTRAP, master_digest)
+     cfg.Update(cfg.GetClusterInfo(), ctx.feedback_fn)
+
+   def _RenewServerAndClientCerts(ctx):
+     ctx.feedback_fn("Updating the cluster SSL certificate.")
+
+     master_name = ssconf.SimpleStore().GetMasterNode()
+     bootstrap.GenerateClusterCrypto(True, # cluster cert
+                                     False, # rapi cert
+                                     False, # spice cert
+                                     False, # confd hmac key
+                                     False, # cds
+                                     True, # client cert
+                                     master_name)
+
+     for node_name in ctx.nonmaster_nodes:
+       port = ctx.ssh_ports[node_name]
+       server_cert = pathutils.NODED_CERT_FILE
+       ctx.feedback_fn("Copying %s to %s:%d" %
+                       (server_cert, node_name, port))
+       ctx.ssh.CopyFileToNode(node_name, port, server_cert)
+
+     _RenewClientCerts(ctx)
+
+   if new_cluster_cert or new_rapi_cert or new_spice_cert \
+       or new_confd_hmac_key or new_cds:
+     RunWhileClusterStopped(ToStdout, _RenewCryptoInner)
+
+   # If only node certificates are recreated, call _RenewClientCerts only.
+   if new_node_cert and not new_cluster_cert:
+     RunWhileDaemonsStopped(ToStdout, [constants.NODED, constants.WCONFD],
+                            _RenewClientCerts, verbose=verbose, debug=debug)
+
+   # If the cluster certificate is renewed, the client certificates need
+   # to be renewed too.
+   if new_cluster_cert:
+     RunWhileDaemonsStopped(ToStdout, [constants.NODED, constants.WCONFD],
+                            _RenewServerAndClientCerts, verbose=verbose,
+                            debug=debug)
+
++  if new_node_cert or new_cluster_cert or new_ssh_keys:
+    cl = GetClient()
-     renew_op = opcodes.OpClusterRenewCrypto(node_certificates=new_node_cert,
-                                             ssh_keys=new_ssh_keys)
++    renew_op = opcodes.OpClusterRenewCrypto(
++        node_certificates=new_node_cert or new_cluster_cert,
++        ssh_keys=new_ssh_keys)
+    SubmitOpCode(renew_op, cl=cl)
+
+   ToStdout("All requested certificates and keys have been replaced."
+            " Running \"gnt-cluster verify\" now is recommended.")
+
-  if new_node_cert or new_cluster_cert:
+  return 0
+
+
+def _BuildGanetiPubKeys(options, pub_key_file=pathutils.SSH_PUB_KEYS, cl=None,
+                        get_online_nodes_fn=GetOnlineNodes,
+                        get_nodes_ssh_ports_fn=GetNodesSshPorts,
+                        get_node_uuids_fn=GetNodeUUIDs,
+                        homedir_fn=None):
+  """Recreates the 'ganeti_pub_key' file by polling all nodes.
+
+  """
+  if os.path.exists(pub_key_file):
+    utils.CreateBackup(pub_key_file)
+    utils.RemoveFile(pub_key_file)
+
+  ssh.ClearPubKeyFile(pub_key_file)
+
+  if not cl:
     cl = GetClient()
-    renew_op = opcodes.OpClusterRenewCrypto()
-    SubmitOpCode(renew_op, cl=cl)

-  return 0
+  (cluster_name, master_node) = \
+    cl.QueryConfigValues(["cluster_name", "master_node"])
+
+  online_nodes = get_online_nodes_fn([], cl=cl)
+  ssh_ports = get_nodes_ssh_ports_fn(online_nodes + [master_node], cl)
+  ssh_port_map = dict(zip(online_nodes + [master_node], ssh_ports))
+
+  node_uuids = get_node_uuids_fn(online_nodes + [master_node], cl)
+  node_uuid_map = dict(zip(online_nodes + [master_node], node_uuids))
+
+  nonmaster_nodes = [name for name in online_nodes
+                     if name != master_node]
+
+  _, pub_key_filename, _ = \
+    ssh.GetUserFiles(constants.SSH_LOGIN_USER, mkdir=False, dircheck=False,
+                     kind=constants.SSHK_DSA, _homedir_fn=homedir_fn)
+
+  # get the key file of the master node
+  pub_key = utils.ReadFile(pub_key_filename)
+  ssh.AddPublicKey(node_uuid_map[master_node], pub_key,
+                   key_file=pub_key_file)
+
+  # get the key files of all non-master nodes
+  for node in nonmaster_nodes:
+    pub_key = ssh.ReadRemoteSshPubKeys(pub_key_filename, node, cluster_name,
+                                       ssh_port_map[node],
+                                       options.ssh_key_check,
+                                       options.ssh_key_check)
+    ssh.AddPublicKey(node_uuid_map[node], pub_key, key_file=pub_key_file)


 def RenewCrypto(opts, args):
@@@ -1162,7 -1171,8 +1255,9 @@@
                       opts.cluster_domain_secret,
                       opts.force,
                       opts.new_node_cert,
-                       opts.new_ssh_keys)
++                      opts.new_ssh_keys,
+                       opts.verbose,
+                       opts.debug > 0)


 def _GetEnabledDiskTemplates(opts):
@@@ -2086,6 -2069,38 +2181,7 @@@ def _VersionSpecificDowngrade()
   @return: True upon success
   """
   ToStdout("Performing version-specific downgrade tasks.")
+
-  nodes = ssconf.SimpleStore().GetOnlineNodeList()
-  cluster_name = ssconf.SimpleStore().GetClusterName()
-  ssh_ports = ssconf.SimpleStore().GetSshPortMap()
-
-  for node in nodes:
-    data = {
-      constants.NDS_CLUSTER_NAME: cluster_name,
-      constants.NDS_NODE_DAEMON_CERTIFICATE:
-        utils.ReadFile(pathutils.NODED_CERT_FILE),
-      constants.NDS_NODE_NAME: node,
-      constants.NDS_ACTION: constants.CRYPTO_ACTION_DELETE,
-      }
-
-    try:
-      bootstrap.RunNodeSetupCmd(
-          cluster_name,
-          node,
-          pathutils.SSL_UPDATE,
-          True, # debug
-          True, # verbose,
-          True, # use cluster key
-          False, # ask key
-          True, # strict host check
-          ssh_ports[node],
-          data)
-    except Exception as e: # pylint: disable=W0703
-      # As downgrading can fail if a node is temporarily unreachable
-      # only output the error, but do not abort the entire operation.
-      ToStderr("Downgrading SSL setup of node '%s' failed: %s." %
-               (node, e))
-
   return True


@@@ -2409,7 -2422,7 +2505,8 @@@ commands =
      NEW_CONFD_HMAC_KEY_OPT, FORCE_OPT,
      NEW_CLUSTER_DOMAIN_SECRET_OPT, CLUSTER_DOMAIN_SECRET_OPT,
      NEW_SPICE_CERT_OPT, SPICE_CERT_OPT, SPICE_CACERT_OPT,
-      NEW_NODE_CERT_OPT, NEW_SSH_KEY_OPT, NOSSH_KEYCHECK_OPT],
-     NEW_NODE_CERT_OPT, VERBOSE_OPT],
++     NEW_NODE_CERT_OPT, NEW_SSH_KEY_OPT, NOSSH_KEYCHECK_OPT,
++     VERBOSE_OPT],
     "[opts...]",
     "Renews cluster certificates, keys and secrets"),
   "epo": (
diff --cc lib/cmdlib/cluster.py
index e6733f7,e19b08e..01acb75
--- a/lib/cmdlib/cluster.py
+++ b/lib/cmdlib/cluster.py
@@@ -65,11 -65,9 +65,10 @@@ from ganeti.cmdlib.common import ShareA
   CheckOSParams, CheckHVParams, AdjustCandidatePool, CheckNodePVs, \
   ComputeIPolicyInstanceViolation, AnnotateDiskParams, SupportsOob, \
   CheckIpolicyVsDiskTemplates, CheckDiskAccessModeValidity, \
-   CheckDiskAccessModeConsistency, CreateNewClientCert, \
+   CheckDiskAccessModeConsistency, GetClientCertDigest, \
+   AddInstanceCommunicationNetworkOp, ConnectInstanceCommunicationNetworkOp, \
-   CheckImageValidity, \
-   CheckDiskAccessModeConsistency, CreateNewClientCert, EnsureKvmdOnNodes,
\
-  CheckImageValidity, CheckDiskAccessModeConsistency, EnsureKvmdOnNodes
++  CheckImageValidity, CheckDiskAccessModeConsistency, EnsureKvmdOnNodes, \
+  WarnAboutFailedSshUpdates

 import ganeti.masterd.instance

@@@ -240,51 -136,8 +162,47 @@@ class LUClusterRenewCrypto(NoHooksLU)
         msg += "Node %s: %s\n" % (uuid, e)
       feedback_fn(msg)

-     self.cfg.RemoveNodeFromCandidateCerts("%s-SERVER" % master_uuid)
-     self.cfg.RemoveNodeFromCandidateCerts("%s-OLDMASTER" % master_uuid)
-     logging.debug("Cleaned up *-SERVER and *-OLDMASTER certificate from"
-                   " master candidate cert list. Current state of the"
-                   " list: %s.", cluster.candidate_certs)
+     self.cfg.SetCandidateCerts(digest_map)

+  def _RenewSshKeys(self, feedback_fn):
+    """Renew all nodes' SSH keys.
+
+    """
+    master_uuid = self.cfg.GetMasterNode()
+
+    nodes = self.cfg.GetAllNodesInfo()
+    nodes_uuid_names = [(node_uuid, node_info.name) for (node_uuid, node_info)
+                        in nodes.items() if not node_info.offline]
+    node_names = [name for (_, name) in nodes_uuid_names]
+    node_uuids = [uuid for (uuid, _) in nodes_uuid_names]
+    port_map = ssh.GetSshPortMap(node_names, self.cfg)
+    potential_master_candidates = self.cfg.GetPotentialMasterCandidates()
+    master_candidate_uuids = self.cfg.GetMasterCandidateUuids()
+
+    result = self.rpc.call_node_ssh_keys_renew(
+      [master_uuid],
+      node_uuids, node_names, port_map,
+      master_candidate_uuids,
+      potential_master_candidates)
+
+    # Check if there were serious errors (for example master key files not
+    # writable).
+    result[master_uuid].Raise("Could not renew the SSH keys of all nodes")
+
+    # Process any non-disruptive errors (a few nodes unreachable etc.)
+    WarnAboutFailedSshUpdates(result, master_uuid, feedback_fn)
+
+  def Exec(self, feedback_fn):
+    if self.op.node_certificates:
+      feedback_fn("Renewing Node SSL certificates")
+      self._RenewNodeSslCertificates(feedback_fn)
+    if self.op.ssh_keys and not self._ssh_renewal_suppressed:
+      feedback_fn("Renewing SSH keys")
+      self._RenewSshKeys(feedback_fn)
+    elif self._ssh_renewal_suppressed:
+      feedback_fn("Cannot renew SSH keys if the cluster is configured to not"
+                  " modify the SSH setup.")
+

 class LUClusterActivateMasterIp(NoHooksLU):
   """Activate the master IP on the master node.
diff --cc lib/cmdlib/node.py
index 18d4f34,2f83c73..b6d67bc
--- a/lib/cmdlib/node.py
+++ b/lib/cmdlib/node.py
@@@ -51,11 -51,9 +51,11 @@@ from ganeti.cmdlib.common import CheckP
   CheckInstanceState, INSTANCE_DOWN, GetUpdatedParams, \
   AdjustCandidatePool, CheckIAllocatorOrNode, LoadNodeEvacResult, \
   GetWantedNodes, MapInstanceLvsToNodes, RunPostHook, \
-   FindFaultyInstanceDisks, CheckStorageTypeEnabled, CreateNewClientCert, \
+   FindFaultyInstanceDisks, CheckStorageTypeEnabled, GetClientCertDigest, \
   AddNodeCertToCandidateCerts, RemoveNodeCertFromCandidateCerts, \
-  EnsureKvmdOnNodes
+  EnsureKvmdOnNodes, WarnAboutFailedSshUpdates
+
+from ganeti.ssh import GetSshPortMap


 def _DecideSelfPromotion(lu, exceptions=None):
diff --cc lib/pathutils.py
index ea35bcb,be6c432..d13eb75
--- a/lib/pathutils.py
+++ b/lib/pathutils.py
@@@ -64,8 -64,8 +64,9 @@@ IMPORT_EXPORT_DAEMON = _constants.PKGLI
 KVM_CONSOLE_WRAPPER = _constants.PKGLIBDIR + "/tools/kvm-console-wrapper"
 KVM_IFUP = _constants.PKGLIBDIR + "/kvm-ifup"
 PREPARE_NODE_JOIN = _constants.PKGLIBDIR + "/prepare-node-join"
+SSH_UPDATE = _constants.PKGLIBDIR + "/ssh-update"
 NODE_DAEMON_SETUP = _constants.PKGLIBDIR + "/node-daemon-setup"
+ SSL_UPDATE = _constants.PKGLIBDIR + "/ssl-update"
 XEN_CONSOLE_WRAPPER = _constants.PKGLIBDIR + "/tools/xen-console-wrapper"
 CFGUPGRADE = _constants.PKGLIBDIR + "/tools/cfgupgrade"
 POST_UPGRADE = _constants.PKGLIBDIR + "/tools/post-upgrade"
diff --cc lib/tools/common.py
index 5b4d2fc,b48b1ee..d3c4f67
--- a/lib/tools/common.py
+++ b/lib/tools/common.py
@@@ -31,10 -31,14 +31,15 @@@

 """

+ import logging
 import OpenSSL
+ import os
+ import time
+ from cStringIO import StringIO

 from ganeti import constants
+from ganeti import errors
+ from ganeti import pathutils
 from ganeti import utils
 from ganeti import serializer
 from ganeti import ssconf
@@@ -51,42 -54,72 +56,113 @@@ def VerifyOptions(parser, opts, args)
   return opts


--def _VerifyCertificate(cert_pem, error_fn,
--                       _check_fn=utils.CheckNodeCertificate):
++def _VerifyCertificateStrong(cert_pem, error_fn,
++                             _check_fn=utils.CheckNodeCertificate):
+   """Verifies a certificate/key pair against the local node daemon
+   certificate, requiring that the key matches the certificate.
+
+   @type cert_pem: string
+   @param cert_pem: Certificate and key in PEM format
+   @type error_fn: callable
+   @param error_fn: function to call in case of an error
+   @rtype: string
+   @return: Formatted key and certificate
+
+   """
+   try:
+     cert = \
+       OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert_pem)
+   except Exception, err:
+     raise error_fn("(stdin) Unable to load certificate: %s" % err)
+
+   try:
+     key = OpenSSL.crypto.load_privatekey(OpenSSL.crypto.FILETYPE_PEM, cert_pem)
+   except OpenSSL.crypto.Error, err:
+     raise error_fn("(stdin) Unable to load private key: %s" % err)
+
+   # Check certificate with given key; this detects cases where the key given on
+   # stdin doesn't match the certificate also given on stdin
+   try:
+     utils.X509CertKeyCheck(cert, key)
+   except OpenSSL.SSL.Error:
+     raise error_fn("(stdin) Certificate is not signed with given key")
+
+   # Standard checks, including check against an existing local certificate
+   # (no-op if that doesn't exist)
+   _check_fn(cert)
+
+   key_encoded = OpenSSL.crypto.dump_privatekey(OpenSSL.crypto.FILETYPE_PEM,
+                                                key)
+   cert_encoded = OpenSSL.crypto.dump_certificate(OpenSSL.crypto.FILETYPE_PEM,
+                                                  cert)
+   complete_cert_encoded = key_encoded + cert_encoded
+   if not cert_pem == complete_cert_encoded:
+     logging.error("The certificate differs after being reencoded. Please"
+                   " renew the certificates cluster-wide to prevent future"
+                   " inconsistencies.")
+
+   # Format for storing on disk
+   buf = StringIO()
+   buf.write(cert_pem)
+   return buf.getvalue()
+
+
-def VerifyCertificate(data, error_fn, _verify_fn=_VerifyCertificate):
-  """Verifies cluster certificate.
++def _VerifyCertificateSoft(cert_pem, error_fn,
++                           _check_fn=utils.CheckNodeCertificate):
+  """Verifies a certificate, which must not contain a private key,
+  against the local node daemon certificate.
+
+  @type cert_pem: string
+  @param cert_pem: Certificate in PEM format (no key)
+  @type error_fn: callable
+  @param error_fn: function to call in case of an error
+
+  """
+  try:
+    OpenSSL.crypto.load_privatekey(OpenSSL.crypto.FILETYPE_PEM, cert_pem)
+  except OpenSSL.crypto.Error:
+    pass
+  else:
+    raise error_fn("No private key may be given")
+
+  try:
+    cert = \
+      OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert_pem)
+  except Exception, err:
+    raise errors.X509CertError("(stdin)",
+                               "Unable to load certificate: %s" % err)
+
+  _check_fn(cert)
+
+
- def VerifyCertificate(data, error_fn, _verify_fn=_VerifyCertificate):
-   """Verifies cluster certificate.
++def VerifyCertificateSoft(data, error_fn,
++                          _verify_fn=_VerifyCertificateSoft):
++  """Verifies the cluster certificate, if one is present in the input.
+
+  @type data: dict
++  @type error_fn: callable
++  @param error_fn: function to call in case of an error
+
+  """
+  cert = data.get(constants.SSHS_NODE_DAEMON_CERTIFICATE)
+  if cert:
+    _verify_fn(cert, error_fn)
+
+
++def VerifyCertificateStrong(data, error_fn,
++                            _verify_fn=_VerifyCertificateStrong):
++  """Verifies the cluster certificate, raising an error if it is missing.
+
+   @type data: dict
+   @type error_fn: callable
+   @param error_fn: function to call in case of an error
+   @rtype: string
+   @return: Formatted key and certificate
+
+   """
+   cert = data.get(constants.NDS_NODE_DAEMON_CERTIFICATE)
+   if not cert:
+     raise error_fn("Node daemon certificate must be specified")
+
+   return _verify_fn(cert, error_fn)
+
+
 def VerifyClusterName(data, error_fn,
                       _verify_fn=ssconf.VerifyClusterName):
   """Verifies cluster name.
@@@ -110,8 -143,30 +186,37 @@@ def LoadData(raw, data_check)
   return serializer.LoadAndVerifyJson(raw, data_check)


+def GenerateRootSshKeys(error_fn, _suffix="", _homedir_fn=None):
+  """Generates root's SSH keys for this node.
+
+  """
+  ssh.InitSSHSetup(error_fn=error_fn, _homedir_fn=_homedir_fn,
+                   _suffix=_suffix)
++
++
+ def GenerateClientCertificate(
+     data, error_fn, client_cert=pathutils.NODED_CLIENT_CERT_FILE,
+     signing_cert=pathutils.NODED_CERT_FILE):
+   """Regenerates the client certificate of the node.
+
+   @type data: dict
+   @param data: the parsed JSON input data
+
+   """
+   if not os.path.exists(signing_cert):
+     raise error_fn("The signing certificate '%s' cannot be found."
+                    % signing_cert)
+
+   # TODO: This sets the serial number to the number of seconds
+   # since epoch. This is technically not a correct serial number
+   # (in the way SSL is supposed to be used), but it serves us well
+   # enough for now, as we don't have any infrastructure for keeping
+   # track of the number of signed certificates yet.
+   serial_no = int(time.time())
+
+   # The hostname of the node is provided with the input data.
+   hostname = data.get(constants.NDS_NODE_NAME)
+   if not hostname:
+     raise error_fn("No hostname found.")
+
+   utils.GenerateSignedSslCert(client_cert, serial_no, signing_cert,
+                               common_name=hostname)
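For reviewers unfamiliar with `utils.GenerateSignedSslCert`, the timestamp-as-serial scheme from the TODO above can be sketched directly with pyOpenSSL (a rough illustration, not the actual Ganeti helper; names and the five-year validity below are assumptions):

```python
import time

import OpenSSL


def generate_signed_client_cert(ca_key, ca_cert, hostname):
    """Sketch: sign a fresh client certificate with the node's server
    (CA) key, using the current Unix timestamp as serial number -- as
    the TODO above says, not a properly tracked serial, but adequate
    without bookkeeping infrastructure."""
    key = OpenSSL.crypto.PKey()
    key.generate_key(OpenSSL.crypto.TYPE_RSA, 2048)

    cert = OpenSSL.crypto.X509()
    cert.get_subject().CN = hostname          # node name from the input data
    cert.set_serial_number(int(time.time()))  # timestamp as serial no
    cert.gmtime_adj_notBefore(0)
    cert.gmtime_adj_notAfter(5 * 365 * 24 * 3600)  # assumed validity
    cert.set_issuer(ca_cert.get_subject())
    cert.set_pubkey(key)
    cert.sign(ca_key, "sha256")
    return key, cert
```

Note the collision caveat: two certificates generated within the same second would share a serial number, which the real tracking infrastructure (once it exists) would avoid.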
diff --cc lib/tools/prepare_node_join.py
index 5a491d9,4db335f..0902cf4
--- a/lib/tools/prepare_node_join.py
+++ b/lib/tools/prepare_node_join.py
@@@ -197,7 -227,7 +197,7 @@@ def Main()

     # Check if input data is correct
     common.VerifyClusterName(data, JoinError)
-     common.VerifyCertificate(data, JoinError)
-    VerifyCertificate(data)
++    common.VerifyCertificateSoft(data, JoinError)

     # Update SSH files
     UpdateSshDaemon(data, opts.dry_run)
diff --cc lib/tools/ssh_update.py
index b60f6f0,0000000..904cbd3
mode 100644,000000..100644
--- a/lib/tools/ssh_update.py
+++ b/lib/tools/ssh_update.py
@@@ -1,229 -1,0 +1,229 @@@
+#
+#
+
+# Copyright (C) 2014 Google Inc.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+# 1. Redistributions of source code must retain the above copyright notice,
+# this list of conditions and the following disclaimer.
+#
+# 2. Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
+# IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+"""Script to update a node's SSH key files.
+
+This script is used to update the node's 'authorized_keys' and
+'ganeti_pub_key' files. It will be called via SSH from the master
+node.
+
+"""
+
+import os
+import os.path
+import optparse
+import sys
+import logging
+
+from ganeti import cli
+from ganeti import constants
+from ganeti import errors
+from ganeti import utils
+from ganeti import ht
+from ganeti import ssh
+from ganeti import pathutils
+from ganeti.tools import common
+
+
+_DATA_CHECK = ht.TStrictDict(False, True, {
+  constants.SSHS_CLUSTER_NAME: ht.TNonEmptyString,
+  constants.SSHS_NODE_DAEMON_CERTIFICATE: ht.TNonEmptyString,
+  constants.SSHS_SSH_PUBLIC_KEYS:
+    ht.TItems(
+      [ht.TElemOf(constants.SSHS_ACTIONS),
+       ht.TDictOf(ht.TNonEmptyString, ht.TListOf(ht.TNonEmptyString))]),
+  constants.SSHS_SSH_AUTHORIZED_KEYS:
+    ht.TItems(
+      [ht.TElemOf(constants.SSHS_ACTIONS),
+       ht.TDictOf(ht.TNonEmptyString, ht.TListOf(ht.TNonEmptyString))]),
+  constants.SSHS_GENERATE: ht.TDictOf(ht.TNonEmptyString, ht.TString),
+  })
+
+
+class SshUpdateError(errors.GenericError):
+  """Local class for reporting errors.
+
+  """
+
+
+def ParseOptions():
+  """Parses the options passed to the program.
+
+  @return: Options and arguments
+
+  """
+  program = os.path.basename(sys.argv[0])
+
+  parser = optparse.OptionParser(
+    usage="%prog [--dry-run] [--verbose] [--debug]", prog=program)
+  parser.add_option(cli.DEBUG_OPT)
+  parser.add_option(cli.VERBOSE_OPT)
+  parser.add_option(cli.DRY_RUN_OPT)
+
+  (opts, args) = parser.parse_args()
+
+  return common.VerifyOptions(parser, opts, args)
+
+
+def UpdateAuthorizedKeys(data, dry_run, _homedir_fn=None):
+  """Updates root's C{authorized_keys} file.
+
+  @type data: dict
+  @param data: Input data
+  @type dry_run: boolean
+  @param dry_run: Whether to perform a dry run
+
+  """
+  instructions = data.get(constants.SSHS_SSH_AUTHORIZED_KEYS)
+  if not instructions:
+    logging.info("No change to the authorized_keys file requested.")
+    return
+  (action, authorized_keys) = instructions
+
+  (auth_keys_file, _) = \
+    ssh.GetAllUserFiles(constants.SSH_LOGIN_USER, mkdir=True,
+                        _homedir_fn=_homedir_fn)
+
+  key_values = []
+  for key_value in authorized_keys.values():
+    key_values += key_value
+  if action == constants.SSHS_ADD:
+    if dry_run:
+      logging.info("This is a dry run, not adding keys to %s",
+                   auth_keys_file)
+    else:
+      if not os.path.exists(auth_keys_file):
+        utils.WriteFile(auth_keys_file, mode=0600, data="")
+      ssh.AddAuthorizedKeys(auth_keys_file, key_values)
+  elif action == constants.SSHS_REMOVE:
+    if dry_run:
+      logging.info("This is a dry run, not removing keys from %s",
+                   auth_keys_file)
+    else:
+      ssh.RemoveAuthorizedKeys(auth_keys_file, key_values)
+  else:
+    raise SshUpdateError("Action '%s' not implemented for authorized keys."
+                         % action)
+
+
+def UpdatePubKeyFile(data, dry_run, key_file=pathutils.SSH_PUB_KEYS):
+  """Updates the file of public SSH keys.
+
+  @type data: dict
+  @param data: Input data
+  @type dry_run: boolean
+  @param dry_run: Whether to perform a dry run
+
+  """
+  instructions = data.get(constants.SSHS_SSH_PUBLIC_KEYS)
+  if not instructions:
+    logging.info("No instructions to modify public keys received."
+                 " Not modifying the public key file at all.")
+    return
+  (action, public_keys) = instructions
+
+  if action == constants.SSHS_OVERRIDE:
+    if dry_run:
+      logging.info("This is a dry run, not overriding %s", key_file)
+    else:
+      ssh.OverridePubKeyFile(public_keys, key_file=key_file)
+  elif action in [constants.SSHS_ADD, constants.SSHS_REPLACE_OR_ADD]:
+    if dry_run:
+      logging.info("This is a dry run, not adding or replacing a key to %s",
+                   key_file)
+    else:
+      for uuid, keys in public_keys.items():
+        if action == constants.SSHS_REPLACE_OR_ADD:
+          ssh.RemovePublicKey(uuid, key_file=key_file)
+        for key in keys:
+          ssh.AddPublicKey(uuid, key, key_file=key_file)
+  elif action == constants.SSHS_REMOVE:
+    if dry_run:
+      logging.info("This is a dry run, not removing keys from %s", key_file)
+    else:
+      for uuid in public_keys.keys():
+        ssh.RemovePublicKey(uuid, key_file=key_file)
+  elif action == constants.SSHS_CLEAR:
+    if dry_run:
+      logging.info("This is a dry run, not clearing file %s", key_file)
+    else:
+      ssh.ClearPubKeyFile(key_file=key_file)
+  else:
+    raise SshUpdateError("Action '%s' not implemented for public keys."
+                         % action)
+
+
+def GenerateRootSshKeys(data, dry_run):
+  """(Re-)generates the root SSH keys.
+
+  @type data: dict
+  @param data: Input data
+  @type dry_run: boolean
+  @param dry_run: Whether to perform a dry run
+
+  """
+  generate_info = data.get(constants.SSHS_GENERATE)
+  if generate_info:
+    suffix = generate_info[constants.SSHS_SUFFIX]
+    if dry_run:
+      logging.info("This is a dry run, not generating any files.")
+    else:
+      common.GenerateRootSshKeys(SshUpdateError, _suffix=suffix)
+
+
+def Main():
+  """Main routine.
+
+  """
+  opts = ParseOptions()
+
+  utils.SetupToolLogging(opts.debug, opts.verbose)
+
+  try:
+    data = common.LoadData(sys.stdin.read(), _DATA_CHECK)
+
+    # Check if input data is correct
+    common.VerifyClusterName(data, SshUpdateError)
-     common.VerifyCertificate(data, SshUpdateError)
++    common.VerifyCertificateSoft(data, SshUpdateError)
+
+    # Update / Generate SSH files
+    UpdateAuthorizedKeys(data, opts.dry_run)
+    UpdatePubKeyFile(data, opts.dry_run)
+    GenerateRootSshKeys(data, opts.dry_run)
+
+    logging.info("Setup finished successfully")
+  except Exception, err: # pylint: disable=W0703
+    logging.debug("Caught unhandled exception", exc_info=True)
+
+    (retcode, message) = cli.FormatError(err)
+    logging.error(message)
+
+    return retcode
+  else:
+    return constants.EXIT_SUCCESS
diff --cc lib/tools/ssl_update.py
index 0000000,88a24ee..f9c5c19
mode 000000,100644..100644
--- a/lib/tools/ssl_update.py
+++ b/lib/tools/ssl_update.py
@@@ -1,0 -1,148 +1,148 @@@
+ #
+ #
+
+ # Copyright (C) 2015 Google Inc.
+ # All rights reserved.
+ #
+ # Redistribution and use in source and binary forms, with or without
+ # modification, are permitted provided that the following conditions are
+ # met:
+ #
+ # 1. Redistributions of source code must retain the above copyright notice,
+ # this list of conditions and the following disclaimer.
+ #
+ # 2. Redistributions in binary form must reproduce the above copyright
+ # notice, this list of conditions and the following disclaimer in the
+ # documentation and/or other materials provided with the distribution.
+ #
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
+ # IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ # TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ # PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
+ # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ # PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ # LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+ # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ """Script to recreate and sign the client SSL certificates.
+
+ """
+
+ import os
+ import os.path
+ import optparse
+ import sys
+ import logging
+
+ from ganeti import cli
+ from ganeti import constants
+ from ganeti import errors
+ from ganeti import utils
+ from ganeti import ht
+ from ganeti import pathutils
+ from ganeti.tools import common
+
+
+ _DATA_CHECK = ht.TStrictDict(False, True, {
+   constants.NDS_CLUSTER_NAME: ht.TNonEmptyString,
+   constants.NDS_NODE_DAEMON_CERTIFICATE: ht.TNonEmptyString,
+   constants.NDS_NODE_NAME: ht.TNonEmptyString,
+   constants.NDS_ACTION: ht.TNonEmptyString,
+   })
+
+
+ class SslSetupError(errors.GenericError):
+   """Local class for reporting errors.
+
+   """
+
+
+ def ParseOptions():
+   """Parses the options passed to the program.
+
+   @return: Options and arguments
+
+   """
+   parser = optparse.OptionParser(usage="%prog [--dry-run]",
+                                  prog=os.path.basename(sys.argv[0]))
+   parser.add_option(cli.DEBUG_OPT)
+   parser.add_option(cli.VERBOSE_OPT)
+   parser.add_option(cli.DRY_RUN_OPT)
+
+   (opts, args) = parser.parse_args()
+
+   return common.VerifyOptions(parser, opts, args)
+
+
+ def DeleteClientCertificate():
+   """Deletes the client certificate. This is necessary for downgrades."""
+   if os.path.exists(pathutils.NODED_CLIENT_CERT_FILE):
+     os.remove(pathutils.NODED_CLIENT_CERT_FILE)
+   else:
+     logging.debug("Trying to delete the client certificate '%s' which did not"
+                   " exist.", pathutils.NODED_CLIENT_CERT_FILE)
+
+
+ def ClearMasterCandidateSsconfList():
+   """Clear the ssconf list of master candidate certs.
+
+   This is necessary when deleting the client certificates for a downgrade,
+   because otherwise the master cannot distribute the configuration to the
+   nodes via RPC during a downgrade anymore.
+
+   """
+   ssconf_file = os.path.join(
+     pathutils.DATA_DIR,
+     "%s%s" % (constants.SSCONF_FILEPREFIX,
+               constants.SS_MASTER_CANDIDATES_CERTS))
+   if os.path.exists(ssconf_file):
+     os.remove(ssconf_file)
+   else:
+     logging.debug("Trying to delete the ssconf file '%s' which does not"
+                   " exist.", ssconf_file)
+
+
+ # pylint: disable=E1103
+ # This pylint message complains about 'data' as 'bool' not having a get
+ # member, but obviously the type is wrongly inferred.
+ def Main():
+   """Main routine.
+
+   """
+   opts = ParseOptions()
+
+   utils.SetupToolLogging(opts.debug, opts.verbose)
+
+   try:
+     data = common.LoadData(sys.stdin.read(), _DATA_CHECK)
+
+     common.VerifyClusterName(data, SslSetupError)
+
+     # Verifies whether the server certificate of the caller
+     # is the same as on this node.
-    common.VerifyCertificate(data, SslSetupError)
++    common.VerifyCertificateStrong(data, SslSetupError)
+
+     action = data.get(constants.NDS_ACTION)
+     if not action:
+       raise SslSetupError("No action specified.")
+
+     if action == constants.CRYPTO_ACTION_CREATE:
+       common.GenerateClientCertificate(data, SslSetupError)
+     elif action == constants.CRYPTO_ACTION_DELETE:
+       DeleteClientCertificate()
+       ClearMasterCandidateSsconfList()
+     else:
+       raise SslSetupError("Unsupported action: %s." % action)
+
+   except Exception, err: # pylint: disable=W0703
+     logging.debug("Caught unhandled exception", exc_info=True)
+
+     (retcode, message) = cli.FormatError(err)
+     logging.error(message)
+
+     return retcode
+   else:
+     return constants.EXIT_SUCCESS
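The `NDS_ACTION` handling in `Main()` above is an if/elif chain; the same dispatch as a table keeps future crypto actions cheap to add (a sketch only; the action strings below are illustrative, the real values live in `ganeti.constants.CRYPTO_ACTION_*`):

```python
class SslSetupError(Exception):
    """Stand-in for the tool's error class."""


def run_action(action, handlers):
    """Look up and run the handler for a crypto action, mirroring the
    if/elif dispatch in Main() above."""
    if not action:
        raise SslSetupError("No action specified.")
    try:
        handler = handlers[action]
    except KeyError:
        raise SslSetupError("Unsupported action: %s." % action)
    return handler()


# Illustrative action names and handlers.
HANDLERS = {
    "create": lambda: "created",
    "delete": lambda: "deleted",
}
```

Either form works; the table variant merely centralises the "unsupported action" error path.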
diff --cc lib/watcher/__init__.py
index 509bb18,0d51e7d..9a139ee
--- a/lib/watcher/__init__.py
+++ b/lib/watcher/__init__.py
@@@ -474,8 -477,9 +474,11 @@@ def ParseOptions()
   parser.add_option("--no-wait-children", dest="wait_children",
                     action="store_false",
                     help="Don't wait for child processes")
+  parser.add_option("--no-verify-disks", dest="no_verify_disks",
+                    default=False, action="store_true",
+                    help="Do not verify disk status")
+   parser.add_option("--rapi-ip", dest="rapi_ip",
+                     default=constants.IP4_ADDRESS_LOCALHOST,
+                     help="Use this IP to talk to RAPI.")
  # See optparse documentation for why default values are not set by options
   parser.set_defaults(wait_children=True)
   options, args = parser.parse_args()
diff --cc man/ganeti-watcher.rst
index 9a69f86,ae4cb8d..539ba2e
--- a/man/ganeti-watcher.rst
+++ b/man/ganeti-watcher.rst
@@@ -9,10 -9,8 +9,8 @@@ ganeti-watcher - Ganeti cluster watche
 Synopsis
 --------

- **ganeti-watcher** [``--debug``]
- [``--job-age=``*age*]
- [``--ignore-pause``]
- [``--no-verify-disks``]
+ **ganeti-watcher** [\--debug] [\--job-age=*age*] [\--ignore-pause]
-[\--rapi-ip=*IP*]
++[\--rapi-ip=*IP*] [\--no-verify-disks]

 DESCRIPTION
 -----------
diff --cc src/Ganeti/OpCodes.hs
index 6f860ce,b274a84..d318829
--- a/src/Ganeti/OpCodes.hs
+++ b/src/Ganeti/OpCodes.hs
@@@ -279,8 -275,8 +279,10 @@@ $(genOpCode "OpCode
   , ("OpClusterRenewCrypto",
      [t| () |],
      OpDoc.opClusterRenewCrypto,
-     [ pVerbose
+     [ pNodeSslCerts
+     , pSshKeys
++     , pVerbose
+      , pDebug
      ],
      [])
   , ("OpQuery",
diff --cc test/hs/Test/Ganeti/OpCodes.hs
index 9038ecd,167b28b..ba9e4bc
--- a/test/hs/Test/Ganeti/OpCodes.hs
+++ b/test/hs/Test/Ganeti/OpCodes.hs
@@@ -168,8 -157,8 +168,8 @@@ instance Arbitrary OpCodes.OpCode wher
       "OP_TAGS_DEL" ->
         arbitraryOpTagsDel
       "OP_CLUSTER_POST_INIT" -> pure OpCodes.OpClusterPostInit
-       "OP_CLUSTER_RENEW_CRYPTO" ->
-         OpCodes.OpClusterRenewCrypto <$> arbitrary <*> arbitrary
+       "OP_CLUSTER_RENEW_CRYPTO" -> OpCodes.OpClusterRenewCrypto <$>
-         arbitrary <*> arbitrary
++         arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
       "OP_CLUSTER_DESTROY" -> pure OpCodes.OpClusterDestroy
       "OP_CLUSTER_QUERY" -> pure OpCodes.OpClusterQuery
       "OP_CLUSTER_VERIFY" ->
diff --cc test/py/cmdlib/cluster_unittest.py
index 9d4b786,650d8e1..2d3ee5e
--- a/test/py/cmdlib/cluster_unittest.py
+++ b/test/py/cmdlib/cluster_unittest.py
@@@ -2382,27 -2349,17 +2371,18 @@@ class TestLUClusterRenewCrypto(CmdlibTe
     cluster = self.cfg.GetClusterInfo()
     self.assertEqual(num_nodes + 1, len(cluster.candidate_certs))
     nodes = self.cfg.GetAllNodesInfo()
-     for (node_uuid, _) in nodes.items():
-       expected_digest = self._GetFakeDigest(node_uuid)
-       self.assertEqual(expected_digest, cluster.candidate_certs[node_uuid])
-
-   @patchPathutils("cluster")
-   def testMasterFails(self, pathutils):
-     self._InitPathutils(pathutils)
-
-     # make sure the RPC calls are failing for all nodes
     master_uuid = self.cfg.GetMasterNode()
-     self.rpc.call_node_crypto_tokens.return_value = self.RpcResultsBuilder() \
-         .CreateFailedNodeResult(master_uuid)
-
-     op = opcodes.OpClusterRenewCrypto(node_certificates=True)
-     self.ExecOpCode(op)
-
-     self._AssertCertFiles(pathutils)
+
-     # Check if we correctly have no candidate certificates
-     cluster = self.cfg.GetClusterInfo()
-     self.assertFalse(cluster.candidate_certs)
+     for (node_uuid, _) in nodes.items():
+       if node_uuid == master_uuid:
+         # The master digest is from the actual test certificate.
+         self.assertEqual(self._client_node_cert_digest,
+                          cluster.candidate_certs[node_uuid])
+       else:
+         # The non-master nodes have the fake digest from the
+         # mock RPC.
+         expected_digest = self._GetFakeDigest(node_uuid)
+         self.assertEqual(expected_digest,
+                          cluster.candidate_certs[node_uuid])

   def _partiallyFailingRpc(self, node_uuid, _):
     if node_uuid == self._failed_node:
diff --cc test/py/ganeti.tools.prepare_node_join_unittest.py
index e07e079,82acce5..a76db15
--- a/test/py/ganeti.tools.prepare_node_join_unittest.py
+++ b/test/py/ganeti.tools.prepare_node_join_unittest.py
@@@ -48,38 -46,7 +47,8 @@@ import testutil


 _JoinError = prepare_node_join.JoinError
+_DATA_CHECK = prepare_node_join._DATA_CHECK

- class TestLoadData(unittest.TestCase):
-   def testNoJson(self):
-     self.assertRaises(errors.ParseError, common.LoadData, "", _DATA_CHECK)
-     self.assertRaises(errors.ParseError, common.LoadData, "}", _DATA_CHECK)
-
-   def testInvalidDataStructure(self):
-     raw = serializer.DumpJson({
-       "some other thing": False,
-       })
-     self.assertRaises(errors.ParseError, common.LoadData, raw, _DATA_CHECK)
-
-     raw = serializer.DumpJson([])
-     self.assertRaises(errors.ParseError, common.LoadData, raw, _DATA_CHECK)
-
-   def testEmptyDict(self):
-     raw = serializer.DumpJson({})
-     self.assertEqual(common.LoadData(raw, _DATA_CHECK), {})
-
-   def testValidData(self):
-     key_list = [[constants.SSHK_DSA, "private foo", "public bar"]]
-     data_dict = {
-       constants.SSHS_CLUSTER_NAME: "Skynet",
-       constants.SSHS_SSH_HOST_KEY: key_list,
-       constants.SSHS_SSH_ROOT_KEY: key_list,
-       constants.SSHS_SSH_AUTHORIZED_KEYS:
-         {"nodeuuid01234": ["foo"],
-          "nodeuuid56789": ["bar"]}}
-     raw = serializer.DumpJson(data_dict)
-     self.assertEqual(common.LoadData(raw, _DATA_CHECK), data_dict)
-

 class TestVerifyCertificate(testutils.GanetiTestCase):
   def setUp(self):
@@@ -91,21 -58,20 +60,21 @@@
     shutil.rmtree(self.tmpdir)

   def testNoCert(self):
-     common.VerifyCertificate({}, error_fn=prepare_node_join.JoinError,
-                              _verify_fn=NotImplemented)
-    prepare_node_join.VerifyCertificate({}, _verify_fn=NotImplemented)
++    common.VerifyCertificateSoft({}, error_fn=prepare_node_join.JoinError,
++                                 _verify_fn=NotImplemented)

   def testGivenPrivateKey(self):
     cert_filename = testutils.TestDataFilename("cert2.pem")
     cert_pem = utils.ReadFile(cert_filename)

-     self.assertRaises(_JoinError, common._VerifyCertificate,
-    self.assertRaises(_JoinError, prepare_node_join._VerifyCertificate,
-                      cert_pem, _check_fn=NotImplemented)
++    self.assertRaises(_JoinError, common._VerifyCertificateSoft,
+                      cert_pem, _JoinError, _check_fn=NotImplemented)

   def testInvalidCertificate(self):
     self.assertRaises(errors.X509CertError,
-                       common._VerifyCertificate,
-                      prepare_node_join._VerifyCertificate,
++                      common._VerifyCertificateSoft,
                       "Something that's not a certificate",
-                      _check_fn=NotImplemented)
+                      _JoinError, _check_fn=NotImplemented)

   @staticmethod
   def _Check(cert):
@@@ -114,37 -80,9 +83,10 @@@
   def testSuccessfulCheck(self):
     cert_filename = testutils.TestDataFilename("cert1.pem")
     cert_pem = utils.ReadFile(cert_filename)
-     common._VerifyCertificate(cert_pem, _JoinError,
-    prepare_node_join._VerifyCertificate(cert_pem, _check_fn=self._Check)
++    common._VerifyCertificateSoft(cert_pem, _JoinError,
+      _check_fn=self._Check)


- class TestVerifyClusterName(unittest.TestCase):
-   def setUp(self):
-     unittest.TestCase.setUp(self)
-     self.tmpdir = tempfile.mkdtemp()
-
-   def tearDown(self):
-     unittest.TestCase.tearDown(self)
-     shutil.rmtree(self.tmpdir)
-
-   def testNoName(self):
-     self.assertRaises(_JoinError, common.VerifyClusterName,
-                       {}, _JoinError, _verify_fn=NotImplemented)
-
-   @staticmethod
-   def _FailingVerify(name):
-     assert name == "cluster.example.com"
-     raise errors.GenericError()
-
-   def testFailingVerification(self):
-     data = {
-       constants.SSHS_CLUSTER_NAME: "cluster.example.com",
-       }
-
-     self.assertRaises(errors.GenericError, common.VerifyClusterName,
-                       data, _JoinError, _verify_fn=self._FailingVerify)
-
-
 class TestUpdateSshDaemon(unittest.TestCase):
   def setUp(self):
     unittest.TestCase.setUp(self)
--

Helga Velroyen
Software Engineer
[email protected]

Google Germany GmbH
Dienerstraße 12
80331 München

Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

This e-mail is confidential. If you are not the right addressee please do
not forward it, please inform the sender, and please erase this e-mail
including any attachments. Thanks.
