Hi all,

I stumbled upon this Stack Overflow question <https://stackoverflow.com/questions/44456801/slurmctld-fatal-cluster-name-mismatch>, in which the user gets the error below. Please note that I'm not asking for help with the error; I'm pointing out an issue with the error message itself:
> slurmctld -cD
> slurmctld: fatal: CLUSTER NAME MISMATCH.
> slurmctld has been started with "ClusterName=�����", but read "cluster" from the state files in StateSaveLocation.
> Running multiple clusters from a shared StateSaveLocation WILL CAUSE CORRUPTION.

I checked the source, and it seems to me the two values are switched, which adds to the confusion.

Relevant extract from the source code <https://github.com/SchedMD/slurm/blob/slurm-17-02-4-1/src/slurmctld/controller.c#L2634-L2643>:

    fatal("CLUSTER NAME MISMATCH.\n"
          "slurmctld has been started with \""
          "ClusterName=%s\", but read \"%s\" from "
          "the state files in StateSaveLocation.\n"
          "Running multiple clusters from a shared "
          "StateSaveLocation WILL CAUSE CORRUPTION.\n"
          "Remove %s to override this safety check if "
          "this is intentional (e.g., the ClusterName "
          "has changed).", name,
          slurmctld_conf.cluster_name, filename);

Here, "name" is the value read from the clustername file. So I think the parameters are switched, and the message should say:

    slurmctld has been started with "ClusterName=cluster", but read "�����" from the state files in StateSaveLocation.

So I'm asking here before I file a useless bug: have I missed something? (It would not be the first time I misread code.)

Thanks in advance,
Hugues
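
P.S. To make the suspected swap concrete, here is a minimal standalone sketch. It is not Slurm code: the helper format_mismatch and its parameter names are hypothetical, and it reproduces only the first sentence of the message. It just shows that whatever is passed as the first argument ends up after "ClusterName=".

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of Slurm): formats the first sentence
 * of the mismatch message. The first %s is labelled as the configured
 * ClusterName, the second as the value read from the state files.
 */
static int format_mismatch(char *buf, size_t len,
                           const char *first_arg,
                           const char *second_arg)
{
    return snprintf(buf, len,
                    "slurmctld has been started with \""
                    "ClusterName=%s\", but read \"%s\" from "
                    "the state files in StateSaveLocation.",
                    first_arg, second_arg);
}
```

If I read the extract correctly, the current call corresponds to format_mismatch(buf, sizeof(buf), name, slurmctld_conf.cluster_name), which puts the value read from the state file after "ClusterName=". Passing slurmctld_conf.cluster_name first and name second would give the wording I expected.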
