(anonymous) wrote:

>> with gridengine-master 6.2u5-7.3 (Ubuntu Trusty), our
>> /var/lib/gridengine/spool/qmaster/messages gets constantly
>> filled with:
>> | 12/07/2016 04:11:43|worker|tools-grid-master|E|got load report of unknown
>> exec host "tools-exec-1204.eqiad.wmflabs"
>>
>> (tools-exec-1204.eqiad.wmflabs is a host that no longer
>> exists.)
>>
>> How can I convince the grid master to "move on",
>> i. e. "accept" that it did receive a load report from an
>> unknown host, or "delete" the load report from its inbox?

> Do you have any custom load sensors defined, either on a
> global or local level per exechost? The machine in question
> was completely removed and shut down?

I don't think we have any custom load sensors defined, but
your latter question caused me to reconsider the facts: The
host was shut down, removed from DNS, and its entry removed
from /var/lib/gridengine/default/common/host_aliases, /but/
the grid master had not been restarted afterwards, i. e. it
was still working with the old host_aliases that had an
entry for that host.

After "service gridengine-master restart", the error no
longer shows up in
/var/lib/gridengine/spool/qmaster/messages. So I assume the
outdated host_aliases confused the grid master.

Thanks,
Tim
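
For anyone retracing this, here is a minimal sketch of the check that preceded the restart. It assumes host_aliases holds whitespace-separated host names on each line; the helper name host_listed is hypothetical and not part of gridengine. The paths, host name and service name are the ones quoted in this thread and may differ on other installations.

```shell
#!/bin/sh
# Sketch only: paths and the service name are taken from this thread;
# host_listed is a hypothetical helper, not a gridengine command.

ALIASES=/var/lib/gridengine/default/common/host_aliases

# Return 0 if HOST appears as a whole whitespace-separated field
# anywhere in FILE, 1 otherwise.
host_listed() {
    file=$1
    host=$2
    awk -v h="$host" '
        { for (i = 1; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
    ' "$file"
}

# Example use (commented out so that sourcing this file changes nothing):
#
# if host_listed "$ALIASES" tools-exec-1204.eqiad.wmflabs; then
#     echo "host still listed in $ALIASES; remove the entry first"
# else
#     service gridengine-master restart
# fi
```

The comparison is done per field rather than with a plain substring match, so a removed host such as tools-exec-1204 is not confused with a longer name that merely contains it.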