Yes, I used to see this on my Mac - but it had gone away for a year or
more. It seems to be back. We asked the UWash student(s) to share logs
- nothing back yet - I can share mine tomorrow if you let me know which
file(s) on which paths (relative to my installation) to send. Running
shutdown -f (twice; the first attempt hung and jps revealed that it had
not succeeded) and then starting up again resolved mine. Presumably the
logs will still have the error info though?
On 5/1/18 12:31 AM, Murtadha Hubail wrote:
This is most likely caused by missing heartbeats from the NC to the CC. Some
macOS versions had issues with reestablishing connected sockets after waking up
from sleep.
But it could also be some unexpected exception that caused the NC to shut down.
If you could share the logs with me, I can tell you for sure.
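To make the heartbeat explanation concrete, here is a minimal sketch of how a coordinator could decide that a node has gone missing. It is illustrative only and is not AsterixDB's actual code: the class name HeartbeatMonitor, the timeout_s parameter, the node ids, and the "ACTIVE" state label are all assumptions; only the "UNUSABLE" label comes from the error message in this thread.

```python
import time


class HeartbeatMonitor:
    """Hypothetical coordinator-side tracker (an assumption, not the real CC)."""

    def __init__(self, expected_nodes, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        # None means the node has not joined (sent a heartbeat) yet.
        self._last_seen = {node: None for node in expected_nodes}

    def heartbeat(self, node_id):
        """Record that node_id reported in at the current time."""
        self._last_seen[node_id] = self._clock()

    def missing_nodes(self):
        """Nodes that never joined or whose last heartbeat is too old."""
        now = self._clock()
        return sorted(
            node
            for node, seen in self._last_seen.items()
            if seen is None or now - seen > self._timeout_s
        )

    def cluster_state(self):
        # "ACTIVE" is an assumed label for the healthy case.
        return "UNUSABLE" if self.missing_nodes() else "ACTIVE"


if __name__ == "__main__":
    fake_now = [0.0]  # injectable clock so the example does not need to sleep
    monitor = HeartbeatMonitor(["nc1", "nc2"], timeout_s=10.0,
                               clock=lambda: fake_now[0])
    monitor.heartbeat("nc1")
    monitor.heartbeat("nc2")
    fake_now[0] = 12.0  # nc2 stays silent, e.g. its socket did not recover
    monitor.heartbeat("nc1")
    print(monitor.cluster_state(), monitor.missing_nodes())
```

Under these assumptions, a node whose connection is not reestablished after the machine wakes up would simply stop refreshing its timestamp, and the cluster would be reported as unusable until that node rejoins or is restarted.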
Cheers,
Murtadha
On 05/01/2018, 9:06 AM, "Michael Carey" <[email protected]> wrote:
Q: Do we maybe have a stability regression in recent versions (e.g.,
the one leading to the UW snapshot)? They have occasionally seen things
like this and I just did too. (The system had been running for a while
in the background on my Mac - e.g., for a day or so.)
Error: Cluster is in UNUSABLE state.
One or more Node Controllers have left or haven't joined yet.