We had a system crash that forced all nodes to shut down.
Since restarting, we have been having problems with the GlusterFS storage
(this is with OpenShift Origin 3.7.2).
The gluster nodes appear to have restarted fine, and AFAICT the volumes
and bricks are all OK.
However, the heketi pod is failing to restart because its db is in an
inconsistent state.
Following the instructions here [1], we tried to remove the pending
operations, but heketi still fails to start with errors like this:
[heketi] INFO 2018/08/22 13:24:50 Loaded kubernetes executor
[heketi] ERROR 2018/08/22 13:24:50 /src/github.com/heketi/heketi/apps/glusterfs/dbcommon.go:109: Failed to upgrade db for brick entries: Id not found
[heketi] ERROR 2018/08/22 13:24:50 /src/github.com/heketi/heketi/apps/glusterfs/app.go:125: Unable to Upgrade Changes
[heketi] ERROR 2018/08/22 13:24:50 /src/github.com/heketi/heketi/apps/glusterfs/app.go:133: Id not found
ERROR: Unable to start application
Is it possible to work around this somehow by restarting the heketi pod
and getting it to pick up its information afresh from the gluster nodes?
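
In case it helps narrow down which entry triggers "Id not found", here is a minimal sketch of how the dangling references might be looked for. It assumes the db has been dumped to JSON (for example with `heketi db export`) and that the dump has top-level `brickentries`, `volumeentries`, `deviceentries` and `nodeentries` maps keyed by id, with each brick carrying an `Info` object that names its volume, device and node. Those key names are my assumption about the dump layout and may need adjusting; the function names are made up for illustration.

```python
import json


def find_dangling_bricks(db):
    """Return (brick_id, field, missing_id) for brick references with no matching entry."""
    # Assumed top-level key names of the JSON dump; adjust them to the real file.
    known = {
        "volume": set(db.get("volumeentries", {})),
        "device": set(db.get("deviceentries", {})),
        "node": set(db.get("nodeentries", {})),
    }
    dangling = []
    for brick_id, brick in sorted(db.get("brickentries", {}).items()):
        info = brick.get("Info", {})  # assumed per-brick layout
        for field, ids in known.items():
            ref = info.get(field)
            # A reference is dangling when it is set but no entry has that id.
            if ref and ref not in ids:
                dangling.append((brick_id, field, ref))
    return dangling


def report(path):
    """Print one line per dangling reference found in the JSON dump at `path`."""
    with open(path) as fh:
        db = json.load(fh)
    for brick_id, field, ref in find_dangling_bricks(db):
        print(f"brick {brick_id}: {field} {ref} not found")
```

Calling `report("heketi-db.json")` on a dump (the file name is only a placeholder) would list any brick whose volume, device or node id has no matching entry. It only reads the dump and changes nothing.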
[1] https://github.com/heketi/heketi/blob/263fbb72055d71b3763a77c051e7a00cf0c4e436/docs/troubleshooting.md