GitHub user Jayd603 added a comment to the discussion: Agent can't start even if only one KVM VM was using network attached primary storage and that storage becomes unavailable.
> @Jayd603, @DaanHoogland, the behavior is the same with 4.22 (tested with NFS and StorPool). I've seen this with Ceph as primary storage on older versions. I think we need to have a discussion on the best way to resolve this issue. Simply fixing the connection with the agent won't be enough; if the storage problem itself isn't addressed, the user won't notice the connectivity issue between the host and the storage (unless they look through the logs). Most likely there will be a missing entry in the storage_pool_ref table in the database, which will cause problems, for example when trying to start a virtual machine that uses a volume on the affected storage.
>
> One possible solution could be to add an extra status to the storage that clearly indicates its connection to the host. For me, this might be the most effective way to make the issue visible and prevent confusion. However, implementing this proposal would require significant changes to core functionality, so it needs to be carefully evaluated.

I think the first step should be to allow unaffected instances to start, even if logging/UI is broken for the instances on the shared primary NAS storage. I put in a request to allow quicker changing of the primary storage IP/endpoint in the UI. I'm wondering whether simply pointing it at any functional primary storage in place of the failed one would at least allow agents to start, even if that replacement storage has essentially nothing on it. I would like to test that.

GitHub link: https://github.com/apache/cloudstack/discussions/12168#discussioncomment-15136537
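For readers who want to check the symptom described above, here is a minimal SQL sketch against the management database. It assumes a stock CloudStack `cloud` schema in which the host-to-pool mapping lives in `storage_pool_host_ref` (the comment refers to it as `storage_pool_ref`); the pool id `42`, the replacement NFS address, and the export path are placeholders. Verify every table and column name against your schema version, and treat the second statement as illustration only, to be attempted with the pool in maintenance and the management server stopped:

```sql
-- 1) Hosts with no mapping row for the affected pool; a missing row is the
--    kind of gap the quoted comment describes for the pool-to-host mapping.
SELECT h.id, h.name, h.status
FROM host h
LEFT JOIN storage_pool_host_ref ref
       ON ref.host_id = h.id
      AND ref.pool_id = 42            -- placeholder: id of the affected pool
WHERE h.type = 'Routing'
  AND h.removed IS NULL
  AND ref.id IS NULL;                 -- no row => host never attached the pool

-- 2) Repointing the pool at a different endpoint, along the lines of the
--    UI request mentioned above. Placeholders throughout; not a procedure.
UPDATE storage_pool
   SET host_address = '10.0.0.99',        -- placeholder replacement NFS server
       path         = '/export/primary'   -- placeholder export path
 WHERE id = 42;
```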
