Try mounting your primary storage on a compute host and writing to it. Your NFS server might not have come back up properly (either its settings or some of the relevant services).

On Sep 16, 2013 6:08 PM, "Matt Foley" <mfo...@hortonworks.com> wrote:
> Thank you Chiradeep. Log snippet now available as http://apaste.info/qBIB
> --Matt
>
> On Mon, Sep 16, 2013 at 5:19 PM, Chiradeep Vittal <
> chiradeep.vit...@citrix.com> wrote:
>
> > Attachments are stripped. Can you paste (say at http://apaste.info/)?
> >
> > From: Matt Foley <mfo...@hortonworks.com>
> > Date: Monday, September 16, 2013 4:58 PM
> >
> > We had a planned network outage this weekend, which inadvertently resulted
> > in making the NFS Shared Primary Storage (used by System VMs) unavailable
> > for a day and a half. (Guest VMs use local storage only, but System VMs
> > use shared storage only.) Cloudstack was not brought down prior to the
> > outage.
> >
> > After network came back, we gracefully brought down all services including
> > cloudstack-management, mysql, and NFS, then actually rebooted all servers
> > in the cluster and the NFS server (to make sure no stale file handles),
> > then brought up services in the appropriate order. Also checked mysql for
> > table corruption, and found none. Confirmed that the NFS volumes are
> > mountable from all hosts, and in fact Shared Primary Storage is being
> > mounted by cloudstack on hosts as usual, under /mnt/<uuid>.
> >
> > Nevertheless, when we try to bring up the cluster, we fail to start the
> > system VMs, with errors "InsufficientServerCapacityException: Unable to
> > create a deployment for VM". The cause is not really insufficient
> > capacity, as actual usage of resources is tiny; these error messages are
> > false explanations of the failure to create primary storage volume for the
> > System VMs.
> >
> > Digging into management-server.log, the core issue seems to be the ~160
> > line snippet from the log attached to this message as
> > cloudstack_debug_2013.09.16.log. The only Shared Primary Storage pool is
> > pool 201, named "cs-primary". It is mounted on all hosts as
> > /mnt/9c6fd9a3-43e5-389a-9594-faecf178b4b9, which is its uuid.
> > The log shows the management server correctly identifying a particular
> > host as being able to access pool 201, then trying to allocate a primary
> > storage volume using the template with uuid
> > f23a16e7-b628-429e-83e1-698935588465. It fails, but I cannot tell why.
> > I suspect its claim that "Template 3 has already been downloaded to
> > pool 201" is false, but I don't know how to check this (or fix if wrong).
> >
> > Any guidance for further debugging or fixing this would be GREATLY
> > appreciated.
> > Thanks,
> > --Matt
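
For the write test suggested at the top of this reply, here is a minimal sketch in Python. It only checks that a small file can be created, flushed, read back, and removed under a directory; it does not mount anything and does not touch CloudStack itself. The function name check_mount_writable and the probe file name are made up for illustration. It assumes the pool is already mounted on the host you run it on.

```python
import os
import uuid


def check_mount_writable(mount_point: str) -> bool:
    """Return True if a small file can be written, read back and removed
    under mount_point. Hypothetical helper, not part of CloudStack."""
    # Unique name so the probe never collides with real volume files.
    probe = os.path.join(mount_point, f".write-probe-{uuid.uuid4().hex}")
    payload = b"primary storage write probe\n"
    try:
        with open(probe, "wb") as fh:
            fh.write(payload)
            fh.flush()
            # Ask the OS to push the data out rather than leave it in the
            # client-side cache, so a broken export is more likely to show up.
            os.fsync(fh.fileno())
        with open(probe, "rb") as fh:
            ok = fh.read() == payload
    except OSError as exc:
        print(f"write probe failed on {mount_point}: {exc}")
        return False
    finally:
        # Best-effort cleanup; ignore the error if the file was never created.
        try:
            os.remove(probe)
        except OSError:
            pass
    return ok
```

On a compute host this could be called as check_mount_writable("/mnt/9c6fd9a3-43e5-389a-9594-faecf178b4b9"), using the mount path quoted above. A False result, or an error such as a read-only file system or permission denied, would point at the NFS export rather than at CloudStack.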