Re: InsufficientServerCapacityException - Unsure Why

Elliot Berg Wed, 16 Jul 2014 00:10:46 -0700

I've already had to flatten and start again so I'd rather avoid it - but my suspicion is that all of this is related to the kvm host's networking somehow. I followed the instructions on the cloudstack install guide, and ended up with the below - does it look right to you guys?


auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto cloudbr0
iface cloudbr0 inet static
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1
        address 10.4.0.2
        netmask 255.0.0.0
        network 10.0.0.0
        broadcast 10.255.255.255
        gateway 10.0.0.1

# dns-* options are implemented by the resolvconf package, if installed

        dns-nameservers 10.0.0.12
        dns-search avco

auto cloudbr1
iface cloudbr1 inet manual
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1
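
One thing that may be worth checking in a file like the one above is whether the same physical port is listed under more than one bridge, since a Linux interface can be a member of only one bridge at a time. Below is a minimal Python sketch, assuming a trimmed copy of the file. The helper names (parse_stanzas, shared_bridge_ports, address_problems) are made up for illustration and are not part of CloudStack or ifupdown.

```python
import ipaddress
from collections import defaultdict

# Trimmed copy of the interfaces file quoted above (assumption: only the
# options relevant to the checks are kept).
INTERFACES = """\
auto cloudbr0
iface cloudbr0 inet static
        bridge_ports eth0
        address 10.4.0.2
        netmask 255.0.0.0
        network 10.0.0.0
        broadcast 10.255.255.255
        gateway 10.0.0.1

auto cloudbr1
iface cloudbr1 inet manual
        bridge_ports eth0
"""


def parse_stanzas(text):
    """Return {iface_name: {option: value}} for each 'iface' stanza."""
    stanzas, current = {}, None
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "iface":
            current = stanzas.setdefault(words[1], {})
        elif words[0] == "auto":
            current = None
        elif current is not None:
            current[words[0]] = " ".join(words[1:])
    return stanzas


def shared_bridge_ports(stanzas):
    """Return {port: [bridges]} for ports listed under more than one bridge."""
    users = defaultdict(list)
    for name, opts in stanzas.items():
        for port in opts.get("bridge_ports", "").split():
            users[port].append(name)
    return {port: names for port, names in users.items() if len(names) > 1}


def address_problems(opts):
    """List options that disagree with what address + netmask imply."""
    net = ipaddress.ip_interface(f"{opts['address']}/{opts['netmask']}").network
    problems = []
    if "network" in opts and ipaddress.ip_address(opts["network"]) != net.network_address:
        problems.append("network")
    if "broadcast" in opts and ipaddress.ip_address(opts["broadcast"]) != net.broadcast_address:
        problems.append("broadcast")
    if "gateway" in opts and ipaddress.ip_address(opts["gateway"]) not in net:
        problems.append("gateway")
    return problems


stanzas = parse_stanzas(INTERFACES)
print("ports shared between bridges:", shared_bridge_ports(stanzas))
print("cloudbr0 address problems:", address_problems(stanzas["cloudbr0"]))
```

For this file the address, netmask, network, broadcast and gateway values agree with each other, while eth0 shows up under both cloudbr0 and cloudbr1.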

Many Thanks,

Elliot

Elliot Berg wrote:

Hi,

Cloud.log contains the following just after the machine's rebooted;

Mon Jul 14 16:01:06 UTC 2014 checking that eth0 has IP
Mon Jul 14 16:01:07 UTC 2014 waiting for eth0 interface setup with iptimer=0
Mon Jul 14 16:01:08 UTC 2014 waiting for eth0 interface setup with iptimer=1
Mon Jul 14 16:01:09 UTC 2014 waiting for eth0 interface setup with iptimer=2
Mon Jul 14 16:01:10 UTC 2014 waiting for eth0 interface setup with iptimer=3
Mon Jul 14 16:01:11 UTC 2014 waiting for eth0 interface setup with iptimer=4
Mon Jul 14 16:01:12 UTC 2014 waiting for eth0 interface setup with iptimer=5
Mon Jul 14 16:01:13 UTC 2014 waiting for eth0 interface setup with iptimer=6
Mon Jul 14 16:01:14 UTC 2014 waiting for eth0 interface setup with iptimer=7
Mon Jul 14 16:01:15 UTC 2014 waiting for eth0 interface setup with iptimer=8
Mon Jul 14 16:01:16 UTC 2014 waiting for eth0 interface setup with iptimer=9
Mon Jul 14 16:01:17 UTC 2014 waiting for eth0 interface setup with iptimer=10
Mon Jul 14 16:01:18 UTC 2014 waiting for eth0 interface setup with iptimer=11
Mon Jul 14 16:01:19 UTC 2014 waiting for eth0 interface setup with iptimer=12
Mon Jul 14 16:01:20 UTC 2014 waiting for eth0 interface setup with iptimer=13
Mon Jul 14 16:01:21 UTC 2014 waiting for eth0 interface setup with iptimer=14
Mon Jul 14 16:01:22 UTC 2014 waiting for eth0 interface setup with iptimer=15
Mon Jul 14 16:01:23 UTC 2014 waiting for eth0 interface setup with iptimer=16
Mon Jul 14 16:01:23 UTC 2014 interface eth0 is not set up with ip...exiting

As I say, I'm wondering whether this indicates a more general networking issue on the host, as I'd have expected the virtual router to sort its own networking assuming the host's is fine?
Thanks,

Elliot

Jayapal Reddy Uradi wrote:
Hi,
Check the logs while the router is booting. Also check /var/log/cloud.log
Thanks,
Jayapal
On 14-Jul-2014, at 2:39 PM, Elliot Berg <elliot.b...@avcosystems.com> wrote:
Hi,
I did that earlier as part of the troubleshooting when it was stuck - so I've just looked at the logs instead of recreating it again as that was only just done. When you say the router logs, do you mean general logs on the virtual router machine? If so, syslog/messages/kern.log/daemon.log are all empty?
Elliot

Jayapal Reddy Uradi wrote:
Hi Elliot,
Try recreating the router (destroy the router and deploy a new VM; the router gets recreated). After recreation, if the problem still exists, check the router logs to see why the interfaces are not brought up.
Thanks,
jayapal
On 11-Jul-2014, at 1:38 PM, Elliot Berg <elliot.b...@avcosystems.com> wrote:
So, I'm wondering whether the guest not having the interfaces configured correctly (i.e. not having an IP) is just a symptom of more generally broken networking - my interfaces file for the KVM host is below, does anyone spot any issues?
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto cloudbr0
iface cloudbr0 inet static
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1
        address 10.4.0.2
        netmask 255.0.0.0
        network 10.0.0.0
        broadcast 10.255.255.255
        gateway 10.0.0.1
# dns-* options are implemented by the resolvconf package, if installed
        dns-nameservers 10.0.0.12
        dns-search avco

auto cloudbr1
iface cloudbr1 inet manual
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1

Thanks,

Elliot

Elliot Berg wrote:
Doh! I did, but forgot about it being on a funny port. Now that I'm into the VM I can see that it's not running, and fails to start when it tries to bind to the address that it should have on the guest range. I notice that "ifconfig -a" shows two NICs, only one of which is up (the one with the link local IP). I'm guessing that indicates a more general networking issue?
I think how it's laid out is 10.4.0.0-255 for physical machines (1 is the management server, 2 is the first host), 10.4.1.0-255 is the management network and 10.4.2.0-255 is the guest network...but it's possible I've misunderstood the networking config during setup? What I really wanted was hosts on 10.4.0.0-255 and guests on 10.4.1.0-255 (and beyond), as in the future I'd like it to co-exist with our existing infrastructure while we migrate things - but I kept being told about conflicts etc when I tried to set up cloudstack like that during the initial set up process?
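
As a quick check on the address plan above: with the 255.0.0.0 netmask from the interfaces file, the host's own network works out to 10.0.0.0/8, which already contains all three ranges. That may or may not be what the setup process was objecting to. A small Python sketch, assuming each "x.y.z.0-255" range is treated as a /24 (the labels are just names picked for this example):

```python
import ipaddress
from itertools import combinations

# Host network implied by the address and netmask in the interfaces file.
host_net = ipaddress.ip_interface("10.4.0.2/255.0.0.0").network

# Assumption: each "x.y.z.0-255" range in the message is read as a /24.
planned = {
    "physical machines": ipaddress.ip_network("10.4.0.0/24"),
    "management": ipaddress.ip_network("10.4.1.0/24"),
    "guest": ipaddress.ip_network("10.4.2.0/24"),
}

# Which planned ranges fall inside the network the host is configured for?
inside_host_net = {label: net.overlaps(host_net) for label, net in planned.items()}

# Do any of the planned ranges overlap each other?
mutual_overlaps = [
    (a, b)
    for (a, net_a), (b, net_b) in combinations(planned.items(), 2)
    if net_a.overlaps(net_b)
]

print("host network:", host_net)
for label, hit in inside_host_net.items():
    print(f"{label}: overlaps host network = {hit}")
print("ranges overlapping each other:", mutual_overlaps)
```

The three /24 ranges do not overlap one another, but every one of them sits inside the single /8 that cloudbr0 is configured for.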
Thanks,

Elliot

Jayapal Reddy Uradi wrote:
Hi Elliot,

Did you ssh to the VR using the ssh key?
Ex: ssh -i /root/.ssh/id_rsa.cloud -p 3922 root@169.254.3.196

If ssh fails, then there is an issue with the ssh keys.

Thanks,
Jayapal
On 09-Jul-2014, at 4:43 PM, Harikrishna Patnala <harikrishna.patn...@citrix.com> wrote:
1) Log into your KVM host.
2) Use command “virsh list”. This gives the list of VMs on the host.
3) Use command “virsh console <VirtualRouterId>” to log into the VR.
-Harikrishna
On 09-Jul-2014, at 3:52 pm, Elliot Berg <elliot.b...@avcosystems.com> wrote:
I don't know - I can't seem to ssh to the link local IP. It pings, but ssh times out. If I try and use the "connect to console" button in the gui, that too times out :(
Elliot

Harikrishna Patnala wrote:
From the logs
2014-07-08 12:08:56,218 DEBUG [agent.transport.Request] (AgentManager-Handler-1:null) Seq 1-277348416: Processing: {Ans: , MgmtId: 159320647860937, via: 1, Ver: v1, Flags: 110, [{"com.cloud.agent.api.Answer":{"result":false,"details":"grep: /var/lib/misc/dnsmasq.leases: No such file or directory","wait":0}}] }
Can you check whether the dnsmasq service is running in the Virtual Router? If not, start the service and check for “/var/lib/misc/dnsmasq.leases”
-Harikrishna
On 08-Jul-2014, at 3:47 pm, Elliot Berg <elliot.b...@avcosystems.com> wrote:
Hi,
I've done that, and now there's a new virtual router which says it's running, however a deployment still fails. My latest lot of logs are available at https://dl.dropboxusercontent.com/u/47728104/management-server.log.gz, and there's now one thing in the op_it_work table with a step != 'Done', which is a ConsoleProxy.
Interestingly if I look at the console proxy vm in the cloudstack management gui it says it's running, though.
Thanks,

Elliot

Harikrishna Patnala wrote:
Yes, mark the VR as stopped, destroy the VR, mark the VR entry in op_it_work as “Done” and try deploying the VM.
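
To illustrate the "mark it Done rather than delete it" idea, here is a rough Python sketch against an in-memory SQLite stand-in. It is not the real CloudStack schema: only the table name op_it_work, the step column and the 'Done' value come from this thread, while the vm_type and id columns, the sample rows and the helper names are assumptions made up for the example. The real table lives in the management server's database and should be backed up before any manual change.

```python
import sqlite3

# In-memory stand-in for the real table. Column names other than "step"
# are hypothetical and chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE op_it_work (id TEXT PRIMARY KEY, vm_type TEXT, step TEXT)")
conn.executemany(
    "INSERT INTO op_it_work VALUES (?, ?, ?)",
    [
        ("w1", "ConsoleProxy", "Starting"),
        ("w2", "DomainRouter", "Prepare"),
        ("w3", "User", "Done"),
    ],
)


def pending(db):
    """Return (vm_type, step) for every work item that is not finished."""
    query = "SELECT vm_type, step FROM op_it_work WHERE step != 'Done' ORDER BY id"
    return db.execute(query).fetchall()


def mark_done(db, vm_type):
    """Set step to 'Done' for unfinished items of one VM type; return rows changed."""
    cur = db.execute(
        "UPDATE op_it_work SET step = 'Done' WHERE vm_type = ? AND step != 'Done'",
        (vm_type,),
    )
    db.commit()
    return cur.rowcount


before = pending(conn)
changed = mark_done(conn, "DomainRouter")
after = pending(conn)
print("pending before:", before)
print("rows updated:", changed)
print("pending after:", after)
```

Updating the step keeps the row around for later inspection, which is the main reason to prefer it over deleting in a sketch like this.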
-Harikrishna
On 08-Jul-2014, at 12:44 pm, Elliot Berg <elliot.b...@avcosystems.com> wrote:
Hi,
It appears to be stuck in the "starting" state - so I don't get the option to reboot it or anything. If I change the state to stopped in the database directly will the management server attempt to start it again or do I need to do something more?
Thanks!

Elliot

Harikrishna Patnala wrote:
Is your Virtual Router up and running? If it is in the running state you can mark it Done and deploy a VM. If it is in the stopped state, try restarting it. You can try updating the field as well.
-Harikrishna
On 07-Jul-2014, at 7:10 pm, Elliot Berg <elliot.b...@avcosystems.com> wrote:
I can see two entries that have the "step" field set to something other than "Done", one of them is
ConsoleProxy | Starting

and the other is

DomainRouter | Prepare
Am I safe to just delete the rows, or should I just update the field?
Thanks,

Elliot

Harikrishna Patnala wrote:
Do you see any work item pending for Virtual Router r-4-VM in the “op_it_work” table? If there are any, remove those entries and try VM deployment again.
I see in the logs that the VR has a task pending
2014-07-07 10:28:15,934 WARN [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-5:job-48 = [ 22369802-b5aa-4b5a-a26d-1fab11241551 ]) The task item for vm VM[DomainRouter|r-4-VM] has been inactive for 418531
-Harikrishna
On 07-Jul-2014, at 2:18 pm, Elliot Berg <elliot.b...@avcosystems.com> wrote:
I'm still not really spotting anything indicating why it's not using the host, but I suspect that's just because I don't really know what I'm looking for - so I've zipped the whole log for today and stuffed it on dropbox at https://dl.dropboxusercontent.com/u/47728104/management-server.log.gz.
Hopefully someone who's used cloudstack a lot more will have more success!
Thanks,

Elliot


Elliot Berg wrote:
I'm going back over everything and I've noticed something else - everywhere I've looked for how to use local storage says I should change two global settings;
  *   system.vm.use.local.storage = true
  *   use.local.storage = true
However I'm looking at my global settings and only the first exists (which I have set to true).
Elliot

Elliot Berg wrote:
Ah, so when looking back a bit further before (I was kind of only looking for exceptions higher up before now), I've just spotted this...
2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ] FirstFitRoutingAllocator) Looking for speed=1000Mhz, Ram=1024
2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ] FirstFitRoutingAllocator) Host name: cloudstack-host1, hostId: 1 is in avoid set, skipping this and trying other available hosts
That's the one and only host - so I'm guessing that has something to do with it!
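
For anyone wanting to pull these allocator lines out of a large management-server log rather than scrolling for them, a minimal Python sketch follows. It assumes the message wording matches the two lines quoted above; the function name hosts_in_avoid_set and the SAMPLE text are just for illustration.

```python
import re

# Pattern based on the wording of the allocator message quoted in this thread.
AVOID_RE = re.compile(
    r"Host name: (?P<name>[^,\s]+), hostId: (?P<id>\d+) is in avoid set"
)

# Sample input: the two log lines from the message above.
SAMPLE = """\
2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ] FirstFitRoutingAllocator) Looking for speed=1000Mhz, Ram=1024
2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ] FirstFitRoutingAllocator) Host name: cloudstack-host1, hostId: 1 is in avoid set, skipping this and trying other available hosts
"""


def hosts_in_avoid_set(lines):
    """Return (host_name, host_id) for each line reporting a skipped host."""
    found = []
    for line in lines:
        match = AVOID_RE.search(line)
        if match:
            found.append((match.group("name"), int(match.group("id"))))
    return found


skipped = hosts_in_avoid_set(SAMPLE.splitlines())
print(skipped)
```

The same function accepts an open file object, so it can be pointed at the real log; the lines logged just before each hit are usually the ones that say why the host was put in the avoid set.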
Elliot
--
Elliot Berg  |  Analyst Programmer/Network Team
Email: elliot.b...@avcosystems.com | Tel: 01753 213700 | Web: www.avcosystems.com
Avco Systems Ltd, Registered in England & Wales, Registration Number 1976620
Registered Office: Avco Systems | 17 Bath Road | Slough | SL1 3UF

ilya musayev wrote:
Elliot,
When you see such an error, there is usually a predecessor message that says CloudStack checked for X, Y and Z and found no suitable resources based on your configuration.
Put the logs on pastebin or some other site (strip out any private info you don't want to share). I would also recommend cloudstack 4.3.1 (which is not officially out yet) but should come through in the next several weeks. It's the latest stable release of CloudStack 4.3.0 - with the latest bug fixes.
I've put up a build for folks who want to try it out until we complete the official ACS 4.3.1 release process.
Unzip the tgz and it should have the required RPMs with both Open Source and Non-Open Source modules.
http://www.cloudsand.com/cloudstack-4.3.0-1.tgz

Regards
ilya


On 7/2/14, 1:06 AM, Elliot Berg wrote:
Hi,
I've been putting together a cloudstack set-up for experimentation purposes - right now we're just trying to compare different platforms for private cloud infrastructure before we start getting too in depth with any of them.
I've added the cloudstack 4.2 apt repository, and I'm running on Ubuntu 12.04 LTS, and I believe I've followed all the installation guides correctly at the various stages.
We've set up a management server, which is also an NFS server, however we're interested in using local storage for the majority of things, and have also set up a single KVM host which I believe is all configured correctly to use local storage. If I look at the dashboard, I'm told I have more than enough resource in every section to create an instance the size I want to - which is a small offering I've created with just 1.0GHz and 1GB of RAM, with local storage. The host's not very powerful, but according to the dashboard I am using 1.50GHz/5.87GHz, 1.38GB/7.80GB, 3.55GB/285.95GB Secondary Storage, 1.03GB/450.99GB Local Storage and 0.00KB/571.90GB Primary Storage (I'm assuming that's meant to be a combination of the NFS server's primary storage offering and the local storage on the host, though the numbers don't quite make sense at first glance).
However, when I try to add an instance, I receive an InsufficientServerCapacityException and I'm struggling to work out why. I can't add an instance using a small shared storage offering either, but if I'm not mistaken that's expected because the zone and host are configured to use local storage. The only thing I can think of is that the local storage isn't properly configured, but when I've looked it seems to be.
Any pointers for how I can further diagnose this would be great - thanks in advance!
Elliot
