Re: InsufficientServerCapacityException - Unsure Why

Elliot Berg Wed, 16 Jul 2014 06:48:35 -0700

That worked - it was set incorrectly in the nics table. The secondary storage VM and Console Proxy VM are suffering the same problem, so I've fixed them - but I'm guessing this means it's been entered incorrectly on the main network config. I didn't spot it in the networks table, do you know where that would be stored?


Thanks a bunch, I think this is the root cause of most of my issues!


Elliot

Jayapal Reddy Uradi wrote:

Hi Elliot,

Reboot the router and check the management server log for the router's startcommand.
These values are passed in the startcommand.

If the netmask there has a "." at the end, check the nics table in the database
for the entry with the guest IP.
If the nics table entry has the trailing ".", correct it, then restart the MS and the VR.

Thanks,
Jayapal

On 16-Jul-2014, at 3:39 PM, Elliot Berg <[email protected]> wrote:

Hi,

I've got

template=domP name=r-27-VM eth0ip=10.4.2.6 eth0mask=255.0.0.0. gateway=10.0.0.1 
domain=cs1cloud.internal dhcprange=10.0.0.1 eth1ip=169.254.1.246 
eth1mask=255.255.0.0 type=dhcpsrvr disable_rp_filter=true dns1=10.0.0.12 dns2= 
ip6dns1= ip6dns2=

In that file, which includes the incorrect netmask.

Elliot
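
For reference, the trailing "." can be spotted mechanically. The following is a minimal Python sketch, not part of the router's own tooling: it splits the cmdline string quoted above into key=value pairs and flags any *mask value that Python's ipaddress module rejects. The helper names (parse_cmdline, is_valid_netmask, bad_masks) are hypothetical and exist only for this illustration.

```python
import ipaddress

# The cmdline quoted in the message above, joined back onto one line.
CMDLINE = (
    "template=domP name=r-27-VM eth0ip=10.4.2.6 eth0mask=255.0.0.0. "
    "gateway=10.0.0.1 domain=cs1cloud.internal dhcprange=10.0.0.1 "
    "eth1ip=169.254.1.246 eth1mask=255.255.0.0 type=dhcpsrvr "
    "disable_rp_filter=true dns1=10.0.0.12 dns2= ip6dns1= ip6dns2="
)


def parse_cmdline(text):
    """Split a whitespace-separated key=value string into a dict (hypothetical helper)."""
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no "="
            params[key] = value
    return params


def is_valid_netmask(mask):
    """True if the ipaddress module accepts the value as an IPv4 mask.

    This accepts whatever ipaddress accepts (dotted quad or prefix length),
    so it is a sanity check rather than a strict netmask validator.
    """
    try:
        ipaddress.ip_network("0.0.0.0/" + mask)
    except ValueError:
        return False
    return True


def bad_masks(params):
    """Return the names of all *mask parameters whose value is malformed."""
    return [k for k, v in params.items()
            if k.endswith("mask") and not is_valid_netmask(v)]


if __name__ == "__main__":
    print(bad_masks(parse_cmdline(CMDLINE)))
```

Run against the cmdline above, this flags eth0mask (value "255.0.0.0.") and leaves eth1mask alone.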

Jayapal Reddy Uradi wrote:

Hi,

Check /var/cache/cloud/cmdline for the eth0ip and eth0mask values (for example,
eth0ip=10.1.1.1 eth0mask=255.255.255.0).
If they are correct, then the interfaces file is being written wrongly.
/etc/network/interfaces is updated by cloud-early-config on router boot.

What you can do is put set -x in cloud-early-config and run 
/etc/init.d/cloud-early-config from the router.
Then observe setup_interface to see how /etc/network/interfaces is written.

Thanks,
Jayapal

On 16-Jul-2014, at 3:07 PM, Elliot Berg <[email protected]> wrote:

Hi,

So that fails with the error

Error: an inet prefix is expected rather than "10.4.2.6/255.0.0.0.".
Failed to bring up eth0.

I went and looked at the router's /etc/network/interfaces file and spotted that the 
netmask has a "." on the end, as below. Removing that and then running ifup 
eth0 works, however when I reboot the router that file appears to be regenerated, as my 
change was undone. Does anyone know where the information to generate that file comes 
from?

iface  eth0 inet static
  address 10.4.2.6
  netmask 255.0.0.0.

Thanks,

Elliot

Jayapal Reddy Uradi wrote:

Hi Elliot,

Can you please try 'ifup eth0' on the router?
It seems there is a delay in bringing up the eth0 interface.

Thanks,
Jayapal
On 16-Jul-2014, at 12:40 PM, Elliot Berg <[email protected]> wrote:

I've already had to flatten and start again so I'd rather avoid it - but my 
suspicion is that all of this is related to the kvm host's networking somehow. 
I followed the instructions on the cloudstack install guide, and ended up with 
the below - does it look right to you guys?


auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto cloudbr0
iface cloudbr0 inet static
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1
        address 10.4.0.2
        netmask 255.0.0.0
        network 10.0.0.0
        broadcast 10.255.255.255
        gateway 10.0.0.1
        # dns-* options are implemented by the resolvconf package, if installed
        dns-nameservers 10.0.0.12
        dns-search avco

auto cloudbr1
iface cloudbr1 inet manual
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1

Many Thanks,

Elliot

Elliot Berg wrote:

Hi,

Cloud.log contains the following just after the machine's rebooted;

Mon Jul 14 16:01:06 UTC 2014 checking that eth0 has IP
Mon Jul 14 16:01:07 UTC 2014 waiting for eth0 interface setup with ip timer=0
Mon Jul 14 16:01:08 UTC 2014 waiting for eth0 interface setup with ip timer=1
Mon Jul 14 16:01:09 UTC 2014 waiting for eth0 interface setup with ip timer=2
Mon Jul 14 16:01:10 UTC 2014 waiting for eth0 interface setup with ip timer=3
Mon Jul 14 16:01:11 UTC 2014 waiting for eth0 interface setup with ip timer=4
Mon Jul 14 16:01:12 UTC 2014 waiting for eth0 interface setup with ip timer=5
Mon Jul 14 16:01:13 UTC 2014 waiting for eth0 interface setup with ip timer=6
Mon Jul 14 16:01:14 UTC 2014 waiting for eth0 interface setup with ip timer=7
Mon Jul 14 16:01:15 UTC 2014 waiting for eth0 interface setup with ip timer=8
Mon Jul 14 16:01:16 UTC 2014 waiting for eth0 interface setup with ip timer=9
Mon Jul 14 16:01:17 UTC 2014 waiting for eth0 interface setup with ip timer=10
Mon Jul 14 16:01:18 UTC 2014 waiting for eth0 interface setup with ip timer=11
Mon Jul 14 16:01:19 UTC 2014 waiting for eth0 interface setup with ip timer=12
Mon Jul 14 16:01:20 UTC 2014 waiting for eth0 interface setup with ip timer=13
Mon Jul 14 16:01:21 UTC 2014 waiting for eth0 interface setup with ip timer=14
Mon Jul 14 16:01:22 UTC 2014 waiting for eth0 interface setup with ip timer=15
Mon Jul 14 16:01:23 UTC 2014 waiting for eth0 interface setup with ip timer=16
Mon Jul 14 16:01:23 UTC 2014 interface eth0 is not set up with ip... exiting

As I say, I'm wondering whether this indicates a more general networking issue 
on the host, as I'd have expected the virtual router to sort its own networking 
assuming the host's is fine?

Thanks,

Elliot

Jayapal Reddy Uradi wrote:

Hi,

Check the logs while the router is booting. Also check /var/log/cloud.log

Thanks,
Jayapal
On 14-Jul-2014, at 2:39 PM, Elliot Berg <[email protected]> wrote:

Hi,

I did that earlier as part of the troubleshooting when it was stuck - so I've 
just looked at the logs instead of recreating it again as that was only just 
done. When you say the router logs, do you mean general logs on the virtual 
router machine? If so, syslog/messages/kern.log/daemon.log are all empty?

Elliot

Jayapal Reddy Uradi wrote:

Hi Elliot,

Try recreating the router (destroy the router and deploy a new VM; the router
gets recreated).
After recreation, if the problem still exists, check the router logs to see why 
the interfaces are not brought up.


Thanks,
jayapal

On 11-Jul-2014, at 1:38 PM, Elliot Berg <[email protected]> wrote:

So, I'm wondering whether the guest not having the interfaces configured 
correctly (i.e. not having an IP) is just a symptom of more generally broken 
networking - my interfaces file for the KVM host is below, does anyone spot any 
issues?

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto cloudbr0
iface cloudbr0 inet static
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1
        address 10.4.0.2
        netmask 255.0.0.0
        network 10.0.0.0
        broadcast 10.255.255.255
        gateway 10.0.0.1
        # dns-* options are implemented by the resolvconf package, if installed
        dns-nameservers 10.0.0.12
        dns-search avco

auto cloudbr1
iface cloudbr1 inet manual
        bridge_ports eth0
        bridge_fd 5
        bridge_stp off
        bridge_maxwait 1

Thanks,

Elliot

Elliot Berg wrote:

Doh! I did, but forgot about it being on a funny port. Now that I'm into the VM I can see 
that it's not running, and fails to start when it tries to bind to the address that it 
should have on the guest range. I notice that "ifconfig -a" shows two NICs, 
only one of which is up (the one with the link local IP).  I'm guessing that indicates a 
more general networking issue?

I think how it's laid out is 10.4.0.0-255 for physical machines (1 is the 
management server, 2 is the first host), 10.4.1.0-255 is the management network 
and 10.4.2.0-255 is the guest network...but it's possible I've misunderstood 
the networking config during setup? What I really wanted was hosts on 
10.4.0.0-255 and guests on 10.4.1.0-255 (and beyond), as in the future I'd like 
it to co-exist with our existing infrastructure while we migrate things - but I 
kept being told about conflicts etc when I tried to set up cloudstack like that 
during the initial set up process?

Thanks,

Elliot
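
As an aside, the address layout described above can be checked against the KVM host's bridge settings quoted elsewhere in this thread (address 10.4.0.2, netmask 255.0.0.0). The Python sketch below is only an illustration under one assumption: each "10.4.x.0-255" range is treated as a /24. It shows that the three ranges do not overlap each other but all sit inside the single 10.0.0.0/8 network the host is configured with. Whether that is what triggered the conflict warnings during setup is not established in the thread.

```python
import ipaddress

# Host bridge network, derived from "address 10.4.0.2" / "netmask 255.0.0.0".
host_net = ipaddress.ip_interface("10.4.0.2/255.0.0.0").network

# Assumption: every "10.4.x.0-255" range in the message is a /24.
ranges = {
    "physical machines": ipaddress.ip_network("10.4.0.0/24"),
    "management": ipaddress.ip_network("10.4.1.0/24"),
    "guest": ipaddress.ip_network("10.4.2.0/24"),
}


def overlapping_pairs(nets):
    """Return every pair of named networks that share at least one address."""
    names = sorted(nets)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if nets[a].overlaps(nets[b])]


if __name__ == "__main__":
    for name, net in ranges.items():
        print(name, net, "inside", host_net, "->", net.subnet_of(host_net))
    print("overlapping pairs:", overlapping_pairs(ranges))
```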

Jayapal Reddy Uradi wrote:

Hi Elliot,

Did you ssh to the VR using the ssh key?
Ex: ssh -i /root/.ssh/id_rsa.cloud [email protected]

If the ssh fails, then there is an issue with the ssh keys.

Thanks,
Jayapal


On 09-Jul-2014, at 4:43 PM, Harikrishna Patnala <[email protected]> wrote:

1) Log into your KVM host.
2) Use command “virsh list”. This gives the list of VMs on the host.
3) Use command “virsh console <VirtualRouterId>” to log into the VR.


-Harikrishna

On 09-Jul-2014, at 3:52 pm, Elliot Berg <[email protected]> wrote:

I don't know - I can't seem to ssh to the link local IP. It pings, but ssh times out. If 
I try and use the "connect to console" button in the gui, that too times out :(

Elliot

Harikrishna Patnala wrote:

From the logs:

2014-07-08 12:08:56,218 DEBUG [agent.transport.Request] (AgentManager-Handler-1:null) Seq 1-277348416: Processing:  { Ans: , 
MgmtId: 159320647860937, via: 1, Ver: v1, Flags: 110, 
[{"com.cloud.agent.api.Answer":{"result":false,"details":"grep: /var/lib/misc/dnsmasq.leases: 
No such file or directory","wait":0}}] }

Can you check whether the dnsmasq service is running in the Virtual Router? If 
not, start the service and check for “/var/lib/misc/dnsmasq.leases”.

-Harikrishna

On 08-Jul-2014, at 3:47 pm, Elliot Berg <[email protected]> wrote:

Hi,

I've done that, and now there's a new virtual router which says it's running, 
however a deployment still fails. My latest lot of logs are available 
at https://dl.dropboxusercontent.com/u/47728104/management-server.log.gz, and 
there's now one thing in the op_it_work table with a step != 'Done', which is a 
ConsoleProxy.

Interestingly if I look at the console proxy vm in the cloudstack management 
gui it says it's running, though.

Thanks,

Elliot

Harikrishna Patnala wrote:

Yes, mark the VR as stopped, destroy the VR, mark the VR entry in op_it_work as 
“Done”, and try deploying a VM.

-Harikrishna

On 08-Jul-2014, at 12:44 pm, Elliot Berg <[email protected]> wrote:

Hi,

It appears to be stuck in the "starting" state - so I don't get the option to 
reboot it or anything. If I change the state to stopped in the database directly will the 
management server attempt to start it again or do I need to do something more?

Thanks!

Elliot

Harikrishna Patnala wrote:

Is your Virtual Router up and running? If it is in the running state, you can
mark it Done and deploy a VM.
If it is in the stopped state, try restarting it. You can try updating the field
as well.

-Harikrishna

On 07-Jul-2014, at 7:10 pm, Elliot Berg <[email protected]> wrote:

I can see two entries that have the "step" field set to something other than 
"Done", one of them is

ConsoleProxy | Starting

and the other is

DomainRouter | Prepare

Am I safe to just delete the rows, or should I just update the field?

Thanks,

Elliot

Harikrishna Patnala wrote:

Do you see any pending work items for Virtual Router r-4-VM in the “op_it_work” 
table?
If there are any, remove those entries and try the VM deployment again.

I see in the logs that VR has a task pending
2014-07-07 10:28:15,934 WARN  [cloud.vm.VirtualMachineManagerImpl] 
(Job-Executor-5:job-48 = [ 22369802-b5aa-4b5a-a26d-1fab11241551 ]) The task 
item for vm VM[DomainRouter|r-4-VM] has been inactive for 418531


-Harikrishna


On 07-Jul-2014, at 2:18 pm, Elliot Berg <[email protected]> wrote:

I'm still not really spotting anything indicating why it's not using the host, 
but I suspect that's just because I don't really know what I'm looking for - so 
I've zipped the whole log for today and stuffed it on dropbox 
at https://dl.dropboxusercontent.com/u/47728104/management-server.log.gz.

Hopefully someone who's used cloudstack a lot more will have more success!

Thanks,

Elliot


Elliot Berg wrote:
I'm going back over everything and I've noticed something else - everywhere 
I've looked for how to use local storage says I should change two global 
settings;


  *   system.vm.use.local.storage = true
  *   use.local.storage = true

However I'm looking at my global settings and only the first exists (which I 
have set to true).

Elliot

Elliot Berg wrote:
Ah, so when looking back a bit further before (I was kind of only looking for 
exceptions higher up before now), I've just spotted this...

2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ] FirstFitRoutingAllocator) Looking for speed=1000Mhz, Ram=1024
2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ] FirstFitRoutingAllocator) Host name: cloudstack-host1, hostId: 1 is in avoid set, skipping this and trying other available hosts

That's the one and only host - so I'm guessing that has something to do with it!

Elliot
--
Elliot Berg  |  Analyst Programmer/Network Team
Email: [email protected] | Tel: 01753 213700 | Web: www.avcosystems.com

Avco Systems Ltd, Registered in England & Wales, Registration Number 1976620
Registered Office: Avco Systems | 17 Bath Road | Slough | SL1 3UF
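
For anyone searching a large management-server.log for the same symptom, the "is in avoid set" entries can be pulled out with a short script. This is a minimal Python sketch, assuming the message format matches the FirstFitAllocator line quoted above; the function name hosts_in_avoid_set and the regular expression are hypothetical and written only against that one sample.

```python
import re

# Pattern based solely on the sample line quoted in the message above.
AVOID_RE = re.compile(
    r"Host name: (?P<name>\S+), hostId: (?P<id>\d+)\s+is in avoid set"
)


def hosts_in_avoid_set(log_text):
    """Return sorted (host name, host id) pairs reported as being in the avoid set."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return sorted({(m["name"], int(m["id"])) for m in AVOID_RE.finditer(flat)})


SAMPLE = """2014-07-03 10:48:28,765 DEBUG [allocator.impl.FirstFitAllocator]
(Job-Executor-3:job-46 = [ 92fb959d-edc5-4fe2-84a0-56001226e4ac ]
FirstFitRoutingAllocator) Host name: cloudstack-host1, hostId: 1
is in avoid set, skipping this and trying other available hosts"""

if __name__ == "__main__":
    print(hosts_in_avoid_set(SAMPLE))
```

A host appearing here only says the allocator skipped it. The reason it was added to the avoid set has to be found in the log entries that precede it.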


ilya musayev wrote:
Elliot,

When you see such an error, there is usually a preceding message that says 
CloudStack checked for X, Y and Z and found no suitable resources based on your 
configuration.

Put the logs on pastebin or some other site (strip out any private info you 
don't want to share). I would also recommend CloudStack 4.3.1, which is not 
officially out yet but should come through in the next several weeks. It is the 
stable CloudStack 4.3.0 release with the latest bug fixes.

I've put up a build for folks who want to try it out until we complete the 
official ACS 4.3.1 release process.

Unzip the tgz and it should have the required RPMs, with both Open Source and 
Non-Open Source modules.

http://www.cloudsand.com/cloudstack-4.3.0-1.tgz

Regards
ilya


On 7/2/14, 1:06 AM, Elliot Berg wrote:
Hi,

I've been putting together a cloudstack set-up for experimentation purposes - 
right now we're just trying to compare different platforms for private cloud 
infrastructure before we start getting too in depth with any of them.

I've added the cloudstack 4.2 apt repository, and I'm running on Ubuntu 12.04 
LTS, and I believe I've followed all the installation guides correctly at the 
various stages.

We've set up a management server, which is also an NFS server, however we're 
interested in using local storage for the majority of things, and have also set 
up a single KVM host which I believe is all configured correctly to use local 
storage. If I look at the dashboard, I'm told I have more than enough resource 
in every section to create an instance the size I want to - which is a small 
offering I've created with just 1.0GHz and 1GB of RAM, with local storage. The 
host's not very powerful, but according to the dashboard I am using 
1.50GHz/5.87GHz, 1.38GB/7.80GB, 3.55GB/285.95GB Secondary Storage, 
1.03GB/450.99GB Local Storage and 0.00KB/571.90GB Primary Storage (I'm assuming 
that's meant to be a combination of the NFS server's primary storage offering 
and the local storage on the host, though the numbers don't quite make sense at 
first glance).

However, when I try to add an instance, I receive an 
InsufficientServerCapacityException and I'm struggling to work out why. I can't 
add an instance using a small shared storage offering either, but if I'm not 
mistaken that's expected because the zone and host are configured to use local 
storage. The only thing I can think of is that the local storage isn't properly 
configured, but when I've looked it seems to be.

Any pointers for how I can further diagnose this would be great - thanks in 
advance!

Elliot
