[Bug 564355] Re: Second euca-run-instance request in same security group causes eucalyptus to remove network associated with security group

2010-08-25 Thread Piotr T Zbiegiel
Not sure about logs from all Eucalyptus components.  This problem seems
to be centered in the cluster controller code.  I think the telling log
lines I've seen after the second euca-run-instances command are:

[Thu Apr 15 14:29:46 2010][001328][EUCAINFO ] RunInstances(): called
[Thu Apr 15 14:29:46 2010][001328][EUCAERROR ] vnetAddHost(): failed to add host d0:0d:3B:E6:07:11 on vlan 10
[Thu Apr 15 14:29:46 2010][001328][EUCAERROR ] RunInstances(): could not find/initialize any free network address, failing doRunInstances()

Once the cluster controller fails to issue network addresses for the new
instances it doesn't bother to farm them out to the node controllers.
Those instances are never started on any of the NCs.
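
To see how often this failure signature shows up, the log can be summarized per VLAN. The following is only a rough sketch: it assumes each log entry sits on a single line in the actual file (the quotes above were wrapped by the mail client), and the second failure line in the sample is made up for illustration.

```shell
# Count vnetAddHost() failures per VLAN in a cc.log-style stream on stdin.
# Assumes one log entry per line.
count_addhost_failures() {
  grep -F 'vnetAddHost(): failed to add host' |
    sed 's/.* on vlan \([0-9][0-9]*\).*/\1/' |
    sort -n | uniq -c | awk '{print "vlan " $2 ": " $1}'
}

# Sample input modelled on the lines quoted above; the last line is invented.
sample='[Thu Apr 15 14:29:46 2010][001328][EUCAINFO ] RunInstances(): called
[Thu Apr 15 14:29:46 2010][001328][EUCAERROR ] vnetAddHost(): failed to add host d0:0d:3B:E6:07:11 on vlan 10
[Thu Apr 15 14:31:02 2010][001328][EUCAERROR ] vnetAddHost(): failed to add host d0:0d:3B:E6:07:12 on vlan 10'

summary=$(printf '%s\n' "$sample" | count_addhost_failures)
echo "$summary"
```

Against a real cluster controller the function would be fed the log file itself, e.g. `count_addhost_failures < cc.log`, wherever cc.log lives on that installation.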

It almost seems like the cluster controller forgets about the available
network addresses on a given network and won't allocate addresses for
new instances.  The most distressing part (and this doesn't happen
every time) is that the cluster controller deallocates the network
associated with the security group.  Its rule chain is removed from
iptables and I've even seen other users get issued the same slice of
network addresses for their new security groups.  All this happens while
instances in the old security group are still in a running state.

I can confirm Aimon's comment.  We have seen this behavior with
ADDRSPERNET set to 256, 128, and 64.

-- 
Second euca-run-instance request in same security group causes eucalyptus to 
remove network associated with security group
https://bugs.launchpad.net/bugs/564355
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to eucalyptus in ubuntu.

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 564355] Re: Second euca-run-instance request in same security group causes eucalyptus to remove network associated with security group

2010-08-25 Thread Piotr T Zbiegiel
I wanted to add that we have since upgraded to 1.6.2-0ubuntu30.3 and
still witness this behavior regularly.


[Bug 568108] [NEW] getSecretKey() in euca_conf uses unanchored regex to find admin credentials

2010-04-21 Thread Piotr T Zbiegiel
Public bug reported:

When the function getSecretKey() in euca_conf tries to set SKEY and AKEY
it uses an unanchored regex with awk that can cause it to select the
credentials of any user with the word admin in their login name.  I
imagine the intent was to select the 'admin' user but the way the code
is written the regex could match 'sadminer' for instance, who may or may
not have admin credentials.

This problem manifested when we created some accounts named jdoe_admin.
Even though jdoe_admin was marked as an Administrator, there were no
credentials in the database (the user had not retrieved their
credentials.zip), so euca_conf requests started to fail on the machine.

The offending lines seem to be:
SKEY=$(eval echo $(awk -v field=${FIELD} -F, '/INSERT INTO AUTH_USERS.*admin/ {print $field}' ${DBDIR}/*auth* | head -n 1))

AKEY=$(eval echo $(awk -v field=${FIELD} -F, '/INSERT INTO AUTH_USERS.*admin/ {print $field}' ${DBDIR}/*auth* | head -n 1))

Since the usernames in the files are surrounded by single quotes the
following fix seemed to work for us:

Replace:  '/INSERT INTO AUTH_USERS.*admin/ {print $field}' 
With:  /INSERT INTO AUTH_USERS.*'admin'/ {print \$field}

Not sure if that is the best solution.
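
The difference between the two patterns can be seen on a couple of made-up rows. This is just a sketch: the sample rows, the column layout, and the field number are assumptions for illustration and may not match the real files under ${DBDIR}.

```shell
# Hypothetical rows; the real auth files under ${DBDIR} may look different.
rows="INSERT INTO AUTH_USERS VALUES(1,'jdoe_admin',NULL)
INSERT INTO AUTH_USERS VALUES(2,'admin','s3cr3t')"

# Unanchored pattern: any login containing "admin" matches, first row wins.
loose=$(printf '%s\n' "$rows" |
  awk -F, '/INSERT INTO AUTH_USERS.*admin/ {print $2}' | head -n 1)

# Quoted pattern: only the login that is exactly 'admin' matches.
# The awk program is in double quotes here, so $2 must be escaped.
strict=$(printf '%s\n' "$rows" |
  awk -F, "/INSERT INTO AUTH_USERS.*'admin'/ {print \$2}" | head -n 1)

echo "loose=$loose strict=$strict"
```

With these rows the unanchored pattern picks up jdoe_admin because it happens to come first, while the quoted pattern skips it and selects the admin row.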

Thanks!

** Affects: eucalyptus (Ubuntu)
 Importance: Undecided
 Status: New

-- 
getSecretKey() in euca_conf uses unanchored regex to find admin credentials
https://bugs.launchpad.net/bugs/568108


[Bug 564355] Re: Second euca-run-instance request in same security group causes eucalyptus to remove network associated with security group

2010-04-19 Thread Piotr T Zbiegiel
I was able to repeat this behavior with ADDRSPERNET set to 128.  The
system seems more prone to this behavior when a user makes requests for
large numbers of VMs in a security group and then attempts to add more.
Not sure if this bug manifests based on the size of requests or how many
IPs are already allocated in a given security group.


[Bug 566715] [NEW] GREEDY scheduling policy occasionally over-subscribes nodes (more VMs than cores)

2010-04-19 Thread Piotr T Zbiegiel
Public bug reported:

We are running eucalyptus 1.6.2-0ubuntu27 on lucid beta1.  I will retest
this bug on the latest-and-greatest as soon as that is feasible on our
cluster.

We have been running many tests with the ROUNDROBIN scheduling policy.
As a test I changed it to GREEDY a little over a week ago.  The
scheduling policy seems to work as expected except that occasionally
when servicing large requests the cluster controller will request that a
node run more VMs than available cores on a machine.  We are using kvm.
When making a large request (say 100 VMs) it seems that invariably one
of the nodes used to run the machines will be over-subscribed.  Our
machines have 8 cores each and I often see 9 VMs and occasionally I've
seen as many as 12.

I have checked that the machines with extra VMs did not have
hyper-threading enabled and were therefore reporting the correct number
of cores to Eucalyptus according to the
/var/log/eucalyptus/euca_test_nc.log file on each system.
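
A quick way to spot over-subscribed nodes after a large request is to count VMs per node and compare against the core count. The sketch below assumes the placement has already been collected as one "node vm" pair per line; that input format, the node addresses, and the VM names are all hypothetical.

```shell
# Print "node count" for every node running more VMs than it has cores.
# $1 = cores per node; stdin = one "node vm" pair per line (assumed format).
oversubscribed() {
  awk -v cores="$1" '
    { vms[$1]++ }
    END { for (n in vms) if (vms[n] > cores) print n, vms[n] }
  ' | sort
}

# Made-up placement: 9 VMs on one node, 8 on another.
sample=$(
  i=1; while [ "$i" -le 9 ]; do echo "192.168.1.2 vm$i"; i=$((i + 1)); done
  i=1; while [ "$i" -le 8 ]; do echo "192.168.1.3 vm$i"; i=$((i + 1)); done
)

flagged=$(printf '%s\n' "$sample" | oversubscribed 8)
echo "$flagged"
```

With 8 cores per node, only the node carrying 9 VMs is reported.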

** Affects: eucalyptus (Ubuntu)
 Importance: Undecided
 Status: New

-- 
GREEDY scheduling policy occasionally over-subscribes nodes (more VMs than 
cores)
https://bugs.launchpad.net/bugs/566715


[Bug 564355] [NEW] Second euca-run-instance request in same security group causes eucalyptus to remove network associated with security group

2010-04-15 Thread Piotr T Zbiegiel
Public bug reported:

We are running eucalyptus 1.6.2-0ubuntu27 on lucid beta1 in
MANAGED-NOVLAN mode.  I will retest as soon as is feasible with ubuntu30,
but as I see no mention of this issue/fix in the changelog I wanted to
get the information into your hands.

Eucalyptus has trouble allocating additional VMs to existing security
groups in some cases.  I tried several tests and saw very similar
results.  Eucalyptus allows you to request VMs in a given security
group.  Once all the VMs are running, an additional euca-run-instances
request for that security group will fail, and in some cases the network
associated with that security group will be removed from iptables (even
if there are running VMs within that security group).  The network that
was freed up can be re-allocated to another security group, but new VMs
requested in that security group fail with the same "failed to add host"
message.

---
A typical cycle looks like this (command-line interspersed with snippets of 
cc.log):

$ euca-run-instances -n 250 -g default…

[Thu Apr 15 14:14:51 2010][001325][EUCAINFO  ] StartNetwork(): called
[Thu Apr 15 14:14:51 2010][001324][EUCAINFO  ] ConfigureNetwork(): called
[Thu Apr 15 14:14:51 2010][001324][EUCAINFO  ] vnetTableRule(): applying iptables rule: -A user-default -s 0.0.0.0/0 -d 10.0.8.0/24 -p tcp --dport 22:22 -j ACCEPT
[Thu Apr 15 14:14:51 2010][001327][EUCAINFO  ] RunInstances(): called

 #….Proceeds to run 250 instances successfully…..

$ euca-run-instances -n 1 -g default….

[Thu Apr 15 14:29:46 2010][001376][EUCAINFO  ] StartNetwork(): called
[Thu Apr 15 14:29:46 2010][001368][EUCAINFO  ] ConfigureNetwork(): called
[Thu Apr 15 14:29:46 2010][001368][EUCAINFO  ] vnetTableRule(): applying iptables rule: -A user-default -s 0.0.0.0/0 -d 10.0.8.0/24 -p tcp --dport 22:22 -j ACCEPT
[Thu Apr 15 14:29:46 2010][001328][EUCAINFO  ] RunInstances(): called
[Thu Apr 15 14:29:46 2010][001328][EUCAERROR ] vnetAddHost(): failed to add host d0:0d:3B:E6:07:11 on vlan 10
[Thu Apr 15 14:29:46 2010][001328][EUCAERROR ] RunInstances(): could not find/initialize any free network address, failing doRunInstances()

#…..After 15 minutes instance goes to terminated and TerminateInstance()
is called many times (once per NC?)…….

[Thu Apr 15 14:39:51 2010][005458][EUCAERROR ] ERROR: TerminateInstance() could not be invoked (check NC host, port, and credentials)
[Thu Apr 15 14:39:51 2010][001326][EUCAINFO  ] TerminateInstances(): calling terminate instance (i-3BE60711) on (192.168.1.2)
[Thu Apr 15 14:39:51 2010][005459][EUCAERROR ] ERROR: TerminateInstance() could not be invoked (check NC host, port, and credentials)
[Thu Apr 15 14:39:51 2010][001326][EUCAINFO  ] TerminateInstances(): calling terminate instance (i-3BE60711) on (192.168.1.3)
[Thu Apr 15 14:39:51 2010][005460][EUCAERROR ] ERROR: TerminateInstance() could not be invoked (check NC host, port, and credentials)
[Thu Apr 15 14:39:51 2010][001326][EUCAINFO  ] TerminateInstances(): calling terminate instance (i-3BE60711) on (192.168.1.4)
[Thu Apr 15 14:39:51 2010][005461][EUCAERROR ] ERROR: TerminateInstance() could not be invoked (check NC host, port, and credentials)

#……It then removes the network allocated for the user's default security
group even though there are 250 running VMs!!!……

[Thu Apr 15 14:40:00 2010][001328][EUCAINFO  ] StopNetwork(): called


#iptables shows that the chain user-default has disappeared!
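
The disappearance of the chain can be checked mechanically rather than by eye. This is a minimal sketch that looks for the chain declaration in `iptables -S` style output; the two sample rule listings are invented for illustration and are not captured from the affected system.

```shell
# Succeeds if the named chain is declared ("-N <chain>") in
# `iptables -S` style output read from stdin.
has_chain() {
  grep -qxF -- "-N $1"
}

# Made-up listings standing in for output before and after StopNetwork().
before='-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N user-default
-A user-default -d 10.0.8.0/24 -p tcp -m tcp --dport 22 -j ACCEPT'
after='-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT'

if printf '%s\n' "$before" | has_chain user-default; then
  before_state=present
else
  before_state=missing
fi
if printf '%s\n' "$after" | has_chain user-default; then
  after_state=present
else
  after_state=missing
fi
echo "before=$before_state after=$after_state"
```

On the cluster controller itself this would be run as root along the lines of `iptables -S | has_chain user-default`, for example in a loop while instances in the group are still running.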

---
I tried many different combinations of numbers of nodes, etc.
(ADDRSPERNET is 256)

250 + 1 additional (the 1 additional failed, network was removed and VMs are 
inaccessible)
100 + 1 additional (the 1 additional failed, network was removed and VMs are 
inaccessible)
20 + 20 additional (the 20 additional failed, network was removed and VMs are 
inaccessible)

I did have some success adding to existing security groups by 10 or
20 nodes at a time.  One security group grew to 80 nodes before I
received the "failed to add host" messages.  It seemed I was more
successful when I was making requests rapidly (waiting only a few
minutes between requests) rather than waiting for all the nodes to
allocate in a given reservation.  I am at a loss as to the exact cause
because some security groups are allowed to expand while others are cut
off from receiving additional IPs well before they reach ADDRSPERNET.

** Affects: eucalyptus (Ubuntu)
 Importance: Undecided
 Status: New
