Re: hail algorithm clarification

Constantinos Venetsanopoulos Wed, 05 Sep 2012 02:54:38 -0700

Hello iustin,

Any news on that? I assume you are quite busy these days.


thanks,
Constantinos

On 08/24/2012 06:26 PM, Constantinos Venetsanopoulos wrote:

On 08/24/2012 06:18 PM, Iustin Pop wrote:

On Fri, Aug 24, 2012 at 06:06:42PM +0300, Constantinos Venetsanopoulos wrote:

On 08/24/2012 05:39 PM, Iustin Pop wrote:
On Fri, Aug 24, 2012 at 04:33:21PM +0300, Constantinos Venetsanopoulos wrote:
Hello team,

I seem to not have actually understood how hail's algorithm works,
and I would like some insight if possible.

I have a Ganeti cluster with 13 VM-capable nodes running the latest
Ganeti 2.6.0 stable.
I see that when adding instances of type 'plain' or 'rbd', all instances
go to the same node, until I hit the IPolicy vcpu or spindle ratio. Then
another node is picked and again all instances go to that node,
until this node is full too. Is that the expected functionality of hail?
No, not at all.
Pheww, that's what I thought too :)

:)

Shouldn't it balance the cluster somehow?

Yes. I'm not sure what the problem is. In the case of plain especially, it
should work well; while I can imagine rbd doesn't work well (not enough
tested), I'm surprised that plain is broken.

I see that when adding instances of type 'drbd' everything works
as I would expect, meaning that  hail picks different nodes of the
cluster for each instance, resulting in a balanced cluster.
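
To make the contrast between the two behaviours described above concrete, here is a small sketch. It is purely illustrative and is not hail's actual algorithm: the node names, the per-node vCPU limit of 16, and both placement functions are assumptions made up for this example. The request sizes are the vCPU counts of the six 'plain' instances from the listing further down the thread.

```python
def first_fit(nodes, capacity, load, need):
    """Pick the first node that still has room (packs nodes one by one)."""
    for node in nodes:
        if load[node] + need <= capacity:
            return node
    return None


def least_loaded(nodes, capacity, load, need):
    """Pick the node with the lowest current load that still has room."""
    fitting = [node for node in nodes if load[node] + need <= capacity]
    return min(fitting, key=lambda node: load[node]) if fitting else None


def place(strategy, nodes, capacity, requests):
    """Place each request with the given strategy and return the final load."""
    load = {node: 0 for node in nodes}
    for need in requests:
        node = strategy(nodes, capacity, load, need)
        if node is None:
            raise RuntimeError("no node can hold this instance")
        load[node] += need
    return load


# Hypothetical cluster: three nodes, at most 16 vCPUs each (assumed limit).
nodes = ["node1", "node2", "node3"]
requests = [4, 4, 4, 2, 4, 1]  # vCPUs of the 'plain' instances in the listing

packed = place(first_fit, nodes, 16, requests)
spread = place(least_loaded, nodes, 16, requests)
print("first fit:   ", packed)
print("least loaded:", spread)
```

The first strategy reproduces the reported symptom (one node fills up before the next is touched), while the second gives the kind of spread seen with 'drbd'.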

Migration/Failover works well with all templates. Hail picks different
nodes every time.

Hmm, very surprising. I'll test if I can reproduce this with plain.

Alternatively, if you can give me:

- "hscan -L" output after a few instances have been mis-allocated
- gnt-debug allocator - wait, there's a bug in it, so instead the
   command line you use to add another instance

it would help to understand the issue.

Thanks for the report!
iustin

Here is some output of the cluster:

# gnt-instance list -o +vcpus,disk_template
Instance Hypervisor OS                Primary_node     Status  Memory ConfigVCPUs Disk_template
snf-45   kvm        snf-image+default demo9.xxxxxx.gr  running   1.0G           1 drbd
snf-169  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
snf-170  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
snf-171  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
snf-172  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
snf-173  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
snf-174  kvm        snf-image+default demo11.xxxxxx.gr running   2.0G           2 plain
snf-175  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
snf-176  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
snf-177  kvm        snf-image+default demo13.xxxxxx.gr running   1.0G           1 drbd
snf-178  kvm        snf-image+default demo10.xxxxxx.gr running   1.0G           1 drbd
snf-179  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 plain

Hmm, a nice mix of disk templates. Honestly I never tested htools in
these conditions, fun!



:)
Note that all 'plain' and 'rbd' instances go to demo11
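
The observation above can be checked mechanically against the listing. The following is a minimal sketch that tallies primary nodes per disk template; the rows are transcribed by hand from the `gnt-instance list` output in this thread, and the variable names are just illustrative.

```python
from collections import defaultdict

# (instance, primary node, disk template), copied from the listing above.
ROWS = [
    ("snf-45", "demo9.xxxxxx.gr", "drbd"),
    ("snf-169", "demo11.xxxxxx.gr", "rbd"),
    ("snf-170", "demo11.xxxxxx.gr", "rbd"),
    ("snf-171", "demo11.xxxxxx.gr", "plain"),
    ("snf-172", "demo11.xxxxxx.gr", "plain"),
    ("snf-173", "demo11.xxxxxx.gr", "plain"),
    ("snf-174", "demo11.xxxxxx.gr", "plain"),
    ("snf-175", "demo11.xxxxxx.gr", "plain"),
    ("snf-176", "demo11.xxxxxx.gr", "rbd"),
    ("snf-177", "demo13.xxxxxx.gr", "drbd"),
    ("snf-178", "demo10.xxxxxx.gr", "drbd"),
    ("snf-179", "demo11.xxxxxx.gr", "plain"),
]

# Count instances per (template, node) pair.
counts = defaultdict(lambda: defaultdict(int))
for _instance, node, template in ROWS:
    counts[template][node] += 1

# Set of distinct primary nodes used by each template.
nodes_by_template = {t: set(per_node) for t, per_node in counts.items()}

for template in sorted(counts):
    for node, n in sorted(counts[template].items()):
        print(f"{template:6s} {node:18s} {n}")
```

In this data, 'plain' and 'rbd' each use a single primary node, while the three 'drbd' instances sit on three different nodes.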

# hscan -L
Name  Nodes  Inst BNode BInst  t_mem  f_mem t_disk f_disk Score
LOCAL    13    12     0     0 1744299 1730423  34204  33942 11.59884147
#

Ah, sorry I wasn't clear. "hscan -L" generates a LOCAL.data file in your
current directory, and that is what I'm interested in (mail it directly
to me).



OK, I'll mail it to you now.

# gnt-instance add -o snf-image+default \
    --os-parameters img_format=diskdump,img_id=debian_base-6.0-7-x86_64.diskdump,img_passwd=example_passw0rd,img_properties='{"OSFAMILY":"linux"\,"ROOT_PARTITION":"1"}' \
    -t drbd --disk 0:size=2G \
    --no-name-check --no-ip-check \
    --net 0:ip=pool,network=snf-net-1 \
    snf-178

Thanks, this + the LOCAL.data file will be enough.

iustin



Thanks,
Constantinos
