Re: hail algorithm clarification

Iustin Pop Fri, 24 Aug 2012 08:18:55 -0700

On Fri, Aug 24, 2012 at 06:06:42PM +0300, Constantinos Venetsanopoulos wrote:
> On 08/24/2012 05:39 PM, Iustin Pop wrote:
> >On Fri, Aug 24, 2012 at 04:33:21PM +0300, Constantinos Venetsanopoulos wrote:
> >>Hello team,
> >>
> >>I seem to not have actually understood how hail's algorithm works,
> >>and I would like some insight if possible.
> >>
> >>I have a Ganeti cluster with 13 VM-capable nodes running the latest
> >>Ganeti 2.6.0 stable.
> >>
> >>I see that when adding instances of type 'plain' or 'rbd' all instances
> >>go to the same node, until I hit the IPolicy vcpu or spindle ratio. Then,
> >>another node is picked up and again all instances go to that node,
> >>until this node is full too. Is that the expected functionality of hail?
> >No, not at all.
> 
> 
> Pheww, that's what I thought too :)


:)

> >>Shouldn't it balance the cluster somehow?
> >Yes. I'm not sure what is the problem, especially in case of plain, it
> >should work well; while I can imagine rbd doesn't work well (not enough
> >tested), I'm surprised that plain is broken.
> >
> >>I see that when adding instances of type 'drbd' everything works
> >>as I would expect, meaning that hail picks different nodes of the
> >>cluster for each instance, resulting in a balanced cluster.
> >>
> >>Migration/Failover works well with all templates. Hail picks different
> >>nodes every time.
> >Hmm, very surprising. I'll test if I can reproduce this with plain.
> >
> >Alternatively, if you can give me:
> >
> >- "hscan -L" output after a few instances have been mis-allocated
> >- gnt-debug allocator - wait, there's a bug in it, so instead the
> >   command line you use to add another instance
> >
> >it would help to understand the issue.
> >
> >Thanks for the report!
> >iustin
> 
> Here is some output of the cluster:
> 
> # gnt-instance list -o +vcpus,disk_template
> Instance Hypervisor OS                Primary_node     Status  Memory ConfigVCPUs Disk_template
> snf-45   kvm        snf-image+default demo9.xxxxxx.gr  running   1.0G           1 drbd
> snf-169  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
> snf-170  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
> snf-171  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-172  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-173  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-174  kvm        snf-image+default demo11.xxxxxx.gr running   2.0G           2 plain
> snf-175  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-176  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
> snf-177  kvm        snf-image+default demo13.xxxxxx.gr running   1.0G           1 drbd
> snf-178  kvm        snf-image+default demo10.xxxxxx.gr running   1.0G           1 drbd
> snf-179  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 plain

Hmm, a nice mix of disk templates. Honestly I never tested htools in
these conditions, fun!
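
For reference, the skew in the listing above can be tallied with a short
script. This is only an illustrative sketch: the rows are transcribed by
hand from the quoted output, and the helper names (instances_per_node,
templates_on_node) are made up for this example, not part of Ganeti or
htools.

```python
from collections import Counter

# Rows transcribed by hand from the quoted "gnt-instance list" output:
# (instance, primary node, disk template).
ROWS = [
    ("snf-45", "demo9.xxxxxx.gr", "drbd"),
    ("snf-169", "demo11.xxxxxx.gr", "rbd"),
    ("snf-170", "demo11.xxxxxx.gr", "rbd"),
    ("snf-171", "demo11.xxxxxx.gr", "plain"),
    ("snf-172", "demo11.xxxxxx.gr", "plain"),
    ("snf-173", "demo11.xxxxxx.gr", "plain"),
    ("snf-174", "demo11.xxxxxx.gr", "plain"),
    ("snf-175", "demo11.xxxxxx.gr", "plain"),
    ("snf-176", "demo11.xxxxxx.gr", "rbd"),
    ("snf-177", "demo13.xxxxxx.gr", "drbd"),
    ("snf-178", "demo10.xxxxxx.gr", "drbd"),
    ("snf-179", "demo11.xxxxxx.gr", "plain"),
]


def instances_per_node(rows):
    """Count how many instances have each node as their primary."""
    return Counter(node for _, node, _ in rows)


def templates_on_node(rows, node):
    """Count disk templates among the instances whose primary is `node`."""
    return Counter(tmpl for _, n, tmpl in rows if n == node)


if __name__ == "__main__":
    # Print nodes from most to least loaded, with their template mix.
    for node, count in instances_per_node(ROWS).most_common():
        print(node, count, dict(templates_on_node(ROWS, node)))
```

On this data, all of the plain and rbd instances share one primary node,
while each drbd instance has a different primary, which matches the
behaviour described in the report.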

> # hscan -L
> Name  Nodes  Inst BNode BInst  t_mem  f_mem t_disk f_disk      Score
> LOCAL    13    12     0     0 1744299 1730423  34204  33942 11.59884147
> #

Ah, sorry I wasn't clear. "hscan -L" generates a LOCAL.data file in your
current directory, and that is what I'm interested in (mail it directly
to me).

> # gnt-instance add -o snf-image+default \
>     --os-parameters img_format=diskdump,img_id=debian_base-6.0-7-x86_64.diskdump,img_passwd=example_passw0rd,img_properties='{"OSFAMILY":"linux"\,"ROOT_PARTITION":"1"}' \
>     -t drbd --disk 0:size=2G \
>     --no-name-check --no-ip-check \
>     --net 0:ip=pool,network=snf-net-1 \
>     snf-178

Thanks, this + the LOCAL.data file will be enough.

iustin
