On Fri, Aug 24, 2012 at 06:06:42PM +0300, Constantinos Venetsanopoulos wrote:
> On 08/24/2012 05:39 PM, Iustin Pop wrote:
> >On Fri, Aug 24, 2012 at 04:33:21PM +0300, Constantinos Venetsanopoulos wrote:
> >>Hello team,
> >>
> >>I seem to not have actually understood how hail's algorithm works,
> >>and I would like some insight if possible.
> >>
> >>I have a Ganeti cluster with 13 VM-capable nodes running the latest
> >>Ganeti 2.6.0 stable.
> >>
> >>I see that when adding instances of type 'plain' or 'rbd' all instances
> >>go to the same node, until I hit the IPolicy vcpu or spindle ratio. Then,
> >>another node is picked up and again all instances go to that node,
> >>until this node is full too. Is that the expected functionality of hail?
> >No, not at all.
>
>
> Pheww, that's what I thought too :)
:)
> >>Shouldn't it balance the cluster somehow?
> >Yes. I'm not sure what is the problem, especially in case of plain, it
> >should work well; while I can imagine rbd doesn't work well (not enough
> >tested), I'm surprised that plain is broken.
> >
> >>I see that when adding instances of type 'drbd' everything works
> >>as I would expect, meaning that hail picks different nodes of the
> >>cluster for each instance, resulting in a balanced cluster.
> >>
> >>Migration/Failover works well with all templates. Hail picks different
> >>nodes every time.
> >Hmm, very surprising. I'll test if I can reproduce this with plain.
> >
> >Alternatively, if you can give me:
> >
> >- "hscan -L" output after a few instances have been mis-allocated
> >- gnt-debug allocator - wait, there's a bug in it, so instead the
> > command line you use to add another instance
> >
> >it would help to understand the issue.
> >
> >Thanks for the report!
> >iustin
>
> Here is some output of the cluster:
>
> # gnt-instance list -o +vcpus,disk_template
> Instance Hypervisor OS                Primary_node     Status  Memory ConfigVCPUs Disk_template
> snf-45   kvm        snf-image+default demo9.xxxxxx.gr  running   1.0G           1 drbd
> snf-169  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
> snf-170  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
> snf-171  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-172  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-173  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-174  kvm        snf-image+default demo11.xxxxxx.gr running   2.0G           2 plain
> snf-175  kvm        snf-image+default demo11.xxxxxx.gr running   4.0G           4 plain
> snf-176  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 rbd
> snf-177  kvm        snf-image+default demo13.xxxxxx.gr running   1.0G           1 drbd
> snf-178  kvm        snf-image+default demo10.xxxxxx.gr running   1.0G           1 drbd
> snf-179  kvm        snf-image+default demo11.xxxxxx.gr running   1.0G           1 plain
Hmm, a nice mix of disk templates. Honestly I never tested htools in
these conditions, fun!
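
For reference, a quick tally of the listing above by primary node and disk
template. This is only a throwaway Python sketch: the rows are hand-copied
from the quoted output, and the `tally` helper is a name made up here, not
part of Ganeti or htools.

```python
from collections import Counter

# (instance, primary node, disk template), hand-copied from the listing above
ROWS = [
    ("snf-45", "demo9.xxxxxx.gr", "drbd"),
    ("snf-169", "demo11.xxxxxx.gr", "rbd"),
    ("snf-170", "demo11.xxxxxx.gr", "rbd"),
    ("snf-171", "demo11.xxxxxx.gr", "plain"),
    ("snf-172", "demo11.xxxxxx.gr", "plain"),
    ("snf-173", "demo11.xxxxxx.gr", "plain"),
    ("snf-174", "demo11.xxxxxx.gr", "plain"),
    ("snf-175", "demo11.xxxxxx.gr", "plain"),
    ("snf-176", "demo11.xxxxxx.gr", "rbd"),
    ("snf-177", "demo13.xxxxxx.gr", "drbd"),
    ("snf-178", "demo10.xxxxxx.gr", "drbd"),
    ("snf-179", "demo11.xxxxxx.gr", "plain"),
]


def tally(rows):
    """Count instances per primary node and per (node, template) pair."""
    per_node = Counter(node for _, node, _ in rows)
    per_pair = Counter((node, tmpl) for _, node, tmpl in rows)
    return per_node, per_pair


per_node, per_pair = tally(ROWS)
for node, count in per_node.most_common():
    print(node, count)
```

That puts 9 of the 12 instances on demo11, all of them plain or rbd, while
the three drbd instances each sit on a different node, which matches the
behaviour described in the report.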
> # hscan -L
> Name  Nodes Inst BNode BInst   t_mem   f_mem t_disk f_disk       Score
> LOCAL    13   12     0     0 1744299 1730423  34204  33942 11.59884147
> #
Ah, sorry I wasn't clear. "hscan -L" generates a LOCAL.data file in your
current directory, and that is what I'm interested in (mail it directly
to me).
> # gnt-instance add -o snf-image+default --os-parameters \
>   img_format=diskdump,img_id=debian_base-6.0-7-x86_64.diskdump,img_passwd=example_passw0rd,img_properties='{"OSFAMILY":"linux"\,"ROOT_PARTITION":"1"}' \
>   -t drbd --disk 0:size=2G \
>   --no-name-check --no-ip-check \
>   --net 0:ip=pool,network=snf-net-1 \
>   snf-178
Thanks, this + the LOCAL.data file will be enough.
iustin