Re: [PATCH master 0/6] New balancing options implementation

2016-06-27 Thread 'Iustin Pop' via ganeti-devel
On 24 June 2016 at 14:36, Oleg Ponomarev  wrote:

> Hi Iustin,
>
> > I'll look at the patches, but if I read correctly, these are currently
> > stored as tags. Would it make more sense to have them as proper values
> > in the objects, so that (in the future) they can be used by other parts
> > of the code? Just a thought.
>
> Do you have any ideas about how network bandwidth might be used in Ganeti
> itself?
> At first glance, this information might be useful in HTools only. In that
> case, node tags are the common way to pass the information. It's the
> same mechanism that HTools uses to obtain location, migration, desired
> location and some other information.
>

I don't have very strong examples, but the reason I mention this is that I
regard network capacity in a nodegroup as a physical characteristic of the
hardware, not something that is htools-specific, so modelling it explicitly
might have some value. For example, it might make sense to say that the
number of concurrent disk replaces on a given node, or the number of
concurrent instance migrations, might depend on the network bandwidth, such
that Ganeti would automatically adjust the concurrency of such jobs per
node, without needing external control.

That is however far-fetched, so I'm not proposing any change to the code
per se; I was asking to see what others think of this.

regards,
iustin

> On Fri, Jun 24, 2016 at 3:17 PM Iustin Pop wrote:
>
>> On 23 June 2016 at 18:32, Даниил Лещёв wrote:
>>
>>>
>>>> I would slightly prefer if we discuss it over plain email (without
>>>> patches), to see what you think about how complex the network model needs
>>>> to be, and whether a static "time X" vs. semi-dynamic (based on the
>>>> instance disk size) approach is best.
>>>>
>>>> Maybe there was some more information back at the start of the project?
>>>> (I only started watching the mailing list again recently).
>>>>
>>> The initial plan was to implement "static" solutions, based on instance
>>> disk size, and then make them "dynamic" by using information about network
>>> speed from data collectors.
>>>
>>
>> Ack.
>>
>>> At the moment, we have a "semi-dynamic" solution, I think. The new tags may
>>> specify the network speed in the cluster (and between different parts of
>>> the cluster).
>>>
>>
>> I'll look at the patches, but if I read correctly—these are currently
>> stored as tags. Would it make more sense to have them as proper values in
>> the objects, so that (in the future) they can be used by other parts of the
>> code? Just a thought.
>>
>>> I am assuming that this speed remains constant, since the network is
>>> usually configured once and locally (for example, within a server rack).
>>>
>>
>> That makes sense.
>>
>>
>>> I think that, with this assumption, the network speed stays almost
>>> constant and the time estimates for balancing solutions become predictable.
>>>
>>> I suggest using the new options to discard solutions that take a long
>>> time but only slightly change the state of the cluster.
>>> In my mind, the time to perform disk replication depends directly on
>>> the network bandwidth.
>>>
>>
>> Hmm, depends. On a gigabit or 10G network and with mechanical
>> hard drives, the time will depend more on disk load.
>>
>> thanks,
>> iustin
>>
>


Re: [PATCH master 0/6] New balancing options implementation

2016-06-24 Thread Oleg Ponomarev
Hi Iustin,

> I'll look at the patches, but if I read correctly, these are currently
> stored as tags. Would it make more sense to have them as proper values in
> the objects, so that (in the future) they can be used by other parts of the
> code? Just a thought.

Do you have any ideas about how network bandwidth might be used in Ganeti
itself?
At first glance, this information might be useful in HTools only. In that
case, node tags are the common way to pass the information. It's the
same mechanism that HTools uses to obtain location, migration, desired
location and some other information.

Sincerely,
Oleg

On Fri, Jun 24, 2016 at 3:17 PM Iustin Pop  wrote:

> On 23 June 2016 at 18:32, Даниил Лещёв  wrote:
>
>>
>>> I would slightly prefer if we discuss it over plain email (without
>>> patches), to see what you think about how complex the network model needs
>>> to be, and whether a static "time X" vs. semi-dynamic (based on the
>>> instance disk size) approach is best.
>>>
>>> Maybe there was some more information back at the start of the project?
>>> (I only started watching the mailing list again recently).
>>>
>> The initial plan was to implement "static" solutions, based on instance
>> disk size, and then make them "dynamic" by using information about network
>> speed from data collectors.
>>
>
> Ack.
>
>> At the moment, we have a "semi-dynamic" solution, I think. The new tags may
>> specify the network speed in the cluster (and between different parts of
>> the cluster).
>>
>
> I'll look at the patches, but if I read correctly—these are currently
> stored as tags. Would it make more sense to have them as proper values in
> the objects, so that (in the future) they can be used by other parts of the
> code? Just a thought.
>
>> I am assuming that this speed remains constant, since the network is
>> usually configured once and locally (for example, within a server rack).
>>
>
> That makes sense.
>
>
>> I think that, with this assumption, the network speed stays almost
>> constant and the time estimates for balancing solutions become predictable.
>>
>> I suggest using the new options to discard solutions that take a long
>> time but only slightly change the state of the cluster.
>> In my mind, the time to perform disk replication depends directly on
>> the network bandwidth.
>>
>
> Hmm, depends. On a gigabit or 10G network and with mechanical hard drives,
> the time will depend more on disk load.
>
> thanks,
> iustin
>


Re: [PATCH master 0/6] New balancing options implementation

2016-06-24 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 18:32, Даниил Лещёв  wrote:

>
>> I would slightly prefer if we discuss it over plain email (without
>> patches), to see what you think about how complex the network model needs
>> to be, and whether a static "time X" vs. semi-dynamic (based on the
>> instance disk size) approach is best.
>>
>> Maybe there was some more information back at the start of the project?
>> (I only started watching the mailing list again recently).
>>
> The initial plan was to implement "static" solutions, based on instance
> disk size, and then make them "dynamic" by using information about network
> speed from data collectors.
>

Ack.

> At the moment, we have a "semi-dynamic" solution, I think. The new tags may
> specify the network speed in the cluster (and between different parts of
> the cluster).
>

I'll look at the patches, but if I read correctly—these are currently
stored as tags. Would it make more sense to have them as proper values in
the objects, so that (in the future) they can be used by other parts of the
code? Just a thought.

> I am assuming that this speed remains constant, since the network is
> usually configured once and locally (for example, within a server rack).
>

That makes sense.


> I think that, with this assumption, the network speed stays almost
> constant and the time estimates for balancing solutions become predictable.
>
> I suggest using the new options to discard solutions that take a long
> time but only slightly change the state of the cluster.
> In my mind, the time to perform disk replication depends directly on the
> network bandwidth.
>

Hmm, depends. On a gigabit or 10G network and with mechanical hard drives,
the time will depend more on disk load.
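
To make the point about disk load concrete, here is a minimal Python sketch
that treats replication as limited by the slower of the network and the
target disk. The function name, the parameters and all throughput figures
are illustrative assumptions, not measurements and not part of the patchset.

```python
def replication_seconds(disk_mib, network_mibps, disk_write_mibps):
    """Estimate copy time, assuming the slower resource is the bottleneck."""
    effective_mibps = min(network_mibps, disk_write_mibps)
    return disk_mib / effective_mibps


# Hypothetical numbers: a 100 GiB disk, a fast network link and a loaded
# mechanical drive that sustains far less than the network can deliver.
disk_mib = 100 * 1024
network_only = replication_seconds(disk_mib, 1200.0, float("inf"))
disk_bound = replication_seconds(disk_mib, 1200.0, 60.0)

print(f"network-only estimate: {network_only:.0f} s")
print(f"disk-bound estimate:   {disk_bound:.0f} s")
```

With figures like these, an estimate based on network bandwidth alone would
be far too optimistic.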

thanks,
iustin


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread Даниил Лещёв
>
>
> I would slightly prefer if we discuss it over plain email (without
> patches), to see what you think about how complex the network model needs
> to be, and whether a static "time X" vs. semi-dynamic (based on the
> instance disk size) approach is best.
>
> Maybe there was some more information back at the start of the project? (I
> only started watching the mailing list again recently).
>

The initial plan was to implement "static" solutions, based on instance
disk size, and then make them "dynamic" by using information about network
speed from data collectors.

At the moment, we have a "semi-dynamic" solution, I think. The new tags may
specify the network speed in the cluster (and between different parts of
the cluster).
I am assuming that this speed remains constant, since the network is
usually configured once and locally (for example, within a server rack).
I think that, with this assumption, the network speed stays almost
constant and the time estimates for balancing solutions become predictable.

I suggest using the new options to discard solutions that take a long
time but only slightly change the state of the cluster.
In my mind, the time to perform disk replication depends directly on the
network bandwidth.

The next step (according to the plan) is to implement a data collector for
network speed information and use it instead of (or maybe together with)
the new tags in order to estimate the time more accurately.
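
As a rough illustration of the estimate described above (a sketch, not the
patchset's actual implementation), the following Python snippet estimates
the time of a move as disk size divided by an assumed constant bandwidth,
and drops moves that are both slow and of little benefit. All names
(`Move`, `estimated_seconds`, `keep_move`, `max_seconds`, `min_gain`) and
all numbers are hypothetical; they are not meant to describe the exact
semantics of the new hbal options.

```python
from dataclasses import dataclass


@dataclass
class Move:
    instance: str
    disk_mib: int           # amount of disk data that has to be replicated
    bandwidth_mibps: float  # assumed constant link speed, e.g. taken from a tag
    score_gain: float       # improvement of the cluster score if applied


def estimated_seconds(move):
    """Static estimate: replication time scales with disk size."""
    return move.disk_mib / move.bandwidth_mibps


def keep_move(move, max_seconds, min_gain):
    """Discard only moves that take long AND barely improve the cluster."""
    too_long = estimated_seconds(move) > max_seconds
    too_small = move.score_gain < min_gain
    return not (too_long and too_small)


moves = [
    Move("inst1", disk_mib=10240, bandwidth_mibps=100.0, score_gain=0.50),
    Move("inst2", disk_mib=512000, bandwidth_mibps=100.0, score_gain=0.01),
    Move("inst3", disk_mib=512000, bandwidth_mibps=100.0, score_gain=0.80),
]
kept = [m.instance for m in moves if keep_move(m, max_seconds=1800, min_gain=0.05)]
print(kept)
```

Here `inst2` is dropped because it is slow and gains little, while `inst3`
is kept despite being equally slow, since its gain is large.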

-- 
Sincerely,
Daniil Leshchev


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 17:42, Даниил Лещёв  wrote:

> Hi, Iustin
>
>
>> Oh, no worries, I just wanted to know if Daniil acknowledged the comments
>> or not.
>>
>>> Anyway, comments are welcome here and the discussion is still open:)
>>>
>>
>>
> The only reason I didn't reply to your comments is that I wanted to
> show the patchset first, so that we could discuss the ideas in the design
> document in more detail.
> I hope that was not a big mistake.
>

Oh no, not a mistake at all.


> I'm also going to rewrite the patch for the design document (appending
> information about the bandwidth tags).
>

I would slightly prefer if we discuss it over plain email (without
patches), to see what you think about how complex the network model needs
to be, and whether a static "time X" vs. semi-dynamic (based on the
instance disk size) approach is best.

Maybe there was some more information back at the start of the project? (I
only started watching the mailing list again recently).

thanks!
iustin


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread Даниил Лещёв
Hi, Iustin


> Oh, no worries, I just wanted to know if Daniil acknowledged the comments 
> or not.
>
>> Anyway, comments are welcome here and the discussion is still open:)
>>
>
>
The only reason I didn't reply to your comments is that I wanted to
show the patchset first, so that we could discuss the ideas in the design
document in more detail.
I hope that was not a big mistake.
I'm also going to rewrite the patch for the design document (appending
information about the bandwidth tags).

--
Sincerely,
Daniil Leshchev


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 17:08, Oleg Ponomarev  wrote:

> Hi Iustin, Daniil,
>
> Daniil did not reply immediately because his GSoC midterm evaluation is
> coming up soon. As the implementation represents his work during the first
> month of GSoC, it's necessary to share it with the community at this point.
> We have discussed your comments on the design document and Daniil took them
> into account. Still, I don't understand why Daniil decided not to spend
> 5-10 minutes replying in the design document discussion thread.
>

Oh, no worries, I just wanted to know if Daniil acknowledged the comments
or not.

> Anyway, comments are welcome here and the discussion is still open:)
>

Sounds good.

> And thanks Daniil for the commits.
>

Of course, looking forward to seeing this implemented!

thanks!
iustin

> On Thu, Jun 23, 2016 at 5:49 PM 'Iustin Pop' via ganeti-devel <
> ganeti-devel@googlegroups.com> wrote:
>
>> On 23 June 2016 at 16:45,  wrote:
>>
>>> From: Daniil Leshchev 
>>>
>>> The patchset introduces new command-line options
>>> ("--long-solution-threshold" and "--avoid-long-solutions").
>>> These allow HBal to avoid balancing solutions
>>> that take a significant amount of time.
>>>
>>
>> Daniil, I've replied to your design doc changes, but I haven't seen a
>> reply. That discussion would be useful before implementing this :)
>>
>> regards,
>> iustin
>>
>


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread Oleg Ponomarev
Hi Iustin, Daniil,

Daniil did not reply immediately because his GSoC midterm evaluation is
coming up soon. As the implementation represents his work during the first
month of GSoC, it's necessary to share it with the community at this point.
We have discussed your comments on the design document and Daniil took them
into account. Still, I don't understand why Daniil decided not to spend
5-10 minutes replying in the design document discussion thread.

Anyway, comments are welcome here and the discussion is still open:)

And thanks Daniil for the commits.

Sincerely,
Oleg


On Thu, Jun 23, 2016 at 5:49 PM 'Iustin Pop' via ganeti-devel <
ganeti-devel@googlegroups.com> wrote:

> On 23 June 2016 at 16:45,  wrote:
>
>> From: Daniil Leshchev 
>>
>> The patchset introduces new command-line options
>> ("--long-solution-threshold" and "--avoid-long-solutions").
>> These allow HBal to avoid balancing solutions
>> that take a significant amount of time.
>>
>
> Daniil, I've replied to your design doc changes, but I haven't seen a
> reply. That discussion would be useful before implementing this :)
>
> regards,
> iustin
>


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 16:45,  wrote:

> From: Daniil Leshchev 
>
> The patchset introduces new command-line options
> ("--long-solution-threshold" and "--avoid-long-solutions").
> These allow HBal to avoid balancing solutions
> that take a significant amount of time.
>

Daniil, I've replied to your design doc changes, but I haven't seen a
reply. That discussion would be useful before implementing this :)

regards,
iustin


[PATCH master 0/6] New balancing options implementation

2016-06-23 Thread meleodr
From: Daniil Leshchev 

The patchset introduces new command-line options
("--long-solution-threshold" and "--avoid-long-solutions").
These allow HBal to avoid balancing solutions
that take a significant amount of time.

Daniil Leshchev (6):
  Add "long-solution-threshold" and "avoid-long-solutions" command-line
    options
  Add bandwidth tags and bandwidth map fields into Node
  Add bandwidth tags extraction and parsing
  Add extraction network bandwidth data from tags
  Add long-time solutions filtering
  Add tests for 'avoid-long-solutions' and 'long-solution-threshold'
    options

 src/Ganeti/HTools/AlgorithmParams.hs            |   9 +++
 src/Ganeti/HTools/CLI.hs                        |  30 +++
 src/Ganeti/HTools/Cluster.hs                    | 103 +++-
 src/Ganeti/HTools/Loader.hs                     |  20 -
 src/Ganeti/HTools/Node.hs                       |  30 +++
 src/Ganeti/HTools/Program/Hbal.hs               |   2 +
 src/Ganeti/HTools/Tags.hs                       |  60 --
 src/Ganeti/HTools/Tags/Constants.hs             |   5 ++
 test/data/htools/hbal-avoid-long-solutions.data |  17
 test/hs/shelltests/htools-hbal.test             |  32
 10 files changed, 283 insertions(+), 25 deletions(-)
 create mode 100644 test/data/htools/hbal-avoid-long-solutions.data

-- 
1.9.1