Re: [PATCH master 0/6] New balancing options implementation
On 24 June 2016 at 14:36, Oleg Ponomarev wrote:

> Hi Iustin,
>
> > I'll look at the patches, but if I read correctly, these are currently
> > stored as tags. Would it make more sense to have them as proper values
> > in the objects, so that (in the future) they can be used by other parts
> > of the code? Just a thought.
>
> Do you have any ideas about how network bandwidth might be used in Ganeti
> itself? At first glance, this information might be useful in HTools only.
> In that case, node tags are the common way to pass the information. It's
> the same mechanism HTools uses to obtain location, migration, desired
> location and some other information.

I don't have very strong examples, but the reason I mention this is that I
regard network capacity in a nodegroup as a physical characteristic of the
hardware, not something that is htools-specific, so modelling it explicitly
might have some value.

For example, it might make sense to say that the number of concurrent disk
replaces on a given node, or the number of concurrent instance migrations,
might depend on the network bandwidth, such that Ganeti would automatically
adjust the concurrency of such jobs per node, without needing external
control.

That is however far-fetched, so I'm not proposing any change to the code per
se; I was asking to see what others think of this.

regards,
iustin
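To make the idea of bandwidth-dependent job concurrency concrete, here is a minimal sketch. It is not Ganeti code: the function name, the per-job bandwidth figure and the hard cap are all assumptions chosen for illustration.

```python
def max_concurrent_jobs(node_bandwidth_mib_s: float,
                        per_job_mib_s: float,
                        hard_cap: int = 8) -> int:
    """Derive a per-node concurrency limit from the node's network bandwidth.

    All parameters are hypothetical: `per_job_mib_s` is the bandwidth we
    assume one disk replace or migration needs, and `hard_cap` is an
    arbitrary upper bound that is independent of the network.
    """
    if node_bandwidth_mib_s <= 0 or per_job_mib_s <= 0:
        raise ValueError("bandwidth values must be positive")
    by_bandwidth = int(node_bandwidth_mib_s // per_job_mib_s)
    # Always allow at least one job, and never exceed the hard cap.
    return max(1, min(hard_cap, by_bandwidth))


print(max_concurrent_jobs(100.0, 50.0))   # slow link: few parallel jobs
print(max_concurrent_jobs(1000.0, 50.0))  # fast link: limited by the cap
```

Whether such a limit belongs in Ganeti itself or in an external tool is exactly the open question raised above; the sketch only shows that the rule itself would be simple once bandwidth is available as a proper value.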
Re: [PATCH master 0/6] New balancing options implementation
Hi Iustin,

> I'll look at the patches, but if I read correctly, these are currently
> stored as tags. Would it make more sense to have them as proper values in
> the objects, so that (in the future) they can be used by other parts of
> the code? Just a thought.

Do you have any ideas about how network bandwidth might be used in Ganeti
itself? At first glance, this information might be useful in HTools only.
In that case, node tags are the common way to pass the information. It's
the same mechanism HTools uses to obtain location, migration, desired
location and some other information.

Sincerely,
Oleg
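As an illustration of passing bandwidth information through node tags, the sketch below extracts a bandwidth map from a list of tags. The tag prefix and the `name:value` layout are hypothetical and are not the format used by the patchset; only the general mechanism (structured strings attached to nodes, parsed by the tool that needs them) is taken from the discussion.

```python
from typing import Dict, Iterable

# Hypothetical prefix, chosen for this example only; not Ganeti's tag name.
PREFIX = "example:bandwidth:"


def parse_bandwidth_tags(tags: Iterable[str]) -> Dict[str, int]:
    """Return {network name: bandwidth} for well-formed bandwidth tags.

    Tags that lack the prefix, or whose remainder is not `name:integer`,
    are silently ignored.
    """
    result: Dict[str, int] = {}
    for tag in tags:
        if not tag.startswith(PREFIX):
            continue
        name, sep, value = tag[len(PREFIX):].rpartition(":")
        if not sep or not name or not value.isdigit():
            continue  # malformed tag: skip rather than fail
        result[name] = int(value)
    return result


node_tags = [
    "example:bandwidth:rack1:100",
    "example:bandwidth:rack2:1000",
    "unrelated:tag",
    "example:bandwidth:broken",
]
print(parse_bandwidth_tags(node_tags))
```

The trade-off discussed in this thread is visible here: tags need no schema change, but every consumer has to repeat this kind of parsing and validation.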
Re: [PATCH master 0/6] New balancing options implementation
On 23 June 2016 at 18:32, Даниил Лещёв wrote:

> > I would slightly prefer if we discuss it over plain email (without
> > patches), to see what you think about how complex the network model
> > needs to be, and whether a static "time X" vs. semi-dynamic (based on
> > the instance disk size) approach is best.
> >
> > Maybe there was some more information back at the start of the project?
> > (I only started watching the mailing list again recently).
>
> The initial plan was to implement "static" solutions, based on instance
> disk size, and then make it "dynamic" by using information about network
> speed from data collectors.

Ack.

> At the moment, we have a "semi-dynamic" solution, I think. The new tags
> may specify network speed in the cluster (and between different parts of
> the cluster).

I'll look at the patches, but if I read correctly, these are currently
stored as tags. Would it make more sense to have them as proper values in
the objects, so that (in the future) they can be used by other parts of the
code? Just a thought.

> I am assuming that this speed remains constant since the network is
> usually configured once and locally (for example in a server rack).

That makes sense.

> I think, with such an assumption, the network speed stays almost constant
> and the time estimations for balancing solutions become predictable.
>
> I suggest using the new options for discarding solutions that take a long
> time and only slightly change the state of the cluster.
> In my mind, the time to perform disk replication depends directly on the
> network bandwidth.

Hmm, depends. On a gigabit or 10G network and with mechanical hard drives,
the time will depend more on disk load.

thanks,
iustin
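The point about disk load can be illustrated by treating replication as limited by the slower of the network and the disks. This is a simplified model with assumed figures (the throughput numbers are placeholders, not measurements), and the function names are made up for the example.

```python
def effective_rate_mib_s(network_mib_s: float, disk_mib_s: float) -> float:
    """The transfer can go no faster than its slowest component."""
    return min(network_mib_s, disk_mib_s)


def replication_seconds(disk_size_mib: float,
                        network_mib_s: float,
                        disk_mib_s: float) -> float:
    """Estimated time to replicate `disk_size_mib` of data."""
    return disk_size_mib / effective_rate_mib_s(network_mib_s, disk_mib_s)


# Assumed figures: a 10G link (roughly 1192 MiB/s) and a loaded mechanical
# drive sustaining about 80 MiB/s, replicating a 100 GiB disk.
size_mib = 100 * 1024
print(replication_seconds(size_mib, 1192.0, 80.0))             # disk-bound
print(replication_seconds(size_mib, 1192.0, float("inf")))     # network only
```

Under these assumptions the network-only estimate is more than ten times too optimistic, which is why a bandwidth-only model may mispredict on fast networks.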
Re: [PATCH master 0/6] New balancing options implementation
> I would slightly prefer if we discuss it over plain email (without
> patches), to see what you think about how complex the network model needs
> to be, and whether a static "time X" vs. semi-dynamic (based on the
> instance disk size) approach is best.
>
> Maybe there was some more information back at the start of the project?
> (I only started watching the mailing list again recently).

The initial plan was to implement "static" solutions, based on instance
disk size, and then make it "dynamic" by using information about network
speed from data collectors.

At the moment, we have a "semi-dynamic" solution, I think. The new tags may
specify network speed in the cluster (and between different parts of the
cluster). I am assuming that this speed remains constant, since the network
is usually configured once and locally (for example in a server rack).

I think, with such an assumption, the network speed stays almost constant
and the time estimations for balancing solutions become predictable.

I suggest using the new options for discarding solutions that take a long
time and only slightly change the state of the cluster.
In my mind, the time to perform disk replication depends directly on the
network bandwidth.

The next step (according to the plan) is to implement a data collector for
network speed information and use it instead of (or maybe together with)
the new tags in order to estimate time more accurately.

-- 
Sincerely,
Daniil Leshchev
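The approach described above (estimate time from disk size and a constant network speed, then discard solutions that are slow and bring little benefit) could be sketched as follows. This is not the code from the patchset: the `Move` structure, the `min_gain` parameter and the exact discard rule are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Move:
    """A candidate balancing step (hypothetical structure, illustration only)."""
    instance: str
    disk_mib: int            # amount of disk data to replicate
    bandwidth_mib_s: float   # assumed constant speed between the two nodes
    score_gain: float        # improvement of the cluster score if applied


def estimated_seconds(move: Move) -> float:
    """Static estimate: replication time = data size / network bandwidth."""
    return move.disk_mib / move.bandwidth_mib_s


def keep_move(move: Move, threshold_s: float, min_gain: float) -> bool:
    """Discard only moves that are both long-running and of small benefit."""
    is_long = estimated_seconds(move) > threshold_s
    is_minor = move.score_gain < min_gain
    return not (is_long and is_minor)


moves = [
    Move("inst1", disk_mib=10_240, bandwidth_mib_s=100.0, score_gain=0.50),
    Move("inst2", disk_mib=512_000, bandwidth_mib_s=100.0, score_gain=0.01),
    Move("inst3", disk_mib=512_000, bandwidth_mib_s=100.0, score_gain=0.80),
]
kept = [m.instance for m in moves
        if keep_move(m, threshold_s=1000.0, min_gain=0.05)]
print(kept)
```

Here `inst2` is dropped because it is estimated to take over 5000 seconds for a negligible gain, while `inst3` takes just as long but is kept because it improves the cluster noticeably.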
Re: [PATCH master 0/6] New balancing options implementation
On 23 June 2016 at 17:42, Даниил Лещёв wrote:

> Hi, Iustin
>
> > Oh, no worries, I just wanted to know if Daniil acknowledged the
> > comments or not.
> >
> > > Anyway, comments are welcome here and the discussion is still open:)
>
> The only reason why I didn't reply to your comments is that I wanted to
> show the patchset in order to discuss the ideas in the design document in
> a more detailed way.
> I hope that was not a big mistake.

Oh no, not a mistake at all.

> I'm also going to rewrite the patch for the design document (append
> information about bandwidth tags).

I would slightly prefer if we discuss it over plain email (without
patches), to see what you think about how complex the network model needs
to be, and whether a static "time X" vs. semi-dynamic (based on the
instance disk size) approach is best.

Maybe there was some more information back at the start of the project? (I
only started watching the mailing list again recently).

thanks!
iustin
Re: [PATCH master 0/6] New balancing options implementation
Hi, Iustin

> Oh, no worries, I just wanted to know if Daniil acknowledged the comments
> or not.
>
> > Anyway, comments are welcome here and the discussion is still open:)

The only reason why I didn't reply to your comments is that I wanted to
show the patchset in order to discuss the ideas in the design document in a
more detailed way.
I hope that was not a big mistake.

I'm also going to rewrite the patch for the design document (append
information about bandwidth tags).

-- 
Sincerely,
Daniil Leshchev
Re: [PATCH master 0/6] New balancing options implementation
On 23 June 2016 at 17:08, Oleg Ponomarev wrote:

> Hi Iustin, Daniil,
>
> The reason for Daniil not to reply immediately is his GSoC midterm
> evaluation coming soon. As the implementation represents his work during
> the first month of GSoC, it's necessary to share it with the community at
> this point. We have discussed your comments on the design document and
> Daniil took them into account. Still, I don't understand why Daniil
> decided not to spend 5-10 minutes to reply in the design document
> discussion thread.

Oh, no worries, I just wanted to know if Daniil acknowledged the comments
or not.

> Anyway, comments are welcome here and the discussion is still open:)

Sounds good.

> And thanks Daniil for the commits.

Of course, looking forward to seeing this implemented!

thanks!
iustin
Re: [PATCH master 0/6] New balancing options implementation
Hi Iustin, Daniil,

The reason for Daniil not to reply immediately is his GSoC midterm
evaluation coming soon. As the implementation represents his work during
the first month of GSoC, it's necessary to share it with the community at
this point. We have discussed your comments on the design document and
Daniil took them into account. Still, I don't understand why Daniil decided
not to spend 5-10 minutes to reply in the design document discussion
thread.

Anyway, comments are welcome here and the discussion is still open:)

And thanks Daniil for the commits.

Sincerely,
Oleg
Re: [PATCH master 0/6] New balancing options implementation
On 23 June 2016 at 16:45, wrote:

> From: Daniil Leshchev
>
> The patchset introduces new command line options
> ("--long-solution-threshold" and "--avoid-long-solutions").
> This gives HBal the ability to avoid balancing solutions
> that take a significant amount of time.

Daniil, I've replied to your design doc changes, but I haven't seen a
reply. That discussion would be useful before implementing this :)

regards,
iustin
[PATCH master 0/6] New balancing options implementation
From: Daniil Leshchev

The patchset introduces new command line options
("--long-solution-threshold" and "--avoid-long-solutions").
This gives HBal the ability to avoid balancing solutions
that take a significant amount of time.

Daniil Leshchev (6):
  Add "long-solution-threshold" and "avoid-long-solutions" command-line options
  Add bandwidth tags and bandwidth map fields into Node
  Add bandwidth tags extraction and parsing
  Add extraction network bandwidth data from tags
  Add long-time solutions filtering
  Add tests for 'avoid-long-solutions' and 'long-solution-threshold' options

 src/Ganeti/HTools/AlgorithmParams.hs            |   9 +++
 src/Ganeti/HTools/CLI.hs                        |  30 +++
 src/Ganeti/HTools/Cluster.hs                    | 103 +++-
 src/Ganeti/HTools/Loader.hs                     |  20 -
 src/Ganeti/HTools/Node.hs                       |  30 +++
 src/Ganeti/HTools/Program/Hbal.hs               |   2 +
 src/Ganeti/HTools/Tags.hs                       |  60 --
 src/Ganeti/HTools/Tags/Constants.hs             |   5 ++
 test/data/htools/hbal-avoid-long-solutions.data |  17
 test/hs/shelltests/htools-hbal.test             |  32
 10 files changed, 283 insertions(+), 25 deletions(-)
 create mode 100644 test/data/htools/hbal-avoid-long-solutions.data

-- 
1.9.1