Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread Даниил Лещёв
>
>
> I would slightly prefer if we discuss it over plain email (without
> patches), to see what you think about how complex the network model needs
> to be, and whether a static "time X" vs. semi-dynamic (based on the
> instance disk size) approach is best.
>
> Maybe there was some more information back at the start of the project? (I
> only started watching the mailing list again recently).
>
The initial plan was to implement a "static" solution based on instance
disk size, and then make it "dynamic" by using information about network
speed from data collectors.

At the moment we have a "semi-dynamic" solution, I think. The new tags can
specify the network speed within the cluster (and between its different
parts). I am assuming that this speed remains constant, since the network
is usually configured once and locally (for example, within a server rack).
With that assumption, the network speed stays almost constant and the time
estimates for balancing solutions become predictable.

I suggest using the new options to discard solutions that take a long
time while only slightly changing the state of the cluster.
In my mind, the time to perform disk replication depends directly on the
network bandwidth.

The next step (according to the plan) is to implement a data collector for
network speed information and use it instead of (or perhaps together with)
the new tags in order to estimate the time more accurately.
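The estimate described above can be sketched standalone: the data to be moved (disk plus memory, in MiB) transferred over a link of the given speed in Mbit/s, matching the `(volume * 8) / bandwidth` arithmetic used in patch 5/6. `transferTimeSeconds` is an illustrative name, not the actual hbal code.

```haskell
-- Minimal sketch of the replication-time estimate from this thread
-- (illustrative; the real code lives in Ganeti.HTools.Cluster).
transferTimeSeconds :: Int     -- ^ disk size in MiB
                    -> Int     -- ^ memory size in MiB
                    -> Int     -- ^ network bandwidth in Mbit/s
                    -> Double  -- ^ estimated transfer time in seconds
transferTimeSeconds dsk mem bandwidth =
  (fromIntegral (dsk + mem) * 8) / fromIntegral bandwidth
```

For a 50 GiB disk and 1 GiB of RAM over a 100 Mbit/s link this yields roughly 4178 seconds, which is exactly the kind of "long solution" the new options are meant to discard.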

-- 
Sincerely,
Daniil Leshchev


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 17:42, Даниил Лещёв  wrote:

> Hi, Iustin
>
>
>> Oh, no worries, I just wanted to know if Daniil acknowledged the comments
>> or not.
>>
>> Anyway, comments are welcome here and the discussion is still open:)
>>>
>>
>>
> The only reason why I didn't reply to your comments is that I wanted to
> show the patchset in order to discuss the ideas in the design document in
> more detail.
> I hope that was not a big mistake.
>

Oh no, not a mistake at all.


> I'm also going to rewrite the patch for the design document (appending
> information about the bandwidth tags).
>

I would slightly prefer if we discuss it over plain email (without
patches), to see what you think about how complex the network model needs
to be, and whether a static "time X" vs. semi-dynamic (based on the
instance disk size) approach is best.

Maybe there was some more information back at the start of the project? (I
only started watching the mailing list again recently).

thanks!
iustin


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread Даниил Лещёв
Hi, Iustin


> Oh, no worries, I just wanted to know if Daniil acknowledged the comments 
> or not.
>
> Anyway, comments are welcome here and the discussion is still open:)
>>
>
>
The only reason why I didn't reply to your comments is that I wanted to
show the patchset in order to discuss the ideas in the design document in
more detail.
I hope that was not a big mistake.
I'm also going to rewrite the patch for the design document (appending
information about the bandwidth tags).

--
Sincerely,
Daniil Leshchev


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 17:08, Oleg Ponomarev  wrote:

> Hi Iustin, Daniil,
>
> The reason Daniil did not reply immediately is that his GSoC midterm
> evaluation is coming soon. As the implementation represents his work during
> the first month of GSoC, it is necessary to share it with the community at
> this point. We have discussed your comments on the design document and
> Daniil took them into account. Still, I don't understand why Daniil decided
> not to spend 5-10 minutes to reply in the design document discussion thread.
>

Oh, no worries, I just wanted to know if Daniil acknowledged the comments
or not.

> Anyway, comments are welcome here and the discussion is still open :)
>

Sounds good.

> And thanks Daniil for the commits.
>

Of course, looking forward to see this implemented!

thanks!
iustin

> On Thu, Jun 23, 2016 at 5:49 PM 'Iustin Pop' via ganeti-devel <
> ganeti-devel@googlegroups.com> wrote:
>
>> On 23 June 2016 at 16:45,  wrote:
>>
>>> From: Daniil Leshchev 
>>>
>>> The patchset introduces new command line options
>>> ("--long-solution-threshold" and "--avoid-long-solutions").
>>> These give HBal the ability to avoid balancing solutions
>>> that take a significant amount of time.
>>>
>>
>> Daniil, I've replied to your design doc changes, but I haven't seen a
>> reply. That discussion would be useful before implementing this :)
>>
>> regards,
>> iustin
>>
>


Re: [PATCH master 5/6] Add long-time solutions filtering

2016-06-23 Thread Oleg Ponomarev
> +  improvement = (ini_cv - opt_cv) / (ini_cv - fin_cv - opt_cv)
The above expression is definitely incorrect (just consider opt_cv + eps =
fin_cv and fin_cv + eps = ini_cv). The improvement will be negative.
Do you mean "(ini_cv - opt_cv) / (fin_cv - opt_cv)" here?
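Oleg's counterexample can be checked numerically: with opt_cv = 1.0, fin_cv = 1.1, ini_cv = 1.2 (eps = 0.1), the denominator in the patch goes negative while the suggested form stays positive. The function names below are illustrative only.

```haskell
-- The improvement formula from the patch versus the form suggested in
-- the review, evaluated on the counterexample from this email.
improvementPatch, improvementSuggested :: Double -> Double -> Double -> Double
improvementPatch     ini fin opt = (ini - opt) / (ini - fin - opt)
improvementSuggested ini fin opt = (ini - opt) / (fin - opt)
```

Here `improvementPatch 1.2 1.1 1.0` is negative (denominator `1.2 - 1.1 - 1.0 < 0`), whereas `improvementSuggested 1.2 1.1 1.0` is positive.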


> +  in if long_sol_f == 0 ||
Are you sure that it's OK to compare floating point numbers for exact
equality in Haskell?
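Exact equality is safe when the value is only ever the literal 0.0 default, but it breaks as soon as the value is computed. A sketch of the conventional epsilon guard (illustrative, not a proposed patch):

```haskell
-- Comparing Doubles for exact equality is fragile once arithmetic is
-- involved; an epsilon test is the usual alternative.
nearZero :: Double -> Bool
nearZero x = abs x < 1e-12
```

The classic demonstration: `0.1 + 0.2 - 0.3` is not exactly 0 in Double, but `nearZero` classifies it correctly.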

Sincerely,
Oleg

On Thu, Jun 23, 2016 at 5:47 PM  wrote:

> From: Daniil Leshchev 
>
> Change the balancing functions in order to disallow long-time solutions
> whose K/N metric is less than algLongSolutionsFactor,
> where K is the solution's metric improvement and N is the estimated
> time in seconds to perform the solution.
> ---
>  src/Ganeti/HTools/Cluster.hs | 103
> +++
>  1 file changed, 85 insertions(+), 18 deletions(-)
>
> diff --git a/src/Ganeti/HTools/Cluster.hs b/src/Ganeti/HTools/Cluster.hs
> index 65746fd..e5b52cf 100644
> --- a/src/Ganeti/HTools/Cluster.hs
> +++ b/src/Ganeti/HTools/Cluster.hs
> @@ -98,7 +98,7 @@ import Data.List ( nub
>   , (\\)
>   , sort
>   , intercalate)
> -import Data.Maybe (fromJust, fromMaybe, isJust, isNothing)
> +import Data.Maybe (fromJust, fromMaybe, isJust, isNothing, mapMaybe)
>  import Data.Ord (comparing)
>  import Text.Printf (printf)
>
> @@ -115,7 +115,8 @@ import Ganeti.HTools.Cluster.AllocationSolution
>  import Ganeti.HTools.Cluster.Evacuate ( EvacSolution(..),
> emptyEvacSolution
>, updateEvacSolution,
> reverseEvacSolution
>, nodeEvacInstance)
> -import Ganeti.HTools.Cluster.Metrics (compCV, compClusterStatistics)
> +import Ganeti.HTools.Cluster.Metrics (compCV, compClusterStatistics
> +  , optimalCVScore)
>  import Ganeti.HTools.Cluster.Moves (applyMoveEx)
>  import Ganeti.HTools.Cluster.Utils (splitCluster, instancePriGroup
> , availableGroupNodes, iMoveToJob)
> @@ -430,12 +431,80 @@ checkInstanceMove opts nodes_idx ini_tbl@(Table nl
> _ _ _) target =
>  foldl' (checkSingleStep force ini_tbl target) best_migr_tbl
> disk_moves
>in (best_migr_tbl, best_tbl)
>
> +-- | The default network bandwidth value in Mbit/s
> +defaultBandwidth :: Int
> +defaultBandwidth = 100
> +
> +-- | Compute network bandwidth during given move in Mbit/s
> +plcBandwidth :: Table -> Placement -> Int
> +plcBandwidth (Table nl _ _ _) (_, pn, sn, move, _) =
> +  fromMaybe defaultBandwidth (Node.calcBandwidthToNode src dst)
> +  where getNode ndx = Container.find ndx nl
> +(src, dst) = case move of
> +  Failover -> (getNode pn, getNode sn)
> +  FailoverToAny ndx -> (getNode pn, getNode ndx)
> +  ReplacePrimary ndx -> (getNode pn, getNode ndx)
> +  ReplaceSecondary ndx -> (getNode sn, getNode ndx)
> +  ReplaceAndFailover ndx -> (getNode sn, getNode ndx)
> +  FailoverAndReplace ndx -> (getNode sn, getNode ndx)
> +
> +-- | Compute the amount of data to be moved
> +moveVolume :: IMove -> Instance.Instance -> Int
> +moveVolume Failover inst = Instance.mem inst
> +moveVolume (FailoverToAny _) inst = Instance.mem inst
> +moveVolume _ inst = Instance.mem inst + Instance.dsk inst
> +
> +-- | Compute the estimated time to perform move
> +placementTimeEstimation :: Table -> Placement -> Double
> +placementTimeEstimation tbl@(Table _ il _ _)
> +plc@(idx, _, _, move, _) =
> +  (fromIntegral volume * 8) / fromIntegral bandwidth
> +  where volume = moveVolume move (Container.find idx il)
> +bandwidth = plcBandwidth tbl plc
> +
> +-- | Compute the estimated time to perform solution
> +solutionTimeEstimation :: Table -> Double
> +solutionTimeEstimation fin_tbl@(Table _ _ _ plcs) = sum times
> +where times = map (placementTimeEstimation fin_tbl) plcs
> +
> +-- | Filter long-time solutions without enough gain
> +filterLongSolutions :: AlgorithmOptions
> +-> Table
> +-> Table
> +-> Maybe Table
> +filterLongSolutions opts ini_tbl fin_tbl =
> +  let long_sol_th = fromIntegral $ algLongSolutionThreshold opts
> +  long_sol_f = algLongSolutionsFactor opts
> +  fin_t = solutionTimeEstimation fin_tbl
> +  time_metric = fin_t / long_sol_th
> +  Table nl _ ini_cv _ = ini_tbl
> +  Table _ _ fin_cv _ = fin_tbl
> +  opt_cv = optimalCVScore nl
> +  improvement = (ini_cv - opt_cv) / (ini_cv - fin_cv - opt_cv)
> +  in if long_sol_f == 0 ||
> +fin_t < long_sol_th ||
> +improvement > long_sol_f * time_metric
> +  then Just fin_tbl
> +  else Nothing
> +
> +-- | Filter solutions without enough gain
> +filterGainMove :: AlgorithmOptions -> Table -> Table -> Maybe Table
> +filterGainMove opts ini_tbl fin_tbl =
> +  let mg_limit = algMinGainLimit opts
> +  

Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread Oleg Ponomarev
Hi Iustin, Daniil,

The reason Daniil did not reply immediately is that his GSoC midterm
evaluation is coming soon. As the implementation represents his work during
the first month of GSoC, it is necessary to share it with the community at
this point. We have discussed your comments on the design document and
Daniil took them into account. Still, I don't understand why Daniil decided
not to spend 5-10 minutes to reply in the design document discussion thread.

Anyway, comments are welcome here and the discussion is still open:)

And thanks Daniil for the commits.

Sincerely,
Oleg


On Thu, Jun 23, 2016 at 5:49 PM 'Iustin Pop' via ganeti-devel <
ganeti-devel@googlegroups.com> wrote:

> On 23 June 2016 at 16:45,  wrote:
>
>> From: Daniil Leshchev 
>>
>> The patchset introduces new command line options
>> ("--long-solution-threshold" and "--avoid-long-solutions").
>> These give HBal the ability to avoid balancing solutions
>> that take a significant amount of time.
>>
>
> Daniil, I've replied to your design doc changes, but I haven't seen a
> reply. That discussion would be useful before implementing this :)
>
> regards,
> iustin
>


Re: [PATCH master 0/6] New balancing options implementation

2016-06-23 Thread 'Iustin Pop' via ganeti-devel
On 23 June 2016 at 16:45,  wrote:

> From: Daniil Leshchev 
>
> The patchset introduces new command line options
> ("--long-solution-threshold" and "--avoid-long-solutions").
> These give HBal the ability to avoid balancing solutions
> that take a significant amount of time.
>

Daniil, I've replied to your design doc changes, but I haven't seen a
reply. That discussion would be useful before implementing this :)

regards,
iustin


[PATCH master 5/6] Add long-time solutions filtering

2016-06-23 Thread meleodr
From: Daniil Leshchev 

Change the balancing functions in order to disallow long-time solutions
whose K/N metric is less than algLongSolutionsFactor,
where K is the solution's metric improvement and N is the estimated
time in seconds to perform the solution.
---
 src/Ganeti/HTools/Cluster.hs | 103 +++
 1 file changed, 85 insertions(+), 18 deletions(-)

diff --git a/src/Ganeti/HTools/Cluster.hs b/src/Ganeti/HTools/Cluster.hs
index 65746fd..e5b52cf 100644
--- a/src/Ganeti/HTools/Cluster.hs
+++ b/src/Ganeti/HTools/Cluster.hs
@@ -98,7 +98,7 @@ import Data.List ( nub
  , (\\)
  , sort
  , intercalate)
-import Data.Maybe (fromJust, fromMaybe, isJust, isNothing)
+import Data.Maybe (fromJust, fromMaybe, isJust, isNothing, mapMaybe)
 import Data.Ord (comparing)
 import Text.Printf (printf)
 
@@ -115,7 +115,8 @@ import Ganeti.HTools.Cluster.AllocationSolution
 import Ganeti.HTools.Cluster.Evacuate ( EvacSolution(..), emptyEvacSolution
   , updateEvacSolution, reverseEvacSolution
   , nodeEvacInstance)
-import Ganeti.HTools.Cluster.Metrics (compCV, compClusterStatistics)
+import Ganeti.HTools.Cluster.Metrics (compCV, compClusterStatistics
+  , optimalCVScore)
 import Ganeti.HTools.Cluster.Moves (applyMoveEx)
 import Ganeti.HTools.Cluster.Utils (splitCluster, instancePriGroup
, availableGroupNodes, iMoveToJob)
@@ -430,12 +431,80 @@ checkInstanceMove opts nodes_idx ini_tbl@(Table nl _ _ _) 
target =
 foldl' (checkSingleStep force ini_tbl target) best_migr_tbl disk_moves
   in (best_migr_tbl, best_tbl)
 
+-- | The default network bandwidth value in Mbit/s
+defaultBandwidth :: Int
+defaultBandwidth = 100
+
+-- | Compute network bandwidth during given move in Mbit/s
+plcBandwidth :: Table -> Placement -> Int
+plcBandwidth (Table nl _ _ _) (_, pn, sn, move, _) =
+  fromMaybe defaultBandwidth (Node.calcBandwidthToNode src dst)
+  where getNode ndx = Container.find ndx nl
+(src, dst) = case move of
+  Failover -> (getNode pn, getNode sn)
+  FailoverToAny ndx -> (getNode pn, getNode ndx)
+  ReplacePrimary ndx -> (getNode pn, getNode ndx)
+  ReplaceSecondary ndx -> (getNode sn, getNode ndx)
+  ReplaceAndFailover ndx -> (getNode sn, getNode ndx)
+  FailoverAndReplace ndx -> (getNode sn, getNode ndx)
+
+-- | Compute the amount of data to be moved
+moveVolume :: IMove -> Instance.Instance -> Int
+moveVolume Failover inst = Instance.mem inst
+moveVolume (FailoverToAny _) inst = Instance.mem inst
+moveVolume _ inst = Instance.mem inst + Instance.dsk inst
+
+-- | Compute the estimated time to perform move
+placementTimeEstimation :: Table -> Placement -> Double
+placementTimeEstimation tbl@(Table _ il _ _)
+plc@(idx, _, _, move, _) =
+  (fromIntegral volume * 8) / fromIntegral bandwidth
+  where volume = moveVolume move (Container.find idx il)
+bandwidth = plcBandwidth tbl plc
+
+-- | Compute the estimated time to perform solution
+solutionTimeEstimation :: Table -> Double
+solutionTimeEstimation fin_tbl@(Table _ _ _ plcs) = sum times
+where times = map (placementTimeEstimation fin_tbl) plcs
+
+-- | Filter long-time solutions without enough gain
+filterLongSolutions :: AlgorithmOptions
+-> Table
+-> Table
+-> Maybe Table
+filterLongSolutions opts ini_tbl fin_tbl =
+  let long_sol_th = fromIntegral $ algLongSolutionThreshold opts
+  long_sol_f = algLongSolutionsFactor opts
+  fin_t = solutionTimeEstimation fin_tbl
+  time_metric = fin_t / long_sol_th
+  Table nl _ ini_cv _ = ini_tbl
+  Table _ _ fin_cv _ = fin_tbl
+  opt_cv = optimalCVScore nl
+  improvement = (ini_cv - opt_cv) / (ini_cv - fin_cv - opt_cv)
+  in if long_sol_f == 0 ||
+fin_t < long_sol_th ||
+improvement > long_sol_f * time_metric
+  then Just fin_tbl
+  else Nothing
+
+-- | Filter solutions without enough gain
+filterGainMove :: AlgorithmOptions -> Table -> Table -> Maybe Table
+filterGainMove opts ini_tbl fin_tbl =
+  let mg_limit = algMinGainLimit opts
+  min_gain = algMinGain opts
+  Table _ _ ini_cv _ = ini_tbl
+  Table _ _ fin_cv _ = fin_tbl
+  in if fin_cv < ini_cv && (ini_cv > mg_limit
+  || ini_cv - fin_cv > min_gain)
+  then Just fin_tbl -- this round made success, return the new table
+  else Nothing
+
 -- | Compute the best next move.
 checkMove :: AlgorithmOptions   -- ^ Algorithmic options for balancing
  -> [Ndx]   -- ^ Allowed target node indices
  -> Table   -- ^ The current solution
  -> [Instance.Instance] -- ^ List of instances still to move
- -> Table   -- ^ The new solution
+ -> Maybe 

[PATCH master 3/6] Add bandwidth tags extraction and parsing

2016-06-23 Thread meleodr
From: Daniil Leshchev 

These tags make it possible to specify the network bandwidth between nodes.
The syntax is "htools:bandwidth:a::b::N", where N is the network speed
between the 'a' and 'b' tags.
---
 src/Ganeti/HTools/Tags.hs   | 60 +
 src/Ganeti/HTools/Tags/Constants.hs |  5 
 2 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/src/Ganeti/HTools/Tags.hs b/src/Ganeti/HTools/Tags.hs
index 6fd7ebb..010fd3d 100644
--- a/src/Ganeti/HTools/Tags.hs
+++ b/src/Ganeti/HTools/Tags.hs
@@ -39,17 +39,20 @@ module Ganeti.HTools.Tags
   , getMigRestrictions
   , getRecvMigRestrictions
   , getLocations
+  , getBandwidth
+  , getBandwidthGraph
+  , inheritByPrefixes
   ) where
 
-import Control.Monad (guard, (>=>))
-import Data.List (isPrefixOf, isInfixOf, stripPrefix)
+import Control.Monad ((>=>))
+import Data.List (isPrefixOf, stripPrefix)
 import Data.Maybe (mapMaybe)
 import qualified Data.Set as S
 
 import qualified Ganeti.HTools.Node as Node
 import Ganeti.HTools.Tags.Constants ( standbyPrefix
 , migrationPrefix, allowMigrationPrefix
-, locationPrefix )
+, locationPrefix, bandwidthPrefix )
 
 -- * Predicates
 
@@ -79,9 +82,7 @@ getMigRestrictions = getTags migrationPrefix
 -- the parts before and after.
 splitAtColons :: String -> Maybe (String, String)
 
-splitAtColons (':':':':xs) = do
-  guard $ not ("::" `isInfixOf` xs)
-  return ("", xs)
+splitAtColons (':':':':xs) = return ("", xs)
 
 splitAtColons (x:xs) = do
   (as, bs) <- splitAtColons xs
@@ -107,3 +108,50 @@ getRecvMigRestrictions ctags ntags =
 -- from the node tags.
 getLocations :: [String] -> [String] -> S.Set String
 getLocations = getTags locationPrefix
+
+-- | Given the cluster tags extract the network bandwidth
+-- from a node tag.
+getBandwidth :: [String] -> [String] -> S.Set String
+getBandwidth = getTags bandwidthPrefix
+
+-- | Split the given string on all "::" occurrences
+splitToColonsList :: String -> [String]
+splitToColonsList str =
+  case splitAtColons str of
+Just (f, s) -> f : splitToColonsList s
+Nothing -> [str]
+
+-- | Try to parse string into value
+maybeRead :: Read a => String -> Maybe a
+maybeRead s = case reads s of
+[(x,"")] -> Just x
+_-> Nothing
+
+-- | Extract bandwidth graph from cluster tags
+getBandwidthGraph :: [String] -> [(String, String, Int)]
+getBandwidthGraph ctags =
+  let unprefTags = mapMaybe (stripPrefix bandwidthPrefix) ctags
+  tupleList = mapMaybe (listToTuple . splitToColonsList) unprefTags
+  in mapMaybe parseInt tupleList
+  where parseInt (a, b, s) = case maybeRead s :: Maybe Int of
+  Just i -> Just (a, b, i)
+  Nothing -> Nothing
+listToTuple (a:b:c:[]) = Just (a, b, c)
+listToTuple _ = Nothing
+
+-- | Maybe extract the prefix before the first occurrence of ':'
+stripFirstPrefix :: String -> Maybe String
+stripFirstPrefix (':':_) = Just ""
+stripFirstPrefix (x:xs) =
+  case stripFirstPrefix xs of
+Just pref -> Just (x:pref)
+Nothing -> Nothing
+stripFirstPrefix _ = Nothing
+
+-- | Drop all victims sharing a prefix with the inherits, then unite the lists
+inheritByPrefixes :: [String] -> [String] -> [String]
+inheritByPrefixes victims inherits =
+  let prefixes = mapMaybe stripFirstPrefix inherits
+  in inherits ++ filter (prefixFilter prefixes) victims
+  where prefixFilter pl s = not $ any (`isPrefixOf` s) pl
+
diff --git a/src/Ganeti/HTools/Tags/Constants.hs 
b/src/Ganeti/HTools/Tags/Constants.hs
index 3b741ac..d143b7a 100644
--- a/src/Ganeti/HTools/Tags/Constants.hs
+++ b/src/Ganeti/HTools/Tags/Constants.hs
@@ -43,6 +43,7 @@ module Ganeti.HTools.Tags.Constants
   , migrationPrefix
   , allowMigrationPrefix
   , locationPrefix
+  , bandwidthPrefix
   , desiredLocationPrefix
   , standbyAuto
   , autoRepairTagPrefix
@@ -75,6 +76,10 @@ allowMigrationPrefix = "htools:allowmigration:"
 locationPrefix :: String
 locationPrefix = "htools:nlocation:"
 
+-- | The prefix for bandwidth between nodes.
+bandwidthPrefix :: String
+bandwidthPrefix = "htools:bandwidth:"
+
 -- | The prefix for instance desired location tags.
 desiredLocationPrefix :: String
 desiredLocationPrefix = "htools:desiredlocation:"
-- 
1.9.1
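The tag syntax described in the commit message above ("htools:bandwidth:a::b::N") can be parsed with a small standalone sketch. `splitOnDoubleColon` and `parseBandwidthTag` are hypothetical names, not the functions from the patch (which reuses splitAtColons and getBandwidthGraph).

```haskell
import Data.List (stripPrefix)

-- Split a string on every "::" occurrence, e.g.
-- "nic:new::nic:old::50" -> ["nic:new", "nic:old", "50"].
splitOnDoubleColon :: String -> [String]
splitOnDoubleColon s = case breakOn s of
    Just (a, rest) -> a : splitOnDoubleColon rest
    Nothing        -> [s]
  where breakOn (':':':':xs) = Just ("", xs)
        breakOn (c:xs)       = fmap (\(a, b) -> (c : a, b)) (breakOn xs)
        breakOn []           = Nothing

-- Parse a full bandwidth tag into its two endpoints and the speed.
parseBandwidthTag :: String -> Maybe (String, String, Int)
parseBandwidthTag tag = do
  rest <- stripPrefix "htools:bandwidth:" tag
  case splitOnDoubleColon rest of
    [a, b, n] -> case reads n of
                   [(i, "")] -> Just (a, b, i)
                   _         -> Nothing
    _         -> Nothing
```

Note that single colons are allowed inside the endpoint names ('nic:new'), which is why the split must only trigger on a double colon.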



[PATCH master 4/6] Add extraction network bandwidth data from tags

2016-06-23 Thread meleodr
From: Daniil Leshchev 

This information is used by HBal in order to prevent long balancing steps
("--avoid-long-solutions").
---
 src/Ganeti/HTools/Loader.hs | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/Ganeti/HTools/Loader.hs b/src/Ganeti/HTools/Loader.hs
index 2294468..10a364c 100644
--- a/src/Ganeti/HTools/Loader.hs
+++ b/src/Ganeti/HTools/Loader.hs
@@ -326,6 +326,23 @@ addLocationTags ctags node =
   let ntags = Node.nTags node
   in Node.setLocationTags node $ Tags.getLocations ctags ntags
 
+addBandwidthData :: [String]
+ -> Group.List -> Node.Node -> Node.Node
+addBandwidthData ctags gl node =
+  let grp = Container.find (Node.group node) gl
+  ntags = Node.nTags node
+  gtags = Group.allTags grp
+  bgraph = Tags.getBandwidthGraph ctags
+  alltags = Tags.inheritByPrefixes ctags $
+Tags.inheritByPrefixes gtags ntags
+  btags = Tags.getBandwidth ctags alltags
+  tnode = Node.setBandwidthTags node btags
+  update nd (src, dst, bndwdth)
+| Set.member src btags = Node.setBandwidthToLocation nd dst bndwdth
+| Set.member dst btags = Node.setBandwidthToLocation nd src bndwdth
+| otherwise = nd
+  in foldl update tnode bgraph
+
 -- | Initializer function that loads the data from a node and instance
 -- list and massages it into the correct format.
 mergeData :: [(String, DynUtil)]  -- ^ Instance utilisation data
@@ -366,8 +383,9 @@ mergeData um extags selinsts exinsts time 
cdata@(ClusterData gl nl il ctags _) =
(`Node.buildPeers` il4)) nl3
   il6 = Container.map (disableSplitMoves nl3) il5
   nl5 = Container.map (addMigrationTags ctags) nl4
+  nl6 = Container.map (addBandwidthData ctags gl) nl5
   in if' (null lkp_unknown)
- (Ok cdata { cdNodes = nl5, cdInstances = il6 })
+ (Ok cdata { cdNodes = nl6, cdInstances = il6 })
  (Bad $ "Unknown instance(s): " ++ show(map lrContent lkp_unknown))
 
 -- | In a cluster description, clear dynamic utilisation information.
-- 
1.9.1



[PATCH master 6/6] Add tests for 'avoid-long-solutions' and 'long-solution-threshold' options

2016-06-23 Thread meleodr
From: Daniil Leshchev 

These tests cover the following cases:
* default option values [1-3]
* avoiding slow networks [4-6]
* avoiding balancing moves entirely [6-8]
All cases use tag inheritance.
---
 test/data/htools/hbal-avoid-long-solutions.data | 17 +
 test/hs/shelltests/htools-hbal.test | 32 +
 2 files changed, 49 insertions(+)
 create mode 100644 test/data/htools/hbal-avoid-long-solutions.data

diff --git a/test/data/htools/hbal-avoid-long-solutions.data 
b/test/data/htools/hbal-avoid-long-solutions.data
new file mode 100644
index 000..2637182
--- /dev/null
+++ b/test/data/htools/hbal-avoid-long-solutions.data
@@ -0,0 +1,17 @@
+group-01|fake-uuid-01|preferred|nic:old|
+
+node-01|16384|0|16384|409600|281600|16|N|fake-uuid-01|1|nic:new
+node-02|16384|0|16384|409600|281600|16|N|fake-uuid-01|1|nic:new
+node-03|16384|0|16384|409600|409600|16|N|fake-uuid-01|1|
+node-04|16384|0|16384|409600|409600|16|N|fake-uuid-01|1|
+
+inst1|1024|51200|1|running|Y|node-01|node-02|drbd||1
+inst2|1024|12800|1|running|Y|node-01|node-02|drbd||1
+inst3|1024|12800|1|running|Y|node-01|node-02|drbd||1
+inst4|1024|51200|1|running|Y|node-01|node-02|drbd||1
+
+htools:bandwidth:nic
+nic:new
+htools:bandwidth:nic:new::nic:new::1000
+htools:bandwidth:nic:old::nic:old::100
+htools:bandwidth:nic:new::nic:old::50
diff --git a/test/hs/shelltests/htools-hbal.test 
b/test/hs/shelltests/htools-hbal.test
index f7ce274..57dba17 100644
--- a/test/hs/shelltests/htools-hbal.test
+++ b/test/hs/shelltests/htools-hbal.test
@@ -101,6 +101,38 @@
 >>>/Solution length=2/
 >>>= 0
 
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--long-solution-threshold=1000
+>>>/Solution length=4/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--avoid-long-solutions=2
+>>>/Solution length=3/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--avoid-long-solutions=5
+>>>/Solution length=2/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--long-solution-threshold=1500 --avoid-long-solutions=2
+>>>/Solution length=4/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--long-solution-threshold=1000 --avoid-long-solutions=2
+>>>/Solution length=3/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--long-solution-threshold=1000 --avoid-long-solutions=5
+>>>/Solution length=2/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--long-solution-threshold=10 --avoid-long-solutions=10
+>>>/Solution length=1/
+>>>= 0
+
+./test/hs/hbal -t $TESTDATA_DIR/hbal-avoid-long-solutions.data 
--long-solution-threshold=1 --avoid-long-solutions=10
+>>>/Solution length=0/
+>>>= 0
+
 ./test/hs/hbal -t $TESTDATA_DIR/hbal-memory-over-commitment.data
 >>>/No solution found/
 >>>= 0
-- 
1.9.1



[PATCH master 1/6] Add "long-solution-threshold" and "avoid-long-solutions" command-line options

2016-06-23 Thread meleodr
From: Daniil Leshchev 

The "long-solution-threshold" option specifies the threshold,
in seconds, that defines long-time solutions.
The "avoid-long-solutions" option specifies the factor for avoiding
long solutions: only long-time solutions with K/N greater than the
avoidance factor are allowed, where K is the solution's gain in cluster
metric and N is the estimated time in seconds to perform the solution.
---
 src/Ganeti/HTools/AlgorithmParams.hs |  9 +
 src/Ganeti/HTools/CLI.hs | 30 ++
 src/Ganeti/HTools/Program/Hbal.hs|  2 ++
 3 files changed, 41 insertions(+)

diff --git a/src/Ganeti/HTools/AlgorithmParams.hs 
b/src/Ganeti/HTools/AlgorithmParams.hs
index 8a53e69..539b404 100644
--- a/src/Ganeti/HTools/AlgorithmParams.hs
+++ b/src/Ganeti/HTools/AlgorithmParams.hs
@@ -52,6 +52,13 @@ data AlgorithmOptions = AlgorithmOptions
 -- in cluster score more than
 -- algDiskMovesFactor times higher than
 -- the gain in migration moves
+  , algLongSolutionThreshold :: Int -- ^ Threshold for long-time solutions
+  , algLongSolutionsFactor :: Double -- ^ Allow only long solutions,
+-- whose K/N metrics are more,
+-- than algLongSolutionsFactor
+-- where K is solution's gain in cluster
+-- metric and N is estimated time
+-- in seconds to perform this solution
   , algInstanceMoves :: Bool-- ^ Whether instance moves are allowed
   , algRestrictedMigration :: Bool  -- ^ Whether migration is restricted
   , algIgnoreSoftErrors :: Bool -- ^ Whether to always ignore soft errors
@@ -73,6 +80,8 @@ fromCLIOptions :: CLI.Options -> AlgorithmOptions
 fromCLIOptions opts = AlgorithmOptions
   { algDiskMoves = CLI.optDiskMoves opts
   , algDiskMovesFactor = CLI.optAvoidDiskMoves opts
+  , algLongSolutionThreshold = CLI.optLongSolutionThreshold opts
+  , algLongSolutionsFactor = CLI.optAvoidLongSolutions opts
   , algInstanceMoves = CLI.optInstMoves opts
   , algRestrictedMigration = CLI.optRestrictedMigrate opts
   , algIgnoreSoftErrors = CLI.optIgnoreSoftErrors opts
diff --git a/src/Ganeti/HTools/CLI.hs b/src/Ganeti/HTools/CLI.hs
index 9effbdc..fc7ab8c 100644
--- a/src/Ganeti/HTools/CLI.hs
+++ b/src/Ganeti/HTools/CLI.hs
@@ -56,6 +56,8 @@ module Ganeti.HTools.CLI
   , oDataFile
   , oDiskMoves
   , oAvoidDiskMoves
+  , oLongSolutionThreshold
+  , oAvoidLongSolutions
   , oDiskTemplate
   , oDryRun
   , oSpindleUse
@@ -151,6 +153,14 @@ data Options = Options
   , optAvoidDiskMoves :: Double  -- ^ Allow only disk moves improving
  -- cluster score in more than
  -- optAvoidDiskMoves times
+  , optLongSolutionThreshold :: Int  -- ^ The threshold in seconds,
+ -- that defines long-time solutions
+  , optAvoidLongSolutions :: Double  -- ^ Allow only long-time solutions
+ -- whose K/N metric is more than
+ -- optAvoidLongSolutions, where K is
+ -- solution's gain in cluster metric
+ -- and N is an estimated time in seconds
+ -- to perform this solution.
   , optInstMoves   :: Bool   -- ^ Allow instance moves
   , optDiskTemplate :: Maybe DiskTemplate  -- ^ Override for the disk template
   , optSpindleUse  :: Maybe Int  -- ^ Override for the spindle usage
@@ -234,6 +244,8 @@ defaultOptions  = Options
   { optDataFile= Nothing
   , optDiskMoves   = True
   , optAvoidDiskMoves = 1.0
+  , optLongSolutionThreshold = 1000
+  , optAvoidLongSolutions = 0.0
   , optInstMoves   = True
   , optIndependentGroups = False
   , optAcceptExisting = False
@@ -370,6 +382,24 @@ oAvoidDiskMoves =
\ admit disk move during the step",
OptComplFloat)
 
+oLongSolutionThreshold :: OptType
+oLongSolutionThreshold =
+  (Option "" ["long-solution-threshold"]
+   (reqWithConversion (tryRead "long-time solution threshold")
+(\f opts -> Ok opts { optLongSolutionThreshold = f }) "FACTOR")
+   "specify the threshold in seconds, that defines long-time solutions",
+   OptComplInteger)
+
+oAvoidLongSolutions :: OptType
+oAvoidLongSolutions =
+  (Option "" ["avoid-long-solutions"]
+   (reqWithConversion (tryRead "long-time solutions avoiding factor")
+(\f opts -> Ok opts { optAvoidLongSolutions = f }) "FACTOR")
+   "K/N metric of the solution should be higher than FACTOR\
+   \ in order to admit it, where K is the solution's gain in cluster metric\
+   \ and N is the estimated time in seconds to perform it",
+   OptComplFloat)
+
 oMonD :: OptType
 oMonD =
   (Option "" ["mond"]
diff --git a/src/Ganeti/HTools/Program/Hbal.hs 
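The K/N rule from the commit message above can be sketched standalone. `acceptLongSolution` is an illustrative name; the real filter in patch 5/6 additionally applies the threshold and normalises against the optimal cluster score.

```haskell
-- Accept a balancing solution only if its gain-per-second (K/N) exceeds
-- the avoidance factor; a factor of 0 disables the check entirely.
acceptLongSolution :: Double  -- ^ avoidance factor
                   -> Double  -- ^ K: gain in cluster metric
                   -> Double  -- ^ N: estimated time in seconds
                   -> Bool
acceptLongSolution factor gain time =
  factor <= 0 || gain / time > factor
```

With a gain of 300 achieved in 100 seconds (K/N = 3), the solution passes a factor of 2 but is rejected at a factor of 5, which mirrors how raising `--avoid-long-solutions` shortens the accepted solutions in the shelltests of patch 6/6.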

[PATCH master 0/6] New balancing options implementation

2016-06-23 Thread meleodr
From: Daniil Leshchev 

The patchset introduces new command line options
("--long-solution-threshold" and "--avoid-long-solutions").
These give HBal the ability to avoid balancing solutions
that take a significant amount of time.

Daniil Leshchev (6):
  Add "long-solution-threshold" and "avoid-long-solutions" command-line
options
  Add bandwidth tags and bandwidth map fields into Node
  Add bandwidth tags extraction and parsing
  Add extraction network bandwidth data from tags
  Add long-time solutions filtering
  Add tests for 'avoid-long-solutions' and 'long-solution-threshold'
options

 src/Ganeti/HTools/AlgorithmParams.hs|   9 +++
 src/Ganeti/HTools/CLI.hs|  30 +++
 src/Ganeti/HTools/Cluster.hs| 103 +++-
 src/Ganeti/HTools/Loader.hs |  20 -
 src/Ganeti/HTools/Node.hs   |  30 +++
 src/Ganeti/HTools/Program/Hbal.hs   |   2 +
 src/Ganeti/HTools/Tags.hs   |  60 --
 src/Ganeti/HTools/Tags/Constants.hs |   5 ++
 test/data/htools/hbal-avoid-long-solutions.data |  17 
 test/hs/shelltests/htools-hbal.test |  32 
 10 files changed, 283 insertions(+), 25 deletions(-)
 create mode 100644 test/data/htools/hbal-avoid-long-solutions.data

-- 
1.9.1



[PATCH master 2/6] Add bandwidth tags and bandwidth map fields into Node

2016-06-23 Thread meleodr
From: Daniil Leshchev 

The bandwidth map stores data about the network speed between the
current node and a given node, keyed by its bandwidth tags.
---
 src/Ganeti/HTools/Node.hs | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/src/Ganeti/HTools/Node.hs b/src/Ganeti/HTools/Node.hs
index 6749568..4eb1576 100644
--- a/src/Ganeti/HTools/Node.hs
+++ b/src/Ganeti/HTools/Node.hs
@@ -60,6 +60,8 @@ module Ganeti.HTools.Node
   , setMigrationTags
   , setRecvMigrationTags
   , setLocationTags
+  , setBandwidthTags
+  , setBandwidthToLocation
   -- * Tag maps
   , addTags
   , delTags
@@ -98,6 +100,7 @@ module Ganeti.HTools.Node
   , mkNodeGraph
   , mkRebootNodeGraph
   , haveExclStorage
+  , calcBandwidthToNode
   ) where
 
 import Prelude ()
@@ -111,6 +114,7 @@ import qualified Data.IntMap as IntMap
 import Data.List (intercalate, foldl', delete, union, sortBy, groupBy)
 import qualified Data.Map as Map
 import Data.Ord (comparing)
+import Data.Maybe (mapMaybe)
 import qualified Data.Set as Set
 import Text.Printf (printf)
 
@@ -213,6 +217,10 @@ data Node = Node
-- to
   , locationScore :: Int -- ^ Sum of instance location and desired location
  -- scores
+  , bandwidthTags :: Set.Set String -- ^ Node's bandwidth tags
+  , bandwidthMap :: Map.Map String Int -- ^ Node's network bandwidth between
+-- current location and given location
+-- in Mbit per second
   , instanceMap :: Map.Map (String, String) Int -- ^ Number of instances with
 -- each exclusion/location tags
 -- pair
@@ -384,6 +392,8 @@ create name_init mem_t_init mem_n_init mem_f_init
, rmigTags = Set.empty
, locationTags = Set.empty
, locationScore = 0
+   , bandwidthTags = Set.empty
+   , bandwidthMap = Map.empty
, instanceMap = Map.empty
}
 
@@ -435,6 +445,15 @@ setRecvMigrationTags t val = t { rmigTags = val }
 setLocationTags :: Node -> Set.Set String -> Node
 setLocationTags t val = t { locationTags = val }
 
+-- | Set the network bandwidth tags
+setBandwidthTags :: Node -> Set.Set String -> Node
+setBandwidthTags t val = t { bandwidthTags = val }
+
+-- | Add network bandwidth to nodes with given bandwidth tag
+setBandwidthToLocation :: Node -> String -> Int -> Node
+setBandwidthToLocation t tag bandwidth = t { bandwidthMap = new_map }
+  where new_map = Map.insert tag bandwidth (bandwidthMap t)
+
 -- | Sets the unnaccounted memory.
 setXmem :: Node -> Int -> Node
 setXmem t val = t { xMem = val }
@@ -556,6 +575,17 @@ calcFmemOfflineOrForthcoming node allInstances =
  . filter (not . Instance.usesMemory)
  $ nodeInstances
 
+-- | Calculate the network bandwidth between two given nodes
+calcBandwidthToNode :: Node -> Node -> Maybe Int
+calcBandwidthToNode src dst =
+  case bndwths of
+[] -> Nothing
+_  -> Just $ minimum bndwths
+  where dstTags = Set.toList $ bandwidthTags dst
+srcMap = bandwidthMap src
+mapper = flip Map.lookup srcMap
+bndwths = mapMaybe mapper dstTags
+
 -- | Calculates the desired location score of an instance, given its primary
 -- node.
 getInstanceDsrdLocScore :: Node -- ^ the primary node of the instance
-- 
1.9.1
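
For readers less fluent in Haskell, the lookup in `calcBandwidthToNode` above can be sketched in Python. This is an illustrative stand-in, not Ganeti API: a plain dict plays the source node's `bandwidthMap` and a set plays the destination's `bandwidthTags`.

```python
def calc_bandwidth_to_node(src_bandwidth_map, dst_bandwidth_tags):
    """Mirror of calcBandwidthToNode: look up each of the destination's
    bandwidth tags in the source's bandwidth map and take the minimum,
    i.e. the narrowest matching link; None when no tag matches."""
    matches = [src_bandwidth_map[tag]
               for tag in dst_bandwidth_tags
               if tag in src_bandwidth_map]
    return min(matches) if matches else None

# A source node that knows two figures for the destination's tags:
# the smaller (inter-rack) figure wins.
calc_bandwidth_to_node({"r2": 1000, "dc1": 10000}, {"r2", "dc1"})  # 1000
```

Taking the minimum over all matching tags is the conservative choice: if several tags describe the path, the slowest declared link bounds the transfer time.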



Re: [PATCH master] New configuration option for the rbd user-id

2016-06-23 Thread 'Brian Foley' via ganeti-devel
On Thu, Jun 23, 2016 at 06:59:34AM -0700, David Mohr wrote:
>Hi Brian,
>this is the squashed commit of the actual code changes; without this
>the documentation update is slightly misleading ;-)

Woops. I guess I was a bit too quick off the mark!

>Thanks for applying it! I'll be back soon with more patches.

Great.

Thanks,
Brian.


Re: [PATCH master] New configuration option for the rbd user-id

2016-06-23 Thread 'Brian Foley' via ganeti-devel
On Thu, Jun 23, 2016 at 03:57:50PM +0200, David Mohr wrote:
> The user id is used by ceph to determine the keyring to use for
> authentication. By default the admin keyring is used, which may
> not be desirable. Example usage:
> 
>   $ gnt-cluster modify -D rbd:user-id=foobar
> 
> Signed-off-by: David Mohr 

As I mentioned before, we don't have a Ceph setup to test this against, but the
patch seems fine to me. However, it fails the testValidDiskparams test in
test/py/cmdlib/cluster_unittest.py. Could you take a look a this please?

Thanks,
Brian.

>  lib/storage/bdev.py |   44 +++-
>  src/Ganeti/Constants.hs |   18 +-
>  2 files changed, 44 insertions(+), 18 deletions(-)
> 
> diff --git a/lib/storage/bdev.py b/lib/storage/bdev.py
> index 1f95004..5a4772f 100644
> --- a/lib/storage/bdev.py
> +++ b/lib/storage/bdev.py
> @@ -886,6 +886,12 @@ class RADOSBlockDevice(base.BlockDev):
>  self.Attach()
>  
>@classmethod
> +  def MakeRbdCmd(cls, params, cmd):
> +if params.get(constants.RBD_USER_ID, ""):
> +  cmd.extend(["--id", str(params[constants.RBD_USER_ID])])
> +return [constants.RBD_CMD] + cmd
> +
> +  @classmethod
>def Create(cls, unique_id, children, size, spindles, params, excl_stor,
>   dyn_params, **kwargs):
>  """Create a new rbd device.
> @@ -903,8 +909,8 @@ class RADOSBlockDevice(base.BlockDev):
>  rbd_name = unique_id[1]
>  
>  # Provision a new rbd volume (Image) inside the RADOS cluster.
> -cmd = [constants.RBD_CMD, "create", "-p", rbd_pool,
> -   rbd_name, "--size", "%s" % size]
> +cmd = cls.MakeRbdCmd(params, ["create", "-p", rbd_pool, rbd_name,
> +  "--size", str(size)])
>  result = utils.RunCmd(cmd)
>  if result.failed:
>base.ThrowError("rbd creation failed (%s): %s",
> @@ -928,7 +934,7 @@ class RADOSBlockDevice(base.BlockDev):
>  self.Shutdown()
>  
>  # Remove the actual Volume (Image) from the RADOS cluster.
> -cmd = [constants.RBD_CMD, "rm", "-p", rbd_pool, rbd_name]
> +cmd = self.__class__.MakeRbdCmd(self.params, ["rm", "-p", rbd_pool, rbd_name])
>  result = utils.RunCmd(cmd)
>  if result.failed:
>base.ThrowError("Can't remove Volume from cluster with rbd rm: %s - 
> %s",
> @@ -987,7 +993,7 @@ class RADOSBlockDevice(base.BlockDev):
>return rbd_dev
>  
>  # The mapping doesn't exist. Create it.
> -map_cmd = [constants.RBD_CMD, "map", "-p", pool, name]
> +map_cmd = self.__class__.MakeRbdCmd(self.params, ["map", "-p", pool, name])
>  result = utils.RunCmd(map_cmd)
>  if result.failed:
>base.ThrowError("rbd map failed (%s): %s",
> @@ -1017,14 +1023,13 @@ class RADOSBlockDevice(base.BlockDev):
>  try:
># Newer versions of the rbd tool support json output formatting. Use it
># if available.
> -  showmap_cmd = [
> -constants.RBD_CMD,
> +  showmap_cmd = cls.MakeRbdCmd({}, [
>  "showmapped",
>  "-p",
>  pool,
>  "--format",
>  "json"
> -]
> +])
>result = utils.RunCmd(showmap_cmd)
>if result.failed:
>  logging.error("rbd JSON output formatting returned error (%s): %s,"
> @@ -1036,7 +1041,7 @@ class RADOSBlockDevice(base.BlockDev):
>  except RbdShowmappedJsonError:
># For older versions of rbd, we have to parse the plain / text output
># manually.
> -  showmap_cmd = [constants.RBD_CMD, "showmapped", "-p", pool]
> +  showmap_cmd = cls.MakeRbdCmd({}, ["showmapped", "-p", pool])
>result = utils.RunCmd(showmap_cmd)
>if result.failed:
>  base.ThrowError("rbd showmapped failed (%s): %s",
> @@ -1168,7 +1173,7 @@ class RADOSBlockDevice(base.BlockDev):
>  
>  if rbd_dev:
># The mapping exists. Unmap the rbd device.
> -  unmap_cmd = [constants.RBD_CMD, "unmap", "%s" % rbd_dev]
> +  unmap_cmd = self.__class__.MakeRbdCmd(self.params, ["unmap", str(rbd_dev)])
>result = utils.RunCmd(unmap_cmd)
>if result.failed:
>  base.ThrowError("rbd unmap failed (%s): %s",
> @@ -1212,8 +1217,8 @@ class RADOSBlockDevice(base.BlockDev):
>  new_size = self.size + amount
>  
>  # Resize the rbd volume (Image) inside the RADOS cluster.
> -cmd = [constants.RBD_CMD, "resize", "-p", rbd_pool,
> -   rbd_name, "--size", "%s" % new_size]
> +cmd = self.__class__.MakeRbdCmd(self.params, ["resize", "-p", rbd_pool,
> +   rbd_name, "--size", str(new_size)])
>  result = utils.RunCmd(cmd)
>  if result.failed:
>base.ThrowError("rbd resize failed (%s): %s",
> @@ -1248,16 +1253,17 @@ class RADOSBlockDevice(base.BlockDev):
>  self._UnmapVolumeFromBlockdev(self.unique_id)
>  
>  # Remove the actual Volume (Image) from the RADOS cluster.
> -cmd = [constants.RBD_CMD, "rm", "-p", rbd_pool, rbd_name]
> +cmd = 

Re: [PATCH master] New configuration option for the rbd user-id

2016-06-23 Thread David Mohr
Hi Brian,

this is the squashed commit of the actual code changes; without this the 
documentation update is slightly misleading ;-)

Thanks for applying it! I'll be back soon with more patches.

~David

On Thursday, June 23, 2016 at 3:57:52 PM UTC+2, David Mohr wrote:
>
> The user id is used by ceph to determine the keyring to use for 
> authentication. By default the admin keyring is used, which may 
> not be desirable. Example usage: 
>
>   $ gnt-cluster modify -D rbd:user-id=foobar 
>
> Signed-off-by: David Mohr  
> --- 
>  lib/storage/bdev.py |   44 
> +++- 
>  src/Ganeti/Constants.hs |   18 +- 
>  2 files changed, 44 insertions(+), 18 deletions(-) 
>


[PATCH master] New configuration option for the rbd user-id

2016-06-23 Thread David Mohr
The user id is used by ceph to determine the keyring to use for
authentication. By default the admin keyring is used, which may
not be desirable. Example usage:

  $ gnt-cluster modify -D rbd:user-id=foobar

Signed-off-by: David Mohr 
---
 lib/storage/bdev.py |   44 +++-
 src/Ganeti/Constants.hs |   18 +-
 2 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/lib/storage/bdev.py b/lib/storage/bdev.py
index 1f95004..5a4772f 100644
--- a/lib/storage/bdev.py
+++ b/lib/storage/bdev.py
@@ -886,6 +886,12 @@ class RADOSBlockDevice(base.BlockDev):
 self.Attach()
 
   @classmethod
+  def MakeRbdCmd(cls, params, cmd):
+if params.get(constants.RBD_USER_ID, ""):
+  cmd.extend(["--id", str(params[constants.RBD_USER_ID])])
+return [constants.RBD_CMD] + cmd
+
+  @classmethod
   def Create(cls, unique_id, children, size, spindles, params, excl_stor,
  dyn_params, **kwargs):
 """Create a new rbd device.
@@ -903,8 +909,8 @@ class RADOSBlockDevice(base.BlockDev):
 rbd_name = unique_id[1]
 
 # Provision a new rbd volume (Image) inside the RADOS cluster.
-cmd = [constants.RBD_CMD, "create", "-p", rbd_pool,
-   rbd_name, "--size", "%s" % size]
+cmd = cls.MakeRbdCmd(params, ["create", "-p", rbd_pool, rbd_name,
+  "--size", str(size)])
 result = utils.RunCmd(cmd)
 if result.failed:
   base.ThrowError("rbd creation failed (%s): %s",
@@ -928,7 +934,7 @@ class RADOSBlockDevice(base.BlockDev):
 self.Shutdown()
 
 # Remove the actual Volume (Image) from the RADOS cluster.
-cmd = [constants.RBD_CMD, "rm", "-p", rbd_pool, rbd_name]
+cmd = self.__class__.MakeRbdCmd(self.params, ["rm", "-p", rbd_pool, rbd_name])
 result = utils.RunCmd(cmd)
 if result.failed:
   base.ThrowError("Can't remove Volume from cluster with rbd rm: %s - %s",
@@ -987,7 +993,7 @@ class RADOSBlockDevice(base.BlockDev):
   return rbd_dev
 
 # The mapping doesn't exist. Create it.
-map_cmd = [constants.RBD_CMD, "map", "-p", pool, name]
+map_cmd = self.__class__.MakeRbdCmd(self.params, ["map", "-p", pool, name])
 result = utils.RunCmd(map_cmd)
 if result.failed:
   base.ThrowError("rbd map failed (%s): %s",
@@ -1017,14 +1023,13 @@ class RADOSBlockDevice(base.BlockDev):
 try:
   # Newer versions of the rbd tool support json output formatting. Use it
   # if available.
-  showmap_cmd = [
-constants.RBD_CMD,
+  showmap_cmd = cls.MakeRbdCmd({}, [
 "showmapped",
 "-p",
 pool,
 "--format",
 "json"
-]
+])
   result = utils.RunCmd(showmap_cmd)
   if result.failed:
 logging.error("rbd JSON output formatting returned error (%s): %s,"
@@ -1036,7 +1041,7 @@ class RADOSBlockDevice(base.BlockDev):
 except RbdShowmappedJsonError:
   # For older versions of rbd, we have to parse the plain / text output
   # manually.
-  showmap_cmd = [constants.RBD_CMD, "showmapped", "-p", pool]
+  showmap_cmd = cls.MakeRbdCmd({}, ["showmapped", "-p", pool])
   result = utils.RunCmd(showmap_cmd)
   if result.failed:
 base.ThrowError("rbd showmapped failed (%s): %s",
@@ -1168,7 +1173,7 @@ class RADOSBlockDevice(base.BlockDev):
 
 if rbd_dev:
   # The mapping exists. Unmap the rbd device.
-  unmap_cmd = [constants.RBD_CMD, "unmap", "%s" % rbd_dev]
+  unmap_cmd = self.__class__.MakeRbdCmd(self.params, ["unmap", str(rbd_dev)])
   result = utils.RunCmd(unmap_cmd)
   if result.failed:
 base.ThrowError("rbd unmap failed (%s): %s",
@@ -1212,8 +1217,8 @@ class RADOSBlockDevice(base.BlockDev):
 new_size = self.size + amount
 
 # Resize the rbd volume (Image) inside the RADOS cluster.
-cmd = [constants.RBD_CMD, "resize", "-p", rbd_pool,
-   rbd_name, "--size", "%s" % new_size]
+cmd = self.__class__.MakeRbdCmd(self.params, ["resize", "-p", rbd_pool,
+   rbd_name, "--size", str(new_size)])
 result = utils.RunCmd(cmd)
 if result.failed:
   base.ThrowError("rbd resize failed (%s): %s",
@@ -1248,16 +1253,17 @@ class RADOSBlockDevice(base.BlockDev):
 self._UnmapVolumeFromBlockdev(self.unique_id)
 
 # Remove the actual Volume (Image) from the RADOS cluster.
-cmd = [constants.RBD_CMD, "rm", "-p", rbd_pool, rbd_name]
+cmd = self.__class__.MakeRbdCmd(self.params, ["rm", "-p", rbd_pool, rbd_name])
 result = utils.RunCmd(cmd)
 if result.failed:
   base.ThrowError("Can't remove Volume from cluster with rbd rm: %s - %s",
   result.fail_reason, result.output)
 
 # We use "-" for importing from stdin
-return [constants.RBD_CMD, "import",
+return self.__class__.MakeRbdCmd(self.params, [
+"import",
 "-p", rbd_pool,
-"-", rbd_name]
+"-", rbd_name])
 
   def Export(self):
 
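The behaviour of the new `MakeRbdCmd` helper is easy to see in isolation. The following standalone sketch inlines the two constants (they stand in for `constants.RBD_CMD` and `constants.RBD_USER_ID`; this is not the real Ganeti module) and shows that `--id` is appended only when a non-empty user id is configured:

```python
RBD_CMD = "rbd"          # stand-in for constants.RBD_CMD
RBD_USER_ID = "user-id"  # stand-in for constants.RBD_USER_ID

def make_rbd_cmd(params, cmd):
    # Append "--id <user>" only when a non-empty user id is configured,
    # then prepend the rbd binary itself.
    if params.get(RBD_USER_ID, ""):
        cmd.extend(["--id", str(params[RBD_USER_ID])])
    return [RBD_CMD] + cmd

make_rbd_cmd({}, ["showmapped", "-p", "pool"])
# → ['rbd', 'showmapped', '-p', 'pool']
make_rbd_cmd({"user-id": "foobar"}, ["rm", "-p", "pool", "vol0"])
# → ['rbd', 'rm', '-p', 'pool', 'vol0', '--id', 'foobar']
```

Note that the call sites which pass `{}` (the `showmapped` paths) therefore keep today's behaviour of using the default admin keyring.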

Re: [PATCH stable-2.17] Prevent the watcher from submitting too many jobs

2016-06-23 Thread 'Brian Foley' via ganeti-devel
On Thu, Jun 23, 2016 at 12:57:01PM +0100, 'Federico Morg Pareschi' via ganeti-devel wrote:
> When the watcher runs on each node group, if it can obtain the group
> lock, it submits a GROUP_VERIFY_DISKS job for each group. This happens
> every 5 minutes due to a cron job. This patch stops the watcher from
> submitting unnecessary verify disks jobs if there are some already
> pending in the queue to prevent job congestion.
> 
> Signed-off-by: Federico Morg Pareschi 

Great. LGTM except for a tiny nit and a couple of lint warnings. I'll fix those
up and push.

Thanks,
Brian.

> +  existing_jobs = _GetPendingVerifyDisks(cl, uuid)
> +  if existing_jobs:
> +logging.info("There are verify disks jobs already pending (%s), skipping "
> + "VerifyDisks step for %s." % (existing_jobs, uuid))
> +return

This will log something like "...jobs already pending (['68040']), skipping..."
We can use utils.CommaJoin to avoid the extra punctuation.

>op = opcodes.OpGroupVerifyDisks(
>  group_name=uuid, priority=constants.OP_PRIO_LOW)
>op.reason = [(constants.OPCODE_REASON_SRC_WATCHER,
> -- 
> 2.8.0.rc3.226.g39d4020
> 
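
To illustrate the nit: `%s` on a list interpolates its `repr`, brackets and quotes included, while a comma-join of the elements yields clean output. The helper below only approximates Ganeti's `utils.CommaJoin`:

```python
def comma_join(names):
    # Rough stand-in for Ganeti's utils.CommaJoin.
    return ", ".join(str(n) for n in names)

existing_jobs = ["68040"]
"pending (%s)" % existing_jobs              # "pending (['68040'])"
"pending (%s)" % comma_join(existing_jobs)  # "pending (68040)"
```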


Re: [PATCH master] Update documentation for rbd:user-id

2016-06-23 Thread 'Brian Foley' via ganeti-devel
On Thu, Jun 23, 2016 at 02:05:30PM +0200, David Mohr wrote:
> 
> Signed-off-by: David Mohr 

Thanks. LGTM. I'll push it with some minor formatting tweaks.

Cheers,
Brian.


[PATCH master] Update documentation for rbd:user-id

2016-06-23 Thread David Mohr

Signed-off-by: David Mohr 
---
 doc/design-ceph-ganeti-support.rst |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/doc/design-ceph-ganeti-support.rst b/doc/design-ceph-ganeti-support.rst
index 7ec865c..9bd939c 100644
--- a/doc/design-ceph-ganeti-support.rst
+++ b/doc/design-ceph-ganeti-support.rst
@@ -72,7 +72,14 @@ Updated commands
   $ gnt-instance info
 
 ``access:userspace/kernelspace`` will be added to Disks category. This
-output applies to KVM based instances only.
+output applies to KVM based instances only.::
+
+  $ gnt-cluster modify -D rbd:user-id=foobar
+
+The user id for ceph authentication is an optional setting. If it is not
+provided, then no special option is passed to ceph. If it is provided,
+then all ceph commands are run with the ``--user`` option and the
+configured username.
 
 Ceph configuration on Ganeti nodes
 ==
-- 
1.7.9.5



[PATCH stable-2.17] Prevent the watcher from submitting too many jobs

2016-06-23 Thread 'Federico Morg Pareschi' via ganeti-devel
When the watcher runs on each node group, if it can obtain the group
lock, it submits a GROUP_VERIFY_DISKS job for each group. This happens
every 5 minutes due to a cron job. This patch stops the watcher from
submitting unnecessary verify disks jobs if there are some already
pending in the queue to prevent job congestion.

Signed-off-by: Federico Morg Pareschi 
---
 lib/watcher/__init__.py | 21 +
 1 file changed, 21 insertions(+)

diff --git a/lib/watcher/__init__.py b/lib/watcher/__init__.py
index 4e946b3..cfc23e1 100644
--- a/lib/watcher/__init__.py
+++ b/lib/watcher/__init__.py
@@ -344,11 +344,32 @@ def _CheckForOfflineNodes(nodes, instance):
   """
   return compat.any(nodes[node_name].offline for node_name in instance.snodes)
 
+def _GetPendingVerifyDisks(cl, uuid):
+  """Checks if there are any currently running or pending group verify jobs and
+  if so, returns their id.
+
+  """
+  qfilter = qlang.MakeSimpleFilter("status",
+frozenset([constants.JOB_STATUS_RUNNING,
+   constants.JOB_STATUS_QUEUED,
+   constants.JOB_STATUS_WAITING]))
+  qresult = cl.Query(constants.QR_JOB, ["id", "summary"], qfilter)
+
+  ids = [jobid for ((_, jobid), (_, (job, ))) in qresult.data
+ if job == ("GROUP_VERIFY_DISKS(%s)" % uuid)]
+  return ids
 
 def _VerifyDisks(cl, uuid, nodes, instances):
   """Run a per-group "gnt-cluster verify-disks".
 
   """
+
+  existing_jobs = _GetPendingVerifyDisks(cl, uuid)
+  if existing_jobs:
+logging.info("There are verify disks jobs already pending (%s), skipping "
+ "VerifyDisks step for %s." % (existing_jobs, uuid))
+return
+
   op = opcodes.OpGroupVerifyDisks(
 group_name=uuid, priority=constants.OP_PRIO_LOW)
   op.reason = [(constants.OPCODE_REASON_SRC_WATCHER,
-- 
2.8.0.rc3.226.g39d4020
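
The shape of `_GetPendingVerifyDisks` is worth spelling out: each query row pairs a `(status, value)` tuple per requested field, and only rows whose summary names this group's verify-disks job survive. A simplified sketch with fabricated row data (not real luxi query output):

```python
def get_pending_verify_disks(rows, uuid):
    # rows mimic query results: ((status, job_id), (status, (summary,)))
    wanted = "GROUP_VERIFY_DISKS(%s)" % uuid
    return [job_id for ((_, job_id), (_, (summary,))) in rows
            if summary == wanted]

rows = [((0, "68040"), (0, ("GROUP_VERIFY_DISKS(uuid-a)",))),
        ((0, "68041"), (0, ("INSTANCE_REBOOT(web1)",)))]
get_pending_verify_disks(rows, "uuid-a")  # ['68040']
get_pending_verify_disks(rows, "uuid-b")  # []
```

Because the status filter in the patch covers running, queued and waiting jobs, a watcher run that lands while the previous verify is still in any of those states submits nothing, which is exactly the congestion fix described in the commit message.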