Because of instance locks, after submitting a repair job we were not able to set the "pending" tag until at least the first opcode of the job had finished. Introduce a small delay in the repair job to allow the subsequent TAGS_SET job to finish immediately, which makes the operation faster from the user's point of view.
Make the duration of the delay configurable with --job-delay; if zero, the
TestDelay opcode is not inserted at all.

Signed-off-by: Dato Simó <d...@google.com>
---

> LGTM-ish. I was wondering two things:
>
> - should we hardcode the 10 seconds there? what if we find that on a big
>   cluster, we need for example 30 seconds (for whatever reason)? maybe
>   we should introduce this delay as a command line option, defaulting to
>   10?
>
> - following from the above: should we make it so that if the given delay
>   is 0, then the extra opcode is not prepended? so that we can test live
>   both with and without it?

Both seem good suggestions to me, thanks. Please see the updated patch below.
(I didn't deem it necessary to split the changes to CLI.hs into a separate
commit, but let me know if you'd prefer it differently.)

Thanks,

 src/Ganeti/HTools/CLI.hs           | 12 ++++++++++
 src/Ganeti/HTools/Program/Harep.hs | 45 +++++++++++++++++++++++++++++++++-----
 2 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/src/Ganeti/HTools/CLI.hs b/src/Ganeti/HTools/CLI.hs
index 6e40df7..00f8324 100644
--- a/src/Ganeti/HTools/CLI.hs
+++ b/src/Ganeti/HTools/CLI.hs
@@ -55,6 +55,7 @@ module Ganeti.HTools.CLI
   , oGroup
   , oIAllocSrc
   , oInstMoves
+  , oJobDelay
   , genOLuxiSocket
   , oLuxiSocket
   , oMachineReadable
@@ -118,6 +119,7 @@ data Options = Options
   , optIAllocSrc :: Maybe FilePath -- ^ The iallocation spec
   , optSelInst :: [String]         -- ^ Instances to be excluded
   , optLuxi :: Maybe FilePath      -- ^ Collect data from Luxi
+  , optJobDelay :: Double          -- ^ Delay before executing first job
   , optMachineReadable :: Bool     -- ^ Output machine-readable format
   , optMaster :: String            -- ^ Collect data from RAPI
   , optMaxLength :: Int            -- ^ Stop after this many steps
@@ -162,6 +164,7 @@ defaultOptions = Options
   , optIAllocSrc = Nothing
   , optSelInst = []
   , optLuxi = Nothing
+  , optJobDelay = 10
   , optMachineReadable = False
   , optMaster = ""
   , optMaxLength = -1
@@ -326,6 +329,15 @@ oIAllocSrc =
    "Specify an iallocator spec as the cluster data source",
    OptComplFile)
 
+oJobDelay :: OptType
+oJobDelay =
+  (Option "" ["job-delay"]
+   (reqWithConversion (tryRead "job delay")
+    (\d opts -> Ok opts { optJobDelay = d }) "SECONDS")
+   "insert this much delay before the execution of repair jobs\
+   \ to allow the tool to continue processing instances",
+   OptComplFloat)
+
 genOLuxiSocket :: String -> OptType
 genOLuxiSocket defSocket =
   (Option "L" ["luxi"]
diff --git a/src/Ganeti/HTools/Program/Harep.hs b/src/Ganeti/HTools/Program/Harep.hs
index 3e0345d..1e1c8a4 100644
--- a/src/Ganeti/HTools/Program/Harep.hs
+++ b/src/Ganeti/HTools/Program/Harep.hs
@@ -61,6 +61,7 @@ options = do
   luxi <- oLuxiSocket
   return
     [ luxi
+    , oJobDelay
     ]
 
 arguments :: [ArgCompletion]
@@ -347,9 +348,12 @@ detectBroken nl inst =
   -- DTFile, DTSharedFile, DTBlock, DTRbd, DTExt.
 
 -- | Perform the suggested repair on an instance if its policy allows it.
-doRepair :: L.Client -> InstanceData -> (AutoRepairType, [OpCode])
-         -> IO InstanceData
-doRepair client instData (rtype, opcodes) =
+doRepair :: L.Client                   -- ^ The Luxi client
+         -> Maybe Double               -- ^ Delay to insert before the first repair opcode
+         -> InstanceData               -- ^ The instance data
+         -> (AutoRepairType, [OpCode]) -- ^ The repair job to perform
+         -> IO InstanceData            -- ^ The updated instance data
+doRepair client delay instData (rtype, opcodes) =
   let inst = arInstance instData
       ipol = Instance.arPolicy inst
       iname = Instance.name inst
@@ -373,9 +377,36 @@ doRepair client instData (rtype, opcodes) =
       else do
         putStrLn ("Executing " ++ show rtype ++ " repair on " ++ iname)
 
+        -- After submitting the job, we must write an autorepair:pending tag
+        -- that includes the repair job IDs so that they can be checked later.
+        -- One problem we run into is that the repair job immediately grabs
+        -- locks for the affected instance, and the subsequent TAGS_SET job is
+        -- blocked, introducing an unnecessary delay for the end-user. One
+        -- alternative would be not to wait for the completion of the TAGS_SET
+        -- job, contrary to what commitChange normally does; but we insist on
+        -- waiting for the tag to be set so as to abort in case of failure,
+        -- because the cluster is left in an invalid state in that case.
+        --
+        -- The proper solution (in 2.9+) would be not to use tags for storing
+        -- autorepair data, or to make the TAGS_SET opcode not grab an
+        -- instance's locks (if that's deemed safe). In the meantime, we
+        -- introduce an artificial delay in the repair job (via a TestDelay
+        -- opcode) so that once we have the job ID, the TAGS_SET job can
+        -- complete before the repair job actually grabs the locks. (Please
+        -- note that this is not about synchronization, but merely about
+        -- speeding up the execution of the harep tool. If this TestDelay
+        -- opcode is removed, the program is still correct.)
+        let opcodes' = case delay of
+              Just d  -> OpTestDelay { opDelayDuration = d
+                                     , opDelayOnMaster = True
+                                     , opDelayOnNodes = []
+                                     , opDelayRepeat = fromJust $ mkNonNegative 0
+                                     } : opcodes
+              Nothing -> opcodes
+
         uuid <- newUUID
         time <- getClockTime
-        jids <- submitJobs [map wrapOpCode opcodes] client
+        jids <- submitJobs [map wrapOpCode opcodes'] client
 
         case jids of
           Bad e -> exitErr e
@@ -418,8 +449,12 @@ main opts args = do
 
   -- Third step: create repair jobs for broken instances that are in ArHealthy.
   let maybeRepair c (i, r) = maybe (return i) (repairHealthy c i) r
+      jobDelay = optJobDelay opts
+      repairDelay = if jobDelay > 0  -- Could use "mfilter (> 0)" after
+                      then Just jobDelay  -- dropping support for 6.12.
+                      else Nothing
       repairHealthy c i = case arState i of
-                            ArHealthy _ -> doRepair c i
+                            ArHealthy _ -> doRepair c repairDelay i
                             _ -> const (return i)
 
   _unused_repairDone <- bracket (L.getClient master) L.closeClient $
-- 
1.8.0.2-x20-1
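Stripped of the Luxi plumbing, the opcode-prepending decision in doRepair is a pure function over the opcode list: prepend a TestDelay when a delay is given, otherwise leave the list alone. A standalone sketch of that logic, using toy stand-in types (Op, TestDelay, Repair and prependDelay are illustrative names, not Ganeti's real OpCode constructors):

```haskell
-- Toy stand-ins for Ganeti's OpCode type, just to keep the sketch
-- self-contained; the real patch builds an OpTestDelay record instead.
data Op = TestDelay Double  -- artificial delay, in seconds
        | Repair String     -- placeholder for an actual repair opcode
  deriving (Eq, Show)

-- Prepend a TestDelay opcode only when a delay was requested.
prependDelay :: Maybe Double -> [Op] -> [Op]
prependDelay (Just d) ops = TestDelay d : ops
prependDelay Nothing  ops = ops

main :: IO ()
main = do
  -- With a delay, the TestDelay opcode runs first, giving the TAGS_SET
  -- job a window to complete before instance locks are grabbed.
  print (prependDelay (Just 10) [Repair "inst1"])
  -- With no delay, the repair opcodes are submitted unchanged.
  print (prependDelay Nothing [Repair "inst1"])
```

Removing the prepended opcode never changes which repairs run, which is why the patch comment can claim the program stays correct without it.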
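The in-line comment in the last hunk notes that the if/then/else computing repairDelay could collapse to `mfilter (> 0)` once GHC 6.12 support is dropped. A minimal sketch of that equivalence (repairDelay here is a standalone function for illustration, not the patch's local binding):

```haskell
import Control.Monad (mfilter)

-- mfilter keeps a Just only when the predicate holds, so a zero (or
-- negative) --job-delay value collapses to Nothing, which in turn
-- suppresses the TestDelay opcode.
repairDelay :: Double -> Maybe Double
repairDelay jobDelay = mfilter (> 0) (Just jobDelay)

main :: IO ()
main = do
  print (repairDelay 0)   -- Nothing: no TestDelay opcode is prepended
  print (repairDelay 10)  -- Just 10.0: the delay is passed to doRepair
```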