On Fri, May 07, 2010 at 03:36:11 -0400, Max Battcher wrote: > So this is a fun bundle of patches..??? After the "get it done" but "try > not to touch/break anything important" first attempt at Graphing, this > bundle of patches factors out the common parts of tabulateRepo/graphRepo.
Yay! Apologies for the very late review. Luckily there's no hold-up as darcs-benchmark operates on a more liberal push first review when you can basis. This looks like a nice refactor. I didn't have anything to say so I just rambled upon about stylistic niggles which may just be personal taste anyway. Sorry! Data format and converter for "square" repo tables -------------------------------------------------- > +data RepoTable = RepoTable { rtRepo :: String > + , rtColumns :: [String] > + , rtRows :: [String] > + , rtTable :: [(TimeUnit, [Maybe MemTimeOutput])] > + } If I understand correctly, one of these guys corresponds to all the data for a given repo (eg. GHC). This is one step higher than having a Table for each rtColumns and rtRows refer to the row/column titles rtTable is a list of rows (and each sublist is a cell?) Each row uses the same units for comparability. Once we have these RepoTables we should probably avoid trying to do anything with them that's not straightforward rendering of some sort. I'm really prone to mistakes like sorting row/column headers (say according to fastest system or whatever) and then forgetting to actually sort the data points themselves or vice-versa. This is why I tend to instead manipulate "fatty" cells, where each cell contains its own row/column information. Much more Eric-proof that way. I do all my sorting and grouping on the fat cells and only remove the headery bits on the formatting end. > +-- Reformat (test, output) array as a square table > +repoTables :: [Benchmark ()] -> [(Test a, Maybe MemTimeOutput)] -> > [RepoTable] > +repoTables benchmarks results = [RepoTable { rtRepo = reponame (head repo) > + , rtColumns = columns > + , rtRows = rows > + , rtTable = table repo > + } | repo <- reposResults] Safe to use head here because repoResults is the result of groupBy. This may be cleaner as [ ... | repo <- groupBy sameRepo results ] which avoids keeps the groupBy-makes-non-empty-lists knowledge and the consume-a-non-empty-list assumption near each other. (On the other hand, it may be harder to read, sigh tradeoffs). > + where > + reposResults = groupBy sameRepo results -- ASSUME: ordered by repository > + sameRepo (Test _ tr1 _, _) (Test _ tr2 _, _) = trName tr1 == trName tr2 > + reponame (Test _ tr _, _) = trName tr > + -- > + rows = map description . filter hasBenchmark $ benchmarks > + hasBenchmark b = any (\ (Test tb _ _, _) -> description tb == description > b) results > + -- > + columns = map mkColName columnInfos > + columnInfos = nub [ (b, trName tr) | (Test _ tr b, _) <- results ] > + mkColName (TestBinary b, tname) = > + let v = nameToVariant tname > + prefix = case vId v of > + DefaultVariant -> "" > + _-> vSuffix v ++ " " > + in prefix ++ cutdown b > + cutdown d | "darcs-" `isPrefixOf` d = cutdown (drop 6 d) > + | takeExtension d == ".exe" = dropExtension d > + | otherwise = d By the way, check out http://wiki.darcs.net/Benchmarks/GhLinux It could be worth doing some repo name clean up, maybe with takeFileName It has "./darcs-2.3.1", which I think we could maybe prettify a bit. > + -- > + table repo = [(tu (tableRow row), map justMemTimeOutput $ tableRow row) > + | row <- rows] I'm not sure a list-comprehesion is the most readable route here. How about a helper function like mkRow or rowInfo and a map? > + where > + tableRow row = [find (match row col) repo | col <- columnInfos] > + getTime (Just (_, Just mt)) = Just (mtTimeMean mt) > + getTime _ = Nothing > + tu row = case mapMaybe getTime row of > + [] -> Milliseconds > + xs -> appropriateUnit (minimum xs) > + match bench (binary, name) (Test bench' tr binary', _) = > + bench == description bench' && binary == binary' && trName tr == name > + justMemTimeOutput (Just (_, mt)) = mt > + justMemTimeOutput Nothing = Nothing Move graphRepo to RepoTable --------------------------- Seems better for me to just paste in the new code than work with the diff | graphRepoMemory :: RepoTable -> [String] | graphRepoMemory repo = map graphUrl rows | where | rows = [(title rowname, rtColumns repo, graphdata rowdata) | | (rowname, (tu, rowdata)) <- zip (rtRows repo) (rtTable repo)] | title rowname = rowname ++ " (MiB)" | graphdata rowdata = map selectMemory rowdata | selectMemory (Just mt) = fromRational (mtMemMean mt) / (1024 * 1024) | selectMemory _ = 0.0 Each row (benchmark) corresponds to a single graph. It's definitely a lot clearer what's going on here now that you've broken it down. I tend to use a lot more point-free style these days (eg graphdata = map selectMemory), but maybe when I'm older and grumpier I'll swing back. > + t_graphs = map (repoTuple graphRepoTime) tables > + m_graphs = map (repoTuple graphRepoMemory) tables > + showG (r,gs) = intercalate "\n" $ [ r > + , replicate (length r) '-' > + , "" > + ] ++ (map imgDirective gs) ++ [""] > + imgDirective = (".. image:: " ++) Heh, due to cherry picking this patch is incomplete (missing repoTuple). But that's the way things go sometimes... Refactor tabulateRepo to use RepoTable -------------------------------------- Seems like a straightforward refactor. > + repoTuple tabulate repo = (rtRepo repo, tabulate repo) > + t_tables = map (repoTuple $ tabulateRepo formatTimeResult) tables > + m_tables = map (repoTuple $ tabulateRepo formatMemoryResult) tables I actually would have probably found (repoTuple (tabulateRepo formatTimeResult)) a bit easier to read (because the '$' primes me to expect a non-function value at the other end). Please don't take my stylistic comments too seriously! There's a lot of this kind of stuff that could just go either way and loop around forever. repoTables should use trCoreName; force sort -------------------------------------------- > It seems better to actually sort the input results than to assume they > are sorted. Context: for variant repos, eg. op-darcs-2.3.1, the "core" name would be that of the original unvarnished darcs (darcs-2.3.1). > - reposResults = groupBy sameRepo results -- ASSUME: ordered by repository > - sameRepo (Test _ tr1 _, _) (Test _ tr2 _, _) = trName tr1 == trName tr2 > - reponame (Test _ tr _, _) = trName tr > + reposResults = groupBy sameRepo $ sortBy repoOrder results > + sameRepo (Test _ tr1 _, _) (Test _ tr2 _, _) = trCoreName tr1 == > trCoreName tr2 > + reponame (Test _ tr _, _) = trCoreName tr sameRepo could conceivably use reponame (on the other hand, maybe ((==) `on` reponame) sort of the kind of just-because-you-can-doesn't-mean-you-should code I might write) reponame could conceivably be eliminated if we added general use field labels (if nothing else, the Gentle Introduction to Haskell is useful for learning what to call things. I was just about to say "field accessor thingies"): data Test a = Test { tBenchmark :: Benchmark a , tRepo :: TestRepo ... } > + repoOrder a b = compare (reponame a) (reponame b) This can be written as compare `on` reponame on :: (b -> b -> c) -> (a -> b) -> a -> a -> c Source (*) `on` f = \x y -> f x * f y. (The kind of thing one picks up through cultural osmosis) Refactor Run.hs to use updated tabulateRepo ------------------------------------------- > - case tabulate formatTimeResult res of > - [] -> return () > - [(_,t)] -> echo_n $ TR.render id id id t > - _ -> error "Not expecting more than one table for a repo and its > variants" > + let tables = repoTables benchmarks res > + if length tables == 1 then > + echo_n $ TR.render id id id $ tabulateRepo formatTimeResult (head > tables) > + else error "Not expecting more than one table for a repo and its > variants" Note that checking length and doing head is slightly less safe than just relying on pattern-matching with case (albeit perhaps more readable). It's unlikely that you'll break apart if length tables == 1 then foo (head tables), which is why I say "slightly" Also: do we never get the empty list? -- Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow> PGP Key ID: 08AC04F9
signature.asc
Description: Digital signature
_______________________________________________ darcs-users mailing list darcs-users@darcs.net http://lists.osuosl.org/mailman/listinfo/darcs-users