Hello,

I'm looking for a bit of help (ok, a lot) the speed up my program which I use to build a calltree out of an annotated program execution trace. To give you an idea about the sluggishness at the moment, for a trace containing 70MB, it has been running for about 10 hours straight (AMD Athlon XP (Barton) 2600+).

The trace contains lines made up of a number of fields:

C 4 1000 1000000
C 4 1001 1000200
R 4 1001 1003045
R 4 1000 1003060

C indicates a function entrypoint (call), R indicates a function exitpoint (return). The second field indicates which thread is executing the function, the third field denotes the function id, the last field contains a performance counter value. As you can see, numbering each line with a pre-order and a post-order number yields a list that can be transformed easily into a tree, which can then be manipulated. The first goal is to build the tree. This is done in the following code:


data ParserState = ParserState { methodStack :: !ThreadMap
                               , methodQueue :: !ThreadMap
                               , pre         :: !Integer
                               , post        :: !Integer
                               , methodMap   :: !MethodMap
                               , currentThread :: !Integer
                               } deriving (Show)

initialParserState :: ParserState
initialParserState = ParserState e e 0 0 e 0
  where e = M.empty :: Map Integer a

readInteger :: B.ByteString -> Integer
readInteger = fromIntegral . fst . fromJust . B.readInt


parseTraceMonadic :: [B.ByteString] -> ParserState
parseTraceMonadic ss = state { methodQueue = M.map reverse (methodQueue state) } where state = execState (mapM_ (\x -> modify (updateState x) >> get >>= (`seq` return ())) ss) initialParserState


updateState :: B.ByteString -> ParserState -> ParserState
updateState s state = case (B.unpack $ head fields) of
  "M" -> updateStateMethod     fields state
  "E" -> updateStateException  fields state
  "C" -> updateStateEntry      fields state
  "R" -> updateStateExit       fields state
  where fields = B.splitWith (== ' ') s


updateStateMethod :: [B.ByteString] -> ParserState -> ParserState
updateStateMethod (_:methodId:methodName:_) state = state { methodMap = M.insert (readInteger methodId) methodName (methodMap state) }

updateStateException :: [B.ByteString] -> ParserState -> ParserState
updateStateException _ state = state

updateStateEntry :: [B.ByteString] -> ParserState -> ParserState
updateStateEntry (_:ss) state = {-Debug.Trace.trace ("before: " ++ (show state) ++ "\nafter: " ++ (show newstate)) $-} newstate where newstate = state { methodStack = updateMap thread (methodStack state) (\x y -> Just (x:y)) (pre state, 0, method)
                                      , pre = ((+1) $! (pre state))
                                      }
        method = mkMethod (Prelude.map B.unpack ss)
        thread = Method.thread method

updateStateExit :: [B.ByteString] -> ParserState -> ParserState
updateStateExit (_:ss) state = {-Debug.Trace.trace ("before: " ++ (show state)) $-} case updateMethod m (Prelude.map B.unpack ss) of Just um -> state { methodStack = M.update (\x -> Just (tail x)) thread (methodStack state) , methodQueue = updateMap thread (methodQueue state) (\x y -> Just (x:y)) (pre_, post state, um) , post = ((+1) $! (post state))
                                                    }
Nothing -> error $ "Top of the stack is mismatching! Expected " ++ (show m) ++ " yet got " ++ (show ss) ++ "\n" ++ (show state)
  where method = mkMethod (Prelude.map B.unpack ss)
        thread = Method.thread method
        (pre_, _, m) = case M.lookup thread (methodStack state) of
                          Just stack -> head stack
Nothing -> error $ "Method stack has not been found for thread " ++ (show thread) ++ " -> fields: " ++ (show ss)


updateMap key map f value = case M.member key map of
                              True  -> M.update (f value) key map
                              False -> M.insert key [value] map

As you can see, the state is updated for each entry, a stack being maintained with methods we've seen up to now, and a list with methods that have received both pre and post order numbers, and of which both the entry and exit point have been parsed. I am using a ByteString, because using a plain String is causing the program to grab far too much heap.

The mkMethod yields a Method like this:


data Method = Method { mid :: Integer
                     , thread :: Integer
                     , instruction_entry :: Integer
                     , instruction_exit :: Integer
                     } deriving (Eq, Show)

eM = Method 0 0 0 0

mkMethod :: [String] -> Method
mkMethod s = let [_thread, _id, _entry] = take 3 $ map (read :: String -> Integer) s
             in [_thread, _id, _entry] `seq` Method { mid = _id
                                                    , thread = _thread
, instruction_entry = _entry , instruction_exit = 0
                                                    }

updateMethod :: Method -> [String] -> Maybe Method
updateMethod (Method mid thread instruction_entry instruction_exit ) s
| thread == _thread && mid == _id = _exit `seq` Just Method { mid = mid , thread = thread , instruction_entry = instruction_entry , instruction_exit = _exit
                                                              }
  | otherwise = Nothing
where [_thread, _id, _exit] = take 3 $ map (read :: String -> Integer) s


Any suggestions for improving this code?

Thanks,

Andy


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to