Hi,

Thanks to everyone who reviewed my code and submitted comments the
first time!  I've updated the code and transitioned to using the State
monad.  Perhaps controversially, I've continued to use |> in a bunch
of places that the monad didn't get rid of because I think it's more
readable, but I'm still open for argument on this topic.  Using the
monad didn't make the code any shorter, but it kind of "felt" better,
once I figured out how to use it.  Figuring out how to use execState
to get into and out of "monad-ity" was the hardest part, because it's
mentioned in so few of the examples.  I think it's fair to say, of
course, that using a monad has increased the complexity, but I can
still read what I wrote.  I've posted my code below for additional
comments.

Thanks again!
-jj

{- Translate C type declarations into English.

   This exercise was taken from "Expert C Programming:  Deep C Secrets", p. 84.

   Example: echo -n "int *p;" | runhugs cdecl.hs

   Name: Shannon -jj Behrens <[EMAIL PROTECTED]>
   Date: Fri Feb 17 00:03:38 PST 2006
-}

import Char (isSpace, isAlphaNum, isDigit)
import Control.Monad.State

-- |> is like a UNIX pipe.
infixl 9 |>
x |> f = f x

data TokenType = Identifier | Qualifier | Type | Symbol Char
  deriving (Show, Eq)

data Token = Token {
  tokenType :: TokenType,
  tokenValue :: String
} deriving Show

data ParseContext = ParseContext {
  input :: String,    -- The input that has not been parsed yet.
  output :: [String], -- A list of strings in the reverse order of that which
                      -- they should be printed (e.g. [" a dog.", "I have"]).
  currTok :: Token,   -- The current token, if defined.
  stack :: [Token]    -- A stack of tokens we haven't dealt with yet.
} deriving Show

-- For convenience:
currTokType :: ParseContext -> TokenType
currTokType ctx = ctx |> currTok |> tokenType

currTokValue :: ParseContext -> String
currTokValue ctx = ctx |> currTok |> tokenValue

-- Start a new State ParseContext given an input string.
createParseContext :: String -> ParseContext
createParseContext input =
  ParseContext {input = input, output = [], stack = []}

-- Create the final output string given a ParseContext.
consolidateOutput :: ParseContext -> String
consolidateOutput ctx =
  ctx |> output |> reverse |> concat

-- "Write" to a ParseContext's output.
writeOutput :: String -> State ParseContext ()
writeOutput s = modify (\ctx -> ctx {output = s : output ctx})

-- Return the top token on the stack.
stackTop :: ParseContext -> Token
stackTop ctx = ctx |> stack |> head

-- Pop the stack.
pop :: State ParseContext ()
pop = modify (\ctx -> ctx {stack = ctx |> stack |> tail})

-- Write the value of the top of the stack and then pop it.
popAndWrite :: State ParseContext ()
popAndWrite = do
  top <- gets stackTop
  writeOutput (tokenValue top)
  pop

-- Classify a string into a Token.
classifyString :: String -> Token
classifyString "const"  = Token Qualifier "read-only"
classifyString "*"      = Token (Symbol '*') "pointer to"
classifyString [c]
  | not (isAlphaNum c)  = Token (Symbol c) [c]
classifyString s        = Token tokType s
  where
    tokType = case s of
      "volatile" -> Qualifier
      x | x `elem` ["void", "char", "signed", "unsigned", "short",
                    "int", "long", "float", "double", "struct",
                    "union", "enum"] -> Type
      x -> Identifier

-- Read the next token into currTok.
getToken :: State ParseContext ()
getToken = modify getToken'
  where
    getToken' ctx@(ParseContext {input = s}) =
      ctx {currTok = token, input = theRest}
      where
        (token, theRest) = s |> lstrip |> lexString
        lstrip s = dropWhile isSpace s

-- Read a token.  Return it and the left-over portion of the string.
lexString :: String -> (Token, String)
lexString s@(c:cs) | isAlphaNum c = (token, theRest)
  where
    (tokString, theRest) = span isAlphaNum s
    token = classifyString tokString
lexString ('*':cs) = (classifyString "*", cs)
lexString (c:cs) = (classifyString [c], cs)

-- Put tokens on the stack until we reach the first identifier.
readToFirstIdentifier :: State ParseContext ()
readToFirstIdentifier = do
  getToken
  pushUntilIdentifier
  afterIdentifier <- get
  let s = identifier ++ " is "
      identifier = currTokValue afterIdentifier in
    put (afterIdentifier {output = [s]})
  getToken

-- Keep pushing tokens until we hit an identifier.
pushUntilIdentifier :: State ParseContext ()
pushUntilIdentifier = do
  ctx <- get
  if currTokType ctx == Identifier
    then return ()                      -- Leave things as they are.
    else do
      put (ctx {stack = (currTok ctx) : (stack ctx)})
      getToken
      pushUntilIdentifier
      return ()

-- Deal with arrays.
dealWithArrays :: State ParseContext ()
dealWithArrays = do
  ctx <- get
  case currTokType ctx of
    Symbol '[' -> do
      writeOutput "array "
      getToken
      writeIfNumber
      getToken
      writeOutput "of "
      dealWithArrays
    _ -> return ()                      -- Recurse until we get past the ['s.
  where
    writeIfNumber = do                  -- Call writeSize if a number.
      tokValue <- gets currTokValue
      if tokValue |> head |> isDigit
        then do
          writeSize
          getToken
        else return ()
    writeSize = do                      -- Output the array size.
      tokValue <- gets currTokValue
      let num = tokValue |> read |> (+ -1) |> show
          s = "0.." ++ num ++ " " in    -- Can't use where instead of let here.
        writeOutput s

-- Deal with function arguments.
dealWithFunctionArgs :: State ParseContext ()
dealWithFunctionArgs = do
  getUntilParen
  getToken
  writeOutput "function returning "
  where
    getUntilParen = do                  -- Read tokens until we hit ).
      ctx <- get
      case currTokType ctx of
        Symbol ')' -> return ()
        _ -> do
          getToken
          getUntilParen

-- Deal with pointers.
dealWithPointers :: State ParseContext ()
dealWithPointers = do
  top <- gets stackTop
  case tokenType top of
    Symbol '*' -> do
      popAndWrite
      writeOutput " "
      dealWithPointers
    _ -> return ()                      -- Recurse until we get past the *'s.

-- Process tokens that we stacked while reading to identifier.
dealWithStack :: State ParseContext ()
dealWithStack = do
  stack' <- gets stack
  case stack' of
    [] -> return ()
    (x:xs) ->
      case tokenType x of
        Symbol '(' -> do
          pop
          getToken
          dealWithDeclarator
        _ -> popAndWrite

-- Do all parsing after first identifier.
dealWithDeclarator :: State ParseContext ()
dealWithDeclarator = do
  tokType <- gets currTokType
  case tokType of
    Symbol '[' -> dealWithArrays
    Symbol '(' -> dealWithFunctionArgs
    _ -> return ()                      -- "Exit" the case, not the function.
  dealWithPointers
  dealWithStack

-- Do all parsing.
dealWithEverything :: State ParseContext ()
dealWithEverything = do
  readToFirstIdentifier
  dealWithDeclarator

-- Translate a C type declaration into English.
translate :: String -> String
translate s =
  -- Change "consolidateOutput" to "show" to debug.
  s |> createParseContext |> execState dealWithEverything |> consolidateOutput

main :: IO ()
main = do
  input <- getContents
  input |> translate |> putStrLn

On 3/5/06, Shannon -jj Behrens <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm working on another article like
> <http://www.linuxjournal.com/article/8850>.  This time, I'm taking an
> exercise out of "Expert C Programming:  Deep C Secrets" and
> translating it into Haskell.  The program translates C type
> declarations into English.  I would greatly appreciate some code
> review.  I'd prefer to look like an idiot in front of you guys rather
> than in front of everyone in the world! ;)
>
> Please understand, I am not a Haskell expert!  Therefore, please make
> your suggestions simple enough that I can actually accomplish them!
>
> By the way, my code *mostly* follows the code laid out in the book.  I
> don't use a lexer or a parser or greatly improve on his algorithm.
> I'd like the Haskell and C versions to be similar so that they can be
> compared.
>
> The C version is:
> <http://www.cs.may.ie/~jpower/Courses/compilers/labs/lab3/parse_decl.c>
>
> The Haskell version is below.
[snip]
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to