Re: [Haskell-cafe] Please critique my code (a simple lexer)

2012-05-23 Thread Lorenzo Bolla
 On Tue, May 22, 2012 at 4:13 PM, John Simon zildjoh...@gmail.com wrote:

 data Lexer = Lexer String

 makeLexer :: String - Lexer
 makeLexer fn = Lexer fn


`makeLexer` is redundant. You can simply use `Lexer`.

L.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Please critique my code (a simple lexer)

2012-05-22 Thread John Simon
Hi all,

I've been teaching myself Haskell lately (I come from the C#/Python
world). I wrote a simplistic lexer, and I was hoping I could get a
code review or two. The code that follows is a stand-alone app that
works under GHC.

A few concerns of mine:
- My `consume` function seems basic enough that it should be library
code, but my searches turned up empty. Did I miss anything?
- Is `case _ of x:xs - x:xsr where xsr = something xs` a common
idiom? It happened twice in my code, and it seems odd to split the
first element away from the rest of the list as it's processed.
- Is creating data structures with simple field names like `kind`,
`offset`, etc a good practice? Since the names are global functions, I
worry about namespace pollution, or stomping on functions defined
elsewhere.

Thanks in advance for anyone willing to take the time.

-- code follows

module Main where

import qualified Data.Map as Map

data Lexer = Lexer String

makeLexer :: String - Lexer
makeLexer fn = Lexer fn

data Loc = Loc {offset :: Int, line :: Int, column :: Int}

locInc loc n = Loc (offset loc + n) (line loc) (column loc + n)
locNL loc = Loc (offset loc + 1) (line loc + 1) 1

data TokenKind = Colon | RArrow1 | Def | Var | Identifier String | EOF
deriving Show

data Token = Token {lexer :: Lexer, loc :: Loc, kind :: TokenKind}

idStart = ['a'..'z'] ++ ['A'..'Z'] ++ !@$%^*-_=+|/?
idNext = idStart ++ ['0'..'9'] ++ '\

namedTokens = Map.fromList [
    (def, Def),
    (var, Var)]

doLex :: Lexer - String - [Token]
doLex lexer = doLex' lexer (Loc 0 1 1)

doLex' lexer loc source = case source of
    [] - [makeToken EOF]
    ' ':xs - more (locInc loc 1) xs
    '\n':xs    - more (locNL loc) xs
    ':':xs - makeToken Colon : more (locInc loc 1) xs
    '-':'':xs - makeToken RArrow1 : more (locInc loc 2) xs
    x:xs | x `elem` idStart -
    makeToken kind : more (locInc loc $ length name) xsr
    where (namer, xsr) = consume idNext xs
  name = x:namer
  kind = maybe (Identifier name) id $ Map.lookup name namedTokens
    _ - error Invalid character in source
    where
    makeToken = Token lexer loc
    more = doLex' lexer

consume :: Eq a = [a] - [a] - ([a], [a])
consume want xs = case xs of
    x:xs | x `elem` want - (x:xsr, rest) where (xsr, rest) = consume want xs
    _    - ([], xs)

main :: IO ()
main = do
    let toks = doLex (makeLexer ) def x - y in
    putStrLn $ show $ map kind toks

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Please critique my code (a simple lexer)

2012-05-22 Thread Taylor Hedberg
John Simon, Tue 2012-05-22 @ 10:13:07-0500:
 - My `consume` function seems basic enough that it should be library
 code, but my searches turned up empty. Did I miss anything?

consume = span . flip elem


 - Is creating data structures with simple field names like `kind`,
 `offset`, etc a good practice? Since the names are global functions, I
 worry about namespace pollution, or stomping on functions defined
 elsewhere.

If you don't intend your module to be imported and used as a library,
then there's no reason to worry about this. If you do intend it to be
used that way, then it's probably still not worth worrying about, as
name clashes can be resolved at the import level via qualified imports
or `hiding` lists. If it ends up really being a problem, you can always
add a namespace prefix to those names, though honestly I find that kind
of ugly.

The compiler will always catch cases of ambiguity caused by multiple
definitions of the same name being in scope, so you don't have to worry
about this causing inadvertent runtime bugs.


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Please critique my code (a simple lexer)

2012-05-22 Thread Eric Rasmussen
Another suggestion is to use pattern matching at the function level:

doLex' lexer loc [] = [makeToken EOF]
doLex' lexer loc (x:xs) = case x of
' ' - more (locInc loc 1) xs
'\n'- more (locNL loc) xs
...
_  -

That saves you from having to deconstruct repeatedly in your case
statements.

You might also want to check out the excellent HLint (available on
hackage), which will give you even more suggestions.

On Tue, May 22, 2012 at 8:36 AM, Taylor Hedberg t...@tmh.cc wrote:

 John Simon, Tue 2012-05-22 @ 10:13:07-0500:
  - My `consume` function seems basic enough that it should be library
  code, but my searches turned up empty. Did I miss anything?

 consume = span . flip elem


  - Is creating data structures with simple field names like `kind`,
  `offset`, etc a good practice? Since the names are global functions, I
  worry about namespace pollution, or stomping on functions defined
  elsewhere.

 If you don't intend your module to be imported and used as a library,
 then there's no reason to worry about this. If you do intend it to be
 used that way, then it's probably still not worth worrying about, as
 name clashes can be resolved at the import level via qualified imports
 or `hiding` lists. If it ends up really being a problem, you can always
 add a namespace prefix to those names, though honestly I find that kind
 of ugly.

 The compiler will always catch cases of ambiguity caused by multiple
 definitions of the same name being in scope, so you don't have to worry
 about this causing inadvertent runtime bugs.

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Please critique my code (a simple lexer)

2012-05-22 Thread wren ng thornton

On 5/22/12 11:13 AM, John Simon wrote:

- Is `case _ of x:xs -  x:xsr where xsr = something xs` a common
idiom? It happened twice in my code, and it seems odd to split the
first element away from the rest of the list as it's processed.


I don't know how common it is in practice, but that's fmap for the 
PreList functor. Allow me to explain...


(tl;dr: there's some non-trivial theoretical backing, if you're 
interested in recursion theory. Though again, I'm not sure how often it 
actually comes up for lists.)




Here's the list type, if we defined it ourselves:

data List a = Nil | Cons a (List a)

Like other recursive types, we can decompose this into a non-recursive 
type plus a generic version of recursion:


data Fix f = MkFix (f (Fix f))
-- N.B., MkFix :: f (Fix f) - Fix f

data PreList a recurse = PreNil | PreCons a recurse

type List' a = Fix (PreList a)

The new List' is essentially the same as the old List:

to :: List a - List' a
to Nil = Fix PreNil
to (Cons x xs) = Fix (PreCons x (to xs))

fro :: List' a - List a
fro (MkFix PreNil) = Nil
fro (MkFix (PreCons x xs)) = Cons x (fro xs)

-- and we can prove:
--  to . fro == id
--  fro . to == id

There's a whole bunch of other machinery that comes from this 
perspective, but the important thing is that it all depends on the fact 
that PreList is a functor in a very specific way, namely that it applies 
the function to all the recursion sites (i.e., one level down if we're 
thinking of the fixed-point version):


instance Functor (PreList a) where
fmap f PreNil = PreNil
fmap f (PreCons x xs) = PreCons x (f xs)

Of course, Fix(PreList a) ---aka List a--- has its own functor instance, 
but that's unrelated.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe