I recently played in a code golf Roman to Decimal challenge (in Perl, Python, Ruby and PHP). In playing this game, I found some interesting short algorithms for converting from Roman Numerals to Decimal. Because I think some of these are new, I'd like to present them here, in case they are not really new after all, or in case there are problems with these algorithms that I've overlooked.
I'd like to eventually write up a "rosetta code" article comparing the implementation of these algorithms in various languages, including Haskell. Since I'm only a very occasional Haskell programmer, I thought it best to get feedback from Haskell experts before inflicting any of my Haskell code on a wider audience. Hence this post. To keep this post reasonably brief, note that these algorithms are for "modern" Roman Numerals only, limited to the range 1-3999, and with no error checking. That is, the code below assumes that the input is always a well formed Roman Numeral. I have tested each algorithm against every modern Roman Numeral in the range 1-3999. As a starting point, note this "HaskellWiki" function: romanToInt :: String -> Int romanToInt = fst . foldr (\p (t,s) -> if p >= s then (t+p,p) else (t-p,p)) (0,0) . map (fromJust . flip lookup (zip "IVXLCDM" [1,5,10,50,100,500,1000])) taken from http://haskell.cs.yale.edu/haskellwiki/Roman_numerals. I'm going to essentially duplicate the functionality of this code. {-# OPTIONS_GHC -fglasgow-exts -Wall #-} import Data.Char (toUpper) rtoa :: Char -> Int rtoa 'M' = 1000 rtoa 'D' = 500 rtoa 'C' = 100 rtoa 'L' = 50 rtoa 'X' = 10 rtoa 'V' = 5 rtoa 'I' = 1 rtoa r = error $ "Invalid rtoa char:" ++ show r urtoa :: Char -> Int urtoa = rtoa . toUpper romanToInt :: String -> Int romanToInt = foldl1 (\t n -> t+n-t`mod`n*2) . map urtoa The essential difference between this solution and the HaskellWiki one is the use of the "running total" for state rather than the "previous value". I see this as an improvement mainly because it is shorter -- though you might argue that it is less clear. An alternative way to express the romanToInt function is: romanToInt = foldl (\t c -> t+(urtoa c)-t`mod`(urtoa c)*2) 0 I'm open to persuasion as to which is better Haskell style. Since I'm not playing golf anymore, the rtoa function above, though hardly short, seemed to me to be the simplest and clearest way to express converting a single Roman Numeral to its corresponding arabic number. Again, suggestions for the "best"/most efficient way to do this in Haskell are most welcome. A common, and often winning technique, in golf is to perform these sorts of conversions by concocting a "magic formula". For fun, I rewrote rtoa using a magic formula I used in the golf game: rtoa c = 10^(205558`mod`(ord c)`mod`7)`mod`9995 I'm not suggesting that magic formulae are useful outside of golf and this second rtoa function, though shorter, is much less clear. I might add that this particular magic formula appears to be less useful in Haskell golf than the other languages because `mod` is five times longer than the % operator of the other languages. :) By way of explanation, notice that this formula: 205558`mod`(ord c)`mod`7 maps I->0, X->1, C->2, M->3, V->4, L->5, D->6 as shown below: Roman m 10^m 10^m`mod`9995 ----- - ----- ------------- M 3 1000 1000 D 6 1000000 500 C 2 100 100 L 5 100000 50 X 1 10 10 V 4 10000 5 I 0 1 1 Noticing this, you can replace the 205558`mod`(ord c)`mod`7 magic formula with a function that returns a string index (index() in Perl and Python). I am sometimes overwhelmed by the quantity and richness of all the functions in the GHC Haskell libraries. I eventually found a Haskell solution that seemed to work: {-# OPTIONS_GHC -fglasgow-exts -XOverloadedStrings -Wall #-} import Data.ByteString.Char8 (elemIndex) import Data.Maybe (fromJust) rtoa c = 10^(fromJust (elemIndex c "IXCMVLD"))`mod`9995 I got this to work by trial and error and have no clue what this "-XOverloadedStrings" and "Data.ByteString.Char8" business really means. If there is a better Haskell way of finding the numeric index of a particular character in a string, please let me know. Alternatively, you could write the rtoa function using an approach taken from the original HaskellWiki solution: rtoa = fromJust . flip lookup (zip "IVXLCDM" [1,5,10,50,100,500,1000]) What is your recommendation as to the "best"/most efficient way of writing the rtoa function in Haskell? Finally, here is an example complete test program. Suggestions for improving the style of this code are welcome. {-# OPTIONS_GHC -fglasgow-exts -Wall #-} import Data.Char (toUpper) import Data.List (concat, intersperse) rtoa :: Char -> Int rtoa 'M' = 1000 rtoa 'D' = 500 rtoa 'C' = 100 rtoa 'L' = 50 rtoa 'X' = 10 rtoa 'V' = 5 rtoa 'I' = 1 rtoa r = error $ "Invalid rtoa char:" ++ show r urtoa :: Char -> Int urtoa = rtoa . toUpper -- one is derived from http://haskell.cs.yale.edu/haskellwiki/Roman_numerals one :: String -> Int one = fst . foldr (\p (t,s) -> if p >= s then (t+p,p) else (t-p,p)) (0,0) . map urtoa two :: String -> Int two = foldl1 (\t n -> t+n-t`mod`n*2) . map urtoa three :: String -> Int three = foldl (\t c -> t+(urtoa c)-t`mod`(urtoa c)*2) 0 myshow :: String -> (String -> Int) -> String -> String myshow capt fn val = capt ++ val ++ " " ++ (show (fn val)) testdata :: [String] testdata = [ "I", "xiv", "DXVI", "CMLXIII" ] main :: IO () main = do putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow "one : " one c)) testdata)) putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow "two : " two c)) testdata)) putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow "three: " three c)) testdata)) Cheers, /-\ Need a Holiday? Win a $10,000 Holiday of your choice. Enter now.http://us.lrd.yahoo.com/_ylc=X3oDMTJxN2x2ZmNpBF9zAzIwMjM2MTY2MTMEdG1fZG1lY2gDVGV4dCBMaW5rBHRtX2xuawNVMTEwMzk3NwR0bV9uZXQDWWFob28hBHRtX3BvcwN0YWdsaW5lBHRtX3BwdHkDYXVueg--/SIG=14600t3ni/**http%3A//au.rd.yahoo.com/mail/tagline/creativeholidays/*http%3A//au.docs.yahoo.com/homepageset/%3Fp1=other%26p2=au%26p3=mailtagline _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe