[Haskell-cafe] Code review request - using ErrorT
Dear gentle Haskellers,

I'd appreciate it very much if you could please take a look at my code here and give me some feedback - https://github.com/ckkashyap/haskell-websocket/blob/master/src/lib/Net/WebSocket/Request.hs

I am particularly looking for advice on what would be a good type for a server handler function that does IO, needs to short-circuit if it fails at any point, and also needs to thread a state along. I am considering this - ErrorT String (StateT Int IO) Data

Regards,
Kashyap

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
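[Editor's note: a minimal sketch of the handler stack being asked about. ErrorT was later deprecated in favour of ExceptT from the transformers package; the names `Handler`, `step`, and `runHandler` below are illustrative, not from Kashyap's repository.]

```haskell
-- A sketch of the "short-circuit on failure while threading a state"
-- handler type discussed above, using ExceptT (the modern replacement
-- for ErrorT) over StateT Int over IO.
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State (StateT, runStateT, modify, get)
import Control.Monad.Trans.Class (lift)

type Handler a = ExceptT String (StateT Int IO) a

-- Increment a request counter, failing once it passes a limit.
-- throwE short-circuits the rest of the computation, but the state
-- accumulated so far is still returned by runStateT.
step :: Handler Int
step = do
  lift (modify (+ 1))
  n <- lift get
  if n > 2 then throwE "too many requests" else return n

runHandler :: Handler a -> IO (Either String a, Int)
runHandler h = runStateT (runExceptT h) 0
```

With this stacking order, `runHandler (step >> step)` yields `(Right 2, 2)`, while a third `step` short-circuits to a `Left`, with the final state still threaded through.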
Re: [Haskell-cafe] code review?
On Tue, May 24, 2011 at 7:18 PM, Neil Mitchell ndmitch...@gmail.com wrote:
> > > Before doing a code review I always demand that the author runs over
> > > the code with HLint (http://community.haskell.org/~ndm/hlint) - they
> > Very good point. In fact you just inspired me to finally download it
> > and run it on my own code. Thanks for the great tool!
> Glad you like it.
> > While I'm on the topic, I recently wrote a tool that wanted to traverse
> > deep data structures as produced by haskell-src-exts. I wound up with
> > about 50 lines of case expressions and around the time my hands were
> > literally beginning to hurt decided that enough was enough and I should
> > try a generic approach. I heard uniplate was pretty easy to use, and
> > was pretty pleased to turn the entire thing into a single line. It took
> > me a little longer to figure out I needed to use universeBi since all
> > the examples were monotyped, but once I did it Just Worked. Amazing. So
> > thanks again! And maybe you could mention universeBi in the instant
> > introduction?
> Yes, I probably should - I'll try and get to that. Of course, I'd also
> happily accept a patch against
> http://community.haskell.org/~ndm/darcs/uniplate
>
> I use Uniplate inside HLint, and it's invaluable - there are a lot of
> times when List Comp + universeBi really hits the spot.

+1 on that, I use uniplate for pretty much all my haskell-src-exts tasks these days, works like a charm!

I'd love to include some standard traversal functionality in haskell-src-exts that depends on uniplate, but hesitate to do so because of HP aspirations for haskell-src-exts. Neil, what do you reckon the chances of getting uniplate in the HP are? :-)

Cheers,
/Niklas
Re: [Haskell-cafe] code review?
Hi Niklas,

> > I use Uniplate inside HLint, and it's invaluable - there are a lot of
> > times when List Comp + universeBi really hits the spot.
> +1 on that, I use uniplate for pretty much all my haskell-src-exts tasks
> these days, works like a charm! I'd love to include some standard
> traversal functionality in haskell-src-exts that depends on uniplate,
> but hesitate to do so because of HP aspirations for haskell-src-exts.
> Neil, what do you reckon the chances of getting uniplate in the HP
> are? :-)

I'm happy to have it included. The only likely change is that I currently have basically two versions of all modules (Data.Generics.PlateData and Data.Generics.Uniplate.Data). I'll upload a version 1.7 of Uniplate marking those deprecated, then a 1.8 removing them, and then I'd love to get included in the Platform.

Of course, whether it goes in the platform is not really up to me, it's up to the community - but if someone proposes it I'll back it.

Thanks, Neil
Re: [Haskell-cafe] code review?
On Mon, May 23, 2011 at 12:02 PM, Neil Mitchell ndmitch...@gmail.com wrote:
> > 'if all == False then return False else return True' is a pretty
> > confusing way to say 'return all'. In fact, any time you see
> > 'x == True' you can just remove the '== True'. The whole postAll thing
> > would be clearer as
> Before doing a code review I always demand that the author runs over
> the code with HLint (http://community.haskell.org/~ndm/hlint) - they

Very good point. In fact you just inspired me to finally download it and run it on my own code. Thanks for the great tool!

While I'm on the topic, I recently wrote a tool that wanted to traverse deep data structures as produced by haskell-src-exts. I wound up with about 50 lines of case expressions and, around the time my hands were literally beginning to hurt, decided that enough was enough and I should try a generic approach. I heard uniplate was pretty easy to use, and was pretty pleased to turn the entire thing into a single line. It took me a little longer to figure out I needed to use universeBi since all the examples were monotyped, but once I did it Just Worked. Amazing. So thanks again!

And maybe you could mention universeBi in the instant introduction?
Re: [Haskell-cafe] code review?
Evan Laforge qdunkan at gmail.com writes:
> [...] a tool that wanted to traverse deep data structures as produced by
> haskell-src-exts. [...] I heard uniplate was pretty easy to use, and was
> pretty pleased to turn the entire thing into a single line.

Please post that line here - J.W.
Re: [Haskell-cafe] code review?
On Tue, May 24, 2011 at 6:47 AM, Johannes Waldmann waldm...@imn.htwk-leipzig.de wrote:
> Evan Laforge qdunkan at gmail.com writes:
> > [...] a tool that wanted to traverse deep data structures as produced
> > by haskell-src-exts. [...] I heard uniplate was pretty easy to use,
> > and was pretty pleased to turn the entire thing into a single line.
> Please post that line here - J.W.

Getting a bit off subject, and it's nothing special:

    moduleQNames :: Types.Module -> [Types.Qualification]
    moduleQNames mod =
        [Types.qualification m | Haskell.Qual _ m _ <- Uniplate.universeBi mod]

But just try writing this without a generics library :)
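[Editor's note: universeBi itself lives in the uniplate package. For readers without it, the behaviour being used above - "collect every value of some type b anywhere inside a structure" - can be sketched with Data.Data from base. The name universeBi' is ours; the real uniplate implementation is considerably more optimised.]

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
-- A tiny, unoptimised sketch of uniplate's universeBi using the
-- Data.Data machinery in base: cast the current node if it has the
-- target type, then recurse into every immediate child with gmapQ.
import Data.Data (Data, gmapQ, cast)
import Data.Maybe (maybeToList)

universeBi' :: forall a b. (Data a, Data b) => a -> [b]
universeBi' x = maybeToList (cast x) ++ concat (gmapQ go x)
  where
    go :: forall d. Data d => d -> [b]
    go = universeBi'
```

For example, collecting every Int buried in a nested tuple/list: `universeBi' ((1, [2, 3]) :: (Int, [Int])) :: [Int]` gives `[1,2,3]`.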
Re: [Haskell-cafe] code review?
> > Before doing a code review I always demand that the author runs over
> > the code with HLint (http://community.haskell.org/~ndm/hlint) - they
> Very good point. In fact you just inspired me to finally download it and
> run it on my own code. Thanks for the great tool!

Glad you like it.

> While I'm on the topic, I recently wrote a tool that wanted to traverse
> deep data structures as produced by haskell-src-exts. I wound up with
> about 50 lines of case expressions and around the time my hands were
> literally beginning to hurt decided that enough was enough and I should
> try a generic approach. I heard uniplate was pretty easy to use, and was
> pretty pleased to turn the entire thing into a single line. It took me a
> little longer to figure out I needed to use universeBi since all the
> examples were monotyped, but once I did it Just Worked. Amazing. So
> thanks again! And maybe you could mention universeBi in the instant
> introduction?

Yes, I probably should - I'll try and get to that. Of course, I'd also happily accept a patch against http://community.haskell.org/~ndm/darcs/uniplate

I use Uniplate inside HLint, and it's invaluable - there are a lot of times when List Comp + universeBi really hits the spot.

Thanks, Neil
Re: [Haskell-cafe] code review?
Evan Laforge schrieb:
> I'd keep the line length down to 80 chars, and not use ';'s.

You can find more programming tips in http://www.haskell.org/haskellwiki/Haskell_programming_tips and in other articles of the Haskell Wiki, e.g. in Category:Idioms, Category:Style, Category:FAQ.
Re: [Haskell-cafe] code review?
> 'if all == False then return False else return True' is a pretty
> confusing way to say 'return all'. In fact, any time you see 'x == True'
> you can just remove the '== True'. The whole postAll thing would be
> clearer as

Before doing a code review I always demand that the author runs over the code with HLint (http://community.haskell.org/~ndm/hlint) - they don't have to necessarily apply all the suggestions, but they do have to at least be aware of obvious alternatives. A code review takes a reasonable amount of time, and it's best to use that for things that machines can't yet figure out - rather than the simpler stuff like the above.

Thanks, Neil
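[Editor's note: the redundant-Boolean pattern discussed in this thread, shown as a self-contained before/after pair. The function names are made up for the example; HLint flags exactly this kind of rewrite.]

```haskell
-- Before: the style criticised above. Comparing to False and then
-- returning literal Booleans obscures that the result is just 'all'.
verbose :: Bool -> IO Bool
verbose all = if all == False then return False else return True

-- After: 'x == True' is 'x', 'x == False' is 'not x', and
-- 'if b then return False else return True' collapses to 'return (not b)'
-- ... which here is simply 'return all'.
simple :: Bool -> IO Bool
simple all = return all
```

Both definitions are extensionally identical; the second just says so.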
[Haskell-cafe] code review?
Hey guys. Pretty new to Haskell here. I've started off by writing a small command-line interface to Plurk. I was just wondering if anyone would be willing to give everything a look-over and let me know what kinds of things I should be looking to improve upon, style-wise. Not sure I'm currently doing things the 'Haskell way', so to speak.

Thanks a bunch!

https://github.com/saiko-chriskun/hermes

(note: the JSON and Plurkell submodules are also mine.)
Re: [Haskell-cafe] code review?
I'd keep the line length down to 80 chars, and not use ';'s. All of that fiddling with buffering, printing, reading results could be more clearly put into a couple of functions.

'if all == False then return False else return True' is a pretty confusing way to say 'return all'. In fact, any time you see 'x == True' you can just remove the '== True'. The whole postAll thing would be clearer as

    postAll <- if all && not first then return all else ask "Post to all?"
    post <- if postAll then return True else ask "Post to..?"

Anywhere you have 'head x' or 'x !! 0' or a 'case length xs of' that's a chance to crash. Don't do that. You can get rid of the heads by writing (config:configs) and [] cases for postStatus. Get rid of the !!0 by making config into a data type, it looks like

    data Config = Config
        { configPostTo :: URL?
        , configUser :: Maybe String
        , configPass :: Maybe String
        }

Then 'pass <- maybe (ask "pass?") return (configPass config)'. Of course, why make these things optional at all?

Looks like the postStatus return value is not used. It would simplify it to not return those codes. I don't know anything about 'postPlurk' but it looks like it could return a real data type as well.

All this nested if else stuff makes it hard to read, but I think you can replace the explicit recursion in postStatus with 'mapM_ (postStatus update) configs'. It looks like that mysterious (all, first) pair has a different value for the first one; in that case it would be clearer to either not do that, or do it explicitly like

    case configs of
        first : rest -> do
            postFirst update first
            mapM_ (postStatus update) rest
        [] -> complain about no configs

If you pass a single Bool to a function, it means it can have two behaviours, which is confusing. If you pass two Bools, then it can have 4, which is even more confusing :) I myself use if/else only rarely.

Looking a little at Plurkell, 'return =<<' is redundant. And I'm sure there's a urlEncode function around so you don't have to build URLs yourself?

I don't understand the stuff with the words in the case, but it looks like a confusing way to say 'if word `elem` specialWords'. There's also a Set type for that kind of thing. And that regex stuff is... quoting? Isn't there a library function for that too? It's the sort of thing a URL library would have. If not, it's something like 'replace [("|", "%7C"), ("/", "%2F"), (" ", "%20")]', right? I'm sure there's a replace function like that floating around somewhere; if not, you can write your own.

And for JSON... wasn't someone just complaining about there being too many JSON libraries on hackage? Unless you want to do it yourself for fun (it's a good way to learn parsing), why not just download one of those?

That's enough for now, have fun :)
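[Editor's note: the Config refactor suggested in the review above, fleshed out into a minimal self-contained sketch. The field names follow the review; the `ask` stand-in and its return value are purely illustrative since the original repository isn't reproduced here.]

```haskell
-- A record replacing the '!!0'-style indexed config access criticised
-- above. The review left the post-to field's type open ("URL?"), so a
-- plain String stands in for it here.
data Config = Config
  { configPostTo :: String
  , configUser   :: Maybe String
  , configPass   :: Maybe String
  }

-- In real code 'ask' would prompt on stdin; here it is a pure stand-in
-- so the sketch is self-contained.
ask :: String -> IO String
ask _prompt = return "secret-from-prompt"

-- 'maybe (ask "pass?") return' falls back to prompting only when the
-- field is Nothing, exactly the idiom the review suggests.
getPass :: Config -> IO String
getPass config = maybe (ask "pass?") return (configPass config)
```

With a stored password, `getPass` returns it directly; with `Nothing`, it defers to the prompt.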
Re: [Haskell-cafe] code review?
Good points. Here are a few more.

Use more explicit types. You can cut down on if-then-else's by using pattern matching and guards. For example, the first if-then-else in postStatus can be turned into:

    newtype PostToAllNetworks = PostToAll Bool
    newtype FirstPost = FirstPost Bool

    postToAllNetworks :: PostToAllNetworks -> FirstPost -> IO Bool
    postToAllNetworks (PostToAll True) (FirstPost True) = do
        putStr "Post to all networks? [y/n] "
        hSetBuffering stdin NoBuffering ; hFlush stdout
        postAll <- getChar ; hSetBuffering stdin LineBuffering
        return (postAll == 'y')
    postToAllNetworks _ _ = return False

Notice I'm not just playing golf here. This is easier to read.

Also, keep in mind that "post" is both a noun and a verb. It is very functional to write functions whose names are nouns. It might be nice if Haskell had some syntax like Ruby does, so we could ask postToAllNetworks? But since we don't, you should be open to the possibility that a name can be either.

On Sun, May 22, 2011 at 3:32 PM, Evan Laforge qdun...@gmail.com wrote:
> I'd keep the line length down to 80 chars, and not use ';'s. All of that
> fiddling with buffering, printing, reading results could be more clearly
> put into a couple of functions.
>
> 'if all == False then return False else return True' is a pretty
> confusing way to say 'return all'. In fact, any time you see 'x == True'
> you can just remove the '== True'. The whole postAll thing would be
> clearer as
>
>     postAll <- if all && not first then return all else ask "Post to all?"
>     post <- if postAll then return True else ask "Post to..?"
>
> Anywhere you have 'head x' or 'x !! 0' or a 'case length xs of' that's a
> chance to crash. Don't do that. You can get rid of the heads by writing
> (config:configs) and [] cases for postStatus. Get rid of the !!0 by
> making config into a data type, it looks like 'data Config = Config
> { configPostTo :: URL?, configUser :: Maybe String, configPass :: Maybe
> String }'. Then 'pass <- maybe (ask "pass?") return (configPass config)'.
> Of course, why make these things optional at all?
>
> Looks like the postStatus return value is not used. It would simplify it
> to not return those codes. I don't know anything about 'postPlurk' but
> it looks like it could return a real data type as well.
>
> All this nested if else stuff makes it hard to read, but I think you can
> replace the explicit recursion in postStatus with 'mapM_ (postStatus
> update) configs'. It looks like that mysterious (all, first) pair has a
> different value for the first one, in that case it would be clearer to
> either not do that, or do it explicitly like
>
>     case configs of
>         first : rest -> do
>             postFirst update first
>             mapM_ (postStatus update) rest
>         [] -> complain about no configs
>
> If you pass a single Bool to a function, it means it can have two
> behaviours, which is confusing. If you pass two Bools, then it can have
> 4, which is even more confusing :) I myself use if/else only rarely.
>
> Looking a little at Plurkell, 'return =<<' is redundant. And I'm sure
> there's a urlEncode function around so you don't have to build URLs
> yourself?
>
> I don't understand the stuff with the words in the case, but it looks
> like a confusing way to say 'if word `elem` specialWords'. There's also
> a Set type for that kind of thing. And that regex stuff is... quoting?
> Isn't there a library function for that too? It's the sort of thing a
> URL library would have. If not, it's something like 'replace [("|",
> "%7C"), ("/", "%2F"), (" ", "%20")]', right? I'm sure there's a replace
> function like that floating around somewhere, if not, you can write your
> own.
>
> And for JSON... wasn't someone just complaining about there being too
> many JSON libraries on hackage? Unless you want to do it yourself for
> fun (it's a good way to learn parsing), why not just download one of
> those?
>
> That's enough for now, have fun :)
[Haskell-cafe] Code review of text-xml-generic
Hi,

I've been working on a (de)serialization package from/to XML (using SYB) and wondering if anyone feels like giving me a quick code review before I upload it to hackagedb. The source code can be found at http://github.com/finnsson/Text.XML.Generic

It is *heavily* inspired by Text.JSON.Generic. The three files DataEx.hs (a slightly modified version of Oleg's code), ExGeneric.hs and ExTestGeneric.hs are not part of the package yet since I haven't got the deserialization to work for existentials yet (but I haven't given up!).

I plan to upload the package to hackagedb by the end of this weekend so any comments before (and after) are more than welcome.

-- Oscar
Re: [Haskell-cafe] code review? store server, 220loc.
On Sat, Aug 2, 2008 at 10:13 PM, Tim Newsham [EMAIL PROTECTED] wrote:
> > http://www.thenewsh.com/%7Enewsham/store/Server5.hs
> You should try profiling this. I can see a few possible problems (such
> as reading String from a socket, instead of a ByteString), but it's
> difficult to predict what might be causing your code to be so slow.
> Haskell code ought to be much more competitive with C for an application
> like this.

Profiling didn't turn up anything obvious:

    http://www.thenewsh.com/~newsham/store/Server9.prof

One thing I don't quite understand is that it seems to be crediting more time to put8 and get8 than is warranted, perhaps all of the get and put functions...

One surprising result from testing:

    http://www.thenewsh.com/~newsham/store/TestBin.hs

shows that the Data.Binary marshalling is actually very fast, but when I want to add a length field at the start of the buffer it has a huge impact on performance. I've tried several variations without much luck... (also the third test case is very odd in that skipping a conversion to strict bytestring actually makes it slower(!)). So... I think that's probably one of the things limiting my performance.

Another one is probably my need to do two reads (one for the length, one for the data) for each received record... I think with some trickiness I could get around that (since recv will return only as many bytes as are immediately available) but I don't know that it's worth the effort right now.

Anyway, any thoughts, especially on how to make the bytestring operations faster, would be appreciated.

Tim Newsham
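[Editor's note: for readers following along, one common shape for the "length field at the start of the buffer" framing being discussed - a generic sketch with Data.Binary, not the actual TestBin.hs benchmark code.]

```haskell
-- Length-prefixed framing with Data.Binary: a 32-bit big-endian length
-- header followed by the encoded payload. A receiver can read 4 bytes,
-- learn the record size, then read exactly that many more bytes.
import qualified Data.ByteString.Lazy as BL
import Data.Binary (Binary, encode, decode)
import Data.Binary.Put (runPut, putWord32be, putLazyByteString)
import Data.Binary.Get (runGet, getWord32be)

frame :: Binary a => a -> BL.ByteString
frame x = runPut $ do
  let body = encode x
  putWord32be (fromIntegral (BL.length body))  -- 4-byte length header
  putLazyByteString body

unframe :: Binary a => BL.ByteString -> a
unframe bs =
  let n = runGet getWord32be bs
  in decode (BL.take (fromIntegral n) (BL.drop 4 bs))
```

The cost Tim observed likely comes from forcing the whole encoded body (to learn its length) before any output can be produced, which defeats the lazy streaming that makes plain `encode` fast.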
Re: [Haskell-cafe] code review? store server, 220loc.
On Sat, Aug 2, 2008 at 10:13 PM, Tim Newsham [EMAIL PROTECTED] wrote:
> Anyone interested in critiquing some code? I'm looking for ideas for
> making it faster and/or simpler:
> http://www.thenewsh.com/%7Enewsham/store/Server5.hs

The code looks fairly reasonable, although most of your strictness annotations are unlikely to do much (particularly those on String fields).

You should try profiling this. I can see a few possible problems (such as reading String from a socket, instead of a ByteString), but it's difficult to predict what might be causing your code to be so slow. Haskell code ought to be much more competitive with C for an application like this.
Re: [Haskell-cafe] code review? store server, 220loc.
> You should try profiling this. I can see a few possible problems (such
> as reading String from a socket, instead of a ByteString), but it's
> difficult to predict what might be causing your code to be so slow.
> Haskell code ought to be much more competitive with C for an application
> like this.

I haven't tried profiling yet; I should do that (ugh, will probably have to rebuild some libraries I'm using to support that).

Anyway, I haven't yet used any ByteString IO functions. I ran some tests when I was starting and it seems that using Handle IO functions was a bit slower than using the Socket IO functions directly. It looks like there are a bunch of Handle IO functions that can return ByteStrings but I don't see any for sockets... Are there any? If not, I suspect it will be a wash switching to Handles and ByteStrings (especially since most of my requests and responses are quite small). I'll write up some tests to see how well those perform.

Tim Newsham
http://www.thenewsh.com/~newsham/
Re: [Haskell-cafe] code review? store server, 220loc.
On 2008 Aug 4, at 23:45, Tim Newsham wrote:
> Anyway, I haven't yet used any ByteString IO functions. I ran some tests
> when I was starting and it seems that using Handle IO functions was a
> bit slower than using the Socket IO functions directly. It looks like
> there are a bunch of Handle IO functions that can return ByteStrings but
> I don't see any for sockets... Are there any?

There's something on hackage...

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/network-bytestring

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university KF8NH
Re: [Haskell-cafe] code review? store server, 220loc.
newsham:
> > Anyone interested in critiquing some code? I'm looking for ideas for
> > making it faster and/or simpler:
> What optimisation and runtime flags did you use (-threaded or not?)

Currently "ghc -O --make $< -o $@". For some measurements I tried -threaded, which seemed to cause a slight slowdown, and +RTS -N2 at runtime, which didn't seem to help. I also ran some tests with multiple clients running concurrently, and it seemed to handle them at approximately the same rate with and without the threaded and RTS flags (on an amd64x2). Those flags should have allowed it to handle two client connections (via forkIO) concurrently, right?

[ps: the same web directory has test clients and other versions of the server.]

Tim Newsham
http://www.thenewsh.com/~newsham/
Re: [Haskell-cafe] code review? store server, 220loc.
On Sat, 2008-08-02 at 19:13 -1000, Tim Newsham wrote:
> My measurements show that a simple dummy server (accept, forkio, recv
> byte) handles roughly 7500 requests/connects per second, the
> server/client that do real messages do about 4500 req and connections
> per second. If all requests are on the same connection one after another
> it does about 13500 requests/second. For comparisons, a C ping-pong
> server does about 3600/second if it has to fork for each new
> connection/request, and about 35000/sec if its all on the same
> connection. So it seems at least competitive with a forking C server. I
> havent tested threaded C servers.

What kind of performance do you actually need? Can your network connection actually sustain the bandwidth of your synthetic benchmarks?

For reference, I've got a demo HAppS-based server which handles reasonably high load pretty well I think (tested using apache-bench, loopback interface, amd64 @ 2.2GHz):

With cached content, 450k:
  with 1 concurrent client: 760 requests per second, ~1ms latency, 34Mb/s
  with 100 concurrent clients: 1040 requests per second, ~90ms latency, 46Mb/s

On a non-cached, generated-on-demand 3k page:
  with 1 concurrent client: 280 requests per second, ~4ms latency, 900Kb/s bandwidth
  with 100 concurrent clients: 240 requests per second, ~400ms latency, 750Kb/s bandwidth

Using http keep-alive boosts requests per second by ~20%.

Obviously this is testing with a loopback network. My point is, it's serving at a rather higher rate than most real network connections I could buy (except local ethernet networks).

Duncan
Re: [Haskell-cafe] code review? store server, 220loc.
> What kind of performance do you actually need? Can your network
> connection actually sustain the bandwidth of your synthetic benchmarks?

This is just an exercise at the moment, so no particular performance goal beyond "how fast can it go".

> (tested using apache-bench, loopback interface, amd64 @ 2.2GHz)
> With cached content 450k:
> with 1 concurrent client: 760 requests per second ~1ms latency 34Mb/s
[...]
> Obviously this is testing with a loopback network. My point is, it's
> serving at a rather higher rate than most real network connections I
> could buy (except local ethernet networks).

My requests and responses are fairly small; a typical request is 29 bytes and a typical response is 58 bytes. If you just count these payloads it's about 390kB/sec for 4.5k req/sec and 1.2MB/sec for 13.5k req/sec (of course it's more like double these as the TCP overhead will be about the same size). Anyway, with such small sizes, the performance shouldn't be limited by the bandwidth (I don't think). If this was a back-end storage server, the network probably wouldn't be the limiting factor.

Tim Newsham
http://www.thenewsh.com/~newsham/
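[Editor's note: the payload arithmetic above checks out; as a quick worked example using the figures from the post.]

```haskell
-- Payload bandwidth at the quoted request rates: each round trip moves
-- a 29-byte request plus a 58-byte response, i.e. 87 bytes.
bytesPerSec :: Int -> Int
bytesPerSec reqPerSec = (29 + 58) * reqPerSec

-- bytesPerSec 4500  = 391500  bytes/s, i.e. ~390 kB/s
-- bytesPerSec 13500 = 1174500 bytes/s, i.e. ~1.2 MB/s
```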
[Haskell-cafe] code review? store server, 220loc.
Anyone interested in critiquing some code? I'm looking for ideas for making it faster and/or simpler:

http://www.thenewsh.com/%7Enewsham/store/Server5.hs

This is an exercise to see how well a server in Haskell would perform. My goals are roughly:

- retargetability to other server types (i.e. easy to replace request and response structures and business logic).
- readability.
- performance.

My measurements show that a simple dummy server (accept, forkio, recv byte) handles roughly 7500 requests/connects per second; the server/client that do real messages do about 4500 requests and connections per second. If all requests are on the same connection one after another it does about 13500 requests/second. For comparison, a C ping-pong server does about 3600/second if it has to fork for each new connection/request, and about 35000/sec if it's all on the same connection. So it seems at least competitive with a forking C server. I haven't tested threaded C servers.

Tim Newsham
http://www.thenewsh.com/~newsham/
Re: [Haskell-cafe] code review? store server, 220loc.
newsham:
> Anyone interested in critiquing some code? I'm looking for ideas for
> making it faster and/or simpler:
> http://www.thenewsh.com/%7Enewsham/store/Server5.hs
>
> This is an exercise to see how well a server in Haskell would perform.
> My goals are roughly:
> - retargetability to other server types (ie. easy to replace request and
>   response structures and business logic).
> - readability.
> - performance.
>
> My measurements show that a simple dummy server (accept, forkio, recv
> byte) handles roughly 7500 requests/connects per second, the
> server/client that do real messages do about 4500 req and connections
> per second. If all requests are on the same connection one after another
> it does about 13500 requests/second. For comparisons, a C ping-pong
> server does about 3600/second if it has to fork for each new
> connection/request, and about 35000/sec if its all on the same
> connection. So it seems at least competitive with a forking C server. I
> havent tested threaded C servers.

    packBS :: String -> B.ByteString
    packBS = B.pack . map (toEnum . fromEnum)

    -- | Convert a bytestring to a string.
    unpackBS :: B.ByteString -> String
    unpackBS = map (toEnum . fromEnum) . B.unpack

are Data.ByteString.Char8.pack/unpack.

What optimisation and runtime flags did you use (-threaded or not?)
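[Editor's note: dons' equivalence claim can be checked directly; the packBS/unpackBS definitions from the server are reproduced here in self-contained form alongside Data.ByteString.Char8 for comparison.]

```haskell
-- The hand-rolled conversions from the post agree with
-- Data.ByteString.Char8's pack/unpack (for ASCII/Latin-1 data, which is
-- all a Word8-per-Char representation can express either way).
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

packBS :: String -> B.ByteString
packBS = B.pack . map (toEnum . fromEnum)

unpackBS :: B.ByteString -> String
unpackBS = map (toEnum . fromEnum) . B.unpack
```

Besides being shorter, the Char8 versions are implemented over the raw buffer, so they avoid building an intermediate [Word8].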
[Haskell-cafe] Code review: initial factoring for sequences and other structures
Hi - I've started work on an initial factoring of sequence ops into classes, and already I've run into some major design issues which stand like a mountain in the way of progress. The classes are below:

-- all code below standard BSD3 licence :-)

    module Duma.Data.Class.Foldable
        ( Foldable(..)
        ) where

    import qualified Data.List as List

    class Foldable c a | c -> a where
        foldR :: (a -> b -> b) -> b -> c -> b
        foldL :: (b -> a -> b) -> b -> c -> b
        foldL' :: (b -> a -> b) -> b -> c -> b

        -- arg order from FingerTree paper
        -- opposite to Edison arg order
        reduceR :: (a -> b -> b) -> (c -> b -> b)
        reduceR f xs y = foldR f y xs

        reduceL :: (b -> a -> b) -> (b -> c -> b)
        reduceL = foldL

        reduceL' :: (b -> a -> b) -> (b -> c -> b)
        reduceL' = foldL'

    instance Foldable [a] a where
        foldR = List.foldr
        foldL = List.foldl
        foldL' = List.foldl'

and

    module Duma.Data.Class.BasicSeq
        ( BasicSeq(..)
        ) where

    import Prelude hiding (map)
    import Duma.Data.Class.Foldable
    import qualified Data.List as List
    import Control.Monad

    class Foldable c a => BasicSeq c a where
        empty :: c
        isEmpty :: c -> Bool
        atL :: c -> a
        atR :: c -> a
        pushL :: a -> c -> c
        pushR :: c -> a -> c
        viewL :: Monad m => c -> m (a, c)
        viewR :: Monad m => c -> m (c, a)

        -- (1) Should this be in its own class?
        convert :: BasicSeq d a => c -> d
        convert = foldR pushL empty

        -- (2) Is this too general or not general enough?
        map :: BasicSeq d b => (a -> b) -> c -> d
        map f xs = case viewL xs of
            Just (x, xs') -> pushL (f x) (map f xs')
            Nothing -> empty

    instance BasicSeq [a] a where
        empty = []
        isEmpty [] = True
        isEmpty _ = False
        atL (x:_) = x
        atR = List.last
        pushL = (:)
        pushR xs x = xs ++ [x]
        viewL (x:xs) = return (x, xs)
        viewL _ = fail "viewL"
        viewR xs@(_:_) = return (List.init xs, List.last xs)
        viewR _ = fail "viewR"

(Indexing ops like take, drop, length, at, split would be provided by a third class, and measurements (to assess the power of finger trees and similar structures) would be dealt with by a fourth class with analogous ops, i.e. takeWith, splitWith etc.)

However already with just the above I have some questions about the (convert) and (map) functions:

1) Because (convert) takes BasicSeq d a as a context, it is currently impossible to provide an optimized version for specific combinations of source/dest types, e.g. if both types are the same (convert) should just be (id). Does this mean I should put (convert) into its own class, or should I expect the compiler to be able to rewrite (foldR pushL empty) into (id) in this situation?

2) Ditto with (map), but here there is another issue: it is at once too general and too specific compared to the usual definitions of (map):

    -- Prelude - limited to lists only
    map f (x:xs) = f x : map f xs
    map _ [] = []

    -- Functor - source and dest limited to the same type
    fmap :: (a -> b) -> f a -> f b

So even though fmap is constrained to only map between the same type, it is more general than BasicSeq.map because the type doesn't need to be a BasicSeq. However it is less general in the sense that fmap wouldn't allow mapping between two different instances of BasicSeq.

There is a general problem that when the element type needs to be specified along with the type of the overall collection, there is no way to talk about the functor type at the same time, e.g. I'd like to be able to write something like this:

    class Functor f a b where
        fmap :: (a -> b) -> f a -> f b

    instance (BasicSeq (f a) a, BasicSeq (f b) b) => Functor f a b where
        fmap = map

but GHC complains that I'd need to use -fallow-undecidable-instances, which I'm reluctant to do because I don't know what the implications of such a decision would be. Why are these instances for such a simple thing (i.e. just pattern matching against the components of a type) already undecidable?

The issues above are the kind of problems I just don't know how to solve or even how to approach, since there seem to be too many conflicting dimensions in the design space. Any ideas?

Thanks, Brian.

--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com
Re: [Haskell-cafe] Code review: initial factoring for sequences and other structures
Brian Hulley wrote:
> Hi - I've started work on an initial factoring of sequence ops into
> [snip]
>     class Foldable c a | c -> a where
>         foldR :: (a -> b -> b) -> b -> c -> b
> [snip]
> There is a general problem that when the element type needs to be specified along with the type of the overall collection, there is no way to talk about the functor type at the same time

After writing this I realised it was a bit silly of me to have used a fundep when I could have just specified the functor types directly (instead of the resulting collection), thus:

    class Foldable f a where
        foldR :: (a -> b -> b) -> b -> f a -> b

so I applied this change to everything, then wrote:

    module Duma.Data.Class.Functor
        ( Functor(..)
        ) where

    import Prelude hiding (map, Functor)
    import Duma.Data.Class.BasicSeq

    class Functor f a b where
        fmap :: (a -> b) -> f a -> f b

    instance (BasicSeq f a, BasicSeq f b) => Functor f a b where
        fmap = map

Now surely such an absolutely clear and straightforward instance decl would cause no problems at all, but guess what? Yes, even without fundeps and attempts to pattern match against the components of a type, GHC can't handle it, because it needs at least one non-type-variable in the instance head, or else the dreaded -fallow-undecidable-instances. I wonder why there needs to be a distinction between type variables and non-type-variables, because I'd have thought that the whole point of restricted polymorphism is that there's supposed to be a continuum between a fully unrestricted polymorphic type and a fully concrete type, with constraints like (BasicSeq f a) partially solidifying (f) and (a).
The good news is that even though the Functor class can't be implemented, (fmap) can at least now be implemented within BasicSeq:

    class Foldable s a => BasicSeq s a where

        map :: BasicSeq t b => (a -> b) -> s a -> t b
        map f xs = case viewL xs of
            Just (x, xs') -> pushL (f x) (map f xs')
            Nothing       -> empty

        fmap :: BasicSeq s b => (a -> b) -> s a -> s b
        fmap = map

Anyway, it's interesting that there does not yet appear to be a proper solution (i.e. one that gives us Foldable, BasicSeq, and Functor, all with restricted polymorphism) with the type system as it stands at present, unless dangerous flags are used.

Regards, Brian.

-- 
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com
Re: [Haskell-cafe] Code review: efficiency question
Bulat Ziganshin wrote:
> [ideas including reverseMapM_]
> you will laugh, but speed of your two solutions depends on so many factors (including size of CPU cache) that noone can say that is better in general. although for small lists reverseMapM_ should be faster than reverse+mapM. what will be faster - using of higher-order function or direct recursion, i can't say, it's a really counter-intuitive area of ghc optimizer :) of course, i don't think that all that really matters for your program (drawing should anyway need much more time than looping). just use higher-level approach (that makes code simpler to write, understand and maintain) and don't bother your mind :)

Hi Bulat! Thanks for the suggestions about reverseMapM_ etc. It seems that since the speeds of the two solutions can be relatively faster/slower on different platforms/CPUs, I might as well just use the combination of existing functions mapM_ and reverse at the moment to get readable code with the least amount of effort :-)

Best regards,
Brian.
Re: [Haskell-cafe] Code review: efficiency question
Brian,

You might also want to take a look at the list fusion functionality in GHC which often can help optimize your programs when programming with lists. http://www.haskell.org/ghc/docs/latest/html/users_guide/rewrite-rules.html#id3153234 It doesn't help in your particular program but it might be usable for you in the future.

Cheers,
/Josef

On 5/3/06, Brian Hulley [EMAIL PROTECTED] wrote:
[snip]
Re: [Haskell-cafe] Code review: efficiency question
Evan Martin wrote:
> I remember reading a tutorial that pointed out that you can often avoid explicit recursion in Haskell and instead use higher-level operators. For your code, I think
>
>     drawModals = foldr (flip (>>)) (return ()) . map drawModal
>
> works(?).

I think it would be foldl, so that the (return ()) would be nested as the leftmost element. Thanks for pointing out this point-free version of drawModals, although for readability at the moment I think I still prefer just to use

    mapM_ drawModal (reverse cs)

Best regards,
Brian.
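The two fold spellings can be checked directly: with flip (>>) as the combining operator, foldr and foldl both run the actions in reverse list order, and differ only in how the (return ()) seed is nested. A small sketch using print as a stand-in for drawModal:

```haskell
main :: IO ()
main = do
  let acts = map print [1, 2, 3 :: Int]
  -- foldr nests the seed innermost-left: ((return () >> a3) >> a2) >> a1
  foldr (flip (>>)) (return ()) acts
  -- foldl nests it innermost-right: a3 >> (a2 >> (a1 >> return ()))
  foldl (flip (>>)) (return ()) acts
```

Both loops print 3, 2, 1, so either version matches `mapM_ drawModal (reverse cs)`.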
Re: [Haskell-cafe] Code review: efficiency question
Hello Brian,

Tuesday, May 2, 2006, 3:12:48 AM, you wrote:

>     -- Prolog style coding...
>     drawModals :: [Control] -> ManagerM ()
>     drawModals [] = return ()
>     drawModals (c:cs) = do
>         drawModals cs
>         drawModal c

imho, it's typical functional style, but without using higher-level functions

>     mapM_ drawModal (reverse cs)
>
> However, while this looks more elegant, it is less efficient? In other words, how much optimization can one assume when writing Haskell code?

ghc will not translate your latter code into the former one. although in general ghc (but not other haskell compilers, afaik) is very good in replacing one code with another faster one, and in particular in translating list producer + list consumer into a straightforward loop

how about this solution:

    reverseMapM_ f (x:xs) = do reverseMapM_ f xs; f x
    reverseMapM_ f [] = return ()

or you can define `reverseMapM_` via fold, if you have better FP skills than me :)

> I'm trying to get a rough idea so I can decide whether to write helper functions such as drawModals in future or whether I should always just use the most elegant code instead. Any ideas?

you will laugh, but speed of your two solutions depends on so many factors (including size of CPU cache) that no one can say which is better in general. although for small lists reverseMapM_ should be faster than reverse+mapM. what will be faster - using a higher-order function or direct recursion - i can't say, it's a really counter-intuitive area of the ghc optimizer :)

of course, i don't think that all that really matters for your program (drawing should anyway need much more time than looping). just use the higher-level approach (that makes code simpler to write, understand and maintain) and don't bother your mind :)

-- 
Best regards,
Bulat                          mailto:[EMAIL PROTECTED]
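Bulat's reverseMapM_ and Brian's mapM_-plus-reverse can be checked against each other: both should visit the elements in reverse list order. A self-contained sketch that records the visit order in an IORef:

```haskell
import Data.IORef

-- direct-recursion variant, as in Bulat's suggestion
reverseMapM_ :: Monad m => (a -> m ()) -> [a] -> m ()
reverseMapM_ f (x:xs) = do reverseMapM_ f xs; f x
reverseMapM_ _ []     = return ()

-- run a traversal over [1,2,3], recording the order elements are visited
visitOrder :: ((Int -> IO ()) -> [Int] -> IO ()) -> IO [Int]
visitOrder run = do
  ref <- newIORef []
  run (\x -> modifyIORef ref (++ [x])) [1, 2, 3]
  readIORef ref

main :: IO ()
main = do
  a <- visitOrder reverseMapM_
  b <- visitOrder (\f xs -> mapM_ f (reverse xs))
  print a         -- visit order of reverseMapM_
  print (a == b)  -- agrees with mapM_ . reverse
```

Which one is faster is, as the thread says, hard to predict; this only confirms they compute the same thing.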
Re: [Haskell-cafe] Code Review: Sudoku solver
I have added 'solve method F', aka X-wing, swordfish... (www.sudokusolver.co.uk) to my solver; that reduced the number of puzzles needing guesses to 5306, so I suppose that's it. I haven't yet implemented it efficiently, so it was devastating for performance - and solving thirteen puzzles more by pure logic (or should we rather say: without using any assumptions except that the puzzle is solvable, uniquely or not?) isn't such a big step, I had hoped for more.

Cheers,
Daniel

Am Freitag, 14. April 2006 21:40 schrieb Chris Kuklewicz:
I believe if you change the representation of puzzles from [(pos,range)] to an Array, you'll get a significant speedup (yet because I only recently removed a logic bug that slowed down the search instead of speeding it up; ..). so the more interesting bit is that our solvers disagree on which are the most difficult puzzles (requiring the largest number of guesses): df puzzles involving guesses: 5319
If that's not a typo, I'm baffled. My original needed to guess in 5309
Rot! Typo in _my_ previous message, 5319 is correct.
After posting my cleaned up dancing links solver, I went back to my logical solver. I sent the 36628 line sudoku17 puzzle through it and it could solve 31322 of the puzzles, leaving 5306 resistant. I haven't checked my algorithms against your code, but it does not seem like I do anything spectacularly more clever. I don't have any time to clean up my code, but I may try turning off some steps to see if one of them gives me the same 5319 number you are seeing.

-- 
In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt. -- Blair P. Houghton
Re: [Haskell-cafe] Code Review: Sudoku solver
I believe if you change the representation of puzzles from [(pos,range)] to an Array, you'll get a significant speedup (yet because I only recently removed a logic bug that slowed down the search instead of speeding it up; ..). so the more interesting bit is that our solvers disagree on which are the most difficult puzzles (requiring the largest number of guesses): df puzzles involving guesses: 5319

If that's not a typo, I'm baffled. My original needed to guess in 5309

Rot! Typo in _my_ previous message, 5319 is correct.

After posting my cleaned up dancing links solver, I went back to my logical solver. I sent the 36628 line sudoku17 puzzle through it and it could solve 31322 of the puzzles, leaving 5306 resistant. I haven't checked my algorithms against your code, but it does not seem like I do anything spectacularly more clever. I don't have any time to clean up my code, but I may try turning off some steps to see if one of them gives me the same 5319 number you are seeing.

-- 
Chris Kuklewicz
Re: [Haskell-cafe] Code Review: Sudoku solver
Daniel Fischer wrote: My IDE is kate/kwrite + ghci + hugs, if you know a better one (which doesn't involve loading down a 99MB thing like eclipse), I'd be interested to try it out. (I already knew emacs, so using haskell-mode is my choice) I changed the output format to a line per puzzle (solution/number and list of guesses - I still like that metric as each guess means that we've run out of ideas, which suggests that we want to get the number of guesses down to zero), using the same format for both solvers to and if we get down to zero, all's fine, but I think we shouldn't ignore the false guesses. As a metric, you need to include all the blind alleys. as for guessing strategies, you're sorting the non-fixed positions by size of range before extracting candidates for guesses, whereas I simply take them in the order they appear in the puzzle (which may have been somewhat hard to see thanks to another one of those overcomplicated pieces of code that has now disappeared..). I had thought about that, but I have also tried guessing on the first blank, increased time about 12%, so until I find a better criterion, I'll stay with fewest possibilities. The (simple) idea behind that choice is, if I have only two possibilities, I have a 50% chance of being right on the first attempt - that reasoning of course was based on the 'find one solution and stop' target, for determining all solutions, i.e. following all branches, I'm not so sure I could justify it. However, I've also tried the opposite (just to compare), guess at a position with maximum number of choices: MUCH slower. The *only* optimization in Knuth's dancing-links binary-code-problem-solver is that it always chooses the constraint with the minimum number of possibilities. (Sometimes there is only one possibility). if I eliminate the sorting from your solver, both solvers deliver the same results, so that's nice!-) but it also means that we have room for further improvement. 
one reason why I got interested in this problem in the first place was that I had been reading about constraint propagation, and wanted to get some hands-on experience. in particular, there is a technique called generalised propagation:

    Generalised Constraint Propagation Over the CLP Scheme, by Thierry Le Provost and Mark Wallace. Journal of Logic Programming 16, July 1993. Also [ECRC-92-1] http://eclipse.crosscoreop.com:8080/eclipse/reports/ecrc_genprop.ps.gz

if I may oversimplify things a bit:

1. split inferences into branching (search) and non-branching (propagation)
2. distribute common information from non-branching inferences out of branches
3. to enhance applicability of step 2, add approximation of information

applied to our sudoku problem, we have deterministic propagation and branching search, and what we do if we run out of propagation opportunities is to select a non-fixed position, then branch on each of the numbers in its range. adding generalised propagation (as I understand it) amounts to a form of lookahead: we take each of those branches, but look only at the deterministic information we can extract in each (ignoring further search). then we check whether the branches have any of that information in common, and apply any such common information to our matrix. rinse and repeat. an obvious way to refine this process by approximation is to build the union of the ranges on each position for the matrices resulting from all the branches.

using this idea with a single level of branch lookahead, and selecting a position with the widest range for inspection, reduces the number of puzzles requiring guesses to 373, with a maximum depth of 3 guesses.

373 out of the list of 36628? Impressive. But this method, if I'm not grossly mistaken, does involve guessing - refined, educated guessing, but guessing still.
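The single-level lookahead can be sketched in miniature, reduced to one inspected cell and one watched cell. The propagation step is a stand-in function (all names and the toy data are ours): for each candidate v of the inspected cell it returns the watched cell's narrowed range, or Nothing on contradiction, and the union of the surviving ranges is the approximated common information - any value outside that union dies in every branch, so it can be removed for good:

```haskell
import qualified Data.IntSet as S

lookaheadUnion :: (Int -> Maybe S.IntSet)  -- propagate with v fixed; Nothing = contradiction
               -> S.IntSet                 -- candidates of the inspected cell
               -> S.IntSet                 -- union of the watched cell's surviving ranges
lookaheadUnion propagate cands =
  S.unions [ r | v <- S.toList cands, Just r <- [propagate v] ]

main :: IO ()
main = do
  -- toy branches: v=1 leaves {2,3}, v=2 leaves {3,4}, v=3 contradicts,
  -- so candidate 1 of the watched cell is eliminated in every branch
  let prop v = fmap S.fromList (lookup v [(1, [2, 3]), (2, [3, 4])])
  print (S.toList (lookaheadUnion prop (S.fromList [1, 2, 3])))
```

In a real solver the propagate argument would run the full deterministic narrowing over the whole matrix; the union would then be intersected with the watched cell's current range, position by position.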
On the other hand, one might correctly state that even the wildest guess and check is logic (proof by contradiction; if I put values v_1 to v_n at positions p_1 to p_n respectively and then value v_(n+1) at position p_(n+1), I finally can't satisfy the constraints, hence, given v_1 ... v_n at p_1 ... p_n, at p_(n+1), v_(n+1) cannot be ...) There is no clear definition that separates logic and guessing, because the algorithm does not understand your semantics. I would say the semantics of lookahead 1 step are different from guessing because the amount of computation involved is predictable ahead of time: you can look at the current possibilities and know how much work goes into lookahead 1 step but you can't look at the current state and know how much work a brute force search will take. it doesn't reduce runtime much, yet (7min50s), but I haven't even started profiling, either. I guess I have to look into representations now, if I ever want my solver to catch up with your's in terms of runtime;-) cheers, claus Cheers, Daniel
Re[2]: [Haskell-cafe] Code Review: Sudoku solver
Hello Daniel,

Tuesday, April 11, 2006, 3:26:06 AM, you wrote:

> Today, I wrote a version using IOArray, hoping to get incredibly fast in-place updating, and explicitly making copies via freeze and thaw when guessing is required, but no luck (maybe I just don't know how to do it properly), spent

Arrays implementation does the same. for example:

    class HasBounds a => IArray a e where
        unsafeReplace arr ies = runST (unsafeReplaceST arr ies >>= unsafeFreeze)

    unsafeReplaceST :: (IArray a e, Ix i) => a i e -> [(Int, e)] -> ST s (STArray s i e)
    unsafeReplaceST arr ies = do
        marr <- thaw arr
        sequence_ [unsafeWrite marr i e | (i, e) <- ies]
        return marr

may be you should try using unsafeFreeze or unsafeThaw? they differ from 'safe' analogs in that they don't take a copy of the array, but just modify the appropriate flag in the array header

> 85% of the time in GC, 18.3% with -H64M -A32M, MUT time is about 15% more than plain Array and EnumSet. If I had even the vaguest idea how I could provide an instance MArray IOUArray Entry IO (or such with Entry replaced by (Int, Set Int) or even (# Int, Word #)), I would give that a try.

on 32-bit CPU you can use just IOUArray Int Word64 :)

> Now I've changed writeArray to unsafeWrite (analogously for read), that

are you used `unsafeAt` for Arrays?

> brought MUT time to less than the plain Array version, but still spends a lot of time gc'ing (16.4% with -H64M -A64M, 10.2% with -H64M -A64M, that brought total time to less than plain Array, but had a maximum residency of 74,252,696 bytes, too much for my liking, plain Array has maximum residency of several K)

if you will use -H1G the residency will be even larger ;)

the real problem is IOArray - its usage raises GC times significantly in ghc 6.4. ghc 6.6 solved this problem. anyway, using an unboxed array will be much faster. because you use only 9 bits for enumeration and i think even less for Int, it's no problem for you to use UArray Int Int and manually divide the Int extracted into two parts - Int and Set.
The other custom solution is mangling indexes - for example 2*i for Int and 2*i+1 for Set, or i for Int and i+10 for Set - of course, in these cases you should allocate a large enough array

> I don't understand it, I would have thought that when I call newArray, an array of appropriate size is created somewhere on the heap (or wherever) and stays there until no longer referenced, and writeArray would just change the contents of the relevant memory-cell and not touch the rest, so that should be fast and allocate less than plain Array, but it seems that isn't so :-(

you are right. IOArray raises GC times not because it allocates more, but because on each GC (even minor ones, which happen after each 256kb allocated with default -A/-H settings) ALL IOArrays must be scanned, because pointers to data from the newly allocated part of the heap can be written into these arrays (after all, these arrays are mutable and can contain references to other data). the solution is either using UNBOXED arrays, which can't contain references and therefore are not scanned, or immutable arrays, which again are not scanned because they cannot be changed. ghc 6.6 just saves the list of all arrays/references that were mutated since the last GC. the ultimate problem is that the 2-stage GC mechanism is optimized for immutable functional datastructures - not IOArrays

-- 
Best regards,
Bulat                          mailto:[EMAIL PROTECTED]
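The index-mangling idea can be sketched with an unboxed array: for cell i, index 2*i holds the fixed value and index 2*i+1 holds the 9-bit candidate mask. The grid size and helper names here are ours, purely for illustration:

```haskell
import Data.Array.ST
import Data.Array.Unboxed
import Data.Bits

-- one unboxed array for an 81-cell grid: value at 2*i, candidate mask at 2*i+1
gridSketch :: UArray Int Int
gridSketch = runSTUArray $ do
  a <- newArray (0, 2 * 81 - 1) 0
  -- initialise every mask to bits 1..9 set (2^10 - 2 == 1022)
  sequence_ [ writeArray a (2 * i + 1) 1022 | i <- [0 .. 80] ]
  writeArray a (2 * 40)     5        -- fix value 5 in cell 40 ...
  writeArray a (2 * 40 + 1) (bit 5)  -- ... and shrink its mask to {5}
  return a

main :: IO ()
main = do
  let g = gridSketch
  print (g ! (2 * 40))                 -- the value slot of cell 40
  print (popCount (g ! (2 * 40 + 1)))  -- its mask is now a singleton
  print (popCount (g ! 1))             -- untouched cells still have 9 candidates
```

Because the array is unboxed (UArray/STUArray of Int), it holds no pointers and is skipped by the GC scans Bulat describes.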
Re: [Haskell-cafe] Code Review: Sudoku solver
> Mine's even faster now (at least on my computer, would you care to test it on yours? If you don't want to get EnumSet, just change DiffArray to Array, worked wonders for me), I'll dig into yours tomorrow to see what I can get out of it to improve my algorithm. Unfortunately, no new inference rules :-(

indeed, the crucial difference between our versions was that I'd avoided group inference; I thought it would be expensive to implement and, based on my own limited experience, rarely used - boy, was I wrong on both accounts: it is easily implemented, with a little care for the inner loop, not too inefficient, either, and the examples I mentioned (on which my solver needed guesswork) simply fall apart once grouping is added (it was quite impressive to take the output of my old propagation code and apply grouping by hand!). now my solver takes slightly fewer guesses than yours, though yours is still significantly faster (same factor of about 3 between them, both having improved by a factor of 2 or so..).

> Two things I don't like about your code:
> 1. no type declarations

that's what type inference is for!-) I tend to add them only as debugging assertions, when language limitations require them, or when I'm fairly certain that the code is stable (ie., dead;-). it is useful to think about types, but engraving them into the code is overrated (any good ide should be able to show them as if they'd been written in anyway..; lacking such ide, :browse Main will do the trick, though ghci's current formatting is ugly!).

> 2. too much name shadowing, that makes following the code difficult

depends on which parts you mean - I don't like overusing primed or numbered variables, and using the same variable name through a sequence of steps is usually the last step before factoring that variable out into some monad anyway (I did mention that the code was still ad-hoc, not cleaned up). apart from that:

> clever

as in: you'd prefer it if I'd express myself more clearly?-) agreed.
I've been working on cleaning up the code (also trying to remove stupidities as they become exposed in the process..). lists and adhoc code tend to be performance killers, I doubled the speed of mine by de-adhoccing the code (and that although I introduced the speed-killer DiffArray) there's nothing wrong with lists, just as arrays are not synonymous with good performance - it all depends on the usage patterns. there are several interesting examples in this problem: 1 when I first tried EnumSet for the ranges, there was no substantial performance difference to a list-based representation. in theory, the change sounds like a clear win (constant vs linear time membership test), but (a) the ranges are small (they start at size 9, and we do our best to shrink them rapidly, so even if they were at average size 6 or 7, the test would take an average of 3 or 4 steps); (b) membership is not the only operation - for the inner loops in my code, singleton testing and extraction of the single element from a singleton were equally important, and here the relation was reversed: lists give us those operations in constant time while they were linear in the old EnumSet. Bulat's tabling suggestion (which I interpreted rather more extremely by tabling the actual calls to size and range2list:-) solves this issue beautifully, making EnumSet starting to look competitive in spite of the particular usage pattern in this problem, and when the grouping inference asked for more calls to size, as well as intersection and union, the choice flipped completely, now in favour of EnumSet (has agreement been reached about an updated version of EnumSet?). 
[if I might make a suggestion for EnumSet: it is not just that the implementation of the standard API can be more efficient for these small sets, but there are additional operations that ought to be included (unless the EnumSet.Set constructor gets exported), such as lifting the Ix instance from Word to Set a] 2 DiffArrays: they are neither a speed killer nor a universal cure. they were intended to enable functional arrays with inplace update for sequential access, but without the complex analyses needed to ensure such sequential access, and with the reverse diff to old versions preserving the persistence of copies. using them, we'll get inplace update plus a long trail recording the diffs needed to reconstruct the old versions, and if we use them single-threadedly, that trail will just be ignored and the garbage collector will neglect to copy it at some point. but if we combine them with backtracking, those trails are likely to kill much of the performance gains, and it is not so much the width of branching as the depth of each branch: we iterate on that array to narrow ranges and to update positions, and each tiny update extends that trail of reverse diffs (they are not even
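The tabling idea mentioned for EnumSet can be sketched directly: precompute the size (population count) of every 9-bit candidate mask once, so the inner loop pays one array lookup instead of a bit-counting loop. The function name is ours:

```haskell
import Data.Array
import Data.Bits

-- one-off table: size of every possible 9-bit candidate mask
sizeTable :: Array Int Int
sizeTable = listArray (0, 511) [ popCount w | w <- [0 .. 511 :: Int] ]

setSize :: Int -> Int
setSize w = sizeTable ! w

main :: IO ()
main = do
  print (setSize 0)    -- empty set
  print (setSize 325)  -- 325 = 101000101 in binary: four candidates
  print (setSize 511)  -- full 9-candidate range
```

The same trick extends to tabling a mask-to-element-list function, which is the other singleton-heavy operation the inner loops need.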
Re: [Haskell-cafe] Code Review: Sudoku solver
Moin Claus, Am Montag, 10. April 2006 10:11 schrieben Sie: Mine's even faster now (at least on my computer, would you care to test it on your's? If you don't want to get EnumSet, just change DiffArray to Array, worked wonders for me), I'll dig into yours tomorrow to see what I can get out of it to improve my algorithm. Unforunately, no new inference rules :-( indeed, the crucial difference between our versions was that I'd avoided group inference; I thought it would be expensive to implement and, based on my own limited experience, rarely used - boy, was I wrong on both accounts: it is easily implemented, with a little care for the inner loop, not too inefficient, either, and the examples I mentioned (on which my solver needed guesswork) simply fall apart once grouping is added (it was quite impressive to take the output of my old propagation code and apply grouping by hand!). Just to be sure, with group inference added, your solver must guess for exactly the same puzzles as mine? now my solver takes slightly fewer guesses than your's, though your's is still significantly faster (same factor of about 3 between them, both having improved by a factor of 2 or so..). Two things I don't like about your code: 1. no type declarations that's what type inference is for!-) I tend to add them only as debugging assertions, when language limitations require them, or when I'm fairly certain that the code is stable (ie., dead;-). it is useful to think about types, but engraving them into the code is overrated (any good ide should My IDE is kate/kwrite + ghci + hugs, if you know a better one (which doesn't involve loading down a 99MB thing like eclipse), I'd be interested to try it out. be able to show them as if they'd been written in anyway..; lacking such ide, :browse Main will do the trick, though ghci's current formatting is ugly!). : Well, I didn't :b Main and add the signatures before I printed it out, and since I read it in bed, I had to figure them out myself. 
And as somebody else somewhere stated, a type signature is about the most useful piece of documentation. But I concede that there are also good reasons for leaving them out (especially in the early development stage). 2. too much name shadowing, that makes following the code difficult depends on which parts you mean - I don't like overusing primed or guesses, candidates and the where clause of findNarrowPos numbered variables, and using the same variable name through a sequence of steps is usually the last step before factoring that variable out into some monad anyway (I did mention that the code was still ad-hoc, not cleaned up). Yes, and if you hadn't, I would have been annoyed by it (after all, I'm the only one who may write unreadable code :-) apart from that: clever as in: you'd prefer it if I'd express myself more clearly?-) agreed. That, too - don't we all want everybody else to express themselves immaculately lucid? But mostly: clever, as in: clever I've been working on cleaning up the code (also trying to remove stupidities as they become exposed in the process..). I'd like to see it, when you consider it fit for posting. lists and adhoc code tend to be performance killers, I doubled the speed of mine by de-adhoccing the code (and that although I introduced the speed-killer DiffArray) there's nothing wrong with lists, just as arrays are not synonymous with good performance - it all depends on the usage patterns. there are several Of course, and I love lists! I meant, lists are slow in situations like these, where you repeatedly access elements at various positions, as in findSingletonPos and findNarrowPos, where you have to traverse the whole list for each row/column/block. Such selections are much faster done over an array (I believe, but imagining a 1024 by 1024 sudoku, I'm rather confident) interesting examples in this problem: 1 when I first tried EnumSet for the ranges, there was no substantial performance difference to a list-based representation. 
in theory, the change sounds like a clear win (constant vs linear time membership test), but (a) the ranges are small (they start at size 9, and we do our best to shrink them rapidly, so even if they were at average size 6 or 7, the test would take an average of 3 or 4 steps); (b) membership is not the only operation - for the inner loops in my code, singleton testing and extraction of the single element from a singleton were equally important, and here the relation was reversed: lists give us those operations in constant time while they were linear in the old EnumSet. Yes, and for those reasons I didn't expect much difference from changing to EnumSet, but in the group inference, I form a lot of unions (and before I cleaned up my code, also many differences) which I expected to be significantly faster with EnumSet, so I gave it a try and it was
Re: [Haskell-cafe] Code Review: Sudoku solver
I just ran a simple metric for the dancing-links solver. The only real metric I could use was the number of coverOthers calls, which counts the number of selections made (there is no distinction between certainty and guessing). So the minimum # of selections is 81 (of which 17 are the free hints, and 64 were real moves). Each selection beyond 81 involved backtracking. Of the 36628 puzzles, there were 21395 puzzles that were solved with 81 selections, and therefore no backtracking. So 58% of them were (some with luck) solved instantly. This low percentage is why this is not a by logic solver.

    puzzles  selections each                  selections  excess/above 81
      21395  minimal 81    (17+64*1)             1732995        0
       9841  up to   145   (17+64*2)             1043838   246717  (== 1043838 - 9841*81)
       3192  up to   273   (17+64*4)              611275   352723
       1215  up to   529   (17+64*8)              453302   354887
        576  up to  1041   (17+64*16)             415496   368840
        248  up to  2065   (17+64*32)             354028   333940
        105  up to  4113   (17+64*64)             300909   292404
         42  up to  8209   (17+64*128)            248645   245243
         10  up to 16401   (17+64*256)            120724   119914
          3  up to 32785   (17+64*512)             62319    62076
          1  used  32875   (puzzle # 10995)        32875    32794
    -------                                      -------  -------
      36628                                      5376406  2409538

I think the important thing about the list is that the work in each group is double the one before it, but the count of puzzles is always less than half the one before it. Thus the total selections decreases. So the hard puzzles are time consuming but not catastrophic. Anyway, 94% of the puzzles were solved with no more than 17+64*4 selections. And only 50.7% of the selections beyond the 17 hints required backtracking. (Computed from 2409538 / ( 5376406 - 36628 * 17 ).) That's an average of 65.8 bad selections per puzzle that requires 64 good selections to solve. This seems like an excellent benchmark for comparing brute force methods.

A final point about dancing links is that this is a generic data structure and algorithm for solving any binary cover problem (such as arranging pentominoes). It is not specialized to knowing anything about Sudoku.
The initialization tells it there are 324 constraints and this is a comprehensive set of moves, limited only by the initial hints. And it runs only 50% slower than bit-twiddling specialized logic. The funny thing is that Sudoku is too small for this method, by which I mean that the constant cost of initializing the data structure that is used can become comparable to running the algorithm. I had to profile my code while rewriting it and optimize the initialization code to have a much smaller overhead than Knuth's code. Most non-Sudoku problems that one needs to brute force with this will run much longer than any setup costs.

Finally, here were the worst 30 puzzles:

    selections  puzzle#
         32875    10995
         23577      412
         20707    27267
         18035    26632
         14350    21765
         14247    33677
         13966    10992
         13660     7250
         13453    28608
         12243    19106
         12181    10993
          9300    18177
          8686    24590
          8638    12445
          8161    19387
          8157     7921
          7916    35782
          7900    33131
          7850    36162
          7663    21951
          7427    31110
          7321    16608
          7268    33554
          7200     1259
          7108     9750
          6805    28607
          6781    21919
          6716    14969
          6432    26234
          6361    15697

I don't notice any overlap with the list of hard puzzles to solve with the other programs.

-- Chris
Re: [Haskell-cafe] Code Review: Sudoku solver
On Saturday, 8 April 2006 10:21, Chris Kuklewicz wrote: Daniel Fischer wrote: But, lo and behold, I also tried how plain Array fared in comparison to DiffArray and ... reduced the running time to under ten minutes (a little above for the list version), 5% GC time without -AxM, 1.2% with -A8M. And I thought DiffArrays were supposed to be fast! No. DiffArrays are faster for the usual imperative single threaded usage pattern. The haddock documentation explains: Well, it's single threaded for 31,309 of the puzzles and hasn't much branching for most of the others. I hoped that when guessing was necessary, a copy was made to be used for the other branches, but apparently I couldn't persuade the compiler to do that. Diff arrays have an immutable interface, but rely on internal updates in place to provide a fast functional update operator (//). When the (//) operator is applied to a diff array, its contents are physically updated in place. The old array silently changes its representation without changing the visible behavior: it stores a link to the new current array along with the difference to be applied to get the old contents. So if a diff array is used in a single-threaded style, i.e. after (//) application the old version is no longer used, a ! i takes O(1) time and a // d takes O(length d). Accessing elements of older versions gradually becomes slower. Updating an array which is not current makes a physical copy. The resulting array is unlinked from the old family. So you can obtain a version which is guaranteed to be current and thus have fast element access by a // []. I assume the usage in a Sudoku solver involves a non-trivial amount of back-tracking. So as the solver backs up and goes forward again it ends up being much more work than having used a plain Array. I tried to keep backtracking to a minimum, and in most cases that minimum is reached.
And as was pointed out by someone else on this list: to be thread safe the DiffArray uses MVars (with locking) instead of IORefs. But I expect the main problem is that a DiffArray is simply not the right mutable data structure for the job. Should I try an MArray? And what's more promising, IOArray or STArray? I have had the flu this week, so I did not finish cleaning up my port of So did I, hope you're well again. Knuth's mutable dancing links based Sudoku solver. But it uses a much more lightweight way to alter a mutable data structure both going forward and backwards while backtracking. And I can use STRef's to build it, instead of MVars. Cheers, Daniel -- In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt. -- Blair P. Houghton
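The single-threaded usage pattern from the haddocks above can be illustrated with plain Data.Array, which shares the (//) interface (DiffArray itself lives in a separate package these days; `place` and `emptyGrid` are my names, for illustration only). The key point is that the old version stays usable after an update, and it is exactly this persistence that forces DiffArray to do its diff-chain bookkeeping under backtracking:

```haskell
import Data.Array

-- Functional update via (//): returns a new array. With plain Array
-- this copies; with DiffArray the old version instead becomes a chain
-- of diffs pointing at the new array, so re-reading old versions
-- (as a backtracking solver does) gets progressively slower.
place :: Array (Int,Int) Int -> (Int,Int) -> Int -> Array (Int,Int) Int
place grid pos v = grid // [(pos, v)]

-- A 9x9 grid of zeros standing in for "no entry yet".
emptyGrid :: Array (Int,Int) Int
emptyGrid = listArray ((1,1),(9,9)) (replicate 81 0)
```

After `let g = place emptyGrid (3,4) 7`, both `g ! (3,4) == 7` and `emptyGrid ! (3,4) == 0` hold: the old array is untouched, which is what a guessing solver relies on and what DiffArray penalizes.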
Re: [Haskell-cafe] Code Review: Sudoku solver
Daniel Fischer wrote: But, lo and behold, I also tried how plain Array fared in comparison to DiffArray and ... reduced the running time to under ten minutes (a little above for the list version), 5% GC time without -AxM, 1.2% with -A8M. And I thought DiffArrays were supposed to be fast! No. DiffArrays are faster for the usual imperative single threaded usage pattern. The haddock documentation explains: Diff arrays have an immutable interface, but rely on internal updates in place to provide a fast functional update operator (//). When the (//) operator is applied to a diff array, its contents are physically updated in place. The old array silently changes its representation without changing the visible behavior: it stores a link to the new current array along with the difference to be applied to get the old contents. So if a diff array is used in a single-threaded style, i.e. after (//) application the old version is no longer used, a ! i takes O(1) time and a // d takes O(length d). Accessing elements of older versions gradually becomes slower. Updating an array which is not current makes a physical copy. The resulting array is unlinked from the old family. So you can obtain a version which is guaranteed to be current and thus have fast element access by a // []. I assume the usage in a Sudoku solver involves a non-trivial amount of back-tracking. So as the solver backs up and goes forward again it ends up being much more work than having used a plain Array. And as was pointed out by someone else on this list: to be thread safe the DiffArray uses MVars (with locking) instead of IORefs. But I expect the main problem is that a DiffArray is simply not the right mutable data structure for the job. I have had the flu this week, so I did not finish cleaning up my port of Knuth's mutable dancing links based Sudoku solver. But it uses a much more lightweight way to alter a mutable data structure both going forward and backwards while backtracking.
And I can use STRef's to build it, instead of MVars. -- Chris
Re[2]: [Haskell-cafe] Code Review: Sudoku solver
Hello Daniel, Saturday, April 8, 2006, 3:06:03 AM, you wrote: And I thought, DiffArrays were supposed to be fast! 1. your arrays are too small (8 elements only) 2. DiffArray internally uses MVars; with IORefs it would be a lot faster -- Best regards, Bulat  mailto:[EMAIL PROTECTED]
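Bulat's point about MVars versus IORefs can be seen at the API level. This sketch (my helper names, not a benchmark) runs the same counter update through both primitives; an IORef is a plain mutable cell, while an MVar adds take/put synchronization, which is useful across threads but pure overhead inside single-threaded code like DiffArray's internals:

```haskell
import Data.IORef
import Control.Concurrent.MVar

-- Increment through a plain mutable cell and return the new value.
bumpIORef :: IORef Int -> IO Int
bumpIORef r = do
  modifyIORef' r (+1)      -- strict modify, no locking involved
  readIORef r

-- The same update through an MVar: modifyMVar takes the value
-- (blocking if another thread holds it) and puts back the new one.
bumpMVar :: MVar Int -> IO Int
bumpMVar m = modifyMVar m (\x -> let y = x + 1 in return (y, y))
```

Both produce the same result single-threaded; only the MVar version pays for the take/put protocol on every update.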
Re[2]: [Haskell-cafe] Code Review: Sudoku solver
Hello Chris, Saturday, April 8, 2006, 12:21:07 PM, you wrote: backtracking. And I can use STRef's to build it, instead of MVars. Maybe it's better to use unboxed arrays/references? -- Best regards, Bulat  mailto:[EMAIL PROTECTED]
Re: [Haskell-cafe] Code Review: Sudoku solver
I have finished my cleanup of the dancing links based solver for Sudoku. I don't have time to compare performance with the other programs that have been posted recently, or to do more profiling of my code. For those who will say "It is ugly, imperative, and ugly!", please remember this is a conversion of Knuth's C code, which depended on five non-trivial goto jumps because he did not have tail recursion. And the whole point of the algorithm is the imperative unlink/relink routines acting on a sparse binary matrix. This does not use any clever logic, but it does pick the tightest constraint at every step. This means if there is only one obvious possibility for a position/row/column/block then it will immediately act on it. The literate source file is attached. [ My clever logic solver may eventually be cleaned up as well. ] -- Chris A Sudoku solver by Chris Kuklewicz (haskell (at) list (dot) mightyreason (dot) com) I compile on a powerbook G4 (Mac OS X, ghc 6.4.2) using ghc -optc-O3 -funbox-strict-fields -O2 --make -fglasgow-exts This is a translation of Knuth's GDANCE from dance.w / dance.c http://www-cs-faculty.stanford.edu/~uno/preprints.html http://www-cs-faculty.stanford.edu/~uno/programs.html http://en.wikipedia.org/wiki/Dancing_Links I have an older version that uses lazy ST to return the solutions on demand, which was more useful when trying to generate new puzzles to solve.
module Main where

import Prelude hiding (read)
import Control.Monad
import Control.Monad.Fix
import Data.Array.IArray
import Control.Monad.ST.Strict
import Data.STRef.Strict
import Data.Char(intToDigit,digitToInt)
import Data.List(unfoldr,intersperse)

new = newSTRef
{-# INLINE new #-}
read = readSTRef
{-# INLINE read #-}
write = writeSTRef
{-# INLINE write #-}
modify = modifySTRef
{-# INLINE modify #-}

Data types to prevent mixing different index and value types:

type A = Int
newtype R = R A deriving (Show,Read,Eq,Ord,Ix,Enum)
newtype C = C A deriving (Show,Read,Eq,Ord,Ix,Enum)
newtype V = V A deriving (Show,Read,Eq,Ord,Ix,Enum)
newtype B = B A deriving (Show,Read,Eq,Ord,Ix,Enum)

Sudoku also has block constraints, so we want to look up a block index in an array:

lookupBlock :: Array (R,C) B
lookupBlock = listArray bb [ toBlock ij | ij <- range bb ]
  where ra :: Array Int B
        ra = listArray (0,pred (rangeSize b)) [B (fst b) .. B (snd b)]
        toBlock (R i,C j) = ra ! ( (div (index b j) 3) + 3*(div (index b i) 3) )

The value for an unknown location is 'u'. The bound and range are given by b and rng. And bb is a 2D bound.

u = V 0                           -- unknown value
b :: (Int,Int)
b = (1,9)                         -- min and max bounds
rng = enumFromTo (fst b) (snd b)  -- list from '1' to '9'
bb = ((R (fst b),C (fst b)),(R (snd b),C (snd b)))

A Spec can be turned into a parsed array with ease:

type Hint = ((R,C),V)
newtype Spec = Spec [Hint] deriving (Eq,Show)
type PA = Array (R,C) V

parse :: Spec -> PA
parse (Spec parsed) = let acc old new = new
                      in accumArray acc u bb parsed

The dancing links algorithm depends on a sparse 2D node structure. Each column represents a constraint. Each row represents a Hint. The number of possible hints is 9x9x9 = 729.

type (MutInt st) = (STRef st) Int

The pointer types:

type (NodePtr st) = (STRef st) (Node st)
type (HeadPtr st) = (STRef st) (Head st)

The structure is a 2D grid of nodes, with column heads on top of the columns and a sparse collection of nodes.
Note that topNode of Head is not a strict field. This is because the topNode needs to refer to the Head, and they are both created monadically.

type HeadName = (Int,Int,Int) -- see below for meaning

data Head st = Head {headName :: !HeadName
                    ,topNode  :: (Node st)       -- header node for this column
                    ,len      :: !(MutInt st)    -- number of nodes below this head
                    ,next,prev :: !(HeadPtr st)  -- doubly-linked list
                    }

data Node st = Node {getHint :: !Hint
                    ,getHead :: !(Head st)       -- head for the column this node is in
                    ,up,down,left,right :: !(NodePtr st)  -- two doubly-linked lists
                    }

instance Eq (Head st) where a == b = headName a == headName b
instance Eq (Node st) where a == b = up a == up b

To initialize the structures is a bit tedious. Knuth's code reads in the problem description from a data file and builds the structure based on that. Rather than short strings, I will use HeadName as the identifier. The columns are

(0,4,5) for nodes that put some value in Row 4 Col 5
(1,2,3) for nodes that put Val 3 in Row 2 and some column
(2,7,4) for nodes that put Val 4 in Col 7 and some row
(3,1,8) for nodes that put Val 8 in some (row,column) in Block 1

The first head is (0,0,0), which is the root. The non-root head data will be put in an array with
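Under the column-naming scheme just described, each of the 9x9x9 = 729 candidate hints covers exactly four of the 324 constraint columns. A small sketch of that mapping (the helper names and the block-numbering formula are my assumptions of the usual 3x3 layout, not taken from the attached source):

```haskell
import Data.List (nub)

type HeadName = (Int,Int,Int)

-- The four constraint columns covered by the hint "value v at (r,c)",
-- with r, c, v in [1..9] and blocks numbered 1..9.
columnsFor :: Int -> Int -> Int -> [HeadName]
columnsFor r c v =
  [ (0,r,c)      -- some value occupies position (r,c)
  , (1,r,v)      -- v appears somewhere in row r
  , (2,c,v)      -- v appears somewhere in column c
  , (3,blk,v) ]  -- v appears somewhere in block blk
  where blk = 1 + 3 * ((r-1) `div` 3) + (c-1) `div` 3

-- Every distinct column generated by all 729 hints: 4 * 81 = 324.
allColumns :: [HeadName]
allColumns = nub [ col | r <- [1..9], c <- [1..9], v <- [1..9]
                       , col <- columnsFor r c v ]
```

Evaluating `length allColumns` gives the 324 constraints the initialization message below refers to.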
Re: [Haskell-cafe] Code Review: Sudoku solver
On Saturday, 8 April 2006 02:20, Daniel Fischer wrote: On Friday, 7 April 2006 17:33, you wrote: Just out of curiosity, speed was not the objective when I wrote my solver, I wanted to avoid guesswork (as much as possible), but in comparison with Cale Gibbard's and Alson Kemp's solvers (which are much more beautifully coded), it turned out that mine is blazingly fast, so are there faster solvers around (in Haskell, in other languages)? if I modify your solver to produce similar output to mine (input/first propagation, solved puzzle, number and list of guesses made), yours takes about a third of the time of mine (solving 36628 17-hint puzzles in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with my solver before I did this comparison!-) Mine's even faster now (at least on my computer, would you care to test it on yours? If you don't want to get EnumSet, just change DiffArray to Array, worked wonders for me), I'll dig into yours tomorrow to see what I can get out of it to improve my algorithm. Unfortunately, no new inference rules :-( Two things I don't like about your code: 1. no type declarations 2. too much name shadowing, that makes following the code difficult apart from that: clever like you, I've been trying to remove guesses, and the speed came as a welcome bonus (I'm still using lists all over the place, with lots of not nice adhoc code still remaining; not all propagators are iterated fully lists and adhoc code tend to be performance killers, I doubled the speed of mine by de-adhoccing the code (and that although I introduced the speed-killer DiffArray) I believe if you change the representation of puzzles from [(pos,range)] to an Array, you'll get a significant speedup yet because I only recently removed a logic bug that slowed down the search instead of speeding it up; ..).
so the more interesting bit is that our solvers disagree on which are the most difficult puzzles (requiring the largest number of guesses): df puzzles involving guesses: 5319 If that's not a typo, I'm baffled. My original needed to guess in 5309 Rot! Typo in _my_ previous message, 5319 is correct. puzzles, and I can't imagine what inference I could have dropped when cleaning up the code. largest number of guesses: 10 (#36084), 11 (#22495) cr puzzles involving guesses: 8165 largest number of guesses: 10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811) df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs 5/9 guesses for df's. lots of potential for interesting investigations, We use different guessing strategies (plus, I also have group-inference). But the given number of guesses is the number of guesses in the successful branch, and I think a better measure for the nefariousness of a puzzle is the accumulated number of guesses in all branches or the number of branches (still better is the time needed to solve the puzzle). 
This is the list of the 30 puzzles with the most branches:

puzzle  #branches
  3992    213
  7475    120
 12235    117
  5341     69
 11815     60
  9402     60
 11544     59
  9184     54
 10403     50
 31110     48
  8575     48
  1489     45
  2732     40
 11523     39
  6730     39
 10929     38
   960     35
 19474     32
  6412     31
  1599     30
 36084     29
 21832     29
 22495     28
  4657     28
 34747     27
 10404     27
 29931     26
   942     25
   563     24

and the top 30 in CPU time (in milliseconds, cpuTimePrecision = 10^10):

puzzle    ms
  3992   6480
  9184   1520
 31110   1470
 10403   1310
 12235   1260
  7475   1130
  2732   1080
   960   1050
  5341    990
 11544    960
 11815    930
  1395    730
 10929    710
  1863    710
  1330    700
 20807    630
  4181    610
 10634    570
 34401    550
   959    550
 34747    520
  1599    520
 14912    510
 29282    500
  7983    500
 29273    480
 23958    470
  2245    460
  2232    440
 36425    430

So puzzle 3992 is outstandingly bad in both respects (I fed it into my old step-by-step solver and boy, failure is detected _very_ late in practically all branches), and from a couple of tests I have the vague impression that the correlation between the number of guesses in the successful branch and time is not very strong (3992 has 6, 9184 and 2732 only 3, 31110 has 5, 10403 has 8, 12235 has 9, 7475 has 6 and 960 has 7), but I don't think I'll do a statistical analysis, I'll stick to time as the measure. Here's the meanest puzzle:

0 0 0  0 4 0  0 5 9
3 0 0  2 0 0  0 0 0
1 0 0  0 0 0  0 0 0

0 0 0  1 0 0  7 0 0
0 4 6  0 0 0  0 0 0
9 5 0  0 0 0  0 0 0

0 0 0  0 0 0  0 5 6
0 4 0  0 0 0  8 0 0
3 0 0  0 0 0  0 0 0

and that's so mean that David Place's incrsud beats my solver on this by a factor of 5.5! though mostly for me!-) ^^ I'm not sure about that :-) cheers, claus Cheers again, Daniel
Re: [Haskell-cafe] Code Review: Sudoku solver
On Saturday, 8 April 2006 20:28, Chris Kuklewicz wrote: I have finished my cleanup of the dancing links based solver for Sudoku. I don't have time to compare performance with the other programs that have been posted recently, or to do more profiling of my code. Your dancing links:

ckSud +RTS -sstderr -H32M -A8M < sudoku17 > Solutions.txt

62,941,602,892 bytes allocated in the heap
330,404,632 bytes copied during GC
465,944 bytes maximum residency (41 sample(s))
2023 collections in generation 0 ( 15.60s)
41 collections in generation 1 ( 0.30s)
32 Mb total memory in use
INIT  time   0.00s (  0.00s elapsed)
MUT   time 734.59s (781.93s elapsed)
GC    time  15.90s ( 16.73s elapsed)
EXIT  time   0.00s (  0.00s elapsed)
Total time 750.49s (798.66s elapsed)
%GC time 2.1% (2.1% elapsed)
Alloc rate 85,682,629 bytes per MUT second
Productivity 97.9% of total user, 92.0% of total elapsed

Without -HxM, -AxM:

INIT  time    0.00s (   0.00s elapsed)
MUT   time  597.47s ( 915.94s elapsed)
GC    time  912.65s (1363.63s elapsed)
EXIT  time    0.00s (   0.00s elapsed)
Total time 1510.12s (2279.57s elapsed)

My version using EnumSet (with the faster 'size'):

sudokus +RTS -sstderr > Solutions

82,190,535,600 bytes allocated in the heap
771,054,072 bytes copied during GC
153,512 bytes maximum residency (394 sample(s))
286104 collections in generation 0 ( 33.98s)
394 collections in generation 1 ( 0.35s)
2 Mb total memory in use
INIT  time   0.00s (   0.00s elapsed)
MUT   time 482.51s (1105.12s elapsed)
GC    time  34.33s (  79.90s elapsed)
EXIT  time   0.00s (   0.00s elapsed)
Total time 516.84s (1185.02s elapsed)
%GC time 6.6% (6.7% elapsed)
Alloc rate 170,339,548 bytes per MUT second
Productivity 93.4% of total user, 40.7% of total elapsed

Nice that original Haskell code can beat a translation from C.
However:

setSud ../puzzle3992 +RTS -sstderr

628|743|159
395|261|478
174|589|632
---+---+---
832|195|764
746|328|591
951|674|283
---+---+---
213|956|847
469|817|325
587|432|916
===

888,672,920 bytes allocated in the heap
3,352,784 bytes copied during GC
45,648 bytes maximum residency (1 sample(s))
3287 collections in generation 0 ( 0.21s)
1 collections in generation 1 ( 0.00s)
2 Mb total memory in use
INIT  time 0.00s ( 0.00s elapsed)
MUT   time 4.77s ( 4.77s elapsed)
GC    time 0.21s ( 0.22s elapsed)
EXIT  time 0.00s ( 0.00s elapsed)
Total time 4.98s ( 4.99s elapsed)
%GC time 4.2% (4.4% elapsed)
Alloc rate 186,304,595 bytes per MUT second
Productivity 95.8% of total user, 95.6% of total elapsed

But:

ckSud +RTS -sstderr < oneBad

(1,673941852148256379295378461534167928826594713917832546351729684762483195489615237)
(2,683941752179258463245376918534167829826594371917832546351729684762483195498615237)
(3,829143657361785492547629381678954213934271568215836974152468739486397125793512846)
(4,713642958825917436649835127594781362378264519261593784136478295482359671957126843)
(5,763942158425817936189635427594781362378264519216593784631478295842359671957126843)
(6,269743158371865924458921637945137286836492571712658349597386412683214795124579863)
(7,269743158371865924485921637943157286856492371712638549597386412638214795124579863)
(8,628743159395261478174589632832195764746328591951674283213956847469817325587432916)
(9,983541762761328945524679813679483251835162497142957638457816329296735184318294576)
(10,578942361923165748614837925867491532235786194149253876796514283452378619381629457)
(11,938145762127396854654872931873629145546713289291584673415967328369258417782431596)
(12,792548361531672894846931572657384129483129657219756438965817243174263985328495716)
(13,738249561296517483154386927673192854981654372425873619547928136319765248862431795)
(14,957842361386719452124653879598364127673281945412975638845137296231596784769428513)
(15,598241367673958124421736985254873691317692548986415732742169853165384279839527416)

25,708,036 bytes allocated in the heap
9,097,220 bytes copied during GC
329,648 bytes maximum residency (5 sample(s))
97 collections in generation 0 ( 0.42s)
5 collections in generation 1 ( 0.04s)
2 Mb total memory in use
INIT  time 0.00s ( 0.00s elapsed)
MUT   time 0.23s ( 0.23s elapsed)
GC    time 0.46s ( 0.46s elapsed)
EXIT  time 0.00s ( 0.00s elapsed)
Total time 0.69s ( 0.69s elapsed)
%GC time 66.7% (66.7% elapsed)
Alloc rate 111,774,069 bytes per MUT second
Productivity 33.3% of total user, 33.3% of total elapsed

The infamous puzzle 3992 is the eighth in oneBad, so the links dance rings around me for that. I wonder where the dancing links run into difficulties. I'll see whether I can grok (that does mean
Re[2]: [Haskell-cafe] Code Review: Sudoku solver
Hello Daniel, Friday, April 7, 2006, 2:05:03 AM, you wrote: I've cleaned up my solver, removed a lot of redundant inference steps and made the board a DiffArray (is that really faster?). btw, the DiffArray implementation can be made significantly faster by using IORefs instead of MVars -- Best regards, Bulat  mailto:[EMAIL PROTECTED]
Re: [Haskell-cafe] Code Review: Sudoku solver
Just out of curiosity, speed was not the objective when I wrote my solver, I wanted to avoid guesswork (as much as possible), but in comparison with Cale Gibbard's and Alson Kemp's solvers (which are much more beautifully coded), it turned out that mine is blazingly fast, so are there faster solvers around (in Haskell, in other languages)? if I modify your solver to produce similar output to mine (input/first propagation, solved puzzle, number and list of guesses made), yours takes about a third of the time of mine (solving 36628 17-hint puzzles in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with my solver before I did this comparison!-) like you, I've been trying to remove guesses, and the speed came as a welcome bonus (I'm still using lists all over the place, with lots of not nice adhoc code still remaining; not all propagators are iterated fully yet because I only recently removed a logic bug that slowed down the search instead of speeding it up; ..). so the more interesting bit is that our solvers disagree on which are the most difficult puzzles (requiring the largest number of guesses): df puzzles involving guesses: 5319 largest number of guesses: 10 (#36084), 11 (#22495) cr puzzles involving guesses: 8165 largest number of guesses: 10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811) df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs 5/9 guesses for df's. lots of potential for interesting investigations, though mostly for me!-) cheers, claus Sudoku.hs Description: Binary data
Re: [Haskell-cafe] Code Review: Sudoku solver
On Friday, 7 April 2006 01:50, you wrote: On Apr 6, 2006, at 6:05 PM, Daniel Fischer wrote: I've also written a version using David F. Place's EnumSet instead of [Int], that takes less MUT time, but more GC time, so is slower on the 36,628 test, but faster for a single puzzle. That's a curious result. Did you compile with optimization? It I considered that curious, too. Everything was compiled with -O2 (does -O3 give better results? would adding -optc-On improve performance? I'll try). What makes it even weirder is that the EnumSet version does indeed allocate fewer bytes and performs fewer garbage collections (small difference, though). But I got consistent results: when running on a large number of puzzles, the list version's faster gc'ing led to shorter overall run times. The same held when compiled with -O3 -optc-O3. However, I've been stupid (my excuse is that I was ill this week): the list version spent 46.5% gc'ing and the set version 53.5%, which is not really cricket, so today I used -AxM, x <- [10,16,32,64], all reducing GC time to a reasonable 0.5 - 2%. That, plus a few polishings, reduced the running time to about 16 minutes for EnumSet, a little more for lists. But, lo and behold, I also tried how plain Array fared in comparison to DiffArray and ... reduced the running time to under ten minutes (a little above for the list version), 5% GC time without -AxM, 1.2% with -A8M. And I thought DiffArrays were supposed to be fast! should compile into primitive bit-twiddling operations and do no allocating at all. I'd be curious to see how fast my solver works on Well, since I've wrapped the sets in the Entry datatype, I wouldn't expect that; switching from Poss (singleton k) to Def k costs.
I tried making Board a DiffArray (Int,Int) (Set Int), but then I had the problem that either I lost the information gained by placing/forbidding for those positions where the number of possibilities dropped to one by inference, or had to scan the grid and re-place every now and then, both resulting in poor performance. the 36,628 test. I'm afraid to run my ancient limping powerbook in such a tight loop for that long. It gets too hot! If you'd find it amusing to give it a whirl, I'd love to know the result. I ran your incrsud on the first fifteen 17-hint puzzles, which took over 20s, so I decided against the full 36,628 test. Extrapolation makes me believe it'd take thirteen to fourteen hours. The really big thing is to include the "if there is only one place to put a number in a row/column/cell, then put it there" inference step. Further inference has smaller impact (but the group-inference bought me a 20% speedup, which isn't bad, is it?). But using Array instead of DiffArray gave almost 40%, that's impressive. Attached is the fastest version I have; oddly, compiled with -O2 it's faster than with -O3 -optc-O3 (on my computer), how come?

setUni +RTS -A8M -sstderr
True

99,859,933,904 bytes allocated in the heap
104,713,900 bytes copied during GC
150,260 bytes maximum residency (72 sample(s))
11558 collections in generation 0 ( 6.83s)
72 collections in generation 1 ( 0.16s)
13 Mb total memory in use
INIT  time   0.00s (  0.00s elapsed)
MUT   time 554.68s (568.29s elapsed)
GC    time   6.99s (  7.22s elapsed)
EXIT  time   0.00s (  0.00s elapsed)
Total time 561.67s (575.51s elapsed)
%GC time 1.2% (1.3% elapsed)
Alloc rate 180,031,610 bytes per MUT second
Productivity 98.8% of total user, 96.4% of total elapsed

an average of 0.015s per 17-hint puzzle, cool! Cheers, Daniel David F. Place mailto:[EMAIL PROTECTED] -- In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt. -- Blair P.
Houghton

module SudokuSet where

import Data.Array
import Data.List (unfoldr, intersperse, sortBy, nub)
import Data.Char (isDigit, digitToInt)
import System.IO
import qualified EnumSet as Set (null)
import EnumSet (Set, (\\), size, member, empty, delete, unions,
                intersection, toList, fromList, singleton, findMin, fold)

type Position = (Int, Int)

pool :: Entry
pool = Poss $ fromList [1 .. 9]

ixes :: [Int]
ixes = [0 .. 8]

data Entry = Def !Int         -- ^ definite entry
           | Poss !(Set Int)  -- ^ set of possibilities at a position
           deriving Eq

-- Show instance; the Poss case comes from the original design,
-- where a sudoku puzzle was solved step by step
instance Show Entry where
    show (Def k)  = show k
    show (Poss s) = case size s of
                      0 -> "#"
                      1 -> "*"
                      2 -> ":"
                      _ ->

-- was a DiffArray, however this is MUCH faster
type Board = Array Position Entry

emptyBoard :: Board
emptyBoard = listArray ((0,0),(8,8)) $ replicate 81 pool

-- get the entry in (cellNo, cellInd) form
(?) :: Board ->
Re: [Haskell-cafe] Code Review: Sudoku solver
On Friday, 7 April 2006 17:33, you wrote: Just out of curiosity, speed was not the objective when I wrote my solver, I wanted to avoid guesswork (as much as possible), but in comparison with Cale Gibbard's and Alson Kemp's solvers (which are much more beautifully coded), it turned out that mine is blazingly fast, so are there faster solvers around (in Haskell, in other languages)? if I modify your solver to produce similar output to mine (input/first propagation, solved puzzle, number and list of guesses made), yours takes about a third of the time of mine (solving 36628 17-hint puzzles in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with my solver before I did this comparison!-) Mine's even faster now (at least on my computer, would you care to test it on yours? If you don't want to get EnumSet, just change DiffArray to Array, worked wonders for me), I'll dig into yours tomorrow to see what I can get out of it to improve my algorithm. like you, I've been trying to remove guesses, and the speed came as a welcome bonus (I'm still using lists all over the place, with lots of not nice adhoc code still remaining; not all propagators are iterated fully lists and adhoc code tend to be performance killers, I doubled the speed of mine by de-adhoccing the code (and that although I introduced the speed-killer DiffArray) yet because I only recently removed a logic bug that slowed down the search instead of speeding it up; ..). so the more interesting bit is that our solvers disagree on which are the most difficult puzzles (requiring the largest number of guesses): df puzzles involving guesses: 5319 If that's not a typo, I'm baffled. My original needed to guess in 5309 puzzles, and I can't imagine what inference I could have dropped when cleaning up the code.
largest number of guesses: 10 (#36084), 11 (#22495) cr puzzles involving guesses: 8165 largest number of guesses: 10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811) df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs 5/9 guesses for df's. lots of potential for interesting investigations, though mostly for me!-) ^^ I'm not sure about that :-) cheers, claus Cheers back, Daniel
Re: [Haskell-cafe] Code Review: Sudoku solver
Curry does not have a constraint solver of its own. It simply delegates all constraints to the FD solver of SICStus Prolog. or that of SWI Prolog (which prompted my attempt to install Curry). which was implemented by..hi, again!-) (*) The all_different constraint subsumes the rules that you describe, depending on the consistency algorithm used. FD solvers implement general but efficient algorithms that are much more complex than a few simple rules. I haven't yet been able to get Curry to work on my windows machine, but it seems to do a straightforward translation of these constraints to Prolog + FD solver, close to the one I've attached (an easy way to use external constraint solvers from Haskell;-). the docs you pointed to state that all_different, in declarative view, simply translates into mutual inequalities between the list members, and although I don't fully understand the sources, they seem to confirm that not much more is going on. as I said, it is reasonable that this covers the first group of constraints (every position in each coordinate holds a positive single-digit number, every positive single-digit number occurs at most once in each coordinate). what I can't quite seem to be able to make out, though, is how that would cover the second group of constraints we discussed (every number occurs at least once in each coordinate), in terms of using it for propagation and avoiding search: if I have a line in which no position is uniquely determined (all positions have a current domain of size > 1), but only one of the positions (i) has a domain including the number n, then that group of constraints allows me to assign n to i, without guessing. wouldn't the labelling in the FD solver generate all the possible numbers in the domain of i, up to n, before committing to n on i as the only option that is consistent with the constraints?
while equivalent from a declarative point of view, that would be a lot less efficient, not to mention other propagation techniques that depend on inspecting and modifying the current domains of more than one position without actually instantiating any variables. btw, I thought it would be easy to augment the declarative spec from which all those constraints are generated, to expose this second group of constraints, but since the domains are implicit, I don't see how the "assign a number to each position" and the "assign a position to each number" constraints could communicate about their separate progress in narrowing the search space (other than when either group uniquely determines a number on a position, or by having the prolog code inspect the low-level representation of constraints). if I compare the Prolog version generated by the attached Haskell program with my current Haskell version, both in code interpreters, then the Haskell version is significantly faster. is that only because of details or because of more propagation? cheers, claus (*) closed-world or open-world, but small-world for sure!-) SudokuPL.hs Description: Binary data
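The inference Claus describes (only one position in a line still has n in its domain, so n can be assigned there without guessing) is often called a hidden single. A small sketch over explicit candidate domains (the names and representation are mine, purely for illustration; this is the propagation step an FD solver with plain pairwise-inequality all_different would miss):

```haskell
-- One unit (row, column, or block): each cell carries its remaining
-- candidate values.
type Cell   = Int     -- position within the unit, e.g. 0..8
type Domain = [Int]   -- remaining candidate values for that cell

-- If exactly one cell of the unit still has v in its domain, that
-- cell must be v: collapse its domain to [v]. Other cells unchanged.
hiddenSingles :: [(Cell, Domain)] -> [(Cell, Domain)]
hiddenSingles unit = map fix unit
  where
    holders v = [ j | (j, d) <- unit, v `elem` d ]   -- cells that can take v
    fix (i, dom) =
      case [ v | v <- dom, holders v == [i] ] of
        (v:_) -> (i, [v])   -- i is the only possible home for v
        []    -> (i, dom)   -- no hidden single at this cell
```

For example, in a unit `[(0,[1,2]),(1,[2,3]),(2,[2,3])]` only cell 0 can still take 1, so its domain collapses to `[1]` even though no cell was uniquely determined beforehand; labelling alone would have to try 1 and 2 at cell 0.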
Re: [Haskell-cafe] Code Review: Sudoku solver
On Thu, 6 Apr 2006, Claus Reinke wrote: Curry does not have a constraint solver of its own. It simply delegates all constraints to the FD solver of SICStus Prolog. or that of SWI Prolog (which prompted my attempt to install Curry), which was implemented by.. hi, again!-) (*) The SWI-Prolog FD library is just a prototype implementation... looking for someone to replace it with a state-of-the-art implementation. The all_different constraint subsumes the rules that you describe, depending on the consistency algorithm used. FD solvers implement general but efficient algorithms that are much more complex than a few simple rules. I haven't yet been able to get Curry to work on my windows machine, but it seems to do a straightforward translation of these constraints to Prolog + FD solver, close to the one I've attached (an easy way to use external constraint solvers from Haskell;-). :) the docs you pointed to state that all_different, in declarative view, simply translates into mutual inequalities between the list members, and although I don't fully understand the sources, they seem to confirm that not much more is going on. The SWI-Prolog prototype library does nothing more than the declarative meaning, that's why it's a prototype. State-of-the-art all_different implementations are a lot more complicated (often based on graph algorithms), to do very strong propagation. Here is a paper about solving Sudoku with constraint (logic) programming, comparing a number of all_different algorithms and additional tricks: http://www.computational-logic.org/iccl/master/lectures/winter05/fcp/fcp/sudoku.pdf Cheers, Tom -- Tom Schrijvers Department of Computer Science K.U. Leuven Celestijnenlaan 200A B-3001 Heverlee Belgium tel: +32 16 327544 e-mail: [EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
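The prototype-level behaviour Tom describes, all_different read as mutual inequalities with propagation only when a domain collapses to a single value, can be sketched in a few lines of Haskell. This is an illustration of that declarative reading, not the SICStus or SWI-Prolog code; the representation of domains as candidate lists is my own assumption.

```haskell
import Data.List (delete)

-- One domain of remaining candidates per variable.
type Domain = [Int]

-- The declarative reading of all_different: mutual inequality.
-- Propagation here is only the weak, prototype-level kind: once a
-- variable's domain shrinks to a single value, that value is deleted
-- from every other variable's domain, repeated to a fixed point.
allDifferent :: [Domain] -> [Domain]
allDifferent ds
  | ds' == ds = ds
  | otherwise = allDifferent ds'
  where
    ds' = [ foldr delete d [ v | (j, [v]) <- zip [0 ..] ds, j /= i ]
          | (i, d) <- zip [0 :: Int ..] ds ]
```

For example, allDifferent [[3],[1,2,3],[3,4]] narrows to [[3],[1,2],[4]]; an empty domain in the result signals inconsistency. A state-of-the-art all_different (graph-based, as in the paper above) would also prune cases this weak rule misses.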
Re: [Haskell-cafe] Code Review: Sudoku solver
On Wednesday, 5 April 2006, 15:09, Chris Kuklewicz wrote: Henning Thielemann wrote: On Mon, 3 Apr 2006, Jared Updike wrote: or ambiguously) with your Sudoku solver? A rough measure of the difficulty of the unsolved puzzle could be how long the solver took to solve it (number of steps) (and the number of possible solutions)? Are puzzles with multiple solutions usually considered harder or easier? Are these considered proper puzzles? It's an interesting test to run a Sudoku solver on an empty array. :-) I am cleaning up my old (aka inexperienced) solver based on Knuth's dancing links to put on the wiki. The code is very different from most Haskell solutions, since it revolves around a mutable data structure (which is not an MArray). It solves an empty array in 81 steps with no backtracking. It will produce a list of all the solutions of an empty board quite efficiently. Cleaning up my logic-based solver will take longer. I've cleaned up my solver, removed a lot of redundant inference steps and made the board a DiffArray (is that really faster?). Now it completely solves (following all guesses) the 36,628 17-hint puzzles in about 32 minutes (1909 secs). It solves an empty board in 81 steps without false guesses, but still takes over four seconds (that's the price for excessive inference). I've also written a version using David F. Place's EnumSet instead of [Int], which takes less MUT time but more GC time, so it is slower on the 36,628 test, but faster for a single puzzle. If anybody feels like improving the code (especially finding better names for the functions) and/or placing it on the wiki, I'll be honoured. Just out of curiosity: speed was not the objective when I wrote my solver, I wanted to avoid guesswork (as much as possible), but in comparison with Cale Gibbard's and Alson Kemp's solvers (which are much more beautifully coded), it turned out that mine is blazingly fast, so are there faster solvers around (in Haskell, in other languages)? 
Cheers, Daniel -- In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt. -- Blair P. Houghton

module Sudoku where

import Data.Array.Diff
import Data.List
import Data.Char (isSpace, digitToInt)
import System.IO

type Position = (Int, Int)

pool :: Entry
pool = Poss [1 .. 9]

ixes :: [Int]
ixes = [0 .. 8]

data Entry = Def !Int | Poss ![Int] deriving Eq

instance Show Entry where
    show (Def k)  = show k
    show (Poss s) = case s of
                      []    -> "#"
                      [_]   -> "*"
                      [_,_] -> ":"
                      _     ->

type Board = DiffArray Position Entry

emptyBoard :: Board
emptyBoard = listArray ((0,0),(8,8)) $ replicate 81 pool

(?) :: Board -> Position -> Entry
b ? p = b ! toGridPos p

failed :: Board -> Bool
failed = any isImp . elems

solved :: Board -> Bool
solved = all isDef . elems

unique :: Board -> Bool
unique b = case solve b of
             [_] -> True
             _   -> False

--
-- Solving a Puzzle
--

place :: Board -> Position -> Int -> Board
place b p@(r,c) k = case b!p of
    Def n  -> if k == n then b else b // [(p, Poss [])]
    Poss s -> if k `elem` s then b' else b // [(p, Poss [])]
  where
    b1 = b // [(p, Def k)]
    ri = [(r,i) | i <- ixes, i /= c]
    ci = [(i,c) | i <- ixes, i /= r]
    cn = cellNo p
    good (x,y) = x /= r && y /= c
    bi = filter good [toGridPos (cn,i) | i <- ixes]
    b' = foldl (forbid k) b1 $ ri ++ ci ++ bi

restrict :: Board -> (Position, [Int]) -> Board
restrict b (p,[k]) = place b p k
restrict b (p,s)   = case b!p of
    Poss t -> b // [(p, Poss $ filter (`elem` s) t)]
    Def k  -> if k `elem` s then b else b // [(p, Poss [])]

forbid :: Int -> Board -> Position -> Board
forbid k b p = case b!p of
    Poss s | k `elem` s -> b // [(p, Poss $ delete k s)]
    _                   -> b

certs :: Board -> [(Position, Int)]
certs b = [(p,k) | (p, Poss [k]) <- assocs b]

defCerts :: Board -> (Bool, Board)
defCerts b = case certs b of
    [] -> (False, b)
    cs -> let b1      = foldl (uncurry . place) b cs
              (_, b2) = defCerts b1
          in (True, b2)

posCell :: Int -> Board -> Int -> Board
posCell cn b k =
    let ps = [(i, mkSt e) | i <- ixes, let e = b ? (cn,i), not (isDef e)]
        fp = findPos k ps
        rs = nub [i `quot` 3 | i <- fp]
        cs = nub [i `rem` 3 | i <- fp]
        rc = filter (/= cn) [(cn `quot` 3) * 3 + i | i <- [0 .. 2]]
        cc = filter (/= cn) [3 * i + (cn `rem` 3) | i <- [0 .. 2]]
        ls = case (rs,cs) of
Re: [Haskell-cafe] Code Review: Sudoku solver
On Apr 6, 2006, at 6:05 PM, Daniel Fischer wrote: I've also written a version using David F. Place's EnumSet instead of [Int], that takes less MUT time, but more GC time, so is slower on the 36,628 test, but faster for a single puzzle. That's a curious result. Did you compile with optimization? It should compile into primitive bit-twiddling operations and do no allocating at all. I'd be curious to see how fast my solver works on the 36,628 test. I'm afraid to run my ancient limping PowerBook in such a tight loop for that long. It gets too hot! If you'd find it amusing to give it a whirl, I'd love to know the result. incrsud.hs Description: Binary data David F. Place mailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
On Mon, 3 Apr 2006, Jared Updike wrote: or ambiguously) with your Sudoku solver? A rough measure of the difficulty of the unsolved puzzle could be how long the solver took to solve it (number of steps) (and the number of possible solutions)? Are puzzles with multiple solutions usually considered harder or easier? Are these considered proper puzzles? It's an interesting test to run a Sudoku solver on an empty array. :-) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
Henning Thielemann wrote: On Mon, 3 Apr 2006, Jared Updike wrote: or ambiguously) with your Sudoku solver? A rough measure of the difficulty of the unsolved puzzle could be how long the solver took to solve it (number of steps) (and the number of possible solutions)? Are puzzles with multiple solutions usually considered harder or easier? Are these considered proper puzzles? It's an interesting test to run a Sudoku solver on an empty array. :-) I am cleaning up my old (aka inexperienced) solver based on Knuth's dancing links to put on the wiki. The code is very different from most Haskell solutions, since it revolves around a mutable data structure (which is not an MArray). It solves an empty array in 81 steps with no backtracking. It will produce a list of all the solutions of an empty board quite efficiently. Cleaning up my logic-based solver will take longer. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
On 4/3/06, Donald Bruce Stewart [EMAIL PROTECTED] wrote: It would also be nice to see some example sudoku solvers posted on an `Idioms' page on haskell.org: http://www.haskell.org/haskellwiki/Category:Idioms someone could just create a new page 'Sudoku' and add the phrase [[Category:Idioms]] to the text, and it will be indexed. Done. Lives at http://www.haskell.org/haskellwiki/Sudoku Seeing 4 or 5 solutions would be a useful tutorial, I'd imagine. it seems that sudoku solver may be a good candidate for nofib suite / language comparison shootout Anyone with killer solvers (Chris?) can add them to the wiki. Maybe the fastest can be used in a new Shootout benchmark. Jared. -- http://www.updike.org/~jared/ reverse )-: ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) probably the same you get from writing programs?-) figuring out the rules, not getting lost in complexity, making something difficult work.. They are probably asking the same question: why take hours to write a program to do it when with my mad sudoku solving skills I can solve it in X seconds? My roommate is like this. if we just throw raw computing power at the problem (in-place array updates, bitvectors, multiprocessors, ..), wouldn't they be justified? but as soon as we try to take their skills and encode them in our programs, things become more interesting (do computers really play chess?-). I would say because they have chosen the wrong language for this problem :-) Solving Sudoku is a typical finite domain constraint problem. Thus, a language with constraint solving facilities like Curry (a combination of Haskell and constraint logic programming) is much better suited. Actually, I wrote a Sudoku solver in Curry and the actual code for the solver is 10 lines of code which is compact and well readable (if you are familiar with Haskell), see http://www.informatik.uni-kiel.de/~curry/examples/CLP/sudoku.curry interesting. I haven't yet got round to installing Curry and trying this, but I assume that this declarative specification, under a finite domain constraint solver, is not just an effective implementation, but an efficient one, right? if yes, it is really impressive how constraint propagation has managed to make essentially the same code that, as a mere functional logic program, would be effective, but hardly useable, so much more efficient, just by imposing a different evaluation strategy on it. and the factoring into constraint generation and constraint propagation under some strategy is nice as well. 
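For contrast with the constraint-propagated Curry version, here is roughly what the declarative specification buys you *without* propagation: a generate-and-test solver in the Haskell list monad (my own sketch, not Michael's Curry code). It is a faithful reading of the rules, and hopeless in practice on a sparsely hinted 9x9 grid, which is exactly the gap the FD solver's propagation closes.

```haskell
import Data.List (transpose, nub)

-- Naive declarative reading: fill every blank with any digit,
-- then test the all-different constraints.  Correct, but the search
-- space is 9^(number of blanks), so only propagation makes the
-- equivalent FD formulation usable.
solveNaive :: [[Maybe Int]] -> [[[Int]]]
solveNaive grid =
    [ rows | rows <- mapM (mapM fill) grid, ok rows ]
  where
    fill (Just n) = [n]             -- hint: fixed
    fill Nothing  = [1 .. 9]        -- blank: any digit
    allDiff xs = length (nub xs) == length xs
    blocks rs = [ concat [ take 3 (drop c row) | row <- take 3 (drop r rs) ]
                | r <- [0, 3, 6], c <- [0, 3, 6] ]
    ok rows = all allDiff rows
           && all allDiff (transpose rows)
           && all allDiff (blocks rows)
```

It terminates quickly only on nearly complete grids; with, say, 17 hints the candidate space is astronomically large.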
my own Sudoku solver (yes, me too - see attached, but only after you've written your own!-) uses simple hand-coded constraint propagation to limit the space for exhaustive search - some puzzles are solved by constraint propagation only, and even where guesses are used, each guess is immediately followed by propagation, to weed out useless branches early, and to deduce consequences of each guess before asking for the next one. the good old game: start with generate & test, then move the tests forward, into the generator. I've only coded the two most useful groups of constraints (when there's only a single number left for a position, or when there is only a single position left for a number). there are other deductions one does in by-hand solving, and I'm not an experienced sudoku solver myself, so I don't even know more than a few such rules yet, but these particular two seem sufficient to get acceptable performance even under ghci/hugs, so why do more?-) (*) [by acceptable, I mean that my sequence of 5 test puzzles is solved in less than 20 seconds on a 2GHz Pentium M; no idea how that compares to the most efficient solvers] since I haven't factored out the constraint propagation into a general module, the core of my code is a lot longer than the Curry version (about 60 additional lines, though I'm sure one could reduce that;-). the only negative point I can find about the Curry example is that it isn't obvious what tricks the FD-constraint solver is using (i.e., it would be nice to have a concise language for expressing propagation techniques, and then explicitly to apply a strategy to the declarative specification, instead of the implicit, fixed strategy of the built-in solver). 
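The two rule groups Claus names can be sketched over a single row, column, or block, representing each position by its list of remaining candidates. This is a hypothetical representation for illustration; his attached Sudoku.hs is organized differently.

```haskell
import Data.List ((\\))

type Cell = [Int]  -- remaining candidates for one position

-- One propagation pass over a group of nine cells:
-- rule 1: a decided cell's value disappears from the other cells;
-- rule 2: a value that fits in exactly one cell is placed there.
step :: [Cell] -> [Cell]
step cells = [ place i c | (i, c) <- zip [0 ..] cells ]
  where
    singles  = [ v | [v] <- cells ]
    forced i = [ v | v <- [1 .. 9]
               , [i] == [ j | (j, c) <- zip [0 ..] cells, v `elem` c ] ]
    place i c
      | [v] <- forced i = [v]           -- rule 2
      | length c > 1    = c \\ singles  -- rule 1
      | otherwise       = c

-- Iterate to a fixed point; only afterwards would a guess be needed.
propagate :: [Cell] -> [Cell]
propagate cs = let cs' = step cs in if cs' == cs then cs else propagate cs'
```

Note that rule 2 fires even when the target cell still has several candidates: if 7 appears in only one cell's candidate list, that cell becomes [7] without any guessing, which is precisely the deduction Claus suspects a plain all_different encoding misses.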
for instance, I assume that Michael's declarative specification implicitly allows the built-in solver to use the first group of constraints I mentioned (only a single possible number left for a position), but does it use the second group (only a single position left to place a number in a particular line/column/block)? my guess is that no, it doesn't, although it wouldn't be difficult to change that - just have the declarative specification express the dual puzzle as well (assigning positions to numbers instead of numbers to positions). is this correct, or is that dual reading already implied? perhaps Haskell should have Control.Constraint.* libraries for generalized constraint propagation (and presumably for constraint handling rules as well, as they are so popular nowadays for specifying Haskell's type classes)? cheers, claus (*) actually, that was a bit disappointing!-( I was hoping for some fun while trying to encode more and more clever rules, but not much of that seems to be required. Sudoku.hs Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
Claus Reinke wrote: It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) probably the same you get from writing programs?-) figuring out the rules, not getting lost in complexity, making something difficult work.. From a human standpoint, there are some puzzles that are much harder than others. This also applies to the few programs that I have seen. It is this variety of complexity that makes it interesting. They are probably asking the same question: why take hours to write a program to do it when with my mad sudoku solving skills I can solve it in X seconds? My roommate is like this. if we just throw raw computing power at the problem (in-place array updates, bitvectors, multiprocessors, ..), wouldn't they be justified? but as soon as we try to take their skills and encode them in our programs, things become more interesting (do computers really play chess?-). You can go get the 36,628 distinct minimal puzzles from http://www.csse.uwa.edu.au/~gordon/sudokumin.php that have only 17 clues. Then you can run all of them through your program to locate the most evil ones, and use them on your associates. :) This also gave me a way to statistically measure if a new deductive step made much of a difference (or if it made no difference). Histograms and gnuplot helped. [snip Curry language example] my own Sudoku solver (yes, me too - see attached, but only after you've written your own!-) uses simple hand-coded constraint propagation to limit the space for exhaustive search - some puzzles are solved by constraint propagation only, and even where guesses are used, each guess is immediately followed by propagation, to weed out useless branches early, and to deduce consequences of each guess before asking for the next one. the good old game: start with generate & test, then move the tests forward, into the generator. 
I've only coded the two most useful groups of constraints (when there's only a single number left for a position, or when there is only a single position left for a number). there are other deductions one does in by-hand solving, and I'm not an experienced sudoku solver myself, so I don't even know more than a few such rules yet, but these particular two seem sufficient to get acceptable performance even under ghci/hugs, so why do more?-) (*) I have two versions of a solver. The first is a re-write of GDANCE by Knuth to solve Sudoku efficiently as a binary cover problem. (see http://www-cs-faculty.stanford.edu/~knuth/programs.html ) This uses the Dancing Links algorithm implemented with STRef's and is very fast. The second uses a different encoding to look for clever deductions. This alone solves some easy problems and comes very close before getting stuck on most. There are a few, though, that start with 17 hints and only discover one or two more by logic before getting stuck. These are the most evil of all. You might be interested in the deductions described at http://www.sudokusolver.co.uk/ [by acceptable, I mean that my sequence of 5 test puzzles is solved in less than 20 seconds on a 2GHz Pentium M; no idea how that compares to the most efficient solvers] I could run ~20,000 puzzles in a couple of hours. (The collection was smaller then). perhaps Haskell should have Control.Constraint.* libraries for generalized constraint propagation (and presumably for constraint handling rules as well, as they are so popular nowadays for specifying Haskell's type classes)? Did you see the monad at http://haskell.org/hawiki/SudokuSolver ? Perhaps you could generalize that. cheers, claus (*) actually, that was a bit disappointing!-( I was hoping for some fun while trying to encode more and more clever rules, but not much of that seems to be required. You need more than 5 examples. The truly evil puzzles are rarer than that. 
Go get the set of minimal puzzles and see how far your logic takes you. -- Chris ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
Chris wrote: You need more than 5 examples. The truly evil puzzles are rarer than that. Go get the set of minimal puzzles and see how far your logic takes you. Chris elucidated some of my questions before I finished writing my email... Claus wrote: (*) actually, that was a bit disappointing!-( How much harder is the problem of generating (hard/medium/easy) (solvable) Sudoku puzzles? Are all puzzles solvable (that don't break the rules at any point)? I imagine a simple way is to start with a correctly saturated grid of numbers and then start randomly shooting holes in it and testing if it is still solvable (either unambiguously or ambiguously) with your Sudoku solver? A rough measure of the difficulty of the unsolved puzzle could be how long the solver took to solve it (number of steps) (and the number of possible solutions)? Are puzzles with multiple solutions usually considered harder or easier? Are these considered proper puzzles? Is this a more interesting problem to try to solve (generating) rather than solving puzzles? I haven't investigated it much but I thought about it when I was writing YASS (Yet Another Sudoku Solver) of my own. What makes a Sudoku puzzle fiendish? Just the amount of missing information, the amount of lookahead required? Jared. P.S. Another interesting problem could be trying other number arrangements besides 9x9, e.g. hexadecimal puzzles... I wrote my solver to handle these but I never saw other than 9x9 puzzles to try it on (hence the idea of generating puzzles)... Is that because most people want puzzles to solve by hand instead of computer? -- http://www.updike.org/~jared/ reverse )-: ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
since I haven't factored out the constraint propagation into a general module, the core of my code is a lot longer than the Curry version (about 60 additional lines, though I'm sure one could reduce that;-). the only negative point I can find about the Curry example is that it isn't obvious what tricks the FD-constraint solver is using Curry does not have a constraint solver of its own. It simply delegates all constraints to the FD solver of SICStus Prolog. The all_different constraint subsumes the rules that you describe, depending on the consistency algorithm used. FD solvers implement general but efficient algorithms that are much more complex than a few simple rules. See http://www.sics.se/sicstus/docs/latest/html/sicstus/Combinatorial-Constraints.html#Combinatorial%20Constraints for the SICStus FD all_different documentation. (i.e., it would be nice to have a concise language for expressing propagation techniques, and then explicitly to apply a strategy to the declarative specification, instead of the implicit, fixed strategy of the built-in solver). CHR is meant as a high-level language for expressing propagation. It (currently) does not include a strategy language, though. Cheers, Tom -- Tom Schrijvers Department of Computer Science K.U. Leuven Celestijnenlaan 200A B-3001 Heverlee Belgium tel: +32 16 327544 e-mail: [EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
Jared Updike wrote: Chris wrote: You need more than 5 examples. The truly evil puzzles are rarer than that. Go get the set of minimal puzzles and see how far your logic takes you. Chris elucidated some of my questions before I finished writing my email... Claus wrote: (*) actually, that was a bit disappointing!-( How much harder is the problem of generating (hard/medium/easy) (solvable) Sudoku puzzles? I pursued this line of attack as well. I could not generate the hardest puzzles, though I was working without studying others' approaches. Are all puzzles solvable (that don't break the rules at any point)? All well-formed problems have exactly one solution. It is always solvable by, at worst, brute force. I imagine a simple way is to start with a correctly saturated grid of numbers and then start randomly shooting holes in it and testing if it is still solvable (either unambiguously or ambiguously) with your Sudoku solver? That method works poorly. A rough measure of the difficulty of the unsolved puzzle could be how long the solver took to solve it (number of steps) (and the number of possible solutions)? Are puzzles with multiple solutions usually considered harder or easier? Are these considered proper puzzles? Proper puzzles have exactly one solution, accept no substitute. A problem is minimal (a partial order) if removing any single hint creates additional solutions. So the goal is to build a minimal problem with as few hints as possible. The smallest number of hints achieved to date is 17, but there is no proof that a 16-hint puzzle is impossible. Is this a more interesting problem to try to solve (generating) rather than solving puzzles? I haven't investigated it much but I thought about it when I was writing YASS (Yet Another Sudoku Solver) of my own. What makes a Sudoku puzzle fiendish? Just the amount of missing information, the amount of lookahead required? A measure of difficulty is more subjective. 
Different programs will make luckier guesses on any specific problem. So I consider how many blanks are left when the pure logic program gets stuck to be my favorite difficulty metric. These were the worst from the list of 17's (left to right, top to bottom):

- - - - -
6..8..2.1...71.25...3..442.1...3..7..6.5.
...8...17.6.35...26..4..7...21.4...7.3...8..1
.9..25..36.3.6...4..81...7.8..9..23...67.
1...2..6.7.58.15.3.474..7..2.5...6..1
5...8.2..74..2.5...678..4..6.7...1...5.3.4...

My puzzle generator worked like this: Start with an empty grid and randomly add allowed hints until the number of solutions falls to 1. Now try and remove hints while keeping the number of solutions exactly 1. The performance of this was erratic, but occasionally produced puzzles with as few as 20 to 22 hints. There were a few fairly evil spawns of my generator:

.2.|...|58.
6..|9..|..1
3..|.7.|6..
---+---+---
...|.65|...
...|...|1..
...|.32|.97
---+---+---
...|...|.38
.59|...|...
..4|...|2..

.5.|1..|...
6..|7..|...
4.8|.63|...
---+---+---
.1.|...|98.
.6.|...|...
97.|.28|...
---+---+---
...|..5|..1
...|93.|.4.
...|...|2.7

1.6|...|..5
...|.7.|.21
3..|...|.8.
---+---+---
...|8..|1.6
...|.1.|9..
...|..9|.7.
---+---+---
...|4.6|...
..2|..3|...
857|...|...

I like that they have two 3x3 sections which are without any hints. Jared. P.S. Another interesting problem could be trying other number arrangements besides 9x9, e.g. hexadecimal puzzles... I wrote my solver to handle these but I never saw other than 9x9 puzzles to try it on (hence the idea of generating puzzles)... Is that because most people want puzzles to solve by hand instead of computer? Yes -- http://www.updike.org/~jared/ reverse )-: ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
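Chris's two-phase generator (add random hints until unique, then remove hints while uniqueness survives) can be sketched as below. This is my own abstraction, not his code: countSolutions is an assumed solver that counts solutions of a hint set (capping the count at 2 suffices, since only 0, 1, or "many" matters), and randomness is left to the order of the supplied candidate list.

```haskell
type Hint = ((Int, Int), Int)  -- (position, digit)

-- Phase 1: add hints, taken in some (random) order, until exactly one
-- solution remains; hints that would make the grid unsolvable are skipped.
addUntilUnique :: ([Hint] -> Int) -> [Hint] -> [Hint] -> [Hint]
addUntilUnique count candidates hints
  | count hints == 1       = hints
  | (h:rest) <- candidates = if count (h : hints) >= 1
                               then addUntilUnique count rest (h : hints)
                               else addUntilUnique count rest hints
  | otherwise              = hints

-- Phase 2: drop each hint whose removal keeps the solution unique,
-- leaving a minimal puzzle in Chris's sense.
minimize :: ([Hint] -> Int) -> [Hint] -> [Hint]
minimize count = go []
  where
    go kept []       = kept
    go kept (h:rest)
      | count (kept ++ rest) == 1 = go kept rest        -- h was redundant
      | otherwise                 = go (h : kept) rest  -- h is needed
```

The erratic performance he reports comes from phase 1's random order: which hints get locked in early largely determines how far phase 2 can shrink the result.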
Re[2]: [Haskell-cafe] Code Review: Sudoku solver
Hello, it seems that a sudoku solver may be a good candidate for the nofib suite / language comparison shootout. -- Best regards, Bulat mailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
bulat.ziganshin: Hello it seems that sudoku solver may be a good candidate for nofib suite / language comparison shootout It would also be nice to see some example sudoku solvers posted on an `Idioms' page on haskell.org: http://www.haskell.org/haskellwiki/Category:Idioms someone could just create a new page 'Sudoku' and add the phrase [[Category:Idioms]] to the text, and it will be indexed. Seeing 4 or 5 solutions would be a useful tutorial, I'd imagine. -- Don ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
On Monday, 3 April 2006, 18:52, Chris Kuklewicz wrote: Claus Reinke wrote: It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) probably the same you get from writing programs?-) figuring out the rules, not getting lost in complexity, making something difficult work.. Exactly, and I wrote a solver with a relatively elaborate strategy last year (it was written incrementally, so the code is horrible, I always wanted to rewrite it properly but never got to do it, hence I will not post it unless asked to), to have both kinds of pleasure: figure out a strategy and teach it to my computer. From a human standpoint, there are some puzzles that are much harder than others. This also applies to the few programs that I have seen. It is this variety of complexity that makes it interesting. They are probably asking the same question: why take hours to write a program to do it when with my mad sudoku solving skills I can solve it in X seconds? My roommate is like this. if we just throw raw computing power at the problem (in-place array updates, bitvectors, multiprocessors, ..), wouldn't they be justified? but as soon as we try to take their skills and encode them in our programs, things become more interesting (do computers really play chess?-). You can go get the 36,628 distinct minimal puzzles from http://www.csse.uwa.edu.au/~gordon/sudokumin.php that have only 17 clues. Then you can run all of them through your program to locate the most evil ones, and use them on your associates. 
:) Well, I downloaded them and let my solver determine whether all of them have a unique solution (they do); that took 76 min 14.260 sec user time, hence roughly 0.125 secs per puzzle, so I dare say there are no evil ones among them (However, Alson Kemp's solver from the HaWiki page -- which, I don't know why, is much faster than Cale Gibbard's -- took over 20 minutes to solve the second one:

0 0 0 0 0 0 0 1 0
4 0 0 0 0 0 0 0 0
0 2 0 0 0 0 0 0 0
0 0 0 0 5 0 6 0 4
0 0 8 0 0 0 3 0 0
0 0 1 0 9 0 0 0 0
3 0 0 4 0 0 2 0 0
0 5 0 1 0 0 0 0 0
0 0 0 8 0 7 0 0 0

so these puzzles may be evil after all). My solver needed to guess on 5,309 of the 36,628 17-hint puzzles, which I find a bit disappointing -- the big disappointment was when neither I nor my solver were able to solve the example from the HaWiki page without guessing :-( -- does anybody know whether in a uniquely solvable sudoku puzzle guessing is never necessary, i.e. by proper reasoning ('if I put 6 here, then there must be a 3 and thus the 4 must go there...' is what I call guessing) there is always at least one entry determined? This also gave me a way to statistically measure if a new deductive step made much of a difference (or if it made no difference). Histograms and gnuplot helped. [snip Curry language example] my own Sudoku solver (yes, me too - see attached, but only after you've written your own!-) uses simple hand-coded constraint propagation to limit the space for exhaustive search - some puzzles are solved by constraint propagation only, and even where guesses are used, each guess is immediately followed by propagation, to weed out useless branches early, and to deduce consequences of each guess before asking for the next one. the good old game: start with generate & test, then move the tests forward, into the generator. I've only coded the two most useful groups of constraints (when there's only a single number left for a position, or when there is only a single position left for a number). 
there are other deductions one does in by-hand solving, and I'm not an experienced sudoku solver myself, so I don't even know more than a few such rules yet, but these particular two seem sufficient to get acceptable performance even under ghci/hugs, so why do more?-) (*) I have two versions of a solver. The first is a re-write of GDANCE by Knuth to solve Sudoku efficiently as a binary cover problem. (see http://www-cs-faculty.stanford.edu/~knuth/programs.html ) This uses the Dancing Links algorithm implemented with STRef's and is very fast. The second uses a different encoding to look for clever deductions. This alone solves some easy problems and comes very close before getting stuck on most. There are a few, though, that start with 17 hints and only discover one or two more by logic before getting stuck. These are the most evil of all. You might be interested in the deductions described at http://www.sudokusolver.co.uk/ [by acceptable, I mean that my sequence of 5 test puzzles is solved in less than 20 seconds on a 2GHz Pentium M; no idea how that compares to the most efficient solvers] I could run ~20,000 puzzles in a couple of hours. (The collection was smaller then). As stated above, I ran the 36,628 in 76 minutes on a 1200MHz Duron. But I must confess that my solver takes about 25 secs for the empty puzzle, guessing is
Re: [Haskell-cafe] Code Review: Sudoku solver
Jared Updike wrote: On 3/22/06, David F. Place [EMAIL PROTECTED] wrote: ... It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) They are probably asking the same question: why take hours to write a program to do it when with my mad sudoku solving skills I can solve it in X seconds? My roommate is like this. I would say because they have chosen the wrong language for this problem :-) Solving Sudoku is a typical finite domain constraint problem. Thus, a language with constraint solving facilities like Curry (a combination of Haskell and constraint logic programming) is much better suited. Actually, I wrote a Sudoku solver in Curry and the actual code for the solver is 10 lines of code which is compact and well readable (if you are familiar with Haskell), see http://www.informatik.uni-kiel.de/~curry/examples/CLP/sudoku.curry Regards, Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
Thanks for your helpful suggestions. I took them to heart and incorporated many of them in a new version. sudoku.hs Description: Binary data David F. Place mailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Code Review: Sudoku solver
Hi All, I really appreciate all the help I received when I asked you to critique my PrefixMap module a few weeks ago. I think I am making good progress in correcting the lisp in my Haskell programming. I'll be very grateful to anyone who can take a glance at the attached short program and say if any unidiomatic usages pop out. It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) Cheers, David sudoku.hs Description: Binary data David F. Place mailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Code Review: Sudoku solver
On 3/22/06, David F. Place [EMAIL PROTECTED] wrote: Hi All, I really appreciate all the help I received when I asked you to critique my PrefixMap module a few weeks ago. I think I am making good progress in correcting the lisp in my Haskell programming. I'll be very grateful to anyone who can take a glance at the attached short program and say if any unidiomatic usages pop out. Try: cellIndex r c = 3*(r `div` 3) + c `div` 3. It's much, much shorter and should produce the same results. It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) They are probably asking the same question: why take hours to write a program to do it when with my mad sudoku solving skills I can solve it in X seconds? My roommate is like this. Cheers, Jared. -- http://www.updike.org/~jared/ reverse )-: ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
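Jared's one-liner can be checked exhaustively over all 81 cells; the sanity-check harness around it is mine, not part of the thread:

```haskell
-- Jared's suggested one-liner: map a (row, column) pair, both 0-based,
-- to the index of its 3x3 box.
cellIndex :: Int -> Int -> Int
cellIndex r c = 3 * (r `div` 3) + c `div` 3

-- Sanity check: each of the 9 box indices should cover exactly 9 cells.
boxSizes :: [Int]
boxSizes = [ length [ () | r <- [0..8], c <- [0..8], cellIndex r c == b ]
           | b <- [0..8] ]

main :: IO ()
main = print (boxSizes == replicate 9 9)  -- True
```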
Re: [Haskell-cafe] Code Review: Sudoku solver
On Mar 22, 2006, at 2:16 PM, David F. Place wrote: Hi All, I really appreciate all the help I received when I asked you to critique my PrefixMap module a few weeks ago. I think I am making good progress in correcting the lisp in my Haskell programming. I'll be very grateful to anyone who can take a glance at the attached short program and say if any unidiomatic usages pop out. sudoku :: Sudoku -> IO () sudoku s = ((mapM_ putStrLn) . (check s) . (take 1) . solveSudoku) s check puzzle [] = [showSudoku puzzle, "No solutions."] check puzzle [solution] | solution `solves` puzzle = ["Puzzle:", showSudoku puzzle, "Solution:", showSudoku solution] | otherwise = ["Program Error. Incorrect Solution!"] That '(check s) . (take 1)' bit looks a little odd to me. I would simply have written 'check' to match like: check puzzle [] = failure case check puzzle (solution : _ ) = success case Also, I like to put off doing IO as long as possible, so I'd probably have 'sudoku' return a String or [String] and move the putStr into main. It's an easy way to make your code more reusable. Also, your parser is pretty hackish (but I suspect you knew that already). FYI, solveSudoku has a bug; if you enter an invalid puzzle it will return non-solutions. It solves sudoku puzzles. (What pleasure do people get by doing these in their heads?!?) I have no idea. Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
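Robert's suggested pattern-matching shape for `check` can be made concrete as follows; note that `Sudoku` and `showSudoku` here are simple stand-ins for the definitions in David's attached sudoku.hs, since the attachment isn't in the archive:

```haskell
-- Stand-in types; the real ones live in David's sudoku.hs.
type Sudoku = [[Int]]

showSudoku :: Sudoku -> String
showSudoku = unlines . map (concatMap show)

-- Matching on the solution list directly replaces the
-- '(check s) . (take 1)' pipeline: the (solution : _) pattern
-- only forces the first solution, so laziness gives us take 1 for free.
check :: Sudoku -> [Sudoku] -> [String]
check puzzle []             = [showSudoku puzzle, "No solutions."]
check puzzle (solution : _) = ["Puzzle:", showSudoku puzzle, "Solution:", showSudoku solution]

main :: IO ()
main = mapM_ putStrLn (check [[1]] [])
```

Returning `[String]` and leaving the `putStrLn` to `main` also follows Robert's advice to push IO to the edges of the program.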
Re: [Haskell-cafe] Code Review: Sudoku solver
Robert Dockins wrote: On Mar 22, 2006, at 2:16 PM, David F. Place wrote: Hi All, I really appreciate all the help I received when I asked you to critique my PrefixMap module a few weeks ago. I think I am making good progress in correcting the lisp in my Haskell programming. The style of the code and choice of names is good. I'll be very grateful to anyone who can take a glance at the attached short program and say if any unidiomatic usages pop out. That '(check s) . (take 1)' bit looks a little odd to me. I would simply have written 'check' to match like: check puzzle [] = failure case check puzzle (solution : _ ) = success case A simpler version of replace: replace :: Sudoku -> Int -> Int -> Int -> Sudoku replace s r c x = let (above,row:below) = splitAt r s (left,_:right) = splitAt c row in above++((left++(x:right)):below) And a simpler version of toList in keeping with your style: toList :: Set -> [Int] toList i = concatMap f [9,8..1] where f b = if testBit i b then [b] else [] (The above is also a little less prone to off-by-one errors since testBit is the opposite of setBit) Also, I like to put off doing IO as long as possible, so I'd probably have 'sodoku' return a String or [String] and move the putStr into main. Its an easy way to make your code more reusable. Also, your parser is pretty hackish (but I suspect you knew that already). A minimal change to the parser that still does no sanity checking but may be a little more robust is import Data.Char readSudoku :: [String] -> String -> Sudoku readSudoku [line] input = takeBy 9 $ map digitToInt $ filter isDigit $ head $ lines input readSudoku _ input = map (map digitToInt . filter isDigit) $ lines input ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
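The suggested `replace` and `toList` can be exercised on a tiny example. The `type Set = Int` bitset representation, with candidate digits stored at bit positions 1 through 9, is inferred from the use of `testBit` over `[9,8..1]` and may differ from the attached sudoku.hs:

```haskell
import Data.Bits (testBit)

type Sudoku = [[Int]]
type Set = Int  -- inferred: an Int bitset with candidate digits at bits 1..9

-- Replace the value at row r, column c (both 0-based) with x.
replace :: Sudoku -> Int -> Int -> Int -> Sudoku
replace s r c x =
  let (above, row : below) = splitAt r s
      (left, _ : right)    = splitAt c row
  in above ++ ((left ++ (x : right)) : below)

-- List the digits present in the bitset, highest first.
toList :: Set -> [Int]
toList i = concatMap f [9,8..1]
  where f b = if testBit i b then [b] else []

main :: IO ()
main = do
  print (replace [[1,2],[3,4]] 1 0 9)  -- [[1,2],[9,4]]
  print (toList (2^3 + 2^7))           -- [7,3]
```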