I implemented a version of 'case' following Clinger's design from "Rapid Case Dispatch in Scheme" [http://scheme2006.cs.uchicago.edu/07-clinger.pdf]. (This came up a while back on the users list: [http://lists.racket-lang.org/users/archive/2011-September/048301.html]. In my case, I've been writing a virtual machine, and the opcode dispatch could benefit from a faster 'case.')
On Clinger's micro-benchmarks at [http://www.ccs.neu.edu/home/will/Research/SW2006/case.sch], my implementation performs the same as racket's current 'case' on small expressions and outperforms it as the expressions get bigger. So, for example, on the f10 and s10 benchmarks (testing case expressions with 10 fixnum and 10 symbol constants, respectively), both implementations run in 6-7 milliseconds on my machine, whereas on f1000, my implementation runs in 63-67 milliseconds while the current racket implementation takes about 3 seconds. I had previously sent a pull request on the racket github site, but Eli mentioned that it would be better to discuss the matter here first. In that pull request, I mentioned a few caveats, the first of which is that my implementation tends to produce larger code. As a data point: the total size of all the .zos in the collects tree is 102485877 when compiled with my implementation and 102272930 compiled with the current one, so the difference represents about 0.2% of the current size. A more serious issue, in my view, is that the new implementation appears to slow down the full build (raco setup) by about 7-15 seconds on my machine. Since the full build takes a long time to complete, I've been trying to use smaller benchmarks to find out what, exactly, is slower. I could definitely use some help with this; I think I've barked up several wrong trees. For example, at one point I was trying to test expansion performance like this: time raco expand file.rkt > /dev/null until I realized that, in at least one case, I was testing the pretty-printer as much as anything else. In the github thread, Vincent mentioned the syntax.rktl tests at collects/tests/racket/syntax.rktl. Averaging 5 runs of "time racket -f syntax.rktl" results in: Clinger 'case' ========== real 0m0.810 user 0m0.648 sys 0m0.129 Current 'case' ========== real 0m0.807 user 0m0.644 sys 0m0.130 So, not much difference there. The chameneos benchmark (under collects/tests/racket/benchmarks/shootout) was interesting. It uses case expressions, but only small ones. The expanded code is very similar for both implementations, but the current version seems to perform very slightly better. However, I don't know if this is an edge at runtime or at expand time (or compile time). I've begun to wonder if just adding more code to the expander, whether or not it's used, might be the issue. I'd appreciate any help in trying to determine exactly where the performance difference lies. The latest version of the code is at: https://github.com/97jaz/racket/blob/case-v3/ The only changed files are: https://github.com/97jaz/racket/blob/case-v3/collects/racket/private/more-scheme.rkt https://github.com/97jaz/racket/blob/case-v3/collects/racket/private/case.rkt [added] Note that the current version is in the 'case-v3' branch, if you decide to clone my git tree. (My initial version is in the master branch, and a second version is in branch 'case-v2' if anyone cares to compare them. -Jon _________________________ Racket Developers list: http://lists.racket-lang.org/dev