I'm surprised nobody has mentioned Theano, which seems to be the Python equivalent of what you're asking for: it includes C code generation, GPU computation, and automatic differentiation. Its approach seems amenable to a Racket implementation: it builds a graph representing the numeric operations to be performed (which you could do with a DSL), optimizes the graph, and then turns the result into fast C. You can probably steal many of its ideas.
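
To make the graph, optimize, emit-C pipeline concrete, here is a toy sketch in plain Python. It is not Theano's API or its implementation; every class and function name below (Var, Const, Add, Mul, simplify, diff, to_c, emit_function) is made up for illustration, and it only handles scalar addition and multiplication. The same structure should carry over to a Racket DSL fairly directly.

```python
from dataclasses import dataclass

# Hypothetical expression-graph nodes for illustration; not Theano's classes.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def diff(e, wrt):
    """Symbolic derivative of e with respect to the variable named wrt."""
    if isinstance(e, Const):
        return Const(0.0)
    if isinstance(e, Var):
        return Const(1.0 if e.name == wrt else 0.0)
    if isinstance(e, Add):
        return Add(diff(e.left, wrt), diff(e.right, wrt))
    # Product rule for Mul.
    return Add(Mul(diff(e.left, wrt), e.right), Mul(e.left, diff(e.right, wrt)))

def simplify(e):
    """Graph optimization pass: constant folding plus the x+0, x*0, x*1 identities."""
    if isinstance(e, (Var, Const)):
        return e
    l, r = simplify(e.left), simplify(e.right)
    if isinstance(l, Const) and isinstance(r, Const):
        folded = l.value + r.value if isinstance(e, Add) else l.value * r.value
        return Const(folded)
    if isinstance(e, Add):
        if l == Const(0.0):
            return r
        if r == Const(0.0):
            return l
        return Add(l, r)
    # Remaining case is Mul.
    if l == Const(0.0) or r == Const(0.0):
        return Const(0.0)
    if l == Const(1.0):
        return r
    if r == Const(1.0):
        return l
    return Mul(l, r)

def to_c(e):
    """Render an expression graph as a C expression string."""
    if isinstance(e, Const):
        return repr(e.value)
    if isinstance(e, Var):
        return e.name
    op = "+" if isinstance(e, Add) else "*"
    return f"({to_c(e.left)} {op} {to_c(e.right)})"

def emit_function(name, params, e):
    """Wrap the rendered expression in a C function taking double arguments."""
    args = ", ".join(f"double {p}" for p in params)
    return f"double {name}({args}) {{ return {to_c(e)}; }}"

# Build f(x) = x*x + 3*x, differentiate it, optimize the graph, emit C.
x = Var("x")
f = Add(Mul(x, x), Mul(Const(3.0), x))
df = simplify(diff(f, "x"))
print(emit_function("df", ["x"], df))
# prints: double df(double x) { return ((x + x) + 3.0); }
```

Without the simplify pass, the derivative graph would still carry terms like 1.0 * x and 0.0 * x into the generated C, which is the kind of clutter a graph optimizer is there to remove.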
http://deeplearning.net/software/theano/