On 09/09/12 19:36, renoX wrote:
> Hello,
>
> one common issue when you optimize code is that the code becomes
> difficult to read/maintain, but if you're trying to process images there
> may be hope: Halide is a DSL (currently embedded in C++) which keeps the
> algorithm and the "optimization recipe" (schedule) separated AND the
> performance can be similar to hand-optimized C++ code.
>
> You can read more about Halide here: http://halide-lang.org/
>
> Regards,
> renoX
>
> PS: I'm not related at all with Halide's developers but I thought
> this is an interesting topic.
I was about to post about Halide on D's forum: perfect timing :) Halide was announced just before SIGGRAPH 2012 (August). I think they really spotted the caveats of current image-processing frameworks, but I'm unsure it can address all IP problems (graph-based images are out, for instance), and I'm unsure how the optimization part composes with the actual composition of functions.

As bearophile said, the automatic optimization problem is still quite open, and it is not tackled in this paper. So they decided to rely on the expert to specify the optimization part in a dedicated language. They call this phase _the schedule_. The schedule language is very terse and lets experts try many different designs without rewriting the algorithm, quickly converging on an efficient solution for a particular piece of hardware.

The optimizations proposed by the framework range from "compute this subdomain once and reuse it" to "inline everything". The former is good when there are redundant calculations (as with spatial filters); the latter is better when the pipeline is purely pixel-wise. Also, because of bandwidth limitations, it is often worthwhile to trade computation against locality: better performance can be achieved by actually recomputing data instead of storing it.

I recommend that anyone interested in image processing or data crunching look at this project. For the moment they can process, and let you optimize, data pipelines of up to four dimensions. The only issue is that optimal performance is achieved only on a particular piece of hardware (GPU and CPU schedules are completely different); there is no automatic optimization at runtime yet. But I'm sure the schedules produced by experts will soon lead to heuristics enabling self-optimizing algorithms (genetic algorithms, anyone?).

There is already a Python binding, and I'm sure it would be very easy to add a D one. The DSL basically builds an AST at runtime, which is JIT-compiled to machine code.

-- Guillaume
