Hi Phil,

> The generator code is pushed on the master branch.

Cool, thanks. I actually wasn't expecting this to arrive in master today :-)
I commented on the commit on GitHub. The short summary is:
 1.) I don't quite see why we need SYMBOLIC_*, since a true symbolic
     operation could equally well be obtained by just providing NULL for
     the data members in lhs and rhs.
 2.) one_vector seems redundant to me. I suggest using/extending
     scalar_vector instead. Everything happens at runtime, so an
     additional compile-time type shouldn't be necessary.

> > The padding is no longer 'static'. The 'ALIGNMENT' template
> > parameter is now ignored (vector_base no longer holds an ALIGNMENT
> > parameter), so we can introduce a runtime padding without breaking
> > old code. Thus, we can pick a proper padding entirely at runtime,
> > tailored to the underlying device.
>
> Oh, true. This padding has to be the "smallest one compatible with all
> profiles", some sort of least common multiple, which I hope is not going
> to grow ridiculously big...

The padding is just chosen such that it fits the profile for the device.
It's not static, so we can just query everything from the underlying
device ;-)

> > Yes, a good autotuning procedure should verify the correctness of
> > the results obtained anyway. There may be compiler or hardware bugs
> > which can lead to fast, but erroneous kernels.
> >
> > A two-stage scheme seems best here:
> >  - First, find the fastest kernel (either without checking, or just
> >    checking for a particular size).
> >  - Second, verify this kernel for a couple of different sizes. If
> >    this fails, pick the next kernel, etc.
>
> Ok, I'll do that.
> However, there are things to test in the way the generator behaves,
> rather than the profiles.
> All the operations in tests/vector.cpp have to be compatible with the
> generator. Should the corresponding tests be in the same vector.cpp file
> (in some #ifdef VIENNACL_WITH_OPENCL) or should they be in a separate file?

Let's just reuse vector.cpp for that.
I need the same for the scheduler, so all I'll do is add flags such that
one can switch between a template-driven approach, the scheduler, or the
generator. Three extensive test sets with one code :-)
For more exotic operations for stress-testing the generator I suggest you
just use a separate test file.

> > > Right, it's not over-complicated to do. The problem is more about
> > > knowing the right optimization profile used at runtime (the local
> > > memory used by the to-be-compiled kernel). Ok, it means that this
> > > optimization profile should not change (since I think we cannot
> > > really use global objects), so that this local memory value is
> > > consistent over time. Only the autotuner will be allowed to play
> > > with optimization profiles, then, which is fine for me.
> >
> > There is no reason to expect that the hardware changes during the
> > execution of a process. Even if a device falls off the bus because
> > it overheats, it doesn't come back without rebooting the machine
> > (verified with two SDKs).
>
> Oh, I was more referring to the user forcing some execution profile on
> an operation. But I think it's ok not to allow it either :P

For that purpose it's probably best if the (advanced!) user interfaces
with the generator directly and forces recompilation and such. I don't see
how this would otherwise fit into the public API without torturing the
other ~95% of users who are not interested in changing execution
profiles ;-)

Best regards,
Karli
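
For illustration, the two-stage autotuning scheme quoted above could be sketched roughly as follows. This is only a sketch under stated assumptions, not ViennaCL code: the names Candidate, time_for, is_correct and pick_verified_fastest are hypothetical, and the timing and checking callbacks stand in for whatever the autotuner actually uses to run a kernel and compare it against a reference result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one tuning candidate (not a ViennaCL type).
struct Candidate {
  std::string name;
  std::function<double(std::size_t)> time_for;  // runtime for a given problem size
  std::function<bool(std::size_t)> is_correct;  // result matches a reference?
};

// Stage 1: rank all candidates by runtime at a single tuning size.
// Stage 2: walk the ranking from fastest to slowest and return the index of
//          the first candidate that is correct for every size in check_sizes.
// Returns -1 if no candidate passes verification.
inline int pick_verified_fastest(const std::vector<Candidate>& candidates,
                                 std::size_t tuning_size,
                                 const std::vector<std::size_t>& check_sizes) {
  assert(!check_sizes.empty());

  // Time each candidate exactly once, then sort by the measured time.
  std::vector<std::pair<double, int>> ranking;
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    ranking.emplace_back(candidates[i].time_for(tuning_size),
                         static_cast<int>(i));
  }
  std::sort(ranking.begin(), ranking.end());

  for (const auto& entry : ranking) {
    const Candidate& c = candidates[entry.second];
    const bool ok = std::all_of(check_sizes.begin(), check_sizes.end(),
                                [&c](std::size_t n) { return c.is_correct(n); });
    if (ok) {
      return entry.second;  // fastest candidate that survived verification
    }
  }
  return -1;
}
```

Verifying only the current front-runner keeps the expensive correctness checks off the slow candidates; a kernel that is fast but wrong for some size is simply skipped in favour of the next one in the ranking.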