Hi hi,

2013/7/30 Karl Rupp <r...@iue.tuwien.ac.at>

> Hi Phil,
>
>
> > The generator code is pushed on the master branch.
>
> Cool, thanks. I actually wasn't expecting this to arrive in master today
> :-)
>

Haha, it's already more than one week late!


> I commented the commit on github. The short summary is:
>
> 1.) I don't quite know/see why we need SYMBOLIC_*, since a true symbolic
> operation could equally well be obtained with just providing NULL for the
> data-members in lhs and rhs.
>

Hmm, are you talking about something like :

if(lhs.type_family==VECTOR_TYPE_FAMILY){
    if(lhs.type==VECTOR_FLOAT_TYPE){
        viennacl::vector_base<float> * p = lhs.vector_float;
        if(p == NULL){
            // Will not work here, since lhs.* is a union
            viennacl::symbolic_vector_base<float> * p = lhs.symbolic_vector_float;
        }
    }
}

or rather

if(lhs.type_family==SYMBOLIC_VECTOR_TYPE_FAMILY){
    if(lhs.type==VECTOR_FLOAT_TYPE){
        // Fine here: the type family tells us which union member is valid
        viennacl::symbolic_vector_base<float> * p = lhs.symbolic_vector_float;
    }
}

which would indeed work and avoid duplication. However, when it comes to
symbolic matrices, I think the separation between row-major and col-major
does not quite fit the semantics of the handle-free symbolic matrices.
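
To make the union point concrete, here is a minimal sketch. The types below
are hypothetical stand-ins, not the actual ViennaCL classes, and `operand` is
a simplified element invented for illustration. Since all union members share
the same storage, a NULL test on one member says nothing about the others, so
the family tag has to pick the member:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the real vector types; only the shape matters.
template <typename T> struct vector_base          { std::size_t size; };
template <typename T> struct symbolic_vector_base { std::size_t size; };

enum type_family_t { VECTOR_TYPE_FAMILY, SYMBOLIC_VECTOR_TYPE_FAMILY };
enum type_t        { VECTOR_FLOAT_TYPE, VECTOR_DOUBLE_TYPE };

// Simplified operand (an assumption, not the scheduler's actual struct).
struct operand
{
  type_family_t type_family;  // says which union member is active
  type_t        type;         // numeric type, shared by both families
  union
  {
    vector_base<float>          * vector_float;
    symbolic_vector_base<float> * symbolic_vector_float;
  };
};

// Reads only the union member that the family tag declares active.
std::size_t operand_size(operand const & e)
{
  assert(e.type == VECTOR_FLOAT_TYPE);  // the sketch only handles float
  if (e.type_family == SYMBOLIC_VECTOR_TYPE_FAMILY)
    return e.symbolic_vector_float->size;
  return e.vector_float->size;
}
```

Note that `type` is reused across both families here, which is where the
duplication would be avoided.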



> 2.) one_vector seems redundant to me. I suggest to use/extend
> scalar_vector instead. Everything happens at runtime, so an additional
> compile time type shouldn't be necessary.
>
>
Yes, we've started a discussion on GitHub :P


>
>>     The padding is no longer 'static'. The 'ALIGNMENT' template
>>     parameter is now ignored (vector_base no longer holds an ALIGNMENT
>>     parameter), so we can introduce a runtime padding without breaking
>>     old code. Thus, we can pick a proper padding entirely at runtime,
>>     tailored to the underlying device.
>>
>>
>> Oh, true. This padding has to be the "smallest one compatible with all
>> profiles", some sort of least common multiple, which I hope is not going
>> to grow ridiculously big...
>>
>
> The padding is just chosen such that it fits the profile for the device.
> It's not static, so we can just query everything from the underlying device
> ;-)
>
>
I know, I know. What I meant for example was :

if the super-optimized BLAS2 requires a padding that is a multiple of 16*112
and the BLAS3 one a multiple of 32*96, then we have to be careful to pad to a
common multiple of both, i.e. 32*672 :)
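
A minimal sketch of that "least common multiple" idea, assuming the figures
are read as rows*columns (that reading, and all names below, are assumptions
for illustration, not existing generator code):

```cpp
#include <cassert>
#include <cstddef>

// Padding requirement of one kernel profile, read as rows x columns.
struct padding { std::size_t rows; std::size_t cols; };

// Euclid's algorithm.
std::size_t gcd_of(std::size_t a, std::size_t b)
{
  while (b != 0) { std::size_t t = a % b; a = b; b = t; }
  return a;
}

// Least common multiple; divides first to keep intermediate values small.
std::size_t lcm_of(std::size_t a, std::size_t b)
{
  assert(a > 0 && b > 0);
  return a / gcd_of(a, b) * b;
}

// Smallest padding that satisfies both profiles in each dimension.
padding common_padding(padding const & p, padding const & q)
{
  padding r = { lcm_of(p.rows, q.rows), lcm_of(p.cols, q.cols) };
  return r;
}
```

For the numbers above, `common_padding` of {16, 112} and {32, 96} gives
{32, 672}, since 112 = 16*7 and 96 = 32*3 share only a factor of 16.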


>
>>     Yes, a good autotuning procedure should verify the correctness of
>>     the results obtained anyway. There may be compiler or hardware bugs
>>     which can lead to fast, but erroneous kernels.
>>
>>     A two-stage scheme seems best here:
>>     - First, find the fastest kernel (either without checking, or just
>>     checking for a particular size).
>>     - Second, verify this kernel for a couple of different sizes. If
>>     this fails, pick the next kernel, etc.
>>
>>
>> Ok, I'll do that.
>> However, there are things to test in the way the generator behaves,
>> rather than the profiles.
>> All the operations in tests/vector.cpp have to be compatible with the
>> generator. Should the corresponding tests be in the same vector.cpp file
>> (in some #ifdef VIENNACL_WITH_OPENCL) or should it be in a separate file?
>>
>
> Let's just reuse vector.cpp for that. I need the same for the scheduler,
> so all I'll do is to add flags such that one can switch between a
> template-driven approach, the scheduler, or the generator. Three extensive
> test sets with one code :-)
>
> For more exotic operations for stress-testing the generator I suggest you
> just use a separate test file.
>
>
Great, ok :)
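
For the record, the two-stage scheme quoted above could look roughly like
this. It is only a sketch: `candidate`, `pick_kernel` and the two toy
verifiers are hypothetical names, and a real verifier would compare the
kernel output against a reference result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one tuned kernel candidate.
struct candidate
{
  int    id;
  double time;                           // measured in stage one
  bool (*is_correct)(std::size_t size);  // checks the result for one size
};

bool faster(candidate const & a, candidate const & b) { return a.time < b.time; }

// Toy verifiers, stand-ins for a comparison against a reference.
bool always_correct(std::size_t)    { return true; }
bool wrong_for_large(std::size_t n) { return n < 1000; }

// Stage one: order the candidates by measured time.
// Stage two: return the id of the fastest candidate that passes for every
// size, or -1 if none does.
int pick_kernel(std::vector<candidate> candidates,
                std::vector<std::size_t> const & sizes)
{
  std::sort(candidates.begin(), candidates.end(), faster);
  for (std::size_t i = 0; i < candidates.size(); ++i)
  {
    assert(candidates[i].is_correct != 0);
    bool ok = true;
    for (std::size_t j = 0; j < sizes.size() && ok; ++j)
      ok = candidates[i].is_correct(sizes[j]);
    if (ok)
      return candidates[i].id;
  }
  return -1;
}
```

A kernel that is fast but wrong for large sizes is skipped in favour of the
next fastest one as soon as a large size is part of the verification set.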


>
>>         Right, it's not over-complicated to do. The problem is more about
>>         knowing the right optimization profile used at runtime (the local
>>         memory used by the to-be-compiled kernel). Ok, it means that this
>>         optimization profile should not change (since I think we cannot
>>         really use global objects), so that this local memory value is
>>         consistent over time. Only the autotuner will be allowed to play
>>         with optimization profiles, then, which is fine for me.
>>
>>
>>     There is no reason to expect that the hardware changes during the
>>     execution of a process. Even if a device falls off the bus because
>>     it overheats, it doesn't come back without rebooting the machine
>>     (verified with two SDKs).
>>
>>
>> Oh, I was more referring to the user forcing some execution profile on an
>> operation. But I think it's ok not to allow it either :P
>>
>
> For that purpose it's probably best if the (advanced!) user interfaces with the
> generator directly and forces recompilation and such. I don't see how this
> would otherwise fit into the public API without torturing the other ~95% of
> users who are not interested in changing execution profiles ;-)
>

If the user may change the execution profile, though, I can see a potential
problem:
- The matrix constructor will probably need to be aware of the right
execution profiles to pad the matrix correctly.
- If the user changes a profile after we pad the matrix (the padding is
supposed to be transparent), the results will become undefined.
I fear even an advanced user interface cannot grant access to the
optimization profiles...
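
To illustrate the problem, here is a small sketch with made-up types
(`profile` and `padded_matrix` are hypothetical and reduced to one dimension;
nothing here is existing ViennaCL code). The padded size is fixed when the
matrix is constructed, so a profile chosen later may no longer fit it:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical profile: the kernel assumes sizes are multiples of 'padding'.
struct profile { std::size_t padding; };

// Rounds n up to the next multiple of p.
std::size_t round_up(std::size_t n, std::size_t p)
{
  assert(p > 0);
  return (n + p - 1) / p * p;
}

// Hypothetical matrix that fixes its padded size once, at construction.
struct padded_matrix
{
  std::size_t rows, internal_rows;
  padded_matrix(std::size_t n, profile const & p)
    : rows(n), internal_rows(round_up(n, p.padding)) {}
};

// A kernel generated for 'p' is only safe if the stored padding still fits.
bool compatible(padded_matrix const & m, profile const & p)
{
  return m.internal_rows % p.padding == 0;
}
```

With 100 rows padded for a profile requiring multiples of 16, the internal
size is 112, which a later profile requiring multiples of 96 cannot use.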


> Best regards,
> Karli
>

Best regards,
Philippe
_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
