date:20130802

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-02 Thread Philippe Tillet

Hi hi, 2013/8/2 Karl Rupp > Hi, > > > I've been thinking a bit about dynamically zero-padding > > viennacl::matrix<> for full hardware use (best bandwidth for BLAS1, > > BLAS2, best performance for BLAS3). > > > > Basically, the big problem arising is that the blocking-parameter is not > > de

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-02 Thread Karl Rupp

Hey, > > Hmm, I'm not completely sure. > The best GEMM performance is not located "around" (distance-wise in the > parameter space) the sweet spot, generally, since perturbing one > parameter can result in disastrous performance. Yeah, I agree, the sweet spot may not be defined 'distance-wise

[ViennaCL-devel] openmp 4.0

2013-08-02 Thread Evan Bollig

FYI: OpenMP 4.0 specification has been released, which includes support for accelerators, thread affinity, Fortran 2003, etc.: http://www.hpcwire.com/hpcwire/2013-07-31/openmp_40_specification_released_with_significant_new_standard_features.html And CUDA 5.5 has been released: http://www.hpcwir

Re: [ViennaCL-devel] openmp 4.0

2013-08-02 Thread Karl Rupp

Hi Evan, > OpenMP 4.0 specification has been released, which includes support for > accelerators, thread affinity, Fortran 2003, etc.: > > http://www.hpcwire.com/hpcwire/2013-07-31/openmp_40_specification_released_with_significant_new_standard_features.html Thanks! I consider the thread affinity

Re: [ViennaCL-devel] openmp 4.0

2013-08-02 Thread Evan Bollig

I hope it won't take years. I saw a presentation earlier today that they already have spec 5 started and are hoping to have most of it squared away by SC13. -E On Fri, Aug 2, 2013 at 3:20 PM, Karl Rupp wrote: > Hi Evan, > > > OpenMP 4.0 specification has been released, which includes support for

Re: [ViennaCL-devel] openmp 4.0

2013-08-02 Thread Karl Rupp

Hey, > I hope it won't take years. First compiler implementations will be available in no time, sure. However, it will take years until enterprise cluster systems like CentOS have upgraded to these compilers. We still have clusters here with GCC 4.2.x... > I saw a presentation earlier today th

Re: [ViennaCL-devel] Compilation load of matrix-test-*

2013-08-02 Thread Karl Rupp

Hi Phil, the tests are now split into more light-weight units by separating single and double precision. matrix-test was additionally split into row-major and column-major tests. This should now allow you to build with `make -j4` on weaker machines with limited RAM. Best regards, Karli On 0

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-02 Thread Philippe Tillet

Hi, 2013/8/2 Karl Rupp > Hey, > > > > > >> Hmm, I'm not completely sure. >> The best GEMM performance is not located "around" (distance-wise in the >> parameter space) the sweet spot, generally, since perturbing one >> parameter can result in disastrous performance. >> > > Yeah, I agree, the

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-02 Thread Karl Rupp

Hi, > A padding of 256 looks pretty expensive to me, resulting in a lot of > unnecessary FLOPs in worst case. Can you please assemble a list of > all GEMM kernel configuration parameters and their execution times > for the GTX 470, Tesla C2050, HD 7970 and HD 5850? mL, nL, and kL >
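Editorial sketch of the worst-case concern raised in the excerpt above: the extra GEMM work caused by rounding matrix dimensions up to a multiple of a padding size can be estimated in a few lines. The helper names `padded_size` and `gemm_flop_overhead` are hypothetical and not part of ViennaCL; the estimate assumes square matrices padded in all three GEMM dimensions and counts FLOPs only, ignoring bandwidth effects.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not ViennaCL API): round n up to the next multiple of block.
inline std::size_t padded_size(std::size_t n, std::size_t block)
{
  assert(block > 0);
  return ((n + block - 1) / block) * block;
}

// Ratio of GEMM FLOPs on the padded n x n problem to FLOPs on the unpadded one.
// Assumes all three dimensions are padded to the same multiple,
// so the ratio is (padded / n)^3.
inline double gemm_flop_overhead(std::size_t n, std::size_t block)
{
  assert(n > 0);
  const double p = static_cast<double>(padded_size(n, block));
  const double r = p / static_cast<double>(n);
  return r * r * r;
}
```

Under these assumptions, n = 257 with a padding of 256 gives a padded dimension of 512, so `gemm_flop_overhead(257, 256)` comes out at nearly eight times the necessary FLOPs, while a dimension that is already a multiple of 256 incurs no overhead.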


9 matches
