Re: [ViennaCL-devel] Problem using different contexts from different OpenMP-Threads

2013-07-21 Thread Karl Rupp
Hi Andreas, > I have a program which has to iteratively solve a number of independent equation systems. > The actual number of systems may vary between 1 and 10. > Because they are totally independent from each other, we assign each system a > number of OpenMP threads allowing them to be solved

Re: [ViennaCL-devel] building against boost

2013-07-21 Thread Karl Rupp
Hi Evan, > Building against Boost is a challenge with CMake. I repeatedly run > into problems building ViennaCL against boost_filesystem and > boost_system libraries depending on how the host has them built. I agree, it's indeed often unnecessarily challenging on Windows. Do you encounter the s

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-21 Thread Karl Rupp
Hi Evan, > Can you refresh my memory on how ViennaCL behaves with this operation: > > vector = vector - vector_range > > I would like to subtract an M < N sized vector_range from an N sized > vector (by operating on the first M elements). Is this supported in > 1.4.2? Yes, this is supported. You
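
The operation asked about above (subtracting an M-sized range from the first M entries of an N-sized vector) can be illustrated on the host with plain standard-library C++. This is only a sketch of the intended semantics, not ViennaCL code: `subtract_from_prefix` is a hypothetical helper, and in ViennaCL itself the follow-up messages in this thread indicate that the left-hand side has to be projected to a matching range as well.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Host-side illustration only (hypothetical helper, not part of ViennaCL):
// subtract an M-sized block from the first M entries of an N-sized vector
// (M <= N) and leave the remaining N - M entries untouched.
std::vector<double> subtract_from_prefix(std::vector<double> x,
                                         const std::vector<double>& sub)
{
  if (sub.size() > x.size())
    throw std::invalid_argument("range must not be longer than the vector");

  for (std::size_t i = 0; i < sub.size(); ++i)
    x[i] -= sub[i];
  return x;
}
```

Calling `subtract_from_prefix({5.0, 5.0, 5.0, 5.0}, {1.0, 2.0})` changes only the first two entries of the result; the tail keeps its original values.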

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-21 Thread Karl Rupp
>>>> >>> On Sun, Jul 21, 2013 at 4:13 PM, Evan Bollig wrote: >>>> Ah I see. Thanks for the clarification on projecting the LHS. I was >>>> trying to assign the result to an unprojected vector assuming (vector >>>> - vector

Re: [ViennaCL-devel] OpenCL 2.0 Provisional !

2013-07-22 Thread Karl Rupp
Hey, thanks for the hint! There's one thing that is of particular interest for us: The portable intermediate representation, which is essentially a LLVM IR standardized for OpenCL. This should cut down compilation times and at the same time allow for a few more optimizations which are hard to

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-22 Thread Karl Rupp
rn a vector_range (i.e., operation returns operand type). >> >> -E >> >> On Sun, Jul 21, 2013 at 4:09 PM, Karl Rupp wrote: >>> Hi Evan, >>> >>> >>>> Can you refresh my memory on how ViennaCL behaves with this operation: >>>

Re: [ViennaCL-devel] scheduler in dev-master

2013-07-22 Thread Karl Rupp
Hi Evan, thanks, this is now fixed. Unfortunately GCC 4.6 and above use some mixed C++11 compilation mode, in which these extra typename keywords are perfectly valid. They are not allowed in C++03, thus GCC 4.4 is right about complaining here. Any hints on how to make GCC 4.6 and above complai

Re: [ViennaCL-devel] Problem using different contexts from different OpenMP-Threads

2013-07-22 Thread Karl Rupp
Hi Andreas, the rigorous solution of the problem turns out to require more effort than anticipated so that the generic solver implementations don't get messed up. I'll need another day or two to correct this. Best regards, Karli On 07/21/2013 09:59 AM, Andreas IHU wrote: > Hi again, > >

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-24 Thread Karl Rupp
latest dev and test against that. >> >> Cheers, >> -E >> >> On Mon, Jul 22, 2013 at 1:47 PM, Karl Rupp wrote: >>> Hi Evan, >>> >>> I just pushed support for sparse matrix-vector products when using >>> vector-ranges and vector-sli

Re: [ViennaCL-devel] viennacl cuthill mckee

2013-07-24 Thread Karl Rupp
Hi Evan, > Hey Karl, does the cuthill mckee algorithm from ViennaCL account for > unsymmetric matrices? I looked it up in the bachelor thesis of the student and he assumed a symmetric *graph*. Hence, if your matrix is structurally symmetric, but non-symmetric in terms of values, things should

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-24 Thread Karl Rupp
> >> -E >> >> On Wed, Jul 24, 2013 at 4:53 PM, Karl Rupp wrote: >>> Hi Evan, >>> >>> this is strange, as there are tests checking for exactly this. >>> Could you please run a 'make clean && make' in the build folder? I su

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-24 Thread Karl Rupp
hanks Karl. I thought it was strange too. I blasted my ~/.nv folder >> and am waiting for make clean && make to finish. I'll keep you posted. >> >> -E >> >> On Wed, Jul 24, 2013 at 4:53 PM, Karl Rupp wrote: >>> Hi Evan, >>> >>> this is s

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-24 Thread Karl Rupp
morning. > > -E > > On Wed, Jul 24, 2013 at 9:03 PM, Evan Bollig wrote: >> Ok, I'll take a look. Yes, I'm using ELL. >> >> -Evan Bollig >> >> On Jul 24, 2013 6:34 PM, "Karl Rupp" wrote: >>> >>> Hi again, >>> >>> whic

Re: [ViennaCL-devel] vector = vector - vector_range

2013-07-25 Thread Karl Rupp
Hi Evan, this is now fixed for COO, ELL, and HYB formats on OpenCL. CPU and CUDA backends will be fixed later today. Best regards, Karli On 07/24/2013 11:24 PM, Evan Bollig wrote: > Cool. I appreciate the help! > > -E > > On Wed, Jul 24, 2013 at 11:20 PM, Karl Rupp wr

Re: [ViennaCL-devel] permutation matrix

2013-07-26 Thread Karl Rupp
Hi Evan, > I need to scatter the elements of a vector out to multiple processors. > The mapping is one to many (vector elements can go to many procs). I > would like to do this with a permutation matrix which has 1 nonzero > per row. > > I'd like the process to run on the GPU, so a warp would need

Re: [ViennaCL-devel] Some benchmarks

2013-07-27 Thread Karl Rupp
Hi, @all: We will use this mailing list for *all* development discussions now, replacing previous private email communications - expect quite some additional traffic. :-) > I did some brief benchmarks using the following function: > def dobench(size): > ... v = p.Vector(size, 0.1) > .

Re: [ViennaCL-devel] Kernel Generator wrap-up

2013-07-28 Thread Karl Rupp
Hey, > I'm proud to announce that after about 3 weeks, I've recoded from scratch > the OpenCL code generator to integrate it fully with > viennacl::scheduler::statement. hurray :-) With the changes to the generator I pushed yesterday there is now a clear spot on where to hand the expression over

Re: [ViennaCL-devel] Kernel Generator wrap-up

2013-07-28 Thread Karl Rupp
Hey, > My preferred option is to pad by default and either to make the > padding a multiple of four or sixteen. However, we need to maintain > a full set of unpadded operations, because user-provided buffers > need not be padded (and a subsequent padding may be too expensive) > >
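
The padding to a multiple of four or sixteen mentioned above amounts to rounding a size up to the next multiple. A minimal sketch, assuming unsigned sizes and a non-zero multiple; `padded_size` is a hypothetical helper name and not taken from the ViennaCL sources:

```cpp
#include <cstddef>

// Smallest multiple of `multiple` that is greater than or equal to `size`.
// Hypothetical helper for illustration; `multiple` must be non-zero.
std::size_t padded_size(std::size_t size, std::size_t multiple)
{
  return ((size + multiple - 1) / multiple) * multiple;
}
```

With a multiple of 16, a vector of 17 entries would occupy 32 padded entries, which also shows why large padding values can become expensive for small objects.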

[ViennaCL-devel] Scheduler datastructure changes

2013-07-28 Thread Karl Rupp
Hey, see commit message here: https://github.com/viennacl/viennacl-dev/commit/1a214259f577acd1b329197285e26cf2cd774e34 Best regards, Karli

Re: [ViennaCL-devel] Problem using different contexts from different OpenMP-Threads

2013-07-28 Thread Karl Rupp
Hi Andreas, - good news: I made good progress on introducing a generic context. For code lines such as viennacl::vector x = y + z; the vector x is created in the correct context (i.e. deduced from y, z). This resolves most of the issues with temporaries. - bad news: I haven't found enough tim

Re: [ViennaCL-devel] Problem using different contexts from different OpenMP-Threads

2013-07-29 Thread Karl Rupp
Hi Andreas, I just pushed the remaining code changes to the viennacl-dev repository. examples/benchmarks/solver.cpp now runs the iterative solvers and preconditioners in their own context, passing viennacl::context() to the constructor to overwrite the default context. This should allow for the

Re: [ViennaCL-devel] Problem using different contexts from different OpenMP-Threads

2013-07-29 Thread Karl Rupp
Hi Phil, > Thanks Karl ! > This will serve us well when we deal with multiple GPUs ;) I'm pretty happy with the model now, basically extending the concept of a 'context' in OpenCL beyond OpenCL boundaries: Create vectors as follows: viennacl::vector x(42); //vector in default context vi

[ViennaCL-devel] async_copy for vectors

2013-07-29 Thread Karl Rupp
Hi guys, as I was recently discussing asynchronous transfer and execution with Evan in an MPI context, this is now addressed with viennacl::async_copy() Typical use case: std::vector std_x(SIZE); viennacl::vector vcl_x(SIZE); viennacl::async_copy(std_x, vcl_x); // same as next line

Re: [ViennaCL-devel] Kernel Generator wrap-up

2013-07-29 Thread Karl Rupp
Hi Phil, > The generator code is pushed on the master branch. Cool, thanks. I actually wasn't expecting this to arrive in master today :-) I commented the commit on github. The short summary is: 1.) I don't quite know/see why we need SYMBOLIC_*, since a true symbolic operation could equally

Re: [ViennaCL-devel] Mid-Term Evaluations

2013-07-31 Thread Karl Rupp
Hi Toby, thanks for submitting the evaluation :-) > Now for the couple of questions... > > I'm currently thinking that I need to rewrite the majority of the Python > expression tree code in C++/Boost.Python, because (as I feared a while > ago) there's quite a lot of overhead in using pure Python

Re: [ViennaCL-devel] Mid-Term Evaluations

2013-07-31 Thread Karl Rupp
Hey, >> I started going through the code, and you're right, the change isn't >> very large. But it makes sense to split it up as you describe, to keep >> the different semantics different, and same semantics shared (ie, to >> make sure the concepts are as clear as possible). I also have another >

Re: [ViennaCL-devel] Integrating the Kernel Generator with the Scheduler

2013-07-31 Thread Karl Rupp
Hi, On 07/31/2013 03:48 PM, Philippe Tillet wrote: > Hi, > > I've explored a bit of the execute_*.hpp files, but I'm not sure how to > integrate the generator here, ie when to create the kernel generation > object, when to add statements for generation, where to trigger > generation, etc... Any id

Re: [ViennaCL-devel] Integrating the Kernel Generator with the Scheduler

2013-07-31 Thread Karl Rupp
Hi ho, > However, my question was rather about packing multiple operations > together, and specifically scoping the necessary kernel generator object > (or more generally scoping the std::vector that has to be > packed together for generation/execution) > *bool code_generator::add(statement)* > r

Re: [ViennaCL-devel] MEMORY_NOT_INITIALIZED errors

2013-08-01 Thread Karl Rupp
Hi Toby, yes, I totally agree that we should have different types of exceptions. Please commit it yourself directly, you should have push permissions. :-) Best regards, Karli On 08/01/2013 02:01 PM, Toby St Clere Smithe wrote: > Toby St Clere Smithe > writes: >> Actually, whoops -- that's bug

Re: [ViennaCL-devel] MEMORY_NOT_INITIALIZED errors

2013-08-01 Thread Karl Rupp
Hi again, actually, please introduce an exception derived from std::exception just like for the scheduler. Thanks and best regards, Karli On 08/01/2013 02:01 PM, Toby St Clere Smithe wrote: > Toby St Clere Smithe > writes: >> Actually, whoops -- that's buggy. I'll fix it... > > OK -- fixed pa
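
The request above, an exception type derived from std::exception instead of a thrown `const char*`, could look roughly like the following. This is a sketch under assumptions: the class name `memory_not_initialized_error` and the message prefix are made up for illustration and are not claimed to match what was committed to ViennaCL.

```cpp
#include <exception>
#include <string>

// Hypothetical exception type for illustration; name and message format
// are assumptions, not the actual ViennaCL class.
class memory_not_initialized_error : public std::exception
{
public:
  explicit memory_not_initialized_error(const std::string& what_arg)
    : message_("Memory not initialized: " + what_arg) {}

  // Returns a pointer into message_, valid as long as the exception lives.
  const char* what() const noexcept override { return message_.c_str(); }

private:
  std::string message_;
};
```

Because the type derives from std::exception, callers (including wrapper layers such as Boost.Python) can catch it via `catch (const std::exception& e)` and read `e.what()`.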

Re: [ViennaCL-devel] MEMORY_NOT_INITIALIZED errors

2013-08-01 Thread Karl Rupp
Hi, >> yes, I totally agree that we should have different types of exceptions. >> Please commit it yourself directly, you should have push permissions. :-) > > Right-ho, will do :) (I also managed to coerce Boost into handling char* > exceptions -- the trick was remembering that they're all /cons

Re: [ViennaCL-devel] MEMORY_NOT_INITIALIZED errors

2013-08-01 Thread Karl Rupp
Hey, >> Oh no - let's hope that this doesn't delay the process of actually >> 'doing it right', i.e. throw exceptions derived from std::exception >> rather than quick&dirty const char* stuff from the prototyping stage... ;-) > > Haha, it won't: I believe the time cost of not being able to write an

Re: [ViennaCL-devel] fast_/async_copy segfault with nvidia OpenCL

2013-08-01 Thread Karl Rupp
Hi Toby, it certainly has to do with your scalar_vector. Have a look at the documentation for fast_copy: http://viennacl.sourceforge.net/doc/namespaceviennacl.html#a815cf9646ece6cc98ec80b3f925c482d "However, keep in mind that the cpu type MUST represent a linear piece of memory, otherwise you

Re: [ViennaCL-devel] Compilation load of matrix-test-*

2013-08-01 Thread Karl Rupp
Hi, > I have had troubles compiling matrix-test-* for quite some time, but it > has gotten worse over time. The compilation process appears to eat up one > core at 100% (I have a Core i5!) and over 1GB of RAM, which is enough to > freeze my computer for 20-25sec. 100% is just what a core is suppos

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-01 Thread Karl Rupp
Hi, > I've been thinking a bit about dynamically zero-padding > viennacl::matrix<> for full hardware use ( best bandwidth for BLAS1, > BLAS2, best performance for BLAS3). > > Basically, the big problem arising is that the blocking-parameter is not > dependent on the hardware or the matrix, but ra

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-02 Thread Karl Rupp
Hey, > > Hmm, I'm not completely sure. > The best GEMM performance is not located "around" (distance-wise in the > parameter space) the sweet spot, generally, since perturbing one > parameter can result in disastrous performance. Yeah, I agree, the sweet spot may not be defined 'distance-wise

Re: [ViennaCL-devel] openmp 4.0

2013-08-02 Thread Karl Rupp
Hi Evan, > OpenMP 4.0 specification has been released, which includes support for > accelerators, thread affinity, Fortran 2003, etc.: > > http://www.hpcwire.com/hpcwire/2013-07-31/openmp_40_specification_released_with_significant_new_standard_features.html Thanks! I consider the thread affinity

Re: [ViennaCL-devel] openmp 4.0

2013-08-02 Thread Karl Rupp
Hey, > I hope it won't take years. First compiler implementations will be available in no time, sure. However, it will take years until enterprise cluster systems like CentOS have upgraded to these compilers. We still have clusters here with GCC 4.2.x... > I saw a presentation earlier today th

Re: [ViennaCL-devel] Compilation load of matrix-test-*

2013-08-02 Thread Karl Rupp
Hi Phil, the tests are now split into more light-weight units by separating single and double precision. matrix-test was additionally split into row-major and column-major tests. This should now allow you to build with `make -j4` on weaker machines with limited RAM. Best regards, Karli On 0

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-02 Thread Karl Rupp
Hi, > A padding of 256 looks pretty expensive to me, resulting in a lot of > unnecessary FLOPs in worst case. Can you please assemble a list of > all GEMM kernel configuration parameters and their execution times > for the GTX 470, Tesla C2050, HD 7970 and HD 5850? mL, nL, and kL >

Re: [ViennaCL-devel] zero-padding datastructures...

2013-08-04 Thread Karl Rupp
Hi, > We actually need two sets of files: One for dumping the benchmark > results, one for holding the 'best' parameter configuration. For > dumping results, we probably want something more lightweight than XML: > - JSON > - Just CSV files with a metadata section, e.g. > #

Re: [ViennaCL-devel] PyViennaCL Semantics: what should "vector * vector" mean?

2013-08-04 Thread Karl Rupp
Hi, > Suppose we have vectors v1 and v2. Then, we have four options for the > semantics of "v1 * v2": > > 1) Element-wise product > 2) Dot product > 3) Outer product > 4) Leave undefined > > Most of the time, in the rest of PyViennaCL, I've chosen semantics for > the * operator that make sense gi

Re: [ViennaCL-devel] PyViennaCL Semantics: what should "vector * vector" mean?

2013-08-04 Thread Karl Rupp
Hi Toby, > The main difficulty with following the conventions is that it's not > clear which is the convention to pick. NumPy provides both a matrix() > class and a ndarray() class -- the former has semantics closer to matrix > algebra, whilst the latter is designed to be closer to having more >

Re: [ViennaCL-devel] PyViennaCL Semantics: what should "vector * vector" mean?

2013-08-04 Thread Karl Rupp
Hi Toby, > At the moment, I have the following semantics for *: > >Matrix * Matrix -> Matrix (matrix product) >Matrix * Vector -> Vector (matrix product) >Matrix * Scalar -> Matrix (scalar product) > >Vector * Vector -> Matrix (outer product) >Vector * Scalar -> Vector (scalar

Re: [ViennaCL-devel] PyViennaCL Semantics: what should "vector * vector" mean?

2013-08-05 Thread Karl Rupp
Hi Toby, >> I consider Vector * Vector -> Matrix to be surprising or at least >> somewhat non-intuitive. Following your operations, the most reasonable >> definition would be >> >> >Matrix * Matrix -> Matrix (matrix product) >> >Vector * Vector -> Vector (element-wise product) >> >

Re: [ViennaCL-devel] Compilation load of matrix-test-*

2013-08-06 Thread Karl Rupp
Hi, > I've just realized I had forgotten to answer! > My computer is no longer laggy in single-threaded mode, which is already > a good thing :) It still cannot bear make -j4, even though it has 4GB of > RAM; my desktop computer can without any issue, though. I'll update this > when I have cleane

Re: [ViennaCL-devel] On Autotuning GEMM

2013-08-07 Thread Karl Rupp
Hi, > For a few days, I've been playing around with AMD's CodeXL, the HD5850 > and the generator/autotuner: > > > - First of all, I want to share something that made me completely crazy. > Avoid : > *vector += scalar*vector > * > in a compute bound context. After replacing the above by: > *vector.

Re: [ViennaCL-devel] Struggling with matrix proxies

2013-08-07 Thread Karl Rupp
Hi Toby, > I'm currently trying to implement matrix and vector proxies in > PyViennaCL, and I can't get my matrices to look right. Suppose I have > the following arbitrary 5x5 matrix, as displayed in Python: > m.value > array([[ 1., 2., 3., 4., 0.], > [ 5., 6., 7., 0.

[ViennaCL-devel] IRC meeting tomorrow?

2013-08-07 Thread Karl Rupp
Hi guys, let's have another IRC meeting (#ViennaCL on irc.freenode.net) this week! Everybody is welcome to join :-) As I'm presumably unavailable on Friday and don't know about my availability over the weekend, what about tomorrow, Thursday, at 19:00 UTC? Does this work for at least Toby and P

Re: [ViennaCL-devel] IRC meeting tomorrow?

2013-08-07 Thread Karl Rupp
Hi, > 19:00 UTC suits me fine. Cool, so we have a meeting of four already. Welcome, Evan :-) >> Potential topics (of course all work in progress): >>- pyViennaCL: current status >>- Scheduler: interface extensions required? >>- Generator: Define functionality for 1.5.0 release >>

Re: [ViennaCL-devel] On Autotuning GEMM

2013-08-07 Thread Karl Rupp
Hi Phil, please don't drop the devel mailing list unless you mention some French secrets (maybe cheese or wine recipes?) which you are not allowed to share in public ;-) > They switched from a VLIW-architecture to their GCN architecture within > the HD7xxx series: > http://en.wik

Re: [ViennaCL-devel] Struggling with matrix proxies

2013-08-07 Thread Karl Rupp
> OK, all working now: > m = p.Matrix(10,10,1.0) p.Assign(m[0:5,0:5], p.Matrix(5,5,5.0)).execute(); m.value > > array([[ 5., 5., 5., 5., 5., 1., 1., 1., 1., 1.], > [ 5., 5., 5., 5., 5., 1., 1., 1., 1., 1.], > [ 5., 5., 5., 5., 5., 1., 1., 1., 1.

Re: [ViennaCL-devel] IRC meeting tomorrow?

2013-08-07 Thread Karl Rupp
Hi, >> Currently the closest thing to a roadmap is the issue tracker. We had a >> couple of long-term ideas on our Trac server in Vienna, but since we are >> about to close that down, I better not share the link and instead >> transfer it over to the Github wiki to get this rectified. It's not >>

Re: [ViennaCL-devel] IRC meeting tomorrow?

2013-08-07 Thread Karl Rupp
Hi again, I transferred all the interesting stuff over to the wiki: https://github.com/viennacl/viennacl-dev/wiki/ViennaCL-Roadmap Best regards, Karli On 08/07/2013 04:18 PM, Toby St Clere Smithe wrote: > Karl Rupp writes: >> Currently the closest thing to a roadmap is the issue tr

Re: [ViennaCL-devel] old link

2013-08-11 Thread Karl Rupp
Hi Evan, > I remembered I posted this link on my blog a while ago. Hit the link > again today and see that the new version (May 2013) gives a good shout > out to ViennaCL for direct solvers: > > http://www.netlib.org/utk/people/JackDongarra/la-sw.html Haha, ViennaCL is certainly most well-known

Re: [ViennaCL-devel] BLAS3, range, slice, compilation time...

2013-08-11 Thread Karl Rupp
Hi Xeon Phil ;-) > There are a lot of problems related to coupling the current BLAS3 > implementation with the kernel generator: > > - While I think I could add some range support, adding slices will be > extremely difficult, and it would probably result in bad performance > whatever kernel is u

Re: [ViennaCL-devel] Polishing BLAS3 benchmark

2013-08-12 Thread Karl Rupp
Hi, > Good news : the GEMMs calls for OpenCL on dense non-proxy matrix now > call the generator ! It's a good step towards performance portability. Hurray, indeed it is! Well done! :-) Now as you fixed some things in the autotuner, I could also give it another shot on the MIC. Does the autotune

Re: [ViennaCL-devel] Polishing BLAS3 benchmark

2013-08-12 Thread Karl Rupp
Hi, > I still have some polishing to do for the autotuner, so that it indeed > prints the same thing as viennacl-info. The problem of the MIC is that I > have absolutely ZERO idea of how to prune profiles. We can try, with > the GPU configuration so that it remains tractable, but I don't > gua

[ViennaCL-devel] Updates to scheduler (node members updated)

2013-08-12 Thread Karl Rupp
Hi guys, as discussed at the last IRC meeting on Thursday, I've updated the scheduler node members. Both LHS and RHS now have the following members: - type_family: One out of: COMPOSITE_OPERATION_FAMILY SCALAR_TYPE_FAMILY VECTOR_TYPE_FAMILY MATRIX_TYPE_FAMILY - subtype:

Re: [ViennaCL-devel] BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hey, alright, we've got some issues to fight ;-) On GPUs with 16kB of shared memory (e.g. GTX 285), the generated GEMM kernels now exceed the available memory: Log: ptxas error : Entry function 'kernel_0x207f4b0_0' uses too much shared data (0x40a0 bytes + 0x10 bytes system, 0x4000 max) Thi
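
The ptxas message above is a plain budget check: 0x40a0 bytes (16544) requested by the kernel plus 0x10 bytes (16) of system data gives 16560 bytes, which exceeds the 0x4000 byte (16384, i.e. 16 kB) limit. A minimal sketch of such a check follows; `fits_in_local_memory` is a hypothetical helper, and in an OpenCL program the device limit would typically come from querying `CL_DEVICE_LOCAL_MEM_SIZE`.

```cpp
#include <cstddef>

// Hypothetical helper for illustration: does a kernel's shared (local) memory
// request, plus the bytes reserved by the system, fit on the device?
bool fits_in_local_memory(std::size_t kernel_bytes,
                          std::size_t system_bytes,
                          std::size_t device_max_bytes)
{
  return kernel_bytes + system_bytes <= device_max_bytes;
}
```

For the numbers in the log, `fits_in_local_memory(0x40a0, 0x10, 0x4000)` is false, so a kernel profile with this footprint would have to be rejected or replaced by a more conservative one on such a device.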

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hi, > On GPUs with 16kB of shared memory (e.g. GTX 285), the generated > GEMM kernels now exceed the available memory: > > Log: ptxas error : Entry function 'kernel_0x207f4b0_0' uses too > much shared data (0x40a0 bytes + 0x10 bytes system, 0x4000 max) > > This is because of

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hi, > We can directly query the available local device memory (which is the > reason why I added all this buffering to the device class). Am I missing > something? > > > Yes, we could. But having the combination {vendor, local memory} seems a > bit weird to me, I think {vendor, genera

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
or ID: 4318 Version: OpenCL 1.0 CUDA Driver Version:304.43 Maybe the work group size exceeds 512? It works well on the GTX 470, though... Best regards, Karli On 08/13/2013 11:01 AM, Philippe Tillet wrote: > Hi hi, > > > 20

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hi, > Yes, the default NVidia profile for double precision uses a work group > size of 1024... All this is checked during the autotuning procedure so > that it will work for the hardware it's tuned for... > Meh, seems like we need a couple additional levels of abstraction to > reach safety. In

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hey, > {vendor, generation} is the natural format for handling the > profile internally, yes. This will presumably involve string parsing > of the device name, yes :-( > > > I'll do that :) Should I add a "generation" method in the ocl::device > class? I think it is most suited he

[ViennaCL-devel] Fwd: APPML is now available as open source as clMath

2013-08-13 Thread Karl Rupp
Hi guys, wow, AMD open-sourced their Math libraries... Best regards, Karli --- *AMD Accelerated Parallel Processing Math Libraries (APPML) is now available as open source as clMath.* I am extremely pleased to have the opportunity to announce that the APPML BLAS & FFT proje

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hey, > I've pushed the changes. Does it solve the GTX285 case? thanks, it does! > The policy is : > > - One global GPU fallback (very conservative) > - One global CPU fallback (very conservative) > - One global Accelerator fallback (very conservative) > -One Fallback per architecture family >
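
The fallback policy quoted above can be sketched as a two-level lookup: first a profile for the {vendor, architecture family} pair, then a conservative global fallback per device type. This is only an illustration under assumptions: the type names, the map-based storage, and the lookup order are hypothetical, and the quoted list is truncated, so the real policy may contain further levels.

```cpp
#include <map>
#include <string>
#include <utility>

// All names below are hypothetical and chosen for illustration only.
enum class device_type { gpu, cpu, accelerator };

struct profile { std::string label; };

typedef std::pair<std::string, std::string> vendor_family;  // {vendor, family}

// Prefer a profile registered for the architecture family; otherwise use the
// (very conservative) global fallback for the device type.
// Throws std::out_of_range if no global fallback is registered for `type`.
profile select_profile(const std::map<vendor_family, profile>& family_profiles,
                       const std::map<device_type, profile>& global_fallbacks,
                       const std::string& vendor,
                       const std::string& family,
                       device_type type)
{
  auto it = family_profiles.find(std::make_pair(vendor, family));
  if (it != family_profiles.end())
    return it->second;
  return global_fallbacks.at(type);
}
```

A device whose family is unknown still obtains a working, if slow, profile this way, which is the purpose of the conservative fallbacks.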

Re: [ViennaCL-devel] Fwd: BLAS3, range, slice, compilation time...

2013-08-13 Thread Karl Rupp
Hi, > Do we want to keep the full device name in the profiles map? With > vendor and arch determined, we know pretty much everything we need > to know. If we need to match the name 1:1, there may be too many > devices which we miss even though the 'faster' profile should work? > >

[ViennaCL-devel] Parallella boards

2013-08-14 Thread Karl Rupp
Hi guys, Parallella will ship their first (OpenCL-enabled!) boards in October and also offers a university partner program: http://www.parallella.org/pup/ I'm tempted to order one of the boards in September and eventually apply for the university partnership program when I'm back in Vienna. Who

Re: [ViennaCL-devel] Parallella boards

2013-08-14 Thread Karl Rupp
Hi Toby, > Karl Rupp writes: >> Parallella will ship their first (OpenCL-enabled!) boards in October and >> also offers a university partner program: >> http://www.parallella.org/pup/ > > I'd forgotten about these things! Their roadmap does look intriguing; >

[ViennaCL-devel] Scheduler progresses

2013-08-15 Thread Karl Rupp
Hi guys, the scheduler for kernel fusion makes good progress. Toby, you should be able to use all of the fundamental dense linear algebra operations now. There should only be two blocks of functionality missing: - Sparse matrices (i.e. matrix-vector products) - In some cases where += and

[ViennaCL-devel] Next IRC meeting?

2013-08-15 Thread Karl Rupp
Hi again, what about an IRC meeting on Monday, August 19, 19:00 UTC? The largest items for the release are about to be completed, so we transition into the polishing phase. Potential topics: - Low-level C interface: How close to BLAS should it be? - PyViennaCL: Current status - Interface e

Re: [ViennaCL-devel] Scheduler progresses

2013-08-15 Thread Karl Rupp
Hey, > Nice job guys! The peanut gallery approves. I'll continue watching the > masters at play. :-P haha, we'll do our best to keep you entertained :-D Best regards, Karli > On Thu, Aug 15, 2013 at 7:22 PM, Karl Rupp wrote: >> Hi guys, >> >> the s

Re: [ViennaCL-devel] OpenCL to CUDA kernel translation

2013-08-16 Thread Karl Rupp
Hi, > It seems to me that most of the differences between CUDA and OpenCL come > from the respective APIs, but that the kernel code is very similar in > the two cases. > Do you guys think it's possible to easily translate the generated kernel > from OpenCL to CUDA, by just doing one-to-one repla

Re: [ViennaCL-devel] Scheduler progresses

2013-08-16 Thread Karl Rupp
Hi Toby, >> the scheduler for kernel fusion makes good progress. Toby, you should be >> able to use all of the fundamental dense linear algebra operations now. >> There should only be two blocks of functionality missing: >>- Sparse matrices (i.e. matrix-vector products) >>- In some cas

Re: [ViennaCL-devel] Kernel Generator's problem on Kepler

2013-08-20 Thread Karl Rupp
Hi Philippe, rather than having too much speculation here, what about adding a quick OpenCL-to-CUDA translator (just string substitution, you don't need more) to the generator? Put the best kernels for Fermi and Kepler into a compilation unit and then hopefully Denis or Evan will give it a try?
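
A string-substitution translator of the kind suggested above might look as follows. This is a rough sketch, not the converter that was eventually written: the function names are hypothetical, only one-dimensional index functions and a handful of keywords are covered, and constructs such as `__local` kernel arguments have no one-to-one CUDA counterpart and are not handled.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of `from` in `text` by `to`.
void replace_all(std::string& text, const std::string& from, const std::string& to)
{
  if (from.empty())
    return;
  std::size_t pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos)
  {
    text.replace(pos, from.size(), to);
    pos += to.size();  // continue after the inserted text
  }
}

// Hypothetical helper: translate a small subset of OpenCL C to CUDA C.
std::string opencl_to_cuda(std::string source)
{
  // Order matters: the address-space qualifier "__global " is stripped
  // before "__kernel" is rewritten to "__global__".
  const std::vector<std::pair<std::string, std::string> > rules = {
    {"__global ", ""},
    {"__local ", "__shared__ "},
    {"__kernel", "extern \"C\" __global__"},
    {"barrier(CLK_LOCAL_MEM_FENCE)", "__syncthreads()"},
    {"get_global_id(0)", "(blockIdx.x * blockDim.x + threadIdx.x)"},
    {"get_local_id(0)", "threadIdx.x"},
    {"get_group_id(0)", "blockIdx.x"},
    {"get_local_size(0)", "blockDim.x"}
  };
  for (const auto& rule : rules)
    replace_all(source, rule.first, rule.second);
  return source;
}
```

Since the generator emits kernels from a fixed set of templates, the set of keywords that need a rule stays small, which is what makes plain substitution feasible here.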

Re: [ViennaCL-devel] Kernel Generator's problem on Kepler

2013-08-20 Thread Karl Rupp
Hey, > Sorry for too much speculation, it seems like the problem comes from the > generator, not the OpenCL SDK. > Seems like I'm way too suspicious, sorry :D Ok, apparently we need an even larger search space then... > The good news is that the converter being done, I can have a better > workf

[ViennaCL-devel] Autotuner thoughts

2013-08-22 Thread Karl Rupp
Hi guys, this is presumably mostly for Philippe: I noticed that the autotuning targets in examples/autotuner use a couple of static parameters. It would be nice to have them dynamically set via named command line parameters, so that it's easier to run them in a batched manner via some shell sc

Re: [ViennaCL-devel] Autotuner thoughts

2013-08-22 Thread Karl Rupp
n install Boost on a Parallella board right now. It's not that hard to match the command line strings directly, is it? ;-) Best regards, Karli > > 2013/8/22 Karl Rupp > > Hi guys, > > this is presumably mostly for Philippe: >
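
Matching named command line parameters directly, without Boost, needs only a few lines of standard C++. The sketch below assumes a `--name=value` convention; that convention, the helper name `parse_named_args`, and the option names in the usage note are hypothetical and not taken from the autotuner sources.

```cpp
#include <map>
#include <string>

// Hypothetical helper: collect arguments of the form --name=value (or --name)
// into a map from name to value. Arguments without a leading "--" are skipped.
std::map<std::string, std::string> parse_named_args(int argc, const char* const* argv)
{
  std::map<std::string, std::string> options;
  for (int i = 1; i < argc; ++i)  // argv[0] is the program name
  {
    const std::string arg(argv[i]);
    if (arg.compare(0, 2, "--") != 0)
      continue;

    const std::string::size_type eq = arg.find('=');
    if (eq == std::string::npos)
      options[arg.substr(2)] = "";                      // flag without a value
    else
      options[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
  }
  return options;
}
```

A shell script could then run the tuner in a batch, for example with `--layout=TN --verbose`, and the program would look up `options["layout"]` instead of relying on compiled-in constants.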

Re: [ViennaCL-devel] Segfault with composite subexpression of elementwise operation

2013-08-22 Thread Karl Rupp
Hey, > Also, I'm not sure of the 'correct' way to construct a unary type > node. For instance, if you try and print a unary type node with only lhs > set (as would make sense), then you get into an infinite loop, because > the operator<< and print_node functions in scheduler/io.hpp assume > you'v

Re: [ViennaCL-devel] Segfault with composite subexpression of elementwise operation

2013-08-22 Thread Karl Rupp
Hey, > So I've implemented a PyViennaCL translation of the blas3_prod test, but > I'm having difficulties with OPERATION_UNARY_FABS_TYPE. If I construct > an OPERATION_UNARY_FABS_TYPE node with just a DENSE_ROW_MATRIX as a > leaf, then everything passes as expected. But if I try and execute > OPE

Re: [ViennaCL-devel] Segfault with composite subexpression of elementwise operation

2013-08-22 Thread Karl Rupp
Hey, >> Yeah. I thought that, since I'd fixed the segfault and the expression >> tree worked for other expressions, the problem didn't lie in the >> wrapper. But I began to doubt that judgement, so I wrote a C++ test >> program[1] to do the equivalent construction manually. And that >> worked.. S

Re: [ViennaCL-devel] Autotuner thoughts

2013-08-24 Thread Karl Rupp
Hey, > This is done :) > You can check on ./examples/autotuner/gemm_autotuning --help and tell me > if it is alright ! It is not heavily tested (I'm on my laptop) but the > basic tests suggest that it works. Cool, this looks great already. What about ordering the options such that the required p

Re: [ViennaCL-devel] Status update

2013-08-26 Thread Karl Rupp
Hi guys, the ViennaCL API does not allow the mix of row- and column-major dense matrices for general operations yet, mixing is only allowed for matrix-matrix products. You should run into some assertions, though, so I wonder why this is not the case. Did you compile with NDEBUG? Best regards,

Re: [ViennaCL-devel] Exceptions vs assertions

2013-08-26 Thread Karl Rupp
Hi Toby, > Can I convert all / most of the assertions in the scheduler code to > exceptions? It would help me (so that Python doesn't just abort if > something goes wrong), and I think it would help anyone else who happens > to compile with NDEBUG defined, and who then does something undefined.

Re: [ViennaCL-devel] Exceptions vs assertions -- and a bug..

2013-08-27 Thread Karl Rupp
Hi Toby, > Having fixed up a layout consistency check, I seem to have hit a real > scheduler bug. I'm currently reading the code myself with a view to a > fix, but since you wrote it, I suspect you'd be quicker. I've written a > C++ program that produces a seg-fault: http://paste.ubuntu.com/603058

Re: [ViennaCL-devel] Exceptions vs assertions -- and a bug..

2013-08-27 Thread Karl Rupp
Hi Toby, > Having fixed up a layout consistency check, I seem to have hit a real > scheduler bug. I'm currently reading the code myself with a view to a > fix, but since you wrote it, I suspect you'd be quicker. I've written a > C++ program that produces a seg-fault: http://paste.ubuntu.com/60305

Re: [ViennaCL-devel] Vendor ID portability

2013-08-28 Thread Karl Rupp
Hi, > I have just realized that the Vendor ID was associated with the SDK > provider, not the Hardware Vendor. That is, Apple's implementation of > NVidia is slightly different from the original one : > > Apple : > Vendor name : NVIDIA > Vendor ID : 16918016 > > NVidia : > Vendor name : NVIDIA Co

[ViennaCL-devel] Release approaching

2013-08-29 Thread Karl Rupp
Hi guys, the OpenCL kernels are now integrated into the main source tree, the auxiliary/ folder is no longer used. This makes the repository easier to handle, as `make -j4` works out of the box again and no extra packaging step is necessary to get a fully self-contained source tree. There are

Re: [ViennaCL-devel] Release approaching

2013-08-29 Thread Karl Rupp
Hey, > There are a few more relatively minor things left for the release, I > hope to fix most of them today and update the documentation. Philippe, > please also add more Doxygen documentation to the generator and focus on > testing the generator output for corner cases. Presumab

Re: [ViennaCL-devel] Auto-Tuner, GEMM, GEMV... : Integrating RaijinCL into the generator

2013-08-30 Thread Karl Rupp
Hi Philippe, > About 6 months ago I had heard of a library that also performed > autotuning (http://raijincl.org), but that offered the same performance > as ours back then. > Since then, the performance has *greatly* improved, largely > outperforming our autotuner : > - Over 3TFLOP/s on HD7970 >

Re: [ViennaCL-devel] Auto-Tuner, GEMM, GEMV... : Integrating RaijinCL into the generator

2013-08-30 Thread Karl Rupp
Hi Philippe, > Since our generator is skeleton-based anyway, what about having a look > at the best performing kernels in RaijinCL and then extending the > current generator accordingly such that these kernels are covered as > well? I consider this to be *far* less painful than t

Re: [ViennaCL-devel] Another apparent scheduler bug?

2013-09-03 Thread Karl Rupp
Hi Toby, > Despite a fairly hectic last couple of days (girlfriend finished her > PhD, then had to move house...), congratulations! > I've been doing some PyViennaCL bug > fixing in order not to fall behind, and I think I've hit another > scheduler bug. I've written a test case at [1], and a ba

[ViennaCL-devel] Release status

2013-09-04 Thread Karl Rupp
Hi guys, FYI: Due to the many new features we've added since the last release, I'm still polishing the code. Rather than pushing out a few incomplete pieces, I decided to invest a few more days to round things up, so I'll postpone (again) the celebrations to next week... Best regards, Karli -

Re: [ViennaCL-devel] Release status

2013-09-04 Thread Karl Rupp
Hey, > On my side, I've been working a bit more on my unconstrained nonlinear > minimization library, that I plan to keep as a separate package, because > I've made it pretty generic (can work theoretically with any linear > algebra backend - viennacl, eigen, armadillo ...) using template magic.

Re: [ViennaCL-devel] Handling extremely non-square matrices. Dispatching?

2013-09-04 Thread Karl Rupp
Hey, > For my research I will have to deal with extremely nonsquare matrices. > That is potentially 32*10 000, 4*80 000, or even 128*1 000 000. > This often occurs in statistics, where one has a few numbers of > variables in the first dimensions, and a significant number of samples > in the other

Re: [ViennaCL-devel] Handling extremely non-square matrices. Dispatching?

2013-09-07 Thread Karl Rupp
Hi, > I can't think of any such case where one would want to have control over > this. This would require knowledge of our implementations to make > appropriate choices anyway. In order to have a reasonable decision > process, we need to come up with some heuristics... > My first idea would be to

Re: [ViennaCL-devel] Release status

2013-09-07 Thread Karl Rupp
Hey hey hey, > While integrating it completely is trivial. I have no real name for it > right now though. Let's assume it is named fminpp, then : > > typedef fminpp::optimization_options optimization_options; > typedef fminpp::directions::cg cg; > //other aliases... > > template > viennacl::vecto

Re: [ViennaCL-devel] PyViennaCL status

2013-09-18 Thread Karl Rupp
Hi Toby, sorry for the late reply. I'm finally back to Austria ;-) > Just a quick update. I'm in the process of writing decent documentation > for my classes and functions. There are quite a few: sloccount tells me > that, including tests, I've got about 5500 source lines of code (not > includi

Re: [ViennaCL-devel] Enabling integer types in PyViennaCL

2013-09-18 Thread Karl Rupp
Hi Toby, > I get a lot of errors like the below when I enable T = char (or other > integer numeric types) in PyViennaCL. The error always comes down to the > "request for member 'handle' in 'val'". Previously, it was in the > context of arithmetical functions like addition or element_pow, but here
