Re: [ViennaCL-devel] More weird problems

2014-11-08 Thread Karl Rupp
Hi Toby,

  Thanks! The numerical errors with element-wise operations such as tan()
 or sin() look okay, that's just numerical noise. The following test
 cases deserve a closer look, though:

 test_matrix_matrix_trans_isub_C_float32
 test_matrix_matrix_slice_trans_isub_C_float32
 test_matrix_matrix_trans_isub_F_float32
 test_matrix_range_matrix_slice_trans_iadd_C_float32
 test_matrix_slice_matrix_slice_trans_iadd_C_float32
 test_matrix_slice_matrix_trans_iadd_F_float32
 test_matrix_matrix_range_trans_isub_C_float32
 test_matrix_slice_matrix_trans_iadd_C_float32
 test_matrix_matrix_slice_trans_isub_F_float32
 test_matrix_matrix_range_trans_iadd_C_float32
 test_matrix_range_matrix_trans_isub_C_float32
 test_matrix_slice_matrix_slice_trans_isub_C_float32
 test_matrix_slice_matrix_trans_isub_F_float32
 test_matrix_matrix_range_trans_iadd_F_float32
 test_matrix_slice_matrix_trans_isub_C_float32
 test_matrix_matrix_slice_trans_iadd_C_float32
 test_matrix_range_matrix_range_trans_isub_C_float32
 test_matrix_range_matrix_trans_isub_F_float32
 test_matrix_range_matrix_slice_trans_isub_C_float32
 test_matrix_slice_matrix_range_trans_iadd_F_float32
 test_matrix_range_matrix_range_trans_iadd_C_float32
 test_matrix_slice_matrix_range_trans_isub_F_float32
 test_matrix_range_matrix_trans_iadd_C_float32
 test_matrix_slice_matrix_slice_trans_iadd_F_float32
 test_matrix_range_matrix_trans_iadd_F_float32
 test_matrix_slice_matrix_range_trans_iadd_C_float32
 test_matrix_matrix_trans_iadd_F_float32
 test_matrix_matrix_trans_iadd_C_float32
 test_matrix_slice_matrix_slice_trans_isub_F_float32
 test_matrix_range_matrix_slice_trans_isub_F_float32
 test_matrix_matrix_range_trans_isub_F_float32
 test_matrix_range_matrix_range_trans_isub_F_float32
 test_matrix_slice_matrix_range_trans_isub_C_float32
 test_matrix_range_matrix_range_trans_iadd_F_float32

 Apparently they all belong to the same family of operations. Can you
 please help me with the deciphering? Which operations correspond to the
 test cases above? (I could guess, but I may be wrong...) iadd and isub
 refer to += and -=?

 Yep. So 'C' and 'F' mean C (row-major) layout or Fortran (col-major) layout.
 Your guess was right about 'iadd' and 'isub': these are simply A += B
 and A -= B. The values of A and B are given by the _matrix_ bits: the
 first one describes A, and the second describes B. So
 test_matrix_slice_matrix_range_trans_isub_C_float32 means

matrix_slice -= matrix_range.T

 where both are C-layout and single precision.
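
The naming scheme described above can be decoded mechanically. What follows is a minimal sketch, not code from PyViennaCL: the regular expression is inferred only from the test names listed in this thread, and `describe` is a hypothetical helper name.

```python
import re

# Pattern assumed from the test names quoted in this thread; the real
# PyViennaCL test generator may use other operand kinds or suffixes.
_OPERAND = r"matrix(?:_range|_slice)?"
_NAME = re.compile(
    rf"^test_(?P<a>{_OPERAND})_(?P<b>{_OPERAND})_trans_"
    r"(?P<op>iadd|isub)_(?P<layout>[CF])_(?P<dtype>float32|float64)$"
)

_OPS = {"iadd": "+=", "isub": "-="}
_LAYOUTS = {"C": "row-major", "F": "column-major"}


def describe(test_name):
    """Translate a test name into the operation it exercises."""
    m = _NAME.match(test_name)
    if m is None:
        raise ValueError(f"unrecognised test name: {test_name!r}")
    # The first operand is A, the second is B, and B is the transposed one.
    return (f"{m['a']} {_OPS[m['op']]} {m['b']}.T "
            f"({_LAYOUTS[m['layout']]}, {m['dtype']})")


print(describe("test_matrix_slice_matrix_range_trans_isub_C_float32"))
```

For the example name this prints `matrix_slice -= matrix_range.T (row-major, float32)`, which matches the reading given above.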

okay, these should be working now, I've pushed a fix. :-)
Still need to look into the GMRES issue...

Best regards,
Karli


--
___
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel


Re: [ViennaCL-devel] More weird problems

2014-11-08 Thread Toby St Clere Smithe
Hi Karli,

Karl Rupp r...@iue.tuwien.ac.at writes:
 Yep. So 'C' and 'F' mean C (row-major) layout or Fortran (col-major) layout.
 Your guess was right about 'iadd' and 'isub': these are simply A += B
 and A -= B. The values of A and B are given by the _matrix_ bits: the
 first one describes A, and the second describes B. So
 test_matrix_slice_matrix_range_trans_isub_C_float32 means

matrix_slice -= matrix_range.T

 where both are C-layout and single precision.

 okay, these should be working now, I've pushed a fix. :-)

Yep, all matrix_operations tests now pass! :)

 Still need to look into the GMRES issue...

OK.

Cheers,

T



-- 
Toby St Clere Smithe
http://tsmithe.net




Re: [ViennaCL-devel] More weird problems

2014-11-08 Thread Karl Rupp
Hi Toby,

 Yep. So 'C' and 'F' mean C (row-major) layout or Fortran (col-major) layout.
 Your guess was right about 'iadd' and 'isub': these are simply A += B
 and A -= B. The values of A and B are given by the _matrix_ bits: the
 first one describes A, and the second describes B. So
 test_matrix_slice_matrix_range_trans_isub_C_float32 means

 matrix_slice -= matrix_range.T

 where both are C-layout and single precision.

 okay, these should be working now, I've pushed a fix. :-)

 Yep, all matrix_operations tests now pass! :)

Excellent! :-)


 Still need to look into the GMRES issue...

 OK.

Just pushed a fix. Apparently I got punished for taking a mild shortcut. 
Thanks for spotting this! :-)

Best regards,
Karli




Re: [ViennaCL-devel] More weird problems

2014-11-08 Thread Karl Rupp
Hi,

 Still need to look into the GMRES issue...

 OK.

fixed :-)

Best regards,
Karli



Re: [ViennaCL-devel] More weird problems

2014-11-07 Thread Toby St Clere Smithe
Hi Karli,

Karl Rupp r...@iue.tuwien.ac.at writes:
 So it looks like the fix was not that complicated:
 https://github.com/viennacl/viennacl-dev/commit/b5909a765eef584bf1300f3d6696b095b4a7e6a2
 (most of the new lines are additional testing code...)

Neat!

 In addition to matrix operations such as
   A = B - trans(C)
 this should now also support things like
   x = element_sin(y) + element_cos(z);
 for vectors. :-)

 Please let me know ASAP whether this resolves the problem. I'll wait for 
 another round of nightly tests and release tomorrow as soon as I get 
 positive feedback (and if I get negative feedback I'll work on the fix 
 ;-) ).

Great, yes, that seems (at least almost entirely) to have done the
trick. There are still a few failures (on the order of a couple of per
cent), but I'm not sure I know the core well enough to spot if they're a
bug there, an expected numerical difference, trying to do something
unsupported, or a bug in PyViennaCL. Could you cast your eye over
these?[1]

[1] http://paste.ubuntu.com/8873927/

I also get some pretty huge error values using GMRES without a
preconditioner on any double-precision sparse matrix (compressed,
coordinate, ell, hybrid). Weirdly, the tests pass with GMRES on the same
systems, when any of the (supported) preconditioners is used.[2] But I
would have thought these would have been caught by the ViennaCL test
suite, so I'm suspicious. Nonetheless, these seem to occur both on
nVidia and Beignet.

[2] http://paste.ubuntu.com/8874078/
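
One way to judge such a report independently of any solver is to compute the relative residual ||b - Ax|| / ||b|| of the returned solution directly. The sketch below is plain Python, not ViennaCL or PyViennaCL API: the CSR arrays, the small 3x3 system, and the helper names `csr_matvec` and `relative_residual` are all assumptions made for illustration.

```python
import math


def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix (indptr, indices, data) by the vector x."""
    y = []
    for row in range(len(indptr) - 1):
        start, end = indptr[row], indptr[row + 1]
        y.append(sum(data[k] * x[indices[k]] for k in range(start, end)))
    return y


def relative_residual(indptr, indices, data, x, b):
    """Return ||b - A x||_2 / ||b||_2 for a CSR matrix A."""
    ax = csr_matvec(indptr, indices, data, x)
    num = math.sqrt(sum((bi - axi) ** 2 for bi, axi in zip(b, ax)))
    den = math.sqrt(sum(bi ** 2 for bi in b))
    return num / den


# Illustrative system: A = [[4, 1, 0], [0, 3, 0], [2, 0, 5]], x = [1, 2, 3].
indptr = [0, 2, 3, 5]
indices = [0, 1, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]
x_exact = [1.0, 2.0, 3.0]
b = csr_matvec(indptr, indices, data, x_exact)

# The exact solution gives a zero residual; a perturbed one does not.
print(relative_residual(indptr, indices, data, x_exact, b))
print(relative_residual(indptr, indices, data, [1.0, 2.0, 4.0], b))
```

A residual far above the solver's requested tolerance points at the solver rather than at the comparison against a reference solution, which can be sensitive for ill-conditioned systems.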

The rest of my tests pass just fine.

Best,

Toby


-- 
Toby St Clere Smithe
http://tsmithe.net




[ViennaCL-devel] More weird problems

2014-11-05 Thread Toby St Clere Smithe
OK, so 2 more weird problems. The rest of my tests pass (but I still
don't quite have complete test coverage).


1. Cache issues continue:

On both nVidia and Beignet, I get some error and a segfault like this on
sparse gemv:

Build Status = -2 ( Err = -11 )

The segfault happens when calling (in ocl/context.hpp):

443   err = clGetProgramBuildInfo(temp, devices_[0].id(), CL_PROGRAM_BUILD_LOG, 0, NULL, &ret_val_size);

Now, the cache prefix seems correct (at least inspecting it with 'info
locals' in gdb doesn't give a cross-device collision), and this happens
even with a clean cache.

source_text = 0x1c8b828 __kernel void vec_mul( \n  __global const unsigned int 
* coords, \n  __global const float * elements, \n  __global const float * x, \n 
 uint4 layout_x, \n  __global float * result, \n  uint4 layout_result, ..

But the preceding call (before the failed buildProgram) works:

427   temp = clCreateProgramWithSource(h_.get(), 1, (const char **)&source_text, &source_size, &err);

So I'm not sure what to make of this..



2. I get a lot of failures from test_matrix_operations.py, too, which
did not happen before. One time, I also got a segfault, but because I
was outside of a debugger, I did not catch it; I cannot yet reliably
reproduce it. The failures are all of the form

Assign(Matrix:float64, ElementFabs(Sub(Matrix:float64, 
Sub(Mul(Trans(Matrix:float64)=Matrix:float64, Scalar:float64)=Matrix:float64, 
Trans(Matrix:float64)=Matrix:float64)=Matrix:float64)=Matrix:float64)=Matrix:float64)

and they produce this exception:

ViennaCL: Internal error: The scheduler encountered a problem with the 
operation provided: Cannot deal with unary operations on vectors

I think the defining feature is the ElementFabs of some expression
involving a Trans, and I think these are now occurring because I
disabled an old bit of code: previously, I dispatched the matrix
transposition before the rest of the statement, because expressions
involving trans weren't supported by the scheduler. But then Philippe
pointed out that this made autotuning such expressions impossible, so I
disabled that dispatch. It seems there are some bits where this remains
unsupported, so I'll have a think about what to do.


Cheers,

Toby




Re: [ViennaCL-devel] More weird problems

2014-11-05 Thread Toby St Clere Smithe
Toby St Clere Smithe m...@tsmithe.net writes:
 The segfault happens when calling (in ocl/context.hpp):

 443   err = clGetProgramBuildInfo(temp, devices_[0].id(), CL_PROGRAM_BUILD_LOG, 0, NULL, &ret_val_size);

Oh, and the segfault happens in nVidia's OpenCL when it calls strlen
somewhere..

Oh, wait.. Apparently my nvidia module was still loaded, so I take back
the beignet comment (the system defaults to nvidia if it's available;
what I thought was beignet was therefore not..).

And reloading the nvidia module seems to solve this for nvidia, too.

Very strange.

But the matrix_operations problems still remain!





Re: [ViennaCL-devel] More weird problems

2014-11-05 Thread Philippe Tillet
I remember us already having a problem with strlen on the cache with your
NVidia SDK, which disappeared when you rebooted. Didn't we?

2014-11-05 16:25 GMT-05:00 Toby St Clere Smithe m...@tsmithe.net:

 Toby St Clere Smithe m...@tsmithe.net writes:
  The segfault happens when calling (in ocl/context.hpp):
 
  443   err = clGetProgramBuildInfo(temp, devices_[0].id(), CL_PROGRAM_BUILD_LOG, 0, NULL, &ret_val_size);

 Oh, and the segfault happens in nVidia's OpenCL when it calls strlen
 somewhere..

 Oh, wait.. Apparently my nvidia module was still loaded, so I take back
 the beignet comment (the system defaults to nvidia if it's available;
 what I thought was beignet was therefore not..).

 And reloading the nvidia module seems to solve this for nvidia, too.

 Very strange.

 But the matrix_operations problems still remain!






Re: [ViennaCL-devel] More weird problems

2014-11-05 Thread Karl Rupp
Hi Toby,

  OK, so 2 more weird problems. The rest of my tests pass (but I still
 don't quite have complete test coverage).

I fixed a couple of minor problems today, which should finally result in 
practically full success of the nightly test suite. One of them might be 
the cause of your first item, which showed up as build errors on NVIDIA 
GPUs.



 1. Cache issues continue:

 On both nVidia and Beignet, I get some error and a segfault like this on
 sparse gemv:

 Build Status = -2 ( Err = -11 )

 The segfault happens when calling (in ocl/context.hpp):

 443   err = clGetProgramBuildInfo(temp, devices_[0].id(), CL_PROGRAM_BUILD_LOG, 0, NULL, &ret_val_size);

 Now, the cache prefix seems correct (at least inspecting it with 'info
 locals' in gdb doesn't give a cross-device collision), and this happens
 even with a clean cache.

 source_text = 0x1c8b828 __kernel void vec_mul( \n  __global const unsigned 
 int * coords, \n  __global const float * elements, \n  __global const float * 
 x, \n  uint4 layout_x, \n  __global float * result, \n  uint4 layout_result, 
 ..

 But the preceding call (before the failed buildProgram) works:

 427   temp = clCreateProgramWithSource(h_.get(), 1, (const char **)&source_text, &source_size, &err);

 So I'm not sure what to make of this..

ViennaCL used to dump the build log and the kernel sources. You should 
see the same?


 2. I get a lot of failures from test_matrix_operations.py, too, which
 did not happen before. One time, I also got a segfault, but because I
 was outside of a debugger, I did not catch it; I cannot yet reliably
 reproduce it. The failures are all of the form

 Assign(Matrix:float64, ElementFabs(Sub(Matrix:float64, 
 Sub(Mul(Trans(Matrix:float64)=Matrix:float64, 
 Scalar:float64)=Matrix:float64, 
 Trans(Matrix:float64)=Matrix:float64)=Matrix:float64)=Matrix:float64)=Matrix:float64)

 and they produce this exception:

 ViennaCL: Internal error: The scheduler encountered a problem with the 
 operation provided: Cannot deal with unary operations on vectors

 I think the defining feature is the ElementFabs of some expression
 involving a Trans, and I think these are now occurring because I
 disabled an old bit of code: previously, I dispatched the matrix
 transposition before the rest of the statement, because expressions
 involving trans weren't supported by the scheduler. But then Philippe
 pointed out that this made autotuning such expressions impossible, so I
 disabled that dispatch. It seems there are some bits where this remains
 unsupported, so I'll have a think about what to do.

The core does not support composite expressions involving trans(), e.g.
  A + trans(A)
What is currently supported is:
  A = prod(trans(B), C);
  A = prod(B, trans(C));
  A = prod(trans(B), trans(C));
  A = solve(trans(B), C, (unit_)upper_tag or (unit_)lower_tag);
  A = solve(B, trans(C), (unit_)upper_tag or (unit_)lower_tag);
  A = solve(trans(B), trans(C), (unit_)upper_tag or (unit_)lower_tag);
  y = prod(trans(A), x);
  A = trans(B);

I could fix up the scheduler such that it creates a temporary when 
encountering trans() in any other operation, though.
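
The temporary-based approach mentioned above can be sketched on a toy expression tree. This is not the ViennaCL scheduler: `Leaf`, `Node`, `hoist_trans`, and the `TRANS_AWARE` set are hypothetical names, and the rule "trans() is fine directly under prod/solve or as the whole right-hand side" is a simplification of the supported list above.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    name: str


@dataclass(frozen=True)
class Node:
    op: str                      # e.g. "add", "sub", "prod", "trans", "fabs"
    args: Tuple["Expr", ...]


Expr = Union[Leaf, Node]

# Parents assumed to consume trans() directly (simplified from the list above).
TRANS_AWARE = {"prod", "solve"}


def hoist_trans(lhs: str, rhs: Expr) -> List[Tuple[str, Expr]]:
    """Split `lhs = rhs` into statements in which trans() only appears in a
    supported position; every other trans() is evaluated into a temporary."""
    statements: List[Tuple[str, Expr]] = []

    def visit(expr: Expr, parent_op: str) -> Expr:
        if isinstance(expr, Leaf):
            return expr
        args = tuple(visit(arg, expr.op) for arg in expr.args)
        rebuilt = Node(expr.op, args)
        # "assign" marks the top level, i.e. the supported `A = trans(B)`.
        if expr.op == "trans" and parent_op not in TRANS_AWARE | {"assign"}:
            tmp = f"tmp{len(statements)}"
            statements.append((tmp, rebuilt))
            return Leaf(tmp)
        return rebuilt

    statements.append((lhs, visit(rhs, "assign")))
    return statements


# A = B - trans(C) becomes: tmp0 = trans(C); A = B - tmp0
for target, expr in hoist_trans(
        "A", Node("sub", (Leaf("B"), Node("trans", (Leaf("C"),))))):
    print(target, "=", expr)
```

The trade-off is the one raised earlier in the thread: each temporary is a separate statement, so the transposition is no longer visible to whatever tunes the fused expression.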

Best regards,
Karli
