Re: [theano-users] Re: Segmentation fault when setting lib.amdlibm=True and using trigonometric operations. How to debug?

2016-10-15 Thread Pascal Lamblin
Can you try running python from gdb? This may give some information.
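
A minimal sketch of one way to do that, assuming gdb is installed and the failing code lives in a script; the file name "segfault_repro.py" is only a placeholder. It builds a gdb command line that starts Python on the script and prints a C-level backtrace once the process stops:

```python
import sys


def gdb_command(script, python=sys.executable):
    """Build a gdb command line that runs `script` and prints a backtrace."""
    return [
        "gdb",
        "-batch",       # leave gdb once the commands below have run
        "-ex", "run",   # start the program right away
        "-ex", "bt",    # when it stops (e.g. on SIGSEGV), print the backtrace
        "--args", python, script,
    ]


# "segfault_repro.py" is a hypothetical name for the script that crashes.
cmd = gdb_command("segfault_repro.py")
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.call` (or pasting the printed line into a shell) runs the script under gdb. The frames at the top of the backtrace should show whether the crash is inside amdlibm, BLAS, or the code Theano compiled.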

Re: [theano-users] Re: Segmentation fault when setting lib.amdlibm=True and using trigonometric operations. How to debug?

2016-10-15 Thread Michael Harradon
I'm hitting this problem myself, as well.

Ubuntu 14.04
gcc 4.8.4
amdlibm-3.1
libopenblas-dev 0.2.8

Would appreciate any suggestions - running with amdlibm off for the moment.

Best,
Michael
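
For reference, a small sketch of how the flags discussed in the quoted messages below can be set from Python instead of the shell. The helper name `theano_flags` is made up for illustration; the option names are the ones that appear in this thread. Whether excluding `inplace_opt` avoids the crash on other setups is not confirmed here, it is only what was reported earlier in the thread.

```python
import os


def theano_flags(options):
    """Join a dict of Theano config options into a THEANO_FLAGS string."""
    return ",".join(
        "%s=%s" % (key, value) for key, value in sorted(options.items())
    )


# Keep amdlibm on, but skip the inplace optimization that was reported
# to trigger the segfault. Use {"lib.amdlibm": False} to turn amdlibm off.
workaround = {
    "lib.amdlibm": True,
    "optimizer_excluding": "inplace_opt",
}

# THEANO_FLAGS is read when theano is imported, so set it before the import.
os.environ["THEANO_FLAGS"] = theano_flags(workaround)
print(os.environ["THEANO_FLAGS"])
```

The environment variable has to be assigned before `import theano` runs in the same process; changing it afterwards has no effect on the already loaded configuration.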

On Friday, March 11, 2016 at 12:49:04 PM UTC-5, Pascal Lamblin wrote:
>
> On Fri, Mar 11, 2016, Juan Camilo Gamboa Higuera wrote: 
> > I tried changing the architecture, but this still produces the segmentation
> > fault. How can I see all the flags that are passed to the gcc command from
> > theano?
>
> You can change the log level of the following call [1], or just add 
> "print(cmd)" there. 
>
> [1] 
> https://github.com/Theano/Theano/blob/master/theano/gof/cmodule.py#L2165 
>
> > 
> > -- Juan 
> > 
> > On Thursday, March 10, 2016 at 1:42:59 PM UTC-5, Pascal Lamblin wrote: 
> > > 
> > > Can you try explicitly passing a target architecture to g++ in Theano,
> > > for instance THEANO_FLAGS=gcc.cxxflags='-march=core2'?
> > > 
> > > By default, Theano tries to emulate '-march=native', which might cause 
> > > trouble. 
> > > 
> > > On Thu, Mar 10, 2016, Juan Camilo Gamboa Higuera wrote: 
> > > > I tried another machine with the following software:
> > > > Ubuntu 14.04.4, 4.2 kernel 
> > > > gcc 4.8.4 
> > > > libopenblas-dev 0.2.8 
> > > > amd libm 3.1 
> > > > latest theano version from the git repository 
> > > > 
> > > > and I again got a segmentation fault. The main difference between the
> > > > machines is the processor (the first a 6th gen i7, the second a Xeon, the
> > > > third a 5th gen i7).
> > > > Any other ideas of what might be causing the segmentation fault? 
> > > > 
> > > > -- Juan Camilo 
> > > > 
> > > > On Thursday, March 10, 2016 at 9:36:38 AM UTC-5, Juan Camilo Gamboa Higuera
> > > > wrote:
> > > > > 
> > > > > Hi Pascal, 
> > > > > 
> > > > > Thanks for your time! 
> > > > > 
> > > > > I'm using the latest version of Theano from the git repository, and AMD
> > > > > libm version 3.1. The md5 sums are:
> > > > > 
> > > > > fd9f86e040a4d5f26013fc5f60182680 libamdlibm.so 
> > > > > 701474c84f1e3dff6bbb1a11d07eb215 libamdlibm.a 
> > > > > 
> > > > > I tried disabling blas (by setting
> > > > > *THEANO_FLAGS="lib.amdlibm=True,blas.ldflags=''"*) but I still get the
> > > > > segmentation fault.
> > > > > 
> > > > > I've also tried it on another machine, and I get no segmentation fault.
> > > > > 
> > > > > The machine where the segfault happens: 
> > > > > Ubuntu 15.10, 4.2 kernel 
> > > > > gcc 4.9.3 
> > > > > libopenblas-dev 0.2.14 
> > > > > 
> > > > > The machine where it works:
> > > > > Ubuntu 14.04.4, 4.2 kernel 
> > > > > gcc 4.8.4 
> > > > > libopenblas-dev 0.2.8 
> > > > > 
> > > > > I will try installing the older versions of openblas and gcc on my machine
> > > > > and see if it makes a difference.
> > > > > 
> > > > > Do you have any other suggestions? 
> > > > > 
> > > > > -- Juan Camilo 
> > > > > 
> > > > > On Wednesday, March 9, 2016 at 7:52:39 PM UTC-5, Pascal Lamblin wrote:
> > > > >> 
> > > > >> So I just tried your case, on Linux, with a version of amdlibm that is
> > > > >> possibly quite old but that I'm unable to identify (md5sum below).
> > > > >> 
> > > > >> I was unfortunately unable to reproduce the segfault. 
> > > > >> 
> > > > >> It may be possible that for some reason, the output of the dot product
> > > > >> is not aligned, and that amdlibm does not support that. In that case,
> > > > >> maybe disabling BLAS could help.
> > > > >> 
> > > > >> 
> > > > >> 9211766f5cef4ce1a35dc44701dcac6a  libamdlibm.a 
> > > > >> 3ce0e1c4c7afbfe514639fd8482ed220  libamdlibm.so 
> > > > >> 
> > > > >> 
> > > > >> On Wed, Mar 09, 2016, Juan Camilo Gamboa Higuera wrote: 
> > > > >> > Another development! 
> > > > >> > 
> > > > >> > If I disable the inplace optimization (optimizer_excluding=inplace_opt),
> > > > >> > the segmentation fault disappears. This narrows the problem down to
> > > > >> > situations where
> > > > >> >
> > > > >> > *you take the sine or cosine of a dot product between two matrices and
> > > > >> > inplace optimization for elementwise ops is enabled.*
> > > > >> >
> > > > >> > Cheers!
> > > > >> > 
> > > > >> > -- Juan Camilo 
> > > > >> > 
> > > > >> > On Wednesday, March 9, 2016 at 10:23:22 AM UTC-5, Juan Camilo Gamboa
> > > > >> > Higuera wrote:
> > > > >> > > 
> > > > >> > > Hi all, 
> > > > >> > > 
> > > > >> > > *Here is a simpler version of the code that causes the segmentation fault
> > > > >> > > when using amdlibm. It looks like it happens when you take the sine or
> > > > >> > > cosine of a dot product:*
> > > > >> > > 
> > > > >> > > import theano 
> > > > >> > > import theano.tensor as T 
> > > > >> > > import numpy as np 
> > > > >> > > 
> > > > >> > > np.set_printoptions(linewidth=200) 
> > > > >> > > n_samples = 500 
> > > > >> > > n_basis = 100 
> > > > >> > > idims = 4 
> >