Re: [theano-users] Re: Segmentation fault when setting lib.amdlibm=True and using trigonometric operations. How to debug?
Can you try running python from gdb? This may give some information.

On Sat, Oct 15, 2016, Michael Harradon wrote:
> I'm hitting this problem myself, as well.
>
> Ubuntu 14.04
> gcc 4.8.4
> amdlibm-3.1
> libopenblas-dev 0.2.8
>
> Would appreciate any suggestions - running with amdlibm off for the moment.
>
> Best,
> Michael
>
> [...]
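
The suggestion to run Python under gdb can also be scripted. The sketch below only builds a batch-mode gdb command line; the helper name `gdb_backtrace_cmd` and the script name `repro.py` are hypothetical, and it assumes gdb is installed and on the PATH.

```python
import sys


def gdb_backtrace_cmd(script, python=sys.executable):
    """Build a gdb command line that runs `script` and prints a backtrace.

    -batch makes gdb exit once its commands finish; the -ex commands run in
    order, so `bt` executes after the program stops (for example on SIGSEGV).
    """
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", python, script]


# "repro.py" is a hypothetical file holding the failing Theano snippet.
# To launch it: subprocess.call(gdb_backtrace_cmd("repro.py"))
print(" ".join(gdb_backtrace_cmd("repro.py", python="python")))
```

If the crash is inside the compiled elementwise code or inside libamdlibm, the frames at the top of the backtrace should indicate which of the two it is.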
Re: [theano-users] Re: Segmentation fault when setting lib.amdlibm=True and using trigonometric operations. How to debug?
I'm hitting this problem myself, as well.

Ubuntu 14.04
gcc 4.8.4
amdlibm-3.1
libopenblas-dev 0.2.8

Would appreciate any suggestions - running with amdlibm off for the moment.

Best,
Michael

On Friday, March 11, 2016 at 12:49:04 PM UTC-5, Pascal Lamblin wrote:
>
> On Fri, Mar 11, 2016, Juan Camilo Gamboa Higuera wrote:
> > I tried changing the architecture, but this still produces the
> > segmentation fault. How can I see all the flags that are passed to
> > the gcc command from theano?
>
> You can change the log level of the following call [1], or just add
> "print(cmd)" there.
>
> [1] https://github.com/Theano/Theano/blob/master/theano/gof/cmodule.py#L2165
>
> > -- Juan
> >
> > On Thursday, March 10, 2016 at 1:42:59 PM UTC-5, Pascal Lamblin wrote:
> > >
> > > Can you try passing explicitly a target architecture to g++ in theano,
> > > for instance THEANO_FLAGS=gcc.cxxflags='-march=core2'?
> > >
> > > By default, Theano tries to emulate '-march=native', which might cause
> > > trouble.
> > >
> > > On Thu, Mar 10, 2016, Juan Camilo Gamboa Higuera wrote:
> > > > I tried another machine with the following software:
> > > > Ubuntu 14.04.4, 4.2 kernel
> > > > gcc 4.8.4
> > > > libopenblas-dev 0.2.8
> > > > amd libm 3.1
> > > > latest theano version from the git repository
> > > >
> > > > and I got again a segmentation fault. The main difference between the
> > > > machines is the processor (The first a 6th gen i7, the second a Xeon,
> > > > the third a 5th gen i7).
> > > > Any other ideas of what might be causing the segmentation fault?
> > > >
> > > > -- Juan Camilo
> > > >
> > > > On Thursday, March 10, 2016 at 9:36:38 AM UTC-5, Juan Camilo Gamboa Higuera wrote:
> > > > >
> > > > > Hi Pascal,
> > > > >
> > > > > Thanks for your time!
> > > > >
> > > > > I'm using the latest version of Theano from the git repository, and AMD
> > > > > libm version 3.1. The md5 sums are:
> > > > >
> > > > > fd9f86e040a4d5f26013fc5f60182680 libamdlibm.so
> > > > > 701474c84f1e3dff6bbb1a11d07eb215 libamdlibm.a
> > > > >
> > > > > I tried disabling blas (by setting
> > > > > *THEANO_FLAGS="lib.amdlibm=True,blas.ldflags=''"*) but I still get the
> > > > > segmentation fault.
> > > > >
> > > > > I've also tried it on another machine, and I get no segmentation fault.
> > > > >
> > > > > The machine where the segfault happens:
> > > > > Ubuntu 15.10, 4.2 kernel
> > > > > gcc 4.9.3
> > > > > libopenblas-dev 0.2.14
> > > > >
> > > > > The machine where it works:
> > > > > Ubuntu 14.04.4, 4.2 kernel
> > > > > gcc 4.8.4
> > > > > libopenblas-dev 0.2.8
> > > > >
> > > > > I will try installing the older versions of openblas and gcc on my
> > > > > machine and see if it makes a difference.
> > > > >
> > > > > Do you have any other suggestions?
> > > > >
> > > > > -- Juan Camilo
> > > > >
> > > > > On Wednesday, March 9, 2016 at 7:52:39 PM UTC-5, Pascal Lamblin wrote:
> > > > >>
> > > > >> So I just tried your case, on Linux, with a version of amdlibm that is
> > > > >> possibly quite old but that I'm unable to identify (md5sum below).
> > > > >>
> > > > >> I was unfortunately unable to reproduce the segfault.
> > > > >>
> > > > >> It may be possible that for some reason, the output of the dot product
> > > > >> is not aligned, and that amdlibm does not support that. In that case,
> > > > >> maybe disabling BLAS could help.
> > > > >>
> > > > >> 9211766f5cef4ce1a35dc44701dcac6a libamdlibm.a
> > > > >> 3ce0e1c4c7afbfe514639fd8482ed220 libamdlibm.so
> > > > >>
> > > > >> On Wed, Mar 09, 2016, Juan Camilo Gamboa Higuera wrote:
> > > > >> > Another development!
> > > > >> >
> > > > >> > If I disable the inplace optimization (optimizer_excluding=inplace_opt)
> > > > >> > the segmentation fault disappears. Which narrows down the problem to
> > > > >> > situations where
> > > > >> >
> > > > >> > *you take the sine or cosine of a dot product between two matrices and
> > > > >> > inplace optimization for elementwise ops is enabled.*
> > > > >> >
> > > > >> > Cheers!
> > > > >> >
> > > > >> > -- Juan Camilo
> > > > >> >
> > > > >> > On Wednesday, March 9, 2016 at 10:23:22 AM UTC-5, Juan Camilo Gamboa Higuera wrote:
> > > > >> > >
> > > > >> > > Hi all,
> > > > >> > >
> > > > >> > > *Here is a simpler version of the code that causes the segmentation fault
> > > > >> > > when using amdlibm. It looks like it happens when you take the sine or
> > > > >> > > cosine of a dot product:*
> > > > >> > >
> > > > >> > > import theano
> > > > >> > > import theano.tensor as T
> > > > >> > > import numpy as np
> > > > >> > >
> > > > >> > > np.set_printoptions(linewidth=200)
> > > > >> > > n_samples = 500
> > > > >> > > n_basis = 100
> > > > >> > > idims = 4
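
The flag combinations tried in this thread can also be set from Python, as long as that happens before Theano is imported, since Theano reads `THEANO_FLAGS` from the environment at import time. This is a minimal sketch: the helper `theano_flags` is hypothetical, while the flag names and values are the ones quoted in the messages above.

```python
import os


def theano_flags(settings):
    """Join (flag, value) pairs into a comma-separated THEANO_FLAGS string."""
    return ",".join("{}={}".format(name, value) for name, value in settings)


# Keep amdlibm enabled but exclude the inplace optimization, the combination
# reported in the thread to make the segfault disappear.
flags = theano_flags([
    ("lib.amdlibm", "True"),
    ("optimizer_excluding", "inplace_opt"),
])
os.environ["THEANO_FLAGS"] = flags  # must be set before `import theano`
print(flags)
```

Swapping the second pair for `("gcc.cxxflags", "-march=core2")` or `("blas.ldflags", "")` would cover the other two experiments mentioned in the thread, changing one setting at a time.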