Re: [Rd] Cluster: Various GCC, how important is consistency?

2016-10-18 Thread Paul Johnson
Dear Jeroen

Did you rebuild R-3.3.1 and all of the packages with GCC-5.3 in order
to make this work?

The part that worries me is that the shared libraries won't be
consistent, with various versions of GCC in play.
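
To see how mixed things already are, something like the following might
help. It is only a sketch: it assumes binutils' readelf is installed and
that the compiler left a .comment section in each shared object (GCC
normally does, unless the file was stripped). The function name and the
library path in the usage line are my own placeholders.

```shell
# list_pkg_compilers LIBDIR
#   For every package shared object under an R library directory, print the
#   file name and any "GCC: (...)" strings recorded in its .comment section.
list_pkg_compilers() {
    libdir=$1
    for so in "$libdir"/*/libs/*.so; do
        [ -e "$so" ] || continue    # the glob matched nothing
        printf '%s\n' "$so"
        readelf -p .comment "$so" 2>/dev/null | grep 'GCC:' \
            || echo '  (no GCC comment found)'
    done
}

# Usage (the path is an assumption, adjust to your install):
# list_pkg_compilers /usr/lib64/R/library
```

This only shows which compiler built each object; it says nothing about
whether the mix is actually safe.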

On Tue, Oct 18, 2016 at 5:55 AM, Jeroen Ooms wrote:
> On Tue, Oct 18, 2016 at 1:44 AM, Paul Johnson wrote:
>>
>> Administrator suggested I try to build with the GCC that is provided
>> with the nodes, which is gcc-4.4.7.
>
> Red Hat provides an alternative compiler (gcc 5.3 based) in one of its
> opt-in repositories called "Red Hat Developer Toolset" (RDT). In CentOS
> you install it as follows:
>
>   yum install -y centos-release-scl
>   yum install -y devtoolset-4-gcc-c++
>
> This compiler is specifically designed to be used alongside the EL6
> stock gcc 4.4.7. It includes a simple 'enable' script which will put
> RDT gcc and g++ in front of your PATH and LD_LIBRARY_PATH and so on.
>
> So what I do on CentOS is install R from EPEL (built with stock gcc
> 4.4.7) and whenever I need to install an R package that uses e.g.
> CXX11, simply start an R shell using the RDT compilers:
>
>   source /opt/rh/devtoolset-4/enable
>   R
>
> From what I have been able to test, this works pretty well (though I
> am not a regular EL user). But I was able to build R packages that use
> C++11 (such as feather) and once installed, these packages can be used
> even in a regular R session (without RDT enabled).



-- 
Paul E. Johnson   http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

I only use this account for email list memberships. To write directly,
address me at pauljohn at ku.edu.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Cluster: Various GCC, how important is consistency?

2016-10-18 Thread Jeroen Ooms
On Tue, Oct 18, 2016 at 1:44 AM, Paul Johnson wrote:
>
> Administrator suggested I try to build with the GCC that is provided
> with the nodes, which is gcc-4.4.7.

Red Hat provides an alternative compiler (gcc 5.3 based) in one of its
opt-in repositories called "Red Hat Developer Toolset" (RDT). In CentOS
you install it as follows:

  yum install -y centos-release-scl
  yum install -y devtoolset-4-gcc-c++

This compiler is specifically designed to be used alongside the EL6
stock gcc 4.4.7. It includes a simple 'enable' script which will put
RDT gcc and g++ in front of your PATH and LD_LIBRARY_PATH and so on.

So what I do on CentOS is install R from EPEL (built with stock gcc
4.4.7) and whenever I need to install an R package that uses e.g.
CXX11, simply start an R shell using the RDT compilers:

   source /opt/rh/devtoolset-4/enable
   R

From what I have been able to test, this works pretty well (though I
am not a regular EL user). But I was able to build R packages that use
C++11 (such as feather) and once installed, these packages can be used
even in a regular R session (without RDT enabled).
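
If you want to guard against starting R with the wrong compiler, a small
check along these lines could go in front of the install step. This is a
sketch, not something I have run on a cluster: the function names are made
up, it relies on GNU sort -V, and the 4.6 threshold is simply the minimum
reported for RcppArmadillo in this thread.

```shell
# version_ge A B: succeed if dotted version A is >= B (needs GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_gcc MIN: report whether the gcc currently on PATH is at least MIN.
check_gcc() {
    have=$(gcc -dumpversion 2>/dev/null) || { echo "no gcc on PATH"; return 1; }
    if version_ge "$have" "$1"; then
        echo "gcc $have is new enough (need >= $1)"
    else
        echo "gcc $have is too old (need >= $1)"
        return 1
    fi
}

# Usage, after enabling the newer toolchain:
# source /opt/rh/devtoolset-4/enable
# check_gcc 4.6 && R
```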



Re: [Rd] Cluster: Various GCC, how important is consistency?

2016-10-17 Thread Gabriel Becker
This absolutely causes its own problems (and they may be bad enough that
you shouldn't do it), but you can also install an older version of
RcppArmadillo. My switchr package makes this more convenient from within R,
but grabbing tarballs from the CRAN web archive also works (in fact
that's what switchr will do in this case).
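
Done by hand, the archive route looks roughly like this. It is a sketch
only: the helper names are invented for illustration, VERSION is a
placeholder (I am not claiming any particular release builds with gcc
4.4.7), and it assumes curl and R are on the PATH.

```shell
# cran_archive_url PKG VERSION: build the CRAN source-archive URL for a package.
cran_archive_url() {
    printf 'https://cran.r-project.org/src/contrib/Archive/%s/%s_%s.tar.gz\n' \
        "$1" "$1" "$2"
}

# install_archived PKG VERSION: download the tarball and install it from source.
install_archived() {
    url=$(cran_archive_url "$1" "$2")
    tarball=${url##*/}              # file name part of the URL
    curl -fsSLO "$url" && R CMD INSTALL "$tarball"
}

# Usage (pick VERSION from the package's archive listing):
# install_archived RcppArmadillo VERSION
```

An older RcppArmadillo may in turn want an older Rcpp, so dependencies
might have to be pinned the same way.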

This, of course, will never be more than a stopgap. Eventually, sadly,
you'll likely need a newer operating system. We have the same problems on
our cluster.

Best of luck,
~G

On Oct 17, 2016 6:16 PM, "Simon Urbanek" wrote:

> There are many issues with different gcc versions, but they can at least
> be minimized by using static linking, i.e. you should at the very least use
> -static-libstdc++ -static-libgcc to make sure you don't mix runtime
> versions. We run into the same problem since C++11 compilers are rare on
> production machines, but as long as you can isolate the packages away from
> the dynamically loaded code it often works since R only works at symbol
> level as long as you have a self-contained binary. The only other thing to
> worry about is ABI changes, but unless you use Fortran they tend to be
> compatible enough.
>
> Cheers,
> Simon
>
>
> > On Oct 17, 2016, at 7:44 PM, Paul Johnson wrote:
> >
> > On a cluster that is based on RedHat 6.2, we are updating to R-3.3.1.
> > I have, from time to time, run into problems with various R packages
> > and some older versions of GCC. I wish we had newer Linux in the
> > cluster, but with 1000s of nodes running 1000s of jobs, well, they
> > don't want a restart.
> >
> > Administrator suggested I try to build with the GCC that is provided
> > with the nodes, which is gcc-4.4.7.  To my surprise, R-3.3.1 compiled
> > with that.  After that, I got quite far, many 100s of packages
> > > compiled, but then I hit a snag that RcppArmadillo explicitly refuses
> > > to build with anything older than gcc-4.6.  The OpenMx and
> > > emplik packages also refuse to compile with old gcc.
> >
> > The cluster uses a module system, it is easy enough to swap in various
> > gcc versions to see what compiles.
> >
> > I did succeed compiling RcppArmadillo with gcc 4.9.2. But Rcpp is not
> > picky, it compiled with gcc-4.4.7.
> >
> > I worry...
> >
> > 1)  will reliance on various GCC make the packages incompatible with
> > R, or each other?
> >
> > I logged out, logged back in, with R 3.3.1 I can run
> >
> > library(RcppArmadillo)
> > library(Rcpp)
> >
> > with no errors so far. But I'm not stress testing it much.
> >
> > > Should I rebuild everything?
> >
> > I expect that if I were to use gcc-6 on one package, it would not be
> > compatible with binaries built with 4.4.7.  But is there a zone of
> > tolerance allowing 4.4.7 and 4.9 packages to coexist?
> >
> > 2) If I build with non-default GCC, are all of the R users going to
> > hit trouble if they don't have the same GCC I use?  Unless I make some
> > extraordinary effort, they are getting GCC 4.4.7. If they try to
> > install a package, they are getting that GCC, not the one I use to
> > build RcppArmadillo or the other trouble cases (or everything, if you
> > say I need to go back and rebuild).
> >
> > > From an administrative point of view, should I tie R-3.3.1 to a
> > particular version of GCC? I think I could learn how to do that.
> >
> > On the cluster, they use the module framework. There are about 50
> > > versions of GCC.  It is easy enough to ask for a newer one:
> >
> > $ module load gcc/4.9.2
> >
> > > It puts the gcc 4.9.2 binaries and shared libraries at the front of the PATHs.
> >
> > pj
> >
> >
> > --
> > Paul E. Johnson   http://pj.freefaculty.org
> > > Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
> >
> > To write me directly, address me at pauljohn at ku.edu.
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



Re: [Rd] Cluster: Various GCC, how important is consistency?

2016-10-17 Thread Simon Urbanek
There are many issues with different gcc versions, but they can at least be 
minimized by using static linking, i.e. you should at the very least use 
-static-libstdc++ -static-libgcc to make sure you don't mix runtime versions. 
We run into the same problem since C++11 compilers are rare on production 
machines, but as long as you can isolate the packages away from the dynamically 
loaded code it often works since R only works at symbol level as long as you 
have a self-contained binary. The only other thing to worry about is ABI 
changes, but unless you use Fortran they tend to be compatible enough.
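
As a rough sketch of how those flags could be passed to package builds: one
option is a user-level Makevars file. The assumptions here are mine: that
R's package link step picks up LDFLAGS from that file, that a static
libstdc++ (libstdc++.a) is installed for the compiler in use, and that the
compiler accepts both flags. The function names and paths are illustrative
only.

```shell
# write_static_makevars FILE: append link flags for static GCC runtimes to FILE.
write_static_makevars() {
    mkdir -p "$(dirname "$1")"
    cat >> "$1" <<'EOF'
# Link the GCC runtimes statically so a package does not depend on
# whichever libstdc++.so happens to be found at load time.
LDFLAGS += -static-libstdc++ -static-libgcc
EOF
}

# needs_shared_libstdcxx SO: succeed if SO still depends on a shared libstdc++.
needs_shared_libstdcxx() {
    ldd "$1" 2>/dev/null | grep -q 'libstdc++'
}

# Usage (paths are assumptions):
# write_static_makevars "$HOME/.R/Makevars"
# needs_shared_libstdcxx /path/to/library/pkg/libs/pkg.so && echo "still dynamic"
```

A user-level Makevars applies to every package you build, so it may be
safer to enable it only while installing the packages that need the newer
compiler.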

Cheers,
Simon


> On Oct 17, 2016, at 7:44 PM, Paul Johnson wrote:
> 
> On a cluster that is based on RedHat 6.2, we are updating to R-3.3.1.
> I have, from time to time, run into problems with various R packages
> and some older versions of GCC. I wish we had newer Linux in the
> cluster, but with 1000s of nodes running 1000s of jobs, well, they
> don't want a restart.
> 
> Administrator suggested I try to build with the GCC that is provided
> with the nodes, which is gcc-4.4.7.  To my surprise, R-3.3.1 compiled
> with that.  After that, I got quite far, many 100s of packages
> compiled, but then I hit a snag that RcppArmadillo explicitly refuses
> to build with anything older than gcc-4.6.  The OpenMx and
> emplik packages also refuse to compile with old gcc.
> 
> The cluster uses a module system, it is easy enough to swap in various
> gcc versions to see what compiles.
> 
> I did succeed compiling RcppArmadillo with gcc 4.9.2. But Rcpp is not
> picky, it compiled with gcc-4.4.7.
> 
> I worry...
> 
> 1)  will reliance on various GCC make the packages incompatible with
> R, or each other?
> 
> I logged out, logged back in, with R 3.3.1 I can run
> 
> library(RcppArmadillo)
> library(Rcpp)
> 
> with no errors so far. But I'm not stress testing it much.
> 
> Should I rebuild everything?
> 
> I expect that if I were to use gcc-6 on one package, it would not be
> compatible with binaries built with 4.4.7.  But is there a zone of
> tolerance allowing 4.4.7 and 4.9 packages to coexist?
> 
> 2) If I build with non-default GCC, are all of the R users going to
> hit trouble if they don't have the same GCC I use?  Unless I make some
> extraordinary effort, they are getting GCC 4.4.7. If they try to
> install a package, they are getting that GCC, not the one I use to
> build RcppArmadillo or the other trouble cases (or everything, if you
> say I need to go back and rebuild).
> 
> From an administrative point of view, should I tie R-3.3.1 to a
> particular version of GCC? I think I could learn how to do that.
> 
> On the cluster, they use the module framework. There are about 50
> versions of GCC.  It is easy enough to ask for a newer one:
> 
> $ module load gcc/4.9.2
> 
> It puts the gcc 4.9.2 binaries and shared libraries at the front of the PATHs.
> 
> pj
> 
> 
> -- 
> Paul E. Johnson   http://pj.freefaculty.org
> Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
> 
> To write me directly, address me at pauljohn at ku.edu.