[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-14 Thread vachan potluri
Here is a summary of the installation process on Cray XC50.

I have configured deal.II with MPI, LAPACK, SCALAPACK, PETSc and p4est. Our 
system didn't have p4est, so I started by installing it. All Cray 
libraries are in /opt/cray/pe/lib64/ on our system.

*Installing p4est*
1. Download the source files and the setup script.
2. By default, the setup script searches for mpicxx compilers. Instead, 
explicitly specify the Cray compilers. The configure command will look as 
follows.
"$SRCDIR/configure" CXX=/opt/cray/pe/craype/2.5.13/bin/CC \
CC=/opt/cray/pe/craype/2.5.13/bin/cc \
F77=/opt/cray/pe/craype/2.5.13/bin/ftn \
FC=/opt/cray/pe/craype/2.5.13/bin/ftn \
--enable-mpi --enable-shared \
--disable-vtk-binary --without-blas \
--prefix="$INSTALL_DEBUG" CFLAGS="$CFLAGS_DEBUG" \
CPPFLAGS="-DSC_LOG_PRIORITY=SC_LP_ESSENTIAL" \
"$@" > config.output || bdie "Error in configure"
Make this change for both the FAST and DEBUG versions.
3. By default, Cray assumes static linking. Change this to dynamic linking:
export XTPE_LINK_TYPE=dynamic
export CRAYPE_LINK_TYPE=dynamic
This will be required in subsequent steps also.
4. The generated makefile uses flags corresponding to GNU compilers, so swap 
the module PrgEnv-cray for PrgEnv-gnu.
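
Steps 3 and 4 can be collected into one small shell function that is run before every build step. This is only a sketch: the function name prepare_cray_env is made up for illustration, and the module names are the ones from this thread and may differ on other systems.

```shell
#!/usr/bin/env bash
# Sketch: prepare a Cray login shell before building p4est and deal.II.
# prepare_cray_env is a hypothetical helper name; PrgEnv-cray and PrgEnv-gnu
# are the module names mentioned in this thread.

prepare_cray_env() {
    # Ask the Cray compiler wrappers (cc, CC, ftn) to link dynamically.
    export XTPE_LINK_TYPE=dynamic
    export CRAYPE_LINK_TYPE=dynamic

    # 'module' is usually a shell function; skip the swap where it is absent.
    if command -v module >/dev/null 2>&1; then
        module swap PrgEnv-cray PrgEnv-gnu
    else
        echo "module command not found; only the link type was set" >&2
    fi
}

prepare_cray_env
```

Since the two exports only affect the current shell, the function has to be run (or the file sourced) in the same shell that later calls configure, make and cmake.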

*Configuring with LAPACK and SCALAPACK*
1. For LAPACK, deal.II's find module calls CMake's corresponding find 
module. For this to work on Cray systems, CMake version >= 3.16 is required, 
so I installed a newer version in my home directory and used that.
2. In Cray environments, the LAPACK libraries are linked implicitly by the 
Cray compiler wrappers without requiring any other flags, so the 
_lapack_libraries variable in deal.II's FindLAPACK.cmake will be empty. This 
is okay, so mark it as OPTIONAL at the end of this file.
3. For SCALAPACK, the library name in FindLAPACK.cmake should be changed to 
sci_gnu_61_mpi_mp (or whatever the name of the libsci library is on your 
system), since on Cray, SCALAPACK is part of this library.

*Configuring with MPI and PETSc*
1. For MPI, simply specify the compilers explicitly.
2. For PETSc, the library name must be changed to craypetsc_gnu_real-64 
(depending on your system).
3. The additional libraries that PETSc interfaces to are read from the linker 
line of $PETSC_DIR/lib/petsc/conf/petscvariables. Make a copy of this file 
and modify the linker line so that the library names are correct (they were 
not in my case). Change the hint for the petscvariables file in 
FindPETSC.cmake so that it points to your copy.
4. Also, add the correct hint for these library paths in the following 
portion of the aforementioned file.
DEAL_II_FIND_LIBRARY(PETSC_LIBRARY_${_token}
NAMES ${_token}
#HINTS ${_hints}
HINTS ${_hints} ${CMAKE_PREFIX_PATH}
)
In my case, I set CMAKE_PREFIX_PATH in the configure script to 
/opt/cray/pe/lib64.
5. If your system has PETSc libraries with ".so.mpi" extensions, you 
must enable finding those in dealii-9.1.1/CMakeLists.txt (the topmost one):
SET(CMAKE_FIND_LIBRARY_SUFFIXES ${CMAKE_FIND_LIBRARY_SUFFIXES}
".so.0" ".so.5" ".so.mpi31.2" ".so.mpi31.4" ".so.mpi31.5" 
".so.mpi31.6" ".so.mpi31.12"
)
6. If you are using 64-bit versions of PETSc libraries, you must enable 
this for deal.II too (see below).
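
To decide which suffixes need to be appended in step 5, it may help to list the distinct shared-library suffixes that actually occur in the library directory. The following is a rough sketch; list_so_suffixes is a made-up helper name, and /opt/cray/pe/lib64 is simply the directory used on our system.

```shell
#!/usr/bin/env bash
# Sketch: print every distinct ".so..." suffix found in a directory, so the
# matching entries can be added to CMAKE_FIND_LIBRARY_SUFFIXES.
# list_so_suffixes is a hypothetical helper, not part of deal.II or CMake.

list_so_suffixes() {
    local dir="$1"
    # Keep file names containing ".so" followed by "." or end of name, and
    # print the part of the name starting at that ".so".
    ls -1 "$dir" 2>/dev/null |
        awk 'match($0, /\.so(\.|$)/) { print substr($0, RSTART) }' |
        LC_ALL=C sort -u
}

lib_dir=/opt/cray/pe/lib64   # location on our system; adjust as needed
if [ -d "$lib_dir" ]; then
    list_so_suffixes "$lib_dir"
fi
```

Any suffix printed here that CMake does not search for by default is a candidate for the SET(CMAKE_FIND_LIBRARY_SUFFIXES ...) line above.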



You must unload the atp module before configuring. For cross-compilation, 
you can just add -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment without requiring 
a toolchain file in newer CMake versions. The configure script is

# from bashrc; shell scripts can't use aliases
cmake_new=~/bin/cmake-3.16.4/usr/local/bin/cmake
$cmake_new -DCMAKE_INSTALL_PREFIX=~/bin/dealii-9.1.1/ \
-DWITH_64BIT_INDICES=ON \
-DCMAKE_PREFIX_PATH=/opt/cray/pe/lib64 \
-DWITH_MPI=ON \
-DMPI_DIR=/opt/cray/pe/mpt/default/gni/mpich-gnu/5.1/ \
-DMPI_CXX_INCLUDE_PATH=/opt/cray/pe/mpt/default/gni/mpich-gnu/5.1/include/ \
-DCMAKE_CXX_COMPILER=/opt/cray/pe/craype/2.5.13/bin/CC \
-DCMAKE_C_COMPILER=/opt/cray/pe/craype/2.5.13/bin/cc \
-DCMAKE_Fortran_COMPILER=/opt/cray/pe/craype/2.5.13/bin/ftn \
-DWITH_BLAS=ON \
-DWITH_LAPACK=ON \
-DWITH_SCALAPACK=ON \
-DWITH_PETSC=ON \
-DWITH_P4EST=ON -DP4EST_DIR=~/bin/p4est-2.2/ \
-DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment \
~/source/dealii-9.1.1

-- 
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"deal.II User Group" group.

Re: [deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-13 Thread Bruno Turcksin
On Thu, Feb 13, 2020 at 07:27, vachan potluri wrote:
>
> It is working!
>
> The mistake I made was to open an interactive job and run the executables 
> through bash. When I instead submitted a job and used aprun 
> (Cray's equivalent of mpirun) to run the executables, they ran successfully. 
> I tested step-1, step-18 and my own code too. The installation tests will 
> probably not run though, since they are actually makefile targets.

Glad to hear. You should be able to run the installation tests but you
need to let cmake know that it shouldn't use mpiexec but aprun
instead. This can be done when you configure deal.II by using:

-D MPI_EXEC="aprun"
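
If the same job script has to work on machines with and without aprun, the launcher can be picked at run time. This is a minimal sketch, not something deal.II provides: pick_launcher is a made-up helper, and the candidate list is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: choose the first MPI launcher that is available on PATH.
# pick_launcher is a hypothetical helper name.

pick_launcher() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1   # none of the candidates exists
}

# Prefer aprun (Cray), then fall back to the usual MPI launchers.
launcher=$(pick_launcher aprun mpiexec mpirun) || launcher=""
echo "MPI launcher: ${launcher:-none found}"
```

Inside a submitted job, an executable would then be started with something like "$launcher" -n 4 ./step-1, where the process count of 4 is only an example.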

Bruno



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-13 Thread vachan potluri
It is working!

The mistake I made was to open an interactive job and run the executables 
through bash. When I instead submitted a job and used aprun 
(Cray's equivalent of mpirun) to run the executables, they ran 
successfully. I tested step-1, step-18 and my own code too. The 
installation tests will probably not run though, since they are actually 
makefile targets.

I apologise for being irresponsible and hasty in the previous couple of 
messages. I thank everyone for helping me and hearing me out when I was all 
by myself. I will also post a summary of the installation process.



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-12 Thread vachan potluri
I have found a few reports of glibc version 2.28 causing such behaviour. It 
might be possible that /lib64/ld-linux-x86-64.so.2 on our system "links" to 
this version of glibc. But it is actually statically linked:
$ ldd -v ld-linux-x86-64.so.2
statically linked
So there is probably no way to ascertain this. If it is in fact so (linked 
to glibc 2.28), then I don't think there is any way I can get this working. 
With a simple test program, I have found that my compiler links to glibc 
version 2.22 at both compile time and run time. So there is no issue with 
the compiler.
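
For a quick look at the run-time side only, the glibc version can also be queried from the shell. This is a rough sketch and not the test program mentioned above; runtime_glibc_version is a made-up helper name, and the query only works where the C library is glibc.

```shell
#!/usr/bin/env bash
# Sketch: report the glibc version seen at run time on the current node.
# runtime_glibc_version is a hypothetical helper name.

runtime_glibc_version() {
    # getconf ships with glibc; on other C libraries this query fails,
    # in which case "unknown" is printed instead.
    getconf GNU_LIBC_VERSION 2>/dev/null || echo "unknown"
}

echo "run-time C library: $(runtime_glibc_version)"
```

Running this once on the login node and once inside a submitted job shows whether the two kinds of node use the same glibc. To see which libc file a given executable resolves to, ldd ./step-1 can be used in the same way.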



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-12 Thread vachan potluri
This is the full backtrace with gdb.
(gdb) bt
#0  __static_initialization_and_destruction_0 (__initialize_p=1, 
__priority=65535)
at 
/home/ComptGasDynLab/vachanpotluri/source/dealii-9.1.1/source/numerics/time_dependent.cc:1196
#1  0x7fffec1aa6f8 in _GLOBAL__sub_I_time_dependent.cc(void) () at 
/home/ComptGasDynLab/vachanpotluri/source/dealii-9.1.1/source/numerics/time_dependent.cc:1275
#2  0x77deacba in call_init.part () from /lib64/ld-linux-x86-64.so.2
#3  0x77deada3 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#4  0x77ddd22a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#5  0x0001 in ?? ()
#6  0x7fff729e in ?? ()
#7  0x in ?? ()
Unfortunately, gdb is probably not configured properly. Cray has its own 
debuggers, most of them GUIs (and hence cannot be used) and all of them 
require submitting a job interactively which I am currently unable to do. I 
will post the bt with one of those when the queue is empty.



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-12 Thread Bruno Turcksin
Vachan,

On Wednesday, February 12, 2020 at 12:24:38 AM UTC-5, vachan potluri wrote:
>
> Step-1 aborts with Illegal Instruction (core dumped). The error message gdb 
> prints is the following.
> Program received signal SIGILL, Illegal instruction.
> __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
>     at /home/ComptGasDynLab/vachanpotluri/source/dealii-9.1.1/source/numerics/time_dependent.cc:1196
> 1196  std::make_pair(0U, 0.)));
> When I backtrace the error, it leads to this.
> template <int dim>
> typename TimeStepBase_Tria_Flags::RefinementFlags<dim>::CorrectionRelaxations
>   TimeStepBase_Tria_Flags::RefinementFlags<dim>::default_correction_relaxations(
>     1, // one element, denoting the first and all subsequent sweeps
>     std::vector<std::pair<unsigned int, double>>(1, // one element, denoting the
>                                                     // upper bound for the
>                                                     // following relaxation
>                                                  std::make_pair(0U, 0.)));
> Not just step-1, but step.debug, affinity.debug and mpi.debug (and 
> possibly other debug tests) also terminate with the same error and 
> backtrace. Can someone explain why this is happening?
>

I am not sure why you have this error. Can you show the whole backtrace? 

Best,

Bruno



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-11 Thread vachan potluri
Step-1 aborts with Illegal Instruction (core dumped). The error message gdb 
prints is the following.
Program received signal SIGILL, Illegal instruction.
__static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at /home/ComptGasDynLab/vachanpotluri/source/dealii-9.1.1/source/numerics/time_dependent.cc:1196
1196  std::make_pair(0U, 0.)));
When I backtrace the error, it leads to this.
template <int dim>
typename TimeStepBase_Tria_Flags::RefinementFlags<dim>::CorrectionRelaxations
  TimeStepBase_Tria_Flags::RefinementFlags<dim>::default_correction_relaxations(
    1, // one element, denoting the first and all subsequent sweeps
    std::vector<std::pair<unsigned int, double>>(1, // one element, denoting the
                                                    // upper bound for the
                                                    // following relaxation
                                                 std::make_pair(0U, 0.)));
Not just step-1, but step.debug, affinity.debug and mpi.debug (and possibly 
other debug tests) also terminate with the same error and backtrace. Can 
someone explain why this is happening?



Re: [deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-11 Thread Wolfgang Bangerth

On 2/11/20 12:09 AM, vachan potluri wrote:


However, make test shows all tests failing with the error
/bin/sh: .: command not found.
When I navigated to tests/quick_tests and individually ran the tests, 
the output was as follows.

$ ./lapack.debug
Illegal instruction

Can anyone help me with this issue? Is this because I have 
cross-compiled? The make test instruction was run on the login node. If so, 
is there a way I can check my installation on the login node itself without 
needing to submit a job?


Yes, that's exactly the problem: The compiler compiles the library for 
the target nodes, not for the front-end login node. There isn't a good 
way to test things in this case -- I'd just see whether you can run 
something like step-1 on the compute nodes by submitting a job.
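
One way to see this difference concretely is to compare the CPU instruction-set flags of the login node with those of a compute node; a binary that uses an instruction the login CPU lacks dies with "Illegal instruction". The following is only a sketch for Linux systems: cpu_flags is a made-up helper name, and it assumes the x86 layout of /proc/cpuinfo with a "flags" line.

```shell
#!/usr/bin/env bash
# Sketch: print the CPU feature flags of the current node, one per line.
# cpu_flags is a hypothetical helper; it reads a cpuinfo-style file
# (default /proc/cpuinfo) and uses the "flags" line of the first CPU.

cpu_flags() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    awk -F': *' '/^flags/ {
            n = split($2, f, " ")
            for (i = 1; i <= n; i++) print f[i]
            exit
        }' "$cpuinfo" | LC_ALL=C sort -u
}

if [ -r /proc/cpuinfo ]; then
    echo "number of CPU flags on this node: $(cpu_flags | wc -l)"
fi
```

Saving the output as login.flags on the login node and as compute.flags from within a submitted job, LC_ALL=C comm -13 login.flags compute.flags then lists the flags that only the compute nodes have.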


You're discovering why the Cray XC machines are so tremendously 
unpopular: Cray managed to create a software system with the many 
support libraries that is incredibly complicated and, moreover, 
incompatible with the way software is set up on almost every other 
machine. It's also exceedingly impractical that they chose a different 
processor for the frontend than for the compute nodes. Both of these 
decisions are really beyond essentially everyone's imagination who has 
to deal with these machines. I don't know a single person who believes 
that these were good design choices for Cray to make.


I don't wish Cray bankruptcy, but I do hope that the tens or hundreds of 
thousands of dollars that are spent on the difficulty of installing 
software on each one of their machines is deducted from the salaries of 
those responsible for their machine designs...


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-10 Thread vachan potluri
Ok. After installing a newer CMake version and making _lapack_libraries 
OPTIONAL, the LAPACK configuration has gone fine. For PETSc, I did something 
dirty. I figured out that FindPETSC.cmake searches for libraries in a file 
named petscvariables. I made my own copy of the petscvariables file and 
modified the linker line in this copy. I changed the path hints so that the 
copy is found, and then the rest went as expected.
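
The copy-and-modify step can be scripted roughly as follows. This is a sketch under assumptions: patch_link_line is a made-up helper, the linker line is assumed to be the one starting with PETSC_WITH_EXTERNAL_LIB, and the library names in the usage note are placeholders rather than the exact ones I changed.

```shell
#!/usr/bin/env bash
# Sketch: write a copy of petscvariables in which one -l<old> entry on the
# linker line is replaced by -l<new>. Other lines are copied unchanged.
# Assumption: the linker line starts with PETSC_WITH_EXTERNAL_LIB.
# Do not pass the same file as source and destination.

patch_link_line() {
    local src="$1" dst="$2" old="$3" new="$4"
    awk -v old="-l$old" -v new="-l$new" '
        /^PETSC_WITH_EXTERNAL_LIB[ \t]*=/ {
            # Replace only exact matches of the whole -l<old> token.
            for (i = 1; i <= NF; i++) if ($i == old) $i = new
        }
        { print }
    ' "$src" > "$dst"
}
```

A call would look like patch_link_line "$PETSC_DIR/lib/petsc/conf/petscvariables" ~/petscvariables.cray petsc craypetsc_gnu_real-64, after which the hint in FindPETSC.cmake is pointed at the new file.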

The installation was successful (I had to unload the atp module). I did a 
cross compilation. Instead of using a toolchain file, I used the option 
-DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment. 
The complete cmake invocation is as below.

cmake_new=~/bin/cmake-3.16.4/usr/local/bin/cmake
$cmake_new -DCMAKE_INSTALL_PREFIX=~/bin/dealii-9.1.1/ \
-DWITH_64BIT_INDICES=ON \
-DCMAKE_PREFIX_PATH=/opt/cray/pe/lib64 \
-DWITH_MPI=ON \
-DMPI_DIR=/opt/cray/pe/mpt/default/gni/mpich-gnu/5.1/ \
-DMPI_CXX_INCLUDE_PATH=/opt/cray/pe/mpt/default/gni/mpich-gnu/5.1/include/ \
-DCMAKE_CXX_COMPILER=/opt/cray/pe/craype/2.5.13/bin/CC \
-DCMAKE_C_COMPILER=/opt/cray/pe/craype/2.5.13/bin/cc \
-DCMAKE_Fortran_COMPILER=/opt/cray/pe/craype/2.5.13/bin/ftn \
-DWITH_BLAS=ON \
-DWITH_LAPACK=ON \
-DWITH_SCALAPACK=ON \
-DWITH_PETSC=ON \
-DWITH_P4EST=ON -DP4EST_DIR=~/bin/p4est-2.2/ \
-DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment \
~/source/dealii-9.1.1

However, make test shows all tests failing with the error
/bin/sh: .: command not found.
When I navigated to tests/quick_tests and individually ran the tests, the 
output was as follows.
$ ./lapack.debug 
Illegal instruction

Can anyone help me with this issue? Is this because I have cross-compiled? 
The make test instruction was run on the login node. If so, is there a way I 
can check my installation on the login node itself without needing to submit 
a job?



[deal.II] Re: Installation on cray XC50 | linking to petsc, lapack and blas libraries with different names

2020-02-09 Thread vachan potluri

>
> If only an old version is the problem, I would just go ahead and download 
> and compile a recent version myself. I never had any issues with that and 
> should be quite simple.

I did this. Indeed, the cmake output now prints
A library with LAPACK API found.
However, the LAPACK configuration fails:
-- Include /home/ComptGasDynLab/vachanpotluri/source/dealii-9.1.1/cmake/configure/configure_1_lapack.cmake
-- Lapack dir /usr/lib64
-- A library with LAPACK API found.
-- Performing Test craypetsc_gnu_53_real-64_LIBRARY
-- Performing Test craypetsc_gnu_53_real-64_LIBRARY - Success
-- Performing Test rca_LIBRARY
-- Performing Test rca_LIBRARY - Success
-- Performing Test AtpSigHandler_LIBRARY
-- Performing Test AtpSigHandler_LIBRARY - Success
-- Performing Test AtpSigHCommData_LIBRARY
-- Performing Test AtpSigHCommData_LIBRARY - Success
-- Performing Test sci_gnu_61_mpi_LIBRARY
-- Performing Test sci_gnu_61_mpi_LIBRARY - Success
-- Performing Test sci_gnu_61_LIBRARY
-- Performing Test sci_gnu_61_LIBRARY - Success
-- Performing Test mpich_gnu_51_LIBRARY
-- Performing Test mpich_gnu_51_LIBRARY - Success
-- Performing Test mpichf90_gnu_51_LIBRARY
-- Performing Test mpichf90_gnu_51_LIBRARY - Success
-- Performing Test gfortran_LIBRARY
-- Performing Test gfortran_LIBRARY - Success
-- Performing Test quadmath_LIBRARY
-- Performing Test quadmath_LIBRARY - Success
-- Performing Test pthread_LIBRARY
-- Performing Test pthread_LIBRARY - Success
-- Performing Test m_LIBRARY
-- Performing Test m_LIBRARY - Success
-- Performing Test gomp_LIBRARY
-- Performing Test gomp_LIBRARY - Success
-- Performing Test gcc_s_LIBRARY
-- Performing Test gcc_s_LIBRARY - Success
-- Performing Test gcc_LIBRARY
-- Performing Test gcc_LIBRARY - Success
-- Performing Test c_LIBRARY
-- Performing Test c_LIBRARY - Success
--   LAPACK_LIBRARIES: *** Required variable "_lapack_libraries" empty ***
--   LAPACK_LINKER_FLAGS: 
--   LAPACK_INCLUDE_DIRS: 
--   LAPACK_USER_INCLUDE_DIRS: 
-- Could NOT find LAPACK
-- DEAL_II_WITH_LAPACK has unmet external dependencies.

I had a look at CMake's version of FindLAPACK.cmake. It explicitly 
mentions that for the Cray programming environment, the variable 
LAPACK_LIBRARIES is set to empty.
# On compilers that implicitly link LAPACK (such as ftn, cc, and CC on Cray HPC machines)
# we used a placeholder for empty LAPACK_LIBRARIES to get through our logic above.
if (LAPACK_LIBRARIES STREQUAL "LAPACK_LIBRARIES-PLACEHOLDER-FOR-EMPTY-LIBRARIES")
  set(LAPACK_LIBRARIES "")
endif()
And this is causing the error. So in deal.II's FindLAPACK.cmake, can I set 
_lapack_libraries as OPTIONAL? Or is there a cleaner way to tackle this?
