Hi. Sorry for the late response.

On 10/2/2025 2:10 PM, Omar Elías Velasco Castillo wrote:
Hello,

This is a follow-up to my earlier questions about failing to submit and run simulations on a remote cluster.


    Can you show us what the error message(s) are?


Yes, the tail of my err file reads:

+ set -
ERROR: ld.so: object '/lib64/libpapi.so.5.2.0.0' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object '/lib64/libpapi.so.5.2.0.0' from LD_PRELOAD cannot be preloaded: ignored.
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
ERROR: ld.so: object '/lib64/libpapi.so.5.2.0.0' from LD_PRELOAD cannot be preloaded: ignored.
/home/ia/ovelasco/simulations/tov_ET_decisiva/SIMFACTORY/exe/cactus_sim: error while loading shared libraries: libpapi.so.5.2.0.0: cannot open shared object file: No such file or directory

Is it possible this library is installed on the head node and not the compute nodes? That could be the problem.

Do you actually need PAPI? If not, maybe build without it. Also, why is there an LD_PRELOAD of it at all?
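Two quick checks would narrow this down (the scheduler flags below are only placeholders; adjust for your cluster):

# 1. Is libpapi present on a compute node, or only on the login node?
#    Start a short interactive job and look for the file there.
qsub -I -l nodes=1,walltime=00:05:00     # PBS; on SLURM, something like: srun --pty bash
ls -l /lib64/libpapi.so*                 # run this on the compute node

# 2. Where is LD_PRELOAD being set? It is not in your runscript, so check
#    your shell startup files and the simfactory machine entry (run the grep
#    from your Cactus directory).
grep -rn LD_PRELOAD ~/.bashrc ~/.bash_profile simfactory/mdb/ 2>/dev/null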

=>> PBS: job killed: walltime 864033 exceeded limit 864000
mpirun: abort is already in progress...hit ctrl-c again to forcibly terminate



    Maybe. I'm not 100% sure what you are doing. Can you be clearer
    about how you are running the ET?


Sure. As I mentioned in a previous email, my intention is to build, submit, and run the ET on remote machines through Simfactory, using either PBS or SLURM, since I work on different remote machines from time to time. The key point is this: while the sim/cactus_sim build is running, some lines printed to the shell indicate that when a library is not found on the machine, the build falls back to a bundled copy for the thorns that need it (examples below showing the bundled GSL and HDF5 being used):
That is correct.


********************************************************************************
Running configuration script for thorn GSL:
GSL selected, but GSL_DIR not set. Checking pkg-config ...
GSL not found. Checking standard paths ...
GSL not found.
Using bundled GSL...
Finished running configuration script for thorn GSL.
********************************************************************************
Running configuration script for thorn HDF5:
Additional requested language support:  Fortran
HDF5 selected, but HDF5_DIR not set. Checking pkg-config ...
HDF5 not found. Checking standard paths ...
HDF5 not found.
Using bundled HDF5...
Finished running configuration script for thorn HDF5.

First, these messages about GSL and HDF5 not being found surprise me, because both libraries are installed under /usr and I point to them in the configuration file.
It may be that HDF5 isn't built with the correct options. You need hdf5+hl+fortran+cxx+mpi (i.e. you need fortran, cxx, mpi, and hl enabled).
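A quick way to see how the HDF5 under /usr was built (the paths match what your optionlist points at; h5cc comes with the HDF5 development package, so skip that line if it isn't installed):

# Which interface libraries exist?
ls /usr/lib64/libhdf5*.so*        # look for libhdf5_hl, libhdf5_cpp, libhdf5_fortran
# What was enabled when it was built?
h5cc -showconfig | grep -iE 'fortran|c\+\+|parallel|high'

If Fortran, C++, the high-level library, or parallel (MPI) support turns out to be missing, that would be consistent with the fallback to the bundled copy you are seeing.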

Second: if the build completed successfully in the login shell of that machine, then I assume I can submit and run simulations in the PBS or SLURM queues with my machine's configuration and runscript files, am I right? If so, why have I still not been able to do it? My simulations die within a few seconds of starting.
That should be the case, but every cluster is a special animal.

As an example, I include here the optionlist and runscript that I use for a machine with PBS.

Optionlist:

VERSION = 2025 #For Einstein Toolkit 2022_11

CPP = cpp
#CC  = gcc
#CXX = g++
CC  = mpicc
CXX = mpic++
You should use gcc and g++ here.

FPP = cpp
#F90 = gfortran
F90 = mpif90
Again, gfortran, not mpif90.
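In other words, something like this (just a sketch of the change; MPI support then comes from MPI_DIR further down, so the build system adds the MPI include and library paths itself):

CC  = gcc
CXX = g++
F90 = gfortran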

CPPFLAGS  = -I/usr/include
CPPFLAGS += -DCCTK_VECTOR_DISABLE_TESTS
CPPFLAGS += -D__USE_ISOC99
CPPFLAGS += -D_GLIBCXX_USE_C99_MATH

LDFLAGS = -L/usr/lib64 -lssl -lcrypto -rdynamic


FPPFLAGS = -traditional

CFLAGS   = -g -std=gnu99
CXXFLAGS = -g -std=gnu++11 -fpermissive
F90FLAGS = -g -fcray-pointer -ffixed-line-length-none

DEBUG = no
CPP_DEBUG_FLAGS =
C_DEBUG_FLAGS   =
CXX_DEBUG_FLAGS =

OPTIMISE = yes
CPP_OPTIMISE_FLAGS =
C_OPTIMISE_FLAGS   = -O2
CXX_OPTIMISE_FLAGS = -O2
F90_OPTIMISE_FLAGS = -O2

PROFILE = no
CPP_PROFILE_FLAGS =
C_PROFILE_FLAGS   = -pg
CXX_PROFILE_FLAGS = -pg
F90_PROFILE_FLAGS = -pg

WARN           = yes
CPP_WARN_FLAGS = -Wall
C_WARN_FLAGS   = -Wall
CXX_WARN_FLAGS = -Wall
F90_WARN_FLAGS = -Wall

OPENMP           = yes
CPP_OPENMP_FLAGS = -fopenmp
FPP_OPENMP_FLAGS = -D_OPENMP
C_OPENMP_FLAGS   = -fopenmp
CXX_OPENMP_FLAGS = -fopenmp
F90_OPENMP_FLAGS = -fopenmp

VECTORISE                = no
VECTORISE_ALIGNED_ARRAYS = no
VECTORISE_INLINE         = yes

PTHREADS_DIR = NO_BUILD

# To find all of these paths, run: ldconfig -p | grep <package-name>
MPI_DIR = /software/TEST/local
#

LAPACK_DIR = /usr
#/usr/lib64/liblapack.so.3
# lapack-3.4.2-8.el7.x86_64

BLAS_DIR = /usr
#/usr/lib64/libblas.so.3
# blas-3.4.2-8.el7.x86_64

HDF5_DIR = /usr
#/usr/lib64/libhdf5.so.8
# hdf5-1.8.12-13.el7.x86_64

HWLOC_DIR = /usr
#/usr/lib64/libhwloc.so.5
# hwloc-1.11.8-4.el7.x86_64

JPEG_DIR = /usr
#/usr/lib64/libjpeg.so.62
#

YAML_DIR = /usr
#/usr/lib64/libyaml-0.so.2
# /usr/lib64/libyaml-0.so.2.0.

ZLIB_DIR = /usr
#/usr/lib64/imlib2/loaders/zlib.so
# zlib-1.2.7-21.el7_9.x86_64

GSL_DIR     = /usr
#/usr/lib64/libgsl.so
# gsl-1.15-13.el7.x86_64

FFTW3_DIR   = /usr
#/usr/lib64/libfftw3.so
# fftw-3.3.3-8.el7.x86_64

PAPI_DIR    = /usr
#/usr/lib64/libpapi.so.5.2.0.0
# papi-5.2.0-26.el7.x86_64
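This is the library from the error at the top of the message. If you decide you don't need PAPI, the quickest experiment may be to drop it from the build and recompile (a sketch; also remove any PAPI thorn from your thornlist if one is listed there), so the executable no longer links against the login node's libpapi:

#PAPI_DIR    = /usr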

XML2_DIR = /usr
#/usr/lib64/libxml2.so.2
# xml2-0.5-7.el7.x86_64

NUMA_DIR = /usr
#/usr/lib64/libnuma.so.1

OPENSSL_DIR = /usr
#/usr/lib64/libssl3.so
# openssl-1.0.2k-26.el7_9.x86_64


Runscript:

#!/bin/bash

set -x
set -e

cd @SIMULATION_DIR@

# Environment setup
source /opt/rh/devtoolset-8/enable

module purge
module load lamod/cmake/3.17
module load lamod/fftw/gnu/3.3.8
module load lamod/openmpi/gnu/4.1.0
module load libraries/gsl/2.6_gnu
module load libraries/hdf5/1.10.5_gnu

It's hard for me to tell whether this is a good environment.

Are you using simfactory to build and run? It's important to make sure you have the same env for both of those tasks.
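One way to keep the two environments in sync (just a sketch; the file name is made up) is to put the module commands in a single file and source it both before building and in the runscript:

# ~/et-env.sh -- single source of truth for the toolchain environment
source /opt/rh/devtoolset-8/enable
module purge
module load lamod/cmake/3.17
module load lamod/fftw/gnu/3.3.8
module load lamod/openmpi/gnu/4.1.0
module load libraries/gsl/2.6_gnu
module load libraries/hdf5/1.10.5_gnu

# on the login node, before building:
#   source ~/et-env.sh && ./simfactory/bin/sim build
# in the runscript, replace the module block above with:
#   source ~/et-env.sh

It may also be worth widening the ldd grep below to include papi, hdf5, and gsl, since those are the libraries in question.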


echo "Environment diagnostics:"
date
hostname
env
ldd @EXECUTABLE@ | grep -E "lapack|blas|openssl|stdc++|gfortran"

# Set runtime parameters
export CACTUS_NUM_PROCS=64
export CACTUS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export GMON_OUT_PREFIX=gmon.out
env | sort > SIMFACTORY/ENVIRONMENT

echo "Starting simulation at $(date)"
export CACTUS_STARTTIME=$(date +%s)

This looks right.
mpirun -np $CACTUS_NUM_PROCS -x LD_LIBRARY_PATH @EXECUTABLE@ -L 3 @PARFILE@

echo "Simulation finished at $(date)"
touch segment.done






On Thu, Sep 18, 2025 at 2:04 PM, Steven Brandt (<[email protected]>) wrote:


    On 9/17/2025 12:11 PM, Omar Elías Velasco Castillo wrote:
    Dear Einstein Toolkit team,

    I hope this message finds you well. I am a beginner with the
    Einstein Toolkit. On personal workstations I have been able to
    compile and run tutorial simulations at low resolution, but I am
    facing problems on two different clusters. I would like to ask
    two questions:

    1. *Are there ET versions prior to 2022_05 (e.g. 2019–2020
    releases) that can still be downloaded and compiled
    successfully?* When I try to fetch them from the website using
    ./GetComponents, the process fails (CactusSourceJar.git is not
    created and some components do not download). Since some of the
    nodes I use have older GCC versions (8 or 10) and limited
    modules, a stable older release might be more practical.

    2. During compilation, I notice that thorns (such as GSL and
    HDF5, for example) fall back to using the bundled versions
    because system modules are not found. The build completes
    successfully, but jobs fail immediately after submission to PBS
    or SLURM queues.
          Can you show us what the error message(s) are?

    What is the role of the bundled versions in this case? If the build
    uses bundled GSL/HDF5, do I still need to load corresponding,
    compatible modules in the runscript?

    Could this mismatch explain why jobs die right after submission?

    Maybe. I'm not 100% sure what you are doing. Can you be clearer
    about how you are running the ET?

    --Steve


    Any advice on handling these issues would be very helpful. Thank
    you very much for your time and support.

    Greetings,

    O.V.





_______________________________________________
Users mailing list
[email protected]
http://lists.einsteintoolkit.org/mailman/listinfo/users
