The option list is attached (apollo.cfg).

On 20 October 2015 at 12:50, Ian Hinder <ian.hin...@aei.mpg.de> wrote:
> On 20 Oct 2015, at 13:44, Geraint Pratten <g.prat...@sussex.ac.uk> wrote:
>
>> Hi,
>>
>> I'm having a problem passing the correct number of MPI processes
>> through to Carpet. I'm not sure what I've messed up in the
>> configuration; the error I am getting is:
>>
>>     The environment variable CACTUS_NUM_PROCS is set to 4, but there
>>     are 1 MPI processes. This may indicate a severe problem with the
>>     MPI startup mechanism.
>>
>> I've attached the .run and .sub scripts that I use (apollo.run and
>> apollo.sub). Can anyone see anything obviously wrong? The local
>> cluster uses the Univa Grid Engine for job submission.
>
> Can you also send the option list? Maybe there is a mismatch between
> the MPI installation used at compile time and the one used at run time.
>
>> Thanks in advance!
>> Geraint
>>
>> On 5 October 2015 at 12:34, Geraint Pratten <g.prat...@sussex.ac.uk> wrote:
>>
>>> Ah, I think you could be right! It seems that I've messed up the
>>> submit/run scripts somewhere along the line. The relevant output is
>>> below; I'll try fixing this now.
>>>
>>> Thanks! Much appreciated!
>>> Geraint
>>>
>>> INFO (Carpet): MPI is enabled
>>> INFO (Carpet): Carpet is running on 1 processes
>>> WARNING level 1 from host node207.cm.cluster process 0
>>>   while executing schedule bin (none), routine (no thorn)::(no routine)
>>>   in thorn Carpet, file /mnt/pact/gp234/NR/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:226:
>>>   -> The environment variable CACTUS_NUM_PROCS is set to 6, but there
>>>      are 1 MPI processes. This may indicate a severe problem with the
>>>      MPI startup mechanism.
>>> INFO (Carpet): This is process 0
>>> INFO (Carpet): OpenMP is enabled
>>> INFO (Carpet): This process contains 6 threads, this is thread 0
>>> WARNING level 2 from host node207.cm.cluster process 0
>>>   while executing schedule bin (none), routine (no thorn)::(no routine)
>>>   in thorn Carpet, file /mnt/pact/gp234/NR/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:256:
>>>   -> Although OpenMP is enabled, the environment variable
>>>      CACTUS_NUM_THREADS is not set.
>>> INFO (Carpet): There are 6 threads in total
>>> INFO (Carpet): There are 6 threads per process
>>> INFO (Carpet): This process runs on host node207, pid=29183
>>> INFO (Carpet): This process runs on 24 cores: 0-23
>>> INFO (Carpet): Thread 0 runs on 24 cores: 0-23
>>> INFO (Carpet): Thread 1 runs on 24 cores: 0-23
>>> INFO (Carpet): Thread 2 runs on 24 cores: 0-23
>>> INFO (Carpet): Thread 3 runs on 24 cores: 0-23
>>> INFO (Carpet): Thread 4 runs on 24 cores: 0-23
>>> INFO (Carpet): Thread 5 runs on 24 cores: 0-23
>>>
>>> On 5 October 2015 at 12:27, Ian Hinder <ian.hin...@aei.mpg.de> wrote:
>>>
>>>> On 5 Oct 2015, at 12:43, Geraint Pratten <g.prat...@sussex.ac.uk> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to run a simulation with the same output directory,
>>>>> though I've been removing the simulations after each run while
>>>>> trying to fix the above issue. I should have write access to the
>>>>> file, and there should be sufficient storage.
>>>>
>>>> Can you check the standard output of the run and see whether the
>>>> numbers of processes and threads are what you expect? You should see
>>>> something like "Carpet is running on N processes with M threads".
>>>> Maybe there is a problem in MPI initialisation, and multiple
>>>> processes are trying to write to the same HDF5 file at the same time.
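Ian's check can be run directly from the shell. A minimal sketch, assuming a SimFactory-style layout in which the job's standard output lands under the restart directory; the simulation name and path below are examples, not taken from the attached scripts:

    # Show the process/thread layout Carpet reported at startup
    # (these strings appear in the log quoted above; the path is a
    # hypothetical SimFactory-style location for stdout).
    grep -E "Carpet is running on|threads in total|threads per process" \
        ~/simulations/mysim/output-0000/mysim.out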
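On the compile-time versus run-time MPI mismatch Ian suggests above: if the mpirun on the PATH belongs to a different MPI stack than the one the executable was linked against, each rank can start as an independent singleton that believes it is process 0 of 1, which is exactly what the CACTUS_NUM_PROCS warning reports. A few hedged checks; the executable name exe/cactus_sim is an example, not taken from the thread:

    # Which MPI library is the Cactus executable actually linked against?
    ldd exe/cactus_sim | grep -i mpi

    # Which mpirun is picked up at run time, and from which MPI stack?
    which mpirun && mpirun --version

    # mpirun should start one process per requested rank; if this prints
    # a single line, the launcher itself is misconfigured.
    mpirun -np 4 hostname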
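For comparison, the core of a SimFactory .run script typically exports the counts that Carpet checks at startup and passes a matching -np to mpirun. This is a sketch under the assumption of SimFactory's usual @...@ placeholders, not the contents of apollo.run:

    # SimFactory substitutes the @...@ placeholders at submission time.
    export CACTUS_NUM_PROCS=@NUM_PROCS@
    export CACTUS_NUM_THREADS=@NUM_THREADS@
    export OMP_NUM_THREADS=@NUM_THREADS@

    # -np must agree with CACTUS_NUM_PROCS, and this mpirun must come
    # from the same MPI stack the configuration was compiled against.
    mpirun -np @NUM_PROCS@ @EXECUTABLE@ @PARFILE@

If the Grid Engine allocation in the .sub script (for example the slot count in a #$ -pe directive) disagrees with @NUM_PROCS@, or a different mpirun shadows the intended one, Carpet will see fewer ranks than CACTUS_NUM_PROCS advertises.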
>>>>> Geraint
>>>>>
>>>>> On 5 October 2015 at 11:34, Ian Hinder <ian.hin...@aei.mpg.de> wrote:
>>>>>
>>>>>> On 1 Oct 2015, at 15:49, Erik Schnetter <schnet...@cct.lsu.edu> wrote:
>>>>>>
>>>>>>> Geraint,
>>>>>>>
>>>>>>> Some very fundamental HDF5 operations fail. This could be due to
>>>>>>> a file-system problem: maybe you don't have write access to the
>>>>>>> file, or your disk is full. It could also be that a corrupted
>>>>>>> HDF5 file exists and you are trying to write to it (extend it).
>>>>>>>
>>>>>>> HDF5 files can become corrupted if you have opened one for
>>>>>>> writing and the application is then interrupted or crashes, or
>>>>>>> you run out of disk space. Even if you free up space later, a
>>>>>>> corrupted file remains corrupted.
>>>>>>>
>>>>>>> -erik
>>>>>>
>>>>>> Hi Geraint,
>>>>>>
>>>>>> Are you trying to run a simulation with output in the same
>>>>>> directory as one that was run before?
>>>>>>
>>>>>>> On Thu, Oct 1, 2015 at 9:30 AM, Geraint Pratten
>>>>>>> <g.prat...@sussex.ac.uk> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I recently built the latest release (Hilbert) on the local
>>>>>>>> cluster and I've been getting an HDF5 error. I've tried building
>>>>>>>> Cactus both against the cluster's HDF5 package and by letting
>>>>>>>> Cactus build HDF5 from scratch. Does anyone have any insight
>>>>>>>> into the error below (an extract from a long chain of errors of
>>>>>>>> the same form)?
>>>>>>>>
>>>>>>>> I can dump the output to a file or provide configuration files
>>>>>>>> if needed.
>>>>>>>>
>>>>>>>> Thanks in advance!
>>>>>>>> Geraint
>>>>>>>>
>>>>>>>> ----------
>>>>>>>>
>>>>>>>> HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
>>>>>>>>   #000: H5Ddeprec.c line 193 in H5Dcreate1(): unable to create dataset
>>>>>>>>     major: Dataset
>>>>>>>>     minor: Unable to initialize object
>>>>>>>>   #001: H5Dint.c line 453 in H5D__create_named(): unable to create and link to dataset
>>>>>>>>     major: Dataset
>>>>>>>>     minor: Unable to initialize object
>>>>>>>>   #002: H5L.c line 1638 in H5L_link_object(): unable to create new link to object
>>>>>>>>     major: Links
>>>>>>>>     minor: Unable to initialize object
>>>>>>>>   #003: H5L.c line 1882 in H5L_create_real(): can't insert link
>>>>>>>>     major: Symbol table
>>>>>>>>     minor: Unable to insert object
>>>>>>>>   #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
>>>>>>>>     major: Symbol table
>>>>>>>>     minor: Object not found
>>>>>>>>   #005: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
>>>>>>>>     major: Symbol table
>>>>>>>>     minor: Callback failed
>>>>>>>>   #006: H5L.c line 1674 in H5L_link_cb(): name already exists
>>>>>>>>     major: Symbol table
>>>>>>>>     minor: Object already exists
>>>>>>>> WARNING level 1 from host node207.cm.cluster process 0
>>>>>>>>   while executing schedule bin (none), routine (no thorn)::(no routine)
>>>>>>>>   in thorn CarpetIOHDF5, file /mnt/pact/gp234/NR/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:677:
>>>>>>>>   -> HDF5 call 'dataset = H5Dcreate (outfile, datasetname.str().c_str(), filedatatype, dataspace, plist)' returned error code -1
>>>>>>>> WARNING level 1 from host node207.cm.cluster process 0
>>>>>>>>   while executing schedule bin (none), routine (no thorn)::(no routine)
>>>>>>>>   in thorn CarpetIOHDF5, file /mnt/pact/gp234/NR/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:689:
>>>>>>>>   -> HDF5 call 'H5Dwrite (dataset, memdatatype, H5S_ALL, H5S_ALL, H5P_DEFAULT, data)' returned error code -1
>>>>>>>> WARNING level 1 from host node207.cm.cluster process 0
>>>>>>>>   while executing schedule bin (none), routine (no thorn)::(no routine)
>>>>>>>>   in thorn CarpetIOHDF5, file /mnt/pact/gp234/NR/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:725:
>>>>>>>>   -> HDF5 call 'attr = H5Acreate (dataset, "level", H5T_NATIVE_INT, dataspace, H5P_DEFAULT)' returned error code -1
>>>>>>>> WARNING level 1 from host node207.cm.cluster process 0
>>>>>>>>   while executing schedule bin (none), routine (no thorn)::(no routine)
>>>>>>>>   in thorn CarpetIOHDF5, file /mnt/pact/gp234/NR/Cactus/arrangements/Carpet/CarpetIOHDF5/src/Output.cc:726:
>>>>>>>>   -> HDF5 call 'H5Awrite (attr, H5T_NATIVE_INT, &refinementlevel)' returned error code -1
>>>>>>>> WARNING level 1 from host node207.cm.cluster process 0
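Erik's corruption scenario can be probed without rerunning: a file damaged by an interrupted run will generally fail a recursive listing. It is also worth noting that the final "name already exists" frame in the trace above is what one would expect if several processes each believe they are rank 0 (the MPI startup problem discussed at the top of the thread) and each tries to create the same dataset in the same file, which matches Ian's hypothesis. A minimal probe, with an example file name:

    # h5ls ships with HDF5; a clean recursive walk is a reasonable,
    # though not exhaustive, integrity check. The file name is an example.
    h5ls -r output-0000/phi.xy.h5 > /dev/null \
        && echo "file traverses cleanly" \
        || echo "file is damaged, truncated, or not HDF5"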
--
Geraint Pratten
Postdoctoral Research Associate

Mobile: +44(0) 7581709282
E-mail: g.prat...@sussex.ac.uk
Skype: geraint.pratten

School of Mathematical and Physical Sciences
Pevensey 3 Building
University of Sussex
Falmer Campus
Brighton BN1 9QH
United Kingdom
[Attachment: apollo.cfg (binary data)]
_______________________________________________
Users mailing list
Users@einsteintoolkit.org
http://lists.einsteintoolkit.org/mailman/listinfo/users