Dominic,

at first, you might try to add
call MPI_Barrier(comm,ierr)
between

if (file_is_there .and. irank.eq.iroot) call MPI_FILE_DELETE(file,MPI_INFO_NULL,ierr)

and

call MPI_FILE_OPEN(comm,file,IOR(MPI_MODE_WRONLY,MPI_MODE_CREATE),MPI_INFO_NULL,iunit,ierr)

/* there might be a race condition, i am not sure about that */
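
for illustration, here is a minimal sketch of that ordering (delete on the root rank, barrier, then collective open). the variable names follow the two calls quoted above; the file name, the inquire check and the rest of the program are assumptions made up for this sketch, not taken from your application.

```fortran
program delete_barrier_open
  use mpi
  implicit none
  ! hypothetical file name, used only for this sketch
  character(len=*), parameter :: file = 'barrier_sketch.dat'
  integer, parameter :: iroot = 0
  integer :: comm, irank, iunit, ierr
  logical :: file_is_there

  call MPI_INIT(ierr)
  comm = MPI_COMM_WORLD
  call MPI_COMM_RANK(comm, irank, ierr)

  ! assumption: only the root rank checks for and removes a stale file
  file_is_there = .false.
  if (irank .eq. iroot) inquire(file=file, exist=file_is_there)
  if (file_is_there .and. irank .eq. iroot) &
    call MPI_FILE_DELETE(file, MPI_INFO_NULL, ierr)

  ! no rank enters the collective open until the delete has completed
  call MPI_BARRIER(comm, ierr)

  call MPI_FILE_OPEN(comm, file, IOR(MPI_MODE_WRONLY, MPI_MODE_CREATE), &
                     MPI_INFO_NULL, iunit, ierr)
  call MPI_FILE_CLOSE(iunit, ierr)

  call MPI_FINALIZE(ierr)
end program delete_barrier_open
```

if the problem goes away with the barrier in place, that would point to the delete/open ordering rather than to the file view.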


fwiw, the

STOP A configuration file is required

error message is not coming from OpenMPI.
it might be indirectly triggered by an ompio bug/limitation, or even a bug in your application.

did you get your application to work with another flavor of OpenMPI ?
i.e. are you reporting an OpenMPI bug ?
or are you asking for some help with your application (the bug could be either in your code or in OpenMPI, and you do not know for sure) ?

i am a bit surprised you are using the same fileview_node type with both MPI_INTEGER and MPI_REAL_SP, but since they should be the same size, that might not be an issue.

the subroutine depends on too many external parameters
(nnodes_, fileview_node, ncells_hexa, ncells_hexa_, unstr2str, ...)
so writing a simple reproducer might not be trivial.

i recommend you first write a self contained program that reproduces the issue, and then i will investigate that.
for that, you might want to dump the array sizes and the description of fileview_node in your application, and then hard code them into your self contained program.
also, how many nodes/tasks are you running on, and what filesystem are you using ?
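
as a rough sketch of what dumping the description of a datatype could look like: the program below builds a stand-in vector type (i do not know how fileview_node is really constructed, so the MPI_TYPE_VECTOR call and its arguments are purely hypothetical) and prints its size, extent and envelope. in your application you would run the three query calls on the real fileview_node instead.

```fortran
program dump_fileview
  use mpi
  implicit none
  integer :: fileview_node, irank, ierr
  integer :: type_size, n_int, n_addr, n_type, combiner
  integer(kind=MPI_ADDRESS_KIND) :: lb, extent

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, irank, ierr)

  ! hypothetical stand-in for the application's fileview_node:
  ! 4 blocks of 2 reals, block starts 5 elements apart
  call MPI_TYPE_VECTOR(4, 2, 5, MPI_REAL, fileview_node, ierr)
  call MPI_TYPE_COMMIT(fileview_node, ierr)

  ! number of bytes of actual data described by the type
  call MPI_TYPE_SIZE(fileview_node, type_size, ierr)
  ! lower bound and extent (span in the file, including holes)
  call MPI_TYPE_GET_EXTENT(fileview_node, lb, extent, ierr)
  ! how the type was built and how many arguments describe it
  call MPI_TYPE_GET_ENVELOPE(fileview_node, n_int, n_addr, n_type, &
                             combiner, ierr)

  if (irank .eq. 0) then
    print *, 'size in bytes    :', type_size
    print *, 'lb, extent       :', lb, extent
    print *, 'envelope counts  :', n_int, n_addr, n_type
    print *, 'built as a vector:', combiner .eq. MPI_COMBINER_VECTOR
  end if

  call MPI_TYPE_FREE(fileview_node, ierr)
  call MPI_FINALIZE(ierr)
end program dump_fileview
```

the values printed on each rank, together with the array sizes, should be enough to hard code the same layout into a standalone reproducer.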

Cheers,

Gilles

On 3/16/2016 3:05 PM, Dominic Kedelty wrote:
Gilles,

I do not have the latest mpich available. I tested using openmpi version 1.8.7 as well as mvapich2 version 1.9. Both produced similar errors. I tried the mca flag that you had provided, and it is telling me that a configuration file is needed.

all processes return:

STOP A configuration file is required

I am attaching the subroutine of the code where I believe the problem is occurring.



On Mon, Mar 14, 2016 at 6:25 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

    Dominic,

    this is a ROMIO error message, and ROMIO is from the MPICH project.
    at first, I recommend you try the same test with the latest mpich,
    in order to check
    whether the bug is indeed from romio, and has been fixed in the
    latest release.
    (ompi is a few versions behind the latest romio)

    would you be able to post a trimmed version of your application
    that reproduces the error ?
    that will be helpful to understand what is going on.

    you might also want to give a try to
    mpirun --mca io ompio ...
    and see whether this helps.
    that being said, I think ompio is not considered as production
    ready on the v1.10 series of ompi

    Cheers,

    Gilles


    On Tuesday, March 15, 2016, Dominic Kedelty <dkede...@asu.edu> wrote:

        I am getting the following error using openmpi and I am
        wondering if anyone would have a clue as to why it is happening.
        It is an error coming from openmpi.

        Error in ADIOI_Calc_aggregator(): rank_index(40) >=
        fd->hints->cb_nodes (40) fd_size=213909504 off=8617247540
        Error in ADIOI_Calc_aggregator(): rank_index(40) >=
        fd->hints->cb_nodes (40) fd_size=213909504 off=8617247540
        application called MPI_Abort(MPI_COMM_WORLD, 1) - process 157
        application called MPI_Abort(MPI_COMM_WORLD, 1) - process 477

        Any help would be appreciated. Thanks.


    _______________________________________________
    devel mailing list
    de...@open-mpi.org
    Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
    Link to this post:
    http://www.open-mpi.org/community/lists/devel/2016/03/18697.php




