Hi Folks,

I have a run on 256 PEs writing onto a Lustre file system with the following code:

[snip]
  integer :: mype,npe,pe_min,pe_max,pe_prev,pe_next,mpi_my_real, &
             comm=mpi_comm_world,status(mpi_status_size),error, &
             mpi_realsize, thefile
  integer (kind=MPI_OFFSET_KIND) disp

  logical :: pe0,prl


! *************************************************************************

    call mpi_init(error)
    call mpi_comm_rank(comm,mype,error)
    call mpi_comm_size(comm, npe,error)

    call mpi_type_extent(mpi_real, mpi_realsize, error);
    call mpi_type_size(MPI_REAL8, mpi_realsize, error)

    pe0=mype==0

. 
. 
. 
     disp = mype*lu*mpi_realsize

     call mpi_barrier(comm,error)
     call mpi_file_open(comm,'output-parallel/dump.dat', &
                        MPI_MODE_RDONLY, mpi_info_null, thefile, error)
     call mpi_file_write_at(thefile, disp, u(1,nx1,ny1,nz1), lu, &
                            MPI_REAL8, mpi_status_ignore, error)
     call mpi_file_close(thefile, error)
     call mpi_barrier(comm,error)


[snip]

where lu is an integer that does not itself exceed the 32-bit limit. As
soon as the output file grows beyond the 32-bit limit, i.e. beyond 2**31
bytes (roughly 2.1 GBytes), I get a file of only 327 MBytes instead of
the expected 181 GBytes for a checkpoint. This of course leads to a
segfault when restarting. I am afraid this has something to do with the
32-bit limit: the file offset (disp in my code) passed to
mpi_file_write_at might be calculated incorrectly.
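
The suspected overflow can be illustrated with a minimal, MPI-free sketch.
It assumes default integers are 32 bits wide, uses int64 as a stand-in for
MPI_OFFSET_KIND, and picks a hypothetical value for lu (the real value is
not shown above) so that the total comes out near the expected 181 GBytes.
It evaluates the offset of the highest rank in 64-bit arithmetic and checks
whether that value would still fit into a default integer:

```fortran
program offset_check
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none

  ! Hypothetical values for illustration only; lu is not given in the post.
  integer(int32) :: mype     = 255        ! highest rank in a 256-PE run
  integer(int32) :: lu       = 88000000   ! assumed number of elements per rank
  integer(int32) :: realsize = 8          ! bytes per MPI_REAL8
  integer(int64) :: disp                  ! stand-in for integer(kind=MPI_OFFSET_KIND)

  ! Convert every operand first, so the product is evaluated in 64 bits
  ! rather than in default-integer arithmetic.
  disp = int(mype, int64) * int(lu, int64) * int(realsize, int64)

  print *, 'offset in bytes      :', disp
  print *, 'largest 32-bit value :', huge(0_int32)
  print *, 'fits in 32 bits?     :', disp <= int(huge(0_int32), int64)
end program offset_check
```

If the last line prints F for the real values, then the expression
mype*lu*mpi_realsize in the code above would be evaluated in default-integer
arithmetic and could overflow before the result is assigned to disp, even
though disp itself is declared with MPI_OFFSET_KIND. One possible remedy,
under that assumption, would be to convert the operands first, for example
disp = int(mype, MPI_OFFSET_KIND) * int(lu, MPI_OFFSET_KIND) * int(mpi_realsize, MPI_OFFSET_KIND).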

Any ideas on how I can narrow down the cause of the error, or - even better
- on how to solve it?

Best wishes

Alexander
