https://bitbucket.org/petsc/petsc/pull-requests/1551/chunksize-could-overflow-and-become/diff
With this fix I can run with your vector size on 1 process. With 2 processes I get:

    $ petscmpiexec -n 2 ./ex1
    Assertion failed in file adio/common/ad_write_coll.c at line 904:
        (curr_to_proc[p] + len - done_to_proc[p]) == (unsigned) (curr_to_proc[p] + len - done_to_proc[p])
     0   libpmpi.0.dylib   0x0000000111241f3e  backtrace_libc + 62
     1   libpmpi.0.dylib   0x0000000111241ef5  MPL_backtrace_show + 21
     2   libpmpi.0.dylib   0x000000011119f85a  MPIR_Assert_fail + 90
     3   libpmpi.0.dylib   0x00000001111a15f3  MPIR_Ext_assert_fail + 35
     4   libmpi.0.dylib    0x0000000110eee16e  ADIOI_Fill_send_buffer + 1134
     5   libmpi.0.dylib    0x0000000110eefe74  ADIOI_W_Exchange_data + 2980
     6   libmpi.0.dylib    0x0000000110eed7ad  ADIOI_Exch_and_write + 3197
     7   libmpi.0.dylib    0x0000000110eec854  ADIOI_GEN_WriteStridedColl + 2004
     8   libpmpi.0.dylib   0x000000011128ad4b  MPIOI_File_write_all + 1179
     9   libmpi.0.dylib    0x0000000110ec382b  MPI_File_write_at_all + 91
    10   libhdf5.10.dylib  0x00000001108b982a  H5FD_mpio_write + 1466
    11   libhdf5.10.dylib  0x00000001108b127a  H5FD_write + 634
    12   li

This looks like an int overflow in MPIIO. (It is scary to see plain ints in the ADIO code instead of 64-bit integers, but I guess it somehow works; perhaps this is a strange corner case, and I don't know whether the problem is in HDF5 or in MPIIO.) On 4 and 8 processes it runs.

Note that you are playing with a very dangerous size: 32768 * 32768 * 2 overflows to a negative number in a 32-bit int, so this is essentially the largest problem you can run before switching to 64-bit indices for PETSc.
Barry

> On Apr 16, 2019, at 9:32 AM, Sajid Ali via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Hi PETSc developers,
>
> I'm trying to write a large vector created with VecCreateMPI (size 32768x32768) concurrently from 4 nodes (32 tasks per node, 128 MPI ranks in total), and I see the following (indicative) error [full error log is here: https://file.io/CdjUfe]:
>
> HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 52:
>   #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
>     major: Dataset
>     minor: Unable to initialize object
>   #001: H5Dint.c line 329 in H5D__create_named(): unable to create and link to dataset
>     major: Dataset
>     minor: Unable to initialize object
>   #002: H5L.c line 1557 in H5L_link_object(): unable to create new link to object
>     major: Links
>     minor: Unable to initialize object
>   #003: H5L.c line 1798 in H5L__create_real(): can't insert link
>     major: Links
>     minor: Unable to insert object
>   #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
>     major: Symbol table
>     minor: Object not found
>   #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal operator failed
>     major: Symbol table
>     minor: Callback failed
>   #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
>     major: Links
>     minor: Unable to initialize object
>   #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
>     major: Object header
>     minor: Can't open object
>   #008: H5Doh.c line 300 in H5O__dset_create(): unable to create dataset
>     major: Dataset
>     minor: Unable to initialize object
>   #009: H5Dint.c line 1274 in H5D__create(): unable to construct layout information
>     major: Dataset
>     minor: Unable to initialize object
>   #010: H5Dchunk.c line 872 in H5D__chunk_construct(): unable to set chunk sizes
>     major: Dataset
>     minor: Bad value
>   #011: H5Dchunk.c line 831 in H5D__chunk_set_sizes(): chunk size must be < 4GB
>     major: Dataset
>     minor: Unable to initialize object
>
> HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 59:
>   (identical trace, #000 through #011)
> .......
>
> I spoke to Barry last evening, who said that this is a known error that was fixed for DMDA Vecs but is still broken for non-DMDA Vecs.
>
> Could this be fixed?
>
> Thank You,
> Sajid Ali
> Applied Physics
> Northwestern University