> > Actually, I didn't ask for it. I only asked for a bug to be fixed. A bug
> > which means > 4 GB Vec cannot be saved into a HDF5 file using PETSc
> > VecView, because chunking *was* introduced, but with insane chunk sizes
> Ah, right.  I would note that using the local size is also flawed
> because we can only have 65k chunks, but we sometimes run jobs with more
> than that number of processes.  Maybe we need something like this?
> 
>   chunk_size = min(vec_size,
>                    max(avg_local_vec_size, vec_size/65k, 10 MiB),
>                    4 GiB)

Argh, messy indeed. Are you sure you mean 65 k and not 64 Ki? I made a small 
table of the situation just to make sure I am not missing anything. In the 
table, "small" means < 4 GiB, "large" means >= 4 GiB, "few" means < 65 k 
ranks, and "many" means >= 65 k ranks. Note that local size > global size is 
impossible, but I include those rows in the table for completeness's sake.

local size      global size     # ranks         resulting chunk size
----------      -----------     -------         --------------------
small           small           few             global size
small           small           many            global size [1]
small           large           few             avg local size
small           large           many            4 GiB
large           small           few             impossible
large           small           many            impossible
large           large           few             4 GiB [2]
large           large           many            global size / 65 k  (65 k chunks)

[1] It sounds improbable that anyone would run a problem with < 4 GiB of data 
on >= 65 k ranks, but fortunately that case poses no problem anyway.

[2] Unless I'm mistaken, this situation always gives < 65 k chunks with a 
4 GiB chunk size, as long as the dataset stays below the total size limit 
discussed below.
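
Just to make sure we read the formula the same way, here is how I would write 
it down as code (only a sketch, with made-up names, sizes in bytes, and 64 Ki 
assumed as the chunk-count limit):

#include <stdint.h>

/* Sketch only, not PETSc code: sizes are in bytes, and the caller would
 * still convert the result to a number of Vec entries before handing it
 * to HDF5.  The 65536 assumes the chunk-count limit really is 64 Ki.    */
static uint64_t chunk_size_bytes(uint64_t vec_size, uint64_t avg_local_size)
{
  const uint64_t MiB        = 1024ULL * 1024ULL;
  const uint64_t GiB        = 1024ULL * MiB;
  const uint64_t max_chunks = 65536ULL;

  uint64_t want = avg_local_size;                 /* roughly one chunk per rank */
  if (want < vec_size / max_chunks)
    want = vec_size / max_chunks;                 /* ...but at most 64 Ki chunks */
  if (want < 10 * MiB)  want = 10 * MiB;          /* ...and no absurdly small chunks */
  if (want > 4 * GiB)   want = 4 * GiB;           /* HDF5's hard per-chunk limit */
  if (want > vec_size)  want = vec_size;          /* one chunk never exceeds the Vec */
  return want;
}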

I also believe your formula gives "the right" answer in each case. Just one 
more question: is "average local size" a good choice, or would "max local 
size" be better? The latter wastes more space in the file, but unless I'm 
mistaken, the former requires extra MPI communication at write time to fill 
in the portions of ranks whose local size is less than the average.
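
(For what it's worth, obtaining either quantity is just one reduction, along 
the lines of the sketch below, with hypothetical names; the extra 
communication I worry about with the average is at write time, when a rank's 
portion straddles chunk boundaries.)

#include <mpi.h>

/* Sketch only: compute both the average and the maximum local size
 * (here in bytes) with one MPI_Allreduce each.                       */
static void local_size_stats(MPI_Comm comm, unsigned long long nlocal,
                             unsigned long long *avg, unsigned long long *max)
{
  unsigned long long sum = 0, mx = 0;
  int nranks;

  MPI_Comm_size(comm, &nranks);
  MPI_Allreduce(&nlocal, &sum, 1, MPI_UNSIGNED_LONG_LONG, MPI_SUM, comm);
  MPI_Allreduce(&nlocal, &mx,  1, MPI_UNSIGNED_LONG_LONG, MPI_MAX, comm);
  *avg = sum / (unsigned long long)nranks;
  *max = mx;
}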

HDF5 really needs to fix this internally. As it stands, a single HDF5 dataset 
cannot hold more than roughly 256 TiB (64 Ki chunks of at most 4 GiB each). 
Not that many people would want such files today, but then again, "640 KiB 
should be enough for everybody", right? I am already running simulations that 
take more than a terabyte of memory, and I am far from the biggest memory 
consumer in the world, so the limit is not as far off as it might seem.
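
For whoever picks this up: both limits bite exactly where the chunk 
dimensions are handed to HDF5. A bare-bones sketch of that spot (plain HDF5, 
not PETSc's actual viewer code; the names and the double datatype are just 
placeholders):

#include <hdf5.h>

/* Sketch only: create a 1-D chunked dataset of nglobal doubles.  The chunk
 * (chunk_entries * sizeof(double)) must stay under the 4 GiB cap, and
 * nglobal / chunk_entries is the chunk count the 65 k / 64 Ki limit
 * applies to.                                                            */
static hid_t create_chunked_dataset(hid_t file, const char *name,
                                    hsize_t nglobal, hsize_t chunk_entries)
{
  hid_t space, dcpl, dset;

  space = H5Screate_simple(1, &nglobal, NULL);
  dcpl  = H5Pcreate(H5P_DATASET_CREATE);
  H5Pset_chunk(dcpl, 1, &chunk_entries);
  dset  = H5Dcreate2(file, name, H5T_NATIVE_DOUBLE, space,
                     H5P_DEFAULT, dcpl, H5P_DEFAULT);
  H5Pclose(dcpl);
  H5Sclose(space);
  return dset;
}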

> I think we're planning to tag 3.4.3 in the next couple weeks.  There
> might be a 3.4.4 as well, but I could see going straight to 3.5.

Ok. I don't see myself having time to fix and test this in two weeks, but 
3.4.4 should be doable. Anyone else want to fix the bug by then?

Cheers,
Juha

-- 
                 -----------------------------------------------
                | Juha Jäykkä, [email protected]                     |
                | http://koti.kapsi.fi/~juhaj/                  |
                 -----------------------------------------------
