Hi,
I am working on the following problem:
My code produces about 20,000 datasets, which should be placed pairwise into
groups, resulting in 10,000 groups with 2 datasets each. Each pair of datasets
is calculated by a single compute node, so only one node needs to write to a
given dataset, without any data from other nodes. The size of one dataset is
about 500 kByte.
My approach for doing so is the following:
1) I open the file in parallel mode.
2) All nodes loop over all groups and datasets, creating and closing them
collectively.
3) Then I loop over the calculation of the datasets as
for (int i = rank; i < 10000; i += size) { /* calculate the data of dataset i */ }
4) Each node writes exclusively into a single dataset, with the transfer
property list set to INDEPENDENT.
So much for the idea. I am profiling the code with MPE, and it works fine for
a small number of nodes, but with more nodes it gets much worse, to the point
where a single node doing the calculation and writing serially, with the
remaining nodes idle, would be faster.
I am stuck now on getting good performance that scales nicely with the number
of cores.
Any help or tips are appreciated,
Sven
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org