Thanks in advance!

The selection of each process actually stays the same size, since
region_count is not changing.

The result of running "lfs getstripe filename | grep stripe" is:

lmm_stripe_count:   4
lmm_stripe_size:    1048576
lmm_stripe_offset:  286

I will check on the second point (MPI-I/O aggregation) and get back to you.
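
In the meantime, to see which hints the MPI-I/O layer actually ended up
using, I believe something like the following should work. This is only a
sketch: print_mpio_hints is just a helper name, it assumes the file was
opened with the MPIO driver, and it reaches the underlying MPI_File through
H5Fget_vfd_handle (passing H5P_DEFAULT for the fapl argument, which should
be acceptable for this driver):

#include <hdf5.h>
#include <mpi.h>
#include <stdio.h>

/* Print the MPI-I/O hints in effect for an open, MPIO-driven HDF5 file. */
void print_mpio_hints(hid_t file_id)
{
  MPI_File *fh;
  MPI_Info info;
  int nkeys;

  if (H5Fget_vfd_handle(file_id, H5P_DEFAULT, (void **)&fh) < 0)
    return;
  MPI_File_get_info(*fh, &info);
  MPI_Info_get_nkeys(info, &nkeys);
  for (int i = 0; i < nkeys; ++i) {
    char key[MPI_MAX_INFO_KEY], value[MPI_MAX_INFO_VAL];
    int flag;
    MPI_Info_get_nthkey(info, i, key);
    MPI_Info_get(info, key, MPI_MAX_INFO_VAL, value, &flag);
    if (flag)
      printf("%s = %s\n", key, value);
  }
  MPI_Info_free(&info);
}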

On Wed, May 30, 2012 at 11:01 AM, Mohamad Chaarawi [via hdf-forum] wrote:

> Hi Yucong,
>
> On 5/30/2012 12:33 PM, Yucong Ye wrote:
>
> The region_index changes according to the MPI rank, while the region_count
> stays the same: {16, 16, 16}.
>
>
> OK, I just needed to make sure that the selections for each process are
> done such that they are compatible with the scaling being done (as the
> number of processes increases, the selection of each process should
> decrease accordingly). The performance numbers you provided are indeed
> troubling, but that could be for several reasons, some being:
>
>    - The stripe size & count of your file on Lustre could be too small.
>      Although this is a read operation (no file locking is done by the
>      OSTs), increasing the number of I/O processes puts too much burden on
>      the OSTs. Could you check those two parameters of your file? You can
>      do that by running this on the command line:
>
>          lfs getstripe filename | grep stripe
>
>    - The MPI-I/O implementation is not doing aggregation. If you are using
>      ROMIO, two-phase I/O should do this for you; by default it sets the
>      number of aggregators to the number of nodes (not processes). I would
>      also try increasing cb_buffer_size (the default is 4 MB); see the
>      sketch right after this list.
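>
>    To experiment with the second point, MPI-I/O hints can be passed through
>    an MPI_Info object when setting up the file access property list. Here
>    is a minimal sketch (the hint names cb_buffer_size and romio_cb_read are
>    ROMIO-specific and may be silently ignored by other MPI-I/O
>    implementations; the values shown are just a starting point):
>
>        MPI_Info info;
>        MPI_Info_create(&info);
>        /* 16 MB collective buffer instead of the 4 MB default (bytes) */
>        MPI_Info_set(info, "cb_buffer_size", "16777216");
>        /* make sure collective buffering (aggregation) is used on reads */
>        MPI_Info_set(info, "romio_cb_read", "enable");
>
>        hid_t acc_tpl = H5Pcreate(H5P_FILE_ACCESS);
>        H5Pset_fapl_mpio(acc_tpl, MPI_COMM_WORLD, info);
>        hid_t file_id = H5Fopen(filename, H5F_ACC_RDONLY, acc_tpl);
>        H5Pclose(acc_tpl);
>        MPI_Info_free(&info);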
>
> Thanks,
> Mohamad
>
>  On May 30, 2012 8:19 AM, "Mohamad Chaarawi" wrote:
>
>> Hi Chrisyeshi,
>>
>> Is the region_index & region_count the same on all processes, i.e., are
>> you just reading the same data on all processes?
>>
>> Mohamad
>>
>> On 5/29/2012 3:02 PM, chrisyeshi wrote:
>>
>>> Hi,
>>>
>>> I am having trouble reading from a 721 GB file using 4096 nodes.
>>> When I test with a few nodes it works, but when I test with more nodes,
>>> it takes significantly more time.
>>> All the test program does is read in the data and then delete it.
>>> Here's the timing information:
>>>
>>> Nodes    |    Time for running entire program
>>>    16    |     4:28
>>>    32    |     6:55
>>>    64    |     8:56
>>>   128    |    11:22
>>>   256    |    13:25
>>>   512    |    15:34
>>>
>>>   768    |    28:34
>>>   800    |    29:04
>>>
>>> I am running the program on a Cray XK6 system, and the file system is
>>> Lustre.
>>>
>>> *There is a big gap after 512 nodes, and with 4096 nodes it couldn't
>>> finish in 6 hours.
>>> Is this normal? Shouldn't it be a lot faster?*
>>>
>>> Here is my reading function; it's similar to the sample HDF5 parallel
>>> program:
>>>
>>> #include <hdf5.h>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <assert.h>
>>>
>>> void readData(const char* filename, int region_index[3], int
>>> region_count[3], float* flow_field[6])
>>> {
>>>   char attributes[6][50];
>>>   sprintf(attributes[0], "/uvel");
>>>   sprintf(attributes[1], "/vvel");
>>>   sprintf(attributes[2], "/wvel");
>>>   sprintf(attributes[3], "/pressure");
>>>   sprintf(attributes[4], "/temp");
>>>   sprintf(attributes[5], "/OH");
>>>
>>>   herr_t status;
>>>   hid_t file_id;
>>>   hid_t dset_id;
>>>   // open file spaces
>>>   hid_t acc_tpl = H5Pcreate(H5P_FILE_ACCESS);
>>>   status = H5Pset_fapl_mpio(acc_tpl, MPI_COMM_WORLD, MPI_INFO_NULL);
>>>   file_id = H5Fopen(filename, H5F_ACC_RDONLY, acc_tpl);
>>>   status = H5Pclose(acc_tpl);
>>>   for (int i = 0; i < 6; ++i)
>>>   {
>>>     // open dataset
>>>     dset_id = H5Dopen(file_id, attributes[i], H5P_DEFAULT);
>>>
>>>     // get dataset space
>>>     hid_t spac_id = H5Dget_space(dset_id);
>>>     hsize_t htotal_size3[3];
>>>     status = H5Sget_simple_extent_dims(spac_id, htotal_size3, NULL);
>>>     hsize_t region_size3[3] = {htotal_size3[0] / region_count[0],
>>>                                htotal_size3[1] / region_count[1],
>>>                                htotal_size3[2] / region_count[2]};
>>>
>>>     // hyperslab
>>>     hsize_t start[3] = {region_index[0] * region_size3[0],
>>>                         region_index[1] * region_size3[1],
>>>                         region_index[2] * region_size3[2]};
>>>     hsize_t count[3] = {region_size3[0], region_size3[1], region_size3[2]};
>>>     status = H5Sselect_hyperslab(spac_id, H5S_SELECT_SET, start, NULL,
>>>                                  count, NULL);
>>>     hid_t memspace = H5Screate_simple(3, count, NULL);
>>>
>>>     // read
>>>     hid_t xfer_plist = H5Pcreate(H5P_DATASET_XFER);
>>>     status = H5Pset_dxpl_mpio(xfer_plist, H5FD_MPIO_COLLECTIVE);
>>>
>>>     flow_field[i] = (float *) malloc(count[0] * count[1] * count[2] *
>>>                                      sizeof(float));
>>>     status = H5Dread(dset_id, H5T_NATIVE_FLOAT, memspace, spac_id,
>>>                      xfer_plist, flow_field[i]);
>>>
>>>     // clean up
>>>     H5Dclose(dset_id);
>>>     H5Sclose(spac_id);
>>>     H5Sclose(memspace);  // close the memory dataspace for this iteration
>>>     H5Pclose(xfer_plist);
>>>   }
>>>   H5Fclose(file_id);
>>> }
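>>>
>>> For reference, a minimal sketch of how the function might be driven; the
>>> filename and the rank-to-region mapping here are placeholders, and it
>>> assumes exactly region_count[0]*region_count[1]*region_count[2] ranks,
>>> one region per rank:
>>>
>>> #include <mpi.h>
>>> #include <stdlib.h>
>>>
>>> void readData(const char* filename, int region_index[3],
>>>               int region_count[3], float* flow_field[6]);
>>>
>>> int main(int argc, char* argv[])
>>> {
>>>   MPI_Init(&argc, &argv);
>>>   int rank;
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>
>>>   int region_count[3] = {16, 16, 16};
>>>   /* placeholder mapping: linear rank -> (i, j, k) on the 16x16x16 grid */
>>>   int region_index[3] = {rank / (16 * 16), (rank / 16) % 16, rank % 16};
>>>
>>>   float* flow_field[6];
>>>   readData("field.h5", region_index, region_count, flow_field);
>>>
>>>   for (int i = 0; i < 6; ++i)
>>>     free(flow_field[i]);
>>>   MPI_Finalize();
>>>   return 0;
>>> }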
>>>
>>> *Do you see any problem with this function? I am new to parallel HDF5.*
>>>
>>> Thanks in advance!
>>>
>
>


_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
