Phil,

If you are having the same problems I've had, I would suggest trying the advice that some have given below. I am working with several layers of which I am not the owner, but I have the source and can make edits. For me, it is reasonable to call MPI_Info_set explicitly during initialization and set the hints romio_ds_write and romio_ds_read both to 'disable'. The best documentation I have found on the matter is how these defaults are initialized in the ROMIO code in adio/common/ad_hints.c (for these two specifically, 'enable'). I've never seen anything describing all the available hints, or the syntax and semantics of their acceptable values.

I don't fully understand data sieving, but I believe it is an older paradigm that is not applicable to our current high-performance, widely distributed, parallel file systems. My suggestion was that, at least here with Lustre and its new abstract device routines, the _ds_ hints be set to disable by default, so I don't have to find a place in every new library I deal with to set them explicitly myself.


Marty



Phil Dickens wrote:
Hello,

  I am having similar struggles with locking in MPI-IO.
I am doing a simple strided write, and it fails because
of the locking. I'm a bit behind in the discussion, but
is there a way to fix (or work around) this problem? Is this
something in my code, or in the default driver (this is on
Lonestar at TACC)? I have even downloaded the most up-to-date
version of MPICH, which I believe has a new Lustre ADIO
driver, but I am running into the same issues.

  Any thoughts would be greatly appreciated!

Phil


On Thu, 8 May 2008, Tom.Wang wrote:

Hi

Marty Barnaby wrote:
To return to this discussion: in recent testing, I have found that
writing to a Lustre FS via a higher-level library, like PNetCDF, fails
because the default value for romio_ds_write is not disable. This
is set in the MPICH code in the file src/mpi/romio/adio/common/ad_hints.c
You can use MPI_Info_set to disable romio_ds_write.  What is the failure?
flock? Data sieving needs flock.
I believe it has something to do with locking issues. I'm not sure how
best to handle this; I'd prefer that the data-sieving default be disable,
though I don't know all the implications there.
I agree data sieving should be disabled. Also, it checks whether the
buftype or filetype is contiguous only via the fileview, which is not
always enough, and it triggers unnecessary read-modify-write even for
contiguous writes (especially from those higher-level libraries, if you
choose collective write). Since Lustre has a client cache, plus the
overhead of flock and read-modify-write, I doubt the performance
improvements we could get from data sieving on Lustre, although I do not
have performance data to prove that.
Maybe ad_lustre_open would be a place where the _ds_ hints are
set to disable.
Yes, we should disable this for strided writes on Lustre. ad_lustre_open
seems like the right place to do this.

Thanks
WangDi
Marty Barnaby


Weikuan Yu wrote:
Andreas Dilger wrote:

On Mar 11, 2008  16:10 -0600, Marty Barnaby wrote:

I'm not actually sure which ROMIO abstract device the multiple CFS
deployments I utilize were defined with. Probably just UFS, or maybe NFS.
Do you have a recommended option yourself?

The UFS driver is the one used for Lustre if no other one exists.


Besides the fact that most of the ADIO drivers created over the years are
completely obsolete and could be cleaned out of ROMIO, what will the new one
for Lustre offer? Particularly with respect to the controls I can already
get via the lfs utility?

There is improved collective I/O that aligns the I/O on Lustre stripe
boundaries.  Also, hints given to the MPI-IO layer (before open,
not after) result in Lustre picking a better stripe count/size.


In addition, the one integrated into MPICH2-1.0.7 contains direct I/O
support. Lockless I/O support was removed due to my lack of
confidence in the low-level file system support, but it can be revived
when possible.

--
Weikuan Yu <+> 1-865-574-7990
http://ft.ornl.gov/~wyu/


------------------------------------------------------------------------

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss

--
Regards,
Tom Wangdi
--
Sun Lustre Group
System Software Engineer
http://www.sun.com




