Hello,

I am having similar struggles with locking in MPI-IO. I am doing a simple strided write, and it fails because of the locking. I'm a bit behind on the discussion, but is there a way to fix (or work around) this problem? Is this something in my code, or is it the default driver (this is on Lonestar at TACC)? I have even downloaded the most up-to-date version of MPICH, which I believe has a new Lustre ADIO driver, but I am running into the same issues.
Any thoughts would be greatly appreciated!

Phil

On Thu, 8 May 2008, Tom.Wang wrote:

> Hi
>
> Marty Barnaby wrote:
>> To return to this discussion, in recent testing, I have found that
>> writing to a Lustre FS via a higher level library, like PNetCDF, fails
>> because the default value for romio_ds_write is not disable. This
>> is set in the mpich code in the file /src/mpi/romio/adio/common/ad_hints.c
> You can use MPI_Info_set to disable romio_ds_write. What is the failure?
> flock? Data sieving needs flock.
>>
>> I believe it has something to do with locking issues. I'm not sure how
>> best to handle this. I'd prefer the data sieving default be disable,
>> though I don't know all the implications there.
> I agree data sieving should be disabled. Also, it checks for a contiguous
> buftype or filetype only by fileview, which is sometimes not enough, and
> triggers an unnecessary read-modify-write even for a contiguous
> write (especially for those higher level libraries, if you choose
> collective write). Since Lustre has a client cache, and given the overhead
> of flock and read-modify-write, I doubt the performance improvements
> we could get from data sieving on Lustre, although I do not have
> performance data to prove that.
>> Maybe an ad_lustre_open should be a place where the _ds_ hints are
>> set to disable.
> Yes, we should disable this for strided writes in Lustre. ad_lustre_open
> seems the right place to do this.
>
> Thanks
> WangDi
>>
>> Marty Barnaby
>>
>>
>> Weikuan Yu wrote:
>>> Andreas Dilger wrote:
>>>
>>>> On Mar 11, 2008 16:10 -0600, Marty Barnaby wrote:
>>>>
>>>>> I'm not actually sure what ROMIO abstract device the multiple CFS
>>>>> deployments I utilize were defined with. Probably just UFS, or maybe NFS.
>>>>> Did you have a recommended option yourself?
>>>>>
>>>> The UFS driver is the one used for Lustre if no other one exists.
>>>>
>>>>
>>>>> Besides the fact that most of the adio that were created over the years
>>>>> are completely obsolete and could be cleaned from ROMIO, what will the
>>>>> new one for Lustre offer? Particularly with respect to controls via the
>>>>> lfs utility that I can already get?
>>>>>
>>>> There is improved collective IO that aligns the IO on Lustre stripe
>>>> boundaries. Also the hints given to the MPIIO layer (before open,
>>>> not after) result in lustre picking a better stripe count/size.
>>>>
>>>>
>>>
>>> In addition, the one integrated into MPICH2-1.0.7 contains direct I/O
>>> support. Lockless I/O support was purged out due to my lack of
>>> confidence in low-level file system support. But it can be revived when
>>> possible.
>>>
>>> --
>>> Weikuan Yu <+> 1-865-574-7990
>>> http://ft.ornl.gov/~wyu/
>>>
>>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> [email protected]
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>
>
> --
> Regards,
> Tom Wangdi
> --
> Sun Lustre Group
> System Software Engineer
> http://www.sun.com
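
To illustrate why data sieving needs a lock, as the quoted thread mentions, here is a rough sketch in plain C. It is a simplified in-memory model, not ROMIO's actual code: the "file" is just a byte array, and the function names and parameters are made up for illustration. The direct version touches only the strided blocks. The sieved version reads the whole covering extent, patches the blocks, and writes the whole extent back, so the gaps between blocks are rewritten with stale data unless the extent is locked for the duration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Direct strided write (hypothetical helper): copies `count` blocks of
 * `blocklen` bytes from `data` into `file`, starting at `offset` and
 * spaced `stride` bytes apart. Bytes in the gaps are never touched.
 */
void strided_write_direct(unsigned char *file, size_t offset, size_t stride,
                          size_t blocklen, size_t count,
                          const unsigned char *data)
{
    for (size_t i = 0; i < count; i++)
        memcpy(file + offset + i * stride, data + i * blocklen, blocklen);
}

/*
 * Data-sieving style write (hypothetical helper, simplified model):
 *   1. read the contiguous extent covering all blocks into `scratch`,
 *   2. patch the blocks inside `scratch`,
 *   3. write the whole extent back.
 * Step 3 rewrites the gaps with whatever step 1 read, so an update made
 * to a gap by another writer between steps 1 and 3 would be lost. That
 * is the reason a real implementation has to lock the extent.
 * `scratch` must hold at least the extent; returns the bytes written back.
 */
size_t strided_write_sieved(unsigned char *file, size_t offset, size_t stride,
                            size_t blocklen, size_t count,
                            const unsigned char *data, unsigned char *scratch)
{
    if (count == 0)
        return 0;

    size_t extent = (count - 1) * stride + blocklen;

    memcpy(scratch, file + offset, extent);              /* 1. read   */
    for (size_t i = 0; i < count; i++)                   /* 2. modify */
        memcpy(scratch + i * stride, data + i * blocklen, blocklen);
    memcpy(file + offset, scratch, extent);              /* 3. write  */

    return extent;
}
```

With no concurrent writers both functions leave the file in the same state, but the sieved version moves the full extent twice (once in, once out) instead of only the block bytes, which is the read-modify-write overhead the thread refers to.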
