How much data will each client write in each I/O?
With this driver, you can specify stripe_size, stripe_count and stripe_index
via MPI_Info_set. You may need to build the driver into MPICH2 with
--with-file-system=lustre, so that the UFS ADIO driver is not used instead.
You may also need the "lustre:" prefix in front of your file name so that
this driver is actually selected. (CC to WeiKuan, who should be able to
provide more details.)
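
For example, a minimal sketch of how the hints could be passed (the hint
key names here are the generic ROMIO ones, striping_unit / striping_factor /
start_iodevice, and the path and values are only illustrative; the new
driver may expect different keys, so check its documentation):

/* Sketch: set striping hints via MPI_Info_set and open the file with the
 * "lustre:" prefix so the Lustre ADIO driver is used instead of UFS.
 * Hint key names and values below are assumptions for illustration. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Info_create(&info);

    MPI_Info_set(info, "striping_unit",   "4194304"); /* stripe_size: 4 MB (example) */
    MPI_Info_set(info, "striping_factor", "160");     /* stripe_count */
    MPI_Info_set(info, "start_iodevice",  "0");       /* stripe index / starting OST */

    /* "lustre:" prefix selects the Lustre ADIO driver. */
    MPI_File_open(MPI_COMM_WORLD, "lustre:/scratch_grande/testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* ... collective writes go here ... */

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}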
Thanks
WangDi
Marty Barnaby wrote:
In this case I'm running what I think is the simplest collective I/O
case. All processors are writing their respective buffers concurrently
to the same file, though without two-phase aggregation or any other
special hints. I'm not aware of any non-POSIX Lustre API. What will the
ADIO driver have beyond the standard UFS options?
MLB
wangdi wrote:
There is a new Lustre ADIO driver (bug 12521) with some optimizations
for collective writes (MPI_File_write_at_all). If you are interested,
you can try it. Could you please tell us more about this benchmark
program? For example: its I/O pattern, whether the clients share one
file or each write a separate file, and how many clients take part in
the I/O?
Thanks
Marty Barnaby wrote:
I'm attempting to establish an absolute maximum byte-rate performance
value, running a bare-bones MPI_File_write_at_all benchmark program, for
our Cray XT3 installation, RedStorm, here at Sandia National
Laboratories. Processor time is at a premium, and I only run in the
standard queue, so I'm not able to do everything I would imagine, though
maybe what I can run is adequate.
I have a directory under our Lustre file system, redstorm:/scratch_grande,
which I have defined with:
lfs setstripe -1 0 -1
Though there are 320 OSTs comprising the FS, these defaults give me a
stripe_count of 160 (I'm sure someone could explain that), and I don't
know the stripe_size. With a job of 160 processors, each of which has a
contiguous chunk of 20 MB of memory to append to an open file in an
iterative series of singular, atomic write_all operations, I can
normally average 25 GB/s. To curb any confusion here, that represents
only an experimental maximum to me; none of our many, complex science
and engineering simulation applications perform their output dumping
with per-processor blocks as large as a single MB.
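
Roughly, each iteration follows the pattern sketched below (this is
illustrative only, not my actual benchmark source; the path, iteration
count, and variable names are made up):

/* Sketch of the write pattern described above: each rank appends one
 * contiguous 20 MB block per iteration with a collective
 * explicit-offset write. */
#include <mpi.h>
#include <stdlib.h>

#define BLOCK_BYTES (20 * 1024 * 1024)   /* 20 MB per processor per write */
#define NUM_ITERS   10                   /* number of write_all iterations */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_File fh;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    buf = malloc(BLOCK_BYTES);           /* contiguous 20 MB chunk */

    MPI_File_open(MPI_COMM_WORLD, "/scratch_grande/bench.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    for (int i = 0; i < NUM_ITERS; i++) {
        /* Each rank writes its block at a disjoint offset; the file grows
         * by nprocs * BLOCK_BYTES every iteration. */
        MPI_Offset off = ((MPI_Offset)i * nprocs + rank) * BLOCK_BYTES;
        MPI_File_write_at_all(fh, off, buf, BLOCK_BYTES, MPI_BYTE,
                              MPI_STATUS_IGNORE);
    }

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}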
I would like any succinct suggestions on explicitly setting my lfs
stripe_size, given the configuration and parameters I've mentioned here,
to optimize it and perhaps see a decrease in the time spent storing my
data on the FS.
Thank you,
Marty Barnaby
--
Regards,
Tom Wangdi
--
Cluster File Systems, Inc
Software Engineer
http://www.clusterfs.com
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss