Matthew, Sam-
FYI, using jumbo frames is not necessarily that simple. The IBM file servers that came with our BG/L don't support jumbo frames on the internal NICs. Ours are IBM x346 type 8840, and the built-in Broadcom NICs couldn't handle jumbo frames; I imagine other xSeries boxes with integrated Broadcom NICs may have similar issues. We ended up buying PCI network cards in order to implement jumbo frames in our environment. Also, you'll need to make sure your network switch can handle jumbo frames (ours is a Force10; I don't know the exact model off the top of my head, but it supports jumbo frames).
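A quick way to find out whether a given NIC will even accept a jumbo MTU is just to try setting one; the interface name and size below are only examples for your setup:

# A driver without jumbo support fails here with
# "SIOCSIFMTU: Invalid argument" and keeps its old MTU.
ifconfig eth0 mtu 8000
# Verify what actually took effect:
ifconfig eth0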
The other thing you need to be aware of is that switching to jumbo frames is an all-or-nothing proposition: if you do it, you'll have to do it for *all* of the hardware on the involved network segment. You can't just change a couple of servers.
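Once everything is switched over, an easy end-to-end sanity check is a ping with the don't-fragment bit set. The hostname below is a placeholder, and the payload size is the MTU minus 28 bytes of IP+ICMP headers:

# 8000-byte MTU - 20 (IP header) - 8 (ICMP header) = 7972
ping -M do -s 7972 fileserver
# A "Message too long" error (or no replies) means something on
# the path still can't pass 8000-byte frames.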
I'm Cc:ing a couple of folks at Argonne who worked on getting jumbo frames working for our environment; they might be able to warn you of any other gotchas. We're using 8000-byte frames, but if I were starting from scratch I'd try something closer to 8300 so that an entire 8192-byte NFS packet can fit in a single frame, avoiding fragmentation if you're using an 8192-byte NFS rsize/wsize. Note, 8300 is just a ballpark guess that I haven't been able to confirm.
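For what it's worth, the rough arithmetic behind that guess (the RPC/NFS overhead figure is just an estimate):

  8192  NFS data (rsize/wsize)
+   20  IP header
+   20  TCP header (more with options enabled)
+  ~60  RPC + NFS header overhead (rough estimate)
------
 ~8292  so an MTU around 8300 should avoid fragmentation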
Be warned: in our environment, we started to have problems as we got close to 9000-byte frames, so don't go too high.
-Andrew Cherry
BG/L Support
Argonne National Laboratory
On Apr 20, 2007, at 5:02 PM, Sam Lang wrote:
Hi Matthew,
I think the version of PVFS in the Zepto release is pvfs2-1.5.1.
Besides some performance improvements in the latest release
(pvfs-2.6.3), there was a specific bugfix made in PVFS for largish
mpi-io jobs. If you could try the latest (at http://www.pvfs.org/),
it would help us to verify that you're not running into the same
problem.
Regarding config options for PVFS on BGL, make sure you have jumbo frames enabled, i.e.:
ifconfig eth0 mtu 8000 up
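To make that persist across reboots, you can usually put the MTU in the distro's interface config. On Red Hat-style systems it's something like the following (path and values are illustrative; Debian-style systems use an "mtu 8000" line in /etc/network/interfaces instead):

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
MTU=8000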
Also, you should probably set the TCP buffer sizes explicitly in the PVFS config file, fs.conf:
<Defaults>
...
TCPBufferSend 524288
TCPBufferReceive 1048576
</Defaults>
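One caveat: the kernel clamps socket buffers to the net.core.rmem_max/wmem_max ceilings, so values that large can get silently trimmed unless you raise the limits. The numbers below just mirror the fs.conf settings above:

# Raise the per-socket buffer ceilings (run as root) so the
# PVFS buffer sizes above aren't silently clamped:
sysctl -w net.core.wmem_max=524288
sysctl -w net.core.rmem_max=1048576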
You might also see better performance with an alternative trove method for doing disk I/O:
<StorageHints>
...
TroveMethod alt-aio
</StorageHints>
Thanks,
-sam
On Apr 20, 2007, at 4:25 PM, Matthew Woitaszek wrote:
Good afternoon,
Michael Oberg and I are attempting to get PVFS2 working on NCAR's 1-rack BlueGene/L system using ZeptoOS. We ran into a snag at over 8 BG/L I/O nodes (>256 compute nodes).
We've been using the mpi-io-test program shipped with PVFS2 to test the system. For cases up to and including 8 I/O nodes (256 coprocessor or 512 virtual node mode tasks), everything works fine. Larger jobs fail with file-not-found error messages, such as:
MPI_File_open: File does not exist, error stack:
ADIOI_BGL_OPEN(54): File /pvfs2/mattheww/_file_0512_co does not exist
The file is created on the PVFS2 filesystem and has a zero-byte size. We've run the tests with 512 tasks on 256 nodes, and it successfully created an 8589934592-byte file. Going to 257 nodes fails.
Has anyone seen this behavior before? Are there any PVFS2 server or client configuration options that you would recommend for a BG/L installation like this?
Thanks for your time,
Matthew
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users