On Apr 23, 2007, at 12:37 PM, Matthew Woitaszek wrote:
Sam and Nathan,
Thanks for your quick replies, and thank you for your suggestions. I tried them both, but it turned out that it was a BlueGene/L mount problem, and not a PVFS2 problem at all.
On Friday, we turned on additional debugging messages, and the following messages appeared in the PVFS2 logs on the metadata server:
[A 04/20 17:36] [EMAIL PROTECTED] H=1048576 S=0x2a96943530: lookup_path: path: mattheww, handle: 1047133
[A 04/20 17:36] [EMAIL PROTECTED] H=1048576 S=0x2a96943530: lookup_path: finish (Success)
[A 04/20 17:36] [EMAIL PROTECTED] H=1047133 S=0x2a967e0470: lookup_path: path: _file_258_co, lookup failed
[A 04/20 17:36] [EMAIL PROTECTED] H=1047133 S=0x2a967e0470: lookup_path: finish (No such file or directory)
This had us convinced that it was a problem related to the metadata server, and that all of the clients were sending requests properly. To investigate further, we ran a very simple program that just barriers, calls fopen once for each process with a filename based on rank, and sums the number of file handles that were returned. Sure enough, with more than 256 tasks, some of the fopen() calls didn't return a valid file handle.
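(For reference, a minimal sketch of that kind of per-rank fopen test -- the mount path and filename pattern here are just placeholders, not the exact ones we used -- looks something like this:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nprocs, ok, total;
    char path[256];
    FILE *fp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Barrier(MPI_COMM_WORLD);

    /* One file per rank, named by rank (placeholder path). */
    snprintf(path, sizeof(path), "/pvfs2/testdir/rank_%05d", rank);
    fp = fopen(path, "w");
    ok = (fp != NULL) ? 1 : 0;
    if (fp)
        fclose(fp);

    /* Count how many ranks actually got a file back. */
    MPI_Reduce(&ok, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%d of %d ranks opened a file successfully\n", total, nprocs);

    MPI_Finalize();
    return 0;
}
)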
With that, it didn't seem to be an MPI-IO problem. Looking at client logs, we found that when booting more than eight 32-node partitions, some of the partitions weren't properly mounting PVFS2. A few changes to remount if required during the boot process fixed the problem. Since then, everything's worked fine. I suppose the moral is: "Make sure that your clients are mounting the file system!" Those "lookup failed" messages were quite perplexing and definitely led me to look in the wrong place first.
Matthew,
Thanks for the report. Nice to hear that it's not a bug somewhere in the PVFS/ROMIO path. :-)
It might be helpful to disable read and write permissions on the pvfs mountpoint (/pvfs) on the IO nodes when the pvfs volume isn't mounted -- in fact you can probably just do chmod 000 /pvfs. This will prevent your apps from writing to or reading from the /pvfs directory if it's not actually mounted to a pvfs volume. Once you mount /pvfs, it gets 777 permissions with the sticky bit. So if /pvfs isn't actually mounted, your apps will get EPERM errors pretty early in the IO process.
Also, you can mount the pvfs volume from any of the pvfs servers (including the IO servers), which can help to distribute the mount workload when the IO nodes start up and all try to mount at once. At ANL, the IO nodes pick a server randomly when mounting.
My apologies for bothering everyone with this, and again, thanks for your quick offers of assistance. I really appreciate it!
It helps to hear reports like this so we know what to look for when the same thing happens to us in the future. :-)
-sam
Matthew
-----Original Message-----
From: Sam Lang [mailto:[EMAIL PROTECTED]
Sent: Friday, April 20, 2007 4:53 PM
To: Matthew Woitaszek
Cc: [email protected]
Subject: Re: [Pvfs2-users] PVFS2 on BlueGene
Hi Matthew,
Does mpi-io-test consistently fail with 257 nodes (9 IO nodes), or do you get any successful runs there? Are there any messages in the pvfs server logs (/tmp/pvfs2-server.log)?
Thanks,
-sam
On Apr 20, 2007, at 4:25 PM, Matthew Woitaszek wrote:
Good afternoon,
Michael Oberg and I are attempting to get PVFS2 working on NCAR's 1-rack BlueGene/L system using ZeptoOS. We ran into a snag at over 8 BG/L I/O nodes (>256 compute nodes).
We've been using the mpi-io-test program shipped with PVFS2 to test the system. For cases up to and including 8 I/O nodes (256 coprocessor or 512 virtual node mode tasks), everything works fine. Larger jobs fail with file-not-found error messages, such as:
MPI_File_open: File does not exist, error stack:
ADIOI_BGL_OPEN(54): File /pvfs2/mattheww/_file_0512_co does not exist
The file is created on the PVFS2 filesystem and has a zero-byte size. We've run the tests with 512 tasks on 256 nodes, and it successfully created an 8589934592-byte file. Going to 257 nodes fails.
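(For what it's worth, here is a rough sketch -- not mpi-io-test itself, and the path is just an example -- of the kind of error reporting that produces the message above: set MPI_ERRORS_RETURN as the default file error handler and print the error string from MPI_File_open:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    int rc, rank, len;
    char msg[MPI_MAX_ERROR_STRING];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Make sure file errors are returned so we can report them. */
    MPI_File_set_errhandler(MPI_FILE_NULL, MPI_ERRORS_RETURN);

    rc = MPI_File_open(MPI_COMM_WORLD, "/pvfs2/mattheww/testfile",
                       MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
    if (rc != MPI_SUCCESS) {
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "rank %d: MPI_File_open failed: %s\n", rank, msg);
    } else {
        MPI_File_close(&fh);
    }

    MPI_Finalize();
    return 0;
}
)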
Has anyone seen this behavior before? Are there any PVFS2 server or client configuration options that you would recommend for a BG/L installation like this?
Thanks for your time,
Matthew
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users