Hi Phil -
Just wanted to let you know - I reverted to DB 4.6, recompiled
everything, and it all starts up without any issues... it definitely
looks like there are some changes in DB 4.7 that make it incompatible
with the current version of PVFS2...
If I get a chance, I'll probe a bit into the code and see
what I can find.
Thanks again for your help and suggestions.
-Mark
Phil Carns wrote:
Oh, now that I think about it a little more, the first server is the
only one that actually has file system objects already stored in it
initially. It has two objects to keep up with the root "/" directory,
and two objects to keep up with the "lost+found" directory. So
probably on the first startup it is exercising more db routines than
the other servers are.
That doesn't help solve the problem, but it might explain why the
first server might have problems on startup while the others do not.
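
For illustration, here is a small Python sketch of why the root directory object would land on the first server. The ranges and RootHandle are copied from the configuration file in Mark's original message further down; the function name owner_of and the inclusive treatment of range bounds are assumptions made for this example, not PVFS2 code.

```python
# Meta handle ranges and root handle as they appear in the posted config.
META_RANGES = {
    "server1": (3, 1152921504606846977),
    "server6": (1152921504606846978, 2305843009213693952),
}
ROOT_HANDLE = 1048576


def owner_of(handle, ranges):
    """Return the alias whose range contains handle, or None.

    Bounds are treated as inclusive (an assumption for this sketch).
    """
    for alias, (low, high) in ranges.items():
        if low <= handle <= high:
            return alias
    return None


print(owner_of(ROOT_HANDLE, META_RANGES))  # server1
print(hex(ROOT_HANDLE))                    # 0x100000
```

The hex form looks like the same value as the 0x00100000 line in the showcoll output from the first server, which would fit with that server being the one that stores the root object.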
-Phil
Phil Carns wrote:
Ok, let us know what you find out. I'll try 4.7 on my end too, but I
have a feeling you'll get to try 4.6 before I get around to trying
4.7 :)
-Phil
Mark J. Hoy wrote:
Thanks Phil -
I'll try installing db 4.6 (or earlier) this afternoon and see if
that makes any difference (we don't need 4.7 for anything on our
system at the moment, so switching back won't be any issue)...
As for ldd - on all nodes, I'm seeing the same versions:
libdb-4.7.so => /usr0/BerkeleyDB.4.7/lib/libdb-4.7.so (0x00002b3225245000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000030e3300000)
librt.so.1 => /lib64/librt.so.1 (0x00000030e9000000)
libc.so.6 => /lib64/libc.so.6 (0x00000030e2c00000)
/lib64/ld-linux-x86-64.so.2 (0x00000030e2000000)
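
When comparing several nodes, a check like this can be scripted. The following is a minimal Python sketch that pulls the Berkeley DB entries out of captured ldd output; the helper name berkeley_db_libs and the regular expression are assumptions for this example, and the sample text is a subset of the listing above.

```python
import re


def berkeley_db_libs(ldd_output):
    """Return (soname, path) pairs for every libdb entry in ldd output."""
    pattern = re.compile(
        r"^\s*(libdb[\w.+-]*\.so[\w.]*)\s+=>\s+(\S+)", re.MULTILINE
    )
    return pattern.findall(ldd_output)


# Sample text taken from the ldd listing in this message.
sample = """\
libdb-4.7.so => /usr0/BerkeleyDB.4.7/lib/libdb-4.7.so (0x00002b3225245000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000030e3300000)
libc.so.6 => /lib64/libc.so.6 (0x00000030e2c00000)
"""

print(berkeley_db_libs(sample))
```

Running the same function over the output saved from each node and comparing the results would show quickly whether any server is linked against a different library.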
Thanks again!
-Mark
Phil Carns wrote:
Hi Mark,
Ok, maybe this is some Berkeley DB compatibility issue; it looks
like 4.7 came hot off the presses a couple of weeks ago. Maybe there
is something new we need to figure out.
It is strange that you would only have trouble with one particular
server, though. Could you double check with "ldd ./pvfs2-server"
that they are all linked to the same db library?
-Phil
Mark J. Hoy wrote:
Thanks Phil -
I retried the re-initialization of the storage space (-f). It says it
works, but also prints these additional messages:
[S 07/01 10:54] PVFS2 Server on node boston19 version 2.7.1 starting...
[E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 07/01 10:54] error in dspace create (db_p->get failed).
[E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 07/01 10:54] error in dspace create (db_p->get failed).
[E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 07/01 10:54] error in dspace create (db_p->get failed).
[E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 07/01 10:54] error in dspace create (db_p->get failed).
[D 07/01 10:54] PVFS2 Server: storage space created. Exiting.
This only happens on the very first server (server1 in my
configuration below) - also, I'm using a standard make/make
install of BerkeleyDB.4.7 (no odd configuration options other than
the --prefix setting to install to a different path)...
Running showcoll on the first server yields:
[E 10:58:15.614847] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 10:58:15.614940] TROVE:DBPF:Berkeley DB: DB->get: Invalid argument
0x00100000 (dspace_getattr output: type = unknown, b_size = 5421632)
[E 10:58:15.614990] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
[E 10:58:15.615003] TROVE:DBPF:Berkeley DB: DB->get: Invalid argument
0x00000000 (dspace_getattr output: type = unknown, b_size = 5421632)
Running showcoll (with the same parameters) on any of the other 5
nodes yields:
[E 11:01:34.794473] src/io/trove/trove-dbpf/dbpf-mgmt.c line 515: dbpf_collection_geteattr: DB_NOTFOUND: No matching key/data pair found
[E 11:01:34.794792] [bt] bin/pvfs2-showcoll(dbpf_collection_geteattr+0x103) [0x414063]
[E 11:01:34.794811] [bt] bin/pvfs2-showcoll(main+0x417) [0x4070b7]
[E 11:01:34.794822] [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x30e2c1d084]
[E 11:01:34.794834] [bt] bin/pvfs2-showcoll(aio_fsync64+0x39) [0x406669]
Storage space /usr2/pvfs-storage, collection pvfs2-fs (coll_id = 1375400306, *** no root_handle found ***):
... not sure what I'm missing here... Thanks!
-Mark
Phil Carns wrote:
Hello,
I haven't seen that error message in that particular context
before. In general, though, it happens on startup when the
server finds that it has a handle (storage object) in its
directory that doesn't match the ranges specified in the
configuration file.
In this specific case it thinks there is a handle with value 0 in
your storage space, which shouldn't happen.
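
As a rough illustration of that bounds check, here is a minimal Python sketch. It is not the server's actual code: the function names are hypothetical, the bounds are assumed to be inclusive, and the range string is the one printed in the log from Mark's original message.

```python
def parse_handle_ranges(spec):
    """Parse a string such as '3-10,20-30' into (low, high) integer pairs."""
    ranges = []
    for part in spec.split(","):
        low, high = part.strip().split("-")
        ranges.append((int(low), int(high)))
    return ranges


def handle_in_ranges(handle, ranges):
    """True if handle falls inside any range (bounds assumed inclusive)."""
    return any(low <= handle <= high for low, high in ranges)


# Range string copied from the "Error adding handle range" log line.
spec = "3-1317624576693539402,2635249153387078803-3952873730080618202"
ranges = parse_handle_ranges(spec)

print(handle_in_ranges(0, ranges))        # False: 0 is below the first range
print(handle_in_ranges(1048576, ranges))  # True: inside the first range
```

Under these assumptions a handle of 0 can never be valid, since the lowest configured handle is 3.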
Has the server ever started successfully, or is this the first
attempt to get it running?
You may want to try just deleting the storage space (the
/usr2/pvfs-storage directory) and redoing the "-f" step, if you
haven't already.
You could also try running this command:
pvfs2-showcoll -s /usr2/pvfs-storage -c pvfs2-fs
That should list all of the handles in the storage space so that
we can confirm if there is really bad data in there or if there
is something wrong with the server's startup.
-Phil
Mark J. Hoy wrote:
Hi -
I'm trying to get PVFS2 version 2.7.1 (latest stable) up and
running. It compiles without issue, and I was able to initialize
my storage (via "pvfs2-server -f /path/to/config/file"), but
I'm having a problem getting the server to start...
Every time I try running "sbin/pvfs2-server
/path/to/config/file" (where /path/to/config/file is my
configuration file generated via pvfs2-genconfig), I get
an error: Error: handle 0 is invalid (out of bounds)
The relevant pieces of the log are shown below:
[D 06/27 13:32] PVFS2 Server version 2.7.1 starting.
[E 06/27 13:32] Error: handle 0 is invalid (out of bounds)
[E 06/27 13:32] Error adding handle range 3-1317624576693539402,2635249153387078803-3952873730080618202 to filesystem pvfs2-fs
[E 06/27 13:32] Error: Could not initialize server interfaces; aborting.
[E 06/27 13:32] Error: Could not initialize server; aborting.
This happens both with a single-machine configuration
and with a cluster configuration (6 machines) - _but_ in
the multiple-machine configuration, it only happens when I try
to start the first I/O node - the other 5 machines start up
without issue.
Has anyone else experienced this sort of problem? I've included
a copy of my configuration file below (but changed the machine
names to protect the innocent). Also, I'm running on a
homogeneous configuration where all six of my machines are
running Fedora Core 5, kernel: Linux version 2.6.19.1-001-K8,
with a Dual-Core AMD Opteron(tm) Processor (model 1218, 2.6 GHz), 4 GB
RAM, and 400 GB of storage on the volume for pvfs2.
<Defaults>
UnexpectedRequests 50
EventLogging none
LogStamp datetime
BMIModules bmi_tcp
FlowModules flowproto_multiqueue
PerfUpdateInterval 1000
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
StorageSpace /usr2/pvfs-storage
LogFile /tmp/pvfs2-server.log
</Defaults>
<Aliases>
Alias server1 tcp://server1:3334
Alias server2 tcp://server2:3334
Alias server3 tcp://server3:3334
Alias server4 tcp://server4:3334
Alias server5 tcp://server5:3334
Alias server6 tcp://server6:3334
</Aliases>
<Filesystem>
Name pvfs2-fs
ID 1375400306
RootHandle 1048576
<MetaHandleRanges>
Range server1 3-1152921504606846977
Range server6 1152921504606846978-2305843009213693952
</MetaHandleRanges>
<DataHandleRanges>
Range server1 2305843009213693953-3458764513820540927
Range server2 3458764513820540928-4611686018427387902
Range server3 4611686018427387903-5764607523034234877
Range server4 5764607523034234878-6917529027641081852
Range server5 6917529027641081853-8070450532247928827
Range server6 8070450532247928828-9223372036854775802
</DataHandleRanges>
<StorageHints>
TroveSyncMeta yes
TroveSyncData no
</StorageHints>
</Filesystem>
Thanks!
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users