Hi Matt,
It is indeed a 64-bit bug, and it was fixed in the patch at the URL I mentioned.
The server logs with the relevant aio messages would be more interesting;
the snippet below does not seem to contain them.
Please let us know whether that patch addresses your problem. Alternatively,
do try CVS if possible.
Thanks,
Murali
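
A minimal sketch, in Python, of one way to pull the aio-related lines out of a
server log before posting it. It assumes the line shape seen in the log quoted
below ("[D hh:mm:ss.usec] message") and that the relevant messages contain the
string "aio". The function name filter_log and the regular expression are
illustrative assumptions, not part of PVFS2.

```python
import re

# Assumed line shape, inferred from the quoted log: "[D 17:53:53.528844] message"
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\] (?P<msg>.*)$"
)


def filter_log(lines, keyword="aio"):
    """Return (level, time, message) for log lines whose message mentions keyword."""
    hits = []
    for line in lines:
        # Drop mail quoting such as "> " or "> >>" before matching.
        stripped = re.sub(r"^(?:>\s?)+", "", line.rstrip("\n"))
        match = LOG_LINE.match(stripped)
        if match and keyword.lower() in match.group("msg").lower():
            hits.append((match.group("level"), match.group("time"), match.group("msg")))
    return hits


# Two lines copied from the quoted log; only the second mentions aio.
sample = [
    "> [D 17:53:53.528779] Sync on I/O data update for pvfs2-fs: no",
    "> [D 17:53:53.528844] Storage Init Complete (aio-threaded)",
]
for level, time, msg in filter_log(sample):
    print(level, time, msg)
```

Continuation lines that the mail client wrapped do not start with a timestamp,
so this sketch skips them; running it on the unwrapped log file on the server
avoids that.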

On Wed, 24 May 2006, Matt wrote:

> Hi Murali,
>
> it's an Opteron cluster:
>
> # uname -a
> Linux cree 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:20:34 CEST 2005 x86_64
> AMD Opteron(tm) Processor 248 unknown GNU/Linux
>
> The compute nodes run an FC4 kernel:
>
> [EMAIL PROTECTED] ~]# uname -a
> Linux node1 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:16:33 EDT 2005
> x86_64 AMD Opteron(tm) Processor 248 unknown GNU/Linux
>
> Below is the log when I do
>
> # pvfs2-cp -t  pvfs2-1.4.0.tar.gz /mnt/pvfs2/
>
>
> ... Matt
>
> -------------8<-------
>
> [EMAIL PROTECTED] log]# /usr/local/sbin/pvfs2-server -d /etc/pvfs2-fs.conf
> /etc/pvfs2-server.conf-cree
> [D 17:53:53.501988] PVFS2 Server version 1.4.0 starting.
> [D 17:53:53.502770] Logging none,storage,network,server (mask 589871)
> [D 17:53:53.503233] Passing tcp://cree:3334 as BMI listen address.
> [D 17:53:53.503311] BMI_tcp_initialize: Initializing TCP/IP module.
> [D 17:53:53.503438] BMI_tcp_initialize: TCP/IP module successfully
> initialized.
> [D 17:53:53.504442] dbpf_thread_initialize: initialized
> [D 17:53:53.504622] dbpf_thread_function started
> [D 17:53:53.504953] collection lookup: version is 0.0.1
> [D 17:53:53.505122] - set handle re-use timeout to 360 seconds (ret=0)
> [D 17:53:53.528647] File system pvfs2-fs using handles:
>         4-330382102
> [D 17:53:53.528730] Sync on metadata update for pvfs2-fs: yes
> [D 17:53:53.528779] Sync on I/O data update for pvfs2-fs: no
> [D 17:53:53.528844] Storage Init Complete (aio-threaded)
> [D 17:53:53.528893] 1 filesystem(s) initialized
> [D 17:53:53.529540] Initialization completed successfully.
> [D 17:54:08.277682] handle_new_connection: Assigning socket 8 to new
> method addr.
> [D 17:54:08.277838] tcp_do_work_recv: Reading header for new op.
> [D 17:54:08.277859] tcp_do_work_recv: Received new message; mode: 2.
> [D 17:54:08.277868] tcp_do_work_recv: tag: 1
> [D 17:54:08.277919] (0x5df330) getconfig (prelude sm) state: req_sched
> [D 17:54:08.277940] (0x5df330) getconfig (prelude sm) state:
> getattr_if_needed
> [D 17:54:08.277951] (0x5df330) getconfig (prelude sm) state: perm_check
> (status = 0)
> [D 17:54:08.277965] (0x5df330) getconfig state: init
> [D 17:54:08.277976] (0x5df330) getconfig (FR sm) state: release:
> (error_code = 0)
> [D 17:54:08.277989] (0x5df330) getconfig (FR sm) state: send_resp
> (status = 0)
> [D 17:54:08.278014] BMI_post_send_list: addr: 81, count: 1, total_size: 1720
> [D 17:54:08.278024]    element 0: offset: 0x618b80, size: 1720
> [D 17:54:08.278056] BMI_tcp_post_send_generic: Sent: 1720 bytes of data.
> [D 17:54:08.278068] (0x5df330) getconfig (FR sm) state: cleanup
> [D 17:54:08.278098] (0x5df330) getconfig state: cleanup
> [D 17:54:08.278937] tcp_do_work_recv: Reading header for new op.
> [D 17:54:08.278959] tcp_do_work_recv: Received new message; mode: 2.
> [D 17:54:08.278969] tcp_do_work_recv: tag: 2
> [D 17:54:08.278993] (0x5e0960) lookup_path (prelude sm) state: req_sched
> [D 17:54:08.279006] (0x5e0960) lookup_path (prelude sm) state:
> getattr_if_needed
> [D 17:54:08.279015] About to retrieve attributes for handle 1048576
> [D 17:54:08.279246] ATTRIB: retrieved attributes from DISK for key 1048576
>         uid = 0, mode = 511, type = 4, dfile_count = 0, dist_size = 0
>         b_size = 0, k_size = 1
> [D 17:54:08.287018] (0x5e0960) lookup_path (prelude sm) state:
> perm_check (status = 0)
> [D 17:54:08.287033] (0x5e0960) lookup_path state: init
> [D 17:54:08.287047] (0x5e0960) lookup_path state: read_object_metadata
> [D 17:54:08.287059] (0x5e0960) lookup_path state: verify_object_metadata
> [D 17:54:08.287067]   attrs = (owner = 0, group = 0, perms = 777, type = 4)
> [D 17:54:08.287076]   object is a directory; will be looking for handle
> for segment "pvfs2-1.4.0.tar.gz" in a bit
> [D 17:54:08.287087] (0x5e0960) lookup_path state:
> read_directory_entry_handle
> [D 17:54:08.287096]   reading dirent handle value from handle 1048576
> [D 17:54:08.287224] (0x5e0960) lookup_path state: read_directory_entry
> [D 17:54:08.287236]   reading from dirent handle = 4, segment =
> pvfs2-1.4.0.tar.gz (len=18)
> [D 17:54:08.300232] warning: keyval read error on handle 4 and
> key=pvfs2-1.4.0.tar.gz (DB_NOTFOUND: No matching key/data pair found)
> [D 17:54:08.300278] (0x5e0960) lookup_path state: setup_resp
> [D 17:54:08.300293]   sending 'error' response with 0 handle(s) and 0
> attr(s)
> [D 17:54:08.300302] (0x5e0960) lookup_path (FR sm) state: release:
> (error_code = -1073742082)
> [D 17:54:08.300315] (0x5e0960) lookup_path (FR sm) state: send_resp
> (status = -1073741826)
> [D 17:54:08.300331] BMI_post_send_list: addr: 81, count: 1, total_size: 32
> [D 17:54:08.300341]    element 0: offset: 0x61fb50, size: 32
> [D 17:54:08.300369] BMI_tcp_post_send_generic: Sent: 32 bytes of data.
> [D 17:54:08.300381] (0x5e0960) lookup_path (FR sm) state: cleanup
> [D 17:54:08.300395] (0x5e0960) lookup_path state: cleanup
> [D 17:54:08.300524] tcp_do_work_recv: Reading header for new op.
> [D 17:54:08.300542] tcp_do_work_recv: Received new message; mode: 2.
> [D 17:54:08.300551] tcp_do_work_recv: tag: 3
> [D 17:54:08.300625] (0x5e1af0) getattr (prelude sm) state: req_sched
> [D 17:54:08.300638] (0x5e1af0) getattr (prelude sm) state: getattr_if_needed
> [D 17:54:08.300647] About to retrieve attributes for handle 1048576
> [D 17:54:08.300657] (0x5e1af0) getattr (prelude sm) state: perm_check
> (status = 0)
> [D 17:54:08.300669] (0x5e1af0) getattr state: verify_attribs
> [D 17:54:08.300683] (0x5e1af0) getattr state: setup_resp
> [D 17:54:08.300691] (0x5e1af0) getattr (FR sm) state: release:
> (error_code = 0)
> [D 17:54:08.300701] (0x5e1af0) getattr (FR sm) state: send_resp (status = 0)
> [D 17:54:08.300711] BMI_post_send_list: addr: 81, count: 1, total_size: 64
> [D 17:54:08.300721]    element 0: offset: 0x61fb50, size: 64
> [D 17:54:08.300741] BMI_tcp_post_send_generic: Sent: 64 bytes of data.
> [D 17:54:08.300752] (0x5e1af0) getattr (FR sm) state: cleanup
> [D 17:54:08.300762] (0x5e1af0) getattr state: getattr_cleanup
> [D 17:54:08.300857] tcp_do_work_recv: Reading header for new op.
> [D 17:54:08.300877] tcp_do_work_recv: Received new message; mode: 2.
> [D 17:54:08.300886] tcp_do_work_recv: tag: 4
> [D 17:54:08.300910] (0x5e2c80) create (prelude sm) state: req_sched
> [D 17:54:08.300921] (0x5e2c80) create (prelude sm) state: getattr_if_needed
> [D 17:54:08.300929] (0x5e2c80) create (prelude sm) state: perm_check
> (status = 0)
> [D 17:54:08.300941] (0x5e2c80) create state: create
> [D 17:54:08.301021] [1 extents] -- new_handle is 330382099 (cur_extent
> is 4 - 330382102)
> [D 17:54:08.310859] db SYNC called servicing op type DSPACE_CREATE
> [D 17:54:08.310892] (0x5e2c80) create state: setup_resp
> [D 17:54:08.310906] Handle created: 330382099
> [D 17:54:08.310915] (0x5e2c80) create (FR sm) state: release:
> (error_code = 0)
> [D 17:54:08.310925] (0x5e2c80) create (FR sm) state: send_resp (status = 0)
> [D 17:54:08.310934] BMI_post_send_list: addr: 81, count: 1, total_size: 24
> [D 17:54:08.310944]    element 0: offset: 0x5e1c10, size: 24
> [D 17:54:08.310964] BMI_tcp_post_send_generic: Sent: 24 bytes of data.
> [D 17:54:08.310975] (0x5e2c80) create (FR sm) state: cleanup
> [D 17:54:08.310985] (0x5e2c80) create state: cleanup
> [D 17:54:08.361729] tcp_do_work_recv: Reading header for new op.
> [D 17:54:08.361748] tcp_do_work_recv: Received new message; mode: 2.
> [D 17:54:08.361757] tcp_do_work_recv: tag: 25
> [D 17:54:08.361781] (0x5e3e10) remove (prelude sm) state: req_sched
> [D 17:54:08.361792] (0x5e3e10) remove (prelude sm) state: getattr_if_needed
> [D 17:54:08.361801] About to retrieve attributes for handle 330382099
> [D 17:54:08.361812] (0x5e3e10) remove (prelude sm) state: perm_check
> (status = 0)
> [D 17:54:08.361822] (0x5e3e10) remove state: setup_work
> [D 17:54:08.361830] (0x5e3e10) remove state: check_object_type
> [D 17:54:08.361839] (0x5e3e10) remove state: verify_object_metadata
> [D 17:54:08.361847]   attrs read from keyval = (owner = 0, group = 0,
> perms = 0, type = 1)
> [D 17:54:08.361857] (0x5e3e10) remove state: remove_dspace
> [D 17:54:08.361865] (0x5e3e10) remove: removing dspace object
> 330382099,1559991218
> [D 17:54:08.361970] removed dataspace with handle 330382099
> [D 17:54:08.362342] db SYNC called servicing op type DSPACE_REMOVE
> [D 17:54:08.362408] (0x5e3e10) remove (FR sm) state: release:
> (error_code = 0)
> [D 17:54:08.362427] (0x5e3e10) remove (FR sm) state: send_resp (status = 0)
> [D 17:54:08.362437] BMI_post_send_list: addr: 81, count: 1, total_size: 16
> [D 17:54:08.362445]    element 0: offset: 0x61d640, size: 16
> [D 17:54:08.362480] BMI_tcp_post_send_generic: Sent: 16 bytes of data.
> [D 17:54:08.362491] (0x5e3e10) remove (FR sm) state: cleanup
> [D 17:54:08.362502] (0x5e3e10) remove state: cleanup
>
>
> Murali Vilayannur wrote:
>
> >Hi Matt,
> >Is this a ppc64 machine/cluster?
> >If yes, then this bug has been fixed in cvs and you can find patches for
> >1.4.0 here
> >http://www.beowulf-underground.org/pipermail/pvfs2-users/2006-March/001263.html
> >If not, then we would need your help investigating this.. Could you turn
> >on verbose logging on server and send the logs (offlist if need be)?
> >Alternatively, if you are comfortable installing CVS head, please do so
> >and let us know if the errors persist.
> >Thanks,
> >Murali
> >
> >On Wed, 24 May 2006, Matt wrote:
> >
> >
> >
> >>Hi,
> >>
> >>following the Quick Start Guide I am trying to install pvfs2 on a 12+1
> >>cluster.  When testing the installation pvfs2-ping (see below) reports a
> >>correctly configured system. However at the pvfs2-cp step I get:
> >>---------------8<------------
> >>[EMAIL PROTECTED] pvfs2-1.4.0]# pvfs2-cp  /lib/libc.so.6  /mnt/pvfs2/
> >>[E 16:04:30.142118] create_datafiles_comp_fn: Failed to create data handle 4
> >>[E 16:04:30.142258] Creation failure: No space left on device
> >>[E 16:04:30.142269] create_datafiles_comp_fn: Failed to create data handle 5
> >>[E 16:04:30.142278] Creation failure: No space left on device
> >>[E 16:04:30.142287] create_datafiles_comp_fn: Failed to create data handle 6
> >>[E 16:04:30.142295] Creation failure: No space left on device
> >>[E 16:04:30.142304] create_datafiles_comp_fn: Failed to create data handle 7
> >>[E 16:04:30.142313] Creation failure: No space left on device
> >>PVFS_sys_create: No space left on device
> >>------------->8-------------
> >>Note that I can create directories:
> >>-----------8<-----------
> >>[EMAIL PROTECTED] pvfs2-1.4.0]# pvfs2-mkdir /mnt/pvfs2/blabla
> >>[EMAIL PROTECTED] pvfs2-1.4.0]# pvfs2-ls -l /mnt/pvfs2/
> >>drwxr-xr-x    1 root     root            4096 2006-05-24 14:54 bla
> >>drwxr-xr-x    1 root     root            4096 2006-05-24 16:19 blabla
> >>drwxrwxrwx    1 root     root            4096 2006-05-24 14:34 lost+found
> >>------------>8-----------
> >>
> >>I followed the instructions in the guide, except that I configured with
> >>
> >>$ ./configure --with-db=/usr/local/BerkeleyDB.4.4
> >>--with-kernel=/usr/src/linux
> >>
> >>as this is where db is installed.   (Later on, I needed to set the
> >>library search path: export LD_LIBRARY_PATH=/usr/local/BerkeleyDB.4.4)
> >>Any hints what to do are appreciated.
> >>
> >>... Matt
> >>
> >>
> >>-----------8<-----------
> >>[EMAIL PROTECTED] pvfs2-1.4.0]# pvfs2-ping
> >>pvfs2-ping version 1.4.0
> >>
> >>Usage  : pvfs2-ping -m file_system_path
> >>Example: pvfs2-ping -m /mnt/pvfs2
> >>[EMAIL PROTECTED] pvfs2-1.4.0]# pvfs2-ping -m /mnt/pvfs2
> >>
> >>(1) Parsing tab file...
> >>
> >>(2) Initializing system interface...
> >>
> >>(3) Initializing each file system found in tab file: /etc/pvfs2tab...
> >>
> >>   /mnt/pvfs2: Ok
> >>
> >>(4) Searching for /mnt/pvfs2 in pvfstab...
> >>
> >>   PVFS2 servers: tcp://cree:3334
> >>   Storage name: pvfs2-fs
> >>   Local mount point: /mnt/pvfs2
> >>
> >>   meta servers:
> >>   tcp://cree:3334
> >>
> >>   data servers:
> >>   tcp://node10:3334
> >>   tcp://node11:3334
> >>   tcp://node12:3334
> >>   tcp://node1:3334
> >>   tcp://node2:3334
> >>   tcp://node3:3334
> >>   tcp://node4:3334
> >>   tcp://node5:3334
> >>   tcp://node6:3334
> >>   tcp://node7:3334
> >>   tcp://node8:3334
> >>   tcp://node9:3334
> >>
> >>(5) Verifying that all servers are responding...
> >>
> >>   meta servers:
> >>   tcp://cree:3334 Ok
> >>
> >>   data servers:
> >>   tcp://node10:3334 Ok
> >>   tcp://node11:3334 Ok
> >>   tcp://node12:3334 Ok
> >>   tcp://node1:3334 Ok
> >>   tcp://node2:3334 Ok
> >>   tcp://node3:3334 Ok
> >>   tcp://node4:3334 Ok
> >>   tcp://node5:3334 Ok
> >>   tcp://node6:3334 Ok
> >>   tcp://node7:3334 Ok
> >>   tcp://node8:3334 Ok
> >>   tcp://node9:3334 Ok
> >>
> >>(6) Verifying that fsid 1559991218 is acceptable to all servers...
> >>
> >>   Ok; all servers understand fs_id 1559991218
> >>
> >>(7) Verifying that root handle is owned by one server...
> >>
> >>   Root handle: 1048576
> >>   Ok; root handle is owned by exactly one server.
> >>
> >>=============================================================
> >>
> >>The PVFS filesystem at /mnt/pvfs2 appears to be correctly configured.
> >>
> >>
> >>_______________________________________________
> >>Pvfs2-users mailing list
> >>[email protected]
> >>http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>
>