Hi Matt,

Could you replace line 2083 in src/io/bmi/bmi-tcp.c from

    int copy_size = 0;

to

    bmi_size_t copy_size = 0;

and see if that helps?

Thanks,
Murali
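For context, here is a minimal sketch of why widening a size counter from int to a 64-bit type can matter. It assumes bmi_size_t is a signed 64-bit integer; check the PVFS2 headers for the real definition. The names example_size_t, sum_sizes and fits_in_int32 are hypothetical and do not appear in bmi-tcp.c. The sketch only shows that a sum of ordinary segment sizes can exceed what a 32-bit int holds; it does not establish that this is the cause of the reported error.

```c
#include <stdint.h>

/* Hypothetical stand-in for bmi_size_t, assumed here to be a signed
 * 64-bit type. This is an assumption, not the PVFS2 definition. */
typedef int64_t example_size_t;

/* Sum a list of segment sizes in a 64-bit accumulator, so the running
 * total cannot wrap at 2^31 - 1 the way a 32-bit int could. */
static example_size_t sum_sizes(const example_size_t *sizes, int count)
{
    example_size_t total = 0;
    for (int i = 0; i < count; i++)
        total += sizes[i];
    return total;
}

/* Return 1 if the value can be stored in a 32-bit signed int without
 * loss, 0 otherwise. */
static int fits_in_int32(example_size_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Three segments of 1 GiB each already add up to 3221225472 bytes, which fits_in_int32 rejects, while a small message such as the 1632-byte response in the server log below fits either way.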
On Fri, 26 May 2006, Matt wrote:

> Hi
>
> I did some more testing.
>
> My problems may be related to a 2TB limit. If my combined storage size
> is below 2TB (9 nodes at 200GB each), everything seems to work. However,
> if I use 10 or more nodes I get the reported error.
>
> The same holds if I stay away from our head node and use only Mandriva
> nodes with an FC4 kernel. However, in this case the error message changes:
>
> ---------8<---------------
> [EMAIL PROTECTED] pvfs2]# /usr/local/pvfs2_nodes/bin/pvfs2-cp -t /home/munnich/Soft/pvfs2/pvfs2-1.4.0.tar.gz /mnt/pvfs2/testfile
> [E 14:22:06.071020] Receive immediately failed: Value too large for defined data type
> [E 14:22:06.071114] msgpair failed, will retry:: Value too large for defined data type
> [E 14:22:06.071159] *** msgpairarray_completion_fn: msgpair to server tcp://node10:3334 failed: Value too large for defined data type
> [E 14:22:06.071170] *** Non-BMI failure.
> [E 14:22:06.071179] getattr_object_getattr_failure : Value too large for defined data type
> PVFS_sys_create: Value too large for defined data type
> Could not open /mnt/pvfs2/testfile
> Segmentation fault (core dumped)
> --------->8---------------
>
> Does this behavior ring a bell?
>
> ... Matt
>
>
> Log files
> [EMAIL PROTECTED] pvfs2]# cat /tmp/pvfs2-server.log
> [D 14:21:59.274612] PVFS2 Server version 1.4.1pre1-2006-05-25-230553 starting.
> [D 14:21:59.275359] Passing tcp://node10:3334 as BMI listen address.
> [D 14:21:59.275417] BMI_tcp_initialize: Initializing TCP/IP module.
> [D 14:21:59.275495] BMI_tcp_initialize: TCP/IP module successfully initialized.
> [D 14:21:59.276813] dbpf_thread_initialize: initialized
> [D 14:21:59.278074] collection lookup: version is 0.1.2
> [D 14:21:59.278238] dbpf_thread_function started
> [D 14:21:59.278312] - set handle re-use timeout to 360 seconds (ret=0)
> [D 14:21:59.301219] File system pvfs2-fs using handles:
>     4-390451575
> [D 14:21:59.301276] Sync on metadata update for pvfs2-fs: yes
> [D 14:21:59.301287] Sync on I/O data update for pvfs2-fs: no
> [D 14:21:59.301320] Storage Init Complete (aio-threaded)
> [D 14:21:59.301331] 1 filesystem(s) initialized
> [D 14:21:59.301816] Initialization completed successfully.
> [D 14:22:06.068882] handle_new_connection: Assigning socket 12 to new method addr.
> [D 14:22:06.068956] tcp_do_work_recv: Reading header for new op.
> [D 14:22:06.068972] tcp_do_work_recv: Received new message; mode: 2.
> [D 14:22:06.068983] tcp_do_work_recv: tag: 1
> [D 14:22:06.069054] (0x5e1b70) getconfig (prelude sm) state: req_sched
> [D 14:22:06.069118] (0x5e1b70) getconfig (prelude sm) state: getattr_if_needed
> [D 14:22:06.069132] (0x5e1b70) getconfig (prelude sm) state: perm_check (status = 0)
> [D 14:22:06.069147] (0x5e1b70) getconfig state: init
> [D 14:22:06.069162] (0x5e1b70) getconfig (FR sm) state: release: (error_code = 0)
> [D 14:22:06.069179] (0x5e1b70) getconfig (FR sm) state: send_resp (status = 0)
> [D 14:22:06.069204] BMI_post_send_list: addr: 65, count: 1, total_size: 1632
> [D 14:22:06.069216] element 0: offset: 0x61b6f0, size: 1632
> [D 14:22:06.069258] BMI_tcp_post_send_generic: Sent: 1632 bytes of data.
> [D 14:22:06.069273] (0x5e1b70) getconfig (FR sm) state: cleanup
> [D 14:22:06.069305] (0x5e1b70) getconfig state: cleanup
>
> [EMAIL PROTECTED] ~]# cat /tmp/pvfs2-server.log
> [D 13:44:33.574416] PVFS2 Server version 1.4.1pre1-2006-05-25-230553 starting.
> [E 13:44:37.699990]
> PVFS2 server got signal 15 (server_status_flag: 262143)
> [D 13:44:39.722231] PVFS2 Server version 1.4.1pre1-2006-05-25-230553 starting.
> [E 13:47:29.145477]
> PVFS2 server got signal 15 (server_status_flag: 262143)
> [D 13:47:31.168877] PVFS2 Server version 1.4.1pre1-2006-05-25-230553 starting.
> [D 13:47:31.169626] Passing tcp://node2:3334 as BMI listen address.
> [D 13:47:31.169682] BMI_tcp_initialize: Initializing TCP/IP module.
> [D 13:47:31.169755] BMI_tcp_initialize: TCP/IP module successfully initialized.
> [D 13:47:31.171265] dbpf_thread_initialize: initialized
> [D 13:47:31.172522] collection lookup: version is 0.1.2
> [D 13:47:31.172670] - set handle re-use timeout to 360 seconds (ret=0)
> [D 13:47:31.172826] dbpf_thread_function started
> [D 13:47:31.172883] File system pvfs2-fs using handles:
>     1171354720-1561806291
> [D 13:47:31.172895] Sync on metadata update for pvfs2-fs: yes
> [D 13:47:31.172909] Sync on I/O data update for pvfs2-fs: no
> [D 13:47:31.172942] Storage Init Complete (aio-threaded)
> [D 13:47:31.172953] 1 filesystem(s) initialized
> [D 13:47:31.173461] Initialization completed successfully.
>
>
>
> [EMAIL PROTECTED] pvfs2]# cat /etc/pvfs2-fs.conf
> <Defaults>
>     UnexpectedRequests 50
>     LogFile /tmp/pvfs2-server.log
>     EventLogging storage,network,server
>     LogStamp usec
>     BMIModules bmi_tcp
>     FlowModules flowproto_multiqueue
>     PerfUpdateInterval 1000
>     ServerJobBMITimeoutSecs 30
>     ServerJobFlowTimeoutSecs 30
>     ClientJobBMITimeoutSecs 300
>     ClientJobFlowTimeoutSecs 300
>     ClientRetryLimit 5
>     ClientRetryDelayMilliSecs 2000
> </Defaults>
>
> <Aliases>
>     Alias node1 tcp://node10:3334
>     Alias node10 tcp://node11:3334
>     Alias node11 tcp://node1:3334
>     Alias node2 tcp://node2:3334
>     Alias node3 tcp://node3:3334
>     Alias node4 tcp://node4:3334
>     Alias node5 tcp://node5:3334
>     Alias node6 tcp://node6:3334
>     Alias node7 tcp://node7:3334
>     Alias node8 tcp://node8:3334
>     Alias node9 tcp://node9:3334
> </Aliases>
>
> <Filesystem>
>     Name pvfs2-fs
>     ID 833677876
>     RootHandle 1048576
>     <MetaHandleRanges>
>         Range node1 4-390451575
>     </MetaHandleRanges>
>     <DataHandleRanges>
>         Range node10 390451576-780903147
>         Range node11 780903148-1171354719
>         Range node2 1171354720-1561806291
>         Range node3 1561806292-1952257863
>         Range node4 1952257864-2342709435
>         Range node5 2342709436-2733161007
>         Range node6 2733161008-3123612579
>         Range node7 3123612580-3514064151
>         Range node8 3514064152-3904515723
>         Range node9 3904515724-4294967295
>     </DataHandleRanges>
>     <StorageHints>
>         TroveSyncMeta yes
>         TroveSyncData no
>         AttrCacheKeywords datafile_handles,metafile_dist
>         AttrCacheKeywords dir_ent, symlink_target
>         AttrCacheSize 4093
>         AttrCacheMaxNumElems 32768
>     </StorageHints>
> </Filesystem>
>
>
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
