I am using pvfs-2.8.2, which works over TCP, but ideally I want to
run it over Open-MX. I have installed and configured open-mx-1.3.1,
and it is running on all three servers.
Does anyone actually have Open-MX working with PVFS2? I have set the
MX_IMM_ACK environment variable to 1 as directed in the FAQ, and all
my connectivity tests with Open-MX seem to work just fine.
I have included the relevant output and configuration files below.
Thanks for any help you can offer!
Josh.
The output of omx_info shows that all three hosts are successfully
communicating over Ethernet.
$ sudo /opt/open-mx/bin/omx_info
Open-MX version 1.3.1
build: jrand...@tommy:/usr/local/src/open-mx/open-mx-1.3.1 Fri Aug 13 19:07:08 BST 2010
Found 1 boards (32 max) supporting 32 endpoints each:
tommy:0 (board #0 name eth3 addr 00:1b:21:4f:4b:e6)
managed by driver 'ixgbe'
attached to numa node 0
Peer table is ready, mapper is 00:00:00:00:00:00
================================================
0) 00:1b:21:4f:4b:e6 tommy:0
1) 00:1b:21:4d:ba:92 renton:0
2) 00:1b:21:4f:4d:5a begbie:0
The output of omx_endpoint_info shows that all 32 endpoints are available.
$ sudo /opt/open-mx/bin/omx_endpoint_info
tommy:0 (board #0 name eth3 addr 00:1b:21:4f:4b:e6)
==============================================
raw open by pid 20653 (omxoed)
0 regular endpoints open (out of 32)
When I run pvfs2-server with PVFS2_DEBUGMASK="all", I get a "Remote
Endpoint is Closed" error and the server exits with code 255.
$ sudo /usr/local/sbin/pvfs2-server /etc/pvfs2-fs.conf -d
[S 08/14 16:40] PVFS2 Server on node tommy version 2.8.2 starting...
[D 08/14 16:40] Logging all (mask 18446744073709551615)
[D 08/14 16:40] PINT_encode_initialize
[D 08/14 16:40] lebf_initialize
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] PINT_do_request_commit: commit node 0x7fff0a7e6e40
[D 08/14 16:40] node stored at 0
[D 08/14 16:40] clearing tree
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] PINT_do_request_commit: commit node 0x7fff0a7e6e40
[D 08/14 16:40] node stored at 0
[D 08/14 16:40] clearing tree
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] Passing mx://tommy:0:0 as BMI listen address.
OMX: Emulating MX_DISABLE_SHMEM as OMX_DISABLE_SHARED
OMX: Forcing shared comms to disabled
OMX: Setting 4 bits of context id at offset 60 in matching
[D 08/14 16:40] Server using shm key hint: 1937657271
[D 08/14 16:40] [BMI CONTROL]: BMI_set_info: set_info: 0 option: 11
[D 08/14 16:40] [BMI CONTROL]: BMI_set_info: set_info: 0 option: 12
[D 08/14 16:40] dbpf_thread_initialize: initialized
[D 08/14 16:40] dbpf_thread_function started
[D 08/14 16:40] [SYNC_COALESCE]: dbpf_sync_context_init for context 0 called
OMX: Completing iconnect request: Remote Endpoint is Closed
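The listen address in the log above and the aliases in the configuration all have the form mx://host:N:N. The following is a minimal sketch for splitting such an address so the host part can be compared against the omx_info peer table. It assumes the two numeric fields are the board index and the endpoint ID; that is an assumption about the bmi_mx address format, not something the output above confirms. MxAddress and parse_mx_address are hypothetical helper names, not part of PVFS2 or Open-MX.

```python
from typing import NamedTuple


class MxAddress(NamedTuple):
    """Hypothetical container for the pieces of an mx:// address."""
    host: str
    board: int      # assumed meaning: board (NIC) index
    endpoint: int   # assumed meaning: endpoint ID on that board


def parse_mx_address(address: str) -> MxAddress:
    """Split 'mx://host:board:endpoint' into its three fields."""
    prefix = "mx://"
    if not address.startswith(prefix):
        raise ValueError(f"not an mx:// address: {address!r}")
    parts = address[len(prefix):].split(":")
    if len(parts) != 3:
        raise ValueError(f"expected host:board:endpoint in {address!r}")
    host, board, endpoint = parts
    return MxAddress(host, int(board), int(endpoint))


# Aliases copied from the <Aliases> section of pvfs2-fs.conf.
ALIASES = {
    "begbie": "mx://begbie:0:0",
    "renton": "mx://renton:0:0",
    "tommy": "mx://tommy:0:0",
}

# Host names taken from the omx_info peer table.
PEER_TABLE_HOSTS = {"tommy", "renton", "begbie"}

for alias, address in ALIASES.items():
    parsed = parse_mx_address(address)
    status = "in peer table" if parsed.host in PEER_TABLE_HOSTS else "MISSING"
    print(f"{alias}: host={parsed.host} board={parsed.board} "
          f"endpoint={parsed.endpoint} ({status})")
```

A check like this only confirms that the names line up; it says nothing about whether an endpoint is actually open on the remote side.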
My pvfs2-fs.conf file contains:
<Defaults>
    UnexpectedRequests 50
    EventLogging all
    EnableTracing no
    LogStamp datetime
    BMIModules bmi_mx
    FlowModules flowproto_multiqueue
    PerfUpdateInterval 1000
    ServerJobBMITimeoutSecs 30
    ServerJobFlowTimeoutSecs 30
    ClientJobBMITimeoutSecs 300
    ClientJobFlowTimeoutSecs 300
    ClientRetryLimit 5
    ClientRetryDelayMilliSecs 2000
    PrecreateBatchSize 512
    PrecreateLowThreshold 256
    StorageSpace /raid/pvfs2-storage-space
    LogFile /var/log/pvfs2-server.log
</Defaults>
<Aliases>
    Alias begbie mx://begbie:0:0
    Alias renton mx://renton:0:0
    Alias tommy mx://tommy:0:0
</Aliases>
<Filesystem>
    Name pvfs2-fs
    ID 1937657241
    RootHandle 1048576
    FileStuffing yes
    <MetaHandleRanges>
        Range begbie 3-1537228672809129302
        Range renton 1537228672809129303-3074457345618258602
        Range tommy 3074457345618258603-4611686018427387902
    </MetaHandleRanges>
    <DataHandleRanges>
        Range begbie 4611686018427387903-6148914691236517202
        Range renton 6148914691236517203-7686143364045646502
        Range tommy 7686143364045646503-9223372036854775802
    </DataHandleRanges>
    <StorageHints>
        TroveSyncMeta yes
        TroveSyncData no
        TroveMethod alt-aio
    </StorageHints>
</Filesystem>
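To rule out a configuration typo, the handle ranges above can be sanity-checked. This is a minimal sketch, not a PVFS2 tool: it only checks that the six Range lines are contiguous and equally sized, and parse_ranges and find_gaps_or_overlaps are hypothetical helper names.

```python
# Range lines copied verbatim from pvfs2-fs.conf (meta ranges, then data ranges).
CONFIG_RANGES = """\
Range begbie 3-1537228672809129302
Range renton 1537228672809129303-3074457345618258602
Range tommy 3074457345618258603-4611686018427387902
Range begbie 4611686018427387903-6148914691236517202
Range renton 6148914691236517203-7686143364045646502
Range tommy 7686143364045646503-9223372036854775802
"""


def parse_ranges(text):
    """Return a list of (alias, start, end) tuples from 'Range' lines."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[0] != "Range":
            continue  # skip anything that is not a Range line
        start, end = fields[2].split("-")
        ranges.append((fields[1], int(start), int(end)))
    return ranges


def find_gaps_or_overlaps(ranges):
    """Return pairs of neighbouring aliases whose ranges do not meet exactly."""
    problems = []
    ordered = sorted(ranges, key=lambda r: r[1])
    for (prev_alias, _, prev_end), (alias, start, _) in zip(ordered, ordered[1:]):
        if start != prev_end + 1:
            problems.append((prev_alias, alias))
    return problems


ranges = parse_ranges(CONFIG_RANGES)
problems = find_gaps_or_overlaps(ranges)
sizes = {end - start + 1 for _, start, end in ranges}

print("ranges parsed:", len(ranges))
print("gaps or overlaps:", problems)
print("distinct range sizes:", sizes)
```

For this configuration the ranges are contiguous and all the same size, so the handle layout itself does not look like the source of the error.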