I am using pvfs-2.8.2, which works over TCP, but ideally I want to run it over Open-MX. I have installed and configured open-mx-1.3.1, and it is running on all three servers.

Does anyone actually have Open-MX working with PVFS2? I have set the MX_IMM_ACK environment variable to 1 as directed in the FAQ, and all my connectivity tests with Open-MX seem to work just fine.

Below I have attached relevant output and configuration files.

Thanks for any help you can offer!

Josh.



The output of omx_info shows that all three hosts are successfully communicating over Ethernet.
$ sudo /opt/open-mx/bin/omx_info
Open-MX version 1.3.1
build: jrand...@tommy:/usr/local/src/open-mx/open-mx-1.3.1 Fri Aug 13 19:07:08 BST 2010

Found 1 boards (32 max) supporting 32 endpoints each:
tommy:0 (board #0 name eth3 addr 00:1b:21:4f:4b:e6)
  managed by driver 'ixgbe'
  attached to numa node 0

Peer table is ready, mapper is 00:00:00:00:00:00
================================================
 0) 00:1b:21:4f:4b:e6 tommy:0
 1) 00:1b:21:4d:ba:92 renton:0
 2) 00:1b:21:4f:4d:5a begbie:0
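
A quick cross-check of this peer table against the Alias entries in pvfs2-fs.conf (shown further down) can be sketched as follows. This is only a minimal sketch and not part of Open-MX or PVFS2: the function names are hypothetical, and it assumes that mx:// addresses have the form mx://host:board:endpoint and that the host:board part should match a peer name exactly as omx_info prints it.

```python
import re

# Peer table lines copied from the omx_info output above.
PEER_TABLE = """\
 0) 00:1b:21:4f:4b:e6 tommy:0
 1) 00:1b:21:4d:ba:92 renton:0
 2) 00:1b:21:4f:4d:5a begbie:0
"""

# Alias lines copied from pvfs2-fs.conf.
ALIASES = """\
        Alias begbie mx://begbie:0:0
        Alias renton mx://renton:0:0
        Alias tommy mx://tommy:0:0
"""


def peers(text):
    """Return the set of 'host:board' names listed in a peer table."""
    pattern = r"^\s*\d+\)\s+[0-9a-f:]{17}\s+(\S+)\s*$"
    return {m.group(1) for m in re.finditer(pattern, text, re.M)}


def alias_targets(text):
    """Map alias name -> (host, board, endpoint) for mx:// addresses.

    Assumption: the address layout is mx://host:board:endpoint.
    """
    pattern = r"^\s*Alias\s+(\S+)\s+mx://([^:\s]+):(\d+):(\d+)\s*$"
    return {
        m.group(1): (m.group(2), int(m.group(3)), int(m.group(4)))
        for m in re.finditer(pattern, text, re.M)
    }


def missing_peers(alias_text, peer_text):
    """Return the aliases whose host:board is absent from the peer table."""
    known = peers(peer_text)
    return sorted(
        name
        for name, (host, board, _endpoint) in alias_targets(alias_text).items()
        if f"{host}:{board}" not in known
    )


if __name__ == "__main__":
    print(missing_peers(ALIASES, PEER_TABLE))  # prints []
```

For the values in this post the list is empty, so every alias points at a host and board that Open-MX already knows about. The check says nothing about whether the endpoint number in each address is actually open.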


The output of omx_endpoint_info shows all 32 endpoints are available.
$ sudo /opt/open-mx/bin/omx_endpoint_info
tommy:0 (board #0 name eth3 addr 00:1b:21:4f:4b:e6)
==============================================
 raw   open by pid 20653 (omxoed)
0 regular endpoints open (out of 32)


When I run pvfs2-server with PVFS2_DEBUGMASK="all", I get a "Remote Endpoint is Closed" error and the server exits with code 255.
$ sudo /usr/local/sbin/pvfs2-server /etc/pvfs2-fs.conf -d

[S 08/14 16:40] PVFS2 Server on node tommy version 2.8.2 starting...
[D 08/14 16:40] Logging all (mask 18446744073709551615)
[D 08/14 16:40] PINT_encode_initialize
[D 08/14 16:40] lebf_initialize
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] PINT_do_request_commit: commit node 0x7fff0a7e6e40
[D 08/14 16:40] node stored at 0
[D 08/14 16:40] clearing tree
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] PINT_do_request_commit: commit node 0x7fff0a7e6e40
[D 08/14 16:40] node stored at 0
[D 08/14 16:40] clearing tree
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_req_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_req
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] check_resp_size
[D 08/14 16:40] encode_common
[D 08/14 16:40] lebf_encode_resp
[D 08/14 16:40] lebf_encode_rel
[D 08/14 16:40] Passing mx://tommy:0:0 as BMI listen address.
OMX: Emulating MX_DISABLE_SHMEM as OMX_DISABLE_SHARED
OMX: Forcing shared comms to disabled
OMX: Setting 4 bits of context id at offset 60 in matching
[D 08/14 16:40] Server using shm key hint: 1937657271
[D 08/14 16:40] [BMI CONTROL]: BMI_set_info: set_info: 0 option: 11
[D 08/14 16:40] [BMI CONTROL]: BMI_set_info: set_info: 0 option: 12
[D 08/14 16:40] dbpf_thread_initialize: initialized
[D 08/14 16:40] dbpf_thread_function started
[D 08/14 16:40] [SYNC_COALESCE]: dbpf_sync_context_init for context 0 called
OMX: Completing iconnect request: Remote Endpoint is Closed



My pvfs2-fs.conf file contains:
<Defaults>
        UnexpectedRequests 50
        EventLogging all
        EnableTracing no
        LogStamp datetime
        BMIModules bmi_mx
        FlowModules flowproto_multiqueue
        PerfUpdateInterval 1000
        ServerJobBMITimeoutSecs 30
        ServerJobFlowTimeoutSecs 30
        ClientJobBMITimeoutSecs 300
        ClientJobFlowTimeoutSecs 300
        ClientRetryLimit 5
        ClientRetryDelayMilliSecs 2000
        PrecreateBatchSize 512
        PrecreateLowThreshold 256

        StorageSpace /raid/pvfs2-storage-space
        LogFile /var/log/pvfs2-server.log
</Defaults>

<Aliases>
        Alias begbie mx://begbie:0:0
        Alias renton mx://renton:0:0
        Alias tommy mx://tommy:0:0
</Aliases>

<Filesystem>
        Name pvfs2-fs
        ID 1937657241
        RootHandle 1048576
        FileStuffing yes
        <MetaHandleRanges>
                Range begbie 3-1537228672809129302
                Range renton 1537228672809129303-3074457345618258602
                Range tommy 3074457345618258603-4611686018427387902
        </MetaHandleRanges>
        <DataHandleRanges>
                Range begbie 4611686018427387903-6148914691236517202
                Range renton 6148914691236517203-7686143364045646502
                Range tommy 7686143364045646503-9223372036854775802
        </DataHandleRanges>
        <StorageHints>
                TroveSyncMeta yes
                TroveSyncData no
                TroveMethod alt-aio
        </StorageHints>
</Filesystem>
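
To rule out a typo in the handle ranges above, they can be checked for inverted or overlapping intervals. The following is a minimal sketch rather than a PVFS2 tool: the function name and the "meta/" and "data/" labels are hypothetical, and the numbers are copied by hand from the config file.

```python
# Handle ranges copied from the <MetaHandleRanges> and <DataHandleRanges>
# sections of pvfs2-fs.conf. The "meta/" and "data/" labels are only for
# readable messages.
RANGES = [
    ("meta/begbie", 3, 1537228672809129302),
    ("meta/renton", 1537228672809129303, 3074457345618258602),
    ("meta/tommy", 3074457345618258603, 4611686018427387902),
    ("data/begbie", 4611686018427387903, 6148914691236517202),
    ("data/renton", 6148914691236517203, 7686143364045646502),
    ("data/tommy", 7686143364045646503, 9223372036854775802),
]


def find_problems(ranges):
    """Return messages for inverted ranges and for overlapping neighbours."""
    problems = []
    ordered = sorted(ranges, key=lambda r: r[1])  # sort by range start

    # A range whose start is above its end cannot hold any handle.
    for name, lo, hi in ordered:
        if lo > hi:
            problems.append(f"{name}: start {lo} is greater than end {hi}")

    # After sorting by start, comparing each range with its successor
    # is enough to detect overlaps between neighbouring ranges.
    for (a, _a_lo, a_hi), (b, b_lo, _b_hi) in zip(ordered, ordered[1:]):
        if b_lo <= a_hi:
            problems.append(f"{a} overlaps {b}")

    return problems


if __name__ == "__main__":
    print(find_problems(RANGES))  # prints []
```

The ranges in this config come out clean, so the handle layout does not look like the cause of the failure.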


_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
