[
https://issues.apache.org/jira/browse/QPID-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gregory Marsh updated QPID-1855:
--------------------------------
    Attachment: simple_fed_tcp.bash
                simple_fed_rdma.bash
                rdma_fed_bug_pub.cpp
> qpid-route -t rdma crashes C++ broker & client with lrg msgs
> ------------------------------------------------------------
>
> Key: QPID-1855
> URL: https://issues.apache.org/jira/browse/QPID-1855
> Project: Qpid
> Issue Type: Bug
> Components: C++ Broker, C++ Client, python tools
> Affects Versions: M4, 0.5
> Environment: Red Hat Enterprise Linux Server 5. Mellanox Infiniband
> MT25208 HCA. OFED-1.3.1 drivers.
> Reporter: Gregory Marsh
> Attachments: Makefile, qpid-m5-env.bash, rdma_fed_bug_cons.cpp,
> rdma_fed_bug_pub.cpp, simple_fed_rdma.bash, simple_fed_tcp.bash,
> start_qpidd.bash
>
>
> I've run into a problem when using "qpid-route -t rdma route add" to set up an
> rdma federation link between 2 brokers. I've attached some simple code that
> replicates the problem by sending just 1 message.
> Here's the setup. I have 4 nodes as follows:
> (publisher) -> (broker1) -> federation route -> (broker2) -> (consumer)
> Through trial and error I've found that when I send 1 message with payload
> size 7989 or greater, the (broker1) qpidd crashes with the following error:
> *******************
> qpidd: qpid/amqp_0_10/Connection.cpp:93: virtual size_t
> qpid::amqp_0_10::Connection::encode(const char*, size_t): Assertion
> `workQueue.empty() || workQueue.front().encodedSize() <= size' failed.
> *******************
> This does not happen on rdma with message sizes 7988 or less. It does not
> happen with tcp at all.
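
The assertion message quotes the condition that failed. As a toy model only (this is not Qpid's code; `ToyFrame` and `encode_precondition_holds` are hypothetical names), that condition can be restated as a small C++ predicate: the encoder expects either an empty work queue or a front frame whose encoded size fits in the buffer it was handed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a queued frame; not a Qpid type.
struct ToyFrame {
    std::size_t encoded_size;  // bytes the frame needs once encoded
};

// Restates the condition quoted in the assertion message:
//   workQueue.empty() || workQueue.front().encodedSize() <= size
// Returns true when the condition holds for an output buffer of `size` bytes.
inline bool encode_precondition_holds(const std::deque<ToyFrame>& work_queue,
                                      std::size_t size) {
    return work_queue.empty() || work_queue.front().encoded_size <= size;
}
```

For example, a queued frame of 100 bytes satisfies the condition for a 100-byte buffer and violates it for a 99-byte buffer. These sizes are illustrative and say nothing about the buffer sizes Qpid actually uses on either transport.
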
> Here is an explanation of how to use the attached code to (hopefully)
> replicate the problem. I set my PATH and LD_LIBRARY_PATH environment
> variables by running "source qpid-m5-env.bash".
> 0. Compile the 2 cpp files with make and the attached Makefile.
> 1. Start qpidd on the (broker1) & (broker2) hosts using "start_qpidd.bash".
> 2. Set up the route between the broker hosts using "simple_fed_rdma.bash".
> I've also included "simple_fed_tcp.bash", which does the same over tcp.
> 3. Start the consumer with "rdma_fed_bug_cons.exe". Use the rdma or tcp
> protocol according to how you've set up the route in step 2.
> $ ./rdma_fed_bug_cons.exe
> Usage: ./rdma_fed_bug_cons.exe
>     [broker_ip_addr]
>     [protocol (tcp|rdma)]
> 4. Start the publisher with "rdma_fed_bug_pub.exe". Use the rdma or tcp
> protocol according to how you've set up the route.
> $ ./rdma_fed_bug_pub.exe
> Usage: ./rdma_fed_bug_pub.exe
>     [broker_ip_addr]
>     [msg_size]
>     [protocol (tcp|rdma)]
>
> Again with rdma route and rdma protocol on clients, a msg_size 7989 or
> greater should crash.
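
If the crash depends only on message size, the threshold can be located by bisection rather than by trial and error. A minimal sketch, assuming failures are monotonic in size (every size at or above the threshold crashes, every smaller size passes) and that the upper bound is already known to fail; `smallest_failing_size` and the `fails` callback are hypothetical helpers, not part of the attached code:

```cpp
#include <cassert>
#include <cstddef>

// Returns the smallest size in [lo, hi] for which fails(size) is true.
// Assumptions (not verified here): fails(hi) is true, and fails is
// monotonic, i.e. once a size fails, every larger size fails too.
template <typename Pred>
std::size_t smallest_failing_size(std::size_t lo, std::size_t hi, Pred fails) {
    while (lo < hi) {
        // Midpoint computed this way to avoid overflow of lo + hi.
        std::size_t mid = lo + (hi - lo) / 2;
        if (fails(mid)) {
            hi = mid;      // threshold is at mid or below
        } else {
            lo = mid + 1;  // threshold is above mid
        }
    }
    return lo;
}
```

In practice `fails(n)` would run the publisher with msg_size n and report whether the broker1 qpidd died, restarting the brokers and re-adding the route after each crash. Over a search range of 1 to 65536 that is about 16 probes.
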
> My 4 hosts each have the following Mellanox Infiniband HCA with an assigned
> IPoIB interface address showing in "ifconfig". We are using OFED-1.3.1
> drivers:
> $ ibstat
> CA 'mthca0'
>     CA type: MT25208
>     Number of ports: 2
>     Firmware version: 5.1.400
>     Hardware version: a0
>     Node GUID: 0x0002c9020023c300
>     System image GUID: 0x0002c9020023c303
>     Port 1:
>         State: Active
>         Physical state: LinkUp
>         Rate: 20
>         Base lid: 2
>         LMC: 0
>         SM lid: 1
>         Capability mask: 0x02510a68
>         Port GUID: 0x0002c9020023c301
>     Port 2:
>         State: Down
>         Physical state: Polling
>         Rate: 10
>         Base lid: 0
>         LMC: 0
>         SM lid: 0
>         Capability mask: 0x02510a68
>         Port GUID: 0x0002c9020023c302
> Thanks for looking into this. Let me know if you have any problems
> compiling/running my code.
> Greg Marsh
> Network Based Computing Lab
> Ohio State University