Hi,

This discussion has been brought to my attention, so I joined this
mailing list to try to help.

As you already stated that the SL maps correctly to PCP when using
ibv_rc_pingpong, I assume OpenMPI works over rdma_cm. In that case,
please note the following:

1. If you're using OFED-1.5.2, then if the rdma_cm socket is bound to
   a VLAN net device, all egress traffic will bear a default priority
   of 3.
2. The default priority is controlled by a module parameter to
   rdma_cm.ko named def_prec2sl.
3. You may change the priority on a per-socket basis (overriding the
   module parameter) by using setsockopt() to set the option
   RDMA_OPTION_ID_TOS to the required value of the TOS.
4. The TOS is mapped to SL according to the following formula:

       SL = TOS >> 5
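To make the arithmetic in point 4 concrete, here is a minimal sketch of the TOS/SL conversion, together with the SL-to-PCP step described in the quoted message below (bottom 3 bits of the SL). The helper names tos_to_sl, sl_to_tos and sl_to_pcp are hypothetical and are not part of OFED. The commented-out call at the end is an assumption on my part: in librdmacm, the function I know of that accepts RDMA_OPTION_ID_TOS is rdma_set_option() at level RDMA_OPTION_ID, so that is what the comment shows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; the names are illustrative, not an OFED API. */

/* SL = TOS >> 5: the top three bits of the 8-bit TOS select the SL. */
static inline uint8_t tos_to_sl(uint8_t tos)
{
    return (uint8_t)(tos >> 5);
}

/* Smallest TOS value that yields the requested SL (valid range 0..7). */
static inline uint8_t sl_to_tos(uint8_t sl)
{
    assert(sl <= 7);
    return (uint8_t)(sl << 5);
}

/* Per the quoted message: the bottom 3 bits of the SL become the PCP. */
static inline uint8_t sl_to_pcp(uint8_t sl)
{
    return (uint8_t)(sl & 0x7);
}

/*
 * Assumed usage with librdmacm (not compiled as part of this sketch):
 *
 *     uint8_t tos = sl_to_tos(3);
 *     rdma_set_option(id, RDMA_OPTION_ID, RDMA_OPTION_ID_TOS,
 *                     &tos, sizeof tos);
 */
```

Under this formula, any TOS from 96 through 127 (0x60 to 0x7f) gives SL 3, so sl_to_tos() simply picks the lowest value in that range.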
I hope that clears things up.

> Late yesterday I did have a chance to test the patch Jeff provided
> (against 1.4.3 - testing 1.5.x is on the docket for today). While it
> works, in that I can specify a gid_index, it doesn't do everything
> required - my traffic won't match a lossless CoS on the Ethernet
> switch. Specifying a GID is only half of it; I really need to also
> specify a service level.
>
> The bottom 3 bits of the IB SL are mapped to Ethernet's PCP bits in
> the VLAN tag. With a non-default gid, I can select an available VLAN
> (so RoCE's packets will include the PCP bits), but the only way to
> specify a priority is to use an SL. So far, the only RoCE-enabled app
> I've been able to make work correctly (such that traffic matches a
> lossless CoS on the switch) is ibv_rc_pingpong - and then, I need to
> use both a specific GID and a specific SL.
>
> The slides Pavel found seem a little misleading to me. The VLAN isn't
> determined by the bound netdev; all VLAN netdevs map to the same IB
> adapter for RoCE. VLAN is determined by gid index. Also, the SL
> isn't determined by a set kernel policy; it's provided via the IB
> interfaces. As near as I can tell from Mellanox's documentation, OFED
> test apps, and the driver source, a RoCE adapter is an InfiniBand card
> in almost all respects (even more so than an iWARP adapter).