Sunay Tripathi wrote:
> Tom McMillan wrote:
>
>> Nicolas Michael wrote:
>>
>>> Hello,
>>> - While we already have some ideas how to solve problem #1, the next
>>> problem really seems to destroy everything again: Sunay wrote that from
>>> the possible L3/L4 classifiers, nxge only uses the source IP address as a
>>> classifier for fanout. Since we are talking about the traffic on our
>>> cluster interconnect, we always have the same source address for *all* our
>>> traffic: It's all inter-node traffic, always coming from the other node of
>>> the cluster! This means no matter how many connections we use, they will
>>> always be mapped to the same cpu since all have the same source IP address!
>>> (Since we will use 2 or 4 NICs as the interconnect for reasons of
>>> redundancy, they will be mapped to 2 or 4 cpus -- but not to 16 or 32). We
>>> will need some kind of solution for this! Do you have any ideas about what
>>> could be a solution for this? E.g., do you plan to extend nxge to also
>>> consider source & destination port as classifiers for fanout?
>>>
>> Just to clarify what Matheos & Sunay have already mentioned. I have worked
>> on a TCAM manager for the NXGE, and you can classify flows according to any
>> of the protocol fields mentioned.
>>
>> Here is an excerpt from the description of a Neptune IPv4 TCAM key (IPv6
>> is a different matter):
>>
>> bits
>> 111:104   TOS byte.
>> 103:96    Protocol ID.
>> 95:64     Either L4 port numbers or SPI.
>> 63:32     ip_addr_sa   IP source address.
>> 31:0      ip_addr_da   IP destination address.
>>
>> Here is a Crossbow flow description:
>>
>> typedef struct flow_desc_s {
>>         flow_mask_t               fd_mask;
>>         struct ether_vlan_header  fd_mac;
>>         char                      fd_ipversion;
>>         char                      fd_protocol;
>>         in6_addr_t                fd_remoteaddr;
>>         in6_addr_t                fd_localaddr;
>>         in_port_t                 fd_remoteport;
>>         in_port_t                 fd_localport;
>>         uint32_t                  fd_sap;
>>         char                      fd_pad[4]; /* 64-bit alignment */
>> } flow_desc_t;
>>
>> As Matheos mentioned previously, the NXGE has 16 receive channels
>> available. So if you programmed the TCAM accordingly (you can't
>> program it directly, but you can certainly use Crossbow to do so), you
>> could define 16 different flows.
>>
>
> Tom,
>
> I don't think that works. The connection attributes related to
> ports etc get changed as connections get re-established or machines
> reboot. And users shouldn't be interacting with TCAMs and figuring
> out how to program them either with Crossbow or without.
>
Yep. The feature to use here is the ring group load balancing policy.

From the administrator side, dladm(1m)'s existing <-P policy> option for
setting the outbound load balancing criteria between members of an
aggregation is being both generalized to extend to multiple rings of the
same NIC/VNIC, and made applicable to both directions.

From the driver interface side, the driver exposes a mac_set_lb_t entry
point (MAC set Load Balancing) as part of the mac_rx_ring_group_info_t.

This is detailed in the Crossbow Hardware Resources Management and
Virtualization document:
<http://www.opensolaris.org/os/project/crossbow/Docs/virtual_resources.pdf>

Kais.

> Cheers,
> Sunay
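
To illustrate the fanout problem Nicolas describes, and why a load balancing
policy that also looks at ports helps, here is a minimal sketch in C. It is
not the nxge hash, the TCAM logic, or the Crossbow mac_set_lb_t interface.
The flow_tuple_t type, the function names, and the mixing function (the
32-bit finalizer from MurmurHash3) are all assumptions chosen only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow tuple; NOT Crossbow's flow_desc_t or an nxge type. */
typedef struct flow_tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
} flow_tuple_t;

/* 32-bit mixer (MurmurHash3 finalizer); an assumption, not the NIC's hash. */
static uint32_t
mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bU;
	h ^= h >> 13;
	h *= 0xc2b2ae35U;
	h ^= h >> 16;
	return (h);
}

/* Fanout on source IP only: all flows from one peer pick the same ring. */
static unsigned
ring_by_src_ip(const flow_tuple_t *f, unsigned nrings)
{
	assert(nrings > 0);
	return (mix32(f->src_ip) % nrings);
}

/* Fanout on addresses and ports: flows from one peer can spread out. */
static unsigned
ring_by_4tuple(const flow_tuple_t *f, unsigned nrings)
{
	uint32_t h;

	assert(nrings > 0);
	h = mix32(f->src_ip);
	h = mix32(h ^ f->dst_ip);
	h = mix32(h ^ (((uint32_t)f->src_port << 16) | f->dst_port));
	return (h % nrings);
}

/*
 * Count the distinct rings hit by nflows connections between the same two
 * nodes that differ only in source port (the cluster interconnect case).
 */
static unsigned
rings_used(unsigned nflows, unsigned nrings, int use_ports)
{
	uint64_t seen = 0;	/* one bit per ring, so nrings <= 64 */
	unsigned i, count = 0;

	assert(nrings > 0 && nrings <= 64);
	for (i = 0; i < nflows; i++) {
		flow_tuple_t f = { 0x0a000001U, 0x0a000002U,
		    (uint16_t)(32768 + i), 5000 };
		unsigned r = use_ports ?
		    ring_by_4tuple(&f, nrings) : ring_by_src_ip(&f, nrings);

		seen |= (uint64_t)1 << r;
	}
	for (i = 0; i < nrings; i++) {
		if (seen & ((uint64_t)1 << i))
			count++;
	}
	return (count);
}
```

Calling rings_used(64, 16, 0) returns 1, because the source address is the
only input and it never changes between the two cluster nodes. With
rings_used(64, 16, 1) the ports enter the hash, so the same 64 connections
land on more than one ring. Because the ring is derived from the packet
headers at receive time, nothing has to be reprogrammed when connections are
re-established with new ports, which is the concern Sunay raises about
per-flow TCAM entries.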