Narayan,

I haven't heard back from you since I sent my response to your comments
three weeks ago. In order to make our current schedule, we need to
resolve all points of contention regarding LDoms by October 19th, and
I'm concerned by the lack of progress in this discussion.
When can we expect to see a follow-up from you so that we can bring
closure to this discussion?

Nicolas.

On Sep 19, 2007, at 10:13 AM, Nicolas Droux wrote:

> Narayan,
>
> Thanks for the comments, my answers below...
>
> On Sep 8, 2007, at 12:12 PM, Narayan Venkat wrote:
>
>> 1) MAC client open/close related:
>>    (Crossbow-virt.pdf Section 5.2.2 Pg 36)
>>       int mac_client_open(mac_handle_t *mh,
>>           mac_client_handle_t *mchp, mac_bind_cpus
>>       void mac_client_close(mac_client_handle_t mch);
>>
>>       typedef struct mac_bind_cpus_s {
>>           uint_t mbc_ncpus;
>>           uint32_t *mbc_cpus;
>>       } mac_bind_cpus_t;
>>
>>    Q1.1) The mac_client_open() interface definition line in the
>>          document is abruptly cut. It seems like there are
>>          additional arguments such as flags etc.
>
> Yes, there's a missing break in that line, and the flag argument is
> missing, will fix.
>
>>    Q1.2) On pg 38, there is a reference to the following flags,
>>          but which interface takes them as an argument?
>>
>>             MAC_OPEN_FLAGS_FORCE_MULTI_RINGS
>>             MAC_OPEN_FLAGS_FORCE_ONE_RING
>>
>>          It seems like these are an argument to mac_client_open(),
>>          but there is a reference to mac_open() in the description,
>>          see below:
>>
>>          "If MAC_OPEN_FLAGS_FORCE_MULTI_RINGS flag is set and it is
>>          not possible to allocate mbc_ncpus hardware rings, the
>>          mac_open() call will fail, otherwise the MAC layer will
>>          attempt to reserve one hardware ring for the MAC client."
>
> These flags are specified when calling mac_client_open(), not
> mac_open().
>
>>    Q1.3) Are there any flags other than the following ones?
>>
>>             MAC_OPEN_FLAGS_FORCE_MULTI_RINGS
>>             MAC_OPEN_FLAGS_FORCE_ONE_RING
>
> No.
>
>>          - Is there a way to force a software ring?
>
> Do you mean not assign a hardware ring? I think this is something we
> could add, yes.
>
>>    Q1.4) Is the mbc_cpus in mac_bind_cpus_t an array of CPU ids?
>
> Yes.
>
>>    Q1.5) The following description of mbc_cpus on pg 37 is not
>>          clear, especially for the non-NULL case.
>>
>>          "If mbc_cpus is NULL, the MAC layer will pick the CPUs.
>>          If mbc_cpus is non-NULL, the MAC layer will choose the CPUs."
>
> The first one is correct. If mbc_cpus is non-NULL, the MAC layer will
> assign the CPUs provided by the caller.
>
>>    Q1.6) What is the relationship between unicast addresses
>>          (multiple unicast addresses set via mac_unicast_add()),
>>          rings and CPUs?
>>
>>          - Is there a 1:1 relation between a unicast address and a
>>            ring?
>>          - Is there a 1:1 relation between a ring and a CPU?
>
> Neither. The MAC addresses will share the same rings and CPUs.
>
>>          - The rings and CPUs are tightly coupled in this interface.
>>            How can one allocate multiple rings even when there is
>>            one CPU (or a smaller number of CPUs)?
>
> You don't allocate rings explicitly; you express a level of
> parallelism instead, and the framework distributes the hardware rings
> transparently.
>
>>          - When there are multiple CPUs and multiple unicast
>>            addresses, is there address fanout per CPU?
>
> See 2 answers above.
>
>>    Q1.7) How is the binding of CPUs via mac_bind_cpus_t coordinated
>>          with CPU DR (on the platforms that support it)?
>
> The MAC layer will be notified of the removal of the CPU and will
> stop using it for its worker threads and interrupts.
>
>>          NOTE: CPU DR is already a supported feature on LDoms.
>>
>>    Q1.8) LDoms requires the CPU binding to be changed dynamically,
>>          how can this be accomplished?
>
> This cannot be done with the API as documented today. It seems that
> you are looking for a call to change the set of CPUs assigned to the
> MAC client, is that what you are asking for?
>
>>    Q1.9) Regarding the following XXX on pg 37: when will the
>>          interface changes for priority and bandwidth specification
>>          be available?
>>
>>          "XXX We still need to add the priority and bandwidth
>>          limit as argument to mac_open(). We also need an entry
>>          point to change the set of CPUs."
>
> I'm working on it but I don't have a firm date.
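To make the mbc_cpus semantics from Q1.5 concrete, here is a minimal sketch in C of the two cases. Only the mac_bind_cpus_t layout comes from the document quoted above; the helper example_resolve_cpus(), its 'online' CPU list, and the "first n online CPUs" policy for the NULL case are assumptions made purely for illustration and do not describe how the MAC layer actually picks CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned int uint_t;	/* as on Solaris; declared here to be self-contained */

/* Layout as quoted from the design document. */
typedef struct mac_bind_cpus_s {
	uint_t		mbc_ncpus;
	uint32_t	*mbc_cpus;
} mac_bind_cpus_t;

/*
 * Hypothetical helper (not a documented interface): write into 'out' the
 * CPU ids a client would be bound to and return how many were written.
 * 'online'/'nonline' stand in for whatever CPU list the layer consults.
 */
static uint_t
example_resolve_cpus(const mac_bind_cpus_t *mbc, const uint32_t *online,
    uint_t nonline, uint32_t *out)
{
	uint_t i, n;

	assert(mbc != NULL && out != NULL);
	n = mbc->mbc_ncpus;

	if (mbc->mbc_cpus != NULL) {
		/* Non-NULL: use exactly the CPU ids provided by the caller. */
		for (i = 0; i < n; i++)
			out[i] = mbc->mbc_cpus[i];
		return (n);
	}

	/* NULL: the layer picks; this sketch takes the first n online CPUs. */
	if (n > nonline)
		n = nonline;
	for (i = 0; i < n; i++)
		out[i] = online[i];
	return (n);
}
```

In both cases mbc_ncpus expresses the requested degree of parallelism; only the source of the CPU ids differs.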
>
>>    Q1.10) Can the MAC client interface be extended to support
>>           creating a client based on ether_type? This is required
>>           for MAC clients like Fibre Channel over Ethernet.
>
> No, each MAC client corresponds to a MAC level entity which is
> defined by its MAC address. Multiple ether types can be supported on
> top of a MAC client.
>
>> 2) MAC Unicast address related:
>>    (Crossbow-virt.pdf Section 5.2.4 Pg 38)
>>
>>       mac_unicast_handle_t mac_unicast_add(mac_client_handle_t mch,
>>           mac_addr_type_t addr_type, int *addr_slot, uint_t prefix_len,
>>           uchar_t *mac_addr, uint32_t flags);
>>
>>       void mac_unicast_unset(mac_unicast_handle_t);
>>       void mac_unicast_get(mac_unicast_handle_t mah, uchar_t *mac_addr);
>>       void mac_unicast_update(mac_unicast_handle_t mah,
>>           mac_addr_type_t addr_type, int *addr_slot, uint_t prefix_len,
>>           uchar_t *mac_addr);
>>
>>    QUESTIONS:
>>    Q2.1) Section 4.5 describes the "By value" type, which is used
>>          to set a specific MAC address by the MAC client. But there
>>          is no equivalent addr_type definition under the
>>          mac_unicast_add() interface.
>
> MAC_UNICAST_VALUE is missing from the list, this is what you are
> looking for.
>
>>          NOTE: LDoms requires that the MAC addresses allocated by
>>          the LDom manager be used by the network device. So, LDoms
>>          will not use any addr_type other than the "By value" type.
>
> That's fine.
>
>>    Q2.2) Is there an impact to the multiaddress_capab_t.maddr_add()/
>>          maddr_remove() interfaces? Are these being obsoleted or
>>          going away?
>
> The capability will stay, and the framework will continue to use that
> capability to query and control the allocation of MAC address slots.
> However that interface is not intended to be used by drivers, which
> should use the MAC client interfaces instead.
>
>>    Q2.3) A system with many domains (aka LDoms) with virtual network
>>          devices requires the use of a large number of layer2
>>          addresses; this will exhaust the h/w slots available on
>>          most standard NICs.
>>          How can a client take advantage of layer2 filtering
>>          provided by NICs like NII-NIU/Neptune? Specifically, this
>>          will help in avoiding the programming of the device into
>>          promiscuous mode etc. Currently there are no interfaces
>>          that seem to provide such ability.
>
> Yes, this is a situation we are aware of. We've talked on this list
> about having multiple VNICs sharing the same MAC address, and
> identified by their IP address instead. However this needs to be
> scoped and defined further before we can commit to providing that
> functionality.
>
>>    Q2.4) Clients will need the ability to specify whether
>>          mac_unicast_add() is allowed to go into promiscuous mode
>>          or not. An error return value is required if no h/w MAC
>>          address slot is available.
>
> OK, I will add a flag.
>
>>    Q2.5) On pg 40, the following description still points to the
>>          rings argument even though it has been removed from the
>>          mac_unicast_add() interface.
>>
>>          "The rings argument specifies the list of rings to
>>          associate with the specified unicast MAC address.
>>          If it is NULL, the MAC layer allocates a set of rings
>>          according to those available to the MAC client, see
>>          Section ringselection."
>
> This should be removed, good catch.
>
>>    Q2.6) Can it be assumed that every address added to a client is
>>          processed in a separate ring (either h/w ring or s/w
>>          ring)?
>
> No, all the MAC addresses for a client will share the same ring(s).
> If there's a need to have a different set of rings associated with a
> MAC address, then a different MAC client should be created.
>
>>    Q2.7) How are the multiple addresses per client maintained: is
>>          it done in the MAC layer, or does it bypass the MAC layer
>>          and get passed to h/w directly?
>
> Since the action of reserving the MAC address is triggered by a call
> to the MAC layer, the MAC layer cannot be bypassed.
> The MAC layer will use the multiple MAC address capability exposed
> by the driver to reserve a new MAC address slot.
>
>>    Q2.8) Can an unlimited number of MAC addresses be assigned to a
>>          MAC client? What are the software/hardware features that
>>          limit this?
>
> Memory that can be allocated by the kernel.
>
>> 3) Rings related:
>>    (Crossbow-virt.pdf Section 5.3 Pg 43)
>>       mac_rint_t *mac_rings_list_get(mac_client_handle_t mch,
>>           uint_t nrings);
>>       void mac_rings_list_free(mac_rings_t *rings, uint_t nrings);
>>       uint16_t mac_ring_get_flags(mac_ring_t ring);
>>
>>    QUESTIONS:
>>
>>    Q3.1) All of these interfaces are now categorized as
>>          project-private API. What motivated this change? These
>>          interfaces need to be more open.
>
> The MAC layer will do the allocation of hardware resources to the
> various MAC clients and their flows. Instead of having each MAC
> client manage its own set of resources, the resources are allocated
> to MAC clients based on their needs, for example the degree of
> parallelism expressed through mac_client_open(). If you have specific
> functional requirements that are not satisfied by the current
> document, please list them.
>
>>    Q3.2) The mac_rings_list_get() is only for h/w rings; is there
>>          an equivalent interface to obtain s/w ring information?
>>          Or can this interface be extended to return both h/w ring
>>          and s/w ring information?
>
> The interface will evolve to provide that information, but it will
> remain project private. It is provided here FYI but will change in
> future revisions of the document.
>
>>    Q3.3) Are the mac_resource_set() and mac_resources() interfaces
>>          going away?
>
> Yes, they will be replaced by different interfaces. But note that
> they are already project private in Nevada and were not supposed to
> be used by other ON components.
>
>>    Q3.4) What is the action taken when no free h/w ring is available?
>>          As per the documentation of mac_rings_list_get(), if no
>>          h/w ring is available, it returns NULL. In such a case,
>>          how does mac_unicast_add() behave when NULL is passed for
>>          rings?
>
> mac_unicast_add() no longer takes rings. This will be handled
> transparently to the MAC clients by using a default ring and falling
> back to software classification.
>
>>    Q3.5) Are there any interfaces other than the above mac_rings_xxx
>>          interfaces that are available to deal with MAC rings?
>
> Not available to MAC clients. The set of project private interfaces
> might evolve as we refine the design.
>
>>    Q3.6) Does mac_rings_list_get() return the list of MAC rings
>>          assigned to the client at the time of client open? How can
>>          this be changed after the client is open?
>
> The set of assigned rings may change. The details on the APIs needed
> to support this still need to be defined, but they will remain
> project private.
>
>>    Q3.7) Assigning h/w rings to a specific MAC address limits the
>>          bandwidth to the number of rings that are assigned to that
>>          address. Is there a way not to bind h/w rings to a specific
>>          MAC address so that the bandwidth could be used by
>>          any MAC client depending on the traffic?
>
> See Q1.3.
>
>> 4) Receive callback related:
>>    (Crossbow-virt.pdf Section 5.2.5 Pg 40)
>>       int mac_rx_set(mac_client_handle_t mch, mac_rx_fn_t rx_fn,
>>           void *arg);
>>       int mac_rx_clear(mac_client_handle_t mch);
>>
>>    QUESTIONS:
>>
>>    Q4.1) How can a client get an rx callback per ring that is
>>          assigned to the MAC client? This will allow parallel
>>          processing and improve the performance. Such a feature is
>>          already being used in the current implementation of the
>>          LDoms vSwitch driver and the mac_xxx interfaces should
>>          support such an ability.
>
> The parallel processing will still happen, i.e. if multiple hardware
> rings or software rings are assigned to a MAC client, multiple
> connections associated with that MAC client will be spread across
> these rings.
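The fallback described in the answer to Q3.4 (a default ring plus software classification when no hardware ring is free) could be pictured roughly as follows. This is only a sketch under stated assumptions: the names example_assign_ring() and example_ring_assignment_t, the choice of ring 0 as the shared default ring, and the in_use[] bookkeeping are all hypothetical and are not taken from the design document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	EXAMPLE_DEFAULT_RING	0	/* assumption: ring 0 is the shared default */

typedef struct example_ring_assignment {
	size_t	era_ring;		/* ring the client's traffic arrives on */
	bool	era_sw_classify;	/* true: packets are sorted in software */
} example_ring_assignment_t;

/*
 * Hypothetical allocator: reserve a free hardware ring if one exists,
 * otherwise fall back to the default ring with software classification.
 * in_use[i] records whether ring i is already reserved.
 */
static example_ring_assignment_t
example_assign_ring(bool *in_use, size_t nrings)
{
	example_ring_assignment_t a = { EXAMPLE_DEFAULT_RING, true };
	size_t i;

	assert(in_use != NULL && nrings > 0);

	/* The default ring itself is never handed out exclusively. */
	for (i = EXAMPLE_DEFAULT_RING + 1; i < nrings; i++) {
		if (!in_use[i]) {
			in_use[i] = true;
			a.era_ring = i;
			a.era_sw_classify = false;
			break;
		}
	}
	return (a);
}
```

The point of the sketch is that the caller never sees a failure: it always gets a usable ring, and only the classification cost differs.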
>
>>    Q4.2) How can a client get a separate callback for a defined
>>          type of traffic, such as different SAP numbers etc.? This
>>          will be useful to provide out-of-band packet processing
>>          or related services.
>
> This will be supported by a MAC flow API built on top of the MAC
> client API. The flow API will be described by a separate document.
>
>>    Q4.3) There is a reference to mac_addr_set(), should it be
>>          mac_unicast_add()?
>
> Yes, will fix.
>
>> 5) Transmit related:
>>
>>    (Crossbow-virt.pdf Section 5.2.7 Pg 41)
>>       mblk_t *mac_tx(mac_client_handle_t *mch, mblk_t *mp,
>>           uint64_t hint);
>>
>>    QUESTIONS:
>>
>>    Q5.1) What are the valid values for the 'hint' argument?
>>          From the description on pg 42, NULL seems to be
>>          a valid value. Is it safe to assume that the 'hint' is a
>>          ring-id? If so, a NULL value of 0 will conflict with a
>>          ring-id of 0.
>
> The hint can be any 64 bit value, but it must always be the same
> value for the packets corresponding to the same connection to avoid
> reordering. TCP and UDP for example pass the connection pointer as
> the hint, which allows us to avoid packet inspection for these
> protocols.
>
>>    Q5.2) If NULL is specified as a 'hint', how is the tx ring
>>          selected?
>
> In this case mac_tx() will parse the packet headers and hash on the
> header information to select a transmit ring.
>
>>    Q5.3) The 'hint' argument description says the following.
>>          What is the meaning of a connection in this context and
>>          how is it identified?
>>
>>          "The hint must be the same for packets of the same
>>          connection."
>
> It can be a TCP connection for example. This is required to avoid
> reordering of packets for the same connection.
>
>> 6) Multicast addresses related:
>>    (Crossbow-virt.pdf Section 5.2.6 Pg 41)
>>       int mac_multicast_add(mac_client_handle_t mch,
>>           const uint8_t *addr);
>>       int mac_multicast_remove(mac_client_handle_t mch,
>>           const uint8_t *addr);
>>
>>    No comments at this point.
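The hint semantics from Q5.1 and Q5.2 can be illustrated with a short sketch: a non-zero hint is used directly as the selection key, and a zero hint falls back to hashing the packet headers. Everything here is an assumption for illustration only. example_tx_ring() and example_hdr_hash() are hypothetical names, treating 0 as "no hint" follows the NULL wording in the question, and FNV-1a stands in for whatever header hash mac_tx() really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in header hash (64-bit FNV-1a); the real hash is not documented here. */
static uint64_t
example_hdr_hash(const uint8_t *hdr, size_t len)
{
	uint64_t h = 14695981039346656037ULL;	/* FNV-1a 64-bit offset basis */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= hdr[i];
		h *= 1099511628211ULL;		/* FNV-1a 64-bit prime */
	}
	return (h);
}

/*
 * Hypothetical ring selection: the same hint (or, without a hint, the
 * same header bytes) always maps to the same ring, so the packets of a
 * connection are not reordered across rings.
 */
static unsigned int
example_tx_ring(uint64_t hint, const uint8_t *hdr, size_t len,
    unsigned int nrings)
{
	uint64_t key;

	assert(nrings > 0);

	/* A non-zero hint avoids inspecting the packet at all. */
	key = (hint != 0) ? hint : example_hdr_hash(hdr, len);

	/* Mix the upper bits into the lower ones before reducing. */
	key ^= key >> 32;
	key ^= key >> 16;
	return ((unsigned int)(key % nrings));
}
```

A caller such as TCP would pass a per-connection value (for instance the connection pointer cast to uint64_t) as the hint on every packet of that connection; the hint is an opaque key here, not a ring id.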
>>
>> 7) Promiscuous mode related:
>>
>>    (Crossbow-virt.pdf Section 5.2.8 Pg 42)
>>    It's not clear if the above interface will be available or not,
>>    but two new interfaces are added:
>>
>>       int mac_promisc_add(mac_client_handle_t mch,
>>           mac_promisc_type promisc_type, mac_promisc_fn_t promisc_fn,
>>           void *arg, mac_promisc_handle_t *php);
>>       int mac_promisc_remove(mac_client_handle_t mch,
>>           mac_promisc_handle_t *ph);
>>
>>       MAC_PROMISC_ALL   - send all packets
>>       MAC_PROMISC_MULTI - only broadcast and multicast
>>
>>    Maybe the mac_promisc_add(MAC_PROMISC_ALL) will force the device
>>    to operate in promiscuous mode.
>
> Both need to, since the device also needs to be in promiscuous mode
> to receive all multicast traffic.
>
>>    QUESTIONS:
>>
>>    Q7.1) According to section 4.6, the promiscuous mode operates
>>          in the layer2 switch model. When choosing the promiscuous
>>          mode model, can it be either the layer2 switch model or
>>          the shared ethernet model?
>
>>    Q7.2) From the explanation of mac_promisc_add(), it seems like
>>          mac_promisc_add() could be called without setting a
>>          MAC address via mac_unicast_add(). Is this correct?
>>          If so, what is the expected behaviour?
>
> Currently we provide the same semantics as a switched environment,
> i.e. a MAC client will see the same traffic that would be seen by a
> NIC connected to a switch.
>
> What we would also like to provide is the ability for a MAC client
> to obtain all the traffic going in and out of the box, as well as the
> traffic exchanged between MAC clients. The non-unicast address was
> part of that solution.
>
> Another option would be to generalize this with the shared ethernet
> model, and allow a MAC client to specify that it wants to observe all
> traffic via a separate promiscuous type. I need to see how this can
> be added to the API.
>
>> 8) Statistics related:
>>
>>    Q8.1) Is the mac_stat_get() interface being obsoleted or
>>          changed? If so, what is the new equivalent interface?
>
> Yes, there will be a new MAC client interface. The MAC layer will
> also maintain per-MAC client statistics for MAC client specific
> statistics such as number of packets sent/received, etc. I need to
> add that interface to the document.
>
>> GENERAL QUESTIONS:
>> ==================
>>
>>    Qg.1) Are there any GLDv3 MAC client interfaces (provided by the
>>          Nemo framework) that are being obsoleted but not
>>          documented in this doc?
>
> The MAC client interface was project private, and most of the
> interface is being completely revamped by Crossbow. The set of MAC
> client APIs available to ON consolidation components is described by
> section 5.2 of the document. Any other MAC client APIs are still
> project private.
>
>>    Qg.2) Are there any changes to the MAC driver interfaces, or
>>          are any being obsoleted?
>
> The changes made to the driver API will be published as part of a
> separate forthcoming document.
>
>>    Qg.3) There are no MAC client interfaces to specify bandwidth
>>          attributes. From section 4.7, it seems like they are
>>          implemented as part of VNIC and not as MAC client
>>          interfaces. If this is the case, how can the bandwidth
>>          attributes be specified?
>
> They are not documented yet, but will be specified as arguments to
> mac_client_open().
>
>>    Qg.4) When will the classification interface be fully documented
>>          for review?
>
> There will be separate documents for the MAC driver classification
> interfaces, and for the MAC client flow APIs.
>
>>    Qg.5) In the future it would be great if the document could
>>          include version info and change bars.
>
> Will do.
>
> Thanks,
> Nicolas.
>
> --
> Nicolas Droux - Solaris Core OS - Sun Microsystems, Inc.
> droux at sun.com - http://blogs.sun.com/droux
>
> _______________________________________________
> crossbow-discuss mailing list
> crossbow-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss

--
Nicolas Droux - Solaris Core OS - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux