SYSTEM ARCHITECTURE COUNCIL
Platform Software ARC
---------------------------------
PSARC Regular Meeting time: Wednesdays 10:00-1:00pm in MPK17-3507.

05-20-2009 MEETING MINUTES
============================================================================
Send CORRECTIONS, additions, deletions to psarc-coord at sun.com.
Minutes are archived in sac.Eng:/sac/export/sac/Minutes/PSARC.

Co-Chair(s):
    James Carlson:          Yes
    Tim Marsland:           no

ATTENDEES - Members: (6 active members)
    Kais Belgaied:          no
    Mark Carlson:           Yes
    Garrett D'Amore:        no (on sabbatical)
    Richard Matthews:       Yes
    Darren Moffat:          Yes (on sabbatical)
    Sebastien Roy:          Yes
    Glenn Skinner:          Yes
    Bill Sommerfeld:        no (on sabbatical)
    Gary Winiger:           Yes (on sabbatical)

STAFF -
    Asa Romberger (PM):     Yes

ATTENDEES - Interns:
    Frank Che:              no
    David Chieu:            no
    Charles Debardeleben:   no
    Peter Dennis:           no
    James Falkner:          no (on sabbatical)
    Daniel Hain:            Yes
    Michael Haines:         no
    Alan Hargreaves:        no
    Phil Harman:            no
    Cecilia Hu:             no
    Wyllys Ingersoll:       no
    Alec Muffett:           no (on sabbatical)
    Darren Reed:            no
    Dean Roehrich:          Yes
    Ienup Sung:             no
    Phi Tran:               no
    Brian Utterback:        no
    James Walker:           no
    Mark Martin:            Yes (external)
    Don Cragun:             Yes (external)

Guests:
    Alan Coopersmith:       Yes
    Matthew Ahrens:         Yes
    Chris Kirby:            Yes
    Tim Haley:              Yes

Not all names are captured. Please send email to Asa.Romberger at Sun.com
if you attended the meeting and your name is missing from the list.

---------------------------------------------------------------------------
MEETING SUMMARY:
================

AGENDA 05/20/2009

10:00-10:10  Open ARC Business (use open dial in above)
10:10-10:55  Open Commitment
             2009/232 Packet Capture for OpenSolaris
             Submitter: Darren Reed
             Owner: Garrett D'Amore
             Intern: Darren Reed
             Exposure: open
11:00-11:10  Closed ARC Business (use closed dial in above)

---------------------------------------------------------------------------
Case Anchors:
    Packet Capture for OpenSolaris (2009/232)
===========================================================================

Fast Tracks:
============
Case      (Timeout)   Exposure  Title
2009/275  (05/08/09)  open      Amendments to pconsole
                                    fast-track extend to 5/27
2009/292  (05/22/09)  open      Xorg server 1.6
                                    approved
2009/297  (05/20/09)  open      zfs snapshot holds
                                    approved
2009/304  (05/22/09)  open      IP PROMISC Flag
                                    approved
2009/309  (05/25/09)  open      Increase the maximum default ufs log size
                                (ldl_maxlogsize) from 64 Mbytes to 512 Mbytes
                                    approved
2009/310  (05/27/09)  open      Disk IO PM Enhancement
                                    let run
2009/312  (05/26/09)  open      Configurable Boot Archive Updates
                                    let run

Commitment:
===========
2009/232  Packet Capture for OpenSolaris
              approved

Assignments:
============
2009/253: Owner: Rick Matthews
2009/306: Owner: Sebastien Roy

Next Meeting:
=============
05/27/2009

10:00-10:10  Open ARC Business (use open dial in above)
10:10-10:55  Open Inception
             2009/253 S10C
11:00-11:10  Closed ARC Business (use closed dial in above)

IAM
====
PSARC 2009/232/: Packet Capture for OpenSolaris
    Submitter: Darren Reed
    Owner: Garrett D'Amore
    Intern: Darren Reed

SUMMARY
=======
To provide a Linux compatible alternative to DLPI for packet capture,
PF_PACKET is being implemented for OpenSolaris.

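For readers unfamiliar with the Linux interface named in the summary, the following is a minimal sketch of how an application typically opens a PF_PACKET socket. It is written against the Linux behaviour as exposed by Python's socket module. Whether the OpenSolaris implementation accepts exactly the same calls is an assumption, and the helper name open_packet_socket is hypothetical, not part of the case materials.

```python
import socket

# Linux value of ETH_P_ALL ("every protocol"); assumed to be what a
# Linux-compatible PF_PACKET implementation would also accept.
ETH_P_ALL = 0x0003


def open_packet_socket(ifname):
    """Return a raw packet socket bound to ifname, or None if unavailable.

    Hypothetical helper for illustration; not part of the case materials.
    """
    if not hasattr(socket, "AF_PACKET"):
        return None  # this platform's Python has no PF_PACKET support
    try:
        # The protocol argument to socket() is given in network byte order.
        sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                             socket.htons(ETH_P_ALL))
    except OSError:
        return None  # typically a missing raw-access privilege
    try:
        # bind() takes (interface name, protocol in host byte order).
        sock.bind((ifname, ETH_P_ALL))
    except OSError:
        sock.close()
        return None  # no such link, or binding not permitted
    return sock
```

A caller with sufficient privilege would then read frames with sock.recvfrom(65535). The helper returns None instead of raising, so the same code can be tried on hosts where the socket family or the privilege is absent.
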
ISSUES
=======
Issues for inception (04/29/2009):

seb-1   On the IPNET version bump: Given that the feature has only been out there for a few OpenSolaris builds and that the only known consumer of DLIOCIPNETINFO is the snoop program, preserving backward compatibility is not important. Bumping the version number is fine, but I think it would be sufficient to reject DLIOCIPNETINFO ioctls that select version 1.

seb-2   Related to seb-1, given that we'll have no applications that support both version 1 and version 2 (why would we?), I don't believe that you need a dl_ipnetinfo_v2_t structure. A single dl_ipnetinfo_t structure representing the "current" version should be sufficient, cleaner, and simpler.

seb-3   Also related to seb-1, I don't think you should go out of your way to keep support for the parsing of version 1 IPNET headers in snoop, but that's obviously your call.

djr     In answer to seb-1 through seb-3, if PSARC is happy for version 1 IPNET headers to be "obsolete" and unsupported, I'm happy to remove it from existence. Given that version 1 was publicly documented I opted to take the safe path.

seb-4   On the new DLT_LOOP_SOLARIS DLT type: Given that this DLT type represents an IP pseudo data-link type for observability of all IP interfaces (and not just loopback, unless I'm misreading the spec), I'm uncomfortable with the chosen name for this type. Would something like DLT_IPNET not be more appropriate?

djr     If you feel strongly enough about it, I'm happy to use that name.

seb-5   "struct ifreq" is Obsolete, yet it is used by /dev/bpf ioctls. Should these not be using "struct lifreq" instead?

djr     Compatibility. Software (i.e. libpcap) using BPF is currently only written using "struct ifreq". Additional definitions to support "struct lifreq" could be added.

seb-6   How do BIOCSETIF callers select whether the desired observability point is to be at the MAC layer or the IP layer?

djr     By default the MAC layer is selected and packets from it are presented.
        This is in part an implementation artifact as it is the first "registration" of an interface in BPF that becomes the default. What this means is that when a device attaches to the MAC layer via mac_register(), it gets inserted in BPF. Later when plumb'd into IP, the IPNET bit is inserted into BPF.

seb-7   Are all ifreq fields other than ifr_name ignored when processing BIOCSETIF?

djr     Yes.

seb-8   I don't understand the implications of the Project-Private 32-bit ioctls. Can a 32-bit program not set a BPF filter when running on a 64-bit kernel using Committed interfaces?

djr     The structure names of the 32bit-format structures are Project Private. They are only used by the kernel module to get the correct layout of data from applications, allowing a 64bit kernel to work with a 32bit application.

seb-9   What are the interactions between /dev/bpf and zones? Logically there is probably a restriction of datalinks that BIOCSETIF will allow from a non-global zone, but I didn't find that in the spec (other than the bit about IPNET in shared-stack zones).

seb-10  PF_PACKET is about more than "packet capture", it's a full-fledged socket API for interacting with the link-layer. DLPI overlaps this functionality almost exactly. The Background section implies that this project (and DLPI) is just about "packet capture", which is confusing.

seb-11  Given that PF_PACKET depends on BPF for filtering, what interfaces does PF_PACKET use from what is being delivered by the BPF "case" (I realize it's not a separate case, but it probably should be)? Does BPF provide some sort of kernel-level API for packet filtering that isn't described in this case? A diagram might help.

djr     PF_PACKET uses the following interfaces from BPF:
            struct bpf_program32
            struct bpf_program
            struct bpf_insn
            bpf_validate()
            bpf_filter()

seb-12  Exclusive-stack zones _do_ have the net_rawaccess privilege by default, contrary to what's described in the "Zones" section of pfp-psarc.txt.
        I don't think this has any impact on the architecture described, however, since I don't believe that lx branded zones have that privilege, and so PF_PACKET still won't work from lx branded zones.

djr     Correct, LX branded zones don't have that privilege.

seb-13  It doesn't make sense to me to tie IP-layer interface indices to PF_PACKET (see sll_ifindex, mr_ifindex, etc.). IP interfaces are only tangentially related to the objects that PF_PACKET interacts with, which are datalinks. Also, one should be able to use PF_PACKET over a link even when IP isn't plumbed on that link. Of course, if the SIOCGIFINDEX ioctl on PF_PACKET sockets doesn't return the IP "interface index", then that would address my concern. What does it return?

djr     The same ID as BPF uses: the link ID from DLS.

        On a related note, I think a more serious and general problem exists for OpenSolaris (not specifically this project) when it comes to porting Linux or BSD networking functionality. On these operating systems, the "interface index" represents the "interface" at all layers of the stack where it exists. On Solaris, it is only a concept at the IP layer, and this creates incompatibilities as described here. Perhaps advice needs to be doled out to investigate and address this for Solaris.

seb-14  What flags could possibly be gotten or set using SIOC*IFFLAGS on a PF_PACKET socket?

djr     The only one that is currently checked for is IFF_PROMISC.

seb-15  What is the definition of "host" for PACKET_HOST and PACKET_OTHERHOST? I believe that these sll_pkttype values are ill-conceived and ill-defined (as you already know).

djr     The "Linux" definition, in this instance, refers to the interface in question.

seb-16  There is no discussion of privileges in the BPF materials. Given how deficient DLPI is in this area, this case would be a great place to introduce some more fine-grained privileges for observability (beyond "net_rawaccess"). For example, "net_observability" was introduced a little while ago.
        Can that be leveraged?

djr     BPF (also) requires "net_rawaccess". Because the definition of net_observability is suggested to be for /dev/ipnet only, it isn't clear if it should be used by bpf.

Issues for commitment (05/20/2009):

jdc-0   Which materials are which? There are newer files located outside of the commitment-materials directory. What exactly are we supposed to review?

djr     My mistake. I didn't copy in the updated bpf-psarc.txt correctly. Fixed.

jdc-1   20q8: how can both "ifconfig -a" and "dladm show-link" be used? It's unclear to me what "interfaces" are being supported here. "dladm" refers to link-layer (mac) interfaces, while "ifconfig" refers to network-layer (IP) interfaces. The two are not quite the same; there are objects in each that do not appear in the other. Can I open BPF or PF_PACKET on "ipmp0" and "vni0", which are IP-only objects, and do not exist in the mac layer? Can I open it on "etherstub0", which is only at the mac layer and not in IP?

djr     For IP-only objects, it will be possible to use the IPNET DLT to "sniff" packets. For MAC level devices, it is possible to use whichever DLTs they advertise. If you were to do
            dladm create-etherstub e0
        then you could do
            tcpdump -nvi e0
        So to answer the question, both interfaces seen with "ifconfig" and those seen with "dladm" will be accessible from the global zone.

jdc-2   The privilege model with respect to Zones does not appear to be complete. For instance, the 'pfp-psarc.txt' document asserts the following:

            Zones
            ~~~~~
            Given that zones do not have the "net_rawaccess" privilege, they are thus unable to open PF_PACKET sockets, even if they have an exclusive instance of IP. Thus Linux branded zones will currently not be able to create PF_PACKET sockets.

        However, that's not true. net_rawaccess is in the default set for exclusive stack instance zones (see /usr/lib/brand/native/config.xml).

djr     That is different from the advice I was given; my fault for not verifying the advice against the source code. I suspect the person answering was not too familiar with exclusive IP instances :( My preference would be to handle this as a follow-on RFE.

jdc-3   Zones usage seems unclear. Can I really open PF_PACKET and BPF in the global zone on datalinks that have been assigned to exclusive IP stack zones?

djr     That is the intended design, yes.

jdc-4   Do the separate /dev/bpf0-15 files matter on Solaris? Does BSD require the maintenance of state that applies to each of the 16 devices (such that multiple opens on a single named bpf* device will share state)? The man page provided (bpf.7d.txt) and the architectural documentation (bpf-psarc.txt) seem to conflict on this.

djr     There is no need for bpf0-15 on Solaris. The bpf driver can assign a new minor number with each opening of the device, allowing for state to be kept per-open-file.

jdc-5   bpf.7d.txt: "SLIP links"?

djr     The man page has not been thoroughly edited for Solaris.

seb-17  Going back to seb-16 for a moment, has the relationship between bpf and the net_observability privilege been determined? I don't see any information to that effect in the updated materials.

VOTE
====
Approve - Mark, Jim, Seb, Rick
Deny -
Abstain -
Not Participating (NP) - Glenn, Gary

THE NEXT STEP
=============
approved