On Wed, 2009-08-12 at 14:58 -0700, Darren Reed wrote: > > > > * 1168: What's the purpose of this addition? This function only gets > > > > called for ipnet_t streams that are on the /dev/lo0 device. > > > > > > > Discuss. > > > ipnet_loaccept() is used twice: once in ipnet_open() and once in > > > ipnet_promisc_add(). > > > > > > > I don't follow. ipnet_acceptfn should only be set to ipnet_loaccept() > > if /dev/lo0 is opened, and in no other case. Therefore the check you > > added in the function itself: "if (getminor(ipnet->ipnet_if->if_dev) == > > IPNET_MINOR_LO)" is redundant. If ipnet_loaccept() is getting called > > when it shouldn't be, then something else is wrong. > > > > So lets examine ipnet_accept vs ipnet_loaccept for the purpose of > lo0 packets going through ipnet, into BPF, in a local zone with a > shared stack. > > If we set ipnet_acceptfn to ipnet_accept for lo0 packets that are > going to end up in BPF: > - the IP address of the packet isn't important, so what address type > it is does not matter, thus calls to ipnet_get_addrtype() are not > required;
The address is important. A consumer of an ipnet device receives packets if they contain an IP address assigned to that interface (this is described in the IP Observability PSARC case). That's the distinction between /dev/lo0 and /dev/ipnet/lo0 (and thus the difference between ipnet_accept() and ipnet_loaccept()). > - the lo0 interface has no special behaviour for promiscuous mode, > so that flag and the SAP one are meaningless; > - thus there's lots of code in that function that gets executed for > seemingly no reason. ipnet_accept() implements the semantics of _all_ ipnet observability nodes including /dev/ipnet/lo0. ipnet_loaccept() only implements the semantics of /dev/lo0 (which are deliberately different as described in the PSARC case). If you feel that ipnet_accept() does not perform well enough for the special case of /dev/ipnet/lo0 in a shared stack zone, then the answer isn't to break its semantic by using ipnet_loaccept(), but rather to address its performance issues (if indeed there are any). > The goal of using BPF in a local zone with a shared stack is to > see all of the traffic in that zone which is associated with the > lo0 "interface". The ipnet_loaccept function provides a filter > that does exactly that - except for the filter on the hook type. > I suppose i could add an ipnet_bpf_loaccept but it would be > a carbon copy of ipnet_loaccept with that check removed. > > I think what I'm saying is that BPF on lo0 in a zone has the > same semantics as /dev/lo0 and not /dev/ipnet/lo0. I don't understand. I thought that there were different "levels" of observability for BPF, where if one enabled DLT_IPNET, one would get the ipnet semantics (/dev/ipnet/* for any device). I would then assume that for "lo0", if one does not use DLT_IPNET, then one gets the /dev/lo0 semantics, but if one enables DLT_IPNET, one gets /dev/ipnet/lo0 semantics. In any case, we're currently talking about the setting of ipnet_loaccept as ipnet_acceptfn, which definitely needs to be worked out for BPF, but the comment I made was slightly different. The code change you made in ipnet_loaccept() itself seems to be tangential to that, and I still don't know why you made that change. Returning B_FALSE if /dev/lo0 wasn't the device opened still seems redundant regardless. > > > > * 1959: What is the reason for this callback having to duplicate every > > > > packet passed to it? Why wouldn't the code that dispatches packets > > > > to callbacks always simply pass a duplicate? An observability hook > > > > should always have those semantics (it should never manipulate the > > > > original or block while processing it), so hard-coding such > > > > semantics for all NH_OBSERVE hooks would make sense to me. > > > > > > > Discuss. > > > The duplication here has been moved from ip.c`ipobs_hook() to here. > > > Whereas ipobs_hook() was previously just walking a list of functions > > > to call and doing the duplication there, as required, now the list > > > walking is separated (pfhooks) from the packet duplication. Thus the > > > dispatching comes out of pfhooks and jumps through ipobs_bounce_func_cb() > > > to do the duplication for the real recipient (func). > > > > > > > Ah, I see. > > > > > > > But, if your > > > comments above are correct, then dupmsg() should be removed and only > > > copymsg() used, otherwise the observe is presented with the real > > > packet data (in dblk_t's off of its own mblk_t.) There is a performance > > > cost with this, though: it enforces a deep copy of all observed packets, > > > copying all of the dblk_t's, not just a shallow one (creating new > > > mblk_t's.) > > > The current implementation only does a deep copy if the shallow one > > > fails. > > > > > > > Okay. > > > > Just to confirm, it's ok to leave the code to do both dupmsg() and > copymsg() > for performance reasons. Sure, that's fine. -Seb _______________________________________________ networking-discuss mailing list networking-discuss@opensolaris.org