Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Dec 17, 2012, at 1:50 AM, David Laight david.lai...@aculab.com wrote: How are you going to tell whether a feature is present in a non-Linux kernel ? The Linux memory-mapped capture mechanism is not present in a non-Linux kernel, so all the libpcap work involved here would, if necessary on other platforms, have to be done differently on those platforms. Those platforms would have to have their own mechanisms to indicate whether any changes to filter code, processing of VLAN tags supplied out of band, etc. would need to be done. The same would apply to other additional features of the Linux memory-mapped capture mechanism that require changes in libpcap. (Ideally, those changes would only require changes in order to use them, and would not break existing userland code, including but not limited to libpcap - your reply was to Daniel Borkmann, who is, I believe, the originator of netsniff-ng: http://netsniff-ng.org which has its own code using PF_PACKET sockets.) ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Mon, Dec 17, 2012 at 11:35 AM, Guy Harris g...@alum.mit.edu wrote: On Dec 17, 2012, at 1:50 AM, David Laight david.lai...@aculab.com wrote: How are you going to tell whether a feature is present in a non-Linux kernel ? The Linux memory-mapped capture mechanism is not present in a non-Linux kernel, so all the libpcap work involved here would, if necessary on other platforms, have to be done differently on those platforms. Those platforms would have to have their own mechanisms to indicate whether any changes to filter code, processing of VLAN tags supplied out of band, etc. would need to be done. The same would apply to other additional features of the Linux memory-mapped capture mechanism that require changes in libpcap. Exactly. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Mon, Dec 17, 2012 at 2:35 AM, Guy Harris g...@alum.mit.edu wrote: On Dec 17, 2012, at 1:50 AM, David Laight david.lai...@aculab.com wrote: How are you going to tell whether a feature is present in a non-Linux kernel ? The Linux memory-mapped capture mechanism is not present in a non-Linux kernel, so all the libpcap work involved here would, if necessary on other platforms, have to be done differently on those platforms. Those platforms would have to have their own mechanisms to indicate whether any changes to filter code, processing of VLAN tags supplied out of band, etc. would need to be done. Actually lib-pcap has these pcap-platform.c files that are kind of like platform specific drivers that plug into platform independent code like gencode.c or bpf_filter.c. These platform specific drivers are responsible for getting packets from the kernel and running filters (kernel or userland) on it. So all linux specific code to get a packet and packet metadata from the kernel can neatly reside in pcap-linux.c. Unfortunately though, in this specific problem involving filtering with vlan tags, both code generation (gentags.c) and code running the filter (bpf_filter.c) will have to be aware of linux specific semantics. Due to the issues that Bill had explained earlier in the thread, we can not rely on post processing before installing the kernel filter. Therefore, we need to generate a filter that can be directly installed in the kernel. For the same reason, bpf_filter() code also needs to change - be aware of linux specific semantics. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On 12/12/2012 10:53 PM, Ani Sinha wrote: unsigned int netdev_8021q_inskb = 1; ... { .ctl_name = NET_CORE_8021q_INSKB, .procname = netdev_8021q_inskb, .data = netdev_8021q_inskb, .maxlen = sizeof(int), .mode = 0444, .proc_handler = proc_dointvec }, would seem to do it to me. Then pcap can fopen(/proc/sys/net/core/netdev_8021q_inskb) and if it finds it, and it is 0, then do the cmsg thing. Does this work? This is just an experimental patch and by no means final. I just want to have an idea what everyone thought about it. Once we debate and discusss, I can cook up a final patch that would be worth commiting. Also instead of having this /proc interface, we can perhaps check for a specific kernel version that : (a) has the vlan tag info in the skb metadata (as opposed to in the packet itself) (b) has the following patch that adds the capability to generate a filter based on the tag value : commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1 Author: Eric Dumazet eduma...@google.com Date: Sat Oct 27 02:26:17 2012 + net: filter: add vlan tag access WE need both of the above two things for the userland to generate a filter code that compares vlan tag values in the skb metadata. For kernels that has the vlan tag in the skb metadata but does not have the above commit (b), there is nothing that can be done. For older kernels that had the vlan tag info in the packet itself, the filter code can be generated differently to look at specific offsets within the packet (something that libpcap does currently). We have already ruled out the idea of generating a filter and trying to load and see if that fails (see previous emails on this thread). Hope this makes sense. I think it doesn't. Because then you are obviously considering adding one procfs file into /proc/sys/net/core/ *for each* feature that is added into the ancillary ops which cannot be the right way ... diff --git a/include/linux/filter.h b/include/linux/filter.h index c45eabc..91e2ba3 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp) return fp-len * sizeof(struct sock_filter) + sizeof(*fp); } +extern bool sysctl_8021q_inskb; extern int sk_filter(struct sock *sk, struct sk_buff *skb); extern unsigned int sk_run_filter(const struct sk_buff *skb, const struct sock_filter *filter); diff --git a/net/core/filter.c b/net/core/filter.c index c23543c..4f5a657 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -41,6 +41,8 @@ #include linux/seccomp.h #include linux/if_vlan.h +bool sysctl_8021q_inskb = 1; + /* No hurry in this branch * * Exported for the bpf jit load helper. diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index d1b0804..f9a3700 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -15,6 +15,7 @@ #include linux/init.h #include linux/slab.h #include linux/kmemleak.h +#include linux/filter.h #include net/ip.h #include net/sock.h @@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = 8021q_inskb, + .data = sysctl_8021q_inskb, + .maxlen = sizeof(bool), + .mode = 0444, + .proc_handler = proc_dointvec + }, { } }; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Thu, Dec 13, 2012 at 6:34 PM, Ani Sinha a...@aristanetworks.com wrote: On Thu, Dec 13, 2012 at 12:35 AM, Daniel Borkmann dbork...@redhat.com wrote: On 12/12/2012 10:53 PM, Ani Sinha wrote: unsigned int netdev_8021q_inskb = 1; ... { .ctl_name = NET_CORE_8021q_INSKB, .procname = netdev_8021q_inskb, .data = netdev_8021q_inskb, .maxlen = sizeof(int), .mode = 0444, .proc_handler = proc_dointvec }, would seem to do it to me. Then pcap can fopen(/proc/sys/net/core/netdev_8021q_inskb) and if it finds it, and it is 0, then do the cmsg thing. I think it doesn't. Because then you are obviously considering adding one procfs file into /proc/sys/net/core/ *for each* feature that is added into the ancillary ops which cannot be the right way ... We had already brought up this topic previously in the same thread. A suggestion was made to add that proc entry and no one from netdev responded to it saying that it did not make any sense. Therefore before I went ahead and made the fixes in libpcap, I wanted to run this by your guys again to make sure we are still on the same page. Well, not everyone on netdev might be following this thread (resp. following fully). The best way to get responses for a patch is to go through the normal patch submission process on netdev, and if you like to request for comments, then mark it as RFC in the subject. This way, people will know and likely comment on it if it makes sense or not. I do agree that instead of a /proc entry, we should check for a kenrel version = X where X is the upstream version that first started supporting all the features needed by libpcap for vlan filtering. This is not a compile time check but a run time one. Does anyone see any issues with this? Is there any long term implications of this, like if you backport patches to an older long term supported kernel? Are there other better ways to do this, like may be returning feature bits from an ioctl call? This is something we need to deal with on a continuous basis as we keep supporting newer AUX fields and libpcap and other user land code needs to make use of it. At the same time, they need to handle backward compatibility issues with older kernels. As Eric mentioned earlier, for now there seems not to be a reliable way to get to know which ops are present and which not. It's not really nice, but if you want to make use of those new (ANC*) features, probably checking kernel version might be the only way if I'm not missing something. Now net-next is closed, but if it reopens, I'll submit a version 2 of my patch where you've been CC'd to. If it gets in, then at least it's for sure that since kernel xyz this kind of feature test is present. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Thu, Dec 13, 2012 at 1:49 PM, Daniel Borkmann danborkm...@iogearbox.net wrote: Well, not everyone on netdev might be following this thread (resp. following fully). The best way to get responses for a patch is to go through the normal patch submission process on netdev, and if you like to request for comments, then mark it as RFC in the subject. This way, people will know and likely comment on it if it makes sense or not. OK good to know. As Eric mentioned earlier, for now there seems not to be a reliable way to get to know which ops are present and which not. It's not really nice, but if you want to make use of those new (ANC*) features, probably checking kernel version might be the only way if I'm not missing something. Now net-next is closed, but if it reopens, I'll submit a version 2 of my patch where you've been CC'd to. If it gets in, then at least it's for sure that since kernel xyz this kind of feature test is present. thanks, yes, I believe we do need some sort of validation on the ancilliary features. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
unsigned int netdev_8021q_inskb = 1; ... { .ctl_name = NET_CORE_8021q_INSKB, .procname = netdev_8021q_inskb, .data = netdev_8021q_inskb, .maxlen = sizeof(int), .mode = 0444, .proc_handler = proc_dointvec }, would seem to do it to me. Then pcap can fopen(/proc/sys/net/core/netdev_8021q_inskb) and if it finds it, and it is 0, then do the cmsg thing. Does this work? This is just an experimental patch and by no means final. I just want to have an idea what everyone thought about it. Once we debate and discusss, I can cook up a final patch that would be worth commiting. Also instead of having this /proc interface, we can perhaps check for a specific kernel version that : (a) has the vlan tag info in the skb metadata (as opposed to in the packet itself) (b) has the following patch that adds the capability to generate a filter based on the tag value : commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1 Author: Eric Dumazet eduma...@google.com Date: Sat Oct 27 02:26:17 2012 + net: filter: add vlan tag access WE need both of the above two things for the userland to generate a filter code that compares vlan tag values in the skb metadata. For kernels that has the vlan tag in the skb metadata but does not have the above commit (b), there is nothing that can be done. For older kernels that had the vlan tag info in the packet itself, the filter code can be generated differently to look at specific offsets within the packet (something that libpcap does currently). We have already ruled out the idea of generating a filter and trying to load and see if that fails (see previous emails on this thread). Hope this makes sense. diff --git a/include/linux/filter.h b/include/linux/filter.h index c45eabc..91e2ba3 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp) return fp-len * sizeof(struct sock_filter) + sizeof(*fp); } +extern bool sysctl_8021q_inskb; extern int sk_filter(struct sock *sk, struct sk_buff *skb); extern unsigned int sk_run_filter(const struct sk_buff *skb, const struct sock_filter *filter); diff --git a/net/core/filter.c b/net/core/filter.c index c23543c..4f5a657 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -41,6 +41,8 @@ #include linux/seccomp.h #include linux/if_vlan.h +bool sysctl_8021q_inskb = 1; + /* No hurry in this branch * * Exported for the bpf jit load helper. diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index d1b0804..f9a3700 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -15,6 +15,7 @@ #include linux/init.h #include linux/slab.h #include linux/kmemleak.h +#include linux/filter.h #include net/ip.h #include net/sock.h @@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = 8021q_inskb, + .data = sysctl_8021q_inskb, + .maxlen = sizeof(bool), + .mode = 0444, + .proc_handler = proc_dointvec + }, { } }; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
+ Eric B. On Wed, Dec 12, 2012 at 1:53 PM, Ani Sinha a...@aristanetworks.com wrote: unsigned int netdev_8021q_inskb = 1; ... { .ctl_name = NET_CORE_8021q_INSKB, .procname = netdev_8021q_inskb, .data = netdev_8021q_inskb, .maxlen = sizeof(int), .mode = 0444, .proc_handler = proc_dointvec }, would seem to do it to me. Then pcap can fopen(/proc/sys/net/core/netdev_8021q_inskb) and if it finds it, and it is 0, then do the cmsg thing. Does this work? This is just an experimental patch and by no means final. I just want to have an idea what everyone thought about it. Once we debate and discusss, I can cook up a final patch that would be worth commiting. Also instead of having this /proc interface, we can perhaps check for a specific kernel version that : (a) has the vlan tag info in the skb metadata (as opposed to in the packet itself) (b) has the following patch that adds the capability to generate a filter based on the tag value : commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1 Author: Eric Dumazet eduma...@google.com Date: Sat Oct 27 02:26:17 2012 + net: filter: add vlan tag access WE need both of the above two things for the userland to generate a filter code that compares vlan tag values in the skb metadata. For kernels that has the vlan tag in the skb metadata but does not have the above commit (b), there is nothing that can be done. For older kernels that had the vlan tag info in the packet itself, the filter code can be generated differently to look at specific offsets within the packet (something that libpcap does currently). We have already ruled out the idea of generating a filter and trying to load and see if that fails (see previous emails on this thread). Hope this makes sense. diff --git a/include/linux/filter.h b/include/linux/filter.h index c45eabc..91e2ba3 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp) return fp-len * sizeof(struct sock_filter) + sizeof(*fp); } +extern bool sysctl_8021q_inskb; extern int sk_filter(struct sock *sk, struct sk_buff *skb); extern unsigned int sk_run_filter(const struct sk_buff *skb, const struct sock_filter *filter); diff --git a/net/core/filter.c b/net/core/filter.c index c23543c..4f5a657 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -41,6 +41,8 @@ #include linux/seccomp.h #include linux/if_vlan.h +bool sysctl_8021q_inskb = 1; + /* No hurry in this branch * * Exported for the bpf jit load helper. diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index d1b0804..f9a3700 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -15,6 +15,7 @@ #include linux/init.h #include linux/slab.h #include linux/kmemleak.h +#include linux/filter.h #include net/ip.h #include net/sock.h @@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = 8021q_inskb, + .data = sysctl_8021q_inskb, + .maxlen = sizeof(bool), + .mode = 0444, + .proc_handler = proc_dointvec + }, { } }; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
It is possible to test for the presence of support of the new vlan bpf extensions by attempting to load a filter that uses them. As only valid filters can be loaded, old kernels that do not support filtering of vlan tags will fail to load the a test filter with uses them. Unfortunately I do not see this. The sk_chk_filter() does not have a default in the case statement and the check will not detect an unknown instruction. It will fail when the filter is run and as far as I can see, the packet will be dropped. Something like this might help? diff --git a/net/core/filter.c b/net/core/filter.c index c23543c..96338aa 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -548,6 +548,8 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen) return -EINVAL; /* Some instructions need special checks */ switch (code) { + /* for unknown instruction, return EINVAL */ + default : return -EINVAL; case BPF_S_ALU_DIV_K: /* check for division by zero */ if (ftest-k == 0) ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Tue, Dec 11, 2012 at 3:04 PM, Eric Dumazet eric.duma...@gmail.com wrote: On Tue, 2012-12-11 at 14:36 -0800, Ani Sinha wrote: It is possible to test for the presence of support of the new vlan bpf extensions by attempting to load a filter that uses them. As only valid filters can be loaded, old kernels that do not support filtering of vlan tags will fail to load the a test filter with uses them. Unfortunately I do not see this. The sk_chk_filter() does not have a default in the case statement and the check will not detect an unknown instruction. It will fail when the filter is run and as far as I can see, the packet will be dropped. Something like this might help? diff --git a/net/core/filter.c b/net/core/filter.c index c23543c..96338aa 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -548,6 +548,8 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen) return -EINVAL; /* Some instructions need special checks */ switch (code) { + /* for unknown instruction, return EINVAL */ + default : return -EINVAL; case BPF_S_ALU_DIV_K: /* check for division by zero */ if (ftest-k == 0) This patch is wrong. yes I generated this patch wrong. Check lines 546, 547, 548 where we do the check for unknown instructions code = codes[code]; if (!code) return -EINVAL; yepph it's OK here. If you want to test ANCILLARY possible values, its already too late, as old kernels wont use any patch anyway. yepph, I was looking at possible ancilliary values. Basically this case statement : #define ANCILLARY(CODE) case SKF_AD_OFF + SKF_AD_##CODE:\ code = BPF_S_ANC_##CODE;\ break switch (ftest-k) { ANCILLARY(PROTOCOL); ANCILLARY(PKTTYPE); ANCILLARY(IFINDEX); ANCILLARY(NLATTR); ANCILLARY(NLATTR_NEST); ANCILLARY(MARK); ANCILLARY(QUEUE); ANCILLARY(HATYPE); ANCILLARY(RXHASH); ANCILLARY(CPU); ANCILLARY(ALU_XOR_X); ANCILLARY(VLAN_TAG); ANCILLARY(VLAN_TAG_PRESENT); } ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Tue, Dec 11, 2012 at 3:04 PM, Eric Dumazet eric.duma...@gmail.com wrote: On Tue, 2012-12-11 at 14:36 -0800, Ani Sinha wrote: It is possible to test for the presence of support of the new vlan bpf If you want to test ANCILLARY possible values, its already too late, as old kernels wont use any patch anyway. So basically this means that if we generate a filter with these special negative offset values and expect that the kernel will complain if it does not recognize the newer values then we would be wrong. And you are right. Old kernels never knew about them and the code wasn't written in a way to return EINVAL if it didn't recognize a special negative anciliary offset value. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Thu, 6 Dec 2012, Michael Richardson wrote: Date: Thu, 6 Dec 2012 17:59:13 From: Michael Richardson m...@sandelman.ca To: Eric W. Biederman ebied...@xmission.com Cc: a...@aristanetworks.com, tcpdump-workers@lists.tcpdump.org, net...@vger.kernel.org, Francesco Ruggeri frugg...@aristanetworks.com Subject: Re: [tcpdump-workers] vlan tagged packets and libpcap breakage Eric == Eric W Biederman ebied...@xmission.com writes: Eric It is a bit odd that you are indenting with spaces instead of tabs Eric in a file that is indented with tab. Again libpcap isn't my code so I Eric don't care but someone else might. Here's an updated patch. I have converted spaces to tabs consistent with rest of the file. I have also set up github and sent a pull request : commit b977ebd9d9608bb8a8e4927e7a6286bdfd6b94a8 Author: Ani Sinha a...@aristanetworks.com Date: Mon Dec 10 14:58:15 2012 -0800 Linux kernel 3.0 uses TP_STATUS_VLAN_VALID flag in packet auxilliary data aux-tp_status to indicate a packet tagged with a valid vlan ID 0 with another packet that is not vlan tagged. Use this flag to check for the presence of a vlan tagged packet. Also be careful not to cause any breakage when libpcap is compiled against a newer kernel (=3.0) and used on top of an older kernel that does not use this flag. Signed-off-by: Ani Sinha a...@aristanetworks.com diff --git a/pcap-linux.c b/pcap-linux.c index a42c3ac..ffcabdf 100644 --- a/pcap-linux.c +++ b/pcap-linux.c @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata) continue; aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) + if ((aux-tp_vlan_tci == 0) !(aux-tp_status TP_STATUS_VLAN_VALID)) +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif continue; len = packet_len iov.iov_len ? iov.iov_len : packet_len; @@ -3936,7 +3943,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, pcap_handler callback, } #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci + if ((handle-md.tp_version == TPACKET_V2) +#if defined(TP_STATUS_VLAN_VALID) + (h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VLAN_VALID)) +#else + h.h2-tp_vlan_tci +#endif handle-md.vlan_offset != -1 tp_snaplen = (unsigned int) handle-md.vlan_offset) { struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Wed, Oct 31, 2012 at 5:50 PM, Guy Harris g...@alum.mit.edu wrote: On Oct 31, 2012, at 3:35 PM, Ani Sinha a...@aristanetworks.com wrote: yes but if the packet is passed to the filter within libpcap (when we are not using the kernel filter) before the reinsertion, ...that would be a bug. Currently, that bug doesn't exist in the recvfrom() code path, but *does* appear to exist in the tpacket code path - and that code path also runs the filter before the SLL header is constructed. That should be fixed. Something like this? Index: libpcap-1.1.1/pcap-linux.c === --- libpcap-1.1.1.orig/pcap-linux.c +++ libpcap-1.1.1/pcap-linux.c @@ -132,6 +132,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -3469,23 +3476,6 @@ pcap_read_linux_mmap(pcap_t *handle, int return -1; } - /* run filter on received packet - * If the kernel filtering is enabled we need to run the - * filter until all the frames present into the ring - * at filter creation time are processed. - * In such case md.use_bpf is used as a counter for the - * packet we need to filter. - * Note: alternatively it could be possible to stop applying - * the filter when the ring became empty, but it can possibly - * happen a lot later... */ - bp = (unsigned char*)h.raw + tp_mac; - run_bpf = (!handle-md.use_bpf) || - ((handle-md.use_bpf1) handle-md.use_bpf--); - if (run_bpf handle-fcode.bf_insns - (bpf_filter(handle-fcode.bf_insns, bp, - tp_len, tp_snaplen) == 0)) - goto skip; - /* * Do checks based on packet direction. */ @@ -3582,6 +3576,23 @@ pcap_read_linux_mmap(pcap_t *handle, int } #endif + /* run filter on received packet + * If the kernel filtering is enabled we need to run the + * filter until all the frames present into the ring + * at filter creation time are processed. + * In such case md.use_bpf is used as a counter for the + * packet we need to filter. + * Note: alternatively it could be possible to stop applying + * the filter when the ring became empty, but it can possibly + * happen a lot later... */ + bp = (unsigned char*)h.raw + tp_mac; + run_bpf = (!handle-md.use_bpf) || + ((handle-md.use_bpf1) handle-md.use_bpf--); + if (run_bpf handle-fcode.bf_insns + (bpf_filter(handle-fcode.bf_insns, bp, + tp_len, tp_snaplen) == 0)) + goto skip; + /* * The only way to tell the kernel to cut off the * packet at a snapshot length is with a filter program; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha a...@aristanetworks.com writes: On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman ebied...@xmission.com wrote: the vlan header in packets as we receive them. The code is correct except for the case of packets in vlan 0. Currently the packet reconstruction is ambiguous. The most recent kernels have a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was in vlan 0 or if there was no vlan at all. libpcap probably should be taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0 handling correct. May be this? Two things. - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field. - To work on older kernels with binaries compiled with newer headers you first want to test for tp_vlan_tci == 0 and then look at the status field for TP_STATUS_VALID. Which means the tests need to look something like: - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) + if ((aux-tp_vlan_tci == 0) !(aux-tp_status TP_STATUS_VLAN_VALID)) +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci + if (handle-md.tp_version == TPACKET_V2 +#if defined(TP_STATUS_VLAN_VALID) + (h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VALID)) +#else + h.h2-tp_vlan_tci +#endif Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Thu, Dec 6, 2012 at 2:19 PM, Eric W. Biederman ebied...@xmission.com wrote: Ani Sinha a...@aristanetworks.com writes: On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman ebied...@xmission.com wrote: the vlan header in packets as we receive them. The code is correct except for the case of packets in vlan 0. Currently the packet reconstruction is ambiguous. The most recent kernels have a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was in vlan 0 or if there was no vlan at all. libpcap probably should be taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0 handling correct. May be this? Two things. - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field. - To work on older kernels with binaries compiled with newer headers you first want to test for tp_vlan_tci == 0 and then look at the status field for TP_STATUS_VALID. yes you are right Eric on both counts. I will resend the patch again in a bit. ani ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Thu, Dec 6, 2012 at 2:19 PM, Eric W. Biederman ebied...@xmission.com wrote: Ani Sinha a...@aristanetworks.com writes: May be this? Two things. - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field. - To work on older kernels with binaries compiled with newer headers you first want to test for tp_vlan_tci == 0 and then look at the status field for TP_STATUS_VALID. trying again : pcap-linux.c | 50 +++--- 1 files changed, 31 insertions(+), 19 deletions(-) diff --git a/pcap-linux.c b/pcap-linux.c index a42c3ac..8e355d3 100644 --- a/pcap-linux.c +++ b/pcap-linux.c @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata) continue; aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) +if ((aux-tp_vlan_tci == 0) || !(aux-tp_status TP_STATUS_VLAN_VALID)) +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif continue; len = packet_len iov.iov_len ? iov.iov_len : packet_len; @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, pcap_handler callback, } #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci +if ((handle-md.tp_version == TPACKET_V2) +#if defined(TP_STATUS_VLAN_VALID) +(h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VLAN_VALID)) +#else +h.h2-tp_vlan_tci +#endif handle-md.vlan_offset != -1 tp_snaplen = (unsigned int) handle-md.vlan_offset) { struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha a...@aristanetworks.com writes: On Thu, Dec 6, 2012 at 2:19 PM, Eric W. Biederman ebied...@xmission.com wrote: Ani Sinha a...@aristanetworks.com writes: May be this? Two things. - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field. - To work on older kernels with binaries compiled with newer headers you first want to test for tp_vlan_tci == 0 and then look at the status field for TP_STATUS_VALID. trying again : The patch is whitespace damaged. And one of your test is using || instead of Eric pcap-linux.c | 50 +++--- 1 files changed, 31 insertions(+), 19 deletions(-) diff --git a/pcap-linux.c b/pcap-linux.c index a42c3ac..8e355d3 100644 --- a/pcap-linux.c +++ b/pcap-linux.c @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata) continue; aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) +if ((aux-tp_vlan_tci == 0) || !(aux-tp_status TP_STATUS_VLAN_VALID)) The test should be not ||. +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif continue; len = packet_len iov.iov_len ? iov.iov_len : packet_len; @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, pcap_handler callback, } #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci +if ((handle-md.tp_version == TPACKET_V2) +#if defined(TP_STATUS_VLAN_VALID) +(h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VLAN_VALID)) +#else +h.h2-tp_vlan_tci +#endif handle-md.vlan_offset != -1 tp_snaplen = (unsigned int) handle-md.vlan_offset) { struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
The patch is whitespace damaged. And one of your test is using || instead of OK, using alpine now :-) The test should be not ||. aargh! I am retarded! Fixed. Hopefully 3rd time is a charm :-) ani pcap-linux.c | 50 +++--- 1 files changed, 31 insertions(+), 19 deletions(-) diff --git a/pcap-linux.c b/pcap-linux.c index a42c3ac..b2c1a08 100644 --- a/pcap-linux.c +++ b/pcap-linux.c @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata) continue; aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) +if ((aux-tp_vlan_tci == 0) !(aux-tp_status TP_STATUS_VLAN_VALID)) +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif continue; len = packet_len iov.iov_len ? iov.iov_len : packet_len; @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, pcap_handler callback, } #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci +if ((handle-md.tp_version == TPACKET_V2) +#if defined(TP_STATUS_VLAN_VALID) +(h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VLAN_VALID)) +#else +h.h2-tp_vlan_tci +#endif handle-md.vlan_offset != -1 tp_snaplen = (unsigned int) handle-md.vlan_offset) { struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
The patch is whitespace damaged. And one of your test is using || instead of OK, using alpine now :-) The test should be not ||. aargh! I am retarded! Fixed. Hopefully 3rd time is a charm :-) ani pcap-linux.c | 50 +++--- 1 files changed, 31 insertions(+), 19 deletions(-) diff --git a/pcap-linux.c b/pcap-linux.c index a42c3ac..b2c1a08 100644 --- a/pcap-linux.c +++ b/pcap-linux.c @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata) continue; aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) +if ((aux-tp_vlan_tci == 0) !(aux-tp_status TP_STATUS_VLAN_VALID)) +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif continue; len = packet_len iov.iov_len ? iov.iov_len : packet_len; @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, pcap_handler callback, } #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci +if ((handle-md.tp_version == TPACKET_V2) +#if defined(TP_STATUS_VLAN_VALID) +(h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VLAN_VALID)) +#else +h.h2-tp_vlan_tci +#endif handle-md.vlan_offset != -1 tp_snaplen = (unsigned int) handle-md.vlan_offset) { struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha a...@aristanetworks.com writes: The patch is whitespace damaged. And one of your test is using || instead of OK, using alpine now :-) The test should be not ||. aargh! I am retarded! Fixed. Hopefully 3rd time is a charm :-) Acked-by: Eric W. Biederman ebied...@xmission.com The code looks ok here. I don't know what format the tcpdump folks want patches in. The typically format is an email with [PATCH] in the subject. You indentation now comes through clear. It is a bit odd that you are indenting with spaces instead of tabs in a file that is indented with tab. Again libpcap isn't my code so I don't care but someone else might. Eric ani pcap-linux.c | 50 +++--- 1 files changed, 31 insertions(+), 19 deletions(-) diff --git a/pcap-linux.c b/pcap-linux.c index a42c3ac..b2c1a08 100644 --- a/pcap-linux.c +++ b/pcap-linux.c @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = #include sys/utsname.h #include sys/mman.h #include linux/if.h +#include linux/if_packet.h #include netinet/in.h #include linux/if_ether.h #include net/if_arp.h @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata) continue; aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); - if (aux-tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) +if ((aux-tp_vlan_tci == 0) !(aux-tp_status TP_STATUS_VLAN_VALID)) +#else + if (aux-tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif continue; len = packet_len iov.iov_len ? iov.iov_len : packet_len; @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, pcap_handler callback, } #ifdef HAVE_TPACKET2 - if (handle-md.tp_version == TPACKET_V2 h.h2-tp_vlan_tci +if ((handle-md.tp_version == TPACKET_V2) +#if defined(TP_STATUS_VLAN_VALID) +(h.h2-tp_vlan_tci || (h.h2-tp_status TP_STATUS_VLAN_VALID)) +#else +h.h2-tp_vlan_tci +#endif handle-md.vlan_offset != -1 tp_snaplen = (unsigned int) handle-md.vlan_offset) { struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Eric == Eric W Biederman ebied...@xmission.com writes: Eric It is a bit odd that you are indenting with spaces instead of tabs Eric in a file that is indented with tab. Again libpcap isn't my code so I Eric don't care but someone else might. tabs are hysterical raisens. Send two patches 1) tab-spaces 2) your patch. github preferred :-) but, I'll process this patch as is. -- ] He who is tired of Weird Al is tired of life! | firewalls [ ] Michael Richardson, Sandelman Software Works, Ottawa, ON|net architect[ ] m...@sandelman.ottawa.on.ca http://www.sandelman.ottawa.on.ca/ |device driver[ Kyoto Plus: watch the video http://www.youtube.com/watch?v=kzx1ycLXQSE then sign the petition. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha a...@aristanetworks.com writes: cc'ing netdev. On Wed, Oct 31, 2012 at 2:01 PM, Michael Richardson m...@sandelman.ca wrote: Thanks for this email. Ani == Ani Sinha a...@aristanetworks.com writes: Ani remove inline from vlan_core.c functions Ani Signed-off-by: David S. Miller da...@davemloft.net Ani As a result of this patch, with or without HW acceleration support in Ani the kernel driver, the vlan tag information is pulled out from the Ani packet and is inserted in the skb metadata. Thereafter, the vlan tag Ani information for a 802.1q packet can be obtained my applications Ani running in the user space by using the auxdata and cmsg Ani structure. Do you think that the existance of this behaviour could be exposed in sysctl, /proc/net or /sys equivalent (we still don't have /sys/net...)? As a read only file that had a 0/1 in it? yes, we definitely need a run time check. Whether this could be in the form of a socket option or a /proc entry I don't know. I don't see any need to add any kernel code to allow checking if vlan tags are stripped. Vlan headers are stripped on all kernel interfaces today. Vlan headers have been stripped on all but a handful of software interfaces for 6+ years. For all kernels if the vlan header is stripped it is reported in the auxdata, upon packet reception. Careful code should also look for TP_STATUS_VLAN_VALID which allows for distinguishing a striped vlan header of 0 from no vlan header. The safe assumption then is that testing for vlan headers and vlan values in bpf filters is not possible without the new bpf extentions. It is possible to test for the presence of support of the new vlan bpf extensions by attempting to load a filter that uses them. As only valid filters can be loaded, old kernels that do not support filtering of vlan tags will fail to load the a test filter with uses them. For old kernels that do not support the new extensions it is possible to generate code that looks at the ethernet header and sees if the ethertype is 0x8100 and then does things with it, but that will only work on a small handful of software only interfaces. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Sat, Nov 17, 2012 at 11:14 PM, Michael Richardson m...@sandelman.ca wrote: Thank you for this reply. Eric == Eric W Biederman ebied...@xmission.com writes: Eric I don't see any need to add any kernel code to allow checking Eric if vlan tags are stripped. Vlan headers are stripped on all Eric kernel interfaces today. Vlan headers have been stripped on Eric all but a handful of software interfaces for 6+ years. For Eric all kernels if the vlan header is stripped it is reported in Eric the auxdata, upon packet reception. Careful code should also Eric look for TP_STATUS_VLAN_VALID which allows for distinguishing Eric a striped vlan header of 0 from no vlan header. I can regularly see vlan tags on raw dumps from the untagged (eth0) today, running 3.2 (debian stable): obiwan-[~] mcr 4848 %sudo tcpdump -i eth0 -n -p -e | grep -i vlan listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 17:05:15.404909 38:60:77:38:e6:47 ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 3800, p 0, ethertype ARP, Request who-has 172.30.42.1 tell 172.30.42.11, length 28 So, I'm curious about the statement that vlan tags have been stripped for some time, because I don't see them stripped today. My desktop has an Intel 82579V NIC in it... Speaking of netsniff-ng where we don't reconstruct VLAN headers, users have reported that depending on the NIC/driver resp. ethtool setting, they can come in stripped or not (in the pf_packet's rx_ring buffer). However, I assume VLAN AUXDATA is always consistent (and so the BPF/BPF JIT filtering). Eric For old kernels that do not support the new extensions it is Eric possible to generate code that looks at the ethernet header Eric and sees if the ethertype is 0x8100 and then does things with Eric it, but that will only work on a small handful of software Eric only interfaces. at tcpdump.org, our concern is to release code that works on both new, and what for kernel.org folks would be considered ancient systems, such as Centos5/RHEL5 machines which are regularly still in production in the field (sadly...), but often need the latest diagnostics. What I hear you saying is that our existing code will work on older kernels, and that once we have new code to use the VLAN tag extensions, we should simply attempt to load it, and either it loads, or we get an error, and we go back to the code we had before. That's great news. Yes, this should be handled in such a way. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Michael Richardson m...@sandelman.ca writes: Thank you for this reply. Eric == Eric W Biederman ebied...@xmission.com writes: Eric I don't see any need to add any kernel code to allow checking Eric if vlan tags are stripped. Vlan headers are stripped on all Eric kernel interfaces today. Vlan headers have been stripped on Eric all but a handful of software interfaces for 6+ years. For Eric all kernels if the vlan header is stripped it is reported in Eric the auxdata, upon packet reception. Careful code should also Eric look for TP_STATUS_VLAN_VALID which allows for distinguishing Eric a striped vlan header of 0 from no vlan header. I can regularly see vlan tags on raw dumps from the untagged (eth0) today, running 3.2 (debian stable): obiwan-[~] mcr 4848 %sudo tcpdump -i eth0 -n -p -e | grep -i vlan listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes 17:05:15.404909 38:60:77:38:e6:47 ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 3800, p 0, ethertype ARP, Request who-has 172.30.42.1 tell 172.30.42.11, length 28 So, I'm curious about the statement that vlan tags have been stripped for some time, because I don't see them stripped today. My desktop has an Intel 82579V NIC in it... So I just took a quick look at libpcap 1.1.1.1. In pcap_read_packet in pcap-linux.c the code to readd the vlan header is protected by HAVE_PACKET_AUXDATA. In pcap_read_linux_mmap in pcap-linux.c the code to readd the vlan header is protected by HAVE_TPACKET2. That code is shifting the the ethernet header up 4 bytes and inserting the vlan header in packets as we receive them. The code is correct except for the case of packets in vlan 0. Currently the packet reconstruction is ambiguous. The most recent kernels have a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was in vlan 0 or if there was no vlan at all. libpcap probably should be taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0 handling correct. Eric For old kernels that do not support the new extensions it is Eric possible to generate code that looks at the ethernet header Eric and sees if the ethertype is 0x8100 and then does things with Eric it, but that will only work on a small handful of software Eric only interfaces. at tcpdump.org, our concern is to release code that works on both new, and what for kernel.org folks would be considered ancient systems, such as Centos5/RHEL5 machines which are regularly still in production in the field (sadly...), but often need the latest diagnostics. What I hear you saying is that our existing code will work on older kernels, and that once we have new code to use the VLAN tag extensions, we should simply attempt to load it, and either it loads, or we get an error, and we go back to the code we had before. That's great news. The existing code will work as well as anything if you don't have the VLAN tag extensions. If you are not using the VLAN tag extensions there is no reliable way to filter for vlan tagged packets in the kernel as on most network devices (all for the last year) the vlan header has been stripped at the time the bpf filter is run. So yes. Just falling back to the existing code seems as good as it is going to get. Which isn't very good but it took 5+ years before people cared enough to get this fixed. :( The major concern is that if the 802.1q header is gone, although we can retrieve it somehow (I'm not sure how the AUXDATA is presented on the MMAP PF_PACKET interface...) we will have to reconstruct it in order to save it properly to a savefile. Many entities, including most major Network Telescopes (http://en.wikipedia.org/wiki/Network_telescope) use libpcap (and often tcpdump itself) to capture packets on probe points, and having good performance matters here... Yes. I wasn't looking at the mmap path. The vlan information is present in the tpacket2_hdr and tpacket3_hdr structures that accompany packets read with the mmap interface. The old tpacket1_hdr doesn't support the vlan_tci information so that won't work. Not having the vlan header on the packet when the bpf filter is run is justified by performance, and likewise reading vlan tagged packets to userspace with the vlan header stripped is justified by performance. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Daniel Borkmann danborkm...@iogearbox.net writes: Speaking of netsniff-ng where we don't reconstruct VLAN headers, users have reported that depending on the NIC/driver resp. ethtool setting, they can come in stripped or not (in the pf_packet's rx_ring buffer). However, I assume VLAN AUXDATA is always consistent (and so the BPF/BPF JIT filtering). Yes it was a mess before we added software stripping of the vlan headers a year ago. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani == Ani Sinha a...@aristanetworks.com writes: Ani I'm wondering if there has been any progress on this. Are there Ani any thoughts on what Bill wrote in his email? I don't think so. Do you have time? ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Folks, I'm wondering if there has been any progress on this. Are there any thoughts on what Bill wrote in his email? Thanks ani On Fri, Nov 2, 2012 at 9:13 AM, Bill Fenner fen...@gmail.com wrote: On Wed, Oct 31, 2012 at 6:20 PM, Guy Harris g...@alum.mit.edu wrote: On Oct 31, 2012, at 2:50 PM, Ani Sinha a...@aristanetworks.com wrote: pcap files that already have the tags reinsrted should work with current filter code. However for live traffic, one has to get the tags from CMSG() and then reinsert it back to the packet for the current filter to work. *Somebody* has to do that, at least to packets that pass the filter, before they're handed to a libpcap-based application, for programs that expect to see packets as they arrived from/were transmitted to the wire to work. I.e., the tags *should* be reinserted by libpcap, and, as I understand it, that's what the #if defined(HAVE_PACKET_AUXDATA) defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI) ... #endif blocks of code in pcap-linux.c in libpcap are doing. Now, if filtering is being done in the *kernel*, and the tags aren't being reinserted by the kernel, then filter code stuffed into the kernel would need to differ from filter code run in userland. There's already precedent for that on Linux, with the cooked mode headers; those are synthesized by libpcap from the metadata returned for PF_PACKET sockets, and the code that attempts to hand the kernel a filter goes through the filter code, which was generated under the assumption that the packet begins with a cooked mode header, and modifies (a copy of) the code to, instead, use the special Linux-BPF-interpreter offsets to access the metadata. The right thing to do here would be to, if possible, do the same, so that the kernel doesn't have to reinsert VLAN tags for packets that aren't going to be handed to userland. In this case, it would be incredibly complicated to do this just postprocessing a set of bpf instructions. The problem is that when running the filter in the kernel, the IP header, etc. are not offset, so off_macpl and off_linktype would be zero, not 4, while generating the rest of the expression. We would also have to insert code when comparing the ethertype to 0x8100 to instead load the vlan-tagged metadata, so all jumps crossing that point would have to be adjusted, and if the if-false instruction was also testing the ethertype, then the ethertype would have to be reloaded (again inserting another instruction). Basically, take a look at the output of tcpdump -d tcp port 22 or (vlan and tcp port 22). Are the IPv4 tcp ports at x+14/x+16, or at x+18/x+20? If we're filtering in the kernel, they're at x+14/x+16 whether the packet is vlan tagged or not. If we're filtering on the actual packet contents (from a savefile, for example), they're at x+18/x+20 if the packet is vlan tagged. Also, an expression such as 'tcp port 22' would have to have some instructions added at the beginning, for vlan-tagged == false, or it would match both tagged and untagged packets. This would be much more straightforward to deal with in the code generation phase, except until now the code generation phase hasn't known whether the filter is headed for the kernel or not. Bill ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Folks, I'm wondering if there has been any progress on this. Are there any thoughts on what Bill wrote in his email? Thanks On Tue, Nov 13, 2012 at 2:41 PM, Ani Sinha a...@aristanetworks.com wrote: Folks, I'm wondering if there has been any progress on this. Are there any thoughts on what Bill wrote in his email? Thanks ani On Fri, Nov 2, 2012 at 9:13 AM, Bill Fenner fen...@gmail.com wrote: On Wed, Oct 31, 2012 at 6:20 PM, Guy Harris g...@alum.mit.edu wrote: On Oct 31, 2012, at 2:50 PM, Ani Sinha a...@aristanetworks.com wrote: pcap files that already have the tags reinsrted should work with current filter code. However for live traffic, one has to get the tags from CMSG() and then reinsert it back to the packet for the current filter to work. *Somebody* has to do that, at least to packets that pass the filter, before they're handed to a libpcap-based application, for programs that expect to see packets as they arrived from/were transmitted to the wire to work. I.e., the tags *should* be reinserted by libpcap, and, as I understand it, that's what the #if defined(HAVE_PACKET_AUXDATA) defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI) ... #endif blocks of code in pcap-linux.c in libpcap are doing. Now, if filtering is being done in the *kernel*, and the tags aren't being reinserted by the kernel, then filter code stuffed into the kernel would need to differ from filter code run in userland. There's already precedent for that on Linux, with the cooked mode headers; those are synthesized by libpcap from the metadata returned for PF_PACKET sockets, and the code that attempts to hand the kernel a filter goes through the filter code, which was generated under the assumption that the packet begins with a cooked mode header, and modifies (a copy of) the code to, instead, use the special Linux-BPF-interpreter offsets to access the metadata. The right thing to do here would be to, if possible, do the same, so that the kernel doesn't have to reinsert VLAN tags for packets that aren't going to be handed to userland. In this case, it would be incredibly complicated to do this just postprocessing a set of bpf instructions. The problem is that when running the filter in the kernel, the IP header, etc. are not offset, so off_macpl and off_linktype would be zero, not 4, while generating the rest of the expression. We would also have to insert code when comparing the ethertype to 0x8100 to instead load the vlan-tagged metadata, so all jumps crossing that point would have to be adjusted, and if the if-false instruction was also testing the ethertype, then the ethertype would have to be reloaded (again inserting another instruction). Basically, take a look at the output of tcpdump -d tcp port 22 or (vlan and tcp port 22). Are the IPv4 tcp ports at x+14/x+16, or at x+18/x+20? If we're filtering in the kernel, they're at x+14/x+16 whether the packet is vlan tagged or not. If we're filtering on the actual packet contents (from a savefile, for example), they're at x+18/x+20 if the packet is vlan tagged. Also, an expression such as 'tcp port 22' would have to have some instructions added at the beginning, for vlan-tagged == false, or it would match both tagged and untagged packets. This would be much more straightforward to deal with in the code generation phase, except until now the code generation phase hasn't known whether the filter is headed for the kernel or not. Bill ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Wed, Oct 31, 2012 at 6:20 PM, Guy Harris g...@alum.mit.edu wrote: On Oct 31, 2012, at 2:50 PM, Ani Sinha a...@aristanetworks.com wrote: pcap files that already have the tags reinsrted should work with current filter code. However for live traffic, one has to get the tags from CMSG() and then reinsert it back to the packet for the current filter to work. *Somebody* has to do that, at least to packets that pass the filter, before they're handed to a libpcap-based application, for programs that expect to see packets as they arrived from/were transmitted to the wire to work. I.e., the tags *should* be reinserted by libpcap, and, as I understand it, that's what the #if defined(HAVE_PACKET_AUXDATA) defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI) ... #endif blocks of code in pcap-linux.c in libpcap are doing. Now, if filtering is being done in the *kernel*, and the tags aren't being reinserted by the kernel, then filter code stuffed into the kernel would need to differ from filter code run in userland. There's already precedent for that on Linux, with the cooked mode headers; those are synthesized by libpcap from the metadata returned for PF_PACKET sockets, and the code that attempts to hand the kernel a filter goes through the filter code, which was generated under the assumption that the packet begins with a cooked mode header, and modifies (a copy of) the code to, instead, use the special Linux-BPF-interpreter offsets to access the metadata. The right thing to do here would be to, if possible, do the same, so that the kernel doesn't have to reinsert VLAN tags for packets that aren't going to be handed to userland. In this case, it would be incredibly complicated to do this just postprocessing a set of bpf instructions. The problem is that when running the filter in the kernel, the IP header, etc. are not offset, so off_macpl and off_linktype would be zero, not 4, while generating the rest of the expression. We would also have to insert code when comparing the ethertype to 0x8100 to instead load the vlan-tagged metadata, so all jumps crossing that point would have to be adjusted, and if the if-false instruction was also testing the ethertype, then the ethertype would have to be reloaded (again inserting another instruction). Basically, take a look at the output of tcpdump -d tcp port 22 or (vlan and tcp port 22). Are the IPv4 tcp ports at x+14/x+16, or at x+18/x+20? If we're filtering in the kernel, they're at x+14/x+16 whether the packet is vlan tagged or not. If we're filtering on the actual packet contents (from a savefile, for example), they're at x+18/x+20 if the packet is vlan tagged. Also, an expression such as 'tcp port 22' would have to have some instructions added at the beginning, for vlan-tagged == false, or it would match both tagged and untagged packets. This would be much more straightforward to deal with in the code generation phase, except until now the code generation phase hasn't known whether the filter is headed for the kernel or not. Bill ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Oct 31, 2012, at 2:50 PM, Ani Sinha a...@aristanetworks.com wrote: pcap files that already have the tags reinsrted should work with current filter code. However for live traffic, one has to get the tags from CMSG() and then reinsert it back to the packet for the current filter to work. *Somebody* has to do that, at least to packets that pass the filter, before they're handed to a libpcap-based application, for programs that expect to see packets as they arrived from/were transmitted to the wire to work. I.e., the tags *should* be reinserted by libpcap, and, as I understand it, that's what the #if defined(HAVE_PACKET_AUXDATA) defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI) ... #endif blocks of code in pcap-linux.c in libpcap are doing. Now, if filtering is being done in the *kernel*, and the tags aren't being reinserted by the kernel, then filter code stuffed into the kernel would need to differ from filter code run in userland. There's already precedent for that on Linux, with the cooked mode headers; those are synthesized by libpcap from the metadata returned for PF_PACKET sockets, and the code that attempts to hand the kernel a filter goes through the filter code, which was generated under the assumption that the packet begins with a cooked mode header, and modifies (a copy of) the code to, instead, use the special Linux-BPF-interpreter offsets to access the metadata. The right thing to do here would be to, if possible, do the same, so that the kernel doesn't have to reinsert VLAN tags for packets that aren't going to be handed to userland. And, yes, if that should be done for some interfaces with some kernel versions but not all interfaces for all kernel versions, there would need to be a way for libpcap to ask whether it's necessary. Is it necessary on any interfaces *before* the kernel change in question? ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
cc'ing netdev. On Wed, Oct 31, 2012 at 2:01 PM, Michael Richardson m...@sandelman.ca wrote: Thanks for this email. Ani == Ani Sinha a...@aristanetworks.com writes: Ani remove inline from vlan_core.c functions Ani Signed-off-by: David S. Miller da...@davemloft.net Ani As a result of this patch, with or without HW acceleration support in Ani the kernel driver, the vlan tag information is pulled out from the Ani packet and is inserted in the skb metadata. Thereafter, the vlan tag Ani information for a 802.1q packet can be obtained my applications Ani running in the user space by using the auxdata and cmsg Ani structure. Do you think that the existance of this behaviour could be exposed in sysctl, /proc/net or /sys equivalent (we still don't have /sys/net...)? As a read only file that had a 0/1 in it? yes, we definitely need a run time check. Whether this could be in the form of a socket option or a /proc entry I don't know. Should be one stanza to net/dev/sysctl_net_core, I think. If we are fast, maybe no kernel will ship without it. (An aside, is this auxdata/cmsg info available on UDP sockets too?) Ani More recently, two more patches were committed to net-next that enable Ani the kernel BPF filter to filter packets using their vlan tags : Ani http://www.spinics.net/lists/netdev/msg214758.html Ani http://www.spinics.net/lists/netdev/msg214759.html okay... Ani Now to the real problem. The filter that is currently generated by Ani libpcap (gencode.c) uses offsets into the packet to obtain the vlan Ani tag for a packet and then compare it to the one provided by the user Ani in the filter expression (gen_vlan()). This will no longer work if the Ani tag has been pulled off from the packet itself. One has to use the Ani negative offsets (SKF_AD_VLAN_TAG) as used by the kernel BPF filter to Ani pass down the attribute that we need to compare against. Now here's a Ani bigger problem. How do we avoid breakage in case we run libpcap on top Ani of an older kernel that does not extract the tags from the packet? How Ani do we handle cases of parsing and filtering against captured pcap Ani files that already have the vlan tags reinserted into the packet data? Ani There might be other corner cases that I am missing and not Ani understanding here. AFAIK, We will have to modify pcap-linux to produce different filters. Ani Are you guys aware of the problem? Is there any thoughts as to how we Ani may be able to address the problem or not at all? Nobody gave us a heads up, no. Glad that I brought this up. Actually Bill Fenner who also works for us helped me a lot in understanding the problem and potentially how it can be addressed. It sounds like it's impossible to capture all traffic now, and do vlan filtering later on? pcap files that already have the tags reinsrted should work with current filter code. However for live traffic, one has to get the tags from CMSG() and then reinsert it back to the packet for the current filter to work. This is slow. ani ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Wed, Oct 31, 2012 at 3:20 PM, Guy Harris g...@alum.mit.edu wrote: On Oct 31, 2012, at 2:50 PM, Ani Sinha a...@aristanetworks.com wrote: pcap files that already have the tags reinsrted should work with current filter code. However for live traffic, one has to get the tags from CMSG() and then reinsert it back to the packet for the current filter to work. *Somebody* has to do that, at least to packets that pass the filter, yes but if the packet is passed to the filter within libpcap (when we are not using the kernel filter) before the reinsertion, then the filter has to be taught to look into the metadata for tag information and not in the packet itself. before they're handed to a libpcap-based application, for programs that expect to see packets as they arrived from/were transmitted to the wire to work. I.e., the tags *should* be reinserted by libpcap, and, as I understand it, that's what the #if defined(HAVE_PACKET_AUXDATA) defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI) ... #endif blocks of code in pcap-linux.c in libpcap are doing. yes, I agree. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
On Wed, Oct 31, 2012 at 5:50 PM, Guy Harris g...@alum.mit.edu wrote: On Oct 31, 2012, at 3:35 PM, Ani Sinha a...@aristanetworks.com wrote: yes but if the packet is passed to the filter within libpcap (when we are not using the kernel filter) before the reinsertion, ...that would be a bug. Currently, that bug doesn't exist in the recvfrom() code path, but *does* appear to exist in the tpacket code path - and that code path also runs the filter before the SLL header is constructed. That should be fixed. yes, agreed. ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers