Jiri, I am not sure it would be simple to move the dev_add_offload() calls to vlan_core.c, since the registration should happen only once. In vlan.c it is done as part of module init, but in vlan_core.c we are not initializing any module, and adding logic just to do that once looks awkward to me. Why can't we just load the module (8021q) once a vlan interface is added via OVS?
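To make the "do it once" concern concrete, here is a rough userspace sketch, not kernel code. Everything prefixed with fake_ is a hypothetical stand-in (the real dev_add_offload() and struct packet_offload are only simulated here), and the two EtherTypes are my assumption about what vlan_packet_offloads holds. It only shows the kind of guard that would be needed if the registration were triggered from a path that can run more than once:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct packet_offload. */
struct fake_offload {
    unsigned short type; /* EtherType handled by this offload */
};

#define MAX_OFFLOADS 8

/* Simulated global offload list (the kernel keeps its own list). */
const struct fake_offload *registered[MAX_OFFLOADS];
size_t n_registered;

/* Stand-in for dev_add_offload(): appends the entry to the list. */
void fake_dev_add_offload(const struct fake_offload *po)
{
    if (n_registered < MAX_OFFLOADS)
        registered[n_registered++] = po;
}

/* Assumed contents: one entry per VLAN EtherType. */
const struct fake_offload vlan_offloads[] = {
    { 0x8100 }, /* 802.1Q */
    { 0x88A8 }, /* 802.1ad */
};

bool vlan_offloads_added;

/*
 * Register the VLAN offloads on the first call and do nothing on
 * later calls. This simple flag is not thread-safe; real code would
 * need proper synchronization.
 */
void vlan_offloads_add_once(void)
{
    size_t i;

    if (vlan_offloads_added)
        return;

    for (i = 0; i < sizeof(vlan_offloads) / sizeof(vlan_offloads[0]); i++)
        fake_dev_add_offload(&vlan_offloads[i]);

    vlan_offloads_added = true;
}

/* Helper for checking how many offloads are currently registered. */
size_t registered_count(void)
{
    return n_registered;
}
```

Calling vlan_offloads_add_once() from several places leaves exactly one copy of each entry in the list. This extra flag is the part that looks awkward to me compared with the module-init approach in vlan.c.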

On Thu, Oct 25, 2018 at 15:15, Jiri Pirko <j...@resnulli.us> wrote:

> Thu, Oct 25, 2018 at 01:34:04PM CEST, michaels...@gmail.com wrote:
> >I've used ftrace to see which paths we take in the kernel.
> >I remind you that the packet (in my scenario) looks like the following:
> >ETH | IP | UDP | VXLAN | L2 | VLAN | IP | TCP | payload
> >
> >When 8021q is not loaded:
> >dev_gro_receive <-napi_gro_receive
> >inet_gro_receive <-dev_gro_receive
> >udp4_gro_receive <-inet_gro_receive
> >udp_gro_receive <-udp4_gro_receive
> >udp4_lib_lookup_skb <-udp_gro_receive
> >__udp4_lib_lookup <-udp4_lib_lookup_skb
> >compute_score <-__udp4_lib_lookup
> >vxlan_gro_receive <-udp_gro_receive
> >__pskb_pull_tail <-vxlan_gro_receive
> >skb_copy_bits <-__pskb_pull_tail
> >eth_gro_receive <-vxlan_gro_receive
> >__pskb_pull_tail <-eth_gro_receive
> >skb_copy_bits <-__pskb_pull_tail
> >gro_find_receive_by_type <-eth_gro_receive
> >netif_receive_skb_internal <-napi_gro_receive
> >ktime_get_real <-netif_receive_skb_internal
> >getnstimeofday64 <-ktime_get_real
> >__getnstimeofday64 <-getnstimeofday64
> >skb_defer_rx_timestamp <-netif_receive_skb_internal
> >classify <-skb_defer_rx_timestamp
> >__netif_receive_skb <-netif_receive_skb_internal
> >__netif_receive_skb_core <-__netif_receive_skb
> >ip_rcv <-__netif_receive_skb_core
> >nf_hook_slow <-ip_rcv
> >nf_iterate <-nf_hook_slow
> >ipv4_conntrack_defrag <-nf_iterate
> >ipv4_conntrack_in <-nf_iterate
> >
> >When 8021q was loaded manually:
> >
> >inet_gro_receive <-dev_gro_receive
> >classify <-skb_clone_tx_timestamp
> >udp4_gro_receive <-inet_gro_receive
> >_raw_spin_lock <-sch_direct_xmit
> >udp_gro_receive <-udp4_gro_receive
> >udp4_lib_lookup_skb <-udp_gro_receive
> >local_bh_enable <-__dev_queue_xmit
> >__udp4_lib_lookup <-udp4_lib_lookup_skb
> >__local_bh_enable_ip <-local_bh_enable
> >compute_score <-__udp4_lib_lookup
> >local_bh_enable <-ip_finish_output
> >vxlan_gro_receive <-udp_gro_receive
> >__local_bh_enable_ip <-local_bh_enable
> >__pskb_pull_tail <-vxlan_gro_receive
> >skb_copy_bits <-__pskb_pull_tail
> >local_bh_enable <-__dev_queue_xmit
> >eth_gro_receive <-vxlan_gro_receive
> >__local_bh_enable_ip <-local_bh_enable
> >__pskb_pull_tail <-eth_gro_receive
> >skb_copy_bits <-__pskb_pull_tail
> >gro_find_receive_by_type <-eth_gro_receive
> >local_bh_enable <-__dev_queue_xmit
> >__local_bh_enable_ip <-local_bh_enable
> >vlan_gro_receive <-eth_gro_receive
> >__pskb_pull_tail <-vlan_gro_receive
> >local_bh_enable <-ip_finish_output
> >skb_copy_bits <-__pskb_pull_tail
> >__local_bh_enable_ip <-local_bh_enable
> >gro_find_receive_by_type <-vlan_gro_receive
> >inet_gro_receive <-vlan_gro_receive
> >kfree_skb_partial <-tcp_rcv_established
> >__pskb_pull_tail <-inet_gro_receive
> >skb_release_all <-kfree_skb_partial
> >skb_copy_bits <-__pskb_pull_tail
> >skb_release_head_state <-skb_release_all
> >skb_release_data <-skb_release_all
> >tcp4_gro_receive <-inet_gro_receive
> >tcp_gro_receive <-tcp4_gro_receive
> >put_page <-skb_release_data
> >
> >You can see that eth_gro_receive calls vlan_gro_receive when the 8021q
> >module is loaded and then performs gro_receive for the next layers as well
> >(inet, tcp), whereas in the default scenario where 8021q is not loaded the
> >vlan device ndo cannot be used.
>
> Ah, got it.
>
> What happens is that during the module load, vlan_proto_init() registers
> vlan_packet_offloads:
>
>         for (i = 0; i < ARRAY_SIZE(vlan_packet_offloads); i++)
>                 dev_add_offload(&vlan_packet_offloads[i]);
>
> I think that it would make sense to move this and the related code
> to net/8021q/vlan_core.c
> Then the offloads will be there regardless if the vlan module is loaded
> or not.
>
> Please feel free to spin-up a patch, will be more than happy to review
> it. Thanks!
>
> Jiri
>
> >
> >On Thu, Oct 25, 2018 at 9:40, Jiri Pirko <j...@resnulli.us> wrote:
> >
> >> Wed, Oct 24, 2018 at 10:12:54PM CEST, michaels...@gmail.com wrote:
> >> >Hi,
> >> >I noticed that there is a performance issue when running traffic on a
> >> >vlan interface that was created by OVS.
> >> >If we create a bridge with a vlan interface, the 8021q module is not
> >> >loaded.
> >> >Then when packets with an 8021q tag arrive, the linux stack can't use
> >> >the vlan ndos (such as gro_receive) since there is no such vlan device.
> >> >If I perform the same test after loading the 8021q module, I get 2x-5x
> >> >better performance.
> >>
> >> Could you please describe why exactly do you think you see this increase?
> >>
> >>
> >> >I personally test that using tunnels (vxlan) but I think the issue
> >> >exists even without the tunnel.
> >> >
> >> >*Creating a bridge and vlan interface:*
> >> >ovs-vsctl --no-wait add-br br_0
> >> >ip link set br_0 up
> >> >ovs-vsctl --no-wait add-port br_0 ens5f0
> >> >ovs-vsctl --no-wait add-port br_0 vlan10 tag=10 -- set interface vlan10 type=internal
> >> >ip link set vlan10 up
> >> >ip addr add <ip> dev vlan10
> >> >
> >> >run traffic (netperf on the vlan10 interface) and see the result.
> >> >Then do 'modprobe 8021q' and perform the same test -> you will see much
> >> >better numbers.
> >> >
> >> >I am suggesting to load the 8021q module once a vlan interface is added
> >> >through ovs,
> >> >similarly to how a vlan interface would be created using 'ip link'.
> >> >
> >> >Michael
> >>
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev