Hi, There seem to be several data races in e1000 driver (checked a couple of usage scenarios on the vanilla Linux kernel 3.11-rc1). Perhaps, not that critical but still worth fixing, I suppose.
1. CPU0 CPU1 e1000_probe() | register_netdev() | | e1000_open() | | | e1000_configure() | | | e1000_set_rx_mode() | | | rctl = er32(RCTL); | (e1000_main.c:2237) | | | | e1000_vlan_filter_on_off | (e1000_main.c:1222) | | | ew32(RCTL, rctl); | (e1000_main.c:4823) | | ew32(RCTL, rctl); (e1000_main.c:2259) As soon as register_netdev() registers the network device at e1000_main.c:1218, something (e.g. NetworkManager) may try to access the device thus triggering e1000_open() and all the path depicted above. The race window is rather small but still it is possible for the write of RCTL in e1000_vlan_filter_on_off() interfere with its update in e1000_set_rx_mode() as depicted above. Perhaps, moving the call to e1000_vlan_filter_on_off() in e1000_probe() before register_netdev() will fix this: ------------------------------ + e1000_vlan_filter_on_off(adapter, false); + strcpy(netdev->name, "eth%d"); err = register_netdev(netdev); if (err) goto err_register; - e1000_vlan_filter_on_off(adapter, false); ------------------------------ 2. The accesses to adapter->link_speed, adapter->link_duplex, adapter->stats.*, hw->tx_packet_delta, hw->collision_delta in e1000_get_ethtool_stats() are probably not synchronized with those in e1000_update_stats(). Example: CPU0 CPU1 e1000_watchdog (e1000_main.c:2500) | e1000_update_stats | <lock adapter->stats_lock> e1000_get_ethtool_stats (e1000_main.c:3620) | | | | data[i] = ... *(u32 *)p; | (e1000_ethtool.c:1850) adapter->stats.xonrxc += er32(XONRXC); (e1000_main.c:3651) | <unlock adapter->stats_lock> (e1000_main.c:3749) e1000_update_stats() as well as some other functions take 'adapter->stats_lock' when they access these data, perhaps e1000_get_ethtool_stats() should do that too? 3. Our tools also report a number of possible races between e1000_clean() (NAPI callback) and e1000_xmit_frame() but we may have missed something. NAPI poll callbacks and ndo_start_xmit callbacks cannot race with each other on a single CPU. But in my experiments, they executed on different CPUs. Is there anything that ensures they never execute concurrently even on different CPUs? If so, I'll update our tools accordingly to avoid such false positives. If the callbacks may execute concurrently then there are races in e1000 on the accesses to tx_ring->next_to_clean, tx_ring->next_to_use, tx_ring->buffer_info[i].*, tx_desc->upper.data. Same for the accesses to dql->num_queued in netdev_tx_sent_queue() and netdev_completed_queue() called from these callbacks. Example: CPU0 CPU1 e1000_clean() (e1000_main.c:3812) | e1000_clean_tx_irq e1000_xmit_frame (e1000_main.c:3869) (e1000_main.c:3260) | | e1000_unmap_and_free_tx_resource e1000_tx_map (e1000_main.c:1961) (e1000_main.c:2880) | | if (buffer_info->dma) { ... buffer_info->dma = ... ------------------------------ So, what do you think? Regards, Eugene -- Eugene Shatokhin, ROSA Laboratory. www.rosalab.com ------------------------------------------------------------------------------ Get your SQL database under version control now! Version control is standard for application code, but databases havent caught up. So what steps can you take to put your SQL databases under version control? Why should you start doing it? Read more to find out. http://pubads.g.doubleclick.net/gampad/clk?id=49501711&iu=/4140/ostg.clktrk _______________________________________________ E1000-devel mailing list E1000-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/e1000-devel To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired