Re: [PATCH net] net: usb: lan78xx: Connect PHY before registering MAC

2019-10-18 Thread Daniel Wagner
On Thu, Oct 17, 2019 at 09:29:26PM +0200, Andrew Lunn wrote:
> As soon as the netdev is registered, the kernel can start using the
> interface. If the driver connects the MAC to the PHY after the netdev
> is registered, there is a race condition where the interface can be
> opened without having the PHY connected.
> 
> Change the order to close this race condition.
> 
> Fixes: 92571a1aae40 ("lan78xx: Connect phy early")
> Reported-by: Daniel Wagner 
> Signed-off-by: Andrew Lunn 

Tested-by: Daniel Wagner 

Thanks for the fix!
Daniel
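
The ordering problem described in the patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the toy_* names are hypothetical, none of them are lan78xx or phylib functions, and the model pretends that an open arrives the instant the device is registered (the worst case of the race).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a network device; not the lan78xx structures. */
struct toy_netdev {
	bool phy_connected;    /* set once the MAC is attached to its PHY */
	bool registered;       /* set once the device is visible to users */
	int  open_without_phy; /* counts opens that hit the race */
};

/* Models a user opening the interface. */
static void toy_open(struct toy_netdev *dev)
{
	if (!dev->phy_connected)
		dev->open_without_phy++;
}

/* Models registration. Worst case: an open arrives immediately. */
static void toy_register(struct toy_netdev *dev)
{
	dev->registered = true;
	toy_open(dev);
}

static void toy_connect_phy(struct toy_netdev *dev)
{
	dev->phy_connected = true;
}

/* Old order: register first, connect the PHY afterwards. */
int probe_register_first(void)
{
	struct toy_netdev dev = { 0 };

	toy_register(&dev);
	toy_connect_phy(&dev);
	return dev.open_without_phy;
}

/* Patched order: connect the PHY first, register afterwards. */
int probe_connect_first(void)
{
	struct toy_netdev dev = { 0 };

	toy_connect_phy(&dev);
	toy_register(&dev);
	return dev.open_without_phy;
}
```

In this model probe_register_first() reports one open without a PHY, while probe_connect_first() reports none, which is the effect the reordering is meant to have.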


Re: lan78xx and phy_state_machine

2019-10-17 Thread Daniel Wagner
> >> Unfortunately, you didn't write which kernel version works for you
> >> (except for this splat). Only 5.3 or 5.4-rc3 too?
> > With v5.2.20 I was able to boot the system. But after this discussion
> > I would say that was just luck. The race seems to exist for longer and
> > only with my 'special' config I am able to reproduce it.
> okay, let me rephrase my question. You said that 5.4-rc3 didn't even
> boot in your setup. After applying Andrew's patch, does it boot or is it
> a different issue?

Yes, with Andrew's patch the initial problem is gone.

> >> [1] - https://marc.info/?l=linux-netdev&m=154604180927252&w=2
> >> [2] - https://patchwork.kernel.org/patch/10888797/
> > Indeed, the irq domain code looks suspicious and Marc pointed out that
> > it is dead wrong. Could we just go with [2] and fix this up?
> 
> Sorry, i cannot answer this question.

Sure, I am just trying to lobby :)


Re: lan78xx and phy_state_machine

2019-10-17 Thread Daniel Wagner
Hi Stefan,

On Thu, Oct 17, 2019 at 07:05:32PM +0200, Stefan Wahren wrote:
> Am 17.10.19 um 08:52 schrieb Daniel Wagner:
> > On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
> >> Please could you give this a go. It is totally untested, not even
> >> compile tested...
> > Sure. The system boots but there is one splat:
> >
> this is a known issue since 4.20 [1], [2]. So not related to the crash.

Oh, I see.

> Unfortunately, you didn't write which kernel version works for you
> (except for this splat). Only 5.3 or 5.4-rc3 too?

With v5.2.20 I was able to boot the system. But after this discussion
I would say that was just luck. The race seems to exist for longer and
only with my 'special' config I am able to reproduce it.

> [1] - https://marc.info/?l=linux-netdev&m=154604180927252&w=2
> [2] - https://patchwork.kernel.org/patch/10888797/

Indeed, the irq domain code looks suspicious and Marc pointed out that
it is dead wrong. Could we just go with [2] and fix this up?

Thanks,
Daniel


Re: lan78xx and phy_state_machine

2019-10-16 Thread Daniel Wagner
On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
> Hi Daniel
> 
> Please could you give this a go. It is totally untested, not even
> compile tested...

Sure. The system boots but there is one splat:


[2.213987] usb 1-1: new high-speed USB device number 2 using dwc2
[2.426789] hub 1-1:1.0: USB hub found
[2.430677] hub 1-1:1.0: 4 ports detected
[2.721982] usb 1-1.1: new high-speed USB device number 3 using dwc2
[2.826991] hub 1-1.1:1.0: USB hub found
[2.831093] hub 1-1.1:1.0: 3 ports detected
[3.489988] usb 1-1.1.1: new high-speed USB device number 4 using dwc2
[3.729045] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): 
deferred multicast write 0x7ca0
[3.870518] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No 
External EEPROM. Setting MAC Speed
[3.881900] libphy: lan78xx-mdiobus: probed
[3.893322] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): 
registered mdiobus bus usb-001:004
[3.902984] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): 
phydev->irq = 79
[4.283761] random: crng init done
[4.958866] lan78xx 1-1.1.1:1.0 eth0: receive multicast hash filter
[4.965311] lan78xx 1-1.1.1:1.0 eth0: deferred multicast write 0x7ca2
[6.502358] lan78xx 1-1.1.1:1.0 eth0: PHY INTR: 0x0002
[6.507935] ------------[ cut here ]------------
[6.512635] irq 79 handler irq_default_primary_handler+0x0/0x8 enabled 
interrupts
[6.520250] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 
__handle_irq_event_percpu+0x150/0x170
[6.529424] Modules linked in:
[6.532526] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.4.0-rc3-00018-g5bc52f64e884-dirty #36
[6.541172] Hardware name: Raspberry Pi 3 Model B+ (DT)
[6.546471] pstate: 6005 (nZCv daif -PAN -UAO)
[6.551329] pc : __handle_irq_event_percpu+0x150/0x170
[6.556539] lr : __handle_irq_event_percpu+0x150/0x170
[6.561747] sp : 800010003cc0
[6.565104] x29: 800010003cc0 x28: 0060 
[6.570493] x27: 8000110fb9b0 x26: 800011a3daeb 
[6.575882] x25: 800011892d40 x24: 37525800 
[6.581270] x23: 004f x22: 800010003d64 
[6.586659] x21:  x20: 0002 
[6.592046] x19: 3716fb00 x18: 0010 
[6.597434] x17: 0001 x16: 0007 
[6.602822] x15: 8000118931b0 x14: 747075727265746e 
[6.608210] x13: 692064656c62616e x12: 65203878302f3078 
[6.613598] x11: 302b72656c646e61 x10: 685f7972616d6972 
[6.618986] x9 : 705f746c75616665 x8 : 800011a9f000 
[6.624374] x7 : 800010681150 x6 : 00f9 
[6.629761] x5 :  x4 :  
[6.635148] x3 :  x2 : 8000118a2440 
[6.640535] x1 : ab82878caf7c9e00 x0 :  
[6.645923] Call trace:
[6.648404]  __handle_irq_event_percpu+0x150/0x170
[6.653262]  handle_irq_event_percpu+0x30/0x88
[6.657767]  handle_irq_event+0x44/0xc8
[6.661659]  handle_simple_irq+0x90/0xc0
[6.665635]  generic_handle_irq+0x24/0x38
[6.669703]  intr_complete+0x104/0x178
[6.673508]  __usb_hcd_giveback_urb+0x58/0xf8
[6.677927]  usb_giveback_urb_bh+0xac/0x108
[6.682173]  tasklet_action_common.isra.0+0x154/0x1a0
[6.687298]  tasklet_hi_action+0x24/0x30
[6.691277]  __do_softirq+0x120/0x23c
[6.694990]  irq_exit+0xb8/0xd8
[6.698174]  __handle_domain_irq+0x64/0xb8
[6.702326]  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[6.707449]  el1_irq+0xb8/0x180
[6.710634]  arch_cpu_idle+0x10/0x18
[6.714260]  do_idle+0x200/0x280
[6.717532]  cpu_startup_entry+0x20/0x40
[6.721512]  rest_init+0xd4/0xe0
[6.724786]  arch_call_rest_init+0xc/0x14
[6.728851]  start_kernel+0x420/0x44c
[6.732562] ---[ end trace e770c2c68be5476f ]---
[6.742776] lan78xx 1-1.1.1:1.0 eth0: speed: 1000 duplex: 1 anadv: 0x05e1 
anlpa: 0xc1e1
[6.750940] lan78xx 1-1.1.1:1.0 eth0: rx pause disabled, tx pause disabled
[6.769976] Sending DHCP requests ..., OK
[   12.926088] IP-Config: Got DHCP answer from 192.168.19.2, my address is 
192.168.19.53
[   12.934059] IP-Config: Complete:
[   12.937335]  device=eth0, hwaddr=b8:27:eb:85:c7:c9, 
ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
[   12.947758]  host=192.168.19.53, domain=, nis-domain=(none)
[   12.953772]  bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
[   12.953776]  nameserver0=192.168.19.2
[   12.965221] ALSA device list:
[   12.968246]   No soundcards found.
[   12.984397] VFS: Mounted root (nfs filesystem) on device 0:19.
[   12.991059] devtmpfs: mounted
[   13.000530] Freeing unused kernel memory: 5504K
[   13.018077] Run /sbin/init as init process
[   44.010022] nfs: server 192.168.19.2 not responding, still trying
[   44.010027] nfs: server 192.168.19.2 not responding, still trying
[   44.010033] nfs: server 192.168.19.2 not responding, still trying

Re: lan78xx and phy_state_machine

2019-10-16 Thread Daniel Wagner
On Tue, Oct 15, 2019 at 07:16:53PM +0200, Daniel Wagner wrote:
> Could it be that the networking interface is still running (from
> u-boot and PXE) when the driver is setting it up and the workqueue is
> prematurely kicked to work?

I've dumped the registers before the device is set up and checked them
against the manual. The device is in the reset state documented in
FIGURE 13-1:
http://ww1.microchip.com/downloads/en/DeviceDoc/LAN7800-Data-Sheet-DS1992G.pdf

After being burned several times I like to check such things
first. Anyway, this rules out my boot setup.

> Anyway, I keep trying to get some trace out of it.

After adding ignore_loglevel to the command line, I finally get a
trace on the console. Note that with the WARN_ON the system boots, though
something still seems to be wrong with the network, because there
is no reliable connection to the NFS server.

[3.743559] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No 
External EEPROM. Setting MAC Speed
[3.754941] libphy: lan78xx-mdiobus: probed
[3.815609] ------------[ cut here ]------------
[3.820316] WARNING: CPU: 3 PID: 1 at drivers/net/phy/phy.c:496 
phy_queue_state_machine+0xc/0x30
[3.829226] Modules linked in:
[3.832329] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 
5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[3.840974] Hardware name: Raspberry Pi 3 Model B+ (DT)
[3.846273] pstate: 6005 (nZCv daif -PAN -UAO)
[3.851132] pc : phy_queue_state_machine+0xc/0x30
[3.855903] lr : phy_start+0x88/0xa0
[3.859524] sp : 800010023b80
[3.862882] x29: 800010023b80 x28: 37c34000 
[3.868270] x27: 8000111ac178 x26: 1002 
[3.873657] x25: 0001 x24:  
[3.879046] x23: 1002 x22: 800010e3d850 
[3.884433] x21: 37c34800 x20: 37328438 
[3.889820] x19: 37328000 x18: 000e 
[3.895209] x17: 0001 x16: 0019 
[3.900596] x15:  x14:  
[3.905985] x13:  x12: 1da9 
[3.911372] x11:  x10:  
[3.916759] x9 : 383b2750 x8 : 383b1dc0 
[3.922148] x7 : 37e900c0 x6 : 0002 
[3.927535] x5 : 0001 x4 : 37e90028 
[3.932923] x3 :  x2 : 0001 
[3.938311] x1 :  x0 : 37328000 
[3.943698] Call trace:
[3.946179]  phy_queue_state_machine+0xc/0x30
[3.950597]  phy_start+0x88/0xa0
[3.953870]  lan78xx_open+0x30/0x140
[3.957499]  __dev_open+0xc0/0x170
[3.960950]  __dev_change_flags+0x160/0x1b8
[3.965192]  dev_change_flags+0x20/0x60
[3.969083]  ip_auto_config+0x254/0xe54
[3.972974]  do_one_initcall+0x50/0x190
[3.976865]  kernel_init_freeable+0x194/0x22c
[3.981285]  kernel_init+0x10/0x100
[3.984822]  ret_from_fork+0x10/0x18
[3.988445] ---[ end trace a7b6e745fa28cd56 ]---
[4.025682] random: crng init done
[6.401142] ------------[ cut here ]------------
[6.405854] irq 79 handler irq_default_primary_handler+0x0/0x8 enabled 
interrupts
[6.413468] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 
__handle_irq_event_percpu+0x150/0x170
[6.422642] Modules linked in:
[6.425744] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[6.435799] Hardware name: Raspberry Pi 3 Model B+ (DT)
[6.441099] pstate: 6005 (nZCv daif -PAN -UAO)
[6.445957] pc : __handle_irq_event_percpu+0x150/0x170
[6.451168] lr : __handle_irq_event_percpu+0x150/0x170
[6.456375] sp : 800010003cc0
[6.459732] x29: 800010003cc0 x28: 0060 
[6.465120] x27: 8000110929a8 x26: 80001192d86b 
[6.470508] x25: 800011782d40 x24: 374cde00 
[6.475897] x23: 004f x22: 800010003d64 
[6.481285] x21:  x20: 0002 
[6.486672] x19: 372ee180 x18: 0010 
[6.492060] x17: 0001 x16: 0007 
[6.497448] x15: 8000117831b0 x14: 747075727265746e 
[6.502835] x13: 692064656c62616e x12: 65203878302f3078 
[6.508223] x11: 302b72656c646e61 x10: 685f7972616d6972 
[6.513611] x9 : 705f746c75616665 x8 : 800011952000 
[6.518999] x7 : 80001066dce0 x6 : 0106 
[6.524387] x5 :  x4 :  
[6.529775] x3 :  x2 : 800011792440 
[6.535163] x1 : 190f5ab71e843000 x0 :  
[6.540550] Call trace:
[6.543032]  __handle_irq_event_percpu+0x150/0x170
[6.547890]  handle_irq_event_percpu+0x30/0x88
[6.552394]  handle_irq_event+0x44/0xc8
[6.556283]  handle_simple_irq+0x90/0xc0
[6.560260]  generic_handle_irq+0x24/0x38
[6.564328]  intr_complete+0xb0/0xe0
[6.567955]  __usb_hcd_giveback_urb+0x58/0xf8
[6.572374]  usb_give

Re: lan78xx and phy_state_machine

2019-10-15 Thread Daniel Wagner
Hi Andrew,

On Tue, Oct 15, 2019 at 02:53:27AM +0200, Andrew Lunn wrote:
> On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> > Hi,
> > 
> > I've been trying to boot a RPi 3 Model B+ in 64 bit mode. While I can get
> > my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> > when initializing the eth interface.
> > 
> > Is this a known issue? Some configuration issue?
> 
> Hi Daniel
> 
> Please could you add a WARN_ON(1); in phy_queue_state_machine() and
> post the stack dump. That might help us figure out what is going on.

I tried to get a stack dump from the WARN_ON(1). The 'make defconfig'
seems not to enable it(?). Anyway, I played a bit and noticed that,
depending on which additional debug config switch is enabled, the
problem disappears. The boot timing seems to be important.

After the feedback I got so far, I think my setup is 'special'
insofar as I don't boot from eMMC. Instead I rely on TFTP and NFS for
rootfs:

 - kernel is configured as 'make defconfig' +

#
# Built in drivers
#
CONFIG_USB_LAN78XX=y

#
# Networking
#
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y

# NFS
CONFIG_NFS_FS=y
CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y

#
# Debugging
#
CONFIG_PRINTK_TIME=y
CONFIG_DEBUG_KERNEL=y
CONFIG_EARLY_PRINTK=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=7

# Embedded config to kernel. /proc/config.gz
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y

CONFIG_KEXEC=y

 - u-boot enables network interface, does DHCP
 - fetches a PXE image
 - PXE loads DTB, kernel and starts the kernel
 - rootfs is supposed to be provided via NFS

Could it be that the networking interface is still running (from
u-boot and PXE) when the driver is setting it up and the workqueue is
prematurely kicked to work?

Anyway, I keep trying to get some trace out of it.

Thanks,
Daniel



Re: wl1251 & mac address & calibration data

2016-12-15 Thread Daniel Wagner

On 12/16/2016 03:03 AM, Luis R. Rodriguez wrote:

> For the new API a solution for "fallback mechanisms" should be clean
> though and I am looking to stay as far as possible from the existing
> mess. A solution to help both the old API and new API is possible for
> the "fallback mechanism" though -- but for that I can only refer you
> at this point to some of Daniel Wagner and Tom Gunderson's firmwared
> daemon prospect. It should help pave the way for a clean solution and
> help address other stupid issues.


The firmwared project is hosted here

https://github.com/teg/firmwared

As Luis pointed out, firmwared relies on FW_LOADER_USER_HELPER_FALLBACK,
which is not enabled by default. I don't see any reason why firmwared
should not also support loading calibration data, if we find a sound way
to do this.


As you can see from the commit history it is a pretty young project and
more or less a reanimation of the old udev firmware loader feature. We
are getting it into shape, adding integration tests etc.


The main motivation for this project is to get the stuck discussion on
the firmware loader API moving again. Luis was very busy writing up all
the details on the current situation and purely from the amount of
documentation needed to describe the API you can tell something is awry.


Thanks,
Daniel


[PATCH] xprtrdma: use complete() instead complete_all()

2016-09-23 Thread Daniel Wagner
From: Daniel Wagner 

There is only one waiter for the completion, therefore there
is no need to use complete_all(). Let's make that clear by
using complete() instead of complete_all().

The usage pattern of the completion is:

waiter context                          waker context

frwr_op_unmap_sync()
  reinit_completion()
  ib_post_send()
  wait_for_completion()

                                        frwr_wc_localinv_wake()
                                          complete()

Signed-off-by: Daniel Wagner 
Cc: Anna Schumaker 
Cc: Trond Myklebust 
Cc: Chuck Lever 
Cc: linux-...@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 net/sunrpc/xprtrdma/frwr_ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 892b5e1..4a24f0e 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -329,7 +329,7 @@ frwr_wc_localinv_wake(struct ib_cq *cq, struct ib_wc *wc)
frmr = container_of(cqe, struct rpcrdma_frmr, fr_cqe);
if (wc->status != IB_WC_SUCCESS)
__frwr_sendcompletion_flush(wc, frmr, "localinv");
-   complete_all(&frmr->fr_linv_done);
+   complete(&frmr->fr_linv_done);
 }
 
 /* Post a REG_MR Work Request to register a memory region
-- 
2.7.4
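

The difference between complete() and complete_all() that motivates this patch can be sketched with a toy model of the completion's done counter. This is an assumption-laden illustration, not the kernel implementation: the toy_* names are hypothetical, there is no wait queue or locking, and the exact counter handling in the kernel has varied between versions.

```c
#include <limits.h>
#include <stdbool.h>

/* Toy model: only the 'done' counter of a completion. */
struct toy_completion {
	unsigned int done;
};

void toy_reinit(struct toy_completion *c)
{
	c->done = 0;
}

/* Lets exactly one waiter through. */
void toy_complete(struct toy_completion *c)
{
	if (c->done != UINT_MAX)
		c->done++;
}

/* Lets every current and future waiter through until reinit. */
void toy_complete_all(struct toy_completion *c)
{
	c->done = UINT_MAX;
}

/* A waiter's check: consume one unit of 'done' if there is any. */
bool toy_try_wait(struct toy_completion *c)
{
	if (c->done == 0)
		return false;
	if (c->done != UINT_MAX)
		c->done--;
	return true;
}

/* Counts how many of 'waiters' sequential checks would succeed. */
int waiters_released(struct toy_completion *c, int waiters)
{
	int released = 0;

	for (int i = 0; i < waiters; i++)
		if (toy_try_wait(c))
			released++;
	return released;
}
```

With a single waiter, as in frwr_op_unmap_sync(), both calls release it; the model only shows that complete() states the intent "one waiter" while complete_all() releases any number of them.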


[PATCH 0/2] wireless: Use complete() instead complete_all()

2016-08-18 Thread Daniel Wagner
From: Daniel Wagner 

Hi,

Using complete_all() is not wrong per se but it suggests that there
might be more than one waiter. For -rt I am reviewing all
complete_all() users and would like to leave only the real ones in the
tree. The main problem for -rt with complete_all() is that it can be
used inside IRQ context and that can lead to an unbounded amount of work
inside the interrupt handler. That is a no-no for -rt.

The patches are grouped per subsystem and in small batches to allow
reviewing.

This series ignores all complete_all() usages in the firmware loading
path. They will hopefully be addressed by Luis' sysdata patches [0].
That leaves a couple of complete_all() calls.

The first patch fixes a real glitch in the carl9170 driver. I was
able to test it because I have the hardware. For the second one I
haven't found any dongle with that chip in my drawers.

This series is against net-next of today.

cheers,
daniel

[0] 
https://lkml.kernel.org/r/1466117661-22075-1-git-send-email-mcg...@kernel.org

Daniel Wagner (2):
  carl9170: Fix wrong completion usage
  ath10k: use complete() instead complete_all()

 drivers/net/wireless/ath/ath10k/core.c  | 16 
 drivers/net/wireless/ath/ath10k/mac.c   |  2 +-
 drivers/net/wireless/ath/carl9170/usb.c |  6 ++
 3 files changed, 11 insertions(+), 13 deletions(-)

-- 
2.7.4


[PATCH 2/2] ath10k: use complete() instead complete_all()

2016-08-18 Thread Daniel Wagner
From: Daniel Wagner 

There is only one waiter for the completion, therefore there
is no need to use complete_all(). Let's make that clear by
using complete() instead of complete_all().

The usage pattern of the completion is:

waiter context                          waker context

scan.started
------------

ath10k_start_scan()
  lockdep_assert_held(conf_mutex)
  ath10k_wmi_start_scan()
  wait_for_completion_timeout(scan.started)

                                        ath10k_wmi_event_scan_start_failed()
                                          complete(scan.started)

                                        ath10k_wmi_event_scan_started()
                                          complete(scan.started)

scan.completed
--------------

ath10k_scan_stop()
  lockdep_assert_held(conf_mutex)
  ath10k_wmi_stop_scan()
  wait_for_completion_timeout(scan.completed)

                                        __ath10k_scan_finish()
                                          complete(scan.completed)

scan.on_channel
---------------

ath10k_remain_on_channel()
  mutex_lock(conf_mutex)
  ath10k_start_scan()
  wait_for_completion_timeout(scan.on_channel)

                                        ath10k_wmi_event_scan_foreign_chan()
                                          complete(scan.on_channel)

offchan_tx_completed
--------------------

ath10k_offchan_tx_work()
  mutex_lock(conf_mutex)
  reinit_completion(offchan_tx_completed)
  wait_for_completion_timeout(offchan_tx_completed)

                                        ath10k_report_offchan_tx()
                                          complete(offchan_tx_completed)

install_key_done
----------------

ath10k_install_key()
  lockdep_assert_held(conf_mutex)
  reinit_completion(install_key_done)
  wait_for_completion_timeout(install_key_done)

                                        ath10k_htt_t2h_msg_handler()
                                          complete(install_key_done)

vdev_setup_done
---------------

ath10k_monitor_vdev_start()
  lockdep_assert_held(conf_mutex)
  reinit_completion(vdev_setup_done)
  ath10k_vdev_setup_sync()
    wait_for_completion_timeout(vdev_setup_done)

                                        ath10k_wmi_event_vdev_start_resp()
                                          complete(vdev_setup_done)

ath10k_monitor_vdev_stop()
  lockdep_assert_held(conf_mutex)
  reinit_completion(vdev_setup_done)
  ath10k_vdev_setup_sync()
    wait_for_completion_timeout(vdev_setup_done)

                                        ath10k_wmi_event_vdev_stopped()
                                          complete(vdev_setup_done)

thermal.wmi_sync
----------------

ath10k_thermal_show_temp()
  mutex_lock(conf_mutex)
  reinit_completion(thermal.wmi_sync)
  wait_for_completion_timeout(thermal.wmi_sync)

                                        ath10k_thermal_event_temperature()
                                          complete(thermal.wmi_sync)

bss_survey_done
---------------

ath10k_mac_update_bss_chan_survey()
  lockdep_assert_held(conf_mutex)
  reinit_completion(bss_survey_done)
  wait_for_completion_timeout(bss_survey_done)

                                        ath10k_wmi_event_pdev_bss_chan_info()
                                          complete(bss_survey_done)

All complete() calls happen while the conf_mutex is taken. That means
at max one waiter is possible.

Signed-off-by: Daniel Wagner 
---
 drivers/net/wireless/ath/ath10k/core.c | 16 
 drivers/net/wireless/ath/ath10k/mac.c  |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c 
b/drivers/net/wireless/ath/ath10k/core.c
index e889829..ed76601 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -1497,14 +1497,14 @@ static void ath10k_core_restart(struct work_struct 
*work)
 
ieee80211_stop_queues(ar->hw);
ath10k_drain_tx(ar);
-   complete_all(&ar->scan.started);
-   complete_all(&ar->scan.completed);
-   complete_all(&ar->scan.on_channel);
-   complete_all(&ar->offchan_tx_completed);
-   complete_all(&ar->install_key_done);
-   complete_all(&ar->vdev_setup_done);
-   complete_all(&ar->thermal.wmi_sync);
-   complete_all(&ar->bss_survey_done);
+   complete(&ar->scan.started);
+   complete(&ar->scan.completed);
+   complete(&ar->scan.on_channel);
+   complete(&ar->offchan_tx_completed);
+   complete(&ar->install_key_done);
+   complete(&ar->vdev_setup_done);
+   complete(&ar->thermal.wmi_sync);
+   complete(&ar->bss_survey_done);
wake_up(&ar->htt.empty_tx_wq);
wake_up(&ar->wmi.tx_credits_wq);
wake_up(&ar->peer_mapping_wq);
diff --git a/drivers/net/wireless/ath/ath10k/mac.c 
b/drivers/net/wireless/ath/ath10k/mac.c
index 0bbd0a0..c3c1c25 100644
-

[PATCH 1/2] carl9170: Fix wrong completion usage

2016-08-18 Thread Daniel Wagner
From: Daniel Wagner 

carl9170_usb_stop() is used from several places to flush and cleanup any
pending work. The normal pattern is to send a request and wait for the
irq handler to call complete(). The completion is not reinitialized
during normal operation and as the old comment indicates it is important
to keep calls to wait_for_completion_timeout() and complete() balanced.

Calling complete_all() brings this equilibrium out of balance and needs
to be fixed by a reinit_completion(). But that opens a small race
window. It is possible that the sequence of complete_all(),
reinit_completion() is faster than the wait_for_completion_timeout() can
do its work. The wake up is not lost but the done counter test happens after
reinit_completion() has been executed. The only reason we don't see
carl9170_exec_cmd() hang forever is that we use the timeout version of
wait_for_completion().

Let's fix this by reinitializing the completion (that is just setting
the done counter to 0) just before we send out a request. Now,
carl9170_usb_stop() can be sure a complete() call is enough to make
progress since there is only one waiter at most. This is a common pattern
also seen in various drivers which use completions.

Signed-off-by: Daniel Wagner 
---
 drivers/net/wireless/ath/carl9170/usb.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/carl9170/usb.c 
b/drivers/net/wireless/ath/carl9170/usb.c
index 76842e6..99ab203 100644
--- a/drivers/net/wireless/ath/carl9170/usb.c
+++ b/drivers/net/wireless/ath/carl9170/usb.c
@@ -670,6 +670,7 @@ int carl9170_exec_cmd(struct ar9170 *ar, const enum 
carl9170_cmd_oids cmd,
ar->readlen = outlen;
spin_unlock_bh(&ar->cmd_lock);
 
+   reinit_completion(&ar->cmd_wait);
err = __carl9170_exec_cmd(ar, &ar->cmd, false);
 
if (!(cmd & CARL9170_CMD_ASYNC_FLAG)) {
@@ -778,10 +779,7 @@ void carl9170_usb_stop(struct ar9170 *ar)
spin_lock_bh(&ar->cmd_lock);
ar->readlen = 0;
spin_unlock_bh(&ar->cmd_lock);
-   complete_all(&ar->cmd_wait);
-
-   /* This is required to prevent an early completion on _start */
-   reinit_completion(&ar->cmd_wait);
+   complete(&ar->cmd_wait);
 
/*
 * Note:
-- 
2.7.4
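

The race window described in the commit message can be replayed in a single-threaded toy model. This is a sketch under stated assumptions: the cw_* names are hypothetical, only the done counter is modelled (no wait queue, no locking, no timeout), and the interleaving is fixed by hand to the unlucky case where the waiter re-checks the counter after the stop path has finished.

```c
#include <limits.h>
#include <stdbool.h>

/* Toy model of the 'done' counter behind ar->cmd_wait. */
struct toy_cmd_wait {
	unsigned int done;
};

static void cw_reinit(struct toy_cmd_wait *w)
{
	w->done = 0;
}

static void cw_complete(struct toy_cmd_wait *w)
{
	if (w->done != UINT_MAX)
		w->done++;
}

static void cw_complete_all(struct toy_cmd_wait *w)
{
	w->done = UINT_MAX;
}

/*
 * What a woken waiter does: test the counter again. A false result
 * means it goes back to sleep until the timeout expires.
 */
static bool cw_waiter_check(struct toy_cmd_wait *w)
{
	if (w->done == 0)
		return false;
	if (w->done != UINT_MAX)
		w->done--;
	return true;
}

/* Old stop path: complete_all() followed directly by a reinit. */
bool old_stop_then_waiter_check(void)
{
	struct toy_cmd_wait w = { 0 };

	cw_complete_all(&w);
	cw_reinit(&w);              /* wipes 'done' before the waiter looks */
	return cw_waiter_check(&w);
}

/* Patched flow: reinit before the request, plain complete() on stop. */
bool new_stop_then_waiter_check(void)
{
	struct toy_cmd_wait w = { 0 };

	cw_reinit(&w);              /* just before the request is sent */
	cw_complete(&w);            /* stop path */
	return cw_waiter_check(&w);
}
```

In the old ordering the waiter finds the counter already reset and would only return via the timeout; in the patched ordering the single complete() is still visible when the waiter checks.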


Re: [PATCH net-next] nfnetlink_queue: enable PID info retrieval

2016-06-09 Thread Daniel Wagner
Hi Daniel,

> [ Cc'ing John, Daniel, et al ]
> 
> Btw, while I just looked at scm_detach_fds(), I think commits ...
> 
>  * 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set
> correctly")
>  * d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set
> correctly")
> 
> ... might not be correct, maybe I'm missing something ...? Lets say
> process A
> has a socket fd that it sends via SCM_RIGHTS to process B. Process A was
> the
> one that called sk_alloc() originally. Now in scm_detach_fds() we
> install a new
> fd for process B pointing to the same sock (file's private_data) and
> above commits
> update the cached socket cgroup data for net_cls/net_prio to the new
> process B.
> So, if process A for example still sends data over that socket, skbs
> will then
> wrongly match on B's cgroup membership instead of A's, no?

I can't remember the details right now (I need to read up again but I won't
have time till Wednesday).

From your analysis I would say that is not the desired effect. A should
match against its own cgroup and not the one of B.

cheers,
daniel


Re: [PATCH v2 net-next 0/12] bpf: map pre-alloc

2016-03-08 Thread Daniel Wagner
Hi Alexei,

On 03/08/2016 06:57 AM, Alexei Starovoitov wrote:
> v1->v2:
> . fix few issues spotted by Daniel
> . converted stackmap into pre-allocation as well
> . added a workaround for lockdep false positive
> . added pcpu_freelist_populate to be used by hashmap and stackmap
> 
> this patch set switches bpf hash map to use pre-allocation by default
> and introduces BPF_F_NO_PREALLOC flag to keep old behavior for cases
> where full map pre-allocation is too memory expensive.
> 
> Some time back Daniel Wagner reported crashes when bpf hash map is
> used to compute time intervals between preempt_disable->preempt_enable
> and recently Tom Zanussi reported a dead lock in iovisor/bcc/funccount
> tool if it's used to count the number of invocations of kernel
> '*spin*' functions. Both problems are due to the recursive use of
> slub and can only be solved by pre-allocating all map elements.

I gave it a short spin and lathist sample works just fine.

cheers,
daniel


Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-23 Thread Daniel Wagner
On 11/23/2015 04:53 PM, Tejun Heo wrote:
> On Mon, Nov 23, 2015 at 09:54:32AM +0100, Daniel Wagner wrote:
> ...
>>> [3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
>>> [3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , 
>>> .owner: systemd/1, .owner_cpu: 1
>>> [3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
>>> [3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>>> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>>> [3.228906]  834a2160 88007c043ad0 81551edc 
>>> 88007c028000
>>> [3.229512]  88007c043af0 81136868 834a2160 
>>> 88007aff5940
>>> [3.230105]  88007c043b08 81136b05 834a2160 
>>> 88007c043b20
>>> [3.230716] Call Trace:
>>> [3.230906]  [] dump_stack+0x4e/0x82
>>> [3.231289]  [] spin_dump+0x78/0xc0
>>> [3.231642]  [] do_raw_spin_unlock+0x75/0xd0
>>> [3.232039]  [] _raw_spin_unlock+0x27/0x50
>>> [3.232431]  [] update_classid_sock+0x68/0x80
>>> [3.232836]  [] iterate_fd+0x71/0x150
>>> [3.233197]  [] update_classid+0x47/0x80
>>> [3.233571]  [] cgrp_attach+0x14/0x20
>>> [3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
>>> [3.234366]  [] cgroup_migrate+0xf5/0x190
>>> [3.235130]  [] cgroup_attach_task+0x176/0x200
>>> [3.235953]  [] __cgroup_procs_write+0x2ad/0x460
>>> [3.236805]  [] cgroup_procs_write+0x14/0x20
>>> [3.237205]  [] cgroup_file_write+0x35/0x1c0
>>> [3.237600]  [] kernfs_fop_write+0x141/0x190
>>> [3.237998]  [] __vfs_write+0x28/0xe0
>>> [3.239554]  [] vfs_write+0xac/0x1a0
>>> [3.240308]  [] SyS_write+0x49/0xb0
>>> [3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76
>>
>> I have enabled a few additional cgroup controllers as well, because I was
>> trying to figure out why I only see the 'memory' cgroup controller in 
>> cgroup.controllers. pid and io show up but not net_prio or net_cls.
>> Not sure why systemd (v227) is not mounting them.
> 
> net_prio and net_cls aren't gonna be on the v2 hierarchy.  The match
> in this patchset is being introduced to replace them; however, you can
> mount them separately on a v1 hierarchy and use the same as before.

Okay, I could have figured that myself I guess. I mounted the v1
hierarchy and it works as you have described it.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] sock, cgroup: add sock->sk_cgroup

2015-11-23 Thread Daniel Wagner
On 11/23/2015 04:48 PM, Tejun Heo wrote:
> On Mon, Nov 23, 2015 at 02:02:03PM +0100, Daniel Wagner wrote:
>> On 11/21/2015 05:13 PM, Tejun Heo wrote:
>>> Signed-off-by: Tejun Heo 
>>> Cc: Daniel Borkmann 
>>> Cc: Daniel Wagner 
>>
>> I did a quick test and for a new connection the cgroup2 match worked as
>> expected. For an existing connection I wasn't able to trigger the match.
>>
>> It is quite likely I am doing something wrong:
>>
>>  ssh into the box
>>  # mkdir /sys/fs/cgroup/test
>>  # echo $$ > /sys/fs/cgroup/test/cgroup.procs
>>  # echo $PPID > /sys/fs/cgroup/test/cgroup.procs
>>  # iptables -A OUTPUT -m cgroup --path test
>>
>> Should I see matches with the existing ssh session?
> 
> Socket is associated with the creating cgroup and stays associated
> with that cgroup until it's released.  Migrating the process doesn't
> change the ownership of the sockets it has created.  This is in line
> with how other stateful resources such as memory are handled in
> cgroup2 hierarchy.

Thanks for the explanation. Looks good to me:

Tested-by: Daniel Wagner 
Acked-by: Daniel Wagner 

Thanks,
Daniel
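
Tejun's explanation of socket ownership can be restated as a small model: the socket records the cgroup of its creator once and keeps it, whatever happens to the process later. This is only an illustration; the toy_* types and integer cgroup ids are hypothetical and do not correspond to the sock->sk_cgroup code in the patch.

```c
#include <stdbool.h>

struct toy_task { int cgroup_id; };
struct toy_sock { int cgroup_id; };

/* The socket is stamped with the creator's cgroup at creation time. */
struct toy_sock toy_socket_create(const struct toy_task *task)
{
	struct toy_sock sk = { task->cgroup_id };

	return sk;
}

/* Migration only changes the task; existing sockets are untouched. */
void toy_migrate(struct toy_task *task, int new_cgroup_id)
{
	task->cgroup_id = new_cgroup_id;
}

bool toy_cgroup_match(const struct toy_sock *sk, int cgroup_id)
{
	return sk->cgroup_id == cgroup_id;
}

/* An existing connection, opened before the shell was moved. */
bool existing_socket_matches(void)
{
	struct toy_task shell = { 1 };
	struct toy_sock ssh_conn = toy_socket_create(&shell);

	toy_migrate(&shell, 2);
	return toy_cgroup_match(&ssh_conn, 2);
}

/* A new connection, opened after the shell was moved. */
bool new_socket_matches(void)
{
	struct toy_task shell = { 1 };

	toy_migrate(&shell, 2);
	struct toy_sock fresh = toy_socket_create(&shell);

	return toy_cgroup_match(&fresh, 2);
}
```

This mirrors the test result above: the existing ssh session does not match the new cgroup, while a connection created after the move does.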


Re: [PATCH 7/9] sock, cgroup: add sock->sk_cgroup

2015-11-23 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:13 PM, Tejun Heo wrote:
> Signed-off-by: Tejun Heo 
> Cc: Daniel Borkmann 
> Cc: Daniel Wagner 

I did a quick test and for a new connection the cgroup2 match worked as
expected. For an existing connection I wasn't able to trigger the match.

It is quite likely I am doing something wrong:

ssh into the box
# mkdir /sys/fs/cgroup/test
# echo $$ > /sys/fs/cgroup/test/cgroup.procs
# echo $PPID > /sys/fs/cgroup/test/cgroup.procs
# iptables -A OUTPUT -m cgroup --path test

Should I see matches with the existing ssh session?

cheers,
daniel


Re: [PATCH 8/9] netfilter: prepare xt_cgroup for multi revisions

2015-11-23 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:14 PM, Tejun Heo wrote:
> xt_cgroup will grow cgroup2 path based match.  Postfix existing
> symbols with _v0 and prepare for multi revision registration.
> 
> Signed-off-by: Tejun Heo 
> Cc: Daniel Borkmann 
> Cc: Daniel Wagner 

Same as in my reply to patch #9 (yes, I know I am doing it in the wrong order...
though I can't stop now... :))

Tested-by: Daniel Wagner 
Acked-by: Daniel Wagner 

cheers,
daniel

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] netfilter: implement xt_cgroup cgroup2 path match

2015-11-23 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:14 PM, Tejun Heo wrote:
> +static int cgroup_mt_check_v1(const struct xt_mtchk_param *par)
> +{
> + struct xt_cgroup_info_v1 *info = par->matchinfo;
> + struct cgroup *cgrp;
> +
> + if ((info->invert_path & ~1) || (info->invert_classid & ~1))
> + return -EINVAL;

The checks below use pr_info() in case the configuration is not valid.
Is this missing here on purpose?

I have tested it slightly and it seems to work (also on an older
kernel). I don't know if that qualifies it for a Tested-by but at least
Acked-by should do the trick:

Tested-by: Daniel Wagner 
Acked-by: Daniel Wagner 

cheers,
daniel
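
The quoted check treats the invert fields as booleans: any bit other than bit 0 makes the configuration invalid. A minimal standalone sketch of that test follows; struct toy_cgroup_info and toy_check_invert() are hypothetical stand-ins, and the field widths are an assumption rather than the layout of xt_cgroup_info_v1.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the match info; only the two flags. */
struct toy_cgroup_info {
	uint8_t invert_path;
	uint8_t invert_classid;
};

/*
 * Accept only 0 or 1 in each flag. 'x & ~1' is non-zero whenever
 * any bit above bit 0 is set.
 */
int toy_check_invert(const struct toy_cgroup_info *info)
{
	if ((info->invert_path & ~1) || (info->invert_classid & ~1))
		return -EINVAL;
	return 0;
}
```

Unlike the later checks mentioned above, this sketch is silent on failure as well; whether a pr_info() belongs there is exactly the question raised in the review.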


Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-23 Thread Daniel Wagner
On 11/23/2015 08:11 AM, Daniel Wagner wrote:
> [3.217648] systemd[1]: tmp.mount: Directory /tmp to mount over is not 
> empty, mounting anyway.
> [3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
> [3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , 
> .owner: systemd/1, .owner_cpu: 1
> [3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
> [3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [3.228906]  834a2160 88007c043ad0 81551edc 
> 88007c028000
> [3.229512]  88007c043af0 81136868 834a2160 
> 88007aff5940
> [3.230105]  88007c043b08 81136b05 834a2160 
> 88007c043b20
> [3.230716] Call Trace:
> [3.230906]  [] dump_stack+0x4e/0x82
> [3.231289]  [] spin_dump+0x78/0xc0
> [3.231642]  [] do_raw_spin_unlock+0x75/0xd0
> [3.232039]  [] _raw_spin_unlock+0x27/0x50
> [3.232431]  [] update_classid_sock+0x68/0x80
> [3.232836]  [] iterate_fd+0x71/0x150
> [3.233197]  [] update_classid+0x47/0x80
> [3.233571]  [] cgrp_attach+0x14/0x20
> [3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
> [3.234366]  [] cgroup_migrate+0xf5/0x190
> [3.234747]  [] ? cgroup_migrate+0x5/0x190
> [3.235130]  [] cgroup_attach_task+0x176/0x200
> [3.235543]  [] ? cgroup_attach_task+0x5/0x200
> [3.235953]  [] __cgroup_procs_write+0x2ad/0x460
> [3.236377]  [] ? __cgroup_procs_write+0x5e/0x460
> [3.236805]  [] cgroup_procs_write+0x14/0x20
> [3.237205]  [] cgroup_file_write+0x35/0x1c0
> [3.237600]  [] kernfs_fop_write+0x141/0x190
> [3.237998]  [] __vfs_write+0x28/0xe0
> [3.238361]  [] ? percpu_down_read+0x57/0xa0
> [3.238761]  [] ? __sb_start_write+0xb4/0xf0
> [3.239154]  [] ? __sb_start_write+0xb4/0xf0
> [3.239554]  [] vfs_write+0xac/0x1a0
> [3.239930]  [] ? __fget_light+0x66/0x90
> [3.240308]  [] SyS_write+0x49/0xb0
> [3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76

I have enabled a few additional cgroup controllers as well, because I was
trying to figure out why I only see the 'memory' cgroup controller in 
cgroup.controllers. pid and io show up but not net_prio or net_cls.
Not sure why systemd (v227) is not mounting them.

Though, after a while a similar call trace is produced. I guess this
has nothing to do with the current changes.

[   11.594536] ------------[ cut here ]------------
[   11.595274] WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 pids_cancel.constprop.6+0x31/0x40()
[   11.595958] Modules linked in:
[   11.596199] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #196
[   11.596689] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[   11.597632]  81f66d8b 88007c04bb90 8155ccdc 

[   11.598234]  88007c04bbc8 810de202 8800793dda00 
88007a096800
[   11.598877]  88007c04bc80 88007a6b6200 0001 
88007c04bbd8
[   11.599547] Call Trace:
[   11.599784]  [] dump_stack+0x4e/0x82
[   11.600197]  [] warn_slowpath_common+0x82/0xc0
[   11.600705]  [] warn_slowpath_null+0x1a/0x20
[   11.601208]  [] pids_cancel.constprop.6+0x31/0x40
[   11.601764]  [] pids_can_attach+0x6d/0xf0
[   11.602245]  [] cgroup_taskset_migrate+0x6a/0x330
[   11.602795]  [] cgroup_migrate+0xf5/0x190
[   11.603276]  [] ? cgroup_migrate+0x5/0x190
[   11.603788]  [] cgroup_attach_task+0x176/0x200
[   11.604308]  [] ? cgroup_attach_task+0x5/0x200
[   11.604831]  [] __cgroup_procs_write+0x2ad/0x460
[   11.605367]  [] ? __cgroup_procs_write+0x5e/0x460
[   11.605929]  [] cgroup_procs_write+0x14/0x20
[   11.606448]  [] cgroup_file_write+0x35/0x1c0
[   11.606931]  [] kernfs_fop_write+0x141/0x190
[   11.607401]  [] __vfs_write+0x28/0xe0
[   11.607834]  [] ? percpu_down_read+0x57/0xa0
[   11.608366]  [] ? __sb_start_write+0xb4/0xf0
[   11.608874]  [] ? __sb_start_write+0xb4/0xf0
[   11.609343]  [] vfs_write+0xac/0x1a0
[   11.609843]  [] ? __fget_light+0x66/0x90
[   11.610315]  [] SyS_write+0x49/0xb0
[   11.610756]  [] entry_SYSCALL_64_fastpath+0x12/0x76
[   11.611305] ---[ end trace 7f953d0ce5af99ea ]---



Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-22 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:13 PM, Tejun Heo wrote:
> This is v3 of the xt_cgroup2 patchset.  Changes from the last take are
> 
> * Folded cgroup2 path matching into xt_cgroup as a new revision rather
>   than a separate xt_cgroup2 match as suggested by Pablo.
> 
> * Refreshed on top of Nina's net_cls dynamic config update fix patch.
>   I included the fix patch as part of this series to ease reviewing.

I started to play with your patches and was greeted by this:

[3.217648] systemd[1]: tmp.mount: Directory /tmp to mount over is not 
empty, mounting anyway.
[3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
[3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , .owner: 
systemd/1, .owner_cpu: 1
[3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
[3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[3.228906]  834a2160 88007c043ad0 81551edc 
88007c028000
[3.229512]  88007c043af0 81136868 834a2160 
88007aff5940
[3.230105]  88007c043b08 81136b05 834a2160 
88007c043b20
[3.230716] Call Trace:
[3.230906]  [] dump_stack+0x4e/0x82
[3.231289]  [] spin_dump+0x78/0xc0
[3.231642]  [] do_raw_spin_unlock+0x75/0xd0
[3.232039]  [] _raw_spin_unlock+0x27/0x50
[3.232431]  [] update_classid_sock+0x68/0x80
[3.232836]  [] iterate_fd+0x71/0x150
[3.233197]  [] update_classid+0x47/0x80
[3.233571]  [] cgrp_attach+0x14/0x20
[3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
[3.234366]  [] cgroup_migrate+0xf5/0x190
[3.234747]  [] ? cgroup_migrate+0x5/0x190
[3.235130]  [] cgroup_attach_task+0x176/0x200
[3.235543]  [] ? cgroup_attach_task+0x5/0x200
[3.235953]  [] __cgroup_procs_write+0x2ad/0x460
[3.236377]  [] ? __cgroup_procs_write+0x5e/0x460
[3.236805]  [] cgroup_procs_write+0x14/0x20
[3.237205]  [] cgroup_file_write+0x35/0x1c0
[3.237600]  [] kernfs_fop_write+0x141/0x190
[3.237998]  [] __vfs_write+0x28/0xe0
[3.238361]  [] ? percpu_down_read+0x57/0xa0
[3.238761]  [] ? __sb_start_write+0xb4/0xf0
[3.239154]  [] ? __sb_start_write+0xb4/0xf0
[3.239554]  [] vfs_write+0xac/0x1a0
[3.239930]  [] ? __fget_light+0x66/0x90
[3.240308]  [] SyS_write+0x49/0xb0
[3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76

I am using a Fedora 23 host with systemd.unified_cgroup_hierarchy=1. The
config is available here:

http://monom.org/cgroup/config-review-xt_cgroup2

It is probably complete rubbish, because it's my random test config.

cheers,
daniel


Re: [PATCH 6/7] sock, cgroup: add sock->sk_cgroup

2015-11-20 Thread Daniel Wagner
Hi Tejun,

On 11/19/2015 07:52 PM, Tejun Heo wrote:
> +/*
> + * There's a theoretical window where the following accessors race with
> + * updaters and return part of the previous pointer as the prioidx or
> + * classid.  Such races are short-lived and the result isn't critical.
> + */
>  static inline u16 sock_cgroup_prioidx(struct sock_cgroup_data *skcd)
>  {
> - return skcd->prioidx;
> + return (skcd->is_data & 1) ? skcd->prioidx : 1;
>  }
>  
>  static inline u32 sock_cgroup_classid(struct sock_cgroup_data *skcd)
>  {
> - return skcd->classid;
> + return (skcd->is_data & 1) ? skcd->classid : 0;
>  }


I am still trying to understand what the code does, hence this stupid
question:

Why does sock_cgroup_prioidx() return 1 if it is not data, while
sock_cgroup_classid() returns 0?
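
To make the question concrete, this is how I currently read the low-bit
tagging, written as a userspace sketch. The single 64-bit word, the bit
positions and all names are my own assumptions for illustration and not
the layout from the patch; only the fallback values 1 and 0 are taken
from the quoted hunk:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: one word that holds either a pointer or data. */
struct skcd_sketch {
	uint64_t val;
};

/* Store packed data: bit 0 is the tag, bits 16..31 prioidx, 32..63 classid. */
static void skcd_set_data(struct skcd_sketch *s, uint16_t prioidx,
			  uint32_t classid)
{
	s->val = 1ULL | ((uint64_t)prioidx << 16) | ((uint64_t)classid << 32);
}

/* Store a pointer: it must be at least 2-byte aligned so bit 0 stays clear. */
static void skcd_set_ptr(struct skcd_sketch *s, const void *ptr)
{
	s->val = (uint64_t)(uintptr_t)ptr;
	assert((s->val & 1) == 0);
}

/* Same shape as the quoted accessors: use the field only if the tag is set. */
static uint16_t skcd_prioidx(const struct skcd_sketch *s)
{
	return (s->val & 1) ? (uint16_t)(s->val >> 16) : 1;
}

static uint32_t skcd_classid(const struct skcd_sketch *s)
{
	return (s->val & 1) ? (uint32_t)(s->val >> 32) : 0;
}
```

In this reading, skcd_set_data(&s, 5, 0x10001) gives prioidx 5 and
classid 0x10001 back, while after skcd_set_ptr() both accessors return
their fallback values.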

thanks,
daniel


Re: [PATCH 4/7] netprio_cgroup: limit the maximum css->id to USHRT_MAX

2015-11-20 Thread Daniel Wagner
On 11/19/2015 07:52 PM, Tejun Heo wrote:
> netprio builds per-netdev contiguous priomap array which is indexed by
> css->id.  The array is allocated using kzalloc() effectively limiting
> the maximum ID supported to some thousand range.  This patch caps the
> maximum supported css->id to USHRT_MAX which should be way above what
> is actually useable.
> 
> This allows reducing sock->sk_cgrp_prioidx to u16 from u32.  The freed
> up part will be used to overload the cgroup related fields.
> sock->sk_cgrp_prioidx's position is swapped with sk_mark so that the
> two cgroup related fields are adjacent.
> 
> Signed-off-by: Tejun Heo 
> Cc: Daniel Borkmann 
> Cc: Daniel Wagner 

Acked-by: Daniel Wagner 


Re: [PATCH v2] bpf: BPF based latency tracing

2015-06-22 Thread Daniel Wagner
On 06/20/2015 10:14 AM, Daniel Borkmann wrote:
> I think it would be useful to perhaps have two options:
> 
> 1) User specifies a specific CPU and gets one such an output above.

Good point. Will do.

> 2) Summary view, i.e. to have the samples of each CPU for comparison
>next to each other in columns and maybe the histogram view a bit
>more compressed (perhaps summary of all CPUs).

I agree, the current view is not really optimal. I'll look into this as
well.

Alexei indicated that he is working on per-cpu variable support. I
think that would be extremely useful for dropping the hard-coded CPU
limit and turning this sample code into something more generic.
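
To show what the hard-coded limit means in practice, here is a rough
userspace sketch of a per-CPU log2 histogram kept in one flat array.
MAX_ENTRIES and MAX_CPU use the values from the patch; the helper names
and the key layout (cpu * MAX_ENTRIES + bucket) are assumptions for the
example:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ENTRIES	20	/* log2 buckets per CPU */
#define MAX_CPU		4	/* hard-coded CPU limit */

/* Hypothetical flat histogram: one row of MAX_ENTRIES counters per CPU. */
static uint64_t lat_hist[MAX_CPU * MAX_ENTRIES];

/* Map a latency to its log2 bucket: 1 -> 0, 2..3 -> 1, 4..7 -> 2, ... */
static unsigned int lat_bucket(uint64_t latency)
{
	unsigned int bucket = 0;

	while ((latency >>= 1) != 0)
		bucket++;

	/* Anything beyond the last bucket is accounted in the last bucket. */
	return bucket < MAX_ENTRIES ? bucket : MAX_ENTRIES - 1;
}

/* Account one sample; returns the array index used, or -1 for a bad CPU. */
static int lat_record(unsigned int cpu, uint64_t latency)
{
	unsigned int key;

	if (cpu >= MAX_CPU)
		return -1;	/* this is the limit per-cpu support would lift */

	key = cpu * MAX_ENTRIES + lat_bucket(latency);
	assert(key < MAX_CPU * MAX_ENTRIES);
	lat_hist[key]++;
	return (int)key;
}
```

A sample on a fifth CPU is simply dropped here, which is why a fixed
MAX_CPU does not scale beyond the test machine.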

> Anyway, it's sample code people can go with and modify individually.

I am interested in turning this code into a more useful tool, though I
think I am missing some background on why this code is kept as a
sample. Obviously, there is the API and ARCH dependency. As long as an
API change can be reliably detected, I don't see a real show stopper.
Maybe I am too naive. Furthermore, I expect that trace_preempt_[on|off]
won't change that often.

> Acked-by: Daniel Borkmann 

Thanks,
Daniel


[PATCH v2] bpf: BPF based latency tracing

2015-06-19 Thread Daniel Wagner
   16384 -> 32767    : 178      ||
   32768 -> 65535    : 59       ||
   65536 -> 131071   : 2        ||
  131072 -> 262143   : 0        ||
  262144 -> 524287   : 1        ||
  524288 -> 1048575  : 174      ||
CPU 3
      latency        : count    distribution
       1 -> 1        : 0        ||
       2 -> 3        : 0        ||
       4 -> 7        : 0        ||
       8 -> 15       : 0        ||
      16 -> 31       : 0        ||
      32 -> 63       : 0        ||
      64 -> 127      : 0        ||
     128 -> 255      : 0        ||
     256 -> 511      : 0        ||
     512 -> 1023     : 0        ||
    1024 -> 2047     : 0        ||
    2048 -> 4095     : 29626    |*** |
    4096 -> 8191     : 2704     |**  |
    8192 -> 16383    : 1090     ||
   16384 -> 32767    : 160      ||
   32768 -> 65535    : 72       ||
   65536 -> 131071   : 32       ||
  131072 -> 262143   : 26       ||
  262144 -> 524287   : 12       ||
  524288 -> 1048575  : 298      ||

All this is based on the tracex3 example written by
Alexei Starovoitov.

Signed-off-by: Daniel Wagner 
Cc: Alexei Starovoitov 
Cc: Alexei Starovoitov 
Cc: "David S. Miller" 
Cc: Daniel Borkmann 
Cc: Ingo Molnar 
Cc: linux-ker...@vger.kernel.org
Cc: netdev@vger.kernel.org
---
Hi Alexei,

Something broke in my toolchain and it took me a while to figure out what's
going on.

With the rebase on net-next no additional patches are needed and this
thing here runs fine.

This time with code...

cheers,
daniel

changes

v2: - forgot to do a git add... 
v1: - rebases on net-next

v0:
- renamed to lathist since there is no direct hw latency involved
-  use arrays instead of hash tables

 samples/bpf/Makefile       |   4 ++
 samples/bpf/lathist_kern.c |  99 +++
 samples/bpf/lathist_user.c | 103 +
 3 files changed, 206 insertions(+)
 create mode 100644 samples/bpf/lathist_kern.c
 create mode 100644 samples/bpf/lathist_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 46c6a8c..4450fed 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -12,6 +12,7 @@ hostprogs-y += tracex2
 hostprogs-y += tracex3
 hostprogs-y += tracex4
 hostprogs-y += tracex5
+hostprogs-y += lathist
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -24,6 +25,7 @@ tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
 tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
 tracex4-objs := bpf_load.o libbpf.o tracex4_user.o
 tracex5-objs := bpf_load.o libbpf.o tracex5_user.o
+lathist-objs := bpf_load.o libbpf.o lathist_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -36,6 +38,7 @@ always += tracex3_kern.o
 always += tracex4_kern.o
 always += tracex5_kern.o
 always += tcbpf1_kern.o
+always += lathist_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -48,6 +51,7 @@ HOSTLOADLIBES_tracex2 += -lelf
 HOSTLOADLIBES_tracex3 += -lelf
 HOSTLOADLIBES_tracex4 += -lelf -lrt
 HOSTLOADLIBES_tracex5 += -lelf
+HOSTLOADLIBES_lathist += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/lathist_kern.c b/samples/bpf/lathist_kern.c
new file mode 100644
index 0000000..18fa088
--- /dev/null
+++ b/samples/bpf/lathist_kern.c
@@ -0,0 +1,99 @@
+/* Copyright (c) 2013-2015 PLUMgrid, http://plumgrid.com
+ * Copyright (c) 2015 BMW Car IT GmbH
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include "bpf_helpers.h"
+
+#define MAX_ENTRIES	20
+#define MAX_CPU		4
+
+/* We need to stick to static allocated memory (an array instead of
+ * hash table) because managing dynamic mem

[PATCH v1] bpf: BPF based latency tracing

2015-06-19 Thread Daniel Wagner
   16384 -> 32767    : 178      ||
   32768 -> 65535    : 59       ||
   65536 -> 131071   : 2        ||
  131072 -> 262143   : 0        ||
  262144 -> 524287   : 1        ||
  524288 -> 1048575  : 174      ||
CPU 3
      latency        : count    distribution
       1 -> 1        : 0        ||
       2 -> 3        : 0        ||
       4 -> 7        : 0        ||
       8 -> 15       : 0        ||
      16 -> 31       : 0        ||
      32 -> 63       : 0        ||
      64 -> 127      : 0        ||
     128 -> 255      : 0        ||
     256 -> 511      : 0        ||
     512 -> 1023     : 0        ||
    1024 -> 2047     : 0        ||
    2048 -> 4095     : 29626    |*** |
    4096 -> 8191     : 2704     |**  |
    8192 -> 16383    : 1090     ||
   16384 -> 32767    : 160      ||
   32768 -> 65535    : 72       ||
   65536 -> 131071   : 32       ||
  131072 -> 262143   : 26       ||
  262144 -> 524287   : 12       ||
  524288 -> 1048575  : 298      ||

All this is based on the tracex3 example written by
Alexei Starovoitov.

Signed-off-by: Daniel Wagner 
Cc: Alexei Starovoitov 
Cc: "David S. Miller" 
Cc: Daniel Borkmann 
Cc: Ingo Molnar 
Cc: linux-ker...@vger.kernel.org
Cc: netdev@vger.kernel.org
---

Hi Alexei,

Something broke in my toolchain and it took me a while to figure out what's
going on.

With the rebase on net-next no additional patches are needed and this
thing here runs fine.

cheers,
daniel

changes

v1: - rebases on net-next

v0:
- renamed to lathist since there is no direct hw latency involved
-  use arrays instead of hash tables

 samples/bpf/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 46c6a8c..4450fed 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -12,6 +12,7 @@ hostprogs-y += tracex2
 hostprogs-y += tracex3
 hostprogs-y += tracex4
 hostprogs-y += tracex5
+hostprogs-y += lathist
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -24,6 +25,7 @@ tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
 tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
 tracex4-objs := bpf_load.o libbpf.o tracex4_user.o
 tracex5-objs := bpf_load.o libbpf.o tracex5_user.o
+lathist-objs := bpf_load.o libbpf.o lathist_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -36,6 +38,7 @@ always += tracex3_kern.o
 always += tracex4_kern.o
 always += tracex5_kern.o
 always += tcbpf1_kern.o
+always += lathist_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -48,6 +51,7 @@ HOSTLOADLIBES_tracex2 += -lelf
 HOSTLOADLIBES_tracex3 += -lelf
 HOSTLOADLIBES_tracex4 += -lelf -lrt
 HOSTLOADLIBES_tracex5 += -lelf
+HOSTLOADLIBES_lathist += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
-- 
2.1.0
