Re: wl1251 & mac address & calibration data

2016-12-15 Thread Daniel Wagner

On 12/16/2016 03:03 AM, Luis R. Rodriguez wrote:

For the new API a solution for "fallback mechanisms" should be clean
though and I am looking to stay as far as possible from the existing
mess. A solution to help both the old API and new API is possible for
the "fallback mechanism" though -- but for that I can only refer you
at this point to some of Daniel Wagner and Tom Gunderson's firmwared
deamon prospect. It should help pave the way for a clean solution and
help address other stupid issues.


The firmwared project is hosted here

https://github.com/teg/firmwared

As Luis pointed out, firmwared relies on FW_LOADER_USER_HELPER_FALLBACK, 
which is not enabled by default. I don't see any reason why firmwared 
should not also support loading calibration data. If we find a sound way 
to do this.


As you can see from the commit history it is a pretty young project and 
more ore less reanimation of the old udev firmware loader feature.  We 
are getting int into shape, adding integration tests etc.


The main motivation for this project is the get movement back in stuck 
discussion on the firmware loader API. Luis was very busy writing up all 
the details on the current situation and purely from the amount of 
documentation need to describe the API you can tell something is awry.


Thanks,
Daniel


[PATCH 0/2] wireless: Use complete() instead complete_all()

2016-08-18 Thread Daniel Wagner
From: Daniel Wagner <daniel.wag...@bmw-carit.de>

Hi,

Using complete_all() is not wrong per se but it suggest that there
might be more than one reader. For -rt I am reviewing all
complete_all() users and would like to leave only the real ones in the
tree. The main problem for -rt about complete_all() is that it can be
uses inside IRQ context and that can lead to unbounded amount work
inside the interrupt handler. That is a no no for -rt.

The patches grouped per subsystem and in small batches to allow
reviewing.

This series ignores all complete_all() usages in the firmware loading
path. They will be hopefully address by Luis' sysdata patches [0].
That leaves a couple of complete_all() calls.

The first patch fixes a real glitch for the carl9170 driver. I was
able to test it because I have the hardware. For the second one I
haven't found any dongle with that chip in my drawers. 

This series against net-next of today.

cheers,
daniel

[0] 
https://lkml.kernel.org/r/1466117661-22075-1-git-send-email-mcg...@kernel.org

Daniel Wagner (2):
  carl9170: Fix wrong completion usage
  ath10k: use complete() instead complete_all()

 drivers/net/wireless/ath/ath10k/core.c  | 16 
 drivers/net/wireless/ath/ath10k/mac.c   |  2 +-
 drivers/net/wireless/ath/carl9170/usb.c |  6 ++
 3 files changed, 11 insertions(+), 13 deletions(-)

-- 
2.7.4


[PATCH 2/2] ath10k: use complete() instead complete_all()

2016-08-18 Thread Daniel Wagner
From: Daniel Wagner <daniel.wag...@bmw-carit.de>

There is only one waiter for the completion, therefore there
is no need to use complete_all(). Let's make that clear by
using complete() instead of complete_all().

The usage pattern of the completion is:

waiter context  waker context

scan.started


ath10k_start_scan()
  lockdep_assert_held(conf_mutex)
  auth10k_wmi_start_scan()
  wait_for_completion_timeout(scan.started)

ath10k_wmi_event_scan_start_failed()
  complete(scan.started)

ath10k_wmi_event_scan_started()
  complete(scan.started)

scan.completed
--

ath10k_scan_stop()
  lockdep_assert_held(conf_mutex)
  ath10k_wmi_stop_scan()
  wait_for_completion_timeout(scan.completed)

__ath10k_scan_finish()
  complete(scan.completed)

scan.on_channel
---

ath10k_remain_on_channel()
  mutex_lock(conf_mutex)
  ath10k_start_scan()
  wait_for_completion_timeout(scan.on_channel)

ath10k_wmi_event_scan_foreign_chan()
  complete(scan.on_channel)

offchan_tx_completed


ath10k_offchan_tx_work()
  mutex_lock(conf_mutex)
  reinit_completion(offchan_tx_completed)
  wait_for_completion_timeout(offchan_tx_completed)

ath10k_report_offchain_tx()
  complete(offchan_tx_completed)

install_key_done

ath10k_install_key()
  lockep_assert_held(conf_mutex)
  reinit_completion(install_key_done)
  wait_for_completion_timeout(install_key_done)

ath10k_htt_t2h_msg_handler()
  complete(install_key_done)

vdev_setup_done
---

ath10k_monitor_vdev_start()
  lockdep_assert_held(conf_mutex)
   reinit_completion(vdev_setup_done)
  ath10k_vdev_setup_sync()
wait_for_completion_timeout(vdev_setup_done)

ath10k_wmi_event_vdev_start_resp()
  complete(vdev_setup_done)

ath10k_monitor_vdev_stop()
  lockdep_assert_held(conf_mutex)
  reinit_completion(vdev_setup_done()
  ath10k_vdev_setup_sync()
wait_for_completion_timeout(vdev_setup_done)

ath10k_wmi_event_vdev_stopped()
 complete(vdev_setup_done)

thermal.wmi_sync

ath10k_thermal_show_temp()
  mutex_lock(conf_mutex)
  reinit_completion(thermal.wmi_sync)
  wait_for_completion_timeout(thermal.wmi_sync)

ath10k_thermal_event_temperature()
  complete(thermal.wmi_sync)

bss_survey_done
---
ath10k_mac_update_bss_chan_survey
  lockdep_assert_held(conf_mutex)
  reinit_completion(bss_survey_done)
  wait_for_completion_timeout(bss_survey_done)

ath10k_wmi_event_pdev_bss_chan_info()
  complete(bss_survey_done)

All complete() calls happen while the conf_mutex is taken. That means
at max one waiter is possible.

Signed-off-by: Daniel Wagner <daniel.wag...@bmw-carit.de>
---
 drivers/net/wireless/ath/ath10k/core.c | 16 
 drivers/net/wireless/ath/ath10k/mac.c  |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c 
b/drivers/net/wireless/ath/ath10k/core.c
index e889829..ed76601 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -1497,14 +1497,14 @@ static void ath10k_core_restart(struct work_struct 
*work)
 
ieee80211_stop_queues(ar->hw);
ath10k_drain_tx(ar);
-   complete_all(>scan.started);
-   complete_all(>scan.completed);
-   complete_all(>scan.on_channel);
-   complete_all(>offchan_tx_completed);
-   complete_all(>install_key_done);
-   complete_all(>vdev_setup_done);
-   complete_all(>thermal.wmi_sync);
-   complete_all(>bss_survey_done);
+   complete(>scan.started);
+   complete(>scan.completed);
+   complete(>scan.on_channel);
+   complete(>offchan_tx_completed);
+   complete(>install_key_done);
+   complete(>vdev_setup_done);
+   complete(>thermal.wmi_sync);
+   complete(>bss_survey_done);
wake_up(>htt.empty_tx_wq);
wake_up(>wmi.tx_credits_wq);
wake_up(>peer_mapping_wq);
diff --git a/drivers/net/wireless/ath/ath10k/mac.c 
b/drivers/net/wireless/ath/ath10k/mac.c
index 0bbd0a0..c3c1c25 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k

[PATCH 1/2] carl9170: Fix wrong completion usage

2016-08-18 Thread Daniel Wagner
From: Daniel Wagner <daniel.wag...@bmw-carit.de>

carl9170_usb_stop() is used from several places to flush and cleanup any
pending work. The normal pattern is to send a request and wait for the
irq handler to call complete(). The completion is not reinitialized
during normal operation and as the old comment indicates it is important
to keep calls to wait_for_completion_timeout() and complete() balanced.

Calling complete_all() brings this equilibirum out of balance and needs
to be fixed by a reinit_completion(). But that opens a small race
window. It is possible that the sequence of complete_all(),
reinit_completion() is faster than the wait_for_completion_timeout() can
do its work. The wake up is not lost but the done counter test is after
reinit_completion() has been executed. The only reason we don't see
carl9170_exec_cmd() hang forever is we use the timeout version of
wait_for_copletion().

Let's fix this by reinitializing the completion (that is just setting
done counter to 0) just before we send out an request. Now,
carl9170_usb_stop() can be sure a complete() call is enough to make
progess since there is only one waiter at max. This is a common pattern
also seen in various drivers which use completion.

Signed-off-by: Daniel Wagner <daniel.wag...@bmw-carit.de>
---
 drivers/net/wireless/ath/carl9170/usb.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/carl9170/usb.c 
b/drivers/net/wireless/ath/carl9170/usb.c
index 76842e6..99ab203 100644
--- a/drivers/net/wireless/ath/carl9170/usb.c
+++ b/drivers/net/wireless/ath/carl9170/usb.c
@@ -670,6 +670,7 @@ int carl9170_exec_cmd(struct ar9170 *ar, const enum 
carl9170_cmd_oids cmd,
ar->readlen = outlen;
spin_unlock_bh(>cmd_lock);
 
+   reinit_completion(>cmd_wait);
err = __carl9170_exec_cmd(ar, >cmd, false);
 
if (!(cmd & CARL9170_CMD_ASYNC_FLAG)) {
@@ -778,10 +779,7 @@ void carl9170_usb_stop(struct ar9170 *ar)
spin_lock_bh(>cmd_lock);
ar->readlen = 0;
spin_unlock_bh(>cmd_lock);
-   complete_all(>cmd_wait);
-
-   /* This is required to prevent an early completion on _start */
-   reinit_completion(>cmd_wait);
+   complete(>cmd_wait);
 
/*
 * Note:
-- 
2.7.4


Re: [PATCH net-next] nfnetlink_queue: enable PID info retrieval

2016-06-10 Thread Daniel Wagner
Hi Daniel,

> [ Cc'ing John, Daniel, et al ]
> 
> Btw, while I just looked at scm_detach_fds(), I think commits ...
> 
>  * 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set
> correctly")
>  * d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set
> correctly")
> 
> ... might not be correct, maybe I'm missing something ...? Lets say
> process A
> has a socket fd that it sends via SCM_RIGHTS to process B. Process A was
> the
> one that called sk_alloc() originally. Now in scm_detach_fds() we
> install a new
> fd for process B pointing to the same sock (file's private_data) and
> above commits
> update the cached socket cgroup data for net_cls/net_prio to the new
> process B.
> So, if process A for example still sends data over that socket, skbs
> will then
> wrongly match on B's cgroup membership instead of A's, no?

I can't remember the details right now (need to read up again but I wont
have time till Wednesday).

>From your analysis I would say that is not the desired effect. A should
match against its own cgroup and not the one of B.

cheers,
daniel


Re: [PATCH v2 net-next 0/12] bpf: map pre-alloc

2016-03-08 Thread Daniel Wagner
Hi Alexei,

On 03/08/2016 06:57 AM, Alexei Starovoitov wrote:
> v1->v2:
> . fix few issues spotted by Daniel
> . converted stackmap into pre-allocation as well
> . added a workaround for lockdep false positive
> . added pcpu_freelist_populate to be used by hashmap and stackmap
> 
> this path set switches bpf hash map to use pre-allocation by default
> and introduces BPF_F_NO_PREALLOC flag to keep old behavior for cases
> where full map pre-allocation is too memory expensive.
> 
> Some time back Daniel Wagner reported crashes when bpf hash map is
> used to compute time intervals between preempt_disable->preempt_enable
> and recently Tom Zanussi reported a dead lock in iovisor/bcc/funccount
> tool if it's used to count the number of invocations of kernel
> '*spin*' functions. Both problems are due to the recursive use of
> slub and can only be solved by pre-allocating all map elements.

I gave it a short spin and lathist sample works just fine.

cheers,
daniel


Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-23 Thread Daniel Wagner
On 11/23/2015 08:11 AM, Daniel Wagner wrote:
> [3.217648] systemd[1]: tmp.mount: Directory /tmp to mount over is not 
> empty, mounting anyway.
> [3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
> [3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , 
> .owner: systemd/1, .owner_cpu: 1
> [3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
> [3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [3.228906]  834a2160 88007c043ad0 81551edc 
> 88007c028000
> [3.229512]  88007c043af0 81136868 834a2160 
> 88007aff5940
> [3.230105]  88007c043b08 81136b05 834a2160 
> 88007c043b20
> [3.230716] Call Trace:
> [3.230906]  [] dump_stack+0x4e/0x82
> [3.231289]  [] spin_dump+0x78/0xc0
> [3.231642]  [] do_raw_spin_unlock+0x75/0xd0
> [3.232039]  [] _raw_spin_unlock+0x27/0x50
> [3.232431]  [] update_classid_sock+0x68/0x80
> [3.232836]  [] iterate_fd+0x71/0x150
> [3.233197]  [] update_classid+0x47/0x80
> [3.233571]  [] cgrp_attach+0x14/0x20
> [3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
> [3.234366]  [] cgroup_migrate+0xf5/0x190
> [3.234747]  [] ? cgroup_migrate+0x5/0x190
> [3.235130]  [] cgroup_attach_task+0x176/0x200
> [3.235543]  [] ? cgroup_attach_task+0x5/0x200
> [3.235953]  [] __cgroup_procs_write+0x2ad/0x460
> [3.236377]  [] ? __cgroup_procs_write+0x5e/0x460
> [3.236805]  [] cgroup_procs_write+0x14/0x20
> [3.237205]  [] cgroup_file_write+0x35/0x1c0
> [3.237600]  [] kernfs_fop_write+0x141/0x190
> [3.237998]  [] __vfs_write+0x28/0xe0
> [3.238361]  [] ? percpu_down_read+0x57/0xa0
> [3.238761]  [] ? __sb_start_write+0xb4/0xf0
> [3.239154]  [] ? __sb_start_write+0xb4/0xf0
> [3.239554]  [] vfs_write+0xac/0x1a0
> [3.239930]  [] ? __fget_light+0x66/0x90
> [3.240308]  [] SyS_write+0x49/0xb0
> [3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76

I have enabled a few additional cgroup controllers as well, because I was
trying to figure out why I only see the 'memory' cgroup controller in 
cgroup.controllers. pid and io show up but not net_prio or net_cls.
Not sure why systemd (v227) is not mounting them.

Though, after a while a similar call trace is produced. I guess this
has nothing to do with the current changes.

[   11.594536] [ cut here ]
[   11.595274] WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 
pids_cancel.constprop.6+0x31/0x40()
[   11.595958] Modules linked in:
[   11.596199] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #196
[   11.596689] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[   11.597632]  81f66d8b 88007c04bb90 8155ccdc 

[   11.598234]  88007c04bbc8 810de202 8800793dda00 
88007a096800
[   11.598877]  88007c04bc80 88007a6b6200 0001 
88007c04bbd8
[   11.599547] Call Trace:
[   11.599784]  [] dump_stack+0x4e/0x82
[   11.600197]  [] warn_slowpath_common+0x82/0xc0
[   11.600705]  [] warn_slowpath_null+0x1a/0x20
[   11.601208]  [] pids_cancel.constprop.6+0x31/0x40
[   11.601764]  [] pids_can_attach+0x6d/0xf0
[   11.602245]  [] cgroup_taskset_migrate+0x6a/0x330
[   11.602795]  [] cgroup_migrate+0xf5/0x190
[   11.603276]  [] ? cgroup_migrate+0x5/0x190
[   11.603788]  [] cgroup_attach_task+0x176/0x200
[   11.604308]  [] ? cgroup_attach_task+0x5/0x200
[   11.604831]  [] __cgroup_procs_write+0x2ad/0x460
[   11.605367]  [] ? __cgroup_procs_write+0x5e/0x460
[   11.605929]  [] cgroup_procs_write+0x14/0x20
[   11.606448]  [] cgroup_file_write+0x35/0x1c0
[   11.606931]  [] kernfs_fop_write+0x141/0x190
[   11.607401]  [] __vfs_write+0x28/0xe0
[   11.607834]  [] ? percpu_down_read+0x57/0xa0
[   11.608366]  [] ? __sb_start_write+0xb4/0xf0
[   11.608874]  [] ? __sb_start_write+0xb4/0xf0
[   11.609343]  [] vfs_write+0xac/0x1a0
[   11.609843]  [] ? __fget_light+0x66/0x90
[   11.610315]  [] SyS_write+0x49/0xb0
[   11.610756]  [] entry_SYSCALL_64_fastpath+0x12/0x76
[   11.611305] ---[ end trace 7f953d0ce5af99ea ]---

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] netfilter: implement xt_cgroup cgroup2 path match

2015-11-23 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:14 PM, Tejun Heo wrote:> +static int
> cgroup_mt_check_v1(const struct xt_mtchk_param *par)
> +{
> + struct xt_cgroup_info_v1 *info = par->matchinfo;
> + struct cgroup *cgrp;
> +
> + if ((info->invert_path & ~1) || (info->invert_classid & ~1))
> + return -EINVAL;

The checks below use pr_info() in case the configuration is not valid.
Is this missing here on purpose?

I have tested it slightly and it seems to work (also on an older
kernel). I don't know if that qualifies it for a Tested-by but at least
Acked-by should do the trick:

Tested-by: Daniel Wagner <daniel.wag...@bmw-carit.de>
Acked-by: Daniel Wagner <daniel.wag...@bmw-carit.de>

cheers,
daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/9] netfilter: prepare xt_cgroup for multi revisions

2015-11-23 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:14 PM, Tejun Heo wrote:
> xt_cgroup will grow cgroup2 path based match.  Postfix existing
> symbols with _v0 and prepare for multi revision registration.
> 
> Signed-off-by: Tejun Heo <t...@kernel.org>
> Cc: Daniel Borkmann <dan...@iogearbox.net>
> Cc: Daniel Wagner <daniel.wag...@bmw-carit.de>

Same as in my reply to patch #9 (yes, I know do it wrong order...
thought can't stop now... :))

Tested-by: Daniel Wagner <daniel.wag...@bmw-carit.de>
Acked-by: Daniel Wagner <daniel.wag...@bmw-carit.de>

cheers,
daniel

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] sock, cgroup: add sock->sk_cgroup

2015-11-23 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:13 PM, Tejun Heo wrote:
> Signed-off-by: Tejun Heo <t...@kernel.org>
> Cc: Daniel Borkmann <dan...@iogearbox.net>
> Cc: Daniel Wagner <daniel.wag...@bmw-carit.de>

I did a quick test and for new connection the cgroup2 match worked as
expected. For an existing connection I wasn't able to trigger the match.

It is quite likely I do something wrong:

ssh into the box
# mkdir /sys/fs/cgroup/test
# echo $$ > /sys/fs/cgroup/test/cgroup.procs
# echo $PPID > /sys/fs/cgroup/test/cgroup.procs
# iptables -A OUTPUT -m cgroup --path test

Should I see matches with the existing ssh session?

cheers,
daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] sock, cgroup: add sock->sk_cgroup

2015-11-23 Thread Daniel Wagner
On 11/23/2015 04:48 PM, Tejun Heo wrote:
> On Mon, Nov 23, 2015 at 02:02:03PM +0100, Daniel Wagner wrote:
>> On 11/21/2015 05:13 PM, Tejun Heo wrote:
>>> Signed-off-by: Tejun Heo <t...@kernel.org>
>>> Cc: Daniel Borkmann <dan...@iogearbox.net>
>>> Cc: Daniel Wagner <daniel.wag...@bmw-carit.de>
>>
>> I did a quick test and for new connection the cgroup2 match worked as
>> expected. For an existing connection I wasn't able to trigger the match.
>>
>> It is quite likely I do something wrong:
>>
>>  ssh into the box
>>  # mkdir /sys/fs/cgroup/test
>>  # echo $$ > /sys/fs/cgroup/test/cgroup.procs
>>  # echo $PPID > /sys/fs/cgroup/test/cgroup.procs
>>  # iptables -A OUTPUT -m cgroup --path test
>>
>> Should I see matches with the existing ssh session?
> 
> Socket is associated with the creating cgroup and stays associated
> with that cgroup until it's released.  Migrating the process doesn't
> change the ownership of the sockets it has created.  This is in line
> with how other stateful resources such as memory are handled in
> cgroup2 hierarchy.

Thanks for the explanation. Looks good to me:

Tested-by: Daniel Wagner <daniel.wag...@bmw-carit.de>
Acked-by: Daniel Wagner <daniel.wag...@bmw-carit.de>

Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-23 Thread Daniel Wagner
On 11/23/2015 04:53 PM, Tejun Heo wrote:
> On Mon, Nov 23, 2015 at 09:54:32AM +0100, Daniel Wagner wrote:
> ...
>>> [3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
>>> [3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , 
>>> .owner: systemd/1, .owner_cpu: 1
>>> [3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
>>> [3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>>> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>>> [3.228906]  834a2160 88007c043ad0 81551edc 
>>> 88007c028000
>>> [3.229512]  88007c043af0 81136868 834a2160 
>>> 88007aff5940
>>> [3.230105]  88007c043b08 81136b05 834a2160 
>>> 88007c043b20
>>> [3.230716] Call Trace:
>>> [3.230906]  [] dump_stack+0x4e/0x82
>>> [3.231289]  [] spin_dump+0x78/0xc0
>>> [3.231642]  [] do_raw_spin_unlock+0x75/0xd0
>>> [3.232039]  [] _raw_spin_unlock+0x27/0x50
>>> [3.232431]  [] update_classid_sock+0x68/0x80
>>> [3.232836]  [] iterate_fd+0x71/0x150
>>> [3.233197]  [] update_classid+0x47/0x80
>>> [3.233571]  [] cgrp_attach+0x14/0x20
>>> [3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
>>> [3.234366]  [] cgroup_migrate+0xf5/0x190
>>> [3.235130]  [] cgroup_attach_task+0x176/0x200
>>> [3.235953]  [] __cgroup_procs_write+0x2ad/0x460
>>> [3.236805]  [] cgroup_procs_write+0x14/0x20
>>> [3.237205]  [] cgroup_file_write+0x35/0x1c0
>>> [3.237600]  [] kernfs_fop_write+0x141/0x190
>>> [3.237998]  [] __vfs_write+0x28/0xe0
>>> [3.239554]  [] vfs_write+0xac/0x1a0
>>> [3.240308]  [] SyS_write+0x49/0xb0
>>> [3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76
>>
>> I have enabled a few additional cgroup controllers as well, because I was
>> trying to figure out why I only see the 'memory' cgroup controller in 
>> cgroup.controllers. pid and io show up but not net_prio or net_cls.
>> Not sure why systemd (v227) is not mounting them.
> 
> net_prio and net_cls aren't gonna be on the v2 hierarchy.  The match
> in this patchset is being introduced to replace them; however, you can
> mount them separately on a v1 hierarchy and use the same as before.

Okay, I could have figured that myself I guess. I mounted the v1
hierarchy and it works as you have described it.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHSET v3] netfilter, cgroup: implement cgroup2 path match in xt_cgroup

2015-11-22 Thread Daniel Wagner
Hi Tejun,

On 11/21/2015 05:13 PM, Tejun Heo wrote:
> This is v3 of the xt_cgroup2 patchset.  Changes from the last take are
> 
> * Folded cgroup2 path matching into xt_cgroup as a new revision rather
>   than a separate xt_cgroup2 match as suggested by Pablo.
> 
> * Refreshed on top of Nina's net_cls dynamic config update fix patch.
>   I included the fix patch as part of this series to ease reviewing.

I started to play with your patches and was greeted by this:

[3.217648] systemd[1]: tmp.mount: Directory /tmp to mount over is not 
empty, mounting anyway.
[3.224665] BUG: spinlock bad magic on CPU#1, systemd/1
[3.225653]  lock: cgroup_sk_update_lock+0x0/0x60, .magic: , .owner: 
systemd/1, .owner_cpu: 1
[3.227034] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #195
[3.227862] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[3.228906]  834a2160 88007c043ad0 81551edc 
88007c028000
[3.229512]  88007c043af0 81136868 834a2160 
88007aff5940
[3.230105]  88007c043b08 81136b05 834a2160 
88007c043b20
[3.230716] Call Trace:
[3.230906]  [] dump_stack+0x4e/0x82
[3.231289]  [] spin_dump+0x78/0xc0
[3.231642]  [] do_raw_spin_unlock+0x75/0xd0
[3.232039]  [] _raw_spin_unlock+0x27/0x50
[3.232431]  [] update_classid_sock+0x68/0x80
[3.232836]  [] iterate_fd+0x71/0x150
[3.233197]  [] update_classid+0x47/0x80
[3.233571]  [] cgrp_attach+0x14/0x20
[3.233929]  [] cgroup_taskset_migrate+0x1e1/0x330
[3.234366]  [] cgroup_migrate+0xf5/0x190
[3.234747]  [] ? cgroup_migrate+0x5/0x190
[3.235130]  [] cgroup_attach_task+0x176/0x200
[3.235543]  [] ? cgroup_attach_task+0x5/0x200
[3.235953]  [] __cgroup_procs_write+0x2ad/0x460
[3.236377]  [] ? __cgroup_procs_write+0x5e/0x460
[3.236805]  [] cgroup_procs_write+0x14/0x20
[3.237205]  [] cgroup_file_write+0x35/0x1c0
[3.237600]  [] kernfs_fop_write+0x141/0x190
[3.237998]  [] __vfs_write+0x28/0xe0
[3.238361]  [] ? percpu_down_read+0x57/0xa0
[3.238761]  [] ? __sb_start_write+0xb4/0xf0
[3.239154]  [] ? __sb_start_write+0xb4/0xf0
[3.239554]  [] vfs_write+0xac/0x1a0
[3.239930]  [] ? __fget_light+0x66/0x90
[3.240308]  [] SyS_write+0x49/0xb0
[3.240656]  [] entry_SYSCALL_64_fastpath+0x12/0x76

I am using a Fedora 23 host with systemd.unified_cgroup_hierarchy=1. The config 
is
available here:

http://monom.org/cgroup/config-review-xt_cgroup2

Probably completely rubbish, because it's my random test config.

cheers,
daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] sock, cgroup: add sock->sk_cgroup

2015-11-20 Thread Daniel Wagner
Hi Tejun,

On 11/19/2015 07:52 PM, Tejun Heo wrote:
> +/*
> + * There's a theoretical window where the following accessors race with
> + * updaters and return part of the previous pointer as the prioidx or
> + * classid.  Such races are short-lived and the result isn't critical.
> + */
>  static inline u16 sock_cgroup_prioidx(struct sock_cgroup_data *skcd)
>  {
> - return skcd->prioidx;
> + return (skcd->is_data & 1) ? skcd->prioidx : 1;
>  }
>  
>  static inline u32 sock_cgroup_classid(struct sock_cgroup_data *skcd)
>  {
> - return skcd->classid;
> + return (skcd->is_data & 1) ? skcd->classid : 0;
>  }


I still try to understand what the code does, hence this stupid question:

Why is sock_cgroup_prioidx() returning 1 if is not data and
sock_cgroup_classid() a 0?

thanks,
daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] netprio_cgroup: limit the maximum css->id to USHRT_MAX

2015-11-20 Thread Daniel Wagner
On 11/19/2015 07:52 PM, Tejun Heo wrote:
> netprio builds per-netdev contiguous priomap array which is indexed by
> css->id.  The array is allocated using kzalloc() effectively limiting
> the maximum ID supported to some thousand range.  This patch caps the
> maximum supported css->id to USHRT_MAX which should be way above what
> is actually useable.
> 
> This allows reducing sock->sk_cgrp_prioidx to u16 from u32.  The freed
> up part will be used to overload the cgroup related fields.
> sock->sk_cgrp_prioidx's position is swapped with sk_mark so that the
> two cgroup related fields are adjacent.
> 
> Signed-off-by: Tejun Heo <t...@kernel.org>
> Cc: Daniel Borkmann <dan...@iogearbox.net>
> Cc: Daniel Wagner <daniel.wag...@bmw-carit.de>

Acked-by: Daniel Wagner <daniel.wag...@bmw-carit.de>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] bpf: BPF based latency tracing

2015-06-22 Thread Daniel Wagner
On 06/20/2015 10:14 AM, Daniel Borkmann wrote:
 I think it would be useful to perhaps have two options:
 
 1) User specifies a specific CPU and gets one such an output above.

Good point. Will do.

 2) Summary view, i.e. to have the samples of each CPU for comparison
next to each other in columns and maybe the histogram view a bit
more compressed (perhaps summary of all CPUs).

I agree, the current view is not really optimal. I'll look into this as
well.

Alexei indicated that he is working on per-cpu variables support. I
think that would be extremely useful to drop the hard coded limit of
CPUs and turning this sample code into some more generic code.

 Anyway, it's sample code people can go with and modify individually.

I am interested to turn this code into a more useful tool. Though I
think I miss some background information why this code is kept as
samples. Obviously, there is the API and ARCH dependency. As long as an
API change can reliable be detected I don't see a real show stopper.
Maybe I am too naive. Furthermore I expected that trace_preempt_[on|off]
wont change that often.

 Acked-by: Daniel Borkmann dan...@iogearbox.net

Thanks,
Daniel
--
To unsubscribe from this list: send the line unsubscribe netdev in


[PATCH v1] bpf: BPF based latency tracing

2015-06-19 Thread Daniel Wagner
||
  262144 - 524287   : 1||
  524288 - 1048575  : 174  ||
CPU 3
  latency: count distribution
   1 - 1: 0||
   2 - 3: 0||
   4 - 7: 0||
   8 - 15   : 0||
  16 - 31   : 0||
  32 - 63   : 0||
  64 - 127  : 0||
 128 - 255  : 0||
 256 - 511  : 0||
 512 - 1023 : 0||
1024 - 2047 : 0||
2048 - 4095 : 29626|*** |
4096 - 8191 : 2704 |**  |
8192 - 16383: 1090 ||
   16384 - 32767: 160  ||
   32768 - 65535: 72   ||
   65536 - 131071   : 32   ||
  131072 - 262143   : 26   ||
  262144 - 524287   : 12   ||
  524288 - 1048575  : 298  ||

All this is based on the trace3 examples written by
Alexei Starovoitov a...@plumgrid.com.

Signed-off-by: Daniel Wagner daniel.wag...@bmw-carit.de
Cc: Alexei Starovoitov a...@plumgrid.com
Cc: David S. Miller da...@davemloft.net
Cc: Daniel Borkmann dan...@iogearbox.net
Cc: Ingo Molnar mi...@kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: netdev@vger.kernel.org
---

Hi Alexei,

Something broke in my toolchain and it took me a while to figure out whats
going on.

With the rebase on net-next no additinal patches are needed and this
thing here runs fine.

cheers,
daniel

changes

v1: - rebases on net-next

v0:
- renamed to lathist since there is no direct hw latency involved
-  use arrays instead of hash tables

samples/bpf/Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 46c6a8c..4450fed 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -12,6 +12,7 @@ hostprogs-y += tracex2
 hostprogs-y += tracex3
 hostprogs-y += tracex4
 hostprogs-y += tracex5
+hostprogs-y += lathist
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -24,6 +25,7 @@ tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
 tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
 tracex4-objs := bpf_load.o libbpf.o tracex4_user.o
 tracex5-objs := bpf_load.o libbpf.o tracex5_user.o
+lathist-objs := bpf_load.o libbpf.o lathist_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -36,6 +38,7 @@ always += tracex3_kern.o
 always += tracex4_kern.o
 always += tracex5_kern.o
 always += tcbpf1_kern.o
+always += lathist_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -48,6 +51,7 @@ HOSTLOADLIBES_tracex2 += -lelf
 HOSTLOADLIBES_tracex3 += -lelf
 HOSTLOADLIBES_tracex4 += -lelf -lrt
 HOSTLOADLIBES_tracex5 += -lelf
+HOSTLOADLIBES_lathist += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe netdev in


[PATCH v2] bpf: BPF based latency tracing

2015-06-19 Thread Daniel Wagner
||
  262144 - 524287   : 1||
  524288 - 1048575  : 174  ||
CPU 3
  latency: count distribution
   1 - 1: 0||
   2 - 3: 0||
   4 - 7: 0||
   8 - 15   : 0||
  16 - 31   : 0||
  32 - 63   : 0||
  64 - 127  : 0||
 128 - 255  : 0||
 256 - 511  : 0||
 512 - 1023 : 0||
1024 - 2047 : 0||
2048 - 4095 : 29626|*** |
4096 - 8191 : 2704 |**  |
8192 - 16383: 1090 ||
   16384 - 32767: 160  ||
   32768 - 65535: 72   ||
   65536 - 131071   : 32   ||
  131072 - 262143   : 26   ||
  262144 - 524287   : 12   ||
  524288 - 1048575  : 298  ||

All this is based on the trace3 examples written by
Alexei Starovoitov a...@plumgrid.com.

Signed-off-by: Daniel Wagner daniel.wag...@bmw-carit.de
Cc: Alexei Starovoitov a...@plumgrid.com
Cc: Alexei Starovoitov a...@plumgrid.com
Cc: David S. Miller da...@davemloft.net
Cc: Daniel Borkmann dan...@iogearbox.net
Cc: Ingo Molnar mi...@kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: netdev@vger.kernel.org
---
Hi Alexei,

Something broke in my toolchain and it took me a while to figure out whats
going on.

With the rebase on net-next no additinal patches are needed and this
thing here runs fine.

This time with code...

cheers,
daniel

changes

v2: - forgot to do a git add... 
v1: - rebases on net-next

v0:
- renamed to lathist since there is no direct hw latency involved
-  use arrays instead of hash tables

samples/bpf/Makefile   |   4 ++
 samples/bpf/lathist_kern.c |  99 +++
 samples/bpf/lathist_user.c | 103 +
 3 files changed, 206 insertions(+)
 create mode 100644 samples/bpf/lathist_kern.c
 create mode 100644 samples/bpf/lathist_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 46c6a8c..4450fed 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -12,6 +12,7 @@ hostprogs-y += tracex2
 hostprogs-y += tracex3
 hostprogs-y += tracex4
 hostprogs-y += tracex5
+hostprogs-y += lathist
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -24,6 +25,7 @@ tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
 tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
 tracex4-objs := bpf_load.o libbpf.o tracex4_user.o
 tracex5-objs := bpf_load.o libbpf.o tracex5_user.o
+lathist-objs := bpf_load.o libbpf.o lathist_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -36,6 +38,7 @@ always += tracex3_kern.o
 always += tracex4_kern.o
 always += tracex5_kern.o
 always += tcbpf1_kern.o
+always += lathist_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -48,6 +51,7 @@ HOSTLOADLIBES_tracex2 += -lelf
 HOSTLOADLIBES_tracex3 += -lelf
 HOSTLOADLIBES_tracex4 += -lelf -lrt
 HOSTLOADLIBES_tracex5 += -lelf
+HOSTLOADLIBES_lathist += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/lathist_kern.c b/samples/bpf/lathist_kern.c
new file mode 100644
index 000..18fa088
--- /dev/null
+++ b/samples/bpf/lathist_kern.c
@@ -0,0 +1,99 @@
+/* Copyright (c) 2013-2015 PLUMgrid, http://plumgrid.com
+ * Copyright (c) 2015 BMW Car IT GmbH
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include linux/version.h
+#include linux/ptrace.h
+#include uapi/linux/bpf.h
+#include bpf_helpers.h
+
+#define MAX_ENTRIES20
+#define MAX_CPU4
+
+/* We need to stick to static allocated memory (an array instead of
+ * hash table) because managing dynamic memory from the
+ * trace_preempt_[on|off] tracepoints hooks is not supported.
+ */
+
+struct bpf_map_def SEC(maps) my_map = {
+   .type = BPF_MAP_TYPE_ARRAY