Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino

2018-05-25 Thread Alban Crequy
id'
> PID TID COMMFUNC -
> 40674067a.out   __x64_sys_nanosleep cgid = 106b2
> 40674067    a.out   __x64_sys_nanosleep cgid = 106b2
> 40674067a.out   __x64_sys_nanosleep cgid = 106b2
> ^C[yhs@localhost tools]$

> The kernel and user space cgid matches. Will provide a
> formal patch later.




> On Mon, May 21, 2018 at 5:24 PM, Y Song <ys114...@gmail.com> wrote:
> > On Mon, May 21, 2018 at 9:26 AM, Alexei Starovoitov
> > <alexei.starovoi...@gmail.com> wrote:
> >> On Sun, May 13, 2018 at 07:33:18PM +0200, Alban Crequy wrote:
> >>>
> >>> +BPF_CALL_2(bpf_get_current_cgroup_ino, u32, hierarchy, u64, flags)
> >>> +{
> >>> + // TODO: pick the correct hierarchy instead of the mem
controller
> >>> + struct cgroup *cgrp = task_cgroup(current, memory_cgrp_id);
> >>> +
> >>> + if (unlikely(!cgrp))
> >>> + return -EINVAL;
> >>> + if (unlikely(hierarchy))
> >>> + return -EINVAL;
> >>> + if (unlikely(flags))
> >>> + return -EINVAL;
> >>> +
> >>> + return cgrp->kn->id.ino;
> >>
> >> ino only is not enough to identify cgroup. It needs generation number
too.
> >> I don't quite see how hierarchy and flags can be used in the future.
> >> Also why limit it to memcg?
> >>
> >> How about something like this instead:
> >>
> >> BPF_CALL_2(bpf_get_current_cgroup_id)
> >> {
> >> struct cgroup *cgrp = task_dfl_cgroup(current);
> >>
> >> return cgrp->kn->id.id;
> >> }
> >> The user space can use fhandle api to get the same 64-bit id.
> >
> > I think this should work. This will also be useful to bcc as user
> > space can encode desired id
> > in the bpf program and compared that id to the current cgroup id, so we
can have
> > cgroup level tracing (esp. stat collection) support. To cope with
> > cgroup hierarchy, user can use
> > cgroup-array based approach or explicitly compare against multiple
cgroup id's.


Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino

2018-05-21 Thread Alban Crequy
On Mon, May 14, 2018 at 9:38 PM, Y Song <ys114...@gmail.com> wrote:
>
> On Sun, May 13, 2018 at 10:33 AM, Alban Crequy <alban.cre...@gmail.com> wrote:
> > From: Alban Crequy <al...@kinvolk.io>
> >
> > bpf_get_current_cgroup_ino() allows BPF trace programs to get the inode
> > of the cgroup where the current process resides.
> >
> > My use case is to get statistics about syscalls done by a specific
> > Kubernetes container. I have a tracepoint on raw_syscalls/sys_enter and
> > a BPF map containing the cgroup inode that I want to trace. I use
> > bpf_get_current_cgroup_ino() and I quickly return from the tracepoint if
> > the inode is not in the BPF hash map.
>
> Alternatively, the kernel already has bpf_current_task_under_cgroup helper
> which uses a cgroup array to store cgroup fd's. If the current task is
> in the hierarchy of a particular cgroup, the helper will return true.
>
> One difference between your helper and bpf_current_task_under_cgroup() is
> that your helper tests against a particular cgroup, not including its
> children, but
> bpf_current_task_under_cgroup() will return true even the task is in a
> nested cgroup.
>
> Maybe this will work for you?

I like the behaviour that it checks for children cgroups. But with the
cgroup array, I can test only one cgroup at a time. I would like to be
able to enable my tracer for a few Kubernetes containers or all by
adding the inodes of a few cgroups in a hash map. So I could keep
separate stats for each. With bpf_current_task_under_cgroup(), I would
need to iterate over the list of cgroups, which is difficult with BPF.

Also, Kubernetes is cgroup-v1 only and bpf_current_task_under_cgroup()
is cgroup-v2 only. In Kubernetes, the processes remain in the root of
the v2 hierarchy. I'd like to be able to select the cgroup hierarchy
in my helper so it'd work for both v1 and v2.

> > Without this BPF helper, I would need to keep track of all pids in the
> > container. The Netlink proc connector can be used to follow process
> > creation and destruction but it is racy.
> >
> > This patch only looks at the memory cgroup, which was enough for me
> > since each Kubernetes container is placed in a different mem cgroup.
> > For a generic implementation, I'm not sure how to proceed: it seems I
> > would need to use 'for_each_root(root)' (see example in
> > proc_cgroup_show() from kernel/cgroup/cgroup.c) but I don't know if
> > taking the cgroup mutex is possible in the BPF helper function. It might
> > be ok in the tracepoint raw_syscalls/sys_enter but could the mutex
> > already be taken in some other tracepoints?
>
> mutex is not allowed in a helper since it can block.

Ok. I don't know how to implement my helper properly then. Maybe I
could just use the 1st cgroup-v1 hierarchy (the name=systemd one) so I
don't have to iterate over the hierarchies. But would that be
acceptable?

Cheers,
Alban

> > Signed-off-by: Alban Crequy <al...@kinvolk.io>
> > ---
> >  include/uapi/linux/bpf.h | 11 ++-
> >  kernel/trace/bpf_trace.c | 25 +
> >  2 files changed, 35 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c5ec89732a8d..38ac3959cdf3 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -755,6 +755,14 @@ union bpf_attr {
> >   * @addr: pointer to struct sockaddr to bind socket to
> >   * @addr_len: length of sockaddr structure
> >   * Return: 0 on success or negative error code
> > + *
> > + * u64 bpf_get_current_cgroup_ino(hierarchy, flags)
> > + * Get the cgroup{1,2} inode of current task under the specified 
> > hierarchy.
> > + * @hierarchy: cgroup hierarchy
>
> Not sure what is the value to specify hierarchy here.
> A cgroup directory fd?
>
> > + * @flags: reserved for future use
> > + * Return:
> > + *   == 0 error
>
> looks like < 0 means error.
>
> > + *> 0 inode of the cgroup
>>= 0 means good?
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)  \
> > FN(unspec), \
> > @@ -821,7 +829,8 @@ union bpf_attr {
> > FN(msg_apply_bytes),\
> > FN(msg_cork_bytes), \
> > FN(msg_pull_data),  \
> > -   FN(bind),
> > +   FN(bind),   \
> > +   FN(get_current_cgroup_ino),
> >
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which 
> > helper
> >   * function eBPF program intends to call
> > diff --git a/kernel/

[PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino

2018-05-13 Thread Alban Crequy
From: Alban Crequy <al...@kinvolk.io>

bpf_get_current_cgroup_ino() allows BPF trace programs to get the inode
of the cgroup where the current process resides.

My use case is to get statistics about syscalls done by a specific
Kubernetes container. I have a tracepoint on raw_syscalls/sys_enter and
a BPF map containing the cgroup inode that I want to trace. I use
bpf_get_current_cgroup_ino() and I quickly return from the tracepoint if
the inode is not in the BPF hash map.

Without this BPF helper, I would need to keep track of all pids in the
container. The Netlink proc connector can be used to follow process
creation and destruction but it is racy.

This patch only looks at the memory cgroup, which was enough for me
since each Kubernetes container is placed in a different mem cgroup.
For a generic implementation, I'm not sure how to proceed: it seems I
would need to use 'for_each_root(root)' (see example in
proc_cgroup_show() from kernel/cgroup/cgroup.c) but I don't know if
taking the cgroup mutex is possible in the BPF helper function. It might
be ok in the tracepoint raw_syscalls/sys_enter but could the mutex
already be taken in some other tracepoints?

Signed-off-by: Alban Crequy <al...@kinvolk.io>
---
 include/uapi/linux/bpf.h | 11 ++-
 kernel/trace/bpf_trace.c | 25 +
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c5ec89732a8d..38ac3959cdf3 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -755,6 +755,14 @@ union bpf_attr {
  * @addr: pointer to struct sockaddr to bind socket to
  * @addr_len: length of sockaddr structure
  * Return: 0 on success or negative error code
+ *
+ * u64 bpf_get_current_cgroup_ino(hierarchy, flags)
+ * Get the cgroup{1,2} inode of current task under the specified hierarchy.
+ * @hierarchy: cgroup hierarchy
+ * @flags: reserved for future use
+ * Return:
+ *   == 0 error
+ *> 0 inode of the cgroup
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -821,7 +829,8 @@ union bpf_attr {
FN(msg_apply_bytes),\
FN(msg_cork_bytes), \
FN(msg_pull_data),  \
-   FN(bind),
+   FN(bind),   \
+   FN(get_current_cgroup_ino),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 56ba0f2a01db..9bf92a786639 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -524,6 +524,29 @@ static const struct bpf_func_proto 
bpf_probe_read_str_proto = {
.arg3_type  = ARG_ANYTHING,
 };
 
+BPF_CALL_2(bpf_get_current_cgroup_ino, u32, hierarchy, u64, flags)
+{
+   // TODO: pick the correct hierarchy instead of the mem controller
+   struct cgroup *cgrp = task_cgroup(current, memory_cgrp_id);
+
+   if (unlikely(!cgrp))
+   return -EINVAL;
+   if (unlikely(hierarchy))
+   return -EINVAL;
+   if (unlikely(flags))
+   return -EINVAL;
+
+   return cgrp->kn->id.ino;
+}
+
+static const struct bpf_func_proto bpf_get_current_cgroup_ino_proto = {
+   .func   = bpf_get_current_cgroup_ino,
+   .gpl_only   = false,
+   .ret_type   = RET_INTEGER,
+   .arg1_type  = ARG_DONTCARE,
+   .arg2_type  = ARG_DONTCARE,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -564,6 +587,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct 
bpf_prog *prog)
return _get_prandom_u32_proto;
case BPF_FUNC_probe_read_str:
return _probe_read_str_proto;
+   case BPF_FUNC_get_current_cgroup_ino:
+   return _get_current_cgroup_ino_proto;
default:
return NULL;
}
-- 
2.14.3



Re: [PATCH net-next] tcp: add tracepoint trace_tcp_retransmit_synack()

2017-10-27 Thread Alban Crequy
Hi,

On 25 October 2017 at 01:57, Song Liu <songliubrav...@fb.com> wrote:
> This tracepoint can be used to trace synack retransmits. It maintains
> pointer to struct request_sock.
>
> We cannot simply reuse trace_tcp_retransmit_skb() here, because the
> sk here is the LISTEN socket. The IP addresses and ports should be
> extracted from struct request_sock.
>
> Signed-off-by: Song Liu <songliubrav...@fb.com>
> ---
>  include/trace/events/tcp.h | 56 
> ++
>  net/ipv4/tcp_output.c  |  1 +
>  2 files changed, 57 insertions(+)
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index 03699ba..07a 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -237,6 +237,62 @@ TRACE_EVENT(tcp_set_state,
>   show_tcp_state_name(__entry->newstate))
>  );
>
> +TRACE_EVENT(tcp_retransmit_synack,
> +
> +   TP_PROTO(const struct sock *sk, const struct request_sock *req),
> +
> +   TP_ARGS(sk, req),
> +
> +   TP_STRUCT__entry(
> +   __field(const void *, skaddr)
> +   __field(const void *, req)
> +   __field(__u16, sport)
> +   __field(__u16, dport)
> +   __array(__u8, saddr, 4)
> +   __array(__u8, daddr, 4)
> +   __array(__u8, saddr_v6, 16)
> +   __array(__u8, daddr_v6, 16)

Would it make sense to add the inode of the network namespace that
owns the socket? (along with the major/minor of the nsfs)

If the kernel later gains tracepoints for TCP connect, accept, close
including the netns ino, then I might be able to replace some
ebpf-kprobes code by ebpf-tracepoints code :)

> [...]

Thanks,
Alban


Re: [PATCH net-next v7 08/10] bpf: Add a Landlock sandbox example

2017-09-01 Thread Alban Crequy
Hi Mickaël,

On 21 August 2017 at 02:09, Mickaël Salaün <m...@digikod.net> wrote:
> Add a basic sandbox tool to create a process isolated from some part of
> the system. This sandbox create a read-only environment. It is only
> allowed to write to a character device such as a TTY:
...
> +   /*
> +* This check allows the action on the file if it is a directory or a
> +* pipe. Otherwise, a message is printed to the eBPF log.
> +*/
> +   if (S_ISCHR(ret) || S_ISFIFO(ret))
> +   return 0;


The comment says "directory", but the code checks for "character device".

Thanks!
Alban


[PATCH] net: core: Fix slab-out-of-bounds in netdev_stats_to_stats64

2017-07-02 Thread Alban Browaeys
0 00 00 00 00 00 00 00 00 07 fc fc fc fc
 ^
  8801be248c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  8801be248c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ==

Signed-off-by: Alban Browaeys <alban.browa...@gmail.com>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 416137c64bf8..25f9461eff3f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7751,7 +7751,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 
*stats64,
 {
 #if BITS_PER_LONG == 64
BUILD_BUG_ON(sizeof(*stats64) < sizeof(*netdev_stats));
-   memcpy(stats64, netdev_stats, sizeof(*stats64));
+   memcpy(stats64, netdev_stats, sizeof(*netdev_stats));
/* zero out counters that only exist in rtnl_link_stats64 */
memset((char *)stats64 + sizeof(*netdev_stats), 0,
   sizeof(*stats64) - sizeof(*netdev_stats));
-- 
2.13.2



[PATCH v3] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-04-03 Thread Alban Crequy
From: Alban Crequy <al...@kinvolk.io>

When a kretprobe is installed on a kernel function, there is a maximum
limit of how many calls in parallel it can catch (aka "maxactive"). A
kernel module could call register_kretprobe() and initialize maxactive
(see example in samples/kprobes/kretprobe_example.c).

But that is not exposed to userspace and it is currently not possible to
choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events

The default maxactive can be as low as 1 on single-core with a
non-preemptive kernel. This is too low and we need to increase it not
only for recursive functions, but for functions that sleep or resched.

This patch updates the format of the command that can be written to
kprobe_events so that maxactive can be optionally specified.

I need this for a bpf program attached to the kretprobe of
inet_csk_accept, which can sleep for a long time.

This patch includes a basic selftest:

> # ./ftracetest -v  test.d/kprobe/
> === Ftrace unit tests ===
> [1] Kprobe dynamic event - adding and removing[PASS]
> [2] Kprobe dynamic event - busy event check   [PASS]
> [3] Kprobe dynamic event with arguments   [PASS]
> [4] Kprobes event arguments with types[PASS]
> [5] Kprobe dynamic event with function tracer [PASS]
> [6] Kretprobe dynamic event with arguments[PASS]
> [7] Kretprobe dynamic event with maxactive[PASS]
>
> # of passed:  7
> # of failed:  0
> # of unresolved:  0
> # of untested:  0
> # of unsupported:  0
> # of xfailed:  0
> # of undefined(test bug):  0

BugLink: https://github.com/iovisor/bcc/issues/1072
Signed-off-by: Alban Crequy <al...@kinvolk.io>

---

Changes since v2:
- Explain the default maxactive value in the documentation.
  (Review from Steven Rostedt)

Changes since v1:
- Remove "(*)" from documentation. (Review from Masami Hiramatsu)
- Fix support for "r100" without the event name (Review from Masami Hiramatsu)
- Get rid of magic numbers within the code.  (Review from Steven Rostedt)
  Note that I didn't use KRETPROBE_MAXACTIVE_ALLOC since that patch is not
  merged.
- Return -E2BIG when maxactive is too big.
- Add basic selftest
---
 Documentation/trace/kprobetrace.txt|  5 ++-
 kernel/trace/trace_kprobe.c| 39 ++
 .../ftrace/test.d/kprobe/kretprobe_maxactive.tc| 39 ++
 3 files changed, 76 insertions(+), 7 deletions(-)
 create mode 100644 
tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc

diff --git a/Documentation/trace/kprobetrace.txt 
b/Documentation/trace/kprobetrace.txt
index 41ef9d8..25f3960 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
 Synopsis of kprobe_events
 -
   p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS] : Set a probe
-  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]: Set a return probe
+  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS] : Set a return probe
   -:[GRP/]EVENT: Clear a probe
 
  GRP   : Group name. If omitted, use "kprobes" for it.
@@ -32,6 +32,9 @@ Synopsis of kprobe_events
  MOD   : Module name which has given SYM.
  SYM[+offs]: Symbol+offset where the probe is inserted.
  MEMADDR   : Address where the probe is inserted.
+ MAXACTIVE : Maximum number of instances of the specified function that
+ can be probed simultaneously, or 0 for the default value
+ as defined in Documentation/kprobes.txt section 1.3.1.
 
  FETCHARGS : Arguments. Each probe can have up to 128 args.
   %REG : Fetch register REG
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index c5089c7..ae81f3c 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -25,6 +25,7 @@
 #include "trace_probe.h"
 
 #define KPROBE_EVENT_SYSTEM "kprobes"
+#define KRETPROBE_MAXACTIVE_MAX 4096
 
 /**
  * Kprobe event core functions
@@ -282,6 +283,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
*group,
 void *addr,
 const char *symbol,
 unsigned long offs,
+int maxactive,
 int nargs, bool is_return)
 {
struct trace_kprobe *tk;
@@ -309,6 +311,8 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
*group,
else
tk->rp.kp.pre_handler = kprobe_dispatcher;
 
+   tk->rp.maxactive = maxactive;
+
if (!event || !is_good_name(event)) {
ret = -EINVAL;
goto error;
@@ -598,8 +602,10 @@ static int cr

[PATCH v2] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-03-31 Thread Alban Crequy
When a kretprobe is installed on a kernel function, there is a maximum
limit of how many calls in parallel it can catch (aka "maxactive"). A
kernel module could call register_kretprobe() and initialize maxactive
(see example in samples/kprobes/kretprobe_example.c).

But that is not exposed to userspace and it is currently not possible to
choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events

The default maxactive can be as low as 1 on single-core with a
non-preemptive kernel. This is too low and we need to increase it not
only for recursive functions, but for functions that sleep or resched.

This patch updates the format of the command that can be written to
kprobe_events so that maxactive can be optionally specified.

I need this for a bpf program attached to the kretprobe of
inet_csk_accept, which can sleep for a long time.

This patch includes a basic selftest:

> # ./ftracetest -v  test.d/kprobe/
> === Ftrace unit tests ===
> [1] Kprobe dynamic event - adding and removing[PASS]
> [2] Kprobe dynamic event - busy event check   [PASS]
> [3] Kprobe dynamic event with arguments   [PASS]
> [4] Kprobes event arguments with types[PASS]
> [5] Kprobe dynamic event with function tracer [PASS]
> [6] Kretprobe dynamic event with arguments[PASS]
> [7] Kretprobe dynamic event with maxactive[PASS]
>
> # of passed:  7
> # of failed:  0
> # of unresolved:  0
> # of untested:  0
> # of unsupported:  0
> # of xfailed:  0
> # of undefined(test bug):  0

BugLink: https://github.com/iovisor/bcc/issues/1072
Signed-off-by: Alban Crequy <al...@kinvolk.io>

---

Changes since v1:
- Remove "(*)" from documentation. (Review from Masami Hiramatsu)
- Fix support for "r100" without the event name (Review from Masami Hiramatsu)
- Get rid of magic numbers within the code.  (Review from Steven Rostedt)
  Note that I didn't use KRETPROBE_MAXACTIVE_ALLOC since that patch is not
  merged.
- Return -E2BIG when maxactive is too big.
- Add basic selftest
---
 Documentation/trace/kprobetrace.txt|  4 ++-
 kernel/trace/trace_kprobe.c| 39 ++
 .../ftrace/test.d/kprobe/kretprobe_maxactive.tc| 39 ++
 3 files changed, 75 insertions(+), 7 deletions(-)
 create mode 100644 
tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc

diff --git a/Documentation/trace/kprobetrace.txt 
b/Documentation/trace/kprobetrace.txt
index 41ef9d8..7051a20 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
 Synopsis of kprobe_events
 -
   p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS] : Set a probe
-  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]: Set a return probe
+  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS] : Set a return probe
   -:[GRP/]EVENT: Clear a probe
 
  GRP   : Group name. If omitted, use "kprobes" for it.
@@ -32,6 +32,8 @@ Synopsis of kprobe_events
  MOD   : Module name which has given SYM.
  SYM[+offs]: Symbol+offset where the probe is inserted.
  MEMADDR   : Address where the probe is inserted.
+ MAXACTIVE : Maximum number of instances of the specified function that
+ can be probed simultaneously, or 0 for the default.
 
  FETCHARGS : Arguments. Each probe can have up to 128 args.
   %REG : Fetch register REG
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index c5089c7..ae81f3c 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -25,6 +25,7 @@
 #include "trace_probe.h"
 
 #define KPROBE_EVENT_SYSTEM "kprobes"
+#define KRETPROBE_MAXACTIVE_MAX 4096
 
 /**
  * Kprobe event core functions
@@ -282,6 +283,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
*group,
 void *addr,
 const char *symbol,
 unsigned long offs,
+int maxactive,
 int nargs, bool is_return)
 {
struct trace_kprobe *tk;
@@ -309,6 +311,8 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
*group,
else
tk->rp.kp.pre_handler = kprobe_dispatcher;
 
+   tk->rp.maxactive = maxactive;
+
if (!event || !is_good_name(event)) {
ret = -EINVAL;
goto error;
@@ -598,8 +602,10 @@ static int create_trace_kprobe(int argc, char **argv)
 {
/*
 * Argument syntax:
-*  - Add kprobe: p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
-*  - Add kretprobe: r[:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS

Re: [RFC PATCH tip/master 1/3] trace: kprobes: Show sum of probe/retprobe nmissed count

2017-03-31 Thread Alban Crequy
On Wed, Mar 29, 2017 at 7:22 AM, Masami Hiramatsu <mhira...@kernel.org> wrote:
> Show sum of probe and retprobe nmissed count in
> kprobe_profile, since retprobe can be missed even
> if the kprobe itself succeeeded.
> This explains user why their return probe didn't hit
> sometimes.
>
> Signed-off-by: Masami Hiramatsu <mhira...@kernel.org>

I tested this patch with my kretprobe on "inet_csk_accept" when there
are many processes waiting in the accept() syscall. I can now
successfully see the nmissed counter in
/sys/kernel/debug/tracing/kprobe_profile being incremented when the
kretprobe is missed.

Tested-by: Alban Crequy <al...@kinvolk.io>


> ---
>  kernel/trace/trace_kprobe.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 013f4e7..bbdc3de 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -896,7 +896,7 @@ static int probes_profile_seq_show(struct seq_file *m, 
> void *v)
> seq_printf(m, "  %-44s %15lu %15lu\n",
>trace_event_name(>tp.call),
>trace_kprobe_nhit(tk),
> -  tk->rp.kp.nmissed);
> +  tk->rp.kp.nmissed + tk->rp.nmissed);
>
> return 0;
>  }
>


Re: [RFC PATCH tip/master 2/3] kprobes: Allocate kretprobe instance if its free list is empty

2017-03-30 Thread Alban Crequy
On Thu, Mar 30, 2017 at 8:53 AM, Ingo Molnar <mi...@kernel.org> wrote:
>
> * Masami Hiramatsu <mhira...@kernel.org> wrote:
>
>> > So this is something I missed while the original code was merged, but the 
>> > concept
>> > looks a bit weird: why do we do any "allocation" while a handler is 
>> > executing?
>> >
>> > That's fundamentally fragile. What's the maximum number of parallel
>> > 'kretprobe_instance' required per kretprobe - one per CPU?
>>
>> It depends on the place where we put the probe. If the probed function will 
>> be
>> blocked (yield to other tasks), then we need a same number of threads on
>> the system which can invoke the function. So, ultimately, it is same
>> as function_graph tracer, we need it for each thread.
>
> So then put it into task_struct (assuming there's no 
> kretprobe-inside-kretprobe
> nesting allowed). There's just no way in hell we should be calling any complex
> kernel function from kernel probes!

Some kprobes are called from an interruption context. We have a kprobe
on tcp_set_state() and this is sometimes called when the network card
receives a packet.

> I mean, think about it, a kretprobe can be installed in a lot of places, and 
> now
> we want to call get_free_pages() from it?? This would add a massive amount of
> fragility.
>
> Instrumentation must be _simple_, every patch that adds more complexity to the
> most fundamental code path of it should raise a red flag ...
>
> So let's make this more robust, ok?
>
> Thanks,
>
> Ingo

Thanks,
Alban


Re: [PATCH v1] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-03-28 Thread Alban Crequy
Thanks for the review,

On Tue, Mar 28, 2017 at 5:23 PM, Masami Hiramatsu <mhira...@kernel.org> wrote:
> On Tue, 28 Mar 2017 15:52:22 +0200
> Alban Crequy <alban.cre...@gmail.com> wrote:
>
>> When a kretprobe is installed on a kernel function, there is a maximum
>> limit of how many calls in parallel it can catch (aka "maxactive"). A
>> kernel module could call register_kretprobe() and initialize maxactive
>> (see example in samples/kprobes/kretprobe_example.c).
>>
>> But that is not exposed to userspace and it is currently not possible to
>> choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events
>
> I see, I found same issue last week...
>
>>
>> The default maxactive can be as low as 1 on single-core with a
>> non-preemptive kernel. This is too low and we need to increase it not
>> only for recursive functions, but for functions that sleep or resched.
>>
>> This patch updates the format of the command that can be written to
>> kprobe_events so that maxactive can be optionally specified.
>
> Good! this is completely same what I'm planning to add.
>
>>
>> I need this for a bpf program attached to the kretprobe of
>> inet_csk_accept, which can sleep for a long time.
>
> I'm also preparing another patch for using kmalloc(GFP_ATOMIC) in
> kretprobe_pre_handler(), since this manual way is sometimes hard to
> estimate how many instances needed. Anyway, since that may involve
> unwilling memory allocation, this patch also needed.

Where is that kretprobe_pre_handler()? I don't see any allocations in
pre_handler_kretprobe().

>> BugLink: https://github.com/iovisor/bcc/issues/1072
>
> Could you also add Lukasz to Cc list, since he had reported an issue
> related this one.
>
> https://www.spinics.net/lists/linux-trace/msg00448.html
>
> Please see my comments below.
>
>> Signed-off-by: Alban Crequy <al...@kinvolk.io>
>> ---
>>  Documentation/trace/kprobetrace.txt |  4 +++-
>>  kernel/trace/trace_kprobe.c | 34 +-
>>  2 files changed, 32 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/trace/kprobetrace.txt 
>> b/Documentation/trace/kprobetrace.txt
>> index 41ef9d8..655ca7e 100644
>> --- a/Documentation/trace/kprobetrace.txt
>> +++ b/Documentation/trace/kprobetrace.txt
>> @@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
>>  Synopsis of kprobe_events
>>  -
>>p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS]   : Set a probe
>> -  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]  : Set a return probe
>> +  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]   : Set a return 
>> probe
>>-:[GRP/]EVENT  : Clear a probe
>>
>>   GRP : Group name. If omitted, use "kprobes" for it.
>> @@ -32,6 +32,8 @@ Synopsis of kprobe_events
>>   MOD : Module name which has given SYM.
>>   SYM[+offs]  : Symbol+offset where the probe is inserted.
>>   MEMADDR : Address where the probe is inserted.
>> + MAXACTIVE   : Maximum number of instances of the specified function that
>> +   can be probed simultaneously, or 0 for the default.(*)
>
> Here, yoy don't need (*), since [MAXACTIVE] is not a FETCHARGS.

Why not? (*) refers to "(*) only for return probe." and the maxactive
only makes sense for the kretprobe.

>>   FETCHARGS   : Arguments. Each probe can have up to 128 args.
>>%REG   : Fetch register REG
>> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
>> index 5f688cc..807e01c 100644
>> --- a/kernel/trace/trace_kprobe.c
>> +++ b/kernel/trace/trace_kprobe.c
>> @@ -282,6 +282,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const 
>> char *group,
>>void *addr,
>>const char *symbol,
>>unsigned long offs,
>> +  int maxactive,
>>int nargs, bool is_return)
>>  {
>>   struct trace_kprobe *tk;
>> @@ -309,6 +310,8 @@ static struct trace_kprobe *alloc_trace_kprobe(const 
>> char *group,
>>   else
>>   tk->rp.kp.pre_handler = kprobe_dispatcher;
>>
>> + tk->rp.maxactive = maxactive;
>> +
>>   if (!event || !is_good_name(event)) {
>>   ret = -EINVAL;
>>   goto error;
>> @@ -598,8 +601,10 @@ static

[PATCH v1] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-03-28 Thread Alban Crequy
When a kretprobe is installed on a kernel function, there is a maximum
limit of how many calls in parallel it can catch (aka "maxactive"). A
kernel module could call register_kretprobe() and initialize maxactive
(see example in samples/kprobes/kretprobe_example.c).

But that is not exposed to userspace and it is currently not possible to
choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events

The default maxactive can be as low as 1 on single-core with a
non-preemptive kernel. This is too low and we need to increase it not
only for recursive functions, but for functions that sleep or resched.

This patch updates the format of the command that can be written to
kprobe_events so that maxactive can be optionally specified.

I need this for a bpf program attached to the kretprobe of
inet_csk_accept, which can sleep for a long time.

BugLink: https://github.com/iovisor/bcc/issues/1072
Signed-off-by: Alban Crequy <al...@kinvolk.io>
---
 Documentation/trace/kprobetrace.txt |  4 +++-
 kernel/trace/trace_kprobe.c | 34 +-
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/kprobetrace.txt 
b/Documentation/trace/kprobetrace.txt
index 41ef9d8..655ca7e 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
 Synopsis of kprobe_events
 -
   p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS] : Set a probe
-  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]: Set a return probe
+  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS] : Set a return probe
   -:[GRP/]EVENT: Clear a probe
 
  GRP   : Group name. If omitted, use "kprobes" for it.
@@ -32,6 +32,8 @@ Synopsis of kprobe_events
  MOD   : Module name which has given SYM.
  SYM[+offs]: Symbol+offset where the probe is inserted.
  MEMADDR   : Address where the probe is inserted.
+ MAXACTIVE : Maximum number of instances of the specified function that
+ can be probed simultaneously, or 0 for the default.(*)
 
  FETCHARGS : Arguments. Each probe can have up to 128 args.
   %REG : Fetch register REG
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5f688cc..807e01c 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -282,6 +282,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
*group,
 void *addr,
 const char *symbol,
 unsigned long offs,
+int maxactive,
 int nargs, bool is_return)
 {
struct trace_kprobe *tk;
@@ -309,6 +310,8 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
*group,
else
tk->rp.kp.pre_handler = kprobe_dispatcher;
 
+   tk->rp.maxactive = maxactive;
+
if (!event || !is_good_name(event)) {
ret = -EINVAL;
goto error;
@@ -598,8 +601,10 @@ static int create_trace_kprobe(int argc, char **argv)
 {
/*
 * Argument syntax:
-*  - Add kprobe: p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
-*  - Add kretprobe: r[:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS]
+*  - Add kprobe:
+*  p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
+*  - Add kretprobe:
+*  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS]
 * Fetch args:
 *  $retval : fetch return value
 *  $stack  : fetch stack address
@@ -619,6 +624,7 @@ static int create_trace_kprobe(int argc, char **argv)
int i, ret = 0;
bool is_return = false, is_delete = false;
char *symbol = NULL, *event = NULL, *group = NULL;
+   int maxactive = 0;
char *arg;
unsigned long offset = 0;
void *addr = NULL;
@@ -637,8 +643,26 @@ static int create_trace_kprobe(int argc, char **argv)
return -EINVAL;
}
 
-   if (argv[0][1] == ':') {
+   if (is_return && isdigit(argv[0][1]) && strchr([0][1], ':')) {
+   event = strchr([0][1], ':') + 1;
+   event[-1] = '\0';
+   ret = kstrtouint([0][1], 0, );
+   if (ret) {
+   pr_info("Failed to parse maxactive.\n");
+   return ret;
+   }
+   /* kretprobes instances are iterated over via a list. The
+* maximum should stay reasonable.
+*/
+   if (maxactive > 1024) {
+   pr_info("Maxactive is too big.\n");
+   return -EINVAL;
+   

Re: [PATCH 1/7] Documentation: dt: net: Update the ath9k binding for SoC devices

2017-03-28 Thread Alban
On Mon, 27 Mar 2017 18:11:15 +0200
Christian Lamparter <chunk...@googlemail.com> wrote:

> On Monday, March 13, 2017 10:05:09 PM CEST Alban wrote:
> > The current binding only cover PCI devices so extend it for SoC devices.
> > 
> > Most SoC platforms use an MTD partition for the calibration data
> > instead of an EEPROM. The qca,no-eeprom property was added to allow
> > loading the EEPROM content using firmware loading. This new binding
> > replace this hack with NVMEM cells, so we also mark the qca,no-eeprom
> > property as deprecated in case anyone ever used it.  
> 
> Please don't mark "qca,no-eeprom" as deprecated then.
> If some devices geniously need to rely on userspace for extracting 
> and processing the calibration data, it should be stay a
> optional properties.

Deprecated just mean that it shouldn't be used for new devices. But as
it is not used by any board, misuse the firmware loading API and
firmware loader user helper are deprecated in udev, I find we could also
just drop it.

> For example: A device that can't do easily without "qca,no-eeprom" is
> the AVM FRITZ!WLAN Repeater 300E. For this device, the caldata 
> is stored in the flash, however for whatever reason the vendor
> choose to "reverse" it. (like completely back to front, not byteswapped
> or something). So an extra "unreversing step" is required. So, it would
> require some sort of a special nvmem-provider-processor as an 
> alternative.

Or just handle this special eeprom format in the ath9k driver. I doubt
that this case is so common that it would justify adding a whole new
layer to nvmem.

> > Signed-off-by: Alban <al...@free.fr>
> > ---
> > diff --git a/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt 
> > b/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt
> > index b7396c8..61f5f6d 100644
> > --- a/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt
> > +++ b/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt
> > @@ -27,16 +27,34 @@ Required properties:
> > - 0034 for AR9462
> > - 0036 for AR9565
> > - 0037 for AR9485
> > +   For SoC devices the compatible should be "qca,-wmac"
> > +   and one of the following fallbacks:
> > +   - "qca,ar9100-wmac"
> > +   - "qca,ar9330-wmac"
> > +   - "qca,ar9340-wmac"
> > +   - "qca,qca9550-wmac"
> > +   - "qca,qca9530-wmac"
> >  - reg: Address and length of the register set for the device.
> >  
> > +Required properties for SoC devices:
> > +- interrupt-parent: phandle of the parent interrupt controller.
> > +- interrupts: Interrupt specifier for the controllers interrupt.
> > +
> >  Optional properties:
> > +- mac-address: See ethernet.txt in the parent directory
> > +- local-mac-address: See ethernet.txt in the parent directory
> > +- clock-names: has to be "ref"
> > +- clocks: phandle of the reference clock
> > +- resets: phandle of the reset line
> > +- nvmem-cell-names: has to be "eeprom" and/or "address"
> > +- nvmem-cells: phandle to the eeprom nvmem cell and/or to the mac address
> > +   nvmem cell.
> > +
> > +Deprecated properties:
> >  - qca,no-eeprom: Indicates that there is no physical EEPROM
> > connected to the ath9k wireless chip (in this case the calibration /
> > EEPROM data will be loaded from userspace
> > using the kernel firmware loader).
> > -- mac-address: See ethernet.txt in the parent directory
> > -- local-mac-address: See ethernet.txt in the parent directory
> > -  
> It sounds like you want to deprecate mac-address and
> local-mac-address as well. If so you sould add this to the commit as
> well. From my point of view, people mostly flat-out patched the
> eeprom-image if they wanted to set the mac-address. However, this was
> an extra step, if nvmem does away with it, I'm completely fine with
> deprecating these properties.

The produced diff is very misleading because the mac-address properties
get lumped with the other new optional  properties. But if you look
closely it just move qca,no-eeprom to the deprecated section.

Alban


pgp0BJ7VscCj1.pgp
Description: OpenPGP digital signature


Re: [PATCH 3/7] ath9k: Add support for reading the EEPROM data using the nvmem API

2017-03-23 Thread Alban
On Tue, 14 Mar 2017 00:53:55 +0100
Christian Lamparter <chunk...@googlemail.com> wrote:

> On Monday, March 13, 2017 10:05:11 PM CET Alban wrote:
> > Currently SoC platforms use a firmware request to get the EEPROM data.
> > This is mostly a hack and rely on using a user-helper scripts which is
> > deprecated. A nicer alternative is to use the nvmem API which was
> > designed for this kind of task.
> > 
> > Furthermore we let CONFIG_ATH9K_AHB select CONFIG_NVMEM as such
> > devices will generally use this method for loading the EEPROM data.
> > 
> > Signed-off-by: Alban <al...@free.fr>
> > ---
> > @@ -654,6 +656,25 @@ static int ath9k_init_softc(u16 devid, struct 
> > ath_softc *sc,
> > if (ret)
> > return ret;
> >  
> > +   /* If the EEPROM hasn't been retrieved via firmware request
> > +* use the nvmem API insted.
> > +*/
> > +   if (!ah->eeprom_blob) {
> > +   struct nvmem_cell *eeprom_cell;
> > +
> > +   eeprom_cell = nvmem_cell_get(ah->dev, "eeprom");
> > +   if (!IS_ERR(eeprom_cell)) {
> > +   ah->eeprom_data = nvmem_cell_read(
> > +   eeprom_cell, >eeprom_size);
> > +   nvmem_cell_put(eeprom_cell);
> > +
> > +   if (IS_ERR(ah->eeprom_data)) {
> > +   dev_err(ah->dev, "failed to read eeprom");
> > +   return PTR_ERR(ah->eeprom_data);
> > +   }
> > +   }
> > +   }
> > +
> > if (ath9k_led_active_high != -1)
> > ah->config.led_active_high = ath9k_led_active_high == 1;
> >
> Are you sure this works with AR93XX SoCs that have the calibration data
> in the OTP?

I only tested this on an ar9132 platform, so it might well be that a
few things are missing for the newer SoC. However this shouldn't break
anything existing as a platform needs to define an nvmem cell on the
athk9 device for this code to be used at all.

Alban


pgpkNvuX79h5H.pgp
Description: OpenPGP digital signature


[PATCH 7/7] ath9k: hw: Reset the device with the external reset before init

2017-03-13 Thread Alban
On the SoC platform the board code often manually reset the device
before registering it. To allow the same to happen on DT platforms
let the driver call the reset before init.

Signed-off-by: Alban <al...@free.fr>
---
 drivers/net/wireless/ath/ath9k/hw.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/wireless/ath/ath9k/hw.c 
b/drivers/net/wireless/ath/ath9k/hw.c
index efc0435..dfb13bc 100644
--- a/drivers/net/wireless/ath/ath9k/hw.c
+++ b/drivers/net/wireless/ath/ath9k/hw.c
@@ -576,6 +576,13 @@ static int __ath9k_hw_init(struct ath_hw *ah)
struct ath_common *common = ath9k_hw_common(ah);
int r = 0;
 
+   /* Reset the device before using it */
+   r = ath9k_hw_external_reset(ah);
+   if (r) {
+   ath_err(common, "Failed to reset chip\n");
+   return r;
+   }
+
ath9k_hw_read_revisions(ah);
 
switch (ah->hw_version.macVersion) {
-- 
2.7.4



[PATCH 3/7] ath9k: Add support for reading the EEPROM data using the nvmem API

2017-03-13 Thread Alban
Currently SoC platforms use a firmware request to get the EEPROM data.
This is mostly a hack and rely on using a user-helper scripts which is
deprecated. A nicer alternative is to use the nvmem API which was
designed for this kind of task.

Furthermore we let CONFIG_ATH9K_AHB select CONFIG_NVMEM as such
devices will generally use this method for loading the EEPROM data.

Signed-off-by: Alban <al...@free.fr>
---
 drivers/net/wireless/ath/ath9k/Kconfig  |  1 +
 drivers/net/wireless/ath/ath9k/eeprom.c | 10 ++
 drivers/net/wireless/ath/ath9k/hw.h |  2 ++
 drivers/net/wireless/ath/ath9k/init.c   | 21 +
 4 files changed, 34 insertions(+)

diff --git a/drivers/net/wireless/ath/ath9k/Kconfig 
b/drivers/net/wireless/ath/ath9k/Kconfig
index 783a38f..1558c03 100644
--- a/drivers/net/wireless/ath/ath9k/Kconfig
+++ b/drivers/net/wireless/ath/ath9k/Kconfig
@@ -49,6 +49,7 @@ config ATH9K_PCI
 config ATH9K_AHB
bool "Atheros ath9k AHB bus support"
depends on ATH9K
+   select NVMEM
default n
---help---
  This option enables the AHB bus support in ath9k.
diff --git a/drivers/net/wireless/ath/ath9k/eeprom.c 
b/drivers/net/wireless/ath/ath9k/eeprom.c
index fb80ec8..1f28222 100644
--- a/drivers/net/wireless/ath/ath9k/eeprom.c
+++ b/drivers/net/wireless/ath/ath9k/eeprom.c
@@ -127,6 +127,14 @@ static bool ath9k_hw_nvram_read_pdata(struct 
ath9k_platform_data *pdata,
 offset, data);
 }
 
+static bool ath9k_hw_nvram_read_data(struct ath_hw *ah,
+off_t offset, u16 *data)
+{
+   return ath9k_hw_nvram_read_array(ah->eeprom_data,
+ah->eeprom_size / 2,
+offset, data);
+}
+
 static bool ath9k_hw_nvram_read_firmware(const struct firmware *eeprom_blob,
 off_t offset, u16 *data)
 {
@@ -143,6 +151,8 @@ bool ath9k_hw_nvram_read(struct ath_hw *ah, u32 off, u16 
*data)
 
if (ah->eeprom_blob)
ret = ath9k_hw_nvram_read_firmware(ah->eeprom_blob, off, data);
+   else if (ah->eeprom_data)
+   ret = ath9k_hw_nvram_read_data(ah, off, data);
else if (pdata && !pdata->use_eeprom && pdata->eeprom_data)
ret = ath9k_hw_nvram_read_pdata(pdata, off, data);
else
diff --git a/drivers/net/wireless/ath/ath9k/hw.h 
b/drivers/net/wireless/ath/ath9k/hw.h
index 9cbca12..7f17c2a 100644
--- a/drivers/net/wireless/ath/ath9k/hw.h
+++ b/drivers/net/wireless/ath/ath9k/hw.h
@@ -970,6 +970,8 @@ struct ath_hw {
bool disable_5ghz;
 
const struct firmware *eeprom_blob;
+   void *eeprom_data;
+   size_t eeprom_size;
 
struct ath_dynack dynack;
 
diff --git a/drivers/net/wireless/ath/ath9k/init.c 
b/drivers/net/wireless/ath/ath9k/init.c
index fa4b3cc..054f254 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -511,6 +512,7 @@ static int ath9k_eeprom_request(struct ath_softc *sc, const 
char *name)
 static void ath9k_eeprom_release(struct ath_softc *sc)
 {
release_firmware(sc->sc_ah->eeprom_blob);
+   kfree(sc->sc_ah->eeprom_data);
 }
 
 static int ath9k_init_platform(struct ath_softc *sc)
@@ -654,6 +656,25 @@ static int ath9k_init_softc(u16 devid, struct ath_softc 
*sc,
if (ret)
return ret;
 
+   /* If the EEPROM hasn't been retrieved via firmware request
+* use the nvmem API insted.
+*/
+   if (!ah->eeprom_blob) {
+   struct nvmem_cell *eeprom_cell;
+
+   eeprom_cell = nvmem_cell_get(ah->dev, "eeprom");
+   if (!IS_ERR(eeprom_cell)) {
+   ah->eeprom_data = nvmem_cell_read(
+   eeprom_cell, >eeprom_size);
+   nvmem_cell_put(eeprom_cell);
+
+   if (IS_ERR(ah->eeprom_data)) {
+   dev_err(ah->dev, "failed to read eeprom");
+   return PTR_ERR(ah->eeprom_data);
+   }
+   }
+   }
+
if (ath9k_led_active_high != -1)
ah->config.led_active_high = ath9k_led_active_high == 1;
 
-- 
2.7.4



[PATCH 5/7] ath9k: of: Use the clk API to get the reference clock rate

2017-03-13 Thread Alban
If a clock named "ref" exists use it to get the reference clock rate.

Signed-off-by: Alban <al...@free.fr>
---
 drivers/net/wireless/ath/ath9k/init.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/ath/ath9k/init.c 
b/drivers/net/wireless/ath/ath9k/init.c
index 36b51a5..5cb9c61 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "ath9k.h"
@@ -564,6 +565,7 @@ static int ath9k_of_init(struct ath_softc *sc)
struct ath_hw *ah = sc->sc_ah;
struct ath_common *common = ath9k_hw_common(ah);
enum ath_bus_type bus_type = common->bus_ops->ath_bus_type;
+   struct clk *clk;
const char *mac;
char eeprom_name[100];
int ret;
@@ -573,6 +575,12 @@ static int ath9k_of_init(struct ath_softc *sc)
 
ath_dbg(common, CONFIG, "parsing configuration from OF node\n");
 
+   clk = clk_get(sc->dev, "ref");
+   if (!IS_ERR(clk)) {
+   ah->is_clk_25mhz = (clk_get_rate(clk) == 2500);
+   clk_put(clk);
+   }
+
if (of_property_read_bool(np, "qca,no-eeprom")) {
/* ath9k-eeprom--.bin */
scnprintf(eeprom_name, sizeof(eeprom_name),
-- 
2.7.4



[PATCH 6/7] ath9k: Allow using the reset API for the external reset

2017-03-13 Thread Alban
From: Alban Bedel <al...@free.fr>

Allow using the reset API instead of a platform specific callback to
reset the device.

Signed-off-by: Alban Bedel <al...@free.fr>
---
 drivers/net/wireless/ath/ath9k/hw.c   | 26 +-
 drivers/net/wireless/ath/ath9k/hw.h   |  1 +
 drivers/net/wireless/ath/ath9k/init.c | 10 ++
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/hw.c 
b/drivers/net/wireless/ath/ath9k/hw.c
index 8c5c2dd..efc0435 100644
--- a/drivers/net/wireless/ath/ath9k/hw.c
+++ b/drivers/net/wireless/ath/ath9k/hw.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "hw.h"
@@ -551,6 +552,24 @@ static int ath9k_hw_attach_ops(struct ath_hw *ah)
return 0;
 }
 
+static int ath9k_hw_external_reset(struct ath_hw *ah)
+{
+   int ret = 0;
+
+   ath_dbg(ath9k_hw_common(ah), RESET,
+   "reset MAC via external reset\n");
+
+   if (ah->external_reset) {
+   ret = ah->external_reset();
+   } else if (ah->reset) {
+   ret = reset_control_assert(ah->reset);
+   if (!ret)
+   ret = reset_control_deassert(ah->reset);
+   }
+
+   return ret;
+}
+
 /* Called for all hardware families */
 static int __ath9k_hw_init(struct ath_hw *ah)
 {
@@ -1286,14 +1305,11 @@ static bool ath9k_hw_ar9330_reset_war(struct ath_hw 
*ah, int type)
break;
}
 
-   if (ah->external_reset &&
+   if ((ah->reset || ah->external_reset) &&
(npend || type == ATH9K_RESET_COLD)) {
int reset_err = 0;
 
-   ath_dbg(ath9k_hw_common(ah), RESET,
-   "reset MAC via external reset\n");
-
-   reset_err = ah->external_reset();
+   reset_err = ath9k_hw_external_reset(ah);
if (reset_err) {
ath_err(ath9k_hw_common(ah),
"External reset failed, err=%d\n",
diff --git a/drivers/net/wireless/ath/ath9k/hw.h 
b/drivers/net/wireless/ath/ath9k/hw.h
index 7f17c2a..53b67e3 100644
--- a/drivers/net/wireless/ath/ath9k/hw.h
+++ b/drivers/net/wireless/ath/ath9k/hw.h
@@ -966,6 +966,7 @@ struct ath_hw {
bool is_clk_25mhz;
int (*get_mac_revision)(void);
int (*external_reset)(void);
+   struct reset_control *reset;
bool disable_2ghz;
bool disable_5ghz;
 
diff --git a/drivers/net/wireless/ath/ath9k/init.c 
b/drivers/net/wireless/ath/ath9k/init.c
index 5cb9c61..f1cb806 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "ath9k.h"
@@ -697,6 +698,15 @@ static int ath9k_init_softc(u16 devid, struct ath_softc 
*sc,
if (!is_valid_ether_addr(common->macaddr))
ath9k_get_nvmem_address(sc);
 
+   /* Try to get a reset controller */
+   ah->reset = devm_reset_control_get_optional(sc->dev, NULL);
+   if (IS_ERR(ah->reset)) {
+   if (PTR_ERR(ah->reset) != -ENOENT &&
+   PTR_ERR(ah->reset) != -ENOTSUPP)
+   return PTR_ERR(ah->reset);
+   ah->reset = NULL;
+   }
+
/* If the EEPROM hasn't been retrieved via firmware request
 * use the nvmem API insted.
 */
-- 
2.7.4



[PATCH 4/7] ath9k: Add support for reading the MAC address with nvmem

2017-03-13 Thread Alban
On embedded platforms the MAC address is often stored in flash,
use nvmem to read it if the platform data or DT didn't specify one.

Signed-off-by: Alban <al...@free.fr>
---
 drivers/net/wireless/ath/ath9k/init.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/drivers/net/wireless/ath/ath9k/init.c 
b/drivers/net/wireless/ath/ath9k/init.c
index 054f254..36b51a5 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -594,6 +594,35 @@ static int ath9k_of_init(struct ath_softc *sc)
return 0;
 }
 
+static int ath9k_get_nvmem_address(struct ath_softc *sc)
+{
+   struct ath_common *common = ath9k_hw_common(sc->sc_ah);
+   struct nvmem_cell *cell;
+   size_t cell_size;
+   int err = 0;
+   void *mac;
+
+   cell = nvmem_cell_get(sc->dev, "address");
+   if (IS_ERR(cell))
+   return PTR_ERR(cell);
+
+   mac = nvmem_cell_read(cell, _size);
+   nvmem_cell_put(cell);
+
+   if (IS_ERR(mac))
+   return PTR_ERR(mac);
+
+   if (cell_size == 6) {
+   ether_addr_copy(common->macaddr, mac);
+   } else {
+   dev_err(sc->dev, "nvmem 'address' cell has invalid size\n");
+   err = -EINVAL;
+   }
+
+   kfree(mac);
+   return err;
+}
+
 static int ath9k_init_softc(u16 devid, struct ath_softc *sc,
const struct ath_bus_ops *bus_ops)
 {
@@ -656,6 +685,10 @@ static int ath9k_init_softc(u16 devid, struct ath_softc 
*sc,
if (ret)
return ret;
 
+   /* If no MAC address has been set yet try to use nvmem */
+   if (!is_valid_ether_addr(common->macaddr))
+   ath9k_get_nvmem_address(sc);
+
/* If the EEPROM hasn't been retrieved via firmware request
 * use the nvmem API insted.
 */
-- 
2.7.4



[PATCH 2/7] ath9k: ahb: Add OF support

2017-03-13 Thread Alban
Allow registering ath9k AHB devices defined in DT. This just add the
compatible strings to allow matching the driver and setting the proper
device ID.

Signed-off-by: Alban <al...@free.fr>
---
 drivers/net/wireless/ath/ath9k/ahb.c | 47 +---
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ahb.c 
b/drivers/net/wireless/ath/ath9k/ahb.c
index 2bd982c..36a2645 100644
--- a/drivers/net/wireless/ath/ath9k/ahb.c
+++ b/drivers/net/wireless/ath/ath9k/ahb.c
@@ -18,6 +18,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include "ath9k.h"
 
@@ -49,6 +50,33 @@ static const struct platform_device_id 
ath9k_platform_id_table[] = {
{},
 };
 
+#ifdef CONFIG_OF
+static const struct of_device_id ath_ahb_of_match[] = {
+   {
+   .compatible = "qca,ar9100-wmac",
+   .data = (void *)AR5416_AR9100_DEVID
+   },
+   {
+   .compatible = "qca,ar9330-wmac",
+   .data = (void *)AR9300_DEVID_AR9330
+   },
+   {
+   .compatible = "qca,ar9340-wmac",
+   .data = (void *)AR9300_DEVID_AR9340
+   },
+   {
+   .compatible = "qca,qca9550-wmac",
+   .data = (void *)AR9300_DEVID_QCA955X
+   },
+   {
+   .compatible = "qca,qca9530-wmac",
+   .data = (void *)AR9300_DEVID_AR953X
+   },
+   {},
+};
+MODULE_DEVICE_TABLE(of, ath_ahb_of_match);
+#endif
+
 /* return bus cachesize in 4B word units */
 static void ath_ahb_read_cachesize(struct ath_common *common, int *csz)
 {
@@ -79,10 +107,20 @@ static int ath_ahb_probe(struct platform_device *pdev)
int ret = 0;
struct ath_hw *ah;
char hw_name[64];
+   u16 devid;
 
-   if (!dev_get_platdata(>dev)) {
-   dev_err(>dev, "no platform data specified\n");
-   return -EINVAL;
+   if (id) {
+   devid = id->driver_data;
+   } else {
+   const struct of_device_id *match;
+
+   match = of_match_device(ath_ahb_of_match, >dev);
+   if (!match) {
+   dev_err(>dev, "no device match found\n");
+   return -EINVAL;
+   }
+
+   devid = (u16)(unsigned long)match->data;
}
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -127,7 +165,7 @@ static int ath_ahb_probe(struct platform_device *pdev)
goto err_free_hw;
}
 
-   ret = ath9k_init_device(id->driver_data, sc, _ahb_bus_ops);
+   ret = ath9k_init_device(devid, sc, _ahb_bus_ops);
if (ret) {
dev_err(>dev, "failed to initialize device\n");
goto err_irq;
@@ -167,6 +205,7 @@ static struct platform_driver ath_ahb_driver = {
.remove = ath_ahb_remove,
.driver = {
.name   = "ath9k",
+   .of_match_table = of_match_ptr(ath_ahb_of_match),
},
.id_table= ath9k_platform_id_table,
 };
-- 
2.7.4



[PATCH 1/7] Documentation: dt: net: Update the ath9k binding for SoC devices

2017-03-13 Thread Alban
The current binding only cover PCI devices so extend it for SoC devices.

Most SoC platforms use an MTD partition for the calibration data
instead of an EEPROM. The qca,no-eeprom property was added to allow
loading the EEPROM content using firmware loading. This new binding
replace this hack with NVMEM cells, so we also mark the qca,no-eeprom
property as deprecated in case anyone ever used it.

Signed-off-by: Alban <al...@free.fr>
---
 .../devicetree/bindings/net/wireless/qca,ath9k.txt | 41 --
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt 
b/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt
index b7396c8..61f5f6d 100644
--- a/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt
+++ b/Documentation/devicetree/bindings/net/wireless/qca,ath9k.txt
@@ -27,16 +27,34 @@ Required properties:
- 0034 for AR9462
- 0036 for AR9565
- 0037 for AR9485
+   For SoC devices the compatible should be "qca,-wmac"
+   and one of the following fallbacks:
+   - "qca,ar9100-wmac"
+   - "qca,ar9330-wmac"
+   - "qca,ar9340-wmac"
+   - "qca,qca9550-wmac"
+   - "qca,qca9530-wmac"
 - reg: Address and length of the register set for the device.
 
+Required properties for SoC devices:
+- interrupt-parent: phandle of the parent interrupt controller.
+- interrupts: Interrupt specifier for the controllers interrupt.
+
 Optional properties:
+- mac-address: See ethernet.txt in the parent directory
+- local-mac-address: See ethernet.txt in the parent directory
+- clock-names: has to be "ref"
+- clocks: phandle of the reference clock
+- resets: phandle of the reset line
+- nvmem-cell-names: has to be "eeprom" and/or "address"
+- nvmem-cells: phandle to the eeprom nvmem cell and/or to the mac address
+   nvmem cell.
+
+Deprecated properties:
 - qca,no-eeprom: Indicates that there is no physical EEPROM connected to the
ath9k wireless chip (in this case the calibration /
EEPROM data will be loaded from userspace using the
kernel firmware loader).
-- mac-address: See ethernet.txt in the parent directory
-- local-mac-address: See ethernet.txt in the parent directory
-
 
 In this example, the node is defined as child node of the PCI controller:
  {
@@ -46,3 +64,20 @@ In this example, the node is defined as child node of the 
PCI controller:
qca,no-eeprom;
};
 };
+
+In this example it is defined as a SoC device:
+   wmac@180c {
+   compatible = "qca,ar9132-wmac", "qca,ar9100-wmac";
+   reg = <0x180c 0x3>;
+
+   interrupt-parent = <>;
+   interrupts = <2>;
+
+   clock-names = "ref";
+   clocks = <>;
+
+   nvmem-cell-names = "eeprom", "address";
+   nvmem-cells = <_eeprom>, <_address>;
+
+   resets = < 22>;
+   };
-- 
2.7.4



[PATCH] drivers: net: xgene: Fix crash on DT systems

2017-02-28 Thread Alban Bedel
On DT systems the driver require a clock, but the probe just print a
warning and continue, leading to a crash when resetting the device.
To fix this crash and properly handle probe deferals only ignore the
missing clock if DT isn't used or if the clock doesn't exist.

Signed-off-by: Alban Bedel <alban.be...@avionic-design.de>
---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index d0d0d12b531f..68b48edc5921 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -1756,6 +1756,12 @@ static int xgene_enet_get_resources(struct 
xgene_enet_pdata *pdata)
 
pdata->clk = devm_clk_get(>dev, NULL);
if (IS_ERR(pdata->clk)) {
+   /* Abort if the clock is defined but couldn't be retrived.
+* Always abort if the clock is missing on DT system as
+* the driver can't cope with this case.
+*/
+   if (PTR_ERR(pdata->clk) != -ENOENT || dev->of_node)
+   return PTR_ERR(pdata->clk);
/* Firmware may have set up the clock already. */
dev_info(dev, "clocks have been setup already\n");
}
-- 
2.11.0



[PATCH v2] netfilter: xt_hashlimit: Fix integer divide round to zero.

2017-02-06 Thread Alban Browaeys
Diving the divider by the multiplier before applying to the input.
When this would "divide by zero", divide the multiplier by the divider
first then multiply the input by this value.

Currently user2creds outputs zero when input value is bigger than the
number of slices and  lower than scale.
This as then user input is applied an integer divide operation to
a number greater than itself (scale).
That rounds up to zero, then we mulitply zero by the credits slice size.
  iptables -t filter -I INPUT --protocol tcp --match hashlimit
  --hashlimit 40/second --hashlimit-burst 20 --hashlimit-mode srcip
  --hashlimit-name syn-flood --jump RETURN
thus trigger the overflow detection code:
xt_hashlimit: overflow, try lower: 25000/20
(25000 as hashlimit avd and 20 the burst)
Here:
134217 slices of (HZ * CREDITS_PER_JIFFY) size.
50 is user input value
100 is XT_HASHLIMIT_SCALE_v2
gives: 0 as user2creds output
Setting burst to "1" typically solve the issue ...
but setting it to "40" does too !

This is on 32bit arch calling into revision 2 of hashlimit.

Signed-off-by: Alban Browaeys <alban.browa...@gmail.com>
---
v2:
- fix missing conditional statement in revision 1 case
.

I removed the code duplication between revision 1 and 2.

v1: https://lkml.org/lkml/2017/2/4/173
---
 net/netfilter/xt_hashlimit.c | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 10063408141d..84ad5ab34558 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -463,23 +463,16 @@ static u32 xt_hashlimit_len_to_chunks(u32 len)
 /* Precision saver. */
 static u64 user2credits(u64 user, int revision)
 {
-   if (revision == 1) {
-   /* If multiplying would overflow... */
-   if (user > 0x / (HZ*CREDITS_PER_JIFFY_v1))
-   /* Divide first. */
-   return div64_u64(user, XT_HASHLIMIT_SCALE)
-   * HZ * CREDITS_PER_JIFFY_v1;
-
-   return div64_u64(user * HZ * CREDITS_PER_JIFFY_v1,
-XT_HASHLIMIT_SCALE);
-   } else {
-   if (user > 0xULL / (HZ*CREDITS_PER_JIFFY))
-   return div64_u64(user, XT_HASHLIMIT_SCALE_v2)
-   * HZ * CREDITS_PER_JIFFY;
+   u64 scale = (revision == 1) ?
+   XT_HASHLIMIT_SCALE : XT_HASHLIMIT_SCALE_v2;
+   u64 cpj = (revision == 1) ?
+   CREDITS_PER_JIFFY_v1 : CREDITS_PER_JIFFY;
 
-   return div64_u64(user * HZ * CREDITS_PER_JIFFY,
-XT_HASHLIMIT_SCALE_v2);
-   }
+   /* Avoid overflow: divide the constant operands first */
+   if (scale >= HZ * cpj)
+   return div64_u64(user, div64_u64(scale, HZ * cpj));
+
+   return user * div64_u64(HZ * cpj, scale);
 }
 
 static u32 user2credits_byte(u32 user)
-- 
2.11.0



Re: [PATCH] netfilter: xt_hashlimit: Fix integer divide round to zero.

2017-02-06 Thread Alban Browaeys
Le lundi 06 février 2017 à 14:04 +0100, Pablo Neira Ayuso a écrit :
> On Sat, Feb 04, 2017 at 11:47:31PM +0100, Alban Browaeys wrote:
> > diff --git a/net/netfilter/xt_hashlimit.c
> > b/net/netfilter/xt_hashlimit.c
> > index 10063408141d..df75ad643eef 100644
> > --- a/net/netfilter/xt_hashlimit.c
> > +++ b/net/netfilter/xt_hashlimit.c
> > @@ -463,21 +463,19 @@ static u32 xt_hashlimit_len_to_chunks(u32 len)
> >  /* Precision saver. */
> >  static u64 user2credits(u64 user, int revision)
> >  {
> > > > +   /* Avoid overflow: divide the constant operands first */
> > > >     if (revision == 1) {
> > > > -   /* If multiplying would overflow... */
> > > > -   if (user > 0x / (HZ*CREDITS_PER_JIFFY_v1))
> > > > -   /* Divide first. */
> > > > -   return div64_u64(user, XT_HASHLIMIT_SCALE)
> > > > -   * HZ * CREDITS_PER_JIFFY_v1;
> > > > +   return div64_u64(user, div64_u64(XT_HASHLIMIT_SCALE,
> > > > +   HZ * CREDITS_PER_JIFFY_v1));
> >  
> > > > -   return div64_u64(user * HZ * CREDITS_PER_JIFFY_v1,
> > > > +   return user * div64_u64(HZ * CREDITS_PER_JIFFY_v1,
> >      XT_HASHLIMIT_SCALE);
> 
> I see two return statements here, the one coming later is
> shadowed, this can't be right.

I fixed revision 2 case then copy the code to revision 1
 and lost the condition in the process.
The code duplication is useless. I will rework it in v2.

Thank you for the review.


[PATCH] netfilter: xt_hashlimit: Fix integer divide round to zero.

2017-02-04 Thread Alban Browaeys
Diving the divider by the multiplier before applying to the input.
When this would "divide by zero", divide the multiplier by the divider
first then multiply the input by this value.

Currently user2creds outputs zero when input value is bigger than the
number of slices and  lower than scale.
This as then user input is applied an integer divide operation to
a number greater than itself (scale).
That rounds up to zero, then we mulitply zero by the credits slice size.
  iptables -t filter -I INPUT --protocol tcp --match hashlimit
  --hashlimit 40/second --hashlimit-burst 20 --hashlimit-mode srcip
  --hashlimit-name syn-flood --jump RETURN
thus trigger the overflow detection code:
xt_hashlimit: overflow, try lower: 25000/20
(25000 as hashlimit avd and 20 the burst)
Here:
134217 slices of (HZ * CREDITS_PER_JIFFY) size.
50 is user input value
100 is XT_HASHLIMIT_SCALE_v2
gives: 0 as user2creds output
Setting burst to "1" typically solve the issue ...
but setting it to "40" does too !

This is on 32bit arch calling into revision 2 of hashlimit.

Signed-off-by: Alban Browaeys <alban.browa...@gmail.com>
---
 net/netfilter/xt_hashlimit.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 10063408141d..df75ad643eef 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -463,21 +463,19 @@ static u32 xt_hashlimit_len_to_chunks(u32 len)
 /* Precision saver. */
 static u64 user2credits(u64 user, int revision)
 {
+   /* Avoid overflow: divide the constant operands first */
if (revision == 1) {
-   /* If multiplying would overflow... */
-   if (user > 0x / (HZ*CREDITS_PER_JIFFY_v1))
-   /* Divide first. */
-   return div64_u64(user, XT_HASHLIMIT_SCALE)
-   * HZ * CREDITS_PER_JIFFY_v1;
+   return div64_u64(user, div64_u64(XT_HASHLIMIT_SCALE,
+   HZ * CREDITS_PER_JIFFY_v1));
 
-   return div64_u64(user * HZ * CREDITS_PER_JIFFY_v1,
+   return user * div64_u64(HZ * CREDITS_PER_JIFFY_v1,
 XT_HASHLIMIT_SCALE);
} else {
-   if (user > 0xULL / (HZ*CREDITS_PER_JIFFY))
-   return div64_u64(user, XT_HASHLIMIT_SCALE_v2)
-   * HZ * CREDITS_PER_JIFFY;
+   if (XT_HASHLIMIT_SCALE_v2 >= HZ * CREDITS_PER_JIFFY)
+   return div64_u64(user, div64_u64(XT_HASHLIMIT_SCALE_v2,
+   HZ * CREDITS_PER_JIFFY));
 
-   return div64_u64(user * HZ * CREDITS_PER_JIFFY,
+   return user * div64_u64(HZ * CREDITS_PER_JIFFY,
 XT_HASHLIMIT_SCALE_v2);
}
 }
-- 
2.11.0



[RFC v2 2/2] proc connector: add a "get feature" op

2016-10-15 Thread Alban Crequy
From: Alban Crequy <al...@kinvolk.io>

As more kinds of events are being added in the proc connector, userspace
needs a way to detect whether the kernel supports those new events.

When a kind of event is not supported, userspace should report an error
propertly, or fallback to other methods (regular polling of procfs).

The events fork, exec, uid, gid, sid, ptrace, comm, exit were added
together. Then commit 2b5faa4c ("connector: Added coredumping event to
the process connector") added coredump events but without a way for
userspace to detect if the kernel will emit those. So I am grouping
them all together in PROC_CN_FEATURE_BASIC.

- PROC_CN_FEATURE_BASIC: supports fork, exec, uid, gid, sid, ptrace,
  comm, exit, coredump.

- PROC_CN_FEATURE_NS: supports ns.

Signed-off-by: Alban Crequy <al...@kinvolk.io>
---
 drivers/connector/cn_proc.c  | 25 +++--
 include/uapi/linux/cn_proc.h |  4 
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index c38733d..5f9ace6 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -442,15 +442,12 @@ void proc_ns_connector_send(struct ns_event_prepare 
*prepare, struct task_struct
  * values because it's not being returned via syscall return
  * mechanisms.
  */
-static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
+static void cn_proc_ack(int err, u16 flags, int rcvd_seq, int rcvd_ack)
 {
struct cn_msg *msg;
struct proc_event *ev;
__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 
-   if (atomic_read(_event_num_listeners) < 1)
-   return;
-
msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(>event_data, 0, sizeof(ev->event_data));
@@ -462,7 +459,7 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
memcpy(>id, _proc_event_id, sizeof(msg->id));
msg->ack = rcvd_ack + 1;
msg->len = sizeof(*ev);
-   msg->flags = 0; /* not used */
+   msg->flags = flags;
send_msg(msg);
 }
 
@@ -475,9 +472,12 @@ static void cn_proc_mcast_ctl(struct cn_msg *msg,
 {
enum proc_cn_mcast_op *mc_op = NULL;
int err = 0;
+   u16 flags = 0;
 
-   if (msg->len != sizeof(*mc_op))
-   return;
+   if (msg->len != sizeof(*mc_op)) {
+   err = EINVAL;
+   goto out;
+   }
 
/* 
 * Events are reported with respect to the initial pid
@@ -485,8 +485,10 @@ static void cn_proc_mcast_ctl(struct cn_msg *msg,
 * other namespaces.
 */
if ((current_user_ns() != _user_ns) ||
-   (task_active_pid_ns(current) != _pid_ns))
-   return;
+   (task_active_pid_ns(current) != _pid_ns)) {
+   err = EPERM;
+   goto out;
+   }
 
/* Can only change if privileged. */
if (!__netlink_ns_capable(nsp, _user_ns, CAP_NET_ADMIN)) {
@@ -496,6 +498,9 @@ static void cn_proc_mcast_ctl(struct cn_msg *msg,
 
mc_op = (enum proc_cn_mcast_op *)msg->data;
switch (*mc_op) {
+   case PROC_CN_GET_FEATURES:
+   flags = PROC_CN_FEATURE_BASIC | PROC_CN_FEATURE_NS;
+   break;
case PROC_CN_MCAST_LISTEN:
atomic_inc(_event_num_listeners);
break;
@@ -508,7 +513,7 @@ static void cn_proc_mcast_ctl(struct cn_msg *msg,
}
 
 out:
-   cn_proc_ack(err, msg->seq, msg->ack);
+   cn_proc_ack(err, flags, msg->seq, msg->ack);
 }
 
 /*
diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
index 3270e8c..2ea0e5d 100644
--- a/include/uapi/linux/cn_proc.h
+++ b/include/uapi/linux/cn_proc.h
@@ -25,10 +25,14 @@
  * for events on the connector.
  */
 enum proc_cn_mcast_op {
+   PROC_CN_GET_FEATURES = 0,
PROC_CN_MCAST_LISTEN = 1,
PROC_CN_MCAST_IGNORE = 2
 };
 
+#define PROC_CN_FEATURE_BASIC 0x0001
+#define PROC_CN_FEATURE_NS0x0002
+
 /*
  * From the user's point of view, the process
  * ID is the thread group ID and thread ID is the internal
-- 
2.7.4



[RFC v2 1/2] proc connector: add namespace events

2016-10-15 Thread Alban Crequy
From: Alban Crequy <al...@kinvolk.io>

The act of a process creating or joining a namespace via clone(),
unshare() or setns() is a useful signal for monitoring applications.

I am working on a monitoring application that keeps track of all the
containers and all processes inside each container. The current way of
doing it is by polling regularly in /proc for the list of processes and
in /proc/*/ns/* to know which namespaces they belong to. This is
inefficient on systems with a large number of containers and a large
number of processes.

Instead, I would inspect /proc only one time and get the updates with
the proc connector. Unfortunately, the proc connector gives me the list
of processes but does not notify me when a process changes namespaces.
So I would still need to inspect /proc/*/ns/*.

This patch adds namespace events for processes. It generates a namespace
event each time a process changes namespace via clone(), unshare() or
setns().

For example, the following command:
| # unshare -n -i -f ls -l /proc/self/ns/
| total 0
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 cgroup -> 'cgroup:[4026531835]'
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 ipc -> 'ipc:[4026532208]'
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 mnt -> 'mnt:[4026531840]'
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 net -> 'net:[4026532210]'
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 pid -> 'pid:[4026531836]'
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 user -> 'user:[4026531837]'
| lrwxrwxrwx 1 root root 0 Sep 25 22:31 uts -> 'uts:[4026531838]'

causes the proc connector to generate the following events:
| fork: ppid=691 pid=808
| exec: pid=808
| ns: pid=808 reason=unshare count=2
| type=ipc  4026531839 -> 4026532208
| type=net  4026531957 -> 4026532210
| fork: ppid=808 pid=809
| exec: pid=809
| exit: pid=809
| exit: pid=808

Signed-off-by: Alban Crequy <al...@kinvolk.io>
---
 drivers/connector/cn_proc.c  | 138 +++
 include/linux/cn_proc.h  |  25 
 include/uapi/linux/cn_proc.h |  23 +++-
 kernel/fork.c|  10 
 kernel/nsproxy.c |   6 ++
 5 files changed, 201 insertions(+), 1 deletion(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index a782ce8..c38733d 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -30,8 +30,13 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
 
 #include 
+#include 
 
 /*
  * Size of a cn_msg followed by a proc_event structure.  Since the
@@ -296,6 +301,139 @@ void proc_exit_connector(struct task_struct *task)
send_msg(msg);
 }
 
+void proc_ns_connector_prepare(struct ns_event_prepare *prepare, u16 reason)
+{
+   struct nsproxy *ns = current->nsproxy;
+   struct ns_common *mntns;
+
+   prepare->num_listeners = atomic_read(_event_num_listeners);
+
+   if (prepare->num_listeners < 1)
+   return;
+
+   prepare->reason = reason;
+
+   prepare->user_inum = current->cred->user_ns->ns.inum;
+   prepare->uts_inum = ns->uts_ns->ns.inum;
+   prepare->ipc_inum = ns->ipc_ns->ns.inum;
+
+   mntns = mntns_operations.get(current);
+   if (mntns) {
+   prepare->mnt_inum = mntns->inum;
+   mntns_operations.put(mntns);
+   } else
+   prepare->mnt_inum = 0;
+
+   prepare->pid_inum = ns->pid_ns_for_children->ns.inum;
+   prepare->net_inum = ns->net_ns->ns.inum;
+   prepare->cgroup_inum = ns->cgroup_ns->ns.inum;
+}
+
+void proc_ns_connector_send(struct ns_event_prepare *prepare, struct 
task_struct *task)
+{
+   struct nsproxy *ns = task->nsproxy;
+   struct ns_common *mntns;
+   struct cn_msg *msg;
+   struct proc_event *ev;
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+   int count;
+
+   if (prepare->num_listeners < 1)
+   return;
+
+   if (atomic_read(_event_num_listeners) < 1)
+   return;
+
+   msg = buffer_to_cn_msg(buffer);
+   ev = (struct proc_event *)msg->data;
+   memset(>event_data, 0, sizeof(ev->event_data));
+   ev->timestamp_ns = ktime_get_ns();
+   ev->what = PROC_EVENT_NS;
+
+   ev->event_data.ns.process_pid  = task->pid;
+   ev->event_data.ns.process_tgid = task->tgid;
+   ev->event_data.ns.reason = prepare->reason;
+   count = 0;
+
+   /* user */
+   if (prepare->user_inum != task->cred->user_ns->ns.inum) {
+   ev->event_data.ns.items[count].type = CLONE_NEWUSER;
+   ev->event_data.ns.items[count].flags = 0;
+   ev->event_data.ns.items[count].old_inum = prepare->user_inum;
+   ev->event_data.ns.items[count].inum = 
task->cred->user_ns->ns.inum;
+   count++;
+   }
+
+   /* uts */

[RFC v2 0/2] proc connector: get namespace events

2016-10-15 Thread Alban Crequy
This is v2 of the patch set to add namespace events in the proc connector.

The act of a process creating or joining a namespace via clone(),
unshare() or setns() is a useful signal for monitoring applications.

I am working on a monitoring application that keeps track of all the
containers and all processes inside each container. The current way of
doing it is by polling regularly in /proc for the list of processes and
in /proc/*/ns/* to know which namespaces they belong to. This is
inefficient on systems with a large number of containers and a large
number of processes.

Instead, I would inspect /proc only one time and get the updates with
the proc connector. Unfortunately, the proc connector gives me the list
of processes but does not notify me when a process changes namespaces.
So I would still need to inspect /proc/*/ns/*.

 (1) Add namespace events for processes. It generates a namespace event each
 time a process changes namespace via clone(), unshare() or setns().

 (2) Add a way for userspace to detect if proc connector is able to send
 namespace events.


Changes since RFC-v1: https://lkml.org/lkml/2016/9/8/588

* Supports userns.

* The reason field says exactly whether it is clone/setns/unshare.

* Sends aggregated messages containing details of several namespaces
  changes. Suggested by Evgeniy Polyakov.

* Add patch 2 to detect if proc connector is able to send namespace events.


This patch set is available in the git repository at:

  https://github.com/kinvolk/linux.git alban/proc_ns_connector-v2-5


Alban Crequy (2):
  proc connector: add namespace events
  proc connector: add a "get feature" op

 drivers/connector/cn_proc.c  | 163 ---
 include/linux/cn_proc.h  |  25 +++
 include/uapi/linux/cn_proc.h |  27 ++-
 kernel/fork.c|  10 +++
 kernel/nsproxy.c |   6 ++
 5 files changed, 220 insertions(+), 11 deletions(-)

-- 
2.7.4



Re: [PATCH] [RFC] proc connector: add namespace events

2016-09-13 Thread Alban Crequy
On 12 September 2016 at 23:39, Evgeniy Polyakov <z...@ioremap.net> wrote:
> Hi everyone
>
> 08.09.2016, 18:39, "Alban Crequy" <alban.cre...@gmail.com>:
>> The act of a process creating or joining a namespace via clone(),
>> unshare() or setns() is a useful signal for monitoring applications.
>
>> + if (old_ns->mnt_ns != new_ns->mnt_ns)
>> + proc_ns_connector(tsk, CLONE_NEWNS, PROC_NM_REASON_CLONE, old_mntns_inum, 
>> new_mntns_inum);
>> +
>> + if (old_ns->uts_ns != new_ns->uts_ns)
>> + proc_ns_connector(tsk, CLONE_NEWUTS, PROC_NM_REASON_CLONE, 
>> old_ns->uts_ns->ns.inum, new_ns->uts_ns->ns.inum);
>> +
>> + if (old_ns->ipc_ns != new_ns->ipc_ns)
>> + proc_ns_connector(tsk, CLONE_NEWIPC, PROC_NM_REASON_CLONE, 
>> old_ns->ipc_ns->ns.inum, new_ns->ipc_ns->ns.inum);
>> +
>> + if (old_ns->net_ns != new_ns->net_ns)
>> + proc_ns_connector(tsk, CLONE_NEWNET, PROC_NM_REASON_CLONE, 
>> old_ns->net_ns->ns.inum, new_ns->net_ns->ns.inum);
>> +
>> + if (old_ns->cgroup_ns != new_ns->cgroup_ns)
>> + proc_ns_connector(tsk, CLONE_NEWCGROUP, PROC_NM_REASON_CLONE, 
>> old_ns->cgroup_ns->ns.inum, new_ns->cgroup_ns->ns.inum);
>> +
>> + if (old_ns->pid_ns_for_children != new_ns->pid_ns_for_children)
>> + proc_ns_connector(tsk, CLONE_NEWPID, PROC_NM_REASON_CLONE, 
>> old_ns->pid_ns_for_children->ns.inum, new_ns->pid_ns_for_children->ns.inum);
>> + }
>> +
>
> Patch looks good to me from technical/connector point of view, but these even 
> multiplication is a bit weird imho.
>
> I'm not against it, but did you consider sending just 2 serialized ns 
> structures via single message, and client
> would check all ns bits himself?

I have not considered it, thanks for the suggestion. Should we offer
the guarantee to userspace that it will always be send in one single
message? If we want to give the information about the userns change
too, it will be a bit more complicated to write the patch because it
is not done in the same function.

Note that I will probably not have the chance to spend more time on
this patch soon because Iago will explore other methods with
eBPF+kprobes instead. eBPF+kprobes would not have the same API
stability though. I was curious to see if anyone would find the
namespace addition to proc connector interesting for other projects.


[PATCH] [RFC] proc connector: add namespace events

2016-09-08 Thread Alban Crequy
From: Alban Crequy <al...@kinvolk.io>

The act of a process creating or joining a namespace via clone(),
unshare() or setns() is a useful signal for monitoring applications.

I am working on a monitoring application that keeps track of all the
containers and all processes inside each container. The current way of
doing it is by polling regularly in /proc for the list of processes and
in /proc/*/ns/* to know which namespaces they belong to. This is
inefficient on systems with a large number of containers and a large
number of processes.

Instead, I would inspect /proc only one time and get the updates with
the proc connector. Unfortunately, the proc connector gives me the list
of processes but does not notify me when a process changes namespaces.
So I would still need to inspect /proc/*/ns/*.

This patch add namespace events for processes. It generates a namespace
event each time a process changes namespace via clone(), unshare() or
setns().

For example, the following command:
| # unshare -n -f ls -l /proc/self/ns/net
| lrwxrwxrwx 1 root root 0 Sep  6 05:35 /proc/self/ns/net -> 'net:[4026532142]'

causes the proc connector to generate the following events:
| fork: ppid=696 pid=858
| exec: pid=858
| ns: pid=858 type=net reason=set old_inum=4026531957 inum=4026532142
| fork: ppid=858 pid=859
| exec: pid=859
| exit: pid=859
| exit: pid=858

Note: this patch is just a RFC, we are exploring other ways to achieve
  the same feature.

The current implementation has the following limitations:

- Ideally, I want to know whether the event is cause by clone(),
  unshare() or setns(). At the moment, the reason field only
  distinguishes between clone() and non-clone.

- The event for pid namespaces is generated when pid_ns_for_children
  changes. I think that's ok, and it just needs to be documented for
  userspace in the same way it is already documented in
  pid_namespaces(7). Userspace really needs to know whether the event is
  caused by clone() or non-clone to interpret the event correctly.

- Events for userns are not implemented yet. I skipped it for now
  because user namespaces are not managed with nsproxy as other namespaces.

- The mnt namespace struct is more private than other so the code is a
  bit different for this. I don't know if there is a better way to do
  this.

- Userspace needs a way to know whether namespace events are implemented
  in the proc connector. If not implemented, userspaces needs to
  fallback to polling changes in /proc/*/ns/*. I am not sure whether to
  add a Netlink message to query the kernel if the feature is implemented
  or otherwise.

- There is no granularity when subscribing for proc connector events. I
  figured it might not be a problem since namespace events are more rare
  than other fork/exec events. It will probably not flood existing users
  of the proc connector.

Signed-off-by: Alban Crequy <al...@kinvolk.io>
---
 drivers/connector/cn_proc.c  | 28 +
 include/linux/cn_proc.h  |  4 +++
 include/uapi/linux/cn_proc.h | 16 +-
 kernel/nsproxy.c | 71 
 4 files changed, 118 insertions(+), 1 deletion(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index a782ce8..69e6815 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -246,6 +246,34 @@ void proc_comm_connector(struct task_struct *task)
send_msg(msg);
 }
 
+void proc_ns_connector(struct task_struct *task, int type, int reason, u64 
old_inum, u64 inum)
+{
+   struct cn_msg *msg;
+   struct proc_event *ev;
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
+
+   if (atomic_read(_event_num_listeners) < 1)
+   return;
+
+   msg = buffer_to_cn_msg(buffer);
+   ev = (struct proc_event *)msg->data;
+   memset(>event_data, 0, sizeof(ev->event_data));
+   ev->timestamp_ns = ktime_get_ns();
+   ev->what = PROC_EVENT_NM;
+   ev->event_data.nm.process_pid  = task->pid;
+   ev->event_data.nm.process_tgid = task->tgid;
+   ev->event_data.nm.type = type;
+   ev->event_data.nm.reason = reason;
+   ev->event_data.nm.old_inum = old_inum;
+   ev->event_data.nm.inum = inum;
+
+   memcpy(>id, _proc_event_id, sizeof(msg->id));
+   msg->ack = 0; /* not used */
+   msg->len = sizeof(*ev);
+   msg->flags = 0; /* not used */
+   send_msg(msg);
+}
+
 void proc_coredump_connector(struct task_struct *task)
 {
struct cn_msg *msg;
diff --git a/include/linux/cn_proc.h b/include/linux/cn_proc.h
index 1d5b02a..2e6915e 100644
--- a/include/linux/cn_proc.h
+++ b/include/linux/cn_proc.h
@@ -26,6 +26,7 @@ void proc_id_connector(struct task_struct *task, int 
which_id);
 void proc_sid_connector(struct task_struct *task);
 void proc_ptrace_connector(struct task_struct *task, int which_id);
 void proc_comm_connector(struct task_struct *task);
+vo

Re: [PATCH] usbnet: ax88179_178a: Add support for writing the EEPROM

2016-08-25 Thread Alban Bedel
On Thu, 25 Aug 2016 11:16:36 +0200
Oliver Neukum <oneu...@suse.com> wrote:

> On Wed, 2016-08-24 at 16:40 +0200, Alban Bedel wrote:
> > On Wed, 24 Aug 2016 16:30:39 +0200
> > Oliver Neukum <oneu...@suse.com> wrote:
> >   
> > > On Wed, 2016-08-24 at 15:52 +0200, Alban Bedel wrote:  
> 
> > > > +   if (block != data)
> > > > +   kfree(block);
> > > 
> > > And if block == dta, what frees the memory?  
> > 
> > In this case this function didn't allocate any memory, so there is
> > nothing to free.  
> 
> Hi,
> 
> I see. kfree() has a check for NULL, so you could drop the
> test, but it doesn't matter much either way.

I think you misunderstand something here. data is the buffer passed
by the caller and block is a local variable. There is two cases:

1) The data to write is block aligned, then we use the caller buffer
   as is and set block = data.
2) The requested data is not block aligned, then we kalloc block.

In both case the writing loop then use the block pointer. Afterwards
we only need to kfree block in case 2, that is when block != data.

Alban


pgp0R87lnmIOn.pgp
Description: OpenPGP digital signature


Re: [PATCH] usbnet: ax88179_178a: Add support for writing the EEPROM

2016-08-24 Thread Alban Bedel
On Wed, 24 Aug 2016 16:30:39 +0200
Oliver Neukum <oneu...@suse.com> wrote:

> On Wed, 2016-08-24 at 15:52 +0200, Alban Bedel wrote:
> > Implement the .set_eeprom callback to allow setting the MAC address
> > as well as a few other parameters. Note that the EEPROM must have a
> > correct PID/VID checksum set otherwise the SROM is used and reads
> > return the SROM content.
> > 
> > Signed-off-by: Alban Bedel <alban.be...@avionic-design.de>
> > ---
> >  drivers/net/usb/ax88179_178a.c | 57
> > ++
> >  1 file changed, 57 insertions(+)
> > 
> > diff --git a/drivers/net/usb/ax88179_178a.c
> > b/drivers/net/usb/ax88179_178a.c
> > index e6338c16081a..e6a986303dad 100644
> > --- a/drivers/net/usb/ax88179_178a.c
> > +++ b/drivers/net/usb/ax88179_178a.c
> > @@ -28,6 +28,7 @@
> >  
> >  #define AX88179_PHY_ID 0x03
> >  #define AX_EEPROM_LEN  0x100
> > +#define AX_EEPROM_BLOCK0x40u
> >  #define AX88179_EEPROM_MAGIC   0x17900b95
> >  #define AX_MCAST_FLTSIZE   8
> >  #define AX_MAX_MCAST   64
> > @@ -43,6 +44,7 @@
> >  #define AX_ACCESS_PHY  0x02
> >  #define AX_ACCESS_EEPROM   0x04
> >  #define AX_ACCESS_EFUS 0x05
> > +#define AX_RELOAD_EEPROM   0x06
> >  #define AX_PAUSE_WATERLVL_HIGH 0x54
> >  #define AX_PAUSE_WATERLVL_LOW  0x55
> >  
> > @@ -620,6 +622,60 @@ ax88179_get_eeprom(struct net_device *net, struct
> > ethtool_eeprom *eeprom,
> > return 0;
> >  }
> >  
> > +static int
> > +ax88179_set_eeprom(struct net_device *net, struct ethtool_eeprom
> > *eeprom,
> > +  u8 *data)
> > +{
> > +   struct usbnet *dev = netdev_priv(net);
> > +   unsigned int offset = eeprom->offset;
> > +   unsigned int len = eeprom->len;
> > +   int i, err = 0;
> > +   u8 *block;
> > +
> > +   /* The EEPROM data must be aligned on blocks of 64 bytes */
> > +   if ((offset % AX_EEPROM_BLOCK) || (len % AX_EEPROM_BLOCK)) {
> > +   offset = eeprom->offset / AX_EEPROM_BLOCK *
> > AX_EEPROM_BLOCK;
> > +   len = eeprom->len + eeprom->offset - offset;
> > +   len = DIV_ROUND_UP(len, AX_EEPROM_BLOCK) *
> > AX_EEPROM_BLOCK;
> > +
> > +   block = kmalloc(len, GFP_KERNEL);
> > +   if (!block)
> > +   return -ENOMEM;
> > +
> > +   /* Copy the current data, we could skip some but KISS
> > */
> > +   for (i = 0; i < len; i += AX_EEPROM_BLOCK) {
> > +   err = __ax88179_read_cmd(dev,
> > AX_ACCESS_EEPROM,
> > +(offset + i) >> 1,
> > +AX_EEPROM_BLOCK >> 1,
> > +AX_EEPROM_BLOCK,
> > +[i], 0);
> > +   if (err < 0) {
> > +   kfree(block);
> > +   return err;
> > +   }
> > +   }
> > +   memcpy(block + eeprom->offset - offset, data,
> > eeprom->len);
> > +   } else {
> > +   block = data;
> > +   }
> > +
> > +   for (i = 0; err >= 0 && i < len; i += AX_EEPROM_BLOCK) {
> > +   err = ax88179_write_cmd(dev, AX_ACCESS_EEPROM,
> > +   (offset + i) >> 1,
> > +   AX_EEPROM_BLOCK >> 1,
> > +   AX_EEPROM_BLOCK, [i]);
> > +   }
> > +
> > +   if (block != data)
> > +   kfree(block);  
> 
> And if block == dta, what frees the memory?

In this case this function didn't allocate any memory, so there is
nothing to free.

Alban 



pgpq3XFV3hiBn.pgp
Description: OpenPGP digital signature


[PATCH] usbnet: ax88179_178a: Add support for writing the EEPROM

2016-08-24 Thread Alban Bedel
Implement the .set_eeprom callback to allow setting the MAC address
as well as a few other parameters. Note that the EEPROM must have a
correct PID/VID checksum set otherwise the SROM is used and reads
return the SROM content.

Signed-off-by: Alban Bedel <alban.be...@avionic-design.de>
---
 drivers/net/usb/ax88179_178a.c | 57 ++
 1 file changed, 57 insertions(+)

diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index e6338c16081a..e6a986303dad 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -28,6 +28,7 @@
 
 #define AX88179_PHY_ID 0x03
 #define AX_EEPROM_LEN  0x100
+#define AX_EEPROM_BLOCK0x40u
 #define AX88179_EEPROM_MAGIC   0x17900b95
 #define AX_MCAST_FLTSIZE   8
 #define AX_MAX_MCAST   64
@@ -43,6 +44,7 @@
 #define AX_ACCESS_PHY  0x02
 #define AX_ACCESS_EEPROM   0x04
 #define AX_ACCESS_EFUS 0x05
+#define AX_RELOAD_EEPROM   0x06
 #define AX_PAUSE_WATERLVL_HIGH 0x54
 #define AX_PAUSE_WATERLVL_LOW  0x55
 
@@ -620,6 +622,60 @@ ax88179_get_eeprom(struct net_device *net, struct 
ethtool_eeprom *eeprom,
return 0;
 }
 
+static int
+ax88179_set_eeprom(struct net_device *net, struct ethtool_eeprom *eeprom,
+  u8 *data)
+{
+   struct usbnet *dev = netdev_priv(net);
+   unsigned int offset = eeprom->offset;
+   unsigned int len = eeprom->len;
+   int i, err = 0;
+   u8 *block;
+
+   /* The EEPROM data must be aligned on blocks of 64 bytes */
+   if ((offset % AX_EEPROM_BLOCK) || (len % AX_EEPROM_BLOCK)) {
+   offset = eeprom->offset / AX_EEPROM_BLOCK * AX_EEPROM_BLOCK;
+   len = eeprom->len + eeprom->offset - offset;
+   len = DIV_ROUND_UP(len, AX_EEPROM_BLOCK) * AX_EEPROM_BLOCK;
+
+   block = kmalloc(len, GFP_KERNEL);
+   if (!block)
+   return -ENOMEM;
+
+   /* Copy the current data, we could skip some but KISS */
+   for (i = 0; i < len; i += AX_EEPROM_BLOCK) {
+   err = __ax88179_read_cmd(dev, AX_ACCESS_EEPROM,
+(offset + i) >> 1,
+AX_EEPROM_BLOCK >> 1,
+AX_EEPROM_BLOCK,
+[i], 0);
+   if (err < 0) {
+   kfree(block);
+   return err;
+   }
+   }
+   memcpy(block + eeprom->offset - offset, data, eeprom->len);
+   } else {
+   block = data;
+   }
+
+   for (i = 0; err >= 0 && i < len; i += AX_EEPROM_BLOCK) {
+   err = ax88179_write_cmd(dev, AX_ACCESS_EEPROM,
+   (offset + i) >> 1,
+   AX_EEPROM_BLOCK >> 1,
+   AX_EEPROM_BLOCK, [i]);
+   }
+
+   if (block != data)
+   kfree(block);
+
+   /* Reload the EEPROM */
+   if (err >= 0)
+   err = ax88179_write_cmd(dev, AX_RELOAD_EEPROM, 0, 0, 0, NULL);
+
+   return err < 0 ? err : 0;
+}
+
 static int ax88179_get_settings(struct net_device *net, struct ethtool_cmd 
*cmd)
 {
struct usbnet *dev = netdev_priv(net);
@@ -826,6 +882,7 @@ static const struct ethtool_ops ax88179_ethtool_ops = {
.set_wol= ax88179_set_wol,
.get_eeprom_len = ax88179_get_eeprom_len,
.get_eeprom = ax88179_get_eeprom,
+   .set_eeprom = ax88179_set_eeprom,
.get_settings   = ax88179_get_settings,
.set_settings   = ax88179_set_settings,
.get_eee= ax88179_get_eee,
-- 
2.9.3



Re: [PATCH 8/8] netfilter: implement xt_cgroup cgroup2 path match

2016-02-11 Thread Alban Crequy
Hi,

On 7 December 2015 at 23:38, Tejun Heo <t...@kernel.org> wrote:
> This patch implements xt_cgroup path match which matches cgroup2
> membership of the associated socket.  The match is recursive and
> invertible.

Is there any plans to implement a similar cgroup2 path match in a
cgroup classifier in tc?
I wonder if it will be possible to use cgroup to classify packets in
tc without net_cls.class_id in cgroup2.

Thanks!
Alban


[PATCH v3] MIPS: Remove all the uses of custom gpio.h

2015-08-04 Thread Alban Bedel
Currently CONFIG_ARCH_HAVE_CUSTOM_GPIO_H is defined for all MIPS
machines, and each machine type provides its own gpio.h. However
only a handful really implement the GPIO API, most just forward
everythings to gpiolib.

The Alchemy machine is notable as it provides a system to allow
implementing the GPIO API at the board level. But it is not used by
any board currently supported, so it can also be removed.

For most machine types we can just remove the custom gpio.h, as well
as the custom wrappers if some exists. Some of the code found in
the wrappers must be moved to the respective GPIO driver.

A few more fixes are need in some drivers as they rely on linux/gpio.h
to provides some machine specific definitions, or used asm/gpio.h
instead of linux/gpio.h for the gpio API.

Signed-off-by: Alban Bedel al...@free.fr
---

This patch is based on my previous serie:
MIPS: ath79: Move the GPIO driver to drivers/gpio.

It supercede my previous patch named:
MIPS: Remove most of the custom gpio.h

Changelog:
v3: * Add the missing includes to fix the build on jz4740 machines
v2: * Don't remove AR7_GPIO_MAX and TITAN_GPIO_MAX, that's for another
  patch
v1: * Fixed gpio_to_irq on jz4740 and rb532
* Cleaned up alchemy as well
* Removed asm/gpio.h

For testing I tried to build all mips defconfig, however my toolchain
couldn't handle a few configs: ip28 malta_qemu_32r6 maltasmvp_eva
sead3micro. If somebody can test these that would be more than welcome.

Now a few stats about the state of CONFIG_ARCH_HAVE_CUSTOM_GPIO_H
after appling this patch. Of the 31 supportd arch, 15 still have
asm/gpio.h, of these 9 are just a #warning Include linux/gpio.h
instead of asm/gpio.h. So we have 6 arch left: arm, avr32, blackfin,
m68k, sh and unicore32. But only m68k and unicore32 really provides
custom wrappers, all the others only forward to gpiolib.

On the drivers side we only have 13 occurences of '#include
asm/gpio.h' left, mostly in drivers used on ARM SoC.

So the work left to phase out the legacy GPIO is really not that much
anymore.

Alban
---
 arch/mips/Kconfig   |   1 -
 arch/mips/alchemy/Kconfig   |   7 --
 arch/mips/alchemy/board-gpr.c   |   1 +
 arch/mips/alchemy/board-mtx1.c  |   1 +
 arch/mips/alchemy/common/Makefile   |   7 +-
 arch/mips/alchemy/devboards/db1000.c|   1 +
 arch/mips/alchemy/devboards/db1300.c|   1 +
 arch/mips/alchemy/devboards/db1550.c|   1 +
 arch/mips/alchemy/devboards/pm.c|   2 +-
 arch/mips/ar7/gpio.c|   5 +-
 arch/mips/ar7/platform.c|   1 -
 arch/mips/ar7/setup.c   |   1 -
 arch/mips/include/asm/gpio.h|   6 -
 arch/mips/include/asm/mach-ar7/ar7.h|   4 +
 arch/mips/include/asm/mach-ar7/gpio.h   |  41 ---
 arch/mips/include/asm/mach-ath25/gpio.h |  16 ---
 arch/mips/include/asm/mach-ath79/gpio.h |  26 -
 arch/mips/include/asm/mach-au1x00/gpio-au1000.h | 148 ++--
 arch/mips/include/asm/mach-au1x00/gpio.h|  86 --
 arch/mips/include/asm/mach-bcm47xx/gpio.h   |  17 ---
 arch/mips/include/asm/mach-bcm63xx/gpio.h   |  15 ---
 arch/mips/include/asm/mach-cavium-octeon/gpio.h |  21 
 arch/mips/include/asm/mach-generic/gpio.h   |  21 
 arch/mips/include/asm/mach-jz4740/gpio.h|   2 -
 arch/mips/include/asm/mach-lantiq/gpio.h|  13 ---
 arch/mips/include/asm/mach-loongson64/gpio.h|  36 --
 arch/mips/include/asm/mach-pistachio/gpio.h |  21 
 arch/mips/include/asm/mach-rc32434/gpio.h   |  12 --
 arch/mips/jz4740/board-qi_lb60.c|   1 +
 arch/mips/jz4740/gpio.c |  21 ++--
 arch/mips/pci/pci-lantiq.c  |   1 -
 arch/mips/rb532/devices.c   |   1 +
 arch/mips/rb532/gpio.c  |   6 +
 arch/mips/txx9/generic/setup.c  |  16 ---
 drivers/ata/pata_rb532_cf.c |   3 +-
 drivers/gpio/gpio-ath79.c   |  32 -
 drivers/input/misc/rb532_button.c   |   1 +
 drivers/net/ethernet/ti/cpmac.c |   2 +
 38 files changed, 47 insertions(+), 551 deletions(-)
 delete mode 100644 arch/mips/include/asm/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ar7/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath25/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath79/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-au1x00/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm47xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm63xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-cavium-octeon/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-generic/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-lantiq/gpio.h
 delete mode 100644 arch

[PATCH v2] MIPS: Remove all the uses of custom gpio.h

2015-08-02 Thread Alban Bedel
Currently CONFIG_ARCH_HAVE_CUSTOM_GPIO_H is defined for all MIPS
machines, and each machine type provides its own gpio.h. However
only a handful really implement the GPIO API, most just forward
everythings to gpiolib.

The Alchemy machine is notable as it provides a system to allow
implementing the GPIO API at the board level. But it is not used by
any board currently supported, so it can also be removed.

For most machine types we can just remove the custom gpio.h, as well
as the custom wrappers if some exists. Some of the code found in
the wrappers must be moved to the respective GPIO driver.

A few more fixes are need in some drivers as they rely on linux/gpio.h
to provides some machine specific definitions, or used asm/gpio.h
instead of linux/gpio.h for the gpio API.

Signed-off-by: Alban Bedel al...@free.fr
---

This patch is based on my previous serie:
MIPS: ath79: Move the GPIO driver to drivers/gpio.

It supercede my previous patch named:
MIPS: Remove most of the custom gpio.h

Changelog:
v2: * Don't remove AR7_GPIO_MAX and TITAN_GPIO_MAX, that's for another
  patch
v1: * Fixed gpio_to_irq on jz4740 and rb532
* Cleaned up alchemy as well
* Removed asm/gpio.h

For testing I tried to build all mips defconfig, however my toolchain
couldn't handle a few configs: ip28 malta_qemu_32r6 maltasmvp_eva
sead3micro. If somebody can test these that would be more than welcome.

Now a few stats about the state of CONFIG_ARCH_HAVE_CUSTOM_GPIO_H
after appling this patch. Of the 31 supportd arch, 15 still have
asm/gpio.h, of these 9 are just a #warning Include linux/gpio.h
instead of asm/gpio.h. So we have 6 arch left: arm, avr32, blackfin,
m68k, sh and unicore32. But only m68k and unicore32 really provides
custom wrappers, all the others only forward to gpiolib.

On the drivers side we only have 13 occurences of '#include
asm/gpio.h' left, mostly in drivers used on ARM SoC.

So the work left to phase out the legacy GPIO is really not that much
anymore.

Alban
---
 arch/mips/Kconfig   |   1 -
 arch/mips/alchemy/Kconfig   |   7 --
 arch/mips/alchemy/board-gpr.c   |   1 +
 arch/mips/alchemy/board-mtx1.c  |   1 +
 arch/mips/alchemy/common/Makefile   |   7 +-
 arch/mips/alchemy/devboards/db1000.c|   1 +
 arch/mips/alchemy/devboards/db1300.c|   1 +
 arch/mips/alchemy/devboards/db1550.c|   1 +
 arch/mips/alchemy/devboards/pm.c|   2 +-
 arch/mips/ar7/gpio.c|   5 +-
 arch/mips/ar7/platform.c|   1 -
 arch/mips/ar7/setup.c   |   1 -
 arch/mips/include/asm/gpio.h|   6 -
 arch/mips/include/asm/mach-ar7/ar7.h|   4 +
 arch/mips/include/asm/mach-ar7/gpio.h   |  41 ---
 arch/mips/include/asm/mach-ath25/gpio.h |  16 ---
 arch/mips/include/asm/mach-ath79/gpio.h |  26 -
 arch/mips/include/asm/mach-au1x00/gpio-au1000.h | 148 ++--
 arch/mips/include/asm/mach-au1x00/gpio.h|  86 --
 arch/mips/include/asm/mach-bcm47xx/gpio.h   |  17 ---
 arch/mips/include/asm/mach-bcm63xx/gpio.h   |  15 ---
 arch/mips/include/asm/mach-cavium-octeon/gpio.h |  21 
 arch/mips/include/asm/mach-generic/gpio.h   |  21 
 arch/mips/include/asm/mach-jz4740/gpio.h|   2 -
 arch/mips/include/asm/mach-lantiq/gpio.h|  13 ---
 arch/mips/include/asm/mach-loongson64/gpio.h|  36 --
 arch/mips/include/asm/mach-pistachio/gpio.h |  21 
 arch/mips/include/asm/mach-rc32434/gpio.h   |  12 --
 arch/mips/jz4740/gpio.c |  20 ++--
 arch/mips/pci/pci-lantiq.c  |   1 -
 arch/mips/rb532/devices.c   |   1 +
 arch/mips/rb532/gpio.c  |   6 +
 arch/mips/txx9/generic/setup.c  |  16 ---
 drivers/ata/pata_rb532_cf.c |   3 +-
 drivers/gpio/gpio-ath79.c   |  32 -
 drivers/input/misc/rb532_button.c   |   1 +
 drivers/net/ethernet/ti/cpmac.c |   2 +
 37 files changed, 45 insertions(+), 551 deletions(-)
 delete mode 100644 arch/mips/include/asm/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ar7/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath25/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath79/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-au1x00/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm47xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm63xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-cavium-octeon/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-generic/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-lantiq/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-loongson64/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-pistachio/gpio.h

diff --git a/arch

[PATCH] MIPS: Remove all the uses of custom gpio.h

2015-07-30 Thread Alban Bedel
Currently CONFIG_ARCH_HAVE_CUSTOM_GPIO_H is defined for all MIPS
machines, and each machine type provides its own gpio.h. However
only a handful really implement the GPIO API, most just forward
everythings to gpiolib.

The Alchemy machine is notable as it provides a system to allow
implementing the GPIO API at the board level. But it is not used by
any board currently supported, so it can also be removed.

For most machine types we can just remove the custom gpio.h, as well
as the custom wrappers if some exists. Some of the code found in
the wrappers must be moved to the respective GPIO driver.

A few more fixes are need in some drivers as they rely on linux/gpio.h
to provides some machine specific definitions, or used asm/gpio.h
instead of linux/gpio.h for the gpio API.

Signed-off-by: Alban Bedel al...@free.fr
---

This patch is based on my previous serie:
MIPS: ath79: Move the GPIO driver to drivers/gpio.

It supercede my previous patch named:
MIPS: Remove most of the custom gpio.h

Compared to the previous patch:
* Fixed gpio_to_irq on jz4740 and rb532
* Cleaned up alchemy as well
* Removed asm/gpio.h

For testing I tried to build all mips defconfig, however my toolchain
couldn't handle a few configs: ip28 malta_qemu_32r6 maltasmvp_eva
sead3micro. If somebody can test these that would be more than welcome.

Now a few stats about the state of CONFIG_ARCH_HAVE_CUSTOM_GPIO_H
after appling this patch. Of the 31 supportd arch, 15 still have
asm/gpio.h, of these 9 are just a #warning Include linux/gpio.h
instead of asm/gpio.h. So we have 6 arch left: arm, avr32, blackfin,
m68k, sh and unicore32. But only m68k and unicore32 really provides
custom wrappers, all the others only forward to gpiolib.

On the drivers side we only have 13 occurences of '#include
asm/gpio.h' left, mostly in drivers used on ARM SoC.

So the work left to phase out the legacy GPIO is really not that much
anymore.

Alban
---
 arch/mips/Kconfig   |   1 -
 arch/mips/alchemy/Kconfig   |   7 --
 arch/mips/alchemy/board-gpr.c   |   1 +
 arch/mips/alchemy/board-mtx1.c  |   1 +
 arch/mips/alchemy/common/Makefile   |   7 +-
 arch/mips/alchemy/devboards/db1000.c|   1 +
 arch/mips/alchemy/devboards/db1300.c|   1 +
 arch/mips/alchemy/devboards/db1550.c|   1 +
 arch/mips/alchemy/devboards/pm.c|   2 +-
 arch/mips/ar7/gpio.c|  12 +-
 arch/mips/ar7/platform.c|   1 -
 arch/mips/ar7/setup.c   |   1 -
 arch/mips/include/asm/gpio.h|   6 -
 arch/mips/include/asm/mach-ar7/ar7.h|   4 +
 arch/mips/include/asm/mach-ar7/gpio.h   |  41 ---
 arch/mips/include/asm/mach-ath25/gpio.h |  16 ---
 arch/mips/include/asm/mach-ath79/gpio.h |  26 -
 arch/mips/include/asm/mach-au1x00/gpio-au1000.h | 148 ++--
 arch/mips/include/asm/mach-au1x00/gpio.h|  86 --
 arch/mips/include/asm/mach-bcm47xx/gpio.h   |  17 ---
 arch/mips/include/asm/mach-bcm63xx/gpio.h   |  15 ---
 arch/mips/include/asm/mach-cavium-octeon/gpio.h |  21 
 arch/mips/include/asm/mach-generic/gpio.h   |  21 
 arch/mips/include/asm/mach-jz4740/gpio.h|   2 -
 arch/mips/include/asm/mach-lantiq/gpio.h|  13 ---
 arch/mips/include/asm/mach-loongson64/gpio.h|  36 --
 arch/mips/include/asm/mach-pistachio/gpio.h |  21 
 arch/mips/include/asm/mach-rc32434/gpio.h   |  12 --
 arch/mips/jz4740/gpio.c |  20 ++--
 arch/mips/pci/pci-lantiq.c  |   1 -
 arch/mips/rb532/devices.c   |   1 +
 arch/mips/rb532/gpio.c  |   6 +
 arch/mips/txx9/generic/setup.c  |  16 ---
 drivers/ata/pata_rb532_cf.c |   3 +-
 drivers/gpio/gpio-ath79.c   |  32 -
 drivers/input/misc/rb532_button.c   |   1 +
 drivers/net/ethernet/ti/cpmac.c |   2 +
 37 files changed, 44 insertions(+), 559 deletions(-)
 delete mode 100644 arch/mips/include/asm/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ar7/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath25/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath79/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-au1x00/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm47xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm63xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-cavium-octeon/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-generic/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-lantiq/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-loongson64/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-pistachio/gpio.h

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index cee5f93..6cab2b8 100644
--- a/arch

Re: [PATCH] MIPS: Remove most of the custom gpio.h

2015-07-23 Thread Alban
On Wed, 22 Jul 2015 19:47:18 +0200
Manuel Lauss manuel.la...@gmail.com wrote:

 On Wed, Jul 22, 2015 at 7:33 PM, Alban Bedel al...@free.fr wrote:
  Currently CONFIG_ARCH_HAVE_CUSTOM_GPIO_H is defined for all MIPS
  machines, and each machine type provides its own gpio.h. However
  only the Alchemy machine really use the feature, all other machines
  only use the default wrappers.
 
  For most machine types we can just remove the custom gpio.h, as well
  as the custom wrappers if some exists. A few more fixes are need in
  a few drivers as they rely on linux/gpio.h to provides some machine
  specific definitions, or used asm/gpio.h instead of linux/gpio.h for
  the gpio API.
 
  Signed-off-by: Alban Bedel al...@free.fr
  ---
 
  This patch is based on my previous serie:
  MIPS: ath79: Move the GPIO driver to drivers/gpio.
 
  For testing I tried to build all mips defconfig, however my
  toolchain couldn't handle a few configs: ip28 malta_qemu_32r6
  maltasmvp_eva sead3micro. If somebody can test these that would be
  more than welcome.
 
  It might well be that some more drivers for MIPS devices that are
  not enabled in the defconfig will break because of this change, so
  more testing would be nice :)
 
  Regarding Alchemy I'm not sure what to do. It use a little more
  complex setup, quoting arch/mips/include/asm/mach-au1x00/gpio.h:
 
  /* Linux gpio framework integration.
  *
  * 4 use cases of Alchemy GPIOS:
  *(1) GPIOLIB=y, ALCHEMY_GPIO_INDIRECT=y:
  *   Board must register gpiochips.
  *(2) GPIOLIB=y, ALCHEMY_GPIO_INDIRECT=n:
  *   A gpiochip for the 75 GPIOs is registered.
  *
  *(3) GPIOLIB=n, ALCHEMY_GPIO_INDIRECT=y:
  *   the boards' gpio.h must provide the linux gpio wrapper
  functions,
  *
  *(4) GPIOLIB=n, ALCHEMY_GPIO_INDIRECT=n:
  *   inlinable gpio functions are provided which enable access
  to the
  *   Au1300 gpios only by using the numbers straight out of the
  data-
  *   sheets.
 
  * Cases 1 and 3 are intended for boards which want to provide their
  own
  * GPIO namespace and -operations (i.e. for example you have 8 GPIOs
  * which are in part provided by spare Au1300 GPIO pins and in part
  by
  * an external FPGA but you still want them to be accssible in linux
  * as gpio0-7. The board can of course use the alchemy_gpioX_*
  functions
  * as required).
  */
 
  This sound to me like this is really not needed anymore. Is there
  any users of this left, or can it just go?
 
 There are no in-tree users of this, but a few out-of-tree ones (all
 made by me) Does it have to be removed?  Is it blocking anything?

It is not blocking anything, but I see little gain in it. Cases 1 and 3
should nowadays be handled using normal GPIO drivers, and not with such
platform specific constructs.

Alban
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] MIPS: Remove most of the custom gpio.h

2015-07-22 Thread Alban Bedel
Currently CONFIG_ARCH_HAVE_CUSTOM_GPIO_H is defined for all MIPS
machines, and each machine type provides its own gpio.h. However only
the Alchemy machine really use the feature, all other machines only
use the default wrappers.

For most machine types we can just remove the custom gpio.h, as well
as the custom wrappers if some exists. A few more fixes are need in
a few drivers as they rely on linux/gpio.h to provides some machine
specific definitions, or used asm/gpio.h instead of linux/gpio.h for
the gpio API.

Signed-off-by: Alban Bedel al...@free.fr
---

This patch is based on my previous serie:
MIPS: ath79: Move the GPIO driver to drivers/gpio.

For testing I tried to build all mips defconfig, however my toolchain
couldn't handle a few configs: ip28 malta_qemu_32r6 maltasmvp_eva
sead3micro. If somebody can test these that would be more than welcome.

It might well be that some more drivers for MIPS devices that are not
enabled in the defconfig will break because of this change, so more
testing would be nice :)

Regarding Alchemy I'm not sure what to do. It use a little more
complex setup, quoting arch/mips/include/asm/mach-au1x00/gpio.h:

/* Linux gpio framework integration.
*
* 4 use cases of Alchemy GPIOS:
*(1) GPIOLIB=y, ALCHEMY_GPIO_INDIRECT=y:
*   Board must register gpiochips.
*(2) GPIOLIB=y, ALCHEMY_GPIO_INDIRECT=n:
*   A gpiochip for the 75 GPIOs is registered.
*
*(3) GPIOLIB=n, ALCHEMY_GPIO_INDIRECT=y:
*   the boards' gpio.h must provide the linux gpio wrapper
functions,
*
*(4) GPIOLIB=n, ALCHEMY_GPIO_INDIRECT=n:
*   inlinable gpio functions are provided which enable access to the
*   Au1300 gpios only by using the numbers straight out of the data-
*   sheets.

* Cases 1 and 3 are intended for boards which want to provide their own
* GPIO namespace and -operations (i.e. for example you have 8 GPIOs
* which are in part provided by spare Au1300 GPIO pins and in part by
* an external FPGA but you still want them to be accssible in linux
* as gpio0-7. The board can of course use the alchemy_gpioX_* functions
* as required).
*/

This sound to me like this is really not needed anymore. Is there any
users of this left, or can it just go?

Alban
---
 arch/mips/Kconfig   |  2 +-
 arch/mips/ar7/gpio.c| 12 ++--
 arch/mips/ar7/platform.c|  1 -
 arch/mips/ar7/setup.c   |  1 -
 arch/mips/include/asm/mach-ar7/ar7.h|  4 +++
 arch/mips/include/asm/mach-ar7/gpio.h   | 41 -
 arch/mips/include/asm/mach-ath25/gpio.h | 16 --
 arch/mips/include/asm/mach-ath79/gpio.h | 26 
 arch/mips/include/asm/mach-bcm47xx/gpio.h   | 17 --
 arch/mips/include/asm/mach-bcm63xx/gpio.h   | 15 -
 arch/mips/include/asm/mach-cavium-octeon/gpio.h | 21 -
 arch/mips/include/asm/mach-generic/gpio.h   | 21 -
 arch/mips/include/asm/mach-jz4740/gpio.h|  2 --
 arch/mips/include/asm/mach-lantiq/gpio.h| 13 
 arch/mips/include/asm/mach-loongson64/gpio.h| 36 --
 arch/mips/include/asm/mach-pistachio/gpio.h | 21 -
 arch/mips/include/asm/mach-rc32434/gpio.h   | 12 
 arch/mips/jz4740/gpio.c | 12 
 arch/mips/rb532/devices.c   |  1 +
 arch/mips/txx9/generic/setup.c  | 16 --
 drivers/ata/pata_rb532_cf.c |  3 +-
 drivers/gpio/gpio-ath79.c   | 32 ---
 drivers/input/misc/rb532_button.c   |  1 +
 drivers/net/ethernet/ti/cpmac.c |  2 ++
 24 files changed, 13 insertions(+), 315 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-ar7/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath25/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-ath79/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm47xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-bcm63xx/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-cavium-octeon/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-generic/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-lantiq/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-loongson64/gpio.h
 delete mode 100644 arch/mips/include/asm/mach-pistachio/gpio.h

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index d896ffb..c3b2f38 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -13,7 +13,6 @@ config MIPS
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_BPF_JIT if !CPU_MICROMIPS
-   select ARCH_HAVE_CUSTOM_GPIO_H
select HAVE_FUNCTION_TRACER
select HAVE_DYNAMIC_FTRACE
select HAVE_FTRACE_MCOUNT_RECORD
@@ -77,6 +76,7 @@ config MIPS_ALCHEMY
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_APM_EMULATION
select