Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}

2018-10-19 Thread Peter Zijlstra
READ_ONCE(base->data_head); > + > + smp_rmb(); > + return head; > +#endif > +} > + > +static inline void ring_buffer_write_tail(struct perf_event_mmap_page *base, > + u64 tail) > +{ > + smp_store_release(&base->data_tail, tail); > +} > + > +#endif /* _TOOLS_LINUX_RING_BUFFER_H_ */ (for the whole patch, but in particular the above) Acked-by: Peter Zijlstra (Intel)
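
The read-head/write-tail pairing quoted above can be approximated in portable C11 for readers who want to experiment outside the kernel tree. This is only a minimal sketch under stated assumptions: struct demo_ring and every demo_* name are hypothetical, the struct is not the real perf_event_mmap_page layout, and an acquire load stands in for the READ_ONCE() plus smp_rmb() sequence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the data_head/data_tail pair;
 * NOT the real struct perf_event_mmap_page layout. */
struct demo_ring {
    _Atomic uint64_t data_head; /* advanced by the producer */
    _Atomic uint64_t data_tail; /* advanced by the consumer */
};

/* Consumer side: load the head with acquire semantics so that later
 * reads of record data cannot be reordered before this load. */
static inline uint64_t demo_read_head(struct demo_ring *r)
{
    return atomic_load_explicit(&r->data_head, memory_order_acquire);
}

/* Consumer side: publish the new tail with release semantics so that
 * all earlier reads of record data complete before the producer can
 * observe the space as free. */
static inline void demo_write_tail(struct demo_ring *r, uint64_t tail)
{
    atomic_store_explicit(&r->data_tail, tail, memory_order_release);
}

/* Drain everything between tail and head; returns the byte count. */
static uint64_t demo_consume_all(struct demo_ring *r)
{
    uint64_t head = demo_read_head(r);
    uint64_t tail = atomic_load_explicit(&r->data_tail,
                                         memory_order_relaxed);

    assert(head >= tail); /* counters only grow in this sketch */
    /* ... a real consumer would parse records in [tail, head) here ... */
    demo_write_tail(r, head);
    return head - tail;
}
```

The acquire/release pair is chosen here because it maps onto the smp_load_acquire()/smp_store_release() style the thread converges on; whether it compiles to the same instructions as the kernel macros depends on the architecture.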

Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}

2018-10-19 Thread Peter Zijlstra
On Thu, Oct 18, 2018 at 08:33:09AM -0700, Alexei Starovoitov wrote: > On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote: > > #endif /* _TOOLS_LINUX_ASM_IA64_BARRIER_H */ > > diff --git a/tools/arch/powerpc/include/asm/barrier.h > > b/tools/arch/powerpc/include/asm/barrier.h > >

Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}

2018-10-18 Thread Peter Zijlstra
On Thu, Oct 18, 2018 at 01:10:15AM +0200, Daniel Borkmann wrote: > Wouldn't this then also allow the kernel side to use smp_store_release() > when it updates the head? We'd be pretty much at the model as described > in Documentation/core-api/circular-buffers.rst. > > Meaning, rough pseudo-code

Re: [PATCH bpf-next 3/3] bpf, libbpf: use proper barriers in perf ring buffer walk

2018-10-17 Thread Peter Zijlstra
On Wed, Oct 17, 2018 at 04:41:56PM +0200, Daniel Borkmann wrote: > +static __u64 bpf_perf_read_head(struct perf_event_mmap_page *header) > +{ > + __u64 data_head = READ_ONCE(header->data_head); > + > + smp_rmb(); > + return data_head; > +} > + > +static void bpf_perf_write_tail(struct

Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}

2018-10-17 Thread Peter Zijlstra
On Wed, Oct 17, 2018 at 04:41:55PM +0200, Daniel Borkmann wrote: > @@ -73,7 +73,8 @@ static inline u64 perf_mmap__read_head(struct perf_mmap *mm) > { > struct perf_event_mmap_page *pc = mm->base; > u64 head = READ_ONCE(pc->data_head); > - rmb(); > + > + smp_rmb(); >

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-21 Thread Peter Zijlstra
On Fri, Sep 21, 2018 at 09:25:00AM -0300, Arnaldo Carvalho de Melo wrote: > There is another longstanding TODO list entry: PERF_RECORD_MMAP records > should include a build-id I thought the problem was that the kernel doesn't have the build-id in the first place. So it cannot hand them out.

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-21 Thread Peter Zijlstra
On Fri, Sep 21, 2018 at 09:25:00AM -0300, Arnaldo Carvalho de Melo wrote: > > I consider synthetic perf events to be non-ABI. Meaning they're > > emitted by perf user space into perf.data and there is a convention > > on names, but it's not a kernel abi. Like RECORD_MMAP with > > event.filename ==

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-20 Thread Peter Zijlstra
On Thu, Sep 20, 2018 at 10:25:45AM -0300, Arnaldo Carvalho de Melo wrote: > PeterZ provided a patch introducing PERF_RECORD_MUNMAP, went nowhere due > to having to cope with munmapping parts of existing mmaps, etc. > > I'm still more in favour of introducing PERF_RECORD_MUNMAP, even if for > now it

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-20 Thread Peter Zijlstra
On Thu, Sep 20, 2018 at 10:44:24AM +0200, Peter Zijlstra wrote: > On Wed, Sep 19, 2018 at 03:39:34PM -0700, Alexei Starovoitov wrote: > > void bpf_prog_kallsyms_del(struct bpf_prog *fp) > > { > > + unsigned long symbol_start, symbol_end; > > + /* mmap_record.file

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-20 Thread Peter Zijlstra
On Wed, Sep 19, 2018 at 03:39:34PM -0700, Alexei Starovoitov wrote: > void bpf_prog_kallsyms_del(struct bpf_prog *fp) > { > + unsigned long symbol_start, symbol_end; > + /* mmap_record.filename cannot be NULL and has to be u64 aligned */ > + char buf[sizeof(u64)] = {}; > + > if

Re: [net-next, v6, 6/7] net-sysfs: Add interface for Rx queue(s) map per Tx queue

2018-07-19 Thread Peter Zijlstra
On Wed, Jul 18, 2018 at 11:22:36AM -0700, Andrei Vagin wrote: > > > [1.085679] lock(cpu_hotplug_lock.rw_sem); > > > [1.085753] lock(cpu_hotplug_lock.rw_sem); > > > [1.085828] > > > [1.085828] *** DEADLOCK *** > Peter and Ingo, maybe you could explain why it isn't safe to

Re: [PATCH bpf-next v2 1/7] perf/core: add perf_get_event() to return perf_event given a struct file

2018-05-18 Thread Peter Zijlstra
On Thu, May 17, 2018 at 10:32:53PM -0700, Yonghong Song wrote: > A new extern function, perf_get_event(), is added to return a perf event > given a struct file. This function will be used in later patches. Can't you do a narrower interface? Like return the prog. I'm not too keen on random !perf

Re: [PATCH bpf-next 2/7] bpf: introduce bpf subcommand BPF_PERF_EVENT_QUERY

2018-05-16 Thread Peter Zijlstra
On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote: > Currently, suppose a userspace application has loaded a bpf program > and attached it to a tracepoint/kprobe/uprobe, and a bpf > introspection tool, e.g., bpftool, wants to show which bpf program > is attached to which

Re: [PATCH bpf v3] x86/cpufeature: bpf hack for clang not supporting asm goto

2018-05-10 Thread Peter Zijlstra
On Thu, May 03, 2018 at 08:31:19PM -0700, Yonghong Song wrote: > This approach is preferred since the already deployed bcc scripts, or > any other bpf applications utilizing LLVM JIT compilation functionality, > will continue to work with the new kernel without re-compilation and > re-deployment.

Re: [RFC PATCH 2/2] net: mac808211: mac802154: use lockdep_assert_in_softirq() instead own warning

2018-05-04 Thread Peter Zijlstra
On Fri, May 04, 2018 at 09:07:35PM +0200, Sebastian Andrzej Siewior wrote: > On 2018-05-04 20:51:32 [+0200], Peter Zijlstra wrote: > > softirqs disabled, ack that is exactly what it checks. > > > > But afaict the assertion you introduced tests that we are _in_ softi

Re: [RFC PATCH 2/2] net: mac808211: mac802154: use lockdep_assert_in_softirq() instead own warning

2018-05-04 Thread Peter Zijlstra
On Fri, May 04, 2018 at 08:45:39PM +0200, Sebastian Andrzej Siewior wrote: > On 2018-05-04 20:32:49 [+0200], Peter Zijlstra wrote: > > On Fri, May 04, 2018 at 07:51:44PM +0200, Sebastian Andrzej Siewior wrote: > > > From: Anna-Maria Gleixner <anna-ma...@linutronix.de>

Re: [RFC PATCH 2/2] net: mac808211: mac802154: use lockdep_assert_in_softirq() instead own warning

2018-05-04 Thread Peter Zijlstra
On Fri, May 04, 2018 at 07:51:44PM +0200, Sebastian Andrzej Siewior wrote: > From: Anna-Maria Gleixner > > The warning in ieee802154_rx() and ieee80211_rx_napi() is there to ensure > the softirq context for the subsequent netif_receive_skb() call. That's not in fact

Re: [PATCH bpf-next 1/2] bpf: enable stackmap with build_id in nmi context

2018-05-02 Thread Peter Zijlstra
On Wed, May 02, 2018 at 04:48:32PM +0000, Song Liu wrote: > > It's broken though, I've bet you've never actually ran this with lockdep > > enabled for example. > > I am not following here. I just run the new selftest with CONFIG_LOCKDEP on, > and got no warning for this. Weird, I would be

Re: [PATCH bpf-next 1/2] bpf: enable stackmap with build_id in nmi context

2018-05-02 Thread Peter Zijlstra
On Tue, May 01, 2018 at 05:02:19PM -0700, Song Liu wrote: > @@ -267,17 +285,27 @@ static void stack_map_get_build_id_offset(struct > bpf_stack_build_id *id_offs, > { > int i; > struct vm_area_struct *vma; > + bool in_nmi_ctx = in_nmi(); > + bool irq_work_busy = false; > +

Re: [PATCH] x86/cpufeature: guard asm_volatile_goto usage with CC_HAVE_ASM_GOTO

2018-04-14 Thread Peter Zijlstra
On Fri, Apr 13, 2018 at 01:42:14PM -0700, Alexei Starovoitov wrote: > On 4/13/18 11:19 AM, Peter Zijlstra wrote: > > On Tue, Apr 10, 2018 at 02:28:04PM -0700, Alexei Starovoitov wrote: > > > Instead of > > > #ifdef CC_HAVE_ASM_GOTO > > > we can replace it with

Re: linux-next: build failure after merge of the tip tree

2018-04-03 Thread Peter Zijlstra
On Tue, Apr 03, 2018 at 01:39:08PM +0100, David Howells wrote: > Peter Zijlstra <pet...@infradead.org> wrote: > > > I figured that since there were only a handful of users it wasn't a > > popular API, also David very much knew of those patches changing it so >

Re: linux-next: build failure after merge of the tip tree

2018-04-03 Thread Peter Zijlstra
On Tue, Apr 03, 2018 at 03:41:22PM +1000, Stephen Rothwell wrote: > Caused by commit > > 9b8cce52c4b5 ("sched/wait: Remove the wait_on_atomic_t() API") > > interacting with commits > > d3be4d244330 ("rxrpc: Fix potential call vs socket/net destruction race") > 31f5f9a1691e ("rxrpc: Fix

Re: [PATCH bpf-next v4 1/2] bpf: extend stackmap to save binary_build_id+offset instead of address

2018-03-12 Thread Peter Zijlstra
On Mon, Mar 12, 2018 at 01:39:56PM -0700, Song Liu wrote: > +static void stack_map_get_build_id_offset(struct bpf_map *map, > + struct stack_map_bucket *bucket, > + u64 *ips, u32 trace_nr) > +{ > + int i; > +

Re: [PATCH v2 1/2] bpf: extend stackmap to save binary_build_id+offset instead of address

2018-03-06 Thread Peter Zijlstra
On Tue, Mar 06, 2018 at 10:09:13AM -0800, Song Liu wrote: > +/* Parse build ID of ELF file mapped to vma */ > +static int stack_map_get_build_id(struct vm_area_struct *vma, > + unsigned char *build_id) > +{ > + Elf32_Ehdr *ehdr; > + struct page *page; > +

Re: [PATCH bpf-next 0/5] bpf, tracing: introduce bpf raw tracepoints

2018-03-06 Thread Peter Zijlstra
On Mon, Mar 05, 2018 at 02:36:07PM +0100, Daniel Borkmann wrote: > On 03/01/2018 05:19 AM, Alexei Starovoitov wrote: > > This patch set is a different way to address the pressing need to access > > task_struct pointers in sched tracepoints from bpf programs. > > > > The first approach simply

Re: [PATCH bpf-next 1/2] bpf: extend stackmap to save binary_build_id+offset instead of address

2018-03-05 Thread Peter Zijlstra
On Mon, Feb 26, 2018 at 09:49:22AM -0800, Song Liu wrote: > +/* Parse build ID of ELF file mapped to vma */ > +static int stack_map_get_build_id(struct vm_area_struct *vma, > + unsigned char *build_id) > +{ > + Elf32_Ehdr *ehdr = (Elf32_Ehdr *)vma->vm_start; How

Re: Serious performance degradation in Linux 4.15

2018-02-16 Thread Peter Zijlstra
On Fri, Feb 16, 2018 at 02:38:39PM +0000, Matt Fleming wrote: > On Wed, 14 Feb, at 10:46:20PM, Matt Fleming wrote: > > Here's some more numbers. This is with RETPOLINE=y but you'll see it > > doesn't make much of a difference. Oh, this is also with powersave > > cpufreq governor. > > Feh, I was

Re: Serious performance degradation in Linux 4.15

2018-02-16 Thread Peter Zijlstra
On Wed, Feb 14, 2018 at 10:46:20PM +0000, Matt Fleming wrote: > 3. ./run-mmtests.sh > --config=configs/config-global-dhp__network-netperf-unbound `uname -r` Not a success.. firstly it attempts to install packages without asking and then horribly fails at it..

Re: Serious performance degradation in Linux 4.15

2018-02-16 Thread Peter Zijlstra
On Wed, Feb 14, 2018 at 10:46:20PM +0000, Matt Fleming wrote: > Peter, if you want to run this test yourself you can do: > > 1. git clone https://github.com/gorman/mmmtests.git root@ivb-ep:/usr/local/src# git clone https://github.com/gorman/mmmtests.git Cloning into 'mmmtests'... Username for

Re: Serious performance degradation in Linux 4.15

2018-02-15 Thread Peter Zijlstra
On Wed, Feb 14, 2018 at 10:46:20PM +0000, Matt Fleming wrote: > Here's some more numbers. This is with RETPOLINE=y but you'll see it > doesn't make much of a difference. Oh, this is also with powersave > cpufreq governor. Hurmph, I'll go have a look when I can boot tip/master again :/ But didn't

Re: Serious performance degradation in Linux 4.15

2018-02-12 Thread Peter Zijlstra
On Fri, Feb 09, 2018 at 05:59:12PM +0000, Jon Maloy wrote: > Command for TCP: > "netperf TCP_STREAM (netperf -n 4 -f m -c 4 -C 4 -P 1 -H 10.0.0.1 -t > TCP_STREAM -l 10 -- -O THROUGHPUT)" > Command for TIPC: > "netperf TIPC_STREAM (netperf -n 4 -f m -c 4 -C 4 -P 1 -H 10.0.0.1 -t > TCP_STREAM -l

Re: Serious performance degradation in Linux 4.15

2018-02-10 Thread Peter Zijlstra
On Fri, Feb 09, 2018 at 05:59:12PM +0000, Jon Maloy wrote: > The two commits > d153b153446f7 ("sched/core: Fix wake_affine() performance regression") and > f2cdd9cc6c97 ("sched/core: Address more wake_affine() regressions") > are causing a serious performance degradation in Linux 4.15. > > The

Re: [4.15-rc9] fs_reclaim lockdep trace

2018-01-29 Thread Peter Zijlstra
On Mon, Jan 29, 2018 at 08:47:20PM +0900, Tetsuo Handa wrote: > Peter Zijlstra wrote: > > On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote: > > > This warning seems to be caused by commit d92a8cfcb37ecd13 > > > ("locking/lockdep: Rework FS_

Re: [4.15-rc9] fs_reclaim lockdep trace

2018-01-29 Thread Peter Zijlstra
o nopage; bit? > Reported-by: Dave Jones <da...@codemonkey.org.uk> > Signed-off-by: Tetsuo Handa <penguin-ker...@i-love.sakura.ne.jp> > Cc: Peter Zijlstra <pet...@infradead.org> > Cc: Nick Piggin <npig...@gmail.com> > --- > mm/page_alloc.c | 2 +- > 1 file changed,

Re: net: r8169: a question of memory barrier in the r8169 driver

2018-01-19 Thread Peter Zijlstra
On Fri, Jan 19, 2018 at 02:11:18AM +0100, Francois Romieu wrote: > Peter Zijlstra <pet...@infradead.org> : > [...] > > There is only 1 variable afaict. Memory barriers need at least 2 in > > order to be able to do _anything_. > > I don't get your point: why do

Re: net: r8169: a question of memory barrier in the r8169 driver

2018-01-18 Thread Peter Zijlstra
On Thu, Jan 18, 2018 at 10:06:17PM +0800, Jia-Ju Bai wrote: > In the r8169 driver, the function "rtl_tx" uses "smp_mb" to sync the > writing operation with rtl8169_start_xmit: > if (tp->dirty_tx != dirty_tx) { > tp->dirty_tx = dirty_tx; > smp_mb(); > ... > } > The

Re: dvb usb issues since kernel 4.9

2018-01-08 Thread Peter Zijlstra
On Mon, Jan 08, 2018 at 10:31:09PM +0100, Jesper Dangaard Brouer wrote: > I did expect the issue to get worse, when you load the Pi with > network traffic, as now the softirq time-budget has to be shared > between networking and USB/DVB. Thus, I guess you are running TCP and > USB/mpeg2ts on

Re: [PATCH 00/18] prevent bounds-check bypass via speculative execution

2018-01-08 Thread Peter Zijlstra
On Mon, Jan 08, 2018 at 11:43:42AM +0000, Alan Cox wrote: > On Mon, 8 Jan 2018 11:08:36 +0100 > Peter Zijlstra <pet...@infradead.org> wrote: > > > On Fri, Jan 05, 2018 at 10:30:16PM -0800, Dan Williams wrote: > > > On Fri, Jan 5, 2018 at 6:22 PM, Eric W. Bi

Re: [PATCH 00/18] prevent bounds-check bypass via speculative execution

2018-01-08 Thread Peter Zijlstra
On Fri, Jan 05, 2018 at 10:30:16PM -0800, Dan Williams wrote: > On Fri, Jan 5, 2018 at 6:22 PM, Eric W. Biederman > wrote: > > In at least one place (mpls) you are patching a fast path. Compile out > > or don't load mpls by all means. But it is not acceptable to change

Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-08 Thread Peter Zijlstra
On Sun, Jan 07, 2018 at 06:57:35PM -0800, Alexei Starovoitov wrote: > On Sun, Jan 07, 2018 at 01:59:35PM +0000, Alan Cox wrote: > > lfence timing is also heavily dependent upon what work has to be done to > > retire previous live instructions. > > BPF does not normally do a lot of writing so

Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-08 Thread Peter Zijlstra
On Sun, Jan 07, 2018 at 06:24:11PM -0800, Alexei Starovoitov wrote: > How about: > CONFIG_SPECTRE1_WORKAROUND_INDEX_MASK > CONFIG_SPECTRE1_WORKAROUND_LOAD_FENCE INSTRUCTION_FENCE if anything. LFENCE for Intel (and now also for AMD as per 0592b0bce169) is a misnomer, IFENCE would be a better name

Re: [PATCH v5 3/6] perf: implement pmu perf_kprobe

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 06:10:11PM +0000, Song Liu wrote: > I think there is one more thing to change: OK, folded that too; it should all be at: git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core Can you verify it all looks/works right?

Re: [PATCH v5 3/6] perf: implement pmu perf_kprobe

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 11:03:01AM +0100, Peter Zijlstra wrote: > On Wed, Dec 06, 2017 at 02:45:15PM -0800, Song Liu wrote: > > @@ -8537,7 +8620,7 @@ static int perf_event_set_filter(struct perf_event > > *event, void __user *arg) > > char *filter_str; >

Re: [PATCH v5 3/6] perf: implement pmu perf_kprobe

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 06, 2017 at 02:45:15PM -0800, Song Liu wrote: > @@ -8537,7 +8620,7 @@ static int perf_event_set_filter(struct perf_event > *event, void __user *arg) > char *filter_str; > int ret = -EINVAL; > > - if ((event->attr.type != PERF_TYPE_TRACEPOINT || > + if

Re: [PATCH v5 0/6] enable creating [k,u]probe with perf_event_open

2017-12-19 Thread Peter Zijlstra
Took 1-4, Thanks!

Re: [PATCH net-next v4 1/2] bpf/tracing: allow user space to query prog array on the same tp

2017-12-12 Thread Peter Zijlstra
On Mon, Dec 11, 2017 at 11:39:02AM -0800, Yonghong Song wrote: > The usage: > struct perf_event_query_bpf *query = malloc(...); > query.ids_len = ids_len; > err = ioctl(pmu_efd, PERF_EVENT_IOC_QUERY_BPF, ); You didn't spot the fixes to your changelog ;-) The above should read something

Re: [PATCH net-next v3 1/2] bpf/tracing: allow user space to query prog array on the same tp

2017-12-11 Thread Peter Zijlstra
_cnt is the number of available progs, > * number of progs in ids: (ids_len == 0) ? 0 : query.prog_cnt > */ > } else if (errno == ENOSPC) { > /* query.ids_len number of progs copied, > * query.prog_cnt is the number of available progs > */ > } else { > /* other errors */ > } > > Signed-off-by: Yonghong Song <y...@fb.com> Yes this looks much better, thanks! Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org>

Re: [PATCH net-next v2 1/2] bpf/tracing: allow user space to query prog array on the same tp

2017-12-06 Thread Peter Zijlstra
On Wed, Dec 06, 2017 at 12:56:36PM +0100, Peter Zijlstra wrote: > On Tue, Dec 05, 2017 at 10:31:28PM -0800, Yonghong Song wrote: > > Commit e87c6bc3852b ("bpf: permit multiple bpf attachments > > for a single perf event") added support to attach multiple > > bpf

Re: [PATCH net-next v2 1/2] bpf/tracing: allow user space to query prog array on the same tp

2017-12-06 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 10:31:28PM -0800, Yonghong Song wrote: > Commit e87c6bc3852b ("bpf: permit multiple bpf attachments > for a single perf event") added support to attach multiple > bpf programs to a single perf event. > Commit 2541517c32be ("tracing, perf: Implement BPF programs > attached

Re: [PATCH v4 1/6] perf: prepare perf_event.h for new types perf_kprobe and perf_uprobe

2017-12-06 Thread Peter Zijlstra
On Mon, Dec 04, 2017 at 05:27:24PM -0800, Song Liu wrote: > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index 362493a..0f39b31 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -291,6 +291,16 @@ enum

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 01:36:44PM -0800, Paul E. McKenney wrote: > What we do in some code is to comment the pairings, allowing the other > side of the pairing to be easily located. Would that work for you? I would say that that is mandatory for any memory ordering code ;-)

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 11:24:49PM +0200, Michael S. Tsirkin wrote: > READ_ONCE is really all over the place (some code literally replaced all > memory accesses with READ/WRITE ONCE). Yeah, so? Complain to the compiler people for forcing us into that. > Would an API like

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 10:28:38PM +0200, Michael S. Tsirkin wrote: > On Tue, Dec 05, 2017 at 08:57:52PM +0100, Peter Zijlstra wrote: > > On Tue, Dec 05, 2017 at 09:51:48PM +0200, Michael S. Tsirkin wrote: > > > > > WRITE_ONCE(obj->val, 1); > > > > >

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 09:51:48PM +0200, Michael S. Tsirkin wrote: > > > WRITE_ONCE(obj->val, 1); > > > smp_wmb(); > > > WRITE_ONCE(*foo, obj); > > > > I believe Peter was instead suggesting: > > > > WRITE_ONCE(obj->val, 1); > > smp_store_release(foo, obj); > > Isn't that more expensive
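
The publish pattern being compared in this exchange (initialise the object, then store the pointer with release semantics) can be sketched in portable C11 as follows. This is an illustration under assumptions, not code from the thread: demo_obj, demo_foo and the demo_* functions are hypothetical names, and the reader uses an acquire load, whereas the kernel reader discussed here relies on READ_ONCE() and the address dependency instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object type, standing in for "obj" in the thread. */
struct demo_obj {
    int val;
};

/* Hypothetical shared pointer, standing in for "*foo". */
static _Atomic(struct demo_obj *) demo_foo;

/* Writer: fill in the object first, then publish the pointer with a
 * release store so the initialisation is visible to any reader that
 * observes the new pointer value. */
static void demo_publish(struct demo_obj *obj)
{
    assert(obj != NULL);
    obj->val = 1;
    atomic_store_explicit(&demo_foo, obj, memory_order_release);
}

/* Reader: load the pointer with acquire semantics, then dereference.
 * Returns -1 while nothing has been published yet. */
static int demo_read(void)
{
    struct demo_obj *obj =
        atomic_load_explicit(&demo_foo, memory_order_acquire);

    return obj ? obj->val : -1;
}
```

An acquire load is used on the reader side because C11 offers no dependable equivalent of a dependency-ordered load; it is at least as strong as what the reader needs, at some cost on weakly ordered machines.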

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 09:24:21PM +0200, Michael S. Tsirkin wrote: > On Tue, Dec 05, 2017 at 08:17:33PM +0100, Peter Zijlstra wrote: > > On Tue, Dec 05, 2017 at 08:57:46PM +0200, Michael S. Tsirkin wrote: > > > > > I don't see WRITE_ONCE inserting any barriers

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 08:57:46PM +0200, Michael S. Tsirkin wrote: > I don't see WRITE_ONCE inserting any barriers, release or > write. Correct, never claimed there was. Just saying that: obj = READ_ONCE(*foo); val = READ_ONCE(obj->val); Never needs a barrier (except on Alpha

Re: [PATCH tip/core/rcu 21/21] drivers/vhost: Remove now-redundant read_barrier_depends()

2017-12-05 Thread Peter Zijlstra
On Tue, Dec 05, 2017 at 08:31:20PM +0200, Michael S. Tsirkin wrote: > Apropos, READ_ONCE is now asymmetrical with WRITE_ONCE. > > I can read a pointer with READ_ONCE and be sure the value > is sane, but only if I also remember to put in smp_wmb before > WRITE_ONCE. Otherwise the pointer is ok

Re: [PATCH v3 3/6] perf: implement pmu perf_kprobe

2017-12-04 Thread Peter Zijlstra
On Thu, Nov 30, 2017 at 03:50:20PM -0800, Song Liu wrote: > + tp_event = create_local_trace_kprobe( > + func, (void *)(unsigned long)(p_event->attr.kprobe_addr), > + p_event->attr.probe_offset, p_event->attr.config != 0); So you want to explicitly test bit0 instead?

Re: [PATCH v3 3/6] perf: implement pmu perf_kprobe

2017-12-04 Thread Peter Zijlstra
On Thu, Nov 30, 2017 at 03:50:20PM -0800, Song Liu wrote: > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 494eca1..49bbf46 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > static inline void perf_tp_register(void) > { > perf_pmu_register(&perf_tracepoint,

Re: [PATCH v3 3/6] perf: implement pmu perf_kprobe

2017-12-04 Thread Peter Zijlstra
On Thu, Nov 30, 2017 at 03:50:20PM -0800, Song Liu wrote: > +static struct pmu perf_kprobe = { > +}; > +static inline bool perf_event_is_tracing(struct perf_event *event) > +{ > + return event->attr.type == PERF_TYPE_TRACEPOINT || > + strncmp(event->pmu->name, "kprobe", 6) == 0;

Re: [PATCH 1/6] perf: Add new type PERF_TYPE_PROBE

2017-11-30 Thread Peter Zijlstra
On Thu, Nov 30, 2017 at 01:43:06AM +0000, Song Liu wrote: > I added two fixed types (PERF_TYPE_KPROBE and PERF_TYPE_UPROBE) in the new > version. I know that perf doesn't need them any more. But currently bcc still > relies on these fixed types to use the probes/tracepoints. Yeah, sorry,

Re: [PATCH 1/6] perf: Add new type PERF_TYPE_PROBE

2017-11-26 Thread Peter Zijlstra
On Sat, Nov 25, 2017 at 05:59:54PM -0800, Alexei Starovoitov wrote: > If we were poking into 'struct perf_event_attr __user *uptr' > directly like get|put_user(.., &uptr->config) > then 32-bit user space with 4-byte aligned u64s would cause > 64-bit kernel to trap on archs like sparc. But surely archs

Re: [PATCH 1/6] perf: Add new type PERF_TYPE_PROBE

2017-11-24 Thread Peter Zijlstra
On Thu, Nov 23, 2017 at 10:31:29PM -0800, Alexei Starovoitov wrote: > unfortunately 32-bit is more screwed than it seems: > > $ cat align.c > #include > > struct S { > unsigned long long a; > } s; > > struct U { > unsigned long long a; > } u; > > int main() > { > printf("%d,
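
The alignment concern raised in this message can be checked with a few lines of C11. The snippet below is a hedged sketch rather than a reconstruction of the truncated align.c: demo_plain, demo_aligned and the helper functions are hypothetical, and the only claim it relies on is that a 64-bit member inside a struct is 4-byte aligned on the i386 ABI unless an explicit alignment is requested, which is what the kernel's __aligned_u64 type does.

```c
#include <assert.h>
#include <stdalign.h>

/* Hypothetical layouts for illustration; neither is a real UAPI struct. */
struct demo_plain {
    unsigned long long a;            /* ABI-dependent: 4 bytes on i386 */
};

struct demo_aligned {
    alignas(8) unsigned long long a; /* forced to 8 bytes on every ABI */
};

/* Holds at compile time regardless of the target ABI. */
static_assert(alignof(struct demo_aligned) == 8,
              "explicitly aligned member must be 8-byte aligned");

/* Alignment the compiler picked for the unannotated struct. */
static inline unsigned demo_plain_align(void)
{
    return (unsigned)alignof(struct demo_plain);
}

/* Alignment of the struct whose member carries alignas(8). */
static inline unsigned demo_aligned_align(void)
{
    return (unsigned)alignof(struct demo_aligned);
}
```

Built with -m32 on x86, demo_plain_align() is expected to report 4 while demo_aligned_align() still reports 8; on a 64-bit build both report 8, which is why the mismatch only shows up for 32-bit user space talking to a 64-bit kernel.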

Re: [PATCH 1/6] perf: Add new type PERF_TYPE_PROBE

2017-11-23 Thread Peter Zijlstra
On Wed, Nov 15, 2017 at 09:23:33AM -0800, Song Liu wrote: > A new perf type PERF_TYPE_PROBE is added to allow creating [k,u]probe > with perf_event_open. These [k,u]probe are associated with the file > descriptor created by perf_event_open, thus are easy to clean when > the file descriptor is

Re: [PATCH 3/6] perf: implement kprobe support to PERF_TYPE_PROBE

2017-11-23 Thread Peter Zijlstra
On Wed, Nov 15, 2017 at 09:23:36AM -0800, Song Liu wrote: > +int perf_probe_init(struct perf_event *p_event) > +{ > + __aligned_u64 aligned_probe_desc; > + > + /* > + * attr.probe_desc may not be 64-bit aligned on 32-bit systems. > + * Make an aligned copy of it to before

Re: [PATCH 1/6] perf: Add new type PERF_TYPE_PROBE

2017-11-23 Thread Peter Zijlstra
On Wed, Nov 15, 2017 at 09:23:33AM -0800, Song Liu wrote: > Note: We use type __u64 for pointer probe_desc instead of __aligned_u64. > The reason here is to avoid changing the size of struct perf_event_attr, > and breaking new-kernel-old-utility scenario. To avoid alignment problem > with the

Re: [PATCH 0/6] enable creating [k,u]probe with perf_event_open

2017-11-23 Thread Peter Zijlstra
On Thu, Nov 23, 2017 at 01:02:00AM -0800, Christoph Hellwig wrote: > Just curious: why do you want to overload a multiplexer syscall even > more instead of adding explicit syscalls? Mostly because perf provides much of what they already want; fd-based lifetime and bpf integration.

Re: [PATCH net-next 2/8] rtnetlink: add rtnl_register_module

2017-11-12 Thread Peter Zijlstra
On Mon, Nov 13, 2017 at 08:21:59AM +0100, Florian Westphal wrote: > Reason is that some places do this: > > rtnl_register(pf, RTM_FOO, doit, NULL, 0); > rtnl_register(pf, RTM_FOO, NULL, dumpit, 0); Sure, however, > (from different call sites in the stack). > > - if (doit) > > -

Re: [PATCH 0/2][v5] Add the ability to do BPF directed error injection

2017-11-10 Thread Peter Zijlstra
On Wed, Nov 08, 2017 at 06:43:25AM +0900, Alexei Starovoitov wrote: > On 11/8/17 5:28 AM, Josef Bacik wrote: > > I'm sending this through Dave since it'll conflict with other BPF changes > > in his > > tree, but since it touches tracing as well Dave would like a review from > > somebody on the

Re: [PATCH v3] scripts: add leaking_addresses.pl

2017-11-08 Thread Peter Zijlstra
On Tue, Nov 07, 2017 at 05:44:13PM -0500, Steven Rostedt wrote: > On Tue, 7 Nov 2017 13:44:01 -0800 > Linus Torvalds wrote: > > > > Looking other places that stand out, it seems like > > > /proc/lockdep_chains and /proc/lockdep (CONFIG_LOCKDEP=y) has a ton of > > >

Re: [PATCH net-next 2/8] rtnetlink: add rtnl_register_module

2017-11-07 Thread Peter Zijlstra
On Tue, Nov 07, 2017 at 10:47:51AM +0100, Florian Westphal wrote: > I would expect this to trigger all the time, due to > > rtnl_register(AF_INET, RTM_GETROUTE, ... > rtnl_register(AF_INET, RTM_GETADDR, ... Ah, sure, then something like so then... There's bound to be bugs there too, as I pretty

Re: [PATCH net-next 2/8] rtnetlink: add rtnl_register_module

2017-11-07 Thread Peter Zijlstra
On Tue, Nov 07, 2017 at 10:10:04AM +0100, Peter Zijlstra wrote: > On Tue, Nov 07, 2017 at 07:11:56AM +0100, Florian Westphal wrote: > > Peter Zijlstra <pet...@infradead.org> wrote: > > > On Mon, Nov 06, 2017 at 11:51:07AM +0100, Florian Westphal wrote: > >

Re: [PATCH net-next 2/8] rtnetlink: add rtnl_register_module

2017-11-07 Thread Peter Zijlstra
On Tue, Nov 07, 2017 at 07:11:56AM +0100, Florian Westphal wrote: > Peter Zijlstra <pet...@infradead.org> wrote: > > On Mon, Nov 06, 2017 at 11:51:07AM +0100, Florian Westphal wrote: > > > @@ -180,6 +164,12 @@ int __rtnl_register(int protocol, int msgtype, > >

Re: [PATCH net-next 2/8] rtnetlink: add rtnl_register_module

2017-11-06 Thread Peter Zijlstra
On Mon, Nov 06, 2017 at 11:51:07AM +0100, Florian Westphal wrote: > @@ -180,6 +164,12 @@ int __rtnl_register(int protocol, int msgtype, > rcu_assign_pointer(rtnl_msg_handlers[protocol], tab); > } > > + WARN_ON(tab[msgindex].owner && tab[msgindex].owner != owner); > + > +

Re: linux-next: manual merge of the tip tree with the net-next tree

2017-11-01 Thread Peter Zijlstra
On Wed, Nov 01, 2017 at 09:27:43AM +0100, Ingo Molnar wrote: > > * Peter Zijlstra <pet...@infradead.org> wrote: > > > On Wed, Nov 01, 2017 at 06:15:54PM +1100, Stephen Rothwell wrote: > > > Hi all, > > > > > > Today's linux-next merge of the t

Re: linux-next: manual merge of the tip tree with the net-next tree

2017-11-01 Thread Peter Zijlstra
On Wed, Nov 01, 2017 at 06:15:54PM +1100, Stephen Rothwell wrote: > Hi all, > > Today's linux-next merge of the tip tree got a conflict in: > > kernel/trace/bpf_trace.c > > between commits: > > 97562633bcba ("bpf: perf event change needed for subsequent bpf helpers") > and more changes ...

Re: [PATCH net-next] bpf: avoid rcu_dereference inside bpf_event_mutex lock region

2017-10-31 Thread Peter Zijlstra
On Mon, Oct 30, 2017 at 01:50:22PM -0700, Yonghong Song wrote: > Could you check whether the below change to remove rcu_dereference_protected > is what you wanted or not? Yep that looks fine. Thanks! > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c > index b65011d..e7685c5

Re: [PATCH net-next 2/3] bpf: permit multiple bpf attachments for a single perf event

2017-10-26 Thread Peter Zijlstra
On Mon, Oct 23, 2017 at 10:58:04AM -0700, Yonghong Song wrote: > This patch enables multiple bpf attachments for a > kprobe/uprobe/tracepoint single trace event. This forgets to explain _why_ this is a good thing to do. > +static DEFINE_MUTEX(bpf_event_mutex); > + > +int

Re: problem with rtnetlink 'reference' count

2017-10-24 Thread Peter Zijlstra
On Mon, Oct 23, 2017 at 09:37:03PM +0200, Florian Westphal wrote: > > OK, so then why not do something like so? > > @@ -260,10 +259,18 @@ void rtnl_unregister_all(int protocol) > > RCU_INIT_POINTER(rtnl_msg_handlers[protocol], NULL); > > rtnl_unlock(); > > > > + /* > > +* XXX

Re: problem with rtnetlink 'reference' count

2017-10-23 Thread Peter Zijlstra
On Mon, Oct 23, 2017 at 06:37:44PM +0200, Florian Westphal wrote: > Is refcount_t only supposed to be used with dec_and_test patterns? Yes, for reference counting objects. > > This rtnetlink_rcv_msg() is called from softirq-context, right? Also, > > all that stuff happens with rcu_read_lock()

Re: problem with rtnetlink 'reference' count

2017-10-23 Thread Peter Zijlstra
On Mon, Oct 23, 2017 at 05:32:00PM +0200, Florian Westphal wrote: > > 1) it is not in fact a refcount, so using refcount_t is silly > > Your suggestion is...? Normal atomic_t > > 2) there is a distinct lack of memory barriers, so we can easily > > observe the decrement while the msg_handler

problem with rtnetlink 'reference' count

2017-10-23 Thread Peter Zijlstra
Hi, I just ran across commit: 019a316992ee ("rtnetlink: add reference counting to prevent module unload while dump is in progress") And that commit is _completely_ broken. 1) it is not in fact a refcount, so using refcount_t is silly 2) there is a distinct lack of memory barriers, so we can

Re: [PATCH RFC tip/core/rcu 14/15] netfilter: Remove now-redundant smp_read_barrier_depends()

2017-10-10 Thread Peter Zijlstra
On Mon, Oct 09, 2017 at 05:22:48PM -0700, Paul E. McKenney wrote: > READ_ONCE() now implies smp_read_barrier_depends(), which means that > the instances in arpt_do_table(), ipt_do_table(), and ip6t_do_table() > are now redundant. This commit removes them and adjusts the comments. Similar to the

Re: [PATCH net-next v7 1/5] bpf: perf event change needed for subsequent bpf helpers

2017-10-06 Thread Peter Zijlstra
: Yonghong Song <y...@fb.com> Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org> And as discussed, I'll take this patch into my dev tree while Dave will take all of them into the network tree.

Re: [PATCH net-next v6 0/4] bpf: add two helpers to read perf event enabled/running time

2017-10-05 Thread Peter Zijlstra
On Wed, Oct 04, 2017 at 04:00:56PM -0700, David Miller wrote: > From: Yonghong Song > Date: Mon, 2 Oct 2017 15:42:14 -0700 > > > [Dave, Peter, > > > > Previous communication shows that this patch may potentially have > > merge conflict with upcoming tip changes in the next merge

Re: [PATCH net] bpf: one perf event close won't free bpf program attached by another perf event

2017-09-21 Thread Peter Zijlstra
On Wed, Sep 20, 2017 at 10:20:13PM -0700, Yonghong Song wrote: > > (2). trace_event_call->perf_events are per cpu data structure, that > > means, some filtering logic is needed to avoid the same perf_event prog > > is executing twice. > > What I mean here is that the trace_event_call->perf_events

Re: [PATCH net-next v5 1/4] bpf: add helper bpf_perf_event_read_value for perf event array map

2017-09-20 Thread Peter Zijlstra
On Tue, Sep 19, 2017 at 11:09:32PM -0700, Yonghong Song wrote: > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 3e691b7..2d5bbe5 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -3684,10 +3684,12 @@ static inline u64 perf_event_count(struct perf_event >

Re: [PATCH v2 net-next 1/4] bpf: add helper bpf_perf_read_counter_time for perf event array map

2017-09-04 Thread Peter Zijlstra
On Fri, Sep 01, 2017 at 10:48:21PM -0700, Yonghong Song wrote: > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h > index b14095b..5a50808 100644 > --- a/include/linux/perf_event.h > +++ b/include/linux/perf_event.h > @@ -898,7 +898,8 @@ perf_event_create_kernel_counter(struct

Re: [PATCH net-next 1/4] bpf: add helper bpf_perf_read_counter_time for perf event array map

2017-09-01 Thread Peter Zijlstra
On Fri, Sep 01, 2017 at 01:29:17PM -0700, Alexei Starovoitov wrote: > >+BPF_CALL_4(bpf_perf_read_counter_time, struct bpf_map *, map, u64, flags, > >+struct bpf_perf_counter_time *, buf, u32, size) > >+{ > >+struct perf_event *pe; > >+u64 now; > >+int err; > >+ > >+if

Re: [PATCH net-next 1/4] bpf: add helper bpf_perf_read_counter_time for perf event array map

2017-09-01 Thread Peter Zijlstra
On Fri, Sep 01, 2017 at 09:53:54AM -0700, Yonghong Song wrote: > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h > index b14095b..7fd5e94 100644 > --- a/include/linux/perf_event.h > +++ b/include/linux/perf_event.h > @@ -901,6 +901,8 @@ extern void

Re: [PATCH net-next v2 1/2] bpf: add support for sys_enter_* and sys_exit_* tracepoints

2017-08-03 Thread Peter Zijlstra
On Wed, Aug 02, 2017 at 10:28:27PM -0700, Yonghong Song wrote: > Currently, bpf programs cannot be attached to sys_enter_* and sys_exit_* > style tracepoints. The iovisor/bcc issue #748 > (https://github.com/iovisor/bcc/issues/748) documents this issue. > For example, if you try to attach a bpf

Re: [PATCH net-next 1/2] bpf: add support for sys_{enter|exit}_* tracepoints

2017-08-02 Thread Peter Zijlstra
On Tue, Aug 01, 2017 at 11:30:04PM -0700, Yonghong Song wrote: > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 426c2ff..623c977 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -8050,7 +8050,7 @@ static void perf_event_free_bpf_handler(struct > perf_event

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-07 Thread Peter Zijlstra
On Fri, Jul 07, 2017 at 12:33:49PM +0200, Ingo Molnar wrote: > [1997/04] v2.1.36: > > the spin_unlock_wait() primitive gets introduced as part of release() Whee, that goes _way_ further back than I thought it did :-) > [2017/07] v4.12: > > wait_task_inactive() is still alive

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-07 Thread Peter Zijlstra
On Fri, Jul 07, 2017 at 10:31:28AM +0200, Ingo Molnar wrote: > Here's a quick list of all the use cases: > > net/netfilter/nf_conntrack_core.c: > >- This is I believe the 'original', historic spin_unlock_wait() usecase > that > still exists in the kernel. spin_unlock_wait() is only

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-06 Thread Peter Zijlstra
On Thu, Jul 06, 2017 at 12:49:12PM -0400, Alan Stern wrote: > On Thu, 6 Jul 2017, Paul E. McKenney wrote: > > > On Thu, Jul 06, 2017 at 06:10:47PM +0200, Peter Zijlstra wrote: > > > On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote: > > > > And yes

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-06 Thread Peter Zijlstra
On Thu, Jul 06, 2017 at 09:20:24AM -0700, Paul E. McKenney wrote: > On Thu, Jul 06, 2017 at 06:05:55PM +0200, Peter Zijlstra wrote: > > On Thu, Jul 06, 2017 at 02:12:24PM +, David Laight wrote: > > > From: Paul E. McKenney > > [ . . . ] > > > Now

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-06 Thread Peter Zijlstra
On Thu, Jul 06, 2017 at 09:24:12AM -0700, Paul E. McKenney wrote: > On Thu, Jul 06, 2017 at 06:10:47PM +0200, Peter Zijlstra wrote: > > On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote: > > > And yes, there are architecture-specific optimizations for an >

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-06 Thread Peter Zijlstra
On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote: > And yes, there are architecture-specific optimizations for an > empty spin_lock()/spin_unlock() critical section, and the current > arch_spin_unlock_wait() implementations show some of these optimizations. > But I expect that

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-06 Thread Peter Zijlstra
On Thu, Jul 06, 2017 at 02:12:24PM +, David Laight wrote: > From: Paul E. McKenney > > Sent: 06 July 2017 00:30 > > There is no agreed-upon definition of spin_unlock_wait()'s semantics, > > and it appears that all callers could do just as well with a lock/unlock > > pair. This series

Re: [PATCH v3 net-next 1/3] perf, bpf: Add BPF support to all perf_event types

2017-06-02 Thread Peter Zijlstra
On Thu, Jun 01, 2017 at 07:03:34PM -0700, Alexei Starovoitov wrote: > diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c > index 172dc8ee0e3b..ab93490d1a00 100644 > --- a/kernel/bpf/arraymap.c > +++ b/kernel/bpf/arraymap.c > @@ -452,39 +452,18 @@ static void bpf_event_entry_free_rcu(struct
