On Mon, 2019-08-26 at 14:17 -0700, Palmer Dabbelt wrote:
> On Sun, 18 Aug 2019 21:49:01 PDT (-0700), a...@brainfault.org wrote:
> > On Sun, Aug 18, 2019 at 11:49 PM Christoph Hellwig <
> > h...@infradead.org> wrote:
> > > > +#define FIXADDR_TOP (VMALLOC_START)
> > >
> > > Nit: no need for
On Sun, 18 Aug 2019 21:49:01 PDT (-0700), a...@brainfault.org wrote:
On Sun, Aug 18, 2019 at 11:49 PM Christoph Hellwig wrote:
> +#define FIXADDR_TOP (VMALLOC_START)
Nit: no need for the braces, the definitions below don't use it
either.
Sure, I will update and send v2 soon.
>
Palmer, Paul - are you going to pick this up? Seems like we've just
missed -rc6.
On Wed, 2019-08-21 at 20:52 +0300, Andrey Ryabinin wrote:
>
> On 8/20/19 8:37 AM, Walter Wu wrote:
> > On Tue, 2019-08-06 at 13:43 +0800, Walter Wu wrote:
> >> This patch adds memory corruption identification at bug report for
> >> software tag-based mode, the report
From: Walter Wu
This patch adds memory corruption identification at bug report for
software tag-based mode, the report shows whether it is "use-after-free"
or "out-of-bound" error instead of "invalid-access" error. This will make
it easier for programmers to see the
On 8/20/19 8:37 AM, Walter Wu wrote:
> On Tue, 2019-08-06 at 13:43 +0800, Walter Wu wrote:
>> This patch adds memory corruption identification at bug report for
>> software tag-based mode, the report shows whether it is "use-after-free"
>> or "out-of-bou
On Tue, 2019-08-06 at 13:43 +0800, Walter Wu wrote:
> This patch adds memory corruption identification at bug report for
> software tag-based mode, the report shows whether it is "use-after-free"
> or "out-of-bound" error instead of "invalid-access" error. Thi
FDT in the FIXMAP area.
On RV64 systems, TASK_SIZE is set to a fixed 256GB and no other areas
happen to overlap so we don't see any FIXMAP area corruptions.
This patch fixes FIXMAP area corruption on RV32 systems by setting
TASK_SIZE to FIXADDR_START. We also move FIXADDR_TOP, FIXADDR_SIZE
On Sun, Aug 18, 2019 at 11:49 PM Christoph Hellwig wrote:
>
> > +#define FIXADDR_TOP (VMALLOC_START)
>
> Nit: no need for the braces, the definitions below don't use it
> either.
Sure, I will update and send v2 soon.
>
> > +#ifdef CONFIG_64BIT
> > +#define FIXADDR_SIZE PMD_SIZE
> >
> +#define FIXADDR_TOP (VMALLOC_START)
Nit: no need for the braces, the definitions below don't use it
either.
> +#ifdef CONFIG_64BIT
> +#define FIXADDR_SIZE PMD_SIZE
> +#else
> +#define FIXADDR_SIZE PGDIR_SIZE
> +#endif
> +#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
> +
>
will crash
> whenever they access corrupted FDT in the FIXMAP area.
>
> On RV64 systems, TASK_SIZE is set to a fixed 256GB and no other areas
> happen to overlap so we don't see any FIXMAP area corruptions.
>
> This patch fixes FIXMAP area corruption on RV32 systems by setting
>
happen to overlap so we don't see any FIXMAP area corruptions.
This patch fixes FIXMAP area corruption on RV32 systems by setting
TASK_SIZE to FIXADDR_START. We also move FIXADDR_TOP, FIXADDR_SIZE,
and FIXADDR_START defines to asm/pgtable.h so that we can avoid cyclic
header includes.
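
The address arithmetic behind this description can be checked in a few lines. What follows is a minimal userspace sketch, not kernel code: the VMALLOC_START and PGDIR_SIZE values are made-up placeholders (real values depend on the kernel configuration), and ranges_overlap() and user_overlaps_fixmap() are hypothetical helpers. It only illustrates that once TASK_SIZE is capped at FIXADDR_START, the user range [0, TASK_SIZE) can no longer reach into the FIXMAP window.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not taken from any real config. */
#define VMALLOC_START 0x9fc00000u
#define PGDIR_SIZE    0x00400000u   /* 4 MiB, assumed for this sketch */

/* Same shape as the defines in the quoted patch (32-bit branch). */
#define FIXADDR_TOP   VMALLOC_START
#define FIXADDR_SIZE  PGDIR_SIZE
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)

/* Hypothetical helper: do half-open ranges [a0, a1) and [b0, b1) intersect? */
bool ranges_overlap(uint32_t a0, uint32_t a1, uint32_t b0, uint32_t b1)
{
    return a0 < b1 && b0 < a1;
}

/* Does user space [0, task_size) reach into the FIXMAP window? */
bool user_overlaps_fixmap(uint32_t task_size)
{
    return ranges_overlap(0, task_size, FIXADDR_START, FIXADDR_TOP);
}
```

With these placeholder numbers, user_overlaps_fixmap(FIXADDR_START) is false, while any larger task size makes it true.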
Signed-off
result in data
corruption.
However it decides not to serialize if the potentially unaligned aio is
past i_size with the rationale that no pending writes are possible past
i_size. Unfortunately if the i_size is not block aligned and the second
unaligned write lands past i_size, but still
This patch adds memory corruption identification at bug report for
software tag-based mode, the report shows whether it is "use-after-free"
or "out-of-bound" error instead of "invalid-access" error. This will make
it easier for programmers to see the memory corruptio
ixed the issue.
>>>
>>> aefde94195ca mm: thp: make deferred split shrinker memcg aware [1]
>>>
>>> [1]
>>> https://lore.kernel.org/linux-mm/1561507361-59349-5-git-send-email-yang.shi@
>>> linux.alibaba.com/
>>
>>
real meat of the patch series, which converted to memcg
deferred split queue actually.
list_del corruption. prev->next should be ea0022b10098, but was
Finally I could reproduce the list corruption issue on my machine with
THP swap (swap device is fast device). I s
From: Filipe Manana
commit cb2d3daddbfb6318d170e79aac1f7d5e4d49f0d7 upstream.
When one transaction is finishing its commit, it is possible for another
transaction to start and enter its initial commit phase as well. If the
first ends up getting aborted, we have a small time window where the
all exact and inexact policies instead of
zeroing the list heads.
Add the commands equivalent to the syzbot reproducer to xfrm_policy.sh;
without the fix, KASAN catches the corruption as it happens, and SLUB
poisoning detects it a bit later.
Reported-by: syzbot+0165480d4ef07360e...@syzkaller.appspotmail.com
Fixes
On Wed, 2019-07-31 at 20:04 +0300, Andrey Ryabinin wrote:
>
> On 7/26/19 4:19 PM, Walter Wu wrote:
> > On Fri, 2019-07-26 at 15:52 +0300, Andrey Ryabinin wrote:
> >>
> >> On 7/26/19 3:28 PM, Walter Wu wrote:
> >>> On Fri, 2019-07-26 at 15:00 +0300, Andrey Ryabinin wrote:
>
> >>>
> >
>
On 7/26/19 4:19 PM, Walter Wu wrote:
> On Fri, 2019-07-26 at 15:52 +0300, Andrey Ryabinin wrote:
>>
>> On 7/26/19 3:28 PM, Walter Wu wrote:
>>> On Fri, 2019-07-26 at 15:00 +0300, Andrey Ryabinin wrote:
>>>
>
>
> I remember that there is already a list of your concerns.
On Fri, 2019-07-26 at 15:52 +0300, Andrey Ryabinin wrote:
>
> On 7/26/19 3:28 PM, Walter Wu wrote:
> > On Fri, 2019-07-26 at 15:00 +0300, Andrey Ryabinin wrote:
> >>
> >
> >>>
> >>>
> >>> I remember that there is already a list of your concerns. Maybe we
> >>> can try to solve those problems
On 7/26/19 3:28 PM, Walter Wu wrote:
> On Fri, 2019-07-26 at 15:00 +0300, Andrey Ryabinin wrote:
>>
>
>>>
>>>
>>> I remember that there is already a list of your concerns. Maybe we
>>> can try to solve those problems one by one.
>>>
>>> 1. deadlock issue? caused by kmalloc() after kfree()?
On Fri, 2019-07-26 at 15:00 +0300, Andrey Ryabinin wrote:
>
> On 7/22/19 12:52 PM, Walter Wu wrote:
> > On Thu, 2019-07-18 at 19:11 +0300, Andrey Ryabinin wrote:
> >>
> >> On 7/15/19 6:06 AM, Walter Wu wrote:
> >>> On Fri, 2019-07-12 at 13:52 +0300, Andrey Ryabinin wrote:
>
> On 7/11/19
On 7/22/19 12:52 PM, Walter Wu wrote:
> On Thu, 2019-07-18 at 19:11 +0300, Andrey Ryabinin wrote:
>>
>> On 7/15/19 6:06 AM, Walter Wu wrote:
>>> On Fri, 2019-07-12 at 13:52 +0300, Andrey Ryabinin wrote:
On 7/11/19 1:06 PM, Walter Wu wrote:
> On Wed, 2019-07-10 at 21:24 +0300,
converted to memcg
deferred split queue actually.
list_del corruption. prev->next should be ea0022b10098, but was
Finally I could reproduce the list corruption issue on my machine with
THP swap (swap device is fast device). I should have checked this with you at
th
2976kB
managed:18893712kB mlocked:0kB kernel_stack:22240kB pagetables:10372kB
bounce:0kB free_pcp:12848kB local_pcp:36kB free_cma:0kB
[ 666.372602][ T3141] lowmem_reserve[]: 0 0 0 0 0
[ 666.377419][ T3141] Node 4 Normal free:234488kB m[ 685.274656][ T3456]
list_del corruption. prev->nex
ruct page *page)
> > > spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> > > if (!list_empty(page_deferred_list(page))) {
> > > ds_queue->split_queue_len--;
> > > - list_del(page_deferred_list(
On Wed, 24 Jul 2019, kernel test robot wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git
> master
>
> commit a0d14b8909de55139b8702fe0c7e80b69763dcfb
> Author:
: Thomas Gleixner
CommitDate: Wed Jul 17 23:17:38 2019 +0200
x86/mm, tracing: Fix CR2 corruption
Despite the current efforts to read CR2 before tracing happens there still
exist a number of possible holes:
idtentry page_fault do_page_fault
On Thu, 2019-07-18 at 19:11 +0300, Andrey Ryabinin wrote:
>
> On 7/15/19 6:06 AM, Walter Wu wrote:
> > On Fri, 2019-07-12 at 13:52 +0300, Andrey Ryabinin wrote:
> >>
> >> On 7/11/19 1:06 PM, Walter Wu wrote:
> >>> On Wed, 2019-07-10 at 21:24 +0300, Andrey Ryabinin wrote:
>
> On 7/9/19
eferred_list(page));
}
spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
free_compound_page(page);
Unfortunately, I am no longer able to reproduce the original list corruption
with today’s linux-next.
It is because the patches have been dropped from -mm tree by Andrew due
to
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> if (!list_empty(page_deferred_list(page))) {
> ds_queue->split_queue_len--;
> - list_del(page_deferred_list(page));
> + list_del_init(page_deferred_list(page));
> }
> spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> free_compound_page(page);
Unfortunately, I am no longer able to reproduce the original list corruption
with today’s linux-next.
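
Why the quoted change from list_del() to list_del_init() matters for the list_empty() check can be shown with a small userspace model. This is a sketch, not the kernel's list implementation: the struct and the my_-prefixed helpers are hypothetical stand-ins, and the poison constants are arbitrary. They mimic the behaviour that list_del() leaves the removed entry's pointers poisoned, while list_del_init() re-initialises the entry so that it points at itself.

```c
#include <stdbool.h>
#include <stddef.h>

struct my_list_head { struct my_list_head *next, *prev; };

/* Stand-in poison values for this sketch; chosen arbitrarily. */
#define MY_POISON1 ((struct my_list_head *)0x100)
#define MY_POISON2 ((struct my_list_head *)0x122)

void my_init_list_head(struct my_list_head *h) { h->next = h; h->prev = h; }

/* An entry (or head) is "empty" when it points at itself. */
bool my_list_empty(const struct my_list_head *h) { return h->next == h; }

/* Insert n right after head. */
void my_list_add(struct my_list_head *n, struct my_list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink e from its neighbours without touching e's own pointers. */
void my_unlink(struct my_list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

/* Remove and poison: a later my_list_empty(e) still reports "not empty". */
void my_list_del(struct my_list_head *e)
{
    my_unlink(e);
    e->next = MY_POISON1;
    e->prev = MY_POISON2;
}

/* Remove and re-initialise: a later my_list_empty(e) reports "empty". */
void my_list_del_init(struct my_list_head *e)
{
    my_unlink(e);
    my_init_list_head(e);
}
```

In this model, code of the form "if (!my_list_empty(e)) my_list_del(e);" would run the delete a second time on an already removed entry, because the poisoned entry does not look empty. After my_list_del_init() the same guard is false and the second delete is skipped.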
On 7/15/19 6:06 AM, Walter Wu wrote:
> On Fri, 2019-07-12 at 13:52 +0300, Andrey Ryabinin wrote:
>>
>> On 7/11/19 1:06 PM, Walter Wu wrote:
>>> On Wed, 2019-07-10 at 21:24 +0300, Andrey Ryabinin wrote:
On 7/9/19 5:53 AM, Walter Wu wrote:
> On Mon, 2019-07-08 at 19:33 +0300, Andrey
[ Upstream commit 2eba4e640b2c4161e31ae20090a53ee02a518657 ]
DM verity should also use DMERR_LIMIT to limit repeat data block
corruption messages.
Signed-off-by: Milan Broz
Signed-off-by: Mike Snitzer
Signed-off-by: Sasha Levin
---
drivers/md/dm-verity.c | 4 ++--
1 file changed, 2
[ Upstream commit 2eba4e640b2c4161e31ae20090a53ee02a518657 ]
DM verity should also use DMERR_LIMIT to limit repeat data block
corruption messages.
Signed-off-by: Milan Broz
Signed-off-by: Mike Snitzer
Signed-off-by: Sasha Levin
---
drivers/md/dm-verity-target.c | 4 ++--
1 file changed, 2
: Fix CR2 corruption
Despite the current efforts to read CR2 before tracing happens there still
exist a number of possible holes:
idtentry page_fault do_page_fault has_error_code=1
call error_entry
TRACE_IRQS_OFF
call trace_hardirqs_off*
#PF
On 7/17/19 10:02 AM, Shakeel Butt wrote:
On Tue, Jul 16, 2019 at 5:12 PM Yang Shi wrote:
On 7/16/19 4:36 PM, Shakeel Butt wrote:
Adding related people.
The thread starts at:
http://lkml.kernel.org/r/1562795006.8510.19.ca...@lca.pw
On Mon, Jul 15, 2019 at 8:01 PM Yang Shi wrote:
On
On Tue, Jul 16, 2019 at 5:12 PM Yang Shi wrote:
>
>
>
> On 7/16/19 4:36 PM, Shakeel Butt wrote:
> > Adding related people.
> >
> > The thread starts at:
> > http://lkml.kernel.org/r/1562795006.8510.19.ca...@lca.pw
> >
> > On Mon, Jul 15, 2019 at 8:01 PM Yang Shi wrote:
> >>
> >>
> >> On 7/15/19
On 7/16/19 4:36 PM, Shakeel Butt wrote:
Adding related people.
The thread starts at:
http://lkml.kernel.org/r/1562795006.8510.19.ca...@lca.pw
On Mon, Jul 15, 2019 at 8:01 PM Yang Shi wrote:
On 7/15/19 6:36 PM, Qian Cai wrote:
On Jul 15, 2019, at 8:22 PM, Yang Shi wrote:
On 7/15/19
Adding related people.
The thread starts at:
http://lkml.kernel.org/r/1562795006.8510.19.ca...@lca.pw
On Mon, Jul 15, 2019 at 8:01 PM Yang Shi wrote:
>
>
>
> On 7/15/19 6:36 PM, Qian Cai wrote:
> >
> >> On Jul 15, 2019, at 8:22 PM, Yang Shi wrote:
> >>
> >>
> >>
> >> On 7/15/19 2:23 PM, Qian
On 7/15/19 6:36 PM, Qian Cai wrote:
On Jul 15, 2019, at 8:22 PM, Yang Shi wrote:
On 7/15/19 2:23 PM, Qian Cai wrote:
On Fri, 2019-07-12 at 12:12 -0700, Yang Shi wrote:
Another possible lead is that without reverting those commits below, the
kdump kernel would also always crash in
> On Jul 15, 2019, at 8:22 PM, Yang Shi wrote:
>
>
>
> On 7/15/19 2:23 PM, Qian Cai wrote:
>> On Fri, 2019-07-12 at 12:12 -0700, Yang Shi wrote:
Another possible lead is that without reverting those commits below, the
kdump kernel would also always crash in
On 7/15/19 2:23 PM, Qian Cai wrote:
On Fri, 2019-07-12 at 12:12 -0700, Yang Shi wrote:
Another possible lead is that without reverting those commits below, the
kdump kernel would also always crash in shrink_slab_memcg() at this line,
map =
On Fri, 2019-07-12 at 12:12 -0700, Yang Shi wrote:
> > Another possible lead is that without reverting those commits below, the
> > kdump kernel would also always crash in shrink_slab_memcg() at this line,
> >
> > map = rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_map, true);
>
>
memcg kmem
1c0af4b86bcf mm: move mem_cgroup_uncharge out of __page_cache_release()
4e050f2df876 mm: thp: extract split_queue_* into a struct
[1]
https://lore.kernel.org/linux-mm/1561507361-59349-1-git-send-email-yang@linux.alibaba.com/
[ 1145.730682][ T5764] list_del corruption, ea00251c
On Fri, 2019-07-12 at 13:52 +0300, Andrey Ryabinin wrote:
>
> On 7/11/19 1:06 PM, Walter Wu wrote:
> > On Wed, 2019-07-10 at 21:24 +0300, Andrey Ryabinin wrote:
> >>
> >> On 7/9/19 5:53 AM, Walter Wu wrote:
> >>> On Mon, 2019-07-08 at 19:33 +0300, Andrey Ryabinin wrote:
>
> On 7/5/19
hp: extract split_queue_* into a struct
[1]
https://lore.kernel.org/linux-mm/1561507361-59349-1-git-send-email-yang.
shi@
linux.alibaba.com/
[ 1145.730682][ T5764] list_del corruption, ea00251c8098->next is
LIST_POISON1 (dead0100)
[ 1145.739763][ T5764] [ cut here
linux.alibaba.com/
[ 1145.730682][ T5764] list_del corruption, ea00251c8098->next is
LIST_POISON1 (dead0100)
[ 1145.739763][ T5764] [ cut here ]
[ 1145.745126][ T5764] kernel BUG at lib/list_debug.c:47!
[ 1145.750320][ T5764] invalid opcode: [#1] SMP DEBUG_P
On 7/11/19 1:06 PM, Walter Wu wrote:
> On Wed, 2019-07-10 at 21:24 +0300, Andrey Ryabinin wrote:
>>
>> On 7/9/19 5:53 AM, Walter Wu wrote:
>>> On Mon, 2019-07-08 at 19:33 +0300, Andrey Ryabinin wrote:
On 7/5/19 4:34 PM, Dmitry Vyukov wrote:
> On Mon, Jul 1, 2019 at 11:56 AM Walter
-depend-on-memcg-kmem-fix
> > c9d49e69e887 mm: shrinker: make shrinker not depend on memcg kmem
> > 1c0af4b86bcf mm: move mem_cgroup_uncharge out of __page_cache_release()
> > 4e050f2df876 mm: thp: extract split_queue_* into a struct
> >
> > [1] https://lore.kernel.org/linux
On Wed, Jul 10, 2019 at 04:27:09PM -0400, Steven Rostedt wrote:
> But isn't it easier for them to just pull the quick fix in, if it is in
Steve, I've not yet seen a quick fix that actually fixes all the
problems.
Your initial one only fixes the IRQ tracing one, but leaves the context
tracking
On Wed, Jul 10, 2019 at 04:27:09PM -0400, Steven Rostedt wrote:
[ added stable folks ]
On Sun, 7 Jul 2019 11:17:09 -0700
Linus Torvalds wrote:
On Sun, Jul 7, 2019 at 8:11 AM Andy Lutomirski wrote:
>
> FWIW, I'm leaning toward suggesting that we apply the trivial tracing
> fix and backport
Despite the current efforts to read CR2 before tracing happens there
still exist a number of possible holes:
idtentry page_fault do_page_fault has_error_code=1
call error_entry
TRACE_IRQS_OFF
call trace_hardirqs_off*
#PF // modifies CR2
On Wed, 2019-07-10 at 21:24 +0300, Andrey Ryabinin wrote:
>
> On 7/9/19 5:53 AM, Walter Wu wrote:
> > On Mon, 2019-07-08 at 19:33 +0300, Andrey Ryabinin wrote:
> >>
> >> On 7/5/19 4:34 PM, Dmitry Vyukov wrote:
> >>> On Mon, Jul 1, 2019 at 11:56 AM Walter Wu
> >>> wrote:
>
> >>>
> >>> Sorry for
On Wed, Jul 10, 2019 at 04:27:09PM -0400, Steven Rostedt wrote:
>
> [ added stable folks ]
>
> On Sun, 7 Jul 2019 11:17:09 -0700
> Linus Torvalds wrote:
>
> > On Sun, Jul 7, 2019 at 8:11 AM Andy Lutomirski wrote:
> > >
> > > FWIW, I'm leaning toward suggesting that we apply the trivial
g/linux-mm/1561507361-59349-1-git-send-email-yang.shi@
linux.alibaba.com/
[ 1145.730682][ T5764] list_del corruption, ea00251c8098->next is
LIST_POISON1 (dead0100)
[ 1145.739763][ T5764] [ cut here ]
[ 1145.745126][ T5764] kernel BUG at lib/list_debug.c:47!
p: extract split_queue_* into a struct
[1] https://lore.kernel.org/linux-mm/1561507361-59349-1-git-send-email-yang.shi@
linux.alibaba.com/
[ 1145.730682][ T5764] list_del corruption, ea00251c8098->next is
LIST_POISON1 (dead0100)
[ 1145.739763][ T5764] [ cut here ]
[ added stable folks ]
On Sun, 7 Jul 2019 11:17:09 -0700
Linus Torvalds wrote:
> On Sun, Jul 7, 2019 at 8:11 AM Andy Lutomirski wrote:
> >
> > FWIW, I'm leaning toward suggesting that we apply the trivial tracing
> > fix and backport *that*. Then, in -tip, we could revert it and apply
> >
On 7/9/19 5:53 AM, Walter Wu wrote:
> On Mon, 2019-07-08 at 19:33 +0300, Andrey Ryabinin wrote:
>>
>> On 7/5/19 4:34 PM, Dmitry Vyukov wrote:
>>> On Mon, Jul 1, 2019 at 11:56 AM Walter Wu wrote:
>>>
>>> Sorry for the delays. I am overwhelmed by some urgent work. I am afraid to
>>> promise any dates
int3_emulate_call() selftest stack corruption
KASAN shows the following splat during boot:
BUG: KASAN: unknown-crash in unwind_next_frame+0x3f6/0x490
Read of size 8 at addr 84007db0 by task swapper/0
CPU: 0 PID: 0 Comm: swapper Tainted: GT
5.2.0-rc6-00013-g7457c0d
; {
> > static __initdata struct notifier_block int3_exception_nb = {
> > @@ -676,7 +683,7 @@ static void __init int3_selftest(void)
> > "int3_selftest_ip:\n\t"
> > __ASM_SEL(.long, .quad) " 1b\n\t"
> > ".po
> __ASM_SEL(.long, .quad) " 1b\n\t"
> ".popsection\n\t"
> - : : __ASM_SEL_RAW(a, D) () : "memory");
> + : : __ASM_SEL_RAW(a, D) () : INT3_TEST_CLOBBERS);
>
> BUG_ON(
On 2019/07/08 18:42, Eiichi Tsukata wrote:
>
>
> On 2019/07/08 17:58, Eiichi Tsukata wrote:
>
>>
>> By the way, is there possibility that the WARNING(#GP in execve(2)) which
>> Steven
>> previously hit? :
>> https://lore.kernel.org/lkml/20190321095502.47b51...@gandalf.local.home/
>>
>>
On Mon, 2019-07-08 at 19:33 +0300, Andrey Ryabinin wrote:
>
> On 7/5/19 4:34 PM, Dmitry Vyukov wrote:
> > On Mon, Jul 1, 2019 at 11:56 AM Walter Wu wrote:
> >>>>>>>>> This patch adds memory corruption identification at bug report for
> >>>>
KASAN shows the following splat during boot:
BUG: KASAN: unknown-crash in unwind_next_frame+0x3f6/0x490
Read of size 8 at addr 84007db0 by task swapper/0
CPU: 0 PID: 0 Comm: swapper Tainted: GT
5.2.0-rc6-00013-g7457c0d #1
Hardware name: QEMU Standard PC (i440FX +
On 7/5/19 4:34 PM, Dmitry Vyukov wrote:
> On Mon, Jul 1, 2019 at 11:56 AM Walter Wu wrote:
>>>>>>>>> This patch adds memory corruption identification at bug report for
>>>>>>>>> software tag-based mode, the report shows whether it is
>
On 2019/07/08 17:58, Eiichi Tsukata wrote:
>
> By the way, is there possibility that the WARNING(#GP in execve(2)) which
> Steven
> previously hit? :
> https://lore.kernel.org/lkml/20190321095502.47b51...@gandalf.local.home/
>
> Even if there were, it will *Not* be the bug introduced by
On 2019/07/08 16:48, Peter Zijlstra wrote:
...
>
> Or are we going to put the CR2 save/restore on every single tracepoint?
> But then we also need it on the mcount/fentry stubs and we again have
> multiple places.
>
> Whereas if we stick it in the entry path, like I proposed, we fix it in
>
On Sat, Jul 06, 2019 at 08:07:22PM +0900, Eiichi Tsukata wrote:
>
>
> On 2019/07/05 11:18, Linus Torvalds wrote:
> > On Fri, Jul 5, 2019 at 5:03 AM Peter Zijlstra wrote:
> >>
> >> Despite the current efforts to read CR2 before tracing happens there
> >> still exist a number of possible holes:
>
3.16.70-rc1 review patch. If anyone has any objections, please let me know.
--
From: Filipe Manana
commit 8e928218780e2f1cf2f5891c7575e8f0b284fcce upstream.
In the past we had data corruption when reading compressed extents that
are shared within the same file
ss 2 pages, and is sized
so that the next directory entry doesn't fit in the requested size,
then memory corruption can happen.
When encode_entry() is called after encoding the last entry that fits,
it notices that ->offset and ->offset1 are set, and so stores the
offset value in the two p
On Sun, Jul 7, 2019 at 8:11 AM Andy Lutomirski wrote:
>
> FWIW, I'm leaning toward suggesting that we apply the trivial tracing
> fix and backport *that*. Then, in -tip, we could revert it and apply
> this patch instead.
You don't have to have the same fix in stable as in -tip.
It's fine to
On Sun, Jul 7, 2019 at 8:10 AM Andy Lutomirski wrote:
>
> On Thu, Jul 4, 2019 at 1:03 PM Peter Zijlstra wrote:
> >
> > Despite the current efforts to read CR2 before tracing happens there
> > still exist a number of possible holes:
> >
> > idtentry page_fault do_page_fault
On Thu, Jul 4, 2019 at 1:03 PM Peter Zijlstra wrote:
>
> Despite the current efforts to read CR2 before tracing happens there
> still exist a number of possible holes:
>
> idtentry page_fault do_page_fault has_error_code=1
> call error_entry
> TRACE_IRQS_OFF
>
On 2019/07/07 7:27, Steven Rostedt wrote:
>
> We also have to deal with reading vmalloc'd data as that can fault too.
> The perf ring buffer IIUC is vmalloc, so if perf records in one of
> these locations, then the reading of the vmalloc area has a potential
> to fault corrupting the CR2
> On Jul 6, 2019, at 6:08 PM, Linus Torvalds
> wrote:
>
> On Sat, Jul 6, 2019 at 3:41 PM Linus Torvalds
> wrote:
>>
>>> On Sat, Jul 6, 2019 at 3:27 PM Steven Rostedt wrote:
>>>
>>> We also have to deal with reading vmalloc'd data as that can fault too.
>>
>> Ahh, that may be a better
On Sat, Jul 6, 2019 at 3:41 PM Linus Torvalds
wrote:
>
> On Sat, Jul 6, 2019 at 3:27 PM Steven Rostedt wrote:
> >
> > We also have to deal with reading vmalloc'd data as that can fault too.
>
> Ahh, that may be a better reason for PeterZ's patches and reading cr2
> very early from asm code than
he
> > > read_cr2() earlier; notably:
> > >
> > > 0ac09f9f8cd1 ("x86, trace: Fix CR2 corruption when tracing page faults")
> > > d4078e232267 ("x86, trace: Further robustify CR2 handling vs tracing")
> >
> > I think both of thos
On Sat, Jul 6, 2019 at 3:27 PM Steven Rostedt wrote:
>
> We also have to deal with reading vmalloc'd data as that can fault too.
Ahh, that may be a better reason for PeterZ's patches and reading cr2
very early from asm code than the stack trace case. It's why the page
fault handler delayed
On Sat, 6 Jul 2019 14:41:22 -0700
Linus Torvalds wrote:
> On Fri, Jul 5, 2019 at 6:50 AM Peter Zijlstra wrote:
> >
> > Also; all previous attempts at fixing this have been about pushing the
> > read_cr2() earlier; notably:
> >
> > 0ac09f9f8cd1 ("x86,
On Fri, Jul 5, 2019 at 6:50 AM Peter Zijlstra wrote:
>
> Also; all previous attempts at fixing this have been about pushing the
> read_cr2() earlier; notably:
>
> 0ac09f9f8cd1 ("x86, trace: Fix CR2 corruption when tracing page faults")
> d4078e232267 ("
On 2019/07/05 11:18, Linus Torvalds wrote:
> On Fri, Jul 5, 2019 at 5:03 AM Peter Zijlstra wrote:
>>
>> Despite the current efforts to read CR2 before tracing happens there
>> still exist a number of possible holes:
>
> So this whole series disturbs me for the simple reason that I thought
>
arlier; notably:
0ac09f9f8cd1 ("x86, trace: Fix CR2 corruption when tracing page faults")
d4078e232267 ("x86, trace: Further robustify CR2 handling vs tracing")
And I'm thinking that with exception of this patch, the rest are
worthwhile cleanups regardless.
Also; while
On Mon, Jul 1, 2019 at 11:56 AM Walter Wu wrote:
> > > > > > > > This patch adds memory corruption identification at bug report
> > > > > > > > for
> > > > > > > > software tag-based mode, the report shows whether it is
On Fri, Jul 5, 2019 at 12:16 PM Andy Lutomirski wrote:
>
> If nothing else, MOV to CR2 is architecturally serializing, so, unless
> there’s some fancy unwinding involved, this will be quite slow.
That's why the NMI code does this:
if (unlikely(this_cpu_read(nmi_cr2) != read_cr2()))
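
The quoted check compares the saved value with the current register contents before doing anything else, presumably so that the slow, serializing write can be skipped when nothing has changed. A minimal userspace sketch of that compare-before-write idea follows. Everything here is an assumption made for illustration: the fake_ names are hypothetical, the "register" is an ordinary variable, and a counter stands in for the cost of the write.

```c
#include <stdint.h>

/* Hypothetical stand-in for a register that is cheap to read, costly to write. */
static uint64_t fake_cr2;
static unsigned int fake_cr2_writes;   /* counts the "expensive" writes */

uint64_t fake_read_cr2(void) { return fake_cr2; }

void fake_write_cr2(uint64_t val)
{
    fake_cr2 = val;
    fake_cr2_writes++;
}

unsigned int fake_cr2_write_count(void) { return fake_cr2_writes; }

/* Restore a saved value, paying for the write only if the value changed. */
void fake_restore_cr2(uint64_t saved)
{
    if (fake_read_cr2() != saved)
        fake_write_cr2(saved);
}
```

In the common case where the value is untouched, fake_restore_cr2() performs only a read, so the write counter does not move.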
> On Jul 4, 2019, at 7:18 PM, Linus Torvalds
> wrote:
>
>> On Fri, Jul 5, 2019 at 5:03 AM Peter Zijlstra wrote:
>>
>> Despite the current efforts to read CR2 before tracing happens there
>> still exist a number of possible holes:
>
> So this whole series disturbs me for the simple reason
On Fri, Jul 5, 2019 at 5:03 AM Peter Zijlstra wrote:
>
> Despite the current efforts to read CR2 before tracing happens there
> still exist a number of possible holes:
So this whole series disturbs me for the simple reason that I thought
tracing was supposed to save/restore cr2 and make it
Despite the current efforts to read CR2 before tracing happens there
still exist a number of possible holes:
idtentry page_fault do_page_fault has_error_code=1
call error_entry
TRACE_IRQS_OFF
call trace_hardirqs_off*
#PF // modifies CR2
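
The hole sketched in this call chain is an ordering problem: if anything that can itself fault runs before the handler has read CR2, the handler sees the wrong address. A toy userspace model of that ordering follows. It is only a sketch under stated assumptions: the sim_ names are hypothetical, the "register" is a plain variable, and the nested fault is simulated by a function that overwrites it with an arbitrary value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fault-address register. */
static uint64_t sim_cr2;

/* Simulated tracing hook that takes a nested fault and clobbers the register. */
void sim_tracer_faults(void)
{
    sim_cr2 = 0xdead0000u;   /* arbitrary address of the nested fault */
}

/* Reads the register only after tracing ran: returns the clobbered value. */
uint64_t sim_handle_fault_late(uint64_t fault_addr)
{
    sim_cr2 = fault_addr;    /* hardware records the faulting address */
    sim_tracer_faults();     /* tracing runs first and faults */
    return sim_cr2;          /* too late: original address is gone */
}

/* Reads the register before anything else can fault: keeps the right value. */
uint64_t sim_handle_fault_early(uint64_t fault_addr)
{
    sim_cr2 = fault_addr;
    uint64_t saved = sim_cr2;   /* read first */
    sim_tracer_faults();
    return saved;
}
```

The late variant returns the nested fault's address, while the early variant returns the address of the original fault.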
On Thu, Jul 04, 2019 at 12:05:22AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 03, 2019 at 04:47:01PM -0400, Steven Rostedt wrote:
> > Yeah, looks like we might be missing a TRACE_IRQS_OFF from the
> > from_usermode_stack_switch path.
>
> Oh bugger, there's a second error_entry call.
---
On Thu, Jul 04, 2019 at 12:00:57AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 03, 2019 at 01:27:09PM -0700, Andy Lutomirski wrote:
> > On Wed, Jul 3, 2019 at 3:28 AM root wrote:
>
> > > @@ -1338,18 +1347,9 @@ ENTRY(error_entry)
> > > movq %rax, %rsp /* switch
On Wed, Jul 3, 2019 at 3:01 PM Peter Zijlstra wrote:
>
> On Wed, Jul 03, 2019 at 01:27:09PM -0700, Andy Lutomirski wrote:
> > On Wed, Jul 3, 2019 at 3:28 AM root wrote:
>
> > > @@ -1338,18 +1347,9 @@ ENTRY(error_entry)
> > > movq %rax, %rsp /* switch stack */
> >
On Wed, Jul 03, 2019 at 04:47:01PM -0400, Steven Rostedt wrote:
> On Wed, 3 Jul 2019 13:27:09 -0700
> Andy Lutomirski wrote:
>
>
> > > @@ -1180,10 +1189,10 @@ idtentry xenint3 do_int3
> > > has_error_co
> > > #endif
> > >
> > > idtentry general_protection
On Wed, Jul 03, 2019 at 01:27:09PM -0700, Andy Lutomirski wrote:
> On Wed, Jul 3, 2019 at 3:28 AM root wrote:
> > @@ -1338,18 +1347,9 @@ ENTRY(error_entry)
> > movq %rax, %rsp /* switch stack */
> > ENCODE_FRAME_POINTER
> > pushq %r12
> > -
> > -
On Wed, Jul 03, 2019 at 04:29:42PM -0400, Steven Rostedt wrote:
> On Wed, 3 Jul 2019 22:22:31 +0200
> Peter Zijlstra wrote:
>
> > On Wed, Jul 03, 2019 at 12:27:34PM +0200, root wrote:
> > > Despite the current efforts to read CR2 before tracing happens there
> > > still exist a number of
On Wed, 3 Jul 2019 13:27:09 -0700
Andy Lutomirski wrote:
> > @@ -1180,10 +1189,10 @@ idtentry xenint3 do_int3
> > has_error_co
> > #endif
> >
> > idtentry general_protection do_general_protection has_error_code=1
> > -idtentry page_fault