Committer: Ingo Molnar
CommitterDate: Tue, 23 Jun 2020 10:42:30 +02:00
sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix
mask corruption
This function is concerned with the long-term CPU mask, not the
transitory mask the task might have while migrate disabled. Before
for multiple interfaces. This corruption can be triggered by registering
multiple SSIDs and calling, in parallel for multiple interfaces
iw dev station dump
[16750.719775] Unable to handle kernel paging request at virtual address
dead0110
...
[16750.899173] Call trace:
[16750.901696
at least one block space, let's just writeback raw data instead of
compressed one, this can fix data corruption when decompressing
incomplete stored compression data.
Fixes: 50cfa66f0de0 ("f2fs: compress: support zstd compress algorithm")
Signed-off-by: Daeho Jeong
Signed-off-by: Chao Yu
From: Franck LENORMAND
[ Upstream commit f5f27b79eab80de0287c243a22169e4876b08d5e ]
The header of the message to send can be changed if the
response is longer than the request:
- 1st word, the header is sent
- the remaining words of the message are sent
- the response is received
My config is attached. This is the greatly reduced config that I used
when trying to narrow down the problem. We normally have much more
enabled, but that had no effect on the bug in my testing. We do,
unfortunately, have quite a few out-of-tree patches, but they are all
in USB or Networking,
On 2020-05-29 18:15:18 [+0200], To Mark Marshall wrote:
> In order to get it back into the RT queue I need to understand why it is
> required. What exactly is it fixing. Let me stare at for a little…
it used to be local_irq_disable() which then became preempt_disable()
local_irq_disable() due to
On 2020-05-29 17:38:39 [+0200], Mark Marshall wrote:
> Hi Sebastian & list,
Hi,
> I had assumed that my e-mail had got lost or overlooked, I was meaning to
> post a follow up message this week...
>
> All I could find from the debugging and tracing that we added was that
> something was going
Hi Sebastian & list,
I had assumed that my e-mail had got lost or overlooked, I was meaning to
post a follow up message this week...
All I could find from the debugging and tracing that we added was that
something was going wrong with the mm data structures somewhere in the
exec code. In the
On 2020-05-04 11:40:08 [+0200], Mark Marshall wrote:
> The easiest way we have found to reproduce the crash is to repeatedly
> insert and then remove a module. The crash then appears to be related
> to either paging in the module or in exiting the mdev process. (The
> crash does also happen at
3.16.84-rc1 review patch. If anyone has any objections, please let me know.
--
From: Trond Myklebust
commit 4b310319c6a8ce708f1033d57145e2aa027a883c upstream.
nfs_readdir_xdr_to_array() must not exit without having initialised
the array, so that the page cache deletion
On Tue, 2020-05-19 at 08:59 +0800, Jeremy Kerr wrote:
> Hi Stan,
>
> > The new kernel compiled and booted with no errors, with these
> > STACKPROTECTOR options in .config (the last two revealed the bug):
> >
> > CONFIG_HAVE_STACKPROTECTOR=y
> > CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
> >
Hi Stan,
> The new kernel compiled and booted with no errors, with these
> STACKPROTECTOR options in .config (the last two revealed the bug):
>
> CONFIG_HAVE_STACKPROTECTOR=y
> CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
> CONFIG_STACKPROTECTOR=y
> CONFIG_STACKPROTECTOR_STRONG=y
Brilliant, thanks for
From: David Howells
[ Upstream commit c5f9d9db83d9f84d2b4aae5a1b29d9b582ccff2f ]
The patch which changed cachefiles from calling ->bmap() to using the
bmap() wrapper overwrote the running return value with the result of
calling bmap(). This causes an assertion failure elsewhere in the code.
Pali Rohár wrote:
> The mwifiex_cfg80211_dump_station() uses static variable for iterating
> over a linked list of all associated stations (when the driver is in UAP
> role). This has a race condition if .dump_station is called in parallel
> for multiple interfaces. This co
Hi Finn,
> This fixes an old bug recently revealed by CONFIG_STACKPROTECTOR.
Good catch. I'm not sure about the fix though. That variable ('addr')
should be a ethernet hardware address; I'm surprised we're accessing
past the 6th byte. The culprit seems to be this, where 'ea' is the
address
This fixes an old bug recently revealed by CONFIG_STACKPROTECTOR.
[ 25.707616] scsi host0: MESH
[ 28.488852] eth0: BMAC at 00:05:02:07:5a:a6
[ 28.488859]
[ 28.505397] Kernel panic - not syncing: stack-protector: Kernel stack is
corrupted in: bmac_probe+0x540/0x618
[ 28.535152] CPU: 0
Hi Pali,
> The mwifiex_cfg80211_dump_station() uses static variable for iterating over
> a linked list of all associated stations (when the driver is in UAP role).
> This has
> a race condition if .dump_station is called in parallel for multiple
> interfaces.
> This corruptio
The mwifiex_cfg80211_dump_station() uses static variable for iterating
over a linked list of all associated stations (when the driver is in UAP
role). This has a race condition if .dump_station is called in parallel
for multiple interfaces. This corruption can be triggered by registering
multiple
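The hazard of a static iterator shared between interfaces can be sketched in user-space C. This is a simplified model under stated assumptions, not the mwifiex code: struct station, dump_static_cursor() and dump_by_index() are hypothetical names, and the "parallel" calls are modelled as an interleaved sequence on a single thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical station entry on a singly linked list. */
struct station {
	int id;
	const struct station *next;
};

/* Buggy shape: one static cursor is shared by every caller. */
const struct station *dump_static_cursor(const struct station *head, int idx)
{
	static const struct station *cursor;

	assert(idx >= 0);
	if (idx == 0)
		cursor = head;		/* restart on the first call */
	else if (cursor)
		cursor = cursor->next;	/* may follow another caller's list */
	return cursor;
}

/* Safer shape: no hidden state, walk idx entries from the caller's head. */
const struct station *dump_by_index(const struct station *head, int idx)
{
	const struct station *s = head;

	assert(idx >= 0);
	while (s && idx-- > 0)
		s = s->next;
	return s;
}
```

If two interfaces interleave their dumps, the static cursor version hands the first interface an entry from the second interface's list, while the index-based version depends only on its own arguments. In a kernel driver the walk would additionally need the list's own locking, which this sketch leaves out.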
l.org on behalf of Suresh Gumpula"
wrote:
Hi,
We are seeing a problem with Windows guests (2016/2012R2) where the guest
crashes with
Virtual APIC page corruption similar to the following Red Hat ticket.
https://urldefense.proofpoint.com/v2/url?u=https-3A__bugzilla.redha
Hi,
We are seeing a problem with Windows guests (2016/2012R2) where the guest crashes
with
Virtual APIC page corruption similar to the following Red Hat ticket.
https://bugzilla.redhat.com/show_bug.cgi?id=1751017
> Arg4: 0017, Type of corrupted region, can be
16 : Criti
remained in intermediate buffer, it means that zstd algorithm can not
> save at least one block space, let's just writeback raw data instead of
> compressed one, this can fix data corruption when decompressing
> incomplete stored compression data.
>
Fixes: 50cfa66f0de0 ("f2fs: compr
The patch which changed cachefiles from calling ->bmap() to using the
bmap() wrapper overwrote the running return value with the result of
calling bmap(). This causes an assertion failure elsewhere in the code.
Fix this by using ret2 rather than ret to hold the return value.
The oops looks
From: Nicolas Schichan
commit 3b89624ab54b9dc2d92fc08ce2670e5f19ad8ec8 upstream.
The code in txq_put_data() would use txq->tx_curr_desc to index the
tso_hdrs/tso_hdrs_dma buffers, for less than 8 bytes unaligned
fragments, which is already moved to the next descriptor at the
beginning of the
From: Andrew Rybchenko
commit e70c70c38d7a5ced76fc8b1c4a7ccee76e9c2911 upstream.
On 32-bit systems, mask is only an array of 3 longs, not 4, so don't try
to write to mask[3].
Also include build-time checks in case the size of the bitmask changes.
Fixes: 3c36a2aded8c ("sfc: display vadaptor
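Why the same bitmask needs a different number of longs on 32-bit and 64-bit builds, and what a build-time check might look like, can be sketched as follows. This is an illustration only: DEMO_STAT_COUNT (78 bits), words_for() and fill_mask() are hypothetical and are not taken from the sfc driver.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STAT_COUNT 78	/* hypothetical number of bits in the mask */
#define DEMO_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MASK_LONGS \
	((DEMO_STAT_COUNT + DEMO_LONG_BITS - 1) / DEMO_LONG_BITS)

/* Build-time check: breaks the build if the mask outgrows its source. */
_Static_assert(DEMO_STAT_COUNT <= 2 * 64,
	       "bitmask no longer fits in two 64-bit source words");

/* Number of words of word_bits bits needed to hold nbits bits. */
size_t words_for(size_t nbits, size_t word_bits)
{
	return (nbits + word_bits - 1) / word_bits;
}

/* Copy bits from 64-bit source words without writing past nlongs. */
void fill_mask(unsigned long *mask, size_t nlongs,
	       const uint64_t *src, size_t nbits)
{
	size_t bit;

	assert(mask != NULL && src != NULL);
	for (bit = 0; bit < nbits && bit / DEMO_LONG_BITS < nlongs; bit++) {
		if (src[bit / 64] & (UINT64_C(1) << (bit % 64)))
			mask[bit / DEMO_LONG_BITS] |=
				1UL << (bit % DEMO_LONG_BITS);
	}
}
```

For 78 bits, words_for() gives 3 words of 32 bits but only 2 words of 64 bits, so code that assumes a fixed word count can index past the end of the array on one of the two architectures.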
From: Eran Ben Elisha
commit 6b94bab0ee8d5def6a2aac0ef6204ee6e24386b6 upstream.
The error flow in procedure handle_existing_counter() is wrong.
The procedure should exit after encountering the error, not continue
as if everything is OK.
Fixes: 68230242cdbc ('net/mlx4_core: Add port attribute
From: Michael Neuling
commit 6bcb80143e792becfd2b9cc6a339ce523e4e2219 upstream.
At the start of __tm_recheckpoint() we save the kernel stack pointer
(r1) in SPRG SCRATCH0 (SPRG2) so that we can restore it after the
trecheckpoint.
Unfortunately, the same SPRG is used in the SLB miss handler.
> > Could we save more memory space for these two cases like ZSTD?
> > > As you know, we are using 5 pages compression buffer for LZ4 and LZO
> > in
> > > compress_log_size=2,
> > > and if the compressed data doesn't fit in 3 pages, it returns
d data doesn't fit in 3 pages, it returns -EAGAIN to
> > give up compressing that one.
> >
> > Thanks,
> >
> > On Fri, May 8, 2020 at 10:17 AM, Chao Yu <mailto:yuch...@huawei.com>> wrote:
> >
> >> During zstd compression, ZSTD_endStream(
return non-zero value
>> because destination buffer is full, but there is still compressed data
>> remained in intermediate buffer, it means that zstd algorithm can not
>> save at least one block space, let's just writeback raw data instead of
>> compressed one, this can fix data corruption
, this can fix data corruption when decompressing
incomplete stored compression data.
Signed-off-by: Daeho Jeong
Signed-off-by: Chao Yu
---
fs/f2fs/compress.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index c22cc0d37369..5e4947250262 100644
On 5/6/20 6:20 PM, Edwin Török wrote:
>> (Obviously, a full metadump would be useful for confirming the shape
>> of
>> the refcount btree, but...first things first let's look at the
>> filefrag
>> output.)
> I'll try to gather one, and find a place to store/share it.
>
> Best regards,
> --Edwin
> > > > On Sat, Apr 18, 2020 at 11:19:03AM +0100, Edwin Török
> > > > > > > wrote:
> > > > > > > > [1.] One line summary of the problem:
> > > > > > > >
> > > > > > > > I 100%
a reload:
1. construct new target
2. suspend old target
3. resume new target
4. destroy old target
Metadata that were written by the old target between steps 1 and 2 would
not be visible by the new target.
Fix the data corruption by loading the metadata in the resume handler.
Also, validate block_size
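The ordering problem in the four reload steps can be modelled in a few lines of C. This is a toy model under stated assumptions, not device-mapper code: struct demo_disk, struct demo_target and the demo_ctr_eager(), demo_ctr_lazy() and demo_resume() functions are hypothetical, and a single integer stands in for the on-disk metadata.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for on-disk metadata shared by the old and new target. */
struct demo_disk {
	int metadata_version;
};

struct demo_target {
	struct demo_disk *disk;
	int cached_version;	/* the target's in-memory view */
};

/* Problematic shape: metadata is read when the target is constructed. */
void demo_ctr_eager(struct demo_target *t, struct demo_disk *d)
{
	assert(t != NULL && d != NULL);
	t->disk = d;
	t->cached_version = d->metadata_version;
}

/* Fixed shape: the constructor only records where the metadata lives. */
void demo_ctr_lazy(struct demo_target *t, struct demo_disk *d)
{
	assert(t != NULL && d != NULL);
	t->disk = d;
	t->cached_version = -1;	/* not loaded yet */
}

/* Resume handler: load metadata after the old target was suspended. */
void demo_resume(struct demo_target *t)
{
	assert(t != NULL && t->disk != NULL);
	t->cached_version = t->disk->metadata_version;
}
```

If the old target updates the metadata between steps 1 and 2, the eager constructor keeps a stale copy, whereas loading in the resume handler picks up the final state.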
Hi RT experts,
We are using the RT kernel with the PowerPC e500. Until recently we
were on the 4.19 kernel series, and are in the process of upgrading.
When we switched to the v5.4 version, we get a reproducible kernel
crash. The crashes all contain the "BUG: Bad rss-counter state" line,
and
From: Vasily Averin
commit e1e8399eee72e9d5246d4d1bcacd793debe34dd3 upstream.
New struct nfsd4_blocked_lock allocated in find_or_allocate_block()
does not initialize nbl_list and nbl_lru.
If conflock allocation fails, rollback can call list_del_init(),
access uninitialized fields and corrupt
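Why the list heads must be initialised before any rollback path can run is easier to see with a small user-space re-implementation of the list primitives. This is a sketch, not the nfsd code: the demo_ functions mimic the kernel's INIT_LIST_HEAD() and list_del_init() in simplified form, and struct demo_blocked_lock is a hypothetical stand-in that only borrows the field names nbl_list and nbl_lru.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* An initialised, empty head points at itself. */
void demo_init_list_head(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Unlink an entry and reinitialise it; dereferences e->prev and e->next. */
void demo_list_del_init(struct demo_list_head *e)
{
	assert(e->next != NULL && e->prev != NULL);
	e->prev->next = e->next;
	e->next->prev = e->prev;
	demo_init_list_head(e);
}

int demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h && h->prev == h;
}

struct demo_blocked_lock {
	struct demo_list_head nbl_list;
	struct demo_list_head nbl_lru;
};

/* Initialise both heads right after allocation, before anything can fail. */
void demo_blocked_lock_init(struct demo_blocked_lock *nbl)
{
	demo_init_list_head(&nbl->nbl_list);
	demo_init_list_head(&nbl->nbl_lru);
}

/* Rollback is harmless only if the heads were initialised first. */
void demo_blocked_lock_rollback(struct demo_blocked_lock *nbl)
{
	demo_list_del_init(&nbl->nbl_list);
	demo_list_del_init(&nbl->nbl_lru);
}
```

Without the init call, the rollback would write through whatever pointer values the allocator left in the two fields, which is the kind of corruption the commit message describes.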
From: Vasily Averin
[ Upstream commit e1e8399eee72e9d5246d4d1bcacd793debe34dd3 ]
New struct nfsd4_blocked_lock allocated in find_or_allocate_block()
does not initialize nbl_list and nbl_lru.
If conflock allocation fails, rollback can call list_del_init(),
access uninitialized fields and corrupt
From: Nikhil Devshatwar
When setting DMA for video capture from CSI channel, if the DMA size
is not given, it ends up writing as much data as sent by the camera.
This may lead to overwriting the buffers causing memory corruption.
Observed green lines on the default framebuffer.
Restrict
From: Ondrej Mosnacek
commit 2a5243937c700ffe6a28e6557a4562a9ab0a17a4 upstream.
string_to_context_struct() may garble the context string, so we need to
copy back the contents again from the old context struct to avoid
storing the corrupted context.
Since string_to_context_struct() tokenizes
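How an in-place tokenizer garbles its input, and why copying the original contents back repairs it, can be sketched like this. The functions below are hypothetical and are not the SELinux implementation; they assume a colon-separated string purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical in-place parser: splits "a:b:c" by overwriting each ':'
 * with '\0' and returns the number of fields. The input is left garbled.
 */
int demo_split_in_place(char *s)
{
	int fields = 1;
	char *p;

	assert(s != NULL);
	for (p = s; *p != '\0'; p++) {
		if (*p == ':') {
			*p = '\0';
			fields++;
		}
	}
	return fields;
}

/*
 * Parse, then copy the pristine contents back so that the string stored
 * afterwards is the original one and not the tokenized remnant.
 * 'size' is the buffer size including the terminating '\0'.
 */
int demo_split_and_restore(char *s, const char *pristine, size_t size)
{
	int fields = demo_split_in_place(s);

	memcpy(s, pristine, size);
	return fields;
}
```

After demo_split_in_place() the buffer reads as only its first field, so storing it unchanged would store a corrupted string; the restore step puts the full text back.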
to
the size of the second smallest - etc.
A change in Linux 3.14 unintentionally changed the layout for the
second and subsequent zones. All the correct data is still stored, but
each chunk may be assigned to a different device than in pre-3.14 kernels.
This can lead to data corruption
From: Roderick Colenbrander
commit 2bcdacb70327013ca2066bfcf2af1009eff01f1d upstream.
The sony driver is not properly cleaning up from potential failures in
sony_input_configured. Currently it calls hid_hw_stop, while hid_connect
is still running. This is not a good idea, instead hid_hw_stop
3.16.75-rc1 review patch. If anyone has any objections, please let me know.
--
From: Ravi Bangoria
commit 3202e35ec1c8fc19cea24253ff83edf702a60a02 upstream.
Consider a scenario where user creates two events:
1st event:
attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
I/Os should merge
into one but they don't. Finally it turns out to be a stack corruption
caused by macro PRECEDING_KEY().
See how PRECEDING_KEY() is defined in bset.h,
437 #define PRECEDING_KEY(_k) \
438
3.16.74-rc1 review patch. If anyone has any objections, please let me know.
--
From: Lukas Czerner
commit 57a0da28ced8707cb9f79f071a016b9d005caf5a upstream.
Unaligned AIO must be serialized because the zeroing of partial blocks
of unaligned AIO can result in data corruption
3.16.74-rc1 review patch. If anyone has any objections, please let me know.
--
From: Slava Pestov
commit c9a78332b42cbdcdd386a95192a716b67d1711a4 upstream.
If register_cache_set() failed, we would touch ca->set after
it had already been freed. Also, fix an assertion to catch
discover server trunking, the client will renew the lease,
but check the client state, which leads to client state corruption.
So, we need to call the state manager to recover it when server
IP trunking is detected.
Signed-off-by: ZhangXiaoxu
Signed-off-by: Anna Schumaker
Signed-off-by: Ben Hutchings
---
fs
corrupt the FIXMAP area and kernel OF APIs will crash
whenever they access corrupted FDT in the FIXMAP area.
On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
happen to overlap so we don't see any FIXMAP area corruptions.
This patch fixes FIXMAP area corruption on RV32 systems
Hi,
We're getting some random memory corruption on an AWS EC2 instance with
4.19.x kernels. I've tried 4.19.19, 4.19.52, but the results below are
from the most recent (4.19.72). For debugging I enabled
KASAN+slub_debug, but TBH, I can't make heads or tails from these.
Without slub_debug
> > To: sunqiuyang
> > > Cc: linux-kernel@vger.kernel.org; linux...@kvack.org
> > > Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of
> > > non-LRU movable pages
> > >
> > > On Wed 04-09-19 12:19:11, sunqiuyang wrote:
>
is deleted from its LRU list (cc->migratepages) in unmap_and_move().
> > However, between steps 1) and 2), the page could be isolated by another
> > thread in isolate_movable_page(), and added to another LRU list, leading
> > to list_del corruption later.
>
> Once non-LRU page is
etween steps 1) and 2), the page could be isolated by another
> thread in isolate_movable_page(), and added to another LRU list, leading
> to list_del corruption later.
Once non-LRU page is migrated out successfully, driver should clear
the movable flag in the page. Look at reset_page in
> > Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of
> > non-LRU movable pages
> >
> > On Wed 04-09-19 12:19:11, sunqiuyang wrote:
> > > > Do not top post please
> > > >
> > > > On Wed 04-09-19 07:27:25,
- Original Message -
> Jan Stancek writes:
>
> > sb_getblk does not guarantee that buffer_head is uptodate. If there is
> > async read running in parallel for same buffer_head, it can overwrite
> > just initialized msdos_dir_entry, leading to corruption:
>
Jan Stancek writes:
> sb_getblk does not guarantee that buffer_head is uptodate. If there is
> async read running in parallel for same buffer_head, it can overwrite
> just initialized msdos_dir_entry, leading to corruption:
> FAT-fs (loop0): error, corrupted directory (invalid entr
From: Michal Hocko [mho...@kernel.org]
Sent: Wednesday, September 04, 2019 20:52
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux...@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of
non-LRU movable pages
On Wed 04
On Wed 04-09-19 12:19:11, sunqiuyang wrote:
> > Do not top post please
> >
> > On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> > > isolate_migratepages_block() from another thread may try to isolate the
> > > page again:
> > >
> > > for (; low_pfn < end_pfn; low_pfn++) {
> > > /* ... */
> > >
From: Michal Hocko [mho...@kernel.org]
Sent: Wednesday, September 04, 2019 16:14
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux...@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of
non-LRU movable pages
Do
Do not top post please
On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> isolate_migratepages_block() from another thread may try to isolate the page
> again:
>
> for (; low_pfn < end_pfn; low_pfn++) {
> /* ... */
> page = pfn_to_page(low_pfn);
> /* ... */
> if (!PageLRU(page)) {
> if
through this path?
From: Michal Hocko [mho...@kernel.org]
Sent: Wednesday, September 04, 2019 14:38
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux...@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of
non-LRU movable pages
On Wed 04-09-19 02:18:38,
On Wed 04-09-19 02:18:38, sunqiuyang wrote:
> The isolate path of non-lru movable pages:
>
> isolate_migratepages_block
> isolate_movable_page
> trylock_page
> // if PageIsolated, goto out_no_isolated
> a_ops->isolate_page
>
> > const char *get_bug_type(struct kasan_access_info *info)
> > {
> > +#ifdef CONFIG_KASAN_SW_TAGS_IDENTIFY
> > + struct kasan_alloc_meta *alloc_meta;
> > + struct kmem_cache *cache;
> > + struct page *page;
> > + const void *addr;
> > + void *object;
> > +
T_POISON1.
So we will end up with a bug like
"list_del corruption. prev->next should be ffbf0a1eb8e0, but was
dead0100"
(see __list_del_entry_valid).
From: Michal Hocko [mho...@kernel.org]
Sent: Tuesday, September 03, 2019 21:17
To:
On Wed, Aug 21, 2019 at 8:03 PM Andrey Ryabinin wrote:
>
> From: Walter Wu
>
> This patch adds memory corruption identification at bug report for
> software tag-based mode, the report show whether it is "use-after-free"
> or "out-of-bound" error instead of
s 1) and 2), the page could be isolated by another
> thread in isolate_movable_page(), and added to another LRU list, leading
> to list_del corruption later.
Care to explain the race? Both paths use page_lock AFAICS
>
> This patch fixes the bug by moving list_del into the critical secti
age(), and added to another LRU list, leading
to list_del corruption later.
This patch fixes the bug by moving list_del into the critical section
protected by lock_page(), so that a page will not be isolated again before
it has been deleted from its LRU list.
Signed-off-by: Qiuyang Sun
---
sb_getblk does not guarantee that buffer_head is uptodate. If there is
async read running in parallel for same buffer_head, it can overwrite
just initialized msdos_dir_entry, leading to corruption:
FAT-fs (loop0): error, corrupted directory (invalid entries)
FAT-fs (loop0): Filesystem has been
kernel OF APIs will crash
> > > whenever they access corrupted FDT in the FIXMAP area.
> > >
> > > On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
> > > happen to overlap so we don't see any FIXMAP area corruptions.
> > >
> >
IXMAP area. This allows user
> > > space apps
> > > to potentially corrupt the FIXMAP area and kernel OF APIs will
> > > crash
> > > whenever they access corrupted FDT in the FIXMAP area.
> > >
> > > On RV64 systems, TASK_SIZE is set to
> On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
> > happen to overlap so we don't see any FIXMAP area corruptions.
> >
> > This patch fixes FIXMAP area corruption on RV32 systems by setting
> > TASK_SIZE to FIXADDR_START.
>
> This part -- the
ace apps
> to potentially corrupt the FIXMAP area and kernel OF APIs will crash
> whenever they access corrupted FDT in the FIXMAP area.
>
> On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
> happen to overlap so we don't see any FIXMAP area corruptions.
>