Re: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata
On Tue, Apr 06, 2021 at 11:01AM -0400, Theodore Ts'o wrote: > On Tue, Apr 06, 2021 at 02:32:33PM +0200, Jan Kara wrote: > > And the comment explains, why we do this unreliable check. Again, if we > > wanted to silence KCSAN, we could use data_race() macro but AFAIU Ted isn't > > very fond of that annotation. > > I'm not fond of the data_race macro, but I like bogus KCSAN reports > even less. My main complaint is if we're going to have to put the > data_race() macro in place, we're going to need to annotate each > location with an explanation of why it's there (suppress a KCSAN false > positive), and why's it's safe. If it's only one or two places, it'll > probably be fine. If it's dozens, then I would say that KCSAN is > becoming a net negative in terms of making the Linux kernel code > maintainable. I've just seen the latest reports on these data races [1], but it seems the more relevant context is here. [1] https://lore.kernel.org/linux-ext4/20210412113158.ga4...@quack2.suse.cz/ Let me try to put things in perspective. No, we do not want maintainability to suffer. Whether or not documenting the concurrency design via data_race() and a few comments is a negative or positive is up to you. To me, it'd be a positive because I don't have to guess what the code is trying to do because concurrent code rarely is obvious. (In fairness, if you don't like to add comments, just a data_race() without comment tells a reader more than now; perhaps they'd then rummage in the git logs.) Yes, there are currently lots of data-racy accesses in the kernel that are mostly benign. Yet, they are data races in the memory model's eyes, and every optimizing compiler is free to screw them up! For example a lot of those plain read-modify-write bitops ("... |= ..."). Unfortunately tooling cannot determine without hints (like data_race()) whether or not those are safe, since the programmer's intent is unclear. Crucially, the programmer's intent is also unclear to the compiler! Which means the compiler _is_ free to screw up those operations. If we could somehow precisely determine which plain accesses can race, we'd solve a decades-old problem: optimizing compilers and concurrent code do not get along. Therefore, C needed a memory model to sort out this mess, which we have since C11. The Linux kernel, however, doesn't play by those rules. The Linux Kernel Memory Model (LKMM) tries to specify the rules the kernel can safely play by. But since we have KCSAN, which initially tried to follow the LKMM strictly, various feedback has resulted in taming KCSAN to a subset of the LKMM. A lot of the data races that are left, yet appear benign, simply have no obvious rules or patterns (otherwise we wouldn't have the problem we have with optimizing compilers). I couldn't, in good conscience, tame KCSAN based on poorly thought-out rules. Because we know they're data races, and the compiler _is_ free to subject them to concurrency-unsafe optimizations. Because we knew that different codes will want different KCSAN exposure until there is a de-facto LKMM that is to be followed everywhere (one can dream), KCSAN has lots of knobs. They are described in detail here: https://lwn.net/Articles/816854/ > I'm not fond of the data_race macro, but I like bogus KCSAN reports > even less. While the data_race() macro was meant to be exactly for this case, to tell tooling "this data race is fine, even if the compiler messes it up", if there are too many data races for you right now feel free to add 'KCSAN_SANITIZE_file.o := n' to the files you don't want checked. Or even 'KCSAN_SANITIZE := n' to ignore all files in a directory. It would avoid the robots sending you reports. Not ideal, but it'd give some time to see how things evolve elsewhere if you'd rather avoid all this for now. Thanks, -- Marco
Re: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata
On Tue 06-04-21 11:01:39, Theodore Ts'o wrote: > On Tue, Apr 06, 2021 at 02:32:33PM +0200, Jan Kara wrote: > > And the comment explains, why we do this unreliable check. Again, if we > > wanted to silence KCSAN, we could use data_race() macro but AFAIU Ted isn't > > very fond of that annotation. > > I'm not fond of the data_race macro, but I like bogus KCSAN reports > even less. My main complaint is if we're going to have to put the > data_race() macro in place, we're going to need to annotate each > location with an explanation of why it's there (suppress a KCSAN false > positive), and why's it's safe. If it's only one or two places, it'll > probably be fine. If it's dozens, then I would say that KCSAN is > becoming a net negative in terms of making the Linux kernel code > maintainable. OK, I'll send you patches for the two I've seen in the last month. We'll see how many accumulate over time. Honza -- Jan Kara SUSE Labs, CR
Re: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata
On Tue, Apr 06, 2021 at 02:32:33PM +0200, Jan Kara wrote: > And the comment explains, why we do this unreliable check. Again, if we > wanted to silence KCSAN, we could use data_race() macro but AFAIU Ted isn't > very fond of that annotation. I'm not fond of the data_race macro, but I like bogus KCSAN reports even less. My main complaint is if we're going to have to put the data_race() macro in place, we're going to need to annotate each location with an explanation of why it's there (suppress a KCSAN false positive), and why's it's safe. If it's only one or two places, it'll probably be fine. If it's dozens, then I would say that KCSAN is becoming a net negative in terms of making the Linux kernel code maintainable. - Ted
Re: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata
On Tue 06-04-21 21:27:48, Hao Sun wrote: > > Thanks for report but I'm not sure what KCSAN is complaining about - isn't > > the report truncated? > > Yes, the full KCSAN report is available in the attached log file. Sorry, I missed that in your original email. > Here is the report : > ========== > BUG: KCSAN: data-race in __jbd2_journal_file_buffer / > jbd2_journal_dirty_metadata > > write to 0x88800af6da38 of 8 bytes by task 4822 on cpu 1: > __jbd2_journal_file_buffer+0x18d/0x370 > __jbd2_journal_refile_buffer+0x155/0x230 > jbd2_journal_commit_transaction+0x24c6/0x3200 > kjournald2+0x253/0x470 > kthread+0x1f0/0x220 > ret_from_fork+0x1f/0x30 > > read to 0x88800af6da38 of 8 bytes by task 1955 on cpu 0: > jbd2_journal_dirty_metadata+0x17f/0x670 > __ext4_handle_dirty_metadata+0xc6/0x590 > ext4_mark_iloc_dirty+0x12dd/0x16e0 > __ext4_mark_inode_dirty+0x4d2/0x5d0 > ext4_writepages+0x1262/0x1e50 > do_writepages+0x7b/0x150 > __writeback_single_inode+0x84/0x4e0 > writeback_sb_inodes+0x69f/0x1020 > __writeback_inodes_wb+0xb0/0x2a0 > wb_writeback+0x290/0x650 > wb_do_writeback+0x582/0x5d0 > wb_workfn+0xb8/0x410 > process_one_work+0x3e1/0x940 > worker_thread+0x64a/0xaa0 > kthread+0x1f0/0x220 > ret_from_fork+0x1f/0x30 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 0 PID: 1955 Comm: kworker/u5:2 Not tainted 5.11.0+ #5 > > > Sorry, I couldn't symbolize it because the original Linux binary was lost. > Do you think this is an actual bug? So it is difficult to be 100% sure without knowing which particular access caused the KCSAN warning but I'm quite confident it was caused by one of unlocked accesses to jh->b_transaction in jbd2_journal_dirty_metadata(). And as the comments explain, these are only assertions which we redo under proper lock if they look like they would fail. So the code is in fact correct. Honza > Jan Kara 于2021年4月6日周二 下午8:32写道: > > > > Hello! > > > > On Sun 04-04-21 17:40:44, Hao Sun wrote: > > > When using Healer(https://github.com/SunHao-0/healer/tree/dev) to fuzz > > > the Linux kernel, I found a data-race vulnerability in > > > __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata. > > > Sorry, data-race is usually difficult to reproduce. I cannot provide > > > you with a reproducing program. > > > I hope that the call stack information in the crash log can help you > > > locate the problem. > > > Kernel config and full log can be found in the attachment. > > > > > > Here is the detailed information: > > > commit: 3b9cdafb5358eb9f3790de2f728f765fef100731 > > > version: linux 5.11 > > > git tree:upstream > > > report: > > > == > > > BUG: KCSAN: data-race in __jbd2_journal_file_buffer / > > > jbd2_journal_dirty_metadata > > > write to 0x88800af6da38 of 8 bytes by task 4822 on cpu 1: > > > __jbd2_journal_file_buffer+0x18d/0x370 linux/fs/jbd2/transaction.c:2518 > > > __jbd2_journal_refile_buffer+0x155/0x230 linux/fs/jbd2/transaction.c:2612 > > > jbd2_journal_commit_transaction+0x24c6/0x3200 linux/fs/jbd2/commit.c:1084 > > > kjournald2+0x253/0x470 linux/fs/jbd2/journal.c:213 > > > kthread+0x1f0/0x220 linux/kernel/kthread.c:292 > > > ret_from_fork+0x1f/0x30 linux/arch/x86/entry/entry_64.S:294 > > > > Thanks for report but I'm not sure what KCSAN is complaining about - isn't > > the report truncated? I'm missing 'read' part of the report... The complaint > > is on line: > > > > jh->b_transaction = transaction; > > > > I would guess the complaint is because of the check: > > > > /* > > * This and the following assertions are unreliable since we may > > see jh > > * in inconsistent state unless we grab bh_state lock. But this is > > * crucial to catch bugs so let's do a reliable check until the > > * lockless handling is fully proven. > > */ > > if (jh->b_transaction != transaction && > > jh->b_next_transaction != transaction) { > > > > And the comment explains, why we do this unreliable check. Again, if we > > wanted to silence KCSAN, we could use data_race() macro but AFAIU Ted isn't > > very fond of that annotation. > > > > Honza > > > > -- > > Jan Kara > > SUSE Labs, CR -- Jan Kara SUSE Labs, CR
Re: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata
> Thanks for report but I'm not sure what KCSAN is complaining about - isn't > the report truncated? Yes, the full KCSAN report is available in the attached log file. Here is the report : ====== BUG: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata write to 0x88800af6da38 of 8 bytes by task 4822 on cpu 1: __jbd2_journal_file_buffer+0x18d/0x370 __jbd2_journal_refile_buffer+0x155/0x230 jbd2_journal_commit_transaction+0x24c6/0x3200 kjournald2+0x253/0x470 kthread+0x1f0/0x220 ret_from_fork+0x1f/0x30 read to 0x88800af6da38 of 8 bytes by task 1955 on cpu 0: jbd2_journal_dirty_metadata+0x17f/0x670 __ext4_handle_dirty_metadata+0xc6/0x590 ext4_mark_iloc_dirty+0x12dd/0x16e0 __ext4_mark_inode_dirty+0x4d2/0x5d0 ext4_writepages+0x1262/0x1e50 do_writepages+0x7b/0x150 __writeback_single_inode+0x84/0x4e0 writeback_sb_inodes+0x69f/0x1020 __writeback_inodes_wb+0xb0/0x2a0 wb_writeback+0x290/0x650 wb_do_writeback+0x582/0x5d0 wb_workfn+0xb8/0x410 process_one_work+0x3e1/0x940 worker_thread+0x64a/0xaa0 kthread+0x1f0/0x220 ret_from_fork+0x1f/0x30 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 1955 Comm: kworker/u5:2 Not tainted 5.11.0+ #5 Sorry, I couldn't symbolize it because the original Linux binary was lost. Do you think this is an actual bug? Hao Jan Kara 于2021年4月6日周二 下午8:32写道: > > Hello! > > On Sun 04-04-21 17:40:44, Hao Sun wrote: > > When using Healer(https://github.com/SunHao-0/healer/tree/dev) to fuzz > > the Linux kernel, I found a data-race vulnerability in > > __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata. > > Sorry, data-race is usually difficult to reproduce. I cannot provide > > you with a reproducing program. > > I hope that the call stack information in the crash log can help you > > locate the problem. > > Kernel config and full log can be found in the attachment. > > > > Here is the detailed information: > > commit: 3b9cdafb5358eb9f3790de2f728f765fef100731 > > version: linux 5.11 > > git tree: upstream > > report: > > ====== > > BUG: KCSAN: data-race in __jbd2_journal_file_buffer / > > jbd2_journal_dirty_metadata > > write to 0x88800af6da38 of 8 bytes by task 4822 on cpu 1: > > __jbd2_journal_file_buffer+0x18d/0x370 linux/fs/jbd2/transaction.c:2518 > > __jbd2_journal_refile_buffer+0x155/0x230 linux/fs/jbd2/transaction.c:2612 > > jbd2_journal_commit_transaction+0x24c6/0x3200 linux/fs/jbd2/commit.c:1084 > > kjournald2+0x253/0x470 linux/fs/jbd2/journal.c:213 > > kthread+0x1f0/0x220 linux/kernel/kthread.c:292 > > ret_from_fork+0x1f/0x30 linux/arch/x86/entry/entry_64.S:294 > > Thanks for report but I'm not sure what KCSAN is complaining about - isn't > the report truncated? I'm missing 'read' part of the report... The complaint > is on line: > > jh->b_transaction = transaction; > > I would guess the complaint is because of the check: > > /* > * This and the following assertions are unreliable since we may see > jh > * in inconsistent state unless we grab bh_state lock. But this is > * crucial to catch bugs so let's do a reliable check until the > * lockless handling is fully proven. > */ > if (jh->b_transaction != transaction && > jh->b_next_transaction != transaction) { > > And the comment explains, why we do this unreliable check. Again, if we > wanted to silence KCSAN, we could use data_race() macro but AFAIU Ted isn't > very fond of that annotation. > > Honza > > -- > Jan Kara > SUSE Labs, CR
Re: KCSAN: data-race in __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata
Hello! On Sun 04-04-21 17:40:44, Hao Sun wrote: > When using Healer(https://github.com/SunHao-0/healer/tree/dev) to fuzz > the Linux kernel, I found a data-race vulnerability in > __jbd2_journal_file_buffer / jbd2_journal_dirty_metadata. > Sorry, data-race is usually difficult to reproduce. I cannot provide > you with a reproducing program. > I hope that the call stack information in the crash log can help you > locate the problem. > Kernel config and full log can be found in the attachment. > > Here is the detailed information: > commit: 3b9cdafb5358eb9f3790de2f728f765fef100731 > version: linux 5.11 > git tree:upstream > report: > ========== > BUG: KCSAN: data-race in __jbd2_journal_file_buffer / > jbd2_journal_dirty_metadata > write to 0x88800af6da38 of 8 bytes by task 4822 on cpu 1: > __jbd2_journal_file_buffer+0x18d/0x370 linux/fs/jbd2/transaction.c:2518 > __jbd2_journal_refile_buffer+0x155/0x230 linux/fs/jbd2/transaction.c:2612 > jbd2_journal_commit_transaction+0x24c6/0x3200 linux/fs/jbd2/commit.c:1084 > kjournald2+0x253/0x470 linux/fs/jbd2/journal.c:213 > kthread+0x1f0/0x220 linux/kernel/kthread.c:292 > ret_from_fork+0x1f/0x30 linux/arch/x86/entry/entry_64.S:294 Thanks for report but I'm not sure what KCSAN is complaining about - isn't the report truncated? I'm missing 'read' part of the report... The complaint is on line: jh->b_transaction = transaction; I would guess the complaint is because of the check: /* * This and the following assertions are unreliable since we may see jh * in inconsistent state unless we grab bh_state lock. But this is * crucial to catch bugs so let's do a reliable check until the * lockless handling is fully proven. */ if (jh->b_transaction != transaction && jh->b_next_transaction != transaction) { And the comment explains, why we do this unreliable check. Again, if we wanted to silence KCSAN, we could use data_race() macro but AFAIU Ted isn't very fond of that annotation. Honza -- Jan Kara SUSE Labs, CR