Hi,
On 4/11/19 5:10 AM, Michael Schmitz wrote:
[...]
OK, I decided to bite the bullet and modify bus_error030() to allow
falling through to do_page_fault if an invalid page read happens while
page faults are disabled.
[...]
Resulting syslog:
[...]
Summary:
* The stack is always shown, but the call trace following it is always empty.
Is the call trace explicitly disabled for the m68k task list?
* The following threads didn't fault:
----------------------------------------------------------------
[31197.540000] task PC stack pid father
[31197.550000] init S 0 1 0 0x00000000
[31197.620000] kthreadd S 0 2 0 0x00000000
[31198.020000] ksoftirqd/0 R running task 0 7 2
[31198.080000] kdevtmpfs S 0 8 2 0x00000000
[31198.280000] oom_reaper S 0 12 2 0x00000000
[31198.760000] kswapd0 S 0 200 2 0x00000000
[31198.950000] jbd2/hda3-8 S 0 794 2 0x00000000
[31199.120000] portmap S 0 982 1 0x00000000
[31199.180000] syslogd S 0 1070 1 0x00000000
[31199.220000] klogd R running task 0 1076 1
[31199.280000] gpm S 0 1086 1 0x00000000
[31199.350000] inetd S 0 1091 1 0x00000000
[31199.410000] lpd S 0 1095 1 0x00000000
[31199.470000] sshd S 0 1101 1 0x00000000
[31199.520000] rpc.statd S 0 1106 1 0x00000000
[31199.560000] atd S 0 1111 1 0x00000000
[31199.610000] cron S 0 1114 1 0x00000000
[31199.670000] getty S 0 1120 1 0x00000000
[31199.720000] getty S 0 1121 1 0x00000000
[31199.790000] getty S 0 1122 1 0x00000000
[31199.850000] getty S 0 1123 1 0x00000000
[31199.920000] getty S 0 1124 1 0x00000000
[31199.980000] getty S 0 1125 1 0x00000000
[31200.160000] sshd S 0 1304 1101 0x00000000
[31200.220000] sshd S 0 1306 1304 0x00000000
[31200.270000] bash S 0 1307 1306 0x00000000
[31200.330000] bash R running task 0 1308 1307
----------------------------------------------------------------
* The following threads did fault:
----------------------------------------------------------------
[31197.680000] kworker/0:0 I 0 3 2 0x00000000
[31197.750000] Workqueue: (null) (events)
[31197.800000] kworker/0:0H I 0 4 2 0x00000000
[31197.930000] mm_percpu_wq I 0 6 2 0x00000000
[31198.160000] kworker/u2:1 I 0 9 2 0x00000000
[31198.230000] Workqueue: (null) (events_unbound)
[31198.330000] kworker/0:1 I 0 13 2 0x00000000
[31198.450000] writeback I 0 95 2 0x00000000
[31198.510000] Workqueue: (null) (flush-3:0)
[31198.570000] crypto I 0 97 2 0x00000000
[31198.660000] kblockd I 0 99 2 0x00000000
[31198.720000] Workqueue: (null) (kblockd)
[31198.820000] kworker/0:1H I 0 761 2 0x00000000
[31198.890000] Workqueue: (null) (kblockd)
[31199.010000] ext4-rsv-conver I 0 795 2 0x00000000
[31200.050000] kworker/u2:0 I 0 1272 2 0x00000000
[31200.390000] kworker/u2:2 I 0 1310 2 0x00000000
[31200.460000] Workqueue: (null) (events_unbound)
----------------------------------------------------------------
=> *All* of them are kernel threads (kthreadd children) in 'I' state
('I' = interrupt context?)
* *There are always two faults*
Looking at the kthread_probe_data() code:
----------------------------------------------------------------
void *kthread_probe_data(struct task_struct *task)
{
	struct kthread *kthread = to_kthread(task);
	void *data = NULL;
	probe_kernel_read(&data, &kthread->data, sizeof(data));
	return data;
}
----------------------------------------------------------------
And the print_worker_info() code:
----------------------------------------------------------------
void print_worker_info(const char *log_lvl, struct task_struct *task)
{
	work_func_t *fn = NULL;
	char name[WQ_NAME_LEN] = { };
	char desc[WORKER_DESC_LEN] = { };
	struct pool_workqueue *pwq = NULL;
	struct workqueue_struct *wq = NULL;
	struct worker *worker;
	...
	worker = kthread_probe_data(task);
	...
	probe_kernel_read(&fn, &worker->current_func, sizeof(fn));
	probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
	probe_kernel_read(name, wq->name, sizeof(name) - 1);
	probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
	...
	if (fn || name[0] || desc[0]) {
		printk("%sWorkqueue: %s %pf", log_lvl, name, fn);
		if (strcmp(name, desc))
			pr_cont(" (%s)", desc);
		pr_cont("\n");
	}
}
----------------------------------------------------------------
From the task output, we know that the faulting entries with a workqueue
line have empty "name" & "desc", but non-NULL "current_func".
Everything is initialized to NULL, so if a data fetch fails,
the value stays NULL.
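To illustrate the "initialized to NULL, stays NULL on a failed fetch" behaviour, here is a minimal userspace sketch. It is not kernel code: mock_probe_read(), struct mock_worker, struct mock_info, snapshot_worker() and the two MOCK_* constants are hypothetical stand-ins, and the sketch assumes that a failed read leaves the destination untouched and that anything in the first 4096 bytes of the address space is unmapped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096u	/* assumption: page 0 is never mapped */
#define MOCK_DESC_LEN 24	/* hypothetical stand-in for WORKER_DESC_LEN */

struct mock_worker {
	void (*current_func)(void);
	char desc[MOCK_DESC_LEN];
};

struct mock_info {
	void (*fn)(void);
	char desc[MOCK_DESC_LEN];
};

/* Hypothetical stand-in for probe_kernel_read(): fails instead of faulting. */
static long mock_probe_read(void *dst, uintptr_t src, size_t size)
{
	assert(dst != NULL);
	if (src < MOCK_PAGE_SIZE)
		return -EFAULT;	/* dst is left untouched (assumption) */
	memcpy(dst, (const void *)src, size);
	return 0;
}

/* Copy fn and desc out of a worker pointer that may be NULL. */
struct mock_info snapshot_worker(const struct mock_worker *worker)
{
	struct mock_info info = { 0 };	/* fn = NULL, desc = "" */
	uintptr_t base = (uintptr_t)worker;

	mock_probe_read(&info.fn,
			base + offsetof(struct mock_worker, current_func),
			sizeof(info.fn));
	/* Read one byte less than the buffer so the final '\0' survives. */
	mock_probe_read(info.desc,
			base + offsetof(struct mock_worker, desc),
			sizeof(info.desc) - 1);
	return info;
}
```

Calling snapshot_worker(NULL) in this model returns the defaults (NULL fn, empty desc), while a valid worker yields its real values, so the printed fields only show what could actually be read.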
From backtraces, we know that at least one of the 2 faults is
from probe_kernel_read().
If a given variable were NULL:
* task/kthread -> lots of faults
* worker -> 3 faults, two in probe, and one in above function
* current_func / fn -> no issues
* current_pwq / pwq -> 2 faults, one from probe
* wq -> 1 fault in above function
* name/desc -> can't be NULL
=> I think the problem is that 'I' kthreads have NULL "current_pwq".
Ones with workqueues just have "current_func" set, others don't.
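As a rough check of the NULL "current_pwq" theory, the pwq -> wq -> name chain can be modelled in userspace and the failing reads counted. This is only a sketch under stated assumptions: mock_probe_read(), the mock_* structs, read_wq_name() and its "guard" parameter are hypothetical, page 0 is assumed unmapped, and a failed read is assumed to leave its destination unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096u	/* assumption: page 0 is never mapped */
#define MOCK_NAME_LEN 24	/* hypothetical stand-in for WQ_NAME_LEN */

struct mock_wq { char name[MOCK_NAME_LEN]; };
struct mock_pwq { struct mock_wq *wq; };

static int mock_faults;		/* reads that would have faulted */

/* Hypothetical stand-in for probe_kernel_read(): counts bad reads. */
static long mock_probe_read(void *dst, uintptr_t src, size_t size)
{
	assert(dst != NULL);
	if (src < MOCK_PAGE_SIZE) {
		mock_faults++;
		return -EFAULT;	/* dst is left untouched (assumption) */
	}
	memcpy(dst, (const void *)src, size);
	return 0;
}

/*
 * Follow pwq -> wq -> name and return how many reads failed.
 * With guard != 0 the chain is skipped entirely when pwq is NULL.
 */
int read_wq_name(const struct mock_pwq *pwq, char *name, int guard)
{
	struct mock_wq *wq = NULL;

	mock_faults = 0;
	if (!guard || pwq) {
		mock_probe_read(&wq,
				(uintptr_t)pwq + offsetof(struct mock_pwq, wq),
				sizeof(wq));
		/* wq is still NULL if the read above failed */
		mock_probe_read(name,
				(uintptr_t)wq + offsetof(struct mock_wq, name),
				MOCK_NAME_LEN - 1);
	}
	return mock_faults;
}
```

In this model a NULL pwq costs two failed reads without the guard and none with it, and a valid pwq is read the same way in both cases.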
Why would that fault only on the 030?
Attached patch fixes the Oops for me.
- Eero
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ddee541ea97a..ec4127c0f3da 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4582,8 +4582,11 @@ void print_worker_info(const char *log_lvl, struct task_struct *task)
 	 */
 	probe_kernel_read(&fn, &worker->current_func, sizeof(fn));
 	probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
-	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
-	probe_kernel_read(name, wq->name, sizeof(name) - 1);
+	/* current_pwq is NULL for 030 'I' tasks, and this would fault 2x */
+	if (pwq) {
+		probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
+		probe_kernel_read(name, wq->name, sizeof(name) - 1);
+	}
 	probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
 
 	if (fn || name[0] || desc[0]) {