On 4.04.2018 17:48, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
>> Hello,
>>
>> I tried running crash-head (HEAD: 5d172b230cf4) against today's Linus'
>> master on a dump obtained via dump-guest-memory in qemu. And I got the
>> following when the image is loaded:
>>
>> please wait... (determining panic task)
>> bt: read error: kernel virtual address: fffffe0000007000  type: "stack contents"
>>
>>   KERNEL: vmlinux
>>     DUMPFILE: memory-verbatim.img
>>         CPUS: 1
>>         DATE: Wed Apr  4 16:36:47 2018
>>       UPTIME: 00:27:48
>> LOAD AVERAGE: 31.11, 17.80, 10.43
>>        TASKS: 145
>>     NODENAME: ubuntu-virtual
>>      RELEASE: 4.16.0-rc7-nbor
>>      VERSION: #570 SMP Wed Apr 4 16:03:44 EEST 2018
>>      MACHINE: x86_64  (3392 Mhz)
>>       MEMORY: 4 GB
>>        PANIC: ""
>>          PID: 0
>>      COMMAND: "swapper/0"
>>         TASK: ffffffff82016500  [THREAD_INFO: ffffffff82016500]
>>          CPU: 0
>>        STATE: TASK_RUNNING
>>      WARNING: panic task not found
>>
>> crash> bt
>> PID: 0      TASK: ffffffff82016500  CPU: 0   COMMAND: "swapper/0"
>>  #0 [ffffffff82003dc8] __schedule at ffffffff817ea059
>> bt: invalid RSP: ffffffff82003dc8  bt->stackbase/stacktop: ffffffff82000000/ffffffff82002000 cpu: 0
>>
>>
>> The kernel was compiled with gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9)
>> 5.4.0 20160609, which has retpoline enabled.
>>
>> KASLR is disabled (# CONFIG_RANDOMIZE_BASE is not set) and the kernel
>> is compiled with CONFIG_FRAME_POINTER=y.
>>
>> This scenario used to work around the 4.10 timeline. Am I doing
>> something wrong, or does crash still need time to catch up with the
>> latest upstream kernel code?
> 
> Presumably the latter. 
> 
> If you do a "task -R stack ffffffff82016500", I'm presuming that it
> shows the stack base address is ffffffff82000000.  And looking at
> the stackbase/stacktop values, the crash utility is presuming an 8K stack:
> 
>  bt: invalid RSP: ffffffff82003dc8  bt->stackbase/stacktop: ffffffff82000000/ffffffff82002000 cpu: 0
> 
> But the RSP is ffffffff82003dc8, which puts it beyond the 8K stack size,
> so I'm presuming that the kernel is actually using 16K stacks.  The most
> recent kernel I have is 4.16.0-0.rc6.git3.1.fc29.x86_64, which uses 16K 
> stacks.

This is correct, the kernel stack size should indeed be 16k. However...

> 
> Here is how the crash utility determines the stack size.  The x86_64 stacksize
> starts out with a default size of 2 pages, as set here in 
> x86_64_init(PRE_SYMTAB):
> 
>        case PRE_SYMTAB:
>               ... [ cut ] ...
>                 machdep->stacksize = machdep->pagesize * 2;
>                 ...
> 
> Then later on in task_init(), it gets resized as shown here, where 
> the STACKSIZE() macro is machdep->stacksize:
> 
>         if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
>                 error(WARNING, "\nnon-standard stack size: %ld\n",
>                         len = SIZE(task_union));
>                 machdep->stacksize = len;
>         } else if (VALID_SIZE(thread_union) &&
>                 ((len = SIZE(thread_union)) != STACKSIZE()))
>                 machdep->stacksize = len;

The stack size is not resized at all. Instead, VALID_SIZE(thread_union)
actually fails. I've added the following else branch to the if statement
there:

+       } else {
+               if (VALID_SIZE(thread_union)) {
+                       error(WARNING, "WE ARE IN THE ELSE BRANCH: len: %llu thread_union size: %llu STACKSIZE(): %llu\n",
+                             len, SIZE(thread_union), STACKSIZE());
+               } else {
+                       error(WARNING, "thread_union is invalid\n");
+               }
+       }

Also doing:

crash> struct thread_union
struct: invalid data structure reference: thread_union

So for some reason thread_union cannot be found by gdb:

crash> help -o | grep thread_union
                  thread_union: -1
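
To make the consequence concrete, here is a minimal sketch of how I
understand the selection to behave when both union sizes are invalid. It
is a simplified model, not the actual crash source: pick_stacksize(),
rsp_in_stack() and size_is_valid() are hypothetical names, and treating
any negative size as "not found" is my assumption based on the -1 above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a negative size means the type was not found (cf. the -1 above). */
bool size_is_valid(long size)
{
        return size >= 0;
}

/*
 * Simplified model of the stack size selection: start from the 2-page
 * default and override it only when a union size is known and differs.
 */
long pick_stacksize(long pagesize, long task_union_size,
                    long thread_union_size)
{
        long stacksize;

        assert(pagesize > 0);
        stacksize = pagesize * 2;       /* default set at PRE_SYMTAB time */

        if (size_is_valid(task_union_size) && task_union_size != stacksize)
                stacksize = task_union_size;
        else if (size_is_valid(thread_union_size) &&
                 thread_union_size != stacksize)
                stacksize = thread_union_size;
        /* Neither size is valid: the 2-page default is kept as-is. */

        return stacksize;
}

/* True if rsp lies inside [stackbase, stackbase + stacksize). */
bool rsp_in_stack(uint64_t rsp, uint64_t stackbase, long stacksize)
{
        return rsp >= stackbase && rsp < stackbase + (uint64_t)stacksize;
}
```

With the values from this dump, pick_stacksize(4096, -1, -1) stays at
8192, so rsp_in_stack(0xffffffff82003dc8, 0xffffffff82000000, 8192) is
false, which matches the "invalid RSP" message. With a thread_union size
of 16384 the same RSP would fall inside the stack.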

> 
> The "task_union" no longer exists, and so it checks whether the
> "thread_union" is larger than the default stacksize, and resets the
> size appropriately.  
>   
> On my 4.16.0-0.rc6.git3.1.fc29.x86_64 kernel, here is the thread_union:
> 
>   crash> thread_union
>   union thread_union {
>       struct task_struct task;
>       unsigned long stack[2048];
>   }
>   SIZE: 16384
> 
> And so it gets reset:
>   
>   crash> help -m | grep stacksize
>             stacksize: 16384
>   crash>
> 
> You can debug it from there.  Let me know what you find.
> 
> Thanks,
>   Dave
> 

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
