Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore

2019-08-30 Thread lijiang
On 2019-08-17 15:23, lijiang wrote:
> On 2019-08-11 10:29, lijiang wrote:
>> On 2019-08-09 06:37, Lendacky, Thomas wrote:
>>> On 8/1/19 8:05 PM, Dave Young wrote:
>>>> Add kexec cc list.
>>>> On 08/01/19 at 11:02pm, lijiang wrote:
> Hi, Tom
>
> Recently, i ran into a problem about SME and used crash tool to check the 
> vmcore as follow:
>
> crash> kmem -s | grep -i invalid
> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>
> And the crash tool reported the above error, probably, the main reason is 
> that kernel does not
> correctly handle the first 640k region when SME is enabled.
>
> When SME is enabled, the kernel and initramfs images are loaded into the 
> decrypted memory, and
> the backup area(first 640k) is also mapped as decrypted, but the first 
> 640k data is copied to
> the backup area in purgatory(). Please refer to this file: 
> arch/x86/purgatory/purgatory.c
> ..
> static int copy_backup_region(void)
> {
>  if (purgatory_backup_dest) {
>  memcpy((void *)purgatory_backup_dest,
> (void *)purgatory_backup_src, 
> purgatory_backup_sz);
>  }
>  return 0;
> }
> ..
>
> arch/x86/kernel/machine_kexec_64.c
> ..
> machine_kexec_prepare()->
> arch_update_purgatory()->
> .
>
> Actually, the firs 640k area is encrypted in the first kernel when SME is 
> enabled, here kernel
> copies the first 640k data to the backup area in purgatory(), because the 
> backup area is mapped
> as decrypted, this copying operation makes that the first 640k data is 
> decrypted(decoded) and
> saved to the backup area, but probably kernel can not aware of SME in 
> purgatory(), which causes
> kernel mistakenly read out the first 640k.
>
> In addition, i hacked kernel code as follow:
>
> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
> index 7bcc92add72c..a51631d36a7a 100644
> --- a/fs/proc/vmcore.c
> +++ b/fs/proc/vmcore.c
> @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>                                      m->offset + m->size - *fpos,
>                                      buflen);
>              start = m->paddr + *fpos - m->offset;
> +            if (m->paddr == 0x73f6) { // the backup area's start address: 0x73f6
> +                tmp = read_from_oldmem(buffer, tsz, &start,
> +                                       userbuf, false);
> +            } else
>              tmp = read_from_oldmem(buffer, tsz, &start,
>                                     userbuf, mem_encrypt_active());
>              if (tmp < 0)
>
> Here, i used the crash tool to check the vmcore, i can see that the 
> backup area is decrypted,
> except for the dma-kmalloc-512. So i suspect that kernel did not 
> correctly read out the first
> 640k data to backup area. Do you happen to know how to deal with the 
> first 640k area in purgatory()
> when SME is enabled? Any idea?
>>>
>>> I'm not all that familiar with kexec and purgatory, etc., but I think
>>> that you want to setup the page table that is active when purgatory runs
>>> so that the src and dest both have the SME encryption mask set in their
>>> respective page table entries. This way, when the copy is performed,
>>> everything is copied correctly. 
>>
>> Exactly. That's just what i was thinking.
>>
> 
> I tried to setup the 1:1 mapping in the init_pgtable() with the memory 
> encryption mask, but that still
> did not correctly access the encrypted memory in purgatory(). I'm not sure 
> whether i missed anything
> else, i'm still digging into it.
> 

As we know, the kdump kernel will reuse the first 640k region, so the old
content of the first 640k area is copied to a backup area, which is done in
purgatory(). When dumping the vmcore, the kdump kernel reads the old content
of the first 640k area from the backup area.
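
To make that redirection concrete, here is a minimal sketch in C. It is only
an illustration, not the kernel's code: old_content_paddr() and its
backup_start parameter are hypothetical names, and the 640k boundary is the
only value taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_REGION_SIZE (640ULL * 1024)  /* the first 640k of physical memory */

/*
 * Hypothetical helper: given a physical address in the crashed kernel,
 * return the physical address where the kdump kernel can still find the
 * old content. backup_start is an assumed parameter: the start of the
 * backup area that purgatory() filled before the kdump kernel booted.
 */
static uint64_t old_content_paddr(uint64_t paddr, uint64_t backup_start)
{
    /* The backup area itself must live outside the reused low region. */
    assert(backup_start >= LOW_REGION_SIZE);

    if (paddr < LOW_REGION_SIZE)
        return backup_start + paddr;  /* old data was moved to the backup area */

    return paddr;                     /* not reused by the kdump kernel */
}
```

With an assumed backup area at 0x40000000, address 0x1000 would be read from
0x40001000, while 0xA0000 (the first byte past 640k) is read in place.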

Given the above, when SME is enabled in the first kernel, the kernel has to
set up the identity mapping for the first 640k area with the encryption mask
so that it can correctly access the old memory, and it also has to set up the
identity mapping for the backup region with the encryption mask. But the
kdump kernel won't properly deal with the encrypted memory before SME is
enabled, which causes the kdump kernel to fail to boot.

So I planned to set up a temporary page table mapping with the encryption
mask for the first 640k area and the backup region in purgatory().
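
As a rough illustration of what "mapping with the encryption mask" means at
the page table entry level, the sketch below builds a 4 KiB identity-mapping
entry and ORs in a mask. The helper names and flag macros are assumptions
made for this example; the real kernel uses its own page table helpers and
keeps the mask in sme_me_mask.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PAGE_SHIFT_4K 12

/*
 * Hypothetical helper: build a 4 KiB identity-mapping entry for paddr.
 * enc_mask is the memory encryption mask (a single bit); pass 0 to map
 * the page as decrypted.
 */
static uint64_t make_identity_pte(uint64_t paddr, uint64_t enc_mask)
{
    uint64_t frame = paddr & ~((1ULL << PAGE_SHIFT_4K) - 1);

    /* The mask must be zero or exactly one bit. */
    assert((enc_mask & (enc_mask - 1)) == 0);

    return frame | PTE_PRESENT | PTE_WRITABLE | enc_mask;
}

/* Hypothetical helper: does this entry map its page as encrypted? */
static int pte_is_encrypted(uint64_t pte, uint64_t enc_mask)
{
    return enc_mask != 0 && (pte & enc_mask) != 0;
}
```

For example, with a mask of 1ULL << 47 (an assumed bit position used only for
this example; the real position has to be queried from the CPU), both the
source (first 640k) and the destination (backup region) entries would carry
that bit, so a copy through these mappings reads and writes encrypted memory.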

> I guess that should make the 1:1 mapping 

Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore

2019-08-17 Thread lijiang
On 2019-08-11 10:29, lijiang wrote:
> On 2019-08-09 06:37, Lendacky, Thomas wrote:
>> On 8/1/19 8:05 PM, Dave Young wrote:
>>> Add kexec cc list.
>>> On 08/01/19 at 11:02pm, lijiang wrote:
 Hi, Tom

 Recently, i ran into a problem about SME and used crash tool to check the 
 vmcore as follow:

 crash> kmem -s | grep -i invalid
 kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
 kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e

 And the crash tool reported the above error, probably, the main reason is 
 that kernel does not
 correctly handle the first 640k region when SME is enabled.

 When SME is enabled, the kernel and initramfs images are loaded into the 
 decrypted memory, and
 the backup area(first 640k) is also mapped as decrypted, but the first 
 640k data is copied to
 the backup area in purgatory(). Please refer to this file: 
 arch/x86/purgatory/purgatory.c
 ..
 static int copy_backup_region(void)
 {
  if (purgatory_backup_dest) {
  memcpy((void *)purgatory_backup_dest,
 (void *)purgatory_backup_src, purgatory_backup_sz);
  }
  return 0;
 }
 ..

 arch/x86/kernel/machine_kexec_64.c
 ..
 machine_kexec_prepare()->
 arch_update_purgatory()->
 .

 Actually, the firs 640k area is encrypted in the first kernel when SME is 
 enabled, here kernel
 copies the first 640k data to the backup area in purgatory(), because the 
 backup area is mapped
 as decrypted, this copying operation makes that the first 640k data is 
 decrypted(decoded) and
 saved to the backup area, but probably kernel can not aware of SME in 
 purgatory(), which causes
 kernel mistakenly read out the first 640k.

 In addition, i hacked kernel code as follow:

 diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
 index 7bcc92add72c..a51631d36a7a 100644
 --- a/fs/proc/vmcore.c
 +++ b/fs/proc/vmcore.c
 @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
                                      m->offset + m->size - *fpos,
                                      buflen);
              start = m->paddr + *fpos - m->offset;
 +            if (m->paddr == 0x73f6) { // the backup area's start address: 0x73f6
 +                tmp = read_from_oldmem(buffer, tsz, &start,
 +                                       userbuf, false);
 +            } else
              tmp = read_from_oldmem(buffer, tsz, &start,
                                     userbuf, mem_encrypt_active());
              if (tmp < 0)

 Here, i used the crash tool to check the vmcore, i can see that the backup 
 area is decrypted,
 except for the dma-kmalloc-512. So i suspect that kernel did not correctly 
 read out the first
 640k data to backup area. Do you happen to know how to deal with the first 
 640k area in purgatory()
 when SME is enabled? Any idea?
>>
>> I'm not all that familiar with kexec and purgatory, etc., but I think
>> that you want to setup the page table that is active when purgatory runs
>> so that the src and dest both have the SME encryption mask set in their
>> respective page table entries. This way, when the copy is performed,
>> everything is copied correctly. 
> 
> Exactly. That's just what i was thinking.
> 

I tried to set up the 1:1 mapping in init_pgtable() with the memory
encryption mask, but that still did not correctly access the encrypted memory
in purgatory(). I'm not sure whether I missed anything else; I'm still
digging into it.

I guess the 1:1 mapping should be made in the purgatory context instead of in
init_pgtable(). Does anyone happen to know how to make the 1:1 mapping with
the memory encryption mask in the purgatory() context?

In addition, there is another way: avoid encrypting the first 640k area. When
SME is enabled, do not encrypt the first 640k area; just skip it. Do you
happen to know how to do that, Tom? (BTW: I tried to do it; unfortunately,
that failed.) But that also requires extra work when dumping the vmcore (the
vmcore has to be dumped according to whether the first 640k area is
encrypted).

Thanks.
Lianbo

>> Remember, encrypted data from one page
>> cannot be directly copied as unencrypted data and decrypted properly in
>> the new location (e.g. a page of zeroes encrypted at one address will not
>> appear the same as a page of zeroes encrypted at a different address).
> 
> Yes, that's right. Thank you, Tom.
> 
> I'm considering how to solve it, and i guess that probably it needs to 
> properly deal with
> this 

Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore

2019-08-10 Thread lijiang
On 2019-08-09 06:37, Lendacky, Thomas wrote:
> On 8/1/19 8:05 PM, Dave Young wrote:
>> Add kexec cc list.
>> On 08/01/19 at 11:02pm, lijiang wrote:
>>> Hi, Tom
>>>
>>> Recently, i ran into a problem about SME and used crash tool to check the 
>>> vmcore as follow:
>>>
>>> crash> kmem -s | grep -i invalid
>>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>>
>>> And the crash tool reported the above error, probably, the main reason is 
>>> that kernel does not
>>> correctly handle the first 640k region when SME is enabled.
>>>
>>> When SME is enabled, the kernel and initramfs images are loaded into the 
>>> decrypted memory, and
>>> the backup area(first 640k) is also mapped as decrypted, but the first 640k 
>>> data is copied to
>>> the backup area in purgatory(). Please refer to this file: 
>>> arch/x86/purgatory/purgatory.c
>>> ..
>>> static int copy_backup_region(void)
>>> {
>>>  if (purgatory_backup_dest) {
>>>  memcpy((void *)purgatory_backup_dest,
>>> (void *)purgatory_backup_src, purgatory_backup_sz);
>>>  }
>>>  return 0;
>>> }
>>> ..
>>>
>>> arch/x86/kernel/machine_kexec_64.c
>>> ..
>>> machine_kexec_prepare()->
>>> arch_update_purgatory()->
>>> .
>>>
>>> Actually, the firs 640k area is encrypted in the first kernel when SME is 
>>> enabled, here kernel
>>> copies the first 640k data to the backup area in purgatory(), because the 
>>> backup area is mapped
>>> as decrypted, this copying operation makes that the first 640k data is 
>>> decrypted(decoded) and
>>> saved to the backup area, but probably kernel can not aware of SME in 
>>> purgatory(), which causes
>>> kernel mistakenly read out the first 640k.
>>>
>>> In addition, i hacked kernel code as follow:
>>>
>>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>>> index 7bcc92add72c..a51631d36a7a 100644
>>> --- a/fs/proc/vmcore.c
>>> +++ b/fs/proc/vmcore.c
>>> @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>>>                                      m->offset + m->size - *fpos,
>>>                                      buflen);
>>>              start = m->paddr + *fpos - m->offset;
>>> +            if (m->paddr == 0x73f6) { // the backup area's start address: 0x73f6
>>> +                tmp = read_from_oldmem(buffer, tsz, &start,
>>> +                                       userbuf, false);
>>> +            } else
>>>              tmp = read_from_oldmem(buffer, tsz, &start,
>>>                                     userbuf, mem_encrypt_active());
>>>              if (tmp < 0)
>>>
>>> Here, i used the crash tool to check the vmcore, i can see that the backup 
>>> area is decrypted,
>>> except for the dma-kmalloc-512. So i suspect that kernel did not correctly 
>>> read out the first
>>> 640k data to backup area. Do you happen to know how to deal with the first 
>>> 640k area in purgatory()
>>> when SME is enabled? Any idea?
> 
> I'm not all that familiar with kexec and purgatory, etc., but I think
> that you want to setup the page table that is active when purgatory runs
> so that the src and dest both have the SME encryption mask set in their
> respective page table entries. This way, when the copy is performed,
> everything is copied correctly. 

Exactly. That's just what I was thinking.

> Remember, encrypted data from one page
> cannot be directly copied as unencrypted data and decrypted properly in
> the new location (e.g. a page of zeroes encrypted at one address will not
> appear the same as a page of zeroes encrypted at a different address).

Yes, that's right. Thank you, Tom.

I'm considering how to solve it, and I guess this problem probably needs to
be dealt with properly in purgatory().

Thanks.
Lianbo

> 
> Thanks,
> Tom
> 
>>>
>>> BTW: I' curious the reason why the address of dma-kmalloc-512k always falls 
>>> into the first 640k
>>> region, and i did not see the same issue on another machine.
>>>
>>> Machine:
>>> Serial Number   diesel-sys9079-0001
>>> Model   AMD Diesel (A0C)
>>> CPU AMD EPYC 7601 32-Core Processor
>>>
>>>
>>> Background:
>>> On x86_64, the first 640k region is special because of some historical 
>>> reasons. And kdump kernel will
>>> reuse the first 640k region, so kernel will back up(copy) the first 640k 
>>> region to a backup area in
>>> purgatory(), in order not to rewrite the old region(640k) in kdump kernel, 
>>> which makes sure that kdump
>>> can read out the old memory from vmcore.
>>>
>>>
>>> Thanks.
>>> Lianbo


Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore

2019-08-08 Thread Lendacky, Thomas
On 8/1/19 8:05 PM, Dave Young wrote:
> Add kexec cc list.
> On 08/01/19 at 11:02pm, lijiang wrote:
>> Hi, Tom
>>
>> Recently, i ran into a problem about SME and used crash tool to check the 
>> vmcore as follow:
>>
>> crash> kmem -s | grep -i invalid
>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>
>> And the crash tool reported the above error, probably, the main reason is 
>> that kernel does not
>> correctly handle the first 640k region when SME is enabled.
>>
>> When SME is enabled, the kernel and initramfs images are loaded into the 
>> decrypted memory, and
>> the backup area(first 640k) is also mapped as decrypted, but the first 640k 
>> data is copied to
>> the backup area in purgatory(). Please refer to this file: 
>> arch/x86/purgatory/purgatory.c
>> ..
>> static int copy_backup_region(void)
>> {
>>  if (purgatory_backup_dest) {
>>  memcpy((void *)purgatory_backup_dest,
>> (void *)purgatory_backup_src, purgatory_backup_sz);
>>  }
>>  return 0;
>> }
>> ..
>>
>> arch/x86/kernel/machine_kexec_64.c
>> ..
>> machine_kexec_prepare()->
>> arch_update_purgatory()->
>> .
>>
>> Actually, the firs 640k area is encrypted in the first kernel when SME is 
>> enabled, here kernel
>> copies the first 640k data to the backup area in purgatory(), because the 
>> backup area is mapped
>> as decrypted, this copying operation makes that the first 640k data is 
>> decrypted(decoded) and
>> saved to the backup area, but probably kernel can not aware of SME in 
>> purgatory(), which causes
>> kernel mistakenly read out the first 640k.
>>
>> In addition, i hacked kernel code as follow:
>>
>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>> index 7bcc92add72c..a51631d36a7a 100644
>> --- a/fs/proc/vmcore.c
>> +++ b/fs/proc/vmcore.c
>> @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>>                                      m->offset + m->size - *fpos,
>>                                      buflen);
>>              start = m->paddr + *fpos - m->offset;
>> +            if (m->paddr == 0x73f6) { // the backup area's start address: 0x73f6
>> +                tmp = read_from_oldmem(buffer, tsz, &start,
>> +                                       userbuf, false);
>> +            } else
>>              tmp = read_from_oldmem(buffer, tsz, &start,
>>                                     userbuf, mem_encrypt_active());
>>              if (tmp < 0)
>>
>> Here, i used the crash tool to check the vmcore, i can see that the backup 
>> area is decrypted,
>> except for the dma-kmalloc-512. So i suspect that kernel did not correctly 
>> read out the first
>> 640k data to backup area. Do you happen to know how to deal with the first 
>> 640k area in purgatory()
>> when SME is enabled? Any idea?

I'm not all that familiar with kexec and purgatory, etc., but I think
that you want to setup the page table that is active when purgatory runs
so that the src and dest both have the SME encryption mask set in their
respective page table entries. This way, when the copy is performed,
everything is copied correctly.  Remember, encrypted data from one page
cannot be directly copied as unencrypted data and decrypted properly in
the new location (e.g. a page of zeroes encrypted at one address will not
appear the same as a page of zeroes encrypted at a different address).
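
A toy model may help show why a raw copy of ciphertext cannot work. The
sketch below is NOT the real SME cipher; it only assumes, as described above,
that the ciphertext depends on the physical address as well as on the key.
All function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy address tweak: distinct addresses always give distinct tweaks. */
static uint64_t toy_tweak(uint64_t paddr)
{
    uint64_t x = paddr * 0x9E3779B97F4A7C15ULL;  /* odd multiplier: one-to-one */
    return x ^ (x >> 29);                         /* xorshift step: one-to-one */
}

/* Toy "encryption": XOR with the key and the address tweak. */
static uint64_t toy_encrypt(uint64_t plain, uint64_t key, uint64_t paddr)
{
    return plain ^ key ^ toy_tweak(paddr);
}

/* Toy "decryption": the same XOR undoes the encryption at the same address. */
static uint64_t toy_decrypt(uint64_t cipher, uint64_t key, uint64_t paddr)
{
    return cipher ^ key ^ toy_tweak(paddr);
}

/* Correct move: decrypt at the source address, re-encrypt at the destination. */
static uint64_t toy_move(uint64_t cipher, uint64_t key,
                         uint64_t src, uint64_t dst)
{
    return toy_encrypt(toy_decrypt(cipher, key, src), key, dst);
}

/* Small self-check of the three cases discussed in the text. */
static void toy_self_check(void)
{
    const uint64_t key = 0x1234, src = 0x1000, dst = 0x2000;
    uint64_t c = toy_encrypt(0, key, src);           /* a "zero" at src */

    assert(toy_decrypt(c, key, src) == 0);            /* same address: fine */
    assert(toy_decrypt(c, key, dst) != 0);            /* raw copy: garbage */
    assert(toy_decrypt(toy_move(c, key, src, dst), key, dst) == 0);
}
```

In this model, copying through mappings that both carry the encryption mask
corresponds to toy_move(): the hardware decrypts on the read from the source
and re-encrypts on the write to the destination.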

Thanks,
Tom

>>
>> BTW: I' curious the reason why the address of dma-kmalloc-512k always falls 
>> into the first 640k
>> region, and i did not see the same issue on another machine.
>>
>> Machine:
>> Serial Number   diesel-sys9079-0001
>> Model   AMD Diesel (A0C)
>> CPU AMD EPYC 7601 32-Core Processor
>>
>>
>> Background:
>> On x86_64, the first 640k region is special because of some historical 
>> reasons. And kdump kernel will
>> reuse the first 640k region, so kernel will back up(copy) the first 640k 
>> region to a backup area in
>> purgatory(), in order not to rewrite the old region(640k) in kdump kernel, 
>> which makes sure that kdump
>> can read out the old memory from vmcore.
>>
>>
>> Thanks.
>> Lianbo


Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore

2019-08-01 Thread Dave Young
Add kexec cc list.
On 08/01/19 at 11:02pm, lijiang wrote:
> Hi, Tom
> 
> Recently, I ran into a problem with SME and used the crash tool to check
> the vmcore as follows:
> 
> crash> kmem -s | grep -i invalid
> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
> 
> The crash tool reported the above error. The main reason is probably that
> the kernel does not correctly handle the first 640k region when SME is
> enabled.
> 
> When SME is enabled, the kernel and initramfs images are loaded into
> decrypted memory, and the backup area (for the first 640k) is also mapped
> as decrypted, but the first 640k of data is copied to the backup area in
> purgatory(). Please refer to this file:
> arch/x86/purgatory/purgatory.c
> ..
> static int copy_backup_region(void)
> {
>     if (purgatory_backup_dest) {
>         memcpy((void *)purgatory_backup_dest,
>                (void *)purgatory_backup_src, purgatory_backup_sz);
>     }
>     return 0;
> }
> ..
> 
> arch/x86/kernel/machine_kexec_64.c
> ..
> machine_kexec_prepare()->
> arch_update_purgatory()->
> .
> 
> Actually, the first 640k area is encrypted in the first kernel when SME is
> enabled. Here the kernel copies the first 640k of data to the backup area
> in purgatory(). Because the backup area is mapped as decrypted, this copy
> operation should leave the first 640k of data decrypted (decoded) in the
> backup area, but the kernel is probably not aware of SME in purgatory(),
> so it reads out the first 640k incorrectly.
> 
> In addition, i hacked kernel code as follow:
> 
> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
> index 7bcc92add72c..a51631d36a7a 100644
> --- a/fs/proc/vmcore.c
> +++ b/fs/proc/vmcore.c
> @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>                                      m->offset + m->size - *fpos,
>                                      buflen);
>              start = m->paddr + *fpos - m->offset;
> +            if (m->paddr == 0x73f6) { // the backup area's start address: 0x73f6
> +                tmp = read_from_oldmem(buffer, tsz, &start,
> +                                       userbuf, false);
> +            } else
>              tmp = read_from_oldmem(buffer, tsz, &start,
>                                     userbuf, mem_encrypt_active());
>              if (tmp < 0)
> 
> Here, I used the crash tool to check the vmcore, and I can see that the
> backup area is decrypted, except for dma-kmalloc-512. So I suspect that the
> kernel did not correctly read out the first 640k of data to the backup
> area. Do you happen to know how to deal with the first 640k area in
> purgatory() when SME is enabled? Any ideas?
> 
> BTW: I'm curious why the address of dma-kmalloc-512 always falls into the
> first 640k region, and I did not see the same issue on another machine.
> 
> Machine:
> Serial Number   diesel-sys9079-0001
> Model           AMD Diesel (A0C)
> CPU             AMD EPYC 7601 32-Core Processor
> 
> 
> Background:
> On x86_64, the first 640k region is special for historical reasons. The
> kdump kernel will reuse the first 640k region, so the kernel backs up
> (copies) the first 640k region to a backup area in purgatory(). This keeps
> the kdump kernel from overwriting the old 640k region and ensures that
> kdump can read out the old memory for the vmcore.
> 
> 
> Thanks.
> Lianbo