Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-06-11 Thread David Woodhouse
On Fri, 2015-04-10 at 16:42 +0800, Li, Zhen-Hua wrote:
> This patchset is an update of Bill Sumner's patchset, implements a fix for:
> If a kernel boots with intel_iommu=on on a system that supports intel vt-d, 
> when a panic happens, the kdump kernel will boot with these faults:

But, in the general case, it *does* boot.

There are two cases where it doesn't actually boot, and those are the
interesting ones.

Firstly, a device just keeps generating faults and we die in an
interrupt storm, reporting the same fault over and over again. That can
actually happen without kdump/kexec and the correct fix for that is to
have rate-limiting, disable fault reporting for the offending device
after too many are seen, and then eventually to tie it in to the PCIe
error handling as has been discussed elsewhere.
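
As a rough illustration of the "disable fault reporting after too many are seen" part, here is a minimal user-space sketch in C. It is not code from the intel-iommu driver: the names (`fault_limiter`, `fault_should_report`), the threshold of 16 and the fixed-size table are all assumptions made up for this example, and a real implementation would also have to mask the device in hardware and take time into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAULT_LIMIT 16u /* hypothetical threshold, not a kernel constant */
#define MAX_SOURCES 8u  /* hypothetical table size for this sketch */

struct fault_source {
	uint16_t source_id; /* PCI requester ID (bus/device/function) */
	unsigned int count; /* faults seen so far from this device */
	bool muted;         /* true once reporting has been disabled */
};

struct fault_limiter {
	struct fault_source src[MAX_SOURCES];
	size_t used;
};

/* Returns true if this fault should be reported, false if suppressed. */
bool fault_should_report(struct fault_limiter *lim, uint16_t source_id)
{
	struct fault_source *s = NULL;
	size_t i;

	assert(lim != NULL);

	/* Look for an existing record for this requester ID. */
	for (i = 0; i < lim->used; i++) {
		if (lim->src[i].source_id == source_id) {
			s = &lim->src[i];
			break;
		}
	}

	if (s == NULL) {
		if (lim->used == MAX_SOURCES)
			return true; /* table full: fall back to reporting */
		s = &lim->src[lim->used++];
		s->source_id = source_id;
		s->count = 0;
		s->muted = false;
	}

	if (s->muted)
		return false;

	if (++s->count > FAULT_LIMIT) {
		/* A real driver would also mask the device here. */
		s->muted = true;
		return false;
	}
	return true;
}
```

With a zero-initialised `struct fault_limiter`, the first 16 faults from a given requester ID are reported and every later one from that ID is dropped, while faults from other devices are still reported.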

Secondly, there are devices which do not correctly respond to a
hardware reset. This is broken hardware, and if we really have to copy
the old contexts from the crashed kernel to work around it then I'd
like it to be on a blacklist basis — we do it only for hardware which
is *known* to be broken in this way.
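
A blacklist of that kind could look roughly like the following user-space sketch in C. Everything in it is hypothetical: the table name, the `needs_context_copy` helper and the IDs are placeholders for illustration (0xffff is not a valid PCI vendor ID), and no real list of broken devices appears in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pci_dev_id {
	uint16_t vendor;
	uint16_t device;
};

/* Placeholder entries only: these IDs are invented for the sketch. */
static const struct pci_dev_id broken_reset_devices[] = {
	{ 0xffff, 0x0001 },
	{ 0xffff, 0x0002 },
};

/*
 * Returns true only for devices listed as failing to respond to a
 * hardware reset; every other device gets fresh context entries.
 */
bool needs_context_copy(const struct pci_dev_id *id)
{
	size_t n = sizeof(broken_reset_devices) / sizeof(broken_reset_devices[0]);
	size_t i;

	assert(id != NULL);

	for (i = 0; i < n; i++) {
		if (broken_reset_devices[i].vendor == id->vendor &&
		    broken_reset_devices[i].device == id->device)
			return true;
	}
	return false;
}
```

The point of the table is that copying the old contexts becomes the exception: a device that is not listed still has to survive a normal reset, so reset bugs in new hardware remain visible.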

(There are also some cases where the device driver doesn't even *try* to
reset the hardware and just assumes it'll find it in a sane state, as
the BIOS or a cleanly shut down kexec would have left it. In those
cases of course we can just fix the driver.)

I don't much like the idea of doing this context copy for *all*
hardware. That's masking hardware issues with reset that we really
*ought* to be finding.

I believe that most of the offending hardware is HP's; they like to do
the most, erm, "interesting" things with odd hardware and RMRRs and
stuff. So Zhen-Hua would you be able to provide the list of broken
devices that HP has shipped, for the purpose of such a blacklist?

I assume you've already contacted the hardware folks responsible and
insisted that their devices are fixed to be resettable already, right?

-- 
dwmw2



Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-07 Thread Li, ZhenHua

Hi Joerg,
This problem is caused by the latest updates in the iommu module, and we
are trying to fix it.

When it is fixed, we will send out a new version of the patchset.

Thanks
Zhenhua

On 05/08/2015 01:32 AM, Joerg Roedel wrote:
> Hi Baoquan, ZhenHua,
>
> On Mon, May 04, 2015 at 11:17:49AM +0800, Baoquan He wrote:
> > On 05/04/15 at 11:06am, Li, ZhenHua wrote:
> > > Hi baoquan,
> > > Could you paste the kernel log of the first kernel ?
>
> Please let me know when you have worked this issue out.
>
>
> Thanks,
>
> Joerg


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-07 Thread Joerg Roedel
Hi Baoquan, ZhenHua,

On Mon, May 04, 2015 at 11:17:49AM +0800, Baoquan He wrote:
> On 05/04/15 at 11:06am, Li, ZhenHua wrote:
> > Hi baoquan,
> > Could you paste the kernel log of the first kernel ?

Please let me know when you have worked this issue out.


Thanks,

Joerg



Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-06 Thread Joerg Roedel
On Wed, May 06, 2015 at 09:51:35AM +0800, Dave Young wrote:
> A DMA write will modify system RAM; if the old data is corrupted, it is
> possible that a DMA operation modifies the wrong RAM regions because of
> a wrong mapping. Am I missing something, or is that not possible?

This might have happened already before the kdump kernel even boots.
Also, if there is no IOMMU, this can happen as well. This (unlikely
but possible) situation doesn't make things worse.


Joerg



Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-05 Thread Li, ZhenHua

Dave,
This patchset only writes to the root tables of the old kernel. If they
are corrupted, faults would already have happened in the old kernel and
the hardware would have flagged them, so things will not get any worse.


Thanks
Zhenhua
On 05/06/2015 09:51 AM, Dave Young wrote:
> On 05/05/15 at 05:31pm, Joerg Roedel wrote:
> > On Tue, May 05, 2015 at 02:14:23PM +0800, Dave Young wrote:
> > > The failure itself is no different, but as I said in another reply
> > > the difference is that we could end up using corrupted data and
> > > possibly cause more failures.
> >
> > I still fail to see how things can get any worse than they already are
> > by reusing the old data (we just reuse it, we do not modify anything
>
> A DMA write will modify system RAM; if the old data is corrupted, it is
> possible that a DMA operation modifies the wrong RAM regions because of
> a wrong mapping. Am I missing something, or is that not possible?
>
> Thanks
> Dave





Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-05 Thread Dave Young
On 05/05/15 at 05:31pm, Joerg Roedel wrote:
> On Tue, May 05, 2015 at 02:14:23PM +0800, Dave Young wrote:
> > The failure itself is no different, but as I said in another reply
> > the difference is that we could end up using corrupted data and
> > possibly cause more failures.
> 
> I still fail to see how things can get any worse than they already are
> by reusing the old data (we just reuse it, we do not modify anything

A DMA write will modify system RAM; if the old data is corrupted, it is
possible that a DMA operation modifies the wrong RAM regions because of
a wrong mapping. Am I missing something, or is that not possible?

Thanks
Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-05 Thread Joerg Roedel
On Tue, May 05, 2015 at 02:14:23PM +0800, Dave Young wrote:
> The failure itself is no different, but as I said in another reply
> the difference is that we could end up using corrupted data and
> possibly cause more failures.

I still fail to see how things can get any worse than they already are
by reusing the old data (we just reuse it, we do not modify anything
there). Do you have any specific scenario in mind?


Joerg



Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-05 Thread Dave Young
On 05/04/15 at 06:23pm, Joerg Roedel wrote:
> On Fri, Apr 24, 2015 at 04:49:57PM +0800, Dave Young wrote:
> > I'm more than happy to see this issue can be fixed in the patchset, I
> > do not agree to add the code there with such problems. OTOH, for now
> > seems there's no way to fix it.
> 
> And that's the point. We have been discussing this issue and possible
> solutions for years now, and what ZhenHua implemented is what we agreed
> to be the best effort we can make in the kdump case with the IOMMU enabled.
> 
> Of course there are still failure scenarios left, but that is not
> different from systems without any IOMMU.

The failure itself is no different, but as I said in another reply
the difference is that we could end up using corrupted data and
possibly cause more failures.

Thanks
Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-04 Thread Joerg Roedel
On Fri, Apr 24, 2015 at 04:49:57PM +0800, Dave Young wrote:
> I'm more than happy to see this issue can be fixed in the patchset, I
> do not agree to add the code there with such problems. OTOH, for now
> seems there's no way to fix it.

And that's the point. We have been discussing this issue and possible
solutions for years now, and what ZhenHua implemented is what we agreed
to be the best effort we can make in the kdump case with the IOMMU enabled.

Of course there are still failure scenarios left, but that is not
different from systems without any IOMMU.


Joerg



Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-03 Thread Baoquan He
On 05/04/15 at 11:06am, Li, ZhenHua wrote:
> Hi baoquan,
> Could you paste the kernel log of the first kernel ?

Please check the attachment.
[0.00] microcode: CPU0 microcode updated early to revision 0x710, date = 2013-06-17
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.0.0+ (b...@dhcp-128-28.nay.redhat.com) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #6 SMP Wed Apr 29 16:53:34 CST 2015
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.0.0+ root=/dev/mapper/fedora_dhcp--128--28-root ro rd.lvm.lv=fedora_dhcp-128-28/swap rd.lvm.lv=fedora_dhcp-128-28/n
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x000963ff] usable
[0.00] BIOS-e820: [mem 0x00096400-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xcb74] usable
[0.00] BIOS-e820: [mem 0xcb75-0xcb7dafff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcb7db000-0xcbaacfff] reserved
[0.00] BIOS-e820: [mem 0xcbaad000-0xcbaaefff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbaaf000-0xcbabafff] reserved
[0.00] BIOS-e820: [mem 0xcbabb000-0xcbacdfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbace000-0xcbb55fff] reserved
[0.00] BIOS-e820: [mem 0xcbb56000-0xcbb5dfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbb5e000-0xcbb70fff] reserved
[0.00] BIOS-e820: [mem 0xcbb71000-0xcbff] ACPI NVS
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed1] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00042fff] usable
[0.00] earlycon: no match for ttyS0,115200
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.7 present.
[0.00] e820: last_pfn = 0x43 max_arch_pfn = 0x4
[0.00] PAT configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- UC  
[0.00] e820: last_pfn = 0xcb750 max_arch_pfn = 0x4
[0.00] found SMP MP-table at [mem 0x000f4bc0-0x000f4bcf] mapped at [880f4bc0]
[0.00] Using GB pages for direct mapping
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] init_memory_mapping: [mem 0x42fe0-0x42fff]
[0.00] init_memory_mapping: [mem 0x42000-0x42fdf]
[0.00] init_memory_mapping: [mem 0x4-0x41fff]
[0.00] init_memory_mapping: [mem 0x0010-0xcb74]
[0.00] init_memory_mapping: [mem 0x1-0x3]
[0.00] RAMDISK: [mem 0x35a94000-0x36d41fff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000F9810 24 (v02 HPQOEM)
[0.00] ACPI: XSDT 0xCBA28078 6C (v01 HPQOEM SLIC-WKS 01072009 AMI  00010013)
[0.00] ACPI: FACP 0xCBA304C8 F4 (v04 HPQOEM SLIC-WKS 01072009 AMI  00010013)
[0.00] ACPI: DSDT 0xCBA28170 008352 (v02 HPQOEM SLIC-WKS 0102 INTL 20051117)
[0.00] ACPI: FACS 0xCBB5BF80 40
[0.00] ACPI: APIC 0xCBA305C0 7E (v03 HPQOEM SLIC-WKS 01072009 AMI  00010013)
[0.00] ACPI: MCFG 0xCBA30640 3C (v01 HPQOEM OEMMCFG. 01072009 MSFT 0097)
[0.00] ACPI: HPET 0xCBA30680 38 (v01 HPQOEM SLIC-WKS 01072009 AMI. 0004)
[0.00] ACPI: ASF! 0xCBA306B8 A0 (v32 INTEL   HCG 0001 TFSM 000F4240)
[0.00] ACPI: SSDT 0xCBA30758 0058DA (v01 COMPAQ WMI  0001 MSFT 0301)
[0.00] ACPI: SLIC 0xCBA36038 000176 (v01 HPQOEM SLIC-WKS 0001  )
[0.00] ACPI: SSDT 0xCBA361B0 06E284 (v02 INTEL  CpuPm 4000 INTL 20051117)
[0.00] ACPI: DMAR 0xCBAA4438 A0 (v01 A M I  OEMDMAR  0001 INTL 0001)
[0.00] No NUMA configuration found
[0.00] Faking a node at [mem 0x-0x00042fff]
[0.00] NODE_DATA(0) allocated [mem 0x42ffea000-0x42fffdfff]
[0.00] Reserving 256MB of memory at 592MB for crashkernel (System RAM: 16310MB)
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   DMA32[mem 0x0100-0x]
[0.00]   Normal   [mem 0x0001-0x00042fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x00095fff]
[0.00]   node   0: 

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-03 Thread Li, ZhenHua

Hi baoquan,
Could you paste the kernel log of the first kernel ?

Thanks
Zhenhua
On 05/03/2015 04:55 PM, Baoquan He wrote:
> On 04/29/15 at 07:20pm, Baoquan He wrote:
> > Bad news, I rebuilt a kernel with your patchset on 4.0.0+ (this commit
> > f614c81). Now dmar fault is  seen again.
> >
> > The lspci log and kdump log are attached, please check:
>
> I found the lspci log I attached previously was empty, so I am resending it.






Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-03 Thread Baoquan He
On 04/29/15 at 07:20pm, Baoquan He wrote:
> Bad news, I rebuilt a kernel with your patchset on 4.0.0+ (this commit
> f614c81). Now dmar fault is  seen again.
> 
> The lspci log and kdump log are attached, please check:

I found the lspci log I attached previously was empty, so I am resending it.


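00:00.0 Host bridge: Intel Corporation Xeon E5/Core i7 DMI2 (rev 07)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 0
Capabilities: <access denied>

00:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port
1a (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 24
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: f000-0fff
Memory behind bridge: fff0-000f
Prefetchable memory behind bridge: fff0-000f
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- <SERR- <PERR-
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport
Kernel modules: shpchp

00:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port
2a (rev 07) (prog-if 00 [Normal decode])
DeviceName:  Onboard IGD
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 25
Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
I/O behind bridge: d000-dfff
Memory behind bridge: d600-d70f
Prefetchable memory behind bridge: d800-ddff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort+ <SERR- <PERR-
BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport
Kernel modules: shpchp

00:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port
3a in PCI Express Mode (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 26
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: f000-0fff
Memory behind bridge: fff0-000f
Prefetchable memory behind bridge: fff0-000f
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- <SERR- <PERR-
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport
Kernel modules: shpchp

00:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address Map,
VTd_Misc, System Management (rev 07)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Capabilities: <access denied>

00:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control Status and
Global Errors (rev 07)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Capabilities: <access denied>

00:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07) (prog-if 20
[IO(X)-APIC])
Subsystem: Intel Corporation Xeon E5/Core i7 I/O APIC
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-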

00:11.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express 
Virtual Root Port (rev 05) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: 
Kernel driver in use: pcieport
Kernel modules: shpchp

00:16.0 Communication controller: Intel Corporation C600/X79 series chipset MEI 
Controller #1 (rev 05)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: mei_me
Kernel modules: mei_me

00:16.2 IDE interface: Intel Corporation C600/X79 series chipset IDE-r 
Controller (rev 05) (prog-if 85 [Master SecO PriO])
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: ata_generic
Kernel modules: pata_acpi, ata_generic

00:16.3 Serial controller: Intel Corporation C600/X79 series chipset KT 
Controller (rev 05) (prog-if 02 [16550])
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: serial

00:19.0 Ethernet 

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-03 Thread Li, ZhenHua

Hi Baoquan,
Could you paste the kernel log of the first kernel?

Thanks
Zhenhua

On 05/03/2015 04:55 PM, Baoquan He wrote:
> On 04/29/15 at 07:20pm, Baoquan He wrote:
> > Bad news, I rebuilt a kernel with your patchset on 4.0.0+ (this commit
> > f614c81). Now dmar fault is seen again.
> >
> > The lspci log and kdump log are attached, please check:
>
> I found the lspci log previously attached is empty, resend it again.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-03 Thread Baoquan He
On 05/04/15 at 11:06am, Li, ZhenHua wrote:
> Hi baoquan,
> Could you paste the kernel log of the first kernel ?

Please check the attachment.
[0.00] microcode: CPU0 microcode updated early to revision 0x710, date 
= 2013-06-17
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.0.0+ (b...@dhcp-128-28.nay.redhat.com) (gcc 
version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #6 SMP Wed Apr 29 16:53:34 CST 
2015
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.0.0+ 
root=/dev/mapper/fedora_dhcp--128--28-root ro rd.lvm.lv=fedora_dhcp-128-28/swap 
rd.lvm.lv=fedora_dhcp-128-28/n
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x000963ff] usable
[0.00] BIOS-e820: [mem 0x00096400-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xcb74] usable
[0.00] BIOS-e820: [mem 0xcb75-0xcb7dafff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcb7db000-0xcbaacfff] reserved
[0.00] BIOS-e820: [mem 0xcbaad000-0xcbaaefff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbaaf000-0xcbabafff] reserved
[0.00] BIOS-e820: [mem 0xcbabb000-0xcbacdfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbace000-0xcbb55fff] reserved
[0.00] BIOS-e820: [mem 0xcbb56000-0xcbb5dfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbb5e000-0xcbb70fff] reserved
[0.00] BIOS-e820: [mem 0xcbb71000-0xcbff] ACPI NVS
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed1] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00042fff] usable
[0.00] earlycon: no match for ttyS0,115200
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.7 present.
[0.00] e820: last_pfn = 0x43 max_arch_pfn = 0x4
[0.00] PAT configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- UC  
[0.00] e820: last_pfn = 0xcb750 max_arch_pfn = 0x4
[0.00] found SMP MP-table at [mem 0x000f4bc0-0x000f4bcf] mapped at 
[880f4bc0]
[0.00] Using GB pages for direct mapping
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] init_memory_mapping: [mem 0x42fe0-0x42fff]
[0.00] init_memory_mapping: [mem 0x42000-0x42fdf]
[0.00] init_memory_mapping: [mem 0x4-0x41fff]
[0.00] init_memory_mapping: [mem 0x0010-0xcb74]
[0.00] init_memory_mapping: [mem 0x1-0x3]
[0.00] RAMDISK: [mem 0x35a94000-0x36d41fff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000F9810 24 (v02 HPQOEM)
[0.00] ACPI: XSDT 0xCBA28078 6C (v01 HPQOEM SLIC-WKS 
01072009 AMI  00010013)
[0.00] ACPI: FACP 0xCBA304C8 F4 (v04 HPQOEM SLIC-WKS 
01072009 AMI  00010013)
[0.00] ACPI: DSDT 0xCBA28170 008352 (v02 HPQOEM SLIC-WKS 
0102 INTL 20051117)
[0.00] ACPI: FACS 0xCBB5BF80 40
[0.00] ACPI: APIC 0xCBA305C0 7E (v03 HPQOEM SLIC-WKS 
01072009 AMI  00010013)
[0.00] ACPI: MCFG 0xCBA30640 3C (v01 HPQOEM OEMMCFG. 
01072009 MSFT 0097)
[0.00] ACPI: HPET 0xCBA30680 38 (v01 HPQOEM SLIC-WKS 
01072009 AMI. 0004)
[0.00] ACPI: ASF! 0xCBA306B8 A0 (v32 INTEL   HCG 
0001 TFSM 000F4240)
[0.00] ACPI: SSDT 0xCBA30758 0058DA (v01 COMPAQ WMI  
0001 MSFT 0301)
[0.00] ACPI: SLIC 0xCBA36038 000176 (v01 HPQOEM SLIC-WKS 
0001  )
[0.00] ACPI: SSDT 0xCBA361B0 06E284 (v02 INTEL  CpuPm
4000 INTL 20051117)
[0.00] ACPI: DMAR 0xCBAA4438 A0 (v01 A M I  OEMDMAR  
0001 INTL 0001)
[0.00] No NUMA configuration found
[0.00] Faking a node at [mem 0x-0x00042fff]
[0.00] NODE_DATA(0) allocated [mem 0x42ffea000-0x42fffdfff]
[0.00] Reserving 256MB of memory at 592MB for crashkernel (System RAM: 
16310MB)
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   DMA32[mem 0x0100-0x]
[0.00]   Normal   [mem 0x0001-0x00042fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x00095fff]
[0.00]   node   0: 

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-29 Thread Baoquan He
Bad news: I rebuilt a kernel with your patchset on 4.0.0+ (at commit
f614c81), and now the DMAR fault is seen again.

The lspci log and kdump log are attached, please check:

[ ~]$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-4.0.0+ root=/dev/mapper/fedora_dhcp--128--28-root ro
rd.lvm.lv=fedora_dhcp-128-28/swap rd.lvm.lv=fedora_dhcp-128-28/root
crashkernel=256M console=ttyS0,115200 intel_iommu=on


[root@dhcp-128-28 ~]# echo c >/proc/sysrq-trigger 
[  163.160203] sysrq: SysRq : Trigger a crash
[  163.164362] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[  163.172220] IP: [81480696] sysrq_handle_crash+0x16/0x20
[  163.178333] PGD 419aba067 PUD 419774067 PMD 0 
[  163.182838] Oops: 0002 [#1] SMP 
[  163.186114] Modules linked in: xt_CHECKSUM ipt_MASQUERADE 
nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast 
ip6t_rpfilter ip6t_REJECT cfg80211i
[  163.287902] CPU: 0 PID: 1662 Comm: bash Not tainted 4.0.0+ #6
[  163.293648] Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS 
J61 v01.02 03/09/2012
[  163.302351] task: 8803fdefd580 ti: 880403744000 task.ti: 
880403744000
[  163.309842] RIP: 0010:[81480696]  [81480696] 
sysrq_handle_crash+0x16/0x20
[  163.318383] RSP: 0018:880403747da8  EFLAGS: 00010246
[  163.323696] RAX: 000f RBX: 0063 RCX: 000f
[  163.330817] RDX:  RSI: 88042fc0ea08 RDI: 0063
[  163.337939] RBP: 880403747da8 R08: 0096 R09: 00015098
[  163.345067] R10: 03f1 R11: 0002 R12: 0007
[  163.352203] R13:  R14: 81cc33e0 R15: 
[  163.359346] FS:  7ff6f6ca9700() GS:88042fc0() 
knlGS:
[  163.367443] CS:  0010 DS:  ES:  CR0: 80050033
[  163.373180] CR2:  CR3: 0004019d6000 CR4: 000407f0
[  163.380305] Stack:
[  163.382321]  880403747dd8 81480ea6 0002 
fffb
[  163.389771]  7ff6f6cc5000 0002 880403747df8 
81481353
[  163.397222]  880403747ec8 88041d167f00 880403747e18 
81282408
[  163.404682] Call Trace:
[  163.407130]  [81480ea6] __handle_sysrq+0x106/0x170
[  163.412784]  [81481353] write_sysrq_trigger+0x33/0x40
[  163.418697]  [81282408] proc_reg_write+0x48/0x70
[  163.424172]  [81215e77] __vfs_write+0x37/0x110
[  163.429478]  [81218d48] ? __sb_start_write+0x58/0x120
[  163.435391]  [8131dc03] ? security_file_permission+0x23/0xa0
[  163.441902]  [812165e9] vfs_write+0xa9/0x1b0
[  163.447035]  [812174a5] SyS_write+0x55/0xd0
[  163.452085]  [81067f6f] ? do_page_fault+0x2f/0x80
[  163.457651]  [8178416e] system_call_fastpath+0x12/0x71
[  163.463648] Code: ef e8 bf f7 ff ff eb d8 66 2e 0f 1f 84 00 00 00 00 00 0f 
1f 00 66 66 66 66 90 55 c7 05 d4 6c a9 00 01 00 00 00 48 89 e5 0f ae f8 c6 04 
25 00 0 
[  163.483597] RIP  [81480696] sysrq_handle_crash+0x16/0x20
[  163.489777]  RSP 880403747da8
[  163.493257] CR2: 
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.0.0+ (b...@dhcp-128-28.nay.redhat.com) (gcc 
version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #6 SMP Wed Apr 29 16:53:34 CST 
2015
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.0.0+ 
root=/dev/mapper/fedora_dhcp--128--28-root ro rd.lvm.lv=fedora_dhcp-128-28/swap 
rd.lvm.lv=fedora_dhcp-128-28/K
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0fff] reserved
[0.00] BIOS-e820: [mem 0x1000-0x000963ff] usable
[0.00] BIOS-e820: [mem 0x00096400-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x2500-0x34f65fff] usable
[0.00] BIOS-e820: [mem 0x34fff400-0x34ff] usable
[0.00] BIOS-e820: [mem 0xcb75-0xcb7dafff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcb7db000-0xcbaacfff] reserved
[0.00] BIOS-e820: [mem 0xcbaad000-0xcbaaefff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbaaf000-0xcbabafff] reserved
[0.00] BIOS-e820: [mem 0xcbabb000-0xcbacdfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbace000-0xcbb55fff] reserved
[0.00] BIOS-e820: [mem 0xcbb56000-0xcbb5dfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcbb5e000-0xcbb70fff] reserved
[0.00] BIOS-e820: [mem 0xcbb71000-0xcbff] ACPI NVS
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-28 Thread Li, ZhenHua

Hi Baoquan,

If the old tables are corrupted, we will see DMAR faults or INTR faults
(which we have seen), or some other error messages. Most of these
messages come from the hardware, which means the hardware does some
checking while running. But I don't think the hardware checks the tables
completely.

So far I do not have a good idea for how to do this check in the kdump kernel.


Thanks
Zhenhua


On 04/28/2015 04:54 PM, Baoquan He wrote:
> On 04/24/15 at 04:49pm, Dave Young wrote:
> > On 04/24/15 at 04:35pm, Baoquan He wrote:
> > > On 04/24/15 at 04:25pm, Dave Young wrote:
> > > > Hi, Baoquan
> > > >
> > > > > I support this patchset.
> > > > >
> > > > > We should not fear oldmem since reserved crashkernel region is similar.
> > > > > No one can guarantee that any crazy code won't step into crashkernel
> > > > > region just because 1st kernel says it's reserved for kdump kernel. Here
> > > > > the root table and context tables are also not built to allow legal code
> > > > > to damage. Both of them has the risk to be corrupted, for trying our
> > > > > best to get a dumped vmcore the risk is worth being taken.
> > > >
> > > > old mem is mapped in 1st kernel so compare with the reserved crashkernel
> > > > they are more likely to be corrupted. they are totally different.
> > >
> > > Could you tell how and why they are different? Wrong code will choose
> > > root tables and context tables to damage when they totally lose
> > > control?
> >
> > iommu will map io address to system ram, right? not to reserved ram, but
> > yes I'm assuming the page table is right, but I was worrying they are corrupted
> > while kernel panic is happening.
>
> OK, I think we may need to think more about the old context tables
> reuse. Currently dmar faults will cause error or warning message,
> occasionally will cause system with iommu hang in kdump kernel. I don't
> know what will happen if old root tables or context tables are corrupted
> by evil code. For kdump kernel which use the similar mechanism there's a
> verification. When load kdump kernel into reserved crashkernel region a
> sha256 sum is calculated, then verify it when jump into kdump kernel
> after panic. If corrupted context tables will bring worse result, then
> we need consider giving it up and change back to the old way and try
> to dump though there's error message.
>
> Hi Zhenhua,
>
> I don't know what's your plan about verification whether old root tables
> or old context tables are corrupted. Or have you experimented that what
> will happen if old tables are corrupted on purpose.
>
> I am fine if you just put this in a TODO list since that's truly in a
> rare case. But it maybe necessary to tell it in patch log.
>
> Thanks
> Baoquan





Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-28 Thread Baoquan He
On 04/24/15 at 04:49pm, Dave Young wrote:
> On 04/24/15 at 04:35pm, Baoquan He wrote:
> > On 04/24/15 at 04:25pm, Dave Young wrote:
> > > Hi, Baoquan
> > > 
> > > > I support this patchset.
> > > > 
> > > > We should not fear oldmem since reserved crashkernel region is similar.
> > > > No one can guarantee that any crazy code won't step into crashkernel
> > > > region just because 1st kernel says it's reserved for kdump kernel. Here
> > > > the root table and context tables are also not built to allow legal code
> > > > to damage. Both of them has the risk to be corrupted, for trying our
> > > > best to get a dumped vmcore the risk is worth being taken.
> > > 
> > > old mem is mapped in 1st kernel so compare with the reserved crashkernel
> > > they are more likely to be corrupted. they are totally different. 
> > 
> > Could you tell how and why they are different? Wrong code will choose
> > root tables and context tables to damage when they totally lose
> > control?
> 
> iommu will map io address to system ram, right? not to reserved ram, but
> yes I'm assuming the page table is right, but I was worrying they are corrupted
> while kernel panic is happening.

OK, I think we may need to think more about reusing the old context
tables. Currently DMAR faults cause error or warning messages, and
occasionally cause a system with an IOMMU to hang in the kdump kernel. I
don't know what will happen if the old root tables or context tables are
corrupted by evil code. The kdump kernel, which uses a similar mechanism,
has a verification step: when the kdump kernel is loaded into the reserved
crashkernel region a sha256 sum is calculated, and it is verified when
jumping into the kdump kernel after a panic. If corrupted context tables
would bring a worse result, then we need to consider giving this up,
changing back to the old way, and trying to dump despite the error messages.
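
The load-time checksum and jump-time verification mentioned above can be sketched roughly as follows. This is only an illustration in Python, not the kexec implementation: the names record_digest and verify_digest, and the byte buffers standing in for the reserved crashkernel region, are assumptions made for the example.

```python
import hashlib


def record_digest(segments):
    """Hash every loaded segment, in order, at load time."""
    h = hashlib.sha256()
    for seg in segments:
        h.update(bytes(seg))
    return h.digest()


def verify_digest(segments, expected):
    """Recompute the hash before jumping and compare with the stored one."""
    return record_digest(segments) == expected


# Hypothetical stand-ins for the kdump kernel image and its initrd.
reserved = [bytearray(b"kdump-kernel-image"), bytearray(b"kdump-initrd")]
stored = record_digest(reserved)

# Nothing has touched the region yet, so verification passes.
intact = verify_digest(reserved, stored)

# Simulate a stray write from the crashing kernel into the reserved region.
reserved[0][0] ^= 0xFF
corruption_detected = not verify_digest(reserved, stored)
```

The check works here because the reserved region is not supposed to change between load and panic. Whether something comparable is feasible for the old root and context tables is the open question in this thread.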

Hi Zhenhua,

I don't know what your plan is for verifying whether the old root tables
or old context tables are corrupted. Or have you experimented with what
happens if the old tables are corrupted on purpose?

I am fine if you just put this on a TODO list, since it is truly a
rare case. But it may be necessary to mention it in the patch log.

Thanks
Baoquan



Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Dave Young
On 04/24/15 at 04:35pm, Baoquan He wrote:
> On 04/24/15 at 04:25pm, Dave Young wrote:
> > Hi, Baoquan
> > 
> > > I support this patchset.
> > > 
> > > We should not fear oldmem since reserved crashkernel region is similar.
> > > No one can guarantee that any crazy code won't step into crashkernel
> > > > region just because 1st kernel says it's reserved for kdump kernel. Here
> > > > the root table and context tables are also not built to allow legal code
> > > > to damage. Both of them has the risk to be corrupted, for trying our
> > > > best to get a dumped vmcore the risk is worth being taken.
> > > 
> > > old mem is mapped in 1st kernel so compare with the reserved crashkernel
> > > they are more likely to be corrupted. they are totally different. 
> > 
> > Could you tell how and why they are different? Wrong code will choose
> > root tables and context tables to damage when they totally lose
> > control?

The IOMMU maps I/O addresses to system RAM, right? Not to reserved RAM. But
yes, I'm assuming the page tables are right, and I worry that they get
corrupted while the kernel panic is happening.

> 
> > 
> > > 
> > > And the resetting pci way has been NACKed by David Woodhouse, the
> > > maintainer of intel iommu. Because the place calling the resetting pci
> > > code is ugly before kdump kernel or in kdump kernel. And as he said a
> > > certain device made mistakes why we blame on all devices. We should fix
> > > that device who made mistakes. 
> > 
> > Resetting pci bus is not ugly than fixing a problem with risk and to fix
> > the problem it introduced in the future.
> 
> There's a problem, we fix the problem. If that's uglier, I need redefine
> the 'ugly' in my personal dict. You mean the problem it could introduce
> is wrong code will damage root table and context tables, why don't we
> fix that wrong code, but blame innocent context tables? So you mean
> these tables should deserve being damaged by wrong code?

I would be more than happy to see this issue fixed in the patchset, but I do
not agree with adding the code while it has such problems. OTOH, for now it
seems there's no way to fix it.

> 
> > 
> > I know it is late to speak out, but sorry I still object and have to NACK this
> > oldmem approach from my point.
> > 
> > Thanks
> > Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Baoquan He
On 04/24/15 at 04:25pm, Dave Young wrote:
> Hi, Baoquan
> 
> > I support this patchset.
> > 
> > We should not fear oldmem since reserved crashkernel region is similar.
> > No one can guarantee that any crazy code won't step into crashkernel
> > region just because 1st kernel says it's reserved for kdump kernel. Here
> > the root table and context tables are also not built to allow legal code
> > to damage. Both of them has the risk to be corrupted, for trying our
> > best to get a dumped vmcore the risk is worth being taken.
> 
> old mem is mapped in 1st kernel so compare with the reserved crashkernel
> they are more likely to be corrupted. they are totally different. 

Could you tell how and why they are different? Will wrong code choose
root tables and context tables to damage when it totally loses
control?

> 
> > 
> > And the resetting pci way has been NACKed by David Woodhouse, the
> > maintainer of intel iommu. Because the place calling the resetting pci
> > code is ugly before kdump kernel or in kdump kernel. And as he said a
> > certain device made mistakes why we blame on all devices. We should fix
> > that device who made mistakes. 
> 
> Resetting pci bus is not ugly than fixing a problem with risk and to fix
> the problem it introduced in the future.

There's a problem, we fix the problem. If that's uglier, I need to redefine
'ugly' in my personal dictionary. You mean the problem it could introduce
is that wrong code will damage the root table and context tables; why don't
we fix that wrong code instead of blaming the innocent context tables? So
you mean these tables deserve to be damaged by wrong code?

> 
> I know it is late to speak out, but sorry I still object and have to NACK this
> oldmem approach from my point.
> 
> Thanks
> Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Dave Young
Hi, Baoquan

> I support this patchset.
> 
> We should not fear oldmem since reserved crashkernel region is similar.
> No one can guarantee that any crazy code won't step into crashkernel
> region just because 1st kernel says it's reserved for kdump kernel. Here
> the root table and context tables are also not built to allow legal code
> to damage. Both of them has the risk to be corrupted, for trying our
> best to get a dumped vmcore the risk is worth being taken.

Old memory is mapped in the 1st kernel, so compared with the reserved
crashkernel region it is more likely to be corrupted. They are totally different.

> 
> And the resetting pci way has been NACKed by David Woodhouse, the
> maintainer of intel iommu. Because the place calling the resetting pci
> code is ugly before kdump kernel or in kdump kernel. And as he said a
> certain device made mistakes why we blame on all devices. We should fix
> that device who made mistakes. 

Resetting the PCI bus is no uglier than fixing a problem in a risky way and
then having to fix the problems that introduces in the future.

I know it is late to speak out, but sorry, I still object and have to NACK this
oldmem approach from my point of view.

Thanks
Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Baoquan He
On 04/15/15 at 02:48pm, Dave Young wrote:
> On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> > On 04/15/2015 08:57 AM, Dave Young wrote:
> > >Again, I think it is bad to use old page table, below issues need consider:
> > >1) make sure old page table are reliable across crash
> > >2) do not allow writing oldmem after crash
> > >
> > >Please correct me if I'm wrong, or if above is not doable I think I will vote for
> > >resetting pci bus.
> > >
> > >Thanks
> > >Dave
> > >
> > Hi Dave,
> > 
> > When updating the context tables, we have to write their address to root
> > tables, this will cause writing to old mem.
> > 
> > Resetting the pci bus has been discussed, please check this:
> > http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> > https://lkml.org/lkml/2014/10/21/890
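
For readers unfamiliar with the structures under discussion: in VT-d legacy mode the root table is indexed by PCI bus number, and each present root entry points to a context table indexed by devfn. The following is a minimal Python sketch of why publishing a new context table implies a store into the root table. The class and function names are hypothetical, the addresses are made up, and the entry layout is simplified to a present bit plus a 4KB-aligned pointer.

```python
PRESENT = 0x1        # bit 0 of the (simplified) root entry
ADDR_MASK = ~0xFFF   # the context table pointer is 4KB-aligned


def devfn(dev, fn):
    """Pack a PCI device/function pair into the context table index."""
    return (dev << 3) | fn


class RootTable:
    """Toy model: 256 root entries, one per PCI bus."""

    def __init__(self):
        self.entries = [0] * 256
        self.stores = 0  # counts writes into this table's memory

    def set_context_table(self, bus, ctx_addr):
        if ctx_addr & 0xFFF:
            raise ValueError("context table must be 4KB-aligned")
        self.entries[bus] = ctx_addr | PRESENT
        self.stores += 1

    def context_table(self, bus):
        entry = self.entries[bus]
        return (entry & ADDR_MASK) if entry & PRESENT else None


# Root table as left behind by the crashed kernel.
old_root = RootTable()
old_root.set_context_table(0x00, 0x1000)
old_root.stores = 0  # start counting from the moment of the crash

# The kdump kernel builds an updated context table at a new address and
# has to publish it by writing the old root table entry for that bus.
old_root.set_context_table(0x00, 0x2000)
```

In this model the single store counted in old_root.stores is the write to old memory that the quoted text refers to.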

I support this patchset.

We should not fear oldmem, since the reserved crashkernel region is similar.
No one can guarantee that crazy code won't step into the crashkernel
region just because the 1st kernel says it's reserved for the kdump kernel.
Likewise, the root table and context tables are not built for legal code
to damage. Both are at risk of being corrupted; for the sake of trying our
best to get a dumped vmcore, the risk is worth taking.

And the PCI-reset approach has been NACKed by David Woodhouse, the
maintainer of intel iommu, because calling the PCI reset code is ugly
whether it is done before the kdump kernel or in the kdump kernel. And as
he said, if a certain device made a mistake, why should we blame all
devices? We should fix the device that made the mistake.

As for me, being periodically poked by customers asking how the iommu fix
is going, I really think this patchset is good enough. Are we going to
do nothing just because there's a risk with a tiny probability, or because
it is not perfect enough? I think people won't agree. Otherwise kdump could
have been killed when its author proposed it, since the crashkernel reserved
region is risky and could be corrupted by the 1st kernel.

Anyway, let's compromise a little. At worst it can be reverted if it's
not satisfactory.

Personal opinion.

By the way, I tested it and it works well on my HP z420 workstation.

Thanks
Baoquan


> 
> I know one reason to use old pgtable is this looks better because it fixes the
> real problem, but it is not a good way if it introduce more problems because
> of it have to use oldmem. I will be glad if this is not a problem but I have
> not been convinced.
> 
> OTOH, there's many types of iommu, intel, amd, a lot of other types. They need
> their own fixes, so it looks not that elegant.
> 
> For pci reset, it is not perfect, but it has another advantage, the patch is
> simpler. The problem I see from the old discussion is, reset bus in 2nd
> kernel is acceptable but it does not fix things on sparc platform. AFAIK
> current reported problems are intel and amd iommu, at least pci reset stuff
> does not make it worse.
> 
> Thanks
> Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Dave Young
On 04/24/15 at 04:35pm, Baoquan He wrote:
> On 04/24/15 at 04:25pm, Dave Young wrote:
> > Hi, Baoquan
> >
> > > I support this patchset.
> > >
> > > We should not fear oldmem since reserved crashkernel region is similar.
> > > No one can guarantee that any crazy code won't step into crashkernel
> > > region just because 1st kernel says it's reserved for kdump kernel. Here
> > > the root table and context tables are also not built to allow legal code
> > > to damage. Both of them has the risk to be corrupted, for trying our
> > > best to get a dumped vmcore the risk is worth being taken.
> >
> > old mem is mapped in 1st kernel so compare with the reserved crashkernel
> > they are more likely to be corrupted. they are totally different.
>
> Could you tell how and why they are different? Wrong code will choose
> root tables and context tables to damage when they totally lose
> control?

The iommu maps io addresses to system ram, right? Not to reserved ram. Yes,
I'm assuming the page tables are right, but I am worried that they get
corrupted while the kernel panic is happening.
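
To make the worry concrete, here is a minimal sketch of the kind of sanity
check a kdump kernel could apply to an inherited root entry before following
it. It is not part of the patchset. The function name, the `ram_end`
parameter and the entry layout (legacy 128-bit root entry, bit 0 present,
bits 11:1 and the upper half reserved) are assumptions made for illustration.

```python
PAGE_SIZE = 0x1000
PRESENT = 1 << 0        # assumed: bit 0 marks the root entry as present
RESERVED_LOW = 0xFFE    # assumed: bits 11:1 are reserved and must be zero


def root_entry_looks_sane(lo, hi, ram_end):
    """Return True if an inherited root entry (two 64-bit halves) is plausible."""
    if not lo & PRESENT:
        return True                  # other fields carry no meaning when not present
    if lo & RESERVED_LOW or hi != 0:
        return False                 # reserved bits set: entry was likely overwritten
    ctx_table = lo & ~(PAGE_SIZE - 1)
    return ctx_table + PAGE_SIZE <= ram_end   # context table must lie inside RAM
```

A check like this can only catch some kinds of corruption. An entry that was
overwritten with a well-formed but wrong pointer would still pass, which is
the core of the objection.
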

 
  
   
> > > And the resetting pci way has been NACKed by David Woodhouse, the
> > > maintainer of intel iommu. Because the place calling the resetting pci
> > > code is ugly before kdump kernel or in kdump kernel. And as he said a
> > > certain device made mistakes why we blame on all devices. We should fix
> > > that device who made mistakes.
> >
> > Resetting pci bus is not ugly than fixing a problem with risk and to fix
> > the problem it introduced in the future.
>
> There's a problem, we fix the problem. If that's uglier, I need redefine
> the 'ugly' in my personal dict. You mean the problem it could introduce
> is wrong code will damage root table and context tables, why don't we
> fix that wrong code, but blame innocent context tables? So you mean
> these tables should deserve being damaged by wrong code?

I would be more than happy to see this issue fixed in the patchset, but I do
not agree to adding the code with such problems. OTOH, for now it seems
there's no way to fix it.

 
  
> > I know it is late to speak out, but sorry I still object and have to NACK
> > this oldmem approach from my point.
> >
> > Thanks
> > Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Dave Young
Hi, Baoquan

> I support this patchset.
>
> We should not fear oldmem since reserved crashkernel region is similar.
> No one can guarantee that any crazy code won't step into crashkernel
> region just because 1st kernel says it's reserved for kdump kernel. Here
> the root table and context tables are also not built to allow legal code
> to damage. Both of them has the risk to be corrupted, for trying our
> best to get a dumped vmcore the risk is worth being taken.

Old memory is mapped in the 1st kernel, so compared with the reserved
crashkernel region it is more likely to be corrupted. They are totally
different.

> And the resetting pci way has been NACKed by David Woodhouse, the
> maintainer of intel iommu. Because the place calling the resetting pci
> code is ugly before kdump kernel or in kdump kernel. And as he said a
> certain device made mistakes why we blame on all devices. We should fix
> that device who made mistakes.

Resetting the pci bus is no uglier than fixing a problem with a risky
approach and then having to fix the problems it introduces in the future.

I know it is late to speak out, but sorry, I still object and have to NACK
this oldmem approach from my point of view.

Thanks
Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-24 Thread Baoquan He
On 04/24/15 at 04:25pm, Dave Young wrote:
> Hi, Baoquan
>
> > I support this patchset.
> >
> > We should not fear oldmem since reserved crashkernel region is similar.
> > No one can guarantee that any crazy code won't step into crashkernel
> > region just because 1st kernel says it's reserved for kdump kernel. Here
> > the root table and context tables are also not built to allow legal code
> > to damage. Both of them has the risk to be corrupted, for trying our
> > best to get a dumped vmcore the risk is worth being taken.
>
> old mem is mapped in 1st kernel so compare with the reserved crashkernel
> they are more likely to be corrupted. they are totally different.

Could you tell me how and why they are different? Would wrong code choose
the root tables and context tables to damage once it has totally lost
control?

> > And the resetting pci way has been NACKed by David Woodhouse, the
> > maintainer of intel iommu. Because the place calling the resetting pci
> > code is ugly before kdump kernel or in kdump kernel. And as he said a
> > certain device made mistakes why we blame on all devices. We should fix
> > that device who made mistakes.
>
> Resetting pci bus is not ugly than fixing a problem with risk and to fix
> the problem it introduced in the future.

There's a problem, we fix the problem. If that's uglier, I need to redefine
'ugly' in my personal dictionary. You mean the problem it could introduce
is that wrong code will damage the root table and context tables. Why don't
we fix that wrong code, instead of blaming the innocent context tables? So
you mean these tables deserve to be damaged by wrong code?

> I know it is late to speak out, but sorry I still object and have to NACK this
> oldmem approach from my point.
>
> Thanks
> Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-20 Thread Dave Young
Hi,

On 04/21/15 at 09:39am, Li, ZhenHua wrote:
> Hi Dave,
> I found the old mail:
> http://lkml.iu.edu/hypermail/linux/kernel/1410.2/03584.html

I know and I have read it before.

=====  quote  =====
> > > So with this in mind I would prefer initially taking over the
> > > page-tables from the old kernel before the device drivers re-initialize
> > > the devices.
> >
> > This makes the dump kernel more dependent on data from the old kernel,
> > which we obviously want to avoid when possible.

> Sure, but this is not really possible here (unless we have a generic and
> reliable way to reset all PCI endpoint devices and cancel all in-flight
> DMA before we disable the IOMMU in the kdump kernel).
> Otherwise we always risk data corruption somewhere, in system memory or
> on disk.
=====  quote  =====

What I understand from the above is that it is not really possible to avoid
the problem.

But IMHO we should avoid it, or we will have problems in the future. If we
really cannot avoid it, I would say switching to the pci reset way is better.

> 
> Please check this and you will find the discussion.
> 
> Regards
> Zhenhua
> 
> On 04/15/2015 02:48 PM, Dave Young wrote:
> >On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> >>On 04/15/2015 08:57 AM, Dave Young wrote:
> >>>Again, I think it is bad to use old page table, below issues need consider:
> >>>1) make sure old page table are reliable across crash
> >>>2) do not allow writing oldmem after crash
> >>>
> >>>Please correct me if I'm wrong, or if above is not doable I think I will 
> >>>vote for
> >>>resetting pci bus.
> >>>
> >>>Thanks
> >>>Dave
> >>>
> >>Hi Dave,
> >>
> >>When updating the context tables, we have to write their address to root
> >>tables, this will cause writing to old mem.
> >>
> >>Resetting the pci bus has been discussed, please check this:
> >>http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> >>https://lkml.org/lkml/2014/10/21/890
> >
> >I know one reason to use old pgtable is this looks better because it fixes
> >the real problem, but it is not a good way if it introduce more problems
> >because of it have to use oldmem. I will be glad if this is not a problem
> >but I have not been convinced.
> >
> >OTOH, there's many types of iommu, intel, amd, a lot of other types. They
> >need their own fixes, so it looks not that elegant.
> >
> >For pci reset, it is not perfect, but it has another advantage, the patch is
> >simpler. The problem I see from the old discussion is, reset bus in 2nd
> >kernel is acceptable but it does not fix things on sparc platform. AFAIK
> >current reported problems are intel and amd iommu, at least pci reset stuff
> >does not make it worse.
> >
> >Thanks
> >Dave
> >
> 


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-20 Thread Li, ZhenHua

Hi Dave,
I found the old mail:
http://lkml.iu.edu/hypermail/linux/kernel/1410.2/03584.html

Please check this and you will find the discussion.

Regards
Zhenhua

On 04/15/2015 02:48 PM, Dave Young wrote:
> On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> > On 04/15/2015 08:57 AM, Dave Young wrote:
> > > Again, I think it is bad to use old page table, below issues need consider:
> > > 1) make sure old page table are reliable across crash
> > > 2) do not allow writing oldmem after crash
> > >
> > > Please correct me if I'm wrong, or if above is not doable I think I will
> > > vote for resetting pci bus.
> > >
> > > Thanks
> > > Dave
> >
> > Hi Dave,
> >
> > When updating the context tables, we have to write their address to root
> > tables, this will cause writing to old mem.
> >
> > Resetting the pci bus has been discussed, please check this:
> > http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> > https://lkml.org/lkml/2014/10/21/890
>
> I know one reason to use old pgtable is this looks better because it fixes the
> real problem, but it is not a good way if it introduce more problems because of
> it have to use oldmem. I will be glad if this is not a problem but I have not
> been convinced.
>
> OTOH, there's many types of iommu, intel, amd, a lot of other types. They need
> their own fixes, so it looks not that elegant.
>
> For pci reset, it is not perfect, but it has another advantage, the patch is
> simpler. The problem I see from the old discusssion is, reset bus in 2nd kernel
> is acceptable but it does not fix things on sparc platform. AFAIK current
> reported problems are intel and amd iommu, at least pci reset stuff does not
> make it worse.
>
> Thanks
> Dave





Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-15 Thread Dave Young
On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> On 04/15/2015 08:57 AM, Dave Young wrote:
> >Again, I think it is bad to use old page table, below issues need consider:
> >1) make sure old page table are reliable across crash
> >2) do not allow writing oldmem after crash
> >
> >Please correct me if I'm wrong, or if above is not doable I think I will
> >vote for resetting pci bus.
> >
> >Thanks
> >Dave
> >
> Hi Dave,
> 
> When updating the context tables, we have to write their address to root
> tables, this will cause writing to old mem.
> 
> Resetting the pci bus has been discussed, please check this:
> http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> https://lkml.org/lkml/2014/10/21/890

I know one reason to use the old pgtable is that it looks better because it
fixes the real problem, but it is not a good way if it introduces more
problems because it has to use oldmem. I will be glad if this is not a
problem, but I have not been convinced.

OTOH, there are many types of iommu: intel, amd, and a lot of others. They
each need their own fixes, so it does not look that elegant.

The pci reset is not perfect, but it has another advantage: the patch is
simpler. The problem I see from the old discussion is that resetting the bus
in the 2nd kernel is acceptable, but it does not fix things on the sparc
platform. AFAIK the currently reported problems are with intel and amd iommu;
at least the pci reset does not make things worse.

Thanks
Dave


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-14 Thread Li, ZhenHua

On 04/15/2015 08:57 AM, Dave Young wrote:
> Again, I think it is bad to use old page table, below issues need consider:
> 1) make sure old page table are reliable across crash
> 2) do not allow writing oldmem after crash
>
> Please correct me if I'm wrong, or if above is not doable I think I will
> vote for resetting pci bus.
>
> Thanks
> Dave

Hi Dave,

When updating the context tables, we have to write their addresses into the
root table, which means writing to old mem.

Resetting the pci bus has been discussed, please check this:
http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
https://lkml.org/lkml/2014/10/21/890

Thanks
Zhenhua





Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-14 Thread Dave Young
On 04/10/15 at 04:42pm, Li, Zhen-Hua wrote:
> This patchset is an update of Bill Sumner's patchset, implements a fix for:
> If a kernel boots with intel_iommu=on on a system that supports intel vt-d, 
> when a panic happens, the kdump kernel will boot with these faults:
> 
> dmar: DRHD: handling fault status reg 102
> dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
> DMAR:[fault reason 01] Present bit in root entry is clear
> 
> dmar: DRHD: handling fault status reg 2
> dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
> INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
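
For reference, the two "fault status reg" values in the log above can be
decoded by hand. The following is only a sketch. It assumes the value is
printed in hexadecimal and that the register follows the VT-d Fault Status
Register layout (bit 0 primary fault overflow, bit 1 primary pending fault,
bits 15:8 fault record index). The function name is made up for illustration.

```python
# Sketch only: decode the "fault status reg" values quoted above.
# Assumed layout (VT-d Fault Status Register): bit 0 = primary fault
# overflow, bit 1 = primary pending fault, bits 15:8 = fault record index.
FSTS_PFO = 1 << 0
FSTS_PPF = 1 << 1


def decode_fault_status(text):
    """Decode a status value given as the hex string printed in the log."""
    value = int(text, 16)
    return {
        "overflow": bool(value & FSTS_PFO),
        "pending": bool(value & FSTS_PPF),
        "record_index": (value >> 8) & 0xFF,
    }


if __name__ == "__main__":
    for reg in ("102", "2"):
        print(reg, decode_fault_status(reg))
```

Under these assumptions both values report a pending primary fault and no
overflow; they differ only in the fault record index (1 versus 0).
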
> 
> On some system, the interrupt remapping fault will also happen even if the 
> intel_iommu is not set to on, because the interrupt remapping will be enabled 
> when x2apic is needed by the system.
> 
> The cause of the DMA fault is described in Bill's original version, and the 
> INTR-Remap fault is caused by a similar reason. In short, the initialization 
> of vt-d drivers causes the in-flight DMA and interrupt requests get wrong 
> response.
> 
> To fix this problem, we modifies the behaviors of the intel vt-d in the 
> crashdump kernel:
> 
> For DMA Remapping:
> 1. To accept the vt-d hardware in an active state,
> 2. Do not disable and re-enable the translation, keep it enabled.
> 3. Use the old root entry table, do not rewrite the RTA register.
> 4. Malloc and use new context entry table, copy data from the old ones that
>    used by the old kernel.
> 5. Keep using the old page tables before driver is loaded.
> 6. After device driver is loaded, when it issues the first dma_map command,
>    free the dmar_domain structure for this device, and generate a new one, so
>    that the device can be assigned a new and empty page table.
> 7. When a new context entry table is generated, we also save its address to
>    the old root entry table.
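
Steps 3, 4 and 7 above can be pictured with a toy model. This is a sketch
under simplifying assumptions, not the patchset's code: `ToyMemory` and
`take_over_context_tables` are hypothetical names, tables are plain lists,
and a root entry is reduced to a page-aligned pointer with a present bit in
bit 0.

```python
PAGE_SIZE = 0x1000
PRESENT = 1 << 0            # assumed: bit 0 marks a root entry as present
LOW_BITS = PAGE_SIZE - 1    # flag bits kept below the page-aligned pointer


class ToyMemory:
    """Toy physical memory: page-aligned address -> table (list of ints)."""

    def __init__(self, first_free=0x100000):
        self.pages = {}
        self.next_free = first_free

    def alloc(self, entries):
        """Place a copy of `entries` on a fresh page and return its address."""
        addr = self.next_free
        self.next_free += PAGE_SIZE
        self.pages[addr] = list(entries)
        return addr


def take_over_context_tables(mem, old_root_addr):
    """Copy every present context table and repoint the old root table at it."""
    root = mem.pages[old_root_addr]       # step 3: the old root table is reused
    for bus, entry in enumerate(root):
        if not entry & PRESENT:
            continue                      # nothing was set up for this bus
        old_ctx = mem.pages[entry & ~LOW_BITS]
        new_ctx_addr = mem.alloc(old_ctx)               # step 4: new table, old data
        root[bus] = new_ctx_addr | (entry & LOW_BITS)   # step 7: write to old root
    return root
```

The in-place update of `root[bus]` is the write into the old kernel's memory
that the oldmem objection in this thread is about; the old context tables
themselves are only read.
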
> 
> For Interrupt Remapping:
> 1. To accept the vt-d hardware in an active state,
> 2. Do not disable and re-enable the interrupt remapping, keep it enabled.
> 3. Use the old interrupt remapping table, do not rewrite the IRTA register.
> 4. When ioapic entry is setup, the interrupt remapping table is changed, and
>    the updated data will be stored to the old interrupt remapping table.
> 
> Advantages of this approach:
> 1. All manipulation of the IO-device is done by the Linux device-driver
>    for that device.
> 2. This approach behaves in a manner very similar to operation without an
>    active iommu.
> 3. Any activity between the IO-device and its RMRR areas is handled by the
>    device-driver in the same manner as during a non-kdump boot.
> 4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
>    This supports the practice of creating a special kdump kernel without
>    drivers for any devices that are not required for taking a crashdump.
> 5. Minimal code-changes among the existing mainline intel vt-d code.
> 
> Summary of changes in this patch set:
> 1. Added some useful function for root entry table in code intel-iommu.c
> 2. Added new members to struct root_entry and struct irte;
> 3. Functions to load old root entry table to iommu->root_entry from the
>    memory of old kernel.
> 4. Functions to malloc new context entry table and copy the data from the old
>    ones to the malloced new ones.
> 5. Functions to enable support for DMA remapping in kdump kernel.
> 6. Functions to load old irte data from the old kernel to the kdump kernel.
> 7. Some code changes that support other behaviours that have been listed.
> 8. In the new functions, use physical address as "unsigned long" type, not
>    pointers.
> 
> Original version by Bill Sumner:
> https://lkml.org/lkml/2014/1/10/518
> https://lkml.org/lkml/2014/4/15/716
> https://lkml.org/lkml/2014/4/24/836
> 
> Zhenhua's updates:
> https://lkml.org/lkml/2014/10/21/134
> https://lkml.org/lkml/2014/12/15/121
> https://lkml.org/lkml/2014/12/22/53
> https://lkml.org/lkml/2015/1/6/1166
> https://lkml.org/lkml/2015/1/12/35
> https://lkml.org/lkml/2015/3/19/33
> 
> Changelog[v10]:
> 1. Do not use CONFIG_CRASH_DUMP and is_kdump_kernel().
>    Use one flag which stores the te and ir status in last kernel:
>        iommu->pre_enabled_trans
>        iommu->pre_enabled_ir
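
The cover letter does not say how these two flags are filled in. One
plausible reading, sketched here as an assumption, is that they are derived
from the hardware's Global Status Register when the kdump kernel first looks
at the unit. The bit positions (TES in bit 31, IRES in bit 25) follow the
VT-d register layout; the function name is hypothetical.

```python
# Assumed bit positions in the VT-d Global Status Register (GSTS).
GSTS_TES = 1 << 31    # translation enable status
GSTS_IRES = 1 << 25   # interrupt remapping enable status


def pre_enabled_flags(gsts):
    """Derive the two 'left enabled by the last kernel' flags from a GSTS value."""
    return {
        "pre_enabled_trans": bool(gsts & GSTS_TES),
        "pre_enabled_ir": bool(gsts & GSTS_IRES),
    }
```

Keying the behaviour on what the hardware reports, instead of on
is_kdump_kernel(), would also cover any other boot in which the unit was left
enabled, though whether that is the intent is not stated in the changelog.
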
> 
> Changelog[v9]:
> 1. Add new function iommu_attach_domain_with_id.
> 2. Do not copy old page tables, keep using the old ones.
> 3. Remove functions:
>        intel_iommu_did_to_domain_values_entry
>        intel_iommu_get_dids_from_old_kernel
>        device_to_domain_id
>        copy_page_addr
>        copy_page_table
>        copy_context_entry
>        copy_context_entry_table
> 4. Add new function device_to_existing_context_entry.
> 
> Changelog[v8]:
> 1. Add a missing __iommu_flush_cache in function copy_page_table.
> 
> Changelog[v7]:
> 1. Use __iommu_flush_cache to 

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-14 Thread Dave Young
On 04/10/15 at 04:42pm, Li, Zhen-Hua wrote:
 This patchset is an update of Bill Sumner's patchset, implements a fix for:
 If a kernel boots with intel_iommu=on on a system that supports intel vt-d, 
 when a panic happens, the kdump kernel will boot with these faults:
 
 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear
 
 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
 
 On some system, the interrupt remapping fault will also happen even if the 
 intel_iommu is not set to on, because the interrupt remapping will be enabled 
 when x2apic is needed by the system.
 
 The cause of the DMA fault is described in Bill's original version, and the 
 INTR-Remap fault is caused by a similar reason. In short, the initialization 
 of vt-d drivers causes the in-flight DMA and interrupt requests get wrong 
 response.
 
 To fix this problem, we modifies the behaviors of the intel vt-d in the 
 crashdump kernel:
 
 For DMA Remapping:
 1. To accept the vt-d hardware in an active state,
 2. Do not disable and re-enable the translation, keep it enabled.
 3. Use the old root entry table, do not rewrite the RTA register.
 4. Malloc and use new context entry table, copy data from the old ones that
used by the old kernel.
 5. Keep using the old page tables before driver is loaded.
 6. After device driver is loaded, when it issues the first dma_map command, 
free the dmar_domain structure for this device, and generate a new one, so 
that the device can be assigned a new and empty page table. 
 7. When a new context entry table is generated, we also save its address to 
the old root entry table.
 
 For Interrupt Remapping:
 1. To accept the vt-d hardware in an active state,
 2. Do not disable and re-enable the interrupt remapping, keep it enabled.
 3. Use the old interrupt remapping table, do not rewrite the IRTA register.
 4. When an ioapic entry is set up, the interrupt remapping table changes, and
the updated data is stored in the old interrupt remapping table.
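
Step 4 above can be sketched in userspace as an in-place update of the
inherited table. This is an illustration under assumptions: irte_sim,
ir_table_sim and irte_store_in_old_table are hypothetical names, not the
kernel's, and the real entry layout is more detailed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified 128-bit interrupt remapping table entry. */
struct irte_sim { uint64_t lo, hi; };

/* Hypothetical view of the table inherited from the old kernel. */
struct ir_table_sim {
    struct irte_sim *base;  /* old table, still referenced by the hardware */
    size_t nr_entries;
};

/*
 * Write an updated entry into the inherited table in place, rather than
 * building a new table and reprogramming the table address.
 * Returns 0 on success, -1 if there is no table or the index is out of range.
 */
int irte_store_in_old_table(struct ir_table_sim *t, size_t index,
                            struct irte_sim val)
{
    assert(t != NULL);  /* precondition: caller passes a valid descriptor */
    if (t->base == NULL || index >= t->nr_entries)
        return -1;
    t->base[index] = val;
    return 0;
}
```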
 
 Advantages of this approach:
 1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
 2. This approach behaves in a manner very similar to operation without an
active iommu.
 3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
 4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump. 
 5. Minimal code-changes among the existing mainline intel vt-d code.
 
 Summary of changes in this patch set:
 1. Added some useful functions for the root entry table in intel-iommu.c.
 2. Added new members to struct root_entry and struct irte.
 3. Functions to load the old root entry table to iommu->root_entry from the
memory of the old kernel.
 4. Functions to allocate new context entry tables and copy the data from the
old ones to the newly allocated ones.
 5. Functions to enable support for DMA remapping in kdump kernel.
 6. Functions to load old irte data from the old kernel to the kdump kernel.
 7. Some code changes that support other behaviours that have been listed.
 8. In the new functions, use physical address as unsigned long type, not 
pointers.
 
 Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836
 
 Zhenhua's updates:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121
 https://lkml.org/lkml/2014/12/22/53
 https://lkml.org/lkml/2015/1/6/1166
 https://lkml.org/lkml/2015/1/12/35
 https://lkml.org/lkml/2015/3/19/33
 
 Changelog[v10]:
 1. Do not use CONFIG_CRASH_DUMP and is_kdump_kernel().
Use flags that store the TE and IR status left by the last kernel:
iommu->pre_enabled_trans
iommu->pre_enabled_ir
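
One way such flags could be derived is from a snapshot of the VT-d Global
Status register, sketched below in userspace C. The bit positions (TES at
bit 31, IRES at bit 25) follow the VT-d specification. The struct and the
function record_pre_enabled are hypothetical, and the patch itself may
populate the flags differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* VT-d Global Status Register bits. */
#define GSTS_TES  (UINT32_C(1) << 31)  /* translation enable status */
#define GSTS_IRES (UINT32_C(1) << 25)  /* interrupt remapping enable status */

/* Hypothetical stand-in for the per-IOMMU state; not the kernel's struct. */
struct iommu_state {
    bool pre_enabled_trans;
    bool pre_enabled_ir;
};

/* Record what the previous kernel left enabled, given a GSTS snapshot. */
void record_pre_enabled(struct iommu_state *s, uint32_t gsts)
{
    assert(s != NULL);  /* precondition: caller passes valid state */
    s->pre_enabled_trans = (gsts & GSTS_TES) != 0;
    s->pre_enabled_ir = (gsts & GSTS_IRES) != 0;
}
```

Keying off hardware state rather than is_kdump_kernel() means the same
path would apply whenever the unit is found already enabled.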
 
 Changelog[v9]:
 1. Add new function iommu_attach_domain_with_id.
 2. Do not copy old page tables, keep using the old ones.
 3. Remove functions:
intel_iommu_did_to_domain_values_entry
intel_iommu_get_dids_from_old_kernel
device_to_domain_id
copy_page_addr
copy_page_table
copy_context_entry
copy_context_entry_table
 4. Add new function device_to_existing_context_entry.
 
 Changelog[v8]:
 1. Add a missing __iommu_flush_cache in function copy_page_table.
 
 Changelog[v7]:
 1. Use __iommu_flush_cache to flush the data to hardware.
 
 Changelog[v6]:
 1. Use unsigned long as type of physical address.
 2. Use 

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-14 Thread Li, ZhenHua

On 04/15/2015 08:57 AM, Dave Young wrote:

Again, I think it is bad to use the old page tables; the issues below need to
be considered:
1) make sure the old page tables are reliable across a crash
2) do not allow writing to oldmem after a crash

Please correct me if I'm wrong, or if the above is not doable I think I will
vote for resetting the pci bus.

Thanks
Dave


Hi Dave,

When updating the context tables, we have to write their addresses to the root
tables, which means writing to old memory.


Resetting the pci bus has already been discussed; please see:
http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
https://lkml.org/lkml/2014/10/21/890

Thanks
Zhenhua


