Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On 4/2/2013 9:33 AM, Joerg Roedel wrote: Hi Suravee, On Wed, Mar 27, 2013 at 06:51:23PM -0500, suravee.suthikulpa...@amd.com wrote: From: Suravee Suthikulpanit suravee.suthikulpa...@amd.com Add logic to decode AMD IOMMU event flag based on information from AMD IOMMU specification. This should simplify debugging IOMMU errors. Also, dump DTE information in additional cases. The patch in general makes sense to have, but I have a couple of comments. +static void dump_flags(int flags, int ev_type) +{ + struct _event_log_flags *p = (struct _event_log_flags *) flags; + u32 err_type = p-type; + + pr_err(AMD-Vi: Flags details:\n); + pr_err(AMD-Vi:Guest / Nested : %u\n, p-gn); + pr_err(AMD-Vi:No Execute : %u\n, p-nx); + pr_err(AMD-Vi:User-Supervisor : %u\n, p-us); + pr_err(AMD-Vi:Interrupt : %u\n, p-i); + pr_err(AMD-Vi:Present : %u\n, p-pr); + pr_err(AMD-Vi:Read / Write: %u\n, p-rw); + pr_err(AMD-Vi:Permission : %u\n, p-pe); + pr_err(AMD-Vi:Reserv bit not zero / Illegal level encoding : %u\n, + p-rz); + pr_err(AMD-Vi:Translation / Transaction : %u\n, + p-tr); + pr_err(AMD-Vi:Type of error : (0x%x) , err_type); Printing the flags multi-line is much too noisy for IOMMU events. Just print a char-shortcut for each flag set on the same line. I will make the changes and send in for new patch for review. + + if ((ev_type == EVENT_TYPE_DEV_TAB_ERR) || + (ev_type == EVENT_TYPE_PAGE_TAB_ERR) || + (ev_type == EVENT_TYPE_CMD_HARD_ERR)) { + /* Only supports up to 2 bits */ + err_type = 0x3; + switch (err_type) { + case 0: + pr_err(Reserved\n); + break; + case 1: + pr_err(Master Abort\n); + break; + case 2: + pr_err(Target Abort\n); + break; + case 3: + pr_err(Data Error\n); + break; + } Why do you create string-arrays for other type-encodings but not for this one? I can do the same way if that's preferred. Thanks, Suravee + } else if (ev_type == EVENT_TYPE_INV_DEV_REQ) { + if (p-tr == 0) { + if (err_type ARRAY_SIZE(_invalid_translation_desc)) + printk(%s\n, + _invalid_translation_desc[err_type]); + } else { + if (err_type ARRAY_SIZE(_invalid_transaction_desc)) + printk(%s\n, + _invalid_transaction_desc[err_type]); pr_cont instead of printk. Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On Tue, Apr 02, 2013 at 04:33:36PM +0200, Joerg Roedel wrote: Hi Suravee, On Wed, Mar 27, 2013 at 06:51:23PM -0500, suravee.suthikulpa...@amd.com wrote: From: Suravee Suthikulpanit suravee.suthikulpa...@amd.com Add logic to decode AMD IOMMU event flag based on information from AMD IOMMU specification. This should simplify debugging IOMMU errors. Also, dump DTE information in additional cases. The patch in general makes sense to have, but I have a couple of comments. While you guys are at it, can someone fix this too pls (ASUS board with a PD on it). [0.220342] [Firmware Bug]: AMD-Vi: IOAPIC[9] not in IVRS table [0.220398] [Firmware Bug]: AMD-Vi: IOAPIC[10] not in IVRS table [0.220451] [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found in IVRS table [0.220506] AMD-Vi: Disabling interrupt remapping due to BIOS Bug(s) Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On Tue, Apr 02, 2013 at 04:40:37PM +0200, Borislav Petkov wrote: While you guys are at it, can someone fix this too pls (ASUS board with a PD on it). [0.220342] [Firmware Bug]: AMD-Vi: IOAPIC[9] not in IVRS table [0.220398] [Firmware Bug]: AMD-Vi: IOAPIC[10] not in IVRS table [0.220451] [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found in IVRS table [0.220506] AMD-Vi: Disabling interrupt remapping due to BIOS Bug(s) That is actually a BIOS problem. I wonder whether it would help to turn this into a WARN_ON to get the board vendors to release working BIOSes. Opinions? Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On Tue, Apr 02, 2013 at 05:03:04PM +0200, Joerg Roedel wrote: On Tue, Apr 02, 2013 at 04:40:37PM +0200, Borislav Petkov wrote: While you guys are at it, can someone fix this too pls (ASUS board with a PD on it). [0.220342] [Firmware Bug]: AMD-Vi: IOAPIC[9] not in IVRS table [0.220398] [Firmware Bug]: AMD-Vi: IOAPIC[10] not in IVRS table [0.220451] [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found in IVRS table [0.220506] AMD-Vi: Disabling interrupt remapping due to BIOS Bug(s) That is actually a BIOS problem. I wonder whether it would help to turn this into a WARN_ON to get the board vendors to release working BIOSes. Opinions? Good luck trying to get ASUS to fix anything in their BIOS :(. Can't we detect the SB IOAPIC some other way in this case? -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On 4/2/2013 10:29 AM, Borislav Petkov wrote: On Tue, Apr 02, 2013 at 05:03:04PM +0200, Joerg Roedel wrote: On Tue, Apr 02, 2013 at 04:40:37PM +0200, Borislav Petkov wrote: While you guys are at it, can someone fix this too pls (ASUS board with a PD on it). [0.220342] [Firmware Bug]: AMD-Vi: IOAPIC[9] not in IVRS table [0.220398] [Firmware Bug]: AMD-Vi: IOAPIC[10] not in IVRS table [0.220451] [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found in IVRS table [0.220506] AMD-Vi: Disabling interrupt remapping due to BIOS Bug(s) That is actually a BIOS problem. I wonder whether it would help to turn this into a WARN_ON to get the board vendors to release working BIOSes. Opinions? Good luck trying to get ASUS to fix anything in their BIOS :(. I have tried to contact Asus in the past to have them fix the issue, but I got no luck. Once it is out in the field, it's very difficult to get them to make changes. I am also addressing this issue with the BIOS team for the future hardware. Turning this into WARN_ON() at this point might break a lot of systems currently out in the field. However, users can always switching to use intremap=off but this might not be obvious. Suravee Can't we detect the SB IOAPIC some other way in this case? ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On Tue, Apr 02, 2013 at 05:29:56PM +0200, Borislav Petkov wrote: On Tue, Apr 02, 2013 at 05:03:04PM +0200, Joerg Roedel wrote: Good luck trying to get ASUS to fix anything in their BIOS :(. Hmm... Can't we detect the SB IOAPIC some other way in this case? I can certainly write a patch that works around your particular BIOS bug. The problem is that such a fix will most certainly break other systems. Unfortunatly there is no reliable way to fixup the IO-APIC-ID-DEVID mapping at runtime when the BIOS messed it up. The only thing I can do is to check for potential problems and disable the intremap feature then, so that the system will at least boot. Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On Tue, Apr 02, 2013 at 10:41:25AM -0500, Suthikulpanit, Suravee wrote: Turning this into WARN_ON() at this point might break a lot of systems currently out in the field. However, users can always switching to use intremap=off but this might not be obvious. A WARN_ON doesn't break systems, it just creates more noise. Probably enough noise to get board vendors to fix their stuff up. But there is more to consider before making more noise, of course :) Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
On Tue, Apr 02, 2013 at 09:32:40PM +0200, Borislav Petkov wrote: On Tue, Apr 02, 2013 at 06:33:18PM +0200, Joerg Roedel wrote: Okay, in theory I could implement a feedback loop between timer-setup and intremap code and try fixups until it works. But that seems not to be worth it to work around a buggy BIOS. Yeah, same here. It's not like we really need intremap to work - we're only trying to fix the annoying error message currently. :-) Right :) What I actually thought about was providing an IVRS-override on the kernel command line. So that you can specify the IOAPIC_ID-DEVID mapping there and make it work this way. What do you think? I guess that is workable. I can imagine people wanting this if they want to do the intremap thing on such b0rked BIOSen. So how do I specify this IOAPIC_ID-DEVID mapping on the cmdline exactly? Don't know yet, probably something like ivrs_ioapic[apicid]=0:14.2 and ivrs_hpet[hpet-id]=0:14.2. But not entierly sure yet, at least parsing shouldn't require lex and yacc ;) Yeah, that's my fear too. So we leave it better as it is... Hohumm. Thus shall it be! Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu/amd: Add logic to decode AMD IOMMU event flag
Hi Joerg, Do you have any comments or feedback about this patch set? Thank you, Suravee On 3/27/2013 6:51 PM, suravee.suthikulpa...@amd.com wrote: From: Suravee Suthikulpanit suravee.suthikulpa...@amd.com Add logic to decode AMD IOMMU event flag based on information from AMD IOMMU specification. This should simplify debugging IOMMU errors. Also, dump DTE information in additional cases. Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com --- drivers/iommu/amd_iommu.c | 161 + 1 file changed, 134 insertions(+), 27 deletions(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 98f555d..477cfbb 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -592,7 +592,6 @@ static void amd_iommu_stats_init(void) amd_iommu_stats_add(invalidate_iotlb_all); amd_iommu_stats_add(pri_requests); } - #endif / @@ -601,6 +600,99 @@ static void amd_iommu_stats_init(void) * / +struct _event_log_flags { + u32 gn:1, /* 16 */ + nx:1, /* 17 */ + us:1, /* 18 */ +i:1, /* 19 */ + pr:1, /* 20 */ + rw:1, /* 21 */ + pe:1, /* 22 */ + rz:1, /* 23 */ + tr:1, /* 24 */ + type:3, /* [27:25] */ + _reserved_:20; /* Reserved */ +}; + +static const char * const _invalid_transaction_desc[] = { + /* 000 */Read request or non-posted write in the interrupt +addres range, + /* 001 */Pretranslated transaction received from an I/O device +that has I=0 or V=0 in DTE, + /* 010 */Port I/O space transaction received from an I/O device +that has IoCtl=00b in DTE, + /* 011 */Posted write to invalid address range, + /* 100 */Invalid read request or non-posted write, + /* 101 */Posted write to the interrupt/EOI range from an I/O +device that has IntCtl=00b in DTE, + /* 110 */Posted write to a reserved interrupt address range, + /* 111 */Invalid transaction to the system management address range, +}; + +static const char * const _invalid_translation_desc[] = { + /* 000 */Translation request received from an I/O device that has +I=0, or has V=0, or has V=1 and TV=0 in DTE, + /* 001 */Translation request in invalid address range, + /* 010 */Invalid translation request, + /* 011 */Reserved, + /* 100 */Reserved, + /* 101 */Reserved, + /* 110 */Reserved, + /* 111 */Reserved, +}; + +static void dump_flags(int flags, int ev_type) +{ + struct _event_log_flags *p = (struct _event_log_flags *) flags; + u32 err_type = p-type; + + pr_err(AMD-Vi: Flags details:\n); + pr_err(AMD-Vi:Guest / Nested : %u\n, p-gn); + pr_err(AMD-Vi:No Execute : %u\n, p-nx); + pr_err(AMD-Vi:User-Supervisor : %u\n, p-us); + pr_err(AMD-Vi:Interrupt : %u\n, p-i); + pr_err(AMD-Vi:Present : %u\n, p-pr); + pr_err(AMD-Vi:Read / Write: %u\n, p-rw); + pr_err(AMD-Vi:Permission : %u\n, p-pe); + pr_err(AMD-Vi:Reserv bit not zero / Illegal level encoding : %u\n, + p-rz); + pr_err(AMD-Vi:Translation / Transaction : %u\n, + p-tr); + pr_err(AMD-Vi:Type of error : (0x%x) , err_type); + + if ((ev_type == EVENT_TYPE_DEV_TAB_ERR) || + (ev_type == EVENT_TYPE_PAGE_TAB_ERR) || + (ev_type == EVENT_TYPE_CMD_HARD_ERR)) { + /* Only supports up to 2 bits */ + err_type = 0x3; + switch (err_type) { + case 0: + pr_err(Reserved\n); + break; + case 1: + pr_err(Master Abort\n); + break; + case 2: + pr_err(Target Abort\n); + break; + case 3: + pr_err(Data Error\n); + break; + } + } else if (ev_type == EVENT_TYPE_INV_DEV_REQ) { + if (p-tr == 0) { + if (err_type ARRAY_SIZE(_invalid_translation_desc)) + printk(%s\n, + _invalid_translation_desc[err_type]); + } else { + if (err_type ARRAY_SIZE(_invalid_transaction_desc)) + printk(%s\n, + _invalid_transaction_desc[err_type]); + } + } + pr_err(AMD-Vi:(Note: Please refer to AMD IOMMU specification for details.)\n); +} + static void