[PATCH v2] irqchip/xilinx: Expose Kconfig option for Zynq/ZynqMP
Previously the XILINX_INTC config option was hidden and only
auto-selected on the MicroBlaze platform. However, this IP can also be
used on the Zynq and ZynqMP platforms as a secondary cascaded
controller. Allow this option to be user-enabled on those platforms.

Signed-off-by: Robert Hancock
---
 drivers/irqchip/Kconfig | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 15536e321df5..1020cc5a7800 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -279,8 +279,13 @@ config XTENSA_MX
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK
 
 config XILINX_INTC
-	bool
+	bool "Xilinx Interrupt Controller IP"
+	depends on MICROBLAZE || ARCH_ZYNQ || ARCH_ZYNQMP || COMPILE_TEST
 	select IRQ_DOMAIN
+	help
+	  Support for the Xilinx Interrupt Controller IP core.
+	  This is used as a primary controller with MicroBlaze and can also
+	  be used as a secondary chained controller on other platforms.
 
 config IRQ_CROSSBAR
 	bool
--
2.27.0
Re: [PATCH] irqchip/xilinx: Expose Kconfig option
On Mon, 2021-04-19 at 13:23 +0200, Michal Simek wrote:
> Hi Marc and Robert, +Anirudha
>
> On 4/16/21 8:14 PM, Robert Hancock wrote:
> > On Fri, 2021-04-16 at 18:53 +0100, Marc Zyngier wrote:
> > > On Fri, 16 Apr 2021 17:05:49 +0100,
> > > Robert Hancock wrote:
> > > > On Fri, 2021-04-16 at 14:41 +0100, Marc Zyngier wrote:
> > > > > On Fri, 16 Apr 2021 00:32:50 +0100,
> > > > > Robert Hancock wrote:
> > > > > > Previously the XILINX_INTC config option was hidden and only
> > > > > > auto-selected on the MicroBlaze platform. However, this IP can
> > > > > > also be used on other platforms. Allow this option to be
> > > > > > user-enabled.
> > > > > >
> > > > > > Signed-off-by: Robert Hancock
> > > > >
> > > > > I don't think this is a good idea. In general, people have no idea
> > > > > which interrupt controller they need to select. So you either end-up
> > > > > with a missing interrupt controller, or a bunch you really don't
> > > > > need.
> > > > >
> > > > > This is essentially a platform constraint, and this should directly
> > > > > be selected by the platform if you have this IP in your system.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > 	M.
> > > >
> > > > The problem is essentially that at the platform level, we don't know,
> > > > other than in the MicroBlaze case where we know it will be used as the
> > > > platform's primary interrupt controller. In our case, we are using
> > > > this IP core on the ZynqMP platform as a secondary cascaded interrupt
> > > > controller in the FPGA portion of the device.
> > > > But many ZynqMP configurations wouldn't have this device present, it
> > > > depends on what the user instantiates in the programmable logic.
> > > > Also, this core could just as easily be instantiated in standalone
> > > > Xilinx FPGAs which could be connected to many different platforms
> > > > over a PCIe, AXI, etc. bus.
> > >
> > > Not compiling it for some users is great if you happen to *know* what
> > > you have to select, which is probably a single digit percentage of the
> > > people that build their own kernel. At least having it to depend on
> > > ZYNQMP (or some other FPGA platform) would narrow it down.
> > >
> > > And if you have some other HW in your FPGA, you can make the config
> > > fragment for this HW select the right interrupt controller. But I'm
> > > definitely not keen on making this a universally user-selectable
> > > driver.
> >
> > In general there is no specific or unique config option for what is
> > instantiated in an FPGA, it is completely up to the whims of whoever set
> > it up. You can instantiate whatever logic cores you want and there is no
> > guarantee whether they will or won't end up using this interrupt
> > controller in the path somewhere, so having a dependency there doesn't
> > make much sense. For FPGA logic it's ultimately up to the user to ensure
> > the kernel config they are using has the right drivers enabled for the
> > cores they are using. Kconfig doesn't and can't really help in this
> > regard.
> >
> > There's some precedent on this issue with drivers for various other
> > FPGA-based IP cores for SPI, I2C, Ethernet etc. Often they started out
> > with an architecture constraint which limited them to the platform they
> > were originally developed with, but which was later removed because the
> > ability to use them in standalone FPGAs means that the platforms they
> > could potentially be used with are basically unconstrained.
> >
> > > > So I don't think having this as a platform constraint makes sense.
> > >
> > > I don't think imposing this on *everyone*, across all supported
> > > architectures and platforms makes any sense. Surely, people who build
> > > their own HW (because that's what we are talking about here) can be
> > > bothered to add the small Kconfig fragment that is required to their
> > > kernel build.
>
> The same interrupt controller was used by ppc405 and ppc440 xilinx
> platform in p
Re: [PATCH] irqchip/xilinx: Expose Kconfig option
On Fri, 2021-04-16 at 18:53 +0100, Marc Zyngier wrote:
> On Fri, 16 Apr 2021 17:05:49 +0100,
> Robert Hancock wrote:
> > On Fri, 2021-04-16 at 14:41 +0100, Marc Zyngier wrote:
> > > On Fri, 16 Apr 2021 00:32:50 +0100,
> > > Robert Hancock wrote:
> > > > Previously the XILINX_INTC config option was hidden and only
> > > > auto-selected on the MicroBlaze platform. However, this IP can also be
> > > > used on other platforms. Allow this option to be user-enabled.
> > > >
> > > > Signed-off-by: Robert Hancock
> > >
> > > I don't think this is a good idea. In general, people have no idea
> > > which interrupt controller they need to select. So you either end-up
> > > with a missing interrupt controller, or a bunch you really don't need.
> > >
> > > This is essentially a platform constraint, and this should directly be
> > > selected by the platform if you have this IP in your system.
> > >
> > > Thanks,
> > >
> > > 	M.
> >
> > The problem is essentially that at the platform level, we don't know, other
> > than in the MicroBlaze case where we know it will be used as the platform's
> > primary interrupt controller. In our case, we are using this IP core on the
> > ZynqMP platform as a secondary cascaded interrupt controller in the FPGA
> > portion of the device.
> > But many ZynqMP configurations wouldn't have this device present, it
> > depends on what the user instantiates in the programmable logic.
> > Also, this core could just as easily be instantiated in standalone
> > Xilinx FPGAs which could be connected to many different platforms
> > over a PCIe, AXI, etc. bus.
>
> Not compiling it for some users is great if you happen to *know* what
> you have to select, which is probably a single digit percentage of the
> people that build their own kernel. At least having it to depend on
> ZYNQMP (or some other FPGA platform) would narrow it down.
>
> And if you have some other HW in your FPGA, you can make the config
> fragment for this HW select the right interrupt controller. But I'm
> definitely not keen on making this a universally user-selectable
> driver.

In general there is no specific or unique config option for what is
instantiated in an FPGA, it is completely up to the whims of whoever set it
up. You can instantiate whatever logic cores you want and there is no
guarantee whether they will or won't end up using this interrupt controller
in the path somewhere, so having a dependency there doesn't make much sense.
For FPGA logic it's ultimately up to the user to ensure the kernel config
they are using has the right drivers enabled for the cores they are using.
Kconfig doesn't and can't really help in this regard.

There's some precedent on this issue with drivers for various other
FPGA-based IP cores for SPI, I2C, Ethernet etc. Often they started out with
an architecture constraint which limited them to the platform they were
originally developed with, but which was later removed because the ability
to use them in standalone FPGAs means that the platforms they could
potentially be used with are basically unconstrained.

> > So I don't think having this as a platform constraint makes sense.
>
> I don't think imposing this on *everyone*, across all supported
> architectures and platforms makes any sense. Surely, people who build
> their own HW (because that's what we are talking about here) can be
> bothered to add the small Kconfig fragment that is required to their
> kernel build.

--
Robert Hancock
Senior Hardware Designer, Calian Advanced Technologies
www.calian.com
Re: [PATCH] irqchip/xilinx: Expose Kconfig option
On Fri, 2021-04-16 at 14:41 +0100, Marc Zyngier wrote:
> On Fri, 16 Apr 2021 00:32:50 +0100,
> Robert Hancock wrote:
> > Previously the XILINX_INTC config option was hidden and only
> > auto-selected on the MicroBlaze platform. However, this IP can also be
> > used on other platforms. Allow this option to be user-enabled.
> >
> > Signed-off-by: Robert Hancock
>
> I don't think this is a good idea. In general, people have no idea
> which interrupt controller they need to select. So you either end-up
> with a missing interrupt controller, or a bunch you really don't need.
>
> This is essentially a platform constraint, and this should directly be
> selected by the platform if you have this IP in your system.
>
> Thanks,
>
> 	M.

The problem is essentially that at the platform level, we don't know, other
than in the MicroBlaze case where we know it will be used as the platform's
primary interrupt controller. In our case, we are using this IP core on the
ZynqMP platform as a secondary cascaded interrupt controller in the FPGA
portion of the device. But many ZynqMP configurations wouldn't have this
device present, it depends on what the user instantiates in the programmable
logic. Also, this core could just as easily be instantiated in standalone
Xilinx FPGAs which could be connected to many different platforms over a
PCIe, AXI, etc. bus.

So I don't think having this as a platform constraint makes sense.

--
Robert Hancock
Senior Hardware Designer, Calian Advanced Technologies
www.calian.com
[PATCH] irqchip/xilinx: Expose Kconfig option
Previously the XILINX_INTC config option was hidden and only
auto-selected on the MicroBlaze platform. However, this IP can also be
used on other platforms. Allow this option to be user-enabled.

Signed-off-by: Robert Hancock
---
 drivers/irqchip/Kconfig | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 15536e321df5..cc24f1bd3ca7 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -279,8 +279,12 @@ config XTENSA_MX
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK
 
 config XILINX_INTC
-	bool
+	bool "Xilinx Interrupt Controller IP"
 	select IRQ_DOMAIN
+	help
+	  Support for the Xilinx Interrupt Controller IP core.
+	  This is used as a primary controller with MicroBlaze and can also
+	  be used as a secondary chained controller on other platforms.
 
 config IRQ_CROSSBAR
 	bool
--
2.27.0
Re: [PATCH V5 3/5] gpio: gpio-xilinx: Add interrupt support
Noticed one issue, see below: On Fri, 2021-01-29 at 19:56 +0530, Srinivas Neeli wrote: > Adds interrupt support to the Xilinx GPIO driver so that rising and > falling edge line events can be supported. Since interrupt support is > an optional feature in the Xilinx IP, the driver continues to support > devices which have no interrupt provided. > Depends on OF_GPIO framework for of_xlate function to translate > gpiospec to the GPIO number and flags. > > Signed-off-by: Robert Hancock > Signed-off-by: Shubhrajyoti Datta > Signed-off-by: Srinivas Neeli > --- > Changes in V5: > -Removed IRQ_DOMAIN_HIERARCHY from Kconfig and > #include from includes. > -Fixed "detected irqchip that is shared with multiple > gpiochips: please fix the driver"error message. > -Added check for #gpio-cells and error message in failure case. > Changes in V4: > -Added more commit description. > Changes in V3: > -Created separate patch for Clock changes and runtime resume > and suspend. > -Updated minor review comments. > > Changes in V2: > -Added check for return value of platform_get_irq() API. > -Updated code to support rising edge and falling edge. > -Added xgpio_xlate() API to support switch. > -Added MAINTAINERS fragment. 
> --- > drivers/gpio/Kconfig | 2 + > drivers/gpio/gpio-xilinx.c | 246 > - > 2 files changed, 244 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig > index c70f46e80a3b..2ee57797d908 100644 > --- a/drivers/gpio/Kconfig > +++ b/drivers/gpio/Kconfig > @@ -690,6 +690,8 @@ config GPIO_XGENE_SB > > config GPIO_XILINX > tristate "Xilinx GPIO support" > + select GPIOLIB_IRQCHIP > + depends on OF_GPIO > help > Say yes here to support the Xilinx FPGA GPIO device > > diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c > index f88db56543c2..62deb755f910 100644 > --- a/drivers/gpio/gpio-xilinx.c > +++ b/drivers/gpio/gpio-xilinx.c > @@ -10,7 +10,9 @@ > #include > #include > #include > +#include > #include > +#include > #include > #include > #include > @@ -22,6 +24,11 @@ > > #define XGPIO_CHANNEL_OFFSET 0x8 > > +#define XGPIO_GIER_OFFSET0x11c /* Global Interrupt Enable */ > +#define XGPIO_GIER_IEBIT(31) > +#define XGPIO_IPISR_OFFSET 0x120 /* IP Interrupt Status */ > +#define XGPIO_IPIER_OFFSET 0x128 /* IP Interrupt Enable */ > + > /* Read/Write access to the GPIO registers */ > #if defined(CONFIG_ARCH_ZYNQ) || defined(CONFIG_X86) > # define xgpio_readreg(offset) readl(offset) > @@ -36,9 +43,15 @@ > * @gc: GPIO chip > * @regs: register block > * @gpio_width: GPIO width for every channel > - * @gpio_state: GPIO state shadow register > + * @gpio_state: GPIO write state shadow register > + * @gpio_last_irq_read: GPIO read state register from last interrupt > * @gpio_dir: GPIO direction shadow register > * @gpio_lock: Lock used for synchronization > + * @irq: IRQ used by GPIO device > + * @irqchip: IRQ chip > + * @irq_enable: GPIO IRQ enable/disable bitfield > + * @irq_rising_edge: GPIO IRQ rising edge enable/disable bitfield > + * @irq_falling_edge: GPIO IRQ falling edge enable/disable bitfield > * @clk: clock resource for this driver > */ > struct xgpio_instance { > @@ -46,8 +59,14 @@ struct xgpio_instance { > void __iomem 
*regs; > unsigned int gpio_width[2]; > u32 gpio_state[2]; > + u32 gpio_last_irq_read[2]; > u32 gpio_dir[2]; > spinlock_t gpio_lock; /* For serializing operations */ > + int irq; > + struct irq_chip irqchip; > + u32 irq_enable[2]; > + u32 irq_rising_edge[2]; > + u32 irq_falling_edge[2]; > struct clk *clk; > }; > > @@ -277,6 +296,175 @@ static int xgpio_remove(struct platform_device *pdev) > } > > /** > + * xgpio_irq_ack - Acknowledge a child GPIO interrupt. > + * @irq_data: per IRQ and chip data passed down to chip functions > + * This currently does nothing, but irq_ack is unconditionally called by > + * handle_edge_irq and therefore must be defined. > + */ > +static void xgpio_irq_ack(struct irq_data *irq_data) > +{ > +} > + > +/** > + * xgpio_irq_mask - Write the specified signal of the GPIO device. > + * @irq_data: per IRQ and chip data passed down to chip functions > + */ > +static void xgpio_irq_mask(struct irq_data *irq_data) > +{ > + unsigned long flags; > + struct xgpio_instance *chip = irq_data_get_irq_chip_data(irq_data); > + int irq_offset = irqd_to_hwirq(irq_data); > + int ind
Re: [LINUX PATCH V3 5/9] gpio: gpio-xilinx: Add interrupt support
On Thu, 2020-11-12 at 22:42 +0530, Srinivas Neeli wrote:
> Adds interrupt support to the Xilinx GPIO driver so that rising and
> falling edge line events can be supported. Since interrupt support is
> an optional feature in the Xilinx IP, the driver continues to support
> devices which have no interrupt provided.
>
> Signed-off-by: Robert Hancock
> Signed-off-by: Shubhrajyoti Datta
> Signed-off-by: Srinivas Neeli
> ---
> Changes in V3:
> -Created separate patch for Clock changes and runtime resume
>  and suspend.
> -Updated minor review comments.
>
> Changes in V2:
> -Added check for return value of platform_get_irq() API.
> -Updated code to support rising edge and falling edge.
> -Added xgpio_xlate() API to support switch.
> -Added MAINTAINERS fragment
> ---
>  drivers/gpio/Kconfig       |   2 +
>  drivers/gpio/gpio-xilinx.c | 242 -
>  2 files changed, 240 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
> index 5d4de5cd6759..cf4959891eab 100644
> --- a/drivers/gpio/Kconfig
> +++ b/drivers/gpio/Kconfig
> @@ -677,6 +677,8 @@ config GPIO_XGENE_SB
>
>  config GPIO_XILINX
>  	tristate "Xilinx GPIO support"
> +	select GPIOLIB_IRQCHIP
> +	depends on OF_GPIO

This OF_GPIO dependency was previously removed - is this required? It
appears the code is now setting of_gpio_n_cells, but I am not sure whether
this is necessary or helpful, since the other of_gpio functions are not
used.

> help > Say yes here to support the Xilinx FPGA GPIO device > > diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c > index 69bdf1910215..80a06ded 100644 > --- a/drivers/gpio/gpio-xilinx.c > +++ b/drivers/gpio/gpio-xilinx.c > @@ -10,9 +10,12 @@ > #include > #include > #include > +#include > #include > +#include > #include > #include > +#include > #include > #include > > @@ -22,6 +25,11 @@ > > #define XGPIO_CHANNEL_OFFSET 0x8 > > +#define XGPIO_GIER_OFFSET0x11c /* Global Interrupt Enable */ > +#define XGPIO_GIER_IEBIT(31) > +#define XGPIO_IPISR_OFFSET 0x120 /* IP Interrupt Status */ > +#define XGPIO_IPIER_OFFSET 0x128 /* IP Interrupt Enable */ > + > /* Read/Write access to the GPIO registers */ > #if defined(CONFIG_ARCH_ZYNQ) || defined(CONFIG_X86) > # define xgpio_readreg(offset) readl(offset) > @@ -36,9 +44,14 @@ > * @gc: GPIO chip > * @regs: register block > * @gpio_width: GPIO width for every channel > - * @gpio_state: GPIO state shadow register > + * @gpio_state: GPIO write state shadow register > + * @gpio_last_irq_read: GPIO read state register from last interrupt > * @gpio_dir: GPIO direction shadow register > * @gpio_lock: Lock used for synchronization > + * @irq: IRQ used by GPIO device > + * @irq_enable: GPIO IRQ enable/disable bitfield > + * @irq_rising_edge: GPIO IRQ rising edge enable/disable bitfield > + * @irq_falling_edge: GPIO IRQ falling edge enable/disable bitfield > * @clk: clock resource for this driver > */ > struct xgpio_instance { > @@ -46,8 +59,13 @@ struct xgpio_instance { > void __iomem *regs; > unsigned int gpio_width[2]; > u32 gpio_state[2]; > + u32 gpio_last_irq_read[2]; > u32 gpio_dir[2]; > spinlock_t gpio_lock; /* For serializing operations */ > + int irq; > + u32 irq_enable[2]; > + u32 irq_rising_edge[2]; > + u32 irq_falling_edge[2]; > struct clk *clk; > }; > > @@ -258,6 +276,183 @@ static void xgpio_save_regs(struct > xgpio_instance *chip) > } > > /** > + * xgpio_irq_ack - Acknowledge a child GPIO interrupt. 
> + * @irq_data: per IRQ and chip data passed down to chip functions > + * This currently does nothing, but irq_ack is unconditionally > called by > + * handle_edge_irq and therefore must be defined. > + */ > +static void xgpio_irq_ack(struct irq_data *irq_data) > +{ > +} > + > +/** > + * xgpio_irq_mask - Write the specified signal of the GPIO device. > + * @irq_data: per IRQ and chip data passed down to chip functions > + */ > +static void xgpio_irq_mask(struct irq_data *irq_data) > +{ > + unsigned long flags; > + struct xgpio_instance *chip = > irq_data_get_irq_chip_data(irq_data); > + int irq_offset = irqd_to_hwirq(irq_data); > + int index = xgpio_index(chip, irq_offset); > + int offset = xgpio_offset(chip, irq_offset); > + > + spin_lock_irqsave(&chip->gpio_lock, flags); > + > + chip->irq_enable[index] &= ~BIT(offset); > + > + if (!chip->irq_enable[index]) { > +
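The quoted patch is cut off before its interrupt handler, so the way the new struct fields work together is not visible here. As a rough illustration only, the following userspace sketch models one plausible way the documented fields (gpio_last_irq_read, irq_enable, irq_rising_edge, irq_falling_edge) could be combined to decide which lines to report. The struct and function names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical userspace model of one GPIO channel's IRQ bookkeeping. */
struct chan_state {
	uint32_t last_read;    /* line levels seen at the previous interrupt */
	uint32_t irq_enable;   /* lines whose interrupts are unmasked */
	uint32_t rising_edge;  /* lines that requested rising-edge events */
	uint32_t falling_edge; /* lines that requested falling-edge events */
};

/*
 * Compare the current line levels with the previous snapshot and return a
 * bitmask of lines whose requested edge occurred and whose IRQ is enabled.
 * The snapshot is updated for the next call.
 */
static uint32_t chan_pending_events(struct chan_state *s, uint32_t now)
{
	uint32_t rising = now & ~s->last_read & s->rising_edge;
	uint32_t falling = ~now & s->last_read & s->falling_edge;

	s->last_read = now;
	return (rising | falling) & s->irq_enable;
}
```

Keeping a snapshot of the previous read is what allows edge events to be derived in software from a level read; whether the actual patch does exactly this cannot be confirmed from the truncated quote.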
Re: [PATCH V2 2/3] gpio: xilinx: Add interrupt support
On 2020-07-23 12:03 p.m., Andy Shevchenko wrote:
> > +/**
> > + * xgpio_xlate - Translate gpio_spec to the GPIO number and flags
> > + * @gc: Pointer to gpio_chip device structure.
> > + * @gpiospec: gpio specifier as found in the device tree
> > + * @flags: A flags pointer based on binding
> > + *
> > + * Return:
> > + * irq number otherwise -EINVAL
> > + */
> > +static int xgpio_xlate(struct gpio_chip *gc,
> > +		       const struct of_phandle_args *gpiospec, u32 *flags)
> > +{
> > +	if (gc->of_gpio_n_cells < 2) {
> > +		WARN_ON(1);
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (WARN_ON(gpiospec->args_count < gc->of_gpio_n_cells))
> > +		return -EINVAL;
> > +
> > +	if (gpiospec->args[0] >= gc->ngpio)
> > +		return -EINVAL;
> > +
> > +	if (flags)
> > +		*flags = gpiospec->args[1];
> > +
> > +	return gpiospec->args[0];
> > +}
>
> This looks like a very standard xlate function for GPIO. Why do you
> need to open-code it?

Indeed, this seems the same as the of_gpio_simple_xlate callback which is
used if no xlate callback is specified, so I'm not sure why this is
necessary?

> ...
>
> > +/**
> > + * xgpio_irq_ack - Acknowledge a child GPIO interrupt.
> > + * This currently does nothing, but irq_ack is unconditionally called by
> > + * handle_edge_irq and therefore must be defined.
>
> This should go after parameter description(s).
>
> > + * @irq_data: per irq and chip data passed down to chip functions
> > + */
>
> ...
>
> >  /**
> > + * xgpio_irq_mask - Write the specified signal of the GPIO device.
> > + * @irq_data: per irq and chip data passed down to chip functions
>
> In all comments irq -> IRQ.
>
> > + */
> > +static void xgpio_irq_mask(struct irq_data *irq_data)
> > +{
> > +	unsigned long flags;
> > +	struct xgpio_instance *chip = irq_data_get_irq_chip_data(irq_data);
> > +	int irq_offset = irqd_to_hwirq(irq_data);
> > +	int index = xgpio_index(chip, irq_offset);
> > +	int offset = xgpio_offset(chip, irq_offset);
> > +
> > +	spin_lock_irqsave(&chip->gpio_lock, flags);
> > +
> > +	chip->irq_enable[index] &= ~BIT(offset);
>
> If you convert your data structure to use bitmaps (and respective API)
> like
>
> 	#define XILINX_NGPIOS 64
> 	...
> 	DECLARE_BITMAP(irq_enable, XILINX_NGPIOS);
> 	...
>
> it will make code better to read and understand. For example, here it
> will be just
>
> 	__clear_bit(offset, chip->irq_enable);
>
> > +	dev_dbg(chip->gc.parent, "Disable %d irq, irq_enable_mask 0x%x\n",
> > +		irq_offset, chip->irq_enable[index]);
>
> Under spin lock?! Hmm...
>
> > +	if (!chip->irq_enable[index]) {
> > +		/* Disable per channel interrupt */
> > +		u32 temp = xgpio_readreg(chip->regs + XGPIO_IPIER_OFFSET);
> > +
> > +		temp &= ~BIT(index);
> > +		xgpio_writereg(chip->regs + XGPIO_IPIER_OFFSET, temp);
> > +	}
> > +	spin_unlock_irqrestore(&chip->gpio_lock, flags);
> > +}
>
> ...
>
> > +	for (index = 0; index < num_channels; index++) {
> > +		if ((status & BIT(index))) {
>
> If gpio_width is the same among banks, you can use for_each_set_bit()
> here as well.
>
> ...
>
> > +	for_each_set_bit(bit, &all_events, 32) {
> > +		generic_handle_irq(irq_find_mapping
> > +			(chip->gc.irq.domain, offset + bit));
>
> Strange indentation. Maybe a temporary variable helps?
>
> ...
>
> > +	chip->irq = platform_get_irq_optional(pdev, 0);
> > +	if (chip->irq <= 0) {
> > +		dev_info(&pdev->dev, "GPIO IRQ not set\n");
>
> Why do you need an optional variant if you print an error anyway?
>
> > +	} else {
>
> ...
>
> > +	chip->gc.irq.parents = (unsigned int *)&chip->irq;
> > +	chip->gc.irq.num_parents = 1;
>
> Current pattern is to use devm_kcalloc() for it (Linus has plans to
> simplify this in the future and this will help him to find what
> patterns are being used)

--
Robert Hancock
Senior Hardware Designer
SED Systems, a division of Calian Ltd.
Email: hanc...@sedsystems.ca
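To illustrate the bitmap suggestion in the review above, here is a minimal userspace sketch, not kernel code. It models a single flat 64-bit enable bitmap indexed by GPIO number in place of the per-channel u32 arrays. The helper names are hypothetical stand-ins for the kernel's bitmap API, and the fixed 32 lines per channel is an assumption made only for this example (the real driver tracks a per-channel gpio_width).

```c
#include <limits.h>
#include <stdbool.h>

#define NGPIOS        64u /* mirrors the XILINX_NGPIOS value from the review */
#define WORD_BITS     (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((NGPIOS + WORD_BITS - 1) / WORD_BITS)
#define CHANNEL_WIDTH 32u /* assumption: both channels are 32 lines wide */

/* One flat bitmap covering every line of both channels. */
struct irq_bitmap {
	unsigned long words[BITMAP_WORDS];
};

static void bitmap_set(struct irq_bitmap *m, unsigned int gpio)
{
	m->words[gpio / WORD_BITS] |= 1UL << (gpio % WORD_BITS);
}

static void bitmap_clear(struct irq_bitmap *m, unsigned int gpio)
{
	m->words[gpio / WORD_BITS] &= ~(1UL << (gpio % WORD_BITS));
}

/* True when no line belonging to the given channel is still enabled. */
static bool channel_idle(const struct irq_bitmap *m, unsigned int channel)
{
	unsigned int first = channel * CHANNEL_WIDTH;

	for (unsigned int gpio = first; gpio < first + CHANNEL_WIDTH; gpio++) {
		if (m->words[gpio / WORD_BITS] & (1UL << (gpio % WORD_BITS)))
			return false;
	}
	return true;
}
```

With a flat bitmap the mask path no longer needs the index/offset pair for its bookkeeping; the per-channel check (the point at which the quoted code clears the channel bit in IPIER) becomes a range test over the bitmap.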
Re: 5.7 regression: Lots of PCIe AER errors and suspend failure without pcie=noaer
On Fri, Jul 24, 2020 at 8:32 AM Kai-Heng Feng wrote:
>
> Hi Robert,
>
> > Unfortunately it appears that this ASMedia PCIe-PCI bridge:
> >
> > 02:00.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1083/1085 PCIe
> > to PCI Bridge [1b21:1080] (rev 04)
> >
> > doesn't cope with ASPM properly and causes a bunch of PCIe link
> > errors. (This is in addition to some broken-ness known as far back as
> > 2012 with these ASM1083/1085 chips with regard to PCI interrupts
> > getting stuck, but this ASPM problem causes issues even if no devices
> > are connected to the PCI side of the bridge, as is the case on my
> > system.)
> >
> > Might need a quirk to disable ASPM on this device?
>
> Yes I think it's a great idea to do it.
>
> Can you please file a bug on [1] and we can continue our discussion there.
>
> [1] https://bugzilla.kernel.org

Hi,

I created a bug entry earlier as a result of another discussion, which
includes the debug info as well as a proposed patch:

https://bugzilla.kernel.org/show_bug.cgi?id=208667
Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
On Wed, Jul 22, 2020 at 7:04 PM Bjorn Helgaas wrote:
>
> On Wed, Jul 22, 2020 at 06:46:06PM -0600, Robert Hancock wrote:
> > On Wed, Jul 22, 2020 at 11:40 AM Bjorn Helgaas wrote:
> > > On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> > > > Recently ASPM handling was changed to no longer disable ASPM on all
> > > > PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> > > > devices don't seem to function properly with ASPM enabled, as they
> > > > cause the parent PCIe root port to cause repeated AER timeout errors.
> > > > In addition to flooding the kernel log, this also causes the machine
> > > > to wake up immediately after suspend is initiated.
> > >
> > > Hi Robert, thanks a lot for the report of this problem
> > > (https://lore.kernel.org/r/cadlc3l1r2hssrjxhjv9yhdn_7-hgw58rxsfnp-frazh0tw+...@mail.gmail.com
> > > and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).
> > >
> > > I'm pretty sure Linux ASPM support is missing some things. This
> > > problem might be a hardware problem where a quirk is the right
> > > solution, but it could also be that it's a result of a Linux defect
> > > that we should fix.
> > >
> > > Could you collect the dmesg log and "sudo lspci -vv" output
> > > somewhere (maybe a bugzilla.kernel.org issue)? I want to figure out
> > > whether this L1 PM substates are enabled on this link, and whether
> > > that's configured correctly.
> >
> > Created a Bugzilla entry and added dmesg and lspci output:
> > https://bugzilla.kernel.org/show_bug.cgi?id=208667
> >
> > As I noted in that report, I subsequently found this page on ASMedia's
> > site:
> > https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114
> > which indicates this ASM1083 device has "No PCIe ASPM support".
>
> How nice. According to your lspci, the device itself claims to
> support ASPM:
>
>   02:00.0 ... ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge
>     LnkCap: ... ASPM L0s L1 ...
>
> but the web page claims otherwise. That would mean the device is
> defective for claiming something that's not true. Or possibly those
> capability bits can be set by BIOS.
>
> > It's not clear why this problem isn't occurring on Windows however -
> > either it is not enabling ASPM, somehow it doesn't cause issues with
> > the PCIe link, or it is causing issues and just doesn't notify the
> > user in any way. I can try and check if this bridge device is ending
> > up with ASPM enabled under Windows 10 or not..
>
> If Windows *does* manage to enable ASPM, that would be interesting. I
> don't know whether Windows has a similar quirk mechanism. I suppose
> they must have *some* way to work around defective devices.

As I posted on the Bugzilla report, based on lspci output it appears
Windows does have ASPM L0s enabled for this bridge. However, it appears to
have the exact same problem: there are correctable PCIe error entries
showing up in the Windows system event log against the root port the bridge
is connected to. So I am thinking this hardware is just broken with ASPM
enabled.
Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
On Wed, Jul 22, 2020 at 11:40 AM Bjorn Helgaas wrote:
>
> [+cc Puranjay]
>
> On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> > Recently ASPM handling was changed to no longer disable ASPM on all
> > PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> > devices don't seem to function properly with ASPM enabled, as they
> > cause the parent PCIe root port to cause repeated AER timeout errors.
> > In addition to flooding the kernel log, this also causes the machine
> > to wake up immediately after suspend is initiated.
>
> Hi Robert, thanks a lot for the report of this problem
> (https://lore.kernel.org/r/cadlc3l1r2hssrjxhjv9yhdn_7-hgw58rxsfnp-frazh0tw+...@mail.gmail.com
> and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).
>
> I'm pretty sure Linux ASPM support is missing some things. This
> problem might be a hardware problem where a quirk is the right
> solution, but it could also be that it's a result of a Linux defect
> that we should fix.
>
> Could you collect the dmesg log and "sudo lspci -vv" output
> somewhere (maybe a bugzilla.kernel.org issue)? I want to figure out
> whether this L1 PM substates are enabled on this link, and whether
> that's configured correctly.

Created a Bugzilla entry and added dmesg and lspci output:
https://bugzilla.kernel.org/show_bug.cgi?id=208667

As I noted in that report, I subsequently found this page on ASMedia's
site:
https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114
which indicates this ASM1083 device has "No PCIe ASPM support".

It's not clear why this problem isn't occurring on Windows however -
either it is not enabling ASPM, somehow it doesn't cause issues with
the PCIe link, or it is causing issues and just doesn't notify the
user in any way. I can try and check if this bridge device is ending
up with ASPM enabled under Windows 10 or not..
[PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
Recently ASPM handling was changed to no longer disable ASPM on all
PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
devices don't seem to function properly with ASPM enabled, as they
cause the parent PCIe root port to cause repeated AER timeout errors.
In addition to flooding the kernel log, this also causes the machine
to wake up immediately after suspend is initiated.

Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
Cc: sta...@vger.kernel.org
Signed-off-by: Robert Hancock
---
 drivers/pci/quirks.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 812bfc32ecb8..e5713114f2ab 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2330,6 +2330,19 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
 
+static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
+{
+	pci_info(dev, "Disabling ASPM L0s/L1\n");
+	pci_disable_link_state(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
+}
+
+/*
+ * ASM1083/1085 PCIe-PCI bridge devices cause AER timeout errors on the
+ * upstream PCIe root port when ASPM is enabled. At least L0s mode is affected,
+ * disable both L0s and L1 for now to be safe.
+ */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
+
 /*
  * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
  * Link bit cleared after starting the link retrain process to allow this
--
2.26.2
Re: 5.7 regression: Lots of PCIe AER errors and suspend failure without pcie=noaer
On Fri, Jul 10, 2020 at 6:28 PM Robert Hancock wrote:
>
> On Fri, Jul 10, 2020 at 6:23 PM Robert Hancock wrote:
> >
> > Noticed a problem on my desktop with an Asus PRIME H270-PRO
> > motherboard after Fedora 32 upgraded to the 5.7 kernel (now on 5.7.8):
> > periodically there are PCIe AER errors getting spewed in dmesg that
> > weren't happening before, and this also seems to cause suspend to
> > fail - the system just wakes back up again right away, I am assuming
> > due to some AER errors interrupting the process. 5.6 kernels didn't
> > have this problem. Setting "pcie=noaer" on the kernel command line
> > works around the issue, but I'm not sure what would have changed to
> > trigger this to occur?
>
> Correction: the workaround option is "pci=noaer".

As a follow-up, from some more experimentation, it appears that disabling PCIe ASPM with setpci on both the ASMedia PCIe-PCI bridge as well as the PCIe root port it is connected to seems to silence the AER errors and allow suspend/resume to work again:

setpci -s 00:1c.0 0x50.B=0x00
setpci -s 02:00.0 0x90.B=0x00

It appears the behavior changed as a result of this patch (which went into the stable tree for 5.7.6 and so affects 5.7 kernels as well):

commit 66ff14e59e8a30690755b08bc3042359703fb07a
Author: Kai-Heng Feng
Date:   Wed May 6 01:34:21 2020 +0800

    PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges

    7d715a6c1ae5 ("PCI: add PCI Express ASPM support") added the ability for
    Linux to enable ASPM, but for some undocumented reason, it didn't enable
    ASPM on links where the downstream component is a PCIe-to-PCI/PCI-X Bridge.

    Remove this exclusion so we can enable ASPM on these links.

    The Dell OptiPlex 7080 mentioned in the bugzilla has a TI XIO2001
    PCIe-to-PCI Bridge. Enabling ASPM on the link leading to it allows the
    Intel SoC to enter deeper Package C-states, which is a significant power
    savings.

    [bhelgaas: commit log]
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207571
    Link: https://lore.kernel.org/r/20200505173423.26968-1-kai.heng.f...@canonical.com
    Signed-off-by: Kai-Heng Feng
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Mika Westerberg

Unfortunately it appears that this ASMedia PCIe-PCI bridge:

02:00.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge [1b21:1080] (rev 04)

doesn't cope with ASPM properly and causes a bunch of PCIe link errors. (This is in addition to some broken-ness known as far back as 2012 with these ASM1083/1085 chips with regard to PCI interrupts getting stuck, but this ASPM problem causes issues even if no devices are connected to the PCI side of the bridge, as is the case on my system.)

Might need a quirk to disable ASPM on this device?
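For reference, a small userspace sketch of what the setpci workaround amounts to. It assumes that offsets 0x50 and 0x90 are the PCIe Link Control registers of those two particular devices (not verified here), and relies on the ASPM Control field being bits 1:0 of Link Control (bit 0 for L0s, bit 1 for L1). The helper names are made up for illustration. Unlike writing 0x00 to the whole byte, it clears only the ASPM bits:

```c
#include <assert.h>
#include <stdint.h>

/* ASPM Control field of the PCIe Link Control register (bits 1:0). */
#define LNKCTL_ASPM_L0S  0x0001u  /* L0s entry enabled */
#define LNKCTL_ASPM_L1   0x0002u  /* L1 entry enabled */
#define LNKCTL_ASPM_MASK (LNKCTL_ASPM_L0S | LNKCTL_ASPM_L1)

/* Hypothetical helper: return lnkctl with the requested ASPM states disabled. */
static uint16_t lnkctl_disable_aspm(uint16_t lnkctl, uint16_t states)
{
	return (uint16_t)(lnkctl & ~(states & LNKCTL_ASPM_MASK));
}

/* Hypothetical helper: report whether any ASPM state is still enabled. */
static int lnkctl_aspm_enabled(uint16_t lnkctl)
{
	return (lnkctl & LNKCTL_ASPM_MASK) != 0;
}
```

A quirk would do the equivalent through the kernel's ASPM interface rather than by poking the register directly, so that the ASPM policy code stays in sync.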
Re: 5.7 regression: Lots of PCIe AER errors and suspend failure without pcie=noaer
On Fri, Jul 10, 2020 at 6:23 PM Robert Hancock wrote:
>
> Noticed a problem on my desktop with an Asus PRIME H270-PRO
> motherboard after Fedora 32 upgraded to the 5.7 kernel (now on 5.7.8):
> periodically there are PCIe AER errors getting spewed in dmesg that
> weren't happening before, and this also seems to cause suspend to
> fail - the system just wakes back up again right away, I am assuming
> due to some AER errors interrupting the process. 5.6 kernels didn't
> have this problem. Setting "pcie=noaer" on the kernel command line
> works around the issue, but I'm not sure what would have changed to
> trigger this to occur?

Correction: the workaround option is "pci=noaer".
Re: Xilinx axienet 1000BaseX support
On 2020-04-28 5:01 p.m., Russell King - ARM Linux admin wrote: On Tue, Apr 28, 2020 at 03:59:45PM -0600, Robert Hancock wrote: On 2020-04-22 1:51 a.m., Russell King - ARM Linux admin wrote: On Tue, Apr 21, 2020 at 07:45:47PM -0600, Robert Hancock wrote: Hi Andre/Russell, Just wondering where things got to with the changes for SGMII on Xilinx axienet that you were discussing (below)? I am looking into our Xilinx setup using 1000BaseX SFP and trying to get it working "properly" with newer kernels. My understanding is that the requirements for 1000BaseX and SGMII are somewhat similar. I gathered that SGMII was working somewhat already, but that not all link modes had been tested. However, it appears 1000BaseX is not yet working in the stock kernel. The way I had this working before with a 4.19-based kernel was basically a hack to phylink to allow the Xilinx PCS/PMA PHY to be configured sufficiently as a PHY for it to work, and mostly ignored the link status of the SFP PHY itself, even though we were using in-band signalling mode with an SFP module. That was using this patch: https://patchwork.ozlabs.org/project/netdev/patch/1559330285-30246-5-git-send-email-hanc...@sedsystems.ca/ Of course, that's basically just a hack which I suspect mostly worked by luck. I see that there are some helpers that were added to phylink to allow setting PHY advertisements and reading PHY status from clause 22 PHY devices, so I'm guessing that is the way to go in this case? Something like: axienet_mac_config: if using in-band mode, use phylink_mii_c22_pcs_set_advertisement to configure the Xilinx PHY. 
axienet_mac_pcs_get_state: use phylink_mii_c22_pcs_get_state to get the MAC PCS state from the Xilinx PHY
axienet_mac_an_restart: if using in-band mode, use phylink_mii_c22_pcs_an_restart to restart autonegotiation on Xilinx PHY

To use those c22 functions, we need to find the mdio_device that's referenced by the phy-handle in the device tree - I guess we can just use some of the guts of of_phy_find_device to do that?

Please see the code for DPAA2 - it's changed slightly since I sent a copy to the netdev mailing list, and it still isn't clear whether this is the final approach (DPAA2 has some fun stuff such as several different PHYs at address 0.) NXP basically didn't like the approach I had in the patches I sent to netdev, we had a call, they presented an alternative approach, I implemented it, then they decided my original approach was the better solution for their situation.

See http://git.armlinux.org.uk/cgit/linux-arm.git/log/?h=cex7 specifically the patches from:

"dpaa2-mac: add 1000BASE-X/SGMII PCS support"

through to:

"net: phylink: add interface to configure clause 22 PCS PHY"

You may also need some of the patches further down in the net-queue branch:

"net: phylink: avoid mac_config calls"

through to:

"net: phylink: rejig link state tracking"

I've been playing with this a bit on a 5.4 kernel with some of these patches backported. However, I'm running into something that my previous hacks for this basically dealt with as a side effect: when phylink_start is called, sfp_upstream_start gets called, an SFP module is detected, phylink_connect_phy gets called, but then it hits this condition and bails out, because we are using INBAND mode with 1000BaseX:

	if (WARN_ON(pl->cfg_link_an_mode == MLO_AN_FIXED ||
		    (pl->cfg_link_an_mode == MLO_AN_INBAND &&
		     phy_interface_mode_is_8023z(interface))))
		return -EINVAL;

I'm expecting SGMII mode to be used when there's an external PHY as that gives greatest flexibility (as it allows 10 and 100Mbps speeds as well.)
From what I remember, these blocks support SGMII, so it should just be a matter of adding that. They do support SGMII, but unfortunately it's not a runtime configurable parameter, it's a synthesis-level parameter on the FPGA IP core so you have to pick one or the other for any given build. We want to be able to support various fiber module types as well, and my understanding is that at least some of those only do 1000BaseX, so that ends up being the standard in common that we are using. I guess I'm not sure how this is supposed to work when the PHY on the SFP module gets detected, i.e. if there's supposed to be another code path that this is supposed to go down, or this is something that just hasn't been fully implemented yet? Copper PHYs work fine - using SGMII mode everywhere so far. The problem is, if you want to use them as 1000BASE-X, you generally have to ensure that the PHY is appropriately programmed for 1000BASE-X negotiation, and the copper side advertisement only indicates 1G support. Not all copper PHYs have the PHY accessible for such programming, and in that case it becomes an exercise of "read the SFP docume
Re: Xilinx axienet 1000BaseX support
On 2020-04-22 1:51 a.m., Russell King - ARM Linux admin wrote: On Tue, Apr 21, 2020 at 07:45:47PM -0600, Robert Hancock wrote: Hi Andre/Russell, Just wondering where things got to with the changes for SGMII on Xilinx axienet that you were discussing (below)? I am looking into our Xilinx setup using 1000BaseX SFP and trying to get it working "properly" with newer kernels. My understanding is that the requirements for 1000BaseX and SGMII are somewhat similar. I gathered that SGMII was working somewhat already, but that not all link modes had been tested. However, it appears 1000BaseX is not yet working in the stock kernel. The way I had this working before with a 4.19-based kernel was basically a hack to phylink to allow the Xilinx PCS/PMA PHY to be configured sufficiently as a PHY for it to work, and mostly ignored the link status of the SFP PHY itself, even though we were using in-band signalling mode with an SFP module. That was using this patch: https://patchwork.ozlabs.org/project/netdev/patch/1559330285-30246-5-git-send-email-hanc...@sedsystems.ca/ Of course, that's basically just a hack which I suspect mostly worked by luck. I see that there are some helpers that were added to phylink to allow setting PHY advertisements and reading PHY status from clause 22 PHY devices, so I'm guessing that is the way to go in this case? Something like: axienet_mac_config: if using in-band mode, use phylink_mii_c22_pcs_set_advertisement to configure the Xilinx PHY. axienet_mac_pcs_get_state: use phylink_mii_c22_pcs_get_state to get the MAC PCS state from the Xilinx PHY axienet_mac_an_restart: if using in-band mode, use phylink_mii_c22_pcs_an_restart to restart autonegotiation on Xilinx PHY To use those c22 functions, we need to find the mdio_device that's referenced by the phy-handle in the device tree - I guess we can just use some of the guts of of_phy_find_device to do that? 
Please see the code for DPAA2 - it's changed slightly since I sent a copy to the netdev mailing list, and it still isn't clear whether this is the final approach (DPAA2 has some fun stuff such as several different PHYs at address 0.) NXP basically didn't like the approach I had in the patches I sent to netdev, we had a call, they presented an alternative approach, I implemented it, then they decided my original approach was the better solution for their situation.

See http://git.armlinux.org.uk/cgit/linux-arm.git/log/?h=cex7 specifically the patches from:

"dpaa2-mac: add 1000BASE-X/SGMII PCS support"

through to:

"net: phylink: add interface to configure clause 22 PCS PHY"

You may also need some of the patches further down in the net-queue branch:

"net: phylink: avoid mac_config calls"

through to:

"net: phylink: rejig link state tracking"

I've been playing with this a bit on a 5.4 kernel with some of these patches backported. However, I'm running into something that my previous hacks for this basically dealt with as a side effect: when phylink_start is called, sfp_upstream_start gets called, an SFP module is detected, phylink_connect_phy gets called, but then it hits this condition and bails out, because we are using INBAND mode with 1000BaseX:

	if (WARN_ON(pl->cfg_link_an_mode == MLO_AN_FIXED ||
		    (pl->cfg_link_an_mode == MLO_AN_INBAND &&
		     phy_interface_mode_is_8023z(interface))))
		return -EINVAL;

That same code is still in the latest version in the arm-linux cex7 branch, except now in phylink_attach_phy, and from what I can see would behave similarly.

I guess I'm not sure how this is supposed to work when the PHY on the SFP module gets detected, i.e. if there's supposed to be another code path that this is supposed to go down, or this is something that just hasn't been fully implemented yet?

-- 
Robert Hancock
Senior Hardware Designer
SED Systems, a division of Calian Ltd.
Email: hanc...@sedsystems.ca
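To make the failing case concrete, here is a standalone model of that condition. The enums and function names are simplified stand-ins invented for this sketch, not the kernel's phylink definitions; the only behaviour assumed from the kernel is that phy_interface_mode_is_8023z() is true for the 1000BASE-X and 2500BASE-X interface modes.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the phylink AN modes and PHY interface modes. */
enum an_mode { AN_PHY, AN_FIXED, AN_INBAND };
enum iface { IFACE_SGMII, IFACE_1000BASEX, IFACE_2500BASEX };

/* Mirrors phy_interface_mode_is_8023z(): true for the 802.3z modes. */
static bool iface_is_8023z(enum iface i)
{
	return i == IFACE_1000BASEX || i == IFACE_2500BASEX;
}

/* Models the quoted check: true when attaching a PHY is refused (-EINVAL). */
static bool phy_attach_refused(enum an_mode mode, enum iface i)
{
	return mode == AN_FIXED || (mode == AN_INBAND && iface_is_8023z(i));
}
```

In this model, in-band mode with 1000BaseX is refused while in-band mode with SGMII is accepted, which matches what I am seeing.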
Re: [PATCH] net: axienet: fix a potential double free in axienet_probe()
On 2019-07-05 9:38 p.m., Wen Yang wrote:
> There is a possible use-after-free issue in the axienet_probe():
>
> 1701:	np = of_parse_phandle(pdev->dev.of_node, "axistream-connected", 0);
> 1702:	if (np) {
> ...
> 1787:		of_node_put(np); ---> released here
> 1788:		lp->eth_irq = platform_get_irq(pdev, 0);
> 1789:	} else {
> ...
> 1801:	}
> 1802:	if (IS_ERR(lp->dma_regs)) {
> ...
> 1805:		of_node_put(np); ---> double released here
> 1806:		goto free_netdev;
> 1807:	}
>
> We solve this problem by removing the unnecessary of_node_put().
>
> Fixes: 28ef9ebdb64c ("net: axienet: make use of axistream-connected attribute optional")
> Signed-off-by: Wen Yang
> Cc: Anirudha Sarangi
> Cc: John Linn
> Cc: "David S. Miller"
> Cc: Michal Simek
> Cc: Robert Hancock
> Cc: net...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org

Yes, looks valid.

Reviewed-by: Robert Hancock

> ---
>  drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> index 561e28a..4fc627f 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> @@ -1802,7 +1802,6 @@ static int axienet_probe(struct platform_device *pdev)
>  	if (IS_ERR(lp->dma_regs)) {
>  		dev_err(&pdev->dev, "could not map DMA regs\n");
>  		ret = PTR_ERR(lp->dma_regs);
> -		of_node_put(np);
>  		goto free_netdev;
>  	}
>  	if ((lp->rx_irq <= 0) || (lp->tx_irq <= 0)) {
>

-- 
Robert Hancock
Senior Software Developer
SED Systems, a division of Calian Ltd.
Email: hanc...@sedsystems.ca
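As a rough illustration of why the second of_node_put() is unbalanced, here is a toy userspace model of the probe flow. The struct and function names are invented for the sketch and only the reference count is modelled; it is not the driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct device_node: only the reference count is kept. */
struct node { int refcount; };

static struct node *node_get(struct node *n) { if (n) n->refcount++; return n; }
static void node_put(struct node *n) { if (n) n->refcount--; }

/*
 * Models the probe flow on a node that starts with 'initial' references:
 * take one reference (like of_parse_phandle()), drop it once in the normal
 * branch, then optionally drop it again on the error path as the code did
 * before the fix. Returns the node's final reference count.
 */
static int probe_model(int initial, int map_fails, int extra_put_on_error)
{
	struct node n = { initial };
	struct node *np = node_get(&n);

	if (np)
		node_put(np);           /* released here */

	if (map_fails && extra_put_on_error)
		node_put(np);           /* second, unbalanced release */

	return n.refcount;
}
```

With the extra put removed, the count ends where it started; with it, the error path leaves the node one reference short.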
Re: iMX6 5.2-rc3 boot failure due to "PCI: imx6: Allow asynchronous probing"
On 2019-06-11 2:40 p.m., Fabio Estevam wrote: > Hi Robert, > > On Tue, Jun 11, 2019 at 4:02 PM Robert Hancock wrote: > >>> [ 13.193578] imx6q-pcie 1ffc000.pcie: host bridge /soc/pcie@1ffc000 >>> ranges: >>> [ 13.200635] imx6q-pcie 1ffc000.pcie:IO 0x01f8..0x01f8 -> >>> 0x >>> [ 13.201454] imx-sdma 20ec000.sdma: loaded firmware 3.3 > > Does this problem happen if you don't load an external SDMA firmware? Based on some tests, it appears that may help - however it is hard to be conclusive since the behavior is somewhat random, it doesn't fail every time. The first few times I booted this version, I didn't see the problem, but after that it was consistently happening every time until I reverted the patch. Is there potentially a dependency where the PCIe controller doesn't like some other activity that's occurring on the iMX during its initialization sequence? -- Robert Hancock Senior Software Developer SED Systems, a division of Calian Ltd. Email: hanc...@sedsystems.ca
Re: iMX6 5.2-rc3 boot failure due to "PCI: imx6: Allow asynchronous probing"
Adding linux-pci. One thing that may be slightly unusual about our setup is that we are using CONFIG_PREEMPT=y, which may be allowing more concurrency to come into play. On 2019-06-07 6:28 p.m., Robert Hancock wrote: > I am seeing a boot failure on our iMX6D-based embedded platform running > v5.2-rc3. It seems to stall for about 20 seconds after "random: crng > init done" and then panic with a bunch of RCU stall and soft-lockup > errors. It seems like something is hanging up in the iMX6 PCIe driver. > Boot log is below. > > Suspecting the following patch, I reverted it locally and it seems to > resolve the issue. (Well it gets into userspace at least; it later > oopses in the ksz switch driver, appears unrelated..) > > commit 1b8df7aa78748ddafc6f3b16a6328a3c500087b3 > Author: Lucas Stach > Date: Thu Apr 4 18:45:17 2019 +0200 > > PCI: imx6: Allow asynchronous probing > > Establishing a PCIe link can take a while; allow asynchronous probing so > that link establishment can happen in the background while other devices > are being probed. > > Signed-off-by: Lucas Stach > Signed-off-by: Lorenzo Pieralisi > Signed-off-by: Bjorn Helgaas > Reviewed-by: Fabio Estevam > > I would say either that patch needs a fix or it should be reverted for now. > > [0.00] Booting Linux on physical CPU 0x0 > [0.00] Linux version 5.2.0-rc3 > (hanc...@sed.rfc1918.192.168.sedsystems.ca) (gcc version 8.3.0 > (Buildroot 2019.02.1-00510-gcc60ea2)) #1 SMP PREEMPT Fri Jun 7 17:44:38 > CST 2019 > [0.00] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), > cr=10c5387d > [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing > instruction cache > [0.00] OF: fdt: Machine model: SED Systems xxx > [0.00] printk: bootconsole [earlycon0] enabled > [0.00] Memory policy: Data cache writealloc > [0.00] cma: Reserved 64 MiB at 0x3c00 > [0.00] percpu: Embedded 16 pages/cpu s36364 r8192 d20980 u65536 > [0.00] Built 1 zonelists, mobility grouping on. 
Total pages: 260608 > [0.00] Kernel command line: console=ttymxc0,115200 earlyprintk > [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 > bytes) > [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 > bytes) > [0.00] Memory: 940708K/1048576K available (6144K kernel code, > 304K rwdata, 2036K rodata, 1024K init, 377K bss, 42332K reserved, 65536K > cma-reserved, 262144K highmem) > [0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1 > [0.00] rcu: Preemptible hierarchical RCU implementation. > [0.00] rcu: RCU event tracing is enabled. > [0.00] rcu: RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2. > [0.00] Tasks RCU enabled. > [0.00] rcu: RCU calculated value of scheduler-enlistment delay > is 100 jiffies. > [0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2 > [0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16 > [0.00] L2C-310 errata 752271 769419 enabled > [0.00] L2C-310 enabling early BRESP for Cortex-A9 > [0.00] L2C-310 full line of zeros enabled for Cortex-A9 > [0.00] L2C-310 ID prefetch enabled, offset 16 lines > [0.00] L2C-310 dynamic clock gating enabled, standby mode enabled > [0.00] L2C-310 cache controller enabled, 16 ways, 1024 kB > [0.00] L2C-310: CACHE_ID 0x41c7, AUX_CTRL 0x76470001 > [0.00] random: get_random_bytes called from > start_kernel+0x2ac/0x434 with crng_init=0 > [0.00] Switching to timer-based delay loop, resolution 333ns > [0.07] sched_clock: 32 bits at 3000kHz, resolution 333ns, wraps > every 715827882841ns > [0.008211] clocksource: mxc_timer1: mask: 0x max_cycles: > 0x, max_idle_ns: 637086815595 ns > [0.019178] Console: colour dummy device 80x30 > [0.023670] Calibrating delay loop (skipped), value calculated using > timer frequency.. 
6.00 BogoMIPS (lpj=3000) > [0.033799] pid_max: default: 32768 minimum: 301 > [0.038557] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes) > [0.045211] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 > bytes) > [0.052846] *** VALIDATE proc *** > [0.056285] *** VALIDATE cgroup1 *** > [0.059886] *** VALIDATE cgroup2 *** > [0.063483] CPU: Testing write buffer coherency: ok > [0.068401] CPU0: Spectre v2: using BPIALL workaround > [0.073698] CPU0: thread -1, cpu 0, socket 0, mpidr 8000 > [0.085424] Setting up static identity map for 0x1010 - 0x10100060 > [0.093365] rcu: Hierarchical SRCU implementat
iMX6 5.2-rc3 boot failure due to "PCI: imx6: Allow asynchronous probing"
[] (_dev_info) from [] (devm_of_pci_get_host_bridge_resources+0x198/0x298) [ 40.950471] [] (devm_of_pci_get_host_bridge_resources) from [] (dw_pcie_host_init+0xb4/0x544) [ 40.960757] [] (dw_pcie_host_init) from [] (imx6_pcie_probe+0x3a0/0x6b8) [ 40.969220] [] (imx6_pcie_probe) from [] (platform_drv_probe+0x48/0x98) [ 40.977593] [] (platform_drv_probe) from [] (really_probe+0xf0/0x2c8) [ 40.985792] [] (really_probe) from [] (driver_probe_device+0x60/0x16c) [ 40.994079] [] (driver_probe_device) from [] (__driver_attach_async_helper+0x50/0x54) [ 41.003671] [] (__driver_attach_async_helper) from [] (async_run_entry_fn+0x44/0x118) [ 41.013266] [] (async_run_entry_fn) from [] (process_one_work+0x17c/0x390) [ 41.021902] [] (process_one_work) from [] (worker_thread+0x44/0x518) [ 41.030015] [] (worker_thread) from [] (kthread+0x144/0x14c) [ 41.037430] [] (kthread) from [] (ret_from_fork+0x14/0x2c) [ 41.044667] Exception stack(0xe8081fb0 to 0xe8081ff8) [ 41.049730] 1fa0: [ 41.057926] 1fc0: [ 41.066123] 1fe0: 0013 [ 41.072758] Rebooting in 30 seconds.. -- Robert Hancock Senior Software Developer SED Systems, a division of Calian Ltd. Email: hanc...@sedsystems.ca
Re: [PATCH 1/2] mfd: core: Support multiple OF child devices of the same type
On 2019-06-05 11:27 p.m., Lee Jones wrote: >>>> Without having the .of_full_name support, both MFD cells ended up >>>> wrongly matching against the i2c@c device tree node since we just >>>> picked the first one where of_compatible matched. >>> >>> What is contained in each of their resources? >> >> These are the resource entries for those two devices: >> >> static const struct resource dbe_i2c1_resources[] = { >> { >> .start = 0xc, >> .end= 0xc, >> .name = "xi2c1_regs", >> .flags = IORESOURCE_MEM, >> .desc = IORES_DESC_NONE >> }, >> }; >> >> static const struct resource dbe_i2c2_resources[] = { >> { >> .start = 0xd, >> .end= 0xd, >> .name = "xi2c2_regs", >> .flags = IORESOURCE_MEM, >> .desc = IORES_DESC_NONE >> }, >> }; > > This is your problem. You are providing the memory resources through > *both* DT and MFD. I don't believe I've seen your MFD driver, but it > looks like it's probably not required at all. Just allow DT to probe > each of your child devices. You can obtain the IO memory from there > directly using the usual platform_get_resource() calls. As far as I can tell, the DT child devices underneath a PCIe device don't get probed and drivers loaded automatically - possibly for valid reasons. The MFD driver appears to be required in order to actually get drivers attached to those DT nodes. Right now those devices are ending up with no memory resources unless they are injected through the MFD cells. It would be handy if the memory resources were mapped automatically from the PCIe BARs to the sub-devices, to avoid duplicating information in the DT and the driver, but even if that was solved it wouldn't avoid the need for this patch, as the devices would still end up attached to the wrong DT node and pick up the wrong properties. 
The other reason we need the MFD driver is we are implementing an IRQ domain to map the interrupts from the PCIe device to the child nodes, and using some of those callbacks to poke other registers on the PCIe to assist with converting the level-triggered AXI interrupts to edge-triggered MSIs. This is what the outer DT leading up to what I showed earlier looks like. &pcie { pinctrl-names = "default"; pinctrl-0 = <&pinctrl_pcie>; reset-gpio = <&gpio7 12 GPIO_ACTIVE_LOW>; status = "okay"; pci_rootport: pcie@0,0 { reg = <0x8300 0 0 0 0>; #address-cells = <3>; #size-cells = <2>; ranges; fpga_pcie: pcie@1,0 { reg = <0x201 0 0 0 0>; #address-cells = <1>; #size-cells = <1>; interrupt-controller; #interrupt-cells = <1>; ... axi_iic_0: i2c@c { compatible = "xlnx,xps-iic-2.00.a"; clocks = <&axi_clk>; clock-frequency = <10>; interrupts = <7>; #size-cells = <0>; #address-cells = <1>; }; axi_iic_1: i2c@d { compatible = "xlnx,xps-iic-2.00.a"; clocks = <&axi_clk>; clock-frequency = <10>; interrupts = <8>; #size-cells = <0>; #address-cells = <1>; }; }; }; }; > >> Ideally the IO memory resource entries would be picked up and mapped >> through the device tree as well, as they are with the interrupts, but I >> haven't yet found the device tree magic that would allow that to happen >> yet, if it's possible. The setup we have has a number of peripherals on >> an AXI bus which are behind a PCIe to AXI bridge, and we're using mfd to >> instantiate each of those AXI devices under the PCIe device. >> > -- Robert Hancock Senior Software Developer SED Systems, a division of Calian Ltd. Email: hanc...@sedsystems.ca
Re: [PATCH 1/2] mfd: core: Support multiple OF child devices of the same type
On 2019-06-05 12:45 p.m., Lee Jones wrote: >>>> diff --git a/include/linux/mfd/core.h b/include/linux/mfd/core.h >>>> index 99c0395..470f6cb 100644 >>>> --- a/include/linux/mfd/core.h >>>> +++ b/include/linux/mfd/core.h >>>> @@ -55,6 +55,9 @@ struct mfd_cell { >>>> */ >>>>const char *of_compatible; >>>> >>>> + /* Optionally match against a specific device of a given type */ >>>> + const char *of_full_name; >>>> + >>> >>> Can you give me an example for when this might be useful? >> >> This is an example of some device tree entries for our MFD device: >> >> axi_iic_0: i2c@c { >> compatible = "xlnx,xps-iic-2.00.a"; >> clocks = <&axi_clk>; >> clock-frequency = <10>; >> interrupts = <7>; >> #size-cells = <0>; >> #address-cells = <1>; >> }; >> >> axi_iic_1: i2c@d { >> compatible = "xlnx,xps-iic-2.00.a"; >> clocks = <&axi_clk>; >> clock-frequency = <10>; >> interrupts = <8>; >> #size-cells = <0>; >> #address-cells = <1>; >> }; >> >> and the corresponding MFD cells: >> >> { >> .name = "axi_iic_0", >> .of_compatible = "xlnx,xps-iic-2.00.a", >> .of_full_name = "i2c@c", >> .num_resources = ARRAY_SIZE(dbe_i2c1_resources), >> .resources = dbe_i2c1_resources >> }, >> { >> .name = "axi_iic_1", >> .of_compatible = "xlnx,xps-iic-2.00.a", >> .of_full_name = "i2c@d", >> .num_resources = ARRAY_SIZE(dbe_i2c2_resources), >> .resources = dbe_i2c2_resources >> }, >> >> Without having the .of_full_name support, both MFD cells ended up >> wrongly matching against the i2c@c device tree node since we just >> picked the first one where of_compatible matched. > > What is contained in each of their resources? 
These are the resource entries for those two devices: static const struct resource dbe_i2c1_resources[] = { { .start = 0xc, .end= 0xc, .name = "xi2c1_regs", .flags = IORESOURCE_MEM, .desc = IORES_DESC_NONE }, }; static const struct resource dbe_i2c2_resources[] = { { .start = 0xd, .end= 0xdffff, .name = "xi2c2_regs", .flags = IORESOURCE_MEM, .desc = IORES_DESC_NONE }, }; Ideally the IO memory resource entries would be picked up and mapped through the device tree as well, as they are with the interrupts, but I haven't yet found the device tree magic that would allow that to happen yet, if it's possible. The setup we have has a number of peripherals on an AXI bus which are behind a PCIe to AXI bridge, and we're using mfd to instantiate each of those AXI devices under the PCIe device. -- Robert Hancock Senior Software Developer SED Systems, a division of Calian Ltd. Email: hanc...@sedsystems.ca
Re: [PATCH 1/2] mfd: core: Support multiple OF child devices of the same type
On 2019-06-05 12:31 a.m., Lee Jones wrote:
> On Tue, 04 Jun 2019, Robert Hancock wrote:
>
>> Previously the MFD core supported assigning OF nodes to created MFD
>> devices, but relied solely on matching the of_compatible string. This
>> would result in devices being potentially assigned the wrong node if
>> there are multiple devices with the same compatible string within a
>> multifunction device.
>>
>> Add support for matching the full name of the node in the MFD cell
>> definition, so that we can match against a specific instance of a
>> device. If this is not specified, we match just based on the
>> compatible string, as before.
>>
>> Signed-off-by: Robert Hancock
>> ---
>>  drivers/mfd/mfd-core.c   | 5 -
>>  include/linux/mfd/core.h | 3 +++
>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
>> index 1ade4c8..74bc895 100644
>> --- a/drivers/mfd/mfd-core.c
>> +++ b/drivers/mfd/mfd-core.c
>> @@ -177,7 +177,10 @@ static int mfd_add_device(struct device *parent, int id,
>>
>>  	if (parent->of_node && cell->of_compatible) {
>>  		for_each_child_of_node(parent->of_node, np) {
>> -			if (of_device_is_compatible(np, cell->of_compatible)) {
>> +			if (of_device_is_compatible(np, cell->of_compatible) &&
>> +			    (!cell->of_full_name ||
>> +			     !strcmp(cell->of_full_name,
>> +				     of_node_full_name(np)))) {
>>  				pdev->dev.of_node = np;
>>  				break;
>
> That is some ugly, squashed up code.
>
> If we end up accepting this, I suggest flattening this out a bit.
>
> ... but we'll cross that bridge when we come to it.

Yes, that if statement could be broken up to make it more readable. Will fix in the next version, assuming the concept is acceptable.
> >> diff --git a/include/linux/mfd/core.h b/include/linux/mfd/core.h >> index 99c0395..470f6cb 100644 >> --- a/include/linux/mfd/core.h >> +++ b/include/linux/mfd/core.h >> @@ -55,6 +55,9 @@ struct mfd_cell { >> */ >> const char *of_compatible; >> >> +/* Optionally match against a specific device of a given type */ >> +const char *of_full_name; >> + > > Can you give me an example for when this might be useful? This is an example of some device tree entries for our MFD device: axi_iic_0: i2c@c { compatible = "xlnx,xps-iic-2.00.a"; clocks = <&axi_clk>; clock-frequency = <10>; interrupts = <7>; #size-cells = <0>; #address-cells = <1>; }; axi_iic_1: i2c@d { compatible = "xlnx,xps-iic-2.00.a"; clocks = <&axi_clk>; clock-frequency = <10>; interrupts = <8>; #size-cells = <0>; #address-cells = <1>; }; and the corresponding MFD cells: { .name = "axi_iic_0", .of_compatible = "xlnx,xps-iic-2.00.a", .of_full_name = "i2c@c", .num_resources = ARRAY_SIZE(dbe_i2c1_resources), .resources = dbe_i2c1_resources }, { .name = "axi_iic_1", .of_compatible = "xlnx,xps-iic-2.00.a", .of_full_name = "i2c@d", .num_resources = ARRAY_SIZE(dbe_i2c2_resources), .resources = dbe_i2c2_resources }, Without having the .of_full_name support, both MFD cells ended up wrongly matching against the i2c@c device tree node since we just picked the first one where of_compatible matched. -- Robert Hancock Senior Software Developer SED Systems, a division of Calian Ltd. Email: hanc...@sedsystems.ca
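To show the mismatch outside the kernel, here is a self-contained model of the lookup. The structs are toy stand-ins that reuse the field names from the patch; match_child() and the example tables are made up for this sketch and are not the MFD core code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for a DT child node and an MFD cell; not the kernel structs. */
struct child { const char *compatible; const char *full_name; };
struct cell { const char *of_compatible; const char *of_full_name; };

/* Return the index of the first child matching the cell, or -1 if none. */
static int match_child(const struct cell *c, const struct child *kids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(kids[i].compatible, c->of_compatible) != 0)
			continue;
		/* A NULL of_full_name keeps the old compatible-only behaviour. */
		if (c->of_full_name &&
		    strcmp(kids[i].full_name, c->of_full_name) != 0)
			continue;
		return (int)i;
	}
	return -1;
}

/* Two children with the same compatible string, as in the DT above. */
static const struct child example_children[] = {
	{ "xlnx,xps-iic-2.00.a", "i2c@c" },
	{ "xlnx,xps-iic-2.00.a", "i2c@d" },
};
static const struct cell cell_by_compat = { "xlnx,xps-iic-2.00.a", NULL };
static const struct cell cell_second = { "xlnx,xps-iic-2.00.a", "i2c@d" };
```

Matching on the compatible string alone always lands on the first child, while adding the full name picks out the second one.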
[PATCH 0/2] MFD core updates for device tree binding support
Fixes for the device tree binding support in MFD core. Robert Hancock (2): mfd: core: Support multiple OF child devices of the same type mfd: core: Set fwnode for created devices drivers/mfd/mfd-core.c | 6 +- include/linux/mfd/core.h | 3 +++ 2 files changed, 8 insertions(+), 1 deletion(-) -- 1.8.3.1
[PATCH 1/2] mfd: core: Support multiple OF child devices of the same type
Previously the MFD core supported assigning OF nodes to created MFD devices, but relied solely on matching the of_compatible string. This would result in devices being potentially assigned the wrong node if there are multiple devices with the same compatible string within a multifunction device.

Add support for matching the full name of the node in the MFD cell definition, so that we can match against a specific instance of a device. If this is not specified, we match just based on the compatible string, as before.

Signed-off-by: Robert Hancock
---
 drivers/mfd/mfd-core.c   | 5 -
 include/linux/mfd/core.h | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
index 1ade4c8..74bc895 100644
--- a/drivers/mfd/mfd-core.c
+++ b/drivers/mfd/mfd-core.c
@@ -177,7 +177,10 @@ static int mfd_add_device(struct device *parent, int id,
 
 	if (parent->of_node && cell->of_compatible) {
 		for_each_child_of_node(parent->of_node, np) {
-			if (of_device_is_compatible(np, cell->of_compatible)) {
+			if (of_device_is_compatible(np, cell->of_compatible) &&
+			    (!cell->of_full_name ||
+			     !strcmp(cell->of_full_name,
+				     of_node_full_name(np)))) {
 				pdev->dev.of_node = np;
 				break;
 			}
diff --git a/include/linux/mfd/core.h b/include/linux/mfd/core.h
index 99c0395..470f6cb 100644
--- a/include/linux/mfd/core.h
+++ b/include/linux/mfd/core.h
@@ -55,6 +55,9 @@ struct mfd_cell {
 	 */
 	const char		*of_compatible;
 
+	/* Optionally match against a specific device of a given type */
+	const char		*of_full_name;
+
 	/* Matches ACPI */
 	const struct mfd_cell_acpi_match	*acpi_match;
 
-- 
1.8.3.1
[PATCH 2/2] mfd: core: Set fwnode for created devices
The logic for setting the of_node on devices created by mfd did not set the fwnode pointer to match, which caused fwnode-based APIs to malfunction on these devices since the fwnode pointer was null. Fix this.

Signed-off-by: Robert Hancock
---
 drivers/mfd/mfd-core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
index 74bc895..228163c 100644
--- a/drivers/mfd/mfd-core.c
+++ b/drivers/mfd/mfd-core.c
@@ -182,6 +182,7 @@ static int mfd_add_device(struct device *parent, int id,
 			     !strcmp(cell->of_full_name,
 				     of_node_full_name(np)))) {
 				pdev->dev.of_node = np;
+				pdev->dev.fwnode = &np->fwnode;
 				break;
 			}
 		}
-- 
1.8.3.1
Re: [PATCH] PCI: controller: dwc: Make PCI_IMX6 depend on PCIEPORTBUS
On 2018-12-06 10:10 a.m., Robert Hancock wrote: > On 2018-12-06 9:50 a.m., Lucas Stach wrote: >> Am Donnerstag, den 06.12.2018, 09:45 -0600 schrieb Robert Hancock: >>> On 2018-12-06 2:10 a.m., Baruch Siach wrote: >>>> Hi Andrey, >>>> >>>> Adding Robert Hancock who reported[1] on a PCIe MSI issue with i.MX6. >>>> >>>> Andrey Smirnov writes: >>>> >>>>> Building a kernel with CONFIG_PCI_IMX6=y, but CONFIG_PCIEPORTBUS=n >>>>> produces a system where built-in PCIE bridge (16c3:abcd) isn't bound >>>>> to pcieport driver. This, in turn, results in a PCIE bus that is >>>>> capable of enumerating attached PCIE device, but lacks functional >>>>> interrupt support. >>>> >>>> Robert, does that fix your issue? >>> >>> Unfortunately, no.. in fact the situation on my setup is even worse with >>> CONFIG_PCIEPORTBUS enabled: Not only does MSI still not function, but >>> now INTx interrupts are somehow broken as well - no interrupts are >>> received. The IRQ information shown in /proc/interrupts is correct, but >>> the count remains stubbornly at 0. >> >> That's expected. The port services will use an MSI IRQ when available >> and due to a design issue with the DWC PCIe it will not forward any >> legacy IRQs if any MSI is in use. If any of the PCIe devices in your >> system are unable to work with MSI IRQs, you must boot with "nomsi" on >> the kernel command line set. > > That seems like an unfortunate design choice on their part.. well that > would probably argue against adding this as a hard dependency then, if > non-MSI-supporting PCIe devices can't work with default boot options > with that set. > > I'm looking into testing with an NXP Smart Devices board and some PCIe > cards to see if I can verify whether MSI works on those or not, since we > currently don't have a way to independently verify that the MSI > implementation in our FPGA is working or whether another PCIe device > works with MSI (the FPGA is integrated on the system board). 

I've now done some tests with an NXP SabreSD reference board and an Intel
wireless card:

- With the standard imx_v6_v7 defconfig, MSI does not work, INTx works
- With CONFIG_PCIEPORTBUS=y, MSI does work

So it seems like enabling PCIEPORTBUS should fix our MSI issue on the CPU
side, and our remaining problem is likely on the FPGA device side.

However, there's still the issue that enabling that option breaks INTx
support. I don't have a PCIe card handy that the kernel doesn't enable MSI
for, so I can't test that on the Sabre board, but based on Lucas's comment
and my results on our board, it definitely seems to be an issue. I would
hope there is a way to handle that.

--
Robert Hancock
Senior Software Developer
SED Systems
Email: hanc...@sedsystems.ca
Re: [PATCH] PCI: controller: dwc: Make PCI_IMX6 depend on PCIEPORTBUS
On 2018-12-06 9:50 a.m., Lucas Stach wrote:
> On Thursday, 06.12.2018 at 09:45 -0600, Robert Hancock wrote:
>> On 2018-12-06 2:10 a.m., Baruch Siach wrote:
>>> Hi Andrey,
>>>
>>> Adding Robert Hancock who reported[1] on a PCIe MSI issue with i.MX6.
>>>
>>> Andrey Smirnov writes:
>>>
>>>> Building a kernel with CONFIG_PCI_IMX6=y, but CONFIG_PCIEPORTBUS=n
>>>> produces a system where built-in PCIE bridge (16c3:abcd) isn't bound
>>>> to pcieport driver. This, in turn, results in a PCIE bus that is
>>>> capable of enumerating attached PCIE device, but lacks functional
>>>> interrupt support.
>>>
>>> Robert, does that fix your issue?
>>
>> Unfortunately, no.. in fact the situation on my setup is even worse with
>> CONFIG_PCIEPORTBUS enabled: Not only does MSI still not function, but
>> now INTx interrupts are somehow broken as well - no interrupts are
>> received. The IRQ information shown in /proc/interrupts is correct, but
>> the count remains stubbornly at 0.
>
> That's expected. The port services will use an MSI IRQ when available
> and due to a design issue with the DWC PCIe it will not forward any
> legacy IRQs if any MSI is in use. If any of the PCIe devices in your
> system are unable to work with MSI IRQs, you must boot with "nomsi" on
> the kernel command line set.

That seems like an unfortunate design choice on their part.. well that
would probably argue against adding this as a hard dependency then, if
non-MSI-supporting PCIe devices can't work with default boot options
with that set.

I'm looking into testing with an NXP Smart Devices board and some PCIe
cards to see if I can verify whether MSI works on those or not, since we
currently don't have a way to independently verify that the MSI
implementation in our FPGA is working or whether another PCIe device
works with MSI (the FPGA is integrated on the system board).

--
Robert Hancock
Senior Software Developer
SED Systems
Email: hanc...@sedsystems.ca
Re: [PATCH] PCI: controller: dwc: Make PCI_IMX6 depend on PCIEPORTBUS
On 2018-12-06 2:10 a.m., Baruch Siach wrote:
> Hi Andrey,
>
> Adding Robert Hancock who reported[1] on a PCIe MSI issue with i.MX6.
>
> Andrey Smirnov writes:
>
>> Building a kernel with CONFIG_PCI_IMX6=y, but CONFIG_PCIEPORTBUS=n
>> produces a system where built-in PCIE bridge (16c3:abcd) isn't bound
>> to pcieport driver. This, in turn, results in a PCIE bus that is
>> capable of enumerating attached PCIE device, but lacks functional
>> interrupt support.
>
> Robert, does that fix your issue?

Unfortunately, no.. in fact the situation on my setup is even worse with
CONFIG_PCIEPORTBUS enabled: Not only does MSI still not function, but
now INTx interrupts are somehow broken as well - no interrupts are
received. The IRQ information shown in /proc/interrupts is correct, but
the count remains stubbornly at 0.

So given that outcome, I don't think we should add this as a hard
dependency until we can figure out what is going on, as it seems to
regress working setups.

>
>> Signed-off-by: Andrey Smirnov
>> ---
>>
>> Assuming this is a reasonable dependency, should this be done to more
>> than just i.MX6 driver?
>>
>>  drivers/pci/controller/dwc/Kconfig | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/pci/controller/dwc/Kconfig
>> b/drivers/pci/controller/dwc/Kconfig
>> index 2b139acccf32..44ededbeab85 100644
>> --- a/drivers/pci/controller/dwc/Kconfig
>> +++ b/drivers/pci/controller/dwc/Kconfig
>> @@ -92,6 +92,7 @@ config PCI_IMX6
>>  	bool "Freescale i.MX6 PCIe controller"
>>  	depends on SOC_IMX8MQ || SOC_IMX6Q || (ARM && COMPILE_TEST)
>>  	depends on PCI_MSI_IRQ_DOMAIN
>> +	depends on PCIEPORTBUS
>
> This effectively disables PCIe in imx_v6_v7_defconfig, since
> CONFIG_PCIEPORTBUS is not enabled there. Maybe do 'select' instead?
>
>>  	select PCIE_DW_HOST
>>
>>  config PCIE_SPEAR13XX
>
> baruch
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-November/614800.html
>
> --
> http://baruch.siach.name/blog/ ~. .~ Tk Open Systems
> =}ooO--U--Ooo{=
> - bar...@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
>

--
Robert Hancock
Senior Software Developer
SED Systems
Email: hanc...@sedsystems.ca
Re: linux: sata_nv: adma support
On Tue, Aug 25, 2015 at 6:58 AM, Pali Rohár wrote: > On Tuesday 25 August 2015 07:20:05 Mark Lord wrote: >> On 15-08-01 09:45 PM, Robert Hancock wrote: >> >On Sat, Aug 1, 2015 at 2:09 PM, Pali Rohár wrote: >> >>On Thursday 25 December 2014 07:22:13 Robert Hancock wrote: >> >>>On Tue, Dec 23, 2014 at 1:51 PM, Pali Rohár >> >>>wrote: >> >>>>Hello, >> >>>> >> >>>>I have nvidia nforce4 motherboard with nvidia sata controller: >> .. >> >>>It looks like something is trying to issue a command to disable APM >> >>>power management on the drive, and the command fails (likely because >> >>>it doesn't support that command). >> .. >> >> /sbin/hdparm -B254 $DRIVE >> >> >> >>And that -B254 cause above error message in dmesg log. Output from >> >>hdparm is: >> >> >> >> /dev/sda: >> >> setting Advanced Power Management level to 0xfe (254) >> >> APM_level = not supported >> .. >> >> $ sudo hdparm -I /dev/sda | grep -i power >> >> *Power Management feature set >> >> That's not the same as APM ("Advanced" Power Management). >> >> >However, these NVIDIA SATAs are black boxes, and rather buggy ones at that, >> >so it's possible there's an unknown issue there. >> >> I wonder if NVIDIA simply bought out the IP from Pacific Digital >> when they went bust? Pacific Digital invented the original "ADMA", >> and the pdc_adma.c driver in the kernel knows all about it. >> If the IP is pretty similar (identical?) then we could probably >> improve things. >> > > Can you check if nvidia ADMA code and that Pacific Digital ADMA code is > similar or not? The ADMA spec that Pacific Digital adapter (somewhat) implements was documented in a standard, T13 1510D, ATA/ATAPI Host Adapters Standard. My guess is that is where NVIDIA got the ideas for this controller setup. 
I would be fairly surprised if the controller actually contained any
Pacific Digital IP, as the NVIDIA controllers are quite different (the
original ADMA spec didn't envision SATA, NCQ or 64-bit DMA while the
NVIDIA controllers support these for example). Even if there is some
shared IP, the issues with these controllers seem to be more controller
bugs than issues with how the controller is being used.

In fact, the later NVIDIA Windows drivers suspiciously removed all
references to NCQ support in the control panel, which suggests that maybe
even they gave up on it. Even if you don't use any ADMA features at all
(even when using the default Microsoft IDE driver in Windows), the error
handling is very shaky - things like disc read errors on an optical drive
connected to the controller will sometimes hard-lock the machine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: linux: sata_nv: adma support
On Sun, Aug 2, 2015 at 3:08 AM, Pali Rohár wrote: > On Sunday 02 August 2015 03:45:32 Robert Hancock wrote: >> On Sat, Aug 1, 2015 at 2:09 PM, Pali Rohár >> wrote: >> > On Thursday 25 December 2014 07:22:13 Robert Hancock wrote: >> >> On Tue, Dec 23, 2014 at 1:51 PM, Pali Rohár >> >> >> >> wrote: >> >> > Hello, >> >> > >> >> > I have nvidia nforce4 motherboard with nvidia sata controller: >> >> > >> >> > 00:07.0 IDE interface [0101]: NVIDIA Corporation CK804 Serial >> >> > ATA Controller [10de:0054] (rev f3) >> >> > 00:08.0 IDE interface [0101]: NVIDIA Corporation CK804 Serial >> >> > ATA Controller [10de:0055] (rev f3) >> >> > >> >> > I manually enabled adma mode (which is disabled by default) by >> >> > adding sata_nv.adma=1 to grub cmdline. In git history I found >> >> > that enabling adma mode includes NCQ support and reduced CPU >> >> > overhead. It looks like adma mode is working, but at every boot >> >> > I see one same error message in dmesg: >> >> > >> >> > [ 16.823514] ata1.00: exception Emask 0x1 SAct 0x0 SErr 0x0 >> >> > action 0x0 >> >> > [ 16.823520] ata1.00: CPB resp_flags 0x11: , CMD error >> >> > [ 16.823524] ata1.00: failed command: SET FEATURES >> >> > [ 16.823530] ata1.00: cmd ef/05:fe:00:00:00/00:00:00:00:00/40 >> >> > tag 16 >> >> > [ 16.823530] res 51/04:fe:00:00:00/00:00:00:00:00/40 >> >> > Emask 0x1 (device error) >> >> > [ 16.823533] ata1.00: status: { DRDY ERR } >> >> > [ 16.823535] ata1.00: error: { ABRT } >> >> > >> >> > When adma is disabled then this error message is not generated. >> >> >> >> It looks like something is trying to issue a command to disable >> >> APM power management on the drive, and the command fails (likely >> >> because it doesn't support that command). I'm not sure where that >> >> would be coming from - I'm pretty sure the kernel doesn't issue >> >> that command itself. Something that's part of your distro >> >> perhaps? 
>> >> >> >> I don't know why it would only be failing in ADMA mode either, >> >> though depending on where the command is coming from, maybe it's >> >> not being issued otherwise for some reason? >> >> >> >> > What does that error message means? It is critical? What is that >> >> > command SET FEATURES doing? Are there any problems with adma >> >> > mode on nforce4 motherboards? Because I did not see any >> >> > problems (except that one error message). >> >> > >> >> > -- >> >> > Pali Rohár >> >> > pali.ro...@gmail.com >> > >> > Hello, >> > >> > now after long time I did more investigation and that error is >> > reported for every connected HDD. I identified that it comes from >> > udev script >> > >> > /lib/udev/rules.d/85-hdparm.rules >> > >> > which just call script /lib/udev/hdparm for every one connected >> > HDD. >> > >> > Script /lib/udev/hdparm just call: >> > /sbin/hdparm -B254 $DRIVE >> > >> > And that -B254 cause above error message in dmesg log. Output from >> > >> > hdparm is: >> > /dev/sda: >> > setting Advanced Power Management level to 0xfe (254) >> > APM_level = not supported >> > >> > Any idea why in ADMA mode it cause above error (APM unsupported) >> > and in non ADMA mode it is working fine? Maybe APM ATA commands >> > should not be sent via ADMA? >> > >> > Here is another output: >> > $ sudo hdparm -I /dev/sda | grep -i power >> > >> > *Power Management feature set >> > >> > Power-Up In Standby feature set >> > >> > *SET_FEATURES required to spinup after power up >> > *Host-initiated interface power management >> >> The "set features" command is a non-data command so based on our >> current knowledge, it should work in ADMA mode. However, these NVIDIA >> SATAs are black boxes, and rather buggy ones at that, so it's >> possible there's an unknown issue there. >> > > Maybe I should note that hdpar
Re: linux: sata_nv: adma support
On Sat, Aug 1, 2015 at 2:09 PM, Pali Rohár wrote: > On Thursday 25 December 2014 07:22:13 Robert Hancock wrote: >> On Tue, Dec 23, 2014 at 1:51 PM, Pali Rohár >> wrote: >> > Hello, >> > >> > I have nvidia nforce4 motherboard with nvidia sata controller: >> > >> > 00:07.0 IDE interface [0101]: NVIDIA Corporation CK804 Serial ATA >> > Controller [10de:0054] (rev f3) >> > 00:08.0 IDE interface [0101]: NVIDIA Corporation CK804 Serial ATA >> > Controller [10de:0055] (rev f3) >> > >> > I manually enabled adma mode (which is disabled by default) by >> > adding sata_nv.adma=1 to grub cmdline. In git history I found >> > that enabling adma mode includes NCQ support and reduced CPU >> > overhead. It looks like adma mode is working, but at every boot I >> > see one same error message in dmesg: >> > >> > [ 16.823514] ata1.00: exception Emask 0x1 SAct 0x0 SErr 0x0 >> > action 0x0 >> > [ 16.823520] ata1.00: CPB resp_flags 0x11: , CMD error >> > [ 16.823524] ata1.00: failed command: SET FEATURES >> > [ 16.823530] ata1.00: cmd ef/05:fe:00:00:00/00:00:00:00:00/40 >> > tag 16 >> > [ 16.823530] res 51/04:fe:00:00:00/00:00:00:00:00/40 >> > Emask 0x1 (device error) >> > [ 16.823533] ata1.00: status: { DRDY ERR } >> > [ 16.823535] ata1.00: error: { ABRT } >> > >> > When adma is disabled then this error message is not generated. >> >> It looks like something is trying to issue a command to disable APM >> power management on the drive, and the command fails (likely because >> it doesn't support that command). I'm not sure where that would be >> coming from - I'm pretty sure the kernel doesn't issue that command >> itself. Something that's part of your distro perhaps? >> >> I don't know why it would only be failing in ADMA mode either, though >> depending on where the command is coming from, maybe it's not being >> issued otherwise for some reason? >> >> > What does that error message means? It is critical? What is that >> > command SET FEATURES doing? 
Are there any problems with adma mode
>> > on nforce4 motherboards? Because I did not see any problems
>> > (except that one error message).
>> >
>> > --
>> > Pali Rohár
>> > pali.ro...@gmail.com
>
> Hello,
>
> now after long time I did more investigation and that error is reported
> for every connected HDD. I identified that it comes from udev script
>
> /lib/udev/rules.d/85-hdparm.rules
>
> which just call script /lib/udev/hdparm for every one connected HDD.
>
> Script /lib/udev/hdparm just call:
>
> /sbin/hdparm -B254 $DRIVE
>
> And that -B254 cause above error message in dmesg log. Output from
> hdparm is:
>
> /dev/sda:
> setting Advanced Power Management level to 0xfe (254)
> APM_level = not supported
>
> Any idea why in ADMA mode it cause above error (APM unsupported) and in
> non ADMA mode it is working fine? Maybe APM ATA commands should not be
> sent via ADMA?
>
> Here is another output:
>
> $ sudo hdparm -I /dev/sda | grep -i power
> *Power Management feature set
> Power-Up In Standby feature set
> *SET_FEATURES required to spinup after power up
> *Host-initiated interface power management

The "set features" command is a non-data command so based on our current
knowledge, it should work in ADMA mode. However, these NVIDIA SATAs are
black boxes, and rather buggy ones at that, so it's possible there's an
unknown issue there. The easiest way to test that would be to take out
the condition check for qc->tf.protocol == ATA_PROT_NODATA in
nv_adma_use_reg_mode in drivers/ata/sata_nv.c. That would force it to
disable ADMA for all non-data commands.

I really don't know why Ubuntu is disabling APM on all drives on bootup
however. Especially for laptops, that seems like a silly thing to do
explicitly. Sounds like one of the silly things Ubuntu is known to do
without consulting people.
Re: 3.18 regression: Error while assigning device slot ID, USB3 devices not detected
On Mon, Jan 19, 2015 at 9:40 AM, Josh Boyer wrote: > On Mon, Jan 19, 2015 at 9:57 AM, Mathias Nyman > wrote: >> On 19.01.2015 15:47, Josh Boyer wrote: >>> On Mon, Jan 19, 2015 at 8:33 AM, Greg KH wrote: >>>> On Mon, Jan 19, 2015 at 08:28:19AM -0500, Josh Boyer wrote: >>>>> On Sun, Jan 18, 2015 at 1:25 AM, Greg KH >>>>> wrote: >>>>>> On Sun, Jan 18, 2015 at 12:08:18AM -0600, Robert Hancock wrote: >>>>>>> I've got an Intel Haswell-based system with a Gigabyte Z87X-D3H >>>>>>> motherboard >>>>>>> under Fedora 21. After updating to the 3.18.2-200 Fedora kernel, I >>>>>>> noticed >>>>>>> some errors in dmesg and at least some of my USB3 ports don't recognize >>>>>>> any >>>>>>> USB3 devices plugged into them: >>>>>>> >>>>>>> [0.560838] xhci_hcd :00:14.0: Error while assigning device slot >>>>>>> ID >>>>>>> [0.560912] xhci_hcd :00:14.0: Max number of devices this xHCI >>>>>>> host >>>>>>> supports is 32. >>>>>>> [0.560990] usb usb2-port2: couldn't allocate usb_device >>>>>>> [0.561098] xhci_hcd :00:14.0: Error while assigning device slot >>>>>>> ID >>>>>>> [0.561163] xhci_hcd :00:14.0: Max number of devices this xHCI >>>>>>> host >>>>>>> supports is 32. >>>>>>> [0.561239] usb usb2-port5: couldn't allocate usb_device >>>>>>> [0.561344] xhci_hcd :00:14.0: Error while assigning device slot >>>>>>> ID >>>>>>> [0.561409] xhci_hcd :00:14.0: Max number of devices this xHCI >>>>>>> host >>>>>>> supports is 32. >>>>>>> [0.561484] usb usb2-port6: couldn't allocate usb_device >>>>>>> >>>>>>> This worked fine under 3.17. Is this a known problem? >>>>>> >>>>>> Yes it is, should be fixed in Linus's tree now and will be backported to >>>>>> the latest 3.18-stable tree in a week or so. >>>>> >>>>> Do you happen to know the commit id? >>>> >>>> f161ead70fa6a62e432dff6e9dab8e3cfbeabea6 >>> >>> Thanks! >>> >> >> Tell me if this fixed the issue. >> I got this gut feeling this might be something else. >> It should have failed in 3.17 as well > > OK. 
We're tracking this in the bug Robert filed here:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1183289
>
> The patch should be in the next Fedora kernel build, so hopefully we
> can get back to you soon.

The patched build seems to fix the problem with the ports not recognizing
SuperSpeed devices. I do get these errors in dmesg now though, which also
weren't in 3.17:

[0.881967] usb: failed to peer 1-9-port1 and 2-5-port1 by location (1-9-port1:none) (2-5-port1:usb1-port11)
[0.881968] usb 1-9-port1: failed to peer to 2-5-port1 (-16)
[0.881969] usb: port power management may be unreliable
[0.881993] usb: failed to peer 1-9-port2 and 2-5-port2 by location (1-9-port2:none) (2-5-port2:usb1-port12)
[0.881994] usb 1-9-port2: failed to peer to 2-5-port2 (-16)
[0.882015] usb: failed to peer 1-9-port3 and 2-5-port3 by location (1-9-port3:none) (2-5-port3:usb1-port13)
[0.882016] usb 1-9-port3: failed to peer to 2-5-port3 (-16)
[0.882037] usb: failed to peer 1-9-port4 and 2-5-port4 by location (1-9-port4:none) (2-5-port4:usb1-port14)
[0.882038] usb 1-9-port4: failed to peer to 2-5-port4 (-16)

Full dmesg is attached to this bug report:

https://bugzilla.redhat.com/show_bug.cgi?id=1183289
3.18 regression: Error while assigning device slot ID, USB3 devices not detected
I've got an Intel Haswell-based system with a Gigabyte Z87X-D3H motherboard
under Fedora 21. After updating to the 3.18.2-200 Fedora kernel, I noticed
some errors in dmesg and at least some of my USB3 ports don't recognize any
USB3 devices plugged into them:

[0.560838] xhci_hcd :00:14.0: Error while assigning device slot ID
[0.560912] xhci_hcd :00:14.0: Max number of devices this xHCI host supports is 32.
[0.560990] usb usb2-port2: couldn't allocate usb_device
[0.561098] xhci_hcd :00:14.0: Error while assigning device slot ID
[0.561163] xhci_hcd :00:14.0: Max number of devices this xHCI host supports is 32.
[0.561239] usb usb2-port5: couldn't allocate usb_device
[0.561344] xhci_hcd :00:14.0: Error while assigning device slot ID
[0.561409] xhci_hcd :00:14.0: Max number of devices this xHCI host supports is 32.
[0.561484] usb usb2-port6: couldn't allocate usb_device

This worked fine under 3.17. Is this a known problem?
Re: [Bug] 3.16 fwdownload failure on Marvell 88SE9125 sata controller
On 01/09/14 02:22 AM, Ming Lei wrote:
> Hi Guys,
>
> When we use hdparm to download firmware on system with Marvell 88SE9125
> SATA controller, it returns failure always and it has been observed in
> several systems:
>
> #hdparm --fwdownload-mode7 fw.bin --yes-i-know-what-i-am-doing --please-destroy-my-drive /dev/sda
>
> /dev/sda:
> fwdownload: xfer_mode=7 min=1 max=65535 size=699392
> FAILED: Input/output error

Any errors in dmesg after doing this?
Re: OT: Open letter to the Linux World
On 12/08/14 01:38 PM, Christopher Barry wrote:
> What is intelligence? Not exactly the spook kind, but rather what is the
> definition of intelligence in humans? This is pretty good:
> http://en.wikipedia.org/wiki/Intelligence#Definitions
>
> By most accounts, the self-appointed and arguably too influential
> creators and thinkers of the day around the 'One Linux' idea fit the
> definition of intelligent people - at least in the technical realm. And
> their messages are pretty compelling:
>
> * Simplify cross-distro development.
> * Enable faster boot times.
> * Enable an on-demand, event driven architecture, similar to 'Modern'
>   Operating Systems.
> * Bring order and control to subsystems that have had as many different
>   tools as there were distros.
>
> All seemingly noble goals. All apparently come from a deep desire to
> contribute and make things better. Almost anyone could argue that these
> intelligent people thought hard about these issues, and put an enormous
> amount of effort into a solution to these problems. Unfortunately, the
> solution they came up with, as you may have guessed by now, is 'systemd'.
> While not new, its grotesque impact has finally reached me and I must
> speak to it publicly.
>
> So, what is systemd? Well, meet your new God. You may have been praying
> at the altar of simplicity, but your religion is being deprecated. It
> likely already happened without your knowledge during an upgrade of your
> Linux box. systemd is the all knowing, all controlling meta-deity that
> sees all and supervises all. It's the new One Master Process that
> aspires to control everything it can - and it's already doing a lot.
> It's what init would look like if it were a transformer on steroids.
> It's complicated, multi-faceted, opaque, and supremely powerful.
>
> I had heard about systemd a few years back, when upstart and some other
> init replacements I can't remember were showing up on the scene. And
> while it seemed mildly interesting, I was not in favor of using it, nor
> any of them for that matter. init was working just fine for me. init was
> simple and robust. While configuration had its distro-specific
> differences, it was often these differences that made one pick the
> distro to use in the first place, and to stay with that distro. The
> tools essentially *were* the distro.
>
> I just dist-upgraded to Jessie, and voila - PID 1 was suddenly systemd.
> What a clusterfuck.

You might want to send this to a mailing list that's remotely relevant,
like perhaps a Debian one. Though I wouldn't expect a very productive
response there either, since you neglected to include any reasons behind
your rant other than "they changed it, now it sucks".
Re: Bad DMA from Marvell 9230
On 27/03/14 09:19 AM, Tejun Heo wrote:
> On Thu, Mar 27, 2014 at 05:57:37PM +1100, Benjamin Herrenschmidt wrote:
>> I've contacted Marvell, but I was wondering if anybody here had already
>> experienced something similar or has an idea of what else the chip might
>> be doing wrong so we can try to find a workaround ?
>
> No idea. First time to hear such problem. :(

There are other Marvell controllers that do DMA requests from the wrong
PCI function ID and cause IOMMU issues, so it seems like testing on such
systems (or just validating the DMA transactions done by the controller by
some other means) isn't something that Marvell likes to do. Presumably
reading from address 0 is normally fine without an IOMMU, so this problem
wouldn't be noticed otherwise.
Re: Bug 71331 - mlock yields processor to lower priority process
On 21/03/14 08:50 AM, jimmie.da...@l-3com.com wrote:
>
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
>
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
>
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done. You don't expect it to block, and your task to be
>> pre-empted.
>
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
>
> -Mike
>
> Two options.
>
> #1. Return with a status value of EAGAIN.
>
> or
>
> #2. Don't return until you can do it.
>
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is very clear. Run this task until
> it is completed or it blocks (and until a bit ago, mlock() did not block).

Returning EAGAIN is not something that the API definition from POSIX
allows for, that is only for indicating a failure. If the memory that is
being locked is not currently residing in RAM, then the memory will need
to be swapped in before the call returns, which clearly cannot be done
without blocking. Thus mlock can potentially block, which has not changed.

Whether or not any kernel behavior has changed to cause this to happen in
some cases where it didn't previously, the fact remains that this is
allowed behavior. Generally real-time applications should not be doing
mlock calls during their real-time execution for that reason. The required
memory regions should be locked during startup so that this kind of
execution delay can be avoided at runtime.

>
> SCHED_FIFO users don't care about fairness. They want the system to do what it is told.
>
> regards,
> Bud Davis
Re: 15995MB available under Linux but 16329MB available under Win 7
On 21/03/14 10:35 PM, Branimir Maksimovic wrote:
> This really puzzles me.
>
> bmaxa@maxa:~$ lspci -v -s 01:00.0
> 01:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560 Ti] (rev a1) (prog-if 00 [VGA controller])
>     Subsystem: CardExpert Technology Device 0801
>     Flags: bus master, fast devsel, latency 0, IRQ 52
>     Memory at f400 (32-bit, non-prefetchable) [size=32M]
>     Memory at e800 (64-bit, prefetchable) [size=128M]
>     Memory at f000 (64-bit, prefetchable) [size=64M]
>     I/O ports at e000 [size=128]
>     [virtual] Expansion ROM at f600 [disabled] [size=512K]
>     Capabilities:
>     Kernel driver in use: nvidia
>
> bmaxa@maxa:~$ dmesg | grep Mem
> [0.00] Memory: 16358808K/16721316K available (7232K kernel code, 1106K rwdata, 3456K rodata, 1324K init, 1432K bss, 362508K reserved)

Can you post your full dmesg output?

> Everything is clear most of reserved RAM goes to VGA mapping?
> But, Windows resource monitor and task manager says just around 50MB
> hardware reserved?
> So I have tried different kernel boot parameters to no avail.
> So question is:
> Does Windows magically maps VGA to upper addresses (beyond 16GB)
> or simply Windows does not reports all reserved RAM?
> When I disable memory remap in BIOS Windows reports 500MB more reserved,
> but Linux adds this 500MB to 300MB reserved giving around 800MB.
> CPU is i5 3570k, motherboard z77, 16GB of DD3 RAM, AMI BIOS.
>
> Branimir.
Re: xHCI regression in stable 3.13.5 with USB3 card reader (Bisected)
On 05/03/14 11:17 PM, Robert Hancock wrote: I have a USB 3.0 multi-card reader device: Bus 004 Device 002: ID 05e3:0743 Genesys Logic, Inc. which seems to work fine in 3.13.4 (Fedora version kernel-3.13.4-200 specifically) but fails in 3.13.5 (specifically kernel-3.13.5-202). Below is what I get in dmesg. Essentially there's a bunch of input/output errors making the reader mostly unusable. This is on an Intel Haswell machine with this controller: 00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05) It looks like there were some XHCI commits that went into 3.13.5 so it seems likely one of those is the cause. I can try current git if there's anything in there that's likely to fix it. But it does seem like a regression got into the stable kernel in this respect. Bisecting between 3.13.4 and 3.13.5 gives me this: c8f44f98901994832ccecb87c3dd7900274b699a is the first bad commit commit c8f44f98901994832ccecb87c3dd7900274b699a Author: Sarah Sharp Date: Fri Jan 31 11:26:25 2014 -0800 xhci 1.0: Limit arbitrarily-aligned scatter gather. commit 247bf557273dd775505fb9240d2d152f4f20d304 upstream. xHCI 1.0 hosts have a set of requirements on how to align transfer buffers on the endpoint rings called "TD fragment" rules. When the ax88179_178a driver added support for scatter gather in 3.12, with commit 804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a: enable tso if usb host supports sg dma", it broke the device under xHCI 1.0 hosts. Under certain network loads, the device would see an unexpected short packet from the host, which would cause the device to stop sending ethernet packets, even through USB packets would still be sent. Commit 35773dac5f86 "usb: xhci: Link TRB must not occur within a USB payload burst" attempted to fix this. It was a quick hack to partially implement the TD fragment rules. However, it caused regressions in the usb-storage layer and userspace USB drivers using libusb. 
The patches to attempt to fix this are too far reaching into the USB
core, and we really need to implement the TD fragment rules correctly in
the xHCI driver, instead of continuing to wallpaper over the issues.

Disable arbitrarily-aligned scatter-gather in the xHCI driver for 1.0
hosts. Only the ax88179_178a driver checks the no_sg_constraint flag, so
don't set it for 1.0 hosts. This should not impact usb-storage or usbfs
behavior, since they pass down max packet sized aligned sg-list entries
(512 for USB 2.0 and 1024 for USB 3.0).

Signed-off-by: Sarah Sharp
Tested-by: Mark Lord
Cc: David Laight
Cc: Bjørn Mork
Cc: Freddy Xin
Cc: Ming Lei
Signed-off-by: Greg Kroah-Hartman
xHCI regression in stable 3.13.5 with USB3 card reader
I have a USB 3.0 multi-card reader device:

Bus 004 Device 002: ID 05e3:0743 Genesys Logic, Inc.

which seems to work fine in 3.13.4 (Fedora version kernel-3.13.4-200 specifically) but fails in 3.13.5 (specifically kernel-3.13.5-202). Below is what I get in dmesg. Essentially there's a bunch of input/output errors making the reader mostly unusable. This is on an Intel Haswell machine with this controller:

00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05)

It looks like there were some XHCI commits that went into 3.13.5 so it seems likely one of those is the cause. I can try current git if there's anything in there that's likely to fix it. But it does seem like a regression got into the stable kernel in this respect.

[ 25.177926] usb 4-2: Disable of device-initiated U1 failed.
[ 26.906531] usb 4-2: reset SuperSpeed USB device number 2 using xhci_hcd
[ 26.918439] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88003f912a00
[ 26.918441] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88003f912a40
[ 26.921116] sd 6:0:0:0: [sdc] Unhandled error code
[ 26.921118] sd 6:0:0:0: [sdc]
[ 26.921120] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 26.921120] sd 6:0:0:0: [sdc] CDB:
[ 26.921121] Read(10): 28 00 00 00 08 23 00 00 f0 00
[ 26.921126] end_request: I/O error, dev sdc, sector 2083
[ 27.208871] sd 6:0:0:0: [sdc] Media Changed
[ 27.208874] sd 6:0:0:0: [sdc]
[ 27.208875] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 27.208876] sd 6:0:0:0: [sdc]
[ 27.208877] Sense Key : Unit Attention [current]
[ 27.208878] sd 6:0:0:0: [sdc]
[ 27.208880] Add. Sense: Not ready to ready change, medium may have changed
[ 27.208880] sd 6:0:0:0: [sdc] CDB:
[ 27.208881] Read(10): 28 00 00 00 08 24 00 00 ef 00
[ 27.208886] end_request: I/O error, dev sdc, sector 2084
[ 27.210467] FAT-fs (sdc1): FAT read failed (blocknr 35)
[ 49.420334] usb 4-2: Disable of device-initiated U1 failed.
[ 51.139080] usb 4-2: reset SuperSpeed USB device number 2 using xhci_hcd
[ 51.150979] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88003f912a00
[ 51.150981] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88003f912a40
[ 51.153663] sd 6:0:0:0: [sdc] Unhandled error code
[ 51.153665] sd 6:0:0:0: [sdc]
[ 51.153666] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 51.153667] sd 6:0:0:0: [sdc] CDB:
[ 51.153668] Read(10): 28 00 00 00 08 25 00 00 ee 00
[ 51.153672] end_request: I/O error, dev sdc, sector 2085
[ 51.441377] sd 6:0:0:0: [sdc] Media Changed
[ 51.441379] sd 6:0:0:0: [sdc]
[ 51.441380] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 51.441381] sd 6:0:0:0: [sdc]
[ 51.441382] Sense Key : Unit Attention [current]
[ 51.441384] sd 6:0:0:0: [sdc]
[ 51.441385] Add. Sense: Not ready to ready change, medium may have changed
[ 51.441386] sd 6:0:0:0: [sdc] CDB:
[ 51.441386] Read(10): 28 00 00 00 08 26 00 00 ed 00
[ 51.441391] end_request: I/O error, dev sdc, sector 2086
[ 51.441454] FAT-fs (sdc1): FAT read failed (blocknr 37)
[ 51.442083] FAT-fs (sdc1): FAT read failed (blocknr 37)
[ 51.442570] FAT-fs (sdc1): FAT read failed (blocknr 235)
[ 164.219227] usb 4-2: Disable of device-initiated U1 failed.
[ 165.938731] usb 4-2: reset SuperSpeed USB device number 2 using xhci_hcd
[ 165.950669] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88003f912a00
[ 165.950672] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88003f912a40
[ 165.953366] sd 6:0:0:0: [sdc] Unhandled error code
[ 165.953368] sd 6:0:0:0: [sdc]
[ 165.953369] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 165.953370] sd 6:0:0:0: [sdc] CDB:
[ 165.953371] Read(10): 28 00 00 00 08 27 00 00 ec 00
[ 165.953375] end_request: I/O error, dev sdc, sector 2087
[ 166.240995] sd 6:0:0:0: [sdc] Media Changed
[ 166.240997] sd 6:0:0:0: [sdc]
[ 166.240999] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 166.241000] sd 6:0:0:0: [sdc]
[ 166.241000] Sense Key : Unit Attention [current]
[ 166.241002] sd 6:0:0:0: [sdc]
[ 166.241003] Add. Sense: Not ready to ready change, medium may have changed
[ 166.241004] sd 6:0:0:0: [sdc] CDB:
[ 166.241005] Read(10): 28 00 00 00 08 28 00 00 eb 00
[ 166.241010] end_request: I/O error, dev sdc, sector 2088
[ 166.241055] FAT-fs (sdc1): FAT read failed (blocknr 39)
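As a cross-check on the log above, the Read(10) CDB bytes can be decoded by hand. The sketch below is not part of the original report. It assumes the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and 512-byte logical blocks, so that the LBA lines up with the sector number in the end_request lines. decode_read10 is a hypothetical helper name.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Raises ValueError if the bytes are not a
    10-byte CDB starting with the READ(10) opcode 0x28.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


# The first two failing commands from the dmesg output above.
for cdb in ("28 00 00 00 08 23 00 00 f0 00",
            "28 00 00 00 08 24 00 00 ef 00"):
    lba, blocks = decode_read10(cdb)
    print(f"LBA {lba}, {blocks} blocks, ends before LBA {lba + blocks}")
```

For the first command this gives LBA 2083 and 240 blocks, which agrees with "sector 2083" in the log. The second starts one sector later and is one block shorter, so both requests end at the same place.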
Problem with hibernate partitions and encrypted volumes
A while ago I reported this Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=981841 and thought I would check on the kernel side about what the best way to handle the problem was.

Essentially the problem relates to the way in which the kernel stores the device that it uses to resume and hibernate to/from. During boot, dracut writes the device major/minor number of the swap partition to /sys/power/resume, which triggers the kernel to resume from disk if a valid resume image is present. If not, bootup continues normally, and the kernel has stashed away that major/minor pair as swsusp_resume_device. Later on, when you choose to hibernate the system and "disk" gets written to /sys/power/state, the kernel uses that device as the swap partition it will try to save the hibernate image to.

The problem comes in when the swap partition is on a LUKS encrypted volume - specifically, when there is more than one encrypted volume on the system. (In my case, the machine has separate encrypted /home and swap partitions.) Since the kernel stores the resume partition as a major/minor pair, it's sensitive to any change in the device ordering. It appears that at some point in the Fedora boot process, the device nodes for the encrypted volumes get torn down and re-created, and there is apparently no guarantee of the order in which this will occur. If the devices get recreated in the opposite order (for example if the swap partition was originally minor 1 and is now minor 2), the stored device ID will no longer refer to a swap partition, and the hibernate process discovers this and aborts.

It seems like all of this could be avoided if there was a way for userspace to set the device used to store the hibernate image before triggering hibernation. As far as I can see there is no way to change the device stored in swsusp_resume_device without writing to /sys/power/resume, which immediately tries to resume from it. That seems like quite a hack when one is trying to hibernate. It seems like the "set image device" and "resume" requests should be separated. Any thoughts?
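The mismatch described above can at least be detected from userspace before hibernating. The following is only a diagnostic sketch, not a fix and not anything proposed in the thread: it compares the major:minor pair read back from /sys/power/resume with the device numbers of the swap partitions listed in /proc/swaps. All function names are hypothetical, and the device numbers in the usage example are made up.

```python
import os


def parse_dev(text):
    """Parse a 'major:minor' string as read from /sys/power/resume."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def active_swap_devices(swaps_text):
    """Return the paths of partition-type entries in /proc/swaps content."""
    devices = []
    for line in swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "partition":
            devices.append(fields[0])
    return devices


def dev_numbers(path):
    """Return (major, minor) of the block device node at path."""
    st = os.stat(path)
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def resume_device_is_swap(resume_text, swaps_text, lookup=dev_numbers):
    """True if the stored resume device matches an active swap partition."""
    target = parse_dev(resume_text)
    return any(lookup(path) == target for path in active_swap_devices(swaps_text))


if __name__ == "__main__":
    with open("/sys/power/resume") as f:
        resume = f.read()
    with open("/proc/swaps") as f:
        swaps = f.read()
    print("resume device is an active swap partition:",
          resume_device_is_swap(resume, swaps))
```

The lookup parameter exists only so the comparison can be exercised without real device nodes, for example by passing a function that returns a fixed (major, minor) pair.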
Re: [BUGREPORT] Linux USB 3.0
On 08/02/14 03:00 AM, Markus Rechberger wrote:
On Tue, Feb 4, 2014 at 10:31 AM, David Laight wrote:
From: Markus Rechberger

Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr

These messages might be harmless. The 3.0 kernel contains a fix for Intel Panther Point xHCI hosts that suppresses those messages, commit ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious successful event." A later commit extends that to all xHCI 1.0 hosts, commit 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was queued for 3.11 and marked to be backported into stable kernels as old as 3.0.

I see the same error message on the 0.96 ASMedia controller when the rx buffers for the ax88179_178a driver cross 64k boundaries. So this isn't confined to 1.0 controllers.

Sarah, since there is no response yet, is there anyone at Intel dedicated to working on USB 3.0? We are also getting more and more negative USB 3.0 feedback with Linux.

Still nobody appears to have provided the debugging information that was requested. So there is not much that can be done upstream to debug things based only on vague reports, especially when not using current kernel versions.
Re: 3.12-rc5 and overwritten partition table - by powertop?
On 10/29/2013 04:32 PM, John Twideldum wrote: The first ~170kb of /dev/sda got blown away with what seems to be a logging output by Powertop, when I was playing with the tuneables. So did you log the output to some file? I'm just trying to understand how it could get onto your disk in the first place... Attached a dump of the first 1Mb of the disk, HTH. It looks like a powertop log? (I have powertop 2.4) Yes, likely. But it is strange the corruption doesn't even end at any sensible boundary (data ends at offset 0x27b53). Shrug... My recollection what I did is this: I was looking into powertop and observing how -rc5 works now with Haswell. I saw the tuneable parameters and quite a few were "bad", so I set them to "good". Power usage dropped about one third - yay! However, changing "SATA link power" threw up complaints: Oct 29 09:09:21 localhost kernel: [ 3697.423868] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0xc action 0x6 frozen Oct 29 09:09:21 localhost kernel: [ 3697.423873] ata1.00: irq_stat 0x0800, interface fatal error Oct 29 09:09:21 localhost kernel: [ 3697.423877] ata1: SError: { CommWake 10B8B } Oct 29 09:09:21 localhost kernel: [ 3697.423880] ata1.00: failed command: WRITE FPDMA QUEUED Oct 29 09:09:21 localhost kernel: [ 3697.423886] ata1.00: cmd 61/38:00:01:9e:a4/01:00:00:00:00/40 tag 0 ncq 159744 out Oct 29 09:09:21 localhost kernel: [ 3697.423886] res 50/01:00:01:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error) Oct 29 09:09:21 localhost kernel: [ 3697.423888] ata1.00: status: { DRDY } Oct 29 09:09:21 localhost kernel: [ 3697.423894] ata1: hard resetting link Oct 29 09:09:22 localhost kernel: [ 3697.743196] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Oct 29 09:09:22 localhost kernel: [ 3697.744707] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.744719] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.744725] 
ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.744813] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.745212] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 Oct 29 09:09:22 localhost kernel: [ 3697.746694] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.746705] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.746711] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.746779] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.747286] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 Oct 29 09:09:22 localhost kernel: [ 3697.747432] ata1.00: configured for UDMA/133 Oct 29 09:09:22 localhost kernel: [ 3697.763181] ata1: EH complete I did not know yet about what "frozen" means, so I did not investigate and very soon powered down as I had to leave. Next time I boot up I did not boot. So data probable is just the size because as long as I had powertop running... (CCing linux-ide) It seems like most likely either the SATA host controller or drive doesn't play nice with link power management enabled. Can you post the full dmesg boot log? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2] ACPI/Battery: Add a _BIX quirk for NEC LZ750/LS
On 01/14/2014 03:37 PM, Rafael J. Wysocki wrote:
> On Tuesday, January 14, 2014 04:06:01 PM Matthew Garrett wrote:
>> On Mon, Jan 06, 2014 at 11:25:53PM +0100, Rafael J. Wysocki wrote:
>>> Queued up as a fix for 3.13 (I fixed up the indentation).
>>
>> Ah, sorry, I missed this chunk of the thread. If the system provides
>> valid _BIF data then we should possibly just fall back to that rather
>> than adding another quirk table.
>
> The problem is to know that _BIX is broken. If we could figure that out
> upfront, we wouldn't need the quirk table in any case.
>
> Tianyu, can we do some effort during the driver initialization to detect
> this breakage and handle it without blacklisting systems?

Yes, the usual question in such cases is "how does Windows manage to function on such systems, (almost certainly) without a system-specific hack, and can we replicate that behavior?"
Re: tg3 and sd card reader at acer aspire
On 12/24/2013 11:52 AM, Bjorn Helgaas wrote: [+cc linux-pci because I think this is related to PCI ASPM] I'm afraid nobody wants to touch ASPM because it's such a mess, but I hope somebody will step up and investigate this. So apparently this machine got broken when we changed the behavior for the ACPI ASPM-not-supported bit from forcing ASPM off for everything to leaving ASPM in whatever state the BIOS left it in. And apparently for some reason the device (devices?) doesn't work with ASPM enabled in Linux but does in Windows. It's possible that some other workaround is being applied by the Windows driver that allows it to work there. I'm not too sure what the next step to debug that would be, unless maybe someone has a contact at Broadcom. We could conceivably add a quirk to force ASPM off for this device regardless of what the BIOS says, though. On Tue, Dec 24, 2013 at 1:27 AM, Vasiliy Tolstov wrote: Hi all and sorry for may be spamming mailing list. I have acer aspire v5-17 with broadcom card reader and ethernet card. I'm affecting on this ubuntu bugs. But bug is present in vanilla linux current git and stable lts. Can somebody helps me and say - where i can post message and discuss this problem. As i see ubuntu team can't solve this problem (bug present is about 1 year and nothing changed). One ubuntu user wia bisecting find broken commit - https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=6cac12dfab9c57a4f76821412224b226a9b08dff Relevant ubuntu bugs: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178131 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1067222 Thanks! 
-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: [PATCH v2 6/8] PCI: acpiphp: workaround for Thunderbolt on Acer Aspire S5
On 07/03/2013 03:40 PM, Rafael J. Wysocki wrote: On Wednesday, July 03, 2013 05:04:53 PM Mika Westerberg wrote: From: "Kirill A. Shutemov" Correct ACPI PCI hotplug imeplementation should have _RMV method in a PCI slot (device under pci bridge). In Acer Aspire S5 case we have it deeper in hierarchy: Device (RP05) { // ... Device (HRUP) { // ... Device (HRDN) { // ... Device (EPUP) { // ... Method (_RMV, 0, NotSerialized) // _RMV: Removal Status { Return (One) } } } } } Signed-off-by: Kirill A. Shutemov Signed-off-by: Mika Westerberg --- drivers/pci/hotplug/acpi_pcihp.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/pci/hotplug/acpi_pcihp.c b/drivers/pci/hotplug/acpi_pcihp.c index 2a47e82..d92ebfb 100644 --- a/drivers/pci/hotplug/acpi_pcihp.c +++ b/drivers/pci/hotplug/acpi_pcihp.c @@ -422,6 +422,19 @@ static int pcihp_is_ejectable(acpi_handle handle) status = acpi_evaluate_integer(handle, "_RMV", NULL, &removable); if (ACPI_SUCCESS(status) && removable) return 1; + + /* +* Workaround for Thunderbolt implementation on Acer Aspire S5. +* +* Correct ACPI PCI hotplug imeplementation has _RMV method in a PCI +* slot (device under pci bridge). In Acer Aspire S5 case we have it +* deeper in hierarchy. +*/ + status = acpi_evaluate_integer(handle, "HRDN.EPUP._RMV", NULL, + &removable); Well, calling stuff like this directly from a general function is kind of ugly. Can we use something like a quirk instead? A DMI check or something? Presumably this device functions under Windows so clearly Windows is capable of dealing with this case, so we should too. There are way too many of these silly DMI checks in the kernel - we should be way more hesitant to add more of them. They're almost guaranteed to be incomplete. I would say they should be avoided whenever possible unless there's some reason why a general workaround can't be used. 
+ if (ACPI_SUCCESS(status) && removable) + return 1; + return 0; } Rafael
Re: /sys/module/pcie_aspm/parameters/policy not writable?
On 07/09/2013 03:49 AM, Pavel Machek wrote:
> On Mon 2013-07-08 21:13:21, Greg KH wrote:
>> On Tue, Jul 09, 2013 at 03:26:11AM +0200, Pavel Machek wrote:
>>> Hi!
>>>
>>> My thinkpad has rather high ping latencies... and perhaps it is due
>>> to PCIE ASPM.
>>
>> Why would that be the problem? The odds that the PCIE bus is the
>> issue seems strange to me.
>
> Aha: I guess that's why the file is not writable:
>
> pavel@amd:~$ dmesg | grep -i aspm
> ACPI FADT declares the system doesn't support PCIe ASPM, so disable it

IIRC, this message is somewhat misleading. When that FADT flag is set by the BIOS, the kernel doesn't so much disable ASPM as disable the kernel's control over ASPM. I believe this was to match Windows behavior.

> e1000e :02:00.0: Disabling ASPM L0s L1

And given that, I think this message may also be misleading, as the kernel won't touch the device's ASPM state. Force-enabling ASPM may actually be allowing the driver to disable ASPM on the device. I seem to recall a recent thread on this about another device.. maybe we need to allow drivers to explicitly disable ASPM if it's enabled even if the FADT flag is set?

> pavel@amd:~$ cat /sys/module/pcie_aspm/parameters/policy
> [default] performance powersave
> pavel@amd:~$
> root@amd:~# echo -n performance > /sys/module/pcie_aspm/parameters/policy
> -su: echo: write error: Operation not permitted
> root@amd:~#
>
> But:
>
> 1) it should not list unavailable options
>
> 2) operation not permitted seems like wrong error code for operation
> not supported.
>
> Pavel
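The policy file shown above lists every known policy and marks the active one with square brackets. A small sketch of how a script might read that format follows; parse_aspm_policy is a hypothetical helper, and the sample string is simply the output quoted in the message, not a claim about what any particular kernel prints.

```python
def parse_aspm_policy(text):
    """Split policy file content into (active_policy, all_policies).

    The active policy is the entry wrapped in square brackets. If no
    entry is bracketed, active_policy is None.
    """
    active = None
    names = []
    for entry in text.split():
        if entry.startswith("[") and entry.endswith("]"):
            entry = entry[1:-1]
            active = entry
        names.append(entry)
    return active, names


active, names = parse_aspm_policy("[default] performance powersave")
print("active:", active)
print("listed:", ", ".join(names))
```

Note that, as the thread points out, a policy being listed does not mean a write will be accepted: the "Operation not permitted" error appears even though "performance" is in the list.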
Suspend-to-disk issue with identifying swap partition
I recently ran into a problem with suspend to disk on Fedora 19, which I reported here: https://bugzilla.redhat.com/show_bug.cgi?id=981841 In this case swap and /home are encrypted volumes.

Essentially (from what I understand, correct me if I'm wrong) what happens is this: when dracut boots up, it unlocks the encrypted swap and writes the major/minor number of the swap partition to /sys/power/resume to try to resume from it. That fails as there's no hibernate image present, but the kernel still stashes away the major/minor number of the device in swsusp_resume_device (see resume_store in kernel/power/hibernate.c). For whatever reason those dm-crypt mappings get torn down after dracut finishes and recreated afterwards. As it turned out, because of the order of the LUKS entries on the kernel command line versus the order of the lines in /etc/fstab, the mappings were being recreated in the opposite order during the main boot sequence. As a result, the major/minor pair stored in swsusp_resume_device now pointed at the /home partition instead of the swap partition. When you go to hibernate, it fails as obviously that device isn't a swap partition.

It seems to me that it's not a great idea to stash away major/minor numbers at attempted resume and try to use them later on. There's no guarantee that they will still point at the same device or even exist at all. It appears that if the resume device was never explicitly set at hibernate time, then the kernel will just pick a usable swap partition to hibernate to, but once userspace has set a resume device, there's no way to get the kernel to forget about that device and just auto-detect at hibernate time again. And if that device no longer exists or isn't a swap device anymore, it seems like you're pretty much screwed. Any thoughts?
KVM VM shutdown triggers BUG from network bridge code in 3.9.9
I've run into a problem after updating to Fedora 19 where if I shut down a Windows 7 KVM virtual machine, the machine hits a kernel panic. There are a few reports of this on 3.9.8 and 3.9.9 kernels here: https://bugzilla.redhat.com/show_bug.cgi?id=981437

The panic is "kernel BUG at kernel/timer.c:729!" and the stack traces all seem basically the same, something like this one (captured with kdump):

#7 [880214d25c10] mod_timer+501 at 8106d905
#8 [880214d25c50] br_multicast_del_pg.isra.20+261 at a0731d25 [bridge]
#9 [880214d25c80] br_multicast_disable_port+88 at a0732948 [bridge]
#10 [880214d25cb0] br_stp_disable_port+154 at a072bcca [bridge]
#11 [880214d25ce8] br_device_event+520 at a072a4e8 [bridge]
#12 [880214d25d18] notifier_call_chain+76 at 8164aafc
#13 [880214d25d50] raw_notifier_call_chain+22 at 810858f6
#14 [880214d25d60] call_netdevice_notifiers+45 at 81536aad
#15 [880214d25d80] dev_close_many+183 at 81536d17
#16 [880214d25dc0] rollback_registered_many+168 at 81537f68
#17 [880214d25de8] rollback_registered+49 at 81538101
#18 [880214d25e10] unregister_netdevice_queue+72 at 815390d8
#19 [880214d25e30] __tun_detach+272 at a074c2f0 [tun]
#20 [880214d25e88] tun_chr_close+45 at a074c4bd [tun]
#21 [880214d25ea8] __fput+225 at 8119b1f1
#22 [880214d25ef0] fput+14 at 8119b3fe
#23 [880214d25f00] task_work_run+159 at 8107cf7f
#24 [880214d25f30] do_notify_resume+97 at 810139e1
#25 [880214d25f50] int_signal+18 at 8164f292

It seems like the error is being triggered by the virtual network interface being torn down, though I have no idea why (from all reports so far) it only happens when shutting down a Windows 7 VM, or why this didn't happen in Fedora 18 (something to do with older kvm/qemu/libvirt perhaps..)
Re: Abysmal HDD/USB write speed after sleep on a UEFI system
On Tue, May 7, 2013 at 9:59 AM, Artem S. Tashkinov wrote:
> May 7, 2013 09:25:40 PM, Bjorn Helgaas wrote:
>> [+cc Phillip]
>>
>>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>>> is likely the best hint. Likely Windows is detecting the problem and fixing
>>> it up on resume, thus it only complains about "reduced resume performance".
>>> If the MTRRs are messed up, then quite likely parts of RAM have become
>>> uncacheable, causing performance to get randomly slaughtered in various
>>> ways.
>>>
>>> From looking at the code it's not clear if we are checking/restoring the
>>> MTRR contents after resume. If not, maybe we should be.
>>
>> I agree; the MTRR warning is a good hint. Artem?
>>
>> Phillip, I cc'd you because you have similar hardware and your
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>> slightly similar. Have you seen anything like this "reduced
>> performance after resume" issue? If so, can you collect /proc/mtrr
>> contents before and after suspending?
>>
>
> Like Robert Hancock correctly noted the Linux kernel lacks the code to check
> for MTRR changes after resume - I'm not a kernel hacker to write such a code
> ;-)
>
> Likewise there's no code to see if RAM pages have become uncacheable - i.e
> I've no idea how to check it either.
>
> According to /proc/mtrr nothing changes on resume - only Windows detects
> the discrepancy between MTRR regions on resume. dmesg contains no warnings
> or errors (aside from usual ACPI SATA warnings - but they happen right on
> boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
> exhibits a similar performance degradation).

I'm not sure if reading /proc/mtrr actually reads the registers out of the CPU each time, or whether we just return the cached values we read out during initial boot-up. If the latter, then this output isn't really useful as there's no guarantee the values are still intact.

>
> In short, there's little to nothing that I can check.
>
> That bug report has nothing to do with my problem - my PC suspends and
> resumes more or less correctly - everything works (albeit some parts don't
> work as they should). That person also has a very outdated BIOS - 1904 from
> 08/15/2011. I wouldn't be surprised if BIOS update solved his problem.
>
> Best regards,
>
> Artem
Re: device tree not the answer in the ARM world [was: Re: running Debian on a Cubieboard]
On 05/05/2013 06:27 AM, Luke Kenneth Casson Leighton wrote: this message came up on debian-arm and i figured that it is worthwhile endeavouring to get across to people why device tree cannot and will not ever be the solution it was believed to be, in the ARM world. [just a quick note to david who asked this question on the debian-arm mailing list: any chance you could use replies with plaintext in future? converting from HTML to text proved rather awkward and burdensome, requiring considerable editing. the generally-accepted formatting rules for international technical mailing lists are plaintext only and 7-bit characters] On Sun, May 5, 2013 at 11:14 AM, David Goodenough wrote: On Sunday 05 May 2013, Luke Kenneth Casson Leighton wrote: And I have a question: as the Debian installer takes the arch armhf in charge, do you think a standard install' from a netboot image will work ? this has been on my list for a lng time. as with *all* debian installer images however you are hampered by the fact that there is no BIOS - at all - on ARM devices - and therefore it is impossible to have a "one size fits all" debian installer. I wonder if the device tree is the answer here. If the box comes with a DT or one is available on the web then the installer could read it and know what to install. That and the armmp kernel should solve the problem. you'd think so, and it's a very good question, to which the answer could have been and was predicted to be "not a snowbal in hell's chance", even before work started on device tree, and turns out to *be* "not a snowball in hell's chance" which i believe people are now beginning to learn, based on the ultra-low adoption rate of device tree in the ARM world (side-note: [*0]). in the past, i've written at some length as to why this is the case, however the weighting given to my opinions on linux kernel strategic decision-making is negligeable, so as one associate of mine once wisely said, "you just gotta let the train wreck happen". 
device tree was designed to take the burden off of the linux kernel due to proliferation of platform-specific hard-coding of support for peripherals. however it was designed ***WITHOUT*** its advocates having a full grasp of the sheer overwhelming diversity of the platforms. specifically i am referring to linus torvald's complete lack of understanding of the ARM linux kernel world, as his primary experience is with x86. in his mind, and the minds of those people who do not understand how ARM-based boxes are built and linux brought up on them, *surely* it cannot be all that complicated, *surely* it cannot be as bad as it is, right? what they're completely missing is the following: * the x86 world resolves around standards such as ACPI, BIOSes and general-purpose dynamic buses. * ACPI normalises every single piece of hardware from the perspective of most low-level peripherals. * the BIOS also helps in that normalisation. DOS INT33 is the classic one i remember. * the general-purpose dynamic buses include: - USB and its speed variants (self-describing peripherals) - PCI and its derivatives (self-describing peripherals) - SATA and its speed variants (self-describing peripherals) exceptions to the above include i2c (unusual, and taken care of by i2c-sensors, which uses good heuristics to "probe" devices from userspace) and the ISA bus and its derivatives such as Compact Flash and IDE. even PCMCIA got sufficient advances to auto-identify devices from userspace at runtime. so as a general rule, supporting a new x86-based piece of hardware is a piece of piss. get datasheet or reverse-engineer, drop it in, it's got BIOS, ACPI, USB, PCIe, SATA, wow big deal, job done. also as a general rule, hardware that conforms to x86-motherboard-like layouts such as the various powerpc architectures are along the same lines. so here, device tree is a real easy thing to add, and to some extent a "nice-to-have". i.e. 
it's not really essential to have device tree on top of something where 99% of the peripherals can describe themselves dynamically over their bus architecture when they're plugged in! now let's look at the ARM world. * is there a BIOS? no. so all the boot-up procedures including ultra-low-level stuff like DDR3 RAM timings initialisation, which is normally the job of the BIOS - must be taken care of BY YOU (usually in u-boot) and it must be done SPECIFICALLY CUSTOMISED EACH AND EVERY SINGLE TIME FOR EVERY SINGLE SPECIFIC HARDWARE COMBINATION. * is there ACPI present? no. so anything related to power management, fans (if there are any), temperature detection (if there is any), all of that must be taken care of BY YOU. * what about the devices? here's where it becomes absolute hell on earth as far as attempting to "streamline" the linux kernel into a "one size fits all" monolithic package. the classic example i give here is the HTC Universal, which was a device that, after 3 years of ded
Re: Abysmal HDD/USB write speed after sleep on a UEFI system
On 04/29/2013 10:47 PM, Bjorn Helgaas wrote: On Sat, Apr 27, 2013 at 4:10 AM, Artem S. Tashkinov wrote: Did this problem ever get resolved? Hello, Unfortunately, no. Out of curiosity I've tried booting kernel 3.9-rc8 in EUFI mode but it exhibits the same problem. Right after the boot: [root@localhost ~]# dd if=/dev/zero of=test bs=64M count=3 3+0 records in 3+0 records out 201326592 bytes (201 MB) copied, 1.08544 s, 185 MB/s After suspend/resume: # dd if=/dev/zero of=test bs=64M count=3 3+0 records in 3+0 records out 201326592 bytes (201 MB) copied, 66.5392 s, 3.0 MB/s That's for my primary SATA-3 HDD. Forgive me my impudence but I believe debugging the USB stack is tangential to this problem. Something far deeper than USB support breaks, but so far no one has come even with the slightest clue of what that might be. I tend to agree that it sounds like something deeper than USB is broken. I admit I'm just grasping at straws because I don't have any good ideas yet. Here are three easy things you can try: 1) Collect "lspci -vvv -" output before and after the suspend/resume to investigate the XHCI Unsupported Request errors. 2) Collect the contents of /proc/mtrr before and after the suspend/resume. 3) After the suspend/resume, try the "setpci" to set the MSI address back to the original value to see if it makes a difference (see my Feb 12 message). I would suspect that Windows' complaint about the BIOS mucking up the MTRRs is likely the best hint. Likely Windows is detecting the problem and fixing it up on resume, thus it only complains about "reduced resume performance". If the MTRRs are messed up, then quite likely parts of RAM have become uncacheable, causing performance to get randomly slaughtered in various ways. From looking at the code it's not clear if we are checking/restoring the MTRR contents after resume. If not, maybe we should be. 
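For suggestion 2) above, the before/after comparison of /proc/mtrr can be done mechanically once both snapshots have been saved as text. The sketch below is an illustration only: mtrr_changes is a hypothetical helper, the two sample snapshots are invented rather than taken from the affected machine, and the comparison is only meaningful if /proc/mtrr reflects the live register contents, which this thread does not establish.

```python
def mtrr_changes(before, after):
    """Compare two /proc/mtrr snapshots given as text.

    Returns (lost, gained): register lines present only in the first
    snapshot and lines present only in the second. Whitespace runs are
    collapsed so that column padding does not cause false differences.
    """
    def normalize(text):
        return {" ".join(line.split()) for line in text.splitlines() if line.strip()}

    b, a = normalize(before), normalize(after)
    return sorted(b - a), sorted(a - b)


# Invented sample data, for illustration only.
before = """reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
"""
after = """reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
"""

lost, gained = mtrr_changes(before, after)
for line in lost:
    print("- " + line)
for line in gained:
    print("+ " + line)
```

An empty result in both lists would mean the two snapshots describe the same set of ranges and types.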
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
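The /proc/mtrr comparison suggested in the message above can be done by eye, but a small script makes a changed or missing register hard to overlook. The following is only a sketch: the two snapshots are made-up sample data, not output from the reporter's machine, and the helper names parse_mtrr and diff_mtrr are hypothetical.

```python
import re

# One /proc/mtrr line looks like:
#   reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
MTRR_LINE = re.compile(
    r"reg(?P<reg>\d+): base=0x(?P<base>[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(?P<size>\d+)(?P<unit>[KM]B), count=\d+: (?P<type>[\w-]+)"
)

# Made-up snapshots for illustration only (not from the report above).
BEFORE = """\
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
"""
AFTER = """\
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
"""


def parse_mtrr(text):
    """Map register number -> (base address, size in KB, memory type)."""
    regs = {}
    for line in text.splitlines():
        m = MTRR_LINE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a register line
        size_kb = int(m["size"]) * (1024 if m["unit"] == "MB" else 1)
        regs[int(m["reg"])] = (int(m["base"], 16), size_kb, m["type"])
    return regs


def diff_mtrr(before, after):
    """Return {reg: (old, new)} for registers that differ; None means absent."""
    b, a = parse_mtrr(before), parse_mtrr(after)
    return {
        reg: (b.get(reg), a.get(reg))
        for reg in sorted(set(b) | set(a))
        if b.get(reg) != a.get(reg)
    }


if __name__ == "__main__":
    for reg, (old, new) in diff_mtrr(BEFORE, AFTER).items():
        print(f"reg{reg:02d}: {old} -> {new}")
```

On a real system the two strings would come from saving /proc/mtrr before suspend and again after resume. A register that disappears, as in the sample, would leave the range it covered with the default memory type, which fits the "parts of RAM have become uncacheable" theory.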
Re: [PATCH 2/3] PCI: Handle device quirks when accessing sysfs resource entries
On Fri, Mar 22, 2013 at 9:39 AM, Myron Stowe wrote: > On Thu, Mar 21, 2013 at 6:51 PM, Robert Hancock wrote: >> On 03/20/2013 10:35 PM, Myron Stowe wrote: >>> >>> Sysfs includes entries to memory regions that back a PCI device's BARs. >>> The pci-sysfs entries backing I/O Port BARs can be accessed by userspace, >>> providing direct access to the device's registers. File permissions >>> prevent random users from accessing the device's registers through these >>> files, but don't stop a privileged app that chooses to ignore the purpose >>> of these files from doing so. >>> >>> There are devices with abnormally strict restrictions with respect to >>> accessing their registers; aspects that are typically handled by the >>> device's driver. When these access restrictions are not followed - as >>> when a userspace app such as "udevadm info --attribute-walk >>> --path=/sys/..." parses though reading all the device's sysfs entries - it >>> can cause such devices to fail. >>> >>> This patch introduces a quirking mechanism that can be used to detect >>> accesses that do no meet the device's restrictions, letting a device >>> specific method intervene and decide how to progress. >>> >>> Reported-by: Xiangliang Yu >>> Signed-off-by: Myron Stowe >> >> >> I honestly don't think there's much point in even attempting this strategy. >> This list of devices in the quirk can't possibly be complete. It would >> likely be easier to enumerate a white-list of devices that can deal with >> their IO ports being read willy-nilly than a blacklist of those that don't, >> as there's likely countless devices that fall into this category. Even if >> they don't choke as badly as these ones do, it's quite likely that bad >> behavior will result. > > For the device in question, it seems abnormally restrictive in its > access limitations to BAR1 and BAR3. 
The device reserves 4 Bytes of > I/O Port space for these BARs, which is likely based on PCI's DWord > based protocol, but "chokes (we still have not received any specifics > on this yet)" on any access other than a single Byte access at offset > 0x2 (x86 supports 1, 2, and 4 Byte I/O Port accesses). This seems to > imply that the device did not back the other three reserved Bytes it > claims in any way, which again, seems peculiar to this particular > device as other similar devices tend to back the reserved bytes they > claim and return 0's when accessed. I don't think it's really that unusual. IO ports are weird, they're not necessarily emulating any kind of normal memory - doing a 32-bit access on IO port 0 isn't necessarily equivalent at all to doing separate 8-bit accesses on IO ports 0, 1, 2 and 3 for example. They can do completely different things. For legacy SFF ATA controllers, IIRC most of the IO ports only expect 8-bit accesses other than the PIO data register which can do 16 or 32-bit accesses (which just gives you a different amount of data when doing different size reads from the same location - it doesn't give you the data for the IO ports 2 or 3 bytes ahead of that when doing a 32-bit read, like you might expect if you were thinking of it like memory). So it's not too shocking that the designer of these Marvell devices didn't pay too much attention to what happens if you access the ports in an unexpected way. Most devices tend not to use IO ports any more in favor of MMIO, except for legacy devices or devices implementing a legacy interface (like the SFF ATA portion of these Marvell controllers). In those old setups, doing funny things with IO port space is more common. Also, the fact that IO port space tends to be somewhat precious leads to this sort of thing too. 
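The point about access width can be illustrated with a toy model. This is a sketch under loud assumptions: ToyAtaPorts is a hypothetical class, the port offsets and status value are invented for the example, and it models no real controller. It only shows how a wider read of a FIFO-backed data port returns more FIFO data rather than the contents of the neighbouring ports, and how an unexpected width can simply be unsupported.

```python
from collections import deque


class ToyAtaPorts:
    """Toy model of an SFF-style I/O port block (hypothetical, not a real chip)."""

    DATA = 0    # FIFO-backed data port (offset chosen for the example)
    STATUS = 7  # 8-bit status port (offset chosen for the example)

    def __init__(self, fifo_bytes):
        self.fifo = deque(fifo_bytes)
        self.status = 0x50  # arbitrary illustrative value

    def read(self, port, width):
        if port == self.DATA and width in (2, 4):
            # Same port, different width: a wider read drains more of the
            # FIFO; it does not return the bytes of ports 1, 2 and 3.
            data = bytes(self.fifo.popleft() for _ in range(width))
            return int.from_bytes(data, "little")
        if port == self.STATUS and width == 1:
            return self.status
        # A real device might hang or misbehave here instead of failing cleanly.
        raise ValueError(f"unsupported access: port {port}, width {width}")


if __name__ == "__main__":
    dev = ToyAtaPorts(b"\x11\x22\x33\x44\x55\x66")
    print(hex(dev.read(ToyAtaPorts.DATA, 2)))  # first two FIFO bytes
    print(hex(dev.read(ToyAtaPorts.DATA, 4)))  # next four FIFO bytes
```

A generic reader that walks a resource file with whatever width is convenient has no way to know which of these cases it is hitting.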
> > So in the case where two entities such as the devices driver and an > app like 'udevadm' are *not* simultaneously accessing it, so in effect > the device is idle with no device driver attached and a user app like > 'udevadm' accesses it: do you still contend that there are countless > devices that will not deal with their IO ports being read willy-nilly? Perhaps not "countless" but more "uncountable" in that there's no real way to tell which devices may have issues, only to know that it's quite possible that many devices may. > > The reason I ask is related to what I stated in the cover [PATCH 0/3] > - "If on the other hand, consensus is that we need userspace device > register access capabilities - say for UIO drivers or such - then, > depending on the tact taken, we'll need this solution,
Re: [PATCH 2/3] PCI: Handle device quirks when accessing sysfs resource entries
On 03/20/2013 10:35 PM, Myron Stowe wrote: Sysfs includes entries to memory regions that back a PCI device's BARs. The pci-sysfs entries backing I/O Port BARs can be accessed by userspace, providing direct access to the device's registers. File permissions prevent random users from accessing the device's registers through these files, but don't stop a privileged app that chooses to ignore the purpose of these files from doing so. There are devices with abnormally strict restrictions with respect to accessing their registers; aspects that are typically handled by the device's driver. When these access restrictions are not followed - as when a userspace app such as "udevadm info --attribute-walk --path=/sys/..." parses though reading all the device's sysfs entries - it can cause such devices to fail. This patch introduces a quirking mechanism that can be used to detect accesses that do no meet the device's restrictions, letting a device specific method intervene and decide how to progress. Reported-by: Xiangliang Yu Signed-off-by: Myron Stowe I honestly don't think there's much point in even attempting this strategy. This list of devices in the quirk can't possibly be complete. It would likely be easier to enumerate a white-list of devices that can deal with their IO ports being read willy-nilly than a blacklist of those that don't, as there's likely countless devices that fall into this category. Even if they don't choke as badly as these ones do, it's quite likely that bad behavior will result. I think there's a few things that need to be done: -Fix the bug in udevadm that caused it to trawl through these files willy-nilly, -Fix the kernel so that access through these files complies with the kernel's mechanisms for claiming IO/memory regions to prevent access conflicts (i.e. opening these files should claim the resource region they refer to, and should fail with EBUSY or something if another process or a kernel driver is using it). 
-Reconsider whether supporting read/write on the resource files for IO port regions like these makes any sense. Obviously mmap isn't very practical for IO port access on x86 but you could even do something like an ioctl for this purpose. Not very many pieces of software would need to access these files so it's likely OK if the API is a bit ugly. That would prevent something like grepping through sysfs from generating port accesses to random devices.
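The second suggestion in the list above (opening a resource file claims the region and fails with EBUSY if someone else holds it) amounts to an exclusive-claim registry. A minimal userspace sketch of that idea follows; RegionRegistry and the owner names are hypothetical, and this is only loosely analogous to how the kernel tracks claimed I/O and memory regions, not a description of its implementation.

```python
import errno


class RegionRegistry:
    """Toy exclusive-claim table for address ranges (hypothetical sketch)."""

    def __init__(self):
        self._claims = []  # (start, end, owner), end inclusive

    def claim(self, start, end, owner):
        """Claim [start, end]; raise OSError(EBUSY) if it overlaps a claim."""
        for s, e, o in self._claims:
            if start <= e and s <= end:
                raise OSError(errno.EBUSY, f"region busy (held by {o})")
        self._claims.append((start, end, owner))

    def release(self, owner):
        """Drop every claim held by owner."""
        self._claims = [c for c in self._claims if c[2] != owner]


if __name__ == "__main__":
    reg = RegionRegistry()
    reg.claim(0x1860, 0x1867, "driver")          # driver owns the region
    try:
        reg.claim(0x1864, 0x1864, "sysfs-open")  # a stray reader is refused
    except OSError as exc:
        print("open would fail:", errno.errorcode[exc.errno])
```

Under this scheme a grep through sysfs gets an error instead of touching hardware that a driver is actively using; the unclaimed-device case discussed elsewhere in the thread is not covered by it.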
Re: [PATCH] udevadm-info: Don't access sysfs 'resource' files
On Mon, Mar 18, 2013 at 8:35 PM, Greg KH wrote: > On Mon, Mar 18, 2013 at 08:09:22PM -0600, Robert Hancock wrote: >> > Great, that's one possible solution, the other is just not creating the >> > files at all for known problem devices, right? >> >> I don't think one can reasonably enumerate all problem devices. There >> are probably countless devices which can potentially break if their >> resources (especially IO ports) are read in unexpected ways. Aside >> from devices like this one, which apparently don't like certain IO >> ports being read with certain access widths, there's every device in >> existence with read-to-reset type registers. The fix to this needs to >> apply to all devices. >> >> > >> > My main point here is, you aren't going to fix this in userspace, fix it >> > in the kernel. >> >> The kernel can help the situation by blocking access to devices with >> an active driver, but it can't fix all cases. Suppose the device has >> no driver loaded yet, how is the kernel supposed to tell the >> difference between software with a legitimate need to access these >> files for virtualization device assignment, etc. and something like >> udevadm or a random grep command that's reading the files without any >> idea what it's doing? udevadm does need to be fixed to avoid accessing >> these files because it's unnecessary and dangerous. > > Are you going to also fix grep? bash? cat? > > Come on, be realistic. If these files are so dangerous then they need > to just be removed entirely from the kernel. You aren't going to be > able to patch grep for this. Well, clearly not. Although accessing this file with grep, etc. is really just another way root can shoot themselves in the foot, it would be nice if this functionality could be provided in a way that didn't leave this kind of exposed land mine. 
Re: [PATCH] udevadm-info: Don't access sysfs 'resource' files
On Mon, Mar 18, 2013 at 8:03 PM, Greg KH wrote: > On Mon, Mar 18, 2013 at 07:54:09PM -0600, Robert Hancock wrote: >> On 03/16/2013 07:03 PM, Greg KH wrote: >> >On Sat, Mar 16, 2013 at 05:50:53PM -0600, Myron Stowe wrote: >> >>On Sat, 2013-03-16 at 15:11 -0700, Greg KH wrote: >> >>>On Sat, Mar 16, 2013 at 03:35:19PM -0600, Myron Stowe wrote: >> >>>>Sysfs includes entries to memory that backs a PCI device's BARs, both I/O >> >>>>Port space and MMIO. This memory regions correspond to the device's >> >>>>internal status and control registers used to drive the device. >> >>>> >> >>>>Accessing these registers from userspace such as "udevadm info >> >>>>--attribute-walk --path=/sys/devices/..." does can not be allowed as >> >>>>such accesses outside of the driver, even just reading, can yield >> >>>>catastrophic consequences. >> >>>> >> >>>>Udevadm-info skips parsing a specific set of sysfs entries including >> >>>>'resource'. This patch extends the set to include the additional >> >>>>'resource' entries that correspond to a PCI device's BARs. >> >>> >> >>>Nice, are you also going to patch bash to prevent a user from reading >> >>>these sysfs files as well? :) >> >>> >> >>>And pciutils? >> >>> >> >>>You get my point here, right? The root user just asked to read all of >> >>>the data for this device, so why wouldn't you allow it? Just like >> >>>'lspci' does. Or bash does. >> >> lspci doesn't randomly attempt to access device registers, AFAIK.. > > Have you read the man page for the '-xxx' option to lspci? lspci can be > quite intrusive, and I used to have a number of systems that it would > trash very easily if you ran it on them as root. > >> >>Yes :P , you raise a very good point, there are a lot of way a user can >> >>poke around in those BARs. However, there is a difference between >> >>shooting yourself in the foot and getting what you deserve versus >> >>unknowingly executing a common command such as udevadm and having the >> >>system hang. 
>> >>> >> >>>If this hardware has a problem, then it needs to be fixed in the kernel, >> >>>not have random band-aids added to various userspace programs to paper >> >>>over the root problem here. Please fix the kernel driver and all should >> >>>be fine. No need to change udevadm. >> >> >> >>Xiangliang initially proposed a patch within the PCI core. Ignoring the >> >>specific issue with the proposal which I pointed out in the >> >>https://lkml.org/lkml/2013/3/7/242 thread, that just doesn't seem like >> >>the right place to effect a change either as PCI's core isn't concerned >> >>with the contents or access limitations of those regions, those are >> >>issues that the driver concerns itself with. >> >> >> >>So things seem to be gravitating towards the driver. I'm fairly >> >>ignorant of this area but as Robert succinctly pointed out in the >> >>originating thread - the AHCI driver only uses the device's MMIO region. >> >>The I/O related regions are for legacy SFF-compatible ATA ports and are >> >>not used to driver the device. This, coupled with the observance that >> >>userspace accesses such as udevadm, and others like you additionally >> >>point out, do not filter through the device's driver for seems to >> >>suggest that changes to the driver will not help here either. >> > >> >A PCI quirk should handle this properly, right? Why not do that? Worse >> >thing, the quirk could just not expose these sysfs files for this >> >device, which would solve all userspace program issues, right? >> >> A PCI quirk implies there is something wrong with this device in >> particular. This isn't the case. The device responds properly when >> it's accessed as intended. The problem is that udevadm (or other >> processes, like a random grep through sysfs for example) is >> effectively reading registers willy-nilly. 
This is absolutely not >> safe to do on many devices - and certainly not while a driver is >> attached to the device and has claimed the port or MMIO regions that >> are being accessed. > > Then we need to fix th
Re: [PATCH] udevadm-info: Don't access sysfs 'resource' files
On 03/16/2013 07:03 PM, Greg KH wrote: On Sat, Mar 16, 2013 at 05:50:53PM -0600, Myron Stowe wrote: On Sat, 2013-03-16 at 15:11 -0700, Greg KH wrote: On Sat, Mar 16, 2013 at 03:35:19PM -0600, Myron Stowe wrote: Sysfs includes entries to memory that backs a PCI device's BARs, both I/O Port space and MMIO. This memory regions correspond to the device's internal status and control registers used to drive the device. Accessing these registers from userspace such as "udevadm info --attribute-walk --path=/sys/devices/..." does can not be allowed as such accesses outside of the driver, even just reading, can yield catastrophic consequences. Udevadm-info skips parsing a specific set of sysfs entries including 'resource'. This patch extends the set to include the additional 'resource' entries that correspond to a PCI device's BARs. Nice, are you also going to patch bash to prevent a user from reading these sysfs files as well? :) And pciutils? You get my point here, right? The root user just asked to read all of the data for this device, so why wouldn't you allow it? Just like 'lspci' does. Or bash does. lspci doesn't randomly attempt to access device registers, AFAIK.. Yes :P , you raise a very good point, there are a lot of way a user can poke around in those BARs. However, there is a difference between shooting yourself in the foot and getting what you deserve versus unknowingly executing a common command such as udevadm and having the system hang. If this hardware has a problem, then it needs to be fixed in the kernel, not have random band-aids added to various userspace programs to paper over the root problem here. Please fix the kernel driver and all should be fine. No need to change udevadm. Xiangliang initially proposed a patch within the PCI core. 
Ignoring the specific issue with the proposal which I pointed out in the https://lkml.org/lkml/2013/3/7/242 thread, that just doesn't seem like the right place to effect a change either as PCI's core isn't concerned with the contents or access limitations of those regions; those are issues that the driver concerns itself with. So things seem to be gravitating towards the driver. I'm fairly ignorant of this area but as Robert succinctly pointed out in the originating thread - the AHCI driver only uses the device's MMIO region. The I/O related regions are for legacy SFF-compatible ATA ports and are not used to drive the device. This, coupled with the observation that userspace accesses such as udevadm, and others like you additionally point out, do not filter through the device's driver, seems to suggest that changes to the driver will not help here either. A PCI quirk should handle this properly, right? Why not do that? Worst case, the quirk could just not expose these sysfs files for this device, which would solve all userspace program issues, right? A PCI quirk implies there is something wrong with this device in particular. This isn't the case. The device responds properly when it's accessed as intended. The problem is that udevadm (or other processes, like a random grep through sysfs for example) is effectively reading registers willy-nilly. This is absolutely not safe to do on many devices - and certainly not while a driver is attached to the device and has claimed the port or MMIO regions that are being accessed. Blocking access through these files to a device with an active driver that's claimed the regions would significantly reduce the chances of something like this causing problems.
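The udevadm change under discussion boils down to a name filter applied while walking a device's attributes. As a rough sketch of that filter (not the actual udev code, which is written in C): the function name should_skip is hypothetical, and the exact-match set below contains only 'resource' because that is the one entry the thread confirms; the real skip list is longer.

```python
import re

# Only 'resource' is confirmed by the thread; the real list has more entries.
SKIP_EXACT = {"resource"}

# Per-BAR files in PCI sysfs are named resource0, resource1, ... and may
# have a write-combining variant such as resource2_wc.
BAR_RESOURCE = re.compile(r"resource\d+(_wc)?$")


def should_skip(attr_name):
    """True if an attribute walk should not read this sysfs entry."""
    return attr_name in SKIP_EXACT or BAR_RESOURCE.match(attr_name) is not None


if __name__ == "__main__":
    for name in ("vendor", "resource", "resource0", "resource2_wc"):
        print(f"{name}: {'skip' if should_skip(name) else 'read'}")
```

As the replies point out, this protects only the one tool that applies the filter; cat or grep run over the same directory would still read the BAR files.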
Re: [PATCH 2/2] PCI: fix system hang issue of Marvell SATA host controller
On 03/08/2013 09:18 PM, Myron Stowe wrote: On Thu, Mar 7, 2013 at 11:51 PM, Xiangliang Yu wrote: Hi, Bjorn Fix system hang issue: if first accessed resource file of BAR0 ~ BAR4, system will hang after executing lspci command This needs more explanation. We've already read the BARs by the time header quirks are run, so apparently it's not just the mere act of accessing a BAR that causes a hang. We need to know exactly what's going on here. For example, do BARs 0-4 exist? Does the device decode accesses to the regions described by the BARs? The PCI core has to know what resources the device uses, so if the device decodes accesses, we can't just throw away the start/end information. The BARs 0-4 is exist and the PCI device is enable IO space, but user access the regions file by udevadm command with info parameter, the system will hang. Like this: udevadmin info --attribut-walk --path=/sys/device/pci-device/000:*. Because the device is just AHCI host controller, don't need the BAR0 ~ 4 region file. Is my explanation ok for the patch? No, I still don't know what causes the hang; I only know that udevadm can trigger it. I don't want to just paper over the problem until we know what the root cause is. Does "lspci -H1 -vv" also cause a hang? What about "setpci -s BASE_ADDRESS_0"? "setpci -H1 -s BASE_ADDRESS_0"? The commands are ok because the commands can't find the device after accessing IO port. Xiangliang: Sorry but I didn't understand your response above, could you elaborate a little more? Are the first five BARs of the suspect device all mapping to I/O port space - i.e. 
similar to something like this (a capture and inclusion of an 'lspci' of the suspect device would be nice to see): 00:1f.2 SATA controller: Region 0: I/O ports at 1860 [size=8] Region 1: I/O ports at 1814 [size=4] Region 2: I/O ports at 1818 [size=8] Region 3: I/O ports at 1810 [size=4] Region 4: I/O ports at 1840 [size=32] Region 5: Memory at f2827000 (32-bit, non-prefetchable) [size=2K] You have done a good job isolating the issue so far. As Bjorn noted; it's looking as if the problem is with accessing the I/O port space mapped by the suspect device's BAR(s), not with accessing the BAR(s) in the device's configuration space. It would seem so. My question is what is accessing the IO port space in the first place. BAR5 is the MMIO region used by the AHCI driver. BARs 0-4 are the legacy SFF-compatible ATA ports. Nothing should be messing with those IO ports while AHCI is enabled. It's expected that doing that will break things. If something in udev is randomly groveling around inside the resource files for those BARs in sysfs, that seems like a really bad thing. As you responded positively to earlier, as proposed the suspect device will still actively be decoding accesses to the regions described by the BARs. Because the device is actively decoding the PCI core can't just throw away the BAR's corresponding resource regions, as the patch is currently doing, due to the possibility of another device being added at a later time. If a subsequent device were added later, the core may need to try and allocate resources for it and, in the worst case scenario, the core could end up allocating resources that conflict with this suspect device as a consequence of the suspect device's original resource allocations having been silently thrown away. The result would be both devices believing they each exclusively own the same set (or subset) of I/O port mappings and thus both actively decoding accesses to such which. A situation that would obviously be disastrous. 
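The conflict scenario described above can be made concrete with a toy first-fit allocator. This is a sketch, not the PCI core's allocation algorithm: first_fit and the I/O window are hypothetical, and the decoded ranges are taken from the example lspci listing quoted earlier in this message, not from the suspect device.

```python
def first_fit(size, window, in_use):
    """Return the lowest size-aligned base in window that avoids in_use.

    window and the in_use entries are (start, end) pairs, end inclusive.
    """
    lo, hi = window
    base = (lo + size - 1) // size * size  # round up to a multiple of size
    while base + size - 1 <= hi:
        if all(base + size - 1 < s or base > e for s, e in in_use):
            return base
        base += size
    return None


# Ranges the device keeps decoding (from the example listing above).
DECODED = [
    (0x1810, 0x1813), (0x1814, 0x1817), (0x1818, 0x181F),
    (0x1840, 0x185F), (0x1860, 0x1867),
]
WINDOW = (0x1810, 0x187F)  # hypothetical bridge I/O window

if __name__ == "__main__":
    # Core still tracks the ranges: the newcomer lands in free space.
    print(hex(first_fit(8, WINDOW, DECODED)))
    # Ranges zeroed by the quirk: the newcomer lands on a live BAR.
    print(hex(first_fit(8, WINDOW, [])))
```

With the ranges tracked the new device gets 0x1820; with them discarded it gets 0x1810, an address the first device still answers to, so both devices would decode the same ports.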
There is still something going on here that we still do not understand. Could you please capture the following information to help further isolate the issue: A 'dmesg' log from the system which was booted using both the "debug" and "ignore_loglevel" boot parameters, a 'lspci -xxx -s' capture, and a 'lspci -vv' capture. Thanks, Myron

The root cause is that accessing of IO port will make the chip go bad. So, the point of the patch is don't export capability of the IO accessing.

---
 drivers/pci/quirks.c | 15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0369fb6..d49f8dc 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -44,6 +44,21 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
 
+/* The BAR0 ~ BAR4 of Marvell 9125 device can't be accessed
+* by IO resource file, and need to skip the files
+*/
+static void quirk_marvell_mask_bar(struct pci_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < 5; i++)
+		if (dev->resource[i].start)
+			dev->resource[i].start =
+				dev->resource[i].end = 0;
+}
+
Re: [PATCH] block: delete super ancient PC-XT driver for 1980's hardware
On 01/04/2013 07:27 PM, Paul Gortmaker wrote: This driver was for the 8 bit ISA cards that were installed in the PC-XT machines of 1980 vintage. They supported the dual ribbon cable MFM drives of 10-20MB capacity, and ran at a 3:1 interleave, giving performance on the order of 128kB/s. By the introduction of the PC-AT (286) these controllers were already scrapped in favour of 16 bit controllers with some onboard RAM that could support a 1:1 interleave. The git history doesn't show any evidence of runtime fixes that would reflect active usage; instead just the usual tree-wide API type changes/cleanups. Going back to in-source changelogs, the last "runtime" fix that is evident is something I did over a dozen years ago[1] -- and even back then, the hardware was long since unavailable, so that ancient fix was also not runtime tested. The time is long overdue for this to get flushed, so lets get rid of it before anyone wastes more time doing builds and sparse checks etc. on long since dead code. Although this hardware is obviously long obsolete, it's conceivable that someone could still drag out an old MFM/RLL controller and run it on a non-completely-ancient PC with ISA slots in order to recover data from an old drive or something. Given that the code doesn't have wide-ranging effects beyond a couple of files, I'd lean towards keeping it unless there's some reason to believe it's hopelessly broken.
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/14/2012 03:32 PM, Don Dutile wrote: On 12/13/2012 04:50 AM, Jason Gao wrote: Dear List: Description of problem: After installed Centos 6.3(RHEL6.3) on my Dell R710(lastest bios:Version: 6.3.0,Release Date: 07/24/2012) server,and updated lastest kernel "2.6.32-279.14.1.el6.x86_64",I want to use the Intel 82576 ET Dual Port nic's SR-IOV feature,assigning VFs to kvm guest appended kernel boot parameter: intel_iommu=on,after boot with the following messages: Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2 Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000 Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102 Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000 Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set Dec 13 16:58:15 2 kernel: scsi 0:0:32:0: Enclosure DP BACKPLANE1.07 PQ: 0 ANSI: 5 Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202 Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000 Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set full dmesg detail: http://pastebin.com/BzFQV0jU lspci -vvv full detail: http://pastebin.com/9rP2d1br it's a production server,and I'm not sure if this is a critical problem,how to fix it,any help would be greatly appreciated. DMAR table does not have an entry for this device to this region. Once the driver reconfigs/resets the device to stop polling bios-boot cmd rings and use (new) OS (dma-mapped) rings, there's a period of time during this transition that the hw is babbling away to an area that is no longer mapped. Maybe some kind of boot PCI quirk is needed to stop the device DMA activity before enabling the IOMMU? 
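When triaging faults like the ones quoted above, it helps to pull the requesting device and fault address out of the log and count faults per device, since a single babbling device points at the boot-time DMA explanation. A minimal sketch, using the three fault lines from the report as sample input; parse_faults is a hypothetical helper and the regular expression only covers the message format shown in this thread.

```python
import re
from collections import Counter

# Matches e.g. "DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000"
FAULT = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device \[([0-9a-f:.]+)\] "
    r"fault addr ([0-9a-f]+)"
)

# The three fault lines from the report above.
LOG = """\
Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
"""


def parse_faults(log):
    """Return a list of (device, direction, fault address) tuples."""
    return [(m[2], m[1], int(m[3], 16)) for m in FAULT.finditer(log)]


if __name__ == "__main__":
    faults = parse_faults(LOG)
    for dev, count in Counter(dev for dev, _, _ in faults).items():
        print(f"{dev}: {count} fault(s)")
    for dev, direction, addr in faults:
        print(f"  {dev} {direction} {addr:#010x}")
```

Here all three faults are reads from device 03:00.0; running lspci -s 03:00.0 on the affected machine would identify which controller that is.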
Re: Fwd: Safely remove option shows with Micro SD Card connected to Linux through an Android phone
On 12/11/2012 02:37 PM, Alan Stern wrote: On Tue, 11 Dec 2012, prasannatsmkumar wrote: Hi All, I connected an Android phone using USB cable to my machine running Linux (Linux 3.0, 3.2, 3.5). Mounted the SD card in phone in system (phone is just a pass through I guess). When I choose "Safely Remove" option in nautilus file manager (gnome's default file manager) I got an error saying "Error detaching: helper exited with exit code 1: Detaching device /dev/sdb USB device: /sys/devices/pci:00/:00:1d.7/usb1/1-5) SYNCHRONIZE CACHE: OK STOP UNIT: FAILED: No such file or directory" STOP UNIT means spin down the disk or eject the disc. Since your phone doesn't have a disk drive or an optical disc, no wonder this step failed. The reason it's likely doing a STOP UNIT on USB storage devices is that this is preferable for at least USB-connected HDs (at least where the USB to SATA, etc. converter bothers to implement the translation). For many drives, it's better for the disk's lifespan to power it down normally (as it would be if it was in a machine that was being shut down) so it can unload its heads in a controlled fashion, rather than just cutting the power on the running disk and causing an emergency head retract. Some types of devices may not support that command or may not do anything useful with it, but "No such file or directory" seems a strange error to run into. and it goes to unmounted state (yes it should go to and this is not a problem). But I am not able to find the reason for the above error message pop-up. If I choose "Eject" option then things are fine (I think Eject does more than un-mounting the file system). I think "safely remove" tries to cut the power supply to the device but eject does not do that. Is that correct? No, neither option cuts power. The main difference is that "safely remove" disables the USB connection, so that if the device has an "okay to unplug now" light, the light will turn on. 
If the device cannot be powered down (due to battery charging) why this option is shown? Is kernel exposing such capability to the user space? I am not sure whether this is the correct place to ask this question. If this is not the correct place please direct me to correct place. You probably should get in touch with the people who maintain the Nautilus program if you want to know why it does something. Alan Stern
Re: A vague, murky topic of "Buffer I/O error on device sdb6, logical block NNNNNNNNN" and a ext4/VFS oops
On 11/29/2012 01:27 PM, Artem S. Tashkinov wrote: Hello, When I was copying a lot of information (tens of gigabytes) from my primary HDD to a secondary HDD I got gazillions of errors like these ones: [19568.964762] EXT4-fs warning (device sdb6): ext4_end_bio:250: I/O error writing to inode 6029369 (offset 8036352 size 524288 starting block 51946549) [19568.964767] sd 2:0:0:0: [sdb] [19568.964768] Result: hostbyte=0x00 driverbyte=0x08 [19568.964770] sd 2:0:0:0: [sdb] [19568.964771] Sense Key : 0xb [current] [descriptor] [19568.964774] Descriptor sense data with sense descriptors (in hex): [19568.964775] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 [19568.964784] 00 00 00 00 [19568.964788] sd 2:0:0:0: [sdb] [19568.964789] ASC=0x0 ASCQ=0x0 [19568.964791] sd 2:0:0:0: [sdb] CDB: [19568.964792] cdb[0]=0x2a: 2a 00 18 c5 25 a8 00 00 70 00 [19568.964804] Buffer I/O error on device sdb6, logical block 13727786 [19568.964806] Buffer I/O error on device sdb6, logical block 13727787 [19568.964808] Buffer I/O error on device sdb6, logical block 13727788 [19568.964810] Buffer I/O error on device sdb6, logical block 13727789 [19568.964812] Buffer I/O error on device sdb6, logical block 13727790 along with: [19568.964832] EXT4-fs warning (device sdb6): ext4_end_bio:250: I/O error writing to inode 6029369 (offset 8560640 size 57344 starting block 51946677) [19568.964843] ata3: EH complete [19624.635176] ata3.00: exception Emask 0x0 SAct 0x3fff SErr 0x4 action 0x6 frozen [19624.635181] ata3: SError: { CommWake } This is likely the real problem - the controller saw a CommWake during operation, which likely means the SATA link bounced for some reason. Could be a bad cable, a power issue, or some other hardware problem. The rest is likely all fallout from that (except from those _GTF errors which are likely due to a somewhat broken BIOS). 
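The failed command in the log above can be read directly from the CDB bytes. The sketch below decodes a WRITE(10) CDB (opcode 0x2a, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); decode_write10 is a hypothetical helper, and the 512-byte logical sector size is an assumption, since the log does not state it.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def decode_write10(cdb_hex):
    """Return (lba, block_count) from a WRITE(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return lba, blocks


if __name__ == "__main__":
    # CDB bytes copied from the log above.
    lba, blocks = decode_write10("2a 00 18 c5 25 a8 00 00 70 00")
    print(f"LBA {lba}, {blocks} blocks, {blocks * SECTOR_BYTES} bytes")
```

Under that assumption the command is a 112-sector (57344-byte) write at LBA 415573416. Those numbers line up with the second EXT4 warning in the log (size 57344, starting block 51946677, and 51946677 * 8 = 415573416 if that block number counts 4 KiB units), which suggests the two messages describe the same failed write.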
[19624.635184] ata3.00: failed command: WRITE FPDMA QUEUED [19624.635190] ata3.00: cmd 61/00:00:48:ee:cb/04:00:18:00:00/40 tag 0 ncq 524288 out [19624.635190] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [19624.635193] ata3.00: status: { DRDY } [19624.635196] ata3.00: failed command: WRITE FPDMA QUEUED [19624.635201] ata3.00: cmd 61/08:08:f0:65:bd/00:00:1d:00:00/40 tag 1 ncq 4096 out [19624.635201] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [19624.635203] ata3.00: status: { DRDY } [19624.635206] ata3.00: failed command: WRITE FPDMA QUEUED [19624.635211] ata3.00: cmd 61/00:10:48:f2:cb/04:00:18:00:00/40 tag 2 ncq 524288 out [19624.635211] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [19624.635213] ata3.00: status: { DRDY } [19624.635215] ata3.00: failed command: WRITE FPDMA QUEUED [19624.635220] ata3.00: cmd 61/00:18:48:f6:cb/04:00:18:00:00/40 tag 3 ncq 524288 out [19624.635220] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [19624.635223] ata3.00: status: { DRDY } [19624.635225] ata3.00: failed command: WRITE FPDMA QUEUED along with: [19624.635320] ata3: hard resetting link [19624.954880] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [19624.956101] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20120711/psargs-359) [19624.956109] ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT0.SPT2._GTF] (Node ef0307b0), AE_NOT_FOUND (20120711/psparse-536) [19624.958006] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20120711/psargs-359) [19624.958011] ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT0.SPT2._GTF] (Node ef0307b0), AE_NOT_FOUND (20120711/psparse-536) [19624.958366] ata3.00: configured for UDMA/133 [19624.960763] ata3.00: device reported invalid CHS sector 0 [19624.960765] ata3.00: device reported invalid CHS sector 0 [19624.960767] ata3.00: device reported invalid CHS sector 0 [19624.960769] ata3.00: device reported invalid CHS sector 0 [19624.960771] 
ata3.00: device reported invalid CHS sector 0 [19624.960773] ata3.00: device reported invalid CHS sector 0 [19624.960775] ata3.00: device reported invalid CHS sector 0 [19624.960777] ata3.00: device reported invalid CHS sector 0 [19624.960779] ata3.00: device reported invalid CHS sector 0 [19624.960781] ata3.00: device reported invalid CHS sector 0 [19624.960782] ata3.00: device reported invalid CHS sector 0 [19624.960784] ata3.00: device reported invalid CHS sector 0 [19624.960786] ata3.00: device reported invalid CHS sector 0 [19624.960788] ata3.00: device reported invalid CHS sector 0 and also this: [19624.961128] Buffer I/O error on device sdb6, logical block 13783485 [19624.961132] EXT4-fs warning (device sdb6): ext4_end_bio:250: I/O error writing to inode 6029369 (offset 236183552 size 524288 starting block 52002249) [19624.961142] sd 2:0:0:0: [sdb] [19624.961144] Result: hostbyte=0x00 driverbyte=0x08 [19624.961146] sd 2:0:0:0: [sdb] [19624.961147] Sense Key : 0xb [current] [descriptor] [19624.961149] Descriptor sense data with sense descri
Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
On Thu, Nov 29, 2012 at 12:16 PM, Bjorn Helgaas wrote: > On Thu, Nov 29, 2012 at 1:55 AM, Justin Piszcz > wrote: >> >> >> -Original Message- >> From: Robert Hancock [mailto:hancock...@gmail.com] >> Sent: Wednesday, November 28, 2012 7:55 PM >> To: Justin Piszcz >> Cc: Bjorn Helgaas; Bruno Prémont; supp...@supermicro.com; >> linux-kernel@vger.kernel.org; Dan Williams >> Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware >> bug question >> >> On Wed, Nov 28, 2012 at 6:49 PM, Justin Piszcz >> wrote: >>> >>> >>> -Original Message- >>> From: Robert Hancock [mailto:hancock...@gmail.com] >>> Sent: Wednesday, November 28, 2012 7:35 PM >>> To: Justin Piszcz >>> Cc: 'Bjorn Helgaas'; 'Bruno Prémont'; supp...@supermicro.com; >>> linux-kernel@vger.kernel.org; 'Dan Williams' >>> Subject: Re: Supermicro X9SRL-F - channel enumeration error & >> ACPI/firmware >>> bug question >>> >>> >>> What does lspci -vv show on that controller? Not sure what actual >>> chipset that controller is, but there's a known issue with some Marvell >>> 6Gbps SATA controllers with DMAR enabled - it seems the device issues >>> memory read/write requests from the wrong PCI function ID and the IOMMU >>> rightly denies access as the function listed in the requests doesn't >>> have any mapping to that memory. I don't think there's presently a >>> workaround other than disabling DMAR. We could (and likely should) be >>> detecting that device and adding some kind of quirk for it. >>> >>> That sounds likely... >>> It is shown below: >>> >>> Card name: HighPoint Rocket 620 Dual Port SATA 6 Gbps PCI Express 2.0 Host >>> Adapter >>> >>> lspci -vv output: >>> >>> 84:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA >>> 6.0 Gb/s controller (rev 11) (prog-if 01 [AHCI 1.0]) >>> Subsystem: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s >>> controller >> >> Yeah, that's one of those controllers I think. 
But I can't tell from >> the bit of the dmesg you posted exactly what's going on. Can you post >> a full boot log from having the card installed and some drive attached >> (by putting the boot drive on another controller for example)? >> >>>> ==> Further issues with the X9SRL-F -- does this board support ASPM or is >>>> this a Linux/ASPM implementation issue? >>>> [0.632170] pci:ff: ACPI _OSC support notification failed, >>> disabling >>>> PCIe ASPM >>>> [0.632239] pci:ff: Unable to request _OSC control (_OSC support >>>> mask: 0x08) >>> >>> What's the full dmesg from this machine (or is it already posted >> somewhere)? >>> >>> It is now available here: >>> http://home.comcast.net/~jpiszcz/20121128/dmesg.txt >> >>> Is that the same boot log? It doesn't have this error in it. >> >> Yes, the error is here: (its towards the bottom) >> >> [7.973015] ata14.00: qc timeout (cmd 0xa1) >> [8.472120] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4) >> [9.275922] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300) >> [ 19.260667] ata14.00: qc timeout (cmd 0xa1) >> [ 19.759828] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4) >> [ 19.760451] ata14: limiting SATA link speed to 1.5 Gbps >> [ 20.566598] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310) >> [ 50.521078] ata14.00: qc timeout (cmd 0xa1) >> [ 51.020880] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4) >> [ 51.824664] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310) >> [ 51.824682] dmar: DRHD: handling fault status reg 502 >> [ 51.824686] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0 >> [ 51.824686] DMAR:[fault reason 06] PTE Read access is not set > > You have these devices: > > pci :04:00.0: [10de:01d3] type 00 class 0x03 nVidia G72 > pci :84:00.0: [1b4b:9123] type 00 class 0x010601 Marvell 88SE9123 SATA > pci :84:00.1: [1b4b:91a4] type 00 class 0x01018f Marvell 88SE9128 IDE > > I think the 04:00.0 DMAR errors are symptoms of nouveau driver issues, > and if you get rid 
of that driver, they'll probably go away. > > But this 84:00.1 DMAR error: > > dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr fff0 > DMAR
Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
On Wed, Nov 28, 2012 at 6:49 PM, Justin Piszcz wrote:
>
>
> -Original Message-
> From: Robert Hancock [mailto:hancock...@gmail.com]
> Sent: Wednesday, November 28, 2012 7:35 PM
> To: Justin Piszcz
> Cc: 'Bjorn Helgaas'; 'Bruno Prémont'; supp...@supermicro.com;
> linux-kernel@vger.kernel.org; 'Dan Williams'
> Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware
> bug question
>
>
> What does lspci -vv show on that controller? Not sure what actual
> chipset that controller is, but there's a known issue with some Marvell
> 6Gbps SATA controllers with DMAR enabled - it seems the device issues
> memory read/write requests from the wrong PCI function ID and the IOMMU
> rightly denies access as the function listed in the requests doesn't
> have any mapping to that memory. I don't think there's presently a
> workaround other than disabling DMAR. We could (and likely should) be
> detecting that device and adding some kind of quirk for it.
>
> That sounds likely...
> It is shown below:
>
> Card name: HighPoint Rocket 620 Dual Port SATA 6 Gbps PCI Express 2.0 Host
> Adapter
>
> lspci -vv output:
>
> 84:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA
> 6.0 Gb/s controller (rev 11) (prog-if 01 [AHCI 1.0])
> Subsystem: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s
> controller

Yeah, that's one of those controllers I think. But I can't tell from the bit of the dmesg you posted exactly what's going on. Can you post a full boot log from having the card installed and some drive attached (by putting the boot drive on another controller for example)?

>> ==> Further issues with the X9SRL-F -- does this board support ASPM or is
>> this a Linux/ASPM implementation issue?
>> [0.632170] pci:ff: ACPI _OSC support notification failed,
> disabling
>> PCIe ASPM
>> [0.632239] pci:ff: Unable to request _OSC control (_OSC support
>> mask: 0x08)
>
> What's the full dmesg from this machine (or is it already posted somewhere)?
>
> It is now available here:
> http://home.comcast.net/~jpiszcz/20121128/dmesg.txt

Is that the same boot log? It doesn't have this error in it.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
On 11/27/2012 07:49 AM, Justin Piszcz wrote:

It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on?

===> FOR I/OAT DMA

Latest status, it _appears_ its working on the X9SRL-F now, thank you!

1) Supermicro X9SRL-F (GOOD)

[0.738510] ioatdma: Intel(R) QuickData Technology Driver 4.00
[0.738719] ioatdma :00:04.0: irq 75 for MSI/MSI-X
[0.739088] ioatdma :00:04.1: irq 76 for MSI/MSI-X
[0.739408] ioatdma :00:04.2: irq 77 for MSI/MSI-X
[0.739739] ioatdma :00:04.3: irq 78 for MSI/MSI-X
[0.740040] ioatdma :00:04.4: irq 79 for MSI/MSI-X
[0.740342] ioatdma :00:04.5: irq 80 for MSI/MSI-X
[0.740670] ioatdma :00:04.6: irq 81 for MSI/MSI-X
[0.740971] ioatdma :00:04.7: irq 82 for MSI/MSI-X

It is _not_ working on the:

2) Supermicro X8DTH-F (the boot drive in this system is running off a PCI-e card, could the IRQ for the I/O controller be getting re-mapped and fail?)-- worse case I can move the SSD from the 6.0gbpa SATA card to the motherboard and see if that works, but that kind of defeats the purpose of a 6.0gbps SATA SSD.

(Fails to talk to the SSD)
http://home.comcast.net/~jpiszcz/20121127/photo1-resize.jpg

(then, a few moments later: Kernel panic)
http://home.comcast.net/~jpiszcz/20121127/photo2-resize.jpg

Would be curious if anyone had any suggestions besides removing the controller card?

What does lspci -vv show on that controller? Not sure what actual chipset that controller is, but there's a known issue with some Marvell 6Gbps SATA controllers with DMAR enabled - it seems the device issues memory read/write requests from the wrong PCI function ID and the IOMMU rightly denies access as the function listed in the requests doesn't have any mapping to that memory. I don't think there's presently a workaround other than disabling DMAR. We could (and likely should) be detecting that device and adding some kind of quirk for it.

--

==> Further issues with the X9SRL-F -- does this board support ASPM or is this a Linux/ASPM implementation issue?

[0.632170] pci:ff: ACPI _OSC support notification failed, disabling PCIe ASPM
[0.632239] pci:ff: Unable to request _OSC control (_OSC support mask: 0x08)

What's the full dmesg from this machine (or is it already posted somewhere)?
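
For anyone trying to spot the wrong-function symptom described above in their own logs, here is a rough sketch. It assumes fault lines of the form quoted elsewhere in this thread ("dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr ..."); the helper names are made up for illustration and nothing here is a kernel interface. It only flags faults whose requester has the same bus and device as the function the driver is bound to, but a different function number.

```python
import re

# Pattern for DMAR fault lines as they appear in the dmesg quoted in this
# thread. The exact wording may differ between kernel versions (assumption).
FAULT_RE = re.compile(
    r"DMAR:\[DMA (?P<op>Read|Write)\] Request device "
    r"\[(?P<bus>[0-9a-fA-F]{2}):(?P<dev>[0-9a-fA-F]{2})\.(?P<fn>[0-7])\]"
)


def parse_dmar_fault(line):
    """Return (op, bus, dev, fn) for a DMAR fault line, or None if no match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    return (m.group("op"), int(m.group("bus"), 16),
            int(m.group("dev"), 16), int(m.group("fn")))


def wrong_function(line, bound_bdf):
    """True if the fault comes from the same bus/device as bound_bdf but
    from a different function number (hypothetical helper)."""
    fault = parse_dmar_fault(line)
    if fault is None:
        return False
    _, bus, dev, fn = fault
    b_bus, b_dev, b_fn = bound_bdf
    return (bus, dev) == (b_bus, b_dev) and fn != b_fn


if __name__ == "__main__":
    sample = "dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr fff0"
    # The AHCI function in the quoted lspci output is 84:00.0.
    print(wrong_function(sample, (0x84, 0x00, 0)))
```

A fault from a completely different device (such as the 04:00.0 ones in this thread) is not flagged, since that is a separate problem.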
Re: 3.6.8: dmar: DRHD: handling fault status reg 602
On 11/27/2012 09:16 AM, Justin Piszcz wrote:

Hello,

Any idea why this is happening (e.g. why is PTE Read Access not set?)

[ 13.204560] dmar: DRHD: handling fault status reg 602
[ 13.208078] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[ 13.208078] DMAR:[fault reason 06] PTE Read access is not set
[ 15.777874] dmar: DRHD: handling fault status reg 702
[ 15.777879] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[ 15.777879] DMAR:[fault reason 06] PTE Read access is not set
[ 16.100453] dmar: DRHD: handling fault status reg 2
[ 16.100458] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[ 16.100458] DMAR:[fault reason 06] PTE Read access is not set
[ 16.141058] dmar: DRHD: handling fault status reg 102
[ 16.141062] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[ 16.141062] DMAR:[fault reason 06] PTE Read access is not set
[ 16.210102] dmar: DRHD: handling fault status reg 202
[ 16.210111] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[ 16.210111] DMAR:[fault reason 06] PTE Read access is not set
[ 16.918149] ixgbe :86:00.0: eth2: NIC Link is Up 10 Gbps, Flow Control: RX/TX

This is from: http://lkml.org/lkml/2012/11/27/263

Justin.

From the dmesg you posted (and some comments on that thread) it might have something to do with CONFIG_PCI_MMCONFIG being disabled. If so, try enabling that. Of course the DMAR stuff should be recovering from that more gracefully if that's the problem.
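
A quick way to confirm whether CONFIG_PCI_MMCONFIG is actually set is to look at the kernel config text. The sketch below only parses config text that you hand it; the function name is invented for illustration. Where the text comes from is up to you: commonly /boot/config-$(uname -r), or /proc/config.gz if the kernel was built with that support.

```python
def kconfig_state(config_text, option):
    """Return the value of a Kconfig option from .config-style text.

    Returns 'y', 'm' or another value string when the option is set, and
    'n' when it is marked '# ... is not set' or does not appear at all.
    (Hypothetical helper, not part of any kernel tooling.)
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return "n"


if __name__ == "__main__":
    # Made-up sample config text, for illustration only.
    sample = "CONFIG_PCI=y\n# CONFIG_PCI_MMCONFIG is not set\n"
    print(kconfig_state(sample, "CONFIG_PCI_MMCONFIG"))
```

If this reports "n" for CONFIG_PCI_MMCONFIG, rebuilding with the option enabled would be the first thing to try.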
Re: ACPI errors with 3.7-rc3
On 11/09/2012 10:36 AM, Feng Tang wrote:

On Fri, Nov 09, 2012 at 10:30:43PM +0800, Moore, Robert wrote:

The ACPI Global Lock is in fact intended to provide exclusion between the BIOS and the OS.

Bob

Thanks for the info. And per my check, most of ACPI FW don't implement this lock, say after driver probe, the ec->global_lock will be 0.

The DSDT is supposed to define the _GLK control method on the EC if the BIOS needs to perform its own access which may conflict with the OS usage. If it doesn't, then it should be the case that either the BIOS doesn't touch the EC itself or it uses a separate interface that doesn't cause conflicts with what the OS is doing.

- Feng

-Original Message-
From: Tang, Feng
Sent: Friday, November 09, 2012 1:29 AM
To: Rafael J. Wysocki
Cc: Greg KH; Azat Khuzhin; linux-a...@vger.kernel.org; Linux Kernel Mailing List; Zheng, Lv; Len Brown; Moore, Robert
Subject: Re: ACPI errors with 3.7-rc3

On Thu, Nov 08, 2012 at 05:49:40AM +0800, Rafael J. Wysocki wrote:

On Tuesday, November 06, 2012 01:48:26 PM Greg KH wrote:

On Tue, Nov 06, 2012 at 04:42:24PM +0400, Azat Khuzhin wrote:

I'v also have such errors on my macbook pro.
$ dmesg | tail [17056.008564] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.EC__.SMB0.SBRW] (Node 88026547ea10), AE_TIME (20120711/psparse-536) [17056.011194] ACPI Error: Method parse/execution failed [\_SB_.BAT0.UBST] (Node 88026547e678), AE_TIME (20120711/psparse-536) [17056.013793] ACPI Error: Method parse/execution failed [\_SB_.BAT0._BST] (Node 88026547e740), AE_TIME (20120711/psparse-536) [17056.016383] ACPI Exception: AE_TIME, Evaluating _BST (20120711/battery-464) [17056.511373] ACPI: EC: input buffer is not empty, aborting transaction [17056.512672] ACPI Exception: AE_TIME, Returned by Handler for [EmbeddedControl] (20120711/evregion-501) [17056.515256] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.EC__.SMB0.SBRW] (Node 88026547ea10), AE_TIME (20120711/psparse-536) [17056.517886] ACPI Error: Method parse/execution failed [\_SB_.BAT0.UBST] (Node 88026547e678), AE_TIME (20120711/psparse-536) [17056.520479] ACPI Error: Method parse/execution failed [\_SB_.BAT0._BST] (Node 88026547e740), AE_TIME (20120711/psparse-536) [17056.523070] ACPI Exception: AE_TIME, Evaluating _BST (20120711/battery-464) I'm seeing this again right now. 
I'm wondering if it's because I'm running on battery power at the moment: [41694.309264] ACPI Exception: AE_TIME, Returned by Handler for [EmbeddedControl] (20120913/evregion-501) [41694.309282] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.EC__.SMB0.SBRW] (Node 88045cc64618), AE_TIME (20120913/psparse-536) [41694.309300] ACPI Error: Method parse/execution failed [\_SB_.BAT0.UBST] (Node 88045cc64988), AE_TIME (20120913/psparse-536) [41694.309310] ACPI Error: Method parse/execution failed [\_SB_.BAT0._BST] (Node 88045cc648c0), AE_TIME (20120913/psparse-536) [41694.309324] ACPI Exception: AE_TIME, Evaluating _BST (20120913/battery-464) [41694.809093] ACPI: EC: input buffer is not empty, aborting transaction ec_storm_threshold is still set to 8 in /sys/module/acpi/parameters/ so that's not the issue here. And also loadavg is too high ~ 10 While there is no process that load CPU up to 100% or like that. I think that this because of processes that is done in kernel space. (basically that one who write such errors) $ uname -a Linux macbook-pro-sq 3.6.5macbook-pro-custom-v0.1 #4 SMP Sun Nov 4 12:39:03 UTC 2012 x86_64 GNU/Linux Ah, ok, that means it's not something new in 3.7-rc, so maybe it's just never worked properly for this hardware :) So it's not a regression, just an ACPI issue, any ACPI developer have an idea about this? Can you please send the output of acpidump from the affected machine(s)? I doubt this problem is sometimes inevitable for some machines, because AFAIK most modern machines have the race problem for EC HW controller, as both OS side and the BIOS may access the EC HW at the same time without any race control. For this case, usually the battery and thermal modules (which may be controlled through EC) are always monitored by BIOS, when OS also frequently visit them too, the EC's own state machine may be broken and not responsive due to the race, then cause the timeout error. 
And how severe the problem will be depends on the EC HW, the quality of BIOS code and OS/driver code. Myself have seen the similar "ACPI: EC: input buffer is not empty, aborting transaction" error message on one laptop when its EC is busy visited by OS. btw, in EC driver I see a "ec->global_lock", don't know if it was designed to control the race between OS and BIOS. Thanks, Feng
Re: Possible disk failure
On 11/13/2012 09:54 PM, Steven Rostedt wrote:

Hi Jens,

Since you helped me out before, I'm going to ask for some more help ;-)

I just recently purchased a new workstation from HP, and after fighting with getting grub2 working the way I want, I kicked off a ktest run to create a true min config. It basically disables options from the .config file until it finds a .config that boots but will fail if you disable any of the configs that are set.

Anyway, in the middle of this test, I started getting these nasty ata errors again (like the ones I got with the failed HD that you helped me out with on G+). But this is a brand new spanking machine (with a new HD), but I could have gotten a lemon.

Anyway, the full dmesg is at: http://rostedt.homelinux.com/private/bxtest-ata-fail-dmesg

The important part being:

[ 11.974811] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 11.982816] ata1.00: irq_stat 0x4008
[ 11.987512] ata1.00: failed command: READ FPDMA QUEUED
[ 11.993407] ata1.00: cmd 60/08:00:00:20:92/00:00:07:00:00/40 tag 0 ncq 4096 in
[ 11.993407] res 41/40:00:04:20:92/00:00:07:00:00/40 Emask 0x409 (media error)
[ 12.010367] ata1.00: status: { DRDY ERR }
[ 12.015146] ata1.00: error: { UNC }
..
[ 16.527065] end_request: I/O error, dev sda, sector 127016964

i.e. the drive reported an uncorrected read error on sector 127016964.
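
As a cross-check, the failing sector can be recovered from the "res" line itself. The sketch below assumes the field order status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, which is my reading of how libata prints the result taskfile; the function name is invented for illustration. With that assumption the 48-bit LBA works out to the same sector that end_request reports.

```python
def parse_res_taskfile(tf):
    """Decode a libata 'res' string such as
    '41/40:00:04:20:92/00:00:07:00:00/40' (field order is an assumption).
    """
    status, low, high, device = tf.strip().split("/")
    error, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hob_feat, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # 48-bit LBA: the "hob" (high order byte) registers hold bits 24..47.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "status": int(status, 16),
        "error": error,
        "lba": lba,
        "device": int(device, 16),
    }


if __name__ == "__main__":
    res = parse_res_taskfile("41/40:00:04:20:92/00:00:07:00:00/40")
    print(res["lba"])                  # 127016964, matching end_request
    print(bool(res["status"] & 0x01))  # ERR bit set in status
    print(bool(res["error"] & 0x40))   # UNC bit set in error
```

So 0x07922004 is sector 127016964, and status 0x41 with error 0x40 lines up with the "{ DRDY ERR }" and "{ UNC }" lines in the log.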
And here's a smartctl dump: [root@bxtest ~]# smartctl --all /dev/sda smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.5.5-custom] (local build) Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: Western Digital Caviar Blue Serial ATA Device Model: WDC WD5000AAKX-60U6AA0 Serial Number:WD-WCC2EF801545 LU WWN Device Id: 5 0014ee 25ccb70a0 Firmware Version: 18.01H18 User Capacity:500,107,862,016 bytes [500 GB] Sector Size: 512 bytes logical/physical Device is:In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is:Tue Nov 13 17:48:40 2012 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x80) Offline data collection activity was never started. Auto Offline Data Collection: Enabled. Self-test execution status: ( 40) The self-test routine was interrupted by the host with a hard or soft reset. Total time to complete Offline data collection:( 7860) seconds. Offline data collection capabilities:(0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time:( 2) minutes. Extended self-test routine recommended polling time:( 80) minutes. SCT capabilities: (0x303f) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. 
SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051Pre-fail Always - 78 3 Spin_Up_Time0x0027 143 142 021Pre-fail Always - 3850 4 Start_Stop_Count0x0032 100 100 000Old_age Always - 138 5 Reallocated_Sector_Ct 0x0033 200 200 140Pre-fail Always - 0 7 Seek_Error_Rate 0x002f 100 253 051Pre-fail Always - 0 9 Power_On_Hours 0x0032 100 100 000Old_age Always - 43 10 Spin_Retry_Count0x0033 100 100 051Pre-fail Always
Re: [3.6.6] panic on reboot / khungtaskd blocked? (WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule)
On 11/13/2012 08:32 PM, Michael Wang wrote: On 11/13/2012 05:40 PM, Paweł Sikora wrote: On Monday 12 of November 2012 13:33:39 Paweł Sikora wrote: On Monday 12 of November 2012 11:22:47 Paweł Sikora wrote: On Monday 12 of November 2012 15:40:31 Michael Wang wrote: On 11/12/2012 03:16 PM, Paweł Sikora wrote: On Monday 12 of November 2012 11:04:12 Michael Wang wrote: On 11/09/2012 09:48 PM, Paweł Sikora wrote: Hi, during playing with new ups i've caught an nice oops on reboot: http://imgbin.org/index.php?page=image&id=10253 probably the upstream is also affected. Hi, Paweł Are you using a clean 3.6.6 without any modify? yes, pure 3.6.6 form git tree with modular config. Looks like some threads has set itself to be UNINTERRUPTIBLE with out any design on switch itself back later(or the time is too long), are you accidentally using some bad designed module? hmm, hard to say. mostly all modules are loaded automatically by kernel. Could you please provide the whole dmesg in text? your picture lost the print info of the hung task. i've grabbed the console via rs232 but there's no more info (see attached txt). hmm, i have one observation. during rc.shutdown there're messages on console like this: Cannot stat file /proc/$pid/fd/1: Connection timed out afaics this file descriptor points to vnc log file on a remote machine, e.g.: # ps aux|grep xfwm4 eda 1748 0.0 0.0 320220 11224 ?S13:08 0:00 xfwm4 # readlink -m /proc/1748/fd/1 /remote/dragon/ahome/eda/.vnc/odra:11.log # mount|grep ahome dragon:/home/users/ on /remote/dragon/ahome type nfs (rw,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.0.2.121,mountvers=3,mountport=45251,mountproto=udp,local_lock=none,addr=10.0.2.121) so, probably during `killall5 -TERM/-KILL` on shutdown stage something sometimes go wrong and these processes (xfce4/vncserver) survive the signal and hang on the nfs i/o. ok, now i have full sysrq+w backtraces from shutdown process. 
i hope i'll help you.

This can only tell us what's the task in UNINTERRUPTABLE state, but with out time info, we can't find out which one is the hung task...

Probably all of the ones in D state waiting on NFS are the issue - but as I understand it, with modern kernels processes are supposed to be killable while waiting on NFS I/O. Maybe there's a bug that affects this, though?
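
If it helps to narrow down which tasks are stuck, the state letter can be read from /proc/<pid>/stat. This is only a sketch: the helper names are made up, and the sample lines are invented for illustration (the process names are borrowed from the ps output earlier in the thread, the states are not from any real capture).

```python
def task_state(stat_line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the last ')' instead of on whitespace.
    """
    head, _, tail = stat_line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state


def uninterruptible(stat_lines):
    """List (pid, comm) for every task whose state letter is 'D'."""
    return [
        (pid, comm)
        for pid, comm, state in map(task_state, stat_lines)
        if state == "D"
    ]


if __name__ == "__main__":
    # Hypothetical sample lines, truncated after the first few fields.
    samples = [
        "1748 (xfwm4) D 1 1748 1748 0",
        "1 (init) S 0 1 1 0",
    ]
    print(uninterruptible(samples))
```

On a live system the input lines would come from reading each /proc/<pid>/stat file; comparing that list against which of those tasks have files open on the NFS mount would show whether they are the ones blocking shutdown.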
Re: DMA Errors from SATA Controller with 4G Memory Remapping Enabled
On 11/13/2012 04:08 PM, Dimitar Popov wrote:

Hi all,

I have an old computer with motherboard ASUS SK8N with AMD Opteron 148 and 4 GiB of DDR400. There is an onboard SATA Promise RAID controller working in IDE mode (i.e. not as RAID controller) with 2 SATA disks. In order to use all 4 GiB RAM I need to enable the "4G Memory Remapping" option from the BIOS. However, if I do that I receive the following errors:

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: failed command: WRITE DMA
[136B blob data]
ata2.00: status: { DRDY }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: configured for UDMA/133
ata2.00: device reported invalid CHS sector 0
sd 1:0:0:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
sd 1:0:0:0: [sdb] Sense Key : 0xb [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
00 00 00 00
sd 1:0:0:0: [sdb] ASC=0x0 ASCQ=0x0
sd 1:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 2a 63 f0 00 00 08 00
end_request: I/O error, dev sdb, sector 2778096
btrfs: bdev /dev/sdb6 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
ata2: EH complete

I don't know if it is a bug in the BIOS or a kernel bug. Here some info (I'll provide further info, if needed):

It seems like there are some lines missing from this output. Can you provide the full dmesg output from bootup? You also might want to check for a BIOS update.

$ uname -a
Linux darkstar 3.6.6-1-ARCH #1 SMP PREEMPT Mon Nov 5 11:57:22 CET 2012 x86_64 GNU/Linux

01:08.0 RAID bus controller: Promise Technology, Inc. PDC20378 (FastTrak 378/SATA 378) (rev 02)
Subsystem: ASUSTeK Computer Inc. K8V Deluxe/PC-DL Deluxe motherboard
Flags: bus master, 66MHz, medium devsel, latency 96, IRQ 19
I/O ports at dc00 [size=64]
I/O ports at d800 [size=16]
I/O ports at d400 [size=128]
Memory at fa9ff000 (32-bit, non-prefetchable) [size=4K]
Memory at fa9c (32-bit, non-prefetchable) [size=128K]
Capabilities: [60] Power Management version 2
Kernel driver in use: sata_promise

Thank you in advance!

Regards,
Dimitar
Re: Why Cypress does not upstream its trackpad driver?
On 11/07/2012 06:26 PM, David Solda wrote:

Dmitry, all,

To clarify my comment. Our protocol utilizes 8 bytes which are needed in our driver. In order for the Linux system to accept 8 bytes of data, the Linux psmouse system driver is required to be modified. Without this modification, the driver that you are referring to will not work correctly. The psmouse system driver change that would be required is the item that would be rejected.

I appreciate your comments and of course, if the driver could be upstreamed, it would (we already have I2C drivers upstreamed for Chrome systems), but there is a difference here. I will again look into the possibility of what you are requesting, however, the chances are extremely low if not zero that it will be accepted.

Why? If drivers were kept out of the kernel because the hardware they are designed to run on requires strange things or was badly designed, there would be a lot fewer drivers in the kernel than there are today. Firmware and hardware frequently do bizarre or nonsensical things and we just have to deal with it.

Dave

-Original Message-
From: Dmitry Torokhov [mailto:dmitry.torok...@gmail.com]
Sent: Wednesday, November 07, 2012 4:16 PM
To: David Solda
Cc: Troy Abercrombia; Kamal Mostafa; Ozan Çağlayan; linux-kernel@vger.kernel.org; linux-in...@vger.kernel.org; customercare; mario_limoncie...@dell.com
Subject: Re: Why Cypress does not upstream its trackpad driver?

Hi David,

On Wednesday, November 07, 2012 06:30:11 PM David Solda wrote:

Kamal,

My name is Dave Solda and I would be happy to answer any other questions that you have. Troy's response is correct however as in order to support the default Linux mouse class, our firmware would also have to be modified to do so, which cannot be done in system. Our packet protocol maxes out at an 8 byte packet, which requires a change to the Linux standard in this case.

I am unable to parse this... I do not believe anyone asks you to change your firmware and if your protocol needs 8 bytes to transmit device state - that's fine.

Our goal in working with canonical was to provide something on Linux that would support multi-touch and not only have default single finger movement supported. If I am mistaken and the Linux kernel would accept this, then we can proceed to upstream, however all indications we have is that this patch would be rejected. If you (or others on from the locus alias) have any inputs, I would be happy to receive them.

This really depends on whether the changes to the psmouse framework make sense or not. Please start submitting patches for review/discussion and we can go from there.

Thanks.

--
Dmitry

This message and any attachments may contain Cypress (or its subsidiaries) confidential information. If it has been received in error, please advise the sender and immediately delete this message.
Re: macbook pro 9.2 stat/ata bus error
On 11/06/2012 09:41 PM, Azat Khuzhin wrote:

Anybody?

On Mon, Nov 5, 2012 at 7:28 PM, Azat Khuzhin wrote:

After installing linux on macbook 9.2 (mid 2012), I have next errors in dmesg log:

[ 389.623828] EXT4-fs (sda4): re-mounted. Opts: errors=remount-ro,data=ordered,commit=600
[ 410.038465] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 410.075042] ehci_hcd :00:1a.0: setting latency timer to 64
[ 410.483526] EXT4-fs (sda4): re-mounted. Opts: errors=remount-ro,data=ordered,commit=0
[ 1401.834509] EXT4-fs (sda4): re-mounted. Opts: errors=remount-ro,data=ordered,commit=1800
[ 1406.467268] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 1406.506769] ehci_hcd :00:1a.0: setting latency timer to 64
[ 1406.590122] EXT4-fs (sda4): re-mounted. Opts: errors=remount-ro,data=ordered,commit=0
[ 1407.492260] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0xe frozen
[ 1407.494441] ata2.00: irq_stat 0x0040, PHY RDY changed
[ 1407.495238] ata2: SError: { PHYRdyChg CommWake }
[ 1407.496035] sr 1:0:0:0: CDB:
[ 1407.497333] Get event status notification: 4a 01 00 00 10 00 00 00 08 00
[ 1407.498285] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
[ 1407.498285] res 50/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
[ 1407.501987] ata2.00: status: { DRDY }
[ 1407.502882] ata2: hard resetting link
[ 1408.230302] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1408.233279] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 1408.237467] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 1408.239084] ata2.00: configured for UDMA/100
[ 1408.262238] ata2: EH complete

Is this after a resume? It could be that for some reason the SATA link is a little bit unstable right after the machine powers up again. There may not be much the kernel can do about this..

[ 3565.785609] EXT4-fs (sda4): re-mounted.
Opts: errors=remount-ro,data=ordered,commit=1800 [ 3576.921499] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. [ 3576.958624] ehci_hcd :00:1a.0: setting latency timer to 64 [ 3577.114612] EXT4-fs (sda4): re-mounted. Opts: errors=remount-ro,data=ordered,commit=0 [ 3577.923688] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0xe frozen [ 3577.925852] ata2.00: irq_stat 0x0040, PHY RDY changed [ 3577.926746] ata2: SError: { PHYRdyChg CommWake } [ 3577.927544] sr 1:0:0:0: CDB: [ 3577.928345] Get event status notification: 4a 01 00 00 10 00 00 00 08 00 [ 3577.929642] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in [ 3577.929642] res 50/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error) [ 3577.932954] ata2.00: status: { DRDY } [ 3577.934264] ata2: hard resetting link [ 3578.662228] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 3578.665211] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 3578.669355] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 3578.670969] ata2.00: configured for UDMA/100 [ 3578.694145] ata2: EH complete Is it linux driver, or maybe $ lspci # sata information only 00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04) (prog-if 01 [AHCI 1.0]) Subsystem: Intel Corporation Device 7270 Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 20 I/O ports at 2098 [size=8] I/O ports at 20bc [size=4] I/O ports at 2090 [size=8] I/O ports at 20b8 [size=4] I/O ports at 2060 [size=32] Memory at a0816000 (32-bit, non-prefetchable) [size=2K] Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [70] Power Management version 3 Capabilities: [a8] SATA HBA v1.0 Capabilities: [b0] PCI Advanced Features Kernel driver in use: ahci $ uname -a Linux macbook-pro 3.6.5macbook-pro-custom-v0.1 #4 SMP Sun Nov 4 12:39:03 UTC 2012 x86_64 GNU/Linux $ cat /etc/debian_version wheezy/sid In OSX 
there are no errors with the hard drive. What else can I do to investigate this situation? -- Azat Khuzhin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Enable A20 using KBC for some MSI laptops to fix S3 resume
On 10/24/2012 02:09 PM, Alan Cox wrote: On Wed, 24 Oct 2012 12:36:04 -0700 "H. Peter Anvin" wrote: Minor concern: it should do the wait for ready before sending each command. Can we get a command line to do this quirk too - it strikes me that if the MSIs rely upon it then it may be something Windows always does so will be useful to try on other problem machines as an experiment. I agree, one has to keep in mind the age-old question "how does Windows work?" since it surely has no such quirk. I'd say we're sometimes too quick to add these DMI quirks when a more general solution would be to somehow figure out how the Linux behavior differs from what Windows is doing.
Re: ata4.00: failed to get Identify Device Data, Emask 0x1
On 10/16/2012 07:38 PM, Aaron Lu wrote: On 10/16/2012 11:18 PM, Borislav Petkov wrote: On Tue, Oct 16, 2012 at 03:58:24PM +0100, Alan Cox wrote: Can you check whether 3.6 works on them. I know 3.6 is horribly broken on several brands of AHCI controller (Jmicron for example). Dunno where Jeff is on fixing the regressions ? If by "works" you mean I don't see the message there, then yes, it does. Logs say the message started appearing on Oct 4th after me building Linus master after the merge window started. Ok, let me test 3.6.2 just in case .. yes, no error message there. This is brought by commit: 65fe1f0f66a57380229a4ced844188103135f37b, ahci: implement aggressive SATA device sleep support. Shane, got time to take a look? This debug message made people uncomfortable :-) I don't have whatever version of ATA command set defines this command, but surely there's some identify bit which lists whether this log page is supported. Right now checking for it is only conditional on NCQ support.
Re: [RFT] xhci: Switch PPT ports to EHCI on shutdown.
On 08/07/2012 11:39 AM, Sarah Sharp wrote: The Intel desktop boards DH77EB and DH77DF have a hardware issue that can be worked around by BIOS. If the USB ports are switched to xHCI on shutdown, the xHCI host will send a spurious interrupt, which will wake the system. Some BIOS will work around this, but not all. The bug can be avoided if the USB ports are switched back to EHCI on shutdown. The Intel Windows driver switches the ports back to EHCI, so change the Linux xHCI driver to do the same. Unfortunately, we can't tell the two affected boards apart from other working motherboards, because the vendors will change the DMI strings for the DH77EB and DH77DF boards to their own custom names. One example is Compulab's mini-desktop, the Intense-PC. Instead, key off the Panther Point xHCI host PCI vendor and device ID, and switch the ports over for all PPT xHCI hosts. The only impact this will have on non-affected boards is to add a couple hundred milliseconds delay on boot when the BIOS has to switch the ports over from EHCI to xHCI. This patch should be backported to kernels as old as 3.0, that contain the commit 69e848c2090aebba5698a1620604c7dccb448684 "Intel xhci: Support EHCI/xHCI port switching." 
Signed-off-by: Sarah Sharp Reported-by: Denis Turischev Cc: sta...@vger.kernel.org --- drivers/usb/host/pci-quirks.c |7 +++ drivers/usb/host/pci-quirks.h |1 + drivers/usb/host/xhci-pci.c |9 + drivers/usb/host/xhci.c |3 +++ drivers/usb/host/xhci.h |1 + 5 files changed, 21 insertions(+), 0 deletions(-) diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c index df0828c..c5e9e4a 100644 --- a/drivers/usb/host/pci-quirks.c +++ b/drivers/usb/host/pci-quirks.c @@ -800,6 +800,13 @@ void usb_enable_xhci_ports(struct pci_dev *xhci_pdev) } EXPORT_SYMBOL_GPL(usb_enable_xhci_ports); +void usb_disable_xhci_ports(struct pci_dev *xhci_pdev) +{ + pci_write_config_dword(xhci_pdev, USB_INTEL_USB3_PSSEN, 0x0); + pci_write_config_dword(xhci_pdev, USB_INTEL_XUSB2PR, 0x0); +} +EXPORT_SYMBOL_GPL(usb_disable_xhci_ports); + /** * PCI Quirks for xHCI. * diff --git a/drivers/usb/host/pci-quirks.h b/drivers/usb/host/pci-quirks.h index b1002a8..ef004a5 100644 --- a/drivers/usb/host/pci-quirks.h +++ b/drivers/usb/host/pci-quirks.h @@ -10,6 +10,7 @@ void usb_amd_quirk_pll_disable(void); void usb_amd_quirk_pll_enable(void); bool usb_is_intel_switchable_xhci(struct pci_dev *pdev); void usb_enable_xhci_ports(struct pci_dev *xhci_pdev); +void usb_disable_xhci_ports(struct pci_dev *xhci_pdev); #else static inline void usb_amd_quirk_pll_disable(void) {} static inline void usb_amd_quirk_pll_enable(void) {} diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index 92eaff6..9bfd4ca11 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -94,6 +94,15 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) xhci->quirks |= XHCI_EP_LIMIT_QUIRK; xhci->limit_active_eps = 64; xhci->quirks |= XHCI_SW_BW_CHECKING; + /* +* PPT desktop boards DH77EB and DH77DF will power back on after +* a few seconds of being shutdown. The fix for this is to +* switch the ports from xHCI to EHCI on shutdown. 
We can't use +* DMI information to find those particular boards (since each +* vendor will change the board name), so we have to key off all +* PPT chipsets. +*/ + xhci->quirks |= XHCI_SPURIOUS_REBOOT; } if (pdev->vendor == PCI_VENDOR_ID_ETRON && pdev->device == PCI_DEVICE_ID_ASROCK_P67) { diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 95394e5..81aa10c 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -659,6 +659,9 @@ void xhci_shutdown(struct usb_hcd *hcd) { struct xhci_hcd *xhci = hcd_to_xhci(hcd); + if (xhci->quirks && XHCI_SPURIOUS_REBOOT) + usb_disable_xhci_ports(to_pci_dev(hcd->self.controller)); This looks like a typo, think it should be & not &&. With this code, it appears the quirk will always be triggered since XHCI_SPURIOUS_REBOOT is non-zero. + spin_lock_irq(&xhci->lock); xhci_halt(xhci); spin_unlock_irq(&xhci->lock); diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 96f49db..c713256 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1494,6 +1494,7 @@ struct xhci_hcd { #define XHCI_TRUST_TX_LENGTH (1 << 10) #define XHCI_LPM_SUPPORT (1 << 11) #define XHCI_INTEL_HOST (1 << 12) +#define XHCI_SPURIOUS_REBOOT (1 << 13) unsigned intnum_active_eps; unsigned intlimit_active_eps; /* There are two roothubs to keep track of bus suspend info for
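To illustrate the `&&` versus `&` point from the review comment above, here is a minimal userspace sketch. Strictly speaking, the logical form is true whenever any quirk bit is set, not unconditionally. The bit value is copied from the patch; the two helper functions are hypothetical, exist only for this illustration, and are not part of the xHCI driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit value copied from the patch; everything else here is illustrative. */
#define XHCI_SPURIOUS_REBOOT (1u << 13)

/* Logical AND, as posted: true whenever both operands are non-zero. */
static bool quirk_check_logical(unsigned int quirks, unsigned int flag)
{
	return quirks && flag;
}

/* Bitwise AND, as suggested: true only if the flag bit is set in quirks. */
static bool quirk_check_bitwise(unsigned int quirks, unsigned int flag)
{
	/* A zero flag would make the test meaningless. */
	assert(flag != 0);
	return (quirks & flag) != 0;
}
```

With quirks holding only XHCI_INTEL_HOST (1 << 12), the logical check reports the quirk as present while the bitwise check does not, which is the behaviour difference the review comment is about.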
Re: 3.4.4: Oops in snd_hda_codec_realtek (alc_auto_create_multi_out_ctls)
On Tue, Jul 17, 2012 at 6:25 AM, Takashi Iwai wrote: > At Mon, 9 Jul 2012 13:40:09 -0600, > Robert Hancock wrote: >> >> I've got a sort of industrial portable PC that uses a Supermicro C2SBX >> motherboard. Running an RHEL6 kernel (2.6.32-ish) it works fine, but >> if I run a 3.4.4 kernel (using the kernel-ml builds provided by ELRepo >> for RHEL6) it blows up on boot inside snd_hda_codec_realtek. Below is >> the dmesg output from bootup up to the oops, after I blacklisted >> snd_hda_intel from auto-loading and then manually modprobed it. Is >> this a known problem? > > I don't know of this issue, so need to check more. > Could you load snd-hda-intel module with probe_only=1 option, and run > alsa-info.sh with --no-upload option and give the output? > Then I can try the emulator to see what's wrong. > > Also, of course, testing a newer kernel like 3.5-rc7 would be helpful, > too. Unfortunately this machine is about to ship back out to a customer so there's not really any more info I can gather from it at this point. If I can get hold of it or a similar machine again in the future, I can try to get more information.
3.4.4: Oops in snd_hda_codec_realtek (alc_auto_create_multi_out_ctls)
I've got a sort of industrial portable PC that uses a Supermicro C2SBX motherboard. Running an RHEL6 kernel (2.6.32-ish) it works fine, but if I run a 3.4.4 kernel (using the kernel-ml builds provided by ELRepo for RHEL6) it blows up on boot inside snd_hda_codec_realtek. Below is the dmesg output from bootup up to the oops, after I blacklisted snd_hda_intel from auto-loading and then manually modprobed it. Is this a known problem? The lspci for the controller is: 00:1b.0 Audio device [0403]: Intel Corporation 82801I (ICH9 Family) HD Audio Controller [8086:293e] (rev 02) Subsystem: Super Micro Computer Inc Device [15d9:d980] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilities: [130] Root Complex Link Kernel modules: snd-hda-intel Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 3.4.4-1.el6.elrepo.x86_64 (mockbuild@Build64R6) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Fri Jun 22 21:43:41 EDT 2012 Command line: ro root=/dev/mapper/vg_add-lv_root rd_NO_LUKS rd_LVM_LV=vg_add/lv_root LANG=en_US.UTF-8 rd_LVM_LV=vg_add/lv_swap rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM BIOS-provided physical RAM map: BIOS-e820: - 0009cc00 (usable) BIOS-e820: 0009cc00 - 000a (reserved) BIOS-e820: 000d2000 - 000d4000 (reserved) BIOS-e820: 000e4000 - 0010 (reserved) BIOS-e820: 0010 - 7fed (usable) BIOS-e820: 7fed - 7fedc000 (ACPI data) BIOS-e820: 7fedc000 - 7fedf000 (ACPI NVS) BIOS-e820: 7fedf000 - 8000 (reserved) BIOS-e820: e000 - f000 (reserved) BIOS-e820: fec0 - fed0 (reserved) BIOS-e820: fee0 - fee01000 (reserved) BIOS-e820: ff00 - 0001 (reserved) NX (Execute Disable) protection: active DMI present. 
DMI: Supermicro C2SBX/C2SBX, BIOS 2.00 09/17/2010 e820 update range: - 0001 (usable) ==> (reserved) e820 remove range: 000a - 0010 (usable) No AGP bridge found last_pfn = 0x7fed0 max_arch_pfn = 0x4 MTRR default type: uncachable MTRR fixed ranges enabled: 0-9 write-back A-B uncachable C-C7FFF write-protect C8000-D uncachable E-F write-protect MTRR variable ranges enabled: 0 base 0 mask F8000 write-back 1 disabled 2 disabled 3 disabled 4 disabled 5 disabled 6 disabled x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106 found SMP MP-table at [880f67d0] f67d0 initial memory mapped : 0 - 2000 Base memory trampoline at [88097000] 97000 size 20480 init_memory_mapping: -7fed 00 - 007fe0 page 2M 007fe0 - 007fed page 4k kernel direct mapping tables up to 7fed @ 1fffc000-2000 RAMDISK: 36f5b000 - 37ff crashkernel: memory value expected ACPI: RSDP 000f67a0 00024 (v02 PTLTD ) ACPI: XSDT 7fed1333 000A4 (v01 PTLTD ? XSDT 0604 LTP ) ACPI: FACP 7fedbd7c 000F4 (v03 INTEL 0604 PTL 0002) ACPI: DSDT 7fed5447 068C1 (v01 INTEL BEARLAKE 0604 MSFT 0301) ACPI: FACS 7fedefc0 00040 ACPI: _MAR 7fedbe70 00030 (v01 Intel OEMDMAR 0604 LOHR 0001) ACPI: MCFG 7fedbea0 0003C (v01 PTLTDMCFG 0604 LTP ) ACPI: HPET 7fedbedc 00038 (v01 PTLTD HPETTBL 0604 LTP 0001) ACPI: APIC 7fedbf14 00074 (v01 PTLTD ? 
APIC 0604 LTP ) ACPI: BOOT 7fedbf88 00028 (v01 PTLTD $SBFTBL$ 0604 LTP 0001) ACPI: SPCR 7fedbfb0 00050 (v01 PTLTD $UCRTBL$ 0604 PTL 0001) ACPI: SSDT 7fed2c4a 0025F (v01 PmRef Cpu0Tst 3000 INTL 20050228) ACPI: SSDT 7fed2ba4 000A6 (v01 PmRef Cpu7Tst 3000 INTL 20050228) ACPI: SSDT 7fed2afe 000A6 (v01 PmRef Cpu6Tst 3000 INTL 20050228) ACPI: SSDT 7fed2a58 000A6 (v01 PmRef Cpu5Tst 3000 INTL 20050228) ACPI: SSDT 7fed29b2 000A6 (v01 PmRef Cpu4Tst 3000 INTL 20050228) ACPI: SSDT 7fed290c 000A6 (v01 PmRef Cpu3Tst 3000 INTL 20050228) ACPI: SSDT 7fed2866 000A6 (v01 PmRef Cpu2Tst 3000 INTL 20050228) ACPI: SSDT 7fed27c0 000A6 (v01 PmRef Cpu1Tst 3000 INTL 20050228) ACPI: SSDT 7fed13d7 013E9 (v01 PmRefCpuPm 3000 INTL 20050228) ACPI: Local APIC address 0xfee0 No NUMA configuration found Faking a node at -7fed Initmem setup node 0 -7fed NODE_DATA [7feaa000 - 7fec] [ea00-ea0001bf] PMD -> [88007d60-
Re: [PATCH] sata_nv: fix nmi intr or system hanging in rhel4u6 adma.
Kuan Luo wrote: Hi, robert One customer reported that their system received a nmi interrupt after issuing "dd if=/dev/sdb of=/dev/null" on a defective disk in rhel4u6. I tested it and found that my system hung both in rhel4u6(2.6.9-67) and 2.6.24-rc7. The patch can work well, but I am not sure if the patch has other potential effect on adma. I attached a file in case of lines breaked. The below info comes from Gunther Mayer to reproduce the issue. " used a Seagate ST3500841NS 3.AE for my test; probably other seagate drives are also capable of creating media errors with the new hdparm-8.1: - compile hdparm-8.1 - hdparm -- yes-i-know-what-i-am-doing --make-bad-sector 6 /dev/sdb Unfortunately this does not succeed for nvidia sata controller (timeouts et al.), but it worked fine on AHCI machine (e.g. FSC R640). When I insert this newly created defective disk in Ultra 20, it reboots within seconds after issueing "dd if=/dev/sdb of=/dev/null". " Signed-off-by: [EMAIL PROTECTED] --- drivers/ata/sata_nv.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c index ed5473b..e824260 100644 --- a/drivers/ata/sata_nv.c +++ b/drivers/ata/sata_nv.c @@ -837,9 +837,10 @@ static void nv_adma_tf_read(struct ata_port *ap, struct ata_taskfile *tf) all shortly be aborted anyway. We assume that NCQ commands are not issued via passthrough, which is the only way that switching into ADMA mode could abort outstanding commands. */ - nv_adma_register_mode(ap); + struct nv_adma_port_priv *pp = ap->private_data; - ata_tf_read(ap, tf); + if (pp->flags & NV_ADMA_PORT_REGISTER_MODE) + ata_tf_read(ap, tf); } static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, __le16 *cpb) This is basically avoiding switching into register mode, right? 
I don't think this is a very good solution as the point of the tf_read function is that it's supposed to read the taskfile provided by the drive to diagnose the error, so not doing this isn't a good thing. Is there a reason why going into register mode should cause a lockup in this case?
Re: Clocksource tsc is always unstable with 2.6.25-* kernels and CONFIG_NO_HZ=y on my box
Gabriel C wrote: Hi, I noticed tsc is always marked unstable on my box with 2.6.25* , 2.6.24 is fine. .. [0.825760] ACPI: PCI Interrupt :03:0e.0[A] -> GSI 22 (level, low) -> IRQ 22 [0.805755] Switched to high resolution mode on CPU 1 [0.794244] Switched to high resolution mode on CPU 2 [0.766968] Switched to high resolution mode on CPU 3 [1.083944] Switched to high resolution mode on CPU 0 [ 15.388792] Clocksource tsc unstable (delta = 9373391604 ns) [ 15.714648] Time: acpi_pm clocksource has been installed. .. Booting nohz=off fixes that. Another strange thing is when I try to boot that kernel with clocksource=acpi_pm it just hangs. config is attached. Please let me know if you need more info / want me to try patches or anything else. Please post your full dmesg output.
Re: Keyboard interrupt - request_irq()
Pioz wrote: Hi all, I have a problem. I want to handle the keyboard interrupt and for this purpose I have written this module (I have kernel 2.6.23): #include #include #include [...] irqreturn_t irq_myhandler (int irqn, void *dev) { printk (KERN_INFO "Key pressed...\n"); return IRQ_HANDLED; } int init_module () { int res; printk (KERN_INFO "Hello World!\n"); free_irq (1, NULL); res = request_irq (1, irq_myhandler, IRQF_SHARED, "bao", dev_id); printk (KERN_INFO "res: %d\n", res); return 0; } void cleanup_module () { free_irq (1, NULL); printk (KERN_INFO "Goodbye World!\n"); } The return value of the request_irq() function is -EBUSY. Why? Is it because of the default handler? What can I do to replace that handler with my function? Thanks... Normally one doesn't register multiple interrupt handlers for the same device. For a PCI level-triggered interrupt one can do it (for the case where multiple devices share the IRQ), but the PC keyboard interrupt is edge-triggered and isn't sharable.
Re: Configure MSI-X vectors to target different CPUs
[EMAIL PROTECTED] wrote: Hi, In MSI-HOWTO, it's said: "Using MSI enables the device functions to support two or more vectors, which can be configured to target different CPUs to increase scalability." So how can I set up MSI-X vectors to target different CPUs? I want to allocate the same number of MSI-X vectors as CPUs, and equally distribute them to every CPU. Is it automatically done by Linux when I call pci_enable_msix()? If yes, how? If not, what should I do? My guess is to set the affinity of the interrupts manually. Am I right? Please CC me ([EMAIL PROTECTED]) on answers/comments in response to this posting. Thanks, Ying If the device actually supports multiple vectors (not all do), I think they should show up as separate interrupts in /proc/interrupts and you can either set the affinity manually, or maybe irqbalance is smart enough for this. Careful, though, as in some cases this may reduce performance due to causing more cache line bouncing between CPUs.
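As a rough sketch of the "set the affinity manually" option mentioned above: each interrupt listed in /proc/interrupts has a /proc/irq/<irq>/smp_affinity file that takes a hexadecimal CPU mask. The helper below only builds that mask string for a single CPU. Its name is made up for this example, and it assumes fewer than 32 CPUs so the mask fits in one 32-bit group.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the hex CPU mask that would be written to
 * /proc/irq/<irq>/smp_affinity to steer one interrupt to one CPU.
 * Assumption: cpu < 32, so the mask fits in a single 32-bit group.
 * Returns the number of characters written, or -1 on bad input.
 * (Hypothetical helper, not a kernel or libc API.)
 */
static int format_irq_affinity(unsigned int cpu, char *buf, size_t len)
{
	int n;

	assert(buf != NULL && len > 0);

	if (cpu >= 32)
		return -1;

	n = snprintf(buf, len, "%x", 1u << cpu);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

A caller would loop over the device's vectors, call this once per target CPU, and write each string to the matching smp_affinity file as root. If irqbalance is running it may later move the interrupts again.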
Re: BUG?: "Cannot map mmconfig aperture"
Diego Calleja wrote: I get the following new message in my dmesg: [0.155476] ACPI: bus type pci registered [0.155567] PCI: Found Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub with MMCONFIG support. [0.161149] PCI: Cannot map mmconfig aperture for segment 0 [0.161181] PCI: Using configuration type 1 [0.165980] ACPI: EC: Look up EC in DSDT when previously I'd have: [0.156577] PCI: Found Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub with MMCONFIG support. [0.166403] PCI: Using MMCONFIG at e000 - efff [0.166407] PCI: Using configuration type 1 [0.171548] ACPI: EC: Look up EC in DSDT No idea if this is a regression or not, or what it means, the system works well. git-bisect says the problem comes from: commit c31c7d4844ea4817692ae16bf70f9c96c05a50eb Author: Thomas Gleixner <[EMAIL PROTECTED]> Date: Mon Feb 18 20:54:14 2008 +0100 x86: CPA, fix alias checks Yeah, sounds like a regression.
Re: BT8x8 TV Card
Chris Brennan wrote: I'm having a kernel related issue (I think) with the BT878 card I have in my gentoo box. Here are pastebin results of various information, I hope I give all the necessary info. If I am missing something or you need more, please let me know. dmesg -> http://rafb.net/p/MVIiSg62.html xorg.conf -> http://rafb.net/p/dXz4Ry49.html Xorg.0.log -> http://rafb.net/p/LbPBZT56.html config -> http://rafb.net/p/Dm5LDM88.html These links don't work. xawtv gives me a window and when I right click, I can choose what format and region and all that jazz, but I get no picture. I do get a green overlay w/ some static when it doesn't cause my X Session to freeze. Below is the console output of two apps I am trying to use to access the TV Card. tvtime produces the following error: [EMAIL PROTECTED] ~ $ tvtime Running tvtime 1.0.2. Reading configuration from /etc/tvtime/tvtime.xml Reading configuration from /home/xaero/.tvtime/tvtime.xml xvoutput: No XVIDEO port found which supports YUY2 images. *** tvtime requires hardware YUY2 overlay support from your video card *** driver. If you are using an older NVIDIA card (TNT2), then *** this capability is only available with their binary drivers. *** For some ATI cards, this feature may be found in the experimental *** GATOS drivers: http://gatos.souceforge.net/ *** If unsure, please check with your distribution to see if your *** X driver supports hardware overlay surfaces. [EMAIL PROTECTED] ~ $ tvtime-scanner now that made my heart jump ... cause it scanned and stored all my basic cable channels. But I still can't get a video signal. So I am obviously missing something. So hopefully someone can help me I assume you did read the message tvtime spit out and looked into it?
Re: Intel Core2Duo mobile - how does the VID get set?
Brian Morrison wrote: Pallipadi, Venkatesh wrote: After a fair bit of Googling and reading around, I'm none the wiser about exactly how Linux 2.6.x sets the processor VID (or for that matter how it decides the FID settings) when using the ondemand governor and cpufreq stuff. Can anyone tell me a) whether this is obtained from the BIOS, something in the MSR of the processor or elsewhere and Yes. The freq and voltage supported comes from the platform BIOS. BIOS exports this in an ACPI table which is handled by acpi-cpufreq driver in Linux kernel. OK, thanks for that. b) whether there is an interface in /proc or /sys where one can find out what is set and modify it? There is no way of modifying and using new VIDs etc in Linux kernel (other than exporting your own DSDT and hacking the related code). Is there a way of viewing the ACPI table contents relating to the VIDs and FIDs? Viewing the decompiled ACPI DSDT may provide some hints, however the kernel itself knows nothing about VID/FID settings for acpi-cpufreq, basically the kernel writes to magic ACPI registers and the BIOS or hardware take over to do what's required.
Re: [PATCH 1/5] x86: validate against acpi motherboard resources
Andi Kleen wrote: With just this patch you will have this problem. You need either the patch to disable decode during BAR sizing, Isn't that one already merged? I remember the BAR decoding patch did help with at least one of the original failures (there were multiple ones iirc) I believe that one's been dropped as it's not needed if we don't use MMCONFIG for non-extended accesses (like we use during BAR sizing). (Though, there may still be a case where it's needed, see below.) If someone points me to all the patches needed or a tree that has them all applied I can give it a quick spin on the boxes I have here. One of the systems where it originally failed I don't have anymore though. or the patch to use MMCONFIG for extended config space only, if you don't have them already. That would mean it would boot, but anything that uses extended config space would fail. While not as catastrophic as before I'm not sure it's that great either. At least there would still be breakage, but of a much more subtle kind. The only issue on those boards is that since certain device BARs will overlap the MMCONFIG area during BAR sizing, if you use MMCONFIG to do the accesses used during BAR sizing itself, it'll fail. If you use conf1 to do the BAR sizing then that problem doesn't happen. However, I suppose there could be an issue if you hotplugged a device (causing BAR sizing) once you'd booted, while extended config space was in use on another device. The BAR sizing wouldn't fail, but the guy using extended config space would since he's actually reading from/writing into the BAR of the device being sized instead of the MMCONFIG area. That wouldn't be good. The disable-decode-during-sizing patch would avoid that problem.
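For readers unfamiliar with why a BAR can land on top of the MMCONFIG area during sizing: the usual procedure writes all ones to the BAR and reads it back, so for a moment the BAR decodes at a very high address. The userspace sketch below shows only the size computation from that read-back value for a 32-bit memory BAR. The function and macro names are hypothetical and this is not the kernel's actual probing code.

```c
#include <assert.h>
#include <stdint.h>

/* Low four bits of a 32-bit memory BAR carry type/prefetch flags. */
#define BAR_MEM_FLAG_BITS 0xfu	/* hypothetical name for this sketch */

/*
 * Given the value read back from a 32-bit memory BAR after 0xffffffff
 * was written to it, return the region size in bytes, or 0 if the BAR
 * is not implemented.
 */
static uint32_t bar_size_from_readback(uint32_t readback)
{
	uint32_t addr_bits = readback & ~BAR_MEM_FLAG_BITS;

	if (addr_bits == 0)
		return 0;
	/* The writable address bits form a mask; its two's complement is the size. */
	return ~addr_bits + 1u;
}

/* Usage example with two plausible read-back values. */
void bar_size_example(void)
{
	assert(bar_size_from_readback(0xfff00000u) == 0x100000u);	/* 1 MB */
	assert(bar_size_from_readback(0xfffff008u) == 0x1000u);	/* 4 KB, prefetchable */
}
```

Between the all-ones write and the restore of the original value, the device would respond at the top of the 32-bit address space unless memory decode is disabled, which is the window the disable-decode-during-sizing patch is meant to close.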
Re: PCI Bursting with PIO
Dan Gora wrote: On Feb 15, 2008 10:00 PM, Robert Hancock <[EMAIL PROTECTED]> wrote: Well, in order for the CPU to batch up more writes you'd have to map the BAR as either write-combining or write-back. If it's not listed in /proc/mtrr it will be the default setting of uncacheable. Ok, this is pretty much what I thought, but I still don't really have any idea how to do this. ioremap() doesn't take any flags and I'm not using ioremap_uncacheable(), plus the BAR is marked prefetchable... Likely easiest to do it from userspace by writing into /proc/mtrr to change the memory type attributes. Have a look at Documentation/mtrr.txt. X has code to set up the video memory on the video card as write-combining so it can get better write performance, you could do something similar. Alan mentioned this as well, but I haven't tried to hunt this code yet. If you have any pointers as to where I might find this, I would appreciate it. Setting it as write-back might allow you to get the reads to do bursting as well (since the CPU will do a cache-line fill instead of individual accesses) I don't see what the cache write policy has to do with the reads. If the region is marked cacheable, then all reads should try and read a cache line, right? The write-back or write-through policy only has to do with the writes. If it's write through then writes go directly to RAM, if it's write-back then they hit the cache and are flushed when the line is flushed (LRU replacement, explicit cache line flush, etc..), right? That caching attribute affects reads as well. If it's marked uncacheable or write-combining then reads will never be cached, only if it's marked write-back. but this if the device is modifying this memory area, unless you add code to invalidate those cache lines before reading the data you'll get stale data back. Yeah this could definitely be tricky, would pci_dma_sync suffice for this? 
No, that's not meant to handle this case of stale data in the CPU's cache since that doesn't normally happen. Something like clflush or wbinvd would do it, those being x86 specific of course. You could run into some other less obvious issues as well, as normally device memory regions are not mapped write-back. In general, especially if you need to read data back from the device, implementing a DMA engine would be by far the better option. Most chipsets seem not at all optimized for handling sequential reads from PCI memory from the CPU. (Even in the DMA case, you have to be careful with what type of memory read transaction you use when transferring from host memory - some chipsets don't like to burst more than one cycle if you use normal Memory Read instead of Memory Read Line or Memory Read Multiple.) True enough... Fortunately my device allows me to set these... What I am trying to avoid is PCI read transactions in general. PCI reads are slow pretty much no matter if they are originated from the device or from the host because of all the multitude of bridges they have to go through (I've seen 5 in some cases... sheesh). So ultimately I'd like everything going to the device to be written from the host, then everything going towards the host be DMA'd into RAM by the device, at least then we can take advantage of PCI write posting and you don't have to wait for the write to actually complete before we plod on. But this depends on at least getting write burst performance from the host so that the time to write the data from host is less than the time it would take for the device to read the data out of RAM. thanks again for your help! dan Setting write-combining should be fairly easy without too many weird side effects. Trying to use write-back to get burst reads is potentially doable, but may be fraught with difficulty. I think DMA in both directions is still likely better though, unless the data you are writing is very small. 
Most chipsets have pretty small posting buffers so the amount it will help you is small. If you fill them up you'll just stall the CPU. With doing a DMA read, at least only the device will stall.
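A minimal sketch of the /proc/mtrr route suggested in this thread. The line format follows the example in Documentation/mtrr.txt as far as I recall it, so check it against the copy in your tree. The helper name is invented here, and the validation assumes the region size is a power of two with the base aligned to it, as x86 variable-range MTRRs generally require.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the line that would be written to /proc/mtrr to mark a
 * physical range (e.g. a prefetchable BAR) as write-combining.
 * Returns the length of the line, or -1 if the range looks invalid
 * or the buffer is too small. (Hypothetical helper for illustration.)
 */
static int format_mtrr_wc(unsigned long long base, unsigned long long size,
			  char *buf, size_t len)
{
	int n;

	assert(buf != NULL && len > 0);

	/* Assumption: size is a power of two and base is aligned to it. */
	if (size == 0 || (size & (size - 1)) != 0 || (base & (size - 1)) != 0)
		return -1;

	n = snprintf(buf, len, "base=0x%llx size=0x%llx type=write-combining\n",
		     base, size);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

The base and size would come from the BAR as shown by lspci -v, and the resulting line is written to /proc/mtrr as root. A BAR whose size is not a power of two would have to be covered by more than one entry, which this sketch does not attempt.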
Re: PCI Bursting with PIO
Dan Gora wrote: Hi, I am trying to optimize a driver for a slave only PCI device and am having a lot of trouble getting any kind of PCI burst transactions in either the read or the write direction. Using bcopy/memcpy or even a hand-crafted while (len) { *pdst++ = *psrc++} (with pdst and psrc unsigned long*) I can only get writes to burst and even in that case only for 2 data phases (8 bytes) and only on 64 bit machines. The best that I have managed is to use a hand crafted asm function which copies the data through mmx registers on i386 machines, but that still only bursts a maximum of 16 bytes in the write direction and not at all in the read direction. The source and destination pointers are both aligned to 8 byte boundaries, so I don't think that it's an alignment issue. The chipset is being limited by what the CPU is giving it. If the CPU sends only a small amount of data in one access then the chipset usually does not try to burst more than that. Is there any way to get PIO to burst over the PCI bus in the read and write direction? My device has 4 BAR registers, but the area where I am transferring data is marked 'prefetchable' (although the others are not). I read here: http://lkml.org/lkml/2004/9/23/393 that this was a prerequisite, but it is apparently not sufficient. He also mentioned that the area had to be marked as write-back, but it's not clear how you can tell (no /proc/mtrr doesn't tell you) or that it has anything to do with bursting reads. Any ideas would be really appreciated, Well, in order for the CPU to batch up more writes you'd have to map the BAR as either write-combining or write-back. If it's not listed in /proc/mtrr it will be the default setting of uncacheable. X has code to set up the video memory on the video card as write-combining so it can get better write performance, you could do something similar. 
Setting it as write-back might allow you to get the reads to do bursting as well (since the CPU will do a cache-line fill instead of individual accesses) but if the device is modifying this memory area, then unless you add code to invalidate those cache lines before reading the data, you'll get stale data back. You could run into some other less obvious issues as well, as normally device memory regions are not mapped write-back. In general, especially if you need to read data back from the device, implementing a DMA engine would be by far the better option. Most chipsets seem not at all optimized for handling sequential reads from PCI memory from the CPU. (Even in the DMA case, you have to be careful with what type of memory read transaction you use when transferring from host memory - some chipsets don't like to burst more than one cycle if you use normal Memory Read instead of Memory Read Line or Memory Read Multiple.)
Re: [PATCH 1/5] x86: validate against acpi motherboard resources
Andi Kleen wrote: Yinghai Lu <[EMAIL PROTECTED]> writes: [EMAIL PROTECTED]: many fixes and cleanups] Signed-off-by: Robert Hancock <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Tested-by: Andi Kleen <[EMAIL PROTECTED]> iirc it really was Tested-and-didnt-pass-test-by: Andi Kleen unfortunately. I have not rechecked recently, but on the one Intel box the original patch and the other mcfg heuristics removed didn't work. With just this patch you will have this problem. You need either the patch to disable decode during BAR sizing, or the patch to use MMCONFIG for extended config space only, if you don't have them already.
DMA mapping API on 32-bit X86 with CONFIG_HIGHMEM64G
I was looking at the out-of-tree driver for a PCI high-security module (from a vendor who shall remain nameless) today, as we had a problem reported where the device didn't work properly if the computer had more than 4GB of RAM (this is x86 32-bit, with CONFIG_HIGHMEM64G enabled). Essentially what it was doing was taking some memory that the userspace app was transferring to/from the device, doing get_user_pages on it, and then using the old-style page_to_phys, etc. functions to DMA on that memory instead of the modern DMA API. However, I'm not sure this strategy would have worked on this platform even if it had been using the proper DMA API. This device has 32-bit DMA limits and is transferring userspace buffers which with HIGHMEM64G enabled could easily have physical addresses over 4GB. The strategy that Linux Device Drivers, 3rd Edition (chapter 15) suggests is doing get_user_pages, creating an SG list from the returned pages and then using dma_map_sg on that list. However, essentially all dma_map_sg in include/asm-x86/dma-mapping_32.h is: for_each_sg(sglist, sg, nents, i) { BUG_ON(!sg_page(sg)); sg->dma_address = sg_phys(sg); } which does nothing to ensure that the returned physical address is within the device's DMA mask. On 64-bit this triggers IOMMU mapping but on 32-bit it doesn't seem like this case is handled at all. I believe the block and networking layers have their own ways of ensuring that they don't feed such buffers to their drivers if they can't handle it, but a basic character device driver is kind of left out in the cold here and the DMA API doesn't appear to work as documented in this case. Given that x86-32 kernels don't implement any IOMMU support I'm not sure what it actually could do, other than implementing some kind of software bounce buffering of its own.. Are there any in-tree drivers that use this DMA mapping on get_user_pages strategy that could be affected by this? 
I think the get_user_pages trick is actually pretty silly in this case; the size of the data being transferred is likely such that it would be just as fast or faster to copy to a kernel buffer and DMA to/from there.
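The missing check can be pictured with a small userspace sketch. This is a model only, not kernel code: dma_range_fits() and DMA_MASK_32BIT are hypothetical names introduced here, and the assumption is simply that a mapping routine would need a test of this kind before handing a physical address to a device with a 32-bit DMA mask, bouncing the buffer when the test fails.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: the address range a 32-bit-only DMA device can reach. */
#define DMA_MASK_32BIT 0xffffffffULL

/*
 * Return 1 if the whole buffer [phys, phys + len) is addressable under
 * the given DMA mask, 0 if some part of it lies above the mask and the
 * buffer would need to be bounced. Written to avoid overflow in
 * phys + len.
 */
static int dma_range_fits(uint64_t phys, size_t len, uint64_t mask)
{
    if (len == 0)
        return 1;
    if (phys > mask)
        return 0;
    return (uint64_t)(len - 1) <= mask - phys;
}
```

Under this model a user page at physical address 0x100000000 (just above 4GB, possible with HIGHMEM64G) fails the test for a 32-bit mask, as does a buffer that starts below 4GB but runs across the boundary.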
Re: [PATCH] Avoid buffer overflows in get_user_pages()
Nick Piggin wrote: On Tuesday 12 February 2008 10:17, Jonathan Corbet wrote: Avoid buffer overflows in get_user_pages() So I spent a while pounding my head against my monitor trying to figure out the vmsplice() vulnerability - how could a failure to check for *read* access turn into a root exploit? It turns out that it's a buffer overflow problem which is made easy by the way get_user_pages() is coded. In particular, "len" is a signed int, and it is only checked at the *end* of a do {} while() loop. So, if it is passed in as zero, the loop will execute once and decrement len to -1. At that point, the loop will proceed until the next invalid address is found; in the process, it will likely overflow the pages array passed in to get_user_pages(). I think that, if get_user_pages() has been asked to grab zero pages, that's what it should do. Thus this patch; it is, among other things, enough to block the (already fixed) root exploit and any others which might be lurking in similar code. I also think that the number of pages should be unsigned, but changing the prototype of this function probably requires some more careful review. Signed-off-by: Jonathan Corbet <[EMAIL PROTECTED]> diff --git a/mm/memory.c b/mm/memory.c index e5628a5..7f50fd8 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -989,6 +989,8 @@ int get_user_pages(struct task_struct *tsk, struct mm_struct *mm, int i; unsigned int vm_flags; + if (len <= 0) + return 0; BUG_ON()? Well, not if the code involved in the exploit can pass a zero value, otherwise it's just turning it into a DoS.
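The loop shape Jonathan describes can be reproduced in a few lines of userspace C. This is a simplified model under stated assumptions, not the kernel's get_user_pages(): model_gup() is a hypothetical name, "valid" stands in for the number of consecutive mapped pages starting at the requested address, and "guard" switches the check added by the patch on or off.

```c
/*
 * Userspace model of a do {} while (len) loop with a signed counter.
 * Returns how many entries the caller's pages[] array would receive.
 */
static int model_gup(int len, int valid, int guard)
{
    int stored = 0;   /* entries that would be written to pages[] */
    int next = 0;     /* index of the next page to look up */

    if (guard && len <= 0)
        return 0;     /* the check the patch adds */

    do {
        if (next >= valid)
            break;    /* next address is invalid: the walk stops here */
        stored++;     /* the kernel would store a page pointer here */
        next++;
        len--;
    } while (len);    /* len is only tested after the decrement */

    return stored;
}
```

In this model a request for zero pages without the guard decrements len to -1 on the first pass and then keeps going until the mapped range runs out, so the result depends on the address space rather than on the size of the caller's array. With the guard the same request returns 0, and positive requests behave identically either way.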
Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
Yinghai Lu wrote: On Feb 10, 2008 12:45 PM, Matthew Wilcox <[EMAIL PROTECTED]> wrote: On Sun, Feb 10, 2008 at 12:24:18PM -0800, Linus Torvalds wrote: On Sun, 10 Feb 2008, Yinghai Lu wrote: I suggest Ivan's patch be merged ASAP as it actually fixes bugs. This patch is just cleanup (and takes care of some future concerns). your patch and Ivan's patch should be merged in one... I really don't care whether they get merged as one or separately, but I think it should be merged _now_ (-rc1 is already delayed), and I'd like to see the final versions of both. Does anybody have them in a final agreed-upon format (preferably with that oddness in quirk_intel_irqbalance also fixed?) I just looked at fixing that -- the reason seems to be that we don't actually have the struct pci_dev at that point. I can fix it, but I think it's actually buggy. I want to look at some chipset docs to confirm that though. I've attached the two patches that I believe are the ones we want. We can (and should) fix quirk_intel_irqbalance separately. Andrew, those two patches just got into Linus's 2.6.25-rc1. I assume that you will drop gregkh-pci-pci-make-pci-extended-config-space-a-driver-opt-in.patch in -mm. please check some updated patches in -mm that could be affected. hope it could save you some time x86-validate-against-acpi-motherboard-resources.patch x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch x86-mmconf-enable-mcfg-early.patch x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch I don't think any of these patches are affected. They all affect whether to use MMCONFIG globally or not, regardless of whether or not particular accesses will use it.
Re: pata_sil680 regression 2.6.22->2.6.24
Robert Lowery wrote: Hi Folks, Having recently upgaded my Ubuntu install from Gutsy to Hardy, my 750GB Seagate disk connected via a SiI680 PCI card is no longer detected. I suspect this is caused by the MMIO changes in 2.6.24. Strangely in 2.6.22 the drive appears as sda1, but on 2.6.24 it appears as a non functioning hde. A working 2.6.22 based system reports Feb 10 01:49:22 myth-backend kernel: [ 88.296373] sil680: BA5_EN = 1 clock = 00 Feb 10 01:49:22 myth-backend kernel: [ 88.296401] sil680: BA5_EN = 1 clock = 10 Feb 10 01:49:22 myth-backend kernel: [ 88.296479] sil680: 133MHz clock. Feb 10 01:49:22 myth-backend kernel: [ 88.296672] ACPI: PCI Interrupt :00:0d.0[A] -> GSI 19 (level, low) -> IRQ 19 Feb 10 01:49:22 myth-backend kernel: [ 88.297584] scsi2 : pata_sil680 Feb 10 01:49:22 myth-backend kernel: [ 88.298142] scsi3 : pata_sil680 Feb 10 01:49:22 myth-backend kernel: [ 88.298398] ata1: PATA max UDMA/133 cmd 0x00019000 ctl 0x00018802 bmdma 0x00017800 irq 19 Feb 10 01:49:22 myth-backend kernel: [ 88.298409] ata2: PATA max UDMA/133 cmd 0x00018400 ctl 0x00018002 bmdma 0x00017808 irq 19 Feb 10 01:49:22 myth-backend kernel: [ 88.530102] ata1.00: ATA-7: ST3750640A, 3.AAE, max UDMA/100 Feb 10 01:49:22 myth-backend kernel: [ 88.530139] ata1.00: 1465149168 sectors, multi 0: LBA48 Feb 10 01:49:22 myth-backend kernel: [ 88.604735] ata1.00: configured for UDMA/100 A non working 2.6.24 based system reports Feb 9 16:19:28 myth-backend kernel: [ 81.149973] SiI680: IDE controller (0x1095:0x0680 rev 0x02) at PCI slot :00:0d.0 Feb 9 16:19:28 myth-backend kernel: [ 81.150031] ACPI: PCI Interrupt :00:0d.0[A] -> GSI 19 (level, low) -> IRQ 19 Feb 9 16:19:28 myth-backend kernel: [ 81.150094] SiI680: BASE CLOCK == 133 Feb 9 16:19:28 myth-backend kernel: [ 81.150104] SiI680: 100% native mode on irq 19 Feb 9 16:19:28 myth-backend kernel: [ 81.150121] ide2: MMIO-DMA , BIOS settings: hde:pio, hdf:pio Feb 9 16:19:28 myth-backend kernel: [ 81.150145] ide3: MMIO-DMA , BIOS settings: 
hdg:pio, hdh:pio Feb 9 16:19:28 myth-backend kernel: [ 82.113860] hde: þþÿþ`KÐ, ATA DISK drive Feb 9 16:19:28 myth-backend kernel: [ 82.114038] ide2 at 0xf882a080-0xf882a087,0xf882a08a on irq 19 This is not pata_sil680, this is drivers/ide trying to operate this card. Has something changed in your kernel config? Note, my bios does not support 750GB drives, so the drive is not configured in the card BIOS. This has not stopped linux detecting it ok in the past. Please let me know if I need to provide any more information or additional testing. Thanks -Rob
Re: Scheduler(?) regression from 2.6.22 to 2.6.24 for short-lived threads
Olof Johansson wrote: Hi, I ended up with a customer benchmark in my lap this week that doesn't do well on recent kernels. :( After cutting it down to a simple testcase/microbenchmark, it seems like recent kernels don't do as well with short-lived threads competing with the thread it's cloned off of. The CFS scheduler changes come to mind, but I suppose it could be caused by something else as well. The pared-down testcase is included below. Reported runtime for the testcase has increased almost 3x between 2.6.22 and 2.6.24: 2.6.22: 3332 ms 2.6.23: 4397 ms 2.6.24: 8953 ms 2.6.24-git19: 8986 ms While running, it'll fork off a bunch of threads, each doing just a little work, then busy-waiting on the original thread to finish as well. Yes, it's incredibly stupidly coded but that's not my point here. During run, (runtime 10s on my 1.5GHz Core2 Duo laptop), vmstat 2 shows: 0 0 0 115196 364748 224839600 0 0 163 89 0 0 100 0 2 0 0 115172 364748 224839600 0 0 270 178 24 0 76 0 2 0 0 115172 364748 224839600 0 0 402 283 52 0 48 0 2 0 0 115180 364748 224839600 0 0 402 281 50 0 50 0 2 0 0 115180 364764 224839600 022 403 295 51 0 48 1 2 0 0 115056 364764 224839600 0 0 399 280 52 0 48 0 0 0 0 115196 364764 224839600 0 0 241 141 17 0 83 0 0 0 0 115196 364768 224839600 0 2 155 67 0 0 100 0 0 0 0 115196 364768 224839600 0 0 148 62 0 0 100 0 I.e. runqueue is 2, but only one cpu is busy. However, this still seems true on the kernel that runs the testcase in more reasonable time. Also, 'time' reports real and user time roughly the same on all kernels, so it's not that the older kernels are better at spreading out the load between the two cores (either that or it doesn't account for stuff right). I've included the config files, runtime output and vmstat output at http://lixom.net/~olof/threadtest/. I see similar behaviour on PPC as well as x86, so it's not architecture-specific. Testcase below. 
Yes, I know, there's a bunch of stuff that could be done differently and better, but it still doesn't explain why there's a 3x slowdown between kernel versions... I would say that something coded this bizarrely is really an application bug and not something that one could call a kernel regression. Any change in how the parent and child threads get scheduled will have a huge impact on this test. I bet if you replace that busy wait with a pthread_cond_wait or something similar, this problem goes away. Hopefully it doesn't have to be pointed out that spawning off threads to do so little work before terminating is inefficient; a thread pool or even just a single thread would likely do a much better job.
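For reference, the general shape of a condition-variable wait in place of a busy wait looks like the sketch below. It is not a rewrite of Olof's testcase (which is not shown here): struct done_flag, worker() and wait_for_worker() are hypothetical names, and the example only shows one thread sleeping until another signals completion instead of spinning on a shared variable.

```c
#include <pthread.h>
#include <stddef.h>

/* Completion flag protected by a mutex and signalled via a condvar. */
struct done_flag {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             done;
};

/* The thread doing the work: set the flag and wake the waiter. */
static void *worker(void *arg)
{
    struct done_flag *f = arg;

    pthread_mutex_lock(&f->lock);
    f->done = 1;
    pthread_cond_signal(&f->cond);
    pthread_mutex_unlock(&f->lock);
    return NULL;
}

/* Start one worker and sleep until it reports completion.
 * Returns the observed flag value, or -1 if the thread could not start. */
static int wait_for_worker(void)
{
    struct done_flag f = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
    };
    pthread_t tid;
    int seen;

    if (pthread_create(&tid, NULL, worker, &f) != 0)
        return -1;

    pthread_mutex_lock(&f.lock);
    while (!f.done)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&f.cond, &f.lock);
    seen = f.done;
    pthread_mutex_unlock(&f.lock);

    pthread_join(tid, NULL);
    return seen;
}
```

The waiting thread is off the runqueue while it sleeps, so it no longer competes for CPU time with the thread it is waiting on, which is the scheduling interaction the busy wait makes the test so sensitive to.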
[PATCH] sata_nv: fix ATAPI issues with memory over 4GB (v7)
This fixes some problems with ATAPI devices on nForce4 controllers in ADMA mode on systems with memory located above 4GB. We need to delay setting the 64-bit DMA mask until the PRD table and padding buffer are allocated so that they don't get allocated above 4GB and break legacy mode (which is needed for ATAPI devices). Also, if either port is in ATAPI mode we need to set the DMA mask for the PCI device to 32-bit to ensure that the IOMMU code properly bounces requests above 4GB, as it appears setting the bounce limit does not guarantee that we will not try to map requests above this point. Reported to fix https://bugzilla.redhat.com/show_bug.cgi?id=351451 Signed-off-by: Robert Hancock <[EMAIL PROTECTED]> --- linux-2.6.24/drivers/ata/sata_nv.c 2008-01-24 16:58:37.0 -0600 +++ linux-2.6.24edit/drivers/ata/sata_nv.c 2008-01-29 18:39:37.0 -0600 @@ -247,6 +247,7 @@ struct nv_adma_port_priv { void __iomem*ctl_block; void __iomem*gen_block; void __iomem*notifier_clear_block; + u64 adma_dma_mask; u8 flags; int last_issue_ncq; }; @@ -715,9 +716,10 @@ static int nv_adma_slave_config(struct s { struct ata_port *ap = ata_shost_to_port(sdev->host); struct nv_adma_port_priv *pp = ap->private_data; + struct nv_adma_port_priv *port0, *port1; + struct scsi_device *sdev0, *sdev1; struct pci_dev *pdev = to_pci_dev(ap->host->dev); - u64 bounce_limit; - unsigned long segment_boundary; + unsigned long segment_boundary, flags; unsigned short sg_tablesize; int rc; int adma_enable; @@ -729,6 +731,8 @@ static int nv_adma_slave_config(struct s /* Not a proper libata device, ignore */ return rc; + spin_lock_irqsave(ap->lock, flags); + if (ap->link.device[sdev->id].class == ATA_DEV_ATAPI) { /* * NVIDIA reports that ADMA mode does not support ATAPI commands. @@ -737,7 +741,6 @@ static int nv_adma_slave_config(struct s * Restrict DMA parameters as required by the legacy interface * when an ATAPI device is connected. 
*/ - bounce_limit = ATA_DMA_MASK; segment_boundary = ATA_DMA_BOUNDARY; /* Subtract 1 since an extra entry may be needed for padding, see libata-scsi.c */ @@ -748,7 +751,6 @@ static int nv_adma_slave_config(struct s adma_enable = 0; nv_adma_register_mode(ap); } else { - bounce_limit = *ap->dev->dma_mask; segment_boundary = NV_ADMA_DMA_BOUNDARY; sg_tablesize = NV_ADMA_SGTBL_TOTAL_LEN; adma_enable = 1; @@ -774,12 +776,49 @@ static int nv_adma_slave_config(struct s if (current_reg != new_reg) pci_write_config_dword(pdev, NV_MCP_SATA_CFG_20, new_reg); - blk_queue_bounce_limit(sdev->request_queue, bounce_limit); + port0 = ap->host->ports[0]->private_data; + port1 = ap->host->ports[1]->private_data; + sdev0 = ap->host->ports[0]->link.device[0].sdev; + sdev1 = ap->host->ports[1]->link.device[0].sdev; + if ((port0->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) || + (port1->flags & NV_ADMA_ATAPI_SETUP_COMPLETE)) { + /** We have to set the DMA mask to 32-bit if either port is in + ATAPI mode, since they are on the same PCI device which is + used for DMA mapping. If we set the mask we also need to set + the bounce limit on both ports to ensure that the block + layer doesn't feed addresses that cause DMA mapping to + choke. If either SCSI device is not allocated yet, it's OK + since that port will discover its correct setting when it + does get allocated. + Note: Setting 32-bit mask should not fail. */ + if (sdev0) + blk_queue_bounce_limit(sdev0->request_queue, + ATA_DMA_MASK); + if (sdev1) + blk_queue_bounce_limit(sdev1->request_queue, + ATA_DMA_MASK); + + pci_set_dma_mask(pdev, ATA_DMA_MASK); + } else { + /** This shouldn't fail as it was set to this value before */ + pci_set_dma_mask(pdev, pp->adma_dma_mask); + if (sdev0) + blk_queue_bounce_limit(sdev0->request_queue, + pp->adma_dma_mask); + if (sdev1) +
Re: a7839e96 (PNP: increase max resources) breaks my ALSA intel8x0 sound card
Andrew Morton wrote: There was a patch floating around to ignore PnPACPI reservations which conflict with PCI BARs, which appears to be what's happening in this case. That patch originally worked for any board, but was later made specific to a certain Supermicro motherboard which had the sata_nv controller MMIO regions marked as reserved, preventing the driver from loading. We may need a more general solution. See: https://bugzilla.redhat.com/show_bug.cgi?id=313491 Thanks. If we were to remove the supermicro-specificity, would this be a sufficiently general solution? I think so. There was one objection that it introduced a dependency on pnpacpi loading after PCI bus enumeration, though. Linus also suggested that pnpacpi could be marking the resources as "present but unused" so that drivers can request those regions but we still prevent dynamically assigning resources into them.
Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM
Alexander wrote: Hello! The problem described at https://bugzilla.redhat.com/show_bug.cgi?id=351451 and at http://ubuntuforums.org/showthread.php?t=655772 and supposedly fixed by the patch http://kerneltrap.org/mailarchive/linux-kernel/2007/11/25/445094 is still there. I have compiled 2.6.24-rc7 kernel and booted my PC with it just to find out that my SATA DVD-RW is sr0: scsi3-mmc drive: 0x/0x caddy as it was before with 2.6.23.12 and earlier 2.6 kernels compiled for x86_64. Trying to use sr0 after this results in dead hang or reboot. When I put sata_nv.adma=0 or mem=4096M then it's all ok: Can you (or others experiencing this problem) test the latest patch attached to the RH Bugzilla entry here: https://bugzilla.redhat.com/show_bug.cgi?id=351451 and see if it resolves the problem? I have one report of success so far.
Re: The hardware reports a non fatal, correctable incident occurred on CPU 0
Badalian Vyacheslav wrote: Hello all. Can anyone tell me whether these messages are normal? =) [63617.120342] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0. [63617.120353] Bank 3: cc100100 [63632.092712] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0. [63632.092723] Bank 3: cc100100 [63647.065081] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0. [63647.065091] Bank 3: cc100100 [63662.037453] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0. [63662.037463] Bank 3: cc100100 Or do I need to send you my lspci output so that my motherboard or some chip can be added to a blacklist? I understand that MCE is Machine Check Exception. =) What kind of hardware is this? You likely have some bad RAM or a bad CPU.
Re: intel ahci problem
Evgen L wrote: Hi all, I have a problem with my Intel SR1550 server (S5000PAL motherboard, SATA/SAS controller, 5 SATA HDD Seagate ST9120822AS). Four of the drives are in two md raid1 arrays, which are striped by lvm, and one drive is used separately. I have problems like the one below with two different drives (ata3 or ata4) and ata5. I see this problem with the RedHat 2.6.18-53 kernel, 2.6.24-rc8, and today 2.6.24-git5. I have read about problems like this on lkml.org. Maybe this message will help to fix the issue. The complete dmesg is attached. md: bind RAID1 conf printout: --- wd:1 rd:2 disk 0, wo:0, o:1, dev:sdc1 disk 1, wo:1, o:1, dev:sdd1 md: recovery of RAID array md2 md: minimum _guaranteed_ speed: 1000 KB/sec/disk. md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for recovery. md: using 128k window, over a total of 117218176 blocks. ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata4.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) ata4.00: status: { DRDY } ata4: hard resetting link ata4: port is slow to respond, please be patient (Status 0x80) ata4: softreset failed (device not ready) ata4: hard resetting link ata4: port is slow to respond, please be patient (Status 0x80) ata4: softreset failed (device not ready) ata4: hard resetting link ata4: port is slow to respond, please be patient (Status 0x80) ata4: softreset failed (device not ready) ata4: limiting SATA link speed to 1.5 Gbps ata4: hard resetting link ata4: softreset failed (device not ready) ata4: reset failed, giving up ata4.00: disabled Are you sure this is not a hardware problem (bad disk, cable, etc?) Is it always ata4 that fails?
Re: Help Needed Reading / Polling the ATA Status Register
[EMAIL PROTECTED] wrote: I am currently writing some code to send some ATA commands directly to the drive using ioctl and SG_IO which seems to work fine. However I also need to read the ATA status register values in real time which I am unsure how to do. I have seen in the libata developers guide the following functions to read the status and alternative status registers:- u8 (*check_status)(struct ata_port *ap); u8 (*check_altstatus)(struct ata_port *ap); u8 (*check_err)(struct ata_port *ap); However I have no idea how to access these functions directly or even if this is the best way to go about it. Any help at all would be much appreciated. First question is why you need to poll the status register, since quite likely there is a better solution to whatever that reason is.
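If the goal is only to see the status register value at the end of each command (not to poll it while a command is in flight), one possibility is to take it from the sense data that comes back with an SG_IO request. The sketch below assumes the command was sent as ATA PASS-THROUGH with CK_COND set, so that descriptor-format sense data with an ATA Status Return descriptor (code 0x09, as laid out in the SAT standard) is returned. ata_status_from_sense() is a hypothetical helper name, and whether a given kernel and driver actually fill in this descriptor is an assumption that would need checking.

```c
#include <stddef.h>

/*
 * Scan descriptor-format sense data (response code 0x72 or 0x73) for
 * an ATA Status Return descriptor and return the ATA status byte.
 * Returns -1 if the sense data is in another format or the
 * descriptor is not present.
 */
static int ata_status_from_sense(const unsigned char *sense, size_t len)
{
    size_t pos = 8;   /* descriptors start after the 8-byte header */
    size_t end;

    if (len < 8)
        return -1;
    if ((sense[0] & 0x7f) != 0x72 && (sense[0] & 0x7f) != 0x73)
        return -1;

    end = 8 + (size_t)sense[7];   /* byte 7: additional sense length */
    if (end > len)
        end = len;

    while (pos + 2 <= end) {
        size_t dlen = (size_t)sense[pos + 1] + 2;   /* whole descriptor */

        if (pos + dlen > end)
            break;                                  /* truncated */
        if (sense[pos] == 0x09 && dlen >= 14)
            return sense[pos + 13];                 /* STATUS field */
        pos += dlen;
    }
    return -1;
}
```

The buffer passed in would be the one supplied as sbp in struct sg_io_hdr, with sb_len_wr giving the number of valid bytes. The error register sits at offset 3 of the same descriptor if that is needed as well.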