Re: [PATCH 12/12] watchdog: sp5100_tco: Add support for recent FCH versions
I'm trying to remove non-ASCII chars from the mail body in the hope that it reaches the lists... My ISP still adds that X-Spam-Report: header, which quotes a large part of the mail body without MIME-encoding it.

On 2018-01-04 20:21, Guenter Roeck wrote:
> On Thu, Jan 04, 2018 at 01:01:22PM +0100, Boszormenyi Zoltan wrote:
>> On 2017-12-24 22:04, Guenter Roeck wrote:
>>> Starting with Family 16h Models 30h-3Fh and Family 15h Models 60h-6Fh,
>>> watchdog address space decoding has changed. The cutover point is
>>> already identified in the i2c-piix2 driver, so use the same mechanism.
>>
>> "i2c-piix4".
>
> Thanks!
>
>> Otherwise, I only have an older AMD FX CPU, so I can only test
>> whether it is not broken there.
>
> That is actually the important test. I tested myself on Ryzen 1700X.

I was able to test on a Kabini APU at work; my AMD FX at home still needs to be tested. The driver loads properly:

[5.620836] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
[5.621002] sp5100-tco sp5100-tco: Using 0xfed80b00 for watchdog MMIO address
[5.621611] sp5100-tco sp5100-tco: initialized. heartbeat=60 sec (nowayout=1)

echo "1" >/dev/watchdog

rebooted the machine after one minute properly.

You can add my Tested-by: line.

Best regards,
Zoltan Boszormenyi
Re: [PATCH] HID: core: assign usbhid to handle EETI PID=0x0001 HID device
Hi,

On 2017-08-22 11:33, Benjamin Tissoires wrote:
> On Aug 11 2017 or thereabouts, JamChen wrote:
>> From: Jam Chen
>>
>> The vendor used the same PID(0x0001) for multiple touch IC controllers.
>> The newer ICs can support HID class and report the multitouch collection
>> in the descriptor. So they were handled by the hid-multitouch driver.
>> But some customized firmwares don't support multitouch protocol even if
>> driver have got the Win8 blob data. Actually, those ICs only support the
>> single touch function, and report the mouse protocol by default.
>> We can assign usbhid to handle them all.
>>
>> Signed-off-by: Jam Chen
>> ---
>
> Hi,
>
> FYI, I'd rather see a full working solution such as the one presented here:
> https://patchwork.kernel.org/patch/9876649/
>
> Because this solution is half working as it regresses on some devices
> while solving others.
>
> Cheers,
> Benjamin

is there any news about resolving this issue in the upstream kernel?

Thanks in advance,
Zoltán Böszörményi

>>  drivers/hid/hid-core.c | 4
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
>> index 9017dcc14502..df4696022488 100644
>> --- a/drivers/hid/hid-core.c
>> +++ b/drivers/hid/hid-core.c
>> @@ -828,6 +828,10 @@ static int hid_scan_report(struct hid_device *hid)
>>  			 */
>>  			hid->group = HID_GROUP_RMI;
>>  		break;
>> +	case USB_VENDOR_ID_DWAV:
>> +		if (hid->product == USB_DEVICE_ID_EGALAX_TOUCHCONTROLLER)
>> +			hid->group = HID_GROUP_GENERIC;
>> +		break;
>>  	}
>>
>>  	/* fall back to generic driver in case specific driver doesn't exist */
>> --
>> 2.11.0
Re: [PATCH] HID: core: assign usbhid to handle EETI PID=0x0001 HID device
Hi,

it's a funny thing. Currently I can't reproduce the problem.

This was the situation that occurred previously:

At the first touch after X started, the pointer went to the correct location. Moving the finger made the pointer move with the finger, and the pointer location was always good.

Then I lifted the finger and touched a different location. Moving the finger made the pointer move with the finger, but the pointer didn't match the finger location: the difference was the vector between the last location in the first step and the first location in this second step.

I was thinking that since the touchscreen is demoted to be a mouse, this is the expected behaviour, since only relative motion is processed. This was not a panel orientation problem, which can be easily solved by calibration.

But I had just upgraded the system from libevdev 1.4.6 to 1.5.7 and from libinput 1.5.0 to 1.8.1, and it is possible that something still referenced the old library versions even after restarting Xorg. It's also possible that the ordering of the input devices influences things.

From dmesg, there's no difference between boots:

[3.718370] usb 1-1.1.4: new full-speed USB device number 8 using ehci-pci
[3.810812] usb 1-1.1.4: New USB device found, idVendor=0eef, idProduct=0001
[3.810815] usb 1-1.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3.810818] usb 1-1.1.4: Product: USB TouchController
[3.810819] usb 1-1.1.4: Manufacturer: eGalax Inc.
[3.815502] input: eGalax Inc. USB TouchController as /devices/pci:00/:00:1d.0/usb1/1-1/1-1.1/1-1.1.4/1-1.1.4:1.0/0003:0EEF:0001.0003/input/input8
[3.815712] input: eGalax Inc. USB TouchController as /devices/pci:00/:00:1d.0/usb1/1-1/1-1.1/1-1.1.4/1-1.1.4:1.0/0003:0EEF:0001.0003/input/input9
[3.815888] input: eGalax Inc. USB TouchController as /devices/pci:00/:00:1d.0/usb1/1-1/1-1.1/1-1.1.4/1-1.1.4:1.0/0003:0EEF:0001.0003/input/input10
[3.816008] input: eGalax Inc. USB TouchController as /devices/pci:00/:00:1d.0/usb1/1-1/1-1.1/1-1.1.4/1-1.1.4:1.0/0003:0EEF:0001.0003/input/input11
[3.816297] hid-generic 0003:0EEF:0001.0003: input,hiddev96,hidraw2: USB HID v1.00 Pointer [eGalax Inc. USB TouchController] on usb-:00:1d.0-1.1.4/input0

According to evtest, the 4 devices provide these features:

input8: EV_SYN, EV_ABS (ABS_X/ABS_Y), EV_KEY (BTN_LEFT/BTN_RIGHT), EV_MSC (MSC_SCAN)
input9: EV_SYN, EV_ABS (ABS_X/ABS_Y), EV_KEY (BTN_TOOL_PEN/BTN_TOUCH), EV_MSC (MSC_SCAN)
input10: EV_SYN
input11: EV_SYN, EV_ABS (ABS_X/ABS_Y/ABS_MISC), EV_KEY (BTN_TOOL_FINGER/BTN_TOUCH), EV_MSC (MSC_SCAN)

Now that it's working, the first of the four devices emits events, the one with the mouse buttons. The pointer behaves perfectly.

I remember that when it wasn't working properly, some of the other devices emitted the events and produced the behaviour described above.

It's probably a different kernel bug, or the old libevdev/libinput were doing something wrong that made the input devices work differently.

I have re-tested after downgrading to the old libevdev / libinput libraries but I still couldn't reproduce the problem. Maybe it's gremlins in the machine.

Sorry for the noise.

Best regards,
Zoltán Böszörményi

On 2017-08-17 13:08, JiaMing Chen wrote:
> Hi Zoltán,
>
> Is it the panel orientation issue?
> If you run the position calibration using the tslib, can it be fixed?
>
> Best regards,
> Jam Chen
>
> 2017-08-15 17:43 GMT+08:00 Boszormenyi Zoltan <zbos...@pr.hu>:
>> Hi,
>>
>> On 2017-08-11 09:42, JamChen wrote:
>>> From: Jam Chen <jam.chen.ega...@gmail.com>
>>>
>>> The vendor used the same PID(0x0001) for multiple touch IC controllers.
>>> The newer ICs can support HID class and report the multitouch collection
>>> in the descriptor. So they were handled by the hid-multitouch driver.
>>> But some customized firmwares don't support multitouch protocol even if
>>> driver have got the Win8 blob data. Actually, those ICs only support the
>>> single touch function, and report the mouse protocol by default.
>>> We can assign usbhid to handle them all.
>>>
>>> Signed-off-by: Jam Chen <jam.chen.ega...@gmail.com>
>>> ---
>>>  drivers/hid/hid-core.c | 4
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
>>> index 9017dcc14502..df4696022488 100644
>>> --- a/drivers/hid/hid-core.c
>>> +++ b/drivers/hid/hid-core.c
>>> @@ -828,6 +828,10 @@ static int hid_scan_report(struct hid_device *hid)
>>>  			 */
>>>  			hid->group = HID_GROUP_RMI;
>>>  		break;
>>> +	case USB_VENDOR_ID_DWAV:
>>> +		if (hid->produc
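The displacement described in the message above is what one would expect if touch samples were consumed as relative motion only. The following is a minimal userspace sketch of that hypothesis, not kernel or libinput code; `struct point`, `struct pointer`, `feed_stroke_relative()` and `pointer_error()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2D point; not taken from any kernel or libinput header. */
struct point {
	int x, y;
};

/* Hypothetical pointer state of a consumer that only applies deltas. */
struct pointer {
	struct point pos;
	int initialised;	/* set once the very first touch placed the pointer */
};

/*
 * Feed one stroke (finger down ... finger up) to a relative-only consumer.
 * Only the very first sample ever seen positions the pointer absolutely;
 * afterwards just the motion inside each stroke is applied, so the jump
 * between two strokes is lost.
 */
void feed_stroke_relative(struct pointer *p, const struct point *s, size_t n)
{
	assert(n > 0);

	if (!p->initialised) {
		p->pos = s[0];
		p->initialised = 1;
	}

	for (size_t i = 1; i < n; i++) {
		p->pos.x += s[i].x - s[i - 1].x;
		p->pos.y += s[i].y - s[i - 1].y;
	}
}

/* Difference between the pointer position and the real finger position. */
struct point pointer_error(const struct pointer *p, struct point finger)
{
	struct point e = { p->pos.x - finger.x, p->pos.y - finger.y };

	return e;
}
```

With a first stroke from (100,100) to (150,120) and a second one from (400,300) to (420,310), the pointer ends at (170,130) while the finger is at (420,310). The error is (-250,-180), which is exactly the negated vector between the end of the first stroke and the start of the second.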
ACPI: IRQ x override to edge, high
Hi,

on two different Intel based POS machines where UARTs are crucial, I see such messages (4.11.7 now, seen with older kernels, too):

[0.919239] ACPI: IRQ 4 override to edge, high
[0.919334] pnp 00:02: Plug and Play ACPI device, IDs PNP0501 (active)
[0.919697] ACPI: IRQ 3 override to edge, high
[0.919784] pnp 00:03: Plug and Play ACPI device, IDs PNP0501 (active)
[0.920154] ACPI: IRQ 11 override to edge, high
[0.920247] pnp 00:04: Plug and Play ACPI device, IDs PNP0501 (active)
[0.920449] pnp 00:05: Plug and Play ACPI device, IDs PNP0501 (disabled)
[0.920811] ACPI: IRQ 5 override to edge, high
[0.920908] pnp 00:06: Plug and Play ACPI device, IDs PNP0501 (active)
[0.921272] ACPI: IRQ 6 override to edge, high
[0.921363] pnp 00:07: Plug and Play ACPI device, IDs PNP0501 (active)

It seems that the PnP and the ACPI tables are inconsistent and ACPI wins. Because of this, the UART IRQs don't work, write() or tcdrain() stalls.

Is there a kernel parameter that convinces the Linux ACPI to avoid touching the PnP IRQs?

Thanks in advance,
Zoltán Böszörményi
Re: [PATCH 5/5 v4] watchdog: sp5100_tco: Use request_declared_muxed_region()
On 2017-06-22 15:21, Zoltán Böszörményi wrote:

Use the new request_declared_muxed_region() macro to synchronize accesses to the SB800 I/O port pair (0xcd6 / 0xcd7) with the PCI quirk for isochronous USB transfers and with the i2c-piix4 driver.

At the same time, remove the long lifetime request_region() call to reserve these I/O ports, similarly to i2c-piix4 so the code is now uniform across the three drivers.

v1: Started with a common mutex in a C source file.
v2: Referenced the common mutex from drivers/usb/host/pci-quirks.c
v3: Switched to using the new request_declared_muxed_region macro.
v4: Fixed checkpatch.pl warnings and use the new release_declared_region() macro.

Signed-off-by: Zoltán Böszörményi
---
 drivers/watchdog/sp5100_tco.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c
index 028618c..cb42b72 100644
--- a/drivers/watchdog/sp5100_tco.c
+++ b/drivers/watchdog/sp5100_tco.c
@@ -48,7 +48,6 @@
 static u32 tcobase_phys;
 static u32 tco_wdt_fired;
 static void __iomem *tcobase;
-static unsigned int pm_iobase;
 static DEFINE_SPINLOCK(tco_lock);	/* Guards the hardware */
 static unsigned long timer_alive;
 static char tco_expect_close;
@@ -70,6 +69,11 @@ module_param(nowayout, bool, 0);
 MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started."
 		" (default=" __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
 
+/* synchronized access to the I/O port pair */
+static struct resource sp5100_res = DEFINE_RES_IO_NAMED(SB800_IO_PM_INDEX_REG,
+							SP5100_PM_IOPORTS_SIZE,
+							TCO_MODULE_NAME);
+
 /*
  * Some TCO specific functions
  */
@@ -139,6 +143,7 @@ static void tco_timer_enable(void)
 	if (!tco_has_sp5100_reg_layout(sp5100_tco_pci)) {
 		/* For SB800 or later */
 		/* Set the Watchdog timer resolution to 1 sec */
+		request_declared_muxed_region(&sp5100_res);
 		outb(SB800_PM_WATCHDOG_CONFIG, SB800_IO_PM_INDEX_REG);
 		val = inb(SB800_IO_PM_DATA_REG);
 		val |= SB800_PM_WATCHDOG_SECOND_RES;
@@ -150,6 +155,7 @@ static void tco_timer_enable(void)
 		val |= SB800_PCI_WATCHDOG_DECODE_EN;
 		val &= ~SB800_PM_WATCHDOG_DISABLE;
 		outb(val, SB800_IO_PM_DATA_REG);
+		release_declared_region(&sp5100_res);
 	} else {
 		/* For SP5100 or SB7x0 */
 		/* Enable watchdog decode bit */
@@ -164,11 +170,13 @@ static void tco_timer_enable(void)
 				       val);
 
 		/* Enable Watchdog timer and set the resolution to 1 sec */
+		request_declared_muxed_region(&sp5100_res);
 		outb(SP5100_PM_WATCHDOG_CONTROL, SP5100_IO_PM_INDEX_REG);
 		val = inb(SP5100_IO_PM_DATA_REG);
 		val |= SP5100_PM_WATCHDOG_SECOND_RES;
 		val &= ~SP5100_PM_WATCHDOG_DISABLE;
 		outb(val, SP5100_IO_PM_DATA_REG);
+		release_declared_region(&sp5100_res);
 	}
 }
 
@@ -361,16 +369,10 @@ static unsigned char sp5100_tco_setupdevice(void)
 		base_addr = SB800_PM_WATCHDOG_BASE;
 	}
 
-	/* Request the IO ports used by this driver */
-	pm_iobase = SP5100_IO_PM_INDEX_REG;
-	if (!request_region(pm_iobase, SP5100_PM_IOPORTS_SIZE, dev_name)) {
-		pr_err("I/O address 0x%04x already in use\n", pm_iobase);
-		goto exit;
-	}
-
 	/*
 	 * First, Find the watchdog timer MMIO address from indirect I/O.
 	 */
+	request_declared_muxed_region(&sp5100_res);
 	outb(base_addr+3, index_reg);
 	val = inb(data_reg);
 	outb(base_addr+2, index_reg);
@@ -380,6 +382,7 @@ static unsigned char sp5100_tco_setupdevice(void)
 	outb(base_addr+0, index_reg);
 	/* Low three bits of BASE are reserved */
 	val = val << 8 | (inb(data_reg) & 0xf8);
+	release_declared_region(&sp5100_res);
 
 	pr_debug("Got 0x%04x from indirect I/O\n", val);
 
@@ -400,6 +403,7 @@ static unsigned char sp5100_tco_setupdevice(void)
 					      SP5100_SB_RESOURCE_MMIO_BASE, &val);
 		} else {
 			/* Read SBResource_MMIO from AcpiMmioEn(PM_Reg: 24h) */
+			request_declared_muxed_region(&sp5100_res);
 			outb(SB800_PM_ACPI_MMIO_EN+3, SB800_IO_PM_INDEX_REG);
 			val = inb(SB800_IO_PM_DATA_REG);
 			outb(SB800_PM_ACPI_MMIO_EN+2, SB800_IO_PM_INDEX_REG);
@@ -408,6 +412,7 @@ static unsigned char sp5100_tco_setupdevice(void)
 			val = val << 8 | inb(SB800_IO_PM_DATA_REG);
 			outb(SB800_PM_ACPI_MMIO_EN+0, SB800_IO_PM_INDEX_REG);
 			val = val << 8 | inb(SB800_IO_PM_DATA_REG);
+			release_declared_region(&sp5100_res);
 		}
 
 		/* The
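The indirect read that the patch brackets with request/release calls can be tried out in userspace. The sketch below simulates the index/data port pair with a plain array instead of real port I/O; `sim_set_reg()`, `sim_write_index()`, `sim_read_data()` and `read_indirect_base()` are hypothetical names, and the `base_reg + 1` step is an assumption because it falls between the quoted hunks.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated register file behind the index/data port pair (hypothetical). */
static uint8_t fake_pm_regs[256];
static uint8_t fake_index;

/* Test helper: preload one simulated register. */
void sim_set_reg(uint8_t reg, uint8_t value)
{
	fake_pm_regs[reg] = value;
}

/* Userspace stand-in for outb(reg, index_port). */
static void sim_write_index(uint8_t reg)
{
	fake_index = reg;
}

/* Userspace stand-in for inb(data_port). */
static uint8_t sim_read_data(void)
{
	return fake_pm_regs[fake_index];
}

/*
 * Assemble a 32-bit value from four consecutive byte registers, most
 * significant byte first. The low three bits of the lowest byte are
 * masked off with 0xf8, as in the quoted driver code. In the driver the
 * whole sequence must not be interleaved with another user of the same
 * port pair, which is why it sits between the request and release calls.
 */
uint32_t read_indirect_base(uint8_t base_reg)
{
	uint32_t val;

	assert(base_reg <= 252);	/* base_reg + 3 must stay in range */

	sim_write_index((uint8_t)(base_reg + 3));
	val = sim_read_data();
	sim_write_index((uint8_t)(base_reg + 2));
	val = val << 8 | sim_read_data();
	sim_write_index((uint8_t)(base_reg + 1));	/* assumed step */
	val = val << 8 | sim_read_data();
	sim_write_index(base_reg);
	val = val << 8 | (sim_read_data() & 0xf8);

	return val;
}
```

Because every byte needs an index write followed by a data read, a second driver touching the index port in between would make the data read return a byte from the wrong register. That is the interleaving the muxed region is meant to prevent.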
Re: [PATCH 3/5 v4] usb: pci-quirks: Protect the I/O port pair of SB800
On 2017-06-22 15:21, Zoltán Böszörményi wrote:

This patch uses the previously introduced macro called request_declared_muxed_region() to synchronize access to the I/O port pair 0xcd6 / 0xcd7 on SB800.

These I/O ports are also used by i2c-piix4 and sp5100_tco, so synchronization is necessary. The other drivers will also be modified to use the new macro in subsequent patches.

v1: Started with a common mutex in a C source file.
v2: Declared the common mutex in drivers/usb/host/pci-quirks.c instead of in a common C file.
v3: Switched to using the new request_declared_muxed_region macro.
v4: Fixed checkpatch.pl warnings and use the new release_declared_region() macro.

Signed-off-by: Zoltán Böszörményi
---
 drivers/usb/host/pci-quirks.c | 4
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c
index a9a1e4c..593942a 100644
--- a/drivers/usb/host/pci-quirks.c
+++ b/drivers/usb/host/pci-quirks.c
@@ -279,6 +279,8 @@ bool usb_amd_prefetch_quirk(void)
 }
 EXPORT_SYMBOL_GPL(usb_amd_prefetch_quirk);
 
+static struct resource sb800_res = DEFINE_RES_IO_NAMED(0xcd6, 2, "SB800 USB");
+
 /*
  * The hardware normally enables the A-link power management feature, which
  * lets the system lower the power consumption in idle states.
@@ -314,11 +316,13 @@ static void usb_amd_quirk_pll(int disable)
 	if (amd_chipset.sb_type.gen == AMD_CHIPSET_SB800 ||
 			amd_chipset.sb_type.gen == AMD_CHIPSET_HUDSON2 ||
 			amd_chipset.sb_type.gen == AMD_CHIPSET_BOLTON) {
+		request_declared_muxed_region(&sb800_res);
 		outb_p(AB_REG_BAR_LOW, 0xcd6);
 		addr_low = inb_p(0xcd7);
 		outb_p(AB_REG_BAR_HIGH, 0xcd6);
 		addr_high = inb_p(0xcd7);
 		addr = addr_high << 8 | addr_low;
+		release_declared_region(&sb800_res);
 
 		outl_p(0x30, AB_INDX(addr));
 		outl_p(0x40, AB_DATA(addr));
Re: [PATCH 1/5 v2] Extend the request_region() infrastructure
On 2017-06-22 15:21, Zoltán Böszörményi wrote:

Add a new IORESOURCE_ALLOCATED flag that is set automatically when alloc_resource() is used internally in kernel/resource.c; free_resource() now takes this flag into account.

The core of __request_region() was factored out into a new function called __request_declared_region() that takes a struct resource * instead of the (start, n, name) triplet.

These changes allow using statically declared struct resource data coupled with the pre-existing DEFINE_RES_IO_NAMED() static initializer macro. The new macro exploiting __request_declared_region() is request_declared_muxed_region().

v2:
Fixed checkpatch.pl warnings and errors and extended the macro API with request_declared_region() and release_declared_region()
Reversed the order of __request_declared_region and __request_region
Added high level description of the muxed and declared variants of the macros.

Signed-off-by: Zoltán Böszörményi
---
 include/linux/ioport.h | 14 ++
 kernel/resource.c      | 40 +---
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 6230064..6ebcd39 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -52,6 +52,7 @@ struct resource {
 #define IORESOURCE_MEM_64	0x00100000
 #define IORESOURCE_WINDOW	0x00200000	/* forwarded by bridge */
 #define IORESOURCE_MUXED	0x00400000	/* Resource is software muxed */
+#define IORESOURCE_ALLOCATED	0x00800000	/* Resource was allocated */
 #define IORESOURCE_EXT_TYPE_BITS 0x01000000	/* Resource extended types */
 #define IORESOURCE_SYSRAM	0x01000000	/* System RAM (modifier) */
@@ -215,7 +216,14 @@ static inline bool resource_contains(struct resource *r1, struct resource *r2)
 /* Convenience shorthand with allocation */
 #define request_region(start,n,name)		__request_region(&ioport_resource, (start), (n), (name), 0)
+#define request_declared_region(res)		__request_region( \
+						&ioport_resource, \
+						(res), 0)
 #define request_muxed_region(start,n,name)	__request_region(&ioport_resource, (start), (n), (name), IORESOURCE_MUXED)
+#define request_declared_muxed_region(res)	__request_declared_region( \
+						&ioport_resource, \
+						(res), \
+						IORESOURCE_MUXED)
 #define __request_mem_region(start,n,name, excl) __request_region(&ioport_resource, (start), (n), (name), excl)
 #define request_mem_region(start,n,name) __request_region(&ioport_resource, (start), (n), (name), 0)
 #define request_mem_region_exclusive(start,n,name) \
@@ -227,8 +235,14 @@ extern struct resource * __request_region(struct resource *,
 					resource_size_t n,
 					const char *name, int flags);
+extern struct resource *__request_declared_region(struct resource *parent,
+		struct resource *res, int flags);
+
 /* Compatibility cruft */
 #define release_region(start,n)	__release_region(&ioport_resource, (start), (n))
+#define release_declared_region(res)	__release_region(&ioport_resource, \
+						(res)->start, \
+						(res)->end - (res)->start + 1)
 #define release_mem_region(start,n)	__release_region(&ioport_resource, (start), (n))
 extern void __release_region(struct resource *, resource_size_t,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..2be7029 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -184,6 +184,9 @@ static void free_resource(struct resource *res)
 	if (!res)
 		return;
+	if (!(res->flags & IORESOURCE_ALLOCATED))
+		return;
+
 	if (!PageSlab(virt_to_head_page(res))) {
 		spin_lock(&bootmem_resource_lock);
 		res->sibling = bootmem_resource_free;
@@ -210,6 +213,8 @@ static struct resource *alloc_resource(gfp_t flags)
 	else
 		res = kzalloc(sizeof(struct resource), flags);
+	res->flags = IORESOURCE_ALLOCATED;
+
 	return res;
 }
@@ -1110,8 +1115,19 @@ resource_size_t resource_alignment(struct resource *res)
  * the IO flag meanings (busy etc).
  *
  * request_region creates a new busy region.
+ * The resource descriptor is allocated by this function.
+ *
+ * request_declared_region creates a new busy region
+ * described in an existing resource descriptor.
+ *
+ * request_muxed_region creates a new shared busy region.
+ * The resource descriptor is allocated by this function.
+ *
+ * request_declared_muxed_region creates a new shared busy region
+ * described in an existing resource descriptor.
  *
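The effect of the IORESOURCE_ALLOCATED flag can be tried out in userspace. The following is only a simplified model, not kernel code: struct resource_model and the model_* helpers are hypothetical names, calloc()/free() stand in for the kernel allocator, and the NULL check before setting the flag is an addition of this sketch that does not appear in the quoted hunk.

```c
#include <assert.h>
#include <stdlib.h>

/* Flag value as added by the patch; everything else is a simplified model. */
#define IORESOURCE_ALLOCATED 0x00800000UL

/* Hypothetical stand-in for struct resource. */
struct resource_model {
	unsigned long start;
	unsigned long end;	/* inclusive, as in struct resource */
	const char *name;
	unsigned long flags;
};

/* Models alloc_resource(): heap descriptors are tagged as allocated. */
static struct resource_model *model_alloc_resource(void)
{
	struct resource_model *res = calloc(1, sizeof(*res));

	if (res)	/* guard added in this sketch only */
		res->flags = IORESOURCE_ALLOCATED;
	return res;
}

/* Models free_resource(): returns 1 if memory was freed, 0 if skipped. */
static int model_free_resource(struct resource_model *res)
{
	if (!res)
		return 0;
	if (!(res->flags & IORESOURCE_ALLOCATED))
		return 0;	/* declared by the caller: nothing to free */
	free(res);
	return 1;
}

/* Length computed the way release_declared_region() does it. */
static unsigned long model_resource_len(const struct resource_model *res)
{
	return res->end - res->start + 1;
}
```

With a caller-declared descriptor covering ports 0xcd6 to 0xcd7, model_resource_len() yields 2 and model_free_resource() leaves the descriptor alone, while a descriptor obtained from model_alloc_resource() is freed as before.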
Re: [PATCH 0/5 v4] Fix sp5100_tco watchdog driver regression
Hi,

ping for the series. Adding Greg Kroah-Hartman to the cc: list, both for the USB core and stable series maintainership.

On 2017-06-22 15:21, Zoltán Böszörményi wrote:

This patch series fixes a regression introduced by:

commit 2fee61d22e606fc99ade9079fda15fdee83ec33e
Author: Christian Fetzer
Date: Thu Nov 19 20:13:48 2015 +0100

    i2c: piix4: Add support for multiplexed main adapter in SB800

The regression caused sp5100_tco to fail to load:

sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver v0.05
sp5100_tco: PCI Vendor ID: 0x1002, Device ID: 0x4385, Revision ID: 0x42
sp5100_tco: I/O address 0x0cd6 already in use

Notable bugzilla links about this issue:
https://bugzilla.kernel.org/show_bug.cgi?id=170741
https://bugzilla.redhat.com/show_bug.cgi?id=1369269
https://bugzilla.redhat.com/show_bug.cgi?id=1406844

The previous two versions of this patch series introduced a common mutex to synchronize access to the I/O port pair 0xcd6 / 0xcd7 used by the AMD SB800 USB PCI quirk code and the i2c-piix4 and sp5100_tco drivers.

The common mutex was criticized because it introduces an inter-dependency between drivers. This version takes a different approach: it changes the request_muxed_region() semantics and the possible use cases.

The first patch in the series adds a new IORESOURCE_ALLOCATED flag that alloc_resource() sets and free_resource() checks. The core of __request_region() is factored out into a new function that doesn't allocate. With this change, drivers can use the pre-existing DEFINE_RES_IO_NAMED() static initializer macro to declare a struct resource without dynamic allocation (e.g. on the stack) and pass its address to the new __request_declared_region() function. A new macro called request_declared_muxed_region() was added to exploit this functionality. Because of the new IORESOURCE_ALLOCATED resource flag, release_region() can still be called with the old interface (the region start and length) and it won't attempt to free a non-allocated resource. This eliminates one failure case, the one that comes from allocation errors.

The second patch modifies the behaviour of IORESOURCE_MUXED, a.k.a. the request_*muxed_region() macros. When these macros are called, the caller goes to sleep whenever there is a conflicting region, even if the conflicting region did not use the IORESOURCE_MUXED flag. The kernel logs this inconsistent flag usage with KERN_ERR. This change eliminates the second failure case for IORESOURCE_MUXED, so request_muxed_region() can be used like mutex_lock(), i.e. it returns only once it has successfully requested the region.

The last three patches add proper synchronization between the USB PCI quirks code and the i2c-piix4 and sp5100_tco drivers.

The result is that the sp5100_tco driver loads and works again:

sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver v0.05
sp5100_tco: PCI Vendor ID: 0x1002, Device ID: 0x4385, Revision ID: 0x42
sp5100_tco: Using 0xfed80b00 for watchdog MMIO address
sp5100_tco: Last reboot was not triggered by watchdog.
sp5100_tco: initialized (0xba2f4192db00). heartbeat=60 sec (nowayout=0)

Signed-off-by: Zoltán Böszörményi
---
 drivers/i2c/busses/i2c-piix4.c | 41 -
 drivers/usb/host/pci-quirks.c  |  4
 drivers/watchdog/sp5100_tco.c  | 28 +
 include/linux/ioport.h         | 14 +
 kernel/resource.c              | 46 ++
 5 files changed, 88 insertions(+), 45 deletions(-)

The synchronized access to the SB800 I/O ports also seems to have made a rare "disabled by hub (EMI?), re-enabling..." report from the kernel disappear.

Can someone review the series?

Thanks in advance,
Zoltán Böszörményi
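The conflict handling described for the second patch can be written down as a small decision function. This is a userspace sketch of the description only, not the code from the series: enum conflict_action and on_conflict() are hypothetical names, and the sketch assumes that a request made without IORESOURCE_MUXED keeps failing immediately on a conflict, as it did before the series.

```c
#include <assert.h>

#define IORESOURCE_MUXED 0x00400000UL	/* value from include/linux/ioport.h */

/* Hypothetical outcome names, used only in this sketch. */
enum conflict_action {
	CONFLICT_FAIL,		/* give up and return an error to the caller */
	CONFLICT_WAIT,		/* sleep until the region is released */
	CONFLICT_WAIT_AND_LOG	/* sleep, and report the flag mismatch */
};

/* What a request does when it hits an already busy, overlapping region. */
static enum conflict_action
on_conflict(unsigned long request_flags, unsigned long conflict_flags)
{
	if (!(request_flags & IORESOURCE_MUXED))
		return CONFLICT_FAIL;		/* assumed unchanged behaviour */
	if (conflict_flags & IORESOURCE_MUXED)
		return CONFLICT_WAIT;		/* both sides agree on muxing */
	return CONFLICT_WAIT_AND_LOG;		/* owner forgot the MUXED flag */
}
```

The point of the third outcome is that a muxed request never returns a failure: the mismatch is reported so the other driver can be fixed, but the caller still ends up holding the region.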
Re: [Regression] Changes to i2c-piix4.c initialisation prevent loading of sp5100_tco watchdog driver on AMD SB800 chipset
On 2017-04-03 09:59, Boszormenyi Zoltan wrote:

Hi,

On 2017-04-03 08:34, Paul Menzel wrote:

Dear Zoltán,

On Saturday, 2017-04-01 at 12:13 +0200, Boszormenyi Zoltan wrote:

[…] and have split the patch into three pieces now (USB quirks, i2c-piix4 and sp5100_tco) and they were sent to the relevant mailing lists.

Could you please add me to the receiver list of these patches, so that I can test them? Maybe also Christian (the commit author introducing the regression), Tim (bug reporter), and Nehal from AMD? If you uploaded them to the Kernel.org Bugtracker, that’d also work for me.

I did both.

There's a new set of patches sent to all relevant mailing lists and also attached to the kernel bugtracker at https://bugzilla.kernel.org/show_bug.cgi?id=170741

Hopefully this approach will work for everyone.

Best regards,
Zoltán Böszörményi

Thanks,
Paul
Re: KMS question
On 2017-04-13 18:20, Ville Syrjälä wrote:

On Thu, Apr 13, 2017 at 11:37:45AM -0400, Ilia Mirkin wrote:

On Thu, Apr 13, 2017 at 11:36 AM, Alex Deucher wrote:

On Thu, Apr 13, 2017 at 11:03 AM, Boszormenyi Zoltan wrote:

On 2017-04-13 16:05, Alex Deucher wrote:

On Thu, Apr 13, 2017 at 9:03 AM, Boszormenyi Zoltan wrote:

Hi,

how can I disable the behaviour in the KMS drivers that enables all outputs at once? It is very annoying that on a POS machine with a 1024x768 LVDS and an 800x480 secondary monitor (both built-in) the KMS driver wakes up both. Then the framebuffer console and plymouth use both screens, which makes the primary output look very odd: the boot splash uses only its top-left part.

I would like the boot splash to be shown only on the primary output at its full resolution instead of on all outputs using the smallest common rectangle. Is there a kernel command line configuration that achieves this?

The device in question uses the gma500 kernel driver but the same behaviour is observed with the i915 and radeon drivers.

The problem is that fbdev is not multi-head aware. The fbdev emulation in the KMS drivers attempts to light up all monitors so that something shows up on all heads. If you really want different per-head configurations, you need to use the KMS API directly. As a workaround, you can use the kernel command line to disable the output you don't want to be lit up. See https://wiki.archlinux.org/index.php/kernel_mode_setting for more info. Basically, add video=TV-1:d to disable the output in question. Replace TV-1 with whatever connector you want to disable.

I tried adding video=DVI-D-1:d to the kernel command line. The effect is that, while the second output is indeed disabled, the framebuffer console still takes the second output's resolution into account and the boot splash still uses only the top-left 800x480 part of the 1024x768 primary screen.

Also, the secondary screen was disabled in X as well, which is not desired. Can I wake it up under X somehow? This device is using the modesetting DDX driver.

Can you enable it via randr?

I think the video= based disable forces the connector to be disabled irrevocably.

# echo detect > /sys/class/drm//status

Thanks, that worked. I had to regenerate my initramfs to actually include the gma500 driver so KMS can kick in early. Before that, plymouth used its text mode plugin, which doesn't switch dimensions when fbdev takes over.

The plymouth boot splash now looks good on the primary screen with the secondary display disabled from the kernel command line, and I can enable the secondary screen from a boot script before X (a DM) starts. This causes both screens to flash, but it's good enough for me now.

Thanks to everyone who answered.
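For a boot helper that is not a shell script, the same write can be done from C. This is a minimal sketch, assuming the caller passes in the connector's status file path; the connector name is left out in the thread above, so it is not filled in here, and drm_request_detect() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "detect" to a connector status file, as "echo detect > FILE" does.
 * status_path is supplied by the caller; returns 0 on success, -1 on error.
 */
static int drm_request_detect(const char *status_path)
{
	FILE *f = fopen(status_path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs("detect\n", f) >= 0;	/* fputs() returns EOF on failure */
	if (fclose(f) != 0)		/* buffered data is flushed here */
		ok = 0;
	return ok ? 0 : -1;
}
```

Checking the fclose() result matters because the data is only handed to the kernel when the stream is flushed, so a rejected write would otherwise go unnoticed.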
Re: KMS question
On 2017-04-13 17:36, Alex Deucher wrote:

On Thu, Apr 13, 2017 at 11:03 AM, Boszormenyi Zoltan wrote:

On 2017-04-13 16:05, Alex Deucher wrote:

On Thu, Apr 13, 2017 at 9:03 AM, Boszormenyi Zoltan wrote:

Hi,

how can I disable the behaviour in the KMS drivers that enables all outputs at once? It is very annoying that on a POS machine with a 1024x768 LVDS and an 800x480 secondary monitor (both built-in) the KMS driver wakes up both. Then the framebuffer console and plymouth use both screens, which makes the primary output look very odd: the boot splash uses only its top-left part.

I would like the boot splash to be shown only on the primary output at its full resolution instead of on all outputs using the smallest common rectangle. Is there a kernel command line configuration that achieves this?

The device in question uses the gma500 kernel driver but the same behaviour is observed with the i915 and radeon drivers.

The problem is that fbdev is not multi-head aware. The fbdev emulation in the KMS drivers attempts to light up all monitors so that something shows up on all heads. If you really want different per-head configurations, you need to use the KMS API directly. As a workaround, you can use the kernel command line to disable the output you don't want to be lit up. See https://wiki.archlinux.org/index.php/kernel_mode_setting for more info. Basically, add video=TV-1:d to disable the output in question. Replace TV-1 with whatever connector you want to disable.

I tried adding video=DVI-D-1:d to the kernel command line. The effect is that, while the second output is indeed disabled, the framebuffer console still takes the second output's resolution into account and the boot splash still uses only the top-left 800x480 part of the 1024x768 primary screen.

Also, the secondary screen was disabled in X as well, which is not desired. Can I wake it up under X somehow? This device is using the modesetting DDX driver.

Can you enable it via randr?

No, "xrandr --output DVI-D-1 --auto" does nothing.

Alex
Re: KMS question
On 2017-04-13 16:05, Alex Deucher wrote:

On Thu, Apr 13, 2017 at 9:03 AM, Boszormenyi Zoltan wrote:

Hi,

how can I disable the behaviour in the KMS drivers that enables all outputs at once? It is very annoying that on a POS machine with a 1024x768 LVDS and an 800x480 secondary monitor (both built-in) the KMS driver wakes up both. Then the framebuffer console and plymouth use both screens, which makes the primary output look very odd: the boot splash uses only its top-left part.

I would like the boot splash to be shown only on the primary output at its full resolution instead of on all outputs using the smallest common rectangle. Is there a kernel command line configuration that achieves this?

The device in question uses the gma500 kernel driver but the same behaviour is observed with the i915 and radeon drivers.

The problem is that fbdev is not multi-head aware. The fbdev emulation in the KMS drivers attempts to light up all monitors so that something shows up on all heads. If you really want different per-head configurations, you need to use the KMS API directly. As a workaround, you can use the kernel command line to disable the output you don't want to be lit up. See https://wiki.archlinux.org/index.php/kernel_mode_setting for more info. Basically, add video=TV-1:d to disable the output in question. Replace TV-1 with whatever connector you want to disable.

I tried adding video=DVI-D-1:d to the kernel command line. The effect is that, while the second output is indeed disabled, the framebuffer console still takes the second output's resolution into account and the boot splash still uses only the top-left 800x480 part of the 1024x768 primary screen.

Also, the secondary screen was disabled in X as well, which is not desired. Can I wake it up under X somehow? This device is using the modesetting DDX driver.

Thanks,
Zoltán

Alex
KMS question
Hi,

how can I disable the behaviour in the KMS drivers that enables all outputs at once? It is very annoying that on a POS machine with a 1024x768 LVDS and an 800x480 secondary monitor (both built-in) the KMS driver wakes up both. Then the framebuffer console and plymouth use both screens, which makes the primary output look very odd: the boot splash uses only its top-left part.

I would like the boot splash to be shown only on the primary output at its full resolution instead of on all outputs using the smallest common rectangle. Is there a kernel command line configuration that achieves this?

The device in question uses the gma500 kernel driver but the same behaviour is observed with the i915 and radeon drivers.

Thanks in advance,
Zoltán Böszörményi
Re: [Regression] Changes to i2c-piix4.c initialisation prevent loading of sp5100_tco watchdog driver on AMD SB800 chipset
Hi,

On 2017-04-03 08:34, Paul Menzel wrote:

Dear Zoltán,

On Saturday, 2017-04-01 at 12:13 +0200, Boszormenyi Zoltan wrote:

[…] and have split the patch into three pieces now (USB quirks, i2c-piix4 and sp5100_tco) and they were sent to the relevant mailing lists.

Could you please add me to the receiver list of these patches, so that I can test them? Maybe also Christian (the commit author introducing the regression), Tim (bug reporter), and Nehal from AMD? If you uploaded them to the Kernel.org Bugtracker, that’d also work for me.

I did both.

Best regards,
Zoltán Böszörményi

Thanks,
Paul
Re: [Regression] Changes to i2c-piix4.c initialisation prevent loading of sp5100_tco watchdog driver on AMD SB800 chipset
On 2017-04-01 18:20, Boszormenyi Zoltan wrote:

The best clean alternative would be to add new resource handling infrastructure.

* Expose the currently static alloc_resource() in kernel/resource.c. With this, driver initialization can allocate the resource once for the lifetime of the driver and if it fails, (unfinished sentence) then the failure occurs during driver initialization, not at runtime, possibly days or weeks later.

* Add a new insert_muxed_region() / __insert_muxed_region() function with different semantics from request_muxed_region() / __request_region():
  1. Accept a pointer to an already allocated resource.
  2. If the conflicting resource doesn't have IORESOURCE_MUXED set, complain loudly in the syslog but still go into the wait queue. The conflicting resource also has a name, which can be printed so that the inconsistent resource / region usage can be fixed.

We can also just modify the __request_region() semantics, so:
  1. It accepts a pointer to an allocated resource or NULL. In the NULL case, the resource is allocated internally and can still fail.
  2. The above second point.
But this may cause an error in code that expects the old semantics.

The window for request_muxed_region()+release_region() is so short that the requested I/O port range would not show up in /proc/ioports.

All this would fix only 3 drivers in a no-error scenario, and doing it only to achieve the functionality of a mutex seems to be overkill.

Another alternative is to revert commit 2fee61d22e606fc99ade9079fda15fdee83ec33e that caused the regression in sp5100_tco in the first place.

Best regards,
Zoltán Böszörményi
Re: [PATCH 1/3] usb: pci-quirks: Add a header for SB800 I/O ports and mutex for locking
On 2017-04-01 16:40, Alan Stern wrote:
> On Sat, 1 Apr 2017, Greg KH wrote:
>> On Sat, Apr 01, 2017 at 01:02:21PM +0200, Zoltan Boszormenyi wrote:
>>> From: Böszörményi Zoltán
>>>
>>> This patch adds:
>>> * a mutex in the USB PCI quirks code for synchronizing access to the
>>>   I/O ports on SB800
>>> * a new header that contains symbols for the index and data I/O ports
>>>   and wrappers for locking and unlocking the mutex.
>>> * locking around the I/O port access for SB800
>>>
>>> Signed-off-by: Zoltan Boszormenyi
>>> ---
>>> diff --git a/include/linux/sb800.h b/include/linux/sb800.h
>>> new file mode 100644
>>> index 000..5650b7d
>>> --- /dev/null
>>> +++ b/include/linux/sb800.h
>>> @@ -0,0 +1,15 @@
>>> +
>>> +#ifndef SB800_H
>>> +#define SB800_H
>>> +
>>> +#include
>>> +
>>> +#define SB800_PIIX4_SMB_IDX 0xcd6
>>> +#define SB800_PIIX4_SMB_DATA 0xcd7
>>> +
>>> +extern struct mutex sb800_mutex;
>>> +
>>> +#define enter_sb800() mutex_lock(&sb800_mutex)
>>> +#define leave_sb800() mutex_unlock(&sb800_mutex)
>
> Is include/linux/ the best place for this new header file? Aren't there
> other locations more suitable for something that's board-specific?

Are there? Which subdirectory is better suited?

Would it be acceptable to not use a header at all but spell out the "extern struct mutex..." in the two other drivers?

Thanks,
Zoltán Böszörményi

> Alan Stern
>
>> Don't hide the mutex, just spell it out in the code itself. No need for
>> these defines at all.
>>
>> thanks,
>>
>> greg k-h
Re: [PATCH 1/3] usb: pci-quirks: Add a header for SB800 I/O ports and mutex for locking
On 2017-04-01 15:59, Greg KH wrote:
> On Sat, Apr 01, 2017 at 01:02:21PM +0200, Zoltan Boszormenyi wrote:
>> From: Böszörményi Zoltán
>>
>> This patch adds:
>> * a mutex in the USB PCI quirks code for synchronizing access to the
>>   I/O ports on SB800
>> * a new header that contains symbols for the index and data I/O ports
>>   and wrappers for locking and unlocking the mutex.
>> * locking around the I/O port access for SB800
>>
>> Signed-off-by: Zoltan Boszormenyi
>> ---
>> drivers/usb/host/pci-quirks.c | 14 ++
>> include/linux/sb800.h | 15 +++
>> 2 files changed, 25 insertions(+), 4 deletions(-)
>> create mode 100644 include/linux/sb800.h
>>
>> diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c
>> index a9a1e4c..9b0445c 100644
>> --- a/drivers/usb/host/pci-quirks.c
>> +++ b/drivers/usb/host/pci-quirks.c
>> @@ -15,6 +15,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include "pci-quirks.h"
>>  #include "xhci-ext-caps.h"
>> @@ -279,6 +280,9 @@ bool usb_amd_prefetch_quirk(void)
>>  }
>>  EXPORT_SYMBOL_GPL(usb_amd_prefetch_quirk);
>> +DEFINE_MUTEX(sb800_mutex);
>> +EXPORT_SYMBOL_GPL(sb800_mutex);
>> +
>>  /*
>>   * The hardware normally enables the A-link power management feature, which
>>   * lets the system lower the power consumption in idle states.
>> @@ -314,11 +318,13 @@ static void usb_amd_quirk_pll(int disable)
>>  if (amd_chipset.sb_type.gen == AMD_CHIPSET_SB800 ||
>>      amd_chipset.sb_type.gen == AMD_CHIPSET_HUDSON2 ||
>>      amd_chipset.sb_type.gen == AMD_CHIPSET_BOLTON) {
>> -    outb_p(AB_REG_BAR_LOW, 0xcd6);
>> -    addr_low = inb_p(0xcd7);
>> -    outb_p(AB_REG_BAR_HIGH, 0xcd6);
>> -    addr_high = inb_p(0xcd7);
>> +    enter_sb800();
>> +    outb_p(AB_REG_BAR_LOW, SB800_PIIX4_SMB_IDX);
>> +    addr_low = inb_p(SB800_PIIX4_SMB_DATA);
>> +    outb_p(AB_REG_BAR_HIGH, SB800_PIIX4_SMB_IDX);
>> +    addr_high = inb_p(SB800_PIIX4_SMB_DATA);
>>      addr = addr_high << 8 | addr_low;
>> +    leave_sb800();
>>      outl_p(0x30, AB_INDX(addr));
>>      outl_p(0x40, AB_DATA(addr));
>> diff --git a/include/linux/sb800.h b/include/linux/sb800.h
>> new file mode 100644
>> index 000..5650b7d
>> --- /dev/null
>> +++ b/include/linux/sb800.h
>> @@ -0,0 +1,15 @@
>> +
>> +#ifndef SB800_H
>> +#define SB800_H
>> +
>> +#include
>> +
>> +#define SB800_PIIX4_SMB_IDX 0xcd6
>> +#define SB800_PIIX4_SMB_DATA 0xcd7
>> +
>> +extern struct mutex sb800_mutex;
>> +
>> +#define enter_sb800() mutex_lock(&sb800_mutex)
>> +#define leave_sb800() mutex_unlock(&sb800_mutex)
>
> Don't hide the mutex, just spell it out in the code itself. No need for
> these defines at all.

Thanks, I will change it.

> thanks,
>
> greg k-h
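The review comment above (lock and unlock the mutex visibly at the call site instead of hiding it behind macros) can be illustrated with a small userspace sketch. This is not kernel code: a pthread mutex stands in for the kernel mutex, and `fake_outb_idx()`, `fake_inb_data()` and the `fake_regs` array are hypothetical stand-ins for the index/data port pair at 0xcd6/0xcd7. It only shows that selecting a register and reading it back must happen under one lock.

```c
#include <pthread.h>
#include <stdint.h>

/* Userspace stand-in for the shared kernel mutex; the name mirrors the patch. */
static pthread_mutex_t sb800_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical model of an index/data register pair (not real port I/O). */
static uint8_t fake_regs[256];
static uint8_t fake_index;

/* Writing the index port selects which register the data port exposes. */
static void fake_outb_idx(uint8_t reg)
{
	fake_index = reg;
}

/* Reading the data port returns the currently selected register. */
static uint8_t fake_inb_data(void)
{
	return fake_regs[fake_index];
}

/*
 * Read a 16-bit value split across two 8-bit registers. Both
 * select-then-read steps run under one explicitly spelled-out lock, so
 * another thread cannot change the index between the write and the read.
 */
static uint16_t read_reg_pair(uint8_t reg_low, uint8_t reg_high)
{
	uint8_t lo, hi;

	pthread_mutex_lock(&sb800_mutex);
	fake_outb_idx(reg_low);
	lo = fake_inb_data();
	fake_outb_idx(reg_high);
	hi = fake_inb_data();
	pthread_mutex_unlock(&sb800_mutex);

	return (uint16_t)((hi << 8) | lo);
}
```

A caller would use `read_reg_pair()` the way usb_amd_quirk_pll() combines addr_low and addr_high; the register numbers passed in are up to the caller and are not taken from the patch.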
Re: [Regression] Changes to i2c-piix4.c initialisation prevent loading of sp5100_tco watchdog driver on AMD SB800 chipset
On 2017-03-31 17:05, Guenter Roeck wrote:
> On Fri, Mar 31, 2017 at 04:46:02PM +0200, Boszormenyi Zoltan wrote:
>> On 2017-03-31 14:49, Guenter Roeck wrote:
>>> request_muxed_region() can fail, and literally every other driver using
>>> it checks for that failure. Please do the same.
>>
>> In what circumstances can request_muxed_region() fail? As far as I can
>> see, only if two drivers use the same I/O port base and the already
>> present region did not use IORESOURCE_MUXED, which is not the case here.
>> When request_muxed_region() is used consistently, subsequent requests
>> are put on a wait queue and the first one is woken up when the region
>> is released. So, it's basically a mutex. Am I missing something here?
>
> Yes. failure to allocate the resource is one.

So, a common mutex should be used. I have also added synchronization to the USB PCI quirks code and have now split the patch into three pieces (USB quirks, i2c-piix4 and sp5100_tco), which were sent to the relevant mailing lists.

I don't know which subsystem wants to take them; all 3 patches are needed at once.

Best regards,
Zoltán Böszörményi
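To make the point about failure handling concrete, here is a minimal userspace sketch of an enter/leave wrapper that propagates an allocation failure instead of ignoring it. In the kernel, request_muxed_region() returns NULL when the resource cannot be obtained; everything prefixed `fake_` below is a hypothetical stand-in so that the example compiles outside the kernel, and the `enter_sb800()` shown here returns int, unlike the void helper in the original patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SB800_PIIX4_SMB_IDX 0xcd6

/* Hypothetical stand-in for the kernel's struct resource. */
struct fake_resource {
	unsigned long start;
	unsigned long n;
	const char *name;
};

/* Test knob: when set, the fake request behaves like a failed allocation. */
static bool fake_alloc_fails;
static struct fake_resource fake_region;

/* Stand-in for request_muxed_region(): returns NULL on failure. */
static struct fake_resource *fake_request_muxed_region(unsigned long start,
							unsigned long n,
							const char *name)
{
	if (fake_alloc_fails)
		return NULL;
	fake_region.start = start;
	fake_region.n = n;
	fake_region.name = name;
	return &fake_region;
}

/* Stand-in for release_region(); nothing to free in this model. */
static void fake_release_region(unsigned long start, unsigned long n)
{
	(void)start;
	(void)n;
}

/* Returns 0 on success or -EBUSY if the region could not be obtained. */
static int enter_sb800(void)
{
	if (!fake_request_muxed_region(SB800_PIIX4_SMB_IDX, 2, "smba_idx"))
		return -EBUSY;
	return 0;
}

static void leave_sb800(void)
{
	fake_release_region(SB800_PIIX4_SMB_IDX, 2);
}
```

With this shape every caller has to check the return value before touching the ports, which is the extra error path that makes a plain shared mutex the simpler choice.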
[PATCH] watchdog/sp5100_tco: Coexist with i2c-piix
Hi,

the attached patch fixes a long-standing regression in sp5100_tco caused by changes in i2c-piix4. See:

https://bugzilla.redhat.com/show_bug.cgi?id=1406844
https://bugzilla.kernel.org/show_bug.cgi?id=170741
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=853122

Best regards,
Zoltán Böszörményi

From: Zoltán Böszörményi
Date: Tue Mar 28 14:53:07 2017 +0200
Subject: [PATCH] watchdog/sp5100_tco: Coexist with i2c-piix

Currently, the kernel says this when i2c-piix4 loads before sp5100_tco:

sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver v0.05
sp5100_tco: PCI Vendor ID: 0x1002, Device ID: 0x4385, Revision ID: 0x42
sp5100_tco: I/O address 0x0cd6 already in use

Both i2c-piix4 and sp5100_tco use a static request_region() call, so which one wins depends on the load order.

i2c-piix4 uses a mutex to protect accesses to the pair of I/O ports. Replace this mutex lock / unlock with request_muxed_region() and release_region() around the I/O port accesses in i2c-piix4.

Add request_muxed_region() / release_region() pairs around the I/O accesses in sp5100_tco. This will act as a cross-driver mutex.

Signed-off-by: Zoltán Böszörményi

diff --git a/drivers/i2c/busses/i2c-piix4.c b/drivers/i2c/busses/i2c-piix4.c
index c21ca7b..16befdd5 100644
--- a/drivers/i2c/busses/i2c-piix4.c
+++ b/drivers/i2c/busses/i2c-piix4.c
@@ -40,7 +40,6 @@
 #include
 #include
 #include
-#include
 /* PIIX4 SMBus address offsets */
@@ -144,10 +143,9 @@ static const struct dmi_system_id piix4_dmi_ibm[] = {
 /*
  * SB800 globals
- * piix4_mutex_sb800 protects piix4_port_sel_sb800 and the pair
- * of I/O ports at SB800_PIIX4_SMB_IDX.
+ * the pair of I/O ports at SB800_PIIX4_SMB_IDX are protected
+ * by request_muxed_region / release_region
  */
-static DEFINE_MUTEX(piix4_mutex_sb800);
 static u8 piix4_port_sel_sb800;
 static const char *piix4_main_port_names_sb800[PIIX4_MAX_ADAPTERS] = {
 	" port 0", " port 2", " port 3", " port 4"
@@ -157,8 +155,6 @@ static const char *piix4_aux_port_name_sb800 = " port 1";
 struct i2c_piix4_adapdata {
 	unsigned short smba;
-	/* SB800 */
-	bool sb800_main;
 	u8 port; /* Port number, shifted */
 };
@@ -261,6 +257,14 @@ static int piix4_setup(struct pci_dev *PIIX4_dev,
 	return piix4_smba;
 }
+static inline void enter_sb800(void) {
+	request_muxed_region(SB800_PIIX4_SMB_IDX, 2, "smba_idx");
+}
+
+static inline void leave_sb800(void) {
+	release_region(SB800_PIIX4_SMB_IDX, 2);
+}
+
 static int piix4_setup_sb800(struct pci_dev *PIIX4_dev,
 			     const struct pci_device_id *id, u8 aux)
 {
@@ -286,12 +290,12 @@ static int piix4_setup_sb800(struct pci_dev *PIIX4_dev,
 	else
 		smb_en = (aux) ? 0x28 : 0x2c;
-	mutex_lock(&piix4_mutex_sb800);
+	enter_sb800();
 	outb_p(smb_en, SB800_PIIX4_SMB_IDX);
 	smba_en_lo = inb_p(SB800_PIIX4_SMB_IDX + 1);
 	outb_p(smb_en + 1, SB800_PIIX4_SMB_IDX);
 	smba_en_hi = inb_p(SB800_PIIX4_SMB_IDX + 1);
-	mutex_unlock(&piix4_mutex_sb800);
+	leave_sb800();
 	if (!smb_en) {
 		smb_en_status = smba_en_lo & 0x10;
@@ -349,13 +353,13 @@ static int piix4_setup_sb800(struct pci_dev *PIIX4_dev,
 	if (PIIX4_dev->vendor == PCI_VENDOR_ID_AMD) {
 		piix4_port_sel_sb800 = SB800_PIIX4_PORT_IDX_ALT;
 	} else {
-		mutex_lock(&piix4_mutex_sb800);
+		enter_sb800();
 		outb_p(SB800_PIIX4_PORT_IDX_SEL, SB800_PIIX4_SMB_IDX);
 		port_sel = inb_p(SB800_PIIX4_SMB_IDX + 1);
 		piix4_port_sel_sb800 = (port_sel & 0x01) ?
 				       SB800_PIIX4_PORT_IDX_ALT :
 				       SB800_PIIX4_PORT_IDX;
-		mutex_unlock(&piix4_mutex_sb800);
+		leave_sb800();
 	}
 	dev_info(&PIIX4_dev->dev,
@@ -592,7 +596,7 @@ static s32 piix4_access_sb800(struct i2c_adapter *adap, u16 addr,
 	u8 port;
 	int retval;
-	mutex_lock(&piix4_mutex_sb800);
+	enter_sb800();
 	/* Request the SMBUS semaphore, avoid conflicts with the IMC */
 	smbslvcnt = inb_p(SMBSLVCNT);
@@ -608,7 +612,7 @@ static s32 piix4_access_sb800(struct i2c_adapter *adap, u16 addr,
 	} while (--retries);
 	/* SMBus is still owned by the IMC, we give up */
 	if (!retries) {
-		mutex_unlock(&piix4_mutex_sb800);
+		leave_sb800();
 		return -EBUSY;
 	}
@@ -628,7 +632,7 @@ static s32 piix4_access_sb800(struct i2c_adapter *adap, u16 addr,
 	/* Release the semaphore */
 	outb_p(smbslvcnt | 0x20, SMBSLVCNT);
-	mutex_unlock(&piix4_mutex_sb800);
+	leave_sb800();
 	return retval;
 }
@@ -705,7 +709,6 @@ static int piix4_add_adapter(struct pci_dev *dev, unsigned short smba,
 	}
 	adapdata->smba = smba;
-	adapdata->sb800_main = sb800_main;
 	adapdata->port = port << 1;
 	/* set up the sysfs linkage to our parent device */
@@ -771,17 +774,9 @@ static int piix4_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	    dev->vendor == PCI_VENDOR_ID_AMD) {
 		is_sb800 = true;
-		if (!request_region(SB800_PIIX4_SMB_IDX, 2, "smba_idx")) {
-			dev_err(&dev->dev,
-			"SMBus base address index region 0x%x already in use!\n",
-			SB800_PIIX4_SMB_IDX);
-			return -EBUSY;
-		}
-
 		/* base address location etc changed in SB800 */
 		retval = piix4_setup_sb800(dev, id, 0);
 		if
Geode LX AES driver warning, kernel 4.9.10
Hi,

this did not occur in the 4.8.x series, but I get this with 4.9.9 and 4.9.10:

[ 13.785289] [ cut here ]
[ 13.785314] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:344 driver_probe_device+0x3f7/0x430
[ 13.785319] Modules linked in:
[ 13.785339] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.10 #1
[ 13.785347] ce0bddd0 cbb0050a cc03db65 ce0bde00 cb853a5a cbff2e50
[ 13.785381] 0001 cc03db65 0158 cbc1a987 0158 ce00b068 cc166c14 cc166c14
[ 13.785416] ce0bde14 cb853b2a 0009 ce0bde44 cbc1a987 cbb4ac19
[ 13.785449] Call Trace:
[ 13.785481] [] dump_stack+0x58/0x7e
[ 13.785497] [] __warn+0xea/0x110
[ 13.785514] [] ? driver_probe_device+0x3f7/0x430
[ 13.785530] [] warn_slowpath_null+0x2a/0x30
[ 13.785545] [] driver_probe_device+0x3f7/0x430
[ 13.785562] [] ? pci_match_id+0x9/0x90
[ 13.785578] [] ? pci_match_device+0xd4/0xf0
[ 13.785594] [] __driver_attach+0xf1/0x100
[ 13.785618] [] ? _raw_spin_lock+0x9/0x30
[ 13.785634] [] ? klist_next+0x1b/0xf0
[ 13.785649] [] ? driver_probe_device+0x430/0x430
[ 13.785675] [] bus_for_each_dev+0x47/0x80
[ 13.785690] [] driver_attach+0x1e/0x20
[ 13.785705] [] ? driver_probe_device+0x430/0x430
[ 13.785719] [] bus_add_driver+0x147/0x260
[ 13.785737] [] driver_register+0x59/0xe0
[ 13.785753] [] ? cleanup_entry_list+0xa/0x3d
[ 13.785773] [] ? efivars_kobject+0x8/0x20
[ 13.785794] [] ? efibc_init+0x3a/0x3a
[ 13.785809] [] __pci_register_driver+0x33/0x40
[ 13.785826] [] geode_aes_driver_init+0x14/0x16
[ 13.785841] [] do_one_initcall+0x42/0x180
[ 13.785862] [] ? parameq+0x18/0x70
[ 13.785879] [] ? parse_args+0x178/0x4c0
[ 13.785903] [] kernel_init_freeable+0x146/0x1dd
[ 13.785919] [] ? set_debug_rodata+0xf/0xf
[ 13.785933] [] ? rest_init+0x70/0x70
[ 13.785948] [] kernel_init+0x10/0x100
[ 13.785964] [] ? schedule_tail+0x11/0x50
[ 13.785978] [] ? rest_init+0x70/0x70
[ 13.785996] [] ret_from_fork+0x1b/0x28
[ 13.786064] ---[ end trace 2345b58de526115e ]---
[ 13.786216] Geode LX AES :00:01.2: guessed PCI INT A -> IRQ 10
[ 13.786291] Geode LX AES :00:01.2: sharing IRQ 10 with :00:01.1
[ 13.797843] Geode LX AES :00:01.2: GEODE AES engine enabled.

What has changed in 4.9.x that causes this?

Best regards,
Zoltán Böszörményi
Graphics console with VESAFB doesn't kick in in kernel 4.4.x
Hi,

I have tried kernel 4.4.0, 4.4.3 and 4.4.4 on an old Geode LX based POS computer where the usual way to use graphics is VESAFB and optionally Xorg with the VESA driver.

With vga=0x314, with or without video=vesafb:mtrr:3, the graphics mode is set but VESAFB doesn't start. There is no /dev/fb0 device after boot, although everything else works.

# dmesg | egrep -i '(vesa|onsol)'
[0.00] Kernel command line: BOOT_IMAGE=/bzImage root=UUID=b2cccbac-2717-4fcb-ae65-15189e087778 ro LANG=en.US.UTF-8 vga=0x314 video=vesafb:mtrr:3
[0.00] Console: colour dummy device 80x25
[0.00] console [tty0] enabled
[1.847411] systemd[1]: Starting Dispatch Password Requests to Console Directory Watch.
[1.847838] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[2.416981] systemd[1]: Starting Setup Virtual Console...
[2.698214] systemd[1]: Started Setup Virtual Console.

The relevant configuration options are set, and VESAFB is built into the kernel:

CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_VESA=y
CONFIG_HW_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_VT_HW_CONSOLE_BINDING=y

Help, please, and cc: me in the answer; I am not subscribed to the list.

Thanks in advance,
Zoltán Böszörményi
Re: [PATCH v2] drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
On 2015-12-04 09:53, Christian König wrote:
> On 04.12.2015 00:26, cp...@redhat.com wrote:
>> From: Lyude
>>
>> HPD signals on DVI ports can be fired off before the pins required for
>> DDC probing actually make contact, due to the pins for HPD making
>> contact first. This results in a HPD signal being asserted but DDC
>> probing failing, resulting in hotplugging occasionally failing.
>>
>> This is somewhat rare on most cards (depending on what angle you plug
>> the DVI connector in), but on some cards it happens constantly. The
>> Radeon R5 on the machine used for testing this patch for instance, runs
>> into this issue just about every time I try to hotplug a DVI monitor and
>> as a result hotplugging almost never works.
>>
>> Rescheduling the hotplug work for a second when we run into an HPD
>> signal with a failing DDC probe usually gives enough time for the rest
>> of the connector's pins to make contact, and fixes this issue.
>>
>> Signed-off-by: Lyude
>
> I find a second a bit long, but if it works so what?
>
> Looks sane enough to me, patch is Reviewed-by: Christian König
>

Does this patch help in the case where the Radeon chip only has HDMI and DP outputs exposed (Zotac ZBOXNANO-AQ01) but is used with DVI or VGA monitors through converter cables? We have some problems with such scenarios that sound eerily similar to this description.

Inquiry-by: Zoltán Böszörményi ;-)

Thanks in advance.
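The retry rule described in the commit message (HPD fired, the DDC probe failed, so try again for up to about a second) can be sketched as a pure decision function. This is a userspace illustration under assumptions, not the driver's code: `tick_before()` is modeled on the kernel's wraparound-safe time_before() idiom, and `should_retry_probe()` together with its parameters are hypothetical names chosen for this example.

```c
#include <stdbool.h>

/*
 * Wraparound-safe "a is earlier than b" for a free-running tick counter.
 * The unsigned subtraction wraps, and the signed interpretation of the
 * difference gives the ordering as long as the two values are less than
 * half the counter range apart.
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Retry only when the DDC probe failed, the hotplug state changed, and
 * we are still inside the retry window that started at hpd_time.
 */
static bool should_retry_probe(bool ddc_ok, bool hpd_changed,
			       unsigned long now, unsigned long hpd_time,
			       unsigned long window_ticks)
{
	return !ddc_ok && hpd_changed &&
	       tick_before(now, hpd_time + window_ticks);
}
```

Bounding the retries by a time window instead of a retry count keeps the behaviour independent of how often the detect path happens to be called.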
Re: [PATCH] drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
On 2015-11-20 16:52, cp...@redhat.com wrote:
> From: Stephen Chandler Paul
>
> HPD signals on DVI ports can be fired off before the pins required for
> DDC probing actually make contact, due to the pins for HPD making
> contact first. This results in a HPD signal being asserted but DDC
> probing failing, resulting in hotplugging occasionally failing.
>
> This is somewhat rare on most cards (depending on what angle you plug
> the DVI connector in), but on some cards it happens constantly. The
> Radeon R5 on the machine used for testing this patch for instance, runs
> into this issue just about every time I try to hotplug a DVI monitor and
> as a result hotplugging almost never works.
>
> Rescheduling the hotplug work for a second when we run into an HPD
> signal with a failing DDC probe usually gives enough time for the rest
> of the connector's pins to make contact, and fixes this issue.
>
> Signed-off-by: Stephen Chandler Paul
> ---
> So this one has kind of been a tough sell with Jerome, mostly because it's
> somewhat of a hack. Unfortunately however I've managed to find machines where
> DVI hotplugging literally doesn't work without a patch like this. We've already
> tried a couple of ways of handling the situation of retriggering ddc probes:
>
> * Trying the DDC probe in the radeon_dvi_detect() function multiple times.
> * Trying to reschedule the hotplug_work task whenever DDC probing fails on DVI
>   but we got a hpd signal (this ended up being a much more complicated patch
>   than anticipated)
> * Doing what we do right now, which is just triggering userspace to rescan all
>   the ports when the hpd signal is asserted by the DVI port but there's no DDC
>   probe, and repeating until at least a second passes.
>
> All of these actually work, but I guess it's a question of which one is less of
> a hack. If anyone here can think of a cleaner way of handling this feel free to
> let me know.
>
>  drivers/gpu/drm/radeon/radeon.h            |  3 +++
>  drivers/gpu/drm/radeon/radeon_connectors.c | 20 +++++++++++++++++---
>  drivers/gpu/drm/radeon/radeon_irq_kms.c    |  2 ++
>  3 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index b6cbd81..d63f0fe 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -2460,6 +2460,9 @@ struct radeon_device {
>  	/* amdkfd interface */
>  	struct kfd_dev *kfd;
>
> +	/* last time we received an hpd signal */
> +	unsigned long hpd_time;
> +
>  	struct mutex mn_lock;
>  	DECLARE_HASHTABLE(mn_hash, 7);
>  };
> diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c
> index 5a2cafb..4ee9440 100644
> --- a/drivers/gpu/drm/radeon/radeon_connectors.c
> +++ b/drivers/gpu/drm/radeon/radeon_connectors.c
> @@ -1228,19 +1228,33 @@ radeon_dvi_detect(struct drm_connector *connector, bool force)
>  	const struct drm_encoder_helper_funcs *encoder_funcs;
>  	int i, r;
>  	enum drm_connector_status ret = connector_status_disconnected;
> -	bool dret = false, broken_edid = false;
> +	bool dret = false, broken_edid = false, hpd_unchanged;
>
>  	r = pm_runtime_get_sync(connector->dev->dev);
>  	if (r < 0)
>  		return connector_status_disconnected;
>
> -	if (!force && radeon_check_hpd_status_unchanged(connector)) {
> +	hpd_unchanged = radeon_check_hpd_status_unchanged(connector);
> +	if (!force && hpd_unchanged) {
>  		ret = connector->status;
>  		goto exit;
>  	}
>
> -	if (radeon_connector->ddc_bus)
> +	if (radeon_connector->ddc_bus) {
>  		dret = radeon_ddc_probe(radeon_connector, false);
> +
> +		/* Sometimes the pins required for the DDC probe on DVI
> +		 * connectors don't make contact at the same time that the ones
> +		 * for HPD do. If the DDC probe fails even though we had an HPD
> +		 * signal, signal userspace to try again */
> +		if (!dret && !hpd_unchanged &&
> +		    connector->status != connector_status_connected &&
> +		    time_before(jiffies, rdev->hpd_time + msecs_to_jiffies(1000))) {
> +			DRM_DEBUG_KMS("%s: hpd asserted but ddc probe failed, retrying\n",
> +				      connector->name);
> +			drm_sysfs_hotplug_event(dev);
> +		}
> +	}
>  	if (dret) {
>  		radeon_connector->detected_by_load = false;
>  		radeon_connector_free_edid(connector);
> diff --git a/drivers/gpu/drm/radeon/radeon_irq_kms.c b/drivers/gpu/drm/radeon/radeon_irq_kms.c
> index 171d3e4..579c22c 100644
> --- a/drivers/gpu/drm/radeon/radeon_irq_kms.c
> +++ b/drivers/gpu/drm/radeon/radeon_irq_kms.c
> @@ -79,6 +79,8 @@ static void radeon_hotplug_work_func(struct work_struct *work)
>  	struct
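
To make the one-second retry window in the quoted patch easier to follow, here is a small userspace sketch. It is not the driver code: `ticks_t`, `ticks_before()` and `should_retry_ddc()` are names made up for this illustration, and the counter is assumed to be 32 bits wide with the usual two's-complement conversion. `ticks_before()` mimics the wraparound-safe comparison that the kernel's `time_before()` performs on `jiffies`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the jiffies counter (assumption: 32-bit counter). */
typedef uint32_t ticks_t;

/*
 * Same idea as the kernel's time_before(a, b): true if a is earlier than b,
 * even if the counter wrapped around between the two samples. The unsigned
 * subtraction wraps, and the signed cast turns "slightly behind" into a
 * negative number.
 */
static bool ticks_before(ticks_t a, ticks_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper mirroring the condition in the patch: ask for another
 * rescan only if the DDC probe failed, the HPD state changed, the connector
 * was not already connected, and we are still inside the retry window that
 * started at hpd_time.
 */
static bool should_retry_ddc(bool ddc_ok, bool hpd_unchanged, bool was_connected,
			     ticks_t now, ticks_t hpd_time, ticks_t window)
{
	return !ddc_ok && !hpd_unchanged && !was_connected &&
	       ticks_before(now, (ticks_t)(hpd_time + window));
}
```

With a window of 1000 ticks, a failed probe 50 ticks after the HPD event asks for a retry, while one 1950 ticks later does not. The comparison also holds when `hpd_time + window` wraps past the end of the counter, which is the reason for not writing a plain `now < hpd_time + window`.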
Re: [Bugfix v3] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32-bit kernel
On 2015-06-24 12:18, Ingo Molnar wrote:
> * Jiang Liu wrote:
>
>> A regression report from Boszormenyi Zoltan:
>> There's a Realtek RTL8111/8168/8411 (PCI ID 10ec:8168, Subsystem ID 1565:230e)
>> network chip on the mainboard. After the r8169 driver loaded, the IRQs in
>> the machine went berserk. Keyboard keypresses arrived with considerable
>> latency and duplicated, so no real work was possible. The machine responded
>> to the power button but didn't actually power down. It just stuck at the
>> powering down message. I had to press the power button for 4 seconds to power
>> it down.
>>
>> The computer is a POS machine with a big battery inside. Because of this,
>> either ACPI or the Realtek chip kept the bad state and after rebooting, the
>> network chip didn't even show up in lspci. Not even the PXE ROM announced
>> itself during boot. I had to disconnect the battery to beat some sense back
>> to the computer.
>>
>> The regression happens with 4.0.5, 4.1.0-rc8 and 4.1.0-final. 3.18.16 was
>> good.
>
> So please put this into quotes, like:
>
> ===
> Zoltan Boszormenyi reported this regression:
>
>   "There's a Realtek RTL8111/8168/8411 (PCI ID 10ec:8168, Subsystem ID 1565:230e)
>    network chip on the mainboard. After the r8169 driver loaded, the IRQs in
>    the machine went berserk. Keyboard keypresses arrived with considerable
>    latency and duplicated, so no real work was possible. The machine responded
>    to the power button but didn't actually power down. It just stuck at the
>    powering down message. I had to press the power button for 4 seconds to power
>    it down.
>
>    The computer is a POS machine with a big battery inside. Because of this,
>    either ACPI or the Realtek chip kept the bad state and after rebooting, the
>    network chip didn't even show up in lspci. Not even the PXE ROM announced
>    itself during boot. I had to disconnect the battery to beat some sense back
>    to the computer.
>
>    The regression happens with 4.0.5, 4.1.0-rc8 and 4.1.0-final. 3.18.16 was
>    good."
>
>  ...
> ===
>
> Also note the indentation, that helps readability.
>
> Thanks,
>
> 	Ingo

So, will there be a v4 with a commit message satisfactory to Ingo
that will be part of 4.0.7/4.1.1 and 4.2?

Best regards,
Zoltán

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugfix v2] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel
On 2015-06-24 10:25, Boszormenyi Zoltan wrote:
> On 2015-06-24 09:43, Jiang Liu wrote:
>> Hi Zoltan,
>> 	Could you please help to test this patch against the latest kernel?
>> Thanks!
>> Gerry
> I will, thanks.

Now I have tested this v2. I assume later ones will only differ
in the commit message.

It works, thank you very much!

There are now differences in the lspci output between 3.18.16 and
4.1.0-final plus this patch, but I guess they are not relevant to
this matter. The i915 chip and the Realtek chip have their IRQs
swapped, and the same goes for the "Data:" part in the "Address:"
line.

I attached the lspci -vvxxx output from 3.18.16, 4.1-rc8 with the
very first patch and 4.1-final with the v2 patch, so you can see
if it is an error or not.

Best regards,
Zoltán

Attachment: lspci2.tgz (application/compressed-tar)
Re: [Bugfix v2] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel
On 2015-06-24 10:30, Ingo Molnar wrote:
> * Jiang Liu wrote:
>
>> Since commit 593669c2ac0f ("x86/PCI/ACPI: Use common ACPI resource interfaces to
>> simplify implementation"), x86 PCI ACPI host bridge driver validates ACPI
>> resources by first converting an ACPI resource to a 'struct resource' structure
>> and then applying checks against the converted resource structure. The 'start'
>> and 'end' fields in 'struct resource' are defined to be type of resource_size_t,
>> which may be 32 bits or 64 bits depending on CONFIG_PHYS_ADDR_T_64BIT.
>>
>> This may cause incorrect resource validation results with 32 bit kernels because
>> 64bit ACPI resource descriptors may get truncated when converting to 32bit
>> 'start' and 'end' fields in 'struct resource'. And eventually affects PCI
>> resource allocation subsystem and causes some PCI devices unusable.
>
> s/causes some PCI devices unusable.
>  /makes some PCI devices unusable.
>
> Also, this description is still pretty vague. What exactly happened? Did some PCI
> devices not show up during bootup? Or did they hang? Or did something else happen?

There's a reference mail URL in the description, but here it is
in full glory.

The machine in question started behaving like being drunk without
this fix with 4.0.5 and 4.1.0-rc8 and 4.1.0-final. 3.18.16 was good.

There's a Realtek RTL8111/8168/8411 (PCI ID 10ec:8168, Subsystem ID
1565:230e) network chip on the mainboard. After the r8169 driver
loaded, the IRQs in the machine went berserk. Keyboard keypresses
arrived with considerable latency and duplicated, so no real work
was possible. The machine responded to the power button but didn't
actually power down. It just stuck at the powering down message.
I had to press the power button for 4 seconds to power it down.

The computer is a POS machine with a big battery inside. Because of
this, either ACPI or the Realtek chip kept the bad state and after
rebooting, the network chip didn't even show up in lspci. Not even
the PXE ROM announced itself during boot. I had to disconnect the
battery to beat some sense back to the computer.

Without the patch I was able to get debugging info out of the
machine in this bad state with:

# modprobe r8169 ; sleep 10 ; dmesg >dmesg.log ; lspci -vvxxx >lspci.log ; \
    sync ; sync ; sync ; poweroff

all in the same command line. Entering commands manually after
a single "modprobe r8169" was impossible.

That revealed that the #2 PCIe port (the one that the Realtek chip
is attached to) changed this way:

@@ -211,7 +211,7 @@
 00:1c.1 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 2 (rev 02) (prog-if 00 [Normal decode])
 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
-	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
+	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR+ <PERR- INTx-
 	Latency: 0, Cache Line Size: 32 bytes
 	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
 	I/O behind bridge: e000-efff
@@ -226,7 +226,7 @@
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
 			MaxPayload 128 bytes, MaxReadReq 128 bytes
-		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
+		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq- AuxPwr+ TransPend-
 		LnkCap:	Port #2, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s 256ns, L1 4us
 			ClockPM- Surprise- LLActRep+ BwNot-
 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+

The uncorrectable error seems to have pushed it or the device behind
it to a disabled state after reboot and this state was kept because
of the battery.

Also, the 32-bit wraparound caused every device in the system to be
reprogrammed to use a different memory address range.

With the fix, the behavior of the machine was restored to how
3.18.16 worked, i.e. the memory range that is over 4GB is ignored
again, and lspci -vvxxx shows that everything is at the same memory
window as it was with 3.18.16.

Unrelated to this fix, but I also had an adventure with r8168
(downloaded from Realtek and compiled from source) vs r8169.
Most likely caused by switching between r8168 and r8169, the network
chip was programmed with a bad MAC address (ff:fc:6d:11:28:ff, the
real one is 00:0c:6d:11:28:77) which made the network start acting
weirdly. While the machine was pingable and it was able to ping
others, real networking like the ssh login prompt never appeared,
traceroute took ages, etc. That was also solved by disconnecting the
battery and powering down completely and returning to r8169 with the
kernel patched with a preliminary version of this patch.

> This is _by far_ the most important part of the changelog and determines whether a
> patch gets backported or not. Why does a usable regression description have to be
> coaxed out of you like pulling teeth??

The commit description by Jiang Liu has the URL for the initial mail
where I reported the symptoms I experienced. If you think the above
summary is not too long for a commit message, then feel free to use
it, edited in any way you like.

Best regards,
Zoltán

>
>> So enhance the ACPI resource parsing interfaces to ignore ACPI resource
>> descriptors with address/offset above 4G when running in 32bit mode. This
>> reverts to the behavior before commit 593669c2ac0f.
>>
>> This issue was triggered on a platform running 32bit kernel with an ACPI
>> resource descriptor with address range [0x4-0xf]. Please refer
>> to https://lkml.org/lkml/2015/6/19/277 for more information.
>
> s/32bit/32-bit
> s/64bit/64-bit
> s/32 bit/32-bit
> s/64 bit/64-bit
>
> Thanks,
>
> 	Ingo

Re: [Bugfix v2] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel
On 2015-06-24 09:43, Jiang Liu wrote:
> Since commit 593669c2ac0f ("x86/PCI/ACPI: Use common ACPI resource
> interfaces to simplify implementation"), x86 PCI ACPI host bridge driver
> validates ACPI resources by first converting an ACPI resource to
> a 'struct resource' structure and then applying checks against the
> converted resource structure. The 'start' and 'end' fields in 'struct
> resource' are defined to be type of resource_size_t, which may be 32 bits
> or 64 bits depending on CONFIG_PHYS_ADDR_T_64BIT.
>
> This may cause incorrect resource validation results with 32 bit kernels
> because 64bit ACPI resource descriptors may get truncated when converting
> to 32bit 'start' and 'end' fields in 'struct resource'. And eventually
> affects PCI resource allocation subsystem and causes some PCI devices
> unusable.
>
> So enhance the ACPI resource parsing interfaces to ignore ACPI resource
> descriptors with address/offset above 4G when running in 32bit mode.
> This reverts to the behavior before commit 593669c2ac0f.
>
> This issue was triggered on a platform running 32bit kernel with an
> ACPI resource descriptor with address range [0x4-0xf].
> Please refer to https://lkml.org/lkml/2015/6/19/277 for more information.
>
> Reported-by: Boszormenyi Zoltan
> Fixes: 593669c2ac0f ("x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation")
> Signed-off-by: Jiang Liu
> Cc: sta...@vger.kernel.org # 4.0
> ---
>
> Hi Zoltan,
> 	Could you please help to test this patch against the latest kernel?
> Thanks!
> Gerry

I will, thanks.

Best regards,
Zoltán

>
> ---
>  drivers/acpi/resource.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
> index 8244f013f210..f1c966e05078 100644
> --- a/drivers/acpi/resource.c
> +++ b/drivers/acpi/resource.c
> @@ -193,6 +193,7 @@ static bool acpi_decode_space(struct resource_win *win,
>  	u8 iodec = attr->granularity == 0xfff ? ACPI_DECODE_10 : ACPI_DECODE_16;
>  	bool wp = addr->info.mem.write_protect;
>  	u64 len = attr->address_length;
> +	u64 start, end, offset = 0;
>  	struct resource *res = &win->res;
>
>  	/*
> @@ -204,9 +205,6 @@ static bool acpi_decode_space(struct resource_win *win,
>  		pr_debug("ACPI: Invalid address space min_addr_fix %d, max_addr_fix %d, len %llx\n",
>  			 addr->min_address_fixed, addr->max_address_fixed, len);
>
> -	res->start = attr->minimum;
> -	res->end = attr->maximum;
> -
>  	/*
>  	 * For bridges that translate addresses across the bridge,
>  	 * translation_offset is the offset that must be added to the
> @@ -214,12 +212,22 @@ static bool acpi_decode_space(struct resource_win *win,
>  	 * primary side. Non-bridge devices must list 0 for all Address
>  	 * Translation offset bits.
>  	 */
> -	if (addr->producer_consumer == ACPI_PRODUCER) {
> -		res->start += attr->translation_offset;
> -		res->end += attr->translation_offset;
> -	} else if (attr->translation_offset) {
> +	if (addr->producer_consumer == ACPI_PRODUCER)
> +		offset = attr->translation_offset;
> +	else if (attr->translation_offset)
>  		pr_debug("ACPI: translation_offset(%lld) is invalid for non-bridge device.\n",
>  			 attr->translation_offset);
> +	start = attr->minimum + offset;
> +	end = attr->maximum + offset;
> +
> +	win->offset = offset;
> +	res->start = start;
> +	res->end = end;
> +	if (sizeof(resource_size_t) < sizeof(u64) &&
> +	    (offset != win->offset || start != res->start || end != res->end)) {
> +		pr_warn("acpi resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
> +			attr->minimum, attr->maximum);
> +		return false;
>  	}
>
>  	switch (addr->resource_type) {
> @@ -236,8 +244,6 @@ static bool acpi_decode_space(struct resource_win *win,
>  		return false;
>  	}
>
> -	win->offset = attr->translation_offset;
> -
>  	if (addr->producer_consumer == ACPI_PRODUCER)
>  		res->flags |= IORESOURCE_WINDOW;
>
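
The check that the v2 patch adds to acpi_decode_space() can be approximated in userspace roughly as below. This is only a sketch under stated assumptions: `res_size_t` stands in for a 32-bit `resource_size_t`, `struct res_window` and `decode_window()` are hypothetical names invented for the example, and the logging done by the real function is left out. The idea is to compute the window in 64 bits, store it into the narrower fields, and reject the window if reading the fields back does not give the 64-bit values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for resource_size_t on a kernel without 64-bit physical addresses. */
typedef uint32_t res_size_t;

/* Hypothetical, simplified counterpart of the resource window. */
struct res_window {
	res_size_t start;
	res_size_t end;
	res_size_t offset;
};

/*
 * Decode a 64-bit [min, max] range plus translation offset into the
 * narrower window. Returns false when any value does not survive the
 * narrowing, i.e. the window is not addressable with res_size_t.
 */
static bool decode_window(struct res_window *win, uint64_t min, uint64_t max,
			  uint64_t translation_offset)
{
	uint64_t start = min + translation_offset;
	uint64_t end = max + translation_offset;

	win->offset = (res_size_t)translation_offset;
	win->start = (res_size_t)start;
	win->end = (res_size_t)end;

	/* Round-trip comparison: the narrow fields are widened back to 64 bits. */
	if (sizeof(res_size_t) < sizeof(uint64_t) &&
	    (translation_offset != win->offset ||
	     start != win->start || end != win->end))
		return false;

	return true;
}
```

With made-up values, a window of 0x80000000-0x801fffff with offset 0 is accepted, while a window starting at 0x100000000, or a small window with a translation offset of 0x100000000, is rejected instead of being silently wrapped into the low 4G.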
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-21 20:55, Boszormenyi Zoltan wrote:
> On 2015-06-21 19:55, Jiang Liu wrote:
>> On 2015/6/22 1:25, Jiang Liu wrote:
>> [...]
>>>>>> - Memory behind bridge: 8000-801f
>>>>>> - Prefetchable memory behind bridge: 8020-803f
>>>>>> + Memory behind bridge: ff00-ff1f
>>>>>> + Prefetchable memory behind bridge: ff20-ff3f
>>>>>>
>>>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>>>> that the bridge doesn't actually support?
>>>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>>>> v3.18.16 dmesg log, so we can compare them?
>>>> I collected all 3 for you to compare them, compressed, attached.
>>>>
>>>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>>>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>>>
>>>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>>>> the code to see what might be going on:
>>>>>
>>>>> acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>>>>> pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>>>>>
>>>>> Bjorn
>>> Hi Bjorn and Boszormenyi,
>>> From the 3.18 kernel, we got a message:
>>> [0.126248] acpi PNP0A08:00: host bridge window [0x4-0xf] (ignored, not CPU addressable)
>>> And from 4.1-rc8, we got another message:
>>> [0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>>>
>>> That smells like a 32bit overflow or 64bit cut-off issue.
>> Hi Bjorn and Boszormenyi,
>> With v3.18.6, it uses u64 to compare resource ranges. We changed to use
>> resource_size_t with recent changes, and resource_size_t
>> may be u32 or u64 depending on configuration. So resource range
>> [0x4-0xf] may have been cut off as
>> [0x-0x], thus causing the trouble.
>>
>> Hi Boszormenyi,
>> Could you please try the following test patch
>> against v4.1-rc8?
> I have tried it. The result (dmesg, lspci before/after modprobe) is attached.
> The "not CPU addressable" message shows up once in dmesg.
> The device shows up in lspci and the module can be loaded. The previously
> experienced sluggishness is gone now, but the network doesn't work after
> modprobe.
> I think it was an expected outcome, since that particular range is ignored
> with this patch.

Hm, I can see a very similar message in 3.18.16, so it was not the expected outcome.

After building the "official" r8168 from Realtek for 4.1.0-rc8, the difference in lspci
from the working 3.18.16 is nil, before and after modprobe. (r8168 was built for 3.18.16,
that's why.)

However, connman (similar to NetworkManager) still sees the network connectivity as "down".
I checked that the firmware files are there in /lib/firmware/rtl_nic.

With r8168 (the "official" Realtek driver), the kernel message about "link up" appears
immediately and connman can configure the network.

I have tried the patch on 4.0.5, too, with the same result.

So, there may be another problem with the r8169 driver itself besides this ACPI problem,
but no matter what I do, I can't seem to enable debugging messages for r8169.

So, for now I can use r8168 instead of r8169 with this patch.

Thanks,
Zoltán

>
> Thanks,
> Zoltán
>
>> Thanks!
>> Gerry
>> ---
>> diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
>> index 8244f013f210..d7b8c392c420 100644
>> --- a/drivers/acpi/resource.c
>> +++ b/drivers/acpi/resource.c
>> @@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win,
>>
>>  	res->start = attr->minimum;
>>  	res->end = attr->maximum;
>> +	if (res->start != attr->minimum || res->end != attr->maximum) {
>> +		pr_warn("resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
>> +			attr->minimum, attr->maximum);
>> +		return false;
>> +	}
>>
>>  	/*
>>  	 * For bridges that translate addresses across the bridge,
>> -
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at http://www.tux.org/lkml/
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-21 19:55, Jiang Liu wrote:
> On 2015/6/22 1:25, Jiang Liu wrote:
> [...]
>>>>> - Memory behind bridge: 8000-801f
>>>>> - Prefetchable memory behind bridge: 8020-803f
>>>>> + Memory behind bridge: ff00-ff1f
>>>>> + Prefetchable memory behind bridge: ff20-ff3f
>>>>>
>>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>>> that the bridge doesn't actually support?
>>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>>> v3.18.16 dmesg log, so we can compare them?
>>> I collected all 3 for you to compare them, compressed, attached.
>>>
>>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>>
>>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>>> the code to see what might be going on:
>>>>
>>>> acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>>>> pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>>>>
>>>> Bjorn
>> Hi Bjorn and Boszormenyi,
>> From the 3.18 kernel, we got a message:
>> [0.126248] acpi PNP0A08:00: host bridge window [0x4-0xf] (ignored, not CPU addressable)
>> And from 4.1-rc8, we got another message:
>> [0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>>
>> That smells like a 32bit overflow or 64bit cut-off issue.
> Hi Bjorn and Boszormenyi,
> With v3.18.6, it uses u64 to compare resource ranges. We changed to use
> resource_size_t with recent changes, and resource_size_t
> may be u32 or u64 depending on configuration. So resource range
> [0x4-0xf] may have been cut off as
> [0x-0x], thus causing the trouble.
>
> Hi Boszormenyi,
> Could you please try the following test patch
> against v4.1-rc8?

I have tried it. The result (dmesg, lspci before/after modprobe) is attached.

The "not CPU addressable" message shows up once in dmesg.
The device shows up in lspci and the module can be loaded. The previously
experienced sluggishness is gone now, but the network doesn't work after modprobe.

I think it was an expected outcome, since that particular range is ignored
with this patch.

Thanks,
Zoltán

> Thanks!
> Gerry
> ---
> diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
> index 8244f013f210..d7b8c392c420 100644
> --- a/drivers/acpi/resource.c
> +++ b/drivers/acpi/resource.c
> @@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win,
>
>  	res->start = attr->minimum;
>  	res->end = attr->maximum;
> +	if (res->start != attr->minimum || res->end != attr->maximum) {
> +		pr_warn("resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
> +			attr->minimum, attr->maximum);
> +		return false;
> +	}
>
>  	/*
>  	 * For bridges that translate addresses across the bridge,
> -

dmesg-lspci-xx2.tgz Description: application/compressed-tar
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-21 19:25, Jiang Liu wrote:
> On 2015/6/21 22:19, Boszormenyi Zoltan wrote:
>> On 2015-06-21 16:03, Bjorn Helgaas wrote:
>>> [+cc linux-pci]
>>>
>>> Hi Boszormenyi,
>>>
>>> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan wrote:
>>>> Hi,
>>>>
>>>> please, cc me, I am not subscribed to lkml.
>>>>
>>>>> Hi,
>>>>>
>>>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>>>
>>>>> Just to ask the obvious:
>>>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>>>> (since the machine comes up empty at initial-boot scan, too)
>>>> I will try it, too, but I am not sure it would work.
>>>>
>>>> Currently I can't test it because the last time I completely discharged
>>>> the battery. I also disconnected it to be able to get the realtek chip back
>>>> immediately for faster testing. Now that I have reconnected the battery,
>>>> I need to wait for it to be charged somewhat to be able to reproduce
>>>> losing the network chip.
>>>>
>>>>> Also, you could try diffing lspci -vvxxx -s output
>>>>> of working vs. "distorting" kernel version - perhaps some register setup
>>>>> has been changed (e.g. due to power management improvements or some such),
>>>>> which may encourage the card
>>>>> to get a problematic/corrupt state.
>>>> I attached a tarball that contains lspci -vvxxx for
>>>> - all devices / only the network chip
>>>> - before / after "modprobe r8169"
>>>> - for all 3 kernel versions tested.
>>>>
>>>> I figured out that if I type the modprobe and lspci in the same command line,
>>>> I can get diagnostics out of the machine, after all.
>>>>
>>>> It's not just the Realtek chip that has changed parameters.
>>>>
>>>> (Vague idea) I noticed that some devices have changed like this:
>>>>
>>>> - Memory behind bridge: 8000-801f
>>>> - Prefetchable memory behind bridge: 8020-803f
>>>> + Memory behind bridge: ff00-ff1f
>>>> + Prefetchable memory behind bridge: ff20-ff3f
>>>>
>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>> that the bridge doesn't actually support?
>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>> v3.18.16 dmesg log, so we can compare them?
>> I collected all 3 for you to compare them, compressed, attached.
>>
>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>
>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>> the code to see what might be going on:
>>>
>>> acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>>> pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>>>
>>> Bjorn
> Hi Bjorn and Boszormenyi,
> From the 3.18 kernel, we got a message:
> [0.126248] acpi PNP0A08:00: host bridge window [0x4-0xf] (ignored, not CPU addressable)
> And from 4.1-rc8, we got another message:
> [0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>
> That smells like a 32bit overflow or 64bit cut-off issue.
>
> Hi Boszormenyi, could you please provide an acpidump from the
> machine?

I already did in a previous mail which was only sent to LKML,
but here it is again.

Thanks,
Zoltán

> Thanks!
> Gerry

acpidump.tgz Description: application/compressed-tar
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-21 16:19, Boszormenyi Zoltan wrote:
> On 2015-06-21 16:03, Bjorn Helgaas wrote:
>> [+cc linux-pci]
>>
>> Hi Boszormenyi,
>>
>> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan wrote:
>>> Hi,
>>>
>>> please, cc me, I am not subscribed to lkml.
>>>
>>>> Hi,
>>>>
>>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>>
>>>> Just to ask the obvious:
>>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>>> (since the machine comes up empty at initial-boot scan, too)
>>> I will try it, too, but I am not sure it would work.
>>>
>>> Currently I can't test it because the last time I completely discharged
>>> the battery. I also disconnected it to be able to get the realtek chip back
>>> immediately for faster testing. Now that I have reconnected the battery,
>>> I need to wait for it to be charged somewhat to be able to reproduce
>>> losing the network chip.
>>>
>>>> Also, you could try diffing lspci -vvxxx -s output
>>>> of working vs. "distorting" kernel version - perhaps some register setup
>>>> has been changed (e.g. due to power management improvements or some such),
>>>> which may encourage the card
>>>> to get a problematic/corrupt state.
>>> I attached a tarball that contains lspci -vvxxx for
>>> - all devices / only the network chip
>>> - before / after "modprobe r8169"
>>> - for all 3 kernel versions tested.
>>>
>>> I figured out that if I type the modprobe and lspci in the same command line,
>>> I can get diagnostics out of the machine, after all.
>>>
>>> It's not just the Realtek chip that has changed parameters.
>>>
>>> (Vague idea) I noticed that some devices have changed like this:
>>>
>>> - Memory behind bridge: 8000-801f
>>> - Prefetchable memory behind bridge: 8020-803f
>>> + Memory behind bridge: ff00-ff1f
>>> + Prefetchable memory behind bridge: ff20-ff3f
>>>
>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>> that the bridge doesn't actually support?
>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>> v3.18.16 dmesg log, so we can compare them?
> I collected all 3 for you to compare them, compressed, attached.
>
> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.

Reverting this one didn't help.

>
>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>> the code to see what might be going on:
>>
>> acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
>> pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>>
>> Bjorn
>>
> Thanks,
> Zoltán Böszörményi
>
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-21 16:03, Bjorn Helgaas wrote:
> [+cc linux-pci]
>
> Hi Boszormenyi,
>
> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> please, cc me, I am not subscribed to lkml.
>>
>>> Hi,
>>>
>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>
>>> Just to ask the obvious:
>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>> (since the machine comes up empty at initial-boot scan, too)
>> I will try it, too, but I am not sure it would work.
>>
>> Currently I can't test it because the last time I completely discharged
>> the battery. I also disconnected it to be able to get the realtek chip back
>> immediately for faster testing. Now that I have reconnected the battery,
>> I need to wait for it to be charged somewhat to be able to reproduce
>> losing the network chip.
>>
>>> Also, you could try diffing lspci -vvxxx -s output
>>> of working vs. "distorting" kernel version - perhaps some register setup
>>> has been changed (e.g. due to power management improvements or some such),
>>> which may encourage the card
>>> to get a problematic/corrupt state.
>> I attached a tarball that contains lspci -vvxxx for
>> - all devices / only the network chip
>> - before / after "modprobe r8169"
>> - for all 3 kernel versions tested.
>>
>> I figured out that if I type the modprobe and lspci in the same command line,
>> I can get diagnostics out of the machine, after all.
>>
>> It's not just the Realtek chip that has changed parameters.
>>
>> (Vague idea) I noticed that some devices have changed like this:
>>
>> - Memory behind bridge: 8000-801f
>> - Prefetchable memory behind bridge: 8020-803f
>> + Memory behind bridge: ff00-ff1f
>> + Prefetchable memory behind bridge: ff20-ff3f
>>
>> Can't this cause a problem? E.g. programming the bridge with an address range
>> that the bridge doesn't actually support?
> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
> v3.18.16 dmesg log, so we can compare them?

I collected all 3 for you to compare them, compressed, attached.

BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
as suspicious. I will try the 4.0/4.1 kernels with this one reverted.

>
> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
> the code to see what might be going on:
>
> acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
> pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>
> Bjorn
>

Thanks,
Zoltán Böszörményi

dmesg.tgz Description: application/compressed-tar
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
Hi,

please, cc me, I am not subscribed to lkml.

> Hi,
>
> [lkml.org still broken --> no accurate mail header info possible...]
>
> Just to ask the obvious:
> I assume using /sys/bus/pci/rescan does not help once it's broken?
> (since the machine comes up empty at initial-boot scan, too)

I will try it, too, but I am not sure it would work.

Currently I can't test it because the last time I completely discharged
the battery. I also disconnected it to be able to get the realtek chip back
immediately for faster testing. Now that I have reconnected the battery,
I need to wait for it to be charged somewhat to be able to reproduce
losing the network chip.

> Also, you could try diffing lspci -vvxxx -s output
> of working vs. "distorting" kernel version - perhaps some register setup
> has been changed (e.g. due to power management improvements or some such),
> which may encourage the card
> to get a problematic/corrupt state.

I attached a tarball that contains lspci -vvxxx for
- all devices / only the network chip
- before / after "modprobe r8169"
- for all 3 kernel versions tested.

I figured out that if I type the modprobe and lspci in the same command line,
I can get diagnostics out of the machine, after all.

It's not just the Realtek chip that has changed parameters.

(Vague idea) I noticed that some devices have changed like this:

- Memory behind bridge: 8000-801f
- Prefetchable memory behind bridge: 8020-803f
+ Memory behind bridge: ff00-ff1f
+ Prefetchable memory behind bridge: ff20-ff3f

Can't this cause a problem? E.g. programming the bridge with an address range
that the bridge doesn't actually support?

>
> > Upon powering off the system,
> > the r8169 driver complained about "rtl_eriar_cond = 1 loop 100"
>
> Yup, that seems to be
> rtl_eri_read() in ethernet/realtek/r8169.c
> waiting on low condition of
> RTL_R32(ERIAR) & ERIAR_FLAG;

I found that, too, and I think it is a symptom instead of the cause.

Thanks for your efforts,
Zoltán Böszörményi

lspci.tgz Description: application/compressed-tar
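The "rtl_eriar_cond = 1 loop 100" complaint quoted above reads like a bounded polling loop that gave up while the condition was still set. As a rough illustration of that pattern only, and not the r8169 code itself, the sketch below polls a condition a fixed number of times. The names `poll_until`, `cond_fn` and `fake_busy` are hypothetical, and a real driver would typically also delay between reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical condition callback: returns the current state of a flag. */
typedef bool (*cond_fn)(void *ctx);

/*
 * Poll cond up to max_tries times. Returns true as soon as the condition
 * equals want, false if it never did (the case a driver would log).
 */
static bool poll_until(cond_fn cond, void *ctx, bool want,
                       unsigned int max_tries)
{
	assert(cond != NULL);

	for (unsigned int i = 0; i < max_tries; i++) {
		if (cond(ctx) == want)
			return true;
	}
	return false;
}

/* Hypothetical device: reports "busy" until its counter reaches zero. */
static bool fake_busy(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return false;
	(*remaining)--;
	return true;
}
```

With a device that clears its flag after a few reads, `poll_until(fake_busy, &n, false, 100)` succeeds; with one that stays busy for longer than 100 reads, it returns false, which corresponds to the timeout being reported.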
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 16:03 keltezéssel, Bjorn Helgaas írta: [+cc linux-pci] Hi Boszormenyi, On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan zbos...@pr.hu wrote: Hi, please, cc me, I am not subscribed to lkml. Hi, [lkml.org still broken -- no accurate mail header info possible...] Just to ask the obvious: I assume using /sys/bus/pci/rescan does not help once it's broken? (since the machine comes up empty at initial-boot scan, too) I will try it, too, but I am not sure it would work. Currently I can't test it because the last time I completely discharged the battery. I also disconnected it to be able to get the realtek chip back immediately for faster testing. Now, that I have reconnected the battery, I need to wait for it to be charged somewhat to be able to reproduce losing the network chip. Also, you could try diffing lspci -vvxxx -s output of working vs. distorting kernel version - perhaps some register setup has been changed (e.g. due to power management improvements or some such), which may encourage the card to get a problematic/corrupt state. I attached a tarball that contains lspci -vvxxx for - all devices / only the network chip - before / after modprobe r8169 - for all 3 kernel versions tested. I figured out that if I type the modprobe and lspci in the same command line, I can get diagnostics out of the machine, after all. It's not just the Realtek chip that has changed parameters. (Vague idea) I noticed that some devices have changed like this: - Memory behind bridge: 8000-801f - Prefetchable memory behind bridge: 8020-803f + Memory behind bridge: ff00-ff1f + Prefetchable memory behind bridge: ff20-ff3f Can't this cause a problem? E.g. programming the bridge with an address range that the bridge doesn't actually support? This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a v3.18.16 dmesg log, so we can compare them? I collected all 3 for you to compare them, compressed, attached. 
BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0 as suspicious. I will try the 4.0/4.1 kernels with this one reverted. These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at the code to see what might be going on: acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window] Bjorn Thanks, Zoltán Böszörményi dmesg.tgz Description: application/compressed-tar
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 16:19 keltezéssel, Boszormenyi Zoltan írta: 2015-06-21 16:03 keltezéssel, Bjorn Helgaas írta: [+cc linux-pci] Hi Boszormenyi, On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan zbos...@pr.hu wrote: Hi, please, cc me, I am not subscribed to lkml. Hi, [lkml.org still broken -- no accurate mail header info possible...] Just to ask the obvious: I assume using /sys/bus/pci/rescan does not help once it's broken? (since the machine comes up empty at initial-boot scan, too) I will try it, too, but I am not sure it would work. Currently I can't test it because the last time I completely discharged the battery. I also disconnected it to be able to get the realtek chip back immediately for faster testing. Now, that I have reconnected the battery, I need to wait for it to be charged somewhat to be able to reproduce losing the network chip. Also, you could try diffing lspci -vvxxx -s output of working vs. distorting kernel version - perhaps some register setup has been changed (e.g. due to power management improvements or some such), which may encourage the card to get a problematic/corrupt state. I attached a tarball that contains lspci -vvxxx for - all devices / only the network chip - before / after modprobe r8169 - for all 3 kernel versions tested. I figured out that if I type the modprobe and lspci in the same command line, I can get diagnostics out of the machine, after all. It's not just the Realtek chip that has changed parameters. (Vague idea) I noticed that some devices have changed like this: - Memory behind bridge: 8000-801f - Prefetchable memory behind bridge: 8020-803f + Memory behind bridge: ff00-ff1f + Prefetchable memory behind bridge: ff20-ff3f Can't this cause a problem? E.g. programming the bridge with an address range that the bridge doesn't actually support? This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a v3.18.16 dmesg log, so we can compare them? 
I collected all 3 for you to compare them, compressed, attached. BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0 as suspicious. I will try the 4.0/4.1 kernels with this one reverted. Reverting this one didn't help. These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at the code to see what might be going on: acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window] Bjorn Thanks, Zoltán Böszörményi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in Please read the FAQ at http://www.tux.org/lkml/
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
Hi, please, cc me, I am not subscribed to lkml. Hi, [lkml.org still broken -- no accurate mail header info possible...] Just to ask the obvious: I assume using /sys/bus/pci/rescan does not help once it's broken? (since the machine comes up empty at initial-boot scan, too) I will try it, too, but I am not sure it would work. Currently I can't test it because the last time I completely discharged the battery. I also disconnected it to be able to get the realtek chip back immediately for faster testing. Now, that I have reconnected the battery, I need to wait for it to be charged somewhat to be able to reproduce losing the network chip. Also, you could try diffing lspci -vvxxx -s output of working vs. distorting kernel version - perhaps some register setup has been changed (e.g. due to power management improvements or some such), which may encourage the card to get a problematic/corrupt state. I attached a tarball that contains lspci -vvxxx for - all devices / only the network chip - before / after modprobe r8169 - for all 3 kernel versions tested. I figured out that if I type the modprobe and lspci in the same command line, I can get diagnostics out of the machine, after all. It's not just the Realtek chip that has changed parameters. (Vague idea) I noticed that some devices have changed like this: - Memory behind bridge: 8000-801f - Prefetchable memory behind bridge: 8020-803f + Memory behind bridge: ff00-ff1f + Prefetchable memory behind bridge: ff20-ff3f Can't this cause a problem? E.g. programming the bridge with an address range that the bridge doesn't actually support? Upon powering off the system, the r8169 driver compained about rtl_eriar_cond = 1 loop 100 Yup, that seems to be rtl_eri_read() in ethernet/realtek/r8169.c waiting on low condition of RTL_R32(ERIAR) ERIAR_FLAG; I found that, too, and I think it is a symptom of instead of the cause. Thanks for your efforts, Zoltán Böszörményi lspci.tgz Description: application/compressed-tar
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 20:55 keltezéssel, Boszormenyi Zoltan írta: 2015-06-21 19:55 keltezéssel, Jiang Liu írta: On 2015/6/22 1:25, Jiang Liu wrote: [...] - Memory behind bridge: 8000-801f - Prefetchable memory behind bridge: 8020-803f + Memory behind bridge: ff00-ff1f + Prefetchable memory behind bridge: ff20-ff3f Can't this cause a problem? E.g. programming the bridge with an address range that the bridge doesn't actually support? This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a v3.18.16 dmesg log, so we can compare them? I collected all 3 for you to compare them, compressed, attached. BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0 as suspicious. I will try the 4.0/4.1 kernels with this one reverted. These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at the code to see what might be going on: acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window] Bjorn Hi Bjorn and Boszormenyi, From the 3.18 kernel, we got a message: [0.126248] acpi PNP0A08:00: host bridge window [0x4-0xf] (ignored, not CPU addressable) And from 4.1.-rc8, we got another message: [0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored That smells like a 32bit overflow or 64bit cut-off issue. Hi Bjorn and Boszormenyi, With v3.18.6, it uses u64 to compare resource ranges. We changed to use resource_size_t with recent changes, and resource_size_t may be u32 or u64 depending on configuration. So resource range [0x4-0xf] may have been cut-off as [0x-0x], thus cause the trouble. Hi Boszormenyi, Could you please help to try following test patch? against v4.1-rc8? I have tried it. The result (dmesg, lspci before/after modprobe) is attached. 
The not CPU addressable message shows up once in dmesg. The device shows up in lspci and the module can be loaded. The previously experienced sluggishness is gone now, but the network doesn't work after modprobe. I think it was an expected outcome, since that particular range is ignored with this patch. Hm, I can see a very similar message in 3.18.16, so it was not the expected outcome. After building the official r8168 from Realtek for 4.1.0-rc8, the difference in lspci from the working 3.18.16 is nil, before and after modprobe. (r8168 was build for 3.18.16, that's why.) However, connman (similar to NetworkManager) still sees the network connectivity as down. I checked that the firmware files are there in /lib/firmware/rtl_nic. With r8168 (the official Realtek driver), the kernel message about link up appears immediately and connman can configure the network. I have tried the patch on 4.0.5, too, with the same result. So, there may be another problem with the r8169 driver itself besides this ACPI problem but no matter what I do, I can't seem to be able to enable debugging messages for r8169. So, for now I can use r8168 instead of r8169 with this patch. Thanks, Zoltán Thanks, Zoltán Thanks! Gerry --- diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c index 8244f013f210..d7b8c392c420 100644 --- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win, res-start = attr-minimum; res-end = attr-maximum; + if (res-start != attr-minimum || res-end != attr-maximum) { + pr_warn(resource window ([%#llx-%#llx] ignored, not CPU addressable)\n, + attr-minimum, attr-maximum); + return false; + } /* * For bridges that translate addresses across the bridge, - -- To unsubscribe from this list: send the line unsubscribe linux-kernel in Please read the FAQ at http://www.tux.org/lkml/
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 19:25 keltezéssel, Jiang Liu írta: On 2015/6/21 22:19, Boszormenyi Zoltan wrote: 2015-06-21 16:03 keltezéssel, Bjorn Helgaas írta: [+cc linux-pci] Hi Boszormenyi, On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan zbos...@pr.hu wrote: Hi, please, cc me, I am not subscribed to lkml. Hi, [lkml.org still broken -- no accurate mail header info possible...] Just to ask the obvious: I assume using /sys/bus/pci/rescan does not help once it's broken? (since the machine comes up empty at initial-boot scan, too) I will try it, too, but I am not sure it would work. Currently I can't test it because the last time I completely discharged the battery. I also disconnected it to be able to get the realtek chip back immediately for faster testing. Now, that I have reconnected the battery, I need to wait for it to be charged somewhat to be able to reproduce losing the network chip. Also, you could try diffing lspci -vvxxx -s output of working vs. distorting kernel version - perhaps some register setup has been changed (e.g. due to power management improvements or some such), which may encourage the card to get a problematic/corrupt state. I attached a tarball that contains lspci -vvxxx for - all devices / only the network chip - before / after modprobe r8169 - for all 3 kernel versions tested. I figured out that if I type the modprobe and lspci in the same command line, I can get diagnostics out of the machine, after all. It's not just the Realtek chip that has changed parameters. (Vague idea) I noticed that some devices have changed like this: - Memory behind bridge: 8000-801f - Prefetchable memory behind bridge: 8020-803f + Memory behind bridge: ff00-ff1f + Prefetchable memory behind bridge: ff20-ff3f Can't this cause a problem? E.g. programming the bridge with an address range that the bridge doesn't actually support? This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You attached a v4.1.0-rc8 dmesg log earlier. 
Would you mind collecting a v3.18.16 dmesg log, so we can compare them? I collected all 3 for you to compare them, compressed, attached. BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0 as suspicious. I will try the 4.0/4.1 kernels with this one reverted. These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at the code to see what might be going on: acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window] Bjorn Hi Bjorn and Boszormenyi, From the 3.18 kernel, we got a message: [0.126248] acpi PNP0A08:00: host bridge window [0x4-0xf] (ignored, not CPU addressable) And from 4.1.-rc8, we got another message: [0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored That smells like a 32bit overflow or 64bit cut-off issue. Hi Boszormenyi, could you please help to provide acpidump from the machine? I already did in a previous mail which was only sent to LKML, but here it is again. Thanks, Zoltán Thanks! Gerry acpidump.tgz Description: application/compressed-tar
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-21 19:55, Jiang Liu wrote:

On 2015/6/22 1:25, Jiang Liu wrote:

[...]

- Memory behind bridge: 8000-801f
- Prefetchable memory behind bridge: 8020-803f
+ Memory behind bridge: ff00-ff1f
+ Prefetchable memory behind bridge: ff20-ff3f

Can't this cause a problem? E.g. programming the bridge with an address range that the bridge doesn't actually support? This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8.

You attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a v3.18.16 dmesg log, so we can compare them?

I collected all 3 for you to compare them, compressed, attached. BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0 as suspicious. I will try the 4.0/4.1 kernels with this one reverted.

These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at the code to see what might be going on:

acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored
pci :00:1c.1: can't claim BAR 15 [mem 0xfdf0-0xfdff 64bit pref]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]

Bjorn

Hi Bjorn and Boszormenyi,

From the 3.18 kernel, we got a message:

[0.126248] acpi PNP0A08:00: host bridge window [0x4-0xf] (ignored, not CPU addressable)

And from 4.1-rc8, we got another message:

[0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x-0x window]; [mem 0x-0x window] ignored

That smells like a 32bit overflow or 64bit cut-off issue.

Hi Bjorn and Boszormenyi,

With v3.18.6, it uses u64 to compare resource ranges. We changed to use resource_size_t with recent changes, and resource_size_t may be u32 or u64 depending on configuration. So resource range [0x4-0xf] may have been cut off as [0x-0x], thus causing the trouble.

Hi Boszormenyi,

Could you please help to try the following test patch against v4.1-rc8?

I have tried it. The result (dmesg, lspci before/after modprobe) is attached. The "not CPU addressable" message shows up once in dmesg.

The device shows up in lspci and the module can be loaded. The previously experienced sluggishness is gone now, but the network doesn't work after modprobe. I think it was an expected outcome, since that particular range is ignored with this patch.

Thanks,
Zoltán

Thanks!
Gerry
---
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
index 8244f013f210..d7b8c392c420 100644
--- a/drivers/acpi/resource.c
+++ b/drivers/acpi/resource.c
@@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win,
 	res->start = attr->minimum;
 	res->end = attr->maximum;
 
+	if (res->start != attr->minimum || res->end != attr->maximum) {
+		pr_warn("resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
+			attr->minimum, attr->maximum);
+		return false;
+	}
 	/*
 	 * For bridges that translate addresses across the bridge,

Attachment: dmesg-lspci-xx2.tgz (application/compressed-tar)
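The round-trip check in the test patch can be tried outside the kernel. The following is only a minimal user-space sketch of the idea, assuming a 32-bit resource type; `narrow_size_t`, `struct window32` and `store_window()` are hypothetical names made up for this illustration and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 32-bit resource_size_t (e.g. a non-PAE build). */
typedef uint32_t narrow_size_t;

struct window32 {
	narrow_size_t start;
	narrow_size_t end;
};

/*
 * Store a 64-bit firmware range into the narrow type, then compare the
 * stored values against the originals. If either endpoint changed, the
 * range did not fit and the window is rejected instead of being used
 * in its truncated form.
 */
bool store_window(struct window32 *win, uint64_t min, uint64_t max)
{
	win->start = (narrow_size_t)min;
	win->end = (narrow_size_t)max;

	if (win->start != min || win->end != max) {
		fprintf(stderr,
			"window [%#llx-%#llx] ignored, not CPU addressable\n",
			(unsigned long long)min, (unsigned long long)max);
		return false;
	}
	return true;
}
```

With made-up example values (the real addresses are not recoverable from the logs quoted here), a range such as 0xf0000000-0xfed8ffff is stored unchanged, while a range starting at 0x400000000 loses its upper bits when stored and is rejected.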
Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
On 2015-06-20 01:13, Rafael J. Wysocki wrote:
> On Friday, June 19, 2015 03:46:48 PM Boszormenyi Zoltan wrote:
>> Hi,
>>
>> so after the network card came alive again, I tried kernels
>> 3.18.16, 4.0.5 and 4.1.0-rc8. With the last two kernels, when
>> loading the r8169 driver, I experience the symptoms described
>> below. Also, after booting 4.0.5 and then 4.1.0-rc8, the network
>> card disappeared from the PCI devices again, neither PXE shows up
>> nor the device in lspci.
>>
>> It seems I will have to wait again until the battery loses its
>> capacity since the last testing to get the network chip back.
>>
>> I would be happy to test patches that may fix this behavior.
>>
>> With 3.18.16 and the device in lspci, the network works with r8169.
>
> The only thing I can suggest is to try this patch:
>
> https://patchwork.kernel.org/patch/6628061/
>
> and see if it makes any difference.

Thanks, I tried it on 4.1.0-rc8 and it didn't make a difference.

Attached are the dmesg and the acpidump output, both compressed.
I hope you or someone else can see something that fixes this issue.
Until then, I am stuck with 3.18.16 on this machine.

Thanks in advance,
Zoltán Böszörményi

>> On 2015-06-19 15:31, Boszormenyi Zoltan wrote:
>>> Never mind, this is a POS machine with a big battery inside.
>>> When I allowed it to discharge, the network card came back
>>> with PXE boot. There might have been some bad state kept
>>> by the battery.
>>>
>>> Sorry for the noise.
>>>
>>> On 2015-06-19 15:24, Boszormenyi Zoltan wrote:
>>>> Hi,
>>>>
>>>> I have a problem on a special POS mainboard that has
>>>> a Realtek RTL8111/8168/8411 chip. I use mainline kernel 4.0.5.
>>>>
>>>> The initial problem was that when r8169 was not blacklisted,
>>>> as soon as this driver loaded, a lot of IRQ problems popped up,
>>>> like pressing keys on the USB keyboard made the keys duplicated
>>>> and the system was sluggish. Upon powering off the system,
>>>> the r8169 driver complained about "rtl_eriar_cond = 1 loop 100"
>>>> or something like that and the system couldn't even reboot or
>>>> get powered down properly.
>>>>
>>>> It was impossible to get dmesg or other diagnostics info out of
>>>> the system in this state.
>>>>
>>>> When I blacklisted r8169, everything was OK except there was
>>>> no network, obviously.
>>>>
>>>> I also noticed that with kernel 4.0.5, there are memory range conflicts, like
>>>>
>>>> pci :00:02.0: can't claim BAR 0 [mem ]: address conflict with PCI Bus :00 [mem ... window]
>>>>
>>>> I also tried to load the r8168 driver from Realtek, with the
>>>> same results as with r8169.
>>>>
>>>> I don't know what happened, was it the "official" Realtek driver
>>>> that disabled the chip, or that I toggled the PXE boot in the BIOS,
>>>> but now lspci doesn't list the ethernet chip anymore and not even
>>>> the PXE boot messages show up, despite it being enabled in the BIOS.
>>>> I tried kernels 3.18.16, 4.0.5 again and 4.1.0-rc8.
>>>>
>>>> I have this in dmesg:
>>>>
>>>> [0.136171] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 *7 10 11 12 14 15)
>>>> [0.136323] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 *14 15)
>>>> [0.136466] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 *15)
>>>> [0.136609] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 14 *15)
>>>> [0.136751] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>>>> [0.136894] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>>>> [0.137050] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>>>> [0.137195] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 *6 7 10 11 12 14 15)
>>>>
>>>> and
>>>>
>>>> [0.139098] PCI: Using ACPI for IRQ routing
>>>> [0.139098] PCI: pci_cache_line_size set to 64 bytes
>>>> [0.139098] pci :00:02.0: can't claim BAR 0 [mem 0xfeb0-0xfeb7]: address
ACPI regression? Was Re: Ethernet chip disappeared from lspci
Hi,

so after the network card came alive again, I tried kernels
3.18.16, 4.0.5 and 4.1.0-rc8. With the last two kernels, when
loading the r8169 driver, I experience the symptoms described
below. Also, after booting 4.0.5 and then 4.1.0-rc8, the network
card disappeared from the PCI devices again, neither PXE shows up
nor the device in lspci.

It seems I will have to wait again until the battery loses its
capacity since the last testing to get the network chip back.

I would be happy to test patches that may fix this behavior.

With 3.18.16 and the device in lspci, the network works with r8169.

Best regards,
Zoltán Böszörményi

On 2015-06-19 15:31, Boszormenyi Zoltan wrote:
> Never mind, this is a POS machine with a big battery inside.
> When I allowed it to discharge, the network card came back
> with PXE boot. There might have been some bad state kept
> by the battery.
>
> Sorry for the noise.
>
> On 2015-06-19 15:24, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> I have a problem on a special POS mainboard that has
>> a Realtek RTL8111/8168/8411 chip. I use mainline kernel 4.0.5.
>>
>> The initial problem was that when r8169 was not blacklisted,
>> as soon as this driver loaded, a lot of IRQ problems popped up,
>> like pressing keys on the USB keyboard made the keys duplicated
>> and the system was sluggish. Upon powering off the system,
>> the r8169 driver complained about "rtl_eriar_cond = 1 loop 100"
>> or something like that and the system couldn't even reboot or
>> get powered down properly.
>>
>> It was impossible to get dmesg or other diagnostics info out of
>> the system in this state.
>>
>> When I blacklisted r8169, everything was OK except there was
>> no network, obviously.
>>
>> I also noticed that with kernel 4.0.5, there are memory range conflicts, like
>>
>> pci :00:02.0: can't claim BAR 0 [mem ]: address conflict with PCI Bus :00 [mem ... window]
>>
>> I also tried to load the r8168 driver from Realtek, with the
>> same results as with r8169.
>>
>> I don't know what happened, was it the "official" Realtek driver
>> that disabled the chip, or that I toggled the PXE boot in the BIOS,
>> but now lspci doesn't list the ethernet chip anymore and not even
>> the PXE boot messages show up, despite it being enabled in the BIOS.
>> I tried kernels 3.18.16, 4.0.5 again and 4.1.0-rc8.
>>
>> I have this in dmesg:
>>
>> [0.136171] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 *7 10 11 12 14 15)
>> [0.136323] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 *14 15)
>> [0.136466] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 *15)
>> [0.136609] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 14 *15)
>> [0.136751] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [0.136894] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [0.137050] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [0.137195] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 *6 7 10 11 12 14 15)
>>
>> and
>>
>> [0.139098] PCI: Using ACPI for IRQ routing
>> [0.139098] PCI: pci_cache_line_size set to 64 bytes
>> [0.139098] pci :00:02.0: can't claim BAR 0 [mem 0xfeb0-0xfeb7]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>> [0.139098] pci :00:02.0: can't claim BAR 2 [mem 0xd000-0xdfff pref]: address conflict with PCI Bus :00 [mem 0x7f70-0xdfff window]
>> [0.139104] pci :00:02.0: can't claim BAR 3 [mem 0xfea0-0xfeaf]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>> [0.139113] pci :00:02.1: can't claim BAR 0 [mem 0xfeb8-0xfebf]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>> [0.139123] pci :00:1b.0: can't claim BAR 0 [mem 0xfe9f8000-0xfe9fbfff 64bit]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>> [0.139146] pci :00:1d.7: can't claim BAR 0 [mem 0xfe9f7c00-0xfe9f7fff]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>> [0.139161] pci :00:1f.2: can't claim BAR 5 [mem 0xfe9f7800-0xfe9f7bff]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
>> [0.139190] Expanded resource reserved due to conflict with PCI Bus :00
>>
>> Full dmesg for 4.0.5 is attached.
>>
>> Can anyone help me re-enable the network card?
>>
>> Thanks in advance,
>> Zoltán Böszörményi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Ethernet chip disappeared from lspci
Hi,

I have a problem on a special POS mainboard that has
a Realtek RTL8111/8168/8411 chip. I use mainline kernel 4.0.5.

The initial problem was that when r8169 was not blacklisted,
as soon as this driver loaded, a lot of IRQ problems popped up,
like pressing keys on the USB keyboard made the keys duplicated
and the system was sluggish. Upon powering off the system,
the r8169 driver complained about "rtl_eriar_cond = 1 loop 100"
or something like that and the system couldn't even reboot or
get powered down properly.

It was impossible to get dmesg or other diagnostics info out of
the system in this state.

When I blacklisted r8169, everything was OK except there was
no network, obviously.

I also noticed that with kernel 4.0.5, there are memory range conflicts, like

pci :00:02.0: can't claim BAR 0 [mem ]: address conflict with PCI Bus :00 [mem ... window]

I also tried to load the r8168 driver from Realtek, with the
same results as with r8169.

I don't know what happened, was it the "official" Realtek driver
that disabled the chip, or that I toggled the PXE boot in the BIOS,
but now lspci doesn't list the ethernet chip anymore and not even
the PXE boot messages show up, despite it being enabled in the BIOS.
I tried kernels 3.18.16, 4.0.5 again and 4.1.0-rc8.

I have this in dmesg:

[0.136171] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 *7 10 11 12 14 15)
[0.136323] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 *14 15)
[0.136466] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 *15)
[0.136609] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 14 *15)
[0.136751] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[0.136894] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[0.137050] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[0.137195] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 *6 7 10 11 12 14 15)

and

[0.139098] PCI: Using ACPI for IRQ routing
[0.139098] PCI: pci_cache_line_size set to 64 bytes
[0.139098] pci :00:02.0: can't claim BAR 0 [mem 0xfeb0-0xfeb7]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
[0.139098] pci :00:02.0: can't claim BAR 2 [mem 0xd000-0xdfff pref]: address conflict with PCI Bus :00 [mem 0x7f70-0xdfff window]
[0.139104] pci :00:02.0: can't claim BAR 3 [mem 0xfea0-0xfeaf]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
[0.139113] pci :00:02.1: can't claim BAR 0 [mem 0xfeb8-0xfebf]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
[0.139123] pci :00:1b.0: can't claim BAR 0 [mem 0xfe9f8000-0xfe9fbfff 64bit]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
[0.139146] pci :00:1d.7: can't claim BAR 0 [mem 0xfe9f7c00-0xfe9f7fff]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
[0.139161] pci :00:1f.2: can't claim BAR 5 [mem 0xfe9f7800-0xfe9f7bff]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
[0.139190] Expanded resource reserved due to conflict with PCI Bus :00

Full dmesg for 4.0.5 is attached.

Can anyone help me re-enable the network card?

Thanks in advance,
Zoltán Böszörményi

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.0.5 (zboszorme...@buildbox-0001.sicom.com) (gcc version 4.8.2 (GCC) ) #1 SMP Wed Jun 17 12:17:52 EDT 2015
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x7f58] usable
[0.00] BIOS-e820: [mem 0x7f59-0x7f59dfff] ACPI data
[0.00] BIOS-e820: [mem 0x7f59e000-0x7f5c] ACPI NVS
[0.00] BIOS-e820: [mem 0x7f5d-0x7f5d] reserved
[0.00] BIOS-e820: [mem 0x7f5ed000-0x7fff] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xfff0-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel!
[0.00] SMBIOS 2.6 present.
[0.00] DMI: SI SL20/SL20, BIOS 080016 04/01/2013
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x7f590 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled: [
Re: Ethernet chip disappeared from lspci
Never mind, this is a POS machine with a big battery inside.
When I allowed it to discharge, the network card came back
with PXE boot. There might have been some bad state kept
by the battery.

Sorry for the noise.

On 2015-06-19 15:24, Boszormenyi Zoltan wrote:
> Hi,
>
> I have a problem on a special POS mainboard that has
> a Realtek RTL8111/8168/8411 chip. I use mainline kernel 4.0.5.
>
> The initial problem was that when r8169 was not blacklisted,
> as soon as this driver loaded, a lot of IRQ problems popped up,
> like pressing keys on the USB keyboard made the keys duplicated
> and the system was sluggish. Upon powering off the system,
> the r8169 driver complained about "rtl_eriar_cond = 1 loop 100"
> or something like that and the system couldn't even reboot or
> get powered down properly.
>
> It was impossible to get dmesg or other diagnostics info out of
> the system in this state.
>
> When I blacklisted r8169, everything was OK except there was
> no network, obviously.
>
> I also noticed that with kernel 4.0.5, there are memory range conflicts, like
>
> pci :00:02.0: can't claim BAR 0 [mem ]: address conflict with PCI Bus :00 [mem ... window]
>
> I also tried to load the r8168 driver from Realtek, with the
> same results as with r8169.
>
> I don't know what happened, was it the "official" Realtek driver
> that disabled the chip, or that I toggled the PXE boot in the BIOS,
> but now lspci doesn't list the ethernet chip anymore and not even
> the PXE boot messages show up, despite it being enabled in the BIOS.
> I tried kernels 3.18.16, 4.0.5 again and 4.1.0-rc8.
>
> I have this in dmesg:
>
> [0.136171] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 *7 10 11 12 14 15)
> [0.136323] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 *14 15)
> [0.136466] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 *15)
> [0.136609] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 14 *15)
> [0.136751] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [0.136894] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [0.137050] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [0.137195] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 *6 7 10 11 12 14 15)
>
> and
>
> [0.139098] PCI: Using ACPI for IRQ routing
> [0.139098] PCI: pci_cache_line_size set to 64 bytes
> [0.139098] pci :00:02.0: can't claim BAR 0 [mem 0xfeb0-0xfeb7]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
> [0.139098] pci :00:02.0: can't claim BAR 2 [mem 0xd000-0xdfff pref]: address conflict with PCI Bus :00 [mem 0x7f70-0xdfff window]
> [0.139104] pci :00:02.0: can't claim BAR 3 [mem 0xfea0-0xfeaf]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
> [0.139113] pci :00:02.1: can't claim BAR 0 [mem 0xfeb8-0xfebf]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
> [0.139123] pci :00:1b.0: can't claim BAR 0 [mem 0xfe9f8000-0xfe9fbfff 64bit]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
> [0.139146] pci :00:1d.7: can't claim BAR 0 [mem 0xfe9f7c00-0xfe9f7fff]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
> [0.139161] pci :00:1f.2: can't claim BAR 5 [mem 0xfe9f7800-0xfe9f7bff]: address conflict with PCI Bus :00 [mem 0xf000-0xfed8 window]
> [0.139190] Expanded resource reserved due to conflict with PCI Bus :00
>
> Full dmesg for 4.0.5 is attached.
>
> Can anyone help me re-enable the network card?
>
> Thanks in advance,
> Zoltán Böszörményi
Re: New prototype computer problem with S3 suspend
On 2013-06-07 03:17, Aaron Lu wrote:
> On 06/07/2013 02:11 AM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> we are working on an Intel Atom-based embedded PC and I have to make suspend-to-RAM work but I can't seem to succeed.
>>
>> The symptom is that quite often, the machine resumes immediately after pm-suspend. Sometimes more than 20 times out of 50 attempts.
>
> Can you please file a bug about this?
> https://bugzilla.kernel.org
>
>> I have tried 3.7.10, 3.9.4, 3.10-rc[234] and the linux-next branch from the git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git repository. The attached dmesg is from today's linux-pm/linux-next plus the latest drm-fixes patchset posted by Dave Airlie.
>>
>> I have tried disabling wakeup devices via /proc/acpi/wakeup and via sysfs files. (/sys/devices/.../wakeup)
>
> From the dmesg, the following three devices are still armed with wakeup capability and might be the cause:
>
> i8042 kbd 00:03: System wakeup enabled by ACPI
> PM: suspend of devices complete after 578.883 msecs
> PM: late suspend of devices complete after 0.279 msecs
> pcieport :00:1c.1: System wakeup enabled by ACPI
> ehci-pci :00:1d.7: System wakeup enabled by ACPI
> PM: noirq suspend of devices complete after 31.946 msecs
>
> Anyway, please file a bug there, thanks.

For the suspend bug: https://bugzilla.kernel.org/show_bug.cgi?id=59401
For the warnings in i915: https://bugs.freedesktop.org/show_bug.cgi?id=65497

Best regards,
Zoltán Böszörményi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
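The message mentions disabling wakeup sources through sysfs files. A minimal sketch of that step in C follows. It assumes the attribute in question accepts the string "disabled" (as the per-device power/wakeup attribute does, as far as I know); the helper name `disable_wakeup` is hypothetical, and the elided /sys/devices/.../wakeup path from the message has to be supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the thread: write "disabled" to a
 * device's wakeup attribute. attr_path is the full path to the attribute
 * file (the message only gives it as /sys/devices/.../wakeup).
 * Returns 0 on success, -1 on failure.
 */
static int disable_wakeup(const char *attr_path)
{
    static const char value[] = "disabled";
    FILE *f;
    int ok;

    assert(attr_path != NULL);

    f = fopen(attr_path, "w");
    if (f == NULL)
        return -1;

    /* The whole string must be written for the call to count as success. */
    ok = fwrite(value, 1, strlen(value), f) == strlen(value);
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling it for each device that dmesg reports as "System wakeup enabled by ACPI" and then retrying pm-suspend would be one way to narrow down which source causes the immediate resume.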
Re: ahci driver cannot suspend a CF card, ata_piix can
Hi,

On 2013-04-19 17:14, Aaron Lu wrote:
> On 04/16/2013 07:58 PM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> The SATA interface can be switched between AHCI and legacy modes as usual:
>>
>> 00:1f.2 SATA controller: Intel Corporation NM10/ICH7 Family SATA Controller [AHCI mode] (rev 02)
>>
>> The device attached to the SATA controller is always a CF card via a SiI3811 Serial ATA to Parallel ATA Device Bridge based converter board.
>
> Can you please attach a normal sata disk and try again with the controller set to AHCI mode? I suspect it may be the SiI3811 bridge that caused this command.

I have finally got around to testing a different beast, a Zotac ZBOX with mostly the same motherboard components that the previously mentioned embedded PC had.

It turned out that this embedded PC (a POS machine) doesn't have regular SATA connectors but a special one to connect only this SATA-EIDE converter card with the SiI3811 chip.

Anyway, using the Zotac box, both a shiny new Samsung 840 SSD and a CF card using a different IDE bridge chip (Sunplus SATALink SPIF223A) can suspend to RAM using AHCI mode.

Your suspicion turned out to be correct.

Best regards,
Zoltán Böszörményi
Re: ahci driver cannot suspend a CF card, ata_piix can
Hi,

On 2013-04-19 17:14, Aaron Lu wrote:
> On 04/16/2013 07:58 PM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> The SATA interface can be switched between AHCI and legacy modes as usual:
>>
>> 00:1f.2 SATA controller: Intel Corporation NM10/ICH7 Family SATA Controller [AHCI mode] (rev 02)
>>
>> The device attached to the SATA controller is always a CF card via a SiI3811 Serial ATA to Parallel ATA Device Bridge based converter board.
>
> Can you please attach a normal sata disk and try again with the controller set to AHCI mode? I suspect it may be the SiI3811 bridge that caused this command.
>
> Thanks,
> Aaron

Unfortunately, there's no way for me to open the box at the moment. This particular computer is very compact and the screws are very tight. All the interesting devices are sealed behind its touchscreen.

But: how come the SUSPEND IMMEDIATE ATA command succeeds when using ata_piix? Is it sent at all in legacy mode?

Thanks,
Zoltán Böszörményi
ahci driver cannot suspend a CF card, ata_piix can
Hi,

I am working with an embedded system, using the Intel Atom CPU, and the usual motherboard components:

# lspci
00:00.0 Host bridge: Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx DMI Bridge (rev 02)
00:02.0 VGA compatible controller: Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx Integrated Graphics Controller (rev 02)
00:02.1 Display controller: Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx Integrated Graphics Controller (rev 02)
00:1b.0 Audio device: Intel Corporation NM10/ICH7 Family High Definition Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 2 (rev 02)
00:1c.2 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 3 (rev 02)
00:1c.3 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 4 (rev 02)
00:1d.0 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #1 (rev 02)
00:1d.1 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #2 (rev 02)
00:1d.2 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #3 (rev 02)
00:1d.3 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #4 (rev 02)
00:1d.7 USB controller: Intel Corporation NM10/ICH7 Family USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation NM10 Family LPC Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation NM10/ICH7 Family SATA Controller [IDE mode] (rev 02)
00:1f.3 SMBus: Intel Corporation NM10/ICH7 Family SMBus Controller (rev 02)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller (rev 07)

The SATA interface can be switched between AHCI and legacy modes as usual:

00:1f.2 SATA controller: Intel Corporation NM10/ICH7 Family SATA Controller [AHCI mode] (rev 02)

The device attached to the SATA controller is always a CF card via a SiI3811 Serial ATA to Parallel ATA Device Bridge based converter board.

Recently, I have experimented with suspend to disk and suspend to RAM. I used the pm-utils-1.4.1-22 package from Fedora 18 on this dumbed down (OpenEmbedded-based) Linux that we run.

What I have found was that when the SATA interface is in AHCI mode, pm-suspend (suspend to RAM via ACPI) doesn't work because the CF card seems to return an error for SUSPEND IMMEDIATE. However, using the legacy mode (the ata_piix driver), suspend (and resume) works. The kernel version is 3.7.10.

The kernel messages for a successful suspend with ata_piix, resume was user-initiated by pressing the power button:

...
Apr 16 19:09:52 localhost kernel: ata2.00: configured for UDMA/133
Apr 16 19:09:52 localhost kernel: ata2: EH complete
Apr 16 19:09:52 localhost kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 16 19:09:58 localhost kernel: EXT4-fs (sda6): re-mounted. Opts: commit=0
Apr 16 19:09:58 localhost kernel: EXT4-fs (sda1): re-mounted. Opts: commit=0
Apr 16 19:10:23 localhost kernel: PM: Syncing filesystems ... done.
Apr 16 19:10:23 localhost kernel: Freezing user space processes ... (elapsed 0.01 seconds) done.
Apr 16 19:10:23 localhost kernel: Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Apr 16 19:10:23 localhost kernel: Suspending console(s) (use no_console_suspend to debug)
Apr 16 19:10:23 localhost kernel: sd 1:0:0:0: [sda] Synchronizing SCSI cache
Apr 16 19:10:23 localhost kernel: sd 1:0:0:0: [sda] Stopping disk
Apr 16 19:10:23 localhost kernel: serial 00:0b: disabled
Apr 16 19:10:23 localhost kernel: serial 00:0a: disabled
Apr 16 19:10:23 localhost kernel: serial 00:09: disabled
Apr 16 19:10:23 localhost kernel: serial 00:08: disabled
Apr 16 19:10:23 localhost kernel: serial 00:07: disabled
Apr 16 19:10:25 localhost kernel: PM: suspend of devices complete after 47.656 msecs
Apr 16 19:10:25 localhost kernel: PM: late suspend of devices complete after 0.204 msecs
Apr 16 19:10:25 localhost kernel: pcieport :00:1c.1: wake-up capability enabled by ACPI
Apr 16 19:10:25 localhost kernel: ehci_hcd :00:1d.7: wake-up capability enabled by ACPI
Apr 16 19:10:25 localhost kernel: uhci_hcd :00:1d.3: wake-up capability enabled by ACPI
Apr 16 19:10:25 localhost kernel: uhci_hcd :00:1d.2: wake-up capability enabled by ACPI
Apr 16 19:10:25 localhost kernel: uhci_hcd :00:1d.1: wake-up capability enabled by ACPI
Apr 16 19:10:25 localhost kernel: uhci_hcd :00:1d.0: wake-up capability enabled by ACPI
Apr 16 19:10:25 localhost kernel: PM: noirq suspend of devices complete after 29.450 msecs
Apr 16 19:10:25 localhost kernel: ACPI: Preparing to enter system sleep state S3
Apr 16 19:10:25 localhost kernel: PM: Saving platform NVS memory
Apr 16 19:10:25 localhost kernel: Disabling non-boot CPUs ...
Apr 16 19:10:25 localhost kernel: smpboot: CPU 1 is now
Re: loadavg question
Hi,

yes, that commit is in my 3.7.9 kernel source, but not in 3.3.2. Also, I have CONFIG_NO_HZ=n in my .config. I will retry with CONFIG_NO_HZ=y with 3.7.9.

Thanks,
Zoltán Böszörményi

On 2013-02-28 15:05, Azat Khuzhin wrote:
> Hi,
>
> Could you check whether you have the commit mentioned here?
> http://comments.gmane.org/gmane.linux.kernel/1346108
>
> Respectfully Azat Khuzhin.
> From phone.
>
> On Feb 28, 2013 4:40 PM, "Boszormenyi Zoltan" <zbos...@pr.hu> wrote:
>> Hi,
>>
>> on an embedded PC (500MHz AMD Geode), we recently tried to upgrade the kernel. The original version was 2.6.27, the new ones were 3.3.x and 3.7.x. With the same userspace (a GTK based GUI constantly querying daemons handling different pieces of hardware over serial ports), we noticed that loadavg is different while the idle time (95-97%) is the same. On 2.6.27, the load was around 0.05-0.3 while on 3.3.x and 3.7.x, it quickly crawled up to around 5.
>>
>> Has the load calculation changed so dramatically after 2.6.27?
>>
>> Thanks in advance,
>> Zoltán Böszörményi
loadavg question
Hi,

on an embedded PC (500MHz AMD Geode), we recently tried to upgrade the kernel. The original version was 2.6.27, the new ones were 3.3.x and 3.7.x. With the same userspace (a GTK based GUI constantly querying daemons handling different pieces of hardware over serial ports), we noticed that loadavg is different while the idle time (95-97%) is the same. On 2.6.27, the load was around 0.05-0.3 while on 3.3.x and 3.7.x, it quickly crawled up to around 5.

Has the load calculation changed so dramatically after 2.6.27?

Thanks in advance,
Zoltán Böszörményi
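For readers unfamiliar with how loadavg is derived, here is a minimal sketch of the fixed-point exponential average the kernel uses for the 1-minute figure. It is an illustration only: the constants are the ones I recall from the kernel headers and should be checked against the tree in question, the function name `calc_load_step` is hypothetical, and newer kernels add rounding that is omitted here. The point is that the input is a count of active (runnable or uninterruptibly waiting) tasks sampled every 5 seconds, not CPU idle time, so the two can diverge.

```c
#include <assert.h>

#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point */
#define EXP_1    1884UL             /* decay factor for the 1-minute average,
                                       assuming a 5-second sampling interval */

/*
 * One update step: decay the old load and blend in the current number of
 * active tasks. Both load and active are fixed-point values (scaled by
 * FIXED_1). Hypothetical name; modelled on the kernel's calculation.
 */
static unsigned long calc_load_step(unsigned long load, unsigned long exp,
                                    unsigned long active)
{
    assert(exp <= FIXED_1);

    load *= exp;
    load += active * (FIXED_1 - exp);
    return load >> FSHIFT;
}
```

With this model, five tasks that happen to be counted as active at every sample push the average toward 5 even if each of them uses very little CPU between samples.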
Re: radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
On 2013-01-04 08:40, Borislav Petkov wrote:
> On Wed, Jan 02, 2013 at 06:37:23PM -0500, Alex Deucher wrote:
>> From: Alex Deucher <alexander.deuc...@amd.com>
>> Date: Wed, 2 Jan 2013 18:30:21 -0500
>> Subject: [PATCH] drm/radeon/r6xx: fix DMA engine for ttm bo transfers
>>
>> count must be a multiple of 2.
>>
>> Cc: Borislav Petkov <b...@alien8.de>
>> Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
>> Signed-off-by: Alex Deucher <alexander.deuc...@amd.com>
>
> Thanks, will run it on the box in question next week when I have access.
>
> Btw, you could add the note about count needing to be a multiple of 2 as a comment in the code below, for future reference.
>
>> ---
>>  drivers/gpu/drm/radeon/r600.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
>> index 2aaf147..9f4ce5e 100644
>> --- a/drivers/gpu/drm/radeon/r600.c
>> +++ b/drivers/gpu/drm/radeon/r600.c
>> @@ -2636,8 +2636,8 @@ int r600_copy_dma(struct radeon_device *rdev,
>>      for (i = 0; i < num_loops; i++) {
>>          cur_size_in_dw = size_in_dw;
>> -        if (cur_size_in_dw > 0x)
>> -            cur_size_in_dw = 0x;
>> +        if (cur_size_in_dw > 0xFFFE)
>> +            cur_size_in_dw = 0xFFFE;

How about any other odd numbers? Like 0xFFFB, or 0x0003? They will get passed as is after this change, no? Shouldn't they be also fixed? Something like this below?

    if (cur_size_in_dw & 0x0001)
        cur_size_in_dw &= ~1;

>>          size_in_dw -= cur_size_in_dw;
>>          radeon_ring_write(ring, DMA_PACKET(DMA_PACKET_COPY, 0, 0, cur_size_in_dw));
>>          radeon_ring_write(ring, dst_offset & 0xfffc);
>> --
>> 1.7.7.5
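The suggestion in this message (clamp to the limit from the patch, then clear the low bit so odd counts are also rounded down) can be sketched as a standalone helper. This is an illustration, not driver code: the name `next_dma_chunk_dw` is hypothetical, and only the 0xFFFE limit and the `&= ~1` idea come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Upper limit used in the patch above; already a multiple of 2. */
#define DMA_MAX_DW 0xFFFEu

/*
 * Hypothetical helper, not part of the radeon driver: return how many
 * dwords to put in the next copy packet, clamped to DMA_MAX_DW and
 * rounded down to a multiple of 2.
 */
static uint32_t next_dma_chunk_dw(uint32_t remaining_dw)
{
    uint32_t chunk = remaining_dw;

    if (chunk > DMA_MAX_DW)
        chunk = DMA_MAX_DW;
    chunk &= ~1u;               /* 0xFFFB -> 0xFFFA, 0x0003 -> 0x0002 */

    assert((chunk & 1u) == 0);  /* the result is always even */
    return chunk;
}
```

Note that rounding down leaves one dword over whenever the total is odd (an input of 1 yields 0), so a caller using this approach would have to handle or rule out odd totals separately.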
Re: radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
On 2013-01-03 00:37, Alex Deucher wrote:
> On Wed, Jan 2, 2013 at 5:38 PM, Markus Trippelsdorf <mar...@trippelsdorf.de> wrote:
>> On 2013.01.02 at 17:31 -0500, Jerome Glisse wrote:
>>> Please affected people can you test if patch :
>>>
>>> http://people.freedesktop.org/~glisse/0003-drm-radeon-fix-dma-copy-on-r6xx-r7xx-evergen-ni-si-g.patch
>>>
>>> Fix the issue, you need to make sure you don't have the patch that disable dma on r6xx ie that line 977-978 & 1061-1062 in radeon_asic.c is :
>>>     .copy = &r600_copy_dma,
>>>     .copy_ring_index = R600_RING_TYPE_DMA_INDEX,
>>
>> It fixes the issue for me. Thanks.
>
> The count is actually the count, not count - 1. The real fix seems to be that r6xx requires 2 dw aligned transfers. The attached patch fixes the issue for me.
>
> Alex

I tried this patch over kernel 3.8.0-rc2 but the GDM screen is mostly garbage. Only some text, like "Not on the list?" below the users and small icons are visible but many user names are not rendered.

http://tinypic.com/r/33xihit/6

I am on Fedora 18/x86_64, Radeon HD6570.

Best regards,
Zoltán Böszörményi
Re: Drop support for x86-32
Dear wbrana,

this would have been the perfect April 1st joke along the lines of removing support for *all* CPU architectures and adding support for the one true virtual CPU, the Turing machine. Now you spoiled it, shame on you! :-D

Best regards,
Zoltán Böszörményi
Re: [patch] softirq-2.4.6-C3
Hi! I tried this and do_softirq is still unresolved symbol if CONFIG_MODVERSIONS is set to y. Works ok if not. linux-2.4.6-pre1 + softirq-2.4.6-B4 + softirq-2.4.6-C3.

Regards,
Zoltan Boszormenyi
tcdrain() problem?
Hi!

I use the serial line for communication with a device. The communication looks like this:

1. ask the device about its state (a 4 byte packet)
2. read reply from device
3. decide what to do:
   3a. if nothing to do, go to 1.
   3b. if there is something to do, send the answer to the device
       this is a 23 byte packet
   3c. if the device ack'd the previous action, counter-ack it.

This is the function I use for sending:

void send_message(int fd, ptg_t *ptg, int sleeptime) {
        int len;

        /* tcflush(fd, TCIOFLUSH); */
        if (sleeptime)
                usleep(sleeptime);
        len = ptg->message_out[1];
        write(fd, ptg->message_out, len);
        tcdrain(fd);
#if DEBUG
        {
                int i;

                printf("%d sent: ", ptg->ptgnum);
                for(i = 0; i < len; i++)
                        printf("0x%02x ", ptg->message_out[i]);
                printf("\n");
        }
#endif
}

The line is full duplex, 19200 baud, 8N1. The serial device is opened with O_RDWR|O_NOCTTY|O_NDELAY.

My problem is that if the tcflush() is not there, sometimes between the "there's something to do" reply and the longish answer somehow a "what state are you in?" message goes out. This simply couldn't happen from the program data flow. I have put printf()s into every possible place and it does not show anything unusual. But an independent serial line sniffer (another PC) shows the above problem, no matter what the sleeptime is. (15000usec - 6usec)

So I suspect that tcdrain() does not do what the manpage says:

    tcdrain() waits until all output written to the object referred to by fd has been transmitted.

The above problem happened on two different PCs, the only common thing was the uart type: 16550A. Both PCs are running the 2.2.17-14 RH kernel and glibc-2.1.3-22 (RH 6.2), the serial driver compile options and version are from dmesg:

Serial driver version 4.27 with MANY_PORTS MULTIPORT SHARE_IRQ enabled

If I put the tcflush() before the write, the problem disappears.

Now the question is: Should I really issue a tcflush(), or is it a bug in tcdrain()? Can O_NDELAY cause tcdrain() not to wait?

Please cc the answer, I am not on the list.

Regards,
Zoltan Boszormenyi
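One thing that may be worth ruling out for the O_NDELAY question above: on Linux, O_NDELAY is normally the same flag as O_NONBLOCK, so write() on such a descriptor can return a short count or fail with EAGAIN, and send_message() as posted ignores its return value. The sketch below is only an illustration under that assumption, not a diagnosis of the stray packet. The helper names clear_nonblock(), write_all() and send_packet() are hypothetical and do not appear in the original program. It puts the descriptor back into blocking mode after open() and checks every write() before calling tcdrain().

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: put fd back into blocking mode after an
 * open() with O_NDELAY. Returns 0 on success, -1 on error. */
int clear_nonblock(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: write exactly len bytes, retrying on short
 * writes and on EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const unsigned char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;
			return -1;	/* includes EAGAIN on a non-blocking fd */
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical helper: send one packet, then ask the tty layer to wait
 * until the output has been transmitted. Returns 0 or -1. */
int send_packet(int fd, const unsigned char *buf, size_t len)
{
	if (write_all(fd, buf, len) == -1)
		return -1;
	return tcdrain(fd);
}
```

With these, the open() call could stay as it is, followed by one clear_nonblock(fd), and send_message() would call send_packet(fd, ptg->message_out, len) and report a -1 result instead of silently continuing.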
Re: Linux not adhering to BIOS Drive boot order?
On Wed, 17 Jan 2001, David Balazic wrote:

> BTW, where is the scsihosts= kernel parameter documented ?

linux/Documentation/filesystems/devfs/README

Regards,
Zoltan Boszormenyi
2.4.0-vmpatch-15.1 still no go
On Tue, 9 Jan 2001, Boszormenyi Zoltan wrote:

> Hi!
>
> PF_RSSTRIM is not declared anywhere either in the linux-2.4.0 sources
> or in the 2.4.0-vmbigpatch.

[zozo@localhost kernel]$ tar xIf linux-2.4.0.tar.bz2
[zozo@localhost kernel]$ cd linux
[zozo@localhost linux]$ cat ../patches/2.4.0/2.4.0-vmpatch-15.1 | patch -p1
patching file `kernel/sysctl.c'
patching file `kernel/fork.c'
patching file `mm/filemap.c'
patching file `mm/memory.c'
patching file `mm/page_alloc.c'
patching file `mm/swap.c'
patching file `mm/vmscan.c'
patching file `include/linux/sysctl.h'
patching file `include/linux/swap.h'
patching file `include/linux/mm.h'
patching file `Documentation/sysctl/vm.txt'
[zozo@localhost linux]$ find . -type f | xargs grep PF_RSSTRIM
./mm/vmscan.c:  if (mm->rss > rss_limit && !(p->flags & PF_RSSTRIM)) {
./mm/vmscan.c:          p->flags |= PF_RSSTRIM;
./mm/vmscan.c:          p->flags &= ~PF_RSSTRIM;
[zozo@localhost linux]$

Regards,
Zoltan Boszormenyi
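The three grep hits above follow the usual test / set / clear pattern for a per-task flag bit, which is why the build stops when no header defines the flag. A minimal userspace sketch of that pattern follows. The numeric value of PF_RSSTRIM, the cut-down structs and the function name rss_trim_once() are placeholders for illustration only (hypothetical), not taken from the patch or from the kernel headers, and what happens between setting and clearing the bit is left out because the grep output does not show it.

```c
/* Placeholder value, NOT from the patch: a real definition would have to
 * sit next to the other PF_* flags and use a bit none of them uses. */
#define PF_RSSTRIM 0x00080000UL

/* Cut-down stand-ins for the kernel structures (hypothetical). */
struct task { unsigned long flags; };
struct mm   { unsigned long rss; };

/* Mirrors the shape of the three lines found in mm/vmscan.c.
 * Returns 1 if the guarded section was entered, 0 otherwise. */
int rss_trim_once(struct task *p, struct mm *mm, unsigned long rss_limit)
{
	if (mm->rss > rss_limit && !(p->flags & PF_RSSTRIM)) {
		p->flags |= PF_RSSTRIM;		/* set the bit */
		/* ... work guarded by the flag would go here ... */
		p->flags &= ~PF_RSSTRIM;	/* clear it again */
		return 1;
	}
	return 0;
}
```

Without the #define line, a compiler rejects all three uses as an undeclared identifier, which matches the failure reported here.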