Re: [PATCH] usb: chipidea: Fix missing resume call after suspend
Hi Peter,

> Peter Chen wrote on 25 April 2017 at 10:14:
>
> Since you unplug the cable first, and plug in again. The driver will
> treat it as connection but not resume event. You may use
> /sys/class/udc/ci_hdrc.0/state to get udc's connection, eg "not attached"
> or other states to indicate connection.

Thanks for the hint. This works for me.

Regards,
Bernhard
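A minimal userspace sketch of the check suggested above, assuming the state attribute holds a single line of text. The helper names `udc_state_is_connected` and `read_udc_state` are hypothetical, and treating every state other than "not attached" as a connection is an assumption based only on the wording of the hint.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the UDC state string indicates
 * that a host is attached. "not attached" is the only state named in the
 * thread; treating every other non-empty state as "connected" is an
 * assumption.
 */
static bool udc_state_is_connected(const char *state)
{
	static const char not_attached[] = "not attached";
	size_t len = strcspn(state, "\r\n");	/* ignore the trailing newline */

	if (len == 0)
		return false;
	return !(len == strlen(not_attached) &&
		 strncmp(state, not_attached, len) == 0);
}

/*
 * Hypothetical helper: reads the first line of a sysfs attribute into buf.
 * Returns 0 on success and -1 if the file cannot be opened or read.
 */
static int read_udc_state(const char *path, char *buf, size_t size)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}
```

A caller would pass /sys/class/udc/ci_hdrc.0/state to `read_udc_state` and feed the buffer to `udc_state_is_connected`.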
Re: [PATCH] usb: chipidea: Fix missing resume call after suspend
Hi,

> Peter Chen wrote on 24 April 2017 at 05:51:
>
> The current code logic is:
> - When the resume is received from host, the ci->driver->resume is
>   called, and suspended is cleared.
> - When the reset is received from host, the isr_reset_handler is called,
>   and suspended is cleared by _gadget_stop_activity. Since reset is
>   called, so ci->driver->resume doesn't need to be called.

My problem is that dump_stack() doesn't show the complete stack trace.
What I see from debug messages is that when I plug in a cable (after the
host has been suspended), _gadget_stop_activity is executed before
udc_irq.

My first attempt was to save ci->suspended at the beginning of udc_irq
(i.e. before isr_reset_handler is called), but that doesn't work. Even at
the beginning of udc_irq, ci->suspended is 0. So isr_reset_handler is
called before.

The result is that when I unplug a cable and attach it again,
driver->suspend has been called but driver->resume doesn't get called.

> There is a patch to fix clear suspended even the ci->driver->resume is
> NULL at v4.12-rc1.
>
> usb: chipidea: udc: update gadget state after bus resume

Thanks for the hint. That makes the second part of my patch irrelevant,
but the first part (removing the ci->suspended = 0) is needed in order to
make my setup work.

Any idea how to find out why reset is called before resume? According to
your explanation, that shouldn't be the case.

Regards,
Bernhard
[PATCH] usb: chipidea: Fix missing resume call after suspend
We have i.MX53-based hardware (quite similar to the i.MX53 QSB from
Freescale/NXP). I'm reading the /ci_hdrc.0/gadget/suspended sysfs file to
find out whether a PC is connected to the USB gadget.

With old kernel versions, this worked. However, with kernel 4.9 it
doesn't. Once the host has been suspended, the suspended status is never
set back to 0.

The reason is that this seems to be done in the resume handler, which
should be executed in the interrupt handler:

    udc_irq:
        ...
        if (USBi_PCI & intr) {
            ci->gadget.speed = hw_port_is_high_speed(ci) ?
                USB_SPEED_HIGH : USB_SPEED_FULL;
            if (ci->suspended && ci->driver->resume) {
                spin_unlock(&ci->lock);
                ci->driver->resume(&ci->gadget);
                spin_lock(&ci->lock);
                ci->suspended = 0;
            }
        }
        ...

However, ci->suspended is already 0 here because _gadget_stop_activity
is called before. So the resume handler never gets called.

The obvious solution is to not touch ci->suspended in
_gadget_stop_activity and to trust the interrupt handler to set it back
(and to modify it to set ci->suspended to 0 even if ci->driver->resume
is NULL).

This code works on my platform. However, since I didn't write the driver
and since I have no deep understanding of it, I cannot determine if there
are any negative side effects, so I hope to get some review here.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
 drivers/usb/chipidea/udc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index f88e9157fad0..0c780ba39b37 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -712,7 +712,6 @@ static int _gadget_stop_activity(struct usb_gadget *gadget)
 	spin_lock_irqsave(&ci->lock, flags);
 	ci->gadget.speed = USB_SPEED_UNKNOWN;
 	ci->remote_wakeup = 0;
-	ci->suspended = 0;
 	spin_unlock_irqrestore(&ci->lock, flags);
 
 	/* flush all endpoints */
@@ -1845,10 +1844,12 @@ static irqreturn_t udc_irq(struct ci_hdrc *ci)
 		if (USBi_PCI & intr) {
 			ci->gadget.speed = hw_port_is_high_speed(ci) ?
 				USB_SPEED_HIGH : USB_SPEED_FULL;
-			if (ci->suspended && ci->driver->resume) {
-				spin_unlock(&ci->lock);
-				ci->driver->resume(&ci->gadget);
-				spin_lock(&ci->lock);
+			if (ci->suspended) {
+				if (ci->driver->resume) {
+					spin_unlock(&ci->lock);
+					ci->driver->resume(&ci->gadget);
+					spin_lock(&ci->lock);
+				}
 				ci->suspended = 0;
 			}
 		}
-- 
2.12.2
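The ordering problem described in this patch can be reproduced in a small userspace model. This is only a sketch: the struct and the `model_*` functions are hypothetical stand-ins, and the event order (stop activity first, port-change interrupt second) is the one reported from the debug messages, not something verified against the driver.

```c
#include <stdbool.h>

/*
 * Simplified model of the 'suspended' flag handling. All names here are
 * hypothetical; only the flag and the order of events mirror the report.
 */
struct udc_model {
	bool suspended;
	bool clear_on_stop;	/* true: stop activity clears the flag (behaviour before the patch) */
	int resume_calls;	/* how often the gadget driver's resume ran */
};

/* Host suspends the bus: the gadget driver's suspend callback has run. */
static void model_suspend(struct udc_model *m)
{
	m->suspended = true;
}

/* Stand-in for _gadget_stop_activity(). */
static void model_stop_activity(struct udc_model *m)
{
	if (m->clear_on_stop)
		m->suspended = false;
}

/* Stand-in for the USBi_PCI branch of udc_irq(), as changed by the patch. */
static void model_port_change(struct udc_model *m)
{
	if (m->suspended) {
		m->resume_calls++;
		m->suspended = false;
	}
}

/* Replays suspend, stop activity, port change; returns the resume count. */
static int model_replug(bool clear_on_stop)
{
	struct udc_model m = { .clear_on_stop = clear_on_stop };

	model_suspend(&m);
	model_stop_activity(&m);
	model_port_change(&m);
	return m.resume_calls;
}
```

With `clear_on_stop` set, the flag is already 0 when the port-change branch runs, so resume is skipped. Without it, resume runs exactly once.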
[PATCH] net: fec: Rename "phy-reset-active-low" property
From the perspective of RESET, the meaning of the new property is
actually "active high". Thanks to Troy Kisky for pointing that out.

Since the patch is in linux-next, this patch is incremental and doesn't
replace the original patch.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
 Documentation/devicetree/bindings/net/fsl-fec.txt | 2 +-
 drivers/net/ethernet/freescale/fec_main.c         | 8
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt b/Documentation/devicetree/bindings/net/fsl-fec.txt
index a4799ff..b037a9d 100644
--- a/Documentation/devicetree/bindings/net/fsl-fec.txt
+++ b/Documentation/devicetree/bindings/net/fsl-fec.txt
@@ -12,7 +12,7 @@ Optional properties:
   only if property "phy-reset-gpios" is available. Missing the property
   will have the duration be 1 millisecond. Numbers greater than 1000 are
   invalid and 1 millisecond will be used instead.
-- phy-reset-active-low : If present then the reset sequence using the GPIO
+- phy-reset-active-high : If present then the reset sequence using the GPIO
   specified in the "phy-reset-gpios" property is reversed (H=reset state,
   L=operation state).
 - phy-supply : regulator that powers the Ethernet PHY.
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index bad0ba2..37c0815 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3191,7 +3191,7 @@ static int fec_enet_init(struct net_device *ndev)
 static void fec_reset_phy(struct platform_device *pdev)
 {
 	int err, phy_reset;
-	bool active_low = false;
+	bool active_high = false;
 	int msec = 1;
 	struct device_node *np = pdev->dev.of_node;
 
@@ -3207,17 +3207,17 @@ static void fec_reset_phy(struct platform_device *pdev)
 	if (!gpio_is_valid(phy_reset))
 		return;
 
-	active_low = of_property_read_bool(np, "phy-reset-active-low");
+	active_high = of_property_read_bool(np, "phy-reset-active-high");
 
 	err = devm_gpio_request_one(&pdev->dev, phy_reset,
-			active_low ? GPIOF_OUT_INIT_HIGH : GPIOF_OUT_INIT_LOW,
+			active_high ? GPIOF_OUT_INIT_HIGH : GPIOF_OUT_INIT_LOW,
 			"phy-reset");
 	if (err) {
 		dev_err(&pdev->dev, "failed to get phy-reset-gpios: %d\n", err);
 		return;
 	}
 	msleep(msec);
-	gpio_set_value_cansleep(phy_reset, !active_low);
+	gpio_set_value_cansleep(phy_reset, !active_high);
 }
 #else /* CONFIG_OF */
 static void fec_reset_phy(struct platform_device *pdev)
-- 
2.7.2
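The polarity logic in this patch reduces to two GPIO levels, one while the PHY is held in reset and one after release. The sketch below restates it outside the kernel; `struct reset_levels` and `phy_reset_levels` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Hypothetical container for the two GPIO levels of a reset sequence. */
struct reset_levels {
	int asserted;	/* level driven while the PHY is held in reset */
	int released;	/* level driven after the reset delay has elapsed */
};

/*
 * Mirrors the expressions in fec_reset_phy(): the initial level is high
 * when the property is present and low otherwise, and the level written
 * after the delay is the logical negation of the flag.
 */
static struct reset_levels phy_reset_levels(bool active_high)
{
	struct reset_levels levels;

	levels.asserted = active_high ? 1 : 0;
	levels.released = !active_high;
	return levels;
}
```

Without the property the sequence is low then high, which is the behaviour the driver had before the property existed. With it, the sequence is high then low, matching "H=reset state, L=operation state" in the binding text.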
[PATCH v2] regulator: ltc3589: Make IRQ optional
It's perfectly valid to use the LTC3589 without an interrupt pin
connected to it. Currently, the driver probing fails when client->irq
is 0 (which means "no interrupt"). Don't register the interrupt handler
in that case but successfully finish the device probing instead.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
Changes between v1 and v2:
 - Use 'client->irq' instead of 'client->irq != 0'
 - Wrap long line
 - Don't print the IRQ number since that was leftover from my debugging

 drivers/regulator/ltc3589.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/ltc3589.c b/drivers/regulator/ltc3589.c
index 972c386..47bef32 100644
--- a/drivers/regulator/ltc3589.c
+++ b/drivers/regulator/ltc3589.c
@@ -520,12 +520,15 @@ static int ltc3589_probe(struct i2c_client *client,
 		}
 	}
 
-	ret = devm_request_threaded_irq(dev, client->irq, NULL, ltc3589_isr,
-					IRQF_TRIGGER_LOW | IRQF_ONESHOT,
-					client->name, ltc3589);
-	if (ret) {
-		dev_err(dev, "Failed to request IRQ: %d\n", ret);
-		return ret;
+	if (client->irq) {
+		ret = devm_request_threaded_irq(dev, client->irq, NULL,
+						ltc3589_isr,
+						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+						client->name, ltc3589);
+		if (ret) {
+			dev_err(dev, "Failed to request IRQ: %d\n", ret);
+			return ret;
+		}
 	}
 
 	return 0;
-- 
2.7.1
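The control flow of the changed probe path can be exercised in isolation. In this sketch the IRQ request is replaced by a callback so both outcomes can be simulated; `model_probe`, `request_irq_fn` and the two fake callbacks are hypothetical and exist only for the illustration.

```c
/* Hypothetical stand-in for the IRQ request: returns 0 or a negative errno. */
typedef int (*request_irq_fn)(int irq);

/*
 * Models the tail of the probe function after the patch: an IRQ number
 * of 0 means "no interrupt", so the request is skipped and probing
 * still succeeds.
 */
static int model_probe(int irq, request_irq_fn request_irq)
{
	int ret;

	if (irq) {
		ret = request_irq(irq);
		if (ret)
			return ret;	/* a real IRQ that cannot be requested is still fatal */
	}
	return 0;
}

/* Fake callbacks used to simulate both outcomes of the request. */
static int fake_request_ok(int irq)
{
	(void)irq;
	return 0;
}

static int fake_request_fail(int irq)
{
	(void)irq;
	return -22;	/* stands in for -EINVAL */
}
```

The point of the change is visible in the first case below: with no IRQ wired up, the request is never attempted, so it cannot fail the probe.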
Re: [PATCH] regulator: ltc3589: Make IRQ optional
Hi,

thanks for the review!

On 10.02.2016 10:44, Philipp Zabel wrote:
>> +					IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>> +					client->name, ltc3589);
>> +	if (ret) {
>> +		dev_err(dev, "Failed to request IRQ %d: %d\n", client->irq, ret);
>
> ... and this long line.

This was actually a mistake because I added the IRQ number to debug my
problem. Will send an updated version of the patch just now.

Regards,
Bernhard
[PATCH] regulator: ltc3589: Make IRQ optional
It's perfectly valid to use the LTC3589 without an interrupt pin
connected to it. Currently, the driver probing fails when client->irq
is 0 (which means "no interrupt"). Don't register the interrupt handler
in that case but successfully finish the device probing instead.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
 drivers/regulator/ltc3589.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/ltc3589.c b/drivers/regulator/ltc3589.c
index 972c386..f2149d07 100644
--- a/drivers/regulator/ltc3589.c
+++ b/drivers/regulator/ltc3589.c
@@ -520,12 +520,14 @@ static int ltc3589_probe(struct i2c_client *client,
 		}
 	}
 
-	ret = devm_request_threaded_irq(dev, client->irq, NULL, ltc3589_isr,
-					IRQF_TRIGGER_LOW | IRQF_ONESHOT,
-					client->name, ltc3589);
-	if (ret) {
-		dev_err(dev, "Failed to request IRQ: %d\n", ret);
-		return ret;
+	if (client->irq != 0) {
+		ret = devm_request_threaded_irq(dev, client->irq, NULL, ltc3589_isr,
+						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+						client->name, ltc3589);
+		if (ret) {
+			dev_err(dev, "Failed to request IRQ %d: %d\n", client->irq, ret);
+			return ret;
+		}
 	}
 
 	return 0;
-- 
2.7.1
[PATCH v2] net: fec: Add "phy-reset-active-low" property to DT
We need that for a custom hardware that needs the reverse reset
sequence.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
Changes compared to v1:
 - Add documentation to 'phy-reset-gpios' that flags are ignored as
   suggested by Andrew Lunn.

 Documentation/devicetree/bindings/net/fsl-fec.txt | 7 ++-
 drivers/net/ethernet/freescale/fec_main.c         | 8 ++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt b/Documentation/devicetree/bindings/net/fsl-fec.txt
index a9eb611..0caa429 100644
--- a/Documentation/devicetree/bindings/net/fsl-fec.txt
+++ b/Documentation/devicetree/bindings/net/fsl-fec.txt
@@ -7,11 +7,16 @@ Required properties:
 - phy-mode : See ethernet.txt file in the same directory
 
 Optional properties:
-- phy-reset-gpios : Should specify the gpio for phy reset
+- phy-reset-gpios : Should specify the gpio for phy reset. Additional
+  flags are ignored, see the non-standard 'phy-reset-active-low' property
+  instead.
 - phy-reset-duration : Reset duration in milliseconds. Should present
   only if property "phy-reset-gpios" is available. Missing the property
   will have the duration be 1 millisecond. Numbers greater than 1000 are
   invalid and 1 millisecond will be used instead.
+- phy-reset-active-low : If present then the reset sequence using the GPIO
+  specified in the "phy-reset-gpios" property is reversed (H=reset state,
+  L=operation state).
 - phy-supply : regulator that powers the Ethernet PHY.
 - phy-handle : phandle to the PHY device connected to this device.
 - fixed-link : Assume a fixed link. See fixed-link.txt in the same directory.
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 41c81f6..98caf87 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3229,6 +3229,7 @@ static int fec_enet_init(struct net_device *ndev)
 static void fec_reset_phy(struct platform_device *pdev)
 {
 	int err, phy_reset;
+	bool active_low = false;
 	int msec = 1;
 	struct device_node *np = pdev->dev.of_node;
 
@@ -3244,14 +3245,17 @@ static void fec_reset_phy(struct platform_device *pdev)
 	if (!gpio_is_valid(phy_reset))
 		return;
 
+	active_low = of_property_read_bool(np, "phy-reset-active-low");
+
 	err = devm_gpio_request_one(&pdev->dev, phy_reset,
-				    GPIOF_OUT_INIT_LOW, "phy-reset");
+			active_low ? GPIOF_OUT_INIT_HIGH : GPIOF_OUT_INIT_LOW,
+			"phy-reset");
 	if (err) {
 		dev_err(&pdev->dev, "failed to get phy-reset-gpios: %d\n", err);
 		return;
 	}
 	msleep(msec);
-	gpio_set_value_cansleep(phy_reset, 1);
+	gpio_set_value_cansleep(phy_reset, !active_low);
 }
 #else /* CONFIG_OF */
 static void fec_reset_phy(struct platform_device *pdev)
-- 
2.7.1
[PATCH] net: fec: Add "phy-reset-active-low" property to DT
We need that for a custom hardware that needs the reverse reset
sequence.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
 Documentation/devicetree/bindings/net/fsl-fec.txt | 3 +++
 drivers/net/ethernet/freescale/fec_main.c         | 8 ++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt b/Documentation/devicetree/bindings/net/fsl-fec.txt
index a9eb611..a4799ff 100644
--- a/Documentation/devicetree/bindings/net/fsl-fec.txt
+++ b/Documentation/devicetree/bindings/net/fsl-fec.txt
@@ -12,6 +12,9 @@ Optional properties:
   only if property "phy-reset-gpios" is available. Missing the property
   will have the duration be 1 millisecond. Numbers greater than 1000 are
   invalid and 1 millisecond will be used instead.
+- phy-reset-active-low : If present then the reset sequence using the GPIO
+  specified in the "phy-reset-gpios" property is reversed (H=reset state,
+  L=operation state).
 - phy-supply : regulator that powers the Ethernet PHY.
 - phy-handle : phandle to the PHY device connected to this device.
 - fixed-link : Assume a fixed link. See fixed-link.txt in the same directory.
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 41c81f6..98caf87 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3229,6 +3229,7 @@ static int fec_enet_init(struct net_device *ndev)
 static void fec_reset_phy(struct platform_device *pdev)
 {
 	int err, phy_reset;
+	bool active_low = false;
 	int msec = 1;
 	struct device_node *np = pdev->dev.of_node;
 
@@ -3244,14 +3245,17 @@ static void fec_reset_phy(struct platform_device *pdev)
 	if (!gpio_is_valid(phy_reset))
 		return;
 
+	active_low = of_property_read_bool(np, "phy-reset-active-low");
+
 	err = devm_gpio_request_one(&pdev->dev, phy_reset,
-				    GPIOF_OUT_INIT_LOW, "phy-reset");
+			active_low ? GPIOF_OUT_INIT_HIGH : GPIOF_OUT_INIT_LOW,
+			"phy-reset");
 	if (err) {
 		dev_err(&pdev->dev, "failed to get phy-reset-gpios: %d\n", err);
 		return;
 	}
 	msleep(msec);
-	gpio_set_value_cansleep(phy_reset, 1);
+	gpio_set_value_cansleep(phy_reset, !active_low);
 }
 #else /* CONFIG_OF */
 static void fec_reset_phy(struct platform_device *pdev)
-- 
2.7.1
[PATCH] hwmon: (htu21) Use the nohold mode to read out values
On my Raspberry Pi, the driver doesn't work. Every read fails with -EIO.
Reading the data sheet and experimenting a bit led me to find out that
using the nohold mode works. The Raspberry Pi I²C chip seems to have
problems with slaves blocking the SCK line. In any case, freeing the bus
while performing the measurement seems to be the better way, I think.

Signed-off-by: Bernhard Walle <bernh...@bwalle.de>
---
 drivers/hwmon/htu21.c | 52 +--
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/drivers/hwmon/htu21.c b/drivers/hwmon/htu21.c
index 839086e..7defae2 100644
--- a/drivers/hwmon/htu21.c
+++ b/drivers/hwmon/htu21.c
@@ -25,10 +25,11 @@
 #include <linux/mutex.h>
 #include <linux/device.h>
 #include <linux/jiffies.h>
+#include <linux/delay.h>
 
 /* HTU21 Commands */
-#define HTU21_T_MEASUREMENT_HM	0xE3
-#define HTU21_RH_MEASUREMENT_HM	0xE5
+#define HTU21_T_MEASUREMENT_HM_NOHOLD	0xF3
+#define HTU21_RH_MEASUREMENT_HM_NOHOLD	0xF5
 
 struct htu21 {
 	struct device *hwmon_dev;
@@ -59,6 +60,47 @@ static inline int htu21_rh_ticks_to_per_cent_mille(int ticks)
 	return ((15625 * ticks) >> 13) - 6000;
 }
 
+/* retry for one second, then give up */
+#define MAX_RETRIES 20
+
+static int htu21_read_word(struct i2c_client *client, u8 command)
+{
+	char data[2];
+	int ret, i;
+
+	/* start the measurement */
+	ret = i2c_master_send(client, &command, sizeof(command));
+	if (ret < 0) {
+		dev_err(&client->dev, "Unable to send command 0x%hhx: %d\n",
+			command, ret);
+		return ret;
+	}
+
+	/*
+	 * Now poll until we get the data. On I2C level the device sends a NAK
+	 * until it is ready
+	 */
+
+	msleep(50);
+
+	for (i = 0; i < MAX_RETRIES; i++) {
+		ret = i2c_master_recv(client, data, sizeof(data));
+		if (ret == sizeof(data))
+			break;
+
+		msleep(50);
+	}
+
+	if (ret < 0) {
+		dev_err(&client->dev, "Unable to receive result from command 0x%hhx: %d\n",
+			command, ret);
+		return ret;
+	}
+
+	return (data[0] << 8) | data[1];
+}
+
+
 static int htu21_update_measurements(struct i2c_client *client)
 {
 	int ret = 0;
@@ -68,13 +110,11 @@ static int htu21_update_measurements(struct i2c_client *client)
 
 	if (time_after(jiffies, htu21->last_update + HZ / 2) ||
 	    !htu21->valid) {
-		ret = i2c_smbus_read_word_swapped(client,
-						  HTU21_T_MEASUREMENT_HM);
+		ret = htu21_read_word(client, HTU21_T_MEASUREMENT_HM_NOHOLD);
 		if (ret < 0)
 			goto out;
 		htu21->temperature = htu21_temp_ticks_to_millicelsius(ret);
-		ret = i2c_smbus_read_word_swapped(client,
-						  HTU21_RH_MEASUREMENT_HM);
+		ret = htu21_read_word(client, HTU21_RH_MEASUREMENT_HM_NOHOLD);
 		if (ret < 0)
 			goto out;
 		htu21->humidity = htu21_rh_ticks_to_per_cent_mille(ret);
-- 
2.1.0
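The poll-until-ready loop from this patch can be tried in userspace with a fake device. This is a sketch under stated assumptions, not the driver code: the receive callback, `struct fake_dev` and `poll_word` are hypothetical, the 50 ms sleeps are left out, and the buffer uses `uint8_t` instead of `char` so the byte combination cannot sign-extend on platforms where `char` is signed.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 20

/* Hypothetical receive callback: bytes received, or a negative error (NAK). */
typedef int (*recv_fn)(void *ctx, uint8_t *buf, size_t len);

/*
 * Polls the device until two bytes arrive or MAX_RETRIES attempts have
 * failed. Returns the big-endian 16-bit word, or a negative error.
 * The driver sleeps 50 ms between attempts; that delay is omitted here.
 */
static int poll_word(recv_fn recv, void *ctx)
{
	uint8_t data[2];
	int ret = -1;
	int i;

	for (i = 0; i < MAX_RETRIES; i++) {
		ret = recv(ctx, data, sizeof(data));
		if (ret == (int)sizeof(data))
			return (data[0] << 8) | data[1];
	}
	return ret < 0 ? ret : -1;	/* a short read also counts as failure */
}

/* Same fixed-point conversion as htu21_rh_ticks_to_per_cent_mille(). */
static int rh_ticks_to_per_cent_mille(int ticks)
{
	return ((15625 * ticks) >> 13) - 6000;
}

/* Fake device: answers with an error naks_left times, then 0x80 0x00. */
struct fake_dev {
	int naks_left;
};

static int fake_recv(void *ctx, uint8_t *buf, size_t len)
{
	struct fake_dev *dev = ctx;

	if (dev->naks_left > 0) {
		dev->naks_left--;
		return -5;	/* stands in for -EIO */
	}
	if (len < 2)
		return -1;
	buf[0] = 0x80;
	buf[1] = 0x00;
	return 2;
}
```

A device that becomes ready after three attempts yields 0x8000, which the conversion maps to 56500, i.e. 56.5 % relative humidity. A device that never answers makes `poll_word` return the last error after 20 attempts.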
RE: Regression in hyperv network driver in 3.14
On 2014-05-27 15:43, KY Srinivasan wrote:
>> Can I provide more information to track down the problem?
>
> This bug has been fixed upstream. The issue is with regards to the older
> hosts (ws2008 r2) not being able to handle the larger receive buffer
> currently used.

Can you point me to the commit that fixed the problem? Wouldn't that be
something for -stable, since the problem is still in 3.14.4?

Regards,
Bernhard
Regression in hyperv network driver in 3.14
Hello,

using a 3.14 kernel in a Linux VM running on HyperV (Windows Server 2008 R2)
we get the following error:

  hv_netvsc: hv_netvsc channel opened successfully
  hv_netvsc vmbus_0_9 (unregistered net_device): Unable to complete receive buffer initialization with NetVsp - status 2
  hv_netvsc vmbus_0_9 (unregistered net_device): unable to connect to NetVSP - -22
  hv_netvsc vmbus_0_9 (unregistered net_device): unable to add netvsc device (ret -22)
  hv_netvsc: probe of vmbus_0_9 failed with error -22

Bisecting the problem shows that commit
b679ef73edc251f6d200a7dd2396e9fef9e36fc3 is responsible. Reverting it fixes
the issue. Even just changing NETVSC_RECEIVE_BUFFER_SIZE to 2M fixes the
issue!

The virtual machine has 2 GiB of memory and 4 CPUs.

Can I provide more information to track down the problem?

Regards,
Bernhard
Re: [PATCH] kbuild: Fix gcc -x syntax
Hi,

* Ingo Molnar [2012-09-29 08:37]:
> * Jean Delvare wrote:
>
> > The correct syntax for gcc -x is "gcc -x assembler", not "gcc
> > -xassembler". Even though the latter happens to work, the
> > former is what is documented in the manual page and thus what
> > gcc wrappers such as icecream do expect.
> >
> > This isn't a cosmetic change. The missing space prevents
> > icecream from recognizing compilation tasks it can't handle,
> > leading to silent kernel miscompilations.
>
> Although we can apply this patch, it won't solve the problem of
> building older kernels (and bisecting, etc.).
>
> Wouldn't it be prudent to increase the compatibility of
> icecream, so that it accepts what GCC accepts in practice,
> such as -xassembler?

Wouldn't it make sense to do both? Using the documented syntax in the build
system *and* increasing compatibility in 3rd party tools?

Regards,
Bernhard
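To illustrate what "accepting both forms" could look like in a compiler wrapper, here is a minimal sketch. It is not icecream's code; parse_dash_x() is a hypothetical helper invented for this example. It returns the language named by a -x option, whether it is written as "-x assembler" or as "-xassembler".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the language named by -x at argv[i], or NULL
 * if argv[i] is not a usable -x option. *consumed is set to the number of
 * argv slots used (1 for the joined form, 2 for the separate form).
 */
static const char *parse_dash_x(int argc, char **argv, int i, int *consumed)
{
	const char *arg = argv[i];

	*consumed = 0;
	if (strncmp(arg, "-x", 2) != 0)
		return NULL;
	if (arg[2] != '\0') {		/* joined form: -xassembler */
		*consumed = 1;
		return arg + 2;
	}
	if (i + 1 < argc) {		/* separate form: -x assembler */
		*consumed = 2;
		return argv[i + 1];
	}
	return NULL;			/* "-x" without a language */
}
```

A wrapper built this way would treat both spellings the same, so old kernel trees that still use the joined form would be handled as well.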
Re: 2.6.24-mm1: ppc32: too few arguments to function 'reserve_bootmem'
* Andrew Morton <[EMAIL PROTECTED]> [2008-02-04 23:40]:
> We did this wrong. We should have introduced a new reserve_bootmem_foo()
> and migrated over to that in stages. Once all callers are migrated, remove
> the old interface.

Well, my original proposal was to add a new function, but then someone
complained that we already have too many bootmem functions. I don't remember
if this was on LKML or internally in Bugzilla.

However, sorry, it was my fault of course.

Bernhard
Re: locking api self-test hanging
* Andrew Morton <[EMAIL PROTECTED]> [2008-02-04 14:04]:
>
> but that code really needs help.

Using spin_lock_irqsave() is what rtc_get_rtc_time() does, and that is the
function that was called before I changed the code to use get_rtc_time(). So
it should be ok. I missed that difference.

However, I agree with you that the whole RTC "emulation" via HPET is a bit
unclean. But I have too little experience in this code area to come up with
a proper redesign. I just ported it from the old RTC to the new RTC module
interface.

> Bernhard, I believe the checklist items in Documentation/SubmitChecklist
> would have prevented this at the source.

Yes. I must admit that I didn't know that document. I used checkpatch.pl,
but that's of course only for coding style. I'll try to follow the points in
SubmitChecklist in future.

Bernhard
Re: [mm] Crashkernel memory reservation fails with 2.6.24-rc8-mm1
* Sachin P. Sant <[EMAIL PROTECTED]> [2008-02-05 06:33]:
> Bernhard Walle wrote:
>> * Vivek Goyal <[EMAIL PROTECTED]> [2008-02-04 19:38]:
>>
>>> Bernahard, any idea who is the competitor here?
>>>
>> Hm ..., can you boot the kernel without crashkernel= and provide the
>> /proc/iomem?
>>
> Attached is the /proc/iomem output with and without crashkernel
> parameter.

Looks ok. In my case, when I added the warning message, the kernel BSS was
too large. But that doesn't seem to be the case here: BSS ends at 9M and you
use 16M as start address for the crashkernel.

Bernhard
Re: [mm] Crashkernel memory reservation fails with 2.6.24-rc8-mm1
* Vivek Goyal <[EMAIL PROTECTED]> [2008-02-04 19:38]:
>
> Bernahard, any idea who is the competitor here?

Hm ..., can you boot the kernel without crashkernel= and provide the
/proc/iomem?

Bernhard
Re: [PATCH 2/3 -mm] kexec jump -v8 : add write support to oldmem device
* Pavel Machek <[EMAIL PROTECTED]> [2007-12-21 11:17]:
> Or is crashdump only supported on i386?

No.

Thanks,
Bernhard
[patch 1/3] Make CONFIG_HPET_EMULATE_RTC usable from modules
This patch makes the RTC emulation functions in arch/x86/kernel/hpet.c usable
for kernel modules. It

  - exports the functions (EXPORT_SYMBOL_GPL()),
  - adds an interface to register the interrupt callback function instead of
    using only a fixed callback function and
  - replaces the rtc_get_rtc_time() function, which depends on CONFIG_RTC,
    with a call to get_rtc_time(), which is defined in
    include/asm-generic/rtc.h.

The only remaining dependency on CONFIG_RTC is the call to rtc_interrupt(),
which is removed by the next patch. After this, there's no (code) dependency
of these functions on CONFIG_RTC=y any more.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/hpet.c | 47 ++-
 include/asm-x86/hpet.h |  3 +++
 2 files changed, 49 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -107,6 +107,7 @@ int is_hpet_enabled(void)
 {
 	return is_hpet_capable() && hpet_legacy_int_enabled;
 }
+EXPORT_SYMBOL_GPL(is_hpet_enabled);
 
 /*
  * When the hpet driver (/dev/hpet) is enabled, we need to reserve
@@ -478,6 +479,7 @@ void hpet_disable(void)
  */
 #include
 #include
+#include
 
 #define DEFAULT_RTC_INT_FREQ	64
 #define DEFAULT_RTC_SHIFT	6
@@ -492,6 +494,38 @@ static unsigned long hpet_default_delta;
 static unsigned long hpet_pie_delta;
 static unsigned long hpet_pie_limit;
 
+static rtc_irq_handler irq_handler;
+
+/*
+ * Registers a IRQ handler.
+ */
+int hpet_register_irq_handler(rtc_irq_handler handler)
+{
+	if (!is_hpet_enabled())
+		return -ENODEV;
+	if (irq_handler)
+		return -EBUSY;
+
+	irq_handler = handler;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(hpet_register_irq_handler);
+
+/*
+ * Deregisters the IRQ handler registered with hpet_register_irq_handler()
+ * and does cleanup.
+ */
+void hpet_unregister_irq_handler(rtc_irq_handler handler)
+{
+	if (!is_hpet_enabled())
+		return;
+
+	irq_handler = NULL;
+	hpet_rtc_flags = 0;
+}
+EXPORT_SYMBOL_GPL(hpet_unregister_irq_handler);
+
 /*
  * Timer 1 for RTC emulation. We use one shot mode, as periodic mode
  * is not supported by all HPET implementations for timer 1.
@@ -533,6 +567,7 @@ int hpet_rtc_timer_init(void)
 
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_timer_init);
 
 /*
  * The functions below are called from rtc driver.
@@ -547,6 +582,7 @@ int hpet_mask_rtc_irq_bit(unsigned long
 	hpet_rtc_flags &= ~bit_mask;
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_mask_rtc_irq_bit);
 
 int hpet_set_rtc_irq_bit(unsigned long bit_mask)
 {
@@ -562,6 +598,7 @@ int hpet_set_rtc_irq_bit(unsigned long b
 
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_set_rtc_irq_bit);
 
 int hpet_set_alarm_time(unsigned char hrs, unsigned char min,
 			unsigned char sec)
@@ -575,6 +612,7 @@ int hpet_set_alarm_time(unsigned char hr
 
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_set_alarm_time);
 
 int hpet_set_periodic_freq(unsigned long freq)
 {
@@ -593,11 +631,13 @@ int hpet_set_periodic_freq(unsigned long
 	}
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_set_periodic_freq);
 
 int hpet_rtc_dropped_irq(void)
 {
 	return is_hpet_enabled();
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_dropped_irq);
 
 static void hpet_rtc_timer_reinit(void)
 {
@@ -641,9 +681,10 @@ irqreturn_t hpet_rtc_interrupt(int irq,
 	unsigned long rtc_int_flag = 0;
 
 	hpet_rtc_timer_reinit();
+	memset(&curr_time, 0, sizeof(struct rtc_time));
 
 	if (hpet_rtc_flags & (RTC_UIE | RTC_AIE))
-		rtc_get_rtc_time(&curr_time);
+		get_rtc_time(&curr_time);
 
 	if (hpet_rtc_flags & RTC_UIE &&
 	    curr_time.tm_sec != hpet_prev_update_sec) {
@@ -665,8 +706,12 @@ irqreturn_t hpet_rtc_interrupt(int irq,
 
 	if (rtc_int_flag) {
 		rtc_int_flag |= (RTC_IRQF | (RTC_NUM_INTS << 8));
+		if (irq_handler)
+			irq_handler(rtc_int_flag, dev_id);
+
 		rtc_interrupt(rtc_int_flag, dev_id);
 	}
 	return IRQ_HANDLED;
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_interrupt);
 #endif
--- a/include/asm-x86/hpet.h
+++ b/include/asm-x86/hpet.h
@@ -69,6 +69,7 @@ extern void force_hpet_resume(void);
 
 #include
 
+typedef irqreturn_t (*rtc_irq_handler)(int interrupt, void *cookie);
 extern int hpet_mask_rtc_irq_bit(unsigned long bit_mask);
 extern int hpet_set_rtc_irq_bit(unsigned long bit_mask);
 extern int hpet_set_alarm_time(unsigned char hrs, unsigned char min,
@@ -77,6 +78,8 @@ extern int hpet_set_periodic_freq(unsign
 extern int hpet_rtc_dropped_irq(void);
 extern int hpet_rtc_timer_init(void);
 extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
+extern int hpet_register_irq_handler(rtc_irq_handler handler);
+extern void hpet_unregister_irq_handler(rtc_irq_handler handler);
 
 #endif /* CONFIG_HPET_EMULATE_RTC */
 
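The registration interface added by this patch follows a simple single-slot callback pattern. The sketch below shows that pattern in plain user-space C. It is an illustration only, not the kernel code: all names (irq_handler_fn, register_handler(), unregister_handler(), dispatch(), count_calls()) are hypothetical, there is no is_hpet_enabled() check, and no locking is attempted.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*irq_handler_fn)(int flags, void *cookie);

/* Single slot: at most one handler can be registered at a time. */
static irq_handler_fn registered_handler;

static int register_handler(irq_handler_fn fn)
{
	if (registered_handler)
		return -EBUSY;	/* somebody else already owns the slot */
	registered_handler = fn;
	return 0;
}

static void unregister_handler(void)
{
	registered_handler = NULL;
}

/* Called by the interrupt source; forwards to the handler if there is one. */
static int dispatch(int flags, void *cookie)
{
	if (!registered_handler)
		return 0;
	return registered_handler(flags, cookie);
}

/* Example handler: counts how often it was invoked via the cookie. */
static int count_calls(int flags, void *cookie)
{
	(void)flags;
	++*(int *)cookie;
	return 1;
}
```

A second register_handler() call fails with -EBUSY until the slot is released, which mirrors the -EBUSY return in hpet_register_irq_handler() above.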
[patch 2/3] Use the IRQ callback interface in (old) RTC driver
This patch makes the old RTC driver use the registration callback mechanism
that was added in the previous patch. It also removes the direct
rtc_interrupt() call from arch/x86/kernel/hpet.c so that there's finally no
(code) dependency on CONFIG_RTC in arch/x86/kernel/hpet.c.

Because of this, it's possible to compile the drivers/char/rtc.ko driver as
a module and still use the HPET emulation functionality. This is also
expressed in Kconfig.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/Kconfig       |  2 +-
 arch/x86/kernel/hpet.c |  2 --
 drivers/char/rtc.c     | 15 ++-
 3 files changed, 15 insertions(+), 4 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -405,7 +405,7 @@ config HPET_TIMER
 
 config HPET_EMULATE_RTC
 	def_bool y
-	depends on HPET_TIMER && RTC=y
+	depends on HPET_TIMER && (RTC=y || RTC=m)
 
 # Mark as embedded because too many people got it wrong.
 # The code disables itself when not needed.
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -708,8 +708,6 @@ irqreturn_t hpet_rtc_interrupt(int irq,
 		rtc_int_flag |= (RTC_IRQF | (RTC_NUM_INTS << 8));
 		if (irq_handler)
 			irq_handler(rtc_int_flag, dev_id);
-
-		rtc_interrupt(rtc_int_flag, dev_id);
 	}
 	return IRQ_HANDLED;
 }
--- a/drivers/char/rtc.c
+++ b/drivers/char/rtc.c
@@ -110,6 +110,8 @@ static int rtc_has_irq = 1;
 #define hpet_set_rtc_irq_bit(arg)		0
 #define hpet_rtc_timer_init()			do { } while (0)
 #define hpet_rtc_dropped_irq()			0
+#define hpet_register_irq_handler(h)		0
+#define hpet_unregister_irq_handler(h)		0
 #ifdef RTC_IRQ
 static irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 {
@@ -1027,7 +1029,15 @@ no_irq:
 
 #ifdef RTC_IRQ
 	if (is_hpet_enabled()) {
+		int err;
+
 		rtc_int_handler_ptr = hpet_rtc_interrupt;
+		err = hpet_register_irq_handler(rtc_interrupt);
+		if (err != 0) {
+			printk(KERN_WARNING "hpet_register_irq_handler failed "
+					"in rtc_init().");
+			return err;
+		}
 	} else {
 		rtc_int_handler_ptr = rtc_interrupt;
 	}
@@ -1050,6 +1060,7 @@ no_irq:
 	if (misc_register(&rtc_dev)) {
 #ifdef RTC_IRQ
 		free_irq(RTC_IRQ, NULL);
+		hpet_unregister_irq_handler(rtc_interrupt);
 		rtc_has_irq = 0;
 #endif
 		rtc_release_region();
@@ -1141,8 +1152,10 @@ static void __exit rtc_exit(void)
 #else
 	rtc_release_region();
 #ifdef RTC_IRQ
-	if (rtc_has_irq)
+	if (rtc_has_irq) {
 		free_irq(RTC_IRQ, NULL);
+		hpet_unregister_irq_handler(hpet_rtc_interrupt);
+	}
 #endif
 #endif /* CONFIG_SPARC32 */
 }
[patch 0/3] Implement CONFIG_HPET_EMULATE_RTC for RTC_DRV_CMOS
The new rtc-cmos driver lacks HPET support. If the hardware has HPET enabled,
then interrupts don't work for the rtc-cmos driver, which results in
RTC_AIE*, RTC_PIE* and RTC_ALM being unusable. This affects hwclock from
util-linux-ng at least on i386 since that uses RTC_PIE_ON. (For x86-64, a
polling method is used for unknown reasons.)

This patch series now

  1. exports the functions from arch/x86/kernel/hpet.c that the old char/rtc
     driver uses to work around that problem,
  2. makes it possible to compile the old rtc driver as a module while still
     having CONFIG_HPET_EMULATE_RTC enabled and
  3. makes use of the functions exported in (1) in the new rtc-cmos driver.

The design is not changed. Please review and give me feedback!

This patch series is against 2.6.24-rc5-mm1. It passes the test in
Documentation/rtc.txt after http://lkml.org/lkml/2007/12/20/249 is applied.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>
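For readers who want to see what "uses RTC_PIE_ON" means from user space, here is a small sketch along the lines of the RTC ioctl interface. It is not hwclock's code; wait_one_periodic_irq() is a hypothetical helper, and the device path and rate are whatever the caller passes in. If periodic interrupts are not delivered, which is the failure described above, the read() simply never returns.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns 0 after one periodic interrupt was read, or a negative errno. */
static int wait_one_periodic_irq(const char *path, unsigned long hz)
{
	unsigned long data;
	int err = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -errno;

	/* set the periodic rate, then enable periodic interrupts */
	if (ioctl(fd, RTC_IRQP_SET, hz) < 0 || ioctl(fd, RTC_PIE_ON, 0) < 0) {
		err = -errno;
		goto out;
	}

	/* read() blocks until the next RTC interrupt is delivered */
	if (read(fd, &data, sizeof(data)) < 0)
		err = -errno;

	ioctl(fd, RTC_PIE_OFF, 0);
out:
	close(fd);
	return err;
}
```

Calling it as wait_one_periodic_irq("/dev/rtc", 64) would be one way to check by hand whether RTC_PIE works on a given machine; the path is an assumption and may differ per system.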
[patch 3/3] Add HPET RTC emulation to RTC_DRV_CMOS
This patch adds the RTC emulation of the HPET timer to the new RTC_DRV_CMOS.
The old drivers/char/rtc.ko driver had that functionality and it's important
on new systems.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/Kconfig       |  2 -
 drivers/rtc/rtc-cmos.c | 79 -
 2 files changed, 67 insertions(+), 14 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -405,7 +405,7 @@ config HPET_TIMER
 
 config HPET_EMULATE_RTC
 	def_bool y
-	depends on HPET_TIMER && (RTC=y || RTC=m)
+	depends on HPET_TIMER && (RTC=y || RTC=m || RTC_DRV_CMOS=m || RTC_DRV_CMOS=y)
 
 # Mark as embedded because too many people got it wrong.
 # The code disables itself when not needed.
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -35,10 +35,22 @@
 #include
 #include
 #include
+#include
 
 /* this is for "generic access to PC-style RTC" using CMOS_READ/CMOS_WRITE */
 #include
 
+#ifndef CONFIG_HPET_EMULATE_RTC
+#define is_hpet_enabled()			0
+#define hpet_set_alarm_time(hrs, min, sec)	do { } while (0)
+#define hpet_set_periodic_freq(arg)		0
+#define hpet_mask_rtc_irq_bit(arg)		do { } while (0)
+#define hpet_set_rtc_irq_bit(arg)		do { } while (0)
+#define hpet_rtc_timer_init()			do { } while (0)
+#define hpet_register_irq_handler(h)		0
+#define hpet_unregister_irq_handler(h)		do { } while (0)
+extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
+#endif
 
 struct cmos_rtc {
 	struct rtc_device	*rtc;
@@ -199,6 +211,7 @@ static int cmos_set_alarm(struct device
 	sec = t->time.tm_sec;
 	sec = (sec < 60) ? BIN2BCD(sec) : 0xff;
 
+	hpet_set_alarm_time(t->time.tm_hour, t->time.tm_min, t->time.tm_sec);
 	spin_lock_irq(&rtc_lock);
 
 	/* next rtc irq must not be from previous alarm setting */
@@ -252,7 +265,8 @@ static int cmos_irq_set_freq(struct devi
 	f = 16 - f;
 
 	spin_lock_irqsave(&rtc_lock, flags);
-	CMOS_WRITE(RTC_REF_CLCK_32KHZ | f, RTC_FREQ_SELECT);
+	if (!hpet_set_periodic_freq(freq))
+		CMOS_WRITE(RTC_REF_CLCK_32KHZ | f, RTC_FREQ_SELECT);
 	spin_unlock_irqrestore(&rtc_lock, flags);
 
 	return 0;
@@ -314,28 +328,37 @@ cmos_rtc_ioctl(struct device *dev, unsig
 	switch (cmd) {
 	case RTC_AIE_OFF:	/* alarm off */
 		rtc_control &= ~RTC_AIE;
+		hpet_mask_rtc_irq_bit(RTC_AIE);
 		break;
 	case RTC_AIE_ON:	/* alarm on */
 		rtc_control |= RTC_AIE;
+		hpet_set_rtc_irq_bit(RTC_AIE);
 		break;
 	case RTC_UIE_OFF:	/* update off */
 		rtc_control &= ~RTC_UIE;
+		hpet_mask_rtc_irq_bit(RTC_UIE);
 		break;
 	case RTC_UIE_ON:	/* update on */
 		rtc_control |= RTC_UIE;
+		hpet_set_rtc_irq_bit(RTC_UIE);
 		break;
 	case RTC_PIE_OFF:	/* periodic off */
 		rtc_control &= ~RTC_PIE;
+		hpet_mask_rtc_irq_bit(RTC_PIE);
 		break;
 	case RTC_PIE_ON:	/* periodic on */
 		rtc_control |= RTC_PIE;
+		hpet_set_rtc_irq_bit(RTC_PIE);
 		break;
 	}
-	CMOS_WRITE(rtc_control, RTC_CONTROL);
+	if (!is_hpet_enabled())
+		CMOS_WRITE(rtc_control, RTC_CONTROL);
+
 	rtc_intr = CMOS_READ(RTC_INTR_FLAGS);
 	rtc_intr &= (rtc_control & RTC_IRQMASK) | RTC_IRQF;
 	if (is_intr(rtc_intr))
 		rtc_update_irq(cmos->rtc, 1, rtc_intr);
+
 	spin_unlock_irqrestore(&rtc_lock, flags);
 	return 0;
 }
@@ -475,15 +498,25 @@ static irqreturn_t cmos_interrupt(int ir
 	u8		rtc_control;
 
 	spin_lock(&rtc_lock);
-	irqstat = CMOS_READ(RTC_INTR_FLAGS);
-	rtc_control = CMOS_READ(RTC_CONTROL);
-	irqstat &= (rtc_control & RTC_IRQMASK) | RTC_IRQF;
+	/*
+	 * In this case it is HPET RTC interrupt handler
+	 * calling us, with the interrupt information
+	 * passed as arg1, instead of irq.
+	 */
+	if (is_hpet_enabled())
+		irqstat = (unsigned long)irq & 0xF0;
+	else {
+		irqstat = CMOS_READ(RTC_INTR_FLAGS);
+		rtc_control = CMOS_READ(RTC_CONTROL);
+		irqstat &= (rtc_control & RTC_IRQMASK) | RTC_IRQF;
+	}
 
 	/* All Linux RTC alarms should be treated as if they were oneshot.
 	 * Similar code may be needed in system wakeup paths, in case the
 	 * alarm woke the system.
 	 */
 	if (irqstat & RTC_AIE) {
+		rtc_control = CMOS_READ(RTC_CONTROL);
 		rtc_control &= ~RTC_AIE;
 		CMOS_WRITE(rtc_control, RTC_CONTROL);
 		CMOS_READ(RT
Re: [PATCH] Add HPET RTC emulation to RTC_DRV_CMOS
* Bernhard Walle <[EMAIL PROTECTED]> [2007-12-20 16:24]:
> ...

This was an accident. The patch belongs to a patch series that I'll post
later. Please ignore!

Thanks,
Bernhard
[PATCH] Fix RTC_AIE with CONFIG_HPET_EMULATE_RTC
In the current code, RTC_AIE doesn't work if the RTC relies on CONFIG_HPET_EMULATE_RTC: hpet_set_rtc_irq_bit() sets the RTC_AIE flag, but the interrupt handler accidentally checks for RTC_PIE instead of RTC_AIE when comparing the current time with the alarm time that was set in hpet_set_alarm_time().

This patch is against 2.6.24-rc5-mm1.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/hpet.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -657,7 +657,7 @@ irqreturn_t hpet_rtc_interrupt(int irq,
 		hpet_pie_count = 0;
 	}

-	if (hpet_rtc_flags & RTC_PIE &&
+	if (hpet_rtc_flags & RTC_AIE &&
 	    (curr_time.tm_sec == hpet_alarm_time.tm_sec) &&
 	    (curr_time.tm_min == hpet_alarm_time.tm_min) &&
 	    (curr_time.tm_hour == hpet_alarm_time.tm_hour))
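To make the one-flag difference concrete, here is a minimal user-space sketch of the alarm comparison. `struct hms` and `alarm_due()` are hypothetical names invented for this illustration; only the three flag values follow the MC146818 control register layout, and the real handler in arch/x86/kernel/hpet.c does considerably more than this.

```c
#include <stdbool.h>

/* Interrupt-enable bits of the MC146818 control register. */
#define RTC_PIE 0x40	/* periodic interrupt enable */
#define RTC_AIE 0x20	/* alarm interrupt enable */
#define RTC_UIE 0x10	/* update-finished interrupt enable */

/* Hypothetical stand-in for the hour/minute/second part of struct rtc_time. */
struct hms {
	int hour;
	int min;
	int sec;
};

/*
 * Hypothetical helper mirroring the fixed condition: an alarm is due only
 * if the alarm interrupt is enabled and the current time equals the alarm
 * time. The pre-patch code tested RTC_PIE in place of RTC_AIE.
 */
static bool alarm_due(unsigned long flags, struct hms now, struct hms alarm)
{
	return (flags & RTC_AIE) &&
	       now.sec == alarm.sec &&
	       now.min == alarm.min &&
	       now.hour == alarm.hour;
}
```

With matching times, `alarm_due(RTC_AIE, t, t)` is true and `alarm_due(RTC_PIE, t, t)` is false. A handler that tests RTC_PIE in this place would report an alarm only while periodic interrupts are enabled, which is not what a caller of RTC_AIE_ON expects.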
[patch 3/3] Add HPET RTC emulation to RTC_DRV_CMOS
This patch adds the RTC emulation of the HPET timer to the new RTC_DRV_CMOS. The old drivers/char/rtc.ko driver had that functionality and it's important on new systems.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/Kconfig       |    2 +-
 drivers/rtc/rtc-cmos.c |   79 -
 2 files changed, 67 insertions(+), 14 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -405,7 +405,7 @@ config HPET_TIMER

 config HPET_EMULATE_RTC
 	def_bool y
-	depends on HPET_TIMER && (RTC=y || RTC=m)
+	depends on HPET_TIMER && (RTC=y || RTC=m || RTC_DRV_CMOS=m || RTC_DRV_CMOS=y)

 # Mark as embedded because too many people got it wrong.
 # The code disables itself when not needed.
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -35,10 +35,22 @@
 #include <linux/spinlock.h>
 #include <linux/platform_device.h>
 #include <linux/mod_devicetable.h>
+#include <asm/hpet.h>

 /* this is for generic access to PC-style RTC using CMOS_READ/CMOS_WRITE */
 #include <asm-generic/rtc.h>

+#ifndef CONFIG_HPET_EMULATE_RTC
+#define is_hpet_enabled()			0
+#define hpet_set_alarm_time(hrs, min, sec)	do { } while (0)
+#define hpet_set_periodic_freq(arg)		0
+#define hpet_mask_rtc_irq_bit(arg)		do { } while (0)
+#define hpet_set_rtc_irq_bit(arg)		do { } while (0)
+#define hpet_rtc_timer_init()			do { } while (0)
+#define hpet_register_irq_handler(h)		0
+#define hpet_unregister_irq_handler(h)		do { } while (0)
+extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
+#endif

 struct cmos_rtc {
 	struct rtc_device	*rtc;
@@ -199,6 +211,7 @@ static int cmos_set_alarm(struct device
 	sec = t->time.tm_sec;
 	sec = (sec < 60) ? BIN2BCD(sec) : 0xff;

+	hpet_set_alarm_time(t->time.tm_hour, t->time.tm_min, t->time.tm_sec);
 	spin_lock_irq(&rtc_lock);

 	/* next rtc irq must not be from previous alarm setting */
@@ -252,7 +265,8 @@ static int cmos_irq_set_freq(struct devi
 	f = 16 - f;

 	spin_lock_irqsave(&rtc_lock, flags);
-	CMOS_WRITE(RTC_REF_CLCK_32KHZ | f, RTC_FREQ_SELECT);
+	if (!hpet_set_periodic_freq(freq))
+		CMOS_WRITE(RTC_REF_CLCK_32KHZ | f, RTC_FREQ_SELECT);
 	spin_unlock_irqrestore(&rtc_lock, flags);

 	return 0;
@@ -314,28 +328,37 @@ cmos_rtc_ioctl(struct device *dev, unsig
 	switch (cmd) {
 	case RTC_AIE_OFF:	/* alarm off */
 		rtc_control &= ~RTC_AIE;
+		hpet_mask_rtc_irq_bit(RTC_AIE);
 		break;
 	case RTC_AIE_ON:	/* alarm on */
 		rtc_control |= RTC_AIE;
+		hpet_set_rtc_irq_bit(RTC_AIE);
 		break;
 	case RTC_UIE_OFF:	/* update off */
 		rtc_control &= ~RTC_UIE;
+		hpet_mask_rtc_irq_bit(RTC_UIE);
 		break;
 	case RTC_UIE_ON:	/* update on */
 		rtc_control |= RTC_UIE;
+		hpet_set_rtc_irq_bit(RTC_UIE);
 		break;
 	case RTC_PIE_OFF:	/* periodic off */
 		rtc_control &= ~RTC_PIE;
+		hpet_mask_rtc_irq_bit(RTC_PIE);
 		break;
 	case RTC_PIE_ON:	/* periodic on */
 		rtc_control |= RTC_PIE;
+		hpet_set_rtc_irq_bit(RTC_PIE);
 		break;
 	}
-	CMOS_WRITE(rtc_control, RTC_CONTROL);
+	if (!is_hpet_enabled())
+		CMOS_WRITE(rtc_control, RTC_CONTROL);
+
 	rtc_intr = CMOS_READ(RTC_INTR_FLAGS);
 	rtc_intr &= (rtc_control & RTC_IRQMASK) | RTC_IRQF;
 	if (is_intr(rtc_intr))
 		rtc_update_irq(cmos->rtc, 1, rtc_intr);
+
 	spin_unlock_irqrestore(&rtc_lock, flags);
 	return 0;
 }
@@ -475,15 +498,25 @@ static irqreturn_t cmos_interrupt(int ir
 	u8		rtc_control;

 	spin_lock(&rtc_lock);
-	irqstat = CMOS_READ(RTC_INTR_FLAGS);
-	rtc_control = CMOS_READ(RTC_CONTROL);
-	irqstat &= (rtc_control & RTC_IRQMASK) | RTC_IRQF;
+	/*
+	 * In this case it is HPET RTC interrupt handler
+	 * calling us, with the interrupt information
+	 * passed as arg1, instead of irq.
+	 */
+	if (is_hpet_enabled())
+		irqstat = (unsigned long)irq & 0xF0;
+	else {
+		irqstat = CMOS_READ(RTC_INTR_FLAGS);
+		rtc_control = CMOS_READ(RTC_CONTROL);
+		irqstat &= (rtc_control & RTC_IRQMASK) | RTC_IRQF;
+	}

 	/* All Linux RTC alarms should be treated as if they were oneshot.
 	 * Similar code may be needed in system wakeup paths, in case the
 	 * alarm woke the system.
 	 */
 	if (irqstat & RTC_AIE) {
+		rtc_control = CMOS_READ(RTC_CONTROL);
 		rtc_control &= ~RTC_AIE;
 		CMOS_WRITE(rtc_control, RTC_CONTROL);
 		CMOS_READ(RTC_INTR_FLAGS
[patch 2/3] Use the IRQ callback interface in (old) RTC driver
This patch makes the old RTC driver use the registration callback mechanism that was added in the previous patch. It also removes the direct rtc_interrupt() call from arch/x86/kernel/hpet.c, so that arch/x86/kernel/hpet.c finally has no (code) dependency on CONFIG_RTC. Because of this, it's possible to compile the drivers/char/rtc.ko driver as a module and still use the HPET emulation functionality. This is also expressed in Kconfig.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/Kconfig       |    2 +-
 arch/x86/kernel/hpet.c |    2 --
 drivers/char/rtc.c     |   15 ++-
 3 files changed, 15 insertions(+), 4 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -405,7 +405,7 @@ config HPET_TIMER

 config HPET_EMULATE_RTC
 	def_bool y
-	depends on HPET_TIMER && RTC=y
+	depends on HPET_TIMER && (RTC=y || RTC=m)

 # Mark as embedded because too many people got it wrong.
 # The code disables itself when not needed.
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -708,8 +708,6 @@ irqreturn_t hpet_rtc_interrupt(int irq,
 		rtc_int_flag |= (RTC_IRQF | (RTC_NUM_INTS << 8));
 		if (irq_handler)
 			irq_handler(rtc_int_flag, dev_id);
-
-		rtc_interrupt(rtc_int_flag, dev_id);
 	}
 	return IRQ_HANDLED;
 }
--- a/drivers/char/rtc.c
+++ b/drivers/char/rtc.c
@@ -110,6 +110,8 @@ static int rtc_has_irq = 1;
 #define hpet_set_rtc_irq_bit(arg)		0
 #define hpet_rtc_timer_init()			do { } while (0)
 #define hpet_rtc_dropped_irq()			0
+#define hpet_register_irq_handler(h)		0
+#define hpet_unregister_irq_handler(h)		0
 #ifdef RTC_IRQ
 static irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 {
@@ -1027,7 +1029,15 @@ no_irq:

 #ifdef RTC_IRQ
 	if (is_hpet_enabled()) {
+		int err;
+
 		rtc_int_handler_ptr = hpet_rtc_interrupt;
+		err = hpet_register_irq_handler(rtc_interrupt);
+		if (err != 0) {
+			printk(KERN_WARNING "hpet_register_irq_handler failed "
+					"in rtc_init().");
+			return err;
+		}
 	} else {
 		rtc_int_handler_ptr = rtc_interrupt;
 	}
@@ -1050,6 +1060,7 @@ no_irq:
 	if (misc_register(&rtc_dev)) {
 #ifdef RTC_IRQ
 		free_irq(RTC_IRQ, NULL);
+		hpet_unregister_irq_handler(rtc_interrupt);
 		rtc_has_irq = 0;
 #endif
 		rtc_release_region();
@@ -1141,8 +1152,10 @@ static void __exit rtc_exit(void)
 #else
 	rtc_release_region();
 #ifdef RTC_IRQ
-	if (rtc_has_irq)
+	if (rtc_has_irq) {
 		free_irq(RTC_IRQ, NULL);
+		hpet_unregister_irq_handler(hpet_rtc_interrupt);
+	}
 #endif
 #endif /* CONFIG_SPARC32 */
 }
[patch 0/3] Implement CONFIG_HPET_EMULATE_RTC for RTC_DRV_CMOS
The new rtc-cmos driver lacks HPET support. If the hardware has HPET enabled, interrupts don't work for the rtc-cmos driver, which makes RTC_AIE*, RTC_PIE* and RTC_ALM unusable. This affects hwclock from util-linux-ng, at least on i386, since it uses RTC_PIE_ON. (For x86-64, a polling method is used for unknown reasons.)

This patch series

 1. exports the functions from arch/x86/kernel/hpet.c that the old char/rtc driver uses to work around that problem,
 2. makes it possible to compile the old rtc driver as a module while still having CONFIG_HPET_EMULATE_RTC enabled, and
 3. makes use of the functions exported in (1) in the new rtc-cmos driver.

The design is not changed. Please review and give me feedback!

This patch series is against 2.6.24-rc5-mm1. It passes the test in Documentation/rtc.txt after http://lkml.org/lkml/2007/12/20/249 is applied.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>
[patch 1/3] Make CONFIG_HPET_EMULATE_RTC usable from modules
This patch makes the RTC emulation functions in arch/x86/kernel/hpet.c usable for kernel modules. It

 - exports the functions (EXPORT_SYMBOL_GPL()),
 - adds an interface to register the interrupt callback function instead of using only a fixed callback function and
 - replaces the rtc_get_rtc_time() function, which depends on CONFIG_RTC, with a call to get_rtc_time(), which is defined in include/asm-generic/rtc.h.

The only remaining dependency on CONFIG_RTC is the call to rtc_interrupt(), which is removed by the next patch. After that, these functions no longer have any (code) dependency on CONFIG_RTC=y.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/hpet.c |   47 ++-
 include/asm-x86/hpet.h |    3 +++
 2 files changed, 49 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -107,6 +107,7 @@ int is_hpet_enabled(void)
 {
 	return is_hpet_capable() && hpet_legacy_int_enabled;
 }
+EXPORT_SYMBOL_GPL(is_hpet_enabled);

 /*
  * When the hpet driver (/dev/hpet) is enabled, we need to reserve
@@ -478,6 +479,7 @@ void hpet_disable(void)
  */
 #include <linux/mc146818rtc.h>
 #include <linux/rtc.h>
+#include <asm/rtc.h>

 #define DEFAULT_RTC_INT_FREQ	64
 #define DEFAULT_RTC_SHIFT	6
@@ -492,6 +494,38 @@ static unsigned long hpet_default_delta;
 static unsigned long hpet_pie_delta;
 static unsigned long hpet_pie_limit;

+static rtc_irq_handler irq_handler;
+
+/*
+ * Registers a IRQ handler.
+ */
+int hpet_register_irq_handler(rtc_irq_handler handler)
+{
+	if (!is_hpet_enabled())
+		return -ENODEV;
+	if (irq_handler)
+		return -EBUSY;
+
+	irq_handler = handler;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(hpet_register_irq_handler);
+
+/*
+ * Deregisters the IRQ handler registered with hpet_register_irq_handler()
+ * and does cleanup.
+ */
+void hpet_unregister_irq_handler(rtc_irq_handler handler)
+{
+	if (!is_hpet_enabled())
+		return;
+
+	irq_handler = NULL;
+	hpet_rtc_flags = 0;
+}
+EXPORT_SYMBOL_GPL(hpet_unregister_irq_handler);
+
 /*
  * Timer 1 for RTC emulation. We use one shot mode, as periodic mode
  * is not supported by all HPET implementations for timer 1.
@@ -533,6 +567,7 @@ int hpet_rtc_timer_init(void)

 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_timer_init);

 /*
  * The functions below are called from rtc driver.
@@ -547,6 +582,7 @@ int hpet_mask_rtc_irq_bit(unsigned long
 	hpet_rtc_flags &= ~bit_mask;
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_mask_rtc_irq_bit);

 int hpet_set_rtc_irq_bit(unsigned long bit_mask)
 {
@@ -562,6 +598,7 @@ int hpet_set_rtc_irq_bit(unsigned long b

 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_set_rtc_irq_bit);

 int hpet_set_alarm_time(unsigned char hrs, unsigned char min,
 			unsigned char sec)
@@ -575,6 +612,7 @@ int hpet_set_alarm_time(unsigned char hr

 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_set_alarm_time);

 int hpet_set_periodic_freq(unsigned long freq)
 {
@@ -593,11 +631,13 @@ int hpet_set_periodic_freq(unsigned long
 	}
 	return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_set_periodic_freq);

 int hpet_rtc_dropped_irq(void)
 {
 	return is_hpet_enabled();
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_dropped_irq);

 static void hpet_rtc_timer_reinit(void)
 {
@@ -641,9 +681,10 @@ irqreturn_t hpet_rtc_interrupt(int irq,
 	unsigned long rtc_int_flag = 0;

 	hpet_rtc_timer_reinit();
+	memset(&curr_time, 0, sizeof(struct rtc_time));

 	if (hpet_rtc_flags & (RTC_UIE | RTC_AIE))
-		rtc_get_rtc_time(&curr_time);
+		get_rtc_time(&curr_time);

 	if (hpet_rtc_flags & RTC_UIE &&
 	    curr_time.tm_sec != hpet_prev_update_sec) {
@@ -665,8 +706,12 @@ irqreturn_t hpet_rtc_interrupt(int irq,

 	if (rtc_int_flag) {
 		rtc_int_flag |= (RTC_IRQF | (RTC_NUM_INTS << 8));
+		if (irq_handler)
+			irq_handler(rtc_int_flag, dev_id);
+
 		rtc_interrupt(rtc_int_flag, dev_id);
 	}
 	return IRQ_HANDLED;
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_interrupt);
 #endif
--- a/include/asm-x86/hpet.h
+++ b/include/asm-x86/hpet.h
@@ -69,6 +69,7 @@ extern void force_hpet_resume(void);

 #include <linux/interrupt.h>

+typedef irqreturn_t (*rtc_irq_handler)(int interrupt, void *cookie);
 extern int hpet_mask_rtc_irq_bit(unsigned long bit_mask);
 extern int hpet_set_rtc_irq_bit(unsigned long bit_mask);
 extern int hpet_set_alarm_time(unsigned char hrs, unsigned char min,
@@ -77,6 +78,8 @@ extern int hpet_set_periodic_freq(unsign
 extern int hpet_rtc_dropped_irq(void);
 extern int hpet_rtc_timer_init(void);
 extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
+extern int hpet_register_irq_handler(rtc_irq_handler handler);
+extern void hpet_unregister_irq_handler(rtc_irq_handler handler);

 #endif /* CONFIG_HPET_EMULATE_RTC */
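The registration interface in patch 1/3 is a single-slot callback: one handler at a time, -EBUSY for a second registrant, -ENODEV when the emulation is off. The following user-space sketch mimics that behaviour under stated assumptions. `register_handler()`, `unregister_handler()`, `dispatch()`, `count_calls()` and the `emulation_enabled` switch are illustrative stand-ins, not the kernel functions, and the handler type is simplified to return `int`.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified handler type; the patch uses irqreturn_t (*)(int, void *). */
typedef int (*irq_handler_fn)(int flags, void *cookie);

static irq_handler_fn registered;	/* at most one handler at a time */
static int emulation_enabled = 1;	/* stand-in for is_hpet_enabled() */

/* Claim the single handler slot; fails if emulation is off or slot is taken. */
static int register_handler(irq_handler_fn handler)
{
	if (!emulation_enabled)
		return -ENODEV;
	if (registered)
		return -EBUSY;

	registered = handler;
	return 0;
}

/* Release the slot. As in the patch, the argument is not compared. */
static void unregister_handler(irq_handler_fn handler)
{
	(void)handler;
	if (!emulation_enabled)
		return;

	registered = NULL;
}

/* Called from the (emulated) interrupt path: forward to the handler, if any. */
static int dispatch(int flags, void *cookie)
{
	return registered ? registered(flags, cookie) : 0;
}

/* Example handler: counts invocations through the cookie pointer. */
static int count_calls(int flags, void *cookie)
{
	(void)flags;
	++*(int *)cookie;
	return 1;
}
```

A second `register_handler()` call fails with -EBUSY until the first owner unregisters, which matches the intent that either the old char driver or rtc-cmos owns the emulated interrupt, never both.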
Re: [PATCH] proc_fs.h redux
* Russell King <[EMAIL PROTECTED]> [2007-10-28 14:04]:
> On Sun, Oct 28, 2007 at 12:59:52PM +0100, Bernhard Walle wrote:
> > * Russell King <[EMAIL PROTECTED]> [2007-10-28 11:34]:
> > >
> > > If you go down that route, you end up with _lots_ of circular
> > > dependencies - header file X needs Y needs Z which needs X. We've
> > > been there, several times. It very quickly becomes quite
> > > unmaintainable - you end up with hard to predict behaviour from
> > > include files.
> > >
> > > The only realistic solution is to use forward declarations.
> >
> > In header files, yes. But that's not true for implementation files.
>
> I don't think that needs saying - it's quite obvious. You can't
> access the contents of structures without their definitions being
> available.

Of course. But there might be the case where an implementation file doesn't access the structure itself but just passes the pointer to some other function (which is implemented in another file). In that case, you also have the choice between forward declaration and including the header file in the implementation file.

Thanks,
   Bernhard
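The case described in this reply can be sketched in a few lines of C. Everything here is hypothetical (`struct widget`, `widget_weight()`, `heavier()`), and the three parts that would normally live in a header and two .c files are shown in one translation unit for brevity.

```c
#include <stddef.h>

/* --- what a header would contain: a forward declaration only --- */
struct widget;					/* incomplete type */
int widget_weight(const struct widget *w);	/* implemented elsewhere */

/* --- a caller that only passes the pointer along --- */
static int heavier(const struct widget *a, const struct widget *b)
{
	/* No member access here, so the incomplete type is sufficient. */
	return widget_weight(a) > widget_weight(b);
}

/* --- the implementation file, which needs the full definition --- */
struct widget {
	int weight;
};

int widget_weight(const struct widget *w)
{
	return w ? w->weight : 0;
}
```

`heavier()` compiles before `struct widget` is defined because it never dereferences the pointers. That is the choice mentioned above: such a file can either carry the one-line forward declaration or include the full header, and both work.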
Re: [PATCH] proc_fs.h redux
* Russell King <[EMAIL PROTECTED]> [2007-10-28 11:34]:
>
> If you go down that route, you end up with _lots_ of circular
> dependencies - header file X needs Y needs Z which needs X. We've
> been there, several times. It very quickly becomes quite
> unmaintainable - you end up with hard to predict behaviour from
> include files.
>
> The only realistic solution is to use forward declarations.

In header files, yes. But that's not true for implementation files.

Thanks,
   Bernhard
[patch 1/2] Introduce flags for reserve_bootmem()
This patch changes the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with -EBUSY if the memory already has been reserved in the past. This is to avoid conflicts.

Because that code runs before SMP initialisation, there's no race condition inside reserve_bootmem_core().

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/alpha/kernel/core_irongate.c |    3 ++-
 arch/alpha/kernel/setup.c         |    5 +++--
 arch/alpha/mm/numa.c              |    5 +++--
 arch/arm/mm/init.c                |    4 ++--
 arch/arm/mm/mmu.c                 |   17 +++--
 arch/arm/mm/nommu.c               |    9 ++---
 arch/arm/plat-omap/fb.c           |    2 +-
 arch/avr32/kernel/setup.c         |    6 --
 arch/blackfin/kernel/setup.c      |    2 +-
 arch/cris/kernel/setup.c          |    2 +-
 arch/frv/kernel/setup.c           |   16 ++--
 arch/h8300/kernel/setup.c         |    2 +-
 arch/ia64/mm/contig.c             |    2 +-
 arch/ia64/mm/discontig.c          |    4 ++--
 arch/m32r/kernel/setup.c          |   11 +++
 arch/m32r/mm/discontig.c          |    5 +++--
 arch/m68k/atari/stram.c           |    2 +-
 arch/m68k/kernel/setup.c          |    3 ++-
 arch/m68knommu/kernel/setup.c     |    2 +-
 arch/mips/kernel/setup.c          |    4 ++--
 arch/mips/sgi-ip27/ip27-memory.c  |    3 ++-
 arch/parisc/mm/init.c             |   14 +-
 arch/powerpc/mm/mem.c             |    3 ++-
 arch/powerpc/mm/numa.c            |    2 +-
 arch/s390/kernel/setup.c          |   11 +++
 arch/sh/kernel/setup.c            |   10 ++
 arch/sh/mm/numa.c                 |    4 ++--
 arch/sh64/kernel/setup.c          |    7 +--
 arch/sparc/mm/init.c              |    6 +++---
 arch/sparc64/mm/init.c            |    8
 arch/v850/kernel/anna.c           |    3 ++-
 arch/v850/kernel/as85ep1.c        |    3 ++-
 arch/v850/kernel/rte_ma1_cb.c     |    6 --
 arch/v850/kernel/setup.c          |   12
 arch/x86/kernel/mpparse_32.c      |    6 --
 arch/x86/kernel/setup_32.c        |   15 ---
 arch/x86/kernel/setup_64.c        |    5 +++--
 arch/x86/mm/discontig_32.c        |    3 ++-
 arch/x86/mm/init_64.c             |    6 +++---
 arch/x86/mm/numa_64.c             |    6 --
 arch/x86/mm/srat_64.c             |    3 ++-
 include/asm-x86/mmzone_32.h       |    4 ++--
 include/linux/bootmem.h           |   17 +++--
 mm/bootmem.c                      |   27 +--
 44 files changed, 185 insertions(+), 105 deletions(-)

--- a/arch/alpha/kernel/core_irongate.c
+++ b/arch/alpha/kernel/core_irongate.c
@@ -241,7 +241,8 @@ albacore_init_arch(void)
 			size / 1024);
 	}
 #endif
-	reserve_bootmem_node(NODE_DATA(0), pci_mem, memtop - pci_mem);
+	reserve_bootmem_node(NODE_DATA(0), pci_mem, memtop -
+			pci_mem, BOOTMEM_DEFAULT);
 	printk("irongate_init_arch: temporarily reserving "
 		"region %08lx-%08lx for PCI\n", pci_mem, memtop - 1);
 }
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -429,7 +429,8 @@ setup_memory(void *kernel_end)
 	}

 	/* Reserve the bootmap memory. */
-	reserve_bootmem(PFN_PHYS(bootmap_start), bootmap_size);
+	reserve_bootmem(PFN_PHYS(bootmap_start), bootmap_size,
+			BOOTMEM_DEFAULT);
 	printk("reserving pages %ld:%ld\n", bootmap_start, bootmap_start+PFN_UP(bootmap_size));

 #ifdef CONFIG_BLK_DEV_INITRD
@@ -447,7 +448,7 @@ setup_memory(void *kernel_end)
 				phys_to_virt(PFN_PHYS(max_low_pfn)));
 		} else {
 			reserve_bootmem(virt_to_phys((void *)initrd_start),
-					INITRD_SIZE);
+					INITRD_SIZE, BOOTMEM_DEFAULT);
 		}
 	}
 #endif /* CONFIG_BLK_DEV_INITRD */
--- a/arch/alpha/mm/numa.c
+++ b/arch/alpha/mm/numa.c
@@ -242,7 +242,8 @@ setup_memory_node(int nid, void *kernel_
 	}

 	/* Reserve the bootmap memory. */
-	reserve_bootmem_node(NODE_DATA(nid), PFN_PHYS(bootmap_start), bootmap_size);
+	reserve_bootmem_node(NODE_DATA(nid), PFN_PHYS(bootmap_start),
+			bootmap_size, BOOTMEM_DEFAULT);
 	printk(" reserving pages %ld:%ld\n", bootmap_start, bootmap_start+PFN_UP(bootmap_size));

 	node_set_online(nid);
@@ -281,7 +282,7 @@ setup_memory(void *kernel_end)
 			nid = kvaddr_to_nid(initrd_start);
 			reserve_bootmem_node(NODE_DATA(nid),
 					     virt_to_phys((void *)initrd_start),
-					     INITRD_SIZE);
+
[patch 0/2] Add flags to reserve_bootmem()
This patchset adds a flags parameter to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in the crashkernel reservation code to detect collisions between the crashkernel area and memory that is already in use.
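The semantics of the new flag can be illustrated with a toy page map. This is only a sketch of the idea, not the code in mm/bootmem.c: `reserve_pages()`, `NPAGES` and the `RESERVE_*` constants (with their values) are assumptions made up for this example and stand in for reserve_bootmem(), the bootmem bitmap and BOOTMEM_DEFAULT / BOOTMEM_EXCLUSIVE.

```c
#include <errno.h>
#include <stdbool.h>

#define NPAGES			64
#define RESERVE_DEFAULT		0	/* stand-in for BOOTMEM_DEFAULT */
#define RESERVE_EXCLUSIVE	1	/* stand-in for BOOTMEM_EXCLUSIVE */

static bool reserved[NPAGES];		/* one entry per page, true = reserved */

/*
 * Reserve pages [start, start + len). With RESERVE_EXCLUSIVE the call
 * fails with -EBUSY, and changes nothing, if any page in the range is
 * already reserved. Without the flag, overlapping reservations succeed.
 */
static int reserve_pages(unsigned int start, unsigned int len, int flags)
{
	unsigned int i;

	if (start > NPAGES || len > NPAGES - start)
		return -EINVAL;

	if (flags & RESERVE_EXCLUSIVE) {
		for (i = start; i < start + len; i++)
			if (reserved[i])
				return -EBUSY;
	}

	for (i = start; i < start + len; i++)
		reserved[i] = true;

	return 0;
}
```

A caller such as the crashkernel reservation can then treat a negative return value as "this range collides with something else" and refuse to use it, while existing callers pass the default flag and keep their old behaviour.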
[patch 2/2] Use BOOTMEM_EXCLUSIVE for kdump
This patch uses the BOOTMEM_EXCLUSIVE flag, introduced in the previous patch, to avoid conflicts while reserving the memory for the kdump capture kernel (crashkernel=).

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/sh/kernel/setup.c     |   29 ++---
 arch/x86/kernel/setup_32.c |   28 ++--
 arch/x86/kernel/setup_64.c |   35 +--
 3 files changed, 57 insertions(+), 35 deletions(-)

--- a/arch/sh/kernel/setup.c
+++ b/arch/sh/kernel/setup.c
@@ -140,19 +140,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, free_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(free_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size,
-					BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(free_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -404,18 +404,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(total_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size, BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(total_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -201,28 +201,35 @@ static inline void copy_edd(void)
 #ifdef CONFIG_KEXEC
 static void __init reserve_crashkernel(void)
 {
-	unsigned long long free_mem;
+	unsigned long long total_mem;
 	unsigned long long crash_size, crash_base;
 	int ret;

-	free_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;
+	total_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;

-	ret = parse_crashkernel(boot_command_line, free_mem,
+	ret = parse_crashkernel
[patch 0/2] Add flags to reserve_bootmem()
This patchset adds a flags variable to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions between crashkernel area and already used memory. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 2/2] Use BOOTMEM_EXCLUSIVE for kdump
This patch uses the BOOTMEM_EXCLUSIVE flag, introduced in the previous patch,
to avoid conflicts while reserving the memory for the kdump capture kernel
(crashkernel=).

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/sh/kernel/setup.c     |   29 ++---
 arch/x86/kernel/setup_32.c |   28 ++--
 arch/x86/kernel/setup_64.c |   35 +--
 3 files changed, 57 insertions(+), 35 deletions(-)

--- a/arch/sh/kernel/setup.c
+++ b/arch/sh/kernel/setup.c
@@ -140,19 +140,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, free_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(free_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size,
-					BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(free_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -404,18 +404,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(total_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size, BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(total_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -201,28 +201,35 @@ static inline void copy_edd(void)
 #ifdef CONFIG_KEXEC
 static void __init reserve_crashkernel(void)
 {
-	unsigned long long free_mem;
+	unsigned long long total_mem;
 	unsigned long long crash_size, crash_base;
 	int ret;

-	free_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;
+	total_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;

-	ret = parse_crashkernel(boot_command_line, free_mem,
+	ret = parse_crashkernel
[patch 1/2] Introduce flags for reserve_bootmem()
This patch changes the reserve_bootmem() function to accept a new flag,
BOOTMEM_EXCLUSIVE. If that flag is set, the function returns -EBUSY when the
memory has already been reserved. This is to avoid conflicts. Because that
code runs before SMP initialisation, there's no race condition inside
reserve_bootmem_core().

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/alpha/kernel/core_irongate.c |    3 ++-
 arch/alpha/kernel/setup.c         |    5 +++--
 arch/alpha/mm/numa.c              |    5 +++--
 arch/arm/mm/init.c                |    4 ++--
 arch/arm/mm/mmu.c                 |   17 +++--
 arch/arm/mm/nommu.c               |    9 ++---
 arch/arm/plat-omap/fb.c           |    2 +-
 arch/avr32/kernel/setup.c         |    6 --
 arch/blackfin/kernel/setup.c      |    2 +-
 arch/cris/kernel/setup.c          |    2 +-
 arch/frv/kernel/setup.c           |   16 ++--
 arch/h8300/kernel/setup.c         |    2 +-
 arch/ia64/mm/contig.c             |    2 +-
 arch/ia64/mm/discontig.c          |    4 ++--
 arch/m32r/kernel/setup.c          |   11 +++
 arch/m32r/mm/discontig.c          |    5 +++--
 arch/m68k/atari/stram.c           |    2 +-
 arch/m68k/kernel/setup.c          |    3 ++-
 arch/m68knommu/kernel/setup.c     |    2 +-
 arch/mips/kernel/setup.c          |    4 ++--
 arch/mips/sgi-ip27/ip27-memory.c  |    3 ++-
 arch/parisc/mm/init.c             |   14 +-
 arch/powerpc/mm/mem.c             |    3 ++-
 arch/powerpc/mm/numa.c            |    2 +-
 arch/s390/kernel/setup.c          |   11 +++
 arch/sh/kernel/setup.c            |   10 ++
 arch/sh/mm/numa.c                 |    4 ++--
 arch/sh64/kernel/setup.c          |    7 +--
 arch/sparc/mm/init.c              |    6 +++---
 arch/sparc64/mm/init.c            |    8
 arch/v850/kernel/anna.c           |    3 ++-
 arch/v850/kernel/as85ep1.c        |    3 ++-
 arch/v850/kernel/rte_ma1_cb.c     |    6 --
 arch/v850/kernel/setup.c          |   12
 arch/x86/kernel/mpparse_32.c      |    6 --
 arch/x86/kernel/setup_32.c        |   15 ---
 arch/x86/kernel/setup_64.c        |    5 +++--
 arch/x86/mm/discontig_32.c        |    3 ++-
 arch/x86/mm/init_64.c             |    6 +++---
 arch/x86/mm/numa_64.c             |    6 --
 arch/x86/mm/srat_64.c             |    3 ++-
 include/asm-x86/mmzone_32.h       |    4 ++--
 include/linux/bootmem.h           |   17 +++--
 mm/bootmem.c                      |   27 +--
 44 files changed, 185 insertions(+), 105 deletions(-)

--- a/arch/alpha/kernel/core_irongate.c
+++ b/arch/alpha/kernel/core_irongate.c
@@ -241,7 +241,8 @@ albacore_init_arch(void)
 				size / 1024);
 	}
 #endif
-	reserve_bootmem_node(NODE_DATA(0), pci_mem, memtop - pci_mem);
+	reserve_bootmem_node(NODE_DATA(0), pci_mem, memtop -
+			pci_mem, BOOTMEM_DEFAULT);
 	printk("irongate_init_arch: temporarily reserving "
 			"region %08lx-%08lx for PCI\n", pci_mem, memtop - 1);
 }
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -429,7 +429,8 @@ setup_memory(void *kernel_end)
 	}

 	/* Reserve the bootmap memory. */
-	reserve_bootmem(PFN_PHYS(bootmap_start), bootmap_size);
+	reserve_bootmem(PFN_PHYS(bootmap_start), bootmap_size,
+			BOOTMEM_DEFAULT);
 	printk("reserving pages %ld:%ld\n", bootmap_start, bootmap_start+PFN_UP(bootmap_size));

 #ifdef CONFIG_BLK_DEV_INITRD
@@ -447,7 +448,7 @@ setup_memory(void *kernel_end)
 				       phys_to_virt(PFN_PHYS(max_low_pfn)));
 		} else {
 			reserve_bootmem(virt_to_phys((void *)initrd_start),
-					INITRD_SIZE);
+					INITRD_SIZE, BOOTMEM_DEFAULT);
 		}
 	}
 #endif /* CONFIG_BLK_DEV_INITRD */
--- a/arch/alpha/mm/numa.c
+++ b/arch/alpha/mm/numa.c
@@ -242,7 +242,8 @@ setup_memory_node(int nid, void *kernel_
 	}

 	/* Reserve the bootmap memory. */
-	reserve_bootmem_node(NODE_DATA(nid), PFN_PHYS(bootmap_start), bootmap_size);
+	reserve_bootmem_node(NODE_DATA(nid), PFN_PHYS(bootmap_start),
+			bootmap_size, BOOTMEM_DEFAULT);
 	printk(" reserving pages %ld:%ld\n", bootmap_start, bootmap_start+PFN_UP(bootmap_size));

 	node_set_online(nid);
@@ -281,7 +282,7 @@ setup_memory(void *kernel_end)
 			nid = kvaddr_to_nid(initrd_start);
 			reserve_bootmem_node(NODE_DATA(nid),
 					     virt_to_phys((void *)initrd_start),
-					     INITRD_SIZE);
+					     INITRD_SIZE, BOOTMEM_DEFAULT
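The flag semantics described above can be tried out in user space. The following is only a toy model, not the code in mm/bootmem.c: the page map is a plain byte array, toy_reserve() is a hypothetical name, and the numeric flag values are assumptions made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag names mirror the patch; the values are assumptions for this sketch. */
#define BOOTMEM_DEFAULT   0
#define BOOTMEM_EXCLUSIVE (1 << 0)

#define NPAGES 64

/* One byte per page for readability: 0 = free, 1 = reserved. */
static unsigned char page_reserved[NPAGES];

/*
 * Toy model of reserve_bootmem(): mark pages [start, start + count).
 * With BOOTMEM_EXCLUSIVE the call fails with -EBUSY if any page in the
 * range is already reserved, and leaves the map untouched in that case.
 * Without the flag, overlapping reservations are silently accepted.
 */
int toy_reserve(size_t start, size_t count, int flags)
{
	size_t i;

	assert(start <= NPAGES && count <= NPAGES - start);

	if (flags & BOOTMEM_EXCLUSIVE) {
		for (i = start; i < start + count; i++)
			if (page_reserved[i])
				return -EBUSY;
	}

	for (i = start; i < start + count; i++)
		page_reserved[i] = 1;

	return 0;
}
```

Checking the whole range before marking anything keeps a failed exclusive request free of side effects; no locking is modelled because, as stated above, the real code runs before SMP initialisation.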
[PATCH] Add additional argument to bootmem reservation
This patch adds the additional bootmem reservation argument to all other
architectures which didn't compile after
kexec-introduce-bootmem_exclusive.patch was merged [1]. It also adds a flags
argument to reserve_bootmem_node().

I tested compilation on i386, x86_64 and ia64 with different memory
configurations. I hope that all other architectures work again; if not, drop
me a note with the compiler error and I'll create a patch that fixes it.

[1] Andrew, I thought it was clear from my patch description that the patch
    was not ready to be merged -- however, that patch is the fix that was
    missing, so no need to drop it now.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/alpha/kernel/core_irongate.c |    3 ++-
 arch/alpha/kernel/setup.c         |    5 +++--
 arch/alpha/mm/numa.c              |    5 +++--
 arch/arm/mm/init.c                |    4 ++--
 arch/arm/mm/mmu.c                 |   17 +++--
 arch/arm/mm/nommu.c               |    9 ++---
 arch/arm/plat-omap/fb.c           |    2 +-
 arch/avr32/kernel/setup.c         |    6 --
 arch/blackfin/kernel/setup.c      |    2 +-
 arch/cris/kernel/setup.c          |    2 +-
 arch/frv/kernel/setup.c           |   16 ++--
 arch/h8300/kernel/setup.c         |    2 +-
 arch/ia64/mm/contig.c             |    2 +-
 arch/ia64/mm/discontig.c          |    4 ++--
 arch/m32r/kernel/setup.c          |   11 +++
 arch/m32r/mm/discontig.c          |    5 +++--
 arch/m68k/atari/stram.c           |    2 +-
 arch/m68k/kernel/setup.c          |    3 ++-
 arch/m68knommu/kernel/setup.c     |    2 +-
 arch/mips/kernel/setup.c          |    4 ++--
 arch/mips/sgi-ip27/ip27-memory.c  |    3 ++-
 arch/parisc/mm/init.c             |   14 +-
 arch/powerpc/mm/mem.c             |    3 ++-
 arch/powerpc/mm/numa.c            |    2 +-
 arch/s390/kernel/setup.c          |   11 +++
 arch/sh/kernel/setup.c            |   10 ++
 arch/sh/mm/numa.c                 |    4 ++--
 arch/sh64/kernel/setup.c          |    7 +--
 arch/sparc/mm/init.c              |    6 +++---
 arch/sparc64/mm/init.c            |    8
 arch/v850/kernel/anna.c           |    3 ++-
 arch/v850/kernel/as85ep1.c        |    3 ++-
 arch/v850/kernel/rte_ma1_cb.c     |    6 --
 arch/v850/kernel/setup.c          |   12
 arch/x86/mm/discontig_32.c        |    3 ++-
 arch/x86/mm/init_64.c             |    6 +++---
 arch/x86/mm/numa_64.c             |    6 --
 arch/x86/mm/srat_64.c             |    3 ++-
 include/asm-x86/mmzone_32.h       |    4 ++--
 include/linux/bootmem.h           |    3 ++-
 mm/bootmem.c                      |    4 ++--
 41 files changed, 138 insertions(+), 89 deletions(-)

--- a/arch/alpha/kernel/core_irongate.c
+++ b/arch/alpha/kernel/core_irongate.c
@@ -241,7 +241,8 @@ albacore_init_arch(void)
 				size / 1024);
 	}
 #endif
-	reserve_bootmem_node(NODE_DATA(0), pci_mem, memtop - pci_mem);
+	reserve_bootmem_node(NODE_DATA(0), pci_mem, memtop -
+			pci_mem, BOOTMEM_DEFAULT);
 	printk("irongate_init_arch: temporarily reserving "
 			"region %08lx-%08lx for PCI\n", pci_mem, memtop - 1);
 }
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -429,7 +429,8 @@ setup_memory(void *kernel_end)
 	}

 	/* Reserve the bootmap memory. */
-	reserve_bootmem(PFN_PHYS(bootmap_start), bootmap_size);
+	reserve_bootmem(PFN_PHYS(bootmap_start), bootmap_size,
+			BOOTMEM_DEFAULT);
 	printk("reserving pages %ld:%ld\n", bootmap_start, bootmap_start+PFN_UP(bootmap_size));

 #ifdef CONFIG_BLK_DEV_INITRD
@@ -447,7 +448,7 @@ setup_memory(void *kernel_end)
 				       phys_to_virt(PFN_PHYS(max_low_pfn)));
 		} else {
 			reserve_bootmem(virt_to_phys((void *)initrd_start),
-					INITRD_SIZE);
+					INITRD_SIZE, BOOTMEM_DEFAULT);
 		}
 	}
 #endif /* CONFIG_BLK_DEV_INITRD */
--- a/arch/alpha/mm/numa.c
+++ b/arch/alpha/mm/numa.c
@@ -242,7 +242,8 @@ setup_memory_node(int nid, void *kernel_
 	}

 	/* Reserve the bootmap memory. */
-	reserve_bootmem_node(NODE_DATA(nid), PFN_PHYS(bootmap_start), bootmap_size);
+	reserve_bootmem_node(NODE_DATA(nid), PFN_PHYS(bootmap_start),
+			bootmap_size, BOOTMEM_DEFAULT);
 	printk(" reserving pages %ld:%ld\n", bootmap_start, bootmap_start+PFN_UP(bootmap_size));

 	node_set_online(nid);
@@ -281,7 +282,7 @@ setup_memory(void *kernel_end)
 			nid = kvaddr_to_nid(initrd_start);
 			reserve_bootmem_node(NODE_DATA(nid),
[PATCH] Use BOOTMEM_EXCLUSIVE for crashkernel reservation
This patch implements the usage of BOOTMEM_EXCLUSIVE for crashkernel
reservation on other architectures. The only architecture to which this
applies is sh.

The patch is based on current git tree plus
kexec-introduce-bootmem_exclusive.patch from -mm tree.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/sh/kernel/setup.c |   29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

--- a/arch/sh/kernel/setup.c
+++ b/arch/sh/kernel/setup.c
@@ -140,19 +140,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, free_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(free_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size,
-					BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(free_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
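The reordered control flow in the hunk above (validate the base, try the exclusive reservation, and only then record the resource) can be sketched outside the kernel. Everything here is a hypothetical stand-in: struct region plays the role of crashk_res, the reserve_fn callback stands in for reserve_bootmem() with BOOTMEM_EXCLUSIVE, and the error codes returned to the caller are a choice made for this sketch (the kernel function is void).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Stand-in for crashk_res: an inclusive [start, end] range. */
struct region {
	unsigned long long start, end;
};

/* Hypothetical stand-in for reserve_bootmem(base, size, BOOTMEM_EXCLUSIVE). */
typedef int (*reserve_fn)(unsigned long long base, unsigned long long size);

/*
 * Early-return structure of the patched reserve_crashkernel():
 * the region is recorded only after the exclusive reservation succeeded.
 */
int reserve_crash_region(unsigned long long base, unsigned long long size,
			 reserve_fn reserve, struct region *res)
{
	assert(reserve && res);

	if (size == 0)
		return 0;		/* no crashkernel= requested */

	if (base == 0) {
		fprintf(stderr, "crashkernel reservation failed - "
				"you have to specify a base address\n");
		return -EINVAL;
	}

	if (reserve(base, size) < 0) {
		fprintf(stderr, "crashkernel reservation failed - "
				"memory is in use\n");
		return -EBUSY;
	}

	res->start = base;
	res->end   = base + size - 1;
	return 0;
}

/* Two trivial reservers for experimenting with both outcomes. */
int always_free(unsigned long long base, unsigned long long size)
{
	(void)base; (void)size;
	return 0;
}

int always_busy(unsigned long long base, unsigned long long size)
{
	(void)base; (void)size;
	return -EBUSY;
}
```

In the old code the resource was filled in before reserve_bootmem() was called and its return value was ignored, so a conflicting reservation still showed up as a valid crash kernel region.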
Re: [patch 1/3] Add BSS to resource tree
* Andrew Morton <[EMAIL PROTECTED]> [2007-10-18 23:26]:
> On Thu, 18 Oct 2007 13:15:36 +0200
> Bernhard Walle <[EMAIL PROTECTED]> wrote:
>
> > This patch adds the BSS to the resource tree just as kernel text and kernel
> > data are in the resource tree. The main reason behind this is to avoid
> > crashkernel reservation in that area.
> >
> > While it's not strictly necessary to have the BSS in the resource tree
> > (the actual collision detection is done in the reserve_bootmem() function
> > before), the usage of the BSS resource should be presented to the user
> > in /proc/iomem just as Kernel data and Kernel code.
> >
> > Note: The patch currently is only implemented for x86 and ia64 (because
> > efi_initialize_iomem_resources() has the same signature on i386 and
> > ia64).
> >
> >
> > ...
> >
> > -extern char _text[], _end[], _etext[];
> > +
> > +static struct resource bss_resource = {
> > +	.name	= "Kernel bss",
> > +	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
> > +};
> > +extern char _text[], _end[], _etext[], _edata[], _bss[];
>
> These should be in a header file.

They already are ... the problem was just that IA64 uses _bss instead of
__bss_start. So I think we should change this. I verified that the kernel
still compiles with the patch below.

Thanks,
   Bernhard

---
[PATCH] Rename _bss to __bss_start

Rename _bss to __bss_start as on other architectures. That makes it possible
to use <linux/sections.h> instead of own declarations. Also add __bss_stop
because that symbol exists on other architectures.

This patch applies to current git plus kexec-add-bss-to-resource-tree.patch
in -mm tree.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/ia64/hp/sim/boot/bootloader.lds |    3 ++-
 arch/ia64/kernel/setup.c             |    3 +--
 arch/ia64/kernel/vmlinux.lds.S       |    3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

--- a/arch/ia64/hp/sim/boot/bootloader.lds
+++ b/arch/ia64/hp/sim/boot/bootloader.lds
@@ -22,10 +22,11 @@ SECTIONS
   .sdata : { *(.sdata) }
   _edata = .;

-  _bss = .;
+  __bss_start = .;
   .sbss : { *(.sbss) *(.scommon) }
   .bss : { *(.bss) *(COMMON) }
   . = ALIGN(64 / 8);
+  __bss_stop = .;
   _end = . ;

   /* Stabs debugging sections. */
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -95,7 +95,6 @@ static struct resource bss_resource = {
 	.name	= "Kernel bss",
 	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
 };
-extern char _text[], _end[], _etext[], _edata[], _bss[];

 unsigned long ia64_max_cacheline_size;

@@ -206,7 +205,7 @@ static int __init register_memory(void)
 	code_resource.end   = ia64_tpa(_etext) - 1;
 	data_resource.start = ia64_tpa(_etext);
 	data_resource.end   = ia64_tpa(_edata) - 1;
-	bss_resource.start  = ia64_tpa(_bss);
+	bss_resource.start  = ia64_tpa(__bss_start);
 	bss_resource.end    = ia64_tpa(_end) - 1;
 	efi_initialize_iomem_resources(&code_resource, &data_resource,
 				       &bss_resource);
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -240,11 +240,12 @@ SECTIONS
   .sdata : AT(ADDR(.sdata) - LOAD_OFFSET)
 	{ *(.sdata) *(.sdata1) *(.srdata) }
   _edata = .;
-  _bss = .;
+  __bss_start = .;
   .sbss : AT(ADDR(.sbss) - LOAD_OFFSET)
 	{ *(.sbss) *(.scommon) }
   .bss : AT(ADDR(.bss) - LOAD_OFFSET)
 	{ *(.bss) *(COMMON) }
+  __bss_stop = .;

   _end = .;
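For readers unfamiliar with linker-defined section symbols, the same idea can be tried in user space. This is a sketch that assumes a Linux/ELF toolchain whose default linker script (GNU ld or lld) provides __bss_start and _end; it will not link elsewhere. The helper names bss_span(), in_bss() and scratch_addr() are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Provided by the linker script, not by any C object (assumption: default
 * ELF linker script). Declared as arrays so that the bare symbol name
 * evaluates to the address the linker assigned.
 */
extern char __bss_start[], _end[];

/* Zero-initialised static storage normally lands in .bss. */
static char scratch[4096];

/* Number of bytes between the start of .bss and the end of the image. */
size_t bss_span(void)
{
	assert((uintptr_t)_end >= (uintptr_t)__bss_start);
	return (size_t)((uintptr_t)_end - (uintptr_t)__bss_start);
}

/* Does the address fall inside [__bss_start, _end)? */
int in_bss(const void *p)
{
	uintptr_t a = (uintptr_t)p;

	return a >= (uintptr_t)__bss_start && a < (uintptr_t)_end;
}

/* Hand out a writable pointer so the buffer cannot be optimised away. */
char *scratch_addr(void)
{
	return scratch;
}
```

Using one symbol name on every architecture is what allows such declarations to live in a single shared header rather than being repeated per file, which is the point of the rename.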
Re: [patch 1/3] Add BSS to resource tree
* Andrew Morton <[EMAIL PROTECTED]> [2007-10-18 23:26]:
>
> > --- a/arch/x86/kernel/e820_64.c
> > +++ b/arch/x86/kernel/e820_64.c
> > @@ -47,7 +47,7 @@ unsigned long end_pfn_map;
> >   */
> >  static unsigned long __initdata end_user_pfn = MAXMEM>>PAGE_SHIFT;
> >
> > -extern struct resource code_resource, data_resource;
> > +extern struct resource code_resource, data_resource, bss_resource;
>
> As should these. afaik they're the same on all architectures and even if
> they have different names on some weird arch, an unused declaration won't
> hurt.

It's different on every architecture. Some don't have it at all (like
PPC64), and most of them have code_resource and data_resource static.

Instead of making the data on all architectures (that have it) global, I'd
like to make it static on x86. See my attached patch.

Thanks,
   Bernhard

---
[PATCH] Remove extern declarations for code/data/bss resource

This patch removes the extern struct resource declarations for
data_resource, code_resource and bss_resource on x86 and declares those
three structures as static, as done on other architectures like IA64.

On i386, these structures are moved to setup_32.c (from e820_32.c) because
that's code that is not specific to e820 and also required on EFI systems.
That makes the "extern" reference superfluous.

On x86_64, data_resource, code_resource and bss_resource are passed to
e820_reserve_resources() as arguments just as done on i386 and IA64. That
also avoids the "extern" reference and it's possible to make it static.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/e820_32.c  |   49
 arch/x86/kernel/e820_64.c  |   11 ++--
 arch/x86/kernel/setup_32.c |  106 +++--
 arch/x86/kernel/setup_64.c |    6 +-
 4 files changed, 111 insertions(+), 61 deletions(-)

--- a/arch/x86/kernel/e820_32.c
+++ b/arch/x86/kernel/e820_32.c
@@ -37,26 +37,6 @@ unsigned long pci_mem_start = 0x10000000
 EXPORT_SYMBOL(pci_mem_start);
 #endif
 extern int user_defined_memmap;
-struct resource data_resource = {
-	.name	= "Kernel data",
-	.start	= 0,
-	.end	= 0,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
-};
-
-struct resource code_resource = {
-	.name	= "Kernel code",
-	.start	= 0,
-	.end	= 0,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
-};
-
-struct resource bss_resource = {
-	.name	= "Kernel bss",
-	.start	= 0,
-	.end	= 0,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
-};

 static struct resource system_rom_resource = {
 	.name	= "System ROM",
@@ -111,60 +91,6 @@ static struct resource video_rom_resourc
 	.flags	= IORESOURCE_BUSY | IORESOURCE_READONLY | IORESOURCE_MEM
 };

-static struct resource video_ram_resource = {
-	.name	= "Video RAM area",
-	.start	= 0xa0000,
-	.end	= 0xbffff,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
-};
-
-static struct resource standard_io_resources[] = { {
-	.name	= "dma1",
-	.start	= 0x0000,
-	.end	= 0x001f,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "pic1",
-	.start	= 0x0020,
-	.end	= 0x0021,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "timer0",
-	.start	= 0x0040,
-	.end	= 0x0043,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "timer1",
-	.start	= 0x0050,
-	.end	= 0x0053,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "keyboard",
-	.start	= 0x0060,
-	.end	= 0x006f,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "dma page reg",
-	.start	= 0x0080,
-	.end	= 0x008f,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "pic2",
-	.start	= 0x00a0,
-	.end	= 0x00a1,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "dma2",
-	.start	= 0x00c0,
-	.end	= 0x00df,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-}, {
-	.name	= "fpu",
-	.start	= 0x00f0,
-	.end	= 0x00ff,
-	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
-} };
-
 #define ROMSIGNATURE 0xaa55

 static int __init romsignature(const unsigned char *rom)
@@ -260,10 +186,9 @@ static void __init probe_roms(void)
  * Request address space for all standard RAM and ROM resources
  * and also for regions reported as reserved by the e820.
  */
-static void __init
-legacy_init_iomem_resources(struct resource *code_resource,
-		struct resource *data_resource,
-		struct resource *bss_resource)
+void __init legacy_init_iomem_resources(struct resource *code_resource,
+		struct resource *data_resource,
+		struct resource *bss_resource)
 {
 	int i;

@@ -305,35 +230,6 @@ legacy_init_iomem_resources(struct resou
 	}
 }

-/*
- * Request address space for all standard resources
[patch 1/3] Add BSS to resource tree
This patch adds the BSS to the resource tree just as kernel text and
kernel data are in the resource tree. The main reason behind this is
to avoid crashkernel reservation in that area.

While it's not strictly necessary to have the BSS in the resource tree
(the actual collision detection is done in the reserve_bootmem()
function before), the usage of the BSS resource should be presented to
the user in /proc/iomem just as Kernel data and Kernel code.

Note: The patch currently is only implemented for x86 and ia64 (because
efi_initialize_iomem_resources() has the same signature on i386 and
ia64).

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/ia64/kernel/efi.c     |    4 +++-
 arch/ia64/kernel/setup.c   |   14 +++--
 arch/x86/kernel/e820_32.c  |   18 +++--
 arch/x86/kernel/e820_64.c  |    3 ++-
 arch/x86/kernel/efi_32.c   |    4 +++-
 arch/x86/kernel/setup_32.c |    4
 arch/x86/kernel/setup_64.c |    9 +
 include/linux/efi.h        |    2 +-
 8 files changed, 48 insertions(+), 10 deletions(-)

--- a/arch/ia64/kernel/efi.c
+++ b/arch/ia64/kernel/efi.c
@@ -1090,7 +1090,8 @@ efi_memmap_init(unsigned long *s, unsign
 
 void
 efi_initialize_iomem_resources(struct resource *code_resource,
-			       struct resource *data_resource)
+			       struct resource *data_resource,
+			       struct resource *bss_resource)
 {
 	struct resource *res;
 	void *efi_map_start, *efi_map_end, *p;
@@ -1171,6 +1172,7 @@ efi_initialize_iomem_resources(struct re
 			 */
 			insert_resource(res, code_resource);
 			insert_resource(res, data_resource);
+			insert_resource(res, bss_resource);
 #ifdef CONFIG_KEXEC
 			insert_resource(res, &efi_memmap_res);
 			insert_resource(res, &boot_param_res);
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -90,7 +90,12 @@ static struct resource code_resource = {
 	.name	= "Kernel code",
 	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
 };
-extern char _text[], _end[], _etext[];
+
+static struct resource bss_resource = {
+	.name	= "Kernel bss",
+	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
+};
+extern char _text[], _end[], _etext[], _edata[], _bss[];
 
 unsigned long ia64_max_cacheline_size;
 
@@ -200,8 +205,11 @@ static int __init register_memory(void)
 	code_resource.start = ia64_tpa(_text);
 	code_resource.end   = ia64_tpa(_etext) - 1;
 	data_resource.start = ia64_tpa(_etext);
-	data_resource.end   = ia64_tpa(_end) - 1;
-	efi_initialize_iomem_resources(&code_resource, &data_resource);
+	data_resource.end   = ia64_tpa(_edata) - 1;
+	bss_resource.start  = ia64_tpa(_bss);
+	bss_resource.end    = ia64_tpa(_end) - 1;
+	efi_initialize_iomem_resources(&code_resource, &data_resource,
+			&bss_resource);
 
 	return 0;
 }
--- a/arch/x86/kernel/e820_32.c
+++ b/arch/x86/kernel/e820_32.c
@@ -51,6 +51,13 @@ struct resource code_resource = {
 	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
 };
 
+struct resource bss_resource = {
+	.name	= "Kernel bss",
+	.start	= 0,
+	.end	= 0,
+	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
+};
+
 static struct resource system_rom_resource = {
 	.name	= "System ROM",
 	.start	= 0xf0000,
@@ -254,7 +261,9 @@ static void __init probe_roms(void)
  * and also for regions reported as reserved by the e820.
  */
 static void __init
-legacy_init_iomem_resources(struct resource *code_resource, struct resource *data_resource)
+legacy_init_iomem_resources(struct resource *code_resource,
+			    struct resource *data_resource,
+			    struct resource *bss_resource)
 {
 	int i;
 
@@ -287,6 +296,7 @@ legacy_init_iomem_resources(struct resou
 			 */
 			request_resource(res, code_resource);
 			request_resource(res, data_resource);
+			request_resource(res, bss_resource);
 #ifdef CONFIG_KEXEC
 			if (crashk_res.start != crashk_res.end)
 				request_resource(res, &crashk_res);
@@ -307,9 +317,11 @@ static int __init request_standard_resou
 
 	printk("Setting up standard PCI resources\n");
 	if (efi_enabled)
-		efi_initialize_iomem_resources(&code_resource, &data_resource);
+		efi_initialize_iomem_resources(&code_resource,
+				&data_resource, &bss_resource);
 	else
-		legacy_init_iomem_resources(&code_resource, &data_resource);
+		legacy_init_iomem_resources(&code_resource,
+				&data_resource, &bss_resource);
 
 	/* EFI systems may still have VGA */
 	request_resource(&iomem_resource, &video_ram_resource);
--- a/arch/x86/kernel/
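For readers who want to see the boundaries the ia64 hunk above sets up, here is a minimal userspace sketch. It is not kernel code: struct model_resource, struct image_layout and split_image() are hypothetical names, and the layout values stand in for the linker symbols _text, _etext, _edata, _bss and _end. It only mirrors the start/end arithmetic from register_memory(), with inclusive end addresses as in struct resource.

```c
#include <assert.h>

/* Hypothetical stand-in for the start/end/name part of struct resource. */
struct model_resource {
	const char *name;
	unsigned long start;
	unsigned long end;	/* inclusive, as in the kernel's struct resource */
};

/* Hypothetical stand-in for the physical addresses of the linker symbols. */
struct image_layout {
	unsigned long text;	/* _text  */
	unsigned long etext;	/* _etext */
	unsigned long edata;	/* _edata */
	unsigned long bss;	/* _bss   */
	unsigned long end;	/* _end   */
};

/*
 * Split the kernel image into code, data and bss ranges the same way
 * the ia64 hunk does: data now stops at _edata instead of _end, and
 * bss covers _bss up to _end.
 */
static void split_image(const struct image_layout *l,
			struct model_resource out[3])
{
	assert(l->text < l->etext && l->etext < l->edata &&
	       l->edata <= l->bss && l->bss < l->end);

	out[0] = (struct model_resource){ "Kernel code", l->text, l->etext - 1 };
	out[1] = (struct model_resource){ "Kernel data", l->etext, l->edata - 1 };
	out[2] = (struct model_resource){ "Kernel bss", l->bss, l->end - 1 };
}
```

With such a split, the three ranges do not overlap each other, which is what lets them sit side by side as separate entries in the resource tree.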
[patch 2/3] Introduce BOOTMEM_EXCLUSIVE
This patch changes the reserve_bootmem() function to accept a new flag
BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with
-EBUSY if the memory already has been reserved in the past. This is to
avoid conflicts.

Because it's necessary to unreserve the bootmem if a collision is
discovered in the middle of the area, a rwlock is introduced: only one
BOOTMEM_EXCLUSIVE caller is possible, but multiple BOOTMEM_DEFAULT
callers. But if a BOOTMEM_EXCLUSIVE caller is in
reserve_bootmem_core(), no BOOTMEM_DEFAULT callers are allowed.

IMPORTANT: The patch is only proof of concept. This means that it's
only for x86 and breaks other architectures. If the patch is ok, I'll
change all other architectures, too.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/mpparse_32.c |    6 --
 arch/x86/kernel/setup_32.c   |   15 ---
 arch/x86/kernel/setup_64.c   |    5 +++--
 include/linux/bootmem.h      |   14 +-
 mm/bootmem.c                 |   25 -
 5 files changed, 48 insertions(+), 17 deletions(-)

--- a/arch/x86/kernel/mpparse_32.c
+++ b/arch/x86/kernel/mpparse_32.c
@@ -736,7 +736,8 @@ static int __init smp_scan_config (unsig
 			smp_found_config = 1;
 			printk(KERN_INFO "found SMP MP-table at %08lx\n",
 						virt_to_phys(mpf));
-			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE);
+			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE,
+					BOOTMEM_DEFAULT);
 			if (mpf->mpf_physptr) {
 				/*
 				 * We cannot access to MPC table to compute
@@ -751,7 +752,8 @@ static int __init smp_scan_config (unsig
 				unsigned long end = max_low_pfn * PAGE_SIZE;
 				if (mpf->mpf_physptr + size > end)
 					size = end - mpf->mpf_physptr;
-				reserve_bootmem(mpf->mpf_physptr, size);
+				reserve_bootmem(mpf->mpf_physptr, size,
+						BOOTMEM_DEFAULT);
 			}
 
 			mpf_found = mpf;
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -317,7 +317,7 @@ static void __init reserve_ebda_region(v
 	unsigned int addr;
 	addr = get_bios_ebda();
 	if (addr)
-		reserve_bootmem(addr, PAGE_SIZE);
+		reserve_bootmem(addr, PAGE_SIZE, BOOTMEM_DEFAULT);
 }
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
@@ -411,7 +411,7 @@ static void __init reserve_crashkernel(v
 					(unsigned long)(total_mem >> 20));
 			crashk_res.start = crash_base;
 			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size);
+			reserve_bootmem(crash_base, crash_size, BOOTMEM_DEFAULT);
 		} else
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
@@ -439,13 +439,14 @@ void __init setup_bootmem_allocator(void
 	 * bootmem allocator with an invalid RAM area.
 	 */
 	reserve_bootmem(__pa_symbol(_text), (PFN_PHYS(min_low_pfn) +
-			 bootmap_size + PAGE_SIZE-1) - __pa_symbol(_text));
+			 bootmap_size + PAGE_SIZE-1) - __pa_symbol(_text),
+			 BOOTMEM_DEFAULT);
 
 	/*
 	 * reserve physical page 0 - it's a special BIOS page on many boxes,
 	 * enabling clean reboots, SMP operation, laptop functions.
 	 */
-	reserve_bootmem(0, PAGE_SIZE);
+	reserve_bootmem(0, PAGE_SIZE, BOOTMEM_DEFAULT);
 
 	/* reserve EBDA region, it's a 4K region */
 	reserve_ebda_region();
@@ -455,7 +456,7 @@ void __init setup_bootmem_allocator(void
 	   unless you have no PS/2 mouse plugged in. */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
 	    boot_cpu_data.x86 == 6)
-	     reserve_bootmem(0xa0000 - 4096, 4096);
+	     reserve_bootmem(0xa0000 - 4096, 4096, BOOTMEM_DEFAULT);
 
 #ifdef CONFIG_SMP
 	/*
@@ -463,7 +464,7 @@ void __init setup_bootmem_allocator(void
 	 * FIXME: Don't need the extra page at 4K, but need to fix
 	 * trampoline before removing it. (see the GDT stuff)
 	 */
-	reserve_bootmem(PAGE_SIZE, PAGE_SIZE);
+	reserve_bootmem(PAGE_SIZE, PAGE_SIZE, BOOTMEM_DEFAULT);
 #endif
 #ifdef CONFIG_ACPI_SLEEP
 	/*
@@ -481,7 +482,7 @@ void __init setup_bootmem_allocator(void
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (LOADER_TYPE && INITRD_START) {
 		if (INITRD_START + INITRD_SIZE <= (max_low_pfn << PAGE_SHIFT)) {
-			reserve_bootmem(INITRD_START, INITRD_SIZE
[patch 3/3] Use BOOTMEM_EXCLUSIVE on x86
This patch uses the BOOTMEM_EXCLUSIVE flag, introduced in the previous
patch, to avoid conflicts while reserving the memory for the kdump
capture kernel (crashkernel=).

The modification has been tested on i386.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/setup_32.c |   28 ++--
 arch/x86/kernel/setup_64.c |   35 +--
 2 files changed, 39 insertions(+), 24 deletions(-)

--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -403,18 +403,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(total_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size, BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(total_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -201,28 +201,35 @@ static inline void copy_edd(void)
 #ifdef CONFIG_KEXEC
 static void __init reserve_crashkernel(void)
 {
-	unsigned long long free_mem;
+	unsigned long long total_mem;
 	unsigned long long crash_size, crash_base;
 	int ret;
 
-	free_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;
+	total_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;
 
-	ret = parse_crashkernel(boot_command_line, free_mem,
+	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(free_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size,
-					BOOTMEM_DEFAULT);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long
[patch 0/3] Protect crashkernel against BSS overlap
I observed the problem that even when you choose the default 16M as
crashkernel base address and the kernel is very big, the reserved area
may overlap with the kernel BSS. Currently, this is not checked at
runtime, so the kernel just crashes when you load the panic kernel in
the sys_kexec call.

These three patches check this at runtime. The patches are against
current git, but with the patches

    extended-crashkernel-command-line.patch
    extended-crashkernel-command-line-update.patch
    extended-crashkernel-command-line-comment-fix.patch
    extended-crashkernel-command-line-improve-error-handling-in-parse_crashkernel_mem.patch
    use-extended-crashkernel-command-line-on-i386.patch
    use-extended-crashkernel-command-line-on-i386-update.patch
    use-extended-crashkernel-command-line-on-x86_64.patch
    use-extended-crashkernel-command-line-on-x86_64-update.patch
    use-extended-crashkernel-command-line-on-ia64.patch
    use-extended-crashkernel-command-line-on-ia64-fix.patch
    use-extended-crashkernel-command-line-on-ia64-update.patch
    use-extended-crashkernel-command-line-on-ppc64.patch
    use-extended-crashkernel-command-line-on-ppc64-update.patch
    use-extended-crashkernel-command-line-on-sh.patch
    use-extended-crashkernel-command-line-on-sh-update.patch

from the -mm tree applied, since they are marked to be merged in 2.6.24.

I know that the implementation of both patches is only x86 (i386 and
x86-64), but if you agree that it's the way to go, I'll modify the
patch for all architectures.

Changes compared to last submit:

 1) use BOOTMEM_DEFAULT instead of 0 to improve code readability
    (suggested by Dave Hansen <[EMAIL PROTECTED]>)
 2) unreserve memory that got reserved until we detect a duplicate
    reservation (discovered by Vivek Goyal <[EMAIL PROTECTED]>)
 3) fix IA64 (didn't compile)

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
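The condition described in the cover letter, a crashkernel region that runs into the kernel BSS, boils down to an overlap test on two inclusive address ranges. The following is only a userspace sketch of that test under assumed names: struct range, ranges_overlap() and crash_range() are hypothetical and do not appear in the patches, where the actual detection happens page by page inside reserve_bootmem().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive physical address range, modelled on struct resource. */
struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Two inclusive ranges overlap iff each one starts before the other ends. */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
	assert(a->start <= a->end && b->start <= b->end);
	return a->start <= b->end && b->start <= a->end;
}

/* Range covered by crashkernel=size@base; size must be non-zero. */
static struct range crash_range(uint64_t base, uint64_t size)
{
	assert(size > 0);
	struct range r = { .start = base, .end = base + size - 1 };
	return r;
}
```

As an illustration, a 64M region at the 16M base overlaps a BSS that spans 15M to 17M, while a BSS that ends just below 16M is left alone.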
Re: [patch 2/3] Introduce BOOTMEM_EXCLUSIVE
* Vivek Goyal <[EMAIL PROTECTED]> [2007-10-17 13:05]:
>
> I think we should unreserve the chunks of memory we have reserved so
> far (Memory reserved from sidx to i), in case of error.

True. Next version is coming.

Thanks,
   Bernhard
Re: [patch 2/3] Introduce BOOTMEM_EXCLUSIVE
* Vivek Goyal <[EMAIL PROTECTED]> [2007-10-17 13:05]:
>
> [..]
> > +/*
> > + * If flags is 0, then the return value is always 0 (success). If
> > + * flags contains BOOTMEM_EXCLUSIVE, then -EBUSY is returned if the
> > + * memory already was reserved.
> > + */
> > +extern int reserve_bootmem(unsigned long addr, unsigned long size, int flags);
> >  #define alloc_bootmem(x) \
> >  	__alloc_bootmem(x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
> >  #define alloc_bootmem_low(x) \
> > --- a/mm/bootmem.c
> > +++ b/mm/bootmem.c
> > @@ -111,8 +111,8 @@ static unsigned long __init init_bootmem
> >   * might be used for boot-time allocations - or it might get added
> >   * to the free page pool later on.
> >   */
> > -static void __init reserve_bootmem_core(bootmem_data_t *bdata, unsigned long addr,
> > -					unsigned long size)
> > +static int __init reserve_bootmem_core(bootmem_data_t *bdata, unsigned long addr,
> > +					unsigned long size, int flags)
> >  {
> >  	unsigned long sidx, eidx;
> >  	unsigned long i;
> > @@ -133,7 +133,11 @@ static void __init reserve_bootmem_core(
> >  #ifdef CONFIG_DEBUG_BOOTMEM
> >  			printk("hm, page %08lx reserved twice.\n", i*PAGE_SIZE);
> >  #endif
> > +			if (flags & BOOTMEM_EXCLUSIVE)
> > +				return -EBUSY;
>
> I think we should unreserve the chunks of memory we have reserved so
> far (Memory reserved from sidx to i), in case of error.

Unfortunately, that's not possible without using a lock (or counters
instead of a bitmap) any more. If we just do

	for (i--; i >= sidx; i--)
		clear_bit(i, bdata->node_bootmem_map);

then another thread of execution could reserve the memory (without
BOOTMEM_EXCLUSIVE) in between -- and the code would free the memory
which is already reserved.

I think that could be modelled with a rwlock, not changing the default
case where BOOTMEM_EXCLUSIVE is not specified.

Thanks,
   Bernhard
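To make the rollback under discussion concrete, here is a single-threaded userspace model. It is a sketch under stated assumptions, not the code from mm/bootmem.c: model_map, model_reserve(), MODEL_DEFAULT and MODEL_EXCLUSIVE are hypothetical stand-ins for the bootmem bitmap, reserve_bootmem_core() and the two flags, and the locking concern raised above is deliberately left out. The undo step is written as a while loop so that the unsigned index never has to drop below sidx, which matters when sidx is 0.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGES	64	/* hypothetical size of the page map */
#define MODEL_DEFAULT	0	/* stands in for BOOTMEM_DEFAULT */
#define MODEL_EXCLUSIVE	1	/* stands in for BOOTMEM_EXCLUSIVE */

static unsigned char model_map[MODEL_PAGES];	/* 1 = page is reserved */

/*
 * Reserve pages [sidx, eidx). In default mode an already reserved page
 * is tolerated and 0 is always returned. In exclusive mode the first
 * already reserved page aborts the call with -EBUSY after the pages
 * set by this call, [sidx, i), have been cleared again.
 */
static int model_reserve(unsigned long sidx, unsigned long eidx, int flags)
{
	unsigned long i;

	assert(sidx <= eidx && eidx <= MODEL_PAGES);

	for (i = sidx; i < eidx; i++) {
		if (model_map[i]) {
			if (flags & MODEL_EXCLUSIVE) {
				/* All of [sidx, i) was free before this call. */
				while (i > sidx)
					model_map[--i] = 0;
				return -EBUSY;
			}
			continue;	/* default mode: reserved twice is fine */
		}
		model_map[i] = 1;
	}
	return 0;
}
```

In this model, an exclusive request for pages 8 to 15 that collides with an earlier default reservation of pages 10 and 11 returns -EBUSY and leaves pages 8 and 9 free again, while pages 10 and 11 stay reserved. The model has no second thread of execution, so the window between setting and clearing the bits is not represented.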
Re: [patch 2/3] Introduce BOOTMEM_EXCLUSIVE
Hi,

* Dave Hansen <[EMAIL PROTECTED]> [2007-10-16 20:08]:
> On Tue, 2007-10-16 at 18:28 +0200, Bernhard Walle wrote:
> >
> > @@ -736,7 +736,7 @@ static int __init smp_scan_config (unsig
> >  			smp_found_config = 1;
> >  			printk(KERN_INFO "found SMP MP-table at %08lx\n",
> >  						virt_to_phys(mpf));
> > -			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE);
> > +			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE, 0);
> >  			if (mpf->mpf_physptr) {
> >  				/*
>
> Could you give all of these 0's a name? I really hate seeing random
> magic numbers in these things. 0 completely kills the ability of
> someone to read the code and figure out what it is trying to do without
> going and looking at reserve_bootmem().

Of course I can replace those zeroes with something like
BOOTMEM_DEFAULT.

> Or, alternatively, do something like this:
>
> -extern void reserve_bootmem(unsigned long addr, unsigned long size);

Andi was against more bootmem functions. ;)

Thanks,
   Bernhard
[patch 3/3] Use BOOTMEM_EXCLUSIVE on x86
This patch uses the BOOTMEM_EXCLUSIVE flag, introduced in the previous
patch, to avoid conflicts while reserving the memory for the kdump
capture kernel (crashkernel=).

The modification has been tested on i386.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/setup_32.c |   28 ++--
 arch/x86/kernel/setup_64.c |   34 +-
 2 files changed, 39 insertions(+), 23 deletions(-)

--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -403,18 +403,26 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(total_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(total_mem >> 20));
+		crashk_res.start = crash_base;
+		crashk_res.end   = crash_base + crash_size - 1;
 	}
 }
 #else
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -201,27 +201,35 @@ static inline void copy_edd(void)
 #ifdef CONFIG_KEXEC
 static void __init reserve_crashkernel(void)
 {
-	unsigned long long free_mem;
+	unsigned long long total_mem;
 	unsigned long long crash_size, crash_base;
 	int ret;
 
-	free_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;
+	total_mem = ((unsigned long long)max_low_pfn - min_low_pfn) << PAGE_SHIFT;
 
-	ret = parse_crashkernel(boot_command_line, free_mem,
+	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(free_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end   = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size, 0);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (reserve_bootmem(crash_base, crash_size,
+					BOOTMEM_EXCLUSIVE) < 0) {
+			printk(KERN_INFO "crashkernel reservation failed - "
+					"memory is in use\n");
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(total_mem >> 20));
+		crashk_res.start = crash_base;
[patch 2/3] Introduce BOOTMEM_EXCLUSIVE
This patch changes the reserve_bootmem() function to accept a new flag,
BOOTMEM_EXCLUSIVE. If that flag is set, the function returns -EBUSY if the
memory has already been reserved. This is to avoid conflicts.

IMPORTANT: The patch is only a proof of concept. This means that it's only
for x86 and breaks other architectures. If the patch is ok, I'll change all
other architectures, too.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/mpparse_32.c |  4 ++--
 arch/x86/kernel/setup_32.c   | 12 ++--
 arch/x86/kernel/setup_64.c   |  2 +-
 include/linux/bootmem.h      | 13 -
 mm/bootmem.c                 | 15 ++-
 5 files changed, 31 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/mpparse_32.c
+++ b/arch/x86/kernel/mpparse_32.c
@@ -736,7 +736,7 @@ static int __init smp_scan_config (unsig
 			smp_found_config = 1;
 			printk(KERN_INFO "found SMP MP-table at %08lx\n",
 						virt_to_phys(mpf));
-			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE);
+			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE, 0);
 			if (mpf->mpf_physptr) {
 				/*
 				 * We cannot access to MPC table to compute
@@ -751,7 +751,7 @@ static int __init smp_scan_config (unsig
 				unsigned long end = max_low_pfn * PAGE_SIZE;
 				if (mpf->mpf_physptr + size > end)
 					size = end - mpf->mpf_physptr;
-				reserve_bootmem(mpf->mpf_physptr, size);
+				reserve_bootmem(mpf->mpf_physptr, size, 0);
 			}
 
 			mpf_found = mpf;
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -317,7 +317,7 @@ static void __init reserve_ebda_region(v
 	unsigned int addr;
 	addr = get_bios_ebda();
 	if (addr)
-		reserve_bootmem(addr, PAGE_SIZE);
+		reserve_bootmem(addr, PAGE_SIZE, 0);
 }
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
@@ -439,13 +439,13 @@ void __init setup_bootmem_allocator(void
 	 * bootmem allocator with an invalid RAM area.
 	 */
 	reserve_bootmem(__pa_symbol(_text), (PFN_PHYS(min_low_pfn) +
-			 bootmap_size + PAGE_SIZE-1) - __pa_symbol(_text));
+			 bootmap_size + PAGE_SIZE-1) - __pa_symbol(_text), 0);
 
 	/*
 	 * reserve physical page 0 - it's a special BIOS page on many boxes,
 	 * enabling clean reboots, SMP operation, laptop functions.
 	 */
-	reserve_bootmem(0, PAGE_SIZE);
+	reserve_bootmem(0, PAGE_SIZE, 0);
 
 	/* reserve EBDA region, it's a 4K region */
 	reserve_ebda_region();
@@ -455,7 +455,7 @@ void __init setup_bootmem_allocator(void
 	   unless you have no PS/2 mouse plugged in. */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
 	    boot_cpu_data.x86 == 6)
-		reserve_bootmem(0xa0000 - 4096, 4096);
+		reserve_bootmem(0xa0000 - 4096, 4096, 0);
 
 #ifdef CONFIG_SMP
 	/*
@@ -463,7 +463,7 @@ void __init setup_bootmem_allocator(void
 	 * FIXME: Don't need the extra page at 4K, but need to fix
 	 * trampoline before removing it. (see the GDT stuff)
 	 */
-	reserve_bootmem(PAGE_SIZE, PAGE_SIZE);
+	reserve_bootmem(PAGE_SIZE, PAGE_SIZE, 0);
 #endif
 #ifdef CONFIG_ACPI_SLEEP
 	/*
@@ -481,7 +481,7 @@ void __init setup_bootmem_allocator(void
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (LOADER_TYPE && INITRD_START) {
 		if (INITRD_START + INITRD_SIZE <= (max_low_pfn << PAGE_SHIFT)) {
-			reserve_bootmem(INITRD_START, INITRD_SIZE);
+			reserve_bootmem(INITRD_START, INITRD_SIZE, 0);
 			initrd_start = INITRD_START + PAGE_OFFSET;
 			initrd_end = initrd_start+INITRD_SIZE;
 		}
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -218,7 +218,7 @@ static void __init reserve_crashkernel(v
 					(unsigned long)(free_mem >> 20));
 			crashk_res.start = crash_base;
 			crashk_res.end = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size);
+			reserve_bootmem(crash_base, crash_size, 0);
 		} else
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -61,8 +61,19 @@ extern void *__alloc_bootmem_core(struct
 				  unsigned long limit);
 extern void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size);
+/*
+ * flags
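
The semantics described in this patch (tolerate a double reservation by default, fail with -EBUSY when the exclusive flag is set) can be illustrated with a toy user-space bitmap. This is only a sketch under stated assumptions: sketch_reserve(), SKETCH_DEFAULT, SKETCH_EXCLUSIVE and the fixed-size sketch_map are hypothetical names made up for illustration, and the code is not the change to mm/bootmem.c.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGES     64	/* hypothetical size of the toy page map */
#define SKETCH_DEFAULT   0	/* tolerate pages that are already reserved */
#define SKETCH_EXCLUSIVE 1	/* fail with -EBUSY if any page is reserved */

static unsigned char sketch_map[SKETCH_PAGES];	/* 1 = page reserved */

/* Reserve pages [start, start + count). Returns 0 or a negative errno. */
static int sketch_reserve(size_t start, size_t count, int flags)
{
	size_t i;

	if (start > SKETCH_PAGES || count > SKETCH_PAGES - start)
		return -EINVAL;

	/* Exclusive mode: check first, so a failure leaves the map untouched. */
	if (flags & SKETCH_EXCLUSIVE)
		for (i = start; i < start + count; i++)
			if (sketch_map[i])
				return -EBUSY;

	for (i = start; i < start + count; i++)
		sketch_map[i] = 1;
	return 0;
}
```

With this toy model, callers that pass SKETCH_DEFAULT behave as before, while a crashkernel-style caller passing SKETCH_EXCLUSIVE gets an error it can report instead of silently sharing pages.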
[patch 0/3] Protect crashkernel against BSS overlap
I observed the problem that even when you choose the default 16M as
crashkernel base address and the kernel is very big, the reserved area may
overlap with the kernel BSS. Currently, this is not checked at runtime, so
the kernel just crashes when you load the panic kernel in the sys_kexec
call.

These three patches check this at runtime. The patches are against current
git, but with the patches

    extended-crashkernel-command-line.patch
    extended-crashkernel-command-line-update.patch
    extended-crashkernel-command-line-comment-fix.patch
    extended-crashkernel-command-line-improve-error-handling-in-parse_crashkernel_mem.patch
    use-extended-crashkernel-command-line-on-i386.patch
    use-extended-crashkernel-command-line-on-i386-update.patch
    use-extended-crashkernel-command-line-on-x86_64.patch
    use-extended-crashkernel-command-line-on-x86_64-update.patch
    use-extended-crashkernel-command-line-on-ia64.patch
    use-extended-crashkernel-command-line-on-ia64-fix.patch
    use-extended-crashkernel-command-line-on-ia64-update.patch
    use-extended-crashkernel-command-line-on-ppc64.patch
    use-extended-crashkernel-command-line-on-ppc64-update.patch
    use-extended-crashkernel-command-line-on-sh.patch
    use-extended-crashkernel-command-line-on-sh-update.patch

from the -mm tree applied, since they are marked to be merged in 2.6.24.

I know that the implementation of both patches is only x86 (i386 and
x86-64), but if you agree that it's the way to go, I'll modify the patch
for all architectures.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
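
To make the failure case concrete, the overlap between a requested crashkernel region and the kernel BSS can be expressed as a plain interval test. The following is a minimal sketch, not code from the series: struct sketch_range and sketch_overlaps() are hypothetical names, and the addresses used with it are made-up example values.

```c
#include <stdbool.h>

/* A physical address range with inclusive bounds, like struct resource. */
struct sketch_range {
	unsigned long long start;
	unsigned long long end;
};

/* Two inclusive ranges overlap if each one starts before the other ends. */
static bool sketch_overlaps(struct sketch_range a, struct sketch_range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

For example, a 64M reservation at 16M covers 16M up to 80M - 1; if a very large kernel had a BSS that ended at 17M - 1, sketch_overlaps() would report the conflict that currently only shows up as a crash when the panic kernel is loaded.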
[patch 1/3] Add BSS to resource tree
This patch adds the BSS to the resource tree just as kernel text and kernel
data are in the resource tree. The main reason behind this is to avoid
crashkernel reservation in that area.

While it's not strictly necessary to have the BSS in the resource tree (the
actual collision detection is already done earlier, in the reserve_bootmem()
function), the BSS resource should be presented to the user in /proc/iomem
just as Kernel data and Kernel code are.

Note: The patch currently is only implemented for x86 and ia64 (because
efi_initialize_iomem_resources() has the same signature on i386 and ia64).

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/ia64/kernel/efi.c     | 10 ++
 arch/ia64/kernel/setup.c   |  9 -
 arch/x86/kernel/e820_32.c  | 18 +-
 arch/x86/kernel/e820_64.c  |  3 ++-
 arch/x86/kernel/efi_32.c   | 11 +++
 arch/x86/kernel/setup_32.c |  4 
 arch/x86/kernel/setup_64.c |  9 +
 include/linux/efi.h        |  3 +--
 8 files changed, 50 insertions(+), 17 deletions(-)

--- a/arch/ia64/kernel/efi.c
+++ b/arch/ia64/kernel/efi.c
@@ -41,6 +41,8 @@
 
 extern efi_status_t efi_call_phys (void *, ...);
 
+extern struct resource code_resource, data_resource, bss_resource;
+
 struct efi efi;
 EXPORT_SYMBOL(efi);
 static efi_runtime_services_t *runtime;
@@ -1089,8 +1091,7 @@ efi_memmap_init(unsigned long *s, unsign
 }
 
 void
-efi_initialize_iomem_resources(struct resource *code_resource,
-			       struct resource *data_resource)
+efi_initialize_iomem_resources(void)
 {
 	struct resource *res;
 	void *efi_map_start, *efi_map_end, *p;
@@ -1169,8 +1170,9 @@ efi_initialize_iomem_resources(struct re
 			 * kernel data so we try it repeatedly and
 			 * let the resource manager test it.
 			 */
-			insert_resource(res, code_resource);
-			insert_resource(res, data_resource);
+			insert_resource(res, &code_resource);
+			insert_resource(res, &data_resource);
+			insert_resource(res, &bss_resource);
 #ifdef CONFIG_KEXEC
 			insert_resource(res, &efi_memmap_res);
 			insert_resource(res, &boot_param_res);
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -90,6 +90,11 @@ static struct resource code_resource = {
 	.name = "Kernel code",
 	.flags = IORESOURCE_BUSY | IORESOURCE_MEM
 };
+
+static struct resource bss_resource = {
+	.name = "Kernel bss",
+	.flags = IORESOURCE_BUSY | IORESOURCE_MEM
+};
 extern char _text[], _end[], _etext[];
 
 unsigned long ia64_max_cacheline_size;
@@ -201,7 +206,9 @@ static int __init register_memory(void)
 	code_resource.end = ia64_tpa(_etext) - 1;
 	data_resource.start = ia64_tpa(_etext);
 	data_resource.end = ia64_tpa(_end) - 1;
-	efi_initialize_iomem_resources(&code_resource, &data_resource);
+	bss_resource.start = ia64_tpa(__bss_start);
+	bss_resource.end = ia64_tpa(__bss_stop) - 1;
+	efi_initialize_iomem_resources();
 
 	return 0;
 }
--- a/arch/x86/kernel/e820_32.c
+++ b/arch/x86/kernel/e820_32.c
@@ -51,6 +51,13 @@ struct resource code_resource = {
 	.flags = IORESOURCE_BUSY | IORESOURCE_MEM
 };
 
+struct resource bss_resource = {
+	.name = "Kernel bss",
+	.start = 0,
+	.end = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_MEM
+};
+
 static struct resource system_rom_resource = {
 	.name = "System ROM",
 	.start = 0xf0000,
@@ -254,7 +261,7 @@ static void __init probe_roms(void)
  * and also for regions reported as reserved by the e820.
  */
 static void __init
-legacy_init_iomem_resources(struct resource *code_resource, struct resource *data_resource)
+legacy_init_iomem_resources(void)
 {
 	int i;
 
@@ -285,8 +292,9 @@ legacy_init_iomem_resources(struct resou
 			 * so we try it repeatedly and let the resource manager
 			 * test it.
 			 */
-			request_resource(res, code_resource);
-			request_resource(res, data_resource);
+			request_resource(res, &code_resource);
+			request_resource(res, &data_resource);
+			request_resource(res, &bss_resource);
 #ifdef CONFIG_KEXEC
 			if (crashk_res.start != crashk_res.end)
 				request_resource(res, &crashk_res);
@@ -307,9 +315,9 @@ static int __init request_standard_resou
 
 	printk("Setting up standard PCI resources\n");
 	if (efi_enabled)
-		efi_initialize_iomem_resources(&code_resource, &data_resource);
+		efi_initialize_iomem_resources();
 	else
-		legacy_init_iomem_resources(&code_resource, &data_resource);
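
Once the BSS is registered, it shows up in /proc/iomem as a "Kernel bss" entry next to "Kernel code" and "Kernel data". As a rough user-space illustration, here is a sketch of parsing one such line, assuming the usual "start-end : name" layout with hexadecimal addresses. sketch_parse_iomem_line() and sketch_is_kernel_bss() are hypothetical helpers invented for this example, not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/iomem line of the form "  <start>-<end> : <name>"
 * (hexadecimal addresses, leading indentation ignored).
 * Returns 0 on success, -1 if the line does not match.
 * name must point to a buffer of at least 64 bytes.
 */
static int sketch_parse_iomem_line(const char *line,
				   unsigned long long *start,
				   unsigned long long *end,
				   char *name)
{
	if (sscanf(line, " %llx-%llx : %63[^\n]", start, end, name) != 3)
		return -1;
	return 0;
}

/* True if the parsed entry is the BSS resource named by this patch. */
static int sketch_is_kernel_bss(const char *name)
{
	return strcmp(name, "Kernel bss") == 0;
}
```

A tool could feed each line read from /proc/iomem through these helpers to find the BSS range and compare it against a planned crashkernel base address.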
Re: [patch 0/2] Protect crashkernel against BSS overlap
* Vivek Goyal <[EMAIL PROTECTED]> [2007-10-16 07:49]:
>
> Shouldn't bootmem allocator have the functionality to flag an error if
> we try to reserve a memory which is already reserved? I see that bootmem
> allocator is currently printing a warning under CONFIG_DEBUG_BOOTMEM.

That's probably better, yes. See the next version.

> Wouldn't it be better if we reserve the code, data and bss memory also
> using bootmem allocator and when somebody tries to reserve crashkernel
> memory and if there is an overlap, boot memory allocator should scream?

It's already marked as reserved. At least on i386 in my test.

> In second patch, you are checking for crash kernel reserved memory being
> beyond _end. That will make sure that there is no overlap with kernel
> text, data or bss. I am wondering then why do we need first patch and
> why should we register bss memory in the resources list. Second patch
> would make sure that there is no overlap with crash kernel memory and kexec
> will not place any segment outside crashkernel memory.

I think we should also present the BSS to the user like we present text
and data.

Thanks,
   Bernhard
Re: [patch 0/2] Protect crashkernel against BSS overlap
* Andi Kleen <[EMAIL PROTECTED]> [2007-10-16 11:59]:
> Vivek Goyal <[EMAIL PROTECTED]> writes:
>
> > Wouldn't it be better if we reserve the code, data and bss memory also
> > using bootmem allocator and when somebody tries to reserve crashkernel
> > memory and if there is an overlap, boot memory allocator should scream?
>
> Some x86 bootmem code right now relies on it not screaming (or at least not
> erroring out). That would need to be fixed first. Or you make it a flag.
> Would probably make sense, except that we already have too many bootmem
> allocation variants :/

Ok, I made a flag, see the next version of the patch.

Thanks,
   Bernhard
Re: [patch 2/3] Introduce BOOTMEM_EXCLUSIVE
Hi,

* Dave Hansen <[EMAIL PROTECTED]> [2007-10-16 20:08]:
> On Tue, 2007-10-16 at 18:28 +0200, Bernhard Walle wrote:
> > @@ -736,7 +736,7 @@ static int __init smp_scan_config (unsig
> >  			smp_found_config = 1;
> >  			printk(KERN_INFO "found SMP MP-table at %08lx\n",
> >  						virt_to_phys(mpf));
> > -			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE);
> > +			reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE, 0);
> >  			if (mpf->mpf_physptr) {
> >  				/*
>
> Could you give all of these 0's a name? I really hate seeing random
> magic numbers in these things. 0 completely kills the ability of
> someone to read the code and figure out what it is trying to do without
> going and looking at reserve_bootmem().

Of course I can replace the zeroes with something like BOOTMEM_DEFAULT.

> Or, alternatively, do something like this:
>
> -extern void reserve_bootmem(unsigned long addr, unsigned long size);

Andi was against more bootmem functions. ;)

Thanks,
   Bernhard
Re: [patch 1/2] Add BSS to resource tree
* Andrew Morton <[EMAIL PROTECTED]> [2007-10-15 20:32]:
> On Mon, 15 Oct 2007 13:50:43 +0200
> Bernhard Walle <[EMAIL PROTECTED]> wrote:
>
> > --- a/arch/x86/kernel/e820_32.c
> > +++ b/arch/x86/kernel/e820_32.c
> > @@ -51,6 +51,13 @@ struct resource code_resource = {
> >  	.flags = IORESOURCE_BUSY | IORESOURCE_MEM
> >  };
> >
> > +struct resource bss_resource = {
> > +	.name = "Kernel bss",
> > +	.start = 0,
> > +	.end = 0,
> > +	.flags = IORESOURCE_BUSY | IORESOURCE_MEM
> > +};
> > +
> >  static struct resource system_rom_resource = {
> >  	.name = "System ROM",
> >  	.start = 0xf0000,
> > @@ -287,6 +294,7 @@ legacy_init_iomem_resources(struct resou
> >  			 */
> >  			request_resource(res, code_resource);
> >  			request_resource(res, data_resource);
> > +			request_resource(res, &bss_resource);
>
> Looks ungainly, doesn't it? Perhaps we should add a third arg to
> legacy_init_iomem_resources(), or change legacy_init_iomem_resources() to
> take zero args?

Yes. But when we change legacy_init_iomem_resources(), then we should also
change efi_initialize_iomem_resources(). But that's declared in
<linux/efi.h>, so a change in ia64 code is required, which I wanted to
avoid.

But that patch is for review of the idea. If nobody has objections, then
I'll implement the IA64 change anyway -- and then the 3rd parameter can be
added.

Thanks,
   Bernhard
[patch 1/2] Add BSS to resource tree
This patch adds the BSS to the resource tree just as kernel text and kernel
data are in the resource tree. The main reason behind this is to avoid
crashkernel reservation in that area.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/e820_32.c  |    8 ++++++++
 arch/x86/kernel/e820_64.c  |    3 ++-
 arch/x86/kernel/efi_32.c   |    3 +++
 arch/x86/kernel/setup_32.c |    4 ++++
 arch/x86/kernel/setup_64.c |    9 +++++++++
 5 files changed, 26 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/e820_32.c
+++ b/arch/x86/kernel/e820_32.c
@@ -51,6 +51,13 @@ struct resource code_resource = {
 	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
 };

+struct resource bss_resource = {
+	.name	= "Kernel bss",
+	.start	= 0,
+	.end	= 0,
+	.flags	= IORESOURCE_BUSY | IORESOURCE_MEM
+};
+
 static struct resource system_rom_resource = {
 	.name	= "System ROM",
 	.start	= 0xf,
@@ -287,6 +294,7 @@ legacy_init_iomem_resources(struct resou
 	 */
 	request_resource(res, &code_resource);
 	request_resource(res, &data_resource);
+	request_resource(res, &bss_resource);
 #ifdef CONFIG_KEXEC
 	if (crashk_res.start != crashk_res.end)
 		request_resource(res, &crashk_res);
--- a/arch/x86/kernel/e820_64.c
+++ b/arch/x86/kernel/e820_64.c
@@ -47,7 +47,7 @@ unsigned long end_pfn_map;
  */
 static unsigned long __initdata end_user_pfn = MAXMEM>>PAGE_SHIFT;

-extern struct resource code_resource, data_resource;
+extern struct resource code_resource, data_resource, bss_resource;

 /* Check for some hardcoded bad areas that early boot is not allowed to touch */
 static inline int bad_addr(unsigned long *addrp, unsigned long size)
@@ -220,6 +220,7 @@ void __init e820_reserve_resources(void)
 	 */
 	request_resource(res, &code_resource);
 	request_resource(res, &data_resource);
+	request_resource(res, &bss_resource);
 #ifdef CONFIG_KEXEC
 	if (crashk_res.start != crashk_res.end)
 		request_resource(res, &crashk_res);
--- a/arch/x86/kernel/efi_32.c
+++ b/arch/x86/kernel/efi_32.c
@@ -49,6 +49,8 @@ EXPORT_SYMBOL(efi);
 static struct efi efi_phys;
 struct efi_memory_map memmap;

+extern struct resource iomem_resource;
+
 /*
  * We require an early boot_ioremap mapping mechanism initially
  */
@@ -672,6 +674,7 @@ efi_initialize_iomem_resources(struct re
 		if (md->type == EFI_CONVENTIONAL_MEMORY) {
 			request_resource(res, &code_resource);
 			request_resource(res, &data_resource);
+			request_resource(res, &bss_resource);
 #ifdef CONFIG_KEXEC
 			request_resource(res, &crashk_res);
 #endif
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -60,6 +60,7 @@
 #include <asm/vmi.h>
 #include <setup_arch.h>
 #include <bios_ebda.h>
+#include <asm/cacheflush.h>

 /* This value is set up by the early boot code to point to the value
    immediately after the boot time page tables. It contains a *physical*
@@ -73,6 +74,7 @@ int disable_pse __devinitdata = 0;
  */
 extern struct resource code_resource;
 extern struct resource data_resource;
+extern struct resource bss_resource;

 /* cpu data as detected by the assembly code in head.S */
 struct cpuinfo_x86 new_cpu_data __cpuinitdata = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
@@ -595,6 +597,8 @@ void __init setup_arch(char **cmdline_p)
 	code_resource.end = virt_to_phys(_etext)-1;
 	data_resource.start = virt_to_phys(_etext);
 	data_resource.end = virt_to_phys(_edata)-1;
+	bss_resource.start = virt_to_phys(&__bss_start);
+	bss_resource.end = virt_to_phys(&__bss_stop)-1;

 	parse_early_param();

--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -59,6 +59,7 @@
 #include <asm/numa.h>
 #include <asm/sections.h>
 #include <asm/dmi.h>
+#include <asm/cacheflush.h>

 /*
  * Machine setup..
@@ -134,6 +135,12 @@ struct resource code_resource = {
 	.end = 0,
 	.flags = IORESOURCE_RAM,
 };
+struct resource bss_resource = {
+	.name = "Kernel bss",
+	.start = 0,
+	.end = 0,
+	.flags = IORESOURCE_RAM,
+};

 #ifdef CONFIG_PROC_VMCORE
 /* elfcorehdr= specifies the location of elf core header
@@ -276,6 +283,8 @@ void __init setup_arch(char **cmdline_p)
 	code_resource.end = virt_to_phys(&_etext)-1;
 	data_resource.start = virt_to_phys(&_etext);
 	data_resource.end = virt_to_phys(&_edata)-1;
+	bss_resource.start = virt_to_phys(&__bss_start);
+	bss_resource.end = virt_to_phys(&__bss_stop)-1;

 	early_identify_cpu(&boot_cpu_data);
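To illustrate why registering the BSS as a busy resource matters, here is a minimal user-space sketch of a flat resource list that refuses overlapping requests. It is only a simplified model and not the kernel's implementation: `toy_resource`, `toy_request()`, `toy_demo()` and all addresses are hypothetical, and the real `request_resource()` operates on a tree and returns `-EBUSY` on conflict.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct toy_resource {
	const char *name;
	unsigned long long start;	/* first byte of the range */
	unsigned long long end;		/* last byte of the range (inclusive) */
};

#define TOY_MAX 8

static const struct toy_resource *toy_list[TOY_MAX];
static size_t toy_count;

/* Returns 0 on success, -1 if the list is full or the range overlaps a busy one. */
int toy_request(const struct toy_resource *res)
{
	size_t i;

	if (toy_count == TOY_MAX)
		return -1;

	for (i = 0; i < toy_count; i++) {
		const struct toy_resource *busy = toy_list[i];

		/* Two inclusive ranges overlap if each starts before the other ends. */
		if (res->start <= busy->end && busy->start <= res->end) {
			printf("\"%s\" conflicts with \"%s\"\n",
			       res->name, busy->name);
			return -1;
		}
	}

	toy_list[toy_count++] = res;
	return 0;
}

/* Returns 0 if the BSS is accepted and the overlapping crash region is refused. */
int toy_demo(void)
{
	/* Made-up addresses: the BSS ends at 17MB, the crash region starts at 16MB. */
	static const struct toy_resource bss = {
		"Kernel bss", 0x00a00000ULL, 0x010fffffULL
	};
	static const struct toy_resource crash = {
		"Crash kernel", 0x01000000ULL, 0x04ffffffULL
	};

	toy_count = 0;
	if (toy_request(&bss) != 0)
		return -1;
	/* The crash region overlaps the BSS, so this request must be refused. */
	if (toy_request(&crash) == 0)
		return -1;
	assert(toy_count == 1);
	return 0;
}
```

Calling `toy_demo()` from a `main()` shows the conflict message for the made-up "Crash kernel" range. In this model, as in the patch, the busy BSS entry is what makes the overlapping request fail instead of silently succeeding.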
[patch 2/2] Check if the crashkernel area is behind BSS
This patch checks if the crashkernel base address is after the end of BSS on
i386 and x86_64. Having "Kernel bss" in the resource tree is not sufficient:
it only prevents "crash kernel" from appearing in the resource tree, and
therefore keeps kexec from loading the crashdump kernel, since kexec checks
/proc/iomem. However, the crashkernel memory would still be reserved.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86/kernel/setup_32.c |   31 +++++++++++++++++++++----------
 arch/x86/kernel/setup_64.c |   31 +++++++++++++++++++++----------
 2 files changed, 42 insertions(+), 20 deletions(-)

--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -403,18 +403,29 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, total_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(total_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (crash_base < virt_to_phys(&_end)) {
+			printk(KERN_WARNING "base address for crashkernel "
+				"(%lluMB) is too low -- 0M-%luMB area "
+				"is needed by the kernel\n",
+				crash_base >> 20, virt_to_phys(&_end) >> 20);
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(unsigned long)(total_mem >> 20));
+
+		crashk_res.start = crash_base;
+		crashk_res.end = crash_base + crash_size - 1;
+		reserve_bootmem(crash_base, crash_size);
 	}
 }
 #else
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -210,18 +210,29 @@ static void __init reserve_crashkernel(v
 	ret = parse_crashkernel(boot_command_line, free_mem,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size) {
-		if (crash_base > 0) {
-			printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-					"for crashkernel (System RAM: %ldMB)\n",
-					(unsigned long)(crash_size >> 20),
-					(unsigned long)(crash_base >> 20),
-					(unsigned long)(free_mem >> 20));
-			crashk_res.start = crash_base;
-			crashk_res.end = crash_base + crash_size - 1;
-			reserve_bootmem(crash_base, crash_size);
-		} else
+		if (crash_base <= 0) {
 			printk(KERN_INFO "crashkernel reservation failed - "
 					"you have to specify a base address\n");
+			return;
+		}
+
+		if (crash_base < virt_to_phys(&_end)) {
+			printk(KERN_WARNING "base address for crashkernel "
+				"(%lluMB) is too low -- 0M-%luMB area "
+				"is needed by the kernel\n",
+				crash_base >> 20,
+				virt_to_phys(&_end) >> 20);
+			return;
+		}
+
+		printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
+				"for crashkernel (System RAM: %ldMB)\n",
+				(unsigned long)(crash_size >> 20),
+				(unsigned long)(crash_base >> 20),
+				(u
[patch 0/2] Protect crashkernel against BSS overlap
I observed the problem that even when you choose the default 16M as
crashkernel base address and the kernel is very big, the reserved area may
overlap with the kernel BSS. Currently, this is not checked at runtime, so the
kernel just crashes when you load the panic kernel in the sys_kexec call.
These two patches check this at runtime.

The patches are against current git, but with the patches

  extended-crashkernel-command-line.patch
  extended-crashkernel-command-line-update.patch
  extended-crashkernel-command-line-comment-fix.patch
  extended-crashkernel-command-line-improve-error-handling-in-parse_crashkernel_mem.patch
  use-extended-crashkernel-command-line-on-i386.patch
  use-extended-crashkernel-command-line-on-i386-update.patch
  use-extended-crashkernel-command-line-on-x86_64.patch
  use-extended-crashkernel-command-line-on-x86_64-update.patch
  use-extended-crashkernel-command-line-on-ia64.patch
  use-extended-crashkernel-command-line-on-ia64-fix.patch
  use-extended-crashkernel-command-line-on-ia64-update.patch
  use-extended-crashkernel-command-line-on-ppc64.patch
  use-extended-crashkernel-command-line-on-ppc64-update.patch
  use-extended-crashkernel-command-line-on-sh.patch
  use-extended-crashkernel-command-line-on-sh-update.patch

from the -mm tree applied, since they are marked to be merged in 2.6.24.

I know that both patches are implemented for x86 only (i386 and x86-64), but
if you agree that it's the way to go, I can add the BSS resource and the check
for all other architectures that apply.

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>
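The runtime check described above can be sketched in plain user-space C. This is a hedged illustration under stated assumptions, not the kernel code: `check_crash_base()`, `bytes_to_mb()` and `check_crash_base_examples()` are hypothetical names, `kernel_end` stands in for `virt_to_phys(&_end)`, and the sizes in the example are made up.

```c
#include <assert.h>
#include <stdio.h>

/* Whole megabytes in a byte count, mirroring the ">> 20" used in the patches. */
unsigned long long bytes_to_mb(unsigned long long bytes)
{
	return bytes >> 20;
}

/*
 * Hypothetical helper: returns 0 if a crashkernel region starting at
 * crash_base may be reserved, -1 otherwise. kernel_end is an assumption
 * standing in for virt_to_phys(&_end), i.e. the first physical byte after
 * the kernel image including its BSS.
 */
int check_crash_base(unsigned long long crash_base,
		     unsigned long long crash_size,
		     unsigned long long kernel_end)
{
	if (crash_size == 0)
		return -1;	/* nothing was requested */

	if (crash_base == 0) {
		printf("crashkernel reservation failed - "
		       "you have to specify a base address\n");
		return -1;
	}

	if (crash_base < kernel_end) {
		printf("base address for crashkernel (%lluMB) is too low -- "
		       "0M-%lluMB area is needed by the kernel\n",
		       bytes_to_mb(crash_base), bytes_to_mb(kernel_end));
		return -1;
	}

	printf("Reserving %lluMB of memory at %lluMB for crashkernel\n",
	       bytes_to_mb(crash_size), bytes_to_mb(crash_base));
	return 0;
}

/* Example: a kernel image ending at 20MB makes a 16MB base unusable. */
void check_crash_base_examples(void)
{
	const unsigned long long mb = 1ULL << 20;

	assert(check_crash_base(16 * mb, 64 * mb, 20 * mb) == -1);
	assert(check_crash_base(32 * mb, 64 * mb, 20 * mb) == 0);
}
```

Calling `check_crash_base_examples()` from a `main()` exercises both outcomes. Comparing only the base address against the kernel end is enough in this sketch because the kernel is assumed to sit below the crash region, as in the 16M scenario described above.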