Re: [PATCH] ieee802154: mrf24j40: fix incorrect mask in mrf24j40_stop
On 11/06/2017 06:13 AM, Stefan Schmidt wrote: Hello Alan. On 10/31/2017 07:31 AM, Gustavo A. R. Silva wrote: It seems that this is a copy/paste error and the proper bit masking is: BIT_TXNIE | BIT_RXIE This issue was detected with the help of Coccinelle. Reported-by: Julia Lawall <julia.law...@lip6.fr> Signed-off-by: Gustavo A. R. Silva <garsi...@embeddedor.com> Fixes: 7d840545e5b9 ("mrf24j40: replace magic numbers") --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index ee7084b..cf4788d 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -635,7 +635,7 @@ static void mrf24j40_stop(struct ieee802154_hw *hw) /* Set TXNIE and RXIE. Disable Interrupts */ regmap_update_bits(devrec->regmap_short, REG_INTCON, - BIT_TXNIE | BIT_TXNIE, BIT_TXNIE | BIT_TXNIE); + BIT_TXNIE | BIT_RXIE, BIT_TXNIE | BIT_RXIE); } static int mrf24j40_set_channel(struct ieee802154_hw *hw, u8 page, u8 channel) Could you review this and give me your ack when you are happy so I can apply it to my tree? Acked-by: Alan Ott <a...@signal11.us>
Re: [PATCH] ieee802154: mrf24j40: fix incorrect mask in mrf24j40_stop
On 11/06/2017 06:13 AM, Stefan Schmidt wrote: Hello Alan. On 10/31/2017 07:31 AM, Gustavo A. R. Silva wrote: It seems that this is a copy/paste error and the proper bit masking is: BIT_TXNIE | BIT_RXIE This issue was detected with the help of Coccinelle. Reported-by: Julia Lawall Signed-off-by: Gustavo A. R. Silva Fixes: 7d840545e5b9 ("mrf24j40: replace magic numbers") --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index ee7084b..cf4788d 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -635,7 +635,7 @@ static void mrf24j40_stop(struct ieee802154_hw *hw) /* Set TXNIE and RXIE. Disable Interrupts */ regmap_update_bits(devrec->regmap_short, REG_INTCON, - BIT_TXNIE | BIT_TXNIE, BIT_TXNIE | BIT_TXNIE); + BIT_TXNIE | BIT_RXIE, BIT_TXNIE | BIT_RXIE); } static int mrf24j40_set_channel(struct ieee802154_hw *hw, u8 page, u8 channel) Could you review this and give me your ack when you are happy so I can apply it to my tree? Acked-by: Alan Ott
Re: [PATCH v2] mrf24j40: Fix en error handling path in 'mrf24j40_probe()'
On 07/08/2017 04:40 AM, Christophe JAILLET wrote: If this check fails, we must release some resources as done everywhere else in this function before returning an error code. Signed-off-by: Christophe JAILLET <christophe.jail...@wanadoo.fr> --- V2: initialization of ret in this erro path ws missing. Stupid me! --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 7d334963dc08..da8683782ffc 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -1330,7 +1330,8 @@ static int mrf24j40_probe(struct spi_device *spi) if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) { dev_warn(>dev, "spi clock above possible maximum: %d", MAX_SPI_SPEED_HZ); - return -EINVAL; + ret = -EINVAL; + goto err_register_device; } ret = mrf24j40_hw_init(devrec); Acked-by: Alan Ott <a...@signal11.us>
Re: [PATCH v2] mrf24j40: Fix en error handling path in 'mrf24j40_probe()'
On 07/08/2017 04:40 AM, Christophe JAILLET wrote: If this check fails, we must release some resources as done everywhere else in this function before returning an error code. Signed-off-by: Christophe JAILLET --- V2: initialization of ret in this erro path ws missing. Stupid me! --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 7d334963dc08..da8683782ffc 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -1330,7 +1330,8 @@ static int mrf24j40_probe(struct spi_device *spi) if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) { dev_warn(>dev, "spi clock above possible maximum: %d", MAX_SPI_SPEED_HZ); - return -EINVAL; + ret = -EINVAL; + goto err_register_device; } ret = mrf24j40_hw_init(devrec); Acked-by: Alan Ott
[PATCH v2] doc: efi-stub.txt: Fix arm64 paths
Update documented paths for arm64 files to match current tree. Signed-off-by: Alan Ott --- Documentation/efi-stub.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt index 7747024..e157469 100644 --- a/Documentation/efi-stub.txt +++ b/Documentation/efi-stub.txt @@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c, respectively. For ARM the EFI stub is implemented in arch/arm/boot/compressed/efi-header.S and arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared -between architectures is in drivers/firmware/efi/efi-stub-helper.c. +between architectures is in drivers/firmware/efi/libstub. For arm64, there is no compressed kernel support, so the Image itself masquerades as a PE/COFF image and the EFI stub is linked into the kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S -and arch/arm64/kernel/efi-stub.c. +and drivers/firmware/efi/libstub/arm64-stub.c. By using the EFI boot stub it's possible to boot a Linux kernel without the use of a conventional EFI boot loader, such as grub or -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] doc: efi-stub.txt: Fix arm64 paths
Update documented paths for arm64 files to match current tree. Signed-off-by: Alan Ott <a...@softiron.co.uk> --- Documentation/efi-stub.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt index 7747024..e157469 100644 --- a/Documentation/efi-stub.txt +++ b/Documentation/efi-stub.txt @@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c, respectively. For ARM the EFI stub is implemented in arch/arm/boot/compressed/efi-header.S and arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared -between architectures is in drivers/firmware/efi/efi-stub-helper.c. +between architectures is in drivers/firmware/efi/libstub. For arm64, there is no compressed kernel support, so the Image itself masquerades as a PE/COFF image and the EFI stub is linked into the kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S -and arch/arm64/kernel/efi-stub.c. +and drivers/firmware/efi/libstub/arm64-stub.c. By using the EFI boot stub it's possible to boot a Linux kernel without the use of a conventional EFI boot loader, such as grub or -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] doc: efi-stub.txt: Fix arm64 paths
On 11/28/2015 11:03 AM, Alan Ott wrote: Update documented paths for arm64 files to match current tree. --- Documentation/efi-stub.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt index 7747024..e157469 100644 --- a/Documentation/efi-stub.txt +++ b/Documentation/efi-stub.txt @@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c, respectively. For ARM the EFI stub is implemented in arch/arm/boot/compressed/efi-header.S and arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared -between architectures is in drivers/firmware/efi/efi-stub-helper.c. +between architectures is in drivers/firmware/efi/libstub. For arm64, there is no compressed kernel support, so the Image itself masquerades as a PE/COFF image and the EFI stub is linked into the kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S -and arch/arm64/kernel/efi-stub.c. +and drivers/firmware/efi/libstub/arm64-stub.c. By using the EFI boot stub it's possible to boot a Linux kernel without the use of a conventional EFI boot loader, such as grub or To follow up on this one, it additionally appears that commit 719e284369d23c3feb02d02ad58b3a516df47806 [1] was premature, as ARM support for EFI never fully made it into mainline, and the files referenced[2] do not exist anywhere in the history. Alan. [1] https://lkml.org/lkml/2014/4/4/401 [2] arch/arm/boot/compressed/efi-header.S, and arch/arm/boot/compressed/efi-stub.c -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] doc: efi-stub.txt: Fix arm64 paths
Update documented paths for arm64 files to match current tree. --- Documentation/efi-stub.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt index 7747024..e157469 100644 --- a/Documentation/efi-stub.txt +++ b/Documentation/efi-stub.txt @@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c, respectively. For ARM the EFI stub is implemented in arch/arm/boot/compressed/efi-header.S and arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared -between architectures is in drivers/firmware/efi/efi-stub-helper.c. +between architectures is in drivers/firmware/efi/libstub. For arm64, there is no compressed kernel support, so the Image itself masquerades as a PE/COFF image and the EFI stub is linked into the kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S -and arch/arm64/kernel/efi-stub.c. +and drivers/firmware/efi/libstub/arm64-stub.c. By using the EFI boot stub it's possible to boot a Linux kernel without the use of a conventional EFI boot loader, such as grub or -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] doc: efi-stub.txt: Fix arm64 paths
Update documented paths for arm64 files to match current tree. --- Documentation/efi-stub.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt index 7747024..e157469 100644 --- a/Documentation/efi-stub.txt +++ b/Documentation/efi-stub.txt @@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c, respectively. For ARM the EFI stub is implemented in arch/arm/boot/compressed/efi-header.S and arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared -between architectures is in drivers/firmware/efi/efi-stub-helper.c. +between architectures is in drivers/firmware/efi/libstub. For arm64, there is no compressed kernel support, so the Image itself masquerades as a PE/COFF image and the EFI stub is linked into the kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S -and arch/arm64/kernel/efi-stub.c. +and drivers/firmware/efi/libstub/arm64-stub.c. By using the EFI boot stub it's possible to boot a Linux kernel without the use of a conventional EFI boot loader, such as grub or -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] doc: efi-stub.txt: Fix arm64 paths
On 11/28/2015 11:03 AM, Alan Ott wrote: Update documented paths for arm64 files to match current tree. --- Documentation/efi-stub.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt index 7747024..e157469 100644 --- a/Documentation/efi-stub.txt +++ b/Documentation/efi-stub.txt @@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c, respectively. For ARM the EFI stub is implemented in arch/arm/boot/compressed/efi-header.S and arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared -between architectures is in drivers/firmware/efi/efi-stub-helper.c. +between architectures is in drivers/firmware/efi/libstub. For arm64, there is no compressed kernel support, so the Image itself masquerades as a PE/COFF image and the EFI stub is linked into the kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S -and arch/arm64/kernel/efi-stub.c. +and drivers/firmware/efi/libstub/arm64-stub.c. By using the EFI boot stub it's possible to boot a Linux kernel without the use of a conventional EFI boot loader, such as grub or To follow up on this one, it additionally appears that commit 719e284369d23c3feb02d02ad58b3a516df47806 [1] was premature, as ARM support for EFI never fully made it into mainline, and the files referenced[2] do not exist anywhere in the history. Alan. [1] https://lkml.org/lkml/2014/4/4/401 [2] arch/arm/boot/compressed/efi-header.S, and arch/arm/boot/compressed/efi-stub.c -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
insertions(+), 3 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index f2efe8a..50793cd 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -76,6 +76,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, struct bio_vec bv, bvprv, *bvprvp = NULL; struct bvec_iter iter; unsigned seg_size = 0, nsegs = 0, sectors = 0; + unsigned front_seg_size = bio->bi_seg_front_size; + bool do_split = true; + struct bio *new = NULL; bio_for_each_segment(bv, bio, iter) { if (sectors + (bv.bv_len >> 9) > queue_max_sectors(q)) @@ -111,13 +114,26 @@ new_segment: bvprvp = seg_size = bv.bv_len; sectors += bv.bv_len >> 9; + + if (nsegs == 1 && seg_size > front_seg_size) + front_seg_size = seg_size; } - *segs = nsegs; - return NULL; + do_split = false; split: *segs = nsegs; - return bio_split(bio, sectors, GFP_NOIO, bs); + + if (do_split) { + new = bio_split(bio, sectors, GFP_NOIO, bs); + if (new) + bio = new; + } + + bio->bi_seg_front_size = front_seg_size; + if (seg_size > bio->bi_seg_back_size) + bio->bi_seg_back_size = seg_size; + + return do_split ? new : NULL; } void blk_queue_split(struct request_queue *q, struct bio **bio, I applied both your patches and tested on Overdrive 3000. This fixes the issue for me. Added linux-arm-kernel, since arm64 triggers this issue. Thanks Ming and testers for your hard work on this. Tested-by: Alan Ott -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!
-by: Michael Ellerman <m...@ellerman.id.au> Reported-by: Mark Salter <msal...@redhat.com> Fixes: bdced438acd83a(block: setup bi_phys_segments after splitting) Signed-off-by: Ming Lei <ming@canonical.com> --- block/blk-merge.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index f2efe8a..50793cd 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -76,6 +76,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, struct bio_vec bv, bvprv, *bvprvp = NULL; struct bvec_iter iter; unsigned seg_size = 0, nsegs = 0, sectors = 0; + unsigned front_seg_size = bio->bi_seg_front_size; + bool do_split = true; + struct bio *new = NULL; bio_for_each_segment(bv, bio, iter) { if (sectors + (bv.bv_len >> 9) > queue_max_sectors(q)) @@ -111,13 +114,26 @@ new_segment: bvprvp = seg_size = bv.bv_len; sectors += bv.bv_len >> 9; + + if (nsegs == 1 && seg_size > front_seg_size) + front_seg_size = seg_size; } - *segs = nsegs; - return NULL; + do_split = false; split: *segs = nsegs; - return bio_split(bio, sectors, GFP_NOIO, bs); + + if (do_split) { + new = bio_split(bio, sectors, GFP_NOIO, bs); + if (new) + bio = new; + } + + bio->bi_seg_front_size = front_seg_size; + if (seg_size > bio->bi_seg_back_size) + bio->bi_seg_back_size = seg_size; + + return do_split ? new : NULL; } void blk_queue_split(struct request_queue *q, struct bio **bio, I applied both your patches and tested on Overdrive 3000. This fixes the issue for me. Added linux-arm-kernel, since arm64 triggers this issue. Thanks Ming and testers for your hard work on this. Tested-by: Alan Ott <a...@softiron.co.uk> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/6] net: ieee802154: Remove redundant spi driver bus initialization
On 06/23/2015 10:52 AM, Antonio Borneo wrote: In ancient times it was necessary to manually initialize the bus field of an spi_driver to spi_bus_type. These days this is done in spi_register_driver(), so we can drop the manual assignment. Signed-off-by: Antonio Borneo To: Alan Ott To: Alexander Aring To: Varka Bhadram To: linux-w...@vger.kernel.org To: net...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- drivers/net/ieee802154/cc2520.c | 1 - drivers/net/ieee802154/mrf24j40.c | 1 - 2 files changed, 2 deletions(-) diff --git a/drivers/net/ieee802154/cc2520.c b/drivers/net/ieee802154/cc2520.c index f833b8b..bd957f1 100644 --- a/drivers/net/ieee802154/cc2520.c +++ b/drivers/net/ieee802154/cc2520.c @@ -1046,7 +1046,6 @@ MODULE_DEVICE_TABLE(of, cc2520_of_ids); static struct spi_driver cc2520_driver = { .driver = { .name = "cc2520", - .bus = _bus_type, .owner = THIS_MODULE, .of_match_table = of_match_ptr(cc2520_of_ids), }, diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index fba2dfd..bc8fb26 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -812,7 +812,6 @@ MODULE_DEVICE_TABLE(spi, mrf24j40_ids); static struct spi_driver mrf24j40_driver = { .driver = { .name = "mrf24j40", - .bus = _bus_type, .owner = THIS_MODULE, }, .id_table = mrf24j40_ids, Acked-by: Alan Ott -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/6] net: ieee802154: Remove redundant spi driver bus initialization
On 06/23/2015 10:52 AM, Antonio Borneo wrote: In ancient times it was necessary to manually initialize the bus field of an spi_driver to spi_bus_type. These days this is done in spi_register_driver(), so we can drop the manual assignment. Signed-off-by: Antonio Borneo borneo.anto...@gmail.com To: Alan Ott a...@signal11.us To: Alexander Aring alex.ar...@gmail.com To: Varka Bhadram varkabhad...@gmail.com To: linux-w...@vger.kernel.org To: net...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- drivers/net/ieee802154/cc2520.c | 1 - drivers/net/ieee802154/mrf24j40.c | 1 - 2 files changed, 2 deletions(-) diff --git a/drivers/net/ieee802154/cc2520.c b/drivers/net/ieee802154/cc2520.c index f833b8b..bd957f1 100644 --- a/drivers/net/ieee802154/cc2520.c +++ b/drivers/net/ieee802154/cc2520.c @@ -1046,7 +1046,6 @@ MODULE_DEVICE_TABLE(of, cc2520_of_ids); static struct spi_driver cc2520_driver = { .driver = { .name = cc2520, - .bus = spi_bus_type, .owner = THIS_MODULE, .of_match_table = of_match_ptr(cc2520_of_ids), }, diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index fba2dfd..bc8fb26 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -812,7 +812,6 @@ MODULE_DEVICE_TABLE(spi, mrf24j40_ids); static struct spi_driver mrf24j40_driver = { .driver = { .name = mrf24j40, - .bus = spi_bus_type, .owner = THIS_MODULE, }, .id_table = mrf24j40_ids, Acked-by: Alan Ott a...@signal11.us -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH net-next] MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver
Alan is the original author of the driver. This change was discussed with the 802.15.4 subsystem maintainer, Alexander Aring. Signed-off-by: Alan Ott --- MAINTAINERS | 6 ++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 2f85f55..c066da4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5972,6 +5972,12 @@ T: git git://linuxtv.org/media_tree.git S: Maintained F: drivers/media/radio/radio-mr800.c +MRF24J40 IEEE 802.15.4 RADIO DRIVER +M: Alan Ott +L: linux-w...@vger.kernel.org +S: Maintained +F: drivers/net/ieee802154/mrf24j40.c + MSI LAPTOP SUPPORT M: "Lee, Chun-Yi" L: platform-driver-...@vger.kernel.org -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH net-next] MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver
Alan is the original author of the driver. This change was discussed with the 802.15.4 subsystem maintainer, Alexander Aring. Signed-off-by: Alan Ott a...@signal11.us --- MAINTAINERS | 6 ++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 2f85f55..c066da4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5972,6 +5972,12 @@ T: git git://linuxtv.org/media_tree.git S: Maintained F: drivers/media/radio/radio-mr800.c +MRF24J40 IEEE 802.15.4 RADIO DRIVER +M: Alan Ott a...@signal11.us +L: linux-w...@vger.kernel.org +S: Maintained +F: drivers/net/ieee802154/mrf24j40.c + MSI LAPTOP SUPPORT M: Lee, Chun-Yi j...@suse.com L: platform-driver-...@vger.kernel.org -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Deadlock in do_page_fault() on ARM (old kernel)
On 01/17/2014 08:20 PM, Russell King - ARM Linux wrote: On Fri, Jan 17, 2014 at 07:57:16PM -0500, Alan Ott wrote: On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote: My suspicion therefore is that some other thread must have died while holding the mmap_sem, so there's probably a kernel oops earlier... that's my best guess at the moment without seeing the full backtrace. There's no oops that I'm able to see. Each of the tasks which lockdep reports as "holding" mmap_sem are blocking for it. If some other task had taken it and then crashed, I assume lockdep would list the crashed task as also holding the resource in the printout. My point is this: - the five (or six) threads which are trying to take the mmap_sem in read-mode in the fault handler are all blocked on it - they haven't taken the lock, which will only happen because there's a pending writer. - of these in your original post, there are two which faulted from __copy_to_user_std(). __copy_to_user_std() doesn't take the mmap_sem - this is the non-uaccess-with-memcpy path. - the pending writers are the two threads in sys_mmap_pgoff(), both of which are blocked waiting to gain the write lock. - there are no *other* threads holding the mmap_sem lock. Yes, all true. I don't remember why I started looking at the memcpy() case. So... there's a question here how we got into this state - and frankly I don't know. What I do see from your latest dump is that there's two unknown modules there - something called rcu2m and another called buttoms, and there are two threads inside ioctls there. Both have faulted from the function at 0xc0d2a394 (which won't appear in the backtrace, but is most likely __copy_to_user_std.) Yes, there are a handful of out-of-tree modules. So, in the absence of you saying anything about there being any preceding oopses, my conclusion now is that one of those modules is taking the mmap_sem itself, and is the culpret inducing this deadlock. Yes, I came to that as well. I had checked for the presence of mmap_sem in the sources of the out-of-tree modules and didn't see it. However, upon closer inspection, my grep-fu failed me as there were some backward symlinks I didn't account for. TI's cmemk module _is_ taking out mmap_sem. I wish I had seen this days ago. That's my new investigation path. Note that your dump ([2]) in your reply was just the hung task detector printing out the stacktrace for a few tasks, not the full all-threads stack dump which I was expecting. Yes, in a misguided attempt to keep the SNR high, I didn't include the full dump, but only what I thought was the interesting part. I did another capture and the full dump is at [1] . So I'm pulling out these conclusions from the very little information you're supplying. I appreciate it. Thank you for taking the time to reply. Alan. [1] http://www.signal11.us/~alan/stack_dump_all_tasks_with_frame_pointers.txt -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Deadlock in do_page_fault() on ARM (old kernel)
On 01/17/2014 08:20 PM, Russell King - ARM Linux wrote: On Fri, Jan 17, 2014 at 07:57:16PM -0500, Alan Ott wrote: On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote: My suspicion therefore is that some other thread must have died while holding the mmap_sem, so there's probably a kernel oops earlier... that's my best guess at the moment without seeing the full backtrace. There's no oops that I'm able to see. Each of the tasks which lockdep reports as holding mmap_sem are blocking for it. If some other task had taken it and then crashed, I assume lockdep would list the crashed task as also holding the resource in the printout. My point is this: - the five (or six) threads which are trying to take the mmap_sem in read-mode in the fault handler are all blocked on it - they haven't taken the lock, which will only happen because there's a pending writer. - of these in your original post, there are two which faulted from __copy_to_user_std(). __copy_to_user_std() doesn't take the mmap_sem - this is the non-uaccess-with-memcpy path. - the pending writers are the two threads in sys_mmap_pgoff(), both of which are blocked waiting to gain the write lock. - there are no *other* threads holding the mmap_sem lock. Yes, all true. I don't remember why I started looking at the memcpy() case. So... there's a question here how we got into this state - and frankly I don't know. What I do see from your latest dump is that there's two unknown modules there - something called rcu2m and another called buttoms, and there are two threads inside ioctls there. Both have faulted from the function at 0xc0d2a394 (which won't appear in the backtrace, but is most likely __copy_to_user_std.) Yes, there are a handful of out-of-tree modules. So, in the absence of you saying anything about there being any preceding oopses, my conclusion now is that one of those modules is taking the mmap_sem itself, and is the culpret inducing this deadlock. Yes, I came to that as well. I had checked for the presence of mmap_sem in the sources of the out-of-tree modules and didn't see it. However, upon closer inspection, my grep-fu failed me as there were some backward symlinks I didn't account for. TI's cmemk module _is_ taking out mmap_sem. I wish I had seen this days ago. That's my new investigation path. Note that your dump ([2]) in your reply was just the hung task detector printing out the stacktrace for a few tasks, not the full all-threads stack dump which I was expecting. Yes, in a misguided attempt to keep the SNR high, I didn't include the full dump, but only what I thought was the interesting part. I did another capture and the full dump is at [1] . So I'm pulling out these conclusions from the very little information you're supplying. I appreciate it. Thank you for taking the time to reply. Alan. [1] http://www.signal11.us/~alan/stack_dump_all_tasks_with_frame_pointers.txt -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Deadlock in do_page_fault() on ARM (old kernel)
On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote: On Wed, Jan 15, 2014 at 08:13:04PM -0500, Alan Ott wrote: So my questions are: 1. Why don't I see a full backtrace beyond the exception stack? It's the same when dump_stack() is called manually. No idea - it looks like you're not using frame pointers, but are using the unwinder. Full backtraces can always be created with frame pointers, it's just that unwinding seems unreliable. Hi Russell, I managed to get frame pointers turned on, which in this kernel version, seems like it requires the unwinder to be turned off[1]. When I built with frame pointers, the backtraces show differently. It only shows the frames which were _not_ shown with the unwinder. Backtrace at [2]. As you can see, it shows the non-exception stack dumps ending at places which can't page fault (and if it did page fault in those locations, it wouldn't try to take the mmap_sem lock). Its as though it shows what's in the modules but not what's in the image, where with the unwinder, it showed what's in the image and not what's in the modules[3]. I think we do need to see the full backtrace here - from looking at the full state dump, I don't see any sign of the mmap_sem being held except by an attempt to process a fault, and two threads trying to do a sys_mmap_pgoff(). The lockdep printout at the end of [4] (which is the original log I sent the other day) shows do_page_fault() holding mmap_sem in six tasks and sys_mmap_pgoff() holding it in two. (I don't like the term "held" in the lockdep printout and believe it to mean "held or waiting for," based on my analysis of the rw_semaphore code.) Each of those tasks seems to be blocking for it. My suspicion therefore is that some other thread must have died while holding the mmap_sem, so there's probably a kernel oops earlier... that's my best guess at the moment without seeing the full backtrace. There's no oops that I'm able to see. Each of the tasks which lockdep reports as "holding" mmap_sem are blocking for it. If some other task had taken it and then crashed, I assume lockdep would list the crashed task as also holding the resource in the printout. Thanks for having a look at this, Alan. [1] It seems a little strange to me even in the newest kernel: 1: lib/Kconfig.debug specifies FRAME_POINTER . The text of this instance shows up when one searches for FRAME_POINTER in menuconfig. It can become selected even though the dependencies listed here are not met. 2: arch/arm/Kconfig.debug also specifies FRAME_POINTER, with a dependency only on !THUMB2_KERNEL, defaulting to yes on !ARM_UNWIND 3: ARM doens't set ARCH_WANT_FRAME_POINTERS [2] http://www.signal11.us/~alan/stack_dump_with_frame_pointers.txt Note: This is a sysrq dump of the locks held at lockup time and then the automatic hung task detection. [3] The modules are being built in the standard way for out-of-tree: $(MAKE) -C $(PRJDIR)/linux M=$(CWD) modules [4] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Deadlock in do_page_fault() on ARM (old kernel)
On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote: On Wed, Jan 15, 2014 at 08:13:04PM -0500, Alan Ott wrote: So my questions are: 1. Why don't I see a full backtrace beyond the exception stack? It's the same when dump_stack() is called manually. No idea - it looks like you're not using frame pointers, but are using the unwinder. Full backtraces can always be created with frame pointers, it's just that unwinding seems unreliable. Hi Russell, I managed to get frame pointers turned on, which in this kernel version, seems like it requires the unwinder to be turned off[1]. When I built with frame pointers, the backtraces show differently. It only shows the frames which were _not_ shown with the unwinder. Backtrace at [2]. As you can see, it shows the non-exception stack dumps ending at places which can't page fault (and if it did page fault in those locations, it wouldn't try to take the mmap_sem lock). Its as though it shows what's in the modules but not what's in the image, where with the unwinder, it showed what's in the image and not what's in the modules[3]. I think we do need to see the full backtrace here - from looking at the full state dump, I don't see any sign of the mmap_sem being held except by an attempt to process a fault, and two threads trying to do a sys_mmap_pgoff(). The lockdep printout at the end of [4] (which is the original log I sent the other day) shows do_page_fault() holding mmap_sem in six tasks and sys_mmap_pgoff() holding it in two. (I don't like the term held in the lockdep printout and believe it to mean held or waiting for, based on my analysis of the rw_semaphore code.) Each of those tasks seems to be blocking for it. My suspicion therefore is that some other thread must have died while holding the mmap_sem, so there's probably a kernel oops earlier... that's my best guess at the moment without seeing the full backtrace. There's no oops that I'm able to see. Each of the tasks which lockdep reports as holding mmap_sem are blocking for it. If some other task had taken it and then crashed, I assume lockdep would list the crashed task as also holding the resource in the printout. Thanks for having a look at this, Alan. [1] It seems a little strange to me even in the newest kernel: 1: lib/Kconfig.debug specifies FRAME_POINTER . The text of this instance shows up when one searches for FRAME_POINTER in menuconfig. It can become selected even though the dependencies listed here are not met. 2: arch/arm/Kconfig.debug also specifies FRAME_POINTER, with a dependency only on !THUMB2_KERNEL, defaulting to yes on !ARM_UNWIND 3: ARM doens't set ARCH_WANT_FRAME_POINTERS [2] http://www.signal11.us/~alan/stack_dump_with_frame_pointers.txt Note: This is a sysrq dump of the locks held at lockup time and then the automatic hung task detection. [3] The modules are being built in the standard way for out-of-tree: $(MAKE) -C $(PRJDIR)/linux M=$(CWD) modules [4] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Deadlock in do_page_fault() on ARM (old kernel)
Hello, I have a deadlock that I'm trying to understand. The symptom is multiple tasks trying to acquire a read lock (down_read()) on mm->mmap_sem in do_page_fault(). I'll be right up front and say that this is a fairly old kernel (2.6.37 TI PSP kernel) on a fairly old processor DaVinci 6446. At the time of the deadlock, sysrq's show-all-tasks shows the following for three of the tasks which are deadlocked (there are more, but I just picked the interesting ones; the full output is at [1]): uiD c0ea8208 0 1405 1293 0x [] (schedule+0x33c/0x3c4) from [] (__down_read+0xbc/0xd4) [] (__down_read+0xbc/0xd4) from [] (do_page_fault+0x94/0x248) [] (do_page_fault+0x94/0x248) from [] (do_DataAbort+0x34/0x94) [] (do_DataAbort+0x34/0x94) from [] (__dabt_svc+0x4c/0x60) Exception stack(0xc048dce8 to 0xc048dd30) dce0: 400e9a94 c048ddb0 ffec c048c000 c048dda4 dd00: 400e9a94 ff92 c048c000 0001 0014 c048dd34 dd20: c0d1f68c 0013 [] (__dabt_svc+0x4c/0x60) from [] (__copy_to_user_std+0xcc/0x3a8) uiD c0ea8208 0 1406 1293 0x [] (schedule+0x33c/0x3c4) from [] (__down_read+0xbc/0xd4) [] (__down_read+0xbc/0xd4) from [] (do_page_fault+0x94/0x248) [] (do_page_fault+0x94/0x248) from [] (do_DataAbort+0x34/0x94) [] (do_DataAbort+0x34/0x94) from [] (ret_from_exception+0x0/0x10) Exception stack(0xc048ffb0 to 0xc048fff8) ffa0: 0060 000a 00a8 0010d000 ffc0: 00c23d80 00c23de8 405af06c 405af03c 405af074 0050 01ff ffe0: 405ae000 40185748 404f5c4c 404f393c 8010 uiD c0ea8208 0 1411 1293 0x [] (schedule+0x33c/0x3c4) from [] (__down_read+0xbc/0xd4) [] (__down_read+0xbc/0xd4) from [] (do_page_fault+0x94/0x248) [] (do_page_fault+0x94/0x248) from [] (do_DataAbort+0x34/0x94) [] (do_DataAbort+0x34/0x94) from [] (ret_from_exception+0x0/0x10) Exception stack(0xc053bfb0 to 0xc053bff8) bfa0: 0001 00ba3610 bfc0: 00ba3610 00bb6020 00ba3610 40074000 00b91024 415e4930 0583 bfe0: 00b611a0 415e38e0 4005f3e4 0fc0 6010 [snip] Showing all locks held in the system: 1 lock held by getty/1294: #0: (>atomic_read_lock){+.+...}, at: [] n_tty_read+0x21c/0x670 1 lock held by ui/1405: #0: (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248 1 lock held by ui/1406: #0: (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248 1 lock held by ui/1408: #0: (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248 1 lock held by ui/1409: #0: (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248 1 lock held by ui/1411: #0: (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248 1 lock held by ui/1416: #0: (>mmap_sem){++}, at: [] sys_mmap_pgoff+0x70/0xc0 1 lock held by ui/1418: #0: (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248 1 lock held by ui/1420: #0: (>mmap_sem){++}, at: [] sys_mmap_pgoff+0x70/0xc0 1 lock held by ui/1434: #0: (>atomic_read_lock){+.+...}, at: [] n_tty_read+0x21c/0x670 Note that above, do_page_fault() takes out a read lock (down_read()) and sys_mmap_pgoff() takes out a write lock (down_write()). I've searched for this kind of problem and found two patches which seem to be related to this issue[2]. I have applied both with no better results. So my questions are: 1. Why don't I see a full backtrace beyond the exception stack? It's the same when dump_stack() is called manually. 2. __copy_to_user_memcpy() takes a read lock (down_read()) on mm->mmap_sem. While that lock is held, __copy_to_user_memcpy() can generate a page fault, causing do_page_fault() to get called, which will also try to get a read lock (down_read()) on mm->mmap_sem. Multiple read locks can be taken on an rw_semaphore, but deadlock will occur if another thread tries to get a write lock (down_write()) in between. For example: Task 1: Task 2: down_read(sem) down_write(sem)<-- Goes to sleep down_read(sem) <-- Goes to sleep There is a thread from 2005[3] which seems to discuss the same concept of recursive rw_semaphores, but for futexes. Other comments: 1. My analysis of this probably wrong. Otherwise it seems many others would have the same problem, and they don't seem to. I'm hoping this email will help to correct my understanding. 2. I looked through the git logs for recent (since 2.6.37 time frame) and nothing else jumped out at me as being an obvious fix for this situation. Thanks for any insight you can give, Alan. [1] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt [2] Some websites/bugtrackers mention this commit with a similar issue, but I'm not entirely sure how it's related: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8878a539ff19a43cf3729e7562cd528f490246ae This
Deadlock in do_page_fault() on ARM (old kernel)
Hello, I have a deadlock that I'm trying to understand. The symptom is multiple tasks trying to acquire a read lock (down_read()) on mm-mmap_sem in do_page_fault(). I'll be right up front and say that this is a fairly old kernel (2.6.37 TI PSP kernel) on a fairly old processor DaVinci 6446. At the time of the deadlock, sysrq's show-all-tasks shows the following for three of the tasks which are deadlocked (there are more, but I just picked the interesting ones; the full output is at [1]): uiD c0ea8208 0 1405 1293 0x [c0ea8208] (schedule+0x33c/0x3c4) from [c0eaa3b4] (__down_read+0xbc/0xd4) [c0eaa3b4] (__down_read+0xbc/0xd4) from [c0c0b378] (do_page_fault+0x94/0x248) [c0c0b378] (do_page_fault+0x94/0x248) from [c0c052e0] (do_DataAbort+0x34/0x94) [c0c052e0] (do_DataAbort+0x34/0x94) from [c0c05b0c] (__dabt_svc+0x4c/0x60) Exception stack(0xc048dce8 to 0xc048dd30) dce0: 400e9a94 c048ddb0 ffec c048c000 c048dda4 dd00: 400e9a94 ff92 c048c000 0001 0014 c048dd34 dd20: c0d1f68c 0013 [c0c05b0c] (__dabt_svc+0x4c/0x60) from [c0d1f68c] (__copy_to_user_std+0xcc/0x3a8) uiD c0ea8208 0 1406 1293 0x [c0ea8208] (schedule+0x33c/0x3c4) from [c0eaa3b4] (__down_read+0xbc/0xd4) [c0eaa3b4] (__down_read+0xbc/0xd4) from [c0c0b378] (do_page_fault+0x94/0x248) [c0c0b378] (do_page_fault+0x94/0x248) from [c0c052e0] (do_DataAbort+0x34/0x94) [c0c052e0] (do_DataAbort+0x34/0x94) from [c0c05f0c] (ret_from_exception+0x0/0x10) Exception stack(0xc048ffb0 to 0xc048fff8) ffa0: 0060 000a 00a8 0010d000 ffc0: 00c23d80 00c23de8 405af06c 405af03c 405af074 0050 01ff ffe0: 405ae000 40185748 404f5c4c 404f393c 8010 uiD c0ea8208 0 1411 1293 0x [c0ea8208] (schedule+0x33c/0x3c4) from [c0eaa3b4] (__down_read+0xbc/0xd4) [c0eaa3b4] (__down_read+0xbc/0xd4) from [c0c0b378] (do_page_fault+0x94/0x248) [c0c0b378] (do_page_fault+0x94/0x248) from [c0c052e0] (do_DataAbort+0x34/0x94) [c0c052e0] (do_DataAbort+0x34/0x94) from [c0c05f0c] (ret_from_exception+0x0/0x10) Exception stack(0xc053bfb0 to 0xc053bff8) bfa0: 0001 00ba3610 bfc0: 00ba3610 00bb6020 00ba3610 40074000 00b91024 415e4930 0583 bfe0: 00b611a0 415e38e0 4005f3e4 0fc0 6010 [snip] Showing all locks held in the system: 1 lock held by getty/1294: #0: (tty-atomic_read_lock){+.+...}, at: [c0d45bf0] n_tty_read+0x21c/0x670 1 lock held by ui/1405: #0: (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248 1 lock held by ui/1406: #0: (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248 1 lock held by ui/1408: #0: (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248 1 lock held by ui/1409: #0: (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248 1 lock held by ui/1411: #0: (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248 1 lock held by ui/1416: #0: (mm-mmap_sem){++}, at: [c0c6e604] sys_mmap_pgoff+0x70/0xc0 1 lock held by ui/1418: #0: (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248 1 lock held by ui/1420: #0: (mm-mmap_sem){++}, at: [c0c6e604] sys_mmap_pgoff+0x70/0xc0 1 lock held by ui/1434: #0: (tty-atomic_read_lock){+.+...}, at: [c0d45bf0] n_tty_read+0x21c/0x670 Note that above, do_page_fault() takes out a read lock (down_read()) and sys_mmap_pgoff() takes out a write lock (down_write()). I've searched for this kind of problem and found two patches which seem to be related to this issue[2]. I have applied both with no better results. So my questions are: 1. Why don't I see a full backtrace beyond the exception stack? It's the same when dump_stack() is called manually. 2. __copy_to_user_memcpy() takes a read lock (down_read()) on mm-mmap_sem. While that lock is held, __copy_to_user_memcpy() can generate a page fault, causing do_page_fault() to get called, which will also try to get a read lock (down_read()) on mm-mmap_sem. Multiple read locks can be taken on an rw_semaphore, but deadlock will occur if another thread tries to get a write lock (down_write()) in between. For example: Task 1: Task 2: down_read(sem) down_write(sem)-- Goes to sleep down_read(sem) -- Goes to sleep There is a thread from 2005[3] which seems to discuss the same concept of recursive rw_semaphores, but for futexes. Other comments: 1. My analysis of this probably wrong. Otherwise it seems many others would have the same problem, and they don't seem to. I'm hoping this email will help to correct my understanding. 2. I looked through the git logs for recent (since 2.6.37 time frame) and nothing else jumped out at me as being an obvious fix for this situation. Thanks for any insight you can give,
[PATCH v1 2/3] mrf24j40: Use threaded IRQ handler
Eliminate all the workqueue and interrupt enable/disable. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 27 +++ 1 file changed, 7 insertions(+), 20 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 66bb4ce..c1bc688 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -82,7 +82,6 @@ struct mrf24j40 { struct mutex buffer_mutex; /* only used to protect buf */ struct completion tx_complete; - struct work_struct irqwork; u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */ }; @@ -590,17 +589,6 @@ static struct ieee802154_ops mrf24j40_ops = { static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - - disable_irq_nosync(irq); - - schedule_work(>irqwork); - - return IRQ_HANDLED; -} - -static void mrf24j40_isrwork(struct work_struct *work) -{ - struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork); u8 intstat; int ret; @@ -618,7 +606,7 @@ static void mrf24j40_isrwork(struct work_struct *work) mrf24j40_handle_rx(devrec); out: - enable_irq(devrec->spi->irq); + return IRQ_HANDLED; } static int mrf24j40_probe(struct spi_device *spi) @@ -642,7 +630,6 @@ static int mrf24j40_probe(struct spi_device *spi) mutex_init(>buffer_mutex); init_completion(>tx_complete); - INIT_WORK(>irqwork, mrf24j40_isrwork); devrec->spi = spi; spi_set_drvdata(spi, devrec); @@ -688,11 +675,12 @@ static int mrf24j40_probe(struct spi_device *spi) val &= ~0x3; /* Clear RX mode (normal) */ write_short_reg(devrec, REG_RXMCR, val); - ret = request_irq(spi->irq, - mrf24j40_isr, - IRQF_TRIGGER_FALLING, - dev_name(>dev), - devrec); + ret = request_threaded_irq(spi->irq, + NULL, + mrf24j40_isr, + IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + dev_name(>dev), + devrec); if (ret) { dev_err(printdev(devrec), "Unable to get IRQ"); @@ -721,7 +709,6 @@ static int mrf24j40_remove(struct spi_device *spi) dev_dbg(printdev(devrec), "remove\n"); free_irq(spi->irq, devrec); - flush_work(>irqwork); /* TODO: Is this the right call? */ ieee802154_unregister_device(devrec->dev); ieee802154_free_device(devrec->dev); /* TODO: Will ieee802154_free_device() wait until ->xmit() is -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission
This avoids a race condition where complete(tx_complete) could be called before tx_complete is initialized. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 42e6dee..66bb4ce 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -344,6 +344,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; + INIT_COMPLETION(devrec->tx_complete); + /* Set TXNTRIG bit of TXNCON to send packet */ ret = read_short_reg(devrec, REG_TXNCON, ); if (ret) @@ -354,8 +356,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); - INIT_COMPLETION(devrec->tx_complete); - /* Wait for the device to send the TX complete interrupt. */ ret = wait_for_completion_interruptible_timeout( >tx_complete, -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 3/3] mrf24j40: Use level-triggered interrupts
The mrf24j40 generates level interrupts. There are rare cases where it appears that the interrupt line never gets de-asserted between interrupts, causing interrupts to be lost, and causing a hung device from the driver's perspective. Switching the driver to interpret these interrupts as level-triggered fixes this issue. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index c1bc688..0632d34 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -678,7 +678,7 @@ static int mrf24j40_probe(struct spi_device *spi) ret = request_threaded_irq(spi->irq, NULL, mrf24j40_isr, - IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + IRQF_TRIGGER_LOW|IRQF_ONESHOT, dev_name(>dev), devrec); -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 0/3] Fix race conditions in mrf24j40 interrupts
After testing with the betas of this patchset, it's been rebased and is ready for inclusion. David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/2] 6lowpan: Only make 6lowpan links to IEEE802154 devices
Refuse to create 6lowpan links if the actual hardware interface is of any type other than ARPHRD_IEEE802154. Signed-off-by: Alan Ott Suggested-by: Alexander Aring --- net/ieee802154/6lowpan.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index c85e71e..8f56b2b 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1372,6 +1372,8 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; + if (real_dev->type != ARPHRD_IEEE802154) + return -EINVAL; lowpan_dev_info(dev)->real_dev = real_dev; lowpan_dev_info(dev)->fragment_tag = 0; -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/2] 6lowpan: Sync default hardware address of lowpan links to their wpan
When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Ott --- net/ieee802154/6lowpan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index 8f56b2b..ff41b4d 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1388,6 +1388,9 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, entry->ldev = dev; + /* Set the lowpan harware address to the wpan hardware address. */ + memcpy(dev->dev_addr, real_dev->dev_addr, IEEE802154_ADDR_LEN); + mutex_lock(_dev_info(dev)->dev_list_mtx); INIT_LIST_HEAD(>list); list_add_tail(>list, _devices); -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 0/2] 6lowpan default hardware address
Alexander Aring suggested that devices desired to be linked to 6lowpan be checked for actually being of type IEEE802154, since IEEE802154 devices are all that are supported by 6lowpan at present. Alan Ott (2): 6lowpan: Only make 6lowpan links to IEEE802154 devices 6lowpan: Sync default hardware address of lowpan links to their wpan net/ieee802154/6lowpan.c | 5 + 1 file changed, 5 insertions(+) -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan
On 10/05/2013 08:24 PM, Alexander Aring wrote: Hi Alan, On Sat, Oct 05, 2013 at 05:38:00PM -0400, Alan Ott wrote: When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Ott --- net/ieee802154/6lowpan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index c85e71e..fb89133 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, entry->ldev = dev; + BUG_ON(IEEE802154_ADDR_LEN != dev->addr_len); So if somebody make a: "ip link add link $NOT_8BYTE_HWADDR_DEV name $NAME type lowpan" the kernel creates a BUG_ON? Okay it seems that case is very unusual but better is to return a errno so "maybe(I don't know it)" the userspace software will generate a error and the whole kernel doesn't crash. That line checks the length of the address of the lowpan device, not the real device the lowpan device is attached to (and on second look, I don't like the Yoda code; didn't notice that before). Further, running: ip link add link eth0 name lowpan0 type lowpan like you suggest, crashes the kernel hard, with or without my patch. More stuff to fix + memcpy(dev->dev_addr, real_dev->dev_addr, IEEE802154_ADDR_LEN); Is there one case where we read the dev->dev_addr? I saw a case when we set this [1], but I don't know why we need this. Did you detect some problems because this isn't the "real" hw addr? Other case is I am feeling uneasy when we have two netdev devices with the same hw addr. [1]https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/ieee802154/6lowpan.c?id=refs/tags/v3.12-rc3#n1102 In my testing, lowpan devices need a hardware address which is the same as the wpan, or else stuff like ping doesn't work at all. In all the examples I've ever seen, the lowpan's hardware address has been set manually. What does your test setup look like that you don't have a hardware address for your lowpan device? Are you able to ping other Linux hosts? Alan. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan
When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Ott --- net/ieee802154/6lowpan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index c85e71e..fb89133 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, entry->ldev = dev; + BUG_ON(IEEE802154_ADDR_LEN != dev->addr_len); + memcpy(dev->dev_addr, real_dev->dev_addr, IEEE802154_ADDR_LEN); + mutex_lock(_dev_info(dev)->dev_list_mtx); INIT_LIST_HEAD(>list); list_add_tail(>list, _devices); -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan
When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Ott a...@signal11.us --- net/ieee802154/6lowpan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index c85e71e..fb89133 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, entry-ldev = dev; + BUG_ON(IEEE802154_ADDR_LEN != dev-addr_len); + memcpy(dev-dev_addr, real_dev-dev_addr, IEEE802154_ADDR_LEN); + mutex_lock(lowpan_dev_info(dev)-dev_list_mtx); INIT_LIST_HEAD(entry-list); list_add_tail(entry-list, lowpan_devices); -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan
On 10/05/2013 08:24 PM, Alexander Aring wrote: Hi Alan, On Sat, Oct 05, 2013 at 05:38:00PM -0400, Alan Ott wrote: When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Otta...@signal11.us --- net/ieee802154/6lowpan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index c85e71e..fb89133 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, entry-ldev = dev; + BUG_ON(IEEE802154_ADDR_LEN != dev-addr_len); So if somebody make a: ip link add link $NOT_8BYTE_HWADDR_DEV name $NAME type lowpan the kernel creates a BUG_ON? Okay it seems that case is very unusual but better is to return a errno so maybe(I don't know it) the userspace software will generate a error and the whole kernel doesn't crash. That line checks the length of the address of the lowpan device, not the real device the lowpan device is attached to (and on second look, I don't like the Yoda code; didn't notice that before). Further, running: ip link add link eth0 name lowpan0 type lowpan like you suggest, crashes the kernel hard, with or without my patch. More stuff to fix + memcpy(dev-dev_addr, real_dev-dev_addr, IEEE802154_ADDR_LEN); Is there one case where we read the dev-dev_addr? I saw a case when we set this [1], but I don't know why we need this. Did you detect some problems because this isn't the real hw addr? Other case is I am feeling uneasy when we have two netdev devices with the same hw addr. [1]https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/ieee802154/6lowpan.c?id=refs/tags/v3.12-rc3#n1102 In my testing, lowpan devices need a hardware address which is the same as the wpan, or else stuff like ping doesn't work at all. In all the examples I've ever seen, the lowpan's hardware address has been set manually. What does your test setup look like that you don't have a hardware address for your lowpan device? Are you able to ping other Linux hosts? Alan. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 0/2] 6lowpan default hardware address
Alexander Aring suggested that devices desired to be linked to 6lowpan be checked for actually being of type IEEE802154, since IEEE802154 devices are all that are supported by 6lowpan at present. Alan Ott (2): 6lowpan: Only make 6lowpan links to IEEE802154 devices 6lowpan: Sync default hardware address of lowpan links to their wpan net/ieee802154/6lowpan.c | 5 + 1 file changed, 5 insertions(+) -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/2] 6lowpan: Sync default hardware address of lowpan links to their wpan
When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Ott a...@signal11.us --- net/ieee802154/6lowpan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index 8f56b2b..ff41b4d 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1388,6 +1388,9 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, entry-ldev = dev; + /* Set the lowpan harware address to the wpan hardware address. */ + memcpy(dev-dev_addr, real_dev-dev_addr, IEEE802154_ADDR_LEN); + mutex_lock(lowpan_dev_info(dev)-dev_list_mtx); INIT_LIST_HEAD(entry-list); list_add_tail(entry-list, lowpan_devices); -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/2] 6lowpan: Only make 6lowpan links to IEEE802154 devices
Refuse to create 6lowpan links if the actual hardware interface is of any type other than ARPHRD_IEEE802154. Signed-off-by: Alan Ott a...@signal11.us Suggested-by: Alexander Aring alex.ar...@gmail.com --- net/ieee802154/6lowpan.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index c85e71e..8f56b2b 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1372,6 +1372,8 @@ static int lowpan_newlink(struct net *src_net, struct net_device *dev, real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; + if (real_dev-type != ARPHRD_IEEE802154) + return -EINVAL; lowpan_dev_info(dev)-real_dev = real_dev; lowpan_dev_info(dev)-fragment_tag = 0; -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 3/3] mrf24j40: Use level-triggered interrupts
The mrf24j40 generates level interrupts. There are rare cases where it appears that the interrupt line never gets de-asserted between interrupts, causing interrupts to be lost, and causing a hung device from the driver's perspective. Switching the driver to interpret these interrupts as level-triggered fixes this issue. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index c1bc688..0632d34 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -678,7 +678,7 @@ static int mrf24j40_probe(struct spi_device *spi) ret = request_threaded_irq(spi-irq, NULL, mrf24j40_isr, - IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + IRQF_TRIGGER_LOW|IRQF_ONESHOT, dev_name(spi-dev), devrec); -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 0/3] Fix race conditions in mrf24j40 interrupts
After testing with the betas of this patchset, it's been rebased and is ready for inclusion. David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission
This avoids a race condition where complete(tx_complete) could be called before tx_complete is initialized. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 42e6dee..66bb4ce 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -344,6 +344,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; + INIT_COMPLETION(devrec-tx_complete); + /* Set TXNTRIG bit of TXNCON to send packet */ ret = read_short_reg(devrec, REG_TXNCON, val); if (ret) @@ -354,8 +356,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); - INIT_COMPLETION(devrec-tx_complete); - /* Wait for the device to send the TX complete interrupt. */ ret = wait_for_completion_interruptible_timeout( devrec-tx_complete, -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v1 2/3] mrf24j40: Use threaded IRQ handler
Eliminate all the workqueue and interrupt enable/disable. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 27 +++ 1 file changed, 7 insertions(+), 20 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 66bb4ce..c1bc688 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -82,7 +82,6 @@ struct mrf24j40 { struct mutex buffer_mutex; /* only used to protect buf */ struct completion tx_complete; - struct work_struct irqwork; u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */ }; @@ -590,17 +589,6 @@ static struct ieee802154_ops mrf24j40_ops = { static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - - disable_irq_nosync(irq); - - schedule_work(devrec-irqwork); - - return IRQ_HANDLED; -} - -static void mrf24j40_isrwork(struct work_struct *work) -{ - struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork); u8 intstat; int ret; @@ -618,7 +606,7 @@ static void mrf24j40_isrwork(struct work_struct *work) mrf24j40_handle_rx(devrec); out: - enable_irq(devrec-spi-irq); + return IRQ_HANDLED; } static int mrf24j40_probe(struct spi_device *spi) @@ -642,7 +630,6 @@ static int mrf24j40_probe(struct spi_device *spi) mutex_init(devrec-buffer_mutex); init_completion(devrec-tx_complete); - INIT_WORK(devrec-irqwork, mrf24j40_isrwork); devrec-spi = spi; spi_set_drvdata(spi, devrec); @@ -688,11 +675,12 @@ static int mrf24j40_probe(struct spi_device *spi) val = ~0x3; /* Clear RX mode (normal) */ write_short_reg(devrec, REG_RXMCR, val); - ret = request_irq(spi-irq, - mrf24j40_isr, - IRQF_TRIGGER_FALLING, - dev_name(spi-dev), - devrec); + ret = request_threaded_irq(spi-irq, + NULL, + mrf24j40_isr, + IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + dev_name(spi-dev), + devrec); if (ret) { dev_err(printdev(devrec), Unable to get IRQ); @@ -721,7 +709,6 @@ static int mrf24j40_remove(struct spi_device *spi) dev_dbg(printdev(devrec), remove\n); free_irq(spi-irq, devrec); - flush_work(devrec-irqwork); /* TODO: Is this the right call? */ ieee802154_unregister_device(devrec-dev); ieee802154_free_device(devrec-dev); /* TODO: Will ieee802154_free_device() wait until -xmit() is -- 1.8.1.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
On 05/23/2013 01:54 PM, David Hauweele wrote: > 2013/5/23 Alan Ott : >> On 5/22/13 4:32 PM, David Hauweele wrote: >>> >>> I cannot use level-triggered interrupts with GPIO on the RPi, so I >>> cannot test this specific patch. >> >> >> Is there another interrupt line you can tie into which does support >> level-trigger interrupts (INT0 or something)? > > According to the datasheet it should be possible but the bcm2708 port > does not support it. I've been told that we shouldn't use level-triggered > interrupts in the first place. Who is "we," and why shouldn't we use level-triggered interrupts? The CPU needs to detect whatever type of interrupt the attached hardware generates. > >> >> >>> However I agree with the idea of level-triggered interrupts, that >>> would fix all major problems related to missed interrupts. >>> >>> Beside this I'm running a ping -f since more than two hours now and it >>> seems to work well. >>> >> >> So that surprises me. I thought level-trigger interrupts were the thing that >> would fix this problem, and if you're not running with that patch, you just >> have the INIT_COMPLETION() fix (which you said didn't fix your issue) and >> the threaded interrupts patch, which I was fairly sure I had determined >> wasn't fixing any actual race-condition-related problems. > > I should have been more clear about this. I've tested [PATCH 1/3] > which fixes the race condition with tx_complete. That is the > INIT_COMPLETION() fix. But it is still possible to miss an interrupt, > perhaps it just took longer this time. I ran the test again today and > it failed after 30 minutes. > > I did not test [PATCH 2/3], that is the threaded IRQ. Instead I > removed interrupt enable/disable from the IRQ handler and the > workqueue. Without this the driver would fail within seconds of a ping > -f. Without what? What do you mean by "without this?" Without the enable/disable, or without the change that removes the enable/disable? > Have you observed this too ? Perhaps this problem is specific to > the bcm2708 port. > What I observe right now is that it seems to work great (ping -f for 6.5 hours) when using the three patches in this patch set on a BeagleBone. > >> >> I'm glad, but surprised that you're no longer seeing issues. >> >> Alan. >> >> >>> >>> 2013/5/22 Alan Ott : >>>> >>>> On 05/21/2013 10:01 PM, Alan Ott wrote: >>>>> >>>>> David Hauweele noticed that the mrf24j40 would hang arbitrarily after >>>>> some >>>>> period of heavy traffic. Two race conditions were discovered, and the >>>>> driver was changed to use threaded interrupts, since the enable/disable >>>>> of >>>>> interrupts in the driver has recently been a lighning rod whenever >>>>> issues >>>>> arise related to interrupts (costing engineering time), and since >>>>> threaded >>>>> interrupts are the right way to do it. >>>>> >>>>> Alan Ott (3): >>>>>mrf24j40: Move INIT_COMPLETION() to before packet transmission >>>>>mrf24j40: Use threaded IRQ handler >>>>>mrf24j40: Use level-triggered interrupts >>>>> >>>>> drivers/net/ieee802154/mrf24j40.c | 31 +-- >>>>> 1 file changed, 9 insertions(+), 22 deletions(-) >>>> >>>> >>>> I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and >>>> it seems solid. >>>> >> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
On 5/22/13 4:32 PM, David Hauweele wrote: I cannot use level-triggered interrupts with GPIO on the RPi, so I cannot test this specific patch. Is there another interrupt line you can tie into which does support level-trigger interrupts (INT0 or something)? However I agree with the idea of level-triggered interrupts, that would fix all major problems related to missed interrupts. Beside this I'm running a ping -f since more than two hours now and it seems to work well. So that surprises me. I thought level-trigger interrupts were the thing that would fix this problem, and if you're not running with that patch, you just have the INIT_COMPLETION() fix (which you said didn't fix your issue) and the threaded interrupts patch, which I was fairly sure I had determined wasn't fixing any actual race-condition-related problems. I'm glad, but surprised that you're no longer seeing issues. Alan. 2013/5/22 Alan Ott : On 05/21/2013 10:01 PM, Alan Ott wrote: David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and it seems solid. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
On 5/22/13 4:32 PM, David Hauweele wrote: I cannot use level-triggered interrupts with GPIO on the RPi, so I cannot test this specific patch. Is there another interrupt line you can tie into which does support level-trigger interrupts (INT0 or something)? However I agree with the idea of level-triggered interrupts, that would fix all major problems related to missed interrupts. Beside this I'm running a ping -f since more than two hours now and it seems to work well. So that surprises me. I thought level-trigger interrupts were the thing that would fix this problem, and if you're not running with that patch, you just have the INIT_COMPLETION() fix (which you said didn't fix your issue) and the threaded interrupts patch, which I was fairly sure I had determined wasn't fixing any actual race-condition-related problems. I'm glad, but surprised that you're no longer seeing issues. Alan. 2013/5/22 Alan Ott a...@signal11.us: On 05/21/2013 10:01 PM, Alan Ott wrote: David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and it seems solid. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
On 05/23/2013 01:54 PM, David Hauweele wrote: 2013/5/23 Alan Ott a...@signal11.us: On 5/22/13 4:32 PM, David Hauweele wrote: I cannot use level-triggered interrupts with GPIO on the RPi, so I cannot test this specific patch. Is there another interrupt line you can tie into which does support level-trigger interrupts (INT0 or something)? According to the datasheet it should be possible but the bcm2708 port does not support it. I've been told that we shouldn't use level-triggered interrupts in the first place. Who is we, and why shouldn't we use level-triggered interrupts? The CPU needs to detect whatever type of interrupt the attached hardware generates. However I agree with the idea of level-triggered interrupts, that would fix all major problems related to missed interrupts. Beside this I'm running a ping -f since more than two hours now and it seems to work well. So that surprises me. I thought level-trigger interrupts were the thing that would fix this problem, and if you're not running with that patch, you just have the INIT_COMPLETION() fix (which you said didn't fix your issue) and the threaded interrupts patch, which I was fairly sure I had determined wasn't fixing any actual race-condition-related problems. I should have been more clear about this. I've tested [PATCH 1/3] which fixes the race condition with tx_complete. That is the INIT_COMPLETION() fix. But it is still possible to miss an interrupt, perhaps it just took longer this time. I ran the test again today and it failed after 30 minutes. I did not test [PATCH 2/3], that is the threaded IRQ. Instead I removed interrupt enable/disable from the IRQ handler and the workqueue. Without this the driver would fail within seconds of a ping -f. Without what? What do you mean by without this? Without the enable/disable, or without the change that removes the enable/disable? Have you observed this too ? Perhaps this problem is specific to the bcm2708 port. What I observe right now is that it seems to work great (ping -f for 6.5 hours) when using the three patches in this patch set on a BeagleBone. I'm glad, but surprised that you're no longer seeing issues. Alan. 2013/5/22 Alan Ott a...@signal11.us: On 05/21/2013 10:01 PM, Alan Ott wrote: David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and it seems solid. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
On 05/21/2013 10:01 PM, Alan Ott wrote: > David Hauweele noticed that the mrf24j40 would hang arbitrarily after some > period of heavy traffic. Two race conditions were discovered, and the > driver was changed to use threaded interrupts, since the enable/disable of > interrupts in the driver has recently been a lighning rod whenever issues > arise related to interrupts (costing engineering time), and since threaded > interrupts are the right way to do it. > > Alan Ott (3): > mrf24j40: Move INIT_COMPLETION() to before packet transmission > mrf24j40: Use threaded IRQ handler > mrf24j40: Use level-triggered interrupts > > drivers/net/ieee802154/mrf24j40.c | 31 +-- > 1 file changed, 9 insertions(+), 22 deletions(-) I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and it seems solid. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission
This avoids a race condition where complete(tx_complete) could be called before tx_complete is initialized. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 556151d..0ea2a5a 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; + INIT_COMPLETION(devrec->tx_complete); + /* Set TXNTRIG bit of TXNCON to send packet */ ret = read_short_reg(devrec, REG_TXNCON, ); if (ret) @@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); - INIT_COMPLETION(devrec->tx_complete); - /* Wait for the device to send the TX complete interrupt. */ ret = wait_for_completion_interruptible_timeout( >tx_complete, -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 2/3] mrf24j40: Use threaded IRQ handler
Eliminate all the workqueue and interrupt enable/disable. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 27 +++ 1 file changed, 7 insertions(+), 20 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 0ea2a5a..a55ab8d 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -83,7 +83,6 @@ struct mrf24j40 { struct mutex buffer_mutex; /* only used to protect buf */ struct completion tx_complete; - struct work_struct irqwork; u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */ }; @@ -591,17 +590,6 @@ static struct ieee802154_ops mrf24j40_ops = { static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - - disable_irq_nosync(irq); - - schedule_work(>irqwork); - - return IRQ_HANDLED; -} - -static void mrf24j40_isrwork(struct work_struct *work) -{ - struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork); u8 intstat; int ret; @@ -619,7 +607,7 @@ static void mrf24j40_isrwork(struct work_struct *work) mrf24j40_handle_rx(devrec); out: - enable_irq(devrec->spi->irq); + return IRQ_HANDLED; } static int mrf24j40_probe(struct spi_device *spi) @@ -649,7 +637,6 @@ static int mrf24j40_probe(struct spi_device *spi) mutex_init(>buffer_mutex); init_completion(>tx_complete); - INIT_WORK(>irqwork, mrf24j40_isrwork); devrec->spi = spi; dev_set_drvdata(>dev, devrec); @@ -695,11 +682,12 @@ static int mrf24j40_probe(struct spi_device *spi) val &= ~0x3; /* Clear RX mode (normal) */ write_short_reg(devrec, REG_RXMCR, val); - ret = request_irq(spi->irq, - mrf24j40_isr, - IRQF_TRIGGER_FALLING, - dev_name(>dev), - devrec); + ret = request_threaded_irq(spi->irq, + NULL, + mrf24j40_isr, + IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + dev_name(>dev), + devrec); if (ret) { dev_err(printdev(devrec), "Unable to get IRQ"); @@ -728,7 +716,6 @@ static int mrf24j40_remove(struct spi_device *spi) dev_dbg(printdev(devrec), "remove\n"); free_irq(spi->irq, devrec); - flush_work(>irqwork); /* TODO: Is this the right call? */ ieee802154_unregister_device(devrec->dev); ieee802154_free_device(devrec->dev); /* TODO: Will ieee802154_free_device() wait until ->xmit() is -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 3/3] mrf24j40: Use level-triggered interrupts
The mrf24j40 generates level interrupts. There are rare cases where it appears that the interrupt line never gets de-asserted between interrupts, causing interrupts to be lost, and causing a hung device from the driver's perspective. Switching the driver to interpret these interrupts as level-triggered fixes this issue. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index a55ab8d..d59dbff 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -685,7 +685,7 @@ static int mrf24j40_probe(struct spi_device *spi) ret = request_threaded_irq(spi->irq, NULL, mrf24j40_isr, - IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + IRQF_TRIGGER_LOW|IRQF_ONESHOT, dev_name(>dev), devrec); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled
On 05/21/2013 12:17 PM, David Hauweele wrote: > 2013/5/20 Alan Ott : >> On 05/16/2013 05:34 PM, David Hauweele wrote: >>> I have seen the interrupt line going low forever with heavy traffic. >> I've been doing some testing today and I can reproduce your issue if I >> ping -f both ways simultaneously (after about 3-4 minutes usually). I >> think it has more to do with the tx/rx mutual exclusion that you >> described in patch 1/1 than it does with the disable/enable IRQ. With >> respect to enable/disable irq, I did some checking of other similar >> drivers (non-ieee802154) and noticed that many of them use threaded >> interrupts. I think this may be closer to the right way to do it. > Indeed, threaded interrupts would be more appropriate. > What about spi_async ? We could avoid using any working queue with this call. > Although the driver would need a major rewrite and I've no clue about > the benefits of doing so. > I wonder how much additional latency is introduced by the working > queue, I will try to measure this today. Async would be really difficult since we need the values that are read in order to know what to write in later calls (like where we need to set one bit in a register). It's hard to imagine doing this without blocking somewhere, and we might as well use the blocking mechanisms that the kernel gives us. >> Are you running a tickless kernel? What is your preemption model? What's >> your hardware platform? > This is a tickless kernel with CONFIG_PREEMPT on an RPi. As another data point, can you try with voluntary preemption and see if the behavior changes? >>> The at86rf230 driver has two isr, one for edge-triggered and another >>> for level-triggered interrupt. The latter uses >>> disable_irq_nosync/enable_irq which makes sense for level-triggered >>> interrupt. Otherwise the interrupt would be triggered again and again >>> until the scheduled work read the status register. >>> >>> However I don't see the use of disable_irq/enable_irq with an >>> edge-triggered interrupt. Perhaps I'm missing something. Though I >>> don't see any harm using it anyway. >> My understanding based on the datasheet is that the interrupt can be >> asserted again as soon as INTSTAT is read (which is done in the >> workqueue). I guess even if it happens while the workqueue is running, >> it's ok. > You're right, as soon as INTSTAT is read, the line should be driven > high again to assert another interrupt. This is where I think I've found another race condition. In doing some testing with the mrf24j40 from PIC24 firmware last night, I think I've discovered that it's possible for the interrupt line to _not_ get reset (de-asserted). I think what may be causing it is if an interrupt occurs at the same time as (or right shortly after) INTSTAT is read. It's easy to imagine the interrupt line not being able to get all the way up before coming back down. I have a support email into Microchip asking specifically about this case. In my firmware, I'm able to see a case where the INTSTAT register shows non-zero (ie: an interrupt is pending), yet the interrupt flag is not set. It seems like it takes about 30 minutes for this to happen. On Linux, I think this is less likely, but still possible, since there is more latency between the hardware interrupt and when INTSTAT is read. The extra latency means there is more time for the second interrupt to happen, causing INTSTAT to simply show both interrupts, and have less likelihood of the second interrupt happening exactly as INTSTAT is read. > So it shouldn't pose any problem if the interrupts are replayed by the > kernel. What may happen however is > an error in the spi transfer. This would result in the line staying > low since the register hasn't really been read. > This is what I observed at the hardware level. To be clear, you've observed the line staying low, not an SPI transfer error, right? > If this is the case, then the problem should also occurs with and > without disable/enable_irq. However this bug > is a bit harder to reproduce as it may take several hours. The only > solution I see would be to regularly poll the > INTSTAT register for recovering. I know about other drivers which do > this so I will look into this and work on a fix. I'd like to avoid a straight-up poll if at all possible. Worst case, we can poll for interrupt in the case of something happening which we don't expect to ever happen, for example on something like the TX interrupt never showing up. It's harder of course to know when to poll for a system that's primarily a receiver. I've done something like this in my firmware: void mrf24j40_isr(void) { u8 intstat; int ret; /* Read the interrupt status */ ret = read_shor
Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled
On 05/21/2013 12:17 PM, David Hauweele wrote: 2013/5/20 Alan Ott a...@signal11.us: On 05/16/2013 05:34 PM, David Hauweele wrote: I have seen the interrupt line going low forever with heavy traffic. I've been doing some testing today and I can reproduce your issue if I ping -f both ways simultaneously (after about 3-4 minutes usually). I think it has more to do with the tx/rx mutual exclusion that you described in patch 1/1 than it does with the disable/enable IRQ. With respect to enable/disable irq, I did some checking of other similar drivers (non-ieee802154) and noticed that many of them use threaded interrupts. I think this may be closer to the right way to do it. Indeed, threaded interrupts would be more appropriate. What about spi_async ? We could avoid using any working queue with this call. Although the driver would need a major rewrite and I've no clue about the benefits of doing so. I wonder how much additional latency is introduced by the working queue, I will try to measure this today. Async would be really difficult since we need the values that are read in order to know what to write in later calls (like where we need to set one bit in a register). It's hard to imagine doing this without blocking somewhere, and we might as well use the blocking mechanisms that the kernel gives us. Are you running a tickless kernel? What is your preemption model? What's your hardware platform? This is a tickless kernel with CONFIG_PREEMPT on an RPi. As another data point, can you try with voluntary preemption and see if the behavior changes? The at86rf230 driver has two isr, one for edge-triggered and another for level-triggered interrupt. The latter uses disable_irq_nosync/enable_irq which makes sense for level-triggered interrupt. Otherwise the interrupt would be triggered again and again until the scheduled work read the status register. However I don't see the use of disable_irq/enable_irq with an edge-triggered interrupt. Perhaps I'm missing something. Though I don't see any harm using it anyway. My understanding based on the datasheet is that the interrupt can be asserted again as soon as INTSTAT is read (which is done in the workqueue). I guess even if it happens while the workqueue is running, it's ok. You're right, as soon as INTSTAT is read, the line should be driven high again to assert another interrupt. This is where I think I've found another race condition. In doing some testing with the mrf24j40 from PIC24 firmware last night, I think I've discovered that it's possible for the interrupt line to _not_ get reset (de-asserted). I think what may be causing it is if an interrupt occurs at the same time as (or right shortly after) INTSTAT is read. It's easy to imagine the interrupt line not being able to get all the way up before coming back down. I have a support email into Microchip asking specifically about this case. In my firmware, I'm able to see a case where the INTSTAT register shows non-zero (ie: an interrupt is pending), yet the interrupt flag is not set. It seems like it takes about 30 minutes for this to happen. On Linux, I think this is less likely, but still possible, since there is more latency between the hardware interrupt and when INTSTAT is read. The extra latency means there is more time for the second interrupt to happen, causing INTSTAT to simply show both interrupts, and have less likelihood of the second interrupt happening exactly as INTSTAT is read. So it shouldn't pose any problem if the interrupts are replayed by the kernel. What may happen however is an error in the spi transfer. This would result in the line staying low since the register hasn't really been read. This is what I observed at the hardware level. To be clear, you've observed the line staying low, not an SPI transfer error, right? If this is the case, then the problem should also occurs with and without disable/enable_irq. However this bug is a bit harder to reproduce as it may take several hours. The only solution I see would be to regularly poll the INTSTAT register for recovering. I know about other drivers which do this so I will look into this and work on a fix. I'd like to avoid a straight-up poll if at all possible. Worst case, we can poll for interrupt in the case of something happening which we don't expect to ever happen, for example on something like the TX interrupt never showing up. It's harder of course to know when to poll for a system that's primarily a receiver. I've done something like this in my firmware: void mrf24j40_isr(void) { u8 intstat; int ret; /* Read the interrupt status */ ret = read_short_reg(devrec, REG_INTSTAT, intstat); if (ret) return; do { /* Check for TX complete */ if (intstat 0x1) tx_complete = 1; /* Check for Rx */ if (intstat 0x8
[PATCH beta 1] 3/3] mrf24j40: Use level-triggered interrupts
The mrf24j40 generates level interrupts. There are rare cases where it appears that the interrupt line never gets de-asserted between interrupts, causing interrupts to be lost, and causing a hung device from the driver's perspective. Switching the driver to interpret these interrupts as level-triggered fixes this issue. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index a55ab8d..d59dbff 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -685,7 +685,7 @@ static int mrf24j40_probe(struct spi_device *spi) ret = request_threaded_irq(spi-irq, NULL, mrf24j40_isr, - IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + IRQF_TRIGGER_LOW|IRQF_ONESHOT, dev_name(spi-dev), devrec); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 2/3] mrf24j40: Use threaded IRQ handler
Eliminate all the workqueue and interrupt enable/disable. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 27 +++ 1 file changed, 7 insertions(+), 20 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 0ea2a5a..a55ab8d 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -83,7 +83,6 @@ struct mrf24j40 { struct mutex buffer_mutex; /* only used to protect buf */ struct completion tx_complete; - struct work_struct irqwork; u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */ }; @@ -591,17 +590,6 @@ static struct ieee802154_ops mrf24j40_ops = { static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - - disable_irq_nosync(irq); - - schedule_work(devrec-irqwork); - - return IRQ_HANDLED; -} - -static void mrf24j40_isrwork(struct work_struct *work) -{ - struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork); u8 intstat; int ret; @@ -619,7 +607,7 @@ static void mrf24j40_isrwork(struct work_struct *work) mrf24j40_handle_rx(devrec); out: - enable_irq(devrec-spi-irq); + return IRQ_HANDLED; } static int mrf24j40_probe(struct spi_device *spi) @@ -649,7 +637,6 @@ static int mrf24j40_probe(struct spi_device *spi) mutex_init(devrec-buffer_mutex); init_completion(devrec-tx_complete); - INIT_WORK(devrec-irqwork, mrf24j40_isrwork); devrec-spi = spi; dev_set_drvdata(spi-dev, devrec); @@ -695,11 +682,12 @@ static int mrf24j40_probe(struct spi_device *spi) val = ~0x3; /* Clear RX mode (normal) */ write_short_reg(devrec, REG_RXMCR, val); - ret = request_irq(spi-irq, - mrf24j40_isr, - IRQF_TRIGGER_FALLING, - dev_name(spi-dev), - devrec); + ret = request_threaded_irq(spi-irq, + NULL, + mrf24j40_isr, + IRQF_TRIGGER_FALLING|IRQF_ONESHOT, + dev_name(spi-dev), + devrec); if (ret) { dev_err(printdev(devrec), Unable to get IRQ); @@ -728,7 +716,6 @@ static int mrf24j40_remove(struct spi_device *spi) dev_dbg(printdev(devrec), remove\n); free_irq(spi-irq, devrec); - flush_work(devrec-irqwork); /* TODO: Is this the right call? */ ieee802154_unregister_device(devrec-dev); ieee802154_free_device(devrec-dev); /* TODO: Will ieee802154_free_device() wait until -xmit() is -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission
This avoids a race condition where complete(tx_complete) could be called before tx_complete is initialized. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 556151d..0ea2a5a 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; + INIT_COMPLETION(devrec-tx_complete); + /* Set TXNTRIG bit of TXNCON to send packet */ ret = read_short_reg(devrec, REG_TXNCON, val); if (ret) @@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); - INIT_COMPLETION(devrec-tx_complete); - /* Wait for the device to send the TX complete interrupt. */ ret = wait_for_completion_interruptible_timeout( devrec-tx_complete, -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts
On 05/21/2013 10:01 PM, Alan Ott wrote: David Hauweele noticed that the mrf24j40 would hang arbitrarily after some period of heavy traffic. Two race conditions were discovered, and the driver was changed to use threaded interrupts, since the enable/disable of interrupts in the driver has recently been a lighning rod whenever issues arise related to interrupts (costing engineering time), and since threaded interrupts are the right way to do it. Alan Ott (3): mrf24j40: Move INIT_COMPLETION() to before packet transmission mrf24j40: Use threaded IRQ handler mrf24j40: Use level-triggered interrupts drivers/net/ieee802154/mrf24j40.c | 31 +-- 1 file changed, 9 insertions(+), 22 deletions(-) I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and it seems solid. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH testing] mrf24j40: Move INIT_COMPLETION to before the packet is sent
This avoids a race condition where the tx_complete interrupt could happen before the completion is initialized. --- David H, Try this out and see if it changes anything for you. I put a bunch of schedule_timeout()s in mrf24j40_tx() to basically guarantee that a reception will happen during mrf24j40_tx(), and it seems to work. It would not work before with said schedule_timeout() calls. Alan. --- drivers/net/ieee802154/mrf24j40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 556151d..0ea2a5a 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; + INIT_COMPLETION(devrec->tx_complete); + /* Set TXNTRIG bit of TXNCON to send packet */ ret = read_short_reg(devrec, REG_TXNCON, ); if (ret) @@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); - INIT_COMPLETION(devrec->tx_complete); - /* Wait for the device to send the TX complete interrupt. */ ret = wait_for_completion_interruptible_timeout( >tx_complete, -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled
On 05/16/2013 05:34 PM, David Hauweele wrote: > I have seen the interrupt line going low forever with heavy traffic. Hi David, I've been doing some testing today and I can reproduce your issue if I ping -f both ways simultaneously (after about 3-4 minutes usually). I think it has more to do with the tx/rx mutual exclusion that you described in patch 1/1 than it does with the disable/enable IRQ. With respect to enable/disable irq, I did some checking of other similar drivers (non-ieee802154) and noticed that many of them use threaded interrupts. I think this may be closer to the right way to do it. Are you running a tickless kernel? What is your preemption model? What's your hardware platform? > The at86rf230 driver has two isr, one for edge-triggered and another > for level-triggered interrupt. The latter uses > disable_irq_nosync/enable_irq which makes sense for level-triggered > interrupt. Otherwise the interrupt would be triggered again and again > until the scheduled work read the status register. > > However I don't see the use of disable_irq/enable_irq with an > edge-triggered interrupt. Perhaps I'm missing something. Though I > don't see any harm using it anyway. My understanding based on the datasheet is that the interrupt can be asserted again as soon as INTSTAT is read (which is done in the workqueue). I guess even if it happens while the workqueue is running, it's ok. I think I had a flawed understanding of schedule_work() before, thinking that it would not schedule a work_struct it if the work_struct was running. > As you said the interrupt should > be delayed until enable_irq() is called. In particular when > CONFIG_HARDIRQS_SW_RESEND is set. I think I agree with you. I'll send you a patch to try with threaded interrupts to try. I'm trying to determine exactly what's the cause of the interrupt line being stuck low. Alan. > > 2013/5/14 Alan Ott : >> On 5/9/13 11:19 AM, David Hauweele wrote: >>> Disabling the interrupt line could miss an IRQ and leave the line into a >>> low state hence locking the driver. >>> >> Have you observed this? My understanding is that the interrupt won't be lost >> but instead delayed until enable_irq() is called. >> >> I got this pattern from the other 802.15.4 drivers. Perhaps my understanding >> is wrong. >> >> >> >>> Signed-off-by: David Hauweele >>> --- >>> drivers/net/ieee802154/mrf24j40.c |7 +-- >>> 1 file changed, 1 insertion(+), 6 deletions(-) >>> >>> diff --git a/drivers/net/ieee802154/mrf24j40.c >>> b/drivers/net/ieee802154/mrf24j40.c >>> index 1e3ddf3..6ef32f7 100644 >>> --- a/drivers/net/ieee802154/mrf24j40.c >>> +++ b/drivers/net/ieee802154/mrf24j40.c >>> @@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data) >>> { >>> struct mrf24j40 *devrec = data; >>> >>> - disable_irq_nosync(irq); >>> - >>> schedule_work(>irqwork); >>> >>> return IRQ_HANDLED; >>> @@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work) >>> /* Read the interrupt status */ >>> ret = read_short_reg(devrec, REG_INTSTAT, ); >>> if (ret) >>> - goto out; >>> + return; >>> >>> /* Check for TX complete */ >>> if (intstat & 0x1) >>> @@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work) >>> /* Check for Rx */ >>> if (intstat & 0x8) >>> schedule_work(>rxwork); >>> - >>> -out: >>> - enable_irq(devrec->spi->irq); >>> } >>> >>> static void mrf24j40_rxwork(struct work_struct *work) >>> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled
On 05/16/2013 05:34 PM, David Hauweele wrote: I have seen the interrupt line going low forever with heavy traffic. Hi David, I've been doing some testing today and I can reproduce your issue if I ping -f both ways simultaneously (after about 3-4 minutes usually). I think it has more to do with the tx/rx mutual exclusion that you described in patch 1/1 than it does with the disable/enable IRQ. With respect to enable/disable irq, I did some checking of other similar drivers (non-ieee802154) and noticed that many of them use threaded interrupts. I think this may be closer to the right way to do it. Are you running a tickless kernel? What is your preemption model? What's your hardware platform? The at86rf230 driver has two isr, one for edge-triggered and another for level-triggered interrupt. The latter uses disable_irq_nosync/enable_irq which makes sense for level-triggered interrupt. Otherwise the interrupt would be triggered again and again until the scheduled work read the status register. However I don't see the use of disable_irq/enable_irq with an edge-triggered interrupt. Perhaps I'm missing something. Though I don't see any harm using it anyway. My understanding based on the datasheet is that the interrupt can be asserted again as soon as INTSTAT is read (which is done in the workqueue). I guess even if it happens while the workqueue is running, it's ok. I think I had a flawed understanding of schedule_work() before, thinking that it would not schedule a work_struct it if the work_struct was running. As you said the interrupt should be delayed until enable_irq() is called. In particular when CONFIG_HARDIRQS_SW_RESEND is set. I think I agree with you. I'll send you a patch to try with threaded interrupts to try. I'm trying to determine exactly what's the cause of the interrupt line being stuck low. Alan. 2013/5/14 Alan Ott a...@signal11.us: On 5/9/13 11:19 AM, David Hauweele wrote: Disabling the interrupt line could miss an IRQ and leave the line into a low state hence locking the driver. Have you observed this? My understanding is that the interrupt won't be lost but instead delayed until enable_irq() is called. I got this pattern from the other 802.15.4 drivers. Perhaps my understanding is wrong. Signed-off-by: David Hauweele da...@hauweele.net --- drivers/net/ieee802154/mrf24j40.c |7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 1e3ddf3..6ef32f7 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - disable_irq_nosync(irq); - schedule_work(devrec-irqwork); return IRQ_HANDLED; @@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Read the interrupt status */ ret = read_short_reg(devrec, REG_INTSTAT, intstat); if (ret) - goto out; + return; /* Check for TX complete */ if (intstat 0x1) @@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Check for Rx */ if (intstat 0x8) schedule_work(devrec-rxwork); - -out: - enable_irq(devrec-spi-irq); } static void mrf24j40_rxwork(struct work_struct *work) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH testing] mrf24j40: Move INIT_COMPLETION to before the packet is sent
This avoids a race condition where the tx_complete interrupt could happen before the completion is initialized. --- David H, Try this out and see if it changes anything for you. I put a bunch of schedule_timeout()s in mrf24j40_tx() to basically guarantee that a reception will happen during mrf24j40_tx(), and it seems to work. It would not work before with said schedule_timeout() calls. Alan. --- drivers/net/ieee802154/mrf24j40.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 556151d..0ea2a5a 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; + INIT_COMPLETION(devrec-tx_complete); + /* Set TXNTRIG bit of TXNCON to send packet */ ret = read_short_reg(devrec, REG_TXNCON, val); if (ret) @@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); - INIT_COMPLETION(devrec-tx_complete); - /* Wait for the device to send the TX complete interrupt. */ ret = wait_for_completion_interruptible_timeout( devrec-tx_complete, -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled
On 5/9/13 11:19 AM, David Hauweele wrote: Disabling the interrupt line could miss an IRQ and leave the line into a low state hence locking the driver. Have you observed this? My understanding is that the interrupt won't be lost but instead delayed until enable_irq() is called. I got this pattern from the other 802.15.4 drivers. Perhaps my understanding is wrong. Signed-off-by: David Hauweele --- drivers/net/ieee802154/mrf24j40.c |7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 1e3ddf3..6ef32f7 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - disable_irq_nosync(irq); - schedule_work(>irqwork); return IRQ_HANDLED; @@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Read the interrupt status */ ret = read_short_reg(devrec, REG_INTSTAT, ); if (ret) - goto out; + return; /* Check for TX complete */ if (intstat & 0x1) @@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Check for Rx */ if (intstat & 0x8) schedule_work(>rxwork); - -out: - enable_irq(devrec->spi->irq); } static void mrf24j40_rxwork(struct work_struct *work) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 1/2] mrf24j40: Avoid transmission while receiving a frame
On 5/9/13 11:19 AM, David Hauweele wrote: The transceiver may fail under heavy traffic when a frame is transmitted while receiving another frame. This patch uses a mutex to separate the transmission and reception of frames along with a secondary working queue to avoid a deadlock while waiting for the transmission interrupt. Have you observed this failure? I have put this kind of thing in before (several times actually) but ultimately convinced myself each time that it was not necessary. I'm happy to be shown wrong. Signed-off-by: David Hauweele --- drivers/net/ieee802154/mrf24j40.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index ede3ce4..1e3ddf3 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -82,8 +82,10 @@ struct mrf24j40 { struct ieee802154_dev *dev; struct mutex buffer_mutex; /* only used to protect buf */ + struct mutex txrx_mutex; /* avoid transmission while receiving a frame */ struct completion tx_complete; struct work_struct irqwork; + struct work_struct rxwork; u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */ }; @@ -353,6 +355,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) /* Set TXNACKREQ if the ACK bit is set in the packet. */ if (skb->data[0] & IEEE802154_FC_ACK_REQ) val |= 0x4; + + mutex_lock(>txrx_mutex); write_short_reg(devrec, REG_TXNCON, val); INIT_COMPLETION(devrec->tx_complete); @@ -361,6 +365,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) ret = wait_for_completion_interruptible_timeout( >tx_complete, 5 * HZ); + + mutex_unlock(>txrx_mutex); + if (ret == -ERESTARTSYS) goto err; if (ret == 0) { @@ -535,6 +542,8 @@ static int mrf24j40_handle_rx(struct mrf24j40 *devrec) int ret = 0; struct sk_buff *skb; + mutex_lock(>txrx_mutex); + /* Turn off reception of packets off the air. This prevents the * device from overwriting the buffer while we're reading it. */ ret = read_short_reg(devrec, REG_BBREG1, ); @@ -575,6 +584,8 @@ out: val &= ~0x4; /* Clear RXDECINV */ write_short_reg(devrec, REG_BBREG1, val); + mutex_unlock(>txrx_mutex); + return ret; } @@ -616,12 +627,19 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Check for Rx */ if (intstat & 0x8) - mrf24j40_handle_rx(devrec); + schedule_work(>rxwork); mrf24f40_isrwork() is already called from a workqueue. out: enable_irq(devrec->spi->irq); } +static void mrf24j40_rxwork(struct work_struct *work) +{ + struct mrf24j40 *devrec = container_of(work, struct mrf24j40, rxwork); + + mrf24j40_handle_rx(devrec); +} + static int mrf24j40_probe(struct spi_device *spi) { int ret = -ENOMEM; @@ -648,8 +666,10 @@ static int mrf24j40_probe(struct spi_device *spi) spi->max_speed_hz = MAX_SPI_SPEED_HZ; mutex_init(>buffer_mutex); + mutex_init(>txrx_mutex); init_completion(>tx_complete); INIT_WORK(>irqwork, mrf24j40_isrwork); + INIT_WORK(>rxwork, mrf24j40_rxwork); devrec->spi = spi; spi_set_drvdata(spi, devrec); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 1/2] mrf24j40: Avoid transmission while receiving a frame
On 5/9/13 11:19 AM, David Hauweele wrote: The transceiver may fail under heavy traffic when a frame is transmitted while receiving another frame. This patch uses a mutex to separate the transmission and reception of frames along with a secondary working queue to avoid a deadlock while waiting for the transmission interrupt. Have you observed this failure? I have put this kind of thing in before (several times actually) but ultimately convinced myself each time that it was not necessary. I'm happy to be shown wrong. Signed-off-by: David Hauweele da...@hauweele.net --- drivers/net/ieee802154/mrf24j40.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index ede3ce4..1e3ddf3 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -82,8 +82,10 @@ struct mrf24j40 { struct ieee802154_dev *dev; struct mutex buffer_mutex; /* only used to protect buf */ + struct mutex txrx_mutex; /* avoid transmission while receiving a frame */ struct completion tx_complete; struct work_struct irqwork; + struct work_struct rxwork; u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */ }; @@ -353,6 +355,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) /* Set TXNACKREQ if the ACK bit is set in the packet. */ if (skb-data[0] IEEE802154_FC_ACK_REQ) val |= 0x4; + + mutex_lock(devrec-txrx_mutex); write_short_reg(devrec, REG_TXNCON, val); INIT_COMPLETION(devrec-tx_complete); @@ -361,6 +365,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) ret = wait_for_completion_interruptible_timeout( devrec-tx_complete, 5 * HZ); + + mutex_unlock(devrec-txrx_mutex); + if (ret == -ERESTARTSYS) goto err; if (ret == 0) { @@ -535,6 +542,8 @@ static int mrf24j40_handle_rx(struct mrf24j40 *devrec) int ret = 0; struct sk_buff *skb; + mutex_lock(devrec-txrx_mutex); + /* Turn off reception of packets off the air. This prevents the * device from overwriting the buffer while we're reading it. */ ret = read_short_reg(devrec, REG_BBREG1, val); @@ -575,6 +584,8 @@ out: val = ~0x4; /* Clear RXDECINV */ write_short_reg(devrec, REG_BBREG1, val); + mutex_unlock(devrec-txrx_mutex); + return ret; } @@ -616,12 +627,19 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Check for Rx */ if (intstat 0x8) - mrf24j40_handle_rx(devrec); + schedule_work(devrec-rxwork); mrf24f40_isrwork() is already called from a workqueue. out: enable_irq(devrec-spi-irq); } +static void mrf24j40_rxwork(struct work_struct *work) +{ + struct mrf24j40 *devrec = container_of(work, struct mrf24j40, rxwork); + + mrf24j40_handle_rx(devrec); +} + static int mrf24j40_probe(struct spi_device *spi) { int ret = -ENOMEM; @@ -648,8 +666,10 @@ static int mrf24j40_probe(struct spi_device *spi) spi-max_speed_hz = MAX_SPI_SPEED_HZ; mutex_init(devrec-buffer_mutex); + mutex_init(devrec-txrx_mutex); init_completion(devrec-tx_complete); INIT_WORK(devrec-irqwork, mrf24j40_isrwork); + INIT_WORK(devrec-rxwork, mrf24j40_rxwork); devrec-spi = spi; spi_set_drvdata(spi, devrec); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled
On 5/9/13 11:19 AM, David Hauweele wrote: Disabling the interrupt line could miss an IRQ and leave the line into a low state hence locking the driver. Have you observed this? My understanding is that the interrupt won't be lost but instead delayed until enable_irq() is called. I got this pattern from the other 802.15.4 drivers. Perhaps my understanding is wrong. Signed-off-by: David Hauweele da...@hauweele.net --- drivers/net/ieee802154/mrf24j40.c |7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 1e3ddf3..6ef32f7 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data) { struct mrf24j40 *devrec = data; - disable_irq_nosync(irq); - schedule_work(devrec-irqwork); return IRQ_HANDLED; @@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Read the interrupt status */ ret = read_short_reg(devrec, REG_INTSTAT, intstat); if (ret) - goto out; + return; /* Check for TX complete */ if (intstat 0x1) @@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work) /* Check for Rx */ if (intstat 0x8) schedule_work(devrec-rxwork); - -out: - enable_irq(devrec-spi-irq); } static void mrf24j40_rxwork(struct work_struct *work) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] mac802154: Keep track of the channel when changed
Two sections checked whether the current channel != the new channel without ever setting the current channel variables. 1. net/mac802154/tx.c: Prevent set_channel() from getting called every time a packet is sent. 2. net/mac802154/mib.c: Lock (pib_lock) accesses to current_channel and current_page and make sure they are updated when the channel has been changed. Signed-off-by: Alan Ott --- net/mac802154/mib.c | 12 +++- net/mac802154/tx.c | 3 +++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/net/mac802154/mib.c b/net/mac802154/mib.c index f03e55f..8ded97c 100644 --- a/net/mac802154/mib.c +++ b/net/mac802154/mib.c @@ -176,9 +176,15 @@ static void phy_chan_notify(struct work_struct *work) struct mac802154_sub_if_data *priv = netdev_priv(nw->dev); int res; + mutex_lock(>hw->phy->pib_lock); res = hw->ops->set_channel(>hw, priv->page, priv->chan); if (res) pr_debug("set_channel failed\n"); + else { + priv->hw->phy->current_channel = priv->chan; + priv->hw->phy->current_page = priv->page; + } + mutex_unlock(>hw->phy->pib_lock); kfree(nw); } @@ -195,8 +201,11 @@ void mac802154_dev_set_page_channel(struct net_device *dev, u8 page, u8 chan) priv->chan = chan; spin_unlock_bh(>mib_lock); + mutex_lock(>hw->phy->pib_lock); if (priv->hw->phy->current_channel != priv->chan || priv->hw->phy->current_page != priv->page) { + mutex_unlock(>hw->phy->pib_lock); + work = kzalloc(sizeof(*work), GFP_ATOMIC); if (!work) return; @@ -204,5 +213,6 @@ void mac802154_dev_set_page_channel(struct net_device *dev, u8 page, u8 chan) INIT_WORK(>work, phy_chan_notify); work->dev = dev; queue_work(priv->hw->dev_workqueue, >work); - } + } else + mutex_unlock(>hw->phy->pib_lock); } diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 3fd3e07..6d16473 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work) pr_debug("set_channel failed\n"); goto out; } + + xw->priv->phy->current_channel = xw->chan; + xw->priv->phy->current_page = xw->page; } res = xw->priv->ops->xmit(>priv->hw, xw->skb); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH] mac802154: Keep track of the channel when changed
On 04/05/2013 05:05 PM, Werner Almesberger wrote: > Alan Ott wrote: >> Prevent set_channel() from getting called every time a packet is sent. This >> looks like it was an oversight. > at86rf230.c and derivatives avoid this problem by setting > phy->current_* in the *_channel function. > > But I'd agree that it's nicer to do this in one place, not in > every driver. > > In case a driver had a weird failure mode in which it leaves the > original channel but only makes it halfway to the new channel, it > could still set phy->current_* and return an error. So there's no > loss of functionality with your change. Hmm... I just noticed that mib.c does the same thing (and doesn't set phy->current_*). I'll need to fix that one too (and resubmit). :( Alan. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mrf24j40: Enable link-layer acknowledgement and retry
On the MRF24J40, link-layer acknowledgment request and retry must be turned on explicitly for each packet. Turn this on in the hardware based on the FC_ACK_REQ bit being set in the packet. Also, now that failure to receive an ACK will cause the hardware to report failure of transmission, change the log level for this failure to debug level. Signed-off-by: Alan Ott --- drivers/net/ieee802154/mrf24j40.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 6481faf..556151d 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -25,6 +25,7 @@ #include #include #include +#include /* MRF24J40 Short Address Registers */ #define REG_RXMCR0x00 /* Receive MAC control */ @@ -349,7 +350,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; val |= 0x1; - val &= ~0x4; + /* Set TXNACKREQ if the ACK bit is set in the packet. */ + if (skb->data[0] & IEEE802154_FC_ACK_REQ) + val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); INIT_COMPLETION(devrec->tx_complete); @@ -371,7 +374,7 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; if (val & 0x1) { - dev_err(printdev(devrec), "Error Sending. Retry count exceeded\n"); + dev_dbg(printdev(devrec), "Error Sending. Retry count exceeded\n"); ret = -ECOMM; /* TODO: Better error code ? */ } else dev_dbg(printdev(devrec), "Packet Sent\n"); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mac802154: Keep track of the channel when changed
Prevent set_channel() from getting called every time a packet is sent. This looks like it was an oversight. Signed-off-by: Alan Ott --- net/mac802154/tx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 3fd3e07..6d16473 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work) pr_debug("set_channel failed\n"); goto out; } + + xw->priv->phy->current_channel = xw->chan; + xw->priv->phy->current_page = xw->page; } res = xw->priv->ops->xmit(>priv->hw, xw->skb); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mac802154: Keep track of the channel when changed
Prevent set_channel() from getting called every time a packet is sent. This looks like it was an oversight. Signed-off-by: Alan Ott a...@signal11.us --- net/mac802154/tx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 3fd3e07..6d16473 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work) pr_debug(set_channel failed\n); goto out; } + + xw-priv-phy-current_channel = xw-chan; + xw-priv-phy-current_page = xw-page; } res = xw-priv-ops-xmit(xw-priv-hw, xw-skb); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mrf24j40: Enable link-layer acknowledgement and retry
On the MRF24J40, link-layer acknowledgment request and retry must be turned on explicitly for each packet. Turn this on in the hardware based on the FC_ACK_REQ bit being set in the packet. Also, now that failure to receive an ACK will cause the hardware to report failure of transmission, change the log level for this failure to debug level. Signed-off-by: Alan Ott a...@signal11.us --- drivers/net/ieee802154/mrf24j40.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c index 6481faf..556151d 100644 --- a/drivers/net/ieee802154/mrf24j40.c +++ b/drivers/net/ieee802154/mrf24j40.c @@ -25,6 +25,7 @@ #include linux/pinctrl/consumer.h #include net/wpan-phy.h #include net/mac802154.h +#include net/ieee802154.h /* MRF24J40 Short Address Registers */ #define REG_RXMCR0x00 /* Receive MAC control */ @@ -349,7 +350,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; val |= 0x1; - val = ~0x4; + /* Set TXNACKREQ if the ACK bit is set in the packet. */ + if (skb-data[0] IEEE802154_FC_ACK_REQ) + val |= 0x4; write_short_reg(devrec, REG_TXNCON, val); INIT_COMPLETION(devrec-tx_complete); @@ -371,7 +374,7 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb) if (ret) goto err; if (val 0x1) { - dev_err(printdev(devrec), Error Sending. Retry count exceeded\n); + dev_dbg(printdev(devrec), Error Sending. Retry count exceeded\n); ret = -ECOMM; /* TODO: Better error code ? */ } else dev_dbg(printdev(devrec), Packet Sent\n); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH] mac802154: Keep track of the channel when changed
On 04/05/2013 05:05 PM, Werner Almesberger wrote: Alan Ott wrote: Prevent set_channel() from getting called every time a packet is sent. This looks like it was an oversight. at86rf230.c and derivatives avoid this problem by setting phy-current_* in the *_channel function. But I'd agree that it's nicer to do this in one place, not in every driver. In case a driver had a weird failure mode in which it leaves the original channel but only makes it halfway to the new channel, it could still set phy-current_* and return an error. So there's no loss of functionality with your change. Hmm... I just noticed that mib.c does the same thing (and doesn't set phy-current_*). I'll need to fix that one too (and resubmit). :( Alan. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] mac802154: Keep track of the channel when changed
Two sections checked whether the current channel != the new channel without ever setting the current channel variables. 1. net/mac802154/tx.c: Prevent set_channel() from getting called every time a packet is sent. 2. net/mac802154/mib.c: Lock (pib_lock) accesses to current_channel and current_page and make sure they are updated when the channel has been changed. Signed-off-by: Alan Ott a...@signal11.us --- net/mac802154/mib.c | 12 +++- net/mac802154/tx.c | 3 +++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/net/mac802154/mib.c b/net/mac802154/mib.c index f03e55f..8ded97c 100644 --- a/net/mac802154/mib.c +++ b/net/mac802154/mib.c @@ -176,9 +176,15 @@ static void phy_chan_notify(struct work_struct *work) struct mac802154_sub_if_data *priv = netdev_priv(nw-dev); int res; + mutex_lock(priv-hw-phy-pib_lock); res = hw-ops-set_channel(hw-hw, priv-page, priv-chan); if (res) pr_debug(set_channel failed\n); + else { + priv-hw-phy-current_channel = priv-chan; + priv-hw-phy-current_page = priv-page; + } + mutex_unlock(priv-hw-phy-pib_lock); kfree(nw); } @@ -195,8 +201,11 @@ void mac802154_dev_set_page_channel(struct net_device *dev, u8 page, u8 chan) priv-chan = chan; spin_unlock_bh(priv-mib_lock); + mutex_lock(priv-hw-phy-pib_lock); if (priv-hw-phy-current_channel != priv-chan || priv-hw-phy-current_page != priv-page) { + mutex_unlock(priv-hw-phy-pib_lock); + work = kzalloc(sizeof(*work), GFP_ATOMIC); if (!work) return; @@ -204,5 +213,6 @@ void mac802154_dev_set_page_channel(struct net_device *dev, u8 page, u8 chan) INIT_WORK(work-work, phy_chan_notify); work-dev = dev; queue_work(priv-hw-dev_workqueue, work-work); - } + } else + mutex_unlock(priv-hw-phy-pib_lock); } diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 3fd3e07..6d16473 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work) pr_debug(set_channel failed\n); goto out; } + + xw-priv-phy-current_channel = xw-chan; + xw-priv-phy-current_page = xw-page; } res = xw-priv-ops-xmit(xw-priv-hw, xw-skb); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 3/4] mac802154: Increase tx_buffer_len
Increase the buffer length from 10 to 300 packets. Consider that traffic on mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet) IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of IEEE 802.15.4 is 127). A 300-packet queue is really 20 full-length IPv6 packets. With a queue length of 10, an entire IPv6 packet was unable to get queued at one time, causing fragments to be dropped, and making reassembly impossible. Signed-off-by: Alan Ott --- net/mac802154/wpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c index 7d3f659..2ca2f4d 100644 --- a/net/mac802154/wpan.c +++ b/net/mac802154/wpan.c @@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev) dev->header_ops = _header_ops; dev->needed_tailroom= 2; /* FCS */ dev->mtu= IEEE802154_MTU; - dev->tx_queue_len = 10; + dev->tx_queue_len = 300; dev->type = ARPHRD_IEEE802154; dev->flags = IFF_NOARP | IFF_BROADCAST; dev->watchdog_timeo = 0; -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 0/4] 802.15.4 and 6LoWPAN Buffering Fixes
Version 2 of this patch series: Differences from v1: 1. Patches previously numbered 5 and 6 were squashed (to become current patch #4) at the request of Alexander Smirnov. 2. Current patch #2 had extraneous braces removed. 3. Current patch #1 was changed. It is now a patch to make mac802154 _not_ retry sending packets on failure. I believe this to be consistent with the 802.15.4 specification (Section 7.5.6.4.3 of IEEE 802.15.4-2006) Alan Ott (4): mac802154: Do not try to resend failed packets mac802154: Use netif flow control mac802154: Increase tx_buffer_len 6lowpan: handle dev_queue_xmit() error code properly net/ieee802154/6lowpan.c | 4 ++-- net/mac802154/mac802154.h | 2 -- net/mac802154/tx.c| 26 -- net/mac802154/wpan.c | 2 +- 4 files changed, 19 insertions(+), 15 deletions(-) -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/4] mac802154: Do not try to resend failed packets
When ops->xmit() fails, drop the packet. Devices which support hardware ack and retry (which include all devices currently supported by mainline), will automatically retry sending the packet (in the hardware) up to 3 times, per the 802.15.4 spec. There is no need, and it is incorrect to try to do it in mac802154. Signed-off-by: Alan Ott --- net/mac802154/mac802154.h | 2 -- net/mac802154/tx.c| 12 ++-- 2 files changed, 2 insertions(+), 12 deletions(-) diff --git a/net/mac802154/mac802154.h b/net/mac802154/mac802154.h index 21fa386..5c9e021 100644 --- a/net/mac802154/mac802154.h +++ b/net/mac802154/mac802154.h @@ -88,8 +88,6 @@ struct mac802154_sub_if_data { #define mac802154_to_priv(_hw) container_of(_hw, struct mac802154_priv, hw) -#define MAC802154_MAX_XMIT_ATTEMPTS3 - #define MAC802154_CHAN_NONE(~(u8)0) /* No channel is assigned */ extern struct ieee802154_reduced_mlme_ops mac802154_mlme_reduced; diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 4e09d07..7264874 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -39,7 +39,6 @@ struct xmit_work { struct mac802154_priv *priv; u8 chan; u8 page; - u8 xmit_attempts; }; static void mac802154_xmit_worker(struct work_struct *work) @@ -60,18 +59,12 @@ static void mac802154_xmit_worker(struct work_struct *work) } res = xw->priv->ops->xmit(>priv->hw, xw->skb); + if (res) + pr_debug("transmission failed\n"); out: mutex_unlock(>priv->phy->pib_lock); - if (res) { - if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) { - queue_work(xw->priv->dev_workqueue, >work); - return; - } else - pr_debug("transmission failed for %d times", -MAC802154_MAX_XMIT_ATTEMPTS); - } dev_kfree_skb(xw->skb); @@ -114,7 +107,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, work->priv = priv; work->page = page; work->chan = chan; - work->xmit_attempts = 0; queue_work(priv->dev_workqueue, >work); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 4/4] 6lowpan: handle dev_queue_xmit() error code properly
dev_queue_xmit() will return a positive value if the packet could not be queued, often because the real network device (in our case the mac802154 wpan device) has its queue stopped. lowpan_xmit() should handle the positive return code (for the debug statement) and return that value to the higher layer so the higher layer will retry sending the packet. Signed-off-by: Alan Ott --- net/ieee802154/6lowpan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index e1b4580..55e1fd5 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1139,10 +1139,10 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *dev) error: dev_kfree_skb(skb); out: - if (err < 0) + if (err) pr_debug("ERROR: xmit failed\n"); - return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK); + return (err < 0) ? NET_XMIT_DROP : err; } static struct wpan_phy *lowpan_get_phy(const struct net_device *dev) -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/4] mac802154: Use netif flow control
Use netif_stop_queue() and netif_wake_queue() to control the flow of packets to mac802154 devices. Since many IEEE 802.15.4 devices have no output buffer, and since the mac802154 xmit() function is designed to block, netif_stop_queue() is called after each packet. Signed-off-by: Alan Ott --- net/mac802154/tx.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 7264874..3fd3e07 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -25,6 +25,7 @@ #include #include +#include #include #include @@ -44,6 +45,7 @@ struct xmit_work { static void mac802154_xmit_worker(struct work_struct *work) { struct xmit_work *xw = container_of(work, struct xmit_work, work); + struct mac802154_sub_if_data *sdata; int res; mutex_lock(>priv->phy->pib_lock); @@ -65,6 +67,11 @@ static void mac802154_xmit_worker(struct work_struct *work) out: mutex_unlock(>priv->phy->pib_lock); + /* Restart the netif queue on each sub_if_data object. */ + rcu_read_lock(); + list_for_each_entry_rcu(sdata, >priv->slaves, list) + netif_wake_queue(sdata->dev); + rcu_read_unlock(); dev_kfree_skb(xw->skb); @@ -75,6 +82,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, u8 page, u8 chan) { struct xmit_work *work; + struct mac802154_sub_if_data *sdata; if (!(priv->phy->channels_supported[page] & (1 << chan))) { WARN_ON(1); @@ -102,6 +110,12 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, return NETDEV_TX_BUSY; } + /* Stop the netif queue on each sub_if_data object. */ + rcu_read_lock(); + list_for_each_entry_rcu(sdata, >slaves, list) + netif_stop_queue(sdata->dev); + rcu_read_unlock(); + INIT_WORK(>work, mac802154_xmit_worker); work->skb = skb; work->priv = priv; -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/4] mac802154: Use netif flow control
Use netif_stop_queue() and netif_wake_queue() to control the flow of packets to mac802154 devices. Since many IEEE 802.15.4 devices have no output buffer, and since the mac802154 xmit() function is designed to block, netif_stop_queue() is called after each packet. Signed-off-by: Alan Ott a...@signal11.us --- net/mac802154/tx.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 7264874..3fd3e07 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -25,6 +25,7 @@ #include linux/if_arp.h #include linux/crc-ccitt.h +#include net/ieee802154_netdev.h #include net/mac802154.h #include net/wpan-phy.h @@ -44,6 +45,7 @@ struct xmit_work { static void mac802154_xmit_worker(struct work_struct *work) { struct xmit_work *xw = container_of(work, struct xmit_work, work); + struct mac802154_sub_if_data *sdata; int res; mutex_lock(xw-priv-phy-pib_lock); @@ -65,6 +67,11 @@ static void mac802154_xmit_worker(struct work_struct *work) out: mutex_unlock(xw-priv-phy-pib_lock); + /* Restart the netif queue on each sub_if_data object. */ + rcu_read_lock(); + list_for_each_entry_rcu(sdata, xw-priv-slaves, list) + netif_wake_queue(sdata-dev); + rcu_read_unlock(); dev_kfree_skb(xw-skb); @@ -75,6 +82,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, u8 page, u8 chan) { struct xmit_work *work; + struct mac802154_sub_if_data *sdata; if (!(priv-phy-channels_supported[page] (1 chan))) { WARN_ON(1); @@ -102,6 +110,12 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, return NETDEV_TX_BUSY; } + /* Stop the netif queue on each sub_if_data object. */ + rcu_read_lock(); + list_for_each_entry_rcu(sdata, priv-slaves, list) + netif_stop_queue(sdata-dev); + rcu_read_unlock(); + INIT_WORK(work-work, mac802154_xmit_worker); work-skb = skb; work-priv = priv; -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/4] mac802154: Do not try to resend failed packets
When ops-xmit() fails, drop the packet. Devices which support hardware ack and retry (which include all devices currently supported by mainline), will automatically retry sending the packet (in the hardware) up to 3 times, per the 802.15.4 spec. There is no need, and it is incorrect to try to do it in mac802154. Signed-off-by: Alan Ott a...@signal11.us --- net/mac802154/mac802154.h | 2 -- net/mac802154/tx.c| 12 ++-- 2 files changed, 2 insertions(+), 12 deletions(-) diff --git a/net/mac802154/mac802154.h b/net/mac802154/mac802154.h index 21fa386..5c9e021 100644 --- a/net/mac802154/mac802154.h +++ b/net/mac802154/mac802154.h @@ -88,8 +88,6 @@ struct mac802154_sub_if_data { #define mac802154_to_priv(_hw) container_of(_hw, struct mac802154_priv, hw) -#define MAC802154_MAX_XMIT_ATTEMPTS3 - #define MAC802154_CHAN_NONE(~(u8)0) /* No channel is assigned */ extern struct ieee802154_reduced_mlme_ops mac802154_mlme_reduced; diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 4e09d07..7264874 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -39,7 +39,6 @@ struct xmit_work { struct mac802154_priv *priv; u8 chan; u8 page; - u8 xmit_attempts; }; static void mac802154_xmit_worker(struct work_struct *work) @@ -60,18 +59,12 @@ static void mac802154_xmit_worker(struct work_struct *work) } res = xw-priv-ops-xmit(xw-priv-hw, xw-skb); + if (res) + pr_debug(transmission failed\n); out: mutex_unlock(xw-priv-phy-pib_lock); - if (res) { - if (xw-xmit_attempts++ MAC802154_MAX_XMIT_ATTEMPTS) { - queue_work(xw-priv-dev_workqueue, xw-work); - return; - } else - pr_debug(transmission failed for %d times, -MAC802154_MAX_XMIT_ATTEMPTS); - } dev_kfree_skb(xw-skb); @@ -114,7 +107,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, work-priv = priv; work-page = page; work-chan = chan; - work-xmit_attempts = 0; queue_work(priv-dev_workqueue, work-work); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 4/4] 6lowpan: handle dev_queue_xmit() error code properly
dev_queue_xmit() will return a positive value if the packet could not be queued, often because the real network device (in our case the mac802154 wpan device) has its queue stopped. lowpan_xmit() should handle the positive return code (for the debug statement) and return that value to the higher layer so the higher layer will retry sending the packet. Signed-off-by: Alan Ott a...@signal11.us --- net/ieee802154/6lowpan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index e1b4580..55e1fd5 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1139,10 +1139,10 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *dev) error: dev_kfree_skb(skb); out: - if (err 0) + if (err) pr_debug(ERROR: xmit failed\n); - return (err 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK); + return (err 0) ? NET_XMIT_DROP : err; } static struct wpan_phy *lowpan_get_phy(const struct net_device *dev) -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 0/4] 802.15.4 and 6LoWPAN Buffering Fixes
Version 2 of this patch series: Differences from v1: 1. Patches previously numbered 5 and 6 were squashed (to become current patch #4) at the request of Alexander Smirnov. 2. Current patch #2 had extraneous braces removed. 3. Current patch #1 was changed. It is now a patch to make mac802154 _not_ retry sending packets on failure. I believe this to be consistent with the 802.15.4 specification (Section 7.5.6.4.3 of IEEE 802.15.4-2006) Alan Ott (4): mac802154: Do not try to resend failed packets mac802154: Use netif flow control mac802154: Increase tx_buffer_len 6lowpan: handle dev_queue_xmit() error code properly net/ieee802154/6lowpan.c | 4 ++-- net/mac802154/mac802154.h | 2 -- net/mac802154/tx.c| 26 -- net/mac802154/wpan.c | 2 +- 4 files changed, 19 insertions(+), 15 deletions(-) -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 3/4] mac802154: Increase tx_buffer_len
Increase the buffer length from 10 to 300 packets. Consider that traffic on mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet) IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of IEEE 802.15.4 is 127). A 300-packet queue is really 20 full-length IPv6 packets. With a queue length of 10, an entire IPv6 packet was unable to get queued at one time, causing fragments to be dropped, and making reassembly impossible. Signed-off-by: Alan Ott a...@signal11.us --- net/mac802154/wpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c index 7d3f659..2ca2f4d 100644 --- a/net/mac802154/wpan.c +++ b/net/mac802154/wpan.c @@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev) dev-header_ops = mac802154_header_ops; dev-needed_tailroom= 2; /* FCS */ dev-mtu= IEEE802154_MTU; - dev-tx_queue_len = 10; + dev-tx_queue_len = 300; dev-type = ARPHRD_IEEE802154; dev-flags = IFF_NOARP | IFF_BROADCAST; dev-watchdog_timeo = 0; -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 10:30 PM, David Miller wrote: > From: Alan Ott > Date: Tue, 02 Apr 2013 22:25:28 -0400 > >> The workqueue in mac802154 is only needed because the current mac802154 >> xmit() function is designed to be blocking and synchronous. >> >> Prior to my patch (#3/6), that very same workqueue would actually queue >> up packets (without bound). That's what my patch fixes. >> >> The workqueue in mac802154 also serializes the access to the device for >> other functions like setting the channel, ensuring that in the driver >> code, one doesn't have to mutex everything. I'm not sure if that's the >> "right" way to do it, but that's the way it was when I got here. > This is entirely duplicating existing facilities. > > Your desire to allow blockability during xmit() on the basis of mutual > exclusion is not well founded. I'm not sure it's my desire, but rather a statement of the way it currently is. To be clear, .ndo_start_xmit() does not block, but queues a workqueue item which then calls ieee802154_ops->xmit() which does block. This patch series centers around putting netif_stop_queue() and netif_wake_queue() in the mac802154 layer. I've sent emails about this before[1], and gotten no real suggestions about the issue, so I proceeded with Solution #1 (as described at [1]). If you want to skip this and go straight to solution #2, then let's talk about what that might look like. I still think though, that there is benefit in getting solution #1 in because it fixes some current usability problems (including the buffer (workqueue) growing without bound). All that said, I'm not sure I've answered your question or concern. Please let me know if I'm still not getting it. Alan. [1] http://thread.gmane.org/gmane.linux.network/242495/focus=262869 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 10:03 PM, David Miller wrote: > From: Alan Ott > Date: Tue, 02 Apr 2013 21:59:37 -0400 > >> On 04/02/2013 09:56 PM, David Miller wrote: >>> From: Alan Ott >>> Date: Tue, 02 Apr 2013 21:24:59 -0400 >>> >>>> I like it for a couple of reasons. >>>> 1. Most supported devices have only single packet output buffer, so >>>> blocking in the driver is the most straight-forward way to handle it. >>>> The alternative is to make each driver have a workqueue for xmit() (to >>>> lift the blocking out from atomic context). This makes each driver simpler. >>>> >>>> 2. All of the flow control can be handled one time in the mac802154 layer. >>> We have a perfectly working flow control mechanism in the generic >>> networking queuing layer. Please use it instead of inventing things. >> I'm pretty sure that's what I'm doing in [1]. When I say "flow control >> can be handled," I mean managing calls to netif_stop_queue() and >> netif_wake_queue(). Is there something else I should be doing instead? > Then you shouldn't need workqueues if the generic netdev facilities > can do the flow control properly. The workqueue in mac802154 is only needed because the current mac802154 xmit() function is designed to be blocking and synchronous. Prior to my patch (#3/6), that very same workqueue would actually queue up packets (without bound). That's what my patch fixes. The workqueue in mac802154 also serializes the access to the device for other functions like setting the channel, ensuring that in the driver code, one doesn't have to mutex everything. I'm not sure if that's the "right" way to do it, but that's the way it was when I got here. > There are several ethernet devices that have a single transmit buffer > and function just fine, and optimally, solely using the transmit queue > stop/start/wake infrastructure. Yes, that does work. enc28j60 works like this. However, since it's an SPI device (and can sleep), its ndo_start_xmit() _does_ use a workqueue (to remove it from atomic context), the same as ours (mac802154) does. The difference is that we do it at the mac802154 layer, while Ethernet devices do it in the driver. I guess one advantage to the way it currently is in mac802154, with the synchronous xmit(), is that a return code can be had from the driver for each packet. With my new idea that we don't need to retransmit on failure, I'm not sure we need this return code at all. Alan. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 09:56 PM, David Miller wrote: > From: Alan Ott > Date: Tue, 02 Apr 2013 21:24:59 -0400 > >> I like it for a couple of reasons. >> 1. Most supported devices have only single packet output buffer, so >> blocking in the driver is the most straight-forward way to handle it. >> The alternative is to make each driver have a workqueue for xmit() (to >> lift the blocking out from atomic context). This makes each driver simpler. >> >> 2. All of the flow control can be handled one time in the mac802154 layer. > We have a perfectly working flow control mechanism in the generic > networking queuing layer. Please use it instead of inventing things. I'm pretty sure that's what I'm doing in [1]. When I say "flow control can be handled," I mean managing calls to netif_stop_queue() and netif_wake_queue(). Is there something else I should be doing instead? > If it does not meet your needs, fix it, rather than go off and do > your own thing. That way everyone benfits, not just you. Fully agreed. Alan. [1] http://www.spinics.net/lists/netdev/msg231483.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 07:13 PM, Werner Almesberger wrote: > Alan Ott wrote: >> it's now my opinion that we should _not_ try to retransmit at >> all in mac802154/tx.c. > I think the currently blocking workqueue design is ugly and > quite contrary to how most the rest of the stack works. So > anything that kills it has my blessing :-) I like it for a couple of reasons. 1. Most supported devices have only single packet output buffer, so blocking in the driver is the most straight-forward way to handle it. The alternative is to make each driver have a workqueue for xmit() (to lift the blocking out from atomic context). This makes each driver simpler. 2. All of the flow control can be handled one time in the mac802154 layer. > I do wonder though why it was done like this in the first place. > Just for convenience ? > > If we want to move towards an asynchronous interface, it could > exist in parallel with the current one. That way, drivers could > be migrated one by one. Maybe at some point this will be done. Right now we have a ton of pressing issues (in my opinion). > Having said that, the errors you get there may not be failed > single transmissions on the air but some form of congestion in > the driver or a problem with the device. But I don't think > that's a valid reason for retrying the transmission at that > level. Agreed. Like I said, I'm now of the opinion that the mac802154 layer should _not_ retry at all. Alan. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 05:28 PM, Alan Ott wrote: > According to 7.5.6.5 of IEEE 802.15.4-2003, if the retransmission fails > more than aMaxFrameRetries (3) times, it is assumed that it has failed. > Since some transceivers (and I would assume most if not all) do this in > hardware, it's now my opinion that we should _not_ try to retransmit at > all in mac802154/tx.c. > > For a driver for a device which _doesn't_ do retransmission in hardware, > maybe it should be handled by that driver then. It's worth noting that the mrf24j40, the at86rf230, and the cc2420 support retransmission in hardware (the first two are the only two in mainline). Alan. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 04:28 PM, Alan Ott wrote: > On 04/02/2013 03:11 PM, Alexander Smirnov wrote: >> 2013/4/2 Alan Ott mailto:a...@signal11.us>> >> >> When ops->xmit() fails, immediately retry. Previously the packet was >> sent >> to the back of the workqueue. >> >> Signed-off-by: Alan Ott mailto:a...@signal11.us>> >> --- >> net/mac802154/tx.c | 17 - >> 1 file changed, 8 insertions(+), 9 deletions(-) >> >> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c >> index 4e09d07..fbf937c 100644 >> --- a/net/mac802154/tx.c >> +++ b/net/mac802154/tx.c >> @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct >> work_struct *work) >> } >> } >> >> - res = xw->priv->ops->xmit(>priv->hw, xw->skb); >> + do { >> + res = xw->priv->ops->xmit(>priv->hw, xw->skb); >> + if (res && ++xw->xmit_attempts >= >> MAC802154_MAX_XMIT_ATTEMPTS) { >> + pr_debug("transmission failed for %d times", >> +MAC802154_MAX_XMIT_ATTEMPTS); >> + break; >> + } >> + } while (res); >> >> >> >> IIRC this 802.15.4 stack uses single-thread-work-queue and all RX/TX >> are performed by using it. > Hi Alexander, > > Yes, that is true. As is currently implemented, the driver xmit() > functions are called from a workqueue and block until the packet is sent. > > >> Doing TX retry in the way you proposed - >> it's possible that you will block other packets pending in this >> queue. > Yes. Since sending data is a blocking operation, any time spent sending > (or re-sending) is blocking. > > As it was before this patch series, with the buffer (workqueue) growing > arbitrarily large, doing retry by putting a packet at the end of the > workqueue was largely useless because by the time it came to retry it, > any state associated with it (with respect to fragmentation/reassembly) > had expired. > > Keep in mind that with the netif stop/wake code, putting retries at the > end of the workqueue or doing them immediately is basically the same > thing, since the workqueue is no longer the packet queue (and will > ideally only have 0 or 1 packets in it). The workqueue (with these > patches) only serves to lift the driver xmit() calls out of atomic > context, allowing them to block. > > However, it is easy to envision one process clogging up the works with > retries by sending many packets to an unavailable address. > > What do you recommend doing here instead? According to 7.5.6.5 of IEEE 802.15.4-2003, if the retransmission fails more than aMaxFrameRetries (3) times, it is assumed that it has failed. Since some transceivers (and I would assume most if not all) do this in hardware, it's now my opinion that we should _not_ try to retransmit at all in mac802154/tx.c. For a driver for a device which _doesn't_ do retransmission in hardware, maybe it should be handled by that driver then. > >> Despite on Linux is already 'slow' system to provide >> real-time for specific 802.15.4 features, I think it's not a good >> idea to increase nodes communication latency. > With the transmit buffer length increased (and actually being used), > maybe the packets with realtime requirements can be given a higher > priority to deal with these requirements. > > Alan. > >> >> out: >> mutex_unlock(>priv->phy->pib_lock); >> >> - if (res) { >> - if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) { >> - queue_work(xw->priv->dev_workqueue, >work); >> - return; >> - } else >> - pr_debug("transmission failed for %d times", >> -MAC802154_MAX_XMIT_ATTEMPTS); >> - } >> >> dev_kfree_skb(xw->skb); >> >> -- >> 1.7.11.2 >> >> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets
On 04/02/2013 03:11 PM, Alexander Smirnov wrote: > 2013/4/2 Alan Ott mailto:a...@signal11.us>> > > When ops->xmit() fails, immediately retry. Previously the packet was > sent > to the back of the workqueue. > > Signed-off-by: Alan Ott mailto:a...@signal11.us>> > --- > net/mac802154/tx.c | 17 - > 1 file changed, 8 insertions(+), 9 deletions(-) > > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c > index 4e09d07..fbf937c 100644 > --- a/net/mac802154/tx.c > +++ b/net/mac802154/tx.c > @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct > work_struct *work) > } > } > > - res = xw->priv->ops->xmit(>priv->hw, xw->skb); > + do { > + res = xw->priv->ops->xmit(>priv->hw, xw->skb); > + if (res && ++xw->xmit_attempts >= > MAC802154_MAX_XMIT_ATTEMPTS) { > + pr_debug("transmission failed for %d times", > +MAC802154_MAX_XMIT_ATTEMPTS); > + break; > + } > + } while (res); > > > > IIRC this 802.15.4 stack uses single-thread-work-queue and all RX/TX > are performed by using it. Hi Alexander, Yes, that is true. As is currently implemented, the driver xmit() functions are called from a workqueue and block until the packet is sent. > Doing TX retry in the way you proposed - > it's possible that you will block other packets pending in this > queue. Yes. Since sending data is a blocking operation, any time spent sending (or re-sending) is blocking. As it was before this patch series, with the buffer (workqueue) growing arbitrarily large, doing retry by putting a packet at the end of the workqueue was largely useless because by the time it came to retry it, any state associated with it (with respect to fragmentation/reassembly) had expired. Keep in mind that with the netif stop/wake code, putting retries at the end of the workqueue or doing them immediately is basically the same thing, since the workqueue is no longer the packet queue (and will ideally only have 0 or 1 packets in it). The workqueue (with these patches) only serves to lift the driver xmit() calls out of atomic context, allowing them to block. However, it is easy to envision one process clogging up the works with retries by sending many packets to an unavailable address. What do you recommend doing here instead? > Despite on Linux is already 'slow' system to provide > real-time for specific 802.15.4 features, I think it's not a good > idea to increase nodes communication latency. With the transmit buffer length increased (and actually being used), maybe the packets with realtime requirements can be given a higher priority to deal with these requirements. Alan. > > > out: > mutex_unlock(>priv->phy->pib_lock); > > - if (res) { > - if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) { > - queue_work(xw->priv->dev_workqueue, >work); > - return; > - } else > - pr_debug("transmission failed for %d times", > -MAC802154_MAX_XMIT_ATTEMPTS); > - } > > dev_kfree_skb(xw->skb); > > -- > 1.7.11.2 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/6] mac802154: Use netif flow control
Use netif_stop_queue() and netif_wake_queue() to control the flow of packets to mac802154 devices. Since many IEEE 802.15.4 devices have no output buffer, and since the mac802154 xmit() function is designed to block, netif_stop_queue() is called after each packet. Signed-off-by: Alan Ott --- net/mac802154/tx.c | 16 1 file changed, 16 insertions(+) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index a248246..fe3e02c 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -25,6 +25,7 @@ #include #include +#include #include #include @@ -45,6 +46,7 @@ static void mac802154_xmit_worker(struct work_struct *work) { struct xmit_work *xw = container_of(work, struct xmit_work, work); u8 xmit_attempts = 0; + struct mac802154_sub_if_data *sdata; int res; mutex_lock(>priv->phy->pib_lock); @@ -71,6 +73,12 @@ static void mac802154_xmit_worker(struct work_struct *work) out: mutex_unlock(>priv->phy->pib_lock); + /* Restart the netif queue on each sub_if_data object. */ + rcu_read_lock(); + list_for_each_entry_rcu(sdata, >priv->slaves, list) { + netif_wake_queue(sdata->dev); + } + rcu_read_unlock(); dev_kfree_skb(xw->skb); @@ -81,6 +89,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, u8 page, u8 chan) { struct xmit_work *work; + struct mac802154_sub_if_data *sdata; if (!(priv->phy->channels_supported[page] & (1 << chan))) { WARN_ON(1); @@ -108,6 +117,13 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, return NETDEV_TX_BUSY; } + /* Stop the netif queue on each sub_if_data object. */ + rcu_read_lock(); + list_for_each_entry_rcu(sdata, >slaves, list) { + netif_stop_queue(sdata->dev); + } + rcu_read_unlock(); + INIT_WORK(>work, mac802154_xmit_worker); work->skb = skb; work->priv = priv; -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/6] mac802154: Move xmit_attemps to stack
There's no reason to have it in the work struct anymore. Signed-off-by: Alan Ott --- net/mac802154/tx.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index fbf937c..a248246 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -39,12 +39,12 @@ struct xmit_work { struct mac802154_priv *priv; u8 chan; u8 page; - u8 xmit_attempts; }; static void mac802154_xmit_worker(struct work_struct *work) { struct xmit_work *xw = container_of(work, struct xmit_work, work); + u8 xmit_attempts = 0; int res; mutex_lock(>priv->phy->pib_lock); @@ -61,7 +61,7 @@ static void mac802154_xmit_worker(struct work_struct *work) do { res = xw->priv->ops->xmit(>priv->hw, xw->skb); - if (res && ++xw->xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) { + if (res && ++xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) { pr_debug("transmission failed for %d times", MAC802154_MAX_XMIT_ATTEMPTS); break; @@ -113,7 +113,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb, work->priv = priv; work->page = page; work->chan = chan; - work->xmit_attempts = 0; queue_work(priv->dev_workqueue, >work); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/6] mac802154: Increase tx_buffer_len
Increase the buffer length from 10 to 300 packets. Consider that traffic on mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet) IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of IEEE 802.15.4 is 127). A 300-packet queue is really 20 full-length IPv6 packets. With a queue length of 10, an entire IPv6 packet was unable to get queued at one time, causing fragments to be dropped, and making reassembly impossible. Signed-off-by: Alan Ott --- net/mac802154/wpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c index 7d3f659..2ca2f4d 100644 --- a/net/mac802154/wpan.c +++ b/net/mac802154/wpan.c @@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev) dev->header_ops = _header_ops; dev->needed_tailroom= 2; /* FCS */ dev->mtu= IEEE802154_MTU; - dev->tx_queue_len = 10; + dev->tx_queue_len = 300; dev->type = ARPHRD_IEEE802154; dev->flags = IFF_NOARP | IFF_BROADCAST; dev->watchdog_timeo = 0; -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/6] 802.15.4 and 6LoWPAN Buffering Fixes
These patches fix an issue in the 802.15.4 code where the output buffer (which prior to this patchset is just a workqueue) can grow to an arbitrary size. This is basically fixed as follows: 1. Use netif_stop_queue() and netif_wake_queue() to stop and start the transmit queue, preventing packets from piling up without bound on the mac802154 workqueue. 2. Increase the default tx_buffer_len for mac802154 (wpan) devices from 10 to 300, enabling Qdisc to work properly on the transmit queue. Additionally the following related issues are fixed: 1. Handle dev_queue_xmit() return values properly in the 6LoWPAN code (and return the proper errors to the higher layers). This will cause the higher layers to retry later if the mac802154 queue is full. 2. Fix the retry of transmit failures in mac802154. Alan Ott (6): mac802154: Immediately retry sending failed packets mac802154: Move xmit_attemps to stack mac802154: Use netif flow control mac802154: Increase tx_buffer_len 6lowpan: handle dev_queue_xmit error code properly 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit() net/ieee802154/6lowpan.c | 4 ++-- net/mac802154/tx.c | 34 -- net/mac802154/wpan.c | 2 +- 3 files changed, 27 insertions(+), 13 deletions(-) -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 6/6] 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()
dev_queue_xmit() will return a positive value if the packet could not be queued, often because the real network device (in our case the mac802154 wpan device) has its queue stopped. lowpan_xmit() should return that value to the higher layer so the higher layer will retry sending the packet. Signed-off-by: Alan Ott --- net/ieee802154/6lowpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index a68c792..55e1fd5 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1142,7 +1142,7 @@ out: if (err) pr_debug("ERROR: xmit failed\n"); - return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK); + return (err < 0) ? NET_XMIT_DROP : err; } static struct wpan_phy *lowpan_get_phy(const struct net_device *dev) -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/6] mac802154: Immediately retry sending failed packets
When ops->xmit() fails, immediately retry. Previously the packet was sent to the back of the workqueue. Signed-off-by: Alan Ott --- net/mac802154/tx.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 4e09d07..fbf937c 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct work_struct *work) } } - res = xw->priv->ops->xmit(>priv->hw, xw->skb); + do { + res = xw->priv->ops->xmit(>priv->hw, xw->skb); + if (res && ++xw->xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) { + pr_debug("transmission failed for %d times", +MAC802154_MAX_XMIT_ATTEMPTS); + break; + } + } while (res); out: mutex_unlock(>priv->phy->pib_lock); - if (res) { - if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) { - queue_work(xw->priv->dev_workqueue, >work); - return; - } else - pr_debug("transmission failed for %d times", -MAC802154_MAX_XMIT_ATTEMPTS); - } dev_kfree_skb(xw->skb); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 5/6] 6lowpan: handle dev_queue_xmit error code properly
dev_queue_xmit() can return positive error codes, so check for nonzero. Signed-off-by: Alan Ott --- net/ieee802154/6lowpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index e1b4580..a68c792 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1139,7 +1139,7 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *dev) error: dev_kfree_skb(skb); out: - if (err < 0) + if (err) pr_debug("ERROR: xmit failed\n"); return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK); -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 5/6] 6lowpan: handle dev_queue_xmit error code properly
dev_queue_xmit() can return positive error codes, so check for nonzero. Signed-off-by: Alan Ott a...@signal11.us --- net/ieee802154/6lowpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index e1b4580..a68c792 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1139,7 +1139,7 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *dev) error: dev_kfree_skb(skb); out: - if (err 0) + if (err) pr_debug(ERROR: xmit failed\n); return (err 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/6] mac802154: Immediately retry sending failed packets
When ops-xmit() fails, immediately retry. Previously the packet was sent to the back of the workqueue. Signed-off-by: Alan Ott a...@signal11.us --- net/mac802154/tx.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 4e09d07..fbf937c 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct work_struct *work) } } - res = xw-priv-ops-xmit(xw-priv-hw, xw-skb); + do { + res = xw-priv-ops-xmit(xw-priv-hw, xw-skb); + if (res ++xw-xmit_attempts = MAC802154_MAX_XMIT_ATTEMPTS) { + pr_debug(transmission failed for %d times, +MAC802154_MAX_XMIT_ATTEMPTS); + break; + } + } while (res); out: mutex_unlock(xw-priv-phy-pib_lock); - if (res) { - if (xw-xmit_attempts++ MAC802154_MAX_XMIT_ATTEMPTS) { - queue_work(xw-priv-dev_workqueue, xw-work); - return; - } else - pr_debug(transmission failed for %d times, -MAC802154_MAX_XMIT_ATTEMPTS); - } dev_kfree_skb(xw-skb); -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 6/6] 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()
dev_queue_xmit() will return a positive value if the packet could not be queued, often because the real network device (in our case the mac802154 wpan device) has its queue stopped. lowpan_xmit() should return that value to the higher layer so the higher layer will retry sending the packet. Signed-off-by: Alan Ott a...@signal11.us --- net/ieee802154/6lowpan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c index a68c792..55e1fd5 100644 --- a/net/ieee802154/6lowpan.c +++ b/net/ieee802154/6lowpan.c @@ -1142,7 +1142,7 @@ out: if (err) pr_debug(ERROR: xmit failed\n); - return (err 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK); + return (err 0) ? NET_XMIT_DROP : err; } static struct wpan_phy *lowpan_get_phy(const struct net_device *dev) -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/6] 802.15.4 and 6LoWPAN Buffering Fixes
These patches fix an issue in the 802.15.4 code where the output buffer (which prior to this patchset is just a workqueue) can grow to an arbitrary size. This is basically fixed as follows: 1. Use netif_stop_queue() and netif_wake_queue() to stop and start the transmit queue, preventing packets from piling up without bound on the mac802154 workqueue. 2. Increase the default tx_buffer_len for mac802154 (wpan) devices from 10 to 300, enabling Qdisc to work properly on the transmit queue. Additionally the following related issues are fixed: 1. Handle dev_queue_xmit() return values properly in the 6LoWPAN code (and return the proper errors to the higher layers). This will cause the higher layers to retry later if the mac802154 queue is full. 2. Fix the retry of transmit failures in mac802154. Alan Ott (6): mac802154: Immediately retry sending failed packets mac802154: Move xmit_attemps to stack mac802154: Use netif flow control mac802154: Increase tx_buffer_len 6lowpan: handle dev_queue_xmit error code properly 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit() net/ieee802154/6lowpan.c | 4 ++-- net/mac802154/tx.c | 34 -- net/mac802154/wpan.c | 2 +- 3 files changed, 27 insertions(+), 13 deletions(-) -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/