Re: [PATCH] ieee802154: mrf24j40: fix incorrect mask in mrf24j40_stop

2017-11-06 Thread Alan Ott

On 11/06/2017 06:13 AM, Stefan Schmidt wrote:

Hello Alan.

On 10/31/2017 07:31 AM, Gustavo A. R. Silva wrote:

It seems that this is a copy/paste error and the proper bit masking is:
BIT_TXNIE | BIT_RXIE

This issue was detected with the help of Coccinelle.

Reported-by: Julia Lawall <julia.law...@lip6.fr>
Signed-off-by: Gustavo A. R. Silva <garsi...@embeddedor.com>
Fixes: 7d840545e5b9 ("mrf24j40: replace magic numbers")
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index ee7084b..cf4788d 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -635,7 +635,7 @@ static void mrf24j40_stop(struct ieee802154_hw *hw)

/* Set TXNIE and RXIE. Disable Interrupts */
regmap_update_bits(devrec->regmap_short, REG_INTCON,
-  BIT_TXNIE | BIT_TXNIE, BIT_TXNIE | BIT_TXNIE);
+  BIT_TXNIE | BIT_RXIE, BIT_TXNIE | BIT_RXIE);
 }

 static int mrf24j40_set_channel(struct ieee802154_hw *hw, u8 page, u8 channel)



Could you review this and give me your ack when you are happy so I can apply it 
to my tree?



Acked-by: Alan Ott <a...@signal11.us>



Re: [PATCH] ieee802154: mrf24j40: fix incorrect mask in mrf24j40_stop

2017-11-06 Thread Alan Ott

On 11/06/2017 06:13 AM, Stefan Schmidt wrote:

Hello Alan.

On 10/31/2017 07:31 AM, Gustavo A. R. Silva wrote:

It seems that this is a copy/paste error and the proper bit masking is:
BIT_TXNIE | BIT_RXIE

This issue was detected with the help of Coccinelle.

Reported-by: Julia Lawall 
Signed-off-by: Gustavo A. R. Silva 
Fixes: 7d840545e5b9 ("mrf24j40: replace magic numbers")
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index ee7084b..cf4788d 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -635,7 +635,7 @@ static void mrf24j40_stop(struct ieee802154_hw *hw)

/* Set TXNIE and RXIE. Disable Interrupts */
regmap_update_bits(devrec->regmap_short, REG_INTCON,
-  BIT_TXNIE | BIT_TXNIE, BIT_TXNIE | BIT_TXNIE);
+  BIT_TXNIE | BIT_RXIE, BIT_TXNIE | BIT_RXIE);
 }

 static int mrf24j40_set_channel(struct ieee802154_hw *hw, u8 page, u8 channel)



Could you review this and give me your ack when you are happy so I can apply it 
to my tree?



Acked-by: Alan Ott 



Re: [PATCH v2] mrf24j40: Fix en error handling path in 'mrf24j40_probe()'

2017-07-10 Thread Alan Ott

On 07/08/2017 04:40 AM, Christophe JAILLET wrote:

If this check fails, we must release some resources as done everywhere
else in this function before returning an error code.

Signed-off-by: Christophe JAILLET <christophe.jail...@wanadoo.fr>
---
V2: initialization of ret in this erro path ws missing. Stupid me!
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 7d334963dc08..da8683782ffc 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -1330,7 +1330,8 @@ static int mrf24j40_probe(struct spi_device *spi)
if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) {
dev_warn(>dev, "spi clock above possible maximum: %d",
 MAX_SPI_SPEED_HZ);
-   return -EINVAL;
+   ret = -EINVAL;
+   goto err_register_device;
}

ret = mrf24j40_hw_init(devrec);



Acked-by: Alan Ott <a...@signal11.us>



Re: [PATCH v2] mrf24j40: Fix en error handling path in 'mrf24j40_probe()'

2017-07-10 Thread Alan Ott

On 07/08/2017 04:40 AM, Christophe JAILLET wrote:

If this check fails, we must release some resources as done everywhere
else in this function before returning an error code.

Signed-off-by: Christophe JAILLET 
---
V2: initialization of ret in this erro path ws missing. Stupid me!
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 7d334963dc08..da8683782ffc 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -1330,7 +1330,8 @@ static int mrf24j40_probe(struct spi_device *spi)
if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) {
dev_warn(>dev, "spi clock above possible maximum: %d",
 MAX_SPI_SPEED_HZ);
-   return -EINVAL;
+   ret = -EINVAL;
+   goto err_register_device;
}

ret = mrf24j40_hw_init(devrec);



Acked-by: Alan Ott 



[PATCH v2] doc: efi-stub.txt: Fix arm64 paths

2015-11-30 Thread Alan Ott
Update documented paths for arm64 files to match current tree.

Signed-off-by: Alan Ott 
---
 Documentation/efi-stub.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt
index 7747024..e157469 100644
--- a/Documentation/efi-stub.txt
+++ b/Documentation/efi-stub.txt
@@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
 respectively. For ARM the EFI stub is implemented in
 arch/arm/boot/compressed/efi-header.S and
 arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
-between architectures is in drivers/firmware/efi/efi-stub-helper.c.
+between architectures is in drivers/firmware/efi/libstub.
 
 For arm64, there is no compressed kernel support, so the Image itself
 masquerades as a PE/COFF image and the EFI stub is linked into the
 kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
-and arch/arm64/kernel/efi-stub.c.
+and drivers/firmware/efi/libstub/arm64-stub.c.
 
 By using the EFI boot stub it's possible to boot a Linux kernel
 without the use of a conventional EFI boot loader, such as grub or
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] doc: efi-stub.txt: Fix arm64 paths

2015-11-30 Thread Alan Ott
Update documented paths for arm64 files to match current tree.

Signed-off-by: Alan Ott <a...@softiron.co.uk>
---
 Documentation/efi-stub.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt
index 7747024..e157469 100644
--- a/Documentation/efi-stub.txt
+++ b/Documentation/efi-stub.txt
@@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
 respectively. For ARM the EFI stub is implemented in
 arch/arm/boot/compressed/efi-header.S and
 arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
-between architectures is in drivers/firmware/efi/efi-stub-helper.c.
+between architectures is in drivers/firmware/efi/libstub.
 
 For arm64, there is no compressed kernel support, so the Image itself
 masquerades as a PE/COFF image and the EFI stub is linked into the
 kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
-and arch/arm64/kernel/efi-stub.c.
+and drivers/firmware/efi/libstub/arm64-stub.c.
 
 By using the EFI boot stub it's possible to boot a Linux kernel
 without the use of a conventional EFI boot loader, such as grub or
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] doc: efi-stub.txt: Fix arm64 paths

2015-11-28 Thread Alan Ott

On 11/28/2015 11:03 AM, Alan Ott wrote:

Update documented paths for arm64 files to match current tree.
---
  Documentation/efi-stub.txt | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt
index 7747024..e157469 100644
--- a/Documentation/efi-stub.txt
+++ b/Documentation/efi-stub.txt
@@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
  respectively. For ARM the EFI stub is implemented in
  arch/arm/boot/compressed/efi-header.S and
  arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
-between architectures is in drivers/firmware/efi/efi-stub-helper.c.
+between architectures is in drivers/firmware/efi/libstub.
  
  For arm64, there is no compressed kernel support, so the Image itself

  masquerades as a PE/COFF image and the EFI stub is linked into the
  kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
-and arch/arm64/kernel/efi-stub.c.
+and drivers/firmware/efi/libstub/arm64-stub.c.
  
  By using the EFI boot stub it's possible to boot a Linux kernel

  without the use of a conventional EFI boot loader, such as grub or


To follow up on this one, it additionally appears that commit 
719e284369d23c3feb02d02ad58b3a516df47806 [1] was premature, as ARM 
support for EFI never fully made it into mainline, and the files 
referenced[2] do not exist anywhere in the history.


Alan.

[1] https://lkml.org/lkml/2014/4/4/401
[2] arch/arm/boot/compressed/efi-header.S, and 
arch/arm/boot/compressed/efi-stub.c


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] doc: efi-stub.txt: Fix arm64 paths

2015-11-28 Thread Alan Ott
Update documented paths for arm64 files to match current tree.
---
 Documentation/efi-stub.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt
index 7747024..e157469 100644
--- a/Documentation/efi-stub.txt
+++ b/Documentation/efi-stub.txt
@@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
 respectively. For ARM the EFI stub is implemented in
 arch/arm/boot/compressed/efi-header.S and
 arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
-between architectures is in drivers/firmware/efi/efi-stub-helper.c.
+between architectures is in drivers/firmware/efi/libstub.
 
 For arm64, there is no compressed kernel support, so the Image itself
 masquerades as a PE/COFF image and the EFI stub is linked into the
 kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
-and arch/arm64/kernel/efi-stub.c.
+and drivers/firmware/efi/libstub/arm64-stub.c.
 
 By using the EFI boot stub it's possible to boot a Linux kernel
 without the use of a conventional EFI boot loader, such as grub or
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] doc: efi-stub.txt: Fix arm64 paths

2015-11-28 Thread Alan Ott
Update documented paths for arm64 files to match current tree.
---
 Documentation/efi-stub.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt
index 7747024..e157469 100644
--- a/Documentation/efi-stub.txt
+++ b/Documentation/efi-stub.txt
@@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
 respectively. For ARM the EFI stub is implemented in
 arch/arm/boot/compressed/efi-header.S and
 arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
-between architectures is in drivers/firmware/efi/efi-stub-helper.c.
+between architectures is in drivers/firmware/efi/libstub.
 
 For arm64, there is no compressed kernel support, so the Image itself
 masquerades as a PE/COFF image and the EFI stub is linked into the
 kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
-and arch/arm64/kernel/efi-stub.c.
+and drivers/firmware/efi/libstub/arm64-stub.c.
 
 By using the EFI boot stub it's possible to boot a Linux kernel
 without the use of a conventional EFI boot loader, such as grub or
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] doc: efi-stub.txt: Fix arm64 paths

2015-11-28 Thread Alan Ott

On 11/28/2015 11:03 AM, Alan Ott wrote:

Update documented paths for arm64 files to match current tree.
---
  Documentation/efi-stub.txt | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/efi-stub.txt b/Documentation/efi-stub.txt
index 7747024..e157469 100644
--- a/Documentation/efi-stub.txt
+++ b/Documentation/efi-stub.txt
@@ -10,12 +10,12 @@ arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
  respectively. For ARM the EFI stub is implemented in
  arch/arm/boot/compressed/efi-header.S and
  arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
-between architectures is in drivers/firmware/efi/efi-stub-helper.c.
+between architectures is in drivers/firmware/efi/libstub.
  
  For arm64, there is no compressed kernel support, so the Image itself

  masquerades as a PE/COFF image and the EFI stub is linked into the
  kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
-and arch/arm64/kernel/efi-stub.c.
+and drivers/firmware/efi/libstub/arm64-stub.c.
  
  By using the EFI boot stub it's possible to boot a Linux kernel

  without the use of a conventional EFI boot loader, such as grub or


To follow up on this one, it additionally appears that commit 
719e284369d23c3feb02d02ad58b3a516df47806 [1] was premature, as ARM 
support for EFI never fully made it into mainline, and the files 
referenced[2] do not exist anywhere in the history.


Alan.

[1] https://lkml.org/lkml/2014/4/4/401
[2] arch/arm/boot/compressed/efi-header.S, and 
arch/arm/boot/compressed/efi-stub.c


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!

2015-11-24 Thread Alan Ott
insertions(+), 3 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index f2efe8a..50793cd 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -76,6 +76,9 @@ static struct bio *blk_bio_segment_split(struct request_queue 
*q,
struct bio_vec bv, bvprv, *bvprvp = NULL;
struct bvec_iter iter;
unsigned seg_size = 0, nsegs = 0, sectors = 0;
+   unsigned front_seg_size = bio->bi_seg_front_size;
+   bool do_split = true;
+   struct bio *new = NULL;

bio_for_each_segment(bv, bio, iter) {
if (sectors + (bv.bv_len >> 9) > queue_max_sectors(q))
@@ -111,13 +114,26 @@ new_segment:
bvprvp = 
seg_size = bv.bv_len;
sectors += bv.bv_len >> 9;
+
+   if (nsegs == 1 && seg_size > front_seg_size)
+   front_seg_size = seg_size;
}

-   *segs = nsegs;
-   return NULL;
+   do_split = false;
  split:
*segs = nsegs;
-   return bio_split(bio, sectors, GFP_NOIO, bs);
+
+   if (do_split) {
+   new = bio_split(bio, sectors, GFP_NOIO, bs);
+   if (new)
+   bio = new;
+   }
+
+   bio->bi_seg_front_size = front_seg_size;
+   if (seg_size > bio->bi_seg_back_size)
+   bio->bi_seg_back_size = seg_size;
+
+   return do_split ? new : NULL;
  }

  void blk_queue_split(struct request_queue *q, struct bio **bio,



I applied both your patches and tested on Overdrive 3000. This fixes the 
issue for me.


Added linux-arm-kernel, since arm64 triggers this issue.

Thanks Ming and testers for your hard work on this.

Tested-by: Alan Ott 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel BUG at drivers/scsi/scsi_lib.c:1096!

2015-11-24 Thread Alan Ott
-by: Michael Ellerman <m...@ellerman.id.au>
Reported-by: Mark Salter <msal...@redhat.com>
Fixes: bdced438acd83a(block: setup bi_phys_segments after splitting)
Signed-off-by: Ming Lei <ming@canonical.com>
---
  block/blk-merge.c | 22 +++---
  1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index f2efe8a..50793cd 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -76,6 +76,9 @@ static struct bio *blk_bio_segment_split(struct request_queue 
*q,
struct bio_vec bv, bvprv, *bvprvp = NULL;
struct bvec_iter iter;
unsigned seg_size = 0, nsegs = 0, sectors = 0;
+   unsigned front_seg_size = bio->bi_seg_front_size;
+   bool do_split = true;
+   struct bio *new = NULL;

bio_for_each_segment(bv, bio, iter) {
if (sectors + (bv.bv_len >> 9) > queue_max_sectors(q))
@@ -111,13 +114,26 @@ new_segment:
bvprvp = 
seg_size = bv.bv_len;
sectors += bv.bv_len >> 9;
+
+   if (nsegs == 1 && seg_size > front_seg_size)
+   front_seg_size = seg_size;
}

-   *segs = nsegs;
-   return NULL;
+   do_split = false;
  split:
*segs = nsegs;
-   return bio_split(bio, sectors, GFP_NOIO, bs);
+
+   if (do_split) {
+   new = bio_split(bio, sectors, GFP_NOIO, bs);
+   if (new)
+   bio = new;
+   }
+
+   bio->bi_seg_front_size = front_seg_size;
+   if (seg_size > bio->bi_seg_back_size)
+   bio->bi_seg_back_size = seg_size;
+
+   return do_split ? new : NULL;
  }

  void blk_queue_split(struct request_queue *q, struct bio **bio,



I applied both your patches and tested on Overdrive 3000. This fixes the 
issue for me.


Added linux-arm-kernel, since arm64 triggers this issue.

Thanks Ming and testers for your hard work on this.

Tested-by: Alan Ott <a...@softiron.co.uk>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/6] net: ieee802154: Remove redundant spi driver bus initialization

2015-06-23 Thread Alan Ott

On 06/23/2015 10:52 AM, Antonio Borneo wrote:

In ancient times it was necessary to manually initialize the bus
field of an spi_driver to spi_bus_type. These days this is done in
spi_register_driver(), so we can drop the manual assignment.

Signed-off-by: Antonio Borneo 
To: Alan Ott 
To: Alexander Aring 
To: Varka Bhadram 
To: linux-w...@vger.kernel.org
To: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
  drivers/net/ieee802154/cc2520.c   | 1 -
  drivers/net/ieee802154/mrf24j40.c | 1 -
  2 files changed, 2 deletions(-)

diff --git a/drivers/net/ieee802154/cc2520.c b/drivers/net/ieee802154/cc2520.c
index f833b8b..bd957f1 100644
--- a/drivers/net/ieee802154/cc2520.c
+++ b/drivers/net/ieee802154/cc2520.c
@@ -1046,7 +1046,6 @@ MODULE_DEVICE_TABLE(of, cc2520_of_ids);
  static struct spi_driver cc2520_driver = {
.driver = {
.name = "cc2520",
-   .bus = _bus_type,
.owner = THIS_MODULE,
.of_match_table = of_match_ptr(cc2520_of_ids),
},
diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index fba2dfd..bc8fb26 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -812,7 +812,6 @@ MODULE_DEVICE_TABLE(spi, mrf24j40_ids);
  static struct spi_driver mrf24j40_driver = {
.driver = {
.name = "mrf24j40",
-   .bus = _bus_type,
.owner = THIS_MODULE,
},
.id_table = mrf24j40_ids,


Acked-by: Alan Ott 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/6] net: ieee802154: Remove redundant spi driver bus initialization

2015-06-23 Thread Alan Ott

On 06/23/2015 10:52 AM, Antonio Borneo wrote:

In ancient times it was necessary to manually initialize the bus
field of an spi_driver to spi_bus_type. These days this is done in
spi_register_driver(), so we can drop the manual assignment.

Signed-off-by: Antonio Borneo borneo.anto...@gmail.com
To: Alan Ott a...@signal11.us
To: Alexander Aring alex.ar...@gmail.com
To: Varka Bhadram varkabhad...@gmail.com
To: linux-w...@vger.kernel.org
To: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
  drivers/net/ieee802154/cc2520.c   | 1 -
  drivers/net/ieee802154/mrf24j40.c | 1 -
  2 files changed, 2 deletions(-)

diff --git a/drivers/net/ieee802154/cc2520.c b/drivers/net/ieee802154/cc2520.c
index f833b8b..bd957f1 100644
--- a/drivers/net/ieee802154/cc2520.c
+++ b/drivers/net/ieee802154/cc2520.c
@@ -1046,7 +1046,6 @@ MODULE_DEVICE_TABLE(of, cc2520_of_ids);
  static struct spi_driver cc2520_driver = {
.driver = {
.name = cc2520,
-   .bus = spi_bus_type,
.owner = THIS_MODULE,
.of_match_table = of_match_ptr(cc2520_of_ids),
},
diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index fba2dfd..bc8fb26 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -812,7 +812,6 @@ MODULE_DEVICE_TABLE(spi, mrf24j40_ids);
  static struct spi_driver mrf24j40_driver = {
.driver = {
.name = mrf24j40,
-   .bus = spi_bus_type,
.owner = THIS_MODULE,
},
.id_table = mrf24j40_ids,


Acked-by: Alan Ott a...@signal11.us

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH net-next] MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver

2014-08-16 Thread Alan Ott
Alan is the original author of the driver. This change was discussed
with the 802.15.4 subsystem maintainer, Alexander Aring.

Signed-off-by: Alan Ott 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2f85f55..c066da4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5972,6 +5972,12 @@ T:   git git://linuxtv.org/media_tree.git
 S: Maintained
 F: drivers/media/radio/radio-mr800.c
 
+MRF24J40 IEEE 802.15.4 RADIO DRIVER
+M: Alan Ott 
+L: linux-w...@vger.kernel.org
+S: Maintained
+F: drivers/net/ieee802154/mrf24j40.c
+
 MSI LAPTOP SUPPORT
 M: "Lee, Chun-Yi" 
 L: platform-driver-...@vger.kernel.org
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH net-next] MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver

2014-08-16 Thread Alan Ott
Alan is the original author of the driver. This change was discussed
with the 802.15.4 subsystem maintainer, Alexander Aring.

Signed-off-by: Alan Ott a...@signal11.us
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2f85f55..c066da4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5972,6 +5972,12 @@ T:   git git://linuxtv.org/media_tree.git
 S: Maintained
 F: drivers/media/radio/radio-mr800.c
 
+MRF24J40 IEEE 802.15.4 RADIO DRIVER
+M: Alan Ott a...@signal11.us
+L: linux-w...@vger.kernel.org
+S: Maintained
+F: drivers/net/ieee802154/mrf24j40.c
+
 MSI LAPTOP SUPPORT
 M: Lee, Chun-Yi j...@suse.com
 L: platform-driver-...@vger.kernel.org
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Deadlock in do_page_fault() on ARM (old kernel)

2014-01-20 Thread Alan Ott

On 01/17/2014 08:20 PM, Russell King - ARM Linux wrote:

On Fri, Jan 17, 2014 at 07:57:16PM -0500, Alan Ott wrote:

On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote:

My suspicion therefore is that some other thread must have died while
holding the mmap_sem, so there's probably a kernel oops earlier...
that's my best guess at the moment without seeing the full backtrace.

There's no oops that I'm able to see.

Each of the tasks which lockdep reports as "holding" mmap_sem are
blocking for it. If some other task had taken it and then crashed, I
assume lockdep would list the crashed task as also holding the resource
in the printout.

My point is this:

- the five (or six) threads which are trying to take the mmap_sem in
   read-mode in the fault handler are all blocked on it - they haven't
   taken the lock, which will only happen because there's a pending writer.
- of these in your original post, there are two which faulted from
   __copy_to_user_std().  __copy_to_user_std() doesn't take the mmap_sem -
   this is the non-uaccess-with-memcpy path.
- the pending writers are the two threads in sys_mmap_pgoff(), both of
   which are blocked waiting to gain the write lock.
- there are no *other* threads holding the mmap_sem lock.


Yes, all true. I don't remember why I started looking at the memcpy() case.


So... there's a question here how we got into this state - and frankly
I don't know.  What I do see from your latest dump is that there's two
unknown modules there - something called rcu2m and another called
buttoms, and there are two threads inside ioctls there.  Both have
faulted from the function at 0xc0d2a394 (which won't appear in the
backtrace, but is most likely __copy_to_user_std.)


Yes, there are a handful of out-of-tree modules.


So, in the absence of you saying anything about there being any preceding
oopses, my conclusion now is that one of those modules is taking the
mmap_sem itself, and is the culpret inducing this deadlock.


Yes, I came to that as well. I had checked for the presence of mmap_sem 
in the sources of the out-of-tree modules and didn't see it. However, 
upon closer inspection, my grep-fu failed me as there were some backward 
symlinks I didn't account for. TI's cmemk module _is_ taking out 
mmap_sem. I wish I had seen this days ago. That's my new investigation path.



Note that your dump ([2]) in your reply was just the hung task detector
printing out the stacktrace for a few tasks, not the full all-threads
stack dump which I was expecting.


Yes, in a misguided attempt to keep the SNR high, I didn't include the 
full dump, but only what I thought was the interesting part. I did 
another capture and the full dump is at [1] .



So I'm pulling out these conclusions from the very little information
you're supplying.


I appreciate it. Thank you for taking the time to reply.

Alan.

[1] 
http://www.signal11.us/~alan/stack_dump_all_tasks_with_frame_pointers.txt


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Deadlock in do_page_fault() on ARM (old kernel)

2014-01-20 Thread Alan Ott

On 01/17/2014 08:20 PM, Russell King - ARM Linux wrote:

On Fri, Jan 17, 2014 at 07:57:16PM -0500, Alan Ott wrote:

On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote:

My suspicion therefore is that some other thread must have died while
holding the mmap_sem, so there's probably a kernel oops earlier...
that's my best guess at the moment without seeing the full backtrace.

There's no oops that I'm able to see.

Each of the tasks which lockdep reports as holding mmap_sem are
blocking for it. If some other task had taken it and then crashed, I
assume lockdep would list the crashed task as also holding the resource
in the printout.

My point is this:

- the five (or six) threads which are trying to take the mmap_sem in
   read-mode in the fault handler are all blocked on it - they haven't
   taken the lock, which will only happen because there's a pending writer.
- of these in your original post, there are two which faulted from
   __copy_to_user_std().  __copy_to_user_std() doesn't take the mmap_sem -
   this is the non-uaccess-with-memcpy path.
- the pending writers are the two threads in sys_mmap_pgoff(), both of
   which are blocked waiting to gain the write lock.
- there are no *other* threads holding the mmap_sem lock.


Yes, all true. I don't remember why I started looking at the memcpy() case.


So... there's a question here how we got into this state - and frankly
I don't know.  What I do see from your latest dump is that there's two
unknown modules there - something called rcu2m and another called
buttoms, and there are two threads inside ioctls there.  Both have
faulted from the function at 0xc0d2a394 (which won't appear in the
backtrace, but is most likely __copy_to_user_std.)


Yes, there are a handful of out-of-tree modules.


So, in the absence of you saying anything about there being any preceding
oopses, my conclusion now is that one of those modules is taking the
mmap_sem itself, and is the culpret inducing this deadlock.


Yes, I came to that as well. I had checked for the presence of mmap_sem 
in the sources of the out-of-tree modules and didn't see it. However, 
upon closer inspection, my grep-fu failed me as there were some backward 
symlinks I didn't account for. TI's cmemk module _is_ taking out 
mmap_sem. I wish I had seen this days ago. That's my new investigation path.



Note that your dump ([2]) in your reply was just the hung task detector
printing out the stacktrace for a few tasks, not the full all-threads
stack dump which I was expecting.


Yes, in a misguided attempt to keep the SNR high, I didn't include the 
full dump, but only what I thought was the interesting part. I did 
another capture and the full dump is at [1] .



So I'm pulling out these conclusions from the very little information
you're supplying.


I appreciate it. Thank you for taking the time to reply.

Alan.

[1] 
http://www.signal11.us/~alan/stack_dump_all_tasks_with_frame_pointers.txt


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Deadlock in do_page_fault() on ARM (old kernel)

2014-01-17 Thread Alan Ott

On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote:

On Wed, Jan 15, 2014 at 08:13:04PM -0500, Alan Ott wrote:

So my questions are:
1. Why don't I see a full backtrace beyond the exception stack? It's the
same when dump_stack() is called manually.

No idea - it looks like you're not using frame pointers, but are using
the unwinder.  Full backtraces can always be created with frame pointers,
it's just that unwinding seems unreliable.


Hi Russell,

I managed to get frame pointers turned on, which in this kernel version, 
seems like it requires the unwinder to be turned off[1].


When I built with frame pointers, the backtraces show differently. It 
only shows the frames which were _not_ shown with the unwinder. 
Backtrace at [2]. As you can see, it shows the non-exception stack dumps 
ending at places which can't page fault (and if it did page fault in 
those locations, it wouldn't try to take the mmap_sem lock). Its as 
though it shows what's in the modules but not what's in the image, where 
with the unwinder, it showed what's in the image and not what's in the 
modules[3].



I think we do need to see the full backtrace here - from looking at the
full state dump, I don't see any sign of the mmap_sem being held except
by an attempt to process a fault, and two threads trying to do a
sys_mmap_pgoff().


The lockdep printout at the end of [4] (which is the original log I sent 
the other day) shows do_page_fault() holding mmap_sem in six tasks and 
sys_mmap_pgoff() holding it in two. (I don't like the term "held" in the 
lockdep printout and believe it to mean "held or waiting for," based on 
my analysis of the rw_semaphore code.)


Each of those tasks seems to be blocking for it.


My suspicion therefore is that some other thread must have died while
holding the mmap_sem, so there's probably a kernel oops earlier...
that's my best guess at the moment without seeing the full backtrace.


There's no oops that I'm able to see.

Each of the tasks which lockdep reports as "holding" mmap_sem are 
blocking for it. If some other task had taken it and then crashed, I 
assume lockdep would list the crashed task as also holding the resource 
in the printout.


Thanks for having a look at this,

Alan.

[1] It seems a little strange to me even in the newest kernel:
1: lib/Kconfig.debug specifies FRAME_POINTER . The text of this instance 
shows up when one searches for FRAME_POINTER in menuconfig. It can 
become selected even though the dependencies listed here are not met.
2: arch/arm/Kconfig.debug also specifies FRAME_POINTER, with a 
dependency only on !THUMB2_KERNEL, defaulting to yes on !ARM_UNWIND

3: ARM doens't set ARCH_WANT_FRAME_POINTERS

[2] http://www.signal11.us/~alan/stack_dump_with_frame_pointers.txt
Note: This is a sysrq dump of the locks held at lockup time and then the 
automatic hung task detection.


[3] The modules are being built in the standard way for out-of-tree:
$(MAKE) -C $(PRJDIR)/linux M=$(CWD) modules

[4] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Deadlock in do_page_fault() on ARM (old kernel)

2014-01-17 Thread Alan Ott

On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote:

On Wed, Jan 15, 2014 at 08:13:04PM -0500, Alan Ott wrote:

So my questions are:
1. Why don't I see a full backtrace beyond the exception stack? It's the
same when dump_stack() is called manually.

No idea - it looks like you're not using frame pointers, but are using
the unwinder.  Full backtraces can always be created with frame pointers,
it's just that unwinding seems unreliable.


Hi Russell,

I managed to get frame pointers turned on, which in this kernel version, 
seems like it requires the unwinder to be turned off[1].


When I built with frame pointers, the backtraces show differently. It 
only shows the frames which were _not_ shown with the unwinder. 
Backtrace at [2]. As you can see, it shows the non-exception stack dumps 
ending at places which can't page fault (and if it did page fault in 
those locations, it wouldn't try to take the mmap_sem lock). Its as 
though it shows what's in the modules but not what's in the image, where 
with the unwinder, it showed what's in the image and not what's in the 
modules[3].



I think we do need to see the full backtrace here - from looking at the
full state dump, I don't see any sign of the mmap_sem being held except
by an attempt to process a fault, and two threads trying to do a
sys_mmap_pgoff().


The lockdep printout at the end of [4] (which is the original log I sent 
the other day) shows do_page_fault() holding mmap_sem in six tasks and 
sys_mmap_pgoff() holding it in two. (I don't like the term held in the 
lockdep printout and believe it to mean held or waiting for, based on 
my analysis of the rw_semaphore code.)


Each of those tasks seems to be blocking for it.


My suspicion therefore is that some other thread must have died while
holding the mmap_sem, so there's probably a kernel oops earlier...
that's my best guess at the moment without seeing the full backtrace.


There's no oops that I'm able to see.

Each of the tasks which lockdep reports as holding mmap_sem are 
blocking for it. If some other task had taken it and then crashed, I 
assume lockdep would list the crashed task as also holding the resource 
in the printout.


Thanks for having a look at this,

Alan.

[1] It seems a little strange to me even in the newest kernel:
1: lib/Kconfig.debug specifies FRAME_POINTER . The text of this instance 
shows up when one searches for FRAME_POINTER in menuconfig. It can 
become selected even though the dependencies listed here are not met.
2: arch/arm/Kconfig.debug also specifies FRAME_POINTER, with a 
dependency only on !THUMB2_KERNEL, defaulting to yes on !ARM_UNWIND

3: ARM doens't set ARCH_WANT_FRAME_POINTERS

[2] http://www.signal11.us/~alan/stack_dump_with_frame_pointers.txt
Note: This is a sysrq dump of the locks held at lockup time and then the 
automatic hung task detection.


[3] The modules are being built in the standard way for out-of-tree:
$(MAKE) -C $(PRJDIR)/linux M=$(CWD) modules

[4] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Deadlock in do_page_fault() on ARM (old kernel)

2014-01-15 Thread Alan Ott

Hello,

I have a deadlock that I'm trying to understand. The symptom is multiple 
tasks trying to acquire a read lock (down_read()) on mm->mmap_sem in 
do_page_fault(). I'll be right up front and say that this is a fairly 
old kernel (2.6.37 TI PSP kernel) on a fairly old processor DaVinci 6446.


At the time of the deadlock, sysrq's show-all-tasks shows the following 
for three of the tasks which are deadlocked (there are more, but I just 
picked the interesting ones; the full output is at [1]):


uiD c0ea8208 0  1405   1293 0x
[] (schedule+0x33c/0x3c4) from [] 
(__down_read+0xbc/0xd4)
[] (__down_read+0xbc/0xd4) from [] 
(do_page_fault+0x94/0x248)
[] (do_page_fault+0x94/0x248) from [] 
(do_DataAbort+0x34/0x94)
[] (do_DataAbort+0x34/0x94) from [] 
(__dabt_svc+0x4c/0x60)

Exception stack(0xc048dce8 to 0xc048dd30)
dce0:   400e9a94 c048ddb0 ffec  c048c000 
c048dda4
dd00: 400e9a94  ff92 c048c000  0001 0014 
c048dd34

dd20:  c0d1f68c 0013 
[] (__dabt_svc+0x4c/0x60) from [] 
(__copy_to_user_std+0xcc/0x3a8)


uiD c0ea8208 0  1406   1293 0x
[] (schedule+0x33c/0x3c4) from [] 
(__down_read+0xbc/0xd4)
[] (__down_read+0xbc/0xd4) from [] 
(do_page_fault+0x94/0x248)
[] (do_page_fault+0x94/0x248) from [] 
(do_DataAbort+0x34/0x94)
[] (do_DataAbort+0x34/0x94) from [] 
(ret_from_exception+0x0/0x10)

Exception stack(0xc048ffb0 to 0xc048fff8)
ffa0: 0060 000a 00a8 
0010d000
ffc0: 00c23d80 00c23de8 405af06c  405af03c 405af074 0050 
01ff

ffe0: 405ae000 40185748 404f5c4c 404f393c 8010 

uiD c0ea8208 0  1411   1293 0x
[] (schedule+0x33c/0x3c4) from [] 
(__down_read+0xbc/0xd4)
[] (__down_read+0xbc/0xd4) from [] 
(do_page_fault+0x94/0x248)
[] (do_page_fault+0x94/0x248) from [] 
(do_DataAbort+0x34/0x94)
[] (do_DataAbort+0x34/0x94) from [] 
(ret_from_exception+0x0/0x10)

Exception stack(0xc053bfb0 to 0xc053bff8)
bfa0:  0001 00ba3610 

bfc0:  00ba3610 00bb6020 00ba3610 40074000 00b91024 415e4930 
0583

bfe0: 00b611a0 415e38e0 4005f3e4 0fc0 6010 

 [snip] 

Showing all locks held in the system:
1 lock held by getty/1294:
 #0:  (>atomic_read_lock){+.+...}, at: [] 
n_tty_read+0x21c/0x670

1 lock held by ui/1405:
 #0:  (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248
1 lock held by ui/1406:
 #0:  (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248
1 lock held by ui/1408:
 #0:  (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248
1 lock held by ui/1409:
 #0:  (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248
1 lock held by ui/1411:
 #0:  (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248
1 lock held by ui/1416:
 #0:  (>mmap_sem){++}, at: [] sys_mmap_pgoff+0x70/0xc0
1 lock held by ui/1418:
 #0:  (>mmap_sem){++}, at: [] do_page_fault+0x94/0x248
1 lock held by ui/1420:
 #0:  (>mmap_sem){++}, at: [] sys_mmap_pgoff+0x70/0xc0
1 lock held by ui/1434:
 #0:  (>atomic_read_lock){+.+...}, at: [] 
n_tty_read+0x21c/0x670



Note that above, do_page_fault() takes out a read lock (down_read()) and 
sys_mmap_pgoff() takes out a write lock (down_write()).


I've searched for this kind of problem and found two patches which seem 
to be related to this issue[2]. I have applied both with no better results.


So my questions are:
1. Why don't I see a full backtrace beyond the exception stack? It's the 
same when dump_stack() is called manually.
2. __copy_to_user_memcpy() takes a read lock (down_read()) on 
mm->mmap_sem. While that lock is held, __copy_to_user_memcpy() can 
generate a page fault, causing do_page_fault() to get called, which will 
also try to get a read lock (down_read()) on mm->mmap_sem. Multiple read 
locks can be taken on an rw_semaphore, but deadlock will occur if 
another thread tries to get a write lock (down_write()) in between. For 
example:

Task 1: Task 2:
down_read(sem)
down_write(sem)<-- Goes to sleep
down_read(sem) <-- Goes to sleep

There is a thread from 2005[3] which seems to discuss the same concept 
of recursive rw_semaphores, but for futexes.


Other comments:
1. My analysis of this probably wrong. Otherwise it seems many others 
would have the same problem, and they don't seem to. I'm hoping this 
email will help to correct my understanding.
2. I looked through the git logs for recent (since 2.6.37 time frame) 
and nothing else jumped out at me as being an obvious fix for this 
situation.


Thanks for any insight you can give,

Alan.

[1] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt

[2] Some websites/bugtrackers mention this commit with a similar issue, 
but I'm not entirely sure how it's related:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8878a539ff19a43cf3729e7562cd528f490246ae

This 

Deadlock in do_page_fault() on ARM (old kernel)

2014-01-15 Thread Alan Ott

Hello,

I have a deadlock that I'm trying to understand. The symptom is multiple 
tasks trying to acquire a read lock (down_read()) on mm-mmap_sem in 
do_page_fault(). I'll be right up front and say that this is a fairly 
old kernel (2.6.37 TI PSP kernel) on a fairly old processor DaVinci 6446.


At the time of the deadlock, sysrq's show-all-tasks shows the following 
for three of the tasks which are deadlocked (there are more, but I just 
picked the interesting ones; the full output is at [1]):


uiD c0ea8208 0  1405   1293 0x
[c0ea8208] (schedule+0x33c/0x3c4) from [c0eaa3b4] 
(__down_read+0xbc/0xd4)
[c0eaa3b4] (__down_read+0xbc/0xd4) from [c0c0b378] 
(do_page_fault+0x94/0x248)
[c0c0b378] (do_page_fault+0x94/0x248) from [c0c052e0] 
(do_DataAbort+0x34/0x94)
[c0c052e0] (do_DataAbort+0x34/0x94) from [c0c05b0c] 
(__dabt_svc+0x4c/0x60)

Exception stack(0xc048dce8 to 0xc048dd30)
dce0:   400e9a94 c048ddb0 ffec  c048c000 
c048dda4
dd00: 400e9a94  ff92 c048c000  0001 0014 
c048dd34

dd20:  c0d1f68c 0013 
[c0c05b0c] (__dabt_svc+0x4c/0x60) from [c0d1f68c] 
(__copy_to_user_std+0xcc/0x3a8)


uiD c0ea8208 0  1406   1293 0x
[c0ea8208] (schedule+0x33c/0x3c4) from [c0eaa3b4] 
(__down_read+0xbc/0xd4)
[c0eaa3b4] (__down_read+0xbc/0xd4) from [c0c0b378] 
(do_page_fault+0x94/0x248)
[c0c0b378] (do_page_fault+0x94/0x248) from [c0c052e0] 
(do_DataAbort+0x34/0x94)
[c0c052e0] (do_DataAbort+0x34/0x94) from [c0c05f0c] 
(ret_from_exception+0x0/0x10)

Exception stack(0xc048ffb0 to 0xc048fff8)
ffa0: 0060 000a 00a8 
0010d000
ffc0: 00c23d80 00c23de8 405af06c  405af03c 405af074 0050 
01ff

ffe0: 405ae000 40185748 404f5c4c 404f393c 8010 

uiD c0ea8208 0  1411   1293 0x
[c0ea8208] (schedule+0x33c/0x3c4) from [c0eaa3b4] 
(__down_read+0xbc/0xd4)
[c0eaa3b4] (__down_read+0xbc/0xd4) from [c0c0b378] 
(do_page_fault+0x94/0x248)
[c0c0b378] (do_page_fault+0x94/0x248) from [c0c052e0] 
(do_DataAbort+0x34/0x94)
[c0c052e0] (do_DataAbort+0x34/0x94) from [c0c05f0c] 
(ret_from_exception+0x0/0x10)

Exception stack(0xc053bfb0 to 0xc053bff8)
bfa0:  0001 00ba3610 

bfc0:  00ba3610 00bb6020 00ba3610 40074000 00b91024 415e4930 
0583

bfe0: 00b611a0 415e38e0 4005f3e4 0fc0 6010 

 [snip] 

Showing all locks held in the system:
1 lock held by getty/1294:
 #0:  (tty-atomic_read_lock){+.+...}, at: [c0d45bf0] 
n_tty_read+0x21c/0x670

1 lock held by ui/1405:
 #0:  (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248
1 lock held by ui/1406:
 #0:  (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248
1 lock held by ui/1408:
 #0:  (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248
1 lock held by ui/1409:
 #0:  (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248
1 lock held by ui/1411:
 #0:  (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248
1 lock held by ui/1416:
 #0:  (mm-mmap_sem){++}, at: [c0c6e604] sys_mmap_pgoff+0x70/0xc0
1 lock held by ui/1418:
 #0:  (mm-mmap_sem){++}, at: [c0c0b378] do_page_fault+0x94/0x248
1 lock held by ui/1420:
 #0:  (mm-mmap_sem){++}, at: [c0c6e604] sys_mmap_pgoff+0x70/0xc0
1 lock held by ui/1434:
 #0:  (tty-atomic_read_lock){+.+...}, at: [c0d45bf0] 
n_tty_read+0x21c/0x670



Note that above, do_page_fault() takes out a read lock (down_read()) and 
sys_mmap_pgoff() takes out a write lock (down_write()).


I've searched for this kind of problem and found two patches which seem 
to be related to this issue[2]. I have applied both with no better results.


So my questions are:
1. Why don't I see a full backtrace beyond the exception stack? It's the 
same when dump_stack() is called manually.
2. __copy_to_user_memcpy() takes a read lock (down_read()) on 
mm-mmap_sem. While that lock is held, __copy_to_user_memcpy() can 
generate a page fault, causing do_page_fault() to get called, which will 
also try to get a read lock (down_read()) on mm-mmap_sem. Multiple read 
locks can be taken on an rw_semaphore, but deadlock will occur if 
another thread tries to get a write lock (down_write()) in between. For 
example:

Task 1: Task 2:
down_read(sem)
down_write(sem)-- Goes to sleep
down_read(sem) -- Goes to sleep

There is a thread from 2005[3] which seems to discuss the same concept 
of recursive rw_semaphores, but for futexes.


Other comments:
1. My analysis of this probably wrong. Otherwise it seems many others 
would have the same problem, and they don't seem to. I'm hoping this 
email will help to correct my understanding.
2. I looked through the git logs for recent (since 2.6.37 time frame) 
and nothing else jumped out at me as being an obvious fix for this 
situation.


Thanks for any insight you can give,


[PATCH v1 2/3] mrf24j40: Use threaded IRQ handler

2013-10-05 Thread Alan Ott
Eliminate all the workqueue and interrupt enable/disable.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 66bb4ce..c1bc688 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -82,7 +82,6 @@ struct mrf24j40 {
 
struct mutex buffer_mutex; /* only used to protect buf */
struct completion tx_complete;
-   struct work_struct irqwork;
u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
 };
 
@@ -590,17 +589,6 @@ static struct ieee802154_ops mrf24j40_ops = {
 static irqreturn_t mrf24j40_isr(int irq, void *data)
 {
struct mrf24j40 *devrec = data;
-
-   disable_irq_nosync(irq);
-
-   schedule_work(>irqwork);
-
-   return IRQ_HANDLED;
-}
-
-static void mrf24j40_isrwork(struct work_struct *work)
-{
-   struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork);
u8 intstat;
int ret;
 
@@ -618,7 +606,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
mrf24j40_handle_rx(devrec);
 
 out:
-   enable_irq(devrec->spi->irq);
+   return IRQ_HANDLED;
 }
 
 static int mrf24j40_probe(struct spi_device *spi)
@@ -642,7 +630,6 @@ static int mrf24j40_probe(struct spi_device *spi)
 
mutex_init(>buffer_mutex);
init_completion(>tx_complete);
-   INIT_WORK(>irqwork, mrf24j40_isrwork);
devrec->spi = spi;
spi_set_drvdata(spi, devrec);
 
@@ -688,11 +675,12 @@ static int mrf24j40_probe(struct spi_device *spi)
val &= ~0x3; /* Clear RX mode (normal) */
write_short_reg(devrec, REG_RXMCR, val);
 
-   ret = request_irq(spi->irq,
- mrf24j40_isr,
- IRQF_TRIGGER_FALLING,
- dev_name(>dev),
- devrec);
+   ret = request_threaded_irq(spi->irq,
+  NULL,
+  mrf24j40_isr,
+  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  dev_name(>dev),
+  devrec);
 
if (ret) {
dev_err(printdev(devrec), "Unable to get IRQ");
@@ -721,7 +709,6 @@ static int mrf24j40_remove(struct spi_device *spi)
dev_dbg(printdev(devrec), "remove\n");
 
free_irq(spi->irq, devrec);
-   flush_work(>irqwork); /* TODO: Is this the right call? */
ieee802154_unregister_device(devrec->dev);
ieee802154_free_device(devrec->dev);
/* TODO: Will ieee802154_free_device() wait until ->xmit() is
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission

2013-10-05 Thread Alan Ott
This avoids a race condition where complete(tx_complete) could be called
before tx_complete is initialized.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 42e6dee..66bb4ce 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -344,6 +344,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
 
+   INIT_COMPLETION(devrec->tx_complete);
+
/* Set TXNTRIG bit of TXNCON to send packet */
ret = read_short_reg(devrec, REG_TXNCON, );
if (ret)
@@ -354,8 +356,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
-   INIT_COMPLETION(devrec->tx_complete);
-
/* Wait for the device to send the TX complete interrupt. */
ret = wait_for_completion_interruptible_timeout(
>tx_complete,
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 3/3] mrf24j40: Use level-triggered interrupts

2013-10-05 Thread Alan Ott
The mrf24j40 generates level interrupts. There are rare cases where it
appears that the interrupt line never gets de-asserted between interrupts,
causing interrupts to be lost, and causing a hung device from the driver's
perspective.  Switching the driver to interpret these interrupts as
level-triggered fixes this issue.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index c1bc688..0632d34 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -678,7 +678,7 @@ static int mrf24j40_probe(struct spi_device *spi)
ret = request_threaded_irq(spi->irq,
   NULL,
   mrf24j40_isr,
-  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  IRQF_TRIGGER_LOW|IRQF_ONESHOT,
   dev_name(>dev),
   devrec);
 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 0/3] Fix race conditions in mrf24j40 interrupts

2013-10-05 Thread Alan Ott
After testing with the betas of this patchset, it's been rebased and is
ready for inclusion.

David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
period of heavy traffic.  Two race conditions were discovered, and the
driver was changed to use threaded interrupts, since the enable/disable of
interrupts in the driver has recently been a lighning rod whenever issues
arise related to interrupts (costing engineering time), and since threaded
interrupts are the right way to do it.

Alan Ott (3):
  mrf24j40: Move INIT_COMPLETION() to before packet transmission
  mrf24j40: Use threaded IRQ handler
  mrf24j40: Use level-triggered interrupts

 drivers/net/ieee802154/mrf24j40.c | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/2] 6lowpan: Only make 6lowpan links to IEEE802154 devices

2013-10-05 Thread Alan Ott
Refuse to create 6lowpan links if the actual hardware interface is
of any type other than ARPHRD_IEEE802154.

Signed-off-by: Alan Ott 
Suggested-by: Alexander Aring 
---
 net/ieee802154/6lowpan.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index c85e71e..8f56b2b 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1372,6 +1372,8 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
if (!real_dev)
return -ENODEV;
+   if (real_dev->type != ARPHRD_IEEE802154)
+   return -EINVAL;
 
lowpan_dev_info(dev)->real_dev = real_dev;
lowpan_dev_info(dev)->fragment_tag = 0;
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/2] 6lowpan: Sync default hardware address of lowpan links to their wpan

2013-10-05 Thread Alan Ott
When a lowpan link to a wpan device is created, set the hardware address
of the lowpan link to that of the wpan device.

Signed-off-by: Alan Ott 
---
 net/ieee802154/6lowpan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index 8f56b2b..ff41b4d 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1388,6 +1388,9 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
 
entry->ldev = dev;
 
+   /* Set the lowpan harware address to the wpan hardware address. */
+   memcpy(dev->dev_addr, real_dev->dev_addr, IEEE802154_ADDR_LEN);
+
mutex_lock(_dev_info(dev)->dev_list_mtx);
INIT_LIST_HEAD(>list);
list_add_tail(>list, _devices);
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/2] 6lowpan default hardware address

2013-10-05 Thread Alan Ott
Alexander Aring suggested that devices desired to be linked to 6lowpan
be checked for actually being of type IEEE802154, since IEEE802154 devices
are all that are supported by 6lowpan at present.

Alan Ott (2):
  6lowpan: Only make 6lowpan links to IEEE802154 devices
  6lowpan: Sync default hardware address of lowpan links to their wpan

 net/ieee802154/6lowpan.c | 5 +
 1 file changed, 5 insertions(+)

-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan

2013-10-05 Thread Alan Ott

On 10/05/2013 08:24 PM, Alexander Aring wrote:

Hi Alan,

On Sat, Oct 05, 2013 at 05:38:00PM -0400, Alan Ott wrote:

When a lowpan link to a wpan device is created, set the hardware address
of the lowpan link to that of the wpan device.

Signed-off-by: Alan Ott
---
  net/ieee802154/6lowpan.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index c85e71e..fb89133 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
  
  	entry->ldev = dev;
  
+	BUG_ON(IEEE802154_ADDR_LEN != dev->addr_len);

So if somebody make a:
"ip link add link $NOT_8BYTE_HWADDR_DEV name $NAME type lowpan"

the kernel creates a BUG_ON? Okay it seems that case is very unusual but
better is to return a errno so "maybe(I don't know it)" the userspace
software will generate a error and the whole kernel doesn't crash.


That line checks the length of the address of the lowpan device, not the 
real device the lowpan device is attached to (and on second look, I 
don't like the Yoda code; didn't notice that before).


Further, running:
ip link add link eth0 name lowpan0 type lowpan

like you suggest, crashes the kernel hard, with or without my patch. 
More stuff to fix




+   memcpy(dev->dev_addr, real_dev->dev_addr, IEEE802154_ADDR_LEN);

Is there one case where we read the dev->dev_addr? I saw a case when we
set this [1], but I don't know why we need this. Did you detect some
problems because this isn't the "real" hw addr?

Other case is I am feeling uneasy when we have two netdev devices with
the same hw addr.
[1]https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/ieee802154/6lowpan.c?id=refs/tags/v3.12-rc3#n1102


In my testing, lowpan devices need a hardware address which is the same 
as the wpan, or else stuff like ping doesn't work at all. In all the 
examples I've ever seen, the lowpan's hardware address has been set 
manually.


What does your test setup look like that you don't have a hardware 
address for your lowpan device? Are you able to ping other Linux hosts?


Alan.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan

2013-10-05 Thread Alan Ott
When a lowpan link to a wpan device is created, set the hardware address
of the lowpan link to that of the wpan device.

Signed-off-by: Alan Ott 
---
 net/ieee802154/6lowpan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index c85e71e..fb89133 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
 
entry->ldev = dev;
 
+   BUG_ON(IEEE802154_ADDR_LEN != dev->addr_len);
+   memcpy(dev->dev_addr, real_dev->dev_addr, IEEE802154_ADDR_LEN);
+
mutex_lock(_dev_info(dev)->dev_list_mtx);
INIT_LIST_HEAD(>list);
list_add_tail(>list, _devices);
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan

2013-10-05 Thread Alan Ott
When a lowpan link to a wpan device is created, set the hardware address
of the lowpan link to that of the wpan device.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/ieee802154/6lowpan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index c85e71e..fb89133 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
 
entry-ldev = dev;
 
+   BUG_ON(IEEE802154_ADDR_LEN != dev-addr_len);
+   memcpy(dev-dev_addr, real_dev-dev_addr, IEEE802154_ADDR_LEN);
+
mutex_lock(lowpan_dev_info(dev)-dev_list_mtx);
INIT_LIST_HEAD(entry-list);
list_add_tail(entry-list, lowpan_devices);
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] 6lowpan: Sync default hardware address of lowpan links to their wpan

2013-10-05 Thread Alan Ott

On 10/05/2013 08:24 PM, Alexander Aring wrote:

Hi Alan,

On Sat, Oct 05, 2013 at 05:38:00PM -0400, Alan Ott wrote:

When a lowpan link to a wpan device is created, set the hardware address
of the lowpan link to that of the wpan device.

Signed-off-by: Alan Otta...@signal11.us
---
  net/ieee802154/6lowpan.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index c85e71e..fb89133 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1386,6 +1386,9 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
  
  	entry-ldev = dev;
  
+	BUG_ON(IEEE802154_ADDR_LEN != dev-addr_len);

So if somebody make a:
ip link add link $NOT_8BYTE_HWADDR_DEV name $NAME type lowpan

the kernel creates a BUG_ON? Okay it seems that case is very unusual but
better is to return a errno so maybe(I don't know it) the userspace
software will generate a error and the whole kernel doesn't crash.


That line checks the length of the address of the lowpan device, not the 
real device the lowpan device is attached to (and on second look, I 
don't like the Yoda code; didn't notice that before).


Further, running:
ip link add link eth0 name lowpan0 type lowpan

like you suggest, crashes the kernel hard, with or without my patch. 
More stuff to fix




+   memcpy(dev-dev_addr, real_dev-dev_addr, IEEE802154_ADDR_LEN);

Is there one case where we read the dev-dev_addr? I saw a case when we
set this [1], but I don't know why we need this. Did you detect some
problems because this isn't the real hw addr?

Other case is I am feeling uneasy when we have two netdev devices with
the same hw addr.
[1]https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/ieee802154/6lowpan.c?id=refs/tags/v3.12-rc3#n1102


In my testing, lowpan devices need a hardware address which is the same 
as the wpan, or else stuff like ping doesn't work at all. In all the 
examples I've ever seen, the lowpan's hardware address has been set 
manually.


What does your test setup look like that you don't have a hardware 
address for your lowpan device? Are you able to ping other Linux hosts?


Alan.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/2] 6lowpan default hardware address

2013-10-05 Thread Alan Ott
Alexander Aring suggested that devices desired to be linked to 6lowpan
be checked for actually being of type IEEE802154, since IEEE802154 devices
are all that are supported by 6lowpan at present.

Alan Ott (2):
  6lowpan: Only make 6lowpan links to IEEE802154 devices
  6lowpan: Sync default hardware address of lowpan links to their wpan

 net/ieee802154/6lowpan.c | 5 +
 1 file changed, 5 insertions(+)

-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/2] 6lowpan: Sync default hardware address of lowpan links to their wpan

2013-10-05 Thread Alan Ott
When a lowpan link to a wpan device is created, set the hardware address
of the lowpan link to that of the wpan device.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/ieee802154/6lowpan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index 8f56b2b..ff41b4d 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1388,6 +1388,9 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
 
entry-ldev = dev;
 
+   /* Set the lowpan harware address to the wpan hardware address. */
+   memcpy(dev-dev_addr, real_dev-dev_addr, IEEE802154_ADDR_LEN);
+
mutex_lock(lowpan_dev_info(dev)-dev_list_mtx);
INIT_LIST_HEAD(entry-list);
list_add_tail(entry-list, lowpan_devices);
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/2] 6lowpan: Only make 6lowpan links to IEEE802154 devices

2013-10-05 Thread Alan Ott
Refuse to create 6lowpan links if the actual hardware interface is
of any type other than ARPHRD_IEEE802154.

Signed-off-by: Alan Ott a...@signal11.us
Suggested-by: Alexander Aring alex.ar...@gmail.com
---
 net/ieee802154/6lowpan.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index c85e71e..8f56b2b 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1372,6 +1372,8 @@ static int lowpan_newlink(struct net *src_net, struct 
net_device *dev,
real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
if (!real_dev)
return -ENODEV;
+   if (real_dev-type != ARPHRD_IEEE802154)
+   return -EINVAL;
 
lowpan_dev_info(dev)-real_dev = real_dev;
lowpan_dev_info(dev)-fragment_tag = 0;
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 3/3] mrf24j40: Use level-triggered interrupts

2013-10-05 Thread Alan Ott
The mrf24j40 generates level interrupts. There are rare cases where it
appears that the interrupt line never gets de-asserted between interrupts,
causing interrupts to be lost, and causing a hung device from the driver's
perspective.  Switching the driver to interpret these interrupts as
level-triggered fixes this issue.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index c1bc688..0632d34 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -678,7 +678,7 @@ static int mrf24j40_probe(struct spi_device *spi)
ret = request_threaded_irq(spi-irq,
   NULL,
   mrf24j40_isr,
-  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  IRQF_TRIGGER_LOW|IRQF_ONESHOT,
   dev_name(spi-dev),
   devrec);
 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 0/3] Fix race conditions in mrf24j40 interrupts

2013-10-05 Thread Alan Ott
After testing with the betas of this patchset, it's been rebased and is
ready for inclusion.

David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
period of heavy traffic.  Two race conditions were discovered, and the
driver was changed to use threaded interrupts, since the enable/disable of
interrupts in the driver has recently been a lighning rod whenever issues
arise related to interrupts (costing engineering time), and since threaded
interrupts are the right way to do it.

Alan Ott (3):
  mrf24j40: Move INIT_COMPLETION() to before packet transmission
  mrf24j40: Use threaded IRQ handler
  mrf24j40: Use level-triggered interrupts

 drivers/net/ieee802154/mrf24j40.c | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission

2013-10-05 Thread Alan Ott
This avoids a race condition where complete(tx_complete) could be called
before tx_complete is initialized.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 42e6dee..66bb4ce 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -344,6 +344,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
 
+   INIT_COMPLETION(devrec-tx_complete);
+
/* Set TXNTRIG bit of TXNCON to send packet */
ret = read_short_reg(devrec, REG_TXNCON, val);
if (ret)
@@ -354,8 +356,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
-   INIT_COMPLETION(devrec-tx_complete);
-
/* Wait for the device to send the TX complete interrupt. */
ret = wait_for_completion_interruptible_timeout(
devrec-tx_complete,
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 2/3] mrf24j40: Use threaded IRQ handler

2013-10-05 Thread Alan Ott
Eliminate all the workqueue and interrupt enable/disable.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 66bb4ce..c1bc688 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -82,7 +82,6 @@ struct mrf24j40 {
 
struct mutex buffer_mutex; /* only used to protect buf */
struct completion tx_complete;
-   struct work_struct irqwork;
u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
 };
 
@@ -590,17 +589,6 @@ static struct ieee802154_ops mrf24j40_ops = {
 static irqreturn_t mrf24j40_isr(int irq, void *data)
 {
struct mrf24j40 *devrec = data;
-
-   disable_irq_nosync(irq);
-
-   schedule_work(devrec-irqwork);
-
-   return IRQ_HANDLED;
-}
-
-static void mrf24j40_isrwork(struct work_struct *work)
-{
-   struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork);
u8 intstat;
int ret;
 
@@ -618,7 +606,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
mrf24j40_handle_rx(devrec);
 
 out:
-   enable_irq(devrec-spi-irq);
+   return IRQ_HANDLED;
 }
 
 static int mrf24j40_probe(struct spi_device *spi)
@@ -642,7 +630,6 @@ static int mrf24j40_probe(struct spi_device *spi)
 
mutex_init(devrec-buffer_mutex);
init_completion(devrec-tx_complete);
-   INIT_WORK(devrec-irqwork, mrf24j40_isrwork);
devrec-spi = spi;
spi_set_drvdata(spi, devrec);
 
@@ -688,11 +675,12 @@ static int mrf24j40_probe(struct spi_device *spi)
val = ~0x3; /* Clear RX mode (normal) */
write_short_reg(devrec, REG_RXMCR, val);
 
-   ret = request_irq(spi-irq,
- mrf24j40_isr,
- IRQF_TRIGGER_FALLING,
- dev_name(spi-dev),
- devrec);
+   ret = request_threaded_irq(spi-irq,
+  NULL,
+  mrf24j40_isr,
+  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  dev_name(spi-dev),
+  devrec);
 
if (ret) {
dev_err(printdev(devrec), Unable to get IRQ);
@@ -721,7 +709,6 @@ static int mrf24j40_remove(struct spi_device *spi)
dev_dbg(printdev(devrec), remove\n);
 
free_irq(spi-irq, devrec);
-   flush_work(devrec-irqwork); /* TODO: Is this the right call? */
ieee802154_unregister_device(devrec-dev);
ieee802154_free_device(devrec-dev);
/* TODO: Will ieee802154_free_device() wait until -xmit() is
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-23 Thread Alan Ott
On 05/23/2013 01:54 PM, David Hauweele wrote:
> 2013/5/23 Alan Ott :
>> On 5/22/13 4:32 PM, David Hauweele wrote:
>>>
>>> I cannot use level-triggered interrupts with GPIO on the RPi, so I
>>> cannot test this specific patch.
>>
>>
>> Is there another interrupt line you can tie into which does support
>> level-trigger interrupts (INT0 or something)?
> 
> According to the datasheet it should be possible but the bcm2708 port
> does not support it. I've been told that we shouldn't use level-triggered
> interrupts in the first place.


Who is "we," and why shouldn't we use level-triggered interrupts? The
CPU needs to detect whatever type of interrupt the attached hardware
generates.


> 
>>
>>
>>> However I agree with the idea of level-triggered interrupts, that
>>> would fix all major problems related to missed interrupts.
>>>
>>> Beside this I'm running a ping -f since more than two hours now and it
>>> seems to work well.
>>>
>>
>> So that surprises me. I thought level-trigger interrupts were the thing that
>> would fix this problem, and if you're not running with that patch, you just
>> have the INIT_COMPLETION() fix (which you said didn't fix your issue) and
>> the threaded interrupts patch, which I was fairly sure I had determined
>> wasn't fixing any actual race-condition-related problems.
> 
> I should have been more clear about this. I've tested [PATCH 1/3]
> which fixes the race condition with tx_complete. That is the
> INIT_COMPLETION() fix. But it is still possible to miss an interrupt,
> perhaps it just took longer this time. I ran the test again today and
> it failed after 30 minutes.
> 
> I did not test [PATCH 2/3], that is the threaded IRQ. Instead I
> removed interrupt enable/disable from the IRQ handler and the
> workqueue. Without this the driver would fail within seconds of a ping
> -f.


Without what? What do you mean by "without this?" Without the
enable/disable, or without the change that removes the enable/disable?


> Have you observed this too ? Perhaps this problem is specific to
> the bcm2708 port.
> 

What I observe right now is that it seems to work great (ping -f for 6.5
hours) when using the three patches in this patch set on a BeagleBone.

> 
>>
>> I'm glad, but surprised that you're no longer seeing issues.
>>
>> Alan.
>>
>>
>>>
>>> 2013/5/22 Alan Ott :
>>>>
>>>> On 05/21/2013 10:01 PM, Alan Ott wrote:
>>>>>
>>>>> David Hauweele noticed that the mrf24j40 would hang arbitrarily after
>>>>> some
>>>>> period of heavy traffic.  Two race conditions were discovered, and the
>>>>> driver was changed to use threaded interrupts, since the enable/disable
>>>>> of
>>>>> interrupts in the driver has recently been a lighning rod whenever
>>>>> issues
>>>>> arise related to interrupts (costing engineering time), and since
>>>>> threaded
>>>>> interrupts are the right way to do it.
>>>>>
>>>>> Alan Ott (3):
>>>>>mrf24j40: Move INIT_COMPLETION() to before packet transmission
>>>>>mrf24j40: Use threaded IRQ handler
>>>>>mrf24j40: Use level-triggered interrupts
>>>>>
>>>>>   drivers/net/ieee802154/mrf24j40.c | 31 +--
>>>>>   1 file changed, 9 insertions(+), 22 deletions(-)
>>>>
>>>>
>>>> I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and
>>>> it seems solid.
>>>>
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-23 Thread Alan Ott

On 5/22/13 4:32 PM, David Hauweele wrote:

I cannot use level-triggered interrupts with GPIO on the RPi, so I
cannot test this specific patch.


Is there another interrupt line you can tie into which does support 
level-trigger interrupts (INT0 or something)?



However I agree with the idea of level-triggered interrupts, that
would fix all major problems related to missed interrupts.

Beside this I'm running a ping -f since more than two hours now and it
seems to work well.



So that surprises me. I thought level-trigger interrupts were the thing 
that would fix this problem, and if you're not running with that patch, 
you just have the INIT_COMPLETION() fix (which you said didn't fix your 
issue) and the threaded interrupts patch, which I was fairly sure I had 
determined wasn't fixing any actual race-condition-related problems.


I'm glad, but surprised that you're no longer seeing issues.

Alan.



2013/5/22 Alan Ott :

On 05/21/2013 10:01 PM, Alan Ott wrote:

David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
period of heavy traffic.  Two race conditions were discovered, and the
driver was changed to use threaded interrupts, since the enable/disable of
interrupts in the driver has recently been a lighning rod whenever issues
arise related to interrupts (costing engineering time), and since threaded
interrupts are the right way to do it.

Alan Ott (3):
   mrf24j40: Move INIT_COMPLETION() to before packet transmission
   mrf24j40: Use threaded IRQ handler
   mrf24j40: Use level-triggered interrupts

  drivers/net/ieee802154/mrf24j40.c | 31 +--
  1 file changed, 9 insertions(+), 22 deletions(-)


I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and
it seems solid.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-23 Thread Alan Ott

On 5/22/13 4:32 PM, David Hauweele wrote:

I cannot use level-triggered interrupts with GPIO on the RPi, so I
cannot test this specific patch.


Is there another interrupt line you can tie into which does support 
level-trigger interrupts (INT0 or something)?



However I agree with the idea of level-triggered interrupts, that
would fix all major problems related to missed interrupts.

Beside this I'm running a ping -f since more than two hours now and it
seems to work well.



So that surprises me. I thought level-trigger interrupts were the thing 
that would fix this problem, and if you're not running with that patch, 
you just have the INIT_COMPLETION() fix (which you said didn't fix your 
issue) and the threaded interrupts patch, which I was fairly sure I had 
determined wasn't fixing any actual race-condition-related problems.


I'm glad, but surprised that you're no longer seeing issues.

Alan.



2013/5/22 Alan Ott a...@signal11.us:

On 05/21/2013 10:01 PM, Alan Ott wrote:

David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
period of heavy traffic.  Two race conditions were discovered, and the
driver was changed to use threaded interrupts, since the enable/disable of
interrupts in the driver has recently been a lighning rod whenever issues
arise related to interrupts (costing engineering time), and since threaded
interrupts are the right way to do it.

Alan Ott (3):
   mrf24j40: Move INIT_COMPLETION() to before packet transmission
   mrf24j40: Use threaded IRQ handler
   mrf24j40: Use level-triggered interrupts

  drivers/net/ieee802154/mrf24j40.c | 31 +--
  1 file changed, 9 insertions(+), 22 deletions(-)


I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and
it seems solid.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-23 Thread Alan Ott
On 05/23/2013 01:54 PM, David Hauweele wrote:
 2013/5/23 Alan Ott a...@signal11.us:
 On 5/22/13 4:32 PM, David Hauweele wrote:

 I cannot use level-triggered interrupts with GPIO on the RPi, so I
 cannot test this specific patch.


 Is there another interrupt line you can tie into which does support
 level-trigger interrupts (INT0 or something)?
 
 According to the datasheet it should be possible but the bcm2708 port
 does not support it. I've been told that we shouldn't use level-triggered
 interrupts in the first place.


Who is we, and why shouldn't we use level-triggered interrupts? The
CPU needs to detect whatever type of interrupt the attached hardware
generates.


 


 However I agree with the idea of level-triggered interrupts, that
 would fix all major problems related to missed interrupts.

 Beside this I'm running a ping -f since more than two hours now and it
 seems to work well.


 So that surprises me. I thought level-trigger interrupts were the thing that
 would fix this problem, and if you're not running with that patch, you just
 have the INIT_COMPLETION() fix (which you said didn't fix your issue) and
 the threaded interrupts patch, which I was fairly sure I had determined
 wasn't fixing any actual race-condition-related problems.
 
 I should have been more clear about this. I've tested [PATCH 1/3]
 which fixes the race condition with tx_complete. That is the
 INIT_COMPLETION() fix. But it is still possible to miss an interrupt,
 perhaps it just took longer this time. I ran the test again today and
 it failed after 30 minutes.
 
 I did not test [PATCH 2/3], that is the threaded IRQ. Instead I
 removed interrupt enable/disable from the IRQ handler and the
 workqueue. Without this the driver would fail within seconds of a ping
 -f.


Without what? What do you mean by without this? Without the
enable/disable, or without the change that removes the enable/disable?


 Have you observed this too ? Perhaps this problem is specific to
 the bcm2708 port.
 

What I observe right now is that it seems to work great (ping -f for 6.5
hours) when using the three patches in this patch set on a BeagleBone.

 

 I'm glad, but surprised that you're no longer seeing issues.

 Alan.



 2013/5/22 Alan Ott a...@signal11.us:

 On 05/21/2013 10:01 PM, Alan Ott wrote:

 David Hauweele noticed that the mrf24j40 would hang arbitrarily after
 some
 period of heavy traffic.  Two race conditions were discovered, and the
 driver was changed to use threaded interrupts, since the enable/disable
 of
 interrupts in the driver has recently been a lighning rod whenever
 issues
 arise related to interrupts (costing engineering time), and since
 threaded
 interrupts are the right way to do it.

 Alan Ott (3):
mrf24j40: Move INIT_COMPLETION() to before packet transmission
mrf24j40: Use threaded IRQ handler
mrf24j40: Use level-triggered interrupts

   drivers/net/ieee802154/mrf24j40.c | 31 +--
   1 file changed, 9 insertions(+), 22 deletions(-)


 I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and
 it seems solid.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-21 Thread Alan Ott
On 05/21/2013 10:01 PM, Alan Ott wrote:
> David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
> period of heavy traffic.  Two race conditions were discovered, and the
> driver was changed to use threaded interrupts, since the enable/disable of
> interrupts in the driver has recently been a lighning rod whenever issues
> arise related to interrupts (costing engineering time), and since threaded
> interrupts are the right way to do it.
>
> Alan Ott (3):
>   mrf24j40: Move INIT_COMPLETION() to before packet transmission
>   mrf24j40: Use threaded IRQ handler
>   mrf24j40: Use level-triggered interrupts
>
>  drivers/net/ieee802154/mrf24j40.c | 31 +--
>  1 file changed, 9 insertions(+), 22 deletions(-)

I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and
it seems solid.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-21 Thread Alan Ott
David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
period of heavy traffic.  Two race conditions were discovered, and the
driver was changed to use threaded interrupts, since the enable/disable of
interrupts in the driver has recently been a lighning rod whenever issues
arise related to interrupts (costing engineering time), and since threaded
interrupts are the right way to do it.

Alan Ott (3):
  mrf24j40: Move INIT_COMPLETION() to before packet transmission
  mrf24j40: Use threaded IRQ handler
  mrf24j40: Use level-triggered interrupts

 drivers/net/ieee802154/mrf24j40.c | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission

2013-05-21 Thread Alan Ott
This avoids a race condition where complete(tx_complete) could be called
before tx_complete is initialized.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 556151d..0ea2a5a 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
 
+   INIT_COMPLETION(devrec->tx_complete);
+
/* Set TXNTRIG bit of TXNCON to send packet */
ret = read_short_reg(devrec, REG_TXNCON, );
if (ret)
@@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
-   INIT_COMPLETION(devrec->tx_complete);
-
/* Wait for the device to send the TX complete interrupt. */
ret = wait_for_completion_interruptible_timeout(
>tx_complete,
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 2/3] mrf24j40: Use threaded IRQ handler

2013-05-21 Thread Alan Ott
Eliminate all the workqueue and interrupt enable/disable.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 0ea2a5a..a55ab8d 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -83,7 +83,6 @@ struct mrf24j40 {
 
struct mutex buffer_mutex; /* only used to protect buf */
struct completion tx_complete;
-   struct work_struct irqwork;
u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
 };
 
@@ -591,17 +590,6 @@ static struct ieee802154_ops mrf24j40_ops = {
 static irqreturn_t mrf24j40_isr(int irq, void *data)
 {
struct mrf24j40 *devrec = data;
-
-   disable_irq_nosync(irq);
-
-   schedule_work(>irqwork);
-
-   return IRQ_HANDLED;
-}
-
-static void mrf24j40_isrwork(struct work_struct *work)
-{
-   struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork);
u8 intstat;
int ret;
 
@@ -619,7 +607,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
mrf24j40_handle_rx(devrec);
 
 out:
-   enable_irq(devrec->spi->irq);
+   return IRQ_HANDLED;
 }
 
 static int mrf24j40_probe(struct spi_device *spi)
@@ -649,7 +637,6 @@ static int mrf24j40_probe(struct spi_device *spi)
 
mutex_init(>buffer_mutex);
init_completion(>tx_complete);
-   INIT_WORK(>irqwork, mrf24j40_isrwork);
devrec->spi = spi;
dev_set_drvdata(>dev, devrec);
 
@@ -695,11 +682,12 @@ static int mrf24j40_probe(struct spi_device *spi)
val &= ~0x3; /* Clear RX mode (normal) */
write_short_reg(devrec, REG_RXMCR, val);
 
-   ret = request_irq(spi->irq,
- mrf24j40_isr,
- IRQF_TRIGGER_FALLING,
- dev_name(>dev),
- devrec);
+   ret = request_threaded_irq(spi->irq,
+  NULL,
+  mrf24j40_isr,
+  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  dev_name(>dev),
+  devrec);
 
if (ret) {
dev_err(printdev(devrec), "Unable to get IRQ");
@@ -728,7 +716,6 @@ static int mrf24j40_remove(struct spi_device *spi)
dev_dbg(printdev(devrec), "remove\n");
 
free_irq(spi->irq, devrec);
-   flush_work(>irqwork); /* TODO: Is this the right call? */
ieee802154_unregister_device(devrec->dev);
ieee802154_free_device(devrec->dev);
/* TODO: Will ieee802154_free_device() wait until ->xmit() is
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 3/3] mrf24j40: Use level-triggered interrupts

2013-05-21 Thread Alan Ott
The mrf24j40 generates level interrupts. There are rare cases where it
appears that the interrupt line never gets de-asserted between interrupts,
causing interrupts to be lost, and causing a hung device from the driver's
perspective.  Switching the driver to interpret these interrupts as
level-triggered fixes this issue.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index a55ab8d..d59dbff 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -685,7 +685,7 @@ static int mrf24j40_probe(struct spi_device *spi)
ret = request_threaded_irq(spi->irq,
   NULL,
   mrf24j40_isr,
-  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  IRQF_TRIGGER_LOW|IRQF_ONESHOT,
   dev_name(>dev),
   devrec);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled

2013-05-21 Thread Alan Ott
On 05/21/2013 12:17 PM, David Hauweele wrote:
> 2013/5/20 Alan Ott :
>> On 05/16/2013 05:34 PM, David Hauweele wrote:
>>> I have seen the interrupt line going low forever with heavy traffic.
>> I've been doing some testing today and I can reproduce your issue if I
>> ping -f both ways simultaneously (after about 3-4 minutes usually). I
>> think it has more to do with the tx/rx mutual exclusion that you
>> described in patch 1/1 than it does with the disable/enable IRQ. With
>> respect to enable/disable irq, I did some checking of other similar
>> drivers (non-ieee802154) and noticed that many of them use threaded
>> interrupts. I think this may be closer to the right way to do it.
> Indeed, threaded interrupts would be more appropriate.
> What about spi_async ? We could avoid using any working queue with this call.
> Although the driver would need a major rewrite and I've no clue about
> the benefits of doing so.
> I wonder how much additional latency is introduced by the working
> queue, I will try to measure this today.

Async would be really difficult since we need the values that are read
in order to know what to write in later calls (like where we need to set
one bit in a register). It's hard to imagine doing this without blocking
somewhere, and we might as well use the blocking mechanisms that the
kernel gives us.

>> Are you running a tickless kernel? What is your preemption model? What's
>> your hardware platform?
> This is a tickless kernel with CONFIG_PREEMPT on an RPi.

As another data point, can you try with voluntary preemption and see if
the behavior changes?

>>> The at86rf230 driver has two isr, one for edge-triggered and another
>>> for level-triggered interrupt. The latter uses
>>> disable_irq_nosync/enable_irq which makes sense for level-triggered
>>> interrupt. Otherwise the interrupt would be triggered again and again
>>> until the scheduled work read the status register.
>>>
>>> However I don't see the use of disable_irq/enable_irq with an
>>> edge-triggered interrupt. Perhaps I'm missing something. Though I
>>> don't see any harm using it anyway.
>> My understanding based on the datasheet is that the interrupt can be
>> asserted again as soon as INTSTAT is read (which is done in the
>> workqueue). I guess even if it happens while the workqueue is running,
>> it's ok.
> You're right, as soon as INTSTAT is read, the line should be driven
> high again to assert another interrupt.

This is where I think I've found another race condition. In doing some
testing with the mrf24j40 from PIC24 firmware last night, I think I've
discovered that it's possible for the interrupt line to _not_ get reset
(de-asserted). I think what may be causing it is if an interrupt occurs
at the same time as (or right shortly after) INTSTAT is read. It's easy
to imagine the interrupt line not being able to get all the way up
before coming back down. I have a support email into Microchip asking
specifically about this case.

In my firmware, I'm able to see a case where the INTSTAT register shows
non-zero (ie: an interrupt is pending), yet the interrupt flag is not
set. It seems like it takes about 30 minutes for this to happen. On
Linux, I think this is less likely, but still possible, since there is
more latency between the hardware interrupt and when INTSTAT is read.
The extra latency means there is more time for the second interrupt to
happen, causing INTSTAT to simply show both interrupts, and have less
likelihood of the second interrupt happening exactly as INTSTAT is read.

> So it shouldn't pose any problem if the interrupts are replayed by the
> kernel. What may happen however is
> an error in the spi transfer. This would result in the line staying
> low since the register hasn't really been read.
> This is what I observed at the hardware level.

To be clear, you've observed the line staying low, not an SPI transfer
error, right?

> If this is the case, then the problem should also occurs with and
> without disable/enable_irq. However this bug
> is a bit harder to reproduce as it may take several hours. The only
> solution I see would be to regularly poll the
> INTSTAT register for recovering. I know about other drivers which do
> this so I will look into this and work on a fix.

I'd like to avoid a straight-up poll if at all possible. Worst case, we
can poll for interrupt in the case of something happening which we don't
expect to ever happen, for example on something like the TX interrupt
never showing up. It's harder of course to know when to poll for a
system that's primarily a receiver.
 
I've done something like this in my firmware:

void mrf24j40_isr(void)
{
u8 intstat;
int ret;

/* Read the interrupt status */
ret = read_shor

Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled

2013-05-21 Thread Alan Ott
On 05/21/2013 12:17 PM, David Hauweele wrote:
 2013/5/20 Alan Ott a...@signal11.us:
 On 05/16/2013 05:34 PM, David Hauweele wrote:
 I have seen the interrupt line going low forever with heavy traffic.
 I've been doing some testing today and I can reproduce your issue if I
 ping -f both ways simultaneously (after about 3-4 minutes usually). I
 think it has more to do with the tx/rx mutual exclusion that you
 described in patch 1/1 than it does with the disable/enable IRQ. With
 respect to enable/disable irq, I did some checking of other similar
 drivers (non-ieee802154) and noticed that many of them use threaded
 interrupts. I think this may be closer to the right way to do it.
 Indeed, threaded interrupts would be more appropriate.
 What about spi_async ? We could avoid using any working queue with this call.
 Although the driver would need a major rewrite and I've no clue about
 the benefits of doing so.
 I wonder how much additional latency is introduced by the working
 queue, I will try to measure this today.

Async would be really difficult since we need the values that are read
in order to know what to write in later calls (like where we need to set
one bit in a register). It's hard to imagine doing this without blocking
somewhere, and we might as well use the blocking mechanisms that the
kernel gives us.

 Are you running a tickless kernel? What is your preemption model? What's
 your hardware platform?
 This is a tickless kernel with CONFIG_PREEMPT on an RPi.

As another data point, can you try with voluntary preemption and see if
the behavior changes?

 The at86rf230 driver has two isr, one for edge-triggered and another
 for level-triggered interrupt. The latter uses
 disable_irq_nosync/enable_irq which makes sense for level-triggered
 interrupt. Otherwise the interrupt would be triggered again and again
 until the scheduled work read the status register.

 However I don't see the use of disable_irq/enable_irq with an
 edge-triggered interrupt. Perhaps I'm missing something. Though I
 don't see any harm using it anyway.
 My understanding based on the datasheet is that the interrupt can be
 asserted again as soon as INTSTAT is read (which is done in the
 workqueue). I guess even if it happens while the workqueue is running,
 it's ok.
 You're right, as soon as INTSTAT is read, the line should be driven
 high again to assert another interrupt.

This is where I think I've found another race condition. In doing some
testing with the mrf24j40 from PIC24 firmware last night, I think I've
discovered that it's possible for the interrupt line to _not_ get reset
(de-asserted). I think what may be causing it is if an interrupt occurs
at the same time as (or right shortly after) INTSTAT is read. It's easy
to imagine the interrupt line not being able to get all the way up
before coming back down. I have a support email into Microchip asking
specifically about this case.

In my firmware, I'm able to see a case where the INTSTAT register shows
non-zero (ie: an interrupt is pending), yet the interrupt flag is not
set. It seems like it takes about 30 minutes for this to happen. On
Linux, I think this is less likely, but still possible, since there is
more latency between the hardware interrupt and when INTSTAT is read.
The extra latency means there is more time for the second interrupt to
happen, causing INTSTAT to simply show both interrupts, and have less
likelihood of the second interrupt happening exactly as INTSTAT is read.

 So it shouldn't pose any problem if the interrupts are replayed by the
 kernel. What may happen however is
 an error in the spi transfer. This would result in the line staying
 low since the register hasn't really been read.
 This is what I observed at the hardware level.

To be clear, you've observed the line staying low, not an SPI transfer
error, right?

 If this is the case, then the problem should also occurs with and
 without disable/enable_irq. However this bug
 is a bit harder to reproduce as it may take several hours. The only
 solution I see would be to regularly poll the
 INTSTAT register for recovering. I know about other drivers which do
 this so I will look into this and work on a fix.

I'd like to avoid a straight-up poll if at all possible. Worst case, we
can poll for interrupt in the case of something happening which we don't
expect to ever happen, for example on something like the TX interrupt
never showing up. It's harder of course to know when to poll for a
system that's primarily a receiver.
 
I've done something like this in my firmware:

void mrf24j40_isr(void)
{
u8 intstat;
int ret;

/* Read the interrupt status */
ret = read_short_reg(devrec, REG_INTSTAT, intstat);
if (ret)
return;

do {
/* Check for TX complete */
if (intstat  0x1)
tx_complete = 1;

/* Check for Rx */
if (intstat  0x8

[PATCH beta 1] 3/3] mrf24j40: Use level-triggered interrupts

2013-05-21 Thread Alan Ott
The mrf24j40 generates level interrupts. There are rare cases where it
appears that the interrupt line never gets de-asserted between interrupts,
causing interrupts to be lost, and causing a hung device from the driver's
perspective.  Switching the driver to interpret these interrupts as
level-triggered fixes this issue.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index a55ab8d..d59dbff 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -685,7 +685,7 @@ static int mrf24j40_probe(struct spi_device *spi)
ret = request_threaded_irq(spi-irq,
   NULL,
   mrf24j40_isr,
-  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  IRQF_TRIGGER_LOW|IRQF_ONESHOT,
   dev_name(spi-dev),
   devrec);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 2/3] mrf24j40: Use threaded IRQ handler

2013-05-21 Thread Alan Ott
Eliminate all the workqueue and interrupt enable/disable.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 0ea2a5a..a55ab8d 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -83,7 +83,6 @@ struct mrf24j40 {
 
struct mutex buffer_mutex; /* only used to protect buf */
struct completion tx_complete;
-   struct work_struct irqwork;
u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
 };
 
@@ -591,17 +590,6 @@ static struct ieee802154_ops mrf24j40_ops = {
 static irqreturn_t mrf24j40_isr(int irq, void *data)
 {
struct mrf24j40 *devrec = data;
-
-   disable_irq_nosync(irq);
-
-   schedule_work(devrec-irqwork);
-
-   return IRQ_HANDLED;
-}
-
-static void mrf24j40_isrwork(struct work_struct *work)
-{
-   struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork);
u8 intstat;
int ret;
 
@@ -619,7 +607,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
mrf24j40_handle_rx(devrec);
 
 out:
-   enable_irq(devrec-spi-irq);
+   return IRQ_HANDLED;
 }
 
 static int mrf24j40_probe(struct spi_device *spi)
@@ -649,7 +637,6 @@ static int mrf24j40_probe(struct spi_device *spi)
 
mutex_init(devrec-buffer_mutex);
init_completion(devrec-tx_complete);
-   INIT_WORK(devrec-irqwork, mrf24j40_isrwork);
devrec-spi = spi;
dev_set_drvdata(spi-dev, devrec);
 
@@ -695,11 +682,12 @@ static int mrf24j40_probe(struct spi_device *spi)
val = ~0x3; /* Clear RX mode (normal) */
write_short_reg(devrec, REG_RXMCR, val);
 
-   ret = request_irq(spi-irq,
- mrf24j40_isr,
- IRQF_TRIGGER_FALLING,
- dev_name(spi-dev),
- devrec);
+   ret = request_threaded_irq(spi-irq,
+  NULL,
+  mrf24j40_isr,
+  IRQF_TRIGGER_FALLING|IRQF_ONESHOT,
+  dev_name(spi-dev),
+  devrec);
 
if (ret) {
dev_err(printdev(devrec), Unable to get IRQ);
@@ -728,7 +716,6 @@ static int mrf24j40_remove(struct spi_device *spi)
dev_dbg(printdev(devrec), remove\n);
 
free_irq(spi-irq, devrec);
-   flush_work(devrec-irqwork); /* TODO: Is this the right call? */
ieee802154_unregister_device(devrec-dev);
ieee802154_free_device(devrec-dev);
/* TODO: Will ieee802154_free_device() wait until -xmit() is
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 1/3] mrf24j40: Move INIT_COMPLETION() to before packet transmission

2013-05-21 Thread Alan Ott
This avoids a race condition where complete(tx_complete) could be called
before tx_complete is initialized.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 556151d..0ea2a5a 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
 
+   INIT_COMPLETION(devrec-tx_complete);
+
/* Set TXNTRIG bit of TXNCON to send packet */
ret = read_short_reg(devrec, REG_TXNCON, val);
if (ret)
@@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
-   INIT_COMPLETION(devrec-tx_complete);
-
/* Wait for the device to send the TX complete interrupt. */
ret = wait_for_completion_interruptible_timeout(
devrec-tx_complete,
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-21 Thread Alan Ott
David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
period of heavy traffic.  Two race conditions were discovered, and the
driver was changed to use threaded interrupts, since the enable/disable of
interrupts in the driver has recently been a lighning rod whenever issues
arise related to interrupts (costing engineering time), and since threaded
interrupts are the right way to do it.

Alan Ott (3):
  mrf24j40: Move INIT_COMPLETION() to before packet transmission
  mrf24j40: Use threaded IRQ handler
  mrf24j40: Use level-triggered interrupts

 drivers/net/ieee802154/mrf24j40.c | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH beta 1] 0/3] Fix race conditions in mrf24j40 interrupts

2013-05-21 Thread Alan Ott
On 05/21/2013 10:01 PM, Alan Ott wrote:
 David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
 period of heavy traffic.  Two race conditions were discovered, and the
 driver was changed to use threaded interrupts, since the enable/disable of
 interrupts in the driver has recently been a lighning rod whenever issues
 arise related to interrupts (costing engineering time), and since threaded
 interrupts are the right way to do it.

 Alan Ott (3):
   mrf24j40: Move INIT_COMPLETION() to before packet transmission
   mrf24j40: Use threaded IRQ handler
   mrf24j40: Use level-triggered interrupts

  drivers/net/ieee802154/mrf24j40.c | 31 +--
  1 file changed, 9 insertions(+), 22 deletions(-)

I forgot to add, I ran ping -f both ways all afternoon (6.5 hours), and
it seems solid.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH testing] mrf24j40: Move INIT_COMPLETION to before the packet is sent

2013-05-19 Thread Alan Ott
This avoids a race condition where the tx_complete interrupt could happen
before the completion is initialized.

---

David H,

Try this out and see if it changes anything for you. I put a bunch of
schedule_timeout()s in mrf24j40_tx() to basically guarantee that a reception
will happen during mrf24j40_tx(), and it seems to work. It would not work
before with said schedule_timeout() calls.

Alan.

---
 drivers/net/ieee802154/mrf24j40.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 556151d..0ea2a5a 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
 
+   INIT_COMPLETION(devrec->tx_complete);
+
/* Set TXNTRIG bit of TXNCON to send packet */
ret = read_short_reg(devrec, REG_TXNCON, );
if (ret)
@@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
-   INIT_COMPLETION(devrec->tx_complete);
-
/* Wait for the device to send the TX complete interrupt. */
ret = wait_for_completion_interruptible_timeout(
>tx_complete,
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled

2013-05-19 Thread Alan Ott
On 05/16/2013 05:34 PM, David Hauweele wrote:
> I have seen the interrupt line going low forever with heavy traffic.

Hi David,

I've been doing some testing today and I can reproduce your issue if I
ping -f both ways simultaneously (after about 3-4 minutes usually). I
think it has more to do with the tx/rx mutual exclusion that you
described in patch 1/1 than it does with the disable/enable IRQ. With
respect to enable/disable irq, I did some checking of other similar
drivers (non-ieee802154) and noticed that many of them use threaded
interrupts. I think this may be closer to the right way to do it.

Are you running a tickless kernel? What is your preemption model? What's
your hardware platform?

> The at86rf230 driver has two isr, one for edge-triggered and another
> for level-triggered interrupt. The latter uses
> disable_irq_nosync/enable_irq which makes sense for level-triggered
> interrupt. Otherwise the interrupt would be triggered again and again
> until the scheduled work read the status register.
>
> However I don't see the use of disable_irq/enable_irq with an
> edge-triggered interrupt. Perhaps I'm missing something. Though I
> don't see any harm using it anyway.

My understanding based on the datasheet is that the interrupt can be
asserted again as soon as INTSTAT is read (which is done in the
workqueue). I guess even if it happens while the workqueue is running,
it's ok.

I think I had a flawed understanding of schedule_work() before, thinking
that it would not schedule a work_struct it if the work_struct was running.

>  As you said the interrupt should
> be delayed until enable_irq() is called. In particular when
> CONFIG_HARDIRQS_SW_RESEND is set.

I think I agree with you. I'll send you a patch to try with threaded
interrupts to try.

I'm trying to determine exactly what's the cause of the interrupt line
being stuck low.

Alan.

>
> 2013/5/14 Alan Ott :
>> On 5/9/13 11:19 AM, David Hauweele wrote:
>>> Disabling the interrupt line could miss an IRQ and leave the line into a
>>> low state hence locking the driver.
>>>
>> Have you observed this? My understanding is that the interrupt won't be lost
>> but instead delayed until enable_irq() is called.
>>
>> I got this pattern from the other 802.15.4 drivers. Perhaps my understanding
>> is wrong.
>>
>>
>>
>>> Signed-off-by: David Hauweele 
>>> ---
>>>   drivers/net/ieee802154/mrf24j40.c |7 +--
>>>   1 file changed, 1 insertion(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/net/ieee802154/mrf24j40.c
>>> b/drivers/net/ieee802154/mrf24j40.c
>>> index 1e3ddf3..6ef32f7 100644
>>> --- a/drivers/net/ieee802154/mrf24j40.c
>>> +++ b/drivers/net/ieee802154/mrf24j40.c
>>> @@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data)
>>>   {
>>> struct mrf24j40 *devrec = data;
>>>
>>> -   disable_irq_nosync(irq);
>>> -
>>> schedule_work(>irqwork);
>>>
>>> return IRQ_HANDLED;
>>> @@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
>>> /* Read the interrupt status */
>>> ret = read_short_reg(devrec, REG_INTSTAT, );
>>> if (ret)
>>> -   goto out;
>>> +   return;
>>>
>>> /* Check for TX complete */
>>> if (intstat & 0x1)
>>> @@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work)
>>> /* Check for Rx */
>>> if (intstat & 0x8)
>>> schedule_work(>rxwork);
>>> -
>>> -out:
>>> -   enable_irq(devrec->spi->irq);
>>>   }
>>>
>>>   static void mrf24j40_rxwork(struct work_struct *work)
>>>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled

2013-05-19 Thread Alan Ott
On 05/16/2013 05:34 PM, David Hauweele wrote:
 I have seen the interrupt line going low forever with heavy traffic.

Hi David,

I've been doing some testing today and I can reproduce your issue if I
ping -f both ways simultaneously (after about 3-4 minutes usually). I
think it has more to do with the tx/rx mutual exclusion that you
described in patch 1/1 than it does with the disable/enable IRQ. With
respect to enable/disable irq, I did some checking of other similar
drivers (non-ieee802154) and noticed that many of them use threaded
interrupts. I think this may be closer to the right way to do it.

Are you running a tickless kernel? What is your preemption model? What's
your hardware platform?

 The at86rf230 driver has two isr, one for edge-triggered and another
 for level-triggered interrupt. The latter uses
 disable_irq_nosync/enable_irq which makes sense for level-triggered
 interrupt. Otherwise the interrupt would be triggered again and again
 until the scheduled work read the status register.

 However I don't see the use of disable_irq/enable_irq with an
 edge-triggered interrupt. Perhaps I'm missing something. Though I
 don't see any harm using it anyway.

My understanding based on the datasheet is that the interrupt can be
asserted again as soon as INTSTAT is read (which is done in the
workqueue). I guess even if it happens while the workqueue is running,
it's ok.

I think I had a flawed understanding of schedule_work() before, thinking
that it would not schedule a work_struct it if the work_struct was running.

  As you said the interrupt should
 be delayed until enable_irq() is called. In particular when
 CONFIG_HARDIRQS_SW_RESEND is set.

I think I agree with you. I'll send you a patch to try with threaded
interrupts to try.

I'm trying to determine exactly what's the cause of the interrupt line
being stuck low.

Alan.


 2013/5/14 Alan Ott a...@signal11.us:
 On 5/9/13 11:19 AM, David Hauweele wrote:
 Disabling the interrupt line could miss an IRQ and leave the line into a
 low state hence locking the driver.

 Have you observed this? My understanding is that the interrupt won't be lost
 but instead delayed until enable_irq() is called.

 I got this pattern from the other 802.15.4 drivers. Perhaps my understanding
 is wrong.



 Signed-off-by: David Hauweele da...@hauweele.net
 ---
   drivers/net/ieee802154/mrf24j40.c |7 +--
   1 file changed, 1 insertion(+), 6 deletions(-)

 diff --git a/drivers/net/ieee802154/mrf24j40.c
 b/drivers/net/ieee802154/mrf24j40.c
 index 1e3ddf3..6ef32f7 100644
 --- a/drivers/net/ieee802154/mrf24j40.c
 +++ b/drivers/net/ieee802154/mrf24j40.c
 @@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data)
   {
 struct mrf24j40 *devrec = data;

 -   disable_irq_nosync(irq);
 -
 schedule_work(devrec-irqwork);

 return IRQ_HANDLED;
 @@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
 /* Read the interrupt status */
 ret = read_short_reg(devrec, REG_INTSTAT, intstat);
 if (ret)
 -   goto out;
 +   return;

 /* Check for TX complete */
 if (intstat  0x1)
 @@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work)
 /* Check for Rx */
 if (intstat  0x8)
 schedule_work(devrec-rxwork);
 -
 -out:
 -   enable_irq(devrec-spi-irq);
   }

   static void mrf24j40_rxwork(struct work_struct *work)



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH testing] mrf24j40: Move INIT_COMPLETION to before the packet is sent

2013-05-19 Thread Alan Ott
This avoids a race condition where the tx_complete interrupt could happen
before the completion is initialized.

---

David H,

Try this out and see if it changes anything for you. I put a bunch of
schedule_timeout()s in mrf24j40_tx() to basically guarantee that a reception
will happen during mrf24j40_tx(), and it seems to work. It would not work
before with said schedule_timeout() calls.

Alan.

---
 drivers/net/ieee802154/mrf24j40.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 556151d..0ea2a5a 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -345,6 +345,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
 
+   INIT_COMPLETION(devrec-tx_complete);
+
/* Set TXNTRIG bit of TXNCON to send packet */
ret = read_short_reg(devrec, REG_TXNCON, val);
if (ret)
@@ -355,8 +357,6 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
-   INIT_COMPLETION(devrec-tx_complete);
-
/* Wait for the device to send the TX complete interrupt. */
ret = wait_for_completion_interruptible_timeout(
devrec-tx_complete,
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled

2013-05-13 Thread Alan Ott

On 5/9/13 11:19 AM, David Hauweele wrote:

Disabling the interrupt line could miss an IRQ and leave the line into a
low state hence locking the driver.



Have you observed this? My understanding is that the interrupt won't be 
lost but instead delayed until enable_irq() is called.


I got this pattern from the other 802.15.4 drivers. Perhaps my 
understanding is wrong.




Signed-off-by: David Hauweele 
---
  drivers/net/ieee802154/mrf24j40.c |7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 1e3ddf3..6ef32f7 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data)
  {
struct mrf24j40 *devrec = data;

-   disable_irq_nosync(irq);
-
schedule_work(>irqwork);

return IRQ_HANDLED;
@@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
/* Read the interrupt status */
ret = read_short_reg(devrec, REG_INTSTAT, );
if (ret)
-   goto out;
+   return;

/* Check for TX complete */
if (intstat & 0x1)
@@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work)
/* Check for Rx */
if (intstat & 0x8)
schedule_work(>rxwork);
-
-out:
-   enable_irq(devrec->spi->irq);
  }

  static void mrf24j40_rxwork(struct work_struct *work)



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 1/2] mrf24j40: Avoid transmission while receiving a frame

2013-05-13 Thread Alan Ott

On 5/9/13 11:19 AM, David Hauweele wrote:

The transceiver may fail under heavy traffic when a frame is transmitted
while receiving another frame.

This patch uses a mutex to separate the transmission and reception of
frames along with a secondary working queue to avoid a deadlock while
waiting for the transmission interrupt.



Have you observed this failure? I have put this kind of thing in before 
(several times actually) but ultimately convinced myself each time that 
it was not necessary. I'm happy to be shown wrong.




Signed-off-by: David Hauweele 
---
  drivers/net/ieee802154/mrf24j40.c |   22 +-
  1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index ede3ce4..1e3ddf3 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -82,8 +82,10 @@ struct mrf24j40 {
struct ieee802154_dev *dev;

struct mutex buffer_mutex; /* only used to protect buf */
+   struct mutex txrx_mutex; /* avoid transmission while receiving a frame 
*/
struct completion tx_complete;
struct work_struct irqwork;
+   struct work_struct rxwork;
u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
  };

@@ -353,6 +355,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
/* Set TXNACKREQ if the ACK bit is set in the packet. */
if (skb->data[0] & IEEE802154_FC_ACK_REQ)
val |= 0x4;
+
+   mutex_lock(>txrx_mutex);
write_short_reg(devrec, REG_TXNCON, val);

INIT_COMPLETION(devrec->tx_complete);
@@ -361,6 +365,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
ret = wait_for_completion_interruptible_timeout(
>tx_complete,
5 * HZ);
+
+   mutex_unlock(>txrx_mutex);
+
if (ret == -ERESTARTSYS)
goto err;
if (ret == 0) {
@@ -535,6 +542,8 @@ static int mrf24j40_handle_rx(struct mrf24j40 *devrec)
int ret = 0;
struct sk_buff *skb;

+   mutex_lock(>txrx_mutex);
+
/* Turn off reception of packets off the air. This prevents the
 * device from overwriting the buffer while we're reading it. */
ret = read_short_reg(devrec, REG_BBREG1, );
@@ -575,6 +584,8 @@ out:
val &= ~0x4; /* Clear RXDECINV */
write_short_reg(devrec, REG_BBREG1, val);

+   mutex_unlock(>txrx_mutex);
+
return ret;
  }

@@ -616,12 +627,19 @@ static void mrf24j40_isrwork(struct work_struct *work)

/* Check for Rx */
if (intstat & 0x8)
-   mrf24j40_handle_rx(devrec);
+   schedule_work(>rxwork);



mrf24f40_isrwork() is already called from a workqueue.


  out:
enable_irq(devrec->spi->irq);
  }

+static void mrf24j40_rxwork(struct work_struct *work)
+{
+   struct mrf24j40 *devrec = container_of(work, struct mrf24j40, rxwork);
+
+   mrf24j40_handle_rx(devrec);
+}
+
  static int mrf24j40_probe(struct spi_device *spi)
  {
int ret = -ENOMEM;
@@ -648,8 +666,10 @@ static int mrf24j40_probe(struct spi_device *spi)
spi->max_speed_hz = MAX_SPI_SPEED_HZ;

mutex_init(>buffer_mutex);
+   mutex_init(>txrx_mutex);
init_completion(>tx_complete);
INIT_WORK(>irqwork, mrf24j40_isrwork);
+   INIT_WORK(>rxwork, mrf24j40_rxwork);
devrec->spi = spi;
spi_set_drvdata(spi, devrec);




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 1/2] mrf24j40: Avoid transmission while receiving a frame

2013-05-13 Thread Alan Ott

On 5/9/13 11:19 AM, David Hauweele wrote:

The transceiver may fail under heavy traffic when a frame is transmitted
while receiving another frame.

This patch uses a mutex to separate the transmission and reception of
frames along with a secondary working queue to avoid a deadlock while
waiting for the transmission interrupt.



Have you observed this failure? I have put this kind of thing in before 
(several times actually) but ultimately convinced myself each time that 
it was not necessary. I'm happy to be shown wrong.




Signed-off-by: David Hauweele da...@hauweele.net
---
  drivers/net/ieee802154/mrf24j40.c |   22 +-
  1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index ede3ce4..1e3ddf3 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -82,8 +82,10 @@ struct mrf24j40 {
struct ieee802154_dev *dev;

struct mutex buffer_mutex; /* only used to protect buf */
+   struct mutex txrx_mutex; /* avoid transmission while receiving a frame 
*/
struct completion tx_complete;
struct work_struct irqwork;
+   struct work_struct rxwork;
u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
  };

@@ -353,6 +355,8 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
/* Set TXNACKREQ if the ACK bit is set in the packet. */
if (skb-data[0]  IEEE802154_FC_ACK_REQ)
val |= 0x4;
+
+   mutex_lock(devrec-txrx_mutex);
write_short_reg(devrec, REG_TXNCON, val);

INIT_COMPLETION(devrec-tx_complete);
@@ -361,6 +365,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
ret = wait_for_completion_interruptible_timeout(
devrec-tx_complete,
5 * HZ);
+
+   mutex_unlock(devrec-txrx_mutex);
+
if (ret == -ERESTARTSYS)
goto err;
if (ret == 0) {
@@ -535,6 +542,8 @@ static int mrf24j40_handle_rx(struct mrf24j40 *devrec)
int ret = 0;
struct sk_buff *skb;

+   mutex_lock(devrec-txrx_mutex);
+
/* Turn off reception of packets off the air. This prevents the
 * device from overwriting the buffer while we're reading it. */
ret = read_short_reg(devrec, REG_BBREG1, val);
@@ -575,6 +584,8 @@ out:
val = ~0x4; /* Clear RXDECINV */
write_short_reg(devrec, REG_BBREG1, val);

+   mutex_unlock(devrec-txrx_mutex);
+
return ret;
  }

@@ -616,12 +627,19 @@ static void mrf24j40_isrwork(struct work_struct *work)

/* Check for Rx */
if (intstat  0x8)
-   mrf24j40_handle_rx(devrec);
+   schedule_work(devrec-rxwork);



mrf24f40_isrwork() is already called from a workqueue.


  out:
enable_irq(devrec-spi-irq);
  }

+static void mrf24j40_rxwork(struct work_struct *work)
+{
+   struct mrf24j40 *devrec = container_of(work, struct mrf24j40, rxwork);
+
+   mrf24j40_handle_rx(devrec);
+}
+
  static int mrf24j40_probe(struct spi_device *spi)
  {
int ret = -ENOMEM;
@@ -648,8 +666,10 @@ static int mrf24j40_probe(struct spi_device *spi)
spi-max_speed_hz = MAX_SPI_SPEED_HZ;

mutex_init(devrec-buffer_mutex);
+   mutex_init(devrec-txrx_mutex);
init_completion(devrec-tx_complete);
INIT_WORK(devrec-irqwork, mrf24j40_isrwork);
+   INIT_WORK(devrec-rxwork, mrf24j40_rxwork);
devrec-spi = spi;
spi_set_drvdata(spi, devrec);




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 2/2] mrf24j40: Keep the interrupt line enabled

2013-05-13 Thread Alan Ott

On 5/9/13 11:19 AM, David Hauweele wrote:

Disabling the interrupt line could miss an IRQ and leave the line into a
low state hence locking the driver.



Have you observed this? My understanding is that the interrupt won't be 
lost but instead delayed until enable_irq() is called.


I got this pattern from the other 802.15.4 drivers. Perhaps my 
understanding is wrong.




Signed-off-by: David Hauweele da...@hauweele.net
---
  drivers/net/ieee802154/mrf24j40.c |7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 1e3ddf3..6ef32f7 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -603,8 +603,6 @@ static irqreturn_t mrf24j40_isr(int irq, void *data)
  {
struct mrf24j40 *devrec = data;

-   disable_irq_nosync(irq);
-
schedule_work(devrec-irqwork);

return IRQ_HANDLED;
@@ -619,7 +617,7 @@ static void mrf24j40_isrwork(struct work_struct *work)
/* Read the interrupt status */
ret = read_short_reg(devrec, REG_INTSTAT, intstat);
if (ret)
-   goto out;
+   return;

/* Check for TX complete */
if (intstat  0x1)
@@ -628,9 +626,6 @@ static void mrf24j40_isrwork(struct work_struct *work)
/* Check for Rx */
if (intstat  0x8)
schedule_work(devrec-rxwork);
-
-out:
-   enable_irq(devrec-spi-irq);
  }

  static void mrf24j40_rxwork(struct work_struct *work)



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] mac802154: Keep track of the channel when changed

2013-04-05 Thread Alan Ott
Two sections checked whether the current channel != the new channel
without ever setting the current channel variables.

1. net/mac802154/tx.c: Prevent set_channel() from getting called every
time a packet is sent.

2. net/mac802154/mib.c: Lock (pib_lock) accesses to current_channel and
current_page and make sure they are updated when the channel has been
changed.

Signed-off-by: Alan Ott 
---
 net/mac802154/mib.c | 12 +++-
 net/mac802154/tx.c  |  3 +++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/net/mac802154/mib.c b/net/mac802154/mib.c
index f03e55f..8ded97c 100644
--- a/net/mac802154/mib.c
+++ b/net/mac802154/mib.c
@@ -176,9 +176,15 @@ static void phy_chan_notify(struct work_struct *work)
struct mac802154_sub_if_data *priv = netdev_priv(nw->dev);
int res;
 
+   mutex_lock(>hw->phy->pib_lock);
res = hw->ops->set_channel(>hw, priv->page, priv->chan);
if (res)
pr_debug("set_channel failed\n");
+   else {
+   priv->hw->phy->current_channel = priv->chan;
+   priv->hw->phy->current_page = priv->page;
+   }
+   mutex_unlock(>hw->phy->pib_lock);
 
kfree(nw);
 }
@@ -195,8 +201,11 @@ void mac802154_dev_set_page_channel(struct net_device 
*dev, u8 page, u8 chan)
priv->chan = chan;
spin_unlock_bh(>mib_lock);
 
+   mutex_lock(>hw->phy->pib_lock);
if (priv->hw->phy->current_channel != priv->chan ||
priv->hw->phy->current_page != priv->page) {
+   mutex_unlock(>hw->phy->pib_lock);
+
work = kzalloc(sizeof(*work), GFP_ATOMIC);
if (!work)
return;
@@ -204,5 +213,6 @@ void mac802154_dev_set_page_channel(struct net_device *dev, 
u8 page, u8 chan)
INIT_WORK(>work, phy_chan_notify);
work->dev = dev;
queue_work(priv->hw->dev_workqueue, >work);
-   }
+   } else
+   mutex_unlock(>hw->phy->pib_lock);
 }
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 3fd3e07..6d16473 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work)
pr_debug("set_channel failed\n");
goto out;
}
+
+   xw->priv->phy->current_channel = xw->chan;
+   xw->priv->phy->current_page = xw->page;
}
 
res = xw->priv->ops->xmit(>priv->hw, xw->skb);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH] mac802154: Keep track of the channel when changed

2013-04-05 Thread Alan Ott
On 04/05/2013 05:05 PM, Werner Almesberger wrote:
> Alan Ott wrote:
>> Prevent set_channel() from getting called every time a packet is sent. This
>> looks like it was an oversight.
> at86rf230.c and derivatives avoid this problem by setting
> phy->current_* in the *_channel function.
>
> But I'd agree that it's nicer to do this in one place, not in
> every driver.
>
> In case a driver had a weird failure mode in which it leaves the
> original channel but only makes it halfway to the new channel, it
> could still set phy->current_* and return an error. So there's no
> loss of functionality with your change.

Hmm... I just noticed that mib.c does the same thing (and doesn't set
phy->current_*). I'll need to fix that one too (and resubmit). :(

Alan.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mrf24j40: Enable link-layer acknowledgement and retry

2013-04-05 Thread Alan Ott
On the MRF24J40, link-layer acknowledgment request and retry must be
turned on explicitly for each packet.  Turn this on in the hardware based
on the FC_ACK_REQ bit being set in the packet.

Also, now that failure to receive an ACK will cause the hardware to report
failure of transmission, change the log level for this failure to debug
level.

Signed-off-by: Alan Ott 
---
 drivers/net/ieee802154/mrf24j40.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 6481faf..556151d 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* MRF24J40 Short Address Registers */
 #define REG_RXMCR0x00  /* Receive MAC control */
@@ -349,7 +350,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
val |= 0x1;
-   val &= ~0x4;
+   /* Set TXNACKREQ if the ACK bit is set in the packet. */
+   if (skb->data[0] & IEEE802154_FC_ACK_REQ)
+   val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
INIT_COMPLETION(devrec->tx_complete);
@@ -371,7 +374,7 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
if (val & 0x1) {
-   dev_err(printdev(devrec), "Error Sending. Retry count 
exceeded\n");
+   dev_dbg(printdev(devrec), "Error Sending. Retry count 
exceeded\n");
ret = -ECOMM; /* TODO: Better error code ? */
} else
dev_dbg(printdev(devrec), "Packet Sent\n");
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mac802154: Keep track of the channel when changed

2013-04-05 Thread Alan Ott
Prevent set_channel() from getting called every time a packet is sent. This
looks like it was an oversight.

Signed-off-by: Alan Ott 
---
 net/mac802154/tx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 3fd3e07..6d16473 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work)
pr_debug("set_channel failed\n");
goto out;
}
+
+   xw->priv->phy->current_channel = xw->chan;
+   xw->priv->phy->current_page = xw->page;
}
 
res = xw->priv->ops->xmit(>priv->hw, xw->skb);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mac802154: Keep track of the channel when changed

2013-04-05 Thread Alan Ott
Prevent set_channel() from getting called every time a packet is sent. This
looks like it was an oversight.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/mac802154/tx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 3fd3e07..6d16473 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work)
pr_debug(set_channel failed\n);
goto out;
}
+
+   xw-priv-phy-current_channel = xw-chan;
+   xw-priv-phy-current_page = xw-page;
}
 
res = xw-priv-ops-xmit(xw-priv-hw, xw-skb);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mrf24j40: Enable link-layer acknowledgement and retry

2013-04-05 Thread Alan Ott
On the MRF24J40, link-layer acknowledgment request and retry must be
turned on explicitly for each packet.  Turn this on in the hardware based
on the FC_ACK_REQ bit being set in the packet.

Also, now that failure to receive an ACK will cause the hardware to report
failure of transmission, change the log level for this failure to debug
level.

Signed-off-by: Alan Ott a...@signal11.us
---
 drivers/net/ieee802154/mrf24j40.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 6481faf..556151d 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -25,6 +25,7 @@
 #include linux/pinctrl/consumer.h
 #include net/wpan-phy.h
 #include net/mac802154.h
+#include net/ieee802154.h
 
 /* MRF24J40 Short Address Registers */
 #define REG_RXMCR0x00  /* Receive MAC control */
@@ -349,7 +350,9 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
val |= 0x1;
-   val = ~0x4;
+   /* Set TXNACKREQ if the ACK bit is set in the packet. */
+   if (skb-data[0]  IEEE802154_FC_ACK_REQ)
+   val |= 0x4;
write_short_reg(devrec, REG_TXNCON, val);
 
INIT_COMPLETION(devrec-tx_complete);
@@ -371,7 +374,7 @@ static int mrf24j40_tx(struct ieee802154_dev *dev, struct 
sk_buff *skb)
if (ret)
goto err;
if (val  0x1) {
-   dev_err(printdev(devrec), Error Sending. Retry count 
exceeded\n);
+   dev_dbg(printdev(devrec), Error Sending. Retry count 
exceeded\n);
ret = -ECOMM; /* TODO: Better error code ? */
} else
dev_dbg(printdev(devrec), Packet Sent\n);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH] mac802154: Keep track of the channel when changed

2013-04-05 Thread Alan Ott
On 04/05/2013 05:05 PM, Werner Almesberger wrote:
 Alan Ott wrote:
 Prevent set_channel() from getting called every time a packet is sent. This
 looks like it was an oversight.
 at86rf230.c and derivatives avoid this problem by setting
 phy-current_* in the *_channel function.

 But I'd agree that it's nicer to do this in one place, not in
 every driver.

 In case a driver had a weird failure mode in which it leaves the
 original channel but only makes it halfway to the new channel, it
 could still set phy-current_* and return an error. So there's no
 loss of functionality with your change.

Hmm... I just noticed that mib.c does the same thing (and doesn't set
phy-current_*). I'll need to fix that one too (and resubmit). :(

Alan.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] mac802154: Keep track of the channel when changed

2013-04-05 Thread Alan Ott
Two sections checked whether the current channel != the new channel
without ever setting the current channel variables.

1. net/mac802154/tx.c: Prevent set_channel() from getting called every
time a packet is sent.

2. net/mac802154/mib.c: Lock (pib_lock) accesses to current_channel and
current_page and make sure they are updated when the channel has been
changed.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/mac802154/mib.c | 12 +++-
 net/mac802154/tx.c  |  3 +++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/net/mac802154/mib.c b/net/mac802154/mib.c
index f03e55f..8ded97c 100644
--- a/net/mac802154/mib.c
+++ b/net/mac802154/mib.c
@@ -176,9 +176,15 @@ static void phy_chan_notify(struct work_struct *work)
struct mac802154_sub_if_data *priv = netdev_priv(nw-dev);
int res;
 
+   mutex_lock(priv-hw-phy-pib_lock);
res = hw-ops-set_channel(hw-hw, priv-page, priv-chan);
if (res)
pr_debug(set_channel failed\n);
+   else {
+   priv-hw-phy-current_channel = priv-chan;
+   priv-hw-phy-current_page = priv-page;
+   }
+   mutex_unlock(priv-hw-phy-pib_lock);
 
kfree(nw);
 }
@@ -195,8 +201,11 @@ void mac802154_dev_set_page_channel(struct net_device 
*dev, u8 page, u8 chan)
priv-chan = chan;
spin_unlock_bh(priv-mib_lock);
 
+   mutex_lock(priv-hw-phy-pib_lock);
if (priv-hw-phy-current_channel != priv-chan ||
priv-hw-phy-current_page != priv-page) {
+   mutex_unlock(priv-hw-phy-pib_lock);
+
work = kzalloc(sizeof(*work), GFP_ATOMIC);
if (!work)
return;
@@ -204,5 +213,6 @@ void mac802154_dev_set_page_channel(struct net_device *dev, 
u8 page, u8 chan)
INIT_WORK(work-work, phy_chan_notify);
work-dev = dev;
queue_work(priv-hw-dev_workqueue, work-work);
-   }
+   } else
+   mutex_unlock(priv-hw-phy-pib_lock);
 }
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 3fd3e07..6d16473 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -58,6 +58,9 @@ static void mac802154_xmit_worker(struct work_struct *work)
pr_debug(set_channel failed\n);
goto out;
}
+
+   xw-priv-phy-current_channel = xw-chan;
+   xw-priv-phy-current_page = xw-page;
}
 
res = xw-priv-ops-xmit(xw-priv-hw, xw-skb);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/4] mac802154: Increase tx_buffer_len

2013-04-03 Thread Alan Ott
Increase the buffer length from 10 to 300 packets. Consider that traffic on
mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet)
IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of
IEEE 802.15.4 is 127).  A 300-packet queue is really 20 full-length IPv6
packets.

With a queue length of 10, an entire IPv6 packet was unable to get queued
at one time, causing fragments to be dropped, and making reassembly
impossible.

Signed-off-by: Alan Ott 
---
 net/mac802154/wpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c
index 7d3f659..2ca2f4d 100644
--- a/net/mac802154/wpan.c
+++ b/net/mac802154/wpan.c
@@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev)
dev->header_ops = _header_ops;
dev->needed_tailroom= 2; /* FCS */
dev->mtu= IEEE802154_MTU;
-   dev->tx_queue_len   = 10;
+   dev->tx_queue_len   = 300;
dev->type   = ARPHRD_IEEE802154;
dev->flags  = IFF_NOARP | IFF_BROADCAST;
dev->watchdog_timeo = 0;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/4] 802.15.4 and 6LoWPAN Buffering Fixes

2013-04-03 Thread Alan Ott
Version 2 of this patch series:

Differences from v1:

1. Patches previously numbered 5 and 6 were squashed (to become current
patch #4) at the request of Alexander Smirnov.

2. Current patch #2 had extraneous braces removed.

3. Current patch #1 was changed. It is now a patch to make mac802154 _not_
retry sending packets on failure. I believe this to be consistent with the
802.15.4 specification (Section 7.5.6.4.3 of IEEE 802.15.4-2006)

Alan Ott (4):
  mac802154: Do not try to resend failed packets
  mac802154: Use netif flow control
  mac802154: Increase tx_buffer_len
  6lowpan: handle dev_queue_xmit() error code properly

 net/ieee802154/6lowpan.c  |  4 ++--
 net/mac802154/mac802154.h |  2 --
 net/mac802154/tx.c| 26 --
 net/mac802154/wpan.c  |  2 +-
 4 files changed, 19 insertions(+), 15 deletions(-)

-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/4] mac802154: Do not try to resend failed packets

2013-04-03 Thread Alan Ott
When ops->xmit() fails, drop the packet. Devices which support hardware
ack and retry (which include all devices currently supported by mainline),
will automatically retry sending the packet (in the hardware) up to 3
times, per the 802.15.4 spec.  There is no need, and it is incorrect to
try to do it in mac802154.

Signed-off-by: Alan Ott 
---
 net/mac802154/mac802154.h |  2 --
 net/mac802154/tx.c| 12 ++--
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/net/mac802154/mac802154.h b/net/mac802154/mac802154.h
index 21fa386..5c9e021 100644
--- a/net/mac802154/mac802154.h
+++ b/net/mac802154/mac802154.h
@@ -88,8 +88,6 @@ struct mac802154_sub_if_data {
 
 #define mac802154_to_priv(_hw) container_of(_hw, struct mac802154_priv, hw)
 
-#define MAC802154_MAX_XMIT_ATTEMPTS3
-
 #define MAC802154_CHAN_NONE(~(u8)0) /* No channel is assigned */
 
 extern struct ieee802154_reduced_mlme_ops mac802154_mlme_reduced;
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 4e09d07..7264874 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -39,7 +39,6 @@ struct xmit_work {
struct mac802154_priv *priv;
u8 chan;
u8 page;
-   u8 xmit_attempts;
 };
 
 static void mac802154_xmit_worker(struct work_struct *work)
@@ -60,18 +59,12 @@ static void mac802154_xmit_worker(struct work_struct *work)
}
 
res = xw->priv->ops->xmit(>priv->hw, xw->skb);
+   if (res)
+   pr_debug("transmission failed\n");
 
 out:
mutex_unlock(>priv->phy->pib_lock);
 
-   if (res) {
-   if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
-   queue_work(xw->priv->dev_workqueue, >work);
-   return;
-   } else
-   pr_debug("transmission failed for %d times",
-MAC802154_MAX_XMIT_ATTEMPTS);
-   }
 
dev_kfree_skb(xw->skb);
 
@@ -114,7 +107,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, 
struct sk_buff *skb,
work->priv = priv;
work->page = page;
work->chan = chan;
-   work->xmit_attempts = 0;
 
queue_work(priv->dev_workqueue, >work);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 4/4] 6lowpan: handle dev_queue_xmit() error code properly

2013-04-03 Thread Alan Ott
dev_queue_xmit() will return a positive value if the packet could not be
queued, often because the real network device (in our case the mac802154
wpan device) has its queue stopped.  lowpan_xmit() should handle the
positive return code (for the debug statement) and return that value to
the higher layer so the higher layer will retry sending the packet.

Signed-off-by: Alan Ott 
---
 net/ieee802154/6lowpan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index e1b4580..55e1fd5 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1139,10 +1139,10 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, 
struct net_device *dev)
 error:
dev_kfree_skb(skb);
 out:
-   if (err < 0)
+   if (err)
pr_debug("ERROR: xmit failed\n");
 
-   return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
+   return (err < 0) ? NET_XMIT_DROP : err;
 }
 
 static struct wpan_phy *lowpan_get_phy(const struct net_device *dev)
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/4] mac802154: Use netif flow control

2013-04-03 Thread Alan Ott
Use netif_stop_queue() and netif_wake_queue() to control the flow of
packets to mac802154 devices.  Since many IEEE 802.15.4 devices have no
output buffer, and since the mac802154 xmit() function is designed to
block, netif_stop_queue() is called after each packet.

Signed-off-by: Alan Ott 
---
 net/mac802154/tx.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 7264874..3fd3e07 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 
@@ -44,6 +45,7 @@ struct xmit_work {
 static void mac802154_xmit_worker(struct work_struct *work)
 {
struct xmit_work *xw = container_of(work, struct xmit_work, work);
+   struct mac802154_sub_if_data *sdata;
int res;
 
mutex_lock(>priv->phy->pib_lock);
@@ -65,6 +67,11 @@ static void mac802154_xmit_worker(struct work_struct *work)
 out:
mutex_unlock(>priv->phy->pib_lock);
 
+   /* Restart the netif queue on each sub_if_data object. */
+   rcu_read_lock();
+   list_for_each_entry_rcu(sdata, >priv->slaves, list)
+   netif_wake_queue(sdata->dev);
+   rcu_read_unlock();
 
dev_kfree_skb(xw->skb);
 
@@ -75,6 +82,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct 
sk_buff *skb,
 u8 page, u8 chan)
 {
struct xmit_work *work;
+   struct mac802154_sub_if_data *sdata;
 
if (!(priv->phy->channels_supported[page] & (1 << chan))) {
WARN_ON(1);
@@ -102,6 +110,12 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, 
struct sk_buff *skb,
return NETDEV_TX_BUSY;
}
 
+   /* Stop the netif queue on each sub_if_data object. */
+   rcu_read_lock();
+   list_for_each_entry_rcu(sdata, >slaves, list)
+   netif_stop_queue(sdata->dev);
+   rcu_read_unlock();
+
INIT_WORK(>work, mac802154_xmit_worker);
work->skb = skb;
work->priv = priv;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/4] mac802154: Use netif flow control

2013-04-03 Thread Alan Ott
Use netif_stop_queue() and netif_wake_queue() to control the flow of
packets to mac802154 devices.  Since many IEEE 802.15.4 devices have no
output buffer, and since the mac802154 xmit() function is designed to
block, netif_stop_queue() is called after each packet.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/mac802154/tx.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 7264874..3fd3e07 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -25,6 +25,7 @@
 #include linux/if_arp.h
 #include linux/crc-ccitt.h
 
+#include net/ieee802154_netdev.h
 #include net/mac802154.h
 #include net/wpan-phy.h
 
@@ -44,6 +45,7 @@ struct xmit_work {
 static void mac802154_xmit_worker(struct work_struct *work)
 {
struct xmit_work *xw = container_of(work, struct xmit_work, work);
+   struct mac802154_sub_if_data *sdata;
int res;
 
mutex_lock(xw-priv-phy-pib_lock);
@@ -65,6 +67,11 @@ static void mac802154_xmit_worker(struct work_struct *work)
 out:
mutex_unlock(xw-priv-phy-pib_lock);
 
+   /* Restart the netif queue on each sub_if_data object. */
+   rcu_read_lock();
+   list_for_each_entry_rcu(sdata, xw-priv-slaves, list)
+   netif_wake_queue(sdata-dev);
+   rcu_read_unlock();
 
dev_kfree_skb(xw-skb);
 
@@ -75,6 +82,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct 
sk_buff *skb,
 u8 page, u8 chan)
 {
struct xmit_work *work;
+   struct mac802154_sub_if_data *sdata;
 
if (!(priv-phy-channels_supported[page]  (1  chan))) {
WARN_ON(1);
@@ -102,6 +110,12 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, 
struct sk_buff *skb,
return NETDEV_TX_BUSY;
}
 
+   /* Stop the netif queue on each sub_if_data object. */
+   rcu_read_lock();
+   list_for_each_entry_rcu(sdata, priv-slaves, list)
+   netif_stop_queue(sdata-dev);
+   rcu_read_unlock();
+
INIT_WORK(work-work, mac802154_xmit_worker);
work-skb = skb;
work-priv = priv;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/4] mac802154: Do not try to resend failed packets

2013-04-03 Thread Alan Ott
When ops-xmit() fails, drop the packet. Devices which support hardware
ack and retry (which include all devices currently supported by mainline),
will automatically retry sending the packet (in the hardware) up to 3
times, per the 802.15.4 spec.  There is no need, and it is incorrect to
try to do it in mac802154.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/mac802154/mac802154.h |  2 --
 net/mac802154/tx.c| 12 ++--
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/net/mac802154/mac802154.h b/net/mac802154/mac802154.h
index 21fa386..5c9e021 100644
--- a/net/mac802154/mac802154.h
+++ b/net/mac802154/mac802154.h
@@ -88,8 +88,6 @@ struct mac802154_sub_if_data {
 
 #define mac802154_to_priv(_hw) container_of(_hw, struct mac802154_priv, hw)
 
-#define MAC802154_MAX_XMIT_ATTEMPTS3
-
 #define MAC802154_CHAN_NONE(~(u8)0) /* No channel is assigned */
 
 extern struct ieee802154_reduced_mlme_ops mac802154_mlme_reduced;
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 4e09d07..7264874 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -39,7 +39,6 @@ struct xmit_work {
struct mac802154_priv *priv;
u8 chan;
u8 page;
-   u8 xmit_attempts;
 };
 
 static void mac802154_xmit_worker(struct work_struct *work)
@@ -60,18 +59,12 @@ static void mac802154_xmit_worker(struct work_struct *work)
}
 
res = xw-priv-ops-xmit(xw-priv-hw, xw-skb);
+   if (res)
+   pr_debug(transmission failed\n);
 
 out:
mutex_unlock(xw-priv-phy-pib_lock);
 
-   if (res) {
-   if (xw-xmit_attempts++  MAC802154_MAX_XMIT_ATTEMPTS) {
-   queue_work(xw-priv-dev_workqueue, xw-work);
-   return;
-   } else
-   pr_debug(transmission failed for %d times,
-MAC802154_MAX_XMIT_ATTEMPTS);
-   }
 
dev_kfree_skb(xw-skb);
 
@@ -114,7 +107,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, 
struct sk_buff *skb,
work-priv = priv;
work-page = page;
work-chan = chan;
-   work-xmit_attempts = 0;
 
queue_work(priv-dev_workqueue, work-work);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 4/4] 6lowpan: handle dev_queue_xmit() error code properly

2013-04-03 Thread Alan Ott
dev_queue_xmit() will return a positive value if the packet could not be
queued, often because the real network device (in our case the mac802154
wpan device) has its queue stopped.  lowpan_xmit() should handle the
positive return code (for the debug statement) and return that value to
the higher layer so the higher layer will retry sending the packet.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/ieee802154/6lowpan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index e1b4580..55e1fd5 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1139,10 +1139,10 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, 
struct net_device *dev)
 error:
dev_kfree_skb(skb);
 out:
-   if (err  0)
+   if (err)
pr_debug(ERROR: xmit failed\n);
 
-   return (err  0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
+   return (err  0) ? NET_XMIT_DROP : err;
 }
 
 static struct wpan_phy *lowpan_get_phy(const struct net_device *dev)
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/4] 802.15.4 and 6LoWPAN Buffering Fixes

2013-04-03 Thread Alan Ott
Version 2 of this patch series:

Differences from v1:

1. Patches previously numbered 5 and 6 were squashed (to become current
patch #4) at the request of Alexander Smirnov.

2. Current patch #2 had extraneous braces removed.

3. Current patch #1 was changed. It is now a patch to make mac802154 _not_
retry sending packets on failure. I believe this to be consistent with the
802.15.4 specification (Section 7.5.6.4.3 of IEEE 802.15.4-2006)

Alan Ott (4):
  mac802154: Do not try to resend failed packets
  mac802154: Use netif flow control
  mac802154: Increase tx_buffer_len
  6lowpan: handle dev_queue_xmit() error code properly

 net/ieee802154/6lowpan.c  |  4 ++--
 net/mac802154/mac802154.h |  2 --
 net/mac802154/tx.c| 26 --
 net/mac802154/wpan.c  |  2 +-
 4 files changed, 19 insertions(+), 15 deletions(-)

-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/4] mac802154: Increase tx_buffer_len

2013-04-03 Thread Alan Ott
Increase the buffer length from 10 to 300 packets. Consider that traffic on
mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet)
IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of
IEEE 802.15.4 is 127).  A 300-packet queue is really 20 full-length IPv6
packets.

With a queue length of 10, an entire IPv6 packet was unable to get queued
at one time, causing fragments to be dropped, and making reassembly
impossible.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/mac802154/wpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c
index 7d3f659..2ca2f4d 100644
--- a/net/mac802154/wpan.c
+++ b/net/mac802154/wpan.c
@@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev)
dev-header_ops = mac802154_header_ops;
dev-needed_tailroom= 2; /* FCS */
dev-mtu= IEEE802154_MTU;
-   dev-tx_queue_len   = 10;
+   dev-tx_queue_len   = 300;
dev-type   = ARPHRD_IEEE802154;
dev-flags  = IFF_NOARP | IFF_BROADCAST;
dev-watchdog_timeo = 0;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 10:30 PM, David Miller wrote:
> From: Alan Ott 
> Date: Tue, 02 Apr 2013 22:25:28 -0400
>
>> The workqueue in mac802154 is only needed because the current mac802154
>> xmit() function is designed to be blocking and synchronous.
>>
>> Prior to my patch (#3/6), that very same workqueue would actually queue
>> up packets (without bound). That's what my patch fixes.
>>
>> The workqueue in mac802154 also serializes the access to the device for
>> other functions like setting the channel, ensuring that in the driver
>> code, one doesn't have to mutex everything. I'm not sure if that's the
>> "right" way to do it, but that's the way it was when I got here.
> This is entirely duplicating existing facilities.
>
> Your desire to allow blockability during xmit() on the basis of mutual
> exclusion is not well founded.

I'm not sure it's my desire, but rather a statement of the way it
currently is. To be clear, .ndo_start_xmit() does not block, but queues
a workqueue item which then calls ieee802154_ops->xmit() which does block.

This patch series centers around putting netif_stop_queue() and
netif_wake_queue() in the mac802154 layer. I've sent emails about this
before[1], and gotten no real suggestions about the issue, so I
proceeded with Solution #1 (as described at [1]). If you want to skip
this and go straight to solution #2, then let's talk about what that
might look like. I still think though, that there is benefit in getting
solution #1 in because it fixes some current usability problems
(including the buffer (workqueue) growing without bound).

All that said, I'm not sure I've answered your question or concern.
Please let me know if I'm still not getting it.

Alan.

[1] http://thread.gmane.org/gmane.linux.network/242495/focus=262869

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 10:03 PM, David Miller wrote:
> From: Alan Ott 
> Date: Tue, 02 Apr 2013 21:59:37 -0400
>
>> On 04/02/2013 09:56 PM, David Miller wrote:
>>> From: Alan Ott 
>>> Date: Tue, 02 Apr 2013 21:24:59 -0400
>>>
>>>> I like it for a couple of reasons.
>>>> 1. Most supported devices have only single packet output buffer, so
>>>> blocking in the driver is the most straight-forward way to handle it.
>>>> The alternative is to make each driver have a workqueue for xmit() (to
>>>> lift the blocking out from atomic context). This makes each driver simpler.
>>>>
>>>> 2. All of the flow control can be handled one time in the mac802154 layer.
>>> We have a perfectly working flow control mechanism in the generic
>>> networking queuing layer.  Please use it instead of inventing things.
>> I'm pretty sure that's what I'm doing in [1]. When I say "flow control
>> can be handled," I mean managing calls to netif_stop_queue() and
>> netif_wake_queue(). Is there something else I should be doing instead?
> Then you shouldn't need workqueues if the generic netdev facilities
> can do the flow control properly.

The workqueue in mac802154 is only needed because the current mac802154
xmit() function is designed to be blocking and synchronous.

Prior to my patch (#3/6), that very same workqueue would actually queue
up packets (without bound). That's what my patch fixes.

The workqueue in mac802154 also serializes the access to the device for
other functions like setting the channel, ensuring that in the driver
code, one doesn't have to mutex everything. I'm not sure if that's the
"right" way to do it, but that's the way it was when I got here.

> There are several ethernet devices that have a single transmit buffer
> and function just fine, and optimally, solely using the transmit queue
> stop/start/wake infrastructure.

Yes, that does work. enc28j60 works like this. However, since it's an
SPI device (and can sleep), its ndo_start_xmit() _does_ use a workqueue
(to remove it from atomic context), the same as ours (mac802154) does.
The difference is that we do it at the mac802154 layer, while Ethernet
devices do it in the driver.

I guess one advantage to the way it currently is in mac802154, with the
synchronous xmit(), is that a return code can be had from the driver for
each packet. With my new idea that we don't need to retransmit on
failure, I'm not sure we need this return code at all.

Alan.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 09:56 PM, David Miller wrote:
> From: Alan Ott 
> Date: Tue, 02 Apr 2013 21:24:59 -0400
>
>> I like it for a couple of reasons.
>> 1. Most supported devices have only single packet output buffer, so
>> blocking in the driver is the most straight-forward way to handle it.
>> The alternative is to make each driver have a workqueue for xmit() (to
>> lift the blocking out from atomic context). This makes each driver simpler.
>>
>> 2. All of the flow control can be handled one time in the mac802154 layer.
> We have a perfectly working flow control mechanism in the generic
> networking queuing layer.  Please use it instead of inventing things.

I'm pretty sure that's what I'm doing in [1]. When I say "flow control
can be handled," I mean managing calls to netif_stop_queue() and
netif_wake_queue(). Is there something else I should be doing instead?

> If it does not meet your needs, fix it, rather than go off and do
> your own thing.  That way everyone benfits, not just you.

Fully agreed.

Alan.

[1] http://www.spinics.net/lists/netdev/msg231483.html

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 07:13 PM, Werner Almesberger wrote:
> Alan Ott wrote:
>> it's now my opinion that we should _not_ try to retransmit at
>> all in mac802154/tx.c.
> I think the currently blocking workqueue design is ugly and
> quite contrary to how most the rest of the stack works. So
> anything that kills it has my blessing :-)

I like it for a couple of reasons.
1. Most supported devices have only single packet output buffer, so
blocking in the driver is the most straight-forward way to handle it.
The alternative is to make each driver have a workqueue for xmit() (to
lift the blocking out from atomic context). This makes each driver simpler.

2. All of the flow control can be handled one time in the mac802154 layer.

> I do wonder though why it was done like this in the first place.
> Just for convenience ?
>
> If we want to move towards an asynchronous interface, it could
> exist in parallel with the current one. That way, drivers could
> be migrated one by one.

Maybe at some point this will be done. Right now we have a ton of
pressing issues (in my opinion).

> Having said that, the errors you get there may not be failed
> single transmissions on the air but some form of congestion in
> the driver or a problem with the device. But I don't think
> that's a valid reason for retrying the transmission at that
> level.

Agreed. Like I said, I'm now of the opinion that the mac802154 layer
should _not_ retry at all.

Alan.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 05:28 PM, Alan Ott wrote:

> According to 7.5.6.5 of IEEE 802.15.4-2003, if the retransmission fails
> more than aMaxFrameRetries (3) times, it is assumed that it has failed.
> Since some transceivers (and I would assume most if not all) do this in
> hardware, it's now my opinion that we should _not_ try to retransmit at
> all in mac802154/tx.c.
> 
> For a driver for a device which _doesn't_ do retransmission in hardware,
> maybe it should be handled by that driver then.


It's worth noting that the mrf24j40, the at86rf230, and the cc2420
support retransmission in hardware (the first two are the only two in
mainline).

Alan.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 04:28 PM, Alan Ott wrote:
> On 04/02/2013 03:11 PM, Alexander Smirnov wrote:
>> 2013/4/2 Alan Ott mailto:a...@signal11.us>>
>>
>> When ops->xmit() fails, immediately retry. Previously the packet was
>> sent
>> to the back of the workqueue.
>>
>> Signed-off-by: Alan Ott mailto:a...@signal11.us>>
>> ---
>>  net/mac802154/tx.c | 17 -
>>  1 file changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
>> index 4e09d07..fbf937c 100644
>> --- a/net/mac802154/tx.c
>> +++ b/net/mac802154/tx.c
>> @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct
>> work_struct *work)
>> }
>> }
>>
>> -   res = xw->priv->ops->xmit(>priv->hw, xw->skb);
>> +   do {
>> +   res = xw->priv->ops->xmit(>priv->hw, xw->skb);
>> +   if (res && ++xw->xmit_attempts >=
>> MAC802154_MAX_XMIT_ATTEMPTS) {
>> +   pr_debug("transmission failed for %d times",
>> +MAC802154_MAX_XMIT_ATTEMPTS);
>> +   break;
>> +   }
>> +   } while (res);
>>
>>  
>>
>> IIRC this 802.15.4 stack uses single-thread-work-queue and all RX/TX
>> are performed by using it.
> Hi Alexander,
>
> Yes, that is true. As is currently implemented, the driver xmit()
> functions are called from a workqueue and block until the packet is sent.
>
>
>> Doing TX retry in the way you proposed -
>> it's possible that you will block other packets pending in this
>> queue.
> Yes. Since sending data is a blocking operation, any time spent sending
> (or re-sending) is blocking.
>
> As it was before this patch series, with the buffer (workqueue) growing
> arbitrarily large, doing retry by putting a packet at the end of the
> workqueue was largely useless because by the time it came to retry it,
> any state associated with it (with respect to fragmentation/reassembly)
> had expired.
>
> Keep in mind that with the netif stop/wake code, putting retries at the
> end of the workqueue or doing them immediately is basically the same
> thing, since the workqueue is no longer the packet queue (and will
> ideally only have 0 or 1 packets in it). The workqueue (with these
> patches) only serves to lift the driver xmit() calls out of atomic
> context, allowing them to block.
>
> However, it is easy to envision one process clogging up the works with
> retries by sending many packets to an unavailable address.
>
> What do you recommend doing here instead?

According to 7.5.6.5 of IEEE 802.15.4-2003, if the retransmission fails
more than aMaxFrameRetries (3) times, it is assumed that it has failed.
Since some transceivers (and I would assume most if not all) do this in
hardware, it's now my opinion that we should _not_ try to retransmit at
all in mac802154/tx.c.

For a driver for a device which _doesn't_ do retransmission in hardware,
maybe it should be handled by that driver then.

>
>> Despite on Linux is already 'slow' system to provide
>> real-time for specific 802.15.4 features, I think it's not a good
>> idea to increase nodes communication latency.
> With the transmit buffer length increased (and actually being used),
> maybe the packets with realtime requirements can be given a higher
> priority to deal with these requirements.
>
> Alan.
>
>>
>>  out:
>> mutex_unlock(>priv->phy->pib_lock);
>>
>> -   if (res) {
>> -   if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
>> -   queue_work(xw->priv->dev_workqueue, >work);
>> -   return;
>> -   } else
>> -   pr_debug("transmission failed for %d times",
>> -MAC802154_MAX_XMIT_ATTEMPTS);
>> -   }
>>
>> dev_kfree_skb(xw->skb);
>>
>> --
>> 1.7.11.2
>>
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
On 04/02/2013 03:11 PM, Alexander Smirnov wrote:
> 2013/4/2 Alan Ott mailto:a...@signal11.us>>
> 
> When ops->xmit() fails, immediately retry. Previously the packet was
> sent
> to the back of the workqueue.
> 
> Signed-off-by: Alan Ott mailto:a...@signal11.us>>
> ---
>  net/mac802154/tx.c | 17 -
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> index 4e09d07..fbf937c 100644
> --- a/net/mac802154/tx.c
> +++ b/net/mac802154/tx.c
> @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct
> work_struct *work)
> }
> }
> 
> -   res = xw->priv->ops->xmit(>priv->hw, xw->skb);
> +   do {
> +   res = xw->priv->ops->xmit(>priv->hw, xw->skb);
> +   if (res && ++xw->xmit_attempts >=
> MAC802154_MAX_XMIT_ATTEMPTS) {
> +   pr_debug("transmission failed for %d times",
> +MAC802154_MAX_XMIT_ATTEMPTS);
> +   break;
> +   }
> +   } while (res);
> 
>  
> 
> IIRC this 802.15.4 stack uses single-thread-work-queue and all RX/TX
> are performed by using it.

Hi Alexander,

Yes, that is true. As is currently implemented, the driver xmit()
functions are called from a workqueue and block until the packet is sent.


> Doing TX retry in the way you proposed -
> it's possible that you will block other packets pending in this
> queue.

Yes. Since sending data is a blocking operation, any time spent sending
(or re-sending) is blocking.

As it was before this patch series, with the buffer (workqueue) growing
arbitrarily large, doing retry by putting a packet at the end of the
workqueue was largely useless because by the time it came to retry it,
any state associated with it (with respect to fragmentation/reassembly)
had expired.

Keep in mind that with the netif stop/wake code, putting retries at the
end of the workqueue or doing them immediately is basically the same
thing, since the workqueue is no longer the packet queue (and will
ideally only have 0 or 1 packets in it). The workqueue (with these
patches) only serves to lift the driver xmit() calls out of atomic
context, allowing them to block.

However, it is easy to envision one process clogging up the works with
retries by sending many packets to an unavailable address.

What do you recommend doing here instead?

> Despite on Linux is already 'slow' system to provide
> real-time for specific 802.15.4 features, I think it's not a good
> idea to increase nodes communication latency.

With the transmit buffer length increased (and actually being used),
maybe the packets with realtime requirements can be given a higher
priority to deal with these requirements.

Alan.

> 
> 
>  out:
> mutex_unlock(>priv->phy->pib_lock);
> 
> -   if (res) {
> -   if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
> -   queue_work(xw->priv->dev_workqueue, >work);
> -   return;
> -   } else
> -   pr_debug("transmission failed for %d times",
> -MAC802154_MAX_XMIT_ATTEMPTS);
> -   }
> 
> dev_kfree_skb(xw->skb);
> 
> --
> 1.7.11.2
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/6] mac802154: Use netif flow control

2013-04-02 Thread Alan Ott
Use netif_stop_queue() and netif_wake_queue() to control the flow of
packets to mac802154 devices.  Since many IEEE 802.15.4 devices have no
output buffer, and since the mac802154 xmit() function is designed to
block, netif_stop_queue() is called after each packet.

Signed-off-by: Alan Ott 
---
 net/mac802154/tx.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index a248246..fe3e02c 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 
@@ -45,6 +46,7 @@ static void mac802154_xmit_worker(struct work_struct *work)
 {
struct xmit_work *xw = container_of(work, struct xmit_work, work);
u8 xmit_attempts = 0;
+   struct mac802154_sub_if_data *sdata;
int res;
 
mutex_lock(>priv->phy->pib_lock);
@@ -71,6 +73,12 @@ static void mac802154_xmit_worker(struct work_struct *work)
 out:
mutex_unlock(>priv->phy->pib_lock);
 
+   /* Restart the netif queue on each sub_if_data object. */
+   rcu_read_lock();
+   list_for_each_entry_rcu(sdata, >priv->slaves, list) {
+   netif_wake_queue(sdata->dev);
+   }
+   rcu_read_unlock();
 
dev_kfree_skb(xw->skb);
 
@@ -81,6 +89,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct 
sk_buff *skb,
 u8 page, u8 chan)
 {
struct xmit_work *work;
+   struct mac802154_sub_if_data *sdata;
 
if (!(priv->phy->channels_supported[page] & (1 << chan))) {
WARN_ON(1);
@@ -108,6 +117,13 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, 
struct sk_buff *skb,
return NETDEV_TX_BUSY;
}
 
+   /* Stop the netif queue on each sub_if_data object. */
+   rcu_read_lock();
+   list_for_each_entry_rcu(sdata, >slaves, list) {
+   netif_stop_queue(sdata->dev);
+   }
+   rcu_read_unlock();
+
INIT_WORK(>work, mac802154_xmit_worker);
work->skb = skb;
work->priv = priv;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/6] mac802154: Move xmit_attemps to stack

2013-04-02 Thread Alan Ott
There's no reason to have it in the work struct anymore.

Signed-off-by: Alan Ott 
---
 net/mac802154/tx.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index fbf937c..a248246 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -39,12 +39,12 @@ struct xmit_work {
struct mac802154_priv *priv;
u8 chan;
u8 page;
-   u8 xmit_attempts;
 };
 
 static void mac802154_xmit_worker(struct work_struct *work)
 {
struct xmit_work *xw = container_of(work, struct xmit_work, work);
+   u8 xmit_attempts = 0;
int res;
 
mutex_lock(>priv->phy->pib_lock);
@@ -61,7 +61,7 @@ static void mac802154_xmit_worker(struct work_struct *work)
 
do {
res = xw->priv->ops->xmit(>priv->hw, xw->skb);
-   if (res && ++xw->xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) {
+   if (res && ++xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) {
pr_debug("transmission failed for %d times",
 MAC802154_MAX_XMIT_ATTEMPTS);
break;
@@ -113,7 +113,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, 
struct sk_buff *skb,
work->priv = priv;
work->page = page;
work->chan = chan;
-   work->xmit_attempts = 0;
 
queue_work(priv->dev_workqueue, >work);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 4/6] mac802154: Increase tx_buffer_len

2013-04-02 Thread Alan Ott
Increase the buffer length from 10 to 300 packets. Consider that traffic on
mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet)
IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of
IEEE 802.15.4 is 127).  A 300-packet queue is really 20 full-length IPv6
packets.

With a queue length of 10, an entire IPv6 packet was unable to get queued
at one time, causing fragments to be dropped, and making reassembly
impossible.

Signed-off-by: Alan Ott 
---
 net/mac802154/wpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c
index 7d3f659..2ca2f4d 100644
--- a/net/mac802154/wpan.c
+++ b/net/mac802154/wpan.c
@@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev)
dev->header_ops = _header_ops;
dev->needed_tailroom= 2; /* FCS */
dev->mtu= IEEE802154_MTU;
-   dev->tx_queue_len   = 10;
+   dev->tx_queue_len   = 300;
dev->type   = ARPHRD_IEEE802154;
dev->flags  = IFF_NOARP | IFF_BROADCAST;
dev->watchdog_timeo = 0;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/6] 802.15.4 and 6LoWPAN Buffering Fixes

2013-04-02 Thread Alan Ott
These patches fix an issue in the 802.15.4 code where the output buffer
(which prior to this patchset is just a workqueue) can grow to an arbitrary
size.  This is basically fixed as follows:

1. Use netif_stop_queue() and netif_wake_queue() to stop and start the
transmit queue, preventing packets from piling up without bound on the
mac802154 workqueue.

2.  Increase the default tx_buffer_len for mac802154 (wpan) devices from 10
to 300, enabling Qdisc to work properly on the transmit queue.

Additionally the following related issues are fixed:

1. Handle dev_queue_xmit() return values properly in the 6LoWPAN code (and
return the proper errors to the higher layers).  This will cause the higher
layers to retry later if the mac802154 queue is full.

2. Fix the retry of transmit failures in mac802154.

Alan Ott (6):
  mac802154: Immediately retry sending failed packets
  mac802154: Move xmit_attemps to stack
  mac802154: Use netif flow control
  mac802154: Increase tx_buffer_len
  6lowpan: handle dev_queue_xmit error code properly
  6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()

 net/ieee802154/6lowpan.c |  4 ++--
 net/mac802154/tx.c   | 34 --
 net/mac802154/wpan.c |  2 +-
 3 files changed, 27 insertions(+), 13 deletions(-)

-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 6/6] 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()

2013-04-02 Thread Alan Ott
dev_queue_xmit() will return a positive value if the packet could not be
queued, often because the real network device (in our case the mac802154
wpan device) has its queue stopped.  lowpan_xmit() should return that value
to the higher layer so the higher layer will retry sending the packet.

Signed-off-by: Alan Ott 
---
 net/ieee802154/6lowpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index a68c792..55e1fd5 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1142,7 +1142,7 @@ out:
if (err)
pr_debug("ERROR: xmit failed\n");
 
-   return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
+   return (err < 0) ? NET_XMIT_DROP : err;
 }
 
 static struct wpan_phy *lowpan_get_phy(const struct net_device *dev)
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
When ops->xmit() fails, immediately retry. Previously the packet was sent
to the back of the workqueue.

Signed-off-by: Alan Ott 
---
 net/mac802154/tx.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 4e09d07..fbf937c 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct work_struct *work)
}
}
 
-   res = xw->priv->ops->xmit(>priv->hw, xw->skb);
+   do {
+   res = xw->priv->ops->xmit(>priv->hw, xw->skb);
+   if (res && ++xw->xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) {
+   pr_debug("transmission failed for %d times",
+MAC802154_MAX_XMIT_ATTEMPTS);
+   break;
+   }
+   } while (res);
 
 out:
mutex_unlock(>priv->phy->pib_lock);
 
-   if (res) {
-   if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
-   queue_work(xw->priv->dev_workqueue, >work);
-   return;
-   } else
-   pr_debug("transmission failed for %d times",
-MAC802154_MAX_XMIT_ATTEMPTS);
-   }
 
dev_kfree_skb(xw->skb);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 5/6] 6lowpan: handle dev_queue_xmit error code properly

2013-04-02 Thread Alan Ott
dev_queue_xmit() can return positive error codes, so check for nonzero.

Signed-off-by: Alan Ott 
---
 net/ieee802154/6lowpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index e1b4580..a68c792 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1139,7 +1139,7 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, 
struct net_device *dev)
 error:
dev_kfree_skb(skb);
 out:
-   if (err < 0)
+   if (err)
pr_debug("ERROR: xmit failed\n");
 
return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 5/6] 6lowpan: handle dev_queue_xmit error code properly

2013-04-02 Thread Alan Ott
dev_queue_xmit() can return positive error codes, so check for nonzero.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/ieee802154/6lowpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index e1b4580..a68c792 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1139,7 +1139,7 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, 
struct net_device *dev)
 error:
dev_kfree_skb(skb);
 out:
-   if (err  0)
+   if (err)
pr_debug(ERROR: xmit failed\n);
 
return (err  0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/6] mac802154: Immediately retry sending failed packets

2013-04-02 Thread Alan Ott
When ops-xmit() fails, immediately retry. Previously the packet was sent
to the back of the workqueue.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/mac802154/tx.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 4e09d07..fbf937c 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct work_struct *work)
}
}
 
-   res = xw-priv-ops-xmit(xw-priv-hw, xw-skb);
+   do {
+   res = xw-priv-ops-xmit(xw-priv-hw, xw-skb);
+   if (res  ++xw-xmit_attempts = MAC802154_MAX_XMIT_ATTEMPTS) {
+   pr_debug(transmission failed for %d times,
+MAC802154_MAX_XMIT_ATTEMPTS);
+   break;
+   }
+   } while (res);
 
 out:
mutex_unlock(xw-priv-phy-pib_lock);
 
-   if (res) {
-   if (xw-xmit_attempts++  MAC802154_MAX_XMIT_ATTEMPTS) {
-   queue_work(xw-priv-dev_workqueue, xw-work);
-   return;
-   } else
-   pr_debug(transmission failed for %d times,
-MAC802154_MAX_XMIT_ATTEMPTS);
-   }
 
dev_kfree_skb(xw-skb);
 
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 6/6] 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()

2013-04-02 Thread Alan Ott
dev_queue_xmit() will return a positive value if the packet could not be
queued, often because the real network device (in our case the mac802154
wpan device) has its queue stopped.  lowpan_xmit() should return that value
to the higher layer so the higher layer will retry sending the packet.

Signed-off-by: Alan Ott a...@signal11.us
---
 net/ieee802154/6lowpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index a68c792..55e1fd5 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1142,7 +1142,7 @@ out:
if (err)
pr_debug(ERROR: xmit failed\n);
 
-   return (err  0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
+   return (err  0) ? NET_XMIT_DROP : err;
 }
 
 static struct wpan_phy *lowpan_get_phy(const struct net_device *dev)
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/6] 802.15.4 and 6LoWPAN Buffering Fixes

2013-04-02 Thread Alan Ott
These patches fix an issue in the 802.15.4 code where the output buffer
(which prior to this patchset is just a workqueue) can grow to an arbitrary
size.  This is basically fixed as follows:

1. Use netif_stop_queue() and netif_wake_queue() to stop and start the
transmit queue, preventing packets from piling up without bound on the
mac802154 workqueue.

2.  Increase the default tx_buffer_len for mac802154 (wpan) devices from 10
to 300, enabling Qdisc to work properly on the transmit queue.

Additionally the following related issues are fixed:

1. Handle dev_queue_xmit() return values properly in the 6LoWPAN code (and
return the proper errors to the higher layers).  This will cause the higher
layers to retry later if the mac802154 queue is full.

2. Fix the retry of transmit failures in mac802154.

Alan Ott (6):
  mac802154: Immediately retry sending failed packets
  mac802154: Move xmit_attemps to stack
  mac802154: Use netif flow control
  mac802154: Increase tx_buffer_len
  6lowpan: handle dev_queue_xmit error code properly
  6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()

 net/ieee802154/6lowpan.c |  4 ++--
 net/mac802154/tx.c   | 34 --
 net/mac802154/wpan.c |  2 +-
 3 files changed, 27 insertions(+), 13 deletions(-)

-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   >