from:"Satyam Sharma"

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/net/spider_net.c


[PATCH -mm] spider_net: Misc build fixes after recent netdev stats changes

Unbreak the following:

drivers/net/spider_net.c: In function 'spider_net_release_tx_chain':
drivers/net/spider_net.c:818: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c:818: error: (Each undeclared identifier is reported 
only once
drivers/net/spider_net.c:818: error: for each function it appears in.)
drivers/net/spider_net.c: In function 'spider_net_xmit':
drivers/net/spider_net.c:922: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c: In function 'spider_net_pass_skb_up':
drivers/net/spider_net.c:1018: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c: In function 'spider_net_decode_one_descr':
drivers/net/spider_net.c:1215: error: 'dev' undeclared (first use in this 
function)
make[2]: *** [drivers/net/spider_net.o] Error 1

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/spider_net.c |   24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff -ruNp a/drivers/net/spider_net.c b/drivers/net/spider_net.c
--- a/drivers/net/spider_net.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/spider_net.c  2007-09-22 12:12:23.0 +0530
@@ -795,6 +795,7 @@ spider_net_set_low_watermark(struct spid
 static int
 spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
 {
+   struct net_device *dev = card-netdev;
struct spider_net_descr_chain *chain = card-tx_chain;
struct spider_net_descr *descr;
struct spider_net_hw_descr *hwdescr;
@@ -919,7 +920,7 @@ spider_net_xmit(struct sk_buff *skb, str
spider_net_release_tx_chain(card, 0);
 
if (spider_net_prepare_tx_descr(card, skb) != 0) {
-   dev-stats.tx_dropped++;
+   netdev-stats.tx_dropped++;
netif_stop_queue(netdev);
return NETDEV_TX_BUSY;
}
@@ -979,16 +980,12 @@ static void
 spider_net_pass_skb_up(struct spider_net_descr *descr,
   struct spider_net_card *card)
 {
-   struct spider_net_hw_descr *hwdescr= descr-hwdescr;
-   struct sk_buff *skb;
-   struct net_device *netdev;
-   u32 data_status, data_error;
-
-   data_status = hwdescr-data_status;
-   data_error = hwdescr-data_error;
-   netdev = card-netdev;
+   struct spider_net_hw_descr *hwdescr = descr-hwdescr;
+   struct sk_buff *skb = descr-skb;
+   struct net_device *netdev = card-netdev;
+   u32 data_status = hwdescr-data_status;
+   u32 data_error = hwdescr-data_error;
 
-   skb = descr-skb;
skb_put(skb, hwdescr-valid_size);
 
/* the card seems to add 2 bytes of junk in front
@@ -1015,8 +1012,8 @@ spider_net_pass_skb_up(struct spider_net
}
 
/* update netdevice statistics */
-   dev-stats.rx_packets++;
-   dev-stats.rx_bytes += skb-len;
+   netdev-stats.rx_packets++;
+   netdev-stats.rx_bytes += skb-len;
 
/* pass skb up to stack */
netif_receive_skb(skb);
@@ -1184,6 +1181,7 @@ static int spider_net_resync_tail_ptr(st
 static int
 spider_net_decode_one_descr(struct spider_net_card *card)
 {
+   struct net_device *dev = card-netdev;
struct spider_net_descr_chain *chain = card-rx_chain;
struct spider_net_descr *descr = chain-tail;
struct spider_net_hw_descr *hwdescr = descr-hwdescr;
@@ -1210,7 +1208,7 @@ spider_net_decode_one_descr(struct spide
 (status == SPIDER_NET_DESCR_PROTECTION_ERROR) ||
 (status == SPIDER_NET_DESCR_FORCE_END) ) {
if (netif_msg_rx_err(card))
-   dev_err(card-netdev-dev,
+   dev_err(dev-dev,
   dropping RX descriptor with state %d\n, 
status);
dev-stats.rx_dropped++;
goto bad_desc;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/net/spider_net.c

Fixing the above showed up another problem in another file of the
same driver (drivers/net/spider_net_ethtool.c)


[PATCH -mm] spider_net_ethtool: Keep up with recent netdev stats changes

Unbreak the following:

drivers/net/spider_net_ethtool.c: In function 'spider_net_get_ethtool_stats':
drivers/net/spider_net_ethtool.c:160: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:161: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:162: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:163: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:164: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:165: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:166: error: structure has no member named 
'netdev_stats'
make[2]: *** [drivers/net/spider_net_ethtool.o] Error 1

Also do another ARRAY_SIZE() cleanup while at it.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/spider_net_ethtool.c |   18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff -ruNp a/drivers/net/spider_net_ethtool.c b/drivers/net/spider_net_ethtool.c
--- a/drivers/net/spider_net_ethtool.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/spider_net_ethtool.c  2007-09-22 12:43:51.0 +0530
@@ -28,8 +28,6 @@
 #include spider_net.h
 
 
-#define SPIDER_NET_NUM_STATS 13
-
 static struct {
const char str[ETH_GSTRING_LEN];
 } ethtool_stats_keys[] = {
@@ -149,7 +147,7 @@ spider_net_ethtool_get_ringparam(struct 
 
 static int spider_net_get_stats_count(struct net_device *netdev)
 {
-   return SPIDER_NET_NUM_STATS;
+   return ARRAY_SIZE(ethtool_stats_keys);
 }
 
 static void spider_net_get_ethtool_stats(struct net_device *netdev,
@@ -157,13 +155,13 @@ static void spider_net_get_ethtool_stats
 {
struct spider_net_card *card = netdev-priv;
 
-   data[0] = card-netdev_stats.tx_packets;
-   data[1] = card-netdev_stats.tx_bytes;
-   data[2] = card-netdev_stats.rx_packets;
-   data[3] = card-netdev_stats.rx_bytes;
-   data[4] = card-netdev_stats.tx_errors;
-   data[5] = card-netdev_stats.tx_dropped;
-   data[6] = card-netdev_stats.rx_dropped;
+   data[0] = netdev-stats.tx_packets;
+   data[1] = netdev-stats.tx_bytes;
+   data[2] = netdev-stats.rx_packets;
+   data[3] = netdev-stats.rx_bytes;
+   data[4] = netdev-stats.tx_errors;
+   data[5] = netdev-stats.tx_dropped;
+   data[6] = netdev-stats.rx_dropped;
data[7] = card-spider_stats.rx_desc_error;
data[8] = card-spider_stats.tx_timeouts;
data[9] = card-spider_stats.alloc_rx_skb_error;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm] pasemi_mac: Build fix after recent netdev stats changes

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/net/pasemi_mac.c


[PATCH -mm] pasemi_mac: Build fix after recent netdev stats changes

Unbreak the following:

drivers/net/pasemi_mac.c: In function 'pasemi_mac_clean_rx':
drivers/net/pasemi_mac.c:533: error: 'dev' undeclared (first use in this 
function)
drivers/net/pasemi_mac.c:533: error: (Each undeclared identifier is reported 
only once
drivers/net/pasemi_mac.c:533: error: for each function it appears in.)

And remove an unused static function:

drivers/net/pasemi_mac.c: At top level:
drivers/net/pasemi_mac.c:89: warning: 'read_iob_reg' defined but not used

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/pasemi_mac.c |9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff -ruNp a/drivers/net/pasemi_mac.c b/drivers/net/pasemi_mac.c
--- a/drivers/net/pasemi_mac.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/pasemi_mac.c  2007-09-22 13:03:04.0 +0530
@@ -85,11 +85,6 @@ MODULE_PARM_DESC(debug, PA Semi MAC bit
 
 static struct pasdma_status *dma_status;
 
-static unsigned int read_iob_reg(struct pasemi_mac *mac, unsigned int reg)
-{
-   return in_le32(mac-iob_regs+reg);
-}
-
 static void write_iob_reg(struct pasemi_mac *mac, unsigned int reg,
  unsigned int val)
 {
@@ -530,8 +525,8 @@ static int pasemi_mac_clean_rx(struct pa
} else
skb-ip_summed = CHECKSUM_NONE;
 
-   dev-stats.rx_bytes += len;
-   dev-stats.rx_packets++;
+   mac-netdev-stats.rx_bytes += len;
+   mac-netdev-stats.rx_packets++;
 
skb-protocol = eth_type_trans(skb, mac-netdev);
netif_receive_skb(skb);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm] mv643xx_eth: Remove redundant multiple initialization

2007-09-22 Thread Satyam Sharma


Of ethtool_ops-get_stats_count and ethtool_ops-get_ethtool_stats.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/mv643xx_eth.c |2 --
 1 file changed, 2 deletions(-)

diff -ruNp a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
--- a/drivers/net/mv643xx_eth.c 2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/mv643xx_eth.c 2007-09-22 13:15:41.0 +0530
@@ -2740,8 +2740,6 @@ static const struct ethtool_ops mv643xx_
.get_stats_count= mv643xx_get_stats_count,
.get_ethtool_stats  = mv643xx_get_ethtool_stats,
.get_strings= mv643xx_get_strings,
-   .get_stats_count= mv643xx_get_stats_count,
-   .get_ethtool_stats  = mv643xx_get_ethtool_stats,
.nway_reset = mv643xx_eth_nway_restart,
 };
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm] iseries_veth: Kill unused variable

2007-09-22 Thread Satyam Sharma


drivers/net/iseries_veth.c: In function 'veth_transmit_to_many':
drivers/net/iseries_veth.c:1174: warning: unused variable 'port'

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/iseries_veth.c |1 -
 1 file changed, 1 deletion(-)

diff -ruNp a/drivers/net/iseries_veth.c b/drivers/net/iseries_veth.c
--- a/drivers/net/iseries_veth.c2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/iseries_veth.c2007-09-22 11:17:29.0 +0530
@@ -1171,7 +1171,6 @@ static void veth_transmit_to_many(struct
  HvLpIndexMap lpmask,
  struct net_device *dev)
 {
-   struct veth_port *port = (struct veth_port *) dev-priv;
int i, success, error;
 
success = error = 0;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: 2.6.23-rc4-mm1 OOPS in forcedeth?

2007-09-17 Thread Satyam Sharma



On Mon, 17 Sep 2007, Denis V. Lunev wrote:
 Dhaval Giani wrote:
  On Thu, Sep 13, 2007 at 11:51:33PM -0400, Andrew James Wade wrote:

  EIP: [c037d81b] tcp_rto_min+0xb/0x15 SS:ESP 0068:c0596dec

As Vlad Yasevich mentioned, this one is already fixed in 23-rc6.

The forcedeth oops is unrelated, but multiple people have reported that
same oops now -- adding Manfred Spraul to CC. [ original thread is at:
http://lkml.org/lkml/2007/9/1/115 ]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] Make the pr_*() family of macros in kernel.h complete

2007-09-14 Thread Satyam Sharma

  -Original Message-
  From: Medve Emilian-EMMEDVE1
  Sent: Wednesday, September 12, 2007 11:40 AM
  To: [EMAIL PROTECTED]; netdev@vger.kernel.org;
  [EMAIL PROTECTED]; [EMAIL PROTECTED]
  Cc: Medve Emilian-EMMEDVE1

... no maintainer explicitly copied on the original mail who'd push this in.
Little thing, but LKML is too noisy, so you need to Cc: folks explicitly ...

  Subject: [PATCH v3] Make the pr_*() family of macros in
  kernel.h complete
 
  Other/Some pr_*() macros are already defined in kernel.h, but
  pr_err() was defined
  multiple times in several other places
 
  Signed-off-by: Emil Medve [EMAIL PROTECTED]

I think it's a useful cleanup, patch looks good to me ...

Reviewed-by: Satyam Sharma [EMAIL PROTECTED]

[ Original diff is at: http://lkml.org/lkml/diff/2007/9/12/182/1 ]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: BUG: scheduling while atomic: ifconfig/0x00000002/4170

2007-09-06 Thread Satyam Sharma



On Thu, 6 Sep 2007, Herbert Xu wrote:

 On Thu, Sep 06, 2007 at 10:32:33AM +0530, Satyam Sharma wrote:
  
[  382.529041]  [c02c8abc] dev_close+0x24/0x67
[  382.529052]  [e01f402b] ieee80211_master_stop+0x4a/0x6d [mac80211]
 
 This is where the bug is.  You cannot call dev_close from an
 atomic context as i33380211_master_stop does it within spin
 locks.

Doh, of course! I must be blind ... and wait_for_completion()'s
might_sleep() clearly didn't trigger earlier because Florian must've
had CONFIG_DEBUG_SPINLOCK_SLEEP off in his .config ...
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH -mm 1/2] 3c59x: Fix uninitialized variable bug

2007-09-06 Thread Satyam Sharma



On Tue, 4 Sep 2007, Mark Hindley wrote:

 On Tue, Sep 04, 2007 at 02:09:47PM +0530, Satyam Sharma wrote:
  Hi Steffen,
  
  
  On Tue, 4 Sep 2007, Steffen Klassert wrote:
  
   On Tue, Sep 04, 2007 at 03:45:55AM +0530, Satyam Sharma wrote:

drivers/net/3c59x.c: In function 'vortex_up':
drivers/net/3c59x.c:1495: warning: 'err' may be used uninitialized in 
this function
   
   This came in with the recently applied 
   3c59x-check-return-of-pci_enable_device patch
   from Mark Hindley. I just compiled it on a PCI only machine so far, 
   therefore I did
   not notice the warning yet.
  
  Hmm, the .config I built with had PCI=y as well. Probably a compiler
  version difference -- Jeff also mentioned yesterday that some newer
  GCC versions fail to warn about uninitialized variables cases.
  
 
 Sorry, this is my bad. I have just checked: there is no warning with gcc
 4.2 or 4.1, but 3.3 emits the warning. 

This is a GCC bug (regression, actually, as you've found out) -- no two
ways about it. Although different from the kind Jeff mentioned couple days
back -- that was about wising GCC up to false positives and /not/ emitting
warnings. But here a genuine problem was not complained about, so this is
more serious. Do you plan to open up a bug at gcc.gnu.org/bugzilla/ ?


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: kernel crashes inside MV643xx driver

2007-09-05 Thread Satyam Sharma



On Wed, 5 Sep 2007, Dale Farnsworth wrote:

 On Wed, Sep 05, 2007 at 08:24:52AM -0700, Andrew Morton wrote:
   On Mon, 20 Aug 2007 14:38:57 +0800 gshan [EMAIL PROTECTED] wrote:
   Hi All,
   
   After I started the NFS server, it crashed:
   
   3Badness in local_bh_enable at 
   /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
   Badness in local_bh_enable at 
   /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
   Call trace:
[c0005340] check_bug_trap+0xbc/0x11c
[c0005604] ProgramCheckException+0x264/0x2bc
[c0004ac4] ret_from_except_full+0x0/0x4c
[c0022ae4] local_bh_enable+0x18/0x80
[c024648c] skb_copy_bits+0x168/0x3b8
[c024db44] __skb_linearize+0x90/0x150
[c020e8a4] mv643xx_eth_start_xmit+0x4c0/0x5bc
[c025c934] qdisc_restart+0xac/0x2bc
[c024de9c] dev_queue_xmit+0x298/0x34c
[c0269814] ip_finish_output+0x140/0x2b8
[c026a3ac] ip_fragment+0x3cc/0x6e0
[c026bac8] ip_push_pending_frames+0x3dc/0x46c
[c0289ec4] udp_push_pending_frames+0x10c/0x1cc
[c028a7c4] udp_sendpage+0x104/0x188
[c0292fc8] inet_sendpage+0x90/0xb8
   
   I searched the webs and found the similar problems:

   http://www.mail-archive.com/netdev@vger.kernel.org/msg05199.html

   http://oss.sgi.com/archives/netdev/2005-09/msg00025.html
   
   Who knew there are fixes for the problem?
  
  Well that got a tremendous response, didn't it?

I thought of replying to this post when I saw it couple weeks back, but
didn't because (1) that's an ancient kernel and (2) the first link that
was mentioned up there in the original post itself pointed to a patch to
solve it (which I verified was since applied to mainline too).


  What do you mean by crashed?  The above is a warning and the system
  should have survived.
  
  Which kernel version is being used?
 
 That is the key question.  From the pathnames, I suspect that gshan is
 using a MontaVista version.  I'm still (especially) interested, since
 MontaVista pays me.
 
 BTW, I never received the original message on netdev or linux-kernel.
 Hmm.  Thanks to Andrew for replying.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: BUG: scheduling while atomic: ifconfig/0x00000002/4170

2007-09-05 Thread Satyam Sharma

Hi,


 On 02/09/07, Florian Lohoff [EMAIL PROTECTED] wrote:
 
  Hi,
  with current git i got this when ifconfig eth1 down. eth1 had a mac
  address which looked really like an eth1394 ethernet although the module
  was not loaded. Something is really broken in 2.6.23-currentgit. I always 
  get the

-currentgit would be -rc5 right?


  sysfs rename issues which are discussed to be an udev issue. Then i see
  a eth1394 mac address on an interface which typically shouldn exist
  (udev should rename the wireless to eth1) and when issueing an
  ifconfig eth1 down i get a
 
  BUG: scheduling while atomic: ifconfig/0x0002/4170

Is this reproducible? Also, please send your .config.

BTW the calltrace below shows that eth1 was the wireless interface when
you tried to down it.


  On the next boot i see the eth1394 mac address on the wireless interface
  wmaster0_rename whereas eth1 is active (the wireless) and has the correct
  ip address.

I don't think the scheduling while atomic bug you saw is related to
the other problem you're seeing ... these are probably a bunch of all
different issues, but I'm not sure if eth1394 is involved at all.


  I dont get it - this all looks really messed up. udev is
  debian sid 114-2.

True, messed up it indeed is.


  eth0  Link encap:Ethernet  HWaddr 00:17:42:13:45:8C
UP BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
Interrupt:19
 
  eth1  Link encap:Ethernet  HWaddr 00:18:DE:63:F0:B3
inet addr:195.71.97.208  Bcast:195.71.97.223  Mask:255.255.255.224
inet6 addr: fe80::218:deff:fe63:f0b3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:2079 errors:0 dropped:0 overruns:0 frame:0
TX packets:2220 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:508959 (497.0 KiB)  TX bytes:261123 (255.0 KiB)
 
  wmaster0_ Link encap:UNSPEC  HWaddr 
  00-18-DE-63-F0-B3-30-3A-00-00-00-00-00-00-00-00
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
 
 
  [   14.300736] VFS: Mounted root (ext3 filesystem) readonly.
  [   14.300902] Freeing unused kernel memory: 216k freed
  [   17.618804] irda_init()
  [   17.618817] NET: Registered protocol family 23
  [   17.636399] ACPI: PCI Interrupt :02:00.0[A] - GSI 16 (level, low) 
  - IRQ 19
  [   17.636588] PCI: Setting latency timer of device :02:00.0 to 64
  [   17.636619] sky2 :02:00.0: v1.17 addr 0xf000 irq 19 Yukon-EC 
  Ultra (0xb4) rev 2
  [   17.648081] parport_pc 00:0c: reported by Plug and Play ACPI
  [   17.648206] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
  [   17.653652] sky2 eth0: addr 00:17:42:13:45:8c
  [   17.680848] input: Video Bus as /class/input/input6
  [   17.680961] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: 
  no)
  [   17.757019] usbcore: registered new interface driver usbfs
  [   17.757139] usbcore: registered new interface driver hub
  [   17.757264] usbcore: registered new device driver usb
  [   17.824819] Yenta: CardBus bridge found at :08:03.0 [10cf:131e]
  [   17.824941] Yenta O2: res at 0x94/0xD4: 00/ea
  [   17.825034] Yenta O2: enabling read prefetch/write burst
  [   17.828363] hda: ATAPI 24X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, 
  UDMA(33)
  [   17.828838] Uniform CD-ROM driver Revision: 3.20
  [   17.891481] USB Universal Host Controller Interface driver v3.0
  [   17.891650] ACPI: PCI Interrupt :00:1d.0[A] - GSI 23 (level, low) 
  - IRQ 22
  [   17.891840] PCI: Setting latency timer of device :00:1d.0 to 64
  [   17.891844] uhci_hcd :00:1d.0: UHCI Host Controller
  [   17.892155] uhci_hcd :00:1d.0: new USB bus registered, assigned bus 
  number 1
  [   17.892327] uhci_hcd :00:1d.0: irq 22, io base 0x1820
  [   17.892571] usb usb1: configuration #1 chosen from 1 choice
  [   17.892689] hub 1-0:1.0: USB hub found
  [   17.892784] hub 1-0:1.0: 2 ports detected
  [   17.924265] found SMC SuperIO Chip (devid=0x7a rev=00 base=0x002e): 
  LPC47N227
  [   17.924390] smsc_superio_flat(): fir: 0x6e8, sir: 0x2e8, dma: 03, irq: 
  3, mode: 0x0e
  [   17.924526] smsc_ircc_present: can't get sir_base of 0x2e8
  [   17.954918] Yenta: ISA IRQ mask 0x0c38, PCI irq 19
  [   17.955009] Socket status: 3006
  [   17.955094] Yenta: Raising subordinate bus# of parent bus (#08) from #09 
  to #0c
  [   17.955225] pcmcia: parent PCI bridge I/O window: 0x3000 - 0x3fff
  [   17.955315] cs: IO port probe 0x3000-0x3fff: clean.
  [   17.955773]

Re: [PATCH] Fix a potential NULL pointer dereference in uli526x_interrupt() in drivers/net/tulip/uli526x.c

2007-09-04 Thread Satyam Sharma

Hi Micah,


On Tue, 4 Sep 2007, Micah Gruber wrote:

 This patch fixes a potential null dereference bug where we dereference
 dev before a null check. This patch simply moves the dereferencing after
 the null check.
 
 Signed-off-by: Micah Gruber [EMAIL PROTECTED]
 ---
 
 --- a/drivers/net/tulip/uli526x.c
 +++ b/drivers/net/tulip/uli526x.c
 @@ -663,7 +663,7 @@
  {
   struct net_device *dev = dev_id;
   struct uli526x_board_info *db = netdev_priv(dev);
 - unsigned long ioaddr;
 + unsigned long ioaddr = dev-base_addr;
   unsigned long flags;
  
   if (!dev) {
 @@ -671,8 +671,6 @@
   return IRQ_NONE;
   }
  
 - ioaddr = dev-base_addr;
 -
   spin_lock_irqsave(db-lock, flags);
   outl(0, ioaddr + DCR7);


[The patch is reversed.]

But it is also complete -- you've left out the netdev_priv(dev)
statement _just_ before the dereference that you've pushed down.
Coverity wasn't smart enough to tell, but that's actually a dev-priv,
so we'll _still_ be dereferencing dev before checking it for NULL below.
And lastly, the patch isn't quite correct. It is the (!dev) check that
is redundant and must be removed instead :-)

I bet more such pointless checks exist all over. It makes more sense to
remove them than unnecessarily try to solve potential NULL dereferences,
that we (naturally) never saw earlier, because they're impossible. ISTR
pointing out two similar patches a month or so back when Jeff pushed net
driver patches upstream, IIRC ... ]


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH -mm 1/2] 3c59x: Fix uninitialized variable bug

2007-09-04 Thread Satyam Sharma

Hi Steffen,


On Tue, 4 Sep 2007, Steffen Klassert wrote:

 On Tue, Sep 04, 2007 at 03:45:55AM +0530, Satyam Sharma wrote:
  
  drivers/net/3c59x.c: In function 'vortex_up':
  drivers/net/3c59x.c:1495: warning: 'err' may be used uninitialized in this 
  function
 
 This came in with the recently applied 
 3c59x-check-return-of-pci_enable_device patch
 from Mark Hindley. I just compiled it on a PCI only machine so far, therefore 
 I did
 not notice the warning yet.

Hmm, the .config I built with had PCI=y as well. Probably a compiler
version difference -- Jeff also mentioned yesterday that some newer
GCC versions fail to warn about uninitialized variables cases.


  is a genuine bug. The function returns an uninitialized value of 'err'
  back to the caller, which expects it to be 0 for success cases. Let's
  fix this by explicitly initializing 'err' to zero.
  
  Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
 Acked-by: Steffen Klassert [EMAIL PROTECTED]


Thanks,

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Fix a potential NULL pointer dereference in uli526x_interrupt() in drivers/net/tulip/uli526x.c

2007-09-04 Thread Satyam Sharma



On Tue, 4 Sep 2007, Satyam Sharma wrote:

 Hi Micah,
 
 
 On Tue, 4 Sep 2007, Micah Gruber wrote:
 
  This patch fixes a potential null dereference bug where we dereference
  dev before a null check. This patch simply moves the dereferencing after
  the null check.
  
  Signed-off-by: Micah Gruber [EMAIL PROTECTED]
  ---
  
  --- a/drivers/net/tulip/uli526x.c
  +++ b/drivers/net/tulip/uli526x.c
  @@ -663,7 +663,7 @@
   {
  struct net_device *dev = dev_id;
  struct uli526x_board_info *db = netdev_priv(dev);
  -   unsigned long ioaddr;
  +   unsigned long ioaddr = dev-base_addr;
  unsigned long flags;
   
  if (!dev) {
  @@ -671,8 +671,6 @@
  return IRQ_NONE;
  }
   
  -   ioaddr = dev-base_addr;
  -
  spin_lock_irqsave(db-lock, flags);
  outl(0, ioaddr + DCR7);
 
 
 [The patch is reversed.]
 
 But it is also complete -- you've left out the netdev_priv(dev)
 
I meant:   incomplete

 statement _just_ before the dereference that you've pushed down.
 Coverity wasn't smart enough to tell, but that's actually a dev-priv,
 so we'll _still_ be dereferencing dev before checking it for NULL below.
 And lastly, the patch isn't quite correct. It is the (!dev) check that
 is redundant and must be removed instead :-)
 
 I bet more such pointless checks exist all over. It makes more sense to
 remove them than unnecessarily try to solve potential NULL dereferences,
 that we (naturally) never saw earlier, because they're impossible. ISTR
 pointing out two similar patches a month or so back when Jeff pushed net
 driver patches upstream, IIRC ... ]
 
 
 Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm 1/2] 3c59x: Fix uninitialized variable bug

2007-09-03 Thread Satyam Sharma


drivers/net/3c59x.c: In function 'vortex_up':
drivers/net/3c59x.c:1495: warning: 'err' may be used uninitialized in this 
function

is a genuine bug. The function returns an uninitialized value of 'err'
back to the caller, which expects it to be 0 for success cases. Let's
fix this by explicitly initializing 'err' to zero.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/3c59x.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.23-rc4-mm1/drivers/net/3c59x.c~fix2007-09-04 
03:29:40.0 +0530
+++ linux-2.6.23-rc4-mm1/drivers/net/3c59x.c2007-09-04 03:30:08.0 
+0530
@@ -1492,7 +1492,8 @@ vortex_up(struct net_device *dev)
struct vortex_private *vp = netdev_priv(dev);
void __iomem *ioaddr = vp-ioaddr;
unsigned int config;
-   int i, mii_reg1, mii_reg5, err;
+   int i, mii_reg1, mii_reg5;
+   int err = 0;
 
if (VORTEX_PCI(vp)) {
pci_set_power_state(VORTEX_PCI(vp), PCI_D0);/* Go active */
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm 2/2] 3c59x MAINTAINERS

2007-09-03 Thread Satyam Sharma

Remove duplicate entry for the same driver.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 MAINTAINERS |6 --
 1 file changed, 6 deletions(-)

--- linux-2.6.23-rc4-mm1/MAINTAINERS~fix2007-09-04 03:49:16.0 
+0530
+++ linux-2.6.23-rc4-mm1/MAINTAINERS2007-09-04 03:49:28.0 +0530
@@ -104,12 +104,6 @@ M: [EMAIL PROTECTED]
 L: netdev@vger.kernel.org
 S: Maintained
 
-3C59X NETWORK DRIVER
-P: Steffen Klassert
-M: [EMAIL PROTECTED]
-L: netdev@vger.kernel.org
-S: Maintained
-
 3CR990 NETWORK DRIVER
 P: David Dillow
 M: [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: 2.6.23-rc4-mm1 OOPS in forcedeth?

2007-09-02 Thread Satyam Sharma



On Sun, 2 Sep 2007, [EMAIL PROTECTED] wrote:
 
   [EMAIL PROTECTED] wrote:

On this machine (Athlon 64 X2 4600, 4 GiB memory, lots of disks),
2.6.23-rc1-mm2 runs fine. 2.6.23-rc4-mm1 reproducably dies within 
seconds of
starting
a rsync session on another PC against this machine.

NULL pointer dereference
code:   nv_napi_poll+0x108
trace:  net_rx_action+0xab
__do_softirq+0x74
call_softirq+0x1c
do_softirq+0x3d
irq_exit+0x85
do_IRQ+0x85
ret_from_intr+0x0
 
 There are 4 pictures of oopses here:
 
 http://www.xs4all.nl/~thunder7/oops_2623rc4mm1_1.jpg
 http://www.xs4all.nl/~thunder7/oops_2623rc4mm1_2.jpg
 http://www.xs4all.nl/~thunder7/oops_2623rc4mm1_3.jpg
 http://www.xs4all.nl/~thunder7/oops_2623rc4mm1_4.jpg

OK, I've been pouring over forcedeth.c and the newly introduce NAPI code,
but didn't debug this yet, so I'll at least lay out the situation so that
somebody else who's more experienced @netdev can pick up from here with
minimal time wastage.

Here's what's happening (repeatedly, reproducibly) on Jurriaan's x64 box:

(1) The following NULL dereference oops:

nv_rx_process_optimized(), inlined from nv_napi_poll(), found that
skb i.e. np-get_rx_ctx-skb == NULL when trying to update
skb-ip_summed.

(2) The following BUG in napi_complete():

BUG_ON(!test_bit(NAPI_STATE_SCHED, n-state));

from the nv_napi_poll()-__netif_rx_complete()-napi_complete()
callchain is triggering. IOW napi_complete() found that a NAPI
poll wasn't/shouldn't have been scheduled at all (!)

The above two problems appear to be occurring independently, AFAICT.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [OOPS] 2.6.23-rc5 ? network/via-rhine [was: hang with CONFIG_MCYRIXIII]

2007-09-02 Thread Satyam Sharma



On Sun, 2 Sep 2007, Mark Hindley wrote:
 
 BUG: unable to handle kernel NULL pointer dereference at virtual address 
 0025
 [...]
 Call Trace:
  [c0259b1b] tcp_rtt_estimator+0xba/0x100
 [...]
 EIP: [c0259a57] tcp_rto_min+0x8/0x12 SS:ESP 0068:c0341dec

Third report of this oops within past few hours :-)

You want the patch available at:

(very long URL)
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=blobdiff;f=net/ipv4/tcp_input.c;h=bbad2cdb74b7c18c0ae742be241857b126f06890;hp=1ee72127462bf37ed5ef834ebc7b1cf070d5;hb=5c127c58ae9bf196d787815b1bd6b0aec5aee816;hpb=88282c6ecf0dacefda6090b0289d35e8a3cc6221
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm] sunrpc svc: Shut up bogus uninitialized variable warning

2007-09-02 Thread Satyam Sharma


net/sunrpc/svc.c: In function ‘__svc_create_thread’:
net/sunrpc/svc.c:550: warning: ‘oldmask.bits[0u]’ may be used uninitialized in 
this function

is a bogus warning, but gcc isn't smart enough to see why. We cannot just
reorganize the code in the function, because we want the set_cpus_allowed()
restore to happen only after the kernel_thread() is forked. Alas, we have
to use cpus_clear() to initialize oldmask instead to keep gcc happy.

Also add some comments to describe what's happening in the function.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 net/sunrpc/svc.c |7 +++
 1 file changed, 7 insertions(+)

--- linux-2.6.23-rc4-mm1/net/sunrpc/svc.c~fix   2007-09-02 22:55:25.0 
+0530
+++ linux-2.6.23-rc4-mm1/net/sunrpc/svc.c   2007-09-02 23:03:45.0 
+0530
@@ -568,17 +568,24 @@ __svc_create_thread(svc_thread_fn func, 
rqstp-rq_server = serv;
rqstp-rq_pool = pool;
 
+   /* Only to prevent gcc warning */
+   cpus_clear(oldmask);
+
+   /* Temporarily change CPU affinity before forking svc thread */
if (serv-sv_nrpools  1)
have_oldmask = svc_pool_map_set_cpumask(pool-sp_id, oldmask);
 
+   /* Will inherit current-cpus_allowed */
error = kernel_thread((int (*)(void *)) func, rqstp, 0);
 
+   /* Restore old cpus_allowed */
if (have_oldmask)
set_cpus_allowed(current, oldmask);
 
if (error  0)
goto out_thread;
svc_sock_update_bufs(serv);
+
error = 0;
 out:
return error;

Re: 2.6.23-rc4-mm1 no CRC MODPOST warnings

2007-09-01 Thread Satyam Sharma



On Fri, 31 Aug 2007, Andrew Morton wrote:
 
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc4/2.6.23-rc4-mm1/

Got these on an i386 build with CONFIG_MODVERSIONS=y ...

WARNING: div64_64 [net/netfilter/xt_connbytes.ko] has no CRC!
WARNING: div64_64 [net/ipv4/tcp_cubic.ko] has no CRC!

Full .config at: http://www.cse.iitk.ac.in/users/ssatyam/config-mm
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: 2.6.23-rc4-mm1 OOPS in forcedeth?

2007-09-01 Thread Satyam Sharma

Hi Jurriaan,

 [EMAIL PROTECTED] wrote:
  From: Andrew Morton [EMAIL PROTECTED]
  Date: Fri, Aug 31, 2007 at 09:58:22PM -0700
   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc4/2.6.23-rc4-mm1/

  On this machine (Athlon 64 X2 4600, 4 GiB memory, lots of disks),
  2.6.23-rc1-mm2 runs fine. 2.6.23-rc4-mm1 reproducably dies within seconds of
  starting
  a rsync session on another PC against this machine.

  NULL pointer dereference
  code:   nv_napi_poll+0x108
  trace:  net_rx_action+0xab
  __do_softirq+0x74
  call_softirq+0x1c
  do_softirq+0x3d
  irq_exit+0x85
  do_IRQ+0x85
  ret_from_intr+0x0

The dmesg you posted below doesn't cover the messages from this oops
itself. As you mentioned you can reproduce this oops easily, please do so,
and post the *full* oops log (if it doesn't get logged to disk, you can
try taking digicam photo, or write down *all* the messages and post here).
I built an x86_64 kernel as per your .config, but don't see any memory
dereference at nv_napi_poll+0x108 -- could be toolchain differences.

Else, can you run:
$ gdb ./vmlinux

and then:
(gdb) l *nv_napi_poll+0x108

and send us the output?

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH -mm] net/sched/sch_cbq.c: Shut up uninitialized variable warning

2007-09-01 Thread Satyam Sharma


net/sched/sch_cbq.c: In function 'cbq_enqueue':
net/sched/sch_cbq.c:383: warning: 'ret' may be used uninitialized in this 
function

has been verified to be a bogus case. So let's shut it up.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 net/sched/sch_cbq.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.23-rc4-mm1/net/sched/sch_cbq.c~fix2007-09-02 
06:45:08.0 +0530
+++ linux-2.6.23-rc4-mm1/net/sched/sch_cbq.c2007-09-02 06:44:37.0 
+0530
@@ -380,7 +380,7 @@ cbq_enqueue(struct sk_buff *skb, struct 
 {
struct cbq_sched_data *q = qdisc_priv(sch);
int len = skb-len;
-   int ret;
+   int uninitialized_var(ret);
struct cbq_class *cl = cbq_classify(skb, sch, ret);
 
 #ifdef CONFIG_NET_CLS_ACT
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: 2.6.23-rc4-mm1: unpingable box and NULL dereference at tcp_rto_min()

2007-09-01 Thread Satyam Sharma



On Sun, 2 Sep 2007, Alexey Dobriyan wrote:
 
 Unable to handle kernel NULL pointer dereference at 0039 RIP: 
  [803b6f7c] tcp_rto_min+0xc/0x20

tcp_rto_min() lacks a check-for-NULL. You want 5c127c58ae9bf196 from
the net-2.6.git tree -- so this will be gone in -rc6.

 P.S.: uh-oh, it's [TCP] Allow minnimum RTO ... aka 05bb1fad1cde

Yup, it came from this last commit in net-2.6 before -rc5.

[ Considering it's pretty core code (and thus the oops fairly easily
  reproducible), I initially thought this must've come from net-2.6.24.
  I suspect lot of testers might hit this, so would be wise to put that
  patch up as a hot-fix ? ]


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug? Realtek RTL-8169 Gigabit Ethernet and High Resolution Timers

2007-08-30 Thread Satyam Sharma

[ Adding relevant Cc:'s ]


On Thu, 30 Aug 2007, n wrote:

 I found a bug when using the Ethernet controller: Realtek Semiconductor Co.,
 Ltd. RTL-8169 Gigabit Ethernet (rev 10)  ethernet card and kernel High
 Resolution Timers (menuconfig - Processor type and features - High
 Resolution Timer Support )
 after about 20 kernel compiles i finnaly figured out this option was making
 the ethernet card slow
 tested kernels 2.6.22,2.6.22.5, 2.6.23-rc4 (running now)
 
 the cpu is the pc is a p3 667 mhz so its cpu limited i guess and maxes at
 400mbit~. (without High Resolution Timers )
 with High Resolution Timers off it would only do 50-70mbit.
 
 r8169 (compile as module + napi)
 High Resolution Timer Support (enabled)
 tested with iperf also system is very unresponsive i cant even open a ssh
 session or type a command while the test ran
 [  5]  0.0-10.0 sec  64.5 MBytes  54.0 Mbits/sec
 
 r8169 (compile as module + napi)
 High Resolution Timer Support (disabled)
 [  3]  0.0-10.0 sec552 MBytes463 Mbits/sec
 
 as you can see with it disabled the speed really improved.
 but im wondering what does High Resolution Timer's do and what does it have to
 do with a pci gigabit ethernet card that would slow it down
 maybe someone else could test and see if they get the same results
 
 this might solve alot of problems with this card since alot of distros compile
 Resolution Timer default to enabled...
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

2007-08-24 Thread Satyam Sharma

Hi Denys,


On Fri, 24 Aug 2007, Denys Vlasenko wrote:

 On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
 
   static inline void wait_for_init_deassert(atomic_t *deassert)
   {
  -   while (!atomic_read(deassert));
  +   while (!atomic_read(deassert))
  +   cpu_relax();
  return;
   }
 
 For less-than-briliant people like me, it's totally non-obvious that
 cpu_relax() is needed for correctness here, not just to make P4 happy.

This thread has been round-and-round with exactly the same discussions
:-) I had proposed few such variants to make a compiler barrier implicit
in atomic_{read,set} myself, but frankly, at least personally speaking
(now that I know better), I'm not so much in favour of implicit barriers
(compiler, memory or both) in atomic_{read,set}.

This might sound like an about-turn if you read my own postings to Nick
Piggin from a week back, but I do agree with most his opinions on the
matter now -- separation of barriers from atomic ops is actually good,
beneficial to certain code that knows what it's doing, explicit usage
of barriers stands out more clearly (most people here who deal with it
do know cpu_relax() is an explicit compiler barrier) compared to an
implicit usage in an atomic_read() or such variant ...


 IOW: atomic_read name quite unambiguously means I will read
 this variable from main memory. Which is not true and creates
 potential for confusion and bugs.

I'd have to disagree here -- atomic ops are all about _atomicity_ of
memory accesses, not _making_ them happen (or visible to other CPUs)
_then and there_ itself. The latter are the job of barriers.

The behaviour (and expectations) are quite comprehensively covered in
atomic_ops.txt -- let alone atomic_{read,set}, even atomic_{inc,dec}
are permitted by archs' implementations to _not_ have any memory
barriers, for that matter. [It is unrelated that on x86 making them
SMP-safe requires the use of the LOCK prefix that also happens to be
an implicit memory barrier.]

An argument was also made about consistency of atomic_{read,set} w.r.t.
the other atomic ops -- but clearly, they are all already consistent!
All of them are atomic :-) The fact that atomic_{read,set} do _not_
require any inline asm or LOCK prefix whereas the others do, has to do
with the fact that unlike all others, atomic_{read,set} are not RMW ops
and hence guaranteed to be atomic just as they are in plain  simple C.

But if people do seem to have a mixed / confused notion of atomicity
and barriers, and if there's consensus, then as I'd said earlier, I
have no issues in going with the consensus (eg. having API variants).
Linus would be more difficult to convince, however, I suspect :-)


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-21 Thread Satyam Sharma

On Tue, 21 Aug 2007, Chris Snook wrote:

 David Miller wrote:
  From: Linus Torvalds [EMAIL PROTECTED]
  Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)

   Ie a barrier() is likely _cheaper_ than the code generation downside
   from using volatile.

  Assuming GCC were ever better about the code generation badness
  with volatile that has been discussed here, I much prefer
  we tell GCC this memory piece changed rather than every
  piece of memory has changed which is what the barrier() does.

  I happened to have been scanning a lot of assembler lately to
  track down a gcc-4.2 miscompilation on sparc64, and the barriers
  do hurt quite a bit in some places.  Instead of keeping unrelated
  variables around cached in local registers, it reloads everything.

 Moore's law is definitely working against us here.  Register counts, pipeline
 depths, core counts, and clock multipliers are all increasing in the long run.
 At some point in the future, barrier() will be universally regarded as a
 hammer too big for most purposes.

I do agree, and the important point to note is that the benefits of a
/lighter/ compiler barrier, such as what David referred to above, _can_
be had without having to do anything with the volatile keyword at all.
And such a primitive has already been mentioned/proposed on this thread.

But this is all tangential to the core question at hand -- whether to have
implicit (compiler, possibly light-weight of the kind referred above)
barrier semantics in atomic ops that do not have them, or not.

I was lately looking in the kernel for _actual_ code that uses atomic_t
and benefits from the lack of any implicit barrier, with the compiler
being free to cache the atomic_t in a register. Now that often does _not_
happen, because all other ops (implemented in asm with LOCK prefix on x86)
_must_ therefore constrain the atomic_t to memory anyway. So typically all
atomic ops code sequences end up operating on memory.

Then I did locate sched.c:select_nohz_load_balancer() -- it repeatedly
references the same atomic_t object, and the code that I saw generated
(with CC_OPTIMIZE_FOR_SIZE=y) did cache it in a register for a sequence of
instructions. It uses atomic_cmpxchg, thereby not requiring explicit
memory barriers anywhere in the code, and is an example of an atomic_t
user that is safe, and yet benefits from its memory loads/stores being
elided/coalesced by the compiler.

# at this point, %%eax holds num_online_cpus() and
# %%ebx holds cpus_weight(nohz.cpu_mask)
# the variable cpu is in %esi

0xc1018e1d:  cmp%eax,%ebx   # if No.A.
0xc1018e1f:  mov0xc134d900,%eax # first atomic_read()
0xc1018e24:  jne0xc1018e36
0xc1018e26:  cmp%esi,%eax   # if No.B.
0xc1018e28:  jne0xc1018e80  # returns with 0
0xc1018e2a:  movl   $0x,0xc134d900  # atomic_set(-1), and ...
0xc1018e34:  jmp0xc1018e80  # ... returns with 0
0xc1018e36:  cmp$0x,%eax# if No.C. (NOTE!)
0xc1018e39:  jne0xc1018e46
0xc1018e3b:  lock cmpxchg %esi,0xc134d900   # atomic_cmpxchg()
0xc1018e43:  inc%eax
0xc1018e44:  jmp0xc1018e48
0xc1018e46:  cmp%esi,%eax   # if No.D. (NOTE!)
0xc1018e48:  jne0xc1018e80  # if !=, default return 0 (if 
No.E.)
0xc1018e4a:  jmp0xc1018e84  # otherwise (==) returns with 1

The above is:

if (cpus_weight(nohz.cpu_mask) == num_online_cpus()) {  /* if No.A. */
if (atomic_read(nohz.load_balancer) == cpu)/* if No.B. */
atomic_set(nohz.load_balancer, -1);/* XXX */
return 0;
}
if (atomic_read(nohz.load_balancer) == -1) {   /* if No.C. */
/* make me the ilb owner */
if (atomic_cmpxchg(nohz.load_balancer, -1, cpu) == -1) /* if 
No.E. */
return 1;
} else if (atomic_read(nohz.load_balancer) == cpu) /* if No.D. */
return 1;
...
...
return 0; /* default return from function */

As you can see, the atomic_read()'s of ifs Nos. B, C, and D, were _all_
coalesced into a single memory reference mov0xc134d900,%eax at the
top of the function, and then ifs Nos. C and D simply used the value
from %%eax itself. But that's perfectly safe, such is the logic of this
function. It uses cmpxchg _whenever_ updating the value in the memory
atomic_t and then returns appropriately. The _only_ point that a casual
reader may find racy is that marked /* XXX */ above -- atomic_read()
followed by atomic_set() with no barrier in between. But even that is ok,
because if one thread ever finds that condition to succeed, it is 100%
guaranteed no other thread on any other CPU will find _any_ condition
to be true, thereby avoiding any race in the modification of that value.

BTW it does sound reasonable that a lot of atomic_t users

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Satyam Sharma

[ LOL, you _are_ shockingly petty! ]


On Sat, 18 Aug 2007, Segher Boessenkool wrote:

   The documentation simply doesn't say +m is allowed.  The code to
   allow it was added for the benefit of people who do not read the
   documentation.  Documentation for +m might get added later if it
   is decided this [the code, not the documentation] is a sane thing
   to have (which isn't directly obvious).
  
  Huh?
  
  If the (current) documentation doesn't match up with the (current)
  code, then _at least one_ of them has to be (as of current) wrong.
  
  I wonder how could you even try to disagree with that.
 
 Easy.
 
 The GCC documentation you're referring to is the user's manual.
 See the blurb on the first page:
 
 This manual documents how to use the GNU compilers, as well as their
 features and incompatibilities, and how to report bugs.  It corresponds
 to GCC version 4.3.0.  The internals of the GNU compilers, including
 how to port them to new targets and some information about how to write
 front ends for new languages, are documented in a separate manual.
 
 _How to use_.  This documentation doesn't describe in minute detail
 everything the compiler does (see the source code for that -- no, it
 isn't described in the internals manual either).

Wow, now that's a nice disclaimer. By your (poor) standards of writing
documentation, one can as well write any factually incorrect stuff that
one wants in a document once you've got such a blurb in place :-)


 If it doesn't tell you how to use +m, and even tells you _not_ to
 use it, maybe that is what it means to say?  It doesn't mean +m
 doesn't actually do something.  It also doesn't mean it does what
 you think it should do.  It might do just that of course.  But treating
 writing C code as an empirical science isn't such a smart idea.

Oh, really? Considering how much is (left out of being) documented, often
one would reasonably have to experimentally see (with testcases) how the
compiler behaves for some given code. Well, at least _I_ do it often
(several others on this list do as well), and I think there's everything
smart about it rather than having to read gcc sources -- I'd be surprised
(unless you have infinite free time on your hands, which does look like
teh case actually) if someone actually prefers reading gcc sources first
to know what/how gcc does something for some given code, rather than
simply write it out, compile and look the generated code (saves time for
those who don't have an infinite amount of it).


  And I didn't go whining about this ... you asked me. (I think I'd said
  something to the effect of GCC docs are often wrong,
 
 No need to guess at what you said, even if you managed to delete
 your own mail already, there are plenty of free web-based archives
 around.  You said:
 
  See, volatile C keyword, for all it's ill-definition and dodgy
  semantics, is still at least given somewhat of a treatment in the C
  standard (whose quality is ... ummm, sadly not always good and clear,
  but unsurprisingly, still about 5,482 orders-of-magnitude times
  better than GCC docs).

Try _reading_ what I said there, for a change, dude. I'd originally only
said unless GCC's docs is yet again wrong ... then _you_ asked me what,
after which this discussion began and I wrote the above [which I fully
agree with -- so what if I used hyperbole in my sentence (yup, that was
intended, and obviously, exaggeration), am I not even allowed to do that?
Man, you're a Nazi or what ...] I didn't go whining about on my own as
you'd had earlier suggested, until _you_ asked me.

[ Ick, I somehow managed to reply this ... this is such a ...
  *disgustingly* petty argument you made here. ]


  which is true,
 
 Yes, documentation of that size often has shortcomings.  No surprise
 there.  However, great effort is made to make it better documentation,
 and especially to keep it up to date; if you find any errors or
 omissions, please report them.  There are many ways how to do that,
 see the GCC homepage./end-of-marketing-blurb
 ^^

Looks like you even get paid :-)


  but probably you feel saying that is not allowed on non-gcc lists?)
 
 [amazingly pointless stuff snipped]
 
  As for the PR
  you're requesting me to file with GCC for this, that
  gcc-patches@ thread did precisely that
 
 [more amazingly pointless stuff snipped]
 
  and more (submitted a patch to
  said documentation -- and no, saying documentation might get added
  later is totally bogus and nonsensical -- documentation exists to
  document current behaviour, not past).
 
 When code like you want to write becomes a supported feature, that
 will be reflected in the user manual.  It is completely nonsensical
 to expect everything that is *not* a supported feature to be mentioned
 there.

What crap. It is _perfectly reasonable_ to expect (current) documentation
to keep up with (current) code behaviour. In fact trying to justify such
a state is completely

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Satyam Sharma



On Fri, 17 Aug 2007, Linus Torvalds wrote:

 On Sat, 18 Aug 2007, Satyam Sharma wrote:
  
  No code does (or would do, or should do):
  
  x.counter++;
  
  on an atomic_t x; anyway.
 
 That's just an example of a general problem.
 
 No, you don't use x.counter++. But you *do* use
 
   if (atomic_read(x) = 1)
 
 and loading into a register is stupid and pointless, when you could just 
 do it as a regular memory-operand to the cmp instruction.

True, but that makes this a bad/poor code generation issue with the
compiler, not something that affects the _correctness_ of atomic ops if
volatile is used for that counter object (as was suggested), because
we'd always use the atomic_inc() etc primitives to do increments, which
are always (should be!) implemented to be atomic.


 And as far as the compiler is concerned, the problem is the 100% same: 
 combining operations with the volatile memop.
 
 The fact is, a compiler that thinks that
 
   movl mem,reg
   cmpl $val,reg
 
 is any better than
 
   cmpl $val,mem
 
 is just not a very good compiler.

Absolutely, this is definitely a bug report worth opening with gcc. And
what you've said to explain this previously sounds definitely correct --
seeing volatile for any access does appear to just scare the hell out
of gcc and makes it generate such (poor) code.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Satyam Sharma



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

   GCC manual, section 6.1, When
  ^^
   is a Volatile Object Accessed? doesn't say anything of the
  ^^^
   kind.
  ^

  True, implementation-defined as per the C standard _is_ supposed to mean
^

  unspecified behaviour where each implementation documents how the choice
  is made. So ok, probably GCC isn't documenting this
  

  implementation-defined behaviour which it is supposed to, but can't really
  fault them much for this, probably.
 
 GCC _is_ documenting this, namely in this section 6.1.

(Again totally petty, but) Yes, but ...

 It doesn't
  ^^
 mention volatile-casted stuff.  Draw your own conclusions.
  ^^

... exactly. So that's why I said GCC isn't documenting _this_.

Man, try _reading_ mails before replying to them ...
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Herbert Xu wrote:

 On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
 
  The cost of doing so seems to me to be well down in the noise - 44
  bytes of extra kernel text on a ppc64 G5 config, and I don't believe
  the extra few cycles for the occasional extra load would be measurable
  (they should all hit in the L1 dcache).  I don't mind if x86[-64] have
  atomic_read/set be nonvolatile and find all the missing barriers, but
  for now on powerpc, I think that not having to find those missing
  barriers is worth the 0.00076% increase in kernel text size.
 
 BTW, the sort of missing barriers that triggered this thread
 aren't that subtle.  It'll result in a simple lock-up if the
 loop condition holds upon entry.  At which point it's fairly
 straightforward to find the culprit.

Not necessarily. A barrier-less buggy code such as below:

atomic_set(v, 0);

... /* some initial code */

while (atomic_read(v))
;

... /* code that MUST NOT be executed unless v becomes non-zero */

(where v-counter is has no volatile access semantics)

could be generated by the compiler to simply *elid* or *do away* with
the loop itself, thereby making the:

/* code that MUST NOT be executed unless v becomes non-zero */

to be executed even when v is zero! That is subtle indeed, and causes
no hard lockups.

Granted, the above IS buggy code. But, the stated objective is to avoid
heisenbugs. And we have driver / subsystem maintainers such as Stefan
coming up and admitting that often a lot of code that's written to use
atomic_read() does assume the read will not be elided by the compiler.

See, I agree, volatility semantics != what we often want. However, if
what we want is compiler barrier, for only the object under consideration,
volatility semantics aren't really nonsensical or anything.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Thu, 16 Aug 2007, Paul E. McKenney wrote:

 On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
  On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
  
   The compiler can also reorder non-volatile accesses.  For an example
   patch that cares about this, please see:
   
 http://lkml.org/lkml/2007/8/7/280
   
   This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
   rcu_read_unlock() to ensure that accesses aren't reordered with respect
   to interrupt handlers and NMIs/SMIs running on that same CPU.
  
  Good, finally we have some code to discuss (even though it's
  not actually in the kernel yet).
 
 There was some earlier in this thread as well.

Hmm, I never quite got what all this interrupt/NMI/SMI handling and
RCU business you mentioned earlier was all about, but now that you've
pointed to the actual code and issues with it ...


  First of all, I think this illustrates that what you want
  here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
  macro occurs a lot more times in your patch than atomic
  reads/sets.  So *assuming* that it was necessary at all,
  then having an ordered variant of the atomic_read/atomic_set
  ops could do just as well.
 
 Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
 to maintain ordering, then I could just use them instead of having to
 create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
 different patch.)

+#define WHATEVER(x)(*(volatile typeof(x) *)(x))

I suppose one could want volatile access semantics for stuff that's
a bit-field too, no?

Also, this gives *zero* re-ordering guarantees that your code wants
as you've explained it below) -- neither w.r.t. CPU re-ordering (which
probably you don't care about) *nor* w.r.t. compiler re-ordering
(which you definitely _do_ care about).


  However, I still don't know which atomic_read/atomic_set in
  your patch would be broken if there were no volatile.  Could
  you please point them out?
 
 Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
 atomic_read() and atomic_set().  Starting with __rcu_read_lock():
 
 o If ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++
   was ordered by the compiler after
   ORDERED_WRT_IRQ(me-rcu_read_lock_nesting) = nesting + 1, then
   suppose an NMI/SMI happened after the rcu_read_lock_nesting but
   before the rcu_flipctr.
 
   Then if there was an rcu_read_lock() in the SMI/NMI
   handler (which is perfectly legal), the nested rcu_read_lock()
   would believe that it could take the then-clause of the
   enclosing if statement.  But because the rcu_flipctr per-CPU
   variable had not yet been incremented, an RCU updater would
   be within its rights to assume that there were no RCU reads
   in progress, thus possibly yanking a data structure out from
   under the reader in the SMI/NMI function.
 
   Fatal outcome.  Note that only one CPU is involved here
   because these are all either per-CPU or per-task variables.

Ok, so you don't care about CPU re-ordering. Still, I should let you know
that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
want is a full compiler optimization barrier().

[ Your code probably works now, and emits correct code, but that's
  just because of gcc did what it did. Nothing in any standard,
  or in any documented behaviour of gcc, or anything about the real
  (or expected) semantics of volatile is protecting the code here. ]


 o If ORDERED_WRT_IRQ(me-rcu_read_lock_nesting) = nesting + 1
   was ordered by the compiler to follow the
   ORDERED_WRT_IRQ(me-rcu_flipctr_idx) = idx, and an NMI/SMI
   happened between the two, then an __rcu_read_lock() in the NMI/SMI
   would incorrectly take the else clause of the enclosing if
   statement.  If some other CPU flipped the rcu_ctrlblk.completed
   in the meantime, then the __rcu_read_lock() would (correctly)
   write the new value into rcu_flipctr_idx.
 
   Well and good so far.  But the problem arises in
   __rcu_read_unlock(), which then decrements the wrong counter.
   Depending on exactly how subsequent events played out, this could
   result in either prematurely ending grace periods or never-ending
   grace periods, both of which are fatal outcomes.
 
 And the following are not needed in the current version of the
 patch, but will be in a future version that either avoids disabling
 irqs or that dispenses with the smp_read_barrier_depends() that I
 have 99% convinced myself is unneeded:
 
 o nesting = ORDERED_WRT_IRQ(me-rcu_read_lock_nesting);
 
 o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed)  0x1;
 
 Furthermore, in that future version, irq handlers can cause the same
 mischief that SMI/NMI handlers can in this version.
 
 Next, looking at __rcu_read_unlock():
 
 o If ORDERED_WRT_IRQ(me-rcu_read_lock_nesting) = nesting - 1
   was reordered by the compiler to

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Paul Mackerras wrote:

 Herbert Xu writes:
 
  On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
   Herbert Xu writes:
   
Can you find an actual atomic_read code snippet there that is
broken without the volatile modifier?
   
   There are some in arch-specific code, for example line 1073 of
   arch/mips/kernel/smtc.c.  On mips, cpu_relax() is just barrier(), so
   the empty loop body is ok provided that atomic_read actually does the
   load each time around the loop.
  
  A barrier() is all you need to force the compiler to reread
  the value.
  
  The people advocating volatile in this thread are talking
  about code that doesn't use barrier()/cpu_relax().
 
 Did you look at it?  Here it is:
 
   /* Someone else is initializing in parallel - let 'em finish */
   while (atomic_read(idle_hook_initialized)  1000)
   ;


Honestly, this thread is suffering from HUGE communication gaps.

What Herbert (obviously) meant there was that this loop could've
been okay _without_ using volatile-semantics-atomic_read() also, if
only it used cpu_relax().

That does work, because cpu_relax() is _at least_ barrier() on all
archs (on some it also emits some arch-dependent pause kind of
instruction).

Now, saying that MIPS does not have such an instruction so I won't
use cpu_relax() for arch-dependent-busy-while-loops in arch/mips/
sounds like a wrong argument, because: tomorrow, such arch's _may_
introduce such an instruction, so naturally, at that time we'd
change cpu_relax() appropriately (in reality, we would actually
*re-define* cpu_relax() and ensure that the correct version gets
pulled in depending on whether the callsite code is legacy or only
for the newer such CPUs of said arch, whatever), but loops such as
this would remain un-changed, because they never used cpu_relax()!

OTOH an argument that said the following would've made a stronger case:

I don't want to use cpu_relax() because that's a full memory
clobber barrier() and I have loop-invariants / other variables
around in that code that I *don't* want the compiler to forget
just because it used cpu_relax(), and hence I will not use
cpu_relax() but instead make my atomic_read() itself have
volatility semantics. Not just that, but I will introduce a
cpu_relax_no_barrier() on MIPS, that would be a no-op #define
for now, but which may not be so forever, and continue to use
that in such busy loops.

In general, please read the thread-summary I've tried to do at:
http://lkml.org/lkml/2007/8/17/25
Feel free to continue / comment / correct stuff from there, there's
too much confusion and circular-arguments happening on this thread
otherwise.

[ I might've made an incorrect statement there about
  volatile w.r.t. cache on non-x86 archs, I think. ]


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Satyam Sharma wrote:
 
  #define atomic_read_volatile(v) \
  ({  \
  forget((v)-counter);   \
  ((v)-counter); \
  })
  
  where:
 
 *vomit* :)

I wonder if this'll generate smaller and better code than _both_ the
other atomic_read_volatile() variants. Would need to build allyesconfig
on lots of diff arch's etc to test the theory though.


 Not only do I hate the keyword volatile, but the barrier is only a
 one-sided affair so its probable this is going to have slightly
 different allowed reorderings than a real volatile access.

True ...


 Also, why would you want to make these insane accessors for atomic_t
 types? Just make sure everybody knows the basics of barriers, and they
 can apply that knowledge to atomic_t and all other lockless memory
 accesses as well.

Code that looks like:

while (!atomic_read(v)) {
...
cpu_relax_no_barrier();
forget(v.counter);

}

would be uglier. Also think about code such as:

a = atomic_read();
if (!a)
do_something();

forget();
a = atomic_read();
... /* some code that depends on value of a, obviously */

forget();
a = atomic_read();
...

So much explicit sprinkling of forget() looks ugly.

atomic_read_volatile()

on the other hand, looks neater. The _volatile() suffix makes it also
no less explicit than an explicit barrier-like macro that this primitive
is something special, for code clarity purposes.


  #define forget(a)   __asm__ __volatile__ ( :=m (a) :m (a))
 
 I like order(x) better, but it's not the most perfect name either.

forget(x) is just a stupid-placeholder-for-a-better-name. order(x) sounds
good but we could leave quibbling about function or macro names for later,
this thread is noisy as it is :-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Stefan Richter wrote:
 [...]
 Just use spinlocks if you're not absolutely clear about potential
 races and memory ordering issues -- they're pretty cheap and simple.

I fully agree with this. As Paul Mackerras mentioned elsewhere,
a lot of authors sprinkle atomic_t in code thinking they're somehow
done with *locking*. This is sad, and I wonder if it's time for a
Documentation/atomic-considered-dodgy.txt kind of document :-)


  Sure, now
  that I learned of these properties I can start to audit code and insert
  barriers where I believe they are needed, but this simply means that
  almost all occurrences of atomic_read will get barriers (unless there
  already are implicit but more or less obvious barriers like msleep).
 
 You might find that these places that appear to need barriers are
 buggy for other reasons anyway. Can you point to some in-tree code
 we can have a look at?

Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
that managed to escape the requirement for a barrier only because of
some completely un-obvious compilation-unit-scope thing. But I find such
an non-explicit barrier quite bad taste. Stefan, do consider plunking an
explicit call to barrier() there.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Satyam Sharma wrote:
  
  On Fri, 17 Aug 2007, Nick Piggin wrote:
 
Sure, now
that I learned of these properties I can start to audit code and insert
barriers where I believe they are needed, but this simply means that
almost all occurrences of atomic_read will get barriers (unless there
already are implicit but more or less obvious barriers like msleep).
   
   You might find that these places that appear to need barriers are
   buggy for other reasons anyway. Can you point to some in-tree code
   we can have a look at?
  
  
  Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
  that managed to escape the requirement for a barrier only because of
  some completely un-obvious compilation-unit-scope thing. But I find such
  an non-explicit barrier quite bad taste. Stefan, do consider plunking an
  explicit call to barrier() there.
 
 It is very obvious. msleep calls schedule() (ie. sleeps), which is
 always a barrier.

Probably you didn't mean that, but no, schedule() is not barrier because
it sleeps. It's a barrier because it's invisible.

 The unobvious thing is that you wanted to know how the compiler knows
 a function is a barrier -- answer is that if it does not *know* it is not
 a barrier, it must assume it is a barrier.

True, that's clearly what happens here. But are you're definitely joking
that this is obvious in terms of code-clarity, right?

Just 5 minutes back you mentioned elsewhere you like seeing lots of
explicit calls to barrier() (with comments, no less, hmm? :-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Andi Kleen wrote:

 On Friday 17 August 2007 05:42, Linus Torvalds wrote:
  On Fri, 17 Aug 2007, Paul Mackerras wrote:
   I'm really surprised it's as much as a few K.  I tried it on powerpc
   and it only saved 40 bytes (10 instructions) for a G5 config.
 
  One of the things that volatile generally screws up is a simple
 
  volatile int i;
 
  i++;
 
 But for atomic_t people use atomic_inc() anyways which does this correctly.
 It shouldn't really matter for atomic_t.
 
 I'm worrying a bit that the volatile atomic_t change caused subtle code 
 breakage like these delay read loops people here pointed out.

Umm, I followed most of the thread, but which breakage is this?

 Wouldn't it be safer to just re-add the volatile to atomic_read() 
 for 2.6.23? Or alternatively make it asm(), but volatile seems more
 proven.

The problem with volatile is not just trashy code generation (which also
definitely is a major problem), but definition holes, and implementation
inconsistencies. Making it asm() is not the only other alternative to
volatile either (read another reply to this mail), but considering most
of the thread has been about people not wanting even a
atomic_read_volatile() variant, making atomic_read() itself have volatile
semantics sounds ... strange :-)


PS: http://lkml.org/lkml/2007/8/15/407 was submitted a couple days back,
any word if you saw that?

I have another one for you:


[PATCH] i386, x86_64: __const_udelay() should not be marked inline

Because it can never get inlined in any callsite (each translation unit
is compiled separately for the kernel and so the implementation of
__const_udelay() would be invisible to all other callsites). In fact it
turns out, the correctness of callsites at arch/x86_64/kernel/crash.c:97
and arch/i386/kernel/crash.c:101 explicitly _depends_ upon it not being
inlined, and also it's an exported symbol (modules may want to call
mdelay() and udelay() that often becomes __const_udelay() after some
macro-ing in various headers). So let's not mark it as inline either.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 arch/i386/lib/delay.c   |2 +-
 arch/x86_64/lib/delay.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/i386/lib/delay.c b/arch/i386/lib/delay.c
index f6edb11..0082c99 100644
--- a/arch/i386/lib/delay.c
+++ b/arch/i386/lib/delay.c
@@ -74,7 +74,7 @@ void __delay(unsigned long loops)
delay_fn(loops);
 }
 
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
 {
int d0;
 
diff --git a/arch/x86_64/lib/delay.c b/arch/x86_64/lib/delay.c
index 2dbebd3..d0cd9cd 100644
--- a/arch/x86_64/lib/delay.c
+++ b/arch/x86_64/lib/delay.c
@@ -38,7 +38,7 @@ void __delay(unsigned long loops)
 }
 EXPORT_SYMBOL(__delay);
 
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
 {
__delay(((xloops * HZ * 
cpu_data[raw_smp_processor_id()].loops_per_jiffy)  32) + 1);
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Satyam Sharma wrote:
  On Fri, 17 Aug 2007, Nick Piggin wrote:
   Satyam Sharma wrote:
   
   It is very obvious. msleep calls schedule() (ie. sleeps), which is
   always a barrier.
  
  Probably you didn't mean that, but no, schedule() is not barrier because
  it sleeps. It's a barrier because it's invisible.
 
 Where did I say it is a barrier because it sleeps?

Just below. What you wrote:

 It is always a barrier because, at the lowest level, schedule() (and thus
 anything that sleeps) is defined to always be a barrier.

It is always a barrier because, at the lowest level, anything that sleeps
is defined to always be a barrier.


 Regardless of
 whatever obscure means the compiler might need to infer the barrier.
 
 In other words, you can ignore those obscure details because schedule() is
 always going to have an explicit barrier in it.

I didn't quite understand what you said here, so I'll tell what I think:

* foo() is a compiler barrier if the definition of foo() is invisible to
  the compiler at a callsite.

* foo() is also a compiler barrier if the definition of foo() includes
  a barrier, and it is inlined at the callsite.

If the above is wrong, or if there's something else at play as well,
do let me know.

   The unobvious thing is that you wanted to know how the compiler knows
   a function is a barrier -- answer is that if it does not *know* it is not
   a barrier, it must assume it is a barrier.
  
  True, that's clearly what happens here. But are you're definitely joking
  that this is obvious in terms of code-clarity, right?
 
 No. If you accept that barrier() is implemented correctly, and you know
 that sleeping is defined to be a barrier,

Curiously, that's the second time you've said sleeping is defined to
be a (compiler) barrier. How does the compiler even know if foo() is
a function that sleeps? Do compilers have some notion of sleeping
to ensure they automatically assume a compiler barrier whenever such
a function is called? Or are you saying that the compiler can see the
barrier() inside said function ... nopes, you're saying quite the
opposite below.


 then its perfectly clear. You
 don't have to know how the compiler knows that some function contains
 a barrier.

I think I do, why not? Would appreciate if you could elaborate on this.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Satyam Sharma wrote:
 [...]
  You think both these are equivalent in terms of looks:
  
  |
  while (!atomic_read(v)) {  |   while (!atomic_read_xxx(v)) {
  ... |   ...
  cpu_relax_no_barrier(); |
  cpu_relax_no_barrier();
  order_atomic(v);   |   }
  }   |
  
  (where order_atomic() is an atomic_t
  specific wrapper as you mentioned below)
  
  ?
 
 I think the LHS is better if your atomic_read_xxx primitive is using the
 crazy one-sided barrier,
  ^

I'd say it's purposefully one-sided.

 because the LHS code you immediately know what
 barriers are happening, and with the RHS you have to look at the
 atomic_read_xxx
 definition.

No. As I said, the _xxx (whatever the heck you want to name it as) should
give the same heads-up that your order_atomic thing is supposed to give.


 If your atomic_read_xxx implementation was more intuitive, then both are
 pretty well equal. More lines != ugly code.
 
  [...]
  What bugs?
 
 You can't think for yourself? Your atomic_read_volatile contains a compiler
 barrier to the atomic variable before the load. 2 such reads from different
 locations look like this:
 
 asm volatile( : +m (v1));
 atomic_read(v1);
 asm volatile( : +m (v2));
 atomic_read(v2);
 
 Which implies that the load of v1 can be reordered to occur after the load
 of v2.

And how would that be a bug? (sorry, I really can't think for myself)


   Secondly, what sort of code would do such a thing?
  
  See the nodemgr_host_thread() that does something similar, though not
  exactly same.
 
 I'm sorry, all this waffling about made up code which might do this and
 that is just a waste of time.

First, you could try looking at the code.

And by the way, as I've already said (why do *require* people to have to
repeat things to you?) this isn't even about only existing code.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Satyam Sharma wrote:
 
  On Fri, 17 Aug 2007, Nick Piggin wrote:
  
   Because they should be thinking about them in terms of barriers, over
   which the compiler / CPU is not to reorder accesses or cache memory
   operations, rather than special volatile accesses.
  
  This is obviously just a taste thing. Whether to have that forget(x)
  barrier as something author should explicitly sprinkle appropriately
  in appropriate places in the code by himself or use a primitive that
  includes it itself.
 
 That's not obviously just taste to me. Not when the primitive has many
 (perhaps, the majority) of uses that do not require said barriers. And
 this is not solely about the code generation (which, as Paul says, is
 relatively minor even on x86).

See, you do *require* people to have to repeat the same things to you!

As has been written about enough times already, and if you followed the
discussion on this thread, I am *not* proposing that atomic_read()'s
semantics be changed to have any extra barriers. What is proposed is a
different atomic_read_xxx() variant thereof, that those can use who do
want that.

Now whether to have a kind of barrier (volatile, whatever) in the
atomic_read_xxx() itself, or whether to make the code writer himself to
explicitly write the order(x) appropriately in appropriate places in the
code _is_ a matter of taste.


  That's definitely the point, why not. This is why barrier(), being
  heavy-handed, is not the best option.
 
 That is _not_ the point [...]

Again, you're requiring me to repeat things that were already made evident
on this thread (if you follow it).

This _is_ the point, because a lot of loops out there (too many of them,
I WILL NOT bother citing file_name:line_number) end up having to use a
barrier just because they're using a loop-exit-condition that depends
on a value returned by atomic_read(). It would be good for them if they
used an atomic_read_xxx() primitive that gave these volatility semantics
without junking compiler optimizations for other memory references.

 because there has already been an alternative posted

Whether that alternative (explicitly using forget(x), or wrappers thereof,
such as the order_atomic you proposed) is better than other alternatives
(such as atomic_read_xxx() which includes the volatility behaviour in
itself) is still open, and precisely what we started discussing just one
mail back.

(The above was also mostly stuff I had to repeated for you, sadly.)

 that better conforms with Linux barrier
 API and is much more widely useful and more usable.

I don't think so.

(Now *this* _is_ the taste-dependent matter that I mentioned earlier.)

 If you are so worried
 about
 barrier() being too heavyweight, then you're off to a poor start by wanting to
 add a few K of kernel text by making atomic_read volatile.

Repeating myself, for the N'th time, NO, I DON'T want to make atomic_read
have volatile semantics.

   because as I also mentioned, the logical extention
   to Linux's barrier API to handle this is the order(x) macro. Again, not
   special volatile accessors.
  
  Sure, that forget(x) macro _is_ proposed to be made part of the generic
  API. Doesn't explain why not to define/use primitives that has volatility
 
  semantics in itself, though (taste matters apart).
^
 If you follow the discussion You were thinking of a reason why the
 semantics *should* be changed or added, and I was rebutting your argument
 that it must be used when a full barrier() is too heavy (ie. by pointing
 out that order() has superior semantics anyway).

Amazing. Either you have reading comprehension problems, or else, please
try reading this thread (or at least this sub-thread) again. I don't want
_you_ blaming _me_ for having to repeat things to you all over again.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Segher Boessenkool wrote:

   atomic_dec() already has volatile behavior everywhere, so this is
   semantically
   okay, but this code (and any like it) should be calling cpu_relax() each
   iteration through the loop, unless there's a compelling reason not to.
   I'll
   allow that for some hardware drivers (possibly this one) such a compelling
   reason may exist, but hardware-independent core subsystems probably have
   no
   excuse.
  
  No it does not have any volatile semantics. atomic_dec() can be reordered
  at will by the compiler within the current basic unit if you do not add a
  barrier.
 
 volatile has nothing to do with reordering.

If you're talking of volatile the type-qualifier keyword, then
http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
otherwise.

 atomic_dec() writes
 to memory, so it _does_ have volatile semantics, implicitly, as
 long as the compiler cannot optimise the atomic variable away
 completely -- any store counts as a side effect.

I don't think an atomic_dec() implemented as an inline asm volatile
or one that uses a forget macro would have the same re-ordering
guarantees as an atomic_dec() that uses a volatile access cast.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Paul E. McKenney wrote:

 On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
  
  On Thu, 16 Aug 2007, Paul E. McKenney wrote:
  
   On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:

First of all, I think this illustrates that what you want
here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
macro occurs a lot more times in your patch than atomic
reads/sets.  So *assuming* that it was necessary at all,
then having an ordered variant of the atomic_read/atomic_set
ops could do just as well.
   
   Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
   to maintain ordering, then I could just use them instead of having to
   create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
   different patch.)
  
  +#define WHATEVER(x)(*(volatile typeof(x) *)(x))
  [...]
  Also, this gives *zero* re-ordering guarantees that your code wants
  as you've explained it below) -- neither w.r.t. CPU re-ordering (which
  probably you don't care about) *nor* w.r.t. compiler re-ordering
  (which you definitely _do_ care about).
 
 You are correct about CPU re-ordering (and about the fact that this
 example doesn't care about it), but not about compiler re-ordering.
 
 The compiler is prohibited from moving a volatile access across a sequence
 point.  One example of a sequence point is a statement boundary.  Because
 all of the volatile accesses in this code are separated by statement
 boundaries, a conforming compiler is prohibited from reordering them.

Yes, you're right, and I believe precisely this was discussed elsewhere
as well today.

But I'd call attention to what Herbert mentioned there. You're using
ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:

* Member completed of struct rcu_ctrlblk is a long.
* Per-cpu variable rcu_flipctr is an array of ints.
* Members rcu_read_lock_nesting and rcu_flipctr_idx of
  struct task_struct are ints.

So are you saying you're having to use this volatile-access macro
because you *couldn't* declare all the above as atomic_t and thus just
expect the right thing to happen by using the atomic ops API by default,
because it lacks volatile access semantics (on x86)?

If so, then I wonder if using the volatile access cast is really the
best way to achieve (at least in terms of code clarity) the kind of
re-ordering guarantees it wants there. (there could be alternative
solutions, such as using barrier(), or that at bottom of this mail)

What I mean is this: If you switch to atomic_t, and x86 switched to
make atomic_t have volatile semantics by default, the statements
would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
and atomic_read() statements, and nothing in there that clearly makes
it *explicit* that the code is correct (and not buggy) simply because
of the re-ordering guarantees that the C volatile type-qualifier
keyword gives us as per the standard. But now we're firmly in
subjective territory, so you or anybody could legitimately disagree.


   Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
   atomic_read() and atomic_set().  Starting with __rcu_read_lock():
   
   o If ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++
 was ordered by the compiler after
 ORDERED_WRT_IRQ(me-rcu_read_lock_nesting) = nesting + 1, then
 suppose an NMI/SMI happened after the rcu_read_lock_nesting but
 before the rcu_flipctr.
   
 Then if there was an rcu_read_lock() in the SMI/NMI
 handler (which is perfectly legal), the nested rcu_read_lock()
 would believe that it could take the then-clause of the
 enclosing if statement.  But because the rcu_flipctr per-CPU
 variable had not yet been incremented, an RCU updater would
 be within its rights to assume that there were no RCU reads
 in progress, thus possibly yanking a data structure out from
 under the reader in the SMI/NMI function.
   
 Fatal outcome.  Note that only one CPU is involved here
 because these are all either per-CPU or per-task variables.
  
  Ok, so you don't care about CPU re-ordering. Still, I should let you know
  that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
  want is a full compiler optimization barrier().
 
 No.  See above.

True, *(volatile foo *) _will_ work for this case.

But multiple calls to barrier() (granted, would invalidate all other
optimizations also) would work as well, would it not?

[ Interestingly, if you declared all those objects mentioned earlier as
  atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
  for atomic_{read,set}_volatile(), the bugs you want to avoid would still
  be there. volatile the C language type-qualifier does have compiler
  re-ordering semantics you mentioned earlier, but the volatile that
  applies to inline asm()s gives no re-ordering guarantees. ]


   o If ORDERED_WRT_IRQ(me-rcu_read_lock_nesting) = nesting

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

  #define forget(a)   __asm__ __volatile__ ( :=m (a) :m (a))
  
  [ This is exactly equivalent to using +m in the constraints, as recently
explained on a GCC list somewhere, in response to the patch in my bitops
series a few weeks back where I thought +m was bogus. ]
 
 [It wasn't explained on a GCC list in response to your patch, as
 far as I can see -- if I missed it, please point me to an archived
 version of it].

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01758.html

is a follow-up in the thread on the [EMAIL PROTECTED] mailing list,
which began with:

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01677.html

that was posted by Jan Kubicka, as he quotes in that initial posting,
after I had submitted:

http://lkml.org/lkml/2007/7/23/252

which was a (wrong) patch to rectify what I thought was the bogus
+m constraint, as per the quoted extract from gcc docs (that was
given in that (wrong) patch's changelog).

That's when _I_ came to know how GCC interprets +m, but probably
this has been explained on those lists multiple times. Who cares,
anyway?


 One last time: it isn't equivalent on older (but still supported
 by Linux) versions of GCC.  Current versions of GCC allow it, but
 it has no documented behaviour at all, so use it at your own risk.

True.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

No it does not have any volatile semantics. atomic_dec() can be
reordered
at will by the compiler within the current basic unit if you do not add
a
barrier.
   
   volatile has nothing to do with reordering.
  
  If you're talking of volatile the type-qualifier keyword, then
  http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
  otherwise.
 
 I'm not sure what in that mail you mean, but anyway...
 
 Yes, of course, the fact that volatile creates a side effect
 prevents certain things from being reordered wrt the atomic_dec();
 but the atomic_dec() has a side effect *already* so the volatile
 doesn't change anything.

That's precisely what that sub-thread (read down to the last mail
there, and not the first mail only) shows. So yes, volatile does
have something to do with re-ordering (as guaranteed by the C
standard).


   atomic_dec() writes
   to memory, so it _does_ have volatile semantics, implicitly, as
   long as the compiler cannot optimise the atomic variable away
   completely -- any store counts as a side effect.
  
  I don't think an atomic_dec() implemented as an inline asm volatile
  or one that uses a forget macro would have the same re-ordering
  guarantees as an atomic_dec() that uses a volatile access cast.
 
 The asm volatile implementation does have exactly the same
 reordering guarantees as the volatile cast thing,

I don't think so.

 if that is
 implemented by GCC in the obvious way.  Even a plain asm()
 will do the same.

Read the relevant GCC documentation.

[ of course, if the (latest) GCC documentation is *yet again*
  wrong, then alright, not much I can do about it, is there. ]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Christoph Lameter wrote:

 On Fri, 17 Aug 2007, Paul E. McKenney wrote:
 
  On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
   On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
   
gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
   
   I had totally forgotten that I'd already filed that bug more
   than six years ago until they just closed yours as a duplicate
   of mine :)
   
   Good luck in getting it fixed!
  
  Well, just got done re-opening it for the third time.  And a local
  gcc community member advised me not to give up too easily.  But I
  must admit that I am impressed with the speed that it was identified
  as duplicate.
  
  Should be entertaining!  ;-)
 
 Right. ROTFL... volatile actually breaks atomic_t instead of making it 
 safe. x++ becomes a register load, increment and a register store. Without 
 volatile we can increment the memory directly.

No code does (or would do, or should do):

x.counter++;

on an atomic_t x; anyway.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

 atomic_dec() writes
 to memory, so it _does_ have volatile semantics, implicitly, as
 long as the compiler cannot optimise the atomic variable away
 completely -- any store counts as a side effect.

I don't think an atomic_dec() implemented as an inline asm volatile
or one that uses a forget macro would have the same re-ordering
guarantees as an atomic_dec() that uses a volatile access cast.
   
   The asm volatile implementation does have exactly the same
   reordering guarantees as the volatile cast thing,
  
  I don't think so.
 
 asm volatile creates a side effect.

Yeah.

 Side effects aren't
 allowed to be reordered wrt sequence points.

Yeah.

 This is exactly
 the same reason as why volatile accesses cannot be reordered.

No, the code in that sub-thread I earlier pointed you at *WAS* written
such that there was a sequence point after all the uses of that volatile
access cast macro, and _therefore_ we were safe from re-ordering
(behaviour guaranteed by the C standard).

But you seem to be missing the simple and basic fact that:

(something_that_has_side_effects || statement)
!= something_that_is_a_sequence_point

Now, one cannot fantasize that volatile asms are also sequence points.
In fact such an argument would be sadly mistaken, because sequence
points are defined by the C standard and it'd be horribly wrong to
even _try_ claiming that the C standard knows about volatile asms.


   if that is
   implemented by GCC in the obvious way.  Even a plain asm()
   will do the same.
  
  Read the relevant GCC documentation.
 
 I did, yes.

No, you didn't read:

http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Read the bit about the need for artificial dependencies, and the example
given there:

asm volatile(mtfsf 255,%0 : : f (fpenv));
sum = x + y;

The docs explicitly say the addition can be moved before the volatile
asm. Hopefully, as you know, (x + y) is an C expression and hence
a sequence point as defined by the standard. So the volatile asm
should've happened before it, right? Wrong.

I know there is also stuff written about side-effects there which
_could_ give the same semantic w.r.t. sequence points as the volatile
access casts, but hey, it's GCC's own documentation, you obviously can't
find fault with _me_ if there's wrong stuff written in there. Say that
to GCC ...

See, volatile C keyword, for all it's ill-definition and dodgy
semantics, is still at least given somewhat of a treatment in the C
standard (whose quality is ... ummm, sadly not always good and clear,
but unsurprisingly, still about 5,482 orders-of-magnitude times
better than GCC docs). Semantics of volatile as applies to inline
asm, OTOH? You're completely relying on the compiler for that ...


  [ of course, if the (latest) GCC documentation is *yet again*
wrong, then alright, not much I can do about it, is there. ]
 
 There was (and is) nothing wrong about the +m documentation, if
 that is what you are talking about.  It could be extended now, to
 allow +m -- but that takes more than just fixing the documentation.

No, there was (and is) _everything_ wrong about the + documentation as
applies to memory-constrained operands. I don't give a whit if it's
some workaround in their gimplifier, or the other, that makes it possible
to use +m (like the current kernel code does). The docs suggest
otherwise, so there's obviously a clear disconnect between the docs and
actual GCC behaviour.


[ You seem to often take issue with _amazingly_ petty and pedantic things,
  by the way :-) ]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Fri, 17 Aug 2007, Nick Piggin wrote:

 Satyam Sharma wrote:
 
  I didn't quite understand what you said here, so I'll tell what I think:
  
  * foo() is a compiler barrier if the definition of foo() is invisible to
   the compiler at a callsite.
  
  * foo() is also a compiler barrier if the definition of foo() includes
   a barrier, and it is inlined at the callsite.
  
  If the above is wrong, or if there's something else at play as well,
  do let me know.
 
 [...]
 If a function is not completely visible to the compiler (so it can't
 determine whether a barrier could be in it or not), then it must always
 assume it will contain a barrier so it always does the right thing.

Yup, that's what I'd said just a few sentences above, as you can see. I
was actually asking for elaboration on how a compiler determines that
function foo() (say foo == schedule), even when it cannot see that it has
a barrier(), as you'd mentioned, is a 'sleeping' function actually, and
whether compilers have a notion of sleep to automatically assume a
compiler barrier whenever such a sleeping function foo() is called. But
I think you've already qualified the discussion to this kernel, so okay,
I shouldn't nit-pick anymore.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Satyam Sharma



On Sat, 18 Aug 2007, Segher Boessenkool wrote:

 The asm volatile implementation does have exactly the same
 reordering guarantees as the volatile cast thing,

I don't think so.
   
   asm volatile creates a side effect.
  
  Yeah.
  
   Side effects aren't
   allowed to be reordered wrt sequence points.
  
  Yeah.
  
   This is exactly
   the same reason as why volatile accesses cannot be reordered.
  
  No, the code in that sub-thread I earlier pointed you at *WAS* written
  such that there was a sequence point after all the uses of that volatile
  access cast macro, and _therefore_ we were safe from re-ordering
  (behaviour guaranteed by the C standard).
 
 And exactly the same is true for the asm version.
 
  Now, one cannot fantasize that volatile asms are also sequence points.
 
 Sure you can do that.  I don't though.
 
  In fact such an argument would be sadly mistaken, because sequence
  points are defined by the C standard and it'd be horribly wrong to
  even _try_ claiming that the C standard knows about volatile asms.
 
 That's nonsense.  GCC can extend the C standard any way they
 bloody well please -- witness the fact that they added an
 extra class of side effects...
 
Read the relevant GCC documentation.
   
   I did, yes.
  
  No, you didn't read:
  
  http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
  
  Read the bit about the need for artificial dependencies, and the example
  given there:
  
  asm volatile(mtfsf 255,%0 : : f (fpenv));
  sum = x + y;
  
  The docs explicitly say the addition can be moved before the volatile
  asm. Hopefully, as you know, (x + y) is an C expression and hence
  a sequence point as defined by the standard.
 
 The _end of a full expression_ is a sequence point, not every
 expression.  And that is irrelevant here anyway.
 
 It is perfectly fine to compute x+y any time before the
 assignment; the C compiler is allowed to compute it _after_
 the assignment even, if it could figure out how ;-)
 
 x+y does not contain a side effect, you know.
 
  I know there is also stuff written about side-effects there which
  _could_ give the same semantic w.r.t. sequence points as the volatile
  access casts,
 
 s/could/does/
 
  but hey, it's GCC's own documentation, you obviously can't
  find fault with _me_ if there's wrong stuff written in there. Say that
  to GCC ...
 
 There's nothing wrong there.
 
  See, volatile C keyword, for all it's ill-definition and dodgy
  semantics, is still at least given somewhat of a treatment in the C
  standard (whose quality is ... ummm, sadly not always good and clear,
  but unsurprisingly, still about 5,482 orders-of-magnitude times
  better than GCC docs).
 
 If you find any problems/shortcomings in the GCC documentation,
 please file a PR, don't go whine on some unrelated mailing lists.
 Thank you.
 
  Semantics of volatile as applies to inline
  asm, OTOH? You're completely relying on the compiler for that ...
 
 Yes, and?  GCC promises the behaviour it has documented.

LOTS there, which obviously isn't correct, but which I'll reply to later,
easier stuff first. (call this handwaving if you want, but don't worry,
I /will/ bother myself to reply)


[ of course, if the (latest) GCC documentation is *yet again*
  wrong, then alright, not much I can do about it, is there. ]
   
   There was (and is) nothing wrong about the +m documentation, if
   that is what you are talking about.  It could be extended now, to
   allow +m -- but that takes more than just fixing the documentation.
  
  No, there was (and is) _everything_ wrong about the + documentation as
  applies to memory-constrained operands. I don't give a whit if it's
  some workaround in their gimplifier, or the other, that makes it possible
  to use +m (like the current kernel code does). The docs suggest
  otherwise, so there's obviously a clear disconnect between the docs and
  actual GCC behaviour.
 
 The documentation simply doesn't say +m is allowed.  The code to
 allow it was added for the benefit of people who do not read the
 documentation.  Documentation for +m might get added later if it
 is decided this [the code, not the documentation] is a sane thing
 to have (which isn't directly obvious).

Huh?

If the (current) documentation doesn't match up with the (current)
code, then _at least one_ of them has to be (as of current) wrong.

I wonder how could you even try to disagree with that.

And I didn't go whining about this ... you asked me. (I think I'd said
something to the effect of GCC docs are often wrong, which is true,
but probably you feel saying that is not allowed on non-gcc lists?)

As for the PR you're requesting me to file with GCC for this, that
gcc-patches@ thread did precisely that and more (submitted a patch to
said documentation -- and no, saying documentation might get added
later is totally bogus and nonsensical -- documentation exists to
document current behaviour, not past). But come on, this is wholly

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Satyam Sharma

[ Bill tells me in private communication he gets this already, but I
  think it's more complicated than the shoddy explanation I'd made
  earlier so would wish to make this clearer in detail one last time,
  for the benefit of others listening in or reading the archives. ]


On Thu, 16 Aug 2007, Satyam Sharma wrote:

 On Thu, 16 Aug 2007, Satyam Sharma wrote:
 [...]
  On Wed, 15 Aug 2007, Bill Fink wrote:
   [...]
   I'm curious about one minor tangential point.  Why, instead of:
   
 b = *(volatile int *)a;
   
   why can't this just be expressed as:
   
 b = (volatile int)a;
   
   Isn't it the contents of a that's volatile, i.e. it's value can change
   invisibly to the compiler, and that's why you want to force a read from
   memory?  Why do you need the *(volatile int *) construct?
  
  b = (volatile int)a; doesn't help us because a cast to a qualified type
  has the same effect as a cast to an unqualified version of that type, as
  mentioned in 6.5.4:4 (footnote 86) of the standard. Note that volatile
  is a type-qualifier, not a type itself, so a cast of the _object_ itself
  to a qualified-type i.e. (volatile int) would not make the access itself
  volatile-qualified.

Casts don't produce lvalues, and the cast ((volatile int)a) does not
produce the object-int-a-qualified-as-volatile -- in fact, the
result of the above cast is whatever is the _value_ of int a, with
the access to that object having _already_ taken place, as per the
actual type-qualification of the object (that was originally declared
as being _non-volatile_, in fact). Hence, defining atomic_read() as:

#define atomic_read(v)  ((volatile int)((v)-counter))

would be buggy and not give volatility semantics at all, unless the
counter object itself isn't volatile-qualified already (which it
isn't).

The result of the cast itself being the _value_ of the int object, and
not the object itself (i.e., not an lvalue), is thereby independent of
type-qualification in that cast itself (it just wouldn't make any
difference), hence the cast to a qualified type has the same effect
as a cast to an unqualified version of that type bit in section 6.5.4:4
of the standard.


  To serve our purposes, it is necessary for us to take the address of this
  (non-volatile) object, cast the resulting _pointer_ to the corresponding
  volatile-qualified pointer-type, and then dereference it. This makes that
  particular _access_ be volatile-qualified, without the object itself being
  such. Also note that the (dereferenced) result is also a valid lvalue and
  hence can be used in *(volatile int *)a = b; kind of construction
  (which we use for the atomic_set case).

Dereferencing using the *(pointer-type-cast) construct, OTOH, serves
us well:

#define atomic_read(v)  (*(volatile int *)(v)-counter)

Firstly, note that the cast here being (volatile int *) and not
(int * volatile) qualifies the type of the _object_ being pointed to
by the pointer in question as being volatile-qualified, and not the
pointer itself (6.2.5:27 of the standard, and 6.3.2.3:2 allows us to
convert from a pointer-to-non-volatile-qualified-int to a pointer-to-
volatile-qualified-int, which suits us just fine) -- but note that
the _access_ to that address itself has not yet occurred.

_After_ specifying the memory address as containing a volatile-qualified-
int-type object, (and GCC co-operates as mentioned below), we proceed to
dereference it, which is when the _actual access_ occurs, therefore with
volatility semantics this time.

Interesting.


 Here, I should obviously admit that the semantics of *(volatile int *)
 aren't any neater or well-defined in the _language standard_ at all. The
 standard does say (verbatim) precisely what constitutes as access to
 object of volatile-qualified type is implementation-defined, but GCC
 does help us out here by doing the right thing. Accessing the non-volatile
 object there using the volatile-qualified pointer-type cast makes GCC
 treat the object stored at that memory address itself as if it were a 
 volatile object, thus making the access end up having what we're calling
 as volatility semantics here.
 
 Honestly, given such confusion, and the propensity of the volatile
 type-qualifier keyword to be ill-defined (or at least poorly understood,
 often inconsistently implemented), I'd (again) express my opinion that it
 would be best to avoid its usage, given other alternatives do exist.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: drivers/infiniband/mlx/mad.c misplaced ;

2007-08-16 Thread Satyam Sharma

Hi Ilpo,


On Thu, 16 Aug 2007, Ilpo Järvinen wrote:

 
 ...I guess those guys hunting for broken busyloops in the other thread 
 could also benefit from similar searching commands introduced in this 
 thread... ...Ccing Satyam to caught their attention too.
 
 
  On Wed, Aug 15, 2007 at 05:40:11PM -0700, Joe Perches wrote:
   
   There's more than a few of these (not inspected).
   
   $ egrep -r --include=*.c \bif[[:space:]]*\([^\)]*\)[[:space:]]*\; * 
 
 ...Hmm, I plugged in a preprocessor too to manage with non compliant 
 coding styles :-). Please understand that the line numbers are not an 
 exact match due to preprocessor changes:
 
 $ for i in `find . -name '*.[ch]'`; do echo $i; indent -npro -kr -i8 -ts8 
 -sob -l8000 -ss -ncs -cp1 -nhnl -st $i | egrep -n [[:space:]]if [(].*[)] ;$;
 done | grep -B1 ^[^.]

Thanks, looks useful, will check with this.


Satyam

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Satyam Sharma



On Thu, 16 Aug 2007, Segher Boessenkool wrote:

  Note that volatile
  is a type-qualifier, not a type itself, so a cast of the _object_ itself
  to a qualified-type i.e. (volatile int) would not make the access itself
  volatile-qualified.
 
 There is no such thing as volatile-qualified access defined
 anywhere; there only is the concept of a volatile-qualified
 *object*.

Sure, volatile-qualified access was not some standard term I used
there. Just something to mean an access that would make the compiler
treat the object at that memory as if it were an object with a
volatile-qualified type.

Now the second wording *IS* technically correct, but come on, it's
24 words long whereas the original one was 3 -- and hopefully anybody
reading the shorter phrase *would* have known anyway what was meant,
without having to be pedantic about it :-)


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Satyam Sharma



On Thu, 16 Aug 2007, Segher Boessenkool wrote:

  Here, I should obviously admit that the semantics of *(volatile int *)
  aren't any neater or well-defined in the _language standard_ at all. The
  standard does say (verbatim) precisely what constitutes as access to
  object of volatile-qualified type is implementation-defined, but GCC
  does help us out here by doing the right thing.
 
 Where do you get that idea?

Try a testcase (experimentally verify).

 GCC manual, section 6.1, When
 is a Volatile Object Accessed? doesn't say anything of the
 kind.

True, implementation-defined as per the C standard _is_ supposed to mean
unspecified behaviour where each implementation documents how the choice
is made. So ok, probably GCC isn't documenting this
implementation-defined behaviour which it is supposed to, but can't really
fault them much for this, probably.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Satyam Sharma

Hi Linus,

[ and others; I think there's a communication gap in a lot of this
  thread, and a little summary would be useful. Hence this posting. ]


On Thu, 16 Aug 2007, Linus Torvalds wrote:

 On Fri, 17 Aug 2007, Paul Mackerras wrote:
  
  I'm really surprised it's as much as a few K.  I tried it on powerpc
  and it only saved 40 bytes (10 instructions) for a G5 config.
 
 One of the things that volatile generally screws up is a simple
 
   volatile int i;
 
   i++;
 
 which a compiler will generally get horribly, horribly wrong.
 
 [...] For no good reason, except that volatile just doesn't have any 
 good/clear semantics for the compiler, so most compilers will just make it 
 be I will not touch this access in any way, shape, or form. Including 
 even trivially correct instruction optimization/combination.
 
 This is one of the reasons why we should never use volatile. It 
 pessimises code generation for no good reason - just because compilers 
 don't know what the heck it even means! 
 [...]
 In other words: volatile is a horribly horribly bad way of doing things, 
 because it generates *worse*code*, for no good reason. You just don't see 
 it on powerpc, because it's already a load-store architecture, so there is 
 no good code for doing direct-to-memory operations.


True, and I bet *everybody* on this list has already agreed for a very
long time that using volatile to type-qualify the _declaration_ of an
object itself as being horribly bad (taste-wise, code-generation-wise,
often even buggy for sitations where real CPU barriers should've been
used instead).

However, the discussion on this thread (IIRC) began with only giving
volatility semantics to atomic ops. Now that is different, and may not
require the use the volatile keyword (at least not in the declaration
of the object) itself.

Sadly, most arch's *still* do type-qualify the declaration of the
counter member of atomic_t as volatile. This is probably a historic
hangover, and I suspect not yet rectified because of lethargy.

Anyway, some of the variants I can think of are:

[1]

#define atomic_read_volatile(v) \
({  \
forget((v)-counter);   \
((v)-counter); \
})

where:

#define forget(a)   __asm__ __volatile__ ( :=m (a) :m (a))

[ This is exactly equivalent to using +m in the constraints, as recently
  explained on a GCC list somewhere, in response to the patch in my bitops
  series a few weeks back where I thought +m was bogus. ]

[2]

#define atomic_read_volatile(v) (*(volatile int *)(v)-counter)

This is something that does work. It has reasonably good semantics
guaranteed by the C standard in conjunction with how GCC currently
behaves (and how it has behaved for all supported versions). I haven't
checked if generates much different code than the first variant above,
(it probably will generate similar code to just declaring the object
as volatile, but would still be better in terms of code-clarity and
taste, IMHO), but in any case, we should pick whichever of these variants
works for us and generates good code.

[3]

static inline int atomic_read_volatile(atomic_t *v)
{
... arch-dependent __asm__ __volatile__ stuff ...
}

I can reasonably bet this variant would often generate worse code than
at least the variant [1] above.


Now, why do we even require these volatility semantics variants?

Note, volatility semantics *know* / assume that it can have a meaning
_only_ as far as the compiler, so atomic_read_volatile() doesn't really
care reading stale values from the cache for certain non-x86 archs, etc.

The first argument is safety:

Use of atomic_read() (possibly in conjunction with other atomic ops) in
a lot of code out there in the kernel *assumes* the compiler will not
optimize away those ops. (which is possible given current definitions
of atomic_set and atomic_read on archs such as x86 in present code).
An additional argument that builds on this one says that by ensuring
the compiler will not elid or coalesce these ops, we could even avoid
potential heisenbugs in the future.

However, there is a counter-argument:

As Herbert Xu has often been making the point, there is *no* bug out
there involving atomic_read in busy-while-loops that should not have
a compiler barrier (or cpu_relax() in fact) anyway. As for non-busy-loops,
they would invariable call schedule() at some point (possibly directly)
and thus have an implicit compiler barrier by virtue of calling out
a function that is not in scope of the current compilation unit (although
users in sched.c itself would probably require an explicit compiler
barrier).

The second pro-volatility-in-atomic-ops argument is performance:
(surprise!)

Using a full memory clobber compiler barrier in busy loops will disqualify
optimizations for loop invariants so it probably makes sense to *only*
make the compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Stefan Richter wrote:

 Satyam Sharma wrote:
  [ BTW, why do we want the compiler to not optimize atomic_read()'s in
the first place? Atomic ops guarantee atomicity, which has nothing
to do with volatility -- users that expect volatility from
atomic ops are the ones who must be fixed instead, IMHO. ]
 
 LDD3 says on page 125:  The following operations are defined for the
 type [atomic_t] and are guaranteed to be atomic with respect to all
 processors of an SMP computer.
 
 Doesn't atomic WRT all processors require volatility?

No, it definitely doesn't. Why should it?

Atomic w.r.t. all processors is just your normal, simple atomicity
for SMP systems (ensure that that object is modified / set / replaced
in main memory atomically) and has nothing to do with volatile
behaviour.

Volatile behaviour itself isn't consistently defined (at least
definitely not consistently implemented in various gcc versions across
platforms), but it is /expected/ to mean something like: ensure that
every such access actually goes all the way to memory, and is not
re-ordered w.r.t. to other accesses, as far as the compiler can take
care of these. The last as far as compiler can take care disclaimer
comes about due to CPUs doing their own re-ordering nowadays.

For example (say on i386):

(A)
$ cat tp1.c
int a;

void func(void)
{
a = 10;
a = 20;
}
$ gcc -Os -S tp1.c
$ cat tp1.s
...
movl$20, a
...

(B)
$ cat tp2.c
volatile int a;

void func(void)
{
a = 10;
a = 20;
}
$ gcc -Os -S tp2.c
$ cat tp2.s
...
movl$10, a
movl$20, a
...

(C)
$ cat tp3.c
int a;

void func(void)
{
*(volatile int *)a = 10;
*(volatile int *)a = 20;
}
$ gcc -Os -S tp3.c
$ cat tp3.s
...
movl$10, a
movl$20, a
...

In (A) the compiler optimized a = 10; away, but the actual store
of the final value 20 to a was still atomic. (B) and (C) also
exhibit volatile behaviour apart from the atomicity.

But as others replied, it seems some callers out there depend upon
atomic ops exhibiting volatile behaviour as well, so that answers
my initial question, actually. I haven't looked at the code Paul
pointed me at, but I wonder if that forget(x) macro would help
those cases. I'd wish to avoid the volatile primitive, personally.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Stefan Richter wrote:

 Satyam Sharma wrote:
  On Wed, 15 Aug 2007, Stefan Richter wrote:
  Doesn't atomic WRT all processors require volatility?
  
  No, it definitely doesn't. Why should it?
  
  Atomic w.r.t. all processors is just your normal, simple atomicity
  for SMP systems (ensure that that object is modified / set / replaced
  in main memory atomically) and has nothing to do with volatile
  behaviour.
  
  Volatile behaviour itself isn't consistently defined (at least
  definitely not consistently implemented in various gcc versions across
  platforms), but it is /expected/ to mean something like: ensure that
  every such access actually goes all the way to memory, and is not
  re-ordered w.r.t. to other accesses, as far as the compiler can take
  care of these. The last as far as compiler can take care disclaimer
  comes about due to CPUs doing their own re-ordering nowadays.
  
  For example (say on i386):
 
 [...]
 
  In (A) the compiler optimized a = 10; away, but the actual store
  of the final value 20 to a was still atomic. (B) and (C) also
  exhibit volatile behaviour apart from the atomicity.
  
  But as others replied, it seems some callers out there depend upon
  atomic ops exhibiting volatile behaviour as well, so that answers
  my initial question, actually. I haven't looked at the code Paul
  pointed me at, but I wonder if that forget(x) macro would help
  those cases. I'd wish to avoid the volatile primitive, personally.
 
 So, looking at load instead of store, understand I correctly that in
 your opinion
 
   int b;
 
   b = atomic_read(a);
   if (b)
   do_something_time_consuming();
 
   b = atomic_read(a);
   if (b)
   do_something_more();
 
 should be changed to explicitly forget(a) after
 do_something_time_consuming?

No, I'd actually prefer something like what Christoph Lameter suggested,
i.e. users (such as above) who want volatile-like behaviour from atomic
ops can use alternative functions. How about something like:

#define atomic_read_volatile(v) \
({  \
forget((v)-counter);  \
((v)-counter); \
})

Or possibly, implement these volatile atomic ops variants in inline asm
like the patch that Sebastian Siewior has submitted on another thread just
a while back.

Of course, if we find there are more callers in the kernel who want the
volatility behaviour than those who don't care, we can re-define the
existing ops to such variants, and re-name the existing definitions to
somethine else, say atomic_read_nonvolatile for all I care.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma

Hi Stefan,


On Wed, 15 Aug 2007, Stefan Richter wrote:

 On 8/15/2007 10:18 AM, Heiko Carstens wrote:
  On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
  Chris Snook [EMAIL PROTECTED] wrote:
   
   Because atomic operations are generally used for synchronization, which 
   requires 
   volatile behavior.  Most such codepaths currently use an inefficient 
   barrier(). 
Some forget to and we get bugs, because people assume that 
   atomic_read() 
   actually reads something, and atomic_write() actually writes something.  
   Worse, 
   these are architecture-specific, even compiler version-specific bugs 
   that are 
   often difficult to track down.
  
  I'm yet to see a single example from the current tree where
  this patch series is the correct solution.  So far the only
  example has been a buggy piece of code which has since been
  fixed with a cpu_relax.
  
  Btw.: we still have
  
  include/asm-i386/mach-es7000/mach_wakecpu.h:  while 
  (!atomic_read(deassert));
  include/asm-i386/mach-default/mach_wakecpu.h: while 
  (!atomic_read(deassert));
  
  Looks like they need to be fixed as well.
 
 
 I don't know if this here is affected:

Yes, I think it is. You're clearly expecting the read to actually happen
when you call get_hpsb_generation(). It's clearly not a busy-loop, so
cpu_relax() sounds pointless / wrong solution for this case, so I'm now
somewhat beginning to appreciate the motivation behind this series :-)

But as I said, there are ways to achieve the same goals of this series
without using volatile.

I think I'll submit a RFC/patch or two on this myself (will also fix
the code pieces listed here).


 /* drivers/ieee1394/ieee1394_core.h */
 static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
 {
   return atomic_read(host-generation);
 }
 
 /* drivers/ieee1394/nodemgr.c */
 static int nodemgr_host_thread(void *__hi)
 {
   [...]
 
   for (;;) {
   [... sleep until bus reset event ...]
 
   /* Pause for 1/4 second in 1/16 second intervals,
* to make sure things settle down. */
   g = get_hpsb_generation(host);
   for (i = 0; i  4 ; i++) {
   if (msleep_interruptible(63) ||
   kthread_should_stop())
   goto exit;

Totally unrelated, but this looks weird. IMHO you actually wanted:

msleep_interruptible(63);
if (kthread_should_stop())
goto exit;

here, didn't you? Otherwise the thread will exit even when
kthread_should_stop() != TRUE (just because it received a signal),
and it is not good for a kthread to exit on its own if it uses
kthread_should_stop() or if some other piece of kernel code could
eventually call kthread_stop(tsk) on it.

Ok, probably the thread will never receive a signal in the first
place because it's spawned off kthreadd which ignores all signals
beforehand, but still ...

[PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread

The nodemgr host thread can exit on its own even when kthread_should_stop
is not true, on receiving a signal (might never happen in practice, as
it ignores signals). But considering kthread_stop() must not be mixed with
kthreads that can exit on their own, I think changing the code like this
is clearer. This change means the thread can cut its sleep short when
receive a signal but looking at the code around, that sounds okay (and
again, it might never actually recieve a signal in practice).

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/ieee1394/nodemgr.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/ieee1394/nodemgr.c b/drivers/ieee1394/nodemgr.c
index 2ffd534..981a7da 100644
--- a/drivers/ieee1394/nodemgr.c
+++ b/drivers/ieee1394/nodemgr.c
@@ -1721,7 +1721,8 @@ static int nodemgr_host_thread(void *__hi)
 * to make sure things settle down. */
g = get_hpsb_generation(host);
for (i = 0; i  4 ; i++) {
-   if (msleep_interruptible(63) || kthread_should_stop())
+   msleep_interruptible(63);
+   if (kthread_should_stop())
goto exit;
 
/* Now get the generation in which the node ID's we 
collect
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
  Herbert Xu wrote:
   On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
   I don't know if this here is affected:
  
  [...something like]
  b = atomic_read(a);
  for (i = 0; i  4; i++) {
  msleep_interruptible(63);
  c = atomic_read(a);
  if (c != b) {
  b = c;
  i = 0;
  }
  }
  
   Nope, we're calling schedule which is a rather heavy-weight
   barrier.
  
  How does the compiler know that msleep() has got barrier()s?
 
 Because msleep_interruptible() is in a separate compilation unit,
 the compiler has to assume that it might modify any arbitrary global.
 In many cases, the compiler also has to assume that msleep_interruptible()
 might call back into a function in the current compilation unit, thus
 possibly modifying global static variables.

Yup, I've just verified this with a testcase. So a call to any function
outside of the current compilation unit acts as a compiler barrier. Cool.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
  On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
   
   Do we really need another set of APIs?  Can you give even one example
   where the pre-existing volatile semantics are causing enough of a problem
   to justify adding yet more atomic_*() APIs?
  
  Let's turn this around.  Can you give a single example where
  the volatile semantics is needed in a legitimate way?
 
 Sorry, but you are the one advocating for the change.

Not for i386 and x86_64 -- those have atomic ops without any volatile
semantics (currently as per existing definitions).
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
  [...]
  Not for i386 and x86_64 -- those have atomic ops without any volatile
  semantics (currently as per existing definitions).
 
 I claim unit volumes with arm, and the majority of the architectures, but
 I cannot deny the popularity of i386 and x86_64 with many developers.  ;-)

Hmm, does arm really need that volatile int counter;? Hopefully RMK will
take a patch removing that volatile ... ;-)


 However, I am not aware of code in the kernel that would benefit
 from the compiler coalescing multiple atomic_set() and atomic_read()
 invocations, thus I don't see the downside to volatility in this case.
 Are there some performance-critical code fragments that I am missing?

I don't know, and yes, code with multiple atomic_set's and atomic_read's
getting optimized or coalesced does sound strange to start with. Anyway,
I'm not against volatile semantics per se. As replied elsewhere, I do
appreciate the motivation behind this series (to _avoid_ gotchas, not to
fix existing ones). Just that I'd like to avoid using volatile, for
aforementioned reasons, especially given that there are perfectly
reasonable alternatives to achieve the same desired behaviour.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma

Hi Paul,


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
  [...]
  No, I'd actually prefer something like what Christoph Lameter suggested,
  i.e. users (such as above) who want volatile-like behaviour from atomic
  ops can use alternative functions. How about something like:
  
  #define atomic_read_volatile(v) \
  ({  \
  forget((v)-counter);  \
  ((v)-counter); \
  })
 
 Wouldn't the above forget the value, throw it away, then forget
 that it forgot it, giving non-volatile semantics?

Nope, I don't think so. I wrote the following trivial testcases:
[ See especially tp4.c and tp4.s (last example). ]

==
$ cat tp1.c # Using volatile access casts

#define atomic_read(a)  (*(volatile int *)a)

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp1.c; cat tp1.s

func:
pushl   %ebp
movl%esp, %ebp
movl$0, a
.L2:
movla, %eax
testl   %eax, %eax
jne .L2
popl%ebp
ret
==
$ cat tp2.c # Using nothing; gcc will optimize the whole loop away

#define forget(x)

#define atomic_read(a)  \
({  \
forget((a));   \
(a);\
})

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp2.c; cat tp2.s

func:
pushl   %ebp
movl%esp, %ebp
popl%ebp
movl$0, a
ret
==
$ cat tp3.c # Using a full memory clobber barrier

#define forget(x)   asm volatile (:::memory)

#define atomic_read(a)  \
({  \
forget((a));   \
(a);\
})

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp3.c; cat tp3.s

func:
pushl   %ebp
movl%esp, %ebp
movl$0, a
.L2:
cmpl$0, a
jne .L2
popl%ebp
ret
==
$ cat tp4.c # Using a forget(var) macro

#define forget(a)   __asm__ __volatile__ ( :=m (a) :m (a))

#define atomic_read(a)  \
({  \
forget(a);  \
(a);\
})

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp4.c; cat tp4.s

func:
pushl   %ebp
movl%esp, %ebp
movl$0, a
.L2:
cmpl$0, a
jne .L2
popl%ebp
ret
==


Possibly these were too trivial to expose any potential problems that you
may have been referring to, so would be helpful if you could write a more
concrete example / sample code.


  Or possibly, implement these volatile atomic ops variants in inline asm
  like the patch that Sebastian Siewior has submitted on another thread just
  a while back.
 
 Given that you are advocating a change (please keep in mind that
 atomic_read() and atomic_set() had volatile semantics on almost all
 platforms), care to give some example where these historical volatile
 semantics are causing a problem?
 [...]
 Can you give even one example
 where the pre-existing volatile semantics are causing enough of a problem
 to justify adding yet more atomic_*() APIs?

Will take this to the other sub-thread ...


  Of course, if we find there are more callers in the kernel who want the
  volatility behaviour than those who don't care, we can re-define the
  existing ops to such variants, and re-name the existing definitions to
  somethine else, say atomic_read_nonvolatile for all I care.
 
 Do we really need another set of APIs?

Well, if there's one set of users who do care about volatile behaviour,
and another set that doesn't, it only sounds correct to provide both
those APIs, instead of forcing one behaviour on all users.


Thanks,
Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Segher Boessenkool wrote:

  Volatile behaviour itself isn't consistently defined (at least
  definitely not consistently implemented in various gcc versions across
  platforms),
 
 It should be consistent across platforms; if not, file a bug please.
 
  but it is /expected/ to mean something like: ensure that
  every such access actually goes all the way to memory, and is not
  re-ordered w.r.t. to other accesses, as far as the compiler can take
  ^
  (volatile)

(or, alternatively, other accesses to the same volatile object ...)

  care of these. The last as far as compiler can take care disclaimer
  comes about due to CPUs doing their own re-ordering nowadays.
 
 You can *expect* whatever you want, but this isn't in line with
 reality at all.
 
 volatile _does not_ prevent reordering wrt other accesses.
 [...]
 What volatile does are a) never optimise away a read (or write)
 to the object, since the data can change in ways the compiler
 cannot see; and b) never move stores to the object across a
 sequence point.  This does not mean other accesses cannot be
 reordered wrt the volatile access.
 
 If the abstract machine would do an access to a volatile-
 qualified object, the generated machine code will do that
 access too.  But, for example, it can still be optimised
 away by the compiler, if it can prove it is allowed to.

As (now) indicated above, I had meant multiple volatile accesses to
the same object, obviously.

BTW:

#define atomic_read(a)  (*(volatile int *)(a))
#define atomic_set(a,i) (*(volatile int *)(a) = (i))

int a;

void func(void)
{
int b;

b = atomic_read(a);
atomic_set(a, 20);
b = atomic_read(a);
}

gives:

func:
pushl   %ebp
movla, %eax
movl%esp, %ebp
movl$20, a
movla, %eax
popl%ebp
ret

so the first atomic_read() wasn't optimized away.


 volatile _does not_ make accesses go all the way to memory.
 [...]
 If you want stuff to go all the way to memory, you need some
 architecture-specific flush sequence; to make a store globally
 visible before another store, you need mb(); before some following
 read, you need mb(); to prevent reordering you need a barrier.

Sure, which explains the as far as the compiler can take care bit.
Poor phrase / choice of words, probably.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma

[ The Cc: list scares me. Should probably trim it. ]


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
  How does the compiler know that msleep() has got barrier()s?
  
  Because msleep_interruptible() is in a separate compilation unit,
  the compiler has to assume that it might modify any arbitrary global.
  
  No; compilation units have nothing to do with it, GCC can optimise
  across compilation unit boundaries just fine, if you tell it to
  compile more than one compilation unit at once.
 
 Last I checked, the Linux kernel build system did compile each .c file
 as a separate compilation unit.
 
  What you probably mean is that the compiler has to assume any code
  it cannot currently see can do anything (insofar as allowed by the
  relevant standards etc.)

I think this was just terminology confusion here again. Isn't any code
that it cannot currently see the same as another compilation unit,
and wouldn't the compilation unit itself expand if we ask gcc to
compile more than one unit at once? Or is there some more specific
definition for compilation unit (in gcc lingo, possibly?)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[no subject]

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Segher Boessenkool wrote:

What you probably mean is that the compiler has to assume any code
it cannot currently see can do anything (insofar as allowed by the
relevant standards etc.)
  
  I think this was just terminology confusion here again. Isn't any code
  that it cannot currently see the same as another compilation unit,
 
 It is not; try  gcc -combine  or the upcoming link-time optimisation
 stuff, for example.
 
  and wouldn't the compilation unit itself expand if we ask gcc to
  compile more than one unit at once? Or is there some more specific
  definition for compilation unit (in gcc lingo, possibly?)
 
 compilation unit is a C standard term.  It typically boils down
 to single .c file.

As you mentioned later, single .c file with all the other files (headers
or other .c files) that it pulls in via #include is actually translation
unit, both in the C standard as well as gcc docs. Compilation unit
doesn't seem to be nearly as standard a term, though in most places it
is indeed meant to be same as translation unit, but with the new gcc
inter-module-analysis stuff that you referred to above, I suspect one may
reasonably want to call a compilation unit as all that the compiler sees
at a given instant.

BTW I did some auditing (only inside include/asm-{i386,x86_64}/ and
arch/{i386,x86_64}/) and found a couple more callsites that don't use
cpu_relax():

arch/i386/kernel/crash.c:101
arch/x86_64/kernel/crash.c:97

that are:

while ((atomic_read(waiting_for_crash_ipi)  0)  msecs) {
mdelay(1);
msecs--;
}

where mdelay() becomes __const_udelay() which happens to be in another
translation unit (arch/i386/lib/delay.c) and hence saves this callsite
from being a bug :-)

Curiously, __const_udelay() is still marked as inline where it is
implemented in lib/delay.c which is weird, considering it won't ever
be inlined, would it? With the kernel presently being compiled one
translation unit at a time, I don't see how the implementation would
be visible to any callsite out there to be able to inline it.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Heiko Carstens wrote:

 [...]
 Btw.: we still have
 
 include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
 include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
 
 Looks like they need to be fixed as well.


[PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself that already
uses cpu_relax().

For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.

[ I test-built a kernel -- smp_callin() itself got inlined in its only
  callsite, smpboot.c:start_secondary() -- and the relevant piece of
  code disassembles to the following:

0xc1019704 start_secondary+12:mov0xc144c4c8,%eax
0xc1019709 start_secondary+17:test   %eax,%eax
0xc101970b start_secondary+19:je 0xc1019709 start_secondary+17

  init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
  then we loop over the test of the stale value in the register only,
  so these look like real bugs to me. With the fix below, this becomes:

0xc1019706 start_secondary+14:pause
0xc1019708 start_secondary+16:cmpl   $0x0,0xc144c4c8
0xc101970f start_secondary+23:je 0xc1019706 start_secondary+14

  which looks nice and healthy. ]

Thanks to Heiko Carstens for noticing this.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 include/asm-i386/mach-default/mach_wakecpu.h |3 ++-
 include/asm-i386/mach-es7000/mach_wakecpu.h  |3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/asm-i386/mach-default/mach_wakecpu.h 
b/include/asm-i386/mach-default/mach_wakecpu.h
index 673b85c..3ebb178 100644
--- a/include/asm-i386/mach-default/mach_wakecpu.h
+++ b/include/asm-i386/mach-default/mach_wakecpu.h
@@ -15,7 +15,8 @@
 
 static inline void wait_for_init_deassert(atomic_t *deassert)
 {
-   while (!atomic_read(deassert));
+   while (!atomic_read(deassert))
+   cpu_relax();
return;
 }
 
diff --git a/include/asm-i386/mach-es7000/mach_wakecpu.h 
b/include/asm-i386/mach-es7000/mach_wakecpu.h
index efc903b..84ff583 100644
--- a/include/asm-i386/mach-es7000/mach_wakecpu.h
+++ b/include/asm-i386/mach-es7000/mach_wakecpu.h
@@ -31,7 +31,8 @@ wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
 static inline void wait_for_init_deassert(atomic_t *deassert)
 {
 #ifdef WAKE_SECONDARY_VIA_INIT
-   while (!atomic_read(deassert));
+   while (!atomic_read(deassert))
+   cpu_relax();
 #endif
return;
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma

[ Sorry for empty subject line in previous mail. I intended to make
  a patch so cleared it to change it, but ultimately neither made
  a patch nor restored subject line. Done that now. ]


On Thu, 16 Aug 2007, Herbert Xu wrote:

 On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote:
  
  that are:
  
  while ((atomic_read(waiting_for_crash_ipi)  0)  msecs) {
  mdelay(1);
  msecs--;
  }
  
  where mdelay() becomes __const_udelay() which happens to be in another
  translation unit (arch/i386/lib/delay.c) and hence saves this callsite
  from being a bug :-)
 
 The udelay itself certainly should have some form of cpu_relax in it.

Yes, a form of barrier() must be present in mdelay() or udelay() itself
as you say, having it in __const_udelay() is *not* enough (superflous
actually, considering it is already a separate translation unit and
invisible to the compiler).

However, there are no compiler barriers on the macro-definition-path
between mdelay(1) and __const_udelay(), so the only thing that saves us
from being a bug here is indeed the different-translation-unit concept.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Segher Boessenkool wrote:

 [...]
  BTW:
  
  #define atomic_read(a)  (*(volatile int *)(a))
  #define atomic_set(a,i) (*(volatile int *)(a) = (i))
  
  int a;
  
  void func(void)
  {
  int b;
  
  b = atomic_read(a);
  atomic_set(a, 20);
  b = atomic_read(a);
  }
  
  gives:
  
  func:
  pushl   %ebp
  movla, %eax
  movl%esp, %ebp
  movl$20, a
  movla, %eax
  popl%ebp
  ret
  
  so the first atomic_read() wasn't optimized away.
 
 Of course.  It is executed by the abstract machine, so
 it will be executed by the actual machine.  On the other
 hand, try
 
   b = 0;
   if (b)
   b = atomic_read(a);
 
 or similar.

Yup, obviously. Volatile accesses (or any access to volatile objects),
or even __volatile__ asms (which gcc normally promises never to elid)
can always be optimized for cases such as these where the compiler can
trivially determine that the code in question is not reachable.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma

Hi Herbert,


On Thu, 16 Aug 2007, Herbert Xu wrote:

 On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
 
   The udelay itself certainly should have some form of cpu_relax in it.
  
  Yes, a form of barrier() must be present in mdelay() or udelay() itself
  as you say, having it in __const_udelay() is *not* enough (superflous
  actually, considering it is already a separate translation unit and
  invisible to the compiler).
 
 As long as __const_udelay does something which has the same
 effect as barrier it is enough even if it's in the same unit.

Only if __const_udelay() is inlined. But as I said, __const_udelay()
-- although marked inline -- will never be inlined anywhere in the
kernel in reality. It's an exported symbol, and never inlined from
modules. Even from built-in targets, the definition of __const_udelay
is invisible when gcc is compiling the compilation units of those
callsites. The compiler has no idea that that function has barriers
or not, so we're saved here _only_ by the lucky fact that
__const_udelay() is in a different compilation unit.


 As a matter of fact it does on i386 where __delay either uses
 rep_nop or asm/volatile.

__delay() can be either delay_tsc() or delay_loop() on i386.

delay_tsc() uses the rep_nop() there for it's own little busy
loop, actually. But for a call site that inlines __const_udelay()
-- if it were ever moved to a .h file and marked inline -- the
call to __delay() will _still_ be across compilation units. So,
again for this case, it does not matter if the callee function
has compiler barriers or not (it would've been a different story
if we were discussing real/CPU barriers, I think), what saves us
here is just the fact that a call is made to a function from a
different compilation unit, which is invisible to the compiler
when compiling the callsite, and hence acting as the compiler
barrier.

Regarding delay_loop(), it uses volatile for the asm which
has quite different semantics from the C language volatile
type-qualifier keyword and does not imply any compiler barrier
at all.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Wed, 15 Aug 2007, Christoph Lameter wrote:

 On Wed, 15 Aug 2007, Paul E. McKenney wrote:
 
  Understood.  My point is not that the impact is precisely zero, but
  rather that the impact on optimization is much less hurtful than the
  problems that could arise otherwise, particularly as compilers become
  more aggressive in their optimizations.
 
 The problems arise because barriers are not used as required. Volatile 
 has wishy washy semantics and somehow marries memory barriers with data 
 access. It is clearer to separate the two. Conceptual cleanness usually 
 translates into better code. If one really wants the volatile then lets 
 make it explicit and use
 
   atomic_read_volatile()

Completely agreed, again. To summarize again (had done so about ~100 mails
earlier in this thread too :-) ...

atomic_{read,set}_volatile() -- guarantees volatility also along with
atomicity (the two _are_ different concepts after all, irrespective of
whether callsites normally want one with the other or not)

atomic_{read,set}_nonvolatile() -- only guarantees atomicity, compiler
free to elid / coalesce / optimize such accesses, can keep the object
in question cached in a local register, leads to smaller text, etc.

As to which one should be the default atomic_read() is a question of
whether majority of callsites (more weightage to important / hot
codepaths, lesser to obscure callsites) want a particular behaviour.

Do we have a consensus here? (hoping against hope, probably :-)


[ This thread has gotten completely out of hand ... for my mail client
  alpine as well, it now seems. Reminds of that 1000+ GPLv3 fest :-) ]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Thu, 16 Aug 2007, Paul Mackerras wrote:

 Herbert Xu writes:
 
  See sk_stream_mem_schedule in net/core/stream.c:
  
  /* Under limit. */
  if (atomic_read(sk-sk_prot-memory_allocated)  
  sk-sk_prot-sysctl_mem[0]) {
  if (*sk-sk_prot-memory_pressure)
  *sk-sk_prot-memory_pressure = 0;
  return 1;
  }
  
  /* Over hard limit. */
  if (atomic_read(sk-sk_prot-memory_allocated)  
  sk-sk_prot-sysctl_mem[2]) {
  sk-sk_prot-enter_memory_pressure();
  goto suppress_allocation;
  }
  
  We don't need to reload sk-sk_prot-memory_allocated here.
 
 Are you sure?  How do you know some other CPU hasn't changed the value
 in between?

I can't speak for this particular case, but there could be similar code
examples elsewhere, where we do the atomic ops on an atomic_t object
inside a higher-level locking scheme that would take care of the kind of
problem you're referring to here. It would be useful for such or similar
code if the compiler kept the value of that atomic object in a register.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Thu, 16 Aug 2007, Satyam Sharma wrote:

 On Thu, 16 Aug 2007, Paul Mackerras wrote:
  Herbert Xu writes:
  
   See sk_stream_mem_schedule in net/core/stream.c:
   
   /* Under limit. */
   if (atomic_read(sk-sk_prot-memory_allocated)  
   sk-sk_prot-sysctl_mem[0]) {
   if (*sk-sk_prot-memory_pressure)
   *sk-sk_prot-memory_pressure = 0;
   return 1;
   }
   
   /* Over hard limit. */
   if (atomic_read(sk-sk_prot-memory_allocated)  
   sk-sk_prot-sysctl_mem[2]) {
   sk-sk_prot-enter_memory_pressure();
   goto suppress_allocation;
   }
   
   We don't need to reload sk-sk_prot-memory_allocated here.
  
  Are you sure?  How do you know some other CPU hasn't changed the value
  in between?
 
 I can't speak for this particular case, but there could be similar code
 examples elsewhere, where we do the atomic ops on an atomic_t object
 inside a higher-level locking scheme that would take care of the kind of
 problem you're referring to here. It would be useful for such or similar
 code if the compiler kept the value of that atomic object in a register.

We might not be using atomic_t (and ops) if we already have a higher-level
locking scheme, actually. So as Herbert mentioned, such cases might just
not care. [ Too much of this thread, too little sleep, sorry! ]

Anyway, the problem, of course, is that this conversion to a stronger /
safer-by-default behaviour doesn't happen with zero cost to performance.
Converting atomic ops to volatile behaviour did add ~2K to kernel text
for archs such as i386 (possibly to important codepaths) that didn't have
those semantics already so it would be constructive to actually look at
those differences and see if there were really any heisenbugs that got
rectified. Or if there were legitimate optimizations that got wrongly
disabled. Onus lies on those proposing the modifications, I'd say ;-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma

Hi Bill,


On Wed, 15 Aug 2007, Bill Fink wrote:

 On Wed, 15 Aug 2007, Satyam Sharma wrote:
 
  (C)
  $ cat tp3.c
  int a;
  
  void func(void)
  {
  *(volatile int *)a = 10;
  *(volatile int *)a = 20;
  }
  $ gcc -Os -S tp3.c
  $ cat tp3.s
  ...
  movl$10, a
  movl$20, a
  ...
 
 I'm curious about one minor tangential point.  Why, instead of:
 
   b = *(volatile int *)a;
 
 why can't this just be expressed as:
 
   b = (volatile int)a;
 
 Isn't it the contents of a that's volatile, i.e. it's value can change
 invisibly to the compiler, and that's why you want to force a read from
 memory?  Why do you need the *(volatile int *) construct?

b = (volatile int)a; doesn't help us because a cast to a qualified type
has the same effect as a cast to an unqualified version of that type, as
mentioned in 6.5.4:4 (footnote 86) of the standard. Note that volatile
is a type-qualifier, not a type itself, so a cast of the _object_ itself
to a qualified-type i.e. (volatile int) would not make the access itself
volatile-qualified.

To serve our purposes, it is necessary for us to take the address of this
(non-volatile) object, cast the resulting _pointer_ to the corresponding
volatile-qualified pointer-type, and then dereference it. This makes that
particular _access_ be volatile-qualified, without the object itself being
such. Also note that the (dereferenced) result is also a valid lvalue and
hence can be used in *(volatile int *)a = b; kind of construction
(which we use for the atomic_set case).


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma



On Thu, 16 Aug 2007, Satyam Sharma wrote:

 Hi Bill,
 
 
 On Wed, 15 Aug 2007, Bill Fink wrote:
 
  On Wed, 15 Aug 2007, Satyam Sharma wrote:
  
   (C)
   $ cat tp3.c
   int a;
   
   void func(void)
   {
 *(volatile int *)a = 10;
 *(volatile int *)a = 20;
   }
   $ gcc -Os -S tp3.c
   $ cat tp3.s
   ...
   movl$10, a
   movl$20, a
   ...
  
  I'm curious about one minor tangential point.  Why, instead of:
  
  b = *(volatile int *)a;
  
  why can't this just be expressed as:
  
  b = (volatile int)a;
  
  Isn't it the contents of a that's volatile, i.e. it's value can change
  invisibly to the compiler, and that's why you want to force a read from
  memory?  Why do you need the *(volatile int *) construct?
 
 b = (volatile int)a; doesn't help us because a cast to a qualified type
 has the same effect as a cast to an unqualified version of that type, as
 mentioned in 6.5.4:4 (footnote 86) of the standard. Note that volatile
 is a type-qualifier, not a type itself, so a cast of the _object_ itself
 to a qualified-type i.e. (volatile int) would not make the access itself
 volatile-qualified.
 
 To serve our purposes, it is necessary for us to take the address of this
 (non-volatile) object, cast the resulting _pointer_ to the corresponding
 volatile-qualified pointer-type, and then dereference it. This makes that
 particular _access_ be volatile-qualified, without the object itself being
 such. Also note that the (dereferenced) result is also a valid lvalue and
 hence can be used in *(volatile int *)a = b; kind of construction
 (which we use for the atomic_set case).

Here, I should obviously admit that the semantics of *(volatile int *)
aren't any neater or well-defined in the _language standard_ at all. The
standard does say (verbatim) precisely what constitutes as access to
object of volatile-qualified type is implementation-defined, but GCC
does help us out here by doing the right thing. Accessing the non-volatile
object there using the volatile-qualified pointer-type cast makes GCC
treat the object stored at that memory address itself as if it were a 
volatile object, thus making the access end up having what we're calling
as volatility semantics here.

Honestly, given such confusion, and the propensity of the volatile
type-qualifier keyword to be ill-defined (or at least poorly understood,
often inconsistently implemented), I'd (again) express my opinion that it
would be best to avoid its usage, given other alternatives do exist.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Satyam Sharma



On Tue, 14 Aug 2007, Christoph Lameter wrote:

 On Thu, 9 Aug 2007, Chris Snook wrote:
 
  This patchset makes the behavior of atomic_read uniform by removing the
  volatile keyword from all atomic_t and atomic64_t definitions that currently
  have it, and instead explicitly casts the variable as volatile in
  atomic_read().  This leaves little room for creative optimization by the
  compiler, and is in keeping with the principles behind volatile considered
  harmful.
 
 volatile is generally harmful even in atomic_read(). Barriers control
 visibility and AFAICT things are fine.

Frankly, I don't see the need for this series myself either. Personal
opinion (others may differ), but I consider volatile to be a sad /
unfortunate wart in C (numerous threads on this list and on the gcc
lists/bugzilla over the years stand testimony to this) and if we _can_
steer clear of it, then why not -- why use this ill-defined primitive
whose implementation has often differed over compiler versions and
platforms? Granted, barrier() _is_ heavy-handed in that it makes the
optimizer forget _everything_, but then somebody did post a forget()
macro on this thread itself ...

[ BTW, why do we want the compiler to not optimize atomic_read()'s in
  the first place? Atomic ops guarantee atomicity, which has nothing
  to do with volatility -- users that expect volatility from
  atomic ops are the ones who must be fixed instead, IMHO. ]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 27/28] Introduce U16_MAX and U32_MAX

2007-08-13 Thread Satyam Sharma

Hi David,

On Fri, 10 Aug 2007, David Miller wrote:

 From: [EMAIL PROTECTED]
 Date: Fri, 10 Aug 2007 14:12:10 -0700

  From: Satyam Sharma [EMAIL PROTECTED]

  ... in kernel.h and clean up home-grown macros elsewhere in the tree.

  Leave out the one in reiserfs_fs.h as it is in the userspace-visible part
  of that header. Still, #undef the (equivalent) kernel version there to
  avoid seeing redefined, previous definition was here gcc warnings.

  [EMAIL PROTECTED]: fix U16_MAX, U32_MAX defns]
  Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
  Signed-off-by: Andrew Morton [EMAIL PROTECTED]

 I won't apply this one, for two reasons:

 1) The reiserfs definition is better, it is _type_ based.
Please use (~(__u16)0) and (~(__u32)0), respectively.

Hmm, in that case ((__u16)0x) and ((__u32)0x) are probably
better and clearer -- as that's what u16_max and u32_max are, after all.

We do require the (~0) thing for the max int/uint/long types, but that's
because those are types where the number-of-bits is not known when writing
the macro definition -- but that's not case with u16 and u32, so the
0xff... variants are clearer, IMHO.

 2) The reiserfs definition is going to define an equivalent
value, so just adding an #undef and still letting reiserfs
override is wrong.  Why put a common define in kernel.h
if other headers still keep their own crufty copy too?

Because removing the (re-)definition of U32_MAX from in there in
reiserfs_fs.h will break builds of all userspace users of U32_MAX and
max_reiserfs_offset(), would it not? I haven't looked at any reiserfs
userspace tools source code, so possibly none such (that use
max_reiserfs_offset) exist, but I thought it better to be safe.
I'll have a look at the reiserfs-utils package, just in case.

Thanks,
Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 27/28] Introduce U16_MAX and U32_MAX

2007-08-13 Thread Satyam Sharma



On Mon, 13 Aug 2007, Satyam Sharma wrote:

 Hi David,
 
 
 On Fri, 10 Aug 2007, David Miller wrote:
 [...]
  1) The reiserfs definition is better, it is _type_ based.
 Please use (~(__u16)0) and (~(__u32)0), respectively.
 
 Hmm, in that case ((__u16)0x) and ((__u32)0x) are probably
 better and clearer -- as that's what u16_max and u32_max are, after all.
 
 We do require the (~0) thing for the max int/uint/long types, but that's
 because those are types where the number-of-bits is not known when writing
 the macro definition -- but that's not case with u16 and u32, so the
 0xff... variants are clearer, IMHO.
 
 
  2) The reiserfs definition is going to define an equivalent
 value, so just adding an #undef and still letting reiserfs
 override is wrong.  Why put a common define in kernel.h
 if other headers still keep their own crufty copy too?
 
 Because removing the (re-)definition of U32_MAX from in there in
 reiserfs_fs.h will break builds of all userspace users of U32_MAX and
 max_reiserfs_offset(), would it not? I haven't looked at any reiserfs
 userspace tools source code, so possibly none such (that use
 max_reiserfs_offset) exist, but I thought it better to be safe.
 I'll have a look at the reiserfs-utils package, just in case.

I checked with latest reiserfsprogs available from namesys.com and also
reiserfs-utils package available from Fedora, and it appears there are
no users of either U32_MAX or max_reiserfs_offset() in any userspace
reiserfs tools, so we're safe in removing the U32_MAX from
include/linux/reiserfs_fs.h.


Thanks,
Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: NETPOLL=y , NETDEVICES=n compile error ( Re: 2.6.23-rc1-mm1 )

2007-08-02 Thread Satyam Sharma

Hi,


On Thu, 2 Aug 2007, Sam Ravnborg wrote:

   
   ...
   endif # NETDEVICES
   
   config NETPOLL
   depends on NETDEVICES
   def_bool NETCONSOLE
   
   config NETPOLL_TRAP
   bool Netpoll traffic trapping
   default n
   depends on NETPOLL
   
   config NET_POLL_CONTROLLER
   def_bool NETPOLL
   depends on NETPOLL
   
   
   seems to select NET_POLL_CONTROLLER after selecting NETPOLL, but
   still doesn't check for NETDEVICES dependency.
  
  That's odd. Adding Sam to the cc:.

I just noticed this thread, but I wonder what the fuss is all
about :-) Kconfig dependencies are easy, really -- any code that
pulls in code from elsewhere, must explicitly depends on it.
It is possible to use select as well, but could lead to breakages
as discussed to death on at least 64592 other threads on LKML already
and hence should only be used for library-like code that does not
have any dependencies itself.


 select is evil
 select will by brute force set a symbol equal to 'y' without
 visiting the dependencies.
 So abusing select you are able to select a symbol FOO even 
 if FOO depends on BAR that is not set.
 
 In general use select only for non-visible symbols (no promts anywhere)
 and for symbols with no dependencies.
 That will limit the suefullness but on the other hand avoid the illegal
 configurations all over.

The problem with using depends on is that your config symbol becomes
invisible unless the dependency has already been selected.

So, there's a workaround: make the ultimate config symbol itself depend
upon the grand-dependency (excuse the nomenclature) and just select
the immediate-parent-dependency, i.e. the following:

CONFIG_BAZ
...

CONFIG BAR
depends on BAZ

CONFIG_FOO
depends on BAZ
select BAR

is perfectly legal, and doesn't cause any build problems. Perhaps such a
solution makes sense here as well?


 kconfig should one day warn about such things but I have not fel inclined
 to dive into the matters hoping that Roman does one day.

Yup, I've wanted to do this myself, in fact I wanted to implement an idea
I had in mind ( http://lkml.org/lkml/2007/5/16/257 ) but for some reason
I tend to stay away from stuff in scripts/ :-)


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: NETPOLL=y , NETDEVICES=n compile error ( Re: 2.6.23-rc1-mm1 )

2007-08-02 Thread Satyam Sharma

[ Read through the thread, looked at Kconfig files,
  did some tests. Adding Kconfig experts to Cc: list. ]


 On Thu, 2 Aug 2007, Sam Ravnborg wrote:
 

...
endif # NETDEVICES

config NETPOLL
depends on NETDEVICES
def_bool NETCONSOLE

config NETPOLL_TRAP
bool Netpoll traffic trapping
default n
depends on NETPOLL

config NET_POLL_CONTROLLER
def_bool NETPOLL
depends on NETPOLL


Gargh, what we're seeing here is a whole bunch of bugs, I think. First
I thought this must be one of those randconfig-producing-wrong-configs
issues, but surprisingly, running make oldconfig on this .config on
a fresh 2.6.23-rc1-mm1 tree didn't change anything in the .config.


Kconfig bug #1:
===

Which means, although:

*
menuconfig BAZ

if BAZ

config BAR

endif
*

is widely believed (by most folks, I've heard this on several threads,
and as written in the comment in drivers/net/Kconfig) to be equivalent to:

*
menuconfig BAZ

if BAZ
endif

config BAR
depends on BAZ
*

this is *not* enforced by make oldconfig! And hence, the NETPOLL 
!NETDEVICES situation we're seeing here.

[ We could also categorize this as a bug in Kconfig's if, fwiw. ]


Kconfig bug #2:
===

config FOO
def_bool BAR

is supposed to ensure that FOO == BAR (as Matt mentioned earlier).

However, even this is *not* enforced by make oldconfig. And hence,
the NETPOLL  !NET_POLL_CONTROLLER situation we're seeing here.

In fact, I believe it's possible to even pass a NETCONSOLE but
!NETPOLL kind of .config through make oldconfig but it still won't
catch it, and build breakages *will* occur.

[ We could also categorize this as a bug in Kconfig's def_bool, fwiw. ]

Possibly, we could also decide to just blame randconfig for the whole
issue, and forget about these, because I think it's highly unlikely
(though not impossible) for people with real .configs to hit the
problems we saw above.


KGDBOE bug #1:
==

config KGDBOE in lib/Kconfig.kgdb must also depend on NETDEVICES,
and select NET_POLL_CONTROLLER also.


KGDBOE bug #2:
==

config KGDBOE_NOMODULE is a sad, sad option, and must be killed. The
if !KGDBOE_NOMODULE in KGDBOE must be removed, and it must lose its
dependency on m.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [linux-usb-devel] [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-30 Thread Satyam Sharma



On Mon, 30 Jul 2007, Petko Manolov wrote:

 On Sun, 29 Jul 2007, Oliver Neukum wrote:
 
  [...]
  pegasus == NULL there would be a kernel bug. Silently ignoring
  it, like the code now wants to do is bad. As the oops has never been
  reported, I figure turning it into an explicit debugging test is overkill,
  so removal seems to be the best option.
 
 In the past urb-context was not guaranteed to be non-null for any
 asynchronous calls.  If this is not the case anymore then it should be removed
 from at least two more locations in the driver.
 
 Attached you'll find the resulting patch.

Given Oliver's earlier comment, it looks okay to me. Thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [git patches] net driver fixes

2007-07-30 Thread Satyam Sharma



On Mon, 30 Jul 2007, Jeff Garzik wrote:

 true, we should just remove the dev==NULL check

Patch below:

[PATCH] nmclan_cs: Remove bogus (dev==NULL) check in mace_interrupt()

The (dev == NULL) check in drivers/net/pcmcia/nmclan_cs.c:mace_interrupt()
handler is always false, so let's remove it.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/pcmcia/nmclan_cs.c |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/drivers/net/pcmcia/nmclan_cs.c b/drivers/net/pcmcia/nmclan_cs.c
index 73da611..3446be9 100644
--- a/drivers/net/pcmcia/nmclan_cs.c
+++ b/drivers/net/pcmcia/nmclan_cs.c
@@ -1000,12 +1000,6 @@ static irqreturn_t mace_interrupt(int irq, void *dev_id)
   int status;
   int IntrCnt = MACE_MAX_IR_ITERATIONS;
 
-  if (dev == NULL) {
-DEBUG(2, mace_interrupt(): irq 0x%X for unknown device.\n,
- irq);
-return IRQ_NONE;
-  }
-
   if (lp-tx_irq_disabled) {
 printk(
   (lp-tx_irq_disabled?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 -mm 0/9] netconsole: Multiple targets and dynamic reconfigurability

2007-07-30 Thread Satyam Sharma



On Mon, 30 Jul 2007, Andrew Morton wrote:

 On Mon, 30 Jul 2007 08:17:41 +0530
 Satyam Sharma [EMAIL PROTECTED] wrote:
 
  [0/9] netconsole: Multiple targets and dynamic reconfigurability
  
  This is v3 of the patchset
 
 That all looks pretty reasonable, thanks.
 
 Are we all comfortable with the attributions?  As I have it now, everything
 will go in as having been authored by yourself.  There is no formal (ie:
 understood-by-git) way of recording joint authorship, unfortunately.  But one
 can make a record of such things in the changelog.

Ok, making such a record would be good, if possible. I did see that the
author field for git commits can have only one entry, unfortunately,
and it turns out Signed-off-by and Acked-by have other meanings
attached that have to do with the patch-journey-from-author-to-you chain.
But I'd like these to be somehow marked as jointly authored, definitely.

 I'll queue this material up in the to-be-sent-to-davem queue.

Ok, thanks.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 -mm 9/9] netconsole: Support dynamic reconfiguration using configfs

2007-07-30 Thread Satyam Sharma



On Mon, 30 Jul 2007, Andrew Morton wrote:

 On Mon, 30 Jul 2007 08:19:10 +0530
 Satyam Sharma [EMAIL PROTECTED] wrote:
 
  +/*
  + * Wrapper over simple_strtol (base 10) with sanity and range checking.
  + * We return (signed) long only because we may want to return errors.
  + * Do not use this to convert numbers that are allowed to be negative.
  + */
  +static long strtol10_check_range(const char *cp, long min, long max)
  +{
  +   long ret;
  +   char *p = (char *) cp;
  +
  +   WARN_ON(min  0);
  +   WARN_ON(max  min);
  +
  +   ret = simple_strtol(p, p, 10);
  +
  +   if (*p  (*p != '\n')) {
  +   printk(KERN_ERR netconsole: invalid input\n);
  +   return -EINVAL;
  +   }
  +   if ((ret  min) || (ret  max)) {
  +   printk(KERN_ERR netconsole: input %ld must be between 
  +   %ld and %ld\n, ret, min, max);
  +   return -EINVAL;
  +   }
  +
  +   return ret;
  +}
 
 There's probably other code around the place which does this.  It might
 be worth making this function a general-purpose thing.

Probably, yes, I thought along similar lines.

[ BTW somewhere in the 9/9 patch I remember having to define a
  __U16_MAX as well. That could be pushed out to some generic
  or appropriate header as well, possibly. ]


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [linux-usb-devel] [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-29 Thread Satyam Sharma



On Sun, 29 Jul 2007, Oliver Neukum wrote:

 Am Sonntag 29 Juli 2007 schrieb Jesper Juhl:
  On 29/07/07, Satyam Sharma [EMAIL PROTECTED] wrote:
   Hi,
  
   On 7/29/07, Jesper Juhl [EMAIL PROTECTED] wrote:
Hello,
   
This patch makes sure we don't dereference a NULL pointer in
drivers/net/usb/pegasus.c::write_bulk_callback() in the initial
struct net_device *net = pegasus-net; assignment.
The existing code checks if 'pegasus' is NULL and bails out if
it is, so we better not touch that pointer until after that check.
[...]
  
   Is it really possible that we're called into this function with
   urb-context == NULL? If not, I'd suggest let's just get rid of
   the check itself, instead.
  
  I'm not sure. I am not very familiar with this code. I just figured
  that moving the assignment is potentially a little safer and it is
  certainly no worse than the current code, so that's a safe and
  potentially benneficial change. Removing the check may be safe but I'm
  not certain enough to make that call...
 
 pegasus == NULL there would be a kernel bug. Silently ignoring
 it, like the code now wants to do is bad. As the oops has never been
 reported, I figure turning it into an explicit debugging test is overkill,
 so removal seems to be the best option.

Ok, thanks. Updated patch below.

[PATCH] pegasus: Remove bogus checks in urb-complete() callbacks

urb-complete() callbacks registered in drivers/net/usb/pegasus.c needlessly
check for urb-context != NULL, but that is not possible (the only way that
can be possible would be a kernel bug elsewhere, and these checks would
simply end up hiding that). So let's remove the bogus checks.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/usb/pegasus.c |9 +
 1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index a05fd97..439ef9f 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -770,9 +770,6 @@ static void write_bulk_callback(struct urb *urb)
pegasus_t *pegasus = urb-context;
struct net_device *net = pegasus-net;
 
-   if (!pegasus)
-   return;
-
if (!netif_device_present(net) || !netif_running(net))
return;
 
@@ -805,13 +802,9 @@ static void write_bulk_callback(struct urb *urb)
 static void intr_callback(struct urb *urb)
 {
pegasus_t *pegasus = urb-context;
-   struct net_device *net;
+   struct net_device *net = pegasus-net;
int status;
 
-   if (!pegasus)
-   return;
-   net = pegasus-net;
-
switch (urb-status) {
case 0:
break;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-29 Thread Satyam Sharma



 On Sun, 29 Jul 2007, Domen Puncer wrote:
 
  On 29/07/07 00:02 +0200, Jesper Juhl wrote:
   Hi,
   
   Here's a small patch, prompted by a find by the Coverity checker, 
   that removes a potential NULL pointer dereference from 
   drivers/net/sb1000.c::sb1000_dev_ioctl().
   The checker spotted that we do a NULL test of 'dev', yet we 
   dereference the pointer prior to that check.
   This patch simply moves the dereference after the NULL test.
  
  But... it can't be called without a valid 'dev', no?
  A quick 'grep do_ioctl net/' confirms that all calls are in
  the form of 'dev-do_ioctl(dev, ...'.
 
 Yup, I think so too ...
 
 
   @@ -991,11 +991,13 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
   struct ifreq *ifr, int cmd)
 short PID[4];
 int ioaddr[2], status, frequency;
 unsigned int stats[5];
   - struct sb1000_private *lp = netdev_priv(dev);
   + struct sb1000_private *lp;

 if (!(dev  dev-flags  IFF_UP))
 return -ENODEV;
 
 I think we could get rid of the !dev check itself. Actually, the IFF_UP
 check /also/ looks suspect to me for two reasons: (1) I remember Stephen
 Hemminger once telling me dev-flags is legacy and unsafe, and one of
 the netif_xxx() functions be used instead, and, (2) I wonder if we really
 require the interface to be up and *running* when we do this ioctl.

Updated patch below.

[PATCH] sb1000: Remove bogus checks

In net_device-do_ioctl() of the sb1000 driver (sb1000_dev_ioctl):

(1) !dev condition is always false -- this function cannot be called with
NULL net_device.
(2) !(dev-flags  IFF_UP) is bogus because the functions of this ioctl
can (and should) be allowed even when the interface is not up and running.

So let's remove these checks.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Cc: Jesper Juhl [EMAIL PROTECTED]
Cc: Domen Puncer [EMAIL PROTECTED]

---

 drivers/net/sb1000.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
index 1de3eec..f60fe98 100644
--- a/drivers/net/sb1000.c
+++ b/drivers/net/sb1000.c
@@ -993,9 +993,6 @@ static int sb1000_dev_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
unsigned int stats[5];
struct sb1000_private *lp = netdev_priv(dev);
 
-   if (!(dev  dev-flags  IFF_UP))
-   return -ENODEV;
-
ioaddr[0] = dev-base_addr;
/* mem_start holds the second I/O address */
ioaddr[1] = dev-mem_start;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-29 Thread Satyam Sharma

Hi Michael,


On Sun, 29 Jul 2007, Michael Buesch wrote:

 On Sunday 29 July 2007 20:34:46 Satyam Sharma wrote:
  (2) !(dev-flags  IFF_UP) is bogus because the functions of this ioctl
  can (and should) be allowed even when the interface is not up and running.
 
 Are you _sure_? This function does poke with the device hardware.
 It might return crap or even machinecheck when not initialized.
 Hardware is probably powered down, if not IFF_UP. (I don't know if that's
 the case here, though).

IFF_UP checks if the _interface_ is up -- the hardware / card could still
be powered up, but the interface down (ifconfing eth0 down or ip link set
eth0 down).

Probably what we want here is netif_device_present()? -- I think that
should return true only when the *device* itself is up (as in powered)
but the interface itself could be down ...

Let's wait for comments from the netdev people Cc:'ed here, in that case.


   drivers/net/sb1000.c |3 ---
   1 files changed, 0 insertions(+), 3 deletions(-)
  
  diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
  index 1de3eec..f60fe98 100644
  --- a/drivers/net/sb1000.c
  +++ b/drivers/net/sb1000.c
  @@ -993,9 +993,6 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
  struct ifreq *ifr, int cmd)
  unsigned int stats[5];
  struct sb1000_private *lp = netdev_priv(dev);
   
  -   if (!(dev  dev-flags  IFF_UP))
  -   return -ENODEV;
  -


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 0/9] netconsole: Multiple targets and dynamic reconfigurability

2007-07-29 Thread Satyam Sharma

[0/9] netconsole: Multiple targets and dynamic reconfigurability

This is v3 of the patchset, the previous versions are available at:
http://lkml.org/lkml/2007/7/4/107
http://lkml.org/lkml/2007/7/10/78

Diffed against 2.6.23-rc1-git6 (6c8dca5d as of writing), but applies
successfully to 2.6.23-rc1-mm1 as well.

Patches 1-5 are okay-to-apply to current mainline -git, I think.

Patches 6-9 introduce the new functionality and are requested for
inclusion in -mm.

[1/9] netconsole: Cleanups, codingstyle, prettyfication
[2/9] netconsole: Remove bogus check
[3/9] netconsole: Simplify boot/module option setup logic
[4/9] netconsole: Use netif_running() in write_msg()
[5/9] netconsole: Add some useful tips to documentation
[6/9] netconsole: Introduce netconsole_target
[7/9] netconsole: Introduce netconsole_netdev_notifier
[8/9] netconsole: Support multiple logging targets
[9/9] netconsole: Support dynamic reconfiguration using configfs

Changes since v2:
=

* enabled must be defined outside #ifdef NETCONSOLE_DYNAMIC

* Some simple enhancements to documentation

* Drop the use of unlikely from a condition where I'd got the
  common case wrong

About this patchset:


What?

Support multiple remote logging targets in netconsole. Also, ability to
dynamically add or remove targets or modify parameters (IP, port, remote
MAC address) of targets at runtime, from userspace, using configfs.

Why?

From Keiichi Kii's original post:

[...] current netconsole is not flexible. For example, if you want to change
ip address for logging agent, in the case of built-in netconsole, you can't
change config except for changing boot parameter and rebooting your system,
or in the case of module netconsole, you need to remove it and reload
with different parameters. [...] and we have been losing serial console
port with PCs and Servers.

(... and especially laptops, I would add.)


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 1/9] netconsole: Cleanups, codingstyle, prettyfication

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[1/9] netconsole: Cleanups, codingstyle, prettyfication

(1) Remove unwanted headers.
(2) Mark __init and __exit as appropriate.
(3) Various trivial codingstyle and prettification stuff.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   55 ++---
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 69233f6..f1c2a2d 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -35,35 +35,32 @@
  /
 
 #include linux/mm.h
-#include linux/tty.h
 #include linux/init.h
 #include linux/module.h
 #include linux/console.h
-#include linux/tty_driver.h
 #include linux/moduleparam.h
 #include linux/string.h
-#include linux/sysrq.h
-#include linux/smp.h
 #include linux/netpoll.h
 
 MODULE_AUTHOR(Maintainer: Matt Mackall [EMAIL PROTECTED]);
 MODULE_DESCRIPTION(Console driver for network interfaces);
 MODULE_LICENSE(GPL);
 
-static char config[256];
-module_param_string(netconsole, config, 256, 0);
+#define MAX_PARAM_LENGTH   256
+#define MAX_PRINT_CHUNK1000
+
+static char config[MAX_PARAM_LENGTH];
+module_param_string(netconsole, config, MAX_PARAM_LENGTH, 0);
 MODULE_PARM_DESC(netconsole,  [EMAIL 
PROTECTED]/[dev],[tgt-port]@tgt-ip/[tgt-macaddr]\n);
 
 static struct netpoll np = {
-   .name = netconsole,
-   .dev_name = eth0,
-   .local_port = 6665,
-   .remote_port = ,
-   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+   .name   = netconsole,
+   .dev_name   = eth0,
+   .local_port = 6665,
+   .remote_port= ,
+   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
 };
-static int configured = 0;
-
-#define MAX_PRINT_CHUNK 1000
+static int configured;
 
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
@@ -75,7 +72,7 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
 
local_irq_save(flags);
 
-   for(left = len; left; ) {
+   for (left = len; left;) {
frag = min(left, MAX_PRINT_CHUNK);
netpoll_send_udp(np, msg, frag);
msg += frag;
@@ -86,12 +83,12 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
 }
 
 static struct console netconsole = {
-   .name = netcon,
-   .flags = CON_ENABLED | CON_PRINTBUFFER,
-   .write = write_msg
+   .name   = netcon,
+   .flags  = CON_ENABLED | CON_PRINTBUFFER,
+   .write  = write_msg,
 };
 
-static int option_setup(char *opt)
+static int __init option_setup(char *opt)
 {
configured = !netpoll_parse_options(np, opt);
return 1;
@@ -99,28 +96,30 @@ static int option_setup(char *opt)
 
 __setup(netconsole=, option_setup);
 
-static int init_netconsole(void)
+static int __init init_netconsole(void)
 {
-   int err;
+   int err = 0;
 
-   if(strlen(config))
+   if (strnlen(config, MAX_PARAM_LENGTH))
option_setup(config);
 
-   if(!configured) {
-   printk(netconsole: not configured, aborting\n);
-   return 0;
+   if (!configured) {
+   printk(KERN_INFO netconsole: not configured, aborting\n);
+   goto out;
}
 
err = netpoll_setup(np);
if (err)
-   return err;
+   goto out;
 
register_console(netconsole);
printk(KERN_INFO netconsole: network logging started\n);
-   return 0;
+
+out:
+   return err;
 }
 
-static void cleanup_netconsole(void)
+static void __exit cleanup_netconsole(void)
 {
unregister_console(netconsole);
netpoll_cleanup(np);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 2/9] netconsole: Remove bogus check

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[2/9] netconsole: Remove bogus check

The (!np.dev) check in write_msg() is bogus (always false), because:
np.dev is set by netpoll_setup(), which is called by init_netconsole()
before register_console(), so write_msg() cannot be triggered unless
netpoll_setup() successfully set np.dev. Also np.dev cannot go away
from under us, because netpoll_setup() grabs us reference on it. So
let's remove the bogus check.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index f1c2a2d..2c2aef1 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -67,9 +67,6 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
int frag, left;
unsigned long flags;
 
-   if (!np.dev)
-   return;
-
local_irq_save(flags);
 
for (left = len; left;) {
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 3/9] netconsole: Simplify boot/module option setup logic

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[3/9] netconsole: Simplify boot/module option setup logic

Presently, boot/module parameters are set up quite differently for the
case of built-in netconsole (__setup() - obsolete_checksetup() -
netpoll_parse_options() - strlen(config) == 0 in init_netconsole())
vs modular netconsole (module_param_string() - string copied to the
config variable - strlen(config) != 0 init_netconsole() -
netpoll_parse_options()).

This patch makes both of them similar by doing exactly the equivalent
of a module_param_string() in option_setup() also -- just copying the
param string passed from the kernel command line into config variable.
So, strlen(config) != 0 in both cases, and netpoll_parse_options() is
always called from init_netconsole(), thus making the setup logic for
both cases similar.

Now, option_setup() is only ever called / used for the built-in case,
so we put it inside a #ifndef MODULE, otherwise gcc will complain about
option_setup() being defined but not used. Also, the configured
variable is redundant with this patch and hence removed.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   27 ++-
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 2c2aef1..e56aa6c 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -53,6 +53,15 @@ static char config[MAX_PARAM_LENGTH];
 module_param_string(netconsole, config, MAX_PARAM_LENGTH, 0);
 MODULE_PARM_DESC(netconsole,  [EMAIL 
PROTECTED]/[dev],[tgt-port]@tgt-ip/[tgt-macaddr]\n);
 
+#ifndefMODULE
+static int __init option_setup(char *opt)
+{
+   strlcpy(config, opt, MAX_PARAM_LENGTH);
+   return 1;
+}
+__setup(netconsole=, option_setup);
+#endif /* MODULE */
+
 static struct netpoll np = {
.name   = netconsole,
.dev_name   = eth0,
@@ -60,7 +69,6 @@ static struct netpoll np = {
.remote_port= ,
.remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
 };
-static int configured;
 
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
@@ -85,26 +93,19 @@ static struct console netconsole = {
.write  = write_msg,
 };
 
-static int __init option_setup(char *opt)
-{
-   configured = !netpoll_parse_options(np, opt);
-   return 1;
-}
-
-__setup(netconsole=, option_setup);
-
 static int __init init_netconsole(void)
 {
int err = 0;
 
-   if (strnlen(config, MAX_PARAM_LENGTH))
-   option_setup(config);
-
-   if (!configured) {
+   if (!strnlen(config, MAX_PARAM_LENGTH)) {
printk(KERN_INFO netconsole: not configured, aborting\n);
goto out;
}
 
+   err = netpoll_parse_options(np, config);
+   if (err)
+   goto out;
+
err = netpoll_setup(np);
if (err)
goto out;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 4/9] netconsole: Use netif_running() in write_msg()

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[4/9] netconsole: Use netif_running() in write_msg()

Avoid unnecessarily disabling interrupts and calling netpoll_send_udp()
if the corresponding local interface is not up.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]
Cc: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index e56aa6c..75cb761 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -75,16 +75,16 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
int frag, left;
unsigned long flags;
 
-   local_irq_save(flags);
-
-   for (left = len; left;) {
-   frag = min(left, MAX_PRINT_CHUNK);
-   netpoll_send_udp(np, msg, frag);
-   msg += frag;
-   left -= frag;
+   if (netif_running(np.dev)) {
+   local_irq_save(flags);
+   for (left = len; left;) {
+   frag = min(left, MAX_PRINT_CHUNK);
+   netpoll_send_udp(np, msg, frag);
+   msg += frag;
+   left -= frag;
+   }
+   local_irq_restore(flags);
}
-
-   local_irq_restore(flags);
 }
 
 static struct console netconsole = {
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 5/9] netconsole: Add some useful tips to documentation

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[5/9] netconsole: Add some useful tips to documentation

Add some useful general-purpose tips. Also suggest solution for the frequent
problem of console loglevel set too low numerically (i.e. for high priority
messages only) on the sender.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 Documentation/networking/netconsole.txt |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index 1caa6c7..5962f45 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -44,11 +44,36 @@ WARNING: the default target ethernet setting uses the 
broadcast
 ethernet address to send packets, which can cause increased load on
 other systems on the same ethernet segment.
 
+TIP: some LAN switches may be configured to suppress ethernet broadcasts
+so it is advised to explicitly specify the remote agents' MAC addresses
+from the config parameters passed to netconsole.
+
+TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
+
+ ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
+
+TIP: in case the remote logging agent is on a separate LAN subnet than
+the sender, it is suggested to try specifying the MAC address of the
+default gateway (you may use /sbin/route -n to find it out) as the
+remote MAC address instead.
+
 NOTE: the network device (eth1 in the above case) can run any kind
 of other network traffic, netconsole is not intrusive. Netconsole
 might cause slight delays in other traffic if the volume of kernel
 messages is high, but should have no other impact.
 
+NOTE: if you find that the remote logging agent is not receiving or
+printing all messages from the sender, it is likely that you have set
+the console_loglevel parameter (on the sender) to only send high
+priority messages to the console. You can change this at runtime using:
+
+ dmesg -n 8
+
+or by specifying debug on the kernel command line at boot, to send
+all kernel messages to the console. A specific value for this parameter
+can also be set using the loglevel kernel boot option. See the
+dmesg(8) man page and Documentation/kernel-parameters.txt for details.
+
 Netconsole was designed to be as instantaneous as possible, to
 enable the logging of even the most critical kernel bugs. It works
 from IRQ contexts as well, and does not enable interrupts while
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 6/9] netconsole: Introduce netconsole_target

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[6/9] netconsole: Introduce netconsole_target

Introduce a wrapper structure over netpoll to represent logging targets
configured in netconsole. This will get extended with other members in
further patches.

This is done independent of the (to-be-introduced) NETCONSOLE_DYNAMIC
config option so that we're able to drastically cut down on the #ifdef
complexity of final netconsole.c. Also, struct netconsole_target would be
required for multiple targets support also, and not just dynamic
reconfigurability.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   36 +---
 1 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 75cb761..be15ca6 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -62,24 +62,35 @@ static int __init option_setup(char *opt)
 __setup(netconsole=, option_setup);
 #endif /* MODULE */
 
-static struct netpoll np = {
-   .name   = netconsole,
-   .dev_name   = eth0,
-   .local_port = 6665,
-   .remote_port= ,
-   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+/**
+ * struct netconsole_target - Represents a configured netconsole target.
+ * @np:The netpoll structure for this target.
+ */
+struct netconsole_target {
+   struct netpoll  np;
+};
+
+static struct netconsole_target default_target = {
+   .np = {
+   .name   = netconsole,
+   .dev_name   = eth0,
+   .local_port = 6665,
+   .remote_port= ,
+   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+   },
 };
 
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
int frag, left;
unsigned long flags;
+   struct netconsole_target *nt = default_target;
 
-   if (netif_running(np.dev)) {
+   if (netif_running(nt-np.dev)) {
local_irq_save(flags);
for (left = len; left;) {
frag = min(left, MAX_PRINT_CHUNK);
-   netpoll_send_udp(np, msg, frag);
+   netpoll_send_udp(nt-np, msg, frag);
msg += frag;
left -= frag;
}
@@ -96,17 +107,18 @@ static struct console netconsole = {
 static int __init init_netconsole(void)
 {
int err = 0;
+   struct netconsole_target *nt = default_target;
 
if (!strnlen(config, MAX_PARAM_LENGTH)) {
printk(KERN_INFO netconsole: not configured, aborting\n);
goto out;
}
 
-   err = netpoll_parse_options(np, config);
+   err = netpoll_parse_options(nt-np, config);
if (err)
goto out;
 
-   err = netpoll_setup(np);
+   err = netpoll_setup(nt-np);
if (err)
goto out;
 
@@ -119,8 +131,10 @@ out:
 
 static void __exit cleanup_netconsole(void)
 {
+   struct netconsole_target *nt = default_target;
+
unregister_console(netconsole);
-   netpoll_cleanup(np);
+   netpoll_cleanup(nt-np);
 }
 
 module_init(init_netconsole);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 7/9] netconsole: Introduce netconsole_netdev_notifier

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[7/9] netconsole: Introduce netconsole_netdev_notifier

To update fields of underlying netpoll structure at runtime on
corresponding NETDEV_CHANGEADDR or NETDEV_CHANGENAME notifications.

ioctl(SIOCSIFHWADDR or SIOCSIFNAME) could be used to change the
hardware/MAC address or name of the local interface that our netpoll
is attached to. Whenever this happens, netdev notifier chain is called
out with the NETDEV_CHANGEADDR or NETDEV_CHANGENAME event message. We
respond to that and update the local_mac or dev_name field of the struct
netpoll. This makes sense anyway, but is especially required for dynamic
netconsole because the netpoll structure's internal members become user
visible files when either sysfs or configfs are used. So this helps us
to keep up with the MAC address/name changes and keep values in struct
netpoll uptodate.

[ Note that ioctl(SIOCSIFADDR) to change IP address of interface at
  runtime is not handled (to update local_ip of netpoll) on purpose --
  some setups may set the local_ip to a private address, not necessary
  the actual IP address of the sender host, as presently allowed. ]

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index be15ca6..5557098 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -80,6 +80,33 @@ static struct netconsole_target default_target = {
},
 };
 
+/* Handle network interface device notifications */
+static int netconsole_netdev_event(struct notifier_block *this,
+  unsigned long event,
+  void *ptr)
+{
+   struct net_device *dev = ptr;
+   struct netconsole_target *nt = default_target;
+
+   if (nt-np.dev == dev) {
+   switch (event) {
+   case NETDEV_CHANGEADDR:
+   memcpy(nt-np.local_mac, dev-dev_addr, ETH_ALEN);
+   break;
+
+   case NETDEV_CHANGENAME:
+   strlcpy(nt-np.dev_name, dev-name, IFNAMSIZ);
+   break;
+   }
+   }
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block netconsole_netdev_notifier = {
+   .notifier_call  = netconsole_netdev_event,
+};
+
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
int frag, left;
@@ -122,6 +149,10 @@ static int __init init_netconsole(void)
if (err)
goto out;
 
+   err = register_netdevice_notifier(netconsole_netdev_notifier);
+   if (err)
+   goto out;
+
register_console(netconsole);
printk(KERN_INFO netconsole: network logging started\n);
 
@@ -134,6 +165,7 @@ static void __exit cleanup_netconsole(void)
struct netconsole_target *nt = default_target;
 
unregister_console(netconsole);
+   unregister_netdevice_notifier(netconsole_netdev_notifier);
netpoll_cleanup(nt-np);
 }
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 -mm 8/9] netconsole: Support multiple logging targets

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[8/9] netconsole: Support multiple logging targets

This patch introduces support for multiple targets, independent of
CONFIG_NETCONSOLE_DYNAMIC -- this is useful even in the default case and
(including the infrastructure introduced in previous patches) doesn't
really add too many bytes to module text. All the complexity (and size)
comes with the dynamic reconfigurability / userspace interface patch,
and so it's plausible users may want to keep this enabled but that
disabled (say to avoid a dependency on CONFIG_CONFIGFS_FS too).

Also update documentation to mention the use of ; separator to specify
multiple logging targets in the boot/module option string.

Brief overview:

We maintain a target_list (and corresponding lock). Get rid of the static
default_target and introduce allocation and release functions for our
netconsole_target objects (but keeping sure to preserve previous behaviour
such as default values). During init_netconsole(), ; is used as the
separator to identify multiple target specifications in the boot/module
option string. The target specifications are parsed and netpolls setup.
During exit, the target_list is torn down and all items released.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

---

 Documentation/networking/netconsole.txt |6 +
 drivers/net/netconsole.c|  171 +++---
 2 files changed, 137 insertions(+), 40 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index 5962f45..1aaa738 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -34,6 +34,12 @@ Examples:
 
  insmod netconsole netconsole=@/,@10.0.0.2/
 
+It also supports logging to multiple remote agents by specifying
+parameters for the multiple agents separated by semicolons and the
+complete string enclosed in quotes, thusly:
+
+ modprobe netconsole netconsole=@/,@10.0.0.2/;@/eth1,[EMAIL PROTECTED]/
+
 Built-in netconsole starts immediately after the TCP stack is
 initialized and attempts to bring up the supplied dev at the supplied
 address.
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 5557098..458c4d6 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -62,44 +62,93 @@ static int __init option_setup(char *opt)
 __setup(netconsole=, option_setup);
 #endif /* MODULE */
 
+/* Linked list of all configured targets */
+static LIST_HEAD(target_list);
+
+/* This needs to be a spinlock because write_msg() cannot sleep */
+static DEFINE_SPINLOCK(target_list_lock);
+
 /**
  * struct netconsole_target - Represents a configured netconsole target.
+ * @list:  Links this target into the target_list.
  * @np:The netpoll structure for this target.
  */
 struct netconsole_target {
+   struct list_headlist;
struct netpoll  np;
 };
 
-static struct netconsole_target default_target = {
-   .np = {
-   .name   = netconsole,
-   .dev_name   = eth0,
-   .local_port = 6665,
-   .remote_port= ,
-   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-   },
-};
+/* Allocate new target and setup netpoll for it */
+static struct netconsole_target *alloc_target(char *target_config)
+{
+   int err = -ENOMEM;
+   struct netconsole_target *nt;
+
+   /* Allocate and initialize with defaults */
+   nt = kzalloc(sizeof(*nt), GFP_KERNEL);
+   if (!nt) {
+   printk(KERN_ERR netconsole: failed to allocate memory\n);
+   goto fail;
+   }
+
+   nt-np.name = netconsole;
+   strlcpy(nt-np.dev_name, eth0, IFNAMSIZ);
+   nt-np.local_port = 6665;
+   nt-np.remote_port = ;
+   memset(nt-np.remote_mac, 0xff, ETH_ALEN);
+
+   /* Parse parameters and setup netpoll */
+   err = netpoll_parse_options(nt-np, target_config);
+   if (err)
+   goto fail;
+
+   err = netpoll_setup(nt-np);
+   if (err)
+   goto fail;
+
+   return nt;
+
+fail:
+   kfree(nt);
+   return ERR_PTR(err);
+}
+
+/* Cleanup netpoll for given target and free it */
+static void free_target(struct netconsole_target *nt)
+{
+   netpoll_cleanup(nt-np);
+   kfree(nt);
+}
 
 /* Handle network interface device notifications */
 static int netconsole_netdev_event(struct notifier_block *this,
   unsigned long event,
   void *ptr)
 {
+   unsigned long flags;
+   struct netconsole_target *nt;
struct net_device *dev = ptr;
-   struct netconsole_target *nt = default_target;
 
-   if (nt-np.dev == dev) {
-   switch (event) {
-   case NETDEV_CHANGEADDR:
-   memcpy(nt-np.local_mac, dev-dev_addr, ETH_ALEN

[PATCH v3 -mm 9/9] netconsole: Support dynamic reconfiguration using configfs

2007-07-29 Thread Satyam Sharma

From: Satyam Sharma [EMAIL PROTECTED]

[9/9] netconsole: Support dynamic reconfiguration using configfs

This patch introduces support for dynamic reconfiguration (adding, removing
and/or modifying parameters of netconsole targets at runtime) using a
userspace interface exported via configfs. Documentation is also updated
accordingly.

Issues and brief design overview:

(1) Kernel-initiated creation / destruction of kernel objects is not
possible with configfs -- the lifetimes of the config items is managed
exclusively from userspace. But netconsole must support boot/module params
too, and these are parsed in kernel and hence netpolls must be setup from
the kernel. Joel Becker suggested to separately manage the lifetimes of
the two kinds of netconsole_target objects -- those created via configfs
mkdir(2) from userspace and those specified from the boot/module option
string. This adds complexity and some redundancy here and also means that
boot/module param-created targets are not exposed through the configfs
namespace (and hence cannot be updated / destroyed dynamically). However,
this saves us from locking / refcounting complexities that would need to
be introduced in configfs to support kernel-initiated item creation /
destroy there.

(2) In configfs, item creation takes place in the call chain of the mkdir(2)
syscall in the driver subsystem. If we used an ioctl(2) to create / destroy
objects from userspace, the special userspace program is able to fill out
the structure to be passed into the ioctl and hence specify attributes such
as local interface that are required at the time we set up the netpoll.
For configfs, this information is not available at the time of mkdir(2).
So, we keep all newly-created targets (via configfs) disabled by default.
The user is expected to set various attributes appropriately (including the
local network interface if required) and then write(2) 1 to the enabled
attribute. Thus, netpoll_setup() is then called on the set parameters in the
context of _this_ write(2) on the enabled attribute itself. This design
enables the user to reconfigure existing netconsole targets at runtime to
be attached to newly-come-up interfaces that may not have existed when
netconsole was loaded or when the targets were actually created. All this
effectively enables us to get rid of custom ioctls.

(3) Ultra-paranoid configfs attribute show() and store() operations, with
sanity and input range checking, using only safe string primitives, and
compliant with the recommendations in Documentation/filesystems/sysfs.txt.

(4) A new function netpoll_print_options() is created in the netpoll API,
that just prints out the configured parameters for a netpoll structure.
netpoll_parse_options() is modified to use that and it is also exported to
be used from netconsole.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]

---

 Documentation/networking/netconsole.txt |   68 
 drivers/net/Kconfig |   10 +
 drivers/net/netconsole.c|  605 +--
 include/linux/netpoll.h |1 +
 net/core/netpoll.c  |   44 ++-
 5 files changed, 683 insertions(+), 45 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index 1aaa738..3c2f2b3 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -3,6 +3,10 @@ started by Ingo Molnar [EMAIL PROTECTED], 2001.09.17
 2.6 port and netpoll api by Matt Mackall [EMAIL PROTECTED], Sep 9 2003
 
 Please send bug reports to Matt Mackall [EMAIL PROTECTED]
+and Satyam Sharma [EMAIL PROTECTED]
+
+Introduction:
+=
 
 This module logs kernel printk messages over UDP allowing debugging of
 problem where disk logging fails and serial consoles are impractical.
@@ -13,6 +17,9 @@ the specified interface as soon as possible. While this 
doesn't allow
 capture of early kernel panics, it does capture most of the boot
 process.
 
+Sender and receiver configuration:
+==
+
 It takes a string configuration parameter netconsole in the
 following format:
 
@@ -46,6 +53,67 @@ address.
 
 The remote host can run either 'netcat -u -l -p port' or syslogd.
 
+Dynamic reconfiguration:
+
+
+Dynamic reconfigurability is a useful addition to netconsole that enables
+remote logging targets to be dynamically added, removed, or have their
+parameters reconfigured at runtime from a configfs-based userspace interface.
+[ Note that the parameters of netconsole targets that were specified/created
+from the boot/module option are not exposed via this interface, and hence
+cannot be modified dynamically. ]
+
+To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
+netconsole module (or kernel, if netconsole is built-in).
+
+Some examples follow (where configfs is mounted at the /sys/kernel/config

Re: [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-28 Thread Satyam Sharma

Hi,

On 7/29/07, Jesper Juhl [EMAIL PROTECTED] wrote:
 Hello,

 This patch makes sure we don't dereference a NULL pointer in
 drivers/net/usb/pegasus.c::write_bulk_callback() in the initial
 struct net_device *net = pegasus-net; assignment.
 The existing code checks if 'pegasus' is NULL and bails out if
 it is, so we better not touch that pointer until after that check.
 [...]
 diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
 index a05fd97..04cba6b 100644
 --- a/drivers/net/usb/pegasus.c
 +++ b/drivers/net/usb/pegasus.c
 @@ -768,11 +768,13 @@ done:
  static void write_bulk_callback(struct urb *urb)
  {
 pegasus_t *pegasus = urb-context;
 -   struct net_device *net = pegasus-net;
 +   struct net_device *net;

 if (!pegasus)
 return;

 +   net = pegasus-net;
 +
 if (!netif_device_present(net) || !netif_running(net))
 return;

Is it really possible that we're called into this function with
urb-context == NULL? If not, I'd suggest let's just get rid of
the check itself, instead.

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-28 Thread Satyam Sharma



On Sun, 29 Jul 2007, Domen Puncer wrote:

 On 29/07/07 00:02 +0200, Jesper Juhl wrote:
  Hi,
  
  Here's a small patch, prompted by a find by the Coverity checker, 
  that removes a potential NULL pointer dereference from 
  drivers/net/sb1000.c::sb1000_dev_ioctl().
  The checker spotted that we do a NULL test of 'dev', yet we 
  dereference the pointer prior to that check.
  This patch simply moves the dereference after the NULL test.
 
 But... it can't be called without a valid 'dev', no?
 A quick 'grep do_ioctl net/' confirms that all calls are in
 the form of 'dev-do_ioctl(dev, ...'.

Yup, I think so too ...


  @@ -991,11 +991,13 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
  struct ifreq *ifr, int cmd)
  short PID[4];
  int ioaddr[2], status, frequency;
  unsigned int stats[5];
  -   struct sb1000_private *lp = netdev_priv(dev);
  +   struct sb1000_private *lp;
   
  if (!(dev  dev-flags  IFF_UP))
  return -ENODEV;

I think we could get rid of the !dev check itself. Actually, the IFF_UP
check /also/ looks suspect to me for two reasons: (1) I remember Stephen
Hemminger once telling me dev-flags is legacy and unsafe, and one of
the netif_xxx() functions be used instead, and, (2) I wonder if we really
require the interface to be up and *running* when we do this ioctl.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC][PATCH v2 -mm 0/9] netconsole: Multiple targets and dynamic reconfigurability

2007-07-13 Thread Satyam Sharma

Hi Keiichi,

On Fri, 13 Jul 2007, KII Keiichi wrote:
 Hi Satyam,
 
  [0/9] netconsole: Multiple targets and dynamic reconfigurability
  
  This patchset is a rework of the original idea and patches posted by
  Keiichi Kii and Takayoshi Kochi at: http://lkml.org/lkml/2007/6/13/72
  
  This is v2 of the patchset, the previous version is available at:
  http://lkml.org/lkml/2007/7/4/107
  
  This is diffed against 2.6.22-rc6-mm1 + the earlier netpoll fixlet and the
  configfs cleanup patches (those are already merged in -mm AFAIK, and hence
  not reproduced here).
  
  [1/9] netconsole: Cleanups, codingstyle, prettyfication
  [2/9] netconsole: Remove bogus check
  [3/9] netconsole: Simplify boot/module option setup logic
  [4/9] netconsole: Add some useful tips to documentation
  [5/9] netconsole: Introduce netconsole_target
  [6/9] netconsole: Introduce netconsole_netdev_notifier
  [7/9] netconsole: Use netif_running() in write_msg()
  [8/9] netconsole: Support multiple logging targets
  [9/9] netconsole: Support dynamic reconfiguration using configfs
  
 
 I tested your v2 patches on the x86/IA64 architecture.
 There was no problem on my tests.
 
 Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

Ok, thanks! I'll send out the next series (with the various few
fixes and cleanups suggested this time) with your Sign-offs/Acks.

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] crash in 2.6.22-git2 sysctl_set_parent()

2007-07-13 Thread Satyam Sharma


Hi,

I'm totally confuzed by this patch ...

On 7/14/07, Linas Vepstas [EMAIL PROTECTED] wrote:


This is a patch ( bug report) for a crash in sysctl_set_parent()
in 2.6.22-git2.


Are you sure you saw this crash on 22-git2 (Linus' tree) ???


Problem: 2.6.22-git2 crashes with a stack trace
[c1d0fb00] c0067b4c .sysctl_set_parent+0x48/0x7c
[c1d0fb90] c0069b40 .register_sysctl_table+0x7c/0xf4
[c1d0fc30] c065e710 .devinet_init+0x88/0xb0
[c1d0fcc0] c065db74 .ip_rt_init+0x17c/0x32c
[c1d0fd70] c065deec .ip_init+0x10/0x34
[c1d0fdf0] c065e898 .inet_init+0x160/0x3dc
[c1d0fea0] c0630bc4 .kernel_init+0x204/0x3c8


Please post the full dmesg output, if possible.


A bit of poking around makes it clear what the problem is:
In sysctl_set_parent(), the for loop

   for (; table-ctl_name || table-procname; table++) {


Yup, but note that the table there is a struct ctl_table * ...


walks off the end of the table, and into garbage.  Basically,
this for-loop iterator expects all table arrays to be
null terminated.  However, net/ipv4/devinet.c statically
declares an array that is not null-terminated.


Whereas the not null-terminated array you're talking about
in devinet.c is a struct devinet_sysctl_table, whose *members*
are ctl_table structs.

Also note that all those members have already been declared
as arrays: struct ctl_table xxx[2]; *precisely* because the
second element (which is left un-initialized, hence will get
zeroed-out being static) is what will protect us from running
out into garbage in sysctl_set_parent().


The patch
below fixes that; it works for me.


Now I'm even more confuzed ...


Its somewhat conservative;
if one wishes to assume that the compiler will always zero out
the empty parts of the structure,


You can always safely assume that.

Off-topic, but note that unintialized static variables being automagically
initialized to zero / NULL is guaranteed by C (the language) itself, so
nothing left in the hands of compiler implementations here. [6.7.8:10]


then this pach can be shrunk
to one line: +  ctl_table   devinet_root_dir[3];


But that's pointless. It was already [2] for the reason I mentioned.


Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
[...]
 net/core/neighbour.c |4 
 net/ipv4/devinet.c   |7 ++-
 2 files changed, 10 insertions(+), 1 deletion(-)


Weirder. diffstat does not match the patch ...


Index: linux-2.6.22-git2/net/ipv4/devinet.c
===
--- linux-2.6.22-git2.orig/net/ipv4/devinet.c   2007-07-13 14:23:21.0 
-0500
+++ linux-2.6.22-git2/net/ipv4/devinet.c2007-07-13 14:24:15.0 
-0500
@@ -1424,7 +1424,7 @@ static struct devinet_sysctl_table {
ctl_table   devinet_dev[2];
ctl_table   devinet_conf_dir[2];
ctl_table   devinet_proto_dir[2];
-   ctl_table   devinet_root_dir[2];
+   ctl_table   devinet_root_dir[3];
 } devinet_sysctl = {
.devinet_vars = {
DEVINET_SYSCTL_COMPLEX_ENTRY(FORWARDING, forwarding,
@@ -1493,8 +1493,13 @@ static struct devinet_sysctl_table {


And Linus' latest -git doesn't even have this stuff at line 1493.


.data   = ipv4_devconf.loop,
.maxlen = sizeof(int),
.mode   = 0644,
+   .child  = 0x0,
.proc_handler   = proc_dointvec,
},
+   {
+   .ctl_name   = 0,
+   .procname   = 0,
+   },
},
 };


Well, as I said, I'm totally confuzed ...

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 -mm 9/9] netconsole: Support dynamic reconfiguration using configfs

2007-07-12 Thread Satyam Sharma

Hi,

On Thu, 12 Jul 2007, Keiichi KII wrote:
 Hi Satyam,
 
   struct netconsole_target {
  struct list_headlist;
  +#ifdef CONFIG_NETCONSOLE_DYNAMIC
  +   struct config_item  item;
  +   int enabled;
  +#endif
  struct netpoll  np;
   };
 
 If CONFIG_NETCONSOLE_DYNAMIC is unset, we can't access to the enabled 
 member.
 So, the compile errors occur because the following functions make use of 
 the above one.

Gargh, yes.

 Signed-off-by: Keiichi Kii [EMAIL PROTECTED]
 
 Index: mm/drivers/net/netconsole.c
 ===
 --- mm.orig/drivers/net/netconsole.c
 +++ mm/drivers/net/netconsole.c
 @@ -94,8 +94,8 @@ struct netconsole_target {
 struct list_headlist;
  #ifdef CONFIG_NETCONSOLE_DYNAMIC
 struct config_item  item;
 -   int enabled;
  #endif
 +   int enabled;
 struct netpoll  np;
  };

Yup, enabled should be out of CONFIG_NETCONSOLE_DYNAMIC. I'll include this
change in the next version as well.

Thanks,
Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 -mm 8/9] netconsole: Support multiple logging targets

2007-07-11 Thread Satyam Sharma

Hi Joel,

On Tue, 10 Jul 2007, Joel Becker wrote:
 On Wed, Jul 11, 2007 at 03:47:09AM +0530, Satyam Sharma wrote:
  Hmm, I put it in there because I expected that the user must have had
  at least one target configured (added to target_list) if he's got the
  module loaded/built-in (and netconsole registered), which is when this
  function would be triggered anyway.
 
   Loading the module won't happen without a config,

That's no longer the case (doesn't need to be enforced actually) now
with being able to add logging targets _after_ the module has loaded
and initialized / registered itself. It's plausible a user may simply
want to load netconsole without any netconsole= option, and then
specify/add targets later -- this wasn't possible previously, of course.

 but there's no
 requirement that a built-in netconsole has a kernel command-line
 argument.

The above is applicable for the built-in case too, with
CONFIG_NETCONSOLE_DYNAMIC. So now built-in netconsole will initialize
(configfs subsystem, etc) and register itself (register_console) even
if no netconsole= was provided, again for the same reason.

And I now notice that CONFIG_NETCONSOLE=y in the defconfig's of
most arch's. So I think I did get the common case wrong there ...
ok, I'll get rid of the unlikely() in that case.

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[RFC][PATCH v2 -mm 0/9] netconsole: Multiple targets and dynamic reconfigurability

2007-07-10 Thread Satyam Sharma

[0/9] netconsole: Multiple targets and dynamic reconfigurability

This patchset is a rework of the original idea and patches posted by
Keiichi Kii and Takayoshi Kochi at: http://lkml.org/lkml/2007/6/13/72

This is v2 of the patchset, the previous version is available at:
http://lkml.org/lkml/2007/7/4/107

This is diffed against 2.6.22-rc6-mm1 + the earlier netpoll fixlet and the
configfs cleanup patches (those are already merged in -mm AFAIK, and hence
not reproduced here).

[1/9] netconsole: Cleanups, codingstyle, prettyfication
[2/9] netconsole: Remove bogus check
[3/9] netconsole: Simplify boot/module option setup logic
[4/9] netconsole: Add some useful tips to documentation
[5/9] netconsole: Introduce netconsole_target
[6/9] netconsole: Introduce netconsole_netdev_notifier
[7/9] netconsole: Use netif_running() in write_msg()
[8/9] netconsole: Support multiple logging targets
[9/9] netconsole: Support dynamic reconfiguration using configfs

Changes since v1:
=

* Split / merge various patches in the patchset as suggested in reviews.

* Get rid of redundant id and dev_status fields.

* Use netif_running() instead of IFF_UP.

* Explain the boot/module option setup simplification and the netdev
  notifier patches in more detail (justify why they're good / required).

* Get rid of possible trailing newline from echo(1) in store_dev_name().

* Fix a panic bug -- now, when no boot/module option is specified, we
  skip only the part that creates and inserts param string-created
  targets. Rest of the initialization (notifier, configfs subsystem)
  is done as usual.

* Get rid of lock-skipping special-casing where possible.

* Use config_item_name() instead of empty_item.

* Introduce netpoll_print_options() in netpoll and use it.

* Style cleanups (reduce #ifdefs, indentation, whitespace fixes, etc).

* Miscellaneous fixes in documentation (as suggested in reviews).

About this patchset:


What?

Support multiple remote logging targets in netconsole. Also, ability to
dynamically add or remove targets or modify parameters (IP, port, remote
MAC address) of targets at runtime, from userspace, using configfs.

Why?

From Keiichi Kii's original post:

[...] current netconsole is not flexible. For example, if you want to change
ip address for logging agent, in the case of built-in netconsole, you can't
change config except for changing boot parameter and rebooting your system,
or in the case of module netconsole, you need to remove it and reload
with different parameters. [...] and we have been losing serial console
port with PCs and Servers.

(... and especially laptops, I would add.)

Kindly review and comment.

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

1 2 >

1 - 100 of 160 matches

Mail list logo