Re: [PATCH] bonding: fix race that causes invalid statistics

2008-01-28 Thread Chris Snook

Jay Vosburgh wrote:

Andy Gospodarek [EMAIL PROTECTED] wrote:


+   memcpy(stats,local_stats,sizeof(net_device_stats));


FYI, this generates a compiler error (missing the word struct
in here).  Other than not compiling, the patch seems reasonable.  I'll
fix it up and include it in the series I'll post tomorrow or so.

-J


Looks like Andy sent the draft version by mistake.  We did in fact test an 
otherwise identical patch, which did fix the bug.


-- Chris
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/26] atl1: refactor tx processing

2008-01-24 Thread Chris Snook

Jay Cliburn wrote:

On Tue, 22 Jan 2008 18:31:09 -0600
Jay Cliburn [EMAIL PROTECTED] wrote:


On Tue, 22 Jan 2008 04:58:17 -0500
Jeff Garzik [EMAIL PROTECTED] wrote:


[...]

for such a huge patch, this description is very tiny.  [describe]
what is refactored, and why.


Is this one any better?


This satisfies me.

Acked-by: Chris Snook [EMAIL PROTECTED]


From df475e2eea401f9dc18ca23dab538b99fb9e710c Mon Sep 17 00:00:00 2001
From: Jay Cliburn [EMAIL PROTECTED]
Date: Wed, 23 Jan 2008 21:36:36 -0600
Subject: [PATCH] atl1: simplify tx packet descriptor

The transmit packet descriptor consists of four 32-bit words, with word 3
upper bits overloaded depending upon the condition of its bits 3 and 4.
The driver currently duplicates all word 2 and some word 3 register bit
definitions unnecessarily and also uses a set of nested structures in its
definition of the TPD without good cause. This patch adds a lengthy
comment describing the TPD, eliminates duplicate TPD bit definitions,
and simplifies the TPD structure itself. It also expands the TSO check
to correctly handle custom checksum versus TSO processing using the revised
TPD definitions. Finally, shorten some variable names in the transmit
processing path to reduce line lengths, rename some variables to better
describe their purpose (e.g., nseg versus m), and add a comment or two
to better describe what the code is doing.

Signed-off-by: Jay Cliburn [EMAIL PROTECTED]

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/26] atl1: update initialization parameters

2008-01-22 Thread Chris Snook

Jay Cliburn wrote:

On Tue, 22 Jan 2008 04:56:11 -0500
Jeff Garzik [EMAIL PROTECTED] wrote:


[EMAIL PROTECTED] wrote:

From: Jay Cliburn [EMAIL PROTECTED]

Update initialization parameters to match the current vendor driver
version 1.2.40.2.


[...]

ACK without any better knowledge...  but is any addition insight 
available at all?


No, sorry Jeff.  I simply took the vendor's current driver and matched
his initialization settings.  I can only assume he discovered these
values through lab testing.

For this and the other conform to vendor driver patches in this set, I
thought it important to have the in-tree driver match the vendor driver
as closely as possible.  The primary motivations are (1) my belief that
he's in a better position to test the NIC, and (2) to be able to go to
him for assistance occasionally and not be rejected because of
significant differences between his and our drivers.


I don't think we should be doing this without justification.  From all the atl1 
and atl2 code I've looked at, I've gotten the impression that their driver 
development processes are extremely ad-hoc.  There is code in the Atheros 
version of atl2 that cannot *possibly* apply to that hardware and was just 
copied and pasted from atl1, just as much of atl1 was copied and pasted from 
e1000.  The fact that various versions have different magic numbers may simply 
mean they copied and pasted from different irrelevant and incorrect sources.


Our contacts at Atheros seem to be very good electrical engineers, so when they 
tell us that a certain setting should be changed to match particular properties 
of the hardware, I trust them.  They are not, however, experienced and 
disciplined kernel developers, so absent such justification I think we should 
stick with what we have, which has been improved and reviewed by people who 
*are* experienced and disciplined kernel developers.


We have at least as much to teach Atheros about writing kernel code as they have 
to teach us about their hardware.


-- Chris
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bonding : Monitoring of 4965 wireless card

2008-01-11 Thread Chris Snook

[EMAIL PROTECTED] wrote:

Hi,

I want to make a bond with my wireless card. The ipw driver create two
 interfaces (wlan0 and wmaster0). When i switch the rf_kill button,
 ifplug detect wlan0 unplugged but not wmaster0. If i down wlan0 (while
 rf_kil ), bonding detect the inactivity when i up the interface.

Have you some idea where is the problem? the driver or the miimon of
 the module?

my module parameters mode=1 miimon=100 primary eth0


miimon isn't meaningful for wmaster0.  I suggest you use arp monitoring instead. 
 See bonding.txt for details.


-- Chris
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] introducing the Atheros L2 Fast Ethernet driver

2007-12-04 Thread Chris Snook

Hey folks --

	I've begun cleaning up the atl2 vendor driver for merging.  It's very similar 
to the atl1 driver, and needs a lot of the same work, though I have already 
fixed the 64-bit DMA data corrupter that atl1 users remember so fondly.  Right 
now this is very raw, and there is a large amount of cosmetic work to do to make 
it more maintainable, but it should generally work, at least as well as the 
vendor driver does.


	While this is in pre-submission cleanup mode, the latest standalone source 
tarball and patch (currently against 2.6.23) will be available here:


http://people.redhat.com/csnook/atl2/

	If you have atl2 hardware, please give this a spin.  I plan to submit the 
driver for merging some time in the next month or so.  Questions, comments, 
patches welcome.


-- Chris
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] introducing the Atheros L2 Fast Ethernet driver

2007-12-04 Thread Chris Snook

Jeff Garzik wrote:

Chris Snook wrote:

Hey folks --

I've begun cleaning up the atl2 vendor driver for merging.  It's 
very similar to the atl1 driver, and needs a lot of the same work, 
though I have already fixed the 64-bit DMA data corrupter that atl1 
users remember so fondly.  Right now this is very raw, and there is a 
large amount of cosmetic work to do to make it more maintainable, but 
it should generally work, at least as well as the vendor driver does.


While this is in pre-submission cleanup mode, the latest 
standalone source tarball and patch (currently against 2.6.23) will be 
available here:


http://people.redhat.com/csnook/atl2/

If you have atl2 hardware, please give this a spin.  I plan to 
submit the driver for merging some time in the next month or so.  
Questions, comments, patches welcome.


Why not update atl1 for this new hardware?  Why is a new driver needed?

Jeff


They have some common hardware at the PCIe level, but at the ethernet level it's 
totally different (simple, low-power 100 Mb vs. Gb with all the bells and 
whistles).  We'd end up special-casing all over the place, because they have 
rather different capabilities, like, say, if we tried to merge ixgb and e1000 
(which have a lot of code in common).  It could be done, but I don't think it 
would be beneficial.  If Atheros decides to maintain the in-tree drivers 
directly, and wants to take it in that direction, I won't object, but from where 
I sit, on the other side of the planet from the guys with the chips hooked up to 
the hardware analyzers, separate drivers seems less painful.


-- Chris
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


atl1 developers wanted

2007-10-28 Thread Chris Snook

Greetings, netdev!

	Since the Linux Driver Project apparently has a glut of talent, I figured I'd 
ask for a little help with an existing driver.  As some of you may know, Jay 
Cliburn and I have been working on atl1 since last summer, and we've gotten the 
driver merged and stable.  It does the basic things one expects a gigE driver to 
do, and it no longer does the things one expects a gigE driver to *not* do, but 
there are some hardware features that we don't support very well, or at all.


	If you have an atl1 board and want to get your hands dirty, we could use some 
help with the following:


Performance features:

TSO -- it works, but it's really slow, so we currently disable it by default
LRO -- would be nice to have
NAPI -- would be nice to have
configurable interrupt coalescing -- we're not currently taking advantage of 
this

feature testing:

jumbo frames -- last I checked, I could open connections with large MTU, but 
we've done very little validation, benchmarking, or optimization


IPv6 -- seems to work, but the hardware doesn't support offloading for IPv6 
traffic, so we need to know about the performance to prioritize work on the 
aforementioned performance features


other:

make WoLAN work reliably

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: netif_rx will not free skb when I use ftp in kernel 2.6.22/2.6.21

2007-09-19 Thread Chris Snook

 wrote:

Dear netdev,
 
I have one problem with my NIC driver: whenever I use  
netif_receive_skb(skb) or netif_rx(skb) to indicate received packet,  
iperf does work, but ftp does not work, because the kernel seemes not 
kfree_skb(skb) when I use ftp.  What??s wrong with my NIC  driver? Any 
suggestions will be appreciated. Thanks in advance.
 
When I use iperf to test my NIC driver, it seems okay. When I use ftp, 
ftp client in kernel 2.6.22 and ftp server is ServUftp at windows2000, 
it does work too. But, When I use windows2003 server and its contained 
ftp server, my NIC driver can get file from that ftp server until the 
memory is used out.  When I get 3G file, then 3G memory will be used and 
not freed.
 
The source code is at : 
http://sourceforge.krugle.com/kse/files?project=%22Attansic%20L1%20Gigabit%20Ethernet%20driver%22 
http://sourceforge.krugle.com/kse/files?project=%22Attansic%20L1%20Gigabit%20Ethernet%20driver%22
Please check the function at_clean_rx_irq() and at_alloc_rx_buffers() in 
file at_main.c.
I dev_alloc_skb() and pci_map_single() skb in function 
at_alloc_rx_buffers(), pci_unmap_page() and netif_rx() in function 
at_clean_rx_irq(), CONFIG_AT_NAPI is not defined.
 
Thank csnook.
 
Best Regards,

Fei Zhang


Okay, I didn't know you were talking about the atl1 driver.  Are you 
using the
in-tree driver in 2.6.22, or the pre-merge driver on sourceforge, or the 
vendor

driver from Attansic/Atheros?

Also, can you test this again using the latest 2.6.23-rc7 kernel, with 
the atl1 driver

included in that kernel?

-- Chris


??2007-09-20??Chris Snook [EMAIL PROTECTED] ??

 wrote:
 Dear csnook,
 I have one problem in my NIC driver: whenever I use  
 netif_receive_skb(skb) or netif_rx(skb) to indicate received packet, 
 iperf does work, but ftp does not work,
 because the kernel seemes not kfree_skb(skb).  What??s wrong with my NIC 
 driver? Any suggestions will be appreciated. Thanks in advance.

 Best wishes,
 Fei Zhang
  
 Kernel v2.6.22

 Cpu : amd64
 Memory: 4G

I'm going to need way more information than that.  I'm also not certain I'm the 
best person to answer the question.  I suggest asking the question again, with a 
lot more info, to [EMAIL PROTECTED]  You're welcome to CC me as well. 
Please also post a link to the driver source so we can see what exactly you're 
doing.


-- Chris

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Document non-semantics of atomic_read() and atomic_set()

2007-09-10 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Unambiguously document the fact that atomic_read() and atomic_set()
do not imply any ordering or memory access, and that callers are
obligated to explicitly invoke barriers as needed to ensure that
changes to atomic variables are visible in all contexts that need
to see them.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- a/Documentation/atomic_ops.txt  2007-07-08 19:32:17.0 -0400
+++ b/Documentation/atomic_ops.txt  2007-09-10 19:02:50.0 -0400
@@ -12,7 +12,11 @@
 C integer type will fail.  Something like the following should
 suffice:
 
-   typedef struct { volatile int counter; } atomic_t;
+   typedef struct { int counter; } atomic_t;
+
+   Historically, counter has been declared volatile.  This is now
+discouraged.  See Documentation/volatile-considered-harmful.txt for the
+complete rationale.
 
The first operations to implement for atomic_t's are the
 initializers and plain reads.
@@ -42,6 +46,22 @@
 
 which simply reads the current value of the counter.
 
+*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
+
+Some architectures may choose to use the volatile keyword, barriers, or
+inline assembly to guarantee some degree of immediacy for atomic_read()
+and atomic_set().  This is not uniformly guaranteed, and may change in
+the future, so all users of atomic_t should treat atomic_read() and
+atomic_set() as simple C assignment statements that may be reordered or
+optimized away entirely by the compiler or processor, and explicitly
+invoke the appropriate compiler and/or memory barrier for each use case.
+Failure to do so will result in code that may suddenly break when used with
+different architectures or compiler optimizations, or even changes in
+unrelated code which changes how the compiler optimizes the section
+accessing atomic_t variables.
+
+*** YOU HAVE BEEN WARNED! ***
+
 Now, we move onto the actual atomic operation interfaces.
 
void atomic_add(int i, atomic_t *v);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] atl1: add CONFIG_ATL1_EXPERIMENTAL to kconfig

2007-09-07 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Introduce Kconfig ATL1_EXPERIMENTAL to separate mature code from less mature
code in the atl1 driver, and remove EXPERIMENTAL designation for ATL1.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
Acked-by: Jay Cliburn [EMAIL PROTECTED]

--- a/drivers/net/Kconfig   2007-09-04 10:12:38.0 -0400
+++ b/drivers/net/Kconfig   2007-09-04 10:37:34.0 -0400
@@ -2329,8 +2329,8 @@ config QLA3XXX
  will be called qla3xxx.
 
 config ATL1
-   tristate Attansic L1 Gigabit Ethernet support (EXPERIMENTAL)
-   depends on PCI  EXPERIMENTAL
+   tristate Attansic L1 Gigabit Ethernet support
+   depends on PCI
select CRC32
select MII
help
@@ -2339,6 +2339,16 @@ config ATL1
  To compile this driver as a module, choose M here.  The module
  will be called atl1.
 
+config ATL1_EXPERIMENTAL
+   bool atl1 experimental features
+   depends on ATL1  EXPERIMENTAL
+   help
+ This option enables various features that have not yet reached
+ the maturity of the rest of the atl1 driver.  The driver will
+ still work fine without this option enabled.
+
+ If unsure, say N.
+
 endif # NETDEV_1000
 
 #
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] atl1: wrap problematic optimizations in CONFIG_ATL1_EXPERIMENTAL

2007-09-07 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make certain problematic optimizations build-time configurable.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
Acked-by: Jay Cliburn [EMAIL PROTECTED]

--- a/drivers/net/atl1/atl1_main.c  2007-09-04 10:12:38.0 -0400
+++ b/drivers/net/atl1/atl1_main.c  2007-09-04 11:23:26.0 -0400
@@ -2203,22 +2203,26 @@ static int __devinit atl1_probe(struct p
struct net_device *netdev;
struct atl1_adapter *adapter;
static int cards_found = 0;
-   bool pci_using_64 = true;
+   bool pci_using_64 = false;
int err;
 
err = pci_enable_device(pdev);
if (err)
return err;
 
+#ifdef CONFIG_ATL1_EXPERIMENTAL
err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
+   if (!err) {
+   pci_using_64 = true;
+   goto dma_ok;
+   }
+#endif /* CONFIG_ATL1_EXPERIMENTAL */
+   err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
if (err) {
-   err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
-   if (err) {
-   dev_err(pdev-dev, no usable DMA configuration\n);
-   goto err_dma;
-   }
-   pci_using_64 = false;
+   dev_err(pdev-dev, no usable DMA configuration\n);
+   goto err_dma;
}
+dma_ok:
/* Mark all PCI regions associated with PCI device
 * pdev as being reserved by owner atl1_driver_name
 */
@@ -2294,11 +2298,13 @@ static int __devinit atl1_probe(struct p
netdev-features |= (NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX);
 
/*
-* FIXME - Until tso performance gets fixed, disable the feature.
+* TSO currently has performance problems,
+* so let's disable it by default.
 * Enable it with ethtool -K if desired.
 */
-   /* netdev-features |= NETIF_F_TSO; */
-
+#ifdef CONFIG_ATL1_EXPERIMENTAL
+   netdev-features |= NETIF_F_TSO;
+#endif
if (pci_using_64)
netdev-features |= NETIF_F_HIGHDMA;
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] atl1: Introduce CONFIG_ATL1_EXPERIMENTAL

2007-09-07 Thread Chris Snook

Jeff Garzik wrote:

Chris Snook wrote:
The atl1 driver is currently marked EXPERIMENTAL, because a few 
supposedly performance-enhancing features still have problems.  When 
these features are disabled, the driver is completely stable, fully 
functional, and performs well.


Patch 1/2 Creates the kconfig option CONFIG_ATL1_EXPERIMENTAL, and 
removes the EXPERIMENTAL designation from CONFIG_ATL1


Patch 2/2 Wraps some currently-disabled features in #ifdef 
CONFIG_ATL1_EXPERIMENTAL, so developers and testers can play with 
these features more easily, and distributions will still get a fast, 
stable driver with existing .config files.


We'll also be using this to wrap around various new features we'll be 
experimenting with in coming months.  Instead of using a half dozen 
different kconfig options for each of them, like some drivers do, 
we'll just use this, and make sure things are safe for everyone before 
we take them out of the experimental wrapper.


Well, I haven't received patch #2 yet, but in general a runtime switch 
(module option?) is greatly preferred.


Jeff


Okay, I'll think about how we want to parameterize this.  I don't want users 
expecting development options to be around forever.  I'll resubmit something 
once I have more of these experimental features ready to submit.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [atl1-devel] [PATCH 2/2] atl1: wrap problematic optimizations in CONFIG_ATL1_EXPERIMENTAL

2007-09-07 Thread Chris Snook

Luca wrote:

On 9/8/07, Chris Snook [EMAIL PROTECTED] wrote:

From: Chris Snook [EMAIL PROTECTED]

Make certain problematic optimizations build-time configurable.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
Acked-by: Jay Cliburn [EMAIL PROTECTED]

--- a/drivers/net/atl1/atl1_main.c  2007-09-04 10:12:38.0 -0400
+++ b/drivers/net/atl1/atl1_main.c  2007-09-04 11:23:26.0 -0400
@@ -2203,22 +2203,26 @@ static int __devinit atl1_probe(struct p
struct net_device *netdev;
struct atl1_adapter *adapter;
static int cards_found = 0;
-   bool pci_using_64 = true;
+   bool pci_using_64 = false;
int err;

err = pci_enable_device(pdev);
if (err)
return err;

+#ifdef CONFIG_ATL1_EXPERIMENTAL
err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
+   if (!err) {
+   pci_using_64 = true;
+   goto dma_ok;
+   }
+#endif /* CONFIG_ATL1_EXPERIMENTAL */


This is more like CONFIG_ATL1_PLEASE_KILL_MY_MACHINE; I really don't
see the problem with just limiting the DMA mask:
- if you don't have physical mem over the 4GB boundary limiting DMA
doesn't make any difference
- if you have more than 4GB of memory the machine won't survive long without it


Atheros is still working on this, and we plan to fix it.  64-bit DMA *should* 
work.  I just resubmitted your patch with the comment Jeff requested.  I still 
may want to revisit CONFIG_ATL1_EXPERIMENTAL soon when I start playing around 
with more features.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

2007-08-24 Thread Chris Snook

Denys Vlasenko wrote:

On Friday 24 August 2007 18:06, Christoph Lameter wrote:

On Fri, 24 Aug 2007, Satyam Sharma wrote:

But if people do seem to have a mixed / confused notion of atomicity
and barriers, and if there's consensus, then as I'd said earlier, I
have no issues in going with the consensus (eg. having API variants).
Linus would be more difficult to convince, however, I suspect :-)

The confusion may be the result of us having barrier semantics in
atomic_read. If we take that out then we may avoid future confusions.


I think better name may help. Nuke atomic_read() altogether.

n = atomic_value(x);// doesnt hint as strongly at reading as atomic_read
n = atomic_fetch(x);// yes, we _do_ touch RAM
n = atomic_read_uncached(x); // or this

How does that sound?


atomic_value() vs. atomic_fetch() should be rather unambiguous. 
atomic_read_uncached() begs the question of precisely which cache we are 
avoiding, and could itself cause confusion.


So, if I were writing atomic.h from scratch, knowing what I know now, I think 
I'd use atomic_value() and atomic_fetch().  The problem is that there are a lot 
of existing users of atomic_read(), and we can't write a script to correctly 
guess their intent.  I'm not sure auditing all uses of atomic_read() is really 
worth the comparatively miniscule benefits.


We could play it safe and convert them all to atomic_fetch(), or we could 
acknowledge that changing the semantics 8 months ago was not at all disastrous, 
and make them all atomic_value(), allowing people to use atomic_fetch() where 
they really care.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-21 Thread Chris Snook

David Miller wrote:

From: Linus Torvalds [EMAIL PROTECTED]
Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)

Ie a barrier() is likely _cheaper_ than the code generation downside 
from using volatile.


Assuming GCC were ever better about the code generation badness
with volatile that has been discussed here, I much prefer
we tell GCC this memory piece changed rather than every
piece of memory has changed which is what the barrier() does.

I happened to have been scanning a lot of assembler lately to
track down a gcc-4.2 miscompilation on sparc64, and the barriers
do hurt quite a bit in some places.  Instead of keeping unrelated
variables around cached in local registers, it reloads everything.


Moore's law is definitely working against us here.  Register counts, 
pipeline depths, core counts, and clock multipliers are all increasing 
in the long run.  At some point in the future, barrier() will be 
universally regarded as a hammer too big for most purposes.  Whether or 
not removing it now constitutes premature optimization is arguable, but 
I think we should allow such optimization to happen (or not happen) in 
architecture-dependent code, and provide a consistent API that doesn't 
require the use of such things in arch-independent code where it might 
turn into a totally superfluous performance killer depending on what 
hardware it gets compiled for.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Chris Snook

Linus Torvalds wrote:
So the only reason to add back volatile to the atomic_read() sequence is 
not to fix bugs, but to _hide_ the bugs better. They're still there, they 
are just a lot harder to trigger, and tend to be a lot subtler.


What about barrier removal?  With consistent semantics we could optimize a fair 
amount of code.  Whether or not that constitutes premature optimization is 
open to debate, but there's no question we could reduce our register wiping in 
some places.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures)

2007-08-20 Thread Chris Snook

Stefan Richter wrote:

Nick Piggin wrote:

Stefan Richter wrote:

Nick Piggin wrote:


I don't know why people would assume volatile of atomics. AFAIK, most
of the documentation is pretty clear that all the atomic stuff can be
reordered etc. except for those that modify and return a value.


Which documentation is there?

Documentation/atomic_ops.txt



For driver authors, there is LDD3.  It doesn't specifically cover
effects of optimization on accesses to atomic_t.

For architecture port authors, there is Documentation/atomic_ops.txt.
Driver authors also can learn something from that document, as it
indirectly documents the atomic_t and bitops APIs.


Semantics and Behavior of Atomic and Bitmask Operations is
pretty direct :)

Sure, it says that it's for arch maintainers, but there is no
reason why users can't make use of it.



Note, LDD3 page 238 says:  It is worth noting that most of the other
kernel primitives dealing with synchronization, such as spinlock and
atomic_t operations, also function as memory barriers.

I don't know about Linux 2.6.10 against which LDD3 was written, but
currently only _some_ atomic_t operations function as memory barriers.

Besides, judging from some posts in this thread, saying that atomic_t
operations dealt with synchronization may not be entirely precise.


atomic_t is often used as the basis for implementing more sophisticated 
synchronization mechanisms, such as rwlocks.  Whether or not they are designed 
for that purpose, the atomic_* operations are de facto synchronization primitives.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Chris Snook

Christoph Lameter wrote:

On Fri, 17 Aug 2007, Paul E. McKenney wrote:


On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:

On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:

gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)

I had totally forgotten that I'd already filed that bug more
than six years ago until they just closed yours as a duplicate
of mine :)

Good luck in getting it fixed!

Well, just got done re-opening it for the third time.  And a local
gcc community member advised me not to give up too easily.  But I
must admit that I am impressed with the speed that it was identified
as duplicate.

Should be entertaining!  ;-)


Right. ROTFL... volatile actually breaks atomic_t instead of making it 
safe. x++ becomes a register load, increment and a register store. Without 
volatile we can increment the memory directly. It seems that volatile 
requires that the variable is loaded into a register first and then 
operated upon. Understandable when you think about volatile being used to 
access memory mapped I/O registers where a RMW operation could be 
problematic.


So, if we want consistent behavior, we're pretty much screwed unless we use 
inline assembler everywhere?


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Chris Snook

Herbert Xu wrote:

On Mon, Aug 20, 2007 at 09:15:11AM -0400, Chris Snook wrote:

Linus Torvalds wrote:
So the only reason to add back volatile to the atomic_read() sequence is 
not to fix bugs, but to _hide_ the bugs better. They're still there, they 
are just a lot harder to trigger, and tend to be a lot subtler.
What about barrier removal?  With consistent semantics we could optimize a 
fair amount of code.  Whether or not that constitutes premature 
optimization is open to debate, but there's no question we could reduce our 
register wiping in some places.


If you've been reading all of Linus's emails you should be
thinking about adding memory barriers, and not removing
compiler barriers.

He's just told you that code of the kind

while (!atomic_read(cond))
;

do_something()

probably needs a memory barrier (not just compiler) so that
do_something() doesn't see stale cache content that occured
before cond flipped.


Such code generally doesn't care precisely when it gets the update, just that 
the update is atomic, and it doesn't loop forever.  Regardless, I'm convinced we 
just need to do it all in assembly.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Chris Snook

Ilpo Järvinen wrote:

On Thu, 16 Aug 2007, Herbert Xu wrote:


We've been through that already.  If it's a busy-wait it
should use cpu_relax. 


I looked around a bit by using some command lines and ended up wondering 
if these are equal to busy-wait case (and should be fixed) or not:


./drivers/telephony/ixj.c
6674:   while (atomic_read(j-DSPWrite)  0)
6675-   atomic_dec(j-DSPWrite);

...besides that, there are couple of more similar cases in the same file 
(with braces)...


atomic_dec() already has volatile behavior everywhere, so this is 
semantically okay, but this code (and any like it) should be calling 
cpu_relax() each iteration through the loop, unless there's a compelling 
reason not to.  I'll allow that for some hardware drivers (possibly this 
one) such a compelling reason may exist, but hardware-independent core 
subsystems probably have no excuse.


If the maintainer of this code doesn't see a compelling reason to add 
cpu_relax() in this loop, then it should be patched.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Chris Snook

Herbert Xu wrote:

On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:

Do you (or anyone else for that matter) have an example of this?

The only code I somewhat know, the ieee1394 subsystem, was perhaps
authored and is currently maintained with the expectation that each
occurrence of atomic_read actually results in a load operation, i.e. is
not optimized away.  This means all atomic_t (bus generation, packet and
buffer refcounts, and some other state variables)* and likewise all
atomic bitops in that subsystem.


Can you find an actual atomic_read code snippet there that is
broken without the volatile modifier?


A whole bunch of atomic_read uses will be broken without the volatile 
modifier once we start removing barriers that aren't needed if volatile 
behavior is guaranteed.


barrier() clobbers all your registers.  volatile atomic_read() only 
clobbers one register, and more often than not it's a register you 
wanted to clobber anyway.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Chris Snook

Ilpo Järvinen wrote:

On Thu, 16 Aug 2007, Herbert Xu wrote:


We've been through that already.  If it's a busy-wait it
should use cpu_relax. 


I looked around a bit by using some command lines and ended up wondering 
if these are equal to busy-wait case (and should be fixed) or not:


./drivers/telephony/ixj.c
6674:   while (atomic_read(j-DSPWrite)  0)
6675-   atomic_dec(j-DSPWrite);

...besides that, there are couple of more similar cases in the same file 
(with braces)...


atomic_dec() already has volatile behavior everywhere, so this is 
semantically okay, but this code (and any like it) should be calling 
cpu_relax() each iteration through the loop, unless there's a compelling 
reason not to.  I'll allow that for some hardware drivers (possibly this 
one) such a compelling reason may exist, but hardware-independent core 
subsystems probably have no excuse.


If the maintainer of this code doesn't see a compelling reason not to 
add cpu_relax() in this loop, then it should be patched.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Chris Snook

Herbert Xu wrote:

On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:

Can you find an actual atomic_read code snippet there that is
broken without the volatile modifier?
A whole bunch of atomic_read uses will be broken without the volatile 
modifier once we start removing barriers that aren't needed if volatile 
behavior is guaranteed.


Could you please cite the file/function names so we can
see whether removing the barrier makes sense?

Thanks,


At a glance, several architectures' implementations of smp_call_function() have 
one or more legitimate atomic_read() busy-waits that shouldn't be using 
CPU-relax.  Some of them do work in the loop.


I'm sure there are plenty more examples that various maintainers could find in 
their own code.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Chris Snook

Herbert Xu wrote:

Chris Snook [EMAIL PROTECTED] wrote:
Because atomic operations are generally used for synchronization, which requires 
volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
 Some forget to and we get bugs, because people assume that atomic_read() 
actually reads something, and atomic_write() actually writes something.  Worse, 
these are architecture-specific, even compiler version-specific bugs that are 
often difficult to track down.


I'm yet to see a single example from the current tree where
this patch series is the correct solution.  So far the only
example has been a buggy piece of code which has since been
fixed with a cpu_relax.


Part of the motivation here is to fix heisenbugs.  If I knew where they 
were, I'd be posting patches for them.  Unlike most bugs, where we want 
to expose them as obviously as possible, these can be extremely 
difficult to track down, and are often due to people assuming that the 
atomic_* operations have the same semantics they've historically had. 
Remember that until recently, all SMP architectures except s390 (which 
very few kernel developers outside of IBM, Red Hat, and SuSE do much 
work on) had volatile declarations for atomic_t.  Removing the volatile 
declarations from i386 and x86_64 may have created heisenbugs that won't 
manifest themselves until GCC 6.0 comes out and people start compiling 
kernels with -O5.  We should have consistent semantics for atomic_* 
operations.


The other motivation is to reduce the need for the barriers used to 
prevent/fix such problems which clobber all your registers, and instead 
force atomic_* operations to behave in the way they're actually used. 
After the (resubmitted) patchset is merged, we'll be able to remove a 
whole bunch of barriers, shrinking our source and our binaries, and 
improving performance.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Chris Snook

Christoph Lameter wrote:

On Thu, 9 Aug 2007, Chris Snook wrote:


This patchset makes the behavior of atomic_read uniform by removing the
volatile keyword from all atomic_t and atomic64_t definitions that currently
have it, and instead explicitly casts the variable as volatile in
atomic_read().  This leaves little room for creative optimization by the
compiler, and is in keeping with the principles behind volatile considered
harmful.


volatile is generally harmful even in atomic_read(). Barriers control
visibility and AFAICT things are fine.


But barriers force a flush of *everything* in scope, which we generally don't 
want.  On the other hand, we pretty much always want to flush atomic_* 
operations.  One way or another, we should be restricting the volatile behavior 
to the thing that needs it.  On most architectures, this patch set just moves 
that from the declaration, where it is considered harmful, to the use, where it 
is considered an occasional necessary evil.


See the resubmitted patchset, which also puts a cast in the atomic[64]_set 
operations.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-14 Thread Chris Snook

Satyam Sharma wrote:


On Tue, 14 Aug 2007, Christoph Lameter wrote:


On Thu, 9 Aug 2007, Chris Snook wrote:


This patchset makes the behavior of atomic_read uniform by removing the
volatile keyword from all atomic_t and atomic64_t definitions that currently
have it, and instead explicitly casts the variable as volatile in
atomic_read().  This leaves little room for creative optimization by the
compiler, and is in keeping with the principles behind volatile considered
harmful.

volatile is generally harmful even in atomic_read(). Barriers control
visibility and AFAICT things are fine.


Frankly, I don't see the need for this series myself either. Personal
opinion (others may differ), but I consider volatile to be a sad /
unfortunate wart in C (numerous threads on this list and on the gcc
lists/bugzilla over the years stand testimony to this) and if we _can_
steer clear of it, then why not -- why use this ill-defined primitive
whose implementation has often differed over compiler versions and
platforms? Granted, barrier() _is_ heavy-handed in that it makes the
optimizer forget _everything_, but then somebody did post a forget()
macro on this thread itself ...

[ BTW, why do we want the compiler to not optimize atomic_read()'s in
  the first place? Atomic ops guarantee atomicity, which has nothing
  to do with volatility -- users that expect volatility from
  atomic ops are the ones who must be fixed instead, IMHO. ]


Because atomic operations are generally used for synchronization, which requires 
volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
 Some forget to and we get bugs, because people assume that atomic_read() 
actually reads something, and atomic_write() actually writes something.  Worse, 
these are architecture-specific, even compiler version-specific bugs that are 
often difficult to track down.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-13 Thread Chris Snook

Paul Mackerras wrote:

Chris Snook writes:


I'll do this for the whole patchset.  Stay tuned for the resubmit.


Could you incorporate Segher's patch to turn atomic_{read,set} into
asm on powerpc?  Segher claims that using asm is really the only
reliable way to ensure that gcc does what we want, and he seems to
have a point.

Paul.


I haven't seen a patch yet.  I'm going to resubmit with inline volatile-cast 
atomic[64]_[read|set] on all architectures as a reference point, and if anyone 
wants to go and implement some of them in assembly, that's between them and the 
relevant arch maintainers.  I have no problem with (someone else) doing it in 
assembly.  I just don't think it's necessary and won't let it hold up the effort 
to get consistent behavior on all architectures.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-13 Thread Chris Snook

David Howells wrote:

Chris Snook [EMAIL PROTECTED] wrote:


cpu_relax() contains a barrier, so it should do the right thing.  For non-smp
architectures, I'm concerned about interacting with interrupt handlers.  Some
drivers do use atomic_* operations.


I'm not sure that actually answers my question.  Why not smp_rmb()?

David


I would assume because we want to waste time efficiently even on non-smp 
architectures, rather than frying the CPU or draining the battery.  Certain 
looping execution patterns can cause the CPU to operate above thermal design 
power.  I have fans on my workstation that only ever come on when running 
LINPACK, and that's generally memory bandwidth-bound.  Just imagine what happens 
when you're executing the same few non-serializing instructions in a tight loop 
without ever stalling on memory fetches, or being scheduled out.


If there's another reason, I'd like to hear it too, because I'm just guessing 
here.

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Disabling 64-bit DMA on atl1?

2007-08-13 Thread Chris Snook

Jeff --

	Can we please get Luca's patch merged for 2.6.23?  The bug is a data corrupter, 
and the workaround doesn't cost us much.  Both Jay and I have already acked it.


http://lkml.org/lkml/2007/6/25/293

	We'll do it the right way once Atheros tracks it down with hardware 
analyzers, but for now, let's fix the data corrupter.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-10 Thread Chris Snook

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 03:24:40PM -0400, Chris Snook wrote:

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 02:13:52PM -0400, Chris Snook wrote:

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
  If you're depending on volatile writes 
being visible to other CPUs, you're screwed either way, because the CPU 
can hold that data in cache as long as it wants before it writes it to 
memory.  When this finally does happen, it will happen atomically, 
which is all that atomic_set guarantees.  If you need to guarantee that 
the value is written to memory at a particular time in your execution 
sequence, you either have to read it from memory to force the compiler 
to store it first (and a volatile cast in atomic_read will suffice for 
this) or you have to use LOCK_PREFIX instructions which will invalidate 
remote cache lines containing the same variable.  This patch doesn't 
change either of these cases.

The case that it -can- change is interactions with interrupt handlers.
And NMI/SMI handlers, for that matter.
You have a point here, but only if you can guarantee that the interrupt 
handler is running on a processor sharing the cache that has the 
not-yet-written volatile value.  That implies a strictly non-SMP 
architecture.  At the moment, none of those have volatile in their 
declaration of atomic_t, so this patch can't break any of them.

This can also happen when using per-CPU variables.  And there are a
number of per-CPU variables that are either atomic themselves or are
structures containing atomic fields.
Accessing per-CPU variables in this fashion reliably already requires a 
suitable smp/non-smp read/write memory barrier.  I maintain that if we 
break anything with this change, it was really already broken, if less 
obviously.  Can you give a real or synthetic example of legitimate code 
that could break?


My main concern is actually the lack of symmetry -- I would expect
that an atomic_set() would have the same properties as atomic_read().
It is easy and cheap to provide them with similar properties, so why not?
Debugging even a single problem would consume far more time than simply
giving them corresponding semantics.

But you asked for examples.  These are synthetic, and of course legitimacy
is in the eye of the beholder.

1.  Watchdog variable.

atomic_t watchdog = ATOMIC_INIT(0);

...

int i;
while (!done) {

/* Do so stuff that doesn't take more than a few us. */
/* Could do atomic increment, but throughput penalty. */

i++;
atomic_set(watchdog, i);
}
do_something_with(watchdog);


/* Every so often on some other CPU... */

if ((new_watchdog = atomic_read(watchdog)) == old_watchdog)
die_horribly();
old_watchdog = new_watchdog;


If atomic_set() did not have volatile semantics, the compiler
would be within its rights optimizing it to simply get the
final value of i after exit from the loop.  This would cause
the watchdog check to fail spuriously.  Memory barriers are
not required in this case, because the CPU cannot hang onto
the value for very long -- we don't care about the exact value,
or about exact synchronization, but rather about whether or
not the value is changing.

In this (toy) example, one might replace the atomic_set() with
an atomic increment (though that might be too expensive in some
cases) or with something like:

atomic_set(watchdog, atomic_read(watchdog) + 1);

However, other cases might not permit this transformation,
for example, an existing heavily used API might take int rather
than atomic_t.

Some will no doubt argue that this example should use a
macro or an asm similar to the forget() asm put forward
elsewhere in this thread.

2.  Communicating both with interrupt handler and with other CPUs.
For example, data elements that are built up in a location visible
to interrupts and NMIs, and then added as a unit to a data structure
visible to other CPUs.  This more-realistic example is abbreviated
to the point of pointlessness as follows:

struct foo {
atomic_t a;
atomic_t b;
};

DEFINE_PER_CPU(struct foo *, staging) = NULL;

/* Create element in staging area. */

__get_cpu_var(staging) = kzalloc(sizeof(*p), GFP_WHATEVER);
if (__get_cpu_var(staging) == NULL)
die_horribly();
/* allocate an element of some per-CPU array, get the result in i */
atomic_set(__get_cpu_var(staging).a, i);
/* allocate another element of a per-CPU array, with result in i */
atomic_set(__get_cpu_var(staging).b, i);
rcu_assign_pointer(some_global_place

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-10 Thread Chris Snook

David Howells wrote:

Chris Snook [EMAIL PROTECTED] wrote:


To head off the criticism, I admit this is an oversimplification, and true
busy-waiters should be using cpu_relax(), which contains a barrier.


Why would you want to use cpu_relax()?  That's there to waste time efficiently,
isn't it?  Shouldn't you be using smp_rmb() or something like that?

David


cpu_relax() contains a barrier, so it should do the right thing.  For 
non-smp architectures, I'm concerned about interacting with interrupt 
handlers.  Some drivers do use atomic_* operations.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Chris Snook

Luck, Tony wrote:

+#define atomic_read(v) (*(volatile __s32 *)(v)-counter)
+#define atomic64_read(v)   (*(volatile __s64 *)(v)-counter)

 #define atomic_set(v,i)(((v)-counter) = (i))
 #define atomic64_set(v,i)  (((v)-counter) = (i))



Losing the volatile from the set variants definitely changes
the code generated.  Before the patch gcc would give us:

st4.rel [r37]=r9

after
st4 [r37]=r9

It is unclear whether anyone relies on (or even whether they should
rely on) the release semantics that are provided by the current
version of atomic.h.  But making this change would require an
audit of all the uses of atomic_set() to find an answer.

There is a more worrying difference in the generated code (this
from the ancient and venerable gcc 3.4.6 that is on my build
machine).   In rwsem_down_failed_common I see this change (after
disassembling vmlinux, I used sed to zap the low 32-bits of addresses
to make the diff manageable ... that's why the addresses all end
in ):

712868,712873c712913,712921
 a001: adds r16=-1,r30
 a001: [MII]   ld8.acq r33=[r32]
 a001: nop.i 0x0;;
 a001: add r42=r33,r16
 a001: [MMI]   mov.m ar.ccv=r33;;
 a001: cmpxchg8.acq r34=[r32],r42,ar.ccv
---

a001: adds r16=-1,r31
a001: [MMI]   ld4.acq r14=[r32];;
a001: nop.m 0x0
a001: sxt4 r34=r14
a001: [MMI]   nop.m 0x0;;
a001: nop.m 0x0
a001: add r15=r34,r16
a001: [MMI]   mov.m ar.ccv=r34;;
a001: cmpxchg8.acq r42=[r32],r15,ar.ccv


This code is probably from the rwsem_atomic_update(adjustment, sem) macro
which is cpp'd to atomic64_add_return().  It looks really bad that the new
code reads a 32-bit value and sign extends it rather than reading a 64-bit
value (but I'm perplexed as to why this patch provoked this change in the
generated code).

-Tony


That's distressing.  I'm about to resubmit with a volatile cast in 
atomic_set as well, since people expect that behavior and I've been 
shown a legitimate case where it could matter.  Does the assembly look 
right with that cast in atomic_set() as well?


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Chris Snook

Luck, Tony wrote:

Use atomic64_read to read an atomic64_t.


Thanks Andreas!

Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend
by your change to atomic_read().


I figured as much.  Thanks for confirming this.


With this applied together with shuffling the volatile from the
declaration to the usage (in both atomic_read() and atomic_set()
the generated code *almost* reverts to the original.

There are some differences where ld4 have turned into ld8 though.
Are these bugs in the use of atomic_add() and atomic_sub().  E.g.
the first of these changes is in: ipc/msg.c:freeque() where we have:

atomic_sub(msg-q_cbytes, msg_bytes);

Now the type of msg-q_cbytes is unsigned long ... so it seems a
poor idea to subtract such a large typed object from msg_bytes
which is a mere slip of an atomic_t.

Or is there some other type-wrangling that needs to happen in
include/asm-ia64/atomic.h?  There are a total of nineteen of
these ld4-ld8 transforms.


Possibly.  Either that or we've uncovered some latent bugs.  Maybe a 
combination of the two.  Can you list those 19 changes so we can 
evaluate them?  I'm told there were some *(volatile *) bugs fixed in gcc 
recently, so it's also possible your 3.4.6 is showing those.  I can test 
that on a more recent gcc on ia64 if it's inconvenient for you to do so 
on your test box.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Chris Snook

Linus Torvalds wrote:


On Fri, 10 Aug 2007, Luck, Tony wrote:

Here are the functions in which they occur in the object file. You
may have to chase down some inlining to find the function that
actually uses atomic_*().


Could you just make the atomic_read() and atomic_set() functions be 
inline functions instead?


That way you get nice compiler warnings when you pass the wrong kind of 
object around.  So


static void atomic_set(atomic_t *p, int value)
{
*(volatile int *)p-value = value;
}

static int atomic_read(atomic_t *p)
{
return *(volatile int *)p-value;
}

etc...


I'll do this for the whole patchset.  Stay tuned for the resubmit.

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chris Snook

Linus Torvalds wrote:


On Wed, 8 Aug 2007, Chris Snook wrote:

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.


I'd be *much* happier with atomic_read() doing the volatile instead.

The fact is, volatile on data structures is a bug. It's a wart in the C 
language. It shouldn't be used. 

Volatile accesses in *code* can be ok, and if we have atomic_read() 
expand to a *(volatile int *)(x)-value, then I'd be ok with that.


But marking data structures volatile just makes the compiler screw up 
totally, and makes code for initialization sequences etc much worse.


Linus


Fair enough.  Casting to (volatile int *) will give us the behavior people 
expect when using atomic_t without needing to use inefficient barriers.


While we have the hood up, should we convert all the atomic_t's to non-volatile 
and put volatile casts in all the atomic_reads?  I don't know enough about the 
various arches to say with confidence that those changes alone will preserve 
existing behavior.  We might need some arch-specific tweaking of the atomic 
operations.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chris Snook

Herbert Xu wrote:

Chris Snook [EMAIL PROTECTED] wrote:

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads


Such loops should always use something like cpu_relax() which comes
with a barrier.


If they're not doing anything, sure.  Plenty of loops actually do some sort of 
real work while waiting for their halt condition, possibly even work which is 
necessary for their halt condition to occur, and you definitely don't want to be 
doing cpu_relax() in this case.  On register-rich architectures you can do quite 
a lot of work without needing to reuse the register containing the result of the 
atomic_read().  Those are precisely the architectures where barrier() hurts the 
most.



of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally


Do you have an example of such a loop where performance is hurt by this?


Not handy.  Perhaps more interesting are cases where we access the same atomic_t 
twice in a hot path.  If we can remove some of those barriers, those hot paths 
will get faster.


Performance was only part of the motivation.  The IPVS bug was an example of how 
atomic_t is assumed (not always correctly) to work, and other recent discussions 
on this list have made it clear that most people assume atomic_read() actually 
reads something every time, and don't even think to consult the documentation 
until they find out the hard way that it does not.  I'm not saying we should 
encourage lazy programming, but in this case the assumption is reasonable 
because that's how people actually use atomic_t, and making this behavior 
uniform across all architectures makes it more convenient to do things the right 
way, which we should encourage.



The IPVS code that led to this patch was simply broken and has been
fixed to use cpu_relax().


I agree, busy-waiting should be done with cpu_relax(), if at all.  I'm more 
concerned about cases that are not busy-waiting, but could still get compiled 
with the same optimization.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chris Snook

Herbert Xu wrote:

On Thu, Aug 09, 2007 at 03:47:57AM -0400, Chris Snook wrote:
If they're not doing anything, sure.  Plenty of loops actually do some sort 
of real work while waiting for their halt condition, possibly even work 
which is necessary for their halt condition to occur, and you definitely 
don't want to be doing cpu_relax() in this case.  On register-rich 
architectures you can do quite a lot of work without needing to reuse the 
register containing the result of the atomic_read().  Those are precisely 
the architectures where barrier() hurts the most.


I have a problem with this argument.  The same loop could be
using a non-atomic as long as the updaters are serialised.  Would
you suggest that we turn such non-atomics into volatiles too?


No.  I'm simply saying that when people call atomic_read, they expect a read to 
occur.  When this doesn't happen, people get confused.  Merely referencing a 
variable doesn't carry the same connotation.


Anyway, I'm taking Linus's advice and putting the cast in atomic_read(), rather 
than the counter declaration itself.  Everything else uses __asm__ __volatile__, 
or calls atomic_read() with interrupts disabled.  This ensures that 
atomic_read() works as expected across all architectures, without the cruft the 
compiler generates when you declare the variable itself volatile.



Any loop that's waiting for an external halt condition either
has to schedule away (which is a barrier) or you'd be busy
waiting in which case you should use cpu_relax.


Not necessarily.  Some code uses atomic_t for a sort of lightweight semaphore. 
If your loop is actually doing real work, perhaps in a softirq handler 
negotiating shared resources with a hard irq handler, you're not busy-waiting.



Do you have an example where this isn't the case?


a) No, and that's sort of the point.  We shouldn't have to audit the whole 
kernel to find cases where a misleadingly-named function is doing what its users 
are not expecting.  If we can make it always do the right thing without any 
substantial drawbacks, we should.


b) Loops are just one case, which came to mind because of the IPVS bug recently 
discussed.  I recall seeing some scheduler code recently which does an 
atomic_read() twice on the same variable, with a barrier in between.  It's in a 
very hot path, so if we can remove that barrier, we save a bunch of register 
loads.  When you're context switching every microsecond in SCHED_RR, that really 
matters.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] ipvs: force read of atomic_t in while loop

2007-08-09 Thread Chris Snook

Michael Buesch wrote:

On Thursday 09 August 2007 02:15:33 Andi Kleen wrote:

On Wed, Aug 08, 2007 at 05:08:44PM -0400, Chris Snook wrote:

Heiko Carstens wrote:

On Wed, Aug 08, 2007 at 03:21:31AM -0700, David Miller wrote:

From: Heiko Carstens [EMAIL PROTECTED]
Date: Wed, 8 Aug 2007 11:33:00 +0200


Just saw this while grepping for atomic_reads in a while loops.
Maybe we should re-add the volatile to atomic_t. Not sure.

I think whatever the choice, it should be done consistently
on every architecture.

It's just asking for trouble if your arch does it differently from
every other.

Well..currently it's i386/x86_64 and s390 which have no volatile
in atomic_t. And yes, of course I agree it should be consistent
across all architectures. But it isn't.
Based on recent discussion, it's pretty clear that there's a lot of 
confusion about this.  A lot of people (myself included, until I thought 
about it long and hard) will reasonably assume that calling 
atomic_read() will actually read the value from memory.  Leaving out the 
volatile declaration seems like a pessimization to me.  If you force 
people to use barrier() everywhere they're working with atomic_t, it 
will force re-reads of all the non-atomic data in use as well, which 
will cause more memory fetches of things that generally don't need 
barrier().  That and it's a bug waiting to happen.


Andi -- your thoughts on the matter?

I also think readding volatile makes sense. An alternative would be
to stick an rmb() into atomic_read() -- that would also stop speculative reads.
Disadvantage is that it clobbers all memory, not just the specific value.

But you really have to complain to Linus (cc'ed). He came up
with the volatile removale change iirc.


Isn't it possible through some inline assembly trick
that only a certain variable has to be reloaded?
So we could define something like that:

#define reload_var(x) __asm__ __volatile__ (whatever, x)

I don't know inline assembly that much, but isn't it possible
with that to kind of fake-touch the variable, so the compiler
must reload it (and only it) to make sure it's up to date?


We can do it in C, like this:

-#define atomic_read(v) ((v)-counter)
+#define atomic_read(v) (*(volatile int *)(v)-counter)

By casting it volatile at the precise piece of code where we want to guarantee a 
read from memory, there's little risk of the compiler getting creative in its 
interpretation of the code.


Stay tuned for the patch set...

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-09 Thread Chris Snook
As recent discussions[1], and bugs[2] have shown, there is a great deal of
confusion about the expected behavior of atomic_read(), compounded by the
fact that it is not the same on all architectures.  Since users expect calls
to atomic_read() to actually perform a read, it is not desirable to allow
the compiler to optimize this away.  Requiring the use of barrier() in this
case is inefficient, since we only want to re-load the atomic_t variable,
not everything else in scope.

This patchset makes the behavior of atomic_read uniform by removing the
volatile keyword from all atomic_t and atomic64_t definitions that currently
have it, and instead explicitly casts the variable as volatile in
atomic_read().  This leaves little room for creative optimization by the
compiler, and is in keeping with the principles behind volatile considered
harmful.

Busy-waiters should still use cpu_relax(), but fast paths may be able to
reduce their use of barrier() between some atomic_read() calls.

-- Chris

1)  http://lkml.org/lkml/2007/7/1/52
2)  http://lkml.org/lkml/2007/8/8/122
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic[64]_t on alpha.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-alpha/atomic.h2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-alpha/atomic.h 2007-08-09 09:19:00.0 
-0400
@@ -14,18 +14,22 @@
 
 
 /*
- * Counter is volatile to make sure gcc doesn't try to be clever
- * and move things around on us. We need to use _exactly_ the address
- * the user gave us, not some alias that contains the same information.
+ * Make sure gcc doesn't try to be clever and move things around
+ * on us. We need to use _exactly_ the address the user gave us,
+ * not some alias that contains the same information.
  */
-typedef struct { volatile int counter; } atomic_t;
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { int counter; } atomic_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC_INIT(i) ( (atomic_t) { (i) } )
 #define ATOMIC64_INIT(i)   ( (atomic64_t) { (i) } )
 
-#define atomic_read(v) ((v)-counter + 0)
-#define atomic64_read(v)   ((v)-counter + 0)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter + 0)
+#define atomic64_read(v)   (*(volatile long *)(v)-counter + 0)
 
 #define atomic_set(v,i)((v)-counter = (i))
 #define atomic64_set(v,i)  ((v)-counter = (i))
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/24] make atomic_read() behave consistently on arm

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on arm.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-arm/atomic.h  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-arm/atomic.h   2007-08-09 06:30:40.0 
-0400
@@ -14,13 +14,17 @@
 #include linux/compiler.h
 #include asm/system.h
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
 
 #ifdef __KERNEL__
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 
 #if __LINUX_ARM_ARCH__ = 6
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/24] make atomic_read() behave consistently on blackfin

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on blackfin.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-blackfin/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-blackfin/atomic.h  2007-08-09 
06:36:15.0 -0400
@@ -18,7 +18,11 @@ typedef struct {
 } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v, i)   (((v)-counter) = i)
 
 static __inline__ void atomic_add(int i, atomic_t * v)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on frv.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-frv/atomic.h  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-frv/atomic.h   2007-08-09 06:41:48.0 
-0400
@@ -40,7 +40,12 @@ typedef struct {
 } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)-counter)
+
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v, i)   (((v)-counter) = (i))
 
 #ifndef CONFIG_FRV_OUTOFLINE_ATOMIC_OPS
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/24] make atomic_read() behave consistently on i386

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on i386, to ensure memory is actually read
each time.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-i386/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-i386/atomic.h  2007-08-09 06:47:47.0 
-0400
@@ -25,7 +25,7 @@ typedef struct { int counter; } atomic_t
  * 
  * Atomically reads the value of @v.
  */ 
-#define atomic_read(v) ((v)-counter)
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 
 /**
  * atomic_set - set atomic variable
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/24] make atomic_read() behave consistently on h8300

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on h8300.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-h8300/atomic.h2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-h8300/atomic.h 2007-08-09 06:45:43.0 
-0400
@@ -9,7 +9,11 @@
 typedef struct { int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v, i)   (((v)-counter) = i)
 
 #include asm/system.h
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic[64]_t on ia64.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-ia64/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-ia64/atomic.h  2007-08-09 06:53:48.0 
-0400
@@ -17,18 +17,18 @@
 #include asm/intrinsics.h
 #include asm/system.h
 
-/*
- * On IA-64, counter must always be volatile to ensure that that the
- * memory accesses are ordered.
- */
-typedef struct { volatile __s32 counter; } atomic_t;
-typedef struct { volatile __s64 counter; } atomic64_t;
+typedef struct { __s32 counter; } atomic_t;
+typedef struct { __s64 counter; } atomic64_t;
 
 #define ATOMIC_INIT(i) ((atomic_t) { (i) })
 #define ATOMIC64_INIT(i)   ((atomic64_t) { (i) })
 
-#define atomic_read(v) ((v)-counter)
-#define atomic64_read(v)   ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile __s32 *)(v)-counter)
+#define atomic64_read(v)   (*(volatile __s64 *)(v)-counter)
 
 #define atomic_set(v,i)(((v)-counter) = (i))
 #define atomic64_set(v,i)  (((v)-counter) = (i))
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/24] make atomic_read() behave consistently on m32r

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on m32r.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-m32r/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-m32r/atomic.h  2007-08-09 06:55:53.0 
-0400
@@ -22,7 +22,7 @@
  * on us. We need to use _exactly_ the address the user gave us,
  * not some alias that contains the same information.
  */
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
 
@@ -32,7 +32,7 @@ typedef struct { volatile int counter; }
  *
  * Atomically reads the value of @v.
  */
-#define atomic_read(v) ((v)-counter)
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 
 /**
  * atomic_set - set atomic variable
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/24] make atomic_read() behave consistently on m68knommu

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on m68knommu.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-m68knommu/atomic.h2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-m68knommu/atomic.h 2007-08-09 
07:00:24.0 -0400
@@ -15,7 +15,11 @@
 typedef struct { int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v, i)   (((v)-counter) = i)
 
 static __inline__ void atomic_add(int i, atomic_t *v)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/24] make atomic_read() behave consistently on avr32

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on avr32.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-avr32/atomic.h2007-08-08 
17:48:52.0 -0400
+++ linux-2.6.23-rc2/include/asm-avr32/atomic.h 2007-08-09 06:33:39.0 
-0400
@@ -16,10 +16,14 @@
 
 #include asm/system.h
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 #define ATOMIC_INIT(i)  { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v, i)   (((v)-counter) = i)
 
 /*
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 12/24] make atomic_read() behave consistently on m68k

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on m68k.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-m68k/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-m68k/atomic.h  2007-08-09 06:58:24.0 
-0400
@@ -16,7 +16,11 @@
 typedef struct { int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v, i)   (((v)-counter) = i)
 
 static inline void atomic_add(int i, atomic_t *v)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 13/24] make atomic_read() behave consistently on mips

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic[64]_t on mips.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-mips/atomic.h 2007-08-08 
17:48:53.0 -0400
+++ linux-2.6.23-rc2/include/asm-mips/atomic.h  2007-08-09 07:02:50.0 
-0400
@@ -20,7 +20,7 @@
 #include asm/war.h
 #include asm/system.h
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #define ATOMIC_INIT(i){ (i) }
 
@@ -30,7 +30,7 @@ typedef struct { volatile int counter; }
  *
  * Atomically reads the value of @v.
  */
-#define atomic_read(v) ((v)-counter)
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 
 /*
  * atomic_set - set atomic variable
@@ -404,7 +404,7 @@ static __inline__ int atomic_add_unless(
 
 #ifdef CONFIG_64BIT
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i){ (i) }
 
@@ -413,7 +413,7 @@ typedef struct { volatile long counter; 
  * @v: pointer of type atomic64_t
  *
  */
-#define atomic64_read(v)   ((v)-counter)
+#define atomic64_read(v)   (*(volatile long *)(v)-counter)
 
 /*
  * atomic64_set - set atomic variable
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 14/24] make atomic_read() behave consistently on parisc

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic[64]_t on parisc.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-parisc/atomic.h   2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-parisc/atomic.h2007-08-09 
07:11:38.0 -0400
@@ -128,7 +128,7 @@ __cmpxchg(volatile void *ptr, unsigned l
  * Cache-line alignment would conflict with, for example, linux/module.h
  */
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 /* It's possible to reduce all atomic operations to either
  * __atomic_add_return, atomic_set and atomic_read (the latter
@@ -157,9 +157,13 @@ static __inline__ void atomic_set(atomic
_atomic_spin_unlock_irqrestore(v, flags);
 }
 
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
 static __inline__ int atomic_read(const atomic_t *v)
 {
-   return v-counter;
+   return (*(volatile int *)(v)-counter);
 }
 
 /* exported interface */
@@ -227,7 +231,7 @@ static __inline__ int atomic_add_unless(
 
 #ifdef CONFIG_64BIT
 
-typedef struct { volatile s64 counter; } atomic64_t;
+typedef struct { s64 counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i) ((atomic64_t) { (i) })
 
@@ -258,7 +262,7 @@ atomic64_set(atomic64_t *v, s64 i)
 static __inline__ s64
 atomic64_read(const atomic64_t *v)
 {
-   return v-counter;
+   return (*(volatile s64 *)(v)-counter);
 }
 
 #define atomic64_add(i,v)  ((void)(__atomic64_add_return( ((s64)i),(v
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 15/24] make atomic_read() behave consistently on powerpc

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic[64]_t on powerpc.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-powerpc/atomic.h  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-powerpc/atomic.h   2007-08-09 
07:15:21.0 -0400
@@ -5,7 +5,7 @@
  * PowerPC atomic operations
  */
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #ifdef __KERNEL__
 #include linux/compiler.h
@@ -15,7 +15,11 @@ typedef struct { volatile int counter; }
 
 #define ATOMIC_INIT(i) { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v,i)(((v)-counter) = (i))
 
 static __inline__ void atomic_add(int a, atomic_t *v)
@@ -240,11 +244,11 @@ static __inline__ int atomic_dec_if_posi
 
 #ifdef __powerpc64__
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)   { (i) }
 
-#define atomic64_read(v)   ((v)-counter)
+#define atomic64_read(v)   (*(volatile long *)(v)-counter)
 #define atomic64_set(v,i)  (((v)-counter) = (i))
 
 static __inline__ void atomic64_add(long a, atomic64_t *v)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 16/24] make atomic_read() behave consistently on s390

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic[64]_read() volatile on s390, to ensure memory is actually read
each time.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-s390/atomic.h 2007-08-08 
17:48:53.0 -0400
+++ linux-2.6.23-rc2/include/asm-s390/atomic.h  2007-08-09 07:18:56.0 
-0400
@@ -67,7 +67,11 @@ typedef struct {
 
 #endif /* __GNUC__ */
 
-#define atomic_read(v)  ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v)  (*(volatile int *)(v)-counter)
 #define atomic_set(v,i) (((v)-counter) = (i))
 
 static __inline__ int atomic_add_return(int i, atomic_t * v)
@@ -182,7 +186,7 @@ typedef struct {
 
 #endif /* __GNUC__ */
 
-#define atomic64_read(v)  ((v)-counter)
+#define atomic64_read(v)  (*(volatile long long *)(v)-counter)
 #define atomic64_set(v,i) (((v)-counter) = (i))
 
 static __inline__ long long atomic64_add_return(long long i, atomic64_t * v)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 17/24] make atomic_read() behave consistently on sh64

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on sh64.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-sh64/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-sh64/atomic.h  2007-08-09 07:22:50.0 
-0400
@@ -19,11 +19,15 @@
  *
  */
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) ( (atomic_t) { (i) } )
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v,i)((v)-counter = (i))
 
 #include asm/system.h
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 18/24] make atomic_read() behave consistently on sh

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on sh.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-sh/atomic.h   2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-sh/atomic.h2007-08-09 07:21:33.0 
-0400
@@ -7,11 +7,15 @@
  *
  */
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) ( (atomic_t) { (i) } )
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v,i)((v)-counter = (i))
 
 #include linux/compiler.h
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 19/24] make atomic_read() behave consistently on sparc64

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic[64]_t on sparc64.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-sparc64/atomic.h  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-sparc64/atomic.h   2007-08-09 
07:48:01.0 -0400
@@ -11,14 +11,18 @@
 #include linux/types.h
 #include asm/system.h
 
-typedef struct { volatile int counter; } atomic_t;
-typedef struct { volatile __s64 counter; } atomic64_t;
+typedef struct { int counter; } atomic_t;
+typedef struct { __s64 counter; } atomic64_t;
 
 #define ATOMIC_INIT(i) { (i) }
 #define ATOMIC64_INIT(i)   { (i) }
 
-#define atomic_read(v) ((v)-counter)
-#define atomic64_read(v)   ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
+#define atomic64_read(v)   (*(volatile __s64 *)(v)-counter)
 
 #define atomic_set(v, i)   (((v)-counter) = i)
 #define atomic64_set(v, i) (((v)-counter) = i)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 20/24] make atomic_read() behave consistently on sparc

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on sparc.
Leave atomic24_t alone, since it's only used by sparc-specific code.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-sparc/atomic.h2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-sparc/atomic.h 2007-08-09 07:29:31.0 
-0400
@@ -13,7 +13,7 @@
 
 #include linux/types.h
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #ifdef __KERNEL__
 
@@ -61,7 +61,11 @@ extern int atomic_cmpxchg(atomic_t *, in
 extern int atomic_add_unless(atomic_t *, int, int);
 extern void atomic_set(atomic_t *, int);
 
-#define atomic_read(v)  ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v)  (*(volatile int *)(v)-counter)
 
 #define atomic_add(i, v)   ((void)__atomic_add_return( (int)(i), (v)))
 #define atomic_sub(i, v)   ((void)__atomic_add_return(-(int)(i), (v)))
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 21/24] make atomic_read() behave consistently on v850

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on v850.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-v850/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-v850/atomic.h  2007-08-09 07:50:15.0 
-0400
@@ -27,7 +27,11 @@ typedef struct { int counter; } atomic_t
 
 #ifdef __KERNEL__
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v,i)(((v)-counter) = (i))
 
 static inline int atomic_add_return (int i, volatile atomic_t *v)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/24] make atomic_read() behave consistently on cris

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on cris.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-cris/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-cris/atomic.h  2007-08-09 06:38:28.0 
-0400
@@ -11,11 +11,15 @@
  * resource counting etc..
  */
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #define ATOMIC_INIT(i)  { (i) }
 
-#define atomic_read(v) ((v)-counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 #define atomic_set(v,i) (((v)-counter) = (i))
 
 /* These should be written in asm but we do it in C for now. */
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 22/24] make atomic_read() behave consistently on x86_64

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() consistent with atomic64_read() and other architectures
on x86_64.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-x86_64/atomic.h   2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-x86_64/atomic.h2007-08-09 
07:51:40.0 -0400
@@ -32,7 +32,7 @@ typedef struct { int counter; } atomic_t
  * 
  * Atomically reads the value of @v.
  */ 
-#define atomic_read(v) ((v)-counter)
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 
 /**
  * atomic_set - set atomic variable
@@ -206,7 +206,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)   { (i) }
 
@@ -217,7 +217,7 @@ typedef struct { volatile long counter; 
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-#define atomic64_read(v)   ((v)-counter)
+#define atomic64_read(v)   (*(volatile long *)(v)-counter)
 
 /**
  * atomic64_set - set atomic64 variable
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 23/24] make atomic_read() behave consistently on xtensa

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Purify volatile use for atomic_t on xtensa.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/include/asm-xtensa/atomic.h   2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-xtensa/atomic.h2007-08-09 
07:54:59.0 -0400
@@ -15,7 +15,7 @@
 
 #include linux/stringify.h
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #ifdef __KERNEL__
 #include asm/processor.h
@@ -47,7 +47,7 @@ typedef struct { volatile int counter; }
  *
  * Atomically reads the value of @v.
  */
-#define atomic_read(v) ((v)-counter)
+#define atomic_read(v) (*(volatile int *)(v)-counter)
 
 /**
  * atomic_set - set atomic variable
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Paul E. McKenney wrote:

Why not the same access-once semantics for atomic_set() as
for atomic_read()?  As this patch stands, it might introduce
architecture-specific compiler-induced bugs due to the fact that
atomic_set() used to imply volatile behavior but no longer does.


When we make the volatile cast in atomic_read(), we're casting an rvalue to 
volatile.  This unambiguously tells the compiler that we want to re-load that 
register from memory.  What's volatile behavior for an lvalue?  A write to an 
lvalue already implies an eventual write to memory, so this would be a no-op. 
Maybe you'll write to the register a few times before flushing it to memory, but 
it will happen eventually.  With an rvalue, there's no guarantee that it will 
*ever* load from memory, which is what volatile fixes.


I think what you have in mind is LOCK_PREFIX behavior, which is not the purpose 
of atomic_set.  We use LOCK_PREFIX in the inline assembly for the atomic_* 
operations that read, modify, and write a value, only because it is necessary to 
perform that entire transaction atomically.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-09 Thread Chris Snook

Arnd Bergmann wrote:

On Thursday 09 August 2007, Chris Snook wrote:

This patchset makes the behavior of atomic_read uniform by removing the
volatile keyword from all atomic_t and atomic64_t definitions that currently
have it, and instead explicitly casts the variable as volatile in
atomic_read().  This leaves little room for creative optimization by the
compiler, and is in keeping with the principles behind volatile considered
harmful.



Just an idea: since all architectures already include asm-generic/atomic.h,
why not move the definitions of atomic_t and atomic64_t, as well as anything
that does not involve architecture specific inline assembly into the generic
header?

Arnd 


a) chicken and egg: asm-generic/atomic.h depends on definitions in asm/atomic.h

If you can find a way to reshuffle the code and make it simpler, I personally am 
all for it.  I'm skeptical that you'll get much to show for the effort.


b) The definitions aren't precisely identical between all architectures, so it 
would be a mess of special cases, which gets us right back to where we are now.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 24/24] document volatile atomic_read() behavior

2007-08-09 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Update atomic_ops.txt to reflect the newly consistent behavior of
atomic_read(), and to note that volatile (in declarations) is now
considered harmful.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.23-rc2-orig/Documentation/atomic_ops.txt  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/Documentation/atomic_ops.txt   2007-08-09 
08:24:32.0 -0400
@@ -12,7 +12,7 @@
 C integer type will fail.  Something like the following should
 suffice:
 
-   typedef struct { volatile int counter; } atomic_t;
+   typedef struct { int counter; } atomic_t;
 
The first operations to implement for atomic_t's are the
 initializers and plain reads.
@@ -38,9 +38,17 @@
 
 Next, we have:
 
-   #define atomic_read(v)  ((v)-counter)
+   #define atomic_read(v)  (*(volatile int *)(v)-counter)
 
-which simply reads the current value of the counter.
+which reads the counter as though it were volatile.  This prevents the
+compiler from optimizing away repeated atomic_read() invocations without
+requiring a more expensive barrier().  Historically this has been
+accomplished by declaring the counter itself to be volatile, but the
+ambiguity of the C standard on the semantics of volatile make this practice
+vulnerable to overly creative interpretation by compilers.  Explicit
+casting in atomic_read() ensures consistent behavior across architectures
+and compilers.  Even with this convenience in atomic_read(), busy-waiters
+should call cpu_relax().
 
 Now, we move onto the actual atomic operation interfaces.
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 10:53:14AM -0400, Chris Snook wrote:

Paul E. McKenney wrote:

Why not the same access-once semantics for atomic_set() as
for atomic_read()?  As this patch stands, it might introduce
architecture-specific compiler-induced bugs due to the fact that
atomic_set() used to imply volatile behavior but no longer does.
When we make the volatile cast in atomic_read(), we're casting an rvalue to 
volatile.  This unambiguously tells the compiler that we want to re-load 
that register from memory.  What's volatile behavior for an lvalue?


I was absolutely -not- suggesting volatile behavior for lvalues.

Instead, I am asking for volatile behavior from an -rvalue-.  In the
case of atomic_read(), it is the atomic_t being read from.  In the case
of atomic_set(), it is the atomic_t being written to.  As suggested in
my previous email:

#define atomic_set(v,i) ((*(volatile int *)(v)-counter) = (i))
#define atomic64_set(v,i)   ((*(volatile long *)(v)-counter) = (i))


That looks like a volatile lvalue to me.  I confess I didn't exactly ace 
compilers.  Care to explain this?



Again, the architectures that used to have their counter declared
as volatile will lose volatile semantics on atomic_set() with your
patch, which might result in bugs due to overly imaginative compiler
optimizations.  The above would prevent any such bugs from appearing.

  A 
write to an lvalue already implies an eventual write to memory, so this 
would be a no-op. Maybe you'll write to the register a few times before 
flushing it to memory, but it will happen eventually.  With an rvalue, 
there's no guarantee that it will *ever* load from memory, which is what 
volatile fixes.


I think what you have in mind is LOCK_PREFIX behavior, which is not the 
purpose of atomic_set.  We use LOCK_PREFIX in the inline assembly for the 
atomic_* operations that read, modify, and write a value, only because it 
is necessary to perform that entire transaction atomically.


No LOCK_PREFIX, thank you!!!  I just want to make sure that the compiler
doesn't push the store down out of a loop, split the store, allow the
store to happen twice (e.g., to allow different code paths to be merged),
and all the other tricks that the C standard permits compilers to pull.


We can't have split stores because we don't use atomic64_t on 32-bit 
architectures.  volatile won't save you from pushing stores out of loops unless 
you're also doing reads.  This is why we use reads to flush writes to mmio 
registers.  In this case, an atomic_read() with volatile in it will suffice. 
Storing twice is perfectly legal here, though it's unlikely that an optimizing 
compiler smart enough to create the problem we're addressing would ever do that.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Segher Boessenkool wrote:
We can't have split stores because we don't use atomic64_t on 32-bit 
architectures.


That's not true; the compiler is free to split all stores
(and reads) from memory however it wants.  It is debatable
whether volatile would prevent this as well, certainly
it is unsafe if you want to be portable.  GCC will do its
best to not split volatile memory accesses, but bugs in
this area do happen a lot (because the compiler code for
volatile isn't as well exercised as most other compiler
code, and because it is simply a hard subject; and there
is no real formalised model for what GCC should do).

The only safe way to get atomic accesses is to write
assembler code.  Are there any downsides to that?  I don't
see any.


The assumption that aligned word reads and writes are atomic, and that words are 
aligned unless explicitly packed otherwise, is endemic in the kernel.  No sane 
compiler violates this assumption.  It's true that we're not portable to insane 
compilers after this patch, but we never were in the first place.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 24/24] document volatile atomic_read() behavior

2007-08-09 Thread Chris Snook

Segher Boessenkool wrote:

Historically this has been
+accomplished by declaring the counter itself to be volatile, but the
+ambiguity of the C standard on the semantics of volatile make this 
practice

+vulnerable to overly creative interpretation by compilers.


It's even worse when accessing through a volatile casted pointer;
see for example the recent(*) GCC bugs in that area.

(*) Well, not _all_ that recent.  No one should be using the 3.x
series anymore, right?


Explicit
+casting in atomic_read() ensures consistent behavior across 
architectures

+and compilers.


Even modulo compiler bugs, what makes you believe that?


When you declare a variable volatile, you don't actually tell the compiler where 
you want to override its default optimization behavior, giving it some freedom 
to guess your intentions incorrectly.  When you put the cast on the data access 
itself, there is no question about precisely where in the code you want to 
override the compiler's default optimization behavior.  If the compiler doesn't 
do what you want with a volatile declaration, it might have a plausible excuse 
in the ambiguity of the C standard.  If the compiler doesn't do what you want in 
a cast specific to a single dereference, it's just plain broken.  We try to be 
compatible with plausibly correct compilers, but if they're completely broken, 
we're screwed no matter what.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Paul E. McKenney wrote:

The compiler is within its rights to read a 32-bit quantity 16 bits at
at time, even on a 32-bit machine.  I would be glad to help pummel any
compiler writer that pulls such a dirty trick, but the C standard really
does permit this.


Yes, but we don't write code for these compilers.  There are countless pieces of 
kernel code which would break in this condition, and there doesn't seem to be 
any interest in fixing this.



Use of volatile does in fact save you from the compiler pushing stores out
of loops regardless of whether you are also doing reads.  The C standard
has the notion of sequence points, which occur at various places including
the ends of statements and the control expressions for if and while
statements.  The compiler is not permitted to move volatile references
across a sequence point.  Therefore, the compiler is not allowed to
push a volatile store out of a loop.  Now the CPU might well do such a
reordering, but that is a separate issue to be dealt with via memory
barriers.  Note that it is the CPU and I/O system, not the compiler,
that is forcing you to use reads to flush writes to MMIO registers.


Sequence points enforce read-after-write ordering, not write-after-write.  We 
flush writes with reads for MMIO because of this effect as well as the CPU/bus 
effects.



And you would be amazed at what compiler writers will do in order to
get an additional fraction of a percent out of SpecCPU...


Probably not :)


In short, please retain atomic_set()'s volatility, especially on those
architectures that declared the atomic_t's counter to be volatile.


Like i386 and x86_64?  These used to have volatile in the atomic_t declaration. 
 We removed it, and the sky did not fall.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-09 Thread Chris Snook

Chris Snook wrote:

From: Chris Snook [EMAIL PROTECTED]

Make atomic_read() volatile on frv.  This ensures that busy-waiting
for an interrupt handler to change an atomic_t won't get compiled to an
infinite loop, consistent with SMP architectures.


To head off the criticism, I admit this is an oversimplification, and true 
busy-waiters should be using cpu_relax(), which contains a barrier.  As 
discussed in recent threads, there are other cases which can be optimized by 
removing the need for a barrier, and having behavior consistent with 
architectures where the benefit is more profound is also valuable.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 12:36:17PM -0400, Chris Snook wrote:

Paul E. McKenney wrote:

The compiler is within its rights to read a 32-bit quantity 16 bits at
at time, even on a 32-bit machine.  I would be glad to help pummel any
compiler writer that pulls such a dirty trick, but the C standard really
does permit this.
Yes, but we don't write code for these compilers.  There are countless 
pieces of kernel code which would break in this condition, and there 
doesn't seem to be any interest in fixing this.



Use of volatile does in fact save you from the compiler pushing stores out
of loops regardless of whether you are also doing reads.  The C standard
has the notion of sequence points, which occur at various places including
the ends of statements and the control expressions for if and while
statements.  The compiler is not permitted to move volatile references
across a sequence point.  Therefore, the compiler is not allowed to
push a volatile store out of a loop.  Now the CPU might well do such a
reordering, but that is a separate issue to be dealt with via memory
barriers.  Note that it is the CPU and I/O system, not the compiler,
that is forcing you to use reads to flush writes to MMIO registers.
Sequence points enforce read-after-write ordering, not write-after-write.  
We flush writes with reads for MMIO because of this effect as well as the 
CPU/bus effects.


Neither volatile reads nor volatile writes may be moved across sequence
points.


By the compiler, or by the CPU?  If you're depending on volatile writes being 
visible to other CPUs, you're screwed either way, because the CPU can hold that 
data in cache as long as it wants before it writes it to memory.  When this 
finally does happen, it will happen atomically, which is all that atomic_set 
guarantees.  If you need to guarantee that the value is written to memory at a 
particular time in your execution sequence, you either have to read it from 
memory to force the compiler to store it first (and a volatile cast in 
atomic_read will suffice for this) or you have to use LOCK_PREFIX instructions 
which will invalidate remote cache lines containing the same variable.  This 
patch doesn't change either of these cases.



And you would be amazed at what compiler writers will do in order to
get an additional fraction of a percent out of SpecCPU...

Probably not :)


In short, please retain atomic_set()'s volatility, especially on those
architectures that declared the atomic_t's counter to be volatile.
Like i386 and x86_64?  These used to have volatile in the atomic_t 
declaration. We removed it, and the sky did not fall.


Interesting.  You tested all possible configs on all possible hardware?


No, but I can reason about it and be confident that this won't break anything 
that isn't already broken.  At worst, this patch will make any existing subtly 
incorrect uses of atomic_t much more obvious and easier to track down.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
If you're depending on volatile writes 
being visible to other CPUs, you're screwed either way, because the CPU can 
hold that data in cache as long as it wants before it writes it to memory.  
When this finally does happen, it will happen atomically, which is all that 
atomic_set guarantees.  If you need to guarantee that the value is written 
to memory at a particular time in your execution sequence, you either have 
to read it from memory to force the compiler to store it first (and a 
volatile cast in atomic_read will suffice for this) or you have to use 
LOCK_PREFIX instructions which will invalidate remote cache lines 
containing the same variable.  This patch doesn't change either of these 
cases.


The case that it -can- change is interactions with interrupt handlers.
And NMI/SMI handlers, for that matter.


You have a point here, but only if you can guarantee that the interrupt handler 
is running on a processor sharing the cache that has the not-yet-written 
volatile value.  That implies a strictly non-SMP architecture.  At the moment, 
none of those have volatile in their declaration of atomic_t, so this patch 
can't break any of them.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Segher Boessenkool wrote:

The only safe way to get atomic accesses is to write
assembler code.  Are there any downsides to that?  I don't
see any.


The assumption that aligned word reads and writes are atomic, and that 
words are aligned unless explicitly packed otherwise, is endemic in 
the kernel.  No sane compiler violates this assumption.  It's true 
that we're not portable to insane compilers after this patch, but we 
never were in the first place.


You didn't answer my question: are there any downsides to using
explicit coded-in-assembler accesses for atomic accesses?  You
can handwave all you want that it should just work with
volatile accesses, but volatility != atomicity, volatile in C
is really badly defined, GCC never officially gave stronger
guarantees, and we have a bugzilla full of PRs to show what a
minefield it is.

So, why not use the well-defined alternative?


Because we don't need to, and it hurts performance.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 02:13:52PM -0400, Chris Snook wrote:

Paul E. McKenney wrote:

On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
   If you're depending on volatile writes 
being visible to other CPUs, you're screwed either way, because the CPU 
can hold that data in cache as long as it wants before it writes it to 
memory.  When this finally does happen, it will happen atomically, which 
is all that atomic_set guarantees.  If you need to guarantee that the 
value is written to memory at a particular time in your execution 
sequence, you either have to read it from memory to force the compiler to 
store it first (and a volatile cast in atomic_read will suffice for this) 
or you have to use LOCK_PREFIX instructions which will invalidate remote 
cache lines containing the same variable.  This patch doesn't change 
either of these cases.

The case that it -can- change is interactions with interrupt handlers.
And NMI/SMI handlers, for that matter.
You have a point here, but only if you can guarantee that the interrupt 
handler is running on a processor sharing the cache that has the 
not-yet-written volatile value.  That implies a strictly non-SMP 
architecture.  At the moment, none of those have volatile in their 
declaration of atomic_t, so this patch can't break any of them.


This can also happen when using per-CPU variables.  And there are a
number of per-CPU variables that are either atomic themselves or are
structures containing atomic fields.


Accessing per-CPU variables in this fashion reliably already requires a suitable 
smp/non-smp read/write memory barrier.  I maintain that if we break anything 
with this change, it was really already broken, if less obviously.  Can you give 
a real or synthetic example of legitimate code that could break?


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Segher Boessenkool wrote:

The compiler is within its rights to read a 32-bit quantity 16 bits at
at time, even on a 32-bit machine.  I would be glad to help pummel any
compiler writer that pulls such a dirty trick, but the C standard really
does permit this.


Yes, but we don't write code for these compilers.  There are countless 
pieces of kernel code which would break in this condition, and there 
doesn't seem to be any interest in fixing this.


Other things are broken too.  Great argument :-)


We make plenty of practical assumptions in the kernel, and declare incorrect 
things which violate them, even in cases where there's no commandment from the 
heavens forbidding them.  Since the whole point of this exercise is to prevent 
badness with *optimizing* compilers, it's quite reasonable to declare broken any 
so-called optimizer which violates these trivial assumptions.



In short, please retain atomic_set()'s volatility, especially on those
architectures that declared the atomic_t's counter to be volatile.


Like i386 and x86_64?  These used to have volatile in the atomic_t 
declaration.  We removed it, and the sky did not fall.


And this proves what?  Lots of stuff works by accident.


If something breaks because of this, it was already broken, but hidden a lot 
better.  I don't see much of a downside to exposing and fixing those bugs.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Chris Snook

Geert Uytterhoeven wrote:

On Thu, 9 Aug 2007, Chris Snook wrote:

Segher Boessenkool wrote:

The only safe way to get atomic accesses is to write
assembler code.  Are there any downsides to that?  I don't
see any.

The assumption that aligned word reads and writes are atomic, and that
words are aligned unless explicitly packed otherwise, is endemic in the
kernel.  No sane compiler violates this assumption.  It's true that we're
not portable to insane compilers after this patch, but we never were in
the first place.

You didn't answer my question: are there any downsides to using
explicit coded-in-assembler accesses for atomic accesses?  You
can handwave all you want that it should just work with
volatile accesses, but volatility != atomicity, volatile in C
is really badly defined, GCC never officially gave stronger
guarantees, and we have a bugzilla full of PRs to show what a
minefield it is.

So, why not use the well-defined alternative?

Because we don't need to, and it hurts performance.


It hurts performance by implementing 32-bit atomic reads in assembler?


No, I misunderstood the question.  Implementing 32-bit atomic reads in assembler 
is redundant, because any sane compiler, *particularly* and optimizing compiler 
(and we're only in this mess because of optimizing compilers) will give us that 
automatically without the assembler.  Yes, it is legal for a compiler to violate 
this assumption.  It is also legal for us to refuse to maintain compatibility 
with compilers that suck this badly.  That decision was made a very long time 
ago, and I consider it the correct decision.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 24/24] document volatile atomic_read() behavior

2007-08-09 Thread Chris Snook

Segher Boessenkool wrote:

Explicit
+casting in atomic_read() ensures consistent behavior across 
architectures

+and compilers.

Even modulo compiler bugs, what makes you believe that?


When you declare a variable volatile, you don't actually tell the 
compiler where you want to override its default optimization behavior, 
giving it some freedom to guess your intentions incorrectly.  When you 
put the cast on the data access itself, there is no question about 
precisely where in the code you want to override the compiler's 
default optimization behavior.


...except for the small point that this isn't how volatile works.

Rule of thumb: even people who know the semantics of volatile
shouldn't use it.

If the compiler doesn't do what you want with a volatile declaration, 
it might have a plausible excuse in the ambiguity of the C standard.  
If the compiler doesn't do what you want in a cast specific to a 
single dereference, it's just plain broken.


The other way around.  volatile has pretty weak semantics, and
certainly not the semantics you think it has, or that you wish it
had; but *(volatile XX *) doesn't have *any* semantics.  However
much you cast that pointer it still doesn't point to a volatile
object.

GCC will generally take the path of least surprise and perform a
volatile access anyway, but this has only be decided recently (a
year ago or so), and there very likely still are some bugs in
that area.

We try to be compatible with plausibly correct compilers, but if 
they're completely broken, we're screwed no matter what.


If you don't know what to expect, you're screwed for sure.


Anyway, what's the supposed advantage of *(volatile *) vs. using
a real volatile object?  That you can access that same object in
a non-volatile way?


You'll have to take that up with Linus and the minds behind Volatile Considered 
Harmful, but the crux of it is that volatile objects are prone to compiler bugs 
too, and if we have to track down a compiler bug, it's a lot easier when we know 
exactly where the load is supposed to be because we deliberately put it there, 
rather than letting the compiler re-order everything that lacks a strict data 
dependency and trying to figure out where in a thousand lines of assembler the 
compiler should have put the load for the volatile object.


If we're going to assume that the compiler has bugs we'll never be able to find, 
we all need to find new careers.  If we're going to assume that it has bugs we 
*can* find, then let's use code that makes it easier to do that.


I initially proposed a patch that made all the objects volatile, on the grounds 
that this was a special case where there wasn't much room to have a 
misunderstanding that resulted in anything worse than wasted loads.  Linus 
objected, and now that I've seen all the responses to the new patchset, I 
understand exactly why.  If our compilers really suck as much as everyone says 
they do, it'll be much easier to detect that with volatile casts than with 
volatile declarations.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] ipvs: force read of atomic_t in while loop

2007-08-08 Thread Chris Snook

Heiko Carstens wrote:

On Wed, Aug 08, 2007 at 03:21:31AM -0700, David Miller wrote:

From: Heiko Carstens [EMAIL PROTECTED]
Date: Wed, 8 Aug 2007 11:33:00 +0200


Just saw this while grepping for atomic_reads in a while loops.
Maybe we should re-add the volatile to atomic_t. Not sure.

I think whatever the choice, it should be done consistently
on every architecture.

It's just asking for trouble if your arch does it differently from
every other.


Well..currently it's i386/x86_64 and s390 which have no volatile
in atomic_t. And yes, of course I agree it should be consistent
across all architectures. But it isn't.


Based on recent discussion, it's pretty clear that there's a lot of 
confusion about this.  A lot of people (myself included, until I thought 
about it long and hard) will reasonably assume that calling 
atomic_read() will actually read the value from memory.  Leaving out the 
volatile declaration seems like a pessimization to me.  If you force 
people to use barrier() everywhere they're working with atomic_t, it 
will force re-reads of all the non-atomic data in use as well, which 
will cause more memory fetches of things that generally don't need 
barrier().  That and it's a bug waiting to happen.


Andi -- your thoughts on the matter?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] ipvs: force read of atomic_t in while loop

2007-08-08 Thread Chris Snook

Heiko Carstens wrote:

On Wed, Aug 08, 2007 at 02:31:15PM -0700, Andrew Morton wrote:

On Wed, 08 Aug 2007 17:08:44 -0400
Chris Snook [EMAIL PROTECTED] wrote:


Heiko Carstens wrote:

On Wed, Aug 08, 2007 at 03:21:31AM -0700, David Miller wrote:

From: Heiko Carstens [EMAIL PROTECTED]
Date: Wed, 8 Aug 2007 11:33:00 +0200


Just saw this while grepping for atomic_reads in a while loops.
Maybe we should re-add the volatile to atomic_t. Not sure.

I think whatever the choice, it should be done consistently
on every architecture.

It's just asking for trouble if your arch does it differently from
every other.

Well..currently it's i386/x86_64 and s390 which have no volatile
in atomic_t. And yes, of course I agree it should be consistent
across all architectures. But it isn't.
Based on recent discussion, it's pretty clear that there's a lot of 
confusion about this.  A lot of people (myself included, until I thought 
about it long and hard) will reasonably assume that calling 
atomic_read() will actually read the value from memory.  Leaving out the 
volatile declaration seems like a pessimization to me.  If you force 
people to use barrier() everywhere they're working with atomic_t, it 
will force re-reads of all the non-atomic data in use as well, which 
will cause more memory fetches of things that generally don't need 
barrier().  That and it's a bug waiting to happen.


Andi -- your thoughts on the matter?

I'm not Andi, but this not-Andi thinks that permitting the compiler to cache
the results of atomic_read() is dumb.


Ok, how about this:

Subject: [PATCH] Add 'volatile' to atomic_t again.

From: Heiko Carstens [EMAIL PROTECTED]

This basically reverts f9e9dcb38f5106fa8cdac04a9e967d5487f1cd20 which
removed 'volatile' from atomic_t for i386/x86_64. Reason for this
is to make sure that code like
while (atomic_read(whatever));
continues to work.
Otherwise the compiler might generate code that will loop forever.
Also this makes sure atomic_t is the same across all architectures.

Cc: Andi Kleen [EMAIL PROTECTED]
Cc: Martin Schwidefsky [EMAIL PROTECTED]
Signed-off-by: Heiko Carstens [EMAIL PROTECTED]
---

s390 patch will go in via Martin if this is accepted.

 include/asm-i386/atomic.h   |2 +-
 include/asm-x86_64/atomic.h |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

Index: linux-2.6/include/asm-i386/atomic.h
===
--- linux-2.6.orig/include/asm-i386/atomic.h
+++ linux-2.6/include/asm-i386/atomic.h
@@ -15,7 +15,7 @@
  * on us. We need to use _exactly_ the address the user gave us,
  * not some alias that contains the same information.
  */
-typedef struct { int counter; } atomic_t;
+typedef struct { volatile int counter; } atomic_t;
 
 #define ATOMIC_INIT(i)	{ (i) }
 
Index: linux-2.6/include/asm-x86_64/atomic.h

===
--- linux-2.6.orig/include/asm-x86_64/atomic.h
+++ linux-2.6/include/asm-x86_64/atomic.h
@@ -22,7 +22,7 @@
  * on us. We need to use _exactly_ the address the user gave us,
  * not some alias that contains the same information.
  */
-typedef struct { int counter; } atomic_t;
+typedef struct { volatile int counter; } atomic_t;
 
 #define ATOMIC_INIT(i)	{ (i) }
 


Good so far, but we need to fix it on non-SMP architectures too, since 
drivers may use atomic_t in interrupt code.  Ideally I'd like to be able 
to remove a whole bunch of barriers, since they cause a lot of needless 
re-fetches for everything else in the loop.  We should also document the 
semantics of atomic_t to ensure consistency in the future.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally
want to re-read the contents of an atomic variable on every access anyway,
let's standardize the behavior across all architectures and avoid the
performance and correctness problems of requiring the use of barrier() in
loops that expect atomic_t variables to change externally.  This is relevant
even on non-smp architectures, since drivers may use atomic operations in
interrupt handlers.

Signed-off-by: Chris Snook [EMAIL PROTECTED]


diff -urp linux-2.6.23-rc2-orig/include/asm-blackfin/atomic.h 
linux-2.6.23-rc2/include/asm-blackfin/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-blackfin/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-blackfin/atomic.h  2007-08-08 
18:18:43.0 -0400
@@ -13,8 +13,13 @@
  * Tony Kou ([EMAIL PROTECTED])   Lineo Inc.   2001
  */
 
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
 typedef struct {
-   int counter;
+   volatile int counter;
 } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
diff -urp linux-2.6.23-rc2-orig/include/asm-frv/atomic.h 
linux-2.6.23-rc2/include/asm-frv/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-frv/atomic.h  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-frv/atomic.h   2007-08-08 18:21:55.0 
-0400
@@ -35,8 +35,13 @@
 #define smp_mb__before_atomic_inc()barrier()
 #define smp_mb__after_atomic_inc() barrier()
 
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
 typedef struct {
-   int counter;
+   volatile int counter;
 } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
diff -urp linux-2.6.23-rc2-orig/include/asm-h8300/atomic.h 
linux-2.6.23-rc2/include/asm-h8300/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-h8300/atomic.h2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-h8300/atomic.h 2007-08-08 18:24:02.0 
-0400
@@ -6,7 +6,12 @@
  * resource counting etc..
  */
 
-typedef struct { int counter; } atomic_t;
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
+typedef struct { volatile int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
 #define atomic_read(v) ((v)-counter)
diff -urp linux-2.6.23-rc2-orig/include/asm-i386/atomic.h 
linux-2.6.23-rc2/include/asm-i386/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-i386/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-i386/atomic.h  2007-08-08 18:26:09.0 
-0400
@@ -14,8 +14,12 @@
  * Make sure gcc doesn't try to be clever and move things around
  * on us. We need to use _exactly_ the address the user gave us,
  * not some alias that contains the same information.
+ *
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
  */
-typedef struct { int counter; } atomic_t;
+typedef struct { volatile int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
 
diff -urp linux-2.6.23-rc2-orig/include/asm-m68k/atomic.h 
linux-2.6.23-rc2/include/asm-m68k/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-m68k/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-m68k/atomic.h  2007-08-08 18:28:58.0 
-0400
@@ -13,7 +13,12 @@
  * We do not have SMP m68k systems, so we don't have to deal with that.
  */
 
-typedef struct { int counter; } atomic_t;
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
+typedef struct { volatile int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
 #define atomic_read(v) ((v)-counter)
diff -urp linux-2.6.23-rc2-orig/include/asm-m68knommu/atomic.h 
linux-2.6.23-rc2/include/asm-m68knommu/atomic.h
--- linux-2.6.23-rc2

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Chris Snook

Jesper Juhl wrote:

On 09/08/2007, Chris Snook [EMAIL PROTECTED] wrote:

From: Chris Snook [EMAIL PROTECTED]

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally
want to re-read the contents of an atomic variable on every access anyway,
let's standardize the behavior across all architectures and avoid the
performance and correctness problems of requiring the use of barrier() in
loops that expect atomic_t variables to change externally.  This is relevant
even on non-smp architectures, since drivers may use atomic operations in
interrupt handlers.

Signed-off-by: Chris Snook [EMAIL PROTECTED]



Hmm, I thought we were trying to move away from volatile since it is
very weakly defined and towards explicit barriers and locks...
Points to -- Documentation/volatile-considered-harmful.txt


This is a special case.  Usually, the use of volatile is just lazy.  In 
this case, it's probably necessary on at least some architectures, so we 
can't remove it everywhere unless we want to rewrite atomic.h completely 
in inline assembler for several architectures, and painstakingly verify 
all that work.  I would hope it's obvious that having consistent 
behavior on all architectures, or even at all compiler optimization 
levels within an architecture, can be agreed to be good.  Additionally, 
atomic_t variables are a rare exception, in that we pretty much always 
want to work with the latest value in RAM on every access.  Making this 
atomic will allow us to remove a bunch of barriers which do nothing but 
slow things down on most architectures.


I agree that the use of atomic_t in .c files is generally bad, but in 
certain special cases, hiding it inside defined data types may be worth 
the slight impurity, just as we sometimes tolerate lines longer than 80 
columns when fixing it makes things unreadable.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Chris Snook

Lennert Buytenhek wrote:

On Wed, Aug 08, 2007 at 07:07:33PM -0400, Chris Snook wrote:


From: Chris Snook [EMAIL PROTECTED]

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally
want to re-read the contents of an atomic variable on every access anyway,
let's standardize the behavior across all architectures and avoid the
performance and correctness problems of requiring the use of barrier() in
loops that expect atomic_t variables to change externally.  This is relevant
even on non-smp architectures, since drivers may use atomic operations in
interrupt handlers.

Signed-off-by: Chris Snook [EMAIL PROTECTED]


Documentation/atomic_ops.txt would need updating:

[...]

One very important aspect of these two routines is that they DO NOT
require any explicit memory barriers.  They need only perform the
atomic_t counter update in an SMP safe manner.


Thanks, I was looking for that.  I'll re-send shortly with my comment 
moved there.  People are already using atomic_t in a manner that implies 
the use of memory barriers and interrupt-safety, which is what the patch 
enforces.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: atl1 driver corrupting memory?

2007-07-25 Thread Chris Snook

Chuck Ebbert wrote:

I have a report of random errors when using the atl1 driver
with kernel 2.6.22.1. Could that be a problem fixed by the
recent changes to DMA setup in 2.6.23-rc?


I hope so.  As far as we can tell the driver and the NIC itself are doing the 
right thing, and the pci layer or chipset is screwing up the 64-bit DMA.  This 
only manifests when physical memory addresses cross the 4 GB boundary, and as 
far as I'm aware atl1 is only used on desktop boards, so we don't have a lot of 
testers.  If someone wants to buy me and Jay more RAM so we can test it 
ourselves, I guess we wouldn't object :)


I favor disabling 64-bit DMA in atl1 until Atheros can track this down in the 
lab.  If we don't get confirmation that this bug is fixed by the DMA changes, I 
think we should revert to 32-bit DMA for 2.6.23.  Limiting ourselves to 32-bit 
DMA on desktop systems is a lot less bad than allowing arbitrary memory corruption.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: atl1 driver corrupting memory?

2007-07-25 Thread Chris Snook

Chuck Ebbert wrote:

On 07/25/2007 05:22 PM, Chris Snook wrote:

Chuck Ebbert wrote:

I have a report of random errors when using the atl1 driver
with kernel 2.6.22.1. Could that be a problem fixed by the
recent changes to DMA setup in 2.6.23-rc?

I hope so.  As far as we can tell the driver and the NIC itself are
doing the right thing, and the pci layer or chipset is screwing up the
64-bit DMA.  This only manifests when physical memory addresses cross
the 4 GB boundary, and as far as I'm aware atl1 is only used on desktop
boards, so we don't have a lot of testers.  If someone wants to buy me
and Jay more RAM so we can test it ourselves, I guess we wouldn't object :)



Our reporter has 8GB of memory in an x86_64 machine.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=249511



I favor disabling 64-bit DMA in atl1 until Atheros can track this down
in the lab.  If we don't get confirmation that this bug is fixed by the
DMA changes, I think we should revert to 32-bit DMA for 2.6.23. 
Limiting ourselves to 32-bit DMA on desktop systems is a lot less bad

than allowing arbitrary memory corruption.



This is what was committed.

http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3f516c00d416bd39aab6cfb348b68919e295fe23
http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ef76e3e2505db01f7d4b537854f4a177220c26c8


Oh, I thought you were referring to a problem reproduced *after* those changes, 
to be fixed by some generic DMA setup patch.  Has anyone reproduced the problem 
after those changes?


CCing atl1-devel to see if we can get some more testing...

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: a maze of twisty stats, most different

2007-06-28 Thread Chris Snook

Rick Jones wrote:
It seems that every driver, when providing support for ethtool -S 
functionality, has considerable lattitude when it comes to the stats 
provided.  Clearly this is very nice for the driver writer(s) as it 
allows them to provide whatever stats they feel are most natural for 
their NIC(s) and name them as they see fit.


This is deliberate.  Ethtool operates at a very primitive level, and generally 
does little or no translation between the hardware registers and userspace.



However :)

 From the standpoint of someone looking from the outside, say someone 
wanting to consume ethtool -S statistics, it seems to be a big jumble.


Might there be a way to bring those two camps together?  Is there 
already, with the same wide availabilty of ethtool, and I've just not 
seen it?


ifconfig comes to mind, for one.  What statistics are you looking for exactly?

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] atl1: disable 64bit DMA

2007-06-25 Thread Chris Snook

Luca Tettamanti wrote:
Il Mon, Jun 25, 2007 at 07:42:44AM -0500, Jay Cliburn ha scritto: 

Jay L. T. Cornwall wrote:

Jay Cliburn wrote:


For reasons not yet clear to me, it appears the L1 driver has a bug or
the device itself has trouble with DMA in high memory.  This patch,
drafted by Luca Tettamanti, is being explored as a workaround.  I'd be
interested to know if it fixes your problem.

Yes, it certainly seems to. Now running with this patch and 4GB active,
I've transferred about 15GB with no problem so far. It usually oopses
after a GB or two.

I guess it's not an ideal solution, architecturally speaking, but it's a
good deal better than an unstable driver. If there's any other patches
you'd like me to test or traces to capture, I'm happy to help out.
Otherwise I'll run with this one for now since it does the job!

Okay Jay, thanks.

Luca, would you please submit your patch to Jeff Garzik and netdev?


Hi Jeff,
a couple of users reported hard lockups when using L1 NICs on machines
with 4GB or more of RAM. We're still waiting official confirmation from
the vendor, but it seems that L1 has problems doing DMA to/from high
memory (physical address above the 4GB limit). Passing 32bit DMA mask
cures the problem.

Signed-Off-By: Luca Tettamanti [EMAIL PROTECTED]

---
I think that the patch should be included in 2.6.22.

 drivers/net/atl1/atl1_main.c |   15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c
index 6862c11..a730f15 100644
--- a/drivers/net/atl1/atl1_main.c
+++ b/drivers/net/atl1/atl1_main.c
@@ -2097,21 +2097,16 @@ static int __devinit atl1_probe(struct pci_dev *pdev,
struct net_device *netdev;
struct atl1_adapter *adapter;
static int cards_found = 0;
-   bool pci_using_64 = true;
int err;
 
 	err = pci_enable_device(pdev);

if (err)
return err;
 
-	err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);

+   err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
if (err) {
-   err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
-   if (err) {
-   dev_err(pdev-dev, no usable DMA configuration\n);
-   goto err_dma;
-   }
-   pci_using_64 = false;
+   dev_err(pdev-dev, no usable DMA configuration\n);
+   goto err_dma;
}
/* Mark all PCI regions associated with PCI device
 * pdev as being reserved by owner atl1_driver_name
@@ -2176,7 +2171,6 @@ static int __devinit atl1_probe(struct pci_dev *pdev,
 
 	netdev-ethtool_ops = atl1_ethtool_ops;

adapter-bd_number = cards_found;
-   adapter-pci_using_64 = pci_using_64;
 
 	/* setup the private structure */

err = atl1_sw_init(adapter);
@@ -2193,9 +2187,6 @@ static int __devinit atl1_probe(struct pci_dev *pdev,
 */
/* netdev-features |= NETIF_F_TSO; */
 
-	if (pci_using_64)

-   netdev-features |= NETIF_F_HIGHDMA;
-
netdev-features |= NETIF_F_LLTX;
 
 	/*



Luca


What boards have we seen this on?  It's quite possible this is:

a) an iommu-related problem specific to AMD or specific to Intel
b) a BIOS problem that atl1 happens to be a victim of

I'd rather not disable this unconditionally if we can get more 
information about why it's breaking.  Doing so might just end up 
covering up the most obvious manifestation of a larger problem.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] atl1: disable 64bit DMA

2007-06-25 Thread Chris Snook

Jay L. T. Cornwall wrote:

Chris Snook wrote:


What boards have we seen this on?  It's quite possible this is:


I can reproduce on an Asus P5K with a Core 2 Duo E6600.

lspci identifies the controller as:
  02:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit
  Ethernet Adapter (rev b0)

dmesg notes the PCI-DMA mapping implementation:
  PCI-DMA: Using software bounce buffering for IO (SWIOTLB)



I had a hunch this was on Intel.  I'd rather just disable this when swiotlb is 
in use, unless we get more complaints.  It's probably ultimately a BIOS quirk 
anyway.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/5] atl1: eliminate unneeded kill_vid code

2007-06-01 Thread Chris Snook

Stephen Hemminger wrote:

This driver has unneeded stubs for VLAN filtering.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/atl1/atl1_main.c   |   25 +

---
 drivers/net/atl1/atl1_main.c |   33 +
 1 file changed, 1 insertion(+), 32 deletions(-)

--- a/drivers/net/atl1/atl1_main.c  2007-06-01 09:21:53.0 -0700
+++ b/drivers/net/atl1/atl1_main.c  2007-06-01 09:27:53.0 -0700
@@ -1229,39 +1229,9 @@ static void atl1_vlan_rx_register(struct
spin_unlock_irqrestore(adapter-lock, flags);
 }
 
-/* FIXME: justify or remove -- CHS */

-static void atl1_vlan_rx_add_vid(struct net_device *netdev, u16 vid)
-{
-   /* We don't do Vlan filtering */
-   return;
-}
-
-/* FIXME: this looks wrong too -- CHS */
-static void atl1_vlan_rx_kill_vid(struct net_device *netdev, u16 vid)
-{
-   struct atl1_adapter *adapter = netdev_priv(netdev);
-   unsigned long flags;
-
-   spin_lock_irqsave(adapter-lock, flags);
-   /* atl1_irq_disable(adapter); */
-   vlan_group_set_device(adapter-vlgrp, vid, NULL);
-   /* atl1_irq_enable(adapter); */
-   spin_unlock_irqrestore(adapter-lock, flags);
-   /* We don't do Vlan filtering */
-   return;
-}
-
 static void atl1_restore_vlan(struct atl1_adapter *adapter)
 {
atl1_vlan_rx_register(adapter-netdev, adapter-vlgrp);
-   if (adapter-vlgrp) {
-   u16 vid;
-   for (vid = 0; vid  VLAN_GROUP_ARRAY_LEN; vid++) {
-   if (!vlan_group_get_device(adapter-vlgrp, vid))
-   continue;
-   atl1_vlan_rx_add_vid(adapter-netdev, vid);
-   }
-   }
 }
 
 static u16 tpd_avail(struct atl1_tpd_ring *tpd_ring)

@@ -2203,8 +2173,7 @@ static int __devinit atl1_probe(struct p
netdev-poll_controller = atl1_poll_controller;
 #endif
netdev-vlan_rx_register = atl1_vlan_rx_register;
-   netdev-vlan_rx_add_vid = atl1_vlan_rx_add_vid;
-   netdev-vlan_rx_kill_vid = atl1_vlan_rx_kill_vid;
+
netdev-ethtool_ops = atl1_ethtool_ops;
adapter-bd_number = cards_found;
adapter-pci_using_64 = pci_using_64;



As the comments indicate, I was going to get around to doing this in my 
Copious Free Time anyway, so:


Acked-By: Chris Snook [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] atl1: minor cleanup

2007-05-01 Thread Chris Snook

Jay Cliburn wrote:

Please accept the following trivial patches to the atl1 driver.

- use dev_printk macros
- fix whitespace damage


Acked-By: Chris Snook [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] remove irq_sem from e1000

2007-04-09 Thread Chris Snook

Kok, Auke wrote:

Chris Snook wrote:

From: Chris Snook [EMAIL PROTECTED]

Remove unnecessary irq_sem accounting from e1000.  Tested with no 
problems.



the major problem with this is that one of the e1000 parts (82547) still 
requires the irq_sem. I doubt that you tested that card specifically. 
I'm still not convinced that this patch is therefore a good idea. the 
patch for ixgb is another thing, but for e1000 we definately want to 
keep the irq_sem around until we get some definitive testing on all 
adapters.


So, this patch will stay under suspicion a bit longer. Same for the ixgb 
version.


Auke


Requires?  I acknowledge that the code is used in a special codepath for 
those cards, but it appears to be used in the same unnecessary way it is 
used everywhere else.


My testing was limited to an onboard NIC on a fairly old box I don't 
have access to anymore, as well as the e1000-derived atl1 driver.  My 
primary justification is code inspection.  The code just doesn't do 
anything productive.


-- Chris



Signed-off-by: Chris Snook [EMAIL PROTECTED]
--
diff -urp linux-2.6.20-git14.orig/drivers/net/e1000/e1000.h
linux-2.6.20-git14/drivers/net/e1000/e1000.h
--- linux-2.6.20-git14.orig/drivers/net/e1000/e1000.h2007-02-19
14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/e1000/e1000.h2007-02-19 
15:07:37.0

-0500
@@ -252,7 +252,6 @@ struct e1000_adapter {
 #ifdef CONFIG_E1000_NAPI
 spinlock_t tx_queue_lock;
 #endif
-atomic_t irq_sem;
 unsigned int total_tx_bytes;
 unsigned int total_tx_packets;
 unsigned int total_rx_bytes;
diff -urp linux-2.6.20-git14.orig/drivers/net/e1000/e1000_main.c
linux-2.6.20-git14/drivers/net/e1000/e1000_main.c
--- linux-2.6.20-git14.orig/drivers/net/e1000/e1000_main.c2007-02-19
14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/e1000/e1000_main.c2007-02-19
15:09:28.0 -0500
@@ -349,7 +349,6 @@ static void e1000_free_irq(struct e1000_
 static void
 e1000_irq_disable(struct e1000_adapter *adapter)
 {
-atomic_inc(adapter-irq_sem);
 E1000_WRITE_REG(adapter-hw, IMC, ~0);
 E1000_WRITE_FLUSH(adapter-hw);
 synchronize_irq(adapter-pdev-irq);
@@ -363,10 +362,8 @@ e1000_irq_disable(struct e1000_adapter *
 static void
 e1000_irq_enable(struct e1000_adapter *adapter)
 {
-if (likely(atomic_dec_and_test(adapter-irq_sem))) {
-E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
-E1000_WRITE_FLUSH(adapter-hw);
-}
+E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
+E1000_WRITE_FLUSH(adapter-hw);
 }

 static void
@@ -1336,7 +1333,6 @@ e1000_sw_init(struct e1000_adapter *adap
 spin_lock_init(adapter-tx_queue_lock);
 #endif

-atomic_set(adapter-irq_sem, 1);
 spin_lock_init(adapter-stats_lock);

 set_bit(__E1000_DOWN, adapter-flags);
@@ -3758,11 +3754,6 @@ e1000_intr_msi(int irq, void *data)
 #endif
 uint32_t icr = E1000_READ_REG(hw, ICR);

-#ifdef CONFIG_E1000_NAPI
-/* read ICR disables interrupts using IAM, so keep up with our
- * enable/disable accounting */
-atomic_inc(adapter-irq_sem);
-#endif
 if (icr  (E1000_ICR_RXSEQ | E1000_ICR_LSC)) {
 hw-get_link_status = 1;
 /* 80003ES2LAN workaround-- For packet buffer work-around on
@@ -3832,13 +3823,6 @@ e1000_intr(int irq, void *data)
 if (unlikely(hw-mac_type = e1000_82571 
  !(icr  E1000_ICR_INT_ASSERTED)))
 return IRQ_NONE;
-
-/* Interrupt Auto-Mask...upon reading ICR,
- * interrupts are masked.  No need for the
- * IMC write, but it does mean we should
- * account for it ASAP. */
-if (likely(hw-mac_type = e1000_82571))
-atomic_inc(adapter-irq_sem);
 #endif

 if (unlikely(icr  (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
@@ -3862,7 +3846,6 @@ e1000_intr(int irq, void *data)
 #ifdef CONFIG_E1000_NAPI
 if (unlikely(hw-mac_type  e1000_82571)) {
 /* disable interrupts, without the synchronize_irq bit */
-atomic_inc(adapter-irq_sem);
 E1000_WRITE_REG(hw, IMC, ~0);
 E1000_WRITE_FLUSH(hw);
 }
@@ -3888,7 +3871,6 @@ e1000_intr(int irq, void *data)
  * de-assertion state.
  */
 if (hw-mac_type == e1000_82547 || hw-mac_type == 
e1000_82547_rev_2) {

-atomic_inc(adapter-irq_sem);
 E1000_WRITE_REG(hw, IMC, ~0);
 }

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] atl1: save mac address on remove

2007-03-29 Thread Chris Snook

Jeff Garzik wrote:

Chris Snook wrote:

From: Chris Snook [EMAIL PROTECTED]

Some atl1 boards get their MAC address written directly to the register
by the BIOS during POST, rather than storing it in EEPROM that's
accessible to the driver.  If the MAC register on one of these boards
is changed and then the module is unloaded, the permanent MAC address
will be forgotten until the box is rebooted.  We should save the
permanent address during removal if we've been messing with it.

Signed-off-by: Chris Snook [EMAIL PROTECTED]


applied, even though its a hack


Oh, it's definitely a hack, but the hardware it's supporting is a hack, so we 
have to work with it.


Thanks.

-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] add Attansic L2 PCI ID

2007-03-29 Thread Chris Snook

Jeff Garzik wrote:

Chris Snook wrote:

From: Chris Snook [EMAIL PROTECTED]

Add PCI ID for the Attansic L2 100 Mb ethernet adapter.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.21-rc5.orig/include/linux/pci_ids.h2007-03-27 
23:26:50.0 -0400
+++ linux-2.6.21-rc5/include/linux/pci_ids.h2007-03-28 
15:11:03.0 -0400

@@ -2090,6 +2090,7 @@
 
 #define PCI_VENDOR_ID_ATTANSIC0x1969

 #define PCI_DEVICE_ID_ATTANSIC_L10x1048
+#define PCI_DEVICE_ID_ATTANSIC_L20x2048


Actually you should be doing the reverse:

Remove PCI_DEVICE_ID_ATTANSIC_L1, and replace the one place that uses it 
with the hexadecimal constant.


Jeff


We're working on integrating the driver for the L2 chip, so it will be useful to 
symbolically distinguish between them.  For now, adding the ID serves to 
document the distinction between the L1 and L2 chips, as they're alike enough 
that an atl1 driver hacked with the new PCI ID will detect link status on the 
L2, even though it won't really work.  By getting the ID in now, we can 
distribute patches that don't touch core code and won't need to be tweaked for 
submission.


If pci_ids.h bloat is really a big deal, we can hold off until the L2 patches 
are ready, but I don't see the harm in getting this out there now.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] add Attansic L2 PCI ID

2007-03-28 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Add PCI ID for the Attansic L2 100 Mb ethernet adapter.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- linux-2.6.21-rc5.orig/include/linux/pci_ids.h   2007-03-27 
23:26:50.0 -0400
+++ linux-2.6.21-rc5/include/linux/pci_ids.h2007-03-28 15:11:03.0 
-0400
@@ -2090,6 +2090,7 @@
 
 #define PCI_VENDOR_ID_ATTANSIC 0x1969
 #define PCI_DEVICE_ID_ATTANSIC_L1  0x1048
+#define PCI_DEVICE_ID_ATTANSIC_L2  0x2048
 
 #define PCI_VENDOR_ID_JMICRON  0x197B
 #define PCI_DEVICE_ID_JMICRON_JMB360   0x2360
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] atl1: save mac address on remove

2007-03-28 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Some atl1 boards get their MAC address written directly to the register
by the BIOS during POST, rather than storing it in EEPROM that's
accessible to the driver.  If the MAC register on one of these boards
is changed and then the module is unloaded, the permanent MAC address
will be forgotten until the box is rebooted.  We should save the
permanent address during removal if we've been messing with it.

Signed-off-by: Chris Snook [EMAIL PROTECTED]

--- a/drivers/net/atl1/atl1_main.c  2007-03-01 14:14:48.0 -0500
+++ b/drivers/net/atl1/atl1_main.c  2007-03-01 16:59:59.0 -0500
@@ -2321,6 +2321,16 @@ static void __devexit atl1_remove(struct
return;
 
adapter = netdev_priv(netdev);
+
+   /* Some atl1 boards lack persistent storage for their MAC, and get it
+* from the BIOS during POST.  If we've been messing with the MAC
+* address, we need to save the permanent one.
+*/
+   if (memcmp(adapter-hw.mac_addr, adapter-hw.perm_mac_addr, ETH_ALEN)) {
+   memcpy(adapter-hw.mac_addr, adapter-hw.perm_mac_addr, 
ETH_ALEN);
+   atl1_set_mac_addr(adapter-hw);
+   }
+
iowrite16(0, adapter-hw.hw_addr + REG_GPHY_ENABLE);
unregister_netdev(netdev);
pci_iounmap(pdev, adapter-hw.hw_addr);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] remove irq_sem from atl1

2007-02-27 Thread Chris Snook

Jeff Garzik wrote:

Chris Snook wrote:

From: Chris Snook [EMAIL PROTECTED]

Remove unnecessary irq_sem code from atl1 driver.  Tested with no 
problems.


Signed-off-by: Chris Snook [EMAIL PROTECTED]
Signed-off-by: Jay Cliburn [EMAIL PROTECTED]


ACK, but patch does not apply:


Applying 'remove irq_sem from atl1'

error: patch fragment without header at line 7: @@ -236,7 +236,6 @@ 
struct atl1_adapter {
error: patch fragment without header at line 21: @@ -163,7 +163,6 @@ 
static int __devinit atl1_sw_init(struct
error: patch fragment without header at line 29: @@ -272,8 +271,7 @@ 
err_nomem:
error: patch fragment without header at line 39: @@ -1205,7 +1203,6 @@ 
static u32 atl1_configure(struct atl1_ad

error: No changes
Patch failed at 0004.
When you have resolved this problem run git-am --resolved.
If you would prefer to skip this patch, instead run git-am --skip.


Huh, looks like my mailer line-wrapped the headers.  Patch didn't 
complain, but I guess git is pickier.  I've attached the patch to avoid 
text mangling.  Were there any problems with the e1000 and ixgb patches?


-- Chris
diff -urp linux-2.6.20-git14.orig/drivers/net/atl1/atl1.h linux-2.6.20-git14/drivers/net/atl1/atl1.h
--- linux-2.6.20-git14.orig/drivers/net/atl1/atl1.h	2007-02-19 14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/atl1/atl1.h	2007-02-19 15:10:07.0 -0500
@@ -236,7 +236,6 @@ struct atl1_adapter {
 	u16 link_speed;
 	u16 link_duplex;
 	spinlock_t lock;
-	atomic_t irq_sem;
 	struct work_struct tx_timeout_task;
 	struct work_struct link_chg_task;
 	struct work_struct pcie_dma_to_rst_task;
diff -urp linux-2.6.20-git14.orig/drivers/net/atl1/atl1_main.c linux-2.6.20-git14/drivers/net/atl1/atl1_main.c
--- linux-2.6.20-git14.orig/drivers/net/atl1/atl1_main.c	2007-02-19 14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/atl1/atl1_main.c	2007-02-19 15:10:44.0 -0500
@@ -163,7 +163,6 @@ static int __devinit atl1_sw_init(struct
 	hw-cmb_tx_timer = 1;	/* about 2us */
 	hw-smb_timer = 10;	/* about 200ms */
 
-	atomic_set(adapter-irq_sem, 0);
 	spin_lock_init(adapter-lock);
 	spin_lock_init(adapter-mb_lock);
 
@@ -272,8 +271,7 @@ err_nomem:
  */
 static void atl1_irq_enable(struct atl1_adapter *adapter)
 {
-	if (likely(!atomic_dec_and_test(adapter-irq_sem)))
-		iowrite32(IMR_NORMAL_MASK, adapter-hw.hw_addr + REG_IMR);
+	iowrite32(IMR_NORMAL_MASK, adapter-hw.hw_addr + REG_IMR);
 }
 
 static void atl1_clear_phy_int(struct atl1_adapter *adapter)
@@ -1205,7 +1203,6 @@ static u32 atl1_configure(struct atl1_ad
  */
 static void atl1_irq_disable(struct atl1_adapter *adapter)
 {
-	atomic_inc(adapter-irq_sem);
 	iowrite32(0, adapter-hw.hw_addr + REG_IMR);
 	ioread32(adapter-hw.hw_addr + REG_IMR);
 	synchronize_irq(adapter-pdev-irq);


[PATCH 0/3] remove irq_sem cruft from e1000 a nd derivatives

2007-02-19 Thread Chris Snook
Hey folks --

 While digging through the atl1 source, I was troubled by the code using
irq_sem.  I did some digging and found the same code in e1000 and ixgb.  I'm
not entirely sure what it was originally intended to do, but it doesn't seem
to be doing anything useful now, except possibly locking interrupts off if
NAPI is flipped on and off enough times to cause an integer overflow.

 The following patches completely remove irq_sem from each of the drivers.
 This has been tested successfully on atl1 and e1000.  If someone would like
to send me ixgb hardware I'd be glad to test that, otherwise you'll have to
test it yourself. :)

 -- Chris

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] remove irq_sem from atl1

2007-02-19 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Remove unnecessary irq_sem code from atl1 driver.  Tested with no problems.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
--
diff -urp linux-2.6.20-git14.orig/drivers/net/atl1/atl1.h
linux-2.6.20-git14/drivers/net/atl1/atl1.h
--- linux-2.6.20-git14.orig/drivers/net/atl1/atl1.h 2007-02-19
14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/atl1/atl1.h  2007-02-19 15:10:07.0
-0500
@@ -236,7 +236,6 @@ struct atl1_adapter {
u16 link_speed;
u16 link_duplex;
spinlock_t lock;
-   atomic_t irq_sem;
struct work_struct tx_timeout_task;
struct work_struct link_chg_task;
struct work_struct pcie_dma_to_rst_task;
diff -urp linux-2.6.20-git14.orig/drivers/net/atl1/atl1_main.c
linux-2.6.20-git14/drivers/net/atl1/atl1_main.c
--- linux-2.6.20-git14.orig/drivers/net/atl1/atl1_main.c2007-02-19
14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/atl1/atl1_main.c 2007-02-19
15:10:44.0 -0500
@@ -163,7 +163,6 @@ static int __devinit atl1_sw_init(struct
hw-cmb_tx_timer = 1;   /* about 2us */
hw-smb_timer = 10; /* about 200ms */

-   atomic_set(adapter-irq_sem, 0);
spin_lock_init(adapter-lock);
spin_lock_init(adapter-mb_lock);

@@ -272,8 +271,7 @@ err_nomem:
  */
 static void atl1_irq_enable(struct atl1_adapter *adapter)
 {
-   if (likely(!atomic_dec_and_test(adapter-irq_sem)))
-   iowrite32(IMR_NORMAL_MASK, adapter-hw.hw_addr + REG_IMR);
+   iowrite32(IMR_NORMAL_MASK, adapter-hw.hw_addr + REG_IMR);
 }

 static void atl1_clear_phy_int(struct atl1_adapter *adapter)
@@ -1205,7 +1203,6 @@ static u32 atl1_configure(struct atl1_ad
  */
 static void atl1_irq_disable(struct atl1_adapter *adapter)
 {
-   atomic_inc(adapter-irq_sem);
iowrite32(0, adapter-hw.hw_addr + REG_IMR);
ioread32(adapter-hw.hw_addr + REG_IMR);
synchronize_irq(adapter-pdev-irq);

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] remove irq_sem from e1000

2007-02-19 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Remove unnecessary irq_sem accounting from e1000.  Tested with no problems.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
--
diff -urp linux-2.6.20-git14.orig/drivers/net/e1000/e1000.h
linux-2.6.20-git14/drivers/net/e1000/e1000.h
--- linux-2.6.20-git14.orig/drivers/net/e1000/e1000.h   2007-02-19
14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/e1000/e1000.h2007-02-19 
15:07:37.0
-0500
@@ -252,7 +252,6 @@ struct e1000_adapter {
 #ifdef CONFIG_E1000_NAPI
spinlock_t tx_queue_lock;
 #endif
-   atomic_t irq_sem;
unsigned int total_tx_bytes;
unsigned int total_tx_packets;
unsigned int total_rx_bytes;
diff -urp linux-2.6.20-git14.orig/drivers/net/e1000/e1000_main.c
linux-2.6.20-git14/drivers/net/e1000/e1000_main.c
--- linux-2.6.20-git14.orig/drivers/net/e1000/e1000_main.c  2007-02-19
14:32:15.0 -0500
+++ linux-2.6.20-git14/drivers/net/e1000/e1000_main.c   2007-02-19
15:09:28.0 -0500
@@ -349,7 +349,6 @@ static void e1000_free_irq(struct e1000_
 static void
 e1000_irq_disable(struct e1000_adapter *adapter)
 {
-   atomic_inc(adapter-irq_sem);
E1000_WRITE_REG(adapter-hw, IMC, ~0);
E1000_WRITE_FLUSH(adapter-hw);
synchronize_irq(adapter-pdev-irq);
@@ -363,10 +362,8 @@ e1000_irq_disable(struct e1000_adapter *
 static void
 e1000_irq_enable(struct e1000_adapter *adapter)
 {
-   if (likely(atomic_dec_and_test(adapter-irq_sem))) {
-   E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
-   E1000_WRITE_FLUSH(adapter-hw);
-   }
+   E1000_WRITE_REG(adapter-hw, IMS, IMS_ENABLE_MASK);
+   E1000_WRITE_FLUSH(adapter-hw);
 }

 static void
@@ -1336,7 +1333,6 @@ e1000_sw_init(struct e1000_adapter *adap
spin_lock_init(adapter-tx_queue_lock);
 #endif

-   atomic_set(adapter-irq_sem, 1);
spin_lock_init(adapter-stats_lock);

set_bit(__E1000_DOWN, adapter-flags);
@@ -3758,11 +3754,6 @@ e1000_intr_msi(int irq, void *data)
 #endif
uint32_t icr = E1000_READ_REG(hw, ICR);

-#ifdef CONFIG_E1000_NAPI
-   /* read ICR disables interrupts using IAM, so keep up with our
-* enable/disable accounting */
-   atomic_inc(adapter-irq_sem);
-#endif
if (icr  (E1000_ICR_RXSEQ | E1000_ICR_LSC)) {
hw-get_link_status = 1;
/* 80003ES2LAN workaround-- For packet buffer work-around on
@@ -3832,13 +3823,6 @@ e1000_intr(int irq, void *data)
if (unlikely(hw-mac_type = e1000_82571 
 !(icr  E1000_ICR_INT_ASSERTED)))
return IRQ_NONE;
-
-   /* Interrupt Auto-Mask...upon reading ICR,
-* interrupts are masked.  No need for the
-* IMC write, but it does mean we should
-* account for it ASAP. */
-   if (likely(hw-mac_type = e1000_82571))
-   atomic_inc(adapter-irq_sem);
 #endif

if (unlikely(icr  (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
@@ -3862,7 +3846,6 @@ e1000_intr(int irq, void *data)
 #ifdef CONFIG_E1000_NAPI
if (unlikely(hw-mac_type  e1000_82571)) {
/* disable interrupts, without the synchronize_irq bit */
-   atomic_inc(adapter-irq_sem);
E1000_WRITE_REG(hw, IMC, ~0);
E1000_WRITE_FLUSH(hw);
}
@@ -3888,7 +3871,6 @@ e1000_intr(int irq, void *data)
 * de-assertion state.
 */
if (hw-mac_type == e1000_82547 || hw-mac_type == e1000_82547_rev_2) {
-   atomic_inc(adapter-irq_sem);
E1000_WRITE_REG(hw, IMC, ~0);
}

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] remove irq_sem from ixgb

2007-02-19 Thread Chris Snook
From: Chris Snook [EMAIL PROTECTED]

Remove irq_sem from ixgb.  Currently untested, but similar to tested patches
on atl1 and e1000.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
--
diff -urp linux-2.6.20-git14.orig/drivers/net/ixgb/ixgb.h
linux-2.6.20-git14/drivers/net/ixgb/ixgb.h
--- linux-2.6.20-git14.orig/drivers/net/ixgb/ixgb.h 2007-02-19
14:32:16.0 -0500
+++ linux-2.6.20-git14/drivers/net/ixgb/ixgb.h  2007-02-19 15:04:50.0
-0500
@@ -161,7 +161,6 @@ struct ixgb_adapter {
uint16_t link_speed;
uint16_t link_duplex;
spinlock_t tx_lock;
-   atomic_t irq_sem;
struct work_struct tx_timeout_task;

struct timer_list blink_timer;
diff -urp linux-2.6.20-git14.orig/drivers/net/ixgb/ixgb_main.c
linux-2.6.20-git14/drivers/net/ixgb/ixgb_main.c
--- linux-2.6.20-git14.orig/drivers/net/ixgb/ixgb_main.c2007-02-19
14:32:16.0 -0500
+++ linux-2.6.20-git14/drivers/net/ixgb/ixgb_main.c 2007-02-19
15:06:52.0 -0500
@@ -201,7 +201,6 @@ module_exit(ixgb_exit_module);
 static void
 ixgb_irq_disable(struct ixgb_adapter *adapter)
 {
-   atomic_inc(adapter-irq_sem);
IXGB_WRITE_REG(adapter-hw, IMC, ~0);
IXGB_WRITE_FLUSH(adapter-hw);
synchronize_irq(adapter-pdev-irq);
@@ -215,12 +214,10 @@ ixgb_irq_disable(struct ixgb_adapter *ad
 static void
 ixgb_irq_enable(struct ixgb_adapter *adapter)
 {
-   if(atomic_dec_and_test(adapter-irq_sem)) {
-   IXGB_WRITE_REG(adapter-hw, IMS,
-  IXGB_INT_RXT0 | IXGB_INT_RXDMT0 | IXGB_INT_TXDW |
-  IXGB_INT_LSC);
-   IXGB_WRITE_FLUSH(adapter-hw);
-   }
+   IXGB_WRITE_REG(adapter-hw, IMS,
+  IXGB_INT_RXT0 | IXGB_INT_RXDMT0 | IXGB_INT_TXDW |
+  IXGB_INT_LSC);
+   IXGB_WRITE_FLUSH(adapter-hw);
 }

 int
@@ -584,7 +581,6 @@ ixgb_sw_init(struct ixgb_adapter *adapte
/* enable flow control to be programmed */
hw-fc.send_xon = 1;

-   atomic_set(adapter-irq_sem, 1);
spin_lock_init(adapter-tx_lock);

return 0;
@@ -1755,7 +1751,6 @@ ixgb_intr(int irq, void *data)
  of the posted write is intentionally left out.
*/

-   atomic_inc(adapter-irq_sem);
IXGB_WRITE_REG(adapter-hw, IMC, ~0);
__netif_rx_schedule(netdev);
}

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix atl1 braino

2007-02-13 Thread Chris Snook

Al Viro wrote:

Spot the bug...

Signed-off-by: Al Viro [EMAIL PROTECTED]
---

diff --git a/drivers/net/atl1/atl1_hw.c b/drivers/net/atl1/atl1_hw.c
index 08b2d78..e28707a 100644
--- a/drivers/net/atl1/atl1_hw.c
+++ b/drivers/net/atl1/atl1_hw.c
@@ -357,7 +357,7 @@ void atl1_hash_set(struct atl1_hw *hw, u32 hash_value)
 */
hash_reg = (hash_value  31)  0x1;
hash_bit = (hash_value  26)  0x1F;
-   mta = ioread32((hw + REG_RX_HASH_TABLE) + (hash_reg  2));
+   mta = ioread32((hw-hw_addr + REG_RX_HASH_TABLE) + (hash_reg  2));
mta |= (1  hash_bit);
iowrite32(mta, (hw-hw_addr + REG_RX_HASH_TABLE) + (hash_reg  2));
 }


ACK.

Thanks for catching this.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >