Re: [PATCH net] atl1c: fix error return code in atl1c_probe()

2020-11-17 Thread Chris Snook
On Tue, Nov 17, 2020 at 1:01 AM Heiner Kallweit  wrote:
>
> Am 17.11.2020 um 08:43 schrieb Chris Snook:
> > The full text of the preceding comment explains the need:
> >
> > /*
> > * The atl1c chip can DMA to 64-bit addresses, but it uses a single
> > * shared register for the high 32 bits, so only a single, aligned,
> > * 4 GB physical address range can be used at a time.
> > *
> > * Supporting 64-bit DMA on this hardware is more trouble than it's
> > * worth.  It is far easier to limit to 32-bit DMA than update
> > * various kernel subsystems to support the mechanics required by a
> > * fixed-high-32-bit system.
> > */
> >
> > Without this, we get data corruption and crashes on machines with 4 GB
> > of RAM or more.
> >
> > - Chris
> >
> > On Mon, Nov 16, 2020 at 11:14 PM Heiner Kallweit  
> > wrote:
> >>
> >> Am 17.11.2020 um 03:55 schrieb Zhang Changzhong:
> >>> Fix to return a negative error code from the error handling
> >>> case instead of 0, as done elsewhere in this function.
> >>>
> >>> Fixes: 85eb5bc33717 ("net: atheros: switch from 'pci_' to 'dma_' API")
> >>> Reported-by: Hulk Robot 
> >>> Signed-off-by: Zhang Changzhong 
> >>> ---
> >>>  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 4 ++--
> >>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c 
> >>> b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> >>> index 0c12cf7..3f65f2b 100644
> >>> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> >>> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> >>> @@ -2543,8 +2543,8 @@ static int atl1c_probe(struct pci_dev *pdev, const 
> >>> struct pci_device_id *ent)
> >>>* various kernel subsystems to support the mechanics required by a
> >>>* fixed-high-32-bit system.
> >>>*/
> >>> - if ((dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0) ||
> >>> - (dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0)) {
> >>> + err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
> >>
> >> I wonder whether you need this call at all, because 32bit is the default.
> >> See following
> >>
> >> "By default, the kernel assumes that your device can address 32-bits
> >> of DMA addressing."
> >>
> >> in https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
> >>
> >>> + if (err) {
> >>>   dev_err(&pdev->dev, "No usable DMA configuration,aborting\n");
> >>>   goto err_dma;
> >>>   }
> >>>
> >>
>
> Please don't top-post.
> From what I've seen the kernel configures 32bit as default DMA size.
> See beginning of pci_device_add(), there the coherent mask is set to 32bit.
>
> And in pci_setup_device() see the following:
>   /*
>  * Assume 32-bit PCI; let 64-bit PCI cards (which are far rarer)
>  * set this higher, assuming the system even supports it.
>  */
> dev->dma_mask = 0xffffffff;
>
>
> That means if you would like to use 64bit DMA then you'd need to configure 
> this explicitly.
> You could check to which mask dev->dma_mask and dev->coherent_dma_mask are set
> w/o the call to dma_set_mask_and_coherent.

I don't remember the exact history with atl1c, but we really did hit
this bug with atl1 and atl2. I'm not sure if that's because this
default wasn't there or if it's because another call was
replaced with this call, but either way it's quite likely that at some
point in the future someone who doesn't even have test hardware will
try to port this to a newer interface that doesn't make the same
assumption, and bad things will happen. This isn't a hot path, so it's
better to be explicit. If dma_set_mask_and_coherent() ever takes a
long time or fails, something is seriously wrong and we probably want
to know about it before we start DMAing.
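
For illustration only, here is a minimal userspace sketch of the constraint described in the quoted driver comment. It is not driver code: SKETCH_DMA_BIT_MASK mirrors the idea of the kernel's DMA_BIT_MASK(), and the two helper functions are hypothetical names invented for this example. The first checks that a buffer is reachable under a given mask; the second checks that two addresses share their upper 32 bits, which is what a single shared high-address register would require.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DMA_BIT_MASK(n): the low n bits set. */
#define SKETCH_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: true if the buffer [addr, addr + len) is fully
 * addressable under mask (mask is assumed to be a run of low bits).
 */
bool sketch_dma_range_ok(uint64_t addr, uint64_t len, uint64_t mask)
{
    uint64_t last;

    if (len == 0)
        return true;
    last = addr + len - 1;
    if (last < addr)    /* address arithmetic wrapped around */
        return false;
    return last <= mask;
}

/*
 * Hypothetical helper: true if two bus addresses share their upper 32
 * bits, i.e. both fit the single aligned 4 GB window that one shared
 * high-address register can describe.
 */
bool sketch_same_4g_window(uint64_t a, uint64_t b)
{
    return (a >> 32) == (b >> 32);
}
```

With a 32-bit mask every buffer trivially lands in the same window (upper bits all zero), which is why limiting the device to 32-bit DMA sidesteps the shared-register problem entirely.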

- Chris


Re: [PATCH net] atl1c: fix error return code in atl1c_probe()

2020-11-16 Thread Chris Snook
The full text of the preceding comment explains the need:

/*
* The atl1c chip can DMA to 64-bit addresses, but it uses a single
* shared register for the high 32 bits, so only a single, aligned,
* 4 GB physical address range can be used at a time.
*
* Supporting 64-bit DMA on this hardware is more trouble than it's
* worth.  It is far easier to limit to 32-bit DMA than update
* various kernel subsystems to support the mechanics required by a
* fixed-high-32-bit system.
*/

Without this, we get data corruption and crashes on machines with 4 GB
of RAM or more.

- Chris

On Mon, Nov 16, 2020 at 11:14 PM Heiner Kallweit  wrote:
>
> Am 17.11.2020 um 03:55 schrieb Zhang Changzhong:
> > Fix to return a negative error code from the error handling
> > case instead of 0, as done elsewhere in this function.
> >
> > Fixes: 85eb5bc33717 ("net: atheros: switch from 'pci_' to 'dma_' API")
> > Reported-by: Hulk Robot 
> > Signed-off-by: Zhang Changzhong 
> > ---
> >  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c 
> > b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > index 0c12cf7..3f65f2b 100644
> > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > @@ -2543,8 +2543,8 @@ static int atl1c_probe(struct pci_dev *pdev, const 
> > struct pci_device_id *ent)
> >* various kernel subsystems to support the mechanics required by a
> >* fixed-high-32-bit system.
> >*/
> > - if ((dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0) ||
> > - (dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0)) {
> > + err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
>
> I wonder whether you need this call at all, because 32bit is the default.
> See following
>
> "By default, the kernel assumes that your device can address 32-bits
> of DMA addressing."
>
> in https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
>
> > + if (err) {
> >   dev_err(&pdev->dev, "No usable DMA configuration,aborting\n");
> >   goto err_dma;
> >   }
> >
>


Re: [PATCH 0/3] net: ethernet: atheros: atlx: Use PCI generic definitions instead of private duplicates

2019-06-21 Thread Chris Snook
On Fri, Jun 21, 2019 at 11:33 AM Joe Perches  wrote:
>
> On Fri, 2019-06-21 at 13:12 -0500, Bjorn Helgaas wrote:
> > On Fri, Jun 21, 2019 at 12:27 PM Joe Perches  wrote:
> []
> > > Subsystem specific local PCI #defines without generic
> > > naming is poor style and makes treewide grep and
> > > refactoring much more difficult.
> >
> > Don't worry, we have the same objectives.  I totally agree that local
> > #defines are a bad thing, which is why I proposed this project in the
> > first place.
>
> Hi again Bjorn.
>
> I didn't know that was your idea.  Good idea.
>
> > I'm just saying that this is a "first-patch" sort of learning project
> > and I think it'll avoid some list spamming and discouragement if we
> > can figure out the scope and shake out some of the teething problems
> > ahead of time.  I don't want to end up with multiple versions of
> > dozens of little 2-3 patch series posted every week or two.
>
> Great, that's sensible.
>
> > I'd rather be able to deal with a whole block of them at one time.
>
> Also very sensible.
>
> > > 2: Show that you compiled the object files and verified
> > >where possible that there are no object file changes.
> >
> > Do you have any pointers for the best way to do this?  Is it as simple
> > as comparing output of "objdump -d"?
>
> Generically, yes.
>
> I have a little script that does the equivalent of:
>
> 
> make 
> mv  .old
> patch -P1 < 
> make 
> mv  .new
> diff -urN <(objdump -d .old) <(objdump -d .new)
>
> But it's not foolproof as gcc does not guarantee
> compilation repeatability.
>
> And some subsystems Makefiles do not allow per-file
> compilation.
>

This should work, but be aware that the older atlx drivers did some
regrettable things with file structure, so not all .c files are
expected to generate a corresponding .o file.

- Chris


Re: [PATCH] [trivial] treewide: Fix company name in module descriptions

2014-10-16 Thread Chris Snook
On Thu, Oct 16, 2014 at 8:09 AM, Masanari Iida  wrote:
> This patch fix company name's spelling typo in module descriptions
> and a Kconfig.
>
> Signed-off-by: Masanari Iida 

> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c 
> b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 72fb86b..c9946c6 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -48,7 +48,7 @@ MODULE_DEVICE_TABLE(pci, atl1c_pci_tbl);
>
>  MODULE_AUTHOR("Jie Yang");
>  MODULE_AUTHOR("Qualcomm Atheros Inc., ");
> -MODULE_DESCRIPTION("Qualcom Atheros 100/1000M Ethernet Network Driver");
> +MODULE_DESCRIPTION("Qualcomm Atheros 100/1000M Ethernet Network Driver");
>  MODULE_LICENSE("GPL");
>  MODULE_VERSION(ATL1C_DRV_VERSION);
>

Acked-by: Chris Snook 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Performance problems with 3ware 9500S-4LP and 2.6.25-rc3

2008-02-26 Thread Chris Snook

Andre Noll wrote:
> we are experiencing massive performance problems with two of our
> Linux servers that contain 3ware controllers on a Tyan mainboard and
> a couple of 1T disks.
>
> During the daily cron job that uses rsync to sync a 500G file system
> from another machine to the raid on the 3ware controller the load
> jumps up, and the machine becomes sluggish as hell. For example, an
> ssh login to that machine takes minutes to complete and ldap becomes
> unreliable while the rsync job is running. Even Nagios complains
> about the machine being down while rsync is running.


You're putting your box under astronomical load.  This is generally 
regarded as a bad idea, regardless of how well your storage controller 
is performing.  Can you measure the single-threaded throughput (say, 
copying one huge file, and then syncing) to give us a baseline 
performance figure?  rsync will happily peg your box, your network, and 
your cat if you let it.


-- Chris


[PATCH] MARKERS depends on MODULES

2008-02-15 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Make MARKERS depend on MODULES to prevent build failures with certain configs.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff --git a/init/Kconfig b/init/Kconfig
index dcef8b5..933df15 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -729,6 +729,7 @@ config PROFILING
 
 config MARKERS
bool "Activate markers"
+   depends on MODULES
help
  Place an empty function call at each marker site. Can be
  dynamically changed for a probe function.


[PATCH] make LKDTM depend on BLOCK

2008-02-15 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Make LKDTM depend on BLOCK to prevent build failures with certain configs.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a370fe8..24b327c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -524,6 +524,7 @@ config LKDTM
tristate "Linux Kernel Dump Test Tool Module"
depends on DEBUG_KERNEL
depends on KPROBES
+   depends on BLOCK
default n
help
This module enables testing of the different dumping mechanisms by


Re: linux-next build status

2008-02-14 Thread Chris Snook

Tony Breeds wrote:
> On Thu, Feb 14, 2008 at 08:24:27PM -0500, Chris Snook wrote:
> > Stephen Rothwell wrote:
> > > Hi all,
> > >
> > > Initial status can be seen here
> > > http://kisskb.ellerman.id.au/kisskb/branch/9/ (I hope to make a better
> > > URL soon).  Suggestions for more compiler/config combinations are
> > > welcome, but we can't necessarily commit to fulfilling all you
> > > wishes.  :-)
> >
> > i386 allmodconfig please.
>
> Wont i386 allmodconfig be equivalent to x86_64 allmodconfig?


Only if there are no bugs.

Driver code is most likely to trip over bitness/endianness bugs, and 
you've already got allmodconfig builds for be32, be64, and le64 
architectures.  Adding an le32 architecture (i386) completes the 
coverage of these basic categories.
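
As a hedged illustration of the kind of bug in question, the userspace sketch below (function names are invented for this example) writes a 32-bit value into a byte buffer in an explicitly little-endian layout. Code that instead copies the integer's in-memory bytes directly would produce a different layout on a big-endian machine, and code that stores the value in a `long` changes size between i386 (4 bytes) and x86_64 (8 bytes). Building for one architecture of each class is what exposes such assumptions.

```c
#include <stdint.h>

/*
 * Store v in little-endian byte order regardless of host endianness.
 * Shifting and masking operate on the value, not on its memory layout,
 * so the result is identical on le32, le64, be32 and be64 hosts.
 */
void sketch_put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Inverse of sketch_put_le32(): rebuild the value from its LE bytes. */
uint32_t sketch_get_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] |
           ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) |
           ((uint32_t)in[3] << 24);
}
```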


-- Chris


Re: linux-next build status

2008-02-14 Thread Chris Snook

Stephen Rothwell wrote:
> Hi all,
>
> Initial status can be seen here
> http://kisskb.ellerman.id.au/kisskb/branch/9/ (I hope to make a better
> URL soon).  Suggestions for more compiler/config combinations are
> welcome, but we can't necessarily commit to fulfilling all you
> wishes.  :-)



i386 allmodconfig please.

Also, I highly recommend adding some randconfig builds, at least one 32-bit arch 
and one 64-bit arch.  Any given randconfig build is not particularly likely to 
catch bugs that would be missed elsewhere, but doing them daily for two months 
will catch a lot of things before they get released.  The catch, of course, is 
that you have to actually save the .config for this to be useful, which might 
require a slight modification to your scripts.


-- Chris


Re: 2.6.25-rc1 panics on boot

2008-02-13 Thread Chris Snook

Dhaval Giani wrote:
> I am getting the following oops on bootup on 2.6.25-rc1
>
> ...
>
> I am booting using kexec with maxcpus=1. It does not have any problems
> with maxcpus=2 or higher.


Sounds like another (the same?) kexec cpu numbering bug.  Can you 
post/link the entire dmesg from both a cold boot and a kexec boot so we 
can compare?


-- Chris


Re: log spamming

2008-02-01 Thread Chris Snook

Gene Heskett wrote:
> Greetings;
>
> I just rebooted to a new config of 2.6.24, basically trying to strip out the
> building of modules I don't use.  And I enabled a couple of checks that
> weren't checked in the kernel-hacking menu.  .config posted on request.
>
> Now the messages log is being spammed at 2-5 second intervals by these:
> Feb  1 10:41:08 coyote kernel: [ 3085.501037] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
> Feb  1 10:41:08 coyote kernel: [ 3085.501042] Bank 1: d4004152
> Feb  1 10:41:08 coyote kernel: [ 3085.501045] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
> Feb  1 10:41:08 coyote kernel: [ 3085.501048] Bank 2: d400417a
>
> Always the same 2 addresses.  Is this telling me I should be running memtest86
> for a couple of cycles?


Those two addresses are in the same cache line, but they are *not* in the same 
128-bit ECC block.  This is probably a northbridge problem, not a RAM problem. 
It's not necessarily a hardware problem.  I wouldn't be surprised if you swapped 
CPUs and still got the same result, due to BIOS misconfiguration.
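
The arithmetic behind that observation can be sketched as follows. This is an illustration, not a diagnostic tool: it treats the two logged values as addresses, as the reply does, and assumes a 64-byte cache line (the line size is not stated in the thread). The helper name is invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a and b fall inside the same naturally aligned block of
 * block_bytes bytes. block_bytes must be a power of two.
 */
bool sketch_same_aligned_block(uint64_t a, uint64_t b, uint64_t block_bytes)
{
    uint64_t mask = ~(block_bytes - 1);

    return (a & mask) == (b & mask);
}
```

Called with 0xd4004152 and 0xd400417a, the helper reports the same block for a 64-byte granule (both round down to 0xd4004140) but different blocks for a 16-byte, i.e. 128-bit, granule (0xd4004150 versus 0xd4004170).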


-- Chris


Re: how to get chance for user space process even when the kernel is utilizing 100% CPU.

2008-02-01 Thread Chris Snook

veerasena reddy wrote:
> I have a requirement where i need to execute a user process even when
> the kernel is utilizing 100% of CPU time.


In the realtime kernel, hardware interrupt handlers are prioritized 
threads, so you can give the userspace process a higher realtime priority.
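
A minimal sketch of what raising a process's realtime priority looks like from userspace, using the standard POSIX scheduling calls. The helper names are invented for this example, the priority you would actually need depends on the priorities of the interrupt threads on your system (not specified in this thread), and the call normally requires root or CAP_SYS_NICE.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <sys/types.h>

/* Clamp a requested priority into the valid SCHED_FIFO range; -1 on error. */
int sketch_clamp_fifo_priority(int requested)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo < 0 || hi < 0)
        return -1;
    if (requested < lo)
        return lo;
    if (requested > hi)
        return hi;
    return requested;
}

/*
 * Try to switch process pid (0 means the caller) to SCHED_FIFO at the
 * requested priority. Returns 0 on success, -1 with errno set otherwise
 * (typically EPERM when run without the needed privilege).
 */
int sketch_make_realtime(pid_t pid, int requested)
{
    struct sched_param sp = { .sched_priority = 0 };
    int prio = sketch_clamp_fifo_priority(requested);

    if (prio < 0)
        return -1;
    sp.sched_priority = prio;
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

The same effect is available from a shell with chrt(1), which avoids modifying the program itself.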


-- Chris


Re: How does ext2 implement sparse files?

2008-01-31 Thread Chris Snook

Lars Noschinski wrote:
> Hello!
>
> For an university project, we had to write a toy filesystem (ext2-like),
> for which I would like to implement sparse file support. For this, I
> digged through the ext2 source code; but I could not find the point,
> where ext2 detects holes.
>
> As far as I can see from fs/buffer.c, an hole is a buffer_head which is
> not mapped, but uptodate. But I cannot find a relevant source line,
> where ext2 makes usage of this information.


In ext2 (and most other block filesystems) all files are sparse files. 
If you write to an address in the file for which no block is allocated, 
the filesystem allocates a block and writes the contents to disk, 
regardless of whether that block is at the end of the file (the usual 
case of lengthening a non-sparse file), in the middle of the file 
(filling in holes in a sparse file), or past the end of the file 
(making a file sparse).
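
The "past the end of the file" case can be observed from userspace with a short sketch like the one below. It is only an illustration of the behaviour described above, not ext2 code; the function name is invented for this example. It seeks beyond the end of a fresh file, writes a single byte, and returns the resulting stat information.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or truncate) the file at path, skip hole_bytes without
 * writing anything, then write one byte. On success returns 0 and
 * fills *st; on any failure returns -1.
 */
int sketch_make_sparse_file(const char *path, off_t hole_bytes,
                            struct stat *st)
{
    int rc = -1;
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    if (lseek(fd, hole_bytes, SEEK_SET) == hole_bytes &&
        write(fd, "x", 1) == 1 &&
        fstat(fd, st) == 0)
        rc = 0;
    close(fd);
    return rc;
}
```

After a successful call, st_size is hole_bytes + 1. On a filesystem that supports holes, st_blocks (counted in 512-byte units) is usually far smaller than the size would suggest, because no blocks were allocated for the skipped range; the exact figure depends on the filesystem.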


-- Chris


Re: about relocs.c on x86

2008-01-31 Thread Chris Snook

Yinghai Lu wrote:
> On Jan 31, 2008 12:33 AM, Chris Snook <[EMAIL PROTECTED]> wrote:
> > Yinghai Lu wrote:
> > > why not rename relocs.c to relocs_32.c?
> >
> > Because we're trying to get rid of all the _32 and _64 files?
>
> but that file is not need for x86_64


Which means there's no conflict with any 64-bit code, and thus no reason 
to break it out into a _32 file.


-- Chris


Re: about relocs.c on x86

2008-01-31 Thread Chris Snook

Yinghai Lu wrote:
> why not rename relocs.c to relocs_32.c?


Because we're trying to get rid of all the _32 and _64 files?

-- Chris


Re: Strange error?

2008-01-30 Thread Chris Snook

Gene Heskett wrote:
> Greetings all;
>
> This line showed up in my log a couple of hours ago, several minutes removed
> from anything else I was doing at the time:
>
> rarian-sk-get-c[31855]: segfault at  eip 00b7c153 esp bf9ddf0c error 4
>
> The system acts and feels normal.
>
> Does anyone have a clue to loan me?


I would ask the rarian developers:

http://rarian.freedesktop.org/

My barely-educated guess is that Gnome was doing a routine re-index of 
its help files and the app got bored and decided to dereference a 
NULL pointer for fun.  Your desktop documentation index may be 
incomplete or corrupt.  Try not to panic.
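
For readers wondering what the trailing "error 4" in that log line means: on x86 it is the page-fault error code, whose low three bits can be decoded as in the sketch below. The struct and function names are invented for this example, and only the three lowest bits are handled. The fault address itself is missing from the archived log line, so the sketch decodes the error code alone.

```c
#include <stdbool.h>

/* Decoded view of the low three bits of an x86 page-fault error code. */
struct sketch_fault_info {
    bool protection_violation; /* bit 0: 1 = protection fault, 0 = page not present */
    bool write;                /* bit 1: 1 = write access, 0 = read access */
    bool user_mode;            /* bit 2: 1 = fault raised in user mode */
};

struct sketch_fault_info sketch_decode_pf_error(unsigned long error_code)
{
    struct sketch_fault_info info;

    info.protection_violation = (error_code & 0x1) != 0;
    info.write = (error_code & 0x2) != 0;
    info.user_mode = (error_code & 0x4) != 0;
    return info;
}
```

An error code of 4 therefore decodes to a user-mode read of a page that is not present, which is consistent with the NULL-pointer guess above.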


-- Chris


Purpose of numa_node?

2008-01-30 Thread Chris Snook
While pondering ways to optimize I/O and swapping on large NUMA machines, I 
noticed that the numa_node field in struct device isn't actually used anywhere.
We just have a couple dozen lines of code to conditionally create a sysfs file 
that will always return -1.  Is anyone even working on code to actually use this 
field?  I think it's a good piece of information to keep track of, so I'm not 
suggesting we remove it, but I want to make sure I'm not stepping on toes or 
duplicating effort if I try to make it useful.
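
For anyone who wants to check what their own system reports, a small userspace sketch for reading such an attribute follows. The helper name is invented for this example, and the path to pass in is deliberately left to the caller: it would be the numa_node file under the device's sysfs directory, whose exact location depends on the bus and device.

```c
#include <stdio.h>

/*
 * Read one decimal integer from a sysfs-style text file.
 * Returns 0 and stores the value on success, -1 on any failure.
 */
int sketch_read_int_attr(const char *path, int *value)
{
    int rc;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    rc = (fscanf(f, "%d", value) == 1) ? 0 : -1;
    fclose(f);
    return rc;
}
```

A value of -1 is what the message above describes: the file exists, but no node has been recorded for the device.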


-- Chris


Purpose of numa_node?

2008-01-30 Thread Chris Snook
While pondering ways to optimize I/O and swapping on large NUMA machines, I 
noticed that the numa_node field in struct device isn't actually used anywhere. 
 We just have a couple dozen lines of code to conditionally create a sysfs file 
that will always return -1.  Is anyone even working on code to actually use this 
field?  I think it's a good piece of information to keep track of, so I'm not 
suggesting we remove it, but I want to make sure I'm not stepping on toes or 
duplicating effort if I try to make it useful.


-- Chris
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-24 Thread Chris Snook

Al Boldi wrote:

Greetings!

data=ordered mode has proven reliable over the years, and it does this by 
ordering filedata flushes before metadata flushes.  But this sometimes 
causes contention in the order of a 10x slowdown for certain apps, either 
due to the misuse of fsync or due to inherent behaviour like db's, as well 
as inherent starvation issues exposed by the data=ordered mode.


data=writeback mode alleviates data=ordered mode slowdowns, but only works 
per-mount and is too dangerous to run as a default mode.


This RFC proposes to introduce a tunable that disables fsync and changes 
ordered into writeback writeout on a per-process basis, like this:


  echo 1 > /proc/`pidof process`/softsync


Your comments are much welcome!


This is basically a kernel workaround for stupid app behavior.  It wouldn't be 
the first time we've provided such an option, but we shouldn't do it without a 
very good justification.  At the very least, we need a test case that 
demonstrates the problem and benchmark results that prove that this approach 
actually fixes it.  I suspect we can find a cleaner fix for the problem.
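
As a starting point for such a test case, something along these lines might do. It is only a sketch: the function name time_writes, the 4 KB block size, and the idea of comparing a run with fsync() after every write against a run without it are my assumptions, not anything from the RFC.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Write `count` 4 KB blocks to `path`, optionally calling fsync()
 * after each block.  Returns the elapsed wall-clock time in seconds,
 * or -1.0 if the file cannot be opened or a write/fsync fails.
 */
double time_writes(const char *path, int count, int sync_each)
{
	char block[4096];
	struct timespec t0, t1;
	int fd, i;

	memset(block, 'x', sizeof(block));
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1.0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < count; i++) {
		if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block) ||
		    (sync_each && fsync(fd) != 0)) {
			close(fd);
			return -1.0;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Running it with sync_each set to 1 and to 0 on the same ext3 mount, with and without competing I/O, would at least put a number on the slowdown being claimed.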


-- Chris


Re: [PATCH 09/26] atl1: refactor tx processing

2008-01-24 Thread Chris Snook

Jay Cliburn wrote:

On Tue, 22 Jan 2008 18:31:09 -0600
Jay Cliburn <[EMAIL PROTECTED]> wrote:


On Tue, 22 Jan 2008 04:58:17 -0500
Jeff Garzik <[EMAIL PROTECTED]> wrote:


[...]

for such a huge patch, this description is very tiny.  [describe]
what is refactored, and why.


Is this one any better?


This satisfies me.

Acked-by: Chris Snook <[EMAIL PROTECTED]>


From df475e2eea401f9dc18ca23dab538b99fb9e710c Mon Sep 17 00:00:00 2001
From: Jay Cliburn <[EMAIL PROTECTED]>
Date: Wed, 23 Jan 2008 21:36:36 -0600
Subject: [PATCH] atl1: simplify tx packet descriptor

The transmit packet descriptor consists of four 32-bit words, with word 3
upper bits overloaded depending upon the condition of its bits 3 and 4.
The driver currently duplicates all word 2 and some word 3 register bit
definitions unnecessarily and also uses a set of nested structures in its
definition of the TPD without good cause. This patch adds a lengthy
comment describing the TPD, eliminates duplicate TPD bit definitions,
and simplifies the TPD structure itself. It also expands the TSO check
to correctly handle custom checksum versus TSO processing using the revised
TPD definitions. Finally, shorten some variable names in the transmit
processing path to reduce line lengths, rename some variables to better
describe their purpose (e.g., nseg versus m), and add a comment or two
to better describe what the code is doing.

Signed-off-by: Jay Cliburn <[EMAIL PROTECTED]>


Re: [PATCH 06/26] atl1: update initialization parameters

2008-01-22 Thread Chris Snook

Jay Cliburn wrote:

On Tue, 22 Jan 2008 04:56:11 -0500
Jeff Garzik <[EMAIL PROTECTED]> wrote:


[EMAIL PROTECTED] wrote:

From: Jay Cliburn <[EMAIL PROTECTED]>

Update initialization parameters to match the current vendor driver
version 1.2.40.2.


[...]

ACK without any better knowledge...  but is any additional insight 
available at all?


No, sorry Jeff.  I simply took the vendor's current driver and matched
his initialization settings.  I can only assume he discovered these
values through lab testing.

For this and the other "conform to vendor driver" patches in this set, I
thought it important to have the in-tree driver match the vendor driver
as closely as possible.  The primary motivations are (1) my belief that
he's in a better position to test the NIC, and (2) to be able to go to
him for assistance occasionally and not be rejected because of
significant differences between his and our drivers.


I don't think we should be doing this without justification.  From all the atl1 
and atl2 code I've looked at, I've gotten the impression that their driver 
development processes are extremely ad-hoc.  There is code in the Atheros 
version of atl2 that cannot *possibly* apply to that hardware and was just 
copied and pasted from atl1, just as much of atl1 was copied and pasted from 
e1000.  The fact that various versions have different magic numbers may simply 
mean they copied and pasted from different irrelevant and incorrect sources.


Our contacts at Atheros seem to be very good electrical engineers, so when they 
tell us that a certain setting should be changed to match particular properties 
of the hardware, I trust them.  They are not, however, experienced and 
disciplined kernel developers, so absent such justification I think we should 
stick with what we have, which has been improved and reviewed by people who 
*are* experienced and disciplined kernel developers.


We have at least as much to teach Atheros about writing kernel code as they have 
to teach us about their hardware.


-- Chris


Re: Usage semantics of atomic_set ( )

2008-01-11 Thread Chris Snook

Vineet Gupta wrote:

I'm trying to implement atomic ops for a CPU which has no inherent
support for Read-Modify-Write Ops. Instead of using a global spin lock
which protects all the atomic APIs, I want to use a spin lock per
instance of atomic_t.


What operations are you using to implement spinlocks?

A few architectures use arrays of spinlocks to implement atomic_t.  I believe 
sparc and parisc are among them.  Assuming your spinlock implementation is sound 
and efficient, the same technique should work for you.
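
A rough userspace sketch of the hashed-lock idea, for illustration only.  The names (sketch_atomic_t, lock_for, NR_LOCKS) are made up, and C11 atomic_flag stands in for whatever primitive your spinlocks are really built on; it is not how any particular architecture does it.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_LOCKS 4	/* power of two; illustrative value */

typedef struct { int counter; } sketch_atomic_t;

/* One small table of locks shared by every sketch_atomic_t. */
static atomic_flag lock_table[NR_LOCKS] = {
	ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT,
	ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};

/* Pick a lock by hashing the address of the variable. */
static atomic_flag *lock_for(const sketch_atomic_t *v)
{
	return &lock_table[((uintptr_t)v >> 4) & (NR_LOCKS - 1)];
}

int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
	atomic_flag *l = lock_for(v);
	int ret;

	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin until the lock is free */
	v->counter += i;
	ret = v->counter;
	atomic_flag_clear_explicit(l, memory_order_release);
	return ret;
}

/*
 * The store takes the same lock; a plain store could otherwise land
 * in the middle of a locked read-modify-write and be lost.
 */
void sketch_atomic_set(sketch_atomic_t *v, int i)
{
	atomic_flag *l = lock_for(v);

	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
	v->counter = i;
	atomic_flag_clear_explicit(l, memory_order_release);
}
```

Compared with one lock per atomic_t, the table costs no extra space in each variable, at the price of occasional false contention when two variables hash to the same lock.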


-- Chris


Re: Strange NFS write performance Linux->Solaris-10/VXFS, maybe VW related

2007-12-28 Thread Chris Snook

Martin Knoblauch wrote:

Hi,

currently I am tracking down an "interesting" effect when writing to a
Solaris-10/Sparc based server. The server exports two filesystems. One UFS,
one VXFS. The filesystems are mounted NFS3/TCP, no special options. Linux
kernel in question is 2.6.24-rc6, but it happens with earlier kernels
(2.6.19.2, 2.6.22.6) as well. The client is x86_64 with 8 GB of RAM.

The problem: when writing to the VXFS based filesystem, performance drops
dramatically when the filesize reaches or exceeds "dirty_ratio". For a
dirty_ratio of 10% (about 800MB) files below 750 MB are transferred with about
30 MB/sec. Anything above 770 MB drops down to below 10 MB/sec. If I perform
the same tests on the UFS based FS, performance stays at about 30 MB/sec
until 3GB and likely larger (I just stopped at 3 GB).

Any ideas what could cause this difference? Any suggestions on debugging it?


1) Try normal NFS tuning, such as rsize/wsize tuning.

2) You're entering synchronous writeback mode, so you can delay the problem by 
raising dirty_ratio to 100, or reduce the size of the problem by lowering 
dirty_ratio to 1.  Either one could help.


3) It sounds like the bottleneck is the vxfs filesystem.  It only *appears* on 
the client side because writes up until dirty_ratio get buffered on the client. 
 If you can confirm that the server is actually writing stuff to disk slower 
when the client is in writeback mode, then it's possible the Linux NFS client is 
doing something inefficient in writeback mode.
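
To see roughly where the cliff should move for a given dirty_ratio, the arithmetic is simple enough to sketch.  This treats the threshold as a plain percentage of total RAM, which is an approximation: the kernel works from the memory it considers dirtyable, so the real figure will be somewhat lower.  The function name is made up.

```c
#include <stdint.h>

/*
 * Approximate number of dirty bytes at which writers are throttled,
 * taken as dirty_ratio_pct percent of mem_bytes.
 */
uint64_t dirty_threshold_bytes(uint64_t mem_bytes, unsigned int dirty_ratio_pct)
{
	return mem_bytes / 100 * dirty_ratio_pct;
}
```

For the 8 GB client at 10% this gives a bit under 860 million bytes, which is in the same ballpark as the 750-770 MB file sizes where the slowdown was reported.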


-- Chris


Re: [PATCH] drivers/net/: Spelling fixes

2007-12-17 Thread Chris Snook

Joe Perches wrote:

 drivers/net/atl1/atl1_hw.c     |    2 +-
 drivers/net/atl1/atl1_main.c   |    2 +-


The atl1 code will be heavily reworked in the 2.6.25 merge window, so this may 
cause headaches.  Please remove these chunks before merging.


The spelling corrections themselves are fine, and I will ensure that the revised 
driver includes them, if the comments in question are still present at all once 
we're done with all the changes and cleanups.


-- Chris


Re: Linux Kernel - Future works

2007-12-04 Thread Chris Snook

Muhammad Nowbuth wrote:

Hi all,

Could anyone give some ideas of future pending works which are needed
on the linux kernel?


http://kernelnewbies.org/KernelHacking


Re: Kernel Development & Objective-C

2007-11-30 Thread Chris Snook

Ben Crowhurst wrote:

Has Objective-C ever been considered for kernel development?


No.  Kernel programming requires what is essentially assembly language with a 
lot of syntactic sugar, which C provides.  Higher-level languages abstract away 
too much detail to be suitable for the sort of bit-perfect control you need when 
you're directly controlling bare metal.  You can still use object-oriented 
programming techniques in C, and we do this all the time in the kernel, but we 
do so with more fine-grained explicit control than a language like Objective-C 
would give us.  More to the point, if we tried to use Objective-C, we'd find 
ourselves needing to fall back to C-style explicitness so often that it wouldn't 
be worth the trouble.
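
For the curious, the usual pattern looks something like the sketch below: a table of function pointers plays the role of the virtual method table, and the "subclass" embeds the base structure.  The shape/rect names are invented for the example; in the kernel the same idea shows up as ops structures such as struct file_operations, together with container_of().

```c
#include <stddef.h>

struct shape;

/* The "method table": one function pointer per virtual method. */
struct shape_ops {
	long (*area)(const struct shape *s);
};

/* The "base class" carries only a pointer to its method table. */
struct shape {
	const struct shape_ops *ops;
};

/* A "subclass" embeds the base object and adds its own fields. */
struct rect {
	struct shape base;
	long w, h;
};

long rect_area(const struct shape *s)
{
	/* Step back from the embedded member to the containing rect. */
	const struct rect *r = (const struct rect *)
		((const char *)s - offsetof(struct rect, base));

	return r->w * r->h;
}

const struct shape_ops rect_ops = { .area = rect_area };

/* Callers dispatch through the table without knowing the real type. */
long shape_area(const struct shape *s)
{
	return s->ops->area(s);
}
```

Everything the compiler would do behind your back in a higher-level language is spelled out here, which is exactly the kind of explicit control I mean.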


In other news, I hear Hurd boots again!

-- Chris


Re: [PATCH] Avoid overflows in kernel/time.c

2007-11-29 Thread Chris Snook

H. Peter Anvin wrote:

NOTE: This patch uses a bc(1) script to compute the appropriate
constants.


Perhaps dc would be more appropriate?  That's included in busybox.

-- Chris


Re: [2.6.22.y][PATCH] atl1: disable broken 64-bit DMA

2007-11-26 Thread Chris Snook

Jay Cliburn wrote:

atl1: disable broken 64-bit DMA

[ Upstream commit: 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 ]

The L1 network chip can DMA to 64-bit addresses, but multiple descriptor
rings share a single register for the high 32 bits of their address, so
only a single, aligned, 4 GB physical address range can be used at a time.
As a result, we need to confine the driver to a 32-bit DMA mask, otherwise
we see occasional data corruption errors in systems containing 4 or more
gigabytes of RAM.

Signed-off-by: Jay Cliburn <[EMAIL PROTECTED]>
Cc: Luca Tettamanti <[EMAIL PROTECTED]>
Cc: Chris Snook <[EMAIL PROTECTED]>


Acked-By: Chris Snook <[EMAIL PROTECTED]>
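
To make the constraint concrete, here is a small sketch of the condition every DMA buffer would have to satisfy if 64-bit addressing were kept.  The helper names are hypothetical and this is plain userspace C, not driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if [addr, addr + len) lies entirely inside one aligned 4 GB
 * window, i.e. every byte has the same upper 32 address bits.
 * len must be greater than zero.
 */
bool fits_one_4g_window(uint64_t addr, uint64_t len)
{
	uint64_t last = addr + len - 1;

	return (addr >> 32) == (last >> 32);
}

/* True if two addresses could share a single "high 32 bits" register. */
bool share_high32(uint64_t a, uint64_t b)
{
	return (a >> 32) == (b >> 32);
}
```

With a 32-bit DMA mask the upper 32 bits are always zero, so both checks hold trivially for every ring and buffer, which is why the mask is the simple fix.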


Re: PROBLEM: IM Kernel Failure 12/11/07

2007-11-14 Thread Chris Snook

[EMAIL PROTECTED] wrote:
	Linux version 2.4.9-e.38smp ([EMAIL PROTECTED]) (gcc 
version 2.96 2731 (Red Hat Linux 7.2 2.96-124.7.2)) #1 SMP Wed Feb 
11 00:09:01 EST 2004


Ancient vendor kernels are very out of scope for this mailing list.  The 
following links may be useful:


https://bugzilla.redhat.com/
https://www.redhat.com/apps/support/
http://www.redhat.com/mailman/listinfo


Re: Strange delays / what usually happens every 10 min?

2007-11-13 Thread Chris Snook

Florian Boelstler wrote:

While running that test driver a delay of about 10ms _exactly_ occurs
every 10 minutes.


This is precisely the sort of thing that BIOS/firmware-level SMI handlers do, 
particularly those that have monitoring or management features.  Try to 
determine if the kernel is doing anything during this time.  If the entire 
kernel seems to be frozen, talk to the people who wrote the firmware.
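
One crude way to look for such stalls from userspace is to spin on the clock and record the largest jump between consecutive readings.  This is only a sketch with a made-up function name, and it cannot tell an SMI from ordinary preemption or interrupt handling, so treat a large gap as a hint rather than proof.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/*
 * Read the monotonic clock `iterations` times in a tight loop and
 * return the largest gap, in nanoseconds, between two consecutive
 * readings.
 */
int64_t max_clock_gap_ns(long iterations)
{
	struct timespec prev, now;
	int64_t gap, max_gap = 0;
	long i;

	clock_gettime(CLOCK_MONOTONIC, &prev);
	for (i = 0; i < iterations; i++) {
		clock_gettime(CLOCK_MONOTONIC, &now);
		gap = (int64_t)(now.tv_sec - prev.tv_sec) * 1000000000LL +
		      (now.tv_nsec - prev.tv_nsec);
		if (gap > max_gap)
			max_gap = gap;
		prev = now;
	}
	return max_gap;
}
```

If a gap of around 10ms shows up every 10 minutes even on an otherwise idle machine, that points away from the test driver and toward the platform.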


-- Chris


Re: PAGE_SIZE on 64bit and 32bit machines

2007-11-12 Thread Chris Snook

Yoav Artzi wrote:
According to my knowledge the PAGE_SIZE on 32bit architectures is 4KB. 
Logically, the PAGE_SIZE on 64bit architectures should be 8KB. That's at 
least the way I understand it. However, looking at the kernel code of 
x86_64, I see the PAGE_SIZE is 4KB.


Can anyone explain to me what I am missing here?


PAGE_SIZE is highly architecture-dependent.  While it is true that 4K pages are 
typical on 32-bit architectures, and 64-bit architectures have historically 
introduced 8K pages, this is by no means a requirement.  x86_64 uses the same 
page sizes that are available on i686+PAE, so you get 4K base pages.  alpha and 
sparc64 typically use 8K base pages, though they have other options as well. 
ia64 defaults to 16K, though it can do 4K, 8K, and a bunch of larger base sizes. 
 ppc64 does 4K and 64K.  s390 uses 4K base pages in both 31-bit and 64-bit 
kernels.  If x86_64 processors are released with TLBs that can handle 8K pages, 
it'll be straightforward to add that feature, but otherwise it would require 
faking it in software, which has lots of pitfalls and does nothing to improve 
TLB efficiency.
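
Since the answer varies by architecture and kernel configuration, userspace code should ask rather than assume.  A minimal sketch using the POSIX sysconf() call; the wrapper name is just for the example.

```c
#include <unistd.h>

/*
 * Return the base page size of the running system in bytes,
 * or -1 if it cannot be determined.
 */
long base_page_size(void)
{
	return sysconf(_SC_PAGESIZE);
}
```

On x86_64 this would be expected to report 4096; on the other architectures mentioned above it depends on how the kernel was built.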


-- Chris


Re: [poll] Is the megafreeze development model broken?

2007-11-08 Thread Chris Snook

ciol wrote:

Chris Snook wrote:


Why are you asking the developers?  We do this for the sake of the users.



The kernel is the software of the developers.


The kernel is a technology.  A distribution is a product.  When decisions about 
technology and decisions about products are made *entirely* by the same people, 
the result is never good.



It's important to know how they want it to be distributed.


For commercial distributions, the answer is: "In whichever way results in the 
largest paycheck with the least amount of stress and effort", which means doing 
it the way that's best for the customer.


Non-commercial distributions have less of this pressure, but the same principle 
applies if they care about their users.  If you're not interested in the users 
but you are interested in the technology, you should be doing your work 
upstream, so the distribution is irrelevant.


Don't get me wrong, I think stable kernel trees like 2.6.16 are a good thing. 
They serve very well a whole bunch of different niches where users are willing 
to sacrifice the support benefits of a distribution kernel for the control of an 
upstream kernel, while maintaining the stability of their installed base.  These 
users have little interest in the general-purpose distribution kernel anyway, 
aside from perhaps wishing it included some config or patch that its maintainers 
have elected not to include.


-- Chris


Re: Coding Style: indenting with tabs vs. spaces

2007-11-08 Thread Chris Snook

Benny Halevy wrote:
> Greetings,
>
> I would like to hear people's opinions about the indentation convention
> described below, which I personally have found the most practical with
> several different editors.
>
> The gist of it is that tabs should be used for nesting, not for decoration.
> Indent your code with as many tabs as your nesting level, where all statements
> will begin, and from there on use space characters.
> The rationale behind it is to be tab-width agnostic, so that regardless of
> your tab expansion setup the code will look correct and make sense.
>
> When you break a line and want the new line's text to start below a specific
> point relative to the previous line (I consider that "decorating"), start the
> new line with the same number of tabs as the previous one and then use space
> characters, since their width is the same as any character in the previous
> line (assuming fixed-width fonts, of course).


I find it meaningful to indent extended lines one extra tab stop, but beyond 
that I agree it is just decoration.


-- Chris


Re: [poll] Is the megafreeze development model broken?

2007-11-08 Thread Chris Snook

ciol wrote:
> Hi, I'd like to ask you a few questions:
>
> * Do you like the way Linux distributions integrate the kernel?
>
> * Wouldn't you prefer they ship with the stable and still maintained
>   2.6.16.X, while optionally providing the latest kernel for those who
>   want it or just have new hardware?
>
> * Do you think the megafreeze development model [1] and the "I don't
>   trust in upstream" development model are broken? (And why?)


Why are you asking the developers?  We do this for the sake of the users.

-- Chris


Re: [RFC/PATCH] Optimize zone allocator synchronization

2007-11-06 Thread Chris Snook

Don Porter wrote:
> From: Donald E. Porter <[EMAIL PROTECTED]>
>
> In the bulk page allocation/free routines in mm/page_alloc.c, the zone
> lock is held across all iterations.  For certain parallel workloads, I
> have found that releasing and reacquiring the lock for each iteration
> yields better performance, especially at higher CPU counts.  For
> instance, kernel compilation is sped up by 5% on an 8 CPU test
> machine.  In most cases, there is no significant effect on performance
> (although the effect tends to be slightly positive).  This seems quite
> reasonable for the very small scope of the change.
>
> My intuition is that this patch prevents smaller requests from waiting
> on larger ones.  While grabbing and releasing the lock within the loop
> adds a few instructions, it can lower the latency for a particular
> thread's allocation, which is often on the thread's critical path.
> Lowering the average latency for allocation can increase system throughput.
>
> More detailed information, including data from the tests I ran to
> validate this change, is available at
> http://www.cs.utexas.edu/~porterde/kernel-patch.html .
>
> Thanks in advance for your consideration and feedback.


That's an interesting insight.  My intuition is that Nick Piggin's 
recently-posted ticket spinlocks patches[1] will reduce the need for this patch, 
though it may be useful to have both.  Can you benchmark again with only ticket 
spinlocks, and with ticket spinlocks + this patch?  You'll probably want to use 
2.6.24-rc1 as your baseline, due to the x86 architecture merge.
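
For readers unfamiliar with the idea, a ticket lock hands out numbered tickets and serves waiters strictly in arrival order, so no waiter can be overtaken repeatedly. The following is a minimal user-space sketch using C11 atomics; it is not Nick Piggin's patch set (which is kernel code), and the `ticket_lock_*` names are invented for illustration:

```c
#include <stdatomic.h>

struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently allowed to hold the lock */
};

void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Take a ticket, then wait until it is being served. */
	unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
						    memory_order_relaxed);

	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		;	/* spin */
}

void ticket_lock_release(struct ticket_lock *l)
{
	/* Only the holder writes owner, so a plain load and store suffice. */
	unsigned int cur = atomic_load_explicit(&l->owner,
						memory_order_relaxed);

	atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

Because waiters are served first come, first served, a fair lock of this kind addresses some of the same waiting-time concerns as shortening the hold time, which is why benchmarking both separately and together is informative.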


-- Chris

[1] http://lkml.org/lkml/2007/11/1/123


Re: Quad core CPU detected but shows as single core in 2.6.23.1

2007-11-03 Thread Chris Snook

Zurk Tech wrote:
> dmesg (new) with disabled GART error reporting if anyone wants to
> compare to previous dmesg with GART error reporting :


A few unrelated observations about Barcelona support...


> Marking TSC unstable due to TSCs unsynchronized


This is probably wrong.  The TSC is on the northbridge on Barcelona chips, so 
every core on the die should be in sync.  Hypothetically you could have 
different speed northbridges in different sockets, but we've never tried very 
hard to support that case anyway.  We should probably be marking the TSC as 
stable on Barcelona chips.
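
One way to see what the hardware itself claims is the invariant-TSC flag in CPUID (extended leaf 0x80000007, EDX bit 8). The sketch below is a user-space illustration, assuming GCC or Clang's `<cpuid.h>` on an x86 build; it is not what the kernel's TSC synchronization check does, and `cpu_has_invariant_tsc` is a hypothetical name:

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Returns 1 if CPUID advertises an invariant TSC, 0 if it does not,
 * and -1 if the query is unavailable (non-x86 build or leaf missing).
 */
int cpu_has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 0x80000007: advanced power management information. */
	if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
		return -1;
	return (edx >> 8) & 1;	/* EDX bit 8: invariant TSC */
#else
	return -1;
#endif
}
```

The flag only describes one processor's counter; whether counters in different sockets agree with each other is a separate question.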



> xor: automatically using best checksumming function: generic_sse
>    generic_sse:  7449.000 MB/sec
> xor: using function: generic_sse (7449.000 MB/sec)


We should probably also implement an SSE5 function to take advantage of the 
128-bit SSE operations supported on newer processors.
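
The quoted log lines come from a boot-time benchmark that times candidate routines and keeps the fastest. A rough user-space sketch of that selection pattern follows; the two XOR routines and all names are illustrative stand-ins, not the kernel's implementations:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef void (*xor_fn)(uint8_t *dst, const uint8_t *src, size_t len);

/* Byte-at-a-time reference implementation. */
void xor_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] ^= src[i];
}

/* Eight bytes per step; memcpy sidesteps alignment and aliasing issues. */
void xor_words(uint8_t *dst, const uint8_t *src, size_t len)
{
	size_t i = 0;

	for (; i + 8 <= len; i += 8) {
		uint64_t d, s;

		memcpy(&d, dst + i, 8);
		memcpy(&s, src + i, 8);
		d ^= s;
		memcpy(dst + i, &d, 8);
	}
	for (; i < len; i++)	/* leftover tail bytes */
		dst[i] ^= src[i];
}

/* Runs fn over a scratch buffer `rounds` times; returns elapsed clock ticks. */
static clock_t time_xor(xor_fn fn, int rounds)
{
	static uint8_t a[4096], b[4096];
	clock_t start = clock();

	for (int i = 0; i < rounds; i++)
		fn(a, b, sizeof(a));
	return clock() - start;
}

/* Returns whichever candidate finished the timed run faster. */
xor_fn pick_fastest_xor(void)
{
	clock_t t_bytes = time_xor(xor_bytes, 2000);
	clock_t t_words = time_xor(xor_words, 2000);

	return (t_words <= t_bytes) ? xor_words : xor_bytes;
}
```

Adding a routine for a new instruction set would then amount to adding one more candidate to the timed list.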



> pnp: the driver 'system' has been registered
> pnp: match found with the PnP device '00:08' and the driver 'system'
> pnp: match found with the PnP device '00:09' and the driver 'system'
> pnp: 00:09: ioport range 0x580-0x58f has been reserved
> pnp: 00:09: ioport range 0x590-0x593 has been reserved
> pnp: 00:09: ioport range 0x700-0x703 has been reserved
> pnp: 00:09: ioport range 0xca0-0xcaf has been reserved
> pnp: 00:09: iomem range 0xfec0-0xfec00fff could not be reserved
> pnp: 00:09: iomem range 0xfec01000-0xfec01fff could not be reserved
> pnp: 00:09: iomem range 0xfec02000-0xfec02fff could not be reserved
> pnp: 00:09: iomem range 0xfee0-0xfee00fff could not be reserved
> pnp: match found with the PnP device '00:0a' and the driver 'system'
> pnp: 00:0a: ioport range 0x600-0x61f has been reserved
> pnp: 00:0a: ioport range 0x520-0x53f has been reserved
> pnp: 00:0a: ioport range 0x540-0x54f has been reserved
> pnp: 00:0a: ioport range 0x640-0x65f has been reserved
> pnp: match found with the PnP device '00:0b' and the driver 'system'
> pnp: 00:0b: iomem range 0xe000-0xefff has been reserved
> pnp: match found with the PnP device '00:0c' and the driver 'system'
> pnp: 00:0c: iomem range 0x0-0x9 could not be reserved
> pnp: 00:0c: iomem range 0x0-0x0 could not be reserved
> pnp: 00:0c: iomem range 0xe-0xf could not be reserved
> pnp: 00:0c: iomem range 0x10-0xc7ff could not be reserved
> PCI: Bridge: :01:0d.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:01.0
>   IO window: a000-bfff
>   MEM window: ff40-ff4f
>   PREFETCH window: disabled.
> PCI: Bridge: :00:06.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:07.0
>   IO window: disabled.
>   MEM window: ff50-ff5f
>   PREFETCH window: cfe0-cfef
> PCI: Bridge: :00:08.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:09.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:0a.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:0b.0
>   IO window: disabled.
>   MEM window: disabled.


Hmmm... perhaps we're not handling the new mmconfig stuff correctly?  Or maybe 
the BIOS isn't.



> hwmon-vid: Unknown VRM version of your x86 CPU
>  : Not supporting VRM 0.0


This code probably needs an update for Barcelona.


> raid6: int64x1   1920 MB/s
> raid6: int64x2   2353 MB/s
> raid6: int64x4   2331 MB/s
> raid6: int64x8   1254 MB/s
> raid6: sse2x1    2664 MB/s
> raid6: sse2x2    4214 MB/s
> raid6: sse2x4    4905 MB/s
> raid6: using algorithm sse2x4 (4905 MB/s)


An update here for SSE5 might be in order as well.

-- Chris


Re: [PATCH][REFERENCE ONLY] 9p: ramfs 9p server

2007-11-02 Thread Chris Snook

Latchesar Ionkov wrote:
> Sample ramfs file server that uses the in-kernel 9P file server support.
> This code is for reference only.


Reference code generally goes in Documentation/

-- Chris


Re: Quad core CPU detected but shows as single core in 2.6.23.1

2007-10-24 Thread Chris Snook

Zurk Tech wrote:
> Hi guys,
> I have a Tyan S3992 h2000 with a single Barcelona AMD quad-core CPU (the
> other CPU socket is empty). cat /proc/cpuinfo shows an AMD quad-core
> processor but core : 1. I've compiled the kernel from scratch with SMP and
> amd64 + the NUMA stuff. I also tried Debian etch's amd64 SMP kernel, with
> the same result.
> Is the AMD Barcelona quad-core CPU not yet supported, or is it something else?
> Thanks for any insight. I'm completely stumped. I've dealt with
> multiprocessing machines before and have a couple of dual cores which
> are fine with the exact same kernel configs. My AMD TK-53 X2 Turions show
> 2 cores in cpuinfo.


The bootstrap protocol for Barcelona is a little different from older Opterons, 
so an older BIOS that doesn't know the new protocol won't be able to bring up 
any CPU other than the bootstrap processor.  My wild guess is that this is 
what's happening and a BIOS update will fix it, but as Arjan said, please post 
dmesg when reporting bugs like this.
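
Alongside dmesg, it can help to record how many processors the running kernel knows about. A minimal sketch, assuming a system whose C library provides the `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN` extensions (glibc does); the function names are hypothetical:

```c
#include <stdio.h>
#include <unistd.h>

/* Processors the kernel has configured, whether online or not. */
long cpus_configured(void)
{
	return sysconf(_SC_NPROCESSORS_CONF);
}

/* Processors currently online and schedulable. */
long cpus_online(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}

void report_cpus(void)
{
	printf("configured: %ld, online: %ld\n",
	       cpus_configured(), cpus_online());
}
```

If both numbers are 1 on a quad-core part, the kernel never saw the other cores at all, which is consistent with (though not proof of) the firmware failing to start them.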


-- Chris


[PATCH] x86: unify div64{,_32,_64}.h

2007-10-20 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 div64.h headers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/div64_32.h b/include/asm-x86/div64_32.h
--- a/include/asm-x86/div64_32.h  2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_32.h  1969-12-31 19:00:00.0 -0500
@@ -1,52 +0,0 @@
-#ifndef __I386_DIV64
-#define __I386_DIV64
-
-#include 
-
-/*
- * do_div() is NOT a C function. It wants to return
- * two values (the quotient and the remainder), but
- * since that doesn't work very well in C, what it
- * does is:
- *
- * - modifies the 64-bit dividend _in_place_
- * - returns the 32-bit remainder
- *
- * This ends up being the most efficient "calling
- * convention" on x86.
- */
-#define do_div(n,base) ({ \
-   unsigned long __upper, __low, __high, __mod, __base; \
-   __base = (base); \
-   asm("":"=a" (__low), "=d" (__high):"A" (n)); \
-   __upper = __high; \
-   if (__high) { \
-   __upper = __high % (__base); \
-   __high = __high / (__base); \
-   } \
-   asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
-   asm("":"=A" (n):"a" (__low),"d" (__high)); \
-   __mod; \
-})
-
-/*
- * (long)X = ((long long)divs) / (long)div
- * (long)rem = ((long long)divs) % (long)div
- *
- * Warning, this will do an exception if X overflows.
- */
-#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
-
-static inline long
-div_ll_X_l_rem(long long divs, long div, long *rem)
-{
-   long dum2;
-  __asm__("divl %2":"=a"(dum2), "=d"(*rem)
-  :"rm"(div), "A"(divs));
-
-   return dum2;
-
-}
-
-extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
-#endif
diff -Nurp a/include/asm-x86/div64_64.h b/include/asm-x86/div64_64.h
--- a/include/asm-x86/div64_64.h  2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_64.h  1969-12-31 19:00:00.0 -0500
@@ -1 +0,0 @@
-#include 
diff -Nurp a/include/asm-x86/div64.h b/include/asm-x86/div64.h
--- a/include/asm-x86/div64.h   2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64.h   2007-10-20 07:32:34.0 -0400
@@ -1,5 +1,58 @@
+#ifndef _ASM_X86_DIV64_H
+#define _ASM_X86_DIV64_H
+
 #ifdef CONFIG_X86_32
-# include "div64_32.h"
-#else
-# include "div64_64.h"
-#endif
+
+#include 
+
+/*
+ * do_div() is NOT a C function. It wants to return
+ * two values (the quotient and the remainder), but
+ * since that doesn't work very well in C, what it
+ * does is:
+ *
+ * - modifies the 64-bit dividend _in_place_
+ * - returns the 32-bit remainder
+ *
+ * This ends up being the most efficient "calling
+ * convention" on x86.
+ */
+#define do_div(n,base) ({ \
+   unsigned long __upper, __low, __high, __mod, __base; \
+   __base = (base); \
+   asm("":"=a" (__low), "=d" (__high):"A" (n)); \
+   __upper = __high; \
+   if (__high) { \
+   __upper = __high % (__base); \
+   __high = __high / (__base); \
+   } \
+   asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
+   asm("":"=A" (n):"a" (__low),"d" (__high)); \
+   __mod; \
+})
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ */
+#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
+
+static inline long
+div_ll_X_l_rem(long long divs, long div, long *rem)
+{
+   long dum2;
+  __asm__("divl %2":"=a"(dum2), "=d"(*rem)
+  :"rm"(div), "A"(divs));
+
+   return dum2;
+
+}
+
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+
+# else
+#  include 
+# endif /* CONFIG_X86_32 */
+#endif /* _ASM_X86_DIV64_H */
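
The contract described in the do_div() comment (update the 64-bit dividend in place, return the 32-bit remainder) and the high/low split the macro performs can be sketched in portable C. This is an illustration of the arithmetic only, with a hypothetical function name; the real macro uses inline assembly and takes the dividend by name rather than by pointer:

```c
#include <stdint.h>

/*
 * Divides *n by base, stores the quotient back into *n, and returns
 * the remainder.  Mirrors the macro's two-step approach: divide the
 * high word first, then divide (remainder:low) in one 64/32 step.
 */
uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t low = (uint32_t)*n;
	uint32_t high = (uint32_t)(*n >> 32);
	uint32_t upper = high % base;	/* remainder of the high word */
	uint64_t partial;
	uint32_t mod;

	high /= base;

	/* upper < base, so this quotient always fits in 32 bits. */
	partial = ((uint64_t)upper << 32) | low;
	low = (uint32_t)(partial / base);
	mod = (uint32_t)(partial % base);

	*n = ((uint64_t)high << 32) | low;
	return mod;
}
```

Reducing the high word first is what guarantees that the second division cannot overflow a 32-bit quotient, which is the condition the `divl` instruction needs.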


[PATCH] x86: unify a.out{,_32,_64}.h

2007-10-20 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 a.out_32.h and a.out_64.h

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/a.out_32.h b/include/asm-x86/a.out_32.h
--- a/include/asm-x86/a.out_32.h  2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out_32.h  1969-12-31 19:00:00.0 -0500
@@ -1,27 +0,0 @@
-#ifndef __I386_A_OUT_H__
-#define __I386_A_OUT_H__
-
-struct exec
-{
-  unsigned long a_info;  /* Use macros N_MAGIC, etc for access */
-  unsigned a_text;       /* length of text, in bytes */
-  unsigned a_data;       /* length of data, in bytes */
-  unsigned a_bss;        /* length of uninitialized data area for file, in bytes */
-  unsigned a_syms;       /* length of symbol table data in file, in bytes */
-  unsigned a_entry;      /* start address */
-  unsigned a_trsize;     /* length of relocation info for text, in bytes */
-  unsigned a_drsize;     /* length of relocation info for data, in bytes */
-};
-
-#define N_TRSIZE(a)    ((a).a_trsize)
-#define N_DRSIZE(a)    ((a).a_drsize)
-#define N_SYMSIZE(a)   ((a).a_syms)
-
-#ifdef __KERNEL__
-
-#define STACK_TOP  TASK_SIZE
-#define STACK_TOP_MAX  STACK_TOP
-
-#endif
-
-#endif /* __A_OUT_GNU_H__ */
diff -Nurp a/include/asm-x86/a.out_64.h b/include/asm-x86/a.out_64.h
--- a/include/asm-x86/a.out_64.h  2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out_64.h  1969-12-31 19:00:00.0 -0500
@@ -1,28 +0,0 @@
-#ifndef __X8664_A_OUT_H__
-#define __X8664_A_OUT_H__
-
-/* 32bit a.out */
-
-struct exec
-{
-  unsigned int a_info;   /* Use macros N_MAGIC, etc for access */
-  unsigned a_text;       /* length of text, in bytes */
-  unsigned a_data;       /* length of data, in bytes */
-  unsigned a_bss;        /* length of uninitialized data area for file, in bytes */
-  unsigned a_syms;       /* length of symbol table data in file, in bytes */
-  unsigned a_entry;      /* start address */
-  unsigned a_trsize;     /* length of relocation info for text, in bytes */
-  unsigned a_drsize;     /* length of relocation info for data, in bytes */
-};
-
-#define N_TRSIZE(a)    ((a).a_trsize)
-#define N_DRSIZE(a)    ((a).a_drsize)
-#define N_SYMSIZE(a)   ((a).a_syms)
-
-#ifdef __KERNEL__
-#include 
-#define STACK_TOP  TASK_SIZE
-#define STACK_TOP_MAX  TASK_SIZE64
-#endif
-
-#endif /* __A_OUT_GNU_H__ */
diff -Nurp a/include/asm-x86/a.out.h b/include/asm-x86/a.out.h
--- a/include/asm-x86/a.out.h   2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out.h   2007-10-20 06:14:26.0 -0400
@@ -1,13 +1,32 @@
+#ifndef _ASM_X86_A_OUT_H
+#define _ASM_X86_A_OUT_H
+
+/* 32bit a.out */
+
+struct exec
+{
+  unsigned int a_info;   /* Use macros N_MAGIC, etc for access */
+  unsigned a_text;       /* length of text, in bytes */
+  unsigned a_data;       /* length of data, in bytes */
+  unsigned a_bss;        /* length of uninitialized data area for file, in bytes */
+  unsigned a_syms;       /* length of symbol table data in file, in bytes */
+  unsigned a_entry;      /* start address */
+  unsigned a_trsize;     /* length of relocation info for text, in bytes */
+  unsigned a_drsize;     /* length of relocation info for data, in bytes */
+};
+
+#define N_TRSIZE(a)    ((a).a_trsize)
+#define N_DRSIZE(a)    ((a).a_drsize)
+#define N_SYMSIZE(a)   ((a).a_syms)
+
 #ifdef __KERNEL__
+# include 
+# define STACK_TOP TASK_SIZE
 # ifdef CONFIG_X86_32
-#  include "a.out_32.h"
+#  define STACK_TOP_MAX STACK_TOP
 # else
-#  include "a.out_64.h"
-# endif
-#else
-# ifdef __i386__
-#  include "a.out_32.h"
-# else
-#  include "a.out_64.h"
+#  define STACK_TOP_MAX TASK_SIZE64
 # endif
 #endif
+
+#endif /* _ASM_X86_A_OUT_H */
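
The only difference between the two old struct definitions is the type of a_info. A small sketch, with hypothetical struct and function names, shows why the `unsigned int` form is the one that can be shared: with `unsigned long`, the struct grows on targets where long is 64 bits, so it would no longer match a header made of eight 32-bit words.

```c
#include <stddef.h>

/* Layout used by the unified header: every field is an unsigned int. */
struct exec_int {
	unsigned int a_info;
	unsigned a_text;
	unsigned a_data;
	unsigned a_bss;
	unsigned a_syms;
	unsigned a_entry;
	unsigned a_trsize;
	unsigned a_drsize;
};

/* Layout of the old 32-bit header, as a 64-bit compiler would see it. */
struct exec_long {
	unsigned long a_info;
	unsigned a_text;
	unsigned a_data;
	unsigned a_bss;
	unsigned a_syms;
	unsigned a_entry;
	unsigned a_trsize;
	unsigned a_drsize;
};

size_t exec_int_size(void)
{
	return sizeof(struct exec_int);
}

size_t exec_long_size(void)
{
	return sizeof(struct exec_long);
}
```

On targets with a 32-bit int, `exec_int_size()` is 32 bytes regardless of word size, while `exec_long_size()` depends on how wide long is.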


[PATCH] x86: merge mmu{,_32,_64}.h

2007-10-20 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Merge mmu_32.h and mmu_64.h into mmu.h.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/mmu_32.h b/include/asm-x86/mmu_32.h
--- a/include/asm-x86/mmu_32.h  2007-10-20 02:42:24.0 -0400
+++ b/include/asm-x86/mmu_32.h  1969-12-31 19:00:00.0 -0500
@@ -1,18 +0,0 @@
-#ifndef __i386_MMU_H
-#define __i386_MMU_H
-
-#include <linux/mutex.h>
-/*
- * The i386 doesn't have a mmu context, but
- * we put the segment information here.
- *
- * cpu_vm_mask is used to optimize ldt flushing.
- */
-typedef struct { 
-   int size;
-   struct mutex lock;
-   void *ldt;
-   void *vdso;
-} mm_context_t;
-
-#endif
diff -Nurp a/include/asm-x86/mmu_64.h b/include/asm-x86/mmu_64.h
--- a/include/asm-x86/mmu_64.h  2007-10-20 02:42:24.0 -0400
+++ b/include/asm-x86/mmu_64.h  1969-12-31 19:00:00.0 -0500
@@ -1,21 +0,0 @@
-#ifndef __x86_64_MMU_H
-#define __x86_64_MMU_H
-
-#include <linux/spinlock.h>
-#include <linux/mutex.h>
-
-/*
- * The x86_64 doesn't have a mmu context, but
- * we put the segment information here.
- *
- * cpu_vm_mask is used to optimize ldt flushing.
- */
-typedef struct { 
-   void *ldt;
-   rwlock_t ldtlock; 
-   int size;
-   struct mutex lock;
-   void *vdso;
-} mm_context_t;
-
-#endif
diff -Nurp a/include/asm-x86/mmu.h b/include/asm-x86/mmu.h
--- a/include/asm-x86/mmu.h 2007-10-20 02:42:24.0 -0400
+++ b/include/asm-x86/mmu.h 2007-10-20 02:38:36.0 -0400
@@ -1,5 +1,23 @@
-#ifdef CONFIG_X86_32
-# include "mmu_32.h"
-#else
-# include "mmu_64.h"
+#ifndef _ASM_X86_MMU_H
+#define _ASM_X86_MMU_H
+
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+
+/*
+ * The x86 doesn't have a mmu context, but
+ * we put the segment information here.
+ *
+ * cpu_vm_mask is used to optimize ldt flushing.
+ */
+typedef struct { 
+   void *ldt;
+#ifdef CONFIG_X86_64
+   rwlock_t ldtlock; 
 #endif
+   int size;
+   struct mutex lock;
+   void *vdso;
+} mm_context_t;
+
+#endif /* _ASM_X86_MMU_H */


[PATCH] x86: unify a.out{,_32,_64}.h

2007-10-20 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 a.out_32.h and a.out_64.h

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/a.out_32.h b/include/asm-x86/a.out_32.h
--- a/include/asm-x86/a.out_32.h  2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out_32.h  1969-12-31 19:00:00.0 -0500
@@ -1,27 +0,0 @@
-#ifndef __I386_A_OUT_H__
-#define __I386_A_OUT_H__
-
-struct exec
-{
-  unsigned long a_info;  /* Use macros N_MAGIC, etc for access */
-  unsigned a_text;       /* length of text, in bytes */
-  unsigned a_data;       /* length of data, in bytes */
-  unsigned a_bss;        /* length of uninitialized data area for file, in bytes */
-  unsigned a_syms;       /* length of symbol table data in file, in bytes */
-  unsigned a_entry;      /* start address */
-  unsigned a_trsize;     /* length of relocation info for text, in bytes */
-  unsigned a_drsize;     /* length of relocation info for data, in bytes */
-};
-
-#define N_TRSIZE(a)    ((a).a_trsize)
-#define N_DRSIZE(a)    ((a).a_drsize)
-#define N_SYMSIZE(a)   ((a).a_syms)
-
-#ifdef __KERNEL__
-
-#define STACK_TOP  TASK_SIZE
-#define STACK_TOP_MAX  STACK_TOP
-
-#endif
-
-#endif /* __A_OUT_GNU_H__ */
diff -Nurp a/include/asm-x86/a.out_64.h b/include/asm-x86/a.out_64.h
--- a/include/asm-x86/a.out_64.h  2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out_64.h  1969-12-31 19:00:00.0 -0500
@@ -1,28 +0,0 @@
-#ifndef __X8664_A_OUT_H__
-#define __X8664_A_OUT_H__
-
-/* 32bit a.out */
-
-struct exec
-{
-  unsigned int a_info;   /* Use macros N_MAGIC, etc for access */
-  unsigned a_text;       /* length of text, in bytes */
-  unsigned a_data;       /* length of data, in bytes */
-  unsigned a_bss;        /* length of uninitialized data area for file, in bytes */
-  unsigned a_syms;       /* length of symbol table data in file, in bytes */
-  unsigned a_entry;      /* start address */
-  unsigned a_trsize;     /* length of relocation info for text, in bytes */
-  unsigned a_drsize;     /* length of relocation info for data, in bytes */
-};
-
-#define N_TRSIZE(a)    ((a).a_trsize)
-#define N_DRSIZE(a)    ((a).a_drsize)
-#define N_SYMSIZE(a)   ((a).a_syms)
-
-#ifdef __KERNEL__
-#include <linux/thread_info.h>
-#define STACK_TOP  TASK_SIZE
-#define STACK_TOP_MAX  TASK_SIZE64
-#endif
-
-#endif /* __A_OUT_GNU_H__ */
diff -Nurp a/include/asm-x86/a.out.h b/include/asm-x86/a.out.h
--- a/include/asm-x86/a.out.h   2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out.h   2007-10-20 06:14:26.0 -0400
@@ -1,13 +1,32 @@
+#ifndef _ASM_X86_A_OUT_H
+#define _ASM_X86_A_OUT_H
+
+/* 32bit a.out */
+
+struct exec
+{
+  unsigned int a_info;   /* Use macros N_MAGIC, etc for access */
+  unsigned a_text;       /* length of text, in bytes */
+  unsigned a_data;       /* length of data, in bytes */
+  unsigned a_bss;        /* length of uninitialized data area for file, in bytes */
+  unsigned a_syms;       /* length of symbol table data in file, in bytes */
+  unsigned a_entry;      /* start address */
+  unsigned a_trsize;     /* length of relocation info for text, in bytes */
+  unsigned a_drsize;     /* length of relocation info for data, in bytes */
+};
+
+#define N_TRSIZE(a)    ((a).a_trsize)
+#define N_DRSIZE(a)    ((a).a_drsize)
+#define N_SYMSIZE(a)   ((a).a_syms)
+
 #ifdef __KERNEL__
+# include <linux/thread_info.h>
+# define STACK_TOP TASK_SIZE
 # ifdef CONFIG_X86_32
-#  include "a.out_32.h"
+#  define STACK_TOP_MAX STACK_TOP
 # else
-#  include "a.out_64.h"
-# endif
-#else
-# ifdef __i386__
-#  include "a.out_32.h"
-# else
-#  include "a.out_64.h"
+#  define STACK_TOP_MAX TASK_SIZE64
 # endif
 #endif
+
+#endif /* _ASM_X86_A_OUT_H */


[PATCH] x86: unify div64{,_32,_64}.h

2007-10-20 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 div64.h headers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/div64_32.h b/include/asm-x86/div64_32.h
--- a/include/asm-x86/div64_32.h  2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_32.h  1969-12-31 19:00:00.0 -0500
@@ -1,52 +0,0 @@
-#ifndef __I386_DIV64
-#define __I386_DIV64
-
-#include <linux/types.h>
-
-/*
- * do_div() is NOT a C function. It wants to return
- * two values (the quotient and the remainder), but
- * since that doesn't work very well in C, what it
- * does is:
- *
- * - modifies the 64-bit dividend _in_place_
- * - returns the 32-bit remainder
- *
- * This ends up being the most efficient calling
- * convention on x86.
- */
-#define do_div(n,base) ({ \
-   unsigned long __upper, __low, __high, __mod, __base; \
-   __base = (base); \
-   asm("":"=a" (__low), "=d" (__high):"A" (n)); \
-   __upper = __high; \
-   if (__high) { \
-       __upper = __high % (__base); \
-       __high = __high / (__base); \
-   } \
-   asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
-   asm("":"=A" (n):"a" (__low),"d" (__high)); \
-   __mod; \
-})
-
-/*
- * (long)X = ((long long)divs) / (long)div
- * (long)rem = ((long long)divs) % (long)div
- *
- * Warning, this will do an exception if X overflows.
- */
-#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
-
-static inline long
-div_ll_X_l_rem(long long divs, long div, long *rem)
-{
-   long dum2;
-  __asm__("divl %2":"=a"(dum2), "=d"(*rem)
-  :"rm"(div), "A"(divs));
-
-   return dum2;
-
-}
-
-extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
-#endif
diff -Nurp a/include/asm-x86/div64_64.h b/include/asm-x86/div64_64.h
--- a/include/asm-x86/div64_64.h  2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_64.h  1969-12-31 19:00:00.0 -0500
@@ -1 +0,0 @@
-#include <asm-generic/div64.h>
diff -Nurp a/include/asm-x86/div64.h b/include/asm-x86/div64.h
--- a/include/asm-x86/div64.h   2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64.h   2007-10-20 07:32:34.0 -0400
@@ -1,5 +1,58 @@
+#ifndef _ASM_X86_DIV64_H
+#define _ASM_X86_DIV64_H
+
 #ifdef CONFIG_X86_32
-# include "div64_32.h"
-#else
-# include "div64_64.h"
-#endif
+
+#include <linux/types.h>
+
+/*
+ * do_div() is NOT a C function. It wants to return
+ * two values (the quotient and the remainder), but
+ * since that doesn't work very well in C, what it
+ * does is:
+ *
+ * - modifies the 64-bit dividend _in_place_
+ * - returns the 32-bit remainder
+ *
+ * This ends up being the most efficient calling
+ * convention on x86.
+ */
+#define do_div(n,base) ({ \
+   unsigned long __upper, __low, __high, __mod, __base; \
+   __base = (base); \
+   asm("":"=a" (__low), "=d" (__high):"A" (n)); \
+   __upper = __high; \
+   if (__high) { \
+       __upper = __high % (__base); \
+       __high = __high / (__base); \
+   } \
+   asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
+   asm("":"=A" (n):"a" (__low),"d" (__high)); \
+   __mod; \
+})
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ */
+#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
+
+static inline long
+div_ll_X_l_rem(long long divs, long div, long *rem)
+{
+   long dum2;
+  __asm__("divl %2":"=a"(dum2), "=d"(*rem)
+  :"rm"(div), "A"(divs));
+
+   return dum2;
+
+}
+
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+
+# else
+#  include <asm-generic/div64.h>
+# endif /* CONFIG_X86_32 */
+#endif /* _ASM_X86_DIV64_H */
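
For readers unfamiliar with the do_div() convention described in the comment
above, here is a rough portable-C sketch of the same long-division idea.  It is
not the kernel's code: the name do_div_sketch and the pointer argument are
assumptions made for illustration (the real macro takes an lvalue and uses
inline asm).  It divides the high 32 bits first, carries that remainder into
the low 32 bits, updates the dividend in place, and returns the remainder.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for do_div(): divides *n by base in place and
 * returns the 32-bit remainder.  base must be non-zero.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
    uint32_t low  = (uint32_t)*n;          /* low 32 bits of the dividend  */
    uint32_t high = (uint32_t)(*n >> 32);  /* high 32 bits of the dividend */

    /* Step 1: divide the high word; its remainder carries into step 2. */
    uint32_t upper  = high % base;
    uint32_t q_high = high / base;

    /*
     * Step 2: divide (upper:low) by base.  Because upper < base, the
     * quotient fits in 32 bits, which is what lets a single divl do
     * this step on i386.
     */
    uint64_t partial = ((uint64_t)upper << 32) | low;
    uint32_t q_low   = (uint32_t)(partial / base);
    uint32_t rem     = (uint32_t)(partial % base);

    *n = ((uint64_t)q_high << 32) | q_low; /* quotient replaces dividend */
    return rem;
}
```

A caller would write something like "rem = do_div_sketch(&value, 10);" and then
find the quotient in value.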


[PATCH] x86: mostly merge types.h

2007-10-19 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Most of types_32.h and types_64.h are the same.  Merge the common definitions
into types.h, keeping the differences in their own files.  Also #error if
types_{32,64}.h is included directly.  Tested with allmodconfig on x86_64.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

 types.h|   45 +
 types_32.h |   48 ++--
 types_64.h |   47 +++
 3 files changed, 58 insertions(+), 82 deletions(-)

diff -urp a/include/asm-x86/types_32.h b/include/asm-x86/types_32.h
--- a/include/asm-x86/types_32.h  2007-10-18 04:23:36.0 -0400
+++ b/include/asm-x86/types_32.h  2007-10-18 07:03:05.0 -0400
@@ -1,64 +1,28 @@
 #ifndef _I386_TYPES_H
 #define _I386_TYPES_H
 
-#ifndef __ASSEMBLY__
-
-typedef unsigned short umode_t;
-
-/*
- * __xx is ok: it doesn't pollute the POSIX namespace. Use these in the
- * header files exported to user space
- */
-
-typedef __signed__ char __s8;
-typedef unsigned char __u8;
-
-typedef __signed__ short __s16;
-typedef unsigned short __u16;
-
-typedef __signed__ int __s32;
-typedef unsigned int __u32;
+#ifndef _X86_TYPES_H
+#error Do not include this file directly.  Use asm/types.h instead.
+#endif
 
-#if defined(__GNUC__)
+#if !defined(__ASSEMBLY__) && defined(__GNUC__)
 __extension__ typedef __signed__ long long __s64;
 __extension__ typedef unsigned long long __u64;
 #endif
 
-#endif /* __ASSEMBLY__ */
-
-/*
- * These aren't exported outside the kernel to avoid name space clashes
- */
 #ifdef __KERNEL__
 
 #define BITS_PER_LONG 32
 
 #ifndef __ASSEMBLY__
 
-
-typedef signed char s8;
-typedef unsigned char u8;
-
-typedef signed short s16;
-typedef unsigned short u16;
-
-typedef signed int s32;
-typedef unsigned int u32;
-
-typedef signed long long s64;
-typedef unsigned long long u64;
-
-/* DMA addresses come in generic and 64-bit flavours.  */
-
+/* DMA addresses come in generic and 64-bit flavours. */
 #ifdef CONFIG_HIGHMEM64G
 typedef u64 dma_addr_t;
 #else
 typedef u32 dma_addr_t;
 #endif
-typedef u64 dma64_addr_t;
 
 #endif /* __ASSEMBLY__ */
-
 #endif /* __KERNEL__ */
-
-#endif
+#endif /* _I386_TYPES_H */
diff -urp a/include/asm-x86/types_64.h b/include/asm-x86/types_64.h
--- a/include/asm-x86/types_64.h  2007-10-18 04:23:36.0 -0400
+++ b/include/asm-x86/types_64.h  2007-10-18 07:03:11.0 -0400
@@ -1,55 +1,22 @@
 #ifndef _X86_64_TYPES_H
 #define _X86_64_TYPES_H
 
-#ifndef __ASSEMBLY__
-
-typedef unsigned short umode_t;
-
-/*
- * __xx is ok: it doesn't pollute the POSIX namespace. Use these in the
- * header files exported to user space
- */
-
-typedef __signed__ char __s8;
-typedef unsigned char __u8;
-
-typedef __signed__ short __s16;
-typedef unsigned short __u16;
-
-typedef __signed__ int __s32;
-typedef unsigned int __u32;
+#ifndef _X86_TYPES_H
+#error Do not include this file directly.  Use asm/types.h instead.
+#endif
 
+#ifndef __ASSEMBLY__
 typedef __signed__ long long __s64;
 typedef unsigned long long  __u64;
+#endif
 
-#endif /* __ASSEMBLY__ */
-
-/*
- * These aren't exported outside the kernel to avoid name space clashes
- */
 #ifdef __KERNEL__
 
 #define BITS_PER_LONG 64
 
 #ifndef __ASSEMBLY__
-
-typedef signed char s8;
-typedef unsigned char u8;
-
-typedef signed short s16;
-typedef unsigned short u16;
-
-typedef signed int s32;
-typedef unsigned int u32;
-
-typedef signed long long s64;
-typedef unsigned long long u64;
-
-typedef u64 dma64_addr_t;
 typedef u64 dma_addr_t;
-
-#endif /* __ASSEMBLY__ */
+#endif
 
 #endif /* __KERNEL__ */
-
-#endif
+#endif /* _X86_64_TYPES_H */
diff -urp a/include/asm-x86/types.h b/include/asm-x86/types.h
--- a/include/asm-x86/types.h   2007-10-18 04:23:36.0 -0400
+++ b/include/asm-x86/types.h   2007-10-18 06:59:37.0 -0400
@@ -1,3 +1,46 @@
+#ifndef _X86_TYPES_H
+#define _X86_TYPES_H
+
+#ifndef __ASSEMBLY__
+
+typedef unsigned short umode_t;
+
+/*
+ * __xx is ok: it doesn't pollute the POSIX namespace. Use these in the
+ * header files exported to user space
+ */
+
+typedef __signed__ char __s8;
+typedef unsigned char __u8;
+
+typedef __signed__ short __s16;
+typedef unsigned short __u16;
+
+typedef __signed__ int __s32;
+typedef unsigned int __u32;
+
+/*
+ * These aren't exported outside the kernel to avoid name space clashes
+ */
+#ifdef __KERNEL__
+
+typedef signed char s8;
+typedef unsigned char u8;
+
+typedef signed short s16;
+typedef unsigned short u16;
+
+typedef signed int s32;
+typedef unsigned int u32;
+
+typedef signed long long s64;
+typedef unsigned long long u64;
+
+typedef u64 dma64_addr_t;
+
+#endif /* __KERNEL__ */
+#endif /* __ASSEMBLY__ */
+
 #ifdef __KERNEL__
 # ifdef CONFIG_X86_32
 #  include "types_32.h"
@@ -11,3 +54,5 @@
 #  include "types_64.h"
 # endif
 #endif
+
+#endif /* _X86_TYPES_H */
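
The guard scheme in this patch (common definitions in the umbrella header, an
#error in the per-width headers) can be tried outside the kernel tree.  The
single-file sketch below only imitates the layout: every DEMO_/demo_ name is
made up for illustration, and the width test uses ULONG_MAX from <limits.h>
instead of CONFIG_X86_32.

```c
#include <limits.h>

/* --- part played by the umbrella header (asm/types.h in the patch) --- */
#ifndef DEMO_TYPES_H
#define DEMO_TYPES_H

typedef unsigned int demo_u32;      /* common definition, shared by both widths */

/* --- part played by a per-width header (types_32.h or types_64.h) --- */
#ifndef DEMO_TYPES_H
#error "Do not include this file directly.  Use the umbrella header instead."
#endif

/* Only the width-specific pieces stay in the per-width header. */
#if ULONG_MAX > 0xffffffffUL
# define DEMO_BITS_PER_LONG 64
#else
# define DEMO_BITS_PER_LONG 32
#endif

#endif /* DEMO_TYPES_H */
```

If the second part lived in its own file and were included without the umbrella
header, DEMO_TYPES_H would be undefined and the #error would stop the build.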


Re: NVIDIA Ethernet & invalid MAC

2007-10-16 Thread Chris Snook

Konstantin Kalin wrote:
> P.S. It's simple to add DEV_HAS_CORRECT_MACADDR to pci_device_tlb for
> these types of Ethernet. But I think it's not right decision because it
> would break older revisions of these models.


Any reason you can't distinguish based on PCI ID?

-- Chris


Re: gigabit ethernet power consumption

2007-10-08 Thread Chris Snook

Pavel Machek wrote:

> Hi!
>
> I've found that gbit vs. 100mbit power consumption difference is about
> 1W -- pretty significant. (Maybe powertop should include it in the
> tips section? :).
>
> Energy Star people insist that machines should switch down to 100mbit
> when network is idle, and I guess that makes a lot of sense -- you
> save 1W locally and 1W on the router.
>
> Question is, how to implement it correctly? Daemon that would watch
> data rates and switch speeds using mii-tool would be simple, but is
> that enough?


I believe you misspelled "ethtool".

While you're at it, why stop at 100Mb?  I believe you save even more power at 
10Mb, which is why WOL puts the card in 10Mb mode.  In my experience, you 
generally want either the maximum setting or the minimum setting when going for 
power savings, because of the race-to-idle effect.  Workloads that have a 
sustained fractional utilization are rare.  Right now I'm at home, hooked up to 
a cable modem, so anything over 4Mb is wasted, unless I'm talking to the box 
across the room, which is rare.


Talk to the NetworkManager folks.  This is right up their alley.

-- Chris
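
The "maximum or minimum, nothing in between" policy from the reply above can be
written down as a small decision function.  This is only a sketch under assumed
names (mbps_from_bytes and pick_link_speed are hypothetical, and so is the idea
of feeding it interface byte counters); actually changing the link speed would
be left to a tool such as ethtool.

```c
#include <stdint.h>

/* Convert a byte-counter delta over an interval into megabits per second. */
static double mbps_from_bytes(uint64_t delta_bytes, double interval_s)
{
    return (double)delta_bytes * 8.0 / 1e6 / interval_s;
}

/*
 * Race-to-idle policy: if the observed demand fits within the slowest
 * link speed, use the slowest speed; otherwise go straight to the
 * fastest one.  Intermediate speeds are never chosen.
 */
static unsigned pick_link_speed(double demand_mbps,
                                unsigned min_mbps, unsigned max_mbps)
{
    return demand_mbps <= (double)min_mbps ? min_mbps : max_mbps;
}
```

With the cable-modem case from the message (about 4 Mb/s of demand, a card that
supports 10 through 1000 Mb/s), this picks 10.  A real daemon would also want
some hysteresis so that a short burst does not flap the link.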


Re: One process with multiple user ids.

2007-10-02 Thread Chris Snook

Giuliano Gagliardi wrote:

> Hello,
>
> I have a server that has to switch to different user ids, but because it does
> other complex things, I would rather not have it run as root.


Well, it's probably going to have to *start* as root, or use something like 
sudo.  It's probably easiest to have it start as root and drop privileges as 
soon as possible, certainly before handling any untrusted data.


> I only need the server to be able to switch to certain pre-defined user ids.


This is a very easy special case.  Just start a process for each user ID and 
drop root privileges.  They can communicate via sockets or even shared memory. 
If you wanted to switch between arbitrary UIDs at runtime, it might be worth 
doing something exotic, but it's really not in this case.  Also, if you do it 
this way, it's rather easy to verify the correctness of your design, and you 
never have to touch kernel code.


> I have seen that two possible solutions have already been suggested here on
> the LKML, but it was some years ago, and nothing like it has been
> implemented.
>
> (1) Having supplementary user ids like there are supplementary group ids and
> system calls getuids() and setuids() that work like getgroups() and
> setgroups()


But you can already accomplish this with ACLs and SELinux.  You're trying to 
make this problem harder than it really is.



> (2) Allowing processes to pass user and group ids via sockets.


And do what with them?  You can already pass arbitrary data via sockets.  It 
sounds like you need (1) to use (2).


> Both (1) and (2) would solve my problem. Now my question is whether there are
> any fundamental flaws with (1) or (2), or whether the right way to solve my
> problem is another one.


(1) doesn't accomplish anything you can't already do, but it would make a huge 
mess of a lot of code.


(2) is silly.  Sockets are for communicating between userspace processes.  If 
you want to be granting/revoking credentials, you should be using system calls, 
and even then only if you absolutely must.  Having the kernel snoop traffic on 
sockets between processes would be disastrous for performance, and without that, 
any process could claim that it had been granted privileges over a socket and 
the kernel would just have to trust it.


Don't overthink this.  You don't need to touch the kernel at all to do this. 
Just use a multi-process model, like qmail does, for example.  You can start 
with root privileges and drop them, or use sudo to help you out.  It's fast, 
secure, takes advantage of modern multi-core CPUs, and is much simpler.


-- Chris
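
A minimal sketch of the multi-process model suggested above, assuming the
server starts as root and forks one worker per pre-defined user id.  The helper
names (drop_privileges, spawn_worker, wait_worker, report_identity) are made up
for this example; the system calls themselves (setgroups, setgid, setuid, fork,
waitpid) are the standard ones.  The order of the drop matters: groups first,
uid last, because after setuid() the process no longer has the right to change
its groups.

```c
#define _DEFAULT_SOURCE
#include <grp.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Permanently switch the calling process to uid/gid.  0 on success, -1 on failure. */
static int drop_privileges(uid_t uid, gid_t gid)
{
    if (setgroups(1, &gid) != 0)    /* clear inherited supplementary groups */
        return -1;
    if (setgid(gid) != 0)           /* gid next, while we may still change it */
        return -1;
    if (setuid(uid) != 0)           /* uid last */
        return -1;
    if (uid != 0 && setuid(0) == 0) /* paranoia: root must not be regainable */
        return -1;
    return 0;
}

/* Example worker body: just reports who it is running as. */
static int report_identity(void)
{
    printf("worker pid %ld running as uid %ld\n", (long)getpid(), (long)getuid());
    return 0;
}

/*
 * Fork one worker for the given uid/gid.  The child drops privileges
 * before running work(); it exits with 127 if the drop fails, so it
 * never touches untrusted data while privileged.  Returns the child's
 * pid, or -1 if fork() failed.
 */
static pid_t spawn_worker(uid_t uid, gid_t gid, int (*work)(void))
{
    pid_t pid = fork();
    if (pid == 0) {
        if (drop_privileges(uid, gid) != 0)
            _exit(127);
        int rc = work();
        fflush(stdout);
        _exit(rc);
    }
    return pid;
}

/* Wait for a worker and return its exit code, or -1 if it did not exit normally. */
static int wait_worker(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

When run without root, setgroups() fails and the worker exits with 127, which
is the intended fail-closed behaviour.  The workers would then talk to each
other over sockets or shared memory, as the reply describes.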


Re: Bonnie++ with 1024k stripe SW/RAID5 causes kernel to goto D-state

2007-09-29 Thread Chris Snook

Justin Piszcz wrote:

> Kernel: 2.6.23-rc8 (older kernels do this as well)
>
> When running the following command:
> /usr/bin/time /usr/sbin/bonnie++ -d /x/test -s 16384 -m p34 -n 16:10:16:64
>
> It hangs unless I increase various parameters md/raid such as the
> stripe_cache_size etc..
>
> # ps auxww | grep D
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root       276  0.0  0.0      0     0 ?        D    12:14   0:00 [pdflush]
> root       277  0.0  0.0      0     0 ?        D    12:14   0:00 [pdflush]
> root      1639  0.0  0.0      0     0 ?        D<   12:14   0:00 [xfsbufd]
> root      1767  0.0  0.0   8100   420 ?        Ds   12:14   0:00
> root      2895  0.0  0.0   5916   632 ?        Ds   12:15   0:00 /sbin/syslogd -r
>
> See the bottom for more details.
>
> Is this normal?  Does md only work without tuning up to a certain stripe
> size? I use a RAID 5 with 1024k stripe which works fine with many
> optimizations, but if I just boot the system and run bonnie++ on it
> without applying the optimizations, it will hang in d-state.  When I run
> the optimizations, then it exits out of D-state, pretty weird?


Not at all.  1024k stripes are way outside the norm.  If you do something way 
outside the norm, and don't tune for it in advance, don't be terribly surprised 
when something like bonnie++ brings your box to its knees.


That's not to say we couldn't make md auto-tune itself more intelligently, but 
this isn't really a bug.  With a sufficiently huge amount of RAM, you'd be able 
to dynamically allocate the buffers that you're not pre-allocating with 
stripe_cache_size, but bonnie++ is eating that up in this case.


-- Chris
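
To put a number on the pre-allocation mentioned above: as far as I understand
the md documentation, the RAID5 stripe cache uses roughly
page_size * nr_disks * stripe_cache_size bytes.  The helper below just does
that arithmetic.  The function name and the 4-disk, 4096-byte-page example are
assumptions for illustration; the thread does not say how many disks are in the
array.

```c
#include <stdint.h>

/*
 * Approximate memory held by the md RAID5 stripe cache, in bytes.
 * stripe_cache_size is the value written to
 * /sys/block/mdX/md/stripe_cache_size (a count of cache entries).
 */
static uint64_t stripe_cache_bytes(uint64_t page_size,
                                   unsigned nr_disks,
                                   unsigned stripe_cache_size)
{
    return page_size * nr_disks * stripe_cache_size;
}
```

For a hypothetical 4-disk array with 4096-byte pages, 256 entries come to
4 MiB and 8192 entries come to 128 MiB, so raising the value is cheap next to
the RAM a bonnie++ run will consume.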


[PATCH RESEND] x86_64: make atomic64_t work like atomic_t

2007-09-26 Thread Chris Snook
Regardless of the greater controversy about the semantics of atomic_t, I think
we can all agree that atomic_t and atomic64_t should have the same semantics.
This is presently not the case on x86_64, where the volatile keyword was
removed from the declaration of atomic_t, but it was not removed from the
declaration of atomic64_t.  The following patch fixes that inconsistency,
without delving into anything more controversial.
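
For readers who want to see what the qualifier changes in practice, here is a minimal userspace sketch. The type and function names are made up for illustration and are not kernel code. With volatile on the member, every access is a volatile access; with a plain member, only call sites that cast, as read_once() does here, are guaranteed a fresh load.

```c
#include <assert.h>

/* Hypothetical stand-ins for the declaration before and after the patch. */
typedef struct { volatile long counter; } counter_volatile_t;
typedef struct { long counter; } counter_plain_t;

/* Every read of a volatile member must be performed as written. */
static long read_volatile_member(const counter_volatile_t *v)
{
    assert(v != 0);
    return v->counter;
}

/* Plain member: the compiler is free to keep this value in a register. */
static long read_plain(const counter_plain_t *v)
{
    assert(v != 0);
    return v->counter;
}

/* Plain member, but this one access goes through a volatile lvalue. */
static long read_once(const counter_plain_t *v)
{
    assert(v != 0);
    return *(const volatile long *)&v->counter;
}
```

All three functions return the same value in a single-threaded test; the difference only shows up in how the compiler is allowed to optimize repeated reads.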

From: Chris Snook <[EMAIL PROTECTED]>

The volatile keyword has already been removed from the declaration of atomic_t
on x86_64.  For consistency, remove it from atomic64_t as well.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>

--- a/include/asm-x86_64/atomic.h   2007-07-08 19:32:17.0 -0400
+++ b/include/asm-x86_64/atomic.h   2007-09-13 11:30:51.0 -0400
@@ -206,7 +206,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)   { (i) }
 


Re: patch/option to wipe memory at boot?

2007-09-19 Thread Chris Snook

David Madore wrote:
> On Mon, Sep 17, 2007 at 11:11:52AM -0700, Jeremy Fitzhardinge wrote:
> > Boot memtest86 for a little while before booting the kernel?  And if you
> > haven't already run it for a while, then that would be your first step
> > anyway.
>
> Indeed, that does the trick, thanks for the suggestion.  So I can be
> quite confident, now, that my RAM is sane and it's just that the BIOS
> doesn't initialize it properly.
>
> But I'd still like some way of filling the RAM when Linux starts (or
> perhaps in the bootloader), because letting memtest86 run after every
> cold reboot isn't a very satisfactory solution.


Bootloaders like to do things like run in 16-bit or 32-bit mode on boxes where 
higher bitness is necessary to access all the memory.  It may be possible to do 
this in the bootloader, but the BIOS is clearly the correct place to fix this 
problem.


-- Chris


Re: CPU usage for 10Gbps UDP transfers

2007-09-17 Thread Chris Snook

Lukas Hejtmanek wrote:
> Hello,
>
> Is it expected that an application sending 8900-byte datagrams through a
> 10Gbps NIC utilizes the CPU to 100%, and that the receiver similarly
> utilizes its CPU to 100%?  Is something wrong, or is this quite OK?
>
> (The box is a dual single-core Opteron 2.4GHz with a Myricom 10GE NIC.)


Every time a new generation of ethernet comes out, its peak throughput exceeds 
the memory/CPU/IO capacity of commodity hardware available at the time.  This is 
normal.  Of course, you may not be saturating the link, and it may be possible 
to tune the driver to improve your throughput, but you'll still be saturating a 
CPU on that hardware.
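
As a back-of-the-envelope illustration of why one core fills up, the sketch below computes how many CPU cycles are available per datagram if a single core had to keep the link full. It is a rough estimate under stated assumptions: it counts payload bits only, ignores header and framing overhead, and the function name is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * CPU cycles available per datagram if one core keeps the link full.
 * Counts payload bits only; header and framing overhead are ignored.
 */
static uint64_t cycles_per_datagram(uint64_t cpu_hz, uint64_t link_bps,
                                    uint64_t datagram_bytes)
{
    assert(link_bps > 0);
    return cpu_hz * datagram_bytes * 8 / link_bps;
}
```

For the 2.4 GHz Opteron and 8900-byte datagrams mentioned above, this comes to 17088 cycles per datagram (roughly 140,000 datagrams per second), and that budget has to cover copying the payload as well as the syscall, the network stack, and the driver.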


-- Chris


Re: irq load balancing

2007-09-13 Thread Chris Snook

Venkat Subbiah wrote:
> > Since most network devices have a single status register for both
> > receiver and transmit (and errors and the like), which needs a lock to
> > protect access, you will likely end up with serious thrashing of moving
> > the lock between cpus.
>
> Any ways to measure the thrashing of locks?
>
> > Since most network devices have a single status register for both
> > receiver and transmit (and errors and the like)
>
> These register accesses will be mostly within the irq handler, which I
> plan on keeping on the same processor. The network driver is actually
> tg3. Will look closely into the driver.


Why are you trying to do this, anyway?  This is a classic example of fairness 
hurting both performance and efficiency.  Unbalanced distribution of a single 
IRQ gives superior performance.  There are cases when this is a worthwhile 
tradeoff, but the network stack is not one of them.  In the HPC world, people 
generally want to squeeze maximum performance out of CPU/cache/RAM so they just 
accept the imbalance because it performs better than balancing it, and 
irqbalance can keep things fair over longer intervals if that's important.  In 
the realtime world, people generally bind everything they can to one or two 
CPUs, and bind their realtime applications to the remaining ones to minimize 
contention.


Distributing your network interrupts in a round-robin fashion will make your 
computer do exactly one thing faster: heat up the room.


-- Chris


[PATCH] x86_64: make atomic64_t semantics consistent with atomic_t

2007-09-13 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

The volatile keyword has already been removed from the declaration of atomic_t
on x86_64.  For consistency, remove it from atomic64_t as well.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

--- a/include/asm-x86_64/atomic.h   2007-07-08 19:32:17.0 -0400
+++ b/include/asm-x86_64/atomic.h   2007-09-13 11:30:51.0 -0400
@@ -206,7 +206,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)   { (i) }
 


Re: Lossy interrupts on x86_64

2007-09-13 Thread Chris Snook

Jesse Barnes wrote:
> I just narrowed down a weird problem where I was losing more than 50% of
> my vblank interrupts to what seems to be the hires timers patch.  Stock
> 2.6.23-rc5 works fine, but the latest (171) kernel from rawhide drops
> most of my interrupts unless I also have another interrupt source
> running (e.g. if I hold down a key or move the mouse I get the expected
> number of vblank interrupts, otherwise I get between 3 and 30 instead
> of the expected 60 per second).
>
> Any ideas?  It seems like it might be bad APIC programming, but I
> haven't gone through those mods to look for suspects...


What happens if you boot with 'noapic' or 'pci=nomsi'?  Please post dmesg as 
well so we can see how the kernel is initializing the relevant hardware.


-- Chris


Re: irq load balancing

2007-09-12 Thread Chris Snook

Venkat Subbiah wrote:

> Most of the load in my system is triggered by a single ethernet IRQ.
> Essentially the IRQ schedules a tasklet and most of the work is done in the
> tasklet which is scheduled in the IRQ. From what I read, it looks like the
> tasklet would be executed on the same CPU on which it was scheduled. So this
> means even in an SMP system it will be one processor which is overloaded.
>
> So will using the user space IRQ loadbalancer really help?


A little bit.  It'll keep other IRQs on different CPUs, which will prevent other 
interrupts from causing cache and TLB evictions that could slow down the 
interrupt handler for the NIC.



> What I am doubtful about is that the user space load balancer comes along
> and changes the affinity once in a while. But really what I need is every
> interrupt to go to a different CPU in a round robin fashion.


Doing it in a round-robin fashion will be disastrous for performance.  Your 
cache miss rate will go through the roof and you'll hit the slow paths in the 
network stack most of the time.



> Looks like the APIC can distribute IRQs dynamically? Is this supported in
> the kernel, and is there any config or proc interface to turn this on/off?


/proc/irq/$FOO/smp_affinity is a bitmask.  You can mask an irq to multiple 
processors.  Of course, this will absolutely kill your performance.  That's why 
irqbalance never does this.
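
For anyone unfamiliar with the format, smp_affinity takes a hexadecimal mask in which bit N stands for CPU N. The sketch below builds such a mask from a list of CPU numbers and formats it as hex. The helper names are invented for this example, it is limited to CPUs 0..63 for simplicity, and it deliberately does not write to /proc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the bitmask for a set of CPU numbers: bit N set means CPU N may
 * service the IRQ.  Limited to CPUs 0..63 for simplicity.
 */
static unsigned long long affinity_mask(const int *cpus, size_t n)
{
    unsigned long long mask = 0;
    for (size_t i = 0; i < n; i++) {
        assert(cpus[i] >= 0 && cpus[i] < 64);
        mask |= 1ULL << cpus[i];
    }
    return mask;
}

/* Format the mask as the hex string that would be written to smp_affinity. */
static int format_mask(unsigned long long mask, char *buf, size_t len)
{
    return snprintf(buf, len, "%llx", mask);
}
```

A single-CPU mask such as 2 (CPU 1 only) keeps the IRQ on one processor; 5 would allow CPUs 0 and 2, which is the multi-CPU case warned against above.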


-- Chris

