Re: [PATCH net] atl1c: fix error return code in atl1c_probe()
On Tue, Nov 17, 2020 at 1:01 AM Heiner Kallweit wrote:
>
> On 17.11.2020 at 08:43, Chris Snook wrote:
> > The full text of the preceding comment explains the need:
> >
> > /*
> >  * The atl1c chip can DMA to 64-bit addresses, but it uses a single
> >  * shared register for the high 32 bits, so only a single, aligned,
> >  * 4 GB physical address range can be used at a time.
> >  *
> >  * Supporting 64-bit DMA on this hardware is more trouble than it's
> >  * worth. It is far easier to limit to 32-bit DMA than update
> >  * various kernel subsystems to support the mechanics required by a
> >  * fixed-high-32-bit system.
> >  */
> >
> > Without this, we get data corruption and crashes on machines with 4 GB
> > of RAM or more.
> >
> > - Chris
> >
> > On Mon, Nov 16, 2020 at 11:14 PM Heiner Kallweit wrote:
> >>
> >> On 17.11.2020 at 03:55, Zhang Changzhong wrote:
> >>> Fix to return a negative error code from the error handling
> >>> case instead of 0, as done elsewhere in this function.
> >>>
> >>> Fixes: 85eb5bc33717 ("net: atheros: switch from 'pci_' to 'dma_' API")
> >>> Reported-by: Hulk Robot
> >>> Signed-off-by: Zhang Changzhong
> >>> ---
> >>>  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 4 ++--
> >>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> >>> index 0c12cf7..3f65f2b 100644
> >>> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> >>> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> >>> @@ -2543,8 +2543,8 @@ static int atl1c_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >>>      * various kernel subsystems to support the mechanics required by a
> >>>      * fixed-high-32-bit system.
> >>>      */
> >>> -   if ((dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0) ||
> >>> -       (dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0)) {
> >>> +   err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
> >>
> >> I wonder whether you need this call at all, because 32bit is the default.
> >> See following
> >>
> >> "By default, the kernel assumes that your device can address 32-bits
> >> of DMA addressing."
> >>
> >> in https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
> >>
> >>> +   if (err) {
> >>>         dev_err(&pdev->dev, "No usable DMA configuration,aborting\n");
> >>>         goto err_dma;
> >>>     }
> >>>
> >>
>
> Please don't top-post.
> From what I've seen the kernel configures 32bit as default DMA size.
> See beginning of pci_device_add(), there the coherent mask is set to 32bit.
>
> And in pci_setup_device() see the following:
> /*
>  * Assume 32-bit PCI; let 64-bit PCI cards (which are far rarer)
>  * set this higher, assuming the system even supports it.
>  */
> dev->dma_mask = 0xffffffff;
>
> That means if you would like to use 64bit DMA then you'd need to configure
> this explicitly.
> You could check to which mask dev->dma_mask and dev->coherent_dma_mask are set
> w/o the call to dma_set_mask_and_coherent.

I don't remember the exact history with atl1c, but we really did hit
this bug with atl1 and atl2. I'm not sure if that's because this
default wasn't there or if it's because another call was replaced
with this call, but either way it's quite likely that at some point in
the future someone who doesn't even have test hardware will try to port
this to a newer interface that doesn't make the same assumption, and
bad things will happen. This isn't a hot path, so it's better to be
explicit. If dma_set_mask_and_coherent() ever takes a long time or
fails, something is seriously wrong and we probably want to know about
it before we start DMAing.

- Chris
Re: [PATCH net] atl1c: fix error return code in atl1c_probe()
The full text of the preceding comment explains the need:

/*
 * The atl1c chip can DMA to 64-bit addresses, but it uses a single
 * shared register for the high 32 bits, so only a single, aligned,
 * 4 GB physical address range can be used at a time.
 *
 * Supporting 64-bit DMA on this hardware is more trouble than it's
 * worth. It is far easier to limit to 32-bit DMA than update
 * various kernel subsystems to support the mechanics required by a
 * fixed-high-32-bit system.
 */

Without this, we get data corruption and crashes on machines with 4 GB
of RAM or more.

- Chris

On Mon, Nov 16, 2020 at 11:14 PM Heiner Kallweit wrote:
>
> On 17.11.2020 at 03:55, Zhang Changzhong wrote:
> > Fix to return a negative error code from the error handling
> > case instead of 0, as done elsewhere in this function.
> >
> > Fixes: 85eb5bc33717 ("net: atheros: switch from 'pci_' to 'dma_' API")
> > Reported-by: Hulk Robot
> > Signed-off-by: Zhang Changzhong
> > ---
> >  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > index 0c12cf7..3f65f2b 100644
> > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > @@ -2543,8 +2543,8 @@ static int atl1c_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >      * various kernel subsystems to support the mechanics required by a
> >      * fixed-high-32-bit system.
> >      */
> > -   if ((dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0) ||
> > -       (dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0)) {
> > +   err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
>
> I wonder whether you need this call at all, because 32bit is the default.
> See following
>
> "By default, the kernel assumes that your device can address 32-bits
> of DMA addressing."
>
> in https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
>
> > +   if (err) {
> >         dev_err(&pdev->dev, "No usable DMA configuration,aborting\n");
> >         goto err_dma;
> >     }
> >
>
Re: [PATCH 0/3] net: ethernet: atheros: atlx: Use PCI generic definitions instead of private duplicates
On Fri, Jun 21, 2019 at 11:33 AM Joe Perches wrote:
>
> On Fri, 2019-06-21 at 13:12 -0500, Bjorn Helgaas wrote:
> > On Fri, Jun 21, 2019 at 12:27 PM Joe Perches wrote:
> []
> > > Subsystem specific local PCI #defines without generic
> > > naming is poor style and makes treewide grep and
> > > refactoring much more difficult.
> >
> > Don't worry, we have the same objectives. I totally agree that local
> > #defines are a bad thing, which is why I proposed this project in the
> > first place.
>
> Hi again Bjorn.
>
> I didn't know that was your idea. Good idea.
>
> > I'm just saying that this is a "first-patch" sort of learning project
> > and I think it'll avoid some list spamming and discouragement if we
> > can figure out the scope and shake out some of the teething problems
> > ahead of time. I don't want to end up with multiple versions of
> > dozens of little 2-3 patch series posted every week or two.
>
> Great, that's sensible.
>
> > I'd rather be able to deal with a whole block of them at one time.
>
> Also very sensible.
>
> > > 2: Show that you compiled the object files and verified
> > >    where possible that there are no object file changes.
> >
> > Do you have any pointers for the best way to do this? Is it as simple
> > as comparing output of "objdump -d"?
>
> Generically, yes.
>
> I have a little script that does the equivalent of:
>
> make
> mv .old
> patch -P1 <
> make
> mv .new
> diff -urN <(objdump -d .old) <(objdump -d .new)
>
> But it's not foolproof as gcc does not guarantee
> compilation repeatability.
>
> And some subsystems Makefiles do not allow per-file
> compilation.
>

This should work, but be aware that the older atlx drivers did some
regrettable things with file structure, so not all .c files are
expected to generate a corresponding .o file.

- Chris
Re: [PATCH] [trivial] treewide: Fix company name in module descriptions
On Thu, Oct 16, 2014 at 8:09 AM, Masanari Iida wrote:
> This patch fix company name's spelling typo in module descriptions
> and a Kconfig.
>
> Signed-off-by: Masanari Iida

> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 72fb86b..c9946c6 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -48,7 +48,7 @@ MODULE_DEVICE_TABLE(pci, atl1c_pci_tbl);
>
>  MODULE_AUTHOR("Jie Yang");
>  MODULE_AUTHOR("Qualcomm Atheros Inc., ");
> -MODULE_DESCRIPTION("Qualcom Atheros 100/1000M Ethernet Network Driver");
> +MODULE_DESCRIPTION("Qualcomm Atheros 100/1000M Ethernet Network Driver");
>  MODULE_LICENSE("GPL");
>  MODULE_VERSION(ATL1C_DRV_VERSION);
>

Acked-by: Chris Snook

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Performance problems with 3ware 9500S-4LP and 2.6.25-rc3
Andre Noll wrote:
> we are experiencing massive performance problems with two of our Linux
> servers that contain 3ware controllers on a Tyan mainboard and a couple
> of 1T disks. During the daily cron job that uses rsync to sync a 500G
> file system from another machine to the raid on the 3ware controller the
> load jumps up, and the machine becomes sluggish as hell. For example, an
> ssh login to that machine takes minutes to complete and ldap becomes
> unreliable while the rsync job is running. Even Nagios complains about
> the machine being down while rsync is running.

You're putting your box under astronomical load. This is generally
regarded as a bad idea, regardless of how well your storage controller
is performing. Can you measure the single-threaded throughput (say,
copying one huge file, and then syncing) to give us a baseline
performance figure? rsync will happily peg your box, your network, and
your cat if you let it.

-- Chris
[PATCH] MARKERS depends on MODULES
From: Chris Snook <[EMAIL PROTECTED]>

Make MARKERS depend on MODULES to prevent build failures with certain configs.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff --git a/init/Kconfig b/init/Kconfig
index dcef8b5..933df15 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -729,6 +729,7 @@ config PROFILING
 
 config MARKERS
 	bool "Activate markers"
+	depends on MODULES
 	help
 	  Place an empty function call at each marker site. Can be
 	  dynamically changed for a probe function.
[PATCH] make LKDTM depend on BLOCK
From: Chris Snook <[EMAIL PROTECTED]>

Make LKDTM depend on BLOCK to prevent build failures with certain configs.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a370fe8..24b327c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -524,6 +524,7 @@ config LKDTM
 	tristate "Linux Kernel Dump Test Tool Module"
 	depends on DEBUG_KERNEL
 	depends on KPROBES
+	depends on BLOCK
 	default n
 	help
 	This module enables testing of the different dumping mechanisms by
Re: linux-next build status
Tony Breeds wrote:
> On Thu, Feb 14, 2008 at 08:24:27PM -0500, Chris Snook wrote:
>> Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> Initial status can be seen here
>>> http://kisskb.ellerman.id.au/kisskb/branch/9/ (I hope to make a better
>>> URL soon). Suggestions for more compiler/config combinations are
>>> welcome, but we can't necessarily commit to fulfilling all your
>>> wishes. :-)
>>
>> i386 allmodconfig please.
>
> Won't i386 allmodconfig be equivalent to x86_64 allmodconfig?

Only if there are no bugs. Driver code is most likely to trip over
bitness/endianness bugs, and you've already got allmodconfig builds for
be32, be64, and le64 architectures. Adding an le32 architecture (i386)
completes the coverage of these basic categories.

-- Chris
Re: linux-next build status
Stephen Rothwell wrote:
> Hi all,
>
> Initial status can be seen here
> http://kisskb.ellerman.id.au/kisskb/branch/9/ (I hope to make a better
> URL soon). Suggestions for more compiler/config combinations are
> welcome, but we can't necessarily commit to fulfilling all your
> wishes. :-)

i386 allmodconfig please.

Also, I highly recommend adding some randconfig builds, at least one
32-bit arch and one 64-bit arch. Any given randconfig build is not
particularly likely to catch bugs that would be missed elsewhere, but
doing them daily for two months will catch a lot of things before they
get released. The catch, of course, is that you have to actually save
the .config for this to be useful, which might require a slight
modification to your scripts.

-- Chris
Re: 2.6.25-rc1 panics on boot
Dhaval Giani wrote:
> I am getting the following oops on bootup on 2.6.25-rc1
...
> I am booting using kexec with maxcpus=1. It does not have any problems
> with maxcpus=2 or higher.

Sounds like another (the same?) kexec cpu numbering bug. Can you
post/link the entire dmesg from both a cold boot and a kexec boot so we
can compare?

-- Chris
Re: log spamming
Gene Heskett wrote:
> Greetings;
>
> I just rebooted to a new config of 2.6.24, basically trying to strip out
> the building of modules I don't use. And I enabled a couple of checks
> that weren't checked in the kernel-hacking menu. .config posted on
> request.
>
> Now the messages log is being spammed at 2-5 second intervals by these:
>
> Feb 1 10:41:08 coyote kernel: [ 3085.501037] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
> Feb 1 10:41:08 coyote kernel: [ 3085.501042] Bank 1: d4004152
> Feb 1 10:41:08 coyote kernel: [ 3085.501045] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
> Feb 1 10:41:08 coyote kernel: [ 3085.501048] Bank 2: d400417a
>
> Always the same 2 addresses. Is this telling me I should be running
> memtest86 for a couple of cycles?

Those two addresses are in the same cache line, but they are *not* in
the same 128-bit ECC block. This is probably a northbridge problem, not
a RAM problem. It's not necessarily a hardware problem. I wouldn't be
surprised if you swapped CPUs and still got the same result, due to BIOS
misconfiguration.

-- Chris
Re: how to get chance for user space process even when the kernel is utilizing 100% CPU.
veerasena reddy wrote:
> I have a requirement where I need to execute a user process even when
> the kernel is utilizing 100% of CPU time.

In the realtime kernel, hardware interrupt handlers are prioritized
threads, so you can give the userspace process a higher realtime
priority.

-- Chris
Re: How does ext2 implement sparse files?
Lars Noschinski wrote:
> Hello!
>
> For an university project, we had to write a toy filesystem (ext2-like),
> for which I would like to implement sparse file support. For this, I
> digged through the ext2 source code; but I could not find the point,
> where ext2 detects holes.
>
> As far as I can see from fs/buffer.c, an hole is a buffer_head which is
> not mapped, but uptodate. But I cannot find a relevant source line,
> where ext2 makes usage of this information.

In ext2 (and most other block filesystems) all files are sparse files.
If you write to an address in the file for which no block is allocated,
the filesystem allocates a block and writes the contents to disk,
regardless of whether that block is at the end of the file (the usual
case of lengthening a non-sparse file), in the middle of the file
(filling in holes in a sparse file), or past the end of the file
(making a file sparse).

-- Chris
Re: about relocs.c on x86
Yinghai Lu wrote:
> On Jan 31, 2008 12:33 AM, Chris Snook <[EMAIL PROTECTED]> wrote:
>> Yinghai Lu wrote:
>>> why not rename relocs.c to relocs_32.c?
>>
>> Because we're trying to get rid of all the _32 and _64 files?
>
> but that file is not needed for x86_64

Which means there's no conflict with any 64-bit code, and thus no reason
to break it out into a _32 file.

-- Chris
Re: about relocs.c on x86
Yinghai Lu wrote:
> why not rename relocs.c to relocs_32.c?

Because we're trying to get rid of all the _32 and _64 files?

-- Chris
Re: Strange error?
Gene Heskett wrote:
> Greetings all;
>
> This line showed up in my log a couple of hours ago, several minutes
> removed from anything else I was doing at the time:
>
> rarian-sk-get-c[31855]: segfault at eip 00b7c153 esp bf9ddf0c error 4
>
> The system acts and feels normal. Does anyone have a clue to loan me?

I would ask the rarian developers: http://rarian.freedesktop.org/

My barely-educated guess is that Gnome was doing a routine re-index of
its help files and the app got bored and decided to dereference a NULL
pointer for fun. Your desktop documentation index may be incomplete or
corrupt. Try not to panic.

-- Chris
Purpose of numa_node?
While pondering ways to optimize I/O and swapping on large NUMA
machines, I noticed that the numa_node field in struct device isn't
actually used anywhere. We just have a couple dozen lines of code to
conditionally create a sysfs file that will always return -1. Is anyone
even working on code to actually use this field? I think it's a good
piece of information to keep track of, so I'm not suggesting we remove
it, but I want to make sure I'm not stepping on toes or duplicating
effort if I try to make it useful.

-- Chris
Re: [RFC] ext3: per-process soft-syncing data=ordered mode
Al Boldi wrote:
> Greetings!
>
> data=ordered mode has proven reliable over the years, and it does this
> by ordering filedata flushes before metadata flushes. But this
> sometimes causes contention in the order of a 10x slowdown for certain
> apps, either due to the misuse of fsync or due to inherent behaviour
> like db's, as well as inherent starvation issues exposed by the
> data=ordered mode.
>
> data=writeback mode alleviates data=ordered mode slowdowns, but only
> works per-mount and is too dangerous to run as a default mode.
>
> This RFC proposes to introduce a tunable which allows to disable fsync
> and changes ordered into writeback writeout on a per-process basis like
> this:
>
>   echo 1 > /proc/`pidof process`/softsync
>
> Your comments are much welcome!

This is basically a kernel workaround for stupid app behavior. It
wouldn't be the first time we've provided such an option, but we
shouldn't do it without a very good justification. At the very least,
we need a test case that demonstrates the problem and benchmark results
that prove that this approach actually fixes it. I suspect we can find
a cleaner fix for the problem.

-- Chris
Re: [PATCH 09/26] atl1: refactor tx processing
Jay Cliburn wrote:
> On Tue, 22 Jan 2008 18:31:09 -0600 Jay Cliburn <[EMAIL PROTECTED]> wrote:
> > On Tue, 22 Jan 2008 04:58:17 -0500 Jeff Garzik <[EMAIL PROTECTED]> wrote:
> > > [...] for such a huge patch, this description is very tiny. [describe] what is refactored, and why.
>
> Is this one any better?

This satisfies me.

Acked-by: Chris Snook <[EMAIL PROTECTED]>

> From df475e2eea401f9dc18ca23dab538b99fb9e710c Mon Sep 17 00:00:00 2001
> From: Jay Cliburn <[EMAIL PROTECTED]>
> Date: Wed, 23 Jan 2008 21:36:36 -0600
> Subject: [PATCH] atl1: simplify tx packet descriptor
>
> The transmit packet descriptor consists of four 32-bit words, with word 3 upper bits overloaded depending upon the condition of its bits 3 and 4. The driver currently duplicates all word 2 and some word 3 register bit definitions unnecessarily and also uses a set of nested structures in its definition of the TPD without good cause.
>
> This patch adds a lengthy comment describing the TPD, eliminates duplicate TPD bit definitions, and simplifies the TPD structure itself. It also expands the TSO check to correctly handle custom checksum versus TSO processing using the revised TPD definitions.
>
> Finally, shorten some variable names in the transmit processing path to reduce line lengths, rename some variables to better describe their purpose (e.g., nseg versus m), and add a comment or two to better describe what the code is doing.
>
> Signed-off-by: Jay Cliburn <[EMAIL PROTECTED]>
Re: [PATCH 06/26] atl1: update initialization parameters
Jay Cliburn wrote:
> On Tue, 22 Jan 2008 04:56:11 -0500 Jeff Garzik <[EMAIL PROTECTED]> wrote:
> > [EMAIL PROTECTED] wrote:
> > > From: Jay Cliburn <[EMAIL PROTECTED]>
> > >
> > > Update initialization parameters to match the current vendor driver version 1.2.40.2.
> > [...]
> > ACK without any better knowledge... but is any additional insight available at all?
>
> No, sorry Jeff. I simply took the vendor's current driver and matched his initialization settings. I can only assume he discovered these values through lab testing.
>
> For this and the other "conform to vendor driver" patches in this set, I thought it important to have the in-tree driver match the vendor driver as closely as possible. The primary motivations are (1) my belief that he's in a better position to test the NIC, and (2) to be able to go to him for assistance occasionally and not be rejected because of significant differences between his and our drivers.

I don't think we should be doing this without justification. From all the atl1 and atl2 code I've looked at, I've gotten the impression that their driver development processes are extremely ad-hoc. There is code in the Atheros version of atl2 that cannot *possibly* apply to that hardware and was just copied and pasted from atl1, just as much of atl1 was copied and pasted from e1000. The fact that various versions have different magic numbers may simply mean they copied and pasted from different irrelevant and incorrect sources.

Our contacts at Atheros seem to be very good electrical engineers, so when they tell us that a certain setting should be changed to match particular properties of the hardware, I trust them. They are not, however, experienced and disciplined kernel developers, so absent such justification I think we should stick with what we have, which has been improved and reviewed by people who *are* experienced and disciplined kernel developers. We have at least as much to teach Atheros about writing kernel code as they have to teach us about their hardware.

-- Chris
Re: Usage semantics of atomic_set ( )
Vineet Gupta wrote:
> I'm trying to implement atomic ops for a CPU which has no inherent support for Read-Modify-Write Ops. Instead of using a global spin lock which protects all the atomic APIs, I want to use a spin lock per instance of atomic_t.

What operations are you using to implement spinlocks?

A few architectures use arrays of spinlocks to implement atomic_t. I believe sparc and parisc are among them. Assuming your spinlock implementation is sound and efficient, the same technique should work for you.

-- Chris
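For illustration, here is a minimal userspace sketch of the array-of-locks technique, assuming pthread mutexes as a stand-in for the architecture's spinlocks. The names my_atomic_t, lock_table, lock_for and NR_LOCKS are hypothetical and are not taken from any in-tree architecture.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NR_LOCKS 4 /* hypothetical table size; must be a power of two */

typedef struct { int counter; } my_atomic_t; /* stand-in for atomic_t */

static pthread_mutex_t lock_table[NR_LOCKS] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};

/* Hash the variable's address to one lock in the table. */
static pthread_mutex_t *lock_for(const my_atomic_t *v)
{
	/* Drop low address bits that are the same for aligned objects. */
	uintptr_t h = (uintptr_t)v >> 4;

	return &lock_table[h & (NR_LOCKS - 1)];
}

/* Read-modify-write done entirely under the hashed lock. */
static int my_atomic_add_return(int i, my_atomic_t *v)
{
	pthread_mutex_t *l;
	int ret;

	assert(v != NULL);
	l = lock_for(v);
	pthread_mutex_lock(l);
	ret = (v->counter += i);
	pthread_mutex_unlock(l);
	return ret;
}

/*
 * The store takes the same lock. A plain store could land in the
 * middle of another thread's locked read-modify-write and be lost.
 */
static void my_atomic_set(my_atomic_t *v, int i)
{
	pthread_mutex_t *l;

	assert(v != NULL);
	l = lock_for(v);
	pthread_mutex_lock(l);
	v->counter = i;
	pthread_mutex_unlock(l);
}
```

A table of locks costs no extra memory per atomic_t, at the price of occasional false contention when two variables hash to the same slot. A lock embedded in every atomic_t avoids that contention but grows every instance.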
Re: Strange NFS write performance Linux->Solaris-10/VXFS, maybe VW related
Martin Knoblauch wrote:
> Hi,
>
> currently I am tracking down an "interesting" effect when writing to a Solaris-10/Sparc based server. The server exports two filesystems. One UFS, one VXFS. The filesystems are mounted NFS3/TCP, no special options. Linux kernel in question is 2.6.24-rc6, but it happens with earlier kernels (2.6.19.2, 2.6.22.6) as well. The client is x86_64 with 8 GB of ram.
>
> The problem: when writing to the VXFS based filesystem, performance drops dramatically when the filesize reaches or exceeds "dirty_ratio". For a dirty_ratio of 10% (about 800MB) files below 750 MB are transferred with about 30 MB/sec. Anything above 770 MB drops down to below 10 MB/sec. If I perform the same tests on the UFS based FS, performance stays at about 30 MB/sec until 3GB and likely larger (I just stopped at 3 GB).
>
> Any ideas what could cause this difference? Any suggestions on debugging it?

1) Try normal NFS tuning, such as rsize/wsize tuning.

2) You're entering synchronous writeback mode, so you can delay the problem by raising dirty_ratio to 100, or reduce the size of the problem by lowering dirty_ratio to 1. Either one could help.

3) It sounds like the bottleneck is the vxfs filesystem. It only *appears* on the client side because writes up until dirty_ratio get buffered on the client. If you can confirm that the server is actually writing stuff to disk slower when the client is in writeback mode, then it's possible the Linux NFS client is doing something inefficient in writeback mode.

-- Chris
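As a rough check of the numbers quoted in this thread, the point at which writers are pushed into synchronous writeback can be estimated from the memory size and dirty_ratio. This is only a sketch: the helper name is hypothetical, and the kernel applies the ratio to the memory it considers dirtyable rather than to total RAM, so real thresholds come out somewhat lower.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the dirty threshold in bytes for a given memory size and
 * dirty_ratio percentage. Approximation only; see the note above.
 */
static uint64_t dirty_threshold_bytes(uint64_t mem_bytes, unsigned int ratio_percent)
{
	assert(ratio_percent <= 100);
	return mem_bytes / 100 * ratio_percent;
}
```

With 8 GB of memory and dirty_ratio set to 10 this gives roughly 800 MB, which matches the file size at which the slowdown was reported.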
Re: [PATCH] drivers/net/: Spelling fixes
Joe Perches wrote:
> drivers/net/atl1/atl1_hw.c   | 2 +-
> drivers/net/atl1/atl1_main.c | 2 +-

The atl1 code will be heavily reworked in the 2.6.25 merge window, so this may cause headaches. Please remove these chunks before merging. The spelling corrections themselves are fine, and I will ensure that the revised driver includes them, if the comments in question are still present at all once we're done with all the changes and cleanups.

-- Chris
Re: Linux Kernel - Future works
Muhammad Nowbuth wrote:
> Hi all,
>
> Could anyone give some ideas of future pending works which are needed on the linux kernel?

http://kernelnewbies.org/KernelHacking
Re: Kernel Development & Objective-C
Ben Crowhurst wrote:
> Has Objective-C ever been considered for kernel development?

No. Kernel programming requires what is essentially assembly language with a lot of syntactic sugar, which C provides. Higher-level languages abstract away too much detail to be suitable for the sort of bit-perfect control you need when you're directly controlling bare metal. You can still use object-oriented programming techniques in C, and we do this all the time in the kernel, but we do so with more fine-grained explicit control than a language like Objective-C would give us. More to the point, if we tried to use Objective-C, we'd find ourselves needing to fall back to C-style explicitness so often that it wouldn't be worth the trouble.

In other news, I hear Hurd boots again!

-- Chris
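To make the point about object-oriented techniques in C concrete, here is a small sketch in the spirit of the kernel's operations tables: a struct of function pointers acts as a method table, and an embedded struct acts as a base class. All names here (shape, shape_ops, rect, to_rect) are hypothetical illustrations, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

struct shape;

/* Table of "methods", one function pointer per operation. */
struct shape_ops {
	long (*area)(const struct shape *s);
};

/* "Base class": just a pointer to the method table. */
struct shape {
	const struct shape_ops *ops;
};

/* "Derived class": embeds the base object as a member. */
struct rect {
	struct shape base;
	long w, h;
};

/* Recover the containing rect from a pointer to its embedded base. */
static const struct rect *to_rect(const struct shape *s)
{
	return (const struct rect *)((const char *)s - offsetof(struct rect, base));
}

static long rect_area(const struct shape *s)
{
	const struct rect *r = to_rect(s);

	return r->w * r->h;
}

static const struct shape_ops rect_ops = { .area = rect_area };

/* Explicit dynamic dispatch: the indirection is visible in the source. */
static long shape_area(const struct shape *s)
{
	assert(s && s->ops && s->ops->area);
	return s->ops->area(s);
}
```

Nothing is hidden here: the layout of each object, the cost of the indirect call, and the pointer arithmetic used for the downcast are all spelled out, which is the explicitness the reply refers to.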
Re: [PATCH] Avoid overflows in kernel/time.c
H. Peter Anvin wrote:
> NOTE: This patch uses a bc(1) script to compute the appropriate constants.

Perhaps dc would be more appropriate? That's included in busybox.

-- Chris
Re: [2.6.22.y][PATCH] atl1: disable broken 64-bit DMA
Jay Cliburn wrote:
> atl1: disable broken 64-bit DMA
>
> [ Upstream commit: 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 ]
>
> The L1 network chip can DMA to 64-bit addresses, but multiple descriptor rings share a single register for the high 32 bits of their address, so only a single, aligned, 4 GB physical address range can be used at a time. As a result, we need to confine the driver to a 32-bit DMA mask, otherwise we see occasional data corruption errors in systems containing 4 or more gigabytes of RAM.
>
> Signed-off-by: Jay Cliburn <[EMAIL PROTECTED]>
> Cc: Luca Tettamanti <[EMAIL PROTECTED]>
> Cc: Chris Snook <[EMAIL PROTECTED]>

Acked-By: Chris Snook <[EMAIL PROTECTED]>
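The constraint described in the commit message can be illustrated with a few lines of address arithmetic. This is a userspace sketch with hypothetical helper names, not driver code: it only shows why rings that share one high-32-bit register must sit in the same aligned 4 GB window, and why a 32-bit mask guarantees that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if two ring base addresses could share one "high 32 bits" register. */
static bool rings_share_high32(uint64_t ring_a, uint64_t ring_b)
{
	return (ring_a >> 32) == (ring_b >> 32);
}

/* True if the buffer [addr, addr + len) stays inside one aligned 4 GB window. */
static bool same_4g_window(uint64_t addr, uint64_t len)
{
	assert(len > 0);
	return (addr >> 32) == ((addr + len - 1) >> 32);
}

/*
 * With a 32-bit DMA mask every address has zero high bits, so all
 * rings trivially agree on the shared register value.
 */
static bool fits_dma_mask32(uint64_t addr)
{
	return addr <= UINT32_MAX;
}
```

On a machine with less than 4 GB of RAM every allocation typically passes fits_dma_mask32() anyway, which is consistent with the corruption only showing up on systems with 4 GB or more.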
Re: PROBLEM: IM Kernel Failure 12/11/07
[EMAIL PROTECTED] wrote:
> Linux version 2.4.9-e.38smp ([EMAIL PROTECTED]) (gcc version 2.96 2731 (Red Hat Linux 7.2 2.96-124.7.2)) #1 SMP Wed Feb 11 00:09:01 EST 2004

Ancient vendor kernels are very out of scope for this mailing list. The following links may be useful:

https://bugzilla.redhat.com/
https://www.redhat.com/apps/support/
http://www.redhat.com/mailman/listinfo
Re: Strange delays / what usually happens every 10 min?
Florian Boelstler wrote:
> While running that test driver a delay of about 10ms _exactly_ occurs every 10 minutes.

This is precisely the sort of thing that BIOS/firmware-level SMI handlers do, particularly those that have monitoring or management features. Try to determine if the kernel is doing anything during this time. If the entire kernel seems to be frozen, talk to the people who wrote the firmware.

-- Chris
Re: PAGE_SIZE on 64bit and 32bit machines
Yoav Artzi wrote:
> According to my knowledge the PAGE_SIZE on 32bit architectures is 4KB. Logically, the PAGE_SIZE on 64bit architectures should be 8KB. That's at least the way I understand it. However, looking at the kernel code of x86_64, I see the PAGE_SIZE is 4KB.
>
> Can anyone explain to me what I am missing here?

PAGE_SIZE is highly architecture-dependent. While it is true that 4K pages are typical on 32-bit architectures, and 64-bit architectures have historically introduced 8K pages, this is by no means a requirement. x86_64 uses the same page sizes that are available on i686+PAE, so you get 4K base pages. alpha and sparc64 typically use 8K base pages, though they have other options as well. ia64 defaults to 16K, though it can do 4K, 8K, and a bunch of larger base sizes. ppc64 does 4K and 64K. s390 uses 4K base pages in both 31-bit and 64-bit kernels.

If x86_64 processors are released with TLBs that can handle 8K pages, it'll be straightforward to add that feature, but otherwise it would require faking it in software, which has lots of pitfalls and does nothing to improve TLB efficiency.

-- Chris
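Because the base page size varies by architecture and kernel configuration, userspace code should ask for it at run time rather than assume 4K or 8K. A minimal sketch using the POSIX sysconf() call follows; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Ask the running system for its base page size in bytes. */
static long runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);
	return sz;
}

/* Number of pages needed to hold 'bytes' for a given page size. */
static size_t pages_for(size_t bytes, size_t page_size)
{
	assert(page_size > 0);
	return (bytes + page_size - 1) / page_size;
}
```

The same 8193-byte buffer needs three pages with 4K pages but only two with 8K pages, which is why hard-coded page counts do not port between the architectures listed above.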
Re: [poll] Is the megafreeze development model broken?
ciol wrote:
> Chris Snook wrote:
> > Why are you asking the developers? We do this for the sake of the users.
>
> The kernel is the software of the developers.

The kernel is a technology. A distribution is a product. When decisions about technology and decisions about products are made *entirely* by the same people, the result is never good.

> It's important to know how they want it to be distributed.

For commercial distributions, the answer is: "In whichever way results in the largest paycheck with the least amount of stress and effort", which means doing it the way that's best for the customer. Non-commercial distributions have less of this pressure, but the same principle applies if they care about their users. If you're not interested in the users but you are interested in the technology, you should be doing your work upstream, so the distribution is irrelevant.

Don't get me wrong, I think stable kernel trees like 2.6.16 are a good thing. They serve very well a whole bunch of different niches where users are willing to sacrifice the support benefits of a distribution kernel for the control of an upstream kernel, while maintaining the stability of their installed base. These users have little interest in the general-purpose distribution kernel anyway, aside from perhaps wishing it included some config or patch that its maintainers have elected not to include.

-- Chris
Re: Coding Style: indenting with tabs vs. spaces
Benny Halevy wrote:
> Greetings,
>
> I would like to hear people's opinions about the indentation convention described below that I personally found the most practical with several different editors.
>
> The gist of it is that tabs should be used for nesting, not for decoration. Indent your code with as many tabs as your nesting level, where all statements will begin, and from there on use space characters.
>
> The rationale behind it is to be tab-width agnostic so regardless of your tab expansion setup, the code will look correct and will make sense.
>
> When you break a line and want the new line text to start below a specific point relative to the previous line (I consider that "decorating") then start the new line with the same number of tabs as the previous one and then just use space characters as their width is the same as any character in the previous line, (assuming fixed-width fonts of course).

I find it meaningful to indent extended lines one extra tab stop, but beyond that I agree it is just decoration.

-- Chris
Re: [poll] Is the megafreeze development model broken?
ciol wrote:
> Hi, I'd like to ask you a few questions:
>
> * Do you like the way linux distributions integrate the kernel?
> * Wouldn't you prefer they ship with the stable and still maintained 2.6.16.X, while providing optionally the latest kernel for those who want or just have a new hardware?
> * Do you think the megafreeze development model [1] and the "I don't trust in upstream" development model are broken? (And why)

Why are you asking the developers? We do this for the sake of the users.

-- Chris
Re: [poll] Is the megafreeze development model broken?
ciol wrote: Hi, I'd like to ask you a few questions: * Do you like the way linux distributions integrate the kernel? * Wouldn't you prefer they ship with the stable and still maintained 2.6.16.X, while providing optionally the latest kernel for those who want or just have a new hardware? * Do you think the megafreeze development model [1] and the I don't trust in upstream development model are broken? (And why) Why are you asking the developers? We do this for the sake of the users. -- Chris - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Coding Style: indenting with tabs vs. spaces
Benny Halevy wrote: Greetings, I would like to hear peoples opinion about the indentation convention described below that I personally found the most practical with several different editors. The gist of it is that tabs should be used for nesting, not for decoration. Indent your code with as many tabs as your nesting level, where all statements will begin, and from there on use space characters. The rational behind it is to be tab-width agnostic so regardless of your tab expansion setup, the code will look correct and will make sense. When you break a line and want the new line text to start below a specific point relative to the previous line (I consider that decorating) then start the new line with the same number of tabs as the previous one and then just use space characters as their width is the same as any character in the previous line, (assuming fixed-width fonts of course). I find it meaningful to indent extended lines one extra tab stop, but beyond that I agree it is just decoration. -- Chris - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [poll] Is the megafreeze development model broken?
ciol wrote:
> Chris Snook wrote:
>> Why are you asking the developers? We do this for the sake of the users.
>
> The kernel is the software of the developers.

The kernel is a technology. A distribution is a product. When decisions about technology and decisions about products are made *entirely* by the same people, the result is never good.

> It's important to know how they want it to be distributed.

For commercial distributions, the answer is: In whichever way results in the largest paycheck with the least amount of stress and effort, which means doing it the way that's best for the customer. Non-commercial distributions have less of this pressure, but the same principle applies if they care about their users. If you're not interested in the users but you are interested in the technology, you should be doing your work upstream, so the distribution is irrelevant.

Don't get me wrong, I think stable kernel trees like 2.6.16 are a good thing. They serve very well a whole bunch of different niches where users are willing to sacrifice the support benefits of a distribution kernel for the control of an upstream kernel, while maintaining the stability of their installed base. These users have little interest in the general-purpose distribution kernel anyway, aside from perhaps wishing it included some config or patch that its maintainers have elected not to include.

-- Chris
Re: [RFC/PATCH] Optimize zone allocator synchronization
Don Porter wrote:
> From: Donald E. Porter <[EMAIL PROTECTED]>
>
> In the bulk page allocation/free routines in mm/page_alloc.c, the zone lock is held across all iterations. For certain parallel workloads, I have found that releasing and reacquiring the lock for each iteration yields better performance, especially at higher CPU counts. For instance, kernel compilation is sped up by 5% on an 8 CPU test machine. In most cases, there is no significant effect on performance (although the effect tends to be slightly positive).

This seems quite reasonable for the very small scope of the change.

> My intuition is that this patch prevents smaller requests from waiting on larger ones. While grabbing and releasing the lock within the loop adds a few instructions, it can lower the latency for a particular thread's allocation which is often on the thread's critical path. Lowering the average latency for allocation can increase system throughput.
>
> More detailed information, including data from the tests I ran to validate this change are available at http://www.cs.utexas.edu/~porterde/kernel-patch.html .
>
> Thanks in advance for your consideration and feedback.

That's an interesting insight. My intuition is that Nick Piggin's recently-posted ticket spinlocks patches[1] will reduce the need for this patch, though it may be useful to have both. Can you benchmark again with only ticket spinlocks, and with ticket spinlocks + this patch? You'll probably want to use 2.6.24-rc1 as your baseline, due to the x86 architecture merge.

-- Chris

[1] http://lkml.org/lkml/2007/11/1/123
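The two locking patterns being compared can be sketched in userspace C. This is only an illustration under stated assumptions: it uses a pthread mutex in place of the kernel's zone spinlock, and struct pool, free_bulk_coarse() and free_bulk_fine() are hypothetical names that do not appear in mm/page_alloc.c or in the posted patch.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone: one lock guarding one counter. */
struct pool {
	pthread_mutex_t lock;
	long free_pages;
};

/* Pattern A: the lock is held across all iterations of the batch. */
static void free_bulk_coarse(struct pool *p, size_t count)
{
	pthread_mutex_lock(&p->lock);
	for (size_t i = 0; i < count; i++)
		p->free_pages++;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Pattern B: the lock is released and reacquired on every iteration,
 * so a waiter with a small request can get in between two iterations
 * of a large batch instead of waiting for the whole batch to finish.
 */
static void free_bulk_fine(struct pool *p, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		pthread_mutex_lock(&p->lock);
		p->free_pages++;
		pthread_mutex_unlock(&p->lock);
	}
}
```

Both variants leave the counter in the same state; they differ only in how long any single hold of the lock lasts. Whether a waiter actually gets the lock between iterations depends on the fairness of the lock implementation, which is presumably why the ticket spinlock comparison is relevant.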
Re: Quad core CPU detected but shows as single core in 2.6.23.1
Zurk Tech wrote:
> dmesg (new) with disabled GART error reporting if anyone wants to compare to previous dmesg with GART error reporting :

A few unrelated observations about Barcelona support...

> Marking TSC unstable due to TSCs unsynchronized

This is probably wrong. The TSC is on the northbridge on Barcelona chips, so every core on the die should be in sync. Hypothetically you could have different speed northbridges in different sockets, but we've never tried very hard to support that case anyway. We should probably be marking the TSC as stable on Barcelona chips.

> xor: automatically using best checksumming function: generic_sse
> generic_sse: 7449.000 MB/sec
> xor: using function: generic_sse (7449.000 MB/sec)

We should probably also implement an SSE5 function to take advantage of the 128-bit SSE operations supported on newer processors.

> pnp: the driver 'system' has been registered
> pnp: match found with the PnP device '00:08' and the driver 'system'
> pnp: match found with the PnP device '00:09' and the driver 'system'
> pnp: 00:09: ioport range 0x580-0x58f has been reserved
> pnp: 00:09: ioport range 0x590-0x593 has been reserved
> pnp: 00:09: ioport range 0x700-0x703 has been reserved
> pnp: 00:09: ioport range 0xca0-0xcaf has been reserved
> pnp: 00:09: iomem range 0xfec0-0xfec00fff could not be reserved
> pnp: 00:09: iomem range 0xfec01000-0xfec01fff could not be reserved
> pnp: 00:09: iomem range 0xfec02000-0xfec02fff could not be reserved
> pnp: 00:09: iomem range 0xfee0-0xfee00fff could not be reserved
> pnp: match found with the PnP device '00:0a' and the driver 'system'
> pnp: 00:0a: ioport range 0x600-0x61f has been reserved
> pnp: 00:0a: ioport range 0x520-0x53f has been reserved
> pnp: 00:0a: ioport range 0x540-0x54f has been reserved
> pnp: 00:0a: ioport range 0x640-0x65f has been reserved
> pnp: match found with the PnP device '00:0b' and the driver 'system'
> pnp: 00:0b: iomem range 0xe000-0xefff has been reserved
> pnp: match found with the PnP device '00:0c' and the driver 'system'
> pnp: 00:0c: iomem range 0x0-0x9 could not be reserved
> pnp: 00:0c: iomem range 0x0-0x0 could not be reserved
> pnp: 00:0c: iomem range 0xe-0xf could not be reserved
> pnp: 00:0c: iomem range 0x10-0xc7ff could not be reserved
> PCI: Bridge: :01:0d.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled.
> PCI: Bridge: :00:01.0 IO window: a000-bfff MEM window: ff40-ff4f PREFETCH window: disabled.
> PCI: Bridge: :00:06.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled.
> PCI: Bridge: :00:07.0 IO window: disabled. MEM window: ff50-ff5f PREFETCH window: cfe0-cfef
> PCI: Bridge: :00:08.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled.
> PCI: Bridge: :00:09.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled.
> PCI: Bridge: :00:0a.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled.
> PCI: Bridge: :00:0b.0 IO window: disabled. MEM window: disabled.

Hmmm... perhaps we're not handling the new mmconfig stuff correctly? Or maybe the BIOS isn't.

> hwmon-vid: Unknown VRM version of your x86 CPU : Not supporting VRM 0.0

This code probably needs an update for Barcelona.

> raid6: int64x1 1920 MB/s
> raid6: int64x2 2353 MB/s
> raid6: int64x4 2331 MB/s
> raid6: int64x8 1254 MB/s
> raid6: sse2x1 2664 MB/s
> raid6: sse2x2 4214 MB/s
> raid6: sse2x4 4905 MB/s
> raid6: using algorithm sse2x4 (4905 MB/s)

An update here for SSE5 might be in order as well.

-- Chris
Re: [PATCH][REFERENCE ONLY] 9p: ramfs 9p server
Latchesar Ionkov wrote:
> Sample ramfs file server that uses the in-kernel 9P file server support. This code is for reference only.

Reference code generally goes in Documentation/

-- Chris
Re: Quad core CPU detected but shows as single core in 2.6.23.1
Zurk Tech wrote:
> Hi guys, I have a Tyan S3992 h2000 with a single Barcelona AMD quad core CPU (the other CPU socket is empty). cat /proc/cpuinfo shows an AMD quad core processor but core : 1
>
> I've compiled the kernel from scratch with SMP and amd64 + the NUMA stuff. I also tried Debian etch's amd64 SMP kernel, with the same result. Is the AMD Barcelona quad core CPU not yet supported, or is it something else?
>
> Thanks for any insight. I'm completely stumped. I've dealt with multiprocessing machines before and have a couple of dual cores which are fine with the exact same kernel configs. My AMD TK-53 X2 Turions show 2 cores in cpuinfo.

The bootstrap protocol for Barcelona is a little different from older Opterons, so an older BIOS that doesn't know the new protocol won't be able to bring up any CPU other than the bootstrap processor. My wild guess is that this is what's happening and a BIOS update will fix it, but as Arjan said, please post dmesg when reporting bugs like this.

-- Chris
[PATCH] x86: unify div64{,_32,_64}.h
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 div64.h headers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/div64_32.h b/include/asm-x86/div64_32.h
--- a/include/asm-x86/div64_32.h	2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_32.h	1969-12-31 19:00:00.0 -0500
@@ -1,52 +0,0 @@
-#ifndef __I386_DIV64
-#define __I386_DIV64
-
-#include <linux/types.h>
-
-/*
- * do_div() is NOT a C function. It wants to return
- * two values (the quotient and the remainder), but
- * since that doesn't work very well in C, what it
- * does is:
- *
- * - modifies the 64-bit dividend _in_place_
- * - returns the 32-bit remainder
- *
- * This ends up being the most efficient "calling
- * convention" on x86.
- */
-#define do_div(n,base) ({ \
-	unsigned long __upper, __low, __high, __mod, __base; \
-	__base = (base); \
-	asm("":"=a" (__low), "=d" (__high):"A" (n)); \
-	__upper = __high; \
-	if (__high) { \
-		__upper = __high % (__base); \
-		__high = __high / (__base); \
-	} \
-	asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
-	asm("":"=A" (n):"a" (__low),"d" (__high)); \
-	__mod; \
-})
-
-/*
- * (long)X = ((long long)divs) / (long)div
- * (long)rem = ((long long)divs) % (long)div
- *
- * Warning, this will do an exception if X overflows.
- */
-#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
-
-static inline long
-div_ll_X_l_rem(long long divs, long div, long *rem)
-{
-	long dum2;
-	__asm__("divl %2":"=a"(dum2), "=d"(*rem)
-	:"rm"(div), "A"(divs));
-
-	return dum2;
-
-}
-
-extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
-#endif
diff -Nurp a/include/asm-x86/div64_64.h b/include/asm-x86/div64_64.h
--- a/include/asm-x86/div64_64.h	2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_64.h	1969-12-31 19:00:00.0 -0500
@@ -1 +0,0 @@
-#include <asm-generic/div64.h>
diff -Nurp a/include/asm-x86/div64.h b/include/asm-x86/div64.h
--- a/include/asm-x86/div64.h	2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64.h	2007-10-20 07:32:34.0 -0400
@@ -1,5 +1,58 @@
+#ifndef _ASM_X86_DIV64_H
+#define _ASM_X86_DIV64_H
+
 #ifdef CONFIG_X86_32
-# include "div64_32.h"
-#else
-# include "div64_64.h"
-#endif
+
+#include <linux/types.h>
+
+/*
+ * do_div() is NOT a C function. It wants to return
+ * two values (the quotient and the remainder), but
+ * since that doesn't work very well in C, what it
+ * does is:
+ *
+ * - modifies the 64-bit dividend _in_place_
+ * - returns the 32-bit remainder
+ *
+ * This ends up being the most efficient "calling
+ * convention" on x86.
+ */
+#define do_div(n,base) ({ \
+	unsigned long __upper, __low, __high, __mod, __base; \
+	__base = (base); \
+	asm("":"=a" (__low), "=d" (__high):"A" (n)); \
+	__upper = __high; \
+	if (__high) { \
+		__upper = __high % (__base); \
+		__high = __high / (__base); \
+	} \
+	asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
+	asm("":"=A" (n):"a" (__low),"d" (__high)); \
+	__mod; \
+})
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ */
+#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
+
+static inline long
+div_ll_X_l_rem(long long divs, long div, long *rem)
+{
+	long dum2;
+	__asm__("divl %2":"=a"(dum2), "=d"(*rem)
+	:"rm"(div), "A"(divs));
+
+	return dum2;
+
+}
+
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+
+# else
+# include <asm-generic/div64.h>
+# endif /* CONFIG_X86_32 */
+#endif /* _ASM_X86_DIV64_H */
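For readers unfamiliar with the do_div() calling convention described in the comment above, here is a portable C sketch of the same semantics. It is an illustration only, not kernel code: do_div_sketch() is a hypothetical name, it takes a pointer because a plain C function cannot modify its argument in place the way the macro does, and it uses ordinary 64-bit division where the macro uses a single divl instruction.

```c
#include <stdint.h>

/*
 * Hypothetical portable sketch of do_div() semantics: *n is replaced by
 * the 64-bit quotient and the 32-bit remainder is returned.
 * base must be nonzero.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t low = (uint32_t)*n;
	uint32_t high = (uint32_t)(*n >> 32);
	uint32_t upper = high;
	uint64_t partial;

	/* Divide the high word first; its remainder carries into the low step. */
	if (high) {
		upper = high % base;
		high = high / base;
	}

	/*
	 * upper < base here, so the quotient of (upper:low) / base fits in
	 * 32 bits. That is the condition under which divl does not fault.
	 */
	partial = ((uint64_t)upper << 32) | low;
	low = (uint32_t)(partial / base);

	*n = ((uint64_t)high << 32) | low;
	return (uint32_t)(partial % base);
}
```

A caller would write `rem = do_div_sketch(&value, 10);` and afterwards find the quotient in `value`, which mirrors how `rem = do_div(value, 10);` behaves with the macro.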
[PATCH] x86: unify a.out{,_32,_64}.h
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 a.out_32.h and a.out_64.h

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/a.out_32.h b/include/asm-x86/a.out_32.h
--- a/include/asm-x86/a.out_32.h	2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out_32.h	1969-12-31 19:00:00.0 -0500
@@ -1,27 +0,0 @@
-#ifndef __I386_A_OUT_H__
-#define __I386_A_OUT_H__
-
-struct exec
-{
-	unsigned long a_info;	/* Use macros N_MAGIC, etc for access */
-	unsigned a_text;	/* length of text, in bytes */
-	unsigned a_data;	/* length of data, in bytes */
-	unsigned a_bss;		/* length of uninitialized data area for file, in bytes */
-	unsigned a_syms;	/* length of symbol table data in file, in bytes */
-	unsigned a_entry;	/* start address */
-	unsigned a_trsize;	/* length of relocation info for text, in bytes */
-	unsigned a_drsize;	/* length of relocation info for data, in bytes */
-};
-
-#define N_TRSIZE(a)	((a).a_trsize)
-#define N_DRSIZE(a)	((a).a_drsize)
-#define N_SYMSIZE(a)	((a).a_syms)
-
-#ifdef __KERNEL__
-
-#define STACK_TOP	TASK_SIZE
-#define STACK_TOP_MAX	STACK_TOP
-
-#endif
-
-#endif /* __A_OUT_GNU_H__ */
diff -Nurp a/include/asm-x86/a.out_64.h b/include/asm-x86/a.out_64.h
--- a/include/asm-x86/a.out_64.h	2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out_64.h	1969-12-31 19:00:00.0 -0500
@@ -1,28 +0,0 @@
-#ifndef __X8664_A_OUT_H__
-#define __X8664_A_OUT_H__
-
-/* 32bit a.out */
-
-struct exec
-{
-	unsigned int a_info;	/* Use macros N_MAGIC, etc for access */
-	unsigned a_text;	/* length of text, in bytes */
-	unsigned a_data;	/* length of data, in bytes */
-	unsigned a_bss;		/* length of uninitialized data area for file, in bytes */
-	unsigned a_syms;	/* length of symbol table data in file, in bytes */
-	unsigned a_entry;	/* start address */
-	unsigned a_trsize;	/* length of relocation info for text, in bytes */
-	unsigned a_drsize;	/* length of relocation info for data, in bytes */
-};
-
-#define N_TRSIZE(a)	((a).a_trsize)
-#define N_DRSIZE(a)	((a).a_drsize)
-#define N_SYMSIZE(a)	((a).a_syms)
-
-#ifdef __KERNEL__
-#include <linux/thread_info.h>
-#define STACK_TOP	TASK_SIZE
-#define STACK_TOP_MAX	TASK_SIZE64
-#endif
-
-#endif /* __A_OUT_GNU_H__ */
diff -Nurp a/include/asm-x86/a.out.h b/include/asm-x86/a.out.h
--- a/include/asm-x86/a.out.h	2007-10-20 06:20:01.0 -0400
+++ b/include/asm-x86/a.out.h	2007-10-20 06:14:26.0 -0400
@@ -1,13 +1,32 @@
+#ifndef _ASM_X86_A_OUT_H
+#define _ASM_X86_A_OUT_H
+
+/* 32bit a.out */
+
+struct exec
+{
+	unsigned int a_info;	/* Use macros N_MAGIC, etc for access */
+	unsigned a_text;	/* length of text, in bytes */
+	unsigned a_data;	/* length of data, in bytes */
+	unsigned a_bss;		/* length of uninitialized data area for file, in bytes */
+	unsigned a_syms;	/* length of symbol table data in file, in bytes */
+	unsigned a_entry;	/* start address */
+	unsigned a_trsize;	/* length of relocation info for text, in bytes */
+	unsigned a_drsize;	/* length of relocation info for data, in bytes */
+};
+
+#define N_TRSIZE(a)	((a).a_trsize)
+#define N_DRSIZE(a)	((a).a_drsize)
+#define N_SYMSIZE(a)	((a).a_syms)
+
 #ifdef __KERNEL__
+# include <linux/thread_info.h>
+# define STACK_TOP	TASK_SIZE
 # ifdef CONFIG_X86_32
-# include "a.out_32.h"
+# define STACK_TOP_MAX	STACK_TOP
 # else
-# include "a.out_64.h"
-# endif
-#else
-# ifdef __i386__
-# include "a.out_32.h"
-# else
-# include "a.out_64.h"
+# define STACK_TOP_MAX	TASK_SIZE64
 # endif
 #endif
+
+#endif /* _ASM_X86_A_OUT_H */
[PATCH] x86: merge mmu{,_32,_64}.h
From: Chris Snook <[EMAIL PROTECTED]>

Merge mmu_32.h and mmu_64.h into mmu.h.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/mmu_32.h b/include/asm-x86/mmu_32.h
--- a/include/asm-x86/mmu_32.h	2007-10-20 02:42:24.0 -0400
+++ b/include/asm-x86/mmu_32.h	1969-12-31 19:00:00.0 -0500
@@ -1,18 +0,0 @@
-#ifndef __i386_MMU_H
-#define __i386_MMU_H
-
-#include <linux/mutex.h>
-/*
- * The i386 doesn't have a mmu context, but
- * we put the segment information here.
- *
- * cpu_vm_mask is used to optimize ldt flushing.
- */
-typedef struct {
-	int size;
-	struct mutex lock;
-	void *ldt;
-	void *vdso;
-} mm_context_t;
-
-#endif
diff -Nurp a/include/asm-x86/mmu_64.h b/include/asm-x86/mmu_64.h
--- a/include/asm-x86/mmu_64.h	2007-10-20 02:42:24.0 -0400
+++ b/include/asm-x86/mmu_64.h	1969-12-31 19:00:00.0 -0500
@@ -1,21 +0,0 @@
-#ifndef __x86_64_MMU_H
-#define __x86_64_MMU_H
-
-#include <linux/spinlock.h>
-#include <linux/mutex.h>
-
-/*
- * The x86_64 doesn't have a mmu context, but
- * we put the segment information here.
- *
- * cpu_vm_mask is used to optimize ldt flushing.
- */
-typedef struct {
-	void *ldt;
-	rwlock_t ldtlock;
-	int size;
-	struct mutex lock;
-	void *vdso;
-} mm_context_t;
-
-#endif
diff -Nurp a/include/asm-x86/mmu.h b/include/asm-x86/mmu.h
--- a/include/asm-x86/mmu.h	2007-10-20 02:42:24.0 -0400
+++ b/include/asm-x86/mmu.h	2007-10-20 02:38:36.0 -0400
@@ -1,5 +1,23 @@
-#ifdef CONFIG_X86_32
-# include "mmu_32.h"
-#else
-# include "mmu_64.h"
+#ifndef _ASM_X86_MMU_H
+#define _ASM_X86_MMU_H
+
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+
+/*
+ * The x86 doesn't have a mmu context, but
+ * we put the segment information here.
+ *
+ * cpu_vm_mask is used to optimize ldt flushing.
+ */
+typedef struct {
+	void *ldt;
+#ifdef CONFIG_X86_64
+	rwlock_t ldtlock;
 #endif
+	int size;
+	struct mutex lock;
+	void *vdso;
+} mm_context_t;
+
+#endif /* _ASM_X86_MMU_H */
[PATCH] x86: unify div64{,_32,_64}.h
From: Chris Snook <[EMAIL PROTECTED]>

Unify x86 div64.h headers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

diff -Nurp a/include/asm-x86/div64_32.h b/include/asm-x86/div64_32.h
--- a/include/asm-x86/div64_32.h 2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_32.h 1969-12-31 19:00:00.0 -0500
@@ -1,52 +0,0 @@
-#ifndef __I386_DIV64
-#define __I386_DIV64
-
-#include <linux/types.h>
-
-/*
- * do_div() is NOT a C function. It wants to return
- * two values (the quotient and the remainder), but
- * since that doesn't work very well in C, what it
- * does is:
- *
- * - modifies the 64-bit dividend _in_place_
- * - returns the 32-bit remainder
- *
- * This ends up being the most efficient calling
- * convention on x86.
- */
-#define do_div(n,base) ({ \
-	unsigned long __upper, __low, __high, __mod, __base; \
-	__base = (base); \
-	asm("":"=a" (__low), "=d" (__high):"A" (n)); \
-	__upper = __high; \
-	if (__high) { \
-		__upper = __high % (__base); \
-		__high = __high / (__base); \
-	} \
-	asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
-	asm("":"=A" (n):"a" (__low),"d" (__high)); \
-	__mod; \
-})
-
-/*
- * (long)X = ((long long)divs) / (long)div
- * (long)rem = ((long long)divs) % (long)div
- *
- * Warning, this will do an exception if X overflows.
- */
-#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
-
-static inline long
-div_ll_X_l_rem(long long divs, long div, long *rem)
-{
-	long dum2;
-	__asm__("divl %2":"=a"(dum2), "=d"(*rem)
-		:"rm"(div), "A"(divs));
-
-	return dum2;
-
-}
-
-extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
-#endif
diff -Nurp a/include/asm-x86/div64_64.h b/include/asm-x86/div64_64.h
--- a/include/asm-x86/div64_64.h 2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64_64.h 1969-12-31 19:00:00.0 -0500
@@ -1 +0,0 @@
-#include <asm-generic/div64.h>
diff -Nurp a/include/asm-x86/div64.h b/include/asm-x86/div64.h
--- a/include/asm-x86/div64.h 2007-10-20 07:33:53.0 -0400
+++ b/include/asm-x86/div64.h 2007-10-20 07:32:34.0 -0400
@@ -1,5 +1,58 @@
+#ifndef _ASM_X86_DIV64_H
+#define _ASM_X86_DIV64_H
+
 #ifdef CONFIG_X86_32
-# include "div64_32.h"
-#else
-# include "div64_64.h"
-#endif
+
+#include <linux/types.h>
+
+/*
+ * do_div() is NOT a C function. It wants to return
+ * two values (the quotient and the remainder), but
+ * since that doesn't work very well in C, what it
+ * does is:
+ *
+ * - modifies the 64-bit dividend _in_place_
+ * - returns the 32-bit remainder
+ *
+ * This ends up being the most efficient calling
+ * convention on x86.
+ */
+#define do_div(n,base) ({ \
+	unsigned long __upper, __low, __high, __mod, __base; \
+	__base = (base); \
+	asm("":"=a" (__low), "=d" (__high):"A" (n)); \
+	__upper = __high; \
+	if (__high) { \
+		__upper = __high % (__base); \
+		__high = __high / (__base); \
+	} \
+	asm("divl %2":"=a" (__low), "=d" (__mod):"rm" (__base), "0" (__low), "1" (__upper)); \
+	asm("":"=A" (n):"a" (__low),"d" (__high)); \
+	__mod; \
+})
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ */
+#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
+
+static inline long
+div_ll_X_l_rem(long long divs, long div, long *rem)
+{
+	long dum2;
+	__asm__("divl %2":"=a"(dum2), "=d"(*rem)
+		:"rm"(div), "A"(divs));
+
+	return dum2;
+
+}
+
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+
+# else
+# include <asm-generic/div64.h>
+# endif /* CONFIG_X86_32 */
+#endif /* _ASM_X86_DIV64_H */
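For anyone not used to the do_div() calling convention described in the comment above, here is a small portable sketch of the same semantics. It is only an illustration: do_div_sketch is a made-up name, it is an ordinary function taking a pointer rather than the kernel macro, and it lets the compiler do the 64-bit division instead of using divl.

```c
#include <stdint.h>

/*
 * Hypothetical illustration of do_div() semantics, not the kernel
 * macro: divide the 64-bit value at *n by a 32-bit base, store the
 * quotient back into *n (in place), and return the 32-bit remainder.
 */
static inline uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}
```

A caller passes the address of the dividend and reads the quotient back from the same variable afterwards, which mirrors how the macro overwrites its first argument.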
[PATCH] x86: mostly merge types.h
From: Chris Snook <[EMAIL PROTECTED]>

Most of types_32.h and types_64.h are the same. Merge the common
definitions into types.h, keeping the differences in their own files.
Also #error if types_{32,64}.h is included directly.

Tested with allmodconfig on x86_64.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

 types.h    | 45 +
 types_32.h | 48 ++--
 types_64.h | 47 +++
 3 files changed, 58 insertions(+), 82 deletions(-)

diff -urp a/include/asm-x86/types_32.h b/include/asm-x86/types_32.h
--- a/include/asm-x86/types_32.h 2007-10-18 04:23:36.0 -0400
+++ b/include/asm-x86/types_32.h 2007-10-18 07:03:05.0 -0400
@@ -1,64 +1,28 @@
 #ifndef _I386_TYPES_H
 #define _I386_TYPES_H
 
-#ifndef __ASSEMBLY__
-
-typedef unsigned short umode_t;
-
-/*
- * __xx is ok: it doesn't pollute the POSIX namespace. Use these in the
- * header files exported to user space
- */
-
-typedef __signed__ char __s8;
-typedef unsigned char __u8;
-
-typedef __signed__ short __s16;
-typedef unsigned short __u16;
-
-typedef __signed__ int __s32;
-typedef unsigned int __u32;
+#ifndef _X86_TYPES_H
+#error Do not include this file directly. Use asm/types.h instead.
+#endif
 
-#if defined(__GNUC__)
+#if !defined(__ASSEMBLY__) && defined(__GNUC__)
 __extension__ typedef __signed__ long long __s64;
 __extension__ typedef unsigned long long __u64;
 #endif
 
-#endif /* __ASSEMBLY__ */
-
-/*
- * These aren't exported outside the kernel to avoid name space clashes
- */
 #ifdef __KERNEL__
 
 #define BITS_PER_LONG 32
 
 #ifndef __ASSEMBLY__
 
-
-typedef signed char s8;
-typedef unsigned char u8;
-
-typedef signed short s16;
-typedef unsigned short u16;
-
-typedef signed int s32;
-typedef unsigned int u32;
-
-typedef signed long long s64;
-typedef unsigned long long u64;
-
-/* DMA addresses come in generic and 64-bit flavours. */
-
+/* DMA addresses come in generic and 64-bit flavours. */
 #ifdef CONFIG_HIGHMEM64G
 typedef u64 dma_addr_t;
 #else
 typedef u32 dma_addr_t;
 #endif
-typedef u64 dma64_addr_t;
 
 #endif /* __ASSEMBLY__ */
-
 #endif /* __KERNEL__ */
-
-#endif
+#endif /* _I386_TYPES_H */
diff -urp a/include/asm-x86/types_64.h b/include/asm-x86/types_64.h
--- a/include/asm-x86/types_64.h 2007-10-18 04:23:36.0 -0400
+++ b/include/asm-x86/types_64.h 2007-10-18 07:03:11.0 -0400
@@ -1,55 +1,22 @@
 #ifndef _X86_64_TYPES_H
 #define _X86_64_TYPES_H
 
-#ifndef __ASSEMBLY__
-
-typedef unsigned short umode_t;
-
-/*
- * __xx is ok: it doesn't pollute the POSIX namespace. Use these in the
- * header files exported to user space
- */
-
-typedef __signed__ char __s8;
-typedef unsigned char __u8;
-
-typedef __signed__ short __s16;
-typedef unsigned short __u16;
-
-typedef __signed__ int __s32;
-typedef unsigned int __u32;
+#ifndef _X86_TYPES_H
+#error Do not include this file directly. Use asm/types.h instead.
+#endif
 
+#ifndef __ASSEMBLY__
 typedef __signed__ long long __s64;
 typedef unsigned long long __u64;
+#endif
 
-#endif /* __ASSEMBLY__ */
-
-/*
- * These aren't exported outside the kernel to avoid name space clashes
- */
 #ifdef __KERNEL__
 
 #define BITS_PER_LONG 64
 
 #ifndef __ASSEMBLY__
-
-typedef signed char s8;
-typedef unsigned char u8;
-
-typedef signed short s16;
-typedef unsigned short u16;
-
-typedef signed int s32;
-typedef unsigned int u32;
-
-typedef signed long long s64;
-typedef unsigned long long u64;
-
-typedef u64 dma64_addr_t;
 typedef u64 dma_addr_t;
-
-#endif /* __ASSEMBLY__ */
+#endif
 
 #endif /* __KERNEL__ */
-
-#endif
+#endif /* _X86_64_TYPES_H */
diff -urp a/include/asm-x86/types.h b/include/asm-x86/types.h
--- a/include/asm-x86/types.h 2007-10-18 04:23:36.0 -0400
+++ b/include/asm-x86/types.h 2007-10-18 06:59:37.0 -0400
@@ -1,3 +1,46 @@
+#ifndef _X86_TYPES_H
+#define _X86_TYPES_H
+
+#ifndef __ASSEMBLY__
+
+typedef unsigned short umode_t;
+
+/*
+ * __xx is ok: it doesn't pollute the POSIX namespace. Use these in the
+ * header files exported to user space
+ */
+
+typedef __signed__ char __s8;
+typedef unsigned char __u8;
+
+typedef __signed__ short __s16;
+typedef unsigned short __u16;
+
+typedef __signed__ int __s32;
+typedef unsigned int __u32;
+
+/*
+ * These aren't exported outside the kernel to avoid name space clashes
+ */
+#ifdef __KERNEL__
+
+typedef signed char s8;
+typedef unsigned char u8;
+
+typedef signed short s16;
+typedef unsigned short u16;
+
+typedef signed int s32;
+typedef unsigned int u32;
+
+typedef signed long long s64;
+typedef unsigned long long u64;
+
+typedef u64 dma64_addr_t;
+
+#endif /* __KERNEL__ */
+#endif /* __ASSEMBLY__ */
+
 #ifdef __KERNEL__
 # ifdef CONFIG_X86_32
 #  include "types_32.h"
@@ -11,3 +54,5 @@
 #  include "types_64.h"
 # endif
 #endif
+
+#endif /* _X86_TYPES_H */
Re: NVIDIA Ethernet & invalid MAC
Konstantin Kalin wrote:
> P.S. It's simple to add DEV_HAS_CORRECT_MACADDR to pci_device_tlb for
> these types of Ethernet. But I think it's not right decision because it
> would break older revisions of these models.

Any reason you can't distinguish based on PCI ID?

-- Chris
Re: gigabit ethernet power consumption
Pavel Machek wrote:
> Hi!
>
> I've found that gbit vs. 100mbit power consumption difference is about
> 1W -- pretty significant. (Maybe powertop should include it in the tips
> section? :).
>
> Energy Star people insist that machines should switch down to 100mbit
> when network is idle, and I guess that makes a lot of sense -- you save
> 1W locally and 1W on the router.
>
> Question is, how to implement it correctly? Daemon that would watch
> data rates and switch speeds using mii-tool would be simple, but is
> that enough?

I believe you misspelled "ethtool".

While you're at it, why stop at 100Mb? I believe you save even more
power at 10Mb, which is why WOL puts the card in 10Mb mode. In my
experience, you generally want either the maximum setting or the
minimum setting when going for power savings, because of the
race-to-idle effect. Workloads that have a sustained fractional
utilization are rare. Right now I'm at home, hooked up to a cable
modem, so anything over 4Mb is wasted, unless I'm talking to the box
across the room, which is rare.

Talk to the NetworkManager folks. This is right up their alley.

-- Chris
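To make the "maximum or minimum, nothing in between" point concrete, here is a minimal sketch of the decision such a daemon might make. Everything in it is an assumption for illustration: the function name, the idea of a single idle threshold in bits per second, and the choice of 10Mb and 1000Mb as the two endpoints. It only picks a target; actually changing the link would still be a job for ethtool.

```c
#include <stdint.h>

/* Assumed endpoints: the lowest and highest rates the link supports. */
enum { LINK_MIN_MBPS = 10, LINK_MAX_MBPS = 1000 };

/*
 * Hypothetical policy helper: given the throughput observed over some
 * sampling window and an idle threshold (both in bits per second),
 * return the link speed to request. Race-to-idle means we never pick
 * an intermediate rate.
 */
static inline unsigned int pick_link_speed(uint64_t observed_bps,
					   uint64_t idle_threshold_bps)
{
	return observed_bps > idle_threshold_bps ? LINK_MAX_MBPS
						 : LINK_MIN_MBPS;
}
```

The returned value would then be handed to something like `ethtool -s <dev> speed <N>`, with some hysteresis added so the link does not renegotiate on every short burst.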
Re: One process with multiple user ids.
Giuliano Gagliardi wrote:
> Hello,
>
> I have a server that has to switch to different user ids, but because it
> does other complex things, I would rather not have it run as root.

Well, it's probably going to have to *start* as root, or use something
like sudo. It's probably easiest to have it start as root and drop
privileges as soon as possible, certainly before handling any untrusted
data.

> I only need the server to be able to switch to certain pre-defined
> user ids.

This is a very easy special case. Just start a process for each user ID
and drop root privileges. They can communicate via sockets or even
shared memory. If you wanted to switch between arbitrary UIDs at
runtime, it might be worth doing something exotic, but it's really not
in this case. Also, if you do it this way, it's rather easy to verify
the correctness of your design, and you never have to touch kernel code.

> I have seen that two possible solutions have already been suggested
> here on the LKML, but it was some years ago, and nothing like it has
> been implemented.
>
> (1) Having supplementary user ids like there are supplementary group
>     ids and system calls getuids() and setuids() that work like
>     getgroups() and setgroups()

But you can already accomplish this with ACLs and SELinux. You're
trying to make this problem harder than it really is.

> (2) Allowing processes to pass user and group ids via sockets.

And do what with them? You can already pass arbitrary data via sockets.
It sounds like you need (1) to use (2).

> Both (1) and (2) would solve my problem. Now my question is whether
> there are any fundamental flaws with (1) or (2), or whether the right
> way to solve my problem is another one.

(1) doesn't accomplish anything you can't already do, but it would make
a huge mess of a lot of code.

(2) is silly. Sockets are for communicating between userspace
processes. If you want to be granting/revoking credentials, you should
be using system calls, and even then only if you absolutely must.
Having the kernel snoop traffic on sockets between processes would be
disastrous for performance, and without that, any process could claim
that it had been granted privileges over a socket and the kernel would
just have to trust it.

Don't overthink this. You don't need to touch the kernel at all to do
this. Just use a multi-process model, like qmail does, for example. You
can start with root privileges and drop them, or use sudo to help you
out. It's fast, secure, takes advantage of modern multi-core CPUs, and
is much simpler.

-- Chris
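The "start as root, then drop privileges" step in each per-user worker could look roughly like the sketch below. This is an illustration under assumptions, not code from any particular server: drop_privileges is a made-up helper name, and it assumes the caller has already forked one worker per pre-defined user ID and set up whatever sockets it needs beforehand.

```c
#define _DEFAULT_SOURCE	/* for setgroups() on glibc */
#include <grp.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: permanently switch the calling process to
 * uid/gid. Meant to be called in a freshly forked worker that still
 * has root privileges. Returns 0 on success, -1 on failure.
 */
static inline int drop_privileges(uid_t uid, gid_t gid)
{
	/* Drop supplementary groups first, while we are still allowed to. */
	if (setgroups(0, NULL) != 0)
		return -1;
	/* Group before user: after setuid() we could no longer change it. */
	if (setgid(gid) != 0)
		return -1;
	if (setuid(uid) != 0)
		return -1;
	/* Paranoia: confirm the switch actually took effect. */
	if (getuid() != uid || geteuid() != uid ||
	    getgid() != gid || getegid() != gid)
		return -1;
	return 0;
}
```

A worker that gets -1 back should exit rather than carry on, since it may still be running with more privilege than intended. When not started as root the helper simply fails, which matches the point above that the server has to start as root (or via sudo).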
Re: Bonnie++ with 1024k stripe SW/RAID5 causes kernel to goto D-state
Justin Piszcz wrote:
> Kernel: 2.6.23-rc8 (older kernels do this as well)
>
> When running the following command:
> /usr/bin/time /usr/sbin/bonnie++ -d /x/test -s 16384 -m p34 -n 16:10:16:64
>
> It hangs unless I increase various parameters md/raid such as the
> stripe_cache_size etc..
>
> # ps auxww | grep D
> USER  PID %CPU %MEM  VSZ  RSS TTY STAT START TIME COMMAND
> root  276  0.0  0.0    0    0 ?   D    12:14 0:00 [pdflush]
> root  277  0.0  0.0    0    0 ?   D    12:14 0:00 [pdflush]
> root 1639  0.0  0.0    0    0 ?   D<   12:14 0:00 [xfsbufd]
> root 1767  0.0  0.0 8100  420 ?   Ds   12:14 0:00
> root 2895  0.0  0.0 5916  632 ?   Ds   12:15 0:00 /sbin/syslogd -r
>
> See the bottom for more details.
>
> Is this normal? Does md only work without tuning up to a certain stripe
> size? I use a RAID 5 with 1024k stripe which works fine with many
> optimizations, but if I just boot the system and run bonnie++ on it
> without applying the optimizations, it will hang in d-state. When I run
> the optimizations, then it exits out of D-state, pretty weird?

Not at all. 1024k stripes are way outside the norm. If you do something
way outside the norm, and don't tune for it in advance, don't be
terribly surprised when something like bonnie++ brings your box to its
knees.

That's not to say we couldn't make md auto-tune itself more
intelligently, but this isn't really a bug. With a sufficiently huge
amount of RAM, you'd be able to dynamically allocate the buffers that
you're not pre-allocating with stripe_cache_size, but bonnie++ is
eating that up in this case.

-- Chris
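As a rough guide to what raising stripe_cache_size costs, the md documentation describes the stripe cache as consuming about page_size * nr_disks * stripe_cache_size bytes. The sketch below just evaluates that product; the function name is made up, and the disk count and cache size in any call are the caller's own numbers, not values taken from the report above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: approximate memory pinned by the raid5 stripe
 * cache, assuming the cost is page_size * nr_disks * stripe_cache_size
 * as described in the md documentation.
 */
static inline uint64_t stripe_cache_bytes(uint64_t page_size,
					  unsigned int nr_disks,
					  unsigned int stripe_cache_size)
{
	return page_size * nr_disks * stripe_cache_size;
}
```

Running the numbers for your own array before tuning shows how much RAM you are pre-allocating, and therefore taking away from whatever the benchmark would otherwise be free to consume.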
[PATCH RESEND] x86_64: make atomic64_t work like atomic_t
Regardless of the greater controversy about the semantics of atomic_t, I
think we can all agree that atomic_t and atomic64_t should have the same
semantics. This is presently not the case on x86_64, where the volatile
keyword was removed from the declaration of atomic_t, but it was not
removed from the declaration of atomic64_t. The following patch fixes
that inconsistency, without delving into anything more controversial.

From: Chris Snook <[EMAIL PROTECTED]>

The volatile keyword has already been removed from the declaration of
atomic_t on x86_64. For consistency, remove it from atomic64_t as well.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>

--- a/include/asm-x86_64/atomic.h 2007-07-08 19:32:17.0 -0400
+++ b/include/asm-x86_64/atomic.h 2007-09-13 11:30:51.0 -0400
@@ -206,7 +206,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i) { (i) }
 
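For context, dropping volatile from the member does not prevent a caller from forcing a fresh load where one is wanted. The sketch below is a standalone userspace illustration, not the kernel's definitions: the _sketch names are invented, and the volatile cast is just one common C idiom for a single forced read, shown here without taking a side on what atomic64_read() ought to guarantee.

```c
/* Hypothetical stand-in for the post-patch type: no volatile member. */
typedef struct { long counter; } atomic64_sketch_t;

#define ATOMIC64_SKETCH_INIT(i) { (i) }

/*
 * Read the counter through a volatile-qualified lvalue, so this one
 * access is performed as written even though the member itself is a
 * plain long.
 */
static inline long atomic64_sketch_read(const atomic64_sketch_t *v)
{
	return *(const volatile long *)&v->counter;
}
```

The initializer and the struct layout are the same with or without the qualifier; only the compiler's freedom to cache or combine ordinary accesses changes.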
Re: patch/option to wipe memory at boot?
David Madore wrote:
> On Mon, Sep 17, 2007 at 11:11:52AM -0700, Jeremy Fitzhardinge wrote:
>> Boot memtest86 for a little while before booting the kernel? And if
>> you haven't already run it for a while, then that would be your first
>> step anyway.
>
> Indeed, that does the trick, thanks for the suggestion. So I can be
> quite confident, now, that my RAM is sane and it's just that the BIOS
> doesn't initialize it properly. But I'd still like some way of filling
> the RAM when Linux starts (or perhaps in the bootloader), because
> letting memtest86 run after every cold reboot isn't a very satisfactory
> solution.

Bootloaders like to do things like run in 16-bit or 32-bit mode on boxes
where higher bitness is necessary to access all the memory. It may be
possible to do this in the bootloader, but the BIOS is clearly the
correct place to fix this problem.

-- Chris
Re: CPU usage for 10Gbps UDP transfers
Lukas Hejtmanek wrote:
> Hello,
>
> is it expected that application sending 8900bytes datagram through
> 10Gbps NIC utilizes CPU to 100% and similarly the receiver also
> utilizes CPU to 100%. Is it something wrong or this is quite OK? (The
> box is dual single core Opteron 2.4GHz with Myricom 10GE NIC.)

Every time a new generation of ethernet comes out, its peak throughput
exceeds the memory/CPU/IO capacity of commodity hardware available at
the time. This is normal. Of course, you may not be saturating the link,
and it may be possible to tune the driver to improve your throughput,
but you'll still be saturating a CPU on that hardware.

-- Chris
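A back-of-the-envelope figure helps show how much work that is per second. The sketch below computes how many datagrams of a given payload size fit into a given line rate. The function name is made up, and it deliberately ignores Ethernet, IP and UDP header overhead, so treat the result as an upper bound rather than a measurement.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper bound on datagrams per second for a
 * link of link_bps bits/s carrying payload_bytes per datagram,
 * ignoring all header and framing overhead.
 */
static inline uint64_t datagrams_per_second(uint64_t link_bps,
					    uint64_t payload_bytes)
{
	return link_bps / (payload_bytes * 8);
}
```

For a 10Gbps link and 8900-byte datagrams this comes out at roughly 140 thousand datagrams per second, each of which has to be copied and pushed through the stack on both sender and receiver.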
Re: irq load balancing
Venkat Subbiah wrote:
>> Since most network devices have a single status register for both
>> receiver and transmit (and errors and the like), which needs a lock to
>> protect access, you will likely end up with serious thrashing of moving
>> the lock between cpus.
>
> Any ways to measure the thrashing of locks?
>
>> Since most network devices have a single status register for both
>> receiver and transmit (and errors and the like)
>
> These register accesses will be mostly within the irq handler which I
> plan on keeping on the same processor. The network driver is actually
> tg3. Will look closely into the driver.

Why are you trying to do this, anyway? This is a classic example of
fairness hurting both performance and efficiency. Unbalanced
distribution of a single IRQ gives superior performance. There are
cases when this is a worthwhile tradeoff, but the network stack is not
one of them. In the HPC world, people generally want to squeeze maximum
performance out of CPU/cache/RAM so they just accept the imbalance
because it performs better than balancing it, and irqbalance can keep
things fair over longer intervals if that's important. In the realtime
world, people generally bind everything they can to one or two CPUs,
and bind their realtime applications to the remaining ones to minimize
contention.

Distributing your network interrupts in a round-robin fashion will make
your computer do exactly one thing faster: heat up the room.

-- Chris
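Binding an IRQ to a single CPU, as described above, is normally done by writing a hexadecimal CPU bitmask to /proc/irq/<irq>/smp_affinity. The sketch below only builds that mask string for one CPU; the function name is made up, it handles only CPU numbers that fit in one unsigned long, and it leaves the actual write (which needs root) to the caller.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the single-CPU affinity mask for cpu
 * into buf as lowercase hex without a 0x prefix. Returns the number
 * of characters snprintf() reports, or -1 if cpu does not fit in an
 * unsigned long.
 */
static inline int format_irq_affinity(unsigned int cpu, char *buf,
				      size_t len)
{
	if (cpu >= sizeof(unsigned long) * 8)
		return -1;
	return snprintf(buf, len, "%lx", 1UL << cpu);
}
```

For example, CPU 3 gives the string "8", which is what one would echo into the smp_affinity file for the NIC's interrupt to keep it on that processor.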
[PATCH] x86_64: make atomic64_t semantics consistent with atomic_t
From: Chris Snook <[EMAIL PROTECTED]>

The volatile keyword has already been removed from the declaration of atomic_t on x86_64. For consistency, remove it from atomic64_t as well.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

--- a/include/asm-x86_64/atomic.h 2007-07-08 19:32:17.0 -0400
+++ b/include/asm-x86_64/atomic.h 2007-09-13 11:30:51.0 -0400
@@ -206,7 +206,7 @@ static __inline__ int atomic_sub_return(
 /* An 64bit atomic type */
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 #define ATOMIC64_INIT(i) { (i) }
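To illustrate what dropping the qualifier changes, here is a minimal userspace sketch. It only shows the compiler-visible difference between a plain read and a read through a volatile-qualified pointer. The two accessor functions are hypothetical names invented for this example; they are not part of the patch or of the kernel's atomic API, and nothing here provides SMP atomicity.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the patched declaration: no volatile on the field. */
typedef struct { long counter; } atomic64_t;

#define ATOMIC64_INIT(i) { (i) }

/* Plain read: the compiler may reuse a value it already loaded,
 * for example by hoisting the load out of a loop. */
long atomic64_read_plain(const atomic64_t *v)
{
    assert(v != NULL);
    return v->counter;
}

/* Hypothetical accessor (assumption, not from the patch): the cast to a
 * volatile-qualified pointer forces a load from memory at this call site. */
long atomic64_read_fresh(const atomic64_t *v)
{
    assert(v != NULL);
    return *(const volatile long *)&v->counter;
}
```

With the qualifier on the type, every access pays for a forced load. With it on a cast, only the call sites that need a fresh value pay for one.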
Re: Lossy interrupts on x86_64
Jesse Barnes wrote:
> I just narrowed down a weird problem where I was losing more than 50% of my vblank interrupts to what seems to be the hires timers patch. Stock 2.6.23-rc5 works fine, but the latest (171) kernel from rawhide drops most of my interrupts unless I also have another interrupt source running (e.g. if I hold down a key or move the mouse I get the expected number of vblank interrupts, otherwise I get between 3 and 30 instead of the expected 60 per second).
>
> Any ideas? It seems like it might be bad APIC programming, but I haven't gone through those mods to look for suspects...

What happens if you boot with 'noapic' or 'pci=nomsi'? Please post dmesg as well so we can see how the kernel is initializing the relevant hardware.

-- Chris
Re: irq load balancing
Venkat Subbiah wrote:
> Most of the load in my system is triggered by a single ethernet IRQ. Essentially the IRQ schedules a tasklet, and most of the work is done in that tasklet. From what I read, it looks like the tasklet would be executed on the same CPU on which it was scheduled. So this means even in an SMP system it will be one processor which is overloaded.
>
> So will using the user space IRQ load balancer really help?

A little bit. It'll keep other IRQs on different CPUs, which will prevent other interrupts from causing cache and TLB evictions that could slow down the interrupt handler for the NIC.

> What I am doubtful about is that the user space load balancer comes along and changes the affinity once in a while. But really what I need is every interrupt to go to a different CPU in a round robin fashion.

Doing it in a round-robin fashion will be disastrous for performance. Your cache miss rate will go through the roof and you'll hit the slow paths in the network stack most of the time.

> Looks like the APIC can distribute IRQs dynamically? Is this supported in the kernel, and is there any config or proc interface to turn this on/off?

/proc/irq/$FOO/smp_affinity is a bitmask. You can mask an irq to multiple processors. Of course, this will absolutely kill your performance. That's why irqbalance never does this.

-- Chris
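As a rough illustration of the bitmask mentioned above, the sketch below builds a mask in which bit N stands for CPU N and formats it as the hexadecimal text that is written to the smp_affinity file. The helper names are made up for this example, and it assumes at most 32 CPUs so that a single 32-bit word is enough. It does not touch /proc itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Build a CPU bitmask: bit N set means CPU N may receive the IRQ.
 * Assumption: only CPUs 0..31 are handled; larger IDs are skipped. */
uint32_t affinity_mask(const unsigned *cpus, size_t count)
{
    uint32_t mask = 0;

    assert(cpus != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (cpus[i] < 32)
            mask |= (uint32_t)1 << cpus[i];
    }
    return mask;
}

/* Format the mask as lowercase hex text. Returns snprintf's result. */
int format_affinity(char *buf, size_t len, uint32_t mask)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%" PRIx32, mask);
}
```

A mask with a single bit set, such as 4 for CPU 2, pins the IRQ to one processor, which is the arrangement the reply favors. A mask such as 5 (CPUs 0 and 2) spreads it across processors, with the performance cost the reply warns about.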