Re: Real-Time Preemption and RCU
* Paul E. McKenney <[EMAIL PROTECTED]> wrote:

> Seems to me that it would be good to have an RCU implementation that
> meets the requirements of the Real-Time Preemption patch, but that is
> 100% compatible with the "classic RCU" API. Such an implementation
> must meet a number of requirements, which are listed at the end of
> this message (search for "REQUIREMENTS").

[ Wow. you must be a secret telepathic mind-reader - yesterday i was
  thinking about mailing you, because my approach to RCU preemptability
  (the API variants) clearly sucked and caused problems all around, both
  in terms of maintainability and in terms of stability and
  scalability. ]

> 5. The final version, which both scales and meets realtime
>    requirements, as well as exactly matching the "classic RCU"
>    API.
>
> I have tested this approach, but in user-level scaffolding. All of
> these implementations should therefore be regarded with great
> suspicion: untested, probably don't even compile. Besides which, I
> certainly can't claim to fully understand the real-time preempt patch,
> so I am bound to have gotten something wrong somewhere. In any case,
> none of these implementations are a suitable replacement for "classic
> RCU" on large servers, since they acquire locks in the RCU read-side
> critical sections. However, they should scale enough to support small
> SMP systems, inflicting only a modest performance penalty.

basically for PREEMPT_RT the only constraint is that RCU sections should
be preemptable. Whatever the performance cost. If PREEMPT_RT is merged
into the upstream kernel then it will (at least initially) be at a
status similar to NOMMU: it will be tolerated as long as it causes no
'drag' on the main code. The RCU API variants i introduced clearly
violated this requirement, and were my #1 worry wrt. upstream
mergeability.

> I believe that implementation #5 is most appropriate for real-time
> preempt kernels. [...]

yeah, agreed - it looks perfect - both the read and write side is
preemptable. Can i just plug the code you sent into rcupdate.c and
expect it to work, or would you like to send a patch? If you prefer you
can make it an unconditional patch against an upstream kernel to keep
things simple for you - i'll then massage it to be properly PREEMPT_RT
dependent.

> [...] In theory, #3 might be appropriate, but if I understand the
> real-time preempt implementation of reader-writer lock, it will not
> perform well if there are long RCU read-side critical sections, even
> in UP kernels.

all RCU-locked sections must be preemptable in -RT. Basically RCU is a
mainstream API that is used by lots of code and will be introduced in
many other areas as well. From the -RT kernel's POV this is an
'uncontrollable latency source', which keeps introducing critical
sections. One major goal of PREEMPT_RT is to convert all popular
critical section APIs into preemptible sections, so that the amount of
code that is non-preemptable is drastically reduced and can be managed
(and thus can be trusted). This goal has a higher priority than any
performance consideration, because it doesn't matter what performance
you have, if you cannot trust the kernel to be deterministic.

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 0/3] j_state_lock, j_list_lock, remove-bitlocks
On Thu, 17 Mar 2005, Lee Revell wrote:
>
> OK, no need to cc: me on this one any more. It's really low priority
> IMO compared to the big latencies I am seeing with ext3 and
> "data=ordered". Unless you think there is any relation.
>

IMO a deadlock is higher priority than a big latency :-)

I still believe that the locking in ext3 has something to do with your
latencies, but I'll take you off when I send something to Andrew or Ingo
next time. Hopefully, they'll do the same.

When this problem is solved on Ingo's side, maybe this will solve your
latency problem, so I recommend that you keep trying the latest RT
kernels.

BTW what test are you running that causes these latencies?

-- Steve
Re: ppc64 build broke between 2.6.11-bk6 and 2.6.11-bk7
--Andrew Morton <[EMAIL PROTECTED]> wrote (on Thursday, March 17, 2005 22:44:09 -0800):

> "Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
>>
>> drivers/built-in.o(.text+0x182bc): In function `.matroxfb_probe':
>> : undefined reference to `.mac_vmode_to_var'
>> make: *** [.tmp_vmlinux1] Error 1
>>
>> Anyone know what that is?
>>
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm4/broken-out/fbdev-kconfig-fix-for-macmodes-and-ppc.patch
>
> should fix it.
>

Thanks - will retest.

M.
Re: ppc64 build broke between 2.6.11-bk6 and 2.6.11-bk7
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
>
> drivers/built-in.o(.text+0x182bc): In function `.matroxfb_probe':
> : undefined reference to `.mac_vmode_to_var'
> make: *** [.tmp_vmlinux1] Error 1
>
> Anyone know what that is?
>

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm4/broken-out/fbdev-kconfig-fix-for-macmodes-and-ppc.patch

should fix it.
Re: BKCVS broken ?
On Thu, Mar 17, 2005 at 10:50:40PM -0700, Erik Andersen wrote:
> On Thu Mar 17, 2005 at 04:10:53PM -0800, Larry McVoy wrote:
> > I got swamped, I'll look at this after dinner. But you might take a look
> > at this: http://www.bitkeeper.com/press/2005-03-17.html which is a link
> > to a very simple open source BK client. It doesn't do much except track
> > the head of the tree but it does that well. It's slightly better than
> > that, it puts all the checkin comments in BK/ChangeLog so you don't have
> > to go over the wire to get those.
> >
> > It's intended for someone who just wants the latest and greatest snapshot,
> > knows how to do cp -rp and diff -Nur, it's pretty basic. It's not a
> > CVS gateway replacement but it does work for every tree on bkbits.net.
> > Just to be clear, we are not dropping the CVS gateway, this is "in
> > addition to" not "instead of".
>
> Thanks! Its nice to finally have an open source tool for sucking
> down the latest and greatest directly from bk. Thus far the tool
> is working perfectly at fetching source trees and at updating
> them when new patches are applied.

Great. It _should_ just work, I tested it with patches that included
binaries which changed, it handles that. I suspect we'll find some case
which doesn't work some day (symlinks can't be represented in a patch
for example) but you can always reget things from scratch, that will
work for contents, permissions, symlinks, the works.

> One minor nit. The name for the 'update' tool is a bit too
> generic...

Hey, it's open source, I'm hoping that people will take that code and
evolve it to do whatever they need. We're willing to do what we can on
this end if people need protocol changes to support new features, time
permitting.

Think of that code as a prototype. It's really simple, you can hack it
trivially. If you want us to distribute your changes then send a patch,
if not that's cool too. You can take that and evolve it to your heart's
content.

If you need a different license to start hacking let me know what you
want, I really don't care, you can have that code as public domain if
you like.
--
---
Larry McVoy    lm at bitmover.com    http://www.bitkeeper.com
ppc64 build broke between 2.6.11-bk6 and 2.6.11-bk7
drivers/built-in.o(.text+0x182bc): In function `.matroxfb_probe':
: undefined reference to `.mac_vmode_to_var'
make: *** [.tmp_vmlinux1] Error 1

Anyone know what that is?

M.
[BKPATCH] ACPI for 2.6.12-rc1
Hi Linus, please do a

	bk pull bk://linux-acpi.bkbits.net/to-linus

This includes the ACPI part of memory hotplug, plus various fixes, BIOS
workarounds and a fix for an interpreter regression we had in 2.6.11 vs
2.6.10. All changes here have been through Andrew's mm tree.

thanks,
-Len

ps. a plain patch is also available here:
ftp://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.11/acpi-20050309-2.6.11.diff.gz

This will update the following files:

 arch/i386/kernel/acpi/sleep.c       |    3
 arch/ia64/kernel/acpi.c             |    2
 drivers/acpi/Kconfig                |   20
 drivers/acpi/Makefile               |    1
 drivers/acpi/ac.c                   |   18
 drivers/acpi/acpi_memhotplug.c      |  542
 drivers/acpi/battery.c              |    2
 drivers/acpi/button.c               |    4
 drivers/acpi/container.c            |   15
 drivers/acpi/debug.c                |    4
 drivers/acpi/dispatcher/dsmethod.c  |   11
 drivers/acpi/dispatcher/dsopcode.c  |    8
 drivers/acpi/dispatcher/dsutils.c   |  166 +--
 drivers/acpi/dispatcher/dswexec.c   |   61 ++
 drivers/acpi/ec.c                   |    2
 drivers/acpi/events/evxface.c       |    4
 drivers/acpi/executer/exmisc.c      |    5
 drivers/acpi/executer/exoparg2.c    |    6
 drivers/acpi/executer/exresolv.c    |    6
 drivers/acpi/executer/exstoren.c    |    7
 drivers/acpi/executer/exstorob.c    |   27 -
 drivers/acpi/fan.c                  |   33 -
 drivers/acpi/ibm_acpi.c             |    4
 drivers/acpi/numa.c                 |    2
 drivers/acpi/osl.c                  |   10
 drivers/acpi/parser/psopcode.c      |    2
 drivers/acpi/parser/psparse.c       |   42 +
 drivers/acpi/parser/pswalk.c        |  254 +--
 drivers/acpi/pci_irq.c              |   38 +
 drivers/acpi/pci_link.c             |   14
 drivers/acpi/pci_root.c             |    4
 drivers/acpi/power.c                |   10
 drivers/acpi/processor_core.c       |    6
 drivers/acpi/processor_thermal.c    |    2
 drivers/acpi/processor_throttling.c |    2
 drivers/acpi/resources/rsaddr.c     |  146 +++--
 drivers/acpi/resources/rscalc.c     |   14
 drivers/acpi/resources/rsdump.c     |   23 -
 drivers/acpi/resources/rslist.c     |    1
 drivers/acpi/scan.c                 |   47 +-
 drivers/acpi/thermal.c              |    2
 drivers/acpi/toshiba_acpi.c         |    2
 drivers/acpi/utilities/utcopy.c     |   19
 drivers/acpi/utilities/utdelete.c   |   18
 drivers/acpi/utilities/utglobal.c   |   10
 drivers/acpi/utilities/utmisc.c     |   44 +
 drivers/acpi/video.c                |    2
 drivers/pnp/pnpacpi/rsparser.c      |    9
 include/acpi/acconfig.h             |    4
 include/acpi/acdisasm.h             |    5
 include/acpi/acdispat.h             |   10
 include/acpi/acinterp.h             |    1
 include/acpi/aclocal.h              |    4
 include/acpi/acpi_bus.h             |    1
 include/acpi/acpi_drivers.h         |    3
 include/acpi/acstruct.h             |    1
 include/acpi/actbl.h                |    4
 include/acpi/actbl2.h               |   79 +++
 include/acpi/actypes.h              |   33 -
 include/acpi/platform/acenv.h       |    2
 include/acpi/processor.h            |    2
 include/linux/acpi.h                |    2
 62 files changed, 1301 insertions(+), 524 deletions(-)

through these ChangeSets:

<[EMAIL PROTECTED]> (05/03/17 1.2213)
   [ACPI] build fix in acpi_pci_irq_disable()
   bk-acpi-acpi_pci_irq_disable-build-fix.patch

   Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
   Signed-off-by: Len Brown <[EMAIL PROTECTED]>

<[EMAIL PROTECTED]> (05/03/09 1.1938.505.27)
   [ACPI] ACPICA 20050309 from Bob Moore

   The string-to-buffer implicit conversion code has been modified
   again after a change to the ACPI specification. In order to
   match the behavior of the other major ACPI implementation, the
   target buffer is no longer truncated if the source string is
   smaller than an existing target buffer. This change requires an
   update to the ACPI spec, and should eliminate the recent
   AE_AML_BUFFER_LIMIT issues.

   The "implicit return" support was rewritten to a new algorithm
   that solves the general case. Rather than attempt to determine
   when a method is about to exit, the result of every ASL operator
   is saved momentarily until the very next ASL operator is
   executed. Therefore, no matter how the method exits, there will
   always be a saved implicit return value. This feature is only
   enabled with the acpi_gbl_enable_interpreter_slack flag which
   Linux enables unless "acpi=strict". This should eliminate
   AE_AML_NO_RETURN_VALUE errors.

   Implemented implicit conversion support for the predicate
   (operand) of the If, Else, and While operators. String and
   Buffer arguments are automatically converted to Integers.

   Changed the string-to-integer
Re: [PATCH] Automatically append a semi-random version for BK users
Sam, this version includes the CVS portion.

Automatically append a semi-random version if the tree we're building
isn't tagged in BitKeeper or CVS and CONFIG_LOCALVERSION_AUTO is set.

This fixes the case when Linus (or someone else) does a release and tags
it, someone else does a build of that release tree (i.e, 2.6.11), and
installs it. Later, before another release occurs (i.e, -rc1), another
build happens, and the actual, released 2.6.11 is overwritten with the
-current tree.

This currently supports BitKeeper and CVS (assuming the format is the
same as the BK->CVS tree)

Signed-Off-By: Ryan Anderson <[EMAIL PROTECTED]>

Index: local-quilt/Makefile
===================================================================
--- local-quilt.orig/Makefile	2005-03-14 20:53:59.0 -0500
+++ local-quilt/Makefile	2005-03-14 20:54:02.0 -0500
@@ -549,6 +549,26 @@ export KBUILD_IMAGE ?= vmlinux
 # images. Default is /boot, but you can set it to other values
 export INSTALL_PATH ?= /boot
 
+# If CONFIG_LOCALVERSION_AUTO is set, we automatically perform some tests
+# and try to determine if the current source tree is a release tree, of any sort,
+# or if is a pure development tree.
+#
+# A 'release tree' is any tree with a BitKeeper, or other SCM, TAG associated
+# with it. The primary goal of this is to make it safe for a native
+# BitKeeper/CVS/SVN user to build a release tree (i.e, 2.6.9) and also to
+# continue developing against the current Linus tree, without having the Linus
+# tree overwrite the 2.6.9 tree when installed.
+#
+# Currently, only BitKeeper is supported.
+# Other SCMs can edit scripts/setlocalversion and add the appropriate
+# checks as needed.
+
+
+ifdef CONFIG_LOCALVERSION_AUTO
+	localversion-auto := $(shell $(PERL) $(srctree)/scripts/setlocalversion $(srctree))
+	LOCALVERSION := $(LOCALVERSION)$(localversion-auto)
+endif
+
 #
 # INSTALL_MOD_PATH specifies a prefix to MODLIB for module directory
 # relocations required by build roots. This is not defined in the
Index: local-quilt/init/Kconfig
===================================================================
--- local-quilt.orig/init/Kconfig	2005-03-14 20:53:59.0 -0500
+++ local-quilt/init/Kconfig	2005-03-17 23:49:44.0 -0500
@@ -69,6 +69,24 @@ config LOCALVERSION
 	  object and source tree, in that order. Your total string can
 	  be a maximum of 64 characters.
 
+config LOCALVERSION_AUTO
+	bool "Automatically append version information to the version string"
+	default y
+	help
+	  This will try to automatically determine if the current tree is a
+	  release tree by looking for BitKeeper or CVS tags that
+	  belong to the current top of tree revision.
+
+	  A string of the format -BK will be added to the localversion
+	  if a BitKeeper based tree is found. The string -cvs-$version will be
+	  added to the localversion if a CVS tree based on the BK->CVS tree is
+	  found. The string generated by this will be appended after any
+	  matching localversion* files, and after the value set in
+	  CONFIG_LOCALVERSION
+
+	  Note: This requires Perl and the Digest::MD5 module, as well
+	  as BitKeeper and/or CVS.
+
 config SWAP
 	bool "Support for paging of anonymous memory (swap)"
 	depends on MMU
Index: local-quilt/scripts/setlocalversion
===================================================================
--- /dev/null	1970-01-01 00:00:00.0 +
+++ local-quilt/scripts/setlocalversion	2005-03-17 23:02:02.0 -0500
@@ -0,0 +1,120 @@
+#!/usr/bin/perl
+# Copyright 2004 - Ryan Anderson <[EMAIL PROTECTED]> GPL v2
+
+use strict;
+use warnings;
+use Digest::MD5;
+require 5.006;
+
+if (@ARGV != 1) {
+	print <
+EOT
+	exit(1);
+}
+
+my ($srctree) = @ARGV;
+
+my @LOCALVERSIONS = ();
+
+# BitKeeper Version Checks
+
+# We are going to use the following commands to try and determine if this
+# repository is at a Version boundary (i.e, 2.6.10 vs 2.6.10 + some patches) We
+# currently assume that all meaningful version boundaries are marked by a tag.
+# We don't care what the tag is, just that something exists.
+#
+# The process is as follows:
+#
+# 1. Get the key of the top of tree changeset:
+#	cset=`bk changes -r+ -k`
+#    This will be something like:
+#    [EMAIL PROTECTED]|ChangeSet|20050314010036|43252
+#
+# 2. Get the tag, if any, associated with it:
+#	bk prs -h -d':TAG:\n' -r$cset
+#
+# 3. If no such tag exists, take the hex-encoded md5sum of the
+#    changeset key, extract the first 8 characters of it, and add
+#    -BK and the above 8 characters to the end of the version.
+
+sub do_bk_checks {
+	chdir($srctree);
+	my $changeset = `bk changes -r+ -k`;
+	chomp $changeset; # strip trailing \n safely
+	my $tag = `bk prs -h -d':TAG:' -r'$changeset'`;
+
+	if (length($tag) == 0) {
+		# There is no tag
Re: BKCVS broken ?
On Thu Mar 17, 2005 at 04:10:53PM -0800, Larry McVoy wrote:
> I got swamped, I'll look at this after dinner. But you might take a look
> at this: http://www.bitkeeper.com/press/2005-03-17.html which is a link
> to a very simple open source BK client. It doesn't do much except track
> the head of the tree but it does that well. It's slightly better than
> that, it puts all the checkin comments in BK/ChangeLog so you don't have
> to go over the wire to get those.
>
> It's intended for someone who just wants the latest and greatest snapshot,
> knows how to do cp -rp and diff -Nur, it's pretty basic. It's not a
> CVS gateway replacement but it does work for every tree on bkbits.net.
> Just to be clear, we are not dropping the CVS gateway, this is "in
> addition to" not "instead of".

Thanks! It's nice to finally have an open source tool for sucking
down the latest and greatest directly from bk. Thus far the tool
is working perfectly at fetching source trees and at updating
them when new patches are applied.

One minor nit. The name for the 'update' tool is a bit too
generic... For example old (old) linux systems have an
/sbin/update util for flushing buffers, and I have plenty of
'update' scripts lying around doing odd jobs. Perhaps a rename
to 'sfioup' might be a good idea, as that is sufficiently obscure
that there is little chance of a naming collision.

 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--
Re: [PATCH] Prezeroing V8 + free_hot_zeroed_page + free_cold_zeroed page
On Thu, 17 Mar 2005 18:09:11 -0800 (PST), Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Thu, 17 Mar 2005, Jason Uhlenkott wrote:
>
> > On Thu, Mar 17, 2005 at 05:36:50PM -0800, Christoph Lameter wrote:
> > > +	while (avenrun[0] >= ((unsigned long)sysctl_scrub_load << FSHIFT)) {
> > > +		set_current_state(TASK_UNINTERRUPTIBLE);
> > > +		schedule_timeout(30*HZ);
> > > +	}
> >
> > This should probably be TASK_INTERRUPTIBLE. It'll never actually get
> > interrupted either way since kernel threads block all signals, but
> > sleeping uninterruptibly contributes to the load average.
>
> Correct. I just do not seem to be able to get this right.

I think msleep_interruptible(3) would be your best choice, then.
Maybe with a comment that you don't actually expect signals, but are
using TASK_INTERRUPTIBLE to avoid contributing to load average (that
way, if the loadavg calculation changes someday, somebody will be able
to change your sleep over appropriately).

Thanks,
Nish
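For anyone following the arithmetic in the quoted loop, here is a rough user-space sketch (not the patch itself). avenrun[] holds fixed-point values with FSHIFT fractional bits, so an integer threshold has to be shifted left before the comparison. The helper names load_at_or_above() and ticks_to_ms() are made up for illustration. The second helper only shows that a 30*HZ timeout is 30 seconds whatever HZ is, which would be 30000 when handed to an interface that takes milliseconds, as msleep_interruptible() does.

```c
#include <stdbool.h>

#define FSHIFT   11                /* fractional bits, same value the kernel uses */
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in this fixed-point format */

/*
 * Hypothetical helper: true when a fixed-point load average (in the
 * format of avenrun[0]) is at or above an integer threshold.
 */
static bool load_at_or_above(unsigned long load_fixed, unsigned long threshold)
{
	return load_fixed >= (threshold << FSHIFT);
}

/*
 * Hypothetical helper: convert a delay in timer ticks to milliseconds
 * for a given tick rate (hz is an assumption passed in by the caller).
 */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
	return ticks * 1000UL / hz;
}
```

With these, load_at_or_above(2 * FIXED_1, 2) is true while one fixed-point step below that is false, and ticks_to_ms(30 * hz, hz) gives 30000 for any tick rate.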
Network Driver Name (ATM Driver)
Hi

I have a network driver (Ethernet). When I install the driver with
insmod, it loads successfully, and with ifconfig -a I can see the
Ethernet interface name as eth0 or eth1 etc. (the net_device structure
has a name field from which the interface name is taken).

My ATM driver also loads without any problem, but ifconfig -a doesn't
show anything for it the way it does for the Ethernet driver. (I suppose
it should show something like atm0, atm1, ... am I correct?)

How can I get the interface name to show up in "ifconfig -a" for a
PPPoATM driver? Which function needs to be included in the code to get
the same result?

Kindly reply back to this mail ID and [EMAIL PROTECTED]

Thanks in Advance
Subbu
Re: [patch 12/12] scripts/mod/sumversion.c: replace strtok() with strsep()
On Fri, Mar 18, 2005 at 03:46:20AM +0100, Nicolas Kaiser wrote:
> * Sam Ravnborg <[EMAIL PROTECTED]>:
>
> > On Sat, Mar 05, 2005 at 04:35:45PM +0100, [EMAIL PROTECTED] wrote:
> > >
> > > Replaces strtok() with strsep()
> >
> > Why - does it increase portability?
>
> "strtok() is not thread and SMP safe and strsep() should be
> used instead"
>
> http://janitor.kernelnewbies.org/docs/driver-howto.html#3.3.1

It does not matter in this particular file.
But applied for consistency (so it does not show up if you grep for it).

	Sam
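As background to the quoted janitor note, a small user-space sketch of the difference, assuming a BSD or glibc libc (strsep() is an extension there, not ISO C). strtok() keeps its position in hidden static state, while strsep() keeps it in a pointer the caller owns. strsep() also returns an empty token between adjacent delimiters, which strtok() skips. split_fields() is a made-up helper name, not anything from sumversion.c.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare strsep() */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split buf in place on any character in delims and
 * store up to max token pointers in out[]. Returns the number of tokens
 * stored, counting empty ones. All parsing state lives in the local
 * 'cursor' variable, so two callers cannot disturb each other.
 */
static size_t split_fields(char *buf, const char *delims,
			   char **out, size_t max)
{
	size_t n = 0;
	char *cursor = buf;
	char *tok;

	while (n < max && (tok = strsep(&cursor, delims)) != NULL)
		out[n++] = tok;

	return n;
}
```

Called on a writable copy of "a,,b" with "," as the delimiter, this stores three tokens, the middle one empty, where a strtok() loop would yield two.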
Re: 32Bit vs 64Bit
regatta <[EMAIL PROTECTED]> wrote:

> My question because We ran a 32 Bit application in Sun AMD64 Opteron
> with 1GB connection (Kernel 2.4 x86_64 with 8 Gb memory & 2 CPUs) and
> we had trouble time with it because the user tried to put the
> application processing data in a nas box (in the network) and that
> made the machine to use more than 60% of the NAS CPU and no one else
> was able to access the NAS

Does the application happen to frequently access the data in small
chunks randomly scattered across the file(s)?
Re: [PATCH 2/2] Thinkpad Suspend Powersave: Add D2 power saving code for Thinkpads with Radeon video chipsets
On Thu, 2005-03-17 at 22:39 -0500, Theodore Ts'o wrote:
> On Thu, Mar 17, 2005 at 10:19:04AM +1100, Benjamin Herrenschmidt wrote:
> > You probably want to remove the bit that does
> >
> > OUTREG(TV_DAC_CNTL, INREG(TV_DAC_CNTL) | 0x0700);
> >
> > Or you'll lose TV output :)
>
> I'm not using TV output, and the original patch stated:
>
> > > + /* Power down TV DAC, that saves a significant amount of power,
> > > +  * we'll have something better once we actually have some TVOut
> > > +  * support
> > > +  */

Yup, I know, I wrote this bit :)

> I suppose I should re-enable the TV DAC and see how much power it
> actually consumes if I enable it. It would seem to me that we should
> have a way that we can power down whatever parts of the video chipset
> that we're not using. (For example if I don't have anything connected
> to the VGA output, it would be good if we could power that down too...)

We can power down the internal DAC too, yes, and the TMDS transmitter
when no DVI is plugged, etc.. and we can also lower the chip clock :) I
do intend to do these things. The problem right now is that the above
will break some users who have a BIOS that can set TV-Out. Maybe some
sysfs attribute ? At least until I can properly probe all ports
including the TV Out (I'm working on that).

Ultimately, the driver should be able to properly detect everything that
is connected.

Ben.
Re: [PATCH] Xen/i386 cleanups - AGP bus/phys cleanups
On Fri, 18 Mar 2005, Paul Mackerras wrote:

> However, the idea of having phys_to_agp/agp_to_phys (or
> virt_to_agp/agp_to_virt) sounds like it wouldn't be too much effort, if
> it would help Xen.

It would be absolutely trivial. On most architectures you
would have:

#define virt_to_agp virt_to_phys
#define agp_to_virt phys_to_virt

On Xen you would have:

#define virt_to_agp virt_to_bus
#define agp_to_virt bus_to_virt

Or, more likely, defined to arbitrary_machine_to_phys or
whatever it was called ;)

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
Re: [ANNOUNCE][PATCH] drivers/scsi/megaraid/megaraid_{mm,mbox}
"Ju, Seokmann" <[EMAIL PROTECTED]> wrote:
>
> Here, I'm sending another patch that has
> fix for this issue.

It is still wordwrapped.

Please fix your email client, email the patch to yourself, ensure that
the result still applies, then resend it with a full description. Thanks.
[PATCH 4/6] kobject/hotplug split - devices core
kobject_add() and kobject_del() don't emit hotplug events anymore. Do it
ourselves if we are finished populating the device directory.

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

--- 1.91/drivers/base/core.c	2004-11-12 13:16:42 +01:00
+++ edited/drivers/base/core.c	2005-03-18 02:17:17 +01:00
@@ -260,6 +260,8 @@ int device_add(struct device *dev)
 	/* notify platform of device entry */
 	if (platform_notify)
 		platform_notify(dev);
+
+	kobject_hotplug(&dev->kobj, KOBJ_ADD);
  Done:
 	put_device(dev);
 	return error;
@@ -349,6 +351,7 @@ void device_del(struct device * dev)
 		platform_notify_remove(dev);
 	bus_remove_device(dev);
 	device_pm_remove(dev);
+	kobject_hotplug(&dev->kobj, KOBJ_REMOVE);
 	kobject_del(&dev->kobj);
 	if (parent)
 		put_device(parent);
[PATCH 3/6] kobject/hotplug split - block core
kobject_add() and kobject_del() don't emit hotplug events anymore. Do it
ourselves if we are finished populating the device directory.

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

===== fs/partitions/check.c 1.129 vs edited =====
--- 1.129/fs/partitions/check.c	2005-01-31 07:33:40 +01:00
+++ edited/fs/partitions/check.c	2005-03-18 02:17:18 +01:00
@@ -337,6 +337,7 @@ void register_disk(struct gendisk *disk)
 	if ((err = kobject_add(&disk->kobj)))
 		return;
 	disk_sysfs_symlinks(disk);
+	kobject_hotplug(&disk->kobj, KOBJ_ADD);
 
 	/* No minors to use for partitions */
 	if (disk->minors == 1) {
@@ -441,5 +442,6 @@ void del_gendisk(struct gendisk *disk)
 		sysfs_remove_link(&disk->driverfs_dev->kobj, "block");
 		put_device(disk->driverfs_dev);
 	}
+	kobject_hotplug(&disk->kobj, KOBJ_REMOVE);
 	kobject_del(&disk->kobj);
 }
RE: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC] PCIErrorRecovery)
Nguyen, Tom L writes:

> We decided to implement PCI Express error handling based on the PCI
> Express specification in a platform independent manner. This allows any
> platform that implements PCI Express AER per the PCI SIG specification
> to take advantage of the advanced features, much like SHPC hot-plug or
> PCI Express hot-plug implementations.

Does the PCI Express AER specification define an API for drivers?

> For PCI Express the endpoint device driver can take recovery action on
> its own, depending on the nature of the error so long as it does not
> affect the upstream device. This can include endpoint device resets.

Likewise, with EEH the device driver could take recovery action on its
own. But we don't want to end up with multiple sets of recovery code
in drivers, if possible. Also we want the recovery code to be as
simple as possible, otherwise driver authors will get it wrong.

> To support the AER driver calling an upstream device to initiate a reset
> of the link we need a specific callback since the driver doing the reset
> is not the driver who got the error. In the case of general PCI this

I would see the AER driver as being included in the "platform" code.
The AER driver would be closely involved in the recovery process.

What is the state of a link during the time between when an error is
detected and when a link reset is done? Is the link usable? What
happens if you try to do a MMIO read from a device downstream of the
link?

Regards,
Paul.
[PATCH 5/6] kobject/hotplug split - usb cris
kobject_add() and kobject_del() don't emit hotplug events anymore.
We need to do it ourselves now.

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

===== drivers/usb/host/hc_crisv10.c 1.7 vs edited =====
--- 1.7/drivers/usb/host/hc_crisv10.c	2004-12-21 02:15:10 +01:00
+++ edited/drivers/usb/host/hc_crisv10.c	2005-03-18 02:17:17 +01:00
@@ -4396,6 +4396,7 @@ static int __init etrax_usb_hc_init(void
 device_initialize(_device);
 kobject_set_name(_device.kobj, "etrax_usb");
 kobject_add(_device.kobj);
+kobject_hotplug(_device.kobj, KOBJ_ADD);
 
 hc->bus->controller = _device;
 usb_register_bus(hc->bus);
[PATCH 2/6] kobject/hotplug split - class core
kobject_add() and kobject_del() don't emit hotplug events anymore. Do it
ourselves if we are finished populating the device directory.

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

===== drivers/base/class.c 1.61 vs edited =====
--- 1.61/drivers/base/class.c	2005-03-15 17:52:00 +01:00
+++ edited/drivers/base/class.c	2005-03-18 02:17:17 +01:00
@@ -491,6 +491,7 @@ int class_device_add(struct class_device
 		up(&parent->sem);
 	}
 
+	kobject_hotplug(&class_dev->kobj, KOBJ_ADD);
  register_done:
 	if (error && parent)
 		class_put(parent);
@@ -562,6 +563,7 @@ void class_device_del(struct class_devic
 	}
 
 	class_device_remove_attrs(class_dev);
+	kobject_hotplug(&class_dev->kobj, KOBJ_REMOVE);
 	kobject_del(&class_dev->kobj);
 
 	if (parent)
[PATCH 6/6] kobject/hotplug split - net bridge
kobject_add() and kobject_del() don't emit hotplug events anymore. We need to do it ourselves now.

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

= net/bridge/br_sysfs_if.c 1.2 vs edited =
--- 1.2/net/bridge/br_sysfs_if.c	2004-06-18 22:15:34 +02:00
+++ edited/net/bridge/br_sysfs_if.c	2005-03-18 02:17:18 +01:00
@@ -248,6 +248,7 @@ int br_sysfs_addif(struct net_bridge_por
 	if (err)
 		goto out2;

+	kobject_hotplug(&p->kobj, KOBJ_ADD);
 	return 0;
  out2:
 	kobject_del(&p->kobj);
@@ -259,6 +260,7 @@ void br_sysfs_removeif(struct net_bridge
 {
 	pr_debug("br_sysfs_removeif\n");
 	sysfs_remove_link(&p->br->ifobj, p->dev->name);
+	kobject_hotplug(&p->kobj, KOBJ_REMOVE);
 	kobject_del(&p->kobj);
 }
[PATCH 1/6] kobject/hotplug split - kobject add/remove
kobject_add() and kobject_del() don't emit hotplug events anymore. The user should do it itself if it has finished populating the device directory.

Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

= lib/kobject.c 1.58 vs edited =
--- 1.58/lib/kobject.c	2005-03-09 18:04:09 +01:00
+++ edited/lib/kobject.c	2005-03-18 02:17:18 +01:00
@@ -184,8 +184,6 @@ int kobject_add(struct kobject * kobj)
 		unlink(kobj);
 		if (parent)
 			kobject_put(parent);
-	} else {
-		kobject_hotplug(kobj, KOBJ_ADD);
 	}

 	return error;
@@ -207,7 +205,8 @@ int kobject_register(struct kobject * ko
 			printk("kobject_register failed for %s (%d)\n",
 			       kobject_name(kobj),error);
 			dump_stack();
-		}
+		} else
+			kobject_hotplug(kobj, KOBJ_ADD);
 	} else
 		error = -EINVAL;
 	return error;
@@ -301,7 +300,6 @@ int kobject_rename(struct kobject * kobj

 void kobject_del(struct kobject * kobj)
 {
-	kobject_hotplug(kobj, KOBJ_REMOVE);
 	sysfs_remove_dir(kobj);
 	unlink(kobj);
 }
@@ -314,6 +312,7 @@ void kobject_del(struct kobject * kobj)
 void kobject_unregister(struct kobject * kobj)
 {
 	pr_debug("kobject %s: unregistering\n",kobject_name(kobj));
+	kobject_hotplug(kobj, KOBJ_REMOVE);
 	kobject_del(kobj);
 	kobject_put(kobj);
 }
[PATCH 0/6] kobject/hotplug split
This splits the implicit generation of hotplug events from kobject_add() and kobject_del(), to give the driver core control over when the event is created. kobject_register() and kobject_unregister() keep their current behavior and still emit the events themselves.

The class, block and device cores are now changed to emit the hotplug event _after_ the "dev" file, the "device" symlink and the default attributes are created. This saves udev from spinning in a stat() loop while it waits for the files to appear, which is expensive when there are many concurrent events.
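To make the ordering concrete, here is a minimal userspace sketch of the "add, populate, then announce" sequence described above. It is not kernel code: every `mock_` name is a hypothetical stand-in, and the two "files" simply model the "dev" file and the "device" symlink.

```c
/* Hypothetical stand-in for a kobject's sysfs directory. */
struct mock_kobj {
	int nfiles;          /* files created in the directory so far */
	int nfiles_at_event; /* files visible when the event fired, -1 = no event yet */
};

/* Models kobject_add() after the split: creates the directory, emits nothing. */
static void mock_kobject_add(struct mock_kobj *k)
{
	k->nfiles = 0;
	k->nfiles_at_event = -1;
}

/* Models creating one attribute file or symlink in the directory. */
static void mock_create_file(struct mock_kobj *k)
{
	k->nfiles++;
}

/* Models kobject_hotplug(): records what a listener such as udev would see. */
static void mock_kobject_hotplug(struct mock_kobj *k)
{
	k->nfiles_at_event = k->nfiles;
}

/* The caller now controls when the event fires: only after populating. */
static int mock_class_device_add(struct mock_kobj *k)
{
	mock_kobject_add(k);
	mock_create_file(k);     /* "dev" file */
	mock_create_file(k);     /* "device" symlink */
	mock_kobject_hotplug(k); /* event is emitted last */
	return k->nfiles_at_event;
}
```

With this ordering the listener finds both entries already present when the event arrives, so it has no reason to poll with stat().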
RE: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC] PCIErrorRecovery)
Nguyen, Tom L writes: > Is EEH a PCI-SIG specification? Is EEH specs available in public? No and no (not yet anyway). > It seems that a PCI-PCI bridge per slot is hardware implementation > specific. The fact that the PCI-PCI Bridge can isolate the slot is > hardware feature specific. Well, it's a common feature across all current IBM PPC64 machines. > PCI Express AER driver uses similar concept of determining whether the > driver is AER-aware or not except that PCI Express AER is independent > from firmware support. Don't worry about the firmware; the driver won't have to interact with firmware itself, that's the job of the ppc64-specific platform code. > Where does the platform code reside and where does it log the error? By platform code I meant the code under the arch directory that knows the details of the I/O topology of the machine, how to access the PCI host bridges, etc. How and where it logs the error is a platform policy; on IBM ppc64 machines we have an error log daemon for this purpose, which can do things like log the error to a file or send it to another machine. > In PCI Express if the driver is not AER-aware the fatal error message is > reported by its upstream switch, the AER driver obtains comprehensive > error information from the upstream switch (like EEH platform code > obtains error information from the firmware). Since the driver is not > AER-aware, the fatal error is reported to user to make a policy decision > since the PCI Express does not have a hot-plug event for the slot like > EEH platform. If there is a permanent failure of an upstream link, then maybe generating unplug events for the devices below it would be a useful thing to do. > So it looks like the hot-plug capability of the driver is being used in > lieu of specific callbacks to freeze and thaw IO in the case of a > non-aware driver. If the driver does not support hot-plug then the > error is just logged. Do you leave the slot isolated or perform error > recovery anyway? 
The choice is really to leave the slot isolated or to panic the system. Leaving the slot isolated risks having the driver loop in an interrupt routine or deliver bad data to userspace, so we currently panic the system. > On a fatal error the interface is down. No matter what the driver Which interface do you mean here? > supports (AER aware, EEH aware, unaware) all IO is likely to fail. > Resetting a bus in a point-to-point environment like PCI Express or EEH > (as you describe) should have little adverse effect. The risk is the > bus reset will cause a card reset and the driver must understand to > re-initialize the card. A link reset in PCI Express will not cause a > card reset. We assume the driver will reset its card if necessary. How will the driver reset its card? > In PCI Express the AER driver obtains fatal error information from the > upstream switch driver. We can use the same API with message = > PCIERR_ERROR_RECOVER to notify the endpoint driver, which is maybe > unaware of the fatal error reported by its upstream device. Mostly the > driver will respond with PCIERR_RESULT_NEED_RESET. Sounds fine. Paul. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Why no bigphysarea in mainline?
On Fri, 18 Mar 2005 01:35, Dave Hansen wrote:
> Doing mem= for drivers isn't just a hack, it's *WRONG*. It's a ticking
> time bomb that magically happens to work on some systems. It will not
> work consistently on a discontiguous memory system, or a memory hotplug
> system.

I couldn't agree more. Problem is I've been asked to change the way mem=X works on PPC64 so that this hack will work, which is a horrible thought.

> Could you give some examples of drivers which are in the kernel that
> could benefit from this patch? We don't tend to put things like this
> in, unless they have actual users. We don't tend to change code for
> out-of-tree users, either.

No I can't. I've been approached by several "vendors" asking about using mem=X hacks on PPC64, however I doubt any of them have code in-tree. I'll check though.

cheers
Re: [PATCH 2/2] Thinkpad Suspend Powersave: Add D2 power saving code for Thinkpads with Radeon video chipsets
On Thu, Mar 17, 2005 at 10:19:04AM +1100, Benjamin Herrenschmidt wrote:
> You probably want to remove the bit that does
>
> 	OUTREG(TV_DAC_CNTL, INREG(TV_DAC_CNTL) | 0x0700);
>
> Or you'll lose TV output :)

I'm not using TV output, and the original patch stated:

> > +	/* Power down TV DAC, that saves a significant amount of power,
> > +	 * we'll have something better once we actually have some TVOut
> > +	 * support
> > +	 */

I suppose I should re-enable the TV DAC and see how much power it actually consumes when enabled. It would seem to me that we should have a way to power down whatever parts of the video chipset we're not using. (For example, if I don't have anything connected to the VGA output, it would be good if we could power that down too...)

- Ted
RE: [ANNOUNCE][PATCH] drivers/scsi/megaraid/megaraid_{mm,mbox}
On Thursday, March 17, 2005 12:28 PM, James wrote:
> This is still rejecting:
>
> patching file drivers/scsi/megaraid/megaraid_mm.c
> Hunk #2 FAILED at 43.
> Hunk #4 FAILED at 68.
> Hunk #5 FAILED at 1217.
> Hunk #6 FAILED at 1225.
> Hunk #7 FAILED at 1245.
> 5 out of 7 hunks FAILED -- saving rejects to file
> drivers/scsi/megaraid/megaraid_mm.c.rej

Thank you for the correction again. This time I downloaded the source from BK:/linux.bkbits.net:8080/linux-2.6 and found that it already includes 'compat_ioctl' support. Here I'm sending another patch that fixes this issue. I've verified it by applying the patch to the source from BK (and also reproduced the error above with the previous patch).

I'm learning from your great comments and I appreciate your time on this. Please let me know if you have any comments.

Thank you.

Signed-off-by: Seokmann Ju <[EMAIL PROTECTED]>
---
diff -Naur BK/Documentation/scsi/ChangeLog.megaraid new/Documentation/scsi/ChangeLog.megaraid
--- BK/Documentation/scsi/ChangeLog.megaraid	2005-03-17 18:06:38.115075184 -0500
+++ new/Documentation/scsi/ChangeLog.megaraid	2005-03-17 09:14:03.247953384 -0500
@@ -1,3 +1,69 @@
+Release Date : Mon Mar 07 12:27:22 EST 2005 - Seokmann Ju <[EMAIL PROTECTED]>
+Current Version: 2.20.4.6 (scsi module), 2.20.2.6 (cmm module)
+Older Version : 2.20.4.5 (scsi module), 2.20.2.5 (cmm module)
+
+1.	Added IOCTL backward compatibility.
+	Convert megaraid_mm driver to new compat_ioctl entry points.
+	I don't have easy access to hardware, so only compile tested.
+		- Signed-off-by:Andi Kleen <[EMAIL PROTECTED]>
+
+2.	megaraid_mbox fix: wrong order of arguments in memset()
+	That, BTW, shows why cross-builds are useful-the only indication of
+	problem had been a new warning showing up in sparse output on alpha
+	build (number of exceeding 256 got truncated).
+		- Signed-off-by: Al Viro
+		<[EMAIL PROTECTED]>
+
+3.
Convert pci_module_init to pci_register_driver + Convert from pci_module_init to pci_register_driver + (from:http://kerneljanitors.org/TODO) + - Signed-off-by: Domen Puncer <[EMAIL PROTECTED]> + +4. Use the pre defined DMA mask constants from dma-mapping.h + Use the DMA_{64,32}BIT_MASK constants from dma-mapping.h when calling + pci_set_dma_mask() or pci_set_consistend_dma_mask(). See + http://marc.theaimsgroup.com/?t=10800199301=1=2 for more + details. + Signed-off-by: Tobias Klauser <[EMAIL PROTECTED]> + Signed-off-by: Domen Puncer <[EMAIL PROTECTED]> + +5. Remove SSID checking for Dobson, Lindsay, and Verde based products. + Checking the SSVID/SSID for controllers which have Dobson, Lindsay, + and Verde is unnecessary because device ID has been assigned by LSI + and it is unique value. So, all controllers with these IOPs have to be + supported by the driver regardless SSVID/SSID. + +6. Date Thu, 27 Jan 2005 04:31:09 +0100 + From Herbert Poetzl <> + Subject RFC: assert_spin_locked() for 2.6 + + Greetings! + + overcautious programming will kill your kernel ;) + ever thought about checking a spin_lock or even + asserting that it must be held (maybe just for + spinlock debugging?) ... + + there are several checks present in the kernel + where somebody does a variation on the following: + + BUG_ON(!spin_is_locked(_lock)); + + so what's wrong about that? nothing, unless you + compile the code with CONFIG_DEBUG_SPINLOCK but + without CONFIG_SMP ... in which case the BUG() + will kill your kernel ... + + maybe it's not advised to make such assertions, + but here is a solution which works for me ... 
+ (compile tested for sh, x86_64 and x86, boot/run + tested for x86 only) + + best, + Herbert + + - Herbert Poetzl <[EMAIL PROTECTED]>, Thu, 27 Jan 2005 + Release Date : Thu Feb 03 12:27:22 EST 2005 - Seokmann Ju <[EMAIL PROTECTED]> Current Version: 2.20.4.5 (scsi module), 2.20.2.5 (cmm module) Older Version : 2.20.4.4 (scsi module), 2.20.2.4 (cmm module) diff -Naur BK/drivers/scsi/megaraid/mega_common.h new/drivers/scsi/megaraid/mega_common.h --- BK/drivers/scsi/megaraid/mega_common.h 2005-03-17 20:01:55.774431112 -0500 +++ new/drivers/scsi/megaraid/mega_common.h 2005-03-17 07:16:21.209546408 -0500 @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include diff -Naur BK/drivers/scsi/megaraid/megaraid_mbox.c new/drivers/scsi/megaraid/megaraid_mbox.c --- BK/drivers/scsi/megaraid/megaraid_mbox.c2005-03-17 20:01:55.782429896 -0500 +++ new/drivers/scsi/megaraid/megaraid_mbox.c 2005-03-17 09:03:41.275507568 -0500 @@ -10,7 +10,7 @@ *2 of the License, or (at your option) any later version. * * FILE: megaraid_mbox.c - * Version
Re: Error messages with ACPI
On Sat, 2005-03-05 at 13:09, Mina Nozar wrote:
> kernel: ACPI-1133: *** Error: Method execution failed
> [\_SB_.BAT0._BST]
> (Node dfe043c0), AE_AML_NO_RETURN_VALUE

Please try the latest mm tree and report if these go away.

thanks,
-Len
Re: [PATCH] Prezeroing V8
On Thu, 2005-03-17 at 15:06 -0800, Christoph Lameter wrote:
> I want to sleep 30 seconds because the system load is unlikely to change
> frequently.

Ugh? That sounds like a magic number coming right from your hat or from your test scenario...

Ben.
Re: BKCVS broken ?
Followup to: <[EMAIL PROTECTED]> By author:[EMAIL PROTECTED] (Larry McVoy) In newsgroup: linux.dev.kernel > > I'll check into it. We've been having problems with connecting to > master.kernel.org, yup, here you go, anyone else seeing this? > > From [EMAIL PROTECTED] Thu Mar 17 05:06:57 2005 > Date: Thu, 17 Mar 2005 05:00:57 -0800 > From: [EMAIL PROTECTED] (Cron Daemon) > To: [EMAIL PROTECTED] > Subject: Cron <[EMAIL PROTECTED]> /bk-cvsexport/src/UPDATE > > Read from remote host master.kernel.org: Connection timed out > Please Cc: any reports of badness on kernel.org to [EMAIL PROTECTED]; I would have seen this quicker that way. Around the time the above happened the machine was pretty bogged down, because we're preparing new hardware to replace the main server, and were doing some very large copies. It might have caused a timeout. I notice a long login from you at approximately 14:00 PST; does that mean this is no longer an issue? -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: binfmt_elf padzero problems
Andrew Morton writes:

> I guess if the bss has zero length then we can skip the zeroing of the end
> of the page at the end of bss, as long as we're dead sure that we didn't
> accidentally instantiate a single page on behalf of that zero-length bss.

There is another thing I noticed about the bss code, which is that it doesn't give the bss the permissions from the PT_LOAD segment, rather it just uses VM_DATA_DEFAULT_FLAGS. That doesn't matter at the moment but may matter in future for ppc32.

Paul.
Re: [patch 12/12] scripts/mod/sumversion.c: replace strtok() with strsep()
* Sam Ravnborg <[EMAIL PROTECTED]>:
> On Sat, Mar 05, 2005 at 04:35:45PM +0100, [EMAIL PROTECTED] wrote:
> >
> > Replaces strtok() with strsep()
>
> Why - does it increase portability?

"strtok() is not thread and SMP safe and strsep() should be used instead"

http://janitor.kernelnewbies.org/docs/driver-howto.html#3.3.1

Cheers,
n.
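For readers who have not used strsep(), the following is a minimal userspace sketch of a strtok()-style split. The helper name `split_tokens` is hypothetical and is not taken from sumversion.c. The point it illustrates is that strsep() keeps its position in a caller-owned pointer, whereas strtok() keeps it in hidden static state shared by every caller.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* ask glibc to declare strsep() */
#endif
#include <stddef.h>
#include <string.h>

/*
 * Split 'line' in place on any character in 'delims', storing up to 'max'
 * token pointers in 'out'. Empty fields are skipped so that the result
 * matches what strtok() would have returned. Returns the token count.
 */
static size_t split_tokens(char *line, const char *delims,
			   char **out, size_t max)
{
	size_t n = 0;
	char *cursor = line; /* all parser state lives here, owned by the caller */
	char *tok;

	while (n < max && (tok = strsep(&cursor, delims)) != NULL) {
		if (*tok == '\0')
			continue; /* strsep() reports empty fields, strtok() hides them */
		out[n++] = tok;
	}
	return n;
}
```

One behavioral difference to keep in mind when converting: strsep() returns an empty string for each pair of adjacent delimiters, so a direct replacement needs the empty-field check shown above.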
RE: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC] PCIErrorRecovery)
On Thu, 2005-03-17 at 10:53 -0800, Nguyen, Tom L wrote:
> To support the AER driver calling an upstream device to initiate a reset
> of the link we need a specific callback since the driver doing the reset
> is not the driver who got the error. In the case of general PCI this
> could be useful if a PCI bus driver were available to support the
> callback for a bridge device. This would also support specific error
> recovery calls to reset an endpoint adapter. We need a call to request
> a driver to perform a reset on a link or device.

That is quite implementation specific, it doesn't need to be part of the API (the way the general error management is implemented in PCIE could be completely done within the bus drivers I suppose).

Again, I'm not trying to define or force a given implementation. I'm trying to define the driver-side API, that's all. I have difficulties following all of your previous explanations, I must admit. My point here is I'd like you to find out if the API can fit on the driver side, and if not, what would need to be changed.

For example, we might want to distinguish between slot reset (full hard reset) and link reset, that sort of thing (thus adding a new state for link reset and a new return code for the others for requesting a link reset if possible; platforms that don't do it, like IBM EEH PCI, would just fall back to full reset).

Again, the goal here is to have a way for drivers to be mostly bus agnostic (that is, not have to care if they are running on PCI, PCI-X, PCIE, with or without the IBM EEH mechanism, and whatever other mechanism another vendor might provide) and still implement basic error recovery.

A driver _designed_ for a PCI-Express device that knows it's on PCI Express can perfectly well use additional APIs to gather more error details, etc... but it would be nice to fit the "common needs" as much as possible in a common and _SIMPLE_ API. The simplicity here is a requirement, I'm very serious about it, because if it's not simple, drivers either won't implement it or won't get it right.

Ben.
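As a rough illustration of the slot-reset versus link-reset idea above, here is a small sketch of how platform code might combine driver answers and fall back to a full reset. Only PCIERR_RESULT_NEED_RESET is a name that appears in this thread; every other identifier, and the "most severe answer wins" merge rule, is an assumption made for the example and not part of any proposed API.

```c
/* Driver answers. Only PCIERR_RESULT_NEED_RESET is named in the thread. */
enum pcierr_result {
	PCIERR_RESULT_CAN_RECOVER = 0, /* hypothetical: no reset needed */
	PCIERR_RESULT_NEED_LINK_RESET, /* hypothetical: the suggested new code */
	PCIERR_RESULT_NEED_RESET,      /* full slot reset */
	PCIERR_RESULT_DISCONNECT,      /* hypothetical: driver gives up */
};

/* What the platform decides to do (all hypothetical). */
enum reset_kind { RESET_NONE, RESET_LINK, RESET_SLOT, RESET_GIVE_UP };

/* Assumed merge rule: the most severe answer from any driver wins. */
static enum pcierr_result merge_results(const enum pcierr_result *votes, int n)
{
	enum pcierr_result worst = PCIERR_RESULT_CAN_RECOVER;
	int i;

	for (i = 0; i < n; i++)
		if (votes[i] > worst)
			worst = votes[i];
	return worst;
}

/* Platforms without link reset just fall back to a full slot reset. */
static enum reset_kind choose_action(enum pcierr_result merged,
				     int platform_has_link_reset)
{
	switch (merged) {
	case PCIERR_RESULT_CAN_RECOVER:
		return RESET_NONE;
	case PCIERR_RESULT_NEED_LINK_RESET:
		return platform_has_link_reset ? RESET_LINK : RESET_SLOT;
	case PCIERR_RESULT_NEED_RESET:
		return RESET_SLOT;
	default:
		return RESET_GIVE_UP;
	}
}
```

The driver only states what it needs. Whether a link reset exists on a given platform stays hidden in the platform code, which is what keeps the driver side bus agnostic.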
Re: [PATCH] Prezeroing V8 + free_hot_zeroed_page + free_cold_zeroed page
On Thu, 17 Mar 2005, Jason Uhlenkott wrote:

> On Thu, Mar 17, 2005 at 05:36:50PM -0800, Christoph Lameter wrote:
> > +	while (avenrun[0] >= ((unsigned long)sysctl_scrub_load << FSHIFT)) {
> > +		set_current_state(TASK_UNINTERRUPTIBLE);
> > +		schedule_timeout(30*HZ);
> > +	}
>
> This should probably be TASK_INTERRUPTIBLE. It'll never actually get
> interrupted either way since kernel threads block all signals, but
> sleeping uninterruptibly contributes to the load average.

Correct. I just do not seem to be able to get this right.
RE: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC] PCIErrorRecovery)
On Wednesday, March 16, 2005 7:20 PM Benjamin Herrenschmidt wrote: >> What mechanism (message??) is used to perform the bus and/or link >> level reset? For PCI Express the reset is performed by the upstream >> port driver. My API takes this into account. Are you assuming the PCI >> device on the bus does the reset or will there be a PCI bus driver that >> will do the reset? How will the PCI error handling code initiate a >> reset? > >The "caller", that is the error management framework. I'm defining the >API at the driver level, not the implementation at the core level. > >For example, on IBM pSeries with PCI-Express, we will probably not have >an AER driver. This will be all dealt by the firmware which will mimmic >that to the existing EEH error management. We'll have the same API to do >the reset that we have today for resetting a slot. We decided to implement PCI Express error handling based on the PCI Express specification in a platform independent manner. This allows any platform that implements PCI Express AER per the PCI SIG specification can take advantage of the advanced features, much like SHPC hot-plug or PCI Express hot-plug implementations. >You may have noticed in general that I didn't either define who is >callign those callbacks. It's all implicit that this is done by platform >error management code. For example, on ppc64, even the recovery step >requires action from the platform since the slot has been physically >isolated. After we have notified all drivers with the "error detected" >callback, if we decide we can try the "recover" step (all drivers >returned they could try it and we decided the error wasn't too fatal) we >will call the firmware to re-enable IOs on the slot and call the >"recover" step. For PCI Express the endpoint device driver can take recovery action on its own, depending on the nature of the error so long as it does not affect the upstream device. This can include endpoint device resets. 
We expect the driver to do this upon error notification, if possible. In PCI Express since the driver will have the most knowledge regarding the error it will have the best ability to do device dependent recovery and IO retry. If its recovery fails then the AER driver will ask the upstream device driver to perform the link reset. Since this is more of a side effect an explicit call to recover is not necessary. However, we understand and agree that it is needed to support the general error recovery cases for PCI. To support the AER driver calling an upstream device to initiate a reset of the link we need a specific callback since the driver doing the reset is not the driver who got the error. In the case of general PCI this could be useful if a PCI bus driver were available to support the callback for a bridge device. This would also support specific error recovery calls to reset an endpoint adapter. We need a call to request a driver to perform a reset on a link or device. Thanks, Long - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Prezeroing V8 + free_hot_zeroed_page + free_cold_zeroed page
On Thu, Mar 17, 2005 at 05:36:50PM -0800, Christoph Lameter wrote:
> +	while (avenrun[0] >= ((unsigned long)sysctl_scrub_load << FSHIFT)) {
> +		set_current_state(TASK_UNINTERRUPTIBLE);
> +		schedule_timeout(30*HZ);
> +	}

This should probably be TASK_INTERRUPTIBLE. It'll never actually get interrupted either way since kernel threads block all signals, but sleeping uninterruptibly contributes to the load average.
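The comparison in the quoted loop works on fixed-point numbers: avenrun[] holds the load average scaled by 2^FSHIFT, so the integer sysctl value has to be shifted before comparing. A small userspace sketch of just that arithmetic, assuming FSHIFT is 11 as in the kernel headers of that era (the function name `load_too_high` is hypothetical):

```c
#define FSHIFT 11 /* fractional bits of the fixed-point load average (assumed 11) */

/*
 * Returns nonzero when the 1-minute load average (fixed point, as stored
 * in avenrun[0]) is at or above the integer threshold 'scrub_load'.
 */
static int load_too_high(unsigned long avenrun0, unsigned long scrub_load)
{
	return avenrun0 >= (scrub_load << FSHIFT);
}
```

With the default threshold of 1, a fixed-point value of 2048 (load 1.00) is "too high" while 2047 is not. This also shows why the sleep state matters: a thread sleeping in TASK_UNINTERRUPTIBLE is counted in the load average, so the loop would raise the very value it is waiting on.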
Re: oom with 2.6.11
Coywolf Qi Hunt wrote:
> I do "grep check-route.sh oom_2.6.11.3.txt | wc" and it shows 4365

duh, good catch! really!

> lines, which means there're 4365 that script processes running, from
> pid 4260 to12747, mostly with pretty low points, 123.
> Based on this points, suppose each script consumes 100k, that'll be
> 100k*4k=400M roughly. And your box's is merely 256M MemTotal.

yes, i just checked, the script is looping and crond is starting a new one, and another... and the oom-killer does not catch it, because it's too small and of course doesn't know where it is coming from (crond).

> Check this script and disable it; see what will happen.

yes, will do that.

on a (not so unimportant) side-note: i was told the whole thing should be fixed with 2.6.11.4:

[PATCH] CAN-2005-0384: Remote Linux DoS on ppp servers

after all it seems to be PEBKAC and bad luck... what a week.

thank you for your help,
Christian.
--
BOFH excuse #416: We're out of slots on the server
2.6.11-mm3 - redzone mismatch
My computer crashed twice today. Both times I was unable to use the keyboard, but was able to shutdown X. I have had hardware problems related to overheating, but I believe that I have resolved my overheating problems, and in any event I had not been stressing the cpu since the first crash. The following messages appeared in my kernel log before my first crash: Mar 17 12:51:33 kenichi kernel: slab dentry_cache: redzone mismatch in slabp c2802000, objp c2802ab8, bufctl 0xfffe Mar 17 12:51:33 kenichi kernel: Redzone: 0x170fc2a5/0x120fc2a5. Mar 17 12:51:33 kenichi kernel: Last user: [d_alloc+28/464](d_alloc+0x1c/0x1d0) Mar 17 12:51:33 kenichi kernel: 000: 00 00 00 00 00 00 00 00 a4 7a 32 cb 34 2e 80 c2 Mar 17 12:51:33 kenichi kernel: 010: 1d 47 cc d2 0f 00 00 00 20 2b 80 c2 6c 2b 80 c2 Mar 17 12:51:33 kenichi kernel: slab dentry_cache: redzone mismatch in slabp c2802000, objp c2802b4c, bufctl 0xfffe Mar 17 12:51:33 kenichi kernel: Redzone: 0x90fc2a5/0x170fc2a5. Mar 17 12:51:33 kenichi kernel: Last user: [d_alloc+28/464](d_alloc+0x1c/0x1d0) Mar 17 12:51:33 kenichi kernel: 000: 00 00 00 00 00 00 00 35 8c 7c 32 cb 34 2e 80 12 Mar 17 12:51:33 kenichi kernel: 010: 9a d3 fd f5 15 00 00 09 b4 2b 80 c2 00 2c 80 35 ... repeat every five minutes for a few hours ... 
then: Mar 17 17:56:35 kenichi kernel: ff ff ff ff ff bf ff Mar 17 17:56:35 kenichi kernel: 12ff0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13000: 46 41 43 53 40 00 00 00 00 00 00 00 00 00 00 00 Mar 17 17:56:35 kenichi kernel: 13010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Mar 17 17:56:35 kenichi kernel: 13020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Mar 17 17:56:35 kenichi kernel: 13030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Mar 17 17:56:35 kenichi kernel: 13040: ff fb ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13070: ff ff ff ff ff ff f7 ff ff ff df ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13080: ff ff ff ff ff ff ff ff ff ff f7 ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13090: ff ff ff df ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 130a0: ff ff ff ff ff ff ff ff ff ff ff ef ff ff ff ff Mar 17 17:56:35 kenichi kernel: 130b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 130c0: bf ff ff ff ff ff ff ff ff ff ff ff ff 7f ff ff Mar 17 17:56:35 kenichi kernel: 130d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 130e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 130f0: ff ff ff ff ff bf ff ff ff f7 ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13100: ff ff ff ff ff ff bf ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13110: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13120: ff ff ff ff ff ff ff ff ff ff ff ef ff ff db ff Mar 17 17:56:35 kenichi kernel: 13130: ff ff ff ff ff ff ff ff fb ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13140: ff ff ff ff ff ff ff ff db ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi 
kernel: 13150: ff ff ff ff ff ff ff ff f3 ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13160: ff ff 7f ff ff ff ff fb ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13170: ff ff ff ef ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13190: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 131a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff fb ff Mar 17 17:56:35 kenichi kernel: 131b0: ff ff ff ff ff ff ff ff ff ff ff ff ef ff ff ff Mar 17 17:56:35 kenichi kernel: 131c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 131d0: ff ff fd ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 131e0: ff ff ff ff ff ff ff ff ef ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 131f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ad ff ff Mar 17 17:56:35 kenichi kernel: 13200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13210: ff ff ff ff ff ff ff ff ff bf df ff ff ff ef ff Mar 17 17:56:35 kenichi kernel: 13220: ff ff ff ff ff ff ff ff ff ff ef ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13230: ff 7f ff ff ff ff ff ff ff ff ff ff ff ff ff 7f Mar 17 17:56:35 kenichi kernel: 13240: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff df Mar 17 17:56:35 kenichi kernel: 13250: ff f7 ff ff ff ff fe ff ff ff ff ff ff ff ff fe Mar 17 17:56:35 kenichi kernel: 13260: ff ff ff ff ff ff ff ff ff bf ff ff ff fe df ff Mar 17 17:56:35 kenichi kernel: 13270: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Mar 17 17:56:35 kenichi kernel: 13280: ef ff ff
Re: [PATCH] Prezeroing V8 + free_hot_zeroed_page + free_cold_zeroed page
Here is the fixed up zeroing patch with management of hot/cold zeroed pages. If quicklists would like to use this then they need to use free_hot_zeroed_page(page) and get_zeroed_page(GFP) for their management of hot zeroed pages. If the pool is empty then it will be replenished either from the pool built up by kscrubd or by zeroing a couple of pages on the fly.

The most expensive operation in the page fault handler is (apart from SMP locking overhead) the touching of all cache lines of a page by zeroing the page. This zeroing means that all cachelines of the faulted page (on Altix that means all 128 cachelines of 128 bytes each) must be handled and later written back. This patch makes it possible to avoid touching all cachelines if only a part of the cachelines of that page is needed immediately after the fault. Doing so will only be effective for sparsely accessed memory, which is typical for anonymous memory and pte maps.

The patch can make prezeroing more effective by also allowing the use of hardware devices to offload zeroing from the cpu. This avoids the invalidation of the cpu caches by extensive zeroing operations. For that purpose a driver may register a zeroing driver via

	register_zero_driver(z)

When the number of zeroed pages falls below a lower threshold (defined by setting /proc/sys/vm/scrub_start) kscrubd is invoked (similar to the swapper). kscrubd then zeroes free pages until the upper threshold is reached (set by /proc/sys/vm/scrub_stop). The zeroing is performed on a percentage of pages at each order of freed pages to minimize fragmentation of pages. kscrubd performs short bursts of zeroing when needed and tries to stay off the processor as much as possible. kscrubd will only run when the load is less than set in /proc/sys/vm/scrub_load (defaults to 1).

The patch also provides the management of hot and cold lists for zeroed pages in the pageset structure.

Patch against 2.6.11.3-bk3.
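The scrub_start/scrub_stop pair described above forms a simple hysteresis: start below the lower mark, keep going until the upper mark. The sketch below models only that decision in userspace. The struct and function names are hypothetical, and treating both thresholds as plain counts of zeroed pages is an assumption, since the description does not state their units.

```c
/* Hypothetical model of the kscrubd wake/sleep decision. */
struct scrub_state {
	unsigned long scrub_start; /* lower threshold: start zeroing below this */
	unsigned long scrub_stop;  /* upper threshold: stop zeroing at or above this */
	int running;               /* nonzero while the scrubber is active */
};

/*
 * Feed in the current number of zeroed pages and get back whether the
 * scrubber should be running. Between the two thresholds the previous
 * state is kept, which prevents rapid start/stop flapping.
 */
static int scrub_update(struct scrub_state *s, unsigned long zeroed_pages)
{
	if (!s->running && zeroed_pages < s->scrub_start)
		s->running = 1;
	else if (s->running && zeroed_pages >= s->scrub_stop)
		s->running = 0;
	return s->running;
}
```

The gap between the two thresholds is what produces the "short bursts" behavior: each wakeup refills the pool well past the start mark before the thread goes idle again.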
Performance data may be found at http://oss.sgi.com/projects/page_fault_performance/

Changelog:
- Cleanup and document more clearly
- Add full support for hot/cold zeroed pages.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.11/mm/page_alloc.c
===
--- linux-2.6.11.orig/mm/page_alloc.c	2005-03-17 16:38:55.0 -0800
+++ linux-2.6.11/mm/page_alloc.c	2005-03-17 17:28:27.0 -0800
@@ -12,6 +12,8 @@
  * Zone balancing, Kanoj Sarcar, SGI, Jan 2000
  * Per cpu hot/cold page lists, bulk allocation, Martin J. Bligh, Sept 2002
  * (lots of bits borrowed from Ingo Molnar & Andrew Morton)
+ * Page zeroing by Christoph Lameter, SGI, Dec 2004 using
+ * initial code for __GFP_ZERO support by Andrea Arcangeli, Oct 2004.
  */
 
 #include
@@ -34,6 +36,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "internal.h"
@@ -180,16 +183,20 @@ static void destroy_compound_page(struct
  * zone->lock is already acquired when we use these.
  * So, we don't need atomic page->flags operations here.
  */
-static inline unsigned long page_order(struct page *page) {
+static inline unsigned long page_zorder(struct page *page) {
 	return page->private;
 }
 
-static inline void set_page_order(struct page *page, int order) {
-	page->private = order;
+/* We use bit PAGE_PRIVATE_ZERO_SHIFT in page->private to encode
+ * the zeroing status. This makes buddy pages with different zeroing
+ * status not match to avoid merging zeroed with unzeroed pages
+ */
+static inline void set_page_zorder(struct page *page, int order, int zero) {
+	page->private = order + (zero << PAGE_PRIVATE_ZERO_SHIFT);
 	__SetPagePrivate(page);
 }
 
-static inline void rmv_page_order(struct page *page)
+static inline void rmv_page_zorder(struct page *page)
 {
 	__ClearPagePrivate(page);
 	page->private = 0;
@@ -231,14 +238,15 @@ __find_combined_index(unsigned long page
  * we can do coalesce a page and its buddy if
  * (a) the buddy is free &&
  * (b) the buddy is on the buddy system &&
- * (c) a page and its buddy have the same order.
+ * (c) a page and its buddy have the same order and the same
+ *     zeroing status.
  * for recording page's order, we use page->private and PG_private.
  *
  */
-static inline int page_is_buddy(struct page *page, int order)
+static inline int page_is_buddy(struct page *page, int order, int zero)
 {
 	if (PagePrivate(page) &&
-	    (page_order(page) == order) &&
+	    (page_zorder(page) == order + (zero << PAGE_PRIVATE_ZERO_SHIFT)) &&
 	    !PageReserved(page) &&
 	    page_count(page) == 0)
 		return 1;
@@ -270,7 +278,7 @@ static inline int page_is_buddy(struct p
  */
 
 static inline void __free_pages_bulk (struct page *page,
-		struct zone *zone, unsigned int order)
+		struct zone *zone, unsigned int order, unsigned int zero)
 {
 	unsigned long page_idx;
 	int
Re: [2.6 patch] USB: possible cleanups
On Tue, Mar 01, 2005 at 01:43:52AM +0100, Adrian Bunk wrote:
> Before I'm getting flamed to death:
> This patch contains possible cleanups. If parts of this patch conflict
> with pending changes these parts of my patch have to be dropped.
>
> This patch contains the following possible cleanups:
> - make needlessly global code static
> - #if 0 the following unused global functions:
>   - core/usb.c: usb_buffer_map
>   - core/usb.c: usb_buffer_unmap
> - remove the following unneeded EXPORT_SYMBOL's:
>   - core/hcd.c: usb_bus_init
>   - core/hcd.c: usb_alloc_bus
>   - core/hcd.c: usb_register_bus
>   - core/hcd.c: usb_deregister_bus
>   - core/hcd.c: usb_hcd_irq
>   - core/usb.c: usb_buffer_map
>   - core/usb.c: usb_buffer_unmap
>   - core/buffer.c: hcd_buffer_create
>   - core/buffer.c: hcd_buffer_destroy
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Looks good to me, thanks for the patch. Applied.

greg k-h

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [ACPI] Re: Fw: Anybody? 2.6.11 (stable and -rc) ACPI breaks USB
On Fri, 2005-03-18 at 02:08, Bjorn Helgaas wrote:
> On Thu, 2005-03-17 at 09:33 +0800, Li Shaohua wrote:
> > The comments in previous quirk said it's required only in PIC mode.
> ...
> > I feel we concerned too much. Changing the interrupt line isn't harmful,
> > right? Linux actually ignored interrupt line. Maybe just a
> > PCI_FIXUP_ENABLE(PCI_VENDOR_ID_VIA, PCI_ANY_ID, quirk_via_irq) is
> > sufficient.
>
> I think it's good to limit the scope of the quirk as much as
> possible because that makes it easier to do future restructuring,
> such as device-specific interrupt routers.
>
> The comment (before quirk_via_acpi(), nowhere near quirk_via_irqpic())
> says *on-chip devices* have this unusual behavior when the interrupt
> line is written. That makes sense to me.
>
> Writing the interrupt line on random plug-in Via PCI devices does
> not make sense to me, because for that to have any effect, an
> upstream bridge would have to be snooping the traffic going through
> it. That doesn't sound plausible to me.
>
> What about this:

Hmm, this looks like the previous solution. We removed the specific VIA quirk because we don't know how many devices have this issue. Every time we encounter an IRQ issue in a VIA PCI device, we will suspect that it requires the quirk and keep trying. This is a big overhead.

Thanks,
Shaohua
Re: [2.6 patch] drivers/usb/net/pegasus.c: make some code static
On Tue, Mar 01, 2005 at 01:35:41AM +0100, Adrian Bunk wrote:
> This patch makes some needlessly global code static.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
Re: [2.6 patch] remove drivers/usb/image/hpusbscsi.c
On Thu, Mar 03, 2005 at 02:38:56PM +0100, Adrian Bunk wrote:
> USB_HPUSBSCSI was marked as BROKEN in 2.6.11 since libsane is the
> preferred way to access these devices.
>
> Unless someone plans to resurrect this driver, I'm therefore proposing
> this patch to completely remove it.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
Re: [2.6 patch] drivers/usb/storage/: cleanups
On Tue, Mar 01, 2005 at 01:37:58AM +0100, Adrian Bunk wrote:
> This patch contains the following cleanups:
> - make needlessly global code static
> - scsiglue.c: remove the unused usb_stor_sense_notready
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
Re: [2.6 patch] drivers/usb/serial/: make some functions static
On Tue, Mar 01, 2005 at 01:39:35AM +0100, Adrian Bunk wrote:
> This patch makes some needlessly global functions static.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
AIO panic on 2.6.11 on PPC64 caused by is_hugepage_only_range()
When testing AIO on PPC64 (a power5 machine) running 2.6.11 with CONFIG_HUGETLB_PAGE=y, I ran into a kernel panic when a process exits that has done AIO (io_queue_init()) but has not done the io_queue_release(). The exit_aio() code is cleaning up and panicking when trying to free the aio ring buffer.

I tracked this down to is_hugepage_only_range() (include/asm-ppc64/page.h), which does a touches_hugepage_low_range(), which checks current->mm->context.htlb_segs. The problem is that exit_mm() cleared tsk->mm before doing the mmput(), so current->mm is already NULL when mmput() leads to exit_aio(), and then comes the panic.

Looks like is_hugepage_only_range() is only used in ia64 and ppc64. A possible fix is to change is_hugepage_only_range() to take an 'mm' as a parameter as well as 'addr' and 'len'; the ppc64 code could then use 'mm' instead of current->mm. It looks like it has been broken for quite a while.

Here's the stack trace:

cpu 0x2: Vector: 300 (Data Access) at [c001d1be7590]
    pc: c0092960: .unmap_region+0x17c/0x4a4
    lr: c0092bb0: .unmap_region+0x3cc/0x4a4
    sp: c001d1be7810
   msr: 80009032
   dar: 298
 dsisr: 4000
  current = 0xc1dd77b0
  paca    = 0xc0595c00
    pid   = 11336, comm = aiodio_readoff
[c001d1be78e0] c0093d08 .do_munmap+0x240/0x408
[c001d1be79b0] c00d11b4 .aio_free_ring+0x10c/0x1d8
[c001d1be7a50] c00d162c .__put_ioctx+0x84/0x120
[c001d1be7af0] c00d3640 .exit_aio+0xf4/0x100
[c001d1be7b80] c004dfd4 .mmput+0x80/0x15c
[c001d1be7c20] c0053648 .exit_mm+0x1b4/0x264
[c001d1be7cc0] c00555ac .do_exit+0x10c/0xdb0
[c001d1be7d90] c00562a8 .do_group_exit+0x58/0xd8
[c001d1be7e30] c000d500 syscall_exit+0x0/0x18

Here's a program that produces the panic (compile using cc -o aiodio_read aiodio_read.c -laio):

--
#define _XOPEN_SOURCE 600
#define _GNU_SOURCE

/* The archive stripped the header names; these are the ones the code below needs. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <libaio.h>

int pagesize;
char *iobuf;
io_context_t myctx;
int aio_maxio = 4;

/*
 * do an AIO DIO read
 */
int do_aio_direct_read(int fd, char *iobuf, int offset, int size)
{
	struct iocb myiocb;
	struct iocb *iocbp = &myiocb;
	int ret;
	struct io_event e;
	struct stat s;

	io_prep_pread(&myiocb, fd, iobuf, size, offset);
	if ((ret = io_submit(myctx, 1, &iocbp)) != 1) {
		perror("io_submit");
		return ret;
	}

	ret = io_getevents(myctx, 1, 1, &e, 0);

	if (ret) {
		struct iocb *iocb = e.obj;
		int iosize = iocb->u.c.nbytes;
		char *buf = iocb->u.c.buf;
		long long loffset = iocb->u.c.offset;

		printf("AIO read of %d at offset %lld returned %ld\n",
			iosize, loffset, (long) e.res);
	}

	return ret;
}

int main(int argc, char *argv[])
{
	char *filename;
	int fd;
	int err;

	filename = "test.aio.file";
	fd = open(filename, O_RDWR|O_DIRECT|O_CREAT|O_TRUNC, 0666);

	pagesize = getpagesize();
	err = posix_memalign((void**) &iobuf, pagesize, pagesize);
	if (err) {
		fprintf(stderr, "Error allocating %d aligned bytes.\n", pagesize);
		exit(1);
	}
	err = write(fd, iobuf, pagesize);
	if (err != pagesize) {
		fprintf(stderr, "Error ret = %d writing %d bytes.\n", err, pagesize);
		perror("");
		exit(1);
	}

	memset(&myctx, 0, sizeof(myctx));
	io_queue_init(aio_maxio, &myctx);
	err = do_aio_direct_read(fd, iobuf, 0, pagesize);
	close(fd);

	/* Exit without io_queue_release(): this is what triggers the panic. */
	printf("This will panic on ppc64\n");
	return err;
}
--

Daniel
EIP and VMA
Hi,

I am working on this piece of code (simplified):

void ip_vma(struct task_struct *task, struct pt_regs *regs)
{
	struct mm_struct *mm;
	struct vm_area_struct *vma;

	if (task) {
		mm = get_task_mm(task);
		if (mm) {
			vma = find_vma(mm, regs->eip);
			if (vma) {
				/* Some code */
			} else
				printk("WARNING: No VMA\n");
			mmput(mm);
		}
	}
}

I would like to get the VMA that contains a task's instruction pointer. To do so, I call find_vma() with regs->eip as the instruction pointer value. Unfortunately I always get the "WARNING: No VMA" message, because find_vma() cannot find the VMA that the regs->eip address belongs to.

Is regs->eip the right place to look for the instruction pointer, or should I find that value elsewhere?

Thank you,
Luca
Re: [patch 1/2] fork_connector: add a fork connector
On Thu, 17 Mar 2005 08:56:57 -0800 Jesse Barnes <[EMAIL PROTECTED]> wrote:
> On Thursday, March 17, 2005 1:04 am, Guillaume Thouvenin wrote:
> > +static inline void fork_connector(pid_t parent, pid_t child)
> > +{
> > +	static DEFINE_SPINLOCK(cn_fork_lock);
> > +	static __u32 seq; /* used to test if message is lost */
> > +
> > +	if (cn_fork_enable) {
> > +		struct cn_msg *msg;
> > +
> > +		__u8 buffer[CN_FORK_MSG_SIZE];
> > +
> > +		msg = (struct cn_msg *)buffer;
> > +
> > +		memcpy(&msg->id, &cn_fork_id, sizeof(msg->id));
> > +		spin_lock(&cn_fork_lock);
> > +		msg->seq = seq++;
> > +		spin_unlock(&cn_fork_lock);
>
> As I mentioned before, this won't work very well on a large CPU count system.
>
> cn_fork_lock will be taken by each CPU everytime it does a fork, meaning that
> forks will be very slow if lots of CPUs are doing them at the same time. Is

Maybe... But consider a ppc system: each lock is about 10 instructions (or even less), increment with return is about 3-5 instructions, and unlock is a barrier() and a store. The whole fork syscall contains far too many instructions (do_fork() itself is more than 500, not counting the instructions in functions called from do_fork()) to care about 20 of them on each CPU, even if there are 512 CPUs. The most significant cost is the requirement to keep the u32 seq in each CPU's cache, and thus to flush the cacheline and invalidate/refetch it from memory on every other CPU each time it is accessed, which is a big price.

> there a more scalable way to ensure message delivery?

It is totally Guillaume's work, so he decides. I would recommend per-CPU counters and the processor's id in each message. And of course userspace should take care of misordered messages. I personally prefer such a mechanism. Guillaume?

> Jesse

	Evgeniy Polyakov

Only failure makes us experts. -- Theo de Raadt
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Andrew Morton wrote:

> > I switched off the page-zeroing hardware for the tests.
>
> What tests?

For the results on the darn URL.

> See, a speedup in a simple malloc+memset could be due to either a simple
> transfer of load from user to kscrubd, or it could be due to leveraging the
> page-zeroing hardware.
>
> The latter, I expect, if the workload is actually touching every byte of
> all the pages. Is it?

If the workload is touching every byte of the page immediately after a page fault then prezeroing is not effective. It's only useful for sparse accesses (like page tables etc).

> If we're doing kscrubd zeroing via memset() then the total system load
> would actually be increased if the application is touching every byte, yes?

The kernel would have zeroed a page uselessly at an idle time.

> > Without zeroing hardware the zeroing actions are moved to idle
> > system time (load < /proc/sys/vm/scrub_load). Its shifting the cpu load.
>
> Right. We'd expect that to be a net regression if the application is
> touching all of the memory and a net win if it is touching the memory
> sparsely, yes?

There will be no regression (as shown on the unnamed URL) if the scrubd is only run during idle times (and also there will be no regression if the known zeroed pages are returned to the hotlists and then used).

Kscrubd is an experimental configuration option. Switch it off [default] and the zero hotlists are only populated by the return of known zeroed pages via free_hot_zeroed_page etc.
Real-Time Preemption and RCU
Hello!

As promised/threatened earlier in another forum...

						Thanx, Paul

The Real-Time Preemption patch modified RCU to permit its read-side critical sections to be safely preempted. This has worked, but there are a few serious problems with this variant of RCU. [If you just want to skip directly to the code, search for "synchronize_kernel(void)". There are five occurrences, each a variation on the theme. I recommend the fifth one. The third one might also be OK in some environments. If you have better approaches, please do not keep them a secret!!!]

So, why am I saying that there are problems with the real-time preemption implementation of RCU?

o	RCU read-side critical sections cannot be freely nested, since the read-side critical section now acquires locks. This means that the real-time preemption variant of RCU is subject to deadlock conditions that "classic" RCU is immune to. This is not just a theoretical concern, for example, see nf_hook_slow() in linux/net/core/netfilter.c:

	+	/*
	+	 * PREEMPT_RT semantics: different-type read-locks
	+	 * dont nest that easily:
	+	 */
	+//	rcu_read_lock_read(_lock);

	A number of other RCU read-side critical sections have been similarly disabled (17 total in the patch). Perhaps most embedded systems will not be using netfilter heavily, but this does bring up a very real stability concern, even on UP systems (since preemption will eventually occur when and where least expected).

o	RCU read-side critical sections cannot be unconditionally upgraded to write-side critical sections in all cases.
	For example, in classic RCU, it is perfectly legal to do the following:

		rcu_read_lock();
		list_for_each_entry_rcu(lep, head, p) {
			if (p->needs_update) {
				spin_lock(_lock);
				update_it(p);
				spin_unlock(_lock);
			}
		}
		rcu_read_unlock();

	This would need to change to the following for real-time preempt kernels:

		rcu_read_lock_spin(_lock);
		list_for_each_entry_rcu(lep, head, p) {
			if (p->needs_update) {
				spin_lock(_lock);
				update_it(p);
				spin_unlock(_lock);
			}
		}
		rcu_read_unlock_spin(_lock);

	This results in self-deadlock.

o	There is an API expansion, with five different variants of rcu_read_lock():

		API				# uses
		------------------------	------
		rcu_read_lock_spin()		    11
		rcu_read_unlock_spin()		    12
		rcu_read_lock_read()		    42
		rcu_read_unlock_read()		    42
		rcu_read_lock_bh_read()		     2
		rcu_read_unlock_bh_read()	     3
		rcu_read_lock_down_read()	    14
		rcu_read_unlock_up_read()	    20
		rcu_read_lock_nort()		     3
		rcu_read_unlock_nort()		     4
		TOTAL				   153

o	The need to modify lots of RCU code expands the size of this patch -- roughly 10% of the 20K lines of this patch are devoted to modifying RCU code to meet this new API. 10% may not sound like much, but it comes to more than 2,000 lines of context diffs.

Seems to me that it would be good to have an RCU implementation that meets the requirements of the Real-Time Preemption patch, but that is 100% compatible with the "classic RCU" API. Such an implementation must meet a number of requirements, which are listed at the end of this message (search for "REQUIREMENTS").

I have looked into a number of seductive but subtly broken "solutions" to this problem. The solution listed here is not perfect, but I believe that it has enough advantages to be worth pursuing. The solution is quite simple, and I feel a bit embarrassed that it took me so long to come up with it. All I can say in my defense is that the idea of -adding- locks to improve scalability and eliminate deadlocks is quite counterintuitive to me. And, like I said earlier, if you know of a better approach, please don't keep it a secret!
The following verbiage steps through several variations on this solution, as follows: 1. "Toy" implementation that has numerous API, scalability, and realtime problems, but is a very simple 28-line illustration of the underlying principles. (In case you get excited about this being much smaller than
Re: [PATCH] add TIMEOUT to firmware_class hotplug event
On Thu, Mar 17, 2005 at 03:34:31AM +0100, Kay Sievers wrote:
> On Tue, 2005-03-15 at 09:25 +0100, Hannes Reinecke wrote:
> > The current implementation of the firmware class breaks a fundamental
> > assumption in udevd: that the physical device can be initialised fully
> > prior to executing the next event for that device.
>
> Here we add a TIMEOUT value to the hotplug environment of the firmware
> requesting event. I will adapt udevd not to wait for anything else, if
> it finds a TIMEOUT key.
>
> Signed-off-by: Kay Sievers <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
Re: [PATCH] Prezeroing V8
Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> On Thu, 17 Mar 2005, Andrew Morton wrote:
>
> > > http://oss.sgi.com/projects/page_fault_performance/
> >
> > Oh no, not that page again ;)
>
> Yes indeed!
>
> > Seems to say that prezeroing makes negligible difference to kernel builds,
> > but speeds up a big malloc+memset by 3x to 4x, yes?
>
> Correct.
>
> > Are there any real-worldish workloads which show an appreciable benefit?
>
> Ummm. Big loads are our real-worldish workloads here.

Sure, but not malloc+memset+exit. How much improvement do these big numerical tasks get from the patch?

> > The large speedup for a big memset seems odd - I assume it's simply
> > transferring CPU load from the user's process over to kscrubd. Or is it
> > the fancy page-zeroing hardware? How do we differentiate the two?
>
> I switched off the page-zeroing hardware for the tests.

What tests?

See, a speedup in a simple malloc+memset could be due to either a simple transfer of load from user to kscrubd, or it could be due to leveraging the page-zeroing hardware.

The latter, I expect, if the workload is actually touching every byte of all the pages. Is it?

If we're doing kscrubd zeroing via memset() then the total system load would actually be increased if the application is touching every byte, yes?

> > Are there any workloads which are seeing a benefit on a CPU which doesn't
> > have the zeroing hardware?
>
> Without zeroing hardware the zeroing actions are moved to idle
> system time (load < /proc/sys/vm/scrub_load). Its shifting the cpu load.

Right. We'd expect that to be a net regression if the application is touching all of the memory and a net win if it is touching the memory sparsely, yes?
Re: [PATCH] Xen/i386 cleanups - AGP bus/phys cleanups
Alan Cox writes:

> On Iau, 2005-03-17 at 09:34, Paul Mackerras wrote:
> > This code needs real physical addresses, which are not the same things
> > as bus addresses.
>
> Not always. The code needs platform specific goodies. We've only never
> been burned so far because there isn't a box with an IOMMU and AGPGART
> where one maps through the other.

That sounds like a good way to make AGP accesses slower. :)

Seriously, given that AGP is a technology that is being superseded by PCI Express, I think it's reasonable to look at the range of current implementations to see what we have to cope with. So I don't think it's worth worrying too much about the possibility of GARTs that go through the IOMMU.

However, the idea of having phys_to_agp/agp_to_phys (or virt_to_agp/agp_to_virt) sounds like it wouldn't be too much effort, if it would help Xen.

Paul.
Re: [patch][resend] convert a remaining verify_area to access_ok (was: Re: [PATCH 2.6.11-mm1] mips: more convert verify_area to access_ok) (fwd)
On Thu, 17 Mar 2005, Ralf Baechle wrote:

> On Wed, Mar 16, 2005 at 10:35:09PM +0100, Jesper Juhl wrote:
>
> > Around 2.6.11-mm1 Yoichi Yuasa found a user of verify_area that I had
> > missed when converting everything to access_ok. The patch below still
> > applies cleanly to 2.6.11-mm4.
> > Please apply (unless of course you already picked it up back then and
> > have it in a queue somewhere :) .
>
> Oh gosh, you actually converted the whole IRIX compatibility mess even,
> amazing stomach you have :-) I only noticed that when I just looked at
> Linus' tree - after burning a few hours into cleaning those files myself -
> mine are now almost free of sparse warnings.
>
I hope I did a decent job and that you didn't waste too much time duplicating effort...

> The last instance of verify_area() in the MIPS code is now the definition
> itself.
>
The plan is to wait for a few months (or a few kernel releases, whichever comes first) and then I'll send Andrew patches to remove it completely. There are still a few related nits left, like the FPU_verify_area function in arch/i386/math-emu/reg_ld_str.c and the rw_verify_area function in fs/read_write.c, that I want to get out of the way first (I think I'll probably end up attempting to rename those s/verify_area/access_ok/ and see if people scream).

--
Jesper
Re: BKCVS broken ?
I got swamped, I'll look at this after dinner. But you might take a look at this:

	http://www.bitkeeper.com/press/2005-03-17.html

which is a link to a very simple open source BK client. It doesn't do much except track the head of the tree but it does that well. It's slightly better than that, it puts all the checkin comments in BK/ChangeLog so you don't have to go over the wire to get those.

It's intended for someone who just wants the latest and greatest snapshot, knows how to do cp -rp and diff -Nur, it's pretty basic. It's not a CVS gateway replacement but it does work for every tree on bkbits.net.

Just to be clear, we are not dropping the CVS gateway, this is "in addition to" not "instead of". If this turns out to be popular we can look at making a BitTorrent image of each tree available so people can get at them without swamping us.

Don't worry about the license, it's a joke. BSD license OK with everyone?

--
---
Larry McVoy                lm at bitmover.com           http://www.bitkeeper.com
Re: [ACPI] IDE failure on ACPI resume
Matthew Garrett wrote:
> On Thu, 2005-03-17 at 12:34 -0800, Nate Lawson wrote:
> > Very interesting. I was hoping to someday have _GTF et al implemented
> > but the ATA knowledge required was above my head. I also strongly
> > suspected that the info published by _GTF would likely be invalid. Does
> > Windows actually use that method or just hardcoded ATA initialization?
>
> I believe that Windows does use the _GTF methods.

You are correct. A quick scan of my w2k drivers shows atapi.sys uses the _GTF, _GTM, and _STM methods.

--
Nate
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Andrew Morton wrote:

> > http://oss.sgi.com/projects/page_fault_performance/
>
> Oh no, not that page again ;)

Yes indeed!

> Seems to say that prezeroing makes negligible difference to kernel builds,
> but speeds up a big malloc+memset by 3x to 4x, yes?

Correct.

> Are there any real-worldish workloads which show an appreciable benefit?

Ummm. Big loads are our real-worldish workloads here.

> The large speedup for a big memset seems odd - I assume it's simply
> transferring CPU load from the user's process over to kscrubd. Or is it
> the fancy page-zeroing hardware? How do we differentiate the two?

I switched off the page-zeroing hardware for the tests.

> Are there any workloads which are seeing a benefit on a CPU which doesn't
> have the zeroing hardware?

Without zeroing hardware the zeroing actions are moved to idle system time (load < /proc/sys/vm/scrub_load). It's shifting the cpu load.

But I just fixed things up so that the kernel can return hot zeroed pages to the pool for quicklist management. This yields zeroed pages without kscrubd.
Re: [ACPI] IDE failure on ACPI resume
On Thu, 2005-03-17 at 12:34 -0800, Nate Lawson wrote:

> Very interesting. I was hoping to someday have _GTF et al implemented
> but the ATA knowledge required was above my head. I also strongly
> suspected that the info published by _GTF would likely be invalid. Does
> Windows actually use that method or just hardcoded ATA initialization?

I believe that Windows does use the _GTF methods.

--
Matthew Garrett | [EMAIL PROTECTED]
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Andrew Morton wrote:

> Christoph Lameter <[EMAIL PROTECTED]> wrote:
> >
> > > And given that we have separate buddy structures for zeroed and not-zeroed
> > > pages, why is this tagging needed at all?
> >
> > Because the buddy pointers may point to a page of the different kind. Then
> > a merge is not possible.
>
> In that case I still don't understand, sorry.
>
> If each zone has two buddy lists, one for zeroed and one for not-zeroed,
> how can we ever get known-to-be-zeroed pages on the not-known-to-be-zeroed
> list or vice versa?

The buddy is calculated based on the position in the page struct array, not based on the list.

> >
> > #define __free_page(page) __free_pages((page), 0)
> > #define free_page(addr) free_pages((addr),0)
> >
> > This is what you want right?
>
> Well, it was more a question that a request. If we do this, does it speed
> anything up?

It will be able to manage the quicklist effectively and you can avoid having to zero a page for pte/pmd/pud/pgds.

The main benefit from prezeroing is gained for programs that do numerical calculations based on sparse matrices or other extremely large programs that typically also come with large sparse arrays. The optimization is typical for operating systems in that area (even M$ does that...).
Re: [PATCH] Prezeroing V8
Christoph Lameter <[EMAIL PROTECTED]> wrote: > > On Thu, 17 Mar 2005, Andrew Morton wrote: > > > > > It's hard to know what to think about this without benchmarking > numbers. > > http://oss.sgi.com/projects/page_fault_performance/ Oh no, not that page again ;) Seems to say that prezeroing makes negligible difference to kernel builds, but speeds up a big malloc+memset by 3x to 4x, yes? Are there any real-worldish workloads which show an appreciable benefit? The large speedup for a big memset seems odd - I assume it's simply transferring CPU load from the user's process over to kscrubd. Or is it the fancy page-zeroing hardware? How do we differentiate the two? Are there any workloads which are seeing a benefit on a CPU which doesn't have the zeroing hardware? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Where is a reference for ioctl32() usage?
Thanks for all the help in the past, and I'm once again knocking at your door for more help. I am trying to get my PCI bus device driver running on a Xeon 64-bit FC-3 distribution. I got the compiler warnings all cleaned up, the driver compiles and loads, but the test executable which was compiled on a 32-bit FC-3 distribution is causing these messages in /var/log/messages:
Mar 17 15:42:55 noble kernel: ioctl32(boardtest:3730): Unknown cmd fd(3) cmd(8004440e){00} arg(d824) on /dev/sse0
Mar 17 15:42:55 noble kernel: ioctl32(boardtest:3730): Unknown cmd fd(3) cmd(8004440e){00} arg(d8c4) on /dev/sse0
Mar 17 15:42:55 noble kernel: ioctl32(boardtest:3730): Unknown cmd fd(3) cmd(40044414){00} arg() on /dev/sse0
Mar 17 15:42:55 noble kernel: ioctl32(boardtest:3730): Unknown cmd fd(3) cmd(80044403){00} arg(0804f780) on /dev/sse0
It's probably a simple thing to change my ioctl() interface in the driver, but I googled myself blue in the face, and I didn't find it, so I come to you, hat-in-hand for help. Where can I find out how to change my driver so I can have a 32-bit executable talk to it using ioctl()? I did change the "type" argument in _IOR and _IOW to uint32_t from int, but that didn't change things. -Alan -- - Alan Kilian
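For readers following along, the command values in those log lines can be taken apart by hand. The sketch below assumes the generic bit layout used on x86 (8 bits of number, 8 bits of type, 14 bits of size, 2 bits of direction); the struct and function names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout (generic/x86): bits 0-7 nr, 8-15 type,
 * 16-29 size, 30-31 direction. Other architectures may differ. */
struct ioctl_fields {
    unsigned dir;   /* 1 = write (_IOW), 2 = read (_IOR), 3 = both */
    unsigned size;  /* sizeof() of the type passed to _IOR/_IOW */
    unsigned type;  /* the "magic" character */
    unsigned nr;    /* command number within that magic */
};

/* Hypothetical helper: split a 32-bit ioctl command into its fields. */
static struct ioctl_fields decode_ioctl(uint32_t cmd)
{
    struct ioctl_fields f;

    f.nr   = cmd & 0xffu;
    f.type = (cmd >> 8) & 0xffu;
    f.size = (cmd >> 16) & 0x3fffu;
    f.dir  = (cmd >> 30) & 0x3u;
    return f;
}

/* Hypothetical helper: print one decoded command. */
static void print_ioctl(uint32_t cmd)
{
    struct ioctl_fields f = decode_ioctl(cmd);

    printf("cmd %08x: dir=%u size=%u type='%c' nr=0x%02x\n",
           (unsigned)cmd, f.dir, f.size, (int)f.type, f.nr);
}
```

Under that assumed layout, 8004440e decodes as a read with a 4-byte argument, magic 'D', number 0x0e. Since int and uint32_t are both 4 bytes, swapping one for the other in _IOR/_IOW would leave the command value unchanged, which is consistent with the change making no difference.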
Re: [PATCH] Prezeroing V8
Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > And given that we have separate buddy structures for zeroed and not-zeroed > > pages, why is this tagging needed at all? > > Because the buddy pointers may point to a page of the different kind. Then > a merge is not possible. In that case I still don't understand, sorry. If each zone has two buddy lists, one for zeroed and one for not-zeroed, how can we ever get known-to-be-zeroed pages on the not-known-to-be-zeroed list or vice versa? > > These are all design decisions which have been made, but they're not > > communicated either in the patch description or in code comments. It's to > > everyone's advantage to fix that, no? > > Of course. Try to do this ASAP. Testing a patch that defines the > following: > > Index: linux-2.6.11/include/linux/gfp.h > === > --- linux-2.6.11.orig/include/linux/gfp.h 2005-03-01 > 23:37:50.0 -0800 > +++ linux-2.6.11/include/linux/gfp.h2005-03-17 14:59:06.0 > -0800 > @@ -125,6 +125,8 @@ extern void FASTCALL(__free_pages(struct > extern void FASTCALL(free_pages(unsigned long addr, unsigned int order)); > extern void FASTCALL(free_hot_page(struct page *page)); > extern void FASTCALL(free_cold_page(struct page *page)); > +extern void FASTCALL(free_hot_zeroed_page(struct page *page)); > +extern void FASTCALL(free_cold_zeroed_page(struct page *page)); > > #define __free_page(page) __free_pages((page), 0) > #define free_page(addr) free_pages((addr),0) > > This is what you want right? Well, it was more a question than a request. If we do this, does it speed anything up?
Re: linux: detect application crash
Allison wrote: Hi, Several times when I worked with Windows, I have had a scenario when I am editing a file and saved some time ago and then the application crashes and I lose all recent data. Can the operating system detect all application crashes ? If so, why can't the OS save the user data to disk before the application quits ? How does this work in Linux. I was curious if such a functionality already exists in Linux. If not, what are the issues involved in implementing this functionality. The OS doesn't have enough information to be able to save the app's data in the event of a crash in a form that would be usable or meaningful, since only the app knows what format its data structures are in. The app itself could do this (installing a signal handler for segfaults, etc.) but the problem is that whatever caused the program to crash may have also left its data in a messed-up state. -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [1/12]: remove too-low cap on minor number
I've applied 11 of these 12 patches (the one from Randy was already included) to my trees. thanks, greg k-h
[PATCH] avoid signed vs unsigned comparison in efi_range_is_wc()
This little function in include/linux/efi.h : static inline int efi_range_is_wc(unsigned long start, unsigned long len) { int i; for (i = 0; i < len; i += (1UL << EFI_PAGE_SHIFT)) { unsigned long paddr = __pa(start + i); if (!(efi_mem_attributes(paddr) & EFI_MEMORY_WC)) return 0; } /* The range checked out */ return 1; } generates this warning when building with gcc -W : include/linux/efi.h: In function `efi_range_is_wc': include/linux/efi.h:320: warning: comparison between signed and unsigned It looks to me like a significantly large 'len' passed in could cause the loop to never end. Isn't it safer to make 'i' an unsigned long as well? Like this little patch below (which of course also kills the warning) : Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]> diff -up linux-2.6.11-mm4-orig/include/linux/efi.h linux-2.6.11-mm4/include/linux/efi.h --- linux-2.6.11-mm4-orig/include/linux/efi.h 2005-03-16 15:45:35.0 +0100 +++ linux-2.6.11-mm4/include/linux/efi.h2005-03-18 00:34:36.0 +0100 @@ -315,7 +315,7 @@ extern struct efi_memory_map memmap; */ static inline int efi_range_is_wc(unsigned long start, unsigned long len) { - int i; + unsigned long i; for (i = 0; i < len; i += (1UL << EFI_PAGE_SHIFT)) { unsigned long paddr = __pa(start + i); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
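The comparison the warning points at can be shown in isolation. This is only a sketch: the helper names are hypothetical, and the shift of 12 is assumed to match EFI_PAGE_SHIFT. It shows what the usual arithmetic conversions do to a negative int before it is compared with an unsigned long, and the same page-stepping loop written with an unsigned long counter as in the patch.

```c
#include <limits.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed equal to EFI_PAGE_SHIFT */

/*
 * What "i < len" compares when i is an int and len is an unsigned long:
 * i is first converted to unsigned long, so a negative i turns into a
 * very large value. The cast here makes that implicit step explicit.
 */
static unsigned long int_as_compared(int i)
{
    return (unsigned long)i;
}

/*
 * The loop shape from efi_range_is_wc() with an unsigned long counter:
 * returns how many page-sized steps are taken for a given len.
 */
static unsigned long count_steps(unsigned long len)
{
    unsigned long i, steps = 0;

    for (i = 0; i < len; i += (1UL << SKETCH_PAGE_SHIFT))
        steps++;
    return steps;
}
```

With an int counter, what happens once the counter passes INT_MAX depends on an implementation-defined conversion back to int, so the loop may not walk the whole range; an unsigned long counter sidesteps that question for any len the function is likely to see.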
8250 - sparse error fixes
Ensure __iomem on the correct bits of the serial_struct and other definitions. Remove the attempts to size zero length arrays, which causes problems from sparse. Signed-off-by: Ben Dooks <[EMAIL PROTECTED]> diff -urN -X ../dontdiff linux-2.6.11.3-bk3/include/linux/serial.h linux-2.6.11.3-bk3-fix1/include/linux/serial.h --- linux-2.6.11.3-bk3/include/linux/serial.h 2005-03-02 07:37:50.0 + +++ linux-2.6.11.3-bk3-fix1/include/linux/serial.h 2005-03-17 23:08:53.0 + @@ -45,7 +45,7 @@ int hub6; unsigned short closing_wait; /* time to wait before closing */ unsigned short closing_wait2; /* no longer used... */ - unsigned char *iomem_base; + unsigned char __iomem *iomem_base; unsigned short iomem_reg_shift; unsigned intport_high; unsigned long iomap_base; /* cookie passed into ioremap */ diff -urN -X ../dontdiff linux-2.6.11.3-bk3/drivers/serial/8250.c linux-2.6.11.3-bk3-fix1/drivers/serial/8250.c --- linux-2.6.11.3-bk3/drivers/serial/8250.c2005-03-17 22:58:47.0 + +++ linux-2.6.11.3-bk3-fix1/drivers/serial/8250.c 2005-03-17 23:23:39.0 + @@ -111,15 +111,41 @@ * standard enumeration mechanism. Platforms that can find all * serial ports via mechanisms like ACPI or PCI need not supply it. 
*/ -#ifndef SERIAL_PORT_DFNS -#define SERIAL_PORT_DFNS -#endif +#ifdef SERIAL_PORT_DFNS static struct old_serial_port old_serial_port[] = { SERIAL_PORT_DFNS /* defined in asm/serial.h */ }; +static inline void __init serial8240_isa_init_asmdefs(void) +{ + struct uart_8250_port *up; + int i; + + for (i = 0, up = serial8250_ports; i < ARRAY_SIZE(old_serial_port); +i++, up++) { + up->port.iobase = old_serial_port[i].port; + up->port.irq = irq_canonicalize(old_serial_port[i].irq); + up->port.uartclk = old_serial_port[i].baud_base * 16; + up->port.flags= old_serial_port[i].flags; + up->port.hub6 = old_serial_port[i].hub6; + up->port.membase = old_serial_port[i].iomem_base; + up->port.iotype = old_serial_port[i].io_type; + up->port.regshift = old_serial_port[i].iomem_reg_shift; + if (share_irqs) + up->port.flags |= UPF_SHARE_IRQ; + } +} + #define UART_NR (ARRAY_SIZE(old_serial_port) + CONFIG_SERIAL_8250_NR_UARTS) +#else + +#define UART_NR (CONFIG_SERIAL_8250_NR_UARTS) + +static inline void __init serial8240_isa_init_asmdefs(void) +{ +} +#endif #ifdef CONFIG_SERIAL_8250_RSA @@ -2021,9 +2047,9 @@ return; first = 0; - for (i = 0; i < UART_NR; i++) { - struct uart_8250_port *up = &serial8250_ports[i]; + up = &serial8250_ports[0]; + for (i = 0; i < UART_NR; i++, up++) { up->port.line = i; spin_lock_init(&up->port.lock); @@ -2039,19 +2065,7 @@ up->port.ops = &serial8250_pops; } - for (i = 0, up = serial8250_ports; i < ARRAY_SIZE(old_serial_port); -i++, up++) { - up->port.iobase = old_serial_port[i].port; - up->port.irq = irq_canonicalize(old_serial_port[i].irq); - up->port.uartclk = old_serial_port[i].baud_base * 16; - up->port.flags= old_serial_port[i].flags; - up->port.hub6 = old_serial_port[i].hub6; - up->port.membase = old_serial_port[i].iomem_base; - up->port.iotype = old_serial_port[i].io_type; - up->port.regshift = old_serial_port[i].iomem_reg_shift; - if (share_irqs) - up->port.flags |= UPF_SHARE_IRQ; - } + serial8240_isa_init_asmdefs(); } static void __init diff -urN -X ../dontdiff 
linux-2.6.11.3-bk3/drivers/serial/8250.h linux-2.6.11.3-bk3-fix1/drivers/serial/8250.h --- linux-2.6.11.3-bk3/drivers/serial/8250.h2005-03-02 07:37:30.0 + +++ linux-2.6.11.3-bk3-fix1/drivers/serial/8250.h 2005-03-17 23:07:10.0 + @@ -30,7 +30,7 @@ unsigned int flags; unsigned char hub6; unsigned char io_type; - unsigned char *iomem_base; + unsigned char __iomem *iomem_base; unsigned short iomem_reg_shift; }; diff -urN -X ../dontdiff linux-2.6.11.3-bk3/drivers/serial/serial_core.c linux-2.6.11.3-bk3-fix1/drivers/serial/serial_core.c --- linux-2.6.11.3-bk3/drivers/serial/serial_core.c 2005-03-02 07:37:50.0 + +++ linux-2.6.11.3-bk3-fix1/drivers/serial/serial_core.c2005-03-17 23:09:36.0 + @@ -592,7 +592,7 @@ tmp.hub6= port->hub6; tmp.io_type = port->iotype; tmp.iomem_reg_shift = port->regshift; - tmp.iomem_base = (void *)port->mapbase; + tmp.iomem_base = (void __iomem *)port->mapbase; if
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Andrew Morton wrote: > > > It's hard to know what to think about this without benchmarking numbers. http://oss.sgi.com/projects/page_fault_performance/
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Andrew Morton wrote: > OK, so we're splitting each zone's buddy structure into two: one for zeroed > pages and one for not-zeroed pages, yes? Right. > It's not obvious what the page->private of freed pages are being used for. > Please comment that. Ok. > What's all this (zero << 10) stuff? > > + page->private = order + (zero << 10); > + (page_zorder(page) == order + (zero << 10)) && > > Doesn't this explode if we already have order-1024 pages in there? I guess > that's a reasonable restriction, but where did the "10" come from? > Non-obvious, needs commenting. Yes it will fail if we have pages of the size of 2^1036. > And given that we have separate buddy structures for zeroed and not-zeroed > pages, why is this tagging needed at all? Because the buddy pointers may point to a page of the different kind. Then a merge is not possible. > These are all design decisions which have been made, but they're not > communicated either in the patch description or in code comments. It's to > everyone's advantage to fix that, no? Of course. Try to do this ASAP. Testing a patch that defines the following: Index: linux-2.6.11/include/linux/gfp.h === --- linux-2.6.11.orig/include/linux/gfp.h 2005-03-01 23:37:50.0 -0800 +++ linux-2.6.11/include/linux/gfp.h2005-03-17 14:59:06.0 -0800 @@ -125,6 +125,8 @@ extern void FASTCALL(__free_pages(struct extern void FASTCALL(free_pages(unsigned long addr, unsigned int order)); extern void FASTCALL(free_hot_page(struct page *page)); extern void FASTCALL(free_cold_page(struct page *page)); +extern void FASTCALL(free_hot_zeroed_page(struct page *page)); +extern void FASTCALL(free_cold_zeroed_page(struct page *page)); #define __free_page(page) __free_pages((page), 0) #define free_page(addr) free_pages((addr),0) This is what you want right? 
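The "order + (zero << 10)" tagging discussed in this exchange can be sketched outside the kernel. The helper names below are hypothetical and only model the arithmetic from the quoted patch lines; they are not the patch's own functions.

```c
#define ZERO_TAG_SHIFT 10  /* the "10" from the quoted patch lines */

/* Combine a buddy order and a zeroed flag into one word, as the patch does. */
static unsigned long pack_private(unsigned int order, int zero)
{
    return order + ((unsigned long)(zero != 0) << ZERO_TAG_SHIFT);
}

/* Recover the order, assuming it fits in the low 10 bits. */
static unsigned int unpack_order(unsigned long priv)
{
    return (unsigned int)(priv & ((1UL << ZERO_TAG_SHIFT) - 1));
}

/* Recover the zeroed flag from bit 10. */
static int unpack_zero(unsigned long priv)
{
    return (int)((priv >> ZERO_TAG_SHIFT) & 1UL);
}
```

In this model the encoding is only unambiguous while the order stays below 1024: an order of 1024 with the flag clear yields the same word as order 0 with the flag set, which is the collision the question about order-1024 pages is getting at.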
[PATCH 1/2] PCI-PCI transparent bridge handling improvements (pci core)
"Transparent" PCI-PCI bridges are currently "ignored" by the resource management code in the PCI core. This means devices behind the bridge are handled as if there was no bridge. However, it seems more suitable -- and it seems to allow for proper "prefetch"-type memory handling, too -- to handle a transparent PCI-PCI bridge like any other PCI-PCI bridge, and to only break out of the limits set by the bridge windows if the resource allocation code determines it needs to do so. The tricky part is in pci_find_parent_resource(). There are two types of functions calling it: some functions already know the exact resource for which they want to find the parent in order to properly insert it into the resource database. This can be handled easily -- if the resource is inside the bridge window, this is returned; if it isn't, the bridge's parent resource is returned. However, two callers (yenta_socket and i2o) intend something different: they call pci_find_parent_resource() with an empty resource and want to find out the biggest valid resource of the proper type in order to analyze it and adapt their own hunger for resources to it. To keep this behaviour backwards-compatible, we always need to not limit it to the bridge window resources, but get back to the parent bus. This patch is a modified and (hopefully) improved derivation of Linus' "pcmcia-bridge-resource-management-fix.patch" included in 2.6.11-rc4-mm1. 
Signed-off-by: Dominik Brodowski <[EMAIL PROTECTED]> Index: 2.6.11++/drivers/pci/bus.c === --- 2.6.11++.orig/drivers/pci/bus.c 2005-03-17 00:39:00.0 +0100 +++ 2.6.11++/drivers/pci/bus.c 2005-03-17 00:39:24.0 +0100 @@ -18,22 +18,12 @@ #include "pci.h" /** - * pci_bus_alloc_resource - allocate a resource from a parent bus - * @bus: PCI bus - * @res: resource to allocate - * @size: size of resource to allocate - * @align: alignment of resource to allocate - * @min: minimum /proc/iomem address to allocate - * @type_mask: IORESOURCE_* type flags - * @alignf: resource alignment function - * @alignf_data: data argument for resource alignment function + * pci_one_bus_alloc_resource - allocate a resource from one specific bus * - * Given the PCI bus a device resides on, the size, minimum address, - * alignment and type, try to find an acceptable resource allocation - * for a specific device resource. + * Always use pci_bus_alloc_resource() described below. */ -int -pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res, +static int +pci_one_bus_alloc_resource(struct pci_bus *bus, struct resource *res, unsigned long size, unsigned long align, unsigned long min, unsigned int type_mask, void (*alignf)(void *, struct resource *, @@ -69,6 +59,48 @@ } /** + * pci_bus_alloc_resource - allocate a resource from a parent bus + * @bus: PCI bus + * @res: resource to allocate + * @size: size of resource to allocate + * @align: alignment of resource to allocate + * @min: minimum /proc/iomem address to allocate + * @type_mask: IORESOURCE_* type flags + * @alignf: resource alignment function + * @alignf_data: data argument for resource alignment function + * + * Given the PCI bus a device resides on, the size, minimum address, + * alignment and type, try to find an acceptable resource allocation + * for a specific device resource. 
+ */ +int +pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res, + unsigned long size, unsigned long align, unsigned long min, + unsigned int type_mask, + void (*alignf)(void *, struct resource *, + unsigned long, unsigned long), + void *alignf_data) +{ + int ret = pci_one_bus_alloc_resource(bus, res, size, align, min, + type_mask, alignf, alignf_data); + + /* +* If allocation from the resources available to this bus failed, +* and there is a transparent parent PCI-PCI bridge, we can check +* the resources of the parent bus as well +*/ + while (ret && bus->self && bus->self->transparent) { + bus = bus->self->bus; + if (!bus) + return ret; + ret = pci_one_bus_alloc_resource(bus, res, size, align, min, + type_mask, alignf, alignf_data); + } + return ret; +} + + +/** * add a single device * @dev: device to add * Index: 2.6.11++/drivers/pci/pci.c === --- 2.6.11++.orig/drivers/pci/pci.c 2005-03-17 00:39:00.0 +0100 +++ 2.6.11++/drivers/pci/pci.c 2005-03-17 01:12:18.0 +0100 @@ -195,18 +195,13 @@ } /** - * pci_find_parent_resource - return resource region of parent bus of given region - * @dev: PCI device structure contains resources to be searched - * @res: child resource record for which parent is sought + *
[PATCH 2/2] PCI-PCI transparent bridge handling improvements (yenta_socket)
As a follow-up, we can make yenta_socket try harder to limit itself to the parent bridge windows. This is done by lowering the PCIBIOS_MIN_CARDBUS_IO and by updating yenta_allocate_res(). It now tries at first to get resources within the bridge windows, and if they are large enough (>=BRIDGE_{IO,MEM}_ACC), these are used. If no or only too small resources were found, it falls back to the resources behind the parent PCI bridge if this is "transparent". Using this patch may result in such "funny" /proc/ioports as: 2800-28ff : PCI CardBus #07 3000-3fff : PCI Bus #02 3000-303f : :02:08.0 3000-303f : e100 3400-34ff : PCI CardBus #03 3800-38ff : PCI CardBus #03 3c00-3cff : PCI CardBus #07 There weren't enough properly aligned ports available inside PCI Bus #02 to stuff all four (2x2) IO windows into it, so one was taken outside the transparent PCI bridge ioport window. Signed-off-by: Dominik Brodowski <[EMAIL PROTECTED]> Index: 2.6.11++/drivers/pcmcia/yenta_socket.c === --- 2.6.11++.orig/drivers/pcmcia/yenta_socket.c 2005-03-17 23:13:58.0 +0100 +++ 2.6.11++/drivers/pcmcia/yenta_socket.c 2005-03-17 23:40:38.0 +0100 @@ -518,19 +518,23 @@ * Use an adaptive allocation for the memory resource, * sometimes the memory behind pci bridges is limited: * 1/8 of the size of the io window of the parent. - * max 4 MB, min 16 kB. + * max 4 MB, min 16 kB. We try very hard to not get + * below the "ACC" values, though. 
*/ #define BRIDGE_MEM_MAX 4*1024*1024 +#define BRIDGE_MEM_ACC 128*1024 #define BRIDGE_MEM_MIN 16*1024 #define BRIDGE_IO_MAX 256 +#define BRIDGE_IO_ACC 256 #define BRIDGE_IO_MIN 32 #ifndef PCIBIOS_MIN_CARDBUS_IO #define PCIBIOS_MIN_CARDBUS_IO PCIBIOS_MIN_IO #endif -static void yenta_allocate_res(struct yenta_socket *socket, int nr, unsigned type) +static int yenta_try_allocate_res(struct yenta_socket *socket, int nr, + unsigned int type, unsigned int run) { struct pci_bus *bus; struct resource *root, *res; @@ -550,11 +554,11 @@ res->name = bus->name; res->flags = type; res->start = 0; - res->end = 0; + res->end = run; root = pci_find_parent_resource(socket->dev, res); if (!root) - return; + return -ENODEV; start = config_readl(socket, offset) & mask; end = config_readl(socket, offset+4) | ~mask; @@ -562,7 +566,8 @@ res->start = start; res->end = end; if (request_resource(root, res) == 0) - return; + return 0; + printk(KERN_INFO "yenta %s: Preassigned resource %d busy, reconfiguring...\n", pci_name(socket->dev), nr); res->start = res->end = 0; @@ -571,12 +576,12 @@ if (type & IORESOURCE_IO) { align = 1024; size = BRIDGE_IO_MAX; - min = BRIDGE_IO_MIN; + min = run ? BRIDGE_IO_ACC : BRIDGE_IO_MIN; start = PCIBIOS_MIN_CARDBUS_IO; end = ~0U; } else { unsigned long avail = root->end - root->start; - int i; + u32 i; size = BRIDGE_MEM_MAX; if (size > avail/8) { size=(avail+1)/8; @@ -586,26 +591,36 @@ i++; size = 1 << i; } - if (size < BRIDGE_MEM_MIN) - size = BRIDGE_MEM_MIN; + i = run ? 
BRIDGE_MEM_ACC : BRIDGE_MEM_MIN; + if (size < i) + size = i; min = BRIDGE_MEM_MIN; align = size; start = PCIBIOS_MIN_MEM; end = ~0U; } - + do { if (allocate_resource(root, res, size, start, end, align, NULL, NULL)==0) { config_writel(socket, offset, res->start); config_writel(socket, offset+4, res->end); - return; + return 0; } size = size/2; align = size; } while (size >= min); + + return -ENODEV; +} + +static void yenta_allocate_res(struct yenta_socket *socket, int nr, unsigned type) +{ + if (!(yenta_try_allocate_res(socket, nr, type, 1)) || + !(yenta_try_allocate_res(socket, nr, type, 0))) + return; + printk(KERN_INFO "yenta %s: no resource of type %x available, trying to continue...\n", pci_name(socket->dev), type); - res->start = res->end = 0; } /* @@ -616,7 +631,7 @@ yenta_allocate_res(socket, 0, IORESOURCE_MEM|IORESOURCE_PREFETCH); yenta_allocate_res(socket, 1, IORESOURCE_MEM);
Re: [PATCH] pci_ids.h correction for Intel ICH7M - 2.6.11
On Fri, Mar 04, 2005 at 06:04:43PM -0800, Jason Gaston wrote: > This patch corrects the ICH7M LPC controller DID in pci_ids.h from > x27B1 to x27B9. This patch was built against 2.6.11. > If acceptable, please apply. Applied, thanks. greg k-h
Re: [PATCH] Prezeroing V8
Christoph Lameter <[EMAIL PROTECTED]> wrote: > > On Thu, 17 Mar 2005, Andrew Morton wrote: > > > Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > > > > Adds management of ZEROED and NOT_ZEROED pages and a background daemon > > > called scrubd. /proc/sys/vm/scrubd_load, /proc/sys/vm_scrubd_start and > > > /proc/sys/vm_scrubd_stop control the scrub daemon. See Documentation/vm/ > > > scrubd.txt > > > > It's hard to know what to think about this without benchmarking numbers. > > > > It would help if you could briefly describe the implementation and design > > decisions when sending patches. > > Oh. This was discussed so many times that I thought it would not be > necessary anymore. The discussion is attached. Add it to the changelog and maintain it, please. It never hurts. But that only describes why we want the feature, which is nice. It's also useful to explain how the feature works. Although my preference there is that this be done within code comments if at all appropriate. OK, so we're splitting each zone's buddy structure into two: one for zeroed pages and one for not-zeroed pages, yes? It's not obvious what the page->private of freed pages is being used for. Please comment that. What's all this (zero << 10) stuff? + page->private = order + (zero << 10); + (page_zorder(page) == order + (zero << 10)) && Doesn't this explode if we already have order-1024 pages in there? I guess that's a reasonable restriction, but where did the "10" come from? Non-obvious, needs commenting. And given that we have separate buddy structures for zeroed and not-zeroed pages, why is this tagging needed at all? These are all design decisions which have been made, but they're not communicated either in the patch description or in code comments. It's to everyone's advantage to fix that, no? 
Re: [openib-general] [PATCH] Add PCI device ID for new Mellanox HCA
On Tue, Mar 01, 2005 at 08:42:47AM -0800, Roland Dreier wrote: > Hi Greg, > > It turns out that Mellanox decided to change the device ID at the last > minute. So of course there will be parts with both IDs. Here's an > updated patch that includes both IDs. Please use this instead. Applied, thanks. greg k-h
Re: [PATCH] add TIMEOUT to firmware_class hotplug event
On Thu, Mar 17, 2005 at 12:07:55PM +0100, Kay Sievers wrote: > On Wed, 2005-03-16 at 21:46 -0800, Greg KH wrote: > > On Thu, Mar 17, 2005 at 03:34:31AM +0100, Kay Sievers wrote: > > > On Tue, 2005-03-15 at 09:25 +0100, Hannes Reinecke wrote: > > > > The current implementation of the firmware class breaks a fundamental > > > > assumption in udevd: that the physical device can be initialised fully > > > > prior to executing the next event for that device. > > > > > > Here we add a TIMEOUT value to the hotplug environment of the firmware > > > requesting event. I will adapt udevd not to wait for anything else, if > > > it finds a TIMEOUT key. > > > > Can't you just trigger off of the FIRMWARE variable instead? > > Sure, that will work too. I just thought it would be nice to give > userspace a hint about the event behavior the kernel expects, instead of > adding an exception to the udevd event management? Hm, so by adding the TIMEOUT value, we are telling userspace that we better act on this operation soon, right? That's a special case too :) Anyway, sure, this is fine, I'll go add this to the driver-bk tree. thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Nish Aravamudan wrote: > > + if (system_state != SYSTEM_RUNNING) > > + return; > > + > > +while (avenrun[0] >= ((unsigned long)sysctl_scrub_load << FSHIFT)) > > + schedule_timeout(30*HZ); > > This is a busy-loop, unless you set the state before you call > schedule_timeout(). Additionally, you really want to sleep 30 seconds Ahh. Missed that thanks. > at a time? Please use msleep() or msleep_interruptible(), unless you > expect wait-queue events. I want to sleep 30 seconds because the system load is unlikely to change frequently. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005 13:43:47 -0800 (PST), Christoph Lameter <[EMAIL PROTECTED]> wrote: > Changelog: > - Drop clear_pages and the approach to zero pages of higher order > first > - Zero a percentage of pages from all orders to avoid fragmentation > > Adds management of ZEROED and NOT_ZEROED pages and a background daemon > called scrubd. /proc/sys/vm/scrubd_load, /proc/sys/vm_scrubd_start and > /proc/sys/vm_scrubd_stop control the scrub daemon. See Documentation/vm/ > scrubd.txt > > In an SMP environment the scrub daemon is typically running on the most > idle cpu. Thus a single threaded application running > on one cpu may have the other cpu zeroing pages for it etc. The scrub > daemon is hardly noticable and usually finishes zeroing quickly since > most processors are optimized for linear memory filling. > > Patch against 2.6.11.3-bk3 > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> > > Index: linux-2.6.11/mm/scrubd.c > === > --- /dev/null 1970-01-01 00:00:00.0 + > +++ linux-2.6.11/mm/scrubd.c2005-03-17 13:12:23.0 -0800 > +/* > + * scrub_pgdat() will work across all this node's zones. > + */ > +static void scrub_pgdat(pg_data_t *pgdat) > +{ > + int i; > + > + if (system_state != SYSTEM_RUNNING) > + return; > + > +while (avenrun[0] >= ((unsigned long)sysctl_scrub_load << FSHIFT)) > + schedule_timeout(30*HZ); This is a busy-loop, unless you set the state before you call schedule_timeout(). Additionally, you really want to sleep 30 seconds at a time? Please use msleep() or msleep_interruptible(), unless you expect wait-queue events. Thanks, Nish - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
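As a side note on the loop condition quoted above: the comparison works on fixed-point load averages. The sketch below models only that threshold test in user space, assuming the kernel's usual FSHIFT of 11; the function names are hypothetical, and it does not model the sleeping behaviour that the review comment is about.

```c
#define FSHIFT  11               /* assumed: bits of fractional precision */
#define FIXED_1 (1UL << FSHIFT)  /* 1.0 in fixed point */

/* Hypothetical helper: build a fixed-point load such as 2.25 from (2, 25). */
static unsigned long load_to_fixed(unsigned int whole, unsigned int hundredths)
{
    return whole * FIXED_1 + (hundredths * FIXED_1) / 100;
}

/*
 * Mirrors the quoted condition
 *   avenrun[0] >= ((unsigned long)sysctl_scrub_load << FSHIFT)
 * with the two kernel variables passed in as plain parameters.
 */
static int load_too_high(unsigned long avenrun0, unsigned int scrub_load)
{
    return avenrun0 >= ((unsigned long)scrub_load << FSHIFT);
}
```

Read this way, a scrub_load setting of 1 keeps the daemon waiting for as long as the one-minute load average is at or above 1.00.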
RE: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC] PCIErrorRecovery)
> On a fatal error the interface is down. No matter what the driver > supports (AER aware, EEH aware, unaware) all IO is likely to fail. > Resetting a bus in a point-to-point environment like PCI Express or EEH > (as you describe) should have little adverse effect. The risk is the > bus reset will cause a card reset and the driver must understand to > re-initialize the card. A link reset in PCI Express will not cause a > card reset. We assume the driver will reset its card if necessary. Does the link side of PCIE provide a way to trigger a hard reset of the rest of the card? If not, then it's dodgy as there may be no way to consistently "reset" the card if it's in a bad state. I have to double check, but I suspect that IBM's implementation of EEH-compliant PCIE will add a full hard reset, not just a link reset.
Re: KGDB question
On Thu, Mar 17, 2005 at 02:29:58PM -0800, Andrew Morton wrote: > Jesse Barnes <[EMAIL PROTECTED]> wrote: > > > > > kgdb patches are maintained in -mm kernels. > > > > > > Patches are in > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11 > > >-mm1/broken-out/*kgdb* > > > > > > And the patch application order is described in > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11 > > >-mm1/patch-series - > > > > What's the latest status on these? Last I heard, some cleanup was going to > > happen to make kgdb suitable for the mainline, did that ever happen? > > It part-happened, then the effort seemed to die. > > > Also, > > it would be nice if I could connect to a remote kernel running the kgdb > > stubs > > w/o having to run gdb on the same ethernet segment. Would that be > > difficult > > to fix? > > > > Maybe we'd have to teach kgdboe to arp for the remote debug host. I think > Matt was talking about that a while back. > > > > If switches send the destination MAC address through unchanged then maybe > the problem is that the switch simply doesn't know the MAC address of the > remote debug host yet? If the switch has its own MAC address (it doesn't, > does it), or if it's actually a router then perhaps you should specify the > router's MAC address and not the remote debug host's. I haven't tried this, but I believe you need to set up kgdboe's destination MAC address as the MAC of the next IP hop. Switches should be invisible to kgdboe. -- Mathematics is the supreme nostalgia of our time. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: binfmt_elf padzero problems
Nir Tzachar <[EMAIL PROTECTED]> wrote:
>
> hello.
>
> i am seeing a problem(?) with the patch described at:
> http://marc.theaimsgroup.com/?l=linux-kernel=109865760703851=2
> i'm using vanilla 2.6.11 (not .1/.2/.3/.4 ...)
>
> the short version:
> padzero does not always do the right thing (more correctly, its caller,
> load_elf_binary).
>
> the longer version:
>
> padzero calls clear_user. clear_user first checks if the address passed
> is writable. if it is not, an error is returned.
> the problem manifests itself when the area being cleared is not
> writable... this should not normally happen in the context of
> load_elf_binary, however it _can_ happen with the following assembly
> code (intel syntax):
>
> section .text
>       global _start
> _start:
>       mov eax,0x1
>       mov ebx,0x0
>       int 0x80
>       hlt
>
> assembled with nasm -f elf, produces a binary with a bss segment of zero
> size, aligned to 1, and one program header.
> now, when calling padzero, elf_bss holds an address which belongs
> to .text (since no (fake) program header for .bss was created), i.e. not
> writable.
> when padzero is called, it tries to clean the rest of the .text section,
> which clearly results in an error.
>
> thus, my (very) small binary always segfaults under 2.6.11+
>
> on the other hand, i can be dead wrong.. if so, i'd like to know why...
>

Tricky. I guess if the bss has zero length then we can skip the zeroing
of the end of the page at the end of bss, as long as we're dead sure that
we didn't accidentally instantiate a single page on behalf of that
zero-length bss.

Something like this, perhaps?

--- 25/fs/binfmt_elf.c~a	Thu Mar 17 14:47:35 2005
+++ 25-akpm/fs/binfmt_elf.c	Thu Mar 17 14:48:44 2005
@@ -907,15 +907,17 @@ static int load_elf_binary(struct linux_
 	 * mapping in the interpreter, to make sure it doesn't wind
 	 * up getting placed where the bss needs to go.
 	 */
-	retval = set_brk(elf_bss, elf_brk);
-	if (retval) {
-		send_sig(SIGKILL, current, 0);
-		goto out_free_dentry;
-	}
-	if (padzero(elf_bss)) {
-		send_sig(SIGSEGV, current, 0);
-		retval = -EFAULT; /* Nobody gets to see this, but.. */
-		goto out_free_dentry;
+	if (likely(elf_bss != elf_brk)) {	/* Is there any bss at all? */
+		retval = set_brk(elf_bss, elf_brk);
+		if (retval) {
+			send_sig(SIGKILL, current, 0);
+			goto out_free_dentry;
+		}
+		if (padzero(elf_bss)) {
+			send_sig(SIGSEGV, current, 0);
+			retval = -EFAULT; /* Nobody gets to see this, but.. */
+			goto out_free_dentry;
+		}
 	}
 
 	if (elf_interpreter) {
_
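For readers following the padzero discussion, here is a minimal user-space sketch of the arithmetic involved. It is not the kernel code: the 4 KiB page size and the names padzero_span() and needs_bss_setup() are assumptions made for illustration. It shows how many bytes would be cleared for a given elf_bss and models the "is there any bss at all?" test from the proposed patch.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, as on i386. */
#define MODEL_PAGE_SIZE 4096UL

/*
 * Number of bytes that would be cleared starting at elf_bss:
 * everything from elf_bss up to the end of the page it sits in,
 * or nothing when elf_bss is already page aligned.
 */
static unsigned long padzero_span(unsigned long elf_bss)
{
    unsigned long offset = elf_bss & (MODEL_PAGE_SIZE - 1);

    return offset ? MODEL_PAGE_SIZE - offset : 0;
}

/* Model of the patched check: only touch the bss when it is non-empty. */
static int needs_bss_setup(unsigned long elf_bss, unsigned long elf_brk)
{
    return elf_bss != elf_brk;
}
```

With a hypothetical elf_bss of 0x08048080 that lies inside a read-only text page, padzero_span() reports 3968 bytes to clear, which is the write that faults; with elf_bss == elf_brk the patched path skips it entirely.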
Re: KGDB question
Jesse Barnes <[EMAIL PROTECTED]> wrote: > > > kgdb patches are maintained in -mm kernels. > > > > Patches are in > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11 > >-mm1/broken-out/*kgdb* > > > > And the patch application order is described in > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11 > >-mm1/patch-series - > > What's the latest status on these? Last I heard, some cleanup was going to > happen to make kgdb suitable for the mainline, did that ever happen? It part-happened, then the effort seemed to die. > Also, > it would be nice if I could connect to a remote kernel running the kgdb stubs > w/o having to run gdb on the same ethernet segment. Would that be difficult > to fix? Maybe we'd have to teach kgdboe to arp for the remote debug host. I think Matt was talking about that a while back. If switches send the destination MAC address through unchanged then maybe the problem is that the switch simply doesn't know the MAC address of the remote debug host yet? If the switch has its own MAC address (it doesn't, does it), or if it's actually a router then perhaps you should specify the router's MAC address and not the remote debug host's. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Prezeroing V8
On Thu, 17 Mar 2005, Andrew Morton wrote:

> Christoph Lameter <[EMAIL PROTECTED]> wrote:
> >
> > Adds management of ZEROED and NOT_ZEROED pages and a background daemon
> > called scrubd. /proc/sys/vm/scrubd_load, /proc/sys/vm_scrubd_start and
> > /proc/sys/vm_scrubd_stop control the scrub daemon. See Documentation/vm/
> > scrubd.txt
>
> It's hard to know what to think about this without benchmarking numbers.
>
> It would help if you could briefly describe the implementation and design
> decisions when sending patches.

Oh. This was discussed so many times that I thought it would not be
necessary anymore. The discussion is attached.

> For example, one area where we could use this is in pagetable management,
> where we need zeroed pages and we tend to free up known-to-be-zero and
> probably cache-warm pages. Right now some architectures are maintaining
> their own quicklists, or using a slab cache, both of which are suboptimal.

Right.

> But afaict the patch doesn't differentiate between cache-cold and cache-hot
> zeroed pages, and doesn't have an API with which clients can free up a
> known-to-be-zero page.

end_zero_page(page, 0) would put a zeroed page back on the zeroed list.
But we may have to define a cleaner API for it. Plus this is a hot zero
page. So I would need to add a hot zero hotlist to the existing cold zero
hotlist.

Description

The most expensive operation in the page fault handler is (apart from SMP
locking overhead) the touching of all cache lines of a page by zeroing the
page. This zeroing means that all cachelines of the faulted page (on Altix
that means all 128 cachelines of 128 bytes each) must be handled and later
written back. This patch makes it possible to avoid touching all cachelines
if only a part of the cachelines of that page is needed immediately after
the fault. Doing so will only be effective for sparsely accessed memory,
which is typical for anonymous memory and pte maps.

The patch makes prezeroing very effective by also allowing the use of
hardware support for offloading zeroing from the cpu. This avoids the
invalidation of the cpu caches by extensive zeroing operations.

The scrub daemon is invoked when the number of zeroed pages falls below a
lower threshold (defined by setting /proc/sys/vm/scrub_start) so that it's
worth running it. kscrubd then zeroes free pages until the upper threshold
is reached (set by /proc/sys/vm/scrub_stop). The zeroing is performed on a
percentage of pages at each order of freed pages.

kscrubd performs short bursts of zeroing when needed and tries to stay out
of the processor as much as possible. kscrubd will only run when the load
is less than the value set in /proc/sys/vm/scrub_load (defaults to 1).

The benefits of prezeroing are reduced to minimal quantities if all
cachelines of a page are touched. Prezeroing can only be effective if the
whole page is not immediately used after the page fault.
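The start/stop/load thresholds described above amount to a hysteresis loop. The following is a small user-space sketch of that decision, not code from the patch; struct scrub_policy and scrub_should_run() are hypothetical names, and the three fields simply mirror the scrub_start, scrub_stop and scrub_load knobs mentioned in the description.

```c
#include <stdbool.h>

/* Hypothetical mirror of the three /proc/sys/vm scrub knobs. */
struct scrub_policy {
    unsigned long start;  /* wake up when zeroed pages fall below this */
    unsigned long stop;   /* stop once zeroed pages reach this */
    unsigned long load;   /* only run while system load is below this */
};

/*
 * Decide whether the scrub daemon should be zeroing right now.
 * 'running' is the previous state, which gives the hysteresis:
 * an idle daemon starts below 'start', a running one continues
 * until 'stop' is reached.
 */
static bool scrub_should_run(const struct scrub_policy *p, bool running,
                             unsigned long zeroed_pages,
                             unsigned long cur_load)
{
    if (cur_load >= p->load)
        return false;
    if (running)
        return zeroed_pages < p->stop;
    return zeroed_pages < p->start;
}
```

Using two thresholds rather than one keeps the daemon from flapping on and off around a single value, which fits the stated goal of short bursts of zeroing.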
Re: vm_dirty_ratio seems a bit large.
> "Andrew" == Andrew Morton <[EMAIL PROTECTED]> writes:

Andrew> Robin Holt <[EMAIL PROTECTED]> wrote:
>> One other issue we have is the vm_dirty_ratio and background_ratio
>> adjustments are a little coarse with these memory sizes. Since our
>> minimum adjustment is 1%, we are adjusting by 40GB on the largest
>> configuration from above. The hardware we are shipping today is
>> capable of going to far greater amounts of memory, but we don't
>> have customers demanding that yet. I would like to plan ahead for
>> that and change vm_dirty_ratio from a straight percent into a
>> millipercent (thousandth of a percent). Would that type of change
>> be acceptable?

Andrew> Oh drat. I think such a change would require a new set of
Andrew> /proc entries.

No, you could just extend them to understand fixed point. Keep printing
integers as integers, print non-integers with one (or two: will we ever
need 0.01% increments?) decimal places.

--
Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT gelato.unsw.edu.au
The technical we do immediately, the political takes *forever*
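A rough user-space sketch of the fixed-point idea: store the ratio in tenths of a percent, accept either "N" or "N.D" on input, and print whole values without a decimal so existing readers see the same integers as before. The function names and the choice of one decimal digit are assumptions for illustration, not a proposed /proc interface. With 1% equal to 40GB as in Robin's example, one tenth of a percent would be a 4GB step.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse "N" or "N.D" (exactly one decimal digit) into tenths of a
 * percent. Returns -1 on malformed input or values above 100%.
 */
static long parse_ratio_tenths(const char *s)
{
    char *end;
    long whole, frac = 0;

    if (*s < '0' || *s > '9')
        return -1;
    whole = strtol(s, &end, 10);
    if (whole > 100)
        return -1;
    if (*end == '.') {
        if (end[1] < '0' || end[1] > '9')
            return -1;
        frac = end[1] - '0';
        end += 2;
    }
    if (*end != '\0' && *end != '\n')
        return -1;
    if (whole == 100 && frac != 0)
        return -1;
    return whole * 10 + frac;
}

/* Print integers as integers, everything else with one decimal place. */
static int format_ratio_tenths(long tenths, char *buf, size_t len)
{
    if (tenths % 10 == 0)
        return snprintf(buf, len, "%ld", tenths / 10);
    return snprintf(buf, len, "%ld.%ld", tenths / 10, tenths % 10);
}
```

Writing "40" and reading it back still yields "40", so scripts that only know integer percentages keep working; only callers that opt in to "12.5" ever see a decimal point.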
Re: [PATCH libata-dev-2.6 04/05] libata: support descriptor sense in ctrl page
04_libata_control_pg_desc_bit.patch

	libata must support the descriptor format sense blocks as they
	are required to properly report results of ATA pass through
	commands as well as other SCSI commands reporting 48b LBAs. This
	patch adjusts the control mode page to properly report this.

Signed-off-by: Brett Russ <[EMAIL PROTECTED]>

 libata-scsi.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletion(-)

Index: libata-dev-2.6/drivers/scsi/libata-scsi.c
===
--- libata-dev-2.6.orig/drivers/scsi/libata-scsi.c	2005-03-17 17:16:58.0 -0500
+++ libata-dev-2.6/drivers/scsi/libata-scsi.c	2005-03-17 17:16:58.0 -0500
@@ -1370,7 +1370,12 @@ static unsigned int ata_msense_caching(u
 
 static unsigned int ata_msense_ctl_mode(u8 **ptr_io, const u8 *last)
 {
-	const u8 page[] = {0xa, 0xa, 2, 0, 0, 0, 0, 0, 0xff, 0xff, 0, 30};
+	const u8 page[] = {0xa, 0xa, 6, 0, 0, 0, 0, 0, 0xff, 0xff, 0, 30};
+
+	/* byte 2: set the descriptor format sense data bit (bit 2)
+	 * since we need to support returning this format for SAT
+	 * commands and any SCSI commands against a 48b LBA device.
+	 */
 
 	ata_msense_push(ptr_io, last, page, sizeof(page));
 	return sizeof(page);
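To make the one-byte change easier to see, here is a small stand-alone sketch that compares the old and new control mode page contents. The arrays are copied from the patch; the macro CTL_MODE_D_SENSE and the helper wants_descriptor_sense() are names invented for this illustration.

```c
#include <stdint.h>

/* Bit 2 of byte 2: descriptor format sense data (name is hypothetical). */
#define CTL_MODE_D_SENSE 0x04

/* Page contents before and after the patch, copied from the diff. */
static const uint8_t ctl_page_old[] =
    {0xa, 0xa, 2, 0, 0, 0, 0, 0, 0xff, 0xff, 0, 30};
static const uint8_t ctl_page_new[] =
    {0xa, 0xa, 6, 0, 0, 0, 0, 0, 0xff, 0xff, 0, 30};

/* Nonzero when the page advertises descriptor format sense data. */
static int wants_descriptor_sense(const uint8_t *page)
{
    return (page[2] & CTL_MODE_D_SENSE) != 0;
}
```

Byte 2 goes from 2 to 6, i.e. the existing bit 1 is kept and bit 2 is added; nothing else in the page changes.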
Re: [PATCH libata-dev-2.6 01/05] libata: AHCI tf_read() support
01_libata_garzik-ahci-tf-read.patch
(included in libata-2.6)

	This is Jeff's tf_read() support patch for AHCI.

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

 ahci.c | 11 +++
 1 files changed, 11 insertions(+)

Index: libata-dev-2.6/drivers/scsi/ahci.c
===
--- libata-dev-2.6.orig/drivers/scsi/ahci.c	2005-03-17 12:36:29.0 -0500
+++ libata-dev-2.6/drivers/scsi/ahci.c	2005-03-17 17:16:57.0 -0500
@@ -179,6 +179,7 @@ static void ahci_eng_timeout(struct ata_
 static int ahci_port_start(struct ata_port *ap);
 static void ahci_port_stop(struct ata_port *ap);
 static void ahci_host_stop(struct ata_host_set *host_set);
+static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf);
 static void ahci_qc_prep(struct ata_queued_cmd *qc);
 static u8 ahci_check_status(struct ata_port *ap);
 static u8 ahci_check_err(struct ata_port *ap);
@@ -213,6 +214,8 @@ static struct ata_port_operations ahci_o
 	.check_err		= ahci_check_err,
 	.dev_select		= ata_noop_dev_select,
 
+	.tf_read		= ahci_tf_read,
+
 	.phy_reset		= ahci_phy_reset,
 
 	.qc_prep		= ahci_qc_prep,
@@ -466,6 +469,14 @@ static u8 ahci_check_err(struct ata_port
 	return (readl(mmio + PORT_TFDATA) >> 8) & 0xFF;
 }
 
+static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
+{
+	struct ahci_port_priv *pp = ap->private_data;
+	u8 *d2h_fis = pp->rx_fis + RX_FIS_D2H_REG;
+
+	ata_tf_from_fis(d2h_fis, tf);
+}
+
 static void ahci_fill_sg(struct ata_queued_cmd *qc)
 {
 	struct ahci_port_priv *pp = qc->ap->private_data;
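As background for what a tf_read() built on the received FIS has to do, here is a user-space sketch of decoding a Device-to-Host Register FIS into taskfile-style fields. The byte offsets follow the Serial ATA D2H Register FIS layout; struct tf_model and tf_from_d2h_fis() are stand-ins written for this note and are not the libata definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct ata_taskfile fields used here. */
struct tf_model {
    uint8_t command;    /* receives the Status register */
    uint8_t feature;    /* receives the Error register */
    uint8_t lbal, lbam, lbah, device;
    uint8_t hob_lbal, hob_lbam, hob_lbah;
    uint8_t nsect, hob_nsect;
};

/* Copy the register image out of a D2H Register FIS (FIS type 0x34). */
static void tf_from_d2h_fis(const uint8_t *fis, struct tf_model *tf)
{
    tf->command   = fis[2];   /* Status */
    tf->feature   = fis[3];   /* Error */

    tf->lbal      = fis[4];   /* LBA bits  7:0  */
    tf->lbam      = fis[5];   /* LBA bits 15:8  */
    tf->lbah      = fis[6];   /* LBA bits 23:16 */
    tf->device    = fis[7];

    tf->hob_lbal  = fis[8];   /* LBA bits 31:24 */
    tf->hob_lbam  = fis[9];   /* LBA bits 39:32 */
    tf->hob_lbah  = fis[10];  /* LBA bits 47:40 */

    tf->nsect     = fis[12];  /* sector count, low byte */
    tf->hob_nsect = fis[13];  /* sector count, high byte */
}
```

Note that status and error land in the command and feature fields, which is why ata_dump_status() in patch 05 reads tf->command and tf->feature.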
Re: [PATCH libata-dev-2.6 05/05] libata: rework how CCs generated
05_libata_split_ata_to_sense_error.patch

	This patch fixes several bugs as well as reorganizes the way
	check conditions are generated.

	Bugs fixed:

	1) in ata_scsi_qc_complete(), ATA_12/16 commands wouldn't call
	   ata_pass_thru_cc() on error status;

	2) ata_pass_thru_cc() wouldn't put the SK, ASC, and ASCQ from
	   ata_to_sense_error() in the correct place in the sense block
	   because ata_to_sense_error() was writing a fixed sense block.

	Per the recommendations in the comments, ata_to_sense_error() is
	now split into 3 parts. The existing fcn is only used for
	outputting a sense key/ASC/ASCQ triplet. A new function
	ata_dump_status() was created to print the error info, similar
	to the ide variety. A third function ata_gen_fixed_sense() was
	created to generate a fixed length sense block. I added the use
	of the info field for 28b LBAs only. ata_pass_thru_cc() renamed
	to ata_gen_ata_desc_sense() to match naming convention,
	presumably to include another descriptor format function in the
	future (see question 2 below).

	Questions:

	1) I made the ata_gen_..._sense() routines read the status
	   register themselves rather than use the drv_stat values that
	   used to be passed in. These values seemed unreliable/useless
	   since they were often hard coded (see calls to
	   ata_qc_complete() for origins of most drv_stat variables).
	   Sound ok?

	2) the SAT spec has little about error handling and sense
	   information, specifically what descriptor format is valid for
	   use by SAT commands. I want to use descriptor type 00
	   (information) in my next patch until a spec says differently.
	   Sound ok?

Signed-off-by: Brett Russ <[EMAIL PROTECTED]>

 libata-scsi.c | 342 +-
 libata.h |1
 2 files changed, 197 insertions(+), 146 deletions(-)

Index: libata-dev-2.6/drivers/scsi/libata-scsi.c
===
--- libata-dev-2.6.orig/drivers/scsi/libata-scsi.c	2005-03-17 17:16:58.0 -0500
+++ libata-dev-2.6/drivers/scsi/libata-scsi.c	2005-03-17 17:16:59.0 -0500
@@ -331,24 +331,69 @@ struct ata_queued_cmd *ata_scsi_qc_new(s
 }
 
 /**
+ *	ata_dump_status - user friendly display of error info
+ *	@id: id of the port in question
+ *	@tf: ptr to filled out taskfile
+ *
+ *	Decode and dump the ATA error/status registers for the user so
+ *	that they have some idea what really happened at the non
+ *	make-believe layer.
+ *
+ *	LOCKING:
+ *	inherited from caller
+ */
+void ata_dump_status(unsigned id, struct ata_taskfile *tf)
+{
+	u8 stat = tf->command, err = tf->feature;
+
+	printk(KERN_WARNING "ata%u: status=0x%02x { ", id, stat);
+	if (stat & ATA_BUSY) {
+		printk("Busy }\n");	/* Data is not valid in this case */
+	} else {
+		if (stat & 0x40)	printk("DriveReady ");
+		if (stat & 0x20)	printk("DeviceFault ");
+		if (stat & 0x10)	printk("SeekComplete ");
+		if (stat & 0x08)	printk("DataRequest ");
+		if (stat & 0x04)	printk("CorrectedError ");
+		if (stat & 0x02)	printk("Index ");
+		if (stat & 0x01)	printk("Error ");
+		printk("}\n");
+
+		if (err) {
+			printk(KERN_WARNING "ata%u: error=0x%02x { ", id, err);
+			if (err & 0x04)		printk("DriveStatusError ");
+			if (err & 0x80) {
+				if (err & 0x04)	printk("BadCRC ");
+				else		printk("Sector ");
+			}
+			if (err & 0x40)		printk("UncorrectableError ");
+			if (err & 0x10)		printk("SectorIdNotFound ");
+			if (err & 0x02)		printk("TrackZeroNotFound ");
+			if (err & 0x01)		printk("AddrMarkNotFound ");
+			printk("}\n");
+		}
+	}
+}
+
+/**
  *	ata_to_sense_error - convert ATA error to SCSI error
- *	@qc: Command that we are erroring out
  *	@drv_stat: value contained in ATA status register
+ *	@drv_err: value contained in ATA error register
+ *	@sk: the sense key we'll fill out
+ *	@asc: the additional sense code we'll fill out
+ *	@ascq: the additional sense code qualifier we'll fill out
  *
- *	Converts an ATA error into a SCSI error. While we are at it
- *	we decode and dump the ATA error for the user so that they
- *	have some idea what really happened at the non make-believe
- *	layer.
+ *	Converts an ATA error into a SCSI error. Fill out
Re: [PATCH libata-dev-2.6 03/05] libata: update ATA PT sense desc code
03_libata_update_desc_code.patch

	Change the ATA pass through sense block descriptor code to 0x09
	per SAT

Signed-off-by: Brett Russ <[EMAIL PROTECTED]>

 libata-scsi.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: libata-dev-2.6/drivers/scsi/libata-scsi.c
===
--- libata-dev-2.6.orig/drivers/scsi/libata-scsi.c	2005-03-08 08:47:48.0 -0500
+++ libata-dev-2.6/drivers/scsi/libata-scsi.c	2005-03-17 17:16:58.0 -0500
@@ -531,7 +531,7 @@ void ata_pass_thru_cc(struct ata_queued_
 	 */
 	sb[0] = 0x72 ;
 
-	desc[0] = 0x8e ;	/* TODO: replace with official value. */
+	desc[0] = 0x09;
 
 	/*
 	 * Set length of additional sense data.
@@ -2059,7 +2059,7 @@ void ata_scsi_simulate(u16 *id,
 		ata_scsi_rbuf_fill(, ata_scsiop_report_luns);
 		break;
 
-	/* mandantory commands we haven't implemented yet */
+	/* mandatory commands we haven't implemented yet */
 	case REQUEST_SENSE:
 
 	/* all other commands */
Re: [PATCH libata-dev-2.6 02/05] libata: AHCI error handling fix
02_libata_ahci-err-int.patch
(included in libata-2.6)

	Fixes AHCI bits during handling of fatal error int.

Signed-off-by: Brett Russ <[EMAIL PROTECTED]>

 ahci.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: libata-dev-2.6/drivers/scsi/ahci.c
===
--- libata-dev-2.6.orig/drivers/scsi/ahci.c	2005-03-17 17:16:57.0 -0500
+++ libata-dev-2.6/drivers/scsi/ahci.c	2005-03-17 17:16:57.0 -0500
@@ -548,7 +548,7 @@ static void ahci_intr_error(struct ata_p
 
 	/* stop DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-	tmp &= PORT_CMD_START | PORT_CMD_FIS_RX;
+	tmp &= ~PORT_CMD_START;
 	writel(tmp, port_mmio + PORT_CMD);
 
 	/* wait for engine to stop. TODO: this could be
@@ -580,7 +580,7 @@ static void ahci_intr_error(struct ata_p
 
 	/* re-start DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-	tmp |= PORT_CMD_START | PORT_CMD_FIS_RX;
+	tmp |= PORT_CMD_START;
 	writel(tmp, port_mmio + PORT_CMD);
 	readl(port_mmio + PORT_CMD); /* flush */
 
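The "stop DMA" hunk is easy to misread, so here is a stand-alone sketch of what the two mask expressions do to a sample register value. The bit positions are an assumption taken from the AHCI port command register layout (start in bit 0, FIS receive enable in bit 4); the function names are invented for this note.

```c
#include <stdint.h>

/* Assumed bit positions, following the AHCI PxCMD register layout. */
#define PORT_CMD_START  (1u << 0)
#define PORT_CMD_FIS_RX (1u << 4)

/* Old code: keeps ONLY the two named bits, so START is never cleared. */
static uint32_t stop_dma_old(uint32_t cmd)
{
    return cmd & (PORT_CMD_START | PORT_CMD_FIS_RX);
}

/* New code: clears START and leaves every other bit untouched. */
static uint32_t stop_dma_new(uint32_t cmd)
{
    return cmd & ~PORT_CMD_START;
}
```

For a register value of 0x17, the old expression yields 0x11 (the engine is still started and unrelated bits are wiped), while the new one yields 0x16 (only the start bit is dropped).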
[PATCH libata-dev-2.6 00/05] libata: scsi error handling improvements
This patch series attempts to clean up the SCSI error handling a bit.
See the comments in the TOC below or in the patch emails.

All of the below have been tested in success and error paths through the
VERIFY_10 and ATA_16 commands using the AHCI driver.

IMPORTANT: the patchset below against libata-dev-2.6 relies on the AHCI
driver fixes recently patched into libata-2.6. I am including the two
specific patches as 1 and 2 of this series for completeness, although of
course they should be merged from libata-2.6 instead. Therefore, you may
ignore these two unless you want to test this series now on libata-dev.

[ Start of patch descriptions ]

01_libata_garzik-ahci-tf-read.patch : AHCI tf_read() support
(included in libata-2.6)

	This is Jeff's tf_read() support patch for AHCI.
	Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

02_libata_ahci-err-int.patch : AHCI error handling fix
(included in libata-2.6)

	Fixes AHCI bits during handling of fatal error int.

03_libata_update_desc_code.patch : update ATA PT sense desc code

	Change the ATA pass through sense block descriptor code to 0x09
	per SAT

04_libata_control_pg_desc_bit.patch : support descriptor sense in ctrl page

	libata must support the descriptor format sense blocks as they
	are required to properly report results of ATA pass through
	commands as well as other SCSI commands reporting 48b LBAs. This
	patch adjusts the control mode page to properly report this.

05_libata_split_ata_to_sense_error.patch : rework how CCs generated

	This patch fixes several bugs as well as reorganizes the way
	check conditions are generated.

	Bugs fixed:

	1) in ata_scsi_qc_complete(), ATA_12/16 commands wouldn't call
	   ata_pass_thru_cc() on error status;

	2) ata_pass_thru_cc() wouldn't put the SK, ASC, and ASCQ from
	   ata_to_sense_error() in the correct place in the sense block
	   because ata_to_sense_error() was writing a fixed sense block.

	Per the recommendations in the comments, ata_to_sense_error() is
	now split into 3 parts. The existing fcn is only used for
	outputting a sense key/ASC/ASCQ triplet. A new function
	ata_dump_status() was created to print the error info, similar
	to the ide variety. A third function ata_gen_fixed_sense() was
	created to generate a fixed length sense block. I added the use
	of the info field for 28b LBAs only. ata_pass_thru_cc() renamed
	to ata_gen_ata_desc_sense() to match naming convention,
	presumably to include another descriptor format function in the
	future (see question 2 below).

	Questions:

	1) I made the ata_gen_..._sense() routines read the status
	   register themselves rather than use the drv_stat values that
	   used to be passed in. These values seemed unreliable/useless
	   since they were often hard coded (see calls to
	   ata_qc_complete() for origins of most drv_stat variables).
	   Sound ok?

	2) the SAT spec has little about error handling and sense
	   information, specifically what descriptor format is valid for
	   use by SAT commands. I want to use descriptor type 00
	   (information) in my next patch until a spec says differently.
	   Sound ok?

[ End of patch descriptions ]

BR
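Bug 2 in the description of patch 05 comes down to the sense key, ASC and ASCQ living at different offsets in the two SCSI sense formats. The sketch below fills a minimal header in each format so the difference is visible; the offsets follow the SPC sense data layouts, while the function names and buffer sizes are choices made for this illustration and are not the libata routines.

```c
#include <stdint.h>
#include <string.h>

/* Fixed format (response code 0x70): SK in byte 2, ASC/ASCQ in 12/13. */
static void fill_fixed_sense(uint8_t sb[18], uint8_t sk,
                             uint8_t asc, uint8_t ascq)
{
    memset(sb, 0, 18);
    sb[0]  = 0x70;        /* current error, fixed format */
    sb[2]  = sk & 0x0f;   /* sense key */
    sb[7]  = 10;          /* additional sense length: 18 - 8 */
    sb[12] = asc;
    sb[13] = ascq;
}

/* Descriptor format (response code 0x72): SK, ASC, ASCQ in bytes 1..3. */
static void fill_desc_sense(uint8_t sb[8], uint8_t sk,
                            uint8_t asc, uint8_t ascq)
{
    memset(sb, 0, 8);
    sb[0] = 0x72;         /* current error, descriptor format */
    sb[1] = sk & 0x0f;    /* sense key */
    sb[2] = asc;
    sb[3] = ascq;
    /* sb[7] would hold the total length of any descriptors appended. */
}
```

A routine that always writes the fixed layout therefore puts the ASC where a descriptor-format reader expects the first descriptor's contents, which is why returning a bare SK/ASC/ASCQ triple and letting each generator place it is the cleaner split.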
Re: Awful long timeouts for flash-file-system
On Thu, 17 Mar 2005 05:06:23 +0100 Voluspa wrote:

Went back to 2.6.10 and just got one of those dma_timer_expiry freezes.
Seems the disk is on the blink then. Sorry about the noise.

Mvh
Mats Johannesson
--
Memory Stick Changes in 2.6.11?
I was pulling some pictures out of memory sticks from a camera, and after
I pulled them I was removing the image files from the stick. One of the
sticks mounted read-only. After a few attempts to explicitly use "rw" in
the mount command and things like that, I booted back into 2.6.10 and
found the stick mounted rw.

I looked at the code, and I don't see anything obvious. Can someone point
me to where the change is made?

OT: I think that if I explicitly use the rw option the mount should do
what I ask or fail. This "I can't do what you want so I did something
else" behaviour makes scripts more complex.

--
 -bill davidsen ([EMAIL PROTECTED])
"The secret to procrastination is to put things off until the last
 possible moment - but no longer" -me
Re: KGDB question
On Thursday, March 17, 2005 1:54 pm, Andrew Morton wrote: > "Abhinkar, Sameer" <[EMAIL PROTECTED]> wrote: > > Are there any patches or hooks > > available to enable KGDB for linux-2.6.11.2? > > kgdb patches are maintained in -mm kernels. > > Patches are in > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11 >-mm1/broken-out/*kgdb* > > And the patch application order is described in > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11 >-mm1/patch-series - What's the latest status on these? Last I heard, some cleanup was going to happen to make kgdb suitable for the mainline, did that ever happen? Also, it would be nice if I could connect to a remote kernel running the kgdb stubs w/o having to run gdb on the same ethernet segment. Would that be difficult to fix? Thanks, Jesse - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Prezeroing V8
Christoph Lameter <[EMAIL PROTECTED]> wrote: > > Adds management of ZEROED and NOT_ZEROED pages and a background daemon > called scrubd. /proc/sys/vm/scrubd_load, /proc/sys/vm_scrubd_start and > /proc/sys/vm_scrubd_stop control the scrub daemon. See Documentation/vm/ > scrubd.txt It's hard to know what to think about this without benchmarking numbers. It would help if you could briefly describe the implementation and design decisions when sending patches. For example, one area where we could use this is in pagetable management, where we need zeroed pages and we tend to free up known-to-be-zero and probably cache-warm pages. Right now some architectures are maintaining their own quicklists, or using a slab cache, both of which are suboptimal. But afaict the patch doesn't differentiate between cache-cold and cache-hot zeroed pages, and doesn't have an API with which clients can free up a known-to-be-zero page. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/2] fork_connector: add a fork connector
On Thursday, March 17, 2005 1:38 pm, Evgeniy Polyakov wrote:
> The most significant part there - is requirement to store
> u32 seq in each CPU's cache and thus flush cacheline +
> invalidate/get from mem on each other cpus
> each time it is accessed, which is a big price.

Same thing has to happen with the lock. To put it simply, writing global
variables from multiple CPUs with anything other than very low frequency
is bad.

> It is totally Guillaume's work - so he decides,
> I would recommend per cpu counters and processor's
> id in each message.
> And of course userspace should take care of misordered
> messages.
> I personally prefer such mechanism.

Yep, I agree. Hopefully Guillaume will too :)

Jesse
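To illustrate what "userspace should take care of misordered messages" could look like with per-CPU counters, here is a small sketch of a listener that tracks one sequence number per CPU and reports gaps. The message layout (a CPU id plus a u32 sequence number), the 64-CPU limit and all names are assumptions for this example and are not the fork connector's actual format.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 64  /* hypothetical upper bound */

/* Assumed message header: which CPU sent it and that CPU's counter. */
struct fork_msg_model {
    uint32_t cpu;
    uint32_t seq;
};

/* Per-CPU expected sequence numbers kept by the userspace listener. */
struct seq_tracker {
    bool     seen[MODEL_MAX_CPUS];
    uint32_t next[MODEL_MAX_CPUS];
};

/*
 * Record a message and return how many messages from the same CPU
 * were skipped since the previous one, or -1 for an out-of-range CPU.
 * Messages from different CPUs never affect each other's counters.
 */
static int64_t track_msg(struct seq_tracker *t,
                         const struct fork_msg_model *m)
{
    uint32_t missed = 0;

    if (m->cpu >= MODEL_MAX_CPUS)
        return -1;
    if (t->seen[m->cpu])
        missed = m->seq - t->next[m->cpu];  /* unsigned, wrap-safe */
    t->seen[m->cpu] = true;
    t->next[m->cpu] = m->seq + 1;
    return missed;
}
```

Because each CPU only ever writes its own counter, nothing shared is bounced between caches on the kernel side; the cost of sorting out interleaving moves to the listener, which is the trade-off being agreed on above.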
Re: KGDB question
"Abhinkar, Sameer" <[EMAIL PROTECTED]> wrote: > > Are there any patches or hooks > available to enable KGDB for linux-2.6.11.2? kgdb patches are maintained in -mm kernels. Patches are in ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm1/broken-out/*kgdb* And the patch application order is described in ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm1/patch-series - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] NFS: add I/O performance counters
[EMAIL PROTECTED] (Chuck Lever) wrote:
>
> +static inline void nfs_inc_stats(struct inode *inode, unsigned int stat)
> +{
> +	struct nfs_iostats *iostats = NFS_SERVER(inode)->io_stats;
> +	iostats[smp_processor_id()].counts[stat]++;
> +}

The use of smp_processor_id() outside locks should spit a runtime warning.
And it is racy: if you switch CPUs between the read and the write (via
preemption), the stats will be corrupted.

A preempt_disable()/enable() will fix those things up.

> +static inline struct nfs_iostats *nfs_alloc_iostats(void)
> +{
> +	struct nfs_iostats *new;
> +	new = kmalloc(sizeof(struct nfs_iostats) * NR_CPUS, GFP_KERNEL);
> +	if (new)
> +		memset(new, 0, sizeof(struct nfs_iostats) * NR_CPUS);
> +	return new;
> +}
> +

You'd be better off using alloc_percpu() here, so each CPU's counter goes
into its node-local memory.

Or simply use . AFAICT the warning at the top of that file isn't true any
more. A 4-byte counter on a 32-way should consume just a little over 256
bytes.
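The shape of the fix suggested above can be sketched in user space. In the kernel, get_cpu() disables preemption and returns the current CPU number, and put_cpu() re-enables preemption; the model_ functions below are stand-ins that only track a counter so the pattern can be compiled and tested outside the kernel. The struct layout, the sizes and every name here are assumptions, not the NFS patch.

```c
#define MODEL_NR_CPUS 4
#define MODEL_NR_STATS 8

/* Stand-ins for kernel state: current CPU and preempt-disable depth. */
static int model_current_cpu;
static int model_preempt_count;

/* Stand-in for get_cpu(): pin to the current CPU and return its id. */
static int model_get_cpu(void)
{
    model_preempt_count++;
    return model_current_cpu;
}

/* Stand-in for put_cpu(): allow migration again. */
static void model_put_cpu(void)
{
    model_preempt_count--;
}

struct iostats_model {
    unsigned long counts[MODEL_NR_STATS];
};

/* Bump this CPU's slot; the CPU id stays valid until model_put_cpu(). */
static void inc_stat(struct iostats_model *stats, unsigned int stat)
{
    int cpu = model_get_cpu();

    stats[cpu].counts[stat]++;
    model_put_cpu();
}

/* Readers add up all per-CPU slots to get the total. */
static unsigned long sum_stat(const struct iostats_model *stats,
                              unsigned int stat)
{
    unsigned long total = 0;
    int cpu;

    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        total += stats[cpu].counts[stat];
    return total;
}
```

The point of the pairing is that the index lookup and the increment happen with the task pinned, so the read-modify-write cannot be split across two CPUs' slots.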