Re: [Bug 8464] New: autoreconf: page allocation failure. order:2, mode:0x84020
On Thursday, 10 May 2007 at 16:01 -0700, Christoph Lameter wrote:
> On Fri, 11 May 2007, Mel Gorman wrote:
>
> > Nicholas, could you backout the patch
> > dont-group-high-order-atomic-allocations.patch and test again please?
> > The following patch has the same effect. Thanks
>
> Great! Thanks.

The proposed patch did not apply:

+ cd /builddir/build/BUILD
+ rm -rf linux-2.6.21
+ /usr/bin/bzip2 -dc /builddir/build/SOURCES/linux-2.6.21.tar.bz2
+ tar -xf -
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd linux-2.6.21
++ /usr/bin/id -u
+ '[' 499 = 0 ']'
++ /usr/bin/id -u
+ '[' 499 = 0 ']'
+ /bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ echo 'Patch #2 (2.6.21-mm2.bz2):'
Patch #2 (2.6.21-mm2.bz2):
+ /usr/bin/bzip2 -d
+ patch -p1 -s
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ echo 'Patch #3 (md-improve-partition-detection-in-md-array.patch):'
Patch #3 (md-improve-partition-detection-in-md-array.patch):
+ patch -p1 -R -s
+ echo 'Patch #4 (bug-8464.patch):'
Patch #4 (bug-8464.patch):
+ patch -p1 -s
1 out of 1 hunk FAILED -- saving rejects to file include/linux/pageblock-flags.h.rej
6 out of 6 hunks FAILED -- saving rejects to file mm/page_alloc.c.rej

Backing out dont-group-high-order-atomic-allocations.patch worked and seems to have cured the system so far (I need to load it a bit longer to be sure).

-- 
Nicolas Mailhot
Re: [PATCH] UDF: check for allocated memory for inode data
[Andrew Morton - Thu, May 10, 2007 at 03:46:40PM -0700] [...snip...] | But please let's not add three copies of identical code. Do something like: [...snip...] Thanks for comments, Andrew. Let me rewrite the patch... Cyrill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][debian-2.6.20-1-686] bridging + vlans + "vconfig rem" == stuck kernel
On May 10, 2007, at 00:34:11, Kyle Moffett wrote:
> On May 10, 2007, at 00:25:54, Ben Greear wrote:
>> Looks like a deadlock in the vlan code. Any chance you can run this
>> test with lockdep enabled?
>>
>> You could also add a printk in vlan_device_event() to check which
>> event it is hanging on, and the netdevice that is passed in.
>
> Ok, I'll try building a 2.6.21 kernel with lockdep and some debugging
> printk()s in the vlan_device_event() function and get back to you
> tomorrow. Thanks for the quick response!

Progress!!! I built a 2.6.21.1 kernel with a 1MB dmesg buffer, almost all of the locking debugging options on (as well as a few others just for kicks), a VLAN debug #define turned on in the net/8021q/vlan.h file, and lots of extra debugging messages added to the functions in vlan.c.

My initial interpretation is that due to the funny order in which "ifdown -a" takes down interfaces, it tries to delete the VLAN interfaces before the bridges running atop them have been taken down. Ordinarily this seems to work, but when the underlying physical ethernet is down already, the last VLAN to be deleted seems to hang somehow. The full results are as follows:

The lock dependency validator at startup passes all 218 testcases, indicating that all the locking crap is probably working correctly (those debug options chew up another meg of RAM).

ifup -a brings up the interfaces in this order (see previous email for configuration details):

  lo net0 wfi0 world0 lan lan:0 world

ifdown -a appears to bring them down in the same order (at least, until it gets stuck).

Attached below is filtered debugging information. I cut out 90% of the crap in the syslog, but there's still a lot left over to sift through; sorry. If you want my .config or the full text of the log then email me privately and I'll send it to you, as it's kinda big.

I appreciate any advice, thanks for all your help

Cheers,
Kyle Moffett

This first bit is the "ifup -a -v -i interfaces":

ADDRCONF(NETDEV_UP): net0: link is not ready
vlan_ioctl_handler: args.cmd: 6
vlan_ioctl_handler: args.cmd: 0
register_vlan_device: if_name -:net0:-^Ivid: 2
About to allocate name, vlan_name_type: 3
Allocated new name -:net0.2:-
About to go find the group for idx: 2
vlan_transfer_operstate: net0 state transition applies to net0.2 too:
vlan_proc_add, device -:net0.2:- being added.
Allocated new device successfully, returning.
wfi0: add 33:33:00:00:00:01 mcast address to master interface
wfi0: add 01:00:5e:00:00:01 mcast address to master interface
ADDRCONF(NETDEV_UP): wfi0: link is not ready
vlan_ioctl_handler: args.cmd: 6
vlan_ioctl_handler: args.cmd: 0
register_vlan_device: if_name -:net0:-^Ivid: 4094
About to allocate name, vlan_name_type: 3
Allocated new name -:net0.4094:-
About to go find the group for idx: 2
vlan_transfer_operstate: net0 state transition applies to net0.4094 too:
vlan_proc_add, device -:net0.4094:- being added.
Allocated new device successfully, returning.
world0: add 33:33:00:00:00:01 mcast address to master interface
world0: add 01:00:5e:00:00:01 mcast address to master interface
ADDRCONF(NETDEV_UP): world0: link is not ready
tg3: net0: Link is up at 1000 Mbps, full duplex.
tg3: net0: Flow control is on for TX and on for RX.
ADDRCONF(NETDEV_CHANGE): net0: link becomes ready
Propagating NETDEV_CHANGE for device net0...
... to wfi0
vlan_transfer_operstate: net0 state transition applies to wfi0 too:
...found a carrier, applying to VLAN device
... to world0
vlan_transfer_operstate: net0 state transition applies to world0 too:
...found a carrier, applying to VLAN device
lan: port 1(net0) entering listening state
ADDRCONF(NETDEV_CHANGE): wfi0: link becomes ready
wfi0: dev_set_promiscuity(master, 1)
wfi0: add 33:33:ff:5f:60:92 mcast address to master interface
lan: port 2(wfi0) entering listening state
ADDRCONF(NETDEV_CHANGE): world0: link becomes ready
world0: add 33:33:ff:91:e2:4c mcast address to master interface
lan: no IPv6 routers present
world: no IPv6 routers present
net0: no IPv6 routers present
world0: no IPv6 routers present
wfi0: no IPv6 routers present
lan: port 1(net0) entering learning state
lan: port 2(wfi0) entering learning state
lan: topology change detected, propagating
lan: port 1(net0) entering forwarding state
lan: topology change detected, propagating
lan: port 2(wfi0) entering forwarding state

This bit is for "ifdown -a -v -i interfaces":

Propagating NETDEV_DOWN for device net0...
... to wfi0
wfi0: del 33:33:ff:5f:60:92 mcast address from vlan interface
wfi0: del 33:33:ff:5f:60:92 mcast address from master interface
wfi0: del 01:00:5e:00:00:01 mcast address from vlan interface
wfi0: del 01:00:5e:00:00:01 mcast address from master interface
wfi0: del 33:33:00:00:00:01 mcast address from vlan interface
wfi0: del 33:33:00:00:00:01 mcast address from master interface
lan: port 2(wfi0) entering disabled state
... to world0
[PATCH] PowerPC64 symbols start with '.'
which we want to skip during modpost processing. We need this to make some of the whitelisting work.

Signed-off-by: Stephen Rothwell <[EMAIL PROTECTED]>
---
 scripts/mod/modpost.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

-- 
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 113dc77..748b058 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -164,7 +164,13 @@ static inline unsigned int tdb_hash(const char *name)
 static struct symbol *alloc_symbol(const char *name, unsigned int weak,
 				   struct symbol *next)
 {
-	struct symbol *s = NOFAIL(malloc(sizeof(*s) + strlen(name) + 1));
+	struct symbol *s;
+
+	/* For our purposes, .foo matches foo.  PPC64 needs this. */
+	if (name[0] == '.')
+		name++;
+
+	s = NOFAIL(malloc(sizeof(*s) + strlen(name) + 1));
 
 	memset(s, 0, sizeof(*s));
 	strcpy(s->name, name);
@@ -180,6 +186,10 @@ static struct symbol *new_symbol(const char *name, struct module *module,
 	unsigned int hash;
 	struct symbol *new;
 
+	/* For our purposes, .foo matches foo.  PPC64 needs this. */
+	if (name[0] == '.')
+		name++;
+
 	hash = tdb_hash(name) % SYMBOL_HASH_SIZE;
 	new = symbolhash[hash] = alloc_symbol(name, 0, symbolhash[hash]);
 	new->module = module;
@@ -684,6 +694,12 @@ static int secref_whitelist(const char *modname, const char *tosec,
 		NULL
 	};
 
+	/* For our purposes, .foo matches foo.  PPC64 needs this. */
+	if (atsym[0] == '.')
+		atsym++;
+	if (refsymname[0] == '.')
+		refsymname++;
+
 	/* Check for pattern 1 */
 	if (strcmp(tosec, ".init.data") != 0)
 		f1 = 0;
-- 
1.5.1.4
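
For readers following the patch above, here is a minimal user-space sketch of the same idea: treat ".foo" and "foo" as one symbol by skipping a single leading dot before comparing. The helper names skip_leading_dot() and symbol_names_match() are hypothetical and do not appear in modpost.c, where the patch open-codes the check at each call site.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the name with one leading '.' skipped, so that ".foo" and
 * "foo" compare equal. Hypothetical helper, for illustration only.
 */
static const char *skip_leading_dot(const char *name)
{
	assert(name != NULL);
	return (name[0] == '.') ? name + 1 : name;
}

/* Compare two symbol names, treating ".foo" and "foo" as the same. */
static int symbol_names_match(const char *a, const char *b)
{
	return strcmp(skip_leading_dot(a), skip_leading_dot(b)) == 0;
}
```

Only one dot is skipped, mirroring the `name[0] == '.'` test in the patch, so "..foo" would still differ from "foo".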
Re: [PATCH 1/12] crypto: don't pollute the global namespace with sg_next()
Jens Axboe wrote:
> On Thu, May 10 2007, Benny Halevy wrote:
>> Jens Axboe wrote:
>>> It's a subsystem function, prefix it as such.
>> Jens, Boaz and I talked about this over lunch.
>> I wonder whether the crypto code must use your implementation
>> instead of its own as it needs to walk over the sglist, e.g. for
>> calculating iscsi (data) digest.
>
> The thought did cross my mind, and yes I think that would be a good
> idea. The whole thing should probably just migrate to
> lib/scattersomething.c
>
>> The crypto implementation of chained sglists in crypto/scatterwalk.h
>> determines the chain link by !sg->length which will sorta work
>> with your implementation, however the marker bit on page pointer must
>> be cleared to use it.
>
> I don't like using sg->length, as that may be modified for legitimate
> reason. That's why I chose to use the lsb bit of the page pointer.
>
>> Also, is it possible that after the original sglist has gone through
>> dma_map_sg and entries were merged, some entries will have zero
>> length? I'm not sure... If so, if the crypto implementation scans
>> the sg list after it was dma mapped (maybe in a retry path) it
>> may hit an entry that looks to it like a chaining link. This
>> might be an existing bug and another reason for the crypto code
>> to use your implementation.
>
> It's hard to say, depends heavily on the sub system or arch. Even if
> using the pointer tagging mechanism seems a bit nasty, I think it's the
> more resilient approach.
>

We're in agreement then :)

I was trying to say that the methods should be compatible, otherwise bugs can happen, and that your scheme is better since it can handle sglists with zero-length entries that aren't the last, a case that might be valid after dma mapping and merging. If indeed this case is possible, this seems to be the right time to converge on your scheme.

Benny
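
The difference between the two chain markers discussed in this thread can be sketched in user space. The code below is an illustration under stated assumptions, not the kernel's scatterlist code: struct sg_entry and every function name here are made up, and the only idea taken from the thread is that bit 0 of the page pointer, rather than a zero length, marks a chain link.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a scatterlist entry; not the kernel's struct. */
struct sg_entry {
	uintptr_t page_link;	/* page pointer, or next array with bit 0 set */
	unsigned int length;	/* may legitimately be zero after merging */
};

#define SG_CHAIN_BIT ((uintptr_t)1)

/* Turn an entry into a link that points at the next array of entries. */
static void sg_set_chain(struct sg_entry *link, struct sg_entry *next_array)
{
	/* Assumes entries are at least 2-byte aligned, so bit 0 is free. */
	assert(((uintptr_t)next_array & SG_CHAIN_BIT) == 0);
	link->page_link = (uintptr_t)next_array | SG_CHAIN_BIT;
	link->length = 0;
}

/* A link is identified by the tag bit alone, never by its length. */
static int sg_is_chain(const struct sg_entry *sg)
{
	return (sg->page_link & SG_CHAIN_BIT) != 0;
}

/* Clear the tag bit to recover the pointer to the next array. */
static struct sg_entry *sg_chain_target(const struct sg_entry *sg)
{
	return (struct sg_entry *)(sg->page_link & ~SG_CHAIN_BIT);
}

/* Sum the lengths of 'count' data entries, following chain links. */
static unsigned int sg_total_length(struct sg_entry *sg, size_t count)
{
	unsigned int total = 0;

	while (count > 0) {
		if (sg_is_chain(sg)) {
			sg = sg_chain_target(sg);
			continue;
		}
		total += sg->length;	/* a zero-length data entry is fine */
		sg++;
		count--;
	}
	return total;
}
```

With this marker, a data entry whose length happens to be zero is walked like any other entry, whereas a walker that tests `!sg->length` would mistake it for a link.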
Re: slub-i386-support.patch
On Thu, 10 May 2007, William Lee Irwin III wrote:

> Looking more closely at it, the entire attempt to avoid struct page
> pointers is far beyond pointless. The freeing functions unconditionally
> require struct page pointers to either be passed or computed and the
> allocation function's virtual address it returns as a result is not
> directly usable. The callers all have to do arithmetic on the result.
> One might as well stash precomputed pfn's (if not paddrs) and vaddrs in
> page->private and page->mapping, chain them with ->lru (use only .next
> if you care to stay singly-linked), and handle struct page pointers

Well then you'd have to rewrite the existing ways of fiddling with page structs. This way all is clear and you fiddle as you want. It just works.

Could we get this in? You acked it once already?
[PATCH] drivers/scsi/aic7xxx_old: Convert to generic boolean
Signed-off-by: Richard Knutsson <[EMAIL PROTECTED]> --- Compile-tested with all(yes|mod|no)config on x86(|_64) & sparc(|64) got some warnings on some builds, none related to this patch Diffed against Linus' git-tree. aic7xxx_old.c | 326 ++--- aic7xxx_old/aic7xxx_proc.c |2 2 files changed, 161 insertions(+), 167 deletions(-) diff --git a/drivers/scsi/aic7xxx_old.c b/drivers/scsi/aic7xxx_old.c index a988d5a..afa5ded 100644 --- a/drivers/scsi/aic7xxx_old.c +++ b/drivers/scsi/aic7xxx_old.c @@ -255,12 +255,6 @@ #define ALL_LUNS -1 #define MAX_TARGETS 16 #define MAX_LUNS 8 -#ifndef TRUE -# define TRUE 1 -#endif -#ifndef FALSE -# define FALSE 0 -#endif #if defined(__powerpc__) || defined(__i386__) || defined(__x86_64__) # define MMAPIO @@ -1382,7 +1376,7 @@ aic7xxx_setup(char *s) char *tok, *tok_end, *tok_end2; char tok_list[] = { '.', ',', '{', '}', '\0' }; int i, instance = -1, device = -1; -unsigned char done = FALSE; +bool done = false; base = p; tok = base + n + 1; /* Forward us just past the ':' */ @@ -1410,14 +1404,14 @@ aic7xxx_setup(char *s) case ',': case '.': if (instance == -1) -done = TRUE; +done = true; else if (device >= 0) device++; else if (instance >= 0) instance++; if ( (device >= MAX_TARGETS) || (instance >= ARRAY_SIZE(aic7xxx_tag_info)) ) -done = TRUE; +done = true; tok++; if (!done) { @@ -1425,10 +1419,10 @@ aic7xxx_setup(char *s) } break; case '\0': - done = TRUE; + done = true; break; default: - done = TRUE; + done = true; tok_end = strchr(tok, '\0'); for(i=0; tok_list[i]; i++) { @@ -1436,7 +1430,7 @@ aic7xxx_setup(char *s) if ( (tok_end2) && (tok_end2 < tok_end) ) { tok_end = tok_end2; - done = FALSE; + done = false; } } if ( (instance >= 0) && (device >= 0) && @@ -1512,7 +1506,7 @@ pause_sequencer(struct aic7xxx_host *p) * warrant an easy way to do it. 
*-F*/ static void -unpause_sequencer(struct aic7xxx_host *p, int unpause_always) +unpause_sequencer(struct aic7xxx_host *p, bool unpause_always) { if (unpause_always || ( !(aic_inb(p, INTSTAT) & (SCSIINT | SEQINT | BRKADRINT)) && @@ -1771,7 +1765,7 @@ aic7xxx_loadseq(struct aic7xxx_host *p) aic_outb(p, 0, SEQADDR0); aic_outb(p, 0, SEQADDR1); aic_outb(p, FASTMODE | FAILDIS, SEQCTL); - unpause_sequencer(p, TRUE); + unpause_sequencer(p, true); mdelay(1); pause_sequencer(p); aic_outb(p, FASTMODE, SEQCTL); @@ -1820,7 +1814,7 @@ aic7xxx_print_sequencer(struct aic7xxx_host *p, int downloaded) aic_outb(p, 0, SEQADDR0); aic_outb(p, 0, SEQADDR1); aic_outb(p, FASTMODE | FAILDIS, SEQCTL); - unpause_sequencer(p, TRUE); + unpause_sequencer(p, true); mdelay(1); pause_sequencer(p); aic_outb(p, FASTMODE, SEQCTL); @@ -1868,7 +1862,7 @@ aic7xxx_find_syncrate(struct aic7xxx_host *p, unsigned int *period, unsigned int maxsync, unsigned char *options) { struct aic7xxx_syncrate *syncrate; - int done = FALSE; + bool done = false; switch(*options) { @@ -1924,7 +1918,7 @@ aic7xxx_find_syncrate(struct aic7xxx_host *p, unsigned int *period, case MSG_EXT_PPR_OPTION_DT_UNITS: if(!(syncrate->sxfr_ultra2 & AHC_SYNCRATE_CRC)) { -done = TRUE; +done = true; /* * oops, we went too low for the CRC/DualEdge signalling, so * clear the options byte @@ -1938,7 +1932,7 @@ aic7xxx_find_syncrate(struct aic7xxx_host *p, unsigned int *period, } else { -done = TRUE; +done = true; if(syncrate == _syncrates[maxsync]) { *period = syncrate->period; @@ -1948,7 +1942,7 @@ aic7xxx_find_syncrate(struct aic7xxx_host *p, unsigned int *period, default: if(!(syncrate->sxfr_ultra2 & AHC_SYNCRATE_CRC)) { -done = TRUE; +done = true; if(syncrate == _syncrates[maxsync]) { *period = syncrate->period; @@ -2375,22 +2369,22 @@ scbq_insert_tail(volatile scb_queue_type *queue, struct aic7xxx_scb *scb) * on the
Re: [patch 05/10] Linux Kernel Markers - i386 optimized version
On Thu, May 10, 2007 at 12:59:18PM -0400, Mathieu Desnoyers wrote:
> * Alan Cox ([EMAIL PROTECTED]) wrote:

...

> > > * Third issue : Scalability. Changing code will stop every CPU on the
> > > system for a while. Compared to this, the int3-based approach will run
> > > through the breakpoint handler "if" one of the CPU happens to execute
> > > this code at the wrong time. The standard case is just an IPI (to
> >
> > If I read the errata right then patching in an int3 will itself trigger
> > the errata so anything could happen.
> >
> > I believe there are other safe sequences for doing code patching - perhaps
> > one of the Intel folk can advise ?

IIRC, when the first implementation of what exists now as kprobes was done (as part of the dprobes framework), this question did come up. I think the conclusion was that the errata applies only to multi-byte modifications and single-byte changes are guaranteed to be atomic. Given that int3 on Intel is just 1 byte, we are safe.

> I'll let the Intel guys confirm this, I don't have the reference nearby
> (I got this information by talking with the kprobe team members, and
> they got this information directly from Intel developers) but the
> int3 is the one special case to which the errata does not apply.
> Otherwise, kprobes and gdb would have a big, big issue.

Perhaps Richard/Suparna can confirm.

Ananth
Re: Regression in 2.6.21-mm1 (git-input) on Dell D610 laptop
Hi Remi,

On Thursday 10 May 2007 21:50, Andrew Morton wrote:
> On Thu, 10 May 2007 15:05:25 +0200
> Remi Colinet <[EMAIL PROTECTED]> wrote:
>
> > My D610 ALPS Glide Point is unresponsive with 2.6.21-mm1 patch.
> > No problem noticed with 2.6.21.
> >
> > The culprit seems to be git-input. I have applied 2.6.21-mm1 on top of
> > 2.6.21 and then removed git-input patch. It is ok since then.

Have you tried any other -mm? Also, does it help if you stick

	ps2_command(&psmouse->ps2dev, NULL, PSMOUSE_CMD_SETSTREAM);

at the very beginning of psmouse_initialize() in drivers/input/mouse/psmouse-base.c?

-- 
Dmitry
Re: x86 setup rewrite tree ready for flamage^W review
On Tue, May 08, 2007 at 10:15:21PM -0700, H. Peter Anvin wrote:
> Hello all,
>
> I believe the x86 setup tree is now finished. I will turn it into a
> "clean patchset" later this week, but I wanted to get flamed^W feedback
> on it first.
>
> The git tree is at:
>
> http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-newsetup.git;a=summary
> git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-newsetup.git
>
...
>
> ... and a flat patch at ...
>
> http://www.kernel.org/pub/linux/kernel/people/hpa/newsetup-36f021b5.patch
>

Wow, reading code in C is so much better than decoding assembly. :-)

Had a quick look, mainly from the relocatable kernel code point of view. Yet to dive deeper.

PHYSICAL_ALIGN needs to be 2MB on x86_64 instead of 1MB.

Thanks
Vivek
Re: 2.6.21 broke arm scsi on qemu -M volatilepb
[adding linux-scsi]

On Thu, 10 May 2007 23:59:10 -0400 Rob Landley wrote:

> Booting a 2.6.20 kernel under qemu works fine and gets me to a shell prompt,
> but booting a 2.6.21.1 kernel cycles endlessly on scsi, going:
>
> Loading iSCSI transport class v2.0-724.
> PCI: enabling device :00:0c.0 (0140 -> 0143)
> sym0: <895a> rev 0x0 at pci :00:0c.0 irq 0
> sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
> sym0: SCSI BUS has been reset.
> scsi0 : sym-2.2.3
> scsi 0:0:0:0: ABORT operation started.
> scsi 0:0:0:0: ABORT operation timed-out.
> scsi 0:0:0:0: DEVICE RESET operation started.
> scsi 0:0:0:0: DEVICE RESET operation timed-out.
> scsi 0:0:0:0: BUS RESET operation started.
> scsi 0:0:0:0: BUS RESET operation timed-out.
> scsi 0:0:0:0: HOST RESET operation started.
> sym0: SCSI BUS has been reset.
> ...
> And so on.
>
> If you're interested in reproducing this, download the most recent
> http://landley.net/hg/firmware snapshot (links up top), run "./build.sh
> armv4l", and when that's done "cd build" and "./run-armv4l.sh".
>
> Is this a known issue? A quick google for "arm scsi 2.6.21" didn't turn up
> anything relevant...
>
> Rob

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
Re: (hacky) [PATCH] silence MODPOST section mismatch warnings
On Thu, May 10, 2007 at 11:18:50PM +0100, Russell King wrote:
> On Fri, May 11, 2007 at 12:16:59AM +0200, Sam Ravnborg wrote:
> > On Thu, May 10, 2007 at 10:59:20PM +0100, Russell King wrote:
> > > file:(section+offset): message
> >
> > I like the new format - thanks!
> > Did you drop the ':' after the file on purpose?
>
> Oops, yes.
>
> > PS. Will apply the patch you submitted in next mail.
>
> Do you want a patch with added colons?

I will add them locally - no problem.

	Sam
2.6.21 broke arm scsi on qemu -M volatilepb
Booting a 2.6.20 kernel under qemu works fine and gets me to a shell prompt, but booting a 2.6.21.1 kernel cycles endlessly on scsi, going:

Loading iSCSI transport class v2.0-724.
PCI: enabling device :00:0c.0 (0140 -> 0143)
sym0: <895a> rev 0x0 at pci :00:0c.0 irq 0
sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
scsi 0:0:0:0: ABORT operation started.
scsi 0:0:0:0: ABORT operation timed-out.
scsi 0:0:0:0: DEVICE RESET operation started.
scsi 0:0:0:0: DEVICE RESET operation timed-out.
scsi 0:0:0:0: BUS RESET operation started.
scsi 0:0:0:0: BUS RESET operation timed-out.
scsi 0:0:0:0: HOST RESET operation started.
sym0: SCSI BUS has been reset.
...

And so on.

If you're interested in reproducing this, download the most recent http://landley.net/hg/firmware snapshot (links up top), run "./build.sh armv4l", and when that's done "cd build" and "./run-armv4l.sh".

Is this a known issue? A quick google for "arm scsi 2.6.21" didn't turn up anything relevant...

Rob
3 years since last 2.2 release, why still on kernel.org main page?
Out of curiosity, since 2.2 hasn't had a release in 3 years, and the last prepatch was 2 years ago, why is its status still on the kernel.org main page? Not exactly something people are checking the status of on a daily basis...

Just wondering...

Rob
exporting variables across modules
I have 2 kernel modules. In one module I have 2 global pointers (of type unsigned char and int respectively). I'd like to use these 2 pointers in the other module. I am adopting the following approach but am not really sure whether it's right or not:

1. I have exported both the pointers using EXPORT_SYMBOL(sym1) and EXPORT_SYMBOL(sym2).
2. Then, in the module in which I want to use these pointers, I have declared them as global extern variables:

   extern unsigned char *sym1;
   extern unsigned int *sym2;

Is this the right way to use these pointers? Kindly guide me.

Regards,
Bhuvan
Re: 2.6.21-mm2 - 100% CPU on ksoftirqd/1
On Wed, 09 May 2007 12:08:43 EDT, [EMAIL PROTECTED] said:
> On Wed, 09 May 2007 01:23:22 PDT, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21/2.6.21-mm2/
>
> Boots up to multiuser mostly OK. However...
>
> It comes up with a screaming ksoftirqd - usually /1 but one boot had /0.
> Just sitting there, 100% CPU according to 'top'. Tried
> 'echo t > /proc/sysrq-trigger' to get a trace, but it was always running
> on the other CPU - even after I reniced it down to 19 and launched 2
> 'for(;;)' C programs to suck the cycles. It would be failing to get any
> CPU - until I did the 'echo t' and then it would be "running" again.
> Anybody got any good debugging ideas here?

OK, finally tracked this one down - the out-of-tree iwlwifi git tree for the Intel 3945ABG card had some disagreements with the 2.6.21-mm2 git-wireless.patch.

Unfortunately, the last known-working kernel for this was -rc5-mm2, as I didn't test -rc6-mm* or -rc7-mm* for this (I hit other issues with those so I didn't notice this one). I'll try to work up some of those tomorrow and see if I can narrow it down at least a *little* bit.
Re: [GIT PATCH] ACPI patches for 2.6.22 - part 2
On Thu, 10 May 2007, Len Brown wrote:
>
> That said, can you send me or point me to the acpidump output
> for your EVO. Yes, I'm sure you've sent it before a long time
> ago, but that was about probably 2,000,000 e-mail messages
> and a couple of disk crashes ago:-)

Sure. If you send me a pointer to "acpidump" again, because I've long since updated that machine, and no longer have it.

		Linus
Re: [PATCH] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free
Tomohiro Kusumi wrote:
> Dear Auke
>
>> I'm ok with the bottom part of the patch, but I do not like
>> the modification of the pci device ID table in this way. As
>> Arjan van de Ven previously commented as well, this makes
>> it hard for future device ID's to be bound to the driver.
>
> I googled the previous comment by Arjan. Now I understand that
> the patch makes it difficult to add PCI ID's to the driver at
> runtime.
>
>> On top of that, there is no logical correlation between the
>> mapping and chipsets, so a lot of information is lost in that
>> table. It really does not show which _chipsets_ support this
>> functionality.
>
> Thanks for pointing out the problem, but I can't quite understand
> what you are trying to say. What do you mean by the chipset?
> Are you talking about the chipset of the NIC? or the South bridge?
> I'd be glad if you can explain it to me.

Perhaps my wording was poor. I was referring to the NIC chip.

Since there are about 12 different physical e1000 NIC chips (and lots of different pci IDs per e1000 NIC chip), it would be best to tie the ability to work without legacy IO mode to each NIC chip number instead of providing this mapping based on the PCI device ID.

It would serve two purposes: new pci id's for a chipset that we already know can work without legacy IO would automatically inherit this property from the NIC chipset properties, and new e1000 chips would automatically get a default property for this value.

I will (time permitting) try to reverse your matrix to chip numbers and see if we can add this property in a much easier way.

Auke

> Tomohiro Kusumi
>
> Kok, Auke wrote:
>> Tomohiro Kusumi wrote:
>>> Hi
>>>
>>> As you can see in the "10. pci_enable_device_bars() and Legacy I/O
>>> Port space" of the Documentation/pci.txt, the latest kernel has
>>> interfaces for PCI device drivers to tell the kernel which resource
>>> the driver want to use, ex. I/O port or MMIO.
>>>
>>> I've made a patch which makes Intel e1000 driver legacy I/O port
>>> free by using the PCI core changes I mentioned above.
>>>
>>> The Intel e1000 driver can handle some of its devices without using
>>> I/O port. So this patch changes the driver not to enable/request
>>> I/O port region depending on the device id.
>>>
>>> As a result, the driver can handle its device even when there are
>>> huge number of PCI devices being used on the system and no I/O
>>> port region assigned to the device.
>>
>> Tomohiro,
>>
>> I'm ok with the bottom part of the patch, but I do not like the
>> modification of the pci device ID table in this way. As Arjan van
>> de Ven previously commented as well, this makes it hard for future
>> device ID's to be bound to the driver.
>>
>> On top of that, there is no logical correlation between the mapping
>> and chipsets, so a lot of information is lost in that table. It
>> really does not show which _chipsets_ support this functionality.
>>
>> I think if we want to work with this, we need some way of mapping
>> the device ID's back to chipsets, and enable the feature on that
>> basis.
Auke Tomohiro Kusumi Signed-off-by: Tomohiro Kusumi <[EMAIL PROTECTED]> --- e1000.h |6 +- e1000_main.c | 152 +++ 2 files changed, 86 insertions(+), 72 deletions(-) diff -uprN linux-2.6.21.orig/drivers/net/e1000/e1000.h linux-2.6.21/drivers/net/e1000/e1000.h --- linux-2.6.21.orig/drivers/net/e1000/e1000.h2007-05-09 18:02:26.0 +0900 +++ linux-2.6.21/drivers/net/e1000/e1000.h2007-05-09 18:02:59.0 +0900 @@ -74,8 +74,9 @@ #define BAR_11 #define BAR_55 -#define INTEL_E1000_ETHERNET_DEVICE(device_id) {\ -PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id)} +#define E1000_USE_IOPORT (1 << 0) +#define INTEL_E1000_ETHERNET_DEVICE(device_id, flags) {\ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id), .driver_data = flags} struct e1000_adapter; @@ -347,6 +348,7 @@ struct e1000_adapter { boolean_t quad_port_a; unsigned long flags; uint32_t eeprom_wol; +int bars; /* BARs to be enabled */ }; enum e1000_state_t { diff -uprN linux-2.6.21.orig/drivers/net/e1000/e1000_main.c linux-2.6.21/drivers/net/e1000/e1000_main.c --- linux-2.6.21.orig/drivers/net/e1000/e1000_main.c2007-05-09 18:02:27.0 +0900 +++ linux-2.6.21/drivers/net/e1000/e1000_main.c2007-05-09 18:03:00.0 +0900 @@ -48,65 +48,65 @@ static char e1000_copyright[] = "Copyrig * {PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id)} */ static struct pci_device_id e1000_pci_tbl[] = { -INTEL_E1000_ETHERNET_DEVICE(0x1000), -INTEL_E1000_ETHERNET_DEVICE(0x1001), -INTEL_E1000_ETHERNET_DEVICE(0x1004), -INTEL_E1000_ETHERNET_DEVICE(0x1008), -INTEL_E1000_ETHERNET_DEVICE(0x1009), -INTEL_E1000_ETHERNET_DEVICE(0x100C), -INTEL_E1000_ETHERNET_DEVICE(0x100D), -INTEL_E1000_ETHERNET_DEVICE(0x100E), -INTEL_E1000_ETHERNET_DEVICE(0x100F), -INTEL_E1000_ETHERNET_DEVICE(0x1010), -
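
Auke's suggestion in this thread, keying the legacy I/O requirement on the NIC chip rather than on each PCI device ID, might look roughly like the user-space sketch below. Everything in it is hypothetical: the chip names, the device IDs, the RES_* flags and resources_for_device() are invented for illustration and are not taken from the e1000 driver or the posted patch. The conservative default for unknown IDs is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chip families; the real driver has its own list. */
enum chip_type { CHIP_OLD, CHIP_NEW, CHIP_COUNT };

/* Resource kinds a device may request (MMIO vs legacy I/O port). */
#define RES_MEM (1u << 0)
#define RES_IO  (1u << 1)

/* One entry per chip: does it still need legacy I/O port access? */
static const int chip_needs_ioport[CHIP_COUNT] = {
	[CHIP_OLD] = 1,
	[CHIP_NEW] = 0,
};

/* Device IDs only record which chip they belong to. */
struct id_entry {
	unsigned short device_id;
	enum chip_type chip;
};

/* Made-up device IDs, for illustration only. */
static const struct id_entry id_table[] = {
	{ 0x0001, CHIP_OLD },
	{ 0x0002, CHIP_NEW },
	{ 0x0003, CHIP_NEW },
};

/* Decide which resources to request, based on the chip behind the ID. */
static unsigned int resources_for_device(unsigned short device_id)
{
	size_t i;

	for (i = 0; i < sizeof(id_table) / sizeof(id_table[0]); i++) {
		if (id_table[i].device_id != device_id)
			continue;
		assert(id_table[i].chip < CHIP_COUNT);
		if (chip_needs_ioport[id_table[i].chip])
			return RES_MEM | RES_IO;
		return RES_MEM;
	}
	/* Unknown ID (e.g. added at runtime): assume I/O ports are needed. */
	return RES_MEM | RES_IO;
}
```

In this layout a new device ID for an already-known chip inherits the chip's property by naming the chip, which is the first of the two purposes Auke describes; how a real driver would pick the default for unknown IDs is left open here.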
Re: Kconfig warnings on latest GIT
On Fri, May 11, 2007 at 11:27:22AM +0900, Simon Horman wrote:
> On Thu, May 10, 2007 at 09:13:34PM -0500, Kumar Gala wrote:
> > On Fri, 11 May 2007, Simon Horman wrote:
> >
> > > On Thu, May 10, 2007 at 08:47:05PM -0500, Kumar Gala wrote:
> > > > Try this patch:
> > >
> > > That certainly resolves the problem for me.
> > > I'll see about doing something like that for the similar
> > > Kconfig problems that I see.
> >
> > I've got a similar fix for SYS_SUPPORTS_APM_EMULATION already. I'll push
> > both of these to Paul. If you can put something in place for the
> > Atari/68k and send it to Geert that would be good (feeling a little lazy
> > right now :)
> >
> > I'm still not happy about this fix. I'd like to get Sam's feeling on if
> > we can fixup kconfig not to warn if the dependency isn't met. I think
> > the select is valid, and would prefer to fix this properly before we paper
> > tape over it.
>
> I agree. I had thought a little about a kconfig fix. Though I'm
> wondering if removing the warning will lead to oodles of dangling
> symbols and invalid checks over time.
>
> In any case, I'll look into the Atari problem. At least that way
> there will be some patches to add to the discussion.

The fix below seems to work for the ATARI problem. Do you want me to submit it properly, do you want to submit it along with the other patches, or do you think we should sit on things for a bit?

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

From: Simon Horman <[EMAIL PROTECTED]>
Subject: [PATCH] [IA64] ATARI_KBD_CORE only exists on m68k

ATARI_KBD_CORE doesn't exist on architectures other than m68k, which causes the following warnings:

drivers/input/keyboard/Kconfig:170:warning: 'select' used by config symbol 'KEYBOARD_ATARI' refers to undefined symbol 'ATARI_KBD_CORE'
drivers/input/mouse/Kconfig:181:warning: 'select' used by config symbol 'MOUSE_ATARI' refers to undefined symbol 'ATARI_KBD_CORE'

By reversing the Kconfig logic, the same results should occur on m68k as with the current code, but the warnings go away on other platforms.

Cc: Kumar Gala <[EMAIL PROTECTED]>
Signed-off-by: Simon Horman <[EMAIL PROTECTED]>

---
 arch/m68k/Kconfig              |    1 +
 drivers/input/keyboard/Kconfig |    1 -
 drivers/input/mouse/Kconfig    |    1 -
 3 files changed, 1 insertion(+), 2 deletions(-)

Index: linux-2.6/arch/m68k/Kconfig
===================================================================
--- linux-2.6.orig/arch/m68k/Kconfig	2007-05-11 11:37:25.0 +0900
+++ linux-2.6/arch/m68k/Kconfig	2007-05-11 11:42:48.0 +0900
@@ -410,6 +410,7 @@ config STRAM_PROC
 	  Say Y here to report ST-RAM usage statistics in /proc/stram.
 
 config ATARI_KBD_CORE
+	default y if KEYBOARD_ATARI || MOUSE_ATARI
 	bool
 
 config HEARTBEAT
Index: linux-2.6/drivers/input/keyboard/Kconfig
===================================================================
--- linux-2.6.orig/drivers/input/keyboard/Kconfig	2007-05-11 11:37:25.0 +0900
+++ linux-2.6/drivers/input/keyboard/Kconfig	2007-05-11 11:42:53.0 +0900
@@ -167,7 +167,6 @@ config KEYBOARD_AMIGA
 config KEYBOARD_ATARI
 	tristate "Atari keyboard"
 	depends on ATARI
-	select ATARI_KBD_CORE
 	help
 	  Say Y here if you are running Linux on any Atari and have
 	  a keyboard attached.
Index: linux-2.6/drivers/input/mouse/Kconfig
===================================================================
--- linux-2.6.orig/drivers/input/mouse/Kconfig	2007-05-11 11:40:32.0 +0900
+++ linux-2.6/drivers/input/mouse/Kconfig	2007-05-11 11:42:58.0 +0900
@@ -178,7 +178,6 @@ config MOUSE_AMIGA
 config MOUSE_ATARI
 	tristate "Atari mouse"
 	depends on ATARI
-	select ATARI_KBD_CORE
 	help
 	  Say Y here if you have an Atari and want its native mouse
 	  supported by the kernel.
[PATCH] v4l: saa7134: support ir-remote for 10moons TM300
Enable the IR-remote of the 10moons TM300 card and add the key-codes for it's remote. It has been tested using lirc. All the key codes are accepted. Signed-off-by: Tony Wan <[EMAIL PROTECTED]> --- drivers/media/common/ir-keymaps.c | 69 +++ drivers/media/video/saa7134/saa7134-cards.c |1 + drivers/media/video/saa7134/saa7134-input.c |6 ++ include/media/ir-common.h |1 + 4 files changed, 77 insertions(+), 0 deletions(-) diff --git a/drivers/media/common/ir-keymaps.c b/drivers/media/common/ir-keymaps.c index cbd1184..5aa293e 100644 --- a/drivers/media/common/ir-keymaps.c +++ b/drivers/media/common/ir-keymaps.c @@ -1783,3 +1783,72 @@ IR_KEYTAB_TYPE ir_codes_tt_1500[IR_KEYTAB_SIZE] = { }; EXPORT_SYMBOL_GPL(ir_codes_tt_1500); + +/* 10MOONS TM300 */ +IR_KEYTAB_TYPE ir_codes_10moonstm3[IR_KEYTAB_SIZE] = { + [ 0x10 ] = KEY_POWER, // Power + [ 0x0d ] = KEY_MUTE,// Mute + [ 0x1e ] = KEY_TUNER, // Cable + [ 0x00 ] = KEY_VIDEO, // Composite / S-Video + [ 0x01 ] = KEY_RADIO, // Music + [ 0x02 ] = KEY_TEXT,// Photo + + [ 0x1f ] = KEY_1, + [ 0x03 ] = KEY_2, + [ 0x04 ] = KEY_3, + [ 0x05 ] = KEY_4, + [ 0x1c ] = KEY_5, + [ 0x06 ] = KEY_6, + [ 0x07 ] = KEY_7, + [ 0x08 ] = KEY_8, + [ 0x1d ] = KEY_9, + [ 0x09 ] = KEY_SELECT, // 2 digit select (-/--) + [ 0x0a ] = KEY_0, + [ 0x0b ] = KEY_AGAIN, // Recall + + [ 0x14 ] = KEY_F1, // Begin + [ 0x15 ] = KEY_F2, // End + + [ 0x16 ] = KEY_CHANNELUP, // CH+ + [ 0x12 ] = KEY_CHANNELDOWN, // CH- + [ 0x0c ] = KEY_VOLUMEUP,// VOL+ + [ 0x17 ] = KEY_VOLUMEDOWN, // VOL- + [ 0x18 ] = KEY_OK, // OK + + [ 0x0e ] = KEY_EXIT,// Exit + [ 0x13 ] = KEY_COMPUTER,// Desktop + [ 0x11 ] = KEY_TAB, // TAB + [ 0x19 ] = KEY_CYCLEWINDOWS,// Switch task + + [ 0x1a ] = KEY_MENU,// Menu + [ 0x1b ] = KEY_ZOOM,// Fullscreen + [ 0x24 ] = KEY_ARCHIVE, // Time shifting + [ 0x20 ] = KEY_SWITCHVIDEOMODE, // Selcect source + + [ 0x3a ] = KEY_RECORD, // Record + [ 0x22 ] = KEY_PLAY,// Play/Pause + [ 0x25 ] = KEY_STOP,// Stop + [ 0x23 ] = KEY_CAMERA, // Snapshot + + [ 0x28 ] = 
KEY_BACK,// Backward << + [ 0x2a ] = KEY_FORWARD, // Forward >> + [ 0x29 ] = KEY_PREVIOUS,// Back |<< + [ 0x2b ] = KEY_NEXT,// End >>| + + [ 0x2c ] = KEY_PROGRAM, // Multi-view + [ 0x2d ] = KEY_AUDIO, // Audio Tracks + [ 0x2e ] = KEY_SOUND, // Sound + [ 0x2f ] = KEY_SUBTITLE,// Subtitles + + [ 0x30 ] = KEY_TIME,// Set timer + [ 0x31 ] = KEY_CHANNEL, // Stereo + [ 0x32 ] = KEY_LANGUAGE,// Language + [ 0x33 ] = KEY_TEXT,// Text + + [ 0x39 ] = KEY_RED, // RED + [ 0x21 ] = KEY_GREEN, // GREEN + [ 0x27 ] = KEY_YELLOW, // YELLOW + [ 0x37 ] = KEY_BLUE,// BLUE +}; + +EXPORT_SYMBOL_GPL(ir_codes_10moonstm3); diff --git a/drivers/media/video/saa7134/saa7134-cards.c b/drivers/media/video/saa7134/saa7134-cards.c index 44f2077..5813509 100644 --- a/drivers/media/video/saa7134/saa7134-cards.c +++ b/drivers/media/video/saa7134/saa7134-cards.c @@ -4368,6 +4368,7 @@ int saa7134_board_init1(struct saa7134_dev *dev) case SAA7134_BOARD_AVERMEDIA_A16AR: case SAA7134_BOARD_ENCORE_ENLTV: case SAA7134_BOARD_ENCORE_ENLTV_FM: + case SAA7134_BOARD_10MOONSTVMASTER3: dev->has_remote = SAA7134_REMOTE_GPIO; break; case SAA7134_BOARD_FLYDVBS_LR300: diff --git a/drivers/media/video/saa7134/saa7134-input.c b/drivers/media/video/saa7134/saa7134-input.c index c0de37e..c87755b 100644 --- a/drivers/media/video/saa7134/saa7134-input.c +++ b/drivers/media/video/saa7134/saa7134-input.c @@ -333,6 +333,12 @@ int saa7134_input_init1(struct saa7134_dev *dev) mask_keyup = 0x04; polling = 50; // ms break; + case SAA7134_BOARD_10MOONSTVMASTER3: + ir_codes = ir_codes_10moonstm3; + mask_keycode = 0x4f8; + mask_keyup = 0x800; + polling = 50; //ms + break; } if (NULL == ir_codes) { printk("%s: Oops: IR config error [card=%d]\n", diff --git a/include/media/ir-common.h b/include/media/ir-common.h index 9807a7c..4e4d207 100644 --- a/include/media/ir-common.h +++ b/include/media/ir-common.h @@ -140,6 +140,7 @@ extern IR_KEYTAB_TYPE
Re: [PATCH] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free
Dear Auke > I'm ok with the bottom part of the patch, but I do not like > the modification of the pci device ID table in this way. As > Arjan van der Ven previously commented as well, this makes > it hard for future device ID's to be bound to the driver. I googled the previous comment by Arjan. Now I understand that the patch makes it difficult to add PCI ID's to the driver at runtime. > On top of that, there is no logical correlation between the > mapping and chipsets, so a lot of information is lost in that > table. It really does not show which _chipsets_ support this > functionality. Thanks for pointing out the problem, but I can't quite understand what you are trying to say. What do you mean by the chipset? Are you talking about the chipset of the NIC? or the South bridge? I'd be glad if you can explain it to me. Tomohiro Kusumi Kok, Auke wrote: Tomohiro Kusumi wrote: Hi As you can see in the "10. pci_enable_device_bars() and Legacy I/O Port space" of the Documentation/pci.txt, the latest kernel has interfaces for PCI device drivers to tell the kernel which resource the driver want to use, ex. I/O port or MMIO. I've made a patch which makes Intel e1000 driver legacy I/O port free by using the PCI core changes I mentioned above. The Intel e1000 driver can handle some of its devices without using I/O port. So this patch changes the driver not to enable/request I/O port region depending on the device id. As a result, the driver can handle its device even when there are huge number of PCI devices being used on the system and no I/O port region assigned to the device. Tomohiro, I'm ok with the bottom part of the patch, but I do not like the modification of the pci device ID table in this way. As Arjan van der Ven previously commented as well, this makes it hard for future device ID's to be bound to the driver. On top of that, there is no logical correlation between the mapping and chipsets, so a lot of information is lost in that table. 
It really does not show which _chipsets_ support this functionality. I think if we want to work with this, we need some way of mapping the device ID's back to chipsets, and enable the feature on that basis. Auke Tomohiro Kusumi Signed-off-by: Tomohiro Kusumi <[EMAIL PROTECTED]> --- e1000.h |6 +- e1000_main.c | 152 +++ 2 files changed, 86 insertions(+), 72 deletions(-) diff -uprN linux-2.6.21.orig/drivers/net/e1000/e1000.h linux-2.6.21/drivers/net/e1000/e1000.h --- linux-2.6.21.orig/drivers/net/e1000/e1000.h2007-05-09 18:02:26.0 +0900 +++ linux-2.6.21/drivers/net/e1000/e1000.h2007-05-09 18:02:59.0 +0900 @@ -74,8 +74,9 @@ #define BAR_11 #define BAR_55 -#define INTEL_E1000_ETHERNET_DEVICE(device_id) {\ -PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id)} +#define E1000_USE_IOPORT (1 << 0) +#define INTEL_E1000_ETHERNET_DEVICE(device_id, flags) {\ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id), .driver_data = flags} struct e1000_adapter; @@ -347,6 +348,7 @@ struct e1000_adapter { boolean_t quad_port_a; unsigned long flags; uint32_t eeprom_wol; +int bars; /* BARs to be enabled */ }; enum e1000_state_t { diff -uprN linux-2.6.21.orig/drivers/net/e1000/e1000_main.c linux-2.6.21/drivers/net/e1000/e1000_main.c --- linux-2.6.21.orig/drivers/net/e1000/e1000_main.c2007-05-09 18:02:27.0 +0900 +++ linux-2.6.21/drivers/net/e1000/e1000_main.c2007-05-09 18:03:00.0 +0900 @@ -48,65 +48,65 @@ static char e1000_copyright[] = "Copyrig * {PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id)} */ static struct pci_device_id e1000_pci_tbl[] = { -INTEL_E1000_ETHERNET_DEVICE(0x1000), -INTEL_E1000_ETHERNET_DEVICE(0x1001), -INTEL_E1000_ETHERNET_DEVICE(0x1004), -INTEL_E1000_ETHERNET_DEVICE(0x1008), -INTEL_E1000_ETHERNET_DEVICE(0x1009), -INTEL_E1000_ETHERNET_DEVICE(0x100C), -INTEL_E1000_ETHERNET_DEVICE(0x100D), -INTEL_E1000_ETHERNET_DEVICE(0x100E), -INTEL_E1000_ETHERNET_DEVICE(0x100F), -INTEL_E1000_ETHERNET_DEVICE(0x1010), -INTEL_E1000_ETHERNET_DEVICE(0x1011), -INTEL_E1000_ETHERNET_DEVICE(0x1012), 
-INTEL_E1000_ETHERNET_DEVICE(0x1013), -INTEL_E1000_ETHERNET_DEVICE(0x1014), -INTEL_E1000_ETHERNET_DEVICE(0x1015), -INTEL_E1000_ETHERNET_DEVICE(0x1016), -INTEL_E1000_ETHERNET_DEVICE(0x1017), -INTEL_E1000_ETHERNET_DEVICE(0x1018), -INTEL_E1000_ETHERNET_DEVICE(0x1019), -INTEL_E1000_ETHERNET_DEVICE(0x101A), -INTEL_E1000_ETHERNET_DEVICE(0x101D), -INTEL_E1000_ETHERNET_DEVICE(0x101E), -INTEL_E1000_ETHERNET_DEVICE(0x1026), -INTEL_E1000_ETHERNET_DEVICE(0x1027), -INTEL_E1000_ETHERNET_DEVICE(0x1028), -INTEL_E1000_ETHERNET_DEVICE(0x1049), -INTEL_E1000_ETHERNET_DEVICE(0x104A), -INTEL_E1000_ETHERNET_DEVICE(0x104B), -INTEL_E1000_ETHERNET_DEVICE(0x104C), -
Re: Kconfig warnings on latest GIT
Simon Horman wrote:
> I agree. I had thought a little about a kconfig fix. Though I'm
> wondering if removing the warning will lead to oodles of dangling
> symbols and invalid checks over time.

I'm pretty sure it will. Perhaps we need to have a lint for Kconfig?
Re: [ckrm-tech] [PATCH 3/9] Containers (V9): Add tasks file interface
Paul Menage wrote:
> On 5/8/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>>
>> I now have a use case for maintaining a per-container task list.
>> I am trying to build a per-container stats similar to taskstats.
>> I intend to support container accounting of
>>
>> 1. Tasks running
>> 2. Tasks stopped
>> 3. Tasks un-interruptible
>> 4. Tasks blocked on IO
>> 5. Tasks sleeping
>>
>> This would provide statistics similar to the patch that Pavel had sent
>> out.
>>
>> I faced the following problems while trying to implement this feature
>>
>> 1. There is no easy way to get a list of all tasks belonging to a
>>    container (we need to walk all threads)
>
> Well, walking the task list is pretty easy - but yes, it could become
> inefficient when there are many small containers in use.
>
> I've got some ideas for a way of tracking this specifically for
> containers with subsystems that want this, while avoiding the overhead
> for subsystems that don't really need it. I'll try to add them to the
> next patchset.

Super!

>
>> 2. There is no concept of a container identifier. When a user issues a
>>    command to extract statistics, the only unique container identifier is
>>    the container path, which means that we need to do a path lookup to
>>    determine the dentry for the container (which gets quite ugly with all
>>    the string manipulation)
>
> We could just cache the container path permanently in the container,
> and invalidate it if any of its parents gets renamed. (I imagine this
> happens almost never.)
>

Here's what I have so far: I cache the mount point of the container and
add the container path to it. I'm now stuck examining tasks. While walking
through a bunch of tasks, there is no easy way of knowing the container
path of the task without walking all subsystems and then extracting the
container's absolute path.

>>
>>    Adding a container id will make it easier to find a container and
>>    return statistics belonging to the container.
>
> Not unreasonable, but there are a few questions that would have to be
> answered:
>
> - how is the container id picked? Like a pid, or user-defined? Or some
> kind of string?
>

I was planning on using a hierarchical scheme, top 8 bits for the
container hierarchy and bottom 24 for a unique id. The id is automatically
selected. Once we know the container id, we'll need a more efficient
mechanism to map the id to the container.

> - how would it be exposed to userspace? A generic control file
> provided by the container filesystem in all container directories?
>

A file in all container directories is an option.

> - can you give a more concrete example of how this would actually be
> useful? For your container stats, it seems that just reading a control
> file in the container's directory would give you the stats that you
> want, and userspace already knows the container's name/id since it
> opened the control file.
>

Sure, the plan is to build a containerstats interface like taskstats.
In taskstats, we exchange data between user space and kernel space
using genetlink sockets. We have a push and pull mechanism for statistics.

> Paul

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
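For readers following the id discussion, here is a minimal userspace sketch of the layout described above (top 8 bits for the hierarchy, bottom 24 bits for a unique id). It assumes a 32-bit id; the macro and function names are hypothetical and do not come from any posted patch.

```c
#include <stdint.h>

/* Assumed layout: bits 31..24 = hierarchy, bits 23..0 = unique id. */
#define CONTAINER_ID_BITS 24
#define CONTAINER_ID_MASK ((UINT32_C(1) << CONTAINER_ID_BITS) - 1)

/* Pack a hierarchy number and a per-hierarchy unique id into one word.
 * Values that do not fit their field are silently truncated. */
static inline uint32_t container_id_make(uint32_t hierarchy, uint32_t unique)
{
	return ((hierarchy & 0xffu) << CONTAINER_ID_BITS) |
	       (unique & CONTAINER_ID_MASK);
}

/* Extract the hierarchy number (top 8 bits). */
static inline uint32_t container_id_hierarchy(uint32_t id)
{
	return id >> CONTAINER_ID_BITS;
}

/* Extract the unique id (bottom 24 bits). */
static inline uint32_t container_id_unique(uint32_t id)
{
	return id & CONTAINER_ID_MASK;
}
```

This only covers encoding. The mapping from id back to a container, which the mail says still needs an efficient mechanism, is not sketched here.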
Re: Kconfig warnings on latest GIT
On Thu, May 10, 2007 at 09:13:34PM -0500, Kumar Gala wrote:
> On Fri, 11 May 2007, Simon Horman wrote:
>
> > On Thu, May 10, 2007 at 08:47:05PM -0500, Kumar Gala wrote:
> > > Try this patch:
> >
> > That certainly resolves the problem for me.
> > I'll see about doing something like that for the similar
> > Kconfig problems that I see.
>
> I've got a similar fix for SYS_SUPPORTS_APM_EMULATION already. I'll push
> both of these to Paul. If you can put something in place for the
> Atari/68k and send it to Geert that would be good (feeling a little lazy
> right now :)
>
> I'm still not happy about this fix. I'd like to get Sam's feeling on if
> we can fixup kconfig not to warn if the dependency isn't meet. I think
> the select is valid, and would prefer to fix this properly before we paper
> tape over it.

I agree. I had thought a little about a kconfig fix. Though I'm
wondering if removing the warning will lead to oodles of dangling
symbols and invalid checks over time.

In any case, I'll look into the Atari problem. At least that way
there will be some patches to add to the discussion.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
Re: [RFC] Slab allocators: Drop support for destructors
On Fri, May 11, 2007 at 08:35:27AM +0900, Paul Mundt wrote:
> On Thu, May 10, 2007 at 12:00:08PM -0700, Christoph Lameter wrote:
> > As far as I can tell there is only a single slab destructor left (there
> > is currently another in i386 but its going to go as soon as Andi merges
> > i386s support for quicklists).
> >
> > I wonder how difficult it would be to remove it? If we have no need for
> > destructors anymore then maybe we could remove destructor support from the
> > slab allocators? There is no point in checking for destructor uses in
> > the slab allocators if there are none.
> >
> > Or are there valid reason to keep them around? It seems they were mainly
> > used for list management which required them to take a spinlock. Taking a
> > spinlock in a destructor is a bit risky since the slab allocators may run
> > the destructors anytime they decide a slab is no longer needed.
> >
> > Or do we want to continue support destructors? If so why?
> >
> [snip pmb stuff]
>
> I'll take a look at tidying up the PMB slab, getting rid of the dtor
> shouldn't be terribly painful. I simply opted to do the list management
> there since others were doing it for the PGD slab cache at the time that
> was written.

And here's the bit for dropping pmb_cache_dtor(), moving the list
management up to pmb_alloc() and pmb_free(). With this applied, we're all
set for killing off slab destructors from the kernel entirely.

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

--

 arch/sh/mm/pmb.c |   79 ++-
 1 file changed, 38 insertions(+), 41 deletions(-)

diff --git a/arch/sh/mm/pmb.c b/arch/sh/mm/pmb.c
index 02aae06..b6a5a33 100644
--- a/arch/sh/mm/pmb.c
+++ b/arch/sh/mm/pmb.c
@@ -3,7 +3,7 @@
  *
  * Privileged Space Mapping Buffer (PMB) Support.
  *
- * Copyright (C) 2005, 2006 Paul Mundt
+ * Copyright (C) 2005, 2006, 2007 Paul Mundt
  *
  * P1/P2 Section mapping definitions from map32.h, which was:
  *
@@ -68,6 +68,32 @@ static inline unsigned long mk_pmb_data(unsigned int entry)
 	return mk_pmb_entry(entry) | PMB_DATA;
 }
 
+static DEFINE_SPINLOCK(pmb_list_lock);
+static struct pmb_entry *pmb_list;
+
+static inline void pmb_list_add(struct pmb_entry *pmbe)
+{
+	struct pmb_entry **p, *tmp;
+
+	p = &pmb_list;
+	while ((tmp = *p) != NULL)
+		p = &tmp->next;
+
+	pmbe->next = tmp;
+	*p = pmbe;
+}
+
+static inline void pmb_list_del(struct pmb_entry *pmbe)
+{
+	struct pmb_entry **p, *tmp;
+
+	for (p = &pmb_list; (tmp = *p); p = &tmp->next)
+		if (tmp == pmbe) {
+			*p = tmp->next;
+			return;
+		}
+}
+
 struct pmb_entry *pmb_alloc(unsigned long vpn, unsigned long ppn,
 			    unsigned long flags)
 {
@@ -81,11 +107,19 @@ struct pmb_entry *pmb_alloc(unsigned long vpn, unsigned long ppn,
 	pmbe->ppn	= ppn;
 	pmbe->flags	= flags;
 
+	spin_lock_irq(&pmb_list_lock);
+	pmb_list_add(pmbe);
+	spin_unlock_irq(&pmb_list_lock);
+
 	return pmbe;
 }
 
 void pmb_free(struct pmb_entry *pmbe)
 {
+	spin_lock_irq(&pmb_list_lock);
+	pmb_list_del(pmbe);
+	spin_unlock_irq(&pmb_list_lock);
+
 	kmem_cache_free(pmb_cache, pmbe);
 }
 
@@ -167,31 +201,6 @@ void clear_pmb_entry(struct pmb_entry *pmbe)
 	clear_bit(entry, &pmb_map);
 }
 
-static DEFINE_SPINLOCK(pmb_list_lock);
-static struct pmb_entry *pmb_list;
-
-static inline void pmb_list_add(struct pmb_entry *pmbe)
-{
-	struct pmb_entry **p, *tmp;
-
-	p = &pmb_list;
-	while ((tmp = *p) != NULL)
-		p = &tmp->next;
-
-	pmbe->next = tmp;
-	*p = pmbe;
-}
-
-static inline void pmb_list_del(struct pmb_entry *pmbe)
-{
-	struct pmb_entry **p, *tmp;
-
-	for (p = &pmb_list; (tmp = *p); p = &tmp->next)
-		if (tmp == pmbe) {
-			*p = tmp->next;
-			return;
-		}
-}
 
 static struct {
 	unsigned long size;
@@ -283,25 +292,14 @@ void pmb_unmap(unsigned long addr)
 	} while (pmbe);
 }
 
-static void pmb_cache_ctor(void *pmb, struct kmem_cache *cachep, unsigned long flags)
+static void pmb_cache_ctor(void *pmb, struct kmem_cache *cachep,
+			   unsigned long flags)
 {
 	struct pmb_entry *pmbe = pmb;
 
 	memset(pmb, 0, sizeof(struct pmb_entry));
 
-	spin_lock_irq(&pmb_list_lock);
-
 	pmbe->entry = PMB_NO_ENTRY;
-	pmb_list_add(pmbe);
-
-	spin_unlock_irq(&pmb_list_lock);
-}
-
-static void pmb_cache_dtor(void *pmb, struct kmem_cache *cachep, unsigned long flags)
-{
-	spin_lock_irq(&pmb_list_lock);
-	pmb_list_del(pmb);
-	spin_unlock_irq(&pmb_list_lock);
 }
 
 static int __init pmb_init(void)
@@ -312,8 +310,7 @@ static int __init pmb_init(void)
 	BUG_ON(unlikely(nr_entries >= NR_PMB_ENTRIES));
 
 	pmb_cache = kmem_cache_create("pmb", sizeof(struct pmb_entry), 0,
-
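The list helpers that this patch moves next to pmb_alloc() and pmb_free() rely on walking a pointer to the link field rather than the node itself. The following is a minimal userspace sketch of that idiom, assuming a simplified stand-in `struct entry` (hypothetical, only the link is modelled) and omitting the spinlock entirely.

```c
#include <stddef.h>

/* Simplified stand-in for struct pmb_entry: only the link is modelled. */
struct entry {
	int id;
	struct entry *next;
};

/* Append e at the tail. p always points at the link that would have to
 * change, so an empty list needs no special case. */
static void list_add_tail(struct entry **head, struct entry *e)
{
	struct entry **p = head;

	while (*p != NULL)
		p = &(*p)->next;

	e->next = NULL;
	*p = e;
}

/* Unlink e if it is on the list; returns 1 if removed, 0 if not found. */
static int list_del(struct entry **head, struct entry *e)
{
	struct entry **p;

	for (p = head; *p != NULL; p = &(*p)->next) {
		if (*p == e) {
			*p = e->next;
			return 1;
		}
	}
	return 0;
}
```

Because the walk holds the address of the link, removing the head and removing an interior node are the same assignment. In the kernel version both operations would be wrapped in the list lock, as the patch does in pmb_alloc() and pmb_free().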
Re: [PATCH] utimensat implementation
On Thursday May 10, [EMAIL PROTECTED] wrote:
> Ulrich Drepper wrote:
> > Neil Brown wrote:
> >> Does it also specify how to find out what granularity is used by the
> >> filesystem? I had a need for this just recently and couldn't see any
> >> way to extract it.
> >
> > That's still on the table. We might end up with an fpathconf() solution.
>
> OK, the pathconf()-based solution will most probably be in the next
> POSIX spec.
>
> Now, somebody has to provide a way to get to this information. The
> kernel does not export it so far. Is it finally time to break down and
> allow pathconf() and fpathconf() syscalls into the kernel?

Maybe... certainly we want some way to get at this information.

It has occurred to me a number of times that there is no easy way to
export information about filesystems from the kernel.

One specific example is request statistics for an NFS filesystem. We
can get system-wide statistics, but getting stats for a single
filesystem isn't possible, and a big reason for this is that there is
nowhere to put that information.

Filesystems also have a variety of mount options, and they are only
available through "/proc/mounts" and can only be changed by "remount",
which is a bit of a clunky interface.

Just about every other kernel object is, or can be, exposed through
sysfs. But filesystems cannot. This is presumably because there is no
unique handle for them (what with name spaces and bind mounts and so
forth).

Each filesystem still has a unique device number (->s_dev), so that
could be used. e.g. we could create
   /sysfs/filesystem/00:03/
which would contain info about the filesystem with device number 0:3.
We could then put time-granularity and other fs-specific info in
there.

I feel that would be more flexible than a specific fpathconfat system
call. But would it be enough? The pathconf values can apparently be
different for different files in a filesystem. Is that important? If
it is, we really would want some new syscall rather than just sysfs
attributes.

So that makes two questions for anyone with opinions:
 1/ Does pathconf have to be per-file, or is per-filesystem OK?
 2/ Can we have a way to put attributes for filesystems in sysfs?

NeilBrown
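To make the naming proposal concrete, here is a small userspace sketch that builds the suggested directory name from a device number already split into major and minor parts. Everything here is an assumption for illustration: the helper name is hypothetical, the "/sysfs/filesystem/" prefix is copied from the mail and does not exist in any kernel, and two-digit hex formatting is a guess, since the "00:03" example does not say whether the fields are decimal or hex.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the per-filesystem directory proposed in
 * the mail as "/sysfs/filesystem/MM:mm/". Returns 0 on success and -1
 * if the buffer is too small or formatting fails.
 */
static int fs_sysfs_path(unsigned int major, unsigned int minor,
			 char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sysfs/filesystem/%02x:%02x/",
			 major, minor);

	/* snprintf reports the length it needed; treat truncation as failure. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would obtain the device number from st_dev after stat() and split it with the C library's major() and minor() macros. Whether per-filesystem attributes are sufficient is exactly the open question 1/ above.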
Re: Kconfig warnings on latest GIT
On Fri, 11 May 2007, Simon Horman wrote:

> On Thu, May 10, 2007 at 08:47:05PM -0500, Kumar Gala wrote:
> > Try this patch:
>
> That certainly resolves the problem for me.
> I'll see about doing something like that for the similar
> Kconfig problems that I see.
>
> --
> Horms
>   H: http://www.vergenet.net/~horms/
>   W: http://www.valinux.co.jp/en/
>

I've got a similar fix for SYS_SUPPORTS_APM_EMULATION already. I'll push
both of these to Paul. If you can put something in place for the
Atari/68k and send it to Geert, that would be good (feeling a little lazy
right now :)

I'm still not happy about this fix. I'd like to get Sam's feeling on
whether we can fix up kconfig not to warn if the dependency isn't met.
I think the select is valid, and would prefer to fix this properly before
we paper over it.

- k
[PATCH] powerpc: fix Kconfig 'select' warning with UCC_FAST
The UCC_GETH Kconfig option in drivers/net/Kconfig had a line to select
the UCC_FAST option in arch/powerpc/sysdev/qe_lib/Kconfig, which is only
used on PowerPC builds. On other architectures, this would generate a
warning. The fix is to have UCC_FAST default to y when UCC_GETH is enabled.

Signed-off-by: Timur Tabi <[EMAIL PROTECTED]>
---

The reason I used 'select' in the first place was because I didn't want to
have to update the definitions of UCC_FAST or UCC_SLOW every time we added a
new UCC device driver, but I guess that's unavoidable.

 arch/powerpc/sysdev/qe_lib/Kconfig |    4 +---
 drivers/net/Kconfig                |    1 -
 2 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/sysdev/qe_lib/Kconfig b/arch/powerpc/sysdev/qe_lib/Kconfig
index 887739f..f611d34 100644
--- a/arch/powerpc/sysdev/qe_lib/Kconfig
+++ b/arch/powerpc/sysdev/qe_lib/Kconfig
@@ -5,15 +5,13 @@
 config UCC_SLOW
 	bool
 	default n
-	select UCC
 	help
 	  This option provides qe_lib support to UCC slow
 	  protocols: UART, BISYNC, QMC
 
 config UCC_FAST
 	bool
-	default n
-	select UCC
+	default y if UCC_GETH
 	help
 	  This option provides qe_lib support to UCC fast
 	  protocols: HDLC, Ethernet, ATM, transparent
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index b86ccd2..5a5c026 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2276,7 +2276,6 @@ config GFAR_NAPI
 config UCC_GETH
 	tristate "Freescale QE Gigabit Ethernet"
 	depends on QUICC_ENGINE
-	select UCC_FAST
 	help
 	  This driver supports the Gigabit Ethernet mode of the QUICC Engine,
 	  which is available on some Freescale SOCs.
-- 
1.5.0.2.260.g2eb065
Re: Kconfig warnings on latest GIT
On Thu, May 10, 2007 at 08:47:05PM -0500, Kumar Gala wrote:
> Try this patch:

That certainly resolves the problem for me.
I'll see about doing something like that for the similar
Kconfig problems that I see.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
Re: Hi, I have one question about rt_mutex.
Steven Rostedt wrote: > Li Yu wrote: > >>> Now since mutexes can be defined by user-land applications, we don't >>> >> want a DOS >> >>> type of application that nests large amounts of mutexes to create a large >>> PI chain, and have the code holding spin locks while looking at a large >>> amount of data. So to prevent this, the implementation not only implements >>> a maximum lock depth, but also only holds at most two different locks at a >>> time, as it walks the PI chain. More about this below. >>> >> After read the implementation of rt_mutex_adjust_prio_chain(), I found >> the we really require maximin lock depth (1024 default), but I can not >> see the check for more same locks duplication. Does this doc is >> inconsistent with code? >> > > Nope, the code and the doc are still the same. > > The thing that was most difficult in writing that document, was a way to > talk about the user locks (futex - fast user mutex) and the kernel locks > (spin_locks) without confusing the two. The max depth is in reference > to the user futex, but the comment about the "at most two different > locks" is referencing the kernel's spin_locks. > > I don't remember talking about looking for "lock duplication", which I'm > thinking you are referring to circular dead locks. I didn't cover that > in the document and I believe I even mentioned that I would not cover > the debug aspect of the code which would handle catching circular deadlocks. > > But back to the "no more than two kernel locks held". This is very > important. Some PI implementations requires all locks in the PI chain to > have their internal locks held (as in spin_locks). But letting user > space determine the number of spin locks held can cause large latencies > for the rest of the system. So we designed a method to only need to > hold two internal spin_locks in the PI chain at a time. The kernel > doesn't care if the user application is abusing itself (holding too many > of it's own user locks). 
But the kernel does care if a user application > can affect other non related applications. > > As Esben already mentioned, the PI chain even lets the locking user > mutex schedule without holding any kernel locks. This is very key. It > keeps the latency down on setting up a PI chain which can be very expensive. > > Note: Esben helped a lot in the development of the final design of > rtmutex.c. > > -- Steve > First, Thanks for such good explanation from you two guru in time. Er, I think these two locks which you said are task->pi_lock and rt_mutex->wait_lock. >The max depth is in reference to the user futex, but the comment >about the "at most two different locks" is referencing the >kernel's spin_locks. This sentence make the my world clear from now on ;) However, I found the sys_futex() do not use rt_mutex, so what's mean of the user futex you said? Even, I have not found any usage for rt_mutex in kernel code. Or, some beautiful story will happen in future? Goodluck. - Li Yu - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kconfig warnings on latest GIT
On May 10, 2007, at 8:25 PM, Simon Horman wrote:

On Thu, May 10, 2007 at 11:56:48AM -0500, Timur Tabi wrote:

Simon Horman wrote:

So my question is: in which Kconfig do I define "UCC_FAST_TEMP" and
"UCC_SLOW_TEMP"? At first I thought, just put it in drivers/Kconfig,
but that Kconfig does nothing but include other Kconfigs. I believe
that if I submit a patch that adds "UCC_FAST_TEMP" and "UCC_SLOW_TEMP"
to drivers/Kconfig, it will be rejected. Either that, or I'll spend six
weeks trying to persuade everyone that it's a good idea.

Does anyone have any suggestions on how I can fix this?

That does seem like a reasonable suggestion, and one that would
probably work well with the other similar problems that have been
introduced since 2.6.21.

Looks like the fix is simpler than I thought. Instead of having
UCC_GETH select UCC_FAST I need to do

UCC_FAST
	default y if UCC_GETH

I pondered something like that, but I couldn't get it quite right :(

I'll have a patch that fixes this out later today. I chose the first
method because I wanted each individual UCC device driver to select
UCC_FAST or UCC_SLOW as appropriate, so that I wouldn't have to update
arch/powerpc/sysdev/qe_lib/Kconfig every time we add a new UCC driver.
Oh well.

It really seems like kconfig shouldn't complain if the depends isn't
satisfied.

- k
Re: Conveying memory pressure to userspace?
Bas Westerbaan wrote:
> Hello,
>
> Quite a lot of userspace applications cache. Firefox caches pages; MySQL
> caches queries; libc's free() (like practically all other userspace
> allocators) won't directly return the first byte of memory freed, etc.
>
> These applications are unaware of memory pressure. While the kernel is
> trying its best to free memory, for instance by dropping possibly more
> valuable caches, userspace is left blissfully unaware.
>
> Obviously this isn't a really big problem, given that we've still got swap
> to swap out those rarely used caches, except when the caches aren't
> _that_ rarely used and their backing store (e.g. precomputed values)
> might be faster than swapping the pages back in from disk.
>
> A solution would be to either
>
> a) let the application make the kernel aware of pages that may be dropped
> under memory pressure. This would be tricky to implement for userspace:
> it's hard to keep an application from racing into a dropped page.
> However, the kernel can directly free a page from userspace, which makes
> it useful when under real pressure. This is in contrast to
>
> b) letting the application register itself with a cache share priority.
> The application (and other aware applications) would then be able to
> query how fair they are at the moment, proportional to their cache share
> priority. Freeing would still be completely in their own hands.
>
> The only relevant related matter I could find were madvise and mincore.
> With madvise, pages can be marked as unnecessary, and these should be
> swapped out earlier. With mincore, one can determine whether pages are
> resident (not cached). This would make an existing alternative to
> solution a. However, this doesn't eliminate the writes to swap, and
> polling every time before accessing a cache isn't really pretty.
>
> I did consider guessing the memory pressure by looking at /proc/meminfo,
> but I think it isn't that accurate.

The prev_priority field in the zoneinfo stuff is more useful for memory pressure. I'm playing with making a blocking callback that can wake someone up when this gets down to a certain priority level (prio=12 => everything's rosy, prio=0 => we're in deep shit).

> Before hacking something together (and being uncertain about the
> thoroughness with which I searched for existing work, sorry), I would
> like your thoughts on this.
>
> Please CC me, I'm not in the list.
>
> Bas
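For readers unfamiliar with the mincore() polling mentioned in the quoted message, here is a minimal userspace sketch of what "checking residency before touching a cache page" could look like on Linux. The helper names `page_is_resident` and `alloc_cache_page` are made up for illustration and come from neither mail; only `mmap`, `mincore` and `sysconf` are real calls.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for mincore() and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: 1 if the page holding addr is resident in RAM,
 * 0 if it is not, -1 if mincore() itself failed. */
int page_is_resident(void *addr)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char vec = 0;
    void *start;

    assert(page > 0);
    /* mincore() wants a page-aligned start address. */
    start = (void *)((uintptr_t)addr & ~((uintptr_t)page - 1));
    if (mincore(start, (size_t)page, &vec) != 0)
        return -1;
    return vec & 1;                /* bit 0 set => page is in core */
}

/* Hypothetical helper: map one anonymous page and touch it so that it
 * is actually backed by memory. Returns NULL on failure. */
void *alloc_cache_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    p[0] = 1;                      /* fault the page in */
    return p;
}
```

As the quoted mail points out, this is a poll: the answer can be stale by the time the cache is actually read, which is part of why a wakeup-style notification is being discussed instead.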
Re: Regression in 2.6.21-mm1 (git-input) on Dell D610 laptop
On Thu, 10 May 2007 15:05:25 +0200 Remi Colinet <[EMAIL PROTECTED]> wrote:

> My D610 ALPS Glide Point is unresponsive with 2.6.21-mm1 patch.
> No problem noticed with 2.6.21.
>
> The culprit seems to be git-input. I have applied 2.6.21-mm1 on top of 2.6.21
> and then removed git-input patch. It is ok since then.
>
> From what i can see, no interrupt is raised from the GlidePoint with git-input
> applied. IRQ count 12 does not increase. It is when using the touchpad.
>
>            CPU0
>   0:        160   IO-APIC-edge      timer
>   1:        935   IO-APIC-edge      i8042
>   7:          0   IO-APIC-edge      parport0
>   8:          1   IO-APIC-edge      rtc
>   9:          2   IO-APIC-fasteoi   acpi
> => 12:      114   IO-APIC-edge      i8042 <=
>  14:       3223   IO-APIC-edge      libata
>  15:          5   IO-APIC-edge      libata
>  16:          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb5, Intel ICH6
>  17:          1   IO-APIC-fasteoi   uhci_hcd:usb2, ipw2200, Intel ICH6 Modem
>  18:          0   IO-APIC-fasteoi   uhci_hcd:usb3
>  19:          1   IO-APIC-fasteoi   uhci_hcd:usb4, yenta
> NMI:          0
> LOC:       4051
> ERR:          0
> MIS:          0
>
> I have also tried to disable the ALPS driver in the .config file. IRQ 12 are
> then raised when using the Glide Point. X refuses to start.
>

Are you able to test 2.6.21-mm2?

Thanks.
Re: Kconfig warnings on latest GIT
On Fri, 11 May 2007, Simon Horman wrote:

> On Thu, May 10, 2007 at 11:56:48AM -0500, Timur Tabi wrote:
> > Simon Horman wrote:
> >
> > >>So my question is: in which Kconfig do I define "UCC_FAST_TEMP" and
> > >>"UCC_SLOW_TEMP"? At first I thought, just put it in drivers/Kconfig, but
> > >>that Kconfig does nothing but including other Kconfigs. I believe that if I
> > >>submit a patch that adds "UCC_FAST_TEMP" and "UCC_SLOW_TEMP" to
> > >>drivers/Kconfig, it will be rejected. Either that, or I'll spend six weeks
> > >>trying to persuade everyone that it's a good idea.
> > >>
> > >>Does anyone have any suggestions on how I can fix this?
> >
> > >That does seem like a reasonable suggestion, and one that
> > >would probably work well with the other similar problems
> > >that have been introduced since 2.6.21.
> >
> > Looks like the fix is simpler than I thought. Instead of having
> >
> > UCC_GETH
> > select UCC_FAST
> >
> > I need to do
> >
> > UCC_FAST
> > default y if UCC_GETH
>
> I pondered something like that, but I couldn't get it quite right :(
>
> > I'll have a patch that fixes this out later today.
> >
> > I chose the first method because I wanted each individual UCC device
> > driver to select UCC_FAST or UCC_SLOW as appropriate, so that I
> > wouldn't have to update arch/powerpc/sysdev/qe_lib/Kconfig every time
> > we add a new UCC driver. Oh well.
>
> --
> Horms
> H: http://www.vergenet.net/~horms/
> W: http://www.valinux.co.jp/en/
>

Try this patch:

diff --git a/arch/powerpc/sysdev/qe_lib/Kconfig b/arch/powerpc/sysdev/qe_lib/Kconfig
index 887739f..5de7aba 100644
--- a/arch/powerpc/sysdev/qe_lib/Kconfig
+++ b/arch/powerpc/sysdev/qe_lib/Kconfig
@@ -12,6 +12,7 @@ config UCC_SLOW
 config UCC_FAST
 	bool
+	default y if UCC_GETH
 	default n
 	select UCC
 	help
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index fa489b1..b159c6c 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2276,7 +2276,6 @@ config GFAR_NAPI
 config UCC_GETH
 	tristate "Freescale QE Gigabit Ethernet"
 	depends on QUICC_ENGINE
-	select UCC_FAST
 	help
 	  This driver supports the Gigabit Ethernet mode of the QUICC Engine,
 	  which is available on some Freescale SOCs.
Re: slub-i386-support.patch
On Thu, May 10, 2007 at 05:07:02PM -0700, William Lee Irwin III wrote:
> I described it as motivated by such, not really correctly handling it.
> I didn't bother analyzing it for correctness. I'm not surprised at all
> that the TLB flush can be missed where it now stands in the patch. I
> wanted to move it to tlb_finish_mmu() all along, along with quicklist
> management of lower levels of hierarchy.
> quicklist_free() with unflushed TLB entries admits speculation through
> the pagetable entries corresponding to the list links. So tlb_finish_mmu()
> is the place to call quicklist_free() on pagetables. This requires
> distinguishing preconstructed pagetables from freed user pages, which
> is not done in include/asm-generic/tlb.h (and core callers may need
> to be adjusted, pending the results of audits).
> To clarify, upper levels of pagetables are indeed cached by x86 TLB's.
> The same kind of deferral of freeing until the TLB is flushed required
> for leaf pagetables is required for the upper levels as well.

Never mind. The present bit ends up unset because all the vaddrs are page-aligned, and PDPTE entries (which lack present bits) aren't ever internally updated until explicit reloads. I'm still not wild about it, but can't be arsed to deal with it unless it actually breaks.

-- wli
Re: Kconfig warnings on latest GIT
On Thu, May 10, 2007 at 11:56:48AM -0500, Timur Tabi wrote:
> Simon Horman wrote:
>
> >>So my question is: in which Kconfig do I define "UCC_FAST_TEMP" and
> >>"UCC_SLOW_TEMP"? At first I thought, just put it in drivers/Kconfig, but
> >>that Kconfig does nothing but including other Kconfigs. I believe that if I
> >>submit a patch that adds "UCC_FAST_TEMP" and "UCC_SLOW_TEMP" to
> >>drivers/Kconfig, it will be rejected. Either that, or I'll spend six weeks
> >>trying to persuade everyone that it's a good idea.
> >>
> >>Does anyone have any suggestions on how I can fix this?
>
> >That does seem like a reasonable suggestion, and one that
> >would probably work well with the other similar problems
> >that have been introduced since 2.6.21.
>
> Looks like the fix is simpler than I thought. Instead of having
>
> UCC_GETH
> select UCC_FAST
>
> I need to do
>
> UCC_FAST
> default y if UCC_GETH

I pondered something like that, but I couldn't get it quite right :(

> I'll have a patch that fixes this out later today.
>
> I chose the first method because I wanted each individual UCC device
> driver to select UCC_FAST or UCC_SLOW as appropriate, so that I
> wouldn't have to update arch/powerpc/sysdev/qe_lib/Kconfig every time
> we add a new UCC driver. Oh well.

--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
Re: 2.6.21-gitX: known regressions
Andrew Morton wrote:
> On Thu, 10 May 2007 14:04:13 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> Here is a list of some known regressions in 2.6.21-gitX.
>>
>> Feel free to add new regressions/remove fixed etc.
>> http://kernelnewbies.org/known_regressions
>>
>> Networking:
>>
>> Subject    : panic with e1000 driver on HP Integrity servers
>> References : http://bugzilla.kernel.org/show_bug.cgi?id=8455
>> Submitter  : Doug Chapman <[EMAIL PROTECTED]>
>> Caused-By  : Auke Kok <[EMAIL PROTECTED]>
>>              commit e0aac5a289b1dacbc94bd9ae8c449bcdf9ab508c
>> Status     : Unknown

We're trying to reproduce this in our labs here but that piece of code has been extensively tested on various platforms and architectures, so I'm a bit surprised about it. I've asked for more info on the bugzilla as well.

So, this is being worked on actively.

Cheers,

Auke
[PATCH 5/5] lguest console driver feedback tidyups
1) Use new lguest_send_dma & lguest_bind_dma functions. 2) sparse: lguest_cons can be static. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/char/hvc_lguest.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) === --- a/drivers/char/hvc_lguest.c +++ b/drivers/char/hvc_lguest.c @@ -36,7 +36,7 @@ static int put_chars(u32 vtermno, const dma.len[1] = 0; dma.addr[0] = __pa(buf); - hcall(LHCALL_SEND_DMA, LGUEST_CONSOLE_DMA_KEY, __pa(), 0); + lguest_send_dma(LGUEST_CONSOLE_DMA_KEY, ); return count; } @@ -59,7 +59,7 @@ static int get_chars(u32 vtermno, char * return count; } -struct hv_ops lguest_cons = { +static struct hv_ops lguest_cons = { .get_chars = get_chars, .put_chars = put_chars, }; @@ -75,14 +75,17 @@ console_initcall(cons_init); static int lguestcons_probe(struct lguest_device *lgdev) { - lgdev->private = hvc_alloc(0, lgdev->index+1, _cons, 256); + int err; + + lgdev->private = hvc_alloc(0, lgdev_irq(lgdev), _cons, 256); if (IS_ERR(lgdev->private)) return PTR_ERR(lgdev->private); - if (!hcall(LHCALL_BIND_DMA, LGUEST_CONSOLE_DMA_KEY, __pa(_input), - (1<<8) + lgdev->index+1)) + err = lguest_bind_dma(LGUEST_CONSOLE_DMA_KEY, _input, 1, + lgdev_irq(lgdev)); + if (err) printk("lguest console: failed to bind buffer.\n"); - return 0; + return err; } static struct lguest_driver lguestcons_drv = { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/5] lguest block driver feedback tidyups
1) Use new dma wrapper functions, and handle bind failure (may happen in future) 2) Use new lgdev_irq() "get me a good interrupt number" function. 3) __force the ioremap: guests can use it as normal memory. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/block/lguest_blk.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) === --- a/drivers/block/lguest_blk.c +++ b/drivers/block/lguest_blk.c @@ -123,7 +123,7 @@ static void do_write(struct blockdev *bd pr_debug("lgb: WRITE sector %li\n", (long)req->sector); setup_req(bd, 1, req, ); - hcall(LHCALL_SEND_DMA, bd->phys_addr, __pa(), 0); + lguest_send_dma(bd->phys_addr, ); } static void do_read(struct blockdev *bd, struct request *req) @@ -134,7 +134,7 @@ static void do_read(struct blockdev *bd, setup_req(bd, 0, req, >dma); empty_dma(); - hcall(LHCALL_SEND_DMA, bd->phys_addr, __pa(), 0); + lguest_send_dma(bd->phys_addr, ); } static void do_lgb_request(request_queue_t *q) @@ -183,13 +183,13 @@ static int lguestblk_probe(struct lguest return -ENOMEM; spin_lock_init(>lock); - bd->irq = lgdev->index+1; + bd->irq = lgdev_irq(lgdev); bd->req = NULL; bd->dma.used_len = 0; bd->dma.len[0] = 0; bd->phys_addr = (lguest_devices[lgdev->index].pfn << PAGE_SHIFT); - bd->lb_page = (void *)ioremap(bd->phys_addr, PAGE_SIZE); + bd->lb_page = (__force void *)ioremap(bd->phys_addr, PAGE_SIZE); if (!bd->lb_page) { err = -ENOMEM; goto out_free_bd; @@ -225,7 +225,9 @@ static int lguestblk_probe(struct lguest if (err) goto out_cleanup_queue; - hcall(LHCALL_BIND_DMA, bd->phys_addr, __pa(>dma), (1<<8)+bd->irq); + err = lguest_bind_dma(bd->phys_addr, >dma, 1, bd->irq); + if (err) + goto out_free_irq; bd->disk->major = bd->major; bd->disk->first_minor = 0; @@ -241,6 +243,8 @@ static int lguestblk_probe(struct lguest lgdev->private = bd; return 0; +out_free_irq: + free_irq(bd->irq, bd); out_cleanup_queue: blk_cleanup_queue(bd->disk->queue); out_put_disk: @@ -248,7 +252,7 @@ out_unregister_blkdev: out_unregister_blkdev: 
unregister_blkdev(bd->major, "lguestblk"); out_unmap: - iounmap(bd->lb_page); + iounmap((__force void *__iomem)bd->lb_page); out_free_bd: kfree(bd); return err; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/5] lguest network driver feedback tidyups
Feedback from Jeff Garzik: 1) Use netdev_priv instead of dev->priv. 2) Check for ioremap failure 3) iounmap on failure. 4) Wrap SEND_DMA and BIND_DMA calls 5) Don't set NETIF_F_SG unless we set NETIF_F_NO_CSUM 6) Use SET_NETDEV_DEV() 7) Don't set dev->irq, mem_start & mem_end (deprecated) Sparse warnings: 8) __force the ioremap: guests can use it as normal memory. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/net/lguest_net.c | 53 ++ 1 file changed, 31 insertions(+), 22 deletions(-) === --- a/drivers/net/lguest_net.c +++ b/drivers/net/lguest_net.c @@ -35,6 +35,9 @@ struct lguestnet_info unsigned long peer_phys; unsigned long mapsize; + /* The lguest_device I come from */ + struct lguest_device *lgdev; + /* My peerid. */ unsigned int me; @@ -84,7 +87,7 @@ static void skb_to_dma(const struct sk_b static void lguestnet_set_multicast(struct net_device *dev) { - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); if ((dev->flags & (IFF_PROMISC|IFF_ALLMULTI)) || dev->mc_count) info->peer[info->me].mac[0] |= PROMISC_BIT; @@ -110,13 +113,13 @@ static void transfer_packet(struct net_d struct sk_buff *skb, unsigned int peernum) { - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); struct lguest_dma dma; skb_to_dma(skb, skb_headlen(skb), ); pr_debug("xfer length %04x (%u)\n", htons(skb->len), skb->len); - hcall(LHCALL_SEND_DMA, peer_key(info,peernum), __pa(), 0); + lguest_send_dma(peer_key(info, peernum), ); if (dma.used_len != skb->len) { dev->stats.tx_carrier_errors++; pr_debug("Bad xfer to peer %i: %i of %i (dma %p/%i)\n", @@ -137,7 +140,7 @@ static int lguestnet_start_xmit(struct s { unsigned int i; int broadcast; - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest; pr_debug("%s: xmit %02x:%02x:%02x:%02x:%02x:%02x\n", @@ -162,7 +165,7 @@ static int 
lguestnet_start_xmit(struct s /* Find a new skb to put in this slot in shared mem. */ static int fill_slot(struct net_device *dev, unsigned int slot) { - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); /* Try to create and register a new one. */ info->skb[slot] = netdev_alloc_skb(dev, ETH_HLEN + ETH_DATA_LEN); if (!info->skb[slot]) { @@ -180,7 +183,7 @@ static irqreturn_t lguestnet_rcv(int irq static irqreturn_t lguestnet_rcv(int irq, void *dev_id) { struct net_device *dev = dev_id; - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); unsigned int i, done = 0; for (i = 0; i < ARRAY_SIZE(info->dma); i++) { @@ -220,7 +223,7 @@ static int lguestnet_open(struct net_dev static int lguestnet_open(struct net_device *dev) { int i; - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); /* Set up our MAC address */ memcpy(info->peer[info->me].mac, dev->dev_addr, ETH_ALEN); @@ -232,8 +235,8 @@ static int lguestnet_open(struct net_dev if (fill_slot(dev, i) != 0) goto cleanup; } - if (!hcall(LHCALL_BIND_DMA, peer_key(info, info->me), __pa(info->dma), - (NUM_SKBS << 8) | dev->irq)) + if (lguest_bind_dma(peer_key(info,info->me), info->dma, + NUM_SKBS, lgdev_irq(info->lgdev)) != 0) goto cleanup; return 0; @@ -246,13 +249,13 @@ static int lguestnet_close(struct net_de static int lguestnet_close(struct net_device *dev) { unsigned int i; - struct lguestnet_info *info = dev->priv; + struct lguestnet_info *info = netdev_priv(dev); /* Clear all trace: others might deliver packets, we'll ignore it. */ memset(>peer[info->me], 0, sizeof(info->peer[info->me])); /* Deregister sg lists. 
*/ - hcall(LHCALL_BIND_DMA, peer_key(info, info->me), __pa(info->dma), 0); + lguest_unbind_dma(peer_key(info, info->me), info->dma); for (i = 0; i < ARRAY_SIZE(info->dma); i++) dev_kfree_skb(info->skb[i]); return 0; @@ -290,30 +293,34 @@ static int lguestnet_probe(struct lguest /* Turning on/off promisc will call dev->set_multicast_list. * We don't actually support multicast yet */ dev->set_multicast_list = lguestnet_set_multicast; - dev->mem_start = ((unsigned long)desc->pfn << PAGE_SHIFT); - dev->mem_end = dev->mem_start +
Re: Kconfig warnings on latest GIT
On Thu, May 10, 2007 at 05:39:29PM +0200, Johannes Berg wrote:
> On Thu, 2007-05-10 at 14:10 +0900, Simon Horman wrote:
>
> > drivers/macintosh/Kconfig:112:warning: 'select' used by config symbol
> > 'PMAC_APM_EMU' refer to undefined symbol 'SYS_SUPPORTS_APM_EMULATION'
>
> Argh. Is that with ARCH=ppc? I keep forgetting that it still exists,
> sorry.

Actually, it was with ARCH=ia64. I have a feeling that you can get it to show up quite easily with anything other than ARCH=powerpc.

--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
[PATCH 2/5] lguest guest feedback tidyups
1) send-dma and bind-dma hypercall wrappers for drivers to use, 2) formalization of the convention that devices can use the irq corresponding to their index on the lguest_bus. 3) ___force to shut up sparse: guests *can* use ioremap as virtual mem. 4) lguest.c should include "lguest_bus.h" for lguest_devices declaration. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/lguest/lguest.c | 20 drivers/lguest/lguest_bus.c |2 +- include/linux/lguest_bus.h | 13 - 3 files changed, 33 insertions(+), 2 deletions(-) === --- a/include/linux/lguest_bus.h +++ b/include/linux/lguest_bus.h @@ -7,7 +7,6 @@ struct lguest_device { /* Unique busid, and index into lguest_page->devices[] */ - /* By convention, each device can use irq index+1 if it wants to. */ unsigned int index; struct device dev; @@ -15,6 +14,18 @@ struct lguest_device { /* Driver can hang data off here. */ void *private; }; + +/* By convention, each device can use irq index+1 if it wants to. */ +static inline int lgdev_irq(const struct lguest_device *dev) +{ + return dev->index + 1; +} + +/* dma args must not be vmalloced! 
*/ +void lguest_send_dma(unsigned long key, struct lguest_dma *dma); +int lguest_bind_dma(unsigned long key, struct lguest_dma *dmas, + unsigned int num, u8 irq); +void lguest_unbind_dma(unsigned long key, struct lguest_dma *dmas); struct lguest_driver { const char *name; === --- a/drivers/lguest/lguest.c +++ b/drivers/lguest/lguest.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -99,6 +100,25 @@ void async_hcall(unsigned long call, next_call = 0; } local_irq_restore(flags); +} + +void lguest_send_dma(unsigned long key, struct lguest_dma *dma) +{ + dma->used_len = 0; + hcall(LHCALL_SEND_DMA, key, __pa(dma), 0); +} + +int lguest_bind_dma(unsigned long key, struct lguest_dma *dmas, + unsigned int num, u8 irq) +{ + if (!hcall(LHCALL_BIND_DMA, key, __pa(dmas), (num << 8) | irq)) + return -ENOMEM; + return 0; +} + +void lguest_unbind_dma(unsigned long key, struct lguest_dma *dmas) +{ + hcall(LHCALL_BIND_DMA, key, __pa(dmas), 0); } static unsigned long save_fl(void) === --- a/drivers/lguest/lguest_bus.c +++ b/drivers/lguest/lguest_bus.c @@ -136,7 +136,7 @@ static int __init lguest_bus_init(void) return 0; /* Devices are in page above top of "normal" mem. */ - lguest_devices = ioremap(max_pfn << PAGE_SHIFT, PAGE_SIZE); + lguest_devices = (__force void*)ioremap(max_pfn
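The wrappers in this patch hide two small conventions that the drivers used to open-code: the irq for a device is its bus index plus one, and the last BIND_DMA hypercall argument packs the buffer count above the low byte with the irq in the low 8 bits. A standalone sketch of just that arithmetic, under the assumption that the irq fits in a `u8` as the new prototype requires; the names `irq_for_index`, `pack_bind_arg`, `bind_arg_num`, `bind_arg_irq` and `bind_arg_selfcheck` are hypothetical and not part of lguest:

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors lgdev_irq() from the patch: device at bus index i uses irq i+1. */
static inline int irq_for_index(unsigned int index)
{
    return (int)index + 1;
}

/* Mirrors the argument built in lguest_bind_dma(): (num << 8) | irq. */
static inline unsigned long pack_bind_arg(unsigned int num, uint8_t irq)
{
    return ((unsigned long)num << 8) | irq;
}

/* Illustrative decoders (not in the patch) for the packed value. */
static inline unsigned int bind_arg_num(unsigned long arg)
{
    return (unsigned int)(arg >> 8);
}

static inline uint8_t bind_arg_irq(unsigned long arg)
{
    return (uint8_t)(arg & 0xff);
}

/* Because irq is limited to 8 bits, the old open-coded form
 * (1<<8) + index+1 and the new (num << 8) | irq give the same value. */
static inline void bind_arg_selfcheck(void)
{
    unsigned int index = 2;

    assert(pack_bind_arg(1, (uint8_t)irq_for_index(index))
           == (unsigned long)((1 << 8) + index + 1));
}
```

Passing 0 as the packed argument, as lguest_unbind_dma() does in the patch, corresponds to zero buffers and is what the drivers previously did by hand to deregister.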
[PATCH 1/5] lguest host feedback tidyups
1) Sam Ravnborg says lg-objs is deprecated, use lg-y. 2) Sparse: page_tables.c unnecessary initialization 3) Lots of __force to shut sparse up: guest "physical" addresses are userspace virtual. 4) Change prototype of run_lguest and do cast in caller instead (when we add __iomem to cast, it runs over another line). Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/lguest/Makefile |2 +- drivers/lguest/core.c | 16 drivers/lguest/hypercalls.c |3 ++- drivers/lguest/interrupts_and_traps.c |4 ++-- drivers/lguest/lg.h |2 +- drivers/lguest/lguest_user.c |2 +- drivers/lguest/page_tables.c |2 +- 7 files changed, 16 insertions(+), 15 deletions(-) === --- a/drivers/lguest/Makefile +++ b/drivers/lguest/Makefile @@ -3,5 +3,5 @@ obj-$(CONFIG_LGUEST_GUEST) += lguest.o l # Host requires the other files, which can be a module. obj-$(CONFIG_LGUEST) += lg.o -lg-objs := core.o hypercalls.o page_tables.o interrupts_and_traps.o \ +lg-y := core.o hypercalls.o page_tables.o interrupts_and_traps.o \ segments.o io.o lguest_user.o switcher.o === --- a/drivers/lguest/core.c +++ b/drivers/lguest/core.c @@ -218,7 +218,7 @@ u32 lgread_u32(struct lguest *lg, u32 ad /* Don't let them access lguest binary */ if (!lguest_address_ok(lg, addr, sizeof(val)) - || get_user(val, (u32 __user *)addr) != 0) + || get_user(val, (__force u32 __user *)addr) != 0) kill_guest(lg, "bad read address %u", addr); return val; } @@ -226,14 +226,14 @@ void lgwrite_u32(struct lguest *lg, u32 void lgwrite_u32(struct lguest *lg, u32 addr, u32 val) { if (!lguest_address_ok(lg, addr, sizeof(val)) - || put_user(val, (u32 __user *)addr) != 0) + || put_user(val, (__force u32 __user *)addr) != 0) kill_guest(lg, "bad write address %u", addr); } void lgread(struct lguest *lg, void *b, u32 addr, unsigned bytes) { if (!lguest_address_ok(lg, addr, bytes) - || copy_from_user(b, (void __user *)addr, bytes) != 0) { + || copy_from_user(b, (__force void __user *)addr, bytes) != 0) { /* copy_from_user should do this, but as we rely 
on it... */ memset(b, 0, bytes); kill_guest(lg, "bad read address %u len %u", addr, bytes); @@ -243,7 +243,7 @@ void lgwrite(struct lguest *lg, u32 addr void lgwrite(struct lguest *lg, u32 addr, const void *b, unsigned bytes) { if (!lguest_address_ok(lg, addr, bytes) - || copy_to_user((void __user *)addr, b, bytes) != 0) + || copy_to_user((__force void __user *)addr, b, bytes) != 0) kill_guest(lg, "bad write address %u len %u", addr, bytes); } @@ -294,7 +294,7 @@ static void run_guest_once(struct lguest : "memory", "%edx", "%ecx", "%edi", "%esi"); } -int run_guest(struct lguest *lg, char *__user user) +int run_guest(struct lguest *lg, unsigned long __user *user) { while (!lg->dead) { unsigned int cr2 = 0; /* Damn gcc */ @@ -302,8 +302,8 @@ int run_guest(struct lguest *lg, char *_ /* Hypercalls first: we might have been out to userspace */ do_hypercalls(lg); if (lg->dma_is_pending) { - if (put_user(lg->pending_dma, (unsigned long *)user) || - put_user(lg->pending_key, (unsigned long *)user+1)) + if (put_user(lg->pending_dma, user) || + put_user(lg->pending_key, user+1)) return -EFAULT; return sizeof(unsigned long)*2; } @@ -420,7 +420,7 @@ static int __init init(void) lock_cpu_hotplug(); if (cpu_has_pge) { /* We have a broader idea of "global". */ cpu_had_pge = 1; - on_each_cpu(adjust_pge, 0, 0, 1); + on_each_cpu(adjust_pge, (void *)0, 0, 1); clear_bit(X86_FEATURE_PGE, boot_cpu_data.x86_capability); } unlock_cpu_hotplug(); === --- a/drivers/lguest/hypercalls.c +++ b/drivers/lguest/hypercalls.c @@ -83,7 +83,8 @@ static void do_hcall(struct lguest *lg, guest_set_pmd(lg, regs->edx, regs->ebx); break; case LHCALL_LOAD_TLS: - guest_load_tls(lg, (struct desc_struct __user*)regs->edx); + guest_load_tls(lg, + (__force struct desc_struct __user*)regs->edx); break; case LHCALL_TS: lg->ts = regs->edx; === ---
[PATCH 0/5] lguest feedback tidyups
Hi all,

Gratefully-received recent feedback from CC'd was applied to excellent effect (and the advice from Matt Mackall about my personal appearance is best unrequited).

The patch is split in 5 parts to correspond with the 9 parts Andrew sent out before, but here's the summary:

1) Sparse (thanks Christoph Hellwig):
 - lguest_cons can be static now
 - lguest.c should include "lguest_bus.h" for lguest_devices declaration.
 - page_tables.c unnecessary initialization
 - But the cost was high: lots of __force casts 8(

2) Jeff Garzik
 - Use netdev_priv instead of dev->priv.
 - Check for ioremap failure
 - iounmap on failure.
 - Wrap SEND_DMA and BIND_DMA calls
 - Don't set NETIF_F_SG unless we set NETIF_F_NO_CSUM
 - Use SET_NETDEV_DEV()
 - Don't set dev->irq, mem_start & mem_end (deprecated)

3) Sam Ravnborg
 - lg-objs is deprecated, use lg-y.

Cheers,
Rusty.
Re: 2.6.21-gitX: known regressions
On Thu, 10 May 2007 14:04:13 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> Here is a list of some known regressions in 2.6.21-gitX.
>
> Feel free to add new regressions/remove fixed etc.
> http://kernelnewbies.org/known_regressions
>
>
>
> Unclassified:
>
> Subject    : 2.6.21-git10/11: files getting truncated on xfs (after suspend/resume?)
> References : http://lkml.org/lkml/2007/5/9/410
> Submitter  : Jeremy Fitzhardinge <[EMAIL PROTECTED]>
> Handled-By : David Chinner <[EMAIL PROTECTED]>
> Status     : problem is being debugged
>
> Subject    : Current -git kernel kills X
> References : http://lkml.org/lkml/2007/5/8/667
> Submitter  : Jeff Garzik <[EMAIL PROTECTED]>
> Status     : Unknown
>
>
>
> Block devices:
>
> Subject    : BUG in loop.ko
> References : http://lkml.org/lkml/2007/5/9/510
> Submitter  : Jeremy Fitzhardinge <[EMAIL PROTECTED]>
> Status     : Unknown
>
>
>
> Networking:
>
> Subject    : panic with e1000 driver on HP Integrity servers
> References : http://bugzilla.kernel.org/show_bug.cgi?id=8455
> Submitter  : Doug Chapman <[EMAIL PROTECTED]>
> Caused-By  : Auke Kok <[EMAIL PROTECTED]>
>              commit e0aac5a289b1dacbc94bd9ae8c449bcdf9ab508c
> Status     : Unknown
>
>
>
> Timers/NOHZ:
>
> Subject    : 2.6.21-git4 BUG: soft lockup detected on CPU#1!
> References : http://lkml.org/lkml/2007/5/2/511
> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> Status     : problem is being debugged
>

Please also consider:

Subject: libata reset-seq merge broke sata_sil on sh
Subject: [Bugme-new] [Bug 8462] New: applications under wine freezes

But we have many many more regressions which are in 2.6.21.x, only nobody's tracking those. Nobody seems to be fixing them either. Probably everyone's busy on the 2.6.14 regressions.
Re: [RFC] memory hotremove patch take 2 [01/10] (counter of removable page)
On Thu, 10 May 2007 11:00:31 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Wed, 9 May 2007, Yasunori Goto wrote:
>
> >
> > +unsigned int nr_free_movable_pages(void)
> > +{
> > +	unsigned long nr_pages = 0;
> > +	struct zone *zone;
> > +	int nid;
> > +
> > +	for_each_online_node(nid) {
> > +		zone = &(NODE_DATA(nid)->node_zones[ZONE_MOVABLE]);
> > +		nr_pages += zone_page_state(zone, NR_FREE_PAGES);
> > +	}
> > +	return nr_pages;
> > +}
>
>
> H... This is redoing what the vm counters already provide
>
> Could you add
>
> NR_MOVABLE_PAGES etc.
>
> instead and then let the ZVC counter logic take care of the rest?
>

Okay, we'll try ZVC.

Thanks,
-Kame
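The difference between the two approaches discussed here (walking every node to sum a count on demand, versus keeping a counter that is updated whenever the count changes) can be shown with a userspace toy model. This is only a sketch of the idea: it is not how the kernel's ZVC code is implemented, and every `toy_` name below is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NODES 4

/* Toy stand-in for per-node state; not a kernel structure. */
struct toy_node {
    unsigned long free_movable;
};

static struct toy_node toy_nodes[TOY_NODES];
static unsigned long toy_total_free_movable;   /* maintained incrementally */

/* On-demand approach: walk all nodes and add up, as the quoted
 * nr_free_movable_pages() does over online nodes. */
unsigned long toy_sum_free_movable(void)
{
    unsigned long sum = 0;
    size_t nid;

    for (nid = 0; nid < TOY_NODES; nid++)
        sum += toy_nodes[nid].free_movable;
    return sum;
}

/* Counter approach: every change updates the running total too, so
 * reading the total later needs no loop. */
void toy_mod_free_movable(size_t nid, long delta)
{
    assert(nid < TOY_NODES);
    toy_nodes[nid].free_movable += (unsigned long)delta;
    toy_total_free_movable += (unsigned long)delta;
}

unsigned long toy_read_free_movable(void)
{
    return toy_total_free_movable;
}
```

Both reads return the same number as long as every update goes through `toy_mod_free_movable()`; the counter variant simply moves the cost from the reader to the (cheap) update path.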
Re: [RFC] memory hotremove patch take 2 [01/10] (counter of removable page)
On 10 May 2007 15:44:08 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> Yasunori Goto <[EMAIL PROTECTED]> writes:
>
>
> (not a full review, just something I noticed)
> > @@ -352,6 +352,8 @@ struct sysinfo {
> > 	unsigned short pad;		/* explicit padding for m68k */
> > 	unsigned long totalhigh;	/* Total high memory size */
> > 	unsigned long freehigh;		/* Available high memory size */
> > +	unsigned long movable;		/* pages used only for data */
> > +	unsigned long free_movable;	/* Avaiable pages in movable */
>
> You can't just change that structure, it is exported to user space.
>

Okay. We'll drop this.

Thanks,
-Kame
Re: slub-i386-support.patch
On Thu, May 10, 2007 at 05:07:02PM -0700, William Lee Irwin III wrote:
> quicklist_free() with unflushed TLB entries admits speculation through
> the pagetable entries corresponding to the list links. So tlb_finish_mmu()
> is the place to call quicklist_free() on pagetables. This requires
> distinguishing preconstructed pagetables from freed user pages, which
> is not done in include/asm-generic/tlb.h (and core callers may need
> to be adjusted, pending the results of audits).
> To clarify, upper levels of pagetables are indeed cached by x86 TLB's.
> The same kind of deferral of freeing until the TLB is flushed required
> for leaf pagetables is required for the upper levels as well.

Looking more closely at it, the entire attempt to avoid struct page pointers is far beyond pointless. The freeing functions unconditionally require struct page pointers to either be passed or computed and the allocation function's virtual address it returns as a result is not directly usable. The callers all have to do arithmetic on the result.

One might as well stash precomputed pfn's (if not paddrs) and vaddrs in page->private and page->mapping, chain them with ->lru (use only .next if you care to stay singly-linked), and handle struct page pointers throughout. At that point quicklists not only become directly callable for pagetable freeing (including upper levels) instead of needing calls to quicklist freeing staged to occur at the time of tlb_finish_mmu(), but also become usable for the highpte case.

The computations this is trying to save on are computing the virtual and physical addresses (pfn's modulo a cheap shift; besides, all the API's work on pfn's) of a page from the pointer to the struct page. Chaining through the memory for the page incurs the cost of having to stage freeing through tlb_finish_mmu() instead of using the quicklist as a staging arena directly. So the translation from a struct page pointer is not saving work. It's not saving cache, either. The page's memory is no more likely to be hot than its struct page.

In the course of freeing the pointer to the struct page is computed whether by the caller or the API function. So the translation to a struct page pointer is done during freeing regardless. A better solution would be to precompute those results and store them in various fields of the struct page.

i386 can move to using generation numbers (->_mapcount and ->index are still available for 64 bits there even after quicklists use ->lru, ->mapping, and ->private, and quicklists really only need half of ->lru) to handle change_page_attr() and vmalloc_sync().

-- wli
Re: [PATCH] utimensat implementation
Christoph Hellwig wrote:
>
> I'd be happy to have them. While it's not the nicest API in the world
> it's in Posix and we have to support it at the library level, so we
> should better get it right.
>
> I'd like to avoid having a big switch statement in every filesystem,
> though; we should have a table-driven approach instead
> where each filesystem defines one table (or multiple ones when it
> supports subtypes with different limits) and just sets a pointer in
> the superblock to it.
>

This is starting to sound an awful lot like statfs(). Maybe we could create a new statfs call which takes a buffer size input (so that we can add new fields as time goes on) and which returns the necessary information?

	-hpa
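The "buffer size input" idea in the message above can be sketched in user space as follows. Everything here is an assumption made for illustration: `struct fs_limits_v1`, `struct fs_limits_v2`, `fs_limits_fill()` and the numeric values are hypothetical and do not correspond to any existing or proposed system call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout an old caller was compiled against. */
struct fs_limits_v1 {
    unsigned long name_max;
    unsigned long link_max;
};

/* Hypothetical newer layout: fields are only ever appended. */
struct fs_limits_v2 {
    unsigned long name_max;
    unsigned long link_max;
    unsigned long time_granularity_ns;
};

/* Copy at most bufsize bytes of the current structure into buf and
 * return the full size the provider knows about. An old caller passes
 * sizeof(struct fs_limits_v1) and simply never sees the new field. */
size_t fs_limits_fill(void *buf, size_t bufsize)
{
    /* Illustrative values only. */
    struct fs_limits_v2 full = { 255, 32000, 1 };
    size_t n = bufsize < sizeof(full) ? bufsize : sizeof(full);

    assert(buf != NULL || n == 0);
    if (n > 0)
        memcpy(buf, &full, n);
    return sizeof(full);
}
```

A caller can compare the return value with the size it passed in to learn whether the provider has fields it does not know about; this only works if new fields are appended and never reordered.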
Re: [RFC] memory hotremove patch take 2 [05/10] (make basic remove code)
On Thu, 10 May 2007 11:09:29 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Wed, 9 May 2007, Yasunori Goto wrote:
>
> > +/*
> > + * Just an easy implementation.
> > + */
> > +static struct page *
> > +hotremove_migrate_alloc(struct page *page,
> > +			unsigned long private,
> > +			int **x)
> > +{
> > +	return alloc_page(GFP_HIGH_MOVABLE);
> > +}
>
> This would need to reflect the zone in which you are performing hot
> remove. Or is hot remove only possible in the highest zone?
>
No. We'll allow hot remove in any zone type. My old patchset didn't include Mel-san's page grouping and just had ZONE_MOVABLE, so I wrote this. Reflecting the migration target's zone here is reasonable.

Anyway, I think we'll need a more complicated function here.

Thanks,
-Kame
Re: [RFC] memory hotremove patch take 2 [04/10] (isolate all free pages)
On Thu, 10 May 2007 17:42:54 +0100 (IST) Mel Gorman <[EMAIL PROTECTED]> wrote:
> > +	if (!pfn_valid(pfn))
> > +		return -EINVAL;
>
> This may lead to boundary cases where pages cannot be captured at the
> start and end of non-aligned zones due to memory holes.
>
Hm, OK. Maybe we can remove this.

> > +	zone = info->zone;
> > +	if ((zone != page_zone(pfn_to_page(pfn))) ||
> > +	    (zone != page_zone(pfn_to_page(last_pfn))))
> > +		return -EINVAL;
>
> Is this check really necessary? Surely a caller to
> capture_isolate_freed_pages() will have already made all the necessary
> checks when adding the struct isolation_info ?
>
Just because isolation_info is handled per zone. Maybe MIGRATE_ISOLATING can allow us a more flexible approach.

> > +	drain_all_pages();
> > +	spin_lock(&zone->lock);
> > +	while (pfn < info->end_pfn) {
> > +		if (!pfn_valid(pfn)) {
> > +			pfn++;
> > +			continue;
> > +		}
> > +		page = pfn_to_page(pfn);
> > +		/* See page_is_buddy() */
> > +		if (page_count(page) == 0 && PageBuddy(page)) {
>
> If PageBuddy is set it's free, you shouldn't have to check the page_count.
>
OK.

> > +			order = page_order(page);
> > +			order_size = 1 << order;
> > +			zone->free_area[order].nr_free--;
> > +			__mod_zone_page_state(zone, NR_FREE_PAGES, -order_size);
> > +			list_del(&page->lru);
> > +			rmv_page_order(page);
> > +			isolate_page_nolock(info, page, order);
> > +			nr_pages += order_size;
> > +			pfn += order_size;
> > +		} else {
> > +			pfn++;
> > +		}
> > +	}
> > +	spin_unlock(&zone->lock);
> > +	return nr_pages;
> > +}
> > #endif /* CONFIG_PAGE_ISOLATION */
> >
>
> This is all similar to move_freepages() other than the locking part. It
> would be worth checking if there is code that could be shared or at least
> have similar styles.

Thank you, I'll look into move_freepages().

-Kame
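The scan loop under review in the message above can be modelled in user space to show how the pfn cursor advances. This is a sketch under stated assumptions, not the patch's code: `struct model_page` and `capture_free_model()` are hypothetical, there is no locking or free_area accounting, and it follows Mel's remark by testing only the buddy flag (no page_count check).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pfn model; not the kernel's struct page. */
struct model_page {
    int valid;       /* models pfn_valid(): 0 means a memory hole      */
    int buddy;       /* models PageBuddy(): head of a free buddy block */
    unsigned order;  /* models page_order(); meaningful only if buddy  */
    int isolated;    /* set once the page has been captured            */
};

/* Walk [start, end) and capture every free buddy block.
 * Returns the number of order-0 pages captured. */
unsigned long capture_free_model(struct model_page *map,
                                 unsigned long start, unsigned long end)
{
    unsigned long pfn = start, nr_pages = 0;

    assert(map != NULL && start <= end);
    while (pfn < end) {
        struct model_page *page = &map[pfn];
        unsigned long order_size, i;

        /* Holes and in-use pages are skipped one pfn at a time. */
        if (!page->valid || !page->buddy) {
            pfn++;
            continue;
        }
        order_size = 1UL << page->order;
        if (order_size > end - pfn)      /* keep the model inside its array */
            order_size = end - pfn;
        page->buddy = 0;                 /* block leaves the free lists */
        for (i = 0; i < order_size; i++)
            map[pfn + i].isolated = 1;
        nr_pages += order_size;
        pfn += order_size;               /* jump over the whole block */
    }
    return nr_pages;
}
```

Skipping holes with `continue` instead of returning an error is what lets the walk cross a memory hole in the middle of the range, which is the boundary case raised in the review.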
Re: x86 setup rewrite tree ready for flamage^W review
Martin Mares wrote:
> Hello!
>
>> As far as I could tell, "scan" simply caused the nonstandard video
>> driver scan modules (unsafe probes) to be invoked. Since those modules
>> are no longer present, there appeared to be no need for them. The VGA
>> and VESA probes are safe.
>
> "scan" is still useful, because it is able to find BIOS video modes with
> non-standard numbers (they are still sometimes found on recent cards).

Well, I don't have a card which does anything like that, but I did just implement the "scan" functionality and pushed it out. If anyone cares about that functionality it would be good if they could test it out and report if it works.

	-hpa
Conveying memory pressure to userspace?
Hello,

Quite a lot of userspace applications cache. Firefox caches pages; MySQL caches queries; libc's free() (like practically all other userspace allocators) won't directly return the first byte of memory freed, etc.

These applications are unaware of memory pressure. While the kernel is trying its best to free memory, for instance by dropping possibly more valuable caches, userspace is left blissfully unaware.

Obviously this isn't a really big problem, given that we've still got swap to swap out those rarely used caches, except when the caches aren't _that_ rarely used and their backing store (e.g. precomputed values) might be faster than the disk from which the pages would be swapped back in.

A solution would be to either

a) let the application make the kernel aware of pages that may be dropped under memory pressure. This would be tricky to implement for userspace: it's hard to keep an application from racing into a dropped page. However, the kernel can directly free a page from userspace, which makes it useful under real pressure. This is in contrast to

b) letting the application register itself with a cache share priority. The application (and other aware applications) would then be able to query how fair they are at the moment proportional to their cache share priority. Freeing would still be completely in their own hands.

The only relevant related matter I could find were madvise and mincore. With madvise, pages can be marked as unnecessary, and these should be swapped out earlier. With mincore one can determine whether pages are resident (not cached). This would make an existing alternative to solution a. However, this doesn't eliminate the writes to swap, and polling every time before accessing a cache isn't really pretty.

I did consider guessing the memory pressure by looking at /proc/meminfo, but I think it isn't that accurate.

Before hacking something together (and being uncertain about the thoroughness with which I searched for existing work, sorry), I would like your thoughts on this.

Please CC me, I'm not on the list.

Bas

-- 
Bas Westerbaan
GPG 99BA289B | SINP [EMAIL PROTECTED]
http://blog.w-nz.com/
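The mincore-based polling mentioned in the message above might look roughly like this. It is a minimal sketch, assuming Linux and page-aligned mappings; `resident_pages()` and its 64-page limit are hypothetical choices made for illustration, while `mincore()` and `sysconf()` are the standard calls.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose mincore() and MAP_ANONYMOUS in glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count how many pages of [addr, addr + len) are resident in memory.
 * addr must be page-aligned. Returns -1 on error or if the range
 * covers more pages than this sketch's fixed 64-entry vector. */
long resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char vec[64];
    size_t npages, i;
    long count = 0;

    assert(page > 0);
    npages = (len + (size_t)page - 1) / (size_t)page;
    if (npages > sizeof(vec))
        return -1;
    if (mincore(addr, len, vec) != 0)
        return -1;
    for (i = 0; i < npages; i++)
        if (vec[i] & 1)     /* least significant bit: page is resident */
            count++;
    return count;
}
```

As the mail notes, this has to be called before each cache access, and the answer can already be stale by the time it is used, so it is a hint about residency and not a pressure notification.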
Re: [RFC] memory hotremove patch take 2 [03/10] (drain all pages)
On Thu, 10 May 2007 16:35:37 +0100 (IST) Mel Gorman <[EMAIL PROTECTED]> wrote:
> On Wed, 9 May 2007, Yasunori Goto wrote:
>
> > This patch add function drain_all_pages(void) to drain all
> > pages on per-cpu-freelist.
> > Page isolation will catch them in free_one_page.
> >
>
> Is this significantly different to what drain_all_local_pages() currently
> does?
>
No difference; this is duplicating it. Thank you for pointing that out. Maybe I missed it because this function only exists in -mm.

Regards,
-Kame
[PATCH] module_author: don't advice putting in an email address
Hi Rusty.

Following up the recent MODULE_MAINTAINER discussion:

http://lkml.org/lkml/2007/4/4/170

that concluded with MODULE_MAINTAINER not being a good idea, here's a small patch that just deletes the advice of including an email address in the MODULE_AUTHOR tag as suggested (and not objected to) at the end of it.

The email address is the problem I was trying to fix; with multiple current and non-current authors and maintainers who might not even be authors, the address(es) available from the tag confuse the issue of whom to contact. Moreover, it's information that is easily outdated.

A bit more than half of the tags in the tree don't include an email address already and I'll submit patches removing more...

Rene.

commit 3b4fa382d5a6a3d9afdcb5a9232d63c47391fb30
Author: Rene Herman <[EMAIL PROTECTED]>
Date:   Fri May 11 02:24:35 2007 +0200

    module_author: don't advice putting in an email address

    It's information that's easily outdated and easily mistaken for a driver
    contact which is a problem especially for modules with multiple current
    and non-current authors as well as for modules with a maintainer who may
    not even be a module author.

    Signed-off-by: Rene Herman <[EMAIL PROTECTED]>

diff --git a/include/linux/module.h b/include/linux/module.h
index 792d483..e6e0f86 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -124,7 +124,7 @@ extern struct module __this_module;
  */
 #define MODULE_LICENSE(_license) MODULE_INFO(license, _license)
 
-/* Author, ideally of form NAME <EMAIL>[, NAME <EMAIL>]*[ and NAME <EMAIL>] */
+/* Author, ideally of form NAME[, NAME]*[ and NAME] */
 #define MODULE_AUTHOR(_author) MODULE_INFO(author, _author)
 
 /* What your module does. */
Re: [RFC] memory hotremove patch take 2 [03/10] (drain all pages)
On Thu, 10 May 2007 11:07:08 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Wed, 9 May 2007, Yasunori Goto wrote:
>
> > This patch add function drain_all_pages(void) to drain all
> > pages on per-cpu-freelist.
> > Page isolation will catch them in free_one_page.
>
> This is only draining the pcps of the local processor. I would think
> that you need to drain all other processors pcps of this zone as well. And
> there is no need to drain this processors pcps of other zones.
>
As Mel-san pointed out, -mm has drain_all_local_pages(). We'll use it.

Thanks,
-Kame
Re: libata reset-seq merge broke sata_sil on sh
On Thu, May 10, 2007 at 03:08:59PM +0200, Tejun Heo wrote:
> Paul Mundt wrote:
> > The detection is simply flaky after that point, however before the
> > current master it never hit the 35 second point (and thus never implied
> > that the link was down). I'll double check the bisect log to see if there
> > was anything beyond that that may have caused it.
> >
> > The -ENODEV at least implies that the SRST fails, so at least that's a
> > starting point.
>
> If prereset() fails to get the initial DRDY before 10secs, it assumes
> something went wrong and escalates to hardreset. sil family of
> controllers report 0xff status while the link is broken and it seems
> that your particular drive needs more than the current 150ms to recover
> phy link. It probably went unnoticed till now because the device was
> never hardreset before. If the diagnosis is correct, increasing the
> delay in hardreset should fix the problem. Well, let's see. :-)
>
Bumping the hardreset delay up does indeed fix it; I've had to bump it up to 1200 before it started working (at 600 it still fails):

[0.967379] scsi0 : sata_sil
[0.970425] scsi1 : sata_sil
[0.973298] ata1: SATA max UDMA/100 cmd 0xfd000280 ctl 0xfd00028a bmdma 0xfd000200 irq 0
[0.981331] ata2: SATA max UDMA/100 cmd 0xfd0002c0 ctl 0xfd0002ca bmdma 0xfd000208 irq 0
[1.299353] ata1: device not ready (errno=-19), forcing hardreset
[2.817893] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[2.826284] ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
[2.831052] ata1.00: ATA-5: HHD424020F7SV00, 00MLA0A5, max UDMA/100
[2.837548] ata1.00: 39070080 sectors, multi 0: LBA
[2.842702] ata1.00: applying bridge limits
[2.854162] ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
[2.858938] ata1.00: configured for UDMA/100
[3.172602] ata2: SATA link down (SStatus 0 SControl 310)
[3.175736] scsi 0:0:0:0: Direct-Access ATA HHD424020F7SV00 00ML PQ: 0 ANSI: 5

I'm not sure if it matters or not, but this is an iVDR drive, so that might also have additional implications.

--

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4595d1f..4dad3fd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3518,7 +3518,7 @@ int sata_std_hardreset(struct ata_port *ap, unsigned int *class,
 	}
 
 	/* wait a while before checking status, see SRST for more info */
-	msleep(150);
+	msleep(1200);
 
 	rc = ata_wait_ready(ap, deadline);
 	/* link occupied, -ENODEV too is an error */
Re: [PATCH] libata: add human-readable error value decoding
> Scrollback rarely works as planned, for me. Overall, a balance must be
> found.
>
> More information is more helpful. But.
>
> There are downsides to spewing everything possible, upon error. You
> cause logging to the possibly problematic disk, you push older messages
> out of the printk ring buffer, etc., etc.

Get yourself a Voodoo5 or similar card cheap off ebay. The firmware on most of them doesn't clear the top 30MB of RAM on a reboot/PCI reset which makes them excellent debug buffers providing you empty the buffer before you run the X server.

Alan
Re: [RFC] memory hotremove patch take 2 [02/10] (make page unused)
On Thu, 10 May 2007 11:04:37 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Wed, 9 May 2007, Yasunori Goto wrote:
>
> > This patch is for supporting making page unused.
> >
> > Isolate pages by capturing freed pages before inserting free_area[],
> > buddy allocator.
> > If you have an idea for avoiding spin_lock(), please advise me.
>
> Using the zone lock instead may avoid to introduce another lock? Or is the
> new lock here for performance reasons?
>
> Isn't it possible to just add another flavor of pages like what Mel has
> been doing with reclaimable and movable? I.e. add another category of free
> pages to Mel's scheme called isolated and use Mel's function to move stuff
> over there?
>
Mel-san's idea seems good, so we'll rewrite this whole patch.

Thank you.
-Kame
Re: [RFC] memory hotremove patch take 2 [02/10] (make page unused)
On Thu, 10 May 2007 16:34:01 +0100 (IST) Mel Gorman <[EMAIL PROTECTED]> wrote:
> > +#ifdef CONFIG_PAGE_ISOLATION
> > +	/*
> > +	 * For pages which are not used but not free.
> > +	 * See include/linux/page_isolation.h
> > +	 */
> > +	spinlock_t		isolation_lock;
> > +	struct list_head	isolation_list;
> > +#endif
>
> Using MIGRATE_ISOLATING instead of this approach does mean that there will
> be MAX_ORDER additional struct free_area added to the zone. That is more
> lists than this approach.
>
Thank you! It's an interesting idea. I think it will make our code much simpler. I'll look into it.

> I am somewhat suprised that CONFIG_PAGE_ISOLATION exists as a separate
> option. If it was a compile-time option at all, I would expect it to
> depend on memory hot-remove being selected.
>
I myself think CONFIG_PAGE_ISOLATION can be used by other code which needs to isolate some amount of contiguous pages, so the config is separate for now. Currently, CONFIG_MEMORY_HOTREMOVE selects it. CONFIG_PAGE_ISOLATION and CONFIG_MEMORY_HOTREMOVE will be merged later if no one uses this except for hot-removal.

> > /*
> > * zone_start_pfn, spanned_pages and present_pages are all
> > * protected by span_seqlock. It is a seqlock because it has
> > Index: current_test/mm/page_alloc.c
> > ===
> > --- current_test.orig/mm/page_alloc.c 2007-05-08 15:07:20.0 +0900
> > +++ current_test/mm/page_alloc.c2007-05-08 15:08:34.0 +0900
> > @@ -41,6 +41,7 @@
> > #include
> > #include
> > #include
> > +#include
> >
> > #include
> > #include
> > @@ -448,6 +449,9 @@ static inline void __free_one_page(struc
> > 	if (unlikely(PageCompound(page)))
> > 		destroy_compound_page(page, order);
> >
> > +	if (page_under_isolation(zone, page, order))
> > +		return;
> > +
>
> Using MIGRATE_ISOLATING would avoid a potential list search here.
>
Yes. Thank you.
> > 	page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
> >
> > 	VM_BUG_ON(page_idx & (order_size - 1));
> > @@ -3259,6 +3263,10 @@ static void __meminit free_area_init_cor
> > 		zone->nr_scan_inactive = 0;
> > 		zap_zone_vm_stats(zone);
> > 		atomic_set(&zone->reclaim_in_progress, 0);
> > +#ifdef CONFIG_PAGE_ISOLATION
> > +		spin_lock_init(&zone->isolation_lock);
> > +		INIT_LIST_HEAD(&zone->isolation_list);
> > +#endif
> > 		if (!size)
> > 			continue;
> >
> > @@ -4214,3 +4222,182 @@ void set_pageblock_flags_group(struct pa
> > 		else
> > 			__clear_bit(bitidx + start_bitidx, bitmap);
> > }
> > +
> > +#ifdef CONFIG_PAGE_ISOLATION
> > +/*
> > + * Page Isolation.
> > + *
> > + * If a page is removed from usual free_list and will never be used,
> > + * It is linked to "struct isolation_info" and set Reserved, Private
> > + * bit. page->mapping points to isolation_info in it.
> > + * and page_count(page) is 0.
> > + *
> > + * This can be used for creating a chunk of contiguous *unused* memory.
> > + *
> > + * current user is Memory-Hot-Remove.
> > + * maybe move to some other file is better.
>
> page_isolation.c to match the header filename seems reasonable.
> page_alloc.c has a lot of multi-function stuff like memory initialisation
> in it.

Hmm.

> > + */
> > +static void
> > +isolate_page_nolock(struct isolation_info *info, struct page *page, int order)
> > +{
> > +	int pagenum;
> > +	pagenum = 1 << order;
> > +	while (pagenum > 0) {
> > +		SetPageReserved(page);
> > +		SetPagePrivate(page);
> > +		page->private = (unsigned long)info;
> > +		list_add(&page->lru, &info->pages);
> > +		page++;
> > +		pagenum--;
> > +	}
> > +}
>
> It's worth commenting somewhere that pages on the list in isolation_info
> are always order-0.
>
Okay.

> > +
> > +/*
> > + * This function is called from page_under_isolation()
> > + */
> > +
> > +int __page_under_isolation(struct zone *zone, struct page *page, int order)
> > +{
> > +	struct isolation_info *info;
> > +	unsigned long pfn = page_to_pfn(page);
> > +	unsigned long flags;
> > +	int found = 0;
> > +
> > +	spin_lock_irqsave(&zone->isolation_lock,flags);
>
> An unwritten convention seems to be that __ versions of same-named
> functions are the nolock version. i.e. I would expect
> page_under_isolation() to acquire and release the spinlock and
> __page_under_isolation() to do no additional locking.
>
> Locking outside of here might make the flow a little clearer as well if
> you had two returns and avoided the use of "found".
>
Maybe MIGRATE_ISOLATING will simplify this code.

> > +	list_for_each_entry(info, &zone->isolation_list, list) {
> > +		if (info->start_pfn <= pfn && pfn < info->end_pfn) {
> > +			found = 1;
> > +			break;
> > +		}
> > +	}
> > +	if (found) {
> >
Re: Please revert 5b479c91da90eef605f851508744bfe8269591a0 (md partition rescan)
Satyam Sharma wrote:
> On 5/10/07, Xavier Bestel <[EMAIL PROTECTED]> wrote:
>> On Thu, 2007-05-10 at 16:51 +0200, Jan Engelhardt wrote:
>> > >(But Andrew never saw your email, I suspect: "[EMAIL PROTECTED]" is probably
>> > >some strange mixup of Andrew Morton and Andi Kleen in your mind ;)
>> >
>> > What do the letters kp stand for?
>
> Heh ... I've always wanted to know that myself. It's funny, no one
> seems to have asked that on lkml during all these years (at least none
> that a Google search would throw up).
>
>> "Keep Patching" ?
>
> Unlikely. "akpm" seems to be a pre-Linux-kernel nick.

http://en.wikipedia.org/wiki/Andrew_Morton_%28computer_programmer%29

	-hpa
Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?
On Thu, May 10, 2007 at 04:49:35PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > Ok, this is important to know because we merged a mod around that time
> > that changes the way we handle the updates to the file size i.e. the
> > fix for the NULL-files-on-crash problem:
> >
> > http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba87ea699ebd9dd577bf055ebc4a98200e337542
> >
> > and that means the size of the file is not updated to the incore
> > cached inode until after the data write is complete. The symptoms
> > being seen would match with a inode-not-being-written-after-last-
> > data-write-bug in this mod
> >
>
> Yes, that does look like a good candidate. Should I try to
> before-and-after this change?

Yes please!

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
Info about the new netlink layer userland API
On Thu, May 10, 2007 at 04:01:52AM -0700, David Miller wrote:
>
> It's not OK, please use the generic netlink interface and as
> such you will not need to allocate any numbers at all.
>
> Documentation/networking/generic_netlink.txt gives a link
> to some information on this topic.

Where can I find some information about userland programming _without_ using the libnl library?

Is there something similar to the magic command:

	ret = socket(PF_NETLINK, SOCK_RAW, NETLINK_PPSAPI);

in this new netlink API?

Thanks for your help,

Rodolfo

-- 
GNU/Linux Solutions    e-mail: [EMAIL PROTECTED]
Linux Device Driver            [EMAIL PROTECTED]
Embedded Systems               [EMAIL PROTECTED]
UNIX programming       phone:  +39 349 2432127
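For the question above, a rough sketch of the libnl-free route: with generic netlink there is one shared protocol number, NETLINK_GENERIC, and a family's numeric id is looked up by name through the controller family (GENL_ID_CTRL, command CTRL_CMD_GETFAMILY). The helper below only builds that lookup request in a caller-supplied buffer. `genl_build_getfamily()` is a hypothetical name for illustration; the structures and constants come from <linux/netlink.h> and <linux/genetlink.h>, and the family name to pass in is whatever the kernel side registers (not specified in this thread).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Build a CTRL_CMD_GETFAMILY request for the family called `family`.
 * buf must be suitably aligned for struct nlmsghdr.
 * Returns the message length in bytes, or 0 if buf is too small. */
size_t genl_build_getfamily(void *buf, size_t buflen, const char *family)
{
    struct nlmsghdr *nlh = buf;
    struct genlmsghdr *genl;
    struct nlattr *nla;
    size_t name_len, attr_len, total;

    assert(buf != NULL && family != NULL);
    name_len = strlen(family) + 1;            /* attribute carries the NUL */
    attr_len = NLA_HDRLEN + name_len;
    total = NLMSG_LENGTH(GENL_HDRLEN) + NLA_ALIGN(attr_len);
    if (total > buflen)
        return 0;

    memset(buf, 0, total);
    nlh->nlmsg_len = (__u32)total;
    nlh->nlmsg_type = GENL_ID_CTRL;           /* talk to the controller */
    nlh->nlmsg_flags = NLM_F_REQUEST;

    genl = NLMSG_DATA(nlh);
    genl->cmd = CTRL_CMD_GETFAMILY;
    genl->version = 1;

    nla = (struct nlattr *)((char *)genl + GENL_HDRLEN);
    nla->nla_len = (__u16)attr_len;           /* unpadded length */
    nla->nla_type = CTRL_ATTR_FAMILY_NAME;
    memcpy((char *)nla + NLA_HDRLEN, family, name_len);
    return total;
}
```

The message would then be sent on a socket opened with socket(PF_NETLINK, SOCK_RAW, NETLINK_GENERIC); the reply carries a CTRL_ATTR_FAMILY_ID attribute whose value is used as nlmsg_type for later messages to that family. Socket setup, sending and reply parsing are left out of this sketch.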
[ANNOUNCE] GIT 1.5.2-rc3
Upcoming 1.5.2 will have three large-ish new features that the user community wished to have for quite some time. I usually do not CC the kernel list for -rc releases, but this one will hopefully be pretty much the same as what the final one would look like, so here it is.

We may get around to fixing the long-standing git-apply corner case that HPA's rebase problem unveiled before v1.5.2, but we might defer it until after 1.5.2 (it is not a regression). We'll see.

GIT v1.5.2 Release Notes (draft)

Updates since v1.5.1

* Plumbing level subproject support.

  You can include a subdirectory that has an independent git repository in your index and tree objects as a "subproject". This plumbing (i.e. "core") level subproject support explicitly excludes recursive behaviour.

  The "subproject" entries in the index and trees are incompatible with older versions of git. Experimenting with the plumbing level support is encouraged, but be warned that unless everybody in your project updates to this release or later, using this feature would make your project inaccessible by people with older versions of git.

* Plumbing level gitattributes support.

  The gitattributes mechanism allows you to add 'attributes' to paths in your project, and affect the way certain git operations work. Currently you can influence if a path is considered a binary or text (the former would be treated by 'git diff' not to produce textual output; the latter can go through the line endings conversion process in repositories with core.autocrlf set), expand and unexpand '$ident$' keyword with blob object name, specify a custom 3-way merge driver, and specify a custom diff driver. You can also apply arbitrary filter to contents on check-in/check-out codepath but this feature is an extremely sharp-edged razor and needs to be handled with caution (do not use it unless you understand the earlier mailing list discussion on keyword expansion).

* The packfile format now optionally supports 64-bit index.

  This release supports the "version 2" format of the .idx file. This is automatically enabled when a huge packfile needs more than 32-bit to express offsets of objects in the pack

* Comes with an updated git-gui 0.7.0

* Updated gitweb:

  - can show combined diff for merges;
  - uses font size of user's preference, not hardcoded in pixels;

* New commands and options.

  - "git bisect start" can optionally take a single bad commit and zero or more good commits on the command line.

  - "git shortlog" can optionally be told to wrap its output.

  - "subtree" merge strategy allows another project to be merged in as your subdirectory.

  - "git format-patch" learned a new --subject-prefix= option, to override the built-in "[PATCH]".

  - "git add -u" is a quick way to do the first stage of "git commit -a" (i.e. update the index to match the working tree); it obviously does not make a commit.

  - "git clean" honors a new configuration, "clean.requireforce". When set to true, this makes "git clean" a no-op, preventing you from losing files by typing "git clean" when you meant to say "make clean". You can still say "git clean -f" to override this.

  - "git log" family of commands learned --date={local,relative,default} option. --date=relative is synonym to the --relative-date. --date=local gives the timestamp in local timezone.

* Updated behavior of existing commands.

  - When $GIT_COMMITTER_EMAIL or $GIT_AUTHOR_EMAIL is not set but $EMAIL is set, the latter is used as a substitute.

  - "git diff --stat" shows size of preimage and postimage blobs for binary contents. Earlier it only said "Bin".

  - "git lost-found" shows stuff that are unreachable except from reflogs.

  - "git checkout branch^0" now detaches HEAD at the tip commit on the named branch, instead of just switching to the branch (use "git checkout branch" to switch to the branch, as before).

  - "git bisect next" can be used after giving only a bad commit without giving a good one (this starts bisection half-way to the root commit). We used to refuse to operate without a good and a bad commit.

  - "git push", when pushing into more than one repository, does not stop at the first error.

  - "git archive" no longer insists that you give the --format parameter; it defaults to "tar".

  - "git cvsserver" can use backends other than sqlite.

  - "gitview" (in contrib/ section) learned to better support "git-annotate".

  - "git diff $commit1:$path2 $commit2:$path2" can now report mode changes between the two blobs.

  - Local "git fetch" from a repository whose object store is one of the alternates (e.g. fetching from the origin in a repository created with "git clone -l -s") avoids downloading objects unnecessarily.

  -
Re: slub-i386-support.patch
On Thu, 10 May 2007, William Lee Irwin III wrote:
>> So now quicklist semantics vs. TLB flushing are the motive behind the
>> odd flush_tlb_mm() affair. The real trick with it is that flushing
>> must never occur until the TLB flush. Any change to the core quicklist
>> code that retires pages back to the page allocator earlier (e.g. based
>> on some limit) will break things badly.

On Fri, May 11, 2007 at 12:14:14AM +0100, Hugh Dickins wrote:
> I don't think that's right. It's vital that TLB (of an active mm)
> be flushed before freeing its page back to the quicklist, before it's
> recycled to another mm (or elsewhere in this mm); but having done that,
> it really doesn't matter much when quicklist_trim() (check_pgt_cache)
> is called to free surplus pages from quicklist back to page_alloc.c.

What I was really going on about was that quicklist freeing can't enforce any high watermarks in the future because it must wait until the TLB flush unless it's guaranteed that TLB flushes are done prior to quicklist freeing (which is furthermore required for other reasons, to be described in the sequel).

On Fri, May 11, 2007 at 12:14:14AM +0100, Hugh Dickins wrote:
> tlb_finish_mmu() happens to be the traditional place it's done, and
> that's where we expect it. flush_tlb_mm() avoids flushing TLB unless
> it's actually required for the mm in question: so wouldn't be a good
> place to rely on flushing TLB for pages freed earlier from other mms
> (but we'd already be in trouble to be leaving them that late).
> I'm guessing (haven't rechecked source) that the cpu_idle() call comes
> about because the top level pgd of a process gets freed very late in
> its exit, and after a great flurry of processes have just exited,
> perhaps there was nothing to free up the accumulation. Though
> it still strikes me as an odd place to do it.

I described it as motivated by such, not really correctly handling it. I didn't bother analyzing it for correctness. I'm not surprised at all that the TLB flush can be missed where it now stands in the patch. I wanted to move it to tlb_finish_mmu() all along, along with quicklist management of lower levels of hierarchy.

quicklist_free() with unflushed TLB entries admits speculation through the pagetable entries corresponding to the list links. So tlb_finish_mmu() is the place to call quicklist_free() on pagetables. This requires distinguishing preconstructed pagetables from freed user pages, which is not done in include/asm-generic/tlb.h (and core callers may need to be adjusted, pending the results of audits).

To clarify, upper levels of pagetables are indeed cached by x86 TLB's. The same kind of deferral of freeing until the TLB is flushed required for leaf pagetables is required for the upper levels as well.

-- wli
Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?
David Chinner wrote:
> Ok, this is important to know because we merged a mod around that time
> that changes the way we handle the updates to the file size i.e. the
> fix for the NULL-files-on-crash problem:
>
> http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba87ea699ebd9dd577bf055ebc4a98200e337542
>
> and that means the size of the file is not updated to the incore
> cached inode until after the data write is complete. The symptoms
> being seen would match with a inode-not-being-written-after-last-
> data-write-bug in this mod
>

Yes, that does look like a good candidate. Should I try to before-and-after this change?

    J
Re: Re: [2.6.21.1] SATA freeze
Robert Hancock wrote: >Gerhard Mack wrote: >> On Wed, 9 May 2007, Jeff Garzik wrote: >>> Gerhard Mack wrote: May 9 14:51:35 mgerhard kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x180 action 0x2 frozen May 9 14:51:35 mgerhard kernel: ata1.00: cmd 35/00:00:80:6d:c8/00:04:09:00:00/e0 tag 0 cdb 0x0 data 524288 out May 9 14:51:35 mgerhard kernel: res 40/00:c8:68:65:c8/84:00:09:00:00/e0 Emask 0x4 (timeout) May 9 14:51:42 mgerhard kernel: ata1: port is slow to respond, please be patient (Status 0xd0) Anything I can do to figure out what's causing this? > You're showing various flags set in the SError register, which > suggests you're having SATA communication problems with the drive. A > bad SATA cable or power problems would be a strong possibility. I just joined the list today, so apologies if this email breaks threading in any email client. I have been seeing similar errors on two different systems. I applied Robert's sata_nv patch posted to the list on May 5th, and approved today by Jeff Garzik. I've taken several steps to ensure that this isn't a faulty cable or drive issue. This is running on an hp dl145g2. 
Here is my lspci, dmesg, and relevant kernel config sections: Linux version 2.6.21-gentoo ([EMAIL PROTECTED]) (gcc version 4.1.1 (Gentoo 4.1.1)) #6 SMP Sun May 6 16:44:40 PDT 2007 Command line: root=/dev/sda2 BIOS-provided physical RAM map: BIOS-e820: - 00098800 (usable) BIOS-e820: 00098800 - 000a (reserved) BIOS-e820: 000c2000 - 0010 (reserved) BIOS-e820: 0010 - bff2 (usable) BIOS-e820: bff2 - bff29000 (ACPI data) BIOS-e820: bff29000 - bff8 (ACPI NVS) BIOS-e820: bff8 - c000 (reserved) BIOS-e820: d800 - d8000400 (reserved) BIOS-e820: d8001000 - d8001400 (reserved) BIOS-e820: e000 - f000 (reserved) BIOS-e820: fec0 - fec00400 (reserved) BIOS-e820: fee0 - fee01000 (reserved) BIOS-e820: fff8 - 0001 (reserved) BIOS-e820: 0001 - 00014000 (usable) Entering add_active_range(0, 0, 152) 0 entries of 256 used Entering add_active_range(0, 256, 786208) 1 entries of 256 used Entering add_active_range(0, 1048576, 1310720) 2 entries of 256 used end_pfn_map = 1310720 DMI present. Entering add_active_range(0, 0, 152) 0 entries of 256 used Entering add_active_range(0, 256, 786208) 1 entries of 256 used Entering add_active_range(0, 1048576, 1310720) 2 entries of 256 used Zone PFN ranges: DMA 0 -> 4096 DMA324096 -> 1048576 Normal1048576 -> 1310720 early_node_map[3] active PFN ranges 0:0 -> 152 0: 256 -> 786208 0: 1048576 -> 1310720 On node 0 totalpages: 1048248 DMA zone: 56 pages used for memmap DMA zone: 1138 pages reserved DMA zone: 2798 pages, LIFO batch:0 DMA32 zone: 14280 pages used for memmap DMA32 zone: 767832 pages, LIFO batch:31 Normal zone: 3584 pages used for memmap Normal zone: 258560 pages, LIFO batch:31 Intel MultiProcessor Specification v1.4 MPTABLE: OEM ID: AMD MPTABLE: Product ID: HAMMER MPTABLE: APIC at: 0xFEE0 Processor #0 (Bootup-CPU) Processor #1 I/O APIC #2 at 0xFEC0. I/O APIC #3 at 0xD800. I/O APIC #4 at 0xD8001000. 
Setting APIC routing to flat Processors: 2 Nosave address range: 00098000 - 00099000 Nosave address range: 00099000 - 000a Nosave address range: 000a - 000c2000 Nosave address range: 000c2000 - 0010 Nosave address range: bff2 - bff29000 Nosave address range: bff29000 - bff8 Nosave address range: bff8 - c000 Nosave address range: c000 - d800 Nosave address range: d800 - d8001000 Nosave address range: d8001000 - e000 Nosave address range: e000 - f000 Nosave address range: f000 - fec0 Nosave address range: fec0 - fee0 Nosave address range: fee0 - fee01000 Nosave address range: fee01000 - fff8 Nosave address range: fff8 - 0001 Allocating PCI resources starting at c200 (gap: c000:1800) PERCPU: Allocating 36608 bytes of per cpu data Built 1 zonelists. Total pages: 1029190 Kernel command line: root=/dev/sda2 Initializing CPU#0 PID hash table entries: 4096 (order: 12, 32768 bytes) time.c: Detected 2009.287 MHz processor. Console: colour VGA+ 80x25 Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) Checking aperture... CPU 0: aperture @ 233e00 size 32 MB
Re: [PATCH] use defines in sys_getpriority/sys_setpriority
On Thu, 10 May 2007 10:22:23 -0700 Daniel Walker <[EMAIL PROTECTED]> wrote: > Switch to the defines for these two checks, instead of hard > coding the values. > > Signed-Off-By: Daniel Walker <[EMAIL PROTECTED]> > > --- > kernel/sys.c |4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > Index: linux-2.6.21/kernel/sys.c > === > --- linux-2.6.21.orig/kernel/sys.c > +++ linux-2.6.21/kernel/sys.c > @@ -598,7 +598,7 @@ asmlinkage long sys_setpriority(int whic > int error = -EINVAL; > struct pid *pgrp; > > - if (which > 2 || which < 0) > + if (which > PRIO_USER || which < PRIO_PROCESS) > goto out; > > /* normalize: avoid signed division (rounding problems) */ > @@ -662,7 +662,7 @@ asmlinkage long sys_getpriority(int whic > long niceval, retval = -ESRCH; > struct pid *pgrp; > > - if (which > 2 || which < 0) > + if (which > PRIO_USER || which < PRIO_PROCESS) > return -EINVAL; > > read_lock(_lock); I added this: --- a/kernel/sys.c~use-defines-in-sys_getpriority-sys_setpriority-fix +++ a/kernel/sys.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include _
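For readers following the patch, here is a minimal userspace sketch of the range check it introduces. It assumes the usual <sys/resource.h> values (PRIO_PROCESS = 0, PRIO_PGRP = 1, PRIO_USER = 2), which is why the new test should be equivalent to the old `which > 2 || which < 0`. The helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper mirroring the check added by the patch. */
static int which_is_valid(int which)
{
    if (which > PRIO_USER || which < PRIO_PROCESS)
        return 0;
    return 1;
}

/* The old hard-coded form, kept only for comparison. */
static int which_is_valid_old(int which)
{
    return !(which > 2 || which < 0);
}

/* Both forms should agree for every value, in and out of range. */
static void check_equivalence(void)
{
    for (int which = -3; which <= 5; which++)
        assert(which_is_valid(which) == which_is_valid_old(which));
}
```

The named form only stays equivalent as long as PRIO_PROCESS and PRIO_USER remain the smallest and largest selectors.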
Re: [PATCH] libata: add human-readable error value decoding
Robert Hancock wrote: I don't think this is as big of a deal here as in other cases, like oops output. With libata errors, if they're at the console (which they'd have to be to see these messages), unless something has actually caused a panic the scrollback buffer should still be functional and they'd be able to see the entire output. Scrollback rarely works as planned, for me. Overall, a balance must be found. More information is more helpful. But there are downsides to spewing everything possible upon error. You cause logging to the possibly problematic disk, you push older messages out of the printk ring buffer, etc., etc. Jeff
Re: 2.6.21-mm2 -- powerpc missing kset
On Thu, 10 May 2007 08:48:02 -0700 Randy Dunlap <[EMAIL PROTECTED]> wrote: > > On Thu, 10 May 2007 22:16:31 +1000 Stephen Rothwell wrote: > > > On Thu, 10 May 2007 12:48:28 +0100 Andy Whitcroft <[EMAIL PROTECTED]> wrote: > > > > > > arch/powerpc/platforms/pseries/power.c:31: warning: `struct subsystem' > > > declared inside parameter list > > > > There is no explicit reference to struct subsystem in the current version > > of that file. > > There is in 2.6.21-mm2. Are you saying that it's been fixed > somewhere else? (where?) Linus' tree. It was fixed by commit 823bccfc4002296ba88c3ad0f049e1abd8108d30 ('remove "struct subsystem" as it is no longer needed') from Greg Kroah-Hartman dated 2007-04-14 which was applied before v2.6.21-mm2 (according to gitk). -- Cheers, Stephen Rothwell [EMAIL PROTECTED] http://www.canb.auug.org.au/~sfr/
Re: [PATCH] Use boot based time for process start time and boot time in /proc
On Thu, 10 May 2007 19:10:42 +0200 Tomas Janousek <[EMAIL PROTECTED]> wrote: > Commit 411187fb05cd11676b0979d9fbf3291db69dbce2 caused boot time to move and > process start times to become invalid after suspend. Using boot based time for > those restores the old behaviour and fixes the issue. > > .. > > @@ -445,12 +445,14 @@ static int show_stat(struct seq_file *p, void *v) > unsigned long jif; > cputime64_t user, nice, system, idle, iowait, irq, softirq, steal; > u64 sum = 0; > + struct timespec boottime; > > user = nice = system = idle = iowait = > irq = softirq = steal = cputime64_zero; > - jif = - wall_to_monotonic.tv_sec; > - if (wall_to_monotonic.tv_nsec) > - --jif; > + getboottime(&boottime); > + jif = boottime.tv_sec; > + if (boottime.tv_nsec) > + ++jif; > Is the switch from --jif to ++jif a functional change? If so, how come? > for_each_possible_cpu(i) { > int j; > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 40645b4..386ff51 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -918,7 +918,7 @@ struct task_struct { > unsigned int rt_priority; > cputime_t utime, stime; > unsigned long nvcsw, nivcsw; /* context switch counts */ > - struct timespec start_time; > + struct timespec start_time, real_start_time; No, please prefer to do struct timespec start_time; struct timespec real_start_time; which gives a nice place to add a comment documenting the field. Please document fields. What is the difference between start_time and real_start_time?
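One way to look at the --jif versus ++jif question is a small userspace model. This is only a sketch under stated assumptions: that wall_to_monotonic is kept as a normalized timespec (0 <= tv_nsec < 1e9) holding the negated boot time, and that getboottime() negates it and renormalizes, as the hunk in the companion "Introduce boot based time" patch suggests. The names `ts`, `normalize`, `btime_old` and `btime_new` are illustrative and do not exist in the kernel.

```c
#include <assert.h>

#define NSEC_PER_SEC 1000000000L

struct ts { long tv_sec; long tv_nsec; };   /* stand-in for struct timespec */

/* Userspace model of normalization: force 0 <= tv_nsec < NSEC_PER_SEC. */
static struct ts normalize(long sec, long nsec)
{
    while (nsec >= NSEC_PER_SEC) { nsec -= NSEC_PER_SEC; sec++; }
    while (nsec < 0)             { nsec += NSEC_PER_SEC; sec--; }
    return (struct ts){ sec, nsec };
}

/* Old show_stat() logic: derive the value straight from wall_to_monotonic. */
static long btime_old(struct ts w2m)
{
    long jif = -w2m.tv_sec;
    if (w2m.tv_nsec)
        --jif;
    return jif;
}

/* New logic: build a boot time first (assumed getboottime() behaviour),
 * then bump the seconds if there is a fractional part. */
static long btime_new(struct ts w2m, long total_sleep_time)
{
    struct ts boottime = normalize(-(w2m.tv_sec + total_sleep_time),
                                   -w2m.tv_nsec);
    long jif = boottime.tv_sec;
    if (boottime.tv_nsec)
        ++jif;
    return jif;
}
```

Under these assumptions, a boot at 1000.25 s gives wall_to_monotonic = {-1001, 750000000}; the old code yields 1000 (rounds down) and the new code yields 1001 (rounds up), while a boot on an exact second gives the same answer in both. So in this model the switch does look like a functional change of up to one second whenever tv_nsec is nonzero.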
Re: [PATCH] Introduce boot based time
On Thu, 10 May 2007 19:10:25 +0200 Tomas Janousek <[EMAIL PROTECTED]> wrote: > The commits > 411187fb05cd11676b0979d9fbf3291db69dbce2 (GTOD: persistent clock support) > c1d370e167d66b10bca3b602d3740405469383de (i386: use GTOD persistent clock > support) > changed the monotonic time so that it no longer jumps after resume, but it's > not possible to use it for boot time and process start time calculations then. > Also, the uptime no longer increases during suspend. > > I add a variable to track the wall_to_monotonic changes, a function to get the > real boot time and a function to get the boot based time from the monotonic > one. From: Andrew Morton <[EMAIL PROTECTED]> - I don't think those symbols are needed in modules. - Document total_sleep_time units (would have been better to call it total_sleep_time_secs, perhaps). Cc: John Stultz <[EMAIL PROTECTED]> Cc: Tomas Janousek <[EMAIL PROTECTED]> Cc: Tomas Smetana <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- kernel/time/timekeeping.c |6 +- 1 files changed, 1 insertion(+), 5 deletions(-) diff -puN include/linux/time.h~introduce-boot-based-time-fix include/linux/time.h diff -puN kernel/time/timekeeping.c~introduce-boot-based-time-fix kernel/time/timekeeping.c --- a/kernel/time/timekeeping.c~introduce-boot-based-time-fix +++ a/kernel/time/timekeeping.c @@ -46,7 +46,7 @@ EXPORT_SYMBOL(xtime_lock); */ struct timespec xtime __attribute__ ((aligned (16))); struct timespec wall_to_monotonic __attribute__ ((aligned (16))); -static unsigned long total_sleep_time; +static unsigned long total_sleep_time; /* seconds */ EXPORT_SYMBOL(xtime); @@ -503,8 +503,6 @@ void getboottime(struct timespec *ts) - wall_to_monotonic.tv_nsec); } -EXPORT_SYMBOL(getboottime); - /** * monotonic_to_bootbased - Convert the monotonic time to boot based. 
* @ts:pointer to the timespec to be converted @@ -513,5 +511,3 @@ void monotonic_to_bootbased(struct times { ts->tv_sec += total_sleep_time; } - -EXPORT_SYMBOL(monotonic_to_bootbased); _
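The conversion in the hunk above can be modelled in a few lines of userspace C. This is a sketch only: `struct ts` stands in for struct timespec, and `model_resume` is a hypothetical helper, since how the kernel actually accumulates total_sleep_time across suspend is not shown in this message.

```c
#include <assert.h>

struct ts { long tv_sec; long tv_nsec; };   /* stand-in for struct timespec */

static unsigned long total_sleep_time;      /* seconds, as the fix documents */

/* Hypothetical: account for a suspend/resume cycle of slept_secs seconds. */
static void model_resume(unsigned long slept_secs)
{
    total_sleep_time += slept_secs;
}

/* Same body as the function in the patch: shift a monotonic timestamp
 * by the time spent suspended to get a boot based one. */
static void monotonic_to_bootbased(struct ts *t)
{
    t->tv_sec += total_sleep_time;
}
```

With this model, a monotonic timestamp of 100 s taken after a 30 s suspend converts to 130 s boot based, which is the behaviour the description asks for: uptime-style values keep counting across suspend while the monotonic clock does not jump.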
Re: [RFC] Slab allocators: Drop support for destructors
On Thu, May 10, 2007 at 12:00:08PM -0700, Christoph Lameter wrote: > As far as I can tell there is only a single slab destructor left (there > is currently another in i386 but it's going to go as soon as Andi merges > i386's support for quicklists). > > I wonder how difficult it would be to remove it? If we have no need for > destructors anymore then maybe we could remove destructor support from the > slab allocators? There is no point in checking for destructor uses in > the slab allocators if there are none. > > Or are there valid reasons to keep them around? It seems they were mainly > used for list management which required them to take a spinlock. Taking a > spinlock in a destructor is a bit risky since the slab allocators may run > the destructors anytime they decide a slab is no longer needed. > > Or do we want to continue to support destructors? If so why? > [snip pmb stuff] I'll take a look at tidying up the PMB slab; getting rid of the dtor shouldn't be terribly painful. I simply opted to do the list management there since others were doing it for the PGD slab cache at the time that was written.
Re: slub-i386-support.patch
William Lee Irwin III wrote: >> Xen is not mandatory as it now stands. On Thu, May 10, 2007 at 02:28:05PM -0700, Jeremy Fitzhardinge wrote: > ? I'm hoping to merge the Xen code in the next couple of days, so I'd > appreciate it if we don't break the foundations just before building the > building. CONFIG_X86_PAE without CONFIG_PARAVIRT is the case in question here. What's done in that case can't break Xen because it doesn't run under Xen. William Lee Irwin III wrote: >> Also, I intend to fix up Xen >> at some point so it doesn't need this. On Thu, May 10, 2007 at 02:28:05PM -0700, Jeremy Fitzhardinge wrote: > As I mentioned in the previous mail, its only really necessary for a > 32-bit guest under a 32-bit hypervisor. While that's going to be a > supported configuration for a long time, we expect that people will > increasingly use 64-bit hypervisors on new machines, so this will become > less of an issue. > We're also looking at shadowing the 4 top-level PAE entries rather than > using them directly, since the shadows only need to be updated when > reloading cr3. This would allow us to use compact pgds, so long as > there's some other way to maintain the pgd list (ideally, something that > can be shared with non-PAE). ISTR you describing this method earlier. This is what I had in mind for fixing up Xen not to need full PAGE_SIZE-sized pgd's. On Thu, May 10, 2007 at 02:28:05PM -0700, Jeremy Fitzhardinge wrote: > Or did I miss something? Is pgd_list being maintained some other way > with slub/quicklists? No, it's identical. clameter's code makes PAGE_SIZE-sized pgd's unconditional for CONFIG_X86_PAE, which is what bothered me. William Lee Irwin III wrote: >> The alternative was 64-bit generation numbers incremented at the time >> of change_page_attr(). If generation numbers were used, it would be >> possible to dispose of the list altogether. Given the awkwardness of >> the list maintenance for Xen, it may be worth using them now. 
PAE >> pgd's could merely double in size to maintain those for the unshared >> kernel pmd case, and remain 32B otherwise. Full PAGE_SIZE-sized pgd's >> for 2-level pagetables could distribute the generation number across >> page->index and page->private, or any other fields available. On Thu, May 10, 2007 at 02:28:05PM -0700, Jeremy Fitzhardinge wrote: > If you use page->index for that, how does pgd_list get linked together > for vmalloc syncing? It doesn't need to be linked together for vmalloc_sync(). Just increment the generation number and walk the mmlist the same as for pageattr.c -- wli
Re: [PATCH] libata: add human-readable error value decoding
Jeff Garzik wrote: Mark Lord wrote: If we're compiling the messages into the kernel regardless, then it doesn't really make much sense to NOT show all of them on the error paths. Not true. Uncontrolled message spewage inevitably results in critical information scrolling off the screen, before a user can take a digital photo of the output... Or of users being confused by subsequent error fallout (i.e. multiple oopses reporting problem). Moderation and restraint still have roles to play... :) Jeff I don't think this is as big of a deal here as in other cases, like oops output. With libata errors, if they're at the console (which they'd have to be to see these messages), unless something has actually caused a panic the scrollback buffer should still be functional and they'd be able to see the entire output. -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
Re: [PATCH] locks: fix F_GETLK regression (failure to find conflicts)
On Thu, 2007-05-10 at 18:38 -0400, J. Bruce Fields wrote: > In 9d6a8c5c213e34c475e72b245a8eb709258e968c we changed posix_test_lock > to modify its single file_lock argument instead of taking separate input > and output arguments. This makes it no longer safe to set the output > lock's fl_type to F_UNLCK before looking for a conflict, since that > means searching for a conflict against a lock with type F_UNLCK. > > This fixes a regression which causes F_GETLK to incorrectly report no > conflict on most filesystems (including any filesystem that doesn't do > its own locking). > > Also fix posix_lock_to_flock() to copy the lock type. This isn't > strictly necessary, since the caller already does this; but it seems > less likely to cause confusion in the future. > > Thanks to Doug Chapman for the bug report. > > Signed-off-by: "J. Bruce Fields" <[EMAIL PROTECTED]> > --- > fs/locks.c |5 +++-- > 1 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/fs/locks.c b/fs/locks.c > index 671a034..8ec16ab 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -669,7 +669,6 @@ posix_test_lock(struct file *filp, struct file_lock *fl) > { > struct file_lock *cfl; > > - fl->fl_type = F_UNLCK; > lock_kernel(); > for (cfl = filp->f_path.dentry->d_inode->i_flock; cfl; cfl = > cfl->fl_next) { > if (!IS_POSIX(cfl)) > @@ -681,7 +680,8 @@ posix_test_lock(struct file *filp, struct file_lock *fl) > __locks_copy_lock(fl, cfl); > unlock_kernel(); > return 1; > - } > + } else > + fl->fl_type = F_UNLCK; > unlock_kernel(); > return 0; > } > @@ -1632,6 +1632,7 @@ static int posix_lock_to_flock(struct flock *flock, > struct file_lock *fl) > flock->l_len = fl->fl_end == OFFSET_MAX ? 0 : > fl->fl_end - fl->fl_start + 1; > flock->l_whence = 0; > + flock->l_type = fl->fl_type; > return 0; > } > I tested this both with my little hacked up test program as well as with the LTP tests. Looks good. Nice job on the quick turnaround on this Bruce. 
- Doug
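For anyone wanting to check F_GETLK from userspace, the following is a minimal sketch of such a test, not Doug's actual test program and not an LTP case. A child process takes a read lock on a temporary file and the parent probes with a write-lock F_GETLK; on a kernel with correct behaviour the probe should come back describing the child's lock rather than F_UNLCK. The function name and the /tmp path template are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if F_GETLK in the parent reports the child's read lock as a
 * conflict for a write-lock probe, 0 if it does not, -1 on setup errors.
 */
static int getlk_sees_conflict(void)
{
    char path[] = "/tmp/getlk-demo-XXXXXX";
    int ready[2], done[2];
    int fd = mkstemp(path);

    if (fd < 0 || pipe(ready) < 0 || pipe(done) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: read-lock the whole file, tell the parent, then wait. */
        struct flock rl = { .l_type = F_RDLCK, .l_whence = SEEK_SET };
        char c = 'x';
        close(ready[0]);
        close(done[1]);
        if (fcntl(fd, F_SETLK, &rl) < 0)
            _exit(1);
        if (write(ready[1], &c, 1) != 1)
            _exit(1);
        if (read(done[0], &c, 1) < 0)   /* hold the lock until told to stop */
            _exit(1);
        _exit(0);
    }

    /* Parent: close unused pipe ends so a dead child cannot block us. */
    close(ready[1]);
    close(done[0]);

    char c;
    int result = -1;
    struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

    if (read(ready[0], &c, 1) == 1 && fcntl(fd, F_GETLK, &probe) == 0)
        result = (probe.l_type == F_RDLCK && probe.l_pid == child);

    if (write(done[1], "x", 1) != 1)
        result = -1;
    waitpid(child, NULL, 0);
    close(fd);
    unlink(path);
    return result;
}
```

Two processes are needed because locks held by the calling process itself never count as conflicts for F_GETLK.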
Re: [PATCH] libata: add human-readable error value decoding
Tejun Heo wrote: +if (ehc->i.serror) +ata_port_printk(ap, KERN_ERR, + "SError: {%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s}\n", + ehc->i.serror & SERR_DATA_RECOVERED ? "RecovDataErr " : "", + ehc->i.serror & SERR_COMM_RECOVERED ? "RecovCommErr " : "", + ehc->i.serror & SERR_DATA ? "UnrecovDataErr " : "", + ehc->i.serror & SERR_PERSISTENT ? "PersistErr " : "", + ehc->i.serror & SERR_PROTOCOL ? "ProtocolErr " : "", + ehc->i.serror & SERR_INTERNAL ? "HostInternalErr " : "", + ehc->i.serror & SERR_PHYRDY_CHG ? "PHYRdyChg " : "", + ehc->i.serror & SERR_PHY_INT_ERR ? "PHYInternalErr " : "", + ehc->i.serror & SERR_COMM_WAKE ? "CommWake " : "", + ehc->i.serror & SERR_10B_8B_ERR ? "10B8BErr " : "", + ehc->i.serror & SERR_DISPARITY ? "Disparity " : "", + ehc->i.serror & SERR_CRC ? "CRCErr " : "", + ehc->i.serror & SERR_HANDSHAKE ? "HandshakeErr " : "", + ehc->i.serror & SERR_LINK_SEQ_ERR ? "LinkSeqErr " : "", + ehc->i.serror & SERR_TRANS_ST_ERROR ? "TransStatTransErr " : "", + ehc->i.serror & SERR_UNRECOG_FIS ? "UnrecogFIS " : "", + ehc->i.serror & SERR_DEV_XCHG ? "DevExchanged " : "" ); I'm not really convinced whether this is necessary. The human readable form is also a bit cryptic and can get quite long. So, mild NACK from me. It certainly seems useful when debugging hotplug issues or random SATA problems which end up being caused by communication problems. Without this output, Joe User stands no chance of figuring out what's going on, and neither does Joe libata Developer unless they really care to dig through the spec and count bits to figure out what they mean. At least with this you can see that there was a CRC error, etc. and go from that.. 
-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
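To make the "count bits" point concrete, here is a table-driven userspace sketch of SError decoding. It is an alternative shape to the ternary chain in the quoted patch, not the patch itself. The bit positions below are my assumption based on the SATA SError register layout and should be checked against libata.h before being relied on; only a subset of the flags is listed, and `decode_serror` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit positions (SATA SError layout); verify against libata.h. */
#define SERR_DATA_RECOVERED (1u << 0)
#define SERR_COMM_RECOVERED (1u << 1)
#define SERR_DATA           (1u << 8)
#define SERR_PHYRDY_CHG     (1u << 16)
#define SERR_CRC            (1u << 21)

static const struct {
    unsigned int bit;
    const char *name;
} serr_names[] = {
    { SERR_DATA_RECOVERED, "RecovDataErr"   },
    { SERR_COMM_RECOVERED, "RecovCommErr"   },
    { SERR_DATA,           "UnrecovDataErr" },
    { SERR_PHYRDY_CHG,     "PHYRdyChg"      },
    { SERR_CRC,            "CRCErr"         },
};

/* Append the name of every set flag to buf (len must be at least 1). */
static void decode_serror(unsigned int serror, char *buf, size_t len)
{
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(serr_names) / sizeof(serr_names[0]); i++) {
        if (serror & serr_names[i].bit) {
            size_t used = strlen(buf);
            snprintf(buf + used, len - used, "%s ", serr_names[i].name);
        }
    }
}
```

A table keeps the flag-to-name mapping in one place, at the cost of building the string at run time instead of in a single format call.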
Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?
On Thu, May 10, 2007 at 04:07:30PM -0700, Jeremy Fitzhardinge wrote: > David Chinner wrote: > > Just to confirm this isn't a result of a recent change, can you reproduce > > this on a 2.6.20 or 2.6.21 kernel? (sorry if you've already done this - > > I'm juggling > > so many things at once it's easy to forget little things). > > It is the result of a recent change. I had seen no problem until around > 2.6.21-git8-11. I will try again with a plain 2.6.21 kernel, just to > confirm. Ok, this is important to know because we merged a mod around that time that changes the way we handle the updates to the file size, i.e. the fix for the NULL-files-on-crash problem: http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba87ea699ebd9dd577bf055ebc4a98200e337542 and that means the size of the file is not updated to the incore cached inode until after the data write is complete. The symptoms being seen would match an inode-not-being-written-after-last-data-write bug in this mod. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Re: [PATCH 1/2] AFS: Fix interminable loop in afs_write_back_from_locked_page()
On Thu, 10 May 2007 15:33:34 +0100 David Howells <[EMAIL PROTECTED]> wrote: > Following bug was uncovered by compiling with '-W' flag: gcc -W finds a number of fairly scary bugs. More than one would expect, given that it is recommended in Documentation/SubmitChecklist, which everyone reads ;)
Re: slub-i386-support.patch
On Thu, 10 May 2007, William Lee Irwin III wrote: > On Thu, May 10, 2007 at 09:03:39PM +0100, Hugh Dickins wrote: > > Though when I look at the patchset (copied below), I do wonder why > > it puts a quicklist_trim() into i386's cpu_idle() and flush_tlb_mm(): > > neither is where I'd expect us to be secretly freeing pages. Ah, > > several arches do it in cpu_idle(): how odd, oh well. > > So now quicklist semantics vs. TLB flushing are the motive behind the > odd flush_tlb_mm() affair. The real trick with it is that flushing > must never occur until the TLB flush. Any change to the core quicklist > code that retires pages back to the page allocator earlier (e.g. based > on some limit) will break things badly. I don't think that's right. It's vital that TLB (of an active mm) be flushed before freeing its page back to the quicklist, before it's recycled to another mm (or elsewhere in this mm); but having done that, it really doesn't matter much when quicklist_trim() (check_pgt_cache) is called to free surplus pages from quicklist back to page_alloc.c. tlb_finish_mmu() happens to be the traditional place it's done, and that's where we expect it. flush_tlb_mm() avoids flushing TLB unless it's actually required for the mm in question: so wouldn't be a good place to rely on flushing TLB for pages freed earlier from other mms (but we'd already be in trouble to be leaving them that late). I'm guessing (haven't rechecked source) that the cpu_idle() call comes about because the top level pgd of a process gets freed very late in its exit, and after a great flurry of processes have just exited, perhaps there was nothing to free up the accumulation. Though it still strikes me as an odd place to do it. Hugh
Re: getcpu after sched_setaffinity
Andi Kleen wrote: Probably. In principle getcpu() (where does the sched_ come from btw?) getcpu() is an unacceptable name. All the other functions dealing with CPU (sets, etc) have a sched_ prefix. is only designed for the case where you don't set the affinity explicitly; otherwise you should already know where you are and don't need it. That's not true in general. Yes, because I want to test vgetcpu() I restrict the set to just one CPU. But if I have more than 2 "CPUs" and I set the affinity to two CPUs which currently are not used you cannot make this argument. getcpu should always work correctly, not only if you cannot determine it in another way. Hmm ok one could probably define memset(..., 0) as an invalidation interface, but because of the considerations above I don't think it is really needed. It is needed. For now I added the cache clearing in the setaffinity calls in libc. Resetting the cache to {0,0} seems to work. -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
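The single-CPU test case described above can be sketched with the glibc interfaces as they exist today (sched_getaffinity, sched_setaffinity, sched_getcpu); whether these match the interfaces under discussion at the time of this thread is an assumption. The helper pins the calling thread to the first CPU it is already allowed to run on, so that a subsequent sched_getcpu() has exactly one correct answer. `pin_to_first_allowed_cpu` is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Pin the calling thread to the first CPU in its current affinity mask.
 * Returns that CPU number, or -1 on failure.
 */
static int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, one;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (sched_setaffinity(0, sizeof(one), &one) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}
```

After a successful call, comparing sched_getcpu() with the returned CPU number is a direct check for the stale-cache problem: a cached value from before the affinity change would show up as a mismatch.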
Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?
On Thu, May 10, 2007 at 05:51:29PM -0400, Chuck Ebbert wrote: > Jeremy Fitzhardinge wrote: > > Chuck Ebbert wrote: > >> What CPU architecture is this happening on? Not i686 with PAE by > >> any chance? > > > > Yes. Why? > > I have a bug report where NFS files are corrupted only with PAE clients. > Corruption is at the end of the (newly untarred) files. Doesn't happen > without PAE. Chuck, can you post a pointer to this thread? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?
David Chinner wrote: > Just to confirm this isn't a result of a recent change, can you reproduce > this on a 2.6.20 or 2.6.21 kernel? (sorry if you've already done this - I'm > juggling > so many things at once it's easy to forget little things). It is the result of a recent change. I had seen no problem until around 2.6.21-git8-11. I will try again with a plain 2.6.21 kernel, just to confirm. J
Re: [FW ide-cs] Re: jvc cdrom drive lockup
On Thu, 10 May 2007 14:06:54 +0800 "Zhang, Yanmin" <[EMAIL PROTECTED]> wrote: > On Sun, 2007-05-06 at 16:00 +0100, Richard Kennedy wrote: > > On Fri, 2007-05-04 at 23:32 +0900, Komuro wrote: > > > On Thu, 03 May 2007 15:29:19 +0100 > > > Richard Kennedy <[EMAIL PROTECTED]> wrote: > > > > > > > > > IDE bugs should be posted to the linux-ide mailing list. > > > > > > > > > > Hi all, > > > > I have a JVC MP-CDX1 cdrom drive that came with my laptop which used to > > > > work with ide-cs but stopped working with newer kernels. > > > > > > > > I added its ident to ide-cs.c (see patch below) and the drive now is > > > > detected and gets mounted when plugged in and seems to work correctly. > > > > > > > > But when I eject the card, pccardctl eject 0, the laptop locks up > > > > completely, there are no messages in the log, and the fan goes to full > > > > speed so I guess the cpu is running at 100%. > > > > Any ideas what's going wrong or how to debug it ? > > > > Is there anything else I need to patch to get this working ? > > > > > > > > Thanks > > > > Richard > > > > > > > > card info :- > > > > > > > > May 3 11:22:52 mininote kernel: pccard: PCMCIA card inserted into slot > > > > 0 > > > > May 3 11:22:52 mininote kernel: cs: memory probe > > > > 0xa000-0xa0ff: clean. 
> > > > May 3 11:22:52 mininote kernel: pcmcia: registering new device pcmcia0.0
> > > > May 3 11:22:53 mininote kernel: hdc: UJDB130, ATAPI CD/DVD-ROM drive
> > > > May 3 11:22:53 mininote kernel: ide1 at 0x190-0x197,0x396 on irq 3
> > > > May 3 11:22:53 mininote kernel: ide-cs: hdc: Vpp = 0.0
> > > > May 3 11:22:54 mininote kernel: hdc: ATAPI 20X CD-ROM drive, 128kB Cache
> > > > May 3 11:22:54 mininote kernel: Uniform CD-ROM driver Revision: 3.20
> > > > May 3 11:23:04 mininote hald: mounted /dev/hdc on behalf of uid 500
> > > > May 3 11:23:34 mininote hald: unmounted /dev/hdc from '/media/FC_4 i386 ftp #1' on behalf of uid 500
> > > > May 3 11:24:17 mininote kernel: pccard: card ejected from slot 0
> > > > << lockup happened here >>
> >
> > I rebuilt the kernel with the lock dependency checking turned on, which
> > shows up 2 problems (and also breaks the deadlock).
> >
> > kernel: pccard: card ejected from slot 0
> > kernel:
> > kernel: BUG: sleeping function called from invalid context at kernel/rwsem.c:20
> > kernel: in_atomic():0, irqs_disabled():1
> > kernel: INFO: lockdep is turned off.
> > kernel: irq event stamp: 2258
> > kernel: hardirqs last enabled at (2257): [] kfree+0x78/0x7f
> > kernel: hardirqs last disabled at (2258): [] _spin_lock_irq+0xc/0x3a
> > kernel: softirqs last enabled at (2252): [] do_softirq+0x4d/0xb6
> > kernel: softirqs last disabled at (2243): [] do_softirq+0x4d/0xb6
> > kernel: [] down_read+0x15/0x4d
> > kernel: [] pci_get_subsys+0x68/0xea
> > kernel: [] pci_get_device+0x16/0x19
> > kernel: [] init_hwif_default+0x28/0xf0
> > kernel: [] ide_unregister+0x242/0x573
> > kernel: [] ide_release+0x18/0x28 [ide_cs]
> > kernel: [] ide_detach+0x8/0x14 [ide_cs]
> > kernel: [] pcmcia_device_remove+0x50/0xb5
> > kernel: [] __device_release_driver+0x71/0x8e
> > kernel: [] device_release_driver+0x31/0x46
> > kernel: [] bus_remove_device+0x70/0x80
> > kernel: [] device_del+0x162/0x1c6
> > kernel: [] device_unregister+0x8/0x10
> > kernel: [] pcmcia_card_remove+0x58/0x77
> > kernel: [] ds_event+0x56/0x87
> > kernel: [] kobject_get+0xf/0x13
> > kernel: [] send_event+0x31/0x49
> > kernel: [] socket_shutdown+0xc/0xb3
> > kernel: [] socket_remove+0x1c/0x26
> > kernel: [] pcmcia_eject_card+0x3f/0x4c
> > kernel: [] pccard_store_eject+0x1b/0x22
> > kernel: [] pccard_store_eject+0x0/0x22
> > kernel: [] dev_attr_store+0x27/0x2c
> > kernel: [] sysfs_write_file+0xbf/0xe8
> > kernel: [] sysfs_write_file+0x0/0xe8
> > kernel: [] vfs_write+0xa8/0x154
> > kernel: [] sys_write+0x41/0x67
> > kernel: [] sysenter_past_esp+0x5f/0x99
> > kernel: ===
>
> Before calling init_hwif_default, ide_unregister takes ide_lock and
> disables irqs.
> init_hwif_default calls ide_default_io_base, which calls pci_get_device, and
> later pci_get_subsys tries to take the semaphore pci_bus_sem and can go to sleep.
>
> Mostly, pci_get_device should be called when irqs are turned on.
>
> There is one thing I still don't understand. If you test it on a mobile
> machine, the process mostly won't sleep when taking pci_bus_sem, because
> there are not many opportunities for 2 processes to take the semaphore at
> the same time.
>
> Since it only needs to know whether PCI has been initialized,
> ide_default_io_base only needs to check whether the list pci_devices is empty.
>
> Could you try the patch below against 2.6.21?
>
> Signed-off-by: Zhang Yanmin <[EMAIL PROTECTED]>
>
> ---
>
> diff -Nraup linux-2.6.21/drivers/pci/probe.c linux-2.6.21_fix/drivers/pci/probe.c
> --- linux-2.6.21/drivers/pci/probe.c 2007-05-10 11:35:06.0 +0800
> +++
Re: [Bug 8464] New: autoreconf: page allocation failure. order:2, mode:0x84020
On (10/05/07 15:49), Christoph Lameter didst pronounce:
> On Thu, 10 May 2007, Mel Gorman wrote:
>
> > > I cannot predict how allocations on a slab will be performed. In order
> > > to avoid the higher order allocations we would have to add a flag
> > > that tells SLUB at slab creation time that this cache will be
> > > used for atomic allocs and thus we can avoid configuring slabs in such a
> > > way that they use higher order allocs.
> > >
> >
> > It is an option. I had the gfp flags passed in to kmem_cache_create() in
> > mind for determining this but SLUB creates slabs differently and different
> > flags could be passed into kmem_cache_alloc() of course.
>
> So we have a collection of flags to add
>
> SLAB_USES_ATOMIC

This is a possibility.

> SLAB_TEMPORARY

I have a patch for this sitting in a queue waiting for testing

> SLAB_PERSISTENT
> SLAB_RECLAIMABLE
> SLAB_MOVABLE

I don't think these are required because the necessary information is
available from the GFP flags.

>
> ?
>
> > Another alternative is that anti-frag used to also group high-order
> > allocations together and make it hard to fallback to those areas
> > for non-atomic allocations. It is currently backed out by the
> > patch dont-group-high-order-atomic-allocations.patch because
> > it was intended for rare high-order short-lived allocations
> > such as e1000 that are currently dealt with by MIGRATE_RESERVE
> > (bias-the-location-of-pages-freed-for-min_free_kbytes-in-the-same-max_order_nr_pages-blocks.patch)
> > The high-order atomic groupings may help here because the high-order
> > allocations are long-lived and would claim contiguous areas.
> >
> > The last alternative I think I mentioned already is to have the minimum
> > order kswapd reclaims as the same order SLUB uses instead of 0 so that
> > min_free_kbytes is kept at higher orders than current.
>
> Would you get a patch to Nicholas to test either of these solutions?
I do not have a kswapd related patch ready but the first alternative is
readily available. Nicholas, could you back out the patch
dont-group-high-order-atomic-allocations.patch and test again please?

The following patch has the same effect. Thanks

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-mm2-clean/include/linux/mmzone.h linux-2.6.21-mm2-grouphigh/include/linux/mmzone.h
--- linux-2.6.21-mm2-clean/include/linux/mmzone.h 2007-05-09 10:21:28.0 +0100
+++ linux-2.6.21-mm2-grouphigh/include/linux/mmzone.h 2007-05-10 23:54:45.0 +0100
@@ -38,8 +38,9 @@ extern int page_group_by_mobility_disabl
 #define MIGRATE_UNMOVABLE     0
 #define MIGRATE_RECLAIMABLE   1
 #define MIGRATE_MOVABLE       2
-#define MIGRATE_RESERVE       3
-#define MIGRATE_TYPES         4
+#define MIGRATE_HIGHATOMIC    3
+#define MIGRATE_RESERVE       4
+#define MIGRATE_TYPES         5
 
 #define for_each_migratetype_order(order, type) \
 	for (order = 0; order < MAX_ORDER; order++) \
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-mm2-clean/include/linux/pageblock-flags.h linux-2.6.21-mm2-grouphigh/include/linux/pageblock-flags.h
--- linux-2.6.21-mm2-clean/include/linux/pageblock-flags.h 2007-05-09 10:21:28.0 +0100
+++ linux-2.6.21-mm2-grouphigh/include/linux/pageblock-flags.h 2007-05-10 23:54:45.0 +0100
@@ -31,7 +31,7 @@
 /* Bit indices that affect a whole block of pages */
 enum pageblock_bits {
-	PB_range(PB_migrate, 2), /* 2 bits required for migrate types */
+	PB_range(PB_migrate, 3), /* 3 bits required for migrate types */
 	NR_PAGEBLOCK_BITS
 };
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-mm2-clean/mm/page_alloc.c linux-2.6.21-mm2-grouphigh/mm/page_alloc.c
--- linux-2.6.21-mm2-clean/mm/page_alloc.c 2007-05-09 10:21:28.0 +0100
+++ linux-2.6.21-mm2-grouphigh/mm/page_alloc.c 2007-05-10 23:54:45.0 +0100
@@ -167,6 +167,11 @@ static inline int allocflags_to_migratet
 	if (unlikely(page_group_by_mobility_disabled))
 		return MIGRATE_UNMOVABLE;
 
+	/* Cluster high-order atomic allocations together */
+	if (unlikely(order > 0) &&
+			(!(gfp_flags & __GFP_WAIT) || in_interrupt()))
+		return MIGRATE_HIGHATOMIC;
+
 	/* Cluster based on mobility */
 	return (((gfp_flags & __GFP_MOVABLE) != 0) << 1) |
 		((gfp_flags & __GFP_RECLAIMABLE) != 0);
@@ -713,10 +718,11 @@ static struct page *__rmqueue_smallest(s
  * the free lists for the desirable migrate type are depleted
  */
 static int fallbacks[MIGRATE_TYPES][MIGRATE_TYPES-1] = {
-	[MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
-	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
-	[MIGRATE_MOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_RESERVE },
-	[MIGRATE_RESERVE] = { MIGRATE_RESERVE, MIGRATE_RESERVE,
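For readers following along without a kernel tree, here is a rough userspace sketch of the decision the allocflags_to_migratetype() hunk above adds. It is only a model under stated assumptions: the GFP bit values are assumed for illustration (check include/linux/gfp.h in the tree being tested), in_interrupt() is replaced by a plain parameter, the page_group_by_mobility_disabled check is left out, and migratetype_for() and bits_for_types() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Migrate types as numbered by the patch above. */
#define MIGRATE_UNMOVABLE   0
#define MIGRATE_RECLAIMABLE 1
#define MIGRATE_MOVABLE     2
#define MIGRATE_HIGHATOMIC  3
#define MIGRATE_RESERVE     4
#define MIGRATE_TYPES       5

/* ASSUMED bit values, for illustration only; not taken from the thread. */
#define GFP_WAIT_BIT        0x10u
#define GFP_RECLAIMABLE_BIT 0x80000u
#define GFP_MOVABLE_BIT     0x100000u

/*
 * Hypothetical model of the patched logic: a high-order request that
 * cannot sleep (no wait bit, or issued from interrupt context) is
 * grouped as HIGHATOMIC; everything else is grouped by mobility.
 */
static int migratetype_for(unsigned int gfp_flags, unsigned int order,
			   bool in_irq)
{
	if (order > 0 && (!(gfp_flags & GFP_WAIT_BIT) || in_irq))
		return MIGRATE_HIGHATOMIC;

	return (((gfp_flags & GFP_MOVABLE_BIT) != 0) << 1) |
		((gfp_flags & GFP_RECLAIMABLE_BIT) != 0);
}

/* Number of bits needed to store a value in the range 0..ntypes-1. */
static unsigned int bits_for_types(unsigned int ntypes)
{
	unsigned int bits = 0;

	assert(ntypes > 0);
	while ((1u << bits) < ntypes)
		bits++;
	return bits;
}
```

With these assumptions, migratetype_for(0x84020, 2, false) lands in MIGRATE_HIGHATOMIC, and bits_for_types() shows why the pageblock-flags.h hunk goes from 2 bits to 3 once MIGRATE_TYPES grows from 4 to 5.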
Re: [Bug 8464] New: autoreconf: page allocation failure. order:2, mode:0x84020
On Fri, 11 May 2007, Mel Gorman wrote:

> Nicholas, could you backout the patch
> dont-group-high-order-atomic-allocations.patch and test again please?
> The following patch has the same effect. Thanks

Great! Thanks.
Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?
On Thu, May 10, 2007 at 02:54:25PM -0700, Jeremy Fitzhardinge wrote:
> Chuck Ebbert wrote:
> > Jeremy Fitzhardinge wrote:
> >
> >> Chuck Ebbert wrote:
> >>
> >>> What CPU architecture is this happening on? Not i686 with PAE by
> >>> any chance?
> >>>
> >> Yes. Why?
> >>
> >
> > I have a bug report where NFS files are corrupted only with PAE clients.
> > Corruption is at the end of the (newly untarred) files. Doesn't happen
> > without PAE.
> >
>
> Hm, suggestive, but I'm not convinced. Two differences to this situation:
>
>    1. Immediately after the clone ("untar"), the contents are completely
>       OK; it's only after a umount/mount cycle that problems appear
>    2. There's no corruption as such; the files are just too short. And
>       it seems they're at a previously OK length, not some random size.

Just to confirm this isn't a result of a recent change, can you
reproduce this on a 2.6.20 or 2.6.21 kernel?

(sorry if you've already done this - I'm juggling so many things
at once it's easy to forget little things).

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: [GIT PATCH] ACPI patches for 2.6.22 - part 2
On Thursday 10 May 2007 16:56, Linus Torvalds wrote:
>
> On Thu, 10 May 2007, Linus Torvalds wrote:
> >
> > Seems to work for me. My evo correctly started the fan, and stopped it
> > when the temperature went down again.
>
> Looking at things in "top", I do end up occasionally seeing spikes where
> kacpid takes 17% of CPU time, and kacpi_notify takes a few percent too.
> But the machine works ok, and it doesn't seem to be horrible:
>
>    64 ?        S<     0:15 [kacpid]
>    65 ?        S<     0:08 [kacpi_notify]
>
> so they've gotten 23 seconds of CPU time over the 37 minutes that laptop
> has been up now. That's arguably too much, but on the other hand, I did
> end up trying to stress it out by doing some 3D stuff while compiling the
> kernel and doing "git grep" over the kernel tree etc.

Thanks, I noticed the same thing on an nx6325.

The goal at the moment is to revert to the simplest functional & stable
solution -- as what is shipping today crashes on some boxes.

We've got a couple of tweaks in mind where we think Linux can get smarter --
but this is an area where several platform vendors are taking advantage
of Windows' implementation in (different) twisted ways. For us to reach
our goal of Linux handling any system out there in as optimal a way as
possible, we need to study each one in detail.

That said, can you send me or point me to the acpidump output for your EVO?
Yes, I'm sure you've sent it before a long time ago, but that was
probably about 2,000,000 e-mail messages and a couple of disk crashes ago :-)

thanks,
-Len
Re: [1/2] [NET] link_watch: Move link watch list into net_device
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote:
> Yep, this patch gets rid of my spinning thread. I can't find this patch
> or any discussion on marc.info; is there a better netdev list archive?

See the "linkwatch bustage in git-net" thread on netdev

http://thread.gmane.org/gmane.linux.network/61800/focus=61812
Re: [1/2] [NET] link_watch: Move link watch list into net_device
From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Date: Thu, 10 May 2007 15:45:42 -0700

> David Miller wrote:
> > I'm not so certain now that we know it's the jiffies wrap point :-)
> >
> > The fixes in question are attached below and they were posted and
> > discussed on netdev:
> >
>
> Yep, this patch gets rid of my spinning thread. I can't find this patch
> or any discussion on marc.info; is there a better netdev list archive?

I don't see it there either... let me check my mail archive...

Indeed, they were "posted" to netdev but were blocked by the vger
regexp filters on the keyword "urgent" so that postings never made
it to the list.

I removed that filter regexp so that never happens again, sorry.
Re: [Bug 8464] New: autoreconf: page allocation failure. order:2, mode:0x84020
On Thu, 10 May 2007, Mel Gorman wrote:

> > I cannot predict how allocations on a slab will be performed. In order
> > to avoid the higher order allocations we would have to add a flag
> > that tells SLUB at slab creation time that this cache will be
> > used for atomic allocs and thus we can avoid configuring slabs in such a
> > way that they use higher order allocs.
> >
>
> It is an option. I had the gfp flags passed in to kmem_cache_create() in
> mind for determining this but SLUB creates slabs differently and different
> flags could be passed into kmem_cache_alloc() of course.

So we have a collection of flags to add

SLAB_USES_ATOMIC
SLAB_TEMPORARY
SLAB_PERSISTENT
SLAB_RECLAIMABLE
SLAB_MOVABLE

?

> Another alternative is that anti-frag used to also group high-order
> allocations together and make it hard to fallback to those areas
> for non-atomic allocations. It is currently backed out by the
> patch dont-group-high-order-atomic-allocations.patch because
> it was intended for rare high-order short-lived allocations
> such as e1000 that are currently dealt with by MIGRATE_RESERVE
> (bias-the-location-of-pages-freed-for-min_free_kbytes-in-the-same-max_order_nr_pages-blocks.patch)
> The high-order atomic groupings may help here because the high-order
> allocations are long-lived and would claim contiguous areas.
>
> The last alternative I think I mentioned already is to have the minimum
> order kswapd reclaims as the same order SLUB uses instead of 0 so that
> min_free_kbytes is kept at higher orders than current.

Would you get a patch to Nicholas to test either of these solutions?
Re: x86 setup rewrite tree ready for flamage^W review
Alexander van Heukelum wrote:
> On Thu, May 10, 2007 at 11:08:10AM -0700, H. Peter Anvin wrote:
>> As far as I could tell, "scan" simply caused the nonstandard video
>> driver scan modules (unsafe probes) to be invoked. Since those modules
>> are no longer present, there appeared to be no need for them. The VGA
>> and VESA probes are safe.
>
> It doesn't probe the hardware in dangerous ways. (Search for mode_scan
> in video.S) It works by trying to set a mode via the normal
> AH=0/AL=mode/int 0x10 method for all possible values of mode. It then
> checks if the bios reports the new mode as being set and reads a few
> standard vga registers to determine if it is a text mode. It's
> completely independent of the CONFIG_VIDEO_SVGA stuff.

It's dangerous, all right (which is why it doesn't do it by default),
since you have no guarantee that the BIOS doesn't totally vomit on these
calls -- or, like my laptop, take about a minute before giving up
finding nothing.

Anyway, I re-implemented scanning and pushed it out to the git tree;
please try it out as it does absolutely nothing on any of my machines.

> That makes me wonder: (from arch/i386/boot/pmjump.S)
>
>	movw	$__BOOT_DS, %cx
>
>	movl	%cr0, %edx
>	orb	$1, %dl		# Protected mode (PE) bit
>	movl	%edx, %cr0
>
>	movw	%cx, %ds
>	movw	%cx, %es
>	movw	%cx, %fs
>	movw	%cx, %gs
>	movw	%cx, %ss
>
>	# Jump to the 32-bit entrypoint
>	.byte	0x66, 0xea	# ljmpl opcode
> 2:	.long	0		# offset
>	.word	__BOOT_CS	# segment
>
> I thought the 32-bit jump was required to come before the segment loads.
> Does this code load values from the gdt, or are they just loaded as real
> mode segments? As long as it does not crash it does not matter, because
> head.S reloads them again.

Once CR0.PE is set, segments are loaded from the GDT.

	-hpa
Re: getcpu after sched_setaffinity
On Thu, May 10, 2007 at 03:24:58PM -0700, Ulrich Drepper wrote:
> The attached test program fails on a dual core (and probably SMP)
> machine on x86-64. Depending on where the thread starts, in one of the
> iterations the sched_setaffinity() call succeeds but then sched_getcpu()
> fails to report the correct CPU.
>
> In set_cpus_allowed migrate_task() is called if the new CPU set does not
> include the current CPU. I hope that migrate_task() also works for
> p==current.
>
> This leaves the x86-64 vgetcpu() implementation as the weak point. Is
> the caching causing problems?

Probably.

In principle getcpu() (where does the sched_ come from btw?) is only
designed for the case where you don't set the affinity explicitly;
otherwise you should already know where you are and don't need it.
The cache is optimized for the case when you run without affinity
and change CPUs only rarely (which is normal) so it is kept valid for a jiffy.
And you always need to handle an outdated result from getcpu anyway,
because you can't disable preemption from user space and could
switch any time.

In short your test case has a broken design.

> is reset?

The vsyscall/kernel can't reset the cache because it is managed by the
application. Hmm ok one could probably define memset(..., 0) as an
invalidation interface, but because of the considerations above I don't
think it is really needed.

-Andi
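A minimal sketch of the usage pattern argued for above, assuming a glibc that exposes sched_setaffinity(), sched_getaffinity() and sched_getcpu() under _GNU_SOURCE. A thread that pins itself already knows which CPU it asked for, and anything sched_getcpu() returns is treated only as a hint that may be stale. The helper names pin_to_cpu(), first_allowed_cpu() and cpu_hint() are made up for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to a single CPU; 0 on success, -1 on error. */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Lowest-numbered CPU in the current affinity mask, or -1 on error. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/*
 * Best-effort answer to "where am I running?". The caller must cope
 * with a stale value, since the thread can be preempted and moved
 * right after this returns.
 */
static int cpu_hint(void)
{
	return sched_getcpu();
}
```

After pin_to_cpu(n) succeeds, the program can simply remember n instead of asking again; cpu_hint() is meant for the unpinned case, where an occasional outdated answer is acceptable.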
Re: [PATCH] UDF: check for allocated memory for inode data
On Thu, 10 May 2007 18:00:00 +0400 Cyrill Gorcunov <[EMAIL PROTECTED]> wrote:

> This patch adds checking for granted memory while
> filling up inode data to prevent possible NULL
> pointer usage. If there is not enough memory to
> fill inode data we just mark it as "bad".
>
> Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]>
>
> Please check the patch, maybe just marking inode as
> "bad" is not a good solution.
>

yes, make_bad_inode() is appropriate here.

>
> diff --git a/fs/udf/inode.c b/fs/udf/inode.c
> index c846155..91cddae 100644
> --- a/fs/udf/inode.c
> +++ b/fs/udf/inode.c
> @@ -1144,6 +1144,13 @@ static void udf_fill_inode(struct inode *inode, struct buffer_head *bh)
>  		UDF_I_EFE(inode) = 1;
>  		UDF_I_USE(inode) = 0;
>  		UDF_I_DATA(inode) = kmalloc(inode->i_sb->s_blocksize - sizeof(struct extendedFileEntry), GFP_KERNEL);
> +		if (!UDF_I_DATA(inode))
> +		{
> +			printk(KERN_ERR "udf: udf_fill_inode(ino %ld) no free memory\n",
> +				inode->i_ino);
> +			make_bad_inode(inode);
> +			return;
> +		}

But please let's not add three copies of identical code. Do something like:

static int udf_check_inode(struct inode *inode)
{
	if (!UDF_I_DATA(inode)) {
		printk(KERN_ERR "udf: udf_fill_inode(ino %ld) no free memory\n",
			inode->i_ino);
		make_bad_inode(inode);
		return -1;
	}
	return 0;
}

	if (udf_check_inode(inode))
		return;

In fact you can also do the kmalloc in that helper function too:

static int udf_alloc_i_data(struct inode *inode, size_t size)
{
	UDF_I_DATA(inode) = kmalloc(...);
	...
}
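To make the shape of that combined helper concrete, here is a userspace sketch only, not the UDF code: struct mock_inode, its is_bad field, mock_alloc_i_data() and failing_alloc() are hypothetical stand-ins, and the allocator is passed in as a parameter so that the failure path can be exercised without running out of memory. In the kernel the allocation would be kmalloc() and the failure path would call make_bad_inode(), as in the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct inode. */
struct mock_inode {
	unsigned long i_ino;
	void *i_data;	/* plays the role of UDF_I_DATA(inode) */
	int is_bad;	/* set where the kernel would call make_bad_inode() */
};

/* Allocator signature, so tests can inject a failing allocator. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Allocate the per-inode data and handle failure in one place, so the
 * callers do not each need their own copy of the NULL check.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int mock_alloc_i_data(struct mock_inode *inode, size_t size,
			     alloc_fn alloc)
{
	assert(inode != NULL);

	inode->i_data = alloc(size);
	if (!inode->i_data) {
		inode->is_bad = 1;
		return -1;
	}
	return 0;
}

/* Allocator that always fails, to exercise the error path. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Each of the three call sites then shrinks to "if (mock_alloc_i_data(inode, size, malloc)) return;", which is the duplication the review comment asks to remove.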
Re: [1/2] [NET] link_watch: Move link watch list into net_device
David Miller wrote:
> I'm not so certain now that we know it's the jiffies wrap point :-)
>
> The fixes in question are attached below and they were posted and
> discussed on netdev:
>

Yep, this patch gets rid of my spinning thread. I can't find this patch
or any discussion on marc.info; is there a better netdev list archive?

    J
Re: [Bug 8464] New: autoreconf: page allocation failure. order:2, mode:0x84020
On (10/05/07 15:27), Christoph Lameter didst pronounce:
> On Thu, 10 May 2007, Mel Gorman wrote:
>
> > On (10/05/07 15:11), Christoph Lameter didst pronounce:
> > > On Thu, 10 May 2007, Mel Gorman wrote:
> > >
> > > > I see the gfpmask was 0x84020. That doesn't look like __GFP_WAIT was set,
> > > > right? Does that mean that SLUB is trying to allocate pages atomically? If so,
> > > > it would explain why this situation could still occur even though high-order
> > > > allocations that could sleep would succeed.
> > >
> > > SLUB is following the gfp mask of the caller like all well behaved slab
> > > allocators do. If the caller does not set __GFP_WAIT then the page
> > > allocator also cannot wait.
> >
> > Then SLUB should not use the higher orders for slab allocations that cannot
> > sleep during allocations. What could be done in the longer term is decide
> > how to tell kswapd to keep pages free at an order other than 0 when it is
> > known there are a large number of high-order long-lived allocations like
> > this.
>
> I cannot predict how allocations on a slab will be performed. In order
> to avoid the higher order allocations we would have to add a flag
> that tells SLUB at slab creation time that this cache will be
> used for atomic allocs and thus we can avoid configuring slabs in such a
> way that they use higher order allocs.
>

It is an option. I had the gfp flags passed in to kmem_cache_create() in
mind for determining this but SLUB creates slabs differently and different
flags could be passed into kmem_cache_alloc() of course.

> The other solution is not to use higher order allocations by dropping the
> antifrag patches in mm that allow SLUB to use higher order allocations.
> But then there would be no higher order allocations at all that would
> use the benefits of antifrag measures.

That would be an immediate solution.

Another alternative is that anti-frag used to also group high-order
allocations together and make it hard to fallback to those areas
for non-atomic allocations. It is currently backed out by the
patch dont-group-high-order-atomic-allocations.patch because
it was intended for rare high-order short-lived allocations
such as e1000 that are currently dealt with by MIGRATE_RESERVE
(bias-the-location-of-pages-freed-for-min_free_kbytes-in-the-same-max_order_nr_pages-blocks.patch).
The high-order atomic groupings may help here because the high-order
allocations are long-lived and would claim contiguous areas.

The last alternative I think I mentioned already is to have the minimum
order kswapd reclaims as the same order SLUB uses instead of 0 so that
min_free_kbytes is kept at higher orders than current.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
On Thu, May 10, 2007 at 05:26:20PM +0530, Amit K. Arora wrote:
> On Thu, May 10, 2007 at 10:59:26AM +1000, David Chinner wrote:
> > On Wed, May 09, 2007 at 09:31:02PM +0530, Amit K. Arora wrote:
> > > I have the updated patches ready which take care of Andrew's comments.
> > > Will run some tests and post them soon.
> > >
> > > But, before submitting these patches, I think it will be better to
> > > finalize on certain things which might be worth some discussion here:
> > >
> > > 1) Should the file size change when preallocation is done beyond EOF ?
> > > - Andreas and Chris Wedgwood are in favor of not changing the file size
> > >   in this case. I also tend to agree with them. Does anyone have an
> > >   argument in favor of changing the filesize ? If not, I will remove the
> > >   code which changes the filesize, before I resubmit the concerned ext4
> > >   patch.
> >
> > I think there needs to be both. If we don't have a mechanism to atomically
> > change the file size with the preallocation, then applications that use
> > stat() to work out if they need to preallocate more space will end up
> > racing.
>
> By "both" above, do you mean we should give the user the flexibility to choose
> whether the filesize is changed or not ? It can be done by having *two* modes for
> preallocation in the system call - say FA_PREALLOCATE and FA_ALLOCATE. If we
> use FA_PREALLOCATE mode, fallocate() will allocate blocks, but will not
> change the filesize and [cm]time. If FA_ALLOCATE mode is used, fallocate()
> will change the filesize if required (i.e. when allocation is beyond EOF)
> and also update [cm]time. This way, the application can decide what it
> wants.

Yes, that's right.

> This will be helpful for the partial allocation scenario also. Think of the
> case when we do not change the filesize in fallocate() and expect
> applications/posix_fallocate() to do ftruncate() after fallocate() for this.
> Now if fallocate() results in a partial allocation with -ENOSPC error
> returned, applications/posix_fallocate() will not know for what length
> ftruncate() has to be called. :(

Well, posix_fallocate() either gets all the space or it fails. If
you truncate to extend the file size after an ENOSPC, then that is
a buggy implementation.

The same could be said for any application, or even the fallocate()
call itself if it changes the filesize without having completely
preallocated the space asked for.

> Hence it may be a good idea to give the user the flexibility to choose whether to
> atomically change the file size with preallocation or not. But, with more
> flexibility there comes inconsistency in behavior, which is worth
> considering.

We've got different modes to specify different behaviour. That's
what the mode field was put there for in the first place - the
interface is *designed* to support different preallocation
behaviours.

> > > 2) For FA_UNALLOCATE mode, should the file system allow unallocation of
> > > normal (non-preallocated) blocks (blocks allocated via regular
> > > write/truncate operations) also (i.e. work as punch()) ?
> >
> > Yes. That is the current XFS implementation for XFS_IOC_UNRESVSP, and what
> > i did for FA_UNALLOCATE as well.
>
> Ok. But, some people may not expect/like this. I think, we can keep it on
> the backburner for a while, till other issues are sorted out.

How can it be a "backburner" issue when it defines the
implementation? I've already implemented something in XFS that
sort of does what I think that the interface is supposed to do, but I
need that interface to be nailed down before proceeding any further.

All I'm really interested in right now is that the fallocate
_interface_ can be used as a *complete replacement* for the
pre-existing XFS-specific ioctls that are already used by
applications. What ext4 can or can't do right now is irrelevant to
this discussion - the interface definition needs to take priority
over implementation.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
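As a toy illustration of the two modes discussed in this thread, the following model tracks only a file size and an allocated length. It is not the proposed system call: the mode names are taken from the thread, while struct file_model, model_fallocate() and the single free_space counter are invented for the example. It assumes the all-or-nothing behaviour described above (on -ENOSPC nothing changes), assumes space is allocated contiguously from offset 0, and ignores [cm]time.

```c
#include <assert.h>
#include <errno.h>

/* Mode names as used in the thread; the numeric values are arbitrary. */
enum fa_mode { FA_ALLOCATE, FA_PREALLOCATE };

/* Toy file: the size stat() would report, and bytes backed by blocks. */
struct file_model {
	long size;
	long allocated;
};

/*
 * Reserve space for [offset, offset + len). *free_space is how many
 * bytes the model filesystem can still hand out. On failure neither
 * the file nor the free-space counter is modified.
 */
static int model_fallocate(struct file_model *f, enum fa_mode mode,
			   long offset, long len, long *free_space)
{
	long end = offset + len;
	long need = end > f->allocated ? end - f->allocated : 0;

	assert(offset >= 0 && len > 0);
	if (need > *free_space)
		return -ENOSPC;		/* all or nothing */

	*free_space -= need;
	f->allocated += need;

	/* Only FA_ALLOCATE moves EOF, and only after full success. */
	if (mode == FA_ALLOCATE && end > f->size)
		f->size = end;
	return 0;
}
```

In this model an application that polls the size with stat() sees it change only through FA_ALLOCATE, and never as the result of a request that ended in -ENOSPC.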
Re: [PATCH 0/30] Use menuconfig objects
On Tue, 10 Apr 2007 21:17:40 +0200 (MEST) Jan Engelhardt <[EMAIL PROTECTED]> wrote:

> the following patch series turns some menus into menuconfigs, so they
> can be disabled whilst "walking" through the parent menu (check the
> videos [1], [2] to see what I mean), which allows disabling lots of
> options _quickly_.

Well Martin's little tromp through the Kconfig menus meant that I had to
repair pretty much every one of these patches.

Could you please have a look at http://userweb.kernel.org/~akpm/menuconfig/,
see if I screwed anything up?
[PATCH] locks: fix F_GETLK regression (failure to find conflicts)
In 9d6a8c5c213e34c475e72b245a8eb709258e968c we changed posix_test_lock
to modify its single file_lock argument instead of taking separate input
and output arguments. This makes it no longer safe to set the output
lock's fl_type to F_UNLCK before looking for a conflict, since that
means searching for a conflict against a lock with type F_UNLCK.

This fixes a regression which causes F_GETLK to incorrectly report no
conflict on most filesystems (including any filesystem that doesn't do
its own locking).

Also fix posix_lock_to_flock() to copy the lock type. This isn't
strictly necessary, since the caller already does this; but it seems
less likely to cause confusion in the future.

Thanks to Doug Chapman for the bug report.

Signed-off-by: "J. Bruce Fields" <[EMAIL PROTECTED]>
---
 fs/locks.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 671a034..8ec16ab 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -669,7 +669,6 @@ posix_test_lock(struct file *filp, struct file_lock *fl)
 {
 	struct file_lock *cfl;
 
-	fl->fl_type = F_UNLCK;
 	lock_kernel();
 	for (cfl = filp->f_path.dentry->d_inode->i_flock; cfl; cfl = cfl->fl_next) {
 		if (!IS_POSIX(cfl))
@@ -681,7 +680,8 @@ posix_test_lock(struct file *filp, struct file_lock *fl)
 		__locks_copy_lock(fl, cfl);
 		unlock_kernel();
 		return 1;
-	}
+	} else
+		fl->fl_type = F_UNLCK;
 	unlock_kernel();
 	return 0;
 }
@@ -1632,6 +1632,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
 	flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
 		fl->fl_end - fl->fl_start + 1;
 	flock->l_whence = 0;
+	flock->l_type = fl->fl_type;
 	return 0;
 }
--
1.5.1.1.107.g7a159
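The ordering problem described in the changelog above can be reproduced in a small userspace model. This is a simplified sketch, not the code in fs/locks.c: struct model_lock, conflicts() and model_test_lock() are invented names, and the conflict rule is reduced to "different owner, overlapping range, and at least one side is a write lock". The clobber_first flag selects the old ordering, in which the request type is overwritten before the search.

```c
#include <assert.h>
#include <stddef.h>

enum lock_type { LK_READ, LK_WRITE, LK_UNLOCK };

/* Simplified lock record; ranges are inclusive byte offsets. */
struct model_lock {
	enum lock_type type;
	long start, end;
	int owner;
	struct model_lock *next;
};

/* Simplified conflict rule between a request and a held lock. */
static int conflicts(const struct model_lock *req,
		     const struct model_lock *held)
{
	if (req->owner == held->owner)
		return 0;
	if (req->end < held->start || held->end < req->start)
		return 0;
	return req->type == LK_WRITE || held->type == LK_WRITE;
}

/*
 * Look for a held lock that conflicts with *fl. As with the
 * single-argument posix_test_lock(), *fl is both input and output:
 * on a conflict it is overwritten with the conflicting lock and 1 is
 * returned, otherwise its type becomes LK_UNLOCK and 0 is returned.
 * clobber_first != 0 models the regression.
 */
static int model_test_lock(const struct model_lock *held,
			   struct model_lock *fl, int clobber_first)
{
	assert(fl != NULL);

	if (clobber_first)
		fl->type = LK_UNLOCK;	/* request type lost before the search */

	for (; held; held = held->next) {
		if (conflicts(fl, held)) {
			fl->type = held->type;
			fl->start = held->start;
			fl->end = held->end;
			fl->owner = held->owner;
			return 1;
		}
	}
	fl->type = LK_UNLOCK;
	return 0;
}
```

In this model, a write request against a range that another owner holds a read lock on is reported as a conflict with clobber_first == 0 and is silently missed with clobber_first == 1.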
Re: [Bug 8464] New: autoreconf: page allocation failure. order:2, mode:0x84020
On (10/05/07 14:49), Christoph Lameter didst pronounce:
> On Thu, 10 May 2007, Andrew Morton wrote:
>
> > Christoph, can we please take a look at /proc/slabinfo and its slub
> > equivalent (I forget what that is?) and review any and all changes to the
> > underlying allocation size for each cache?
> >
> > Because this is *not* something we should change lightly.
>
> It was changed specially for mm in order to stress the antifrag code. If
> this causes trouble then do not merge the patches against SLUB that
> exploit the antifrag methods. This failure should help see how effective
> Mel's antifrag patches are. He needs to get on this discussion.
>

The antifrag mechanism depends on the caller being able to sleep and reclaim
pages if necessary to get the contiguous allocation. No attempts are
currently being made to keep pages at a particular order free.

I see the gfpmask was 0x84020. That doesn't look like __GFP_WAIT was set,
right? Does that mean that SLUB is trying to allocate pages atomically? If so,
it would explain why this situation could still occur even though high-order
allocations that could sleep would succeed.

> Upstream has slub_max_order=1.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
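The question about the 0x84020 mask can be checked with a few lines of bit arithmetic. This is a sketch under one assumption: that the wait bit is 0x10, which should be verified against include/linux/gfp.h in the tree that produced the report. The helper names mask_may_sleep() and split_mask() are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASSUMED value of __GFP_WAIT; verify against the tree in question. */
#define SKETCH_GFP_WAIT 0x10u

/* True if the mask would allow the allocator to sleep. */
static bool mask_may_sleep(unsigned int gfp_mask)
{
	return (gfp_mask & SKETCH_GFP_WAIT) != 0;
}

/*
 * Split a mask into its individual set bits, lowest first.
 * Writes at most max entries to out and returns how many were written.
 */
static size_t split_mask(unsigned int gfp_mask, unsigned int *out,
			 size_t max)
{
	size_t n = 0;

	assert(out != NULL);
	for (unsigned int bit = 1; bit != 0 && n < max; bit <<= 1)
		if (gfp_mask & bit)
			out[n++] = bit;
	return n;
}
```

For 0x84020 the set bits are 0x20, 0x4000 and 0x80000; none of them is 0x10, so under the stated assumption the request could not sleep. Which flag each remaining bit corresponds to depends on the gfp.h of the tree being tested.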