Re: Wow! Is memory ever cheap!
On Tue, 8 May 2001 22:22:10 -0700, Larry McVoy <[EMAIL PROTECTED]> wrote:

>
>Just to make sure you understand: I think ECC is a fine thing. If I'm
>running systems with no other integrity checks, I'll take ECC and like it.
>However, having ECC does not mean that I trust that my data is safe,
>that is most certainly not a true statement. The bus, the disks, the
>disk controller, the disk driver, the buffer cache, etc, can all corrupt
>the data. Oh, yeah, let's not forget NFS. I have seen each and every
>one of those things corrupt data.

This is an interesting observation of a truth that was well known in the second generation computers of the 1950s and 1960s. I first worked at John Hancock... they had a bunch of 7074 machines. All those systems made use of programmed checksums in each tape block and in each full file. The reason was that those machines did not have ECC... they did have parity checking if I remember right.

With IBM's third generation computers (S/360s) and probably other manufacturers, ECC became a standard feature. Parity checking was added through different data paths such as channel memory, buffer memory, etc. There was so much protection added that the programmed checksums became superfluous.

There were still odd moments. I remember working on an Amdahl computer problem where some internal data paths... where the contents of one register moved to an internal storage area... and the path did not have parity. There was a machine fault... the path was electrically open, so the contents of the register always became zero. But since it wasn't parity checked, there was no machine check.

I remember another problem on the IBM 3033. Cosmic rays (really) caused one bit errors in channel memory. That was parity but not ECC so you got a weird channel check. Back at the diagnosis ranch, the board looked good. It was only when someone noticed that the rate of such problems was proportional to the height above sea level that the light bulb went on.

The lesson is that when paths are not checked, hardware or software, data being held or transformed can change. Old lesson but a good one to know.

john alvord
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
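The difference between a parity-checked path and an unchecked one can be shown with a small sketch in C. This is only an illustration and is not taken from any of the machines mentioned above; `parity8` and `parity_error` are made-up names. It shows that a single parity bit flags a one-bit error but stays silent when two bits flip, which is why parity gives a check condition rather than a correction.

```c
#include <stdint.h>

/* Even-parity bit for one byte: 1 if the byte has an odd number of set bits. */
static unsigned parity8(uint8_t b)
{
    b ^= b >> 4;   /* fold the high nibble onto the low nibble */
    b ^= b >> 2;   /* fold again, down to two bits */
    b ^= b >> 1;   /* and down to one bit */
    return b & 1u;
}

/* Returns 1 if the stored parity bit no longer matches the data (hypothetical helper). */
static int parity_error(uint8_t data, unsigned stored_parity)
{
    return parity8(data) != stored_parity;
}
```

A path with no such bit at all, like the open data path in the Amdahl story, has nothing to compare against, so a register that reads back as zero looks as valid as any other value.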
Re: what causes Machine Check exception? revisited (2.2.18)
On Mon, May 07, 2001 at 11:57:17AM +0100, Alan Cox wrote:
> Generally it indicates a CPU problem but I've seen it caused by overclocking
> and poorly fitted heatsinks

I've been able to trigger a Machine check error on PPC when trying to boot directly from OF with a COFF kernel. The system has worked perfectly with BootX.

I wonder why this is the first non-x86 report...

Mike
tunables??
Hi All,

1. Is there a file which contains all the tunables?
2. Are the variables free_pages_high and free_pages_low (which kswapd looks at after the timer expires) tunable parameters?
3. What range of addresses separates the Normal, High Memory and DMA zones?

Thanks & Regards,
Srikanth
Re: Wow! Is memory ever cheap!
On Tue, 8 May 2001, Larry McVoy wrote:

> which is a text version of the paper I mentioned before. The basic
> message of the paper is that it really doesn't help much to have things
> like ECC unless you can be sure that 100% of the rest of your system
> has similar checks.

UDMA has crc, scsi has parity, pci has (i think) parity, tcpip has crc, your cpu l1 and l2 have ecc... Looks like similar checks are already there.

-Dan
Re: Wow! Is memory ever cheap!
On Wed, May 09, 2001 at 12:24:25AM -0400, Marty Leisner wrote:
> I'm confused by the "lets not use ECC and use bk" talk.

I'll take a pass at unconfusing you, I can see how you might be.

I wish I had never mentioned BK, that was never the point. End to end was the point, BK was just an example and now I'm getting accused of bringing up the whole thread as a BK advertisement. Which completely misses the point.

Please go read

http://www.google.com/search?q=cache:web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf+clark+end+to+end=en

which is a text version of the paper I mentioned before. The basic message of the paper is that it really doesn't help much to have things like ECC unless you can be sure that 100% of the rest of your system has similar checks.

The point was made again, but apparently missed here, when I pointed out that Linux's disk subsystem passes up bad data when it knows there may be a problem. ECC will not help you in this case, the data was bad before it hit memory. So now you have carefully error corrected BAD DATA. See the point? ECC doesn't help unless every other component is equally careful; those components include software and hardware. You can fix that chunk of software and then I'll go find a rogue disk controller that breaks the datapath, there are plenty to choose from.

Just to make sure you understand: I think ECC is a fine thing. If I'm running systems with no other integrity checks, I'll take ECC and like it. However, having ECC does not mean that I trust that my data is safe, that is most certainly not a true statement. The bus, the disks, the disk controller, the disk driver, the buffer cache, etc, can all corrupt the data. Oh, yeah, let's not forget NFS. I have seen each and every one of those things corrupt data.

As to the BitKeeper stuff, those of you who think this is a BitKeeper discussion are off whacking in the weeds. The point isn't that BitKeeper is good because it has integrity checks, the point is that integrity checks are a good thing. Period. BitKeeper was just an example. If there was a Linux filesystem that had built in integrity checks (and I knew about it, for all I know there is one), then I would have used that as the example. I used BitKeeper as an example because I know it and I can point to numerous cases where it exposed problems that ECC would not have caught. Ask Dave Miller about the mmap/read sparc linux cache aliasing bug that BK exposed, that one was nasty.

Let's review: ECC is nice, but it doesn't solve all data corruption problems. Applications which do their own end to end data integrity checks will catch many more error cases than what ECC catches. My efforts in this thread had nothing to do with BitKeeper, they were trying to get people to realize that end to end is good, and ECC isn't end to end.

Examples of end to end applications, which I should have thought of sooner, are the md5sums on ftp.kernel.org, the integrity checks in rpms, crcs in cpio. I'm sure you can think of lots of others, this is an old problem.

> My understanding is suns big machines stopped using ecc and they

The SUN problem was a cache problem and there is no way that I believe that SUN would turn off ECC in the cache. There are good reasons for not doing so. If you think through the end to end argument, you will see that you have no way to do checks on the data path into/out of the processor. If that part of the datapath is not checked then no amount of checking elsewhere does any good, the processor can be corrupting your data and never know it. If SUN was so stupid as to remove this, then it is a dramatically different place.

I heard that there was a bug in the cache controller, I never heard that they had removed ECC. If you really want to know I can ask, I know at least one of the guys who works on that stuff there.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
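The end-to-end idea described in this message can be sketched in a few lines of C. This is a minimal illustration only, not how BitKeeper, rpm or cpio actually store their checks; `crc32_buf`, `struct sealed_block`, `seal` and `verify` are hypothetical names, and the 64-byte payload limit is arbitrary. The producer records a CRC next to the data, and the final consumer recomputes it, so corruption introduced anywhere in between (bus, driver, cache, network) is caught regardless of what each hop checks on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 using the common reflected polynomial 0xEDB88320. */
static uint32_t crc32_buf(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Payload plus the checksum computed by whoever produced it. */
struct sealed_block {
    uint32_t crc;
    size_t len;
    unsigned char payload[64];
};

/* Producer side: copy the data in and record its checksum. Returns 0 on success. */
static int seal(struct sealed_block *blk, const void *src, size_t len)
{
    if (len > sizeof(blk->payload))
        return -1;
    memcpy(blk->payload, src, len);
    blk->len = len;
    blk->crc = crc32_buf(blk->payload, len);
    return 0;
}

/* Consumer side: returns 1 only if the payload still matches its checksum. */
static int verify(const struct sealed_block *blk)
{
    return blk->len <= sizeof(blk->payload) &&
           crc32_buf(blk->payload, blk->len) == blk->crc;
}
```

The check lives with the data rather than with any one component, so it keeps working no matter which layer in the middle misbehaves.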
Re: Wow! Is memory ever cheap!
I'm confused by the "lets not use ECC and use bk" talk.

My understanding is Sun's big machines stopped using ecc and they started to have "random" problems running big-iron applications that took them a while to figure out (and a lot of bad press) and can only be rectified in the big cycle (this was last year so it's probably solved now).

I thought one of the primary reasons to have ecc is to catch weird things before they become catastrophic...and at least know WHY weirdness is happening...

marty
nfs warning: mount version older than kernel
Where can I get the latest "mount"?

# mount -t nfs lo:/ /mnt
NFS: NFSv3 not supported.
nfs warning: mount version older than kernel
# mount --version
mount: mount-2.11a

Thanks,
Jeff
[ [EMAIL PROTECTED] ]
Re: page_launder() bug
>That said, anyone who doesn't understand the former should probably
>get some more C experience before commenting on others' code...

I understood it, but it looked very much like a typo.

--
from:     Jonathan "Chromatix" Morton
mail:     [EMAIL PROTECTED] (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-----END GEEK CODE BLOCK-----
Re: SPARC include problem
The include error was in kernel/sched.c . Should I rewrite the includes for this file to include include/asm/irq.h over include/linux/irq.h? I temporarily bypassed this problem by creating a blank asm/hw_irq.h .

I also ran into a compile problem in arch/sparc/kernel/sparc_ksyms.c . The rw semaphores seem to be undeclared. Here are the warnings:

D__KERNEL__ -I/usr/src/linux-2.4.4/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -m32 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7 -DEXPORT_SYMTAB -c sparc_ksyms.c
In file included from /usr/src/linux-2.4.4/include/linux/sched.h:9,
                 from sparc_ksyms.c:17:
/usr/src/linux-2.4.4/include/linux/binfmts.h:45: warning: `struct mm_struct' declared inside parameter list
/usr/src/linux-2.4.4/include/linux/binfmts.h:45: warning: its scope is only this definition or declaration, which is probably not what you want.
sparc_ksyms.c:121: `___down_read' undeclared here (not in a function)
sparc_ksyms.c:121: initializer element is not constant
sparc_ksyms.c:121: (near initialization for `__ksymtabdown_read.value')
sparc_ksyms.c:122: `___down_write' undeclared here (not in a function)
sparc_ksyms.c:122: initializer element is not constant
sparc_ksyms.c:122: (near initialization for `__ksymtabdown_write.value')
sparc_ksyms.c:123: `___up_read' undeclared here (not in a function)
sparc_ksyms.c:123: initializer element is not constant
sparc_ksyms.c:123: (near initialization for `__ksymtabup_read.value')
sparc_ksyms.c:124: `___up_write' undeclared here (not in a function)
sparc_ksyms.c:124: initializer element is not constant
sparc_ksyms.c:124: (near initialization for `__ksymtabup_write.value')
make[1]: *** [sparc_ksyms.o] Error 1
make[1]: Leaving directory `/usr/src/linux-2.4.4/arch/sparc/kernel'
make: *** [_dir_arch/sparc/kernel] Error 2

Thank you,
Sean

Erik Mouw wrote:
>
> On Mon, May 07, 2001 at 05:01:03PM -0500, Sean Jones wrote:
> > In compiling 2.4.4-ac5 for my SPARCStation 20, I had an error in the
> > compile resulting from the inability to find a hw_irq.h in the
> > include/asm directory. Do you know where I may be able to find such a
> > file?
>
> You don't. I discussed this last week with Russell King: the ARM port
> also doesn't have the file hw_irq.h in include/asm-arm. According to
> Russell it is only needed in the arch dependent subdirectories, and not
> in the drivers.
>
> Any driver that includes linux/irq.h is not written to be portable. The
> only generic driver that includes it is driver/pcmcia/hd64465_ss.c, but
> on second glance it's a Hitachi HD64465 specific driver anyway.
>
> Erik
>
> --
> J.A.K. (Erik) Mouw, Information and Communication Theory Group, Department
> of Electrical Engineering, Faculty of Information Technology and Systems,
> Delft University of Technology, PO BOX 5031, 2600 GA Delft, The Netherlands
> Phone: +31-15-2783635 Fax: +31-15-2781843 Email: [EMAIL PROTECTED]
> WWW: http://www-ict.its.tudelft.nl/~erik/
Re: ECN: Volunteers needed
> This was the big argument I was running into from sites, "well it
> isn't standard yet, when it is we'll do something about it". The
> larger sites like to avoid updates until absolutely necessary.

Good grief - nothing like planning ahead ... and these large-site administrators actually accept paychecks for their lack of foresight?

Billy
Re: ECN: Volunteers needed
On Tue, 8 May 2001, jamal wrote:

> Any one wishing to volunteer, please still send your emails in --
> we should be ready in a few days from now,
>

I guess i should have mentioned the IESG is sitting in to approve ECN as proposed standard in about a week or so.

cheers,
jamal
Re: nfs MAP_SHARED corruption fix
On Tue, May 08, 2001 at 05:21:02PM +0200, Trond Myklebust wrote:
> AFAICs this fix will clearly deadlock...

yeah, it didn't trigger because it probably needs the same page to be in writepage and in the dirty list at the same time.

I hooked it very deep into the writeback logic to keep it generic (it wasn't going to add a significant overhead) but it didn't need to be _that_ deep. Even worse, I think it was partly wrong because it was only in the close(2) path but not in the fput path, which is the one walked by munmap.

This looks better to me, what do you think?

diff -urN ref/fs/nfs/file.c nfs-corruption/fs/nfs/file.c
--- ref/fs/nfs/file.c Thu Feb 22 03:45:10 2001
+++ nfs-corruption/fs/nfs/file.c Tue May 8 19:11:57 2001
@@ -39,6 +39,7 @@
 static ssize_t nfs_file_write(struct file *, const char *, size_t, loff_t *);
 static int nfs_file_flush(struct file *);
 static int nfs_fsync(struct file *, struct dentry *dentry, int datasync);
+static void nfs_file_close_vma(struct vm_area_struct *);
 
 struct file_operations nfs_file_operations = {
         read: nfs_file_read,
@@ -57,6 +58,11 @@
         setattr: nfs_notify_change,
 };
 
+static struct vm_operations_struct nfs_file_vm_ops = {
+        nopage: filemap_nopage,
+        close: nfs_file_close_vma,
+};
+
 /* Hack for future NFS swap support */
 #ifndef IS_SWAPFILE
 # define IS_SWAPFILE(inode) (0)
@@ -104,6 +110,20 @@
         return result;
 }
 
+static void nfs_file_close_vma(struct vm_area_struct * vma)
+{
+        struct inode * inode;
+
+        inode = vma->vm_file->f_dentry->d_inode;
+
+        if (inode->i_state & I_DIRTY_PAGES) {
+                filemap_fdatasync(inode->i_mapping);
+                lock_kernel();
+                nfs_wb_file(inode, vma->vm_file);
+                unlock_kernel();
+        }
+}
+
 static int
 nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
 {
@@ -115,8 +135,11 @@
                 dentry->d_parent->d_name.name, dentry->d_name.name);
 
         status = nfs_revalidate_inode(NFS_SERVER(inode), inode);
-        if (!status)
+        if (!status) {
                 status = generic_file_mmap(file, vma);
+                if (!status)
+                        vma->vm_ops = &nfs_file_vm_ops;
+        }
         return status;
 }

Andrea
Re: ECN: Volunteers needed
On Tue, 8 May 2001, David S. Miller wrote:

>
> I believe it would only be prudent to actually send out these messages
> starting at the moment ECN is officially standard.
>
> This was the big argument I was running into from sites, "well it
> isn't standard yet, when it is we'll do something about it". The
> larger sites like to avoid updates until absolutely necessary.
>
> If we are to improve ECN deployment, we should understand the
> priorities of the people who run the sites which stand in the way
> of our doing so.
>

Sally's new draft:
ftp://ftp.normos.org/ietf/internet-drafts/draft-floyd-tcp-reset-00.txt
builds a strong case against the RST issue and may be used to point to problems despite ECN. But you are right, it will be stronger and wiser to wait until this is standardized before paying a visit to the site owners.

Any one wishing to volunteer, please still send your emails in -- we should be ready in a few days from now,

cheers,
jamal
Re: ECN: Volunteers needed
jamal writes:
 > Help is needed to contact these site owners and politely using a standard
 > email ask them that their site was non-conformant.
 > Point them to Sally's draft and the fact that ECN is becoming standard
 > in the next week or so. Also to Jeff's ECN-under-Linux Unofficial
 > Vendor Support Page, and to encourage them to have their firewall
 > or load-balancer upgraded.
 > I suppose the first volunteer needed is to draft such an email. We have to
 > be polite and persistent for this to work.

I believe it would only be prudent to actually send out these messages starting at the moment ECN is officially standard.

This was the big argument I was running into from sites, "well it isn't standard yet, when it is we'll do something about it". The larger sites like to avoid updates until absolutely necessary.

If we are to improve ECN deployment, we should understand the priorities of the people who run the sites which stand in the way of our doing so.

Later,
David S. Miller
[EMAIL PROTECTED]
[PATCH] RAID5 NULL Checking Bug Fix
Hi,

In drivers/md/raid5.c, the author does not check to see if alloc_page() returns NULL. This patch also adds checks that return 1 (following the error-path convention in the respective function).

Please discard this e-mail if this patch is irrelevant to you. I just tried to be thorough.

Thank you,
David Chan

---snip
--- drivers/md/raid5.c.orig Tue May 8 19:17:22 2001
+++ drivers/md/raid5.c Tue May 8 19:20:07 2001
@@ -157,17 +157,21 @@
                 memset(bh, 0, sizeof (struct buffer_head));
                 init_waitqueue_head(&bh->b_wait);
                 page = alloc_page(priority);
+                if (!page)
+                        goto nomem_path;
                 bh->b_data = page_address(page);
-                if (!bh->b_data) {
-                        kfree(bh);
-                        return 1;
-                }
+                if (!bh->b_data)
+                        goto nomem_path;
                 atomic_set(&bh->b_count, 0);
                 bh->b_page = page;
                 sh->bh_cache[i] = bh;
 
         }
         return 0;
+
+nomem_path:
+        kfree(bh);
+        return 1;
 }
 
 static struct buffer_head *raid5_build_block (struct stripe_head *sh, int i);
---snip---
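The pattern the patch uses, checking each allocation and jumping to a single cleanup label on failure, can be tried out in user space. The sketch below is an illustration under stated assumptions, not the raid5 code: `struct buf_head`, `grow_one` and the `alloc_fn` hook are hypothetical stand-ins, with `malloc`/`free` playing the role of `kmalloc`/`alloc_page`/`kfree`. The allocator is passed in as a function pointer only so that the failure path can be exercised.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct buffer_head: just a data pointer. */
struct buf_head {
    void *data;
};

/* Allocator hook, so a test can inject a failing allocator. */
typedef void *(*alloc_fn)(size_t);

/* Allocate one header plus its data page; returns 0 on success, 1 on failure. */
static int grow_one(struct buf_head **slot, alloc_fn alloc_page_like)
{
    struct buf_head *bh = malloc(sizeof(*bh));

    if (!bh)
        return 1;
    memset(bh, 0, sizeof(*bh));

    bh->data = alloc_page_like(4096);
    if (!bh->data)
        goto nomem_path;      /* check before the pointer is ever used */

    *slot = bh;
    return 0;

nomem_path:
    free(bh);                 /* single exit point releases the header */
    return 1;
}

/* Allocator that always fails, for exercising the error path. */
static void *always_fail(size_t n)
{
    (void)n;
    return NULL;
}
```

Calling `grow_one(&slot, always_fail)` returns 1 and leaves `slot` untouched, while `grow_one(&slot, malloc)` returns 0 with both allocations in place. Keeping one label per cleanup level means a newly added check only needs a `goto`, not another copy of the cleanup code.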
blkdev in pagecache
Last night I moved the blkdev layer into the pagecache in this patch:

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.5pre1/blkdev-pagecache-1

It is incremental and depends on the o_direct functionality; the latest o_direct patch against 2.4.5pre1 is here:

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.5pre1/o_direct-5

The main reason I moved the blkdev into the pagecache is that the current blkdev provides horrible performance with fast I/O subsystems capable of over 50mbyte/sec, which I just increased x2 with a simple hack that you can see here if you're curious:

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.5pre1aa2/00_4k_block_dev-1

(btw, the current rawio also uses a 512byte bh->b_size granularity that is even worse than the 1024byte b_size of the blkdev; O_DIRECT is much smarter on this side as it uses the softblocksize of the fs, which can be 4k if you created the fs with -b 4096)

However, after running this 4k_block_dev-1 hack on some more machines I noticed the blkdev layer wasn't able to update the superblock of 1k ext2 filesystems anymore, and to make it "usable" in real life I needed to fix it. But I didn't want to invest any further time in such a hack, and I preferred to move the blkdev into the pagecache and to fix the problem on top of the new, better design (moving blkdev into the pagecache of course introduces that same problem too, as I also mention in one of the points below).

I'll describe here some of the details of the blkdev-pagecache-1 patch:

- /dev/raw* and drivers/char/raw.c get obsoleted and replaced by opening the blkdevice with O_DIRECT; it looks much saner and I basically get it for free by just implementing 10 lines of the blkdev_direct_IO callback. Of course I didn't remove the /dev/raw* API, for compatibility. While testing O_DIRECT I destroyed the first 50mbyte of the root partition, so I will need to wait for the test box to come back alive before I can do further testing ;). But I fixed the bug that caused the corruption before uploading the patch, so I don't expect further problems (it was only a s/i_dev/i_rdev thing), because the regression testing was working well even if it was writing to the wrong disk ;).

- I force the virtual blocksize for all the blkdev I/O (buffered and direct) to work with a 4096 byte granularity instead of the current 1024 softblocksize because we need that for getting higher performance; 1024 is too low because it wastes too much ram and too much cpu. So a DBMS won't be able anymore to write 512bytes to the disk using rawio being sure it will be a single atomic block update. If you use /dev/raw nothing changed of course; only opening the blkdev with O_DIRECT enforces a minimal granularity of 4096 bytes in the I/O. I don't think this is a problem, and O_DIRECT through the fs was already using the fs softblocksize instead of the hardblocksize as the unit of the minimal direct-IO granularity.

- writes to the blockdevice won't end up in the buffer cache, so it will be impossible to update the superblock of an ext2 partition mounted ro, for example; it must not be mounted at all to update the superblock. I will need to invent a hack to fix this problem or it will get too annoying. One way could be simply to change ext2 and have it check that the buffer is uptodate before marking it dirty again, but maybe we could also do it in a generic manner that fixes all the fs at once (OTOH probably not that many fs need to be fscked online...).

- mmap should be functional but it's totally untested.

- currently the last `harddisk_size & 4095' bytes (if any) won't be accessible via the blkdev, to avoid sending requests beyond the end of the device to the hardware. Not sure how/if to solve this. But this is definitely not a new issue, the same thing happens today in 2.2 and 2.4 after you mount a 4k filesystem on a blockdevice. OTOH I'm scared a mke2fs -b 1024 could get confused. But I really don't want to decrease the b_size of the buffer header even if we fix this.

- to share all the filemap.c code and not to change too much stuff in the first patch I added some ISBLK checks in fast paths, basically only to check against blk_size instead of inode->i_size. I also considered changing the i_size semantics for the blkdev inodes but I didn't want to break all the fs yet, so I took the localized slower way for now (I doubt it is noticeable in the benchmarks but nevertheless it would be nice to optimize away those branches).

- once the blkdev is closed, in the block_close callback I filemap_fdatasync;fsync_dev;filemap_fdatawait;invalidate_inode_pages2 (fdatawait seems not necessary but it won't hurt). I'm not calling truncate_inode_pages because those pages could be still mapped (->release is called when f_count goes down to zero, not when i_count reaches zero). I'd like to defer the
ECN: Volunteers needed
Folks,

ECN is about to become a Proposed Standard RFC. Thanks to efforts from the Linux community, a few issues were discovered in the course of deploying the code. Special kudos go to Alexey Kuznetsov and David Miller.

I won't go into details of the issues other than to say some middle-box vendors in the past have taken the natural-language English word "reserved" to have a different meaning. Visit Jeff Garzik's ECN-under-Linux Unofficial Vendor Support Page at:
http://gtf.org/garzik/ecn/
for more details.

Sally Floyd explains best why it is wrong for vendors of middle boxes to be doing this in the draft to be found at:
ftp://ftp.normos.org/ietf/internet-drafts/draft-floyd-tcp-reset-00.txt

So why am i posting this?

This is to solicit volunteers who will help remove the remaining cruft. Some vendors (special positive mention goes to CISCO) have released patches which are unfortunately not being propagated by some of the site owners.

Help is needed to contact these site owners and politely, using a standard email, tell them that their site is non-conformant. Point them to Sally's draft and the fact that ECN is becoming standard in the next week or so. Also point them to Jeff's ECN-under-Linux Unofficial Vendor Support Page, and encourage them to have their firewall or load-balancer upgraded. I suppose the first volunteer needed is one to draft such an email. We have to be polite and persistent for this to work.

Jitendra Padhye at ACIRI is running weekly tests to detect offending sites. Most recent results can be found at:
http://www.aciri.org/tbit/ecn_test3A.html
Any site with the word "RST" on the line should be considered non-conformant.

Volunteers please send an email to [EMAIL PROTECTED] with subject "interested in volunteering"

Flames etc please redirect to netdev (since that's the only list i am on). as well make sure you cc the other people (other than linux-kernel and linux-net)

cheers,
jamal
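For readers who want to see what the "reserved" bits look like on the wire, here is a hedged sketch in C. The thread itself does not spell out the bit layout; the flag values below are the usual TCP header flag-byte positions, with ECE and CWR occupying two bits that used to be reserved, and that detail is an assumption added here rather than something stated in the message. The function names are hypothetical. The first helper recognizes a SYN that asks for ECN, the second shows the tolerant behaviour a middle box would need: classify the packet by the bits it understands and ignore the ones it does not.

```c
#include <stdint.h>

/* TCP header flag bits (low byte of the flags field). */
#define TCP_FLAG_SYN 0x02u
#define TCP_FLAG_RST 0x04u
#define TCP_FLAG_ACK 0x10u
#define TCP_FLAG_ECE 0x40u   /* formerly reserved */
#define TCP_FLAG_CWR 0x80u   /* formerly reserved */

/* A SYN that requests ECN: SYN, ECE and CWR set, ACK clear. */
static int is_ecn_setup_syn(uint8_t flags)
{
    uint8_t want = TCP_FLAG_SYN | TCP_FLAG_ECE | TCP_FLAG_CWR;

    return (flags & (want | TCP_FLAG_ACK)) == want;
}

/* Tolerant check: mask off the formerly reserved bits before deciding
 * whether this is an ordinary connection attempt. */
static int is_plain_syn_ignoring_ecn(uint8_t flags)
{
    uint8_t known = flags & (uint8_t)~(TCP_FLAG_ECE | TCP_FLAG_CWR);

    return known == TCP_FLAG_SYN;
}
```

A box that instead requires the formerly reserved bits to be zero will reject the ECN-requesting SYN, which is the kind of behaviour that would show up as "RST" in the test results mentioned above.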
Re: page_launder() bug
In message <[EMAIL PROTECTED]> you write:
>
> Jonathan Morton writes:
>  > >- page_count(page) == (1 + !!page->buffers));
>  >
>  > Two inversions in a row?
>
> It is the most straightforward way to make a '1' or '0'
> integer from the NULL state of a pointer.

Overall, I'd have to say that this:

-               dead_swap_page =
-                       (PageSwapCache(page) &&
-                        page_count(page) == (1 + !!page->buffers));
-

Is nicer as:

                int dead_swap_page = 0;

                if (PageSwapCache(page)
                    && page_count(page) == (page->buffers ? 2 : 1))
                        dead_swap_page = 1;

After all, the second is what the code *means* (1 and 2 are magic numbers).

That said, anyone who doesn't understand the former should probably get some more C experience before commenting on others' code...

Rusty.
--
Premature optmztion is rt of all evl. --DK
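The `!!` idiom discussed in this exchange is easy to check in isolation. The following is a small user-space sketch, assuming a made-up `struct page_like` in place of the kernel's `struct page`; both helper names are hypothetical. It shows that `1 + !!ptr` yields 2 for a non-NULL pointer and 1 for NULL, so the equivalent conditional expression has to be written with 2 as the non-NULL arm.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the field the idiom needs. */
struct page_like {
    void *buffers;
};

/* Double negation: the first ! maps NULL to 1 and non-NULL to 0,
 * the second ! flips that to 0 and 1 respectively. */
static int expected_count_bangbang(const struct page_like *p)
{
    return 1 + !!p->buffers;
}

/* The same value spelled out with a conditional expression. */
static int expected_count_ternary(const struct page_like *p)
{
    return p->buffers ? 2 : 1;
}
```

Getting the ternary arms the wrong way round is an easy slip when translating the idiom by hand, which is a fair argument for either keeping `!!` or giving the two counts names.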
Re: 2.2.19 + reiserfs 3.5.32 nfsd wait_on_buffer/down_failed
On Tuesday, May 08, 2001 04:42:43 PM +0200 Michael Stiller <[EMAIL PROTECTED]> wrote:

> Hi,
>
> we run a nfs server utilizing 2.2.19 + ReiserFS version 3.5.32 on a
> P 3 550 machine. Disk subsystem is a GDT7518RN using 4 UW disks as raid 5
> device. After upgrading from 2.2.17 + reiserfs to 2.2.19 we experience
> many (very much more than with 2.2.17) problems with our nfs clients
> about 12 (linux). Network is 100Mbit full duplex / switched.
> I do not think this is network related, cause ping -f doesn't show any
> packet loss.
>
> During not so heavy IO on the exported fs
> one nfsd thread seems to be waiting for the disk:

Are you running any patches to make knfsd deal with the reiserfs iget issues?

-chris
Re: page_launder() bug
Marcelo Tosatti writes:
 > Ok, this patch implements that thing and also changes ext2+swap+shm
 > writepage operations (so I could test the thing).
 >
 > The performance is better with the patch on my restricted swapping tests.

Nice. Now the only bit left is moving the referenced bit checking and/or state into writepage as well. This is still part of the plan, right?

Later,
David S. Miller
[EMAIL PROTECTED]
Re: REVISED: Experimentation with Athlon and fast_page_copy
Alan Cox wrote: > > > the memory copy in the fast_page_copy routine. The machine then > > proceeded > > not to stop at my panic, but I got my "normal" oopses. I then had an > > Ok > > > idea and removed all the prefetch instructions from the beginning of the > > routine and tried the resultin kernel. I now have no crashes. > > What could this mean? > > I think it has to mean a hardware problem. I don't think so, reasons below > What still stands out is that exactly _zero_ people have reported the same > problem with non VIA chipset Athlons. Not any more :-( Hi Alan, IIRC this thread is about boot going catatonic right after unloading __initmem. I'm seeing that in 2.4.5-pre1 with Athlon stepping 2, AMD 751, MS-6195 mobo, 128M. The machine is fine with kernels up through 2.4.4-pre3, and still works with them. On that gear, there is no crash. The keyboard and display are alive and SysRq works. I have copied the stack trace for pid=1 and the processor dump. I'm short of time but I have a kind typist electrifying the trace, and I'll try to generate something ksymoops can digest. Here is what a quick eyeballing of System.map shows. The code is at the end of init/main.c:init(). The processor dump shows init() halted in default_idle() from the sequence L6 -> init -> cpu_idle. Trace of pid 1 shows it stuck in D state. The last addresses listed are from filemap_nopage -> do_execve -> do_no_page -> handle_mm_fault -> __pmd_alloc -> rwsem_down_write_failed -> stext_lock -> system_call. That looks fishy. Earlier, it looks like handle_mm_fault is being triggered from fast_clear_page. I'll post the full dump soon as I have it. Btw, above happens with both gcc-2.95.3 and gcc-3.0-[20010423] compiled kernels. Cheers, Tom -- The Daemons lurk and are dumb. 
-- Emerson
Re: 2.4.4 Kernel - ASUS CUV4X-DLS Question
All, I unfortunately don't have the time this evening to produce actual kernel messages, but I did want to throw out that I have an ASUS CUV4X-DLS board too, with two PIII/1GHz processors in it, and I cannot get it to boot an SMP kernel at all. In addition to the built-in devices, the following cards are also present: * Netgear FA310TX (rev. D, I believe - lspci reports it as a Lite-On LNE100TX rev 21) * Promise PDC20267 Ultra100 controller * Creative SB Live! Value * Matrox G200 AGP When it gets to the point of activating the second processor, kernel 2.4.3-ac13 starts spewing: probable hardware bug: clock timer configuration lost - probably a VIA686a motherboard. probable hardware bug: restoring chip configuration. continuously. Older kernels simply hang at this point. I'll try to get the actual messages leading up to this tomorrow. Also, if there's any other information I can collect from my system that could help, feel free to ask. I'll also build 2.4.4-ac6 on it tomorrow and try booting it SMP. Here's the output of lspci -vv: 00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4) Subsystem: Asustek Computer, Inc.: Unknown device 8038 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Capabilities: [c0] Power Management version 2 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 PME-Enable- DSel=0 DScale=0 PME- 00:01.0 PCI bridge: VIA Technologies, Inc. 
VT82C598/694x [Apollo MVP3/Pro133x AGP] (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Reset- FastB2B- Capabilities: [80] Power Management version 2 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 PME-Enable- DSel=0 DScale=0 PME- 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40) Subsystem: Asustek Computer, Inc.: Unknown device 8038 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- [disabled] [size=256K] 00:0b.0 Unknown mass storage controller: Promise Technology, Inc. 20267 (rev 02) Subsystem: Promise Technology, Inc.: Unknown device 4d33 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- [disabled] [size=64K] Capabilities: [58] Power Management version 1 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 PME-Enable- DSel=0 DScale=0 PME- 00:0c.0 Multimedia audio controller: Creative Labs SB Live! EMU1 (rev 08) Subsystem: Creative Labs CT4832 SBLive! Value Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- Starting kswapd v1.8 Winbond Super-IO detection, now testing ports 3F0,370,250,4E,2E ... SMSC Super-IO detection, now testing Ports 2F0, 370 ... 
0x378: FIFO is 16 bytes 0x378: writeIntrThreshold is 8 0x378: readIntrThreshold is 8 0x378: PWord is 8 bits 0x378: Interrupts are ISA-Pulses 0x378: ECP port cfgA=0x10 cfgB=0x00 0x378: ECP settings irq= dma= parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,COMPAT,ECP] parport0: cpp_daisy: aa5500ff(38) parport0: assign_addrs: aa5500ff(38) parport0: cpp_daisy: aa5500ff(38) parport0: assign_addrs: aa5500ff(38) parport_pc: Via 686A parallel port: io=0x378 Detected PS/2 Mouse Port. pty: 256 Unix98 ptys configured lp0: using parport0 (polling). block: queued sectors max/low 169506kB/56502kB, 512 slots per queue Uniform Multi-Platform E-IDE driver Revision: 6.31 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller on PCI bus 00 dev 21 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:04.1 ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:pio ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:DMA PDC20267: IDE controller on PCI bus 00 dev 58 PCI: Found IRQ 10 for device 00:0b.0 PCI: The same IRQ used for device 00:07.0 PDC20267: chipset revision 2 PDC20267: not 100% native mode: will probe irqs later PDC20267: (U)DMA Burst Bit ENABLED Primary PCI
No Subject
Sir, I am a Linux fan from India and I am eager to know about the emerging technologies. Please inform me at [EMAIL PROTECTED]
Re: page_launder() bug
On Tue, 8 May 2001, Linus Torvalds wrote:
> On Tue, 8 May 2001, Marcelo Tosatti wrote:
> >
> > There are two issues which I missed yesterday: we have to get a reference
> > on the page, mark it clean, drop the locks and then call writepage(). If
> > the writepage() fails, we'll have to set_page_dirty(page).
>
> We can move the "mark it clean" into writepage, which would actually
> simplify the error cases for shared memory writepage (no need to mark it
> dirty again etc).
>
> > I guess this is too much overhead for the common case, don't you?
>
> You could easily be right.
>
> On the other hand, remember that a noticeable part of the time you should
> be seeing a real write too, so the CPU overhead compared to the IO might
> not be prohibitive. Ie, let's assume that 10% of the time we actually end
> up doing writes, then that 10% is going to be _soo_ much more than the
> extra 10 cycles 90% of the time that the cleanup may well be worth it.
>
> Especially if the cleanup means that we can avoid doing some of the real
> writes altogether, by being better able to release dead memory to the
> system.

Ok, this patch implements the thing and also changes ext2+swap+shm writepage operations (so I could test the thing).

The performance is better with the patch on my restricted swapping tests.

In case you don't have any problems with this, I'll fix the other writepages (so tell me if it's OK for you).
diff -Nur --exclude-from=exclude linux.orig/fs/buffer.c linux/fs/buffer.c --- linux.orig/fs/buffer.c Mon May 7 20:47:26 2001 +++ linux/fs/buffer.c Tue May 8 22:04:00 2001 @@ -1933,12 +1933,17 @@ return err; } -int block_write_full_page(struct page *page, get_block_t *get_block) +int block_write_full_page(struct page *page, get_block_t *get_block, int priority) { struct inode *inode = page->mapping->host; unsigned long end_index = inode->i_size >> PAGE_CACHE_SHIFT; unsigned offset; int err; + + if (!priority) + return -1; + + ClearPageDirty(page); /* easy case */ if (page->index < end_index) diff -Nur --exclude-from=exclude linux.orig/fs/ext2/inode.c linux/fs/ext2/inode.c --- linux.orig/fs/ext2/inode.c Mon May 7 20:47:26 2001 +++ linux/fs/ext2/inode.c Tue May 8 20:46:54 2001 @@ -650,9 +650,9 @@ return NULL; } -static int ext2_writepage(struct page *page) +static int ext2_writepage(struct page *page, int priority) { - return block_write_full_page(page,ext2_get_block); + return block_write_full_page(page,ext2_get_block,priority); } static int ext2_readpage(struct file *file, struct page *page) { diff -Nur --exclude-from=exclude linux.orig/include/linux/fs.h linux/include/linux/fs.h --- linux.orig/include/linux/fs.h Tue May 8 16:45:42 2001 +++ linux/include/linux/fs.hTue May 8 22:22:38 2001 @@ -362,7 +362,7 @@ struct address_space; struct address_space_operations { - int (*writepage)(struct page *); + int (*writepage)(struct page *, int); int (*readpage)(struct file *, struct page *); int (*sync_page)(struct page *); int (*prepare_write)(struct file *, struct page *, unsigned, unsigned); @@ -1268,7 +1268,7 @@ /* Generic buffer handling for block filesystems.. 
*/ extern int block_flushpage(struct page *, unsigned long); extern int block_symlink(struct inode *, const char *, int); -extern int block_write_full_page(struct page*, get_block_t*); +extern int block_write_full_page(struct page*, get_block_t*, int); extern int block_read_full_page(struct page*, get_block_t*); extern int block_prepare_write(struct page*, unsigned, unsigned, get_block_t*); extern int cont_prepare_write(struct page*, unsigned, unsigned, get_block_t*, diff -Nur --exclude-from=exclude linux.orig/mm/filemap.c linux/mm/filemap.c --- linux.orig/mm/filemap.c Mon May 7 20:47:26 2001 +++ linux/mm/filemap.c Tue May 8 22:22:50 2001 @@ -411,7 +411,7 @@ */ void filemap_fdatasync(struct address_space * mapping) { - int (*writepage)(struct page *) = mapping->a_ops->writepage; + int (*writepage)(struct page *, int) = mapping->a_ops->writepage; spin_lock(_lock); @@ -430,8 +430,7 @@ lock_page(page); if (PageDirty(page)) { - ClearPageDirty(page); - writepage(page); + writepage(page, 1); } else UnlockPage(page); diff -Nur --exclude-from=exclude linux.orig/mm/shmem.c linux/mm/shmem.c --- linux.orig/mm/shmem.c Mon May 7 20:47:26 2001 +++ linux/mm/shmem.cTue May 8 22:23:01 2001 @@ -221,13 +221,16 @@ * once. We still need to guard against racing with * shmem_getpage_locked(). */ -static int shmem_writepage(struct page * page) +static int shmem_writepage(struct page * page, int priority) { int error = 0; struct shmem_inode_info *info; swp_entry_t
Re: SPARC include problem
Sean Jones wrote:
>
> "David S. Miller" wrote:
> >
> > Sean Jones writes:
> >  > In compiling 2.4.4-ac5 for my SPARCStation 20, I had an error in the
> >  > compile resulting from the inability to find a hw_irq.h in the
> >  > include/asm directory. Do you know where I may be able to find such a
> >  > file?
> >
> > How did you find this problem if the build couldn't find the
> > "bzImage" rule? :-)
> >
> > Later,
> > David S. Miller
> > [EMAIL PROTECTED]
>
> I found it by kicking the make stuff around one more time after I sent
> that e-mail.
>
> Sean
Re: nfsd from kernel 2.4.4 oops
On Tuesday May 8, [EMAIL PROTECTED] wrote: > Hi, > > I'm using kernel 2.4.4 cvs from SGI, with xfs. I'm getting this Oops: > > kernel: Unable to handle kernel NULL pointer dereference at virtual address 0010 > kernel: printing eip: > kernel: c017bfd8 > kernel: *pde = > kernel: Oops: > kernel: CPU:0 > kernel: EIP:0010:[nfsd_findparent+120/236] > kernel: EIP:0010:[] > kernel: EFLAGS: 00010246 > kernel: eax: ebx: ecx: cff8d458 edx: 0010 > kernel: esi: cb22c6a0 edi: cb22c720 ebp: cb22c720 esp: ce4c9e54 > kernel: ds: 0018 es: 0018 ss: 0018 > kernel: Process nfsd (pid: 592, stackpage=ce4c9000) > kernel: Stack: 1802280f c017c416 cb22c720 ce4cf814 1127 >ce4cf804 > kernel:c03c5740 cfe3b5c8 000e ff8c c017c7c4 cfe3b400 >1802280f > kernel: 0001 ce4cf804 0008 cb1fc77c ce4cfc00 >ceb7b000 > kernel: Call Trace: [find_fh_dentry+598/928] [fh_verify+612/1128] >[nfsd_lookup+110/1368] [nfsd3_proc_lookup+314/332] [nfs3svc_decode_diropargs+152/268] >[nfsd_dispatch+203/360] [svc_process+684/1348] > kernel: Call Trace: [] [] [] [] [] >[] [] > nfsd_findparent+120/236 corresponds to line 257 on fs/nfsd/nfsfh.h and the condition of the "if" statement: if (aliases->next != aliases) { just after the "spin_lock(_lock)". eax == 0 implies that >d_inode == NULL, and hence the oops. d_inode being NULL here implies that the "lookup" of ".." failed to find a ".." entry, which is very odd. I find it hard to believe that ext2fs would ever do this unless the filesystem was corrupt. XFS might, I don't know. I guess nfsd should be robust against this sort of behaviour in filesystems. Something like: --- nfsfh.c 2001/05/09 00:54:56 1.1 +++ nfsfh.c 2001/05/09 00:56:01 @@ -244,6 +244,10 @@ */ pdentry = child->d_inode->i_op->lookup(child->d_inode, tdentry); d_drop(tdentry); /* we never want ".." hashed */ + if (!pdentry && tdentry->d_inode == NULL) { + dput(tdentry); + pdentry = ERR_PTR(-EINVAL); + } if (!pdentry) { /* I don't want to return a ".." dentry. 
* I would prefer to return an unconnected "IS_ROOT" dentry, Is probably the best fix for knfsd, but someone should find out why XFS isn't finding ".." when asked (If that is indeed what is happening). NeilBrown > > It's produced very randomly. Some people (readed in xfs list) get similar error and > tested too with a clean 2.4.4 with ext2 filesystem, and oops too. I think this is > related to nfsd code (maybe sunrpc code), and it's not related to xfs code. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
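For readers unfamiliar with the idiom used in the proposed knfsd fix: ERR_PTR() packs a negative errno into a pointer value, so one return slot can carry either a valid pointer or an error. What follows is only a userspace sketch of that idea, not the kernel's definitions. The names err_ptr(), ptr_err(), is_err(), struct toy_dentry and check_parent_lookup() are hypothetical stand-ins, the 4095 cutoff is an assumption, and the dput() from the patch is left out because there is no reference counting here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the error-pointer idiom (hypothetical helpers,
 * not the kernel's ERR_PTR/PTR_ERR/IS_ERR). */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Assumption: the top 4095 address values are reserved for error codes. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-4095;
}

/* Toy dentry: only the field the check needs. */
struct toy_dentry {
    void *d_inode;   /* NULL means a negative dentry (name not found) */
};

/*
 * Mirrors the guard added by the patch: if the lookup of ".." returned
 * NULL (assumed here to mean "the dentry passed in was used") and that
 * dentry has no inode, report -EINVAL instead of letting the caller
 * dereference a NULL d_inode later.
 */
static void *check_parent_lookup(void *lookup_result,
                                 struct toy_dentry *tdentry)
{
    if (!lookup_result && tdentry->d_inode == NULL)
        return err_ptr(-EINVAL);
    return lookup_result;
}
```

A caller would test the result with is_err() before touching it, which is the point of the fix: the failure becomes an explicit error return rather than an oops further down.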
Re: pci_pool_free from IRQ
David Brownell writes:
 > Pete's patch to pci_pool_free() is fine with me, and I'd be glad
 > to see that bit of pci interface cleaned up. Any changes needed
 > other than the pci.txt doc update?

Ummm... What Alan's saying is:

1) Whatever driver is trying to shut down from IRQ context is broken and must be fixed. pci_pool is fine.

2) The Documentation/ files which suggest that such device removal from IRQs is "OK" must be fixed because it is not "OK" to handle device removal from IRQ context.

So Pete's change is not needed. A fix for the documentation and broken drivers is needed instead.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: Linux 2.4.4-ac6
On Wed, 9 May 2001, Alan Cox wrote:
...
> 2.4.4-ac6
...

To be honest, I was expecting the Athlon pre-pre-pre-patch/fix to be included.
Patch to make ymfpci legacy address 16 bits
Hi:

I found that every time I run a 2.4 kernel on my laptop, APM locks up the machine. Apparently, the legacy YMF code enabled decoding of only 10 bits of I/O address. A call to the APM BIOS touched that and somehow the system locked up.

If Pavel Roskin, Daisuke Nagano or someone else does not mind, I want this in the stock kernel.

-- Pete

--- linux-2.4.4/drivers/sound/ymfpci.c	Thu Apr 26 22:17:27 2001
+++ linux-2.4.4-niph/drivers/sound/ymfpci.c	Tue May  8 16:46:58 2001
@@ -2059,9 +2059,10 @@
 	}
 
 	if (mpuio >= 0 || oplio >= 0) {
-		v = 0x003e;
+		/* 0x0020: 1 - 10 bits of I/O address decoded, 0 - 16 bits. */
+		v = 0x001e;
 		pci_write_config_word(pcidev, PCIR_LEGCTRL, v);
-
+
 		switch (pcidev->device) {
 		case PCI_DEVICE_ID_YAMAHA_724:
 		case PCI_DEVICE_ID_YAMAHA_740:
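The one-bit change in this patch can be checked in isolation. The sketch below only shows that 0x003e and 0x001e differ in the 0x0020 flag, and why 10-bit decoding can make unrelated port accesses land on the device. The macro name and both helper functions are hypothetical (the driver itself uses the bare constant), and the aliasing helper is a general illustration of 10-bit I/O decoding, not a claim about what the APM BIOS actually touched.

```c
#include <stdint.h>

/* Hypothetical name for illustration; the driver writes a bare constant.
 * Per the patch comment: 1 = 10 bits of I/O address decoded, 0 = 16 bits. */
#define YMF_LEGCTRL_DECODE_10BIT 0x0020u

/* Clear the 10-bit-decode flag so all 16 I/O address bits are compared. */
static uint16_t legctrl_use_16bit_decode(uint16_t v)
{
    return (uint16_t)(v & ~YMF_LEGCTRL_DECODE_10BIT);
}

/* With 10-bit decoding only the low 10 address bits are compared, so any
 * port whose low 10 bits match the legacy port aliases onto it. */
static int aliases_with_10bit_decode(uint16_t port, uint16_t legacy_port)
{
    return (port & 0x03ffu) == (legacy_port & 0x03ffu);
}
```

For example, legctrl_use_16bit_decode(0x003e) yields 0x001e, the value the patch writes, and a port such as 0x0788 would alias onto a device at 0x0388 under 10-bit decoding (the port numbers are arbitrary examples).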
Re:
"Richard B. Johnson" wrote: > > To driver wizards: > > I have a driver which needs to wait for some hardware. > Basically, it needs to have some code added to the run-queue > so it can get some CPU time even though it's not being called. > > It needs to get some CPU time which can be "turned on" or > "turned off" as a result of an interrupt or some external > input from an ioctl(). schedule_task()? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Linux 2.4.4-ac6
ftp://ftp.kernel.org/pub/linux/kernel/people/alan/2.4/

Intermediate diffs are available from
	http://www.bzimage.org

2.4.4-ac6
o	Revert dead swap patch pending fixes (Dave Miller)
o	Allow arch specific writeproc/DMA for IDE (Bjorn Wesen)
o	Move to aic7xxx 6.1.13 (Justin Gibbs)
o	Use pci_set_master on eni.c (Jeff Garzik)
o	Update wireless drivers, add airport (Jean Tourrihles, Benjamin Herrenschmidt)
o	Add new pci ids, clean up dup defines in eicon (Jeff Garzik)
o	Add module loader to kernel docs (Erik Mouw)
o	Fix wanrouter makefile bug (Arnaldo Carvalho de Melo)
o	Add another pair of idents to the yenta driver (Alexandr Kanevskiy)
o	Parport fixes for 1284 mode (Fred Barnes)
o	Update 8139too driver to handle wakeup bug (Jeff Garzik)
o	Add koi8-ru locale (Andrzej Krzysztofowicz)
o	Add ICH3 to the i810 audio driver (Tom Woller)
o	Improve (hopefully) the confusing I82365 help (me)
o	Fix a bug in koi8-u tables (Andrzej Krzysztofowicz)
o	Fix a bug in UTF8->CP1255 (Andrzej Krzysztofowicz)
o	Fix a bug in iso8859-13 tables (Andrzej Krzysztofowicz)
o	Update gdth driver to current vendor release (Achim Leubner)
o	Kill cpia_write_proc (it's insecure) (Al Viro, me)
o	Fix unterminated array strtoul() in comx (Al Viro)
o	Fix TCP send path leak (Dave Miller)
o	Restore older skb_cow() headroom behaviour (Dave Miller)
o	Fix ipv6 oops (Dave Miller)
o	Small ipx tidy up (Arnaldo Carvalho de Melo)
o	Fix unprotected userspace reference in trident audio (Al Viro)
o	Fix expand stack locking (Manfred Spraul)
o	Fix offslab_limit calculation (Manfred Spraul)
o	EATA and U14F updates (Dario Ballabio)
o	Update scsi generic to 3.1.18 (Doug Gilbert)
o	Clean up abs() (Kai Germaschewski)
	| This needs further checking
o	ymfpci update (Pete Zaitcev)
o	Quota code updates (Jan Kara)
o	Clean up eicon include abuse (me)

2.4.4-ac5
o	Fix DMA setup on hpt366/370 (Tim Hockin)
o	DRM memory alloc failure checks (Akash Jain)
o	Remove bogus fs/buffer.c diff (Ben LaHaise)
o	cs46xx update - adds Hercules Game Theatre XP (Thomas Woller)
o	Fix menuconfig breakage with () (Andrzej Krzysztofowicz)
o	Updated multithreaded core dump support (Don Dugger)
o	Remove dead ibmtr.h include (Mike Phillips)
o	Fix misplaced letters in koi8-u (Andriy Rysin)
o	Further alpha module locking fix (Andrea Arcangeli)
o	Keyspan bitwidth fixes (Hugh Blemings)
o	usb-uhci oops fix (Pete Zaitcev)
o	Add ability to specify preferred minor on video/radio4linux devices (Gerd Knorr)
o	Further IPX updates (Arnaldo Carvalho de Melo)
o	Further IRDA updates (Dag Brattli)
o	Make x86 ptrace framesize a define (code clean) (Pavel Machek)
o	Moxa serial tidy (Tim Hockin)
o	Fix tiny select race (Rusty Russell)
o	Update aic7xxx to 6.1.12 (Justin Gibbs)
o	Alpha was missing rwlock_init (Reto Baettig)
o	Alpha SCHED_YIELD was broken on UP (Andrea Arcangeli)
o	Allow IRQ sharing on more PCI ide (Pete Zaitcev)
o	Fix capable checks found by Stanford analyser for cciss/cpqarray (me)
o	List more devices in sysrq table (Andrzej Krzysztofowicz)
o	Run uml exit callbacks reverse to init (Andrew Morton)
o	Fix SMP resched_idle pre-emption bug (Nigel Gamble)
o	Work around config problem with menuconfig and USB (Andrzej Krzysztofowicz)
o	Fix nasty bug in Alpha PCI mapping (Hyung Min SEO)
	| Nautilus specific stuff not applied yet
o	SBLive endianness fixes (output only so far) (Ira Weiny)
o	Move sblive pci_enable earlier (Marcus Meissner)
o	Merge IBM ServeRAID 4.72 driver (Keith Mitchell)
o	Fix affs
Re: [patch] 2.4.4: mmap() fails for certain legal requests
Alan Cox writes:
 > And just how is he going to test it ? Considering he was just
 > asking if the concept was reasonable I think you are a little out
 > of order

I can't test every platform when I have to make such changes. But it always serves to show the port maintainer "what" the change was.

Yes, I am slightly out of order if the intent is just "does this idea look fine" (which it does btw, I can't find any problems with it).

I apologize to Maciej, but I do implore him to actually do the final bits for the other ports when he makes his final patch submission.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: [patch] 2.4.4: mmap() fails for certain legal requests
On Tue, 8 May 2001, David S. Miller wrote: > That's pretty arrogant that cut and pasting a few lines into some > architecture specific files and reporting the updated patch is too > much to ask. I'm sorry if you find me arrogant -- that certainly was not my intent. I did look at the files and changes are not as trivial as cut and paste. > Perhaps reviewing your change is also, too much to ask. Perhaps > we are too lazy and short on time to have a look at your change. Well, I've been using similar changes since July. I may live with patches forever and be fine. Still this is not the point with free software. It would be malicious if I had a fix and I wouldn't share it. Sooner or later someone would discover the problem again and would waste time to track it down unnecessarily. And again, and again... > I don't think it's asking a lot to provide a complete change. It's not a lot, supposedly, but look at the case from my point of view. It's a bugfix and not a new feature. I've invested a few hours in finding the cause of a weird bug on a MIPS/Linux machine. I am providing a ready solution that works for most architectures with the exception of a few ones I'm not familiar with. Well, it's great I have an opportunity to get better knowledge on these architectures, but I cannot always afford it and I know there are people who already have enough knowledge to be sure bits get written correctly immediately. I never hesitate to do job myself in the areas I am familiar with or when I have enough free time (and I do have, from time to time). I don't have time currently, I am afraid (basically I am now stealing the time I would otherwise spend sleeping for a task that was quite low on my priority list) and I am sure someone familiar with the specific ports would spend less time than I do. Finally I do consider my time equally worth to anyone else's one, so why should I have to spend x units of time, whilst some else would only spend x/2 or x/3 or whatever... 
Of course I consider this rule working both ways. > I'm sure the MIPS folks know all too well whats it's like when their > port is crapped up because someone only made changes to x86 port > portions. At least for me on after working on Sparc for some time, > I'm adamant about providing complete changes so that this kind of > grief is avoided for other port maintainers. The port gets crapped from time to time, although Ralf is doing great job to keep it fine, so it's more that specific MIPS hosts lag behind the rest of the kernel. Still I consider it the specific maintainer's job to get things synchronized. It just works better this way. > In the time you used to compose your response to me, and now > to read this email from me, you could have fixed up the patch > perhaps 2 or 3 times. Just do it and get it over with ok? I'm not so sure, I'm afraid, especially at this time of the day. Check timestamps of mails if curious... > Dziekuje. Nie za ma co. ;-) A patch follows. Architecture-specific changes are completely untested. I hope I got things right, otherwise I'll consider my time wasted. BTW, I've noticed the "if (flags & MAP_FIXED)" statements in arch_get_unmapped_area() in arch/sparc*/kernel/sys_sparc.c are dead code now, as get_unmapped_area() in mm/mmap.c never calls it if MAP_FIXED is set in flags. Maciej -- + Maciej W. 
Rozycki, Technical University of Gdansk, Poland + +--+ +e-mail: [EMAIL PROTECTED], PGP key available+ diff -up --recursive --new-file linux-2.4.4.macro/arch/ia64/kernel/sys_ia64.c linux-2.4.4/arch/ia64/kernel/sys_ia64.c --- linux-2.4.4.macro/arch/ia64/kernel/sys_ia64.c Mon May 7 16:43:50 2001 +++ linux-2.4.4/arch/ia64/kernel/sys_ia64.c Tue May 8 23:25:49 2001 @@ -28,13 +28,22 @@ arch_get_unmapped_area (struct file *fil if (len > RGN_MAP_LIMIT) return -ENOMEM; - if (!addr) - addr = TASK_UNMAPPED_BASE; + if (addr) { + if (flags & MAP_SHARED) + addr = COLOR_ALIGN(addr); + else + addr = PAGE_ALIGN(addr); + vmm = find_vma(current->mm, addr); + if (TASK_SIZE - len >= addr && + rgn_offset(addr) + len <= RGN_MAP_LIMIT) && + (!vmm || addr + len <= vmm->vm_start)) + return addr; + } if (flags & MAP_SHARED) - addr = COLOR_ALIGN(addr); + addr = COLOR_ALIGN(TASK_UNMAPPED_BASE); else - addr = PAGE_ALIGN(addr); + addr = PAGE_ALIGN(TASK_UNMAPPED_BASE); for (vmm = find_vma(current->mm, addr); ; vmm = vmm->vm_next) { /* At this point: (!vmm || addr < vmm->vm_end). */ diff -up --recursive --new-file linux-2.4.4.macro/arch/sparc/kernel/sys_sparc.c linux-2.4.4/arch/sparc/kernel/sys_sparc.c ---
Re: page_launder() bug
On Tue, 8 May 2001, Marcelo Tosatti wrote:
>
> There are two issues which I missed yesterday: we have to get a reference
> on the page, mark it clean, drop the locks and then call writepage(). If
> the writepage() fails, we'll have to set_page_dirty(page).

We can move the "mark it clean" into writepage, which would actually simplify the error cases for shared memory writepage (no need to mark it dirty again etc).

> I guess this is too much overhead for the common case, don't you?

You could easily be right.

On the other hand, remember that a noticeable part of the time you should be seeing a real write too, so the CPU overhead compared to the IO might not be prohibitive. Ie, let's assume that 10% of the time we actually end up doing writes, then that 10% is going to be _soo_ much more than the extra 10 cycles 90% of the time that the cleanup may well be worth it.

Especially if the cleanup means that we can avoid doing some of the real writes altogether, by being better able to release dead memory to the system.

Tradeoffs..

		Linus
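The 10%/90% argument above can be put into numbers. This is only a back-of-the-envelope sketch: the 10% write rate and the 10 extra cycles come from the message, while the cost of a real write (one million cycles in the usage note) is purely an assumed figure for illustration. The struct and function names are hypothetical.

```c
/* Expected per-call costs in CPU cycles, scaled by 100 so the
 * percentages can stay in integer arithmetic. */
struct cost {
    unsigned long long extra_x100;  /* bookkeeping on calls that do not write */
    unsigned long long io_x100;     /* real writes */
};

/*
 * write_pct:    percentage of calls that end in a real write (0..100)
 * extra_cycles: extra CPU cycles spent on a call that does not write
 * io_cycles:    assumed cost of one real write, in cycles
 */
static struct cost expected_cost(unsigned write_pct,
                                 unsigned long long extra_cycles,
                                 unsigned long long io_cycles)
{
    struct cost c;

    c.extra_x100 = (unsigned long long)(100 - write_pct) * extra_cycles;
    c.io_x100 = (unsigned long long)write_pct * io_cycles;
    return c;
}
```

Calling expected_cost(10, 10, 1000000) gives extra_x100 = 900 and io_x100 = 10000000, i.e. an average of 9 cycles of bookkeeping per call against 100,000 cycles of write cost under that assumed I/O figure. The conclusion is only as good as the assumed write cost, but it shows why the extra cycles are unlikely to dominate.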
Re: [patch] 2.4.4: mmap() fails for certain legal requests
> > Thanks for your response, though -- maybe there is someone interested, > > after all. > > That's pretty arrogant that cut and pasting a few lines into some > architecture specific files and reporting the updated patch is too > much to ask. And just how is he going to test it ? Considering he was just asking if the concept was reasonable I think you are a little out of order - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
`smp_num_cpus' undeclared in 2.4.3
Dear linux-kernel mailing list,

I am trying to build 2.4.3 for an Intel machine, but I get this error when I say no to 'CONFIG_SMP':

In file included from ksyms.c:17:
/usr/src/linux-2.4.3/include/linux/kernel_stat.h:48: `smp_num_cpus' undeclared (first use in this function)
/usr/src/linux-2.4.3/include/linux/kernel_stat.h:48: (Each undeclared identifier is reported only once
/usr/src/linux-2.4.3/include/linux/kernel_stat.h:48: for each function it appears in.)
make[2]: *** [ksyms.o] Error 1

When I say yes to 'CONFIG_SMP', there is no compilation error. I am attaching autoconf.h for reference.

Thanks,
Best Regards,
Jaswinder.
--
These are my opinions not 3Di.
Child first after fork violates the SCHED_FIFO and SCHED_RR standard.
The standard says a SCHED_FIFO task only gives up the processor if it blocks, yields, or changes its priority.

The counter is not really used by SCHED_FIFO tasks, however the update_process_times() code will set the "need_resched" flag on a SCHED_FIFO task, even though schedule() effectively ignores the entry.

The attached patch addresses these issues by setting the counter to -100 for SCHED_FIFO tasks and "teaching" update_process_times() to not count down negative counters. This avoids the calling of schedule() every jiffy while a SCHED_FIFO task is running.

I tried to keep the change to recalculate confined to only the data elements it was already touching, however, the standard really doesn't allow recalculate to touch the SCHED_RR counter. A standard conforming test would restrict recalculate to only SCHED_OTHER tasks.

Comments?

George

diff -urP -X /usr/src/patch.exclude linux-2.4.4-kb/kernel/fork.c linux/kernel/fork.c
--- linux-2.4.4-kb/kernel/fork.c	Mon May  7 14:46:17 2001
+++ linux/kernel/fork.c	Tue May  8 15:17:51 2001
@@ -673,10 +673,14 @@
 	 * if the child for a fork() just wants to do a few simple things
 	 * and then exec(). This is only important in the first timeslice.
 	 * In the long run, the scheduling behavior is unchanged.
+	 * SCHED_FIFO tasks don't count down and have a negative counter.
+	 * Don't change these, lest they all end up at -1.
 	 */
-	p->counter = current->counter;
-	current->counter = 0;
-	current->need_resched = 1;
+	if (p->policy == SCHED_OTHER){
+		p->counter = current->counter;
+		current->counter = 0;
+		current->need_resched = 1;
+	}
 
 	/*
 	 * Ok, add it to the run-queues and make it
diff -urP -X /usr/src/patch.exclude linux-2.4.4-kb/kernel/sched.c linux/kernel/sched.c
--- linux-2.4.4-kb/kernel/sched.c	Mon May  7 14:46:17 2001
+++ linux/kernel/sched.c	Tue May  8 13:43:54 2001
@@ -682,7 +682,10 @@
 	spin_unlock_irq(_lock);
 	read_lock(_lock);
 	for_each_task(p)
-		p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+		if (p->counter >= 0 ){
+			p->counter = (p->counter >> 1) +
+				NICE_TO_TICKS(p->nice);
+		}
 	read_unlock(_lock);
 	spin_lock_irq(_lock);
 }
@@ -932,6 +935,11 @@
 
 	retval = 0;
 	p->policy = policy;
+	if ( policy == SCHED_FIFO) {
+		p->counter = -100;	/* we don't count down neg counters */
+	}else{
+		p->counter = NICE_TO_TICKS(p->nice);
+	}
 	p->rt_priority = lp.sched_priority;
 	if (task_on_runqueue(p))
 		move_first_runqueue(p);
diff -urP -X /usr/src/patch.exclude linux-2.4.4-kb/kernel/timer.c linux/kernel/timer.c
--- linux-2.4.4-kb/kernel/timer.c	Sun Dec 10 09:53:19 2000
+++ linux/kernel/timer.c	Tue May  8 15:14:37 2001
@@ -583,7 +583,11 @@
 
 	update_one_process(p, user_tick, system, cpu);
 	if (p->pid) {
-		if (--p->counter <= 0) {
+		/*
+		 * SCHED_FIFO and the idle(s) have counters set to -100,
+		 * so we won't count them.
+		 */
+		if (p->counter >= 0 && --p->counter <= 0) {
 			p->counter = 0;
 			p->need_resched = 1;
 		}
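The effect of the negative counter is easy to try outside the kernel. The following is a userspace sketch of the three patched code paths, not kernel code: struct task, the POLICY_* constants, set_policy(), tick() and recalculate() are hypothetical stand-ins, and nice_ticks stands in for NICE_TO_TICKS(p->nice).

```c
/* Hypothetical policy constants for the sketch (not the kernel's values). */
enum { POLICY_OTHER, POLICY_FIFO };

/* Only the task fields the patch touches. */
struct task {
    int policy;
    long counter;
    int need_resched;
};

/* Mirrors the setscheduler() hunk: FIFO tasks get a negative counter,
 * everything else gets a fresh timeslice. */
static void set_policy(struct task *p, int policy, long nice_ticks)
{
    p->policy = policy;
    p->counter = (policy == POLICY_FIFO) ? -100 : nice_ticks;
}

/* Mirrors the patched test in update_process_times(): a negative counter
 * short-circuits the decrement, so no reschedule is requested. */
static void tick(struct task *p)
{
    if (p->counter >= 0 && --p->counter <= 0) {
        p->counter = 0;
        p->need_resched = 1;
    }
}

/* Mirrors the patched recalculation loop in schedule(): negative
 * counters are left untouched. */
static void recalculate(struct task *p, long nice_ticks)
{
    if (p->counter >= 0)
        p->counter = (p->counter >> 1) + nice_ticks;
}
```

Driving a POLICY_FIFO task through any number of tick() calls leaves its counter at -100 and need_resched at 0, while a POLICY_OTHER task with two ticks left has need_resched set on the second tick. As the message itself notes, this sketch (like the patch) still lets recalculate() touch SCHED_RR-style counters, because it keys on the sign of the counter rather than on the policy.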
Re: [PATCH][RFT] smbfs bugfixes for 2.4.4
> No, I broke it when copying the ncpfs dircache code. > > That code will reuse an old inode if it already exists (and thus also > any pages attached to it), which is what I wanted and should be fine > except that it needs to invalidate_inode_pages() if something changed. > > Xuan and James, you have both seen this bug with smbfs not properly > handling changes made on the server. Could you please test this patch > vs 2.4.4 and let me know if it helps or not. > > http://www.hojdpunkten.ac.se/054/samba/smbfs-2.4.4-truncate+retry-4.patch Urban: I am actually using a 2.4.3 kernel, rather than 2.4.4. However, I manually applied the patches to my 2.4.3 kernel, and did some tests - it appears to work now! I probably won't be using Samba heavily until next week, but I will let you know if I see any evidence that the problem is not fixed. Thank you very much for the fix. -- James James H. Puttick Kerr Vayne Systems Ltd. 1 Valleywood Drive, Unit 5A Markham, Ontario L3R 5L9 Canada +1 905 475 6161 office +1 905 479 9833 fax mailto:[EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] proc_root_init() made saner
Changes:
	* proc_root_init() is called later in the boot sequence, after all
	  essential VFS stuff has been initialized. That way we can have
	  proc_mnt (along with superblock, root of dentry tree, etc.) set
	  before we start registering any entries. As a result, we are now
	  able to use all normal VFS machinery in create_proc_entry() and
	  friends. What's more important, we can use it when we do
	  sysctl_init(). That will allow us to remove a lot of cruft from
	  sysctl handling.
	* procfs_syms.c is gone (merged with root.c).

This change is backwards compatible, BTW - nothing done that early
(between the old and new locations of the proc_root_init() call) tries
to create proc entries, so we don't break anything by postponing the
call.

Please, apply it. It makes life much simpler for all procfs and sysctl
stuff - we _will_ need something equivalent if we ever want to get rid
of the proc_dir_entry mess.
								Al

diff -urN S5-pre1/fs/proc/Makefile S5-pre1-proc_init/fs/proc/Makefile
--- S5-pre1/fs/proc/Makefile	Fri Feb 16 21:06:31 2001
+++ S5-pre1-proc_init/fs/proc/Makefile	Tue May  8 17:36:57 2001
@@ -9,10 +9,10 @@
 
 O_TARGET := proc.o
 
-export-objs := procfs_syms.o
+export-objs := root.o
 
 obj-y    := inode.o root.o base.o generic.o array.o \
-		kmsg.o proc_tty.o proc_misc.o kcore.o procfs_syms.o
+		kmsg.o proc_tty.o proc_misc.o kcore.o
 
 ifeq ($(CONFIG_PROC_DEVICETREE),y)
 obj-y += proc_devtree.o
diff -urN S5-pre1/fs/proc/procfs_syms.c S5-pre1-proc_init/fs/proc/procfs_syms.c
--- S5-pre1/fs/proc/procfs_syms.c	Tue May  8 17:55:17 2001
+++ S5-pre1-proc_init/fs/proc/procfs_syms.c	Wed Dec 31 19:00:00 1969
@@ -1,46 +0,0 @@
-#include
-#include
-#include
-#include
-#include
-
-extern struct proc_dir_entry *proc_sys_root;
-
-#ifdef CONFIG_SYSCTL
-EXPORT_SYMBOL(proc_sys_root);
-#endif
-EXPORT_SYMBOL(proc_symlink);
-EXPORT_SYMBOL(proc_mknod);
-EXPORT_SYMBOL(proc_mkdir);
-EXPORT_SYMBOL(create_proc_entry);
-EXPORT_SYMBOL(remove_proc_entry);
-EXPORT_SYMBOL(proc_root);
-EXPORT_SYMBOL(proc_root_fs);
-EXPORT_SYMBOL(proc_net);
-EXPORT_SYMBOL(proc_bus);
-EXPORT_SYMBOL(proc_root_driver);
-
-static DECLARE_FSTYPE(proc_fs_type, "proc", proc_read_super, FS_SINGLE);
-
-static int __init init_proc_fs(void)
-{
-	int err = register_filesystem(_fs_type);
-	if (!err) {
-		proc_mnt = kern_mount(_fs_type);
-		err = PTR_ERR(proc_mnt);
-		if (IS_ERR(proc_mnt))
-			unregister_filesystem(_fs_type);
-		else
-			err = 0;
-	}
-	return err;
-}
-
-static void __exit exit_proc_fs(void)
-{
-	unregister_filesystem(_fs_type);
-	kern_umount(proc_mnt);
-}
-
-module_init(init_proc_fs)
-module_exit(exit_proc_fs)
diff -urN S5-pre1/fs/proc/root.c S5-pre1-proc_init/fs/proc/root.c
--- S5-pre1/fs/proc/root.c	Fri Feb 16 20:25:45 2001
+++ S5-pre1-proc_init/fs/proc/root.c	Tue May  8 17:37:44 2001
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 
 struct proc_dir_entry *proc_net, *proc_bus, *proc_root_fs, *proc_root_driver;
@@ -22,8 +23,19 @@
 struct proc_dir_entry *proc_sys_root;
 #endif
 
+static DECLARE_FSTYPE(proc_fs_type, "proc", proc_read_super, FS_SINGLE);
+
 void __init proc_root_init(void)
 {
+	int err = register_filesystem(_fs_type);
+	if (err)
+		return;
+	proc_mnt = kern_mount(_fs_type);
+	err = PTR_ERR(proc_mnt);
+	if (IS_ERR(proc_mnt)) {
+		unregister_filesystem(_fs_type);
+		return;
+	}
 	proc_misc_init();
 	proc_net = proc_mkdir("net", 0);
 #ifdef CONFIG_SYSVIPC
@@ -106,3 +118,17 @@
 	proc_fops:	_root_operations,
 	parent:		_root,
 };
+
+#ifdef CONFIG_SYSCTL
+EXPORT_SYMBOL(proc_sys_root);
+#endif
+EXPORT_SYMBOL(proc_symlink);
+EXPORT_SYMBOL(proc_mknod);
+EXPORT_SYMBOL(proc_mkdir);
+EXPORT_SYMBOL(create_proc_entry);
+EXPORT_SYMBOL(remove_proc_entry);
+EXPORT_SYMBOL(proc_root);
+EXPORT_SYMBOL(proc_root_fs);
+EXPORT_SYMBOL(proc_net);
+EXPORT_SYMBOL(proc_bus);
+EXPORT_SYMBOL(proc_root_driver);
diff -urN S5-pre1/init/main.c S5-pre1-proc_init/init/main.c
--- S5-pre1/init/main.c	Wed May  2 11:16:38 2001
+++ S5-pre1-proc_init/init/main.c	Tue May  8 17:19:42 2001
@@ -561,9 +561,6 @@
 #endif
 	mem_init();
 	kmem_cache_sizes_init();
-#ifdef CONFIG_PROC_FS
-	proc_root_init();
-#endif
 	mempages = num_physpages;
 
 	fork_init(mempages);
@@ -577,6 +574,9 @@
 	signals_init();
 	bdev_init();
 	inode_init(mempages);
+#ifdef CONFIG_PROC_FS
+	proc_root_init();
+#endif
 #if defined(CONFIG_SYSVIPC)
 	ipc_init();
 #endif
Re: [patch] 2.4.4: mmap() fails for certain legal requests
Czesc,

Maciej W. Rozycki writes:
> Yep, I know (ia64 and sparc*). But being lazy enough (and being short on
> time) I won't do it until I know the idea of the change is accepted. I'm
> sorry -- I sent previous versions of the patch twice since last Summer
> with no response at all and doing bits no one is interested in is a waste
> of time.
>
> Thanks for your response, though -- maybe there is someone interested,
> after all.

That's pretty arrogant, that cutting and pasting a few lines into some
architecture specific files and reporting the updated patch is too much
to ask.

Perhaps reviewing your change is also too much to ask. Perhaps we are
too lazy and short on time to have a look at your change.

I don't think it's asking a lot to provide a complete change. I'm sure
the MIPS folks know all too well what it's like when their port is
crapped up because someone only made changes to the x86 port portions.

At least for me, after working on Sparc for some time, I'm adamant about
providing complete changes so that this kind of grief is avoided for
other port maintainers.

In the time you used to compose your response to me, and now to read
this email from me, you could have fixed up the patch perhaps 2 or 3
times. Just do it and get it over with ok?

Dziekuje.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: pci_pool_free from IRQ
Alan Cox writes:
> I suspect we should fix the documentation (and if need be the code) to reflect
> the fact that you have to be completely out of your tree to handle device
> removal in the irq handler

Agreed.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: oddity with page_launder() handling of dirty pages
On Tue, 8 May 2001, Marcelo Tosatti wrote:
>
> Linus, since you wrote that part of the code, I ask you: do you have any
> reason to not remove a page being writepage()'d from the
> inactive_dirty_list to avoid this kind of problems ?
>
> (the page must be added back to the inactive_dirty_list again after the
> writeout, yes).

This is the reason. I think it is absolutely _wrong_ to add it back
after the writeout - anything could have happened to the page, including
the page moving to other lists or not being a page cache page AT ALL.

We had tons of bugs in this area when the page lists were introduced.
Leaving it on the list and letting anybody who changed the state of the
page remove it cleanly fixed all the bugs. And I'm not going back to the
old and broken code.

You can move it to the "active_list" if you want to while it is being
written out ("it's busy, so it's active"). As long as you move it
_before_ you start the write-out.

		Linus
Re: [RFC] Direct Sockets Support??
[EMAIL PROTECTED] said:
> > But in the case of an application which fits in main memory, and
> > has been running for a while (so all pages are present and
> > dirty), all you'd really have to do is verify the page tables are
> > in the proper state and skip the TLB flush, right?
>
> We really cannot assume this. There are two cases
> a. when a user app wants to receive some data, it allocates
> memory(using malloc) and waits for the hw to do zero-copy read. The kernel
> does not allocate physical page frames for the entire memory region
> allocated. We need to lock the memory (and locking is expensive due to
> costly TLB flushes) to do this
>
> b. when a user app wants to send data, he fills the buffer
> and waits for the hw to transmit data, but under heavy physical memory
> pressure, the swapper might swap the pages we want to transmit. So we need
> to lock the memory to be 100% sure.

You're right, of course. But I suspect that the fast path of re-locking
memory which is happily in core will go much faster by removing the
multi-processor TLB purge. And it can't hurt, unless I'm missing
something.

-- Pete

--- linux-2.4.4-stock/mm/mlock.c	Tue May  8 17:26:34 2001
+++ linux/mm/mlock.c	Tue May  8 17:24:13 2001
@@ -114,6 +114,10 @@
 	return 0;
 }
 
+/* implemented in mm/memory.c */
+extern int mlock_make_pages_present(struct vm_area_struct *vma,
+	unsigned long addr, unsigned long end);
+
 static int mlock_fixup(struct vm_area_struct * vma,
 	unsigned long start, unsigned long end, unsigned int newflags)
 {
@@ -138,7 +142,7 @@
 		pages = (end - start) >> PAGE_SHIFT;
 		if (newflags & VM_LOCKED) {
 			pages = -pages;
-			make_pages_present(start, end);
+			mlock_make_pages_present(vma, start, end);
 		}
 		vma->vm_mm->locked_vm -= pages;
 	}
--- linux-2.4.4-stock/mm/memory.c	Tue May  8 17:25:36 2001
+++ linux/mm/memory.c	Tue May  8 17:24:40 2001
@@ -1438,3 +1438,80 @@
 	} while (addr < end);
 	return 0;
 }
+
+/*
+ * Specialized version of make_pages_present which does not require
+ * a multi-processor TLB purge for every page if nothing about the PTE
+ * was modified.
+ */
+int mlock_make_pages_present(struct vm_area_struct *vma,
+	unsigned long addr, unsigned long end)
+{
+	int ret, write;
+	struct mm_struct *mm = current->mm;
+
+	write = (vma->vm_flags & VM_WRITE) != 0;
+
+	/*
+	 * We need the page table lock to synchronize with kswapd
+	 * and the SMP-safe atomic PTE updates.
+	 */
+	spin_lock(>page_table_lock);
+
+	ret = 0;
+	for (ret=0; !ret && addr < end; addr += PAGE_SIZE) {
+		pgd_t *pgd;
+		pmd_t *pmd;
+		pte_t *pte, entry;
+		int modified;
+
+		current->state = TASK_RUNNING;
+		pgd = pgd_offset(mm, addr);
+		pmd = pmd_alloc(mm, pgd, addr);
+		if (!pmd) {
+			ret = -1;
+			break;
+		}
+		pte = pte_alloc(mm, pmd, addr);
+		if (!pte) {
+			ret = -1;
+			break;
+		}
+		entry = *pte;
+		if (!pte_present(entry)) {
+			/*
+			 * If it truly wasn't present, we know that kswapd
+			 * and the PTE updates will not touch it later. So
+			 * drop the lock.
+			 */
+			if (pte_none(entry)) {
+				ret = do_no_page(mm, vma, addr, write, pte);
+				continue;
+			}
+			ret = do_swap_page(mm, vma, addr, pte,
+				pte_to_swp_entry(entry), write);
+			continue;
+		}
+
+		modified = 0;
+		if (write) {
+			if (!pte_write(entry)) {
+				ret = do_wp_page(mm, vma, addr, pte, entry);
+				continue;
+			}
+			if (!pte_dirty(entry)) {
+				entry = pte_mkdirty(entry);
+				modified = 1;
+			}
+		}
+		if (!pte_young(entry)) {
+			entry = pte_mkyoung(entry);
+			modified = 1;
+		}
+		if (modified)
+			establish_pte(vma, addr, pte, entry);
+	}
+
+	spin_unlock(>page_table_lock);
+	return ret;
+}
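The core idea of the patch above (only write the PTE back, and so only pay for the flush, when some bit actually changed) can be illustrated in userspace. This is a sketch under stated assumptions: the `PTE_*` bit values, `establish_pte_sim()` and the `pte_writes` counter are all made up for illustration and do not correspond to any architecture's real page table layout.

```c
#include <stdint.h>

/* Hypothetical PTE bits, chosen for illustration only. */
#define PTE_PRESENT	0x1u
#define PTE_WRITE	0x2u
#define PTE_DIRTY	0x4u
#define PTE_YOUNG	0x8u

/* Counts how often the (expensive) write-back-and-flush path is taken. */
static unsigned pte_writes;

/* Stand-in for establish_pte(): store the entry and account for the flush. */
static void establish_pte_sim(uint32_t *pte, uint32_t entry)
{
	*pte = entry;
	pte_writes++;
}

/*
 * Returns 0 if the page was handled on the fast path, 1 if the slow
 * fault path (not modelled here) would be needed.
 */
static int touch_pte(uint32_t *pte, int write)
{
	uint32_t entry = *pte;
	int modified = 0;

	if (!(entry & PTE_PRESENT))
		return 1;
	if (write) {
		if (!(entry & PTE_WRITE))
			return 1;
		if (!(entry & PTE_DIRTY)) {
			entry |= PTE_DIRTY;
			modified = 1;
		}
	}
	if (!(entry & PTE_YOUNG)) {
		entry |= PTE_YOUNG;
		modified = 1;
	}
	if (modified)
		establish_pte_sim(pte, entry);
	return 0;
}
```

For an entry that is already present, writable, dirty and young, `touch_pte()` returns without touching `pte_writes`, which is the case the patch is trying to make cheap.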
Re: pci_pool_free from IRQ
> This sure makes life difficult. Device removal events can be called
> from interrupt context according to Documentation/pci.txt. This is
> certainly a place where one might want to call pci_consistent_free.

None of our device code supports interrupt based device removal. In fact
many drivers use vmalloc directly so will hit the same problem the
pci_consistent_free hits on the ARM.

I suspect we should fix the documentation (and if need be the code) to
reflect the fact that you have to be completely out of your tree to
handle device removal in the irq handler
Re: [patch] 2.4.4: mmap() fails for certain legal requests
On Tue, 8 May 2001, David S. Miller wrote:

> There are several get_unmapped_area() implementations besides the
> standard one (search for HAVE_ARCH_UNMAPPED_AREA). Please fix
> them up too.

Yep, I know (ia64 and sparc*). But being lazy enough (and being short on
time) I won't do it until I know the idea of the change is accepted. I'm
sorry -- I sent previous versions of the patch twice since last Summer
with no response at all and doing bits no one is interested in is a
waste of time.

Thanks for your response, though -- maybe there is someone interested,
after all.

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+
Re:
"Richard B. Johnson" wrote:
>
> To driver wizards:
>
> I have a driver which needs to wait for some hardware.
> Basically, it needs to have some code added to the run-queue
> so it can get some CPU time even though it's not being called.
>
> It needs to get some CPU time which can be "turned on" or
> "turned off" as a result of an interrupt or some external
> input from an ioctl().
>
> So I thought that the "tasklet" would be ideal. However, the
> scheduler "thinks" that a tasklet is an interrupt, so any
> attempt to sleep in the tasklet results in a kernel panic,
> "ieee scheduling in an interrupt..., BUG sched.c line 688".
>
> Next, I added code to try queue_task(). This has the same problem.
>
> Basically the procedure needs to do:
>
> procedure()
> {
>     if(some_event)
>         schedule_timeout(n); /* Needs to sleep */
>     else if(something_else)
>         do_something();
>     queue_task(procedure, _immediate); /* Needs to queue itself again */
> }
>
> Since I'm running against a time-line, I temporarily gave the module
> some CPU time through an ioctl(), i.e., a separate task that does nothing
> except repeatably execute ioctl(GIVE_CPU, NULL); This shows that the
> driver actually works. It's a GPIB driver so it needs to get the
> CPU to find out if it's addressed to listen, etc. These events don't
> produce interrupts.
>
> So, what am I supposed to do to add a piece of driver code to the
> run queue so it gets scheduled occasionally?
>
> Cheers,
> Dick Johnson

How about something like:

#include

void queue_task(void process_timeout(void), unsigned long timeout,
                struct timer_list *timer, unsigned long data)
{
        unsigned long expire = timeout + jiffies;

        init_timer();
        timer->expires = expire;
        timer->data = data;
        timer->function = process_timeout;

        add_timer();
}

You will have to define the "struct timer_list timer". This should cause
the function passed to be called after "timeout" jiffies (1/HZ, not to
be confused with 10 ms). If you want to stop the timer early do:

        del_timer_sync();

"data" was not used in your example, but process_timeout will be passed
"data" when it is called. This routine is called as part of the timer
interrupt, so it must be fast and should not do schedule() calls. It
could queue a tasklet, however, to relax constraints a bit.

George
Re: REVISED: Experimentation with Athlon and fast_page_copy
In article <[EMAIL PROTECTED]> you wrote:

> Arjan - care to unroll the tail 320 bytes of copying from the main loop ?

I'll see what I can do to make us not lose too much speed.
Re: Athlon and fast_page_copy: What's it worth ? :)
In article <[EMAIL PROTECTED]> you wrote:
> Hi,
> Before I go any further with this investigation, I'd like to get an
> idea of how much of a performance improvement the K7 fast_page_copy
> will give me.
> Can someone suggest the best benchmark to test the speed of this
> routine?

http://www.fenrus.demon.nl/athlon.c

is a userspace benchmark of the current code vs C etc
Re: pci_pool_free from IRQ
Pete Zaitcev writes:
> Russell King complained that you might be calling pci_consistent_free
> from an interrupt, which is unsafe on ARM.

This sure makes life difficult. Device removal events can be called from
interrupt context according to Documentation/pci.txt. This is certainly
a place where one might want to call pci_consistent_free.
pci_pool_free from IRQ
David,

Russell King complained that you might be calling pci_consistent_free
from an interrupt, which is unsafe on ARM. Why don't you remove this
part from pci_pool_free():

+	else if (!is_page_busy (pool->blocks_per_page, page->bitmap))
+		pool_free_page (pool, page);

In that case, fully free pages will stick about until the whole pool is
destroyed, which I think is not a big deal.

-- Pete
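To make the trade-off in the suggestion above concrete, here is a minimal userspace sketch of a block pool whose free path only clears a bitmap bit and never releases the backing page; pages go away only when the pool is destroyed. Everything here (`struct pool`, `pool_alloc()`, `pool_free()`, `pool_destroy()`, the block size) is hypothetical and has nothing to do with the real pci_pool implementation beyond the general shape.

```c
#include <stdint.h>
#include <stdlib.h>

#define BLOCKS_PER_PAGE	8
#define BLOCK_BYTES	64

struct pool_page {
	struct pool_page *next;
	unsigned char used;	/* bitmap: bit i set means block i is allocated */
	char blocks[BLOCKS_PER_PAGE][BLOCK_BYTES];
};

struct pool {
	struct pool_page *pages;
	int page_count;
};

static void *pool_alloc(struct pool *pool)
{
	struct pool_page *pg;

	/* Reuse a free block from an existing page first. */
	for (pg = pool->pages; pg; pg = pg->next)
		for (int i = 0; i < BLOCKS_PER_PAGE; i++)
			if (!(pg->used & (1u << i))) {
				pg->used |= (unsigned char)(1u << i);
				return pg->blocks[i];
			}

	pg = calloc(1, sizeof(*pg));
	if (!pg)
		return NULL;
	pg->next = pool->pages;
	pool->pages = pg;
	pool->page_count++;
	pg->used = 1;
	return pg->blocks[0];
}

/* Only marks the block free; a now-empty page is deliberately kept. */
static void pool_free(struct pool *pool, void *block)
{
	uintptr_t p = (uintptr_t)block;

	for (struct pool_page *pg = pool->pages; pg; pg = pg->next) {
		uintptr_t base = (uintptr_t)pg->blocks;

		if (p >= base && p < base + sizeof(pg->blocks)) {
			pg->used &= (unsigned char)~(1u << ((p - base) / BLOCK_BYTES));
			return;
		}
	}
}

/* The only place pages are handed back to the allocator. */
static void pool_destroy(struct pool *pool)
{
	while (pool->pages) {
		struct pool_page *pg = pool->pages;

		pool->pages = pg->next;
		free(pg);
		pool->page_count--;
	}
}
```

Because `pool_free()` never calls the page allocator, it does nothing that would be unsafe from a context where freeing memory is not allowed; the cost is that an idle pool keeps its high-water mark of pages.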
Re: your mail
On Tue, 8 May 2001, Alan Cox wrote:

> > I have a driver which needs to wait for some hardware.
> > Basically, it needs to have some code added to the run-queue
> > so it can get some CPU time even though it's not being called.
>
> Why does it have to wait ? Why can't it just poll and come back next time ?
>

Good question. I wanted to be able to call the exact same routine(s)
that other routines (executed from read() and write()) execute. These
routines are complex and sleep while waiting for events. I didn't want
to duplicate that code with different time-out mechanisms.

GPIB is nasty because you can't do anything unless the 'controller'
tells you to do it. When "addressed to talk", you have to parse all the
stuff sent via interrupt (ATN bit set, control byte, which address from
the control byte, etc.), then let somebody sleeping in poll() know that
they can now "write()". That can all be handled via interrupt.

But, now for the receive. The user-mode code needs to be sleeping until
some data are available. That data may never be available. Something in
the driver needs to wait until the hardware is addressed to receive.
Since it is not now receiving, there is no interrupt! It takes time for
the controller to tell you to listen and then tell somebody else to talk
to you. This means that I need some timeout to recover from the fact
that the other guy may never talk.

Once the other guy starts sending data, the interrupts can be used to
handle the data and, once there are valid data, the device owner can be
awakened, presumably sleeping in poll() or select(). It's the
intermediate time where there are no interrupts that needs the CPU to
determine that we've waited too long for interrupts so the device had
better get off the bus to start the error recovery procedure.

Bright and early tomorrow, I will check out both ways. A kernel thread
might be "neat". However, I may just look to see if I can just poll
while using existing code.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
Re: OT: ps source?
Pierre Rousselet writes:
> James Bourne wrote:

>> From the procps man page:
>>    Albert Cahalan <[EMAIL PROTECTED]> rewrote ps for full
>>    Unix98 and BSD support, along with some ugly hacks for
>>    obsolete and foreign syntax.
>>
>>    Michael K. Johnson <[EMAIL PROTECTED]> is the current
>>    maintainer.

There has been a bit of a fork actually... sorry.

> Right. For international support procps-2.0.7 is the one to choose with
> the patch procps-2.0.7-intl.patch.

That one is quite buggy. The parser is broken ("ps -o %p" fails), you
can get a core dump if you get unlucky with the System.map file, the
BSD-style process selection is incorrect... I've fixed about 100 bugs
and introduced only a few.

What you really ought to use is the Debian package. That gives you my
source plus a few fixes that I don't have yet. Head over to
www.debian.org and drill down to the "unstable" package. There you will
find a source tarball and a patch file for it.
Re: Patch to improve readability of sock_rcvlowat() - comments wanted...
Ronald Bultje wrote:

> On 2001.05.08 01:04:57 +0200 Jesper Juhl wrote:
>
>> static inline int sock_rcvlowat(struct sock *sk, int waitall, int len)
>> {
>>     int r = len;
>>     if (!waitall)
>>         r = min(sk->rcvlowat, len);
>>     return max(1,r);
>> }
>>
>
>
> return max(1, waitall ? len : min(sk->rcvlowat, len));
>
> Although I doubt this is more readable... :-)
>

IMO your version is less readable than the 4-liner above, and the code
it generates is a lot bigger than both the original and the proposed
replacement - but thank you for the suggestion...

- Jesper Juhl - [EMAIL PROTECTED]
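For anyone wanting to convince themselves that the two forms quoted above compute the same value, here is a small userspace sketch. It is not the kernel code: `rcvlowat` is passed in as a plain int instead of being read from `struct sock`, and `min_int()`/`max_int()` are local helpers standing in for the kernel's `min()`/`max()`.

```c
/* Local helpers standing in for the kernel's min()/max(). */
static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* The four-line form: start from len, clamp to rcvlowat unless waitall. */
static int rcvlowat_stepwise(int rcvlowat, int waitall, int len)
{
	int r = len;

	if (!waitall)
		r = min_int(rcvlowat, len);
	return max_int(1, r);
}

/* The one-expression form using the conditional operator. */
static int rcvlowat_oneline(int rcvlowat, int waitall, int len)
{
	return max_int(1, waitall ? len : min_int(rcvlowat, len));
}
```

Both return at least 1, return `len` when `waitall` is set, and otherwise return the smaller of `rcvlowat` and `len`. Which one generates smaller code depends on the compiler and is not something this sketch measures.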
Re: [PATCH][RFT] smbfs bugfixes for 2.4.4
On 7 May 2001, Linus Torvalds wrote:

> It has code to do that in smb_revalidate_inode(), but it may be that
> something else refreshes the inode size _without_ doing the proper
> invalidation checks. Or maybe Urban broke that logic by mistake while
> fixing the other one ;)

No, I broke it when copying the ncpfs dircache code.

That code will reuse an old inode if it already exists (and thus also
any pages attached to it), which is what I wanted and should be fine
except that it needs to invalidate_inode_pages() if something changed.

Xuan and James, you have both seen this bug with smbfs not properly
handling changes made on the server. Could you please test this patch
vs 2.4.4 and let me know if it helps or not.

http://www.hojdpunkten.ac.se/054/samba/smbfs-2.4.4-truncate+retry-4.patch

(Apply with 'patch -p1' in the linux/ source dir)

/Urban
Re: [RFC] Direct Sockets Support??
> a. when a user app wants to receive some data, it allocates
> memory(using malloc) and waits for the hw to do zero-copy read. The kernel
> does not allocate physical page frames for the entire memory region
> allocated. We need to lock the memory (and locking is expensive due to
> costly TLB flushes) to do this
>
> b. when a user app wants to send data, he fills the buffer
> and waits for the hw to transmit data, but under heavy physical memory
> pressure, the swapper might swap the pages we want to transmit. So we need
> to lock the memory to be 100% sure.
>

Or c) you prealloc two ring buffers.
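As a rough illustration of option c), here is a minimal fixed-size ring buffer in userspace C. It is a sketch under assumptions not stated in the thread: one such ring would be set up per direction, the storage is allocated once up front (so any locking of the memory would happen once at setup rather than per transfer), and `struct ring`, `ring_put()` and `ring_get()` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define RING_SLOTS	4	/* must be a power of two */
#define SLOT_BYTES	2048

struct ring {
	unsigned head;			/* next slot the producer fills */
	unsigned tail;			/* next slot the consumer drains */
	size_t len[RING_SLOTS];		/* bytes stored in each slot */
	unsigned char slot[RING_SLOTS][SLOT_BYTES];
};

/* Copy n bytes into the next free slot; -1 if the ring is full or n is too big. */
static int ring_put(struct ring *r, const void *data, size_t n)
{
	if (n > SLOT_BYTES || r->head - r->tail == RING_SLOTS)
		return -1;
	memcpy(r->slot[r->head % RING_SLOTS], data, n);
	r->len[r->head % RING_SLOTS] = n;
	r->head++;
	return 0;
}

/* Copy the oldest slot into out; returns its length, or -1 if empty or too small. */
static long ring_get(struct ring *r, void *out, size_t cap)
{
	size_t n;

	if (r->head == r->tail)
		return -1;
	n = r->len[r->tail % RING_SLOTS];
	if (n > cap)
		return -1;
	memcpy(out, r->slot[r->tail % RING_SLOTS], n);
	r->tail++;
	return (long)n;
}
```

The sketch is single-threaded; a real producer/consumer pair would need memory barriers or locking around `head` and `tail`, which is left out here.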
Re: your mail
> I have a driver which needs to wait for some hardware.
> Basically, it needs to have some code added to the run-queue
> so it can get some CPU time even though it's not being called.

Why does it have to wait ? Why can't it just poll and come back next time ?
Re: [PATCH] allocation looping + kswapd CPU cycles
Marcelo Tosatti writes:
> On Tue, 8 May 2001, Mark Hemment wrote:
> > Does anyone know why the 2.4.3pre6 change was made?
>
> Because wakeup_bdflush(0) can wakeup bdflush _even_ if it does not have
> any job to do (ie less than 30% dirty buffers in the default config).

Actually, the change was made because it is illogical to try only once
on multi-order pages. Especially because we depend upon order 1 pages so
much (every task struct allocated). We depend upon them even more so on
sparc64 (certain kinds of page tables need to be allocated as 1 order
pages).

The old code failed _far_ too easily, it was unacceptable.

Why put some strange limit in there? Whatever number you pick is
arbitrary, and I can probably piece together an allocation state where
the chosen limit is too small.

So instead, you could test for the condition that prevents any possible
forward progress, no?

Later,
David S. Miller
[EMAIL PROTECTED]
[patch] 2.4.4: mmap() fails for certain legal requests
Hi,

The mmap() call fails when addr is specified, MAP_FIXED is cleared in
flags and no address space can be allocated either at addr or above it.
This is a legal request and it should not fail as long as there is space
available below addr.

Following is a patch that fixes the problem. This is nothing new -- I
already submitted a similar patch against 2.4.0-test4 once upon a time.
This patch is clean(er), though, and I believe it can be safely applied
to the upcoming 2.4.5 release.

A simple test case to trigger the current mmap() bad behaviour for
32-bit CPUs is something like:

	fd = open("/dev/zero", O_RDONLY);
	p = mmap((void *)0xf000, 4096, PROT_READ, MAP_SHARED, fd, 0);

With my patch the code does not fail anymore -- p is set to an available
address lower than 0xf000.

The bug was discovered when tracking down the reason of dlopen()
failures when called from statically linked binaries on MIPS/Linux. The
patch fixes them.

  Maciej

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+

diff -up --recursive --new-file linux-2.4.4.macro/mm/mmap.c linux-2.4.4/mm/mmap.c
--- linux-2.4.4.macro/mm/mmap.c	Tue May  1 17:24:25 2001
+++ linux-2.4.4/mm/mmap.c	Tue May  1 18:23:25 2001
@@ -219,7 +219,7 @@ unsigned long do_mmap_pgoff(struct file
 	if ((len = PAGE_ALIGN(len)) == 0)
 		return addr;
 
-	if (len > TASK_SIZE || addr > TASK_SIZE-len)
+	if (len > TASK_SIZE)
 		return -EINVAL;
 
 	/* offset overflow? */
@@ -405,9 +405,15 @@ static inline unsigned long arch_get_unm
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
-	if (!addr)
-		addr = TASK_UNMAPPED_BASE;
-	addr = PAGE_ALIGN(addr);
+
+	if (addr) {
+		addr = PAGE_ALIGN(addr);
+		vma = find_vma(current->mm, addr);
+		if (TASK_SIZE - len >= addr &&
+		    (!vma || addr + len <= vma->vm_start))
+			return addr;
+	}
+	addr = PAGE_ALIGN(TASK_UNMAPPED_BASE);
 
 	for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
 		/* At this point: (!vma || addr < vma->vm_end). */
@@ -425,6 +431,8 @@ extern unsigned long arch_get_unmapped_a
 unsigned long get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	if (flags & MAP_FIXED) {
+		if (addr > TASK_SIZE - len)
+			return -EINVAL;
 		if (addr & ~PAGE_MASK)
 			return -EINVAL;
 		return addr;
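The two-line test case in the message above can be wrapped into a compilable helper. This is a sketch, not the author's test program: `map_with_hint()` is a hypothetical name, and the caller supplies the hint address, which only makes the original failure reproducible on a 32-bit kernel of that era when nothing at or above the hint is free.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of /dev/zero read-only, passing hint as the preferred
 * address but without MAP_FIXED, so the kernel is free to place the
 * mapping elsewhere. Returns MAP_FAILED on any error.
 */
static void *map_with_hint(unsigned long hint, size_t len)
{
	void *p;
	int fd = open("/dev/zero", O_RDONLY);

	if (fd < 0)
		return MAP_FAILED;
	p = mmap((void *)hint, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the descriptor is closed */
	return p;
}
```

A caller would compare the result against `MAP_FAILED` and, since `MAP_FIXED` is not set, must not assume the returned address equals the hint.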
Re: your mail
On Tue, May 08 2001, Richard B. Johnson wrote:
> > Use a kernel thread? If you don't need to access user space, context
> > switches are very cheap.
> >
> > > So, what am I supposed to do to add a piece of driver code to the
> > > run queue so it gets scheduled occasionally?
> >
> > Several, grep for kernel_thread.
> >
> > --
> > Jens Axboe
> >
>
> Okay. Thanks. I thought I would have to do that too. No problem.

A small worker thread and a wait queue to sleep on and you are all set,
10 minutes tops :-)

> It's a "tomorrow" thing. Ten hours is too long to stare at a
> screen.

Sissy!

--
Jens Axboe
RE: [RFC] Direct Sockets Support??
> But in the case of an application which fits in main memory, and
> has been running for a while (so all pages are present and
> dirty), all you'd really have to do is verify the page tables are
> in the proper state and skip the TLB flush, right?

We really cannot assume this. There are two cases

a. when a user app wants to receive some data, it allocates memory
   (using malloc) and waits for the hw to do zero-copy read. The kernel
   does not allocate physical page frames for the entire memory region
   allocated. We need to lock the memory (and locking is expensive due
   to costly TLB flushes) to do this

b. when a user app wants to send data, he fills the buffer and waits for
   the hw to transmit data, but under heavy physical memory pressure,
   the swapper might swap the pages we want to transmit. So we need to
   lock the memory to be 100% sure.
Re: your mail
On Tue, 8 May 2001, Jens Axboe wrote:

> On Tue, May 08 2001, Richard B. Johnson wrote:
> >
> > To driver wizards:
> >
> > I have a driver which needs to wait for some hardware.
> > Basically, it needs to have some code added to the run-queue
> > so it can get some CPU time even though it's not being called.
> >
> > It needs to get some CPU time which can be "turned on" or
> > "turned off" as a result of an interrupt or some external
> > input from an ioctl().
> >
> > So I thought that the "tasklet" would be ideal. However, the
> > scheduler "thinks" that a tasklet is an interrupt, so any
> > attempt to sleep in the tasklet results in a kernel panic,
> > "ieee scheduling in an interrupt..., BUG sched.c line 688".
>
> Use a kernel thread? If you don't need to access user space, context
> switches are very cheap.
>
> > So, what am I supposed to do to add a piece of driver code to the
> > run queue so it gets scheduled occasionally?
>
> Several, grep for kernel_thread.
>
> --
> Jens Axboe
>

Okay. Thanks. I thought I would have to do that too. No problem.

It's a "tomorrow" thing. Ten hours is too long to stare at a screen.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
Re: your mail
On Tue, May 08 2001, Richard B. Johnson wrote:
>
> To driver wizards:
>
> I have a driver which needs to wait for some hardware.
> Basically, it needs to have some code added to the run-queue
> so it can get some CPU time even though it's not being called.
>
> It needs to get some CPU time which can be "turned on" or
> "turned off" as a result of an interrupt or some external
> input from an ioctl().
>
> So I thought that the "tasklet" would be ideal. However, the
> scheduler "thinks" that a tasklet is an interrupt, so any
> attempt to sleep in the tasklet results in a kernel panic,
> "ieee scheduling in an interrupt..., BUG sched.c line 688".

Use a kernel thread? If you don't need to access user space, context
switches are very cheap.

> So, what am I supposed to do to add a piece of driver code to the
> run queue so it gets scheduled occasionally?

Several, grep for kernel_thread.

--
Jens Axboe
oddity with page_launder() handling of dirty pages
Hi,

I was just wondering how bad the current way of writing out dirty pages
is wrt multiple page_launder() users.

We don't remove a dirty page from the inactive dirty list when writing it
out (as opposed to "direct" page->buffers ll_rw_block() IO).

When we have multiple users inside page_launder(), that means a dirty
page which is being written out (and has an additional reference taken by
the writer) but has no page->buffers mapping yet will be moved to the
beginning of the active list and kept there until the reference is
released by the writer (since refill_inactive_scan() will not move it
back to the inactive dirty list because of the extra reference).

Remember that we limit the amount of swap writeouts at rw_swap_page(), so
any writepage() which blocks there will have its page moved to the
_beginning_ of the active list because it has no page->buffers yet.

Linus, since you wrote that part of the code, I ask you: do you have any
reason not to remove a page being writepage()'d from the
inactive_dirty_list to avoid this kind of problem? (The page must be
added back to the inactive_dirty_list again after the writeout, yes.)
[i386 arch] MTRR messages significant?
I've been seeing these for a while now (2.4.4 - <=2.4.2), also
coincidental with a change to XFree86 X 4.0.3 from "MetroX" in the same
time frame. I am not sure exactly when they started, but was wondering if
they were significant. It seems some app is trying to delete or modify
something.

On console and in syslog:

    mtrr: no MTRR for fd00,80 found
    mtrr: MTRR 1 not used
    mtrr: reg 1 not used

while /proc/mtrr currently contains:

    reg00: base=0x00000000 (   0MB), size= 512MB: write-back, count=1
    reg01: base=0xfd000000 (4048MB), size=   8MB: write-combining, count=1

Could it be the X server trying to delete a segment when it starts up or
shuts down? Is it an error in the X server to try to delete a nonexistent
segment? Does the kernel 'care'? I.e., why is it printing out messages?
Are they debug messages that perhaps should be off by default?

Concurrent with these messages, and perhaps unrelated, is a new,
unwelcome behavior of X dying on display of some Netscape-rendered
websites (cf. it doesn't die under konqueror).

thanks,
-linda
Re: [RFC] Direct Sockets Support??
[EMAIL PROTECTED] said:
> > A couple of concerns I have:
> > * How to pin or pagelock the application buffer without
> > making a kernel transition.
>
> You need to pin them in advance. And pinning pages is _expensive_ so you dont
> want to keep pinning/unpinning pages

I can't convince myself why this has to be so expensive. The current
implementation does this for mlock:

1. Split vma if only a subset of the pages are being locked.
2. Mark bit in vma.
3. Make sure the pages are in core.

That third step has the potential of being the most expensive, as
changing the page tables requires invalidating the TLBs on all
processors. Currently make_pages_present() does the work for 3.

But in the case of an application which fits in main memory, and has been
running for a while (so all pages are present and dirty), all you'd
really have to do is verify the page tables are in the proper state and
skip the TLB flush, right? Then 3 turns into a single spin_lock pair for
the page_table_lock, and walking down the page table.

The VMA splitting can be nasty, as it might require a couple of slab
allocations, and doing an AVL insertion. (More nastiness in the case of
shared memory or file mapping, too.) But nothing like playing with TLBs.

Any reason why make_pages_present() is not the really oversized hammer it
seems to be?

-- Pete
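For readers who want to see "pin them in advance" from the application side, here is a minimal user-space sketch using the POSIX mlock() call. It is not the kernel-side code discussed above; the helper names alloc_pinned() and free_pinned() are made up for this illustration, and mlock() may legitimately be refused (for example by RLIMIT_MEMLOCK), which the sketch reports rather than treats as fatal.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate a page-aligned buffer, touch it so
 * that physical frames exist, then try to pin it with mlock().
 * Returns NULL if the allocation fails. *locked is set to 1 if
 * mlock() succeeded and 0 if it was refused.
 */
static void *alloc_pinned(size_t len, int *locked)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t rounded;

    assert(locked != NULL);
    *locked = 0;
    if (page <= 0 || len == 0)
        return NULL;

    /* Round the request up to a whole number of pages. */
    rounded = (len + (size_t)page - 1) / (size_t)page * (size_t)page;
    if (posix_memalign(&buf, (size_t)page, rounded) != 0)
        return NULL;

    /* malloc-style allocations are typically not backed by frames yet. */
    memset(buf, 0, rounded);

    /* Pin once, up front, instead of around every transfer. */
    if (mlock(buf, rounded) == 0)
        *locked = 1;
    return buf;
}

/* Hypothetical counterpart: unpin (if pinned) and release the buffer. */
static void free_pinned(void *buf, size_t len, int locked)
{
    if (buf == NULL)
        return;
    if (locked)
        munlock(buf, len);
    free(buf);
}
```

A caller would do this once at startup and reuse the buffer for every send or receive, which is the point made in the quoted reply: the cost being argued about is paid when pinning, so it should not be paid per transfer.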
No Subject
To driver wizards:

I have a driver which needs to wait for some hardware. Basically, it
needs to have some code added to the run-queue so it can get some CPU
time even though it's not being called.

It needs to get some CPU time which can be "turned on" or "turned off"
as a result of an interrupt or some external input from an ioctl().

So I thought that the "tasklet" would be ideal. However, the scheduler
"thinks" that a tasklet is an interrupt, so any attempt to sleep in the
tasklet results in a kernel panic, "ieee scheduling in an interrupt...,
BUG sched.c line 688".

Next, I added code to try queue_task(). This has the same problem.

Basically the procedure needs to do:

    procedure()
    {
        if (some_event)
            schedule_timeout(n);            /* Needs to sleep */
        else if (something_else)
            do_something();
        queue_task(procedure, _immediate);  /* Needs to queue itself again */
    }

Since I'm running against a time-line, I temporarily gave the module some
CPU time through an ioctl(), i.e., a separate task that does nothing
except repeatedly execute ioctl(GIVE_CPU, NULL); This shows that the
driver actually works.

It's a GPIB driver so it needs to get the CPU to find out if it's
addressed to listen, etc. These events don't produce interrupts.

So, what am I supposed to do to add a piece of driver code to the run
queue so it gets scheduled occasionally?

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
Re: [PATCH] allocation looping + kswapd CPU cycles
On Tue, May 08 2001, Marcelo Tosatti wrote:
> > The attached patch (against 2.4.5-pre1) fixes the looping symptom, by
> > adding a counter and looping only twice for non-zero order allocations.
>
> Looks good. (actually Rik had a patch similar to this which fixed a real
> case with cdda2wav just like you described)

Not cdda2wav, I presume, but the optimization discussed here before that
wasn't really doable because of the vm behaviour when doing

    do
        try to alloc some amount of contiguous pages
        if (ok)
            break
        lower number of pages wanted
    while true

CDROMREADAUDIO stopped doing this and fell back to single cdda frame size
allocations because of these failures, even though it meant a huge
decrease in speed. cdda2wav will ask for iirc 16 frames at a time; the
current driver will try to do 8 first and then fall back to slower
extraction if allocations fail.

--
Jens Axboe
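The "try big, then lower the number wanted" loop above can be sketched in user space as follows. This is only an illustration under stated assumptions: the allocator is passed in as a callback so that failures can be simulated, halving is one possible way to "lower number of pages wanted" (the message does not say how), and all names here (alloc_frames_with_fallback, limited_alloc, limited_try) are invented for the example. 2352 bytes is the size of one raw CD-DA frame.

```c
#include <assert.h>
#include <stddef.h>

#define CDDA_FRAME_SIZE 2352u  /* bytes in one raw CD-DA frame */

/* Hypothetical allocator hook: returns a buffer or NULL. */
typedef void *(*alloc_fn)(size_t bytes, void *ctx);

/*
 * Try to get a buffer for `want` frames; on failure halve the request
 * until a single frame has also failed. *got receives the number of
 * frames the returned buffer can hold (0 on total failure).
 */
static void *alloc_frames_with_fallback(size_t want, size_t frame_size,
                                        size_t *got, alloc_fn alloc,
                                        void *ctx)
{
    size_t n;

    assert(got != NULL && alloc != NULL);
    for (n = want; n >= 1; n /= 2) {
        void *buf = alloc(n * frame_size, ctx);
        if (buf != NULL) {
            *got = n;
            return buf;
        }
    }
    *got = 0;
    return NULL;
}

/* Test double: refuses any request larger than max_bytes. */
struct limited_alloc {
    size_t max_bytes;
    unsigned int calls;  /* how many requests were made */
    char token;          /* address handed out on "success" */
};

static void *limited_try(size_t bytes, void *ctx)
{
    struct limited_alloc *a = ctx;

    a->calls++;
    return bytes <= a->max_bytes ? &a->token : NULL;
}
```

With an allocator that can only satisfy two frames, a request for 8 frames is tried as 8, 4, then 2, and the caller ends up with a 2-frame buffer: slower extraction, but no failure.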
Re: write to dvd ram
On Tue, May 08 2001, Thiago Vinhas de Moraes wrote:
>
> Hi!
>
> Can this new UDF driver do cd-rewriting ?

No not in itself, but you can give the pktcdvd module a shot. It can do
rw CD-RW mount so far, at least.

*.kernel.org/pub/linux/kernel/people/axboe/packet/

There's a packet-writing mailing list for the above patch, there is more
info in the tar file above (subscribe info, archives, resources, etc).

--
Jens Axboe
Re: [PATCH] allocation looping + kswapd CPU cycles
On Tue, 8 May 2001, Mark Hemment wrote:

>
> In 2.4.3pre6, code in page_alloc.c:__alloc_pages(), changed from;
>
> 	try_to_free_pages(gfp_mask);
> 	wakeup_bdflush();
> 	if (!order)
> 		goto try_again;
> to
> 	try_to_free_pages(gfp_mask);
> 	wakeup_bdflush();
> 	goto try_again;
>
>
> This introduced the effect of a non-zero order, __GFP_WAIT allocation
> (without PF_MEMALLOC set), never returning failure. The allocation keeps
> looping in __alloc_pages(), kicking kswapd, until the allocation succeeds.
>
> If there is plenty of memory in the free-pools and inactive-lists
> free_shortage() will return false, causing the state of these
> free-pools/inactive-lists not to be 'improved' by kswapd.
>
> If there is nothing else changing/improving the free-pools or
> inactive-lists, the allocation loops forever (kicking kswapd).
>
> Does anyone know why the 2.4.3pre6 change was made?

Because wakeup_bdflush(0) can wakeup bdflush _even_ if it does not have
any job to do (ie less than 30% dirty buffers in the default config).

>
> The attached patch (against 2.4.5-pre1) fixes the looping symptom, by
> adding a counter and looping only twice for non-zero order allocations.

Looks good. (actually Rik had a patch similar to this which fixed a real
case with cdda2wav just like you described)
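The attached patch itself is not reproduced in this archive, so the following is only a user-space sketch of the idea as described: order-0 requests keep retrying as before, while non-zero order requests give up after a bounded number of retries. Reading "looping only twice" as "two retries after the first attempt" is an assumption, and every name below (alloc_bounded, fake_pool, and so on) is hypothetical; none of it is the kernel's __alloc_pages() code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HIGH_ORDER_RETRIES 2u  /* assumed reading of "looping only twice" */

/* Hypothetical hooks standing in for one allocation pass and for reclaim. */
typedef void *(*try_alloc_fn)(unsigned int order, void *ctx);
typedef void (*reclaim_fn)(void *ctx);

static void *alloc_bounded(unsigned int order, try_alloc_fn try_alloc,
                           reclaim_fn reclaim, void *ctx)
{
    unsigned int retries = 0;

    assert(try_alloc != NULL && reclaim != NULL);
    for (;;) {
        void *p = try_alloc(order, ctx);
        if (p != NULL)
            return p;

        reclaim(ctx);  /* plays the role of try_to_free_pages() */

        /* Order 0 keeps looping; larger orders may now fail. */
        if (order != 0 && ++retries > MAX_HIGH_ORDER_RETRIES)
            return NULL;
    }
}

/* Test double: succeeds on the Nth attempt, or never if succeed_on is 0. */
struct fake_pool {
    unsigned int attempts;
    unsigned int reclaims;
    unsigned int succeed_on;
    char block;
};

static void *fake_try_alloc(unsigned int order, void *ctx)
{
    struct fake_pool *pool = ctx;

    (void)order;
    pool->attempts++;
    if (pool->succeed_on != 0 && pool->attempts >= pool->succeed_on)
        return &pool->block;
    return NULL;
}

static void fake_reclaim(void *ctx)
{
    struct fake_pool *pool = ctx;

    pool->reclaims++;
}
```

In this sketch a high-order request against a pool that never improves returns NULL after three attempts instead of spinning forever, which is the symptom the quoted report describes.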
Re: write to dvd ram
Hi!

Can this new UDF driver do cd-rewriting ?

On Tue 08 May 2001 14:50, Jens Axboe wrote:
> On Tue, May 08 2001, Ben Fennema wrote:
> > > The log is:
> > > Apr 15 20:58:27 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29)
> > > Mounting volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)
> >
> > At the very least, run 0.9.3 from sourceforce (or the cvs version) and
> > see if it works any better.
>
> I was just about to say the same thing, 0.9.3 works well for me. In fact
> so well, that I made a patch to bring 2.4.5-pre1 UDF up to date with
> current CVS earlier this afternoon (hint hint, Ben :-).
>
> *.kernel.org/pub/linux/kernel/people/axboe/patches/2.4.5-pre1/
>
> udf-0.9.3-2.4.5p1-1.bz2

--
Thiago Vinhas de Moraes
NetWorx - A SuaCompanhia.com
Re: LSB 0.9 public draft
On Tue, May 08, 2001 at 04:29:37PM +0100, Alan Cox wrote:

> To make sure this gets enough publicity and eyes on it..
>
> http://www.linuxbase.org/spec/lsbreview.html

Yes. Lots of tiny inaccuracies. And no email address.
(But a form with the mysterious button "Change".)
Problem: 'keyboard: Timeout - AT keyboard not present?'
I have been encountering the following problem for quite a while now (in
2.4 pre kernels, and 2.4.x final kernels), and from what I have been able
to determine, it has affected people since 2.3.4x or so, and is also
affecting 2.2.17 and above.

The problem is that once in a while (the interval varies greatly and
doesn't appear at all consistent), the keyboard will lock up for a second
or two and the kernel prints the message 'keyboard: Timeout - AT keyboard
not present?'. This almost always involves getting an extra character
(usually the one just hit or the one before it), or missing the character
just hit. It can even occur when not typing at all, although that is much
more rare.

I used to think it was just a problem in the keyboard controller on my
machine, but I now have it happening on the machine I just switched to,
and have found a number of other posts about this problem by searching
www.google.com for the string in the error message. I no longer think it
is necessarily a hardware problem (although I can't rule out this error
being caused by flaky keyboard controllers). The same machines that
display this error have never once done so with 2.2.16 or lower.

Some interesting trends I found while searching for any info on this
message on google are these:

    initialize_kbd: Keyboard reset failed, no ACK
    pty: 256 Unix98 ptys configured
    keyboard: Timeout - AT keyboard not present?
    keyboard: Timeout - AT keyboard not present?
    RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize

    initialize_kbd: Keyboard interface failed self test
    pty: 256 Unix98 ptys configured
    keyboard: Timeout - AT keyboard not present?
    keyboard: Timeout - AT keyboard not present?
    Floppy drive(s): fd0 is 2.88M

    initialize_kbd: Keyboard reset failed, no ACK
    Serial driver version 4.92 (2000-1-27) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled
    keyboard: Timeout - AT keyboard not present?
    keyboard: Timeout - AT keyboard not present?
    ttyS00 at 0x03f8 (irq = 4) is a 16550A

and:

    PIIX4: IDE controller on PCI bus 00 dev f9
    PIIX4: chipset revision 1
    PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
    keyboard: Timeout - AT keyboard not present?
    keyboard: Timeout - AT keyboard not present?
    hda: Maxtor 91366U4, ATA DISK drive

    PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x84c0-0x84c7, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0x84c8-0x84cf, BIOS settings: hdc:pio, hdd:DMA
    keyboard: Timeout - AT keyboard not present?
    keyboard: Timeout - AT keyboard not present?
    hda: QUANTUM FIREBALLP LM15, ATA DISK drive

    SIS5513: chipset revision 208
    SIS5513: not 100% native mode: will probe irqs later
    SiS5597
    ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:DMA, hdd:pio
    keyboard: Timeout - AT keyboard not present?
    keyboard: Timeout - AT keyboard not present?
    hda: ST5660A, ATA DISK drive

It seems that it tends to occur just after the ide controller is
detected, and/or just around the unix98 pty init (which is right around
the serial port init). I am not sure whether the probing of hardware
involves interrupts being disabled, and whether that is causing the
problem. It of course also happens a lot while I am typing, so I am not
sure what can be causing the interrupt losses, but it could be disk
access or power management. I will try using a kernel without any power
management and see if that makes a difference.

Len Sorensen
nfsd from kernel 2.4.4 oops
Hi,

I'm using kernel 2.4.4 cvs from SGI, with xfs. I'm getting this Oops:

    kernel: Unable to handle kernel NULL pointer dereference at virtual address 0010
    kernel: printing eip:
    kernel: c017bfd8
    kernel: *pde =
    kernel: Oops:
    kernel: CPU:0
    kernel: EIP:0010:[nfsd_findparent+120/236]
    kernel: EIP:0010:[]
    kernel: EFLAGS: 00010246
    kernel: eax: ebx: ecx: cff8d458 edx: 0010
    kernel: esi: cb22c6a0 edi: cb22c720 ebp: cb22c720 esp: ce4c9e54
    kernel: ds: 0018 es: 0018 ss: 0018
    kernel: Process nfsd (pid: 592, stackpage=ce4c9000)
    kernel: Stack: 1802280f c017c416 cb22c720 ce4cf814 1127 ce4cf804
    kernel:c03c5740 cfe3b5c8 000e ff8c c017c7c4 cfe3b400 1802280f
    kernel: 0001 ce4cf804 0008 cb1fc77c ce4cfc00 ceb7b000
    kernel: Call Trace: [find_fh_dentry+598/928] [fh_verify+612/1128] [nfsd_lookup+110/1368] [nfsd3_proc_lookup+314/332] [nfs3svc_decode_diropargs+152/268] [nfsd_dispatch+203/360] [svc_process+684/1348]
    kernel: Call Trace: [] [] [] [] [] [] []

It happens very randomly. Some people (I read this on the xfs list) get a
similar error; they also tested a clean 2.4.4 with an ext2 filesystem,
and it oopses too. I think this is related to the nfsd code (maybe the
sunrpc code), and not to the xfs code.

It always happens in the nfsd_findparent function. I enabled kdb support
and I always see the same stack trace, with the same order of function
calls.

Client machines mount exports from this server (an i386 with SMP enabled,
2 processors), and both nfs protocol versions 3 and 2 are used.

Any hints? Does someone else get a similar oops? How can I enable some
debugging in the nfsd & sunrpc code to try to see what is happening?

Thanx!

/Fermin
[EMAIL PROTECTED]
Re: page_launder() bug
[EMAIL PROTECTED] (Horst von Brand) wrote on 07.05.01 in <[EMAIL PROTECTED]>:

> "David S. Miller" <[EMAIL PROTECTED]> said:
> > Jonathan Morton writes:
> > > >-page_count(page) == (1 + !!page->buffers));
> > >
> > > Two inversions in a row?
> >
> > It is the most straightforward way to make a '1' or '0'
> > integer from the NULL state of a pointer.
>
> IMVHO, it is clearer to write:
>
> page_count(page) == 1 + (page->buffers != NULL)
>
> At least, the original poster wouldn't have wondered, and I wouldn't have
> had to think a bit to find out what it meant... If gcc generates worse code
> for this, it should be fixed.

Huh. IMO, that is significantly *less* readable. And incidentally I'd be
less certain that it actually does what you want - it is rather easy to
convince yourself that !! has to do the right thing.

MfG Kai
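For anyone following the `!!` versus `!= NULL` exchange, a small user-space sketch shows that both spellings produce the same 0-or-1 integer in C. The struct and function names below are made up for the illustration and are not kernel definitions; only the two expressions mirror the ones quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the pointer matters here. */
struct page_like {
    void *buffers;
};

/*
 * Double negation: !p is 1 for a NULL pointer and 0 otherwise, so
 * !!p is 0 for NULL and 1 for any non-NULL pointer.
 */
static int expected_count_bangbang(const struct page_like *page)
{
    assert(page != NULL);
    return 1 + !!page->buffers;
}

/* Explicit comparison: != also yields an int that is exactly 0 or 1. */
static int expected_count_compare(const struct page_like *page)
{
    assert(page != NULL);
    return 1 + (page->buffers != NULL);
}
```

Both functions return 1 when buffers is NULL and 2 otherwise, so the disagreement in the thread is about readability, not about the value computed.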
Re: PCMCIA IDE flash problem found
> Why did not you take care of the request_region() call and just disabled it?
> The ports will be considered free by the system, and another device might
> grab them later on!

Because it was one of the changes between 2.4.0 and 2.4.4. Ignore that.

								Pavel

>
> Vassilii
>
> -----Original Message-----
> From: Pavel Machek [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, May 08, 2001 8:14 AM
> To: kernel list
> Subject: PCMCIA IDE flash problem found
>
>
> Hi!
>
> 2.4.[123] changed name of ide-cs module, which means your pcmcia setup
> breaks... This is how to undo the damage. Works for me, do *not* apply
> into anything official.
>
> Pavel
>
> --- clean/drivers/ide/ide-cs.c	Sun Apr  1 00:23:29 2001
> +++ linux/drivers/ide/ide-cs.c	Tue May  8 14:06:09 2001
> @@ -95,7 +96,7 @@
>  static int ide_event(event_t event, int priority,
>  		     event_callback_args_t *args);
>
> -static dev_info_t dev_info = "ide-cs";
> +static dev_info_t dev_info = "ide_cs";
>
>  static dev_link_t *ide_attach(void);
>  static void ide_detach(dev_link_t *);
> @@ -388,9 +389,12 @@
>      MOD_DEC_USE_COUNT;
>      }
>
> +#if 0
>      request_region(link->io.BasePort1, link->io.NumPorts1,"ide-cs");
>      if (link->io.NumPorts2)
>  	request_region(link->io.BasePort2, link->io.NumPorts2,"ide-cs");
> +#endif
> +printk("Should call request_region\n");
>
>      info->ndev = 0;
>      link->dev = NULL;

--
The best software in life is free (not shareware)! Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+
Re: kdb wishlist
> * Change kdb invocation key from ^A to ^X^X^X within 3 seconds. ^A is
> used by emacs, bash, minicom etc.

Why not Alt-SysRq-D (like Debug) or so?

> * Command history. Handle up/down/left/right/delete keys. Each
> kdba_io routine is responsible for recognising the arch specific
> keys, with a common history and editting routine.

yes!

> * Clean up repeating commands. Pressing enter at the kdb prompt
> repeats the previous command, no matter what the previous command
> was. Some commands it makes no sense to repeat (bp in particular),
> for other commands you want to repeat the command but without the
> parameter (md in particular).

Should be configurable. Sometimes I accidentally hit enter or do it just
to do something...

-mirabilos
--
EA F0 FF 00 F0 #$@%CARRIER LOST
Re: write to dvd ram
Thanks, I'll try it. Didn't get the prior response.

--
C.

The best way out is always through.
      - Robert Frost  A Servant to Servants, 1914

Jens Axboe wrote:

> On Tue, May 08 2001, Ben Fennema wrote:
> > > The log is:
> > > Apr 15 20:58:27 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29) Mounting
> > > volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)
> >
> > At the very least, run 0.9.3 from sourceforce (or the cvs version) and
> > see if it works any better.
>
> I was just about to say the same thing, 0.9.3 works well for me. In fact
> so well, that I made a patch to bring 2.4.5-pre1 UDF up to date with
> current CVS earlier this afternoon (hint hint, Ben :-).
>
> *.kernel.org/pub/linux/kernel/people/axboe/patches/2.4.5-pre1/
>
> udf-0.9.3-2.4.5p1-1.bz2
>
> --
> Jens Axboe
RE: PCMCIA IDE flash problem found
Why did you not take care of the request_region() call, instead of just
disabling it? The ports will be considered free by the system, and
another device might grab them later on!

Vassilii

-----Original Message-----
From: Pavel Machek [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 08, 2001 8:14 AM
To: kernel list
Subject: PCMCIA IDE flash problem found


Hi!

2.4.[123] changed name of ide-cs module, which means your pcmcia setup
breaks... This is how to undo the damage. Works for me, do *not* apply
into anything official.

Pavel

--- clean/drivers/ide/ide-cs.c	Sun Apr  1 00:23:29 2001
+++ linux/drivers/ide/ide-cs.c	Tue May  8 14:06:09 2001
@@ -95,7 +96,7 @@
 static int ide_event(event_t event, int priority,
 		     event_callback_args_t *args);

-static dev_info_t dev_info = "ide-cs";
+static dev_info_t dev_info = "ide_cs";

 static dev_link_t *ide_attach(void);
 static void ide_detach(dev_link_t *);
@@ -388,9 +389,12 @@
     MOD_DEC_USE_COUNT;
     }

+#if 0
     request_region(link->io.BasePort1, link->io.NumPorts1,"ide-cs");
     if (link->io.NumPorts2)
 	request_region(link->io.BasePort2, link->io.NumPorts2,"ide-cs");
+#endif
+printk("Should call request_region\n");

     info->ndev = 0;
     link->dev = NULL;
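The concern raised above, that ports which are never claimed look free to everyone else, can be illustrated with a toy user-space registry. This is deliberately not the kernel's request_region() implementation; the toy_* names, the fixed-size table, and the port numbers are all invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_REGIONS 8

struct toy_region {
    unsigned long start;
    unsigned long len;
    const char *owner;
};

struct toy_registry {
    struct toy_region used[TOY_MAX_REGIONS];
    size_t count;
};

/*
 * Claim [start, start + len). Returns 0 on success, or -1 if the range
 * overlaps an existing claim, is empty, or the table is full.
 */
static int toy_request_region(struct toy_registry *reg, unsigned long start,
                              unsigned long len, const char *owner)
{
    size_t i;

    assert(reg != NULL);
    if (len == 0 || reg->count == TOY_MAX_REGIONS)
        return -1;

    for (i = 0; i < reg->count; i++) {
        const struct toy_region *u = &reg->used[i];

        /* Two half-open ranges overlap if each starts before the other ends. */
        if (start < u->start + u->len && u->start < start + len)
            return -1;
    }

    reg->used[reg->count].start = start;
    reg->used[reg->count].len = len;
    reg->used[reg->count].owner = owner;
    reg->count++;
    return 0;
}
```

If the first driver records its range, a second claimant asking for overlapping ports is turned away; if the first driver skips the call, as in the `#if 0` hunk, the same second request succeeds and both end up driving the same ports.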
Re: write to dvd ram
On Tue, May 08 2001, Ben Fennema wrote:
> > The log is:
> > Apr 15 20:58:27 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29) Mounting
> > volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)
>
> At the very least, run 0.9.3 from sourceforce (or the cvs version) and
> see if it works any better.

I was just about to say the same thing, 0.9.3 works well for me. In fact
so well, that I made a patch to bring 2.4.5-pre1 UDF up to date with
current CVS earlier this afternoon (hint hint, Ben :-).

*.kernel.org/pub/linux/kernel/people/axboe/patches/2.4.5-pre1/

udf-0.9.3-2.4.5p1-1.bz2

--
Jens Axboe
PCMCIA IDE flash problem found
Hi!

2.4.[123] changed name of ide-cs module, which means your pcmcia setup
breaks... This is how to undo the damage. Works for me, do *not* apply
into anything official.

								Pavel

--- clean/drivers/ide/ide-cs.c	Sun Apr  1 00:23:29 2001
+++ linux/drivers/ide/ide-cs.c	Tue May  8 14:06:09 2001
@@ -95,7 +96,7 @@
 static int ide_event(event_t event, int priority,
 		     event_callback_args_t *args);

-static dev_info_t dev_info = "ide-cs";
+static dev_info_t dev_info = "ide_cs";

 static dev_link_t *ide_attach(void);
 static void ide_detach(dev_link_t *);
@@ -388,9 +389,12 @@
     MOD_DEC_USE_COUNT;
     }

+#if 0
     request_region(link->io.BasePort1, link->io.NumPorts1,"ide-cs");
     if (link->io.NumPorts2)
 	request_region(link->io.BasePort2, link->io.NumPorts2,"ide-cs");
+#endif
+printk("Should call request_region\n");

     info->ndev = 0;
     link->dev = NULL;

--
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]
Re: kdb wishlist
> "slurn" == slurn <[EMAIL PROTECTED]> writes:

>>
>> Keith Owens wrote:
>> >
>> > This is part of my kdb wishlist, does anybody fancy writing the code to
>> > add any of these features? It would be a nice project for anybody
>> > wanting to start on the kernel. Replies to [EMAIL PROTECTED] please.
>> > Current patches at http://oss.sgi.com/projects/kdb/download/
>> >
>> > * Change kdb invocation key from ^A to ^X^X^X within 3 seconds. ^A is
>> > used by emacs, bash, minicom etc.
>> >
>> ^X^X swaps point and mark in emacs. One (well, I) often will do
>> ^X^X^X^X to examine where mark is and then return to point.

slurn> How about using the break condition instead. This is only for the
slurn> serial port, and most terminal emulators (e.g. kermit, minicom) provide
slurn> a means to generate a break condition on the serial port.

kdb uses BREAK on the serial port (that minicom uses C-a for sending a
break is an anecdote :) But the problem at hand is the console. I vote
for ^X^X^X, as I think that it is not a difficult shortcut. (And yes, I
also use emacs and ^X^X all the time, but I think that this combination
is not especially bad, and I suppose that other people's pet applications
will have problems with something like ^A^A^A, which I never use.)

Later, Juan.

--
In theory, practice and theory are the same, but in practice they
are different -- Larry McVoy
PCMCIA ide flash card does not work
Hi!

My ide flash card used to work in 2.4.0, but does not work in 2.4.4.
Everything compiled in (no modules)

May 8 13:43:44 bug cardmgr[58]: initializing socket 0
May 8 13:43:44 bug cardmgr[58]: socket 0: ATA/IDE Fixed Disk
May 8 13:43:44 bug cardmgr[58]: module //pcmcia/ide_cs.o not available
May 8 13:43:45 bug cardmgr[58]: get dev info on socket 0 failed: Resource temporarily unavailable

PCMCIA ne2000 card works at the same time.

Any hints?
								Pavel
--
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]
Re: [Question] Explanation of zero-copy networking
First, thanks for the (unexpectedly large) discussion and hints!

Second: sorry for the multimedia-centric viewpoint, but I think it's an
important task for future operating systems development (or better: for a
real-world OS like Linux) to have sophisticated support for a _large
diversity_ of application requirements, and realtime/multimedia apps have
been treated stepmotherly for too long.

"Richard B. Johnson" wrote:

> So, the kernel is going to send a packet to another host on
> behalf of the system caller. It copies the data, (partial
> checksum) assembles the packet, finishes the checksum, then
> sends it. The CPU is given to somebody else while waiting
> for the packet to get somewhere and be ACKed. But, think
> about a server where EVERY task is waiting for I/O to
> complete! These CPU cycles, that you saved by eliminating
> a copy (or two), are now wasted spinning.
>
> Basically, "no copy" is an academic exercise. It makes the first
> packet get sent more quickly, after which everything slows to
> the natural bandwidth of the system.

These are the semantics of a typical client/server request/reply
protocol, as used in "traditional" applications. But they aren't
appropriate for the communication of realtime media streams, because they
break the strict timing constraints. Here we need asynchronous
(*non-blocking semantics*) communication.

>
> If you used a server for multicast-only. In other words, you
> just spewed out unidirectional data, you still slow to the rate
> at which the media can take the data. And CPUs can obtain or
> generate these data a lot faster than 100-base can sink them.
>
> When we get to media that can sink data as fast as we can generate
> them (it), then we have to worry about memory copy speed. However,
> these new devices are actually an IP subsystem. They generate and
> receive entire datagrams. To fully utilize these devices, the data-
> gram generation and reception (the basis of all TCP/IP networking)
> will have to be moved out of the kernel and into these boards. The
> kernel code will only handle interfaces, connections, and rules.

Oh, these are the arguments of people who would rather invest in more
resources than in clever algorithms. It's comparable to the old war
between the ATM folks and the IP/Ethernet folks; concepts against "brute"
resources.

1. You don't take into account that there are not only high-end PCs and
workstations with enormous CPU and memory resources! Devices for
"pervasive ubiquitous computing" (don't blame me for this fashionable
term), for example, are mostly embedded systems with scarce resources,
happy to have enough CPU cycles for video codecs.

2. On the other hand there are Video-on-Demand servers with (more than
one) high-speed NICs, and large SANs or disk arrays for video storage
with gigabit/Infiniband connections. Here the problem is not only
saturating the links (for economic reasons), but also guaranteeing low
delay and jitter for every connection. I think we should extend the
usability of Linux to this class of servers too.

3. Have a look at the various papers on high performance networking. The
gap between the growth in network bandwidth and the growth in CPU and bus
performance is increasing. Today the system busses are not considered to
be in the "window of scarcity" (today we have 100MBit Ethernets and
133++MB/s PCI). Tomorrow our operating system concepts have to cope with
1, 10, ?? Gigabit Ethernets, Infiniband, ... who knows. This means: scale
CPU and memory-bus performance accordingly, or use resource-sparing IPC
mechanisms and implement computationally complex algorithms (checksum
calculations, encryption) in hardware.

Besides continuous-media applications, other applications that need to
move data chunks much larger than the CPU caches will benefit from such
infrastructure too. (Both classes of systems from above will be
affected.)

For those applications copy avoidance is fundamental, because copying is
expensive: it needs all three basic system resources (CPU, memory, and
the bandwidth of the local communication facilities, i.e. the busses) at
the same time (synchronously)!

Many researchers recognized this problem and developed techniques to
overcome the dusty OS concepts (UNet, UVM, ...). Unfortunately they need
special hardware (NICs), in part have too much overhead, or are not
general enough. The one thing this shows us is that there is still some
work to be done.

Regards,
Alexander Eichhorn

--
Alexander Eichhorn
Technical University of Ilmenau
Computer Science And Automation Faculty
Distributed Systems and Operating Systems Department
Phone +49 3677 69 4557, Fax +49 3677 69 4541
Re: write to dvd ram
> The log is:
> Apr 15 20:58:27 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29) Mounting
> volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)

At the very least, run 0.9.3 from SourceForge (or the CVS version) and see if it works any better.

Ben
Re: kdb wishlist
> Keith Owens wrote:
> >
> > This is part of my kdb wishlist, does anybody fancy writing the code to
> > add any of these features? It would be a nice project for anybody
> > wanting to start on the kernel. Replies to [EMAIL PROTECTED] please.
> > Current patches at http://oss.sgi.com/projects/kdb/download/
> >
> > * Change kdb invocation key from ^A to ^X^X^X within 3 seconds. ^A is
> > used by emacs, bash, minicom etc.
> >
> ^X^X swaps point and mark in emacs. One (well, I) often will do
> ^X^X^X^X to examine where mark is and then return to point.

How about using the break condition instead? This is only for the serial port, and most terminal emulators (e.g. kermit, minicom) provide a means to generate a break condition on the serial port.

scott

> George
>
> ~snip
Re: [PATCH] x86 page fault handler not interrupt safe
On Tue, 8 May 2001, Alan Cox wrote:
>
> I don't see where the alternative patch ensures the user didn't flip the
> direction flag for one

Yeah. We might as well just make it "eflags & IF", none of the other flags should matter (or we explicitly want them cleared).

Linus
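The "eflags & IF" test suggested above can be illustrated in plain C. This is only a user-space sketch, not the patch under discussion: the bit positions are the architectural x86 EFLAGS layout (IF is bit 9, DF is bit 10), while the macro and function names are made up for illustration.

```c
/* Architectural x86 EFLAGS bits. */
#define EFLAGS_DF 0x00000400UL /* direction flag, bit 10 */
#define EFLAGS_IF 0x00000200UL /* interrupt enable flag, bit 9 */

/* Hypothetical helper: returns 1 if the saved flags word says the
 * interrupted context had interrupts enabled, 0 otherwise. Only IF is
 * consulted, so a flipped direction flag cannot change the answer. */
static int saved_irqs_enabled(unsigned long eflags)
{
    return (eflags & EFLAGS_IF) != 0;
}
```

With a saved flags word of 0x00000246 the helper reports interrupts enabled, and setting the direction flag as well (0x00000646) does not change the result.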
Re: [ot] named sockets
On Mon, 7 May 2001 21:47:33 -0400 (EDT), Adam <[EMAIL PROTECTED]> wrote:
>So I'm wondering, is there a way, kind of like "relink" system call which
>could take existing file descriptor (they are still so the fd is there,
>just unlinked) and link it back to file name?

POSIX' fattach(int fd, const char *path) library call does that, although it's often limited to STREAMS fds. It's usually implemented as mounting "namefs" at the path (SVR4) or setting a magic mount option (OSF1), with the fd passed in as mount-point specific data. Regular users are allowed to do this special mount().

Linux currently doesn't have this functionality, but it could probably be implemented as a pseudo-fs in 2.4, assuming the 2.4 VFS properly supports stacking of file systems. (There are some gotchas concerning chown/chmod changes and restoring the original object after the fd is unmounted.)

Not that I think Linux really needs this creeping featurism ...

/Mikael
Re: [Question] Explanation of zero-copy networking
Alan Cox wrote:
> > so there's still single copy for write() of a mmap()ed page?
>
> An mmap page will go direct to disk.

Looking at the 2.4.4 code, mmap() of a file followed by write() to a socket will copy the data once. I could be mistaken (I only glanced at the code quickly), but I base that on the only call to ->sendpage being through sendfile.

So yes, there's a single-copy overhead for mmap()+write().

-- Jamie
Re: Possible PCI subsystem bug in 2.4
On 4 May 2001, Eric W. Biederman wrote:
> The example that sticks out in my head is we rely on the MP table to
> tell us if the local apic is in pic_mode or in virtual wire mode.
> When all we really have to do is ask it.

You can't. IMCR is write-only and may involve chipset-specific side effects. Then even if IMCR exists, a system's firmware might have chosen the virtual wire mode for whatever reason (e.g. broken hardware).

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland  +
+-------------------------------------------------------------+
+  e-mail: [EMAIL PROTECTED], PGP key available               +
Re: nfs MAP_SHARED corruption fix
On Tue, May 08, 2001 at 05:21:02PM +0200, Trond Myklebust wrote:
> Could you instead detail exactly which corruption problem you are
> trying to fix?

  int fd = open (name, O_RDWR);
  char* adr = (char*) mmap (0, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  /* write to *adr through *(adr+len-1) */
  /* Try adding here: msync (adr, len, MS_SYNC); */
  munmap (adr, len);
  close (fd);

The code works on files on local hard disks and on NFS volumes on a 2.2 kernel, but breaks on NFS volumes on a 2.4.4 kernel. msync() works around the bug. Andrea's patch did help as well.

Regards,
--
Kurt Garloff <[EMAIL PROTECTED]>  Eindhoven, NL
GPG key: See mail header, key servers  Linux kernel development
SuSE GmbH, Nuernberg, FRG  SCSI, Security
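The fragment above can be fleshed out into a compilable sketch with the msync() workaround enabled. The function name write_via_shared_map and its error handling are assumptions added for illustration; it also assumes the file already exists and is at least len bytes long (writing past the end of the file through the mapping would fault), and that len is greater than zero.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first `len` bytes of an existing
 * file through a MAP_SHARED mapping, flushing with msync() before the
 * mapping is torn down. Returns 0 on success, -1 on failure. */
static int write_via_shared_map(const char *name, const void *data, size_t len)
{
    int fd = open(name, O_RDWR);
    if (fd < 0)
        return -1;

    char *adr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (adr == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(adr, data, len);             /* write adr[0] .. adr[len-1] */

    int rc = msync(adr, len, MS_SYNC);  /* the workaround: flush explicitly */
    if (munmap(adr, len) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Dropping the msync() call gives back the original sequence that reportedly misbehaves on NFS under 2.4.4.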
LSB 0.9 public draft
To make sure this gets enough publicity and eyes on it..

> The Linux Standard Base is in the final stages of the LSB written
> specification for Linux. The workgroup has published the LSB v0.9 written
> specification, and is undergoing a thirty day Request For Comments from
> the public until Wednesday June 6th, 2001. Afterwards, this draft will be
> submitted to the Free Standards Group for adoption.
>
> http://www.linuxbase.org/spec/lsbreview.html
>
> The goal of the LSB is to develop and promote a set of standards that will
> increase compatibility among Linux distributions and enable software
> applications to run on any compliant Linux system. In addition, the LSB
> will help coordinate efforts to recruit software vendors to port and write
> products for Linux.
>
> http://www.linuxbase.org/
>
> George (gk4)
Re: nfs MAP_SHARED corruption fix
> " " == Andrea Arcangeli <[EMAIL PROTECTED]> writes:

> This fixes corruption with MAP_SHARED on top of nfs filesystem
> in 2.4:

> --- 2.4.5pre1aa2/fs/nfs/write.c.~1~ Tue May 1 19:35:29 2001
> +++ 2.4.5pre1aa2/fs/nfs/write.c Tue May 8 02:04:15 2001
> @@ -1533,6 +1533,7 @@
>  	if (!inode && file)
>  		inode = file->f_dentry->d_inode;
>
> +	filemap_fdatasync(inode->i_mapping);
>  	do {
>  		error = 0;
>  		if (wait)

Yech! Apart from the fact that this means you do a full fdatasync() even when you are simply trying to flush out single pages, nfs_sync_file() gets called all over the place, including in areas where we know we're already holding a page lock. AFAICS this fix will clearly deadlock...

Could you instead detail exactly which corruption problem you are trying to fix?

Cheers,
Trond
Re: RAID question
On Tue, May 08, 2001 at 12:48:25PM +1000, Peter Waltenberg wrote:
> We have a RAID 5 system thats had 2 of 6 disks in the RAID go into thermal
> shutdown. (Air-con failure).
>
> The disks are functional, but the RAID won't restart because the superblock
> timestamps on those two disks are now out of step with the rest of the array and
> there aren't enough "good" disks to reconstruct the array.
>
> We know there was very little activity when this happened.
>
> Does anyone out there know of a way to hack the superblocks on the "bad" disks
> to force them to appear to be O.K. so that the RAID will restart.

As documented in the HOWTO (http://unthought.net/Software-RAID.HOWTO), you should re-run mkraid after making dead sure that your raidtab still corresponds to the RAID on your disks (it usually does unless someone screwed it up).

Run fsck on the RAID after mkraid.

--
: [EMAIL PROTECTED] : And I see the elder races,        :
:...................: putrid forms of man               :
: Jakob Østergaard  : See him rise and claim the earth, :
:       OZ9ABN      : his downfall is at hand.          :
:...................:............{Konkhra}..............:
Re: [PATCH] allocation looping + kswapd CPU cycles
> The real fix is to measure fragmentation and the progress of kswapd, but
> that is too drastic for 2.4.x.

I suspect the real fix might, in general, be

a) to reduce use of kmalloc() etc., which gives physically contiguous memory, where virtually contiguous memory will do (and is, presumably, far easier to come by), or perhaps to add some flag to kmalloc to allocate out of virtual rather than physical memory.

b) to bias flush or swap-out routines to create physically contiguous higher-order blocks. Many heuristics will give you that ability.

Disclaimer: I haven't looked at this issue for years, but Linux seems to fail on >4k allocations now, and to fragment memory far more, than it did on much smaller systems doing lots of nasty (8k, thus 3 pages including header) NFS stuff back in '94.

--
Alex Bligh
Re: write to dvd ram
Many thanks Jim. Now at least I have a way.

But I caution others that Linux modifies the UDF filesystem somehow, so that Winders can no longer understand it. I nearly lost all my music & photo archives to this. And attempts to rm or mv on a DVDRAM with UDF cause it to segfault & jam up. There doesn't seem to be an answer to this. (Yes, I have written to the developer of cddriver; no response at all after two weeks.) UDF2 is just nonfunctional in Linux and I don't know why.

To recap: running Panasonic LF-D101 DVDRAM drive on SCSI (AHA2940) and getting segfaults. On-disk format is UDF2.0, as 2.1 won't mount.

Mount, ls, umount, mount, ls, umount, etc - no problem except filestructure is now no longer available to Winders. (CAUTION! Save data using Linux for recovery in Winders)

Mount, cp <20Mfile>, umount, mount, ls, (20Mfile), umount, mount, ls, (20Mfile), rpm -q 20Mfile, umount, etc - no problem except filestructure no longer available to Winders.

Mount, rm <20Mfile>, Segmentation Fault, umount, (device busy), umount, (device busy), etc. Reboot without reset and bootup hangs at Running Linuxconf hooks. Reset & system boots fine.

Mount, ls, (no files), umount, mount, ls, (no files), umount, etc.

Running RedHat Wolverine with HelixGnome & Nautilus.

--
C.

The best way out is always through.
- Robert Frost, A Servant to Servants, 1914

Keywords: DVDRAM DVD-RAM LF-D101 LFD101 cdrecord

The log is:

Apr 15 20:58:27 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29) Mounting volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)
Apr 15 20:59:31 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29) Mounting volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)
Apr 15 20:59:50 hydra last message repeated 3 times
Apr 15 21:00:17 hydra mon[1258]: failure for servers http 987390017 localhost
Apr 15 21:01:11 hydra kernel: UDF-fs INFO UDF 0.9.1 (2000/02/29) Mounting volume 'UDF Volume', timestamp 2001/03/02 11:55 (1e98)
Apr 15 21:03:25 hydra last message repeated 2 times
Apr 15 21:03:40 hydra kernel: kernel BUG at inode.c:890!
Apr 15 21:03:40 hydra kernel: invalid operand:
Apr 15 21:03:40 hydra kernel: CPU:0
Apr 15 21:03:40 hydra kernel: EIP:0010:[iput_free+216/352]
Apr 15 21:03:40 hydra kernel: EIP:0010:[]
Apr 15 21:03:40 hydra kernel: EFLAGS: 00010286
Apr 15 21:03:40 hydra kernel: eax: 001b ebx: cb1ad640 ecx: 0004 edx: c5508840
Apr 15 21:03:40 hydra kernel: esi: c0319560 edi: cb4f2740 ebp: b678 esp: c9d5ff20
Apr 15 21:03:40 hydra kernel: ds: 0018 es: 0018 ss: 0018
Apr 15 21:03:40 hydra kernel: Process rm (pid: 2254, stackpage=c9d5f000)
Apr 15 21:03:40 hydra kernel: Stack: c02a1610 c02a16f3 037a 0012 c392 cffb3560 cb4f2740
Apr 15 21:03:40 hydra kernel:cb1ad640 c0144a3c cb1ad640 0184 fff0 c39229c0 cb4f2740
Apr 15 21:03:40 hydra kernel:c39229c0 c013e31c cb4f2740 cfc83d40 c9d5ff9c ffeb cb4f2740
Apr 15 21:03:40 hydra kernel: Call Trace: [error_table+39488/42452] [error_table+39715/42452] [d_delete+76/112] [vfs_unlink+316/368] [sys_unlink+150/272] [do_page_fault+0/1088] [system_call+51/56]
Apr 15 21:03:40 hydra kernel: Call Trace: [] [] [] [] [] [] []
Apr 15 21:03:40 hydra kernel:
Apr 15 21:03:40 hydra kernel: Code: 0f 0b 83 c4 0c eb 69 90 39 1b 74 3c f6 83 f8 00 00 00 07 75
Apr 15 21:04:18 hydra mon[1258]: failure for servers http 987390258 localhost

ver_linux
Linux hydra.darkmatter.com 2.4.2-0.1.49 #1 Sun Apr 15 18:12:33 MDT 2001 i686 unknown

Gnu C                  2.96
Gnu make               3.79.1
binutils               2.10.91.0.2
util-linux             2.10r
modutils               2.4.2
e2fsprogs              1.19
reiserfsprogs          3.x.0b
PPP                    2.4.0
isdn4k-utils           3.1pre1
Linux C Library        2.2.2
Dynamic linker (ldd)   2.2.2
Procps                 2.0.7
Net-tools              1.57
Console-tools          0.3.3
Sh-utils               2.0
Modules Loaded         via82cxxx_audio ac97_codec binfmt_misc autofs nls_iso8859-1 nls_cp437

cdrecord 1.9-6

"Hawthorne, Jim J SSI-ISEA" wrote:
> No problem with ext2 file system -- I have been using LM 7.2 with kernel
> 2.2.14 and it works straight out of the box. Newer kernel 2.4.x should also
> work .
> Have Toshiba w1101 scsi dvd ram and initio wide scsi card. I also use
> WINDOZE 2000 and use UDF for DVD RAM on WINDOZE 2000 (Instant Write from VOB
> at www.vob.de -- came with the drive).
>
> I use BRUBACK for backup to backup both Linux and W2K (Bruback will backup
> windoze filesystem from Linux -- no problem)
>
> format your media with mke2fs /dev/scd1 (or wherever your dvd ram is
> detected) -- just use defaults takes about 1 min to format a 2.6 GB media.
>
> create a directory under / say /dvdram
>
> then simply mount -t ext2 /dev/scd1 /dvdram hey presto you should get
> read/write access to your drive
>
> your fstab entry should look something like this
>
> /dev/scd1 /dvdram ext2
2.2.19 + reiserfs 3.5.32 nfsd wait_on_buffer/down_failed
Hi,

we run an NFS server using 2.2.19 + ReiserFS version 3.5.32 on a P3 550 machine. The disk subsystem is a GDT7518RN using 4 UW disks as a RAID 5 device.

After upgrading from 2.2.17 + reiserfs to 2.2.19 we experience many more problems (far more than with 2.2.17) with our roughly 12 NFS clients (Linux). The network is 100 Mbit full duplex, switched. I do not think this is network related, because ping -f doesn't show any packet loss.

During not-so-heavy I/O on the exported fs, one nfsd thread seems to be waiting for the disk:

621 root 1 0 00 wait_on_b DW6.2 0.0 1:49 nfsd

and the other threads are waiting in down_fail:

610 root 0 0 00 down_fail DW0.0 0.0 1:52 nfsd
611 root 0 0 00 down_fail DW0.0 0.0 1:40 nfsd
612 root 0 0 00 down_fail DW0.0 0.0 1:41 nfsd
613 root 0 0 00 down_fail DW0.0 0.0 1:48 nfsd
614 root 0 0 00 down_fail DW0.0 0.0 1:45 nfsd
615 root 0 0 00 down_fail DW0.0 0.0 1:43 nfsd
616 root 0 0 00 down_fail DW0.0 0.0 1:50 nfsd
617 root 0 0 00 down_fail DW0.0 0.0 1:42 nfsd
618 root 0 0 00 down_fail DW0.0 0.0 1:44 nfsd
619 root 0 0 00 down_fail DW0.0 0.0 1:42 nfsd
620 root 0 0 00 down_fail DW0.0 0.0 1:47 nfsd
622 root 0 0 00 down_fail DW0.0 0.0 1:47 nfsd
623 root 0 0 00 down_fail DW0.0 0.0 1:43 nfsd
624 root 0 0 00 down_fail DW0.0 0.0 1:48 nfsd
609 root 0 0 00 down_fail DW0.0 0.0 1:50 nfsd

During this event:

- If I check the disk I/O with e.g. vmstat 1, the machine is doing about 200 bi per second, which is not much, I guess.
- The client machines hang, as should be clear:
  nfs: server foo is not responding
  nfs: server foo still not responding
  nfs: server foo OK

Our idea is to revert to 2.2.17, because the behaviour was much better.

How can I debug this? Can I do some tuning? Should I revert to some older kernel? Are there any patches for this problem? Does anyone have the same or a related problem?

Any pointer would be useful.

TIA and cheers,
-Michael

--
In a world where an admin is rendered useless when the ball in his mouse has been taken out, it's good to know that I know UNIX.
Re: fs.file-max
On Tue, May 08, 2001 at 10:03:23AM +, Federico Edelman Anaya wrote:
> What can I do to test the FD limit? ... Because, the FD limit is set in
> /proc/sys/fs/file-max, sample:
>
> echo "2048" > /proc/sys/fs/file-max
> ulimit -n 8192
>
> In this case ... the FD limit = 8192 :( ... when the limit should be
> 2048?
>
> I wrote a perl script for the test ... anybody known a "C" program for
> test the FD limit?

Hmm, we seem to be missing this test case from the Linux Test Project. I see that dup03 exhausts all FDs and tests dup() for EMFILE. You could easily adapt that test case to a setrlimit() test case.

--
Nate Straz [EMAIL PROTECTED]
sgi, inc http://www.sgi.com/
Linux Test Project http://ltp.sourceforge.net/
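A setrlimit()-based check along the lines suggested above might look roughly like the following. This is not the LTP dup03 source, only a sketch: the helper name dup_until_limit is made up, and /dev/null is used as the descriptor to duplicate so the check does not depend on stdin being open.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: lower the soft RLIMIT_NOFILE to `soft_limit`, then
 * dup() a descriptor until dup() fails. Returns the errno of the failing
 * dup() (EMFILE is expected), 0 if dup() never failed, or -1 if the setup
 * itself failed. *opened receives the number of successful dup() calls.
 * All extra descriptors are closed and the old limit is restored. */
static int dup_until_limit(rlim_t soft_limit, int *opened)
{
    struct rlimit old, lim;
    int *fds;
    int base, n = 0, err = 0;

    if (getrlimit(RLIMIT_NOFILE, &old) < 0)
        return -1;
    if (soft_limit == 0 || soft_limit > old.rlim_max)
        return -1;

    base = open("/dev/null", O_RDONLY);
    fds = malloc(soft_limit * sizeof *fds);
    if (base < 0 || fds == NULL) {
        if (base >= 0)
            close(base);
        free(fds);
        return -1;
    }

    lim = old;
    lim.rlim_cur = soft_limit;          /* only the soft limit changes */
    if (setrlimit(RLIMIT_NOFILE, &lim) < 0) {
        close(base);
        free(fds);
        return -1;
    }

    while ((rlim_t)n < soft_limit) {
        int fd = dup(base);
        if (fd < 0) {
            err = errno;                /* EMFILE once the limit is hit */
            break;
        }
        fds[n++] = fd;
    }

    *opened = n;
    while (n > 0)
        close(fds[--n]);
    close(base);
    free(fds);
    setrlimit(RLIMIT_NOFILE, &old);     /* restore the original limit */
    return err;
}
```

Calling it with a small limit such as 32 should report EMFILE after fewer than 32 successful dup() calls, since the descriptors that are already open count against the limit.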
kmalloc(..., GFP_ATOMIC) buffers contiguous - hence suitable for PCI DMA?
Hi folks!

(I have searched the linux-kernel archive for the keywords "DMA, contiguous, address" before writing this mail, and read the corresponding threads.)

I am trying to port a driver to Linux 2.4/i386. I have just read the "Linux Device Drivers" book by A. Rubini, and this is what he says in Ch. 13, "Mmap & DMA", on the GFP_DMA allocator flag:

"The kernel guarantees that DMA-capable buffers have 2 features. 1st, the phys. addrs must be consecutive when get_free_page() returns > 1 page (but this is always true, indep. of GFP_DMA, because the kernel arranges free memory in clusters of consecutive pages). And second, when GFP_DMA is set, the kernel guarantees that only addrs lower than MAX_DMA_ADDRESS are returned. The macro MAX_DMA_ADDRESS is set to 16MB on the PC, to deal with the ISA [...]. As far as PCI is concerned, there's no MAX_DMA_ADDRESS limit, and a PCI dev. driver should avoid setting GFP_DMA when allocating its buffers."

Is this really still true for kernels 2.2 and later? (The book refers to 2.1.43 as the most recent version at the time of its writing.) I.e., can I just assume that a buffer which I know to have been successfully allocated with just a kmalloc(..., GFP_ATOMIC) will be physically contiguous and hence suitable for PCI DMA? I tried to follow the corresponding code path in mm/slab.c, but failed to reach a 100%-certain conclusion from it.

The driver and the device at present are not designed for scatter-gather.

TIA for any possible help,
Vassilii
Re: fs.file-max
Dan:

Hi ...

Dan Kegel wrote:
> Federico Edelman Anaya ([EMAIL PROTECTED]) wrote:
>
> > What can I do to test the FD limit? ... Because, the FD limit is set in
> > /proc/sys/fs/file-max, sample:
> >
> > echo "2048" > /proc/sys/fs/file-max
>
> That sets the systemwide limit to 2048.

Ok ...

> > ulimit -n 8192
>
> That sets the per-process limit (for this process
> and its children) to 8192.

But my perl script could open 8192 files ... I don't understand exactly how this works ... which is the limit on FDs? file-max?

> > In this case ... the FD limit = 8192 :( ... when the limit should be
> > 2048?
>
> No, the two limits are independent (except, obviously, that
> that process will reach the systemwide fd limit before it
> exhausts its per-process fd limit).
>
> > I wrote a perl script for the test ... anybody known a "C" program for
> > test the FD limit?
>
> http://www.kegel.com/dkftpbench/#tuning
>
> - Dan
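To see the two independent limits side by side, a small C sketch can read both. The helper names are made up for illustration; the per-process value comes from getrlimit(RLIMIT_NOFILE), which is what `ulimit -n` adjusts, and the system-wide value is read from /proc/sys/fs/file-max, assuming /proc is mounted.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: per-process soft limit on open descriptors,
 * LLONG_MAX if unlimited, or -1 on error. */
static long long per_process_fd_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) < 0)
        return -1;
    if (lim.rlim_cur == RLIM_INFINITY)
        return LLONG_MAX;
    return (long long)lim.rlim_cur;
}

/* Hypothetical helper: system-wide limit on open files as exported in
 * /proc/sys/fs/file-max, or -1 if it cannot be read. */
static long long systemwide_fd_limit(void)
{
    long long value = -1;
    FILE *f = fopen("/proc/sys/fs/file-max", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%lld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Printing both values before and after the `echo` and `ulimit` commands quoted above shows that each command changes only one of them.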
Re: CML2 design philosophy heads-up
Eric S. Raymond wrote:
> More generally, arguments of the form "Non-mainline custom hack X
> could invalidate constraint Y, therefore we can't have Y in the
> rulebase" are dangerous -- I suspect you could reduce your set of
> constraints to nil very quickly that way, and thus badly screw over
> the 99% of people who just want to build a more or less stock kernel.

Eric,

Still being able to use the "tool" is useful! So I want a "don't mess with me" mode where I'd get more control than 99% of the lusers.

Roger.

--
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots.
* There are also old, bald pilots.