Re: DVD blockdevice buffers

2001-05-24 Thread Stephen C. Tweedie
Hi, On Wed, May 23, 2001 at 01:01:56PM -0700, Linus Torvalds wrote: > On Wed, 23 May 2001, Stephen C. Tweedie wrote: > > > that the filesystems already do. And you can do it a lot _better_ than the > > > current buffer-cache-based approach. Done right, you can actually do

Re: [PATCH] struct char_device

2001-05-24 Thread Stephen C. Tweedie
Hi, On Wed, May 23, 2001 at 01:54:15PM -0400, Alexander Viro wrote: > On Wed, 23 May 2001 [EMAIL PROTECTED] wrote: > > > > But I don't want an initrd. > > Don't be afraid of words. You wouldnt notice - it would do its > > job and disappear just like piggyback today. > > Andries, initrd code is

Re: O_TRUNC problem on a full filesystem

2001-05-24 Thread Stephen C. Tweedie
On Wed, May 23, 2001 at 07:55:48PM +1000, Andrew Morton wrote: > When you truncated your file, the blocks remained preallocated > on behalf of the file, and were hence considered "used". For > some reason, a subsequent attempt to allocate blocks for the > same file failed to use that file's prea

Re: DVD blockdevice buffers

2001-05-23 Thread Stephen C. Tweedie
Hi, On Wed, May 23, 2001 at 11:12:00AM -0700, Linus Torvalds wrote: > > On Wed, 23 May 2001, Stephen C. Tweedie wrote: > No, you can actually do all the "prepare_write()"/"commit_write()" stuff > that the filesystems already do. And you can do it a lot _better_

Re: DVD blockdevice buffers

2001-05-23 Thread Stephen C. Tweedie
Hi, On Sat, May 19, 2001 at 07:36:07PM -0700, Linus Torvalds wrote: > Right now we don't try to aggressively drop streaming pages, but it's > possible. Using raw devices is a silly work-around that should not be > needed, and this load shows a real problem in current Linux (one soon to > be fixe

Re: Getting FS access events

2001-05-23 Thread Stephen C. Tweedie
Hi, On Tue, May 15, 2001 at 04:37:01PM +1200, Chris Wedgwood wrote: > On Sun, May 13, 2001 at 08:39:23PM -0600, Richard Gooch wrote: > > Yeah, we need a decent unfragmenter. We can do that now with > bmap(). > > SCT wrote a defragger for ext2 but it only handles 1k blocks :( Actually,

Re: Getting FS access events

2001-05-23 Thread Stephen C. Tweedie
Hi, On Fri, May 18, 2001 at 09:55:14AM +0200, Rogier Wolff wrote: > The "boot quickly" was an example. "Load netscape quickly" on some > systems is done by dd-ing the binary to /dev/null. This is one of the reasons why some filesystems use extent maps instead of inode indirection trees. The p

Re: Getting FS access events

2001-05-23 Thread Stephen C. Tweedie
Hi, On Sat, May 19, 2001 at 12:47:15PM -0700, Linus Torvalds wrote: > > On Sat, 19 May 2001, Pavel Machek wrote: > > > > > Don't get _too_ hung up about the power-management kind of "invisible > > > suspend/resume" sequence where you resume the whole kernel state. > > > > Ugh. Now I'm confused

Re: Ext2, fsync() and MTA's?

2001-05-22 Thread Stephen C. Tweedie
Hi, On Tue, May 22, 2001 at 11:54:55AM -0500, Oliver Xymoron wrote: > > > > That's probably the right thing to add. > > > > > > I'd vote for an async flag instead. > > > > Why??? Why change the default behaviour to be something much slower? > > I was suggesting an async flag _in addition_ to t

Re: Ext2, fsync() and MTA's?

2001-05-22 Thread Stephen C. Tweedie
Hi, On Tue, May 22, 2001 at 10:50:51AM -0500, Oliver Xymoron wrote: > On Mon, 21 May 2001, Theodore Tso wrote: > > > On Mon, May 21, 2001 at 06:47:58PM +0100, Stephen C. Tweedie wrote: > > > > > Just set chattr +S on the spool dir. That's what the flag is for

Re: Ext2, fsync() and MTA's?

2001-05-21 Thread Stephen C. Tweedie
Hi, On Sun, May 13, 2001 at 12:53:37AM +1000, Andrew McNamara wrote: > I seem to recall that in 2.2, fsync behaved like fdatasync, and that > it's only in 2.4 that it also syncs metadata - is this correct? No, fsync should be safe on 2.2. There was a problem with O_SYNC not syncing all metadat

Re: Ext2, fsync() and MTA's?

2001-05-21 Thread Stephen C. Tweedie
Hi, On Sat, May 12, 2001 at 03:13:55PM +0100, Alan Cox wrote: > fsync guarantees the inode data is up to date, fdatasync just the data. fdatasync guarantees "important" inode data too. The only thing that fdatasync is allowed to skip is the timestamps. --Stephen - To unsubscribe from this lis
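
A minimal C sketch of the distinction drawn in the reply above, assuming a POSIX system that provides fdatasync(). The helper name append_durable and the file path in the usage note are hypothetical and do not come from the thread. The helper appends one record and then calls fdatasync(), which per the reply may skip only timestamp updates.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose fdatasync() on strict compilers */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Append one record to `path` and wait until the data is on stable storage.
 * fdatasync() is used instead of fsync(): it still flushes the inode data
 * needed to read the record back, and may skip only the timestamps.
 * Returns 0 on success, -1 on any error.
 */
static int append_durable(const char *path, const char *record)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    size_t len = strlen(record);
    ssize_t written = write(fd, record, len);
    int rc = (written == (ssize_t)len) ? 0 : -1;   /* short write = failure */

    if (rc == 0 && fdatasync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

A caller such as a mail spooler might use append_durable("/tmp/spool.log", "queued\n") and acknowledge the message only when it returns 0.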

Re: [RFC][PATCH] Re: Linux 2.4.4-ac10

2001-05-21 Thread Stephen C. Tweedie
Hi, On Sun, May 20, 2001 at 07:04:31AM -0300, Rik van Riel wrote: > On Sun, 20 May 2001, Mike Galbraith wrote: > > > > Looking at the locking and trying to think SMP (grunt) though, I > > don't like the thought of taking two locks for each page until > > > 100%. The data in that block is toast

Re: LANANA: To Pending Device Number Registrants

2001-05-21 Thread Stephen C. Tweedie
Hi, On Sat, May 19, 2001 at 04:20:11PM -0400, Michael Meissner wrote: > On Fri, May 18, 2001 at 03:17:50PM +0100, Stephen C. Tweedie wrote: > Presumably, a new UUID is created each time format a partition, which means it > is a slight bit of hassle if you have to reload a partition fr

Re: LANANA: To Pending Device Number Registrants

2001-05-19 Thread Stephen C. Tweedie
Hi, On Sat, May 19, 2001 at 05:29:32PM +1200, Chris Wedgwood wrote: > > Or you can fall back to mounting by UUID, which is globally > unique and still avoids referencing physical location. You also > don't need to manually set LABELs for UUID to work: all e2fsprogs > over the pa

Re: Linux 2.4.4-ac10

2001-05-18 Thread Stephen C. Tweedie
Hi, On Fri, May 18, 2001 at 07:44:39PM -0300, Rik van Riel wrote: > This is the core of why we cannot (IMHO) have a discussion > of whether a patch introducing new VM tunables can go in: > there is no clear overview of exactly what would need to be > tunable and how it would help. It's worse th

Re: [PATCH] SMP race in ext2 - metadata corruption.

2001-05-18 Thread Stephen C. Tweedie
Hi, On Fri, May 11, 2001 at 04:54:44PM +0200, Daniel Phillips wrote: > The only reasonable way I can think of getting a block-coherent view > underneath a mounted fs is to have a reverse map, and update it each > time we map block into the page cache or unmap it. It's called the "buffer cache

Re: LANANA: To Pending Device Number Registrants

2001-05-18 Thread Stephen C. Tweedie
Hi, On Wed, May 16, 2001 at 12:18:15PM -0400, Michael Meissner wrote: > With the current LABEL= support, you won't be able to mount the disks with > duplicate labels, but you can still mount them via /dev/sd. Or you can fall back to mounting by UUID, which is globally unique and still avoids re

Re: [PATCH] allocation looping + kswapd CPU cycles

2001-05-10 Thread Stephen C. Tweedie
Hi, On Thu, May 10, 2001 at 03:49:05PM -0300, Marcelo Tosatti wrote: > Back to the main discussion --- I guess we could make __GFP_FAIL (with > __GFP_WAIT set :)) allocations actually fail if "try_to_free_pages()" does > not make any progress (ie returns zero). But maybe thats a bit too > extrem

Re: [PATCH] allocation looping + kswapd CPU cycles

2001-05-10 Thread Stephen C. Tweedie
Hi, On Thu, May 10, 2001 at 03:22:57PM -0300, Marcelo Tosatti wrote: > Initially I thought about __GFP_FAIL to be used by writeout routines which > want to cluster pages until they can allocate memory without causing any > pressure to the system. Something like this: > > while ((page = alloc_p

Re: [PATCH] allocation looping + kswapd CPU cycles

2001-05-10 Thread Stephen C. Tweedie
Hi, On Thu, May 10, 2001 at 01:43:46PM -0300, Marcelo Tosatti wrote: > No. __GFP_FAIL can to try to reclaim pages from inactive clean. > > We just want to avoid __GFP_FAIL allocations from going to > try_to_free_pages(). Why? __GFP_FAIL is only useful as an indication that the caller has some

Re: Swap space deallocation speed. (fwd)

2001-05-04 Thread Stephen C. Tweedie
Hi, On Thu, May 03, 2001 at 12:03:39AM -0400, Dave Mielke wrote: > unresponsive. The relevant line in the log, as you can find in the attached > "crash.log" file, appears to be: > > Unable to handle kernel paging request at virtual address 00020024 > Apr 16 11:23:06 dave kernel: esi: 00020

Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Stephen C. Tweedie
Hi, On Wed, May 02, 2001 at 01:49:16PM +0100, Hugh Dickins wrote: > On Wed, 2 May 2001, Stephen C. Tweedie wrote: > > > > So the aim is more complex. Basically, once we are short on VM, we > > want to eliminate redundant copies of swap data. That implies two > >

Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Stephen C. Tweedie
Hi, On Wed, May 02, 2001 at 12:54:15PM +0200, Rogier Wolff wrote: > > first: Thanks for clearing this up for me. > > So, there are in fact some more "states" a swap-page can be in: > > -(0) free > -(1) allocated, not in mem. > -(2) on swap, valid copy of memory. > -(

Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Stephen C. Tweedie
Hi, On Tue, May 01, 2001 at 06:14:54PM +0200, Rogier Wolff wrote: > Shouldn't the algorithm be: > > - If (current_access == write ) > free (swap_page); > else > map (page, READONLY) > > and > when a write access happens, we fault again, and map free the > swap-page as it i

Re: [Patch] deadlock on write in tmpfs

2001-05-02 Thread Stephen C. Tweedie
Hi, On Tue, May 01, 2001 at 03:39:47PM +0200, Christoph Rohland wrote: > > tmpfs deadlocks when writing into a file from a mapping of the same > file. > > So I see two choices: > > 1) Do not serialise the whole of shmem_getpage_locked but protect >critical pathes with the spinlock and do

Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Stephen C. Tweedie
Hi, On Mon, Apr 30, 2001 at 07:12:12PM +0100, Alan Cox wrote: > > paging in just released 2.4.4, but in previuos kernel, a page that was > > paged-out, reserves its place in swap even if it is paged-in again, so > > once you have paged-out all your ram at least once, you can't get any > > more me

Re: generic_osync_inode/ext2_fsync_inode still not safe

2001-04-20 Thread Stephen C. Tweedie
Hi, On Wed, Apr 18, 2001 at 06:45:40AM -0300, Marcelo Tosatti wrote: > As far as I can see, you cannot guarantee that an inode which is unlocked > _and_ clean (accordingly to the inode->i_state) is safely on disk. > > The reason for that are calls to sync_one() which write the inode > asynchron

Re: RFC: pageable kernel-segments

2001-04-20 Thread Stephen C. Tweedie
Hi, On Fri, Apr 20, 2001 at 03:49:30PM +0100, Alan Cox wrote: > There is a proposal (several it seems) to make 2.5 replace the conventional > unix swap with a filesystem of backing store for anonymous objects. That will > mean each object has its own vm area and inode and thus we can start blowi

Re: RFC: pageable kernel-segments

2001-04-20 Thread Stephen C. Tweedie
Hi, On Tue, Apr 17, 2001 at 12:21:17PM -0700, H. Peter Anvin wrote: > > Certain parts of drivers could get the __pageable prefix or so > > (like the __init parts of drivers which get removed) for letting > > the paging-code know that it can be discared if memory-pressure > > demands it. > > VMS

Re: Asynchronous IO

2001-04-20 Thread Stephen C. Tweedie
Hi, On Fri, Apr 13, 2001 at 04:45:07AM -0400, Dan Maas wrote: > IIRC the problem with implementing asynchronous *disk* I/O in Linux today is > that the filesystem code assumes synchronous I/O operations that block the > whole process/thread. So implementing "real" asynch I/O (without the > overhe

Re: [NEED TESTERS] remove swapin_readahead Re: shmem_getpage_locked() / swapin_readahead() race in 2.4.4-pre3

2001-04-17 Thread Stephen C. Tweedie
Hi, On Sat, Apr 14, 2001 at 08:31:07PM -0300, Marcelo Tosatti wrote: > On Sat, 14 Apr 2001, Rik van Riel wrote: > > On Sat, 14 Apr 2001, Marcelo Tosatti wrote: > > > > > There is a nasty race between shmem_getpage_locked() and > > > swapin_readahead() with the new shmem code (introduced in > > >

Re: generic_osync_inode/ext2_fsync_inode still not safe

2001-04-17 Thread Stephen C. Tweedie
Hi, On Sat, Apr 14, 2001 at 07:24:42AM -0300, Marcelo Tosatti wrote: > > As described earlier, code which wants to write an inode cannot rely on > the I_DIRTY bits (on inode->i_state) being clean to guarantee that the > inode and its dirty pages, if any, are safely synced on disk. Indeed --- fo

Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory

2001-03-25 Thread Stephen C. Tweedie
Hi, On Sat, Mar 24, 2001 at 10:05:18PM -0300, Rik van Riel wrote: > On Sun, 25 Mar 2001, Stephen C. Tweedie wrote: > > > Rik, do you think it is really necessary to take the page lock and > > release it inside lookup_swap_cache? I may be overlooking something, > > but

[PATCH] 2.4.2-ac24 buffer.c oops on highmem

2001-03-24 Thread Stephen C. Tweedie
Hi, We've just seen a buffer.c oops in: >>EIP; c013ae4b <__block_prepare_write+2bb/300> <= Trace; c013b732 Trace; c015dbba Trace; c012a67e Trace; c015dbba Trace; c01281c0 Trace; c01384a6 Trace; c010910b __block_prepare_write()'s "out:" error handler tries to do a

Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory

2001-03-24 Thread Stephen C. Tweedie
Hi, On Fri, Mar 23, 2001 at 11:58:50AM -0800, Linus Torvalds wrote: > Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Uggh --- the shmem code already does, see: shmem_truncate->shmem_truncate_part->shmem_free_swp-> lookup_swap_cache->find_lock_page It looks messy: lookup

Re: [linux-lvm] EXT2-fs panic (device lvm(58,0)):

2001-03-22 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 01:35:05PM -0700, Andreas Dilger wrote: > The only remote possibility is in ext2_free_blocks() if block+count > overflows a 32-bit unsigned value. Only 2 places call ext2_free_blocks() > with a count != 1, and ext2_free_data() looks to be OK. The other > possibility

[PATCH] Fix races in 2.4.2-ac22 SysV shared memory

2001-03-22 Thread Stephen C. Tweedie
Hi, The patch below is for two races in sysV shared memory. The first (minor) one is in shmem_free_swp: swap_free (entry); *ptr = (swp_entry_t){0}; freed++; if (!(page = lookup_swap_cache(entry))) continue;

Re: 2.4.2 fs/inode.c

2001-03-22 Thread Stephen C. Tweedie
Hi, On Thu, Mar 22, 2001 at 01:42:15PM -0500, Jan Harkes wrote: > > I found some code that seems wrong and didn't even match it's comment. > Patch is against 2.4.2, but should go cleanly against 2.4.3-pre6 as well. Patch looks fine to me. Have you tested it? If this goes wrong, things break

Re: Thinko in kswapd?

2001-03-22 Thread Stephen C. Tweedie
Hi, On Thu, Mar 22, 2001 at 09:36:48AM -0800, Linus Torvalds wrote: > On Thu, 22 Mar 2001, Stephen C. Tweedie wrote: > > > > There is what appears to be a simple thinko in kswapd. We really > > ought to keep kswapd running as long as there is either a free space

Thinko in kswapd?

2001-03-22 Thread Stephen C. Tweedie
Hi, There is what appears to be a simple thinko in kswapd. We really ought to keep kswapd running as long as there is either a free space or an inactive page shortfall; but right now we only keep going if _both_ are short. Diff below. With this change, I've got a 64MB box running Applix and St
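
The condition described above can be sketched in plain C. This is only an illustration: the struct and function names are invented here and are not the kernel's, and the actual diff is not part of this excerpt. It contrasts "keep going only if both are short" with "keep going if either is short".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of memory pressure; not a kernel structure. */
struct vm_state {
    int free_shortfall;      /* pages missing from the free target */
    int inactive_shortfall;  /* pages missing from the inactive target */
};

/* The behaviour described as the thinko: continue only when BOTH are short. */
static bool keep_running_buggy(const struct vm_state *s)
{
    return s->free_shortfall > 0 && s->inactive_shortfall > 0;
}

/* The behaviour the post argues for: continue when EITHER is short. */
static bool keep_running_fixed(const struct vm_state *s)
{
    return s->free_shortfall > 0 || s->inactive_shortfall > 0;
}
```

With a free shortfall of 10 pages and no inactive shortfall, the first predicate stops the loop while the second keeps it running, which is the case the post is concerned with.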

Re: changing mm->mmap_sem (was: Re: system call for process information?)

2001-03-19 Thread Stephen C. Tweedie
Hi, On Sun, Mar 18, 2001 at 10:34:38AM +0100, Manfred Spraul wrote: > > The problem is that mmap_sem seems to be protecting the list > > of VMAs, so taking _only_ the page_table_lock could let a VMA > > change under us while a page fault is underway ... > > No, that can't happen. It can. Page

Re: [PATCH]: Only one memory zone for sparc64

2001-03-16 Thread Stephen C. Tweedie
Hi, On Thu, Mar 15, 2001 at 07:13:52PM +1100, Anton Blanchard wrote: > > On sparc64 we dont care about the different memory zones and iterating > through them all over the place only serves to waste CPU. I suspect this > would be the case with some other architectures but for the moment I > have

Re: changing mm->mmap_sem (was: Re: system call for process information?)

2001-03-16 Thread Stephen C. Tweedie
Hi, On Fri, Mar 16, 2001 at 08:50:25AM -0300, Rik van Riel wrote: > On Fri, 16 Mar 2001, Stephen C. Tweedie wrote: > > > > Write locks would be used in the code where we actually want > > > to change the VMA list and page faults would use an extra lock

Re: O_DSYNC flag for open

2001-03-16 Thread Stephen C. Tweedie
Hi, On Wed, Mar 14, 2001 at 10:26:42PM -0500, Tom Vier wrote: > fdatasync() is the same as fsync(), in linux. No, in 2.4 fdatasync does the right thing and skips the inode flush if only the timestamps have changed. > until fdatasync() is > implimented (ie, syncs the data only) fdatasync is req

Re: changing mm->mmap_sem (was: Re: system call for process information?)

2001-03-16 Thread Stephen C. Tweedie
Hi, On Thu, Mar 15, 2001 at 09:24:59AM -0300, Rik van Riel wrote: > On Wed, 14 Mar 2001, Rik van Riel wrote: > The mmap_sem is used in procfs to prevent the list of VMAs > from changing. In the page fault code it seems to be used > to prevent other page faults to happen at the same time with > t

Re: magic device renumbering was -- Re: Linux 2.4.2ac20

2001-03-16 Thread Stephen C. Tweedie
Hi, On Wed, Mar 14, 2001 at 02:11:57PM -0500, Lars Kellogg-Stedman wrote: > > Put LABEL= in you fstab in place of the device name. > > Which is great, for filesystems that support labels. Unfortunately, > this isn't universally available -- for instance, you cannot mount > a swap partition by l

Re: BUG? race between kswapd and ptrace (access_process_vm )

2001-03-12 Thread Stephen C. Tweedie
Hi, On Thu, Mar 08, 2001 at 09:12:52PM +0100, Manfred Spraul wrote: > > > Fixing the bug is more difficult than I thought: > > Initially I assumed it would be a two-liner (lock, unlock) but kmap() > can sleep. > > Can I reuse a kmap_atomic() type or should I add a new type? I've just tried wi

Re: 64-bit capable block device layer

2001-03-08 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 07:53:23PM +0100, Jens Axboe wrote: > > > > OTOH, I'm not sure what problems it could give to make this > > a compile-time option... > > Plus compile time options are nasty :-). It would probably make > bigger sense to completely skip all the merging etc for low end

Re: scsi vs ide performance on fsync's

2001-03-08 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 10:36:38AM -0800, Linus Torvalds wrote: > On Wed, 7 Mar 2001, Jeremy Hansen wrote: > > > > So in the meantime as this gets worked out on a lower level, we've decided > > to take the fsync() out of berkeley db for mysql transaction logs and > > mount the filesystem -o

Re: scsi vs ide performance on fsync's

2001-03-07 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 09:15:36PM +0100, Jens Axboe wrote: > On Wed, Mar 07 2001, Stephen C. Tweedie wrote: > > > > For most fs'es, that's not an issue. The fs won't start writeback on > > the primary disk at all until the journal commit has been acknow

Re: scsi vs ide performance on fsync's

2001-03-07 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 07:51:52PM +0100, Jens Axboe wrote: > On Wed, Mar 07 2001, Stephen C. Tweedie wrote: > > My bigger concern is when the journalled fs has a log on a different > queue. For most fs'es, that's not an issue. The fs won't start writeback on th

Re: scsi vs ide performance on fsync's

2001-03-07 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 03:12:41PM +0100, Jens Axboe wrote: > > Yep, it's much harder than it seems. Especially because for the barrier > to be really useful, having inter-request dependencies becomes a > requirement. So you can say something like 'flush X and Y, but don't > flush Y before X

Re: scsi vs ide performance on fsync's

2001-03-07 Thread Stephen C. Tweedie
Hi, On Tue, Mar 06, 2001 at 09:37:20PM +0100, Jens Axboe wrote: > > SCSI has ordered tag, which fit the model Alan described quite nicely. > I've been meaning to implement this for some time, it would be handy > for journalled fs to use such a barrier. Since ATA doesn't do queueing > (at least n

Re: scsi vs ide performance on fsync's

2001-03-07 Thread Stephen C. Tweedie
Hi, On Tue, Mar 06, 2001 at 10:44:34AM -0800, Linus Torvalds wrote: > On Tue, 6 Mar 2001, Alan Cox wrote: > > You want a write barrier. Write buffering (at least for short intervals) in > > the drive is very sensible. The kernel needs to able to send drivers a write > > barrier which will not be

Raw IO fixes for 2.4.2-ac8

2001-03-02 Thread Stephen C. Tweedie
Hi, I've just uploaded the current raw IO fixes as kiobuf-2.4.2-ac8-A0.tar.gz on ftp.uk.linux.org:/pub/linux/sct/fs/raw-io/ and ftp.*.kernel.org:/pub/linux/kernel/people/sct/raw-io/ This includes: 00-movecode.diff: move kiobuf code from mm/memory.c to fs/iobuf.c 02-faul

Re: [patch] set kiobuf io_count once, instead of increment

2001-03-02 Thread Stephen C. Tweedie
Hi, On Wed, Feb 28, 2001 at 09:18:59AM -0800, Robert Read wrote: > On Tue, Feb 27, 2001 at 10:50:54PM -0300, Marcelo Tosatti wrote: > This is true, but it looks like the brw_kiovec allocation failure > handling is broken already; it's calling __put_unused_buffer_head on > bhs without waiting for

Re: [patch] set kiobuf io_count once, instead of increment

2001-03-02 Thread Stephen C. Tweedie
On Tue, Feb 27, 2001 at 04:22:22PM -0800, Robert Read wrote: > Currently in brw_kiovec, iobuf->io_count is being incremented as each > bh is submitted, and decremented in the bh->b_end_io(). This means > io_count can go to zero before all the bhs have been submitted, > especially during a large r
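
The counting problem quoted above can be shown with a single-threaded toy model in C. Everything here (io_batch, end_io, the two submit helpers) is hypothetical and is not the brw_kiovec code; it assumes the worst case in which each request completes before the next one is submitted.

```c
#include <assert.h>

/* Toy stand-in for a batch of buffer-head requests; not a kernel type. */
struct io_batch {
    int io_count;               /* requests still outstanding */
    int completions_signalled;  /* times "whole batch done" was reported */
};

/* Completion callback: report the batch as done when the count hits zero. */
static void end_io(struct io_batch *b)
{
    if (--b->io_count == 0)
        b->completions_signalled++;
}

/* Increment per submission: the count can reach zero after every request. */
static int submit_incrementing(struct io_batch *b, int nr)
{
    for (int i = 0; i < nr; i++) {
        b->io_count++;
        end_io(b);              /* request finishes immediately */
    }
    return b->completions_signalled;
}

/* Set the count once up front: zero is reached only after the last request. */
static int submit_preset(struct io_batch *b, int nr)
{
    b->io_count = nr;
    for (int i = 0; i < nr; i++)
        end_io(b);
    return b->completions_signalled;
}
```

In this model a batch of four requests is reported complete four times with per-submission increments, but exactly once when the count is set before submission.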

Re: Writing on raw device with software RAID 0 is slow

2001-03-01 Thread Stephen C. Tweedie
Hi, On Thu, Mar 01, 2001 at 11:08:13AM -0500, Ben LaHaise wrote: > On Thu, 1 Mar 2001, Stephen C. Tweedie wrote: > > Actually, how about making it a sysctl? That's probably the most > reasonable approach for now since the optimal size depends on hardware. Fine with me

Re: Writing on raw device with software RAID 0 is slow

2001-03-01 Thread Stephen C. Tweedie
Hi, On Thu, Mar 01, 2001 at 10:44:38AM -0500, Ben LaHaise wrote: > > On Thu, 1 Mar 2001, Stephen C. Tweedie wrote: > > > Raw IO is always synchronous: it gets flushed to disk before the write > > returns. You don't get any write-behind with raw IO, so the smaller

Re: ext3 fsck question

2001-03-01 Thread Stephen C. Tweedie
Hi, On Wed, Feb 28, 2001 at 08:03:21PM -0600, Neal Gieselman wrote: > > I applied the libs and other utilites from e2fsprogs by hand. > I ran fsck.ext3 on my secondary partition and it ran fine. The boot fsck > on / was complaining about something but I could not catch it. > I then went single

Re: Writing on raw device with software RAID 0 is slow

2001-03-01 Thread Stephen C. Tweedie
Hi, On Wed, Feb 28, 2001 at 03:58:11PM +0100, Martin Rauh wrote: > > Writing to an software RAID 0 containing 4 SCSI discs is very fast. > I get transfer rates of about 100 MBytes/s. The filesystem on the RAID > is ext2. > > Writing to the same RAID directly (that means on the raw device withou

Re: [PATCH] guard mm->rss with page_table_lock (241p11)

2001-02-13 Thread Stephen C. Tweedie
Hi, On Mon, Feb 12, 2001 at 07:15:57PM -0800, george anzinger wrote: > Excuse me if I am off base here, but wouldn't an atomic operation be > better here. There are atomic inc/dec and add/sub macros for this. It > just seems that that is all that is needed here (from inspection of the > patch).

Re: ext2: block > big ?

2001-02-12 Thread Stephen C. Tweedie
Hi, On Sun, Feb 11, 2001 at 05:44:02PM -0700, Brian Grossman wrote: > > What does a message like 'ext2: block > big' indicate? An attempt was made to access a block beyond the legal max size for an ext2 file. That probably implies a corrupt inode, because the ext2 file write code checks for th

Re: Raw devices bound to RAID arrays ?

2001-02-11 Thread Stephen C. Tweedie
Hi, On Sun, Feb 11, 2001 at 06:29:12PM +0200, Petru Paler wrote: > > Is it possible to bind a raw device to a software RAID 1 array ? Yes. --Stephen

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-08 Thread Stephen C. Tweedie
Hi, On Thu, Feb 08, 2001 at 03:52:35PM +0100, Mikulas Patocka wrote: > > > How do you write high-performance ftp server without threads if select > > on regular file always returns "ready"? > > No, it's not really possible on Linux. Use SYS$QIO call on VMS :-) Ahh, but even VMS SYS$QIO is sync

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-08 Thread Stephen C. Tweedie
Hi, On Thu, Feb 08, 2001 at 12:15:13AM +0100, Pavel Machek wrote: > > > EAGAIN is _not_ a valid return value for block devices or for regular > > files. And in fact it _cannot_ be, because select() is defined to always > > return 1 on them - so if a write() were to return EAGAIN, user space woul

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-07 Thread Stephen C. Tweedie
Hi, On Wed, Feb 07, 2001 at 12:12:44PM -0700, Richard Gooch wrote: > Stephen C. Tweedie writes: > > > > Sorry? I'm not sure where communication is breaking down here, but > > we really don't seem to be talking about the same things. SGI's > > kiobuf r

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-07 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 06:37:41PM -0800, Linus Torvalds wrote: > > > However, I really _do_ want to have the page cache have a bigger > granularity than the smallest memory mapping size, and there are always > special cases that might be able to generate IO in bigger chunks (ie > in-kernel s

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-07 Thread Stephen C. Tweedie
Hi, On Wed, Feb 07, 2001 at 09:10:32AM +0000, David Howells wrote: > > I presume that correct_size will always be a power of 2... Yes. --Stephen

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 04:50:19PM -0800, Linus Torvalds wrote: > > > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote: > > > > That gets us from 512-byte blocks to 4k, but no more (ll_rw_block > > enforces a single blocksize on all requests but that relaxing tha

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 04:41:21PM -0800, Linus Torvalds wrote: > > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote: > > No, it is a problem of the ll_rw_block interface: buffer_heads need to > > be aligned on disk at a multiple of their buffer size. > > Ehh.. T

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 07:25:19PM -0500, Ingo Molnar wrote: > > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote: > > > No, it is a problem of the ll_rw_block interface: buffer_heads need to > > be aligned on disk at a multiple of their buffer size. Under the Unix >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 08:57:13PM +0100, Ingo Molnar wrote: > > [overhead of 512-byte bhs in the raw IO code is an artificial problem of > the raw IO code.] No, it is a problem of the ll_rw_block interface: buffer_heads need to be aligned on disk at a multiple of their buffer size. Under

Re: sync & asyck i/o

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 11:25:00AM -0800, Andre Hedrick wrote: > On Tue, 6 Feb 2001, Stephen C. Tweedie wrote: > > No, we simply omit to instruct them to enable write-back caching. > > Linux assumes that the WCE (write cache enable) bit in a disk's > > caching m

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 06:22:58PM +0100, Christoph Hellwig wrote: > On Tue, Feb 06, 2001 at 05:05:06PM +0000, Stephen C. Tweedie wrote: > > The whole point of the post was that it is merging, not splitting, > > which is troublesome. How are you going to merge requests wit

Re: sync & asyck i/o

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 05:54:41PM +0000, David Woodhouse wrote: > > [EMAIL PROTECTED] said: > > Linux will obey that if it possibly can: only in cases where the > > hardware is actively lying about when the data has hit disk will the > > guarantee break down. > > Do we attempt to ask SCS

Re: sync & asyck i/o

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 02:52:40PM +0000, Alan Cox wrote: > > According to the man page for fsync it copies in-core data to disk > > prior to its return. Does that take async i/o to the media in account? > > I.e. does it wait for completion of the async i/o to the disk? > > Undefined. >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 06:00:58PM +0100, Christoph Hellwig wrote: > On Tue, Feb 06, 2001 at 12:07:04AM +0000, Stephen C. Tweedie wrote: > > > > Is that a realistic basis for a cleaned-up ll_rw_blk.c? > > I don't think os. If we minimize the state in the IO con

Re: rawio usage

2001-02-06 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 10:36:32PM -0800, Mayank Vasa wrote: > > When I run this program as root, I get the error "write: Invalid argument". Raw IO requires that the buffers are aligned on a 512-byte boundary in memory. --Stephen
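
One way to satisfy the alignment rule stated above from user space is posix_memalign(). The sketch below assumes a POSIX system; the helper names raw_io_alloc and raw_io_is_aligned are hypothetical, and rounding the length up to a multiple of 512 is an added assumption, not something the reply states.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign() on strict compilers */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RAW_IO_ALIGN 512   /* alignment quoted in the reply above */

/*
 * Allocate a zeroed buffer whose address is a multiple of RAW_IO_ALIGN.
 * The length is rounded up to a whole number of 512-byte units (assumption).
 * Returns NULL on failure; release the buffer with free().
 */
static void *raw_io_alloc(size_t len)
{
    void *buf = NULL;
    size_t rounded = (len + RAW_IO_ALIGN - 1) / RAW_IO_ALIGN * RAW_IO_ALIGN;

    if (rounded == 0)
        rounded = RAW_IO_ALIGN;
    if (posix_memalign(&buf, RAW_IO_ALIGN, rounded) != 0)
        return NULL;
    memset(buf, 0, rounded);
    return buf;
}

/* Check whether a pointer sits on a RAW_IO_ALIGN boundary. */
static int raw_io_is_aligned(const void *p)
{
    return ((uintptr_t)p % RAW_IO_ALIGN) == 0;
}
```

A buffer obtained from raw_io_alloc() can be passed to write() on the raw device. A plain malloc() buffer carries no such alignment guarantee, which is one plausible source of the "Invalid argument" error.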

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-05 Thread Stephen C. Tweedie
Hi, OK, if we take a step back what does this look like: On Mon, Feb 05, 2001 at 08:54:29PM +0000, Stephen C. Tweedie wrote: > > If we are doing readahead, we want completion callbacks raised as soon > as possible on IO completions, no matter how many other IOs have been > mer

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 11:06:48PM +0000, Alan Cox wrote: > > do you then tell the application _above_ raid0 if one of the > > underlying IOs succeeds and the other fails halfway through? > > struct > { > u32 flags; /* because everything needs flags */ > struct io_completio

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 10:28:37PM +0100, Ingo Molnar wrote: > > On Mon, 5 Feb 2001, Stephen C. Tweedie wrote: > > it's exactly these 'compound' structures i'm vehemently against. I do > think it's a design nightmare. I can picture these monster kiob

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 11:28:17AM -0800, Linus Torvalds wrote: > The _vectors_ are needed at the very lowest levels: the levels that do not > necessarily have to worry at all about completion notification etc. You > want the arbitrary scatter-gather vectors passed down to the stuff that > s

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 08:36:31AM -0800, Linus Torvalds wrote: > Have you ever thought about other things, like networking, special > devices, stuff like that? They can (and do) have packet boundaries that > have nothing to do with pages what-so-ever. They can have such notions as > packets

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 05:29:47PM +0000, Alan Cox wrote: > > > > _All_ drivers would have to do that in the degenerate case, because > > none of our drivers can deal with a dma boundary in the middle of a > > sector, and even in those places where the hardware supports it in > > theory, you

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 03:19:09PM +, Alan Cox wrote: > > Yes, it's the sort of thing that you would hope should work, but in > > practice it's not reliable. > > So the less smart devices need to call something like > > kiovec_align(kiovec, 512); > > and have it do the bounce buf

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 01:00:51PM +0100, Manfred Spraul wrote: > "Stephen C. Tweedie" wrote: > > > > You simply cannot do physical disk IO on > > non-sector-aligned memory or in chunks which aren't a multiple of > > sector size. > > Why no

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 08:01:45PM +0530, [EMAIL PROTECTED] wrote: > > >It's the very essence of readahead that we wake up the earlier buffers > >as soon as they become available, without waiting for the later ones > >to complete, so we _need_ this multiple completion concept. > > I can und

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Fri, Feb 02, 2001 at 01:02:28PM +0100, Christoph Hellwig wrote: > > > I may still be persuaded that we need the full scatter-gather list > > fields throughout, but for now I tend to think that, at least in the > > disk layers, we may get cleaner results by allow linked lists of > > page-a

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Sun, Feb 04, 2001 at 06:54:58PM +0530, [EMAIL PROTECTED] wrote: > > Can't we define a kiobuf structure as just this ? A combination of a > frag_list and a page_list ? Then all code which needs to accept an arbitrary kiobuf needs to be able to parse both --- ugh. > BTW, We could have a h

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Sat, Feb 03, 2001 at 12:28:47PM -0800, Linus Torvalds wrote: > > On Thu, 1 Feb 2001, Stephen C. Tweedie wrote: > > > Neither the read nor the write are page-aligned. I don't know where you > got that idea. It's obviously not true even in the common case: it

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-02 Thread Stephen C. Tweedie
Hi, On Fri, Feb 02, 2001 at 12:51:35PM +0100, Christoph Hellwig wrote: > > > > If I have a page vector with a single offset/length pair, I can build > > a new header with the same vector and modified offset/length to split > > the vector in two without copying it. > > You just say in the higher

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote: > I think you want the whole kio concept only for disk-like IO. No. I want something good for zero-copy IO in general, but a lot of that concerns the problem of interacting with the user, and the basic center of that inte

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote: > > > On Thu, Feb 01, 2001 at 05:34:49PM +, Alan Cox wrote: > > In the disk IO case, you basically don't get that (the only thing > > which comes close is raid5 parity blocks). The data which the user > > started with is

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 09:46:27PM +0100, Christoph Hellwig wrote: > > Right now we can take a kiobuf and turn it into a bunch of > > buffer_heads for IO. The io_count lets us track all of those sub-IOs > > so that we know when all submitted IO has completed, so that we can > > pass the com

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 07:14:03PM +0100, Christoph Hellwig wrote: > On Thu, Feb 01, 2001 at 05:41:20PM +0000, Stephen C. Tweedie wrote: > > > > > > We can't allocate a huge kiobuf structure just for requesting one page of > > > IO. It might get better

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 06:49:50PM +0100, Christoph Hellwig wrote: > > > Adding tons of base/limit pairs to kiobufs makes it worse not better > > For disk I/O it makes the handling a little easier for the cost of the > additional offset/length fields. Umm, actually, no, it makes it much wo

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 05:34:49PM +, Alan Cox wrote: > > > > I don't see any real advantage for disk IO. The real advantage is that > > we can have a generic structure that is also usefull in e.g. networking > > and can lead to a unified IO buffering scheme (a little like IO-Lite). >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 06:05:15PM +0100, Christoph Hellwig wrote: > On Thu, Feb 01, 2001 at 04:16:15PM +0000, Stephen C. Tweedie wrote: > > > > > > No, and with the current kiobufs it would not make sense, because they > > > are to heavy-weight. > >
