Re: Distributed storage. Move away from char device ioctls.

2007-09-16 Thread Kyle Moffett
On Sep 15, 2007, at 13:24:46, Andreas Dilger wrote: On Sep 15, 2007 16:29 +0400, Evgeniy Polyakov wrote: Yes, block device itself is not able to scale well, but it is the place for redundancy, since filesystem will just fail if underlying device does not work correctly and FS actually does

Re: Distributed storage. Move away from char device ioctls.

2007-09-16 Thread Evgeniy Polyakov
On Sat, Sep 15, 2007 at 11:24:46AM -0600, Andreas Dilger ([EMAIL PROTECTED]) wrote: When Chris Mason announced btrfs, I found that quite a few new ideas are already implemented there, so I postponed project (although direction of the development of the btrfs seems to move to the zfs side

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
Andrea Arcangeli [EMAIL PROTECTED] writes: On Sat, Sep 15, 2007 at 10:14:44PM +0200, Goswin von Brederlow wrote: - Userspace allocates a lot of memory in those slabs. If with slabs you mean slab/slub, I can't follow, there has never been a single byte of userland memory allocated there since

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Andrea Arcangeli
On Sun, Sep 16, 2007 at 03:54:56PM +0200, Goswin von Brederlow wrote: Andrea Arcangeli [EMAIL PROTECTED] writes: On Sat, Sep 15, 2007 at 10:14:44PM +0200, Goswin von Brederlow wrote: - Userspace allocates a lot of memory in those slabs. If with slabs you mean slab/slub, I can't follow,

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Jörn Engel
On Sun, 16 September 2007 00:30:32 +0200, Andrea Arcangeli wrote: Movable? I rather assume all slab allocations aren't movable. Then slab defrag can try to tackle on users like dcache and inodes. Keep in mind that with the exception of updatedb, those inodes/dentries will be pinned and you

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Jörn Engel
On Sat, 15 September 2007 01:44:49 -0700, Andrew Morton wrote: On Tue, 11 Sep 2007 14:12:26 +0200 Jörn Engel [EMAIL PROTECTED] wrote: While I agree with your concern, those numbers are quite silly. The chances of 99.8% of pages being free and the remaining 0.2% being perfectly spread

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Mel Gorman
On (15/09/07 14:14), Goswin von Brederlow didst pronounce: Andrew Morton [EMAIL PROTECTED] writes: On Tue, 11 Sep 2007 14:12:26 +0200 Jörn Engel [EMAIL PROTECTED] wrote: While I agree with your concern, those numbers are quite silly. The chances of 99.8% of pages being free and the

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Mel Gorman
On (15/09/07 17:51), Andrea Arcangeli didst pronounce: On Sat, Sep 15, 2007 at 02:14:42PM +0200, Goswin von Brederlow wrote: I keep coming back to the fact that movable objects should be moved out of the way for unmovable ones. Anything else just allows That's incidentally exactly what the

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Linus Torvalds
On Sun, 16 Sep 2007, Jörn Engel wrote: I have been toying with the idea of having separate caches for pinned and movable dentries. Downside of such a patch would be the number of memcpy() operations when moving dentries from one cache to the other. Totally inappropriate. I bet 99% of all

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Jörn Engel
On Sun, 16 September 2007 11:15:36 -0700, Linus Torvalds wrote: On Sun, 16 Sep 2007, Jörn Engel wrote: I have been toying with the idea of having separate caches for pinned and movable dentries. Downside of such a patch would be the number of memcpy() operations when moving dentries

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Linus Torvalds
On Sun, 16 Sep 2007, Jörn Engel wrote: My approach is to have one for mount points and ramfs/tmpfs/sysfs/etc. which are pinned for their entire lifetime and another for regular files/inodes. One could take a three-way approach and have always-pinned, often-pinned and rarely-pinned. We

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Andrea Arcangeli
On Sun, Sep 16, 2007 at 07:15:04PM +0100, Mel Gorman wrote: Except now as I've repeatedly pointed out, you have internal fragmentation problems. If we went with the SLAB, we would need 16MB slabs on PowerPC for example to get the same sort of results and a lot of copying and moving when Well

Re: [RFC][PATCH] 9p: add readahead support for loose mode

2007-09-16 Thread Peter Zijlstra
On Sat, 15 Sep 2007 03:41:26 -0700 Andrew Morton [EMAIL PROTECTED] wrote: eww, kmap. Large amounts of them, apparently. Be aware that kmap is a) slow and b) deadlockable. The latter happens when multiple tasks want to take more than one kmap simultaneously: they all wait for someone else

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Mel Gorman
On (16/09/07 20:50), Andrea Arcangeli didst pronounce: On Sun, Sep 16, 2007 at 07:15:04PM +0100, Mel Gorman wrote: Except now as I've repeatedly pointed out, you have internal fragmentation problems. If we went with the SLAB, we would need 16MB slabs on PowerPC for example to get the same

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Mel Gorman
On (16/09/07 17:08), Andrea Arcangeli didst pronounce: On Sun, Sep 16, 2007 at 03:54:56PM +0200, Goswin von Brederlow wrote: Andrea Arcangeli [EMAIL PROTECTED] writes: On Sat, Sep 15, 2007 at 10:14:44PM +0200, Goswin von Brederlow wrote: - Userspace allocates a lot of memory in those

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Mel Gorman
On (15/09/07 02:31), Goswin von Brederlow didst pronounce: Mel Gorman [EMAIL PROTECTED] writes: On Fri, 2007-09-14 at 18:10 +0200, Goswin von Brederlow wrote: Nick Piggin [EMAIL PROTECTED] writes: In my attack, I cause the kernel to allocate lots of unmovable allocations and

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Andrea Arcangeli
On Sun, Sep 16, 2007 at 09:54:18PM +0100, Mel Gorman wrote: The 16MB is the size of a hugepage, the size of interest as far as I am concerned. Your idea makes sense for large block support, but much less for huge pages because you are incurring a cost in the general case for something that may

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Mel Gorman
On (16/09/07 19:53), Jörn Engel didst pronounce: On Sat, 15 September 2007 01:44:49 -0700, Andrew Morton wrote: On Tue, 11 Sep 2007 14:12:26 +0200 Jörn Engel [EMAIL PROTECTED] wrote: While I agree with your concern, those numbers are quite silly. The chances of 99.8% of pages being

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
[EMAIL PROTECTED] (Mel Gorman) writes: On (15/09/07 14:14), Goswin von Brederlow didst pronounce: Andrew Morton [EMAIL PROTECTED] writes: On Tue, 11 Sep 2007 14:12:26 +0200 Jörn Engel [EMAIL PROTECTED] wrote: While I agree with your concern, those numbers are quite silly. The chances

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
Jörn Engel [EMAIL PROTECTED] writes: On Sun, 16 September 2007 00:30:32 +0200, Andrea Arcangeli wrote: Movable? I rather assume all slab allocations aren't movable. Then slab defrag can try to tackle on users like dcache and inodes. Keep in mind that with the exception of updatedb, those

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
[EMAIL PROTECTED] (Mel Gorman) writes: On (15/09/07 02:31), Goswin von Brederlow didst pronounce: Mel Gorman [EMAIL PROTECTED] writes: On Fri, 2007-09-14 at 18:10 +0200, Goswin von Brederlow wrote: Nick Piggin [EMAIL PROTECTED] writes: In my attack, I cause the kernel to allocate

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Jörn Engel
On Mon, 17 September 2007 00:06:24 +0200, Goswin von Brederlow wrote: How probable is it that the dentry is needed again? If you copy it and it is not needed then you wasted time. If you throw it out and it is needed then you wasted time too. Depending on the probability one of the two is

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
Linus Torvalds [EMAIL PROTECTED] writes: On Sun, 16 Sep 2007, Jörn Engel wrote: My approach is to have one for mount points and ramfs/tmpfs/sysfs/etc. which are pinned for their entire lifetime and another for regular files/inodes. One could take a three-way approach and have

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
[EMAIL PROTECTED] (Mel Gorman) writes: On (16/09/07 17:08), Andrea Arcangeli didst pronounce: zooming in I see red pixels all over the squares mixed with green pixels in the same square. This is exactly what happens with the variable order page cache and that's why it provides zero guarantees

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread Goswin von Brederlow
Andrea Arcangeli [EMAIL PROTECTED] writes: You ignore one other bit, when /usr/bin/free says 1G is free, with config-page-shift it's free no matter what, same goes for not mlocked cache. With variable order page cache, /usr/bin/free becomes mostly a lie as long as there's no 4k fallback (like

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-16 Thread David Chinner
On Fri, Sep 14, 2007 at 06:48:55AM +1000, Nick Piggin wrote: On Thursday 13 September 2007 12:01, Nick Piggin wrote: On Thursday 13 September 2007 23:03, David Chinner wrote: Then just do operations on directories with lots of files in them (tens of thousands). Every directory operation