Re: [PATCH RFC] extent mapped page cache

2007-07-12 Thread Daniel Phillips
On Tuesday 10 July 2007 14:03, Chris Mason wrote: > This patch aims to demonstrate one way to replace buffer heads with a > few extent trees... Hi Chris, Quite terse commentary on algorithms and data structures, but I suppose that is not a problem because Jon has a whole week to reverse engineer

Re: Distributed storage.

2007-08-02 Thread Daniel Phillips
On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote: > Hi. > > I'm pleased to announce first release of the distributed storage > subsystem, which allows to form a storage on top of remote and local > nodes, which in turn can be exported to another storage as a node to > form tree-like storages.

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips
On Friday 03 August 2007 06:49, Evgeniy Polyakov wrote: > ...rx has global reserve (always allocated on > startup or sometime way before reclaim/oom) where data is originally > received (including skb, shared info and whatever is needed, page is > just an example), then it is copied into per-socket

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips
On Friday 03 August 2007 07:53, Peter Zijlstra wrote: > On Fri, 2007-08-03 at 17:49 +0400, Evgeniy Polyakov wrote: > > On Fri, Aug 03, 2007 at 02:27:52PM +0200, Peter Zijlstra wrote: > > ...my main position is to > > allocate per socket reserve from socket's queue, and copy data > > there from main

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips
Hi Evgeniy, Nit alert: On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote: > * storage can be formed on top of remote nodes and be exported > simultaneously (iSCSI is peer-to-peer only, NBD requires device > mapper and is synchronous) In fact, NBD has nothing to do with device

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips
Hi Mike, On Thursday 02 August 2007 21:09, Mike Snitzer wrote: > But NBD's synchronous nature is actually an asset when coupled with > MD raid1 as it provides guarantees that the data has _really_ been > mirrored remotely. And bio completion doesn't? Regards, Daniel

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips
On Friday 03 August 2007 03:26, Evgeniy Polyakov wrote: > On Thu, Aug 02, 2007 at 02:08:24PM -0700, I wrote: > > I see bits that worry me, e.g.: > > > > + req = mempool_alloc(st->w->req_pool, GFP_NOIO); > > > > which seems to be callable in response to a local request, just the > > case w

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips
On Saturday 04 August 2007 09:37, Evgeniy Polyakov wrote: > On Fri, Aug 03, 2007 at 06:19:16PM -0700, I wrote: > > To be sure, I am not very proud of this throttling mechanism for > > various reasons, but the thing is, _any_ throttling mechanism no > > matter how sucky solves the deadlock problem.

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips
On Saturday 04 August 2007 09:44, Evgeniy Polyakov wrote: > > On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote: > > > * storage can be formed on top of remote nodes and be > > > exported simultaneously (iSCSI is peer-to-peer only, NBD requires > > > device mapper and is synchronous) > > >

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips
On Sunday 05 August 2007 08:08, Evgeniy Polyakov wrote: > If we are sleeping in memory pool, then we already do not have memory > to complete previous requests, so we are in trouble. Not at all. Any requests in flight are guaranteed to get the resources they need to complete. This is guaranteed

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips
On Sunday 05 August 2007 08:01, Evgeniy Polyakov wrote: > On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips wrote: > > > DST original code worked as device mapper plugin too, but its two > > > additional allocations (io and clone) per block request ended up > >

Re: Distributed storage.

2007-08-07 Thread Daniel Phillips
On Tuesday 07 August 2007 05:05, Jens Axboe wrote: > On Sun, Aug 05 2007, Daniel Phillips wrote: > > A simple way to solve the stable accounting field issue is to add a > > new pointer to struct bio that is owned by the top level submitter > > (normally generic_make_request b

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-12 Thread Daniel Phillips
Hi Evgeniy, Sorry for not getting back to you right away, I was on the road with limited email access. Incidentally, the reason my mails to you keep bouncing is, your MTA is picky about my mailer's IP reversing to a real hostname. I will take care of that pretty soon, but for now my direct m

Re: Distributed storage.

2007-08-12 Thread Daniel Phillips
On Tuesday 07 August 2007 13:55, Jens Axboe wrote: > I don't like structure bloat, but I do like nice design. Overloading > is a necessary evil sometimes, though. Even today, there isn't enough > room to hold bi_rw and bi_flags in the same variable on 32-bit archs, > so that concern can be scratche

Re: Block device throttling [Re: Distributed storage.]

2007-08-12 Thread Daniel Phillips
On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote: > On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote: > > So, what did we decide? To bloat bio a bit (add a queue pointer) or > to use physical device limits? The latter requires to replace all > occurrence of bi

Re: Block device throttling [Re: Distributed storage.]

2007-08-12 Thread Daniel Phillips
(previous incomplete message sent accidentally) On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote: > On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe wrote: > > So, what did we decide? To bloat bio a bit (add a queue pointer) or > to use physical device limits? The latter requires to r

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips
On Sunday 12 August 2007 22:36, I wrote: > Note! There are two more issues I forgot to mention earlier. Oops, and there is also: 3) The bio throttle, which is supposed to prevent deadlock, can itself deadlock. Let me see if I can remember how it goes. * generic_make_request puts a bio in fl

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 00:28, Jens Axboe wrote: > On Sun, Aug 12 2007, Daniel Phillips wrote: > > Right, that is done by bi_vcnt. I meant bi_max_vecs, which you can > > derive efficiently from BIO_POOL_IDX() provided the bio was > > allocated in the standard way. > >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 00:45, Jens Axboe wrote: > On Mon, Aug 13 2007, Jens Axboe wrote: > > > You did not comment on the one about putting the bio destructor > > > in the ->endio handler, which looks dead simple. The majority of > > > cases just use the default endio handler and the default > >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 02:13, Jens Axboe wrote: > On Mon, Aug 13 2007, Daniel Phillips wrote: > > On Monday 13 August 2007 00:45, Jens Axboe wrote: > > > On Mon, Aug 13 2007, Jens Axboe wrote: > > > > > You did not comment on the one about putting the bio >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 02:18, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 02:08:57AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > > But that idea fails as well, since reference counts and IO > > > completion are two completely separate entities. So unl

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 03:06, Jens Axboe wrote: > On Mon, Aug 13 2007, Daniel Phillips wrote: > > Of course not. Nothing I said stops endio from being called in the > > usual way as well. For this to work, endio just needs to know that > > one call means "end"

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 03:22, Jens Axboe wrote: > I never compared the bio to struct page, I'd obviously agree that > shrinking struct page was a worthy goal and that it'd be ok to uglify > some code to do that. The same isn't true for struct bio. I thought I just said that. Regards, Daniel

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote: > > Oops, and there is also: > > > > 3) The bio throttle, which is supposed to prevent deadlock, can > > itself deadlock. Let me see if I can remember how it goes. > > > > * generic_make_request puts a bio in flight > > * the bio gets pas

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 01:23, Evgeniy Polyakov wrote: > On Sun, Aug 12, 2007 at 10:36:23PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > (previous incomplete message sent accidentally) > > > > On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote: > > >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 04:03, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 03:12:33AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > > This is not a very good solution, since it requires all users of > > > the bios to know how to free it. > > > > No

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 05:04, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 04:04:26AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote: > > > > Oops, and there is also: > > > > > > > >

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 05:18, Evgeniy Polyakov wrote: > > Say you have a device mapper device with some physical device > > sitting underneath, the classic use case for this throttle code. > > Say 8,000 threads each submit an IO in parallel. The device mapper > > mapping function will be called

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips
On Monday 13 August 2007 02:12, Jens Axboe wrote: > > It is a system wide problem. Every block device needs throttling, > > otherwise queues expand without limit. Currently, block devices > > that use the standard request library get a slipshod form of > > throttling for free in the form of limit

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips
On Tuesday 14 August 2007 01:46, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 06:04:06AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > Perhaps you never worried about the resources that the device > > mapper mapping function allocates to handle each bio and so did n

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips
On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote: > > And it will not solve the deadlock problem in general. (Maybe it > > works for your virtual device, but I wonder...) If the virtual > > device allocates memory during generic_make_request then the memory > > needs to be throttled. > > D

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips
On Tuesday 14 August 2007 04:50, Evgeniy Polyakov wrote: > On Tue, Aug 14, 2007 at 04:35:43AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote: > > > > And it will not solve the deadlock problem in general. (Maybe

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips
On Tuesday 14 August 2007 05:46, Evgeniy Polyakov wrote: > > The throttling of the virtual device must begin in > > generic_make_request and last to ->endio. You release the throttle > > of the virtual device at the point you remap the bio to an > > underlying device, which you have convinced your

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-27 Thread Daniel Phillips
Say Evgeniy, something I was curious about but forgot to ask you earlier... On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote: > ...All operations are not atomic, since we do not care about precise > number of bios, but a fact, that we are close or close enough to the > limit. > ... in bi

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-28 Thread Daniel Phillips
On Tuesday 28 August 2007 02:35, Evgeniy Polyakov wrote: > On Mon, Aug 27, 2007 at 02:57:37PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > Say Evgeniy, something I was curious about but forgot to ask you > > earlier... > > > > On Wednesday 08 August 2007 03

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-28 Thread Daniel Phillips
On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote: > On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips ([EMAIL PROTECTED]) > wrote: > > > We do not care about one cpu being able to increase its counter > > > higher than the limit, such inaccuracy (maximum b

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-30 Thread Daniel Phillips
On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote: > Then, if of course you will want, which I doubt, you can reread > previous mails and find that it was pointed to that race and > possibilities to solve it way too long ago. What still bothers me about your response is that, while you kno

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-09-01 Thread Daniel Phillips
On Friday 31 August 2007 14:41, Alasdair G Kergon wrote: > On Thu, Aug 30, 2007 at 04:20:35PM -0700, Daniel Phillips wrote: > > Resubmitting a bio or submitting a dependent bio from > > inside a block driver does not need to be throttled because all > > resources required to

Re: [PATCH 1/3] VFS: apply coding standards to fs/ioctl.c

2007-10-28 Thread Daniel Phillips
On 10/28/07, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > While you're at it, it's probably worth splitting this out into > a small helper function. Why? Is the same pattern called from more than one place? Regards, Daniel

Re: [PATCH][RFC] fast file mapping for loop

2008-01-11 Thread Daniel Phillips
Hi Jens, This looks really useful. On Wednesday 09 January 2008 00:52, Jens Axboe wrote: > Disadvantages: > > - The file block mappings must not change while loop is using the > file. This means that we have to ensure exclusive access to the file > and this is the bit that is currently missing in

Re: [RFD] Incremental fsck

2008-01-12 Thread Daniel Phillips
On Wednesday 09 January 2008 01:16, Andreas Dilger wrote: > While an _incremental_ fsck isn't so easy for existing filesystem > types, what is pretty easy to automate is making a read-only snapshot > of a filesystem via LVM/DM and then running e2fsck against that. The > kernel and filesystem have

Re: [RFD] Incremental fsck

2008-01-13 Thread Daniel Phillips
Hi Ted, On Saturday 12 January 2008 06:51, Theodore Tso wrote: > What is very hard to check is whether or not the link count on the > inode is correct. Suppose the link count is 1, but there are > actually two directory entries pointing at it. Now when someone > unlinks the file through one of t

Re: [Patch] document ext3 requirements (was Re: [RFD] Incremental fsck)

2008-01-15 Thread Daniel Phillips
On Jan 15, 2008 6:07 PM, Pavel Machek <[EMAIL PROTECTED]> wrote: > I had write cache enabled on my main computer. Oops. I guess that > means we do need better documentation. Writeback cache on disk in itself is not bad, it only gets bad if the disk is not engineered to save all its dirty cache on

Re: [Patch] document ext3 requirements (was Re: [RFD] Incremental fsck)

2008-01-15 Thread Daniel Phillips
On Jan 15, 2008 7:15 PM, Alan Cox <[EMAIL PROTECTED]> wrote: > > Writeback cache on disk in itself is not bad, it only gets bad if the > > disk is not engineered to save all its dirty cache on power loss, > > using the disk motor as a generator or alternatively a small battery. > > It would be awf

Re: [Patch] document ext3 requirements (was Re: [RFD] Incremental fsck)

2008-01-15 Thread Daniel Phillips
Hi Pavel, Along with this effort, could you let me know if the world actually cares about online fsck? Now we know how to do it I think, but is it worth the effort? Regards, Daniel

Re: [Btrfs-devel] [ANNOUNCE] Btrfs v0.10 (online growing/shrinking, ext3 conversion, and more)

2008-01-17 Thread Daniel Phillips
On Jan 17, 2008 1:25 PM, Chris mason <[EMAIL PROTECTED]> wrote: > So, I've put v0.11 out there. It fixes those two problems and will also > compile on older (2.6.18) enterprise kernels. > > v0.11 does not have any disk format changes. Hi Chris, First, massive congratulations for bringing this to

Re: [Patch] document ext3 requirements (was Re: [RFD] Incremental fsck)

2008-01-17 Thread Daniel Phillips
On Jan 17, 2008 7:29 AM, Szabolcs Szakacsits <[EMAIL PROTECTED]> wrote: > Similarly to ZFS, Windows Server 2008 also has self-healing NTFS: I guess that is enough votes to justify going ahead and trying an implementation of the reverse mapping ideas I posted. But of course more votes for this is

Re: [patchlet] Removing unneeded line in vmtruncate() (2.4.0-t8p1)

2000-09-05 Thread Daniel Phillips
Alexander Viro wrote: > > On Fri, 1 Sep 2000, Tigran Aivazian wrote: > > > Rasmus, you introduced a bug because you removed the code but left the > > comment around. now /* this should go into ->truncate */ is there and very > > confusing - what should go into ->truncate? > > ... except that co

block_write_full_page doesn't use filesystem's override methods

2000-09-09 Thread Daniel Phillips
Here's what's on my mind: block_write_full_page does not use the filesystem's prepare_write and commit_write methods, instead it calls its own internal functions. Presumably this is done for efficiency, but it limits flexibility. Reading the code, I can't see any reason right off why the file

Stupid Question

2000-09-10 Thread Daniel Phillips
Can I ask a stupid question: why do we pass a pointer to the file position along with the file when calling read/write methods as in: read(file, buf, count, &file->f_pos) Is there ever a case when we don't want to use the f_pos from the file struct? -- Daniel

Re: Stupid Question

2000-09-10 Thread Daniel Phillips
Alexander Viro wrote: > > On Sun, 10 Sep 2000, Daniel Phillips wrote: > > > Can I ask a stupid question: why do we pass a pointer to the file > > position along with the file when calling read/write methods as in: > > > > read(file, buf, count, &file->

Re: Stupid Question

2000-09-11 Thread Daniel Phillips
Ion Badulescu wrote: > > On Mon, 11 Sep 2000, Andi Kleen wrote: > > > > Passing an update_fpos boolean would be so much cleaner. > > > > An update_fpos boolean only would require that pread/pwrite create their own > > file structure to pass in the user offset. > > Ok, true, I didn't think of th

Tailmerging - dragons sighted

2000-09-11 Thread Daniel Phillips
I found myself in a cave where dragons were sleeping. I crept forward trying to be as quiet as possible, but before I had crossed to the other side one of them rolled over and belched. It gazed at me; I gazed back. As smoke began to drift lazily from its nostrils I raised my sword... Back in

Re: block_write_full_page doesn't use filesystem's override methods

2000-09-11 Thread Daniel Phillips
Chris Mason wrote: > > --On 09/10/00 00:38:31 +0200 Daniel Phillips > <[EMAIL PROTECTED]> wrote: > > > Here's what's on my mind: block_write_full_page does not use the > > filesystem's prepare_write and commit_write methods, instead it calls > >

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Chris Mason wrote: > Daniel Phillips wrote: > > > > Simply stated, the new cache design divides filesystem blocks into two > > classes: those that can be memory-mapped and those that can't. There > > is no defined way to move a given block from one class to the o

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Chris Mason wrote: > > You still run into races between truncate and writepage, I'm fixing this in > the next reiserfs release by keeping the last page locked for the duration > of the truncate. > > You also have to be very careful during the unmerge not to allow up to date > information in the

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Alexander Viro wrote: > > On Tue, 12 Sep 2000, Daniel Phillips wrote: > > > This is getblk, except for pages. It finds or sets up a page in a > > mapping. It puts buffers on the page if necessary but doesn't cause > > any I/O action. > > ... and

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Chris Mason wrote: > > Now I can unmerge this way: > > > > - Fix up various inode fields > > - getpage the tail page from the mapping > > - bread the shared tail block > > - get the appropriate page buffer using page_buffer > > - copy the tail fragment to that buffer and dirty it > > >

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Alexander Viro wrote: > Do it in ->prepare_write() if page is the last one. End of story. You'll > have to call the thing in your truncate() (expanding case) and that's it. > Pageout _never_ changes the file size, write and truncate own the ->i_sem. > So "page covers the end of file" is not going

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Alexander Viro wrote: > > On Tue, 12 Sep 2000, Daniel Phillips wrote: > > Wait a minute. block_*_page is pure library stuff. VFS doesn't use it - it > just happens to be common for many filesystems, thus it had been placed > into fs/buffer.c. > > There is absolute

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Alexander Viro wrote: > > On Tue, 12 Sep 2000, Daniel Phillips wrote: > > > Alexander Viro wrote: > > > > > > On Tue, 12 Sep 2000, Daniel Phillips wrote: > > > > > > > This is getblk, except for pages. It finds or sets up a page in a >

Re: Tailmerging - dragons sighted

2000-09-12 Thread Daniel Phillips
Alexander Viro wrote: > > On Tue, 12 Sep 2000, Daniel Phillips wrote: > > > > There is a very heavy investment in generic_read/write/mmap - I don't > > want to throw that away. This is a mod to Ext2, and Ext2 uses these > > Oh, but these function

Re: Tailmerging - dragons sighted

2000-09-13 Thread Daniel Phillips
Alexander Viro wrote: > > On Wed, 13 Sep 2000, Daniel Phillips wrote: > > > Alexander Viro wrote: > > > > > > On Tue, 12 Sep 2000, Daniel Phillips wrote: > > > > > > > > There is a very heavy investment in generic_read/write/mmap - I don't

Re: Tailmerging - dragons sighted

2000-09-13 Thread Daniel Phillips
"Juan J. Quintela" wrote: > > >>>>> "daniel" == Daniel Phillips writes: > daniel> A comment on create_empty_buffers: it takes inode as a parameter and > daniel> only uses that to set the b_dev. Shouldn't it just take dev as a > dan

Re: Tailmerging - dragons sighted

2000-09-13 Thread Daniel Phillips
"Juan J. Quintela" wrote: > > if everybody agrees, here is the patch against test8 using the second > alternative. How about letting the world see it: -static void create_empty_buffers(struct page *page, struct inode *inode, unsigned long blocksize) +void create_empty_buffers(struct page *page,

Re: Tailmerging - dragons sighted

2000-09-13 Thread Daniel Phillips
Alexander Viro wrote: > > On Wed, 13 Sep 2000, Daniel Phillips wrote: > > > "Juan J. Quintela" wrote: > > > > > > if everybody agrees, here is the patch against test8 using the second > > > alternative. > > > > How about letting th

Canonic buffer states and transitions

2000-09-14 Thread Daniel Phillips
I'm going to admit that I've always been confused about the deep meaning of some of the buffer state bits. There are four buffer state bits that describe the relationship between the data cached in a buffer and data stored on disk: BH_Uptodate BH_Dirty BH_Mapped BH_New Does anybody think tha

Tailmerging - status

2000-09-15 Thread Daniel Phillips
OK, most dragons seem to be dead or at least severely wounded. I hope I will post a patch against test8 today or at the latest tomorrow. For anyone who is interested, there is enough here to verify whether this tailmerging strategy is effective at saving Ext2 filesystem space or not (seems prett

Re: Tailmerging - dragons sighted

2000-09-15 Thread Daniel Phillips
Chris Mason wrote: > If you unmerge inside ext2_get_block... I unmerge at a higher level - file_read, notify_change, delete_inode. Inside my hacked ext2_get_block and ext2_commit_write I just handle the tail data copy. I don't know for sure *why* I do it this way, it just seems right and allow

Re: Tailmerging - dragons sighted

2000-09-16 Thread Daniel Phillips
Alexander Viro wrote: > frankly, I see no point in "put buffers on page if it's a block-based fs > that uses buffer_heads, but don't map them" as a method. I finally understand where this comment came from. The current arrangement doesn't handle unmapped buffers on a page very well. As soon as

Tailmerging - version 0.01 for -test8

2000-09-17 Thread Daniel Phillips
This patch is against 2.4.0-test8, and now supports mmap file ops - I think - I haven't tried it yet. It is still *not* thread safe. It's available at: innominate.org/~phillips/tailmerge.0.01.zip I didn't find any bugs in version 0.0 today, but that's probably because I didn't try hard enoug

To page-cache or not to page-cache?

2000-09-25 Thread Daniel Phillips
It's time to port my Tux2 prototype to 2.4, and this has given me the opportunity to get even more tangled up in the two-cache problem than ever before. Right now, Ext2 uses the page cache for data and the buffer cache for metadata. This platypus gets through life pretty well because in Ext2 dat

Meaning of blk_size

2000-10-01 Thread Daniel Phillips
After staring at the block device code for, um, quite a long time, I came to the conclusion that blk_size stores one less than the number of 512 byte blocks on a device. Is this true? -- Daniel

Re: Meaning of blk_size

2000-10-01 Thread Daniel Phillips
Daniel Phillips wrote: > > After staring at the block device code for, um, quite a long time, I > came to the conclusion that blk_size stores one less than the number of > 512 byte blocks on a device. Is this true? Um, slight revision: they wouldn't be blocks, they'd be

Re: Meaning of blk_size

2000-10-01 Thread Daniel Phillips
Andries Brouwer wrote: > > On Sun, Oct 01, 2000 at 11:16:23PM +0200, Daniel Phillips wrote: > > Daniel Phillips wrote: > > > > After staring at the block device code for, um, quite a long time, I > > > came to the conclusion that blk_size stores one less than th

Re: [RFC] api for consistent lvm snapshots

2000-10-03 Thread Daniel Phillips
Chris Mason wrote: > Heinz Mauelshagen and I have come up with an API for LVM to use > for creating consistent snapshots. The idea is to block FS modifications > while the snapshot is being created, and to give the FS the chance to flush > everything (including all pending transactions) to disk b

Re: [RFC] api for consistent lvm snapshots

2000-10-03 Thread Daniel Phillips
Chris Mason wrote: > > --On 10/03/00 21:13:04 +0200 Daniel Phillips > <[EMAIL PROTECTED]> wrote: > > > Chris Mason wrote: > >> Heinz Mauelshagen and I have come up with an API for LVM to use > >> for creating consistent snapshots. The idea is to bloc

Re: [RFC] api for consistent lvm snapshots

2000-10-04 Thread Daniel Phillips
Chris Mason wrote: > > --On 10/04/00 02:23:30 +0200 Daniel Phillips <[EMAIL PROTECTED]> wrote: > > > Chris Mason wrote: > >> > >> --On 10/03/00 21:13:04 +0200 Daniel Phillips > >> <[EMAIL PROTECTED]> wrote: > >> > >> >

Re: [RFC] api for consistent lvm snapshots

2000-10-04 Thread Daniel Phillips
Ken Hirsch wrote: > > Daniel Phillips wrote: > > > Um, this isn't the only place where support for dumb filesystems messes > > things up for smart filesystems, it's just the most painful. Dumb > > filesystems should have the right to remain dumb, but sma

Re: [RFC] api for consistent lvm snapshots

2000-10-05 Thread Daniel Phillips
Chris Mason wrote: > > For the most part, reiserfs can play nice with bdflush. I give it blocks > when I've decided they are ready to get to disk, and I keep blocks away > from it when they aren't allowed to be written. But why not give them straight to ll_rw_block? Maybe the real question is,

Re: [RFC] api for consistent lvm snapshots

2000-10-05 Thread Daniel Phillips
Chris Mason wrote: > --On 10/05/00 13:49:31 +0200 Daniel Phillips wrote: > > Chris Mason wrote: > >> > >> For the most part, reiserfs can play nice with bdflush. I give it blocks > >> when I've decided they are ready to get to disk, and I keep blocks away

Re: [RFC] api for consistent lvm snapshots

2000-10-06 Thread Daniel Phillips
Chris Mason wrote: > > So I have the priority ordering: > > > > blocks(i) -> root(i) -> blocks(i+1) -> root(i+1) -> etc > > > > And it would be possible to compress that slightly to: > > > > root(i-1) + blocks(i) -> root(i) + blocks(i+1) -> etc > > Then the io borders would benefit you as well

Re: [RFC] api for consistent lvm snapshots

2000-10-06 Thread Daniel Phillips
Chris Mason wrote: > > Ah, I see. I was assuming I'd sleep until the I/O completion, wake up > > and instantly handle the metaroot I/O start, sleep again and wake up to > > do the next phase transition. Actually, I don't see why that wouldn't > > work. > > It will work, just not as fast as the

Old threads on Extended Attributes?

2000-10-25 Thread Daniel Phillips
I'm aware that this is far from the first time we've seen a flurry of activity on the extended attributes front - but I missed all those other threads, here and on linux-kernel. Can somebody point me at some of the highlights? Subject lines? I'd like to have at least half a clue before I jump i

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-26 Thread Daniel Phillips
Curtis Anderson wrote: [...] > I could copy all of the kernel code and machinery that supports directories > into my attribute support code files, or I could go the other direction. > Since duplicate code is generally a bad idea, starting with directories is > better. What I now need is a flag in

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-26 Thread Daniel Phillips
Curtis Anderson wrote: > For example, "mv" will devolve to "cp" when the source and destination > filesystems are different. So "mv" will preserve attributes for some > operations, and will drop them on the floor for others. Unexpected by the > average user, and therefore bad. But why should cp

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-26 Thread Daniel Phillips
Curtis Anderson wrote: > > So it's not clear how you reached to the conclusion that directories > > shouldn't be pressed into service as compound files. I'm sure you have > > a reason, but it wasn't revealed here! > > It's an aesthetics argument. There are no new features in a directory-based >

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-27 Thread Daniel Phillips
Curtis Anderson wrote: [...] > I could copy all of the kernel code and machinery that supports directories > into my attribute support code files, or I could go the other direction. > Since duplicate code is generally a bad idea, starting with directories is > better. What I now need is a flag in

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-27 Thread Daniel Phillips
Anton Altaparmakov wrote: > I'm not Curtis and don't know about XFS but I will try to answer your > question anyway. (-; > If XFS doesn't order the EAs that's fine. - However for FSs which do order > them the API has to provide the interface. Does it? I think you may be jumping to your conclusio

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-27 Thread Daniel Phillips
Curtis Anderson wrote: > It all depends on how optional things are, and what differences an unmodified > app sees. IMHO, "none" is the right answer in this case. Part of my believing > that directory-hack stream-style attributes are not good is that I don't know > how to do them without making v

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-27 Thread Daniel Phillips
Anton Altaparmakov wrote: > > At 11:00 28/10/2000, Hans Reiser wrote: > >Curtis Anderson wrote: > > > > > The problem with streams-style attributes comes from stepping onto the > > > slippery slope of trying to put too much generality into it. I chose the > > > block-access style of API so that

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-28 Thread Daniel Phillips
"Stephen C. Tweedie" wrote: > atomicity is not an option you can enable or disable arbitrarily Hmm, I was planning to do exactly that in Tux2 because a) it's easy and b) it's not possible to foresee all the places where applications want things to be atomic. A full-data JFS should also be able to

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-28 Thread Daniel Phillips
"Stephen C. Tweedie" wrote: > On Fri, Oct 27, 2000 at 10:46:26AM +0200, Andreas Gruenbacher wrote: > > Imagine if the kernel did store > > "[EMAIL PROTECTED]" on ACLs on the filesystem. When an access control > > decision needs to be done, the kernel simply has no idea about what > > "[EMAIL PROTE

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-10-29 Thread Daniel Phillips
"Stephen C. Tweedie" wrote: > Read my original proposal. It allowed you to use uids, but if you > happened to be passing utf/8, it gave you the ability to say so. But it would be a win if the strings could go away, getting it down to just one kind of object, right? > On Sat, Oct 28, 2000 at 09:

Re: nesting transactions

2000-10-31 Thread Daniel Phillips
"Stephen C. Tweedie" wrote: > On Mon, Oct 30, 2000 at 09:34:27AM -0500, Chris Mason wrote: > > > Yes, reiserfs could do it the same way, if we start putting a transaction > > handle pointer in the task struct. > > > > But, I thought you required some way of knowing when the transaction was > > br

Re: Buffer & pache cache synchronization on 2.2x kernel series

2000-11-01 Thread Daniel Phillips
Martin Frey wrote: > What I don't understand is what happens if somebody > writes to a mmaped area. The page will certainly be > modified. Reads will see the modification since the > page is in page cache. What about a write that > modifies a part of the same page? The contents in > the page cache

Transaction API?

2000-11-02 Thread Daniel Phillips
There was talk of exposing some kind of transaction API now that we have a crop of fs's on the way that can do transactions. Has this gone anywhere? -- Daniel

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-11-02 Thread Daniel Phillips
"Stephen C. Tweedie" wrote: > > On Sun, Oct 29, 2000 at 02:45:32PM +0100, Andreas Gruenbacher wrote: > > If so, then the client's kernel could map these id's to the proper names > > when needed, right? > > Yes. Exactly. You're getting the point --- it is possible for a > client to manage the m

Re: [PROPOSAL] Extended attributes for Posix security extensions

2000-11-02 Thread Daniel Phillips
"Stephen C. Tweedie" wrote: I have to admit that when I first looked at your proposal I thought it was overkill, but having followed the thread all the way through I now have some appreciation of the weird wacky world out there that you are dealing with. I think your approach is correct. Here a

Re: Ext2 btree directories

2000-11-03 Thread Daniel Phillips
Andreas Gruenbacher wrote: > have there been any advances with ext2 btree directories so far? Ted Ts'o talked about his design ideas with me and Ben Lahaise last month in Miami. The basic ideas are: - Directory file to stay as a regular file but with btree index blocks added - Roo
