Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-02-05 Thread Al Boldi
Jan Kara wrote: On Tue 05-02-08 10:07:44, Al Boldi wrote: Jan Kara wrote: On Sat 02-02-08 00:26:00, Al Boldi wrote: Chris Mason wrote: Al, could you please compare the write throughput from vmstat for the data=ordered vs data=writeback runs? I would guess the data=ordered
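The vmstat comparison Chris Mason asks for above can be approximated by sampling the kernel's page-out counter directly. A minimal sketch, assuming a standard Linux procfs; the one-second window is arbitrary, and you would run it once under data=ordered and once under data=writeback:

```shell
# Sample pgpgout (KiB paged out to block devices) from /proc/vmstat
# over one second; the delta approximates write throughput.
a=$(awk '/^pgpgout/ {print $2}' /proc/vmstat)
sleep 1
b=$(awk '/^pgpgout/ {print $2}' /proc/vmstat)
echo "write-out: $((b - a)) KiB/s"
```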

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-02-04 Thread Al Boldi
Jan Kara wrote: On Sat 02-02-08 00:26:00, Al Boldi wrote: Chris Mason wrote: Al, could you please compare the write throughput from vmstat for the data=ordered vs data=writeback runs? I would guess the data=ordered one has a lower overall write throughput. That's what I would have

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-02-01 Thread Al Boldi
Chris Mason wrote: On Thursday 31 January 2008, Jan Kara wrote: On Thu 31-01-08 11:56:01, Chris Mason wrote: On Thursday 31 January 2008, Al Boldi wrote: The big difference between ordered and writeback is that once the slowdown starts, ordered goes into ~100% iowait, whereas

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-28 Thread Al Boldi
Jan Kara wrote: On Sat 26-01-08 08:27:59, Al Boldi wrote: Do you mean there is a locking problem? No, but if you write to an mmaped file, then we can find out only later that we have dirty data in pages and we call writepage() on behalf of e.g. pdflush(). Ok, that's a special case, which we

Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)

2008-01-26 Thread Al Boldi
KOSAKI Motohiro wrote: And from a performance point of view letting applications voluntarily free some memory is better even than starting to swap. Absolutely. the mem_notify patch can deliver a notification just before swapping starts :) to be honest, I don't know fs guys

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-25 Thread Al Boldi
Chris Snook wrote: Al Boldi wrote: Greetings! data=ordered mode has proven reliable over the years, and it does this by ordering filedata flushes before metadata flushes. But this sometimes causes contention in the order of a 10x slowdown for certain apps, either due to the misuse

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-25 Thread Al Boldi
Diego Calleja wrote: On Thu, 24 Jan 2008 23:36:00 +0300, Al Boldi [EMAIL PROTECTED] wrote: Greetings! data=ordered mode has proven reliable over the years, and it does this by ordering filedata flushes before metadata flushes. But this sometimes causes contention in the order

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-25 Thread Al Boldi
[EMAIL PROTECTED] wrote: On Thu, 24 Jan 2008 23:36:00 +0300, Al Boldi said: This RFC proposes to introduce a tunable which allows disabling fsync and changes ordered into writeback writeout on a per-process basis like this: : : But if you want to give them enough rope to shoot themselves

[RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-24 Thread Al Boldi
Greetings! data=ordered mode has proven reliable over the years, and it does this by ordering filedata flushes before metadata flushes. But this sometimes causes contention in the order of a 10x slowdown for certain apps, either due to the misuse of fsync or due to inherent behaviour like

Re: konqueror deadlocks on 2.6.22

2008-01-22 Thread Al Boldi
Ingo Molnar wrote: * Oliver Pinter (Pintér Olivér) [EMAIL PROTECTED] wrote: and then please update to CFS-v24.1 http://people.redhat.com/~mingo/cfs-scheduler/sched-cfs-v2.6.22.15-v24.1.patch Yes with CFSv20.4, as in the log. It also hangs on 2.6.23.13 my feeling is that this is

Re: konqueror deadlocks on 2.6.22

2008-01-22 Thread Al Boldi
Chris Mason wrote: Running fsync in data=ordered means that all of the dirty blocks on the FS will get written before fsync returns. Hm, that's strange, I expected this kind of behaviour from data=journal. data=writeback should return immediately, which seems it does, but data=ordered should
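The stall Chris describes is easy to observe by timing a flushed write while other processes are dirtying pages. A rough sketch; the 8 MiB size and /tmp path are arbitrary choices, not from the thread:

```shell
# Time an 8 MiB write that dd forces to disk (conv=fsync) before exiting.
# Under data=ordered, unrelated dirty data in the same transaction can
# inflate this number; under data=writeback it should stay near raw speed.
start=$(date +%s%N)
dd if=/dev/zero of=/tmp/fsync_probe bs=1M count=8 conv=fsync 2>/dev/null
end=$(date +%s%N)
echo "flushed write took $(( (end - start) / 1000000 )) ms"
rm -f /tmp/fsync_probe
```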

konqueror deadlocks on 2.6.22

2008-01-19 Thread Al Boldi
I was just attacked by some deadlock issue involving sqlite3 and konqueror. While sqlite3 continues to slowly fill a 7M-record db in transaction mode, konqueror hangs for a few minutes, then continues only to hang again and again. Looks like an fs/blockIO issue involving fsync. As a workaround,

Re: konqueror deadlocks on 2.6.22

2008-01-19 Thread Al Boldi
Oliver Pinter (Pintér Olivér) wrote: This kernel is vanilla 2.6.22.y or with CFS? Yes with CFSv20.4, as in the log. It also hangs on 2.6.23.13 On 1/19/08, Al Boldi [EMAIL PROTECTED] wrote: I was just attacked by some deadlock issue involving sqlite3 and konqueror. While sqlite3 continues

Re: [RFD] Incremental fsck

2008-01-13 Thread Al Boldi
Theodore Tso wrote: On Wed, Jan 09, 2008 at 02:52:14PM +0300, Al Boldi wrote: Ok, but let's look at this a bit more opportunistic / optimistic. Even after a black-out shutdown, the corruption is pretty minimal, using ext3fs at least. After an unclean shutdown, assuming you have decent

Re: [RFD] Incremental fsck

2008-01-12 Thread Al Boldi
Bodo Eggert wrote: Al Boldi [EMAIL PROTECTED] wrote: Even after a black-out shutdown, the corruption is pretty minimal, using ext3fs at least. So let's take advantage of this fact and do an optimistic fsck, to assure integrity per-dir, and assume no external corruption. Then we release

Re: [RFD] Incremental fsck

2008-01-10 Thread Al Boldi
Rik van Riel wrote: Al Boldi [EMAIL PROTECTED] wrote: Ok, but let's look at this a bit more opportunistic / optimistic. You can't play fast and loose with data integrity. Correct, but you have to be realistic... Besides, if we looked at things optimistically, we would conclude

Re: [RFD] Incremental fsck

2008-01-09 Thread Al Boldi
Valerie Henson wrote: On Jan 8, 2008 8:40 PM, Al Boldi [EMAIL PROTECTED] wrote: Rik van Riel wrote: Al Boldi [EMAIL PROTECTED] wrote: Has there been some thought about an incremental fsck? You know, somehow fencing a sub-dir to do an online fsck? Search for chunkfs Sure

Re: Massive slowdown when re-querying large nfs dir

2007-11-07 Thread Al Boldi
Andrew Morton wrote: I would suggest getting a 'tcpdump -s0' trace and seeing (with wireshark) what is different between the various cases. Thanks Neil for looking into this. Your suggestion has already been answered in a previous post, where the difference has been attributed to ls

Re: Massive slowdown when re-querying large nfs dir

2007-11-06 Thread Al Boldi
Al Boldi wrote: There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+ entries) using a simple ls -l. On 2.6.23 client and server running userland rpc.nfs.V2: first try: time -p ls -l 2k+ entry dir in ~2.5sec more tries: time -p ls -l 2k+ entry dir in ~8sec first try

Massive slowdown when re-querying large nfs dir

2007-11-04 Thread Al Boldi
There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+ entries) using a simple ls -l. On 2.6.23 client and server running userland rpc.nfs.V2: first try: time -p ls -l 2k+ entry dir in ~2.5sec more tries: time -p ls -l 2k+ entry dir in ~8sec first try: time -p ls -l 5k+
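The measurement above can be reproduced against any directory; absolute numbers will differ locally, and the point over NFS is that each stat() issued by ls -l becomes a GETATTR round trip:

```shell
# Build a 2000-entry directory and time the 'ls -l' that stats every entry.
dir=/tmp/lsbench
mkdir -p "$dir"
for i in $(seq 1 2000); do : > "$dir/f$i"; done
start=$(date +%s%N)
ls -l "$dir" > /dev/null
end=$(date +%s%N)
echo "ls -l over $(ls "$dir" | wc -l) entries: $(( (end - start) / 1000000 )) ms"
```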

Re: Massive slowdown when re-querying large nfs dir

2007-11-04 Thread Al Boldi
Matthew Wilcox wrote: How about tcpdumping and seeing what requests are flowing across the wire? You might be able to figure out what's being done differently. I think lookup is faster than getattr. Thanks! -- Al

Re: Distributed storage. Move away from char device ioctls.

2007-09-14 Thread Al Boldi
Jeff Garzik wrote: Evgeniy Polyakov wrote: Hi. I'm pleased to announce the fourth release of the distributed storage subsystem, which allows forming a storage on top of remote and local nodes, which in turn can be exported to another storage as a node to form tree-like storages. This

Re: [RFC] Union Mount: Readdir approaches

2007-09-12 Thread Al Boldi
[EMAIL PROTECTED] wrote: But if you really want to read or try it, you can get all source files from sourceforge. Read http://aufs.sf.net and try, $ cvs -d:pserver:[EMAIL PROTECTED]:/cvsroot/aufs login (CVS password is empty) $ cvs -z3 -d:pserver:[EMAIL PROTECTED]:/cvsroot/aufs co aufs This

Re: [RFC] Union Mount: Readdir approaches

2007-09-07 Thread Al Boldi
[EMAIL PROTECTED] wrote: If you are interested in this approach, please refer to http://aufs.sf.net. It is working and used by several people. Any chance you can post a patch against 2.6.22? Thanks! -- Al

Re: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

2007-09-03 Thread Al Boldi
Erez Zadok wrote: Al, we have back-ports of the latest Unionfs to 2.6.{22,21,20,19,18,9}, all in http://unionfs.filesystems.org/. Before we release any change, we test it on all back-ports as well as the latest -rc/-mm code base (takes over 24 hours straight to get through all of our

Re: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

2007-09-02 Thread Al Boldi
Josef 'Jeff' Sipek wrote: The following is a series of patches related to Unionfs, which include three small VFS/fsstack patches and one eCryptfs patch; the rest are Unionfs patches. The patches here represent several months of work and testing under various conditions, especially low-memory,

[RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])

2007-08-12 Thread Al Boldi
Lars Ellenberg wrote: meanwhile, please, anyone interested, the drbd paper for LinuxConf Eu 2007 is finalized. http://www.drbd.org/fileadmin/drbd/publications/drbd8.linux-conf.eu.2007.pdf it does not give too much implementation detail (would be inappropriate for conference proceedings,

[RFC] VFS: mnotify (was: [PATCH 00/23] per device dirty throttling -v8)

2007-08-05 Thread Al Boldi
Jakob Oestergaard wrote: Why on earth would you cripple the kernel defaults for ext3 (which is a fine FS for boot/root filesystems), when the *fundamental* problem you really want to solve lie much deeper in the implementation of the filesystem? Noatime doesn't solve the problem, it just

[PATCH] NFS Kconfig: Make ROOT_NFS visibly part of NFS_FS group

2007-07-31 Thread Al Boldi
Relocate ROOT_NFS from below NFSD to be visibly part of NFS_FS group. This makes the ROOT_NFS Kconfig option logically coherent. Signed-off-by: Al Boldi [EMAIL PROTECTED] Cc: Trond Myklebust [EMAIL PROTECTED] --- --- a/fs/Kconfig 2007-07-09 06:38:41.0 +0300 +++ b/fs/Kconfig
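For context, a sketch of the option being moved, as it reads in fs/Kconfig of that era (help text omitted; this is reconstructed, not taken from the truncated diff above):

```
config ROOT_NFS
	bool "Root file system on NFS"
	depends on NFS_FS && IP_PNP
```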

Re: [RFC 00/26] VFS based Union Mount (V2)

2007-07-30 Thread Al Boldi
Jan Blunck wrote: Here is another post of the VFS based union mount implementation. Unlike the traditional mount which hides the contents of the mount point, union mounts present the merged view of the mount point and the mounted filesystem. Great! Recent changes: - brand new union

Re: bonnie++ benchmarks for ext2,ext3,ext4,jfs,reiserfs,xfs,zfs on software raid 5

2007-07-30 Thread Al Boldi
Justin Piszcz wrote: CONFIG: Software RAID 5 (400GB x 6): Default mkfs parameters for all filesystems. Kernel was 2.6.21 or 2.6.22, did these awhile ago. Hardware was SATA with PCI-e only, nothing on the PCI bus. ZFS was userspace+fuse of course. Wow! Userspace and still that efficient.

Re: [RFH] Partition table recovery

2007-07-22 Thread Al Boldi
Theodore Tso wrote: On Sun, Jul 22, 2007 at 07:10:31AM +0300, Al Boldi wrote: Sounds great, but it may be advisable to hook this into the partition modification routines instead of mkfs/fsck. Which would mean that the partition manager could ask the kernel to instruct its fs subsystem

Re: [RFH] Partition table recovery

2007-07-21 Thread Al Boldi
Theodore Tso wrote: On Sat, Jul 21, 2007 at 07:54:14PM +0200, Rene Herman wrote: sfdisk -d already works most of the time. Not as a verbatim tool (I actually semi-frequently use a sfdisk -d /dev/hda | sfdisk invocation as a way to _rewrite_ the CHS fields to other values after changing

Re: Hardlink Pitfalls (was: Patches for REALLY TINY 386 kernels)

2007-07-16 Thread Al Boldi
Jörn Engel wrote: On Mon, 16 July 2007 22:14:41 +0530, Satyam Sharma wrote: On 7/16/07, Al Boldi [EMAIL PROTECTED] wrote: Satyam Sharma wrote: Or just cp -al to create multiple trees at (almost) no disk cost that won't interfere with each other in any way, and makes the development

Re: [Advocacy] Re: 3ware 9650 tips

2007-07-16 Thread Al Boldi
Bryan J. Smith wrote: Off-topic, advocacy-level response ... On Mon, 2007-07-16 at 11:43 -0400, Joshua Baker-LePain wrote: I do so wish that RedHat shared this view... I've been trying to convince them since Red Hat Linux 7 (and, later, 9) that they need to realize the limits of Ext3 at

[RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips)

2007-07-16 Thread Al Boldi
Matthew Wilcox wrote: On Mon, Jul 16, 2007 at 08:40:00PM +0300, Al Boldi wrote: XFS surely rocks, but it's missing one critical component: data=ordered And that's one component that's just too critical to overlook for an enterprise environment that is built on data-integrity over

Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-03 Thread Al Boldi
Brad Boyer wrote: To be honest, I think it looks bad for someone associated with redhat to be suggesting that life should be made more difficult for those who write proprietary software on Linux. The support from commercial software is a major reason for the success of the RHEL product line.

Simulated Ordered Mode

2005-07-31 Thread Al Boldi
Filesystems are generally updated in a serial fashion: 1. Update MetaData 2. Update FileData 3. Sync... (optional) Is it possible to instruct the FS to delay metadata update until after a filedata sync? Like: 1. Buffer MetaData 2. Update FileData 3. Sync 4. Update MetaData 5. Sync...
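Lacking such a mode in the filesystem, userspace can impose the proposed ordering itself. A minimal sketch; the paths are illustrative, and plain `sync` flushes everything, not just the one file:

```shell
printf 'payload\n' > /tmp/ordered.tmp   # 2. update file data
sync                                    # 3. sync: data reaches disk first
mv /tmp/ordered.tmp /tmp/ordered        # 4. metadata update (rename) follows
sync                                    # 5. optional final sync
```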

RE: Simulated Ordered Mode

2005-07-31 Thread Al Boldi
Martin Jambor wrote: { On 7/31/05, Al Boldi [EMAIL PROTECTED] wrote: Is it possible to instruct the FS to delay metadata update until after a filedata sync? If you delayed any update until after a sync it wouldn't be a sync anymore, would it? } True, but what about an implied MetaData sync

Re: XFS corruption during power-blackout

2005-07-16 Thread Al Boldi
Russell Howe wrote: { XFS only journals metadata, not data. So, you are supposed to get a consistent filesystem structure, but your data consistency isn't guaranteed. } What did XFS do to detect filedata-corruption before it was added to the vanilla-kernel? Maybe it did not update the metadata

RE: XFS corruption during power-blackout

2005-07-05 Thread Al Boldi
Sonny Rao wrote: { On Wed, Jun 29, 2005 at 07:53:09AM +0300, Al Boldi wrote: What I found were 4 things in the dest dir: 1. Missing Dirs,Files. That's OK. 2. Files of size 0. That's acceptable. 3. Corrupted Files. That's unacceptable. 4. Corrupted Files with original fingerprint
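A checksum sweep is one way to sort a destination tree into the four categories above, in particular to catch files whose contents changed silently. A sketch of that comparison, with illustrative paths:

```shell
# Record content checksums of the source tree, then verify the copied
# tree against them; any FAILED line marks a silently corrupted file.
mkdir -p /tmp/src
printf 'data\n' > /tmp/src/file1
(cd /tmp/src && find . -type f -exec md5sum {} + > /tmp/src.md5)
rm -rf /tmp/dest
cp -a /tmp/src /tmp/dest
(cd /tmp/dest && md5sum -c /tmp/src.md5)
```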
