Chris Mason wrote:

>On Sat, 2002-05-04 at 10:59, Hans Reiser wrote:
>
>>So how about if you revise fsync so that it always sends data blocks to 
>>the journal not to the main disk?
>
>This gets a little sticky.
>
>Once you log a block, it might be replayed after a crash.  So, you have
>to protect against corner cases like this:
>
>write(file)
>fsync(file) ; /* logs modified data blocks */
>write(file) ; /* write the same blocks without fsync */
>sync ;        /* user expects new version of the blocks on disk */
><crash>
>
>During replay, the logged data blocks overwrite the blocks sent to disk
>via sync().
>
>This isn't hard to correct for: every time a buffer is marked dirty, you
>check the journal hash tables to see if it is replayable, and if so you
>log it instead (the 2.2.x code did this due to tails).  This translates
>to increased CPU usage for every write.
>
>I'd rather not put it back in because it adds yet another corner case to
>maintain for all time.  Most of the fsync/O_SYNC bound applications are
>just given their own partition anyway, so most users that need data
>logging need it for every write.
>
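To make the corner case concrete, here is a toy model of it in C.  It is 
only a sketch: toy_fs, toy_fsync_write and the other names are made up 
for illustration and are not taken from the reiserfs source, and a 
simple flag array stands in for the journal hash tables.  It assumes one 
logged copy per block and a replay that copies every logged block back 
to the main disk.

```c
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8
#define BLKSZ   16

/* Toy model only: these names are hypothetical, not reiserfs code. */
struct toy_fs {
    char disk[NBLOCKS][BLKSZ];     /* main disk area */
    char journal[NBLOCKS][BLKSZ];  /* logged copy of each block */
    bool logged[NBLOCKS];          /* stand-in for the journal hash tables */
    bool check_on_dirty;           /* enable the check Chris describes */
};

static void toy_init(struct toy_fs *fs, bool check_on_dirty)
{
    memset(fs, 0, sizeof(*fs));
    fs->check_on_dirty = check_on_dirty;
}

/* fsync path: the data block goes to the journal, not to the main disk. */
static void toy_fsync_write(struct toy_fs *fs, int blk, const char *data)
{
    strncpy(fs->journal[blk], data, BLKSZ - 1);
    fs->logged[blk] = true;
}

/* Plain write followed by sync. */
static void toy_write_and_sync(struct toy_fs *fs, int blk, const char *data)
{
    if (fs->check_on_dirty && fs->logged[blk]) {
        /* The block is replayable, so log the new version instead. */
        strncpy(fs->journal[blk], data, BLKSZ - 1);
        return;
    }
    strncpy(fs->disk[blk], data, BLKSZ - 1);
}

/* Crash recovery: every logged block overwrites its copy on disk. */
static void toy_replay(struct toy_fs *fs)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (fs->logged[i]) {
            memcpy(fs->disk[i], fs->journal[i], BLKSZ);
            fs->logged[i] = false;
        }
    }
}

/* Runs the sequence from the quoted mail and reports whether the
 * newest data is what ends up on disk after replay. */
static bool toy_scenario(bool check_on_dirty)
{
    struct toy_fs fs;

    toy_init(&fs, check_on_dirty);
    toy_fsync_write(&fs, 3, "old");     /* write + fsync */
    toy_write_and_sync(&fs, 3, "new");  /* write + sync  */
    toy_replay(&fs);                    /* crash, then recovery */
    return strcmp(fs.disk[3], "new") == 0;
}
```

Without the check, toy_scenario() reports that replay put the stale 
block back; with it, the newer version survives.  The price is the 
extra lookup on every write, which is the CPU cost Chris mentions.
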
Does mozilla's mail user agent use fsync?  Should I give it its own 
partition?  I bet it is fsync bound....;-)

Also, I don't think you can reasonably expect most persons to know that 
they should turn data logging on for high fsync performance, even if you 
document it.

Most people using small fsyncs are using them because whoever wrote 
their application wrote it wrong.  What's more, many of the people who 
wrote those applications cannot understand that they did it wrong even 
if you tell them (e.g. the qmail author reportedly cannot understand; 
the sendmail guys now understand, but they had Kirk McKusick on their 
staff and attending the meeting when I explained it to them, so they 
are not very typical....).

In other words, handling stupidity is an important life skill, and we 
all need to excel at it.;-)

Tell me what your thoughts are on the following:

If you ask randomly selected ReiserFS users (not the reiserfs-list, but 
the ones who would never send you an email....)  the following 
questions, what percentage will answer which choice?

The filesystem you are using is named:

a) the Performance Optimized SuSE FS

b) NTFS

c) FAT

d) ext2

e) ReiserFS

If you want to change reiserfs to use data journaling, which of the 
following must you do:

a) reinstall the reiserfs package using rpm

b) modify /etc/fs.conf

c) reinstall the operating system from scratch, and select different 
options during the install this time

d) reformat your reiserfs partition using mkreiserfs

e) none of the above

f) all of the above except e)


What do you think the chances are that you can convince Hubert that 
every SuSE Enterprise Edition user should be asked at install time if 
they are going to use fsync a lot on each partition, and to use a 
different fstab setting if yes?

I know that you are an experienced sysadmin who was good at it.  Your 
intuition tells you that most sysadmins are like the ones you were 
willing to hire into your group at the university.  They aren't.

Linux needs to be like a telephone.  You plug it in, push buttons, and 
talk.  It works well, but most folks don't know why.

A moderate number of programs are bound by small fsyncs for the simple 
reason that it is simpler to write them that way.  We need to cover 
over their simplistic designs.
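
For anyone wondering what "small fsync bound" looks like in an 
application, here is a rough sketch in C.  append_records() is a 
hypothetical helper, not code from qmail, sendmail or any other real 
program.  With fsync_each set it does what is simplest to write: one 
fsync() per record.  With it clear, it issues a single fsync() for the 
whole batch.  Batching is only valid if the application can tolerate 
losing the whole batch on a crash, which not every application can.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Appends n records to path and returns the number of fsync() calls
 * it made, or -1 on error. */
static int append_records(const char *path, const char *const *recs,
                          size_t n, int fsync_each)
{
    int syncs = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);

    if (fd < 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(recs[i]);

        if (write(fd, recs[i], len) != (ssize_t)len)
            goto fail;

        if (fsync_each) {
            /* The simple way: force every small record to disk. */
            if (fsync(fd) != 0)
                goto fail;
            syncs++;
        }
    }

    if (!fsync_each && n > 0) {
        /* The batched way: one fsync covers all records written above. */
        if (fsync(fd) != 0)
            goto fail;
        syncs++;
    }

    if (close(fd) != 0)
        return -1;
    return syncs;

fail:
    close(fd);
    return -1;
}
```

For three records the first mode makes three fsync() calls and the 
second makes one.  The first mode is the pattern that gains the most 
from sending fsync'd data blocks to the journal.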

So, you have my sympathies, Chris, because I believe you that it makes 
the code uglier and it won't be a joy to code and test.  I hope you also 
see that it should be done.

Hans
