You are using "inefficient" to mean wasted space.  Smaller block sizes tend to reduce 
throughput and use more CPU, which could also be called inefficient.

It is a trade-off.  I think the 4096-byte block size was chosen because the Linux 
kernel limits the file system block size to the page size, which is 4096 bytes on 
many systems, including x86, Power, and zSeries.
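
As a quick way to compare the two sizes on a given system, here is a minimal
Python sketch.  The helper names page_size and fs_block_size are my own, and it
assumes a Linux system where os.sysconf and os.statvfs are available.  The
f_bsize value is whatever the file system reports, which on ext2/ext3 is
normally its block size.

```python
import os


def page_size() -> int:
    """Return the kernel's memory page size in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


def fs_block_size(path: str = "/") -> int:
    """Return the block size reported by the file system holding path."""
    return os.statvfs(path).f_bsize


if __name__ == "__main__":
    # Hypothetical usage: "/" is only an example mount point.
    print("page size (bytes):    ", page_size())
    print("fs block size (bytes):", fs_block_size("/"))
```

If the limit described above holds, the second number should never be larger
than the first.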

I realize that you may want to trade off throughput for space savings.

-----Original Message-----
From: Michael MacIsaac [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 18, 2004 6:25 AM
To: [EMAIL PROTECTED]
Subject: dasdfmt with a 1K block size - still not recommded?


Hi list,

An issue came up with file systems that have a lot of small files. This
question might be helpful given the current more generic thread on file
systems.

I did a small test to create 50000 20-byte files.  An ext3 file system
with the default 4096 block size is quite inefficient, while with a 1024
block size the space used shrinks nearly in proportion to the block size.
While doing the test I also threw in a reiser and a JFS file system (4096
block size is required).  Here is the disk space usage, not including the
journal, after creating the files (I can supply details of the test if
anyone is interested):

ext3, 4096: 20080KB
ext3, 1024:  5079KB
reiser:       712KB
JFS:        22712KB
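
For anyone who wants to repeat this kind of test, here is a rough Python
sketch.  It is not the script used for the numbers above.  The function names
estimate_kb and measure_kb are made up for illustration, the estimate assumes
every file occupies at least one whole block and ignores inodes, directories,
and the journal, and the measurement assumes Linux, where st_blocks is counted
in 512-byte units.

```python
import math
import os


def estimate_kb(file_count: int, file_size: int, block_size: int) -> int:
    """Estimate data usage in KB when each file occupies whole blocks."""
    blocks_per_file = max(1, math.ceil(file_size / block_size))
    return file_count * blocks_per_file * block_size // 1024


def measure_kb(directory: str, file_count: int, file_size: int) -> int:
    """Create file_count files of file_size bytes; return allocated KB."""
    total_sectors = 0
    for i in range(file_count):
        path = os.path.join(directory, f"f{i:06d}")
        with open(path, "wb") as fh:
            fh.write(b"x" * file_size)
            fh.flush()
            # Force allocation so st_blocks reflects on-disk usage.
            os.fsync(fh.fileno())
        total_sectors += os.stat(path).st_blocks
    return total_sectors * 512 // 1024
```

With 20-byte files, estimate_kb(5000, 20, 4096) gives 20000 and
estimate_kb(5000, 20, 1024) gives 5000, which is close to the two ext3 figures.
With 50000 files the same estimate is ten times larger, so the file count and
the table may differ by a factor of ten.  File systems that pack small files
together, as reiser appears to do here, will come in well under the estimate.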

To get a 1024 block size, the "-b 1024" parameter must be passed to
dasdfmt. However, the dasdfmt man page has the warning:
  "Due to some limitations in the driver, it is strongly recommended to
use a blksize of 4096."

I remember this issue from when Linux for s390 became available in 2000.
Have these "limitations" been fixed and the man page simply hasn't been
updated, or are there still legitimate problems?  If anyone has
background on this, it would be appreciated.  Thanks.

(You might be asking, why not just use a reiser fs?  *As I understand
it*, there is a bug with reiser where under extreme load, the journal
inappropriately gets marked dirty and the file system begins thrashing.
Using the "noatime,nodiratime" mount options is a workaround to this bug,
but not a fix. I sent a query to Hans Reiser on this issue, but he said
his company can no longer offer free support :(( ).

-Mike MacIsaac, IBM  mikemac at us.ibm.com   (845) 433-7061
