>>> What might be the best way to "defrag" files?

>> Dump/reload to/from a freshly formatted filesystem
>> *on a different device*.

>>> I've noticed a /huge/ speedup when defrag'ing files using the
>>> following method:
>>>   create a new file in same directory
>>>   use ftruncate to set the new file's size the same as existing file
>>>   copy data from existing to new file
>>>   close both files and rename new file to the old file (atomic replace)

>> That is a silly, random way.

> I've had very good success at doing that on a much larger basis.
> [ ... ] in my couple of dozen experiments it never failed to
> substantially improve overall fragmentation - frequently by
> quite a bit.

Asking for advice here is a bit pointless if you already know
better :-).

> Several filesystems I tested with had over 40% fragmentation
> (as reported by e2fsck)

Weren't we talking about JFS here?

> and each one of them ended up in the single digits.

A silly random way can still appear to work in a dozen small random
experiments. You were asking not for some way to defrag something,
but for the "best way to "defrag" files".
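For reference, the recipe quoted above amounts to something like the
following at the system-call level (a sketch in C; the file name is
made up and error handling is abbreviated):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void) {
      const char *old = "file.iso", *tmp = "file.iso.tmp";
      int in = open(old, O_RDONLY);
      struct stat st;
      if (in < 0 || fstat(in, &st) < 0) { perror(old); return 1; }

      int out = open(tmp, O_WRONLY | O_CREAT | O_EXCL, st.st_mode & 0777);
      if (out < 0) { perror(tmp); return 1; }
      /* Set the final size up front, as in the quoted recipe
         (note: on JFS this does not actually allocate blocks). */
      if (ftruncate(out, st.st_size) < 0) { perror("ftruncate"); return 1; }

      char buf[1 << 20];                 /* copy in largish pieces */
      ssize_t n;
      while ((n = read(in, buf, sizeof buf)) > 0)
          if (write(out, buf, n) != n) { perror("write"); return 1; }

      if (fsync(out) < 0 || close(out) < 0 || close(in) < 0
          || rename(tmp, old) < 0) {     /* atomic replace */
          perror("replace"); return 1;
      }
      return 0;
  }

Note that on JFS the ftruncate() step reserves nothing (as shown
further down), so it is the copy itself, not the truncate, that
determines the new layout.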
>>> I recently did this to a .iso image for openSUSE 11.2:
>>>   openSUSE-11.2-DVD-x86_64.iso: 20517 extents found
>>> turned into:
>>>   openSUSE-11.2-DVD-x86_64.iso: 23 extents found

>> That ISO file is heavily discontiguous most likely because it
>> was written incrementally in many tiny pieces (typically a
>> download) and the algorithms in the JFS allocator that try to
>> pick contiguous allocation areas don't handle that well

> Yes, I know why it was discontiguous.

Then you could have mentioned that. The explanation was meant to help
understand what is going on and how to do things better. But if you
already suspected that it was "heavily discontiguous most likely
because it was written incrementally in many tiny pieces", why
suggest below that "write a \0 every 4K" is a good idea?

>> (they handle fairly well continuous write of largish bits).

> What I'm asking is what filesystem operations best suit large
> JFS allocations.

I think that the answers are "continuous write of largish bits",
ideally to a "freshly formatted filesystem". The reasons why can be
inferred from the disk layout, described here:

  http://en.wikipedia.org/wiki/JFS_(file_system)
  http://www.sabi.co.uk/Notes/linuxFS.html#jfsStruct
  http://jfs.sourceforge.net/project/pub/jfslayout.pdf

> Should I allocate the entire file by way of truncate?

That should not allocate anything:

  base# grep /dev/root /proc/mounts
  /dev/root / jfs rw 0 0
  base# perl -e 'truncate STDOUT,1000*1000*1000' > /1G
  base# ls -lsd /1G
  4 -rw------- 1 root root 1000000000 Dec 16 16:27 /1G

> Should I write a \0 every 4K (this is similar to what
> posix_fallocate does in glibc) for the size of the file?
> Normally I use truncate but I've had good success with both
> methods.

Both are somewhat random methods. Also, writing a single byte every
4KiB is less than optimal (larger writes and less seeking would
probably be better). I'd use something like:

  dd bs=10M count=100 if=/dev/zero oflag=direct of=tmp/1G

again ideally on a non-busy, freshly formatted filesystem.
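Incidentally, on a filesystem like JFS that has no native
preallocation operation, glibc's posix_fallocate() falls back to
essentially that "write a \0 into every block" loop, just packaged as
a single call. A minimal sketch (the file name and size are made up):

  #define _XOPEN_SOURCE 600
  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void) {
      int fd = open("prealloc.bin", O_WRONLY | O_CREAT, 0644);
      if (fd < 0) { perror("open"); return 1; }
      /* Returns an error number directly instead of setting errno. */
      int err = posix_fallocate(fd, 0, 1000L * 1000 * 1000);
      if (err) { fprintf(stderr, "posix_fallocate: %d\n", err); return 1; }
      close(fd);
      return 0;
  }

Unlike truncate, that does force real allocation, but it happens in
small writes -- exactly the pattern criticized above -- so a few
large direct writes as with the dd line are probably still
preferable.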
> I'm asking specifically for the best way to do this *for JFS*.

It is not very JFS specific, because JFS does not have specific ways
to preallocate space for a file. However, a large difference between
JFS and some other filesystems is that it allocates stuff in much
larger "cylinder groups" (AGs in JFS), and that it will allocate
space for one file per "cylinder group" (as per the references
above).

As a curiosity I have just checked AG size vs. aggregate size for 3
JFS filesystems I have (12G, 120G, 460G):

  # for A in sdc1 sdc6 sdc9; do jfs_tune -l /dev/$A; done \
      | egrep '(gate|group) size'
  Aggregate size:        24373496 blocks
  Allocation group size: 32768 aggregate blocks
  Aggregate size:        249124744 blocks
  Allocation group size: 262144 aggregate blocks
  Aggregate size:        976179568 blocks
  Allocation group size: 1048576 aggregate blocks

A bit surprising, as I was expecting bigger allocation groups, but it
looks like the goal is to have around 1,000 AGs per aggregate (the AG
sizes are 120MB, 1GB, 3.8GB). Extents will grow across AGs, but only
for one file at a time.

>>> Is there a way to get "1 extents"?

>> That is pointless. What matters is what percentage of IO is
>> [ ... ]

> I should have asked "what is the most optimal way to allocate
> space for large files on JFS".

Again, that would be "continuous write of largish bits" and "on a
less busy filesystem"; but pre-creating files is often regrettably
fairly pointless, as many applications don't overwrite files in
place, but just reopen them with "O_CREAT|O_TRUNC", which truncates
them to 0 before writing. Ideally applications would (optionally)
open without truncating, overwrite, and truncate at the end, but that
is quite rare (a sketch of that pattern is at the end of this
message).

>>> I have other .iso images of similar size with 1 extent.

>> You copied them on a less busy filesystem.

> The filesystem was no more or less busy, but that's not the
> point.

That can have a lot of influence on the result -- if a filesystem is
busy (either in the sense of having quite a bit of IO ongoing, or of
having had quite a bit of allocation in the past) the free space is
more likely to be widely scattered. Sometimes even just writing two
files at the same time in small pieces causes trouble (but much less
so on a fresh, mostly unused filesystem, especially if the filesystem
is largish, as the AGs will then be bigger).
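For completeness, the overwrite-without-truncating pattern mentioned
above looks something like this (a sketch; the file name and the data
being written are made up):

  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void) {
      /* O_CREAT but *no* O_TRUNC: existing extents are kept. */
      int fd = open("output.dat", O_WRONLY | O_CREAT, 0644);
      if (fd < 0) { perror("open"); return 1; }

      char buf[1 << 16] = {0};           /* stand-in for real data */
      off_t written = 0;
      for (int i = 0; i < 160; i++) {    /* ~10MiB of output */
          ssize_t n = write(fd, buf, sizeof buf);
          if (n < 0) { perror("write"); return 1; }
          written += n;
      }
      /* Drop any tail left over from a longer previous version. */
      if (ftruncate(fd, written) < 0) { perror("ftruncate"); return 1; }
      close(fd);
      return 0;
  }

Because the file is opened without O_TRUNC, rewriting it reuses the
extents laid down by the first (hopefully contiguous) write instead
of freeing them and allocating new, possibly scattered, ones.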
