I've observed some performance degradation when writing large files.
jfsCommit reaches nearly 100% CPU usage and the write rate slows to a crawl.
I've done a lot of digging, and I think I know the cause. In summary, I
believe txUpdateMap is spending a *lot* of extra time re-writing the
persistent block maps when extending the last xad in the file's xtree. xtLog
logs the *entire last xad* for update, and when appending a large file on
a sparsely allocated volume that xad can cover a great many blocks, so
each commit updates millions of already-allocated blocks. As the file
grows, the total number of updates grows quadratically with the xad
length, causing a huge performance loss.
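To put a number on it, here's a rough back-of-the-envelope model (my own sketch, not kernel code): if each commit appends a fixed number of blocks but the persistent-map update covers the entire last xad, the per-commit cost grows linearly and the total cost quadratically:

```python
# Toy model: each commit appends `per_commit` blocks to the last xad, but the
# persistent block map is re-written for the *whole* xad on every commit.
def total_map_updates(final_xad_len, per_commit):
    updates, xad_len = 0, 0
    while xad_len < final_xad_len:
        xad_len = min(xad_len + per_commit, final_xad_len)
        updates += xad_len          # entire xad logged, not just the new blocks
    return updates

# Filling one maximal xad (2**24 - 1 blocks) in 4096-block (16MB) commits:
# the useful work is ~16.8M block-map updates, but the model gives ~2000x that.
maximal = total_map_updates(2**24 - 1, 4096)
```

With those numbers the model re-writes roughly 3.4e10 map entries to allocate about 1.7e7 blocks, which would go a long way toward explaining a pegged jfsCommit thread.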
The reason I investigated this was that a 320GB file write slowed to about
7-10MB/s, when the underlying RAID volume can normally sustain 400MB/s.
jfsCommit's CPU usage was 99.4% at the same time. For the record, I'm
running Linux 2.6.35.11-83.fc14.x86_64.
I confirmed the file was not fragmented by dumping the xtree for the inode
in question. (I used some hacky Python code I wrote for exploring JFS
volumes, and I used sync and drop_caches to make sure the metadata had
been flushed out to the disk.)
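For anyone who wants to do the same, decoding an on-disk xad from Python is straightforward. This is my reading of the xad_t layout in the kernel's jfs_xtree.h (40-bit offset and address, 24-bit length, little-endian on disk); treat it as a sketch and double-check it against your own kernel headers:

```python
import struct

def decode_xad(raw):
    """Decode one 16-byte on-disk xad_t into (offset, address, length).

    Layout assumed (per jfs_xtree.h): flag byte, 2 reserved bytes, high 8
    bits of the offset, low 32 bits of the offset, then a 32-bit word whose
    low 24 bits are the length and high 8 bits are the top of the address,
    then the low 32 bits of the address.
    """
    flag, _r0, _r1, off1, off2, len_addr, addr2 = struct.unpack('<4B3I', raw)
    offset  = (off1 << 32) | off2                # 40-bit file offset, in blocks
    length  = len_addr & 0xFFFFFF                # 24-bit extent length, in blocks
    address = ((len_addr >> 24) << 32) | addr2   # 40-bit disk address, in blocks
    return offset, address, length
```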
Here's the xtree (4KB allocation size); each tuple is (offset, address,
length):
(0, 336166839, 1)
(1, 992337163, 2)
(3, 992398076, 1)
(4, 992471283, 22)
(26, 992771329, 13861631)
(13861657, 1026074340, 14113052)
(27974709, 1059503188, 14238636)
(42213345, 1093863162, 2)
(42213347, 1094345396, 175)
(42213522, 1094345577, 91)
(42213613, 1094588258, 12707998)
(54921611, 1115498095, 1)
(54921612, 1115498097, 2)
(54921614, 1115787154, 1)
(54921615, 1116292353, 102)
(54921717, 1116650670, 39713)
(54961430, 1116695739, 16777215)
(71738645, 1133472954, 6404161)
The performance issue occurred while writing the last 50GB or so, which is
consistent with the last two xads. When the file was around 266GB long, the
last xad (at offset 54961430) would have held around 10M blocks, and every
commit would have updated all 10M of them, until the xad hit its maximum
length of 16777215 (2^24 - 1) blocks and the new xad at offset 71738645 was
created. Sure enough, when the file reached roughly 293GB the performance
went back up, then slowly fell again.
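The arithmetic backs this up (4096-byte blocks, file sizes taken as decimal GB):

```python
BLOCK = 4096
last_xad_start = 54961430     # from the xtree dump above
next_xad_start = 71738645
MAXXLEN = 2**24 - 1           # largest length a 24-bit xad field can hold

# At 266GB the file is ~64.9M blocks, so the last xad holds ~10M of them:
blocks_at_266GB = 266 * 10**9 // BLOCK
last_xad_len = blocks_at_266GB - last_xad_start    # ~9.98M blocks

# The xad overflows at exactly MAXXLEN blocks, i.e. at ~293.8GB:
overflow_at = last_xad_start + MAXXLEN             # == next_xad_start
overflow_bytes = next_xad_start * BLOCK            # ~293.8e9 bytes
```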
This should be reproducible by writing a large file to a large, empty
volume. As each xad fills, the number of blocks updated per commit
increases and the total number of updates grows quadratically, so the
write slows down. Once the xad overflows, performance should go up again.
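For anyone who wants to see the expected shape before running a real test, a toy sawtooth model (my sketch again, same assumption that the whole last xad is re-logged each commit) predicts the per-commit cost climbing until the 2^24 - 1 block limit and then collapsing:

```python
MAXXLEN = 2**24 - 1   # 24-bit xad length field

def per_commit_costs(total_blocks, per_commit=4096):
    """Blocks whose map entries get re-written at each commit (toy model)."""
    costs, xad_len = [], 0
    for _ in range(total_blocks // per_commit):
        xad_len += per_commit
        if xad_len > MAXXLEN:        # last xad overflows; a fresh xad begins
            xad_len -= MAXXLEN       # overflow remainder starts the new xad
        costs.append(xad_len)        # the whole last xad is re-logged
    return costs
```

Plotting the returned list should give the same sawtooth as the observed write rate, inverted.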
-Bob
_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion