Chris Mason wrote:
On Fri, Jun 05, 2009 at 04:27:55PM -0500, Steven Pratt wrote:
Steven Pratt wrote:
Chris Mason wrote:
On Thu, Jun 04, 2009 at 02:02:20PM -0500, Steven Pratt wrote:
Chris Mason wrote:
Hello everyone,
Yan Zheng has been doing some major surgery to the back references and
extent allocation code, tackling bottlenecks in the code that tracks
extents. It scales better with many snapshots and performs better in
the common case of no snapshots at all.
THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE. This means it is
compatible with the current btrfs disk format, but once you mount a
filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
KERNELS. Old kernels spit out an error message when you try them on new
format filesystems.
This is a large change, and I'm hoping to have it stable in time for the
2.6.31 merge window. I've been testing it for about a week now, and
haven't been able to cause major problems yet. But, testing the
compatibility with old format filesystems is the hard part, and
everyone that pulls the new code should back up their data first.
I've set up git branches called newformat where you can pull the new code.
For the kernel (based on 2.6.30-rc7):
git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
So I started the performance runs on this. The base tests completed
fine on the raid system and I will post results as soon as I can
finish postprocessing, but when I tried to do nodatacow on that
machine it crashed pretty early. Here is the console log:
Hi Steve,
Thanks again for hammering on these. Yan Zheng and I have both been
trying to reproduce problems with nodatacow and with the database random
write run.
So now that the raid machine is actually up, I discovered it got
further than I thought on nodatacow. It did all the read tests, but
appeared to die on the 16 thread random write (not O_DIRECT). There were no
messages logged to /var/log/messages at all. The last I saw was:
Jun 4 03:14:24 btrfs1 kernel: [65856.065491] btrfs: setting nodatacow
Jun 4 15:24:45 btrfs1 syslogd 1.4.1: restart.
Just dead until we rebooted machine later that day.
So the raid system completed the re-run of the nodatacow runs without
error. So still no idea what happened on this box the first time
around. As for the single disk system, it died during the random write
test again, but it now looks like we might have a real HW failure. This
time we see SCSI error messages. I have replaced the test disks and
will try one more time.
The net is, I would hold off digging too much into this as even I don't
have any repeatable errors.
Thanks for rerunning all of this, appreciate the update.
No problem. Raid results are uploading to
http://btrfs.boxacle.net/repository/raid/history/History.html now.
There were massive improvements in the random write workloads,
especially with cow enabled!! MailServer had moderate perf gains, but a
dramatic decrease in CPU utilization, so this is very good as well.
The only regression I see is on large file creates: CPU is up 200% or
more while performance is fairly flat. btrfs_tree_lock now dominates
the profile.
I am still having issues on the single disk system, which I am still not
sure if it is btrfs or HW, but I am off on a family vacation tomorrow so
it will have to wait for a week or so.
Steve
-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html