Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-25 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Ian Collins Add to that: if running dedup, get plenty of RAM and cache. Add plenty RAM. And tweak your arc_meta_limit. You can at least get dedup performance that's on the same order of

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Roberto Waltman
Edward Ned Harvey wrote: So I'm getting comparisons of write speeds for 10G files, sampling at 100G intervals. For a 6x performance degradation, it would be 7 sec to write without dedup, and 40-45sec to write with dedup. For a totally unscientific data point: The HW: Server - Supermicro
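
As a quick check on the figures quoted above, the slowdown implied by 7 sec without dedup versus 40-45 sec with dedup can be worked out directly. This is only an arithmetic sketch; the function name `slowdown` is made up for illustration and is not from the thread.

```python
def slowdown(baseline_s: float, dedup_s: float) -> float:
    """Return how many times slower the dedup write is than the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return dedup_s / baseline_s


# Figures quoted in the thread: 7 sec without dedup, 40-45 sec with dedup.
low = slowdown(7, 40)
high = slowdown(7, 45)
print(f"{low:.1f}x to {high:.1f}x")  # prints "5.7x to 6.4x"
```

So the quoted timings bracket the "6x" figure rather than matching it exactly.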

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Nico Williams
On Jul 9, 2011 1:56 PM, Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com wrote: Given the abysmal performance, I have to assume there is a significant number of overhead reads or writes in order to maintain the DDT for each actual block write operation. Something I didn't

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Ian Collins
On 07/25/11 04:21 AM, Roberto Waltman wrote: Edward Ned Harvey wrote: So I'm getting comparisons of write speeds for 10G files, sampling at 100G intervals. For a 6x performance degradation, it would be 7 sec to write without dedup, and 40-45sec to write with dedup. For a totally

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-23 Thread Ian Collins
On 07/10/11 04:04 AM, Edward Ned Harvey wrote: There were a lot of useful details put into the thread Summary: Dedup and L2ARC memory requirements Please refer to that thread as necessary... After much discussion leading up to that thread, I thought I had enough understanding to make

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-23 Thread Edward Ned Harvey
From: Ian Collins [mailto:i...@ianshome.com] Sent: Saturday, July 23, 2011 4:02 AM Can you provide more details of your tests? Here's everything: http://dl.dropbox.com/u/543241/dedup%20tests/dedup%20tests.zip In particular: Under the work server directory. The basic concept goes like

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread Frank Van Damme
On 15-07-11 04:27, Edward Ned Harvey wrote: Is anyone from Oracle reading this? I understand if you can't say what you're working on and stuff like that. But I am merely hopeful this work isn't going into a black hole... Anyway. Thanks for listening (I hope.) ttyl If they aren't,

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread phil.har...@gmail.com
If you clone zones from a golden image using ZFS cloning, you get fast, efficient dedup for free. Sparse root always was a horrible hack! - Reply message - From: Jim Klimov jimkli...@cos.ru To: Cc: zfs-discuss@opensolaris.org Subject: [zfs-discuss] Summary: Dedup memory and performance

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread Jim Klimov
2011-07-15 11:10, phil.har...@gmail.com wrote: If you clone zones from a golden image using ZFS cloning, you get fast, efficient dedup for free. Sparse root always was a horrible hack! Sounds like a holy war is flaming up ;) From what I heard, sparse root zones with shared common system

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread Mike Gerdts
On Fri, Jul 15, 2011 at 5:19 AM, Jim Klimov jimkli...@cos.ru wrote: 2011-07-15 11:10, phil.har...@gmail.com wrote: If you clone zones from a golden image using ZFS cloning, you get fast, efficient dedup for free. Sparse root always was a horrible hack! Sounds like a holy war is flaming up ;)

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Frank Van Damme
On 12-07-11 13:40, Jim Klimov wrote: Even if I batch background RM's so a hundred processes hang and then they all at once complete in a minute or two. Hmmm. I only run one rm process at a time. You think running more processes at the same time would be faster? -- No part of this copyright

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Jim Klimov
2011-07-14 11:54, Frank Van Damme wrote: On 12-07-11 13:40, Jim Klimov wrote: Even if I batch background RM's so a hundred processes hang and then they all at once complete in a minute or two. Hmmm. I only run one rm process at a time. You think running more processes at the same time would

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Frank Van Damme
On 14-07-11 12:28, Jim Klimov wrote: Yes, quite often it seems so. Whenever my slow dcpool decides to accept a write, it processes a hundred pending deletions instead of one ;) Even so, it took quite a few pool or iscsi hangs and then reboots of both server and client, and about a week

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Jim Klimov
2011-07-14 15:48, Frank Van Damme wrote: It seems counter-intuitive - you'd say: concurrent disk access makes things only slower - , but it turns out to be true. I'm deleting a dozen times faster than before. How completely ridiculous. Thank you :-) Well, look at it this way: it is not only

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Daniel Carosone
um, this is what xargs -P is for ... -- Dan. On Thu, Jul 14, 2011 at 07:24:52PM +0400, Jim Klimov wrote: 2011-07-14 15:48, Frank Van Damme wrote: It seems counter-intuitive - you'd say: concurrent disk access makes things only slower - , but it turns out to be true. I'm deleting a dozen

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey I understand the argument, DDT must be stored in the primary storage pool so you can increase the size of the storage pool without running out of space to hold the DDT...

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Jim Klimov
2011-07-15 6:21, Daniel Carosone wrote: um, this is what xargs -P is for ... Thanks for the hint. True, I don't often use xargs. However from the man pages, I don't see a -P option on OpenSolaris boxes of different releases, and there is only a -p (prompt) mode. I am not eager to enter yes

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Daniel Carosone
On Fri, Jul 15, 2011 at 07:56:25AM +0400, Jim Klimov wrote: 2011-07-15 6:21, Daniel Carosone wrote: um, this is what xargs -P is for ... Thanks for the hint. True, I don't often use xargs. However from the man pages, I don't see a -P option on OpenSolaris boxes of different releases, and
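
Where an xargs with -P is not available, the same "several deletions in flight at once" idea discussed above can be sketched with a small thread pool. This is a minimal illustration, not anything posted in the thread: the helper name `remove_parallel`, the worker count, and the temporary demo files are all assumptions, and it makes no claim about how much faster deletion would be on a dedup-enabled pool.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def remove_parallel(paths, workers=8):
    """Unlink every path using a pool of worker threads; return the count removed."""
    def _remove(path):
        os.remove(path)  # raises OSError if the file cannot be removed
        return 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map re-raises any worker exception when results are consumed
        return sum(pool.map(_remove, paths))


# Demo on throwaway files in a temporary directory (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(20):
        path = os.path.join(tmp, f"file{i}.dat")
        with open(path, "w") as fh:
            fh.write("x")
        paths.append(path)
    removed = remove_parallel(paths, workers=4)
    remaining = os.listdir(tmp)

print(removed, remaining)
```

Threads are enough here because each worker spends its time blocked in the unlink system call rather than running Python code.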

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Jim Klimov
2011-07-09 20:04, Edward Ned Harvey wrote: --- Performance gain: Unfortunately there was only one area that I found any performance gain. When you read back duplicate data that was previously written with dedup, then you get a lot more cache hits, and as a result, the reads go faster.

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Jim Klimov By the way, did you estimate how much is dedup's overhead in terms of metadata blocks? For example it was often said on the list that you shouldn't bother with dedup unless you

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Jim Klimov
This dedup discussion (and my own bad experience) have also left me with another grim thought: some time ago sparse-root zone support was ripped out of OpenSolaris. Among the published rationales were transition to IPS and the assumption that most people used them to save on disk space (notion

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Jim Klimov
You and I seem to have different interpretations of the empirical 2x soft-requirement to make dedup worthwhile. Well, until recently I had little interpretation for it at all, so your approach may be better. I hope that authors of the requirement statement would step forward and explain

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Bob Friesenhahn
On Tue, 12 Jul 2011, Edward Ned Harvey wrote: You know what? A year ago I would have said dedup still wasn't stable enough for production. Now I would say it's plenty stable enough... But it needs performance enhancement before it's truly useful for most cases. What has changed for you to

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Edward Ned Harvey
From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] Sent: Tuesday, July 12, 2011 9:58 AM You know what? A year ago I would have said dedup still wasn't stable enough for production. Now I would say it's plenty stable enough... But it needs performance enhancement before it's

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-10 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey --- Performance loss: I ran one more test, that is rather enlightening. I repeated test #2 (tweak arc_meta_limit, use the default primarycache=all) but this time I wrote

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-10 Thread Edward Ned Harvey
From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net] Sent: Saturday, July 09, 2011 3:44 PM Could you test with some SSD SLOGs and see how well or bad the system performs? These are all async writes, so slog won't be used. Async writes that have a single fflush() and fsync() at

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-10 Thread Edward Ned Harvey
From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net] Sent: Saturday, July 09, 2011 3:44 PM Sorry, my bad, I meant L2ARC to help buffer the DDT Also, bear in mind, the L2ARC is only for reads. So it can't help accelerate writing updates to the DDT. Those updates need to hit the pool,

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Edward Ned Harvey
Given the abysmal performance, I have to assume there is a significant number of overhead reads or writes in order to maintain the DDT for each actual block write operation. Something I didn't mention in the other email is that I also tracked iostat throughout the whole operation. It's all

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey When you read back duplicate data that was previously written with dedup, then you get a lot more cache hits, and as a result, the reads go faster.  Unfortunately these

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Roy Sigurd Karlsbakk
When it's not cached, of course the read time was equal to the original write time. When it's cached, it goes 4x faster. Perhaps this is only because I'm testing on a machine that has super fast storage... 11 striped SAS disks yielding 8Gbit/sec as compared to all-RAM which yielded

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Edward Ned Harvey
From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net] Sent: Saturday, July 09, 2011 2:33 PM Could you test with some SSD SLOGs and see how well or bad the system performs? These are all async writes, so slog won't be used. Async writes that have a single fflush() and fsync() at the end

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Roy Sigurd Karlsbakk
From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net] Sent: Saturday, July 09, 2011 2:33 PM Could you test with some SSD SLOGs and see how well or bad the system performs? These are all async writes, so slog won't be used. Async writes that have a single fflush() and fsync() at