Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-25 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Ian Collins
 
 Add to that: if running dedup, get plenty of RAM and cache.

Add plenty of RAM.  And tweak your arc_meta_limit.  You can at least get dedup
performance that's on the same order of magnitude as performance without
dedup.
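
(For reference, a minimal sketch of that tuning on an snv_151a-era kernel, where
arc_meta_limit is most easily adjusted live via mdb; the 12 GB value below is
purely illustrative, and the change does not persist across reboots:)

echo "::arc" | mdb -k | egrep "meta_used|meta_limit"   # check current values
echo "arc_meta_limit/Z 0x300000000" | mdb -kw          # raise the limit to 12 GB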

Cache devices don't really help dedup very much, because each DDT entry stored in
ARC/L2ARC takes 376 bytes, and each reference to an L2ARC entry requires 176
bytes of ARC.  So in order to prevent an individual DDT entry from being
evicted to disk, you must either keep the 376 bytes in ARC, or evict it to
L2ARC and keep 176 bytes.  This is a very small payload.  A good payload
would be to evict a 128k block from ARC into L2ARC, keeping the 176 bytes
only in ARC.
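
(To put those per-entry numbers in perspective, a back-of-the-envelope
calculation for a hypothetical pool of 10 TB of unique data at an average block
size of 128 KB, using the per-entry sizes quoted above:)

echo "10 * 2^40 / (128 * 2^10)" | bc     # ~84 million DDT entries
echo "84000000 * 376 / 2^30" | bc        # ~29 GB of ARC to hold the whole DDT
echo "84000000 * 176 / 2^30" | bc        # ~13 GB of ARC just for the L2ARC headers

Either way the in-RAM footprint is large; pushing DDT entries out to L2ARC only
roughly halves it, which is the small-payload point above.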

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Roberto Waltman

Edward Ned Harvey wrote:
So I'm getting comparisons of write speeds for 10G files, sampling at 100G
intervals.  For a 6x performance degradation, it would be 7 sec to write
without dedup, and 40-45 sec to write with dedup.


For a totally unscientific data point:

The HW:  Server - Supermicro server motherboard.
Intel 920 CPU.
6 GB memory.
1 x 16 GB SSD as a boot device.
8 x 2TB green (5400 RPM?) hard drives.
The disks configured with 3 equal size partitions, all p1's in one 
raidz2 pool, all p2's in another, all p3's in another.
(Done to improve performance by limiting head movement when most of the 
disk activity is in one pool)


The SW: the last release of OpenSolaris. (Current at the time; I have 
since moved to Solaris 11.)


The test: back up an almost full 750 GB external hard disk formatted as a 
single NTFS volume. The disk was connected via eSATA to a fast computer 
(also a Supermicro + i920) running Ubuntu.

The Ubuntu machine had access to the file server via NFS.
The NFS-exported file system was created new for this backup, with dedup 
enabled, encryption and compression disabled, atime=off. This was the 
first (and last) time I tried enabling dedup.
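
(For reference, a dataset with those properties could be created along these
lines; the pool and dataset names here are invented:)

zfs create -o dedup=on -o compression=off -o atime=off -o sharenfs=on tank/backup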


From previous similar transfers (without dedup), I expected the backup 
to be finished in a few hours overnight, with the bottlenecks being the 
NTFS-3G driver in Ubuntu and the 100Mbit ethernet connection.


It took more than EIGHT DAYS, without any other activity going on on 
either machine.


(My) conclusion: Running low on storage? Get more/bigger disks.

--
Roberto Waltman
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Nico Williams
On Jul 9, 2011 1:56 PM, Edward Ned Harvey 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 Given the abysmal performance, I have to assume there is a significant
 number of overhead reads or writes in order to maintain the DDT for each
 actual block write operation.  Something I didn't mention in the other
 email is that I also tracked iostat throughout the whole operation.  It's
 all writes (or at least 99.9% writes.)  So I am forced to conclude it's a
 bunch of small DDT maintenance writes taking place and incurring access
time
 penalties in addition to each intended single block access time penalty.

 The nature of the DDT is that it's a bunch of small blocks that tend to be
 scattered randomly, and require maintenance in order to do anything else.
 This sounds like precisely the usage pattern that benefits from low
latency
 devices such as SSD's.

The DDT should be written to in COW fashion, and asynchronously, so there
should be no access time penalty.  Or so ISTM it should be.

Dedup is necessarily slower for writing because of the deduplication table
lookups.  Those are synchronous lookups, but for async writes you'd think
that total write throughput would only be affected by (a) the additional read
load (which is zero in your case) and (b) any inability to put together large
transactions due to the high latency of each logical write, but (b)
shouldn't happen, particularly if the DDT fits in RAM or L2ARC, as it does
in your case.

So, at first glance my guess is ZFS is leaving dedup write performance on
the table most likely due to implementation reasons, not design reasons.

Nico
--
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Ian Collins

 On 07/25/11 04:21 AM, Roberto Waltman wrote:

Edward Ned Harvey wrote:
So I'm getting comparisons of write speeds for 10G files, sampling at 100G
intervals.  For a 6x performance degradation, it would be 7 sec to write
without dedup, and 40-45 sec to write with dedup.

For a totally unscientific data point:

The HW:  Server - Supermicro server motherboard.
Intel 920 CPU.
6 GB memory.
1 x 16 GB SSD as a boot device.
8 x 2TB green (5400 RPM?) hard drives.
The disks configured with 3 equal size partitions, all p1's in one
raidz2 pool, all p2's in another, all p3's in another.
(Done to improve performance by limiting head movement when most of the
disk activity is in one pool)

The SW: the last release of Open Solaris. (Current at the time, I have
since moved to Solaris 11)

The test: backup an almost full 750Gb external hard disk formatted as a
single NTFS volume. The disk was connected via eSATA to a fast computer
(also a supermicro + I920) running Ubuntu.
The Ubuntu machine had access to the file server via NFS.
The NFS-exported file system was created new for this backup, with dedup
enabled, encryption and compression disabled, atime=off. This was the
first (and last) time I tried enabling dedup.

  From previous similar transfers, (without dedup), I expected the backup
to be finished in a few hours overnight, with the bottlenecks being the
NTFS-3G driver in Ubuntu and the 100Mbit ethernet connection.

It took more than EIGHT DAYS, without any other activity going on both
machines.

(My) conclusion: Running low on storage? get more/bigger disks.


Add to that: if running dedup, get plenty of RAM and cache.

I'm still seeing similar performance on my test system with and without 
dedup enabled.  Snapshot deletion appears slightly slower, but I have 
yet to run timed tests.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-23 Thread Ian Collins

 On 07/10/11 04:04 AM, Edward Ned Harvey wrote:


There were a lot of useful details put into the thread "Summary: Dedup 
and L2ARC memory requirements".


Please refer to that thread as necessary...  After much discussion 
leading up to that thread, I thought I had enough understanding to 
make dedup useful, but then in practice, it didn't work out.  Now I've 
done a lot more work on it, reduced it all to practice, and I finally 
feel I can draw up conclusions that are actually useful:


I am testing on a Sun Oracle server, X4270, 1 Xeon 4-core 2.4GHz, 24G 
RAM, 12 disks ea 2T SAS 7.2krpm.  Solaris 11 Express snv_151a.


Can you provide more details of your tests?  I'm currently testing a 
couple of slightly better configured X4270s (2 CPUs, 96GB RAM and a Flash 
Accelerator card) using real data from an existing server.  So far, I 
haven't seen the level of performance fall-off you report.


I currently have about 5TB of uncompressed data in the pool (stripe of 5 
mirrors) and throughput is similar to the existing, Solaris 10, 
servers.  The pool dedup ratio is 1.7, so there's a good mix of unique 
and duplicate blocks.
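
(For anyone wanting to compare: the pool-wide ratio is a pool property, and zdb
can simulate what dedup would save on data that isn't deduped yet. The pool name
below is hypothetical, and zdb -S can take a long time on a large pool:)

zpool get dedupratio tank
zpool list tank        # the DEDUP column shows the same ratio
zdb -S tank            # simulated DDT histogram for not-yet-deduped data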


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-23 Thread Edward Ned Harvey
 From: Ian Collins [mailto:i...@ianshome.com]
 Sent: Saturday, July 23, 2011 4:02 AM
 
 Can you provide more details of your tests?  

Here's everything:
http://dl.dropbox.com/u/543241/dedup%20tests/dedup%20tests.zip

In particular:
Under the "work server" directory.

The basic concept goes like this:
Find some amount of data that takes approx 10 sec to write.  I don't know
the size; I just kept increasing a block counter till I got times I felt were
reasonable, so let's suppose it's 10G.

Time writing that much without dedup (all unique).
Remove the file.
Time writing that much with dedup (sha256, no verify) (all unique).
Remove the file.
Write 10x that much with dedup (all unique).
Don't remove the file.
Repeat.

So I'm getting comparisons of write speeds for 10G files, sampling at 100G
intervals.  For a 6x performance degradation, it would be 7 sec to write
without dedup, and 40-45 sec to write with dedup.

I am doing fflush() and fsync() at the end of every file write, to ensure
results are not skewed by write buffering.
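
(The actual harness is in the zip above. As a rough outline of the same
comparison using only stock tools, something like the sketch below, where the
dataset name is invented, the default mountpoint is assumed, and dd reading
/dev/urandom stands in for the real unique-data generator, so absolute numbers
would be limited by /dev/urandom throughput rather than the pool:)

FS=tank/dduptest
for DEDUP in off sha256; do
  zfs set dedup=$DEDUP $FS
  # ~10G of unique data in 128k blocks, timed including a final sync
  ptime sh -c "dd if=/dev/urandom of=/$FS/unique bs=128k count=81920 && sync"
  rm /$FS/unique
done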

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread Frank Van Damme
On 15-07-11 04:27, Edward Ned Harvey wrote:
 Is anyone from Oracle reading this?  I understand if you can't say what
 you're working on and stuff like that.  But I am merely hopeful this work
 isn't going into a black hole...  
 
 Anyway.  Thanks for listening (I hope.)   ttyl

If they aren't, maybe someone from an open source Solaris version is :)

-- 
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread phil.har...@gmail.com
If you clone zones from a golden image using ZFS cloning, you get fast, 
efficient dedup for free. Sparse root always was a horrible hack! 
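
(For anyone unfamiliar with the pattern, the clone-from-golden-image approach
looks roughly like this; the dataset and zone names are invented:)

zfs snapshot rpool/zones/golden@gold                 # golden zone root, configured once
zfs clone rpool/zones/golden@gold rpool/zones/web01
zfs clone rpool/zones/golden@gold rpool/zones/web02
# Each clone shares all unmodified blocks with the snapshot, so only blocks
# that later diverge consume additional space.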

- Reply message -
From: Jim Klimov jimkli...@cos.ru
To: 
Cc: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Summary: Dedup memory and performance (again, again)
Date: Tue, Jul 12, 2011 14:05


This dedup discussion (and my own bad experience) has also
left me with another grim thought: some time ago sparse-root
zone support was ripped out of OpenSolaris.

Among the published rationales were transition to IPS and the
assumption that most people used them to save on disk space
(notion about saving RAM on shared objects was somehow
dismissed).

Regarding the disk savings, it was said that dedup would solve
the problem, at least for those systems which use dedup on
zoneroot dataset (and preferably that would be in the rpool, too).

On one hand, storing zoneroots in the rpool was never practical
for us because we tend to keep the rpool small and un-clobbered,
and on the other hand, now adding dedup to rpool would seem
like shooting oneself in the foot with a salt-loaded shotgun.
Maybe it won't kill, but would hurt a lot and for a long time.

On the third hand ;) with a small rpool hosting zoneroots as well,
the DDT would reasonably be small too, and may actually boost
performance while saving space. But lots of attention should now
be paid to separating /opt, parts of /var and stuff into delegated
datasets from a larger datapool. And software like Sun JES which
installs into a full-root zone's /usr might overwhelm a small rpool
as well.

Anyhow, Edward, is there a test for this scenario - i.e. a 10Gb
pool with lots of non-unique data in small blocks?

Thanks,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread Jim Klimov

2011-07-15 11:10, phil.har...@gmail.com wrote:
If you clone zones from a golden image using ZFS cloning, you get 
fast, efficient dedup for free. Sparse root always was a horrible hack!

Sounds like a holy war is flaming up ;)

From what I heard, sparse root zones with shared common
system libraries allowed saving not only disk space but
also RAM. Can't vouch for it; I never tested it extensively myself.

Cloning of golden zones is of course used in our systems.
But this approach breaks badly upon any major systems
update (i.e. LiveUpgrade to a new release) - many of the
binaries change, and you either suddenly have the zones
(wanting to) consume many gigabytes of disk space which
are not there on a small rpool or a busy data pool, or you
have to make a new golden image, clone a new set of
zones and reinstall/migrate all applications and settings.

True, this is a no-brainer for zones running a single task
like an /opt/tomcat directory which can be tossed around
to any OS, but becomes tedious for software with many
packages and complicated settings, especially if (in some
extremity) it was homebrewn and/or home-compiled and
unpackaged ;)

I am not the first (or probably last) to write about the inconvenience
of zone upgrades, which lose the cloning benefit, and much
of the same is true for upgrading cloned/deduped VM golden
images as well, where the golden image is just some common
baseline OS but the clones all run different software. And it is
this different software which makes them useful and unique,
and too distinct to maintain a dozen golden images efficiently
(i.e. there might be just 2 or 3 clones of each gold).

But in general, the problem is there - you either accept that
your OS images in effect won't be deduped, much or at all,
after some lifespan involving OS upgrades, or you don't
update them often (which may be unacceptable for security
and/or paranoia types of deployments), or you use some
trickery to update frequently and not lose much disk space,
such as automation of software and configs migration
from one clone (of old gold) to another clone (of new gold).

Dedup was a promising variant in this context, unless it
kills performance and/or stability... which was the subject
of this thread, with Edward's research into performance
of current dedup implementation (and perhaps some
baseline to test whether real improvements appear in
the future).

And in terms of performance there's some surprise in
Edward's findings regarding, e.g., reads from the deduped
data. For infrequent maintenance (i.e. monthly upgrades)
zoneroots (OS image part) would be read-mostly and write
performance of dedup may not matter much. If the updates
must pour in often for whatever reason, then write and
delete performance of dedup may begin to matter.

Sorry about straying the discussion into zones - they,
their performance and coping with changes introduced
during lifetime (see OS upgrades), are one good example
for discussion of dedup, and its one application which
may be commonly useful on any server or workstation,
not only on hardware built for dedicated storage.

Sparse-root vs. full-root zones, or disk images of VMs;
are they stuffed in one rpool or spread between rpool and
data pools - that detail is not actually the point of the thread.

Actual usability of dedup for savings and gains on these
tasks (preferably working also on low-mid-range boxes,
where adding a good enterprise SSD would double the
server cost - not only on those big good systems with
tens of GB of RAM), and hopefully simplifying the system
configuration and maintenance - that is indeed the point
in question.

//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-15 Thread Mike Gerdts
On Fri, Jul 15, 2011 at 5:19 AM, Jim Klimov jimkli...@cos.ru wrote:
 2011-07-15 11:10, phil.har...@gmail.com wrote:

 If you clone zones from a golden image using ZFS cloning, you get fast,
 efficient dedup for free. Sparse root always was a horrible hack!

 Sounds like a holy war is flaming up ;)

 From what I heard, sparse root zones with shared common
 system libraries allowed to save not only on disk space but
 also on RAM. Can't vouch, never tested extensively myself.

There may be some benefit to that, but I'd argue that most of the time
there's not much.  Using what is surely an imperfect way of
measuring, I took a look at a zone on a Solaris 10 box that I happen to
be logged into. I found it is using about 52 MB of memory in mappings
of executables and libraries.  By disabling webconsole (a java program
with an RSS size of 100+ MB) the shared mappings drop to 40 MB.

# cd /proc
# pmap -xa * | grep r.x | grep -v ' anon ' | grep -v ' stack ' | grep
-v ' heap ' | sort -u | nawk '{ t+= $3 } END { print t / 1024, "MB" }'
pmap: cannot examine 22427: system process
40.3281 MB

If you are running the same large application (large executable +
libraries resident in memory) in many zones, you may have additional
benefit.

Solaris 10 was released in 2005, meaning that sparse root zones were
conceived sometime in the years leading up to that.  In that time, the
entry level servers have gone from 1 - 2 GB of memory (e.g. a V210 or
V240) to 12 - 16+ GB of memory (X2270 M2, T3-1).  Further, large
systems tend to have NUMA characteristics that challenge the logic of
trying to maintain only one copy of hot read-only executable pages.
It just doesn't make sense to constrain the design of zones around
something that is going to save 0.3% of the memory of an entry level
server.  Even in 2005, I'm not so sure it was a strong argument.
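
(The 0.3% is just the measured 40 MB of shared mappings against a 12 GB
entry-level box:)

echo "scale=2; 40 * 100 / (12 * 1024)" | bc    # roughly 0.3 percent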

Disk space is another issue.  Jim does a fine job of describing the
issues around that.

 Cloning of golden zones is of course used in our systems.
 But this approach breaks badly upon any major systems
 update (i.e. LiveUpgrade to a new release) - many of the
 binaries change, and you either suddenly have the zones
 (wanting to) consume many gigabytes of disk space which
 are not there on a small rpool or a busy data pool, or you
 have to make a new golden image, clone a new set of
 zones and reinstall/migrate all applications and settings.

 True, this is a no-brainer for zones running a single task
 like an /opt/tomcat directory which can be tossed around
 to any OS, but becomes tedious for software with many
 packages and complicated settings, especially if (in some
 extremity) it was homebrewn and/or home-compiled and
 unpackaged ;)

 I am not the first (or probably last) to write about inconvenience
 of zone upgrades which loses the cloning benefit, and much
 of the same is true for upgrading cloned/deduped VM golden
 images as well, where the golden image is just some common
 baseline OS but the clones all run different software. And it is
 this different software which makes them useful and unique,
 and too distinct to maintain a dozen of golden images efficiently
 (i.e. there might be just 2 or 3 clones of each gold).

 But in general, the problem is there - you either accept that
 your OS images in effect won't be deduped, much or at all,
 after some lifespan involving OS upgrades, or you don't
 update them often (which may be unacceptable for security
 and/or paranoia types of deployments), or you use some
 trickery to update frequently and not lose much disk space,
 such as automation of software and configs migration
 from one clone (of old gold) to another clone (of new gold).

 Dedup was a promising variant in this context, unless it
 kills performance and/or stability... which was the subject
 of this thread, with Edward's research into performance
 of current dedup implementation (and perhaps some
 baseline to test whether real improvements appear in
 the future).

 And in terms of performance there's some surprise in
 Edward's findings regarding i.e. reads from the deduped
 data. For infrequent maintenance (i.e. monthly upgrades)
 zoneroots (OS image part) would be read-mostly and write
 performance of dedup may not matter much. If the updates
 must pour in often for whatever reason, then write and
 delete performance of dedup may begin to matter.

 Sorry about straying the discussion into zones - they,
 their performance and coping with changes introduced
 during lifetime (see OS upgrades), are one good example
 for discussion of dedup, and its one application which
 may be commonly useful on any server or workstation,
 not only on hardware built for dedicated storage.

 Sparse-root vs. full-root zones, or disk images of VMs;
 are they stuffed in one rpool or spread between rpool and
 data pools - that detail is not actually the point of the thread.

 Actual usability of dedup for savings and gains on these
 tasks (preferably working also on low-mid-range boxes,
 

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Frank Van Damme
On 12-07-11 13:40, Jim Klimov wrote:
 Even if I batch background RM's so a hundred processes hang
 and then they all at once complete in a minute or two.

Hmmm. I only run one rm process at a time. You think running more
processes at the same time would be faster?

-- 
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Jim Klimov

2011-07-14 11:54, Frank Van Damme wrote:

On 12-07-11 13:40, Jim Klimov wrote:

Even if I batch background RM's so a hundred processes hang
and then they all at once complete in a minute or two.

Hmmm. I only run one rm process at a time. You think running more
processes at the same time would be faster?

Yes, quite often it seems so.
Whenever my slow dcpool decides to accept a write,
it processes a hundred pending deletions instead of one ;)

Even so, it took quite a few pool or iscsi hangs and then
reboots of both server and client, and about a week overall,
to remove a 50Gb dir with 400k small files from a deduped
pool served over iscsi from a volume in a physical pool.

Just completed this night ;)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Frank Van Damme
On 14-07-11 12:28, Jim Klimov wrote:

 Yes, quite often it seems so.
 Whenever my slow dcpool decides to accept a write,
 it processes a hundred pending deletions instead of one ;)
 
 Even so, it took quite a few pool or iscsi hangs and then
 reboots of both server and client, and about a week overall,
 to remove a 50Gb dir with 400k small files from a deduped
 pool served over iscsi from a volume in a physical pool.
 
 Just completed this night ;)

It seems counter-intuitive - you'd say concurrent disk access only makes
things slower - but it turns out to be true. I'm deleting a dozen
times faster than before. How completely ridiculous.

Thank you :-)

-- 
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Jim Klimov

2011-07-14 15:48, Frank Van Damme wrote:
It seems counter-intuitive - you'd say: concurrent disk access makes 
things only slower - , but it turns out to be true. I'm deleting a 
dozen times faster than before. How completely ridiculous. Thank you :-)


Well, look at it this way: it is not only about individual disk accesses
(i.e. unlike other filesystems, you do not modify a directory entry in place);
with ZFS COW it is about rewriting a tree of block pointers, with any
new writes going into free (currently unreferenced) disk blocks anyway.

So by hoarding writes you have a chance to reduce mechanical
IOPS required for your tasks. Until you run out of RAM ;)

Just in case it helps: to quickly fire up removals of the specific directory
after yet another reboot of the box, and not overwhelm it with hundreds
of thousands of queued rm processes either, I made this script as /bin/RM:

===
#!/bin/sh

SLEEP=10
[ "x$1" != "x" ] && SLEEP="$1"

A=0
# To rm small files: find ... -size -10
find /export/OLD/PATH/TO/REMOVE -type f | while read LINE; do
  du -hs "$LINE"
  rm -f "$LINE" &
  A=$(($A+1))
  [ $A -ge 100 ] && ( date; while [ `ps -ef | grep -wc rm` -gt 50 ]; do
     echo "Sleep $SLEEP..."; ps -ef | grep -wc rm; sleep $SLEEP; ps -ef | grep -wc rm;
  done
  date ) && A=`ps -ef | grep -wc rm`
done ; date
===

Essentially, after firing up 100 rm attempts it waits for the rm
process count to go below 50, then goes on. Sizing may vary
between systems, phase of the moon and computer's attitude.
Sometimes I had 700 processes stacked and processed quickly.
Sometimes it hung on 50...

HTH,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Daniel Carosone
um, this is what xargs -P is for ...

--
Dan.

On Thu, Jul 14, 2011 at 07:24:52PM +0400, Jim Klimov wrote:
 2011-07-14 15:48, Frank Van Damme wrote:
 It seems counter-intuitive - you'd say: concurrent disk access makes  
 things only slower - , but it turns out to be true. I'm deleting a  
 dozen times faster than before. How completely ridiculous. Thank you 
 :-)

 Well, look at it this way: it is not only about singular disk accesses
 (i.e. unlike other FSes, you do not in-place modify a directory entry),
 with ZFS COW it is about rewriting a tree of block pointers, with any
 new writes going into free (unreferenced ATM) disk blocks anyway.

 So by hoarding writes you have a chance to reduce mechanical
 IOPS required for your tasks. Until you run out of RAM ;)

 Just in case it helps, to quickly fire up removals of the specific  
 directory
 after yet another reboot of the box, and not overwhelm it with hundreds
 of thousands queued rmprocesses either, I made this script as /bin/RM:

 ===
 #!/bin/sh

 SLEEP=10
 [ "x$1" != "x" ] && SLEEP="$1"

 A=0
 # To rm small files: find ... -size -10
 find /export/OLD/PATH/TO/REMOVE -type f | while read LINE; do
   du -hs "$LINE"
   rm -f "$LINE" &
   A=$(($A+1))
   [ $A -ge 100 ] && ( date; while [ `ps -ef | grep -wc rm` -gt 50 ]; do
      echo "Sleep $SLEEP..."; ps -ef | grep -wc rm; sleep $SLEEP; ps -ef | grep -wc rm;
   done
   date ) && A=`ps -ef | grep -wc rm`
 done ; date
 ===

 Essentially, after firing up 100 rm attempts it waits for the rm
 process count to go below 50, then goes on. Sizing may vary
 between systems, phase of the moon and computer's attitude.
 Sometimes I had 700 processes stacked and processed quickly.
 Sometimes it hung on 50...

 HTH,
 //Jim

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 I understand the argument, DDT must be stored in the primary storage pool
 so
 you can increase the size of the storage pool without running out of space
 to hold the DDT...  But it's a fatal design flaw as long as you care about
 performance...  If you don't care about performance, you might as well use
 the netapp and do offline dedup.  The point of online dedup is to gain
 performance.  So in ZFS you have to care about the performance.
 
 There are only two possible ways to fix the problem.
 Either ...
 The DDT must be changed so it can be stored entirely in a designated
 sequential area of disk, and maintained entirely in RAM, so all DDT
 reads/writes can be infrequent and serial in nature...  This would solve
the
 case of async writes and large sync writes, but would still perform poorly
 for small sync writes.  And it would be memory intensive.  But it should
 perform very nicely given those limitations.  ;-)
 Or ...
 The DDT stays as it is now, highly scattered small blocks, and there needs
 to be an option to store it entirely on low latency devices such as
 dedicated SSD's.  Eliminate the need for the DDT to reside on the slow
 primary storage pool disks.  I understand you must consider what happens
 when the dedicated SSD gets full.  The obvious choices would be either (a)
 dedup turns off whenever the metadatadevice is full or (b) it defaults to
 writing blocks in the main storage pool.  Maybe that could even be a
 configurable behavior.  Either way, there's a very realistic use case
here.
 For some people in some situations, it may be acceptable to say I have
32G
 mirrored metadatadevice, divided by 137bytes per entry I can dedup up to a
 maximum 218M unique blocks in pool, and if I estimate 100K average block
 size that means up to 20T primary pool storage.  If I reach that limit,
I'll
 add more metadatadevice.
 
 Both of those options would also go a long way toward eliminating the
 surprise delete performance black hole.

Is anyone from Oracle reading this?  I understand if you can't say what
you're working on and stuff like that.  But I am merely hopeful this work
isn't going into a black hole...  

Anyway.  Thanks for listening (I hope.)   ttyl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Jim Klimov

2011-07-15 6:21, Daniel Carosone wrote:

um, this is what xargs -P is for ...


Thanks for the hint. True, I don't often use xargs.

However from the man pages, I don't see a -P option
on OpenSolaris boxes of different releases, and there
is only a -p (prompt) mode. I am not eager to enter
yes 40 times ;)

The way I had this script in practice, I could enter RM
once and it worked till the box hung. Even then, a watchdog
script could often have it rebooted without my interaction
so it could continue in the next lifetime ;)



--
Dan.

On Thu, Jul 14, 2011 at 07:24:52PM +0400, Jim Klimov wrote:

2011-07-14 15:48, Frank Van Damme wrote:

It seems counter-intuitive - you'd say: concurrent disk access makes
things only slower - , but it turns out to be true. I'm deleting a
dozen times faster than before. How completely ridiculous. Thank you
:-)

Well, look at it this way: it is not only about singular disk accesses
(i.e. unlike other FSes, you do not in-place modify a directory entry),
with ZFS COW it is about rewriting a tree of block pointers, with any
new writes going into free (unreferenced ATM) disk blocks anyway.

So by hoarding writes you have a chance to reduce mechanical
IOPS required for your tasks. Until you run out of RAM ;)

Just in case it helps, to quickly fire up removals of the specific
directory
after yet another reboot of the box, and not overwhelm it with hundreds
of thousands queued rmprocesses either, I made this script as /bin/RM:

===
#!/bin/sh

SLEEP=10
[ "x$1" != "x" ] && SLEEP="$1"

A=0
# To rm small files: find ... -size -10
find /export/OLD/PATH/TO/REMOVE -type f | while read LINE; do
   du -hs "$LINE"
   rm -f "$LINE" &
   A=$(($A+1))
   [ $A -ge 100 ] && ( date; while [ `ps -ef | grep -wc rm` -gt 50 ]; do
  echo "Sleep $SLEEP..."; ps -ef | grep -wc rm; sleep $SLEEP; ps -ef | grep -wc rm;
   done
   date ) && A=`ps -ef | grep -wc rm`
done ; date
===

Essentially, after firing up 100 rm attempts it waits for the rm
process count to go below 50, then goes on. Sizing may vary
between systems, phase of the moon and computer's attitude.
Sometimes I had 700 processes stacked and processed quickly.
Sometimes it hung on 50...

HTH,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-14 Thread Daniel Carosone
On Fri, Jul 15, 2011 at 07:56:25AM +0400, Jim Klimov wrote:
 2011-07-15 6:21, Daniel Carosone wrote:
 um, this is what xargs -P is for ...

 Thanks for the hint. True, I don't often use xargs.

 However from the man pages, I don't see a -P option
 on OpenSolaris boxes of different releases, and there
 is only a -p (prompt) mode. I am not eager to enter
 yes 40 times ;)

you want the /usr/gnu/{bin,share/man} version, at least in this case.
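
(i.e. something along these lines, assuming GNU findutils is installed under
/usr/gnu; the path is the one from the earlier script, and the batch size and
concurrency are arbitrary:)

/usr/gnu/bin/find /export/OLD/PATH/TO/REMOVE -type f -print0 | \
    /usr/gnu/bin/xargs -0 -n 100 -P 50 rm -f
# -P 50 keeps up to 50 rm processes running in parallel, which is what the
# hand-rolled /bin/RM script above was approximating.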

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Jim Klimov

2011-07-09 20:04, Edward Ned Harvey wrote:


---  Performance gain:

Unfortunately there was only one area that I found any performance 
gain.  When you read back duplicate data that was previously written 
with dedup, then you get a lot more cache hits, and as a result, the 
reads go faster.  Unfortunately these gains are diminished...  I don't 
know by what...  But you only have about 2x to 4x performance gain 
reading previously dedup'd data, as compared to reading the same data 
which was never dedup'd.  Even when repeatedly reading the same file 
which is 100% duplicate data (created by dd from /dev/zero) so all the 
data is 100% in cache...   I still see only 2x to 4x performance gain 
with dedup.




First of all, thanks for all the experimental research and results,
even if the outlook is grim. I'd love to see comments about those
systems which use dedup and actually gain benefits, and how
much they gain (i.e. VM farms, etc.), and what may differ in
terms of setup (i.e. at least 256Gb RAM or whatever).

Hopefully the discrepancy between the blissful hopes I had - that
dedup would save disk space and boost the system somewhat,
kind of like online compression can do - and the cruel reality would
result in some improvement project. Perhaps it would be an
offline dedup implementation (perhaps with an online-dedup
option turnable off), as recently discussed on list.

Deleting stuff is still a pain though. For the past week my box
is trying to delete an rsynced backup of a linux machine, some
300k files summed up to 50Gb. Deleting large files was rather
quick, but those consuming just a few blocks are really slow.
Even if I batch background RM's so a hundred processes hang
and then they all at once complete in a minute or two.
And quite often the iSCSI initiator or target go crazy so one of
the boxes (or both) have to be rebooted, about thrice a day.
I described my setup before, won't clobber it into here ;)

Regarding the low read performance gain, you suggested
in a later post that this could be due to the RAM and disk
bandwidth difference in your machine. I for one think that
(without sufficient ARC block-caching) dedup reading would
suffer greatly also from fragmentation - any one large file
with some or all deduped data is basically guaranteed to
have its blocks scattered across all of your storage.
At least if this file was committed to the deduped pool
late in its life, when most or all of the blocks were already
there.

By the way, did you estimate how much dedup's overhead is
in terms of metadata blocks? For example it was often said
on the list that you shouldn't bother with dedup unless your
data can be deduped 2x or better, and if you're lucky to
already have it on ZFS - you can estimate the reduction
with zdb. Now, I wonder where the number comes from -
is it empirical, or would dedup metadata take approx 1x
the data space, thus under 2x reduction you gain little
or nothing? ;)

Thanks for the research,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Jim Klimov
  
 By the way, did you estimate how much is dedup's overhead
 in terms of metadata blocks? For example it was often said
 on the list that you shouldn't bother with dedup unless you
 data can be deduped 2x or better, and if you're lucky to
 already have it on ZFS - you can estimate the reduction
 with zdb. Now, I wonder where the number comes from -
 is it empirical, or would dedup metadata take approx 1x
 the data space, thus under 2x reduction you gain little
 or nothing? ;)

You and I seem to have different interpretations of the empirical 2x
soft-requirement to make dedup worthwhile.  I always interpreted it like
this: If read/write of DUPLICATE blocks with dedup enabled yields 4x
performance gain, and read/write of UNIQUE blocks with dedup enabled yields
4x performance loss, then you need a 50/50 mix of unique and duplicate
blocks in the system in order to break even.  This is the same as having a
2x dedup ratio.  Unfortunately based on this experience, I would now say
something like a dedup ratio of 10x is more likely the break even point.

Ideally, read/write of unique blocks should be just as fast, with or without
dedup.  Ideally, read/write of duplicate blocks would be an order of
magnitude (or more) faster with dedup.  It's not there right now...  But I
still have high hopes.

You know what?  A year ago I would have said dedup still wasn't stable
enough for production.  Now I would say it's plenty stable enough...  But it
needs performance enhancement before it's truly useful for most cases.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Jim Klimov

This dedup discussion (and my own bad experience) has also
left me with another grim thought: some time ago sparse-root
zone support was ripped out of OpenSolaris.

Among the published rationales were transition to IPS and the
assumption that most people used them to save on disk space
(the notion of saving RAM on shared objects was somehow
dismissed).

Regarding the disk savings, it was said that dedup would solve
the problem, at least for those systems which use dedup on
zoneroot dataset (and preferably that would be in the rpool, too).

On one hand, storing zoneroots in the rpool was never practical
for us because we tend to keep the rpool small and un-clobbered,
and on the other hand, now adding dedup to rpool would seem
like shooting oneself in the foot with a salt-loaded shotgun.
Maybe it won't kill, but would hurt a lot and for a long time.

On the third hand ;) with a small rpool hosting zoneroots as well,
the DDT would reasonably be small too, and may actually boost
performance while saving space. But lots of attention should now
be paid to separating /opt, parts of /var and stuff into delegated
datasets from a larger datapool. And software like Sun JES which
installs into a full-root zone's /usr might overwhelm a small rpool
as well.

Anyhow, Edward, is there a test for this scenario - i.e. a 10Gb
pool with lots of non-unique data in small blocks?
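
(Such a pool is easy to mock up with a file-backed vdev; the names, sizes and
small recordsize below are arbitrary, and a file vdev is for experiments only:)

mkfile 10g /var/tmp/testpool.img
zpool create testpool /var/tmp/testpool.img
zfs create -o recordsize=8k -o dedup=on -o compression=off testpool/small
# populate testpool/small with highly duplicated small files, then watch
# zpool list testpool (DEDUP column) and the overall write/delete times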

Thanks,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Jim Klimov
 You and I seem to have different interprettations of the 
 empirical 2x soft-requirement to make dedup worthwhile.
 
Well, until recently I had little interpretation for it at all, so your
approach may be better.
 
I hope the authors of the requirement statement will step
forward and explain what it is about under the hood and why 2x ;)
 
 You know what?  A year ago I would have said dedup still 
 wasn't stable enough for production.  Now I would say it's plenty stable 
 enough...  But it needs performance enhancement before it's
 truly useful for most cases.

Well, not that this would contradict you, but on my oi_148a (which
may be based on code close to a year old), it seems rather unstable, 
with systems either freezing or slowing down after some writes and 
having to be rebooted in order to work (fresh after boot writes are
usually relatively good, i.e. 5Mb/s vs. 100k/s). On the iSCSI server
side, the LUN and STMF service often lock up with "device busy" 
even though the volume pool/dcpool is not itself deduped. For me 
this is only solved by a reboot... And reboots of the VM client which
fights its way through deleting files from the deduped datasets inside
dcpool (imported over iSCSI) are beyond counting.
 
Actually in a couple of weeks I might be passing by that machine
and may have a chance to update it to oi_151-dev, would that
buy me any improvements, or potentially worsen my situation? ;)
 
//Jim
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Bob Friesenhahn

On Tue, 12 Jul 2011, Edward Ned Harvey wrote:


You know what?  A year ago I would have said dedup still wasn't stable
enough for production.  Now I would say it's plenty stable enough...  But it
needs performance enhancement before it's truly useful for most cases.


What has changed for you to change your mind?  Did the zfs code change 
in the past year, or is this based on experience with the same old 
stagnant code?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-12 Thread Edward Ned Harvey
 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 Sent: Tuesday, July 12, 2011 9:58 AM
 
  You know what?  A year ago I would have said dedup still wasn't stable
  enough for production.  Now I would say it's plenty stable enough...
But it
  needs performance enhancement before it's truly useful for most cases.
 
 What has changed for you to change your mind?  Did the zfs code change
 in the past year, or is this based on experience with the same old
 stagnant code?

No idea.  I assume they've been patching, and I don't hear many people
complaining of dedup instability on this list anymore.  But the other option
is that nothing's changed, and only my perception has changed.  I
acknowledge that's possible.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-10 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 --- Performance loss:

I ran one more test, which is rather enlightening.  I repeated test #2 (tweak
arc_meta_limit, use the default primarycache=all) but this time I wrote 100%
duplicate data instead of unique.  Dedup=sha256 (no verify).  Ideally, you
would expect this to write very very fast... Because it's all duplicate
data, and it's all async, the system should just buffer a bunch of tiny
metadata changes, aggregate them, and occasionally write a single serial
block when it flushes the TXG.  It should be much faster to write dedup.

The results are:  With dedup, it writes several times slower.  Just the same
as test #2, minus the amount of time it takes to write the actual data.  For
example, here's one datapoint, which is representative of the whole test:
time to write unique data without dedup:  7.090 sec
time to write unique data with dedup: 47.379 sec

time to write duplic data without dedup:  7.016 sec
time to write duplic data with dedup: 39.852 sec

This clearly breaks it down:
7 sec to write the actual data
40 sec overhead caused by dedup
1 sec is about how fast it should have been writing duplicated data

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-10 Thread Edward Ned Harvey
 From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net]
 Sent: Saturday, July 09, 2011 3:44 PM
 
   Could you test with some SSD SLOGs and see how well or bad the
   system
   performs?
 
  These are all async writes, so slog won't be used. Async writes that
  have a single fflush() and fsync() at the end to ensure system
  buffering is not skewing the results.
 
 Sorry, my bad, I meant L2ARC to help buffer the DDT

Oh - 

It just so happens I don't have one available, but that doesn't mean I can't 
talk about it.  ;-)

For quite a lot of these tests, all the data resides in the ARC, period.  The 
only area where the L2ARC would have an effect is after that region... When I'm 
pushing the limits of ARC then there may be some benefit from the use of L2ARC. 
 So ...

It is distinctly possible the L2ARC might help soften the brick wall.  When 
reaching arc_meta_limit, some of the metadata might have been pushed out to 
L2ARC in order to leave a (slightly) smaller footprint in the ARC...  I doubt 
it, but maybe there could be some gain here.

It is distinctly possible the L2ARC might help test #2 approach the performance 
of test #3 (test #2 had primarycache=all and suffered approx 10x write 
performance degradation, while test #3 had primarycache=metadata and suffered 
approx 6x write performance degradation.)

But there's positively no way the L2ARC would come into play on test #3.  In 
this situation, all the metadata, the complete DDT resides in RAM.  So with or 
without the cache device, the best case we're currently looking at is approx 6x 
write performance degradation.
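
(For concreteness, the knobs being compared in tests #2 and #3 are just these,
with an invented cache device and dataset name:)

zpool add tank cache c2t5d0              # dedicate an SSD as an L2ARC device
zfs set primarycache=all tank/test       # test #2 setting: cache data + metadata
zfs set primarycache=metadata tank/test  # test #3 setting: cache only metadata,
                                         # leaving more ARC room for the DDT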

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-10 Thread Edward Ned Harvey
 From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net]
 Sent: Saturday, July 09, 2011 3:44 PM
 
 Sorry, my bad, I meant L2ARC to help buffer the DDT

Also, bear in mind, the L2ARC is only for reads.  So it can't help accelerate 
writing updates to the DDT.  Those updates need to hit the pool, period.

Yes, on test 1 and test 2, there were significant regions where reads were 
taking place.  (Basically the whole test, approx 25% to 30% reads.)

On test 3, there were absolutely no reads up till 75M entries (9.07 T used) 
arc_meta_used= 12960 MB.  Up to this point, it was a 4x write performance 
degradation.  Then it suddenly started performing about 5% reads and 95% writes 
and suddenly jumped to 6x write performance degradation.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Edward Ned Harvey
Given the abysmal performance, I have to assume there is a significant
number of overhead reads or writes in order to maintain the DDT for each
actual block write operation.  Something I didn't mention in the other
email is that I also tracked iostat throughout the whole operation.  It's
all writes (or at least 99.9% writes.)  So I am forced to conclude it's a
bunch of small DDT maintenance writes taking place and incurring access time
penalties in addition to each intended single block access time penalty.

The nature of the DDT is that it's a bunch of small blocks that tend to be
scattered randomly, and require maintenance in order to do anything else.
This sounds like precisely the usage pattern that benefits from low latency
devices such as SSD's.

I understand the argument, DDT must be stored in the primary storage pool so
you can increase the size of the storage pool without running out of space
to hold the DDT...  But it's a fatal design flaw as long as you care about
performance...  If you don't care about performance, you might as well use
a NetApp and do offline dedup.  The point of online dedup is to gain
performance.  So in ZFS you have to care about the performance.  

There are only two possible ways to fix the problem.  
Either ...
The DDT must be changed so it can be stored entirely in a designated
sequential area of disk, and maintained entirely in RAM, so all DDT
reads/writes can be infrequent and serial in nature...  This would solve the
case of async writes and large sync writes, but would still perform poorly
for small sync writes.  And it would be memory intensive.  But it should
perform very nicely given those limitations.  ;-)
Or ...
The DDT stays as it is now, highly scattered small blocks, and there needs
to be an option to store it entirely on low latency devices such as
dedicated SSD's.  Eliminate the need for the DDT to reside on the slow
primary storage pool disks.  I understand you must consider what happens
when the dedicated SSD gets full.  The obvious choices would be either (a)
dedup turns off whenever the metadatadevice is full or (b) it defaults to
writing blocks in the main storage pool.  Maybe that could even be a
configurable behavior.  Either way, there's a very realistic use case here.
For some people in some situations, it may be acceptable to say "I have a 32G
mirrored metadatadevice; divided by 137 bytes per entry, I can dedup up to a
maximum of 218M unique blocks in the pool, and if I estimate 100K average block
size, that means up to 20T of primary pool storage.  If I reach that limit, I'll
add more metadatadevice."

Both of those options would also go a long way toward eliminating the
surprise delete performance black hole.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 When you read back duplicate data that was previously written with
 dedup, then you get a lot more cache hits, and as a result, the reads go
 faster.  Unfortunately these gains are diminished...  I don't know by
 what...  But you only have about 2x to 4x performance gain reading
 previously dedup'd data, as compared to reading the same data which was
 never dedup'd.  Even when repeatedly reading the same file which is 100%
 duplicate data (created by dd from /dev/zero) so all the data is 100% in
 cache...   I still see only 2x to 4x performance gain with dedup.

For what it's worth:

I also repeated this without dedup.  Created a large file (17G, just big
enough that it will fit entirely in my ARC).  Rebooted.  Timed reading it.
Now it's entirely in cache.  Timed reading it again.

When it's not cached, of course the read time was equal to the original
write time.  When it's cached, it goes 4x faster.  Perhaps this is only
because I'm testing on a machine that has super fast storage...  11 striped
SAS disks yielding 8Gbit/sec as compared to all-RAM which yielded
31.2Gbit/sec.  It seems in this case, RAM is only 4x faster than the storage
itself...  But I would have expected a couple orders of magnitude...  So
perhaps my expectations are off, or the ARC itself simply incurs overhead.
Either way, dedup is not to blame for obtaining merely 2x or 4x performance
gain over the non-dedup equivalent.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Roy Sigurd Karlsbakk
 When it's not cached, of course the read time was equal to the
 original
 write time. When it's cached, it goes 4x faster. Perhaps this is only
 because I'm testing on a machine that has super fast storage... 11
 striped
 SAS disks yielding 8Gbit/sec as compared to all-RAM which yielded
 31.2Gbit/sec. It seems in this case, RAM is only 4x faster than the
 storage
 itself... But I would have expected a couple orders of magnitude... So
 perhaps my expectations are off, or the ARC itself simply incurs
 overhead.
 Either way, dedup is not to blame for obtaining merely 2x or 4x
 performance
 gain over the non-dedup equivalent.

Could you test with some SSD SLOGs and see how well or badly the system performs?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Edward Ned Harvey
 From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net]
 Sent: Saturday, July 09, 2011 2:33 PM
 
 Could you test with some SSD SLOGs and see how well or bad the system
 performs?

These are all async writes, so slog won't be used.  Async writes that have a 
single fflush() and fsync() at the end to ensure system buffering is not 
skewing the results.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-09 Thread Roy Sigurd Karlsbakk
  From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net]
  Sent: Saturday, July 09, 2011 2:33 PM
 
  Could you test with some SSD SLOGs and see how well or bad the
  system
  performs?
 
 These are all async writes, so slog won't be used. Async writes that
 have a single fflush() and fsync() at the end to ensure system
 buffering is not skewing the results.

Sorry, my bad, I meant L2ARC to help buffer the DDT

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss