[zfs-discuss] dedup experience with sufficient RAM/l2arc/cpu

2011-01-28 Thread Ware Adams
There's a lot of discussion of dedup performance issues (including problems 
backing out of using it, which concern me), but many/most of those reports involve 
relatively limited RAM and CPU configurations.  I wanted to see if there is 
experience people could share from using it with higher RAM levels and an L2ARC.

We have built a backup storage server nearly identical to this:

http://www.natecarlson.com/2010/05/07/review-supermicros-sc847a-4u-chassis-with-36-drive-bays/

briefly:

SuperMicro 36 bay case
48 GB RAM
2x 5620 CPU
Hitachi A7K2000 drives for storage
X25-M for l2arc (160 GB)
4x LSI SAS9211-8i
Solaris 11 Express

The main storage pool is mirrored and uses gzip compression.  Our use consists 
of backing up daily snapshots of multiple MySQL hosts from a Sun 7410 
appliance.  We rsync each snapshot to the backup server (ZFS send to a 
non-appliance host isn't supported on the 7000 series, unfortunately), take a 
snapshot (so we now have a snapshot that matches the original on the 7410), 
clone it, start MySQL on the clone to verify the backup, and shut MySQL down.  
We do this daily across 10 hosts which have significant overlap in data.
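
For concreteness, here's a rough sketch of that daily cycle as shell commands; 
the pool, dataset, and path names (backup/db01, /net/7410/...) are placeholders, 
not our actual layout:

    HOST=db01
    TODAY=$(date +%Y%m%d)

    # pull the day's changes from the 7410's NFS-exported snapshot
    rsync -a --delete /net/7410/export/${HOST}/ /backup/${HOST}/

    # snapshot the copy so it matches the 7410's snapshot for today
    zfs snapshot backup/${HOST}@${TODAY}

    # clone it and bring MySQL up on the clone to verify the backup
    zfs clone backup/${HOST}@${TODAY} backup/${HOST}-verify
    mysqld_safe --datadir=/backup/${HOST}-verify &
    # ... run verification queries ...
    mysqladmin shutdown
    zfs destroy backup/${HOST}-verify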

I might guess that dedup would provide good space savings, but before I turn it 
on I wanted to see if people with larger configurations had found it workable.  
My greatest concern is the stories not only of poor performance but, worse, of 
complete non-responsiveness when trying to zfs destroy a filesystem with dedup turned on.
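
One thing I plan to do before flipping the switch is let zdb simulate dedup on 
the existing pool ('backup' below is a placeholder name); as I understand it 
this reads the whole pool but doesn't change anything:

    # print a simulated DDT histogram and an estimated dedup ratio
    zdb -S backup
    # the total number of allocated blocks in the histogram, times roughly
    # 320 bytes per DDT entry, gives a ballpark for the in-core dedup table,
    # which can be weighed against the 48 GB of RAM and 160 GB L2ARC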

We are somewhat flexible here.  We are not terribly pressed for space, and we 
do not need massive performance out of this.  Because of that I probably won't 
use dedup without hearing it is workable on a similar configuration, but if 
people have had success it would give us more cushion for inevitable data 
growth.

Thanks for any help,
Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] dedup experience with sufficient RAM/l2arc/cpu

2011-01-28 Thread Ware Adams
On Jan 28, 2011, at 12:21 PM, Richard Elling wrote:
 
 On Jan 28, 2011, at 7:13 AM, Ware Adams wrote:
 
 SuperMicro 36 bay case
 48 GB RAM
 2x 5620 CPU
 Hitachi A7K2000 drives for storage
 X25-M for l2arc (160 GB)
 4x LSI SAS9211-8i
 Solaris 11 Express
 
 I apologize for the shortness, but since you have such large, slow drives, 
 rather than making
 a single huge pool and deduping, create a pool per month/week/quarter. Send 
 the snaps over
 that you need, destroy the old pool. KISS & fast destroy.

I hadn't thought about that, but I think it might add its own complexity.  Some 
more detail on what we are doing:

This host is a backup storage server for (currently) six MySQL hosts (whose 
data sets reside on NFS shares exported from the 7410).  Each data set is ~1.5 
TB uncompressed.  Of this about 30 GB changes per day (that's the rsync'd 
amount, ZFS send -i would be less but I can't do that from the 7410).  We are 
getting about 3.6:1 compression using gzip.

Then we are keeping daily backups for a month, weeklies for 6 months and 
monthlies for a year.  By far our most frequent use of backups is an 
accidentally dropped table, but we also fairly frequently need to recover 
from a situation where a user's code error was writing garbage to a field for, 
say, a month and they need to recover as of a certain date several months ago.  
So all in all we would like to keep quite a number of backups, say 6 hosts * 
(30 dailies + 20 weeklies + 6 monthlies) = 336.  The dailies and weeklies get 
pruned as they age into later time periods and aren't needed (and everything is 
pruned after a year).

With the above I'd be able to have 18 pools with mirrors or 36 pools with 
single drives.  So there are two things that would seem to add complexity.  
First, I'd have to assign each incoming snapshot from the 7410 to one of those 
pools based on when it is going to expire.  I assume you could live 
with 18 or 36 slots, but I haven't worked through the logic to find out exactly.  
Still, it would be some added complexity vs. today's process, which is basically:

rsync from 7410
snapshot
clone

The other issue is the rsync step.  With only one pool I just rsync the 30 GB 
of changed data to that MySQL host's share.  In the multiple-pool scenario I 
guess I would have a base copy of the full data set per pool?  That would eat 
up ~400 GB on each 2 TB pool, so I wouldn't be able to fit all 6 hosts onto a 
given pool.

We haven't done a lot of zfs destroy yet (though some in testing), so I can't 
yet say whether the current setup is workable.  But unless it is horribly slow 
there does seem to be some simplicity benefit to having a single pool.  I'll 
keep this in mind, though.  We could probably have a larger pool for the 6 
dailies per week that will be destroyed.  I'd still have to zfs send the base 
directory prior to rsync, but that would simplify things somewhat.
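
If we did go the rotating-pool route for the dailies, I imagine it would look 
roughly like this (pool and dataset names are made up):

    # create this week's pool and seed it with the base copy of each host
    zpool create week05 mirror c2t0d0 c2t1d0
    zfs send backup/db01@base | zfs recv week05/db01

    # ... rsync daily deltas into week05/db01 and snapshot all week ...

    # when the week ages out, drop the whole pool at once; a pool-level
    # destroy is quick even where a zfs destroy of a deduped filesystem
    # might not be
    zpool destroy week05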

Thanks for the suggestion.

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is my bottleneck RAM?

2011-01-20 Thread Ware Adams
On Jan 20, 2011, at 11:18 AM, Eugen Leitl wrote:
 I'd expect more than 105290K/s on a sequential read as a peak for a single 
 drive, let alone a striped set. The system has a relatively decent CPU, 
 however only 2GB memory, do you think increasing this to 4GB would 
 noticeably affect performance of my zpool? The memory is only DDR1.
 
 2GB or 4GB of RAM + dedup is a recipe for pain. Do yourself a favor, turn 
 off dedup
 and enable compression.
 
 Assuming 4x 3 TByte drives and 8 GByte RAM, and a lowly dual-core 1.3 GHZ
 AMD Neo, should I do the same? Or should I even not bother with compression?
 The data set is a lot of scanned documents, already compressed (TIF and PDF).
 I presume the incidence of identical blocks will be very low under such
 circumstances.

This would seem very unlikely to benefit from dedup (unless you cp the 
individual files into multiple directories).  If you are just keeping lots of 
scans, the odds of a given block being identical to many other ones seem low.

The nice thing about compression is that it is easy to test (whereas dedup can 
be painful to back out of when it doesn't work out).  So you might as well try 
compression, but dedup looks like a waste of time here and might well cause a 
lot of headaches.
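
A minimal way to test it, assuming a pool called tank and some representative 
sample data (both placeholders):

    # create a throwaway dataset with compression on and copy sample scans in
    zfs create -o compression=on tank/scan-test
    cp -r /export/scans/sample /tank/scan-test/

    # a compressratio near 1.00x on already-compressed TIF/PDF data means
    # compression is buying you very little
    zfs get compressratio tank/scan-test

    zfs destroy tank/scan-test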

Good luck,
Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 RIP

2010-11-09 Thread Ware Adams
On Nov 9, 2010, at 12:24 PM, Maurice Volaski wrote:
 
 http://www.supermicro.com/products/chassis/4U/?chs=847
 
 Stay away from the 24 port expander backplanes. I've gone thru several
 and they still don't work right - timeout and dropped drives under load.
 The 12-port works just fine connected to a variety of controllers. If you
 insist on the 24-port expander backplane, use a non-expander equipped LSI
 controller to drive it.
 
 I was wondering if you could clarify. Isn't it the case that all 24-port
 backplanes utilize expander chips directly on the backplane to support
 their 24 ports, or are they utilized only when something else, such as
 another 12-port backplane, is connected to one of the cascade ports in the
 back?

I think he is referring to the different flavors of the 847, namely the 
versions that use expanders (E1, E2, E16, E26) vs. the one that does not (the 847A).  
This page about a storage server build does a very good job of detailing all 
the different versions of the 847:

http://www.natecarlson.com/2010/05/07/review-supermicros-sc847a-4u-chassis-with-36-drive-bays/

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ware Adams
On Aug 27, 2010, at 2:32 PM, Mark wrote:
 Sadly most of those options will not work, since we are using a Sun Unified 
 Storage 7210; the only option is to buy the Sun SSDs for it, which is about 
 $15k USD for a pair.   We also don't have the ability to shut off the ZIL or any 
 of the other options that one might have under OpenSolaris itself :(
 
 It sounds like I do want to change to a RAID10 mirror instead of RAIDz.   It 
 sounds like enabling the write cache without the ZIL in place might work but would 
 lead to corruption should something crash.
 
 So the question is: with a proper ZIL SSD from Sun, and RAID10... would I be 
 able to support all the VMs, or would it still be pushing the limits of a 44 
 disk pool?

We run roughly that number of VMs on ESXi 4 using a 7410 and a 7310 via NFS.  
The 7410 and 7310 have fewer disks (24), but they are arranged in a mirror 
configuration.  Each has both readzilla and logzilla SSDs.  Our VMs are 
similarly lightly loaded (much like yours...mix of Windows and Ubuntu, about 
25% run a DB server with very little load).  We use compression but not 
deduplication.

It has worked extremely well for us.  No complaints on speed, very stable.  
From what I have read on this list, iSCSI will not be a huge speed improvement 
for you (though we haven't tried it), and you give up a lot of management 
flexibility vs. NFS.

Based on our experience, I would say that the 7210 should be able to support 
your needs if you put SSDs in (and the 7210 has more disks than our 7310 or 7410). 
 Of course, switching to a mirror pool requires destroying your current 
configuration, so it isn't easy.  You might also need to remove some HDDs to 
make room for the SSDs.

As far as analytics go, the ARC stats (hits/misses) are available, which will give 
you some indication of whether an L2ARC will help.  For the SLOG question, look at 
latency by file and operation for a VM that is having performance issues: is it 
showing high latency on NFS writes?
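
On the 7210 both of those show up in the Analytics screens; for what it's worth, 
on a plain OpenSolaris box the same ARC counters are visible from the command 
line, something like:

    # ARC and L2ARC hit/miss counters
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
    kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses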

Good luck,
Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iScsi storage

2010-03-15 Thread Ware Adams

On Mar 15, 2010, at 10:55 AM, Gabriele Bulfon wrote:

 - In this case, the storage appliance is a legacy system based on linux, so 
 raids/mirrors are managed at the storage side its own way. Being an iscsi 
 target, this volume was mounted as a single iscsi disk from the solaris host, 
 and prepared as a zfs pool consisting of this single iscsi target. ZFS best 
 practices, tell me that to be safe in case of corruption, pools should always 
 be mirrors or raidz on 2 or more disks. In this case, I considered all safe, 
 because the mirror and raid was managed by the storage machine. But from the 
 solaris host point of view, the pool was just one! And maybe this has been 
 the point of failure. What is the correct way to go in this case?

I'd guess this could be because the iSCSI target wasn't honoring ZFS cache flush 
requests.

 - Finally, looking forward to run new storage appliances using OpenSolaris 
 and its ZFS+iscsitadm and/or comstar, I feel a bit confused by the 
 possibility of having a double zfs situation: in this case, I would have the 
 storage zfs filesystem divided into zfs volumes, accessed via iscsi by a 
 possible solaris host that creates his own zfs pool on it (...is it too 
 redundant??) and again I would fall in the same previous case (host zfs pool 
 connected to one only iscsi resource).

My experience with this is significantly lower end, but I have had iSCSI shares 
from a ZFS NAS come up as corrupt to the client.  It's fixable if you have 
snapshots.

I've been using iSCSI to provide Time Machine targets to OS X boxes.  We had a 
client crash during writing, and upon reboot it showed the iSCSI volume as 
corrupt.  You can put whatever file system you like on the iSCSI target, obviously. 
 The current OpenSolaris iSCSI implementation, I believe, uses synchronous 
writes, so hopefully what happened to you wouldn't happen in that case.

In my case I was using HFS+ (the OS X client has to), and I couldn't repair the 
volume.  However, with a snapshot I could roll it back.  If you plan ahead, this 
should save you some restoration work (you'll need to be able to roll back all 
the files that have to be consistent).
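
The rollback itself is just a snapshot rollback on the zvol backing the iSCSI 
target; as a sketch (the dataset and snapshot names here are made up):

    # find the last snapshot taken before the client crash
    zfs list -t snapshot -r tank/tm-target

    # roll the zvol back before the client reconnects; everything on the
    # volume reverts together, and rolling back past the newest snapshot
    # requires -r, which discards the intervening snapshots
    zfs rollback tank/tm-target@2010-03-14-0300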

Good luck,
Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iScsi storage

2010-03-15 Thread Ware Adams

On Mar 15, 2010, at 12:13 PM, Gabriele Bulfon wrote:

 Well, I actually don't know what implementation is inside this legacy machine.
 This machine is an AMI StoreTrends ITX, but maybe it has been built around 
 IET, don't know.
 Well, maybe I should disable write-back on every zfs host connecting on iscsi?
 How do I check this?

I think this would be a property of the NAS, not the clients.

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-28 Thread Ware Adams

Hello,

I have been researching building a home storage server based on  
OpenSolaris and ZFS, and I would appreciate any time people could take  
to comment on my current leanings.


I've tried to gather old information from this list as well as the  
HCL, but I would welcome anyone's experience on both compatibility and  
appropriateness for my goals.  I'd love it if that white box server wiki  
page were already set up, but for now I'll just have to ask here.


My priorities:

1)  Data security.  I'm hoping I can get this via ECC RAM and  
enterprise drives that hopefully don't lie to ZFS about flushing to  
disk.  I'll run mirrored pools for redundancy (which leads me to want  
a case with a lot of bays).
2)  Compatibility.  For me this translates into low upkeep cost  
(time).  I'm not looking to be the first person to get OpenSolaris  
running on some particular piece of hardware.
3)  Scalable.  I'd like to not have to upgrade every year.  I can  
always use something like an external JBOD array, but there's some  
appeal to having enough space in the case for reasonable growth.  I'd  
also like to have enough performance to keep up with scaling data  
volume and ZFS features.
4)  Ability to run some other (lightweight) services on the box.  I'll  
be using NFS (iTunes libraries for OS X clients) and iSCSI (Time  
Machine backups) primarily, but my current home server also runs a few  
small services (MySQL, etc.) that are very lightweight but  
nevertheless might be difficult to run on a ZFS (or ZFS-like) appliance.
5)  Cost.  All things being equal cheaper is better, but I'm willing  
to pay more to accomplish particularly 1-3 above.


My current thinking:

SuperMicro 7046A-3 Workstation
http://supermicro.com/products/system/4U/7046/SYS-7046A-3.cfm
8 hot swappable drive bays (SAS or SATA, I'd use SATA)
Network/Main board/SAS/SATA controllers seem well supported by  
OpenSolaris

Will take IPMI card for remote admin (with video and iso redirection)
12 RAM slots so I can buy less dense chips
2x 5.25" drive bays.  I'd use a SuperMicro Mobile Rack M14T (http://www.supermicro.com/products/accessories/mobilerack/CSE-M14.cfm) 
to get 4 2.5" SAS drives in one of these.  Two would be used for a  
mirrored boot pool, leaving two for potential future use (like a ZIL on  
SSD).
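

To make priority 1 concrete, a sketch of how I'd lay out the data pool on this  
box (device names below are obviously placeholders): the 8 hot-swap bays as  
four 2-way mirrors in a single pool.

    # data pool: four 2-way mirrors across the 8 hot-swap bays
    zpool create tank \
        mirror c1t0d0 c1t1d0 \
        mirror c1t2d0 c1t3d0 \
        mirror c1t4d0 c1t5d0 \
        mirror c1t6d0 c1t7d0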


Nehalem E5520 CPU
This is clearly more than enough now, but I'm hoping to have decent  
CPU performance for, say, 5 years (and I'm willing to pay for it up  
front vs. upgrading every 2 years...I don't want this to be too time-  
consuming of a hobby).  I'd like to have processor capacity for  
compression and (hopefully reasonably soon) de-duplication, and it  
obviously needs to support ECC RAM.


Crucial RAM in 4 GB density (price scales linearly up through this  
point and I've had good support from Crucial)


Seagate Barracuda ES.2 1TB SATA (Model ST31000340NS) for storage  
pool.  I would like to use a larger drive, but I can't find anything  
rated to run 24x7 larger than 1TB from Seagate.  I'd like to have  
drives rated for 24x7 use, and I've had good experience w/Seagate.   
Again, a larger case gives me some flexibility here.


Misc (mainly interested in compatibility b/c it will hardly be used):
Sun XVR-100 video card from eBay
Syba SY-PCI45004 (http://www.newegg.com/Product/Product.aspx?Item=N82E16816124025) IDE card for CD-ROM
Sony DDU1678A (http://www.newegg.com/Product/Product.aspx?Item=N82E16827131061) CD-ROM


Thanks a lot for any thoughts you might have.

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-28 Thread Ware Adams

On Sep 28, 2009, at 4:20 PM, Michael Shadle wrote:


I agree - SOHO usage of ZFS is still a scary "will this work?" deal. I
found a working setup and I cloned it. It gives me 16x SATA + 2x SATA
for mirrored boot, 4GB ECC RAM and a quad core processor - total cost
without disks was ~ $1k I believe. Not too shabby. Emphasis was also
for acoustics - rack dense would be great but my current living
situation doesn't warrant that


This sounds interesting.  Do you have any info on it (the case you started  
with, etc.)?


I'm concerned about noise too as this will be in a closet close to the  
room where our television is.  Currently there is a MacPro in there  
which isn't terribly quiet, but the SuperMicro case is reported to be  
fairly quiet.


Thanks,
Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss